How we used Category Theory to solve a problem in Java

A few months ago, my colleague Chris Myers and I used some basic category theory concepts to guide us to a design that elegantly solved a problem in a Java codebase.

It isn’t the only way we could have arrived at the design; anyone could have done it, really! However, you might find it interesting to see what practical application of these ideas can look like. Importantly, category theory gives us a framework to shed light on what makes many good design concepts useful, and why.

The problem

There is a Java API at the core of our business that accepts user searches and returns property listings. Over the three years of its existence, it has handled considerable traffic volumes, and seen intense development from many teams and dozens of developers. While in some respects the original design has handled the expansion gracefully, in parts it has grown out of control and become difficult to maintain.


Our new problem looks like this:

  1. The existing API results have all kinds of extra features baked in that have nothing do with listings per se; mortgage calculators, real estate agents, and UI conveniences and the like. More and more get added over time.
  2. We want to delegate different kinds of searches to completely different services and data stores.
  3. The new APIs will only return bare-bones search results, without all the extra baggage.
  4. We don’t want to rewrite the bolted-on features anew for each data source; we just want to write them once, and have them work for listings extracted from anywhere.


The trouble is of course, that all of the current logic is hardwired to the current implementation, which extracts the necessary information from the old legacy data store. How can we repurpose the existing features such that they can be added and removed independently of other code, and without requiring duplication for each data store, and without an expensive rewrite of the codebase?

Composing programs

First, let’s consider the idea of composition in software, and how it helps us.

Often systems grow over time with a layering of new concepts over old; the two live side by side, and both must be understood to be used. Eventually there will be a stack of concepts, each a different market stall, hawking its wares to the overwhelmed maintainer. And so time brings complexity in tow.

On the other hand, when we say that modules or processes are composable, it means we are able to combine them in self-similar ways that do not add to the cognitive burden of the system; the composite forms are indistinguishable from their component parts. This is a powerful concept, because it implies that the software has an obvious avenue of growth that does not introduce greater complexity.

This is nothing new so far—this is also a staple of Object Oriented modelling, after all! We may wonder though: how can we capture this property? What is the essence that makes something composable?

Monoids: the essence of composition

The idea of combining two things into a self-similar composite suggests a type signature:

(M,M) -> M

This notation represents a function that takes two Ms as arguments, and returns an M result. A simple thing then—two Ms become one.  We’ll call this compose, or use the conventional infix operator ° as shorthand.

We make things simpler for ourselves if we can say that composition should be associative, so that:

(a ∘ b) ∘ c == a ∘ (b ∘ c)

This lets us notationally omit grouping brackets.

Instantly, we have a building block:

m1 ∘ m2 ∘ m3 ∘ m4 ∘ m5

Each atom, layer, and the final result have the same type, and may be treated in the same way. However, there is still an action taking place; if we have a way to neutralise the operation, then we can build from it with greater flexibility.

id: M
where (id ∘ M) == (M ∘ id) == M

When we have these two things—an associative binary function and an identity element—we have the essential ingredients of composition.

compose: (M,M) -> M
id: M

In abstract algebra, this structure is called a monoid. It is a simple concept, but profound; it is nothing less than the fundamental notion of category theory! [1]

Category theory may have a fearsome reputation, but not really for complexity; in fact it is the very simplicity of this scalable composition mechanism that allows the most terrifying and alien concepts and meta-concepts to be toppled to the ground, and discussed with the very same vocabulary as the most mundane.

This is precisely what it brings to software development. We don’t want ever-growing towers of concepts and meta-concepts any more than mathematicians do!

Common monoid examples

Monoids are extremely common in nature and in software; it will benefit us to examine some garden varieties that constantly crop up in everyday software development. Notice that in each case, we are building with raw materials, but the results let us retain the same notation and vocabulary:

Integers with addition:

1 + 2 + 3 + 4 == 10
0 + 77 + 0 == 77

Integers again, with multiplication:

2 * 3 * 4 == 24
1 * 444 * 1 == 444

Lists with appending:

[1,2] ++ [3,4] ++ [5,6] == [1,2,4,3,5,6]
[] ++ [9,9,9] ++ [] == [9,9,9]

Functions A -> A [2], with composition:

(λa -> a+1) ∘ (λa -> a*2) ∘ (λa -> a*a-1) == (λa -> ((a*a-1) * 2) + 1)
(λa -> a) ∘ (λa -> a+777) ∘ (λa -> a) == (λa -> a+777)

Can you see the pattern? While we are performing a different operation in each case, in fact all these examples exhibit the exact same compositional behaviour. We can forget the messy specifics of each data type, and just consider them generically as monoids.

Fussier composition

Assuming that function composition requires no explanation, we know that functions compose associatively too:

f ∘ (g ∘ h) == (f ∘ g) ∘ h

And of course, the identity function has no effect when composed with any other function.

We saw that functions A -> A are monoids, but general functions A -> B are fussier; they can’t be piped together unless the types line up! For instance A -> B can compose with B -> C, but cannot be connected to an X -> Z.

compose: (Fn<B,C>, Fn<A,B>) -> Fn<A,C>
id: Fn<A,A>

We see that functions behave a bit like monoids, but generalised to allow restrictions on composition. This more general concept, represented here by “Fn” is called a category; it captures the essence of what makes functions composable, without getting bogged down in their specific details.

From functions to functors

Let’s consider a similar concept, that we frequently come across in day-to-day software: containers or processes that produce values we can map over.

A trivial example is a simple value of some type A; applying a function A -> B naturally turns it into a B.

A more interesting case is List; if we know how to turn A -> B, then we know how to turn a List<A> -> List<B>; we just iterate through the list, apply the function to each element, and accumulate the resulting list.

functor list

Binary trees too, for example; a function from A -> B is enough information for us to convert a Tree<A> into a Tree<B>.

functor tree

Data structures are one thing, but what about processes that produce a result? Take a function Input -> A. The composition that we looked at before gives us the answer already; a function from A -> B will give us another function Input -> B. So even though they are not a fixed structure, functions have this same mapping behaviour over their output! Notice the difference in behaviour though; rather than directly applying the mapping function, it is composed and stored for later.



interface OutputGiver<A> {
    public A giveOutput(Input input);

    public default <B> OutputGiver<B> map(Function<A,B> f) {
        return input -> f.apply(giveOutput(input));

A structure or computation that can be mapped over in this way is called a functor [3]. We can see, in fact, that the underlying mechanism that allows some F<A> to be a functor is if the generic parameter is only used for “output”. If it’s used in an “input” position, then it just won’t fit. [4]

As it happens, functors are also composable; if F<?> and G<?> are functors, then F<G<?>> is a functor too! But why are they composable? What is the underlying principle that makes them so?

Firstly, we’ve seen that composing two functors F<?> and G<?> will give us a single resulting functor F<G<?>>.

Secondly, the “single value” functor in the first example is called the “Identity functor”; when composed, the structure will be basically unchanged:

Id<F<A>> ≅ F<Id<A>> ≅ F<A> [5]

Do these two conditions now remind you of a pattern you’ve already seen?


Looking more closely at computations where the type variable is exclusively used in an “input” position, we see that a similar pattern emerges.

If we take the simple function A -> Output, how can we turn it into a B -> Output? A function A -> B won’t do the trick, because it doesn’t give us any ability to operate on the incoming B value. We can however, use a flipped function B -> A!


interface InputTaker<A> {
    public Output takeInput(A a);

    public default <B> InputTaker<B> contraMap(Function<B,A> f) {
        return b -> takeInput(f.apply(b));

A computation that can be “contramapped” over like this is called a contravariant functor; it composes in the same way that regular functors do. [6]

This is really useful, because sometimes the type parameter just has to be in an “input”  position; now we can change the structure in a compositional, manageable way.

Category theory applied to the problem

If you remember, we needed to find a way to add features to listings in a sustainable and modular way; we only want to write the feature logic once, regardless of which service provided the data in whatever format.

A monoid appears

When considering the problem, we first recognised that there was a basic A -> A monoid pattern in the way that each feature modifies the search results:

interface Extension {
    Results apply(Results results);

This suggested to us that there might be a way to compose any number of extensions to the results processing without making the system harder to understand. Monoid functionality could be immediately added:

interface Extension {
    Results apply(Results results);

    final default Extension compose(Extension e) {
        return results -> e.apply(Extension.this.apply(results));

    static final Extension IDENTITY = r -> r;
    static Extension composeAll(Extension... extensions) {
        return Stream.of(extensions).reduce(IDENTITY, Extension::compose);

Now, using composeAll(), multiple extensions running in sequence reduce to a single one!

Composing with input

We’re not done yet, though; in practice each extension will need some kind of input to operate on. The bit that adds real estate agent data will need the agent ID and the AgentService; the bit that adds pretty URLs will need the PrettyUrlCreator, and so forth. Fortunately though, we knew that a function shaped (S, A) -> A is still a monoid. So we could add a “source” parameter, without giving up the compositional properties we want:

interface Extension<S> {

    Results apply(S source, Results results);

    final default Extension<S> compose(Extension<S> e) {
        return (src, results) -> e.apply(src, Extension.this.apply(src, results));

    static final Extension IDENTITY = (s, r) -> r;

    static <S> Extension<S> composeAll(Extension<S>... extensions) {
        return Stream.of(extensions).reduce(IDENTITY, Extension::compose);

Contramapping input

There’s a key piece of the puzzle missing though; all these disparate services and data stores are going to return their own special formats. However, each extension only knows about its own specific dependencies:


How can we fish out the input we need to run our generic extensions? We want to reduce the datastore-specific code to some minimal glue code, so that everything important happens independently of them. So we need a way to widen the input required, so that, say, an Extension<Banana> can become an Extension<GroceryStore> if we know how to buy the banana from the store.

If we look carefully at the (S, Results) -> Results shape of our Extension type, we can see that the S parameter is only used as input, never as output. Of course, this means that Extension<S> is a contravariant functor over S!

By adding a contraMap method, our generic extensions can be adapted to any particular data store, given a function that can connect the inputs.

interface Extension<S> {

    Results apply(S source, Results results);

    final default Extension<S> compose(Extension<S> e) {
        return (src, results) -> e.apply(src, Extension.this.apply(src, results));

    static final Extension IDENTITY = (s, r) -> r;

    static <S> Extension<S> composeAll(Extension<S>... extensions) {
        return Stream.of(extensions).reduce(IDENTITY, Extension::compose);

    default <T> Extension<T> contraMap(Function<T, S> f) {
        return (T src, Results results) -> Extension.this.apply(f.apply(src), results);

Once contraMap has widened all of the extension to uniformly require the data source output, they can be collapsed into a single extension, using monoid composition.

There we have it! An elegant compositional architecture with the properties we need:

  1. Extensions can be written once, as independent features.
  2. Input functions can be composed horizontally, to make all the extensions take the same datasource-specific input.
  3. The stack of extensions can be vertically collapsed down to a single composite.
  4. The composite extension gets slotted into the existing code somewhere, as one line.

What has category theory bought us?

We started with an important, mature Java codebase with extremely high traffic and rate of development. It was starting to become legacy; modifications were harder, and there was talk about deprecating and replacing it. A new requirement to use many differing data-stores underneath left it at the crossroads; we needed a clean design, or else the whole thing would have to be dismantled, at great risk and expense.

We finally came up with a design, using category theory as a guiding principle. It allowed us to revitalise the existing codebase, allowing continued development while ensuring a sustainable architecture. It is now running in production with over 50 generic extensions wired in, using two different data stores. It saved us a whole lot of time and money.

There are many ways to arrive at a good design, but navigating the endless void of the solution space is often the hardest part; simple solutions are frequently far from obvious. Category theory offers a ready-made repertoire of concepts and vocabulary that can help light a path toward a good design; we can think of it as “the study of composable abstractions”. Exposing this very essence of composition provides insights into software design that are not as clearly articulated elsewhere.

The fact that we’ve successfully brought category theory principles to bear with a such a thoroughly blue-collar language as Java is worth thinking about. Far from being the preserve of researchers and Haskell programmers, these principles apply universally, and can produce simple, clean modules that are easily understood by programmers with no knowledge of the theory. It is well worth the investment of familiarising yourself with the fundamental ideas.

Further reading

The category design pattern (article)​ – Gabriel Gonzalez

Scalable program architectures (article) – Gabriel Gonzalez

Category Theory for Programmers (online book) – Bartosz Milewski

Category Theory for Beginners (talk, slides) – Ken Scambler


[1] Mac Lane, Categories for the Working Mathematician [back]

[2] Functions with the same input/output type are known as endofunctions, or just “endos” to the lazy. [back]

[3] In the maths, functors map between categories, in a similar way to how functions map between sets. When applied to programming, in fact everything is inside the one category of types and functions, so we can more properly call them endofunctors; just like endofunctions, they map from and to the same thing (categories/types, respectively). [back]

[4] The “input” and “output” positions are often called negative and positive positions, respectively. [back]

[5] By the hand-wavy “basically unchanged”, I actually mean isomorphic, which has a precise definition: there exists a one-to-one mapping between them, so that if you go back-and-forth either way, you get the same result. The two types are the “same size”. [back]

[6] Functors can also be called covariant functors, by way of contrast with contravariant functors. [back]

  • Pingback: How we used Category Theory to solve a problem in Java | JAVA()

  • Pingback: Java Annotated Monthly – December 2015 | IntelliJ IDEA Blog()

  • Pingback: Java Annotated Monthly – December 2015 | Technology Toolz()

  • Pingback: Bookmarks for January 13th | Chris's Digital Detritus()

  • disqus_jTbchRrBq0

    This article has provided me with a lot of food for thought. Thanks!

  • Pingback: === === popular today()

  • Tomer Ben David

    For me this is the first sane and excellent category theory and functional programming explanation for me this is SUPER helpful.

  • Pingback: Sussman on Generic Ops – javascriptbeach()

  • Pingback: Lazy Weekend #13 – Mascaras de caballo,el costo de la vida,niveles de calidad en código,miseria,certificaciones y mi opinión sobre el machismo,hembrismo y feminazis – Another Techie Blog()

  • Is based on Java?

    • Ken Scambler

      Our systems are written in many languages; some of our core systems are
      Java. Ruby is the most commonly used language. New applications tend
      to be written in either Scala or Ruby.

  • Chris

    Here is a gist of this idea in action; although it is quite contrived, it gave me more insight once I saw it in action running.

    • Ken Scambler

      That’s pretty much it. Nice one!

    • Jeff Cahoon

      @Chris and @Ken, I found the article really useful, and this example implementation multiplied how much I got out of the article. Thanks a bunch. I am new to Category Theory and I am struggling a bit with covariant functors, as my question will probably show.

      I have a question about the contraMap. It seems to me that if the only input for the convert function is the incoming source, then we didn’t really need to convert. (The conversion code could have been inside of ExtensionImpl3) I understand that that may be because this is a simple example. However, I am trying to understand how much freedom we have in the convert function. For example, can we look at Result instead of Source to help us perform a lookup in a different object not seen so far (like a database)? Could we ignore the source input and use time of day to generate content for the new source? Are there any restrictions?

      • Chris

        If I understand your question correctly; I will take a stab at trying to answer it.

        You are correct in the stating that the conversion could have taken place in ExtensionImpl3, since in essence it is just down casting. I believe the usage of the convert function vs doing the conversion in the ExtensionImpl3 is a better choice. For one functions compose better. Secondly you can separate that conversion of the request from the actual business logic, which is in ExtensionImpl3.

        As far as how much freedom you have in the convert function, I would imagine you can do pretty much what you need to do, like go to the database, make web service call, etc; but only within the confines of what is available to you in that function. Since that convert function is defined as Function convert, that means you do not have access to the response to do additional work in this function. But of course in the ExtensionImpl3 the apply function does have access to the response, thus you can use whatever data that is available to you in the response to do additional lookups, etc

        The question about ignoring your source, is doable, for whatever reason you have. But remember you are still locked within the confine of your type, since java is statically typed you still need to respect that whatever you convert your request into, still has a relation to its interface.

        Not sure if I addressed your question. Obviously my response is my opinion, I do not work for the company that sponsors this blog, and that presented this solution. So their response or beliefs may be different.

        • Jeff Cahoon

          @Chris, your answer was great. I think it clearly explains the limits of what you can do in contraMap.

          If I understand correctly, here is an example in the context of the article that shows how you CANNOT use contraMap.

          Someone does a search for properties based on some set of requirements like price and size. This comes back as a set of houses. One extension is to supply a set of agents that work in the relevant areas. Another extension is to supply a set of title companies that service the relevant areas. Each of these extensions would like the list of zip codes for the set of houses. In fact, it would be the same list of zip codes. Unfortunately, because the data comes out of the response of a previous extension, we cannot use contraMap to put those zip codes into the input of the extensions. Each extension would have to discover the list of zip codes on its own.

          I sure appreciate your response and your example implementation. Let me know if I have misunderstood what you have said.

          • Chris

            If you are saying that the set of zip codes are retrieved in the initial response, then the first extension does some work; you were wondering if the second extension’s contramap method would have access to the zip codes in the request, or could we somehow make the zip codes accessible in the contramap method.

            I would say yes, you can do it. Your request would have to have some sort of attribute or collection that can store the zip codes. In your first extension, you have access to the initial Request and response, and you can set the zip codes from the response to the Request object.

            When you have an extension that requires the contramap, and it would have to be an extension that is called after the zip codes are set of course, you can then retrieve the zip codes from the request. Then you can do whatever you want with the zip codes in the contramap method.

            Now, maybe I don’t understand your use case, but just because you can do it does not mean that you should do it. I think whatever business functionality you require the zip codes should still remain in the Extension’s apply method. Also, you may be really leaving the realm of the definition of contravariance , at least on a programatic level. Contravariance reverses the direction of object composition, I am not sure if you are trying to attain that in your example. But maybe I misunderstand.

          • Jeff Cahoon

            @Chris, once again you have understood perfectly and answered clearly. For some reason, I assumed that Category Theory would require the Request to be immutable.

            Thinking about it now, I suppose that it would be legal to compose the extensions, each with its own unique Request. If I understand right, the objects in our category are Requests and Responses, and our morphisms are functions from Requests to Responses. ContraMap allows a morphism from Response to Request, which we could then use in another morphism to get a different Response. If this is correct, I think I am beginning to understand better.