How we used Category Theory to solve a problem in Java

A few months ago, my colleague Chris Myers and I used some basic category theory concepts to guide us to a design that elegantly solved a problem in a Java codebase.

It isn’t the only way we could have arrived at the design; anyone could have done it, really! However, you might find it interesting to see what practical application of these ideas can look like. Importantly, category theory gives us a framework to shed light on what makes many good design concepts useful, and why.

The problem

There is a Java API at the core of our business that accepts user searches and returns property listings. Over the three years of its existence, it has handled considerable traffic volumes, and seen intense development from many teams and dozens of developers. While in some respects the original design has handled the expansion gracefully, in parts it has grown out of control and become difficult to maintain.


Our new problem looks like this:

  1. The existing API results have all kinds of extra features baked in that have nothing do with listings per se; mortgage calculators, real estate agents, and UI conveniences and the like. More and more get added over time.
  2. We want to delegate different kinds of searches to completely different services and data stores.
  3. The new APIs will only return bare-bones search results, without all the extra baggage.
  4. We don’t want to rewrite the bolted-on features anew for each data source; we just want to write them once, and have them work for listings extracted from anywhere.


The trouble is of course, that all of the current logic is hardwired to the current implementation, which extracts the necessary information from the old legacy data store. How can we repurpose the existing features such that they can be added and removed independently of other code, and without requiring duplication for each data store, and without an expensive rewrite of the codebase?

Composing programs

First, let’s consider the idea of composition in software, and how it helps us.

Often systems grow over time with a layering of new concepts over old; the two live side by side, and both must be understood to be used. Eventually there will be a stack of concepts, each a different market stall, hawking its wares to the overwhelmed maintainer. And so time brings complexity in tow.

On the other hand, when we say that modules or processes are composable, it means we are able to combine them in self-similar ways that do not add to the cognitive burden of the system; the composite forms are indistinguishable from their component parts. This is a powerful concept, because it implies that the software has an obvious avenue of growth that does not introduce greater complexity.

This is nothing new so far—this is also a staple of Object Oriented modelling, after all! We may wonder though: how can we capture this property? What is the essence that makes something composable?

Monoids: the essence of composition

The idea of combining two things into a self-similar composite suggests a type signature:

(M,M) -> M

This notation represents a function that takes two Ms as arguments, and returns an M result. A simple thing then—two Ms become one.  We’ll call this compose, or use the conventional infix operator ° as shorthand.

We make things simpler for ourselves if we can say that composition should be associative, so that:

(a ∘ b) ∘ c == a ∘ (b ∘ c)

This lets us notationally omit grouping brackets.

Instantly, we have a building block:

m1 ∘ m2 ∘ m3 ∘ m4 ∘ m5

Each atom, layer, and the final result have the same type, and may be treated in the same way. However, there is still an action taking place; if we have a way to neutralise the operation, then we can build from it with greater flexibility.

id: M
where (id ∘ M) == (M ∘ id) == M

When we have these two things—an associative binary function and an identity element—we have the essential ingredients of composition.

compose: (M,M) -> M
id: M

In abstract algebra, this structure is called a monoid. It is a simple concept, but profound; it is nothing less than the fundamental notion of category theory! [1]

Category theory may have a fearsome reputation, but not really for complexity; in fact it is the very simplicity of this scalable composition mechanism that allows the most terrifying and alien concepts and meta-concepts to be toppled to the ground, and discussed with the very same vocabulary as the most mundane.

This is precisely what it brings to software development. We don’t want ever-growing towers of concepts and meta-concepts any more than mathematicians do!

Common monoid examples

Monoids are extremely common in nature and in software; it will benefit us to examine some garden varieties that constantly crop up in everyday software development. Notice that in each case, we are building with raw materials, but the results let us retain the same notation and vocabulary:

Integers with addition:

1 + 2 + 3 + 4 == 10
0 + 77 + 0 == 77

Integers again, with multiplication:

2 * 3 * 4 == 24
1 * 444 * 1 == 444

Lists with appending:

[1,2] ++ [3,4] ++ [5,6] == [1,2,4,3,5,6]
[] ++ [9,9,9] ++ [] == [9,9,9]

Functions A -> A [2], with composition:

(λa -> a+1) ∘ (λa -> a*2) ∘ (λa -> a*a-1) == (λa -> ((a*a-1) * 2) + 1)
(λa -> a) ∘ (λa -> a+777) ∘ (λa -> a) == (λa -> a+777)

Can you see the pattern? While we are performing a different operation in each case, in fact all these examples exhibit the exact same compositional behaviour. We can forget the messy specifics of each data type, and just consider them generically as monoids.

Fussier composition

Assuming that function composition requires no explanation, we know that functions compose associatively too:

f ∘ (g ∘ h) == (f ∘ g) ∘ h

And of course, the identity function has no effect when composed with any other function.

We saw that functions A -> A are monoids, but general functions A -> B are fussier; they can’t be piped together unless the types line up! For instance A -> B can compose with B -> C, but cannot be connected to an X -> Z.

compose: (Fn<B,C>, Fn<A,B>) -> Fn<A,C>
id: Fn<A,A>

We see that functions behave a bit like monoids, but generalised to allow restrictions on composition. This more general concept, represented here by “Fn” is called a category; it captures the essence of what makes functions composable, without getting bogged down in their specific details.

From functions to functors

Let’s consider a similar concept, that we frequently come across in day-to-day software: containers or processes that produce values we can map over.

A trivial example is a simple value of some type A; applying a function A -> B naturally turns it into a B.

A more interesting case is List; if we know how to turn A -> B, then we know how to turn a List<A> -> List<B>; we just iterate through the list, apply the function to each element, and accumulate the resulting list.

functor list

Binary trees too, for example; a function from A -> B is enough information for us to convert a Tree<A> into a Tree<B>.

functor tree

Data structures are one thing, but what about processes that produce a result? Take a function Input -> A. The composition that we looked at before gives us the answer already; a function from A -> B will give us another function Input -> B. So even though they are not a fixed structure, functions have this same mapping behaviour over their output! Notice the difference in behaviour though; rather than directly applying the mapping function, it is composed and stored for later.



interface OutputGiver<A> {
    public A giveOutput(Input input);

    public default <B> OutputGiver<B> map(Function<A,B> f) {
        return input -> f.apply(giveOutput(input));

A structure or computation that can be mapped over in this way is called a functor [3]. We can see, in fact, that the underlying mechanism that allows some F<A> to be a functor is if the generic parameter is only used for “output”. If it’s used in an “input” position, then it just won’t fit. [4]

As it happens, functors are also composable; if F<?> and G<?> are functors, then F<G<?>> is a functor too! But why are they composable? What is the underlying principle that makes them so?

Firstly, we’ve seen that composing two functors F<?> and G<?> will give us a single resulting functor F<G<?>>.

Secondly, the “single value” functor in the first example is called the “Identity functor”; when composed, the structure will be basically unchanged:

Id<F<A>> ≅ F<Id<A>> ≅ F<A> [5]

Do these two conditions now remind you of a pattern you’ve already seen?


Looking more closely at computations where the type variable is exclusively used in an “input” position, we see that a similar pattern emerges.

If we take the simple function A -> Output, how can we turn it into a B -> Output? A function A -> B won’t do the trick, because it doesn’t give us any ability to operate on the incoming B value. We can however, use a flipped function B -> A!


interface InputTaker<A> {
    public Output takeInput(A a);

    public default <B> InputTaker<B> contraMap(Function<B,A> f) {
        return b -> takeInput(f.apply(b));

A computation that can be “contramapped” over like this is called a contravariant functor; it composes in the same way that regular functors do. [6]

This is really useful, because sometimes the type parameter just has to be in an “input”  position; now we can change the structure in a compositional, manageable way.

Category theory applied to the problem

If you remember, we needed to find a way to add features to listings in a sustainable and modular way; we only want to write the feature logic once, regardless of which service provided the data in whatever format.

A monoid appears

When considering the problem, we first recognised that there was a basic A -> A monoid pattern in the way that each feature modifies the search results:

interface Extension {
    Results apply(Results results);

This suggested to us that there might be a way to compose any number of extensions to the results processing without making the system harder to understand. Monoid functionality could be immediately added:

interface Extension {
    Results apply(Results results);

    final default Extension compose(Extension e) {
        return results -> e.apply(Extension.this.apply(results));

    static final Extension IDENTITY = r -> r;
    static Extension composeAll(Extension... extensions) {
        return Stream.of(extensions).reduce(IDENTITY, Extension::compose);

Now, using composeAll(), multiple extensions running in sequence reduce to a single one!

Composing with input

We’re not done yet, though; in practice each extension will need some kind of input to operate on. The bit that adds real estate agent data will need the agent ID and the AgentService; the bit that adds pretty URLs will need the PrettyUrlCreator, and so forth. Fortunately though, we knew that a function shaped (S, A) -> A is still a monoid. So we could add a “source” parameter, without giving up the compositional properties we want:

interface Extension<S> {

    Results apply(S source, Results results);

    final default Extension<S> compose(Extension<S> e) {
        return (src, results) -> e.apply(src, Extension.this.apply(src, results));

    static final Extension IDENTITY = (s, r) -> r;

    static <S> Extension<S> composeAll(Extension<S>... extensions) {
        return Stream.of(extensions).reduce(IDENTITY, Extension::compose);

Contramapping input

There’s a key piece of the puzzle missing though; all these disparate services and data stores are going to return their own special formats. However, each extension only knows about its own specific dependencies:


How can we fish out the input we need to run our generic extensions? We want to reduce the datastore-specific code to some minimal glue code, so that everything important happens independently of them. So we need a way to widen the input required, so that, say, an Extension<Banana> can become an Extension<GroceryStore> if we know how to buy the banana from the store.

If we look carefully at the (S, Results) -> Results shape of our Extension type, we can see that the S parameter is only used as input, never as output. Of course, this means that Extension<S> is a contravariant functor over S!

By adding a contraMap method, our generic extensions can be adapted to any particular data store, given a function that can connect the inputs.

interface Extension<S> {

    Results apply(S source, Results results);

    final default Extension<S> compose(Extension<S> e) {
        return (src, results) -> e.apply(src, Extension.this.apply(src, results));

    static final Extension IDENTITY = (s, r) -> r;

    static <S> Extension<S> composeAll(Extension<S>... extensions) {
        return Stream.of(extensions).reduce(IDENTITY, Extension::compose);

    default <T> Extension<T> contraMap(Function<T, S> f) {
        return (T src, Results results) -> Extension.this.apply(f.apply(src), results);

Once contraMap has widened all of the extension to uniformly require the data source output, they can be collapsed into a single extension, using monoid composition.

There we have it! An elegant compositional architecture with the properties we need:

  1. Extensions can be written once, as independent features.
  2. Input functions can be composed horizontally, to make all the extensions take the same datasource-specific input.
  3. The stack of extensions can be vertically collapsed down to a single composite.
  4. The composite extension gets slotted into the existing code somewhere, as one line.

What has category theory bought us?

We started with an important, mature Java codebase with extremely high traffic and rate of development. It was starting to become legacy; modifications were harder, and there was talk about deprecating and replacing it. A new requirement to use many differing data-stores underneath left it at the crossroads; we needed a clean design, or else the whole thing would have to be dismantled, at great risk and expense.

We finally came up with a design, using category theory as a guiding principle. It allowed us to revitalise the existing codebase, allowing continued development while ensuring a sustainable architecture. It is now running in production with over 50 generic extensions wired in, using two different data stores. It saved us a whole lot of time and money.

There are many ways to arrive at a good design, but navigating the endless void of the solution space is often the hardest part; simple solutions are frequently far from obvious. Category theory offers a ready-made repertoire of concepts and vocabulary that can help light a path toward a good design; we can think of it as “the study of composable abstractions”. Exposing this very essence of composition provides insights into software design that are not as clearly articulated elsewhere.

The fact that we’ve successfully brought category theory principles to bear with a such a thoroughly blue-collar language as Java is worth thinking about. Far from being the preserve of researchers and Haskell programmers, these principles apply universally, and can produce simple, clean modules that are easily understood by programmers with no knowledge of the theory. It is well worth the investment of familiarising yourself with the fundamental ideas.

Further reading

The category design pattern (article)​ – Gabriel Gonzalez

Scalable program architectures (article) – Gabriel Gonzalez

Category Theory for Programmers (online book) – Bartosz Milewski

Category Theory for Beginners (talk, slides) – Ken Scambler


[1] Mac Lane, Categories for the Working Mathematician [back]

[2] Functions with the same input/output type are known as endofunctions, or just “endos” to the lazy. [back]

[3] In the maths, functors map between categories, in a similar way to how functions map between sets. When applied to programming, in fact everything is inside the one category of types and functions, so we can more properly call them endofunctors; just like endofunctions, they map from and to the same thing (categories/types, respectively). [back]

[4] The “input” and “output” positions are often called negative and positive positions, respectively. [back]

[5] By the hand-wavy “basically unchanged”, I actually mean isomorphic, which has a precise definition: there exists a one-to-one mapping between them, so that if you go back-and-forth either way, you get the same result. The two types are the “same size”. [back]

[6] Functors can also be called covariant functors, by way of contrast with contravariant functors. [back]