What are these Applicatives you speak of?

Introduction

In this blog post, I'm going to provide a very simple explanation for Applicatives (aka "Applicative Functors") just the way I know them. I'm not going to cover the math behind it, or the laws which applicatives must obey.

I've taken a reverse approach compared to many other posts explaining Applicatives: rather than starting with what Applicatives are, I'm going to start with some examples to demonstrate the need for them, then I'll show how Applicatives can be used and at the end, I'll briefly cover how they can be implemented.

All the examples are in Scala, using types such as Option, Scalaz Either, List, Stream, Future, etc., which I refer to as Wrapper types in the rest of this write-up. For example, Option[String] is a Wrapper of String.

Here at REA Group, we capture data for all the microservices we have in production. We capture the name of each microservice (aka system), the team which owns it, and the health (status) of each system. I'm going to use these in all my examples. Let's assume we have models defined for systems, owners and system statuses:
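A minimal sketch of what such models could look like. The field names (id, name, ownerId, healthy) are assumptions made for illustration, and the enclosing object is there only so the snippet compiles on its own:

```scala
object ModelsSketch {
  // A microservice, plus the ID of the team that owns it.
  final case class System(id: String, name: String, ownerId: String)

  // A team which owns one or more systems.
  final case class Owner(id: String, name: String)

  // The current health of a system (hypothetical shape).
  final case class SystemStatus(systemId: String, healthy: Boolean)
}
```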

flatMapping for sequential operations

Before I get to examples of Applicatives, I'm going to provide an example use of flatMap. Further down, I'll build on this example to demonstrate the need for Applicatives.

flatMap is used to "sequentially" compose two instances of the same wrapper type, i.e. when one operation depends on the output of another.

Let's assume we have a Map (aka dictionary) from system IDs to system objects, and another Map from owner IDs to owner objects:

Note that Map.get(key) always returns an optional (Option) value which is present if the key was found in the Map (Some), and absent otherwise (None), e.g.:
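Here is a small sketch of the two Maps and of what Map.get returns. The sample IDs and names are made up, and the model fields are assumed as before:

```scala
object MapsSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class Owner(id: String, name: String)

  // Hypothetical sample data, keyed by ID.
  val systemsMap: Map[String, System] = Map(
    "sys-1" -> System("sys-1", "Search", "team-a"),
    "sys-2" -> System("sys-2", "Listings", "team-x") // team-x is not in ownersMap
  )
  val ownersMap: Map[String, Owner] = Map(
    "team-a" -> Owner("team-a", "Team A")
  )

  val found: Option[System]   = systemsMap.get("sys-1") // Some(System(sys-1,Search,team-a))
  val missing: Option[System] = systemsMap.get("nope")  // None
}
```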

Now, using a combination of maps and flatMaps, we can write a function which given a system ID returns the owner name for that system, if any:

We needed to use flatMap here because we have two Maps, one for systems and one for owners, and to get the optional owner from the owners Map we first need to get an optional system from the systems Map (sequential dependency).

This function can then be used to get the owner name for different system IDs:
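One way the function and its use could look, as a sketch over the same assumed models and made-up sample data:

```scala
object OwnerNameSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class Owner(id: String, name: String)

  val systemsMap: Map[String, System] = Map(
    "sys-1" -> System("sys-1", "Search", "team-a"),
    "sys-2" -> System("sys-2", "Listings", "team-x") // team-x is not in ownersMap
  )
  val ownersMap: Map[String, Owner] = Map("team-a" -> Owner("team-a", "Team A"))

  // The owner lookup needs the result of the system lookup, hence flatMap.
  def getOwnerName(systemId: String): Option[String] =
    systemsMap
      .get(systemId)
      .flatMap(system => ownersMap.get(system.ownerId))
      .map(owner => owner.name)

  val known: Option[String]   = getOwnerName("sys-1") // Some(Team A)
  val noOwner: Option[String] = getOwnerName("sys-2") // None: the owner ID has no entry
  val unknown: Option[String] = getOwnerName("nope")  // None: the system ID has no entry
}
```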

Independent operations

Now, what if we had to get values out of two Maps which had no dependency on each other? For example, let's assume that in addition to the systemsMap above, we have a systemStatusMap which gives us the current status of each system.

Now assume that given a system ID, we want to display both the system information (e.g. name) and status on a page. Let's first define a case class for the combination of system and system status:

A given system ID might not exist in systemsMap or systemStatusMap, so again we need to deal with Optional values:
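The case class and the two optional lookups might look like the sketch below. The names systemOption and systemStatusOption match the ones used in the next section; the sample data and model fields are assumptions:

```scala
object IndependentLookupsSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class SystemStatus(systemId: String, healthy: Boolean)

  // The combination we want to display on the page.
  final case class SystemStatusInfo(system: System, status: SystemStatus)

  val systemsMap: Map[String, System] =
    Map("sys-1" -> System("sys-1", "Search", "team-a"))
  val systemStatusMap: Map[String, SystemStatus] =
    Map("sys-1" -> SystemStatus("sys-1", healthy = true))

  val systemId = "sys-1"

  // Both lookups use the same ID; neither needs the other's result.
  val systemOption: Option[System]             = systemsMap.get(systemId)
  val systemStatusOption: Option[SystemStatus] = systemStatusMap.get(systemId)
}
```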

This time, getting system name and status are independent (you don't need the output of one as the input of another).

Using flatMap for independent operations

Still though, we "can" use flatMap:

Note that we need to flatMap over systemOption to be able to access the system value when creating the SystemStatusInfo instance, despite the fact that we don't actually need it when we get systemStatusOption from systemStatusMap.

We can use this function to extract SystemStatusInfo for any given system ID:
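A sketch of the nested flatMap/map version together with a couple of calls, again over assumed models and sample data:

```scala
object NestedFlatMapSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class SystemStatus(systemId: String, healthy: Boolean)
  final case class SystemStatusInfo(system: System, status: SystemStatus)

  val systemsMap: Map[String, System] = Map(
    "sys-1" -> System("sys-1", "Search", "team-a"),
    "sys-2" -> System("sys-2", "Listings", "team-x")
  )
  // Hypothetical: only sys-1 has a recorded status.
  val systemStatusMap: Map[String, SystemStatus] =
    Map("sys-1" -> SystemStatus("sys-1", healthy = true))

  def getSystemStatusInfo(systemId: String): Option[SystemStatusInfo] = {
    val systemOption       = systemsMap.get(systemId)
    val systemStatusOption = systemStatusMap.get(systemId)
    // flatMap is only here to bring `system` into scope for the constructor.
    systemOption.flatMap { system =>
      systemStatusOption.map { status =>
        SystemStatusInfo(system, status)
      }
    }
  }

  val both: Option[SystemStatusInfo]     = getSystemStatusInfo("sys-1") // Some(...)
  val noStatus: Option[SystemStatusInfo] = getSystemStatusInfo("sys-2") // None
}
```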

This implementation of getSystemStatusInfo looks unnecessarily nested and complex. But don't fear, we can rewrite it using a for comprehension:

It looks much nicer now, but the fact remains that unlike the implementation of getOwnerName above, here we're not using the output of the first line (system) in the second line (systemStatus). In fact, we can change the order of the lines and the result would be the same:
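Both orderings can be sketched side by side. The name getSystemStatusInfoSwapped is invented here purely so the two versions can coexist in one snippet:

```scala
object ForComprehensionSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class SystemStatus(systemId: String, healthy: Boolean)
  final case class SystemStatusInfo(system: System, status: SystemStatus)

  val systemsMap: Map[String, System] = Map(
    "sys-1" -> System("sys-1", "Search", "team-a"),
    "sys-2" -> System("sys-2", "Listings", "team-x")
  )
  val systemStatusMap: Map[String, SystemStatus] =
    Map("sys-1" -> SystemStatus("sys-1", healthy = true))

  // System first, then status.
  def getSystemStatusInfo(systemId: String): Option[SystemStatusInfo] =
    for {
      system <- systemsMap.get(systemId)
      status <- systemStatusMap.get(systemId)
    } yield SystemStatusInfo(system, status)

  // Status first, then system: same result, since neither line uses the other.
  def getSystemStatusInfoSwapped(systemId: String): Option[SystemStatusInfo] =
    for {
      status <- systemStatusMap.get(systemId)
      system <- systemsMap.get(systemId)
    } yield SystemStatusInfo(system, status)
}
```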

It would be good if we could somehow "declare" in our code that these two operations are independent. I hear you saying "would it really? How does that matter?". Well, it probably doesn't for getting things out of Maps in memory, or for synchronous tasks in general. But let's assume resolving an ID to a system requires making an HTTP request, e.g. to a Systems API, and that checking the status of a system in real time similarly requires a request to a Systems' Status API.

Asynchronous operations

Let's re-do our example with the assumption that fetching system data and system status are asynchronous:
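As a sketch, the two fetch functions could have the signatures below. The names fetchSystem and fetchSystemStatus are assumed, and the bodies are in-memory stand-ins for real HTTP calls, so that the snippet runs without a server:

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

object AsyncFetchSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class SystemStatus(systemId: String, healthy: Boolean)

  private val systemsMap: Map[String, System] =
    Map("sys-1" -> System("sys-1", "Search", "team-a"))
  private val systemStatusMap: Map[String, SystemStatus] =
    Map("sys-1" -> SystemStatus("sys-1", healthy = true))

  // Stand-in for a call to the Systems API.
  // An unknown ID makes the lookup throw, which fails the Future.
  def fetchSystem(systemId: String): Future[System] =
    Future(systemsMap(systemId))

  // Stand-in for a call to the Systems' Status API.
  def fetchSystemStatus(systemId: String): Future[SystemStatus] =
    Future(systemStatusMap(systemId))
}
```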

These functions return Scala Future objects instead of Options. Futures simply represent asynchronous operations. We can still see them as Wrapper types though: Option[System] has been replaced with Future[System]. So the thing inside still has the same type (System) and only the wrapper type is different.

With this, we can rewrite the getSystemStatusInfo function:
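A sketch of the Future-based version, using the same assumed fetch functions (in-memory stand-ins for the HTTP calls):

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

object SequentialFuturesSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class SystemStatus(systemId: String, healthy: Boolean)
  final case class SystemStatusInfo(system: System, status: SystemStatus)

  // Hypothetical stand-ins for the two HTTP calls.
  def fetchSystem(systemId: String): Future[System] =
    Future(System(systemId, "Search", "team-a"))
  def fetchSystemStatus(systemId: String): Future[SystemStatus] =
    Future(SystemStatus(systemId, healthy = true))

  def getSystemStatusInfo(systemId: String): Future[SystemStatusInfo] =
    for {
      system <- fetchSystem(systemId)
      // This call sits inside flatMap, so it only starts once `system` has resolved.
      status <- fetchSystemStatus(systemId)
    } yield SystemStatusInfo(system, status)
}
```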

The import on the first line is needed because mapping or flatMapping over Futures requires an implicit ExecutionContext.

This implementation looks very similar to the one which got stuff out of Maps. The only differences are in the return type of the function and the calls on the right hand side of for-comprehension assignments.

But with Futures, the problem with running the two operations sequentially becomes more obvious. Here, although we don't need the system data object in order to fetch the system status, the request to the Systems' Status API is only sent once the Systems API response has returned, so:

Get System -> Get System Status -> Join The Results

Instead of:

----- Get System ------|
|                      |-> Join The Results
-- Get System Status --|

Parallel execution of asynchronous operations

There are ways to run Futures in parallel, so let's do that:
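One such way is sketched below with Future.zip. The fetch functions are the same assumed in-memory stand-ins as before:

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

object ParallelFuturesSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class SystemStatus(systemId: String, healthy: Boolean)
  final case class SystemStatusInfo(system: System, status: SystemStatus)

  // Hypothetical stand-ins for the two HTTP calls.
  def fetchSystem(systemId: String): Future[System] =
    Future(System(systemId, "Search", "team-a"))
  def fetchSystemStatus(systemId: String): Future[SystemStatus] =
    Future(SystemStatus(systemId, healthy = true))

  def getSystemStatusInfo(systemId: String): Future[SystemStatusInfo] =
    fetchSystem(systemId)
      // The argument to zip is evaluated eagerly, so both Futures are already running here.
      .zip(fetchSystemStatus(systemId))
      .map { case (system, status) => SystemStatusInfo(system, status) }
}
```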

Note that zip is a method on Future which lets us combine two futures and return a Future of a tuple (pair) of the values from each future.

This implementation works, but it is very specific to Futures. What if I wanted to replace Future with the Scalaz or Monix Task types?

Extracting Future-specific parts out

So, let's try to extract out the future-specific parts of the implementation above:
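A sketch of such a helper, named map2 as in the description that follows; the parameter names are my own choice:

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

object FutureMap2Sketch {
  // Joins two already-running Futures and applies `combine` to their results.
  def map2[A, B, C](fa: Future[A], fb: Future[B])(combine: (A, B) => C): Future[C] =
    fa.zip(fb).map { case (a, b) => combine(a, b) }
}
```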

map2 accepts:

  • Two Future objects, one with type parameter A and one with type parameter B (A and B are the type of the things which would be asynchronously resolved by these futures).
  • And a combine function from A and B to C ((A, B) => C)

It returns a Future[C] which is the result of joining the two futures with the combine function applied to values.

Now, getSystemStatusInfo can be rewritten as:
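With a map2 helper for Futures in place, the rewrite could look like this sketch (fetch functions and models are the same assumed stand-ins):

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

object Map2UsageSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class SystemStatus(systemId: String, healthy: Boolean)
  final case class SystemStatusInfo(system: System, status: SystemStatus)

  def map2[A, B, C](fa: Future[A], fb: Future[B])(combine: (A, B) => C): Future[C] =
    fa.zip(fb).map { case (a, b) => combine(a, b) }

  // Hypothetical stand-ins for the two HTTP calls.
  def fetchSystem(systemId: String): Future[System] =
    Future(System(systemId, "Search", "team-a"))
  def fetchSystemStatus(systemId: String): Future[SystemStatus] =
    Future(SystemStatus(systemId, healthy = true))

  // No Future-specific method is called here any more; only map2 knows about zip.
  def getSystemStatusInfo(systemId: String): Future[SystemStatusInfo] =
    map2(fetchSystem(systemId), fetchSystemStatus(systemId)) { (system, status) =>
      SystemStatusInfo(system, status)
    }
}
```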

Replacing one type of operation with another

Now if I wanted to replace Future with a different type like Monix Task, I would need to define a similar map2 function for Task:

Somewhat similar to Future.zip, Task.mapBoth lets us combine two independent Tasks.

Now a Task version of getSystemStatusInfo can be written as:

I'm assuming here that we've already defined fetchSystemTask and fetchSystemStatusTask functions which make HTTP requests to fetch system data and status, but return Tasks instead of Futures.

Defining Applicatives

Similar to flatMap, map2 above gives us a way to say that two operations need to be composed. The difference is that we're using map2 to declare that the two operations being composed are "independent from each other", whereas flatMap is used for sequential composition of "things".

The two implementations of getSystemStatusInfo above (one using Task and one using Future) look very much the same. The differences are in:

  • Wrapper types, i.e. Future vs. Task
  • The functions used for fetching system and system status

Surely there is a way to extract out the common bits.

That's where the Applicative "typeclass" comes in. If I could define a type like this:

Which provides the map2 function for a Wrapper type F, then I can rewrite my getSystemStatusInfo function as:
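A sketch of the simplified Applicative type and the generic function. This is the cut-down, map2-only shape used in this section, not the definition found in Cats or Scalaz:

```scala
object GenericSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class SystemStatus(systemId: String, healthy: Boolean)
  final case class SystemStatusInfo(system: System, status: SystemStatus)

  // "F knows how to combine two independent F values."
  trait Applicative[F[_]] {
    def map2[A, B, C](fa: F[A], fb: F[B])(f: (A, B) => C): F[C]
  }

  // Works for any wrapper F with an Applicative instance in implicit scope.
  def getSystemStatusInfo[F[_]: Applicative](
      fetchSystem: F[System],
      fetchSystemStatus: F[SystemStatus]
  ): F[SystemStatusInfo] =
    implicitly[Applicative[F]].map2(fetchSystem, fetchSystemStatus) { (system, status) =>
      SystemStatusInfo(system, status)
    }
}
```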

Here, F[_]: Applicative means "for any Wrapper type F, for which there is an implementation of Applicative in the scope where getSystemStatusInfo is called".

getSystemStatusInfo has been refactored to accept fetchSystem and fetchSystemStatus as arguments of type F[System] and F[SystemStatus]. The fetch functions differ across the implementations of getSystemStatusInfo, so they need to be passed in.

Now as long as an Applicative has been implemented for Future and is available in the caller's scope, we can use getSystemStatusInfo with Futures:
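A sketch of a Future instance and a call site. It uses the simplified map2-only Applicative from this section; futureApplicative and example are names I made up, and the fetch functions are in-memory stand-ins:

```scala
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{ExecutionContext, Future}

object FutureUsageSketch {
  final case class System(id: String, name: String, ownerId: String)
  final case class SystemStatus(systemId: String, healthy: Boolean)
  final case class SystemStatusInfo(system: System, status: SystemStatus)

  trait Applicative[F[_]] {
    def map2[A, B, C](fa: F[A], fb: F[B])(f: (A, B) => C): F[C]
  }

  def getSystemStatusInfo[F[_]: Applicative](
      fetchSystem: F[System],
      fetchSystemStatus: F[SystemStatus]
  ): F[SystemStatusInfo] =
    implicitly[Applicative[F]].map2(fetchSystem, fetchSystemStatus)(SystemStatusInfo(_, _))

  // Applicative instance for Future, built on zip.
  implicit def futureApplicative(implicit ec: ExecutionContext): Applicative[Future] =
    new Applicative[Future] {
      def map2[A, B, C](fa: Future[A], fb: Future[B])(f: (A, B) => C): Future[C] =
        fa.zip(fb).map { case (a, b) => f(a, b) }
    }

  def fetchSystem(systemId: String): Future[System] =
    Future(System(systemId, "Search", "team-a"))
  def fetchSystemStatus(systemId: String): Future[SystemStatus] =
    Future(SystemStatus(systemId, healthy = true))

  // Both fetches are started while the arguments are evaluated, then combined.
  def example: Future[SystemStatusInfo] =
    getSystemStatusInfo(fetchSystem("sys-1"), fetchSystemStatus("sys-1"))
}
```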

This can easily be switched to use Tasks without modifying the getSystemStatusInfo function implementation. Interestingly, this can even be switched to use Options if we end up with system and system status data loaded in memory. Again, we only need to provide an Applicative instance for Options.

This implementation of getSystemStatusInfo exposes the least knowledge about the type of fetchSystem and fetchSystemStatus. It only requires them to be applicatives, i.e. "independent operations which can be composed". There isn't much that can go wrong with it, which suggests it's using the "right abstraction".

In the same way you wouldn't use ArrayList to declare the type of an argument for a function which only requires an abstract Collection, you wouldn't use concrete Future or Task types when the only thing you need is an Applicative.

Implementing Applicatives

In reality, the Applicative type is defined somewhat differently from what I've explained above. To implement an Applicative for any wrapper type F, you need to implement the following two functions:

  • ap: Applicative extends the Apply type, which is implemented by providing an ap function that accepts an F[A => B] (a wrapper of a function) and an F[A], and returns an F[B].
  • pure: Turns a simple value of type A into an F[A].

For example, an Applicative instance for Option type can be implemented as below:
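A sketch of such an instance, against hand-rolled Apply and Applicative traits that follow the description above. They are simplified: real libraries such as Cats also require map and fix their own argument order for ap.

```scala
object OptionApplicativeSketch {
  trait Apply[F[_]] {
    // Applies a wrapped function to a wrapped value.
    def ap[A, B](ff: F[A => B])(fa: F[A]): F[B]
  }

  trait Applicative[F[_]] extends Apply[F] {
    // Lifts a plain value into the wrapper.
    def pure[A](a: A): F[A]
  }

  implicit val optionApplicative: Applicative[Option] = new Applicative[Option] {
    def ap[A, B](ff: Option[A => B])(fa: Option[A]): Option[B] =
      (ff, fa) match {
        case (Some(f), Some(a)) => Some(f(a)) // both present: apply the function
        case _                  => None       // either one missing: nothing to apply
      }

    def pure[A](a: A): Option[A] = Some(a)
  }
}
```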

Similarly, an Applicative instance for Monix Task can be implemented as:

The Applicative type then implements the map2 function using ap and pure:
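One way map2 can be derived from ap and pure is sketched below, on the same simplified traits. The intermediate names lifted and partial are mine, and the Option instance is only there to exercise the derived method:

```scala
object DerivedMap2Sketch {
  trait Apply[F[_]] {
    def ap[A, B](ff: F[A => B])(fa: F[A]): F[B]
  }

  trait Applicative[F[_]] extends Apply[F] {
    def pure[A](a: A): F[A]

    // map2 comes for free once ap and pure are defined.
    def map2[A, B, C](fa: F[A], fb: F[B])(f: (A, B) => C): F[C] = {
      val lifted: F[A => B => C] = pure(f.curried) // wrap the curried function
      val partial: F[B => C]     = ap(lifted)(fa)  // feed it the first value
      ap(partial)(fb)                              // then the second
    }
  }

  val optionApplicative: Applicative[Option] = new Applicative[Option] {
    def ap[A, B](ff: Option[A => B])(fa: Option[A]): Option[B] =
      (ff, fa) match {
        case (Some(f), Some(a)) => Some(f(a))
        case _                  => None
      }
    def pure[A](a: A): Option[A] = Some(a)
  }
}
```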

Scala libraries such as Cats define Applicative as a type class and provide Applicative instances for Option, Future, etc.