There’s a pattern that I keep recommending to teams, because it helps separate concerns around I/O: sending and receiving things over a network, interacting with AWS, saving and loading things from data stores. It’s an old idea; you’ll find it filed under “Decorator” in your Gang of Four book.
For our purposes here, this is a compositional pattern that allows us to stack onion-rings of concern around an I/O boundary: strings on the wire, transfer protocols (e.g. HTTP), formats (e.g. JSON), business concepts. No code should be concerned with more than one of these at once; they can be structured as stackable layers, and other modules can pick which level of detail they wish to talk to.
Before I get into the solution, let’s motivate it with a problem; we’ll take a bit of code that has suspect module boundaries:
Given a way to transform listings, and an asynchronous action on strings, this class will compose these actions with some bridging logic in a way that makes sense for what we know about listings.
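Here is a minimal sketch of what a class like that might look like. The Listing and FullListing shapes, the hand-rolled JSON rendering, and the empty-batch check standing in for the "bridging logic" are all assumptions for illustration.

```scala
import scala.concurrent.Future

// Hypothetical domain types, assumed for illustration.
final case class Listing(id: Int, title: String)
final case class FullListing(id: Int, title: String, summary: String)

// Business logic that also knows its output is a JSON string.
class ListingProcessor(
    transform: Listing => FullListing,
    save: String => Future[Unit]
) {

  // Hand-rolled rendering, standing in for a real JSON library.
  private def render(l: FullListing): String =
    s"""{"id":${l.id},"title":"${l.title}","summary":"${l.summary}"}"""

  def process(listings: Seq[Listing]): Future[Unit] = {
    val full = listings.map(transform)
    // Assumed bridging logic: nothing to save for an empty batch.
    if (full.isEmpty) Future.unit
    else save(full.map(render).mkString("[", ",", "]"))
  }
}
```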
The thing with strings
There’s a problem here though – why are we saving strings? What strings? The implementation tells us that we are rendering the Seq[FullListing] into JSON, then into a String, so we know that we are specifically sending a JSON string to be saved. But why? Why does this live with the logic? If you want to change your transmission mechanism or protocol, then you have to dig into this business logic here, on top of everywhere else that knows about this implementation decision.
Testing is hard
Let’s not forget about the tests; how do we test this module? The question I always come back to is:
What’s this for? What is the problem that makes you say “I know! I’ll use this thing!” to solve it?
I guess, according to our implementation, the problem is “I already have a way to transform listings and a place to put FullListing JSON, but I need a way to put them together with some extra logic”. It’s a clumsy description, and this already suggests that things aren’t quite right. Our tests need to reflect what the code “is for”. This means we need to:
- Prove that inputs got “transformed” and “saved” in a meaningful way
- Prove that the logic happened in an expected way
Tests work best when you can reason about them in terms of input and output; this is easiest in functionally pure code, but otherwise it could be “check the state of the world before/after executing side effects”, “observe side effects that were performed in the world”, and that sort of thing. We should generally expect that the classes that exist for a purpose are framed to allow observation of this purpose. Don’t make implementation details public, or blast holes in the hull with mocking libraries; these are cop-outs that paper over problems with the module boundaries.
Here, because the “output” is some string that gets sent to the “save” action, we have to do some terrible things to test it. We have to know that JSON is being produced, what kind of JSON is being produced, and what kind of stringy goop is being squeezed out of the pasta grinder.
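A test in that style might look something like the sketch below. The compact ListingProcessor, the domain types and the StringlyTest name are assumptions made so the example stands alone.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

final case class Listing(id: Int, title: String)
final case class FullListing(id: Int, title: String, summary: String)

// Compact stand-in for the string-saving class under test.
class ListingProcessor(transform: Listing => FullListing, save: String => Future[Unit]) {
  def process(listings: Seq[Listing]): Future[Unit] =
    save(listings.map(transform).map { l =>
      s"""{"id":${l.id},"title":"${l.title}","summary":"${l.summary}"}"""
    }.mkString("[", ",", "]"))
}

object StringlyTest {
  def run(): Boolean = {
    var captured = List.empty[String] // hand-rolled spy on the save action
    val processor = new ListingProcessor(
      l => FullListing(l.id, l.title, "A " + l.title),
      s => { captured = s :: captured; Future.unit }
    )
    Await.result(processor.process(Seq(Listing(1, "flat"))), 5.seconds)
    // The assertion is welded to field order, quoting and whitespace.
    captured == List("""[{"id":1,"title":"flat","summary":"A flat"}]""")
  }
}
```

Any change to the rendering (a different field order, pretty-printing, a new JSON library) breaks this test even though the listing logic has not changed.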
Horrid. It is slobbering over intimate implementation details in the most unspeakable fashion. It’s hard to get a real feeling of what is really being tested here.
The thing we really needed in the class was “a place to put things” – so let’s define that.
The name “Sink” implies a place to put things, and not much more; if we want to imply that it retains the data, we could name it “Store” or something like that. We happen to have an asynchronous return value, which is what we need here, but you can clearly do what you want for your use case.
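As a rough sketch, the Sink and a ListingProcessor rewritten against it could look like this. The method name put and the domain types are assumptions, not necessarily what the real code used.

```scala
import scala.concurrent.Future

// A place to put things of type A; the Future signals completion.
trait Sink[A] {
  def put(a: A): Future[Unit]
}

// Hypothetical domain types, assumed for illustration.
final case class Listing(id: Int, title: String)
final case class FullListing(id: Int, title: String, summary: String)

// The processor now speaks only in listings.
class ListingProcessor(
    transform: Listing => FullListing,
    sink: Sink[Seq[FullListing]]
) {
  def process(listings: Seq[Listing]): Future[Unit] =
    sink.put(listings.map(transform))
}
```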
This class has been cleaned up nicely. Now, we no longer know anything about strings or JSON – but somebody has to. We can write a JsonSink that can render any type for which an Encoder has been defined, and it will pass the resulting string to an underlying Sink[String].
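A JsonSink along those lines might be sketched as follows. To keep the example self-contained, Encoder here is a hypothetical two-line stand-in for a JSON library's encoder type class, producing a String directly.

```scala
import scala.concurrent.Future

trait Sink[A] {
  def put(a: A): Future[Unit]
}

// Hypothetical stand-in for a JSON library's encoder type class.
trait Encoder[A] {
  def encode(a: A): String
}

object Encoder {
  implicit val intEncoder: Encoder[Int] = new Encoder[Int] {
    def encode(a: Int): String = a.toString
  }
  // Any sequence of encodable things becomes a JSON array.
  implicit def seqEncoder[A](implicit ea: Encoder[A]): Encoder[Seq[A]] =
    new Encoder[Seq[A]] {
      def encode(as: Seq[A]): String = as.map(ea.encode).mkString("[", ",", "]")
    }
}

// Decorator: accepts any encodable A, renders it, and delegates to a string sink.
class JsonSink[A](underlying: Sink[String])(implicit encoder: Encoder[A]) extends Sink[A] {
  def put(a: A): Future[Unit] = underlying.put(encoder.encode(a))
}
```

Given some Sink[String] called strings, `new JsonSink[Seq[Int]](strings).put(Seq(1, 2, 3))` hands the string `[1,2,3]` to the layer below.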
As it happens, the app wants the listings sent to an API. We can make one for that as well:
Great, so it takes strings and sends them away as request bodies to some preconfigured URL.
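One possible shape for such a sink, sketched with the JDK's built-in java.net.http client (Java 11+) and scala.jdk.FutureConverters (Scala 2.13+). The class name HttpSink, the choice of POST, and treating non-2xx responses as failures are assumptions.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import scala.concurrent.{ExecutionContext, Future}
import scala.jdk.FutureConverters._

trait Sink[A] {
  def put(a: A): Future[Unit]
}

// Sends each string as the body of a POST to a fixed URL.
// It knows about HTTP, and nothing about JSON or listings.
class HttpSink(url: URI, client: HttpClient = HttpClient.newHttpClient())(
    implicit ec: ExecutionContext
) extends Sink[String] {

  def put(body: String): Future[Unit] = {
    val request = HttpRequest
      .newBuilder(url)
      .POST(HttpRequest.BodyPublishers.ofString(body))
      .build()

    client
      .sendAsync(request, HttpResponse.BodyHandlers.discarding())
      .asScala
      .flatMap { response =>
        // Assumption: any non-2xx status counts as a failed put.
        if (response.statusCode() / 100 == 2) Future.unit
        else Future.failed(new RuntimeException(s"HTTP ${response.statusCode()} from $url"))
      }
  }
}
```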
The only place in the program that should know about all of these at once is the “wiring” place, which knows that it is in an app; nothing else should know it is in an app.
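The wiring might look roughly like the sketch below. So that it stands alone, the layers are cut down to the bare minimum: WireSink is a hypothetical stand-in for the HTTP layer, ListingJsonSink for the JSON layer, and the Wiring object and the trimming transform are invented for illustration.

```scala
import scala.concurrent.Future

trait Sink[A] { def put(a: A): Future[Unit] }

final case class Listing(id: Int, title: String)
final case class FullListing(id: Int, title: String)

// Innermost layer: strings on the wire (stand-in for an HTTP-backed sink).
class WireSink(send: String => Future[Unit]) extends Sink[String] {
  def put(body: String): Future[Unit] = send(body)
}

// Middle layer: listings rendered as JSON, then handed down as strings.
class ListingJsonSink(underlying: Sink[String]) extends Sink[Seq[FullListing]] {
  def put(ls: Seq[FullListing]): Future[Unit] =
    underlying.put(
      ls.map(l => s"""{"id":${l.id},"title":"${l.title}"}""").mkString("[", ",", "]")
    )
}

// Business logic, which sees only Sink[Seq[FullListing]].
class ListingProcessor(transform: Listing => FullListing, sink: Sink[Seq[FullListing]]) {
  def process(ls: Seq[Listing]): Future[Unit] = sink.put(ls.map(transform))
}

// The wiring: the one place that names every layer at once.
object Wiring {
  def processor(send: String => Future[Unit]): ListingProcessor =
    new ListingProcessor(
      l => FullListing(l.id, l.title.trim),
      new ListingJsonSink(new WireSink(send))
    )
}
```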
Easy! The JSON and HTTP concerns are now locked in layers, and nobody else needs to worry about them. But what about the tests?
Testing is easy
We can make a Sink for testing that remembers the inputs:
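A minimal sketch of such a sink, assuming the Sink trait has a single put method; the accessor name received is made up for the example.

```scala
import scala.concurrent.Future

trait Sink[A] { def put(a: A): Future[Unit] }

// Remembers everything it is given, in order.
class MemorySink[A] extends Sink[A] {
  private var items = Vector.empty[A]

  // Snapshot of what has been put so far.
  def received: Vector[A] = synchronized(items)

  def put(a: A): Future[Unit] = synchronized {
    items = items :+ a
    Future.unit
  }
}
```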
Sometimes people call this kind of thing a “mock”, but I don’t think this is a useful way to think of it. The ListingProcessor class has declared that it is agnostic as to which Sink implementation it is given; there is no sense in which a MemorySink is “less real” than the JSON/HTTP one or a database one. It has as much to do with puppydogs, or pancakes. Those things are non-concepts in our ListingProcessor universe. We are testing ListingProcessor in precisely the way it is supposed to be used.
Setup, execute, check. Very concise, very straightforward, and we are not stuffing around with endpoints, formats, or strings.
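As a sketch, the setup, execute, check shape could look like this. The types are repeated in compact form so the example stands alone, and ListingProcessorSpec is a plain object rather than any particular test framework.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

trait Sink[A] { def put(a: A): Future[Unit] }

class MemorySink[A] extends Sink[A] {
  @volatile var received: Vector[A] = Vector.empty
  def put(a: A): Future[Unit] = { received = received :+ a; Future.unit }
}

final case class Listing(id: Int, title: String)
final case class FullListing(id: Int, title: String, summary: String)

class ListingProcessor(transform: Listing => FullListing, sink: Sink[Seq[FullListing]]) {
  def process(ls: Seq[Listing]): Future[Unit] = sink.put(ls.map(transform))
}

object ListingProcessorSpec {
  def savesTransformedListings(): Boolean = {
    // Setup
    val sink = new MemorySink[Seq[FullListing]]
    val processor =
      new ListingProcessor(l => FullListing(l.id, l.title, "A " + l.title), sink)
    // Execute
    Await.result(processor.process(Seq(Listing(1, "flat"))), 5.seconds)
    // Check: plain values, no JSON and no strings.
    sink.received == Vector(Seq(FullListing(1, "flat", "A flat")))
  }
}
```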
We can make it even more abstract; here we don’t even care what the effect is.
Now, the ListingProcessor need not even care about asynchrony; it turns out it didn’t need it, and only wanted to flatMap over the Async type. Therefore, Monad is the lightest constraint that can be applied. Now, asynchrony is an application-level concern.
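A sketch of the effect-agnostic version follows. Monad here is a minimal hand-written stand-in (libraries such as Cats provide a full one), and saving listings one at a time is an assumption made so that there is something to flatMap over. The Sync object shows the payoff for tests: with an identity effect, nothing needs awaiting.

```scala
// Minimal stand-in for a Monad type class.
trait Monad[F[_]] {
  def pure[A](a: A): F[A]
  def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
}

// The sink is now parameterised by its effect as well as its input.
trait Sink[F[_], A] {
  def put(a: A): F[Unit]
}

final case class Listing(id: Int, title: String)
final case class FullListing(id: Int, title: String)

class ListingProcessor[F[_]](transform: Listing => FullListing, sink: Sink[F, FullListing])(
    implicit F: Monad[F]
) {
  // Assumption: listings are saved one at a time, each put sequenced after the last.
  def process(listings: List[Listing]): F[Unit] =
    listings.foldLeft(F.pure(())) { (done, listing) =>
      F.flatMap(done)(_ => sink.put(transform(listing)))
    }
}

// In tests the effect can be "no effect at all".
object Sync {
  type Id[A] = A

  implicit val idMonad: Monad[Id] = new Monad[Id] {
    def pure[A](a: A): Id[A] = a
    def flatMap[A, B](fa: Id[A])(f: A => Id[B]): Id[B] = f(fa)
  }

  class MemorySink[A] extends Sink[Id, A] {
    var received: Vector[A] = Vector.empty
    def put(a: A): Id[Unit] = { received = received :+ a }
  }
}
```

The application wiring would supply a Monad instance for Future (or whatever effect it uses) instead, without the processor changing at all.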
Testing is great
We didn’t need to do fancy things to resolve the asynchrony; it just isn’t a thing anymore. We have skinned our class down to something we really cared about, and everything else fell into place.
Functional point of view
From a functional programming point of view, what is happening is function composition. We could define a contramap method on Sink, that widens the input type:
Instead of making new subclasses of Sink, we could just call contramap to widen it one step at a time. Sink is called a contravariant functor, because the type parameter A is always in “input” position.
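A sketch of contramap on the Future-based Sink, with the JSON layers rebuilt as two small steps. The rendering functions and the ContramapExample object are illustrative assumptions.

```scala
import scala.concurrent.Future

trait Sink[A] { self =>
  def put(a: A): Future[Unit]

  // Build a Sink[B] from a Sink[A] by converting each B to an A first.
  def contramap[B](f: B => A): Sink[B] = new Sink[B] {
    def put(b: B): Future[Unit] = self.put(f(b))
  }
}

object ContramapExample {
  final case class FullListing(id: Int, title: String)

  // Each call adds one layer in front of the string sink.
  def listingSink(strings: Sink[String]): Sink[Seq[FullListing]] =
    strings
      .contramap[Seq[String]](_.mkString("[", ",", "]")) // array layer
      .contramap[Seq[FullListing]](
        _.map(l => s"""{"id":${l.id},"title":"${l.title}"}""") // object layer
      )
}
```

Note that the function passed to contramap points "backwards", from the new input type to the old one; that reversal is what makes the functor contravariant.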
Other use cases
This pattern is really flexible, and I’d love to see it used more on problems like this.
For instance, we could use it as a logger, where the app can directly log meaningful objects, and the layers (chosen by app wiring) can handle the serialisation and output mechanism. The information logged could be treated as a first class requirement and directly tested.
While Sink is focused on input, we could just as easily represent something that is focused on output, or both input and output at the same time, with whatever effects in the mix.
In mobile code, a “Store” that fetches things might have different layers; one that directly returns meaningful objects, one that caches results and knows when to use the local cache, one that renders to protobufs, one that transmits the protobufs to a known API. Each of these could be tested independently, and app components themselves can talk the language of their domain without needing to get in the mud with the implementation details.
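To make the layered "Store" idea a little more concrete, here is a rough sketch of two such layers. Everything in it is hypothetical: the Store trait, the synchronous Option-returning get (a real one would likely carry an effect), and the decode function standing in for protobuf parsing.

```scala
import scala.collection.mutable

// Hypothetical read-side counterpart to Sink: fetch a value by key.
trait Store[K, V] {
  def get(key: K): Option[V]
}

// Decorator that consults a local cache before asking the underlying store.
class CachingStore[K, V](underlying: Store[K, V]) extends Store[K, V] {
  private val cache = mutable.Map.empty[K, V]

  def get(key: K): Option[V] = synchronized {
    cache.get(key).orElse {
      val fetched = underlying.get(key)
      fetched.foreach(v => cache.update(key, v)) // only hits are cached
      fetched
    }
  }
}

// Decorator that decodes raw payloads into meaningful values.
class DecodingStore[K, Raw, V](underlying: Store[K, Raw], decode: Raw => Option[V])
    extends Store[K, V] {
  def get(key: K): Option[V] = underlying.get(key).flatMap(decode)
}
```

Each layer can be tested against a trivial in-memory Store, and the app component only ever sees a Store of domain values.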
Most interactions with AWS can be profitably modelled with this pattern.
Next time you’re working on a codebase that talks to I/O (database, API calls, HTTP, S3, queues, SNS, Dynamo, etc.), have a good think about how you are separating concerns. What is the problem you are really solving? Does everything get easier if you adjust the boundaries slightly?
Decorators are a great compositional pattern allowing the different concerns that inevitably cluster around I/O boundaries to be neatly separated and recombined. This opportunity presents itself several times in every app we write, and does not require any fancy language, type system, or framework. See how you go!