Polysemy, one year later
One year ago, we began the development of Hetchr. We made a number of technical and architectural decisions, and this blog post intends to discuss one of them: the adoption of polysemy.
This article does not aim to explain how polysemy works, but in short, it is a library for composing algebraic effects. While monad transformers force each layer to be composed with the others, polysemy provides a single monad that is responsible for the composition, letting effects stay independent of one another.
Hetchr is a SaaS product that centralizes the key features of many collaborative development tools (Github, Jira, Gitlab, and so on).
From a technical point of view, we talk to external APIs and we store metadata on AWS.
From the beginning, we planned to support a large number of different APIs, which left us with several options:
- Having one big monad (or sticking with IO) for everything: this is an issue in terms of expressivity and maintainability, and it would make automated tests difficult and slow.
- Having an mtl-style monad transformer stack: this might lead to composition difficulties.
- Picking an effect system library: it was new to me at the time, but the concept looked promising as an orthogonal way of dealing with effects.
Today, we have worked on Hetchr for roughly one year and 7 developers have contributed to it (2-3 at a time, with freelancers for some months), resulting in a product of 60k lines of code.
We have 19 effects and 35 interpreters.
What worked
Fine-grained segregation between interface and implementation is tangible. Where we previously encapsulated technical implementation code (such as an HTTP call) in functions, we can now create an effect that does not expose the implementation at all. This keeps us from being too tightly coupled to our providers (or their SDK versions).
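To make this concrete, here is a minimal sketch of an effect that hides how a provider is reached. The `IssueTracker` effect, the `Issue` and `IssueId` types, and `closeIfExists` are hypothetical names chosen for illustration; they are not taken from the Hetchr codebase.

```haskell
{-# LANGUAGE DataKinds, FlexibleContexts, GADTs, KindSignatures, LambdaCase,
             PolyKinds, RankNTypes, ScopedTypeVariables, TemplateHaskell,
             TypeApplications, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)
import Polysemy

-- Hypothetical domain types, for illustration only.
newtype IssueId = IssueId Int deriving (Eq, Show)

data Issue = Issue { issueId :: IssueId, issueTitle :: String }
  deriving (Eq, Show)

-- The effect only states what callers may ask for.
-- Nothing here mentions HTTP, JSON, or a provider SDK.
data IssueTracker (m :: Type -> Type) a where
  FetchIssue :: IssueId -> IssueTracker m (Maybe Issue)
  CloseIssue :: IssueId -> IssueTracker m ()

-- Generates the smart constructors `fetchIssue` and `closeIssue`.
makeSem ''IssueTracker

-- Business code: the signature lists exactly the effects it needs,
-- and the body never sees how an issue is fetched or closed.
closeIfExists :: Member IssueTracker r => IssueId -> Sem r Bool
closeIfExists i = do
  found <- fetchIssue i
  case found of
    Nothing -> pure False
    Just _  -> closeIssue i >> pure True
```

Any interpreter (an HTTP client, an SDK wrapper, an in-memory stub) can then be plugged in without touching `closeIfExists`.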
Exposing effects makes mistakes more obvious. With dedicated effects, signatures reflect what functions are doing. So if a function in code dealing with users carries effects related to Github, it is a clear hint that the code does more than it is meant to do. Likewise, when a piece of code relies on only a small subset of an effect, that effect can be considered too large.
Interpreters are easy to write and easy to (re)use. One of our development strategies is to keep the feedback loop really short: building and testing the full product must stay as fast as possible (around one minute on a developer's machine, 10 minutes to build/test/deploy on continuous integration). So we have interpreters that target the real infrastructure (AWS DynamoDB, AWS Kinesis, ElasticSearch) and in-memory interpreters for tests. At some point we wanted a local version of the backend to facilitate front-end development; it took us little effort to pull the code and pick the right interpreters. For example, we picked the in-memory interpreters for stores but kept the logging used in production. Extensibility is great: the cost of adding effects and interpreters is linear.
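As a rough sketch of what picking interpreters can look like, the example below wires the same program in two ways: fully pure for tests, and with an in-memory store plus stdout logging for a local backend. The `KVStore` and `Log` effects and all function names are assumptions made for this illustration; a real production interpreter (DynamoDB, for instance) is deliberately left out.

```haskell
{-# LANGUAGE DataKinds, FlexibleContexts, GADTs, KindSignatures, LambdaCase,
             PolyKinds, RankNTypes, ScopedTypeVariables, TemplateHaskell,
             TypeApplications, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)
import qualified Data.Map.Strict as Map
import Polysemy
import Polysemy.State (State, evalState, gets, modify)

-- Hypothetical effects, for illustration only.
data KVStore (m :: Type -> Type) a where
  PutValue :: String -> String -> KVStore m ()
  GetValue :: String -> KVStore m (Maybe String)

makeSem ''KVStore

data Log (m :: Type -> Type) a where
  LogInfo :: String -> Log m ()

makeSem ''Log

-- Business code, unaware of which interpreters will run it.
rememberUser :: Members '[KVStore, Log] r => String -> String -> Sem r (Maybe String)
rememberUser key name = do
  putValue key name
  logInfo ("stored " ++ key)
  getValue key

-- In-memory store: rewrite KVStore into a State over a Map, then run it.
storeToState :: Sem (KVStore ': r) a -> Sem (State (Map.Map String String) ': r) a
storeToState = reinterpret (\case
  PutValue k v -> modify (Map.insert k v)
  GetValue k   -> gets (Map.lookup k))

storeInMemory :: Sem (KVStore ': r) a -> Sem r a
storeInMemory = evalState Map.empty . storeToState

-- Two interpreters for Log: one silent, one writing to stdout.
ignoreLog :: Sem (Log ': r) a -> Sem r a
ignoreLog = interpret (\case LogInfo _ -> pure ())

logToStdout :: Member (Embed IO) r => Sem (Log ': r) a -> Sem r a
logToStdout = interpret (\case LogInfo msg -> embed (putStrLn ("[info] " ++ msg)))

-- Test wiring: everything is pure.
runForTests :: Sem '[KVStore, Log] a -> a
runForTests = run . ignoreLog . storeInMemory

-- Local backend wiring: in-memory store, but "real" logging.
runLocal :: Sem '[KVStore, Log, Embed IO] a -> IO a
runLocal = runM . logToStdout . storeInMemory
```

Swapping `storeInMemory` for an interpreter that talks to real infrastructure would only change the wiring functions, not `rememberUser`.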
Interpreters can be composed. Not only can we have interpreters that depend on each other (we have an interpreter of an event store effect, which yields a registry effect, a time effect, and a log effect), but we are also able to transform effects on the fly (for buffering, or for checking the number of calls) without changing the underlying code.
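Both ideas can be sketched in a few lines. In the example below, an `EventStore` interpreter is expressed in terms of a `Time` effect and polysemy's built-in `Output String` (used here as a stand-in for a log effect; the registry is omitted), and `intercept` counts calls without touching the program being run. All effect and function names are hypothetical.

```haskell
{-# LANGUAGE DataKinds, FlexibleContexts, GADTs, KindSignatures, LambdaCase,
             PolyKinds, RankNTypes, ScopedTypeVariables, TemplateHaskell,
             TypeApplications, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)
import Polysemy
import Polysemy.Output (Output, output, runOutputList)
import Polysemy.State (State, modify, runState)

-- Hypothetical effects, for illustration only.
data EventStore (m :: Type -> Type) a where
  AppendEvent :: String -> EventStore m ()

makeSem ''EventStore

data Time (m :: Type -> Type) a where
  CurrentTime :: Time m Int

makeSem ''Time

-- An interpreter that depends on other effects: it handles EventStore
-- by emitting Time and Output actions, which are interpreted later.
eventStoreToOutput :: Members '[Time, Output String] r => Sem (EventStore ': r) a -> Sem r a
eventStoreToOutput = interpret (\case
  AppendEvent payload -> do
    now <- currentTime
    output (show now ++ ": " ++ payload))

-- A deterministic clock, convenient for tests.
fixedTime :: Int -> Sem (Time ': r) a -> Sem r a
fixedTime t = interpret (\case CurrentTime -> pure t)

-- Transform an effect on the fly: count every AppendEvent, then forward
-- it unchanged to whichever interpreter eventually handles EventStore.
countAppends :: Members '[EventStore, State Int] r => Sem r a -> Sem r a
countAppends = intercept @EventStore (\case
  AppendEvent payload -> do
    modify (+ (1 :: Int))
    appendEvent payload)

-- The program under observation knows nothing about counting.
recordTwo :: Member EventStore r => Sem r ()
recordTwo = appendEvent "created" >> appendEvent "renamed"

-- Pure wiring: returns the call count, the emitted lines, and the result.
runDemo :: Sem '[EventStore, Time, Output String, State Int] a -> (Int, ([String], a))
runDemo = run . runState 0 . runOutputList . fixedTime 42 . eventStoreToOutput
```

Evaluating `runDemo (countAppends recordTwo)` returns the number of intercepted calls alongside the lines produced by the event store interpreter, while `recordTwo` itself stays untouched.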
The most important effects already exist. Most of the heavy lifting is done, and we can easily reuse it in our own interpreters.
What did not work
Initial effects are costly to build. While the principle is simple, setting up the infrastructure is a bit tedious.
Compilation errors are difficult to handle. Haskell is not known for the simplicity of its compilation errors, and polysemy makes them worse. Even with the polysemy-plugin, if we forget to apply an argument to an effect, we might end up with an error suggesting a missing effect. Sometimes we even get an incomprehensible error when the mistake is located around an effect declaration.
Onboarding is painful. I find effect systems quite simple to grasp (compared to monad transformers), but I think there is a lack of culture, feedback, and examples around them. Consequently, whenever we onboarded people, it took more time than we expected, even though our codebase is full of examples.
What can be improved
The documentation is good enough. It contains all we need to work with the wide range of effects, but we need more articles (and even books) on it. This will improve the culture around it and will reduce the onboarding effort.
More guidance on the design process would help. We aim for a very iterative type-driven design in general, and this is especially true with effect systems. For example, I usually iterate three to four times over a piece of code before it stabilizes, but when it comes to effects, it usually takes seven iterations.
Better warnings would be great, especially for redundant constraints: they are helpful for code unrelated to effects, but necessary for effect-related code. It is quite easy to start writing code, add an effect, change the code, drop the use of this effect, and still keep the constraint. In the end, we keep carrying this effect around, which pollutes the type signature and forces us to interpret it.
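The situation described above can be sketched as follows. The `Greeter` and `Log` effects and every function name are hypothetical; the point is only that a leftover constraint compiles fine yet forces callers to provide an interpreter they do not need.

```haskell
{-# LANGUAGE DataKinds, FlexibleContexts, GADTs, KindSignatures, LambdaCase,
             PolyKinds, RankNTypes, ScopedTypeVariables, TemplateHaskell,
             TypeApplications, TypeFamilies, TypeOperators #-}

import Data.Kind (Type)
import Polysemy

-- Hypothetical effects, for illustration only.
data Greeter (m :: Type -> Type) a where
  Greeting :: String -> Greeter m String

makeSem ''Greeter

data Log (m :: Type -> Type) a where
  LogInfo :: String -> Log m ()

makeSem ''Log

-- After a refactoring the body no longer logs,
-- but the Log constraint was left behind in the signature.
welcomeStale :: Members '[Greeter, Log] r => String -> Sem r String
welcomeStale name = greeting name

-- The signature the body actually deserves.
welcomeTight :: Member Greeter r => String -> Sem r String
welcomeTight name = greeting name

politeGreeter :: Sem (Greeter ': r) a -> Sem r a
politeGreeter = interpret (\case Greeting name -> pure ("Hello, " ++ name))

ignoreLog :: Sem (Log ': r) a -> Sem r a
ignoreLog = interpret (\case LogInfo _ -> pure ())

-- Callers of the stale version must still interpret Log,
-- even though no log line can ever be produced.
runStale :: String -> String
runStale = run . ignoreLog . politeGreeter . welcomeStale

-- Callers of the tight version only interpret what is used.
runTight :: String -> String
runTight = run . politeGreeter . welcomeTight
```

Both runners compute the same result; the only difference is the extra interpreter that the stale signature drags along.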
Despite the initial investment and the time spent learning the basics, I think that choosing polysemy was one of our best moves. It helps us structure our code better, which gives it good stability.