Abilities for the monadically inclined

This section covers the purpose of monads, their alternatives, and why you might want to use abilities.

How we ended up in the world with monads

Little-known fact: Monads were introduced into functional programming because Haskell is lazy (non-strict). Before we look into Haskell, let's look at strict languages.

Mainstream languages (Java, JavaScript, Python, etc.) are based on strict evaluation:

  • the expressions are evaluated when they are bound to a name;
  • the function arguments are evaluated before the function is applied to them.

Let's start with a simple snippet in a strict pseudo-language:

fun second(a: Int, b: Int, c: Int): Int =
  { println("Ignore everything and return second"); b }
let c = { println("Evaluating C"); 3 } // [1]

let result = second(
  { println("Evaluating A"); 1 },      // [2]
  { println("Evaluating B"); 2 },      // [3]
  c
)

println(result)

In this example, c is evaluated and printed first, then a, then b, and so on:

Evaluating C
Evaluating A
Evaluating B
Ignore everything and return second
2

Such an evaluation strategy, roughly speaking, allows us to read code in sequential order: top to bottom, left to right. This might be taken for granted, but we can observe it only because side-effects (like printing to the console) are executed during the evaluation of the expression where they are defined.

On the other hand, having this "natural" evaluation order, with execution tied to evaluation in the same way, is not an option in Haskell. Haskell is non-strict. In a similar snippet, "a" and "c" are neither used nor evaluated.

second a b c = b
result = ???

We won't try to translate the rest of the snippet to Haskell; it wouldn't make much sense. The non-strictness raises too many questions. Would it even print Evaluating A or Evaluating C? If so, when? Neither a nor c is ever used! If we call this second function multiple times, should we print Evaluating A or Evaluating C multiple times? It's non-strict! And so on.

Luckily for us, we can avoid philosophizing by using monads! The idea is that we can embed these native side-effects in a data type and describe the sequencing using functions.

For example, we can use the putStrLn function that returns IO:

second a b c =
  putStrLn "Ignore everything and return second" >>= \_ -> return b

main :: IO ()
main =
  putStrLn "Evaluating C" >>= \_ ->
  return 3 >>= \c ->
  putStrLn "Evaluating A" >>= \_ ->
  return 1 >>= \a ->
  putStrLn "Evaluating B" >>= \_ ->
  return 2 >>= \b ->
  second a b c >>= \result ->
  putStrLn (show result)

When we run this Haskell program, we once again see:

Evaluating C
Evaluating A
Evaluating B
Ignore everything and return second
2

Okay, but why are we using monads and monad-like things outside of Haskell, where we don't have "this laziness issue"? It turns out that monads are great not just for modeling native effects but also for any custom or user-defined effect.

Control flow

The actual superpower of monads is the embedding of control flow. In other words, we can abstract control flow (and various patterns) as datatypes and use functions over values.

Control flow is the order of execution: the path a program takes, based on instructions like conditions, loops, and function calls. For instance, do X, then do Y, and depending on some condition do S or Z (using if/then/else):

graph TD;
    X[doing X]-->Y;
    Y[doing Y]-->C{condition};
    C-->|true|S[doing S];
    C-->|false|Z[doing Z];
    S-->done;
    Z-->done;

In traditional programming languages, the set of control flow patterns is fixed. Expanding it requires additional features, such as try/catch, while and for loops, async and await, and so on.

Sometimes, we can represent complex patterns using simpler building blocks; for example, we can represent loops using conditionals (aka if/then/else):

fun loop(counter: Int, iterations: Int) =
  if counter >= iterations
    // Do something
  else
    // Do something else
    loop(counter + 1, iterations)

Writing loops from scratch quickly becomes cumbersome. So, usually, we'll get built-in loops.
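
With monads and monad-like things, on the other hand, a loop can be an ordinary library function rather than a language feature. Here is a minimal Haskell sketch of that idea; loopN is a made-up helper, not a standard combinator:

import Control.Monad (when)

-- A user-defined looping combinator: just a function over IO values,
-- no dedicated loop syntax required.
loopN :: Int -> (Int -> IO ()) -> IO ()
loopN n body = go 0
  where
    go i = when (i < n) (body i >> go (i + 1))

main :: IO ()
main = loopN 3 (\i -> putStrLn ("iteration " ++ show i))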

Another example is checking if the value exists (checking for null, nil, whatever). It can be implemented using a simple check:

String result = "";
if (album != null) {
  result = album.getLast().toString();
} else {
  result = "missing";
}

However, this approach doesn't scale well:

String result = "";
if (artist != null
    && artist.getDiscography() != null
    && artist.getDiscography().getAlbums() != null
    && artist.getDiscography().getAlbums().getLast() != null) {
  result = artist.getDiscography().getAlbums().getLast().toString();
} else {
  result = "missing";
}

That's why languages like Kotlin and Rust provide operators or special syntax to safely chain and deal with nullable values:

// Kotlin
artist?.discography?.albums?.last?.toString() ?: "missing"

// Rust (roughly)
artist?.discography?.albums?.last.map(|album| album.to_string())

These are pretty much monads in disguise. We represent optionality as datatypes and use functions over values. And it's not just optionality. We can model various concepts (such as exceptions, iterating, sync/async, parsing, finalizing or closing resources, dependency injection, and cancellation) with monads and monad-like things:

  • doSomething.handleErrorWith(doSomethingElse)
  • listOfItems.traverse(item => persistToDB(item))
  • (fetchFromOneService, fetchFromAnotherDataBase).parMapN(doSomethingWithResult)
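
To make the "monads in disguise" point concrete, here is the null-safe chain from the Java example written against Haskell's Maybe. The record fields (discography, albums, and so on) are hypothetical stand-ins for the getters above:

import Data.Maybe (fromMaybe)

-- Hypothetical domain types mirroring the Java getters above;
-- every step that may be "null" is a Maybe.
data Artist      = Artist      { discography :: Maybe Discography }
data Discography = Discography { albums      :: Maybe [Album] }
newtype Album    = Album String

safeLast :: [a] -> Maybe a
safeLast [] = Nothing
safeLast xs = Just (last xs)

-- Each <- plays the role of ?. : if any step is Nothing,
-- the rest of the chain is skipped and we fall back to "missing".
lastAlbumName :: Maybe Artist -> String
lastAlbumName mArtist = fromMaybe "missing" (do
  artist     <- mArtist
  disco      <- discography artist
  albumList  <- albums disco
  Album name <- safeLast albumList
  pure name)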

The largest recent "monadization" we can observe in strict and mainstream languages is around asynchronous programming: Reactive Streams in Java, Futures in Scala, Promises in JavaScript, and many other places.

Eventually, languages catch up and acquire some sort of async/await syntax or lightweight threads, which replace monads (and monad-like things). But it takes time. For example, Java finally obtained virtual threads (see Project Loom), but we have been doing asynchronous programming in Java thanks to "monads" for more than 10 years already!

This is the real superpower of monads. This is why we use monads! They allow us, developers, to have more control over the flow of the program.

The costs of monads

However, let's not lie to ourselves. This is not the nicest snippet of code:

second a b c =
  putStrLn "Ignore everything and return second" >>= \_ -> return b

main :: IO ()
main =
  putStrLn "Evaluating C" >>= \_ ->
  return 3 >>= \c ->
  putStrLn "Evaluating A" >>= \_ ->
  return 1 >>= \a ->
  putStrLn "Evaluating B" >>= \_ ->
  return 2 >>= \b ->
  second a b c >>= \result ->
  putStrLn (show result)

Look at all the plumbing we have to do with monads. We describe the sequencing of effects using explicit combinators. It's nice that we have a unified syntax for various effects (`pure`/`return`, `flatMap`/`bind`, etc.), but it's less nice that we have to deal with all the noise and syntax overhead. It's also a cognitive overhead! We wouldn't have the phenomenon of monad tutorials and the hate towards monads if that weren't the case.

And let's not even talk about monad transformers and problems with mixing effects 😖

Direct style (code without monads) is straightforward. Effects are executed during the evaluation of an expression. Most of the code is just do-this-do-that:

// execute foo and pass the result to bar
let x = foo()
let y = bar(x)

Sometimes, if we need to prevent execution, we can explicitly suspend evaluation:

// wrap foo in a thunk instead of calling it
let doFoo = () => foo()
let y = retry3Times(doFoo)

In this example, we don't want to evaluate foo and don't want to execute its effects, because we want the retry function to do that.

In the monadic world, foo is already doFoo: it's already suspended. We can pass things to functions, return them as results, and so on:

// pass the foo itself
retry3Times(foo)
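
A minimal sketch of such a retry combinator in Haskell; retry3Times here is our own illustrative helper, not a standard library function:

{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (SomeException, catch)

-- An IO action is already a suspended description of an effect, so we can
-- take it as an argument and run it several times; no extra wrapping needed.
retry3Times :: forall a. IO a -> IO a
retry3Times action = go (3 :: Int)
  where
    go :: Int -> IO a
    go attemptsLeft
      | attemptsLeft <= 1 = action
      | otherwise =
          action `catch` \(_ :: SomeException) -> go (attemptsLeft - 1)

-- usage: retry3Times (readFile "config.json")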

It's amazing for building control flow and writing composable code, but we end up juggling two layers all the time: the layer where the computation is constructed and the layer where it actually executes.

// pass the result of foo to bar
foo >>= bar

This is taxing on an untrained brain. This is where we lose people.

It might hurt to admit, but since most code is plain do-this-do-that, direct-style code is, on average, simpler to write and read: thinking about one level at a time is less demanding than thinking about two.

Monads vs. Direct Style

Direct style is simple, yes, but only when it's possible. Each feature (or control-flow mechanism) needs to be supported by the language and usually comes with its own syntax.

Another trade-off is effect tracking: in direct style, effects are not reflected in the types, and mixing different effects is therefore not as painful.

On the one hand, it's still easier to teach; on the other hand, it comes with annoying properties such as a lack of referential transparency and some degree of function coloring (depending on the particular implementation).

The latter can be frustrating with monads as well. Have you ever had to replace a bunch of maps with traverse when you introduce one additional monad (e.g., adding a log) in a deeply nested function?

def one(str: String): String

def two(y: String): Option[String] =
  y.some.map(one(_))

def three(x: Option[String]): Option[String] =
  x.flatMap(two(_))

// if String becomes IO[String]
def one(str: String): IO[String]

// map becomes traverse
def two(y: String): IO[Option[String]] =
  y.some.traverse(one(_))

// flatMap becomes flatTraverse
def three(x: Option[String]): IO[Option[String]] =
  x.flatTraverse(two(_))
"Monads""Direct Style"
Arbitrary/User-defined EffectsπŸ‘πŸ‘ŽπŸ‘ŽπŸ‘Ž
SyntaxUnified, but has plumbing overheadO(N) β€” each feature requires custom syntax
TeachabilityπŸ‘ŽπŸ‘ŽπŸ‘ŽπŸ‘
FamiliarityπŸ‘ŽπŸ‘
Typed Effectstypednot typed
Effects mixingπŸ‘ŽπŸ‘
Function coloringsome (e.g., map vs traverse)some (depends on the implementation)
Referential transparencyπŸ‘πŸ‘πŸ‘πŸ‘ŽπŸ‘Ž

So, can we do better?

The worst of both worlds? Mixed styles

Naturally, we want to have our cake and eat it too, so we wind up with a mix of both styles, and with the cons of both! We are stuck in this world because we want the power of monads to embed user-defined effects and control flows, but at the same time, we never addressed the hate towards monads.

Some effects are controlled by evaluation (direct style), and some by explicit combinators (monadic style). As a result, we lose both the immediacy of the former and the systematic clarity of the latter.

However, we don't have to stay here.

The best of both worlds? Abilities

Meet abilities: Unison's implementation of direct-style algebraic effects, also known as effect handlers and, somewhat ambiguously, as algebraic effects.

Let's start with division:

1 / 0
Encountered exception:
divide by zero

We can build a safer version by utilizing the Exception ability:

safeDiv1 : Nat -> Nat ->{Exception} Nat
safeDiv1 a b =
  use Nat / ==
  if b == 0 then Exception.raise (Generic.failure "Oops. Zero" b)
  else a / b

We use the Exception.raise constructor to capture failure details (which aren't very interesting in this case). If we try to naively run/evaluate it, we get a compilation error:

.> safeDiv1 6 3
The expression in red needs the {Exception} ability, but this location does not have access to any abilities.

We can use the catch handler, which translates the Exception into a value of type Either:

catch do safeDiv1 6 0
⧨
Either.Left (Failure (typeLink Generic) "Oops. Zero" (Any 0))

Let's try something more exciting and randomly generate both numbers via the IO ability (imagine fetching numbers from external services or something):

safeDiv2 : '{IO, Exception} Nat
safeDiv2 _ =
  use Nat / ==
  a = randomNat()
  b = randomNat()
  if b == 0 then Exception.raise (Generic.failure "Oops. Zero" b)
  else a / b

There are a couple of new things in this snippet. We force (execute) the computation of randomNat to get two random natural numbers, and we delay (suspend) the whole safeDiv2, which is also reflected in the type signature.

The idea behind forcing should be straightforward, but why should we delay a computation? The type {IO, Exception} Nat by itself is not allowed there, because IO and Exception abilities must be provided by a handler. The runtime can do it for us when we run the whole program (notice that we don't use catch):

.> run safeDiv2

As a result, we get something boring like 0 or 2. Let's make it fun by involving another ability:

safeDiv3 : '{IO, Exception, Store Text} Nat
safeDiv3 =
  do
    use Nat / == toText
    use Text ++
    a = randomNat()
    b = randomNat()
    Store.put (toText a ++ "/" ++ toText b)
    if b == 0 then Exception.raise (Generic.failure "Oops. Zero" b)
    else a / b

We can use the Store ability to record the randomly generated division components. This time we must provide a handler for Store ourselves, because Unison can't just guess or handle it for us; withInitialValue "empty" will do the job:

forMonadicallyInclined.final : '{IO, Exception} ()
forMonadicallyInclined.final = do
  use Text ++
  withInitialValue "empty" do
    result = safeDiv3()
    printLine (Store.get ++ "=" ++ Nat.toText result)

Wow, look at this:

.> run final
6864436983616853973/1491537154359797591=4

And we can keep going, adding new abilities. Notice how little syntax we used (no bind/flatMap or any specialized combinators) and how similar each iteration of safeDiv was.

Control flow and the call stack

In a nutshell, direct-style algebraic effects are "just" a nice interface over delimited continuations.

Control flow happens by affecting the call stack. As a reminder, exceptions, loops, async, finalizing or closing resources, dependency injection, and cancellation all require some technique to jump through the function call stack. So, instead of manipulating the call stack "directly" via custom features, we can use monads or some other instrument.

  • Direct style / Custom features: The compiler does that.
  • Monads: We can manipulate the data structures that reify the call stack (see the sketch after this list).
  • Delimited continuations: We can use user-level primitives to affect the call stack, but it's quite mind-breaking. Nobody wants that.
  • Direct-style algebraic effects: We can use delimited continuations via a better interface (types + syntax).
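
To ground the "Monads" point above: with Either, for example, an early exit is not a jump at all; it's just a value that the rest of the chain inspects. A small Haskell sketch (safeDivE and program are illustrative names):

-- Left plays the role of a thrown exception: once a step produces Left,
-- (>>=) itself skips the remaining steps of the do-block.
safeDivE :: Int -> Int -> Either String Int
safeDivE _ 0 = Left "divide by zero"
safeDivE a b = Right (a `div` b)

program :: Either String Int
program = do
  x <- safeDivE 10 2
  y <- safeDivE x 0   -- produces Left, so the final step never runs
  pure (x + y)
-- program == Left "divide by zero"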

The pros of direct-style algebraic effects

What do we have now?

Right up front, we get the ability to embed user-defined effects (a massive improvement over traditional direct style), plus the ability to write code in direct style (a considerable improvement in expressing control flow).

We haven't demonstrated this, but writing custom effects is more accessible than writing custom monads. You have to trust us on that. Writing custom monads is an upper-intermediate or expert-only endeavor (especially when you have to deal with stack-safety in a strict language).

At the same time, there's a limited amount of syntax. The language designers must make a few choices around effect handlers and around suspending and forcing execution; we saw two variations of the latter in the case of Unison. Still, there is no need to invent new syntax for every feature.

Which leads us to teaching. Teaching people how to use effects is trivial (essentially, they need to grasp suspending and forcing computations).

Remember how easy it was to add the Store ability alongside the existing ones? We keep typed effects and peacefully mix them! And on top of all that, no more function coloring at all: there is no map/traverse distinction and so on.

The elephant in the pure room

Nothing comes for free. This might sound dramatic to people coming from monadic worlds, but the first thing we lose with direct-style algebraic effects is referential transparency.

Imagine we have a program that initializes two mutable refs with "empty", writes "full" into the first one, and then reads values from both refs:

refExample1 : '{IO} Text
refExample1 = do
  use IO ref
  use Text ++
  use mutable.Ref read
  x = ref "empty"
  y = ref "empty"
  mutable.Ref.write x "full"
  read x ++ "|" ++ read y

When we run refExample1, we get:

"full|empty"

If we decide to DRY up the code and naively extract the initialization into emptyRef:

refExample2 : '{IO} Text
refExample2 = do
  use Text ++
  use mutable.Ref read
  emptyRef = IO.ref "empty"
  x = emptyRef
  y = emptyRef
  mutable.Ref.write x "full"
  read x ++ "|" ++ read y

We get a different program, because when we run refExample2, we get:

"full|full"

Both x and y were updated because they are the same ref. What we wanted to do is suspend the initialization and then force it twice (the first time for x and the second for y):

refExample3 : '{IO} Text
refExample3 = do
  use Text ++
  use mutable.Ref read
  emptyRef = do IO.ref "empty"
  x = emptyRef()
  y = emptyRef()
  mutable.Ref.write x "full"
  read x ++ "|" ++ read y

Which would give us:

"full|empty"

As you can see, refactoring the code with abilities might require more brain activity than with monads. Even though it's a hurdle for people with a "monadic" background, it's not the end of the world. We gave up something important, but we also gained a ton.
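
For contrast, here is roughly the same refactoring in Haskell, where newIORef "empty" is itself just a description of an effect, so extracting it into a name does not change the program:

import Data.IORef (newIORef, readIORef, writeIORef)

refExample :: IO String
refExample = do
  let emptyRef = newIORef "empty"  -- a description, not a ref; nothing runs here
  x <- emptyRef                    -- running it here creates the first ref...
  y <- emptyRef                    -- ...and here a second, distinct ref
  writeIORef x "full"
  a <- readIORef x
  b <- readIORef y
  pure (a ++ "|" ++ b)             -- "full|empty", same as before the refactoring

The let-extraction is safe precisely because the effect is a value; with abilities, the same extraction changes when the effect runs.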

Other limitations and unknowns

The drawbacks don't stop there. Direct-style algebraic effects are a novel concept (even compared to monads); there are still some open questions and some unknowns. We need some time to explore their limitations.

For example, static control flow (applicative-style code), a niche use case (e.g., applicative parsers, applicative-style build tools), can't be expressed with abilities, because there is no way not to evaluate an effect or to undo one.
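
To give a rough idea of what "applicative-style code" means here, below is a minimal Validation applicative, hand-rolled in Haskell rather than taken from a library: because the shape of the computation does not depend on intermediate results, it can combine the effects of all steps up front, here by accumulating every error instead of stopping at the first one:

-- A minimal Validation applicative: later steps can't depend on earlier
-- results, and that restriction is what lets us combine the effects of
-- all steps, here by accumulating every error.
data Validation e a = Failure e | Success a
  deriving Show

instance Functor (Validation e) where
  fmap _ (Failure e) = Failure e
  fmap f (Success a) = Success (f a)

instance Semigroup e => Applicative (Validation e) where
  pure = Success
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)  -- keep both errors
  Failure e  <*> Success _  = Failure e
  Success _  <*> Failure e  = Failure e
  Success f  <*> Success a  = Success (f a)

checkBoth :: Validation [String] (Int, Int)
checkBoth = (,) <$> Failure ["bad a"] <*> Failure ["bad b"]
-- checkBoth == Failure ["bad a","bad b"]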

Are there more cases like this? We don't know yet. However…

Takeaways

If you are curious and want to explore the knowns and unknowns of abilities (and direct-style algebraic effects), now is the perfect time.

At the end of the day, direct-style algebraic effects (delimited continuations) and monads have the same expressive power: they can express the same things.

| | Monads | Direct Style | Abilities |
| --- | --- | --- | --- |
| Arbitrary/User-defined Effects | 👍 | 👎👎👎 | 👍👍👍 |
| Syntax | Unified, but has plumbing overhead | O(N): each feature requires custom syntax | Unified, O(1) |
| Teachability | 👎👎👎 | 👍 | 👍 |
| Familiarity | 👎 | 👍 | 👍 |
| Typed Effects | typed | not typed | typed |
| Effects mixing | 👎 | 👍 | 👍 |
| Function coloring | some (e.g., map vs traverse) | some (depends on the implementation) | 👍 |
| Referential transparency | 👍👍👍 | 👎👎 | 👌 / 👎 |
| Call-stack manipulation | via data structures | compiler's job | via delimited continuations |
| Maturity | Limitations are known | Limitations are known | Novel, some unknowns |
| Static control flow | 👍 | 👎 | 👎 |