Abilities for the monadically inclined

This section covers the purpose of monads, their alternatives, and why you might want to use abilities.

How we ended up in the world with monads

Little-known fact: Monads were introduced into functional programming because Haskell is lazy (non-strict). Before we look into Haskell, let's look at strict languages.

Mainstream languages (Java, JavaScript, Python, etc.) are based on strict evaluation:

  • expressions are evaluated when they are bound to a name;
  • function arguments are evaluated before the function is applied to them.

Let's start with a simple snippet in a strict pseudo-language:

fun second(a: Int, b: Int, c: Int): Int =
  { println("Ignore everything and return second"); b }
let c = { println("Evaluating C"); 3 } // [1]

let result = second(
  { println("Evaluating A"); 1 },      // [2]
  { println("Evaluating B"); 2 },      // [3]
  c
)

println(result)

In this example, c is evaluated and printed first, then a, then b, and so on:

Evaluating C
Evaluating A
Evaluating B
Ignore everything and return second
2

Such an evaluation strategy, roughly speaking, allows us to read code in sequential order: top to bottom, left to right. This might be taken for granted, but we can observe it because the side-effects (like printing to the console) are executed during the evaluation of the expression where they are defined.

On the other hand, having this "natural" evaluation order and tying execution to evaluation in the same way is not an option in Haskell. Haskell is non-strict. In a similar snippet, a and c are never used and therefore never evaluated.

second a b c = b
result = ???

We won't try to translate the rest of the snippet to Haskell — it wouldn't make much sense. The non-strictness raises too many questions. Would it even print Evaluating A or Evaluating C? If so, when? Both a and c are never used! If we call this second function multiple times, should we print Evaluating A or Evaluating C multiple times? It's non-strict! And so on.

Luckily for us, we can avoid philosophizing by using monads! The idea is that we can embed these native side-effects in a data type and describe the sequencing using functions.

For example, we can use the putStrLn function that returns IO:

second a b c =
  putStrLn "Ignore everything and return second" >>= \_ -> return b

main :: IO ()
main =
  putStrLn "Evaluating C" >>= \_ ->
  return 3 >>= \c ->
  putStrLn "Evaluating A" >>= \_ ->
  return 1 >>= \a ->
  putStrLn "Evaluating B" >>= \_ ->
  return 2 >>= \b ->
  second a b c >>= \result ->
  putStrLn (show result)

When we run this Haskell program, we once again see:

Evaluating C
Evaluating A
Evaluating B
Ignore everything and return second
2

Okay, but why are we using monads and monad-like things outside of Haskell — where we don't have "this laziness issue"? It turns out that monads are great not just for modeling native effects but also for any custom or user-defined effect.

Control flow

The actual superpower of monads is the embedding of control flow. In other words, we can abstract control flow (and various patterns) as datatypes and use functions over values.

Control flow is the order of execution — the path a program takes, based on instructions like conditions, loops, and function calls. For instance, do X, then do Y, and depending on some condition do S or Z (using if/then/else):

graph TD;
    X[doing X]-->Y;
    Y[doing Y]-->C{condition};
    C-->|true|S[doing S];
    C-->|false|Z[doing Z];
    S-->done;
    Z-->done;

In traditional programming languages, the set of control flow patterns is fixed. Expanding it requires additional features, such as try/catch, while and for loops, async and await, and so on.

Sometimes, we can represent complex patterns using simpler building blocks; for example, we can represent loops using conditionals (aka if/then/else):

fun loop(counter: Int, max_iterations: Int) =
  if counter >= max_iterations
    // Do something
  else
    // Do something else
    loop(counter + 1, max_iterations)

Writing loops from scratch quickly becomes cumbersome. So, usually, we'll get built-in loops.

Another example is checking whether a value exists (checking for null, nil, whatever). It can be implemented using a simple check:

String result = "";
if (albums != null) {
  result = albums.getLast().toString();
} else {
  result = "missing";
}

It also doesn't scale well:

String result = "";
if (artist != null
    && artist.getDiscography() != null
    && artist.getDiscography().getAlbums() != null
    && artist.getDiscography().getAlbums().getLast() != null) {
  result = artist.getDiscography().getAlbums().getLast().toString();
} else {
  result = "missing";
}

That's why languages like Kotlin and Rust provide operators or special syntax to safely chain and deal with nullable values:

// Kotlin: safe-call operator with an Elvis-operator fallback
artist?.discography?.albums?.last?.toString() ?: "missing"
// Rust-flavored pseudocode: mapping over optional values
artist?.discography?.albums?.last.map(|album| album.to_string())

These are pretty much monads in disguise. We represent optionality as datatypes and use functions over values. And it's not just optionality. We can model various concepts (such as exceptions, iterating, sync/async, parsing, finalizing or closing resources, dependency injection, and cancellation) with monads and monad-like things:

  • doSomething.handleErrorWith(doSomethingElse)
  • listOfItems.traverse(item => persistToDB(item))
  • (fetchFromOneService, fetchFromAnotherDataBase).parMapN(doSomethingWithResult)
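
To make "optionality as a datatype" concrete, here is a minimal Haskell sketch of the earlier null-check chain, this time over Maybe. The Artist and Discography records (and their fields) are hypothetical stand-ins for the Java getters above:

import Data.Maybe (fromMaybe)

-- Hypothetical records mirroring the Java getters above.
data Discography = Discography { albums :: Maybe [String] }
data Artist = Artist { discography :: Maybe Discography }

-- Optionality as a datatype: the chain of "null checks" is just >>= over Maybe.
lastAlbum :: Maybe Artist -> String
lastAlbum artist =
  fromMaybe "missing" (artist >>= discography >>= albums >>= safeLast)
  where
    safeLast [] = Nothing
    safeLast xs = Just (last xs)

main :: IO ()
main = do
  putStrLn (lastAlbum (Just (Artist (Just (Discography (Just ["Debut", "Post"]))))))  -- "Post"
  putStrLn (lastAlbum Nothing)                                                        -- "missing"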

The largest recent "monadization" we can observe in strict and mainstream languages is around asynchronous programming: Reactive Streams in Java, Futures in Scala, Promises in JavaScript, and many other places.

Eventually, languages catch up and acquire some sort of async/await syntax or lightweight threads, which replace monads (and monad-like things). But it takes time. For example, Java finally obtained virtual threads (see Project Loom), but we have been doing asynchronous programming in Java thanks to "monads" for more than 10 years already!

This is the real superpower of monads. This is why we use monads! They allow us, developers, to have more control over the flow of the program.

The costs of monads

However, let's not lie to ourselves, this is not the nicest snippet of code:

second a b c =
  putStrLn "Ignore everything and return second" >>= \_ -> return b

main :: IO ()
main =
  putStrLn "Evaluating C" >>= \_ ->
  return 3 >>= \c ->
  putStrLn "Evaluating A" >>= \_ ->
  return 1 >>= \a ->
  putStrLn "Evaluating B" >>= \_ ->
  return 2 >>= \b ->
  second a b c >>= \result ->
  putStrLn (show result)

Look at all the plumbing we have to do with monads. We describe the sequencing of effects using explicit combinators. It's nice that we have unified syntax for various effects (`pure`/`return`, `flatMap`/`bind`, etc.), but it's less nice that we have to deal with all the noise and syntax overhead. It's also a cognitive overhead! We wouldn't have the phenomenon of monad tutorials and the hate towards monads if that weren't the case.

And let's not even talk about monad transformers and problems with mixing effects 😖

Direct style (code without monads) is straightforward. Effects are executed during the evaluation of an expression. Most of the code is just do-this-do-that:

// execute foo and pass the result to bar
let x = foo()
let y = bar(x)

Sometimes, if we need to prevent execution, we can explicitly suspend evaluation:

// pass the foo itself
let doFoo = () => foo
let y = retry3Times(doFoo)

In this example, we don't want to evaluate foo and don't want to execute its effects, because we want the retry function to do that.

In the monadic world, foo is already doFoo — it's already suspended. We can pass things to functions, return them as results, and so on:

// pass the foo itself
retry3Times(foo)
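
For instance, here is a hedged Haskell sketch of what such a retry function might look like (retry3Times is our hypothetical helper, not a standard library function). An IO action is already a suspended value, so we can hand it to the retry logic and let it decide when, and how many times, to execute it:

{-# LANGUAGE ScopedTypeVariables #-}
import Control.Exception (SomeException, catch)

-- Hypothetical helper: run an IO action, retrying on failure, for at most 3 attempts.
retry3Times :: IO a -> IO a
retry3Times action = go (3 :: Int)
  where
    go 1 = action  -- last attempt: let the exception propagate
    go n = action `catch` \(_ :: SomeException) -> go (n - 1)

main :: IO ()
main = retry3Times (putStrLn "hello")  -- the action is passed as a value; retry3Times runs it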

It's amazing for building control flows and writing composable code, but we end up juggling two layers all the time: the layer where we construct the computation and the layer where it executes.

// pass the result of foo to bar
foo >>= bar

This is taxing on an untrained brain. This is where we lose people.

It might hurt to admit, but because most of the code is plain do-this-do-that, on average, it is simpler to write and read direct-style code: thinking about one level at a time is less demanding than thinking about two.

Monads vs. Direct Style

Direct style is simple, yes, but only when it's possible. Each feature (or control-flow mechanism) needs to be supported by the language and usually comes with its own syntax.

Another trade-off is effect tracking: in direct style, the effects are not typed, and mixing different effects is not as painful.

On the one hand, it's still easier to teach; on the other hand, it comes with annoying properties such as a lack of referential transparency and some degree of function coloring (depending on the particular implementation).

The latter can be frustrating with monads as well. Have you ever had to replace a bunch of maps with traverses when you introduce one additional monad (e.g., adding a log) into a deeply nested function?

def one(str: String): String

def two(y: String): Option[String] =
  y.some.map(one(_))

def three(x: Option[String]): Option[String] =
  x.flatMap(two(_))

// if String becomes IO[String]
def one(str: String): IO[String]

// map becomes traverse
def two(y: String): IO[Option[String]] =
  y.some.traverse(one(_))

// flatMap becomes flatTraverse
def three(x: Option[String]): IO[Option[String]] =
  x.flatTraverse(two(_))

| | Monads | Direct Style |
| --- | --- | --- |
| Arbitrary/User-defined Effects | 👍 | 👎👎👎 |
| Syntax | Unified, but has plumbing overhead | O(N) — each feature requires custom syntax |
| Teachability | 👎👎👎 | 👍 |
| Familiarity | 👎 | 👍 |
| Typed Effects | typed | not typed |
| Effects mixing | 👎 | 👍 |
| Function coloring | some (e.g., map vs traverse) | some (depends on the implementation) |
| Referential transparency | 👍👍👍 | 👎👎 |

So, can we do better?

The worst of both worlds? Mixed styles

Naturally, we want to have our cake and eat it too, so we wind up with a mix of both — the cons of both! We are stuck in this world because we want the power of monads to embed user-defined effects and control flows, but at the same time, we never dealt with the hate of monads.

Some effects are controlled by evaluation (direct style), and some — by explicit combinators (monadic style). As a result, we lose both the immediacy of the former and the systematic clarity of the latter.

However, we don't have to stay here.

The best of both worlds? Abilities

Meet abilities — Unison's implementation of direct-style algebraic effects, also known as effect handlers and, ambiguously, as algebraic effects.

Let's start with division:

1 / 0
Encountered exception:
divide by zero

We can build a safer version by utilizing the Exception ability:

safeDiv1 : Nat -> Nat ->{Exception} Nat
safeDiv1 a b =
  use Nat / ==
  if b == 0 then Exception.raise (Generic.failure "Oops. Zero" b)
  else a / b

We use the Exception.raise constructor to capture failure details (which aren't very interesting in this case). If we try to naively run/evaluate it, we get a compilation error:

.> safeDiv1 6 3
The expression in red needs the {Exception} ability, but this location does not have access to any abilities.

We can use the catch handler, which translates the Exception into a value of type Either:

catch '(safeDiv1 6 3)
catch '(safeDiv1 6 0)
Left (Failure (typeLink Generic) "Oops. Zero" (Any 0))

Let's try something more exciting and randomly generate both numbers via the IO ability (imagine fetching numbers from external services or something):

safeDiv2 : '{IO, Exception} Nat
safeDiv2 _ =
  use Nat / ==
  a = !randomNat
  b = !randomNat
  if b == 0 then Exception.raise (Generic.failure "Oops. Zero" b)
  else a / b

There are a couple of new things in this snippet. We force (execute) the computation of randomNat to get two random natural numbers and delay (suspend) the whole safeDiv2, which is also reflected in the type signature.

The idea behind forcing should be straightforward, but why should we delay a computation? The type {IO, Exception} Nat by itself is not allowed there, because the IO and Exception abilities must be provided by a handler. The runtime can do it for us when we run the whole program (notice that we don't use catch):

.> run safeDiv2

As a result, we get something boring like 0 or 2. Let's make it fun by involving another ability:

safeDiv3 : '{IO, Exception, Store Text} Nat
safeDiv3 =
  do
    use Nat / == toText
    use Text ++
    a = !randomNat
    b = !randomNat
    Store.put (toText a ++ "/" ++ toText b)
    if b == 0 then Exception.raise (Generic.failure "Oops. Zero" b)
    else a / b

We can use the Store ability to record the randomly generated division components. This time we must provide a handler for Store — because Unison can't just guess or handle it for us — withInitialValue "empty" will do the job:

final : '{IO, Exception} ()
final = do
  use Text ++
  withInitialValue "empty" do
    result = !safeDiv3
    printLine (Store.get ++ "=" ++ Nat.toText result)

Wow, look at this:

.> run final
6864436983616853973/1491537154359797591=4

And we can keep going and going with adding new abilities. Notice how little syntax we used (no bind/flatMap or any specialized combinators) and how similar each iteration of safeDiv was.

Control flow and the call stack

In a nutshell, direct-style algebraic effects is "just" a nice interface over delimited continuations.

Control flow happens by affecting the call stack. As a reminder: exceptions, loops, async, finalizing or closing resources, dependency injection, cancellation — all of these require some technique to jump through the function call stack. So, instead of manipulating the call stack "directly" via custom features, we can use monads or some other instrument.

  • Direct style / Custom features: The compiler does that.
  • Monads: We can manipulate the data structures that reify the call stack (sketched after this list).
  • Delimited continuations: We can use user-level primitives to affect the call stack, but it's quite mind-breaking. Nobody wants that.
  • Direct-style algebraic effects: We can use delimited continuations via a better interface (types + syntax).
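
To make the "monads reify the call stack" bullet concrete, here is a minimal Haskell sketch (our own illustration): the early exit that an exception would normally perform by unwinding the stack becomes a plain Left value, and >>= decides whether the rest of the computation runs at all.

-- Early exit as data: no stack unwinding, just a Left value that >>= refuses
-- to thread through the rest of the computation.
divideE :: Int -> Int -> Either String Int
divideE _ 0 = Left "divide by zero"
divideE a b = Right (a `div` b)

program :: Either String Int
program = do
  x <- divideE 10 2
  y <- divideE x 0   -- short-circuits here
  divideE y 1        -- never evaluated

main :: IO ()
main = print program  -- Left "divide by zero"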

The pros of direct-style algebraic effects

What do we have now?

Right up front, we get the ability to embed user-defined effects — a massive improvement over traditional direct style; plus the ability to write code in direct style — a considerable improvement in expressing control flow.

We haven't demonstrated this, but writing custom effects is more accessible than writing custom monads. You have to trust us on that. Writing custom monads is an upper-intermediate or expert-only endeavor (especially when you have to deal with stack-safety in a strict language).

At the same time, there's a limited amount of syntax. The language designers must make a few choices: how to define effect handlers, and how to suspend and force evaluation. We saw two variations in the case of Unison. Still, there is no need to invent new syntax for every feature.

Which leads us to teaching. Teaching people how to use effects is trivial (essentially, they need to grasp suspending and forcing computations).

Remember how easy it was to add the Store ability to the existing abilities? We keep typed effects and peacefully mix them! And on top of all that, no more function coloring at all. There is no map/`traverse` distinction and so on.

The elephant in the pure room

Nothing comes for free. This might sound dramatic to people coming from monadic worlds, but the first thing we lose with direct-style algebraic effects is referential transparency.

Imagine we have a program that initializes two mutable refs with "empty", writes "full" into the first one, and then reads the values from both refs:

refExample1 : '{IO} Text
refExample1 = do
  use IO ref
  use Text ++
  use mutable.Ref read
  x = ref "empty"
  y = ref "empty"
  mutable.Ref.write x "full"
  read x ++ "|" ++ read y

When we run refExample1, we get:

"full|empty"

If we decide to DRY up the code and naively extract the initialization into emptyRef:

refExample2 : '{IO} Text
refExample2 = do
  use Text ++
  use mutable.Ref read
  emptyRef = IO.ref "empty"
  x = emptyRef
  y = emptyRef
  mutable.Ref.write x "full"
  read x ++ "|" ++ read y

We get a different program; when we run refExample2, we see:

"full|full"

Both x and y were updated because they are the same ref. What we wanted to do is to suspend the initialization and then force it twice (the first time for x and the second for y):

refExample3 : '{IO} Text
refExample3 = do
  use Text ++
  use mutable.Ref read
  emptyRef = do IO.ref "empty"
  x = !emptyRef
  y = !emptyRef
  mutable.Ref.write x "full"
  read x ++ "|" ++ read y

Which would give us:

"full|empty"

As you can see, refactoring the code with abilities might require more brain activity than with monads. Even though it's a hurdle for people with a “monadic” background, it's not the end of the world. We gave up something important, but we also gained a ton.

Other limitations and unknowns

The drawbacks don't stop there. Direct-style algebraic effects is a novel concept (even compared to monads) — there are still some open questions and some unknowns. We need some time to explore their limitations.

For example, static control flow (applicative-style code) is a niche use case (e.g., applicative parsers, applicative-style build tools) that can't be expressed with abilities, because there is no way to not evaluate an effect or to undo one.
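
To give a feel for what "static control flow" means, here is a hedged Haskell sketch (our own illustration, not Unison code). With an applicative Validation, the two checks do not depend on each other's results, so both errors can be collected; an abort-on-raise Exception ability would stop at the first failure.

-- A classic applicative: validation that accumulates errors instead of
-- aborting at the first failure. The combined structure is static: neither
-- check depends on the other's result.
data Validation e a = Failure e | Success a
  deriving Show

instance Functor (Validation e) where
  fmap f (Success a) = Success (f a)
  fmap _ (Failure e) = Failure e

instance Monoid e => Applicative (Validation e) where
  pure = Success
  Success f  <*> Success a  = Success (f a)
  Failure e1 <*> Failure e2 = Failure (e1 <> e2)  -- keep both errors
  Failure e  <*> _          = Failure e
  _          <*> Failure e  = Failure e

validateName :: String -> Validation [String] String
validateName n = if null n then Failure ["empty name"] else Success n

validateAge :: Int -> Validation [String] Int
validateAge a = if a < 0 then Failure ["negative age"] else Success a

main :: IO ()
main = print ((,) <$> validateName "" <*> validateAge (-1))
-- Failure ["empty name","negative age"]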

Are there more cases like this? We don't know yet. However…

Takeaways

If you are curious and want to explore the knowns and unknowns of abilities (and direct-style algebraic effects), now is the perfect time.

At the end of the day, direct-style algebraic effects (delimited continuations) and monads have the same expressive power — they can express the same things.

| | Monads | Direct Style | Abilities |
| --- | --- | --- | --- |
| Arbitrary/User-defined Effects | 👍 | 👎👎👎 | 👍👍👍 |
| Syntax | Unified, but has plumbing overhead | O(N) — each feature requires custom syntax | Unified, O(1) |
| Teachability | 👎👎👎 | 👍 | 👍 |
| Familiarity | 👎 | 👍 | 👍 |
| Typed Effects | typed | not typed | typed |
| Effects mixing | 👎 | 👍 | 👍 |
| Function coloring | some (e.g., map vs traverse) | some (depends on the implementation) | 👍 |
| Referential transparency | 👍👍👍 | 👎👎 | 👌 / 👎 |
| Call-stack manipulation | via data structures | compiler's job | via delimited continuations |
| Maturity | Limitations are known | Limitations are known | Novel, some unknowns |
| Static control flow | 👍 | 👎 | 👎 |