Pattern matching part 2

Intro to Unison datatypes and pattern matching

We've been applying pattern matching to literal values up to this point. But pattern matching is frequently done on more complicated types, where we can use the cases of the pattern to decompose the data type into its constituent parts. Here we'll talk about Unison data types in the context of pattern matching.

Your first Unison data type

Let's take a look at a pre-made data type, Either. Either is used to represent situations in which a value can be one type or another.

In Unison, Either is defined in the base library like this (we'll break down what these parts mean shortly):

structural type Either a b
  = lib.base.Either.Right b
  | lib.base.Either.Left a

The keyword type indicates that we're looking at a type definition, as opposed to a term definition. Unison types are given a modifier of structural or unique. Here, structural means that types which share Either's structure are treated as identical to Either and are therefore interchangeable in Unison code, even if they're given a different name. Either is the name we've given to our type, and the two letters a and b are type parameters.

Type parameters in Unison are lowercase by convention.

You can think of a and b as placeholders which represent any type. When we construct a value of type Either, we "fill in" the placeholder.

On the right-hand side of the equals sign are the data constructors of the type. We use data constructors to create a value of the type being described, so to create an Either we have two options: Either.Left or Either.Right. They're separated by pipes |. When you see the pipe think "or."
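To make "filling in the placeholder" concrete, here is a minimal sketch of two Either values. The names parsedAge and failedAge are hypothetical and chosen only for illustration; the sketch assumes the base library's Either constructors are in scope.

```unison
-- parsedAge and failedAge are hypothetical names for illustration.
-- In both signatures the placeholder a is filled in with Text
-- and the placeholder b is filled in with Nat.
parsedAge : Either Text Nat
parsedAge = Right 42

failedAge : Either Text Nat
failedAge = Left "not a number"
```

Both values share the type Either Text Nat even though each one is built with a different data constructor.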

In summary: you can read the line
structural type Either a b
  = lib.base.Either.Right b
  | lib.base.Either.Left a
as "Either is an Either.Left containing an a or an Either.Right containing a b." Unison data types are often composed by slotting other Unison types into one data constructor or another. Sometimes those types are left unspecified, as with a and b in Either, and sometimes they're pinned down to a concrete type.

We'll return to explore data types in more depth later.

Decomposing data types with pattern matching

With our whirlwind intro to the parts of a data type behind us, we'll return to how to pattern match on the different data constructors of a given type.
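Before the larger example, here is a small sketch of the idea applied to Either. The function describe is a hypothetical name, not something from the base library; it relies only on Text.++ and Nat.toText.

```unison
-- describe is a hypothetical function for illustration.
-- Each case names a data constructor and binds its single field.
describe : Either Text Nat -> Text
describe = cases
  Left err -> "failed: " Text.++ err
  Right n  -> "got " Text.++ Nat.toText n

> describe (Left "boom")
> describe (Right 3)
```

Each case takes the value apart: err is bound to the Text inside a Left, and n to the Nat inside a Right.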

Let's say we wanted a function to tell us which utensils should be paired with a lunch order. We'll use the following types:
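The type definitions themselves are not shown in this section. The sketch below is one set of definitions consistent with the functions that follow. The constructor names and the Text and Boolean fields of Mystery are implied by those functions; the Text field on Salad and the unique modifier are assumptions.

```unison
-- Assumed definitions, reconstructed from how the constructors
-- are used in the placeSetting examples.
unique type Lunch
  = Soup Text
  | Salad Text
  | Mystery Text Boolean

unique type Utensil
  = Fork
  | Knife
  | Spoon
```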

Our function should take a value of type Lunch as an argument and return a List of Utensil values. We know that there are only three ways to make a value of type Lunch, so we match on each data constructor name followed by one pattern for each field that the constructor contains.

placeSetting : Lunch -> [Utensil]
placeSetting = cases
  Soup soupName   -> [Spoon]
  Salad saladName -> [Fork, Knife]
  _               -> [Spoon, Fork, Knife]

Pattern matching on the data constructors of the type enables us to inspect and make use of the values they contain. In the example above we don't end up using the variables that are bound to the fields in the data, so we could have also represented them as underscores, like Soup _ -> [Spoon], but we can imagine a function where that would become important:

placeSetting : Lunch -> [Utensil]
placeSetting = cases
  Soup "Hearty Chunky Soup"   -> [Fork, Spoon]
  Soup _                      -> [Spoon]
  Salad _                     -> [Fork, Knife]
  Mystery mysteryMeal isAlive ->
    use Text ==
    if (mysteryMeal == "Giant Squid") && isAlive then [Knife]
    else [Spoon, Fork, Knife]

The first case is an example of how to combine a literal pattern match with a data constructor, and the second and third cases are examples of how to match on any value that the Soup or Salad data constructor might enclose. Our last case extracts the values provided to the Mystery data constructor as pattern match variables for use on the right.

Note, the underscores above represent the fact that the value being provided to the data constructor isn't important for the logic of our expression on the right. The underscores do, however, need to be present. Every parameter to the data constructor needs to be represented in the pattern either by a variable, as in our Mystery case, or by an underscore, otherwise Unison will return a pattern arity mismatch error.
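As a sketch of the arity rule, the hypothetical function isMystery below ignores both fields of Mystery but still writes one underscore per field. The Lunch definition is repeated so the block stands alone, and its field types are assumed as before.

```unison
-- Repeated here so the block stands alone; field types are assumed.
unique type Lunch = Soup Text | Salad Text | Mystery Text Boolean

-- isMystery is a hypothetical function for illustration.
isMystery : Lunch -> Boolean
isMystery = cases
  Mystery _ _ -> true   -- Mystery has two fields, so two sub-patterns
  _           -> false

-- Writing `Mystery _ -> true` instead would be rejected, because it
-- supplies one sub-pattern for a constructor that has two fields.
```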

As-patterns (the @ in a pattern match)

Let's say you want to pattern match on the Either.Left side of an Either, inspect its content, and use the entire Either.Left value in an expression to the right of your arrow in a match case. With our current tools you could do something like this:

use Either Left Right
myMatch : Either Text Text -> Either Text Text
myMatch = cases
  Right _                -> Right "I found a Right"
  Left b | b === "oh no" -> Left b
  Left _                 -> Left "I found a Left"
myMatch (Left "oh no")
Either.Left "oh no"

Reconstructing Either.Left is fairly simple for a small data type like Either, but imagine a data type called Hydra where you need multiple values to form an instance of the type.
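The definition of Hydra is not shown in this section. A sketch that fits the functions below might look like this. Only the third field has to be a Nat, since it is compared against the attack value; treating all five fields as Nat and using the unique modifier are assumptions.

```unison
-- Assumed definition: one constructor, Heads, carrying five fields.
unique type Hydra = Heads Nat Nat Nat Nat Nat

-- exampleHydra is a hypothetical value for illustration.
exampleHydra : Hydra
exampleHydra = Heads 1 2 3 4 5
```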

Reconstructing the Heads value on the right-hand side of the arrows from the variables in the pattern match could become onerous.

slayHydra : Nat -> Hydra -> Optional Hydra
slayHydra attack = cases
  Heads h1 h2 immortal h4 h5
    | attack Nat.!= immortal -> Some (Heads h1 h2 immortal h4 h5)
    | attack Nat.== immortal -> None

This is where the "as-pattern" or @ symbol in a pattern match can be useful. Rewriting the function above to use an as-pattern looks like:

slayHydra : Nat -> Hydra -> Optional Hydra
slayHydra attack = cases
  hydra@(Heads h1 h2 immortal h4 h5)
    | attack Nat.!= immortal -> Some hydra
    | attack Nat.== immortal -> None
  Heads _ _ _ _ _ -> None

The as-pattern binds a variable name to some part of the element being pattern matched on. Its form is variableName@someElement. That variable can then be used to the right of the arrow in your pattern match.
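The same form works on smaller types too. In this sketch the hypothetical function keepNonEmpty binds whole to the entire Optional value while t is bound to the Text inside it. It assumes Text.size and Nat.> from the base library.

```unison
-- keepNonEmpty is a hypothetical function for illustration.
-- whole names the entire matched value; t names the field inside it.
keepNonEmpty : Optional Text -> Optional Text
keepNonEmpty = cases
  whole@(Some t) | Text.size t Nat.> 0 -> whole
  _                                    -> None

> keepNonEmpty (Some "hi")
> keepNonEmpty (Some "")
```

Returning whole avoids rebuilding Some t on the right of the arrow.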

Pattern matching on List

Pattern matching on List elements has its own special syntax. The documentation for List goes in depth about the data structure itself; check it out!

Head and tail pattern matching

You can use pattern matching to scrutinize the first (left-most) element of a list with the +: syntax.

match ["a", "b", "c"] with
  head +: tail -> head
  _            -> "empty list"

The +: is unpacking the first element head to its text value "a" while keeping the remaining elements tail as a List. The underscore will match the "empty list" case. We could have also expressed _ -> "empty list" as [] -> "empty list" or List.empty -> "empty list". All three are valid ways of testing for the empty list case.
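As a sketch of the [] form, here is a hypothetical helper headOr that returns the first element of a list, or a supplied fallback when the list is empty.

```unison
-- headOr is a hypothetical helper for illustration.
headOr : a -> [a] -> a
headOr fallback = cases
  []        -> fallback   -- the empty list case, written as []
  head +: _ -> head       -- first element; the rest is ignored

> headOr "empty list" ["a", "b", "c"]
> headOr "empty list" []
```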

You can also pattern match on the last (right-most) element of a list in Unison:

match ["a", "b", "c"] with
  firsts :+ last -> last
  _              -> "empty list"

If you find that you're mixing up :+ and +: in your pattern matches, remember that the colon : goes on the side of the COL-lection.
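The mirror image of a head match can be sketched the same way. The hypothetical helper lastOr below puts the colon of :+ next to the collection on the left and the single element on the right.

```unison
-- lastOr is a hypothetical helper for illustration.
lastOr : a -> [a] -> a
lastOr fallback = cases
  []        -> fallback
  _ :+ last -> last       -- collection on the colon side, element on the +

> lastOr "empty list" ["a", "b", "c"]
```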

Let's say you wanted to pattern match on the first two elements of a given list. It might be tempting to do a multi-item pattern match with successive +: operators, but recall that function application for operators always starts at the leftmost sub-expression.

🙅🏻‍♀️ This will not work:

match ["a","b","c"] with
  first +: second +: remainder -> "nope"
  _ -> "empty list"

first +: second expects second to be of type List, but what we're trying to express is that it's the unwrapped second element. Instead, if you want to pattern match on a particular list segment length or on particular list segment values, you can use the [_] list constructor syntax!

Our example above can be rewritten:

match ["a", " b", "c "] with
  [first, second] ++ remainder -> first Text.++ " yes!"
  _                            -> "fallback"
"a yes!"

Or if we don't care about binding the list elements to variables, we can use underscores to match any list that begins with a segment of the given length. Here that means any list with at least two elements:

match ["a", " b", "c "] with
  [_, _] ++ remainder -> "list has at least two elements"
  _                   -> "fallback"
"list has at least two elements"