Pattern matching part 2

Intro to Unison data types and pattern matching

In addition to matching on literal values, pattern matching can be applied to data types to inspect their internal structure and control a program's behavior based on it.

Your first Unison data type

Let's take a look at the Either data type. Either is used to represent situations in which a value can be of one type or another.

structural type Either a b = Left a | Right b

The keyword type indicates that we're looking at a type definition, as opposed to a term definition. Unison types are given a modifier of structural or (optionally) unique. structural here means that types which share Either's structure are treated as identical to Either and therefore are interchangeable in Unison code, even if they're given a different name.

The two letters a and b are type parameters. They're lowercase by convention in Unison. You can think of a and b as placeholders which represent any type. When we construct a value of type Either, we "fill in" the placeholder.

On the right-hand side of the equals sign are the data constructors of the type. We use data constructors to create a value of the type being described. To create an Either we have two options: Either.Left or Either.Right. They're separated by pipes |. When you see the pipe, think "or."
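As a minimal sketch of "filling in" the placeholders, the two definitions below build values of type Either Text Nat, one with each data constructor. The names parseFailure and parseSuccess are hypothetical and chosen only for illustration.

```unison
-- Hypothetical names, for illustration only.
-- Here `a` is filled in with Text and `b` with Nat.
parseFailure : Either Text Nat
parseFailure = Either.Left "not a number"

parseSuccess : Either Text Nat
parseSuccess = Either.Right 42
```

Both values have the same type even though they were built with different constructors, which is what lets a single function accept either one.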

If you're coming from the land of Object Oriented Programming, you might be familiar with the notion of a "constructor" as a special way to create an instance of an object. A good mental model for what a Unison data constructor is doing is that it is a function whose return type is the data type on the left.

You might think of the data constructor Color.RGB:

type Color = RGB Nat Nat Nat

as a function called Color.RGB that takes three Nat arguments to produce a value of type Color.

RGB : Nat -> Nat -> Nat -> Color
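Seen this way, building a Color looks like any other function call. The sketch below repeats the type definition so that it stands alone; the name teal is a hypothetical example value.

```unison
type Color = RGB Nat Nat Nat

-- `teal` is a hypothetical name; the constructor is applied
-- to its three Nat arguments just like an ordinary function.
teal : Color
teal = Color.RGB 0 128 128
```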

We'll explore data types in more depth later.

Decomposing data types with pattern matching

With our whirlwind intro to the parts of a data type behind us, we'll return to how to pattern match on a type.

Let's say we wanted a function to tell us which utensils should be paired with a lunch order. We'll use the following types:

type Lunch = Soup Text | Salad Text | Mystery Text Boolean

type Utensil = Fork | Knife | Spoon

Our function should take a value of type Lunch as an argument and return a List of Utensil values. We know that there are only three ways to make a value of type Lunch, so we match on each data constructor name followed by one pattern for every field that the constructor contains.

placeSetting : Lunch -> [Utensil]
placeSetting = cases
  Soup soupName   -> [Spoon]
  Salad saladName -> [Fork, Knife]
  _               -> [Spoon, Fork, Knife]

Pattern matching on the data constructors of the type enables us to inspect and make use of the values they contain. In the example above we don't end up using the variables that are bound to the fields in the data, so we could have also represented them as underscores, like Soup _ -> [Spoon], but we can imagine a function where that would become important:

placeSetting : Lunch -> [Utensil]
placeSetting = cases
  Soup "Hearty Chunky Soup"   -> [Fork, Spoon]
  Soup _                      -> [Spoon]
  Salad _                     -> [Fork, Knife]
  Mystery mysteryMeal isAlive ->
    use Text ==
    if (mysteryMeal == "Giant Squid") && isAlive then [Knife]
    else [Spoon, Fork, Knife]

The first case is an example of how to combine a literal pattern match with a data constructor, and the second and third cases are examples of how to match on any value that the Lunch.Soup or Lunch.Salad data constructor might enclose. Our last case extracts the values provided to the Mystery data constructor as pattern match variables for use on the right.

Note, the underscores above represent the fact that the value being provided to the data constructor isn't important for the logic of our expression on the right. The underscores do, however, need to be present. Every parameter to the data constructor needs to be represented in the pattern either by a variable, as in our Mystery case, or by an underscore, otherwise Unison will return a pattern arity mismatch error.
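To make the arity rule concrete, here is a small sketch that mixes an underscore and a variable within one constructor pattern. The helper isStillMoving is hypothetical, and the Lunch type is repeated so the block stands on its own.

```unison
type Lunch = Soup Text | Salad Text | Mystery Text Boolean

-- Hypothetical helper, for illustration only.
-- Mystery has two fields, so its pattern needs two positions:
-- an underscore for the name we ignore, a variable for the flag we use.
isStillMoving : Lunch -> Boolean
isStillMoving = cases
  Mystery _ isAlive -> isAlive
  _                 -> false
```

Writing the first case as Mystery isAlive, with only one position, is the kind of pattern that would trigger the arity mismatch error described above.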

As-patterns (the @ in a pattern match)

Let's say you want to pattern match on the Either.Left side of an Either, inspect its content, and use the entire Either.Left value in an expression to the right of your arrow in a match case. With our current tools you could do something like this:

use Either Left Right
myMatch : Either Text Text -> Either Text Text
myMatch = cases
  Right _                -> Right "I found a Right"
  Left b | b === "oh no" -> Left b
  Left _                 -> Left "I found a Left"
myMatch (Left "oh no")
Either.Left "oh no"

Reconstructing Either.Left is fairly simple for a small data type like Either, but imagine a data type called Hydra where you need multiple values to form an instance of the type.

type Hydra = Heads Nat Nat Nat Nat Nat

Reconstructing the Heads value on the right hand side of the arrows from variables in the pattern match could become onerous.

slayHydra : Nat -> Hydra -> Optional Hydra
slayHydra attack = cases
  Heads h1 h2 immortal h4 h5
    | attack Nat.!= immortal -> Some (Heads h1 h2 immortal h4 h5)
    | attack Nat.== immortal -> None

This is where the "as-pattern" or @ symbol in a pattern match can be useful. Rewriting the function above to use an as-pattern looks like:

slayHydra : Nat -> Hydra -> Optional Hydra
slayHydra attack = cases
  hydra@(Heads h1 h2 immortal h4 h5) ->
    if attack Nat.== immortal then Optional.None else Some hydra

The as-pattern binds a variable name to some part of the element being pattern matched on. Its form is variableName@someElement. That variable can then be used to the right of the arrow in your pattern match.
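The same idea applies to the Either function from the start of this section. The sketch below is one possible rewrite, under the hypothetical name myMatchAs, in which the as-pattern names the whole Left value so that it does not have to be rebuilt. It reuses the === comparison from the earlier example.

```unison
-- Hypothetical variant of the earlier myMatch, for illustration only.
myMatchAs : Either Text Text -> Either Text Text
myMatchAs = cases
  Either.Right _ -> Either.Right "I found a Right"
  -- `left` is bound to the entire Left value, `b` to the Text inside it.
  left@(Either.Left b) | b === "oh no" -> left
  Either.Left _ -> Either.Left "I found a Left"
```

The guard still inspects the inner value b, while the right-hand side returns left unchanged.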

Pattern matching on List

Pattern matching on List elements has its own special syntax.

Head and tail pattern matching

You can use pattern matching to scrutinize the first (left-most) element of a list with the +: syntax.

match ["a", "b", "c"] with
  head +: tail -> head
  _ -> "empty list"
"a"

The +: is unpacking the first element head to its text value "a" while keeping the remaining elements tail as a List. The underscore will match the "empty list" case. We could have also expressed _ -> "empty list" as [] -> "empty list" or List.empty -> "empty list". All three are valid ways of testing for the empty list case.
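Head and tail patterns pair naturally with recursion. As a minimal sketch, the hypothetical helper total below adds up a list of Nat values by handling the empty list first and then peeling off one element at a time.

```unison
-- `total` is a hypothetical helper, for illustration only.
total : [Nat] -> Nat
total = cases
  -- Base case: an empty list sums to zero.
  []           -> 0
  -- Recursive case: add the first element to the sum of the rest.
  head +: tail -> head Nat.+ total tail
```

For instance, total [1, 2, 3] works through 1, then 2, then 3, then the empty list, and evaluates to 6.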

You can also pattern match on the last (right-most) element of a list in Unison:

match ["a", "b", "c"] with
  firsts :+ last -> last
  _ -> "empty list"
"c"

Put together, you can even pattern match on both ends of the list:

match ["a", "b", "c"] with
  first +: midSection :+ last -> first Text.++ last
  _ -> "fallback"
"ac"

If you find that you're mixing up :+ and +: in your pattern matches, remember that the colon : goes on the side of the COL-lection.

However, let's say you wanted to pattern match on the first 2 elements of a given list. It might be tempting to do a multi-item pattern match with successive +: operators, but recall that function application for operators always starts at the leftmost sub-expression.

🙅🏻‍♀️ This will not work:

match ["a","b","c"] with
  first +: second +: remainder -> "nope"
  _ -> "empty list"

first +: second is expecting second to be of type List, but what we're trying to express is that it's the unwrapped second element. Instead, if you want to pattern match on a list segment of a particular length or with particular values, you can use the [_] list constructor syntax.

Our example above can be rewritten:

match ["a", "b", "c"] with
  [first, second] ++ remainder -> first Text.++ " yes!"
  _ -> "fallback"
"a yes!"

Or, if we don't care about binding the list elements to variables, we can use underscores to match any list that begins with a segment of the given length, here two elements:

match ["a", "b", "c"] with
  [_, _] ++ remainder -> "list has at least two elements"
  _ -> "fallback"
"list has at least two elements"