Testing your Unison code

Writing unit tests is easy in Unison. You write your tests as special watch expressions in your scratch file, add them to the codebase using the add or update commands, and then use the test command to run them.

Note that unit tests can't access any abilities that would cause the test to give different results each time it's run. Because such tests are deterministic, Unison can safely cache their results. When you issue the test command, any tests that have been run before will simply report their cached results.

Basic unit tests

Okay, let's get started! Here's a simple test:

square : Nat -> Nat
square x = x * x

test> square.tests.ex1 = check (square 4 == 16)

The test> line is called a "test watch expression." Like other watch expressions in your scratch file, it will get evaluated (or looked up in the evaluation cache) on every file save. By convention, tests for a definition called square are placed in the square.tests namespace.

🤓
The check function has type Boolean -> [Test.Result]. Any test watch expression must have the type [Test.Result]. Though we won't use this capability much, it's possible to have a single test produce multiple results, hence the [Test.Result] rather than Test.Result. If you decide to write a different testing library, you just have to be able to produce a [Test.Result] in the end.
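
To illustrate the multiple-results point, here is a minimal sketch of one test that reports two results. It assumes check returns an ordinary list, as described above, so two results can be joined with the list append operator ++ from the base library. The test name ex2 is made up for this example.

```unison
use List ++

square : Nat -> Nat
square x = x * x

-- Each `check` yields a [Test.Result]; `++` concatenates the two lists,
-- so this one test watch expression reports two results.
test> square.tests.ex2 = check (square 0 == 0) ++ check (square 3 == 9)
```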

By the way, you can write test watch expressions that span multiple lines. For instance, here's a test for distributed.lib.base.data.List.reverse:

use universal
test> List.reverse.tests.ex1 = check let
  actual = reverse [1,2,3,4]
  expected = [4,3,2,1]
  expected === actual
📕
As discussed in the language reference, keyword-based constructs like let bind tighter than function application, so you don't need any parentheses around the let block which is used as the argument to check.

Adding diagnostics for a failing test

When you have a test that's failing, like this one below, you often want to print out some information before it fails:

use universal

brokenReverse : [a] -> [a]
brokenReverse as = []

test> brokenReverse.tests.ex1 = check let
  actual = brokenReverse [1,2,3,4]
  expected = [4,3,2,1]
  expected === actual

Notice we don't get any information about why it failed. Let's go ahead and fix that, by temporarily inserting a call to the function base.bug : a -> b, which halts your program with an exception and prints out its argument, nicely formatted onto multiple lines if needed:

use universal

brokenReverse : [a] -> [a]
brokenReverse as = []

test> brokenReverse.tests.ex1 = check let
  actual = brokenReverse [1,2,3,4]
  expected = [4,3,2,1]
  if not (expected === actual) then
    bug ("Not equal!", expected, actual)
  else
    true

Here we're passing a tuple to the base.bug function, but we could pass it any value at all.
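
As a minimal sketch of that last point, the variant below passes only the value under suspicion to bug instead of a tuple. It reuses brokenReverse from above and assumes List.size is available from the base library; the test name ex2 is hypothetical.

```unison
brokenReverse : [a] -> [a]
brokenReverse as = []

test> brokenReverse.tests.ex2 = check let
  actual = brokenReverse [1,2,3,4]
  -- Reversing must not change the length; if it does, print the bad list.
  if not (List.size actual == 4) then
    bug actual
  else
    true
```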

Adding tests to the codebase

Once you're happy with your tests, you can add them to the codebase and use the test command to see the test results in the current namespace.

square : Nat -> Nat
square x = x * x

test> square.tests.ex1 = check (square 4 == 16)

test> List.reverse.tests.ex1 = check let
  actual = reverse [1,2,3,4]
  expected = [4,3,2,1]
  expected == actual
.mystuff> add
.mystuff> test

But actually, the test command didn't need to run anything! All the tests had already been run, and their results were cached according to their Unison hash. In a purely functional language like Unison, tests like these are deterministic, so their results can be cached and the tests never run again. No more running the same tests over and over!

Generating test cases with code

Unison's base library contains powerful utility functions for generating test cases with Unison code, which lets your tests cover a lot more cases than if you are writing test cases manually like we've done so far. (This style of testing is often called property-based testing.)

The property-based testing support in Unison relies on an ability called Gen (short for "generator"). If you haven't read about abilities yet, we suggest taking a detour to do so before continuing.

For instance, a '{Gen} Nat is a computation that, when run, produces Nat values. You can sample from a '{Gen} a multiple times to produce different values. Importantly, these are not random values. The sequence of values generated is entirely deterministic:

> sample 100 (natIn 0 10)

test> myTest = runs 100 '(expect (!(natIn 0 10) < 10))

When developing generators to use for testing, you'll often put those generators in a watch expression like this to make sure you understand what they are generating. Generators denote a set of values, and as the above shows, it is possible to exhaustively enumerate that set, at which point Gen.sample will stop short. Above we asked it to generate 100 natural numbers in the range 0 to 10, but there are only 10 unique values, so it stops after that.

Combining generators

Where things get interesting is when combining generators. There are a few ways of doing that. For a '{Gen} a, you can use the ! operator to sample from it, and you can sample from multiple generators to build up a more complex generator. Let's have a look at an interesting example, which highlights something important about these generators:

base.test.sample 10 do
  use Nat +
  use test.gen natIn
  n = natIn 0 10 () + 100
  m = natIn 0 100 ()
  (n, m)

[ (100, 0), (100, 1), (101, 0), (100, 2), (101, 1), (102, 0), (100, 3), (101, 2), (102, 1), (103, 0) ]
🎨
Syntax note: The do keyword (sometimes expressed as 'let) is a common idiom to introduce a delayed computation which is a block. The precedence is such that sample 10 do … followed by a newline is parsed like sample 10 (do …).

As we can see, Gen does fair or "breadth-first" sampling from both of the generators involved, rather than exhausting one before moving on to the next.

Breadth-first enumeration is the right move because, as we build up more complex generators, the space of possibilities often becomes so huge that it's only possible to sample a tiny fraction of it.

There are two other ways of combining generators. One is base.test.gen.pick, which fairly samples from multiple generators in a breadth-first manner:

base.test.gen.pick : ['{Gen} a] -> '{Gen} a

Here's an example:
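
A minimal sketch of pick, assuming the signature above and the natIn and sample functions used earlier. The ranges are arbitrary and chosen not to overlap, so each sampled value shows which branch it came from.

```unison
-- Samples from a generator that picks fairly between two branches:
-- small numbers from the first, numbers from 100 upward from the second.
> sample 10 (pick [natIn 0 5, natIn 100 105])
```

Evaluating the watch expression in a scratch file shows the order in which the two branches are visited.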

The other is base.test.Gen.cost : Nat -> '{Gen} a -> '{Gen} a, which assigns a "cost" to a generator. What does that mean? When a branch of base.test.gen.pick has a cost of 5, for instance, the sampling process will take 5 samples from all other branches before switching to fairly sampling from both branches:
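
A minimal sketch of cost, assuming the signatures quoted in this section and the natIn and sample functions used earlier. The ranges and the cost of 5 are arbitrary.

```unison
-- The second branch carries a cost of 5; the first branch has none.
> sample 10 (pick [natIn 0 5, cost 5 (natIn 100 105)])
```

Going by the description above, the early samples should come from the first branch. Evaluating the watch expression is the way to confirm how your version of the library behaves.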

You may want to do some experimentation to get a feel for how Gen behaves. You can use base.test.Gen.cost to control which branches of the space of possibilities get explored first. A common use case is to assign higher costs to "large" test cases that you wish to test less frequently.

Using generators to write property-based tests

Once you've got your generators in good shape, you can combine these into property-based tests that verify some property for all generated test cases. For example, let's check that reversing a list twice gives back the original list:

test> List.reverse.tests.prop1 = runs 100 'let
  original = !(listOf (natIn 0 100))
  original' = List.reverse (List.reverse original)
  expect (original === original')

> sample 10 (listOf (natIn 0 100))

Don't forget, if you encounter failures, you can use base.bug to view intermediate generated values that trigger the failure.
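
A sketch of that technique applied to the round-trip property, using only functions that appear in this guide. The message text is arbitrary, and the call to bug is meant to be temporary, just as in the earlier unit test example.

```unison
test> List.reverse.tests.prop1 = runs 100 'let
  original = !(listOf (natIn 0 100))
  original' = List.reverse (List.reverse original)
  if not (original === original') then
    -- Halts the test and prints the generated input next to the result.
    bug ("Round trip failed!", original, original')
  else
    expect true
```

Once the failure is understood and fixed, the if expression can be replaced by the original expect (original === original') line.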

Because the test results are always deterministic and cached, you may want to crank up the number of samples taken before choosing to add your tests to the codebase.

.mystuff> add
.mystuff> test

Notice that all the test results are cached. If you later update the definitions being tested (like distributed.lib.base.data.List.reverse in this example), the tests won't be found in the cache and will get re-run when you type the test command.

Other useful functions when writing property-based tests

The function base.test.runs that we've been using has type base.test.runs : Nat -> '{e, Gen} deprecated.Test ->{e} [test.Result]. To form a value of type deprecated.Test, you can use the functions base.test.Test.expect (which we've seen) as well as ok, fail, and others which we can locate using type-based search:

.base> find : Test

These functions are used to give different messages on success or failure. Feel free to try them out in your tests, and you may want to explore other functions in the testing package.

Lastly, the base library is open for contributions if you come up with some handy testing utility functions, or want to contribute better documentation (or tests) for existing definitions.

Thanks for reading!