Jun 03, 2024

Where Unison is headed

Paul Chiusano

Unison is built on the unusual idea of content-addressed code. Even though we knew early on that new things would be possible with this approach ("the math checks out!"), actually getting to a real usable technology took years.

We now have an amazing foundation for continued innovation. There's a cohesive ecosystem of tools around Unison that are already quite useful today. But this is just the beginning. In the next 6 months, year, and beyond we will be entering the realm of science fiction. 🛸

This is a long post with a lot of details of what we're thinking for the future, so here's a summary:

  1. Keep improving the core language, runtime, and tooling. Examples: adding an FFI to Unison, improving JIT compiler performance, more semantic merge capabilities, and a more graphical UCM experience.
  2. Make Unison Share an even nicer place to host your projects. Examples: "find usages", site-wide code search, multi-collaborator projects, and more.
  3. Major new Unison Cloud features: scheduled jobs, distributed event processing, resilient workflows, and high-performance native execution via our JIT compiler.

Read on for the details! Items are marked as ⚙️ ("relatively easy or already in progress"), 🧪 ("a big effort, but very doable"), or 🛸 ("possibly science fiction or needing research, but seems broadly possible").

Lots can change, so don't take this post as a guarantee of when things are happening. We'll continue to keep our roadmap updated.

New cloud capabilities

In the next month we're planning to ship what seems like an inconsequential feature: the ability to launch daemon processes that run indefinitely "on at least one node at all times". This isn't necessarily a feature one programs with directly; instead it's a building block for all kinds of things:

  • High-performance and highly available in-memory distributed queues and topics
  • Scheduled jobs, cron for the cloud
  • Resilient workflows ("every 30 days, wake up and send an invoice to each customer")
  • Distributed event processing, streaming analytics, etc
  • … and lots more.

These will be open source libraries running atop Unison Cloud daemons, so you'll have the same power to build similar features, or to customize the behavior for your unique needs. From early prototyping of the above, we're seeing (as usual) orders of magnitude reductions in how much code is required to express these kinds of systems.

Compositionality makes a huge difference. All the things you build can talk to each other without serialization and networking boilerplate, without dependency conflicts or YAML engineering, and without having to build and ship multi-GB containers. It's incredibly refreshing and makes working on these systems so much more fun.

The right sources of complexity

Fabio was recently working on a Kafka-like distributed streaming system prototype (stay tuned!) which has some tricky parts, and he made the following observation on the experience of building interesting distributed systems with Unison: "it's challenging, but never tedious".

There's still intricate code, but maybe it's 50 lines instead of 5000, and it's 50 lines concerned entirely with the part that matters.

Other major features planned or being considered:

  • JIT compiler for Unison Cloud nodes. (⚙️) This will be a drop-in replacement requiring no Unison code changes; your code will just magically run faster. The rollout will happen seamlessly with no downtime and no need to redeploy your services!
  • GPU pools and model inference pools in Unison Cloud. (🧪) For easy multi-GPU distributed computations in a few lines of code. Our basic distributed computing API is designed to support multiple compute pools that have different capabilities and can all be mixed seamlessly in an overall distributed computation.
  • Unison Cloud Edge. (🧪) Running on globally distributed infrastructure. These "edge" locations might have different included abilities than the main cloud pool (for instance, storage for the edge has different performance requirements and tradeoffs) but would still be able to seamlessly call Unison native services regardless of where they're hosted.
  • "Unison Cloud in a box" (⚙️) enterprise product to run our compute and storage fabric on your own infrastructure or VPC.
  • Pro Tier managed cloud offering. (🧪) For orgs wanting their own dedicated autoscaled compute pool in our public cloud.

Core language and tooling

Doing the basics well is important! These are all happening now or sometime in the next six months.

  1. New and improved merge algorithm. (✅) We've just released this, addressing a major annoyance with the previous workflow. Going forward, the large majority of merges and pulls should be clean, conflicting only if the two branches being merged have literally edited the same definitions. In the event of actual conflicts you'll have the same nice "just get your scratch file compiling" experience that we first got working for update and upgrade. Because Unison stores code in a more semantic way, entire classes of merge conflicts are just impossible! You'll never have conflicts due to things like order of imports changing, order of definitions in a file changing, formatting differences, and more.
  2. Fast clone and pull. (⚙️) Right now, to contribute to a project, you have to obtain the full history via a clone, which takes a while (minutes for a large project with a long history). We'll likely address this with shallow clones. You don't need the full history of a project to contribute to it; only the person merging the contribution needs the history. The common case of opening a contribution could then take seconds instead of minutes. In addition, we might tweak our sync protocol to be more efficient, especially for common cases like fetching a project release.
  3. Faster JIT compiler performance. (⚙️) The JIT is a lot faster than the current interpreter (about 60x faster straight-line performance), but we also haven't done much to optimize the code it generates or the implementations of many builtins. There are a lot of easy wins here. We'll also be adding instant JIT compiler startup times via run.native. Right now we don't do anything smart to cache compiled code. We can instead use a little server running a code syncing protocol that only asks ucm to send missing dependencies, so startup is instant.
  4. A myriad of quality of life improvements and bugfixes that until now we've been putting off to work on major features. (⚙️)
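
To make the "conflicting only if the two branches have edited the same definitions" rule from item 1 concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Unison's actual merge algorithm: the names `content_hash` and `merge` are hypothetical, a branch is modeled as a plain map from definition name to a hash of that definition's syntax tree, and the hash is a truncated SHA-256 over a JSON encoding rather than Unison's real hashing scheme.

```python
import hashlib
import json


def content_hash(ast):
    """Hash a definition's syntax tree rather than its source text.

    Assumption: the tree is a JSON-serializable nested list. Whitespace
    and formatting never reach this function, so they cannot change
    the hash.
    """
    encoded = json.dumps(ast, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def merge(base, alice, bob):
    """Three-way merge of branches that map name -> definition hash.

    Returns (merged, conflicts). A name conflicts only when both sides
    changed it relative to base and ended up with different results.
    """
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(alice) | set(bob)):
        b, a, o = base.get(name), alice.get(name), bob.get(name)
        if a == o:        # both sides agree (or neither touched it)
            winner = a
        elif a == b:      # only Bob changed this name
            winner = o
        elif o == b:      # only Alice changed this name
            winner = a
        else:             # both changed it, in different ways
            conflicts.append(name)
            continue
        if winner is not None:   # None means the name was deleted
            merged[name] = winner
    return merged, conflicts


inc_v1 = content_hash(["lambda", ["x"], ["+", "x", 1]])
inc_v2 = content_hash(["lambda", ["x"], ["+", "x", 2]])
inc_v3 = content_hash(["lambda", ["x"], ["+", "x", 3]])
double = content_hash(["lambda", ["x"], ["*", "x", 2]])
negate = content_hash(["lambda", ["x"], ["-", 0, "x"]])

base = {"inc": inc_v1, "double": double}
# Alice lists definitions in a different order and adds a new one.
alice = {"double": double, "inc": inc_v1, "negate": negate}
# Bob edits inc.
bob = {"inc": inc_v2, "double": double}

clean_merge, clean_conflicts = merge(base, alice, bob)

# If Alice had also edited inc differently, that one name would conflict.
alice_edits_inc = {"inc": inc_v3, "double": double}
_, real_conflicts = merge(base, alice_edits_inc, bob)
```

In this sketch the reordering on Alice's side is invisible to the merge, because a branch is a map rather than a file, and the only conflict arises when both branches give the same name two different hashes.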

And beyond!

Let's look at some more advanced capabilities we have in the works for core tooling. Even though these seem hard, a lot are pretty straightforward to do atop Unison's foundations:

  1. Find usages on Share (⚙️) based on Unison's fully accurate dependency graph. This feature is not hard, is already supported locally (the dependents command) and will ship "as soon as Simon has a few free days to build the front end". Also on Share, we'll be adding automated changelogs with hyperlinked diffs for every project release (similarly not difficult and reuses the work already done to produce hyperlinked diffs for contributions).
  2. Site-wide code search on Share (⚙️) across all projects, and all type signatures in those projects. This will be incremental, with results refined as you type and also supporting autocompletion. Unlike text-based code search, Unison understands your code semantically, so when you search for List.map, it'll take you to that definition, in the project where it is originally defined. GitHub struggles here; since code is treated as blobs of text, search results often take you to a random file in a random project where List.map appears as a call site… or in a comment!
  3. New semantic merge capabilities. (🧪) Imagine: Alice swaps the order of arguments for a function foo while Bob simultaneously writes a new function that calls foo. With a text-based version control system, Bob's code won't compile in the merged result. In Unison, we can detect these sorts of refactorings and merge them automatically! This example (which is actually not hard for us to do) just scratches the surface of what's possible. Some semantic merges are easy, while others require storing more structured information in the version history.
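
As a rough illustration of why "find usages" is cheap once an accurate dependency graph exists, the following Python sketch inverts a map of definitions to the definitions they reference and then walks it. Everything here is hypothetical: `build_dependents` and `find_usages` are made-up names, the sample graph is invented, and definitions are keyed by readable names where a real codebase would presumably key them by hash. It is not how Share or the dependents command is implemented.

```python
from collections import defaultdict, deque


def build_dependents(dependencies):
    """Invert a map of definition -> set of definitions it references."""
    dependents = defaultdict(set)
    for definition, refs in dependencies.items():
        for ref in refs:
            dependents[ref].add(definition)
    return dependents


def find_usages(dependents, target, transitive=False):
    """Return direct users of target, or everything that reaches it."""
    if not transitive:
        return set(dependents.get(target, ()))
    seen, queue = set(), deque([target])
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


# Hypothetical sample graph: each definition lists what it calls.
dependencies = {
    "List.map": set(),
    "List.foldLeft": set(),
    "List.sum": {"List.foldLeft"},
    "report.totals": {"List.map", "List.sum"},
    "report.render": {"report.totals"},
}

dependents = build_dependents(dependencies)
direct = find_usages(dependents, "List.map")
everything = find_usages(dependents, "List.map", transitive=True)
```

Because the lookup runs over references rather than text, a mention of List.map in a comment or a string never shows up as a usage.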

By maintaining the structure of a codebase in an actual database instead of a bag of text files, so much of the basic experience of writing and interacting with code can be made better!

Some other possibilities:

  • Alternate syntaxes. (🧪) Since code is stored in a database as its abstract syntax tree, that same AST can be viewed using a "Pythonic" syntax or a "C-like" syntax. We've always known this was possible but haven't had the time to invest in it yet.
  • Graphical UCM. (🧪) The Unison codebase manager command line tool is pretty low tech, but it's been surprisingly effective. It's a single all-in-one tool that runs alongside your text editor and "just works" for most of the things you need to do with a Unison codebase (it's also an LSP server for Unison)… and yet, we can also do even better. Imagine a rich UI with a command palette with amazing autocomplete and inline help, and where output can be richly formatted. Don't worry, we can do this while continuing to support the existing terminal app.
  • Interactive documentation. (🛸) Imagine the Unison documentation type, but with interactive sections where users can write code and run it, getting feedback right away. Useful for building even richer tutorials, learning materials, and blog posts.
  • Unison and Unison Share for other languages. (🛸) Imagine if all languages could get the Unison Share experience of hyperlinked code, semantic merges, and more. Imagine if all languages could get something like Unison's perfect incremental compilation and shared compilation cache. This is definitely a science project but a lot seems possible and it would be fun to explore if we weren't so busy with everything else. However, if anyone would like to hand us a large pile of money to work on this, please get in touch. 😀
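
The alternate syntaxes idea above can be sketched in a few lines of Python: one stored tree, two pretty-printers. This is a toy under stated assumptions, not Unison's AST or pretty-printer. The node types (`Var`, `Lit`, `BinOp`, `Func`) and both renderers are hypothetical, and the C-like renderer simply assumes every value is an `int`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class Func:
    name: str
    params: tuple
    body: object


def render_expr(expr):
    """Render an expression; both syntaxes share this part in the sketch."""
    if isinstance(expr, Var):
        return expr.name
    if isinstance(expr, Lit):
        return str(expr.value)
    if isinstance(expr, BinOp):
        return f"({render_expr(expr.left)} {expr.op} {render_expr(expr.right)})"
    raise TypeError(f"unknown node: {expr!r}")


def render_pythonic(func):
    """View the stored function using a Python-style surface syntax."""
    params = ", ".join(func.params)
    return f"def {func.name}({params}):\n    return {render_expr(func.body)}"


def render_c_like(func):
    """View the same function using a C-style surface syntax (ints only)."""
    params = ", ".join(f"int {p}" for p in func.params)
    body = render_expr(func.body)
    return f"int {func.name}({params}) {{\n    return {body};\n}}"


# One stored definition, two views of it.
area = Func("area", ("w", "h"), BinOp("*", Var("w"), Var("h")))
pythonic_view = render_pythonic(area)
c_like_view = render_c_like(area)
```

The stored value `area` never changes; only the function used to display it does, which is why a syntax choice could in principle be a per-user viewing preference rather than a property of the codebase.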

Core language improvements

The core Unison language has been quite stable over the years. While we occasionally make tweaks to the syntax or pretty-printer, this doesn't break existing code. We've yet to make a backwards incompatible language change; code that was written years ago still runs unchanged.

These are the language improvements that seem the most pressing right now:

  • ⚙️ A foreign function interface. We've held off working on this while the JIT was in progress, since the FFI obviously needs to interact with the runtime. With the JIT more stable, we can make it easy to access arbitrary C libraries from Unison. (Even though code that uses the FFI can't be distributed in the same way as pure Unison, that's fine; we plan to track usage of foreign libraries with abilities.)
  • 🧪 Proper record types. Either extensible records or regular "non-extensible" records. The current experience with record types is not great, because they are just regular data types with some autogenerated helper functions. A more first-class representation of records would be much better.

Also potentially in the works for later:

  • 🧪 Generalized algebraic data types. This is a known extension to the basic type system that Unison uses, but we haven't gotten around to implementing it.
  • 🧪 Typeclasses or something similar. This has been discussed a lot over the years. While typeclasses a la Haskell aren't really a slam dunk for Unison, something like this feature can be incredibly convenient.

Conclusion

Phew, that was a lot!

Unison was designed differently than most languages. This was done for good reason, to make new things possible, and it's paid off. Here are a few things we have today in Unison:

  • No builds. A perfect and shared incremental compilation cache.
  • Easy distributed computation, deployment with a function call, etc.
  • Hyperlinked code browsing and diffs, by default.
  • Computable docs with live hyperlinked examples.
  • Find usages, type-based search, and even structural term rewriting.
  • Instant non-breaking renames.
  • … and more

Unlike a lot of programming tech which has stagnated due to outdated assumptions about programming ("programs are single-machine computations" or "codebases are bags of text files"), Unison is only going to keep getting better. And not just bit by bit, but by huge leaps.

Onward!

– The Unison Team 💜