Smyth

Smyth is a tool that completes unfinished programs in a typed, functional language by gathering information from normal code assertions.

Such a tool is known as a program synthesizer.

Let’s see how Smyth can help us finish a program that duplicates each element of a list.

Try it yourself!

You can edit all this code in the browser! Just be warned: the web-based version of Smyth is much slower and less robust than the official command-line tool, so check that repo out if you want to try more involved examples.

We’ve written out the types and some unit tests here, but the actual implementation is incomplete. Click the “Forge” button above to have Smyth forge some code for us.

Smyth gathers enough information from our assertions to fill in the hole on line 11 of our program.

Let’s take a look at a more complicated example.

Let’s create a function max that returns the greater of its two arguments.

`specifyFunction`

We’ll often use the helper function specifyFunction as shorthand for a list of assertions. For example, specifyFunction f [(0, 1), (2, 3)] would yield the assertions assert f 0 == 1 and assert f 2 == 3.

We also define helpers specifyFunction2, specifyFunction3, and so on to specify two- and three- argument functions, respectively (and so on).

Smyth successfully synthesizes max, but it took a lot of examples! Try removing some examples to see if and when Smyth returns overspecialized code.

The truth is, sometimes it’s easier to write code than examples. For example, the case when m = 0 is really easy for us to figure out: max(0, n) = n, always!

If we sketch out this branch of the code in max, Smyth can fill in the rest of the code for us with fewer examples.

Sketches

As in the max example, we can provide partially-completed programs (known as sketches) to Smyth to be completed.

Internally, Smyth uses sketches to keep track of its progress as it builds a full solution for us.

In order to forge code for us, Smyth needs a way to run these sketches and a way to check them against our assertions.

Let’s take a look at how Smyth runs and checks sketches using live evaluation and live unevaluation respectively.

Live Evaluation

Smyth uses a technique called live evaluation invented by the folks behind the Hazel project to run sketches.

Live evaluation works by running the code “around” a hole. For example, 1 + 2 + ?? live evaluates to 3 + ??.

Each hole in the output also keeps track of what all the variables in the program are equal to at the time of evaluation. So

let x = 7 in 1 + 2 + ??

live evaluates to 3 + [x ↦ 7] ??.

These bindings are very important for Smyth when it comes time to generate the new code that completes a sketch.

Live Unevaluation

Smyth uses an all-new technique we call live unevaluation to check sketches against assertions.

Live unevaluation transfers assertions on code to constraints specifically on holes.

For example, (3, 4) unevaluates onto (??₀, 4) with the constraint ??₀ ⊧ 3.

In other words, assert (??₀, 4) == (3, 4) produces the constraint ??₀ ⊧ 3.

As with live evaluation, Smyth keeps track of what the variables in the program are equal to, so

let y = 8 in assert (??₀, 4) == (3, 4)

produces the constraint [y ↦ 8] ??₀ ⊧ 3.

Live Bidirectional Evaluation

We call the combination of live evaluation and live unevaluation live bidirectional evaluation!

Live bidirectional evaluation enables Smyth to take sketches with assertions and produce constraints on holes.

Smyth then feeds these constraints into an augmented version of the Myth algorithm to generate code that satisfies the hole constraints.

Smyth vs. Myth

The original Myth algorithm only allows for independent top-level holes and requires that assertions be trace-complete. A set of assertions for a function is trace-complete if, when the function is called on each of the asserted examples, all its recursive calls also have an asserted example in the set.

For example, a trace-complete version of the set of assertions for the stutter function given above would be:

specifyFunction stutter
  [ ([], [])
  , ([0], [0, 0])
  , ([1, 0], [1, 1, 0, 0])
  ]

(Notice the extra example for [0].)

The two main contributions of Smyth that live bidirectional evaluation enables are thus:

Synthesis of recursive functions without trace-complete sets of assertions; and
Support for specifying and solving interdependent synthesis goals at arbitrary locations in the program.

More Examples

Let’s start with a function that inserts a number into a sorted list:

Now let’s use that function to make a sorting function:

Changing gears, we can also work with more complex data structures like trees. Let’s make a function that returns an in-order traversal of a tree:

More Information

Interested in more examples?

Check out the examples directory in the Smyth repository. You’ll need to use the command-line version of Smyth to run some of those examples. (The repo’s README contains installation instructions.)

Interested in Smyth’s implementation?

Check out the Smyth repository, the Smyth code documentation, and Justin Lubin’s undergraduate thesis, which discusses implementation details of Smyth.

Interested in theory?

Check out our ICFP 2020 publication, Program Sketching with Live Bidirectional Evaluation.

Interested in Myth?

Check out Peter-Michael Osera’s PhD thesis, which was a huge help in developing Smyth!

Interested in live evaluation?

Check out the Hazel project, which pioneered live evaluation and takes related ideas even further. The Hazel team is actually working on implementing Smyth within their system right now!

Acknowledgements

We would like to thank:

Ian Voysey for guidance regarding proof strategies;
Nadia Polikarpova, Brian Hempel, Michael Adams, Youyou Cong, and anonymous reviewers for many helpful suggestions;
Aws Albarghouthi, John Feser, Viktor Kunčak, and Nadia Polikarpova for answering questions about Escher, λ², Leon, and Synquid; and
Robert Rand—who coined the name Myth—for suggesting the name Smyth, thus further entangling our work with its predecessor.

This work was supported by the following NSF grants:

Semantic Foundations for Hole-Driven Development (CCF-1814900 and CCF-1817145); and
Direct Manipulation Programming Systems (CCF-1651794).