Smyth is a tool that completes unfinished programs in a typed, functional language by gathering information from normal code assertions.
Such a tool is known as a program synthesizer.
Let’s see how Smyth can help us finish a program that duplicates each element of a list.
You can edit all this code in the browser! Just be warned: the web-based version of Smyth is much slower and less robust than the official command-line tool, so check that repo out if you want to try more involved examples.
We’ve written out the types and some unit tests here, but the actual implementation is incomplete. Click the “Forge” button above to have Smyth forge some code for us.
Smyth gathers enough information from our assertions to fill in the hole on line 11 of our program.
Let’s take a look at a more complicated example.
Let’s create a function max
that returns the
greater of its two arguments.
specifyFunction
We’ll often use the helper function specifyFunction
as shorthand for a list of assertions. For example,
specifyFunction f [(0, 1), (2, 3)]
would yield the
assertions assert f 0 == 1
and assert f 2 == 3
.
We also define helpers specifyFunction2
,
specifyFunction3
, and so on to specify two- and three-
argument functions, respectively (and so on).
Smyth successfully synthesizes
max
, but it took a lot of examples!
Try removing some examples to see if and when
Smyth
returns overspecialized code.
The truth is, sometimes it’s easier to write code than
examples. For example, the case when m = 0
is really
easy for us to figure out: max(0, n) = n
, always!
If we sketch out this branch of the code in max
,
Smyth
can fill in the rest of the code for us with fewer examples.
As in the max
example, we can provide
partially-completed programs (known as sketches) to
Smyth to be completed.
Internally, Smyth uses sketches to keep track of its progress as it builds a full solution for us.
In order to forge code for us, Smyth needs a way to run these sketches and a way to check them against our assertions.
Let’s take a look at how Smyth runs and checks sketches using live evaluation and live unevaluation respectively.
Smyth uses a technique called live evaluation invented by the folks behind the Hazel project to run sketches.
Live evaluation works by running the code “around” a
hole.
For example,
1 + 2 + ??
live evaluates to
3 + ??
.
Each hole in the output also keeps track of what all the variables in the program are equal to at the time of evaluation. So
let x = 7 in 1 + 2 + ??
live evaluates to
3 + [x ↦ 7] ??
.
These bindings are very important for Smyth when it comes time to generate the new code that completes a sketch.
Smyth uses an all-new technique we call live unevaluation to check sketches against assertions.
Live unevaluation transfers assertions on code to constraints specifically on holes.
For example,
(3, 4)
unevaluates onto
(??0, 4)
with the constraint
??0 ⊧ 3
.
In other words,
assert (??0, 4) == (3, 4)
produces the constraint
??0 ⊧ 3
.
As with live evaluation, Smyth keeps track of what the variables in the program are equal to, so
let y = 8 in assert (??0, 4) == (3, 4)
produces the constraint
[y ↦ 8] ??0 ⊧ 3
.
We call the combination of live evaluation and live unevaluation live bidirectional evaluation!
Live bidirectional evaluation enables Smyth to take sketches with assertions and produce constraints on holes.
Smyth then feeds these constraints into an augmented version of the Myth algorithm to generate code that satisfies the hole constraints.
The original Myth algorithm only allows for independent top-level holes and requires that assertions be trace-complete. A set of assertions for a function is trace-complete if, when the function is called on each of the asserted examples, all its recursive calls also have an asserted example in the set.
For example, a trace-complete version of the set of assertions for
the stutter
function given above would be:
specifyFunction stutter
[ ([], [])
, ([0], [0, 0])
, ([1, 0], [1, 1, 0, 0])
]
(Notice the extra example for [0]
.)
The two main contributions of Smyth that live bidirectional evaluation enables are thus:
Let’s start with a function that inserts a number into a sorted list:
Now let’s use that function to make a sorting function:
Changing gears, we can also work with more complex data structures like trees. Let’s make a function that returns an in-order traversal of a tree:
Interested in more examples?
Check out the examples
directory in the
Smyth repository.
You’ll need to use the command-line version of
Smyth
to run some of those examples.
(The repo’s README
contains installation
instructions.)
Interested in Smyth’s implementation?
Check out the Smyth repository, the Smyth code documentation, and Justin Lubin’s undergraduate thesis, which discusses implementation details of Smyth.
Interested in theory?
Check out our ICFP 2020 publication, Program Sketching with Live Bidirectional Evaluation.
Interested in Myth?
Check out Peter-Michael Osera’s PhD thesis, which was a huge help in developing Smyth!
Interested in live evaluation?
Check out the Hazel project, which pioneered live evaluation and takes related ideas even further. The Hazel team is actually working on implementing Smyth within their system right now!
We would like to thank:
This work was supported by the following NSF grants: