Verifying Legato’s multiplier in Cryptol

Consider the following multiplication algorithm, coded in Mostek 6502 Assembler:

 LDX #8 ; 1; load X immediate with the 8LDA #0 ; 2; load A immediate with the 0CLC ; 3; set C to 0LOOP ROR F1 ; 4; rotate F1 right circular through CBCC ZCOEF ; 5; branch to ZCOEF if C = 0CLC ; 6; set C to 0ADC F2 ; 7; set A to A+F2+C and C to the carryZCOEF ROR A ; 8; rotate A right circular through CROR LOW ; 9; rotate LOW right circular through CDEX ;10; set X to X-1BNE LOOP ;11; branch to LOOP if Z = 0

This program comes from Wilfred Legato’s paper “A Weakest Precondition Model for Assembly Language Programs.” It multiplies the contents of the memory locations F1 and F2; each of which is 8-bits wide. The result is stored in the accumulator register A and the memory location LOW, each of which is, again, 8-bits. It holds that:

 F1 * F2 = 256 * A + LOW

when the algorithm terminates, correctly handling the overflow. It is worth spending a moment or two pondering how this algorithm works; it is not at all obvious how the multiplication is done!Legato’s challenge  (as  referred to in ACL2 circles) is to prove a deep-embedding of Legato’s algorithm correct with respect to a Mostek simulator coded in ACL2. We do not attempt to solve  Legato’s challenge in Cryptol. We are merely interested in coding and proving that Legato’s multiplier is correct in Cryptol. Our interest stems from the fact that Legato’s algorithm is a truly interesting multiplier on its own right, and we would like to make sure that a straightforward encoding of it in Cryptol can be proven correct automatically by Cryptol’s verification tools. And of course, it’s just too hard to pass up on the opportunity to  pay respect to the Mostek chip that powered the Commodore 64‘s and Atari 800XL‘s of our childhood.

A shallow embedding

The Cryptol solution to Legato’s problem will be a fairly shallow encoding of the multiplier, together with an automated proof of correctness. We choose to do a shallow encoding here since it allows us to focus on the multiplication algorithm itself, as opposed to the particulars of the underlying Mostek chip. Theorem proving based solutions (such as those given by ACL2 folks) will rightly pursue a deeper embedding of the algorithm and the Mostek architecture in general. Cryptol is not particularly suitable for deep embeddings. Representing Mostek assembly instructions directly as Cryptol functions is a much simpler and straightforward choice.Looking at Legato’s multiplier above, we will represent each instruction (from 1 to 11) as a simple state transformer, taking a simplified representation of the Mostek machine state as input and delivering a new one. We will only represent parts of the state that matter for our problem. The following Cryptol type declaration succinctly captures what we need:

 type Mostek = ( [8] // F1, [8] // F2, [8] // A, [8] // X, [8] // LOW, Bit // C (Carry), Bit // Z (Zero));

Using this state representation, each instruction in the program can be modeled as a  state transformer:

 type Instruction = Mostek -> Mostek;

This takes care of the data-flow aspect of the embedding; but the question of how to model control-flow remains. We will simply use the host-language’s control-flow features, using the quintessential functional idiom: by calling functions! This is actually easier done than said, and here’s our embedding of the first instruction of the program:

 // step1: LDX #8; load X immediate with the integer 8.step1 : Instruction;step1 (f1, f2, a, _, l, c, z) =step2 (f1, f2, a, 8, l, c, z);

Let’s spend a minute explaining this in detail. The first step in the program loads the register X with the immediate value 8. Using our state-transformer model, our step1 function will receive a Mostek state (consisting of the “current” values of F1, F2, A, X, LOW, CARRY, and ZERO). The “effect” of this instruction is to put the value 8 into the register X, leaving everything else the same. Once this is done, the control goes to the next instruction, which we model by calling the function step2 (which is yet to be defined).In this fashion, we can shallowly embed all the instructions in Legato’s multiplier, using Cryptol’s native functions and control-flow features. Of course, this is hardly a new idea, being the essence of the whole domain-specific embedded language saga: Using a rich host-language to “fake” other languages.Following the recipe set by step1, it is easy to model the next two instructions:

 // step2: LDA #0; load A immediate with the integer 0.step2 : Instruction;step2 (f1, f2, _, x, l, c, z) =step3 (f1, f2, 0, x, l, c, z);// step3: CLC; set C to 0 (Note the use of Bit False here)step3 : Instruction;step3 (f1, f2, a, x, l, _, z) =step4 (f1, f2, a, x, l, False, z);

Step 4 is equally easy in terms of control flow, but is tricky in terms of operation. After some head-scratching, one figures out that the term “rotate F1 right circular through C” means put the right-most bit of F1 in C, and put C in the first position of F1. A bizarre thing to do indeed, but that’s the beauty of Legato’s multiplier. The Cryptol translation is almost literal:

// step4: LOOP ROR F1; rotate F1 right circular through C.step4 : Instruction;step4 (f1, f2, a, x, l, c, z) =step5 (f1', f2, a, x, l, b0, z)where {[b0 b1 b2 b3 b4 b5 b6 b7] = f1;f1' = [b1 b2 b3 b4 b5 b6 b7 c];};

The use of pattern matching in getting the bits out of f1, and the construction of the new value of f1 is idiomatic Cryptol. There’s one little catch though: Apparently Mostek was a big-endian machine, having a most-significant-bit-first representation. Cryptol is little-endian. So, instead of rotating the bits to right, we  rotate them left.The fifth instruction is the first time where we use Cryptol’s control-flow to model the Mostek jump instruction:

 // step5 : BCC ZCOEF; branch to ZCOEF if C = 0.// ZCOEF is step8 in our encodingstep5 (f1, f2, a, x, l, c, z)= if c then step6 (f1, f2, a, x, l, c, z)else step8 (f1, f2, a, x, l, c, z);

In this case, we simply receive a state, and depending on the value of the carry bit (C), we either go to the next step (i.e., no jump); or go to the ZCOEF instruction, which is going to be step-8 in our model. Easy as pie!Step-6 is a replica of Step-3, clearing the carry bit:

 // step6: CLC; set C to 0step6 (f1, f2, a, x, l, _, z) =step7 (f1, f2, a, x, l, False, z);

Step-7 is the most compute intensive part of the algorithm. The Cryptol encoding is a bit complicated due to the need to determine if there was a carry in the addition. Since all Cryptol arithmetic is modular, we are forced to do the computation at an extended bit-size. Otherwise, the modeling of the ADC instruction is quite straightforward:

 // step7: ADC F2; set A to A+F2+C and C to the carry.step7 (f1, f2, a, x, l, c, z) =step8 (f1, f2, a', x, l, c', z')where {// 8-bit "modular" resulta' = a + f2 + (if c then (1:[8]) else (0:[8]));// Was there a carry? Check that "real"// result is larger than 255a'Large : [9];a'Large =(a # zero) // extend a by adding zero bits+ (f2 # zero) // same for f2+ (if c then (1:[9]) else (0:[9]));c' = a'Large > (255:[9]);// set the zero flagz' = a' == 0;};

The Cryptol idiom x # zero simply represents the value x extended on the right with 0 bits. (Remember that Cryptol is little-endian, hence the addition of zero bits on the right does not change the value.) Due to the polymorphic type of the value zero, the result has any number of bits larger than equal to the original bit-size of x. (Since we only need 9-bits i
n this case, we could have coded the same via the expression x # [False], but the former expression is more idiomatic Cryptol.)Steps 8 and 9 are similar to Step-4, using A and LOW instead of F1, respectively:

 // step8 : ZCOEF ROR A; rotate A right circular through C.step8 : Instruction;step8 (f1, f2, a, x, l, c, z) =step9 (f1, f2, a', x, l, a0, z)where {[a0 a1 a2 a3 a4 a5 a6 a7] = a;a' = [a1 a2 a3 a4 a5 a6 a7 c];};// step9 : ROR LOW; rotate LOW right circular through C.step9 : Instruction;step9 (f1, f2, a, x, l, c, z) =step10 (f1, f2, a, x, l', l0, z)where {[l0 l1 l2 l3 l4 l5 l6 l7] = l;l' = [l1 l2 l3 l4 l5 l6 l7 c];};

Step-10 simply decrements X, setting the ZERO flag appropriately:

 // step10: DEX; set X to X-1step10 : Instruction;step10 (f1, f2, a, x, l, c, z) =step11 (f1, f2, a, x', l, c, x'==0)where x' = x-1;

Finally, step-11 either jumps back to the top of the loop (step-4), or finishes the algorithm:

 // step11: BNE LOOP; branch to LOOP if Z = 0.// LOOP is step4 in our encodingstep11 : Instruction;step11 (f1, f2, a, x, l, c, z)= if zthen (f1, f2, a, x, l, c, z) // done!else step4 (f1, f2, a, x, l, c, z);

From a control-flow perspective, we indicate the end of the algorithm by simply returning the final Mostek state. It is worthwile at this point to go through the Cryptol embeddings of the instructions to see how they match-up to the Mostek assembly given by Legato.

Extracting the multiplier

Having coded Legato’s multiplier as a sequence of state transformers, we can simply call the function step1 to use it with an appropriate state. The following helper function simplifies this task for us, by loading the registers F1 and F2, and extracting the high and low bits at the end:

legato : ([8], [8], Mostek) -> ([8], [8]);legato (f1, f2, st) = (hi, lo)where {// get the relevant parts// to construct the starting state(_, _, A, X, LOW, C, Z) = st;// Run legato multiplier;// final A is hi; and final LOW is low(_, _, hi, _, lo, _, _) =step1 (f1, f2, A, X, LOW, C, Z);};

Note that legato still takes the starting machine state st as an argument. Legato’s claim (which we will shortly prove) is that the algorithm works correctly no matter what the initial state is, hence it is important to be explicit about the starting state.To see legato in action, let’s just run it on a simple input:

 legato> legato (12, 93, (9, 42, 3, 8, 1, False, True))(4, 92)

where I just made up the initial state by plugging in some random values. If Legato is right, then it must be the case that

 12 * 93 = 256 * 4 + 92

correctly computing the high and low bytes. And voila! Both sides equal 1116. Magic!


If you do believe in magic,  you can stop reading now. But I suspect most readers of the Galois blog will be looking for something more concrete. Surely, we must be able to give a better argument than claiming witchcraft for the correctness of our implementation.Let us first formally capture what we mean by “correct,” by writing a Cryptol theorem that expresses our intuitive expectation:

theoremlegatoIsCorrect: {x y st}. x' * y' == 256 * hi' + lo'where { (hi, lo) = legato (x, y, st);hi', lo', x', y' : [16];hi' = hi # zero;lo' = lo # zero;x' = x # zero;y' = y # zero};

Here’s the English reading of this theorem: “For all values of x, y, and st, if we run legato on these values and get the results hi and lo, then, it’ll be the case that x * y = 256 * hi + lo.” The only caveat is that we have to do arithmetic operations over 16 bit values (instead of 8), to make sure the theorem statement correctly captures the intended mathematical meaning. (Recall that all Cryptol arithmetic is modular with respect to the bit-size involved.) Hence, we simply add extra zero‘s at the end to enlarge the arguments to 16 bits. Note that, we do not have to assert that the value of lo is at most 255; this is automatically guaranteed by the fact that it is an 8-bit value. Cryptol’s bit-precise type system saves the day!


Here’s what happens when I run cryptol on the file containing the above theorem:

$ cryptol legato.cryCryptol version 1.8.5, Copyright (C) 2004-2009 Galois, Inc.www.cryptol.netType :? for helpLoading "legato.cry".. Checking types.. Processing.. Done!*** Auto quickchecking 1 theorem.*** Checking "legatoIsCorrect" ["legato.cry", line 147, col 1]Checking case 100 of 100 (100.00%)100 tests passed OK[Coverage: 3.47e-14%. (100/288230376151711744)]

When Cryptol sees a theorem declaration in a loaded file, it automatically performs a quick-check run to provide feedback on its validity. In this case, Cryptol automatically created 100 random test values for the theorem and checked that each one of them satisfied the statement. This is a quick way of getting feedback on the correctness of theorems, courtesy of Cryptol at no additonal cost to the user!While the quick-check run is promising, the coverage info indicates that we’ve barely scratched the surface. The entire state space in this case has 58 bits (8 each for x and y, plus the starting arbitrary state of the Mostek machine costing us an extra 42 bits; for a total of 58). The total number of possible inputs is, therefore, 258 or 288230376151711744. This is a huge number: If you had a computer that run 1-billion (109) test cases every second, it’d still take you over 9 years to go through all possible inputs!Of course, we can do better. Cryptol’s theorem proving environment uses modern equivalence-checkers to prove such theorems automatically, at the push of a (virtual) button:

 legato> :prove legatoIsCorrectQ.E.D.

And there, we’ve proved that our implementation of Legato’s multiplier is indeed correct for all possible inputs! (The above proof takes about 2.5 minutes to complete on my 3-year old MacBook Pro, using abc as the underlying equivalence checker in Cryptol’s symbolic mode. I should also note that the symbolic mode is only available in the full Cryptol release, for which free licenses are available.)

Closing thoughts

I must emphasize that we are not advocating Cryptol as a platform for doing proofs of algorithm correctness. Modern theorem provers such as ACL2, Coq, or Isabelle are the leading tools in this regard. (In particular, the logic behind Cryptol’s automated theorem prover is much less expressive, for starters.) Where Cryptol shines is in its restricted attention to bit-vectors and data-flow algorithms (cryptography being a prime application area), and it turns out that automated equivalence-checking based techniques do perform rather well for such problems. Our shallow embedding of Legato’s multiplier and the automated proof-of-correctness is a case in point.There is one more important point to make. While push-button provers are indispensable in industrial practice, the final Q.E.D. you get from an interactive theorem prover such as ACL2 or Isabelle is much more satisfactory. For instance, we can hardly claim that the above proof increased our understanding of Legato’s algorithm in any sense, it just made us really believe it. I’m willing to bet that anyone who goes through a similar proof in ACL2 or Isabelle would have a much higher chance of having their “aha!” moment, where everything just sinks in…On the practical side, however, nothing beats the fully-automated Q.E.D., especially when your boss is breathing down your neck!


The Cryptol file containing Legato’s multiplier and the correctness theorem is here. The Cryptol toolset licenses are freely available at

Read More

Tech Talk: Orc in Haskell

The June 16th Galois Tech Talk will be delivered by Trevor Elliott, titled “Orc in Haskell.”

  • Date: Tuesday, June 16th, 2009
  • Time: 10:30am – 11:30am
  • Location: Galois, Inc.421 SW 6th Ave. Suite 300(3rd floor of the Commonwealth Building)Portland, OR 97204

Abstract: Concurrency is difficult to realize successfully. The Orc language tackles this problem by introducing explicit concurrency as part of its core. It presents a clean, and somewhat monadic, style of programming that should look familiar to Haskell users. I will give a quick introduction to the Orc language, using several examples to motivate its use. Following this introduction, a monadic Haskell embedding of the major features will be presented, bringing a type system to Orc.Bio: Trevor Elliott is a member of the technical staff at Galois, Inc.  His interests center around functional programming, and the effective use of type systems.Slides are available for download.Update: the source is now available on Hackage (though changed from the version presented at this talk).

Galois has been holding weekly technical seminars for several years on topics from functional programming, formal methods, compiler and language design, to cryptography, and operating system construction, with talks by many figures from the programming language and formal methods communities. The talks are open and free. An RSVP is not required, but feel free to contact the organizer with questions and comments.

Read More

EDSLs for Unmanned Autonomous Verification and Validation

We have a new position paper on the use of EDSLs (LwDSLs) for verification and validation of unmanned vehicle avionics, written jointly with John van Enk of DornerWorks, recently presented at a mixed-criticality architecture conference. (Download) :: PDF

Lee Pike, Don Stewart, John Van EnkCPS Week 2009 Workshop on Mixed CriticalityRoadmap to Evolving UAV Certification

We outline a new approach to the verification and validation (V & V) of safety-critical avionics based on the use of executable lightweight domain specific languages – domain-specific languages hosted directly in an existing high-level programming language. We provide examples of LwDSLs used in industry today, and then we describe the advantages of LwDSLs in V & V. We argue the approach promises substantial automation and cost-reduction in V & V.

Read More

Solving Sudoku Using Cryptol

Cryptol is a language tailored for cryptographic algorithms. Sudoku is a popular puzzle the reader  is no-doubt already familiar with. We will offer no deep reason why anyone should try to solve Sudoku in Cryptol; other than the very fact that it’d be a shame if we couldn’t!Needless to say, Cryptol has not been designed for encoding search algorithms. Nonetheless, some of the features of Cryptol and its associated toolset make it extremely suitable for expressing certain constraint satisfaction problems very concisely; and Sudoku very nicely falls into this category.

Representing the board

A Sudoku board can be represented in a variety of ways. We will pick the simplest: A sequence of 9 rows, each of which has 9 elements storing the digits. Each digit will require 4 bits; since they range from 1 to 9. So, a good Cryptol type for a board is:


In Cryptol-speak, this type simply represents a sequence of precisely 9 elements, each of which is a sequence of 9 elements themselves, each of which are 4-bit words. (Technically, the type [4] also represents a sequence of precisely 4 elements, each of which are bits. But it’s easier to read that as 4-bit words. The type [4] and [4]Bit are synonymous in Cryptol, and can be used interchangeably in all contexts.)

Read More

Equivalence and Safety Checking in Cryptol

The Cryptol language comes with an integrated verification tool-set that can automatically perform equivalence and safety checking on Cryptol programs. Recently, we have presented a paper on this topic at PLPV’09: “Programming Languages Meets Program Verification” workshop. (Slides are also available.)

Briefly, equivalence checking refers to the problem of proving that two functions have the exact same input/output behavior. Typically, these functions are versions of the same algorithm; one being a reference implementation and the other being an optimized version. Cryptol automatically establishes that the optimized version is precisely equivalent to the original. If the functions are not equivalent, Cryptol provides a counter-example where they disagree; aiding greatly in development/debugging.

Safety checking refers to the problem of proving that the execution of a function cannot raise any exceptions; such as division by zero; index out-of-bounds, etc. When the safety checker says that a function is safe, you will know for sure that such conditions will never arise at run-time. (Similarly, you will get a concrete counter-example from Cryptol if this is not the case.)

Cryptol uses symbolic simulation to translate equivalence and safety checking problems to equivalent problems using the bit-vector logic of SMT-Lib. Furthermore, Cryptol has built-in connections to several SAT/SMT solvers. It automatically calls these provers and presents the results to the user in original Cryptol terms; providing a seamless verification environment for the end-user.

The full paper and slides on equivalence checking in Cryptol are available for download.

Read More

A Cryptol Implementation of Skein

Following on from the MD6-in-Cryptol posting, let’s consider another very interesting candidate from the (deep) pool of SHA-3 submissions; Skein

by the merry band of Ferguson, Lucks, Schneier, et al.The expression of their reference implementation comes out, we think, fairly cleanly in Cryptol. The digest output size is a variable parameter to the algorithm, but we’ll focus on the 512-bit version here — the submission’s primary candidate for SHA-3.In order to avoid duplicating the introductory material on Cryptol, we suggest the reader go through the MD6 writeup to get a grounding in Cryptol, its idioms, and syntax.

Read More

MD6 in Cryptol

NIST is currently running a competition to come up with the next generation message hashing function that it intends to standardize and FIPS recommend upon completion (assuming one good candidate is left standing and well at the conclusion of the evaluation process):

Apart from the need to come up with better alternatives to its current recommendation, the SHA-2 family of hashing functions, this competition draws inspiration from the success that the AES competition had a couple of years ago in engaging the community in coming up with a replacement for the DES block cipher. As then, a lot of new innovation has resulted.As with block ciphers, many common types of hashing functions lend themselves well to expression in Cryptol. To demonstrate some of the features of Cryptol and how it could be used to express SHA-3 candidates, here’s one of the submissions, MD6 from the CSAIL group at MIT, headed by Ronald L. Rivest:

The goal of this writeup is twofold:

  • Introduce you to the MD6 hashing algorithm and its construction.
  • Expose you to the Cryptol language, and how it lends itself to expressing MD6.

Ideally, you’ll come away with enthusiasm on both accounts!

Read More