Substitution ciphers in Cryptol

Substitution ciphers are one of the oldest encryption methods, dating back to at least the 15th century. In a substitution cipher, each character in the plain-text is simply “substituted” according to a predefined map. Decryption is simply the substitution in the reverse direction. Wikipedia has a nice description of these ciphers. Obviously, you wouldn’t want your bank to use such a cipher when executing your web-based transactions! But they are fun to play around, especially when entertaining kids in hot summer days. In this post, we’ll see how to code simple substitution ciphers in Cryptol, and go a step further and actually prove that our implementation is correct.

Preliminaries

The simplest form of substitution ciphers use a permutation of the input alphabet. That is, each letter in the input alphabet gets mapped to another in the same  alphabet. (Strictly speaking, input and output alphabets need not be the same, but nothing essential changes by making that assumption.) For instance, you might decide that your substitution will map ‘a’ to ‘q’, and ‘b’ to ‘d’, …, etc., making sure no two letters are mapped to the same target. Once this mapping is agreed on, all you have to do to encrypt a given message is to map each character to the corresponding element according to your predefined mapping rules.Here’s our Cryptol encoding of these ciphers. First, some preliminary declarations:

type Char = [8];
type String(l) = [l]Char;
type Table(n) = [n](Char, Char);

We’ll simply assume that the input consist of “characters,” each of which will be 8-bit quantities (i.e., numbers from 0 to 255). We will simply use ASCII encoding for normal English characters. This is captured by the Char type declaration above, which simply gives a convenient name for 8-bit wide words. The second type declaration captures sized-strings: For any given size l, the type String(l) represents a sequence of length l, containing 8-bit words. For instance, String(16) is the type of all sequences of length 16, containing numbers from 0 to 255 as elements. Finally a Table of size n is simply n-pairings of characters that form a substitution. Here’s the example table we will use:

cipherTable : Table(28);
cipherTable = [| (x, y) || x <- plain || y <- cipher |] 
  where { 
    plain = "abcdefghijklmnopqrstuvwxyz ."; 
    cipher = "oebxa.cdf hijklmnzpqtuvwrygs" 
  };

Note that our table has 28 entries (the lower-case English alphabet, plus space and the dot). A simple Cryptol sequence-comprehension succinctly zips the sequences up, forming our example table.

Performing the substitution

Given a table and a character, the function subst returns the element that the table maps the character to:

subst : {n} (fin n) => (Table(n), Char) -> Char;
subst (table, chr) = find 0 
  where find i = if i == width table 
    then '?' 
    else if chr == chr' 
      then code 
      else find (i+1) 
    where (chr', code) = table @ i;

To do the search, we simply start from index 0 and walk through the given table recursively, returning the mapped element if we have a match. If the given table does not have the corresponding element,  we simply return the character ‘?’, which will indicate failure.  (Aside: Note how the predicate "fin n" ensures we’re given a finite table, ensured by Cryptol’s type-system. Haskell enthusiasts can gripe about receiving an infinite list when they didn’t ask for one!)

Encryption and Decryption

Having defined subst, encryption and decryption are mere maps:

encrypt (table, msg) = [| subst (table, c) || c <- msg |];
decrypt (table, msg) = [| subst (table', c) || c <- msg |] 
  where table' = [| (y, x) || (x, y) <- table |];

where we simply swap the elements in the table for decryption.

Substitution in action

That’s pretty much all there is to it for substitution ciphers. To illustrate, let us create specialized versions of encrypt and decrypt for our example table, together with some test data:

enc, dec : {l} String(l) -> String(l);
enc msg = encrypt (cipherTable, msg);
dec msg = decrypt (cipherTable, msg);
plainText, cipherText, decodedText : String(51);
plainText = "the quick brown fox jumped over the lazy black dog.";
cipherText = enc plainText;
decodedText = dec cipherText;

Here’s Cryptol in action:

SubstCipher> :p plain
Textthe quick brown fox jumped over the lazy black dog.
SubstCipher> :p cipherText
qdagntfbhgezlvkg.lwg tjmaxgluazgqdagioyrgeiobhgxlcs
SubstCipher> :p decodedTextthe quick brown fox jumped over the lazy black dog.

Volia! The secret is now safe, thanks to our substitution cipher.. (Note that we’re using Cryptol’s :p command which prints its argument as a string, as opposed to a sequence of 8-bit words.)

Correctness

Substitution ciphers are dead-simple, but it would be nice to have further assurance that our implementation is indeed correct. This is where Cryptol’s verification tools come into play. Let us first try to write a simple theorem, stating that encryption followed by decryption does not alter the message:

theorem checkEncDec: {msg}. dec (enc msg) == (msg : String(16));

Note that we have restricted the theorem to messages of size 16 only, in order to create a  monomorphic theorem that Cryptol’s verification engine can check/prove automatically. (While polymorphic theorems can be stated in Cryptol, they cannot be proven automatically.)Before attempting a proof, it’s always good to get a quick-check evidence that we’re on safe ground. Here’s what happens when I try to quick-check the above theorem using Cryptol:

SubstCipher> :check checkEncDecChecking case 1 of 1000 (0.10%)Falsifiable.checkEncDec [0x15 0x3e 0xf2 0x4f 0x34 0xc4 0x69 0x5a 0x64 0x9e 0xb0 0xe2 0x6f 0xf8 0x6a 0x4a]	= False

Oops! Something went terribly wrong, Cryptol found a counter-example without even really trying. Let’s use Cryptol’s “:p” command to see what the counter-example really is:

SubstCipher> :p [0x15 0x3e 0xf2 0x4f 0x34 0xc4 0x69 0x5a 0x64 0x9e 0xb0 0xe2 0x6f 0xf8 0x6a 0x4a]>?O4?iZd???o?jJ

Ah, it contains characters that we have not mapped! Since our input accepts any 8-bit word, all values from 0 to 255 are valid “characters.” But we clearly have not mapped most of these to anything, causing Cryptol to rightfully reject our theorem!

Conditional theorems

What we need is a conditional theorem, one that has a hypothesis that says “for all good messages.” That is, we need to consider only those messages that our cipher knows how to map. Here’s a helper function to do just this:

all, any : {n a} (fin n) => (a -> Bit, [n]a) -> Bit;
all (f, xs) = [| f x || x <- xs |] == ~zero;
any (f, xs) = [| f x || x <- xs |] != zero;
checkMessage (table, msg) = all (isMapped, msg) where isMapped c = any ((c', _) -> c == c', table);

[Aside: Cryptol’s polymorphic constant zero is simply the value that is False at every point. Thus, checking against its negation (~zero) ensures all the elements are True, and checking inequivalence against it (i.e., != zero), ensures at least one element is True.]

Having defined checkMessage, we can now express our correctness theorem (again constrained to strings of size 16):

theorem cipherIsCorrect: {msg}. if checkMessage(cipherTable, (msg : String(16))) then dec (enc msg) == msg else True;

If you try the :check command on this theorem, you will find that Cryptol is now happy with it:

SubstCipher> :check cipherIsCorrectChecking case 1000 of 1000 (100.00%)1000 tests passed OK[Coverage: 0.00%. (1000/340282366920938463463374607431768211456)]

The coverage info suggests we have not even barely scratched the test state-space in this case, as one can expect. However, that’s the least of our worries in this case. More importantly, quick-check’s OK is not  satisfactory at all for conditional theorems. How do we know that the random test data generated by Cryptol will pass the checkMessage test? In fact, it is very likely that very few of the test cases actually executed the actual equality check itself, as the “randomly-generated” messages are unlikely to pass the checkMessage filter.

Verification

Luckily, we can do better. Using Cryptol’s automated theorem prover, we can rigorously prove that the theorem is indeed true for all possible messages:

SubstCipher> :prove cipherIsCorrectQ.E.D.

Cryptol proves this theorem on my 3 year old laptop in less than a second!  (We  should also note that the :prove command is available in the symbolic and SBV backends of Cryptol that comes with the full release. It uses off-the-shelf SAT/SMT solvers to prove Cryptol theorems automatically.) Now we can rest assured that our implementation of substitution ciphers in Cryptol is indeed correct!

For the curious

As the alert reader would have no doubt noticed, we have only proved our implementation correct with respect to our example translation map cipherTable. A more general theorem would have proved the cipher correct with respect to all possible tables. Such a theorem is indeed easy to express in Cryptol, but you’ll also need to add the condition that the table is well defined. (That is, it should be a one-to-one map and no letter should be mapped to the special invalid marker ‘?’.) We invite the curious reader to play with this variant. Note that the automated proof will be more complicated in this case, and the backend prover might need more time before returning the final Q.E.D.

Download

The Cryptol source file implementing the substitution cipher is available for download. The Cryptol toolset licenses are freely available at www.cryptol.net. Enjoy!