Solving cryptarithms

Introduction

KBK (25 April 2002)

[Have you come to this page by accident when you were really interested in Solving cryptograms instead?]

In Playing Prolog and Playing predicate logic, RS implies that Tcl is poor at doing backtracking searches. In fact, it isn't all that difficult to write code that does backtracking, and Tcl's upvar and uplevel are just the things to manage the needed control structures.

Let's work out an example. We wish to solve a cryptarithm (an arithmetic sum where the digits have been replaced with letters). Examples include:

     S E N D              Y E L L O W
   + M O R E              Y E L L O W
  ------------     and  +       R E D
   M O N E Y            ---------------
                          O R A N G E

The usual rules for these problems include:

  • Each letter represents a different digit.
  • No digit is represented by more than one letter.
  • Leading digits (S and M in the first problem; Y, R, and O in the second) are nonzero.
  • The usual rules of arithmetic are followed.

When solving these problems by computer, it's usually easiest to guess the less significant digits first.


# We begin with a procedure that is used to make an arbitrary choice for a given digit (stored in the variable digitVar in caller's scope). The possible choices are given in choices, and the ones not chosen are stored in the variable remainingVar in caller's scope. For each possible choice, the script script is evaluated in the caller's scope.

 proc choose { digitVar choices remainingVar script } {
   upvar 1 $digitVar digit
   upvar 1 $remainingVar remaining
   set i -1
   foreach digit $choices {
     incr i
     set remaining [lreplace $choices $i $i]
     uplevel 1 $script
   }
 }

# Very often, part way through a problem, we know what the value of a given letter must be. The next procedure assigns to the variable digitVar in the caller's scope the known value value. The value must be present in the list choices, and the remaining choices get stored in the variable remainingVar in the caller's scope. If the value is present in the list, script is evaluated in caller's scope; otherwise, nothing further happens.

 proc let { digitVar value choices remainingVar script } {
   upvar 1 $digitVar digit
   upvar 1 $remainingVar remaining
   set i [lsearch -exact $choices $value]
   if { $i >= 0 } {
     set digit $value
     set remaining [lreplace $choices $i $i]
     uplevel 1 $script
   }
 }

Program 1

# Someone writes: I AM VERY STUPID. While I don't understand why that person would want to advertise the fact, I'm leaving it in, much like Johnson for President.

# OK, let's turn these procedures loose on the cryptarithm YELLOW + YELLOW + RED = ORANGE

 puts [time {
   # Choose W and D arbitrarily
   choose w { 0 1 2 3 4 5 6 7 8 9 } r1 {
     choose d $r1 r2 {
       set sum1 [expr { $w + $w + $d }]
       # W and D determine E
       let e [expr { $sum1 % 10 }] $r2 r3 {
         set cy1 [expr { $sum1 / 10 }]
         # Choose O arbitrarily
         choose o $r3 r4 {
           # O must be non-zero because it's a leading digit
           if { $o != 0 } {
             set sum10 [expr { $o + $o + $e + $cy1 }]
             # The digits chosen so far determine G
             let g [expr { $sum10 % 10 }] $r4 r5 {
               set cy10 [expr { $sum10 / 10 }]
               # L and R are now chosen arbitrarily
               choose l $r5 r6 {
                 choose r $r6 r7 {
                   # R must not be zero because it's a leading digit
                   if { $r != 0 } {
                     set sum100 [expr { $l + $l + $r + $cy10 }]
                     # N is chosen arbitrarily
                     let n [expr { $sum100 % 10 }] $r7 r8 {
                       set cy100 [expr { $sum100 / 10 }]
                       set sum1000 [expr { $l + $l + $cy100 }]
                       # A is now determined by the other digits
                       let a [expr { $sum1000 % 10 }] $r8 r9 {
                         # Y is the digit that remains
                         choose y $r9 r10 {
                           # Y is a leading digit and may not be zero
                           if { $y != 0 } {
                             # At this point, it's easiest simply to
                             # check the arithmetic using [expr]
                             set yellow $y$e$l$l$o$w
                             set red $r$e$d
                             set orange $o$r$a$n$g$e
                             if { $yellow + $yellow + $red == $orange } {
                               puts [list yellow $yellow red $red orange $orange]
                             }
                           }
                         }
                       }
                     }
                   }
                 }
               }
             }
           }
         }
       }
     }
   }
 }]

Program 2

# The solution of SEND + MORE = MONEY is similar:

 puts [time {
   choose d { 0 1 2 3 4 5 6 7 8 9 } r1 {
     choose e $r1 r2 {
       set sum1 [expr { $d + $e }]
       let y [expr { $sum1 % 10 }] $r2 r3 {
         set cy1 [expr { $sum1 / 10 }]
         choose n $r3 r4 {
           choose r $r4 r5 {
             set sum10 [expr { $n + $r + $cy1 }]
             if { $sum10 % 10 == $e } {
               set cy10 [expr { $sum10 / 10 }]
               choose o $r5 r6 {
                 set sum100 [expr { $e + $o + $cy10 }]
                 if { $sum100 % 10 == $n } {
                   set cy100 [expr { $sum100 / 10 }]
                   choose s $r6 r7 {
                     if { $s != 0 } {
                       choose m $r7 r8 {
                         if { $m != 0 } {
                           set send $s$e$n$d
                           set more $m$o$r$e
                           set money $m$o$n$e$y
                           if { $send + $more == $money } {
                             puts [list send $send more $more money $money]
                           }
                         }
                       }
                     }
                   }
                 }
               }
             }
           }
         }
       }
     }
   }
 }]

And, on a not-terribly fast Windows laptop in Tcl 8.3.4, we get the correct answers:

 yellow 143329 red 846 orange 287504
 2464000 microseconds per iteration
 send 9567 more 1085 money 10652
 1542000 microseconds per iteration

Program 3

Curiously enough, the same framework solves the eight queens problem. Even without resorting to the obvious recursive strategy, it's simple enough to code:

  proc testDiagonal {x queens_array} {
    upvar 1 $queens_array queens
    for {set i 0} {$i < $x} {incr i} {
      if {abs($queens($i) - $queens($x)) == $x - $i} {
        return false
      }
    }
    return true
  }

  choose q(0) { 0 1 2 3 4 5 6 7 } r1 {
    choose q(1) $r1 r2 {
      if { [testDiagonal 1 q] } {
        choose q(2) $r2 r3 {
          if { [testDiagonal 2 q] } {
            choose q(3) $r3 r4 {
              if { [testDiagonal 3 q] } {
                choose q(4) $r4 r5 {
                  if { [testDiagonal 4 q] } {
                    choose q(5) $r5 r6 {
                      if { [testDiagonal 5 q] } {
                        choose q(6) $r6 r7 {
                          if { [testDiagonal 6 q] } {
                            choose q(7) $r7 r8 {
                              if { [testDiagonal 7 q] } {
                                set result {}
                                for { set i 0 } { $i < 8 } { incr i } {
                                  lappend result [list $i $q($i)]
                                }
                                puts $result
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }

and enumerates the ninety-two possible solutions in a couple of seconds.

Program 4

Of course, for a well-structured problem like this, anyone sensible would use a recursive search, which can be based on the same framework:

 proc 8queens {{n 0} {choices {0 1 2 3 4 5 6 7}} {qArray {}}} {
   if { [string compare {} $qArray] } {
     upvar 1 $qArray q
   }
   set nextN [expr { $n + 1 }]
   choose q($n) $choices remainder {
     if [testDiagonal $n q] {
       if { $nextN < 8 } {
         8queens $nextN $remainder q
       } else {
         set result {}
         for { set i 0 } { $i < 8 } { incr i } {
           lappend result [list $i $q($i)]
         }
         puts $result
       }
     }
   }
 }
 8queens

and of course gives the same answer.

AMG: I added an implementation of [testDiagonal].


There is a lot of related work on Control structures for backtracking search.


RS: I'm amazed at the nesting depths.. Looks like such code could also be generated from the problem description. But is this backtracking in the sense of Prolog (where "cuts" can reduce the search tree), or rather depth-first searching of all possible pathes? I think the choice of "first solution" vs. "all solutions" can be made in the deepest brace nest, where an exit, error or return -code 111 can force instant unwinding.. but then you can't get back in there if the user wants another solution? Maybe something like the Prolog prompt "More?" could be displayed there, to trigger either continue or "super-break". - How could you suspect me of calling Tcl "poor" at anything? I advocate that anything is possible in Tcl - just that some tasks need more hard work than others... and I of course prefer tasks where little effort brings great effects ;-) But here's a little contribution for the repeated task of breaking a string into characters and substitute each character by the value of a same-named variable:

 proc varify word {uplevel 1 [list subst $[join [split $word ""] $]]}
 set f 1
 set o 2
 varify foo => 122

DNK: It's definitely possible to generate the code automatically as you say, because you made me curious enough to implement that in Lisp. Unfortunately it's two pages of mutually-recursive functions (each individual function, mercifully, is quite short), and I'm not sure how to express the kinds of code transformation they do in TCL. Of course, I've no doubt it can be done! ... As to the question of how to get back in and ask for the next value, clearly the thing to do is to pass the solver the TCL equivalent of a lexical closure, which can print out the solution if desired, process it, and decide whether to continue to the next solution. Again, I know that it's possible to simulate lexical closures in TCL, but I'm too new to it to know how...

KBK If you're interested in taking it further, have a taste of Hot curry.


Program 5

KBK: The recursive solution for '8-queens', of course, avoids the nesting depths. But clearly, we are pruning the search tree; the cryptarithm solution (which has the most awkward nesting) cannot be examining all 10! possible letter-digit mappings in its run time. In fact, the 'let' and 'if' statements in the nest both are there to prune the tree.

I agree that the notation is awkward, with its excessive nesting of braces. That's quite easily fixed, though, with a "little language" adapted for this type of problem. Let's make a compiler for this "little language:"

 proc Choose { v } {
     upvar 1 script script
     upvar 1 depth depth
     set v2 v$depth
     set v1 v[incr depth]
     set script "choose $v \$$v1 $v2 [list $script]"
 }
 
 proc Let { v expr } {
     upvar 1 script script
     upvar 1 depth depth
     set v2 v$depth
     set v1 v[incr depth]
     set script "let $v \[[list expr $expr]\] \$$v1 $v2 [list $script]"
 }
 
 proc Restrict { expr } {
     upvar 1 script script
     set script [list if $expr $script]
 }
 
 proc Compute { script2 } {
     upvar 1 script script
     set script $script2\;$script
 }
 
 proc Generate { decls } {
     set l [split $decls \n]
     set script {}
     set depth 0
     for { set i [llength $l] } { $i >= 0 } { incr i -1 } {
         eval [lindex $l $i]
     }
     return [list $depth $script]
 }
 
 proc Run { decls initialChoices } {
     foreach {depth script} [Generate $decls] {}
     set v$depth $initialChoices
     eval $script
 }

Program 6

and then a cryptarithm can be expressed in a much prettier notation. For example, let's find the four solutions to SAVE+MORE=MONEY:

 Run { 
     Choose     e
     Compute            { set sum1 [expr { $e + $e }] }
     Let        y       { $sum1 % 10 }
     Compute            { set cy1 [expr { $sum1 / 10 }] }
     Choose     v
     Choose     r
     Compute            { set sum10 [expr { $v + $r + $cy1 }] }
     Restrict           { $sum10 % 10 == $e }
     Compute            { set cy10 [expr { $sum10 / 10 }] }
     Choose     a
     Choose     o
     Compute            { set sum100 [expr { $a + $o + $cy10 }] }
     Let        n       { $sum100 % 10 }
     Compute            { set cy100 [expr { $sum100 / 10 }] }
     Choose     s
     Restrict           { $s != 0 }
     Choose     m
     Restrict           { $m != 0 }
     Compute            { set save [varify save] }
     Compute            { set more [varify more] }
     Compute            { set money [varify money] }
     Restrict           { $save + $more == $money }
     Compute            { puts "save + more = money: $save + $more = $money" }
 } { 0 1 2 3 4 5 6 7 8 9 }

The script gives the output:

 save + more = money: 9376 + 1086 = 10462
 save + more = money: 9476 + 1086 = 10562
 save + more = money: 9386 + 1076 = 10462
 save + more = money: 9486 + 1076 = 10562

Program 7

Note that the obvious observation that m==1 makes the script run about six times faster:

 Run { 
     Let        m        1
     Choose     e
     Compute            { set sum1 [expr { $e + $e }] }
     Let        y       { $sum1 % 10 }
     Compute            { set cy1 [expr { $sum1 / 10 }] }
     Choose     v
     Choose     r
     Compute            { set sum10 [expr { $v + $r + $cy1 }] }
     Restrict           { $sum10 % 10 == $e }
     Compute            { set cy10 [expr { $sum10 / 10 }] }
     Choose     a
     Choose     o
     Compute            { set sum100 [expr { $a + $o + $cy10 }] }
     Let        n       { $sum100 % 10 }
     Compute            { set cy100 [expr { $sum100 / 10 }] }
     Choose     s
     Restrict           { $s != 0 }
     Compute            { set save [varify save] }
     Compute            { set more [varify more] }
     Compute            { set money [varify money] }
     Restrict           { $save + $more == $money }
     Compute            { puts "save + more = money: $save + $more = $money" }
 } { 0 1 2 3 4 5 6 7 8 9 }

More

KBK 2002-10-10:

The Pythoneers have been about cryptarithms, too, witness the thread in http://groups.google.com/groups?frame=left&th=6ae6e807b808c75d

The problem posed there is the multiplication

   xAB
    CD
 -----
 EFGHJ

with the letters representing the ten decimal digits, and x fixed to be 7.

The following does the problem, including the obvious observations that neither C nor E can be zero, and that J must be the least significant digit of B*D. Interestingly enough, it seems to be at least as fast as any solution that the Pythoneers offered:

 puts [time {
  Run {
    Let         x       7
    Choose      c
    Restrict            { $c != 0 }
    Choose      b
    Choose      d
    Let         j       { ( $b * $d ) % 10 }
    Choose      a
    Compute             { set multiplicand [expr {100 * $x + 10 * $a + $b}] }
    Compute             { set multiplier [expr {10 * $c + $d}] }
    Compute             { set product [expr { $multiplicand * $multiplier }] }
    Let         e       { $product / 10000 }
    Restrict            { $e != 0 }
    Compute             { set fghj [expr { $product - 10000 * $e }] }
    Let         f       { $fghj / 1000 }
    Compute             { set ghj [expr { $fghj - 1000 * $f }] }
    Let         g       { $ghj / 100 }
    Compute             { set hj [expr { $ghj - 100 * $g }] }
    Let         h       { $hj / 10 }
    Compute             { puts "$multiplicand * $multiplier = $product" }
  } { 0 1 2 3 4 5 6 7 8 9 }
 }]
 715 * 46 = 32890
 322169 microseconds per iteration

Just for the heck of it, I added some code (not shown) to instrument how many times the code generated by each statement in the script above is executed. It shows clearly how [Choose] multiplies the possible choices while [Let] and [Restrict] down-select from them.

   Count    Statement
  -----------------------------------------------------------------------------------------
      1            Let         x       7
               (There's one choice for x)
      1            Choose      c 
               (There are nine choices for c)
      9            Restrict            { $c != 0 }
               (But one of them is zero, which leaves 8)
      8            Choose      b 
               (Once c is chosen there are eight choices for b,
               making 64 combinations)
     64            Choose      d 
               (And once c and b are chosen, there are seven choices
               for j, making 448 combinations)
    448            Let         j       { ( $b * $d ) % 10 }
               (In all but 158 of the 448 combinations, j is equal
               to x, b, c, or d and is eliminated.
    158            Choose      a 
               (In each of the 158 combinations, there are five
               choices for a, so the next code executes 790 times)
    790            Compute             { set multiplicand [expr {100 * $x + 10 * $a + $b}] }
    790            Compute             { set multiplier [expr {10 * $c + $d}] }
    790            Compute             { set product [expr { $multiplicand * $multiplier }] }
    790            Let         e       { $product / 10000 }
               (But in all but 286 of the resulting combinations,
               e duplicates one of the letters already chosen)
    286            Restrict            { $e != 0 }
               (And e is zero in 16 more)
    260            Compute             { set fghj [expr { $product - 10000 * $e }] }
    260            Let         f       { $fghj / 1000 }
               (f turns out to be distinct from the other digits in
               only 81 of the remaining 260 configurations)
     81            Compute             { set ghj [expr { $fghj - 1000 * $f }] }
     81            Let         g       { $ghj / 100 }
               (All but 16 are ruled out by the requirement that
               g be distinct from the other digits)
     16            Compute             { set hj [expr { $ghj - 100 * $g }] }
     16            Let         h       { $hj / 10 }
               (And then the requirement that h be distinct from all
               the other digits gives a unique solution.)
      1            Compute             { puts "$multiplicand * $multiplier = $product" }

Etc

See Brute force with velvet gloves for different approaches that run possibly much longer, but are simpler to write.


KBK 2005-09-03: I realized that the "little language" so far presented can get more than an order of magnitude performance gain by inlining the [choose] and [let] functions. The changes to the compiler are pretty simple: the [Choose] and [Let] procedures are replaced with:

 proc Choose { digitVar } {
    upvar 1 script script
    upvar 1 depth depth
    set remainingVar _v$depth
    set choicesVar _v[incr depth]
    set i _i$depth
    # choose $digitVar $choicesVar $remainingVar script
    set script [string map [list %i $i \
                                %choicesVar $choicesVar \
                                %remainingVar $remainingVar \
                                %digitVar $digitVar \
                                %script $script] {
        set %i -1
        foreach %digitVar $%choicesVar {
            incr %i
            set %remainingVar [lreplace $%choicesVar $%i $%i]
            %script
        }
    }]
 }

 proc Let { digitVar expr } {
    upvar 1 script script
    upvar 1 depth depth
    set remainingVar _v$depth
    set choiceVar _v[incr depth]
    set script [string map [list \
                                %digitVar $digitVar \
                                %choiceVar $choiceVar \
                                %expr [list $expr] \
                                %i _i$depth \
                                %remainingVar $remainingVar \
                                %script $script \
                                %valueVar _value$depth \
    ] {
        set %valueVar [expr %expr]
        set %i [lsearch -exact $%choiceVar $%valueVar]
        if { $%i >= 0 } {
            set %digitVar $%valueVar
            set %remainingVar [lreplace $%choiceVar $%i $%i]
            %script
        }
    }]
 }

and, on a not-particularly-fast laptop (1 GHz Pentium-M), the correct answer is obtained in less than 9 milliseconds — certainly faster than the Pythoneers.