KBK (25 April 2002)

[Have you come to this page by accident when you were really interested in Solving cryptograms instead?]

In Playing Prolog and Playing predicate logic, RS implies that Tcl is poor at doing backtracking searches. In fact, it isn't all that difficult to write code that does backtracking, and Tcl's upvar and uplevel are just the things to manage the needed control structures.

Let's work out an example. We wish to solve a cryptarithm (an arithmetic sum where the digits have been replaced with letters). Examples include:

S E N D Y E L L O W + M O R E Y E L L O W ------------ and + R E D M O N E Y --------------- O R A N G E

The usual rules for these problems include:

- Each letter represents a different digit.
- No digit is represented by more than one letter.
- Leading digits (S and M in the first problem; Y, R, and O in the second) are nonzero.
- The usual rules of arithmetic are followed.

When solving these problems by computer, it's usually easiest to guess the less significant digits first.

# We begin with a procedure that is used to make an arbitrary choice for a given digit (stored in the variable *digitVar* in caller's scope). The possible choices are given in *choices,* and the ones not chosen are stored in the variable *remainingVar* in caller's scope. For each possible choice, the script *script* is evaluated in the caller's scope.

proc choose { digitVar choices remainingVar script } { upvar 1 $digitVar digit upvar 1 $remainingVar remaining set i -1 foreach digit $choices { incr i set remaining [lreplace $choices $i $i] uplevel 1 $script } }

# Very often, part way through a problem, we know what the value of a given letter must be. The next procedure assigns to the variable *digitVar* in the caller's scope the known value *value.* The value must be present in the list *choices,* and the remaining choices get stored in the variable *remainingVar* in the caller's scope. If the value is present in the list, *script* is evaluated in caller's scope; otherwise, nothing further happens.

proc let { digitVar value choices remainingVar script } { upvar 1 $digitVar digit upvar 1 $remainingVar remaining set i [lsearch -exact $choices $value] if { $i >= 0 } { set digit $value set remaining [lreplace $choices $i $i] uplevel 1 $script } }

# Someone writes: I AM VERY STUPID. While I don't understand why that person would want to advertise the fact, I'm leaving it in, much like Johnson for President.

# OK, let's turn these procedures loose on the cryptarithm YELLOW + YELLOW + RED = ORANGE

puts [time { # Choose W and D arbitrarily choose w { 0 1 2 3 4 5 6 7 8 9 } r1 { choose d $r1 r2 { set sum1 [expr { $w + $w + $d }] # W and D determine E let e [expr { $sum1 % 10 }] $r2 r3 { set cy1 [expr { $sum1 / 10 }] # Choose O arbitrarily choose o $r3 r4 { # O must be non-zero because it's a leading digit if { $o != 0 } { set sum10 [expr { $o + $o + $e + $cy1 }] # The digits chosen so far determine G let g [expr { $sum10 % 10 }] $r4 r5 { set cy10 [expr { $sum10 / 10 }] # L and R are now chosen arbitrarily choose l $r5 r6 { choose r $r6 r7 { # R must not be zero because it's a leading digit if { $r != 0 } { set sum100 [expr { $l + $l + $r + $cy10 }] # N is chosen arbitrarily let n [expr { $sum100 % 10 }] $r7 r8 { set cy100 [expr { $sum100 / 10 }] set sum1000 [expr { $l + $l + $cy100 }] # A is now determined by the other digits let a [expr { $sum1000 % 10 }] $r8 r9 { # Y is the digit that remains choose y $r9 r10 { # Y is a leading digit and may not be zero if { $y != 0 } { # At this point, it's easiest simply to # check the arithmetic using [expr] set yellow $y$e$l$l$o$w set red $r$e$d set orange $o$r$a$n$g$e if { $yellow + $yellow + $red == $orange } { puts [list yellow $yellow red $red orange $orange] } } } } } } } } } } } } } } }]

# The solution of SEND + MORE = MONEY is similar:

puts [time { choose d { 0 1 2 3 4 5 6 7 8 9 } r1 { choose e $r1 r2 { set sum1 [expr { $d + $e }] let y [expr { $sum1 % 10 }] $r2 r3 { set cy1 [expr { $sum1 / 10 }] choose n $r3 r4 { choose r $r4 r5 { set sum10 [expr { $n + $r + $cy1 }] if { $sum10 % 10 == $e } { set cy10 [expr { $sum10 / 10 }] choose o $r5 r6 { set sum100 [expr { $e + $o + $cy10 }] if { $sum100 % 10 == $n } { set cy100 [expr { $sum100 / 10 }] choose s $r6 r7 { if { $s != 0 } { choose m $r7 r8 { if { $m != 0 } { set send $s$e$n$d set more $m$o$r$e set money $m$o$n$e$y if { $send + $more == $money } { puts [list send $send more $more money $money] } } } } } } } } } } } } } }]

And, on a not-terribly fast Windows laptop in Tcl 8.3.4, we get the correct answers:

yellow 143329 red 846 orange 287504 2464000 microseconds per iteration send 9567 more 1085 money 10652 1542000 microseconds per iteration

Curiously enough, the same framework solves the eight queens problem. Even without resorting to the obvious recursive strategy, it's simple enough to code:

proc testDiagonal {x queens_array} { upvar 1 $queens_array queens for {set i 0} {$i < $x} {incr i} { if {abs($queens($i) - $queens($x)) == $x - $i} { return false } } return true } choose q(0) { 0 1 2 3 4 5 6 7 } r1 { choose q(1) $r1 r2 { if { [testDiagonal 1 q] } { choose q(2) $r2 r3 { if { [testDiagonal 2 q] } { choose q(3) $r3 r4 { if { [testDiagonal 3 q] } { choose q(4) $r4 r5 { if { [testDiagonal 4 q] } { choose q(5) $r5 r6 { if { [testDiagonal 5 q] } { choose q(6) $r6 r7 { if { [testDiagonal 6 q] } { choose q(7) $r7 r8 { if { [testDiagonal 7 q] } { set result {} for { set i 0 } { $i < 8 } { incr i } { lappend result [list $i $q($i)] } puts $result } } } } } } } } } } } } } } }

and enumerates the ninety-two possible solutions in a couple of seconds.

Of course, for a well-structured problem like this, anyone sensible would use a recursive search, which can be based on the same framework:

proc 8queens {{n 0} {choices {0 1 2 3 4 5 6 7}} {qArray {}}} { if { [string compare {} $qArray] } { upvar 1 $qArray q } set nextN [expr { $n + 1 }] choose q($n) $choices remainder { if [testDiagonal $n q] { if { $nextN < 8 } { 8queens $nextN $remainder q } else { set result {} for { set i 0 } { $i < 8 } { incr i } { lappend result [list $i $q($i)] } puts $result } } } } 8queens

and of course gives the same answer.

AMG: I added an implementation of [testDiagonal].

There is a lot of related work on Control structures for backtracking search.

RS: I'm amazed at the nesting depths.. Looks like such code could also be generated from the problem description. But is this backtracking in the sense of Prolog (where "cuts" can reduce the search tree), or rather depth-first searching of all possible pathes? I think the choice of "first solution" vs. "all solutions" can be made in the deepest brace nest, where an exit, error or return -code 111 can force instant unwinding.. but then you can't get back in there if the user wants another solution? Maybe something like the Prolog prompt "More?" could be displayed there, to trigger either continue or "super-break". - How could you suspect me of calling Tcl "poor" at anything? I advocate that anything is possible in Tcl - just that some tasks need more hard work than others... and I of course prefer tasks where little effort brings great effects ;-) But here's a little contribution for the repeated task of breaking a string into characters and substitute each character by the value of a same-named variable:

proc varify word {uplevel 1 [list subst $[join [split $word ""] $]]} set f 1 set o 2 varify foo => 122

DNK: It's definitely possible to generate the code automatically as you say, because you made me curious enough to implement that in Lisp. Unfortunately it's two pages of mutually-recursive functions (each individual function, mercifully, is quite short), and I'm not sure how to express the kinds of code transformation they do in TCL. Of course, I've no doubt it can be done! ... As to the question of how to get back in and ask for the next value, clearly the thing to do is to pass the solver the TCL equivalent of a lexical closure, which can print out the solution if desired, process it, and decide whether to continue to the next solution. Again, I know that it's possible to simulate lexical closures in TCL, but I'm too new to it to know how...

KBK If you're interested in taking it further, have a taste of Hot curry.

KBK: The recursive solution for '8-queens', of course, avoids the nesting depths. But clearly, we are pruning the search tree; the cryptarithm solution (which has the most awkward nesting) cannot be examining all 10! possible letter-digit mappings in its run time. In fact, the 'let' and 'if' statements in the nest both are there to prune the tree.

I agree that the notation is awkward, with its excessive nesting of braces. That's quite easily fixed, though, with a "little language" adapted for this type of problem. Let's make a compiler for this "little language:"

proc Choose { v } { upvar 1 script script upvar 1 depth depth set v2 v$depth set v1 v[incr depth] set script "choose $v \$$v1 $v2 [list $script]" } proc Let { v expr } { upvar 1 script script upvar 1 depth depth set v2 v$depth set v1 v[incr depth] set script "let $v \[[list expr $expr]\] \$$v1 $v2 [list $script]" } proc Restrict { expr } { upvar 1 script script set script [list if $expr $script] } proc Compute { script2 } { upvar 1 script script set script $script2\;$script } proc Generate { decls } { set l [split $decls \n] set script {} set depth 0 for { set i [llength $l] } { $i >= 0 } { incr i -1 } { eval [lindex $l $i] } return [list $depth $script] } proc Run { decls initialChoices } { foreach {depth script} [Generate $decls] {} set v$depth $initialChoices eval $script }

and then a cryptarithm can be expressed in a much prettier notation. For example, let's find the four solutions to SAVE+MORE=MONEY:

Run { Choose e Compute { set sum1 [expr { $e + $e }] } Let y { $sum1 % 10 } Compute { set cy1 [expr { $sum1 / 10 }] } Choose v Choose r Compute { set sum10 [expr { $v + $r + $cy1 }] } Restrict { $sum10 % 10 == $e } Compute { set cy10 [expr { $sum10 / 10 }] } Choose a Choose o Compute { set sum100 [expr { $a + $o + $cy10 }] } Let n { $sum100 % 10 } Compute { set cy100 [expr { $sum100 / 10 }] } Choose s Restrict { $s != 0 } Choose m Restrict { $m != 0 } Compute { set save [varify save] } Compute { set more [varify more] } Compute { set money [varify money] } Restrict { $save + $more == $money } Compute { puts "save + more = money: $save + $more = $money" } } { 0 1 2 3 4 5 6 7 8 9 }

The script gives the output:

save + more = money: 9376 + 1086 = 10462 save + more = money: 9476 + 1086 = 10562 save + more = money: 9386 + 1076 = 10462 save + more = money: 9486 + 1076 = 10562

Note that the obvious observation that m==1 makes the script run about six times faster:

Run { Let m 1 Choose e Compute { set sum1 [expr { $e + $e }] } Let y { $sum1 % 10 } Compute { set cy1 [expr { $sum1 / 10 }] } Choose v Choose r Compute { set sum10 [expr { $v + $r + $cy1 }] } Restrict { $sum10 % 10 == $e } Compute { set cy10 [expr { $sum10 / 10 }] } Choose a Choose o Compute { set sum100 [expr { $a + $o + $cy10 }] } Let n { $sum100 % 10 } Compute { set cy100 [expr { $sum100 / 10 }] } Choose s Restrict { $s != 0 } Compute { set save [varify save] } Compute { set more [varify more] } Compute { set money [varify money] } Restrict { $save + $more == $money } Compute { puts "save + more = money: $save + $more = $money" } } { 0 1 2 3 4 5 6 7 8 9 }

KBK 2002-10-10:

The Pythoneers have been about cryptarithms, too, witness the thread in http://groups.google.com/groups?frame=left&th=6ae6e807b808c75d

The problem posed there is the multiplication

xAB CD ----- EFGHJ

with the letters representing the ten decimal digits, and x fixed to be 7.

The following does the problem, including the obvious observations that neither C nor E can be zero, and that J must be the least significant digit of B*D. Interestingly enough, it seems to be at least as fast as any solution that the Pythoneers offered:

puts [time { Run { Let x 7 Choose c Restrict { $c != 0 } Choose b Choose d Let j { ( $b * $d ) % 10 } Choose a Compute { set multiplicand [expr {100 * $x + 10 * $a + $b}] } Compute { set multiplier [expr {10 * $c + $d}] } Compute { set product [expr { $multiplicand * $multiplier }] } Let e { $product / 10000 } Restrict { $e != 0 } Compute { set fghj [expr { $product - 10000 * $e }] } Let f { $fghj / 1000 } Compute { set ghj [expr { $fghj - 1000 * $f }] } Let g { $ghj / 100 } Compute { set hj [expr { $ghj - 100 * $g }] } Let h { $hj / 10 } Compute { puts "$multiplicand * $multiplier = $product" } } { 0 1 2 3 4 5 6 7 8 9 } }]

715 * 46 = 32890 322169 microseconds per iteration

Just for the heck of it, I added some code (not shown) to instrument how many times the code generated by each statement in the script above is executed. It shows clearly how [Choose] multiplies the possible choices while [Let] and [Restrict] down-select from them.

Count Statement ----------------------------------------------------------------------------------------- 1 Let x 7 (There's one choice for x) 1 Choose c (There are nine choices for c) 9 Restrict { $c != 0 } (But one of them is zero, which leaves 8) 8 Choose b (Once c is chosen there are eight choices for b, making 64 combinations) 64 Choose d (And once c and b are chosen, there are seven choices for j, making 448 combinations) 448 Let j { ( $b * $d ) % 10 } (In all but 158 of the 448 combinations, j is equal to x, b, c, or d and is eliminated. 158 Choose a (In each of the 158 combinations, there are five choices for a, so the next code executes 790 times) 790 Compute { set multiplicand [expr {100 * $x + 10 * $a + $b}] } 790 Compute { set multiplier [expr {10 * $c + $d}] } 790 Compute { set product [expr { $multiplicand * $multiplier }] } 790 Let e { $product / 10000 } (But in all but 286 of the resulting combinations, e duplicates one of the letters already chosen) 286 Restrict { $e != 0 } (And e is zero in 16 more) 260 Compute { set fghj [expr { $product - 10000 * $e }] } 260 Let f { $fghj / 1000 } (f turns out to be distinct from the other digits in only 81 of the remaining 260 configurations) 81 Compute { set ghj [expr { $fghj - 1000 * $f }] } 81 Let g { $ghj / 100 } (All but 16 are ruled out by the requirement that g be distinct from the other digits) 16 Compute { set hj [expr { $ghj - 100 * $g }] } 16 Let h { $hj / 10 } (And then the requirement that h be distinct from all the other digits gives a unique solution.) 1 Compute { puts "$multiplicand * $multiplier = $product" }

See Brute force with velvet gloves for different approaches that run possibly much longer, but are simpler to write.

KBK 2005-09-03: I realized that the "little language" so far presented can get more than an order of magnitude performance gain by inlining the [choose] and [let] functions. The changes to the compiler are pretty simple: the [Choose] and [Let] procedures are replaced with:

proc Choose { digitVar } { upvar 1 script script upvar 1 depth depth set remainingVar _v$depth set choicesVar _v[incr depth] set i _i$depth # choose $digitVar $choicesVar $remainingVar script set script [string map [list %i $i \ %choicesVar $choicesVar \ %remainingVar $remainingVar \ %digitVar $digitVar \ %script $script] { set %i -1 foreach %digitVar $%choicesVar { incr %i set %remainingVar [lreplace $%choicesVar $%i $%i] %script } }] } proc Let { digitVar expr } { upvar 1 script script upvar 1 depth depth set remainingVar _v$depth set choiceVar _v[incr depth] set script [string map [list \ %digitVar $digitVar \ %choiceVar $choiceVar \ %expr [list $expr] \ %i _i$depth \ %remainingVar $remainingVar \ %script $script \ %valueVar _value$depth \ ] { set %valueVar [expr %expr] set %i [lsearch -exact $%choiceVar $%valueVar] if { $%i >= 0 } { set %digitVar $%valueVar set %remainingVar [lreplace $%choiceVar $%i $%i] %script } }] }

and, on a not-particularly-fast laptop (1 GHz Pentium-M), the correct answer is obtained in less than 9 milliseconds — certainly faster than the Pythoneers.