KISS

Richard Suchenwirth 2002-12-01 - "Keep It Simple Sir!" This saying is often used and less often heeded. Its history may go back to William of Occam's razor and beyond: "If two things do the same job, simpler is better." (More precisely, "entia non sunt multiplicanda praeter necessitatem - (use) no more things than necessary"). Tcl is simpler than many other languages, but Tcl code can be more or less simple. I'm pondering the following rules of thumb:

  • Try to use less lines of code. Shorter code is easier inspected by humans, and run by Tcl interpreters.
  • Try to use less variables. Variables make sense when retrieved more than once, but again, less degrees of freedom allow less bugs.
  • Reinventing wheels is fun in Tcl, even more if you re-use existing functionality (spokes, hubs, ... ;-)
  • Try to depend on less. For instance, global variables have their uses, but doing without is often better.
  • Try to factor out common code. When snippets look suspiciously as if done by copy'n'paste, think on how it might be generalized in one place.
  • To sum up: Tcl'ing is fun. Less Tcl'ing may be more fun.

Re-reading Arbitrary precision math procedures prompted me to bring forth this sermon. The multiply improved code on that page certainly does a good job, but appeared to me less simple than I'd have liked. So I started rewriting it, and mostly found it could be done with about half the LOC and variables. One thing struck me: in almost each rewritten proc I found myself coding like

 if {$x == ""} {set x 0}

Simple enough, but still... I factored that operation into this fancy-named one-liner (which is minimal LOC):}

 proc empty=0 x {expr {$x == ""? 0: $x}}

which does not reset a variable, but in all cases I was interested in the value, and only once. Now for the revised code (it's about addition and multiplication of non-negative integers, that may be bigger than expr could stand). To turn such a "bigInt" into a "little-endian" list of chunks at base 10000 (i.e. max. 4 digits), I wrote:

 proc tolist bigint {
    set res {}
    while {[string length $bigint]} {
        lappend res [scan [string range $bigint end-3 end] %d]
        set bigint [string range $bigint 0 end-4]
    }
    set res
 }

Seems to work well:

 % tolist 1234567890001002003004005006007008009
 8009 700 60 4005 300 20 1 6789 2345 1

and uses 8 LOC and one local variable, instead of 29 LOC and 7 local variables in the original page. In the opposite direction, such chunk lists have to be formatted back to "big-endian" digit strings:

 proc fromlist chunks {
    set res ""
    foreach chunk $chunks {
        set res [format %4.4d $chunk]$res ;# prepend at left
    }
    empty=0 [string trimleft $res 0] ;# [scan] might choke on big ints
 }

7 LOC vs. 12, and apparently less buggy for leading zeroes (the original checked only for chunk==0). Needless to say which coding style I prefer. Now for the addition of two chunk lists:

 proc adda {xs ys} {
    set res {}
    set carry 0
    foreach x $xs y $ys {
        set sum [expr {[empty=0 $x] + [empty=0 $y] + $carry}]
        set carry [expr {$sum / 10000}]
        lappend res [expr {$sum % 10000}]
    }
    if $carry {lappend res $carry} else {set res}
 }

The last line is debatably tricky and could have been done as

 if $carry {lappend res $carry}
 set res

but uses the facts that if returns the result of the branch that succeeded, while lappend returns the resulting list. Anyway, 10 LOC compared to 24 originally, and much less local variables.

Finally, multiplication. DKFs original solution did it with repeated shifted addition; JJS's enhanced code (base-10000 chunks) split it into single-chunk multiplication (multi) and a wrapping loop multa. I follow that model, though again simplified:

 proc multa {xs ys} {
    set res {}
    foreach x $xs {
        set res [adda $res [multi $ys $x]]
        set ys [linsert $ys 0 [set ys 0]] ;# K-less K ;-)
    }
    set res
 }
 proc multi {xs chunk} {
    set res {}
    set carry 0
    foreach x $xs {
        set n [expr {wide($x) * $chunk + $carry}]
        set carry [expr {$n/10000}]
        lappend res [expr {$n % 10000}]
    }
    if $carry {lappend res $carry} else {set res}
 }
if 0 {And now for the factorial demo:}
 proc facta n {
    if {$n<2} {return 1} else {
     fromlist [multa $n [tolist [facta [incr n -1]]]]
    }
 }

Brevity in code is achieved here by the classic recursive form, which is more expensive in runtime compared to the equally possible iteration, but easier on the human eye. The break-even point between simple and fast code is however coming our way, as CPUs get faster every year.. facta 400 took 20 seconds on my 200MHz W95 box to produce the voluminous integer (line breaks added)

 64034522846623895262347970319503005850702583026002959458684
 44594280239716918683143627847864746326467629435057503585681
 08482981628835174352289619886468029979373416541508381624264
 61942352307046244325015114448670890662773914918117331955996
 44070954967134529047702032243491121079759328079510154537266
 72516278778900093497637657103263503315339653498683868313393
 52024373788157786791506311858702618270169819740062983025308
 59129834616227230455833952075961150530223608681043329725519
 48526744322324386699484224042325998055516106359423769613992
 31917134063858996537970147827206606320217379472010321356624
 61380907794230459736069956759583609615871512991382228657857
 95493616176544804532220078258184008484364155912294542753848
 03558374518022675900061399560145595206127211192918105032491
 00800000000000000000000000000000000000000000000000000000000
 0000000000000000000000000000000000000000000

and I had to increase the recursion limit to 2000 for this, but on all checks it is equal to the result reported in the Ruby User's Guide which I took as reference, so "it must be right", as it says there ;-) Here's the iterative version:

 proc facta'it n {
    for {set i 1; set this 1} {$i <= $n} {incr i} {
        set this [multa $this [set this $i]]
    }
    fromlist $this
 }

Funny, and against my expectation, that the iterative version is slower than the recursive one:

 % time {facta 200}
 5055078 microseconds per iteration
 % time {facta'it 200}
 57408667 microseconds per iteration

The recursive version also suffers from repeated conversions from and to lists. Hence, a second try that uses slightly more code in the hope for less runtime:

 proc facta2 n {fromlist [facta' [tolist $n]]}
 proc facta' n {
    if {$n<2} {return 1} else {multa $n [facta' [incr n -1]]}
 }

That was definitely worth the effort:

  % time {facta2 200}
 1652267 microseconds per iteration

See also Big integer routines.