Richard Suchenwirth 2002-12-01 - "Keep It Simple Sir!" This saying is often used and less often heeded. Its history may go back to William of Occam's razor and beyond: "If two things do the same job, simpler is better." (More precisely, "entia non sunt multiplicanda praeter necessitatem - (use) no more things than necessary"). Tcl is simpler than many other languages, but Tcl code can be more or less simple. I'm pondering the following rules of thumb:
Re-reading Arbitrary precision math procedures prompted me to bring forth this sermon. The multiply improved code on that page certainly does a good job, but appeared to me less simple than I'd have liked. So I started rewriting it, and mostly found it could be done with about half the LOC and variables. One thing struck me: in almost each rewritten proc I found myself coding like
if {$x == ""} {set x 0}
Simple enough, but still... I factored that operation into this fancy-named one-liner (which is minimal LOC):}
proc empty=0 x {expr {$x == ""? 0: $x}}
which does not reset a variable, but in all cases I was interested in the value, and only once. Now for the revised code (it's about addition and multiplication of non-negative integers, that may be bigger than expr could stand). To turn such a "bigInt" into a "little-endian" list of chunks at base 10000 (i.e. max. 4 digits), I wrote:
proc tolist bigint { set res {} while {[string length $bigint]} { lappend res [scan [string range $bigint end-3 end] %d] set bigint [string range $bigint 0 end-4] } set res }
Seems to work well:
% tolist 1234567890001002003004005006007008009 8009 700 60 4005 300 20 1 6789 2345 1
and uses 8 LOC and one local variable, instead of 29 LOC and 7 local variables in the original page. In the opposite direction, such chunk lists have to be formatted back to "big-endian" digit strings:
proc fromlist chunks { set res "" foreach chunk $chunks { set res [format %4.4d $chunk]$res ;# prepend at left } empty=0 [string trimleft $res 0] ;# [scan] might choke on big ints }
7 LOC vs. 12, and apparently less buggy for leading zeroes (the original checked only for chunk==0). Needless to say which coding style I prefer. Now for the addition of two chunk lists:
proc adda {xs ys} { set res {} set carry 0 foreach x $xs y $ys { set sum [expr {[empty=0 $x] + [empty=0 $y] + $carry}] set carry [expr {$sum / 10000}] lappend res [expr {$sum % 10000}] } if $carry {lappend res $carry} else {set res} }
The last line is debatably tricky and could have been done as
if $carry {lappend res $carry} set res
but uses the facts that if returns the result of the branch that succeeded, while lappend returns the resulting list. Anyway, 10 LOC compared to 24 originally, and much less local variables.
Finally, multiplication. DKFs original solution did it with repeated shifted addition; JJS's enhanced code (base-10000 chunks) split it into single-chunk multiplication (multi) and a wrapping loop multa. I follow that model, though again simplified:
proc multa {xs ys} { set res {} foreach x $xs { set res [adda $res [multi $ys $x]] set ys [linsert $ys 0 [set ys 0]] ;# K-less K ;-) } set res } proc multi {xs chunk} { set res {} set carry 0 foreach x $xs { set n [expr {wide($x) * $chunk + $carry}] set carry [expr {$n/10000}] lappend res [expr {$n % 10000}] } if $carry {lappend res $carry} else {set res} } if 0 {And now for the factorial demo:} proc facta n { if {$n<2} {return 1} else { fromlist [multa $n [tolist [facta [incr n -1]]]] } }
Brevity in code is achieved here by the classic recursive form, which is more expensive in runtime compared to the equally possible iteration, but easier on the human eye. The break-even point between simple and fast code is however coming our way, as CPUs get faster every year.. facta 400 took 20 seconds on my 200MHz W95 box to produce the voluminous integer (line breaks added)
64034522846623895262347970319503005850702583026002959458684 44594280239716918683143627847864746326467629435057503585681 08482981628835174352289619886468029979373416541508381624264 61942352307046244325015114448670890662773914918117331955996 44070954967134529047702032243491121079759328079510154537266 72516278778900093497637657103263503315339653498683868313393 52024373788157786791506311858702618270169819740062983025308 59129834616227230455833952075961150530223608681043329725519 48526744322324386699484224042325998055516106359423769613992 31917134063858996537970147827206606320217379472010321356624 61380907794230459736069956759583609615871512991382228657857 95493616176544804532220078258184008484364155912294542753848 03558374518022675900061399560145595206127211192918105032491 00800000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000
and I had to increase the recursion limit to 2000 for this, but on all checks it is equal to the result reported in the Ruby User's Guide which I took as reference, so "it must be right", as it says there ;-) Here's the iterative version:
proc facta'it n { for {set i 1; set this 1} {$i <= $n} {incr i} { set this [multa $this [set this $i]] } fromlist $this }
Funny, and against my expectation, that the iterative version is slower than the recursive one:
% time {facta 200} 5055078 microseconds per iteration % time {facta'it 200} 57408667 microseconds per iteration
The recursive version also suffers from repeated conversions from and to lists. Hence, a second try that uses slightly more code in the hope for less runtime:
proc facta2 n {fromlist [facta' [tolist $n]]} proc facta' n { if {$n<2} {return 1} else {multa $n [facta' [incr n -1]]} }
That was definitely worth the effort:
% time {facta2 200} 1652267 microseconds per iteration
See also Big integer routines.