fori

MC, 31 Dec 2004: In a post on his blog[L1 ] discussing implementation and design issues of his Ramble game (written, naturally in Tcl/Tk), WHD proposes a fori command, whose signature looks like:

fori var from to script

Since any Tcl version is going to be slower than the built-in version of [for], Will writes: "I think something like fori would be a nice addition to Tcl...but it would have to be implemented as a C extension."

Larry Smith12/31/04: More than that - you'd need to bytecode it. for wins out every time because it is already bytecoded and, once compiled, can be run very cheaply, whereas fori will always involve a procedure call. I also hate the syntax, try: loop -var <var> -from <startval> -to <endval> <-do> script. You might see how fast it would be if it was a Sugar macro.

Lars H, 1 Jan 2005: Another take at this problem can be found in TIP #225 [L2 ]. The syntax that comes out of that is

foreach var [range from to+1 ] script

and thanks to some implementation magic, it is actually supposed to be faster than for. However the to+1 argument (i.e., you don't specify the end of the range, but the first thing not in the range) seems terribly odd (e.g. not the same as for lrange or string range). Perhaps a decided opinion against that detail can sway the proposer?

WHD, 1 Jan 2005: I actually tried using foreach and a "range" command, minus the implementation magic; it was no better than any other solution I came up with, and it has a major drawback for nested loops (though possibly this is mitigated by the implementation magic): if you use "range" in an inner loop, it's generating the list over which to iterate once for each iteration of the outer loop.

ECS, 3 Jan 2005: may I suggest seq as the command available on Unix? From the man page: 'seq [OPTION]... [FIRST [STEP]] LAST...'


MC12/31/04: I was curious how much slower a Tcl version would be... Here are a couple of different approaches I came up with this afternoon:

 proc fori1 {var m n script} {
     uplevel 1 for [list "set [list $var] $m"] [list "\$[list $var] <= $n"] [list "incr [list $var]"] [list $script]
 }

 proc fori2 {var m n script} {
     uplevel 1 for [list "set [list $var] $m"] [list "\[set [list $var]\] <= $n"] [list "incr [list $var]"] [list $script]
 }

 proc fori3 {var m n script} {
     uplevel 1 "set [list $var] [expr {$m - 1}]
         while {\[incr [list $var]\] <= {$n}} {
             $script
         }"
 }

 proc fori4 {var m n script} {
     uplevel 1 "if {1} {
         for {set [list $var] {$m}} {\$[list $var] <= {$n}} {incr [list $var]} {                      
             $script
         }
     }"
 }

 proc fori5 {var m n script} {
     uplevel 1 "if {1} [list "set [list $var] [expr {$m - 1}]
         while {\[incr [list $var]\] <= {$n}} {
             $script
         }"]"
 }

and now a bit of infrastructure to make testing/timing them easier:

 proc try {name proc limit {n_iterations 50}} {
     set time [time {set result [$proc $limit]} $n_iterations]
     return [format { #%-5s %-14s result == %8d, in %7d microseconds} \
         $name "(1 .. $limit):" $result [lindex $time 0]]
 }

 proc builtin {limit} {
     set sum 0
     for {set i 1} {$i <= $limit} {incr i} {
         incr sum $i
     }
     return $sum
 }

 for {set i 1} {$i <= 5} {incr i} {
     proc method$i {limit} [format {
         set sum 0
         fori%d i 1 $limit {incr sum $i}
         return $sum
     } $i]
 }

 foreach size {10 100 1000 10000} {
     if {$size != 10} then {puts ""} 
     foreach version {builtin method1 method2 method3 method4 method5} \
         name {for fori1 fori2 fori3 fori4 fori5} {
         # run once before we begin timing
         $version $size
         puts [try $name $version $size]
     }
 }

and now to see the timing results (these are from tclsh (8.4.7) on a 1.33GHz PowerBook running MacOS X 10.3 via wiki-reaper | tclsh):

 #for   (1 .. 10):     result ==       55, in      57 microseconds
 #fori1 (1 .. 10):     result ==       55, in     290 microseconds
 #fori2 (1 .. 10):     result ==       55, in     285 microseconds
 #fori3 (1 .. 10):     result ==       55, in     218 microseconds
 #fori4 (1 .. 10):     result ==       55, in     183 microseconds
 #fori5 (1 .. 10):     result ==       55, in     173 microseconds

 #for   (1 .. 100):    result ==     5050, in      86 microseconds
 #fori1 (1 .. 100):    result ==     5050, in    4183 microseconds
 #fori2 (1 .. 100):    result ==     5050, in    1373 microseconds
 #fori3 (1 .. 100):    result ==     5050, in    3808 microseconds
 #fori4 (1 .. 100):    result ==     5050, in     277 microseconds
 #fori5 (1 .. 100):    result ==     5050, in     219 microseconds

 #for   (1 .. 1000):   result ==   500500, in     358 microseconds
 #fori1 (1 .. 1000):   result ==   500500, in    7054 microseconds
 #fori2 (1 .. 1000):   result ==   500500, in    7087 microseconds
 #fori3 (1 .. 1000):   result ==   500500, in    5559 microseconds
 #fori4 (1 .. 1000):   result ==   500500, in    3664 microseconds
 #fori5 (1 .. 1000):   result ==   500500, in    1553 microseconds

 #for   (1 .. 10000):  result == 50005000, in    5034 microseconds
 #fori1 (1 .. 10000):  result == 50005000, in   54393 microseconds
 #fori2 (1 .. 10000):  result == 50005000, in   53805 microseconds
 #fori3 (1 .. 10000):  result == 50005000, in   39142 microseconds
 #fori4 (1 .. 10000):  result == 50005000, in   20296 microseconds
 #fori5 (1 .. 10000):  result == 50005000, in   15092 microseconds

Out of these fori5 looks the best overall, but it is still 2.5-5 times slower than for is; can we do any better?

DKF: Curious. Why is fori5 faster than fori4? (Note that fori5 is faster than TclX's loop command, but that shouldn't be too surprising given that that uses inefficient interfaces for a number of things.)