Version 5 of ::tcl::unsupported::tailcall

Updated 2008-07-14 12:18:50 by nem

Experimental command provided by NRE, available in HEAD and onwards from the release of Tcl8.6a2.

The command provides proper tailcalls, with the following semantics:

  • a proc/lambda invokes [tailcall foo [bar] $soom]
  • bar is invoked and the value of soom fetched, as if [list foo [bar] $soom] had been called
  • a command named foo is looked up in the current context
  • the current proc/lambda replaces itself in the call stack with a call to the just found command

That is: tailcall foo [bar] $soom is very similar to return [uplevel 1 [list foo [bar] $soom]]] with two exceptions:

  1. foo is looked up in the current context, not in the caller's
  2. the stack frame is really gone, not just virtually. This has positive effects on memory, and a possibly confusing effect on stack traces.

NEM Many thanks to MS for his hard work making this a reality! For some background, see Tail call optimization.


NEM As a test/demo of how to use this facility, here is a simple benchmark using the factorial function:

package require Tcl 8.6a1
namespace import ::tcl::mathop::*
interp alias {} tailcall {} tcl::unsupported::tailcall

# Naive recursive factorial function
proc fac n {
    if {$n <= 1} {
        return 1
    } else {
        * $n [fac [- $n 1]]
    }
}

# Tail-recursive factorial
proc fac-tr {n {k 1}} {
    if {$n <= 1} {
        return $k
    } else {
        tailcall fac-tr [- $n 1] [* $n $k]
    }
}

# Iterative factorial
proc fac-i n {
    for {set k 1} {$n > 1} {incr n -1} {
        set k [expr {$n*$k}]
    }
    return $k
}

proc test {} {
    set fmt "%-10s ..%-12.12s %s"
    puts [format $fmt "Implementation" "Result" "Time"]
    foreach n {1 5 10 100 500 1000 2500 5000 10000} {
        puts "\nfac $n:"
        foreach impl {fac fac-i fac-tr} {
            if {[catch { $impl $n } result]} { 
                set result "n/a"
                set time "n/a"
            } else {
                set time [time [list $impl $n] 10]
            }
            puts [format $fmt $impl $result $time]
        }
    }
}
test

Putting this in a table, we get (timings taken on Windows XP: dual processor, dual core 2.66GHz machine):

N fac Time fac-i Time fac-tr Time
1 0.0 0.0 0.0
5 0.0 0.0 0.0
10 0.0 0.0 0.0
100 0.0 0.0 1600.0
500 3100.0 3100.0 6200.0
1000 n/a 11000.0 17200.0
5000 n/a 48500.0 64000.0
10000 n/a 661000.0 n/a

Timings are very rough (Windows issue, I think); I'll rerun on a unix at some point. As we can see, the tail-recursive version is slightly slower than the iterative version, and unlike the naive version, manages to not blow the stack. However, at N=10000 it does abort trying to realloc 12944 bytes. Maybe a memory leak?