Tail call optimization

tail call optimization is an explanation of the concept and also a discussion of its application to Tcl, much of which occurred before tailcall was introduced.

See Also

Functional programming
tailcall

Further Reading

Why Object-Oriented Languages Need Tail Calls , Guy Steele
In addition to simple recursive optimization, tailcalls make control more modular.

Description

The usual programming style in Tcl is imperative, where all the programing primitives are thought of as commands. Programs maintain their state by assigning to variables, and they can assign to the same variable multiple times. In Tcl, the classical way to compute a factorial, for instance, would be given by the following script:

proc fact0 { n } {
    for {set f 1} {$n > 1} {incr n -1} {
        set f [expr {$f * $n}]
    }
    return $f
}

This style is not the only style available, nor necessarily even the best. The functional programming community considers variable assignments harmful, and prefers that variables be bound only as the arguments to functions. This preference leads to a style of programming where loops are replaced by recursion, and all the control structure is applicative, that is, based on function calls. The naive implementation of factorial in such a system is shown in the following script:

proc fact1 { n } {
    if { $n <= 1 } {
        return 1.
    } else {
        return [expr { $n * [fact1 [expr { $n - 1 }]] }]
    }
}

(This Wiki has a discussion on the required steps towards functional programming in Tcl.)


Tail recursion

Note that the fact1 procedure consumes a large quantity of stack. If n is 10, say, it goes ten recursive calls deep. If multiple levels of loop are eliminated in this way, or if something requires many thousand iterations, the recursion quickly becomes unmanageable.

There is an alternative that preserves the best of both approaches. If a procedure can accept extra parameters that maintain its state, it can be written in a hybrid-iterative or tail-recursive style. The key to this style is:

  • If a procedure invokes itself recursively, it immediately returns the value of the recursive call without further processing.

One factorial procedure, written in the tail-recursive style, is shown below:

proc fact2 { n { result 1. } } {
    if { $n <= 1 } {
        return $result
    } else {
        return [fact2 [expr { $n - 1 }] [expr { $n * $result }]]
    }
}

ZB: I must say: I don't get it. It just looks so nice on the paper (or screen) - but the value to be returned, has to be computed first, right? So, the "return verse" of the example given above, in fact contain two verses:

set hiddenTemporaryVariable [fact2 [expr { $n - 1 }] [expr { $n * $result }]]
return $hiddenTemporaryVariable

The function doesn't return "a call to procedure with parameters" - but a value rather (we're using "[ ]"), right? So it seems to me, that still there is a need to fill the stack with return addresses. If I'm wrong - perhaps somebody could tell me, where did I mistake.

Lars H: It's written in (traditional) tail-recursive style, not actually employing tail-recursion. Realistic Tcl syntax would rather be something like

proc fact2 { n { result 1. } } {
    if { $n <= 1 } {
        return $result
    } else {
        tailcall fact2 [expr { $n - 1 }] [expr { $n * $result }]
    }
}

but that still wouldn't be any different without also making changes to the Tcl execution engine. Mechanisms such as that in returneval could be used to simulate this effect, but that's just syntactic sugar hiding the actual loop. I think some TCT member during the last half a year or so put some notes on the wiki regarding a reform of the bytecode execution engine which would make tail call optimisation feasible, but right now I can't find them (it's not MS's bytecode engine ideas, although it is related, and neither Radical reform of the execution engine).

Update 2008-07-13: The page I was looking for is NRE. MS just posted an announcement [L1 ] asking people (especially extension writers) to test it.

ZB: So the final conclusion is, that it was purely theoretical - and currently it's not possible to save stack space by "tail calling"?

NEM: Currently, there is no tail-call optimisation in Tcl, so best to use loops for now. I believe there has been some experiments with doing this in the bytecode engine (from MS), but I don't know what stage they are at. I should also point out that tail-call optimisation applies not just to recursive calls to the same procedure, but to any call that is in a tail-call position. So, e.g., various OO design patterns, such as the Visitor Pattern that involve deeply nested function calls can be optimised in the same way. (It is often overlooked that good OO design relies quite heavily on functional programming idioms).

MS 2008-07-13: NRE has been committed to HEAD: Tcl8.6a2 and later will include a ::tcl::unsupported::tailcall command that does proper tailcalls. That is, until the dust settles, the semantics are fixed and the thing is TIPped

NEM 2008-10-25: And TIP [L2 ] is now Final, adding this command to Tcl. Lars' fact2 now works as written, as does e.g. this linear-time fibonacci function:

proc fib {n {x 1} {y 0}} {
    if {$n == 0} { return $x }
    tailcall fib [incr n -1] [expr {$x+$y}] $x
}

Tail call Elimination

Why is tail recursion important? The reason is that there's a simple, mechanical optimization that can turn it back into iteration. The recursive call can be replaced with code that:

  • Replaces the arguments to the function with the values from the recursive call.
  • Jumps back to the start of the function.

While Tcl has no go to construct, the code for this optimization can be approximated with a while loop, as in the fact2 procedure below.

proc fact2-rewritten { n { result 1 } } {
    while { 1 } {
        if { $n <= 1 } {
            return $result
        } else {
            set result [expr { $n * $result }]
            set n [expr { $n - 1 }]
        }
    }
}

The Tcl bytecode engine has no support for this kind of optimization today, but maybe it should. Programmers accustomed to functional languages such as FP and Haskell (see Playing Haskell), and those accustomed to Scheme, would be surprised that it does not.

In fact, most implementations of these languages optimize not only tail recursion, but all tail calls. If a function returns the value of another function, the call is replaced with code that adjusts the stack and jumps to the start of the other function. In this way, mutually recursive functions can also get reduced to iteration.

See also MS's bytecode engine ideas


MS: a silent implementation of tail-jump-optimisation in the engine is probably possible. However, it would mess up the stack - and errorInfo.

Maybe it should be used at the programmer´s bidding only; maybe something like

proc fact2 { n { result 1. } } {
    if { $n <= 1 } {
        return $result
    } else {
        return -tail [fact2 [expr { $n - 1 }] [expr { $n * $result }]]
    }
}

KBK: I'd rather see it be a global option for the bytecode compiler, so that the programmer can set up as a "debug" or "release" configuration. Something like:

if { [info exists ::tcl_platform(debug)] 
       && $::tcl_platform(debug)] } {
    set ::tcl::compiler(tailCallOptimize) 0
} else {
    set ::tcl::compiler(tailCallOptimize) 1
}

That way, running in the "debug" environment gets detailed ::errorInfo, while running in the "release" environment gets fast code. Sort of like -g vs -O on most C compilers.

AK: This modification reaches further than messing up errorInfo. uplevel and upvar are deeply affected. They will either not work at all, or have different semantics. A suboptimal (interim) solution would be to eliminate tail-calls on the C-stack but keep the Tcl Callframes so that uplevel, etc. can still work correctly.


Arjen Markus: It is possible to realise tail recursion in pure Tcl (version 8.0 and onwards). The major drawbacks of the approach are:

  • Clumsy programming (you need to use variables in a context outside the recursive procedure
  • Slow execution (I have included some figures to give an impression)
  • All the caveats mentioned above about upvar and uplevel continue to apply.

But the advantage is that you can do endless recursion - the very reason tail recursion is so useful!

Here is my small example - a stupid little counter, but nevertheless ...

#
# Simple recursive calculation
#
proc rcount { c } {
    if { $c != 0 } {
        incr c -1
        return [expr {1+[rcount $c]}]
    } else {
        return 0
    }
}

set counter [rcount 10]
puts $counter

#
# Simulate tail-recursive calculation
#
proc tail_rcount_priv { c } {
    if { $c != 0 } {
        incr c -1
        incr ::tail_rcount_c
        after 0 [list tail_rcount_priv $c]
    } else {
        set ::tail_rcount_continue 0
    }
}


proc tail_rcount { c } {
    set ::tail_rcount_continue 1
    set ::tail_rcount_c        0

    after 0 [list tail_rcount_priv $c]

    vwait ::tail_rcount_continue

    return $::tail_rcount_c
}

set counter [tail_rcount 10]
puts $counter

puts [time {set counter [rcount 100]}      100]
puts [time {set counter [rcount  10]}     1000]
puts [time {set counter [rcount   2]}     5000]
puts [time {set counter [tail_rcount 100]} 100]
puts [time {set counter [tail_rcount 1000]} 10]
puts [time {set counter [tail_rcount 10000]} 1]

The result (at least, one possible, on my home PC, 350 MHz, Tcl8.3.4):

10
10
1600 microseconds per iteration
500 microseconds per iteration
78 microseconds per iteration
89000 microseconds per iteration
841000 microseconds per iteration
8450000 microseconds per iteration

Conclusion:

  • Tail recursion using after is possible but expensive (about 50 times!)
  • Tail recursion is linearly dependent on the number of steps (for ordinary recursion the time does not increase linearly, but grows less fast)

- - - I realised that the large execution time is probably due to the granularity of the timer (hence the round figures), more than anything else, so a delay of 0 ms is actually a delay of 1 ms most of the time!

(I ran the above script on a UNIX workstation and got very different ratios, a clear indication that granularity is the problem here!)

Update: using after idle improves the performance numbers as well (at least in tclsh)!

Trampolines

jenglish:

Another technique for implementing tail-calls is the trampoline: instead of returning a value, procedures return a continuation (i.e., what to do next). Then an outer loop -- the trampoline -- repeatedly evaluates the current continuation.

Implementing a trampoline is easy in Tcl:

proc trampoline {args} {
    set cmd $args
    while {1} {
        set cmd [eval $cmd]
    }
}

Add a bit of syntactic sugar to indicate tail-calls:

proc trTailcall {args} {
    return -code return $args
}

and finally we need a way to break out of the trampoline:

proc trResult {result} {
    return -code return [list return $result]
}

For an example, here's the standard (tail-recursive) version of the Fibonacci function in Tcl:

proc fib1 {n} {     ; # See [SICP] p. 39
    proc fib-iter {a b count} {
        if {$count <= 0} {
            return $b
        } else {
            fib-iter [expr {$a+$b}] $a [expr {$count-1}]
        }
    }
    fib-iter 1.0 0.0 $n
}

But this version runs out of stack space for large n:

% fib1 1000
too many nested calls to Tcl_EvalObj (infinite loop?)

Rewriting it to use a trampoline works for (almost) arbitrarily large n

proc fib2 {n} {
    proc fib-iter {a b count} {
        if {$count <= 0} {
            trResult $b
        } else {
            trTailcall fib-iter [expr {$a+$b}] $a [expr {$count-1}]
        }
    }
    trampoline fib-iter 1.0 0.0 $n
}

% fib2 1000
4.34665576869e+208

although we eventually run into floating point limitations (on my machine this happens at n=1476).

Arjen Markus: This is a very nice solution too, nicer in that the evaluation occurs in a local context, rather than global.


Arjen Markus: I copied these wise words about tail recursion from a response by Donal Fellows on the c.l.t.:

While many algorithms can have tail-recursion elimination applied to them, not all can be handled like this. Typically, you'd expect the optimisation to be applicable exactly when a function directly returns the result of another function call (especially in the case where it is a recursive or co-recursive[1] call.) The problem comes with when you start wanting to do some operations to the result before passing it back; optimising these sorts of things can be much harder. As an example of what I mean, the natural definition of the factorial function[2] (in Tcl) is:

proc fact {n} {
    expr {$n<2 ? $n : $n*[fact [expr {$n-1}]]}
}

This is not tail-recursive since you need to keep the value of $n about until after the recursion ends. Of course, this is fixable using an accumulator-argument scheme, but in general that's difficult to do:

proc fact {n {accum 1}} {
    expr {$n<2 ? $accum : [fact [expr {$n-1}] [expr {$accum*$n}]]}
}

Indeed, for some algorithms (depth-first tree traversal, for example) you need a stack *somewhere* and it may well be easiest to maintain the stack through recursive calls (it usually depends what else is going on at the same time.)

Trampolining (I *like* that code[3]) becomes unwieldy as the amount of information stored on the stack increases (I've got co-recursive tree-traversal code that uses upvar to access large arrays in parent contexts as part of an elaborate code generator, which would be very hard to make tail-recursive.) However, if you can bound your recursion depth neatly (pretty easy in practise with non-worst-case trees because of the sheer quantity of data even fairly shallow trees can hold) then a recursive algorithm may well be the best way to solve a problem.

Donal.

[1: A calls B, and B calls A. ]

[2: Fibonacci and GCD functions lend themselves to this treatment as well. ]

[3: If we had first-class stack-frame objects, this would be even cooler! ]

Lars H, 28 Aug 2004: The bifurcation page defines a command that allows you to do recursion with a stack, while not using the Tcl stack and thus becomes able to go much deeper than standard recursion in Tcl. (The command also provides a way of locally remembering values of the recursive function that have already been computed.)


Another place where tail calls would be really useful, and in support of an existing tcl idiom, is to implement major command/minor commands. In this case, the major command does nothing (modulo error checking) but a tail call to the minor command, and the fact that the stack frame of the major command is lost is a feature, not a bug. It's sort of the inverse of a curry (a chutney?). Something like:

proc majorCommand {minorCommand args} {
    if {[isValidCommand $minorCommand]} {
        return -tail [eval $minorCommand $args]
    } else {
        error "valid subcommands are ..."
    }
}

And: Suppose that creating a namespace also created a major command of the same name that redispatched the minor command to the namespace. Then namespaces and commands could be unified. The nice thing about this is that while commands are closed, namespaces are not(*). This would allow the set of minor commands of a major command to be extended, which would definitely be nice. And once commands and namespaces are unified, a tcl OO-system seems like it would practically fall out.

(*) If it were possible to close a namespace (perhaps a new kind of trace?), class-based OO systems could be implemented on top of this. With open namespaces, it's slot-based OO only. Scott Gargash


TV: It sounds to me like an approach which on this page by and large is not answered in another way then lets say programming pattern preferences, while the reasons for for instance recursive programming, and the reasons to get away from that are good computer architecture reasons, and the idea of the tail call has to do with defining a certain order of computations (or events, such as in process algebra) where you have no better option than having that organisation, in process algebra for instance because the possible patterns extend to infinity (possibly in overseeable ways), in network-like programmed cases because of what people call dependencies in the network which let some functions for instance wait for the result of others.

Simply talking about efficiency is pretty out of order I would say when using tcl to do recursion or something similar: it simply will not let you gain much, the tcl interpretation in general is slow compared to the actual work being done for examples like above, so on can chose any organisation one pleases, maybe a more eastetical or mathematically founded one can be prefered.

For compiled or assembled languages, the picture is different, a function call requires some overhead, at least usually the saving of some registers and the program counter on the stack, and at least one call instruction, which for small functions and especially for something like the faculty on a processor with acceleration for arithmetic operations is always an objective overhead.

Weather in that case that lies in the fact that it takes up some stack space, making it remember intermedeate function outcomes which aren't needed for the end result, or that a certain function is called repeatedly doesn't take away the efficiency deficiency.

From a debugging point of view, recursion has the advantage of using a single function, with possible only few execution branches in it, which especially for more complicated cases (or the availability of good, tested, building blocks for it such as multiplication assembly coded for earlier or small machines) is a major debugging advantage: there is only one function to test.

Repeated function calling can be prevented by either looping the function body, or repeated use of macros. That makes sense in machine language or C, but in tcl probably some things which influence the bytecompiler are inportant, but not in the same way as is known from traditional programming optimisation efforts. Talking about that is either wrong or possibly a matter of syntax eastetics. For repeated tasks on the processor level (at least for most common processors), an (quite) intelligent compiler or programmer probably would want to include the use of registers to store various intermedeate results in, which is not possible for all but also for most registers when calling functions, but not trivial, and not usually over the standard C function call/stack handling. Also the cache will be less happy with a long list of intermedeately stored values.


RS 2004-02-08: It just occurred to me that RPN does bring "tail call optimization" without any effort, if done with late bindings as in RPN again:

 /fac {dup 2 < {drop 1} {dup 1 - fac *} if} def

No magic, because RPN "words" have no local stack frame; they are just substituted and evaluated in sequence, and probably slower than "real Tcl" too. But fun to play with :)


SS: 2004-03-25: Sugar provides a tailrec_proc command that automatically converts a Tcl procedure into a semantically equivalent one but tail-rec-call optimized. The conversion is done at source level. This is an example:

package require sugar

tailrec_proc printlist l {
    if {[llength $l]} {
       puts [lindex $l 0]
       printlist [lrange $l 1 end]
   }
}

printlist {1 2 3 4}

It's the same as

proc printlist l {
     while 1 {
         if {[llength $l]} {
             puts [lindex $l 0]
             foreach [list l ] [list [lrange $l 1 end] ] break
             continue
         }
         break
     }
}

printlist {1 2 3 4}

Tail calls can be nested inside if branches as deeply as the user like.

TV: Is reminded of his university class (college?) control theory, where the professor used to remind the students the optimisation is meaningless if you don't specify *what* is being optimized. On itself a valid thought.

NEM 2005-04-08: Here's an implementation of tail-recursion using Tcl's exception handling mechanism, and trampolining (like Joe's version above). Not very polished, but illustrates a general idea:

# Tail-call exception, use exception code "100"
proc tailcall {args} { return -code 100 $args }
# Handle the trampoline call -- this only needs to be done once at the toplevel
proc tcall {name args} {
    set rc [catch {uplevel 1 [linsert $args 0 $name]} ret]
    while {$rc == 100} {
        set rc [catch {uplevel 1 $ret} ret]
    } 
    return -code $rc $ret
}
# Demo -- a tail-recursive factorial function
proc fac {x {val 1.0}} {
    if {$x <= 2} {
        expr {$x * $val}
    } else {
        tailcall fac [expr {$x -1}] [expr {$x * $val}]
    }
}

Think that is correct... demo:

% tcall fac 2
2.0
% tcall fac 4
24.0
% tcall fac 1000
floating-point value too large to represent
% tcall fac 100
9.33262154439e+157

RS: Here's the limit:

% tcall fac 170
7.25741561531e+306
% tcall fac 171
floating-point value too large to represent

RHS Made some changes to you don't need to use tcall to start with. Also changed it back to ints, so we could be sure it was getting past the recursion limit. The result, due to this, is wrong :)

proc tailcall {args} { return -code 100 $args }

# Handle the trampoline call -- this only needs to be done once at the toplevel
proc tcall {name args} {
    set rc [catch {uplevel 1 [linsert $args 0 $name]} ret]
    while {$rc == 100} {
        set ret [lreplace $ret 0 0 [lindex $ret 0]_internal]
        set rc [catch {uplevel 1 $ret} ret]
    }
    return -code $rc $ret
}

rename proc proc_old
proc_old proc {name argList body} {
    uplevel 1 [list proc_old ${name}_internal $argList $body]
    uplevel 1 [list interp alias {} $name {} tcall ${name}_internal]
}

# Demo -- a tail-recursive factorial function
proc fac {x {val 1}} {
    if {$x <= 2} {
        expr {$x * $val}
    } else {
        tailcall fac [expr {$x -1}] [expr {$x * $val}]
    }
}

fac 100

Another term for code 100 could be "eval"...

proc tailcall args { return -code eval $args }

Lars H: On uplevel there is an old discussion of this, which ends with the suggestion that the code for eval should be -1.

SS: Nice idea, this is exactly how I was thinking to implement the tail command of Jim, like a JIM_TAIL_CALL return code, that has as effect the execution of the return value. But in the form you wrote it it is better IMHO, I'm probably going to take your idea ;)

Lars H: See also returneval. The aim there is different, but the end is very similar.

SS: Indeed, and this is like a good sign. Another one is that the implementation in Jim took something like 7 lines of code... so you can find it inside the CVS and hourly tar.gz. Note: I'm not sure I'll take it inside the language, but it will stay inside the source code for at least one week in order to allow Tclers interested in the idea to evaluate it. Btw I think it's a good concept.

NEM: The argument for eval being -1 didn't seem conclusive. Negative integers seem to be allowed as return -code values, so it still might break something. In addition, on the uplevel page it states that the TCL_EVAL (or whatever) code would be internal only, and never returned from a command. In my implementation that isn't the case (and I'd argue that it shouldn't be the case) -- the exception can be caught before it propagates up past the first stack frame, e.g. in

catch {tailcall ... } 

the return value of this catch will be the TCL_EVAL code.

The returneval code is very similar, but doesn't properly handle recursive calls.

Lars H: Regarding the argument for -1: No, that's mostly a fun coincidence, but I think -1 is less likely to collide with possible other uses of [return -code] than would e.g. 5 or 100.


SS: In Jim's implementation you can write tailcall in terms of return -code eval as:

proc tailcall args {
    return -code eval [list return -code eval $args]
}

I didn't used a return code of -1, but as the page claims Jim's processing of this special return code is internal only because the semantic itself asks for it to be this way (the semantic I'm using at least). In Jim when return -code eval {some script} is used, the function returns to the caller, but before to give control to the caller the return code is evaluated as a script (in the context of the caller), and the result of the evaluation returned as return value of the procedure.

Note that indeed it *is* possible to trap the return itself:

. catch {return -code eval foo}
2

Like it happens with every other -code. But what is trapped is the JIM_RETURN retcode itself, because where every return -code is really processed is when the procedure returns (this is what makes different return -code from commands returning an exception like break or continue after all). That said it seems to me that to make it untrappable is the right thing to do, as this return code is a request we make to Tcl to evaluate the returned value as a script before to return it to the caller.

Also an implementation detail: to make the eval code not using the C stack at all is not as trivial as you can think, the code required to make it working is little, but with a non trivial semantic. The only way I think can be used to do it (that's what I used) is to trap the EVAL code in a while loop, without to exit the loop if the successive evaluation continues to return JIM_EVAL. There are other subtle things so it is also needed to add a field with the level number currently handling a JIM_EVAL code inside the interp structure.

Now what I'm not sure about is to expose directly return -code eval, or just his more limited and higher-level interface of the tailcall command. The latter allows for less abuses, the former can be used for unrelated things and makes *clear* that the script is returning. The concept of "we are returning even if this is *not* a tail call" is less clear with the tailcall command.

Btw that's the full implementation in Jim:

/* Handle the JIM_EVAL return code */
if (retcode == JIM_EVAL && interp->evalRetcodeLevel != interp->numLevels) {
    int savedLevel = interp->evalRetcodeLevel;

    interp->evalRetcodeLevel = interp->numLevels;
    while (retcode == JIM_EVAL) {
        Jim_Obj *resultScriptObjPtr = Jim_GetResult(interp);
        Jim_IncrRefCount(resultScriptObjPtr);
        retcode = Jim_EvalObj(interp, resultScriptObjPtr);
        Jim_DecrRefCount(interp, resultScriptObjPtr);
    }
    interp->evalRetcodeLevel = savedLevel;
}

NEM: Good discussion. With regards to catching the exception, yes [catch {return -code eval ..}] should give a TCL_RETURN return code. The question is what should

catch {tailcall ...}

give? I'd argue that it should give a TCL_EVAL return code. The only other option is that it doesn't return, which would be a new situation in Tcl (except for [exit], I can't think of another function which doesn't return at all). Also, the Tcl level implementation of tailcall should be simply:

proc tailcall args { return -code eval $args }

The return causes a TCL_RETURN code to propagate. When the tailcall frame unwinds, it is converted into a TCL_EVAL return code, just as happens for [return -code break] etc, so it should work as planned. This is the difference between [return -code eval] and [tailcall] -- the latter causes the current proc to return and eval the code, the former causes the caller to return and eval. This might seem an odd thing to want to do, but we allow it for other things, and it makes writing certain types of control structures easier. [return -code eval -level 0] would be identical to [tailcall].

I'm not entirely sure what the level stuff is for, but your implementation looks like what I was thinking for Tcl. I'm not entirely sure what the level stuff is for? (Something to do with [uplevel] handling?) Anyway, I'm sure we can get the details worked out. One point which occurs to me: In Jim, what does the following do?

proc foo {} {
     set a "In foo"
     tailcall {puts $a}
}
set a "global"
foo

In which scope is $a resolved? I think it should print "global", as the foo stack frame should have disappeared by that point, but it's a subtle issue, and I'm not entirely sure what the best option is. Printing "global" implies that tailcall is really a special cross between uplevel and return. Perhaps the name should reflect that?


SS:

About "The question is what should catch {tailcall ...} return", is IMHO JIM_EVAL again. Why I think so should be clarified by the following. NEM said that the implementation of tail should be: "proc tailcall args { return -code eval $args }". This is not possible, for instance thing to what we already have:

return -code return foobar

You can't write a procedure "return2" that does the same just writing:

proc return2 {} {return -code return foobar}

Because the seamtic of "return -code" is relative to the level at wich it is called. Incidentally with return -code eval you have a way to implement it:

proc return2 {} {return -code eval [list return -code return foobar]}

About the "set a global" thing, I agree, and actually executing your code what Jim prints is global. I also think that return -code eval has a lot to do with uplevel, and I'm pretty sure in the future if this feature will enter Tcl and/or will remain in Jim there will be some new interesting usage pattern for it.

NEM: I'm not sure if we're aguing for the same behaviour or not. I should probably just download Jim and try it out! Yes, I think the result of [catch {tailcall ...}] should be TCL_EVAL (or JIM_EVAL). So, I think we are agreed there. I don't see the point you are making with the return -code stuff. The [return -code return] form allows you to write a proc that behaves the same as just "return" (as your return2 does). What I am arguing is that [return -code eval] should allow you to write a proc which acts the same as [tailcall] does (i.e. not [return -code eval]). To make this explicit, here's a worked example:

proc tailcall {args} { return -code eval $args }
proc foo {} {
    puts "In foo.."
    tailcall bar 1 2
    puts "In foo still??"
}
proc bar {a b} { puts "bar $a $b" }
foo

Now, this example should print:

In foo...
bar 1 2

and that's it. The execution goes like this:

call foo --> stack = {foo}
  puts "In foo.."
  call tailcall bar 1 2 --> stack = {foo tailcall}
    return -code eval {bar 1 2} --> returns TCL_RETURN
  (exit from tailcall) --> stack = {foo} , return code becomes TCL_EVAL
 (exit from foo) --> stack = {}
 call bar 1 2 --> stack = {bar}

Perhaps it would be better to avoid use of "eval", as it doesn't convey the uplevel nature. Ideally, the [tailcall] command (whatever we call it) and the return code should have the same name. Perhaps something like [reteval] or just [call]?

To put it more clearly, the difference is in what this is equivalent to:

proc thing {args} { return -code eval $args }

In Jim, it seems as if that [thing ...] would be equivalent to [eval ...] (which does match the name). In my scheme it would be equivalent to [tailcall ...]. I could be swayed either way (and for my scheme, a different name than "eval" would be more appropriate, e.g. [return -code tailcall ...])

SS: Ok, I got your point and actually it is more similar to all the other exceptions are working today in Tcl (modulo the fact that this is the first Tcl exception where the return value is not the empty string if I'm correct). But to do so we need *three* different "things":

return -code eval ....

returneval args

tailcall arg0 arg1 ... argN

returneval returns directly a TCL_EVAL retcode, and set the result to the string the caller's context will evaluate to get the actual result value. return -code eval does what it does with every other kind of exception of course, and tailcall is similar to returneval but guarantees that every argument is a single argument of the called procedure, this is convenient as returneval instead should do something like concat on the arguments to be coherent with the rest.

I'm more happy with this design, but I'm not happy at all with the name returneval :) some suggestion? Do you agree we also need a command that directly returns the return code like we have with break, continue and so on? Also after this change I no longer agree with myself as returneval will be catchable and will return TCL_EVAL (integer value 5 in Jim currently)!

Btw thank you very much for your contribution on this topic.

Lars H: The way returneval is implemented is partly due to limitations in Tcl with respect to what a [return -code] can do, so its return code being TCL_EVAL is not a big deal. These restrictions were lifted by TIP 90 , so it could be done differently in 8.5, but that's not much of an issue. BTW, the idea of returneval was to a large extent inspired by a discussion in TIP#90, so if any of you two haven't read that, you really should. It should for example explain the issues with levels of return, which can indeed be a braintwister.

NEM: Yes, I am familiar with TIP 90. That's why I would prefer return -code return to produce a TCL_RETURN code, which then is converted to a TCL_EVAL after one level. This is consistent with the phrasing of the TIP, which explicitly says that if the -level value is non-zero (which it is by default=1) then return always produces a TCL_RETURN result. Coming up with a good name for the operation seems to be the sticking point! Suggestions welcome.

SS: what about evalret?

RS: or reeval?

NEM: I'm quite keen on teleport or warp at the moment! :) One of the coursework assignments I had as an undergraduate was to write an interpreter for a Red Dwarf[L3 ] inspired toy language (called Jim++, incidentally), which had a sort of GOTO operator, named hollyhop...

Lars H: Another issue which should perhaps be considered is the semantics. returneval, as implemented, takes a script as argument (one of the main reasons it is hard to modify its eproc_call to do tail call optimisation), whereas the tailcall above takes the words of the command to evaluate as arguments. For TCO, the latter is clearly more convenient.

SS: Ok, I found some good reason to take Jim's implementation as it is: there is no way with the current implementation to directly return a JIM_EVAL code, but just JIM_RETURN with code set to JIM_EVAL. This means that it is uncatchable, that a JIM_EVAL code can never cross interpreter boundaries, and that the semantic of the EVAL code is fully relative to the JIM_RETURN code itself, i.e. it is semantically not possible to have JIM_EVAL without to return from the procedure. All this makes sense to me.

About Lars H semantical question, I think he is correct. While it is hardly possible to expose just tailcall command because return allows for numeric -code ... like [return -code 5 ...], it is IMHO a very good idea to provide tailcall as a builtin. Personally I think return -code eval has other advantages (see TIP90), but the most interesting use case is IMHO tail recursion optimization, so it should be simple to use. This means that I'm going to add tailcall as a builtin in Jim too, with exactly the same semantic as return -code eval [list {*}$args], so it will also return a JIM_RETURN retcode, and will continue to be safe against cross-interp boundaries.

Lars H: The tailcall issue is probably parallel to e.g. break and continue, which can be implemented as procs.

SS: It is a bit different as break implemented as a proc will return TCL_BREAK. While my implementation of tailcall returns JIM_RETURN instead of JIM_EVAL, but set the 'return code' of JIM_RETURN to JIM_EVAL. Btw an implementation of tailcall is now in Jim's hourly tar.gz and CVS if somebody is interested to play with the semantic of this commands. From the point of view of the user that want to write tail recursive procs it is a non-brainer, just prefix with "tailcall" every tail call.

Lars H: I meant it is similar from a language minimality point of view. break and continue are allowed to exist as built-in Tcl commands, although they could in principle be procs, because they're so useful. The same argument applies for tailcall.

RHS: I'm confused. Can you provide an example of where [tailcall] setting the return code to TCL|JIM_EVAL breaks something? I believe you when you say there's an issue, I'm just having trouble wrapping my head around it. I'm a big fan of consistent behavior, so I'd like to see it work like break/continue/error's relation to [return -code ...].

As a sidenote, is there a reason other than optimization to have [returneval] in addition to [tailcall]? It seems to me that a command that takes one argument that is a list can do everything that a command that takes a list of arguments can do, since you can just as easily do [tailcall [list arg1 arg2 arg3] ].

Lars H: With returneval syntax you pass not a list, but a script, so you can do things like

returneval {
    set someVar [llength $someList]
    break
}

This ought to be equivalent to

tailcall eval {
    set someVar [llength $someList]
    break
}

however, so there's really no loss of power in choosing tailcall over returneval.

RHS: Thanks Lars. I was missing that part. My preference (if there's only one of the two) is for the script form, with an optimization that says "if its a pure list, consider it a command + args".

SS: Jim's way to do it is to provide the raw [return -code eval $script] for script execution, and [tailcall procname arg1 arg2 ... argN] to tail-call a command where every tailcall argument is one argument of the command. Note that with the latter is also possible to get the script behaviour with just [tailcall eval $script]. About what is not ok with returning JIM/TCL_EVAL directly. Problem 1: safe interpreters may return a TCL_EVAL retcode and be able to do more than they should forcing the parent interpreter to eval something of untrusted. if instead TCL_RETURN is returned this is impossible, the TCL_EVAL is only stored inside safe interp structure and not affecting in any way the parent interpreter. I continue to be not sure at all about what's the right thing to do with this issue. Notably there is no difference at all between the two versions *but* the fact that to return JIM_EVAL makes it a problematic about safe interpreters, and the fact that to return JIM_RETURN does not allow to distinguish via catch between JIM_RETURN and JIM_EVAL, but are we sure it's worth to distinguish it?

SS: Ok, the games are done. I think NEM/RHS design idea is better, so I changed Jim to reflect this. Thank you very much for your contribution to this page, as this make you Jim contributors too, not of code but of design that is at least as important!.


[merged from tail call optimisation by dkf]

mjk: Here is a simple example of using the after for tail-recursive calls. The following example calculates Fibonacci numbers without exhausting the stack.
proc _fib {i a b} {
    if {$i <= 2} then {
        set ::_fibretval [expr $a + $b]
    } else {
        after 0 _fib [expr $i - 1] $b [expr $a + $b]
    }
}

proc fib {i} {
    after 0 _fib $i 0 1
    vwait ::_fibretval

    return $::_fibretval
}

puts [fib  1] ;# => 1
puts [fib  2] ;# => 1
puts [fib  3] ;# => 2
puts [fib 10] ;# => 55
puts [fib 30] ;# => 832040
puts [fib 46] ;# => 1836311903
MSW: Basically it's turning the stack by 90 degrees and is using time as stack. Not exactly what I would call a tail call optimisation (Reuse of the same stack) but rather a stack consumption workaround ...
Usual recursion             |  after-workaround
                            |
[ frame ] - return & wreck  |  [ frame ] - alter & wait & wreck
[ frame ] - return & wreck  |              [ frame ] - alter & wait & wreck
[ frame ] - return & wreck  |                         [ frame ] - alter & wait & wreck
 ---> result                |                                   ---> result
Mind the functional nature  |  Mind the imperative nature

Tail recursion
[ frame ] <-- loop here & return result. Reuse stack frame. Still functional

Tail call
[ frame ] <-- call new function in the very same frame. Still not imperative !
Tail call support really means reusing the very same stack, the optimisation being not only stopping the burn of stack frames but especially saving the costs of building them up just to tear them down again. A tail recursive function is equivalent to a loop in execution, which can't be said about the after way to implement it. I suppose having a look at Guy Steele's original paper [L4 ] might save some confusion.

mjk: You are correct. My example was inspired by a comment at the after page, where Arjen mentioned that after can be used to implement tail-recursive calls. Or tail-recursive-like calls. In my case, I think that "the end justifies the means", because I can't come up a way to implement this kind of behaviour with using only the pure Tcl. Is that even possible (there's no goto, but foreach and while can be used)? However, using after, it is possible to achieve similar results. But that's true, that in the spirit of the real tail-recursiveness, my example is just a workaround.

MSW: About "real" tail calls in pure tcl, Check Tcl 9.0 Wishlist (search 53. MSW & its answers :) and Tail call optimization.

mjk: Thanks for the pointers. And now I feel myself a bit dumb, because there seems to be an example Arjen mentiones (at the Tail call optimization page), which will make my example redundant. Anyway, great to see that the tail-call optimization is considered in the next major release. It would be a great thing and I'd love to see it in the Tcl some day.

MSW: That and Closures and Continuations ... :) --- I'm not sure it's considered for the next release either, it was just my windowing shell...