Version 39 of Coroutines for the Dazed and Confused

Updated 2010-06-23 13:44:59 by nem

[DKF: This example copied from below to show off a classic way of doing coroutines.]

% proc count x {
     while {$x > 0} {
          yield "tick... $x"
          incr x -1
     }
     return "BOOM!"
}
% coroutine bomb count 3
tick... 3
% bomb
tick... 2
% bomb
tick... 1
% bomb
BOOM!

And at the end of this script, bomb has gone away too because the coroutine's done a normal exit.


I hope these lies and oversimplifications will help others understand coroutines. See below for an increasingly accurate explanation. A special thanks to miguel on the #tcl freenode channel for taking the time to explain this to me.

proc allNumbers {} {
    yield
    set i 0
    while 1 {
        yield $i
        incr i 2
    }
}
coroutine nextNumber allNumbers
for {set i 0} {$i < 10} {incr i} {
    puts "received [nextNumber]"
}
rename nextNumber {}

The output of the above code is:

received 0
received 2
received 4
received 6
received 8
received 10
received 12
received 14
received 16
received 18

The first thing to understand is the yield command (which is in the coroutine's code). [yield $foo] acts like [return $foo], but it suspends execution; when execution is resumed, the proc restarts on the line after the previous call to yield.

The procedure "allNumbers" is a proc which first calls yield, and so suspends immediately (line 2); when it resumes it begins execution at line 3, after which it reaches another yield within a while loop on line 5. After producing each number, it yields its value ... and suspends until it is called again.

Now this line:

coroutine nextNumber allNumbers

This line will (a) run allNumbers until the first yield (b) create a command called 'nextNumber' that resumes execution of allNumbers when nextNumber is called.

So after this coroutine command, we have allNumbers suspended right before 'set i 0', and a new command 'nextNumber' that continues it where it left off.

Get it?


MS notes that he kind of lied to jblz: yield is a bit more complicated, but behaves exactly as described for this example.

ZB Wrong. It's the thing, that once confused me: the proc restarts not "on the line after previous call to yield", but resumes - one can say - directly in that lately left yield field, because there it can get ev. arguments "from outside". Consider the yield field as a "gate". You're leaving the proc - and getting back using this gate. Changed an example a bit - in my opinion, now it better explains the role of yield; changing theValue (and restarting the loop) one can see, how the increment step is different.

proc allNumbers {} {
    yield
    set i 0
    while 1 {
      set howMuch [yield $i]
      incr i $howMuch
    }
}

coroutine nextNumber allNumbers

set theValue 2
for {set i 0} {$i < 10} {incr i} {
    puts "received [nextNumber $theValue]"
}

The following discussion is moved from "lies and oversimplifications".

jblz: Coroutines are the first thing in tcl to ever confuse me to the point of utter incomprehension. {*} made me scratch my head for a minute, but then the light bulb went off. we need some really easy-reader level explanation of them A) because they can be difficult to grasp conceptually (wikipedia helped with this quite a bit) B) Even after understanding the pseudo-code on wikipedia on the topic, the implementation can be difficult to follow.

I think something needs to emphasize the "command aliasing" aspect of the coroutine command, clearly and simply communicate the order in which commands are executed, and simply explain the operation of the yield command, even if it requires "lies and oversimplifications" at first. All the current explanations were far too robust and complex for me to understand.

AMG: Let me give it a try. I'll put the names of the principal commands in bold, to maybe help highlight what creates what and what calls what.

When you call [coroutine], you tell it two things: the name of a new command to be created, and an existing command to run, including arguments. Let's say the new command is to be named [resume] and the existing command is [procedure]. [coroutine] runs [procedure] until it returns or yields, and the returned/yielded value is returned by [coroutine]. In this sense, it's like simply calling [procedure] directly.

Here's where things get different. If [procedure] happened to yield, then the [resume] command is created. When [procedure] yielded, it basically got frozen in time, taken off the call stack but ready to be put back into action. Calling [resume] continues execution of [procedure] right where it left off, until it returns or yields again.

Inside [procedure], [yield] returns the (optional) argument that was passed to [resume]. Inside the caller, [resume] returns the value that was returned or yielded by [procedure].

The [resume] command continues to exist so long as [procedure] yields, then it finally goes away when [procedure] returns.

Throwing and erroring are also considered returning. Yielding is distinct from returning in that it does not terminate execution, only pauses it.

Here's another way to look at it. The arguments to [coroutine] are as follows, but not in this order:

  • The procedure to carry out
  • A handle used to resume the procedure after it yields

I'm not sure what you mean by "command aliasing", unless you're trying to say that [resume] becomes an alias for [procedure]. I don't think this is the right way to look at things. [procedure] is a library routine like any other, waiting to be called, whereas [resume] is an in-progress execution of [procedure] that just happens to be paused at the moment. And if [procedure] isn't currently in progress but paused, [resume] doesn't exist.

I hope this helps!

Hmm, actually I'm afraid this might be the kind of too-robust explanation you can't absorb all at once. Let me try again.

[coroutine] runs a command of your choice (let's say [procedure]) and returns its result. If [procedure] happens to yield instead of returning, [coroutine] creates a new command whose name you get to pick (here, [resume]). When you invoke [resume], [procedure] starts running again exactly at the point that it yielded. In fact, [yield] returns to [procedure] whatever argument you passed to [resume]. Similarly, [resume] returns whatever argument was passed to [yield]. [resume] can be used repeatedly, until [procedure] returns instead of yields, and then [resume] is deleted.

Basically, you have two threads of execution, except that instead of running at the same time, they're taking turns. The [yield] and [resume] commands are used to switch between them and pass data from one to the other.

Read this, write some code, then read my previous explanation. :^)

jblz: AMG, this is extremely helpful. Thank you very much.


DKF: A coroutine's execution starts from the global namespace and runs a command in a separate call stack that will eventually produce a result (which includes returns, errors, and a few other things) that will be returned to the caller, when the separate stack goes away. However, a coroutine can be interrupted by the yield command which stops what the coroutine is doing and returns its value to the caller but leaves the separate stack in place. The coroutine can be resumed by calling its name (with an optional value – the resumption value) which causes execution of the coroutine to recommence exactly where it left off, half way through the yield which will return the resumption value.

We can know that it always starts from the global namespace via this trivial non-yielding coroutine:

     % namespace eval bar {coroutine foo eval {namespace current}}
     ::

Note here that the coroutine is implemented by eval. This is legal! In fact you can use any Tcl command, but the most useful ones tend to be calls of procedures, applications of lambdas, or calls of object methods, since then you can have local variables. The local variables are not a feature of the coroutine, but rather of the command executed to implement it.

OK, so let's explore this resumption value stuff. This coroutine shows a neat trick:

     % coroutine tick eval {yield [yield [yield 3]]}
     3
     % tick 2
     2
     % tick 1
     1
     % tick 0
     0
     % tick -1
     invalid command name "tick"

Do you see how we can work out how many values are produced? There's no loop, so there's one value for each yield (going into the yield from the command substitution that is its argument) and one outer result, which is the outcome of the outermost yield.

OK, let's show off some real state:

     % proc count x {
         while {$x > 0} {
            yield "tick... $x"; # Ignore the resumption value!
            incr x -1
         }
         return "BOOM!"
     }
     % coroutine bomb count 3
     tick... 3
     % bomb
     tick... 2
     % bomb
     tick... 1
     % bomb
     BOOM!

Kroc: You should put this last example at the top of the page! That's the only one which allowed me to understand while the first one puzzled me.


NEM 2010-06-23: For those struggling with coroutines, it might be worth considering how they relate to existing Tcl constructs. Let's consider how we could solve the same problems if we had no built-in coroutine and yield commands. If we take DKF's example from the top of the page:

proc count x {
    while {$x > 0} {
        yield "tick... $x"
        incr x -1
    }
    return "BOOM!"
}

When run as a coroutine, this procedure yields the strings "tick... 3", "tick... 2", "tick... 1" and the finally returns the string "BOOM!". If we had no coroutines, we could implement this example by passing a callback command: just an ordinary command prefix that is called each time we want to yield a new value:

% proc count {x yieldCallback} {
    while {$x > 0} {
        $yieldCallback "tick... $x"
        incr x -1
    }
    return "BOOM!"
}
% count 3 puts
tick... 3
tick... 2
tick... 1
BOOM!

Another name for this style of callback is a continuation, because it represents the rest of the computation to perform. This style of programming should be familiar to those used to doing event-oriented programming, but it is also the way that all loop commands (e.g., while, foreach, etc) are written: the body of the loop is the continuation callback. This is a very powerful and general way of programming -- indeed, pretty much all possible control mechanisms can be coded this way (see e.g., control structures for backtracking search). However, programming in this manner can be cumbersome, as we have to manually construct and pass in these continuation callbacks. For examples such as this, it may not seem too onerous as there is just one such callback. However, (a) things are rarely this simple, and (b) performing such transformations on a large body of existing code can be tedious and error-prone. The coroutine mechanism essentially performs this task for us: silently rewriting all of our code to work in this way without us having to do anything more than start up a coroutine and insert yields where appropriate. The page coroutines for event-based programming has some more motivating examples, as does the original TIP 328 [L1 ].

The example just given gives an idea of the power of coroutines, but is a drastic simplification. Rather than hiding just one continuation callback, the coroutine mechanism allows us to specify a new continuation every time we yield! To emulate this explicitly in Tcl, we can write our own version of yield that allows us to specify a new callback each time:

proc myYield value {
    # assume variable "yieldCallback" in caller's scope
    upvar 1 yieldCallback yieldCallback
    # invoke the yield callback (continuation)
    set newCallback [uplevel #0 [list {*}$yieldCallback $value]]
    # result is the new callback coroutine
    set yieldCallback $newCallback
    return
}

This is written out in verbose form to make it clear what is going on: we call the $yieldCallback coroutine with our yield value, and expect this to return a new coroutine. With this in place, we can now mimick more closely what coroutine provides:

% proc count {x yieldCallback} {
    while {$x > 0} {
        myYield "tick... $x"
        incr x -1
    }
    return "BOOM!"
}
% proc lambda {params body} { list ::apply [list $params $body] }
% count 3 [lambda str1 {
    puts "First yield: $str1"
    lambda str2 {
        puts "Second yield: $str2"
        lambda str3 {
            puts "Third yield: $str3"
        }
    }
}]
First yield: tick... 3
Second yield: tick... 2
Third yield: tick... 1
BOOM!

As you can see, while it is possible to mimick coroutine in pure Tcl, the result is not pretty! Indeed, the more complex the examples get, the more painful it becomes -- and our example doesn't even yet handle returning results from yield (which enables 2-way communication and elevates coroutines above mere generators)! (Hint: to add this feature, adjust myYield to expect a 2-element list from the callback continuation: the new continuation and the return value).

Hopefully this provides some insight into what coroutines are and how they work their magic. Behind the scenes, the entire C implementation of Tcl has been adjusted by MS to transparently perform these transformations for us, so we can continue writing straight-forward code in "direct" style and let the language take care of the complexities.

As a final point, you may be wondering if the coroutine mechanism relieves the need to write any code in continuation-passing style. The answer is almost, but not quite. The mechanism implemented strikes a careful balance between expressiveness and performance. The upshot of this is that there are certain categories of control structures which cannot be implemented with coroutines -- in particular, any kind of non-deterministic (backtracking) search, as used in logic programming cannot be implemented with coroutines alone. For these (rare) occassions, an explicit continuation callback is still the best approach. Everything else will yield before the mighty coroutine!

Kroc - I'm sorry Neil, but if you consider the apply manual page is almost ununderstandable for me, I doubt anybody confused by coroutines (like I was) could understand it with examples based on it. ;)

My point of view is Tcl was better before TCT introduce all those confusing apply, coroutine and OO stuff.

NEM The first example uses only normal commands, no apply. I can rewrite the other examples to just use proc if you'd prefer? I'm sorry you feel that Tcl is evolving in a bad direction. My feeling is the opposite: these new features keep Tcl ahead of the pack in the power/simplicity trade-off.

AMG: apply is immensely powerful stuff, as are coroutines. I've heard rumors that OO is powerful too. :^) They may be confusing, but I think that's only because they offer power that was previously very difficult or impossible to implement in pure-Tcl. In other words, they're confusing because they're truly new additions, not merely sugar over existing functionality. Since they're new, it takes positive effort to learn what they are, when they're useful, and how to use them. You can choose to not use them, but in my opinion you will be missing out. Remember {*}? It was confusing for the same reason (it was a wildly new feature), but I now find it to be indispensable. Let me see if I can motivate each of the new features you mentioned:

apply: Allows you to pass executable code anywhere a command prefix is expected, without having to create a named proc. For instance, can be used as the argument to any callback option, such as [lsort -command] and [trace add].

coroutine: Avoids the need for callbacks and resumable procs with global state data when writing routines with interleaved execution. See Coronet and Wibble for an I/O example, and see Keep a GUI alive during a long calculation for a Tk example.

oo: Lets you cleanly bundle procedures and state data together into objects which can derive their definitions from other objects. Probably other stuff too; I admit I haven't used this in my own work. But I probably should, because as they mature, a great many of my projects wind up developing Yet Another mini-oo implementation.