Coroutines for the Dazed and Confused

[DKF: This example copied from below to show off a classic way of doing coroutines.]

% proc count x {
     while {$x > 0} {
          yield "tick... $x"
          incr x -1
     }
     return "BOOM!"
}
% coroutine bomb count 3
tick... 3
% bomb
tick... 2
% bomb
tick... 1
% bomb
BOOM!

And at the end of this script, bomb has gone away too because the coroutine's done a normal exit.


I hope these lies and oversimplifications will help others understand coroutines. See below for an increasingly accurate explanation. A special thanks to miguel on the #tcl freenode channel for taking the time to explain this to me.

proc allNumbers {} {
    yield
    set i 0
    while 1 {
        yield $i
        incr i 2
    }
}
coroutine nextNumber allNumbers
for {set i 0} {$i < 10} {incr i} {
    puts "received [nextNumber]"
}
rename nextNumber {}

The output of the above code is:

received 0
received 2
received 4
received 6
received 8
received 10
received 12
received 14
received 16
received 18

The first thing to understand is the yield command (which is in the coroutine's code). [yield $foo] acts like [return $foo], but it suspends execution; when execution is resumed, the proc restarts on the line after the previous call to yield.

The procedure "allNumbers" is a proc which first calls yield, and so suspends immediately (line 2); when it resumes it begins execution at line 3, after which it reaches another yield within a while loop on line 5. After producing each number, it yields its value ... and suspends until it is called again.

Now this line:

coroutine nextNumber allNumbers

This line will (a) run allNumbers until the first yield (b) create a command called 'nextNumber' that resumes execution of allNumbers when nextNumber is called.

So after this coroutine command, we have allNumbers suspended right before 'set i 0', and a new command 'nextNumber' that continues it where it left off.

Get it?


MS notes that he kind of lied to jblz: yield is a bit more complicated, but behaves exactly as described for this example.

ZB Wrong. This is the thing that once confused me: the proc does not restart "on the line after the previous call to yield"; rather it resumes, one could say, directly inside that last-left yield, because that is where it can receive possible arguments "from outside". Think of the yield as a "gate": you leave the proc, and you get back in, through this gate. I changed the example a bit; in my opinion it now better explains the role of yield. By changing theValue (and restarting the loop) one can see how the increment step differs.

proc allNumbers {} {
    yield
    set i 0
    while 1 {
      set howMuch [yield $i]
      incr i $howMuch
    }
}

coroutine nextNumber allNumbers

set theValue 2
for {set i 0} {$i < 10} {incr i} {
    puts "received [nextNumber $theValue]"
}
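For reference, here is the same coroutine driven with a different resumption value (only theValue and the loop count differ from ZB's code above; expected results shown as comments), which makes the changing step visible:

```tcl
proc allNumbers {} {
    yield
    set i 0
    while 1 {
        set howMuch [yield $i]
        incr i $howMuch
    }
}
coroutine nextNumber allNumbers

set theValue 3
for {set i 0} {$i < 5} {incr i} {
    puts "received [nextNumber $theValue]"
}
# received 0
# received 3
# received 6
# received 9
# received 12
rename nextNumber {}
```

The first resumption value is discarded by the initial bare yield; after that, each value passed to nextNumber comes out of [yield $i] and becomes the increment.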

The following discussion is moved from "lies and oversimplifications".

jblz: Coroutines are the first thing in Tcl to ever confuse me to the point of utter incomprehension. {*} made me scratch my head for a minute, but then the light bulb went off. We need some really easy, reader-level explanation of them: A) because they can be difficult to grasp conceptually (Wikipedia helped with this quite a bit), and B) even after understanding the pseudo-code on Wikipedia on the topic, the implementation can be difficult to follow.

I think something needs to emphasize the "command aliasing" aspect of the coroutine command, clearly and simply communicate the order in which commands are executed, and simply explain the operation of the yield command, even if it requires "lies and oversimplifications" at first. All the current explanations were far too robust and complex for me to understand.

AMG: Let me give it a try. I'll put the names of the principal commands in bold, to maybe help highlight what creates what and what calls what.

When you call [coroutine], you tell it two things: the name of a new command to be created, and an existing command to run, including arguments. Let's say the new command is to be named [resume] and the existing command is [procedure]. [coroutine] runs [procedure] until it returns or yields, and the returned/yielded value is returned by [coroutine]. In this sense, it's like simply calling [procedure] directly.

Here's where things get different. If [procedure] happened to yield, then the [resume] command is created. When [procedure] yielded, it basically got frozen in time, taken off the call stack but ready to be put back into action. Calling [resume] continues execution of [procedure] right where it left off, until it returns or yields again.
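A minimal session with those placeholder names (procedure and resume are just the names used above, not anything special) might look like this:

```tcl
proc procedure {} {
    yield first          ;# frozen here; "first" is what [coroutine] returns
    return last          ;# a normal return; this deletes [resume]
}
puts [coroutine resume procedure]        ;# prints: first
puts [resume]                            ;# prints: last
puts [llength [info commands resume]]    ;# prints: 0 -- [resume] is gone
```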

dzach 2013-1-3: "If [procedure] happened to yield, then the [resume] command is created." Is this right? Consider the following:

 proc procedure {} {
   puts [info command procedure]
   yield
 }
 coroutine resume procedure
 -> resume

It looks like the coroutine [resume] is created upon calling [coroutine] and not when [yield]ing.

AMG: Don't you mean:

proc procedure {} {
   puts [info command [info coroutine]]
   yield
}

The name of the coroutine command is "resume" ([info coroutine]) and not "procedure". But yeah, it does get created upon initial invocation and isn't deferred until [yield]. Most programs can't tell the difference, since during this window of time, the coroutine's procedure is in control.


Inside [procedure], [yield] returns the (optional) argument that was passed to [resume]. Inside the caller, [resume] returns the value that was returned or yielded by [procedure].

The [resume] command continues to exist so long as [procedure] yields, then it finally goes away when [procedure] returns.

Throwing and erroring are also considered returning. Yielding is distinct from returning in that it does not terminate execution, only pauses it.
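That two-way traffic can be shown in a few lines (the names echoer and talk are invented for this sketch; the expected results are in the comments):

```tcl
# [yield] sends a value out, and the argument given to the resume
# command comes back in as yield's own return value.
proc echoer {} {
    set in [yield ready]            ;# first resumption value lands here
    while {$in ne "stop"} {
        set in [yield "got: $in"]
    }
    return done                     ;# returning deletes the resume command
}
coroutine talk echoer    ;# returns: ready
talk hello               ;# returns: got: hello
talk stop                ;# returns: done (and [talk] is gone)
```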

Here's another way to look at it. The arguments to [coroutine] are as follows, but not in this order:

  • The procedure to carry out
  • A handle used to resume the procedure after it yields

I'm not sure what you mean by "command aliasing", unless you're trying to say that [resume] becomes an alias for [procedure]. I don't think this is the right way to look at things. [procedure] is a library routine like any other, waiting to be called, whereas [resume] is an in-progress execution of [procedure] that just happens to be paused at the moment. And unless [procedure] is currently in progress but paused, [resume] doesn't exist.

I hope this helps!

Hmm, actually I'm afraid this might be the kind of too-robust explanation you can't absorb all at once. Let me try again.

[coroutine] runs a command of your choice (let's say [procedure]) and returns its result. If [procedure] happens to yield instead of returning, [coroutine] creates a new command whose name you get to pick (here, [resume]). When you invoke [resume], [procedure] starts running again exactly at the point that it yielded. In fact, [yield] returns to [procedure] whatever argument you passed to [resume]. Similarly, [resume] returns whatever argument was passed to [yield]. [resume] can be used repeatedly, until [procedure] returns instead of yields, and then [resume] is deleted.

Basically, you have two threads of execution, except that instead of running at the same time, they're taking turns. The [yield] and [resume] commands are used to switch between them and pass data from one to the other.

Read this, write some code, then read my previous explanation. :^)

jblz: AMG, this is extremely helpful. Thank you very much.


DKF: A coroutine's execution starts from the global namespace and runs a command in a separate call stack that will eventually produce a result (which includes returns, errors, and a few other things) that will be returned to the caller, when the separate stack goes away. However, a coroutine can be interrupted by the yield command, which stops what the coroutine is doing and returns its value to the caller but leaves the separate stack in place. The coroutine can be resumed by calling its name (with an optional value – the resumption value), which causes execution of the coroutine to recommence exactly where it left off, halfway through the yield, which will then return the resumption value.

We can know that it always starts from the global namespace via this trivial non-yielding coroutine:

     % namespace eval bar {coroutine foo eval {namespace current}}
     ::

Note here that the coroutine is implemented by eval. This is legal! In fact you can use any Tcl command, but the most useful ones tend to be calls of procedures, applications of lambdas, or calls of object methods, since then you can have local variables. The local variables are not a feature of the coroutine, but rather of the command executed to implement it.
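For instance, a coroutine whose implementation is an application of a lambda, with the lambda's local variables carrying the state (the name accumulate is arbitrary; expected results shown as comments):

```tcl
# State (total) lives in the lambda's locals, not in the coroutine itself.
coroutine accumulate apply {{} {
    set total 0
    while 1 {
        incr total [yield $total]
    }
}}
accumulate 5     ;# returns: 5
accumulate 3     ;# returns: 8
```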

OK, so let's explore this resumption value stuff. This coroutine shows a neat trick:

     % coroutine tick eval {yield [yield [yield 3]]}
     3
     % tick 2
     2
     % tick 1
     1
     % tick 0
     0
     % tick -1
     invalid command name "tick"

Do you see how we can work out how many values are produced? There's no loop, so there's one value for each yield (going into the yield from the command substitution that is its argument) and one outer result, which is the outcome of the outermost yield.

OK, let's show off some real state:

     % proc count x {
         while {$x > 0} {
            yield "tick... $x"; # Ignore the resumption value!
            incr x -1
         }
         return "BOOM!"
     }
     % coroutine bomb count 3
     tick... 3
     % bomb
     tick... 2
     % bomb
     tick... 1
     % bomb
     BOOM!

Kroc: You should put this last example at the top of the page! That's the only one which allowed me to understand while the first one puzzled me.


NEM 2010-06-23: For those struggling with coroutines, it might be worth considering how they relate to existing Tcl constructs. Let's consider how we could solve the same problems if we had no built-in coroutine and yield commands. If we take DKF's example from the top of the page:

proc count x {
    while {$x > 0} {
        yield "tick... $x"
        incr x -1
    }
    return "BOOM!"
}

When run as a coroutine, this procedure yields the strings "tick... 3", "tick... 2", "tick... 1" and then finally returns the string "BOOM!". If we had no coroutines, we could implement this example by passing a callback command: just an ordinary command prefix that is called each time we want to yield a new value:

% proc count {x yieldCallback} {
    while {$x > 0} {
        $yieldCallback "tick... $x"
        incr x -1
    }
    return "BOOM!"
}
% count 3 puts
tick... 3
tick... 2
tick... 1
BOOM!

Another name for this style of callback is a continuation, because it represents the rest of the computation to perform. This style of programming should be familiar to those used to doing event-oriented programming, but it is also the way that all loop commands (e.g., while, foreach, etc) are written: the body of the loop is the continuation callback. This is a very powerful and general way of programming -- indeed, pretty much all possible control mechanisms can be coded this way (see e.g., control structures for backtracking search).
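As a tiny illustration of "the body of the loop is the continuation callback", here is a home-made loop command in the same style as while and foreach (the name repeat is invented):

```tcl
proc repeat {n body} {
    # The caller's body script is the continuation: we invoke it n times
    # in the caller's scope, just as the built-in loop commands do.
    for {set i 0} {$i < $n} {incr i} {
        uplevel 1 $body
    }
}
set total 0
repeat 4 { incr total 10 }
puts $total    ;# prints: 40
```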

However, programming in this manner can be cumbersome, as we have to manually construct and pass in these continuation callbacks. For examples such as this, it may not seem too onerous as there is just one such callback. However, (a) things are rarely this simple, and (b) performing such transformations on a large body of existing code can be tedious and error-prone. The coroutine mechanism essentially performs this task for us: silently rewriting all of our code to work in this way without us having to do anything more than start up a coroutine and insert yields where appropriate. The page coroutines for event-based programming has some more motivating examples, as does the original TIP 328 [1 ].

The example just given gives an idea of the power of coroutines, but is a drastic simplification. Rather than hiding just one continuation callback, the coroutine mechanism allows us to specify a new continuation every time we yield! To emulate this explicitly in Tcl, we can write our own version of yield that allows us to specify a new callback each time:

proc myYield value {
    # assume variable "yieldCallback" in caller's scope
    upvar 1 yieldCallback yieldCallback
    # invoke the yield callback (continuation)
    set newCallback [uplevel #0 [list {*}$yieldCallback $value]]
    # the result becomes the new callback (continuation)
    set yieldCallback $newCallback
    return
}

This is written out in verbose form to make it clear what is going on: we call the $yieldCallback continuation with our yield value, and expect this to return a new continuation. With this in place, we can now mimic more closely what coroutine provides:

% proc count {x yieldCallback} {
    while {$x > 0} {
        myYield "tick... $x"
        incr x -1
    }
    return "BOOM!"
}
% proc yieldCb1 str1 { puts "First yield: $str1"; return yieldCb2 }
% proc yieldCb2 str2 { puts "Second yield: $str2"; return yieldCb3 }
% proc yieldCb3 str3 { puts "Third yield: $str3"; return "" }
% count 3 yieldCb1
First yield: tick... 3
Second yield: tick... 2
Third yield: tick... 1
BOOM!

As you can see, while it is possible to mimic coroutine in pure Tcl, the result is not pretty! Indeed, the more complex the examples get, the more painful it becomes -- and our example doesn't even yet handle returning results from yield (which enables 2-way communication and elevates coroutines above mere generators)! (Hint: to add this feature, adjust myYield to expect a 2-element list from the callback continuation: the new continuation and the return value).
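For the curious, one way to follow that hint might look like this (a sketch, not the only possible shape; the two-element {continuation value} list is an invented convention):

```tcl
proc myYield value {
    upvar 1 yieldCallback yieldCallback
    # The callback now returns a 2-element list: the new continuation,
    # and the value for this yield to hand back to the "coroutine".
    lassign [uplevel #0 [list {*}$yieldCallback $value]] yieldCallback result
    return $result
}
```

Each callback then ends with something like return [list nextCallback $valueForYield], and the code calling myYield sees $valueForYield as the yield's result.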

Hopefully this provides some insight into what coroutines are and how they work their magic. Behind the scenes, the entire C implementation of Tcl has been adjusted by MS to transparently perform these transformations for us, so we can continue writing straightforward code in "direct" style and let the language take care of the complexities.

As a final point, you may be wondering if the coroutine mechanism relieves the need to write any code in continuation-passing style. The answer is almost, but not quite. The mechanism implemented strikes a careful balance between expressiveness and performance. The upshot of this is that there are certain categories of control structures which cannot be implemented with coroutines -- in particular, any kind of non-deterministic (backtracking) search, as used in logic programming cannot be implemented with coroutines alone. For these (rare) occasions, an explicit continuation callback is still the best approach. Everything else will yield before the mighty coroutine!

Kroc - I'm sorry Neil, but when you consider that the apply manual page is almost incomprehensible to me, I doubt anybody confused by coroutines (like I was) could understand them with examples based on it. ;)

My point of view is that Tcl was better before the TCT introduced all this confusing apply, coroutine and OO stuff.

NEM The first example uses only normal commands, no apply. I can rewrite the other examples to just use proc if you'd prefer? (Done) I'm sorry you feel that Tcl is evolving in a bad direction. My feeling is the opposite: these new features keep Tcl ahead of the pack in the power/simplicity trade-off.

AMG: apply is immensely powerful stuff, as are coroutines. I've heard rumors that OO is powerful too. :^) They may be confusing, but I think that's only because they offer power that was previously very difficult or impossible to implement in pure-Tcl. In other words, they're confusing because they're truly new additions, not merely sugar over existing functionality. Since they're new, it takes positive effort to learn what they are, when they're useful, and how to use them. You can choose to not use them, but in my opinion you will be missing out. Remember {*}? It was confusing for the same reason (it was a wildly new feature), but I now find it to be indispensable. Let me see if I can motivate each of the new features you mentioned:

apply: Allows you to pass executable code anywhere a command prefix is expected, without having to create a named proc. For instance, can be used as the argument to any callback option, such as [lsort -command] and [trace add].

coroutine: Avoids the need for callbacks and resumable procs with global state data when writing routines with interleaved execution. See Coronet and Wibble for an I/O example, and see Keep a GUI alive during a long calculation for a Tk example.

oo: Lets you cleanly bundle procedures and state data together into objects which can derive their definitions from other objects. Probably other stuff too; I admit I haven't used this in my own work. But I probably should, because as they mature, a great many of my projects wind up developing Yet Another mini-oo implementation.

Kroc - I don't know your background, but for somebody like me who didn't learn programming at school, these pretty features are very hard to understand (while {*} wasn't). It makes me feel Tcl is becoming a programming language for programmers, while it was a programming language for everyone. I would have preferred the TCT members to focus on things like tiled text and canvas widgets, bringing back text widget speed (to what it was before 8.5), printing facilities, core pdf support, etc.

But as I'm not a school programmer, I'm probably wrong and you're probably right. ;^)

NEM - I'm disappointed that you feel coroutine and apply etc are hard to learn. They're supposed to make life easier, and if that's not apparent, then it would be good to discuss how they could be improved and better documented. However, note that the TCT can only vote on features that are TIP'ed -- if you feel a different direction is more appropriate, then the TIP process is the way to influence that.

DKF: FWIW, the apparent slowdown in text is due to the fact that it now does smooth scrolling, which required a major rewrite of its internals.

Kroc - My goal wasn't to disappoint you, Neil; I'm sorry for that. I think I'm simply lacking the programming basics needed to understand the kind of concepts apply, coroutine and OO use: it's not your fault, nor mine.

As for TIPs, I can't do anything: the TIP process requires C knowledge I don't have. So I'll keep happily following Tcl's evolution.

AMG: A word about my background: I'm a self-taught programmer. I went to school not to study programming but rather to study computer science and computer engineering; I didn't take any classes for learning programming languages.

I know this page is supposed to be about coroutines, but I'll go ahead and describe apply because it's pretty easy to explain and if you can understand that one concept, then maybe you can get over whatever mental block you have. Probably this part of the discussion should be moved to a more appropriate location some time in the future.

apply is a Tcl command. That's the first thing to understand. :^) Its first argument is what's called a lambda, which is a script plus a parameter listing (and an optional namespace, which I almost never need); basically the same stuff you would pass to proc, except grouped into a single list. The rest of the arguments to apply are used as arguments to the script in the lambda.

This shows the relationship between proc and simple two-element lambdas:

proc make_proc_from_lambda {name lambda} {
    proc $name [lindex $lambda 0] [lindex $lambda 1]
}

Here's an example showing how to use apply. apply's first argument is a lambda that takes one argument "number" and returns "even" or "odd" depending on that number. apply's second argument is 20, which is passed to the lambda, which in turn returns "even".

apply {{number} {
    if {$number % 2 == 0} {
        return even
    } else {
        return odd
    }
}} 20

Simple, but what good is this? Here's where things start to get interesting:

proc map {lambda list} {
    set result {}
    foreach element $list {
        lappend result [apply $lambda $element]
    }
    return $result
}
map {{number} {
    if {$number % 2 == 0} {
        return even
    } else {
        return odd
    }
}} {1 2 3 4 5 6}

The lambda is now passed to a proc that applies it to every element of a list. The result is "odd even odd even odd even".

But wait! What if map had been written by someone else a long time ago, before apply even existed? Can apply still be used? Yes, let me show you:

proc map {command list} {
    set result {}
    foreach element $list {
        # Classic "script prefix" style:
        lappend result [eval $command [list $element]]
        # Newer, safer, faster "command prefix" style:
        # lappend result [{*}$command $element]
    }
    return $result
}
map {apply {{number} {
    if {$number % 2 == 0} {
        return even
    } else {
        return odd
    }
}}} {1 2 3 4 5 6}

Same result. You can use this technique to integrate lambdas (parameter list + script, all in anonymous value form) with existing callback-style code, like [trace].

Maybe an analogy will help you to understand the nature of lambdas and how they can be useful. Remember when dict was added? It gave us an alternative to arrays that could be passed around more easily, plus other benefits. apply and proc have the same relationship, except that the restrictions on proc are more severe than on arrays (arrays can be local variables, but procs are always global).
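The parallel can be seen in miniature: both dicts and lambdas are plain values, so they pass through procs as ordinary arguments, with no named global state (the helper names firstKey and callTwice here are invented):

```tcl
# A dict is a value: it travels as an ordinary argument.
proc firstKey d { lindex [dict keys $d] 0 }
puts [firstKey {a 1 b 2}]    ;# prints: a

# A lambda is a value too: it travels the same way.
proc callTwice {cmd x} { {*}$cmd [{*}$cmd $x] }
puts [callTwice {apply {{n} { expr {$n * 2} }}} 5]    ;# prints: 20
```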

Kroc - At first sight, the apply syntax is really horrible to follow, so it doesn't help understanding. I have to read it slowly and carefully to try to understand how it works and what it does. After a while I got it, but it seems to me that's something very complicated that I can easily reproduce in a much simpler way:

proc oe {numbers} {
    set res [list]
    foreach num $numbers {
        if {$num % 2 == 0} {
            lappend res even
        } else {
            lappend res odd
        }
    }
    return $res
}
oe {1 2 3 4 5 6}

Maybe that's because you take a simple example to help me understand, but then, I can't imagine a case where apply would simplify my code. I also don't understand why your (global) map proc is better than my (global) oe proc.

On the other hand, the very first example of coroutines at the top of this page gave me a good example I can't reproduce easily without coroutine (and this example should be put in the coroutine manpage).

AMG: Using apply tends to ramp up the number of braces in your program, because lambdas must be contained in a single word, whereas proc lets you put the parameters and the script in separate words. This is a necessary evil, because in order for lambdas to be bytecode-compiled they must live in a single Tcl_Obj (i.e. word). But consider also that if lambdas were split into two words, it would be much harder to pass them around, to store them in dicts, and so on. If it really does bother you, it's simple to wrap apply (adapted from code by NEM [2 ]):

proc lambda {params body} {
    list apply [list $params $body]
} 
map [lambda {number} {
    if {$number % 2 == 0} {
        return even
    } else {
        return odd
    }
}] {1 2 3 4 5 6}
apply {{args} {puts $args}} hello world     ;# using apply directly
{*}[lambda {args} {puts $args}] hello world ;# using NEM's lambda constructor
set print [lambda {args} {puts $args}]      ;# store the lambda into a variable
{*}$print hello world                       ;# and use it

But that's probably not your real complaint with apply. The issue is that no matter how simple it is made, it will still involve more syntax than inlining the code (which your example did) or using procs directly. You want to know how apply could ever be used to simplify your program. It's like this: apply is potentially useful any time you have to pass code from one place to another, especially when that code takes arguments. Without apply, it would be necessary to pass the name of a proc, which seems nice and simple until you realize that you have to find a globally unique name for that proc and that you're responsible for deleting it when it's no longer needed. You don't have to worry about that when you have apply: just pass the code directly, without having to package it into a proc and pass it by name. Of course, for specialized (non-generalized) applications, you can also inline the code and not have to pass it around at all; again, that's what your example did. One great example of a generalized application is the Tcl/Tk core, which is designed to serve the needs of the widest possible variety of programs. There's no way to inline code into the Tcl/Tk core commands.

By the way, there's an alternative design to passing proc names. [bind] and some other commands take code templates with %placeholders that they replace with the arguments, then they eval the result. This is convenient for simple uses, but it prevents bytecode'ing from taking place and is therefore very slow. Also it constitutes Yet Another little language that has to be learned, since its notation is different than ordinary parameter substitution.
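Outside of Tk, the two designs can be contrasted with a pair of toy dispatchers (both procs here are invented for illustration; [bind] itself works like the first one):

```tcl
# Template style, like [bind]: %v is pasted into the script, then eval'd.
# The script is re-parsed on every call and needs its own quoting rules.
proc fireTemplate {template value} {
    eval [string map [list %v [list $value]] $template]
}
fireTemplate { puts "got %v" } hello        ;# prints: got hello

# Command-prefix style: the value is passed as a normal argument,
# so an [apply] lambda (or any command prefix) slots straight in.
proc firePrefix {cmdPrefix value} {
    {*}$cmdPrefix $value
}
firePrefix {apply {{v} { puts "got $v" }}} hello    ;# prints: got hello
```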

I suggest thinking some more about the parallel between proc/apply and array/dict. dict changed associative maps from being named interpreter state objects to being simple values; it brought them into the realm of the Tcl_Obj. apply does the same for executable parameterized code. Consider the cases where dict makes your program simpler, despite the fact that [dict get $varname $index] involves more typing than $varname($index). apply is the same way: it can make simple things more complex (don't use it when you don't need it), but it also can make complex things more simple (definitely use it when it's helpful).

By the way #2, in both cases I wish there wasn't more typing. I would like for $varname($index) to be a dict lookup, and I would like for [foo bar] to be the same as [apply $foo bar]. But both wishes are problematic and unrealistic and must be accompanied by extensive, invasive, pervasive language changes, so I reserve them for Cloverfield.

apply is simply Tcl's vehicle for lambdas. To understand when apply is useful, you will need to research the cases where lambdas are useful.

Kroc - I understood apply when I wrote the page for the french wiki with very simple examples (without the map and lambda procs I see in all the examples here). These basic examples shrank the steps of the stair of understanding and made it easier to climb.

AMG: Yes, but none of the examples on the French wiki serve to demonstrate how apply could be useful. They can all be written more simply without apply, which is pretty much your complaint: code that uses apply would be simpler without it. My argument is that there also exists code that can be simplified by adding apply. I used map in my examples because it's a very common, simple, and well-understood functional operation. But perhaps its simplicity does not serve us in this case, because it's also simple to not use map and instead do foreach and lappend directly, in imperative programming style. Instead consider something complex, like [lsort -command].

lsort -command {apply {{a b} {
    set result [expr {[string length $a] - [string length $b]}]
    if {$result == 0} {
        set result [string compare $a $b]
    }
    return $result
}}} {red orange yellow green blue indigo violet}

Writing this without [lsort] would be very difficult. Writing it without [apply] would be pretty simple BUT would require the creation of a proc which might only get used once and will pollute some global namespace. Wait a minute, that's not what I said earlier, that apply would make this simpler... I guess it depends on your definition of simple!

You know what? I think I understand your point. The truth is that you and I live in different worlds. You're focused on imperative programming, because that's classical Tcl, that's what you've taught yourself, and that's what you're comfortable with. I'm looking at functional programming, because it's cleaner due to minimizing manually managed global state objects. I think I've shown that I'm not very good at arguing the merits of one over the other, so let me just say everyone can have his personal preference. apply provides lambdas, a central concept of functional programming, but if you're strictly imperative, you don't need apply. Right? I'm sure I'm overlooking something. Somebody else please jump in here and show how apply could be gainfully employed in an imperative context.

To me, functional programming seems to be a unification of code and data, such that code can be data and therefore code can be manipulated the same as data. This recalls Tcl's everything is a string philosophy, which is also a unification of types.

Kroc - I think you're right, Andy: I'm uncomfortable with functional programming (which seems to me harder to understand than imperative programming). I added your last example to the french page and I'll get back to this if/when I improve my functional programming basics.


Potrzebie - 2013-01-23 20:33:21

Please move this good discussion about apply's usefulness to the apply page.

DKF: This is a wiki; move it (or add links) yourself. :-)