multi-arg coroutines

coroutine has been modified to permit multiple args to be passed in.

yield accepts at most one argument on resumption and returns it as a single value, as before.

yieldm returns args passed in as a list.
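A minimal sketch of the difference, assuming Tcl 8.6, where yieldm is only reachable as ::tcl::unsupported::yieldm. The names collector and c are made up for illustration:

```tcl
# Assumes Tcl 8.6; yieldm lives in ::tcl::unsupported there.
proc collector {} {
    # suspended in yield: the next resume takes 0 or 1 argument
    set single [yield ready]
    # suspended in yieldm: the next resume takes any number of arguments
    set many [::tcl::unsupported::yieldm $single]
    return $many
}

set r1 [coroutine c collector]   ;# runs up to the first yield
set r2 [c one]                   ;# yield returns the single value
set r3 [c a b c]                 ;# yieldm returns the arguments as a list
```

Here r1 is ready, r2 is one, and r3 is the three-element list a b c.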

To bridge the two behaviours, consider this shim:

# temporary compatibility shim for coroutines
# handle new coro interface
if {[llength [info command ::tcl::unsupported::yieldm]]} {
    namespace eval tcl::unsupported namespace export yieldm
    namespace import tcl::unsupported::yieldm
    interp alias {} ::Coroutine {} ::coroutine
} else {
    # the new yieldm multi-arg coro call does not exist.
    # this is the older coroutine implementation
    interp alias {} ::yieldm {} ::yield

    proc ::delshim {name x y op} {
        catch {::rename $name {}}        ;# delete shim
    }

    proc ::Coroutine {name command args} {
        # determine the appropriate namespace for coro creation
        set ns [namespace qualifiers $name]
        if {![string match ::* $ns]} {
            set ns [uplevel 1 namespace current]::$ns
        }
        set name [namespace tail $name]

        # create a like-named coro
        # create a like-named coro; coroutine returns the first yielded
        # value, not the command name, so build the name explicitly
        set x ${ns}::_$name
        uplevel 1 [list ::coroutine $x $command {*}$args]

        # wrap the coro in a shim
        proc ${ns}::$name {args} [string map [list %N% [list $x]] {
            tailcall %N% $args        ;# wrap the args into a list for the old-style coro
        }]

        # the two commands need to be paired for destruction
        trace add command $x delete [list ::delshim ${ns}::$name]
        trace add command ${ns}::$name delete [list ::delshim $x]

        # tell it we created the one they requested
        return ${ns}::$name
    }
}

Historical Argument

I wish to consider amending TIP 328, along these lines:

Invocation of a coroutine should accept multiple arguments, and those arguments should be returned to the coroutine's yield as a list of actual parameters.

The current implementation forbids invoking a coroutine with more than one argument. I have found, in practice, that many of my coroutine invocations are naturally like other command invocations, and take multiple arguments.

The case for providing multi-arg'd coroutines is:

1. Coroutines should be able to simulate any command, not just any single-arg'd command. [generality]
2. To implement single-arg'd coroutines in multi-arg'd coroutines is trivial - nothing needs to be done. The converse (implementing multi-arg'd coroutines under coroutine) is inefficient and difficult. [increased expressive power]
3. There is no sound reason that the invocation of a coroutine should not resemble that of any other command. [principle of minimal surprise]
4. It is impossible to construct forms like interp alias to address a coroutine if the caller may pass more than one argument.

For these reasons, the core should be modified to accept multiple actual parameters to a coroutine invocation, yield should be modified to resemble yield2 below, and a new interface, yieldm should be created to return the actual parameter list as it's passed.

Arguments in favour of single-arg'd coroutine:

1. Extra cost in packing (at invocation) and unpacking (in yield) the actual parameters. Total cost is building a 1-element list and 1 lindex, both in C. [performance]
2. The potential that coroutine invocation might require options to control the invocation in ways simple command call doesn't. Examples: $coro -code return, which would be interpreted as an unconditional termination. [expressive power]
3. (a) $coro and yield are properly considered as symmetrical operations, (b) $coro v. yield is an attractive/consonant way to represent their symmetry, (c) having multi-arg'd $coro would violate or misrepresent that symmetry. [underlying symmetry]
Are there any other objections to multi-arg coros?

Rebuttals to arguments against multi-arg'd coroutine:

1. The costs of packing/unpacking are insignificant, and multi-arg'd use is more common.
2. While it is useful to be able to cause a coro's yield to return an exception, this would be as easily served by using a distinct command to invoke a coro with return-like semantics.
3. The symmetry is not served by the current form of representation: yield does not resemble $coro. The symmetry is not as useful as the ability to simulate.

Demonstration of point 2 [increased expressive power]

CASE 1: (counterfactual)

Assume a Coroutine which generates a command taking multiple args, to implement coroutine as we have it implemented:

Coroutine would require no wrapping or changes to function as coroutine does now. Only yield would have to change.

To provide precisely the same functionality as yield currently does, it is necessary to strip off a single layer of list:

proc ::yield2 {value} {
    return [lindex [::yield $value] 0]
}

No other changes are necessary. More likely, one would define yield like that, and create a new yield-variant which returned the whole invocation arg list.
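The counterfactual can be tried out today on top of yieldm, which hands back the whole argument list. This is only a sketch, assuming Tcl 8.6 and ::tcl::unsupported::yieldm; the name yield1 (standing in for yield2 above) and the echo coroutine are invented for the example:

```tcl
# Assumes Tcl 8.6; yieldm stands in for a list-returning ::yield.
proc yield1 {{value {}}} {
    # strip one layer of list to recover old-style single-value yield
    return [lindex [::tcl::unsupported::yieldm $value] 0]
}

proc echoBody {} {
    set v [yield1]
    while 1 {
        set v [yield1 "got $v"]
    }
}

coroutine echo echoBody
set r1 [echo hello]          ;# single word
set r2 [echo {two words}]    ;# single argument containing a space
```

Both calls see exactly what single-arg yield would have delivered: r1 is "got hello" and r2 is "got two words".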

CASE 2: Coroutine in coroutine - implementing multi-arg'd coroutines over single-arg'd coroutine

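proc Coroutine {name command args} {
    set ns [namespace qualifiers $name]
    if {$ns eq {}} {
        set ns [uplevel 1 {namespace current}]
    }
    set name [namespace tail $name]

    # start the real (single-arg) coroutine under a private name
    set coco [::coroutine ${ns}::_C$name $command {*}$args]
    # when the coroutine goes away, remove the wrapper too; the trace
    # appends oldName newName op, which the lambda swallows in args
    trace add command ${ns}::_C$name delete [list apply {{cmd args} {
        catch {rename $cmd {}}
    }} ${ns}::$name]
    proc ${ns}::$name {args} {
        set name [lindex [info level 0] 0]
        set ns [namespace qualifiers $name]
        if {$ns eq {}} {
            set ns [uplevel 1 {namespace current}]
        }
        set name [namespace tail $name]

        # pack the actual parameters into the single arg the coro accepts
        tailcall ${ns}::_C$name $args
    }
}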

The predominant cost in this is that of tracing intermediate commands to avoid leakage. Even if this were not the case, the cost of calling a proc to wrap the extra args is considerable. The only alternative is to wrap the args on *each* invocation.

One can provide variable assignment by signature (or Occam-like protocol):

proc entrypoint {value args} {
    # build the command as a list so the yielded values survive uplevel's concat
    uplevel 1 [list lassign [::yield $value] {*}$args]
}

This is possible in current coroutine the same way, but requires the caller to form args into lists on each invocation.
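A sketch of how entrypoint might read with a list-returning yield, so that callers need not pack their arguments. It assumes Tcl 8.6 and ::tcl::unsupported::yieldm; the names entrypointm, areaBody and area are illustrative only:

```tcl
# Assumes Tcl 8.6; yieldm supplies the actual parameters as a list.
proc entrypointm {value args} {
    # args holds the formal parameter names chosen by the caller
    uplevel 1 [list lassign [::tcl::unsupported::yieldm $value] {*}$args]
}

proc areaBody {} {
    set result {}
    while 1 {
        entrypointm $result width height   ;# sets width and height here
        set result [expr {$width * $height}]
    }
}

coroutine area areaBody
set r1 [area 3 4]
set r2 [area 5 6]
```

The invocation supplies only values; r1 is 12 and r2 is 30.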

MS notes that this would require that the invocation's arguments be a list of {name,value} pairs. This breaks the wanted analogy to proc, where the assignment to variables is positional and not by name. It would be possible to mimic proc perfectly, but in that case scripting the current coroutine functionality becomes cumbersome.

CMcC was thinking of it as a wrapper to yield with the effect of lassigning the actual parameters to caller-local variables. The coro invocation would only provide the values, the caller of entrypoint would provide the formal parameters (that's the analogy.)

jmn 2010-04-15 :

I totally agree. I was thoroughly dismayed by the single arg coroutine thing. It just seems to go against the grain of the “Tcl way” - for no real advantage. If it was multi-arg'd it would present an interesting way to build some command alternatives along the lines of existing mechanisms such as interp alias. Having to wrap it to achieve this is ugly enough to discourage this sort of innovation especially if the whole point of the innovation was to do so in a situation where dispatch performance matters.

nem: notes that performance will be dominated by the cost of the coroutine context switch (quite high currently). Also, wrapping a coro to accept multiple args is trivial:

proc apply-list {cmd args} {$cmd $args}
do stuff [list apply-list $coro]

On the other hand, forcing all coros to be multi-arg means that yield then must return a list, and the extremely common single-arg case then needs to remember to use [lindex [yield] 0] everywhere.

MS: An alternative is to provide a new command (coroutine2?) which creates multi-arg commands. In that case, yield can be modified in C to do the right thing depending on the nature of the enclosing coroutine. For some reason which I'm not clear about (paternity?), this would be my current preference.

A second command would also make it easy to allow the actual arguments to be passed positionally, as in proc. Some syntax would be needed to allow the invocation to also pass the internal result from yield, if one is wanted. Maybe something like

coroutine coro2Cmd ::apply {{x1 x2} {...}} 42 42
coro2Cmd -yieldResult foo 11 22

that would cause the coro to be created with x1=x2=42. When it is later resumed: x1 is set to 11, x2 is set to 22, and yield returns foo. If -yieldResult is not specified, yield returns {}.

CMcC: this really doesn't make coro invocation look like normal command invocation, though. My preference is for something where a caller doesn't need to know that what it's calling is a coro. The reason I prefer this is that I can't see why a caller should need to know, or should have to consider anything unusual when calling a command which might happen to be implemented as a coro. The specific implementation you sketch, with special options to be interpreted, doesn't provide for justification (1) above. Additionally, it's not possible to pass -yieldResult in if you wanted to.

MS: At least one of us misunderstands the other :P In my proposal, if the caller doesn't know this is a coro, he uses it as a normal command

coro2Cmd 11 12

OTOH, if he does know it is a coro he has the option of also sending in a result for yield. Note that I am not really proposing this as a solution, just an option that really mimics normal command invocation - with positional semantics for the arguments.

CMcC: I did misunderstand. You were suggesting that the coroutine command somehow ascertain by inspection the formal parameter set of its second arg, and use that as a kind of occam protocol for later interaction. I don't think that can work, can it? You have no way to inspect, for example, a C-coded command to discover its formal parameter signature, nor (really) to be certain that a given command was created by a given implementation of a given other command. I think what you seem to be suggesting is impossible, unless I still misunderstand it.

I am attracted to the idea that a coro is just like any other command in terms of its invocation (because it enables a coro to simulate any other command.)

I dislike the -yieldResult idea because I can think of no arguments one can pass to a command which are interpreted by the invocation mechanism itself prior to the command gaining control. What you propose with your example -yieldResult is to make command invocation get involved, at a C level, unable to be intercepted by Tcl, to interpret (as modifiers of the invocation) things which would normally be considered arguments to the command, and this only for coros. To me, that seems like a special case with wide implications (you can't pass “-yieldResult” as the first argument to any coro) and no evident benefit (at least, I don't understand what it buys you that can't be achieved differently.)

I understand there's also an argument that there may be other things one would like to do to a coroutine than invoke it in the usual manner one invokes commands. For things which operate *on* a coro rather than through it (say, for example, injecting an error into it) I would suggest a completely different command, say coro_op kill $coro or indeed coro_op error $coro $eo to cause yield to complete with error dict specified by $eo.

CMcC: I wouldn't mind if there were two ways to create coroutines, with a new form creating a multi-arg coro, but I don't think that solution is necessary, or as neat as providing a second yield and making multi-arg the standard behaviour of coroutine.

What's your objection to this change, Miguel? By the time the coro is invoked, one has already parsed the arguments to its invocation in order (currently) to complain that they don't form a singleton set. I presume that by that stage they exist in a list form anyway, and that coro invocation needs to pick out the first argument to return as yield's result anyway. Surely the performance cost of duplicating the lrange $invocation 1 end is not significantly greater than that of lindex $invocation 1, and in any case is exactly what a proc command invocation would have to do (without the necessity of then assigning them to their corresponding formal parameters.)

Lars H: How about letting yield accept a second argument that, like the args argument of proc, specifies the arguments the coroutine will take the next time it is called? This would mean that after

yield $value {foo bar {baz "apa"} args}

the variables foo, bar, baz, and args in the local context would all have been assigned according to the values supplied in the coro call (which would throw an error unless at least two arguments are supplied). This does open up for coroutine commands having wildly varying syntaxes, but on the other hand it is already a consequence of the fact that you can yield just about anywhere that coroutine commands can have semantics that vary wildly from call to call.
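Something close to this can be scripted over a list-returning yield. The following is only a sketch of the idea, not a proposal for the real yield: yieldargs is a hypothetical helper, and it assumes Tcl 8.6 with ::tcl::unsupported::yieldm. It borrows apply to get proc-style matching of defaults and a trailing args:

```tcl
# Hypothetical helper: yield VALUE, then bind the next call's arguments
# to caller-local variables according to a proc-style ARGSPEC.
proc yieldargs {value argspec} {
    set actuals [::tcl::unsupported::yieldm $value]
    # apply performs the matching (defaults, trailing "args", and the
    # wrong-#-args error) and hands the bindings back as a dict
    set bound [apply [list $argspec {
        set __d [dict create]
        foreach __v [info locals] {
            if {$__v ni {__d __v}} { dict set __d $__v [set $__v] }
        }
        return $__d
    }] {*}$actuals]
    dict for {var val} $bound {
        uplevel 1 [list set $var $val]
    }
}

proc workerBody {} {
    set out {}
    while 1 {
        yieldargs $out {foo bar {baz apa} args}
        set out [list $foo $bar $baz $args]
    }
}

coroutine worker workerBody
set r1 [worker 1 2]           ;# baz takes its default, args is empty
set r2 [worker 1 2 3 4 5]     ;# extra values land in args
```

A call with fewer than two arguments raises apply's wrong-#-args error inside the coroutine, which then terminates; a real implementation would presumably want to report that to the caller without killing the coroutine.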

CMcC 2010-04-20 03:18:35:

I want to explore the taxonomy of commands which create commands here: Creating Commands

dkf 2010-04-21 08:12:40:

I suppose we should ask what the anatomy of a coroutine is before analyzing how easy it is to do tricks.

A coroutine is a stateful entity created with:

coroutine name definition...

I'll primarily ignore the nature of the definition for now. After that is called, there will be a new command, the coro-command, called name, existing, which can be invoked to pass in a value to the coroutine. Within the coroutine context, yield will produce a result from the coro-command and info coroutine will report the name of the coro-command. Upon deletion of the coro-command, the coroutine is deleted, and the coro is also deleted when the definition terminates (yielding isn't termination).
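That lifecycle can be observed directly. A small sketch, assuming Tcl 8.6 and evaluation in the global namespace; gen, g and h are illustrative names:

```tcl
proc gen {} {
    yield [info coroutine]   ;# first result: the coro-command's own name
    return done              ;# the definition terminating deletes the coro
}

set name  [coroutine g gen]                 ;# value produced by the first yield
set alive [llength [info commands ::g]]     ;# coro-command exists while suspended
set last  [g]                               ;# resume; the definition returns
set gone  [llength [info commands ::g]]     ;# coro-command was removed with it

coroutine h gen
rename h {}                                 ;# deleting the coro-command deletes the coroutine
set hgone [llength [info commands ::h]]
```

Yielding leaves the command in place (alive is 1); termination or deletion removes it (gone and hgone are 0).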

The issues where I see there are problems now:

1. yield has an over-simplified model of result that lacks the sophistication of Tcl's general result tuple model. (Making it more return-like would solve this.)
2. The coro-command behaves as a very restricted type of Tcl command.
3. Any wrapper of the coro-command needs to preserve the deletion behaviour and the info coroutine reporting; they are critical features. That makes wrapping to allow a multi-arg accepting coro-command rather tricky; any injected proxy needs to be made very carefully and info needs diddling too. (The last part is easier than it used to be, of course.)

NEM: I very much agree with point (1). I'd love to see yield (and $coroCmd) be able to pass back exception values and the options dictionary. TIP 328 demonstrates a wrapper approach to achieving this:

proc exyield {value args} {
    lassign [yield [list $value $args]] value opts
    return -options $opts $value
}

proc exresume {coro value args} {
    # unpack the coroutine's reply into a value and its return options
    lassign [$coro [list $value $args]] value opts
    return -options $opts $value
}

# Usage
proc mycoro {} {exyield $val -code error -errorcode $somecode ...}
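A self-contained run of the wrapper pair, as a sketch assuming Tcl 8.6. The two procs are repeated so the example stands alone, with exresume unpacking into value and opts so that return -options has something to use; checkerBody and chk are made-up names:

```tcl
proc exyield {value args} {
    lassign [yield [list $value $args]] value opts
    return -options $opts $value
}
proc exresume {coro value args} {
    lassign [$coro [list $value $args]] value opts
    return -options $opts $value
}

# Doubles integers; anything else is reported to the resumer as an error.
proc checkerBody {} {
    set n [exyield ready]
    while 1 {
        if {[string is integer -strict $n]} {
            set n [exyield [expr {$n * 2}]]
        } else {
            set n [exyield "not an integer: $n" \
                       -code error -errorcode {CHECK BADINT}]
        }
    }
}

coroutine chk checkerBody
set r1     [exresume chk 21]
set failed [catch {exresume chk abc} msg opts]
set ecode  [dict get $opts -errorcode]
set r2     [exresume chk 4]   ;# the coroutine survives having yielded an error
```

Both ends must go through the wrappers; calling chk directly would expose the raw {value options} pairs.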

This works ok -- we debated whether to propose this in the TIP, but felt it was best left as a possible future enhancement. Point 2 was also discussed, however I felt that the simple wrapper approach works fine, and that adding option 1 would be more difficult/impossible if coroutines accept multiple args. Regarding point 3, deletion behaviour is preserved by curry (a curried command is just a command prefix, like a lambda). I think info coroutine is a red-herring: any code that calls that function will know that the command returned only accepts a single argument. I can't think of a situation where you would call info coroutine and then expect the resulting command to accept multiple arguments.

Note also that $coroCmd and yield are two ends of a communication "channel": whatever bells and whistles you want for one end also make sense for the other. So, it makes sense for $coroCmd to also accept return-style options, if yield does, and it may also make sense for yield to accept multiple arguments, if $coroCmd does. How to achieve these extras in a consistent and elegant manner is quite tricky, which is one of the reasons why we decided to keep to just a single argument. My personal preference (at the time, and still now) would be for these features to be implemented in tcllib as a higher-level interface over the core coroutine technology.

CMcC 2010-04-21 19:03:09:

As Creating Commands shows, coroutine is the only command generator which generates a one-arg command, and as the example code above shows it is quite difficult and expensive to work around that quite arbitrary limitation.

Whether one chooses to conceptualise a coroutine as a stream or not, unlike the other streams, e.g. the opaque handle of open, one certainly cannot conceptualise it as a general command form, even though it is presented as one. Considered as a stream, coroutine has no reason to generate a command unless that command functions as any other command.

This arbitrary limitation could easily be removed, such that existing single-arg uses are facilitated. Anything the current form permits (but, notably, does not yet provide) could be provided by coroutine-specific commands or ensembles as such facilities are provided in every other command-generated command form associated with state.

dkf writes "the coro-command behaves as a very restricted type of Tcl command" and NEM replies that this "was also discussed, however I felt that the simple wrapper approach works fine." I would contend that as coroutines have been used, it has become increasingly clear that it does not work fine. It is possible to contort any new application of coroutines to work with this arbitrary limitation, however it is not possible to use a coroutine to simulate other commands because of this limitation, and that is a crying shame as it unnecessarily reduces their immediate utility.

NEM then goes on to say that adding option 1 would be more difficult/impossible if coroutines accept multiple args, but of course this is demonstrably false. Each of the Creating Commands commands manages it perfectly well through specific accessors. For example, interp invokehidden can be used to invoke a command when normal command invocation won't cut it.

It seems to me that it is worse to generate a command with surprising properties, such that it can't simulate a general command, than it would be to generate no command at all, and operate on coroutines purely by opaque handles. This would at least preserve the possibility of using coroutines as general commands, by means of a wrapper (as is currently possible) without them polluting the command namespace.

If we see the act of yielding as a communication event, and have no compunction requiring a coroutine-specific command (consistent with thread::send, interp invokehidden, namespace eval, interp eval, etc. ad nauseam) then why would we find it necessary to force the inverse and symmetric operation (of sending a message to a coroutine) through a limited command? If and as, and to the extent that, the operations are symmetric, surely they should have the same basic form? I begin to see the $coro form as a special case hack, and not a general purpose command, and then see the wisdom of removing it from the general command namespace.

NEM: Coroutines aren't presented as a general command form, they are presented as a 1-arg command. Personally, I consider opaque handles to be rather old-fashioned. It seems preferable to me to use commands to represent all such opaque resources -- either in an OO fashion, or as simple commands where the interface is simple. Perhaps in hindsight, an OO approach would have been preferable (allowing different methods of invoking a coroutine and of yielding to coexist).

I still don't understand why you feel it is not possible to adapt a coroutine to accept multiple args, despite multiple demonstrations to the contrary. I'd also be interested to see a concrete case where the current coroutine design "does not work fine". This has not been my experience. I also don't see in what way yield and $coroCmd are not symmetric: they both accept a single optional argument, and return a single value.

However, as I respect your work, and I appreciate that you are (with Wub) one of the main users of coroutines at present, I'm willing to compromise. If we can support multiple arguments in a way that is symmetric (i.e., both $coroCmd and yield can accept/return multiple args) and supports the single-arg case well, then I will accept it.

CMcC 2010-04-22 08:11:06:

NEM, it is not possible for a coroutine to accept multiple args, it is possible to use a shim to curry multiple args into the single arg that a coroutine accepts, this is known and well understood. It is no more an argument for single-arg coro than it is an argument for single-arg proc.

I have found many uses for coroutines. I have never found it desirable to consider a coroutine to be a communication endpoint. All my uses have been of a coroutine as a function. In all my applications the representative power of multiple argument passing is far more important than a putative symmetry which it is my desire to hide from a caller, and which does not matter to a caller.

If one considers a coro as a comms endpoint, then the case for making it also a command is far weaker than that for making it an opaque token, as no other communication endpoint has a dual command nature, although it could: socket could return a command, open also. They do not, because they do not represent functions. It is only the nature of coroutine as function which justifies its being presented as a command, and this functional nature is more important than an analogy with communication endpoint.

As to your compromise ... yield and $coro are not symmetrical in appearance. It seems to me that if you were actually after symmetry, you would forget yield and use info coroutine instead. And, failing that, you would have no objection to a yield which could take a $coro arg, and function as invocation. At least those would give the appearance of symmetry. I am, of course, arguing for something akin to the latter form. Your desired symmetry would be apparent (rename yield to something which is not so patently asymmetric) and you would not need the command-representation which you do not, in any case, use to its full effect.

NEM: I really don't understand this. The "shim" as you describe it fully solves the problem of multi-arg coroutines:

proc resume {coro args} {tailcall $coro $args}
coroutine _test apply {{} {
    while 1 {
        lassign [yield] a b c
        puts "got 3 args: $a $b $c"
    }
}}
interp alias {} test {} resume _test
test 1 2 3
test 4 5 6

I can understand wanting multi-arg coroutine commands, but I really don't understand why you keep rejecting a single line procedure solution to that problem. I'm yet to see any kind of real argument for why this is not a practical solution. All the coroutine code on this wiki seems to work just fine.

I also have no objection to a symmetrical yield, and it is easy to achieve -- resume does just that!

resume $coro ...;# to pass value(s) into a coroutine
resume yield ...;# to pass value(s) out of a coroutine

The only asymmetry is simply that one has a unique name and one has a generic name (resolved according to context). This is not particularly different to the use of $object and self in OO systems.

AMG: Regarding "it is impossible to construct [a] general fileevent to pass more than one argument to a coroutine": fileevent simply executes a script; it doesn't append arguments or do any other alteration to the script. It just runs it. So what special restriction does fileevent face? It's impossible for anything to pass more than one argument to a coroutine, but this is not a practical limitation, because that one argument can contain as much data as needed, and because wrapper procs or lambdas can be used to make coroutines usable in multi-argument command prefix contexts. What's different about fileevent that makes it worth singling out, and what exactly is impossible in this case?

CMcC: I'm sorry, you're right, thank you. All the other things which take a prefix and append args are impossible, but not fileevent: interp alias, Tk callbacks, socket connections, trace commands. These shouldn't be subject to that limitation.

My argument is 'there is a class of use cases where it is desirable to invoke something with multiple args, coroutines cannot be invoked in those contexts.' It is not rebutted by the obvious, true, and completely irrelevant statement that something which is not a coroutine can. That is, rather, my point. So yes, one can write a shim to wrap a coroutine. That still does not make coroutines as functional and useful as they might easily be.
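For illustration, here is what direct use in a prefix-plus-appended-args context looks like once the coro-command accepts several arguments. This is a sketch assuming Tcl 8.6 and ::tcl::unsupported::yieldm; loggerBody, ::logger and the alias names are invented:

```tcl
# Assumes Tcl 8.6; the coroutine is suspended in yieldm, so its
# command accepts any number of arguments.
proc loggerBody {} {
    set reply {}
    while 1 {
        lassign [::tcl::unsupported::yieldm $reply] level message
        set reply "\[$level\] $message"
    }
}
coroutine ::logger loggerBody

# The alias supplies a fixed first argument; the caller appends the rest.
# No wrapper proc sits between the alias and the coroutine.
interp alias {} log_info {} ::logger INFO
interp alias {} log_warn {} ::logger WARN

set r1 [log_info "disk ok"]
set r2 [log_warn "disk full"]
```

With a single-arg coro-command the same aliases would fail with a wrong-#-args error unless a shim packed the two arguments into one.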

AMG: I was not involved in the original discussion, so I can't say for certain, but my reconstruction (i.e. imagination) from what I've seen on this Wiki is that coroutines were intended to be a lower-level facility and that users could use proc/lambda wrappers to script higher-level functionality like proc-style argument processing. There was not exactly one obvious and useful way to handle yield returning multiple values, so it was decided to KISS: coroutine would take at most one argument, with the understanding that scripting would glue this lower-level communication and synchronization mechanism to the specific needs of the application. In short: your argument holds true if you change "cannot be invoked" to "cannot be invoked directly". My question is whether or not direct invocation is important enough to warrant a change. If yes, my second question is whether we can all agree on a new interface that supports multiple arguments. In particular, I am worried that if yield is changed to write into variables, it can no longer be used in functional contexts without a shim of its own.

CMcC 2010-04-22 22:30:51:

AMG, I do not want yield to inject into variables, since that's easily done with lassign, and easily wrapped when it might be useful.

NEM: as far as I can see, your current argument is that you dislike multi-arg coros being implemented in Tcl and would prefer they were implemented in C. That's not a position I have any time for.

CMcC: My arguments are laid out in points 1..4 at the top of the page. I do not see any rebuttal of them.

dkf 2010-04-23 11:03:40:

Re-analyzing what is the real issue, the problem is that coroutines, while a cool primitive, aren't good enough yet at pretending to be normal Tcl commands.

What's needed to change that? Well, they've got to be able to be variadic (a key defining Tcl feature) and they've got to be able to produce errors or other kinds of exceptions.

Variadic Coroutines: All that's really needed here is to be able to have yield return the list of arguments passed to the coro re-entry command. Binding to variables can be scripted. NB: the primary thing being changed here is the implementation of the coroutine re-entry command, and that's just so that it accepts arbitrarily many arguments and passes along a list of them. (It does mean that it no longer needs to call Tcl_WrongNumArgs...)

Exceptional Results: In this case, yield needs to take the same sorts of options as return. Not a big deal as it currently takes the same arguments as an extremely cut down version.

In combination, this allows producing an error on the wrong number of arguments passed (through yielding an error code) and all sorts of other things. Adding the ability to script these things once the eye of this particular needle is made wider will make life easier. And fixing this before 8.6.0 is a good time, before there is too much code that is locked into the old ways of doing things.
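Tcl 8.6 as released also has yieldto, which post-dates this discussion. Assuming its documented behaviour (the given command runs in the resumer's context, and the next resumption may carry any number of arguments, returned as a list), the combination described above can be sketched like this. The names dividerBody and div are illustrative:

```tcl
# Assumes Tcl 8.6 with yieldto. "return -level 0 ..." evaluated in the
# resumer's context becomes the result (or error) of the coro-command.
proc dividerBody {} {
    set argv [yieldto return -level 0 ready]
    while 1 {
        if {[llength $argv] != 2} {
            # scripted arity check, reported as an ordinary Tcl error
            set argv [yieldto return -level 0 -code error \
                "wrong # args: should be \"[info coroutine] num den\""]
        } else {
            lassign $argv num den
            set argv [yieldto return -level 0 [expr {$num / $den}]]
        }
    }
}

coroutine div dividerBody
set r1     [div 10 2]
set failed [catch {div 1} msg]   ;# too few arguments: error raised in the caller
set r2     [div 9 3]             ;# the coroutine is still alive afterwards
```

To the caller, div behaves like a two-argument command that happens to keep state between calls.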

2010-04-25 MS offers yieldm as a compromise: please see http://paste.tclers.tk/2061 and the following chat transcript:

[10:41] miguel: colin (and all interested in the coroutine discussion): please see http://paste.tclers.tk/2061
[10:42] miguel: the new idea/insight: the guy who "creates" the command to resume a coroutine is not actually coroutine but yield!
[10:43] miguel: so: yield creates a command that takes 0 or 1 arg, and internally returns that arg as its result (as before)
[10:43] miguel: new: yieldm creates a command that takes any number of args, and internally returns them in a list [what colin was requesting]
[10:44] miguel: now you can actually control which of these two you want at every suspension, they can be different each time
[10:45] miguel: there is C code to allow for fixing a number of allowed arguments (cheap); but I do not know if this is wanted or not, nor what syntax we could want for it. So it is not operative from the script level
[10:47] miguel: I *think* this would be a satisfactory solution to your gripe; please test and confirm
[10:48] miguel: [no news at all for those thinking about passing non-ok codes]
[10:50] colin: miguel, this is in HEAD? Ok! thanks!
[10:50] colin: is it slower?
[10:51] colin: I mean ... it actually creates a command on each call?
[10:52] colin: or just on the first, and then keeps using it?
[10:52] colin: that's a pretty innovative approach, anyway.
[10:53] miguel: no no, not slower ... "creates the command" is the same as before
[10:53] miguel: new field in the coro struct, with number of expected args on resume
[10:53] colin: ohhh, ok.
[10:54] miguel: conceptually "yield creates the command", but the code is the same
[10:54] miguel: just that on resuming the nargs are checked against the stored value, instead of against a hardwired 0/1
[10:54] miguel: yieldm does : set up a yield, change nargs from -2 to -1
[10:54] colin: Right, change in perspective
[10:55] miguel: the C code allows for any integer - and if you set 5 it would complain if you try to resume with anything but 5 args
[10:56] miguel: [but there is currently no way to set that to anything but -2 (yield) or -1 (yieldm)]
[10:56] colin: Well, thank you Miguel. I'm compiling it up now, and will move my stuff over to it.
[10:56] miguel: is this serving your needs properly and sufficiently?
[10:57] colin: Sure! I can invoke [$coro] with multiple args? That's all I want.
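The per-suspension check described in the transcript can be seen from the script level. A small sketch, assuming Tcl 8.6 and ::tcl::unsupported::yieldm; stepBody and step are made-up names:

```tcl
proc stepBody {} {
    yield                                        ;# next resume: 0 or 1 argument
    set got [::tcl::unsupported::yieldm {}]      ;# next resume: any number of arguments
    return $got
}

coroutine step stepBody
set bad   [catch {step 1 2} err]             ;# rejected: the coro is suspended in yield
set alive [llength [info commands ::step]]   ;# the failed call did not resume it
step                                         ;# accepted; now suspended in yieldm
set got   [step 1 2]                         ;# accepted; yieldm returns the list
```

The same coro-command thus rejects two arguments at one suspension point and accepts them at the next, depending only on which of yield or yieldm it is waiting in.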

Lars H wishes to remark that this "new idea/insight" had been published, in a slightly refined form, on this page already a week earlier.

MS acknowledges ... must have been worming through my brain, and I did not realize where it came from. Thanks, credited on the Changelog.
