Coroutine cancellability

You can't generally cancel a running coroutine easily. You can of course do

rename runningCoro ""

But then:

  • the coroutine doesn't know it was cancelled(*), no error is raised inside it, it just ceases to exist
  • if the coroutine has scheduled itself as a callback like in e.g.
::chan event $chan readable [list [info coroutine]]

just like say coroutine::util gets does, then that event will produce an error. Unfortunately I found no way to handle coroutine cancellation without resorting to callbacks. The goods news is tcl allows you to trace command deletion like:

proc cleanup {args} {...}
proc coro {} {
    ...
    trace add command [info coroutine] delete [list cleanup $arg1 $arg2]
    ...
}

Which will allow it to e.g. unsubscribe from fileevent. Quite awkward: another proc has to be created and it also has to accept mostly useless args that trace passes to it in addition to any useful args we may pass from inside the coro.

A better solution is trace not just command deletion, but leaving the current command's scope, i.e. Deferred evaluation, especially since the same cleanup often is done upon normal exit.

To sum up, here's one particular solution I could come up with for cancellable io with coroutines:

namespace eval cio {}

proc cio::defer {args} {
    if {[llength $args]==1} {set args [lindex $args 0]} ;# flatten if 1 elem 
    uplevel {set _defer_var {}}
    uplevel [list trace add variable _defer_var unset [list apply [list args $args]]]
}

proc cio::gets {{chan stdin} args} {
    ::chan configure $chan -blocking 0
    defer fileevent $chan readable {}
    while {[set res [uplevel gets $chan {*}$args]] eq ""} {
        fileevent $chan readable [list [info coroutine]]
        yield
    }
    set res 
}

proc cio::puts {args} {
    switch -- [llength $args] {
        1 {set chan stdin}
        2 {set chan [lindex $args 0]}
        3 {set chan [lindex $args 1]}
        default {tailcall ::chan puts {*}$args; #calling original with bogus args
        }
    }
    ::chan configure $chan -blocking 0
    defer fileevent $chan writable {}
    fileevent $chan readable [list [info coroutine]]
    yield
    uplevel gets $chan {*}$args
}

proc cio::after {} {
    set id [::after $delay [list [info coroutine]]]
    defer after cancel $id
    yield
}

proc ::cio::read {args} {
    set chop no ; # !-nonewline
    set total "" ;# "" = until eof
    # parse args
    if {[set chan [lindex $args 0]] eq "-nonewline"} {
        set chop yes
        set chan [lindex $args 1]
    } else {
        set total [lindex $args 1]
    }
    ::chan configure $chan -blocking 0
    # run loop
    set buf {}
    ::chan event $chan readable [list [info coroutine]]
    defer ::chan even $chan readable {}
    if {$total eq ""} {; # Loop until eof        
        while {![::chan eof $chan]} {
            yield
            append buf [::chan read $chan]
        }
    } else {
        # Loop til eof or $total chars read 
        while {$total >0 && ![::chan eof $chan]} {
            set chunk [::chan read $chan $total]
            append buf $chunk
            incr total -[string length $chunk]
        }
    }
    
    if {$chop && [string index $buf end] eq "\n"} {
        set buf [string range $buf 0 end-1]
    }

    return $buf
}

None of these will not cause background errors should its caller be destroyed during the call unlike coroutine::util. They may of course cause numerous other errors as this is hardly tested!

It's still sub-optimal: no error is raised in the caller, and if the callers wants to do some of its own cleanup, it's going to have to defer too. Less straightforward than something like

proc coro {} {
    if {[catch {imaginary_cancellable_gets $chan} err]} {
        if {$err eq "cancelled"} {
            <cleanup>
        }    
    }
}

would be. But that's not possible to do with the current tcl coro mechanism, is it?

Discussion

PYK 2022-03-22

This entire page is an XY problem. Cancellation of this sort is at the very least a code smell. An event handler for a channel is tied to the life of the channel, and is automatically deleted when the channel is closed. Anything that decides to delete a coroutine can also decide what to do about the channel. If the coroutine itself initiated the channel and is responsible for its management, then the coroutine should not be deleted using rename, but instead told to finish.

A coroutine is best viewed as a "separate program", complete with its own control flow. Since a coroutines share the same larger environment and can have arbitrary effects, it is rather futile to attempt the sort of general external approach to resource cleanup described on this page. Coroutines are designed to execute in a collaborative fashion, and a coroutine itself is in the best position to know how to clean up whatever resources it has engaged. Each coroutine should be a good citizien and leave the shared environment in a better place than it found it in. The documentation for any coroutine available for use by third parties should describe the things that are normally described for an individual program: What it does, how to use it, what to pass to it, what it produces, and what its effects are.

Since coroutines do not execute concurrently, it isn't running at the moment rename might be used to delete, so there is always a chance to issue a command to the coroutine instead. This means that a coroutine can develop an interface that it can use to distinguish callbacks it has scheduled for itself from commands issued to it by a third party. A termination and cleanup routine can be a documented part of that interface. This keeps any complexity associated with cleanup neatly encapsulated within the coroutine itself. The script below provides and example of this idea, where the interface handles both channel event callbacks and termination requests:

#! /usr/bin/env tclsh
package require coroutine

proc {on timeout} chan {
        puts [list {got a timeout} $chan]
}

proc main_coro {chan timeout} { 
        variable status
        yield [info coroutine]
        chan event $chan readable [list [info coroutine] chan $chan]
        set timer {}
        while 1 {
                after cancel $timer
                set timer [after 1500 [list [info coroutine] timeout] $chan]
                set event [yieldto return -level 0]
                lassign $event type
                switch $type {
                        bye {
                                close $chan
                                set status 0
                                break
                        }
                        chan {
                                set chan [lindex $event 1]
                                set chars [gets $chan line]
                                if {$chars >= 0} {
                                        puts [list chan $chan line $line]
                                }
                        }
                        timeout {
                                {*}$timeout [lindex $event 1]
                        }
                        default {
                                error [list {unknown command} $event]
                        }
                }
        }
        puts [list {coroutine finished}]
}
chan configure stdin -blocking 0

set coro1 [coroutine main main_coro stdin [list [namespace which {on timeout}]]]
after 0 [list $coro1]
after 10000 [list $coro1 bye]

vwait [namespace current]::status
if {$status} {
        error {something went wrong}
}

Another note: To avoid double substitution, the line,

uplevel gets $chan {*}$args]

should instead be

uplevel [list gets $chan {*}$args]

wusspuss 2022-03-25 07:00:21: That coroutine is still "manually" scheduling callbacks for itself - i.e. it has to reimplement wrapping gets - it runs a loop and checks if this time gets returned anything. With read the algorithm would be more complicated. I think a wrapper can be written this way around gets, read, akin to that of coroutine::util but cancellable. And cancellation actually could be handled with a catch in the caller then. Indeed, that is a superior approach.

PYK 2022-03-25: coroutine::util procedures are designed to provide the same semantics as their synchronous counterparts, so they aren't the right place to provide timeout functionality. A replacement for read or gets that does provide timeout functionality must pause the current coroutine, which means it either must document that it only works if it has exclusive callback access to the coroutine, or that it shares access to the coroutine via a common interface like the one in the example above.


wusspuss - 2022-03-28 04:33:29

With pooryorick's suggestions taken into account,

namespace eval cio {}
proc cio::defer {args} {
    if {[llength $args]==1} {set args [lindex $args 0]} ;# flatten if 1 elem 
    uplevel {set _defer_var {}}
    uplevel [list trace add variable _defer_var unset [list apply [list args $args]]]
}

proc cio::yieldm {{value {}}} {
    yieldto return -level 0 $value
}
proc cio::timeout {time cmd args} {
    set id [::after $time [list [info coroutine] cancel timeout]]
    defer ::after cancel $id
    $cmd {*}$args
}
proc cio::gets {{chan stdin} args} {
    ::chan configure $chan -blocking 0
    fileevent $chan readable [list [info coroutine] chan]
    defer fileevent $chan readable {} 
    while 1 {
        lassign [yieldm] cmd val
        switch -- $cmd {
            chan {
                if {[set res [uplevel [list ::gets $chan {*}$args]]] ne ""} {
                    return $res
                }
            }
            cancel {error Cancelled $val}
            default {error "Unknown command $cmd"}
        }
    }
}

proc example {} {
    cio::timeout 1000 cio::gets
}
coroutine main example

vwait forever

something like this will raise an error in the caller. The coroutine can be cancelled with cancel, that's assuming it itself and all coros it calls implement this interface. An error will be printed unless the caller catches cancellations.