Higher order TIP discuss

Discussion page for [1 ], a proposal for Higher Order Functions in Tcl that is intended to result in a TIP.


escargo 15 Apr 2010 - There appears to be some damage to the page in the reference link. One heading is tr;dr. Headings for "Introduction, Motivation and Goals," "A review of FP and higher order functions in Tcl," and "The design of this proposal" appear to be missing.

One feature of the design of the proposal is described like this:

filter predicate list is a filter over a list, and predicate is a cmdprefix that must return a boolean.

This is somewhat imprecise in that Tcl does not have a boolean data type. More firmly specified would be to say that cmdprefix must return 0 or 1.

The discussion about optional arguments would be clarified if the distinction between formal arguments and actual arguments were more clearly drawn.

While proponents of functional programming might know about left fold and right fold an explanatory reference for the rest of us would be welcome.

Twylite 2010/04/16: No damage; the "missing" headings are external links. I have updated the doc to make this more readily apparent (I hope).

The concept of a "boolean" is defined in the man page for if - I have updated the doc to refer to this definition.

Thanks for the comments on the "optional arguments" proposal. I will improve this proposal in time; it will not form part of the Higher Order Functions TIP and should probably be a TIP in its own right. The idea is rather rough at the moment but I wanted to get it down and hear some opinions.

The tl;dr is deliberately brief, take a look at the manual (link is now more obvious, in the ToC) for more detail and a link to the Wikipedia.


Lars H: While I certainly agree with the proposal's philosophy that a command prefix should be considered the normal Tcl equivalent of an abstract function, I'm somewhat at a loss as to what the TIP is supposed to actually propose. The whole thing just looks like a collection of observations.

Twylite 2010/04/16 Fair point. The TIP will do two things:

  1. Present a consistent model for handling functions, and propose that future TIPs (that accept or return command prefixes or functions) conform to (or at least interoperate with) this model.
  2. Propose the inclusion in the Tcl core of the following commands: lambda, lamdaexpr, curry, compose, foldl, foldr, filter, map, range.

The doc has been updated to indicate this (at the top of the tl;dr).

The rationale for including these commands in the core is given in the Introduction, Motivation and Goals section.


Twylite 2010/04/16 Andreas Kupries made a point that this proposal does not support infinite lists. That is true. I experimented with [coforeach] and [cofoldl] implementations, and decided that two paradigm shifts (functional programming, and iterators instead of lists) would be going too far in Tcl 8.x. The fold/map/filter constructs for Tcl 8.x should support the familiar data types of Tcl 8.x, i.e. finite lists. There have been some discussions about moving towards an iterator approach in Tcl 9, which would be an appropriate time to make the higher-order functions iterate over infinte lists.

Put another way, I see it as the job of another TIP to propose [coforeach], [cofoldl] and friends that accept iterators (as coroutines the yield values) rather than lists. It would also make sense to have an [iterator] helper to iterate over lists, making the following two statements equivalent in function:

  foreach element $list { ... }
  coforeach element [iterator $list] { ... }

Tcl 9 proposals to remove the command/variable dichotomy could further simplify the syntax.


dkf - 2010-04-16 04:22:07

I want a map that supports the full power of foreach. Why? Because it will also allow us to do

  1. zipping of lists together

    set zipped [map a $list1 b $list2 {list $a $b}]

  2. consuming of several values at once

    set sums [map {a b} $values {+ $a $b}]

  3. filtering

    set goodOnes [map x $values {expr {[isGood $x] ? $x : [continue]}}]

  4. taking a prefix of the list

    set prefix [map x $values {expr {[isGood $x] ? $x : [break]}}]

  5. combinations of the above

What's more, we don't need any new bytecodes to do this efficiently. It's essentially just reusing the machinery we already have for foreach, but keeping the value from each time round the loop (on normal result).

AMG: What about the crazy foreach+Python hybrid list/dict/whatever comprehension I posted [2 ]? By the way, mixing short-circuit evaluation with continue is very clever. ;^)

Twylite 2010/04/16 Hmmm, so a capturing [foreach] that assigns values to variables and evals a block, rather than something that invokes a function with arguments. Out of scope for this proposal ;p

Very cute though; this would provide list comprehension functionality in a construct that is familiar to Tclers.

So I will see your [map], and raise you multiple output elements per iteration:

  proc mapfor {args} {
    set coro ::__mapfor_[incr ::__mapfor_id]
    set a [coroutine $coro foreach {*}$args]
    set out [list]
    while { ($a ne {}) || [llength [info commands $coro]] } {
      lappend out $a
      set a [$coro]
    }
    return $out
  }

  # Return multiple output elements per iteration
  % mapfor a {1 2 3 4 5} { yield [expr { $a * 2 }] ; yield x }
  2 x 4 x 6 x 8 x 10 x
  
  # Zipping
  % set list1 {1 2 3 4 5 6}
  % set list2 {a b c d e f}
  % set zipped [mapfor a $list1 b $list2 {yield [list $a $b]}]
  {1 a} {2 b} {3 c} {4 d} {5 e} {6 f}
  
  # Consuming several values at once
  % set sums [mapfor {a b} $list1 {yield [::tcl::mathop::+ $a $b] }]   
  3 7 11
  
  # Filtering
  % proc isEven {x} { expr { $x % 2 == 0 } }
  % set goodOnes [mapfor x $list1 { yield [expr {[isEven $x] ? $x : [continue]}] }]

  # Taking a prefix of the list
  % proc isGood {x} { expr { $x < 4  } }
  % set prefix [mapfor x $list1 { yield [expr {[isGood $x] ? $x : [break]}] }]
  1 2 3
  
  # Performance
  % set values {}; for {set i 0} {$i < 1000000} {incr i} { lappend values $i }
  % set out {} ; time { foreach i $values { lappend out [expr { $i * 5 }] } }
  921000 microseconds per iteration
  % time { mapfor i $values { yield [expr { $i * 5 }] } }
  1828000 microseconds per iteration

DKF: Oh wow! Neat, but the performance of mapfor becomes relatively worse when the iteration is done inside a procedure/lambda:

% time {apply {{} {
    set out {} ;foreach i $::values {lappend out [expr {$i*5}]};return $out
}}}
4126981 microseconds per iteration
% time {apply {{} {
    mapfor i $::values { yield [expr { $i * 5 }] }
}}}
33906295 microseconds per iteration

This is because foreach is significantly faster when there's a local variable context (that's because that lets the bytecode compiler kick in; it doesn't bother otherwise) and the trick with coroutine foreach defeats the compiler.


Twylite 2011/01/22 Of course (he says, noticing the problem 8 months later) coroutine causes foreach to run at global scope rather than in the caller's context, and this cannot (so far as I am aware) be solved by uplevel. Here's an alternative construct closer to what DKF originally described:

  # Approach #2: uplevel foreach + eval loop-script
  proc mapeach {args} {
    upvar 1 [set uidvar _uid_mapeach_[incr ::mapeachCounter]] result
    set result {}
    set block [lindex $args end]
    uplevel 1 [list ::foreach {*}[lrange $args 0 end-1] [subst -nocommands {
      lappend $uidvar [eval {$block}]
    }]]
    return $result
  }
  
  # NEW: Nesting
  # DOESN'T WORK FOR [mapfor]
  % set list1 {1 2 3 4 5 6}
  % set list2 {a b c d e f}
  % mapeach a $list1 { mapeach b $list2 { list $a$b } }
  {1a 1b 1c 1d 1e 1f} {2a 2b 2c 2d 2e 2f} {3a 3b 3c 3d 3e 3f} \
  {4a 4b 4c 4d 4e 4f} {5a 5b 5c 5d 5e 5f} {6a 6b 6c 6d 6e 6f}  

  # Return multiple output elements per iteration
  # NOT SUPPORTED
  
  # Zipping
  % set zipped [mapeach a $list1 b $list2 {list $a $b}]
  {1 a} {2 b} {3 c} {4 d} {5 e} {6 f}

  # Consuming several values at once
  % set sums [mapeach {a b} $list1 {::tcl::mathop::+ $a $b}]
  3 7 11

  # Filtering
  % proc isEven {x} { expr { $x % 2 == 0 } }
  % set goodOnes [mapeach x $list1 { expr {[isEven $x] ? $x : [continue]} }]

  # Taking a prefix of the list
  % proc isGood {x} { expr { $x < 4  } }
  % set prefix [mapeach x $list1 { expr {[isGood $x] ? $x : [break]} }]
  1 2 3

  # Performance
  % set values {}; for {set i 0} {$i < 1000000} {incr i} { lappend values $i }
  % time { apply {{} { set out {} ; foreach i $::values { lappend out [expr { $i * 5 }] } ; return $out }} } 10
  250000 microseconds per iteration
  % time { apply {{} { mapeach i $::values { expr { $i * 5 } } }} } 10
  1412500.0 microseconds per iteration
  % time {apply {{} { mapfor i $::values { yield [expr { $i * 5 }] } }} } 10
  1946800.0 microseconds per iteration

This solution is terse, has better performance than [mapfor], but is still performance somewhat worse than [foreach] and lacks [mapfor]'s ability to yield multiple elements per iteration.

I recently came across accumulate and collect, which is another approach to a collection foreach, but it generalises the collection mechanism rather than the foreach control structure.

I present an updated accumulate & collect implementation and a [mapeach] that is able to collect multiple elements per iteration:

  # The [collect] coroutine accepts a value and appends it to the current
  # accumulator.
  # [accumulate] delimits the scope of an accumulator.  It accepts args and 
  # evaluates them as a command.  On completion [accumulate] returns a list
  # of values [collect]ed into the accumulator.  [accumulate] can nest, and
  # probably does not interact safely with coroutines (behaviour still to be 
  # codified).
  # [accumulate] and [collect] communicate via a global signal variable.
  
  namespace eval ::tcl::Private { set accumulate.signal {} }

  proc accumulate {args} {
    variable ::tcl::Private::accumulate.signal
    
    # Notify the collector of a new accumulator 
    set accumulate.signal [linsert ${accumulate.signal} 0 push]
    set collectorId [collect {}]
    
    # Evaluate the command
    set code [catch { uplevel 1 $args } r opts]
    
    # End collection for this accumulator
    set accumulate.signal [linsert ${accumulate.signal} 0 pop]
    set result [collect $collectorId]
    
    # Return collection or propagate error
    if { $code == 0 } {
      set r $result
    }
    return -options $opts $r
  }

  coroutine ::collect apply {{} {
    variable ::tcl::Private::accumulate.signal
    lassign [info coroutine] result collections collection
    set counter 0
    
    while {1} {
      # Return answer to creation, push or pop      
      set value [yield $result]
      
      # Collect values into current accumulator
      while {${accumulate.signal} eq {}} {
        lappend collection $value
        set value [yield $value]
      }
      
      # Handle signal from [accumulate]
      set accumulate.signal [lassign ${accumulate.signal} op] 
      switch -- $op {
        push {
          # We can't use a simple stack of collection lists here as nesting
          # across coroutines will break.  Nesting information must be stored
          # on the stack, which means [accumulate] must do it.
          set result "collection[incr counter]"
          dict set collections $result $collection
          set collection {}
        }
        pop {
          set result $collection
          set collection [dict get $collections $value]
          dict unset collections $value
        }
      }
    }
  }}

  proc mapeach {args} {
    set block [lindex $args end]
    uplevel 1 [list accumulate foreach {*}[lrange $args 0 end-1] [subst -nocommands {
      collect [eval {$block}]
    }]]
  }

  # NEW: Nesting
  # DOESN'T WORK FOR [mapfor]
  % set list1 {1 2 3 4 5 6}
  % set list2 {a b c d e f}
  % mapeach a $list1 { mapeach b $list2 { list $a$b } }
  {1a 1b 1c 1d 1e 1f} {2a 2b 2c 2d 2e 2f} {3a 3b 3c 3d 3e 3f} \
  {4a 4b 4c 4d 4e 4f} {5a 5b 5c 5d 5e 5f} {6a 6b 6c 6d 6e 6f}  

  # Return multiple output elements per iteration
  % mapeach a {1 2 3 4 5} { collect [expr { $a * 2 }] ; list x }
  2 x 4 x 6 x 8 x 10 x
  
  # Zipping
  % set zipped [mapeach a $list1 b $list2 {list $a $b}]
  {1 a} {2 b} {3 c} {4 d} {5 e} {6 f}

  # Consuming several values at once
  % set sums [mapeach {a b} $list1 {::tcl::mathop::+ $a $b}]
  3 7 11

  # Filtering
  % proc isEven {x} { expr { $x % 2 == 0 } }
  % set goodOnes [mapeach x $list1 { expr {[isEven $x] ? $x : [continue]} }]

  # Taking a prefix of the list
  % proc isGood {x} { expr { $x < 4  } }
  % set prefix [mapeach x $list1 { expr {[isGood $x] ? $x : [break]} }]
  1 2 3

  # Performance
  % set tcl_patchLevel
  8.6b1.1
  % set values {}; for {set i 0} {$i < 1000000} {incr i} { lappend values $i }
  % time { apply {{} { accumulate foreach i $::values { collect [expr { $i * 5 }] } }} } 10
  1062500.0 microseconds per iteration
  % time { apply {{} { mapeach i $::values { expr { $i * 5 } } }} } 10
  1614100.0 microseconds per iteration

The performance of [mapeach] has dropped, but the speed [accumulate foreach] is reasonably impressive (24% of foreach).

I am still suspicious about the interaction of accumulate/collect and coroutines. Coroutines allow the stack to change without [accumulate] being aware of a change in scope, leading to [collect] behaving unexpectedly:

  coroutine mycollection apply {{} {
    set value [info coroutine]
    accumulate while { $value ne {} } {
      set value [yield $value]
      if { $value ne {} } {
        collect $value
      }
    }
  }}
  
  mapeach i {1 2 3} { mycollection $i }   

I'm not sure how this can be fixed without making [collect] use [info coroutine] (which in turn leads to resource cleanup issues).

At this bottom of this page I've added some information on the evolution of these implementations, including less correct or poorer performing alternatives. This may save someone else the trouble of repeating my experiments ;)

Twylite 2011/01/22 Patch 3163961 [3 ] provides a bytecoded implementation of [mapeach] in the Tcl core.


escargo 15 Apr 2010 - I'm looking at the control::functional page [4 ].

It seems that either the metalanguage for describing optional arguments is insufficient or ambiguous.

How would range 1 be interpreted?

How would range 2 2 be interpreted?

Or is it "understood" that the third argument can only be present if the first two are present?

The way it looks to me, in the two-argument case, you can't tell if the first or the third argument is missing.

Twylite 2010/04/16 This is related to the idea of enhancing support for optional arguments, so that there is a common understanding of behaviour in this sort of situation.

Parameters (given by the caller) are assigned to arguments (vars in the arglist) from left to right (i.e. iterate over the argument vars and assign to each the next available parameter). If an optional argument is encountered and the number of available parameters remaining is less than or equal to the number of mandatory arguments that have not yet received a parameter value, then assign the default value to the optional arg (and don't consume a parameter).

Perhaps this is more succinctly stated by rephrasing your words: an optional argument can only be assigned a parameter if every optional argument to the left of it has been assigned a parameter.

In the case [range ?from? to ?step?]:

  • A single argument is understood as [range to], equivalent to [range 0 to 1]
  • Two arguments are understood as [range from to], equivalent to [range from to 1]
  • Three arguments fully specify [range from to step]

Twylite 2011/01/22 Evolution of [mapeach], [accumulate] and [collect] implementations (for posterity).

Some of these implementations require the following helpers:

  # Based on KBK's [gensym] from http://code.activestate.com/lists/tcl-core/9716/
  # See also NEM's [gensym] in http://code.activestate.com/lists/tcl-core/9717/
  coroutine ::autoname apply {{} {
    set prefix [yield [info coroutine]]
    while {1} {
      if { $prefix eq {} } { set prefix "uid" }
      set prefix [yield "${prefix}[incr counters($prefix)]"]    
    }
  }}

Most of the implementations work with the following test code:

  # Operation: simple arithmetic transform on each item in a list, then compute
  # the sum of the list.

  # Generate list
  for {set i 0} {$i < 1000000} {incr i} {
    lappend input [expr { int(rand() * 1000000) }] 
  }

  # Test the performance of [mapeach]
  time { apply {{} { 
    set accum 0 
    foreach val [mapeach i $::input {expr { $i * 5}}] { incr accum $val }
    puts $accum 
  }} }

Performance baselines are established using equivalent operations in a foreach loop. To be fair we must generate transformed list as well as a sum over the list - we are not just implementing a [foldl] operation!

  # Baseline (method #1): create transformed list and sum in one loop
  time { apply {{} { 
    set accum 0 ; set collect {} 
    foreach val $::input { set v [expr { $val * 1 }] ; lappend collect $v ; incr accum $v }
    unset collect
    puts $accum 
  }}} 10
  # 314100.0 microseconds per iteration

  # Baseline (method #2): create transformed list in one loop, then sum it in
  # another loop.  This more accurately reflects the work we are asking for 
  # when using a foreach over mapeach.
  time { apply {{} { 
    set accum 0; set result {}
    foreach i $::input { lappend result [expr { $i * 1 }] }
    foreach val $result { incr accum $val }
    unset result
    puts $accum
  }}} 10
  # 396900.0 microseconds per iteration

[mapeach] approach #1: rewrite the loop script to append to a collection variable (using try/on to run the script), then run a foreach loop in the caller's context using uplevel.

  # Approach #1: uplevel foreach + try/on loop-script
  # Variant #1: using unique collection variable in caller's context
  proc mapeach {args} {
    upvar 1 [set uidvar _uid_mapeach_[incr ::mapeachCounter]] result
    set result {}
    uplevel 1 [list foreach {*}[lrange $args 0 end-1] \
      [list try [lindex $args end] on ok {result} "lappend $uidvar \$result"]]
    return $result
  }
  time { apply {{} { set accum 0 ; foreach val [mapeach i $::input {expr { $i * 1}}] { incr accum $val } ; puts $accum }} } 5
  # 4215600.0 microseconds per iteration


  # Approach #1: uplevel foreach + try/on loop-script
  # Variant #2: using collection variable in caller's context that is nesting-
  #   and coroutine-safe but predictably named
  proc mapeach {args} {
    set coro [info coroutine]
    upvar 1 [set uidvar _uid_mapeach_${coro}_[incr ::nested($coro)]] result
    set result {}
    uplevel 1 [list foreach {*}[lrange $args 0 end-1] \
      [list try [lindex $args end] on ok {result} "lappend $uidvar \$result"]]
    incr ::nested($coro) -1
    return $result
  }
  time { apply {{} { set accum 0 ; foreach val [mapeach i $::input {expr { $i * 1}}] { incr accum $val } ; puts $accum }} } 5
  # 4215600.0 microseconds per iteration


  # Approach #1: uplevel foreach + try/on loop-script
  # Variant #3: using unique global collection variable
  proc mapeach {args} {
    set uidvar ::uid_[incr ::uidctr]
    set $uidvar {}
    uplevel 1 [list foreach {*}[lrange $args 0 end-1] \
      [list try [lindex $args end] on ok {result} "lappend $uidvar \$result"]]
    set result [set $uidvar]
    unset $uidvar
    return $result
  }
  time { apply {{} { set accum 0 ; foreach val [mapeach i $::input {expr { $i * 1}}] { incr accum $val } ; puts $accum }} }
  # 4390600.0 microseconds per iteration


  # Approach #1: uplevel foreach + try/on loop-script
  # Variant #4: using common collection variable in caller's context
  # WARNING: this [mapeach] cannot nest (within a given stack frame)
  proc mapeach {args} {
    upvar 1 __mapeach result
    set result {}
    uplevel 1 [list foreach {*}[lrange $args 0 end-1] \
      [list try [lindex $args end] on ok {result} "lappend __mapeach \$result"]]
    return $result
  }
  time { apply {{} { set accum 0 ; foreach val [mapeach i $::input {expr { $i * 1}}] { incr accum $val } ; puts $accum }} } 5
  # 4215600.0 microseconds per iteration

No real difference in how we name the upvar, but using a global accumulation variable is slightly slower.

[mapeach] approach #2: rewrite the loop script to append to a collection variable (using eval to the run the script), then run a foreach loop in the caller's context using uplevel.

  # Approach #2: uplevel foreach + eval loop-script
  proc mapeach {args} {
    upvar 1 [set uidvar _uid_mapeach_[incr ::mapeachCounter]] result
    set result {}
    set block [lindex $args end]
    uplevel 1 [list foreach {*}[lrange $args 0 end-1] [subst -nocommands {
      lappend $uidvar [eval {$block}]
    }]]
    return $result
  }
  time { apply {{} { set accum 0 ; foreach val [mapeach i $::input {expr { $i * 1}}] { incr accum $val } ; puts $accum }} } 10
  # 1593800.0 microseconds per iteration

Clearly try/on was imposing a severe performance penalty. Here are are achieving 20%-25% of the baseline.

[mapeach] approach #3: map to the loop variables in the caller's context, run the foreach locally but on every iteration execute the original loop-script via uplevel and capture the result.

  # Approach #3: foreach + uplevel loop-script
  # This approach does not require extra variables in the caller's context
  # WARNING: this [mapeach] cannot support a varname list to consume multiple
  #   values at once, i.e. mapeach {a b} $value {...}
  proc mapeach {args} {
    set counter 0
    foreach {var lst} [lrange $args 0 end-1] {
      upvar 1 $var var[incr counter]
      lappend fargs var${counter} $lst 
    }
    set result {}
    set block [lindex $args end]
    foreach {*}$fargs {
      lappend result [uplevel 1 $block]
    }
    return $result
  }
  time { apply {{} { set accum 0 ; foreach val [mapeach i $::input {expr { $i * 1}}] { incr accum $val } ; puts $accum }} } 10
  # 1742100.0 microseconds per iteration

Approach #3 has worse performance than Approach #2, and is fatally flawed by being unable to support the full syntax of foreach.

Of the implementations tried so far, Approach #2 wins.

We now consider implementation approaches for accumulate and collect. The following test code is used:

  # Generate list
  for {set i 0} {$i < 1000000} {incr i} {
    lappend input [expr { int(rand() * 1000000) }] 
  }

  # Performance test (script argument)
  time { apply {{} { 
    set accum 0
    set result [accumulate { foreach i $::input { collect [expr { $i * 1 }] }}]
    foreach val $result { incr accum $val }
    unset result
    puts $accum
  }}} 10

  # Performance test (command args)
  time { apply {{} { 
    set accum 0
    set result [accumulate foreach i $::input { collect [expr { $i * 1 }] }]
    foreach val $result { incr accum $val }
    unset result
    puts $accum
  }}} 10
  
  # Baseline FOR [llength $input] = 50000
  # 5781000 microseconds per iteration

  # Nesting test (script argument)
  apply {{} {
    set z "hello"
    accumulate { foreach x {1 2 3} {
      collect [accumulate {foreach y {a b c} {
        collect "$z$x$y"
      }}]
    }}
  }}
  #-> {hello1a hello1b hello1c} {hello2a hello2b hello2c} {hello3a hello3b hello3c}

  # Nesting test (command args)
  apply {{} {
    set z "hello"
    accumulate foreach x {1 2 3} {
      collect [accumulate foreach y {a b c} {
        collect "$z$x$y"
      }]
    }
  }}
  #-> {hello1a hello1b hello1c} {hello2a hello2b hello2c} {hello3a hello3b hello3c}

Note that we will be using two different prototypes for [accumulate]: one takes a single argument which is a script to evaluate; the other accepts args and evaluates them like eval.

The original implementation presented at accumulate and collect is extremely slow (tested in Tcl 8.6) on account of the data structure used to accumulate values. As indicated above we needed to use a truncated input list to allow the performance test to complete in a reasonable time.

AMG's coroutine-based implementation suffers the same problem as [mapfor] (which he notes) in that it runs in the global context rather than the caller's context. It will not pass the "nesting test" given above.

[accumulate] and [collect] Approach #1: accumulate into global variables, but use a data structure conducive to performance:

  # Manage the collection in global variables
  # WARNING: use of globals doesn't play nicely with coroutines (we could switch
  #   accumulate blocks without changing the collection variable)
  set collections {}
  set collection {}
  
  proc ::accumulate {block} {
    global collections collection
    set collections [linsert $collections 0 $collection]
    set collection {}
    uplevel 1 $block
    set result $collection
    set collection [lassign $collections collection]
    return $result
  }

  proc ::collect {value} {
    lappend ::collection $value
  }

  # Nesting test (script argument)
  #-> {hello1a hello1b hello1c} {hello2a hello2b hello2c} {hello3a hello3b hello3c}
  # PASS 

  # Performance test (script argument), [llength $input] = 1000000
  # 1432800.0 microseconds per iteration

A naive implementation has reasonably good performance, but as noted the use of global variables may cause unexpected interactions with coroutines.

  # Each accumulator is a coroutine created and managed by [accumulate], which
  # collects into a local variable and uses a unique per-collector flag to 
  # control when to stop.  Nested is managed by [accumulator] by renaming
  # collectors as required.
  # 1312000 microseconds per iteration
  namespace eval accumulate {}
  
  proc accumulate {block} {
    set oldcollect [autoname ::accumulate::collector]
    set $oldcollect 1
    catch { rename ::collect $oldcollect } 
    coroutine ::collect apply {{watchvar} {
      upvar $watchvar collecting
      set collection {}
      set value [yield [info coroutine]]
      while {$collecting} {
        lappend collection $value
        set value [yield $value]
      }
      return $collection    
    }} $oldcollect
    uplevel 1 $block
    set $oldcollect 0
    set result [collect]
    unset $oldcollect
    catch { rename $oldcollect ::collect }
    return $result
  }

  # As above by accumulate uses args not block
  # Propagates errors correctly with cleanup
  # 1281000 microseconds per iteration
  proc accumulate {args} {
    set oldcollect [autoname ::accumulate::collector]
    set $oldcollect 1
    catch { rename ::collect $oldcollect } 
    coroutine ::collect apply {{watchvar} {
      upvar $watchvar collecting
      set collection {}
      set value [yield [info coroutine]]
      while {$collecting} {
        lappend collection $value
        set value [yield $value]
      }
      return $collection    
    }} $oldcollect
    try {
      uplevel 1 $args
      set $oldcollect 0
      set result [collect]
      return $result
    } finally { 
      unset $oldcollect
      catch { rename ::collect {} }
      catch { rename $oldcollect ::collect }
    }
  }

Renaming the collector coroutines is not particularly fast. A better approach would be to use a single collector coroutine and signal to it when a nested construct is encountered.

  # The [collect] coroutine accepts a value and appends it to the current
  # accumulator.
  # [accumulate] delimits the scope of an accumulator.  It accepts args and 
  # evaluates them as a command.  On completion [accumulate] returns a list
  # of values [collect]ed into the accumulator.  [accumulate] can nest, and
  # _may_ interact safely with coroutines (behaviour still to be codified).
  # [accumulate] and [collect] communicate via a global signal variable.
  
  namespace eval ::tcl::Private { set accumulate.signal {} }

  proc accumulate {args} {
    variable ::tcl::Private::accumulate.signal
    
    # Notify the collector of a new accumulator 
    set accumulate.signal [linsert ${accumulate.signal} 0 push]
    set collectorId [collect {}]
    
    # Evaluate the command
    set code [catch { uplevel 1 $args } r opts]
    
    # End collection for this accumulator
    set accumulate.signal [linsert ${accumulate.signal} 0 pop]
    set result [collect $collectorId]
    
    # Return collection or propagate error
    if { $code == 0 } {
      set r $result
    }
    return -options $opts $r
  }

  coroutine ::collect apply {{} {
    variable ::tcl::Private::accumulate.signal
    lassign [info coroutine] result collections collection
    set counter 0
    
    while {1} {
      # Return answer to creation, push or pop      
      set value [yield $result]
      
      # Collect values into current accumulator
      while {${accumulate.signal} eq {}} {
        lappend collection $value
        set value [yield $value]
      }
      
      # Handle signal from [accumulate]
      set accumulate.signal [lassign ${accumulate.signal} op] 
      switch -- $op {
        push {
          # We can't use a simple stack of collection lists here as nesting
          # across coroutines will break.  Nesting information must be stored
          # on the stack, which means [accumulate] must do it.
          set result "collection[incr counter]"
          dict set collections $result $collection
          set collection {}
        }
        pop {
          set result $collection
          set collection [dict get $collections $value]
          dict unset collections $value
        }
      }
    }
  }}