Version 2 of accumulate and collect

Updated 2010-01-27 19:21:27 by AMG

Inlining Loops

Very often, when constructing data structures (especially lists) in a loop, I've had to write something like this:

  set result {}
  foreach x $input {
    lappend result [process $x]
  }

that is, declaring a variable before the loop and manually appending results to that variable inside the loop. I thought there must be a better way to do this. Well, there are map and filter from functional programming. They'll handle the common cases. But sometimes I really need the extra power of foreach. Wouldn't it be nice if foreach behaved like map and filter?
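For comparison, the common cases might be covered by helpers along these lines. This is only a sketch: map and filter are not core Tcl commands, so the names, the argument order and the use of apply (Tcl 8.5 or later) are assumptions made for illustration.

```tcl
# Hypothetical helpers; names and argument order are illustrative only.
# Each lambda is an [apply]-style pair: {parameter body}.
proc map {lambda list} {
    set result {}
    foreach item $list {
        lappend result [apply $lambda $item]
    }
    return $result
}

proc filter {lambda list} {
    set result {}
    foreach item $list {
        # Keep the item only when the lambda returns a true value.
        if {[apply $lambda $item]} {
            lappend result $item
        }
    }
    return $result
}

puts [map {x {expr {$x * 2}}} {1 2 3}]          ;# doubles each element
puts [filter {x {expr {$x % 2}}} {1 2 3 4 5}]   ;# keeps the odd elements
```

Anything that needs several loop variables, early exit or a mix of mapping and filtering falls outside what these two helpers express directly, which is where foreach is still needed.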

So I started thinking about writing a custom control structure. But rather than just another looping construct like foreach or map, I wanted a control structure that can be applied to any and all loops. This is what I ended up with:

  set result [accumulate {
    foreach x $input {
      collect [process $x]
    }
  }]

I'm calling this accumulate and collect. The accumulate function simply evaluates the script passed to it and builds a list from the values gathered by the collect function. Basically, this is an encapsulation of the set..lappend idiom above.

Calls to accumulate can be nested, so that:

  accumulate foreach x {1 2 3} {
    collect [accumulate foreach y {a b c} {
      collect "$x$y"
    }]
  }

would return:

  {1a 1b 1c} {2a 2b 2c} {3a 3b 3c}

This is great. It allows me to treat loops like functions that return lists. And I don't have to declare pesky temporary variables!
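Since accumulate wraps an arbitrary script, the same pattern should also cover loops other than foreach, as well as conditional collection. A minimal sketch, assuming the stack-based accumulate and collect from the Implementation section below; they are repeated here in condensed form only so that the snippet runs on its own, and the special handling of break and continue is left out:

```tcl
# Condensed copy of the stack-based implementation, for self-containment.
set accumulator {}
proc accumulate {args} {
    if {[llength $args] == 1} {
        set args [lindex $args 0]
    }
    lappend ::accumulator {}                  ;# push a fresh result list
    set code [catch {uplevel 1 $args} result]
    set ret [lindex $::accumulator end]
    set ::accumulator [lrange $::accumulator 0 end-1]   ;# pop it again
    if {$code != 0} {
        return -code $code $result            ;# pass exceptions through
    }
    return $ret
}
proc collect {value} {
    set acc [lindex $::accumulator end]
    lappend acc $value
    lset ::accumulator end $acc
}

# A while loop that maps and filters in one pass:
# collect the squares of the even numbers below 10.
set i 0
set squares [accumulate {
    while {$i < 10} {
        if {$i % 2 == 0} {
            collect [expr {$i * $i}]
        }
        incr i
    }
}]
puts $squares   ;# prints: 0 4 16 36 64
```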

Then I realized something:

A Fancy List Generator

This is another problem I keep facing. I've long wished that Tcl had a list generator where I don't need to backslash-escape line ends all the time. I hate having to do:

  set foo [list \
    a 1 \
    b 2 \
    c 3 \
  ]

Apart from all the '\' looking ugly, it is also error-prone. I often forget to add a '\' and Tcl complains about 'c' being an invalid command name.

Of course I could use {} to generate lists:

  set foo {
    a 1
    b 2
    c 3
  }

but this doesn't work if I need to perform variable substitution. And doing it with "" is not only hard to read but much more error prone than list!

While playing with accumulate..collect I realized that it is just a fancy list generator. Is this the list generator that I've been wanting for so long? Let's try it out:

  set foo [accumulate {
    collect a; collect 1;
    collect b; collect 2;
    collect $argv;         # Yes, variable substitutions work.
    collect [glob *];      # It works! And look, comments!
  }]

Wow, like so many things in Tcl this came as a complete surprise. I've been waiting for something like this for so long! But the syntax looks a bit cumbersome. Let's see if we can remedy that:

  interp alias {} List {} accumulate
  interp alias {} : {} collect

  set foo [List {
    : a
    : b
    : c
    : $argv
    : [glob *]
    : [List {   #nested!
      : 1
      : 2
      : 3
      # and even plays well with [list]:
      : [list x y z]
    }]
  }]

I love it!

Implementation

So, without further ado, here's the implementation of accumulate and collect:

set accumulator {}
proc accumulate {args} {
        if {[llength $args] == 1} {
                set args [lindex $args 0]
        }

        lappend ::accumulator {}
        set code [catch {uplevel 1 $args} result]
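        switch -- $code {
                0 - 3 - 4 {
                        # Normal completion, or a break/continue that escaped
                        # the script: fall through and return what was collected.
                        # (A bare break or continue here would not leave the
                        # switch as in C; it would raise an error in the proc.)
                }
                default {
                        # Error or return: pop our list and pass the code on.
                        set ::accumulator [lrange $::accumulator 0 end-1]
                        return -code $code $result
                }
        }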
        set ret [lindex $::accumulator end]
        set ::accumulator [lrange $::accumulator 0 end-1]
        return $ret
}

proc collect {value} {
        set acc [lindex $::accumulator end]
        lappend acc $value
        lset ::accumulator end $acc
}

AMG: Here's a version implemented using coroutines (which requires Tcl 8.6).

proc accumulate {args} {
    if {[llength $args] == 1} {
        set args [lindex $args 0]
    }
    set coro [info coroutine]-accum
    set accumulator [list [coroutine $coro eval $args]]
    while {[llength [info commands $coro]]} {
        lappend accumulator [$coro]
    }
    lrange $accumulator 0 end-1
}

proc collect {value} {
    yield $value
}

I don't like the names "accumulate" and "collect" since they're virtually the same word. It's not clear that one is providing data to the other, that one is gathering data produced by the other.

I can't decide if the method I used to generate unique coroutine names is genius or madness. I guess it can be both! A fixed name won't cut it for nested use of accumulate/collect, such as in the "{1a 1b 1c} {2a 2b 2c} {3a 3b 3c}" example.
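To see what the naming scheme produces, here is a small sketch of how info coroutine behaves outside a coroutine, inside one, and inside a coroutine started from another. The procs whereami and nested and the coroutine name outer are made up for this illustration; Tcl 8.6 is assumed.

```tcl
package require Tcl 8.6

# Report which coroutine, if any, the caller is running in.
proc whereami {} {
    return "<[info coroutine]>"
}

# Outside any coroutine the name is empty, so accumulate would use "-accum".
puts [whereami]

# Inside a coroutine, the fully qualified coroutine name is reported.
puts [coroutine outer whereami]

# A coroutine started from inside another one can derive its own name
# from the enclosing coroutine's name, as accumulate does.
proc nested {} {
    coroutine [info coroutine]-accum whereami
}
puts [coroutine outer nested]
```

Each nesting level therefore gets one more -accum suffix, which keeps the names distinct as long as nothing else creates a command with the same name.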

The script body is executed in the global stack frame, not in the caller's stack frame, so local variables of the calling proc are not visible to it. I don't think this is possible to fix.
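The frame limitation can be seen with a caller that keeps its input in a local variable. A minimal sketch, assuming Tcl 8.6; the coroutine-based accumulate and collect are copied from above so that the snippet runs on its own, while the proc demo and the variable items are hypothetical names.

```tcl
package require Tcl 8.6

# Coroutine-based accumulate/collect, copied from above.
proc accumulate {args} {
    if {[llength $args] == 1} {
        set args [lindex $args 0]
    }
    set coro [info coroutine]-accum
    set accumulator [list [coroutine $coro eval $args]]
    while {[llength [info commands $coro]]} {
        lappend accumulator [$coro]
    }
    lrange $accumulator 0 end-1
}
proc collect {value} {
    yield $value
}

# Hypothetical caller whose input lives in a local variable.
proc demo {} {
    set items {a b c}
    accumulate {
        foreach x $items {
            collect $x
        }
    }
}

# The script runs at global level, so $items is looked up there and,
# with no such global variable, the call fails.
set code [catch {demo} message]
puts "local only:  code=$code message=$message"

# Once a global variable of the same name exists, the script finds it.
set items {a b c}
puts "with global: [demo]"
```

The stack-based version does not have this problem, because it runs the script with uplevel 1 in the caller's frame.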