Version 20 of like python generator

Updated 2007-02-02 14:39:15

iu2 2007-02-01 Inspired by What kinds of variable names can be used in Tcl and Looking at LISP's SERIES extension I tried to write a generator with Python's syntax.

The generator creates values upon demand. For example, given (in python)

 >>> def inc1():
        c = 0
        while 1:
                c += 1
                yield c

                yield c
 >>> gen = inc1()
 >>> gen.next()
 1
 >>> gen.next()
 2
 >>> gen.next()
 3
 >>> gen.next()
 4
 >>> 
 >>> 

More information can be found in generators and Python's doc at [L1 ] and the itertools library [L2 ]

The syntax is simple and powerful. The mechanism involves saving the context after the yield command. Saving context in tcl? --- using threads....

So it occured to me that if I use threads I can immitate Python's generator. This is the method: I write a generator proc just like in Python, using yield. Then I call a special function generate that converts my proc to a functioning generator by launching a thread, and establishing communication with the calling thread upon yield occurances.

This is the generate proc. iu2 addition Important: I used vwait as a means of synchronization between the threads, which is not robust, and I can think of a scenario of a deadlock. generate should be implemented by condition variables, instead, or even better, message queues (but I don't think tcl supports that...)

  proc generate {var proc args} {
    # creating a thread
    uplevel #0 set thread$var [thread::create {thread::wait}]
    upvar #0 thread$var th
    thread::send $th "set th [thread::id]" ;# notifying about caller thread
    tsv::set gen $var 0
    uplevel #0 set wait$var 0
    upvar #0 wait$var wait

    # change the desired proc so that 'yield' causes a generation of a value and a pause
    set b [info body $proc]
    regsub -all -linestop {\yyield\y\s+(.+)} [info body $proc] [format {tsv::set gen %s [subst \1]; thread::send -async $::th {set ::%s 1}; vwait sleep} $var wait$var] b
    append b "\nthread::send -async \$::th {set ::wait$var 2}"
    thread::send $th [list proc $proc [info args $proc] $b]

    # create a proc for making the generator advance - communication with the thread
    proc $var {} [format {
      if {$::%s == 0} {thread::send -async %s "%s %s"} else {
      thread::send -async %s {set sleep 1}}
      vwait ::%s
      if {$::%s == 2} {error "Generator exausted"}
      return [tsv::get gen %s]} wait$var $th $proc $args $th wait$var wait$var $var]
  }

Lars H: There's quite a bit of quoting hell in this generate. I'd suggest instead of the dynamically created $var proc having one "generator_get" which takes the necessary arguments as arguments and in generate create the $var command as an interp alias to this generator_get. yield should probably be a proper command as well.

An example of a generator function:

  proc list3 {a b c} {
    for {set x 0} {$x < $a} {incr x} {
      for {set y 0} {$y < $b} {incr y} {
        for {set z 0} {$z < $c} {incr z} {
          yield [list $x $y $z]
        }
      }
    }
  }

Now I convert list3 to a generator

 generate next list3 2 2 3

And here is the usage and result for next

  for {set i 0} {$i < 20} {incr i} {
    if {[catch {
      puts [next]
    }]} break
  }

result

 0 0 0
 0 0 1
 0 0 2
 0 1 0
 0 1 1
 0 1 2
 1 0 0
 1 0 1
 1 0 2
 1 1 0
 1 1 1
 1 1 2

Another example

  proc plus1 {} {
    set i 0
    while 1 {
      yield [incr i]
    }
  }

And the usage:

  generate p1 plus1
  while 1 {
    gets stdin s
    catch {eval $s}
    puts [p1]
  }

The catch {eval $s} is to be able to write exit and leave the program.

One more:

  proc manyyields {} {
    yield 1
    yield {Hello world!}
    yield {1 2 3 4 5}
  }

  generate my manyyields
  while 1 {
    puts [my]
  }

And the result

 1
 Hello world!
 1 2 3 4 5 
 Generator exausted
    while executing
 "error "Generator exausted""
    (procedure "my" line 5)
    invoked from within
 "my"
    ("while" body line 2)
    invoked from within
 "while 1 {
   puts [my]
 }"

schlenk Why not use simple slave interpreters to save the context (even namespaces or dynamically rewritten procs with default arguments might be enough)? No need to involve threads.

iu2 I simply couldn't figure out how to do this with the yield syntax. I will be glad to learn how to do it without threads... I can point out two advantages with the current method:

  • Using yield is an extremely simple and intuitive way to describe generators
  • The thread implementation is relatively short - 24 lines (might be a little longer, replacing the vwait commands - I added a comment above that vwait is not a safe way to synchronize the threads...)

There are of course disadvantages, like not having control over context switches (the exact time when the next item fetch will complete), to name one...

NEM I don't think it's possible to save the execution context of a slave interpreter without using threads, as there is no method to capture the current continuation. You can suspend a thread at the appropriate point, however. Generators/co-routines are neat. The usual Tcl way of accomplishing the same is to instead pass a script/callback to the generating function:

 proc list3 {a b c yield} {
   for {set x 0} {$x < $a} {incr x} {
     for {set y 0} {$y < $b} {incr y} {
       for {set z 0} {$z < $c} {incr z} {
         invoke $yield [list $x $y $z]
       }
     }
   }
 }
 proc invoke {cmd args} { uplevel #0 $cmd $args }
 list3 2 2 3 puts

Apart from the difference in structure, the Python method works better if you want to combine elements from several generators at once. For instance:

 while 1 {
     set el1 [generator1]
     set el2 [generator2]
     puts [list $el1 $el2]
 }

Streams are useful for this case. See also a generic collection traversal interface.

iu2 The C library functions setjmp and longjmp can be used to save the context in C. I wonder if they could be incorporated into tcl commands....


Lars H: I don't know much about threads, but using [thread::send -async] for the communication seems odd to me. Conceptually, a generator thread should just run its little generating script and then exit (pausing only if the mode of communicating the data requires waiting for the previous item to have been read), so maybe this is a use case for threads where one should not call [thread::wait]?


Category Concept | Category Control Structure | Category Threads