Version 34 of lcomp

Updated 2010-12-27 10:01:16 by AMG

AMG: I prefer lcomp version 2, which I have moved to the top of this page. Elsewhere on the wiki, when I talk about [lcomp], that's probably the version I'm referring to.


lcomp version 2

AMG: Here's a list comprehension I initially developed on hat0's page in response to his musings about building a syntax for list comprehensions directly into Tcl. My intention was to demonstrate that they work fine as a command without special language support. Although, it would be awesome if this was bytecoded. :^)

Unlike my earlier list comprehension posted at the bottom of the page, this version uses [expr] to process each generated list element. Also it supports generating more than one output element for each combination of input elements, which is useful for making dicts. In addition to alternating, combinatorial, and conditional iteration, it supports parallel iteration, and it has the option to unpack lists of value tuples. It's much more sugary (see the words in bold in the following table); however, this sugar is syntactically meaningful.

Iteration type[foreach] example [lcomp] example
Alternating foreach {a b} $list {...} lcomp {...} for {a b} in $list
Unpacking foreach _ $list {lassign $_ a b; ...} lcomp {...} for {a b} inside $list
Combinatorial foreach a $list1 {foreach b $list2 {...}}lcomp {...} for a in $list1 for b in $list2
Parallel foreach a $list1 b $list2 {...} lcomp {...} for a in $list1 and b in $list2
Conditional foreach a $list {if {cond} {...}} lcomp {...} for a in $list if {cond}

All these different iterations types can be freely intermixed. For example, conditional iteration can be used to skip entire inner loops, not just individual elements.

proc lcomp {expression args} {
    set __0__ "lappend __1__ \[expr [list $expression]\]"
    while {[llength $args] && [lindex $args 0] ni {for if}} {
        append __0__ " \[expr [list [lindex $args 0]]\]"
        set args [lrange $args 1 end]
    }
    if {![llength $args]} {
        error "wrong # args: must have at least one opcode"
    }
    set tmpvar 2
    while {[llength $args]} {
        set prefix ""
        switch [lindex $args 0] {
        for {
            set nest [list foreach]
            while {[llength $nest] == 1 || [lindex $args 0] eq "and"} {
                if {[llength $args] < 4 || [lindex $args 2] ni {in inside}} {
                    error "wrong # operands: must be \"for\" vars \"in?side?\"\
                           vals ?\"and\" vars \"in?side?\" vals? ?...?"
                }
                switch [lindex $args 2] {
                in {
                    lappend nest [lindex $args 1] [lindex $args 3]
                } inside {
                    lappend nest __${tmpvar}__ [lindex $args 3]
                    append prefix "lassign \$__${tmpvar}__ [lindex $args 1]\n"
                    incr tmpvar
                }}
                set args [lrange $args 4 end]
            }
        } if {
            if {[llength $args] < 2} {
                error "wrong # operands: must be \"if\" condition"
            }
            set nest [list if [lindex $args 1]]
            set args [lrange $args 2 end]
        } default {
            error "bad opcode \"[lindex $args 0]\": must be \"for\" or \"if\""
        }}
        lappend structure $nest $prefix
    }
    foreach {prefix nest} [lreverse $structure] {
        set __0__ [concat $nest [list \n$prefix$__0__]]
    }
    unset -nocomplain expression args tmpvar prefix nest structure
    set __1__ ""
    eval $__0__
    return $__1__
}

To demonstrate its usage, I will first rework all the version-1 examples given later on this page.

set l {1 2 3 4 5 6}
puts "A copy of the list: [lcomp {$i} for i in $l]"
puts "Double values from list: [lcomp {$n * 2} for n in $l]"
puts "Only even numbers: [lcomp {$i} for i in $l if {$i % 2 == 0}]"
proc digits {str} {
    lcomp {$d} for d in [split $str ""] if {[string is digit $d]}
}
puts "Just digits from (703)-999-0012= [digits (703)-999-0012]"
set names1 {Todd Coram Bob Jones Tim Druid}
puts "From ($names1): Last,first = [lcomp {"$l,$f"} for {f l} in $names1]"
puts "From ($names1): Only names starting with 't':\
    [lcomp {$f} for {f l} in $names1 if {[string match T* $f]}]"
puts "Create a matrix pairing {a b c} and {1 2 3}:\
    [lcomp {[list $n1 $n2]} for n1 in {a b c} for n2 in {1 2 3}]"
lcomp {$x} for x in {0 1 2 3}                         ;# 0 1 2 3
lcomp {[list $y $x]} for {x y} in {0 1 2 3}           ;# {1 0} {3 2}
lcomp {$x ** 2} for x in {0 1 2 3}                    ;# 0 1 4 9
lcomp {$x + $y} for x in {0 1 2 3} for y in {0 1 2 3} ;# 0 1 2 3 1 2 3 4 2 3 4 5 3 4 5 6
lcomp {$x} for x in {0 1 2 3} if {$x % 2 == 0}        ;# 0 2
image delete {*}[lcomp {$val} for {key val} in [array get images]]

Now I will show off some new features:

lcomp {$key} {$val} for key in {a b c} and val in {1 2 3}    ;# a 1 b 2 c 3
lcomp {"$key=$val"} for {key val} inside {{a 1} {b 2} {c 3}} ;# a=1 b=2 c=3

AMG: Both list comprehensions on this page have the drawback (or feature, depending on your preference) of running in a new stack frame. This means the iteration variables don't collide with existing variables (but also can't be examined afterward). I tried to use uplevel to fix this, but that created a new problem: I definitely do not want the __N__ temporary variables to be in the caller's stack frame, especially in the case of nested [lcomp] invocation. Since I couldn't figure out a good solution to that problem, I gave up. Ideas, anyone?

Using [eval] has another problem. Variables in the caller's stack aren't automatically visible within expressions evaluated by lcomp. You can make them visible by adding a single-iteration for whose sole purpose is to import the variables. All of the following work:

set data {1 2 3 4 5}
set v1 20
set v2 30
lcomp {$x + $v1 + $v2} for {v1 v2} in [list $v1 $v2] for x in $data
lcomp {$x + $v1 + $v2} for v1 in [list $v1] and v2 in [list $v2] for x in $data
lcomp {$x + $v1 + $v2} for v1 in [list $v1] for v2 in [list $v2] for x in $data

Maybe I can add a with opcode that takes a list of variable names as its operand. These variables will be [upvar]'ed into the [lcomp] stack frame. Theoretically, with could also be used to make the iteration variables persist in the caller's stack frame after [lcomp] returns, mirroring the behavior of [foreach]. Would import be a better name? I dunno; in light of my last observation, it could be used to export the iteration variables! Maybe use...?

lcomp {$x + $v1 + $v2} with {v1 v2} for x in $data

AMG: I'm curious if there's a workable way to add customizable "reduction" behavior, or if list reduction should remain a separate operation performed on [lcomp]'s return value. (Reduce is also known as fold.)

reduce(f, l) = f(...(f(f(f(l[0], l[1]), l[2]), l[3])..., l[n-1])

At the moment, the "reduction function" is hard-coded to be [lappend], which builds a list given each element in sequence. There are other sensible options: [append] concatenates the elements into a string, [join] does the same but inserts a delimiter, [+] finds the sum, [*] finds the product, [max] finds the maximum, etc.


lcomp version 1

AMG I wasn't quite satisfied with [listc], Todd Coram's list comprehension procedure, so I adapted it into something far worse. Behold!

proc lcomp {expression args} {
    # Check the number of arguments.
    if {[llength $args] < 2} {
        error "wrong # args: should be \"lcomp expression var1 list1\
            ?... varN listN? ?condition?\""
    }

    # Extract condition from $args, or use default.
    if {[llength $args] % 2 == 1} {
        set condition [lindex $args end]
        set args [lrange $args 0 end-1]
    } else {
        set condition 1
    }

    # Collect all var/list pairs and store in reverse order.
    set varlst [list]
    foreach {var lst} $args {
        set varlst [concat [list $var $lst] $varlst]
    }

    # Actual command to be executed, repeatedly.
    set script {lappend result [subst $expression]}

    # If necessary, make $script conditional.
    if {$condition ne "1"} {
        set script [list if $condition $script]
    }

    # Apply layers of foreach constructs around $script.
    foreach {var lst} $varlst {
        set script [list foreach $var $lst $script]
    }

    # Do it!
    set result [list]
    {*}$script ;# Change to "eval $script" if using Tcl 8.4 or older.
    return $result
}

Mainly I didn't like [listc]'s method of constructing its $foreachs variable (which I have named $script in my code), so I switched it around to be built inside-out rather than left-to-right. I start by putting together the actual command that'll be run by the tangled nest of [foreach] constructs; then I tack on [foreach]s and [if]s with the aid of [list]. Magically I no longer have to worry about matching braces, thus shortening my code into the sorry heap you see above.

Next up, I changed [eval] to [subst] and [expr], since I felt that all $expressions would use substitution and all $conditions would be boolean expressions. Especially nice is the ability to use [...] substitutions with both [subst] and [expr], so the old behavior can be had by wrapping the parameter with [ and ].

Lastly (unless I missed something), I dropped the <- noise word. I didn't much care for it, since [foreach] itself doesn't use it. You can re-add it if you like by changing "% 2" to "% 3" and "< 2" to "< 3" and the first instance of "var lst" to "var <- lst".


Examples

I'll start by listing Todd's examples modified to work with [lcomp].

set i  [list 1 2 3 4 5 6]
set l2 [lcomp {$i} i $i]
puts "A copy of the list: $l2"

set dbl [lcomp {[expr {$n*2}]} n $i]
puts "Double values from list: $dbl"

set evn [lcomp {$i} i $i {$i%2 == 0}]
puts "Only even numbers: $evn"

proc digits {str} {
    set lstr [split $str ""]
    return [lcomp {$d} d $lstr {[string is digit $d]}]
}
puts "Just digits from (703)-999-0012= [digits (703)-999-0012]"

set names1 [list Todd Coram Bob Jones Tim Druid]
set lf [lcomp {$l,$f} {f l} $names1]
puts "From ($names1): Last,first = $lf"

set l3 [lcomp {$f} {f l} $names1 {[string match "T*" $f]}]
puts "From ($names1): Only names starting with 't': $l3"

set l4 [lcomp {[list $n1 $n2]} n1 [list a b c] n2 [list 1 2 3]]
puts "Create a matrix pairing {a b c} and {1 2 3}: $l4"

Here are some more:

lcomp {$x} x {0 1 2 3}                           ;# 0 1 2 3
lcomp {$y $x} {x y} {0 1 2 3}                    ;# {1 0} {3 2}
lcomp {[expr {$x ** 2}]} x {0 1 2 3}             ;# 0 1 4 9
lcomp {[expr {$x + $y}]} x {0 1 2 3} y {0 1 2 3} ;# 0 1 2 3 1 2 3 4 2 3 4 5 3 4 5 6
lcomp {$x} x {0 1 2 3} {$x % 2 == 0}             ;# 0 2

Practical uses

The following deletes all images named by any elements in the images array:

image delete {*}[lcomp {$val} {key val} [array get images]]

I'll come up with more later as I use [lcomp] more in my own code.


AMG: It's possible to modify [lcomp] to use [eval] to process $expression rather than [subst] or [expr]. [subst]-like behavior can be had with the [K*] proc found on K. Single-argument [list] or [concat] should also work. For instance,

lcomp {K* $x} x {0 1 2 3}                        ;# 0 1 2 3
lcomp {concat $y $x} {x y} {0 1 2 3}             ;# {1 0} {3 2}
lcomp {expr {$x ** 2}} x {0 1 2 3}               ;# 0 1 4 9
lcomp {expr {$x + $y}} x {0 1 2 3} y {0 1 2 3}   ;# 0 1 2 3 1 2 3 4 2 3 4 5 3 4 5 6
lcomp {K* $x} x {0 1 2 3} {$x % 2 == 0}          ;# 0 2

Update: Instead of [K*], I suggest using single-argument [lindex]. See the [return -level 0] discussion for more information.