**Overview** [AMG]: This page presents a list comprehension I initially developed on [hat0]'s page in response to his musings about building a ''syntax'' for list comprehensions directly into [Tcl]. My intention was to demonstrate that list comprehensions work fine as a command without special language support. [[[expr]]] is used to process each generated list element. More than one output element can be generated for each combination of input elements, which is useful for making [dict]s. A variety of iteration modes are supported, as shown in the table below. %|Iteration type|[[[foreach]]] example |[[lcomp]] example |% &|Simple |`foreach a $list {...}` |`lcomp {...} '''for''' a '''in''' $list` |& &|Striding |`foreach {a b} $list {...}` |`lcomp {...} '''for''' {a b} '''in''' $list` |& &|Unpacking |`foreach _ $list {lassign $_ a b; ...}` |`lcomp {...} '''for''' {a b} '''inside''' $list` |& &|Combinatorial |`foreach a $list1 {foreach b $list2 {...}}`|`lcomp {...} '''for''' a '''in''' $list1 '''for''' b '''in''' $list2`|& &|Parallel |`foreach a $list1 b $list2 {...}` |`lcomp {...} '''for''' a '''in''' $list1 '''and''' b '''in''' $list2`|& &|Conditional |`foreach a $list {if {cond} {...}}` |`lcomp {...} '''for''' a '''in''' $list '''if''' {cond}` |& All these different iterations types can be freely intermixed. For example, conditional iteration can be used to skip entire inner loops, not just individual elements. The caller's variables can be brought into [[lcomp]]'s scope using the '''with''' opcode, whose operand is a list of variable names to bring in. This can be used both to import values and export iterators. Any number of '''with''' opcodes can be used, and they can be placed anywhere in the argument list. The "`{...}`"s shown in the above table aren't exact copies of each other. Let's compare in more detail: %|[[lcomp]] | [[foreach]] |% &|`lcomp {$a * 2} '''for''' a '''in''' $list` | `set _ {}; foreach a $list {lappend _ [[expr {$a * 2}]]}; set _`|& &|`lcomp {$b} {$a} '''for''' {a b} '''in''' $list` | `set _ {}; foreach {a b} $list {lappend _ $b $a}; set _` |& For brevity, the second [[foreach]] example says "`$b $a`" even though it's technically "`[[expr {$b}]] [[expr {$a}]]`". ---- **Implementation** Here's the code: ====== proc lcomp {expression args} { set __0__ "lappend __1__ \[expr [list $expression]\]" while {[llength $args] && [lindex $args 0] ni {for if with}} { append __0__ " \[expr [list [lindex $args 0]]\]" set args [lrange $args 1 end] } set tmpvar 2 set structure {} set upvars {} while {[llength $args]} { set prefix "" switch [lindex $args 0] { for { set nest [list foreach] while {[llength $nest] == 1 || [lindex $args 0] eq "and"} { if {[llength $args] < 4 || [lindex $args 2] ni {in inside}} { error "wrong # operands: must be \"for\" vars \"in?side?\"\ vals ?\"and\" vars \"in?side?\" vals? ?...?" } switch [lindex $args 2] { in { lappend nest [lindex $args 1] [lindex $args 3] } inside { lappend nest __${tmpvar}__ [lindex $args 3] append prefix "lassign \$__${tmpvar}__ [lindex $args 1]\n" incr tmpvar }} set args [lrange $args 4 end] } lappend structure $nest $prefix } if { if {[llength $args] < 2} { error "wrong # operands: must be \"if\" condition" } lappend structure [list if [lindex $args 1]] $prefix set args [lrange $args 2 end] } with { if {[llength $args] < 2} { error "wrong # operands: must be \"with\" varlist" } foreach var [lindex $args 1] { lappend upvars $var $var } set args [lrange $args 2 end] } default { error "bad opcode \"[lindex $args 0]\": must be for, if, or with" }} } foreach {prefix nest} [lreverse $structure] { set __0__ [concat $nest [list \n$prefix$__0__]] } if {[llength $upvars]} { set __0__ "upvar 1 $upvars; $__0__" } unset -nocomplain expression args tmpvar prefix nest structure var upvars set __1__ "" eval $__0__ return $__1__ } ====== ---- **Demonstration** To demonstrate its usage, I will first rework all the version-1 examples given on old versions of this page. ====== set l {1 2 3 4 5 6} puts "A copy of the list: [lcomp {$i} for i in $l]" puts "Double values from list: [lcomp {$n * 2} for n in $l]" puts "Only even numbers: [lcomp {$i} for i in $l if {$i % 2 == 0}]" proc digits {str} { lcomp {$d} for d in [split $str ""] if {[string is digit $d]} } puts "Just digits from (703)-999-0012= [digits (703)-999-0012]" set names1 {Todd Coram Bob Jones Tim Druid} puts "From ($names1): Last,first = [lcomp {"$l,$f"} for {f l} in $names1]" puts "From ($names1): Only names starting with 't':\ [lcomp {$f} for {f l} in $names1 if {[string match T* $f]}]" puts "Create a matrix pairing {a b c} and {1 2 3}:\ [lcomp {[list $n1 $n2]} for n1 in {a b c} for n2 in {1 2 3}]" lcomp {$x} for x in {0 1 2 3} ;# 0 1 2 3 lcomp {[list $y $x]} for {x y} in {0 1 2 3} ;# {1 0} {3 2} lcomp {$x ** 2} for x in {0 1 2 3} ;# 0 1 4 9 lcomp {$x + $y} for x in {0 1 2 3} for y in {0 1 2 3} ;# 0 1 2 3 1 2 3 4 2 3 4 5 3 4 5 6 lcomp {$x} for x in {0 1 2 3} if {$x % 2 == 0} ;# 0 2 image delete {*}[lcomp {$val} for {key val} in [array get images]] ====== Now I will show off some new features: ====== set scale 2 lcomp {$x * $scale} with scale for x in {1 2 3 4} ;# 2 4 6 8 lcomp {$key} {$val} for key in {a b c} and val in {1 2 3} ;# a 1 b 2 c 3 lcomp {"$key=$val"} for {key val} inside {{a 1} {b 2} {c 3}} ;# a=1 b=2 c=3 ====== ---- **Discussion** ***Bytecoding*** [AMG]: It would be '''''awesome''''' if this was [bytecode]d. :^) [DKF]: Just a hint: writing bytecode compilers is quite difficult. No, it's actually ''very'' difficult indeed for anything non-trivial; you're working with the parser and the bytecode and the compiler core and you're short of most of the nice tools available to you when Tcl coding. What's more, get it wrong and you might not notice for quite a long time, as different problems show up in different ways. (Try `tcl::unsupported::assemble` instead; it gives you 90% of the effect for about 1% of the effort.) [AMG]: What objections would there be to me distributing Tcl code that uses [assemble]? ***Reduction functions*** [AMG]: I'm curious if there's a workable way to add customizable "reduction" behavior, or if list reduction should remain a separate operation performed on [[lcomp]]'s return value. (Reduce is also known as [fold].) : '''reduce'''(''f'', ''l'') = ''f''(...(''f''(''f''(''f''(''l''[[0]], ''l''[[1]]), ''l''[[2]]), ''l''[[3]])..., ''l''[[''n''-1]]) At the moment, the "reduction function" is hard-coded to be [[[lappend]]], which builds a list given each element in sequence. There are other sensible options: [[[append]]] concatenates the elements into a string, [[[join]]] does the same but inserts a delimiter, [[[+]]] finds the sum, [[[*]]] finds the product, [[[max]]] finds the maximum, etc. [DKF]: It's more usual to define ''reduce'' to take three arguments: a reduction function, a “zero”, and a list/sequence. (It's also best if you can have the function and zero being things can be applied without the list to get a partially-applied – i.e., curried – function.) That gives a definition like this: : '''reduce'''(''f'', ''z'')(''l'') = ''f''(...(''f''(''f''(''f''(''z'', ''l''[[0]]), ''l''[[1]]), ''l''[[2]])..., ''l''[[''n''-1]]) [AMG]: Subtle but cool: DKF's `reduce` is a function that returns a function which is applied to `l`. ***Spiritual successor*** [AMG]: The "syntax" provided by this [[lcomp]] command inspired the [[loop]] and [[collect]] commands proposed for [Brush]. The [[loop]] command takes the place of [[[foreach]]], [[[for]]], [[[while]]], and [[[dict for]]], plus provides many more modes of operation. The [[collect]] command is the list comprehension command, and its arguments are [[[expr]]]essions followed by the same looping specification arguments supported by [[loop]]. Here are the examples given on this page rewritten to use Brush and [[collect]]: ====== set &l (1 2 3 4 5 6) puts "A copy of the list: [collect i for &i in $l]" puts "Double values from list: [collect n*2 for &n in $l]" puts "Only even numbers: [collect i for &i in $l if {i % 2 == 0}]" proc digits {str} { collect d for &d in [split $str ""] if {[string is digit $d]} } puts "Just digits from (703)-999-0012= [digits {(703)-999-0012}]" set &names1 (Todd Coram Bob Jones Tim Druid) puts "From ($names1): Last,first = [collect {"$l,$f"} for (&f &l) in $names1]" puts "From ($names1): Only names starting with 't':\ [collect f for (&f &l) in $names1 if {[string match T* $f]}]" puts "Create a matrix pairing {a b c} and {1 2 3}:\ [collect {(n1, n2)} for &n1 in (a b c) for &n2 in (1 2 3)]" collect x for &x in (0 1 2 3) ;# 0 1 2 3 collect {(y, x)} for (&x &y) in (0 1 2 3) ;# {1 0} {3 2} collect x**2 for &x in (0 1 2 3) ;# 0 1 4 9 collect x+y for &x in (0 1 2 3) for &y in (0 1 2 3) ;# 0 1 2 3 1 2 3 4 2 3 4 5 3 4 5 6 collect x for &x in (0 1 2 3) if {x % 2 == 0} ;# 0 2 image delete {*}[collect val for (&key &val) in [array get images]] ;# Note: Brush doesn't have [array] set scale 2 collect x*scale for x in (1 2 3 4) ;# 2 4 6 8 * no need for "with" collect key val for &key in (a b c) and &val in (1 2 3) ;# a 1 b 2 c 3 collect {"$key=$val"} for (&key &val) unpack ((a 1) (b 2) (c 3)) ;# a=1 b=2 c=3 ====== [AMG]: I'm working on an update to the design of Brush's [[loop]] and [[collect]] commands that would eliminate "unpack" and change that last line to: ====== collect {"$key=$val"} for ((&key &val)) in ((a 1) (b 2) (c 3)) ====== ***Difficulty processing nested lists*** [AMG]: [[lcomp]] can't be used to process nested lists, not in the way I intended. ====== % lcomp {$b ** 2} for a in {{1 2 3} {4 5}} for b in $a can't read "a": no such variable % lcomp {[lcomp {$b ** 2} for b in $a]} for a in {{1 2 3} {4 5}} {1 4 9} {16 25} % lcomp {$b ** 2} for b inside {{1 2 3} {4 5}} 1 16 ====== That really hurts. Compare Python. Tcl insists on variable substitution being done before the command starts, provided `$` notation is used, whereas Python variable substitution happens on demand throughout the execution of a "command" (using the term loosely). ======none >>> [b ** 2 for a in ((1, 2, 3), (4, 5)) for b in a] [1, 4, 9, 16, 25] ====== I'm working on changes to the design of Brush to permit the following, which gives the same result as the Python code: ====== % collect b**2 for ((* &b)) in ((1 2 3) (4 5)) 1 4 9 16 25 ====== But this is still suboptimal because the implicit nesting expressed by `((* &b))` does not offer any room to inject an `if` clause. How to translate the following Python? ======none >>> [b ** 2 for a in ((1, 2, 3), (4, 5)) if len(a) > 2 for b in a] [1, 4, 9] ====== As mentioned before, Tcl insists on variable substitution being done early. Brush provides a way to defer the substitution: ====== % collect {$b@ ** 2} for &a in ((1 2 3) (4 5)) for &b over &a 1 4 9 16 25 ====== Here, the `b` variable doesn't contain the actual list elements. Typing `$b` instead gives ''references'' to said list elements, so to obtain the elements themselves, it is necessary to dereference with `$b@`. (In Tcl, this could be done by `b` containing a ''script'' that provides the referent when [[[eval]]]'ed, in this case "`[lindex] $a 123`", where `123` is the current loop iteration counter.) This makes room for `if`: ====== % collect {$b@ ** 2} for &a in ((1 2 3) (4 5)) if {[list size $a] > 2} for &b over &a 1 4 9 ====== Changing gears a bit. Tcl has [[[lmap]]]: ====== % lmap a {{1 2 3} {4 5}} {lmap b $a {expr {$b ** 2}}} {1 4 9} {16 25} ====== However, this doesn't offer an obvious way to flatten the list structure. Back to Brush. How to ''not'' flatten the list? ====== % collect {[collect b**2 for &b in $a]} for &a in ((1 2 3) (4 5) {1 4 9} {16 25} % collect i for &a in ((1 2 3) (4 5)) {= &i [collect b**2 for &b in $a]} {1 4 9} {16 25} ====== These do not make me happy... ---- **See also** * Historical version of [[lcomp]]: [http://wiki.tcl.tk/_/revision?N=12574&V=35#pagetoc36af10c5] * [List comprehension] * [Zip] * [Fold] * [Filter] * [Map in functional programming] <> Command | Concept | Functional Programming