Version 11 of another list comprehension

Updated 2007-01-16 17:05:37 by LV

iu2 - 01-16-2007

Forming a list out of another list can be achieved with one line of code.

Taking the examples from the tcllib documentation for struct::list map, mapping takes at least two lines of code:

    tclsh> # squaring all elements in a list

    tclsh> proc sqr {x} {expr {$x*$x}}
    tclsh> ::struct::list map {1 2 3 4 5} sqr
    1 4 9 16 25

    tclsh> # Retrieving the second column from a matrix
    tclsh> # given as list of lists.

    tclsh> proc projection {n list} {::lindex $list $n}
    tclsh> ::struct::list map {{a b c} {1 2 3} {d f g}} {projection 1}
    b 2 f

In Tcl 8.5 they can be rewritten as one-liners like this:

  # squaring
  % ::struct::list map {1 2 3 4 5} {apply {x {expr {$x*$x}}}}
  1 4 9 16 25    

  # projection
  % ::struct::list map {{a b c} {1 2 3} {d f g}} {apply {x {lindex $x 1}}}
  b 2 f

Comments and improvements to this form can be reviewed in foreach little friends, where I talked about struct::list and apply as an introduction to something else.
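The one-liners above lean on tcllib's struct::list. As a rough sketch of the same mapping idea using only core Tcl 8.5, the helper below applies a lambda to each element. The name maplist is made up for this illustration and is not part of Tcl or tcllib.

```tcl
# Hypothetical helper: apply a lambda (as accepted by [apply]) to every element.
proc maplist {lambda list} {
    set res {}
    foreach item $list {
        lappend res [apply $lambda $item]
    }
    return $res
}

# squaring
puts [maplist {x {expr {$x*$x}}} {1 2 3 4 5}]                 ;# 1 4 9 16 25

# projection: second column of a list of lists
puts [maplist {x {lindex $x 1}} {{a b c} {1 2 3} {d f g}}]    ;# b 2 f
```

This still only maps; it has no filtering and handles a single list.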

This is only a step towards list comprehensions. I used list comprehensions a lot in Python and I miss them in Tcl. List comprehensions have several advantages over other methods of forming lists:

  • A one liner, yet very clear
  • Can handle more than one list
  • Engages both mapping and filtering

I would describe Python's list comprehension syntax as addictive. I used it only a couple of times before I threw away the classic map/filter and lambda as a means of forming lists out of other lists. I also started using list comprehensions where I would usually use loops.

Many coding patterns seem unnecessary until one tries them out. For example, giving another name to a proc can be achieved via both interp alias {}... and a proc. I still don't get what interp alias {}... does that a simple proc doesn't, but I do feel that if I use interp alias a couple of times I will. After all, it requires less coding. Perhaps when I find it convenient I will use it more often, hence improving my programming techniques: same function with different names. The same applies to list comprehensions.
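One difference that may matter in practice: an alias stores the leading words of the target command, and the caller's arguments are appended after them, so no wrapper body is needed. A small sketch, where strlen_proc, strlen and greet are names invented for the example:

```tcl
# A wrapper proc: a second name for "string length".
proc strlen_proc {s} {
    return [string length $s]
}

# The same thing as an alias. The words "string" and "length" are stored
# with the alias; the caller's arguments are appended after them.
interp alias {} strlen {} string length

# Leading arguments can be pre-filled the same way.
interp alias {} greet {} format "Hello, %s!"

puts [strlen_proc hello]   ;# 5
puts [strlen hello]        ;# 5
puts [greet world]         ;# Hello, world!
```

Only leading arguments can be fixed this way; binding a trailing argument still needs a proc or a lambda.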

A few examples from Python:

  # squaring
  >>> lis = [1,2,3,4,5]
  >>> lis = [x*x for x in lis]
  >>> lis
  [1, 4, 9, 16, 25]

Given lis = [1,2,3,4,5], one can calculate the sum of squares in one short, clear line:

  >>> sum([x*x for x in lis])
  55  

And here is the projection example

  # projection
  >>> [x[1] for x in [['a','b','c'],[1,2,3],['d','f','g']]]
  ['b', 2, 'f']

Working with more than one list using zip:

  # calculating the differences between elements from two lists and taking only the ones that are greater than 0
  >>> list1 = [10,20,30,40,50]
  >>> list2 = [9, 20, 28, 40, 50]
  >>> print [x-y for x,y in zip(list1, list2) if x-y > 0]
  [1, 2]  
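For comparison, Tcl's foreach can walk several lists in parallel, which plays the role of zip here. A sketch of the loop-based equivalent of the Python line above, with diffs as an illustrative variable name:

```tcl
set list1 {10 20 30 40 50}
set list2 {9 20 28 40 50}

# Walk both lists in step, keeping only the positive differences.
set diffs {}
foreach x $list1 y $list2 {
    set d [expr {$x - $y}]
    if {$d > 0} {lappend diffs $d}
}
puts $diffs   ;# 1 2
```

It works, but mapping and filtering take five lines instead of one.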

A few times I needed to count all the non empty lines in a file. A brute force, yet elegant list comprehension solution is

        # txt holds the entire text  
        lis = [line for line in txt.splitlines() if line.strip() != ""]
        count = len(lis)

Indeed a regular expression will do here too, but crafting it will take longer than the 20 seconds (or less) needed to write the list comprehension.

The same thing with tcl would probably be

        set lines {}
        foreach line [split $txt \n] {
          if {[string trim $line] ne ""} {lappend lines $line}
        }
        set count [llength $lines]

or

        regsub -all -lineanchor {^\s*?\n} $txt "" txt2
        set count [llength [split $txt2 \n]]

neither of which, I feel, fits the task properly. The first one is too long for such a simple thing, and the second one, well, didn't come out right on the first try.
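A possible shorter variant, offered as a sketch rather than a definitive answer: regexp -all returns the number of matches, and with -line the anchors work per line, so counting the lines that contain at least one non-whitespace character comes down to one command. The sample text in txt is made up for the illustration.

```tcl
# Sample text: six lines, three of them empty or whitespace-only.
set txt "first line\n\n   \nsecond line\n\t\nthird"

# Loop version, counting directly instead of collecting the lines.
set count 0
foreach line [split $txt \n] {
    if {[string trim $line] ne ""} {incr count}
}
puts $count      ;# 3

# Regexp version: one match per line that holds a non-whitespace character.
set reCount [regexp -all -line {^.*\S.*$} $txt]
puts $reCount    ;# 3
```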

Python (like Haskell) also supports nested fors like in

  >>> lis1 = [1,2,3]
  >>> lis2 = ['a','b','c']
  >>> [[x,y] for x in lis1 for y in lis2]
  [[1, 'a'], [1, 'b'], [1, 'c'], [2, 'a'], [2, 'b'], [2, 'c'], [3, 'a'], [3, 'b'], [3, 'c']]

but I'll do without it for now.
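For reference, the plain Tcl counterpart of the nested-for example is a pair of nested foreach loops. A minimal sketch, with pairs as an illustrative variable name:

```tcl
set lis1 {1 2 3}
set lis2 {a b c}

# Every combination of an element of lis1 with an element of lis2.
set pairs {}
foreach x $lis1 {
    foreach y $lis2 {
        lappend pairs [list $x $y]
    }
}
puts $pairs   ;# {1 a} {1 b} {1 c} {2 a} {2 b} {2 c} {3 a} {3 b} {3 c}
```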

Can this nice syntax be imported into Tcl? I would like it in future versions. I played with several forms of list comprehensions. The number of braces stemming from Tcl's syntax really makes it difficult to come up with a nice syntax. Eventually, influenced by bind and small languages, I've come up with this:

        # A helper proc: perform foreach with the parameters given as one list
        proc foreachlist {list body} {
          uplevel 1 foreach $list [list $body]
        }

        # list comprehension
        proc lisco {group {var ""}} {
          # extract params
          regexp {(.*?)(?:\sfor\s)(.*?)(?:\sif\s(.*?)$|$)} $group dummy cmd lists if
          if {$if eq ""} {set if 1}
          # generate foreach line and string-map expression
          set nums 0
          set mapExp {%% % }
          foreach list [uplevel 1 [list subst [uplevel 1 list $lists]]] {  ;# 8-(
            incr nums
            lappend foreachLine $nums $list
            lappend mapExp %$nums $$nums
          }

          # build result list
          set res {}
          foreachlist $foreachLine {
            set mapExp2 [subst $mapExp]
            set cmd2 [string map $mapExp2 $cmd]
            set rtmp [uplevel 1 $cmd2]
            set cond [string map [concat $mapExp2 [list %r $rtmp]] $if]
            if {[expr [uplevel 1 [list subst $cond]]]} {lappend res $rtmp}
          }
          if {$var ne ""} {uplevel 1 [list set $var $res]}
          return $res
        }

Here are a few examples using it:

        # squaring
        % lisco {expr {%1*%1} for {1 2 3 4 5}}
        1 4 9 16 25

Why waste space on named arguments? One list requires one argument - %1.

Now with two lists:

        set list1 {10 20 30 40 50}
        set list2 {9 20 28 40 50}
        lisco {expr {%1-%2} for $list1 $list2 if [expr %1-%2] > 0} result
        puts $result
        1 2

Another convenience 'keyword' is %r (for 'reference'), with which the previous lisco may be written like this:

        lisco {expr {%1-%2} for $list1 $list2 if %r > 0} result
        puts $result
        1 2

Better than Python!

The auto-variables %1, %2 and %r are substituted using string map, so when they might contain lists they need to be grouped, as in the projection example:

        # projection
        % lisco {lindex "%1" 1 for {{a b c} {1 2 3} {d f g}}}
        b 2 f
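Why the quotes matter can be seen by doing the textual substitution by hand. The sketch below uses a simplified one-entry map (an assumption for illustration, not the exact map lisco builds) and shows the command text before and after grouping:

```tcl
set value {a b c}
set map [list %1 $value]   ;# simplified stand-in for lisco's map

# string map pastes the value into the command text as-is.
set ungrouped [string map $map {lindex %1 1}]
set grouped   [string map $map {lindex "%1" 1}]

puts $ungrouped                      ;# lindex a b c 1
puts $grouped                        ;# lindex "a b c" 1

puts [eval $grouped]                 ;# b
puts [catch {eval $ungrouped} msg]   ;# 1, since "b" is not a valid index
```

Without grouping, the list elements spill out as separate words of the command.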

Now counting the non-empty lines may be achieved this way

        lisco {list %1 for [split $txt \n] if [llength "%r"] > 0}

or

        lisco {format "%1" for [split $txt \n] if [string trim "%r"] ne ""}

lisco might need more polishing. It also supports only the variables %1-%9 (who needs more anyway?). The words for and if are used as keywords, so you can't have them anywhere else in the expression (not so good if you've got a list containing one of them...), but the idea is clear: having a nice syntax for forming lists, which is convenient to use. Like in Python.


Category Discussion