Nested-loop join

Richard Suchenwirth 2002-11-27 - A visitor in the Tcl chatroom asked for a modified foreach to cover all combinations of the input lists. In database technical terms, this is called a "nested-loop join", at least one functional programming tutorial tells me so.

Well, and reason enough to do some Tcl'ing (where metaprogramming is as trivial as virtual destructors in other languages ;-) - from the input, build up a nested command of foreaches, insert the specified body in the middle, and presto:

 proc nljoin args {
    set body [lindex $args end]
    set cmd ""
    foreach {vars list} [lrange $args 0 end-1] {
       append cmd "[list foreach $vars $list] \{\n"
       append body "\}"
    }
    uplevel 1 $cmd$body
 }

#------------------------- Testing:

 % nljoin x {a b} y {c d} {puts x:$x,y:$y}
 x:a,y:c
 x:a,y:d
 x:b,y:c
 x:b,y:d

No error-checking yet - the length of args must be odd and >2. Caveat user.

Filtering the lists in advance reduces the runtime needed - for instance, after an example in [?], to get the list of all professors who have published since 1990, using the fancy select list comprehension:

 nljoin pub  [select x from $publications where {$date($x)>=1990}] \
        prof [select x from $staff where {$status($x)=="professor"}] {
            if {$empID($pub)==$empID($prof)} {
                lappend res $prof
            }
        }

Come to think, what this does is just iterate over the Cartesian product of a list of lists...

A join is not the same as a cartesian product. Joining a table of 3 records with one of 5 does not necessarily produce a table of 15 records. -jcw

RS: Thanks for elucidation. So what's called "join" is the matching of IDs in the body, right?

Yes, one or more fields, used as lookup in another table. That's "equi-join", i.e. matching on equality. There's "inner join", which only returns cases where there is a match, and (left/right) "outer join" which keeps all rows on the left of right hand side, even when lookups fail.

Here's page with tons of info [L1 ].

[EE]: Whoever posted the above link, thank you, I found it informative.


Nested Loop Parser Library :

parse_nest_loop parses an input string that contains <> notation to indicate nested loops.

Syntax:

  • <n-m> means loop from n to m, <<n-m>> means an inner-loop.
  • The number of < is the nested level
  • Multiple levels of loops, and multiple loops of the same levels can exist together.
  • Same level of loops can have different number of iteration times.
  • The loop ends when the longest loop in the same level ends. The shorter loop will continue from beginning.

Example:

parse_nest_loop "1<0-5>.<<1-5>>" returns:

10.1, 10.2, ..., 10.5, 11.1, 11.2, ..., 11.5, ... 15.1, 15.2, ..., 15.5

parse_nest_loop "<0-5><0-3>" returns:

00 11 22 33 40 51