Purpose: Definition and discussion of set functionality, especially the function to merge two or more sets. 

Back to the [Chart of proposed set functionality].
----

    :   '''::setops::union''' ''set1 set2...''

----

Arguments:

   * Any number of arguments, (including none).
   * Each argument represents one of the sets to merge.
   * The sets are given by value, as tcl-lists.

Result:

   * A list representing the union of all sets in the argument list.

----

Implementation:

Several variants are possible:

   * Add the list elements as keys to a hash-table, then extract the keys out of the array. Predicted performance O(nm). Removal of duplicates is done automatically. This is variant A.

   * Append the lists into a big list, then remove the duplicates. Removal of duplicates may use a sort followed by a simple linear scan, or a hash-table. Predicted runtimes are O(nm log nm) (sort+scan) and O(nm) (hash-table). n = length of longest lists, m = number of sets. These are variants B and C.

   * It is possible to use local namespace as implicit array, see Variant D in [SetOps, Create]. Predicted performance O(nm). This is Variant D.

   Remark:   Some shortcuts are possible, f.e. if none or only one set is given to the procedure.

----
Timing information:

   * [SetOps, Union, Timing Script],
   * [SetOps, Union, Timing results, v7.6]
   * [SetOps, Union, Timing results, v8.0]

Summary:
   * From fast to slow the order of variants is: A, B, C.
   * v7.6 does the tests faster than v8.0
   * Test system: Linux 2.0.33, gcc 2.7.2.3, libc 5.4.38, P2 133, 32M main memory

'''Update (1999 Mar 12):'''

   * Did the same test as for intersections, with much more datapoints.
   * In the long run the order of algorithms was '''D''', A, C, B, from best to worst. See below for the script and the raw data.

   * [SetOps, Union, Timing Script II]
   * [SetOps, Union, Results]
   * [SetOps, Union, Results, NEP Format]

----
Variant A.
======
    proc ::setops::union {args} {
        switch [llength $args] {
            0 {
                return {}
            }
            1 {
                return [lindex $args 0]
            }
            default {
                foreach set $args {
                    foreach e $set {
                        set tmp($e) .
                    }
                }
                return [array names tmp]
            }
        }
    }
======
----
Variant B.
======
    proc ::setops::union {args} {
        switch [llength $args] {
            0 {
                return {}
            }
            1 {
                return [lindex $args 0]
            }
            default {
                set tmp {}
                foreach set $args {
                    foreach e $set {
                        lappend tmp $e
                    }
                }
                # remove duplicates --
                # sort and scan. shortcut in case of empty or
                # single-element result.

                if {[llength $tmp] < 2} {
                    return $tmp
                }

                set tmp  [lsort $tmp]
                set last [lindex $tmp 0]
                set tmp  [lrange $tmp 1 end]
                set res  $last

                foreach e $tmp {
                    if {[string compare $e $last] != 0} {
                        lappend res $e
                        set last    $e
                    }
                }

                return $res
            }
        }
    }
======
----
Variant C.
======
    proc ::setops::union {args} {
        switch [llength $args] {
            0 {
                return {}
            }
            1 {
                return [lindex $args 0]
            }
            default {
                set tmp {}
                foreach set $args {
                    foreach e $set {
                        lappend tmp $e
                    }
                }
                # -W- remove duplicates --
                # hash out. shortcut in case of empty or
                # single-element result.

                if {[llength $tmp] < 2} {
                    return $tmp
                }

                foreach e $tmp {
                    set tmpa($e) .
                }

                return [array names tmpa]
            }
        }
    }
======
----
Variant D.
======
 proc ::setops::union {args} {
    switch [llength $args] {
        0 {
            return {}
        }
        1 {
            return [lindex $args 0]
        }
        default {
            foreach set $args {
                if {[llength $set] > 0} {
                    foreach $set {.} {break}
                }
            }

            unset args set
            info locals
        }
    }
 }
======
----
-- AK

<<categories>>Data Structure