SetOps, Intersect3

Purpose: Definition and discussion of set functionality, especially the function to compute intersection and both differences for two sets.

Back to the Chart of proposed set functionality.


::setops::intersect3 set1 set2

Arguments:

  • Exactly two arguments
  • Each argument represents one of the sets to intersect.
  • The sets are given by value, as tcl-lists.

Result:

  • A list with 3 elements, set1*set2, set1 - set2 and set2 - set1, in this order.

This is a convenience function.


Implementation:

Several variants are possible:

  1. Like the first variant in SetOps, Intersection, but takes the results from all branches. Predicted performance O(n+2nlogn), n = length of longest set. Variant A.
  2. Call the necessary other functions internally, use the fastest implementations. Predicted performance: O(6n), n as above. Variant B.
  3. Convert both lists into arrays, then scan them and do the checks against the arrays, as in the individual functions. Predicted performance: O(4n), n as above. Variant C.

The idea of using the local namespace as implicit array cannot be used here, sorry. But maybe variant B is boosted high enough to outoperform the other variants.

Shortcuts to use in the code:

  • See the definitions of the merged procedures.

Timing:

Summary:

  • B, the most simple implementation was the best of the three variants. Because of this I have decided to exclude intersect3 from the functionality provided by the extension, especially as it is not primitive set-operation but just a convenience procedure.

Variant A.

 proc ::setops::intersect3 {a b} {
    if {[llength $a] == 0} {
        return [list {} {} $b]
    }
    if {[llength $b] == 0} {
        return [list {} $a {}]
    }

    set res_is {}
    set res_ab {}
    set res_ba {}

    set a [lsort $a]
    set b [lsort $b]

    while {1} {
        # Store lindex/0,1 in var, access later faster ?
        set n [string compare [lindex $a 0] [lindex $b 0]]
        if {$n == 0} {
            # A = B => element in both, add to intersection.
            lappend res_is [lindex $a 0]
            set a [lrange $a 1 end]
            set b [lrange $b 1 end]
        } elseif {$n > 0} {
            # A > B, remove B, we are beyond the element. 
            # This element in B is part of B-A.
            lappend res_ba [lindex $b 0]
            set b [lrange $b 1 end]
        } else {
            # A < B, remove A, we are beyond the element.
            # This element in A is part of A-B.
            lappend res_ab [lindex $a 0]
            set a [lrange $a 1 end]
        }
        if {[llength $a] == 0} {
            foreach e $b {
                lappend res_ba $e
            }

            return [list $res_is $res_ab $res_ba]
        }
        if {[llength $b] == 0} {
            foreach e $a {
                lappend res_ab $e
            }

            return [list $res_is $res_ab $res_ba]
        }
    }

    return [list $res_is $res_ab $res_ba]
 }

Variant B.

 proc ::setops::intersect3 {a b} {
    list [Intersect2 $a $b] [diff $a $b] [diff $b $a]
 }

Variant C.

 proc ::setops::intersect3 {a b} {
    if {[llength $a] == 0} {
        return [list {} {} $b]
    }
    if {[llength $b] == 0} {
        return [list {} $a {}]
    }

    set res_i  {}
    set res_ab {}
    set res_ba {}

    foreach e $b {
        set ba($e) .
    }

    foreach e $a {
        set aa($e) .
    }

    foreach e $a {
        if {![info exists ba($e)]} {
            lappend res_ab $e
        } else {
            lappend res_i $e
        }
    }

    foreach e $b {
        if {![info exists aa($e)]} {
            lappend res_ba $e
        } else {
            lappend res_i $e
        }
    }

    list $res_i $res_ab $res_ba
 }

-- AK