list comparison, elementwise

Difference between version 7 and 8 - Previous - Next
[wdb] -- looking for list comparison similar to ''equal?'' in [Scheme],
I found that it is surprisingly difficult due to the lack of types.
Here is my approach in Tcl. 
The procedure ''lequal'' checks recursively on elementwise equality:

======
proc lequal {a b} {
  if {![[string is list $a]] ||
      ![[string is list $b]]} then {
    string equal $a $b
  } elseif {[[llength $a]] != [[llength $b]]} then {
    return 0
  } elseif {[[llength $a]] == 0} then {
    return 1
  } elseif {[[llength $a]] > 1} then {        
    expr {[[lequal [[lindex $a 0]] [[lindex $b 0]]]] &&
          [[lequal [[lrange $a 1 end]] [[lrange $b 1 end]]]]}
  } elseif {$a eq [[lindex $a 0]] || $b eq [[lindex $b 0]]} then {
    string equal $a $b
  } else {
    lequal [[lindex $a 0]] [[lindex $b 0]]
  }
}
======

Don't know if in real life necessary, but by reasons of completeness, I did it.

----

[NEM] 2009-10-19: The lack of type information in Tcl makes these kinds of generic
operations slightly more difficult, as you say. I suspect '''string is list''' is not going to
be very reliable. For instance simple numbers like `12` return true for that test, where
you probably want them to be compared as numbers and not lists. The easiest way to
solve the lack of type information is to add a third argument which is the command to
use for comparing each element, as used in [dictutils]:

======
proc lequal {p xs ys} {
    if {[llength $xs] != [llength $ys]} { return 0 }
    foreach x $xs y $ys {
        if {![{*}$p $x $y]} { return 0 }
    }
    return 1
}
======

That way the caller can unambiguously say how they want comparisons to be performed.
To compare two lists of strings you would do:
======
lequal {string equal} $a $b
======
While, to compare two lists of lists of integers, you could do:
======
set xxs {{1 2 3} {4.0 5 6}}
set yys {{1 2.0 3} {4 5e0 6}}
proc inteq {a b} { expr {int($a) == int($b)} }
lequal {lequal inteq} $xxs $yys
======
There is a further complication, however, in that this assumes that the list is of homogenous
type: i.e., all elements have the same notion of equality. If you are using a list more like a tuple, then each
element position may have a different type and thus need to be compared in a different way.
This requires a different comparison method that takes a list of element comparison commands:
======
# tequal: tuple equality
proc tequal {types xs ys} {
    if {[llength $types] != [llength $xs] || [llength $xs] != [llength $ys]} { return 0 }
    foreach type $types x $xs y $ys {
        if {![{*}$type $x $y]} { return 0 }
    }
    return 1
}
# Test on tuples of type: integer x case-insensitive string x list of characters.
set type {inteq {string equal -nocase} {lequal {string equal}}}
set a {12.0 "Hello!" {a b c}}
set b {12 hello! {a b c}}
tequal $type $a $b
======
Ironically, notions of equality vary tremendously.

----

[wdb]: My core aspect is the difference of ''list'' and ''atom'' in arbitrary nested lists, 
not strings and numbers.
My proc ''lequal'' above treats both "''' a '''" and "'''{a}'''" as list with 1 element
(because enclosed with braces or spaces), but "'''a'''" as ''atom''. 
Perhaps a questionable decision, but I found no better one.

In the other aspects (.1 == 0.1, but .1 ne 0.1), you are right, but I did not care of that aspect.

----

[APE] 2021-04-15:  A diff function to compare 2 lists:

======
proc ldiff {l1 l2} {
    set added {} 
    set removed {} 
    set unchanged $l1
    foreach el1 $l1 {        if {[lsearch -exact $l2 $el1] == -1} {
            lappend removed $el1
            set idx [lsearch $unchanged $el1]
            set unchanged [lreplace $unchanged $idx $idx]
        }
    }
    foreach el2 $l2 {        if {[lsearch -exact $l1 $el2] == -1} {
            lappend added $el2
        }
    }    return [list added [list $added] removed [list $removed] unchanged [list $unchanged]]
}
======

example :

======ldiff {[list 1 2 3 6}] {[list 1 2 4 5}]
======
returns : added {4 5} removed {3 6} unchanged {1 2}


[PYK] 2021-04-17:  This is a hybrid of a list operation and a set operation
because it ignores order, but still considers duplicate values separately.
Also, the default matching mode  for `[lsearch]` is `[glob]`, so `-exact` is
probably needed here.
[APE] 2021-04-20: Thanks, added