Version 3 of Unique Element List

Updated 2010-07-17 00:16:32 by AMG

CJB: Returns the list without duplicate elements. This is different from [lsort -unique] in that it maintains the list ordering. It also keeps the first element encountered and removes subsequent callings.

proc uniqueList {list} {
  set new {}
  foreach item $list {
    if {[lsearch $new $item] < 0} {
      lappend new $item
    }
  }
  return $new
}

AMG: Here's another way which internally uses a hash table (amortized constant time per element, linear time overall) instead of a linear search (linear time per element, quadratic time overall):

proc uniqueList2 {list} {
  set dict {}
  foreach item $list {
    dict set dict $item ""
  }
  dict keys $dict
}

AMG: Performance timing:

set test [split "this is my test data" ""]
llength $test                                    ;#    20 elements
time {uniqueList  $test} 100000                  ;#    20.78 microseconds per iteration
time {uniqueList2 $test} 100000                  ;#    13.6  microseconds per iteration
set test [lrepeat 1000 {*}$test]; list
llength $test                                    ;# 20000 elements
time {uniqueList  $test} 1000                    ;# 17562.0 microseconds per iteration
time {uniqueList2 $test} 1000                    ;#  6844.0 microseconds per iteration

AMG: When (if) I write my [ring] command (i.e. set manipulation), the following will be all that's necessary:

proc uniqueListRing {list} {
    # Warning: vaporware! does not actually work!
    ring create {*}$list
}

It'll return a ring, which is my name for a set (since "set" is taken), and the string representation of a ring will be the same as that of a list of unique elements. Order will be preserved the same as CJB described above. The performance will be on the order of [uniqueList2], but it will be faster due to being implemented in C. Also the Tcl_Obj type of the returned value will be List, so no conversion will be necessary before doing list operations on it. This is because my plan is to augment the internal List struct with a pointer to an optional indexing structure that can be used to describe a dict or a ring.