[CJB]: Returns the list without duplicate elements. This is different from [[[lsort] -unique]] in that it maintains the list ordering. It also keeps the first element encountered and removes subsequent callings. ====== proc uniqueList {list} { set new {} foreach item $list { if {[lsearch $new $item] < 0} { lappend new $item } } return $new } ====== ---- [AMG]: Here's another way which internally uses a hash table (amortized constant time per element, linear time overall) instead of a linear search (linear time per element, quadratic time overall): ====== proc uniqueList2 {list} { set dict {} foreach item $list { dict set dict $item "" } dict keys $dict } ====== ---- [AMG]: Performance timing: ====== set test [split "this is my test data" ""] llength $test ;# 20 elements time {uniqueList $test} 100000 ;# 20.78 microseconds per iteration time {uniqueList2 $test} 100000 ;# 13.6 microseconds per iteration set test [lrepeat 1000 {*}$test]; list llength $test ;# 20000 elements time {uniqueList $test} 1000 ;# 17562.0 microseconds per iteration time {uniqueList2 $test} 1000 ;# 6844.0 microseconds per iteration ====== [wdb] Performance is reduced by usage of ''lsearch'' in procedure unique above. If you use the infix operator ''in'' resp. ''ni'' like here (needs Tcl 8.5) ... ====== proc uniqueList3 {list} { set new {} foreach item $list { if {$item ni $list} { lappend new $item } } return $new } ====== ... then the results show a different image: ====== % time {uniqueList $test} 100000 29.59143 microseconds per iteration % time {uniqueList2 $test} 100000 22.08167 microseconds per iteration % time {uniqueList3 $test} 100000 16.23675 microseconds per iteration ====== [AMG]: Yikes, there is a critical error in your code!! It always returns empty. Let me fix and retest: ====== proc uniqueList3 {list} { set new {} foreach item $list { if {$item ni $new} { lappend new $item } } return $new } time {uniqueList $test} 100000 ;# 37.13617 microseconds per iteration time {uniqueList2 $test} 100000 ;# 25.42803 microseconds per iteration time {uniqueList3 $test} 100000 ;# 20.33975 microseconds per iteration ====== The gain over uniqueList2 is still there, but smaller. The corrected uniqueList3 runs in 80% the time of uniqueList2, whereas your original uniqueList3 ran in 75%. Again, the results are different when the list length is much longer. ====== set test [lrepeat 1000 {*}[split "this is my test data" ""]]; list time {uniqueList $test} 1000 ;# 30554.174 microseconds per iteration time {uniqueList2 $test} 1000 ;# 13517.655 microseconds per iteration time {uniqueList3 $test} 1000 ;# 14456.274 microseconds per iteration ====== This is because uniqueList2 uses a different class of algorithm than uniqueList and uniqueList3. For shorter lists, uniqueList3 is fastest. For longer lists, uniqueList2 is fastest. Somewhere around 20,000 elements, [in]/[ni] becomes slower than [dict], since in/ni uses a linear-time search whereas dict uses a constant-time search. Below 20,000 elements, the overhead of dict negates its performance benefit. Of course, this is not a complete analysis. Also one must consider the performance for lists with lots of repetition versus lists with predominantly unique elements. But I don't think it's useful to go into this much detail, not on this page. ---- [AMG]: When (if) I write my [[ring]] command (i.e. set manipulation), the following will be all that's necessary: ====== proc uniqueListRing {list} { # Warning: vaporware! does not actually work! ring create {*}$list } ====== It'll return a ring, which is my name for a set (since "[set]" is taken), and the string representation of a ring will be the same as that of a list of unique elements. Order will be preserved the same as [CJB] described above. The performance will be on the order of [[uniqueList2]], but it will be faster due to being implemented in [C]. Also the [Tcl_Obj] type of the returned value will be List, so no conversion will be necessary before doing list operations on it. This is because my plan is to augment the internal List struct with a pointer to an optional indexing structure that can be used to describe a [dict] or a ring. <> Command | Performance