Version 6 of Counting Elements in a List

Updated 2004-02-23 09:20:25

Is there a simple way to count the number of matching elements in a list? The answer is "Yes. Several ways."


And just to keep things interesting, we can compare them in the context of a little test harness that times how quickly they run. There are many times when a Tcl programmer might want to compare a couple of different techniques to see which is fastest. The time command can often help. (See the Tcl Performance page for more information on speed improvements.) -- RWT

    #!/bin/sh
    # restart on the next line using tclsh \
    exec tclsh "$0" "$@"

    #----------------------------------------
    #  Define procs to test each method for
    #  counting identical list items.  This
    #  enables the byte-code compiler to
    #  optimize the code.
    #----------------------------------------
    proc count_members1 list {
        foreach member $list {
            if {[info exists count($member)]} {
                incr count($member)
            } else {
                set count($member) 1
            }
        }
    }  

    proc count_members2 list {
        foreach x $list {
            if {[catch {incr count($x)}]} {set count($x) 1}
        }
    }  

    proc count_members3 list {
        foreach x $list {
            expr {[catch {incr count($x)}] && [set count($x) 1]}
        }
    }  

    proc count_members4 list {
        foreach x $list {
            lappend ulist($x) {}
        }
        foreach name [array names ulist] {
            set count($name) [llength $ulist($name)]
        }
    }     

    proc count_members5 list {
        foreach x $list {
            append ulist($x) .
        }
        foreach name [array names ulist] {
            set count($name) [string length $ulist($name)]
        }
    }     

    #----------------------------------------
    #  Create some test data.  In this case,
    #  build a list of 10,000 items
    #----------------------------------------
    set items [list john paul jones mary]
    for {set i 0} {$i<10000} {incr i} {
        lappend data [lindex $items [expr {int(rand()*[llength $items])}]]
    }   


    #----------------------------------------
    #  Print some information about our
    #  environment.  This is very useful
    #  when consulting comp.lang.tcl.
    #----------------------------------------
    puts "[info patchlevel] over $tcl_platform(os) $tcl_platform(osVersion)."


    #----------------------------------------
    #  Run the tests.
    #  Note that we have cleverly named
    #  the test procs so that [info] can
    #  easily find and execute them.
    #----------------------------------------
    foreach proc [info proc count_members*] {
        puts ""
        puts "$proc"
        puts [time {$proc $data} 10]
    }

RS 2004-02-20: Note however that the above procs don't really yield their count - the local array is discarded on return. For practical use, I modified count_members4 which was among the fastest in my tests on WinXP, to return a list of {element count} pairs:

 proc lcount list {
    foreach x $list {lappend arr($x) {}}
    set res {}
    foreach name [array names arr] {
       lappend res [list $name [llength $arr($name)]]
    }
    return $res
 }
 % lcount {yes no no present yes yes no no yes present yes no no yes yes}
 {no 6} {yes 7} {present 2}

The list is in hash (i.e., apparently no) order, but you can post-process it to

  • alphabetic: lsort [lcount $list]
  • numeric: lsort -integer -index 1 -decr [lcount $list]

I don't have 8.5 at hand yet, but I expect this dict version to be a good solution too:

 proc lcount list {
    set count {}
    foreach element $list {dict incr count $element}
    set count
 }

See Chart of proposed list functionality too.


Category Command