## Version 2 of Kruskal-Wallis test

Updated 2010-09-21 02:46:31 by AKgnome

The Kruskal-Wallis test is a non-parametric one-way analysis of variance by ranks (named after William Kruskal and W. Allen Wallis) for testing equality of population medians among groups (see [L1 ] for more information).

This test is missing from the tcllib math package (here: ::math::statistics), so here is an implementation. The ranking of the grouped values can easily be split out into a separate command, since ranking is also needed on other occasions, independently of the test. My implementation takes the groups as separate arguments, each group being a list of values. It returns a list with the H value (the test statistic) and the p value (i.e. the probability, under the null hypothesis, of an H value this large or larger). The latter is computed using ::math::statistics::cdf-chisquare, so the tcllib math::statistics package is needed.
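For reference, the H value and p value described above can be written in the usual textbook form. This is only a sketch: it omits the tie-correction factor, and the symbols are notation introduced here rather than names from the code ($k$ groups, $N$ values in total, $n_i$ values in group $i$, $R_i$ the sum of the ranks in group $i$). The p value relies on the chi-square approximation, which is reasonable only when the groups are not too small.

```latex
H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^{2}}{n_i} - 3(N+1),
\qquad
p \approx 1 - F_{\chi^{2}_{k-1}}(H)
```

Here $F_{\chi^{2}_{k-1}}$ is the cumulative distribution function of the chi-square distribution with $k-1$ degrees of freedom.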

Here is the code:

```
package require Tcl 8.4
package require math::statistics

# kw-h group ?group ...?
# Each argument is one group (a list of numeric values).
# Returns a two-element list: the H statistic and its p value.
proc kw-h {args} {
    set index 0
    set rankList [list]
    set setCount [llength $args]
    foreach group $args {
        # prepare ranking with rank=0; each entry is {groupIndex value rank}:
        foreach value $group {lappend rankList [list $index $value 0]}
        incr index
    }
    # sort the values:
    set rankList [lsort -real -index 1 $rankList]
    # assign the ranks (disregarding ties):
    set length [llength $rankList]
    for {set i 0} {$i < $length} {incr i} {
        lset rankList $i 2 [expr {$i + 1}]
    }
    # value of the previous list element:
    set prevValue {}
    # list of indices of list elements having the same value (ties):
    set equalIndex [list]
    # test for ties and re-assign mean ranks for tied values; the loop runs
    # one step past the end (where value is empty) so that a run of ties at
    # the end of the list is processed as well:
    for {set i 0} {$i <= $length} {incr i} {
        set value [lindex $rankList $i 1]
        if {($value != $prevValue) && ($i > 0) && ([llength $equalIndex] > 0)} {
            # we are still missing the first tied value:
            set j [lindex $equalIndex 0]
            incr j -1
            set equalIndex [linsert $equalIndex 0 $j]
            # re-assign rank as mean rank of tied values:
            set firstRank [lindex $rankList [lindex $equalIndex 0] 2]
            set lastRank  [lindex $rankList [lindex $equalIndex end] 2]
            set newRank [expr {($firstRank + $lastRank)/2.0}]
            foreach j $equalIndex {lset rankList $j 2 $newRank}
            # clear list of equal elements:
            set equalIndex [list]
        } elseif {$value == $prevValue} {
            # remember index of equal value element:
            lappend equalIndex $i
        }
        set prevValue $value
    }
    # re-establish original sets of values, but using the ranks:
    foreach item $rankList {
        lappend rankValues([lindex $item 0]) [lindex $item 2]
    }
    # now compute H from the rank sum of each group:
    set H 0
    for {set i 0} {$i < $setCount} {incr i} {
        set total 0
        foreach rank $rankValues($i) {set total [expr {$total + $rank}]}
        set count [llength $rankValues($i)]
        set H [expr {$H + pow($total,2)/double($count)}]
    }
    set H [expr {$H*(12.0/($length*($length + 1))) - (3*($length + 1))}]
    # the p value comes from the chi-square distribution with
    # (number of groups - 1) degrees of freedom:
    incr setCount -1
    set p [expr {1 - [::math::statistics::cdf-chisquare $setCount $H]}]
    return [list $H $p]
}
```
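As noted above, the ranking step is useful on its own. The following is a minimal sketch of how it might be split out, not part of tcllib and not the exact code used inside kw-h: the command name `rank-with-ties` is hypothetical, and it assumes a flat list of numeric values. It returns the ranks in the original order of the values, with tied values sharing the mean of the ranks they occupy.

```tcl
package require Tcl 8.4

# Hypothetical helper (name chosen for illustration only):
# return the ranks of a flat list of numbers, in the original order,
# giving tied values the mean of the ranks they span.
proc rank-with-ties {values} {
    # pair each value with its original position, then sort by value:
    set pairs [list]
    set pos 0
    foreach v $values {
        lappend pairs [list $pos $v]
        incr pos
    }
    set pairs [lsort -real -index 1 $pairs]
    set n [llength $pairs]
    # start with a list of n placeholder ranks:
    set ranks [list]
    for {set i 0} {$i < $n} {incr i} {lappend ranks 0}
    set i 0
    while {$i < $n} {
        set value [lindex $pairs $i 1]
        # extend j to the last element of the run of values equal to $value:
        set j $i
        while {$j + 1 < $n && [lindex $pairs [expr {$j + 1}] 1] == $value} {
            incr j
        }
        # the run occupies the 1-based ranks i+1 .. j+1; use their mean:
        set meanRank [expr {($i + $j + 2) / 2.0}]
        for {set k $i} {$k <= $j} {incr k} {
            lset ranks [lindex $pairs $k 0] $meanRank
        }
        set i [expr {$j + 1}]
    }
    return $ranks
}
```

For example, `rank-with-ties {10 20 20 30}` returns `1.0 2.5 2.5 4.0`, since the two tied values share ranks 2 and 3. Scanning whole runs of equal values at once avoids the need for special handling of ties at the end of the sorted list.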

```
% puts [kw-h {6.4 6.8 7.2 8.3 8.4 9.1 9.4 9.7} {2.5 3.7 4.9 5.4 5.9 8.1 8.2} {1.3 4.1 4.9 5.2 5.5 8.2}]