How fast can we sort 5 elements

The Tcl2002 programming contest: problem 1 asked the entrants to write code to replicate the functionality of

    lsort -index 0 -ascii $list

without using the [lsort] command. It was stated that it was known in advance that the list would contain exactly five elements.

In preparing for the contest, one of the judges KBK decided to see just what was possible if the entire sort was placed in straight-line code (that is, if any loops were unrolled). On finding that performance improved by only 15% or so, he decided to get more aggressive, writing code to do the sort in seven comparisons. Doing so requires some sophistication; none of the familiar sort algorithms will achieve it:

  • bubble sort, insertion sort, merge sort and Quicksort all will require ten comparisons in the worst case.
  • a carefully coded merge sort requires eight comparisons in the worst case.
  • heapsort requires at least nine comparisons in the worst case.

In fact, it may be expected to be difficult, because n comparisons can distinguish among only 2**n possibilities; seven comparisons distinguish only 128 cases, and there are 120 possible orderings of five elements. Nevertheless, it is possible, using a technique called merge insertion; see Volume 3 of Knuth's The Art of Computer Programming for the mathematical details.


# We will begin by extracting the five list elements into variables ''p0''..''p4'', and the five sort keys into ''x0''..''x4''. The generated code will use these variables.

# The ''sort5'' code will be too long for us to want to write it out in longhand, so let's write some Tcl code to generate it, instead.  We start with some basics.  We're going to want the generated code to be nicely indented.  It turns out that the indentation will closely track the level of recursion in the Tcl code that will write the ''sort5'' procedure, so we can use the following little procedure to make the spaces at the beginning of a line.

 proc indent {} {
    return [format {%*s} [expr { 4 * [info level] - 12 }] {}]
 }

# The generated code is going to have a nest of constucts that all look like:
# if { [string compare $x1 $x2] <= 0 } {
#     ... code ...
# } else {
#     ... more code ...
# }
# The <= operator will be replaced with < if x2 preceded x1 in the original list.  This replacement will guarantee stability.  Here are procedures to generate the comparisons:

 proc emitCompare { v0 v1 codeVar } {
    upvar 1 $codeVar code
    append code [indent] {if } \{ \[ {string compare $x} $v0 { $x} $v1 \]
    if { $v0 < $v1 } {
        append code { <= }
    } else {
        append code { < }
    }
    append code {0 } \} { } \{ \n
 }
 proc emitElse { codeVar } {
    upvar 1 $codeVar code
    append code [indent] \} { else } \{ \n
 }
 proc emitEndIf { codeVar } {
    upvar 1 $codeVar code
    append code [indent] \} \n
 }

# We also need to generate a [[return]] command to sort the list once the correct order of elements is known:

 proc emitReturn { a b c d e codeVar } {
    upvar 1 $codeVar code
    append code [indent] {    return } \[ {list $p} $a { $p} $b { $p} $c \
        { $p} $d { $p} $e \] \n
 }

# With these preliminaries out of the way, here's the procedure that makes 'sort5'.  It has the variable extraction code; everything else happens in 'pair0', which follows it:

 proc makeSort5Proc {} {
    set code {}
    append code {proc sort5 { list } } \{ \n
    for { set i 0 } { $i < 5 } { incr i } {
        append code \
            {    set x} $i { [lindex [set p} $i { [lindex $list } $i {]] 0]} \n
    }
    pair0 code
    append code \} \n
    return $code
 }

# On entry to the code generated by 'pair0', nothing is known about the order of the elements. 'pair0' compare elements 0 and 1, and calls 'pair1' with the element indices correctly ordered:

 proc pair0 { codeVar } {
   upvar 1 $codeVar code
    emitCompare 0 1 code
    pair1 0 1 code
    emitElse code
    pair1 1 0 code
    emitEndIf code
 }

# On entry to 'pair1', it is known that the element at index ''a'' precedes the element at index ''b''. Nothing else is known.  The 'pair1' procedure compares elements 2 and 3, and calls 'pair2' with the results:

 proc pair1 { a b codeVar } {
    upvar 1 $codeVar code
    emitCompare 2 3 code
    pair2 $a $b 2 3 code
    emitElse code
    pair2 $a $b 3 2 code
    emitEndIf code
 }

# On entry to 'pair2', it is known that the element at index ''a'' precedes the one at index ''b'' and that the one at index ''c'' precedes the one at index ''d''. Nothing has yet examined element 4. The 'pair2' procedure compares elements ''b'' and ''d'', and calls 'place5' with the results.

 proc pair2 { a b c d codeVar } {
    upvar 1 $codeVar code
    emitCompare $b $d code
    place5 $a $b $c $d code
    emitElse code
    place5 $c $d $a $b code
    emitEndIf code
 }

# After three comparisons, things begin to get interesting. Let's represent the relationships by connecting the elements with lines. If element ''a'' is known to precede element ''b'', then a line joins ''a'' and ''b'' with ''a'' at the left end. What's known so far is:
#     b---d
#    /   /
#   a   c
# Nothing is known about element 4. Our next plan is to use binary search to insert it into the chain of elements ''a'', ''b'' and ''d'', so we begin by comparing it with element ''b''.

 proc place5 { a b c d codeVar } { 
     upvar 1 $codeVar code
     emitCompare 4 $b code
     place5a $a $b $c $d code
     emitElse code
     place5b $a $b $c $d code
     emitEndIf code
 }

# We now have done four comparisons, and established the relationships:
 #    a---b---d
 #       /   /
 #      4   c
# Continue the binary search by comparing element 4 against element ''a''.

 proc place5a { a b c d codeVar } {
     upvar 1 $codeVar code
     emitCompare 4 $a code
     insert3 4 $a $b $d $c code
     emitElse code
     insert3 $a 4 $b $d $c code
     emitEndIf code
 }

# In this procedure, we have done four comparisons, and established the relationships:
#          4
#         /
#    a---b---d 
#           /
#          c
# Continue the binary search by comparing element 4 agains element ''d''.

 proc place5b { a b c d codeVar } {
     upvar 1 $codeVar code
     emitCompare 4 $d code
     insert3 $a $b 4 $d $c code
     emitElse code
     insert2 $a $b $d 4 $c code
     emitEndIf code
 }

# Five comparisons have completed the binary search that inserts the last element into a three-element chain. We now have the relationships:
#    a---b---c---d
#               /
#              e
# We start another binary search to insert element ''e'' into the chain
# formed by elements ''a'', ''b'' and ''c''. Once again, things begin by
# comparing it against the middle element ''b'':

 proc insert3 { a b c d e codeVar } {
     upvar 1 $codeVar code
     emitCompare $e $b code
     insert3a $a $b $c $d $e code
     emitElse code
     insert3b $a $b $c $d $e code
     emitEndIf code
 }

# Six comparisons have established:
#    a---b---c---d
#       /
#      e
# One more comparison determines the sequence.

 proc insert3a { a b c d e codeVar } {
     upvar 1 $codeVar code
     emitCompare $e $a code
     emitReturn $e $a $b $c $d code
     emitElse code
     emitReturn $a $e $b $c $d code
     emitEndIf code
 }

# Six comparisons have established:
#         c
#        / \ 
#   a---b   d
#        \ /
#         e
# One more comparison determines the order of ''c'' and ''e''.

 proc insert3b { a b c d e codeVar } {
     upvar 1 $codeVar code
     emitCompare $e $c code
     emitReturn $a $b $e $c $d code
     emitElse code
     emitReturn $a $b $c $e $d code
     emitEndIf code
 }

# The next procedure handles the (lucky) case where five comparisons establish the relationships:
#    a---b---c---d
#           /
#          e
# Compare ''e'' against ''b'', then against ''a'' if necessary.

 proc insert2 { a b c d e codeVar } {
     upvar 1 $codeVar code
     emitCompare $e $b code
     insert3a $a $b $c $d $e code
     emitElse code
     emitReturn $a $b $e $c $d code
     emitEndIf code
 }

# A call to [[makeSort5Proc]] finishes the job:
 eval [makeSort5Proc]

The generated procedure is 485 lines long, and consists entirely of straight-line comparisons. It begins:

 proc sort5 { list } {
    set x0 [lindex [set p0 [lindex $list 0]] 0]
    set x1 [lindex [set p1 [lindex $list 1]] 0]
    set x2 [lindex [set p2 [lindex $list 2]] 0]
    set x3 [lindex [set p3 [lindex $list 3]] 0]
    set x4 [lindex [set p4 [lindex $list 4]] 0]
    if {[string compare $x0 $x1] <= 0 } {
        if {[string compare $x2 $x3] <= 0 } {
            if {[string compare $x1 $x3] <= 0 } {
                if {[string compare $x4 $x1] < 0 } {
                    if {[string compare $x4 $x0] < 0 } {
                        if {[string compare $x2 $x0] < 0 } {
                            if {[string compare $x2 $x4] <= 0 } {
                             return [list $p2 $p4 $p0 $p1 $p3]
                            } else {
                             return [list $p4 $p2 $p0 $p1 $p3]
                            }
                        } else {
                            if {[string compare $x2 $x1] < 0 } {
                             return [list $p4 $p0 $p2 $p1 $p3]
                            } else {
                             return [list $p4 $p0 $p1 $p2 $p3]
                            }
                        }
                    } else {
                        if {[string compare $x2 $x4] <= 0 } {
                            if {[string compare $x2 $x0] < 0 } {
                             return [list $p2 $p0 $p4 $p1 $p3]
                            } else {
                             return [list $p0 $p2 $p4 $p1 $p3]
                            }
 . . .

The generated procedure needs only 141% of the time that a procedure consisting of the lsort command uses. There is at least one way to squeeze a tiny bit more out of it, by removing the 'set' statements at the top of the procedure, and instead placing them in-line in the first place that the corresponding variables appear on any branch through the code. KBK has tried this, and discovered that it saves less than 2% of the remaining time. We appear here to be close to the absolute minimum run time!


If you want to try things out for yourself, the program that the judges used to evaluate contestants is at Tcl2002 programming contest: problem 1 test harness


See Also