Shuffling a list

AMG: I guess RS wanted this page, so here goes... :^)

RS: Oops, sorry, what I meant was Shuffle a list ...

AMG: Ah, well, at least I got to write another un/sorting story page.

This page discusses shuffling algorithms and implementations thereof.

Version 1: sorting random reals

Here's a proc, adapted from unsort:

proc shuffle {data} {
    set rand_data [list]
    foreach elem $data {
        lappend rand_data [list [expr {rand()}] $elem]
    }

    set result [list]
    foreach elem [lsort -real -index 0 $rand_data] {
        lappend result [lindex $elem 1]
    }

    return $result
}

It works, but it's seems suboptimal.

Version 2: sorting random integers

For starters, the real sort bothers me. I imagine sorting by integer would be more efficient, but then again producing random integers is (currently) slower than producing random reals. Change rand() for int(0x7fffffff * rand()) and -real for -integer and it winds up taking longer. See the benchmark below. (By the way, how am I supposed to portably get the maximum and minimum integer?)

Next, having up to three copies of the list (one of which is augmented) appears to be a waste of RAM.

Version 3: in-place swaps

So would it be better to do an in-place shuffle? For each element of $data it could swap with another, randomly-selected element of $data. I'll take a shot at it:

proc shuffle {data} {
    set length [llength $data]
    for {set idx_1 0} {$idx_1 < $length} {incr idx_1} {
        set idx_2 [expr {int($length * rand())}]
        set temp [lindex $data $idx_1]
        lset data $idx_1 [lindex $data $idx_2]
        lset data $idx_2 $temp
    }
    return $data
}

Big improvement.

Benchmarking 1, 2, and 3

Here's my magical test data:

 set data [lrepeat 100000 x y z z y]

If you're trying this at home from your interactive tclsh, you might want to type "; puts -nonewline {}" after the end of the above line or else its printed result will scroll for a very long time.

Now I time the shuffles:

 ..................................................
 :                    :   400 MHz   :  1500 MHz   :
 :....................:.............:.............:
 : 1, random reals    : 21.847024 s :  9.541117 s :
 : 2, random integers : 24.649960 s :  9.857447 s :
 : 3, in-place swaps  :  7.650484 s :  2.328508 s :
 :....................:.............:.............:

Wow, I think I will go with #3! Hold on while I update unsort... :^)

Version 4: in-place swaps, better random behavior

AMG: Per Lars H's suggestion, I improved the code again. As Lars explains in unsort, the old code doesn't have a perfectly uniform distribution because, for any given list, the size of the set of permutations from which it chooses is not a multiple of the size of the set of all possible permutations. This is similar to doing "random() % 3" in C and getting 0's more often than 2's.

proc shuffle {data} {
    set length [llength $data]
    for {} {$length > 1} {incr length -1} {
        set idx_1 [expr {$length - 1}]
        set idx_2 [expr {int($length * rand())}]
        set temp [lindex $data $idx_1]
        lset data $idx_1 [lindex $data $idx_2]
        lset data $idx_2 $temp
    }
    return $data
}

That ought to do it! However, it's (theoretically) a little bit slower due to modifying more variables in the inner loop. I also made a less-readable version that incr's idx_1 rather than recalculating it with expr, for comparison's sake. Here are the numbers, using the same benchmark.

 ..................................................
 :                    :   400 MHz   :  1500 MHz   :
 :....................:.............:.............:
 : 4 , more random    :  8.428660 s :  2.286190 s :
 : 4a, less readable  :  7.824165 s :  2.163990 s :
 :....................:.............:.............:

For reasonably-sized lists (under 500,000 elements, I guess), the speed difference between #3, #4, and #4a is negligible. So I choose #4, which has a more uniform distribution and doesn't have grody sources due to silly optimizations.

I wonder why the 400 MHz results for #4 and #4a are slower than for #3 yet the 1500 MHz results are the other way around...? I guess this proc is random in more than one way.

SMP: Here is what 4a would look like and a re-benchmark. I think a 4.1% speedup is worth it. I deal with list sizes in the millions. I also use upvar to prevent passing gigantic strings by value (I don't know if Tcl tries to optimize or not).

proc shuffle {in_list_ out_list_} {
  upvar 1 $in_list_  in_list
  upvar 1 $out_list_ out_list
  set out_list $in_list
  for {set idx_1 [expr {[llength $out_list] - 1}]} {$idx_1 > 0} {incr idx_1 -1} {
    set idx_2 [expr {int(($idx_1 + 1) * rand())}]
    set temp [lindex $out_list $idx_1]
    lset out_list $idx_1 [lindex $out_list $idx_2]
    lset out_list $idx_2 $temp
  }
}

Tested using the benchmark method above.

 ...........................................................
 : 4                  :  365282 microseconds per iteration :
 : 4a                 :  350104 microseconds per iteration :
 :....................:....................................:

More?

Here's another algorithm. Basically you randomly select an element from the input list, remove it from the input list, and place it at the end of the output list. The reason I thought of it is, for unsort it's not necessary to maintain an output list; simply puts the elements as they come.

But this algorithm involves a lot of copying--- lset is nice for in-place modifies, but there's no in-place element removal. So it's liable to be slower than #4.

Category Algorithm

Category Performance