Version 0 of Shuffling a list

Updated 2005-03-16 08:45:02

AMG: I guess RS wanted this page, so here goes... :^)

Version 1: sorting random reals

Here's a proc, adapted from unsort:

 proc shuffle {data} {
     set rand_data [list]
     foreach elem $data {
          lappend rand_data [list [expr {rand()}] $elem]
     }

     set result [list]
     foreach elem [lsort -real -index 0 $rand_data] {
         lappend result [lindex $elem 1]
     }

     return $result
 }

It works, but it's seems suboptimal.

Version 2: sorting random integers

For starters, the real sort bothers me. I imagine sorting by integer would be more efficient, but then again producing random integers is (currently) slower than producing random reals. Change rand() for int(0x7fffffff * rand()) and -real for -integer and it winds up taking longer. See the benchmark below. (By the way, how am I supposed to portably get the maximum and minimum integer?)

Next, having up to three copies of the list (one of which is augmented) appears to be a waste of RAM.

Version 3: in-place swaps

So would it be better to do an in-place shuffle? For each element of $data it could swap with another, randomly-selected element of $data. I'll take a shot at it:

 proc shuffle {data} {
     set length [llength $data]
     for {set idx_1 0} {$idx_1 < $length} {incr idx_1} {
         set idx_2 [expr {int($length * rand())}]
         set temp [lindex $data $idx_1]
         lset data $idx_1 [lindex $data $idx_2]
         lset data $idx_2 $temp
     }
     return $data
 }

Big improvement.

Benchmarking 1, 2, and 3

Here's my magical test data:

 set data [lrepeat 100000 x y z z y]

If you're trying this at home from your interactive tclsh, you might want to type "; puts -nonewline {}" after the end of the above line or else its printed result will scroll for a very long time.

Now I time the shuffles:

400 MHz : 1500 MHz
 1, random reals    : 21.847024 s :  9.541117 s
 2, random integers : 24.649960 s :  9.857447 s
 3, in-place swaps  :  7.650484 s :  2.328508 s

Wow, I think I will go with #3! Hold on while I update unsort... :^)


[ [Category Something] - please classify ]