AMG: I guess RS wanted this page, so here goes... :^)
This page discusses shuffling algorithms and implementations thereof.
Version 1: sorting random reals
Here's a proc, adapted from unsort:
proc shuffle {data} { set rand_data [list] foreach elem $data { lappend rand_data [list [expr {rand()}] $elem] } set result [list] foreach elem [lsort -real -index 0 $rand_data] { lappend result [lindex $elem 1] } return $result }
It works, but it's seems suboptimal.
Version 2: sorting random integers
For starters, the real sort bothers me. I imagine sorting by integer would be more efficient, but then again producing random integers is (currently) slower than producing random reals. Change rand() for int(0x7fffffff * rand()) and -real for -integer and it winds up taking longer. See the benchmark below. (By the way, how am I supposed to portably get the maximum and minimum integer?)
Next, having up to three copies of the list (one of which is augmented) appears to be a waste of RAM.
Version 3: in-place swaps
So would it be better to do an in-place shuffle? For each element of $data it could swap with another, randomly-selected element of $data. I'll take a shot at it:
proc shuffle {data} { set length [llength $data] for {set idx_1 0} {$idx_1 < $length} {incr idx_1} { set idx_2 [expr {int($length * rand())}] set temp [lindex $data $idx_1] lset data $idx_1 [lindex $data $idx_2] lset data $idx_2 $temp } return $data }
Big improvement.
Benchmarking 1, 2, and 3
Here's my magical test data:
set data [lrepeat 100000 x y z z y]
If you're trying this at home from your interactive tclsh, you might want to type "; puts -nonewline {}" after the end of the above line or else its printed result will scroll for a very long time.
Now I time the shuffles:
.................................................. : : 400 MHz : 1500 MHz : :....................:.............:.............: : 1, random reals : 21.847024 s : 9.541117 s : : 2, random integers : 24.649960 s : 9.857447 s : : 3, in-place swaps : 7.650484 s : 2.328508 s : :....................:.............:.............:
Wow, I think I will go with #3! Hold on while I update unsort... :^)
More?
I just thought of another algorithm. Basically I randomly select an element from the input list, remove it from the input list, and place it at the end of the output list. The reason I thought of it is, for unsort it's not necessary to maintain an output list; simply puts the elements as they come.
But this algorithm involves a lot of copying--- lset is nice for in-place modifies, but there's no in-place element removal. So it's liable to be slower than #3.
[ [Category Something] - please classify ]