Bob Techentin posted to comp.lang.tcl, in response to someone who wanted to get a list of numbers in a random order, the following: ---- ''To get a unique list of numbers (say, 1-1000) in a random order, you can first create an ordered list, then randomize the list by walking through once and performing a random "swap". Something like this:'' for {set i 0} {$i<1000} {incr i} {lappend nums $i} for {set i 0} {$i<1000} {incr i} { set j [expr {int(rand()*1000)}] set temp [lindex $nums $j] set nums [lreplace $nums $j $j [lindex $nums $i]] set nums [lreplace $nums $i $i $temp] } ---- This got me thinking about a variety of methods, and performance for each. The cool thing about Bentley 's method is that it works in linear time; it's essentially selection sort where a random oracle determines the lement to be selected. Alas, ''lreplace'' isn't the greatest performer on the planet. The problem is that it copies the list every time, meaning that Bob's code degenerates into quadratic behavior, as I discovered when I timed it. I then hacked up the following stuff to experiment with the behavior: '''An afterthought:''' ''lreplace'' isn't the problem. The problem is a bizarre interaction with ''lreplace'' and the lifetime of a ''Tcl_Obj''. Following up on an idea of Donal Fellows, I hacked up the ''shuffle1a'' procedure below to make sure that I was updating the list in place, and the performance improved incredibly! '''Another afterthought:''' Bob Techentin posted to comp.lang.tcl the ''shuffle3'' procedure with the message: I've tried timing this on my WinNT machine, and it runs 3x slower than shuffle1a, which I don't completely understand. It ''should'' just be moving the elements from one list to another, but something else must be going on. My (KBK) explanation is that ''lreplace'' has a special case when you're replacing elements of the list with the same number of elements. In this case, it is able to avoid copying. Otherwise, it winds up needing to slide the elements down, leading to quadratic behavior once again. '''One more time...''' Steve Cohen posted: This is a close approximation of/exactly the code that Kevin ran in his test, ins't it? I wonder if we could convince him to run *just one more routine* through his gammut. Here it is: Note that the only big modification is to simply swap the list element selected from the first list with the nth element from the original list. By not modifying the size of the list, Tcl seems to leave it alone (in memory). On my (really slow P133 laptop) machine, this runs linearly with list size. I would be interested to see what Kevin gets with his big-hootie computer. '''KBK:''' Congratulations, Steve, ''shuffle4'' is the best so far, for large lists. Numbers below. [Christoph Bauer] adds a significant improvement for short lists, in the ''shuffle5a'' procedure below. Note that it, too requires the ''K'' combinator for adequate performance. Like ''shuffle3'', it begins to show the strain of quadratic behavior once the list has many thousands of elements. [MS]: for slightly faster performance (5-10% on Tcl8.4a3, as of 2001/05/23), and even uglier appearance, replace the ''K'' combinator with the equivalent idiom: [K x y] <=> [lindex [list x y] 0] ---- * Procedure to generate a list of ''n'' numbers: proc iota { n } { for { set i 0 } { $i < $n } { incr i } { lappend retval $i } return $retval } * Several of the procedures rely on Donal Fellows's ''K'' combinator: proc K { x y } { set x } * ''shuffle0'' is the obvious method of generating random keys, then sorting the list according to those keys. ''TFW 4/18/01'' This was broken due to inital newlist 0 (should be empty). Also, if you insert the set retval [[list]] you get a non trivial speedup. Moral, don't let lappend initialize your lists for you. proc shuffle0 { list } { set newlist [list] foreach element $list { lappend newlist [list [expr { rand() }] $element] } set retval [list] foreach pair [lsort -real -index 0 $newlist] { foreach { random item } $pair { lappend retval $item } } return $retval } * ''shuffle1'' is Techentin's implementation of Bentley's method. proc shuffle1 { list } { set n [llength $list] for { set i 0 } { $i < $n } { incr i } { set j [expr {int(rand()*$n)}] set temp [lindex $list $j] set list [lreplace $list $j $j [lindex $list $i]] set list [lreplace $list $i $i $temp] } return $list } * ''shuffle1a'' is Techentin's code, with a clever hack (due to Donal Fellows) for managing the lifetime of the Tcl_Obj that represents the list so that it doesn't get copied needlessly. proc shuffle1a { list } { set n [llength $list] for { set i 0 } { $i < $n } { incr i } { set j [expr {int(rand()*$n)}] set temp1 [lindex $list $j] set temp2 [lindex $list $i] set list [lreplace [K $list [set list {}]] $j $j $temp2] set list [lreplace [K $list [set list {}]] $i $i $temp1] } return $list } * ''shuffle2'' implements Bentley's method, unpacking the list to an array first. proc shuffle2 { list } { set n 0 foreach element $list { set data($n) $element incr n } for { set i 0 } { $i < $n } { incr i } { set j [expr { int( rand() * $n ) }] set temp $data($j) set data($j) $data($i) set data($i) $temp } for { set i 0 } { $i < $n } { incr i } { lappend retval $data($i) } return $retval } * ''shuffle3'' is Bob Techentin's implementation of Stephen D. Cohen's proposed method. proc shuffle3 { list } { set n [llength $list] while {$n>0} { set j [expr {int(rand()*$n)}] lappend slist [lindex $list $j] set list [lreplace [K $list [set list {}]] $j $j] incr n -1 } return $slist } * ''shuffle4'' is Steve Cohen's improved implementation: proc shuffle4 { list } { set n [llength $list] while {$n>0} { set j [expr {int(rand()*$n)}] lappend slist [lindex $list $j] incr n -1 set temp [lindex $list $n] set list [lreplace [K $list [set list {}]] $j $j $temp] } return $slist } * ''shuffle5'' and ''shuffle5a'' are from Christoph Bauer. They differ only in the use of the ''K'' combinator. proc shuffle5 { list } { set n 1 set slist {} foreach item $list { set index [expr {int(rand()*$n)}] set slist [linsert $slist $index $item] incr n } return $slist } proc shuffle5a { list } { set n 1 set slist {} foreach item $list { set index [expr {int(rand()*$n)}] set slist [linsert [K $slist [set slist {}]] $index $item] incr n } return $slist } * The test harness times the various methods and prints the results. puts " Times in usec for shuffle methods" puts " Method List length" puts " 1 10 100 1000 10000" puts " -------------------------------------------------" foreach method { shuffle0 shuffle1 shuffle1a shuffle2 shuffle3 shuffle4 shuffle5 shuffle5a } { set line " " append line $method foreach n { 1 10 100 1000 10000 } { set list [iota $n] if { [string equal $method shuffle1] && $n > 1000 } { append line " ------" } else { set t [lindex [time { $method $list } [expr 100000/$n]] 0] append line [format "%8d" $t] } } puts $line } ---- The results are summarized in the following table (8.3.2, WinNT 4.0 SP6, 550 MHz PIII-SpeedStep): Times in usec for shuffle methods Method List length 1 10 100 1000 10000 ------------------------------------------------- shuffle0 78 253 2053 21730 287400 shuffle1 37 212 3675 215110 ------ shuffle1a 47 287 2754 28140 286400 shuffle2 58 271 2513 26240 301400 shuffle3 47 252 2363 29140 890200 shuffle4 50 274 2504 25240 256300 shuffle5 28 104 1342 68800 ------ shuffle5a 33 153 1382 18830 785100 If anyone wants to see a graph of the result, you can get one by running the code at [Shuffle a list: graph results]. (All times were measured using Tcl 8.3.2 on a 550 MHz P-III SpeedStep laptop, with a modified tclWinTime.c that uses the Windows performance counter as a resolution multiplier. I'll be posting a patch for that shortly.) Summary: * For short lists (up to a hundred elements or so is "short"), Bauer's ''shuffle5'' procedure is a clear win, because it exploits the speed of ''foreach''. (This tradeoff may change, given that the maintainer is contemplating bytecoding other list operations in version 8.4.) * For lists longer than a few hundred elements, any implementation that does list surgery without the ''K'' combinator runs into extremely expensive quadratic behavior. What happens is that ''linsert'' or ''lreplace'' discovers that it has a ''Tcl_Obj'' with a reference count greater than 1 and is forced to copy it. For medium-sized lists, say 100-1000 elements, ''shuffle5a'', which is ''shuffle5'' with the K combinator added, is a clear winner. * Similarly, ''shuffle3'' causes ''lreplace'' to copy, on average, half the list. This copying leads to quadratic behavior as well, although it's not nearly as bad as 'shuffle1'. The ''shuffle5'' procedure runs into the same behavior in ''linsert.'' * The ''shuffle4'' procedure emerges a winner for the longest lists, although ''shuffle0'', ''shuffle1a'', and ''shuffle2'' are all extremely close. The lesson here is that it's important to structure the code so that the list never gets resized -- otherwise, there's a lot of memory copying involved. Some of the bytecode changes between 8.3.2 and 8.4 appear to favor ''shuffle1a'' instead. * ''lsort'' is surprisingly inexpensive. All of these procedures seem to be dominated by the linear overheads. [Kevin Kenny] ---- [Jeffrey Hobbs]: I have added this to the tclbench suite, but I find after quite a bit of testing, that ''shuffle1a'' consistently performas as well as or better than ''shuffle4'', with ''shuffle5a'' being a good general solution for lists up to 1000 or so items. My last timings with 8.3.3: Win2K/P3-800: Tcl8.3.3 Times in usec for shuffle methods Method List length 1 10 100 1000 10000 -------------------------------------------------- shuffle0 50 155 1315 13920 214300 shuffle1-s 24 125 2191 130895 ------ shuffle1a 30 170 1578 15725 162750 shuffle2 36 176 1515 15870 200800 shuffle3 30 150 1378 16825 535800 shuffle4 32 165 1478 14675 151700 shuffle5-s 18 66 839 38505 ------ shuffle5a 22 96 864 12015 478200 Linux/P3-???: Tcl8.3.3 Times in usec for shuffle methods Method List length 1 10 100 1000 10000 -------------------------------------------------- shuffle0 143 405 3089 32796 396658 shuffle1-s 64 296 3579 398035 ------ shuffle1a 79 407 3653 36153 363324 shuffle2 104 431 3598 36537 402357 shuffle3 87 406 3470 37679 778334 shuffle4 92 436 3734 36761 368350 shuffle5-s 47 182 1854 89727 ------ shuffle5a 54 239 2033 22177 616730 ---- [Janos Holanyi] Linux 2.4.14-686 / P3-667 (almost no processes*, single user): Times in usec for shuffle methods Method List length 1 10 100 1000 10000 ------------------------------------------------- shuffle0 34 152 1382 14706 205798 shuffle1 25 151 1889 82133 ------ shuffle1a 30 198 1854 18546 189828 shuffle2 37 182 1632 16255 195863 shuffle3 32 173 1592 16757 280023 shuffle4 34 186 1700 17024 175342 shuffle5 17 73 759 25910 3702327 shuffle5a 20 92 786 8732 194077 *'almost no processes' means: PID TTY STAT TIME COMMAND 1 ? S 0:00 init [S] 2 ? SW 0:00 [keventd] 3 ? SWN 0:00 [ksoftirqd_CPU0] 4 ? SW 0:00 [kswapd] 5 ? SW 0:00 [bdflush] 6 ? SW 0:00 [kupdated] 161 tty1 S 0:00 init [S] 162 tty1 S 0:00 bash # cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 8 model name : Pentium III (Coppermine) stepping : 3 cpu MHz : 666.545 cache size : 256 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse bogomips : 1330.38