'''Purpose:''' Show how to improve the output of the math function [rand] to reduce sequential correlation. ---- The sequence of numbers returned by [rand] is prone to sequential correlations if large runs of numbers are taken. For example, if clusters of six sequential results from expr { rand () } are analyzed, it will turn out that they lie on only 40 hyperplanes in six-dimensional space. This limitation means that rand() is not suitable by itself for applications like high-resolution Monte Carlo integration. The following code implements a "better" rand that reduces this problem by shuffling the sequence that is returned. For further discussion, see [Donald E. Knuth], ''The Art of Computer Programming, Volume 2: Seminumerical Algorithms,'' which has a wealth of information about random number generators and their foibles. ---- **Implementation using array** namespace eval BayesDurham { namespace export rand variable poolSize 97; # Size of the pool of random numbers variable v; # Pool of random numbers variable y; # Most recently generated random number } #---------------------------------------------------------------------- # # BayesDurham::rand -- # # Generate a random floating point number between 0 and 1 # # Parameters: # None. # # Results: # Returns a randomly chosen floating point number in the # range [0 .. 1) # # Side effects: # Stores state in several namespace variables. # # The BayesDurham::rand procedure attempts to improve upon the # system-supplied rand() function by breaking up sequential # correlations. It works by shuffling a vector of numbers supplied by # the system's random number generator, using its own output to choose # the next slot in the vector. # #---------------------------------------------------------------------- proc BayesDurham::rand {} { variable poolSize variable v variable y # Test whether we're initializing the sequence if { ![info exists v] } { # Discard some random numbers - in case the RNG was just # seeded. for { set i 0 } { $i < $poolSize } { incr i } { expr { rand() } } # Save a pool of random numbers for { set i 0 } { $i < $poolSize } { incr i } { set v($i) [expr { rand() }] } # Generate one more random number set y [expr { rand() }] } # Use the most recently generated number to select from the pool. set i [expr { int( double( $poolSize ) * $y ) }] set y $v($i) set v($i) [expr { rand() }] return $y } ---- **Implementation using list** [EKB] Here's an alternative implementation: namespace eval shufflerand { variable vals variable vallen 98 } proc shufflerand::init {args} { variable vals variable vallen set haveseed false foreach {arg argval} $args { switch -- $arg { -len { if [string is integer $argval] { if {$argval > 0} { set vallen $argval } else { error "[namespace current]::init: Length must be greater than zero" } } else { error "[namespace current]::init: Length must be an integer" } } -seed { if [string is integer $argval] { set seed $argval set haveseed true } else { error "[namespace current]::init: Seed must be an integer" } } default { error "[namespace current]::init: Possible arguments are: -seed -len" } } } set vals {} if $haveseed { lappend vals [expr {srand($seed)}] } else { lappend vals [expr {rand()}] } for {set i 1} {$i < $vallen} {incr i} { lappend vals [expr {rand()}] } } proc shufflerand::rand {} { variable vals variable vallen set retval [lindex $vals end] set ndx [expr {int($vallen * $retval)}] lset vals end [lindex $vals $ndx] lset vals $ndx [expr {rand()}] return $retval } # Test console show shufflerand::init puts "A list of 10 random numbers from shuffled list:" for {set i 0} {$i < 10} {incr i} { puts [shufflerand::rand] } puts "\nTime to generate 1000 values using rand():" puts [time {for {set i 0} {$i < 1000} {incr i} {expr {rand()}}}] puts "\nTime to generate 1000 values using shufflerand:" puts [time {for {set i 0} {$i < 1000} {incr i} {shufflerand::rand}}] Here's some sample output from the test code at the end: A list of 10 random numbers from shuffled list: 0.00284572923688 0.897247713477 0.649551953492 0.456922465682 0.0448791366279 0.0441727326457 0.232282039818 0.645288598093 0.753969432672 0.207090688034 Time to generate 1000 values using rand(): 558 microseconds per iteration Time to generate 1000 values using shufflerand: 4052 microseconds per iteration So, it's about 7-8 times slower than the straight linear congruential generator that's bundled with Tcl. ---- [EKB] Hm...Now using Tcl 8.5, rand() takes a little longer, shufflerand a little less long. Here are some runs for rand(), shufflerand, and Bayes-Durham: Time to generate 1000 values using rand(): 738 microseconds per iteration Time to generate 1000 values using shufflerand: 3202 microseconds per iteration Time to generate 1000 values using Bayes-Durham rand: 3941 microseconds per iteration So slightly longer with Bayes-Durham (about 20%). Possibly because of the test for whether the pool had been created or not. I decided not to include that in the "shufflerand" proc because typically the call to rand() is in the heart of a loop. I also cached the pool size. ---- **Implementation for 8.5 using list** [EKB] This is a version for Tcl 8.5 that replaces the existing rand() with a shuffled version, so code that uses rand() doesn't have to be rewritten: namespace eval shufflerand { variable vals } proc shufflerand::init {} { variable vals for {set i 0} {$i < 98} {incr i} { lappend vals [expr {rand()}] } # lcrand -- "Linear Congruential rand" rename ::tcl::mathfunc::rand ::tcl::mathfunc::lcrand proc ::tcl::mathfunc::rand {} "[namespace current]::rand" } proc shufflerand::rand {} { variable vals set retval [lindex $vals end] set ndx [expr {int(98 * $retval)}] lset vals end [lindex $vals $ndx] lset vals $ndx [expr {lcrand()}] return $retval } # Test console show puts "\nTime to generate 1000 values using default rand():" puts [time {for {set i 0} {$i < 1000} {incr i} {expr {rand()}}}] shufflerand::init puts "\nTime to generate 1000 values using shuffled rand():" puts [time {for {set i 0} {$i < 1000} {incr i} {expr {rand()}}}] ---- **The effect on shuffling on the period** [Lars H]: I claim in [Random Number] that shuffling such as the above doesn't do much for the period of the PRNG. The following is the reasoning that led me to that conclusion. ***Modified implementation for analysis*** [[To be continued.]] ---- !!!!!! %| [Category Mathematics] | [Category Statistics] |% !!!!!!