Created by CecilWesterhof.
Functions to Get Random Integers on a *NIX System
To get better random integers you can use /dev/urandom. I wrote a function getRandomBytesNix to get a series of random bytes see Get Random Bytes on *NIX. This is used for generating random numbers.
The first proc is a proc to get a a random integer from type c, s, i, or w.
proc getRandomIntNix {{type i}} { switch $type { c { set bytes 1 set format cu } s { set bytes 2 set format su } i { set bytes 4 set format iu } w { set bytes 8 set format wu } default { error "type can only be c, s, i, or w ($type)" } } binary scan [getRandomBytesNix $bytes] $format random return $random }
Often you need another range (for example -5, 12), for this I created getRandomIntInRangeNix.
A naive version is something like the following:
proc getRandomIntInRangeNixBiased {min max {type i}} { if {$max <= $min} { error "Error: [info level 0] min should be lower as max" } switch $type { c { set maxMod [expr 2 ** 8] } s { set maxMod [expr 2 ** 16] } i { set maxMod [expr 2 ** 32] } w { set maxMod [expr 2 ** 64] } default { error "type can only be c, s, i, or w ($type)" } } set modulo [expr {$max - $min + 1}] if {$modulo > $maxMod} { error "Modulo should be <= $maxMod ($modulo)" } set random [getRandomIntNix $type] expr {$random % $modulo + $min} }
The problem with this is that this can result in biased results.
Say we have a generator that generates the values between 0 and 15. If we ask for a range between 0 and 10, then the values 0 to 4 are two times more likely to be generated as the values 5 to 10, because the values 11 to 15 are mapped to 0 to 4.
To show this I created the following script:
set range 99 set repeatCalc 250000 set repeatGen 5 for {set i 0} {$0 <= $repeatGen} {incr i} { array set count {} for {set j 0} {$j < $repeatCalc} {incr j} { set index [expr {[getRandomIntInRangeNix 0 $range c] / 20}] incr count($index) } parray count unset count puts "" }
An example output when this is run:
count(0) = 58899 count(1) = 58324 count(2) = 54793 count(3) = 39192 count(4) = 38792 count(0) = 58808 count(1) = 58317 count(2) = 54746 count(3) = 39129 count(4) = 39000 count(0) = 58295 count(1) = 58439 count(2) = 55095 count(3) = 39156 count(4) = 39015 count(0) = 58454 count(1) = 58674 count(2) = 54598 count(3) = 39202 count(4) = 39072 count(0) = 58825 count(1) = 58447 count(2) = 54700 count(3) = 39170 count(4) = 38858
This can be solved with the following code:
proc getRandomIntInRangeNix {min max {type i}} { if {$max <= $min} { error "Error: [info level 0] min should be lower as max" } # maxMod should not be bigger as half of the range # otherwise calculating can take to long # in this way the chance that you need 10 random values before returning # is less as one promille switch $type { c { set range [expr 2 ** 8] } s { set range [expr 2 ** 16] } i { set range [expr 2 ** 32] } w { set range [expr 2 ** 64] } default { error "type can only be c, s, i, or w ($type)" } } set maxMod [expr {$range / 2}] set modulo [expr {$max - $min + 1}] if {$modulo > $maxMod} { error "Modulo should be <= $maxMod ($modulo)" } # Values above maxVal would result in bias and should be rejected set maxVal [expr {$modulo * ($range / $modulo) - 1}] while True { set random [getRandomIntNix $type] if {$random <= $maxVal} { break } } expr {$random % $modulo + $min} }
Now a run could give something like:
count(0) = 50347 count(1) = 49748 count(2) = 49856 count(3) = 50155 count(4) = 49894 count(0) = 50381 count(1) = 49524 count(2) = 50101 count(3) = 49979 count(4) = 50015 count(0) = 50005 count(1) = 50035 count(2) = 49845 count(3) = 49950 count(4) = 50165 count(0) = 49911 count(1) = 50110 count(2) = 50171 count(3) = 50349 count(4) = 49459 count(0) = 49681 count(1) = 50095 count(2) = 49802 count(3) = 49995 count(4) = 50427
As always: comments, tips and questions are appreciated.