## Random Integers

Created by CecilWesterhof.

Functions to Get Random Integers on a *NIX System

To get better random integers you can use /dev/urandom. I wrote a function getRandomBytesNix to get a series of random bytes see Get Random Bytes on *NIX. This is used for generating random numbers.

The first proc is a proc to get a a random integer from type c, s, i, or w.

```proc getRandomIntNix {{type i}} {
switch \$type {
c {
set bytes  1
set format cu
}
s {
set bytes  2
set format su
}
i {
set bytes  4
set format iu
}
w {
set bytes  8
set format wu
}
default {
error "type can only be c, s, i, or w (\$type)"
}
}
binary scan [getRandomBytesNix \$bytes] \$format random
return \$random
}```

Often you need another range (for example -5, 12), for this I created getRandomIntInRangeNix.

A naive version is something like the following:

```proc getRandomIntInRangeNixBiased {min max {type i}} {
if {\$max <= \$min} {
error "Error: [info level 0] min should be lower as max"
}
switch \$type {
c {
set maxMod [expr 2 **  8]
}
s {
set maxMod [expr 2 ** 16]
}
i {
set maxMod [expr 2 ** 32]
}
w {
set maxMod [expr 2 ** 64]
}
default {
error "type can only be c, s, i, or w (\$type)"
}
}
set  modulo [expr {\$max - \$min + 1}]
if {\$modulo > \$maxMod} {
error "Modulo should be <= \$maxMod (\$modulo)"
}
set  random [getRandomIntNix \$type]
expr {\$random % \$modulo + \$min}
}```

The problem with this is that this can result in biased results.

Say we have a generator that generates the values between 0 and 15. If we ask for a range between 0 and 10, then the values 0 to 4 are two times more likely to be generated as the values 5 to 10, because the values 11 to 15 are mapped to 0 to 4.

To show this I created the following script:

```set range          99
set repeatCalc 250000
set repeatGen       5

for {set i 0} {\$0 <= \$repeatGen} {incr i} {
array set count {}
for {set j 0} {\$j < \$repeatCalc} {incr j} {
set index [expr {[getRandomIntInRangeNix 0 \$range c] / 20}]
incr count(\$index)
}
parray count
unset  count
puts   ""
}```

An example output when this is run:

```count(0) = 58899
count(1) = 58324
count(2) = 54793
count(3) = 39192
count(4) = 38792

count(0) = 58808
count(1) = 58317
count(2) = 54746
count(3) = 39129
count(4) = 39000

count(0) = 58295
count(1) = 58439
count(2) = 55095
count(3) = 39156
count(4) = 39015

count(0) = 58454
count(1) = 58674
count(2) = 54598
count(3) = 39202
count(4) = 39072

count(0) = 58825
count(1) = 58447
count(2) = 54700
count(3) = 39170
count(4) = 38858```

This can be solved with the following code:

```proc getRandomIntInRangeNix {min max {type i}} {
if {\$max <= \$min} {
error "Error: [info level 0] min should be lower as max"
}
# maxMod should not be bigger as half of the range
# otherwise calculating can take to long
# in this way the chance that you need 10 random values before returning
# is less as one promille
switch \$type {
c {
set range [expr 2 **  8]
}
s {
set range [expr 2 ** 16]
}
i {
set range [expr 2 ** 32]
}
w {
set range [expr 2 ** 64]
}
default {
error "type can only be c, s, i, or w (\$type)"
}
}
set maxMod [expr {\$range / 2}]
set modulo [expr {\$max - \$min + 1}]
if {\$modulo > \$maxMod} {
error "Modulo should be <= \$maxMod (\$modulo)"
}
# Values above maxVal would result in bias and should be rejected
set maxVal [expr {\$modulo * (\$range  / \$modulo) - 1}]
while True {
set  random [getRandomIntNix \$type]
if {\$random <= \$maxVal} {
break
}
}
expr {\$random % \$modulo + \$min}
}```

Now a run could give something like:

```count(0) = 50347
count(1) = 49748
count(2) = 49856
count(3) = 50155
count(4) = 49894

count(0) = 50381
count(1) = 49524
count(2) = 50101
count(3) = 49979
count(4) = 50015

count(0) = 50005
count(1) = 50035
count(2) = 49845
count(3) = 49950
count(4) = 50165

count(0) = 49911
count(1) = 50110
count(2) = 50171
count(3) = 50349
count(4) = 49459

count(0) = 49681
count(1) = 50095
count(2) = 49802
count(3) = 49995
count(4) = 50427```

As always: comments, tips and questions are appreciated.

 Category Linux Category Utilities