Version 3 of Random Number

Updated 2004-12-05 03:47:36

Many applications require random numbers.

This includes -

  • Games of chance: You want simulated dice to behave as unpredictable as their real counterpart.
  • Monte Carlo simulations: Algorithms are tested with random input.
  • Cryptography: Random numbers make "one time pads" perfectly secure.

The deterministic nature of computers makes it impossible for computer software to come up with truly random numbers.

Hardware devices exist that produce random numbers from unpredictable, or unpredictably chaotic, physical effects such as athmospheric noise, radioactive decay, or even the movement of liquids in a common lava lamp. Some operating systems (e.g., Linux) derive reasonably random numbers from user behavior, such as the time between keyboard inputs (i.e., something like "nanoseconds between hitting two keys"), or mouse movement.

True Random Numbers downloads true random numbers from a free internet service.

Without access to such a device, the best a software can do is to use pseudo-random numbers (PRNs) which are produced by mathematical algorithms. The remainder of this page uses random number synonymous with pseudo-random number.

Because of their importance in various fields, there is a large body of literature on the subject. The canonical reference is Donald E. Knuth's The Art of Computer Programming, Volume 2: Seminumerical Algorithms.

Two of the most important properties that sequences of random numbers must, or should, satisfy are unpredictability and unbiased. The sequence of integer numbers {1, 2, 3, 4, ...} is perfectly biased but predictable. The sequence of daily outside temperature readings is unpredictable but biased, as the range of numbers is narrow, and the readings between two days are usually close. Because of such subtleties, the (probabilistic) proof that a sequence of numbers is reasonably random is very hard.

Tcl provides a pseudo-random number generator as part of the expr command:

 expr rand()

returns a random number between 0 and 1.

Tcl's random number generator highlights a weakness that applies to all pseudo-random number generators: it has to be initialized, or "seeded." The initial seed value fully determines the sequence of numbers that is generated. If the generator is seeded with the same value over and over, it will keep producing the same random numbers. Also, by design, it never produces the same number twice in a row -- unlike dice, which can come up with the same number many times in a row without being biased.

Tcl's random number generator is as bad as many, and worse than some, in that it only accepts a single 31 bit integer as a seed value. In other words, there are only 2^31 possible random number sequences, and they repeat after a period of 2^31 (in fact, numbers never repeat in less than this period). For applications that only care about the distribution of pseudo-random numbers across the value space, such as simulated dice, that can be considered good enough. In cryptography, such a random number generator is not considered cryptographically secure. An attacker could potentially make a brute-force attempt to recover a random number sequence by checking all 2^31 seed values.

Many applications, and Tcl's random number generator (when not seeded by the application), cheat by using something like the time of day (e.g., [clock seconds]) as a seed value. However, for cryptographical applications, that is about the worst thing that can be done. Because [clock seconds] returns a number in a very narrow range, so an attacker could make a good prediction of the values that need to be tested. Using a higher resolution timer, such as [clock clicks], improves the issue by a cryptographically insigngificant margin only.

Therefore, we have a chicken and egg problem: in order to have unpredictable random numbers, a pseudo-random number generator needs to be initialized with a random seed.

We have two problems to solve, if we want to improve upon Tcl's pseudo-random number generator:

  • First, we need a better algorithm; one that allows for seed values greater than 2^32.
  • Second, we need a means of coming up with unpredictable seed values.

Two candidates for a better algorithm are


Why the fuss about cryptographically secure random numbers?

Already mentioned above, random numbers allow for perfectly secure, unbreakable cryptography, using "one-time pads." In use for decades, first on paper, now on computers, a one-time pad contains a sequence of random numbers. Only two copies of a one-time pad exist, one is with the sender of a message, and one with the receiver.

In order to encrypt a message, the sender takes each letter of the message, and transposes it with one random number: The random number, between 0 and 25, is added to the ordinal of the letter, modulo 26, to produce one letter of the encrypted message.

The receiver uses the same pad, and the inverse transformation (by subtracting the random number, modulo 26), to decrypt the message.

This scheme is theoretically unbreakable, if the numbers are perfectly random, and if each one-time pad is used only once. If a pad is used multiple times, an attacker can leverage frequency analysis to determine the contents of the pad and gain access to the messages. If the numbers are not perfectly random, but produced by a pseudo-random number generator, decryption of the message is equivalent to breaking the generator's algorithm, parameters and seed. With the algorithm and parameters usually public, only the secrecy of the seed remains, reinforcing the need for unpredictable seed values.


Initially written by FPX.


[Category Cryptography]