Generating a series of not entirely random data

Arjen Markus (25 May 2020) While experimenting (well, fooling around is probably the more appropriate term) with stochastic differential equations, I was confronted with the question of how to generate a series of data with a certain autocorrelation structure. You can use an autoregression method:

    x_new = r * x_old + (1-r) * random number

The result is a series in which each number looks a bit like its predecessor, but the behaviour remains difficult to tune. In my case, I wanted to simulate rainfall data. The observations I had showed a fairly strong correlation between the rainfall on two consecutive days, but a far weaker one between day 1 and day 3. The above algorithm makes the correlation decay like a geometric series (r, r**2, r**3, ...), which is not rapid enough. More importantly, the resulting distribution is not uniform!
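To make the non-uniformity visible, here is a minimal sketch of the autoregression method with uniform random numbers as input. The proc names arSeries and histogram, the value r = 0.5 and the choice of 10 bins are my own assumptions for illustration, not part of the original experiment.

```tcl
# Sketch: autoregressive series x_new = r * x_old + (1-r) * random number,
# driven by uniform random numbers from rand().
# (arSeries is a hypothetical helper name.)
proc arSeries {r n} {
    set x [expr {rand()}]
    set series {}
    for {set i 0} {$i < $n} {incr i} {
        set x [expr {$r * $x + (1.0 - $r) * rand()}]
        lappend series $x
    }
    return $series
}

# Count how many values fall into each of nbins equal bins on [0,1).
# (histogram is a hypothetical helper name.)
proc histogram {values {nbins 10}} {
    set counts [lrepeat $nbins 0]
    foreach v $values {
        set bin [expr {min($nbins - 1, int($v * $nbins))}]
        lset counts $bin [expr {[lindex $counts $bin] + 1}]
    }
    return $counts
}

set series [arSeries 0.5 10000]
puts [histogram $series]
```

The counts should pile up in the middle bins and thin out towards 0 and 1: the stationary variance of this series is (1-r)/(1+r) times the variance of the uniform input, so the values cluster around 0.5.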

I came up with a simple alternative that in principle allows much more freedom and does give a uniform distribution:

  • Get two random numbers
  • Accept the first as the new value if the second one exceeds the threshold r
  • If not, reuse the old value

In code, something along these lines:

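set r 0.25                        ;# threshold: chance of reusing the old value
set randomValues {}
set prev [expr {rand()}]

for {set i 0} {$i < 1000} {incr i} {
    set new    [expr {rand()}]    ;# candidate for the new value
    set accept [expr {rand()}]    ;# decides whether the candidate is accepted

    # Reuse the old value unless the second number exceeds the threshold
    if { $accept < $r } {
        set new $prev
    }

    lappend randomValues $new

    set prev $new
}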
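To check what such a series actually does, one can estimate its autocorrelation at a few lags. The following is a sketch only: reuseSeries wraps the loop above in a proc so that the example is self-contained, and autocorrelation is a plain sample estimate. Both proc names and the series length of 20000 are assumptions of mine.

```tcl
# Sketch: the accept/reuse scheme as a proc (hypothetical name).
# With probability r the old value is reused, otherwise a fresh
# uniform random number is taken.
proc reuseSeries {r n} {
    set prev [expr {rand()}]
    set series {}
    for {set i 0} {$i < $n} {incr i} {
        if {rand() < $r} {
            set new $prev
        } else {
            set new [expr {rand()}]
        }
        lappend series $new
        set prev $new
    }
    return $series
}

# Sample autocorrelation of a list of numbers at a given lag
# (hypothetical helper; normalised by the lag-0 sum of squares).
proc autocorrelation {values lag} {
    set n    [llength $values]
    set mean [expr {[tcl::mathop::+ {*}$values] / double($n)}]
    set num 0.0
    set den 0.0
    for {set i 0} {$i < $n} {incr i} {
        set d   [expr {[lindex $values $i] - $mean}]
        set den [expr {$den + $d * $d}]
        if {$i + $lag < $n} {
            set e   [expr {[lindex $values [expr {$i + $lag}]] - $mean}]
            set num [expr {$num + $d * $e}]
        }
    }
    return [expr {$num / $den}]
}

set series [reuseSeries 0.25 20000]
foreach lag {1 2 3} {
    puts [format "lag %d: %.3f" $lag [autocorrelation $series $lag]]
}
```

With a constant threshold the estimates should come out close to r, r**2 and r**3, because two values k steps apart are identical only if the old value was reused k times in a row.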