Generating a series of not entirely random data

[Arjen Markus] (25 May 2020) Experimenting (well, fooling around, I guess, is a more appropriate term) with stochastic differential equations, I was confronted with the question of how to generate a series of data with a certain autocorrelation structure. You can use an autoregression method:

======
    x_new = r * x_old + (1-r) * random number
======
The result is a series in which each number looks a bit like its predecessor, but the behaviour remains difficult to tune. In my case, I wanted to simulate rainfall data. The observations I had showed a fairly strong correlation between the rainfall on two consecutive days, but a far weaker one between day 1 and day 3. The above algorithm makes the correlation decay like a geometric series (r, r**2, r**3, ...), which is not rapid enough. More importantly: ''the resulting distribution is not uniform!''
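
To see the non-uniformity for yourself, here is a minimal sketch of the autoregression recipe, with a crude ten-bin histogram. The names `arSeries`, `r` and `n` are mine and purely illustrative, and it assumes the "random number" in the formula is a uniform one from `rand()`.

======
# Sketch only: x_new = r * x_old + (1-r) * random number
# arSeries, r and n are illustrative names, not part of any library
proc arSeries {r n} {
    set x [expr {rand()}]
    set series {}
    for {set i 0} {$i < $n} {incr i} {
        set x [expr {$r * $x + (1.0 - $r) * rand()}]
        lappend series $x
    }
    return $series
}

# Count the values in ten bins of width 0.1
set counts [lrepeat 10 0]
foreach x [arSeries 0.5 10000] {
    set bin [expr {min(9, int($x * 10))}]
    lset counts $bin [expr {[lindex $counts $bin] + 1}]
}
puts $counts
======

With r = 0.5 the bins around 0.5 should collect clearly more values than the bins near 0 and 1, which is the non-uniformity complained about above.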

I came up with a simple alternative that in principle allows much more freedom and does give a uniform distribution:

   * Get two random numbers
   * Accept the first as the new value if the second one exceeds the threshold r
   * If not, reuse the old value

In code, something along these lines:

======
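set r            0.25          ;# threshold: chance of reusing the old value
set randomValues {}
set prev         [expr {rand()}]

for {set i 0} {$i < 1000} {incr i} {
    set new    [expr {rand()}]
    set accept [expr {rand()}]

    # Keep the fresh value only if the second number exceeds the threshold
    if { $accept < $r } {
        set new $prev
    }

    lappend randomValues $new

    set prev $new
}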
======
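
To check what such a series actually does, you can estimate its autocorrelation at a few lags. The following is a rough sketch, not a polished statistics routine: `autocorr` and `stickySeries` are names I made up for the purpose, and the estimator is the plain sample autocorrelation around the overall mean.

======
# Sketch only: sample autocorrelation of a list of numbers at a given lag
proc autocorr {values lag} {
    set n   [llength $values]
    set sum 0.0
    foreach v $values {
        set sum [expr {$sum + $v}]
    }
    set mean [expr {$sum / $n}]

    set var 0.0
    foreach v $values {
        set var [expr {$var + ($v - $mean)**2}]
    }

    set cov 0.0
    for {set i 0} {$i < $n - $lag} {incr i} {
        set a   [lindex $values $i]
        set b   [lindex $values [expr {$i + $lag}]]
        set cov [expr {$cov + ($a - $mean) * ($b - $mean)}]
    }
    return [expr {$cov / $var}]
}

# The accept/reuse scheme from above, wrapped in a procedure
proc stickySeries {r n} {
    set prev   [expr {rand()}]
    set series {}
    for {set i 0} {$i < $n} {incr i} {
        set new [expr {rand()}]
        if {rand() < $r} {
            set new $prev
        }
        lappend series $new
        set prev $new
    }
    return $series
}

set series [stickySeries 0.25 10000]
foreach lag {1 2 3} {
    puts "lag $lag: [format %.3f [autocorr $series $lag]]"
}
======

With a threshold of 0.25 the lag-1 estimate should come out close to 0.25, since a value is simply repeated with that probability and is otherwise independent of its predecessor.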


<<categories>>Toys|Mathematics