[Arjen Markus] (4 April 2019) Monte Carlo methods are often used to integrate (or average) some function over a multidimensional space. The idea is quite simple: you randomly select points in the parameter/coordinate space, evaluate the function, and the sum or the mean of the values you get is an approximation to the integral or the average. The nice thing about Monte Carlo methods is that you need not worry about the number of dimensions - just pick one more coordinate and you are done.

Well, not quite. A significant drawback of Monte Carlo methods is that the error you make is inversely proportional to the square root of the number of points (1/sqrt(N)). In other words, to reduce the error (on average - this is after all a probabilistic method) by a factor of 2, you need to pick four times as many points. Using so-called quasi-random points gives a much better result: you can expect the error to behave as 1/N instead of 1/sqrt(N). To reduce the error by a factor of 2, you pick twice as many points; for a reduction by a factor of 10, ten times as many points instead of a hundred times as many.

Selecting quasi-random points in a coordinate space is, however, not as simple as it looks, and there have been many studies of their properties. That is where this [http://extremelearning.com.au/unreasonable-effectiveness-of-quasirandom-sequences/%|%blog%|%] comes in. I heard about it on the Tclers chat and did some experiments. It does look very promising.

The code below is a very simple ad hoc program to compare the new method for quasi-random numbers with a straightforward Monte Carlo method:

   * The function is one of three fairly simple and smooth functions.
   * The goal is to determine the integral over the unit square (1 by 1).
   * For each method it does ten trials of 100 points and prints the result.

I could have made a more formal comparison, of course: determine the standard deviation of the series of trials, compare to the exact value (if that is known - it is the case for two of these functions), etc. But I just let the numbers speak instead.

For the first function, the integral is exactly (exp(1)-1)**2, or 2.952492440125593. The result of one particular run of the program is:

======
Monte Carlo:
Trial 0: 2.9167670044773537
Trial 1: 2.810709684843801
Trial 2: 2.9047008201644164
Trial 3: 2.976484326431138
Trial 4: 3.0313593590149632
Trial 5: 2.96497151958601
Trial 6: 2.8629509293157156
Trial 7: 3.0411656660841055
Trial 8: 2.8703993994820847
Trial 9: 2.9159757069046632

Quasi-random points
Trial 0: 2.978267096620306
Trial 1: 2.955749509583321
Trial 2: 2.9527730154883494
Trial 3: 2.926382581196727
Trial 4: 2.9579869418153515
Trial 5: 2.9618063810319537
Trial 6: 2.9649989225942868
Trial 7: 2.9364314388690826
Trial 8: 2.9293270016224824
Trial 9: 2.950819793096617
Trial 10: 2.9529078493520875
======

(Trial 10 comes from an extra run with 10000 points - see the end of the program below.)

As you can see, the spread for the Monte Carlo method is much bigger than that of the quasi-random points method.

For the third function, the product of two cosines, the exact value of the integral is 0. Here is the output from one run:

======
Monte Carlo:
Trial 0: -0.0480207020593826
Trial 1: -0.0603522327671914
Trial 2: 0.05696694135009513
Trial 3: -0.047421832955353314
Trial 4: -0.026318495422166435
Trial 5: 0.05661887158719929
Trial 6: -0.026122597935765293
Trial 7: 0.044876304823986496
Trial 8: 0.0030658863374625036
Trial 9: -0.013536678398721615

Quasi-random points
Trial 0: -0.009778893711060707
Trial 1: 0.009266650448147623
Trial 2: -0.008626148945971347
Trial 3: 0.007882773877382872
Trial 4: -0.007065039878311997
Trial 5: 0.006203695521161606
Trial 6: -0.005330757368947462
Trial 7: 0.004478504366989686
Trial 8: -0.0036784640718740703
Trial 9: 0.0029604224726672724
Trial 10: -3.389966518034056e-6
======

Here is the program:

======
# Three test functions; the last definition of func is the one
# actually used, so comment out the ones you do not want.
proc func {x y} {
    expr {exp($x+$y)}
}

# Wavy alternative
proc func {x y} {
    expr {cos(20.0*($x**2+$y**2))+$x}
}

# Product of cosines - integral should approach zero
proc func {x y} {
    set pi [expr {acos(-1.0)}]
    expr {cos(2.0*$pi*$x) * cos(2.0*$pi*$y)}
}

# Determine the parameters for the quasi-random points (in 2D):
# solve x**n = x + 1 for n = 3 via Newton-Raphson and use the
# inverse powers of the root as the multipliers per coordinate.
#
set n  3
set rn [expr {1.0/$n}]

set x 1.0
for {set i 0} {$i < 10} {incr i} {
    set x [expr {$x - ($x**$n-$x-1.0) / ($n*$x**($n-1)-1.0)}]
}

set ax [expr {1.0/$x}]
set ay [expr {1.0/$x**2}]

puts "Monte Carlo:"
for {set trial 0} {$trial < 10} {incr trial} {
    set sum 0.0
    for {set p 0} {$p < 100} {incr p} {
        set x [expr {rand()}]
        set y [expr {rand()}]
        set sum [expr {$sum + [func $x $y]}]
    }
    puts "Trial $trial: [expr {$sum/100.0}]"
}

puts "Quasi-random points"
set p 0 ;# Needs to be outside the loop!
for {set trial 0} {$trial < 10} {incr trial} {
    set sum 0.0
    for {set p1 0} {$p1 < 100} {incr p1} {
        incr p
        set x [expr {fmod($p * $ax, 1.0)}]
        set y [expr {fmod($p * $ay, 1.0)}]
        set sum [expr {$sum + [func $x $y]}]
    }
    puts "Trial $trial: [expr {$sum/100.0}]"
}

# One extra, longer trial with 10000 points (printed as trial 10)
set sum 0.0
for {set p1 0} {$p1 < 10000} {incr p1} {
    incr p
    set x [expr {fmod($p * $ax, 1.0)}]
    set y [expr {fmod($p * $ay, 1.0)}]
    set sum [expr {$sum + [func $x $y]}]
}
puts "Trial $trial: [expr {$sum/10000.0}]"
======

When you examine the blog post, you will see a general method for determining the parameters - it can be extended to an arbitrary number of dimensions.
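To give an idea of what that generalisation looks like, here is a minimal sketch using the same Newton-Raphson approach as in the program above: for d dimensions you solve x**(d+1) = x + 1 and take the inverse powers of the root as the multipliers for the coordinates. The proc name quasiRandomParameters is mine, purely for illustration - it is not (yet) a Tcllib command:

======
# Minimal sketch: multipliers for a d-dimensional quasi-random
# sequence (hypothetical proc, not part of any package)
proc quasiRandomParameters {d} {
    set n [expr {$d + 1}]

    # Newton-Raphson for the positive root of x**n = x + 1
    set x 1.0
    for {set i 0} {$i < 30} {incr i} {
        set x [expr {$x - ($x**$n - $x - 1.0) / ($n * $x**($n-1) - 1.0)}]
    }

    # One multiplier per coordinate: 1/x, 1/x**2, ..., 1/x**d
    set alpha {}
    for {set i 1} {$i <= $d} {incr i} {
        lappend alpha [expr {1.0 / $x**$i}]
    }
    return $alpha
}

# Example: print the first five points of a 3D sequence;
# the p-th point has coordinates fmod(p*alpha_i, 1.0)
set alpha [quasiRandomParameters 3]
for {set p 1} {$p <= 5} {incr p} {
    set coords {}
    foreach a $alpha {
        lappend coords [expr {fmod($p * $a, 1.0)}]
    }
    puts $coords
}
======

For d = 2 this reproduces the values of ax and ay computed in the program.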
Another attractive feature - besides the apparent accuracy - is that no external tuning parameters are required. I intend to make a nice little package out of this for [Tcllib], but that takes a bit more work than this little example.

<<categories>> Mathematics | Tcllib