Evaluation of multiple integrals using quasi-random points

Arjen Markus (4 April 2019) Monte Carlo methods are often used to integrate (or average) some function over a multidimensional space. The idea is quite simple: you randomly select points in the parameter/coordinate space and evaluate the function there; the mean of the values you get approximates the average and, multiplied by the volume of the domain, the integral.
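
In its simplest form the method looks like this (a minimal sketch, with an arbitrarily chosen example function - not the actual program shown further down):

# Plain Monte Carlo: the mean of N random samples approximates
# the integral over the unit square (which has volume 1)
proc f {x y} { expr {exp($x+$y)} }

set N 1000
set sum 0.0
for {set i 0} {$i < $N} {incr i} {
    set sum [expr {$sum + [f [expr {rand()}] [expr {rand()}]]}]
}
puts "Estimate of the integral: [expr {$sum / $N}]"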

The nice thing about Monte Carlo methods is that you need not worry about the number of dimensions - just pick one more coordinate and you are done.

Well, not quite. A significant drawback of Monte Carlo methods is that the error you make is inversely proportional to the square root of the number of points (1/sqrt(N)). In other words, to reduce the error (on average - this is after all a probabilistic method) by a factor of 2, you need to pick four times as many points.

Using so-called quasi-random points gives a much better result: you can expect the error to behave as 1/N instead of 1/sqrt(N). To reduce the error by a factor of 2, you pick twice as many points; for a reduction by a factor of 10, ten times as many points instead of a hundred times as many.
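
You can see the 1/sqrt(N) behaviour of plain Monte Carlo in a quick experiment along these lines (again just a sketch - the number of repetitions and the point counts are arbitrary choices of mine):

# RMS error of plain Monte Carlo for exp(x+y) over the unit square,
# for increasing numbers of points (50 repetitions per count)
set exact [expr {(exp(1.0)-1.0)**2}]

foreach n {100 400 1600 6400} {
    set sumsq 0.0
    for {set trial 0} {$trial < 50} {incr trial} {
        set sum 0.0
        for {set p 0} {$p < $n} {incr p} {
            set sum [expr {$sum + exp(rand()+rand())}]
        }
        set sumsq [expr {$sumsq + ($sum/$n - $exact)**2}]
    }
    # The RMS error should roughly halve with every factor of 4 in points
    puts "$n points: RMS error [expr {sqrt($sumsq/50.0)}]"
}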

Selecting quasi-random points in a coordinate space is, however, not as simple as it looks, and there have been many studies of their properties. That is where a particular blog post comes in: I heard about it on the Tclers' chat and did some experiments. It does look very promising.

The code below is a very simple ad hoc program to compare the new method for quasi-random numbers with a straightforward Monte Carlo method:

  • The function is one of three fairly simple, smooth functions.
  • The goal is to determine the integral over the unit square (0 to 1 in both directions).
  • For each method it does ten trials of 100 points each and prints the results.

I could have made a more formal comparison, of course: determine the standard deviation of the series of trials, compare with the exact value (if that is known - it is for two of these functions), and so on. But I just let the numbers speak instead.
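
Should you want that formal comparison, a small helper along these lines would suffice (a sketch; the proc stdev is not part of the program below):

# Sample standard deviation of a list of trial results
# (assumes at least two values)
proc stdev {values} {
    set n    [llength $values]
    set mean 0.0
    foreach v $values {
        set mean [expr {$mean + $v / double($n)}]
    }
    set sumsq 0.0
    foreach v $values {
        set sumsq [expr {$sumsq + ($v - $mean)**2}]
    }
    expr {sqrt($sumsq / ($n - 1))}
}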

For the first function, the integral is exactly (exp(1)-1)**2, or approximately 2.952492442012559: the integral of exp(x+y) over the unit square factorises into the product of two one-dimensional integrals of exp(x), each equal to exp(1)-1. The result of one particular run of the program is:
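
You can check this value directly in Tcl:

puts [expr {(exp(1.0)-1.0)**2}]    ;# prints 2.952492442012559 (approximately)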

Monte Carlo:
Trial 0: 2.9167670044773537
Trial 1: 2.810709684843801
Trial 2: 2.9047008201644164
Trial 3: 2.976484326431138
Trial 4: 3.0313593590149632
Trial 5: 2.96497151958601
Trial 6: 2.8629509293157156
Trial 7: 3.0411656660841055
Trial 8: 2.8703993994820847
Trial 9: 2.9159757069046632
Quasi-random points
Trial 0: 2.978267096620306
Trial 1: 2.955749509583321
Trial 2: 2.9527730154883494
Trial 3: 2.926382581196727
Trial 4: 2.9579869418153515
Trial 5: 2.9618063810319537
Trial 6: 2.9649989225942868
Trial 7: 2.9364314388690826
Trial 8: 2.9293270016224824
Trial 9: 2.950819793096617
Trial 10: 2.9529078493520875

As you can see, the spread for the Monte Carlo method is much bigger than that for the quasi-random points method. (The extra line, Trial 10, comes from one final run with 10000 points that the program does after the ten regular trials - see the end of the listing below.)

For the third function, the product of two cosines, the exact value of the integral is 0: the integral of cos(2*pi*x) from 0 to 1 vanishes, and with it the product. Here is the output from one run:

Monte Carlo:
Trial 0: -0.0480207020593826
Trial 1: -0.0603522327671914
Trial 2: 0.05696694135009513
Trial 3: -0.047421832955353314
Trial 4: -0.026318495422166435
Trial 5: 0.05661887158719929
Trial 6: -0.026122597935765293
Trial 7: 0.044876304823986496
Trial 8: 0.0030658863374625036
Trial 9: -0.013536678398721615
Quasi-random points
Trial 0: -0.009778893711060707
Trial 1: 0.009266650448147623
Trial 2: -0.008626148945971347
Trial 3: 0.007882773877382872
Trial 4: -0.007065039878311997
Trial 5: 0.006203695521161606
Trial 6: -0.005330757368947462
Trial 7: 0.004478504366989686
Trial 8: -0.0036784640718740703
Trial 9: 0.0029604224726672724
Trial 10: -3.389966518034056e-6

Here is the program:

# Three test functions - as all three define "func", the last
# definition is the one actually used; comment out the ones you
# do not want.
proc func {x y} {
    expr {exp($x+$y)}
}
# Wavy alternative
proc func {x y} {
    expr {cos(20.0*($x**2+$y**2))+$x}
}
# Product of cosines - integral should approach zero
proc func {x y} {
    set pi [expr {acos(-1.0)}]

    expr {cos(2.0*$pi*$x) * cos(2.0*$pi*$y)}
}

# Determine the parameters for the quasi-random points (in 2D):
# solve x**n = x + 1, with n = 3 one more than the number of
# dimensions, via Newton-Raphson iteration
#
set n 3
set x 1.0
for {set i 0} {$i < 10} {incr i} {
    set x [expr {$x - ($x**$n-$x-1.0) / ($n*$x**($n-1)-1.0)}]
}

set ax [expr {1.0/$x}]
set ay [expr {1.0/$x**2}]
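
# Sanity check: x converges to the "plastic number" 1.3247179572...,
# the real root of x**3 = x + 1, so ax ~ 0.7548777 and ay ~ 0.5698403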

puts "Monte Carlo:"
for {set trial 0} {$trial < 10} {incr trial} {
    set sum 0.0

    for {set p 0} {$p < 100} {incr p} {
        set x   [expr {rand()}]
        set y   [expr {rand()}]
        set sum [expr {$sum + [func $x $y]}]
    }

    puts "Trial $trial: [expr {$sum/100.0}]"
}

puts "Quasi-random points"
set p 0  ;# Needs to be outside the loop!
for {set trial 0} {$trial < 10} {incr trial} {
    set sum 0.0

    for {set p1 0} {$p1 < 100} {incr p1} {
        incr p
        set x   [expr {fmod($p * $ax, 1.0)}]
        set y   [expr {fmod($p * $ay, 1.0)}]
        set sum [expr {$sum + [func $x $y]}]
    }

    puts "Trial $trial: [expr {$sum/100.0}]"
}

# One final, much longer run of 10000 points, continuing the same
# sequence - after the loop above $trial is 10, so this produces
# the "Trial 10" line in the output
set sum 0.0

for {set p1 0} {$p1 < 10000} {incr p1} {
    incr p
    set x   [expr {fmod($p * $ax, 1.0)}]
    set y   [expr {fmod($p * $ay, 1.0)}]
    set sum [expr {$sum + [func $x $y]}]
}

puts "Trial $trial: [expr {$sum/10000.0}]"

When you examine the blog post, you will see a general method for determining the parameters - it can be extended to an arbitrary number of dimensions. Another attractive feature - besides the apparent accuracy - is that no external tuning parameters are required.
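
To sketch what that generalisation might look like (the proc name quasiRandomParameters is my own, purely illustrative): for d dimensions you solve x**(d+1) = x + 1 and use the successive powers 1/x, 1/x**2, ..., 1/x**d as the step sizes.

# Determine the step sizes for a d-dimensional quasi-random sequence
# (illustrative sketch, not a definitive implementation)
proc quasiRandomParameters {d} {
    set n [expr {$d + 1}]
    set x 1.0
    # Newton-Raphson iteration for the real root of x**n = x + 1
    for {set i 0} {$i < 30} {incr i} {
        set x [expr {$x - ($x**$n - $x - 1.0) / ($n * $x**($n-1) - 1.0)}]
    }
    set alphas {}
    for {set k 1} {$k <= $d} {incr k} {
        lappend alphas [expr {1.0 / $x**$k}]
    }
    return $alphas
}

# The p-th point has the coordinates fmod(p*alpha, 1.0) - for example in 3D:
set alphas [quasiRandomParameters 3]
for {set p 1} {$p <= 5} {incr p} {
    set point {}
    foreach alpha $alphas {
        lappend point [expr {fmod($p * $alpha, 1.0)}]
    }
    puts $point
}

For d = 2 this reproduces the ax and ay used in the program above.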

I intend to make a nice little package out of this for Tcllib, but that takes a bit more work than this little example.