Version 1 of Gaussian Distribution

Updated 2004-03-31 21:38:29

started by Theo Verelst

Also called 'Normal Distribution'. For most physical independent measurements it holds that the statistical law of large numbers makes a measurment distribution graph due to measurement noise is a normal or Guassian distrubuted probability density.

So if we flip a coin so many times, and keep score of 10 throws for instance, we expect to get an average of 5 heads for ten throws. But of course, it will also happen we throw 6 heads or even 7 tails.

This little script computes 100,000 times the average of 5 random numbers (from the rand() tcl function), and stores the normalized results in a list called to:

 unset to;  for {set j 0} {$j < 100000} {incr j} {
    set o 0; set u 5
    for {set i 0} {$i < $u} {incr i} {
       set o [expr $o+rand()]
    };
    lappend to [expr $o/$u]
 }

Now we'll make 300 buckets in which we count how much of the result fits between to adjacent bucket limits:

 unset gd; for {set i 0} {$i<300} {incr i} {set gd($i) 0} ; foreach i $to {incr gd([expr int([expr $i*300])])}

And display the result as a reasonably accurate bar graph on the bwise canvas, or you should have a visible Tk canvas of which the path is in the variable mc :

 mc del gr3; foreach n [array names gd] {$mc create line [expr 100+$n] 301 [expr 100+$n] [expr 300-0.2*$gd($n)] -tag gr3}

http://82.168.209.239/wiki/gaussian1.jpg

At least clearly the gaussian curve can be recognized, where the statistical measure 'variance' can be derived from the bowing points (the change of sign of the second derivative, at both sides of the maximum, the expectation value). Also clearly, a hundred thousand samples still leave us with considerable dviation from the ideal 'big numbers' curve.

Normally, for natural random processes, the law is big numbers works pretty well.


KBK 31 March 2004 -

One conventional way to sample a random variable from a Gaussian distribution is Marsaglia's polar method. [L1 ]. This method is usually faster than any method that involves the Central Limit Theorem and a large batch of random numbers. It also has precision that is limited only by the precision of the underlying random number generator.

The following Tcl code samples 50000 numbers from the normal distribution and displays a histogram of the samples, together with the Gaussian "bell-shaped curve". The "nordev" procedure should be generally useful wherever a Gaussian random variable is needed.

# Procedure that returns a random variable sampled from a Gaussian # distribution with mean zero and standard deviation unity.

 proc nordev {} {
     variable last
     if { [info exists last] } {
         set retval $last
         unset last
         return $retval
     }
     set v1 [expr { 2. * rand() - 1 }]
     set v2 [expr { 2. * rand() - 1 }]
     set rsq [expr { $v1*$v1 + $v2*$v2 }]
     while { $rsq > 1. } {
         set v1 [expr { 2. * rand() - 1 }]
         set v2 [expr { 2. * rand() - 1 }]
         set rsq [expr { $v1*$v1 + $v2*$v2 }]
     }
     set fac [expr { sqrt( -2. * log( $rsq ) / $rsq ) }]
     set last [expr { $v1 * $fac }]
     return [expr { $v2 * $fac }]
 }

# Calculate 50000 normal deviates, and prepare a histogram of them.

 set l { 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
     0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 }
 for { set i 0 } { $i < 50000 } { incr i } {
     set d [nordev]
     set bin [expr {int(20 + floor( 4. * $d))}]
     lset l $bin [expr {[lindex $l $bin] + 1}]
 }

# Display the histogram on a Tk canvas, together with the # Gaussian error function

 package require Tk
 grid [canvas .c -width 400 -height 200 -bg white]
 set max 0
 foreach v $l {
     if { $v > $max } {
         set max $v
     }
 }
 set x 0
 set coords {}
 foreach v $l {
     set y [expr {190-(180*$v/$max)}]
     set x2 [expr {$x+10}]
     .c create rectangle $x $y $x2 190 -outline black -fill {}
     set x $x2
 }
 set c {}
 for { set i 0 } { $i <= 400 } { incr i } {
     set sigma [expr { 0.025 * ( $i - 200 ) }]
     set y [expr { 190. - 180. * exp( -$sigma*$sigma / 2 ) }]
     lappend c $i $y
 }
 .c create line 0 190 400 190 -fill red
 .c create line 200 0 200 200 -fill red
 eval [list .c create line] $c [list -fill blue]

Category Mathematics