Keith Vetter : 2006-10-11 : To quote Wikipedia [1 ]:

*Benford's law, also called the first-digit law, states that in lists of numbers from many real-life sources of data, the leading digit is 1 almost one-third of the time, and further, larger numbers occur as the leading digit with less and less frequency as they grow in magnitude, to the point that 9 is the leading digit less than one time in twenty.*

*This counter-intuitive result applies to a wide variety of figures from the natural world or of social significance, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature).*

*It is named after physicist Frank Benford, who stated it in 1938. However, it was earlier stated by Simon Newcomb, in 1881. The first rigorous formulation and proof appears to be due to Theodore P.Hill in 1988.*

One cool application of Benford's Law is in fraud detection. People who make up numbers, say for a fraudelent insurance claim, tend to distribute their digits uniformly. So by checking if the leading digits distribution matches Benford's Law you can spot anomalous claims.

Here is a visualization of Benford's Law. It produces a list of numbers using the recurrence A(n+1) = A(n) * B, and plots the count of the leading digit.

NB. this program requires some sort of bignum package. Tcl 8.5 should work as is--see tip #237 [2 ] (untested). Otherwise it looks for the bignum package (or the bignum starkit). Failing that it tries to load the mpexpr package (or the library directly). As a last result, it uses the tcllib math::bignum package, which works fine but is a bit slow.

##+########################################################################## # # benford.tcl -- Simulation demonstrating Benford's Law # by Keith Vetter, Oct 10, 2006 # package require Tk package require tile package require Plotchart array set S {title "Benford's Law"} array set N {iter 100} set S(about) { Wikipedia: Benford's law, also called the first-digit law, states that in lists of numbers from many real-life sources of data, the leading digit is 1 almost one-third of the time, and further, larger numbers occur as the leading digit with less and less frequency as they grow in magnitude, to the point that 9 is the leading digit less than one time in twenty. This counter-intuitive result applies to a wide variety of figures from the natural world or of social significance, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature). It is named after physicist Frank Benford, who stated it in 1938. However, it was earlier stated by Simon Newcomb, in 1881. The first rigorous formulation and proof appears to be due to Theodore P. Hill in 1988. http://en.wikipedia.org/wiki/First_digit_law This simulation produces a list of numbers which follows Benford's Law by counting the leading digit as two numbers are repeatedly multiplied together: A(n+1) = A(n) * B } proc DoDisplay {} { global S wm title . $S(title) wm minsize . 500 300 frame .f -bd 2 -relief ridge canvas .c -width 500 -height 400 -bd 0 -highlightthickness 0 bind .c <Configure> {ReCenter %W %h %w} bind all <F2> {console show} frame .ctrl ::ttk::button .go -text "Generate" -command Go ::ttk::button .random -text "Random" -command Random ::ttk::button .clear -text "Clear" -command Clear ::ttk::button .about -text "About" -command About ::ttk::label .lstart -text "Start:" entry .estart -textvariable ::N(start) -width 10 -validate key \ -vcmd {string is integer %P} ::ttk::label .lfactor -text "Factor:" entry .efactor -textvariable ::N(factor) -width 10 -validate key \ -vcmd {string is integer %P} ::ttk::label .liter -text "Iterations:" entry .eiter -textvariable ::N(iter) -width 10 -validate key \ -vcmd {string is integer %P} ::ttk::label .lcurr -text "Current:" entry .ecurr -textvariable ::N(current) -width 10 -state disabled \ -disabledforeground [.eiter cget -fg] foreach i [trace info variable ::N] { eval trace remove variable ::N $i } trace variable ::N w Tracer Tracer a b c pack .f -side left -fill both -expand 1 pack .c -in .f -side left -fill both -expand 1 pack .ctrl -side right -fill y -padx 10 -pady 5 -before .f grid .lstart .estart -in .ctrl -sticky e -pady 5 grid .lfactor .efactor -in .ctrl -sticky e -pady 5 grid .liter .eiter -in .ctrl -sticky e -pady 5 grid .lcurr .ecurr -in .ctrl -sticky e -pady 5 grid .go - -in .ctrl -pady 15 grid .random - -in .ctrl -pady 0 grid .clear - -in .ctrl -pady 5 grid rowconfigure .ctrl 99 -weight 1 grid .about - -in .ctrl -row 100 -pady 10 catch { ::ttk::sizegrip .sg place .sg -in . -relx 1 -rely 1 -anchor se } } proc Tracer {var1 var2 op} { if {[regexp {^[0-9]*$} $::N(start)] && [string is integer -strict $::N(factor)] && [string is integer -strict $::N(iter)]} { set how !disabled } else { set how disabled } .go state $how } proc ReCenter {W h w} { ;# Called by configure event #set h2 [expr {$h / 2}] ; set w2 [expr {$w / 2}] #$W config -scrollregion [list -$w2 -$h2 $w2 $h2] DrawGraph } proc DrawGraph {} { global S CNT .c delete all .c config -width [winfo width .c] -height [winfo height .c] foreach {a b c} [::Plotchart::determineScale 0 \ [expr {$::CNT(max) > 40 ? $::CNT(max)+1 : 40}]] break set S(chart) [::Plotchart::createBarchart .c {1 2 3 4 5 6 7 8 9} \ [list 0 [expr {int($b)}] [expr {int($c)}]] 1] $S(chart) title "Benford's Law" .c itemconfig title -font {Times 20 bold} $S(chart) xtext "Significant Digit" $S(chart) ytext "Frequency" set ydata {} foreach i {1 2 3 4 5 6 7 8 9} { lappend ydata $CNT($i) } $S(chart) plot 1 $ydata {red yellow green cyan blue magenta red yellow green} } proc Init {} { global CNT unset -nocomplain CNT array set CNT {max 0 1 0 2 0 3 0 4 0 5 0 6 0 7 0 8 0 9 0} set ::N(current) "" } proc Go:8.5 {} { global CNT N set a [expr {$N(current) ne "" ? $N(current) : $N(start)}] for {set i 0} {$i < $N(iter)} {incr i} { set a [expr {$a * $N(factor)}] set digit [string range $a 0 0] if {[incr CNT($digit)] > $CNT(max)} { set CNT(max) $CNT($digit) } DrawGraph update } set N(current) $a } proc Go:Bignum {} { global CNT N set a [expr {$N(current) ne "" ? $N(current) : $N(start)}] set b $N(factor) for {set i 0} {$i < $N(iter)} {incr i} { set a [::bigint::mul $a $b] set digit [string range $a 0 0] if {[incr CNT($digit)] > $CNT(max)} { set CNT(max) $CNT($digit) } DrawGraph update } set N(current) $a } proc Go:Mpexpr {} { global CNT N set a [expr {$N(current) ne "" ? $N(current) : $N(start)}] for {set i 0} {$i < $N(iter)} {incr i} { set a [mpexpr {$a * $N(factor)}] set digit [string range $a 0 0] if {[incr CNT($digit)] > $CNT(max)} { set CNT(max) $CNT($digit) } DrawGraph update } set N(current) $a } proc Go:Math.Bignum {} { global CNT N set a [expr {$N(current) ne "" ? $N(current) : $N(start)}] set a [::math::bignum::fromstr $a] set b [::math::bignum::fromstr $N(factor)] for {set i 0} {$i < $N(iter)} {incr i} { set a [::math::bignum::mul $a $b] set str [::math::bignum::tostr $a] set digit [string range $str 0 0] if {[incr CNT($digit)] > $CNT(max)} { set CNT(max) $CNT($digit) } DrawGraph update } set N(current) [::math::bignum::tostr $a] } proc Clear {} { Init DrawGraph } proc Random {} { set ::N(start) [expr {1 + int(rand() * 300)}] set ::N(factor) [expr {1 + int(rand() * 300)}] set ::N(current) "" Init Go } proc About {} { set msg "Benford's Law\nby Keith Vetter, Oct 2006\n$::S(about)" append msg "Using $::S(bignum) for multi-precision arithmetic." tk_messageBox -title "About" -message $msg } ################################################################ # # Load a multi-precision math package of some sort # while {1} { # tcl 8.5 should work correctly if {[package vsatisfies $::tcl_version 8.5] > 0} { set S(bignum) "Tcl 8.5" interp alias {} Go {} Go:8.5 break } # Bignum package set n [catch {package require bignum}] if {$n} { set n [catch {source bignum.kit}] } set n [catch {package require bignum}] if {! $n} { set S(bignum) "bignum package" interp alias {} Go {} Go:Bignum break } # mpexpr (how should it be loaded?) set n [catch {package require Mpexpr}] if {$n} { set fname "mpexpr10[info sharedlibextension]" set n [catch {load $fname}] } if {! $n} { set S(bignum) "mpexpr package" interp alias {} Go {} Go:Mpexpr break } # Fall back to tcllib's math::bignum package set S(bignum) "math::bignum package" package require math::bignum interp alias {} Go {} Go:Math.Bignum break } Init DoDisplay update Random return