Version 2 of Computers and real numbers

if { 0 } {

Arjen Markus (14 July 2004) I think it is appropriate to write a little tutorial on the subject of "real numbers". I do not mean this as a grand in-depth tutorial, just a few of the more basic facts.

Let us first start with some terminology: computers normally deal with floating-point numbers rather than the real numbers known from mathematics. And that is to a large degree the cause of much of the trouble!

Note:

Only a few things that I say here are specific to Tcl. Computers in general have a hard time dealing with real numbers.

Floating-point numbers are approximations of the real numbers:

They have a limited range: magnitudes from roughly 1.0e-38 to 1.0e+38 for "single-precision" and from roughly 1.0e-308 to 1.0e+308 for "double-precision" (the usual IEEE 754 formats). Tcl uses "double-precision".
They can approximate real numbers with a finite number of digits only. For "double-precision", it is typically 15 to 17 significant decimal digits. That is enough in many practical applications, but you can still get into serious problems.
Whereas we are used to decimals (1.1 and 0.002 for instance), most current-day computers use a binary system, so they cannot represent 1.1 exactly. Strange? Yes, perhaps, but not if you compare this to 1/3 in our decimal system:

        1/3 = 0.333333... a never-ending sequence of threes!
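
The same effect can be made visible in Tcl by asking for more digits than the default conversion shows. The following is only a minimal sketch using the standard format command; the variable names are chosen for this illustration, and the exact trailing digits may depend on the platform's C library.

```tcl
# 1.1 has no finite binary expansion, so the stored value is only close to it
set stored [format %.20f 1.1]
puts "1.1 is stored as $stored"

# 0.5 is a power of two and is therefore stored exactly
set exact [format %.20f 0.5]
puts "0.5 is stored as $exact"
```

Values such as 0.5, 0.25 or 0.75 are sums of powers of two and survive the conversion unchanged; most other decimal fractions do not.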

What does this mean in practice? Well, have a look yourself: }

   set x 0.1
   set y 0.3
   set z [expr {$x+$y}]

   puts "$x + $y = $z"
   puts "Difference: [expr {$z-$x-$y}]"

if { 0 } { On my PC, the difference is approximately 5.55e-17. This is due to the fact that both 0.1 and 0.3 are represented by floating-point numbers that are not exactly 0.1 and 0.3 and their sum is not exactly 0.4.

Does this matter? It depends.

Real variables controlling a loop

What you should not do is program like this: }

   for {set x 0.1} {$x < 1.1} {set x [expr {$x+0.1}]} {
      puts $x
   }

if { 0 } {

On my PC x runs from 0.1 to 1.1, not 1.0! For the sake of completeness, it can be noted that the problem does not occur in general: from 1.0 to 10.0 in steps of 1.0, for example, there is no problem. The casual reader should be aware that "real" problems only arise when fractions are involved (decimal fractions are not necessarily represented exactly in binary) or when the number is very large (digits get truncated). In short: all values that can also be represented as integers are safe, as long as they are not too large.
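
The limit for large whole numbers can be made concrete. An IEEE 754 double has a 53-bit significand (a property of the format, not of Tcl), so whole numbers are exact only up to 2^53 = 9007199254740992. A small sketch, with variable names chosen for this illustration:

```tcl
set big 9007199254740992.0   ;# 2**53 written as a double

# Just below the limit every whole number is still representable
set belowOk [expr {($big - 2.0) + 1.0 == $big - 1.0}]

# At the limit, adding 1.0 is lost to rounding
set lost [expr {$big + 1.0 == $big}]

puts "below the limit: $belowOk, increment lost at the limit: $lost"
```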

Here is a more elaborate loop that shows the pitfall ... }

   set tcl_precision 17
   foreach start {0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0} {
      set stop  [expr {$start+1.0}]
      set count 0
      for {set x $start} {$x < $stop} {set x [expr {$x+0.1}]} {
         incr count
      }
      puts "End: $x -- $count"
   }

if { 0 } {

Here is the output I get:

 End: 1.0999999999999999 -- 11
 End: 1.2 -- 11
 End: 1.2 -- 10
 End: 1.3 -- 10
 End: 1.4000000000000001 -- 10
 End: 1.5000000000000002 -- 10
 End: 1.6000000000000003 -- 10
 End: 1.7000000000000004 -- 10
 End: 1.8000000000000007 -- 10
 End: 1.9000000000000008 -- 10
 End: 2.0000000000000009 -- 10

Yes, tcl_precision is a trick: this is one of the few magical variables in Tcl. It controls the number of significant digits that are used to convert floating-point numbers to readable strings.

The variation in the number of iterations (the value of "count" at the end) shows why you should not use "real" variables to control loops.
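
A common way around this is to let an integer counter control the loop and to compute the real value from it. The following is a minimal sketch of that idea, not the only possible fix; the names step, steps and counts are made up for this illustration.

```tcl
set step  0.1
set steps 10
set counts {}
foreach start {0.0 0.1 0.5 1.0} {
    set count 0
    for {set i 0} {$i < $steps} {incr i} {
        # x is computed from the counter, never accumulated
        set x [expr {$start + $i * $step}]
        incr count
    }
    lappend counts $count
    puts "Start: $start -- last x: $x -- $count"
}
```

The number of iterations now depends only on the integer counter, and rounding errors in x no longer add up from one iteration to the next.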

The consequences of finite precision

Another remarkable thing about computer arithmetic is this:

   (x + y) + z  may not be equal to x + (y + z) or z + y + x

Just look at this code: }

   set x  1.0e30
   set y  1.0
   set z -1.0e30

   puts "x+y+z = [expr {$x+$y+$z}]"
   puts "x+z+y = [expr {$x+$z+$y}]"

if { 0 } {

The result:

   x+y+z = 0.0
   x+z+y = 1.0

Because of the finite precision, 1.0e30+1.0 becomes 1.0e30.
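
When the order of the terms cannot be chosen freely, one known remedy is compensated summation. The sketch below uses Neumaier's variant, a technique that is not part of the text above; the proc name compensatedSum is made up for this illustration.

```tcl
# Sum a list of numbers while tracking the low-order part that
# ordinary addition throws away (Neumaier's compensated summation)
proc compensatedSum {values} {
    set sum  0.0
    set comp 0.0   ;# accumulated rounding error
    foreach v $values {
        set t [expr {$sum + $v}]
        if {abs($sum) >= abs($v)} {
            # low-order digits of v were lost
            set comp [expr {$comp + (($sum - $t) + $v)}]
        } else {
            # low-order digits of sum were lost
            set comp [expr {$comp + (($v - $t) + $sum)}]
        }
        set sum $t
    }
    return [expr {$sum + $comp}]
}

set total [compensatedSum {1.0e30 1.0 -1.0e30}]
puts "compensated sum = $total"
```

For these three values the compensation term recovers the 1.0 that the plain left-to-right sum loses. It reduces the error; it does not make floating-point arithmetic exact.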

Rounding

By now it must be clear: nothing concerning "real numbers" is easy with computers. That is even true for such a seemingly simple matter as rounding.

There are at least four methods to round a number:

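Round half up: 0.1 ... 0.4 go downwards to 0, 0.5 ... 0.9 go upwards to 1
Round half to even: as above, but an exact half goes to the neighbour whose last digit is even
Round to zero (truncate): 0.9 ==> 0
Round to infinity (always upwards): 0.1 ==> 1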

The first two are the commonest in manual computations, but quite often a computer will simply truncate the value:

   set e [expr {1/3}]

does not become 0.33333333..., but simply 0: the operands are integer, so the result will become integer. To force the result to maintain the decimals:

   set e [expr {double(1)/3}]

for instance.
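
Tcl's expr command also offers several rounding functions directly. The following sketch only uses the built-in functions round, int, floor and ceil; the variable names are chosen for this illustration.

```tcl
# Integer operands: the quotient is an integer
set a [expr {1/3}]
# One floating-point operand is enough to get a floating-point result
set b [expr {1.0/3}]

# Built-in rounding functions
set nearest    [expr {round(2.5)}]   ;# nearest integer, ties away from zero
set towardZero [expr {int(0.9)}]     ;# drops the fraction
set down       [expr {floor(0.9)}]   ;# largest whole number not above the argument
set up         [expr {ceil(0.1)}]    ;# smallest whole number not below the argument

puts "$a $b $nearest $towardZero $down $up"
```

Note that round and int return integers, while floor and ceil return floating-point values.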

Rounding is extremely important with financial computations: get it wrong and, alas, you lose money!

An example where rounding to even is preferable is this: round off the number 0.444445 to fewer and fewer decimals.

The first method would give: 0.44445, 0.4445, 0.445, 0.45, 0.5.

Rounding to even gives: 0.44444, 0.4444, 0.444, 0.44, 0.4, so the difference is smallest.
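
The two sequences can be reproduced without any floating-point arithmetic by working on the scaled integer 444445. The procs dropDigitHalfUp and dropDigitHalfEven are hypothetical helpers written for this sketch, and they assume non-negative input.

```tcl
# Drop the last decimal digit, rounding a trailing 5 upwards
proc dropDigitHalfUp {n} {
    return [expr {($n + 5) / 10}]
}

# Drop the last decimal digit, rounding a trailing 5 to the even neighbour
proc dropDigitHalfEven {n} {
    set q [expr {$n / 10}]
    set r [expr {$n % 10}]
    if {$r > 5 || ($r == 5 && $q % 2 == 1)} {
        incr q
    }
    return $q
}

set up 444445
set even 444445
set upSeq {}
set evenSeq {}
for {set i 0} {$i < 5} {incr i} {
    set up   [dropDigitHalfUp $up]
    set even [dropDigitHalfEven $even]
    lappend upSeq $up
    lappend evenSeq $even
}
puts "half up:   $upSeq"
puts "half even: $evenSeq"
```

Read with the decimal point put back in, the two lists correspond to the two sequences given above.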

Comparisons

A last slippery spot: comparing two numbers. You should not compare floating-point numbers directly for equality:

   if { $x == 0.1 } { puts "Do not do this!" }

Instead use a small enough margin:

   if { abs($x-0.1) < 1.0e-10 } { puts "This is more reliable" }

This is not only true of equality but also of inequality, and even of less-than and greater-than comparisons - see the for loops above!
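
A fixed margin such as 1.0e-10 only suits numbers of moderate size. A margin that scales with the operands is often more robust. The following is a sketch under stated assumptions: the proc name nearlyEqual and its default tolerances are made up for this illustration, and the max function in expr requires Tcl 8.5 or later.

```tcl
# Return 1 if a and b agree within a relative tolerance, with an
# absolute tolerance as a floor for values close to zero
proc nearlyEqual {a b {relTol 1.0e-9} {absTol 1.0e-12}} {
    set diff  [expr {abs($a - $b)}]
    set scale [expr {max(abs($a), abs($b))}]
    return [expr {$diff <= max($relTol * $scale, $absTol)}]
}

# Add 0.1 ten times; the rounding errors accumulate along the way
set x 0.0
for {set i 0} {$i < 10} {incr i} {
    set x [expr {$x + 0.1}]
}

set exactlyEqual [expr {$x == 1.0}]
set closeEnough  [nearlyEqual $x 1.0]
puts "exact comparison: $exactlyEqual, with tolerance: $closeEnough"
```

Which tolerances are suitable depends on the computation at hand; the defaults here are arbitrary choices.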

One convenient way out of this conundrum is the math::fuzzy module of Tcllib. It takes care of determining a suitable margin and returns a consistent result.

There is more!

Further information, probably more than you want to know, is given by David Goldberg in ...

The page A real problem has many more examples and some technical discussions on the subject.

From time to time you may get into serious trouble with straightforward computations. See the page Evaluating polynomial functions if you feel up to a preview. Luckily, in many cases smart people have already found a solution. Consult your local numerical analysis guru to find out how to tackle the problem.

RJM: Thanks Arjen for this contribution. After reading this page I wonder why BCD arithmetic has not got to be a common option in computing apps. Every microprocessor provides a BCD mode flag or equivalent operations and as I have been working with C-compilers in the past, some of them provided an option "use BCD arithmetic". Of course, BCD is slower than binary calculations, but here the same applies to the increasing popularity of scripting languages: today's computers are sooo fast.... Perhaps, BCD is not implemented in Intel's floating point processor (I don't know) - that would make a dramatic speed difference (footnote: at the same time I added an extra note close to the for loop example).

[ Category Tutorial | Category Mathematics | Category Numerical Analysis ] }