Version 11 of expr problems with int

Updated 2005-03-15 15:44:51

Richard Suchenwirth - Mathematics with expr has some problems coming from the C implementation. On my NT box, and on most present-day computers, int is represented as a 32-bit signed "long". This means there is a limit, a.k.a. MAXINT, to what an integer can express:

 % expr 2147483647
 2147483647
 % expr 2147483648
 -2147483648

Probably not what was intended. (Larry Virden interjects - I would much prefer to see Tcl do things that are 'intended' than not - if anyone else has this desire, perhaps it would be worthwhile to consider what compatible changes could be made to Tcl so that one gets the math one expects...)

From the given threshold, positives are recast to negatives, up to

 % expr 4294967295
 -1

and then you run into

 % expr 4294967296
 error: integer value too large to represent

Tcl 8.4 provides a wide integer type for 64-bit support. More below.

You might consider using the multiple-precision extension called Mpexpr, which can handle integer and floating-point numbers with large numbers of digits, or the pure-Tcl Arbitrary Precision Math Procedures.

As an alternative, you gain some more computing power by converting such strings to floating-point variables. You gain the advantage of a wider range of values and (usually) more significant digits, but you do not have unlimited precision. (You can't count all the way to 10e300 by ones.) You also lose the ability to use the [incr] command, but you can always use [expr].
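The precision limit can be seen directly. This is a minimal sketch, assuming IEEE 754 doubles with a 53-bit significand (true of practically all current platforms); the variable name big is only for illustration:

```tcl
# 2**53: above this value a double can no longer represent every integer
set big 9007199254740992.0

# Adding one is lost to rounding, so "counting by ones" stalls here
puts [expr {$big + 1.0 == $big}]

# Well below the limit, integers held in doubles are still exact
puts [expr {1000000.0 + 1.0 == 1000001.0}]
```

Both comparisons print 1: the first because the increment is rounded away, the second because the sum is exact.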

You cannot use expr's double() function to perform the conversion, because expr fails with the same integer conversion error (as above) before calling double().

One simple way of casting an "integerstring" to a "doublestring" is to append a dot:

 set x $i. ;# or: append i "."

This may however produce an ill-formed number if the numberstring contained a dot already. Hume Smith gave a clever solution in news:comp.lang.tcl :

 append i "e0"

makes it look like scientific notation, meaning "multiplied by 10 to the 0th power, which is 1", which forces the string to a floatstring (i.e. that expr interprets as double) with pure string manipulation. If there is the slightest possibility that scientific notation occurs in the input, make it bullet-proof like

 if {![regexp -nocase e $i]} {append i e0}
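The same idea can be wrapped in a small proc. This is only a sketch: the name forcefloat is made up here, and it assumes the input is a decimal number string (a hex string such as 0x10 would be mangled by the appended "e0").

```tcl
# Hypothetical helper: return a string that expr will read as a double.
# Assumes decimal input; hex or octal strings are not handled.
proc forcefloat {s} {
    # Leave the string alone if it already carries an exponent (e or E)
    if {![regexp -nocase e $s]} {
        append s e0
    }
    return $s
}

# 4294967296 does not fit in 32 bits, but as a double it divides fine
puts [expr {[forcefloat 4294967296] / 10}]

# With the conversion, 7/2 is 3.5 rather than the integer quotient 3
puts [expr {[forcefloat 7] / 2}]
```

Because the conversion is pure string manipulation, the value never passes through the integer parser, which is the whole point of the trick.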

If you get an expression in from outside, and want it to be computed in float (in case it contains divisions), you can safeguard that with

 expr 1.0*$expression

Here again, braces cannot be used, because the expr parser won't take operators in a variable... RS


Here's a solution from Donal Fellows that checks the error reason:

   # The absence of {curlies} from [expr] is crucial!
   if {[catch {expr double($int)} float]} {
       if {[string equal [lrange $::errorCode 0 1] "ARITH IOVERFLOW"]} {
           # We know we've got an int value now!
           set float [expr double($int.0)]
       } else {
           error "attempted conversion of non-numeric to float"
       }
   }

Paul Welton showed in the comp.lang.tcl newsgroup that you can get an unsigned string rep if you ask for it:

 format %u -1 => 4294967295

This can be sugared into a C-like declaration:

 proc unsigned  var {
        uplevel trace var $var w "{set $var \[format %u \$$var\];#}"
 } ;#RS
 unsigned y
 set y -1
 4294967295

But if you incr a variable which holds that value, you're still stuck at 0...


Volker Hetzer wrote in the comp.lang.tcl newsgroup:

 [expr -1 / 10] returns -1 !!! What's that?

Peter G. Baum responded: That's integer division. From man expr:

 *      /      %

Multiply, divide, remainder. None of these operators may be applied to string operands, and remainder may be applied only to integers. The remainder will always have the same sign as the divisor and an absolute value smaller than the divisor. So

 expr -1%10

must be >= 0 and < 10 (it is 9), and

 expr 10*(-1/10)+(-1%10)

must be -1. If you solve for "-1/10", you find that -1 is the correct answer.
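The relationship between / and % can be checked mechanically. A minimal sketch follows; the proc name divmod_ok is hypothetical and the operand pairs are arbitrary examples:

```tcl
# Hypothetical helper: check a == b*(a/b) + a%b and the sign rule for %
proc divmod_ok {a b} {
    set q [expr {$a / $b}]
    set r [expr {$a % $b}]
    # The remainder must be zero or share the sign of the divisor b
    set sign_ok [expr {$r == 0 || ($r > 0) == ($b > 0)}]
    return [expr {$sign_ok && $a == $b * $q + $r}]
}

foreach {a b} {-1 10   1 10   -7 2   7 -2} {
    puts "$a / $b = [expr {$a / $b}], $a % $b = [expr {$a % $b}], consistent: [divmod_ok $a $b]"
}
```

For -1 and 10 this reports a quotient of -1 and a remainder of 9, which is the only pair that satisfies both rules at once.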

Steve Offutt wrote: To prevent errors caused by integer division, either explicitly cast to double, or introduce a decimal point:

 % expr -1 / 10
 -1
 % expr double(-1) / 10
 -0.1
 % expr -1 / 10.0
 -0.1
 % expr -1.0 / 10
 -0.1

Dan Kuchler added: Of course, if you want 'integer' division that truncates toward zero (giving 0 here instead of -1), you have to do something like:

 expr {int(double(-1)/10)}
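Going through double() stops being exact once the operands no longer fit in the significand of a double. As an alternative sketch, truncation toward zero can be had with integer arithmetic only; the name truncdiv is invented for this example:

```tcl
# Hypothetical helper: integer division that truncates toward zero,
# built on expr's own division, which rounds toward minus infinity.
proc truncdiv {a b} {
    set q [expr {$a / $b}]
    # A negative, inexact quotient was rounded down; move it one step up
    if {$q < 0 && $q * $b != $a} {
        incr q
    }
    return $q
}

puts [truncdiv -1 10]   ;# 0, where expr {-1 / 10} gives -1
puts [truncdiv -7 2]    ;# -3, where expr {-7 / 2} gives -4
puts [truncdiv 7 2]     ;# 3, same as expr
```

Exact quotients such as -10 / 5 are left untouched, because the correction only fires when the multiplication check fails.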

Kevin Kenny (23 May 2001) --

If you're dealing with integers that don't fit in 32 bits, but do fit in your double-precision floating point significand, you can sometimes use 'double' to get a little bit more precision. Consider:

    set x 0x0100
    set y 0x1000FF1
    set z [expr { double( $x ) * double( $y ) }]
    set upper [expr { int( $z / ( 1 << 24 ) )}]
    set lower [expr { int( $z - ( 1 << 24 ) * double( $upper ) ) }]
    puts [format {0x%06x%06x} $upper $lower]

which prints:

    0x0001000ff100

RS Here's how you can find out your installation's MAXINT:

 expr 0x7[string range [format %X -1] 1 end]

To test:

 set maxint [expr 0x7[string range [format %X -1] 1 end]]
 incr maxint

should return a negative number with absolute value maxint+1.


Update for Tcl 8.4.1:

 % set t [scan 7fffffffffffffff %lx ]
 9223372036854775807
 % incr t
 -9223372036854775808

20050104 Twylite - In 8.4, expr and incr work according to the underlying type of the integer, which is determined by its size. So:

   % set i [expr 2147483647]
   2147483647
   % incr i
   -2147483648
   % set i [expr 2147483647 + 1]
   -2147483648
   % expr 2000000000 + 1000000000
   -1294967296

but:

   % set i [expr 21474836470]
   21474836470
   % incr i
   21474836471
   % set i [expr 21474836470 + 1]
   21474836471

A number larger than 32 bits can be formed as a string. To use 32-bit numbers in a calculation that will have a 64-bit result, you can use the wide() function:

   % expr wide(2000000000) + 1000000000 
   3000000000

Lars H, 5 Jan 2005: Yeah, this is ugly. I filed a bug report on it about a year ago, but essentially got the reply that the bug is now so established that it cannot be fixed without a TIP (IMHO a questionable conclusion, but that's the current will of the powers that be).

Ideally, this would be resolved by getting proper integers for Tcl 8.5 or 9.0.

DKF: It's a horrendous bug/misfeature, but it ended up that way so we didn't break the code of too many people who were relying on Tcl's existing 32-bit behaviour. This is an area that felt very much like "I'm damned by all sides whatever I do"... :^/ The latest versions of Tcl (don't know if any of this has made it into the 8.4 release branch) are much less awful, as we've managed to greatly reduce the amount of tedious shimmering going on. Ultimately, we'd like to see arbitrary precision integers (TIP#132 is the plan) but timescale is unknown given currently available effort.


I wish expr would complain when there is an integer overflow, e.g.

 expr 46341*46341
 -2147479015
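One way to get such a complaint is to do the multiplication in 64 bits and check the range afterwards. This is a sketch only, assuming Tcl 8.4 or later (for wide()) and operands that each fit in 32 bits; mul32 is a made-up name:

```tcl
# Hypothetical helper: multiply two 32-bit integers and raise an error
# instead of silently wrapping when the product leaves the 32-bit range.
proc mul32 {a b} {
    # The product of two 32-bit values always fits in 64 bits
    set r [expr {wide($a) * wide($b)}]
    if {$r > 2147483647 || $r < -2147483648} {
        error "integer overflow: $a * $b does not fit in 32 bits"
    }
    return $r
}

puts [mul32 46340 46340]
if {[catch {mul32 46341 46341} msg]} {
    puts "caught: $msg"
}
```

The first call succeeds because 46340*46340 is still below MAXINT; the second is the overflowing case shown above and is reported as an error.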

If you don't have multiple precision, how about the following?

 proc rshift10 {int dec_places} {
    # Insert a decimal point dec_places digits from the right,
    # padding with zeros on the left as needed, e.g.
    #   6  -> 0.006 for 3 dec_places
    #   23 -> 2.3   for 1 dec_places
    if { $int == 0 } {
        if { $dec_places > 0 } {
            return "0.[string repeat 0 $dec_places]"
        } else {
            return 0
        }
    }
    # Split off the sign so that it is not counted as a digit
    set sign ""
    if { $int < 0 } {
        set sign -
        set int [string range $int 1 end]
    }
    set s [string length $int]
    if { $s == $dec_places } {
        return "${sign}0.$int"
    }
    if { $s < $dec_places } {
        return "${sign}0.[string repeat 0 [expr {$dec_places - $s}]]$int"
    }
    if { $s > $dec_places } {
        return "$sign[string range $int 0 [expr {$s - $dec_places-1}]].[string range $int [expr {$s - $dec_places}] end]"
    }
 }

 proc samesign { a b } {
    if { ($a>=0 && $b>=0) || ($a<0 && $b<0) } {
        return 1
    } else {
        return 0
    }
 }

 proc add { n1 n2 } {
    set k1 [string first . $n1]
    if { $k1==-1 } {
        set p1 0
    } else {
        set p1 [expr {[string length $n1] - $k1 - 1}]
        set n1 "[string range $n1 0 [expr {$k1-1}]][string range $n1 [expr {$k1+1}] end]"
    }
    # Prevent octal interpretation; an all-zero operand trims to "", so restore 0
    set n1 [string trimleft $n1 0]
    if { $n1 eq "" } { set n1 0 }

    set k2 [string first . $n2]
    if { $k2==-1 } {
        set p2 0
    } else {
        set p2 [expr {[string length $n2] - $k2 - 1}]
        set n2 "[string range $n2 0 [expr {$k2-1}]][string range $n2 [expr {$k2+1}] end]"
    }
    # Prevent octal interpretation; an all-zero operand trims to "", so restore 0
    set n2 [string trimleft $n2 0]
    if { $n2 eq "" } { set n2 0 }

    # Line up
    if { $p1>$p2 } { 
        set p $p1
        append n2 [string repeat 0 [expr {$p1-$p2}]]
    } else { 
        set p $p2
        append n1 [string repeat 0 [expr {$p2-$p1}]]
    }

    set value [expr {wide($n1) + wide($n2)}]

    # Check for integer overflow
    if { [samesign $n1 $n2] && ![samesign $n1 $value] } { 
        error "Result is too large to represent"
    }

    # Format answer
    if { $p>0 } {
        return [rshift10 $value $p]
    } else {
        return $value
    }
 }

 proc mult { n1 n2 } {
    set k1 [string first . $n1]
    if { $k1==-1 } {
        set p1 0
    } else {
        set p1 [expr {[string length $n1] - $k1 - 1}]
        set n1 "[string range $n1 0 [expr {$k1-1}]][string range $n1 [expr {$k1+1}] end]"
    }
    # Prevent octal interpretation; an all-zero operand trims to "", so restore 0
    set n1 [string trimleft $n1 0]
    if { $n1 eq "" } { set n1 0 }

    set k2 [string first . $n2]
    if { $k2==-1 } {
        set p2 0
    } else {
        set p2 [expr {[string length $n2] - $k2 - 1}]
        set n2 "[string range $n2 0 [expr {$k2-1}]][string range $n2 [expr {$k2+1}] end]"
    }
    # Prevent octal interpretation; an all-zero operand trims to "", so restore 0
    set n2 [string trimleft $n2 0]
    if { $n2 eq "" } { set n2 0 }

    set result [expr {wide($n1)*wide($n2)}]

    # Check for integer overflow
    if { !($n1<=32767 && $n1>=-32768 && $n2<=32767 && $n2>=-32768) 
         && $n2!=0 
         && ($result/$n2!=$n1 || ($n2==-1 && $n1<0 && $result < 0)) } {
        error "Result is too large to represent"
    }

    # format result
    set p [expr {$p1+$p2}]
    if { $p>0 } {
        return [rshift10 $result $p]
    } else {
        return $result
    }
 }

Why does Tcl think a fraction less than one is zero?


Comparing doubles may also be a real problem. See also Bag of algorithms - Bag of Tk algorithms - Mathematically oriented extensions - Category Mathematics