expr problems with int

expr problems with int describes issues that were encountered in Tcl prior to version 8.5. There's nothing here that is relevant to versions 8.5 and later.

See Also

Why does Tcl think a fraction less than one is zero?
a real problem
about comparing doubles
Bag of algorithms
Bag of Tk algorithms
Mathematically oriented extensions

Description

Richard Suchenwirth 1999-08: Mathematics with expr has some problems coming from the C implementation. On my NT box, and on most present-day computers, int is represented as a 32-bit signed "long". This means there is a limit, a.k.a. MAXINT, to what an integer can express:

% expr 2147483647
2147483647
% expr 2147483648
-2147483648

Probably not what was intended.

(Larry Virden interjects - I would much prefer to see Tcl do things that are 'intended' than not - if anyone else has this desire, perhaps it would be worthwhile to consider what compatible changes could be made to Tcl so that one gets the math one expects...)


LV 2005-03-15: A number of TIPs relating to expr and its functionality are being worked on right now. I think that some, perhaps many, of the issues discussed here may change in Tcl 8.5 or perhaps 9.0 (I don't know the implementation schedule).

From the given threshold, positives are recast to negatives, up to

% expr 4294967295
-1

and then you run into

% expr 4294967296
error: integer value too large to represent
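The wraparound shown above is ordinary two's-complement behaviour and can be reproduced on an interpreter that no longer has the problem. The following is a minimal sketch, assuming Tcl 8.5 or later (where integers do not overflow); wrap32 is a hypothetical helper name, not part of Tcl:

```tcl
# Hypothetical helper: reduce an arbitrary integer to the value a
# 32-bit signed int would hold (two's complement).
proc wrap32 {n} {
    expr {(($n + 0x80000000) & 0xFFFFFFFF) - 0x80000000}
}

puts [wrap32 2147483647]   ;# 2147483647
puts [wrap32 2147483648]   ;# -2147483648
puts [wrap32 4294967295]   ;# -1
```

It only imitates the results; it does not reproduce the "too large to represent" error for values of 4294967296 and above.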

Tcl 8.4 provides a wide integer type for 64-bit support. More below.

You might consider using Mpexpr, a multiple-precision extension which can handle integer and floating point numbers with large numbers of digits, or the pure-Tcl Arbitrary Precision Math Procedures.

As an alternative, you gain some more computing power by converting such strings to floating-point values. You get a wider range of values and (usually) more significant digits, but not unlimited precision. (You can't count all the way to 10e300 by ones.) You also lose the ability to use incr, but you can always use expr.
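Both limits can be checked directly. This is a small sketch, assuming IEEE 754 double precision (the usual case, but platform-dependent); the variable names are illustrative only:

```tcl
# 2**53 is the first point where a double can no longer count by ones.
set big 9007199254740992.0
puts [expr {$big + 1 == $big}]      ;# 1: adding one changes nothing
puts [expr {1e300 + 1 == 1e300}]    ;# 1: same effect, much further out

# incr refuses a floating-point value, expr accepts it.
set x 3e0
puts [catch {incr x}]               ;# 1: incr raised an error
puts [expr {$x + 1}]                ;# 4.0
```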

You cannot use expr's double() to perform the conversion, because expr fails with the same integer conversion error (as above) before calling double().

One simple way of casting an integer to a double is to append a dot:

set x $i. ;# or: append i "."

This may however produce an ill-formed number if $i contained a dot already. Hume Smith gave a clever solution in news:comp.lang.tcl :

append i e0

makes it look like scientific notation, meaning "multiplied by 10 to the 0th power, which is 1", which forces the string to a non-integral real number (a number that expr interprets as double) with pure string manipulation. If there is the slightest possibility that scientific notation occurs in the input, make it bullet-proof:

if {![regexp -nocase e $i]} {append i e0}
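The two string tricks can be folded into one helper. This is only a sketch: as_double is a hypothetical name, and it assumes the input is a decimal number (no hex or octal prefixes, no Inf or NaN):

```tcl
# Hypothetical helper: return a string that expr will read as a double.
proc as_double {s} {
    # Leave the value alone if it already has a dot or an exponent.
    if {![regexp -nocase {[.e]} $s]} {
        append s e0
    }
    return $s
}

puts [as_double 3]                      ;# 3e0
puts [as_double 3.5]                    ;# 3.5
puts [as_double 1E3]                    ;# 1E3
puts [expr {[as_double 3] / 2}]         ;# 1.5
```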

If the expression comes from an external source, it can be cast to a non-integral real number:

expr 1.0*$expression

RS: Here again, braces cannot be used, because the expr parser won't take operators in a variable...

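This does not work for expressions such as 10+3/5, because the multiplication binds only to the first operand: 3/5 is still evaluated as an integer division.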
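A quick check of that case, as a sketch. The variable name expression is only illustrative, and the unbraced expr is kept from the text above; it substitutes its argument twice, so it should not be fed untrusted input:

```tcl
set expression 10+3/5

# Parsed as (1.0*10) + (3/5); the integer division 3/5 yields 0.
puts [expr 1.0*$expression]       ;# 10.0

# Parentheses do not help: 3/5 still has two integer operands.
puts [expr 1.0*($expression)]     ;# 10.0

# Only a floating-point operand inside the division changes the result.
puts [expr {10 + 3.0/5}]          ;# 10.6
```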


Here's a solution from Donal Fellows that checks the error reason:

# The absence of {curlies} from [expr] is crucial!
if {[catch {expr double($int)} float]} {
    if {[string equal [lrange $::errorCode 0 1] {ARITH IOVERFLOW}]} {
        # We know we've got an int value now!
        set float [expr double($int.0)]
    } else {
        error "attempted conversion of non-numeric to float"
    }
}

Paul Welton showed in comp.lang.tcl that you can get an unsigned string rep if you ask for it:

format %u -1 => 4294967295

This can be sugared into a C-like declaration:

proc unsigned  var {
       uplevel trace add variable $var write "{set $var \[format %u \$$var\];#}"
} ;#RS
unsigned y
set y -1
4294967295

But if you incr a variable which holds that value, you're still stuck at 0...


Volker Hetzer wrote in comp.lang.tcl:

 [expr -1 / 10] returns -1 !!! What's that?

Peter G. Baum responded:

That's integer division. From man expr:

 *      /      %

Multiply, divide, remainder. None of these operators may be applied to string operands, and remainder may be applied only to integers. The remainder will always have the same sign as the divisor and an absolute value smaller than the divisor. So

expr -1%10

must be > 0 and < 10, and

expr 10*(-1/10)+(-1%10)

must be -1. If you solve for "-1/10", you find that -1 is the correct answer.
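The relation used in that argument, b*(a/b) + (a%b) == a, can be checked for other sign combinations as well. A short sketch; the operand pairs are arbitrary examples:

```tcl
foreach {a b} {-1 10   1 10   -7 2   7 -2} {
    set q [expr {$a / $b}]    ;# quotient, rounded toward minus infinity
    set r [expr {$a % $b}]    ;# remainder, same sign as the divisor
    puts "$a / $b = $q   $a % $b = $r   check: [expr {$b * $q + $r}]"
}
```

The four quotients come out as -1, 0, -4 and -4, with remainders 9, 1, 1 and -1, and the check column reproduces the dividend each time.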

Steve Offutt wrote: To prevent errors caused by integer division, either explicitly cast to double, or introduce a decimal point:

 % expr -1 / 10
 -1
 % expr double(-1) / 10
 -0.1
 % expr -1 / 10.0
 -0.1
 % expr -1.0 / 10
 -0.1

Dan Kuchler added: Of course, if you want C-style 'integer' division, which truncates toward zero, you have to do something like:

expr {int(double(-1)/10)}
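Going through double() loses precision once the operands exceed what a double can hold exactly. An integer-only variant is sketched below; truncdiv is a hypothetical helper name, not a Tcl built-in:

```tcl
# Hypothetical helper: integer division that truncates toward zero.
proc truncdiv {a b} {
    set q [expr {$a / $b}]          ;# Tcl rounds toward minus infinity
    # A negative, inexact quotient was rounded one step too far down.
    if {$q < 0 && $q * $b != $a} {
        incr q
    }
    return $q
}

puts [truncdiv -1 10]    ;# 0
puts [truncdiv 7 -2]     ;# -3
puts [truncdiv -10 5]    ;# -2
puts [truncdiv 7 2]      ;# 3
```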

Kevin Kenny 2002-05-23:

If you're dealing with integers that don't fit in 32 bits, but do fit in your double-precision floating point significand, you can sometimes use 'double' to get a little bit more precision. Consider:

set x 0x0100
set y 0x1000FF1
set z [expr { double( $x ) * double( $y ) }]
set upper [expr { int( $z / ( 1 << 24 ) )}]
set lower [expr { int( $z - ( 1 << 24 ) * double( $upper ) ) }]
puts [format {0x%06x%06x} $upper $lower]

which prints:

0x0001000ff100

RS: Here's how you can find out your installation's MAXINT:

expr 0x7[string range [format %X -1] 1 end]

To test:

set maxint [expr 0x7[string range [format %X -1] 1 end]]
incr maxint

should return a negative number with absolute value maxint+1.


Update for Tcl 8.4.1:

% set t [scan 7fffffffffffffff %lx ]
9223372036854775807
% incr t
-9223372036854775808

Twylite 2005-01-04: In 8.4, expr and incr work according to the underlying type of the integer, which is determined by its size. So:

% set i [expr 2147483647]
2147483647
% incr i
-2147483648
% set i [expr 2147483647 + 1]
-2147483648
% expr 2000000000 + 1000000000
-1294967296

but:

% set i [expr 21474836470]
21474836470
% incr i
21474836471
% set i [expr 21474836470 + 1]
21474836471

A number larger than 32 bits can be formed as a string. To use 32-bit numbers in a calculation that will have a 64-bit result, you can use the wide() function:

% expr wide(2000000000) + 1000000000 
3000000000

Lars H 2005-01-05: Yeah, this is ugly. I filed a bug report on it about a year ago, but essentially got the reply that the bug is now so established that it cannot be fixed without a TIP (IMHO a questionable conclusion, but that's the current will of the powers that be).

Ideally, this would be resolved by getting proper integers for Tcl 8.5 or 9.0.

DKF: It's a horrendous bug/misfeature, but it ended up that way so we didn't break the code of too many people who were relying on Tcl's existing 32-bit behaviour. This is an area that felt very much like "I'm damned by all sides whatever I do"... :^/ The latest versions of Tcl (don't know if any of this has made it into the 8.4 release branch) are much less awful, as we've managed to greatly reduce the amount of tedious shimmering going on. Ultimately, we'd like to see arbitrary precision integers (TIP#132 is the plan) but timescale is unknown given currently available effort.


I wish expr would complain when there is an integer overflow, e.g.:

expr 46341*46341
-2147479015
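A complaint of that kind can be approximated in script. The sketch below assumes both operands fit in 32 bits and that wide() is available (Tcl 8.4 or later); mul32 is a hypothetical helper name:

```tcl
# Hypothetical helper: multiply two 32-bit integers, raising an error
# instead of silently wrapping around.
proc mul32 {a b} {
    # The product of two 32-bit values always fits in 64 bits.
    set r [expr {wide($a) * wide($b)}]
    if {$r > 2147483647 || $r < -2147483648} {
        error "integer overflow: $a * $b does not fit in 32 bits"
    }
    return $r
}

puts [mul32 46340 46340]                ;# 2147395600
puts [catch {mul32 46341 46341} msg]    ;# 1
puts $msg
```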

If you don't have multiple precision, how about:

proc rshift10 {int dec_places} {
    # format right aligned padding left
    # eg 6 -> 0.006 for 3 dec_places 
    # or 
    # 23 ->2.3 for 1 dec_places 
    set s [string length $int]
    if { $int == 0 } {
        if { $dec_places > 0 } {
            return "0.[string repeat 0 $dec_places]"
        } else {
            return 0
        }
    }
    if { $s == $dec_places } {
        return "0.$int"
    }
    if { $s < $dec_places } {
        return "0.[string repeat 0 [expr {$dec_places - $s}]]$int"
    }
    if { $s > $dec_places } {
        return "[string range $int 0 [expr {$s - $dec_places-1}]].[string range $int [expr {$s - $dec_places}] end]"
    }
}

proc samesign { a b } {
    if { ($a>=0 && $b>=0) || ($a<0 && $b<0) } {
        return 1
    } else {
        return 0
    }
}

proc add { n1 n2 } {
    set k1 [string first . $n1]
    if { $k1==-1 } {
        set p1 0
    } else {
        set p1 [expr {[string length $n1] - $k1 - 1}]
        set n1 "[string range $n1 0 [expr {$k1-1}]][string range $n1 [expr {$k1+1}] end]"
    }
    if { $n1!=0 } {
        # Prevent octal interpretation
        set n1 [string trimleft $n1 0]
    } else {
        set n1 0
    }

    set k2 [string first . $n2]
    if { $k2==-1 } {
        set p2 0
    } else {
        set p2 [expr {[string length $n2] - $k2 - 1}]
        set n2 "[string range $n2 0 [expr {$k2-1}]][string range $n2 [expr {$k2+1}] end]"
    }
    if { $n2!=0 } {
        # Prevent octal interpretation
        set n2 [string trimleft $n2 0]
    } else {
        set n2 0
    }

    # Line up
    if { $p1>$p2 } { 
        set p $p1
        append n2 [string repeat 0 [expr {$p1-$p2}]]
    } else { 
        set p $p2
        append n1 [string repeat 0 [expr {$p2-$p1}]]
    }

    set value [expr {wide($n1) + wide($n2)}]
    
    # Check for integer overflow
    if { [samesign $n1 $n2] && ![samesign $n1 $value] } { 
        error "Result is too large to represent"
    }

    # Format answer
    if { $p>0 } {
        return [rshift10 $value $p]
    } else {
        return $value
    }
}

proc mult { n1 n2 } {
    set k1 [string first . $n1]
    if { $k1==-1 } {
        set p1 0
    } else {
        set p1 [expr {[string length $n1] - $k1 - 1}]
        set n1 "[string range $n1 0 [expr {$k1-1}]][string range $n1 [expr {$k1+1}] end]"
    }
    if { $n1!=0 } {
        # Prevent octal interpretation
        set n1 [string trimleft $n1 0]
    } else {
        set n1 0
    }

    set k2 [string first . $n2]
    if { $k2==-1 } {
        set p2 0
    } else {
        set p2 [expr {[string length $n2] - $k2 - 1}]
        set n2 "[string range $n2 0 [expr {$k2-1}]][string range $n2 [expr {$k2+1}] end]"
    }
    if { $n2!=0 } {
        # Prevent octal interpretation
        set n2 [string trimleft $n2 0]
    } else {
        set n2 0
    }

    set result [expr {wide($n1)*wide($n2)}]

    # Check for integer overflow
    if { !($n1<=32767 && $n1>=-32768 && $n2<=32767 && $n2>=-32768) 
         && $n2!=0 
         && ($result/$n2!=$n1 || ($n2==-1 && $n1<0 && $result < 0)) } {
        error "Result is too large to represent"
    }

    # format result
    set p [expr {$p1+$p2}]
    if { $p>0 } {
        return [rshift10 $result $p]
    } else {
        return $result
    }
}