Version 28 of Tcl and octal numbers

Updated 2008-07-04 13:49:15 by lars_h

Purpose: discuss the common pitfall of Tcl and octal numbers


Since Tcl does many things in an obvious manner, people without a Unix or C background are frequently surprised when they try this kind of code:

 # somehow get today's date and time into MMM DDD YYY HH MM SS variables

 set newtime [expr $HH + 1]

and get an error at 8am. The problem is that Tcl tries to evaluate

  expr 08 + 1

and complains. 08, you see, isn't really the string of the decimal number that comes after 7. Instead, it is an error. Tcl sees the leading 0 and treats the digits after it as representing a base 8 number. But there are no 8s (or 9s) in a base 8 number. So it generates an error.


The fix is to use:

    scan $HH %d HH

which strips hazardous leading zeros. This is also safer than [string trimleft $HH 0], which can fail if $HH ever ends up containing "00", for example (trimming every zero leaves an empty string), or if $HH is negative (not likely in a clock context, but the argument still applies).

DKF
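A small side-by-side sketch of the two approaches, using made-up sample values rather than real clock output. It relies on the inline form of [scan] (no variable name given), which returns the converted value directly.

```tcl
foreach value {08 00 -08 12} {
    # scan with %d reads the field as a decimal integer, so leading
    # zeros are harmless and a sign is handled.
    set scanned [scan $value %d]

    # string trimleft only removes leading "0" characters from the text:
    # "00" becomes the empty string and "-08" is left untouched.
    set trimmed [string trimleft $value 0]

    puts [format {%-4s scan -> %-3s trimleft -> "%s"} $value $scanned $trimmed]
}
```

The empty string and "-08" produced by the trimleft variant are exactly the cases that would trip up a later [expr].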

Lars H, 2008-07-04: One problem with the above that turned up when doing arithmetic on currency is that %d doesn't handle arbitrarily large integers (even though Tcl can); it is restricted to the range of numbers int() can output. In order to handle integers with an arbitrary number of digits, it is necessary to do

    scan $HH %lld HH

Obviously this is not an issue with a number of hours in a day, but it can be an issue for other numbers.
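As an illustration of the %lld form, here is a sketch with a made-up 30-digit amount carrying a leading zero. It assumes a Tcl new enough to have both arbitrary-size integers and the ll size modifier (8.5 or later).

```tcl
# Made-up sample data: a 30-digit number with a hazardous leading zero.
set amount 0123456789012345678901234567890

# %lld asks scan for a decimal integer of unlimited size.
scan $amount %lld big

puts $big
puts [expr {$big + 1}]
```

The leading zero is dropped and the value remains usable in [expr] without being cut down to a machine-sized integer.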


glennj: one potential pitfall of [scan] is that it might mask potential errors:

        set n 09blah42
        incr n

fails as expected with the error message:

        expected integer but got "09blah42"
        while evaluating {incr n}

However:

        set n 09blah42
        scan $n %d n
        incr n ;# ==> n is now 10

Application writers might actually want to trap an invalid entry like that.

[2003-03-12] I see Kevin Kenny contributed the following to c.l.t

     proc forceInteger { x } {
        set count [scan $x %d%s n rest]
        if { $count <= 0 || ( $count == 2 && ![string is space $rest] ) } {
            return -code error "not an integer: \"$x\""
        }
        return $n
     }
     % forceInteger x
     not an integer: "x"
     % forceInteger 123
     123
     % forceInteger 08
     8

This also covers my preceding concern:

     % forceInteger 09blah42
     not an integer: "09blah42"

Better than an explicit test of

    [string is space $rest]

is to just skip (optional) spaces in the scan pattern:

     proc forceInteger { x } {
        set count [scan $x {%d %c} n c]
        if { $count != 1 } {
            return -code error "not an integer: \"$x\""
        }
        return $n
     }

Donald Arseneau
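A possible way to use this in the clock scenario from the top of the page is sketched below. forceInteger is the second version above, repeated only so that the example runs on its own; nextHour is a hypothetical wrapper invented for this illustration.

```tcl
# The second forceInteger from above, repeated so the sketch is self-contained.
proc forceInteger { x } {
    set count [scan $x {%d %c} n c]
    if { $count != 1 } {
        return -code error "not an integer: \"$x\""
    }
    return $n
}

# Hypothetical wrapper: add one hour to a clock field, or say why the
# field was rejected instead of silently accepting garbage.
proc nextHour { field } {
    if {[catch {forceInteger $field} hour]} {
        return "rejected ($hour)"
    }
    return [expr {($hour + 1) % 24}]
}

foreach field {08 23 " 09 " 09blah42 ""} {
    puts "[list $field] -> [nextHour $field]"
}
```

Surrounding whitespace is tolerated, while trailing junk and the empty string are both reported rather than masked.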


[Refer to http://phaseit.net/claird/comp.lang.tcl/fmm.html#zero ]

[Explain improved diagnostic in 8.3.]


The question recently came up, "How do I display the octal value of a character in Tcl?", and RS replied:

 'the complete sequence is "format 0%o [scan a %c]"'

(but only with Tcl more recent than 8.2 or so; older scan works slightly differently).
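Wrapped up as procedures, RS's one-liner and its inverse might look like the sketch below. The names charToOctal and octalToChar are made up for this example, and both assume a non-empty argument and a [scan] that supports the inline form.

```tcl
# Hypothetical helper: octal code (with a leading 0) of the first
# character of a string, following RS's one-liner.
proc charToOctal { ch } {
    return [format 0%o [scan $ch %c]]
}

# Hypothetical inverse: read a string of octal digits with %o and
# turn the resulting code back into a character.
proc octalToChar { oct } {
    return [format %c [scan $oct %o]]
}

puts [charToOctal a]       ;# prints 0141
puts [octalToChar 0141]    ;# prints a
```

Using %o in [scan] makes the octal interpretation explicit, so it does not depend on how the interpreter treats a leading zero elsewhere.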


http://www.tcl.tk/cgi-bin/tct/tip/114.html proposes modifying Tcl in a future release so that numbers beginning with 0 will not be interpreted by default as being expressed in octal. The proposer believes that far more users stumble upon this feature by accident than use it intentionally.


IDG There appears to be an octal-related bug in [string is]: [string is double 098] returns 1. (8.4.1 on windoze)

PT At the very least it's inconsistent (tcl 8.5a0 win98)

 % string is integer 098
 0
 % string is double 098
 1

2003-12-22 VI What I'd like more than what TIP 114 specifies is a prefix like 0d, which would force the rest of the number to be interpreted as decimal.

For the specific clock case, where we know we have two digits, I like to use expr like this:

 set m [clock format [clock seconds] -format %m]
 set m [expr 1$m % 100]
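The trick works because prefixing a 1 turns "08" into the ordinary decimal number 108, and the remainder modulo 100 is the original two-digit value. Since the expression cannot be braced here, it seems wise to check the field first; the sketch below does that in a hypothetical helper named twoDigitToInt.

```tcl
# Hypothetical helper: convert a field of exactly two digits to an integer.
proc twoDigitToInt { field } {
    # The expr below is deliberately unbraced, so anything other than
    # two plain digits is refused before it can be substituted in.
    if {![regexp {^[0-9][0-9]$} $field]} {
        return -code error "expected two digits but got \"$field\""
    }
    # "08" becomes 108, and 108 % 100 is 8.
    return [expr 1$field % 100]
}

foreach field {00 08 09 12 59} {
    puts "$field -> [twoDigitToInt $field]"
}
```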

LV Right now, when I say:

 set abc "\1"

abc is set to an octal 001.

Once this TIP is implemented, what will happen to the above code? Is it going to change behavior?

RS: abc is set to a string of one character U+0001 (ASCII SOH), unaware of decimal, hex, or octal. This page is about parsing integers from strings, and U+0001 or any non-digit cannot be parsed as integer anyway :^)


Chris Nelson points out that octality afflicts not only Tcl.


snichols 02/26/07 I tested octal arithmetic in both Ruby's irb interpreter and Python's interpreter and they behave in a similar way. So, this seems to be a common pitfall with other scripting languages too:

Ruby's IRB

 irb(main):001:0> 09 + 100
 SyntaxError: compile error
 (irb):1: Illegal octal digit
 09 + 100
  ^

Python

 >>> 09 + 100
  File "<stdin>", line 1
    09 + 100
     ^
 SyntaxError: invalid token

Category Tutorial