Purpose: The Tcl maintainers are working on an issue where overflow of a 32-bit integer, followed by conversion to wide, can yield a literal with an inappropriate internal representation. KBK has been asked to start a Wiki page to track the discussions, since they are getting complex enough to be difficult to track in the Chat and email, and a Wiki page supports collaborative editing, unlike a SourceForge bug report.
Background: Tcl 8.4 has a serious bug [L1 ] in its processing of integer literals that exceed the range of a signed 32-bit word. For backward compatibility with earlier versions of Tcl, integers that fit into a 32-bit unsigned word are treated as 32-bit constants. These constants acquire a 32-bit internal representation that can later be sign-extended to a wide integer, and the resulting wide integer has an incorrect value.
The bug is truly insidious because it pollutes shared literals: once one piece of code has given a literal the bad internal representation, unrelated code that uses the same literal gets wrong results. The following illustrates the sort of bizarre results the literal pollution can cause.
    % proc b {} { set x 2200000000 ; puts [expr { wide($x) + 1 }] }
    % b
    2200000001
    % proc a {} { clock format 2200000000 }
    % a; b
    -2094967295
Certain changes to [string is integer] in 8.4.3 and later releases have made the following script also fail similarly, although it works from 8.0 to 8.4.2:
    % proc b {} { set x 2200000000 ; puts [expr { wide($x) + 1 }] }
    % b
    2200000001
    % proc a {} { string is integer 2200000000 }
    % a; b
    -2094967295
What's happening: The following table gives examples of each type of 32-bit integer conversion that's possible, and the notes below explain what is going on.
-----------------------------------------------------------------------------
Constant        Tcl 7.6       Tcl 8.x                                    Note
                and earlier
-----------------------------------------------------------------------------
-0x100000000    0x00000000    -- integer value too large to represent -- *1
-----------------------------------------------------------------------------
-0xffffffff     0x1           0x1                                        *2
-0x80000001     0x7fffffff    0x7fffffff                                 *2
-----------------------------------------------------------------------------
-0x80000000     0x80000000    0x80000000                                 *3
-0x7fffffff     0x80000001    0x80000001                                 *3
-0x1            0xffffffff    0xffffffff                                 *3
-----------------------------------------------------------------------------
-0x0            0x0           0x0                                        *4
0x0             0x0           0x0                                        *4
0x1             0x1           0x1                                        *4
0x7fffffff      0x7fffffff    0x7fffffff                                 *4
-----------------------------------------------------------------------------
0x80000000      0x80000000    0x80000000                                 *5
0xffffffff      0xffffffff    0xffffffff                                 *5
-----------------------------------------------------------------------------
0x100000000     0x0           -- integer value too large to represent -- *6
-----------------------------------------------------------------------------
The significant cases above are 2 (constants from -0xffffffff through -0x80000001) and 5 (constants from 0x80000000 through 0xffffffff). In both cases, ordinary sign extension of the "integer" internal representation to "wide" yields an incorrect value. In both cases, the "wide" value is correct if the upper 32 bits are instead filled with the complement of the sign bit. The following table gives examples for each case.
Constant      Incorrect             Correct               Note
-------------------------------------------------------------------
-0xffffffff   0x0000000000000001    0xffffffff00000001    *2
-0x80000001   0x000000007fffffff    0xffffffff7fffffff    *2
-0x80000000                         0xffffffff80000000    *3
-0x00000001                         0xffffffffffffffff    *3
-0x00000000                         0x0000000000000000    *4
0x00000000                          0x0000000000000000    *4
0x00000001                          0x0000000000000001    *4
0x7fffffff                          0x000000007fffffff    *4
0x80000000    0xffffffff80000000    0x0000000080000000    *5
0xffffffff    0xffffffffffffffff    0x00000000ffffffff    *5
-------------------------------------------------------------------
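The arithmetic behind the "Incorrect" and "Correct" columns can be sketched in C. This is an illustration only, not code from the Tcl Core; the function names widen_sign_extend and widen_overflowed are hypothetical. Filling the upper 32 bits with the complement of the sign bit is the same as adding 2**32 to a negative 32-bit value or subtracting 2**32 from a non-negative one, which is how the sketch expresses it.

```c
#include <stdint.h>

/* Ordinary widening: the upper 32 bits copy the sign bit (bit 31).
 * This produces the "Incorrect" column for cases 2 and 5. */
int64_t widen_sign_extend(int32_t v)
{
    return (int64_t)v;
}

/* Widening for a literal that overflowed on input (cases 2 and 5):
 * the upper 32 bits become the complement of bit 31. */
int64_t widen_overflowed(int32_t v)
{
    if (v < 0) {
        /* Case 5: upper bits must be zero, so zero-extend. */
        return (int64_t)(uint32_t)v;
    }
    /* Case 2: upper bits must be all ones, i.e. subtract 2**32. */
    return (int64_t)v - INT64_C(0x100000000);
}
```

For example, the 32-bit pattern 0x80000000 (INT32_MIN) widens to 0x0000000080000000 with widen_overflowed, matching the first row of case 5.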
So, how to fix this bug? KBK's initial thought is to introduce another object type in tclObj.c: tclOverflowedIntType. This object type will represent objects that were converted on input to 32-bit integers with overflow. It will behave identically to tclIntType in that Tcl_GetIntFromObj will return the 32-bit value. But the places in the Core where an integer representation is retrieved and then sign extended to wide will change to extend with the complement of the sign bit, as shown above for cases 2 and 5.
It is KBK's belief that this change should not break existing scripts, since they will see the same 32-bit behavior that they did before. It should also not break existing extensions, even those that reach into the internal representation; at worst, an extension that does not recognize the new type will make needless calls to convert it.
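A minimal sketch of that idea follows. It does not use the real Tcl_Obj or Tcl_ObjType structures: SketchObj, SketchType, sketch_get_int and sketch_get_wide are hypothetical stand-ins, and the value field merely models a 32-bit internalRep.longValue. The point is only that 32-bit retrieval is the same for both types, while wide retrieval branches on the type.

```c
#include <stdint.h>

/* Hypothetical stand-ins for tclIntType and the proposed
 * tclOverflowedIntType; not the real Tcl_ObjType structures. */
typedef enum { SKETCH_INT, SKETCH_OVERFLOWED_INT } SketchType;

typedef struct {
    SketchType type;
    int32_t value;      /* models a 32-bit internalRep.longValue */
} SketchObj;

/* 32-bit retrieval ignores the type, so old behavior is preserved. */
int32_t sketch_get_int(const SketchObj *obj)
{
    return obj->value;
}

/* Wide retrieval: ordinary sign extension for a plain integer,
 * complemented upper word for an overflowed one. */
int64_t sketch_get_wide(const SketchObj *obj)
{
    int64_t wide = (int64_t)obj->value;

    if (obj->type == SKETCH_OVERFLOWED_INT) {
        /* Flipping the upper 32 bits is adding or subtracting 2**32. */
        wide += (obj->value < 0) ? INT64_C(0x100000000)
                                 : -INT64_C(0x100000000);
    }
    return wide;
}
```

With this model, an overflowed object holding the bit pattern of 0xffffffff still yields -1 from sketch_get_int, but yields 0x00000000ffffffff rather than -1 from sketch_get_wide.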
One remaining issue with this idea is the question of what Tcl_ConvertToType should do if requested to convert one of these overflowed integers to tclIntType. KBK believes that the most backward-compatible action is probably to have it silently convert to tclOverflowedIntType instead; any code that is expecting the internal representation afterward will see the correct data in objPtr->internalRep.longValue and will only notice the difference if it explicitly checks objPtr->typePtr. A riskier alternative is to return TCL_ERROR with a message indicating that the value is too large to represent. KBK believes that both alternatives are low-risk, because there are few if any callers of Tcl_ConvertToType: no Core caller ever requests an explicit integer conversion in this manner.
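The two alternatives can be contrasted with a small model. Everything here is hypothetical: SketchObj, the SKETCH_OK and SKETCH_ERROR codes and both functions are invented for illustration, and the real Tcl_ConvertToType works on a Tcl_Obj and a Tcl_ObjType pointer rather than on this simplified struct.

```c
#include <stdint.h>

#define SKETCH_OK    0   /* stands in for TCL_OK */
#define SKETCH_ERROR 1   /* stands in for TCL_ERROR */

typedef enum { SKETCH_INT, SKETCH_OVERFLOWED_INT } SketchType;

typedef struct {
    SketchType type;
    int32_t value;      /* models a 32-bit internalRep.longValue */
} SketchObj;

/* First alternative: a request for the plain integer type succeeds,
 * but an overflowed object quietly keeps its overflowed type. */
int sketch_convert_to_int_silent(SketchObj *obj)
{
    (void)obj;          /* the 32-bit value is already what callers expect */
    return SKETCH_OK;
}

/* Second alternative: the same request fails for an overflowed
 * object, as if the value were too large to represent. */
int sketch_convert_to_int_strict(SketchObj *obj)
{
    if (obj->type == SKETCH_OVERFLOWED_INT) {
        return SKETCH_ERROR;
    }
    return SKETCH_OK;
}
```

Under the first alternative a caller that reads only the value sees no change; only a caller that inspects the type afterward can tell that the conversion did not produce the plain integer type.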