Delimiting Numbers

WJP I find large numbers hard to read without group delimiters, but Tcl's format command doesn't provide automatic delimiter insertion the way some recent versions of C printf do. Here's a Tcl procedure that does the job. It defaults to a group size of 3 and comma as the delimiter but has optional arguments that allow other choices for locales that use other delimiters and group sizes.

# Given a number represented as a string, insert delimiters to break it up for
# readability. Normally, the delimiter will be a comma which will be inserted every
# three digits. However, the delimiter and groupsize are optional arguments,
# permitting use in other locales.
#
# The string is assumed to consist of digits, possibly preceded by spaces,
# and possibly containing a decimal point, i.e.: [:space:]*[:digit:]*\.[:digit:]*

proc DelimitNumber {number {delim ","} {GroupSize 3}} {
    # First, extract right hand part of number, up to and including decimal point
    set point [string last "." $number];
    if {$point >= 0} {
        set PostDecimal [string range $number [expr $point + 1] end];
        set PostDecimalP 1;
    } else {
        set point [expr [string length $number] + 1]
        set PostDecimal "";
        set PostDecimalP 0;
    }

    # Now extract any leading spaces.
    set ind 0;
    while {[string equal [string index $number $ind] \u0020]} {
        incr ind;
    }
    set FirstNonSpace $ind;
    set LastSpace [expr $FirstNonSpace - 1];
    set LeadingSpaces [string range $number 0 $LastSpace];

    # Now extract the non-fractional part of the number, omitting leading spaces.
    set MainNumber [string range $number $FirstNonSpace [expr $point -1]];

    # Insert commas into the non-fractional part.
    set Length [string length $MainNumber];
    set Phase  [expr $Length % $GroupSize]
    set PhaseMinusOne  [expr $Phase -1];
    set DelimitedMain "";

    #First we deal with the extra stuff.
    if {$Phase > 0} {
        append DelimitedMain [string range $MainNumber 0 $PhaseMinusOne];
    }
    set FirstInGroup $Phase;
    set LastInGroup [expr $FirstInGroup + $GroupSize -1];
    while {$LastInGroup < $Length} {
        if {$FirstInGroup > 0} {
            append DelimitedMain $delim;
        }
        append DelimitedMain [string range $MainNumber $FirstInGroup $LastInGroup];
        incr FirstInGroup $GroupSize
        incr LastInGroup  $GroupSize
    }

    # Reassemble the number.
    if {$PostDecimalP} {
        return [format "%s%s.%s" $LeadingSpaces $DelimitedMain $PostDecimal];
    } else {
        return [format "%s%s" $LeadingSpaces $DelimitedMain];
    }
}

While regular expressions are good at many things, they are not very good when you need to count things. Using regexps is also often slower. DelimitNumber is faster, though only by a small margin, than commify below. This is probably a case in which direct parsing is a better approach than using regular expressions.


See also commas added to numbers which features several shorter (though likely slower) implementations, such as below:

 proc commify { num {sep ,} {groupSize 3}} {
    while {[regsub "^(\[-+]?\\d+)(\\d{$groupSize})" $num "\\1$sep\\2" num]} {}
    return $num
 }

AMG: There's more commas-in-numbers stuff on SYStems's page. It should probably be moved here.


aspect notes that some locales use a non-uniform spacing for separators: India is one example. A neat syntax for supporting these would be something like DelimitNumber $x , {3 2}, but there are probably systems out there which follow repeating patterns or even more bizarre variations.


RS 2016-06-09: Here is an implementation for Indian number formatting - tested to work well with positive and negative integers:

proc in_number num {
   set res ""
   set crore [expr {int($num/10000000.)}]
   if {$crore != 0} {
      append res $crore,
      set num [expr {abs($num - $crore*10000000)}]
   }
   set lakh [expr {int($num/100000.)}]
   if {$crore!=0} {set lakh [format %02d $lakh]}
   if {$lakh != 0 || $crore != 0} {
      append res $lakh,
      set num [expr {abs($num - $lakh*100000)}]
   }
   set k [expr {int($num/1000.)}]
   if {$lakh != 0 || $crore != 0} {set k [format %02d $k]}
   if {$k != 0 || $res ne ""} {
      append res $k,
      set num [expr {abs($num - $k*1000)}]
   }
   if {$res ne ""} {set num [format %03d $num]}
   append res $num
   return $res
}

More compact:

proc in_number num {
   set res ""
   foreach i {10000000 100000 1000} {
      set x [expr {int($num/double($i))}]
      if {$res ne ""} {set x [format %02d $x]}
      if {$x != 0 || $res ne ""} {
         append res $x,
         set num [expr {abs($num - $x * $i)}]
      }
   }
   if {$res ne ""} {set num [format %02d $num]}
   append res $num
   return $res
}

Testing:

239 % in_number -1234567
-12,34,567
56 % in_number 1234567
12,34,567
62 % in_number 12345678
1,23,45,678
62 % in_number 123456789
12,34,56,789
63 % in_number -123456789
-12,34,56,789

APN 2016-06-10: On Windows TWAPI provides locale specific number formatting. This can also be overridden with options.

% format_number 100000000 0  -idigits 0
100,000,000
% format_number 100000000 0  -idigits 0 -sgrouping 4
1,0000,0000
% format_number 100000000 0  -idigits 0 -sgrouping 32
10,00,00,000