Version 42 of format

Updated 2016-06-05 08:20:57 by ten

format, a built-in Tcl command, formats a string in the style of sprintf.

Synopsis

format formatString ?arg arg ...?

Documentation

man page

Description

This command generates a formatted string in a manner similar to the ANSI C sprintf procedure (it uses sprintf in its implementation). formatString indicates how to format the result, using % conversion specifiers as in sprintf, and the additional arg values, if any, provide values to be substituted into formatString. The result of format is the formatted string.

Without a size modifier, the integer conversion types are all subject to underflow/overflow, because the value is first truncated to a fixed-width range. The ll size modifier configures a conversion type to operate on an integer of arbitrary size.
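As a small illustration of the ll modifier, here is a minimal sketch assuming Tcl 8.5 or later. The variable names big, full and hex are made up for the example.

```tcl
# A value far beyond 64 bits; the name "big" is only for illustration.
set big 123456789012345678901234567890

# With the ll modifier the value is formatted without truncation.
set full [format %lld $big]
puts $full

# The same modifier applies to other integer conversions, e.g. hexadecimal.
set hex [format %llx 255]
puts $hex
```

What a plain %d does with the same value depends on the Tcl version in use, so it is worth trying in your own interpreter.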

Maximum Width for Numbers

By default, format doesn't provide a way to specify a maximum number of digits when using a numeric conversion type such as %d. You could specify the format as a %s and then provide a maximum number of characters, or you could write Tcl code to check for the maximum. Jonathan Bromley, comp.lang.tcl 2007-09, posted the following code (modernized here), which provides a first cut at a -strict initial argument to format.

proc strictformat {fmt value} {
    set f [format $fmt $value]
    # Without an explicit field width there is nothing to enforce.
    if {![regexp {%(\d+)} $fmt -> maxwidth]} {
        return $f
    }
    if {[string length $f] > $maxwidth} {
        return [string repeat * $maxwidth]
    } else {
        return $f
    }
}
rename format _format
proc format args {
    if {[lindex $args 0] eq {-strict}} {
        # Drop the -strict flag itself before delegating.
        strictformat {*}[lrange $args 1 end]
    } else {
        _format {*}$args
    }
}

Make Unsigned Values

You can use format to produce unsigned integers for display (but don't reckon with them - for expr they're still signed!):

% format %u -1
4294967295

See floating-point formatting for discussion on how to write format strings to handle floats...

DKF: Note that 8.5 makes this sort of thing much less necessary as we can now handle arbitrary width integers.

Nice-Looking Floats

To make numbers look nice:

# 9.0 forces floating-point division even when temperature_cel is an integer
set fah [format {%0.2f} [expr {$temperature_cel * 9.0 / 5 + 32}]]

Color Formatting

set color [format #%02x%02x%02x $r $g $b]
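A minimal sketch of the line above with concrete values. The component values and the names r2, g2 and b2 are assumptions for illustration, and the scan call is just one possible way to reverse the conversion.

```tcl
# Hypothetical component values, each expected to be in the range 0..255.
set r 255
set g 128
set b 0

# %02x renders each component as two lowercase hex digits, zero-padded.
set color [format #%02x%02x%02x $r $g $b]
puts $color

# scan with matching %2x fields splits the string back into integers.
scan $color #%2x%2x%2x r2 g2 b2
puts "$r2 $g2 $b2"
```

Components outside 0..255 would produce more than two hex digits each, so they should be clamped or validated beforehand.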

Converting Characters

A limited formatting of decimals to characters is available in other languages, e.g. CHR() in Basic. If you use that more often, here's a cute shortcut:

interp alias {} chr {} format %c
% set a [chr 49][chr 48]
10

Abbreviating Integers

See Narrow formatting for short rendering of big integers, with powers of 1024:

% fixform 12345678
11.7M

Understanding Formats

This procedure takes a format string and explains its structure. This is a quick sketch:

proc explainFormat {formatStr vars} {
    set index 1
    foreach frm [split $formatStr %] {
        set extra {} 
        set size 0
        regexp {([0-9]+)([duioxXcsfegG])(.*)} $frm => size type extra
        if {$size == 0} {
            set size [string length $frm]
        } else {
            set frm "%$size$type [lindex $vars 0]"
            set vars [lrange $vars 1 end]
            set size [string trimleft $size 0]
        }
        for {set i 0} {$i < 2} {incr i} {
            set newIndex [expr {$size +$index -1}]
            puts "$index-$newIndex '$frm'"
            set index [expr {$newIndex +1}]
            if {$extra eq {}} {
                break
            } else {
                set frm $extra
                set size [string length $extra]
            }
        }
    }
}
% explainFormat hello%02s000%3d {$a $b}
1-5 'hello'
6-7 '%02s $a'
8-10 '000'
11-13 '%3d $b'

Emulating Fortran

RS 2007-09-04: Here's emulating the Fortran behavior that numbers too large for the format are marked as an asterisk string:

proc strictformat {fmt value} {
    set f [format $fmt $value]
    regexp {%(\d+)} $fmt -> maxwidth
    if {[string length $f]>$maxwidth} {
        return [string repeat * $maxwidth]
    } else {return $f}
}

Testing:

% strictformat %5.2f 12.345
12.35
% strictformat %5.2f 123.45
*****
% strictformat %5.2f 12345.67
*****

Restricting Floats

While using Tcl 8.5, you will begin to see strings like 0.0052499999999999995 where before you were seeing values like 0.00525. To round the value to a shorter value, try something like:

format %.3g 0.0052499999999999995
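To show what the precision means in each conversion, here is a small sketch. The variable names are made up for the example.

```tcl
set x 0.0052499999999999995

# For %g the precision counts significant digits.
set sig [format %.3g $x]
puts $sig

# For %f the precision counts digits after the decimal point.
set fixed [format %.3f $x]
puts $fixed
```

Here %.3g gives 0.00525 while %.3f gives 0.005, so %g is usually the better fit when the magnitude of the value is not known in advance.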

See also: Floating-point formatting

Rebasing

Don Porter, comp.lang.tcl, in reply to a question about how to go from base 10 to another base, such as 2 or 16, using arbitrarily large numbers in Tcl 8.5:

> I would have guessed that format %x should do the job, but apparently
> it's currently limited to 64 bits...

% format %llx 1234567890123456789012345
1056e0f36a6443de2df79

DrASK: Those are ELLs above in %llx. Not {percent eleven lower-case-x}, but rather {percent, ell, ell, lower-case-x}.

Digit Grouping

Digit grouping can make numbers with many digits easier to read.

ET: While I never liked the language Ada, it did have an idea that I wish had caught on: the optional use of an underscore character in large numerical constants, to make the numbers readable. (And trivial for a compiler or interpreter to scan/parse.)

So, 1_234_567 is a number that is as readable as 1,234,567 and is much better than 1234567. I have a handy little converter (which I stole from somewhere on this wiki and modified):

proc commas {var {num 3} {char ,}} {
    set len   [string length $var]
    set first [expr {$len - $num}]
    set x     {}
    while {$len > 0} {
        # grab the rightmost num chars
        set lef [string range $var $first end]
        if {[string length $x] > 0} {
            set x   "${lef}$char${x}"
        } else {
            set x   $lef
        }
        # keep everything except the rightmost num chars
        set var [string range $var 0 [expr {$first - 1}]]
        set len   [string length $var]
        set first [expr {$len - $num}]
    }
    return $x
}

Here are some examples of its use:

% dec2bin 987654
11110001001000000110

% commas [dec2bin 987654] 4 _
1111_0001_0010_0000_0110

% commas [dec2bin 987654] 4 { }
1111 0001 0010 0000 0110

% commas [dec2bin 987654] 1 { }
1 1 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0

% commas 123456789     ;# naturally, it defaults for use with large decimal integers
123,456,789

% commas 123456789 3 _  ;# and here's how I wish numbers could be entered, in Tcl and in C etc.
123_456_789

% puts "0x[commas [format %08X 123456789] 4 _]"    ;# and for hex numbers as well
0x075B_CD15

ten: modified to make it work with decimal points:

% commas 101135130.01
101,135,130.01

proc commas {var {num 3} {char ,}} {
    set dec ""
    # split off an optional fractional part so only the integer digits are grouped
    regexp {(\d+)(\.\d+)?} $var tmp var dec
    set len   [string length $var]
    set first [expr {$len - $num}]
    set x     {}
    while {$len > 0} {
        # grab the rightmost num chars
        set lef [string range $var $first end]
        if {[string length $x] > 0} {
            set x   "${lef}$char${x}"
        } else {
            set x   $lef
        }
        # keep everything except the rightmost num chars
        set var [string range $var 0 [expr {$first - 1}]]
        set len   [string length $var]
        set first [expr {$len - $num}]
    }
    return $x$dec
}
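For plain decimal numbers, the same grouping can also be sketched with a regsub loop instead of explicit index arithmetic. This is only an alternative sketch, not a replacement for commas: the name groupDigits is hypothetical, the group size is fixed at three, and the separator must not contain & or a backslash, since regsub treats those specially in the substitution string.

```tcl
# groupDigits is a hypothetical name. It handles plain decimal notation
# with an optional sign and an optional fractional part.
proc groupDigits {num {sep ,}} {
    # Each pass splits the last three digits off the leading digit run.
    # regsub returns 0 once no run of four or more digits remains at the start.
    while {[regsub {^([-+]?\d+)(\d{3})} $num "\\1$sep\\2" num]} {}
    return $num
}

puts [groupDigits 1234567]
puts [groupDigits -101135130.01]
puts [groupDigits 123456789 _]
```

Because the pattern is anchored at the start of the string, the digits after the decimal point are left untouched and a leading sign is preserved.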

Misc

LV: The man page for 8.4 is missing examples. 8.5 is better, but I'm looking for an example of the following. I have a report line that I am trying to fill out. It consists of a time stamp, a date stamp, and 2 text strings. Each of these items must begin in a specific column.

set g OHIO
set fmtg [format %-25.25s $g]
puts [string length $fmtg]

The man page is complex enough that I want to be certain that I am not missing something. This seems to ensure that if g is longer than 25 characters, it is truncated, and if it is shorter than 25 characters, that it is left justified and blank padded. Are there any gotchas of which I need to be aware?
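A quick way to check the behaviour LV describes is to run the same specifier over one short and one long value. This is a minimal sketch; the sample strings and the names short and long are made up for the example.

```tcl
# Two sample values: one shorter and one longer than the 25-character field.
set short OHIO
set long  "A state name that is much too long for the field"

foreach g [list $short $long] {
    # - left-justifies, the first 25 is the minimum width (padding),
    # and the .25 precision is the maximum number of characters kept.
    set fmtg [format %-25.25s $g]
    # Brackets make the padding visible.
    puts "\[$fmtg\] length [string length $fmtg]"
}
```

Both results are exactly 25 characters long: the short value is padded on the right with blanks and the long one is cut after its 25th character.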

[TODO: Explain XPG positional format specifiers.]
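As a starting point for that TODO, here is a minimal sketch of XPG positional specifiers. A specifier of the form %n$ selects the n-th argument after the format string instead of the next unused one. According to the man page, if any specifier in a format string is positional then all of them must be. The sample values are made up.

```tcl
# %1$s refers to the first argument after the format string, %2$s to the second.
# Braces keep Tcl from treating $s as a variable reference.
set swapped [format {%2$s, %1$s} John Smith]
puts $swapped

# The same argument may be used more than once.
set twice [format {%1$d in hex is %1$x} 255]
puts $twice
```

This prints "Smith, John" and "255 in hex is ff". It is mostly useful for message catalogs, where translations need to reorder the arguments.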


dbohdan 2014-06-06: Observation: you could use format to do multiple ad hoc type assertions in the vein of assert [string is integer $var] with a single command. E.g.,

eltclsh > set a 5
eltclsh > set b 7
eltclsh > format %d%d $a $b
57
eltclsh > set a NaNNaNNaN
eltclsh > format %d%d $a $b
expected integer but got "NaNNaNNaN"

This may or may not be a bad idea.

See Also

Binary representation of numbers
Pure-Tcl implementations similar to format %b.