IEEE 754 float and double formats are curious beasts. The procedure below was written to help understand the binary encoding of these numbers by dissecting the bits and presenting their meaning. It uses Tcl's binary format command to obtain the encoding of a float and then interprets the bits according to the spec. Tcl 8.6's knowledge of binary numbers (format %b and 0b11010010 notation) is exploited to make the code a little easier to read.
Binary128 ("quadruple precision") is not supported because the binary command doesn't know about it yet: if you're really curious, this should be easily remedied with a small Critcl or tcc4tcl proc.
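The core idea, obtaining the encoding and slicing it into fields, can be sketched for binary32 as follows. This is a minimal illustration assuming Tcl 8.6; the helper name fields32 is hypothetical and not part of the code on this page.

```tcl
# Hypothetical helper (not from this page): split a number's binary32
# encoding into its sign, exponent and mantissa bit strings.
proc fields32 {x} {
    # R = big-endian 32-bit float; B32 = 32 bits, most significant first
    binary scan [binary format R $x] B32 bits
    list [string index $bits 0] \
         [string range $bits 1 8] \
         [string range $bits 9 31]
}

lassign [fields32 0.15625] s e m
puts "$s $e $m"    ;# 0 01111100 01000000000000000000000

# Reassemble a normal number from its fields:
# (-1)**sign * (1 + mantissa/2**23) * 2**(exponent - 127)
set exp [expr {0b01111100 - 127}]    ;# same bits as $e, written as a literal
set man 0b$m
puts [expr {($s ? -1 : 1) * (1 + $man / 2.**23) * 2.**$exp}]    ;# 0.15625
```

The exponent field is stored with a bias of 127 for binary32, which is why 01111100 (124) stands for 2**-3.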
You can see what values look like in different precisions:

    % set pi [expr atan(1)*4]
    % seefloat $pi
    Value: 3.141592653589793 (0 10000000 10010010000111111011011)
    Sign: +
    Exponent: 1
    Mantissa: 4788187
    Significand: 1 + (4788187 / (2.**23))
    Expression: (1 + (4788187 / (2.**23))) * 2.**1
    Re-calculated value: 3.1415927410125732
    % seefloat $pi double
    Value: 3.141592653589793 (0 10000000000 1001001000011111101101010100010001000010110100011000)
    Sign: +
    Exponent: 1
    Mantissa: 2570638124657944
    Significand: 1 + (2570638124657944 / (2.**52))
    Expression: (1 + (2570638124657944 / (2.**52))) * 2.**1
    Re-calculated value: 3.141592653589793
It also handles very small numbers (true denormals are additionally flagged with a "Denormal number" line):

    % seefloat .1+.2-.3 ;# a rounding residue very close to 0; tiny, but still a normal binary32 value
    Value: 5.551115123125783e-17 (0 01001001 00000000000000000000000)
    Sign: +
    Exponent: -54
    Mantissa: 0
    Significand: 1 + (0 / (2.**23))
    Expression: (1 + (0 / (2.**23))) * 2.**-54
    Re-calculated value: 5.551115123125783e-17
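A binary32 value is denormal (subnormal) exactly when its exponent field is all zeros and its mantissa is non-zero, which happens for magnitudes below 2**-126. The check can be sketched as follows, assuming Tcl 8.6; isdenormal32 is a hypothetical helper name, not something defined on this page.

```tcl
# Hypothetical helper: true if x encodes as a binary32 denormal,
# i.e. exponent field all zeros and mantissa non-zero.
proc isdenormal32 {x} {
    binary scan [binary format R $x] B32 bits
    set exp 0b[string range $bits 1 8]
    set man 0b[string range $bits 9 31]
    expr {$exp == 0 && $man != 0}
}

puts [isdenormal32 1e-40]               ;# 1 (below 2**-126)
puts [isdenormal32 [expr {2.**-54}]]    ;# 0 (small, but normal)
puts [isdenormal32 0.0]                 ;# 0 (zero is not denormal)
```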
.. but NaN and Inf are not handled very well (TODO -- see below).
    proc tobinary {s} {     ;# literal bytes to binary sequence
        binary scan $s B* d
        return $d
    }

    proc truncfloat {n} {
        binary scan [binary format R $n] R f
        return $f
    }

    proc fitsinbinary32 {x} {
        expr {$x == [truncfloat $x]}
    }

    proc seefloat {x {type "binary32"}} {
        if {$x ne "NaN"} {
            set x [expr $x]
        }
        # FMT: [binary scan/format] code
        # EBITS: bits in exponent
        # MBITS: bits in mantissa
        set swbody {
            binary32 - float {
                set FMT R
                set EBITS 8
                set MBITS 23
            }
            binary64 - double {
                set FMT Q
                set EBITS 11
                set MBITS 52
            }
        }
        set ERRFMT "Unknown type '%s', must be one of [join [dict keys $swbody] ,\ ]."
        lappend swbody default {
            return -code error [format $ERRFMT $type]
        }
        switch $type $swbody

        # Exponent bias
        set EBIAS [expr {2**($EBITS-1)-1}]
        # Maximum exponent (means value is Inf)
        set EMAX 0b[string repeat 1 $EBITS]
        # Top bit of mantissa (quiet vs signalling NaN)
        set MTOP [expr {2**($MBITS-1)}]

        set bits [tobinary [binary format $FMT $x]]
        set fbits [concat \
            [string range $bits 0 0] \
            [string range $bits 1 $EBITS] \
            [string range $bits $EBITS+1 end]]

        set sgn [string range $bits 0 0]
        set exp 0b[string range $bits 1 $EBITS]
        set man 0b[string range $bits $EBITS+1 end]

        set sgn [expr {$sgn ? "-" : "+"}]
        set denormal false
        set isnan false
        set isinf false
        set sig [format "%d / (2.**%d)" $man $MBITS]
        if {$exp == $EMAX} {
            if {$man == 0} {
                puts "${sgn}Inf"
            } else {
                if {$man & $MTOP} {
                    puts "quiet NaN"
                } else {
                    puts "signalling NaN"
                }
            }
            return
        } elseif {$exp == 0} {
            set denormal true
            set exp [expr {1 - $EBIAS}]
        } else {
            set exp [expr {$exp - $EBIAS}]
            set sig "1 + ($sig)"
        }
        set expr "($sig) * 2.**$exp"
        puts "Value: $x ($fbits)"
        puts "Sign: $sgn"
        puts [format "Exponent: %d" $exp]
        puts [format "Mantissa: %d" $man]
        if {$denormal} {
            puts "Denormal number"
        }
        puts "Significand: $sig"
        puts "Expression: $expr"
        puts "Re-calculated value: [expr $expr]"
    }
This experiment has uncovered an obscure bug in binary format with binary32 targets. Inf comes out as the largest finite binary32 value rather than as infinity:
    % seefloat Inf
    Value: Inf (0 11111110 11111111111111111111111)
    Sign: +
    Exponent: 127
    Mantissa: 8388607
    Significand: 1 + (8388607 / (2.**23))
    Expression: (1 + (8388607 / (2.**23))) * 2.**127
    Re-calculated value: 3.4028234663852886e+38
See http://core.tcl.tk/tcl/tktview/85ce4bf92 for more detail and an attempted fix.
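One way to explore the special encodings independently of that bug is to classify a hand-written bit pattern instead of going through binary format. The following is only a sketch, assuming Tcl 8.6 and binary32 field widths; classify32 is a hypothetical helper and takes a 32-character string of 0s and 1s.

```tcl
# Hypothetical helper: classify a 32-character bit string as a binary32 value.
proc classify32 {bits} {
    set sgn [expr {[string index $bits 0] ? "-" : "+"}]
    set exp 0b[string range $bits 1 8]
    set man 0b[string range $bits 9 31]
    if {$exp == 0xFF} {
        # All-ones exponent: Inf if the mantissa is zero, otherwise NaN
        if {$man == 0} { return "${sgn}Inf" }
        # Top mantissa bit distinguishes quiet from signalling NaN
        if {$man & (1 << 22)} { return "quiet NaN" }
        return "signalling NaN"
    }
    if {$exp == 0} {
        if {$man == 0} { return "${sgn}zero" }
        return "denormal"
    }
    return "normal"
}

# The true encoding of +Inf: sign 0, exponent all ones, mantissa zero
puts [classify32 0[string repeat 1 8][string repeat 0 23]]     ;# +Inf
# The pattern produced in the transcript above: the largest finite value
puts [classify32 0[string repeat 1 7]0[string repeat 1 23]]    ;# normal
```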
It would be interesting to also examine bounds on representation, and how this relates to IEEE binary float to string conversion.
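As a starting point for the bounds, the extreme finite values follow directly from the field widths. This is a rough sketch assuming Tcl 8.6; floatbounds is a hypothetical helper, and its arguments mirror the EBITS and MBITS variables used in seefloat.

```tcl
# Hypothetical helper: representable bounds of a binary float format,
# given the number of exponent bits and mantissa bits.
proc floatbounds {EBITS MBITS} {
    set EBIAS [expr {2**($EBITS-1) - 1}]
    dict create \
        max         [expr {(2 - 2.**(-$MBITS)) * 2.**$EBIAS}] \
        minnormal   [expr {2.**(1 - $EBIAS)}] \
        mindenormal [expr {2.**(1 - $EBIAS - $MBITS)}]
}

puts [floatbounds 8 23]     ;# binary32
puts [floatbounds 11 52]    ;# binary64
```

For binary32 the max entry is the same 3.4028234663852886e+38 that appears in the Inf transcript above.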
aspect kicked this page off in a fit of idle curiosity.
arjen - 2015-02-01 11:17:14
You might also want to have a look at the Tcllib module "math::machineparameters" - it is the Tcl equivalent to the LAPACK routine DLAMCH.