IEEE binary float to string conversion

With Tcl 8.5, it is less important to use the code on this page because you can use the r/R and q/Q specifiers to binary scan and binary format to handle floating point values of specified endianness. See the detailed documentation for more information.


From a post in news:comp.lang.tcl :

FPX: Floating point values are usually transferred in IEEE format. IEEE 754-1985, "IEEE Standard for Binary Floating-Point Arithmetic" [1 ] defines 32-bit and 64-bit encoding formats for floating-point numbers.

Normally, such values can be interpreted using Tcl's binary command, using the "f" and "d" formats. However, there is a catch: Tcl ultimately depends on the encoding of the "float" and "double" data types in the C language, and according to ISO C, the encoding of these data types is implementation dependent.

Consequently, the manual for binary warns that input and output of the "f" and "d" formats is not portable.

I have written the following code to read an IEEE float value, in case that your machine doesn't use IEEE natively. It also supports a "byteorder" flag that allows to read the value whether in big-endian or little-endian byteorder.

CMP What about denormalised values, +-infinity and NaN? How to handle numbers that are too big, infinity or limit to the largest representable number as the native conversion does?

PWQ 20 Jul 05, By definition an IEEE float is always represented one way. IMNHO you should call these IEEE-2-native based on byte order of host.

CMP Indeed, the byteorder is not related to the float encoding. It is much clearer to leave it out of this discussion.

 proc IEEE2float {data byteorder} {
    if {$byteorder == 0} {
        set code [binary scan $data cccc se1 e2f1 f2 f3]
    } else {
        set code [binary scan $data cccc f3 f2 e2f1 se1]
    }
    
    set se1  [expr {($se1 + 0x100) % 0x100}]
    set e2f1 [expr {($e2f1 + 0x100) % 0x100}]
    set f2   [expr {($f2 + 0x100) % 0x100}]
    set f3   [expr {($f3 + 0x100) % 0x100}]
    
    set sign [expr {$se1 >> 7}]
    set exponent [expr {(($se1 & 0x7f) << 1 | ($e2f1 >> 7))}]
    set f1 [expr {$e2f1 & 0x7f}]
    
    set fraction [expr {double($f1)*0.0078125 + \
            double($f2)*3.0517578125e-05 + \
            double($f3)*1.19209289550781e-07}]
    
    set res [expr {($sign ? -1. : 1.) * \
            pow(2.,double($exponent-127)) * \
            (1. + $fraction)}]
    return $res
 }

It expects a binary buffer containing an IEEE number and the byte order the number is in (0 for big-endian and 1 for little-endian).

3fa22435 yields 1.2667299509 (big-endian) or 6.1330860035e-07 (little).

Here's code for the reverse transformation, from a floating-point value to IEEE format:

 proc float2IEEE {val byteorder} {
    if {$val > 0} {
        set sign 0
    } else {
        set sign 1
        set val [expr {-1. * $val}]
    }

    #
    # If the following math fails, then it's because of the logarithm.
    # That means that val is indistinguishable from zero.
    #

    if {[catch {
        set exponent [expr {int(floor(log($val)/0.69314718055994529))+127}]
        set fraction [expr {($val/pow(2.,double($exponent-127)))-1.}]
    }]} {
        set exponent 0
        set fraction 0.0
    } else {
        #
        # round off too-small values to zero, throw error for
        # too-large values
        #

        if {$exponent < 0} {
            set exponent 0
            set fraction 0.0
        } elseif {$exponent > 255} {
            error "value $val outside legal range for a float"
        }
    }

    set fraction [expr {$fraction * 128.}]
    set f1f      [expr {floor($fraction)}]
    set fraction [expr {($fraction - $f1f) * 256.}]
    set f2f      [expr {floor($fraction)}]
    set fraction [expr {($fraction - $f2f) * 256.}]
    set f3f      [expr {floor($fraction)}]

    set f1       [expr {int($f1f)}]
    set f2       [expr {int($f2f)}]
    set f3       [expr {int($f3f)}]

    set se1      [expr {($sign ? 128 : 0) | ($exponent >> 1)}]
    set e2f1     [expr {(($exponent & 0x1) << 7) | $f1}]

    if {$byteorder == 0} {
        set bytes [binary format cccc $se1 $e2f1 $f2 $f3]
    } else {
        set bytes [binary format cccc $f3 $f2 $e2f1 $se1]
    }

    return $bytes
 }

See also 1750A to Float Conversion for converting to and from a MIL-STD-1750A 32-bit floating number.

Michael Jacobson ~ [email protected]


CL intends to make time to demonstrate how the constants above introduce small imprecisions around the twelfth decimal place.


PC For those that want 64 bit double precision IEEE numbers use the following procedure:

 proc double2IEEE { value } {
        # covert value to double precision IEEE754 number
        if {$value > 0} {
                set sign 0
        } else {
                set sign 1
                set value [expr {-1. * $value}]
        }

        # If the following math fails, then it's because of the logarithm.
        # That means that value is indistinguishable from zero
        if {[catch {
                set exponent [expr {int(floor(log($value)/0.69314718055994529))+1023}]
                set fraction [expr {($value/pow(2.,double($exponent-1023)))-1.}]
        }]} {
                set exponent 0
                set fraction 0.0
        } else {
                # round off too-small values to zero, throw error for too-large values
                if {$exponent < 0} {
                        set exponent 0
                        set fraction 0.0
                } elseif {$exponent > 2047} {
                        error "value $value outside legal range for a float"
                }
        }

        set fraction [expr {$fraction * 16.}]
        set f1f      [expr {floor($fraction)}]

        set fraction [expr {($fraction - $f1f) * 256.}]
        set f2f      [expr {floor($fraction)}]

        set fraction [expr {($fraction - $f2f) * 256.}]
        set f3f      [expr {floor($fraction)}]

        set fraction [expr {($fraction - $f3f) * 256.}]
        set f4f      [expr {floor($fraction)}]

        set fraction [expr {($fraction - $f4f) * 256.}]
        set f5f      [expr {floor($fraction)}]

        set fraction [expr {($fraction - $f5f) * 256.}]
        set f6f      [expr {floor($fraction)}]

        set fraction [expr {($fraction - $f6f) * 256.}]
        set f7f      [expr {floor($fraction)}]

        for {set i 1} {$i <= 7} {incr i} {
                set var "f$i"
                append var "f"
                set f$i [expr {int([set $var])}]
        }

        set se1 [expr {($sign ? 128 : 0) | ($exponent >> 4)}]
        set e2f1 [expr {(($exponent & 15) * 16) | $f1}]

        set bytes [binary format cccccccc $f7 $f6 $f5 $f4 $f3 $f2 $e2f1 $se1]

        return $bytes
 }

CMB Here's another set of procedures for performing both 32-bit and 64-bit conversions. They accept and return strings representing hexadecimal and floating point numbers (e.g., 0x12345678, -123.4567). I didn't include the byte ordering routines and have the place holders commented out.


    proc hex2float {hex} {
        global tcl_platform
        #if {$tcl_platform(byteOrder) == "littleEndian"} { set hex [reverse4 $hex] }
        set sign [expr $hex >> 31]
        set exponent [expr ($hex >> 23) & 0xFF]
        set mantissa [expr $hex & ((1 << 23) -1)]
        set result [expr 1 + 1.0 * $mantissa / (1 << 23)]
        set result [expr {($sign ? -1.0 : 1.0)} * $result]
        if {$mantissa == 0 && $exponent == 0} {
            set result [expr $result * 0.0]
        } else {
            set result [expr $result * pow(2, $exponent - 127)]
        }
        return $result
    }

    proc hex2double {hex} {
        global tcl_platform
        #if {$tcl_platform(byteOrder) == "littleEndian"} { set hex [reverse8 $hex] }
        set sign [expr $hex >> 63]
        set exponent [expr ($hex >> 52) & 0x7FF]
        set mantissa [expr $hex & ((1 << 52) -1)]
        set result [expr 1 + 1.0 * $mantissa / (1 << 52)]
        set result [expr {($sign ? -1.0 : 1.0)} * $result]
        if {$mantissa == 0 && $exponent == 0} {
            set result [expr $result * 0.0]
        } else {
            set result [expr $result * pow(2, $exponent - 1023)]
        }
        return $result
    }

    proc float2hex {value} {
        global tcl_platform
        set sign [expr ($value < 0 ? 1 : 0)]
        if {$sign} {set value [expr -1.0 * $value]}
        if {[catch {
            set exponent [expr {int(floor(log($value)/0.69314718055994529))+127}]
            set mantissa [expr {($value/pow(2,double($exponent-127)))-1.}]
        }]} {
            set exponent 0
            set mantissa 0.0
        } else {
            if {$exponent < 0} {
                set exponent 0
                set mantissa 0.0
            } elseif {$exponent > 255} {
                error "value $value outside legal range for a float"
            }
        }
        set mantissa [expr int($mantissa * pow(2,23))]
        set result [format "0x%04X" [expr ($sign << 31) | ($exponent << 23) | ($mantissa)]]
        #if {$tcl_platform(byteOrder) == "littleEndian"} { set result [reverse4 $result] }
        return $result
    }

    proc double2hex {value} {
        global tcl_platform
        set sign [expr ($value < 0 ? 1 : 0)]
        if {$sign} {set value [expr -1.0 * $value]}
        if {[catch {
            set exponent [expr {entier(floor(log($value)/0.69314718055994529))+1023}]
            set mantissa [expr {($value/pow(2,double($exponent-1023)))-1.}]
        }]} {
            set exponent 0
            set mantissa 0.0
        } else {
            # round off too-small values to zero, throw error for too-large values
            if {$exponent < 0} {
                set exponent 0
                set mantissa 0.0
            } elseif {$exponent > 2047} {
                error "value $value outside legal range for a double"
            }
        }
        set upper [expr entier($mantissa * pow(2,20))]
        set lower [expr entier($mantissa * pow(2,52))]
        set upper [expr ($sign << 31) | ($exponent << 20) | ($upper)]
        set result [format "0x%04X%04X" $upper $lower]
        #if {$tcl_platform(byteOrder) == "littleEndian"} { set result [reverse8 $result] }
        return $result
    }