Version 44 of Binary representation of numbers

Updated 2011-01-12 21:01:02 by Chad

Problem: how to convert numbers between their decimal and binary representation.

I.e., we are looking for a pair of procs {int2bits, bits2int} such that

  • int2bits 6 -> {1 1 0}
  • bits2int {1 1 0} -> 6

 package require Tcl 8.5
 proc bits2int {bits} {
    return [expr 0b[join $bits {}]]
 }

TS Caveat, negative integers are not always carefully handled (while {$i>0}...construct). As Lars H pointed out.


Richard Suchenwirth suggested in c.l.t.

 proc int2bits {i} {
     #returns a bitslist, e.g. int2bits 10 => {1 0 1 0} 
     set res ""
     while {$i>0} {
         set res [expr {$i%2}]$res
         set i [expr {$i/2}]
     }
     if {$res==""} {set res 0}
     split $res ""
 }
 proc bits2int {bits} {
     #returns integer equivalent of a bitlist
     set res 0
     foreach i $bits {
         set res [expr {$res*2+$i}]
     }
     set res
 }

Joseph Collard added optional field width:

 proc int2bits {i {digits {} } } { 
        #returns a bitslist, e.g. int2bits 10 => {1 0 1 0} 
        # digits determines the length of the returned list (left truncated or added left 0 ) 
        # use of digits allows concatenation of bits sub-fields 

        set res "" 
        while {$i>0} { 
                set res [expr {$i%2}]$res 
                set i [expr {$i/2}] 
        } 
        if {$res==""} {set res 0} 

        if {$digits != {} } { 
                append d [string repeat 0 $digits ] $res 
                set res [string range $d [string length $res ] end ] 
        } 
        split $res ""                           
 } 

MS: Another approach is using the binary command:

 proc int2bits {i} {
     #returns a bitslist, e.g. int2bits 10 => {1 0 1 0} 
     binary scan [binary format I1 $i] B* x
     split [string trimleft $x 0] {}
 }
 proc bits2int {bits} {
     #returns integer equivalent of a bitlist
     set bits [format %032s [join $bits {}]]
     binary scan [binary format B* $bits] I1 x
     set x
 }

Cameron Laird rightly noted the platform dependence of the "%032s" part. To correct that, do instead

     # Miguel wrote this.
 proc int2bits {i} {
     #returns a bitslist, e.g. int2bits 10 => {1 0 1 0} 
     binary scan [binary format I1 $i] B* x
     split [string trimleft $x 0] {}
 }

 set tmp [llength [int2bits -1]]
     # What is this!?  Despite newcomers' instinct that there is some
     # special @-related syntax for regular expressions, the subtlety
     # of this next computation is pointed elsewhere.  Think of it this
     # way:  bits2int is a function of *two* variables, operating in quite
     # different spaces.  Along with the "bits" argument, bits2int depends
     # on the arithmetic domain of operation, or the "width" (32-bit, 64-bit,
     # ...) of native computations.
     # It would be easy enough to write bits2int as a function whose 
     # implementation calculates arithmetic-width each time.  That's
     # wasteful of cycles, in general, and arguably bad style.  What Miguel
     # wanted when he wrote this definition was a bits2int with a particular
     # proc body which is computed once on first scanning the definition.
     # 
     # So the proc body itself is a variable rather then the stereotypical
     #     "proc ... {
     #             body
     #      }"
     # There are several ways to compute such a string.  He chose to write
     # a template (the {# returns ...; set bits ...} part), and use regsub to 
     # substitute in the tiny sequence which is variable at this point.  Other
     # approaches rely on [list], ""-grouping, [format], [subst], and so on.
     #
     # To repeat, though--there's nothing magic about '@' and regsub.
     # It's just a mechanism for controlling the time of substitution or 
     # or evaluation--and any number of strings other than {@} would have
     # worked as far as regsub's syntax is concerned.
     #
     # What I *really* should do is augment the [proc] Wiki page to explain
     # mechanisms and situations for variable proc bodies.
 regsub @ {
     #returns integer equivalent of a bitlist
     set bits [format %0@s [join $bits {}]]
     binary scan [binary format B* $bits] I1 x
     set x
 } $tmp tmp
 proc bits2int {bits} $tmp

 unset tmp

Note: I'm leaving this for now, as an illustration. But it is *not* correct, the 32-bit dependence is still there, implicit in the "I" format specification. A correct approach needs to wait until there are 64 bit format specs for binary


'Nother variation: use format to convert to octal (or hexadecimal, for that matter), and "string map ..." to transform octal digits to bit patterns.

 proc int2bits x {
   string map {
   0 {0 0 0} 1 {0 0 1} 2 {0 1 0} 3 {0 1 1} 4 {1 0 0} 5 {1 0 1} 6 {1 1 0} 7 {1 1 1}
   } [split [format %o $x] ""]
 } ;#RS - note however that the bit sequence is multiples of 3 long:
 % int2bits 9
 0 0 1 0 0 1
 % int2bits 255
 0 1 1 1 1 1 1 1 1

Lars H: A simple variation on that, without leading zeros and about 30% faster, since it avoids the split:

 proc int2bits x {
     string map {
       +0 0 +1 1 +2 {1 0} +3 {1 1} +4 {1 0 0} +5 {1 0 1} +6 {1 1 0} +7 {1 1 1}
       0 { 0 0 0} 1 { 0 0 1} 2 { 0 1 0} 3 { 0 1 1} 4 { 1 0 0} 5 { 1 0 1} 6 { 1 1 0} 7 { 1 1 1}
     } [format +%o $x]
 }

Helmut Giese in c.l.t:

 proc val2Bin val {
     set binRep [binary format c $val]
     binary scan $binRep B* binStr
     return $binStr
 }

 # some tests
 puts [val2Bin 10]
 puts [val2Bin 129]
 puts [val2Bin 0x7F]

Arjen Markus The following fragment converts a hexadecimal representation to a floating-point number:

   # Convert between float and hex  
   #
   set hex "f3080000"
   set bin [binary format h8 $hex]
   binary scan $bin f float
   puts "$float"

On a big-endian machine the result is "1.0" (on a little-endian machine it is bizarre: 4.6006..e-041, but that is simply because the bytes are reversed.)


For extracting "unsigned" floating-point values from very long (>32 bits) integers in hex, MS has this recommendation on c.l.t.:

 proc conv {largeHex} {
     set res 0.0
     foreach hexDigit [split $largeHex {}] {
         set new 0x$hexDigit
         set res [expr {16.0*$res + $new}]
     }
     return $res
 }

Arjen Markus Suppose you have a sequence of bytes that are read from a file. Two of these, say at position 6 and 7 (counting from 0), make up an integer number. Then:

   set two_bytes [string range $str 6 7]
   binary scan "\0\0$two_bytes" I intvalue

will turn these two bytes into an integer (consisting of 4 bytes, hence the two leading nulls), if the original data are in big-endian order.

If they are in little-endian order, use:

   set two_bytes [string range $str 6 7]
   binary scan "${two_bytes}\0\0" i intvalue

(Note: trailing nulls and a different format)


Eric Amundsen - Here's a solution that seems to work fine for 64-bit 2's complement integer math.

    proc bin2int {binString} {
        set result 0
        for {set j 0} {$j < [string length $binString]} {incr j} {
            set bit [string range $binString $j $j]
            set result [expr $result << 1]
            set result [expr $result | $bit]
        }
        return $result
    }

    set binaryString 11001

    puts [bin2int $binaryString]

KBK contributed this version in the Tcl chatroom on 2002-11-18:

 proc fromBinary { digitString } {
    set r 0
    foreach d [split $digitString {}] {
        incr r $r
        incr r $d
    }
    return $r
 }

TV (24-4-'03) The above is along the lines which one could use in a reusable low level implementation. The complement in similar manner could be:

 set a 16; set r {}; 
 while {$a != 0} {set h [expr $a/2]; if {[expr $h+$h] != $a} {set r "1$r"} {set r "0$r"} ; set a $h }
 puts $r

The result should be split to get a list from the string, and the domain is limited by the ceiling for expr based addition and division.

Addition and division by two can be fast for low level implementation, and the method can probably easily be used with an infinite math package (I've seen one but didn't try it out yet).


Lars H, 5 Feb 2005: The following two converters came out of a "challenge" at Googol magnitude. Since they are simply doing things to strings, they are not limited by expr precision!

 proc dec2bin {num} {
    while {[regexp {[0-9]} $num]} {
       set num\
         [string map {o0 0 o1 1 o2 2 o3 3 o4 4 i0 5 i1 6 i2 7 i3 8 i4 9 0 ""}\
           [string map {0 0o 1 0i 2 1o 3 1i 4 2o 5 2i 6 3o 7 3i 8 4o 9 4i} $num]]
    }
    string map {i 1 o 0} $num
 }

 proc bin2dec {num} {
    set num h[string map {1 i 0 o} $num]
    while {[regexp {[io]} $num]} {
       set num\
         [string map {0o 0 0i 1 1o 2 1i 3 2o 4 2i 5 3o 6 3i 7 4o 8 4i 9 ho h hi h1}\
           [string map {0 o0 1 o1 2 o2 3 o3 4 o4 5 i0 6 i1 7 i2 8 i3 9 i4} $num]]
    }
    string range $num 1 end
 }

 proc int2bits {n} {split [dec2bin $n] ""}
 proc bits2int {n} {bin2dec [join $n ""]}

The idea of dec2bin is to alternate between a normal decimal notation and an a bi-quinary [L1 ] or "abacistic" notation where there are separate positions for fives (i = a five, o = no five). It is easy to generate the abacistic notation for half of a number in decimal notation -- just half every digit in place, e.g. 7 -> 3i -- and also easy to convert from abacistic to decimal notation, e.g. i3 -> 8. (Note that this doesn't combine is and os into the same decimal position as they were split off from.) is and os that don't have a digit to their right to recombine with constitute the bits that have been shifted out from the decimal number. bin2dec does the same thing backwards, with an extra h in the leftmost position to discard unnecessary zeroes.

 % dec2bin 987654
 11110001001000000110
 % bin2dec 11110001001000000111
 987655
 %  bin2dec 1[string repeat 0 64]
 18446744073709551616

ET: This last example shows the problem with binary digits: They're impossible to read after just a few digits. While I never liked the language ADA, it did have an idea that I wish had caught on, the optional use of an underscore character in large numerical constants, to make the numbers readable. (And trival for a compiler or interpreter to scan/parse).

So, 1_234_567 is a number that is as readable as 1,234,567 and is much better than 1234567. I have a handy little converter, (which I stole from somewhere on this wiki and modifed):

 proc {commas} {var {num 3} {char ","}} {
    set len   [string length $var]
    set first [expr $len - $num]
    set x     ""
    while { $len > 0} {
        # grab left num chars
        set lef [string range $var $first end] 
        if {[string length $x] > 0} {
            set x   "${lef}$char${x}"
        } else {
            set x   ${lef}
        }
        # grab everything except left num chars
        set var [string range  $var 0 [expr $first -1]]
        set len   [string length $var]
        set first [expr $len - $num]
    }
    return $x
 }

Here are some examples of its use:

 dec2bin 987654
 11110001001000000110

 % commas [dec2bin 987654] 4 _
 1111_0001_0010_0000_0110

 % commas [dec2bin 987654] 4 { }
 1111 0001 0010 0000 0110

 % commas [dec2bin 987654] 1 { }
 1 1 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0

 %commas 123456789     ;# naturally, it defaults for use with large decimal integers
 123,456,789

 % commas 123456789 3 _  ;# and here's how I wish numbers could be entered, in tcl and in C etc.
 123_456_789

 % puts "0x[commas [format %08X 123456789] 4 _]"    ;# and for hex numbers as well
 0x075B_CD15   

Larry Very interesting page! Is there aywhere on the page code that simply converts numbers into binary?

For example we have CW000-aa000 and we'd like the program to first search the CW string then convert into binary numbers at position 3-4-5 and 9-10-11.


Chad: Why is it that after

 binary scan [binary format f 393.84] f n

the value of n is 393.839996338?


See also: