if 0 {[Richard Suchenwirth] 2004-09-05 - In [Tcl chatroom] discussions on the [binary image compression challenge], [KBK] mentioned the Elias way of representing positive integers as compact bit sequences. ''Elias, P. Universal Codeword Sets and Representations of the Integers. IEEE Trans. Inform. Theory 21, 2 (March 1975), 194-203'' In the '''gamma''' way [http://en.wikipedia.org/wiki/Elias_Gamma_coding], the bit sequence of the number (without leading zeroes) is prefixed with a length marker: one less zeroes than its length, so that 1 = 1 2 = 010 3 = 011 4 = 00100 and so on. The bits sequence of a positive integer can be obtained like this:} proc int2bits int { set res "" while {$int>0} { set res [expr {$int%2}]$res ;# prepend - we have no [string insert] set int [expr {$int/2}] } set res } # and so the Elias gamma function is just proc Elias'gamma int { set bits [int2bits $int] return [string repeat 0 [expr {[string length $bits]-1}]]$bits } # and for a list of integers, which is the most useful application: proc Elias'gammas ints { set res "" foreach int $ints {append res [Elias'gamma $int]} set res } # Now to decode such a bit string: proc Elias'decode'gammas bits { set res {} while {$bits ne ""} { regexp ^(0*) $bits -> zeroes set length [expr {[string length $zeroes]*2}] lappend res [bits2int [string range $bits 0 $length]] set bits [string range $bits [incr length] end] } set res } # with this little helper, which doesn't bother for the leading zeroes: proc bits2int bits { set res 0 foreach bit [split $bits ""] { set res [expr {$res*2+$bit}] } set res } if 0 {Testing: % Elias'gammas {1 2 3 4 5} 10100110010000101 % Elias'decode'gammas [Elias'gammas {1 2 3 4 5}] 1 2 3 4 5 So we managed to cram 5 integers into 17 bits, and extract them correctly, too. This code works very well for lists of small integers, like runlengths - the worst case there, the 010101... checkerboard, just takes up one bit per 1-length run. For bigger integers, the Elias '''delta''' encoding [http://en.wikipedia.org/wiki/Elias_delta_coding] may be more compact. In the delta encoding the length information for the integer we are encoding is encoded with the Elias gamma code.} proc Elias'delta int { set bits [int2bits $int] return [Elias'gamma [string length $bits]][string range $bits 1 end] } proc Elias'deltas ints { set res "" foreach int $ints {append res [Elias'delta $int]} set res } proc Elias'decode'deltas bits { set res {} while {$bits ne ""} { regexp ^(0*) $bits -> zeroes set length [expr {[string length $zeroes]*2}] set l2 [bits2int [string range $bits 0 $length]] set l3 [expr {$length+$l2-1}] lappend res [bits2int 1[string range $bits [expr {$length+1}] $l3]] set bits [string range $bits [incr l3] end] } set res } if 0 {Testing again: % Elias'decode'deltas [Elias'deltas {1 10 100 1000}] 1 10 100 1000 The delta encoding is more compact if many numbers are bigger than 32 - here's an experimental table: % foreach i {1 2 4 8 16 32 64 128 256 512 1024 2048 4096} { puts $i:[string length [Elias'gamma $i]],[string length [Elias'delta $i]]} 1:1,1 2:3,4 4:5,5 8:7,8 16:9,9 32:11,10 64:13,11 128:15,14 256:17,15 512:19,16 1024:21,17 2048:23,18 4096:25,19 if 0 {Another Elias code is ''omega'' coding, also called ''recursive Elias coding'' [http://en.wikipedia.org/wiki/Elias_omega_coding]. It generalizes on the gamma and delta codes by putting a series of length prefixes in front of the encoded number, each prefix telling the length of the next in the sequence. The process stops when encountering a '0' bit after a prefix (Each length prefix starts with an '1' bit).} ---- [[ [Arts and crafts of Tcl-Tk programming] | [Category Compression] ]]