[Richard Suchenwirth] 2002-05-05 - Here is a routine for querying or setting single bits in vectors, where bits are addressed by non-negative integers. Implementation is as a "little-endian" list of integers, where bits 0..31 are in the first list element, 32..63 in the second, etc. '''USAGE''' bit varName position ?bitval? If ''bitval'' is given, sets the bit at numeric position ''position'' to 1 if ''bitval != 0'', else to 0; in any case returns the bit value at specified position. If variable ''varName'' does not exist in caller's scope, it will be created; if it is not long enough, it will be extended to hold at least ''$position+1'' bits, e.g. ''bit foo 32'' will turn ''foo'' into a list of two integers, if it was only one before. All bits are initialized to 0. proc bit {varName pos {bitval {}}} { upvar 1 $varName var if {![info exist var]} {set var 0} set element [expr {$pos/32}] while {$element >= [llength $var]} {lappend var 0} set bitpos [expr {$pos%32}] set word [lindex $var $element] if {$bitval != ""} { if {$bitval} { set word [expr {$word | 1 << $bitpos}] } else { set word [expr {$word & ~(1 << $bitpos)}] } set var [lreplace $var $element $element $word] # In Tcl 8.4, the above line may be replaced with: # lset var $element $word # If the length of the bit vector is large, in 8.0-8.3 # performance gain will be achieved with # set var [lreplace [K $var [set var {}]] \ # $element $element $word] # where K is the K combinator: # proc K { x y } { return $x } } expr {($word & 1 << $bitpos) != 0} } #---------------------- now testing... if {[file tail [info script]] == [file tail $argv0]} { foreach {test expected} { {bit foo 5 1} 1 {set foo} 32 {bit foo 32 1} {32 1} } { catch {eval $test} res puts $test:$res/$expected } } if 0 {Discussion: This may be used for Boolean properties of numerically indexed sets of items. Example: An existence map of ZIP codes between 00000 and 99999 can be kept in a list of 3125 integers (where each element requires about 15 bytes overall), while implementing the map as an array would take 100000 * 42 bytes in worst case, but still more than a bit vector if the population isn't extremely sparse - in that case, a list of 1-bit positions, retrieved with [lsearch], might be more efficient in memory usage. Runtime of bit vector accesses is constant, except when a vector has to be extended to much larger length. Bit vectors can also be used to indicate set membership (set operations would run faster if processing 32 bits on one go with bitwise operators (&, |, ~, ^)) - or pixels in a binary imary image, where each row could be implemented by a bitvector. Here's a routine that returns the numeric indices of all set bits in a bit vector: proc bits bitvec { set res {} set pos 0 foreach word $bitvec { for {set i 0} {$i<32} {incr i} { if {$word & 1<<$i} {lappend res $pos} incr pos } } set res } % bit foo 47 1 1 % bit foo 11 1 1 % set foo 2048 32768 % bits $foo 11 47 ---- '''Sieve of Erastothenes''': The following procedure exercises the bit vector functions by letting bits represent integers, and unsetting all that are divisible. The numbers of the bits finally still set are supposed to be primes, and returned: proc sieve max { set maxroot [expr {sqrt($max)}] set primes [string repeat " 0xFFFFFFFF" [expr {($max+31)/32}]] bit primes 0 0; bit primes 1 0 for {set i [expr $max+1]} {$i<=(($max+31)/32)*32} {incr i} { bit primes $i 0 ;# mask out excess bits } for {set i 2} {$i<=$maxroot} {incr i} { if {[bit primes $i]} { for {set j [expr $i<<1]} {$j<=$max} {incr j $i} { bit primes $j 0 } } } bits $primes } % time {set res [sieve 10000]} 797000 microseconds per iteration ---- '''Phone number sorting problem:''' sort up to 10 million 7-digit numbers (see Jon Bentley's[http://www.cs.bell-labs.com/cm/cs/pearls/cto.html] for the story): # produce a data file with a million 7-digit numbers >= 2000000 set fp [open t.txt w] for {set i 0} {$i<1000000} {incr i} { puts $fp [expr {int(rand()*8000000)+2000000}] } close $fp #----------------------------------- Process all numbers in one go proc readall-foreach {fn} { set fp [open $fn r] foreach i [read $fp] { bit B $i 1 } close $fp set fp [open $fn.sorted w] puts $fp [join [bits $B] \n] close $fp } ---- [Category Concept] | [Arts and crafts of Tcl-Tk programming]