if 0 {I've come across skepticism about [Tcl]'s facility for bit manipulation one too many times. While [RS]'s usual ''tours de force'' such as "[Playing with bits]" and "[Big bitstring operations]" show how '''good''' Tcl can be at bit manipulation, this page has a far more limited ambition: simply to help hardware- or [C]-oriented developers feel comfortable working on low-level data in a higher-level language.}

    # Let's experiment: one model for "low-level data" is a byte array, or
    # byte sequence.  A sequence of eight-bit values is often the manifestation
    # of information received from a physical device through a serial port, or
    # from a remote host through a network connection.  Start, then, with
    # sample data, a sequence of seven eight-bit quantities.
    set sample \x63\x77\x54\x00\x83\x41\x42

    # While they're all there, only some are displayable.
    puts $sample

    # Let's make a utility that'll show the content of a byte
    # sequence as hex data:
    proc show byte_sequence {
        binary scan $byte_sequence H* x
        puts [regsub -all (..) $x {\1 }]
    }

    # Display our sample data.
    show $sample

    # Suppose bits 1 and 2 (counting from bit 0, which is what the
    # mask 0x06 selects) combine to make some type specification.
    # Let's look at them:
    foreach byte [split $sample {}] {
        puts "The type of this byte is '[expr {0x06 & [scan $byte %c]}]'."
    }

    # We can "mask off" unwanted bits.
    foreach byte [split $sample {}] {
        puts "After masking, we're looking at '[format %02X [expr {0x3F & [scan $byte %c]}]]'."
    }

    ################################################################
    # This sample is a sequence of five sixteen-bit quantities.
    set word_sample \u0001\u0020\u0300\u4000\uFEDC

    proc show_words word_sequence {
        foreach word [split $word_sequence {}] {
            puts -nonewline "[format %04X [scan $word %c]] "
        }
        puts ""
    }

    # Suppose I need to look just at the words at indices 3 and 4
    # (string indices count from zero).
    set subsample [string range $word_sample 3 4]

    # This next displays "4000 FEDC".
    show_words $subsample

    # We can mask off any bits we choose.
    foreach word [split $word_sample {}] {
        puts "After masking, we see '[format %04X [expr {0xF0FF & [scan $word %c]}]]'."
    }

    # The output from that last should have been:
    #     After masking, we see '0001'.
    #     After masking, we see '0020'.
    #     After masking, we see '0000'.
    #     After masking, we see '4000'.
    #     After masking, we see 'F0DC'.

    ####################################################################
    # It sometimes happens that vendors define sixteen-bit protocols that they,
    # in effect, force through eight-bit pipes.  A network receiver might, for
    # example, receive bytes we'll label \x01\x03\x54\x80, with the direction
    # to interpret these as the two sixteen-bit words \u0301\u8054 (notice that
    # we're entering the realm of endian affairs).  Here's a model for handling
    # such cases:
    set byte_sequence \x01\x03\x54\x80\x33\x34
    set word_sequence {}
    foreach {low high} [split $byte_sequence {}] {
        # Is there a more elegant way to write this?
        #
        # If you don't like the endianness of the result, it's easy to
        # advise that you swap $low and $high in the calculation
        # which immediately follows.
        set word [expr {([scan $high %c] << 8) + [scan $low %c]}]
        set hex [format %04X $word]
        append word_sequence [subst \\u$hex]
    }
    show_words $word_sequence

    # Also: remarks on RE (string trim; (..?)).

----
[CLN] A comment above asks, "Is there a more elegant way to write this?" Yes, I think that there is. [binary scan] can deal with endian issues:

    % set byte_sequence \x01\x03\x54\x80\x33\x34
    T34
    % binary scan $byte_sequence "s2" a
    1
    % puts $a
    769 -32684
    % binary scan $byte_sequence "S2" a
    1
    % puts $a
    259 21632
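
Note that the "s" and "S" specifiers produce signed values, which is why the word 0x8054 appears as -32684 in the session above. Masking with 0xFFFF recovers the unsigned value. What follows is only a minimal sketch: the helper name bytes_to_words and its "order" argument are hypothetical and not part of the original page, and the sample bytes are built with [binary format] rather than backslash escapes purely for convenience.

```tcl
# Hypothetical helper (not from the original page): decode a byte
# sequence into a list of unsigned sixteen-bit words, shown as hex.
proc bytes_to_words {byte_sequence {order little}} {
    # "s*" reads little-endian sixteen-bit words, "S*" big-endian ones.
    set spec [expr {$order eq "little" ? "s*" : "S*"}]
    binary scan $byte_sequence $spec words
    set result {}
    foreach word $words {
        # binary scan hands back signed integers; masking with 0xFFFF
        # maps them back into the range 0..65535.
        lappend result [format %04X [expr {$word & 0xFFFF}]]
    }
    return $result
}

# The same six bytes as in the session above: 01 03 54 80 33 34.
set byte_sequence [binary format H* 010354803334]
puts [bytes_to_words $byte_sequence]        ;# prints: 0301 8054 3433
puts [bytes_to_words $byte_sequence big]    ;# prints: 0103 5480 3334
```

Unlike the "s2" form used above, "s*" consumes every complete sixteen-bit word in the input, so the caller need not know the count in advance.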