if 0 {I've come across skepticism about [Tcl]'s facility for bit manipulation one too many times.  While [RS]'s usual ''tours de force'' such as "[Playing with bits]" and "[Big bitstring operations]" show how '''good''' Tcl can be at bit manipulation, this page has a far more limited ambition:  simply to help hardware- or [C]-oriented developers feel comfortable working on low-level data in a higher-level language.}

# Let's experiment:  one model for "low-level data" is a byte array, or
# byte sequence.  A sequence of eight-bit values is often the manifestation
# of information received from a physical device through a serial port, or
# from a remote host through a network connection.  Start, then, with
# sample data, a sequence of seven eight-bit quantities.
set sample \x63\x77\x54\x00\x83\x41\x42

# While they're all there, only some are displayable.
puts $sample

# Let's make a utility that'll show the content of a byte
# sequence as hex data:
proc show byte_sequence {
    binary scan $byte_sequence H* x
    puts [regsub -all (..) $x {\1 }]
}

# Display our sample data.
show $sample

# Suppose bits 1 and 2 (counting the least-significant bit as bit 0)
# combine to make some type specification.  The mask 0x06 selects them.
# Let's look at them:
foreach byte [split $sample {}] {
    puts "The type of this byte is '[expr {0x06 & [scan $byte %c]}]'."
}

# We can "mask off" unwanted bits; here 0x3F clears the top two.
foreach byte [split $sample {}] {
    puts "After masking, we're looking at '[format %02X [expr {0x3F & [scan $byte %c]}]]'."
}

################################################################

# This sample datum holds five sixteen-bit quantities.
set word_sample \u0001\u0020\u0300\u4000\uFEDC

proc show_words word_sequence {
    foreach word [split $word_sequence {}] {
        puts -nonewline "[format %04X [scan $word %c]] "
    }
    puts ""
}

# Suppose I need to look just at the words at indices 3 and 4
# (string indices count from zero).
set subsample [string range $word_sample 3 4]

# This next displays "4000 FEDC".
show_words $subsample

# We can mask off any bits we choose.
foreach word [split $word_sample {}] {
    puts "After masking, we see '[format %04X [expr {0xF0FF & [scan $word %c]}]]'."
}

# The output from that last should have been:
#     After masking, we see '0001'.
#     After masking, we see '0020'.
#     After masking, we see '0000'.
#     After masking, we see '4000'.
#     After masking, we see 'F0DC'.

####################################################################

if 0 {Thanks to [CLN] for a drastic simplification of what follows.}

# It sometimes happens that vendors define sixteen-bit protocols that they, in effect,
# force through eight-bit pipes.  A network receiver might, for example, receive bytes
# we'll label \x01\x03\x54\x80, with the direction to interpret these as the two
# sixteen-bit words \u0301\u8054 (notice that we're entering the realm of endian
# affairs).  Here's a model for handling such cases:
set byte_sequence \x01\x03\x54\x80\x33\x34

# Notice that "s*" and "S*" account for the two byte orders:  "s" reads
# little-endian words and "S" reads big-endian ones.  Both yield signed
# decimal integers, so the word \u8054 is reported as -32684.
binary scan $byte_sequence s* display_word_sequence
puts "Here are the words:  '$display_word_sequence'."

binary scan [string range $byte_sequence 0 3] S2 first_two_words
puts "Here are the first two words of the byte sequence:  '$first_two_words'."

# Also:  remarks on RE (string trim; (..?)).

----
[Category String Process]