encodiff

Richard Suchenwirth 2006-04-05 - If you want to know the difference between two (one-byte) encodings, the following codelet may help:

 proc encodiff {e1 e2} {
    set res ""
    for {set i 32} {$i < 256} {incr i} {
       set byte [binary format c $i]   ;# one raw byte with value $i
       set c1 [encoding convertfrom $e1 $byte]
       set c2 [encoding convertfrom $e2 $byte]
       if {$c1 ne $c2} {
          # hex code, then the character as decoded by each encoding
          append res [format %02X $i] $c1 | $c2 \n
       }
    }
    return $res
 }

#-- Testing demo:

 % encodiff iso8859-1 cp1252
 80 |€
 82 |‚
 83 |ƒ
 84 |„
 85 |…
 86 |†
 87 |‡
 88 |ˆ
 89 |‰
 8A |Š
 8B |‹
 8C |Œ
 8E |Ž
 91 |‘
 92 |’
 93 |“
 94 |”
 95 |•
 96 |–
 97 |—
 98 |˜
 99 |™
 9A |š
 9B |›
 9C |œ
 9E |ž
 9F |Ÿ

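Obviously, Windows' cp1252 filled most of the "high control" positions (27 of the 32 code points from 80 to 9F) with extra characters, but is otherwise identical to iso8859-1. The left-hand column of the listing looks empty because iso8859-1 maps those bytes to invisible C1 control characters.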
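
To check that observation without counting lines by eye, here is a minimal sketch of a variant that returns the differing byte values as a list instead of a formatted string. The name encodiffCodes is hypothetical and not part of the original codelet; it assumes the same one-byte encodings and the same 32..255 scan range as encodiff.

```tcl
# Hypothetical helper (an assumption, not part of the original codelet):
# return the byte values at which two one-byte encodings decode differently.
proc encodiffCodes {e1 e2} {
    set codes {}
    for {set i 32} {$i < 256} {incr i} {
        set byte [binary format c $i]   ;# one raw byte with value $i
        if {[encoding convertfrom $e1 $byte] ne [encoding convertfrom $e2 $byte]} {
            lappend codes $i
        }
    }
    return $codes
}

set codes [encodiffCodes iso8859-1 cp1252]
puts "differing positions: [llength $codes]"
puts [format "range: %02X..%02X" [lindex $codes 0] [lindex $codes end]]
```

With the encoding tables that produced the demo listing, this should report the same 27 positions, all between 80 and 9F. Returning a list rather than text makes the result easy to feed into further checks, such as comparing several encoding pairs in a loop.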