OK, I have a small test file that contains utf-8 codes. Here it is (the language is Wolof)

======
Fˆndeen d‘kk la bu ay wolof aki seereer a fa nekk. DigantŽem ak
Cees jur—om-benni kilomeetar la. MbŽyum gerte ‘pp ci diiwaan bi mu
======

that is what it looks like in a vanilla editor, but in hex it is:

======
$ xxd test.txt
0000000: 46cb 866e 6465 656e 2064 e280 986b 6b20  F..ndeen d...kk
0000010: 6c61 2062 7520 6179 2077 6f6c 6f66 2061  la bu ay wolof a
0000020: 6b69 2073 6565 7265 6572 2061 2066 6120  ki seereer a fa
0000030: 6e65 6b6b 2e20 4469 6761 6e74 c5bd 656d  nekk. Digant..em
0000040: 2061 6b0d 0a43 6565 7320 6a75 72e2 8094   ak..Cees jur...
0000050: 6f6d 2d62 656e 6e69 206b 696c 6f6d 6565  om-benni kilomee
0000060: 7461 7220 6c61 2e20 4d62 c5bd 7975 6d20  tar la. Mb..yum
0000070: 6765 7274 6520 e280 9870 7020 6369 2064  gerte ...pp ci d
0000080: 6969 7761 616e 2062 6920 6d75 0d0a       iiwaan bi mu..
======

The second character [[cb86]] is a non-standard coding for a-grave [[à]] which is found quite consistently in web documents, although in 'real' utf-8, a-grave would be c3a0. Real utf-8 works beautifully on Macs and under Windows. I handle the fake utf-8 by using a character map which included the pair { ˆ à } because that little caret is what cb86 generates, and everything works fine ON A MAC for displaying text (in a text widget) like this:

======
Fàndeen dëkk la bu ay wolof aki seereer a fa nekk. Digantéem ak
Cees juróom-benni kilomeetar la. Mbéyum gerte ëpp ci diiwaan bi mu
======

On a PC - using the same file (shared) the first three characters read in are 46 cb 20 (using no fconfigure). I have run through ALL the possible encodings and can never get the same map to work. [[There are twenty that will allow 46 cb 86]]

Sorry this is so long, but if anyone has a clue, I would love to hear it.

Tel Monks

'''[DGP]''' I assume your 'character map' is an application of the [string map] command. Take care that it is mapping the character you believe it is.
In particular, check whether you are mapping \u005e or \u02c6, both of which are variants of ^ but which Tcl string processing treats as distinct characters because they are distinct Unicode code points.

----
'''[telgo] - 2010-07-24 13:04:10'''

Thanks for your response. Here is the whole of the code area

======
set in [open $arg1 r]
fconfigure $in -encoding utf-8
set body [read -nonewline $in]
if {$platform eq "windows" && $arg1 eq "test.txt" } {
    regexp {^F(.*)ndeen\y} $body matched thing
    puts $matched
    set thinghex [convert.string.to.hex $thing]
    puts "<$thing>=$thinghex"
    set body [ string map { $thing à Ž é ‘ ë – ñ „ Ñ — ó ƒ É è Ë Ë? à} $body]
    puts $body
} else {
    set body [ string map { ˆ à Ž é ‘ ë – ñ „ Ñ — ó ƒ É è Ë Ë? à} $body]
}
======

I did this in an attempt to ensure that WHATEVER was in that position would be map translated. I get a but the a-grave never appears - it stays as caret. The console and the display show the same thing (as I would expect)

<> Characters | Mac | Windows