Version 3 of encoding

Updated 2001-10-07 11:51:54

http://purl.org/tcl/home/man/tcl8.4/TclCmd/encoding.htm


Examples: The Euro sign is represented in Windows cp1252 as the single byte \x80. If you receive such a string, you can convert it to the real Euro sign (\u20AC) with

 encoding convertfrom cp1252 \x80

To convert back to cp1252, use

 encoding convertto cp1252 \u20AC
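
The two conversions above can be checked as a round trip. This is only a small sketch; the variable names (bytes, hex, back) are made up for illustration:

```tcl
# Unicode Euro sign -> cp1252 byte string
set bytes [encoding convertto cp1252 \u20AC]

# Look at the raw byte as hex digits
binary scan $bytes H* hex

# cp1252 byte string -> Unicode again
set back [encoding convertfrom cp1252 $bytes]
```

After this, hex should hold the two hex digits of the cp1252 code, and back should be the original Euro sign again.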

To find out which encoding is used by default for communication with the OS (including file I/O), use

 encoding system

To see which encodings are delivered with your Tcl version, use

 encoding names
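
Since the list of encodings depends on the installation, a script may want to check for an encoding before using it. A minimal sketch, assuming a helper proc called haveEncoding (a name invented here, not part of Tcl):

```tcl
# Hypothetical helper: 1 if the named encoding is available, else 0
proc haveEncoding {name} {
    expr {[lsearch -exact [encoding names] $name] >= 0}
}

if {[haveEncoding cp1252]} {
    # Decode the cp1252 byte for the Euro sign
    set euro [encoding convertfrom cp1252 \x80]
} else {
    # Fall back to the Unicode escape directly
    set euro \u20AC
}
```

Either branch leaves the Euro sign in the variable euro; the check only avoids an error from "encoding convertfrom" when the encoding is missing.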

Can I use the 'encoding' command (or some appropriate 'fconfigure -encoding') to take a Tcl source file (*.tcl) in an arbitrary encoding and output a well-formed Tcl source file which is pure ASCII (i.e. all chars > 127 have been converted to \uhhhh Unicode escape sequences)?

RS: Sure, but some assembly required - like this:

 proc u2x s {
    set res ""
    foreach i [split $s ""] {
        scan $i %c int
        if {$int<128} {
           append res $i
        } else {
           append res \\u[format %04.4X $int]
        }
    }
    set res
 }
 set fp [open $filename]
 fconfigure $fp -encoding $originalEncoding
 set data [u2x [read $fp [file size $filename]]]
 close $fp
 set fp2 [open $newFilename w]
 puts -nonewline $fp2 $data
 close $fp2 
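
One way to convince yourself that the escaped text is still equivalent is to let Tcl expand the escapes again and compare. The sketch below is self-contained, so it repeats the conversion under the made-up name toAsciiEscapes; the sample string is also just an illustration. Note that "subst" processes every backslash sequence, so this check is only meaningful for text that contains no other backslashes.

```tcl
# Same idea as u2x above, repeated here so the example runs on its own
proc toAsciiEscapes {s} {
    set res ""
    foreach ch [split $s ""] {
        scan $ch %c code
        if {$code < 128} {
            append res $ch
        } else {
            append res [format \\u%04X $code]
        }
    }
    return $res
}

set original "Preis: 5 \u20AC f\u00FCr alles"
set escaped  [toAsciiEscapes $original]

# Expand only backslash sequences, no commands and no variables
set roundTrip [subst -nocommands -novariables $escaped]
```

Here escaped contains only characters below 128, and roundTrip should compare equal to original.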

The "u2x" functionality is easy to write yourself, but something similar is also built into Tk: on Unix, characters for which no font has a glyph are displayed in "\uxxxx" style (Windows mostly shows an empty rectangle). See Unicode and UTF-8

