Richard Suchenwirth - Greeklish is the name used on the Web for Greek text written in Latin letters, in other words, a transliteration. See http://homepages.lycos.com/cast00/lypersonal/ for an example, which uses 8 for Theta and differs slightly from the encoding used here.
The following proc translates text in Greeklish to the appropriate Unicode characters (cf. Unicode and UTF-8) for the Greek letters. Transliteration is mostly strict, i.e. a 1:1 mapping of one Latin letter to one Greek letter (that is why slight oddities like Q for Theta occur, but it can be memorized as "a circle with something at it"). I made one exception for the accented letters, which in Greeklish are written with a trailing apostrophe, so that two input characters produce one Greek letter.
 array set i18n_a2g {
     A \u391 B \u392 G \u393 D \u394 E \u395 Z \u396 H \u397 Q \u398
     I \u399 K \u39a L \u39b M \u39c N \u39d J \u39e O \u39f P \u3a0
     R \u3a1 S \u3a3 T \u3a4 U \u3a5 F \u3a6 X \u3a7 Y \u3a8 W \u3a9
     a \u3b1 b \u3b2 g \u3b3 d \u3b4 e \u3b5 z \u3b6 h \u3b7 q \u3b8
     i \u3b9 k \u3ba l \u3bb m \u3bc n \u3bd j \u3be o \u3bf p \u3c0
     r \u3c1 c \u3c2 s \u3c3 t \u3c4 u \u3c5 f \u3c6 x \u3c7 y \u3c8
     w \u3c9 ";" \u387 ? ";"
 }
 proc greeklish {args} {
     global i18n_a2g
     # Treat all arguments as one text. The trailing space is a sentinel,
     # so that an "s" at the very end counts as word-final.
     set text "[join $args] "
     # Accented vowels: letter plus trailing apostrophe becomes one character
     foreach {in out} {
         A' \u386 E' \u388 H' \u389 I' \u38a O' \u38c U' \u38e W' \u38f
         a' \u3ac e' \u3ad h' \u3ae i' \u3af o' \u3cc u' \u3cd w' \u3ce
     } {
         regsub -all $in $text $out text
     }
     # Change "s" to "c" (word-final sigma) at evident word ends
     foreach char [split " \n\t.,:;" ""] {
         regsub -all "s\[$char\]" $text "c$char" text
     }
     set res ""
     foreach i [split $text ""] {
         # info exists avoids the glob matching done by [array names],
         # which misfires on characters such as * or [
         if {[info exists i18n_a2g($i)]} {
             append res $i18n_a2g($i)
         } else {
             append res $i
         }
     }
     return [string range $res 0 end-1] ;# drop the sentinel space
 }
Example: [greeklish Aqh'nai] gives Αθήναι, the Greek name of Athens.
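The interplay of the 1:1 table and the accent exception can also be sketched with string map, which tries its keys in list order at each position, so the two-character accented keys have to come before the plain letters. This is only an illustration, not the proc from this page: the name greeklish_sketch is made up, and the table is cut down to the letters needed for the Athens example.

```tcl
# Hypothetical helper, not part of the original page: a reduced table
# covering only the letters that occur in "Aqh'nai".
proc greeklish_sketch {text} {
    # Accented keys first, so that "h'" wins over plain "h";
    # string map uses the first key in list order that matches.
    set map [list \
        h' \u03ae  a' \u03ac  i' \u03af \
        A \u0391  q \u03b8  h \u03b7  n \u03bd  a \u03b1  i \u03b9]
    return [string map $map $text]
}

puts [greeklish_sketch Aqh'nai]
```

Characters that are not in the map pass through unchanged, just as in the table-driven loop.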
Nov 15, 2000 RS: added conversion of word-final "s" to "c", to produce the final lowercase sigma.
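The word-final rule on its own might be sketched with a single regsub, run before the letters are mapped to Greek. This is an assumption-laden variant rather than the code above: final_sigma is a made-up name, the pattern relies on Tcl's advanced regular expressions (Tcl 8.1 or later), and \s matches a few more whitespace characters than just space, newline and tab.

```tcl
# Hypothetical helper: rewrite word-final "s" as "c", the letter that the
# table maps to the final sigma.
proc final_sigma {text} {
    # An "s" counts as word-final when it is followed by whitespace or one
    # of . , : ; or when it is the last character of the string.
    regsub -all {s([\s.,:;]|$)} $text {c\1} text
    return $text
}

puts [final_sigma "sofos, kosmos"]
```

An "s" at the start or in the middle of a word is left alone, so it still maps to the ordinary sigma.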
See also: The Lish family - Arts and crafts of Tcl-Tk programming