WJG (26/09/09) Sometimes it is really easy to do relatively complex jobs in Tcl. But I guess regular Tclers know that anyway. Handling a lot of texts, possibly in mixed romanised script, is a daily task for me. SCIM on Linux goes a long way to help, but there's the need to constantly switch my input system between a number of encodings. Sometimes it helps to use ASCII key combinations. This is where ITRANS [L1 ] comes in useful. ITRANS is a system developed as a means of depicting Indic languages using the base ASCII code set. So, it's easy! Imagine my nightmares of using mixed Chinese characters, Pinyin, Romanised Sanskrit (IAST) [L2 ] and English key mappings! It's a lot of switching about. Hence this little routine. I guess that an ITRANS to Devanagari converter is now called for.
#!/bin/sh
# the next line restarts using tclsh \
exec tclsh "$0" "$@"

#===============
# ITrans-Unicode.tcl
#===============
# Author: William J Giddings
# Date: 26/09/09
#===============

#---------------
# convert ITRANS ascii sequence to Unicode
#---------------
proc ITrans { key } {
    # string map tries keys in list order at each position, so the
    # longer keys (.rr, .ll) are listed before their prefixes (.r, .l)
    return [string map {
        "aa" "ā" "AA" "Ā"
        "ii" "ī" "II" "Ī"
        "uu" "ū" "UU" "Ū"
        ".rr" "ṝ" ".RR" "Ṝ"
        ".r" "ṛ" ".R" "Ṛ"
        ".ll" "ḹ" ".LL" "Ḹ"
        ".l" "ḷ" ".L" "Ḷ"
        ".M" "ṁ" ".m" "ṃ"
        ".h" "ḥ" ".H" "Ḥ"
        ";n" "ṅ" ";N" "Ṅ"
        "~n" "ñ" "~N" "Ñ"
        ".t" "ṭ" ".T" "Ṭ"
        ".d" "ḍ" ".D" "Ḍ"
        ".n" "ṇ" ".N" "Ṇ"
        ";s" "ś" ";S" "Ś"
        ".s" "ṣ" ".S" "Ṣ"
    } $key]
}

# Demonstration code
puts [ITrans "praaj~napaaramitaa"]
puts [ITrans "rat.nagunasa.Mcayagaathaa"]
puts [ITrans "K.r.s.na"]
The demonstration code produces this output:
prājñapāramitā
ratṇagunasaṁcayagāthā
Kṛṣṇa
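One thing worth checking in a table like the one in ITrans: string map tries the keys in list order at each position in the string, so a key such as .r that is a prefix of .rr has to come after it, or the longer key never gets a chance to match. The following is only a small sketch of that behaviour; the variable names shortFirst and longFirst are made up for the illustration, and the \u escapes are used just to keep the snippet plain ASCII.

```tcl
# Two versions of the same two-entry table (hypothetical names).
set shortFirst [list ".r" "\u1E5B" ".rr" "\u1E5D"]   ;# .r is tried first
set longFirst  [list ".rr" "\u1E5D" ".r" "\u1E5B"]   ;# .rr is tried first

# ".r" matches at the dot and consumes it, leaving a stray "r" behind.
puts [string map $shortFirst "pit.rrn"]   ;# pitṛrn

# ".rr" is offered first, so the long vowel comes out whole.
puts [string map $longFirst "pit.rrn"]    ;# pitṝn
```

The same applies to .l and .ll, and to their upper-case forms.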
DGP This script should use the -encoding option of tclsh to be sure that tclsh uses the encoding in which all the characters in this script are stored in the file. This is probably utf-8.
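As a rough sketch of what that could look like, assuming the file really is saved as UTF-8 and that the tclsh on the path is 8.5 or later (where the -encoding option is available), the launch header would change like this:

```tcl
#!/bin/sh
# the next line restarts using tclsh, reading this file as UTF-8 \
exec tclsh -encoding utf-8 "$0" "$@"

# With the file read as UTF-8, a literal such as "ā" arrives as one character.
puts [string length "ā"]
```

When the file is loaded from another script rather than run directly, source -encoding utf-8 ITrans-Unicode.tcl should play the same role.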
jbr - 2009-11-13 22:23:24
Here are a bunch of other characters with a TeX mapping. [L3 ]
WJG (25/03/13) It might be worth noting for any Linux users who happen to be Sanskritists, or who work with any other Indic language, that a while ago I submitted an IAST module based upon the above script for distribution with the M17N library used by SCIM to handle multi-language input. This certainly makes implementing complex input methods more manageable. This code snippet need not be forgotten, however, as there are quite a number of texts on the web encoded in ITRANS which can be downloaded and converted using this simple proc.
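For anyone wanting to try that on a downloaded file, here is a minimal sketch. The proc name convertITransFile is made up for this example, and the short stand-in for ITrans is there only so the snippet runs on its own; with the full script loaded, the stand-in is skipped and the real table is used.

```tcl
# Stand-in with a handful of mappings (an assumption for this sketch);
# it is only defined when the full ITrans proc has not been loaded.
if {[info commands ITrans] eq ""} {
    proc ITrans { key } {
        return [string map [list \
            "aa" "\u0101" "ii" "\u012B" "uu" "\u016B" \
            ".n" "\u1E47" ".s" "\u1E63" ".r" "\u1E5B"] $key]
    }
}

# Read an ITRANS-encoded text file and write the converted text as UTF-8.
# convertITransFile is a hypothetical helper, not part of the original script.
proc convertITransFile { inPath outPath } {
    set in [open $inPath r]
    fconfigure $in -encoding utf-8   ;# ITRANS is plain ASCII, a subset of UTF-8
    set text [read $in]
    close $in

    set out [open $outPath w]
    fconfigure $out -encoding utf-8
    puts -nonewline $out [ITrans $text]
    close $out
}
```

It would be called along the lines of convertITransFile input.itx output.txt, with whatever file names apply. Note that the whole file goes through the mapping, so any English passages or markup in the download that happen to contain the same key sequences would be converted as well.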