Easy input of Pinyin

WJG (22nd May, 2005) A number of contributors to the Wiki have expressed some interest in working with matters 'Chinese'. So, here's my contribution. I'm currently working on a project to create language learning flashcards in Tk and so, of course, pinyin is vital to enable learners to get a handle on the characters that they are trying to memorize.

# -------------------------------------------------------------
# pinyin_input.tcl
# Written by William J Giddings, May, 2005.
# -------------------------------------------------------------
# Usage:
# pinyin_input <widget>
# 
# Purpose:
# --------
# Assigns necessary bindings to a widget in order to
# interpret multiple key-presses as tone inflected vowels.
#
# User operation:
# ---------------
# 
# Alt-vowel-Alt-tone
# Alt-<a|e|i|o|u|y>-Alt-<1|2|3|4>
# 
# The fifth or neutral tone takes no diacritic, so is entered as
# an unmodified key.
# 
# In the case of the umlauted u:, enter the vowel as 'y'
#
# Notes: 
# ------
# Use versions of ttf 2.76 or higher. These include the necessary 
# diacritics to display all pinyin vowels plus their tone marks.
#
# ------------------------------------------------------------- 
 
# -------------------------------------------------------------
# modify widget bindings to insert pinyin vowels
# -------------------------------------------------------------
proc {pinyin_input} {txt} {
    proc {_pinyin_insert} {w v t} {
         switch $v$t {
                a1 { set code 0101 }
                        a2 { set code 00E1 }
                        a3 { set code 01CE }
                        a4 { set code 00E0 }
                        A1 { set code 0100 }
                        A2 { set code 00C1 }
                        A3 { set code 01CD }
                        A4 { set code 00C0 }
                        e1 { set code 0113 }
                        e2 { set code 00E9 }
                        e3 { set code 011B }
                        e4 { set code 00E8 }
                        E1 { set code 0112 }
                        E2 { set code 00C9 }
                        E3 { set code 011A }
                        E4 { set code 00C8 }
                        i1 { set code 012B }
                        i2 { set code 00ED }
                        i3 { set code 01D0 }
                        i4 { set code 00EC }
                        I1 { set code 012A }
                        I2 { set code 00CD }
                        I3 { set code 01CF }
                        I4 { set code 00CC }
                        o1 { set code 014D }
                        o2 { set code 00F3 }
                        o3 { set code 01D2 }
                        o4 { set code 00F2 }
                        O1 { set code 014C }
                        O2 { set code 00D3 }
                        O3 { set code 01D1 }
                        O4 { set code 00D2 }
                        u1 { set code 016B }
                        u2 { set code 00FA }
                        u3 { set code 01D4 }
                        u4 { set code 00F9 }
                        U1 { set code 016A }
                        U2 { set code 00DA }
                        U3 { set code 01D3 }
                        U4 { set code 00D9 }
            # u:1 - u:4
                        y1 { set code 01D6} 
                        y2 { set code 01D8} 
                        y3 { set code 01DA} 
                        y4 { set code 01DC}
            # U:1 - U:4
                        Y1 { set code 01D5}
                        Y2 { set code 01D7}
                        Y3 { set code 01D9}
                        Y4 { set code 01DB}        
                        }
                #puts $code 
                $w insert insert [subst \\u$code] 
        }
    # create specific bindings
        foreach {vowel} {a A e E i I o O u U y Y} {
                        bind $txt <Alt-Key-$vowel><Alt-Key> [list _pinyin_insert %W $vowel %K]
                } 
}
# -------------------------------------------------------------
# test the code
# -------------------------------------------------------------
proc demo  {} {
    package require Tk
    text .txt -font {Ariel 14}
    pack .txt -fill both -expand 1 
    pinyin_input .txt
}
demo

RS 2005-05-22: Here's a simple alternative approach for entry widgets:

proc bindTones w {
    foreach i {1 2 3 4} {
        bind $w <KeyRelease-$i> [list bindTones'digit %W]
    }
}
proc bindTones'digit w {
     set text [$w get]
     if [in {a e i o u} [string index $text end-1]] {
        $w delete 0 end
        $w insert 0 [string map {
            a1 \u101 a2 \ue1 a3 \u103 a4 \ue0
            e1 \u113 e2 \ue9 e3 \u11b e4 \ue8
            i1 \u12b i2 \ued i3 \u12d i4 \uec
            o1 \u14d o2 \uf3 o3 \u14f o4 \uf2
            u1 \u16b u2 \ufa u3 \u16d u4 \uf9
        } $text]
    }
}
proc in {list element} {expr {[lsearch -exact $list $element]>=0}}
#-- Test:
pack [entry .e -font {{Bitstream Cyberbit} 14}]
bindTones .e

WJG (23nd May, 2005) Richard, I like that, it has nice instant input. As there was no trapping for the umlauted u, perhaps the bindTone'sdigit proc can be modified to:

proc bindTones'digit w {
    set text [$w get]
    if [in {a e i o u :} [string index $text end-1]] {
        $w delete 0 end
        $w insert 0 [string map {
            a1 \u101 a2 \ue1 a3 \u103 a4 \ue0
            e1 \u113 e2 \ue9 e3 \u11b e4 \ue8
            i1 \u12b i2 \ued i3 \u12d i4 \uec
            o1 \u14d o2 \uf3 o3 \u14f o4 \uf2
            u1 \u16b u2 \ufa u3 \u16d u4 \uf9
            u:1 \u1d6 u:2 \u1d8 u:3 \u1da u:4 \udc 
        } $text]
    }
}

RS: Sure - I just was too lazy to dig those Unicodes up on my breakfast fun project :^)