A little Unicode editor



Richard Suchenwirth -- By popular demand, here's a plaything for those who have Tcl/Tk 8.1 or better and want to experiment with Unicode and UTF-8 - an editor with

  • a menu (File/Open/Save/Exit as usual, Encoding/..., Language/... where you select which exotic character set you'd like to use),
  • a text widget (no scroll bar, but you can still scroll with the middle mouse button held down),
  • a keyboard widget where the selected characters are offered as buttons. Clicking on one inserts the respective character at the text widget's insert cursor.
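
How a key gets its label can be tried in a plain tclsh, without Tk. The sketch below is not part of the editor; the helper name codes2chars is hypothetical. It turns a list of code points into the characters the buttons would show, using the same format %c call as the keyboard proc in the script.

```tcl
# Hypothetical helper (not part of the editor): map integer code points,
# written in decimal or hex, to the characters a keyboard button would show.
proc codes2chars {codes} {
    set res {}
    foreach i $codes {
        lappend res [format %c $i]
    }
    return $res
}

# Greek small letters alpha, beta, gamma
set abc [codes2chars {0x3b1 0x3b2 0x3b3}]
puts [join $abc ""]
```

Whether the output is readable depends on the terminal's encoding and font, just as the buttons depend on an installed font.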

To display Greek, Arabic or whatever, of course you have to have at least one font installed that contains those characters. For similar input of 2350 different Korean Hangul (and a keyboard in a text widget, so it auto-wraps), see A little Korean editor. For Chinese, Japanese, Korean, Arabic, Greek, Hebrew, see taiku goes multilingual.
See also iKey: a tiny multilingual keyboard.


 #!/bin/sh
 # use -*-Tcl-*- \
 exec wish "$0"
 package require Tk

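 # keyboard w ?-keys rangelist? ?-keysperline n? -receiver textwidget
 # (Re)builds frame $w with one button per character in -keys.
 proc keyboard {w args} {
    destroy $w
    frame $w
    array set opts {-keys {0xA0-0xFF} -keysperline 16}
    array set opts $args ;# no errors checked; -receiver must be given
    set klist {}; set n 0
    foreach i [clist2list $opts(-keys)] {
        set c [format %c $i]
        set cmd "$opts(-receiver) insert insert [list $c]"
        # Hebrew letters and the Arabic block (U+05D0..U+06FF) are written
        # right to left: step the cursor back so the next key lands before
        # this one
        if {$i>=0x5D0 && $i<=0x6FF} {
            append cmd ";$opts(-receiver) mark set insert {insert - 1 chars}"
        }
        button $w.k$i -text $c -command $cmd  -padx 5 -pady 0
        lappend klist $w.k$i
        if {[incr n]==$opts(-keysperline)} {
            eval grid $klist -sticky news ;# one full row
            set n 0; set klist {}
        }
    }
    if {[llength $klist]} {eval grid $klist -sticky news} ;# last, partial row
    pack $w -side bottom
    set w ;# return widget pathname, as the others do
 }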
 proc clist2list {clist} {
    #-- clist: compact integer list w.ranges, e.g. {1-5 7 9-11}
    set res {}
    foreach i $clist {
        if {[regexp {([^-]+)-([^-]+)} $i -> from to]} {
            # int() turns hex strings such as 0xA0 into plain integers
            for {set j [expr {int($from)}]} {$j<=$to} {incr j} {
                lappend res $j
            }
        } else {lappend res [expr {int($i)}]}
    }
    set res
 }
 proc file:open {w} {
    set fn [tk_getOpenFile]
    if {[string length $fn]} {
        set f [open $fn]
        fconfigure $f -encoding $::Encoding
        set text [read $f]
        close $f
        regsub -all \uFEFF $text "" text ;# drop byte order marks
        $w delete 1.0 end
        $w insert end $text
    }
 }
 proc file:save {w} {
    set fn [tk_getSaveFile]
    if {[string length $fn]} {
        set f [open $fn w]
        fconfigure $f -encoding $::Encoding
        if {$::Encoding=="unicode"} {
            puts -nonewline $f \uFEFF ;# byte order mark
        }
        # end-1c leaves out the newline the text widget always appends;
        # end-2c would also cut the last character typed
        puts -nonewline $f [$w get 1.0 end-1c]
        close $f
    }
 }
 ##################################################################### #
 menu .menu
 . config -menu .menu

 menu .menu.file -tearoff 0
 .menu add cascade -label File -menu .menu.file
 .menu.file add command -label Open... -command {file:open .t}
 .menu.file add command -label Save... -command {file:save .t}
 .menu.file add separator
 .menu.file add command -label Exit -command exit

 menu .menu.enc -tearoff 0
 .menu add cascade -label Encoding -menu .menu.enc
 foreach i {
    ascii cp1252 euc-jp iso2022-jp iso8859-1 iso8859-2 iso8859-3 
    iso8859-4 iso8859-5 iso8859-6 iso8859-7 iso8859-8 
    jis0208 koi8-r shiftjis utf-8 unicode
 } {
   .menu.enc add radio -label $i -variable ::Encoding -value $i
 }
 set ::Encoding [encoding system]

 menu .menu.lang -tearoff 0
 .menu add cascade -label Language -menu .menu.lang
 foreach {lang range} {
    "Euro Latin 1"          {0xA0-0xFF}
    Arabic                  {0xFE80-0xFEFC}
    Cyrillic                {0x410-0x44f}
    Greek                   {0x386-0x38a 0x38c 0x38e-0x3a1 0x3a3-0x3ce}
    Hebrew                  {0x5d0-0x5ea 0x5f0-0x5f4} 
    Hiragana                {0x3041-0x3094}
    Katakana                {0x30A1-0x30FE}
    Thai                    {0xE01-0xE3A 0xE3F-0xE5B}
 } {
    .menu.lang add command -label $lang -command \
            [list keyboard .kbd -keys $range -receiver .t]
 }

 keyboard .kbd -receiver .t
 pack [text .t -width 80 -height 24] -fill both -expand 1
 focus .t

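The Encoding menu only sets the channel encoding that file:open and file:save apply to the file. What that changes on disk can be sketched without Tk: the same three characters take a different number of bytes depending on the encoding. This is an illustration with encoding convertto from core Tcl, assuming the utf-8 and iso8859-7 encodings are installed and a Tcl recent enough to have the ne operator (8.4 or later); the variable names are made up for the example.

```tcl
set s "\u03b1\u03b2\u03b3"   ;# Greek alpha, beta, gamma

foreach enc {utf-8 iso8859-7} {
    # convertto yields the bytes that would be written to a file
    set bytes [encoding convertto $enc $s]
    set size($enc) [string length $bytes]
    puts "$enc: $size($enc) bytes"
    # decoding with the same encoding restores the original string
    if {[encoding convertfrom $enc $bytes] ne $s} {
        error "round trip failed for $enc"
    }
}
```

Decoding with a different encoding than the one used for writing is what produces garbage in the text widget, so the radio button has to match the file before Open is chosen.
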
LV: Is there a way to let the user select the size of the keyboard and display characters? (1)
At least one old-timer finds the tiny fonts difficult to see.
P.S. It would be useful to put the selected language either in the title bar, or a checkbox next to it, to show what is currently being displayed. (2)
If an appropriate font isn't available, does the app tell the user? Or do the languages only show up when appropriate fonts are found? (3)


RS: (1) Of course: the button command in the keyboard proc can take a -font argument. One could add more menu items to control that...
(2) Sure. That's why I put the source here: so everyone can grab it and extend it as they wish.
(3) This is a gap in Tcl's i18n support: Tk does a good job of finding a font that offers the required characters, but if there is none, the script gets no feedback.
Instead, on WinNT an empty rectangle is displayed; on Sun, the required character is displayed verbatim as its escape sequence, e.g. "\uABCD". This way, the information is still there (somehow), but gridded keyboards get terribly torn up.
My workaround is "have a font that has the characters" ;-). On Windows my favorite is Bitstream Cyberbit. MS-Office-2000+ comes with Arial Unicode MS, which is pretty well stocked with all kinds of character sets (except for the 1.0 Hangul).
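
Points (1) and (2) could be sketched roughly as follows. None of this is in the original script: the proc names kbd:fontspec, kbd:font and kbd:title and the default family Helvetica are assumptions made for illustration. The procs only touch Tk when they are called, so the block can be pasted after the editor code.

```tcl
# Hypothetical additions; call them after the editor script has run.

# Build a font description such as {Helvetica 18} from a point size.
proc kbd:fontspec {size {family Helvetica}} {
    list $family $size
}

# (1) Apply a new font size to every key in keyboard frame $w.
proc kbd:font {w size} {
    foreach b [winfo children $w] {
        $b configure -font [kbd:fontspec $size]
    }
}

# (2) Show the selected language in the title bar of the main window.
proc kbd:title {lang} {
    wm title . "Unicode editor - $lang"
}
```

A call such as kbd:font .kbd 18 followed by kbd:title Greek would do for a quick test in wish. Since keyboard destroys and rebuilds its frame on every language change, the font has to be applied again afterwards, or passed to button inside keyboard as RS suggests.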


For infrequent use, the virtual keyboard is most intuitive. But for writing in alphabets like Cyrillic, mapping input from the physical keyboard is more convenient - see A tiny input manager.
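
The mapping idea can be illustrated in a few lines of plain Tcl. This is only a sketch under stated assumptions, not the code of the page mentioned above: the table translit is deliberately incomplete, the proc name cyr is made up, and string map needs Tcl 8.1.1 or later.

```tcl
# Hypothetical, incomplete Latin-to-Cyrillic table. It is built with
# [list ...] rather than braces so that the \u escapes are substituted.
set translit [list \
    a \u0430  b \u0431  v \u0432  g \u0433  d \u0434 \
    k \u043a  m \u043c  o \u043e  t \u0442]

# Replace every Latin letter found in the table by its Cyrillic
# counterpart; other characters pass through unchanged.
proc cyr {latin} {
    string map $::translit $latin
}

puts [cyr "da"]
```

In the editor, such a table could be applied to each typed character before it is inserted into .t.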


A modified version of this was described in Japanese at http://www.tooyoo.l.u-tokyo.ac.jp/~kmatsum/papers/lsj/021104/MultLingTools.pdf