Entering Unicode characters in a widget

MJ - The following script adds a Control-u binding to text and entry widgets that lets you enter a Unicode character directly by its code point. After pressing Control-u, type the four hex digits of the code point; for example, Control-u followed by 00e9 inserts é.


 package require Tk
 text .t
 entry .e
 pack .t
 pack .e

 # enable functionality by adding the UnicodeEntry tag to the bindtags
 bindtags .t [list .t UnicodeEntry Text . all]
 bindtags .e [list .e UnicodeEntry Entry . all]


 namespace eval unicode_entry {
   variable uc_keys

   proc enable_unicode_entry {widget} {
      variable uc_keys
      set uc_keys($widget) {}
   }

   proc disable_unicode_entry {widget} {
      variable uc_keys
      unset -nocomplain uc_keys($widget)
   }

   proc handle_uc_key {widget key} {
      variable uc_keys
      if {![info exists uc_keys($widget)]} {
         return
      }
     
      upvar 0 uc_keys($widget) keys
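      switch -glob -- [string toupper $key] {
        {} {
            # modifier keys such as Shift produce no character; keep waiting
            return
        }
        {[0-9A-F]} {
            append keys $key
            if {[string length $keys] >= 4} {
               # four hex digits collected: insert the character
               $widget insert insert [subst \\u$keys]
               disable_unicode_entry $widget
            }
            # stop the class binding from inserting the digit itself
            return -code break
        }
        default {
            # any other key aborts: insert the digits typed so far as-is
            $widget insert insert $keys
            disable_unicode_entry $widget
        }
      }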
   }

   bind UnicodeEntry <Control-Key-u> [namespace code [list enable_unicode_entry %W]]
   bind UnicodeEntry <Key> [namespace code [list handle_uc_key %W %A]]
 }
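
The conversion at the heart of the script can be tried without Tk. The following is a minimal sketch, assuming a hypothetical helper named hex_to_char that is not part of the script above. It uses scan and format %c in place of subst, and validates the digits first.

```tcl
# Hypothetical helper: turn a string of exactly four hex digits into the
# corresponding character. Not part of the original script.
proc hex_to_char {hex} {
    # -strict makes [string is] reject the empty string as well
    if {[string length $hex] != 4 || ![string is xdigit -strict $hex]} {
        error "expected 4 hex digits, got \"$hex\""
    }
    # scan parses the digits as hexadecimal; format %c maps the
    # resulting number to a character
    scan $hex %x code
    return [format %c $code]
}

set e_acute [hex_to_char 00e9]   ;# e with acute accent
set euro    [hex_to_char 20AC]   ;# euro sign; upper case digits work too
```

So typing Control-u followed by 00e9 in a widget with the UnicodeEntry tag should insert the same character that hex_to_char 00e9 returns.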

LV 2008 Jan 23 So, what would the user actually type in this case? From the code I understand that Control-u signals that a Unicode character is about to be entered. But then what? The hex, octal, or decimal digits of the special character? That presumes, of course, that the user knows the code. Do we have some sort of "lookup" widget where the user can find that info? And if so, would it make sense to add some sort of comm/send interaction to such a lookup widget, so that it could send the character to the input directly rather than risk the user mistyping a code?

Wearing the hat of an application user, I find having to type in obscure numeric codes annoying, though obviously it is better than having no way to input them at all. However, I think I'd rather have an input language where I typed a mnemonic for the character, or a virtual keyboard or something.

MJ - This is exactly what KHIM does. This binding is just a very limited piece of code that allows you to enter a character if you do know its code point. Otherwise it's much better to use KHIM.

LV Okay, thanks! I just wasn't certain of the behind the scenes reasoning.


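BEO - A more advanced version that uses Control+U to insert Unicode characters. It expects four hex digits for the code point and gives up after a 10-second time-out. It installs a temporary <Key> binding on the widget and restores the previous one afterwards, so it should not interfere with other bindings.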

 #
 # Allow user to input Unicode character using 4 character hex sequence
 #
 namespace eval ::textlib {
     variable ubuffer ""
     variable ubinding ""
 }

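 #
 # Define bindings. The capital U in <Control-U> is the uppercase keysym,
 # so Shift (or Caps Lock) must be active; <Control-u> would bind the plain key.
 #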
 bind Entry <Control-U> {::textlib::Unicode_start %W}
 bind Text <Control-U> {::textlib::Unicode_start %W}
 bind TEntry <Control-U> {::textlib::Unicode_start %W}
 bind TSpinbox <Control-U> {::textlib::Unicode_start %W}

 #
 # Start 4 hex character Unicode entry sequence
 #
 proc ::textlib::Unicode_start {w} {
     variable ubuffer ""
     variable ubinding [bind $w <Key>]

     # Create binding for Unicode sequence
     bind $w <Key> {::textlib::Unicode_append %W %A;break}

     # Set 10 second time-out
     after 10000 ::textlib::Unicode_abort $w
 }

 #
 # Append to Unicode entry sequence
 #
 proc ::textlib::Unicode_append {w k} {
     variable ubuffer

     # Append hex digit, abort for others
     if {[string is xdigit $k]} {
         append ubuffer $k
     } else {
         Unicode_abort $w
         $w insert insert $k
     }

     # If have 4 digits, insert character
     if {[string length $ubuffer] == 4} {
         $w insert insert [format %c 0x$ubuffer]
         Unicode_abort $w
     }
 }

 #
 # Abort Unicode entry sequence
 #
 proc ::textlib::Unicode_abort {w} {
     variable ubuffer ""
     variable ubinding

     # Cancel time-out
     after cancel ::textlib::Unicode_abort $w

     # Restore binding, unless the widget was destroyed before the time-out fired
     if {[winfo exists $w]} {
         bind $w <Key> $ubinding
     }
 }
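
To see how each keystroke is treated, the buffering logic can be modelled without Tk. This is only a sketch: the proc name feed and its return format are hypothetical and not part of the code above. It mirrors the behaviour of Unicode_append: hex digits are collected, the fourth digit produces the character, and any other printable key aborts the sequence and is inserted as typed. The empty %A produced by modifier keys is handled explicitly here.

```tcl
# Hypothetical pure-Tcl model of the keystroke handling; the names are
# illustrative only.
# Takes the digits collected so far and one key (the %A value) and
# returns a list: {new-buffer text-to-insert still-collecting}
proc feed {buffer key} {
    if {$key eq ""} {
        # Modifier keys such as Shift produce an empty %A: keep waiting
        return [list $buffer "" 1]
    }
    if {![string is xdigit -strict $key]} {
        # Any other key aborts the sequence and is inserted as typed;
        # the digits collected so far are discarded
        return [list "" $key 0]
    }
    append buffer $key
    if {[string length $buffer] < 4} {
        return [list $buffer "" 1]
    }
    # Four digits collected: emit the character and finish
    return [list "" [format %c 0x$buffer] 0]
}

# Simulate Control+U followed by 2, 0, Shift, A, C
set buf ""
foreach key {2 0 {} A C} {
    lassign [feed $buf $key] buf text collecting
}
# text now holds U+20AC (the euro sign) and collecting is 0
```

Keeping the decision logic in a proc like this, separate from the bindings, makes it possible to test the sequence handling in plain tclsh.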

See also: KHIM, CharEntry