Version 7 of Inspecting Unicode characters

Updated 2008-01-20 08:34:30 by WJP

Arjen Markus (2 December 2002) I was looking for the Unicode code point of "promille" (0/00, the per mille sign) and got curious about the Unicode table itself. So the following little script shows (a part of) that table. Nothing fancy, beyond the support for Unicode, but it might be useful.

(16 January 2003) I made some improvements that were needed because I did not handle the first column properly.


 # unicode.tcl -- 
 #    Inspect the available characters for a part of the UNICODE table
 #

 package require Tktable

 # fillArray --
 #    Fill the array with unicode characters
 #
 # Arguments:
 #    array    Name of the array
 #    norows   Number of rows (number of columns is fixed to 16)
 #
 # Result:
 #    None
 #
 # Side effect:
 #    The array is filled with unicode characters: 0 to 16*norows-1
 #
 proc fillArray {array norows} {
    upvar $array chars

    set nocols  16
    set chars(0,0) "Offset"
    for { set col 1 } { $col <= $nocols } { incr col } {
       set code [format "%x" [expr {$col-1}]]
       set chars(0,$col) $code
    }
    set unicode  0
    for { set row 1 } { $row <= $norows } { incr row } {
       set code [format "%x" $unicode]
       set chars($row,0) $code
       for { set col 1 } { $col <= $nocols } { incr col } {
          set code [format "%x" $unicode]
          set chars($row,$col) [subst \\u$code]
          incr unicode
       }
    }
 }

 # showArray -- 
 #    Show the table
 #
 # Arguments:
 #    w        Widget that will contain the table
 #    array    Name of the array
 #    norows   Number of rows
 #
 # Result:
 #    None
 #
 # Side effect:
 #    The array is shown
 #
 proc showArray {w array norows} {
    if { $w == "." } {
       set t .table
       set x .xscroll
       set y .yscroll
    } else {
       set t $w.table
       set x $w.xscroll
       set y $w.yscroll
    }
    # The title row (row 0) counts toward -rows, so add one to show all data rows
    table $t -rows [expr {$norows + 1}] -cols 17 \
       -colwidth 6 \
       -height   10 \
       -titlerows 1 -titlecols 1 -variable ::$array \
       -xscrollcommand "$x set" \
       -yscrollcommand "$y set"
    scrollbar $x -orient horizontal \
       -command "$t xview"
    scrollbar $y -orient vertical \
       -command "$t yview"
    grid $t $y
    grid $x  x

    grid $t   -sticky news
    grid $x   -sticky ew
    grid $y   -sticky ns

 }

 # main --
 #   Set up the main screen and fill it
 #

 global unicode_array

 fillArray unicode_array 100 
 showArray . unicode_array 100
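
The script above builds each character with [subst \\u$code]. As a minimal alternative sketch, assuming plain tclsh without Tk, the same conversion can be done with [format %c], which turns an integer code point directly into a character. The helper name rowOfChars is hypothetical and not part of the script above.

```tcl
# Hypothetical helper (not part of the script above): build one table row
# of 16 characters starting at the given code point.
proc rowOfChars {start} {
    set row {}
    for {set i 0} {$i < 16} {incr i} {
        # format %c converts an integer code point to the character itself
        lappend row [format %c [expr {$start + $i}]]
    }
    return $row
}

# The row starting at 0x40 holds "@" in its first cell and "A" in its second.
set row [rowOfChars 0x40]
puts [lindex $row 1]
```

This avoids the intermediate hex string, and [format %c] is not limited to the four hex digits that \u accepts.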

Richard Suchenwirth 2003-01-05 - This is my first little Tcl project on the iPAQ: a Unicode page browser that lets you inspect which characters are available. In a spinbox you can select which page to display (00..FF hex). Each row starts with the hex digits of the second byte of the first character in the row. Caveat: automatic font-finding doesn't work on the iPAQ, so you have to pick a font, and you see only those Unicode characters that it has. The code below works for inspecting Korean Hangul.

 set pages {}
 set f {"Baekmuk Dotum" 9}
 for {set i 0} {$i <=255} {incr i} {
  lappend pages [format %02X $i]
 }
 frame .f
 label .f.l -text "Unicode page"
 spinbox .f.s -values $pages -width 2 -command {redo .c %s} -textvar page
 set page 00
 entry .f.e -textvar entry -font $f
 eval pack [winfo childr .f] -side left
 pack .f.e -fill x -expand 1
 canvas .c -width 235 -height 215 -bg white
 eval pack [winfo childr .] -anchor w
 proc redo {w page} {
  $w delete all
  set xd [split 0123456789ABCDEF  ""]
  set y 10
  foreach i $xd {
   # Skip the first row of page 00 (control characters U+0000..U+000F)
   if !0x$page$i continue
   set x 10
   $w create text $x $y -text ${i}0
   foreach j $xd {
    incr x 13
    $w create text $x $y -text [subst \\u$page$i$j] -tag char -font $::f
   }
   incr y 13
  }
 }
 proc next incr {
   set p [expr (0x$::page+$incr)%256]
   set ::page [format %02X $p]
 }
 redo .c 00
 bind . <Up> {redo .c [next 1]}
 bind . <Down> {redo .c [next -1]}
 .c bind char <1> {
  append entry [.c itemcget current -text]
  .f.e select range 0 end
  clipboard clear
  clipboard append $entry
 }
 focus .f.e
 bind . <Left> {
  exec wish $argv0 &; exit
 }

By clicking on a character you can copy it to the entry widget at top right, but neither selection nor clipboard appear to work to make that string available for pasting into other apps.
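
The browser composes each character from a two-digit hex page, a row digit and a column digit. A minimal sketch of that arithmetic in plain Tcl, without Tk, might look as follows. The helper name pageRows and its return format are assumptions made for illustration; they do not appear in the script above.

```tcl
# Hypothetical helper: return the 16 rows of a Unicode page as a list of
# {label chars} pairs, where label is the row's low-byte offset ("00".."F0").
proc pageRows {page} {
    set rows {}
    scan $page %x hi                 ;# e.g. "AC" -> 172
    for {set r 0} {$r < 16} {incr r} {
        set chars {}
        for {set c 0} {$c < 16} {incr c} {
            # code point = page byte, then row nibble, then column nibble
            set cp [expr {($hi << 8) | ($r << 4) | $c}]
            lappend chars [format %c $cp]
        }
        lappend rows [list [format %X0 $r] $chars]
    }
    return $rows
}

# First row of page AC, where the Hangul syllables start at U+AC00
puts [lindex [pageRows AC] 0]
```

Keeping the code point arithmetic separate from the canvas drawing makes it possible to check the page layout from tclsh before trying it on the device.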


WJP 2008-01-20. The file UnicodeData.txt contains various character properties, such as case mappings, decompositions, character class, and BIDI class, as well as the character names and codepoints, but it is not easy for humans to read. I therefore wrote a browser that makes it easier to find information in this file. It has regular expression search and filtering on a selected property, expands abbreviations, and so forth. If a local copy of UnicodeData.txt is not available, it will download the most recent version at the press of a button. It is too long to include here but can be downloaded from: http://billposer.org/Software/UnicodeDataBrowser.html The tablelist package does much of the work. Here's a screenshot: http://billposer.org/Software/Images/UnicodeDataBrowserOverview.jpg
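
To give an idea of what such a browser has to deal with: each record in UnicodeData.txt is a single line of semicolon-separated fields. The following is only a rough sketch of splitting one record, not code from the browser itself. It assumes Tcl 8.5 or later for dict, and the helper name parseUnicodeDataLine is hypothetical.

```tcl
# Hypothetical helper: split one UnicodeData.txt record into a dict.
# Only a few of the 15 semicolon-separated fields are extracted here.
proc parseUnicodeDataLine {line} {
    set f [split $line ";"]
    return [dict create \
        codepoint [lindex $f 0] \
        name      [lindex $f 1] \
        category  [lindex $f 2] \
        bidi      [lindex $f 4] \
        lower     [lindex $f 13]]
}

# The record for LATIN CAPITAL LETTER A
set rec [parseUnicodeDataLine "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"]
puts "[dict get $rec name] ([dict get $rec category])"
```

The abbreviations in fields such as the general category (Lu) and the BIDI class (L) are what a reader-friendly browser would expand into full descriptions.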

