Version 0 of Reformatting Lines of Chinese Text

Updated 2010-01-30 10:23:24 by WJG

WJG (20/Jan/10) Based upon a conundrum posted earlier this week, the script I created to produce column-sorted rows of values was easily modified to reformat Chinese text from the western layout into traditional columns, read top to bottom and running right to left. Here's what I came up with.

#!/bin/sh
# the next line restarts using tclsh \
exec tclsh "$0" "$@"
#---------------

package require Gnocl

# args:
#   data  = text to be formatted into columns (split into single characters)
#   nrows = maximum number of rows to produce
#   pad   = single character used to fill matrix gaps
# returns
#   list of formatted row strings
# note
#   tcl only
proc tabulate_Chinese_Columns {data nrows {pad -} } {

    set r 0 ;# row counter
    set str {} ;# list containing the final formatted rows, returned by proc
    set m 0 ;# maximum string length, used for padding


    # Chinese has no spaces, so split everything into single characters first
    set data [split $data {}]

    # initialise an array to hold output strings
    for {set i 0 } {$i < $nrows } {incr i} { set rows($i) {} }

    # build up the output strings
    for {set i 0} {$i < [llength $data] } {incr i} {
        if {$rows($r) eq {} } {
            set rows($r) "[lindex $data $i]"
        } else {
            set rows($r) "$rows($r)\t[lindex $data $i]"
        }
        incr r
        if {$r ==  $nrows } {set r 0}
    }

    # find the length of the longest row string
    for {set i 0 } {$i < $nrows } {incr i} {
        set l [string length $rows($i)]
        if {$l >= $m} { set m $l}
    }

    # pad shorter rows with the pad character
    for {set i 0 } {$i < $nrows } {incr i} {
        set l [string length $rows($i)]
        if {$l < $m} {
            set rows($i) "$rows($i)\t$pad"
        }
        # reverse each row so that the first column ends up on the right
        set rows($i) [string reverse $rows($i)]
    }

    # build list
    for {set i 0 } {$i < $nrows } {incr i} {
        lappend str $rows($i)
    }

    return $str
}


# the ubiquitous demo script
set txt1 [gnocl::text ]
gnocl::window -child $txt1 -defaultWidth 200 -defaultHeight 300 -title "The Analects"

set str(1) "論語:學而。學而:子曰:學而時習之,不亦說乎?有朋自遠方來,不亦樂乎?人不知而不慍,不亦君子乎?"
# Xue Er:   The Master said,"Is it not pleasant to learn with a constant perseverance
#           and application? Is it not delightful to have friends coming from distant
#           quarters? Is he not a man of complete virtue, who feels no discomposure
#           though men may take no note of him?"

set str(2) "學而:有子曰:其為人也孝弟,而好犯上者,鮮矣;不好犯上,而好作亂者,未之有也。君子務本,本立而道生。孝弟也者,其為仁之本與!"
# Xue Er:   The philosopher You said, "They are few who, being filial and fraternal,
#           are fond of offending against their superiors. There have been none, who,
#           not liking to offend against their superiors, have been fond of stirring
#           up confusion. The superior man bends his attention to what is radical.
#           That being established, all practical courses naturally grow up.
#           Filial piety and fraternal submission! - are they not the root
#           of all benevolent actions?"

set str(3) "學而:子曰:巧言令色,鮮矣仁!"
# Xue Er:   The Master said, "Fine words and an insinuating appearance are
#           seldom associated with true virtue."

lappend data3 $str(1) $str(2) $str(3)

foreach row [tabulate_Chinese_Columns $data3 20 ""] {
    $txt1 insert end \t${row}\t\n
}

Here's the screenshot of what it produced:

http://lh4.ggpht.com/_yaFgKvuZ36o/S2QIYvtBK6I/AAAAAAAAAN4/9Wrk9PR1Ml8/s800/Screenshot-The%20Analects.png
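
For a quick check at the console without Gnocl, the same layout idea can be sketched with plain lists. This is only an illustration under my own assumptions, not the script above: the proc name columnize_rtl and its sep argument are hypothetical, and it computes each cell by index rather than building the rows up and reversing them.

```tcl
# Hypothetical helper, not part of the original script.
# Lays out the characters of $text in columns of $nrows characters,
# read top to bottom, with the columns running right to left.
# Returns a list of row strings.
proc columnize_rtl {text nrows {pad -} {sep ""}} {
    if {$nrows < 1} { error "nrows must be at least 1" }

    set chars [split $text {}]
    set n [llength $chars]

    # number of columns needed to hold every character
    set ncols [expr {($n + $nrows - 1) / $nrows}]

    set out {}
    for {set r 0} {$r < $nrows} {incr r} {
        set row {}
        # walk the columns from last to first so that column 0 is rightmost
        for {set c [expr {$ncols - 1}]} {$c >= 0} {incr c -1} {
            set i [expr {$c * $nrows + $r}]
            if {$i < $n} {
                lappend row [lindex $chars $i]
            } else {
                lappend row $pad
            }
        }
        lappend out [join $row $sep]
    }
    return $out
}

foreach row [columnize_rtl "學而時習之不亦說乎" 4] {
    puts $row
}
# prints:
# 乎之學
# -不而
# -亦時
# -說習
```

Reading the output down the rightmost column and then moving left gives back the original string. Passing "\t" as the sep argument gives tab-separated rows like those inserted into the text widget in the demo.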