Chinese numbers

See Roman numerals for why the term 'numerals' would have been a better fit.

Richard Suchenwirth 2001-12-10 - Another piece from my collection of international number formatters: here's how to convert a non-negative integer into Chinese characters (simplified, as used in the PRC and Singapore; edit the little "dictionary" if you prefer traditional style). I have marked powers of 10 that must be preceded by a digit with a leading !, and used "!2" for the alternative digit 2 (liang), which is used before 1000 and 10000. A run of zero positions is expressed by a single zero character, hence the skipped flag. The number is split into groups of 100000000 and 10000, and each group count is formatted by a recursive call. This routine should deal with numbers up to 2147483647 (32-bit MAXINT) sort of correctly:

 proc zh'format n {
    array set dic {
        0 \u96F6 1 \u4E00 2 \u4E8C 3 \u4E09 4 \u56DB 5 \u4E94
        6 \u516D 7 \u4E03 8 \u516B 9 \u4E5D 10 \u5341 !100 \u767E
        !1000 \u5343 !10000 \u4e07 !100000000 \u4EBF !2 \u4E24
    }
    if {[info exists dic($n)]} {return $dic($n)} ;# easy case quick kill
    set res ""
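    if {$n>=100000000} {
        ;# count of hundred-millions, formatted recursively
        set res [zh'format [expr {$n/100000000}]]$dic(!100000000)
        set n [expr {$n%100000000}]
    }
    if {$n>=10000} {
        ;# count of ten-thousands, formatted recursively
        append res [zh'format [expr {$n/10000}]]$dic(!10000)
        set n [expr {$n%10000}]
    }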
    set skipped 0
    foreach {i c} [list 1000 $dic(!1000) 100 $dic(!100) 10 $dic(10)] {
        if {$n/$i} {
            if {$res ne "" && $skipped} {append res $dic(0)} ;# filler zero
            append res $dic([expr {$n/$i}])$c
            set n [expr {$n%$i}]
            set skipped 0
        } else {
            set skipped 1
        }
    }
    if {$n} {
        if {$skipped} {append res $dic(0)} ;# filler zero
        append res $dic($n)
    }
    regsub ^$dic(1)$dic(10)     $res $dic(10)  res ;# avoid "yishi"
    regsub ^$dic(2)$dic(!10000) $res $dic(!2)$dic(!10000) res
    regsub $dic(2)$dic(!1000)   $res $dic(!2)$dic(!1000)  res
    set res
 }
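
The grouping step that the routine performs before formatting can be looked at in isolation. The following is only a sketch: zh'groups is a hypothetical helper name, not part of the routine above, and it merely returns the three group values instead of formatting them.

```tcl
# Hypothetical helper (an assumption for illustration, not in the original):
# split a non-negative integer into the groups that zh'format handles in turn,
# i.e. the count of 10^8 units, the count of 10^4 units, and the rest below 10000.
proc zh'groups n {
    set yi   [expr {$n / 100000000}]
    set n    [expr {$n % 100000000}]
    set wan  [expr {$n / 10000}]
    set rest [expr {$n % 10000}]
    list $yi $wan $rest
}

puts [zh'groups 2147483647]  ;# 21 4748 3647
puts [zh'groups 10005]       ;# 0 1 5
```

The second call shows why a filler zero is needed: the group below 10000 holds only 5, so the thousands, hundreds and tens positions are all skipped.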

Note, however, that this is more a proof of concept than a practical requirement for Chinese localization. Chinese computer users are able and willing to read numbers in the international style, so the above is useful only in full-text translation - or as a weekend fun project ;-)
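
For the traditional style mentioned in the introduction, one possible shortcut is to post-process the simplified result rather than edit the dictionary. This is a sketch under the assumption that only the characters for 10000, 100000000 and liang differ between the two styles in this dictionary; zh'traditional is a hypothetical name.

```tcl
# Hypothetical post-processor (an assumption, not part of the original code):
# replace the three simplified characters used by the dictionary with their
# traditional forms. The digits, ten, hundred and thousand are left untouched.
proc zh'traditional str {
    string map {
        \u4E07 \u842C
        \u4EBF \u5104
        \u4E24 \u5169
    } $str
}
```

It would be applied to the formatter's output, e.g. zh'traditional [zh'format 20000].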


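RS 2007-02-14 - Here it goes in the opposite direction: convert a Chinese number to an integer by building up an arithmetic expression and evaluating it. Only the digits and the units 10, 100 and 1000 are mapped, so numbers containing wan (10000), yi (100000000) or liang are not handled. If the expression cannot be evaluated, the error message is returned instead of a number: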

 proc cn2int str {
    set x [string map {
         \u96F6 0 \u4E00 1 \u4E8C 2 \u4E09 3 \u56DB 4 \u4E94 5
         \u516D 6 \u4E03 7 \u516B 8 \u4E5D 9 \u5341 *10+ \u767e *100+ \u5343 *1000+
       } $str]
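    set x [string trim $x *+]
    regsub -all {\m0+(\d)} $x {\1} x ;# drop leading zeros: Tcl 8 rejects "08" as invalid octal
    catch {expr $x} res ;# unbraced on purpose: $x holds the expression text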
    set res
 }
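
To see what the string map produces before evaluation, the expression-building step can be run on its own. This is a minimal sketch that repeats the mapping from cn2int; cn2expr is a hypothetical name, and it returns the expression text rather than its value.

```tcl
# Hypothetical helper (an assumption for illustration): return the arithmetic
# expression that the mapping builds, without evaluating it.
proc cn2expr str {
    set x [string map {
        \u96F6 0 \u4E00 1 \u4E8C 2 \u4E09 3 \u56DB 4 \u4E94 5
        \u516D 6 \u4E03 7 \u516B 8 \u4E5D 9
        \u5341 *10+ \u767E *100+ \u5343 *1000+
    } $str]
    # A leading "*" (as in "shi wu", 15) or a trailing "+" would be a syntax error
    string trim $x *+
}

puts [cn2expr \u4E00\u5343\u4E8C\u767E\u4E09\u5341\u56DB]  ;# 1*1000+2*100+3*10+4
puts [cn2expr \u5341\u4E94]                                ;# 10+5
puts [cn2expr \u4E8C\u5341]                                ;# 2*10
```

Because multiplication binds tighter than addition, each digit is multiplied by the unit that follows it and the products are summed, which is how the flat string comes to the right value without any parentheses.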

WJP 2007-02-14 - My uninum library, which has a Tcl interface, provides comprehensive conversion in both directions between Chinese numbers and integers. It handles both regular and legal/financial numerals, traditional and simplified characters, traditional and place-based, differences between Japanese and Chinese practice and among Chinese dialects, etc. It also handles Suzhou numbers and counting rods.


See also: