Things Japanese

Japanese (日本語) is the language spoken in Japan.

See Also

Common questions about Tcl/Tk and Japanese language support
iPAQ goes Japanese, by RS
Tcl/Tk日本語チーム
Roughly translated into English as "Tcl/Tk Japanese translation team".

Nengo calculation

The Japanese calendar restarts its year count at 1 with each accession of an emperor, who gives his era a two-Kanji name, the nengo (see for example http://japanesesword.homestead.com/files/kanji/nengo.htm ). The following nengo are important for recent times:

 Nengo  Long form     Abbreviation  Dates
 Meiji  \u660e\u6cbb  \u337e        1868-1912
 Taisho \u5927\u6b63  \u337d        1912-1926
 Showa  \u662d\u548c  \u337c        1926-1989
 Heisei \u5e73\u6210  \u337b        1989-2019

For these four eras (and four more free positions), Unicode code points have been allocated that render both Kanji as one character. (One might of course also render the two separately, from the regular CJK set.) The Windows font "Arial Unicode MS" has these four characters, so the following code was possible:

 proc nengo {{year ""}} {
    # convert an AD year (default: this) to nengo style 
    if {$year==""} {set year [clock format [clock seconds] -format %Y]}
    if {$year<1868} {
        error "cannot convert year $year, prior to 1868"
    } elseif {$year<1912} {
        incr year -1867; set nengo \u337E ;# Meiji
    } elseif {$year < 1926} {
        incr year -1911; set nengo \u337D ;# Taisho
    } elseif {$year < 1989} {
        incr year -1925; set nengo \u337C ;# Showa
    } else {
        incr year -1988; set nengo \u337B ;# Heisei
    }
    return $nengo$year
 } ;# RS

Richard is only partly right here. The year 1912 is Meiji 45 Nen until 30 July (7 Gatsu 30 Nichi) and Taisho 1 Nen thereafter; similarly, the year 1926 is Taisho 15 Nen until 25 December (12 Gatsu 25 Nichi) and Showa 1 Nen thereafter, and 1989 is Showa 64 Nen for January 1-7 (1 Gatsu Tsuitachi--Nanoka) and Heisei 1 Nen afterward. The era changes with the death of an emperor. --Kevin Kenny

RS: Yes, point taken (I was too lazy to search for the switching dates, and wanted to keep the interface simple). So the above proc, over the 133 years from 1868 to 2001, gets 0.5+0.98+0.02=1.5 nengos wrong, an error rate of 1.1%. We can reduce that to about a third by changing the Taisho test to $year < 1927, and of course to zero by inputting and testing day and month (but clock times only go back to 1902; see Reworking the clock command).

WJP: Also, anyone wanting to work with nengo for earlier periods should be aware that there may be more than one era per Emperor. The era was sometimes changed by a living emperor as a result of an omen. Some emperors' reigns encompass as many as three eras.
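The day-and-month test discussed above can be sketched without the clock command, by comparing plain yyyymmdd integers. This is only an illustration: the name nengoDate is hypothetical, the switching dates are those given by Kevin Kenny (assuming that 30 July 1912 and 25 December 1926 already count as the first day of the new era, and 8 January 1989 as the first day of Heisei), and all of 1868 is treated as Meiji 1, as in the proc above. Like that proc, it knows only the four eras in the table.

```tcl
# Hypothetical helper, not part of the original page.
proc nengoDate {year month day} {
    # Strip leading zeros so that "08" is not misread as an octal number
    scan $year %d year; scan $month %d month; scan $day %d day
    # Era table, newest first: first day (yyyymmdd), first AD year, symbol.
    # Order: Heisei, Showa, Taisho, Meiji. The first days are assumptions
    # taken from the discussion above.
    set eras [list \
        19890108 1989 \u337B \
        19261225 1926 \u337C \
        19120730 1912 \u337D \
        18680101 1868 \u337E]
    set key [expr {$year*10000 + $month*100 + $day}]
    foreach {start firstYear symbol} $eras {
        if {$key >= $start} {
            # The first (partial) year of an era counts as year 1
            return $symbol[expr {$year - $firstYear + 1}]
        }
    }
    error "cannot convert $year-$month-$day, prior to 1868"
}
```

With these assumptions, nengoDate 1989 1 7 gives the Showa symbol followed by 64, and nengoDate 1989 1 8 the Heisei symbol followed by 1.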

Japanese weekday names

This list contains the Unicode code points for the short names (one Kanji each) of what in English are Sunday, Monday, ..., Saturday:

        weekdays,ja {\u65e5 \u6708 \u706b \u6c34 \u6728 \u91d1 \u571f}

(from an i15d date chooser). For the long form, append \u66DC\u65e5 (-yoobi); for an intermediate form, only \u66DC.
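As a rough sketch of how the list and the two suffixes might be used together: the proc name jaWeekday and its form argument are made up for this illustration. It relies on clock format's %w, which numbers weekdays from 0 (Sunday) to 6 (Saturday), matching the order of the list.

```tcl
# Hypothetical helper, not part of the original page.
proc jaWeekday {{seconds ""} {form short}} {
    # Sunday .. Saturday, in the same order as the list above
    set days [list \u65e5 \u6708 \u706b \u6c34 \u6728 \u91d1 \u571f]
    if {$seconds eq ""} {set seconds [clock seconds]}
    # %w yields 0 for Sunday through 6 for Saturday
    set name [lindex $days [clock format $seconds -format %w]]
    switch -- $form {
        short   {return $name}
        medium  {return $name\u66dc}
        long    {return $name\u66dc\u65e5}
        default {error "unknown form \"$form\", should be short, medium or long"}
    }
}
```

Called without arguments it returns the one-Kanji name for today; jaWeekday $t long returns the -yoobi form for the clock value $t.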

Japlish

A functional converter from Romaji (7-bit ASCII letters for hiragana and KATAKANA, the two syllabic writing systems used in Japan) to the corresponding Unicode characters, plus a handful of Kanji words - add more if you need them...

Hiragana to Katakana

These are the two, basically isomorphic, Japanese syllabic "alphabets" (Katakana \u30f5 and \u30f6 have no Hiragana counterpart). See Example scripts everybody should have for how to convert Hiragana in a string to Katakana by just saying

 [tr \u3041-\u309e \u30a1-\u30fe $japaneseString] 

WJP: Sorry, but this does not work for real Japanese text. Hiragana and katakana are isomorphic if you just list the various CV (consonant-vowel) combinations: they each have a symbol for /ka/, a symbol for /ki/, and so on. However, they use different devices for writing long vowels. In hiragana long vowels are written by a CV symbol followed by the appropriate bare V symbol, e.g. <ka><a> for /ka:/, whereas in katakana there is a separate symbol that indicates that the previous vowel is long.
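For reference, the plain code-point shift can also be sketched without the tr helper. The proc name hira2kata is made up for this illustration, and the ranges are an assumption: only the syllable range \u3041-\u3096 and the two iteration marks \u309d-\u309e are shifted by 0x60, so the combining voicing marks in between stay untouched. It does nothing about the long-vowel convention WJP describes, so it is a character-level mapping, not a converter for real Japanese text.

```tcl
# Hypothetical helper, not part of the original page.
proc hira2kata {s} {
    set out ""
    foreach ch [split $s ""] {
        # Numeric code point of this character
        scan $ch %c code
        if {($code >= 0x3041 && $code <= 0x3096)
                || ($code >= 0x309d && $code <= 0x309e)} {
            # Katakana sits exactly 0x60 above the matching Hiragana
            set ch [format %c [expr {$code + 0x60}]]
        }
        append out $ch
    }
    return $out
}
```

For example, hira2kata applied to \u3072\u3089\u304c\u306a (the word "hiragana" written in Hiragana) returns \u30d2\u30e9\u30ac\u30ca; characters outside the two ranges pass through unchanged.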

Input Methods

While there is some support for input methods in Tk, it doesn't yet work on all OSes. But for a quick pure-Tcl solution that does Romaji to Kanji resolution with menus, and e.g. Romaji to Kana translation in a single, half-page string map command, see taiku goes multilingual.

Misc

Japanese dates are written in the format

 (year - Western or nengo) \u5E74 (month) \u6708 (day) \u65e5

The month kanji is read "gatsu". The day kanji is read "nichi" except for some irregular combinations (the 1st of the month is "tsuitachi", the 2nd is "futsuka", etc.).
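A minimal sketch of building such a date string, assuming the year part is passed in ready-made (a Western year or a nengo string); the proc name jaDate is made up for this illustration.

```tcl
# Hypothetical helper, not part of the original page.
proc jaDate {year month day} {
    # year, then the year kanji; month, then the month kanji;
    # day, then the day kanji
    return "$year\u5e74$month\u6708$day\u65e5"
}
```

For example, jaDate 2001 1 15 produces 2001, the year kanji, 1, the month kanji, 15 and the day kanji, in that order.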


-ka is just another reading of the Japanese character for day. A lot of this nonsense comes from the multiple readings of single Japanese characters: at least two, sometimes more, a Chinese-derived and a Japanese-derived reading.

  ka/nichi - same kanji.
  Futsuka  - also means 2nd day, futsu- from the Japanese style counting:
   Chinese style counting        Japanese style counting
       ichi                       hitotsu
       ni                         futatsu
       san                        mitsu
       shi                        yotsu
       go                         itsutsu
       roku                       mutsu
       shichi                     nanatsu
       hachi                      yatsu
       kyu                        kokonotsu
       ju                         to

The irregular days I can remember (there may be some more) are tsuitachi and hatsuka (the 20th). RFox


JBDrill, Jim Breen's simple Japanese Flashcard program at [1 ]


Play with a Japanese abacus (calculator) at TkSoroban


Animated Kanji lets you see the order in which some Kanji are written, stroke by stroke.


A Japanese Tcl/Tk wiki seems to be at: http://purl.oclc.org/NET/jtclwiki/


TR: These are the Unicode code points of all 214 Bushu radicals:

 set bushuCodes {
        4e00 4e28 4e36 4e3f 4e59 4e85 4e8c 4ea0 4eba 513f
        5165 516b 5182 5196 51ab 51e0 51f5 5200 529b 52f9
        5315 531a 5338 5341 535c 5369 5382 53b6 53c8 53e3
        56d7 571f 58eb 5902 590a 5915 5927 5973 5b50 5b80
        5bf8 5c0f 5c22 5c38 5c6e 5c71 5ddb 5de5 5df1 5dfe
        5e72 5e7a 5e7f 5ef4 5efe 5f0b 5f13 5f50 5f61 5f73
        5fc3 6208 6238 624b 652f 6534 6587 6597 65a4 65b9
        65e0 65e5 66f0 6708 6728 6b20 6b62 6b79 6bb3 6bcb
        6bd4 6bdb 6c0f 6c14 6c34 706b 722a 7236 723b 723f
        7247 7259 725b 72ac 7384 7389 74dc 74e6 7518 751f
        7528 7530 758b 7592 7676 767d 76ae 76bf 76ee 77db
        77e2 77f3 793a 79b8 79be 7a74 7acb 7af9 7c73 7cf8
        7f36 7f51 7f8a 7fbd 8001 800c 8012 8033 807f 8089
        81e3 81ea 81f3 81fc 820c 821b 821f 826e 8272 8278
        864d 866b 8840 884c 8863 897e 898b 89d2 8a00 8c37
        8c46 8c55 8c78 8c9d 8d64 8d70 8db3 8eab 8eca 8f9b
        8fb0 8fb5 9091 9149 91c6 91cc 91d1 9577 9580 961c
        96b6 96b9 96e8 9752 975e 9762 9769 97cb 97ed 97f3
        9801 98a8 98db 98df 9996 9999 99ac 9aa8 9ad8 9adf
        9b25 9b2f 9b32 9b3c 9b5a 9ce5 9e75 9e7f 9ea5 9ebb
        9ec3 9ecd 9ed1 9ef9 9efd 9f0e 9f13 9f20 9f3b 9f4a
        9f52 9f8d 9f9c 9fa0
 }

In order to see them, just use this:

 pack [text .t]
 foreach char $bushuCodes {.t insert end "[subst \\u$char] "}

Some Tcl-Tk software in Japanese [2 ]