Keith Vetter 2003-11-12 : Babelfish is sorely missing a key language, one known to many school kids, and that's Pig Latin. Luckily Tcl can come to the rescue. See Tclay - Ooltay Ommandcay Anguagelay for a famous example text.
Everything you type into the left text box will be automatically translated into Pig Latin in the right text box. It only works with ASCII characters and is not the smartest about deciding what exactly is a word but that's left as an exercise for the reader.
For the technically minded, I just learned the linguistic terms describing where the split occurs: onset is the initial consonant cluster and the rime is the remainder of the word.
##+########################################################################## # # piglatin.tcl -- translates english text into pig latin # by Keith Vetter, Nov 12, 2003 # package require Tk package require textutil proc PigLatinizeWord {word} { if {! [regexp -nocase {^[a-z]} $word]} { return $word } ;# Not alpha # Tricky part is keeping case consistent set case tolower if {[regexp {^[A-Z]} $word]} {set case totitle} if {[regexp {^[A-Z]{2,}} $word]} {set case toupper} if {[regexp -nocase {^[aeiou]} $word]} { set word "w$word"} ;# Initial vowel regexp -nocase {^([^aeiou][^aeiouy]*)(.*)} $word => onset rime set piglatin "$rime${onset}ay" return [string $case $piglatin] } proc DoText {} { set txt [string trimright [.tplain_org get 1.0 end]] # Not the smartest way to split into words but simple and good enough set words [::textutil::splitx $txt {(\W)}] ;# Split into words .tpig config -state normal .tpig delete 1.0 end foreach word $words { .tpig insert end [PigLatinizeWord $word] } .tpig config -state disabled } proc DoDisplay {} { wm title . "English to Pig Latin Translation" label .lplain -text "Plain Text" .lplain configure -font "[font actual [.lplain cget -font]] -weight bold" label .lpig -text "Pig Latin Text" .lpig configure -font "[font actual [.lpig cget -font]] -weight bold" text .tplain -width 50 -height 30 -wrap word text .tpig -width 50 -height 30 -wrap word -bg #dbe1ff -state disabled canvas .plain2pig -width 40 -height 20 -bd 0 -highlightthickness 0 .plain2pig create line 5 10 35 10 -tag z -arrow last -width 5 grid .lplain x .lpig -row 0 grid .tplain x .tpig -sticky news grid config .plain2pig -row 1 -column 1 -padx 5 grid columnconfig . {0 2} -weight 1 grid rowconfig . 1 -weight 1 focus .tplain rename .tplain .tplain_org proc .tplain {cmd args} { set rval [eval .tplain_org $cmd $args] if {$cmd == "insert" || $cmd == "delete"} { DoText } return $rval } } DoDisplay set sample {NAME Tcl - Tool Command Language SYNOPSIS Summary of Tcl language syntax. DESCRIPTION The following rules define the syntax and semantics of the Tcl language: [1] Commands. A Tcl script is a string containing one or more commands. Semi-colons and newlines are command separators unless quoted as described below. Close brackets are command terminators during command substitution (see below) unless quoted. ... } .tplain insert end $sample
RS: Super-cute!
TV Is this transformation strictly invertable? It seems redundant, but eh, we don't ask 'compress' for that concerning normal language either..
KPV It's almost but not quite invertable due to the vowel rule, specifically any word that starts with a "w" and is also a word when that "w" is beheaded will both map to the same pig latin word. For example, wad and ad both map to adway. A quick search of a word list I have came up with about 120 such words, my favorites being witch, wassail and weighty.
KPV I just realized that there's a whole second class of non-invertable words. These usually are pronounced differently but are spelt the same. This class is most easily described by example: thin and hint both become inthay in pig latin (they differ in pronunciation due to where the syllable is split). Other fun examples include kiss/skis and quirts/squirts. In my word 95,000 entry word list I get 444 words which are non-introvertable.
Donald Arseneau That second class shows a "bug" in the pig latin rules. Since there are rules for leading sh and th, there needs to be a rule for a leading h coupled with a terminal s or t. Omitting the h would make sense (but not help ensure reversible mappings).
lexfiend 2008-02-05: Somehow, Tcl --> Tclay doesn't seem right to me. Given that it's an acronym and conventionally pronounced "Tickle", I would've thought the correct transformation should be Icltay or something similar.