Version 4 of Google Translation via http Module

Updated 2007-11-09 10:30:54 by male

I wanted to use Google's translation service in an application that manages English and non-English UI messages, so I experimented a bit.

The code below is rough and has some flaws:

  • any format field inside the text to be translated confuses the translation service, so in my application I replace each format field (e.g. "%ld") with an uppercase token before translating and map the tokens back to the original format fields afterwards
  • the resulting translation may contain HTML entities, so I use htmlparse to replace them with the characters they stand for; this makes the code depend on the external tcllib package htmlparse
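The token mapping mentioned in the first point could look roughly like the sketch below. It is not the code from my application: the procedure names protectFormats and restoreFormats, the token scheme FMT0X, FMT1X, ... and the simplified format field pattern are all assumptions made for illustration. In particular the pattern ignores the space flag and "%%", so adapt it to the format fields you really use.

```tcl
namespace eval ::googleTranslation {
    # Hypothetical helper: replace every format field in the text with an
    # uppercase token. Returns a list of two elements, the protected text
    # and a token -> field map suitable for [string map].
    proc protectFormats {text} {
        # Simplified pattern: flags, width, precision, length modifier,
        # conversion character. The space flag is left out on purpose so
        # that a text like "100% done" is not mistaken for a field.
        set pattern {%[-+#0]*[0-9]*(?:\.[0-9]+)?[hlL]*[diouxXcsfeEgG]}
        set map     [list]
        set result  ""
        set start   0
        set n       0

        foreach range [regexp -all -inline -indices -- $pattern $text] {
            set first [lindex $range 0]
            set last  [lindex $range 1]

            # The trailing X keeps one token from being a prefix of
            # another (FMT1X vs. FMT10X), which matters for [string map].
            set token "FMT${n}X"

            append result [string range $text $start [expr {$first - 1}]] $token
            lappend map $token [string range $text $first $last]

            set start [expr {$last + 1}]
            incr n
        }

        append result [string range $text $start end]

        return [list $result $map]
    }

    # Hypothetical helper: put the original format fields back.
    proc restoreFormats {text map} {
        return [string map $map $text]
    }
}

# Usage without any network access; the "translated" text is a stand-in.
set protected [::googleTranslation::protectFormats "The Alarm '%s (%ld)' occurred"]
puts [lindex $protected 0]
puts [::googleTranslation::restoreFormats \
    "Der Alarm 'FMT0X (FMT1X)' ist aufgetreten" \
    [lindex $protected 1]]
```

Whether the translation service leaves such tokens untouched is not guaranteed, so the restored text should still be checked for leftover tokens.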

Please see the differences in the example translations returned by Google!

Have fun,

Martin male


 package require htmlparse;
 package require http;
 namespace eval ::googleTranslation {
     variable postUrl http://translate.google.com/translate_t?langpair=en%7Cde;

     http::config -useragent {Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.12) Gecko/20070508 Firefox/1.5.0.12};

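     # Progress callback for http::geturl: reports the number of bytes
     # received so far and, if the total size is known, the percentage.
     proc progress {token total count} {
         if {$total == 0} {
             # The server sent no Content-Length, so only the byte count is known
             set output  [format \
                 {-> Received from Google: %ld Bytes} \
                 $count \
             ];
         } else {
             set output  [format \
                 {-> Received from Google: %.1f %% (%ld Bytes)} \
                 [expr {double($count) / $total * 100}] \
                 $count \
             ];
         }

         puts stdout $output;
         flush stdout;

         return;
     }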

     proc translate {text {source en} {destination de}} {
         variable postUrl;

         set query   [http::formatQuery \
             hl          en \
             ie          UTF8 \
             text        $text \
             langpair    $source|$destination \
         ];

         if {[catch {
             set post [http::geturl \
                 $postUrl \
                 -query      $query \
                 -progress   ::googleTranslation::progress \
             ];
         } reason] == 1} {
             error "couldn't translate a text via Google: $reason";
         }

         set success [regexp -- \
             {<div id=result_box dir=ltr>([^<]+)<} \
             [http::data $post] \
             whole translation \
         ];

         http::cleanup $post;

         if {$success == 0} {
             error "couldn't translate a text via Google: error returned from Google's translation service";
         }

         return [htmlparse::mapEscapes $translation];
     }
 }
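The last step of translate hands the extracted text to htmlparse::mapEscapes. As a small illustration of what that call does, here is a sketch with a made-up input string (the string is an assumption, not real output from Google); it needs tcllib to be installed:

```tcl
package require htmlparse

# Made-up example text containing two HTML entities
set raw "Sch&ouml;n &amp; gut"

# mapEscapes replaces the entities with the characters they stand for
set clean [htmlparse::mapEscapes $raw]

puts $clean
```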

Here are some examples:

 % googleTranslation::translate "This is only a test!"
 -> Received from Google: 2668 Bytes
 -> Received from Google: 4098 Bytes
 -> Received from Google: 5528 Bytes
 -> Received from Google: 6958 Bytes
 -> Received from Google: 9100 Bytes
 -> Received from Google: 9100 Bytes
 Dies ist nur ein Test!

 % googleTranslation::translate "The Alarm '%s (%ld)' occurred"
 -> Received from Google: 2668 Bytes
 -> Received from Google: 4098 Bytes
 -> Received from Google: 5528 Bytes
 -> Received from Google: 6958 Bytes
 -> Received from Google: 9260 Bytes
 -> Received from Google: 9260 Bytes
 The Alarm "% s, (%" SVN_REVNUM_T_FMT) 'aufgetreten

 % googleTranslation::translate "The Alarm %s occurred"
 -> Received from Google: 2668 Bytes
 -> Received from Google: 4098 Bytes
 -> Received from Google: 5528 Bytes
 -> Received from Google: 6958 Bytes
 -> Received from Google: 8388 Bytes
 -> Received from Google: 9108 Bytes
 -> Received from Google: 9108 Bytes
 The Alarm% n aufgetreten

 % googleTranslation::translate "The Alarm occurred"
 -> Received from Google: 2668 Bytes
 -> Received from Google: 4098 Bytes
 -> Received from Google: 5528 Bytes
 -> Received from Google: 6958 Bytes
 -> Received from Google: 9093 Bytes
 -> Received from Google: 9093 Bytes
 The Alarm aufgetreten

 % googleTranslation::translate "the alarm occurred"
 -> Received from Google: 2668 Bytes
 -> Received from Google: 4098 Bytes
 -> Received from Google: 5528 Bytes
 -> Received from Google: 6958 Bytes
 -> Received from Google: 9093 Bytes
 -> Received from Google: 9093 Bytes
 Der Alarm aufgetreten