Richard Suchenwirth 2007-06-27 - http://www.fallingrain.com/world/ (Copyright 1996-2004 by Falling Rain Genomics, Inc.) provides a very large, publicly accessible gazetteer of the world's cities and airports - they must have millions of entries available in HTML format. To keep individual pages from getting too big, they use a URL tree that is very deep in places. For instance, to locate my city Konstanz, the URL is
http://www.fallingrain.com/world/a/K/o/n/s/t/a/n/z/
In other cases, short prefixes are sufficient, e.g. all 131 airports whose code starts with ED (plus some others) are delivered by the URL
http://www.fallingrain.com/world/a/E/D/
So to search for a place one has to iterate the URL, appending letter after letter (or its decimal Unicode if it is outside of ASCII) until a match is found. Here's a proc that does this - called with a place name, it returns a list of hits, where each hit is a list of
name type region country lat lon elevation(ft) population(est)
#!/usr/bin/env tclsh
package require http

proc geo'get'rain placename {
    set url http://www.fallingrain.com/world/a/
    set res {}
    foreach c [split $placename ""] {
        # Descend one URL level per character; characters with a code
        # below 65 or above 127 are replaced by their decimal code
        set i [scan $c %c]
        if {$i < 65 || $i > 127} {set c $i}
        append url $c/
        set token [http::geturl $url]
        set page [http::data $token]
        http::cleanup $token
        foreach line [split $page \n] {
            if {[string match <tr* $line]} {
                set line [string map {<td> \x80 </tr> "" ")) (( " ""} $line]
                set fields [split $line \x80]
                # Skip rows without a link in the first cell (name would be unset or stale)
                if {![regexp {<a.+>(.+)</a>} [lindex $fields 1] -> name]} continue
                if {[string match $placename* $name]} {
                    lappend res [linsert [lrange $fields 2 end] 0 $name]
                }
            }
        }
    }
    set res
}

#-- If this script is called as toplevel, the function is called, and results displayed:
if {[file tail [info script]] eq [file tail $argv0]} {
    puts [join [geo'get'rain $argv] \n]
}
Testing this as a command-line tool:
/_Ricci> geo_rain.tcl Stockel
Stockel city {Province de )) (( Brabant} Belgium 50.8333333 4.45 262 309844
Stockelanda city {(( Alvsborgs Lan ))} Sweden 58.65 12 354 935
Stockels city {Land Hessen} Germany 50.5666667 9.7333333 1049 14658
Stockelsberg city {Land Bayern} Germany 49.3833333 11.2666667 1312 22990
Stockelsdorf city {Land Schleswig-Holstein} Germany 53.9 10.65 49 49614
Stockelweingarten city {Bundesland Karnten} Austria 46.6694444 13.9377778 1558 9777
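Each hit printed above is a plain Tcl list in the field order given earlier. If named access is more convenient, a hit could be turned into a dict. The following is only a sketch: the proc name geo'hit'dict and the key names are made up for this illustration and are not part of the original script, and it assumes Tcl 8.5 or later for the dict command.

```tcl
# Hypothetical helper (not in the original script): map one hit
#   name type region country lat lon elevation(ft) population(est)
# onto a dict with named keys. Key names are an assumption.
proc geo'hit'dict hit {
    set keys {name type region country lat lon elevation_ft population}
    set d [dict create]
    # Missing trailing fields simply become empty strings
    foreach key $keys value $hit {
        dict set d $key $value
    }
    return $d
}

# Usage with one of the result lines shown above
set hit {Stockelsdorf city {Land Schleswig-Holstein} Germany 53.9 10.65 49 49614}
set d [geo'hit'dict $hit]
puts "[dict get $d name] ([dict get $d country]): [dict get $d lat] N, [dict get $d lon] E"
```

With the sample hit this prints the name, country and coordinates of Stockelsdorf; the braced region field stays intact as a single dict value.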
The repeated queries take their time, but the result is worth waiting for :^) The population figures are sometimes a bit high, because they are reported to cover a 7 km radius around the point.
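The waiting time comes from the URL scheme: one request is made per character of the place name. To see which URLs would be visited without touching the network, the URL-building part of geo'get'rain can be isolated. This is a minimal sketch; the proc name geo'rain'urls and its optional base argument are assumptions made for illustration, while the character rule is copied from the proc above.

```tcl
# Sketch: return the list of URLs that would be requested for a place
# name, one per character. No HTTP access happens here.
proc geo'rain'urls {placename {base http://www.fallingrain.com/world/a/}} {
    set url $base
    set urls {}
    foreach c [split $placename ""] {
        set i [scan $c %c]
        # Same rule as in geo'get'rain: characters with a code below 65
        # or above 127 are replaced by their decimal code
        if {$i < 65 || $i > 127} {set c $i}
        append url $c/
        lappend urls $url
    }
    return $urls
}

puts [join [geo'rain'urls Konstanz] \n]
```

For Konstanz this yields eight URLs, the last of which is the full path http://www.fallingrain.com/world/a/K/o/n/s/t/a/n/z/ quoted at the top of the page.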
Also, the "region" field contains nonsense for some countries, e.g. the UK (almost always Aberdeen), France (usually Alsace) and Liechtenstein (always Balzers) - it looks as if the alphabetically first region is returned where the data is missing. For Germany, the US, etc. things look better.
DKF: Note that for large cities, the population returned can also be too small.
ninovillari - 2012-11-02 14:05:21
Hello, I have noticed that Estonia is missing from http://www.fallingrain.com/world/ (I started by searching for EE, as I thought you might have used this code), while Tallinn can be found at http://www.fallingrain.com/icao/EETN.html .