A (and probably the biggest) internet search engine, reachable e.g. at http://google.com/ . See [Fuzzy Google truth] for an oracle that filters out how many pages Google found for a query.

----

[NEM] 23nov2002 - Can someone explain the impact of the following snippet from the Google terms of service[http://www.google.com/terms_of_service.html] on such [Tcl]-powered use of Google?

"'''No Automated Querying''' You may not send automated queries of any sort to Google's system without express permission in advance from Google. Note that 'sending automated queries' includes, among other things:

   * using any software which sends queries to Google to determine how a website or webpage 'ranks' on Google for various queries;
   * 'meta-searching' Google; and
   * performing 'offline' searches on Google."

----

What part do you wish explained? What the words mean? Why they would do something like this? Why people ignore it?

----

While testing my browser package [http://tkbrowser.sourceforge.net], I discovered that Google rejects any requests from the Tcl [HTTP] package, unless you alter the User-Agent string.

[PT] 13-May-2004: The following script will set the http package useragent string to something useful everywhere. I think the http package should probably use this automatically to avoid causing people unnecessary problems.

 proc SetUseragent {{app {}}} {
     global tcl_platform
     set ua "Mozilla/4.0 ([string totitle $tcl_platform(platform)];\
             $tcl_platform(os)) http/[package provide http]"
     if {[string length $app] > 0} {
         append ua " " $app
     } else {
         append ua " Tcl/[package provide Tcl]"
     }
     http::config -useragent $ua
 }

Produces something like: ''Mozilla/4.0 (Windows; Windows NT) http/2.4.5 Tcl/8.4''

----

For examples of use of the "Google Web API" with Tcl, see [http://cedar.intel.com/cgi-bin/ids.dll/content/content.jsp?cntKey=Generic+Editorial%3a%3aws_google&cntType=IDS_EDITORIAL&catCode=CJA].
Active discussion of the API is available through google.* netnews [http://groups.google.com/groups?group=google.public.web-apis], as well as [news:alt.fan.dejanews].

----

[[Anyone want to add info here about code to search google's [comp.lang.tcl]* newsgroups?]]

[CL] daily uses something like

 set keywords "DDE+Word"
 set URL http://www.deja.com/\[ST_rn=ps]/qs.xp?ST=PS&svcclass=dnyr&firstsearch=yes&preserve=1&QRY=$keywords&defaultOp=AND&DBS=1&OP=dnquery.xp&LNG=english&subjects=&groups=comp.lang.tcl&authors=&fromdate=&todate=&showsort=score&maxhits=25
 ...

[[Anyone want to add info here about code to convert [usenet] message-id strings into google [URL]s?]]

Use http://groups.google.com/groups?as_umsgid=$message_id (replacing $message_id with the usenet message-id of the post in question). Or use http://groups.google.com/groups?selm=$message_id (select message).

Queries based on message-id are most useful for providing a compact url to refer to a particular message that you may have found in a search. The easiest (?) way to get that url is to click on "Original Format" for the particular message. Then copy and paste the url of that page, omitting the "&output=gplain" portion.

[Bob Techentin], moreover, writes "I usually start at the advanced group search (http://groups.google.com/advanced_group_search), enter some terms and search. To get the thread URL, click on the 'View Thread' link and you'll get a frame on the left with the thread view, and a lot of messages on the right. Click the 'No Frame' link, and you'll get a URL with the th=xxx that you can hand edit down to a minimal link."

----

If you check http://www.google.com/apis/api_faq.html , you will see that there is a [WSDL] definition for google, allowing an application using [SOAP] to access the host. Tcl examples appear in various places, including the intel.com publication above. Note that relying on the Web API eliminates the need for [web scraping].
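----

Regarding the message-id question earlier on this page: the two URL forms quoted there (as_umsgid and selm) could be built with a small helper along the following lines. This is only a sketch. The proc name GroupsUrl is made up for illustration, and both stripping the angle brackets and percent-encoding the id are assumptions about what the server accepts, not something the page above confirms.

```tcl
# Hypothetical helper (not from the original page): build a Google Groups
# URL for a usenet message-id, using the as_umsgid / selm forms quoted above.
proc GroupsUrl {message_id {param as_umsgid}} {
    # Assumption: the id is passed without its surrounding angle brackets.
    set id [string trim $message_id "<>"]
    # Percent-encode every byte outside the RFC 3986 unreserved set.
    set encoded ""
    foreach byte [split [encoding convertto utf-8 $id] ""] {
        if {[regexp {^[A-Za-z0-9._~-]$} $byte]} {
            append encoded $byte
        } else {
            append encoded [format %%%02X [scan $byte %c]]
        }
    }
    return "http://groups.google.com/groups?$param=$encoded"
}

# Example with a made-up message-id:
puts [GroupsUrl "<abc123@example.com>"]
# prints http://groups.google.com/groups?as_umsgid=abc123%40example.com
```

Passing selm as the second argument gives the other form. The encoding is done by hand here so the sketch has no dependencies; http::formatQuery from the http package would be another way to build the query string.

----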
[Pat Thoyts], on the chat, writes: Someone has done a [tclsoap] wrapper for the google API. it is here http://gondolin.hist.liv.ac.uk/~cheshire/tclgoogle.html

And here is another at [Googling with SOAP]

----

05Aug03 - [The Tclers Wiki is gone from Google]

----

[How to make short Google URLs]

----

[Category Internet]