An Internet search engine, and probably the biggest, reachable e.g. at http://google.com

See [Fuzzy Google truth] for an oracle that extracts how many pages Google found for a query.

----

[NEM] 23nov2002 - Can someone explain the impact of the following snippet from the Google terms of service [http://www.google.com/terms_of_service.html] on such Tcl-powered use of Google?

"'''No Automated Querying''' You may not send automated queries of any sort to Google's system without express permission in advance from Google. Note that "sending automated queries" includes, among other things:

   * using any software which sends queries to Google to determine how a website or webpage "ranks" on Google for various queries;
   * "meta-searching" Google; and
   * performing "offline" searches on Google."

While testing my browser package [http://tkbrowser.sourceforge.net], I discovered that Google rejects any requests from the Tcl HTTP package unless you alter the User-Agent string.

----

For examples of use of the "Google Web API" with Tcl, see [http://cedar.intel.com/cgi-bin/ids.dll/content/content.jsp?cntKey=Generic+Editorial%3a%3aws_google&cntType=IDS_EDITORIAL&catCode=CJA]. Active discussion of the API is available through google.* netnews [http://groups.google.com/groups?group=google.public.web-apis], as well as [news:alt.fan.dejanews].

----

[[Anyone want to add info here about code to search Google's comp.lang.tcl* newsgroups?]]

[CL] daily uses something like

 set keywords "DDE+Word"
 set URL http://www.deja.com/\[ST_rn=ps]/qs.xp?ST=PS&svcclass=dnyr&firstsearch=yes&preserve=1&QRY=$keywords&defaultOp=AND&DBS=1&OP=dnquery.xp&LNG=english&subjects=&groups=comp.lang.tcl&authors=&fromdate=&todate=&showsort=score&maxhits=25
 ...

[[Anyone want to add info here about code to convert Usenet message-id strings into Google URLs?]]

Use http://groups.google.com/groups?as_umsgid=$message_id (replacing $message_id with the Usenet message-id of the post in question).
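
The message-id lookup just described could be wrapped in a small Tcl helper. This is only a minimal sketch: the proc name `msgid2url` and its optional `param` argument are made up for illustration, and it assumes the standard `http` package is available so that `http::formatQuery` can percent-encode characters such as `@` in the message-id.

```tcl
package require http

# Hypothetical helper: build a Google Groups URL for a Usenet message-id.
# "param" is the query parameter that carries the id (as_umsgid by default).
proc msgid2url {message_id {param as_umsgid}} {
    # Message-ids are often quoted as <id@host>; drop the angle brackets.
    set id [string trim $message_id "<>"]
    # http::formatQuery percent-encodes the value for use in a URL.
    return "http://groups.google.com/groups?[http::formatQuery $param $id]"
}
```

Calling `msgid2url "<abc123@example.com>"` returns a URL whose `as_umsgid` value is the percent-encoded message-id. The optional second argument names a different query parameter if one is preferred.
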
Or use http://groups.google.com/groups?selm=$message_id (select message).

Queries based on message-id are most useful for providing a compact URL to refer to a particular message that you may have found in a search. The easiest (?) way to get that URL is to click on "Original Format" for the particular message, then copy and paste the URL of that page, omitting the "&output=gplain" portion.

----

If you check http://www.google.com/apis/api_faq.html , you will see that there is a [WSDL] definition for Google, allowing an application using [SOAP] to access the host. Tcl examples appear in various places, including the intel.com publication above. Note that relying on the Web API eliminates the need for [web scraping].

[Pat Thoyts], on the chat, writes: Someone has done a [tclsoap] wrapper for the Google API. It is here: http://gondolin.hist.liv.ac.uk/~cheshire/tclgoogle.html

And here is another at [Googling with SOAP]

----

[How to make short Google URLs]

----

[Category Internet]