http

http is a library package bundled with Tcl that provides a client-side implementation of the HTTP protocol (originally HTTP/1.0; later versions add HTTP/1.1 support, as described below).
http://www.purl.org/tcl/home/man/tcl8.5/TclCmd/http.htm

There is also a tcllib module called autoproxy that attempts to automate the use of HTTP proxy servers in Tcl HTTP client code.

"HTTP" stands for HyperText Transfer Protocol and is the protocol used by the worldwide web (WWW) - see http://www.w3c.org/Protocols/ for more about HTTP itself.


Commands in the http Package

http::config ?options?
http::geturl url ?options?
http::formatQuery key value ?key value ...?
http::reset token ?why?
http::wait token
http::data token
http::error token
http::status token
http::code token
http::ncode token
http::size token
http::meta token
http::cleanup token
http::register proto port command
http::unregister proto
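
In a typical synchronous request these commands combine roughly as follows (a minimal sketch with no error handling; $theURL is an assumption):

 package require http
 set token [http::geturl $theURL]
 if {[http::status $token] eq "ok" && [http::ncode $token] == 200} {
     puts "fetched [http::size $token] bytes; headers: [http::meta $token]"
     set page [http::data $token]
 }
 http::cleanup $token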

Versions and Forks

The version of http distributed in the Tcl 8.4.18 core distribution is 2.5.3.

The Tcl 8.5.2 release provides the http version 2.7 with partial HTTP 1.1 support.

The TclSOAP project also contains a distribution of a proposed version 2.5 [L1 ] (with some HTTP 1.1 support), and tclvfs extends that to a proposed version 2.6 (with some WebDAV support); the latter was merged into the TclSOAP project's version (23 Jun 2003). However, versions 2.5/2.6 seem to have introduced at least one bug (reported against TclSOAP on SourceForge).

Tcl 8.6 includes full HTTP/1.1 support in its http package.

There is also a SourceForge project at [L2 ] to build an HTTP/1.1-capable http package.

snichols This project has not released any files yet as of 11/1/04 and has had 0% activity.

KJN still no releases at 2007-07-22.

There is an HTTP 2.6 package that supports HTTP 1.1, but it is not yet shipped with the Tcl core.

KJN Can anyone verify the statement above?

LV So, has anyone submitted a TIP to take the various forks of the code and create a unified http package with all the working features?


What are some uses people have found for this package?


Minimal downloader (to stdout - RS):

 package require http
 puts [http::data [http::geturl [lindex $argv 0]]]

Bruce Hartweg offers this (slightly paraphrased) minimal to-file version:

    package require http
    http::geturl $theURL -channel [open $theFile w]

along with observations that a more robust version will check for redirects, close channels, http::cleanup, ...
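
A slightly more careful sketch along those lines (redirects still not handled; $theURL and $theFile are assumed to be set already):

 package require http
 set chan [open $theFile w]
 fconfigure $chan -translation binary
 set token [http::geturl $theURL -channel $chan]
 close $chan
 if {[http::ncode $token] != 200} {
     puts stderr "download failed: [http::code $token]"
 }
 http::cleanup $token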

See also Download file via HTTP


One nice feature of the http package is the support of different http transport protocols via the command

   ::http::register proto port command

The initial setting for http itself is as if you had issued the following command:

   ::http::register http 80 ::socket

This can be expanded for https with the tls package:

   package require tls
   ::http::register https 443 ::tls::socket

But you can also override the normal http transport. Suppose you want support for multiple internet/ethernet interfaces in a server that has more than one network card or uses aliased IP addresses ([L3 ]). Just register your own transport command:

   set myIP 192.168.10.1
   ::http::register http 80 [list ::socket -myaddr $myIP]

which just expands the initial behaviour. - TR


It's a simple thing, but I've found Tcl's http package to be the simplest way to discover what Content-Type an HTTP server is sending back with a resource.

DGP - RS: me too, when playing HTTP
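
For example, something along these lines reads it from the response metadata ($theURL is assumed, and the header's capitalization is whatever the server sent):

 package require http
 set token [http::geturl $theURL -validate 1]   ;# HEAD request, no body transferred
 array set meta [http::meta $token]
 puts "Content-Type: $meta(Content-Type)"
 http::cleanup $token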


Here is some code recently mentioned on news:comp.lang.tcl for querying whether a site is alive.

 if { $argc == 0 } {
        set site "http://purl.org/thecliff/tcl/wiki/"
 } else {
        set site [ lindex $argv 0 ]
 }

 package require http 2.3

 # this proc contributed by [Donal Fellows]
 proc geturl_followRedirects {url args} {
     while {1} {
        set token [eval [list http::geturl $url] $args]
        switch -glob [http::ncode $token] {
           30[1237] { ### redirect - handled below ### }
           default  { return $token }
        }
        # A redirect: follow the Location header if the server sent one.
        upvar #0 $token state
        array set meta $state(meta)
        if {![info exists meta(Location)]} {
           return $token
        }
        set url $meta(Location)
        unset meta
     }
 }

 set token [geturl_followRedirects $site -validate 1]
 if {[regexp -nocase {ok} [::http::code $token]]} {
        puts "$site is alive"
 } else {
        puts "$site is dead: [::http::code $token]"
 }
 ::http::cleanup $token

http authentication with Tcl isn't difficult.

Logging in and working with cookies isn't either.
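
For HTTP Basic authentication, for example, one approach (a sketch; $user, $password and $theURL are assumptions, and [binary encode base64] needs Tcl 8.6, with Tcllib's base64 package as the alternative on older versions) is to send the Authorization header yourself:

 package require http
 set auth "Basic [binary encode base64 ${user}:${password}]"
 set token [http::geturl $theURL -headers [list Authorization $auth]]
 # ... use the response (http::data etc.), then:
 http::cleanup $token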


A sample of catching an error when attempting to get a WWW page:

 proc t {url} {
    # ::http::geturl raises an error for e.g. a malformed URL or a failed connection
    if {[catch {set tok [::http::geturl $url]} msg]} {
       puts "oops: $msg"
    } else {
       return $tok
    }
    puts "leaving"
 }

How can you catch an error in a callback? e.g., if I call

   http::geturl $url -command somecommand

any errors raised in somecommand just vanish instead of being passed to bgerror as I expect.
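
One workaround (a sketch, not an answer from the original discussion) is to catch inside the callback and hand any error to bgerror explicitly:

 package require http
 proc somecommand {token} {
     if {[catch {
         # ... real processing of [http::data $token] goes here ...
         puts [http::ncode $token]
     } err]} {
         bgerror $err
     }
     http::cleanup $token
 }
 http::geturl $url -command somecommand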


David Welton gives examples of POSTing HTTP data (that is, use of -query) in a comp.lang.tcl thread [L4 ].


Web Site Status

A simple tool to determine the status of a web site: Type a URL in the "Web Site:" text field, then press return or click the "Get Status" button. The HTTP status, code, filesize (in bytes) and raw HTML output will be displayed for the requested resource.

Bookmarks can be stored in a file called 'status-bookmarks.txt'. The file should reside in the same directory as this application.


Note this [L5 ] deep and productive discussion of keep-alive, certificates, other HTTP 1.1 aspects, and much more. Pat summarized it in http://tclsoap.sourceforge.net/http.html .


Proxy Handling

Identification and handling of proxies can be a pain when using the http package so I'm trying to write a package to handle as much of this as possible - see autoproxy
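
A minimal sketch of its use (assuming the proxy can be discovered from the environment, e.g. the http_proxy variable, or from the Windows registry):

 package require http
 package require autoproxy
 autoproxy::init        ;# discovers the proxy settings and hooks them into the http package
 set token [http::geturl http://www.example.com/]
 http::cleanup $token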


For a higher-level API than the http package provides, see TclCurl or w3m.


Note that ::http::size reports the number of bytes of content that geturl has actually returned. If you use geturl -validate 1 (to get metadata about the page), no content is retrieved, so the size command returns 0. Use the state(totalsize) value (a copy of the Content-Length header) instead.
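
For example (a sketch; $theURL is assumed to be set):

 package require http
 set token [http::geturl $theURL -validate 1]
 upvar #0 $token state
 puts "server-reported Content-Length: $state(totalsize)"
 http::cleanup $token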


Complaint: http blocks while resolving a non-existent host or connecting to an unresponsive server.

DGP This complaint maps to the complaint that socket blocks in the form [socket $host $port] when $host does not exist/respond, even when the -async option is used. This basically further maps into a complaint that gethostbyname() blocks. Other C programs apparently have non-blocking solutions for this. We should discover what those solutions are and see if the socket implementation can make use of them. Andreas Kupries' memory is that we collectively decided the best solution is "to have the core spawn a helper thread to process (and wait for) the 'gethostbyname ()' while the rest of the core goes on crunching." Darren New observes that gethostbyname() can't be trusted to be thread-safe ...

NOTE: (hint) I don't see this problem with socket reported as a bug at SF.

TV It's been a while for me, but isn't that inet function in fact opening a (maybe udp) socket to a dns, which could be select()-able on decent systems?

nl until this is fixed you can use the tcllib dns package to do the DNS lookup and then point http at the IP address (note that this has some implications, such as assuming that your DNS server responds, and that the http library will then send the wrong Host header, but it is usually better than having your application hang on a bad DNS entry).
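
A sketch of that workaround with Tcllib's dns package (www.example.com stands in for the real host; the explicit Host header compensates for the bare IP address in the URL):

 package require dns
 package require http
 # dns::configure -nameserver ... may be needed first; the package default is localhost
 set host www.example.com
 set dtok [dns::resolve $host]
 dns::wait $dtok                 ;# waits, but services the event loop meanwhile
 set ip [lindex [dns::address $dtok] 0]
 dns::cleanup $dtok
 set htok [http::geturl http://$ip/ -headers [list Host $host]]
 http::cleanup $htok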

PT 23June2003: This all assumes that DNS is what is being used. However, there are various ways to resolve hostnames and the local C library resolver knows how to handle them according to local configuration. Maybe we are using a hosts file, maybe we have NIS. It is unfortunately not as simple as this appears - otherwise we'd have fixed it. Ultimately using an external process to do the lookups ala netscape's resolver proxy is likely the only way to avoid this delay.

TV A separate process would still leave you with the delay (when you don't have the answer to the query readily available, you wait for a better alternative, a needed correction, or until your connections to the informing party are no longer cluttered or broken), but at least you could do something else in the meanwhile. That is a major, normal reason for having processes or threads in the context of communication: pacing things.

DKF: A separate process would let you do other things while the delay was happening. You could even keep a pool of helper processes around and use them round-robin fashion.


Parallel Geturl is a package built on top of http to download a large number of URLs in an efficient, parallel manner.


Peter Newman 8 March 2004: Resuming? Does anyone know if it's possible to resume (MP3 downloads) with http. And if so, how? And if it's not possible to resume with http, could you let me know that too. (So I don't have to waste time on a lost cause.) Thanks.

schlenk It is possible if the HTTP server supports range requests and you know the length of the file from the Content-Length header. You just need to add the appropriate HTTP header fields when making the request; see RFC 2616, section 3.12 [L6 ].
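
A hedged sketch of such a resume ($theURL and $partialFile are assumptions, and the server must actually honour byte ranges):

 package require http
 set offset [file size $partialFile]       ;# bytes already downloaded
 set chan [open $partialFile a]
 fconfigure $chan -translation binary
 set token [http::geturl $theURL -channel $chan \
     -headers [list Range bytes=$offset-]]
 close $chan
 # ncode 206 (Partial Content) means the server honoured the range request
 puts [http::ncode $token]
 http::cleanup $token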


MN 13 August 2004: Does anyone know if it's possible to create a web filter with http. And if so, how? Thanks.


LES is not superstitious and asks a question on Friday August 13 2004: What if the download is too large? How is it possible to... er... "cache" the download, i.e. save part of the stream and free up memory?

schlenk The http::geturl command has options for this case: either you give a channel (-channel), so the data is written directly to a file, or you register a progress callback (-progress) to deal with the situation, as sketched below.
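
Both options can be combined, roughly like this ($theURL and $theFile assumed; -progress gets the token, the expected total size and the number of bytes received so far):

 package require http
 proc showProgress {token total current} {
     puts "received $current of $total bytes"
 }
 set chan [open $theFile w]
 fconfigure $chan -translation binary
 set token [http::geturl $theURL -channel $chan -progress showProgress]
 close $chan
 http::cleanup $token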


Voicent Telephone Call Interface is an HTTP client package for making telephone calls from your Tcl/Tk programs using Voicent Gateway. It uses HTTP POST to communicate with the gateway. [L7 ].


http://groups.google.com/group/comp.lang.tcl/browse_thread/thread/9a1a0b7f57141b12/3d834b2849323562#3d834b2849323562 discusses a problem with http version 2.7 .


DKF: To get the title of a webpage, use this:

 package require http
 set token [http::geturl $theURL]
 regexp {(?i)<title>([^<>]+)} [http::data $token] -> title
 http::cleanup $token
 puts "Title was \"$title\""

If you're doing more than getting the title, use tdom and not regexp for the parsing...

MJ - With tdom this becomes:

 package require http
 package require tdom
 set token [http::geturl $theURL]
 set doc [dom parse -html [http::data $token]]   ;# -html: use tdom's HTML parser
 set title [[$doc selectNodes {/html/head/title}] asText]
 $doc delete
 http::cleanup $token
 puts "Title was \"$title\""

Silas Here is probably the simplest example of how to POST HTTP data using the http package:

 package require http
 set url <your url comes here>
 ::http::geturl $url -query [::http::formatQuery field1 value1 field2 value2 field3 value3]

tonytraductor I've used http to build a crossposting blog client (see http://tonyb.us/xpost ) that posts to wordpress, livejournal, tumblr, friendica, and others.

An example, send a post to tumblr:

 # where .txt.txt is a text widget;
 # tags, title and other parameters are set with tk::entry widgets in the gui
 ############################################
 # post to tumblr
 proc tbpost {} {
     set ptext [.txt.txt get 1.0 {end -1c}]

     # authenticate, then post
     set login [::http::formatQuery mode login user $::email password $::tpswd]
     set log [http::geturl http://www.tumblr.com/api/authenticate -query $login]
     http::cleanup $log

     set post [http::formatQuery mode postevent auth_method clear \
         email $::email password $::tpswd type regular generator Xpostulate \
         tags $::tags title $::subject body $ptext]
     set dopost [http::geturl http://www.tumblr.com/api/write -query $post]
     set mymeta [http::meta $dopost]
     set mystat [http::status $dopost]
     set length [http::size $dopost]
     http::cleanup $dopost

     # report the result in a small dialog
     toplevel .rsp
     wm title .rsp "Post Status"
     grid [tk::label .rsp.lbl -text "Tumblr says: $mystat\nPost length: $length"]
     grid [tk::button .rsp.view -text "View Journal" -command {
         set turl "http://$::tname.tumblr.com"
         exec $::brow $turl &
     }] \
     [tk::button .rsp.ok -text "DONE" -command {destroy .rsp}]
 }

( Today I'm trying to get it working with posterous, however, and having difficulty. )