http

'''[http://www.tcl.tk/man/tcl/TclCmd/http.htm%|%http]''', a package bundled into [Tcl] as part of the [Official library of extensions], is a client-side implementation of the [Hypertext Transfer Protocol%|%HTTP]/[http 1.1%|%1.1] protocol.



** Documentation **

   [http://www.tcl.tk/man/tcl/TclCmd/http.htm%|%official reference]:   



** Commands **
   [http::geturl]:   Performs an HTTP transaction.

** Other Packages That Provide an HTTP Client **
   [https://github.com/RubyLane/rl_http%|%rl_http]:   REST-capable, entirely non-blocking HTTP client library, by RubyLane.
   [https://chiselapp.com/user/schelte/repository/www%|%www]:   by [Schelte Bron] implements [HTTP 1.1], [HTTP/2], and [WebSocket].  Takes care of redirects, retries, cookies, and proxies.
   [TclCurl]:   a Tcl binding to the external binary libcurl which supports HTTP and many other protocols
   [w3m]:   commands that use [Expect] to control the text-mode browser w3m

** Extensions **
   [HTTPS]:   with the twapi (Windows only) or tls (all platforms) package
   [Simple HTTP Authentication Wrapper for http::geturl RFC 2617]:   
   [http authentication]:   covers "HTTP Basic Authentication", "Digest Authentication" and others.
   [cookies]:   see also http version 2.10 (with Tcl 8.7) which has a built-in implementation of cookies

   [http::POST]:   a helper for the HTTP POST method

   [https://core.tcl-lang.org/tcllib/doc/trunk/embedded/md/tcllib/files/modules/websocket/websocket.md%|%WebSocket]:   library for [WebSocket] on clients and servers.  Part of [Tcllib].  Earlier version at [WebSocket Client Library].

   [https://github.com/ecky-l/tclwebsocket%|%tclwebsocket]:   by Eckhard Lehmann exposes a WebSocket as a Tcl channel

   [https://core.tcl-lang.org/tcllib/doc/trunk/embedded/md/tcllib/files/modules/rest/rest.md%|%rest]:   define REST web APIs and call them inline or asynchronously.  Part of [Tcllib].

   [https://core.tcl-lang.org/tcllib/doc/trunk/embedded/md/tcllib/files/modules/amazon-s3/S3.md%|%S3]:   Amazon S3 Web Service Interface.  Part of [Tcllib].

   [https://core.tcl-lang.org/tcllib/doc/trunk/embedded/md/tcllib/files/modules/http/autoproxy.md%|%autoproxy]:   Automatic HTTP proxy usage and authentication.  Part of [Tcllib].

   [TclSOAP]:   implementation of SOAP, XML-RPC and JSON-RPC.

   [WebServices]:   WebServices for Tcl

** Examples Using package http **

   [Playing HTTP], by [Richard Suchenwirth]:   

   [An HTTP robot in Tcl]:   

   [Download file via HTTP]:   

   [Downloading a File over HTTP]:   

   [File Upload with tcl's http]:   

   [Google Translation via http Module]:   

   [HTML to DOM via http/htmlparse/struct::tree packages]:   
   [cookies]:   also easy.

   [TclCurl]:   a higher-level API than http

   [w3m]:   a higher-level API than http

   [Parallel Geturl]:   a package built on top of http to download a large number of URLs in an efficient, parallel manner.

   [Voicent Telephone Call Interface]:   an HTTP client package for making telephone calls from your Tcl/Tk programs using Voicent Gateway.  It uses HTTP POST to communicate with the gateway. [http://www.voicent.com].
   [Official library of extensions]:   

   [single command http fetcher]:   

   [Tcl chatroom snaphost history (2)]:   
      [Tclers kChat Tk GUI]:   


   [Getting stock quotes over the internet]:   

   [Uploading files to Flickr]:   

   [http://groups.google.com/group/comp.lang.tcl/browse_thread/thread/b46f7f3880ad42a5/16607ff5b9bac959?lnk=gst&q=http+keep-alive+#16607ff5b9bac959%|%TclSOAP & SSL -- Can I reuse connections?], [comp.lang.tcl], 2001-11-28:   A deep and productive discussion of keep-alive, certificates, other HTTP 1.1 aspects, and much more.  Pat summarized it in [http://tclsoap.sourceforge.net/http.html].   Note that connection re-use and pipelining were greatly improved in http 2.9.

   [Web Site Status]:   A simple tool to determine the status of a web site: Type a URL in the "Web Site:" text field, then press return or click the "Get Status" button.  The HTTP status, code, filesize (in bytes) and raw HTML output will be displayed for the requested resource.  Bookmarks can be stored in a file called 'status-bookmarks.txt'.  The file should reside in the same directory as this application.
   [https://github.com/pavel-demin/srmlite%|%srmlite]:   A lightweight implementation of the Storage Resource Management (SRM) interface for POSIX-compliant file systems. [XOTcl] is used for implementing high-level logic and for gluing together several technologies such as Grid Security Infrastructure (GSI), [HTTP], [XML] and [SOAP]. ''(Is this a server-side project?  Does it use the http client package? - [KJN])''

** Synopsis **

    :   '''http::config''' ?''options''?

    :   '''[http://wiki.tcl.tk/24061%|%http::geturl]''' ''url'' ?''options''?

    :   '''[http://wiki.tcl.tk/21934%|%http::formatQuery]''' ''key value'' ?''key value ...''?

    :   '''http::reset''' ''token'' ?''why''?

    :   '''http::wait''' ''token''

    :   '''http::data''' ''token''

    :   '''http::error''' ''token''

    :   '''http::status''' ''token''

    :   '''http::code''' ''token''

    :   '''http::ncode''' ''token''

    :   '''http::size''' ''token''

    :   '''http::meta''' ''token''

    :   '''http::cleanup''' ''token''

    :   '''http::register''' ''proto port command''

    :   '''http::unregister''' ''proto''

    :   '''http::registerError''' ''port ?message?''



** Version History and Forks **

Version 2.8.5, with full HTTP/1.1 support, is distributed with Tcl 8.6.

Version 2.7, with partial HTTP/1.1 support, is distributed with [Tcl] 8.5.2.

Version 2.5.3 is distributed with [Tcl] 8.4.18.

An early proposal for http 2.5 (with some [http 1.1] support) occurred in the [TclSOAP] project -
see [http://tclsoap.sf.net/http.html%|%proposed version 2.5].  The [tclvfs] project extended
that to a proposed version 2.6 (with the -method option to allow callers to use WebDAV methods).
This change was merged into the [TclSOAP] project's version (2003-06-23); the -method option was
added to the official http before version 2.7.  Both [TclSOAP] and [tclvfs] now use the official
http package bundled with Tcl, although "http 2.6" remains as unused code in the [tclvfs] source tree.


** Description **


`::http::size` is the number of bytes of HTML that geturl has returned.
`geturl -validate 1` returns the metadata about the page, and since no html
has been retrieved, `::http::size` returns `0`.  In this case
`$state(totalsize)` can be used.
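
A minimal sketch of the point above, assuming the server sends a Content-Length header. The proc name `remoteSize` is made up for this illustration; `totalsize` is the state array element mentioned above.

======tcl
package require http

# Hypothetical helper: request the headers only and report the size the
# server announces, without downloading the body.
proc remoteSize {url} {
    set token [http::geturl $url -validate 1]
    # The token is the name of a global array that holds the request state.
    upvar #0 $token state
    set announced $state(totalsize)   ;# copy of Content-Length, 0 if absent
    set received  [http::size $token] ;# 0 here, because no body was fetched
    http::cleanup $token
    return [list $announced $received]
}
======

Usage would look like `lassign [remoteSize $someUrl] announced received`.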

One nice feature of the http package is the '''support of different http
transport protocols''' via the command:

======
::http::register proto port command
======

The initial setting for `http` itself is as if the following command were
issued:

======
::http::register http 80 ::socket
======

This can be expanded for [HTTPS] with the [tls] package:

======
package require tls
::http::register https 443 ::tls::socket
======

For websites that have disabled support for [SSL], including version 3, the
following should work:

======
::http::register https 443 [list ::tls::socket -tls1 1]
======
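
Many current servers also expect the client to send the host name during the TLS handshake (SNI). The following is a sketch only, assuming an installed [tls] version that supports the `-autoservername` option; the proc name `httpsStatus` is hypothetical.

======tcl
package require http
package require tls

# Assumption: this tls version knows -autoservername, which passes the
# host name to the server during the handshake.
http::register https 443 [list ::tls::socket -autoservername true]

# Hypothetical helper: return the numeric status code of an HTTPS request.
proc httpsStatus {url} {
    set token [http::geturl $url -timeout 10000]
    set ncode [http::ncode $token]
    http::cleanup $token
    return $ncode
}
======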

----

It is also possible to overwrite the normal http transport protocol.  For
example, to get support for multiple internet/ethernet interfaces in a server
that has more than one network card or uses aliased IP addresses
([http://www.linuxdig.com/howto/ldp/IP-Alias.php]),  register another version
of http:

======
set myIP 192.168.10.1
::http::register http 80 [list ::socket -myaddr $myIP]
======

[TR]: which just expands the initial behaviour.



** A POST Request **

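[Silas]: Here is probably the easiest example of how to POST HTTP data using
the http package:

======
package require http
set url http://example.com/form   ;# placeholder: put your URL here
# supplying -query makes geturl issue a POST instead of a GET
set token [::http::geturl $url -query [::http::formatQuery field1 value1 field2 value2 field3 value3]]
puts [::http::data $token]
::http::cleanup $token
======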

----

[David Welton] gives examples of POSTing HTTP data (that is, use of -query) in [http://groups.google.com/group/comp.lang.tcl/browse_thread/thread/78aaf163e0303b82/a546c0e70f83b118?lnk=gst&q=david+welton+http+post+%22-query%22#a546c0e70f83b118%|%Web scraping with Tcl help anyone?], [comp.lang.tcl], 2002-01-18



** Examples **


[RS]: '''Minimal downloader''' to stdout:

======
package require http
puts [http::data [http::geturl [lindex $argv 0]]]
======

[Bruce Hartweg] offers this (slightly paraphrased) minimal to-file version:

======
package require http
http::geturl $theURL -channel [open $theFile w]
======
along with observations that a more robust version will check for redirects, close
channels, http::cleanup, ...
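
One way such a more robust version might look is sketched below. It assumes Tcl 8.6 (for `try`), the proc name `download` and the limit of five redirects are arbitrary choices, and relative `Location` headers are not resolved.

======tcl
package require http

# Hypothetical helper: save $url to $file, following at most $maxRedirects
# redirects. The channel is always closed and the token always cleaned up.
proc download {url file {maxRedirects 5}} {
    for {set i 0} {$i <= $maxRedirects} {incr i} {
        set chan [open $file wb]
        try {
            set token [http::geturl $url -channel $chan -binary 1]
        } finally {
            close $chan
        }
        set ncode [http::ncode $token]
        set location {}
        # Header names are compared case-insensitively.
        foreach {name value} [http::meta $token] {
            if {[string equal -nocase $name Location]} {
                set location $value
            }
        }
        http::cleanup $token
        if {$ncode in {301 302 303 307 308} && $location ne {}} {
            set url $location
            continue
        }
        if {$ncode != 200} {
            file delete -- $file
            error "download of $url failed with HTTP status $ncode"
        }
        return $file
    }
    file delete -- $file
    error "too many redirects, last URL was $url"
}
======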

----

[DKF]: To '''get the title of a webpage''', use this:

======
package require http
set token [http::geturl $theURL]
regexp {(?i)<title>([^<>]+)} [http::data $token] -> title
http::cleanup $token
puts "Title was \"$title\""
======

If you're doing more than getting the title, use [tdom] and not [[`[regexp]`]
for the parsing...

[MJ]: With [tdom] this becomes:

======
package require http
package require tdom
set token [http::geturl $theURL]
set doc [dom parse -html [http::data $token]]
set title [[$doc selectNodes {/html/head/title}] asText]
$doc delete
http::cleanup $token
puts "Title was \"$title\""
======


----

A sample of catching an error when attempting to get a WWW page:

======
proc t url {
    if {[catch {set tok [::http::geturl $url]} msg]} {
        puts "oops: $msg"
    } else {
        return $tok
    }
    puts leaving
}
======
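
Note that [catch] only traps errors raised by `http::geturl` itself, such as an unknown host or a refused connection. Timeouts and HTTP error replies are reported through the token. A sketch that checks both, assuming Tcl 8.6 and a hypothetical proc name `fetch`:

======tcl
package require http

# Hypothetical helper: return the body of $url or raise a descriptive error.
proc fetch {url {timeout 10000}} {
    # Connection-level failures are raised by geturl itself.
    if {[catch {http::geturl $url -timeout $timeout} token]} {
        error "cannot connect to $url: $token"
    }
    try {
        # Transaction-level failures are reported through the token.
        switch -- [http::status $token] {
            ok      {}
            timeout {error "request to $url timed out"}
            eof     {error "server closed the connection without replying"}
            default {error "request failed: [http::error $token]"}
        }
        if {[http::ncode $token] != 200} {
            error "unexpected reply: [http::code $token]"
        }
        return [http::data $token]
    } finally {
        # Runs on success and on error alike.
        http::cleanup $token
    }
}
======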

----

[DGP]:  It's a simple thing, but I've found use of Tcl's http package the
simplest way to discover what Content-Type an HTTP server is sending back with
the resource.

[RS]: me too, when [playing HTTP]



** A Cross-Posting Blog Client **

[tonytraductor]: I've used http to build a crossposting blog client (see
http://tonyb.us/xpost) that posts to wordpress, livejournal, tumblr, friendica,
and others.

An example, send a post to tumblr:

======
# where .txt.txt is a text widget,
# tags, title and other parameters set with tk::entry widgets in the gui
############################################
# post to tumblr
proc tbpost {} {

set ptext [.txt.txt get 1.0 {end -1c}]

set login [::http::formatQuery mode login user $::email password $::tpswd ]
set log [http::geturl http://www.tumblr.com/api/authenticate -query $login]
    
set post [http::formatQuery mode postevent auth_method clear email $::email password $::tpswd type regular generator Xpostulate tags $::tags title $::subject body $ptext]

set dopost [http::geturl http://www.tumblr.com/api/write -query $post]
set mymeta [http::meta $dopost]
set mystat [http::status $dopost]
set length [http::size $dopost]

toplevel .rsp 
wm title .rsp "Post Status"
grid [tk::label .rsp.lbl -text "Tumblr says: $mystat\nPost length: $length"]
grid [tk::button .rsp.view -text "View Journal" -command {
    set turl "http://$::tname.tumblr.com"
    exec $::brow $turl &
}]\
[tk::button .rsp.ok -text "DONE" -command {destroy .rsp}]

}
======


Today I'm trying to get it working with posterous, however, and having difficulty.



** Restarting A Download **

[LES]: is not superstitious and asks a question on 2004-08-13: ''What
if the download is too large?  How is it possible to... er... "cache" the
download, i.e. save part of the stream and free up memory?''

[schlenk]: The http geturl method has various options for this special case.
Either you give a channel, so the data is written directly to a file for
example, or you register a special progress callback to deal with the
situation.
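
Both options can be combined. The following sketch assumes Tcl 8.6; the proc names `showProgress` and `saveTo` are made up, and the block size is an arbitrary choice.

======tcl
package require http

# Progress callback: http calls it after each block of data arrives.
# $total is the announced Content-Length (0 if unknown), $current the
# number of bytes received so far.
proc showProgress {token total current} {
    if {$total > 0} {
        puts -nonewline stderr "\r[expr {100 * $current / $total}]%"
    } else {
        puts -nonewline stderr "\r$current bytes"
    }
}

# Hypothetical helper: stream $url straight to $file, so the body is
# never held in memory.
proc saveTo {url file} {
    set chan [open $file wb]
    try {
        set token [http::geturl $url -channel $chan -binary 1 \
            -blocksize 65536 -progress showProgress]
        set status [http::status $token]
        http::cleanup $token
    } finally {
        close $chan
    }
    return $status
}
======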

----

[Peter Newman] 2004-03-08 : '''Resuming?''' Does anyone know if it's possible
to resume (MP3 downloads) with [[`http`]. And if so, how? And if it's not
possible to resume with [http], could you let me know that too.  (So I don't
have to waste time on a lost cause.) Thanks.

[schlenk]: It is possible if the http server supports range requests and you
know the length of the file from the content length headers. You just need to
add the appropriate HTTP header fields when doing the request, see the RFC 2616
3.12 [http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.12].
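
A sketch of such a range request, assuming Tcl 8.6 and a server that supports byte ranges. The proc name `resume` is hypothetical.

======tcl
package require http

# Hypothetical helper: continue an interrupted download of $url into $file.
proc resume {url file} {
    set offset [expr {[file exists $file] ? [file size $file] : 0}]
    set chan [open $file ab]
    try {
        # Ask only for the bytes from $offset to the end of the resource.
        set token [http::geturl $url -channel $chan -binary 1 \
            -headers [list Range bytes=$offset-]]
        set ncode [http::ncode $token]
        http::cleanup $token
    } finally {
        close $chan
    }
    # 206 Partial Content means the server honoured the Range header.
    # A 200 means it ignored the header and sent the whole resource, which
    # this sketch has then wrongly appended to the partial file.
    if {$offset > 0 && $ncode != 206} {
        error "server did not honour the range request (status $ncode)"
    }
    return [file size $file]
}
======

A production version would check for range support first, for example by looking for an `Accept-Ranges` header in a `-validate 1` request.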



** Proxy Handling **

Identification and handling of proxies can be a pain when using the http
package so I'm trying to write a package to handle as much of this as possible
- see [autoproxy]
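
Typical use of [autoproxy] is short. This is a sketch, assuming [Tcllib] is installed:

======tcl
package require http
package require autoproxy   ;# part of Tcllib

# Pick up the proxy settings from the environment (http_proxy, no_proxy)
# or, on Windows, from the registry, and install them in the http package.
autoproxy::init

# Optional, and an assumption about your setup: route HTTPS requests
# through the proxy as well. This needs the tls package.
# package require tls
# http::register https 443 autoproxy::tls_socket
======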



** Blocking Behaviour when Resolving a Host Address **


Complaint:  http blocks while resolving non-existent and disconnected server.

[DGP]: This complaint maps to the complaint that [[`[socket]`] blocks in the
form [[socket $host $port]] when `$host` does not exist/respond, even when the
`-async` option is used.  This basically further maps into a complaint that
`gethostbyname()` blocks.  Other [C] programs apparently have non-blocking
solutions for this.  We should discover what those solutions are and see if the
[[`[socket]`] implementation can make use of them.  [Andreas Kupries]' memory
is that we collectively decided the best solution is "to have the core spawn a
helper thread to process (and wait for) `gethostbyname()` while the rest of the
core goes on crunching."

[Darren New] observes that `gethostbyname()` can't be trusted to be thread-safe
...

NOTE: (hint) I don't see this problem with `[socket]` reported as a bug at
SF.

[TV]: It's been a while for me, but isn't that inet function in fact opening a
(maybe udp) socket to a [DNS], which could be `select()`-able on decent systems?

[nl]: until this is fixed you can use the `[tcllib]::dns` package to do the
DNS lookup and then http to the IP address (note that this has some implications,
such as assuming that your DNS host responds and the http library sending a wrong
Host header, but it is usually better than having your application hang on a bad
DNS entry).
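
A sketch of that workaround follows. It assumes [Tcllib] is installed, that at least one system nameserver can be discovered, and that the name is in DNS (hosts files, NIS and the like are not consulted). The proc names are hypothetical. The wrong Host header is avoided by passing the original name in `-headers`.

======tcl
package require http
package require dns   ;# part of Tcllib

# Hypothetical helper: resolve $host through the Tcllib dns package, which
# works through the event loop instead of blocking in the C resolver.
proc resolve {host} {
    dns::configure -nameserver [lindex [dns::nameservers] 0]
    set tok [dns::resolve $host]
    dns::wait $tok
    if {[dns::status $tok] ne "ok"} {
        set why [dns::error $tok]
        dns::cleanup $tok
        error "cannot resolve $host: $why"
    }
    set ip [lindex [dns::address $tok] 0]
    dns::cleanup $tok
    if {$ip eq ""} {
        error "no address found for $host"
    }
    return $ip
}

# Connect to the IP address but send the original name in the Host header,
# so that name-based virtual hosts still answer correctly.
proc geturlByIp {host {path /}} {
    set ip [resolve $host]
    return [http::geturl http://$ip$path -headers [list Host $host]]
}
======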

[PT] 2003-06-23:  This all assumes that DNS is what is being used. However,
there are various ways to resolve hostnames and the local C library resolver
knows how to handle them according to local configuration.  Maybe we are using
a hosts file, maybe we have NIS. It is unfortunately not as simple as this
appears - otherwise we'd have fixed it. Ultimately using an external process to
do the lookups ala netscape's resolver proxy is likely the only way to avoid
this delay.

[TV]:  A separate process would leave you with the delay, which is when you
don't have the answer to the query readily available, wait for better
alternatives or needed correction, or until your connections to the informing
party are no longer cluttered or broken, but at least you could do something
else in the meanwhile.  A major normal reason for having processes or threads
in the context of communication pacing things.

[DKF]: A separate process would let you do other things while the delay was
happening.  You could even keep a pool of helper processes around and use them
round-robin fashion.



** Bug: Errors in Callback Disappear **

How can you catch an error in a callback?  e.g., if I call

======
http::geturl $url -command somecommand
======

any errors raised in '''somecommand''' just vanish instead of being passed to
bgerror as I expect. 

[PYK] 2016-04-03:  Yes, `http::Finish` does swallow errors in the `-command` if
an error is already being propagated.  The whole http module needs a little
redesigning.  In the meantime, the callback command can be "liberated" with something like this:

======
http::geturl $url -command [list after idle some_command]
======
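
Another option is to do the error handling inside the callback. In this sketch the proc name `onDone` and the work it performs are placeholders:

======tcl
package require http

# Hypothetical callback: run the real work inside catch so that a failure
# is reported instead of vanishing, and always release the token.
proc onDone {token} {
    if {[catch {
        if {[http::status $token] ne "ok"} {
            error "transfer failed: [http::status $token]"
        }
        puts "received [http::size $token] bytes"
    } msg opts]} {
        # Re-raise from the event loop, outside the http package, so that
        # the normal background-error handler sees the error.
        after idle [list return -options $opts $msg]
    }
    http::cleanup $token
}
======

It would be passed as `http::geturl $url -command onDone`, with the event loop running (for example via [vwait] in tclsh).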

[HolgerJ] 2016-04-23: Could the error I am encountering be related to this? 

======
::http::cleanup $token
::thread::release

Error from thread tid0x7f80f6893700
can not find channel named "sock7f80f0178c50"
    while executing
"eof $sock"
======

As soon as I put an '''after 500''' between the two statements, the error doesn't show. Could it be that the '''cleanup''' is being overtaken by the '''release''', so that the '''cleanup''' cannot find the '''$sock''' anymore?

[pyk] 2016-04-23:  Do these lines appear in a `-command` script?


** Website Up? **

Here is some code recently mentioned on news:comp.lang.tcl for querying whether
a site is alive.

======
if {$argc == 0} {
    set site http://purl.org/thecliff/tcl/wiki/
} else {
    set site [lindex $argv 0]
}

package require http 2.3

# this proc contributed by [Donal Fellows]
proc geturl_followRedirects {url args} {
    while 1 {
        set token [eval [list http::geturl $url] $args]
        switch -glob -- [http::ncode $token] {
            30[1237] {### redirect - see below ###}
            default  {return $token}
        }
        array set meta [set ${token}(meta)]
        if {![info exists meta(Location)]} {
           return $token
        }
        set url $meta(Location)
        unset meta
        # release the redirect response before following the new URL
        http::cleanup $token
    }
}

set token [geturl_followRedirects $site -validate 1]
if {[regexp -nocase ok [::http::code $token]]} {
    puts "$site is alive"
} else {
    puts "$site is dead: [::http::code $token]"
}
::http::cleanup $token
======

[HaO] 2013-05-02: IMHO it would be more secure to limit the redirections to 5.



** Backwards Incompatibility: http-2.7 **

[http://groups.google.com/group/comp.lang.tcl/browse_thread/thread/9a1a0b7f57141b12/3d834b2849323562#3d834b2849323562%|%Tcl/Tk 8.5.2 Release Candidates Options (new behaviour with http -handler)], [comp.lang.tcl], 2008-03-28:   discusses a problem with http version 2.7.



** Misc **

<<discussion>>Odd behaviour in an unusual context
[TV] 2003-04-24:

I just found behaviour I didn't get:

======none
(Tcl) 68 % info vars http::*
::http::urlTypes ::http::http ::http::1 ::http::alphanumeric ::http::encodings ::http::formMap ::http::defaultCharset
(Tcl) 68 % unset ::http::1
can't unset "::http::1": no such variable
(Tcl) 69 % info vars http::*
::http::urlTypes ::http::http ::http::alphanumeric ::http::encodings ::http::formMap ::http::defaultCharset
======

It's wish 8.4.1, and it runs [bwise], a webserver (tclhttpd with some
alterations), and this is clearly from the http package to fetch a webpage.
Maybe the manual gives a neat answer, I just found it noteworthy that an erroneous `[unset]` still seems to do its unsetting.

[RS]: ..or that the variable was removed by the web server between the first
two commands? What happens if you just call the first command repeatedly?

[TV]: It would seem to be stable. It's the page content and url info etc array
variable, which sticks around it seems until deleted, that's the whole reason I
was looking for some garbage collection, or delayed freeing. It could be there
is an event linked with some element, I don't know, I didn't write the at least
handy [http] package...

----
<<enddiscussion>>

<<discussion>> http(s) Link Verification

[HaO] 2013-05-10: Here is my http(s) link (URL) verification code, inspired
by the example above from Kevin Kenny.
This code follows at most 5 redirects and requires Tcl 8.6 because of the tailcall:

======tcl
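package require http
package require msgcat
namespace import -force ::msgcat::mc   ;# mc formats the error messages below
proc ::linkCheck {urlIn {timeout 10000} {recursionLimit 5}} {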
    if {[catch {
        set requestHandle [::http::geturl $urlIn -validate 1 -timeout $timeout]
    } err]} {
        return -code error [mc "Unknown host '%s'" $urlIn]
    }
    set fError 1
    if {[::http::status $requestHandle] ne {ok}} {
        set errMsg [::http::status $requestHandle]
    } else {
        switch -glob -- [::http::ncode $requestHandle] {
            2* {set fError 0}
            30[12378] {
                # redirect
                if {0 < $recursionLimit
                    && [info exists ${requestHandle}(meta)]
                    && [dict exists [set ${requestHandle}(meta)] Location]
                } {
                    incr recursionLimit -1
                    set url [dict get [set ${requestHandle}(meta)] Location]
                    ::http::cleanup $requestHandle
                    tailcall ::linkCheck $url $timeout $recursionLimit
                }
            }
        }
        set errMsg [::http::code $requestHandle]
    }
    ::http::cleanup $requestHandle
    if {$fError} {
        return -code error [mc "Error '%s' accessing url '%s'" $errMsg $urlIn]
    }
    return
}
======

----
<<enddiscussion>>

**Bug: code value "HTTP/1.1 100 Continue"**

'''[oehhar] - 2017-08-31 12:19:29'''

[HaO] 2017-08-31: When getting an ncode of 100, you are probably hitting bug [https://core.tcl.tk/tcl/info/2a94652ee10cae20%|%2a94652e%|%] in the http package shipped with Tcl 8.6.0 - 8.6.7 (http package 2.8.0 - 2.8.11).

Use 2.8.12 which is shipped with tcl 8.6.8.

----


**Convert content-type charset to encoding**

[HaO] 2020-09-02:
The http package has an internal routine to convert a charset parameter passed to a content-type header to a Tcl encoding name.

This may also be used if an IANA charset [https://www.iana.org/assignments/character-sets/character-sets.xhtml] should be converted to a Tcl [encoding].

Of course, this is not an official API and thus, it may change.

The command is:

======
http::CharsetToEncoding $Charset
======
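
A small usage sketch. The wrapper name `charsetToTclEncoding` and its fallback argument are made up for this illustration. It relies on the convention, visible in the http sources quoted further down, that the result `binary` means no matching Tcl encoding was found.

======tcl
package require http

# Hypothetical wrapper around the unofficial http::CharsetToEncoding:
# return a Tcl encoding name for an IANA charset, or $fallback when the
# http package reports "binary" (no matching encoding).
proc charsetToTclEncoding {charset {fallback iso8859-1}} {
    set enc [http::CharsetToEncoding $charset]
    if {$enc eq "binary"} {
        return $fallback
    }
    return $enc
}

puts [charsetToTclEncoding UTF-8]
puts [charsetToTclEncoding no-such-charset]
======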

Snippets from http itself using it:


***Extraction of the charset parameter from the content-type header***

From Line 2676 of http 2.9.1, the charset is extracted from the content-type header:
======
                            if {[regexp -nocase \
                                    {charset\s*=\s*\"((?:[^""]|\\\")*)\"} \
                                    $state(type) -> cs]} {
                                set state(charset) [string map {{\"} \"} $cs]
                            } else {
                                regexp -nocase {charset\s*=\s*(\S+?);?} \
                                        $state(type) -> state(charset)
                            }
======


***Recoding data from the IANA type***

From Line 3210 of http 2.9.1, the charset value is used to recode the data:

======
            set enc [CharsetToEncoding $state(charset)]
            if {$enc ne "binary"} {
                set state(body) [encoding convertfrom $enc $state(body)]
            }
======

----


**POST utf-8 encoded data**

[HaO] 2020-09-02: Here is an example of how to post utf-8 encoded data:

======
set data "ABCDÄÖÜ\u2022"
set h [http::geturl http://sample.org/postutf8\
        -query [encoding convertto utf-8 $data]\
        -type "text/plain;charset=utf-8"]
======

The IANA name of the encoding is passed with the '''-type''' parameter.

The same scheme applies for any other encoding.
Note that ISO Latin-1 is the default encoding:

======
set data "ABCDÄÖÜ"
set h [http::geturl http://sample.org/postutf8\
        -query [encoding convertto iso8859-1 $data]\
        -type "text/plain"]
======

----

<<categories>> Tcl syntax | Arts and crafts of Tcl-Tk programming | Command | Tcl | Protocol | Internet | Package | Web