http::geturl

http::geturl is a command in the http package.

Description

In synchronous mode (that is, when no -command callback is given), http::geturl enters the event loop and waits there until the full response has been received from the server.
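
The asynchronous counterpart can be sketched roughly as follows. With -command, geturl returns at once and the callback runs later from the event loop, so a plain tclsh script has to enter the loop itself, here with vwait. The names onDone, fetchAsync and ::done are made up for this illustration and are not part of the http package.

```tcl
package require http

# Hypothetical callback: the http package invokes it with the token
# appended once the transaction has completed (or failed).
proc onDone {token} {
    # Store status and numeric HTTP code so that the vwait below returns.
    set ::done [list [::http::status $token] [::http::ncode $token]]
    ::http::cleanup $token
}

# Hypothetical helper: start an asynchronous request, then wait for it.
proc fetchAsync {url} {
    # With -command, geturl does not wait for the response.
    ::http::geturl $url -command onDone
    # Enter the event loop until the callback has set ::done.
    vwait ::done
    return $::done
}
```

Calling fetchAsync with a URL should return a two-element list of status and HTTP code; in a Tk application the explicit vwait would normally be unnecessary because the event loop is already running.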


Here is a little wrapper that follows redirects and sends a silly user-agent so that SourceForge will allow files to be downloaded:

proc geturl {url {_meta {}}} {
    if {$_meta ne ""} {
        upvar 1 $_meta meta
    }
    http::config -useragent moop    ;# thanks source forge!
    set tok [::http::geturl $url]   ;# -headers {User-Agent moop} adds a 2nd user-agent header
    try {
        if {[set status [::http::status $tok]] ne "ok"} {
            error $status
        }
        set headers [dict map {key val} [::http::meta $tok] {
            set key [string tolower $key]
            set val
        }]
        if {[dict exists $headers location]} {
            tailcall geturl [::http::relpath $url [dict get $headers location]] $_meta
        }
        return [::http::data $tok]
    } finally {
        set meta [array get $tok]
        ::http::cleanup $tok
    }
}

# example:
#  % string length [set x [geturl http://sourceforge.net/projects/tcl/files/latest/download]]
#  8915556

AM 2009-11-12: The command itself fetches the entire page at the URL and returns a token that can be used for subsequent inspection. The -channel option is meant to write the data that comes back to an open file (or channel). However, in my experience, it adds some information at the start and at the end, which makes it unusable for retrieving binary files.

Here is a short fragment of how to retrieve a zip file (my use case):

set token [::http::geturl $URL]
set outfile [open myzipfile.zip w]
fconfigure $outfile -translation binary 
puts -nonewline $outfile [::http::data $token]
close $outfile

(My first attempt looked like this:

set outfile [open myzipfile.zip w]
fconfigure $outfile -translation binary 
::http::geturl $URL -channel $outfile
close $outfile

but this gives a corrupted zip file that cannot be read using the vfs::zip package, though other unzip programs may deal with it correctly.)

PT This is a bug report and so should be in the bug tracker. The http package comes as part of Tcl, so there is a category for such issues in the Tcl bug tracker. If you were to think of this as a bug report, you might have been tempted to state just which version of the http package you were using, and we might be able to say whether this has been fixed already or is new. The issue is that you tried to fetch a URL that is served with chunked transfer encoding. When you provided the -channel option, the package passed the still chunk-encoded data to the channel. In some more recent, fixed versions the data is correctly decoded before being passed to the channel for output. A workaround for your current broken version is to add -protocol 1.0 to disable the HTTP/1.1 feature negotiation.
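
A minimal sketch of that workaround, assuming an affected http version and written with catch so that it also runs on Tcl 8.4 and 8.5. The proc name saveUrl and its arguments are invented for this example; -channel, -binary and -protocol are existing geturl options.

```tcl
package require http

# Hypothetical helper: stream the body of $url straight into the file $path.
proc saveUrl {url path} {
    set out [open $path w]
    fconfigure $out -translation binary
    # -protocol 1.0 makes the request HTTP/1.0, so the server should not
    # reply with chunked transfer encoding; -binary 1 turns off encoding
    # and end-of-line conversion for the transferred data.
    set failed [catch {
        ::http::geturl $url -channel $out -binary 1 -protocol 1.0
    } tok]
    close $out
    if {$failed} {
        # On failure $tok holds the error message, not a token.
        error $tok
    }
    set status [::http::status $tok]
    ::http::cleanup $tok
    if {$status ne "ok"} {
        error "download failed: $status"
    }
    return $path
}
```

Something like saveUrl $URL myzipfile.zip would then replace the first attempt shown above. On an http version that already decodes chunked data before writing to the channel, the -protocol option should simply be unnecessary.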

AM Ah, I was not sure if this was meant to be the behaviour or not.

I just checked: HTTP package version is 2.7.4. I originally ran this via tclkitsh (Tcl version 8.4) but I noticed my Tcl 8.5 installation (Tcl 8.5.7) gives the same version number.
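
For anyone wanting to report the same details, a small sketch of how the versions can be read from a script; the variable name httpVersion is only illustrative.

```tcl
# package require returns the version number of the package it loaded.
set httpVersion [package require http]
puts "Tcl [info patchlevel], http $httpVersion"
```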

-- Added as a comment to bug report 1928131 on SF.