HJG Someone has uploaded a lot of pictures to Flickr, and I want to show them someplace where no internet is available.
The pages at Flickr contain a lot of links, icons, etc., so a simple recursive download with e.g. wget would fetch lots of unwanted stuff. Of course, I could tweak the parameters for calling wget (--accept, --reject, etc.), but doing roughly the same thing in Tcl looks like more fun :-) Moreover, with a Tcl script I can also get the titles and descriptions of the images.
So the first step is to download the HTML pages from that person's photostream and extract the links to the photo pages from them; then download the photo pages (which contain the titles and complete descriptions), and the pictures in the selected size (Thumbnail=100x75, Small=240x180, Medium=500x375, Large=1024x768, Original=as taken).
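The different sizes seem to correspond to suffixes on the static image URL: the page source shows "_m" for the Small version, and the other suffixes listed here are assumptions; sizeUrl is a hypothetical helper, not part of the script below:

```tcl
# Hypothetical helper: derive the URL for another size from the
# Small ("_m") image URL found in the page source.
# Assumed suffixes: _t Thumbnail, _m Small, "" (none) Medium, _b Large, _o Original
proc sizeUrl { mUrl size } {
    array set sfx {Thumbnail _t Small _m Medium "" Large _b Original _o}
    regsub {_m\.jpg$} $mUrl "$sfx($size).jpg" result
    return $result
}

puts [sizeUrl http://static.flickr.com/99/9999_8888_m.jpg Large]
```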
Then we can make a Flickr Offline Photoalbum out of them, or just use a program like IrfanView [L1 ] to present the pictures as a slideshow.
Second draft for the download:
 package require http

 proc getPage { url } {
     set token [::http::geturl $url]
     set data  [::http::data  $token]
     ::http::cleanup $token
     return $data
 }

 catch {console show} ;##

 set url      http://www.flickr.com/photos/siegfrieden
 set filename "s01.html"
 set url      http://www.flickr.com/photos/siegfrieden/page2
 set filename "s02.html"
 if 1 {
     set data [ getPage $url ]
     #puts "$data" ;##
     set fileId [open $filename "w"]
     puts -nonewline $fileId $data
     close $fileId
 }
 if 0 {
     set fileId [open $filename r]
     set data   [read $fileId]
     close $fileId
 }
 set n 0
 foreach line [split $data \n] {

     # <title>Flickr: Photos from XXX</title>
     if {[regexp -- "<title>" $line]} {
         #puts "1: $line"; incr n
         set p1 [ string first ":" $line 0 ];         incr p1 14
         set p2 [ string first "</title" $line $p1 ]; incr p2 -1
         set sT [ string range $line $p1 $p2 ]
         puts "Title: $p1 $p2: '$sT'"
     }

     # <h4>XXX</h4>
     if {[regexp -- "<h4>" $line]} {
         #puts "2: $line"; incr n
         set p1 [ string first "<h4>" $line 0 ];    incr p1 4
         set p2 [ string first "</h4>" $line $p1 ]; incr p2 -1
         set sH [ string range $line $p1 $p2 ]
         puts "\nHeader: $p1 $p2: '$sH'"
     }

     # <p class="Photo"><a href="/photos/XXX/9999/"><img src="http://static.flickr.com/99/9999_8888_m.jpg" width="240" height="180" /></a></p>
     if {[regexp -- (class="Photo") $line]} {
         #puts "3: $line"; incr n
         set p1 [ string first "href=" $line 0 ];  incr p1 6
         set p2 [ string first "img" $line $p1 ];  incr p2 -4
         set sL [ string range $line $p1 $p2 ]
         puts "Link : $p1 $p2: '$sL'"

         set p1 [ string first "src=" $line 0 ];   incr p1 5
         set p2 [ string first "jpg" $line $p1 ];  incr p2 2
         set sP [ string range $line $p1 $p2 ]
         puts "Photo: $p1 $p2: '$sP'"
     }

     # <p class="Desc">XXX</p>
     if {[regexp -- (class="Desc") $line]} {
         #puts "4: $line"; incr n
         set p1 [ string first "Desc" $line 0 ];   incr p1 6
         set p2 [ string first "</p>" $line $p1 ]; incr p2 -1
         set sD [ string range $line $p1 $p2 ]
         puts "Descr: $p1 $p2: '$sD'"
     }

     # <a href="/photos/XXX/page12/" class="end">12</a>
     if {[regexp -- (class="end") $line]} {
         #puts "5: $line"; incr n
         set p1 [ string first "page" $line 0 ];   incr p1 4
         set p2 [ string first "/" $line $p1 ];    incr p2 -1
         set s9 [ string range $line $p1 $p2 ]
         puts "\nEnd: $p1 $p2: '$s9'"
         break
     }
 }
 puts "# $n"
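The string first offset arithmetic is brittle: one extra space or attribute in the markup shifts all the positions. A single regexp can pull both the link and the image URL from a Photo line; a sketch, using the sample markup from the comment above:

```tcl
set line {<p class="Photo"><a href="/photos/XXX/9999/"><img src="http://static.flickr.com/99/9999_8888_m.jpg" width="240" height="180" /></a></p>}

# [^"]+ stops at the closing quote, so spacing and extra
# attributes no longer matter
if {[regexp {href="([^"]+)"[^>]*>\s*<img src="([^"]+)"} $line -> sL sP]} {
    puts "Link : '$sL'"
    puts "Photo: '$sP'"
}
```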
This will only get one HTML page, so the next step is to also get the other pages of the album, extract the information we need, and finally fetch the pictures.
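That next step could follow this sketch: getPage is the proc from above, while lastPage and photoList are assumptions standing in for the results of the extraction step, so the network part is toggled off with if 0, in the style used above. Note that images must be written in binary mode:

```tcl
package require http

# Save a binary file (a jpg) to disk; the channel must be in binary
# mode, or newline translation mangles the image data.
proc getFile { url filename } {
    set fileId [open $filename w]
    fconfigure $fileId -translation binary
    set token [::http::geturl $url -channel $fileId]
    ::http::cleanup $token
    close $fileId
}

if 0 {
    # set to 1 to actually download
    for {set i 2} {$i <= $lastPage} {incr i} {
        set data [getPage http://www.flickr.com/photos/siegfrieden/page$i]
        # ... extract sL / sP as above, collect them in photoList ...
    }
    set nr 0
    foreach sP $photoList {
        getFile $sP [format "pic%03d.jpg" [incr nr]]
    }
}
```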
Strings to look for:
...
CJL wonders whether the Flickr-generated RSS feeds for an album might be a quicker way of getting at the required set of image URLs.
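Such a feed-based approach might look like the following sketch. The feed URL format, the use of the numeric NSID (rather than the /photos/... alias), and the idea of just grepping jpg URLs out of the feed are all assumptions here, so the fetch is again toggled off with if 0:

```tcl
package require http

# Assumed feed URL format; the id is the user's NSID
# (something like 12345678@N00), not the photostream alias.
set nsid "12345678@N00"  ;# hypothetical
set feedUrl http://api.flickr.com/services/feeds/photos_public.gne?id=$nsid&format=rss_200

if 0 {
    # set to 1 to actually fetch (getPage is the proc from above)
    set rss [getPage $feedUrl]
    # roughly: grab every jpg URL mentioned in the feed
    foreach {match url} [regexp -all -inline {url="([^"]+\.jpg)"} $rss] {
        puts $url
    }
}
```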
See also:
Category Internet - Category File - Category Word and Text Processing