Here is a pure Tcl script that builds up the history of the chatroom. There are better versions at [Tcl chatroom snaphost history (2)]. See also [Larry Virden]'s [ksh chatroom snaphost history]. Note that Larry's script requires (ksh + lynx + sed + diff + egrep). '''Notes:''' * the script will build daily snapshots of the chatroom activity * this is a first version; it may be buggy still ... * this script produces invalid html - it is missing the end-tags . Most browsers are tolerant of this (I hope, at least netscape is ...) '''Bugs:''' * version 0.1 had a possible resource leak when the connection failed; fixed in 0.2 * repeated lines can cause version 0.2 to fail, in the sense that it unnecessarily duplicates many entries. The bug is safe (it never loses data), an algorithm to solve it is not obvious. I am tempted to leave it as is for now ... ---- #! /usr/local/bin/tclsh # Author: Miguel Sofer < mailto:msofer@users.sourceforge.net > # Date: Sept. 20, 2001 # Version: 0.2 (token leak fix) # Adapted from Larry W. Virden's program (ksh+lynx+sed+diff+egrep) # Purpose: to scrape tcl'ers chat log pages and accumulate them # NOTE: This script is going to record private messages and memos to your # id, so you should be careful to review the information before making it # public. ############################################################# # USAGE ############################################################# # # First define the variables URL and chatPath below. # # Then call this program giving a time (in seconds) between checks, # it will run forever and update the history file periodically. # # Set URL to your chat information # To get this URL, use your web browser to visit the # chat room, and then check the URL information for the dialog frame. # WARNING: The URL contains your chat password - so this file needs to be # protected appropriately. set URL {http://purl.org/mini/cgi-bin/chat.cgi?action=chat&name=miguel&password=I_WONT_TELL_YOU_THIS&updatefrequency=45&color=800000&new_msg_on_top=0&ls=&pause=#end} # Directory for history files histFile_date.htm; lastFile contains # the results of the previous search, to reduce the cost of # finding duplicates. set chatPath $env(HOME) ############################################################# if {[llength $argv] != 1} { error "USAGE: $argv0 seconds_between_snapshots" } else { set time [expr {1000 * [lindex $argv 0]}] } package require http proc repeatRecord {time} { recordData after $time repeatRecord $time } ############################### # Build a regexp to clean up; this is very dependent on the # particular formatting of the chat program # The cleaned-up data has to start with a "real data" line # # match start to first real data; real data start with either # "" or "|[^\n]* (?:has (?:entered|left) the chat\n*
\n*)} # match trailers set re3 {.*} set re "${re1}|${re2}|${re3}" ############################### # a proc to get and cleanup the data from the chat. proc getChat {} { set token {} if {[catch {set token [::http::geturl $::URL -timeout 30000]}]\ || ([::http::status $token] != "ok")\ || ([::http::ncode $token] != "200")} { ::http::cleanup $token return {} } regsub -all $::re [::http::data $token] {} data ::http::cleanup $token set data } # the workhorse: record the data proc recordData {} { set data [getChat] if {![string length $data]} { return } set date [clock format [clock seconds] -format %y%m%d] set histFile [file join $::chatPath histfile_$date.htm] set lastFile [file join $::chatPath lastFile] if {[file isfile $lastFile]} { # # get the last saved file; find where they coincide # set lastH [open $lastFile] # # get the data, find the first non-empty line # set data [split $data "\n"] set firstLine [lindex $data 0] set i 0 # go forward in lastFile until you find the first line of the new data while {!(([gets $lastH oldLine] == -1) || ([string equal $oldLine $firstLine]))} { } # go forward in both until the coincidence terminates if {![eof $lastH]} { incr i while {([gets $lastH oldLine] != -1) \ && ([string equal $oldLine [lindex $data $i]])} { incr i } } close $lastH close $lastH if {!$i} { set new "....... possibly missing data\n
\n" } else { set new {} } append new [join [lrange $data $i end] \n] set data [join $data "\n"] } else { # # there is no lastFile; start one ... # set new $data } set new $data if {![file isfile $histFile]} { # no history file; start one set head {} } else { set head {} } if {[string length $new]} { set hist [open $histFile a] puts $hist "$head\n\n******* [clock format [clock seconds]]\n
\n" puts $hist $new close $hist set last [open $lastFile w] puts $last $data close $last } else { file mtime $histFile [clock seconds] } } repeatRecord $time vwait forever ---- Also see "[Tcl chatroom log]". ---- [Category Discussion]