Based on the code in wiki-reaper, I put together a wish interface so that I could run it on my Windows system. I was able to use it to reap the TkMapper code automagically (quite an impressive program by the way).
GPS - The reaping of the Wiki seems dangerous to me. Perhaps you should add checks for strings like "file delete" (possibly a glob del* pattern if file uses strncmp), and "rm "
escargo 5 Jun 2003 - If you want to be safe, then you could do any textual analysis on the code before you run it. Reaping only collects the code; you don't have to execute it if you do not want to. You can always read the code on the Wiki before you reap it, since "As you see, so shall ye reap." - RS 2004-03-06: Also, Tcl has safe interpreters for exactly this purpose: running untrusted code without danger...
Many of the applications in Category Whizzlet are reapable and entertaining.
Pages that it can reap successfully include:
Pages that it cannot reap successfully include:
cannot reap successfully means that the file resulting from reaping the specified page will not run when started from a Windows file explorer without being edited. The contents of the page will still be collected.
RS: Hm - the second time you "misreaped" a page of mine.. In the case of Click me, it's so ridiculously small, that it doesn't deserve splitting up; a little file searcher will split off A little file searcher (iPaq) ;-)
escargo 16 Mar 2003 - The code for a little file searcher can be reaped, but I can't run it under Windows at least because it assumes a command line interface. - RS has a command line interface on Windows, the infamous DOS prompt ;-)
This program is certainly able to be improved, but it works for now.
#!/usr/bin/env wish # wish-reaper -- # This script replaces the command line interface of wiki-repeater.tcl # with a wish GUI. # Create a user interface to collect the necessary parameters # and control actions. # We need an entry for a number field (the page to reap). # Page 6277 provides TkMapper. # Page 4718 prvides the original wiki-reaper. # We need an entry for a filename field (to write the page to). # We need a button to tell when to do the job. # We need a button to tell when to quit. wm title . "Wish Reaper" set font {Helvetica 14} frame .reaper pack .reaper -expand 1 -fill both -side left label .reaper.banner -font $font -text "Wish Reaper" pack .reaper.banner -expand 1 -fill x -side top frame .reaper.data pack .reaper.data -expand 1 -fill both -pady 5 -side top label .reaper.data.page -anchor w -justify left -text "Page Number" grid .reaper.data.page -column 1 -row 1 -sticky w entry .reaper.data.pageno -textvariable pageno grid .reaper.data.pageno -column 2 -row 1 label .reaper.data.file -anchor w -justify left -text "Output file name" grid .reaper.data.file -column 1 -row 2 -sticky w entry .reaper.data.filename -textvariable filename grid .reaper.data.filename -column 2 -row 2 button .reaper.data.browse -text Browse... -command [list select_file filename] grid .reaper.data.browse -column 3 -row 2 button .reaper.reap -command reap_now -padx 4 -pady 4 -text Reap pack .reaper.reap -side left button .reaper.quit -command exit -padx 4 -pady 4 -text Quit pack .reaper.quit -side right foreach {pageno filename} $argv break focus .reaper.data.pageno .reaper.data.pageno icursor end if {[catch {package require Tcl 8.3}]} { tk_messageBox -icon error \ -message "Tcl version 8.3 or later is required." \ -type ok -parent .reaper exit } if {![catch { package require nstcl-html }] && ![catch { package require nstcl-http }]} { namespace import nstcl::* } else { # Catch a failure here and pop up a dialog box before exiting. # A more precise statement about which version of http is needed # would be helpful. http::cleanup needs to be provided. if {[catch {package require http}]} { tk_messageBox -icon error \ -message "Package http not found." \ -type ok -parent .reaper exit } else { catch {package require autoproxy} } proc ns_geturl {url} { set conn [http::geturl $url] set html [http::data $conn] http::cleanup $conn return $html } proc ns_striphtml {-tags_only html} { regsub -all -- {<[^>]+>} $html "" html return $html ;# corrected a typo here } proc ns_urlencode {string} { set allowed_chars {[a-zA-Z0-9]} set encoded_string "" foreach char [split $string ""] { if {[string match $allowed_chars $char]} { append encoded_string $char } else { scan $char %c ascii append encoded_string %[format %02x $ascii] } } return $encoded_string } } proc output {data} { global out # we don't want to throw an error if output channel has been closed catch { puts $out $data } } # Factoring out the postamble for clarity and ease of change. proc postamble { } { output \n output "# EOF" output \n } # Factoring out the preamble for clarity and ease of change. proc preamble {title url now updated} { output "#####" output "#" output "# \"$title\"" output "# [string map [list mini.net/tcl wiki.tcl.tk] $url]" output "#" output "# Tcl code harvested on: $now GMT" output "# Wiki page last updated: $updated" output "#" output "#####" output \n } proc reap {page} { set url https://wiki.tcl-lang.org/[ns_urlencode $page] set now [clock format [clock seconds] -format "%e %b %Y, %H:%M" -gmt 1] set html [ns_geturl $url] # can't imagine why these characters would be in here, but just to be safe set html [string map [list \x00 "" \x0d ""] $html] set html [string map [list {<pre class='sh_tcl'>} \x00 </pre> \x0d] $html] if {![regexp -nocase {<title>([^<]*)</title>} $html => title]} { set title "(no title!?)" } set re {Updated ([^G]+) by} if {![regexp -nocase $re $html => updated]} { set updated "???" } preamble $title $url $now $updated set html [ns_striphtml -tags_only $html] foreach chunk [regexp -inline -all {\x00[^\x0d]+\x0d} $html] { set chunk [string range $chunk 1 end-1] set chunk [string map [list """ \x22 "&" & "<" < ">" >] $chunk] foreach line [split $chunk \n] { if {[string index $line 0] == " "} { set line [string range $line 1 end] } output $line } } postamble } proc reap_now { } { global filename global out global pageno if {![string is integer -strict $pageno]} { # put up a dialog box here. tk_messageBox -icon error \ -message "An integer page number is required." \ -type ok -parent .reaper } elseif {$filename == ""} { # put up a dialog box here tk_messageBox -icon error -message "A nonempty filename is required." \ -type ok -parent .reaper } elseif {[file exists $filename]} { # put up a dialog box here tk_messageBox -icon error \ -message "File $filename is not allowed to be overwritten." \ -type ok -parent .reaper } elseif {[catch {open $filename w} out]} { # put up a dialog box here tk_messageBox -icon error \ -message "File $filename could not be opened." \ -type ok -parent .reaper } else { # everything is good. reap $pageno close $out # completion dialog. tk_messageBox -icon info -message "Reaping complete." -type ok\ -parent .reaper } } proc select_file {varName} { upvar $varName filename set filename [tk_getSaveFile -title "Save to file ..."] }
Created by escargo 16 Jan 2003
And my thanks to the people who keep adding reapable applications to the wiki and then pointing to them from here. -- escargo 24 Sep 2003
escargo 12 Mar 2003 - Maybe I should create a reapable category to make it easy to find all the code that can be extracted; it would be a subset of Category Application.
Luciano ES 19 July 2003 - How about this: edit all of the reapable applications' pages, adding the hidden tag <!-- reap-ready --> to all of them. Quite some work, but quite feasible too.
MJL 30 May 2005 - Fixed the regexp to extract the "page last updated" value. <i>Updated on ... didn't match the HTML. I've split this into two lines to avoid exceeding 80 characters.
daapp 07 Mar 2006 - I add optional support for http proxy and button to browse for file name.
ATK 04.2011 - adapted for new wiki