Version 6 of WikiDiff

Updated 2002-11-23 01:19:19

I have put up a service that shows the changes made to the wiki in the last 24 hours. It runs every day at 11:15 am CET. You can look at it at http://pascal.scheffers.net/wikidiff/ but I'd like to put it right here in the wiki if that is possible. For that, the wikit would need some extra formatting rules to make use of the different colours.

Source of the wikidiff software is at http://pascal.scheffers.net/wikidiff/wikidiff.tcl.txt

Right now it just parses the previous day's CVS diffs and dumps them into a page (which uses a slightly extended wiki.css, so it looks the same as the wiki!). Comments please.
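
As an illustration of the parsing step only (this is not taken from wikidiff.tcl), here is a minimal sketch that classifies one line of diff output so each kind can be given its own colour. It assumes unified output as produced by "cvs diff -u"; the proc name classifyDiffLine is made up for this example.

```tcl
# Hypothetical helper, assuming unified diff output ("cvs diff -u").
# Returns one of: header, hunk, added, removed, context.
proc classifyDiffLine {line} {
    switch -glob -- $line {
        "Index: *" - "=====*" - "RCS file: *" -
        "retrieving revision *" - "diff *" -
        "+++ *" - "--- *" { return header }
        "@@*"   { return hunk }
        "+*"    { return added }
        "-*"    { return removed }
        default { return context }
    }
}
```

A removed line whose own text starts with "-- " looks like a file header to this sketch, so a real parser would have to track whether it is inside a hunk.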

-- PS

Brian Theado - Very nice! I especially like the all-diffs-in-one-page approach. I came across changes that interest me but that would never have drawn me in if I had only seen the page title on the recent changes page.

I wrote some functionality to display diffs for an old version of wikit. The date at the bottom of each page is a hyperlink that, when clicked, shows the most recent change to the page. This functionality can be seen at http://tkoutline.sourceforge.net . That old version of wikit stores changes within the wikit database. The newest versions of wikit store the various versions of a page in an env(WIKIT_HIST) directory.

I want to upgrade my tkoutline wikit to the latest version without losing the diff functionality, and I would like to see similar functionality here on the Tcl'ers wiki, so I have started some code (see below). It makes use of KBK's code from diff in tcl. All that's left is to figure out how to specify via the URL which diffs to display. The code I have in the tkoutline wiki uses the "^" symbol appended to the page URL to display the most recent change; I'd like a way to specify more than just the most recent change.
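
One possible encoding, as a sketch only: keep the "^" suffix and allow up to two optional version numbers after it, so that a bare "^" still means the most recent change. Both the "^v1,v2" syntax and the proc name parseDiffSuffix are assumptions made for this example; wikit does not define them.

```tcl
# Hypothetical helper: split a page reference such as "1234^-3,-0" into
# the page id and the two versions to compare. Not part of wikit.
proc parseDiffSuffix {ref} {
    # page id, a literal ^, then zero, one or two signed version numbers
    set re {^(\d+)\^(-?\d+)?(?:,(-?\d+))?$}
    if {![regexp $re $ref -> id v1 v2]} {
        return {}                ;# not a diff request
    }
    if {$v1 eq ""} { set v1 -1 } ;# bare "^": previous version ...
    if {$v2 eq ""} { set v2 -0 } ;# ... compared with the current one
    return [list $id $v1 $v2]
}
```

The three elements returned are in the same order as the arguments of getWikiPageDiff, so they can be handed to it directly.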

More comments at bottom of page...

 package require Diff ;# i.e. the diff in tcl code from http://wiki.tcl.tk/3108
 catch {namespace import list::longestCommonSubsequence::compare}

 # Helper function for the diff callbacks
 proc appendDiff {mode value} {
    variable diff
    set lastMode [lindex $diff end-1]
    if {$lastMode == $mode} {
        set oldValue [lindex $diff end]
        set diff [lreplace $diff end end $oldValue\n$value]
    } else {
        lappend diff $mode $value
    }
 }

 # The following three functions are callbacks for the diff function.
 # Each one hands its chunk to appendDiff, which owns the diff variable.
 proc removed {index value} {
     appendDiff removed $value
 }
 proc added {index value} {
     appendDiff added $value
 }
 proc matched {index1 index2 value} {
     appendDiff matched $value
 }

 # Returns the contents of the given file
 proc getFile {fileName} {
    set fd [open $fileName]
    set contents [read $fd]
    close $fd
    return $contents
 }

 # Converts a version as specified below into a list index
 proc getVersionIndex {version} {
    if {[string index $version 0] == "-"} {
        return end$version
    } else {
        return $version
    }
 }
 # Versions can be specified as an absolute positive version number
 # starting at zero and counting up.  A version relative to the most
 # recent can be specified with a negative number.
 # i.e. To see the most recent change: getWikiPageDiff $id -1 -0
 # This function returns a list in the format of chunktype text pairs
 # where chunktype is one of matched, added, removed
 #
 # TODO: It would be nice to be able to express "give me the difference
 # between the page as it is now and how it was 24 hours before the
 # most recent change"
 proc getWikiPageDiff {id version1 {version2 -0}} {
    variable diff
    set ewh $::env(WIKIT_HIST)
    set versIdx1 [getVersionIndex $version1]
    set versIdx2 [getVersionIndex $version2]
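     # NOTE: $id* also matches any page whose id merely starts with $id
     # (e.g. 31 matches 310), and lsort orders the names as strings, so
     # this relies on the history file names sorting chronologically.
     set versions [lsort [glob $ewh/$id*]]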
    set list1 [split [getFile [lindex $versions $versIdx1]] \n]
    set list2 [split [getFile [lindex $versions $versIdx2]] \n]
    set diff {}
    compare $list1 $list2 matched removed added
    return $diff
 }
 package require cgi
 proc displayHtmlDiff {id version1 {version2 -0}} {
    # Special colors for the various diff pieces
    array set options {
        added bgcolor=\"#ffffaf\"
        removed bgcolor=\"#cfffcf\"
        matched {}
    }

    # Legend
    cgi_table width=200 size=-1 {
        cgi_table_row $options(removed) {
            cgi_td [cgi_font size=-1 "Removed"]
        }
        cgi_table_row $options(added) {
            cgi_td [cgi_font size=-1 "Added"]
        }
    }
     cgi_hr noshade

    # Display the entire page with the differences embedded within
    cgi_table width="600" {
        foreach {mode value} [getWikiPageDiff $id $version1 $version2] {
            cgi_table_row $options($mode) { 
                cgi_td [lindex [Wikit::Expand_HTML $value] 0]
            }
        }
    }
    return
 }

22nov01 jcw - Agree with Brian - great to see these things happen now. Some thoughts from a brief email exchange with Pascal (no more than that, really):

  • Yes, all-in-one-page really makes it easy to skim for what's important.
  • Idea: omit all diffs over, say, 25 lines. That keeps the page nicely limited, and it may even entice people to stick to small(er), more concise comments.
  • How far back should the summary diff go? I'd think that a summary showing diffs against what was on the page 3 days ago makes it easy to track things and bridge the weekend. Only one diff per page number, summarizing multiple changes all in one, might work IMO.
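
The size limit could be sketched as follows, working on the {chunktype text ...} list that getWikiPageDiff returns. The proc names changedLineCount and fitsSummary are invented for this example, and counting only added and removed lines is one possible reading of "diffs over 25 lines".

```tcl
# Hypothetical helper: count the changed lines in a chunk list of the
# form {mode text mode text ...}, where mode is matched, added or removed
# and text holds one or more lines joined by newlines.
proc changedLineCount {chunks} {
    set count 0
    foreach {mode text} $chunks {
        if {$mode ne "matched"} {
            incr count [llength [split $text \n]]
        }
    }
    return $count
}

# Returns 1 when the diff is small enough for the summary page.
# The default of 25 is the figure suggested above, not a tested value.
proc fitsSummary {chunks {maxLines 25}} {
    expr {[changedLineCount $chunks] <= $maxLines}
}
```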

That sort of raises the issue of how much diffing is needed at all. While access to all diffs on each page is technically feasible, I'd be inclined to think it would confuse/overwhelm/distract more than offering just a few diffs. Last day, week, month - perhaps? Three links per page?
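
Picking the version to diff against for such a link comes down to finding the newest stored version that is at least a day, week or month old. A minimal sketch, assuming each stored version comes with a timestamp in seconds (for example from "file mtime", if the archive preserves modification times); the names versionAsOf and lookback are made up here.

```tcl
# Hypothetical helper: from a list of {timestamp name} pairs, return the
# name of the newest version whose timestamp is not later than $cutoff,
# or an empty string if every version is newer than the cutoff.
proc versionAsOf {stamped cutoff} {
    set best ""
    set bestStamp -1
    foreach pair $stamped {
        foreach {stamp name} $pair break
        if {$stamp <= $cutoff && $stamp > $bestStamp} {
            set best $name
            set bestStamp $stamp
        }
    }
    return $best
}

# Seconds to look back for each of the three proposed links
# (a month is taken as 30 days here).
array set lookback {day 86400 week 604800 month 2592000}
```

For the "week" link the cutoff would be [expr {[clock seconds] - $lookback(week)}], and the diff would run from the version found to the current page.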

The history is now a separate subsystem on mini.net - the wiki stores the latest page version only, while all changes get archived and simmer down into the CVS historical archive once a day. It's sort of a collect-and-sweep daily cron job. What seems to work well is that latest and daily-snapshot versions are both efficiently available (static page accesses in fact - though "current" is HTML, whereas CVS daily-snapshot files are in raw wiki input format).

Whatever diff mechanism we come up with ought to keep those two sides of this wiki as loosely coupled as possible IMO.

Let me also add that there is the start of a remote sync/update mechanism. If you get a copy of the wiki and run it locally, there is the option of updating it from the CVS daily snapshot, and that mechanism is quite efficient, so it ought to scale once fully ready. That means one can have a local copy of the Tclers' Wiki, and easily track it, while using it as a local Tk app (with much snappier search capability than the web can offer). To try it, get a copy of wikit.tkd (from the usual wikit.gz url), and do: "tclkitsh wikit.kit wikit.tkd -update http://mini.net/tclhist ". I'm mentioning this here (it was also mentioned on the tclerswiki mailing list) to emphasize that we need to plan so things remain open-ended when diffs get brought into the picture. This update mechanism, for example, *only* uses the tclhist/ area.