[GPS] asks in [the comp.lang.tcl newsgroup]: In my Tcl file server I use chroot for security reasons, and this has caused problems because my server also relies on diff. I know that I could recompile GNU diff as a standalone executable and put it within my chroot, but I don't want to infect my program with the GPL nor force my users to compile GNU binutils for this. I've been looking all over the place for a diff that is portable and can be used commercially, but I haven't been able to find one. So, I come to you for help writing a diff tool in Tcl. I'm overwhelmed when I look at existing code written in C, so I truly do need the help of someone more experienced. [KBK] answers: The heart of any implementation of '''diff''' is a computation of the ''longest common subsequence'' (or some approximation to it) -- that is, the largest set of lines that the two files have in common. There's a copy of the original paper of '''diff''' on Doug McIlroy's homepage [http://www.cs.dartmouth.edu/~doug/]. A "dynamic programming" approach to identifying the longest common subsequence has been known in the computing folklore for many years. There is a discussion of the folklore at [http://citeseer.nj.nec.com/simon88sequence.html] - follow the ''Download'' links at the upper right to get a copy of the paper. There's a Tcl implementation of the folklore algorithm over at [Longest common subsequence: folklore algorithm]. ---- Brilliant! I was going to try porting Tim Peters ndiff.py from the Python Tools/Scripts directory but have yet to find the time. He argues that he has an approach that is more human friendly, i.e. more in sync with what a person would consider the differences. (VPT) ---- This is a small diff program I wrote a few years ago. I needed a simple diff program without resorting to external programs (I wanted it to work unchanged on windows), where neither speed nor minimal changes were critical (it was part of a moo verb editor, so the diff-ed entities were at most around 60 or 70 lines). It gives reasonable results; from what I recall (I haven't used it much lately) the biggest problem is that moving a line will result in everything between the new line location and the old location showing up as deleted then added after the change. The operation is really simple - roughly, loop over the lines in the new file, for each line look forward from the current point in the old file for the same line. If a match is not found, mark the current line as new, if its found further along mark the intervening stuff as deleted. Adjust the position in the old file and repeat. Not exactly accurate since I'm going by old fuzzy memory :) proc adj {args} { eval .diff.orig yview $args; eval .diff.new yview $args; } proc wdiff {l1 l2} { catch {destroy .diff}; toplevel .diff; menu .diff.mbar; menu .diff.mbar.file .diff.mbar.file add command -label "Close" -command {destroy .diff}; .diff.mbar add cascade -menu .diff.mbar.file -label "File"; .diff configure -menu .diff.mbar; text .diff.orig -yscrollcommand ".diff.scr set"; text .diff.new -yscrollcommand ".diff.scr set"; scrollbar .diff.scr -command {adj}; pack .diff.orig .diff.scr .diff.new -side left; pack .diff.scr -expand t -fill y; .diff.new tag configure del -background green -foreground black; .diff.orig tag configure del -background green -foreground black; .diff.new tag configure ins -background red -foreground black; .diff.orig tag configure ins -background red -foreground black; .diff.new tag configure cha -background yellow -foreground black; .diff.orig tag configure cha -background yellow -foreground black; set len1 [llength $l1]; set len2 [llength $l2]; for { set s1 [lindex $l1 [set c 0]]; set s2 [lindex $l2 [set d 0]]; } {$c<$len1 || $d<$len2} { set s1 [lindex $l1 [incr c]]; set s2 [lindex $l2 [incr d]]; } { if {$c>=$len1} { # 2col "" "> $s2"; 2col "" del $s2 ins # puts "i$d> $s2"; continue; } if {$d>=$len2} { # puts "d$c> $s1"; 2col "$s1" ins "" del; continue; } if {![string compare $s1 $s2]} { # puts "$s1 $s2"; 2col $s1 "" $s2 ""; continue; } for {set cc $c} {$cc < $len1} {incr cc} { if {![string compare [lindex $l1 $cc] $s2]} { for {set n $c} {$n<$cc} {incr n} { 2col "[lindex $l1 $n]" ins "" del; } # 2col "$s1" "<"; # puts "d$cc< $s1"; set c $cc; break; } } if {$cc == $c} {incr c -1; incr d -1; continue;} for {set cd $d} {$cd < $len2} {incr cd} { if {![string compare [lindex $l2 $cd] $s1]} { # puts "add $d-$cd"; for {set n $d} {$n<$cd} {incr n} { 2col "" del "[lindex $l2 $n]" ins; } # puts "i$cd> $s2"; set d $cd; break; } } if {$cd == $d} {incr c -1; incr d -1; continue;} 2col "$s1" cha "$s2" cha; # puts "c$c< $s1"; # puts "c$d> $s2"; } .diff.orig configure -state disabled .diff.new configure -state disabled } proc 2col {a at b bt} { .diff.orig insert end "$a\n"; if {$at != ""} {.diff.orig tag add $at {end-2 lines} {end-1 lines}} .diff.new insert end "$b\n"; if {$bt != ""} {.diff.new tag add $bt {end-2 lines} {end-1 lines}} } #comment proc fdiff {a b} { set h [open $a]; set l1 {}; while {![eof $h]} {lappend l1 [gets $h]} set h [open $b]; set l2 {}; while {![eof $h]} {lappend l2 [gets $h]} wdiff $l1 $l2; } ---- [Category Application]