The word ''diff'', in many computing circles, refers to comparing two items and displaying, in some manner, the differences between them. Most frequently, it is a comparison of two files. If the output is text, the Unix tradition is to display the differences as the changes that must be made to the first file to obtain the second file. In a [GUI] application, coloring or other techniques are often used to convey more information about what changed. In some applications entire lines are highlighted, while in others individual characters are highlighted.

----

See [diff in Tcl]

The code that was here was crap (according to the author) and has been removed.

[Arjen Markus] We have faced a slightly different problem: two files that should be compared with special care for (floating-point) numbers. The solution was simple in design:

   * Read the files line by line (all lines should be comparable; we did not need to deal with inserted or deleted lines).
   * Split the lines into words and compare the words as either strings or numbers.
   * Use [[string is double]] to identify whether a "word" is actually a number and, if so, compare the two words numerically (allowing a certain tolerance if required).

This way you are immune to numbers formatted in different ways: 0.1, +.1, 1.0E-01, +1.00e-001 all spell the same number, and you can encounter all of these forms (sometimes you have less than perfect control over the precise format).

----

[Arjen Markus] Question: would not this be a nice addition for the fileutil module in Tcllib?

[GPS] maybe it would...
[Arjen Markus] If so, it would benefit (in my opinion) from two custom procedures:

   * A procedure one can supply to compare the lines (for instance: ignore white-space or interpret numbers as numbers - my original problem)
   * A procedure to process the output (in a manner as [Tkdiff] does for instance)

----

[Arjen Markus] A few thoughts for improving the performance:

   * Store the lines as {lineno content}
   * Sort by content (lsort has this ability via "-index")
   * Use binary search to replace the inner loop.

This would bring back the number of iterations from O(N^2) to O(N log N). But perhaps it is not worth the trouble :-)

----

See also [Using Snit to glue diff, patch, and md5sum].

----

[CL] has received mild testimonials about "Active File Compare" available through http://formulasoft.com There's no particular Tcl connection; it's just been valuable to me as a Tcl developer when working under Windows.

----

[Category Glossary] [Category Dev. Tools]
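----

A minimal sketch of the number-aware, line-by-line comparison described above. This is not the original (removed) code. The proc names `wordsEqual`, `linesEqual` and `diffLineNumbers` are hypothetical, and so is the choice of an absolute tolerance: the page only says "a certain tolerance". Numbers are recognized with `string is double -strict`. As on the page, both inputs are assumed to have the same number of lines.

```tcl
# Compare two "words": numerically if both look like floating-point
# numbers, otherwise as plain strings.
# tol is an ABSOLUTE tolerance (an assumption made for this sketch).
proc wordsEqual {a b {tol 0.0}} {
    if {[string is double -strict $a] && [string is double -strict $b]} {
        return [expr {abs($a - $b) <= $tol}]
    }
    return [expr {$a eq $b}]
}

# Compare two lines word by word; runs of white-space are ignored.
proc linesEqual {lineA lineB {tol 0.0}} {
    set wordsA [regexp -all -inline {\S+} $lineA]
    set wordsB [regexp -all -inline {\S+} $lineB]
    if {[llength $wordsA] != [llength $wordsB]} {
        return 0
    }
    foreach a $wordsA b $wordsB {
        if {![wordsEqual $a $b $tol]} {
            return 0
        }
    }
    return 1
}

# Return the 1-based numbers of the lines that differ.
# No attempt is made to detect inserted or deleted lines.
proc diffLineNumbers {linesA linesB {tol 0.0}} {
    set result {}
    set lineno 0
    foreach a $linesA b $linesB {
        incr lineno
        if {![linesEqual $a $b $tol]} {
            lappend result $lineno
        }
    }
    return $result
}
```

To compare two files, read each one and pass `[split $content \n]` for both to `diffLineNumbers`. One caveat: `string is double` also accepts integers, hexadecimal literals, and strings such as "NaN", so identifiers that happen to look like numbers will be compared numerically.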