Version 10 of scanfile

Updated 2021-01-19 22:56:06 by glennj

scanfile is one of a set of commands in the TclX package for scanning an open file for lines that match regular expressions.

scanfile is the 'loop' that drives the scan: it reads the file line by line and, for each line, evaluates the commands registered with scanmatch for every pattern that matches.

scanfile ?-copyfile copyFileId? contexthandle fileId

Scan the file specified by fileId, starting from the current file position. Check all patterns in the scan context specified by contexthandle against it, executing the match commands corresponding to patterns matched.
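
The basic call sequence can be sketched as follows. This is a minimal illustration rather than a canonical recipe; the file name sample.txt and its contents are made up for the sketch, and it assumes the script runs at the top level so that variables are global.

 package require Tclx

 # sample.txt is a hypothetical file name used only for this sketch
 set fh [open "sample.txt" w]
 puts $fh "alpha"
 puts $fh "beta"
 puts $fh "gamma alpha"
 close $fh

 set ctx [scancontext create]

 # Evaluated once for every line that matches the regular expression
 scanmatch $ctx {alpha} {
     puts "line $matchInfo(linenum): $matchInfo(line)"
 }

 set fh [open "sample.txt" r]
 scanfile $ctx $fh          ;# scans from the current position to end of file
 close $fh

 scancontext delete $ctx

Because the scan starts at the current file position, a seek on the file handle before the scanfile call would skip the leading part of the file.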

If the optional -copyfile argument is specified, the next argument is a file ID (see open) to which all lines not matched by any pattern (excluding the default pattern) are to be written. If the copy file is specified with this flag, instead of using the scancontext copyfile command, the file is disassociated from the scan context at the end of the scan.
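
As a hedged sketch of the -copyfile behaviour, the following filters comment lines out of a file. The file names settings.conf and settings.clean and the counter variable dropped are assumptions made for the illustration; settings.conf is assumed to exist already.

 package require Tclx

 # settings.conf and settings.clean are hypothetical file names
 set in  [open "settings.conf" r]
 set out [open "settings.clean" w]
 set ctx [scancontext create]
 set dropped 0

 # Lines starting with # match here, so they are not copied automatically
 scanmatch $ctx {^#} {
     incr dropped
 }

 # Every line that matched no pattern is written to $out
 scanfile -copyfile $out $ctx $in

 close $in
 close $out
 scancontext delete $ctx
 puts "dropped $dropped comment lines"

A matched line reaches the copy file only if its match command writes it there explicitly, for instance through matchInfo(copyHandle).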

This command does not work on files containing binary data, that is, files with zero (NUL) bytes.
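
One way to guard against this is to check a file for NUL bytes before scanning it. The sketch below uses only core Tcl; hasNulByte is a hypothetical helper name, not part of TclX, and it reads the whole file into memory, so it is suited to modestly sized files only.

 # Returns 1 if the named file contains a NUL byte, 0 otherwise.
 # hasNulByte is a hypothetical helper, not a TclX command.
 proc hasNulByte {path} {
     set fh [open $path r]
     fconfigure $fh -translation binary
     set data [read $fh]
     close $fh
     return [expr {[string first \u0000 $data] >= 0}]
 }

 # Usage: refuse to scan a file that scanfile cannot handle
 if {[hasNulByte "q.in"]} {
     error "q.in contains NUL bytes; scanfile cannot process it"
 }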


Hopefully someone will add in some examples here. In particular, an example that demonstrates how to change file lines based on regular expression hits would be appreciated.


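Bruce Hartweg wrote this example. Matching patterns are processed in the order in which they were added to the scan context, so the tricky part is to make certain that the catch-all scanmatch that writes the line out is added last and remains the final one!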

 package require Tclx
 # Create the sample input file q.in; it is reopened for reading as fin below
 set fout [open "q.in" "w"]
 puts $fout "aaaa"
 puts $fout "bbbb"
 puts $fout "cccc"
 puts $fout "cccd"
 puts $fout "dddd"
 puts $fout "eeee"
 puts $fout "abcde"
 close $fout

 set c [scancontext create]

 scanmatch $c {.*} {
     puts "I RUN FIRST - <$matchInfo(line)>"
 }
 scanmatch $c {a} {
     puts  "A - pre  <$matchInfo(line)>"
     set matchInfo(line) [string map {a X} $matchInfo(line)]
     puts  "A - post <$matchInfo(line)>"
 }
 scanmatch $c {b} {
     puts  "B - pre  <$matchInfo(line)>"
     set matchInfo(line) [string map {b a} $matchInfo(line)]
     puts  "B - post <$matchInfo(line)>"
 }
 scanmatch $c {c} {
     puts  "C - pre  <$matchInfo(line)>"
     set matchInfo(line) [string map {c d} $matchInfo(line)]
     puts  "C - post <$matchInfo(line)>"
 }
 scanmatch $c {d} {
     puts  "D - pre  <$matchInfo(line)>"
     set matchInfo(line) [string map {d Z} $matchInfo(line)]
     puts  "D - post <$matchInfo(line)>"
 }
 scanmatch $c {.*} {
     puts "I RUN LAST - <$matchInfo(line)>"
     puts $matchInfo(copyHandle) $matchInfo(line)
 }

 set fin [open "q.in" "r"]
 set fout [open "q.out" "w"]

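 scanfile -copyfile $fout $c $fin

 # Release the files and the scan context before leaving
 close $fin
 close $fout
 scancontext delete $c

 exit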

glennj For contrast, since the TclX file scanning mechanism is meant to be awk-like, here is the awk code to compare to that example. Note that gsub is not the same as string map (gsub takes a regular expression, string map takes literal strings), but here it has the same effect.

 awk '
     {
         line = $0
         printf "I RUN FIRST - <%s>\n", line
     }
     /a/ {
         printf  "A - pre  <%s>\n", line
         gsub(/a/, "X", line)
         printf  "A - post <%s>\n", line
     }
     /b/ {
         printf  "B - pre  <%s>\n", line
         gsub(/b/, "a", line)
         printf  "B - post <%s>\n", line
     }
     /c/ {
         printf  "C - pre  <%s>\n", line
         gsub(/c/, "d", line)
         printf  "C - post <%s>\n", line
     }
     /d/ {
         printf  "D - pre  <%s>\n", line
         gsub(/d/, "Z", line)
         printf  "D - post <%s>\n", line
     }
     {
         printf "I RUN LAST - <%s>\n", line
         print line > outfile
     }
 '  outfile=q.out  q.in
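
For a further point of comparison, the same processing can be sketched in plain Tcl without TclX. This is an illustrative rewrite of the awk program above, not something from the TclX documentation; the variable names orig, re, tag and map are chosen for the sketch, and q.in is assumed to exist.

 set fin  [open "q.in"  r]
 set fout [open "q.out" w]

 while {[gets $fin orig] >= 0} {
     set line $orig
     puts "I RUN FIRST - <$line>"
     # Each pattern is tested against the unmodified input line, as awk tests $0
     foreach {re tag map} {a A {a X}  b B {b a}  c C {c d}  d D {d Z}} {
         if {[regexp -- $re $orig]} {
             puts "$tag - pre  <$line>"
             set line [string map $map $line]
             puts "$tag - post <$line>"
         }
     }
     puts "I RUN LAST - <$line>"
     puts $fout $line
 }

 close $fin
 close $fout

The table-driven foreach trades the one-block-per-pattern layout of the scanmatch and awk versions for brevity; separate if blocks would work just as well.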

See also scancontext and scanmatch.