scanfile

'''scanfile''' is one of a set of commands (in the [TclX] package) designed to assist the user in looking through an open file for matches against a regular expression.

'''scanfile''' is the command that actually drives the scan: it is the 'loop' that reads the file and causes the match commands to be evaluated.


    :   '''scanfile''' ?'''-copyfile''' ''copyFileId''? ''contexthandle fileId''

Scan the file specified by ''fileId'', starting from the current file position.  Check all patterns in the scan context specified by ''contexthandle'' against it, executing the match commands corresponding to patterns matched.
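As a rough illustration of the loop described above, here is a minimal sketch in plain Tcl, with no TclX required. It is not how TclX implements '''scanfile'''; the proc name `scanLines` and the flat list of pattern/script pairs standing in for a scan context are assumptions made for this example only. It also leaves out the default pattern, the copy file, and the other `matchInfo` elements.

```tcl
# scanLines is a hypothetical helper, not a TclX command.
# patterns: a flat list of {regexp script regexp script ...},
#           a simplified stand-in for a scan context.
# chan:     an open channel, read from its current position.
proc scanLines {patterns chan} {
    # Expose matchInfo in the caller's scope, where the scripts run.
    upvar 1 matchInfo matchInfo
    set linenum 0
    while {[gets $chan line] >= 0} {
        incr linenum
        set matchInfo(line) $line
        set matchInfo(linenum) $linenum
        foreach {re script} $patterns {
            # Every pattern is tested against the line as read from the file,
            # in the order the patterns were supplied.
            if {[regexp -- $re $line]} {
                uplevel 1 $script
            }
        }
    }
}

# Usage: build a small input file (the name is illustrative) and scan it.
set f [open "demo.txt" w]
puts $f "alpha"
puts $f "beta"
puts $f "gamma"
close $f

set hits {}
set in [open "demo.txt" r]
scanLines {
    {^a} {lappend hits "a:$matchInfo(linenum)"}
    {mm} {lappend hits "mm:$matchInfo(line)"}
} $in
close $in
file delete "demo.txt"

puts $hits
```

With this input, `hits` ends up holding `a:1` and `mm:gamma`: only the first line starts with "a" and only the third contains "mm".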

If the optional '''-copyfile''' argument is specified, the next argument is a file ID (see [open]) to which all lines not matched by any pattern (excluding the default pattern) are to be written.  If the copy file is specified with this flag, instead of using the '''[scancontext] copyfile''' command, the file is disassociated from the scan context at the end of the scan.
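A small sketch of the '''-copyfile''' behaviour described above, assuming TclX is installed. The file names `sample.log` and `sample.rest` and the variable `errors` are made up for this example. Lines that match the pattern are handled by the match command; the lines that match nothing should be written to the copy file.

```tcl
package require Tclx

# Build a small sample input (file names here are illustrative).
set f [open "sample.log" w]
puts $f "INFO start"
puts $f "ERROR disk full"
puts $f "INFO stop"
close $f

set ctx [scancontext create]
set errors {}

# Lines starting with ERROR are matched, so they are not copied.
scanmatch $ctx {^ERROR} {
    lappend errors $matchInfo(line)
}

set in  [open "sample.log" r]
set out [open "sample.rest" w]

# Unmatched lines go to $out; the copy file is only attached for this scan.
scanfile -copyfile $out $ctx $in

close $in
close $out
scancontext delete $ctx
```

After the scan, `errors` should hold the single ERROR line and `sample.rest` should contain the two INFO lines.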

This command does not work on files containing binary data (bytes of zero).

----
Hopefully someone will add in some examples here.  In particular, an example that demonstrates how to change file lines based on regular expression hits would be appreciated.

----
[Bruce Hartweg] wrote this example.  The tricky part is to make certain that
the final scanmatch remains the final one!

 package require Tclx
 # Create the sample input file; it is reopened for reading as fin below
 set fout [open "q.in" "w"]
 puts $fout "aaaa"
 puts $fout "bbbb"
 puts $fout "cccc"
 puts $fout "cccd"
 puts $fout "dddd"
 puts $fout "eeee"
 puts $fout "abcde"
 close $fout

 set c [scancontext create]
 
 scanmatch $c {.*} {
     puts "I RUN FIRST - <$matchInfo(line)>"
 }
 scanmatch $c {a} {
     puts  "A - pre  <$matchInfo(line)>"
     set matchInfo(line) [string map {a X} $matchInfo(line)]
     puts  "A - post <$matchInfo(line)>"
 }
 scanmatch $c {b} {
     puts  "B - pre  <$matchInfo(line)>"
     set matchInfo(line) [string map {b a} $matchInfo(line)]
     puts  "B - post <$matchInfo(line)>"
 }
 scanmatch $c {c} {
     puts  "C - pre  <$matchInfo(line)>"
     set matchInfo(line) [string map {c d} $matchInfo(line)]
     puts  "C - post <$matchInfo(line)>"
 }
 scanmatch $c {d} {
     puts  "D - pre  <$matchInfo(line)>"
     set matchInfo(line) [string map {d Z} $matchInfo(line)]
     puts  "D - post <$matchInfo(line)>"
 }
 scanmatch $c {.*} {
     puts "I RUN LAST - <$matchInfo(line)>"
     puts $matchInfo(copyHandle) $matchInfo(line)
 }
 
 set fin [open "q.in" "r"]
 set fout [open "q.out" "w"]
 
 scanfile -copyfile $fout $c $fin
 
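 # Close both files and release the scan context before exiting
 close $fin
 close $fout
 scancontext delete $c
 
 exit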

----
[glennj] For contrast, since the TclX file scanning mechanism is meant to be awk-like, here's the awk code to compare to that example. Note that `gsub` is not the same as `string map`, but here it has the same effect.
 awk '
     {
         line = $0
         printf "I RUN FIRST - <%s>\n", line
     }
     /a/ {
         printf  "A - pre  <%s>\n", line
         gsub(/a/, "X", line)
         printf  "A - post <%s>\n", line
     }
     /b/ {
         printf  "B - pre  <%s>\n", line
         gsub(/b/, "a", line)
         printf  "B - post <%s>\n", line
     }
     /c/ {
         printf  "C - pre  <%s>\n", line
         gsub(/c/, "d", line)
         printf  "C - post <%s>\n", line
     }
     /d/ {
         printf  "D - pre  <%s>\n", line
         gsub(/d/, "Z", line)
         printf  "D - post <%s>\n", line
     }
     {
         printf "I RUN LAST - <%s>\n", line
         print line > outfile
     }
 '  outfile=q.out  q.in

----
See also [scancontext] and [scanmatch].




<<categories>> Command | Channel | TclX