**Summary**
[Arjen Markus] (21 February 2006) I am facing the task of modifying a lot of text files
in a rather mechanical way.
I used to do this kind of thing with [AWK], but Tcl lends itself to this too.
It is just a matter of the right "little language".
The task I am facing is not really interesting for anyone else,
but the characteristics are fairly common:
* Certain modifications are required for a particular part of the file
* Some modifications apply to particular lines
* Defining regular expressions to capture exactly the lines you need can be tricky. <<br>>So it is probably easier to do it in steps.
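The script below allows you to delimit sections of a file by a start and a stop pattern.
If a line falls within a section, the associated script is run for that line;
the lines matching the start and stop patterns are themselves part of the section.
To apply default processing (just copying to the output for instance),
there is a fallback pattern, "otherwise", for lines that fall outside every section.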
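
The start/stop idea can be tried out on an in-memory list of lines before reading the full script. The following is only a minimal sketch: the helper name `linesBetween` and the sample data are invented for this illustration and are not part of modify.tcl. Like the script below, it treats both delimiter lines as belonging to the section.

```tcl
# Minimal sketch (not part of modify.tcl): collect the lines that lie
# between a start and a stop pattern, both delimiter lines included.
# "linesBetween" is a hypothetical helper name.
proc linesBetween {lines start stop} {
    set active 0
    set result {}
    foreach line $lines {
        if { $active } {
            # Inside a section: keep the line, then check for the end
            lappend result $line
            if { [regexp -- $stop $line] } {
                set active 0
            }
        } elseif { [regexp -- $start $line] } {
            # The start line opens the section and is kept as well
            set active 1
            lappend result $line
        }
    }
    return $result
}

set lines {
    "header"
    "BEGIN data"
    "1 2 3"
    "END data"
    "footer"
}
puts [linesBetween $lines {^BEGIN} {^END}]
```

This prints `{BEGIN data} {1 2 3} {END data}`; the "header" and "footer" lines are the ones an "otherwise" action would receive.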
----
======tcl
# modify.tcl --
# Yet another AWK-like utility. This one reads a file line by line
# and decides on the basis of patterns marking the beginning and
# end of a block of lines (section) what actions to take.
#
# Note:
# - sections may overlap
# - what they do is up to you
# - special sections are: begin, end and otherwise
# - the command "nextline" causes the actions for any subsequent
# sections to be cancelled.
#
namespace eval ::Sections {
    variable section_number 0
    variable section_data   {}
    variable section_active {}
    variable nextline       0
    namespace export section begin end otherwise nextline scanfile
    proc _begin {} {}
    proc _end {} {}
    proc _otherwise line {}
}
# section --
# Define the beginning and end of a section and the actions to take
#
# Arguments:
# begin The regexp pattern marking the start
# end The regexp pattern marking the end
# actions The script to be run
#
# Result:
# None
#
proc ::Sections::section {begin end actions} {
    variable section_number
    variable section_active
    variable section_data
    lappend section_data $begin $end
    lappend section_active 0
    proc ::Sections::$section_number line $actions
    incr section_number
}
# begin --
# Define the actions for the beginning of a file
#
# Arguments:
# actions The script to be run
#
# Result:
# None
#
proc ::Sections::begin {actions} {
    proc ::Sections::_begin {} $actions
}
# end --
# Define the actions for the end of a file
#
# Arguments:
# actions The script to be run
#
# Result:
# None
#
proc ::Sections::end {actions} {
    proc ::Sections::_end {} $actions
}
# otherwise --
# Define the actions for lines not falling in any section
#
# Arguments:
# actions The script to be run
#
# Result:
# None
#
proc ::Sections::otherwise {actions} {
    proc ::Sections::_otherwise line $actions
}
# nextline --
# Instruct the scanning procedure to skip all remaining sections
#
# Arguments:
# None
#
# Result:
# None
#
proc ::Sections::nextline {} {
    variable nextline
    set nextline 1
}
# scanfile --
# Scan the file, taking actions appropriate for the
# sections the line is part of
#
# Arguments:
# filename Name of the file to scan
#
# Result:
# None
#
proc ::Sections::scanfile {filename} {
    variable section_number
    variable section_data
    variable section_active
    variable nextline
    set infile [open $filename r]
    _begin
    while { [gets $infile line] >= 0 } {
        set nextline  0
        set id        -1
        set insection 0
        foreach {start stop} $section_data active $section_active {
            incr id
            if { $active } {
                # The line matching the stop pattern still belongs to the section
                if { [regexp -- $stop $line] } {
                    lset section_active $id 0
                }
            } else {
                if { [regexp -- $start $line] } {
                    lset section_active $id 1
                    set active 1
                }
            }
            if { $active } {
                set insection 1
                # The actions of section N are stored as the procedure named N
                $id $line
                if { $nextline } {
                    break
                }
            }
        }
        if { ! $insection } {
            _otherwise $line
        }
    }
    _end
    close $infile
}
# main --
# Simple test case and demo
#
namespace import ::Sections::*
begin {
    puts "List of procedures:"
    set ::count 0
}
section "^#.*--" "^ *proc" {
    puts "| $line"
    if { [regexp "#.*--" $line] } {
        set ::count 0
    }
}
section "{" "^#.*--" {
    incr ::count
    if { $line eq "\}" } {
        # Naive criterion for the end of a procedure
        puts "(Number of lines: $::count)"
    }
}
scanfile $argv0
======
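
The demo does not exercise overlapping sections or `nextline`. The following condensed, in-memory variant of the scanning loop is meant only to show how the two interact. It is a sketch under assumptions: the proc name `scanLines`, the rule format `{start stop script}` and the local variable `skip` (standing in for the `nextline` command) are invented here and do not appear in the script above.

```tcl
# Condensed, in-memory variant of the scanning loop (illustration only).
# Each rule is a list {start stop script}. The script runs with the
# variable "line" set and may set "skip" to 1, which plays the role of
# the nextline command. All names here are hypothetical.
proc scanLines {lines rules} {
    set active [lrepeat [llength $rules] 0]
    set log {}
    foreach line $lines {
        set skip 0
        set id -1
        foreach rule $rules {
            lassign $rule start stop script
            incr id
            set on [lindex $active $id]
            if { $on } {
                # The stop line itself still belongs to the section
                if { [regexp -- $stop $line] } { lset active $id 0 }
            } elseif { [regexp -- $start $line] } {
                lset active $id 1
                set on 1
            }
            if { $on } {
                eval $script
                # Cancel the remaining sections for this line
                if { $skip } { break }
            }
        }
    }
    return $log
}

# Two overlapping sections: both start at "A", the first ends at "B",
# the second at "C". The first one skips the rest for the line "x".
set rules {
    {{^A} {^B} {lappend log "first: $line"; if {$line eq "x"} {set skip 1}}}
    {{^A} {^C} {lappend log "second: $line"}}
}
puts [join [scanLines {A x B y C z} $rules] \n]
```

The lines "A" and "B" are reported by both sections, "x" only by the first (the second is cancelled for that line), "y" and "C" only by the second, and "z" by neither.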
----
**Comments**
Very useful indeed! I fixed a small bug: the "if {$insection} ..." test
is better placed outside the foreach loop.
----
An excerpt of the demo's output:
| # scanfile --
| # Scan the file, taking actions appropriate for the
| # sections the line is part of
| #
| # Arguments:
| # filename Name of the file to scan
| #
| # Result:
| # None
| #
| proc ::Sections::scanfile {filename} {
(Number of lines: 46)
----
[JM] (4 April 2024) Make sure you remove the single leading space from each line of this source when copy-pasting, as it breaks the demo, which uses the patterns:<<br>> "^#.*--" and<<br>>
"^ *proc"<<br>>
With the extra space at the beginning of each line, a pattern anchored with "^#" no longer matches.
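
The effect is easy to check with `regexp` directly. The sketch below compares the anchored pattern from the demo with a more tolerant variant; the `^\s*` form is a suggestion made here, not something the original script uses.

```tcl
# A comment line as it should be, and as copied with one leading space
set clean  "# scanfile --"
set copied " # scanfile --"

puts [regexp {^#.*--} $clean]      ;# 1: the comment starts in column 0
puts [regexp {^#.*--} $copied]     ;# 0: the leading space defeats the ^ anchor
puts [regexp {^\s*#.*--} $copied]  ;# 1: optional leading whitespace is accepted
```

Note that relaxing the patterns alone would not be enough for the demo: its comparison of the line against "\}" is an exact string match, so it would also need something like `string trim` to tolerate the extra space.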
----
<<categories>> File | String Processing