The filter idiom

Richard Suchenwirth 2003-07-09 - After having written the following code repeatedly, I just declare it to be an idiom.

See Also

Arts and crafts of Tcl-Tk programming

Description

The filter function is frequently seen in Unix (and even DOS) tools, and I think it was described in the Software Tools book(s):

  • read from stdin (or files specified on command line)
  • write to stdout

Such tools (grep, sed, awk, sort, ...) can easily be combined ("glued") in pipes to do powerful things. A little framework for Tcl scripts with the filter function:

set about {usage: myFilter ?file...?

Does something meaningful with the specified files, or stdin.
}

#-- This handler contains the real functionality for one stream (or file)
proc myHandler channel {
  ...
}

#-- This prevents lengthy errors if a filter|more is terminated with 'q'
proc puts! string {catch {puts stdout $string}}

if {[lsearch $argv --help] >= 0} {puts $about; exit}

if {[llength $argv] == 0} {
    puts! [myHandler stdin]
} else {
    foreach file $argv {
        set fp [open $file]
        puts! [myHandler $fp]
        close $fp
    }
}
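As an illustration of what could go in place of the `...` in myHandler, here is a minimal sketch of a handler that numbers the lines of its input. The line-numbering behaviour is only an assumed example, not something from the original framework; any handler that reads the channel and returns a string would fit.

```tcl
#-- Hypothetical handler (illustration only): prefix each line with its number
proc myHandler channel {
    set result {}
    set n 0
    # gets returns -1 at end of file; empty lines return 0 and are kept
    while {[gets $channel line] >= 0} {
        lappend result [format "%4d  %s" [incr n] $line]
    }
    # Return one string so the caller can hand it to puts!
    return [join $result \n]
}
```

Because the handler returns its output instead of printing it, the framework's puts! wrapper stays the only place that writes to stdout, so a closed pipe (as after quitting more) is caught there.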

Discussion

Mike Tuxford is a little confused here, although that in itself is not unusual. It appears to me that you provide a method of repeating a single set of functions upon multiple files, whereas Unix pipes apply multiple functions to the data passed between them. That is, the first command passes its stdout to the second command as its stdin, and so on... Perhaps an example of usage might clarify things for me.

RS: Well, the above is the framework for one filter, which you can put into a pipe, but which can also draw input from files specified on the command line, like this (similar to e.g. cat or more):

echo Data | myFilter | more
cat data.file | myFilter | more
myFilter *.file | more
more data.file

You're right that real filters read from stdin only, but many have the added convenience of accepting filenames on the command line - that's what I meant, and implemented.

Mike Tuxford says: OK, I see, and that's still a useful thing. Not to mention that you got me thinking and learning. Thanks for the quick response.


Rainald - 2010-10-12 05:34:55

What is wrong with this command line (Windows with installed tcl)?

blockwise_average.tcl 8 <huge.csv >large.csv

The behaviour: A gray window pops up titled "blockwise_average" and a 0-byte file large.csv is created.

The content of blockwise_average.tcl, probably not related to my problem:

# STDIN-to-STDOUT filter
# Single command-line parameter is the number of lines consumed per line of output, default = 100
# Input is comma-separated, timestamps in ms, followed by 3 integer numbers
# Output too, timestamps in s, followed by 3 floats

set nAv [expr {$argc >= 1 ? [lindex $argv 0] : 100}]
set iAv $nAv
while {[gets stdin line] > 0} {
    if {$iAv == $nAv} {set iAv 0; set st 0.; set sx 0.; set sy 0.; set sz 0.}
    set fields [split $line ","]
    lassign $fields t x y z
    set st [expr {$st + $t}]
    set sx [expr {$sx + $x}]
    set sy [expr {$sy + $y}]
    set sz [expr {$sz + $z}]
    incr iAv
    if {$iAv == $nAv} {
        set st [expr {$st / $iAv / 1000.}]
        set sx [expr {$sx / $iAv}]
        set sy [expr {$sy / $iAv}]
        set sz [expr {$sz / $iAv}]
        puts [format "%.1f,%.2f,%.2f,%.2f" $st $sx $sy $sz]
    }
}

AM: Nothing is wrong :) What you do not realise is that wish, the graphical shell that comes with Tcl/Tk, was started rather than tclsh. On Windows, wish presents an empty window and a console. You can type your commands into the console.
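A script can check for itself which interpreter it ended up in. The following is only a sketch: the proc name runningUnderTk is made up for illustration, and it relies on the global variable tk_version, which Tk sets when it is loaded, and on the standard `info nameofexecutable` command.

```tcl
# Hypothetical helper (illustration only): true if Tk is loaded in this interpreter
proc runningUnderTk {} {
    # Tk sets the global tk_version; a plain tclsh does not have it
    return [info exists ::tk_version]
}

if {[runningUnderTk]} {
    # catch: stderr may not be usable when wish runs without a console
    catch {puts stderr "started by [info nameofexecutable] with Tk loaded; a filter wants tclsh"}
}
```

Note that this also reports true if a tclsh script has done `package require Tk`, so it detects Tk rather than the wish executable as such.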


Rainald - 2010-10-12 09:38:49

Thank you. I may want to apply a chain of filters (as mentioned in this thread) to lots of input files. Using tclsh instead of wish may be more appropriate(?)

Therefore, I changed the file association of *.tcl from wish to tclsh and started over again. The result:

 channel "stdin" wasn't opened for reading

As I'm pretty sure that the code has worked before, I'm confused.

P.S.: I backed out to solid ground, implemented the filter in C, but am still interested in a hint to a solution in tcl.

CliC: The code above worked for me. I invoked it as "tclsh86 rainald.tcl 1", typed "100,2,3,4" into my console window, and it replied with "0.1,2.00,3.00,4.00". I'm on ActiveTcl 8.6b3 on Windows 7 64-bit.

Rainald - 2010-10-13 06:33:48

Well, this works for me, too. What about redirecting input and output? I tried

 blockwise_average.tcl 8 <huge.csv >large.csv

with the error message

 channel "stdin" wasn't opened for reading

and

 tclsh85 blockwise_average.tcl 8 <huge.csv >large.csv

with the error message

 can't use non-numeric string as operand of "+"

(the filenames are taken as data).

CliC: Tried it this way, too:

 tclsh86 rainald.tcl 3 <in.txt >out.txt

in.txt:

 100,1,2,3
 200,3,4,5
 300,4,5,6

out.txt:

 0.2,2.67,3.67,4.67

Granted, those aren't "huge" and "large" .csv files, but it works even with I/O redirection.
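For comparison, the same averaging can be written as a handler in the style of the framework at the top of this page. This is only a sketch, not Rainald's script: the proc name blockAverage and its arguments are made up for illustration, and unlike the original it reads until end of file (gets returning -1) and skips blank or non-numeric lines instead of stopping or failing in expr.

```tcl
# Hypothetical handler (illustration only): average every n lines of
# "t_ms,x,y,z" into one output line "t_s,x,y,z"
proc blockAverage {channel {n 100}} {
    set out {}
    set count 0
    lassign {0.0 0.0 0.0 0.0} st sx sy sz
    while {[gets $channel line] >= 0} {
        set fields [split $line ,]
        # Skip blank or malformed lines rather than raising an expr error
        if {[llength $fields] != 4} continue
        set ok 1
        foreach f $fields {
            if {![string is double -strict $f]} {set ok 0; break}
        }
        if {!$ok} continue
        lassign $fields t x y z
        set st [expr {$st + $t}]
        set sx [expr {$sx + $x}]
        set sy [expr {$sy + $y}]
        set sz [expr {$sz + $z}]
        if {[incr count] == $n} {
            # Timestamps are converted from ms to s; an incomplete final block is dropped
            lappend out [format "%.1f,%.2f,%.2f,%.2f" \
                [expr {$st / $n / 1000.0}] [expr {$sx / $n}] \
                [expr {$sy / $n}] [expr {$sz / $n}]]
            set count 0
            lassign {0.0 0.0 0.0 0.0} st sx sy sz
        }
    }
    return [join $out \n]
}
```

Called as `puts [blockAverage stdin 3]` on the three-line in.txt above, it yields the same line as out.txt.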


Rainald - 2010-10-14 10:28:08

Yes, it works. I cannot reproduce yesterday's error message. Maybe I copied and pasted the wrong filename for 'huge'.

It runs about a factor of 20 slower than my C code, but that's OK. Some day, my skill in Tcl will shorten development cycles :)

Thanks a lot!