Purpose: to collect a variety of Tcl idioms that programmers can use in much the same way they programmed in awk.

----

Background:

'''awk''' [http://cm.bell-labs.com/cm/cs/awkbook][http://www.faqs.org/faqs/computer-lang/awk/faq/] is a tiny language created by Aho, Weinberger, and Kernighan and is designed to manipulate files whose lines (referred to as records) are a series of fields separated by some character.  The basic awk application has an optional beginning set of code, followed by a series of actions which are executed if an associated pattern matches the record in some fashion, followed by an optional ending set of code.

For example, a sample program might look like this:

 #! /bin/awk -f
 #       Purpose: to change a double spaced file into a single spaced file

 BEGIN { sw = 0 ; cnt = 0 }

 NF == 0 {
                cnt++
                if (sw == 1) {
                        print $0
                        sw = 0
                } else {
                        sw = 1
                }
                next
        }

 NF != 0 {
                cnt++
                print $0
                sw = 0
        }

 END { printf "Number of lines = %d\n", cnt }

The file is supplied on the command line invoking the above program
and is automatically opened for awk.  awk uses a number of special variables:
   * FS is the character used as the field separator.  
   * $0 is the current line.  awk breaks $0 into fields $1, $2, $3, $4, ... $NF, using FS as the split character.  The default FS is special: it is white space, and a run of two or more white space characters is collapsed and treated as a single occurrence.  However, if you set FS to a specific character, then multiple occurrences of that character create empty fields.  In GNU awk (gawk) and nawk, FS can also be a regular expression, so a run of separator characters can be matched as a single split point and no empty fields are produced.
   * NR is the record number (where the first line input represents NR == 1)
   * NF is the number of fields created after using FS to split $0
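
The two splitting behaviors described above can be approximated in Tcl roughly as follows.  This is only a sketch: the proc name awkFields is made up for illustration and does not come from any package, and it handles only the default white space case and a single-character separator, not a regular expression FS.

```tcl
# Hypothetical helper (not part of Tcl or any library): split one record
# roughly the way awk would.
proc awkFields {record {fs ""}} {
    if {$fs eq ""} {
        # Default FS: each maximal run of non-white-space characters is one
        # field, so runs of blanks collapse and leading/trailing blanks vanish.
        return [regexp -all -inline {\S+} $record]
    }
    # Explicit single-character FS: adjacent separators yield empty fields.
    return [split $record $fs]
}

set fields [awkFields "  alpha   beta gamma "]
puts "NF = [llength $fields]"        ;# prints: NF = 3
puts "\$1 = [lindex $fields 0]"      ;# prints: $1 = alpha

set fields [awkFields "a::b:" ":"]
puts "NF = [llength $fields]"        ;# prints: NF = 4 (two fields are empty)
```

Remember that Tcl lists are indexed from 0, so awk's $1 is [[lindex $fields 0]].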


----

Tcl solutions:

[Bob Techentin] writes on news:comp.lang.tcl:

Since the awk NR is the record number, I assume that you're trying to
get specific lines from a file by their line number.  The "Tcl way" to
do that, for small files that can be read entirely into memory, is to
read the data in one fell swoop, then split it into a list, like this:

  set filename "myfile.dat"
  set fp [open $filename "r"]
  set data [split [read $fp [file size $filename]] "\n"]
  close $fp

Then the variable 'data' contains a list of lines.  You can get at a
specific line by using the list index function.  Note that list indices
start at 0 while awk's NR starts at 1, so line 122 is at index 121:

  set p [lindex $data 121]
  puts "line 122:  '$p'"

If the original data file is very large, then you're stuck reading each
line in a loop.  If you're planning on matching lines, then look at the
[Regular Expressions] man pages.  Tcl 8.x's regular expressions are more
powerful than the traditional awk's regular expressions.
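
For the large-file case, a line-at-a-time loop might look like the sketch below.  It is an assumed Tcl rendering of the awk sample program at the top of this page, not code from the quoted posting; the proc name singleSpace and the variable nr (standing in for NR) are invented for illustration.

```tcl
# Sketch: copy channel $in to channel $out, printing only every second
# consecutive blank line.  Returns the number of lines read, which is what
# awk's NR would hold in the END block.
proc singleSpace {in out} {
    set nr 0    ;# plays the role of awk's NR
    set sw 0    ;# 1 when the previous blank line was swallowed
    while {[gets $in line] >= 0} {
        incr nr
        if {[string trim $line] eq ""} {
            # Equivalent of the awk pattern NF == 0
            if {$sw} { puts $out $line }
            set sw [expr {!$sw}]
        } else {
            # Equivalent of the awk pattern NF != 0
            puts $out $line
            set sw 0
        }
    }
    return $nr
}

# Typical use (myfile.dat is the file name from the example above):
#   set fp [open "myfile.dat" r]
#   set count [singleSpace $fp stdout]
#   close $fp
#   puts "Number of lines = $count"
```

Only one line is held in memory at a time, so the size of the file does not matter.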
----
[owh - a fileless tclsh] (named in honor of Ousterhout, Welch, and Hobbs ;-) gives you a similar operation framework (initial, per-line, and final code) which can be specified right on the command line, as is habitual with awk programmers also. But the language is Tcl. Of course.

'''awksplit''' in [Braintwisters] mimics the line-splitting behavior of awk: given a string, it is broken up on default or specifiable FS into variables 1..NF, where NF gives the number of fields. It even reconstructs $0 if you assign to one of the fields. 

''Side note:'' while $1 in Tcl means "the value of the variable named 1" (which is a legal name), in awk $ is an operator that selects a field by number, so its operand can be any expression.  You can therefore write (in awk):
 {for(i=1;i<=NF;i++){print $i}}

One of the neat aspects of this feature is that one can make reference to $NF - which means "use the value of the last field - whatever number it might be".  Thus, one can write:

 #! /bin/awk -f
 { print $NF }

This splits each record based on your FS and then prints its last field.  One record can have 10 fields and the next 100 - you still get the last field!  Not many other languages allow neat stuff like this.

Tcl of course does (cf. [[lindex $list end]]).
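
A minimal sketch of that idiom, assuming white-space-separated fields as with awk's default FS.  The proc name lastField and the sample records are made up for illustration.

```tcl
# Hypothetical helper: return the last white-space-separated field of a
# record, like awk's $NF.  An empty record yields an empty string.
proc lastField {record} {
    return [lindex [regexp -all -inline {\S+} $record] end]
}

foreach record {"one two three" "a b c d e f g h i j"} {
    puts [lastField $record]
}
# prints "three", then "j"
```

The same index syntax covers awk's $(NF-1): use [[lindex $fields end-1]].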

----
Don't forget the file scanning commands [http://www.neosoft.com/tclx/man/TclX.n.html] in TclX, i.e.

   * scancontext
   * scanfile
   * scanmatch

----
[Arjen Markus] has posted source [http://groups.google.com/groups?hl=en&frame=right&th=858a41b7eff29890]
(also found at  http://starbase.neosoft.com/~claird/comp.lang.tcl/examples/awkproc.tcl ),
for a Tcl-coded package which emulates important AWK capabilities.


----
This [http://groups.google.com/groups?hl=en&frame=right&th=ae482cc46a36dad3]
thread makes several points of interest to those coming from Awk.