Version 19 of awk

Updated 2009-01-06 15:36:43 by Cameron

Purpose: to describe awk, an early, tiny Unix language named after the initials of its authors: Aho, Weinberger, and Kernighan.


See http://cm.bell-labs.com/cm/cs/awkbook/index.html which is a page for an official book by the creators of the language. See also the usenet newsgroup news:comp.lang.awk . Another good resource for awk documentation is http://www.gnu.org/software/gawk/manual/gawk.html .

"... Awk One-Liners ..." [L1 ] serves as a handy FAQ for common tasks.

http://www.vectorsite.net/tsawk.html is a comprehensive intro to awk, suitable for beginners. There are a number of other sites offering tutorials and introductions.

Programmers often come to the Tcl newsgroups asking how to do some awk-like operation in Tcl, or how to invoke awk from exec. This is because awk's ability to scan through a file and manipulate its contents predates Perl's, and frankly awk's abilities, while cruder in many ways, are also simpler (simpler even than Tcl!).


The BOOK Mastering Regular Expressions covers regular expressions in Perl, awk, and Tcl.


LV awk is one of my favorite languages in which to write...


For a Tcl variation on awk functionality, see owh - a fileless tclsh.


LV A common question is:

How can I invoke awk scripts from Tcl?

 $ tclsh 
 % set a [exec awk {'{print $1}'} /etc/motd]
 awk: cmd. line:1: '{print $1}'
 awk: cmd. line:1: ^ invalid char ''' in expression

RS answers: This is not an awk problem, but a misuse of /bin/sh et al. quoting: single quotes there have the same effect as braces in Tcl - group into one word, don't substitute on the contents. Since exec does not go through a shell, the single quotes are passed to awk literally, which is what it complains about. Solution here: You have outer braces already, so just drop the single quotes (the inner brace pair is awk syntax, not seen by Tcl):

 % set a [exec awk {{print $1}} /etc/motd]
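
When the input is already in a Tcl string rather than a file, the same quoting applies. The following is only a sketch, assuming an awk binary is on the PATH; the proc names awkFirstFields and firstFields are made up for illustration. The first feeds awk through exec's << redirection, the second does the same job in pure Tcl:

```tcl
# Run awk on a Tcl string instead of a file. exec passes the braced program
# to awk as one word; "<<" sends $text to awk's standard input.
# Assumption: an awk binary is on the PATH.
proc awkFirstFields {text} {
    exec awk {{print $1}} << $text
}

# Pure-Tcl counterpart of awk '{print $1}': the first whitespace-separated
# field of each line (an empty string for blank lines).
proc firstFields {text} {
    set out {}
    foreach line [split $text \n] {
        # \S+ matches runs of non-whitespace, like awk's default field splitting
        lappend out [lindex [regexp -inline -all {\S+} $line] 0]
    }
    join $out \n
}

puts [firstFields "alpha beta\ngamma delta"]
```

The pure-Tcl version avoids starting a process, at the cost of a few more lines.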

RS 2007-02-07: I love this few-liner that allows tests in a subset of awk notation (in fact, the common subset of awk and expr, plus a shortcut for regexp):

 proc awktest {filter 0} {
    if {[regexp {^/(.+)/$} $filter -> re]} {return [regexp $re $0]}
    set i 0
    foreach field $0 {set [incr i] $field}
    expr $filter
 }

e.g.:

 awktest {$1==$2} {foo bar grill} ->  0.

The variable $0 is just the input list :^) The following shortcut is also cute:

 interp alias {} ~ {} regexp
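
As a usage sketch, the test can drive a grep-like filter over a list of records. awkfilter is a hypothetical helper, not part of the few-liner; awktest and the ~ alias are repeated so that the snippet runs on its own. Each record has to be a valid Tcl list:

```tcl
# awktest and the ~ alias, repeated from the page
proc awktest {filter 0} {
    if {[regexp {^/(.+)/$} $filter -> re]} {return [regexp $re $0]}
    set i 0
    foreach field $0 {set [incr i] $field}
    expr $filter
}
interp alias {} ~ {} regexp

# Hypothetical helper: keep the records for which the awk-style test is true.
proc awkfilter {filter records} {
    set out {}
    foreach record $records {
        if {[awktest $filter $record]} {lappend out $record}
    }
    return $out
}

set data {{a 5} {b 20} {c 15}}
puts [awkfilter {$2 > 10} $data]   ;# prints {b 20} {c 15}
puts [awkfilter {/^b/} $data]      ;# prints {b 20}
puts [~ {^f} foo]                  ;# prints 1
```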

RS 2007-11-08: When porting awk scripts, it's also helpful to have the numbered variables:

 proc awksplit {list sep} {
    set i 0
    foreach field [split $list $sep] {uplevel 1 [list set [incr i] $field]}
    uplevel 1 [list set NF $i]
 }

where, for example, awksplit "foo;bar;grill" ";" assigns foo to 1, bar to 2, grill to 3, and sets NF to 3.
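
A small usage sketch, with NF set in the caller through uplevel: awk's $NF (the last field) becomes a double dereference in Tcl. lastField is a hypothetical helper introduced only for this example:

```tcl
# awksplit: numbered variables 1, 2, ... and NF are created in the caller
proc awksplit {list sep} {
    set i 0
    foreach field [split $list $sep] {uplevel 1 [list set [incr i] $field]}
    uplevel 1 [list set NF $i]
}

# Hypothetical helper: awk's $NF, the last field of a record.
proc lastField {record sep} {
    awksplit $record $sep        ;# creates 1, 2, ... and NF in this proc
    if {$NF == 0} {return ""}    ;# empty record: no fields at all
    set $NF                      ;# read the variable named by the value of NF
}

puts [lastField "foo;bar;grill" ";"]   ;# prints grill
```

Because the variables are created with uplevel 1, they land in whichever scope calls awksplit; calling it inside a proc keeps them out of the global namespace.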


RS 2007-02-28: Here's another piece of awk emulation:

 proc substr {str from length} {
   string range $str [expr {$from-1}] [expr {$from-2+$length}]
 }
 % substr hello 2 2
 el
 % substr hello 2 99
 ello

And this is trivial, but over 50% shorter to type:

 interp alias {} length {} string length
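
The two emulations combine the way they would in awk. As a sketch, here is the equivalent of awk's substr(s, length(s)-n+1), the last n characters of a string; lastChars is a hypothetical name, and substr and length are repeated so that the snippet runs on its own:

```tcl
# substr and length, repeated from the page (awk positions are 1-based)
proc substr {str from length} {
    string range $str [expr {$from-1}] [expr {$from-2+$length}]
}
interp alias {} length {} string length

# Hypothetical helper: the last n characters of str.
proc lastChars {str n} {
    substr $str [expr {[length $str] - $n + 1}] $n
}

puts [lastChars hello 3]   ;# prints llo
```

If n exceeds the string length, string range clamps the start index, so the whole string comes back.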