Here's the package to read and parse RFC 822-like (or RFC 2822-like) headers, which are used in variety of applications nowadays (e-mail, news and Debian Packages.gz files come to mind). When developing the package, ability to re-produce header just read was considered significant. Eg. the following code should produce '''header''' file containing part of '''message''' file 'till (and including) first empty line: ## get header from `message': array unset header set chan [ open "message" r ] set order [ read-parse-header $chan header ] close $chan ## write header to `header': set chan [ open "header" w ] write-header $chan $order header puts $chan "" close $chan Please note, however, that in mailbox files the header is oftenly prepended with '''From ''' (note the absence of colon) line, which isn't part of the RFC 822 syntax, so '''read-parse-header''' procedure will signal an error on seeing this sort of input. ---- -- Namespace: ::rfc-822 The `proc's below belong to this namespace, exported and ready to use. -- Function: read-parse-header ?''channelId''? ''arrayName'' Reads and parses RFC 822 header fields from ''channelId'' (defaulting to '''stdin''' if omitted), filling the array specified by ''arrayName''. Returns the order of fields read (as a list). Header fields are keyed by header name and stored as lists, with each occurence of the header field resulting in exactly one element in the list. Array isn't cleared before processing, it's left to the caller. Leading whitespace is ''not'' removed. Header values continued over several lines result in newlines being embedded in the list elements. Returned order of fields will contain several occurences of the same header name, in case there were multiple occurences of the header. -- Procedure: write-header ?''channelId''? ''order'' ''arrayName'' Writes header fields specified by ''order'' argument to the given ''channelId'' (defaulting to '''stdout''' if omitted), obtaining values from the array specified by ''arrayName''. NB: Header fields not included in ''order'' will ''not'' be output. This can be used to strip away unwanted headers. ---- The code below is under GNU Lesser General Public License. ---- ### rfc822h.tcl --- RFC 822 (and alike) headers -*- Tcl -*- ## $Id: 15254,v 1.1 2006-01-11 07:01:22 jcw Exp $ ### Copyright (C) 2005, 2006 Ivan Shmakov ### Code: namespace eval ::rfc-822 { variable re-header-name "\[\\041-\\071\\073-\\177\]+" variable re-header-del ":" namespace export \ read-parse-header \ write-header } proc ::rfc-822::read-parse-header { { channelId stdin } arrayName } { variable re-header-name variable re-header-del upvar 1 $arrayName header set order [ list ] set done 0 set hf "" set hv "" while { ! $done } { if { [ gets $channelId line ] < 0 } { error "eof or no data while reading header" } if { [ string length $line ] == 0 } { ## end of header if { [ string length $hf ] != 0 } { lappend header($hf) $hv lappend order $hf } set done 1 } elseif { [ string match "\[ \t\]*" $line ] } { ## continuation of a header field if { [ string length $hf ] == 0 } { error "no header to continue" } append hv "\n" $line } elseif { [ regexp -- \ "^(${re-header-name})${re-header-del}(.*)\$" \ $line \ dummy n v ] } { ## new header field if { [ string length $hf ] != 0 } { lappend header($hf) $hv lappend order $hf } set hf $n set hv $v } else { error "cannot parse header line: $line" } } ## . set order } proc ::rfc-822::write-header { { channelId stdout } order arrayName } { ## FIXME: should it write header fields not in ORDER? upvar 1 $arrayName header array unset is foreach hf $order { if { ! [ info exists is($hf) ] } { set is($hf) 0 } set i $is($hf) set l $header($hf) if { $i >= [ llength $l ] } { ## silently ignore this case continue } puts $channelId \ [ format "%s:%s" $hf [ lindex $l $i ] ] incr is($hf) } ## . return } package provide rfc822::headers 0.2 ### Emacs stuff ## Local variables: ## fill-column: 72 ## indent-tabs-mode: nil ## ispell-local-dictionary: "english" ## mode: outline-minor ## outline-regexp: "###\\|proc" ## End: ## LocalWords: ### rfc822h.tcl ends here ---- [Category Internet] [Category Package]