Version 5 of option parsing using ensemble

Updated 2008-08-20 12:49:25 by lars_h

Lars H, 2008-08-19: One of Tcl's strengths is that every core command can be "patched" at the script level, by providing a substitute proc — the old

 rename file __file
 proc file {subcommand args} {
     if {[somecondition]} then {
         # Handle special case (somecondition stands for whatever test applies)
     }
     __file $subcommand {*}$args
 }

pattern. Some commands are however rather difficult to patch this way, because they have tricky syntaxes; this is for example the case with glob and regexp, which follow the general pattern

command ?-option ...? ?--? arg ?arg ...?

By contrast, it's much easier to handle commands with Tk-style -option value sequences at the end, such as fconfigure and namespace ensemble… Still, sometimes one has to bite the bullet.
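For comparison, here is a rough sketch of what biting the bullet by hand can look like for the `?-option ...? ?--? arg ...` pattern. It is only an illustration: the helper name splitLeadingOptions and the spec format (option name mapped to 1 if the option takes a value, 0 otherwise) are made up for this example, and it assumes Tcl 8.6 for ::tcl::prefix.

```tcl
# Hypothetical helper, not part of Tcl: split leading options from the
# remaining arguments. spec maps each option name to 1 (takes a value)
# or 0 (plain flag).
proc splitLeadingOptions {spec argList} {
    set opts {}
    while {[llength $argList] > 0} {
        set word [lindex $argList 0]
        # The first word not starting with a dash ends the options.
        if {![string match -* $word]} break
        set argList [lrange $argList 1 end]
        # An explicit -- ends the options and is itself discarded.
        if {$word eq "--"} break
        # Expand unique abbreviations; unknown or ambiguous words raise an error.
        set name [::tcl::prefix match [dict keys $spec] $word]
        if {[dict get $spec $name]} {
            if {[llength $argList] == 0} {
                return -code error "missing argument to \"$name\""
            }
            dict set opts $name [lindex $argList 0]
            set argList [lrange $argList 1 end]
        } else {
            dict set opts $name 1
        }
    }
    return [list $opts $argList]
}

set spec {-directory 1 -nocomplain 0 -tails 0}
puts [splitLeadingOptions $spec {-dir /tmp -no -- -x *.c}]
# prints: {-directory /tmp -nocomplain 1} {-x *.c}
```

Abbreviations, missing values and the end-of-options marker all need explicit handling here, which is the bookkeeping one would rather hand over to Tcl itself.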

The idea for the code below is to misuse the ensemble mechanisms to parse beginning-of-arguments options of a command — the example globby has the same syntax as glob. Some points to note are:

  1. Each defined option is really a proc in a particular namespace. That namespace is dedicated to parsing the options.
  2. The option-parsing procs are wrapped up as an ensemble (globby_parse::globby). This solves the problem that unique abbreviations of option names are allowed.
  3. Missing argument errors (for options that take arguments) are thrown by Tcl when calling the proc that parses this option.
  4. The ensemble must have an -unknown command to handle end-of-options without an explicit --.

 proc globby {args} {
     array set Opt {complain 1 tails 0}
     set patterns [namespace eval globby_parse [list globby] $args]
     # Do actual operation... In this demo we just return parse results.
     list [array get Opt] $patterns
 }

 namespace eval globby_parse {
     namespace export -*

     proc -directory {directory args} {
         upvar 2 Opt Opt
         set Opt(dir) $directory
         uplevel 1 [list globby {*}$args]
     }
     proc -join {args} {
         list [join $args [file separator]]
     }
     proc -nocomplain {args} {
         upvar 2 Opt Opt
         set Opt(complain) 0
         uplevel 1 [list globby {*}$args]
     }
     proc -path {pathPrefix args} {
         upvar 2 Opt Opt
         set Opt(path) $pathPrefix
         uplevel 1 [list globby {*}$args]
     }
     proc -tails {args} {
         upvar 2 Opt Opt
         set Opt(tails) 1
         uplevel 1 [list globby {*}$args]
     }
     proc -types {typeList args} {
         upvar 2 Opt Opt
         set Opt(types) $typeList
         uplevel 1 [list globby {*}$args]
     }
     proc -- {args} {return $args}
     proc unknown {cmd opt args} {
         if {[string match -* $opt]} then {
             return -code error\
               "bad option \"$opt\": must be -directory, -join,\
               -nocomplain, -path, -tails, -types, or --"
         } else {
             list [namespace which --] $opt
         }
     }

     namespace ensemble create -command [namespace current]::globby\
       -unknown [namespace which unknown]
 }
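Points 2 and 4 can be seen in isolation in a much smaller ensemble. The following is a minimal sketch, with the made-up names optdemo and optdemo_parse, that only reports which subcommand was reached; it is not a replacement for the code above.

```tcl
namespace eval optdemo_parse {
    namespace export -*

    # Each option is an ordinary proc whose name starts with a dash.
    proc -verbose {args} {return [list verbose $args]}
    proc -quiet   {args} {return [list quiet $args]}
    proc --       {args} {return [list end $args]}

    # Invoked only when the first word matches no exported subcommand.
    proc unknown {cmd word args} {
        if {[string match -* $word]} {
            return -code error "bad option \"$word\""
        }
        # Rewrite the call as "-- $word ..." so that the word is kept as data.
        return [list [namespace which --] $word]
    }

    namespace ensemble create -command ::optdemo \
        -unknown [namespace which unknown]
}

puts [optdemo -verb a b]   ;# unique prefix of -verbose: verbose {a b}
puts [optdemo a b]         ;# no option, handled via -unknown: end {a b}
puts [optdemo -- -q]       ;# explicit end of options: end -q
```

The prefix matching comes for free from the ensemble, and the -unknown handler only has to decide between "bad option" and "first non-option word".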

A tricky detail is that the ensemble and the main command have the same name (although in different namespaces). This makes some errors look like they are for the main command, even when Tcl encounters them for the ensemble.

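The main wart here is the use of upvar 2 to reach the Opt array in globby's frame (two levels up, because [namespace eval] adds a frame of its own); an alternative approach is to upvar 1 throughout and skip the [namespace eval], but then the namespace-qualified name of the ensemble shows up in the error messages.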
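The frame counting behind that upvar can be checked with a small stand-alone sketch. The names leveldemo, viaEval, direct, outerEval and outerDirect are invented for the illustration.

```tcl
namespace eval ::leveldemo {
    # Reached through namespace eval: level 1 is the namespace eval
    # frame, level 2 is the calling proc.
    proc viaEval {} {
        upvar 2 Opt Opt
        set Opt(flag) "set via upvar 2"
    }
    # Called directly: level 1 is already the calling proc.
    proc direct {} {
        upvar 1 Opt Opt
        set Opt(flag) "set via upvar 1"
    }
}

proc outerEval {} {
    set Opt(flag) untouched
    namespace eval ::leveldemo viaEval
    return $Opt(flag)
}
proc outerDirect {} {
    set Opt(flag) untouched
    ::leveldemo::direct
    return $Opt(flag)
}

puts [outerEval]     ;# set via upvar 2
puts [outerDirect]   ;# set via upvar 1
```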


DKF: It would be interesting to see how this changes when using tcl::unsupported::tailcall instead of uplevel 1.

Experimentation leads to this (with gratuitous use of dicts...):

 proc globby {args} {
     set opts {complain 1 tails 0}
     if {[catch {globby_parse::globby {*}$args} patterns] == 1} {
         return -code error $patterns
     }
     # Do actual operation... In this demo we just return parse results.
     list $opts $patterns
 }

 namespace eval globby_parse {
     # tailcall lived in ::tcl::unsupported when this was written;
     # in Tcl 8.6 it is a core command and this line is harmless.
     namespace path ::tcl::unsupported
     namespace export -*

     proc -directory {directory args} {
         upvar 1 opts o
         dict set o dir $directory
         tailcall globby {*}$args
     }
     proc -join {args} {
         list [join $args [file separator]]
     }
     proc -nocomplain {args} {
         upvar 1 opts o
         dict set o complain 0
         tailcall globby {*}$args
     }
     proc -path {pathPrefix args} {
         upvar 1 opts o
         dict set o path $pathPrefix
         tailcall globby {*}$args
     }
     proc -tails {args} {
         upvar 1 opts o
         dict set o tails 1
         tailcall globby {*}$args
     }
     proc -types {typeList args} {
         upvar 1 opts o
         dict set o types $typeList
         tailcall globby {*}$args
     }
     proc -- {args} {return $args}
     proc unknown {cmd opt args} {
         if {[string match -* $opt]} then {
             return -code error\
               "bad option \"$opt\": must be -directory, -join,\
               -nocomplain, -path, -tails, -types, or --"
         }
         list [namespace which --] $opt
     }

     namespace ensemble create -command [namespace current]::globby\
       -unknown [namespace which unknown]
 }

Lars H: Yes, I considered that, although I don't know if it would do much for the error messages generated. Recursion depth is not likely to be a problem.
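What tailcall changes for the variable access can be shown with a stripped-down chain. This is a sketch only, assuming Tcl 8.6 (where tailcall is a core command); parseDemo, stepOne and stepTwo are invented names.

```tcl
proc parseDemo {} {
    set opts {}
    stepOne
    return $opts
}
proc stepOne {} {
    upvar 1 opts o
    dict set o first 1
    # Replace this frame with a call to stepTwo instead of nesting it.
    tailcall stepTwo
}
proc stepTwo {} {
    # stepOne's frame is gone, so level 1 is still parseDemo.
    upvar 1 opts o
    dict set o second 1
    return
}

puts [parseDemo]   ;# first 1 second 1
```

Every step in the chain sees the original caller one level up, which is why upvar 1 suffices in the tailcall version.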

And, for what it's worth, the whole thing really started out as an experiment to see if this could be another application of TIP #314 (ensembles with parameters):

 interp alias {} globby {} globby_parse {complain 1 tails 0}
 namespace eval globby_parse {
     namespace export -*

     proc -directory {D directory args} {
         dict set D dir $directory
         [namespace current] $D {*}$args
     }
     proc -join {D args} {
         -- $D [join $args [file separator]]
     }
     proc -nocomplain {D args} {
         dict set D complain 0
         [namespace current] $D {*}$args
     }
     proc -path {D pathPrefix args} {
         dict set D path $pathPrefix
         [namespace current] $D {*}$args
     }
     proc -tails {D args} {
         dict set D tails 1
         [namespace current] $D {*}$args
     }
     proc -types {D typeList args} {
         dict set D types $typeList
         [namespace current] $D {*}$args
     }
     proc unknown {cmd D opt args} {
         switch -glob -- $opt "-*" {
             return -code error\
               "bad option \"$opt\": must be -directory, -join,\
               -nocomplain, -path, -tails, -types, or --"
         } default {
             list [namespace which swap] $opt
         }
     }
     proc swap {pat D args} {-- $D $pat {*}$args}
     proc -- {D args} {
         # Option parsing complete, now do the thing.
         return [list settings $D patterns $args]
     }

     namespace ensemble create -parameters {settings_dict}\
       -unknown [namespace which unknown]
 }

The code here is more compact, since there is no need to upvar or uplevel anything; the dictionary of settings parsed so far is carried along in a parameter argument. Even the namespace current calls can be done away with, as in the first implementation (I just hadn't thought of that trick yet when I wrote the above)!
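The way a parameter argument is threaded through such an ensemble may be easier to see in a toy case. The sketch below assumes Tcl 8.6 (for -parameters); the ensemble acc and its subcommands add and show are made up for the illustration.

```tcl
namespace eval ::acc {
    namespace export add show

    # With -parameters {state}, the call "acc $state add 5" is dispatched
    # as "::acc::add $state 5": the parameter comes before the subcommand
    # in the call, and right after the implementation command in the dispatch.
    proc add {state n} {
        dict incr state total $n
        return $state
    }
    proc show {state} {
        return [dict get $state total]
    }

    namespace ensemble create -parameters {state}
}

set s {total 0}
set s [acc $s add 5]
set s [acc $s add 7]
puts [acc $s show]   ;# 12
```

No variable is shared between the calls; each subcommand receives the state as a value and hands an updated copy onwards.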

The disadvantage of this approach is that the parameter dictionary (which the user should not be concerned with) shows up for missing argument errors of the second or later option, although it is appropriately hidden when the same error occurs for the first option:

 % globby -dir
 wrong # args: should be "globby -directory directory ?arg ...?"
 % globby -tails -dir
 wrong # args: should be "::globby_parse {complain 1 tails 1} -directory directory ?arg ...?"