Version 1 of Every Word is a Constructor

Updated 2018-04-26 14:30:43 by pooryorick

Every word is a constructor (EWIAC) conveys the idea that while the arguments to a routine are derived from the words in a command, those words are not in themselves the values for those arguments.

See Also

Tcl Chatroom 2017-06-02
A discussion on the topic.

Description

A value has some type that is determined by the context in which it is used. Currently, Tcl commands create and use this type information, and it is even stored internally, but Tcl discards it whenever the value is used in a context that requires a different type of value. Tcl does this to ensure that when a value is used as an argument to a command, it has the same semantics as if the string the value was derived from had appeared literally in the script as that argument. This insulates a command from the treatment values received at the hands of other commands: it cannot rely on any previous interpretation of the string representation of that value. Logically, the arguments each routine receives have no prior interpretation, and any interpretation assigned within a routine is valid only within that routine.
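A short illustration of this rule, using only standard commands: the same string can be read as a list, as a dictionary, or as plain text, and each command interprets it afresh. The variable name value is chosen only for illustration.

```tcl
# One string, three interpretations; no command depends on how
# another command treated the value earlier.
set value {a 1 b 2}

puts [llength $value]        ;# read as a list of four elements
puts [dict get $value b]     ;# read as a dictionary
puts [string length $value]  ;# read as a plain string

# A value built by list is indistinguishable, at the script level,
# from the same string written literally.
puts [expr {[list a 1 b 2] eq $value}]
```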

However, behind the scenes Tcl itself cheats. Groups of federated commands cooperate with each other to use the same interpretation of the string value. This makes the commands faster, since they can reuse the preexisting internal representation. That such cheating can occur illustrates that Tcl is in fact capable of carrying around type information for a given value, and that the type information survives transport from one command to another. Currently, this internal type information is used strictly as cached information: if the cached information conflicts with the current interpretation of the string representation, it is discarded. This allows the programmer to freely use the value produced by a substitution in contexts where different types of values are required. This may sound convenient, but in practice it is limiting. Programmers think of values as typed and construct their programs with these types in mind. Type-consistent usage is far more common than type-variable usage. If Tcl made the types of values observable at the script level, they could be used to great effect.
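The cached type can be glimpsed in Tcl 8.6 and later through ::tcl::unsupported::representation, a debugging command. As its namespace indicates, it is unsupported, and its output is implementation-specific and meant for people rather than programs. The following sketch, offered only as an illustration, shows the cached interpretation being replaced when a value is used in a different context:

```tcl
# Requires Tcl 8.6 or later.
set value [list a 1 b 2]
puts [::tcl::unsupported::representation $value]

# Using the value as a dictionary replaces the cached list
# interpretation with a dictionary interpretation.
dict size $value
puts [::tcl::unsupported::representation $value]

# Using it as a list again swaps the cached interpretation back.
llength $value
puts [::tcl::unsupported::representation $value]
```

The exact text printed differs between Tcl versions and runs, which is one reason scripts are not supposed to depend on it.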

The existence of Rule 5 illustrates the flexibility that the interpreter has in this regard. That flexibility could be better articulated by making a small change to the wording of Rule 2. Instead of saying,

..., then all of the words of the command are passed to the command procedure.

Rule 2 could have been modified to say,

..., then the resulting values are passed as arguments to the command procedure.

Eliminating that second use of "word" to describe the result of the substitutions makes it clear that words are processed into arguments to the routine, and that once they are processed, they are outside of the scope of the Tcl rules.

If values rather than words are the arguments to a routine, commands can treat different types of values differently. puts could take as its optional argument either a channel or a file name. A new variant of append could work for strings, lists, and dictionaries. The first word of a command might actually be a routine rather than the name of a routine.
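Without type information, a script-level procedure can only guess at the intent from the string. A minimal sketch of that guesswork follows, where putsto is a hypothetical name. It treats the target as a channel whenever the string happens to match the name of an open channel, so a file that is literally named stdout can never be written through it.

```tcl
# Hypothetical helper: dispatch on the string alone (Tcl 8.6 for try).
proc putsto {target string} {
    if {$target in [chan names]} {
        # The string matches the name of an open channel.
        puts $target $string
        return
    }
    # Otherwise assume the string is a file name.
    set chan [open $target w]
    try {
        puts $chan $string
    } finally {
        close $chan
    }
}

putsto stdout {to the standard output channel}
```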

Example: {*}

In Tcl 8.5, a new processing directive was admitted at the script level, and it was implemented as a change to the syntax: a literal {*} at the beginning of a word tells the interpreter to unpack that word into multiple words, which has the effect of expanding the number of arguments handed to a routine when it is called. This is a welcome bit of functionality, but did it have to be a syntactic change to Tcl? One thing that seems to be conspicuously absent from the discussions that led up to the implementation of {*} is any substantive proposal to accomplish the task by inspecting the internal representation of the value. Rather than introducing new syntax, a command that returned an "expand me" value could have been introduced:

set [expand {name Bob}]
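For comparison, this is how the same effect is written with the syntax that was actually adopted. The word {*}$pair is split into two words before set is called (the variable name pair is just for illustration):

```tcl
set pair {name Bob}
# Equivalent to: set name Bob
set {*}$pair
puts $name
```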

Example: is

string is is currently available to determine whether a string representation of a value conforms to a certain format. Once words and values are no longer being conflated, a similar command, perhaps named is, could be introduced to provide a system for inspecting the value itself:

is value1 value2
Returns true if value1 is the same type of value as value2, and false otherwise.

Extensions that provide their own value types could use some implementation-level mechanism to register a function for is to use when it encounters values of that type. A procedure could use is to condition its operation on these comparisons. For example, an enhanced puts procedure might look like this:

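proc newputs args {
        set opened 0
        if {[llength $args] == 1} {
                # Only a string was given: write to standard output.
                lassign $args string
                set target [chan lookup stdout]
        } elseif {[llength $args] == 2} {
                lassign $args target string
                # A target that is not a channel value is taken as a file name.
                if {![is $target [chan type]]} {
                        set target [open $target w]
                        set opened 1
                }
        } else {
                error {wrong # args: should be "newputs ?target? string"}
        }
        try {
                ::puts $target $string
        } finally {
                if {$opened} {
                        close $target
                }
        }
}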

chan type returns a value to be used only by is, and not for any real channel operations. For other values such as lists, list could be used:

if {[is $somevariable [list]]} {
        puts {found a list}
}

Also needed in the example was chan lookup stdout, which returns the corresponding channel value based on a name.
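Something resembling is can be roughly approximated today, purely as an experiment. The sketch below assumes Tcl 8.6 or later and parses the human-readable output of the unsupported ::tcl::unsupported::representation command, which no real program should depend on. The helper typename is hypothetical. Unlike the is proposed above, this version reports whatever interpretation happens to be cached at the moment, which changes as the value is used, and an exemplar such as [list] with no arguments may not carry a list interpretation at all, so a non-empty exemplar is used here.

```tcl
# Requires Tcl 8.6 or later. Both procedures are illustrative only.

# Extract the cached type name from the debugging output, which
# begins with "value is a <type> with a refcount of ...".
proc typename value {
    set description [::tcl::unsupported::representation $value]
    if {![regexp {^value is a (.+?) with a refcount} $description -> name]} {
        error "unrecognized description: $description"
    }
    return $name
}

# True if both values currently carry the same cached type.
proc is {value1 value2} {
    expr {[typename $value1] eq [typename $value2]}
}

set names [list Alice Bob]
set ages [dict create Alice 30 Bob 25]
puts [is $names [list x]]
puts [is $names $ages]
```

That the answer depends on the value's usage history is exactly the limitation of treating type information as a mere cache.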

To Be Continued

That's it for now. More to come.

Page Authors

PYK