Every Word is a Constructor

Every word is a constructor (EWIAC) conveys the idea that while the arguments to a routine are derived from the words in a command, those words are not in themselves the values for those arguments.

Description

A value has some type that is determined by the context in which it is used. Currently, Tcl commands create and use this type information, and it is even stored internally, but Tcl discards it whenever the value is used in a context that requires a different interpretation. The string representation of the value, and the string representation alone, conveys all the meaning of the value, and it is up to the consumer of the value to decide how to interpret it. This insulates each command from the treatment a value received at the hands of other commands. A command cannot rely on any previous interpretation of the string representation of a value. Logically, the arguments each routine receives have no prior interpretation, and any interpretation assigned within a routine is valid only within that routine, surviving only as a cached interpretation that the next command may use if it so decides.
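The caching behaviour described above can be observed, roughly, with ::tcl::unsupported::representation. The following is a minimal sketch that assumes Tcl 8.6 or later. The helper reptype is a hypothetical name introduced here, and the format of the diagnostic text it parses is unsupported and may differ between versions, so no particular type names are promised.

```tcl
# Hypothetical helper: pull the type name out of the diagnostic text
# produced by ::tcl::unsupported::representation.
proc reptype value {
    set text [::tcl::unsupported::representation $value]
    if {![regexp {^value is a (.*?) with a refcount} $text -> type]} {
        return unknown
    }
    return $type
}

# Build the value at run time so that it starts without a cached type.
set x [string trim { 42 }]
puts "fresh:         [reptype $x]"

set sum [expr {$x + 1}]     ;# expr interprets the value as a number
puts "after expr:    [reptype $x]"

set len [llength $x]        ;# llength interprets the same value as a list
puts "after llength: [reptype $x]"

# Whatever was cached along the way, the string representation is unchanged.
puts "string representation: $x"
```

The reported type typically changes from one step to the next, while the string representation stays the same throughout.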

Groups of federated commands cooperate with each other to use the same interpretation of a value. This allows the commands to be more performant since they can use the preexisting internal representation. This illustrates that Tcl is in fact capable of carrying around type information for a given value and that the type information survives transport from one command to another. Currently, this internal type information is used strictly as cached information: if the cached information conflicts with the current interpretation of the string representation, it is discarded. This allows the programmer to freely use the value produced by a substitution in contexts where different types of values are required. This may sound like a convenient thing, but in practice it is limiting. Programmers think of the values as typed values and construct their programs with these types in mind. Type-consistent usage is far more common than type-variable usage. If Tcl made the types of values observable at the script level, they could be used to great effect.
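As a small illustration of type-variable usage, the same value can be handed to commands that each interpret it differently. The variable names below are chosen only for this sketch.

```tcl
set v {a 1 b 2}

set asList   [llength $v]          ;# interpreted as a list of four words
set asDict   [dict get $v b]       ;# interpreted as a dictionary
set asString [string length $v]    ;# interpreted as a plain string

puts "list length:   $asList"      ;# prints "list length:   4"
puts "dict value b:  $asDict"      ;# prints "dict value b:  2"
puts "string length: $asString"    ;# prints "string length: 7"
```

Nothing in the script records which of these interpretations the author actually intended for v.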

The existence of Rule 5 illustrates the flexibility that the interpreter has in this regard. That flexibility could be better articulated by making a small change to the wording of Rule 2:

: ..., then all of the words of the command are passed to the command procedure.

Instead of the wording above, Rule 2 could have been modified to say:

: ..., then the resulting values are passed as arguments to the command procedure.

Eliminating that second use of "word" to describe the result of the substitutions makes it clear that words are processed into arguments to the routine, and that once they are processed, they are outside of the scope of the Tcl rules.

If values rather than words are the arguments to a routine, commands can treat different types of values differently. puts could take as its optional argument either a channel or a file name. A new variant of append could work for strings, lists, and dictionaries. The first word of a command might actually be a routine rather than the name of a routine.

Example: {*}

In Tcl 8.5, a new processing directive was admitted at the script level, and it was implemented as a change to the syntax: a literal {*} in a script at the beginning of a word tells the interpreter to unpack that word into multiple words, which has the effect of expanding the number of arguments handed to a routine when it is called. This is a welcome bit of functionality, but did it have to be a syntactic change to Tcl? One thing that seems to be conspicuously absent from the discussions that led up to the implementation of {*} is any substantive proposal to accomplish the task by inspecting the internal representation of the value. Rather than introducing new syntax, a command which returned an "expand me" value could have been introduced:

set [expand {name Bob}]
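For comparison, here is how the same effect is spelled in Tcl as it exists today. This sketch does not implement expand, which remains hypothetical. It only shows the {*} syntax and the older eval idiom that the proposed command would stand in for.

```tcl
# Tcl 8.5 and later: the {*} prefix unpacks the word into two arguments.
set {*}{name Bob}
puts $name                          ;# prints "Bob"

# Before 8.5 the same effect required building the command as a list.
eval [linsert {name Alice} 0 set]
puts $name                          ;# prints "Alice"
```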

APN 2018-04-26 Seems to me the easiest way to do that would be to introduce a new return code TCL_EXPAND (to go with TCL_ERROR, TCL_BREAK etc.) which will cause the command interpreter to expand the result. Given we already have {*}, not sure how useful it would be in practice but it does give the command itself (as opposed to the caller) control over whether its return value should be expanded or not. Can't think offhand of any use cases (other than expand itself) that demand this functionality but that might just be a lack of imagination on my part.

dbohdan 2018-04-28: APN, I like your idea of TCL_EXPAND. As an alternative to {*} it could be useful for 8.4-ish Tcl implementations like Eagle and JTcl (easier to implement than {*}) as well as new Tcl derivatives trying to cut down on syntax. (What is the minimal practical N-logue? N can't be greater than 10.)

Example: is

The string is command is currently available to determine whether a string representation of a value conforms to a certain format. Once words and values are no longer being conflated, a similar command, perhaps named is, could be introduced to provide a system for inspecting the value itself:

is value1 value2: Returns true if value1 is the same type of value as value2, and false otherwise.
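For contrast, the existing string is command can only report whether a string fits a format, so a single value can pass several unrelated checks at once. A short sketch using standard string is classes:

```tcl
set v 42

puts [string is integer -strict $v]   ;# prints 1
puts [string is double -strict $v]    ;# prints 1
puts [string is list $v]              ;# prints 1
```

All three checks succeed, so they say nothing about which kind of value the program meant 42 to be.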

Extensions that provide their own value types could use some implementation-level mechanism to register a function for is to use when it encounters values of that type. A procedure could use is to condition its operation on these comparisons. For example, an enhanced puts procedure might look like this:

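proc newputs args {
    set opened 0
    if {[llength $args] == 1} {
        # Only the string was given: write to standard output.
        lassign $args string
        set target [chan lookup stdout]
    } elseif {[llength $args] == 2} {
        lassign $args target string
        # Treat a target that is not a channel as a file name and open it.
        if {![is $target [chan type]]} {
            set target [open $target w]
            set opened 1
        }
    } else {
        return -code error {wrong # args: should be "newputs ?target? string"}
    }
    try {
        ::puts $target $string
    } finally {
        # Close only the channel that this procedure opened itself.
        if {$opened} {
            close $target
        }
    }
}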

chan type returns a value to be used only by is, and not for any real channel operations. For other values such as lists, list could be used:

if {[is $somevariable [list]]} {
    puts {found a list}
}

Also needed in the example was chan lookup stdout which returns the corresponding channel value based on a name.

Example: Routine

Currently, a routine that takes as an argument a command prefix must do something like this to make sure it can later call the command in the proper context:

set cmd [list ::apply [list {cmd args} {
    ::tailcall {*}$cmd {*}$args
} [uplevel 1 {namespace current}]] $cmd]
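The following is a minimal, self-contained sketch of that idiom in use, assuming Tcl 8.6 or later for tailcall. The namespaces lib and app and the procedures bind and greet are hypothetical names chosen for the illustration.

```tcl
namespace eval lib {
    # Wrap a command prefix so that it is later resolved in the
    # namespace of whoever called bind.
    proc bind cmd {
        return [list ::apply [list {cmd args} {
            ::tailcall {*}$cmd {*}$args
        } [uplevel 1 {namespace current}]] $cmd]
    }
}

namespace eval app {
    proc greet who {
        return "hello, $who"
    }
    # The unqualified name greet is bound to the ::app namespace here.
    set bound [::lib::bind greet]
}

# Invoked from the global namespace, where no command named greet exists.
set result [{*}$app::bound world]
puts $result    ;# prints "hello, world"
```

The wrapper works, but the namespace is carried only as part of a list built by convention, which is the bookkeeping an internal routine type would make unnecessary.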

Under EWIAC, a command could capture the current namespace into an internal type:

set cmd [list [uplevel 1 ::command [lindex $cmd 0]] {*}[lrange $cmd 1 end]]

Example: Recursive Data Structures

Currently, it's problematic to use a dictionary as a general recursive data structure because there is nothing in a value to indicate whether it should be interpreted as a nested dictionary. An operation on a dictionary could use ::tcl::unsupported::representation to determine whether the value is a nested dictionary, but because Tcl can gratuitously swap out the internal representation, the approach is problematic.
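A short sketch of the ambiguity, with keys and values invented for the illustration: one entry is meant to be a plain string and the other a nested dictionary, yet both are accepted as nested dictionaries.

```tcl
set person [dict create name {Bob Smith} address {city Paris}]

# Intended use: address is a nested dictionary.
puts [dict get $person address city]   ;# prints "Paris"

# Unintended but equally valid: the name string also parses as a dictionary.
puts [dict get $person name Bob]       ;# prints "Smith"
```

A generic routine that walks such a structure has no reliable way to decide where to stop descending.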

Caching Considered Troublesome

The type information cached in a Tcl_Obj is quite useful, and even necessary for the operation of modern Tcl. What isn't necessary is the caching aspect. It can complicate already-complicated routines. On the Tcl_Obj page there are descriptions of circular references among Tcl_Obj. The solutions to these problems would be more straightforward without the caching behaviour.

Page Authors

PYK

Category Concept