Expanding the Expansion Prefix

AMB: The expansion prefix {*}, or BraceStarBrace, was a great addition to Tcl. It eliminated the need to call "eval" a lot, making code cleaner and more secure. Personally, I can't imagine doing Tcl without the expansion prefix. Similarly, TIP #401 proposed that {#} be implemented as a prefix for comments within commands.

Why stop there? Why not introduce a general prefix syntax for Tcl that extends the existing functionality of {*}, includes the proposed functionality of {#}, and opens things up for customization, in typical Tcl fashion? Based on some conversations here on the wiki with FM and others, I came up with the following:

EDIT: When I posted this originally, I did not realize that the Tcl prefix command existed.

Syntax

A new command called "prefix" would be added, with subcommands "add" and "remove" (at a minimum).

The syntax of this proposed command is as follows:

prefix add name commandPrefix
prefix remove name

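where "commandPrefix" is a command prefix in the same sense as for the Tcl "trace" command: the prefixed word is appended to it as an extra argument, and the resulting command is executed at the level where the prefixed word appears.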

The command prefix would receive the word that follows the registered prefix as its single argument, much as {*} currently applies to a single word, and the list it returns would be expanded into the command's arguments.

Using this syntax, the following code could then re-implement the existing expansion prefix {*}, as well as the proposed comment prefix {#}:

prefix add {*} {apply {{word} {return $word}}}
prefix add {#} {apply {{word} {return}}}
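
As a rough illustration of the intended semantics, the registration and lookup can be emulated in current Tcl, with the parser's role played by an explicit {*}[...] at the call site. This is only a sketch: the pfx namespace and its add, remove and expand procs are hypothetical names invented for this example, not part of Tcl or of the proposal itself.

```tcl
# Hypothetical registry mapping a prefix name to a command prefix
namespace eval ::pfx {
    variable registry [dict create]
}
proc ::pfx::add {name commandPrefix} {
    variable registry
    dict set registry $name $commandPrefix
}
proc ::pfx::remove {name} {
    variable registry
    dict unset registry $name
}
# Append the word to the registered command prefix and run it in the caller's level
proc ::pfx::expand {name word} {
    variable registry
    uplevel 1 [list {*}[dict get $registry $name] $word]
}

pfx::add {*} {apply {{word} {return $word}}}
pfx::add {#} {apply {{word} {return}}}

puts [list a {*}[pfx::expand {*} {b c}] d];     # a b c d
puts [list a {*}[pfx::expand {#} {a comment}] d]; # a d
```

With the real feature, the parser would perform the lookup and the expansion itself, so the call site would shrink to {*}{b c} and {#}{a comment}.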

Examples:

A custom prefix {@} as a shortcut to expand a Tcl array.

# Create the prefix for expanding the values of an array
prefix add {@} {array get}

# proc that prints the args in option: value format
proc printOptions {args} {
    foreach {option value} $args {
        puts "$option: $value"
    }
}
# Define options in an array
set options(-foo) bar
set options(-hello) world
printOptions {@}options; # prints "-foo: bar\n-hello: world"
printOptions {*}[array get options]; # equivalent expression.

Another one I could see being popular: {c} as a shortcut to split a string by commas.

prefix add {c} {apply {{word} {split $word ,}}}
list {c}0,1,2,3; # returns 0 1 2 3
list {*}[split {0,1,2,3} ,]; # equivalent expression

I think this proposed feature has a lot of potential to "expand" the capabilities and expressiveness of Tcl.

What do you all think?


KSA: In general an interesting idea. However I wonder which use-cases there are that require it. For example things that can't be done by simply writing a proc (or things that are at least troublesome using a proc). It's obvious that a proc is of no help for what {*} does and the same is true for {#} as well. But apart from these two?

Okay, after some thinking, this might be another use-case: Would it be possible to simplify array arguments by introducing references (e.g. using a {&} prefix)? Example:

proc print_array {v} {
   foreach key [array names v] {
      puts "$key: $v($key)"
   }
}
array set x {a 0 b 1 c 2}
print_array {&}$x

But on the other hand the current solution is not much more complex:

proc print_array {v_} {
   upvar $v_ v
   foreach key [array names v] {
      puts "$key: $v($key)"
   }
}
array set x {a 0 b 1 c 2}
print_array x

AMB The implementation that I proposed would not allow for the sort of array referencing you show. The proposed prefix notation works exactly like the existing {*} notation, in that it returns multiple arguments, but with similar functionality to a single-argument proc. This would be consistent with the behavior of {*} and allow for the proposed behavior of {#} for comments.

With my proposed implementation, your example would return an error, because the interpreter would still parse the word after the prefix, and $x would return an error because x is an array.
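
The error in question already occurs in current Tcl whenever an array is read as if it were a scalar, before any command (or prefix handler) gets to see a value. A minimal demonstration:

```tcl
array set x {a 0 b 1 c 2}
# Substituting $x fails because x is an array, not a scalar
set status [catch {set y $x} message]
puts "$status: $message"
```

Here catch reports an error and the message is: can't read "x": variable is array.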

I think passing variables as references in procs is a different problem, best handled by a modification to the syntax of proc itself.

Edit: upon reflection, you could introduce something to call within the body of the proc as a shorthand for “upvar”

Here is an example of how I could see it be implemented. I don’t know if I like it, again I think that passing variables by reference should be a change to the syntax of proc itself.

prefix add {c} {apply {{csv} {split $csv ,}}}
prefix add {&} {apply {{vars} {
    set command [list upvar 1] 
    foreach var [list {c}$vars] {
        lappend command $var $var
    }
    return $command
}}}

proc foo {} {
    {&}a,b,c
    puts [list $a $b $c]
}

set a 1
set b 2
set c 3
foo; # prints 1 2 3

More examples:

Double expansion. Kinda goofy lol.

prefix add {**} {apply {{matrix} {concat {*}$matrix}}}
set matrix {{1 2 3} {4 5 6}}
foreach element [list {**}$matrix] {
    puts $element
}
# prints 1 2 3 4 5 6

Not all prefix use cases return multiple arguments. Here’s another example: this one returns one value: a file path with spoofed tilde substitution. Note that the returned value is a one-element list.

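prefix add {~} {apply {{path} {
    # Strip the leading slash so that "file join" does not treat the path as absolute
    list [file join [file home] [string trimleft $path /]]
}}}
source {~}/myfiles/hello.tcl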

Another single-arg use case that people might use is shorthand for expr. Personally, I think that an alias of = for expr is good enough, but this would be a possibility with the proposed “prefix” command.

prefix add {=} {apply {{expr} {
    list [uplevel 1 [list expr $expr]]
}}}
set x 5
set y {=}{$x * 4}; # 20
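
For comparison, the alias mentioned above needs nothing new. A minimal sketch using the existing interp alias command, shown at global level:

```tcl
# Make "=" an alias for expr in the current interpreter
interp alias {} = {} expr
set x 5
set y [= {$x * 4}]
puts $y; # 20
```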

Another idea is to convert key=value values into a dictionary and then pass the values on as individual arguments.

prefix add {keys} {apply {{input} {
    set result {}
    foreach entry [split $input ,] {
        lassign [split $entry =] option value
        dict set result -$option $value
    }
    return $result
}}}

proc printOptions {args} {
    foreach {option value} $args {
        puts "$option: $value"
    }
}
printOptions {keys}{foo=bar,hello=world}
# prints "-foo: bar" and "-hello: world"
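
The same conversion can be written today as an ordinary proc combined with {*}, which may help when judging how much the prefix form would save. In this sketch keysToOptions is a hypothetical helper name chosen for the example.

```tcl
# Hypothetical helper: turn "k1=v1,k2=v2" into an option dictionary
proc keysToOptions {input} {
    set result [dict create]
    foreach entry [split $input ,] {
        lassign [split $entry =] option value
        dict set result -$option $value
    }
    return $result
}

proc printOptions {args} {
    foreach {option value} $args {
        puts "$option: $value"
    }
}
printOptions {*}[keysToOptions foo=bar,hello=world]
# prints "-foo: bar" and "-hello: world"
```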

Essentially, the majority of potential use cases I see are those where you are already using {*}, as well as a few convenience cases for one-argument output.

Edit: you could also use it as a sort of annotation for specific types of values.

For example:

prefix add {f} ::tcl::mathfunc::double
set x {f}10; # 10.0

MG If I've understood the intent correctly, for {#} to be for inline commenting in the middle of a command, wouldn't it result in an additional null argument? If you had, for instance

foo $bar {#}"this is a comment" $baz

You'd end up running foo with 3 arguments, the second being an empty string, wouldn't you? I don't see any way around this if {#} is implemented at the Tcl level, and it would be a proper parser change if it's implemented at the C level, so that an argument starting with {#} is parsed and then completely ignored.

CGM Yes, the prefix command itself has to be implemented in C and integrated with parsing, as {*} is now. But the code it runs for each case can then be written in Tcl.
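
The difference between an empty argument and no argument at all can be seen with the existing {*} prefix: expanding an empty list contributes zero words. A small check, where countArgs is a helper made up for this example:

```tcl
# Hypothetical helper that reports how many arguments it received
proc countArgs {args} {
    llength $args
}
set bar 1
set baz 2
puts [countArgs $bar {} $baz];        # 3: the empty string is a real argument
puts [countArgs $bar {*}{} $baz];     # 2: an expanded empty list adds nothing
puts [countArgs $bar {*}[list] $baz]; # 2
```

A parser-level {#} whose handler returns an empty list would fall into the second case.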

<FM> The way I see it: some prefixes should be recognized by the parser itself, as general syntax rules of Tcl, like the prefixes {*}, {=} or {#}, so that the parser can act on them directly, possibly modifying the number of arguments of the command line, as it does for the {*} prefix and as it should for the {#} prefix.

But other prefixes, unknown to the parser, should just be passed as « metadata » tied to the Tcl_Obj holding the argument. This way, the C function could be configured to detect a specific prefix (through the metadata of its arguments) and act accordingly.

For instance, let's say we have a {lst} prefix, unknown to the parser, and let's write this command in console :

set {lst}L A B C D

Tcl_SetObjCmd will receive an objv with 5 arguments:

  • the first one is a Tcl object with string representation « L », tagged with the metadata {lst}
  • the remaining ones are plain Tcl objects

If Tcl_SetObjCmd is able to detect the metadata of the first argument, and has been made aware that this {lst} prefix denotes a list, no error should be thrown: it can immediately create a variable « L » holding a list whose content is all the remaining arguments of objv.

Similarly, let's say we have an {args} prefix which is unknown to the generic Tcl parser, and let's write this command in the console:

proc foo {args}{
   {#}{my improved argument specification language here}
   {&}myUpvar
} {
   # body of the proc
   return $myUpvar
}

The parser will simply pass to the proc command an objv[] whose last two elements are the argument-list Tcl_Obj, tagged with the metadata object {args}, and the body Tcl_Obj. If the proc command is able to detect this specific {args} prefix metadata, it can use a new, dedicated argument parser such as the one a TIP has described.</FM>


CGM Wow! I had been thinking of the same idea but had not found the time to write it up. I particularly like {=} as a compact way to invoke expr, which obsoletes my own TIP 676.


<FM> Let's continue to explore the « set modulation » possibilities... Let's generalize () to denote part of any collection type: arrays, lists, dicts, trees, matrices, graphs. Let's explore the possibilities this offers for a list variable. Imagine we denote a continuous range with -> and separate discrete elements with a pipe |

set {lst}L a b c d e f g
set i 3
set result0 {lst}$L(0->{=}$i-1|{=}$i+1->end) ; # result : a b c e f g
set result1 [list {*}[lrange $L 0 [expr {$i-1}]] {*}[lrange $L [expr {$i+1}] end]] ; # result : a b c e f g
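
For the specific case of dropping a single index, current Tcl already has a shorter spelling than the pair of lrange calls, which may be a fairer baseline for comparison (the variable name result2 is just for this example):

```tcl
set L {a b c d e f g}
set i 3
# lreplace with no replacement elements deletes the range i..i
set result2 [lreplace $L $i $i]
puts $result2; # a b c e f g
```

The range notation above would still be more general, since it can combine several disjoint ranges in one expression.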

Moreover, if expr is made aware of metadata, it can also modulate its behaviour, knowing the « type » of a variable

# let's denote a 3x3 matrix by a {m(3x3)} prefix
proc tcl::mathop::* {left right} {
    set TypeLeft [info metadata name $left]
    set TypeRight [info metadata name $right]
    switch $TypeLeft {
       m {
          # matrix type data on the left : must be the same on the right
          if {$TypeRight eq "m"} {
              # the inner dimensions must agree: columns of left == rows of right
              set i [lindex [split [lindex [info metadata parameters $left] 0] "x"] 1]
              set j {lst}[split {lst}[info metadata parameters $right](0) "x"](0)
              if {$i == $j} {
                   tcl::mathfunc::matrix::* $left $right
              } else {
                   puts stderr "dimension incompatibility between matrices"
              }
          } else {
              puts stderr "type incompatibility between arguments"
          }
       } default {
            tcl::mathfunc::number::* $left $right
       }
    }
}
set {m(3x3)}Mat1 {
   1.0 2.0 3.0
   2.0 3.1 3.8 
   0.5 1.3 5.7
}
set {m(3x3)}Mat2 {
   1.0 2.0 3.0
   2.0 3.1 3.8 
   0.5 1.3 5.7
}
set Product {=}{
   $Mat1 * $Mat2
}

Conclusion :

  • A prefix could be seen as metadata.
  • The metadata value could be retrieved by a command: info metadata name /objectName/
  • Metadata could have parameters.
  • The list of parameters could be retrieved by a command: info metadata parameters /objectName/
  • The rule could be to have a comma-separated list of parameters enclosed in parentheses: {meta(param1,param2,...etc)}Word

</FM>

AMB: If the prefix notation is used for defining types in the "set" command, the syntax of the "set" command should be modified to allow for this. Here is how I see it working:

set varName ?-type type? ?value?

The optional "type" argument would assert that the input is of that type, and then attach metadata to the variable that would enforce the type whenever the variable is accessed, unless overwritten by calling set with a -type specified again. This would reduce unwanted shimmering in scripts. The default type would be {}, and the type of a variable could be accessed through "info type varName".

Expanding upon this, a new command called "typedexpr" or something could be added that returns "-type type value", determining the type of the resulting expression based on the types of the variables accessed. This could potentially allow for vector math, which is a feature that has been widely requested.

Then, prefixes could be defined to annotate your code as shown:

prefix add {num} {apply {{varName} {
    list $varName -type double
}}}
prefix add {=} {typedexpr}
set {num}x 10; # equivalent to "set x -type double 10"
set {num}y 4
set z {=}{$x/$y}; # expands to -type double 2.5
puts [info type z]; # double
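
The enforcement half of this idea can be approximated in current Tcl with a write trace. The sketch below is an assumption-laden stand-in, not the proposed syntax: typedSet, typedCheck and the ::typedLast array are hypothetical names, the "type" is simply a class accepted by string is, and only global scalar variables are handled (re-typing an already typed variable is not covered).

```tcl
# Hypothetical stand-in for "set varName -type type value" (global scalars only)
proc typedSet {varName type value} {
    if {![string is $type -strict $value]} {
        error "expected $type but got \"$value\""
    }
    upvar #0 $varName var
    set var $value
    set ::typedLast($varName) $value
    trace add variable var write [list typedCheck $varName $type]
}
# Write trace: accept valid values, otherwise restore the last valid one and fail
proc typedCheck {varName type args} {
    upvar #0 $varName var
    if {[string is $type -strict $var]} {
        set ::typedLast($varName) $var
    } else {
        set bad $var
        set var $::typedLast($varName)
        error "expected $type but got \"$bad\""
    }
}

typedSet x double 10
set ::x 4.5
set rejected [catch {set ::x hello} message]
puts "$rejected $::x"; # 1 4.5
```

Unlike the proposal, this checks the string form of the value on every write rather than tracking an internal representation, so it does nothing about shimmering.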

I reiterate: I do not think that the {prefix} notation should have any significant meaning besides returning multiple arguments. It can be combined with other features to appear to have special meaning, but ultimately it should just return a list that is expanded by the parser.

Edit: Another reason for this proposed behavior of the prefix notation, besides being consistent with the existing behavior of {*}, is that it would not add any rule to the Dodekalogue. It would simply modify the rule about argument expansion.

FM: Only the {*} prefix expands the command line arguments. The {#} prefix would shrink them by one (it cancels its own word). {=} would leave their number unchanged, only changing the way the word is interpreted.

The prefix syntax would no longer be limited to the expansion case. It is still one rule, namely that a word prefixed by {prefix} has a special meaning, but this rule becomes a tree with two main branches:

  • rule : a word prefixed by {prefix}
    • {prefix} recognized by Tcl parser which allows special action of it :
      • Expansion argument prefix {*}
      • comment argument prefix {#}
      • math mode argument prefix {=}
    • {prefix} unknown to the Tcl parser, which is simply tied as metadata to the argument in the command's objv. It is the responsibility of the command to use this metadata or not, and to do whatever it wants with it. Some prefixes are predefined for some commands (refer to the command's manual for their effect).
      • predefined prefix
        • Prefixes recognized by « set » (refer to this command manual)
          • prefix on 2nd argument :
            • {list} : take this var as a list.
            • {dict} : take this var as a dict.
            • {string} : take this var as a string.
          • prefix on 3rd,..., nth argument :
            • {list} : take this value as a list.
            • {dict} : take this value as a dict.
            • {string} : take this value as a string.
        • Prefixes recognized by « proc » (refer to this command manual)
          • prefix on 2nd argument :
            • {namedargs} : parse proc arguments as named argument
            • {args} : use the new argument parser, allowing named and positional arguments to be mixed
            • {xml} : parse arguments as an XML string (access with tdom)
            • {c} : parse arguments as C language arguments
          • prefix on 3rd argument :
            • {c} : compile body as C code
            • {asm} : compile body as asm
        • ...etc

KSA Just a remark on the interesting example given by FM: Unfortunately it's currently not possible to override/replace math operators. It's possible to redefine the ::tcl::mathop::* procs, but Tcl does not care about this and will not call them when evaluating expr. There's a small remark in the docs that confirms this: ::tcl::mathop. I would be delighted if this could be changed in future versions of Tcl. In fact this was a show-stopper for my attempts to add list/vector support to the expr command.
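
The asymmetry between functions and operators can be observed directly. In this small experiment, twice is a function name invented for the example, and the redefinition of + is deliberately destructive, so it is best tried in a throwaway interpreter:

```tcl
# A new math function: expr finds it through ::tcl::mathfunc
proc ::tcl::mathfunc::twice {x} {
    expr {2 * $x}
}
puts [expr {twice(21)}]; # 42

# Replacing an operator command: expr does not consult ::tcl::mathop
proc ::tcl::mathop::+ {args} {
    return "overridden"
}
puts [::tcl::mathop::+ 1 2]; # overridden
puts [expr {1 + 2}];         # 3
```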

Maybe I'm a bit blind here, but I don't see the benefit of replacing expr by some other notation yet. If users are not satisfied with expr they could do the following:

proc = {args} {
   return [expr $args]
}
set x [= 42 * 8 + 1]
puts $x

Or even more drastic:

proc unknown {args} {
   return [expr $args]
}
set x [1+1]
puts $x

This is straightforward, using only known Tcl rules, no magic needed.

Adding metadata could be interesting but may require some careful thinking: For example what happens if metadata declares that a variable $x contains a 3x3 matrix but value of $x is later changed to say "Hello world"? Does altering the value of a variable remove its metadata or is it assigned as long as variable exists (even if the value is no longer what the metadata suggests)? Can a variable have multiple metadata "tags" assigned? This might be important when passing variables between packages that assign "their" metadata "tags". One package should not remove the metadata the other package relies on and vice versa. What if two variables with metadata assigned are concatenated? etc.

FM We can consider the difference :

> set {matrix(3,3)}Id {{1 0 0} {0 1 0} {0 0 1}}
{{1 0 0} {0 1 0} {0 0 1}}
> set {matrix(4,4)}M $Id
"Type mismatch between matrix(4,4) matrix(3,3) : unable to set M"
> set M {matrix(4,4)}$Id
"warning : implicit conversion from matrix(3,3) to matrix(4,4)"
> set Id "Hello world" ; # error
"unable to interpret 'Hello world' as a matrix(3,3)"
> unset Id
> set Id {matrix(3,3)}{{1 0 0} {0 1 0} {0 0 1}}
{{1 0 0} {0 1 0} {0 0 1}}
> set Id "Hello world"
Hello world

If it's the variable name that is tagged, the variable is typed, and no shimmering is allowed any more.

But if it's the value that is tagged, it's a normal variable, which allows shimmering as usual.

So we can choose: does my variable need a fixed type? Should I allow it to shimmer or not?

The question of concatenation is far more complex. It depends on how the set command reacts to the metadata.

> set {list}L 1 2 3
> set {dict}D k1 v1 k2 v2
> set {string}S "hello world"
# A simple concatenation just concatenates the values (everything will shimmer)
> set A [list $L $D $S]
"{1 2 3} {k1 v1 k2 v2} {hello world}"
# A more sophisticated solution that won't shimmer: we need to keep track of each component.
> set {struct}myStruct \
    {{list}myList}$L \
    {{dict}myDict}$D \
    {{string}myString}$S
"myList {1 2 3} myDict {k1 v1 k2 v2} myString {hello world}"
> set {myList}myStruct(1)
"2"
> set {myDict}myStruct(k2) 
"v2"
> set {myString}myStruct(0->4)
"hello"

I need to think more about it.


AMB - 2025-06-12 11:55:31

I’ll be honest, I really don’t like any interpretation of a prefix other than one that expands arguments. I was simply suggesting that you could define custom prefixes that do preprocessing of the input that comes after the prefix, as defined by commandPrefix. Again, this would be compatible with the existing {*} prefix, and the proposed comment prefix {#}. It wouldn’t radically alter the language.

I want to be able to define a custom prefix within a Tcl session with a new command “prefix”, in a way that is completely compatible with the existing {*} operator. That is all.

Oh, and in regard to shorthand for expr, I agree that an alias of "=", and/or importing the math operators and functions directly for prefix notation is sufficient, but a custom prefix would be another option.

Edit: Last weekend I did a deep dive in how the parser works, and specifically how it interprets the “expansion prefix”. From my understanding, it should be easy to have it look up registered prefixes like it looks up commands and then expand the word as usual.

Edit edit: Upon further reflection, I realize that as it stands, the {*} only has meaning within a Tcl command, and is not actually valid syntax within lists or dictionaries. So I see the potential to utilize the {prefix}word syntax for rigid type annotation, even within lists.

FM I know you dislike it. Personally, I don't want to have a preference « a priori ». A lot of the discussion was skeptical about the expand operator {*} syntax. Nowadays, nobody complains about it anymore, everybody just uses it. So, I just want to see the possibilities the concept can offer.

But you are quite right: the expand operator {*} is not implemented within lists. Let's type:

set L {1 {*}{2 3 4} 5}
lindex $L
"list element in braces followed by "{2" instead of space"

L is not a valid list: it will remain a string forever. The complaint comes from FindElement in tclUtil.c, which expects a canonical list representation. What if the canonical list representation were allowed to accept, and recognize, the expansion operator? We would have instead:

> set L {1 {*}{2 3 4} 5}
> lindex $L
"1 2 3 4 5"

I don't see any use case of this construct.
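
The current behaviour can be checked with any command that has to parse the string as a list. This sketch uses llength for that purpose, and contrasts it with the same text written as the words of a command:

```tcl
set L {1 {*}{2 3 4} 5}
# As list data, the string does not parse: a closing brace must be followed by a space
set status [catch {llength $L} message]
puts "$status: $message"

# As words of a command, {*} is handled by the parser and expands as usual
set expanded [list 1 {*}{2 3 4} 5]
puts $expanded; # 1 2 3 4 5
```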

But there is one I'm missing sometimes, in the expression parsing context. Let's imagine you have a list of values, and you want to calculate the min, the max, the mean, or apply any statistical function to the values of this list :

> set L [list 1 2 3 4 5]
> set max [expr {max({*}$L)}]
"missing operator at _@_
in expression max({*}_@_$L"
> set max [expr {max([join $L ,])}]
"expected floating-point number but got 1,2,3,4,5"

I can't even brace my expression... I must write this, which is weird and not really explicit:

> set max [expr max([join $L ,])]
"5"

In the expr context, the expansion operator {*} should separate the elements of the list with commas. This way, with an {=} expression operator, we could just write:

> set max {=}max({*}$L)

Which is much clearer IMO.


AMB - 2025-06-16 15:22:15

Well, upon thinking about it more, I don't mind there being other special meanings to {prefix}word beyond the expansion prefix. Also, I did not know of the existence of the prefix command in Tcl, so my proposed syntax would conflict with an existing command. I also realize that the syntax error of characters after a brace (or quote, for that matter) is an opportunity to expand the language, and thus a solution like what I proposed could potentially permanently stunt the future of the language.

Regarding the issue with the math "max" and "min" commands, I agree, it is annoying that we have comma-delimited arguments in expr and space delimited everywhere else. There are a few ways to get around it though:

1. Expose the Tcl math functions

namespace path ::tcl::mathfunc
set L [list 1 2 3 4 5]
set max [max {*}$L]; # or "set max [expr {[max {*}$L]}]"

2. Create your own max mathfunc that takes a list rather than multiple arguments

proc ::tcl::mathfunc::maxl {list} {
    ::tcl::mathfunc::max {*}$list
}
set L [list 1 2 3 4 5]
set max [expr {maxl($L)}]

Personally, I like to expose the math functions (and operators).


KSA: I've published a testing project of mine that adds vector/list support to the expr math functions: expr++. Besides vector/list support, it also adds some new expr math functions I considered missing or useful. It should replace the existing math functions to support lists/vectors without creating incompatibilities (hopefully). And it also "fixes" the max() and min() math functions:

set x {1 5 3 4 2}
puts [expr {max($x)}]
puts [expr {max(1,5,3,4,2)}]

The only limitation (for now) is that math operators can't be replaced. So, it wasn't possible to extend + or * operator to support lists/vectors. Instead new functions add() and mul() were added.