Proposal of syntax, expressiveness and semantic improvements with annotations

Introduction

The introduction of the {*} expansion prefix brought clarity to Tcl.

There is still room in the parser for annotations of this kind. But how could they be useful to programmers?

Let's suppose that we add a new rule to Tcl:

  • A word enclosed in braces and placed as a prefix to another word is called an annotation of that word.
  • These annotations can be recognized at two levels:
    • in a generic way, by the top-level parser, as a 'generic annotation'
    • in a specific way, inside a command's own parser, as a 'specific annotation'

Generic annotations for the Tcl parser:

Generic annotations should be recognized by the Tcl parser and trigger an immediate action from it. Here is a list of suggestions:

  • {*}$list : the existing expansion prefix
  • {=}$expr : new expression prefix: the prefixed word is evaluated as an expression
  • {&}$ref : new reference prefix: the prefixed word is captured as a reference to a variable
  • {#}comment : new comment prefix: the prefixed word is skipped by the parser
  • {:}Label : new label prefix: the prefixed word is recorded as a location in the script
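
For comparison, here is a small sketch of how the effects listed above are obtained in Tcl today, without the new prefixes. The helper incrBy is a hypothetical name introduced only for this illustration; it is not part of the proposal.

```tcl
# {*}$list : expansion already exists (Tcl 8.5 and later).
set items {1 2 3}
set copy [list {*}$items]

# {=}$expr : today an expression is evaluated with [expr {...}].
set x 3
set y [expr {$x * 2}]

# {&}ref : today a variable is passed by name and bound with upvar.
# incrBy is a hypothetical helper, used only for this illustration.
proc incrBy {varName amount} {
    upvar 1 $varName var
    incr var $amount
}
incrBy x 4

# {#}comment : today a comment must start where a command is expected.
# {:}Label   : Tcl has no labels, so there is no direct equivalent today.
puts "copy=$copy y=$y x=$x"
```

The proposed prefixes would let each of these be written inline on the word itself, the way {*} already works for expansion.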

Specific annotations :

Specific annotations should also be recognized by the Tcl parser, but they only produce an annotation token, whose following component is the prefixed word. Substituting this annotation token later yields an annotated Tcl_Obj. This means the proposal implies an evolution of the Tcl_Obj structure.

How the annotation is used is specific to the command that receives it, so no generic rules can be given for it. It acts like a tag on the received Tcl_Obj that can be read through a function, let's say Tcl_getNoteFromObj.

But we can identify some attractive features it could offer for the main Tcl entities:

  • Use of annotations for variables:
    • Help and simplify the way data is set in the code.
    • Constrain the type of a variable. Why? Type theory nowadays has a number of applications, for instance in mathematics (see Coq). Programmers need to model reality, and that commonly involves typologies or taxonomies. So there is a need for it.
  • Use of annotations for commands:
    • shorthand for upvar
    • named procedure arguments?

Use of annotations on variables

A variable basically appears as a Tcl_Obj in some key position in a script. Most of the time it is the second Tcl_Obj of the set command (the same holds for append, lappend, lset, etc.). It also appears when a token is prefixed by the $ character. The idea here is to adapt the behavior of set according to the annotation carried by the object.

Generic annotations of variables

set would recognize some generic annotations that are part of the language:

Annotation  Example                                       Effect
{list}      set {list}L = e0 ... ei ... en                create a listObject
{list}      set {list}L(0) = e1                           set the first element of L to e1
{list}      set {list}L(1:end) = {}                       set the elements from 1 to end to null
{dict}      set {dict}D = k0 v0 ... ki vi ... kn vn       create a dictObject
{dict}      set {dict}D(k0) = v0                          create a dictObject, set its value to v0 at key k0
{dict}      set {dict}D(k0,k1,k2) = v0 v1 v2              create a dictObject, set multiple values at multiple keys
{num}       set {num}i = 1                                create a numObject, set it to 1
{num}       set {num}i = math_expr                        create a numObject, set it with expr
{string}    set {string}s = $prefix $infix                create a stringObject, append each argument to it
{string}    set {string}s += $suffix                      append $suffix to s
{string}    set {string}s -= $infix                       remove the string $infix from s
{enum}      set {enum}e = n0 c0 ... ni ci ... nn cn       create an enum object
{enum}      set {enum}e(n0)                               select the n0 component of the enum e, return its c0 value

Note that, since set knows the type of the variable through the annotation, we can extend the use of parentheses.
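
To make the table more concrete, here is a minimal sketch that emulates a few of its rows in current Tcl. Because a word such as {list}L is rejected by today's parser, the type is passed as a separate argument. The command name aset is hypothetical, only the list and dict rows are covered, and range indices such as L(1:end) are not handled.

```tcl
# aset is a hypothetical stand-in for the annotated "set" of the proposal.
#   aset list L = a b c    is the analogue of   set {list}L = a b c
#   aset list L(0) = z     is the analogue of   set {list}L(0) = z
#   aset dict D(k) = v     is the analogue of   set {dict}D(k) = v
#   aset list L(0)         reads element 0 of L
proc aset {type target args} {
    # Split "name(index)" into its two parts; the index is optional.
    if {![regexp {^([^()]+)(?:\((.*)\))?$} $target -> name index]} {
        error "bad target \"$target\""
    }
    upvar 1 $name var

    # Read form: no "=" and no values.
    if {[llength $args] == 0} {
        if {$index eq ""} {return $var}
        switch -- $type {
            list    {return [lindex $var $index]}
            dict    {return [dict get $var $index]}
            default {error "unsupported type \"$type\""}
        }
    }

    # Write form: "=" followed by the values.
    if {[lindex $args 0] ne "="} {
        error "expected \"=\" after \"$target\""
    }
    set values [lrange $args 1 end]
    switch -- $type {
        list {
            if {$index eq ""} {
                set var $values
            } else {
                lset var $index [lindex $values 0]
            }
        }
        dict {
            if {$index eq ""} {
                set var [dict create {*}$values]
            } else {
                dict set var $index [lindex $values 0]
            }
        }
        default {error "unsupported type \"$type\""}
    }
    return $var
}

aset list L = a b c
aset list L(0) = z
aset dict D = k0 v0
aset dict D(k1) = v1
puts "L=$L D=$D"
```

The point of the sketch is that the type tag decides how the parenthesized part is interpreted: as a list index for list, as a key for dict.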

Specific annotations of variables

set could recognize annotations configured by the programmer. We can distinguish three main kinds of configurable annotations:

  • attribute
  • entity
  • relation

An attribute is like a value:

  • It has one or more fields.
  • It can be parsed from a string (generic method).
  • It can be represented as a string (generic method).
  • It may have specific operations working on its value (optional methods).

Example :

define {attribute}complex {
        fields {
                {num}reel 0
                {num}imaginaire 0
        } fromString s {
            regexp {^(-?([0-9]*\.)?[0-9]+)?(([+-]([0-9]*\.)?[0-9]*)i)?$} $s -> reel - - imaginaire
        } toString {} {
                format "%d+%di" $reel $imaginaire
        } toPolar {} {
             ...etc
        }
}

set {complex}z = {1+3i} {#}{set would call the generic method "fromString"}
set {complex}z          {#}{set would call the generic method "toString"}
set {complex}z(reel)
set {complex(toPolar)}z {#}{set would call the specific method "toPolar"}
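
As a rough illustration of what the two generic methods could do, here is a sketch in plain Tcl. It reuses the regular expression from the definition above; the proc names complexFromString and complexToString, and the choice of a dict to hold the fields, are assumptions made for this example only.

```tcl
# Hypothetical stand-in for the generic "fromString" method.
# Returns a dict with the two fields of the attribute.
proc complexFromString {s} {
    set re {^(-?([0-9]*\.)?[0-9]+)?(([+-]([0-9]*\.)?[0-9]*)i)?$}
    if {![regexp $re $s -> reel - - imaginaire]} {
        error "not a complex number: \"$s\""
    }
    # A missing part defaults to 0, as in the field declarations.
    if {$reel eq ""} {set reel 0}
    switch -- $imaginaire {
        ""      {set imaginaire 0}
        "+"     {set imaginaire 1}
        "-"     {set imaginaire -1}
        default {set imaginaire [string trimleft $imaginaire +]}
    }
    return [dict create reel $reel imaginaire $imaginaire]
}

# Hypothetical stand-in for the generic "toString" method.
proc complexToString {z} {
    set re [dict get $z reel]
    set im [dict get $z imaginaire]
    # Negative imaginary parts already carry their sign.
    set sign [expr {$im < 0 ? "" : "+"}]
    return "$re$sign${im}i"
}

set z [complexFromString 1+3i]
puts [dict get $z reel]
puts [complexToString $z]
```

Under the proposal, set {complex}z = {1+3i} would perform the first call implicitly and set {complex}z the second.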

The definition of an attribute needs two tables: one for the fields and one for the operations on it. A field value can be read or set through an array-like construct:

set {AttributeType}AttributeName(fieldName) = value

An operation can be invoked through an array-like annotation parameter:

set {AttributeType(specificOperation)}AttributeName

NB: this implies introducing annotation parameters: {annotation(parameter)}value
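
A sketch of how a word of the form {annotation(parameter)}value could be split into its three parts at script level. The proc name parseAnnotated and the dict it returns are assumptions for illustration; in the proposal this work would be done by the parser itself.

```tcl
# parseAnnotated is a hypothetical helper: it splits an annotated word
# into its note, its optional parameter and the annotated value.
proc parseAnnotated {word} {
    set re {^\{([^(){}]+)(?:\(([^{}]*)\))?\}(.*)$}
    if {![regexp $re $word -> note param value]} {
        # No annotation prefix: the whole word is the value.
        return [dict create note {} param {} value $word]
    }
    return [dict create note $note param $param value $value]
}

# The words are brace-quoted here because today's parser would reject
# them if written bare.
set a [parseAnnotated {{complex(toPolar)}z}]
set b [parseAnnotated {{list}L}]
set c [parseAnnotated plainWord]
puts $a
```

Note that the parameter itself may contain spaces, which is what the select annotation for proc arguments relies on.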

An entity is like an object. It has a dictionary of named and typed attributes. We can access the value of one of its attributes through an array-like construct:

set {entityType}EntityName(AttributeName)
set value {entityType}$EntityVarName(AttributeName)

We should also be able to set several attributes at once:

set {entityType}EntityName attrName_1 attrValue_1 ... attrName_i attrValue_i ... attrName_n attrValue_n

An entity has a special attribute, its index, which is its position in the collection of entities known to the interpreter. We access it through the plain variable syntax:

set {entityType}EntityName {#}{return the entity identifier of EntityName}
set id {entityType}$EntityVarName {#}{set id to the entity identifier of EntityVarName}

An entity also has a collection of methods. A method can be evaluated through an array-like construct, but only in command position. The logic is simple: what you obtain as a command, use as a command; what you obtain as a variable, use as a variable. Example:

EntityName(methodName) {*}$args
set R [EntityName(methodName) {*}$args]

The definition of an entity should look like this :

define {entity}Personne {
        attributes {
                {string}Nom = {}
                {string}Prénom = {}
                {enum}Sex = {Homme M Femme F}
                {string}Occupation = {}
                {date}DateOfBirth = {}
        } methods {
                age {} {
                        return [expr {
                                [clock format [clock seconds] -format %Y]-[clock format $DateOfBirth -format %Y]
                        }]
                }
        }
}

set {Personne}P1 = Nom OUT Prénom John Sex Homme Occupation Programmer
set {Personne}P2 = Nom WIN Prénom July Sex Femme Occupation HouseWife
set {Personne}P3 = Nom LAW Prénom John Sex Homme Occupation Lawyer

set P1(DateOfBirth) [clock scan 1955/04/10 -format "%Y/%m/%d"]
$P1(age)  {#}{compute the age of entity $P1}
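
For comparison, here is roughly how the Personne entity can be approximated today with TclOO (Tcl 8.6 or later). This is only a sketch under assumptions: the attributes are kept untyped in a dict, the attribute list is abbreviated, the accessor method attr is a hypothetical name, and age takes an optional reference time so that the result does not depend on the current date.

```tcl
package require TclOO

oo::class create Personne {
    # All attributes live in one dict; no type checking is done here.
    variable attrs

    constructor {args} {
        set defaults {Nom {} Sex {} Occupation {} DateOfBirth {}}
        set attrs [dict merge $defaults $args]
    }

    # attr name        -> read the attribute
    # attr name value  -> set the attribute, then return it
    method attr {name args} {
        if {[llength $args] == 1} {
            dict set attrs $name [lindex $args 0]
        }
        return [dict get $attrs $name]
    }

    # Same computation as the age method above: a difference of
    # calendar years. "now" defaults to the current time.
    method age {{now ""}} {
        if {$now eq ""} {set now [clock seconds]}
        set born [dict get $attrs DateOfBirth]
        expr {[clock format $now -format %Y] - [clock format $born -format %Y]}
    }
}

set P1 [Personne new Nom OUT Sex Homme Occupation Programmer]
$P1 attr DateOfBirth [clock scan 1955/04/10 -format %Y/%m/%d]
puts [$P1 age [clock scan 2024/10/05 -format %Y/%m/%d]]
```

What the annotation syntax would add over this is the typed attribute declarations and the array-like access P1(DateOfBirth) in place of an explicit accessor call.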

A relation is the junction of several entities, each of which plays a role in it. A relation can have attributes and methods.

define {relation}Mariage {
        {Personne}Mari 
        {Personne}Femme
} {
        attributes {
                {date}DateOfWedding
                {place}PlaceOfWedding
                {date}DateOfDivorce
                {Personne}LawyerOfDivorce
        } methods {
                divorce {{date}Date {Personne}Lawyer {string}Format} {
                        set DateOfDivorce [clock scan $Date -format $Format]
                        set LawyerOfDivorce $Lawyer
                }
        }
}

set M [{Relation}Mariage \
        {Mari}$P1 \
        {Femme}$P2 \
        DateOfWedding [clock scan 05/10/2024 -format "%d/%m/%Y"] \
        PlaceOfWedding Buxerolles]

The entities playing a role in the relation are annotated with the role they play. The attributes are introduced as a dictionary.

As with entities, a relation's method can be evaluated through an array-like construct, only in command position.
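
A minimal sketch of the data a relation would carry, emulated with a plain dict in current Tcl. The proc name makeRelation is hypothetical, and roles are passed as ordinary key/value pairs because the {Mari}$P1 form cannot be written today.

```tcl
# makeRelation is a hypothetical helper. "roles" lists the role names;
# every other key/value pair is stored as an attribute of the relation.
proc makeRelation {roles args} {
    set rel [dict create roles {} attributes {}]
    foreach {key value} $args {
        if {$key in $roles} {
            dict set rel roles $key $value
        } else {
            dict set rel attributes $key $value
        }
    }
    # Every declared role must be played by some entity.
    foreach role $roles {
        if {![dict exists $rel roles $role]} {
            error "missing role \"$role\""
        }
    }
    return $rel
}

# P1 and P2 stand for entity identifiers here.
set M [makeRelation {Mari Femme} \
        Mari P1 \
        Femme P2 \
        PlaceOfWedding Buxerolles]
puts [dict get $M roles Mari]
```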

Specific annotations of proc arguments

The first step is to distinguish optional from mandatory arguments. Let's say:

The annotation {?}[list ...] denotes a list of optional arguments
The annotation {!}[list ...] denotes a list of mandatory arguments

In these lists, each argument is either an annotated element (a variable) or an annotated list of two elements (a variable and its default value):

 {Argument_annotation(parameter)}variable
 {Argument_annotation(parameter)}{variable defaultValue}

We can distinguish several kinds of argument here:

  • a named annotation denotes an argument that is introduced by a name. The option name is passed as the parameter.
 {named(nameOfOption)}{variable defaultVal}
  • a flag annotation denotes an option whose presence sets the annotated variable to 1 and whose absence leaves it at 0. The option name is passed as the parameter. I don't see any need for a default value here.
 {flag(nameOfOption)}variable
  • a select annotation denotes a list of options that set a single variable. The value each option gives to the variable is passed as the parameter, which is a dict mapping option names to values.
 {select(nameOfOption1 value1 nameOfOption2 value2 ... ...)}{variable defaultVal}
  • We should also be able to annotate a variable to restrict its value; the construct then looks like:
 {Argument_annotation(parameter)}{{Variable_annotation}variable defaultValue}
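
The three kinds of optional argument above can be sketched as a small option parser in current Tcl. This is only an illustration under assumptions: the proc name parseOptions is hypothetical, each spec entry is written as a plain list {kind parameter variable ?default?} instead of an annotated word, variable annotations are ignored, and the special handling of -- is left out.

```tcl
# parseOptions is a hypothetical helper.
#   spec : list of {kind parameter variable ?default?} entries
#   argv : the actual arguments of a call
# Returns a dict of variable values, plus the key "args" holding the
# words left over once the options have been consumed.
proc parseOptions {spec argv} {
    set result [dict create]
    set byName [dict create]
    foreach item $spec {
        lassign $item kind param var default
        switch -- $kind {
            named {
                dict set result $var $default
                dict set byName $param [list named $var]
            }
            flag {
                # Absent flag leaves the variable at 0.
                dict set result $var 0
                dict set byName $param [list flag $var]
            }
            select {
                dict set result $var $default
                # param is a dict mapping option names to values.
                dict for {opt val} $param {
                    dict set byName $opt [list select $var $val]
                }
            }
            default {error "unknown argument kind \"$kind\""}
        }
    }
    set i 0
    set n [llength $argv]
    while {$i < $n} {
        set word [lindex $argv $i]
        if {![dict exists $byName $word]} break
        lassign [dict get $byName $word] kind var val
        switch -- $kind {
            named {
                incr i
                if {$i >= $n} {error "missing value for $word"}
                dict set result $var [lindex $argv $i]
            }
            flag   {dict set result $var 1}
            select {dict set result $var $val}
        }
        incr i
    }
    dict set result args [lrange $argv $i end]
    return $result
}

set spec {
    {select {-ascii 0 -dictionary 1 -integer 2 -real 3} compare 0}
    {select {-increasing 0 -decreasing 1} order 0}
    {named -command command {}}
    {flag -unique unique}
}
set r [parseOptions $spec {-integer -decreasing -command cmp -unique {3 1 2}}]
puts $r
```

With annotations, the same specification would live directly in the proc's argument list, and the interpreter would do this binding before the body runs.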

Since annotating args currently produces an error, the existing syntax (without annotations) can coexist with any syntax that uses annotations. Old procs can be kept.

Some examples below, adapted from parse_args, illustrate the idea:

proc lsort {
        {?}{{#}{Optional parameters}
                {select(-ascii 0 -dictionary 1 -integer 2 -real 3)}{compare 0}
                {select(-increasing 0 -decreasing 1)}{order 0}
                {named(-command)}{command {}}
                {named(-index)}{index {}}
                {named(-stride)}{stride 1}
                {flag(-indices)}indices
                {flag(-nocase)}nocase
                {flag(-unique)}unique
                {flag(--)}endOfOptions
        } {!}list
} {
   # ...code
}

proc __entry {
    {?}{{#}{Optional arguments}
        {named(-disabledbackground)}{
            {Tk_color}disabledbackground {}
        }
        {named(-disabledforeground)}{
            {Tk_color}disabledforeground {}
        }
        {named(-invalidcommand)}{
            {Tcl_proc}invalidcommand {}
        }
        {named(-readonlybackground)}{
            {Tk_color}readonlybackground {}
        }
        {named(-show)}{
            show {}
        }
        {named(-state)}{
            {enum(normal disabled readonly)}state normal
        }
        {named(-validate)}{
            {enum(none focus focusin focusout key all)}validate none
        }
        {named(-validatecommand)}{
            {Tcl_proc}validatecommand {}
        }
        {named(-width)}{
            {Tk_dim}width 0
        }
        {named(-textvariable)}{
            {Tcl_var}textvariable {}
        }
    } {!}{{#}{mandatory argument}
        {Tk_widget}widget
    }
} {
        # code
}

Note that naming an argument is a way of annotating it. So we can also imagine an argument annotation indicating that the argument must be passed annotated. Let's look at this:

proc __entry {
    {?}{{#}{Optional arguments}
        {annoted(disabledbackground)}{
            {Tk_color}disabledbackground {}
        } ...
} {
...
}
__entry {disabledbackground}#bbaaee {state}normal .e

proc lsort {
        {?}{{#}{Optional parameters}
                {select(-ascii 0 -dictionary 1 -integer 2 -real 3)}{compare 0}
                {select(-increasing 0 -decreasing 1)}{order 0}
                {annoted(command)}{command {}}
                {annoted(index)}{index {}}
                {annoted(stride)}{stride 1}
                {flag(-indices)}indices
                {flag(-nocase)}nocase
                {flag(-unique)}unique
                {flag(--)}endOfOptions
        } {!}list
} {
   # ...code
}
lsort {command}customSort $L

...writing in progress...

Discussion

Difference between an annotated list and a dict

Let's imagine we can define an annotated list as:

 set {list}aL {annotation_1}e_1 ... {annotation_i}e_i ... {annotation_n}e_n

Let's imagine we can retrieve the values carrying a given annotation from the list using, for instance, a special annotation «{@}» on the list index:

 set {list}aL({@}annotation_i)

Then the annotation of an element looks a lot like a dict key, and an annotated list looks a lot like a dict.

 set {dict}D annotation_1 e_1 ... annotation_i e_i ... annotation_n e_n
 set {dict}D(annotation_i) 

So, is there a need for the concept of an annotated list if an annotated list and a dict are the same?

The difference could be that a dict returns only ONE value for a key, whereas an annotated list would return ALL the values that carry the same annotation. This way, a list of annotated entities can be filtered by annotation. For example:

 set {list}Widgets [list {button}.b1 {entry}.e1 {frame}.f1 {button}.b2 ... ]
 #  select all buttons from the Widgets list
 set Buttons {list}$Widgets({@}button)
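
The contrast with a dict can be checked in current Tcl by emulating the annotated list as a list of {annotation value} pairs. The proc name selectByNote is hypothetical, and the pair representation is an assumption made for this sketch.

```tcl
# selectByNote is a hypothetical helper: it returns every value of the
# emulated annotated list whose annotation equals "note".
proc selectByNote {annotatedList note} {
    set out {}
    foreach item $annotatedList {
        lassign $item itemNote value
        if {$itemNote eq $note} {
            lappend out $value
        }
    }
    return $out
}

# Emulated annotated list: each element is an {annotation value} pair.
set Widgets {{button .b1} {entry .e1} {frame .f1} {button .b2}}
set Buttons [selectByNote $Widgets button]

# The same data in a dict: a repeated key keeps only its last value.
set D [dict create button .b1 entry .e1 frame .f1 button .b2]
set OneButton [dict get $D button]

puts "annotated list: $Buttons"
puts "dict: $OneButton"
```

The annotated list keeps both buttons, while the dict keeps a single value per key.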

Since the interfaces differ, there are specific use cases for each structure. A dict is good for a list of distinct attributes. An annotated list is good for a list of typed entities.

Comments Welcome