itcl in Javascript Paper Chapter 3

Design Goals for Implementation of Itcl in Javascript

The internal types of a TclObject js Object (a Tcl_Obj in the C implementation) are:

  • OBJECT_TYPE_TEXT
  • OBJECT_TYPE_LIST
  • OBJECT_TYPE_INTEGER
  • OBJECT_TYPE_REAL
  • OBJECT_TYPE_BOOL
  • OBJECT_TYPE_DICT
  • OBJECT_TYPE_STMTS
  • OBJECT_TYPE_STMT
  • OBJECT_TYPE_WORD
  • OBJECT_TYPE_WORD_PART
  • OBJECT_TYPE_EXPR_TREE

The parsing rules for Tcl script input follow the Dodekalogue.

The Tcl input is parsed only partially, mostly for performance and, to some extent, to make the later execution of a statement easier to handle. Partial parsing works as follows: all tokenizing for Tcl is done, but no variables are expanded, no bracket commands are executed, and braced parts are handled as one token. Within quoted strings, the parts that have to be expanded later are parsed into separate “word_part” TclObject js Objects. The same is done for array names and array references.
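The splitting of a quoted string into parts can be illustrated with a minimal sketch. It is not the paper's implementation: the function names splitWordParts and findClose and the shape of the returned objects are hypothetical, array references are not handled, and the treatment of a braced part inside quotes as its own part is modeled on the quoted-string example in this chapter. Nothing is substituted or executed; the parts are only recorded for later expansion.

```javascript
// Hypothetical helper: index of the delimiter that closes the one at text[start].
function findClose(text, start, open, close) {
  let depth = 0;
  for (let j = start; j < text.length; j++) {
    const c = text[j];
    if (c === "\\") { j++; continue; }        // skip the escaped character
    if (c === open) depth++;
    else if (c === close && --depth === 0) return j;
  }
  throw new Error("missing " + close);
}

// Hypothetical sketch: split the inside of a quoted string into word parts.
// Variables and bracket commands are recorded, never expanded or executed.
function splitWordParts(text) {
  const parts = [];
  let literal = "";
  const flush = () => {
    if (literal !== "") parts.push({ token: "TOKEN_STR", value: literal });
    literal = "";
  };
  const emit = (token, value) => { flush(); parts.push({ token, value }); };
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    const name = ch === "$" ? /^[A-Za-z0-9_]+/.exec(text.slice(i + 1)) : null;
    if (ch === "\\" && i + 1 < text.length) {
      literal += ch + text[i + 1];            // keep backslash sequences verbatim
      i += 2;
    } else if (ch === "[") {
      const end = findClose(text, i, "[", "]");
      emit("TOKEN_CMD", text.slice(i + 1, end));
      i = end + 1;
    } else if (ch === "$" && text[i + 1] === "{") {
      const end = findClose(text, i + 1, "{", "}");
      emit("TOKEN_BRACED_VAR", text.slice(i + 2, end));
      i = end + 1;
    } else if (name !== null) {
      emit("TOKEN_VAR", name[0]);
      i += 1 + name[0].length;
    } else if (ch === "{") {
      const end = findClose(text, i, "{", "}");
      emit("TOKEN_BRACE", text.slice(i + 1, end));
      i = end + 1;
    } else {
      literal += ch;                          // plain text accumulates into one part
      i++;
    }
  }
  flush();
  return parts;
}

console.log(splitWordParts("abc[x a]$y{d e f}yyy"));
```

Each returned part could then be wrapped in its own word-part object, so that execution only has to expand the parts marked as variable or command.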

Tokens

The tokens returned from parsing are:

  • TOKEN_WORD_SEP
  • TOKEN_STR
  • TOKEN_EOL
  • TOKEN_EOF
  • TOKEN_ESC
  • TOKEN_CMD
  • TOKEN_VAR
  • TOKEN_EXPAND
  • TOKEN_PAREN
  • TOKEN_BRACE
  • TOKEN_VAR_ARRAY
  • TOKEN_VAR_ARRAY_NAME
  • TOKEN_ARRAY_NAME
  • TOKEN_VAR_COMPOSED
  • TOKEN_BRACED_VAR
  • TOKEN_QUOTED_STR
  • TOKEN_COMMENT
  • TOKEN_DECIMAL
  • TOKEN_INTEGER
  • TOKEN_REAL
  • TOKEN_BOOLEAN
  • TOKEN_HEX
  • TOKEN_OCTAL
  • TOKEN_MINUS
  • TOKEN_PLUS
  • TOKEN_MUL
  • TOKEN_DIV
  • TOKEN_MOD
  • TOKEN_LT
  • TOKEN_GT
  • TOKEN_LE
  • TOKEN_GE
  • TOKEN_NE
  • TOKEN_EQ
  • TOKEN_NOT
  • TOKEN_RP
  • TOKEN_AND
  • TOKEN_OR
  • TOKEN_EXOR
  • TOKEN_AND_IF
  • TOKEN_OR_IF
  • TOKEN_STR_EQ
  • TOKEN_STR_NE
  • TOKEN_STR_IN
  • TOKEN_STR_NI
  • TOKEN_STR_PARAM
  • TOKEN_STR_CMD
  • TOKEN_NO_WORD_SEP
  • TOKEN_EXPR
  • TOKEN_STMTS

For “normal” Tcl code the tokens from TOKEN_WORD_SEP to TOKEN_COMMENT are returned.

Tokens TOKEN_DECIMAL to TOKEN_STR_NI are returned for expression-like parts in if, while, and the expr command. The last few (TOKEN_STR_PARAM to TOKEN_STMTS) are used internally for partially parsed statements.

Examples:

String                    Token                  Value
$abc                      TOKEN_VAR              abc
${abc def}                TOKEN_BRACED_VAR
                          TOKEN_BRACE
                          TOKEN_STR              abc
${abc def}                TOKEN_BRACED_VAR
x(y)                      TOKEN_ARRAY_NAME       x0x01y
$x(y)                     TOKEN_VAR_ARRAY
${x}(y)                   TOKEN_VAR_ARRAY_NAME
[set a 1]                 TOKEN_CMD
{a y}                     TOKEN_BRACE            a y
“abc”                     TOKEN_QUOTED_STR       abc
“abc[x a]$y{d e f}yyy”    TOKEN_QUOTED_STR
                          TOKEN_STR              abc
                          TOKEN_CMD              x a
                          TOKEN_VAR              y
                          TOKEN_BRACE            d e f
                          TOKEN_STR              yyy
xyz                       TOKEN_STR              xyz
{*}                       TOKEN_EXPAND           “”

Some commands use a part of a statement as an expression that is evaluated and returns a value, such as true or false for the condition of if and while, or the result of the Tcl expr command. For these, the condition is parsed into an expression tree consisting of nodes (TclNode js Objects). When an expression string is tokenized, all parts are first put into TclNode js Objects, and these are placed in a tree with the operator as the parent node and the operands as the child nodes. A paren “(” is also a parent node. The nodes are first put into the expression tree in parsing order; afterward the tree is reorganized according to the precedence rules of the operators.
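One possible reading of “parsing order” is sketched here, as an assumption rather than the paper's actual code: every operator takes the tree built so far as its left child and the next operand as its right child, so the tree is left-nested regardless of precedence. The node shape, the helper names makeNode, buildParseOrderTree and show, and the reduced operator set are all hypothetical. The input is an already separated list of words.

```javascript
// Hypothetical node shape; the paper's TclNode js Object may look different.
function makeNode(token, value) {
  return { token, value, left: null, right: null };
}

// A small subset of the operators, mapped to their token names.
const OPERATORS = new Map([
  ["+", "TOKEN_PLUS"], ["-", "TOKEN_MINUS"], ["*", "TOKEN_MUL"],
  ["/", "TOKEN_DIV"], ["<", "TOKEN_LT"], ["==", "TOKEN_EQ"],
  ["&&", "TOKEN_AND_IF"], ["||", "TOKEN_OR_IF"],
]);

// Build the tree strictly in parsing order (assumption: left-nested).
function buildParseOrderTree(words) {
  let pos = 0;
  function operand() {
    const w = words[pos++];
    if (w === "(") {
      // A paren is a parent node; its single child is the inner expression.
      const paren = makeNode("TOKEN_PAREN", "(");
      paren.left = sequence();
      if (words[pos++] !== ")") throw new Error("missing )");
      return paren;
    }
    if (w === undefined || w === ")" || OPERATORS.has(w)) {
      throw new Error("operand expected");
    }
    return makeNode("TOKEN_STR", w);
  }
  function sequence() {
    let tree = operand();
    while (pos < words.length && words[pos] !== ")") {
      const op = words[pos++];
      if (!OPERATORS.has(op)) throw new Error("operator expected: " + op);
      const node = makeNode(OPERATORS.get(op), op);
      node.left = tree;          // everything parsed so far
      node.right = operand();    // the next operand
      tree = node;
    }
    return tree;
  }
  const tree = sequence();
  if (pos !== words.length) throw new Error("unexpected " + words[pos]);
  return tree;
}

// Print a tree in prefix form for inspection.
function show(n) {
  if (n.token === "TOKEN_STR") return n.value;
  if (n.token === "TOKEN_PAREN") return "(" + show(n.left) + ")";
  return "[" + n.value + " " + show(n.left) + " " + show(n.right) + "]";
}

console.log(show(buildParseOrderTree(["a", "+", "b", "*", "c"])));
// prints "[* [+ a b] c]": not yet ordered by precedence
```

In this form the multiplication sits above the addition, which is exactly the situation that the later reorganizing step has to repair.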

Operators

The operators are:

+ TOKEN_PLUS
- TOKEN_MINUS
* TOKEN_MUL
/ TOKEN_DIV
% TOKEN_MOD
< TOKEN_LT
> TOKEN_GT
<= TOKEN_LE
>= TOKEN_GE
!= TOKEN_NE
== TOKEN_EQ
! TOKEN_NOT
( TOKEN_PAREN pseudo operator used for precedence handling
) TOKEN_RP pseudo operator used for precedence handling
& TOKEN_AND
| TOKEN_OR
^ TOKEN_EXOR
&& TOKEN_AND_IF
|| TOKEN_OR_IF
eq TOKEN_STR_EQ
ne TOKEN_STR_NE
in TOKEN_STR_IN
ni TOKEN_STR_NI
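A minimal sketch of how an operator could be recognized at a given position in an expression string, assuming a longest-match rule so that “<=” is not read as “<” followed by “=”. The table is taken from the list above; the function name scanOperator and the word-boundary check for eq, ne, in and ni are assumptions, not the paper's code.

```javascript
// Operator text to token name, as listed in this chapter.
const OPERATOR_TOKENS = {
  "+": "TOKEN_PLUS", "-": "TOKEN_MINUS", "*": "TOKEN_MUL", "/": "TOKEN_DIV",
  "%": "TOKEN_MOD", "<": "TOKEN_LT", ">": "TOKEN_GT", "<=": "TOKEN_LE",
  ">=": "TOKEN_GE", "!=": "TOKEN_NE", "==": "TOKEN_EQ", "!": "TOKEN_NOT",
  "(": "TOKEN_PAREN", ")": "TOKEN_RP", "&": "TOKEN_AND", "|": "TOKEN_OR",
  "^": "TOKEN_EXOR", "&&": "TOKEN_AND_IF", "||": "TOKEN_OR_IF",
  "eq": "TOKEN_STR_EQ", "ne": "TOKEN_STR_NE", "in": "TOKEN_STR_IN",
  "ni": "TOKEN_STR_NI",
};

// Longest operators first, so two-character operators win over their prefixes.
const SYMBOLS = Object.keys(OPERATOR_TOKENS).sort((a, b) => b.length - a.length);

// Hypothetical scanner: returns { token, length } or null if no operator starts at pos.
function scanOperator(text, pos) {
  for (const op of SYMBOLS) {
    if (!text.startsWith(op, pos)) continue;
    const isWord = /^[a-z]+$/.test(op);
    const next = text[pos + op.length];
    // Assumption: a word operator must not run into a longer word such as "index".
    if (isWord && next !== undefined && /[A-Za-z0-9_]/.test(next)) continue;
    return { token: OPERATOR_TOKENS[op], length: op.length };
  }
  return null;
}

console.log(scanOperator("a <= b", 2)); // { token: 'TOKEN_LE', length: 2 }
```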

Precedence rules

The precedence rules are (as in C):

TOKEN_OR_IF 1
TOKEN_AND_IF 2
TOKEN_OR 3
TOKEN_EXOR 4
TOKEN_AND 5
TOKEN_EQ 6
TOKEN_NE 6
TOKEN_LT 7
TOKEN_GT 7
TOKEN_LE 7
TOKEN_GE 7
TOKEN_PLUS 9
TOKEN_MINUS 9
TOKEN_MUL 10
TOKEN_DIV 10
TOKEN_MOD 10
TOKEN_PAREN 12
TOKEN_STR 99

Reorganizing is done by flipping nodes that have a higher precedence:

  • if the precedence of the node is greater than the precedence of its left child node (child_left) and the node is not a TOKEN_PAREN, flip the nodes as follows:
  • set parent->child_left to the child_left of the node
  • set the parent of the node to child_left
  • set the child_left of the node to child_left->child_right
  • set child_left->child_right to the node
  • reorganize child_left
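The flip described above amounts to a right rotation of the node with its left child. The following is a minimal sketch under stated assumptions, not the paper's implementation: nodes carry no parent pointer, so the function returns the new root of the subtree instead of updating parent->child_left; children are reorganized before the node itself; and a demoted node is reorganized again because its left child has changed. The helpers leaf, op, paren and show are hypothetical. The precedence values are the ones from the table in this chapter.

```javascript
const PRECEDENCE = {
  TOKEN_OR_IF: 1, TOKEN_AND_IF: 2, TOKEN_OR: 3, TOKEN_EXOR: 4, TOKEN_AND: 5,
  TOKEN_EQ: 6, TOKEN_NE: 6, TOKEN_LT: 7, TOKEN_GT: 7, TOKEN_LE: 7, TOKEN_GE: 7,
  TOKEN_PLUS: 9, TOKEN_MINUS: 9, TOKEN_MUL: 10, TOKEN_DIV: 10, TOKEN_MOD: 10,
  TOKEN_PAREN: 12, TOKEN_STR: 99,
};

// Hypothetical node constructors for the example.
const leaf = (value) => ({ token: "TOKEN_STR", value, left: null, right: null });
const op = (token, left, right) => ({ token, value: null, left, right });
const paren = (inner) => ({ token: "TOKEN_PAREN", value: "(", left: inner, right: null });

// Returns the (possibly new) root of the reorganized subtree.
function reorganize(node) {
  if (node === null) return null;
  node.left = reorganize(node.left);
  node.right = reorganize(node.right);
  const left = node.left;
  if (left !== null && node.token !== "TOKEN_PAREN" &&
      PRECEDENCE[node.token] > PRECEDENCE[left.token]) {
    // Flip: the left child moves up, the node becomes its right child.
    node.left = left.right;
    left.right = reorganize(node);   // the node has a new left child, check it again
    return left;
  }
  return node;
}

// Print a tree in prefix form for inspection.
function show(n) {
  if (n.token === "TOKEN_STR") return n.value;
  if (n.token === "TOKEN_PAREN") return "(" + show(n.left) + ")";
  return "[" + n.token + " " + show(n.left) + " " + show(n.right) + "]";
}

// a + b * c in parsing order: the multiplication is on top.
const tree = op("TOKEN_MUL", op("TOKEN_PLUS", leaf("a"), leaf("b")), leaf("c"));
console.log(show(reorganize(tree)));
// prints "[TOKEN_PLUS a [TOKEN_MUL b c]]"
```

Because TOKEN_STR has precedence 99 and TOKEN_PAREN has 12, no operator is ever flipped below an operand or into a parenthesized group, which matches the role those two entries play in the precedence table.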

Some tokens are used only internally during parsing:

TOKEN_EOL        the separator for a Tcl statement: “\r”, “\n” or “;”
TOKEN_WORD_SEP   space or tab between words
TOKEN_ESC        signals the different parts of a word, as in yyy[a b]ccc: one part is yyy, one part is the TOKEN_CMD “a b”, and one part is ccc. TOKEN_ESC is returned in between to signal these parts
TOKEN_EOF        the end of the code to parse

Some tokens are only used within expression trees:

TOKEN_INTEGER    1 to n digits 0-9
TOKEN_DECIMAL    an integer with a leading unary “+” or “-”
TOKEN_REAL       a decimal with a “.” as the fraction separator, a fraction, and an exponent in e+/-nnn syntax
TOKEN_BOOLEAN    the Tcl values for a boolean: true/false/0/1 …
TOKEN_HEX        0x followed by 0-9A-Fa-f characters
TOKEN_OCTAL      1 to n characters 0-7
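The descriptions above overlap (for example, “1” fits both TOKEN_INTEGER and TOKEN_BOOLEAN, and “17” fits both TOKEN_INTEGER and TOKEN_OCTAL), so any classifier has to pick an order. The sketch below is one hypothetical choice and not taken from the paper: octal requires a leading 0, the exponent of a real is optional, only true and false are treated as boolean words, 0 and 1 are reported as integers, and anything else falls back to TOKEN_STR. The name classifyLiteral is an assumption.

```javascript
// Rules are tried in order; the first matching pattern wins.
const LITERAL_RULES = [
  ["TOKEN_HEX", /^0[xX][0-9A-Fa-f]+$/],
  ["TOKEN_BOOLEAN", /^(true|false)$/],
  ["TOKEN_REAL", /^[+-]?[0-9]+\.[0-9]+([eE][+-]?[0-9]+)?$/],
  ["TOKEN_OCTAL", /^0[0-7]+$/],           // assumption: leading 0 marks octal
  ["TOKEN_INTEGER", /^[0-9]+$/],
  ["TOKEN_DECIMAL", /^[+-][0-9]+$/],      // integer with a leading sign
];

// Hypothetical classifier for one operand of an expression.
function classifyLiteral(text) {
  for (const [token, pattern] of LITERAL_RULES) {
    if (pattern.test(text)) return token;
  }
  return "TOKEN_STR";                     // assumption: fallback for non-literals
}

for (const s of ["42", "-42", "3.14", "1.5e+10", "0x1F", "017", "true", "abc"]) {
  console.log(s, classifyLiteral(s));
}
```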

(Part of itcl in Javascript Paper)