itcl in Javascript Paper Chapter 3

Design Goals for Implementation of Itcl in Javascript

The internal types of a TclObject js Object (the counterpart of a Tcl_Obj in the C implementation) are:

OBJECT_TYPE_TEXT
OBJECT_TYPE_LIST
OBJECT_TYPE_INTEGER
OBJECT_TYPE_REAL
OBJECT_TYPE_BOOL
OBJECT_TYPE_DICT
OBJECT_TYPE_STMTS
OBJECT_TYPE_STMT
OBJECT_TYPE_WORD
OBJECT_TYPE_WORD_PART
OBJECT_TYPE_EXPR_TREE

The parsing rules for Tcl script input follow the Dodekalogue.

Tcl input is parsed only partially, mostly for performance and, to some extent, to make the later execution of a statement easier to handle. Partial parsing works as follows: all tokenizing for Tcl is done, but no variables are expanded, no bracket commands are executed, and braced parts are handled as one token. Within quoted strings, the parts that have to be expanded later are parsed into separate “word_part” TclObject js Objects. The same is done for array names and array references.

Tokens

returned from parsing are:

TOKEN_WORD_SEP
TOKEN_STR
TOKEN_EOL
TOKEN_EOF
TOKEN_ESC
TOKEN_CMD
TOKEN_VAR
TOKEN_EXPAND
TOKEN_PAREN
TOKEN_BRACE
TOKEN_VAR_ARRAY
TOKEN_VAR_ARRAY_NAME
TOKEN_ARRAY_NAME
TOKEN_VAR_COMPOSED
TOKEN_BRACED_VAR
TOKEN_QUOTED_STR
TOKEN_COMMENT
TOKEN_DECIMAL
TOKEN_INTEGER
TOKEN_REAL
TOKEN_BOOLEAN
TOKEN_HEX
TOKEN_OCTAL
TOKEN_MINUS
TOKEN_PLUS
TOKEN_MUL
TOKEN_DIV
TOKEN_MOD
TOKEN_LT
TOKEN_GT
TOKEN_LE
TOKEN_GE
TOKEN_NE
TOKEN_EQ
TOKEN_NOT
TOKEN_RP
TOKEN_AND
TOKEN_OR
TOKEN_EXOR
TOKEN_AND_IF
TOKEN_OR_IF
TOKEN_STR_EQ
TOKEN_STR_NE
TOKEN_STR_IN
TOKEN_STR_NI
TOKEN_STR_PARAM
TOKEN_STR_CMD
TOKEN_NO_WORD_SEP
TOKEN_EXPR
TOKEN_STMTS

For “normal” Tcl code the tokens from TOKEN_WORD_SEP to TOKEN_COMMENT are returned.

Tokens TOKEN_DECIMAL to TOKEN_STR_NI are returned for expression-like parts in if, while, and the expr command. The remaining tokens (TOKEN_STR_PARAM to TOKEN_STMTS) are used internally for partially parsed statements.
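The three token groups described above can be illustrated with a small lookup. This is only a sketch: the array index used as a numeric token id and the helper name tokenGroup are assumptions for illustration, and the real implementation may number its tokens differently.

```javascript
// Token names in the order listed above. Using the array index as a
// numeric token id is an assumption of this sketch.
const TOKEN_NAMES = [
  "TOKEN_WORD_SEP", "TOKEN_STR", "TOKEN_EOL", "TOKEN_EOF", "TOKEN_ESC",
  "TOKEN_CMD", "TOKEN_VAR", "TOKEN_EXPAND", "TOKEN_PAREN", "TOKEN_BRACE",
  "TOKEN_VAR_ARRAY", "TOKEN_VAR_ARRAY_NAME", "TOKEN_ARRAY_NAME",
  "TOKEN_VAR_COMPOSED", "TOKEN_BRACED_VAR", "TOKEN_QUOTED_STR", "TOKEN_COMMENT",
  "TOKEN_DECIMAL", "TOKEN_INTEGER", "TOKEN_REAL", "TOKEN_BOOLEAN", "TOKEN_HEX",
  "TOKEN_OCTAL", "TOKEN_MINUS", "TOKEN_PLUS", "TOKEN_MUL", "TOKEN_DIV",
  "TOKEN_MOD", "TOKEN_LT", "TOKEN_GT", "TOKEN_LE", "TOKEN_GE", "TOKEN_NE",
  "TOKEN_EQ", "TOKEN_NOT", "TOKEN_RP", "TOKEN_AND", "TOKEN_OR", "TOKEN_EXOR",
  "TOKEN_AND_IF", "TOKEN_OR_IF", "TOKEN_STR_EQ", "TOKEN_STR_NE", "TOKEN_STR_IN",
  "TOKEN_STR_NI", "TOKEN_STR_PARAM", "TOKEN_STR_CMD", "TOKEN_NO_WORD_SEP",
  "TOKEN_EXPR", "TOKEN_STMTS",
];

// Hypothetical helper: reports which of the three groups a token falls into,
// based purely on its position in the list.
function tokenGroup(name) {
  const id = TOKEN_NAMES.indexOf(name);
  if (id < 0) throw new Error("unknown token: " + name);
  if (id <= TOKEN_NAMES.indexOf("TOKEN_COMMENT")) return "normal";
  if (id <= TOKEN_NAMES.indexOf("TOKEN_STR_NI")) return "expression";
  return "internal";
}
```

Note that TOKEN_PAREN falls into the “normal” range by position although it also appears as a pseudo operator in expressions.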

Examples:

String	Token	Value
$abc	TOKEN_VAR	abc
${abc def}	TOKEN_BRACED_VAR
	TOKEN_BRACE
	TOKEN_STR	abc
${abc def}	TOKEN_BRACED_VAR
x(y)	TOKEN_ARRAY_NAME	x0x01y
$x(y)	TOKEN_VAR_ARRAY
${x}(y)	TOKEN_VAR_ARRAY_NAME
[set a 1]	TOKEN_CMD
{a y}	TOKEN_BRACE	a y
"abc"	TOKEN_QUOTED_STR	abc
"abc[x a]$y{d e f}yyy"	TOKEN_QUOTED_STR
	TOKEN_STR	abc
	TOKEN_CMD	x a
	TOKEN_VAR	y
	TOKEN_BRACE	d e f
	TOKEN_STR	yyy
xyz	TOKEN_STR	xyz
{*}	TOKEN_EXPAND	“”
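How a quoted string might be split into parts, as in the quoted-string example of the table above, can be sketched as follows. The function name splitQuotedParts and the shape of the returned objects are assumptions for illustration; backslash escapes, ${name} and array references are left out on purpose, so this is not the parser of the implementation.

```javascript
// Sketch only: splits the inside of a quoted string into word parts
// without expanding anything.
function splitQuotedParts(text) {
  const parts = [];
  let literal = "";
  // Emit the literal text collected so far as one TOKEN_STR part.
  const flush = () => {
    if (literal !== "") parts.push({ token: "TOKEN_STR", value: literal });
    literal = "";
  };
  // Returns the index of the delimiter that closes the one at `start`,
  // counting nested pairs of the same kind.
  const findClose = (start, open, close) => {
    let depth = 0;
    for (let i = start; i < text.length; i++) {
      if (text[i] === open) depth++;
      else if (text[i] === close && --depth === 0) return i;
    }
    throw new Error("missing " + close);
  };
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    if (ch === "[" || ch === "{") {
      const end = findClose(i, ch, ch === "[" ? "]" : "}");
      flush();
      parts.push({
        token: ch === "[" ? "TOKEN_CMD" : "TOKEN_BRACE",
        value: text.slice(i + 1, end), // kept unevaluated
      });
      i = end + 1;
    } else if (ch === "$" && /\w/.test(text[i + 1] || "")) {
      const name = /^\w+/.exec(text.slice(i + 1))[0];
      flush();
      parts.push({ token: "TOKEN_VAR", value: name }); // name only, no lookup
      i += 1 + name.length;
    } else {
      literal += ch;
      i++;
    }
  }
  flush();
  return parts;
}

console.log(splitQuotedParts("abc[x a]$y{d e f}yyy"));
```

For the input shown, the parts come out in the same order as the sub-rows of the table: TOKEN_STR, TOKEN_CMD, TOKEN_VAR, TOKEN_BRACE, TOKEN_STR.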

Some commands use part of a statement as an expression to be evaluated, for example the condition of if and while, which yields true or false, or the argument of the Tcl expr command. For these, the condition is parsed into an expression tree consisting of nodes (TclNode js Objects). When an expression string is tokenized, all parts are first put into TclNode js Objects, and these are placed in a tree with the operator as the parent node and the operands as the child nodes. A paren “(” is also a parent node. The nodes are first put into the expression tree in parsing order, and afterward the tree is reorganized according to the precedence rules of the operators.

Operators

are:

+	TOKEN_PLUS
-	TOKEN_MINUS
*	TOKEN_MUL
/	TOKEN_DIV
%	TOKEN_MOD
<	TOKEN_LT
>	TOKEN_GT
<=	TOKEN_LE
>=	TOKEN_GE
!=	TOKEN_NE
==	TOKEN_EQ
!	TOKEN_NOT
(	TOKEN_PAREN	pseudo operator used for precedence handling
)	TOKEN_RP	pseudo operator used for precedence handling
&	TOKEN_AND
|	TOKEN_OR
^	TOKEN_EXOR
&&	TOKEN_AND_IF
||	TOKEN_OR_IF
eq	TOKEN_STR_EQ
ne	TOKEN_STR_NE
in	TOKEN_STR_IN
ni	TOKEN_STR_NI

Precedence rules

are (as in C):

TOKEN_OR_IF	1
TOKEN_AND_IF	2
TOKEN_OR	3
TOKEN_EXOR	4
TOKEN_AND	5
TOKEN_EQ	6
TOKEN_NE	6
TOKEN_LT	7
TOKEN_GT	7
TOKEN_LE	7
TOKEN_GE	7
TOKEN_PLUS	9
TOKEN_MINUS	9
TOKEN_MUL	10
TOKEN_DIV	10
TOKEN_MOD	10
TOKEN_PAREN	12
TOKEN_STR	99

Reorganizing is done by flipping nodes that have a higher precedence than their left child node:

if the precedence of the node is greater than the precedence of its left child node and the node is not a TOKEN_PAREN, flip the nodes:
set parent->child_left to the child_left of the node
set the parent of the node to child_left
set the child_left of the node to child_left->child_right
set child_left->child_right to the node
reorganize child_left
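The flip can be sketched as a rotation on a small node class. Several things here are assumptions and not taken from the implementation: the TclNode constructor signature, the helpers appendOperator and show, applying the check each time an operator is appended in parsing order, and reading the last step as repeating the check for the node against its new left child. Tokens without an entry in the precedence table are treated like operands (99).

```javascript
// Precedence values copied from the table above.
const PRECEDENCE = {
  TOKEN_OR_IF: 1, TOKEN_AND_IF: 2, TOKEN_OR: 3, TOKEN_EXOR: 4, TOKEN_AND: 5,
  TOKEN_EQ: 6, TOKEN_NE: 6, TOKEN_LT: 7, TOKEN_GT: 7, TOKEN_LE: 7, TOKEN_GE: 7,
  TOKEN_PLUS: 9, TOKEN_MINUS: 9, TOKEN_MUL: 10, TOKEN_DIV: 10, TOKEN_MOD: 10,
  TOKEN_PAREN: 12, TOKEN_STR: 99,
};
function prec(node) {
  const p = PRECEDENCE[node.token];
  return p === undefined ? 99 : p; // assumption: unlisted tokens act as operands
}

// Hypothetical node shape; the field names follow the steps listed above.
class TclNode {
  constructor(token, value, left = null, right = null) {
    this.token = token;
    this.value = value;
    this.parent = null;
    this.child_left = left;
    this.child_right = right;
    if (left) left.parent = this;
    if (right) right.parent = this;
  }
}

// Flips `node` below its left child as long as it binds tighter than that
// child. Returns the node at the top of the subtree afterwards.
function reorganize(node) {
  let top = node;
  for (;;) {
    const left = node.child_left;
    if (!left || node.token === "TOKEN_PAREN" || prec(node) <= prec(left)) break;
    const parent = node.parent;
    if (parent) {
      // Hook child_left into the slot the node occupied under its parent.
      if (parent.child_left === node) parent.child_left = left;
      else parent.child_right = left;
    }
    left.parent = parent;
    node.parent = left;
    node.child_left = left.child_right;
    if (node.child_left) node.child_left.parent = node;
    left.child_right = node;
    if (top === node) top = left;
  }
  return top;
}

// Parsing order: the new operator takes the tree built so far as its left
// child and the next operand as its right child, then gets reorganized.
function appendOperator(root, token, symbol, operand) {
  return reorganize(new TclNode(token, symbol, root, operand));
}

// Renders the tree fully parenthesized, for inspection only.
function show(node) {
  if (!node.child_left && !node.child_right) return node.value;
  return "(" + show(node.child_left) + " " + node.value + " " +
    show(node.child_right) + ")";
}

const str = (v) => new TclNode("TOKEN_STR", v);
let root = str("a");
root = appendOperator(root, "TOKEN_OR_IF", "||", str("b"));
root = appendOperator(root, "TOKEN_PLUS", "+", str("c"));
root = appendOperator(root, "TOKEN_MUL", "*", str("d"));
console.log(show(root)); // prints "(a || (b + (c * d)))"
```

Operators of equal precedence are not flipped, so a chain such as a - b - c keeps its left-to-right grouping.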

Some tokens are used only internally during parsing:

TOKEN_EOL	the separator for a Tcl statement either “\r”, “\n” or a “;”
TOKEN_WORD_SEP	space or tab between words
TOKEN_ESC	signals the different parts of a word, as in yyy[a b]ccc:
	one part is yyy, one part is the TOKEN_CMD “a b” and one
	part is ccc. In between, TOKEN_ESC is returned to signal
	these parts
TOKEN_EOF	at the end of the code to parse

Some tokens are only used within expression trees:

TOKEN_INTEGER	1 to n digits 0-9
TOKEN_DECIMAL	an integer with a leading unary “+” or “-”
TOKEN_REAL	a decimal with a “.” as the fraction separator, a fraction and an
	exponent with e+/-nnn syntax
TOKEN_BOOLEAN	the Tcl values for a boolean, true/false/0/1 …
TOKEN_HEX	0x followed by 0-9A-Fa-f characters
TOKEN_OCTAL	1 to n digits 0-7
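The literal classes in the table above overlap (for example, 17 matches both the integer and the octal description), so a classifier has to pick a checking order. The following sketch shows one possible order. The function name classifyLiteral, the leading 0 as octal marker, the optional exponent for reals, and recognizing only true and false as booleans are all assumptions of this sketch, not rules stated by the implementation.

```javascript
// Sketch only: maps a literal to one of the expression-tree tokens.
function classifyLiteral(text) {
  if (/^0x[0-9A-Fa-f]+$/.test(text)) return "TOKEN_HEX";
  // Assumption: a leading 0 marks an octal literal.
  if (/^0[0-7]+$/.test(text)) return "TOKEN_OCTAL";
  if (/^[0-9]+$/.test(text)) return "TOKEN_INTEGER";
  // An integer with a leading unary sign.
  if (/^[+-][0-9]+$/.test(text)) return "TOKEN_DECIMAL";
  // Assumption: the exponent part is optional.
  if (/^[+-]?[0-9]+\.[0-9]+(e[+-]?[0-9]+)?$/.test(text)) return "TOKEN_REAL";
  // Assumption: 0 and 1 are already classified as integers above, and the
  // further boolean spellings elided in the table are not handled.
  if (text === "true" || text === "false") return "TOKEN_BOOLEAN";
  return "TOKEN_STR";
}

for (const s of ["0x1F", "017", "42", "-42", "3.14e+10", "true", "abc"]) {
  console.log(s, classifyLiteral(s));
}
```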

(Part of itcl in Javascript Paper)
