Version 14 of A Tcl Compiler Idea

Updated 2003-10-21 20:48:17

GPS Oct 19, 2003 - I've been thinking about how to build a Tcl compiler that allows for efficiently storing the compiled structure to the disk and also provides fast execution. I think this might work.

Rather than using a parse-tree a flat array of Tcl_Objs is used with an ordered list for execution. A LIFO stack is used to find the proper order of execution at compile time.

I propose the functions Tcl_Compile() and Tcl_Run() for dealing with these data structures.

Consider the command:

 set v [foo bar [set bar 20] [baz $bar]]

In that example we first create an array of Tcl_Objs. The first element will be for a result of set v ...

 [0 RESULT] [1 set]

We push the index of the set command, which in this case is 1.

 LIFO stack[] = 1

We continue down the line adding objects for strings until we reach another branch.

 [0 RESULT] [1 set] [2 v] [3 RESULT] 

Now we are at a branch, where we have:

 [foo bar [set bar 20] [baz $bar]]

We append another object for the foo command and push the index of the object, which in this case is 4.

 [0 RESULT] [1 set] [2 v] [3 RESULT] [4 foo] [5 bar] 
 LIFO stack[] = 1 4

Now we are at another branch for:

  [set bar 20] [baz $bar]

We append a RESULT object and push an index for the set object.

 [0 RESULT] [1 set] [2 v] [3 RESULT] [4 foo] [5 bar] [6 RESULT] [7 set] [8 bar] [9 20]
 LIFO stack[] = 1 4 7

9 was the end of a command ( ] ), so we pop one item off of the LIFO into our order list:

 order list = 7

Now we just have the [baz $bar] to finish in a similar manner. We push the index of [baz $bar]'s baz onto the LIFO and we end up with:

 LIFO stack[] = 1 4 11

 [0 RESULT] [1 set] [2 v] [3 RESULT] [4 foo] [5 bar] [6 RESULT] [7 set] [8 bar] [9 20] [10 RESULT] [11 baz] [12 $bar] [13 END OF ALL]

The end of the command [baz $bar] is reached so we pop another item off the stack into our order list.

 order list = 7 11 

The end of the [foo ...] command is reached so we pop another and get:

 order list = 7 11 4 

and so forth until the command is complete

 our order is: 7 11 4 1

Compare that order to these and you will see that it's the proper result:

 [0 RESULT] [1 set] [2 v] [3 RESULT] [4 foo] [5 bar] [6 RESULT] [7 set] [8 bar] [9 20] [10 RESULT] [11 baz] [12 $bar] [13 END OF ALL]

One aspect which we haven't discussed is result transfer. Each command copies its result to the object behind its beginning. This makes it so that each command result works properly.

This approach allows us to write the Tcl_Obj structures to the disk more easily, and unlike a parse-tree it doesn't require many little mallocs, frees and evaluation at each branch.

Based on my somewhat naive assesment of the Tcl sources this design would require a new parser, and perhaps some changes to the Tcl_Obj structure or a new Tcl_ObjType. I also propose a preprocessor that would tranlate $foo to [set foo].

To improve performance each head of a command would also have an offset into an array that stores a function pointer. An alternative would be to use an epoch and do hash lookup, but to me that is overkill. The disadvantage of using an array offset is that to add new commands we would need to either increase the array's size and add more pointers, or replace existing offsets with new pointers and risk having some old commands call the new inappropriately.

What do you think?


DKF: I don't yet see my way through this idea far enough. OK, it's Monday morning and I'm not at my sharpest. If the fundamental execution model is stack-based bytecode (and that's an easy target to generate from a compiler), could we have a bit of an illustration of a few opcodes and their effect on the stack? And then perhaps a sample conversion of a Tcl command fragment into those bytecodes? That would help my wooly-thinking very much. Thanks! (And if you've already done some of that above, please forgive me. I'm just having problems seeing where this is all going. Once I 'get' it, I'll probably be able to offer much more useful comments...)

RS would like to know how this differs from the current bytecode compiler, and what advantages it would bring.

GPS: The idea is that rather than using a BCC we store the Tcl_Objs in a format that is more easily executed. To execute a command all we have to do is make pointers to the Tcl_Objs in the array, do lookup on the first object for the function pointer (or cache it), and call the function. Basically there is no stack used at runtime (only at Tcl_Compile() time to fetch the order list). I'll work on some examples.

To me the BCC is too complicated. MS and DGP seem to fix a BCC bug a week. It's kind of ridiculous. MS once called Tcl_ExecuteByteCode Medusa, because of the way it snakes around with goto and so on.

MS: I'm not sure I understand correctly, but (modulo possible storage details) I imagine this being the current bcc engine without bcc'ed commands - ie, where INST_INVOKE is the main player. I have thought that that would be a desirable approach for a while. Note that the function pointers are currently cached in the command name Tcl_Objs.

The example above would produce a bytecode stream in the flavor of:

   [INST_PUSH "set"] [INST_PUSH "v"] [INST_PUSH "foo"] [INST_PUSH "bar"] 
   [INST_PUSH "set"] [INST_PUSH "bar"] [INST_PUSH "20"] [INST_INVOKE 3] 
   [INST_PUSH "baz"] [INST_LOAD_SCALAR "bar"] [INST_INVOKE 2]
   [INST_INVOKE 4] [INST_INVOKE 3]

where [INST_INVOKE x] uses the top x stack items to build a command, and pushes the command's result on the stack.

The problems I wouldn't know how to solve right away are performance related - and the reason behind the complexity of TEBC. TEBC mainly substitutes jumps for function calls, and indexed access to memory for local variable lookups by name. Some of the work required for caching variable pointers (similar to command name Tcl_Objs) has been started, but I did not manage to bring it to good port yet. Initial timings show a 10-20% core slowdown when using this approach. Replacing the expression operators with function calls (which I had not done) would produce further slowdowns.

Now a confession RE BCC-bug-of-the-week and goto-hell: bugs in TEBC have occurred mainly in the couple of days after *I* made some major changes. More a problem with the designer than with the design ... I have been guilty of increasing the goto-fest in TEBC. It is relatively well structured, I'd say akin to structured programming in assembler. Reducing code duplication in TEBC is the main cause of most gotos. That factorisation of the code seemed a good thing for me - still does.

A better approach could be to have special code to generate TEBC - similar to vmgen [L1 ], coded in tcl. Another thing I started and kept for another day ...