Version 0 of The anatomy of a bytecoded command

Updated 2001-05-16 21:56:44

PURPOSE: Provide a working example for Tcl maintainers attempting to add bytecode compilation of a new Tcl command.


Background: Kevin Kenny has been working on a new [lset] command that makes surgical modifications to Tcl lists. The target is to get decent performance of operations such as Shuffle a list without needing to resort to mysterious usage such as the [K] combinator.

The proposed [lset] command takes the syntax:

    lset listVar index ?index...? value

In its simplest form,

    lset myList $i $x

it says, "replace the $i'th element of variable myList with the content of variable x.

Thus, if myList contains:

    a b c d

after executing

    lset myList 1 e

the variable will contain

    a e c d

Additional index parameters index into sublists. For example, if myList contains:

    {a b c} {d e f} {g h i}

and we execute

    lset myList 1 1 j

then after the operation, myList will contain

    {a b c} {d j f} {g h i}

It is an error to address outside any list: each index argument must be at least zero and less than the length of the corresponding list.


The first cut at [lset] is to implement it as an object command. The general flow is:

  • A. Look up the variable. If it designates a shared object, duplicate the object. Put the object pointer into two temp storage locations, finalValuePtr and listPtr.
  • B. For each index argument except for the last, locate the corresponding element of the Tcl object designated by listPtr. If it is a shared object, duplicate it and replace the element with the new copy. Set listPtr to designate the element.
  • C. Locate the element in listPtr that the last index argument designates. Replace it with newValue.
  • D. Finally, replace the value of the variable with finalValuePtr from step A.

The C code for the object command is:


 int
 Tcl_LsetObjCmd( clientData, interp, objc, objv )
    ClientData clientData;        /* Not used. */
    Tcl_Interp *interp;                /* Current interpreter. */
    int objc;                        /* Number of arguments. */
    Tcl_Obj *CONST objv[];        /* Argument values. */
 {

    Tcl_Obj* listPtr;                /* Pointer to the list being altered. */
    Tcl_Obj* subListPtr;        /* Pointer to a sublist of the list */
    Tcl_Obj* finalValuePtr;        /* Value finally assigned to the variable */
    int index;                        /* Index of the element being replaced */
    int result;                        /* Result to return from this function */
    int listLen;                /* Length of a list being examined */
    Tcl_Obj** elemPtrs;                /* Pointers to the elements of a
  • list being examined */
    int i;

    /* Check parameter count */

    if ( objc < 4 ) {
        Tcl_WrongNumArgs( interp, 1, objv, "listVar index ?index...? value" );
        return TCL_ERROR;
    }

    /* Look up the list variable */

    listPtr = Tcl_ObjGetVar2( interp, objv[ 1 ], (Tcl_Obj*) NULL,
                              TCL_LEAVE_ERR_MSG );
    if ( listPtr == NULL ) {
        return TCL_ERROR;
    }

    /* Make sure that the list value is unshared. */

    if ( Tcl_IsShared( listPtr ) ) {
        listPtr = Tcl_DuplicateObj( listPtr );
    }

    finalValuePtr = listPtr;

    /*
    • If there are multiple 'index' args, handle each arg except the
    • last by diving into a sublist.
     */

    for ( i = 2; i < objc-2; ++i ) {

        /*
        ** Take apart the list
         */

        result = Tcl_ListObjGetElements( interp, listPtr,
                                         &listLen, &elemPtrs );
        if ( result != TCL_OK ) {
            return result;
        }

        /*
        ** Derive the index of the requested sublist
         */

        result = TclGetIntForIndex( interp, objv[i], (listLen - 1), &index );
        if ( result != TCL_OK ) {
            return result;
        }

        if ( ( index < 0 ) || ( index >= listLen ) ) {

            Tcl_SetObjResult( interp,
                              Tcl_NewStringObj( "list index out of range",
                                                -1 ) );
            return TCL_ERROR;
        }

        /*
        ** Extract the appropriate sublist, and make sure that it
        ** is unshared.
         */

        subListPtr = elemPtrs[ index ];
        if ( Tcl_IsShared( subListPtr ) ) {
            subListPtr = Tcl_DuplicateObj( subListPtr );
            result = Tcl_ListObjSetElement( interp, listPtr, index,
                                            subListPtr );
            if ( result != TCL_OK ) {
                return TCL_ERROR;
            }
        } else {
            Tcl_InvalidateStringRep( listPtr );
        }

        listPtr = subListPtr;
    }

    /* Break apart the innermost sublist */

    result = Tcl_ListObjGetElements( interp, listPtr,
                                     &listLen, &elemPtrs );
    if ( result != TCL_OK ) {
        return result;
    }

    /* Convert the last index */

    result = TclGetIntForIndex( interp, objv[objc-2], (listLen - 1), &index );
    if ( result != TCL_OK ) {
        return result;
    }
    if ( ( index < 0 ) || ( index >= listLen ) ) {
        Tcl_SetObjResult( interp,
                          Tcl_NewStringObj( "list index out of range", -1 ) );
        return TCL_ERROR;
    }

    /* Store the result in the list element */

    result = Tcl_ListObjSetElement( interp, listPtr, index, objv[objc-1] );
    if ( result != TCL_OK ) {
        return result;
    }

    /* Finally, update the variable so that traces fire. */

    listPtr = Tcl_ObjSetVar2( interp, objv[1], NULL, finalValuePtr,
                              TCL_LEAVE_ERR_MSG );
    if ( listPtr == NULL ) {
        return TCL_ERROR;
    }

    return result;

 }

This code is much cheaper (30-50% reduction in CPU time on the Shuffle a list trials) than the corresponding logic using the [K] combinator, but we can do much better still using the bytecode compiler. The chief reason is that the calls to Tcl_ObjGetVar2 and Tcl_ObjSetVar2 are fairly expensive lookups in a hash table, while local variables in the stack frame can be compiled to lookups that operate in constant time. Let's sketch out what the bytecode will have to look like for

    lset varName ind1 ind2 value

To begin with, all of the parameters will have to go on the stack, except possibly for the variable. The variable name may or may not go on the stack, depending on whether it uses a frame slot.

In order for traces to fire in order, the parameters must be substituted from left to right. Thus far, then, the compilation procedure for lset has to:

  • call TclPushVarName to put the variable name on the stack if needed.
  • call TclCompileTokens for each of the index arguments in turn, so that they are pushed on the stack from left to right.
  • call TclCompileTokens for the value argument, too.
  • Now comes one tricky part. We have to get the value of the variable. We need a new bytecode command, like the index word in Forth, to pull the variable name that we pushed up above. Then we need to emit the appropriate load instruction, which is one of INST_LOAD_SCALAR1, INST_LOAD_SCALAR4, INST_LOAD_SCALAR_STK, INST_LOAD_ARRAY1, INST_LOAD_ARRAY4, INST_LOAD_ARRAY_STK, or INST_LOAD_STK. (Whew!)
  • Now we're able to emit a bytecode instruction for the lset operation itself. This instruction will have a parameter that is the number of index arguments, so that it can work with them off the stack. It will begin by popping the variable value off the stack (reducing its ref count to 1 again if the value is shared), and duplicating the value if necessary. The rest of the code will more or less follow the object command implementation. The stack after this instruction will contain the value that is to be put into varName, and the variable name if the first step pushed it.
  • Finally, we store the result back into the variable by emitting the appropriate store instruction, which is one of INST_STORE_*, where * is SCALAR1, SCALAR4, SCALAR_STK, ARRAY1, ARRAY4, ARRAY_STK or STK.

So it appears that we need a new compilation procedure for the lset command, plus two new bytecode instructions: INST_STKINDEX and INST_LSET.


Jeffrey Hobbs:

How to add a byte-compiled command:

Most examples are found in tclCompCmds.c, which is the best place to start for examples. Files that must be edited to add a byte-compiled command are (this applies for core commands):

  • tclCompile.h: add any instructions you need
  • tclCompile.c: add the description of the instructions (for compiler debugging)
  • tclCompCmds.c: add your command that will byte-compile
  • tclBasic.c: in the list of commands, make sure your byte-compiled command is used
  • tclInt.h: add the prototype of your command for byte-compiling
  • tclExecute.c: add the code for any new instructions.

At this time, there is no way to add a byte-compiled command outside of the core.