TclBlend Problems

TclBlend is used either to embed the C implementation of Tcl into a running Java process, or to embed a JVM into a running C Tcl process. The document Using Tcl in Java at http://www-cs-students.stanford.edu/~jwu/ explains why and how to embed Tcl into a Java program. [It seems to me that the Using Tcl in Java document is no longer mentioned on the page in question. I wonder whether http://docs.rinet.ru/JaTricks/ch29.htm is the document being referenced.]

This document summarizes the current open issues in using TclBlend. (The reader should note that this page describes very old issues that have been fixed in recent releases of Tcl Blend. Recent means approximately 1 Aug 2006.) From this point on, it is assumed that the reader is familiar with:

  • embedding Tcl in Java or embedding Java in Tcl
  • TclBlend/Jacl's "java::***" command suite
  • Tcl_Obj object model, e.g. reference counting
  • Java threading model
  • Java memory model, e.g. garbage collection, object allocation

The problems described here apply only to TclBlend. They have not been observed in Jacl.

  1. intermittent "invalid command name java0x1" errors: Tcl is trying to access Java objects that are no longer available.
  2. Java GC thread deadlocks when trying to free Tcl_Obj
  3. If one removes the ability to free Tcl_Obj from the GC thread, then there are memory leaks on Tcl_Obj in TclBlend
  4. No multi-threaded support.
  5. Java static method vs thread-safe Tcl C functions

The five problems are described in detail later in this document.

I work around the above problems as follows:

  1. Use a single top-level Tcl interpreter and the Tcl event loop as described in http://www-cs-students.stanford.edu/~jwu to embed Tcl into Java.
  2. Don't use any of the "java::*" commands in a production environment. I still use these for ad hoc commands for testing in an interactive Tcl shell.
  3. Use the "Command" interface to register Java methods as Tcl commands for Tcl scripts to call Java functions.
  4. Use preserve/release religiously when dealing with TclObject.
  5. Patch TclBlend to use preserve/release rather than relying on implicit GC to free Tcl_Obj.
  6. When passing Java objects into Tcl, use a global Java hash table to explicitly create and remove persistent string references to Java objects.
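
As a rough illustration of workaround 6, the following Java sketch keeps a global table that maps string handles to Java objects. It is not TclBlend code: the class name HandleRegistry, its methods, and the "obj" handle prefix are invented for this example. Registered objects stay reachable until the handle is removed explicitly, so a Tcl script can carry the handle around as an ordinary string.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of TclBlend: a process-wide table that maps
// string handles to Java objects.
final class HandleRegistry {
    private static final ConcurrentHashMap<String, Object> TABLE = new ConcurrentHashMap<>();
    private static final AtomicLong NEXT_ID = new AtomicLong();

    private HandleRegistry() {}

    // Store a non-null object and return a fresh handle. The table holds a
    // strong reference, so the object cannot be garbage collected until
    // remove() is called with the same handle.
    static String register(Object value) {
        String handle = "obj" + NEXT_ID.incrementAndGet();
        TABLE.put(handle, value);
        return handle;
    }

    // Resolve a handle; returns null if it is unknown or was removed.
    static Object lookup(String handle) {
        return TABLE.get(handle);
    }

    // Explicitly drop the persistent reference; returns the object or null.
    static Object remove(String handle) {
        return TABLE.remove(handle);
    }
}

class HandleRegistryDemo {
    public static void main(String[] args) {
        String handle = HandleRegistry.register(new StringBuilder("foo"));
        System.out.println(handle + " -> " + HandleRegistry.lookup(handle));
        HandleRegistry.remove(handle);
        System.out.println(handle + " -> " + HandleRegistry.lookup(handle));
    }
}
```

A Java method registered through the "Command" interface could then accept such a handle as a plain string argument and call lookup on it, instead of depending on a java0x1-style reference staying alive.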

Problem 1: "invalid command name java0x1"

TclBlend/Jacl provides the ability to directly reference Java objects from inside a Tcl script using the "java::***" command suite. When using this mechanism with general Tcl scripts, you may intermittently encounter "invalid command name java0x1" error messages. The error occurs when a Tcl script tries to access a Java object that is no longer available.

This problem is caused by the combination of the following three factors:

  A. Tcl interpreter/byte compiler can internally choose to use the
     string only representation of a Tcl_Obj during the execution of
     a Tcl script.
  B. Java objects are exposed into Tcl using a Tcl_Obj with an
     artificial string name as the reference.
  C. Java objects are auto-free'ed when no Java reference is pointing
     to them.

The following script demonstrates the above three factors:

    set x [java::new String foo]   ;# line 1
    after 1000 "$x toString"       ;# line 2
    unset x                        ;# line 3
    update                         ;# line 4

line 1: a new Java object is created. This object is wrapped inside a
        Tcl_Obj with a reference count of 1.  The Tcl_Obj is assigned
        to a temporary variable called "x".

line 2: a Tcl script is created using the "string reference" of
        the Java object.  The script uses the string only
        representation of the Java object, not the original Tcl_Obj
        itself.  As a result, the original Tcl_Obj still has a
        reference count of 1.

line 3: The temporary variable "x" is cleared. As a result, the
        Tcl_Obj's reference count is decremented.  This results in the
        Tcl_Obj being free'ed and the Java object also being
        free'ed.

line 4: The script built from "$x toString" is executed later in the
        timer handler.  The script attempts to find the Java object
        using the "string reference" of the object.  That name no
        longer resolves to any known Java object.  The script produces
        an "invalid command name java0x1" error.

In the above example, we explicitly chose to use a string representation of the Java object in the callback script. Production software may contain many lines of Tcl code. Even if we do not explicitly use the string representation of the Java object, the rest of the Tcl script, or the Tcl interpreter/byte compiler, may convert the original Tcl_Obj into this string only representation. For example, even if one calls a Tcl proc as in:

        do_some_work [java::new String foo]

There is no guarantee that the created Java object (internally referenced by a Tcl_Obj) will still exist when the proc 'do_some_work' needs to access the object's methods. Worse, the original Tcl_Obj is lost at some unknown point, because Tcl internally uses a number of Tcl_Eval***() variations. Running a script from the interactive Tcl shell can produce a different result from running the same script through the byte compiler.


Problem 2: Deadlock on GC

When using Tcl and Java together, the Java program often needs to access values in Tcl. For example, the Java program may want to read the value of a variable or get the data out of a list. Java code might do the following to print the result to stdout:

        System.out.println(interp.getResult().toString());

where 'interp' is the handle to the C version Tcl interpreter in Java. Let's see what is really happening in the above code:

  A. interp.getResult() returns a temporary Java object of type
     TclObject.  This TclObject contains a reference to the Tcl_Obj
     C data structure returned by the real Tcl interpreter.  The
     reference count of Tcl_Obj is incremented by one.
  B. the toString() method of the TclObject is called.
  C. the resulting string is passed to System.out for printing.
  D. the temporary TclObject is dereferenced, awaiting garbage
     collection.
  E. Java GC thread attempts to free the temporary TclObject.
     TclObject decrements the reference count of the underlying Tcl_Obj.

If the GC thread is an independent thread in the JVM, then step E is an invalid use of Tcl, because decrementing a Tcl_Obj reference count from a different thread is not supported in Tcl. In some JVMs, step E can cause a deadlock. In other JVMs, step E may produce incorrect results.
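
One possible way to keep the decrement in step E off the GC thread, sketched here under assumptions and not taken from the TclBlend sources, is to let collector-side code only enqueue a request and have the thread that owns the interpreter perform the decrement later. The classes RefCounted and ReleaseQueue are hypothetical stand-ins; no real Tcl_Obj is involved.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Stand-in for a Tcl_Obj: only the reference count is modelled, and it is
// touched only by the owning thread.
final class RefCounted {
    int refCount = 1;
}

// Hypothetical queue that moves "decrement" requests to the owning thread.
final class ReleaseQueue {
    private final Thread owner = Thread.currentThread();
    private final ConcurrentLinkedQueue<RefCounted> pending = new ConcurrentLinkedQueue<>();

    // Safe to call from any thread, including a finalizer or GC helper thread.
    void requestRelease(RefCounted obj) {
        pending.add(obj);
    }

    // Must run on the thread that created the queue, for example from its
    // event loop. Returns the number of decrements performed.
    int drain() {
        if (Thread.currentThread() != owner) {
            throw new IllegalStateException("drain() called from a non-owner thread");
        }
        int count = 0;
        RefCounted obj;
        while ((obj = pending.poll()) != null) {
            obj.refCount--;
            count++;
        }
        return count;
    }
}

class ReleaseQueueDemo {
    // Returns {count before drain, number drained, count after drain}.
    static int[] run() {
        ReleaseQueue queue = new ReleaseQueue();
        RefCounted obj = new RefCounted();
        Thread other = new Thread(() -> queue.requestRelease(obj));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        int before = obj.refCount;
        int drained = queue.drain();
        return new int[] { before, drained, obj.refCount };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("before=" + r[0] + " drained=" + r[1] + " after=" + r[2]);
    }
}
```

The point of the design is that the foreign thread never touches the reference count; it only hands the object back to the thread that is allowed to.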


Problem 3: preserve/release paradigm

A Tcl/C program must manage each Tcl_Obj explicitly through the preserve/release mechanism. In a Tcl/Java program, a Tcl_Obj is exposed to Java as a TclObject. However, the JVM frees Java objects implicitly.

This impedance mismatch between the Tcl and Java memory models produces the deadlock described in problem 2. One way to solve problem 2 is to use Tcl's preserve/release mechanism to free Tcl_Obj explicitly from inside Java. When this is done, the TclObject severs the connection between itself and the underlying Tcl_Obj on release. The GC then no longer tries to decrement the Tcl_Obj reference count in step E.

However, the existing TclBlend code has not followed the above preserve/release paradigm. There are many fixes required to make TclBlend compliant.
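
The idea of severing the connection on release can be sketched in plain Java as follows. This is only a model under stated assumptions: FakeTclObj stands in for the native Tcl_Obj, and ManagedTclObject and needsCleanupOnCollection() are hypothetical names, not the real tcl.lang.TclObject API.

```java
// Stand-in for the native Tcl_Obj; only reference counting is modelled.
final class FakeTclObj {
    int refCount = 0;
    boolean freed = false;

    void incr() {
        refCount++;
    }

    void decr() {
        refCount--;
        if (refCount <= 0) {
            freed = true;
        }
    }
}

// Hypothetical Java-side wrapper that follows preserve/release explicitly.
final class ManagedTclObject {
    private FakeTclObj target;   // null once the wrapper has been severed
    private int held = 0;        // references this wrapper currently holds

    ManagedTclObject(FakeTclObj target) {
        this.target = target;
        preserve();              // the wrapper takes one reference on creation
    }

    void preserve() {
        if (target == null) {
            throw new IllegalStateException("wrapper already severed");
        }
        target.incr();
        held++;
    }

    void release() {
        if (target == null) {
            throw new IllegalStateException("release without matching preserve");
        }
        target.decr();
        held--;
        if (held == 0) {
            target = null;       // sever: nothing is left for the GC to undo
        }
    }

    // True if a finalizer would still have to decrement the native count.
    boolean needsCleanupOnCollection() {
        return target != null;
    }
}

class ManagedTclObjectDemo {
    public static void main(String[] args) {
        FakeTclObj raw = new FakeTclObj();
        raw.incr();                                   // reference held by the interpreter
        ManagedTclObject result = new ManagedTclObject(raw);
        try {
            System.out.println("in use, refCount=" + raw.refCount);
        } finally {
            result.release();                         // explicit, on the calling thread
        }
        System.out.println("needs GC cleanup: " + result.needsCleanupOnCollection());
    }
}
```

Pairing every use with a release in a finally block is what "religiously" amounts to in practice: once the wrapper is severed, it no longer matters when or on which thread the JVM collects it.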


Problem 4: No multi-threaded support

TclBlend does not support the creation of multiple Tcl interpreters from multiple Java threads. A number of globals are used in both the C and the Java portions of TclBlend. Some of these globals are:

     a global JAVA_LOCK in C
     a global JVM environment pointer in C
     a global [Notifier] object in Java

Problem 5: Java static methods vs thread-safe Tcl C functions

TclBlend contains at least one static method, tcl.lang.Interp.commandComplete(), which TclBlend maps to the Tcl C function Tcl_CommandComplete(). Because the method is public static in TclBlend, callers will assume that it is thread-safe and independent of any thread state. The C part of TclBlend tries to protect against concurrent calls to Tcl_CommandComplete(). This implementation causes deadlocks when multiple Java threads try to access tcl.lang.Interp.commandComplete().

Assuming Tcl_CommandComplete() is thread-safe and independent of any thread state, removing the protection in TclBlend for Tcl_CommandComplete() solves the deadlock problem.

However, I can't find any clear documentation in the Tcl core as to which C functions in Tcl are thread-safe. I think there should be a list stating very clearly whether a Tcl function can be called from multiple threads and whether the function depends on any thread-local storage. This would go a long way toward helping the TclBlend implementation.


PD: Solution to problem 1.

I am currently working on adding opaque objects (which is essentially what the java0x1 object is) to the Tcl core. This should solve many of your problems. e.g.

    set a [java::new ....]
    lindex $a 0

will fail because $a is not a valid list (this is not intentional but rather a side effect of the string representation which was chosen for opaque objects in order to try and distinguish them from simple strings).

    # Change the type of $a to 'binary'
    binary scan $a ....

    # Change the object back to the original opaque object
    $a ...

    # Create a new object with the same string representation as the
    # opaque object $a, it is not however an opaque object itself.
    set b [string range $a 0 end]

    # Make $b a reference to the same opaque object as $a.
    $b ...

    # Free the reference from $b to the opaque object.
    append b "junk"

    # Free the reference from $a to the opaque object. As this is
    # the last reference, the opaque object itself is freed.
    unset a

JW: In this proposal, are you requiring the user of the opaque object to explicitly delete the object once all references to it are known to be gone? Can you "unset a" after "set b [string range $a 0 end]" and still be able to use "$b ..."? No. Hence the comment at the bottom of the page.

I also know how to fix

    set a [java::new ...]
    after 1000 "$a toString"
    unset a

in particular and callbacks in general but this would only work if the callback stores a Tcl_Obj * and not a string. There is no way to solve

    set a [java::new ...]
    after 1000 [string range "$a toString" 0 end]
    unset a

Mixing string operations with list operations can cause problems and so will using them on opaque objects.

PD

JW: Another problem with callback scripts is that it would be very difficult to write them using Tcl_Obj only. Often, the callback scripts are constructed purely inside Tcl using multi-lined strings like

     {
         code
         code
         ...
     }

You can't expect Tcl-only programmers to write all their callback scripts using [list]. Being more of a functional language, Tcl passes everything by value and assumes it is always possible to reconstruct the original object from its value. Most Tcl scripts also assume this. We need to make Java objects behave as closely to this model as possible. Your opaque object proposal may help in this area.

PD: There are limits to how closely an opaque object can behave like a transparent object such as a list; the opaque object mechanism I am working on will do as much as is possible. It is very similar to the problems you have with garbage collection, except that the references are strings and not just pointers.

Multi-line script callbacks don't have a problem: no substitution is done on them, so they can't contain a pure string representation of an opaque object. The recommended way of writing a multi-line script when you want to pass parameters is to put the script into a proc, and then you are back to your original problem. As I mentioned, I have a possible solution for the situation where the script is created by concatenating strings ("$x toString") rather than by using [list].

It is interesting that you compare Tcl to a functional language as I have [lambda] (actually it is just an unnamed procedure) and [curry] functions which means that you don't have to worry about what name to give your callback procedure. e.g.

    button .b -command [list [lambda {w args} {
      code
      code
      ...
    }] .b]

Problem 1 (intermittent "invalid command name java0x1" errors, where Tcl tries to access Java objects that are no longer available) can be addressed, albeit awkwardly, with java::lock.

The original:

    set x [java::new String foo]   ;# line 1
    after 1000 "$x toString"       ;# line 2
    unset x                        ;# line 3
    update                         ;# line 4

becomes:

    java::lock [set x [java::new String foo]]       ;# line 1
    after 1000 "$x toString; java::unlock $x"       ;# line 2
    unset x                                         ;# line 3
    update                                          ;# line 4

-- Todd Coram