Version 16 of Managing the reference count of Tcl objects

Updated 2009-06-19 01:26:56 by kbk

See also Tcl_Obj refCount HOWTO


From a usenet posting by Andre Ponitz: (Sorry about losing the accented letters; the Wiki chokes on them)

 > Until now I have been using Tcl_Eval as the only method to access Tcl from
 > my C++-Code.
 > 
 > For performance reasons I'd like to switch to some more elaborate method,
 > but I am still confused where and when I have to call Tcl_IncrRefCount and
 > Tcl_DecrRefCount.
 > 
 > Specifically: Suppose I have an object that actually is a list. I have
 > called Tcl_IncrRefCount once on every item in the list and on the list
 > itself. How do I free the list?

MS 2005-10-24: apart from all the comments below, something specific to lists. In general, you should not concern yourself with the refCount of list items, as the Tcl library will handle those for you. I.e., the rule is that you should not touch them (neither incr nor decr) as long as you are using the List functions. If you are doing direct surgery on the list (as opposed to doing it via Tcl_ListObjReplace() or similar), it is really more involved. The recommendation is:

   (a) don't
   (b) if you still do, read the Tcl sources carefully (especially tclListObj.c) 

You're making things difficult for yourself.

There's only one rule:

You need to worry about ref counts if and only if you have a Tcl_Obj* on the left hand side of an equal sign. In this case, you must

  1. Tcl_IncrRefCount() the new content of the variable.
  2. Make sure that you Tcl_DecrRefCount() that content when the variable loses the reference, either because you overwrote it with another assignment or because it went out of scope.

If you know exactly what you're doing, you can sometimes skip incrementing/decrementing the ref count, because you're sure that there's another reference somewhere. But the rule above always works.
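The rule can be tried out without Tcl at all. The following is only a toy sketch: ToyObj, toyNew, toyIncr, toyDecr, toyAssign and demoRule are made-up names, not part of the Tcl API, and the only thing modelled is the count itself. It assumes, as for a freshly created Tcl object, that a new object starts with a count of zero.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for Tcl_Obj: only the reference count is modelled. */
typedef struct ToyObj {
    int refCount;
} ToyObj;

static int liveObjects = 0;   /* objects allocated and not yet freed */

/* A new ToyObj starts at refCount 0 (assumption of this toy model). */
static ToyObj *toyNew(void)
{
    ToyObj *objPtr = malloc(sizeof *objPtr);
    assert(objPtr != NULL);
    objPtr->refCount = 0;
    liveObjects++;
    return objPtr;
}

static void toyIncr(ToyObj *objPtr)
{
    objPtr->refCount++;
}

/* Free the object when the last reference goes away. */
static void toyDecr(ToyObj *objPtr)
{
    if (--objPtr->refCount <= 0) {
        free(objPtr);
        liveObjects--;
    }
}

/*
 * The rule in one place: storing into a ToyObj* variable takes a
 * reference to the new content and releases the old content.
 * Incrementing first keeps self-assignment safe.
 */
static void toyAssign(ToyObj **varPtr, ToyObj *newPtr)
{
    if (newPtr != NULL) {
        toyIncr(newPtr);
    }
    if (*varPtr != NULL) {
        toyDecr(*varPtr);
    }
    *varPtr = newPtr;
}

/* Returns the number of live objects while the variable holds one. */
int demoRule(void)
{
    ToyObj *held = NULL;
    int liveWhileHeld;

    toyAssign(&held, toyNew());   /* first content, refCount 1 */
    toyAssign(&held, toyNew());   /* overwrite: first content is freed */
    liveWhileHeld = liveObjects;
    toyAssign(&held, NULL);       /* the variable "goes out of scope" */
    return liveWhileHeld;
}
```

Funnelling every store through one helper like toyAssign is just one way to keep the two halves of the rule together; in real code the increment and decrement are usually written out by hand.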

The problem with this rule is performance. Often the Tcl_IncrRefCount() makes an unshared object appear to be shared, and an extra Tcl_DuplicateObj() is needed. For this reason, you can optionally add a second rule:

  • If you're absolutely sure that nobody else will decrement the ref count of an object while you're holding a reference to it, you can skip manipulating the ref count.

Let's make an [lshift] command to illustrate how this works. The [lshift] command will accept the name of a variable that is presumed to be a list, and remove its first element. A Tcl equivalent would be:

 proc lshift { varName } {
     upvar 1 $varName var
     set var [lrange $var 1 end]
 }

The following is a naive implementation of [lshift] in C. It is ultraconservative about reference counts; it always, always adjusts the count when it stores a Tcl_Obj pointer.

 int
 Lshift_ObjCmd( ClientData unused,
                Tcl_Interp* interp,
                int objc,
                Tcl_Obj * CONST objv[] )
 {
     Tcl_Obj* listPtr;    /* Pointer to the list being shifted. */
     int status;          /* Status return from Tcl library */
     Tcl_Obj* retPtr = NULL; /* Pointer returned from Tcl library; NULL until set */
 
     /* Check arguments */
 
     if ( objc != 2 ) {
         Tcl_WrongNumArgs( interp, 1, objv, "varName" );
         return TCL_ERROR;
     }
 
     /* Get a pointer to the list */ 
 
     listPtr = Tcl_ObjGetVar2( interp, objv[1], (Tcl_Obj*) NULL,
                               TCL_LEAVE_ERR_MSG );
     if (listPtr == NULL ) {
         return TCL_ERROR;
     }
 
     /* See the discussion for comments on the following line */
 
     Tcl_IncrRefCount( listPtr );                        /* [A] */
 
     /* If the list object is shared, make a private copy. */
 
     if ( Tcl_IsShared( listPtr ) ) {
         Tcl_Obj* temp = listPtr;
         listPtr = Tcl_DuplicateObj( listPtr );
         Tcl_DecrRefCount( temp );
         Tcl_IncrRefCount( listPtr );
     }
     
     /**
      ** At this point, listPtr designates an unshared copy of the
      ** list.  Edit it.
      **/
 
     status = Tcl_ListObjReplace( interp, listPtr, 0, 1, 0, (Tcl_Obj*)NULL );
 
     /**
      ** Put the new copy of the list back in the variable.
      **/
 
     if ( status == TCL_OK ) {
         retPtr = Tcl_ObjSetVar2( interp, objv[ 1 ], (Tcl_Obj*) NULL,
                                  listPtr, TCL_LEAVE_ERR_MSG );
     }
 
     /**
      ** Store the new copy of the list in the interpreter result.
      **/
     if ( retPtr == NULL ) {
         status =  TCL_ERROR;
     } else {
         /* Use retPtr instead of listPtr if trace actions are required in the result */
         Tcl_SetObjResult( interp, listPtr ); 
     }
 
     /* Record that listPtr is going out of scope */
 
     Tcl_DecrRefCount( listPtr );
 
     /* Tell the caller whether the operation worked. */
 
     return status;
 }

OK, now why did I say this was a naive implementation? It's simple: After the Tcl_IncrRefCount() at [A] in the code, the object is always shared: there's at least the one reference to it in the variable table, and the one we just added. The result is that we'll always duplicate the object.

This is (only) a performance problem. The code as written will work. It simply won't be as fast as it might be. Making it faster in a safe manner requires a little bit of source diving to discover that Tcl_ListObjReplace() doesn't mess with the ref count of the list. Tcl_ObjSetVar2(), however, does adjust the ref count of the object stored in the variable. We therefore can let the ref count float while we're performing surgery on the list, as long as we repair it by the time we're storing it back in the variable.
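The effect of the increment at [A] can be seen in a toy model. This is only a sketch: ToyObj, toyIsShared and copiesNeeded are made-up names, and the model assumes that "shared" means nothing more than a count greater than one.

```c
#include <assert.h>

/* Toy stand-in for Tcl_Obj: only the reference count is modelled. */
typedef struct ToyObj {
    int refCount;
} ToyObj;

/* Assumption of this model: shared means more than one reference. */
static int toyIsShared(const ToyObj *objPtr)
{
    return objPtr->refCount > 1;
}

/*
 * Count the duplicates needed to edit a value whose only reference
 * is the one held by the variable table.
 */
int copiesNeeded(int takeRefFirst)
{
    ToyObj inVariable;
    int copies = 0;

    assert(takeRefFirst == 0 || takeRefFirst == 1);
    inVariable.refCount = 1;          /* the variable's reference */
    if (takeRefFirst) {
        inVariable.refCount++;        /* the step marked [A] */
    }
    if (toyIsShared(&inVariable)) {
        copies++;                     /* where Tcl_DuplicateObj() would run */
    }
    return copies;
}
```

Here copiesNeeded(1) reports one copy and copiesNeeded(0) reports none, which is the whole of the performance argument: the value in the variable was editable in place until we took our own reference to it.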

Doing this kind of optimization requires knowing about routines safe for zero-ref objs.

That change leads us to the following:

 int
 Lshift_ObjCmd( ClientData unused,
                Tcl_Interp* interp,
                int objc,
                Tcl_Obj * CONST objv[] )
 {
     Tcl_Obj* listPtr;    /* Pointer to the list being shifted. */
     int status;          /* Status return from Tcl library */
     Tcl_Obj* retPtr = NULL; /* Pointer returned from Tcl library; NULL until set */

     /* Check arguments */
 
     if ( objc != 2 ) {
         Tcl_WrongNumArgs( interp, 1, objv, "varName" );
         return TCL_ERROR;
     }
 
     /* Get a pointer to the list */ 
 
     listPtr = Tcl_ObjGetVar2( interp, objv[1], (Tcl_Obj*) NULL,
                               TCL_LEAVE_ERR_MSG );
     if (listPtr == NULL ) {
         return TCL_ERROR;
     }
 
     /** PERFORMANCE CHANGE:
      ** We intend to perform surgery on the list object, so
      ** avoid making adjustments to its reference count yet.
      ** Hence, its reference count is one too low.
      **/

     /* REMOVED:  Tcl_IncrRefCount( listPtr ); */
 
     /* If the list object is shared, make a private copy. */
 
     if ( Tcl_IsShared( listPtr ) ) {
         Tcl_Obj* temp = listPtr;
         listPtr = Tcl_DuplicateObj( listPtr );
         /** PERFORMANCE CHANGE: At this point, we're not yet
          **    tracking the reference count of listPtr.  Its
          **    reference count remains one too low.
          ** REMOVED:
          **   Tcl_DecrRefCount( temp );
          **   Tcl_IncrRefCount( listPtr );
          **/
     }
     
     /**
      ** At this point, listPtr designates an unshared copy of the
      ** list.  Edit it.
      **/
 
     status = Tcl_ListObjReplace( interp, listPtr, 0, 1, 0, (Tcl_Obj*)NULL );
 
     /** PERFORMANCE CHANGE: 
      ** From this point forward, we ensure
      **    that listPtr's reference count is correct.
      ** ADDED: */

     Tcl_IncrRefCount( listPtr );

     /**
      ** Put the new copy of the list back in the variable.
      **/
 
     if ( status == TCL_OK ) {
         retPtr = Tcl_ObjSetVar2( interp, objv[ 1 ], (Tcl_Obj*) NULL,
                                  listPtr, TCL_LEAVE_ERR_MSG );
     }
 
     /**
      ** Store the new copy of the list in the interpreter result.
      **/
     if ( retPtr == NULL ) {
         status =  TCL_ERROR;
     } else {
         Tcl_SetObjResult( interp, listPtr ); 
     }
 
     /* The reference to the list is going out of scope. */
 
     Tcl_DecrRefCount( listPtr );
 
     /* Tell the caller whether the operation worked. */
 
     return status;
 }

DKF - It turns out that you can be even more efficient than the above by taking advantage of the fact that Tcl_ObjSetVar2() does the Right Thing in the process of putting an object into a variable so that if you put an object into a variable where it already exists, its refcount can never drop to zero. This allows for the following implementation which does virtually no explicit reference count manipulation at all (just the Tcl_IsShared()/Tcl_DuplicateObj() combo in the middle):

MS - Wrong! Tcl_ObjSetVar2() does not always do the right thing (See Bug #1334947 [L1 ]). The code below will leak the listPtr whenever Tcl_ObjSetVar2() fails due to e.g. objv[1] being the name of an array at the time it is called. I do hope that this will be fixed in Tcl8.5.

 int
 Lshift_ObjCmd( ClientData unused,
                Tcl_Interp* interp,
                int objc,
                Tcl_Obj * CONST objv[] )
 {
     Tcl_Obj* listPtr;    /* Pointer to the list being shifted. */
     int status;          /* Status return from Tcl library */
     Tcl_Obj* retPtr = NULL; /* Pointer returned from Tcl library; NULL until set */

     /* Check arguments */
 
     if (objc != 2) {
         Tcl_WrongNumArgs(interp, 1, objv, "varName");
         return TCL_ERROR;
     }
 
     /* Get a pointer to the list */ 
 
     listPtr = Tcl_ObjGetVar2(interp, objv[1], (Tcl_Obj*) NULL,
                              TCL_LEAVE_ERR_MSG);
     if (listPtr == NULL) {
         return TCL_ERROR;
     }
 
     /* If the list object is shared, make a private copy. */
 
     if (Tcl_IsShared(listPtr)) {
         listPtr = Tcl_DuplicateObj(listPtr);
     }
     
     /**
      ** At this point, listPtr designates an unshared copy of the
      ** list.  Edit it.
      **/
 
     status = Tcl_ListObjReplace(interp, listPtr, 0, 1, 0, (Tcl_Obj*)NULL);
 
     /** PERFORMANCE CHANGE: 
      ** At this point, listPtr's refcount is either zero or one
      ** and it will get incremented (and then decremented again
      ** if it was previously 1) in the Tcl_ObjSetVar2() call, in
      ** effect transferring ownership of the object to the
      ** variable.
      **
      ** NOTE that this may leak listPtr if Tcl_ObjSetVar2 fails. 
      ** DO NOT USE in Tcl8.4, nor in Tcl8.5 until Bug #1334947 is fixed.
      **
      ** REMOVED:
      **   Tcl_IncrRefCount(listPtr);
      **/
 
     /**
      ** Put the new copy of the list back in the variable.
      **/
 
     if (status == TCL_OK) {
         retPtr = Tcl_ObjSetVar2(interp, objv[1], (Tcl_Obj*) NULL,
                                 listPtr, TCL_LEAVE_ERR_MSG);
     }
 
     /**
      ** Store the new copy of the list in the interpreter result.
      ** Increments the reference count of listPtr.
      **/
     if ( retPtr == NULL ) {
         status =  TCL_ERROR;
     } else {
         Tcl_SetObjResult( interp, listPtr ); 
     }
  
     /** PERFORMANCE CHANGE:
      ** On success, the refcount for listPtr is now 2 (the variable
      ** and the interpreter result), so do nothing here.
      ** REMOVED:
      **   Tcl_DecrRefCount(listPtr);
      **/
 
     /* Tell the caller whether the operation worked. */
 
     return status;
 }

NEM - I had some trouble recently tracking down a bug which occurred due to not calling Tcl_IncrRefCount on an object which I later call Tcl_DecrRefCount on. The bug was very difficult to track down, as it only showed up somewhere completely unrelated. Example:

 Tcl_Obj *cmd = Tcl_NewStringObj("somecmd args", -1);
 // Missing Tcl_IncrRefCount here - this is the bug
 int res = Tcl_EvalObjEx(interp, cmd, TCL_EVAL_GLOBAL);
 Tcl_DecrRefCount(cmd);
 // Check res and continue or error

The problem is that 99% of the time, the bug will not surface here but much later. For me, it surfaced when calling [info commands] in my code, at which point the program dumped core (segfault).

Miguel Sofer (MS) explained why this can happen: Basically, you should always call Tcl_IncrRefCount on any object you send to a Tcl API call, as that call may call Tcl_Incr/DecrRefCount on it, which would result in the object being "freed". The reason why "freed" is in inverted commas is that (depending on the memory allocator in use) the object may not be freed at all, but returned to a pool for reuse later. In my example, the problem was that the object was being freed, and then immediately reused for a different purpose, before I did my Tcl_DecrRefCount. This meant that I was basically freeing an object that I don't own, and that causes all hell to break loose.
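The mechanism can be reproduced in miniature. What follows is a toy sketch, not Tcl code: ToyObj, toyNew, toyIncr, toyDecr, toyLibraryCall and aliveAfterCall are made-up names, and toyLibraryCall merely stands in for an API call that takes and drops its own reference while it works.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for Tcl_Obj: only the reference count is modelled. */
typedef struct ToyObj {
    int refCount;
} ToyObj;

static int liveObjects = 0;   /* objects allocated and not yet freed */

/* A new ToyObj starts at refCount 0 (assumption of this toy model). */
static ToyObj *toyNew(void)
{
    ToyObj *objPtr = malloc(sizeof *objPtr);
    assert(objPtr != NULL);
    objPtr->refCount = 0;
    liveObjects++;
    return objPtr;
}

static void toyIncr(ToyObj *objPtr)
{
    objPtr->refCount++;
}

/* Free the object when the last reference goes away. */
static void toyDecr(ToyObj *objPtr)
{
    if (--objPtr->refCount <= 0) {
        free(objPtr);
        liveObjects--;
    }
}

/* Stands in for a library call that briefly holds its own reference. */
static void toyLibraryCall(ToyObj *objPtr)
{
    toyIncr(objPtr);
    toyDecr(objPtr);
}

/* Returns 1 if the object is still alive after the call, else 0. */
int aliveAfterCall(int holdOwnRef)
{
    ToyObj *cmd = toyNew();           /* refCount 0 */
    int alive;

    if (holdOwnRef) {
        toyIncr(cmd);                 /* the increment missing in the bug */
    }
    toyLibraryCall(cmd);              /* frees cmd if nobody else holds it */
    alive = liveObjects;
    if (holdOwnRef) {
        toyDecr(cmd);                 /* balanced release */
    }
    /*
     * Without our own reference, cmd is already gone at this point;
     * calling toyDecr(cmd) here would touch freed memory, which is
     * the kind of bug described above.
     */
    return alive;
}
```

With holdOwnRef set, the object survives the call and is released exactly once by its owner; without it, the object is gone as soon as the call returns, and any later decrement would act on memory that may already belong to something else.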

The solution is to (obviously) double check to make sure that all of your Tcl_IncrRefCount and Tcl_DecrRefCount calls match up. It can be easy to overlook one though, as I discovered. You can compile Tcl with -enable-symbols=purify (or all, or debug) to get Tcl to use the standard malloc()/free(), in which case the problem will be localized (i.e. your freeing an object once too often won't result in a problem for Tcl), but you still need to find and fix the bug.

To help with this, I'm going to look into how hard it would be to add a debug compile flag for Tcl which allows tracking of Tcl_Obj reference counts. Should be fun!

09nov03 jcw - For another idea on how to simplify the programming task of cleaning up reference counts, see the "AutoReleasePool" used in Cocoa on Mac OS X, described at [L2 ].

10nov03 DKF - It's not entirely clear to me from the above page how an AutoReleasePool would be done in conventional C, but then I'm a bit jet-lagged right now...

The idea seems to be that one puts a ref in a pool when returning an object which ought to be released at some point. The pool is normally created right after every event and deleted just before returning to poll/suspend for another event. Deleting the pool means: drop all refcounts stored in it. This approach makes it possible to never have zero-ref objects: simply add them to the pool and if the object is not used anywhere else it'll go away when the system is idle. -jcw
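As a rough illustration of how such a pool might look in plain C, here is a toy sketch. Nothing in it is Tcl or Cocoa API: ToyObj, ToyPool, poolAdd, poolDrain and demoPool are made-up names, the pool is a fixed-size array, and only the reference count is modelled.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for Tcl_Obj: only the reference count is modelled. */
typedef struct ToyObj {
    int refCount;
} ToyObj;

static int liveObjects = 0;   /* objects allocated and not yet freed */

static ToyObj *toyNew(void)
{
    ToyObj *objPtr = malloc(sizeof *objPtr);
    assert(objPtr != NULL);
    objPtr->refCount = 0;
    liveObjects++;
    return objPtr;
}

static void toyIncr(ToyObj *objPtr)
{
    objPtr->refCount++;
}

static void toyDecr(ToyObj *objPtr)
{
    if (--objPtr->refCount <= 0) {
        free(objPtr);
        liveObjects--;
    }
}

#define POOL_MAX 16

typedef struct ToyPool {
    ToyObj *items[POOL_MAX];
    int count;
} ToyPool;

/* Give the pool its own reference; returns objPtr, or NULL if full. */
static ToyObj *poolAdd(ToyPool *poolPtr, ToyObj *objPtr)
{
    if (poolPtr->count >= POOL_MAX) {
        return NULL;
    }
    toyIncr(objPtr);
    poolPtr->items[poolPtr->count++] = objPtr;
    return objPtr;
}

/* "Deleting the pool": drop every reference stored in it. */
static void poolDrain(ToyPool *poolPtr)
{
    int i;

    for (i = 0; i < poolPtr->count; i++) {
        toyDecr(poolPtr->items[i]);
    }
    poolPtr->count = 0;
}

/* Returns the number of objects that survive draining the pool. */
int demoPool(void)
{
    ToyPool pool;
    ToyObj *kept;
    int survivors;

    pool.count = 0;
    kept = poolAdd(&pool, toyNew());
    poolAdd(&pool, toyNew());     /* a temporary nobody else refers to */
    assert(kept != NULL);
    toyIncr(kept);                /* some other owner keeps this one */
    poolDrain(&pool);             /* temporary is freed, kept survives */
    survivors = liveObjects;
    toyDecr(kept);                /* the other owner lets go */
    return survivors;
}
```

In an event-driven program the drain would presumably happen once per event, just before going back to wait for the next one; where exactly to hook that in is left open here.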

2003-11-11 elfring Can the design pattern "Resource acquisition is initialization" [L3 ] help to find solutions? Can anything from the Smart Pointer Library [L4 ] be converted into C functions for TCL?


Category Internals