Version 19 of Tcl_Obj refCount HOWTO

Updated 2017-12-31 11:11:52 by pooryorick

Description

The clt thread, "Tcl_ObjSetVar2..., Martin Lembug, 2005-10-12", and Bug #1334947 revived the discussion of how to properly manage the reference count of Tcl_Objs.

This is an attempt at clarifying the issue, and a roadmap to improving both the core's refCount management and the documentation related to the subject.

Categorization

Joe English writes: There are roughly four classes of Tcl_Obj-related library routines:

  • Constructors, which return a fresh Tcl_Obj with a reference count of 0.
  • Readers, which only read the value (but may cause shimmering).
  • Consumers, which store a new reference to an existing Tcl_Obj, increment the refcount, and arrange to decrement the refcount at some unspecified point in the future.
  • Mutators, which change the Tcl_Obj and can *only* be used with unshared Tcl_Objs (reference count == 0 or 1).

and Donal Fellows adds the category

  • Hairy Monsters. Don't give these things refcount==0 objects since they might manipulate the reference count during their processing and might or might not retain a reference.

Please note that the same function may belong to different categories with respect to different arguments: for example, as currently implemented (up to Tcl 8.5a4), the function Tcl_ObjSetVar2(interp, part1Ptr, part2Ptr, newValuePtr, flags) is a Reader wrt part1Ptr and part2Ptr, but a Hairy-Monster wrt newValuePtr. (DKF: It should be noted that there is no guarantee that Tcl_ObjSetVar2 will remain a Reader wrt part2Ptr; if we ever optimize internal memory management of arrays, that will likely change. This is why knowing which function is what with respect to each argument is hard.)

As a general rule, all Tcl commands should be considered to be Hairy-Monsters wrt the objects in the objv array.

We hope to improve the documentation with respect to the categorization of the different functions, and also to reduce significantly the population of Hairy-Monsters. As of today, Constructors and Mutators should be properly documented as such.

PYK 2017-12-30: Another category is Accessors, which return a reference to an existing Tcl_Obj. The reference count should be incremented before passing the object to a function that might decrement it. In the following example from itcl, the Tcl_Obj that is the current interpreter result is passed to a function which, in case of an error, calls Tcl_SetObjResult and passes it that same Tcl_Obj. Tcl_SetObjResult in turn frees the object, but then attempts to use the bytes member of the same object, which the caller gave it, resulting in a fault:

Tcl_GetObjectFromObj(interp, Tcl_GetObjResult(interp));

Instead, the reference count should be incremented:

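Tcl_Obj *resPtr = Tcl_GetObjResult(interp);
Tcl_IncrRefCount(resPtr);               /* hold our own reference across the call */
Tcl_GetObjectFromObj(interp, resPtr);
Tcl_DecrRefCount(resPtr);               /* release it; frees resPtr if ours was the last reference */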

Rules for dealing safely with the different categories

Note first that Constructors are not an issue: there is no Tcl_Obj to manage before you call them.

The always-safe rules are:

  • Mutators: pass an unshared object (refCount is 0 or 1). In order to respect copy-on-write semantics, make a copy for your use if you need to modify a shared object, and modify the copy.
  • Readers, Consumers and Hairy-Monsters: Tcl_IncrRefCount(objPtr) before calling the library function, Tcl_DecrRefCount(objPtr) on return. This means: assume that every function in the Tcl library that is not a Mutator is a Hairy-Monster.
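The always-safe rule for non-Mutators can be sketched with a toy model. This is not the Tcl API: Obj, obj_new, obj_incr, obj_decr, reader, consumer, monster and call_safely are hypothetical stand-ins (for Tcl_Obj, Tcl_NewObj, Tcl_IncrRefCount, Tcl_DecrRefCount and three kinds of library routine), and only the reference count is modeled. The point is that bracketing the call with an increment and a decrement behaves correctly whichever kind of routine is called.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for Tcl_Obj: only the reference count is modeled. */
typedef struct Obj { int refCount; } Obj;

int live_objects = 0;   /* objects allocated and not yet freed */
Obj *retained = NULL;   /* the one slot the hypothetical consumer stores into */

/* Constructor: a fresh object starts with refCount 0. */
Obj *obj_new(void) {
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refCount = 0;
    live_objects++;
    return o;
}

void obj_incr(Obj *o) { o->refCount++; }

/* Frees the object when the decremented count is zero or below. */
void obj_decr(Obj *o) {
    if (--o->refCount <= 0) {
        free(o);
        live_objects--;
    }
}

/* Reader: looks at the object and keeps nothing. */
void reader(Obj *o) { (void)o->refCount; }

/* Consumer: stores a new reference for later use. */
void consumer(Obj *o) { obj_incr(o); retained = o; }

/* Hairy monster: increments and decrements while processing, which
 * would free an object that arrived with refCount 0. */
void monster(Obj *o) { obj_incr(o); obj_decr(o); }

/* The always-safe rule: take a reference before the call and release
 * it on return, without knowing which category fn belongs to. */
void call_safely(void (*fn)(Obj *), Obj *o) {
    obj_incr(o);
    fn(o);
    obj_decr(o);
}
```

In this model, call_safely(reader, obj_new()) and call_safely(monster, obj_new()) both end with the object freed exactly once, while call_safely(consumer, obj_new()) leaves it alive with the consumer's single reference.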

The optimal rules in terms of performance and code simplicity (but risky in light of incomplete documentation) are:

  • Mutators: pass an unshared object (refCount is 0 or 1). In order to respect copy-on-write semantics, make a copy for your use if you need to modify a shared object, and modify the copy. Use Tcl_IsShared to determine whether you may modify, and Tcl_DuplicateObj to get the copy.
  • Readers: if you pass an object with refCount==0, make sure to Tcl_DecrRefCount(objPtr) on return in order not to leak the object.
  • Consumers: do not worry about reference counts as the consumer takes care of it, including the freeing of unneeded objects. This is fire-and-forget. Passing a fresh Tcl_Obj* to a consumer means you're through using it.
  • Hairy-Monsters: Tcl_IncrRefCount(objPtr) before calling the library function, Tcl_DecrRefCount(objPtr) on return.
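The Mutator rule (check for sharing, copy if shared, modify the copy) can be sketched in the same toy style. The names below are hypothetical stand-ins, not Tcl functions: obj_is_shared plays the role of Tcl_IsShared (assumed here to mean refCount > 1), obj_dup the role of Tcl_DuplicateObj, and obj_set is an invented mutator on an invented integer value.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for Tcl_Obj with a single integer as its value. */
typedef struct Obj { int refCount; int value; } Obj;

/* Constructor: fresh object, refCount 0. */
Obj *obj_new(int value) {
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->refCount = 0;
    o->value = value;
    return o;
}

/* An object is shared when more than one reference to it exists. */
int obj_is_shared(const Obj *o) { return o->refCount > 1; }

/* Duplicate: a fresh, unshared copy (refCount 0) of the value. */
Obj *obj_dup(const Obj *o) { return obj_new(o->value); }

/* Mutator: must only ever see an unshared object. */
void obj_set(Obj *o, int value) {
    assert(!obj_is_shared(o));
    o->value = value;
}

/* Copy-on-write: modify o in place if unshared, otherwise modify a
 * copy. Returns the object that now holds the new value; when it is a
 * copy, the caller owns it and it still has refCount 0. */
Obj *set_copy_on_write(Obj *o, int value) {
    if (obj_is_shared(o)) {
        o = obj_dup(o);
    }
    obj_set(o, value);
    return o;
}
```

The other holders of a shared object keep seeing the old value, which is the copy-on-write behaviour the rule is meant to preserve.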

Tcl DOES NOT GARBAGE COLLECT!

I have been trying to understand clearly the rules for Tcl reference counting of objects, and how to properly use the increment and decrement ref count operations. I finally came across a question and answer posting elsewhere, in which Donal K. Fellows clearly explains a very, very, VERY important concept:

Tcl Does Not Garbage Collect!

What does this mean for the Tcl extension writer? I can sum it up in a nutshell... if you create an object, and you never pass it back to Tcl as part of another object (e.g. a list object), or as the result object, it will NOT get freed. You MUST call Tcl_DecrRefCount() because this is where the memory deallocator gets called... and nowhere else!

It goes without saying that for a very experienced programmer such as myself to have to hunt around for this morsel of information means that the documentation for Tcl Objects in the Tcl C reference is not explicit enough in making this fact crystal clear.

It should also be pointed out that calling Tcl_DecrRefCount() to free an allocated object, without first having called Tcl_IncrRefCount(), is perfectly OK. The deallocation will happen if the resulting ref count is zero or negative. This can be the case if you must create a new object for the sole purpose of passing it as an argument to another Tcl API, but then have no further use for it. If that bit of code is called over and over without freeing the object, you will end up with many cats and hats wandering in the woods outside Mr. Tesla's mountain laboratory, homeless.

I hope someone finds this clarification useful.

MS disagrees strongly with "is perfectly OK" - at least in the unqualified version above! If you created an object without calling Tcl_IncrRefCount(), and passed it somewhere, calling Tcl_DecrRefCount() on it is possibly disastrous: if some part of Tcl kept a reference to it, you will be removing that reference and freeing the object - which will cause memory corruption further down the line when the reference count is decremented by the rightful owner of the reference! Similarly, if some part of Tcl did an incr/decr of the refCount, the object will already be free when you call Tcl_DecrRefCount(), so that your call causes memory corruption. That advice is ONLY correct if you never ever pass the Tcl_Obj anywhere else.

I also disagree (less emphatically) about it being difficult to find that Tcl does not garbage collect. It is never implied that it does; if you have to hunt around for such a morsel of information in the C documentation you will also spend a lot of time. It is even harder to find the morsel of information that Tcl will not wash your dishes. Then again, neither will C, C++, Java or C# - and the docs keep absolute silence about that!
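MS's objection about decrementing an object that was never incremented can be made concrete with a small model. It is only a sketch: Obj, obj_incr, obj_decr, consumer, decr_after_passing and decr_private are hypothetical names, not Tcl functions, and objects are never really freed here. A flag records that they would have been, so the dangling reference can be inspected without corrupting memory.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for Tcl_Obj. "freed" records that the real object
 * would have been deallocated at this point. */
typedef struct Obj { int refCount; int freed; } Obj;

Obj *retained = NULL;   /* reference kept by the hypothetical consumer */

void obj_incr(Obj *o) { o->refCount++; }

/* Marks the object freed when the decremented count is zero or below. */
void obj_decr(Obj *o) {
    assert(!o->freed);  /* touching a freed object is the bug being modeled */
    if (--o->refCount <= 0) {
        o->freed = 1;
    }
}

/* Consumer: stores a new reference and will release it later. */
void consumer(Obj *o) { obj_incr(o); retained = o; }

/* The pattern MS warns against: pass a fresh object on, then decrement
 * without having incremented. Returns 1 if the consumer is left holding
 * a reference to a freed object. */
int decr_after_passing(Obj *fresh) {
    consumer(fresh);    /* refCount 0 -> 1, owned by the consumer */
    obj_decr(fresh);    /* caller drops a reference it never took */
    return retained->freed;
}

/* The qualified case: the object was never passed anywhere, so a bare
 * decrement simply frees it. Returns 1 if the object was freed. */
int decr_private(Obj *fresh) {
    obj_decr(fresh);
    return fresh->freed;
}
```

With a fresh object (refCount 0), decr_after_passing reports a dangling reference, while decr_private frees the object cleanly; this matches the qualification that the bare decrement is only correct for an object that was never handed to anything else.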

See Also

routines safe for zero-ref objs
Enumerates routines that have this quality.
Tcl_Obj
The structure that holds Tcl values.
Managing the reference count of Tcl objects
When to and when not to increment and decrement refcounts.