Tcl_Obj

'''[http://www.tcl.tk/man/tcl/TclLib/Object.htm%|%Tcl_Obj]''' is the [C] structure that implements every [Tcl] value.  The name is misleading.
'''Tcl_Value''' would better describe its purpose, but that term was already taken by
`[http://www.tcl.tk/man/tcl/TclLib/CrtMathFnc.htm%|%Tcl_CreateMathFunc()]`.

[AMG]: Tcl 9 [http://core.tcl.tk/tcl/info/d6204416c376e4d5%|%renames] Tcl_Obj to Tcl_Value.

[FM]: Tcl_Cache could be more explicit, since it caches the result of a previous parsing.


   ''Tcl_Obj's are like storks. They have two legs, the internal representation and the string representation. They can stand on either leg, or on both.'':    -- attributed to [DKF]

   ''Any Tcl_Obj that is not isomorphic to a string is a bug.'':   -- [dgp], [Tcl Chatroom], 2015-07-01



** See Also **

   [Creating and Using Tcl Handles in C Extensions]:   

   [How to embed Tcl in C applications%|%How to embed Tcl in C applications]:   

   [Managing the reference count of Tcl objects]:   

   [Tcl_Obj Deep Copy]:   
   [Tcl_Obj refCount HOWTO]:   How to manage the reference count of a Tcl_Obj.

   [Blessed Tcl_Obj Values]:   

   [A Tcl_Obj Command Machine Code Generator]:   

   [Tcl_Obj proposals]:   Refers to discussions of changes to the Tcl_Obj structure and its semantics. 
   [Tcl_Objs]:   A list of types of `Tcl_Obj` values.

   [Tcl_Obj types list]:   Another list of types of `Tcl_Obj` values.

   [Tcl_Obj vs Command]:   

   [Extending Tcl]:   

   [Islist Extension]:   

   [32-bit integer overflow]:   

   [Category Tcl Library]:   

   [stasher]:   A project to make it possible to associate arbitrary data with a Tcl_Obj at the script level.

   [Tweezer]:   Hack on Tcl objects from the script level.

   [https://github.com/auriocus/tclvalue%|%tclvalue]:   An extension for Tcl 8.6+ to reflect the Tcl_Obj API into the script level.

   [https://github.com/cyanogilvie/type%|%type]:   Allow Tcl script to create custom [Tcl_ObjType%|%Tcl_ObjTypes].
   [scrobj]:   Makes it possible to implement Tcl_Obj types on the script level.

   [https://github.com/cyanogilvie/dedup%|%dedup]:   An extension that provides an alternative to Tcl_NewStringObj that returns a reference to an existing string object, if one exists, rather than creating a new duplicate string. The primary benefit of applying this mechanism to Tcl_Objs created from strings by an extension is performance: when an existing cached value is returned, no memory management is performed; the cached Tcl_Obj's reference count is just incremented and the pointer returned. This can be a substantial win for an extension such as an XML parser that spends much of its time creating Tcl_Objs for the same set of strings.



** Documentation **

   [http://www.usenix.org/legacy/publications/library/proceedings/tcl96/full_papers/lewis/%|%An On-the-fly Bytecode Compiler for Tcl], Brian T. Lewis, 1996:   Introduces Tcl_Obj.

   [http://www.tcl.tk/man/tcl/TclLib/Object.htm%|%official reference for] Tcl_Obj and friends:   

   [http://www.tcl.tk/man/tcl8.5/TclLib/ObjectType.htm%|%official reference for object types]:   



** Description **

'''Tcl_Obj''' provides for both a string (modified utf-8) representation and an
"internal" representation of the value, which is limited only by the
constraints of the [C] language itself.  Each Tcl_Obj carries information about
the type of its internal representation, and how to perform a conversion from
the string representation to the internal representation, and vice-versa.  In
this way, the string becomes the universal intermediate representation for
conversions between types.  To enhance performance, Tcl knows how to perform
direct conversions between certain often-used types.
A Tcl_Obj is reference counted, and the allocator for it is very heavily
tuned.

Despite the name "Tcl_Obj", this structure has nothing to do with
[object-orientation%|%object-oriented programming] (that's what [TclOO] is
for).  The far more apt ''Tcl_Value'' was previously used for handling
user-defined `[expr]` functions, a now obsolete facility.  Think of Tcl_Obj as
a Tcl value with different clothes on. Depending on your needs, Tcl will
provide you with the Tcl value dressed differently.

[RS]:   thinks that the name is ok if one does not expect [OO] features, class
membership etc. Objects have been there long before OO, and the name is
certainly not under a monopoly (I'd object against that ;-). But the basic
feature of Tcl_Obj's is that they have a string representation and possibly a
problem-oriented one, but each can be regenerated from the other (also if you
define your own Obj types). If such type conversions occur frequently, this
costs performance - the so-called [shimmering] occurs. E.g. see what happens to
`$i` below:

======
for {set i 0} {$i < 10} {incr i} { #here we need the integer rep
    puts [string length $i]      ;#here the string rep..
    puts [llength $i]            ;# and here the list rep, so int rep goes away 
}
======

[CMcC]: I've put together a summary page of [Tcl_Objs] current for 8.4,
containing information culled from the source.

A `Tcl_Obj` is defined as a structure containing:

   `refCount` (`integer`):   The number of references to this Tcl_Obj.

   `bytes` (`char *`):   The Unicode string value, encoded in [utf-8%|%modified utf-8]. Although [everything is a string%|%each value is conceptually a string], actual generation of this value is delayed as long as possible to improve performance, so this field is often `NULL`.  When it is not `NULL`, it points to memory allocated by '''`Tcl_Alloc()`''', except that for an empty string `bytes` points to a static char in the Tcl library that holds a single `NUL` byte.

   `length` (`integer`):   The length of the string representation in `bytes` (minus the extra byte for the terminating NUL).

   `typePtr` (`Tcl_ObjType *`):   A pointer to the type of the object, a structure that provides the four fundamental operations which all Tcl_Obj instances implement.

   `internalRep` (`Tcl_ObjInternalRep`):   A value for internal use by the implementor of the object type.  Whatever puts this to use must discipline itself to conform to what the string representation would be, even if the string representation hasn't been generated.
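
To make the interplay of these fields concrete, here is a minimal, self-contained sketch in plain C.  It is ''not'' Tcl's implementation: `MiniObj` and the `Mini...` functions are hypothetical names, `hasInt` stands in for `typePtr`, and the only internal representation supported is a `long`.  It only illustrates the idea that `bytes` stays `NULL` until the string is actually requested, and that the internal representation can be derived from the string on demand.

======c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy value. Field names mirror the list above, but this is
 * not Tcl's Tcl_Obj, and it supports only one internal type (long). */
typedef struct MiniObj {
    int refCount;   /* number of references (not used in this sketch) */
    char *bytes;    /* string rep, or NULL if not generated yet */
    int length;     /* strlen(bytes) when bytes is not NULL */
    int hasInt;     /* stands in for typePtr: 1 if intValue is valid */
    long intValue;  /* stands in for internalRep */
} MiniObj;

/* Create a value that has only an internal representation. */
MiniObj *MiniNewInt(long v) {
    MiniObj *o = malloc(sizeof *o);
    if (o == NULL) abort();
    o->refCount = 0;
    o->bytes = NULL;
    o->length = 0;
    o->hasInt = 1;
    o->intValue = v;
    return o;
}

/* Create a value that has only a string representation. */
MiniObj *MiniNewString(const char *s) {
    MiniObj *o = malloc(sizeof *o);
    if (o == NULL) abort();
    o->refCount = 0;
    o->length = (int) strlen(s);
    o->bytes = malloc((size_t) o->length + 1);
    if (o->bytes == NULL) abort();
    memcpy(o->bytes, s, (size_t) o->length + 1);
    o->hasInt = 0;
    o->intValue = 0;
    return o;
}

/* Return the string rep, generating it from the internal rep on first use. */
const char *MiniGetString(MiniObj *o) {
    if (o->bytes == NULL) {
        char buf[32];
        int n = snprintf(buf, sizeof buf, "%ld", o->intValue);
        o->bytes = malloc((size_t) n + 1);
        if (o->bytes == NULL) abort();
        memcpy(o->bytes, buf, (size_t) n + 1);
        o->length = n;
    }
    return o->bytes;
}

/* Store the integer in *out and return 1, parsing the string rep if
 * needed. Return 0 if the string is not a decimal integer. */
int MiniGetInt(MiniObj *o, long *out) {
    if (!o->hasInt) {
        char *end;
        long v = strtol(o->bytes, &end, 10);
        if (end == o->bytes || *end != '\0') return 0;
        o->intValue = v;
        o->hasInt = 1;  /* cache the parse; the string rep stays valid */
    }
    *out = o->intValue;
    return 1;
}

/* Release both representations. */
void MiniFree(MiniObj *o) {
    free(o->bytes);
    free(o);
}
======

A value made with `MiniNewInt(42)` has `bytes == NULL` until `MiniGetString()` is called on it; a value made with `MiniNewString("17")` gains a cached integer the first time `MiniGetInt()` succeeds, and keeps its string.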

Each `Tcl_ObjType` structure contains the following four function pointers plus
a name.

   `freeIntRepProc`:   Frees the internal representation. NULL if nothing special is needed when the internal representation is cleaned up. 

   `dupIntRepProc`:   Copies the internal representation of the current Tcl_Obj into a new Tcl_Obj that is being made as its duplicate.  If it is NULL, Tcl simply uses memcpy to copy the whole `internalRep` structure.

   `updateStringProc`:   Updates the string representation from the internal representation.  ''(Not sure what NULL means for this; IME that's not an especially good idea. [DKF]: It's OK provided you never ever set the `bytes` field to NULL.)''

   `setFromAnyProc`:   Frees any existing internal representation, replacing it with a new internal representation for this type.  Returns TCL_ERROR on failure.  NULL indicates that objects of this type can't normally be created (typically because extra context is needed.)
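
The shape of such a type record can be sketched in plain C.  Everything below is a hypothetical toy (`ToyType`, `ToyObj`, `toyBoolType` and the helper functions are invented for illustration and are not part of the Tcl API); the real signatures are in the official reference linked above.  The sketch shows a type whose free and dup slots are NULL, so duplication falls back to copying the internal representation wholesale, and whose set-from-any function parses the string before discarding any previous internal representation.

======c
#include <stdlib.h>
#include <string.h>

struct ToyObj;

/* Hypothetical analogue of Tcl_ObjType: a name plus four operations. */
typedef struct ToyType {
    const char *name;
    void (*freeIntRepProc)(struct ToyObj *obj);                    /* may be NULL */
    void (*dupIntRepProc)(struct ToyObj *src, struct ToyObj *dup); /* may be NULL */
    void (*updateStringProc)(struct ToyObj *obj);
    int (*setFromAnyProc)(struct ToyObj *obj);   /* 0 on success, 1 on error */
} ToyType;

typedef struct ToyObj {
    char *bytes;             /* string rep, or NULL */
    const ToyType *typePtr;  /* NULL means no internal rep */
    union { long longValue; void *ptrValue; } internalRep;
} ToyObj;

char *ToyCopyString(const char *s) {
    char *p = malloc(strlen(s) + 1);
    if (p == NULL) abort();
    strcpy(p, s);
    return p;
}

/* Create a value with a string rep and no internal rep. */
ToyObj *ToyNewString(const char *s) {
    ToyObj *obj = malloc(sizeof *obj);
    if (obj == NULL) abort();
    obj->bytes = ToyCopyString(s);
    obj->typePtr = NULL;
    obj->internalRep.longValue = 0;
    return obj;
}

/* Return the string rep, asking the type to regenerate it if missing. */
const char *ToyGetString(ToyObj *obj) {
    if (obj->bytes == NULL) obj->typePtr->updateStringProc(obj);
    return obj->bytes;
}

/* Toy boolean type: the internal rep is 0 or 1 in longValue. */
void BoolUpdateString(ToyObj *obj) {
    obj->bytes = ToyCopyString(obj->internalRep.longValue ? "true" : "false");
}

int BoolSetFromAny(ToyObj *obj);

/* Nothing to free and nothing special to copy, so both slots are NULL. */
const ToyType toyBoolType = {
    "toyBoolean", NULL, NULL, BoolUpdateString, BoolSetFromAny
};

int BoolSetFromAny(ToyObj *obj) {
    const char *s = ToyGetString(obj);
    long v;
    if (strcmp(s, "true") == 0 || strcmp(s, "1") == 0) v = 1;
    else if (strcmp(s, "false") == 0 || strcmp(s, "0") == 0) v = 0;
    else return 1;  /* not a boolean: leave the value untouched */
    /* Discard the old internal rep only after the parse has succeeded. */
    if (obj->typePtr != NULL && obj->typePtr->freeIntRepProc != NULL) {
        obj->typePtr->freeIntRepProc(obj);
    }
    obj->typePtr = &toyBoolType;
    obj->internalRep.longValue = v;
    return 0;
}

/* Duplicate a value; a NULL dupIntRepProc means "copy the union as is". */
ToyObj *ToyDuplicate(ToyObj *src) {
    ToyObj *dup = malloc(sizeof *dup);
    if (dup == NULL) abort();
    dup->bytes = (src->bytes != NULL) ? ToyCopyString(src->bytes) : NULL;
    dup->typePtr = NULL;
    dup->internalRep.longValue = 0;
    if (src->typePtr != NULL) {
        if (src->typePtr->dupIntRepProc != NULL) {
            src->typePtr->dupIntRepProc(src, dup);
        } else {
            dup->internalRep = src->internalRep;
        }
        dup->typePtr = src->typePtr;
    }
    return dup;
}
======

In this toy, calling `BoolSetFromAny()` on a value whose string is `maybe` returns 1 and leaves `typePtr` as it was, which corresponds to the TCL_ERROR case described above.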



** Allocating a Tcl_Obj **

[DKF]: You ''must not'' allocate a `Tcl_Obj` manually.  ''Always'' call
`[Tcl_NewObj]` (or one of its close relatives, such as `Tcl_NewIntObj()`) to do
it for you. This is because Tcl uses a special memory management engine for
them that is tuned to be extra efficient -- useful because Tcl uses these
things ''a lot'' -- and that's only accessible through `Tcl_NewObj` (or some
wholly internal APIs that aren't exposed outside the Tcl library).



** Reference Counting **

[AMG]: In C extension code for Tcl you have several options with respect to reference counting:

   1. Manually invoke Tcl_IncrRefCount() on the Tcl_Objs you create.  This protects them from being freed, but you're also responsible for calling Tcl_DecrRefCount() or else they'll leak.
   1. Give your Tcl_Objs to something that increments their reference counts.  For example, put them in a Tcl variable, list, or dict.  Don't call Tcl_IncrRefCount() unless your code is also retaining pointers that you expect to be valid sometime in the future.
   1. Don't fuss with reference counting because you're not the one creating the Tcl_Objs.  This is the case when all you do is read arguments passed to your function which is implementing an extension command.
   1. Don't call Tcl_IncrRefCount() because you like to live dangerously and "know" that you're only passing your Tcl_Objs to things that won't pull the rug out from under you.  Call Tcl_DecrRefCount() when you're done, and the Tcl_Objs will be freed when their refcounts go negative just as surely as when they go zero.

There's no way Tcl can remotely zap your Tcl_Objs with nonpositive refcount unless you've passed your Tcl_Obj pointers to Tcl library functions.  Tcl doesn't keep a list of Tcl_Objs in existence, so it can't sweep.
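
The options above can be illustrated with a small stand-alone model in plain C.  This is a hedged sketch, not Tcl code: `RcObj`, `RcSlot`, `rcLiveObjects` and the `Rc...` functions are hypothetical, with `RcIncr()` and `RcDecr()` playing the roles of Tcl_IncrRefCount() and Tcl_DecrRefCount(), and `RcSlot` standing in for any container (variable, list, dict) that takes its own reference.  New values start with a reference count of zero, and a value is freed when a decrement leaves its count at zero or below.

======c
#include <stdlib.h>

/* Hypothetical counter of live values, used only to observe frees. */
int rcLiveObjects = 0;

typedef struct RcObj {
    int refCount;
} RcObj;

/* New values start with refCount 0: nobody owns them yet. */
RcObj *RcNew(void) {
    RcObj *o = malloc(sizeof *o);
    if (o == NULL) abort();
    o->refCount = 0;
    rcLiveObjects++;
    return o;
}

void RcIncr(RcObj *o) {
    o->refCount++;
}

/* Free the value when the count reaches zero or goes negative. */
void RcDecr(RcObj *o) {
    if (--o->refCount <= 0) {
        rcLiveObjects--;
        free(o);
    }
}

/* A container that keeps its own reference to whatever it holds. */
typedef struct RcSlot {
    RcObj *held;
} RcSlot;

void RcSlotSet(RcSlot *slot, RcObj *o) {
    RcIncr(o);  /* take the new reference first, in case o == slot->held */
    if (slot->held != NULL) RcDecr(slot->held);
    slot->held = o;
}

void RcSlotClear(RcSlot *slot) {
    if (slot->held != NULL) RcDecr(slot->held);
    slot->held = NULL;
}
======

In this model, handing a fresh value straight to `RcSlotSet()` corresponds to option 2: the slot owns the only reference, and the value is freed when the slot is overwritten or cleared.  Calling `RcIncr()` yourself before storing the value corresponds to option 1, and the value then survives until your matching `RcDecr()`.  Calling `RcDecr()` on a fresh value with a count of zero frees it immediately, which is the situation described in option 4.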



** Discarding the Internal Cached Interpretation of a Value **

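Up until some point, `[string length]` caused the internal cached
interpretation of a value to be discarded.  This is no longer the case in more
recent versions of Tcl.  The transcript below shows the older behaviour, in
which the int representation is replaced by a string representation: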

======none
% incr i
1
% ::tcl::unsupported::representation $i
value is a int with a refcount of 2, object pointer at 0x6000660e0, internal representation 0x1:0x600066320, string representation "1"
% string length $i
1
% ::tcl::unsupported::representation $i
value is a string with a refcount of 2, object pointer at 0x6000660e0, internal representation 0x6000ad6a0:0x600066320, string representation "1"
======

There's no guaranteed way to strip the internal representation from a Tcl_Obj,
but a new Tcl_Obj can of course be created.  To avoid any potential
optimization to `[string range]` that might make this ineffective, use a
slightly more complicated technique:

======
set var [string index $var 0][string range $var[set var {}] 1 end]
======



** Threads **

Data members of `Tcl_Obj`, particularly `internalRep`, can be mutated, so a
`Tcl_Obj` should be exclusively owned by one thread.  See
[https://core.tcl.tk/tcl/tktview/17f747a4a4720f8e9797e7933dd43745b73e0e2c%|%Thread
safety in tclZipfs.c].



** Nested Tcl_Obj Structures **

[PYK] 2018-05-12:  Sometimes a `Tcl_Obj` is stored in the internal
representation of another `Tcl_Obj`.  This can lead to tricky issues such as
[http://core.tcl.tk/tcl/tktview?name=80304238ac%|%this memory leak] in
`[foreach]`.  [https://core.tcl.tk/tcl/info/7070d2aa2222bc5c%|%The fix]
involved clearing the internal representation of the nested `Tcl_Obj`, but
couldn't something else set the internal representation back to a problematic
value again?



<<categories>> Concept | Internals | Tcl Library