Tcl_NewByteArrayObj

Tcl_Obj * Tcl_NewByteArrayObj(CONST unsigned char *bytes, int length)

Tcl_Obj * Tcl_NewByteArrayObj(bytes, length)

Tcl_NewByteArrayObj will create a new object of byte-array type. Both of these procedures set the object's type to be byte-array and set the object's internal representation to a copy of the array of bytes given by bytes. Tcl_NewByteArrayObj returns a pointer to a newly allocated object with a reference count of zero.

https://www.tcl-lang.org/man/tcl8.5/TclLib/ByteArrObj.htm


HaO: To populate a byte array by a C extension and return it as the result object, the following code might be used:

Tcl_Obj *pObj = Tcl_NewObj();
unsigned char *pChar = Tcl_SetByteArrayLength(pObj, 3);

// Dummy population functionality
pChar[0] = '\1'; pChar[1] = '\xff'; pChar[2] = '\x80';
Tcl_InvalidateStringRep(pObj);
Tcl_SetObjResult(interp,pObj); 

Remarks:

  • Alexandre Ferrieux: Tcl_NewObj() may be replaced by Tcl_NewByteArrayObj( "", 1 ).
  • DGP: If the object is not newly created (for example a parameter object is modified, one should check the object to be shared:
if ( Tcl_IsShared(pObj) )
    pObj = Tcl_DuplicateObj(pObj);

DKF: I don't recommend using a length of 0 to Tcl_NewByteArrayObj; the behaviour of that depends on whether malloc(0) keels over (which is system dependent).

Duoas That seems me fairly obnoxious. Doesn't the Tcl_NewByteArrayObj() code know that it oughtn't bother to try allocating zero bytes, and just initialize an empty/nil internal representation to begin with? [edit2] For that matter, would it be reasonable to specify NULL for source and a non-zero length for an uninitialized block of bytes?

DKF: I merely report the current state of affairs.

Duoas :-D Perhaps I'll get the latest CVS and fix that sometime in the next couple of days.

Duoas 2008-10-31 Well, I've looked at the source (~/tcl/generic/tclBinary.c) and this is what I've found.

The documentation plainly states that 'length' must be greater than or equal to zero. Assuming no one ever passes a negative number, none of the functions will break. Zero is fine --malloc() will not get a zero size_t for a zero-length byte array, so it is malloc()-safe (remembering again not to give it negative values).

Frankly, I think I would insert a line above 327 that forces 'length' to be >= 0.

As for a NULL 'bytes' argument, if 'length' is zero, there is no problem. However, if 'length' is non-zero, then you will almost surely cause a segfault/access violation. Again, I would insert a little line above 330:331 just for that:

void
Tcl_SetByteArrayObj(
    Tcl_Obj *objPtr,                /* Object to initialize as a ByteArray. */
    const unsigned char *bytes,        /* The array of bytes to use as the new
                                 * value. */
    int length)                        /* Length of the array of bytes, which must be
                                 * >= 0. */
{
    ByteArray *byteArrayPtr;

    if (Tcl_IsShared(objPtr)) {
        Tcl_Panic("%s called with shared object", "Tcl_SetByteArrayObj");
    }
    TclFreeIntRep(objPtr);
    Tcl_InvalidateStringRep(objPtr);

    length = (length < 0) ? 0 : length;
    byteArrayPtr = (ByteArray *) ckalloc(BYTEARRAY_SIZE(length));
    byteArrayPtr->used = length;
    byteArrayPtr->allocated = length;
    if (bytes && length) {
        memcpy(byteArrayPtr->bytes, bytes, (size_t) length);
    }

    objPtr->typePtr = &tclByteArrayType;
    SET_BYTEARRAY(objPtr, byteArrayPtr);
}

These two simple changes make the routine idiot-proof --er-- error-proof, and improve the functionality to allow creating a new ByteArray object with an uninitialized 'length' of bytes:

  Tcl_Obj *myByteArray = Tcl_NewByteArray( NULL, 1024 );
  unsigned char *my1024 = Tcl_GetByteArrayFromObj( myByteArray, NULL );

Finally, I would have Tcl_GetByteArrayFromObj() return NULL if the ByteArray is zero-length.

unsigned char *
Tcl_GetByteArrayFromObj(
    Tcl_Obj *objPtr,                /* The ByteArray object. */
    int *lengthPtr)                /* If non-NULL, filled with length of the
                                 * array of bytes in the ByteArray object. */
{
    ByteArray *baPtr;

    if (objPtr->typePtr != &tclByteArrayType) {
        SetByteArrayFromAny(NULL, objPtr);
    }
    baPtr = GET_BYTEARRAY(objPtr);

    if (lengthPtr != NULL) {
        *lengthPtr = baPtr->used;
    }
    if (baPtr->used == 0) {  /* edit: fixed! */
        return NULL;
    }
    return (unsigned char *) baPtr->bytes;
}

But this is not essential... assuming the user is smart enough to actually verify that there is data before reading/writing it.

If you wan't I'll actually submit a patch, but as this only requires adding six lines...


Alexandre Ferrieux I think using NULL this way is really neat. But why bother with negative lengths at this point ? I mean, enforcing obvious contracts is an all-or-none decision. Checking all preconditions like that can lead to adding 10,000 lines of code to existing Tcl... It may be useful at some spots, because it allows early detection, but here it's just one nanosecond before ;-) Moreover, maybe in the future someone will find a neat semantics for negative values like you just did with the NULL pointer (eg to indicate a buffer to externally-allocated memory like an mmap(), that shouldn't be ckfree()ed on object disposal)...

Duoas I disagree. Bitflags deserve their own argument. It is bad practice to mangle/obscure them into some other value. Make things what they claim to be, and nothing more or less. If the core gets a future change, it can get an updated argument list. Legacy code can compile just fine with the right #defines included to provide the missing arguments.

The problem is that the actual argument value is not necessarily a constant (though I imagine it usually is). And if you are calculating a value, you must at some point verify that it is >= 0. I prefer to do that in one spot. Besides, Tcl only has around 270,000 lines of code for all platforms combined (including the 13,000 for the pure-Tcl parts). Having an occasional check for such things won't cause anywhere near as much bloat as 10,000 lines of code. That would be an additional line of code for every 27 --and there simply isn't that much to worry about in the core --it is already very-well coded and tested.

There is always a trade-off between error testing and a nice, small, fast library. When you have just a few careful programmers who know bad vs good argument values (because they know the inner workings of the API) and only pass-in the latter, then it is easier to justify those kinds of shortcuts. However, IMHO it is still always better to code against small errors that could segfault the core than to simply assume that everyone will always get it right. Really nasty bugs are made of such. Tcl has a very complete, powerful test suite. But it can't test everything, or prevent a programmer from mistakenly sending in a death code. It didn't test trying to set a ByteArray length to -17. (Try it. ;-> ) Further, Tcl already has a lot of testing in there to validate stuff and provide an appropriate response for really bad errors. A user who gets an apparently random access violation and core dump will be a lot less happy/helpful than one who can say, "hey, your app crashed with the message Q when I pressed button X".

The NULL trick is really just a convenience for

  Tcl_Obj *ba = Tcl_NewByteArrayObj( NULL, 0 );
  Tcl_SetByteArrayLength( ba, n );

with the added benefit of needing much less background machinery to do the job. Technically, though, it diverges from the current documentation, which says that 'length' is "The length of the array of bytes. It must be >= 0.", where "the array of bytes" refers to both the 'bytes' argument and the length of the ByteArray object's newly allocated bytes.

My $0.02