Useful C Commands For Tcl

APN Note: In 8.5 and later, the core Tcl_ObjPrintf function provides the same functionality.

RHS 15Nov2004

A sprintf-like command that returns a Tcl_Obj. I use it to do things like:

 Tcl_ListObjAppendElement(interp, bcListObj, RHS_MakeStringObj("The offset: %d", pcOffset));

I'm sure there are places in the code that would run into problems when not using gcc, but that's all I use. If people want to improve this, that would be fantastic. Now, for the actual code...

 // This is coded specifically for glibc
 #include <stdarg.h>
 #include <stdio.h>
 #include <string.h>
 #include <tcl.h>

 Tcl_Obj *
 RHS_MakeStringObj(const char *fmt, ...) {
    /* Guess we need no more than 100 bytes. */
    int n, size = 100;
    char *p;
    Tcl_Obj *retObj;

    va_list ap;
    p = ckalloc(size);
    while (1) {
        /* Try to print in the allocated space. */
        va_start(ap, fmt);
        n = vsnprintf(p, size, fmt, ap);
        va_end(ap);
        /* If that worked, return the string. */
        if (n > -1 && n < size) {
            retObj = Tcl_NewStringObj(p, strlen(p));
            ckfree(p);
            return  retObj;
        }
        /* Else try again with more space. */
        if (n > -1) {   /* glibc 2.1 */
            size = n+1; /* precisely what is needed */
        } else {        /* glibc 2.0 */
            size *= 2;  /* twice the old size */
        }
        p = ckrealloc(p, size);
    }
 }

RHS It appears that the Microsoft compiler uses the name _vsnprintf rather than vsnprintf. So, if you're using MSVC, you may need to change the name.
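One way to hide the name difference is a small preprocessor shim, sketched below. The helper name format_into is hypothetical. The version test is an assumption too: as far as I know, Visual Studio 2015 (_MSC_VER 1900) and later ship a conforming vsnprintf, so only older compilers get the macro. The older _vsnprintf does not write a terminating NUL when the output is truncated, which is why the sketch terminates the buffer itself.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Assumption: MSVC before Visual Studio 2015 only offers _vsnprintf. */
#if defined(_MSC_VER) && _MSC_VER < 1900
#   define vsnprintf _vsnprintf
#endif

/*
 * Hypothetical helper: format into a caller-supplied buffer and return
 * whatever the underlying vsnprintf returned.
 */
int
format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && size > 0);

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    /* _vsnprintf leaves the buffer unterminated on truncation. */
    buf[size - 1] = '\0';
    return n;
}
```

The return value still differs between runtimes when the buffer is too small (negative on the old MSVC function, the required length under C99), so a retry loop like the one above has to handle both cases.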

From chat:

 <patthoyts>        In fact, on BSD it looks like you need stdio.h and stdarg.h

MAK Since you mention gcc, I suggest:

 #ifndef __GNUC__
 #   define  __attribute__(x)
 #endif

 Tcl_Obj *
 RHS_MakeStringObj(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
 ...

This tells gcc that the form of the arguments is the same as a printf and that it should check to make sure that the number of arguments and their types are appropriate given the format string, and issue errors/warnings if not, just as if you mess up a printf() call.
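A minimal, Tcl-free sketch of the same idea follows. The function log_line is hypothetical and exists only to show where the attribute goes and how the two numbers are chosen: the first is the position of the format string among the parameters, the second is the position of the first argument to check.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#ifndef __GNUC__
#   define __attribute__(x)   /* expands to nothing on other compilers */
#endif

/*
 * Hypothetical wrapper.  The attribute is attached to the declaration:
 * parameter 3 is the format string, checking starts at parameter 4.
 */
int log_line(char *buf, size_t size, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

int
log_line(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && size > 0);

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

With this declaration in scope, a call such as log_line(buf, sizeof buf, "%d", "oops") should draw a format warning from gcc when -Wall or -Wformat is enabled.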

(Btw, that's great. It would be cool to TIP that into the core and use it for [L1 ].)

RHS 16Nov2004 Having thought about what it would take to add such a thing to the core, I believe the best option is to write code that calculates the size of the string needed to store the result of printf-style input. With that, the above code could be written simply by calculating the size needed, malloc'ing it, and using sprintf to write to it, then creating the Tcl_Obj from that string. This would free us from having to use vsnprintf, which is not particularly portable. I'll put some free time (what little I have, as I just bought a house) into writing such a beast. If anyone already has such code and is willing to release it under the Tcl license, that would simplify things :)
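For comparison, the "measure, allocate, then format" shape can be sketched with the C library doing the measuring. This is not the hand-written length calculator described above: it assumes a C99-conforming vsnprintf, which reports the required length when given a zero-size buffer, and that is exactly the portability assumption RHS wants to avoid. The name alloc_printf is hypothetical, and plain malloc stands in for ckalloc so the sketch compiles without Tcl.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns a malloc'ed, NUL-terminated string, or
 * NULL on failure.  The caller frees the result.
 */
char *
alloc_printf(const char *fmt, ...)
{
    va_list ap;
    int len;
    char *buf;

    assert(fmt != NULL);

    /* Pass 1: a zero-size buffer makes C99 vsnprintf only report the length. */
    va_start(ap, fmt);
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0) {
        return NULL;
    }

    buf = malloc((size_t)len + 1);
    if (buf == NULL) {
        return NULL;
    }

    /* Pass 2: the argument list must be restarted before it is read again. */
    va_start(ap, fmt);
    vsnprintf(buf, (size_t)len + 1, fmt, ap);
    va_end(ap);
    return buf;
}
```

A hand-written calculator replaces only the first pass; the allocate-and-format second half keeps the same shape.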

MAK I'd imagine you could use Tcl_FormatObjCmd() as a starting point.

RHS Tcl_FormatObjCmd() handles a bit more than I had planned on implementing. Specifically, it handles XPG3 positional arguments, which I hadn't planned on handling (as far as I can tell, they'd be a royal pain to deal with using varargs). There are a couple of simple implementations of "take format string and values and figure out the length" out there, and I was planning on going that route.


RHS 18Nov2004 Ok, I wrote some code to calculate the length of a string given its format string and values. It takes a lot of shortcuts in the pursuit of speed (like, all floats are 320 chars long, all ints are 21 chars long, etc). It seems to work ok, and it should be considerably more cross-platform than the above code, since it no longer has to rely on vsnprintf. It should understand all the ANSI C specifications for printf formatting.

 #include "tcl.h"
 #include "tclInt.h"
 #include "tclPlatDecls.h"
 #include "tclDecls.h"
 #include "tclCompile.h"

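 #include <float.h>
 #include <limits.h>
 #include <stdarg.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>

 /* Arguments are parenthesized so that expressions can be passed safely */
 #define max(x,y) ((x) < (y) ? (y) : (x))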

 #define SIZE_INT 20  /* %d Signed decimal integer: +/- + 19 chars */
 #define SIZE_UINT 21 /* %u Unsigned decimal integer: "+" + 20 chars */
 #define SIZE_IINT 21 /* %i Signed decimal string: +/- + 20 chars */
 #define SIZE_OINT 23 /* %o Unsigned octal string: 0 + 22 chars */
 #define SIZE_XINT 18 /* %xX Unsigned hexadecimal string: "0x" + 16 chars */
 #define SIZE_CHAR 3  /* %c Unicode char */
 #define SIZE_DOUBLE 320 /* %f Float: up to 316 chars */
 /* Should we define a SIZE_LONG_DOUBLE for modifier 'L'? */
 #define SIZE_POINTER SIZE_XINT /* %p Pointer: same as hex "0x" + 16 chars */

 #ifndef __GNUC__
 #   define  __attribute__(x)
 #endif

 Tcl_Obj *RHS_MakeStringObj(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
 int Tcl_PrintfLength(const char *fmt, ...) __attribute__((format(printf, 1, 2)));
 int Tcl_VAPrintfLength(const char *fmt, va_list ap);


 Tcl_Obj *
 RHS_MakeStringObj(const char *fmt, ...) {
    int size;
    va_list ap;
    char *p;
    Tcl_Obj *retObj;

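    /* First pass: walk the arguments to get an upper bound on the size. */
    va_start(ap, fmt);
    size = Tcl_VAPrintfLength(fmt, ap);
    va_end(ap);

    p = (char *)ckalloc(size*sizeof(char));
    if(p == NULL) {
        Tcl_Panic("Failed to allocate memory for string");
    }

    /* Second pass: a va_list that has been read must be restarted. */
    va_start(ap, fmt);
    vsprintf(p, fmt, ap);
    va_end(ap);

    retObj = Tcl_NewStringObj(p, strlen(p));
    ckfree(p);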
    return retObj;
 }

 /*
  *  Between % and format conversion character:
  *    Flags:
  *      - : left adjust 
  *      + : always sign 
  *      space : space if no sign 
  *      0 : zero pad 
  *      # : Alternate form: for conversion character o, first digit will be zero, for xX, prefix 0x or 0X to non-zero, for eEfgG, always decimal point, for gG trailing zeros not removed.
  *    Width:
  *    Period:
  *    Precision: for conversion character s, maximum characters to be printed from the string, for eEf, digits after decimal point, for gG, significant digits, for an integer, minimum number of digits to be printed.
  *    Length modifier:
  *      h : short or unsigned short 
  *      l : long or unsigned long 
  *      L : long double 
  *
  *   Conversions:
  *   d, i : int; signed decimal notation 
  *   o : int; unsigned octal notation 
  *   x,X : int; unsigned hexadecimal notation 
  *   u : int; unsigned decimal notation 
  *   c : int; single character 
  *   s : char*; 
  *   f : double; -mmm.ddd 
  *   e,E : double; -m.dddddde(+|-)xx 
  *   g,G : double 
  *   p : void*; print as pointer 
  *   n : int*; number of chars written into arg (no output for this)
  *   % : print % 
  */

 int
 Tcl_PrintfLength(const char *fmt, ...) {
    int size;
    va_list ap;

    va_start(ap, fmt);
    size = Tcl_VAPrintfLength(fmt, ap);
    va_end(ap);

    return size;
 }

 int
 Tcl_VAPrintfLength(const char *fmt, va_list ap) {
    int total;
    char *ptr = (char *)fmt;
    char *tmp;
    int minsize = 0;

    /* Start with the length of the format string + 1 for the \0 */
    total = strlen(fmt) + 1;

    /* Iterate over the characters in the format string */
    while(*ptr != '\0') {
        if(*ptr++ == '%') {  /* It's a format starter */
            if(*ptr == '%') { /* It's just a percent sign */
                total--;
                ptr++;
                continue;
            }
            /* Handle the "non-sizing" modifiers.  The '\0' test matters:  */
            /* strchr() also matches the terminator of its first argument. */
            while( *ptr != '\0' && strchr("-+ 0#", *ptr) != NULL ) {
                ptr++;
                total--;
            }
            
            /* Next has to be a size modifier, if it's a number */
            tmp = ptr;
            minsize = strtoul(ptr, (char **) &ptr, 10);
            total -= ptr - tmp;

            /* Ok, now the "after the decimal" part */
            /* Here, we just add it to the total    */
            /* we don't modify the min size         */
            if ( *ptr == '.' ) {
                ptr++;
                tmp = ptr;
                total += strtoul(ptr, (char **) &ptr, 10);
                total -= (ptr - tmp) +1;
            }

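            /* Last before format specifier is the length modifier.       */
            /* It is only skipped: the argument is still fetched below as */
            /* int or double, whatever the modifier says.                 */
            while ( *ptr != '\0' && strchr("hlL", *ptr) != NULL ) {
                ptr++;
                total--;
            }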

            switch(*ptr) {
            case 'd':
                total += max(minsize, SIZE_INT) - 2;
                va_arg(ap, int);
                break;
            case 'u':
                total += max(minsize, SIZE_UINT) - 2;
                va_arg(ap, int);
                break;
            case 'i':
                total += max(minsize, SIZE_IINT) - 2;
                va_arg(ap, int);
                break;
 #ifdef _MSC_VER
            case 'I':
                total += max(minsize, SIZE_IINT) - 2;
                va_arg(ap, __int64);
                break;
 #endif
            case 'o':
                total += max(minsize, SIZE_OINT) - 2;
                va_arg(ap, int);
                break;
            case 'x':
            case 'X':
                total += max(minsize, SIZE_XINT) - 2;
                va_arg(ap, int);
                break;
            case 'c':
                total += max(minsize, SIZE_CHAR) - 2;
                va_arg(ap, int);
                break;
            case 's':
                tmp = (char *)va_arg(ap, char *);
                total += max(minsize, strlen(tmp)) -2;
                break;
            case 'f':
                total += max(minsize, SIZE_DOUBLE) - 2;
                va_arg(ap, double);
                break;
            case 'e':
            case 'E':
                /* This is a cheat, we could come up with a better size */
                total += max(minsize, SIZE_DOUBLE) - 2;
                va_arg(ap, double);
                break;
            case 'g':
            case 'G':
                /* This is a cheat, we could come up with a better size */
                total += max(minsize, SIZE_DOUBLE) - 2;
                va_arg(ap, double);
                break;
            case 'p':
                /* This is a cheat, we could come up with a better size */
                total += max(minsize, SIZE_POINTER) - 2;
                va_arg(ap, void *);
                break;
            }
        }
    }

    if(total <= 0) {
        total = 1;
    }
    return total;
 }

 #ifdef TEST

 void testme(char *name, int expect, int actual) {
    if(expect == actual) {
        printf("%-15s\tPASSED\n", name);
    } else {
        printf("%-15s\tFAILED (%d != %d)\n", name, expect, actual);
    }

 }

 int main(int argc, char **argv)
 {
    char tmp[4000];
    Tcl_Obj *testObj;

    /* ****************************************/
    /* Size Checks                            */
    /* ****************************************/
    sprintf(tmp, "%f", 1.79769313486231570e+308);
    sprintf(tmp, "%f", DBL_MAX);
    printf("Testing max %%f (%d): too big to bother showing\n", (int)strlen(tmp));
    if(SIZE_DOUBLE < strlen(tmp)) {
        printf("SIZE_DOUBLE is too small\n");
    }
    sprintf(tmp, "%f", DBL_MIN);
    printf("Testing min %%f (%d): too big to bother showing\n", (int)strlen(tmp));
    if(SIZE_DOUBLE < strlen(tmp)) {
        printf("SIZE_DOUBLE is too small\n");
    }

    sprintf(tmp, "%+ld", LONG_MAX);
    printf("Testing max %%ld (%d): %s\n", (int)strlen(tmp), tmp);
    if(SIZE_INT < strlen(tmp)) {
        printf("SIZE_INT is too small\n");
    }
    sprintf(tmp, "%+ld", LONG_MIN);
    printf("Testing min %%ld (%d): %s\n", (int)strlen(tmp), tmp);
    if(SIZE_INT < strlen(tmp)) {
        printf("SIZE_INT is too small\n");
    }

    sprintf(tmp, "%lu", ULONG_MAX);
    printf("Testing max %%lu (%d): %s\n", (int)strlen(tmp), tmp);
    if(SIZE_UINT < strlen(tmp)) {
        printf("SIZE_UINT is too small\n");
    }

    sprintf(tmp, "%+li", LONG_MAX);
    printf("Testing %%+li (%d): %s\n", (int)strlen(tmp), tmp);
    if(SIZE_IINT < strlen(tmp)) {
        printf("SIZE_IINT is too small\n");
    }

    sprintf(tmp, "%#lo", ULONG_MAX);
    printf("Testing %%#lo (%d): %s\n", (int)strlen(tmp), tmp);
    if(SIZE_OINT < strlen(tmp)) {
        printf("SIZE_OINT is too small\n");
    }

    sprintf(tmp, "%#lx", ULONG_MAX);
    printf("Testing %%#lx (%d): %s\n", (int)strlen(tmp), tmp);
    if(SIZE_XINT < strlen(tmp)) {
        printf("SIZE_XINT is too small\n");
    }

    /* %p expects a void *; the formatted text is then shown with %s */
    sprintf(tmp, "%p", (void *)NULL);
    printf("Testing %%p (%d): %s\n", (int)strlen(tmp), tmp);
    if(SIZE_POINTER < strlen(tmp)) {
        printf("SIZE_POINTER is too small\n");
    }

    /* ****************************************/
    /* Tests                                  */
    /* ****************************************/
    testme("no format", 10, Tcl_PrintfLength("no format"));
    testme("no string", 1, Tcl_PrintfLength(""));
    testme("percent", 12, Tcl_PrintfLength("this %% sign"));

    // Non-sizing modifiers
    testme("left align", SIZE_INT+1, Tcl_PrintfLength("%-d", 10));
    testme("signed", SIZE_INT+1, Tcl_PrintfLength("%+d", 10));
    testme("spaced", SIZE_INT+1, Tcl_PrintfLength("% d", 10));
    testme("0 padded", SIZE_INT+1, Tcl_PrintfLength("%0d", 10));
    testme("alt output", SIZE_XINT+1, Tcl_PrintfLength("%#x", 10));
    
    // Base types
    testme("signed int", SIZE_INT+1, Tcl_PrintfLength("%d", 50));
    testme("unsigned int", SIZE_UINT+1, Tcl_PrintfLength("%u", 50));
    testme("signed decimal", SIZE_IINT+1, Tcl_PrintfLength("%i", 50));
    testme("unsigned octal", SIZE_OINT+1, Tcl_PrintfLength("%o", 50));
    testme("unsigned hex", SIZE_XINT+1, Tcl_PrintfLength("%x", 50));
    testme("unsigned heX", SIZE_XINT+1, Tcl_PrintfLength("%X", 50));
    testme("char", SIZE_CHAR+1, Tcl_PrintfLength("%c", 50));
    testme("string", 9+1, Tcl_PrintfLength("%s", "my string"));
    testme("double f", SIZE_DOUBLE+1, Tcl_PrintfLength("%f", 50.0));
    testme("double e", SIZE_DOUBLE+1, Tcl_PrintfLength("%e", 50.1));
    testme("double g", SIZE_DOUBLE+1, Tcl_PrintfLength("%g", 50.2));
    testme("pointer", SIZE_POINTER+1, Tcl_PrintfLength("%p", (void *)NULL));

    // Length modifiers (which don't actually change anything)
    testme("short", SIZE_INT+1, Tcl_PrintfLength("%hd", 101));
    testme("long", SIZE_INT+1, Tcl_PrintfLength("%ld", 101l));
    testme("unsigned short", SIZE_UINT+1, Tcl_PrintfLength("%hu", 101));
    testme("unsigned long", SIZE_UINT+1, Tcl_PrintfLength("%lu", 101L));
    testme("long double", SIZE_DOUBLE+1, Tcl_PrintfLength("%Lf", 123.345L));
    
    testme("string", 21, Tcl_PrintfLength("Here %s There", "my string"));

    // Size modifier
    testme("size int <", SIZE_INT+1, Tcl_PrintfLength("%20d", 15));
    testme("size int >", 40+1, Tcl_PrintfLength("%40d", 15));

    // Float size modifier
    testme("size double", SIZE_DOUBLE+3+1, Tcl_PrintfLength("%.3f", 12.5678901));


    // Tcl Strings
    testObj = RHS_MakeStringObj("My name is %s and this may be line %d\n",
                                "RHS", 316);
    printf("%s\n", Tcl_GetStringFromObj(testObj, NULL));

    return 0;
 }


 #endif

MAK (2 Jan 2005) - I just happened to remember this while modifying some code where it'd be useful. If I might make another suggestion:

 Tcl_Obj *
 RHS_MakeStringObj(const char *fmt, ...) {
    int size;
    va_list ap;
    Tcl_Obj *retObj = Tcl_NewObj();

    va_start(ap, fmt);
    size = Tcl_VAPrintfLength(fmt, ap);
    va_end(ap);

    if (Tcl_AttemptSetObjLength(retObj, size + 1)) {
        /* Restart the argument list: the length pass has consumed it. */
        va_start(ap, fmt);
        size = vsnprintf(Tcl_GetString(retObj), size, fmt, ap);
        va_end(ap);
        Tcl_SetObjLength(retObj, size);
    }

    return retObj;
 }

...although I'm not 100% certain about the safety of printing to the pointer returned by Tcl_GetString (if it were in the core it wouldn't need to, of course; but hey, it's not constified :P). It still passes the tests and doesn't crash, at least. ;) The purpose here, if it isn't obvious, is to (1) avoid panicking and (2) be more efficient by allocating buffer space only once and avoiding copying. Mainly the latter. If you truly want it to panic, you can just use Tcl_SetObjLength() instead of Tcl_AttemptSetObjLength(). I just changed it to vsnprintf for double the safety, but it may be overkill.


MAK (15 Feb 2005) I fixed a bug in the above. RHS_MakeStringObj() and Tcl_PrintfLength() were both variadic functions, and RHS_MakeStringObj() was passing a va_list in for the ... part. This would cause it to calculate the length wrong (too short) with just a string format. This would cause a buffer overrun! Tcl_PrintfLength() is now split into Tcl_VAPrintfLength() to take a va_list as an argument, and RHS_MakeStringObj() modified to use it instead.
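The usual shape of that fix, sketched without Tcl: the real work lives in a function that takes a va_list, and each variadic entry point is a thin wrapper around it. All three names below are hypothetical, and a C99 vsnprintf stands in for the length calculation so the sketch stays short.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* The va_list flavour does the real work. */
int
fmt_vlength(const char *fmt, va_list ap)
{
    assert(fmt != NULL);
    /* Assumes C99: a zero-size buffer only reports the formatted length. */
    return vsnprintf(NULL, 0, fmt, ap);
}

/* The variadic flavour is a thin wrapper around it. */
int
fmt_length(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = fmt_vlength(fmt, ap);
    va_end(ap);
    return n;
}

/*
 * Another variadic function forwards its arguments with the va_list
 * flavour.  Calling fmt_length(fmt, ap) here would pass the va_list as
 * a single ordinary argument, which is the bug described above.
 */
int
padded_length(int pad, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = fmt_vlength(fmt, ap);
    va_end(ap);
    return n + pad;
}
```

The compiler cannot catch the wrong call on its own, because a va_list is an acceptable value for a "..." parameter; the printf format attribute helps only when the format string is a literal.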


MAK (25 Feb 2005) Traced another crash to Tcl_VAPrintfLength(). It wasn't handling VC++'s formatting codes for 64 bit integers ("%I64d" and "%I64u"), and has been fixed above.


DKF 7-Sep-2005: Doing something like this internally for the core would be possible, and would make a number of places in the core simpler. Also for errorInfo.


SG (2006Aug18) I tweaked the above code a bit to create an eval function that takes a printf-style string, substitutes the arguments in, and then evaluates it.

 Tcl_Obj *
 _PrintfMakeStringObj(const char *fmt, va_list ap )
 {
   size_t size;
   char *p;
   Tcl_Obj *retObj;
   va_list lenAp;

   /* Measure on a copy: a va_list that has been read cannot be reused. */
   va_copy(lenAp, ap);
   size = _VAPrintfLength(fmt, lenAp);
   va_end(lenAp);

   p = (char *)ckalloc((unsigned int)size*sizeof(char));
   if(p == NULL) {
     Tcl_Panic("Failed to allocate memory for string");
   }

   _vsnprintf(p, size, fmt, ap);

   retObj = Tcl_NewStringObj(p, (int)strlen(p));
   ckfree(p);
   return retObj;
 }

 int MyTcl_PrintfEval( Tcl_Interp *interp, unsigned int flags,
                      const char *fmt, ... )
 {
   va_list ap;
   int result;
   Tcl_Obj *cmd;

   va_start(ap, fmt);
   cmd = _PrintfMakeStringObj( fmt, ap );
   va_end(ap);  /* va_end belongs in the function that called va_start */
   Tcl_IncrRefCount( cmd );
   result = Tcl_EvalObjEx( interp, cmd, flags | TCL_EVAL_DIRECT );
   Tcl_DecrRefCount( cmd );

   return result;
 }

This lets you write code like:

 MyTcl_PrintfEval( interp, 0, "lindex $mylist %d", foo() );

I did have to make some minor mods to the above code to create a va_list interface, but it's all pretty mechanical.

I'm surprised something like this doesn't already exist in the core. I haven't looked at the core code in any depth, but I would think this sort of operation happens all over the place.