Version 20 of tcc4tcl

Updated 2019-06-11 19:30:30 by AMG

tcc4tcl (Tiny C Compiler for Tcl) is a Tcl extension that provides an interface to TCC.

It is a fork of tcltcc by Mark Janssen.

It is licensed under the terms of the LGPL v2.1 (or later).


tcc version

AMG: The current version of tcc4tcl (0.30) directly includes a patched copy of tcc 0.9.26 from early 2013. The most recent version of tcc is 0.9.27 from late 2017. I would very much like to upgrade to 0.9.27 because of the many improvements, but doing so breaks virtually all the tcc patches found in [L1 ].

Access to more symbols

AMG: The C code has access to Tcl and Tk symbols just fine, but other than that, the only available symbols are: printf, fprintf, fopen, and fclose. These are listed in tcc_syms in tccrun.c. It's clear that this list is present as only an example of how to add symbols.

I'm very much wishing for the ability to extend this list without editing the tcc source. In fact, many of the symbols I want to access may not even be available during the compilation and linking of tcc4tcl since they'll come from other dynamically loaded extension libraries. Mostly SQLite.

I tried improving the add_symbol command to do dynamic symbol lookups using Tcl_LoadFile() when the value argument is non-numeric.

However, it's kind of a dead end. Let's say I want to access vsnprintf from libc. Dynamically loading segfaults, but shouldn't I just be able to use the symbol already present in the Tcl interpreter? If I were using dlsym() directly, I'd just say dlsym(RTLD_DEFAULT, "vsnprintf") and away we go. Or I would use dlopen(NULL, 0) to get a handle for the current process to use in place of RTLD_DEFAULT. Unfortunately, the Tcl loading API does not appear to expose RTLD_DEFAULT or dlopen(NULL). Instead, it insists on loading a library. Even if the Tcl loading API were extended to have these features, how would things work on Windows? GetProcAddress [L2 ] and friends have the same restrictions.

If I pass info nameofexecutable as the library name, it fails as follows:

% package require tcc4tcl
% set tcc [tcc4tcl::new]
% $tcc add_symbol vsnprintf [info nameofexecutable]
{vsnprintf /home/ascv/kitcreator-0.11.1/tclkit-8.6.9}
% $tcc go
couldn't load file "/tmp/tcl_OePPHZ": /tmp/tcl_OePPHZ: file too short

This would be a very powerful feature, if we can figure out how to make it work.

Here's the patch. Warning: tabs!

diff -ur tcc4tcl-0.30~/tcc4tcl.c buildsrc/tcc4tcl.c
--- tcc4tcl-0.30~/tcc4tcl.c        2019-06-11 12:25:16.849343862 -0500
+++ tcc4tcl-0.30/tcc4tcl.c        2019-06-11 13:56:44.171708319 -0500
@@ -57,7 +57,9 @@
 static int Tcc4tclHandleCmd ( ClientData cdata, Tcl_Interp *interp, int objc, Tcl_Obj * CONST objv[]){
+        Tcl_LoadHandle loadHandle;
         Tcl_WideInt val;
+        Tcl_Obj *expr_o;
         Tcl_Obj *val_o;
         void *val_p;
         int index;
@@ -128,24 +130,55 @@
                 return TCL_OK;
         case TCC4TCL_ADD_SYMBOL:
-            if (objc != 4) {
-                Tcl_WrongNumArgs(interp, 2, objv, "symbol value");
+            if (objc != 4 && objc != 5) {
+                Tcl_WrongNumArgs(interp, 2, objv,
+                        "symbol value|(?librarySymbol? libraryName)");
                 return TCL_ERROR;
-            rv = Tcl_ExprObj(interp, Tcl_ObjPrintf("wide(%s)", Tcl_GetString(objv[3])), &val_o);
-            if (rv != TCL_OK) {
-                return TCL_ERROR;
+            if (objc == 4) {
+                /*
+                 * Called with two arguments.  If the final argument is a
+                 * numeric value, treat it as a memory address, and add the
+                 * symbol directly.  If the final argument is non-numeric, fall
+                 * through to the dynamic symbol lookup routine below.
+                 */
+                expr_o = Tcl_ObjPrintf("wide(%s)", Tcl_GetString(objv[3]));
+                rv = Tcl_ExprObj(interp, expr_o, &val_o);
+                Tcl_DecrRefCount(expr_o);
+                if (rv == TCL_OK) {
+                    rv = Tcl_GetWideIntFromObj(interp, val_o, &val);
+                    Tcl_DecrRefCount(val_o);
+                    if (rv != TCL_OK) {
+                        return TCL_ERROR;
+                    }
+                    tcc_add_symbol(s, Tcl_GetString(objv[2]), (void *)val); 
+                    return TCL_OK;
+                }
-            rv = Tcl_GetWideIntFromObj(interp, val_o, &val);
-            if (rv != TCL_OK) {
+            /*
+             * Called with three arguments; or called with two arguments, the
+             * last of which is non-numeric.  Perform a dynamic symbol lookup.
+             *
+             * If there are only two arguments, reuse the TCC symbol name as the
+             * library symbol name.  If there are three arguments, the middle
+             * argument is the symbol to look up in the library.  In other
+             * words, the second-to-last argument is the symbol to look up.
+             *
+             * The final argument is the library name.  Load the library to find
+             * the symbol.  Tcl keeps track of already-loaded libraries, so the
+             * library won't actually be loaded multiple times.
+             */
+            str = Tcl_GetString(objv[objc - 2]);
+            if (Tcl_LoadFile(interp, objv[objc - 1], &str, 0, &val_p,
+                    &loadHandle) != TCL_OK) {
                 return TCL_ERROR;
+            printf("%s: %p\n", str, val_p);
-            val_p = (void *) val;
-            tcc_add_symbol(s,Tcl_GetString(objv[2]), val_p); 
+            tcc_add_symbol(s, Tcl_GetString(objv[2]), val_p); 
             return TCL_OK; 
         case TCC4TCL_COMMAND:
             if (objc != 4 && objc != 5) {
diff -ur tcc4tcl-0.30~/tcc4tcl.tcl buildsrc/tcc4tcl.tcl
--- tcc4tcl-0.30~/tcc4tcl.tcl        2019-06-11 12:25:16.850344362 -0500
+++ tcc4tcl-0.30/tcc4tcl.tcl        2019-06-11 13:54:07.764711760 -0500
@@ -46,7 +46,7 @@
-                array set $handle [list code "" type $type filename $output package $pkgName add_inc_path "" add_lib_path "" add_lib "" add_macros "" add_files ""]
+                array set $handle [list code "" type $type filename $output package $pkgName add_inc_path "" add_lib_path "" add_lib "" add_symbol "" add_macros "" add_files ""]
                 proc $handle {cmd args} [string map [list @@[email protected]@ $handle] {
                         set handle {@@[email protected]@}
@@ -63,7 +63,7 @@
                         set callcmd ::tcc4tcl::_$cmd
                         if {[info command $callcmd] == ""} {
-                                return -code error "unknown or ambiguous subcommand \"$cmd\": must be cwrap, ccode, cproc, ccommand, delete, linktclcommand, code, tk, add_include_path, add_library_path, add_library, process_command_line, or go"
+                                return -code error "unknown or ambiguous subcommand \"$cmd\": must be cwrap, ccode, cproc, ccommand, delete, linktclcommand, code, tk, add_include_path, add_library_path, add_library, add_symbol, process_command_line, or go"
                         uplevel 1 [list $callcmd $handle {*}$args]
@@ -118,6 +118,12 @@
                 lappend state(add_lib) {*}$args
+        proc _add_symbol {handle args} {
+                upvar #0 $handle state
+                lappend state(add_symbol) $args
+        }
         proc _add_file {handle args} {
                 upvar #0 $handle state
@@ -618,6 +624,10 @@
                         tcc add_file $addFile
+                foreach addSymbol $state(add_symbol) {
+                        tcc add_symbol {*}$addSymbol
+                }
                 switch -- $state(type) {
                         "memory" {
                                 tcc compile $code

32-bit Windows link failure

AMG: With tcc4tcl 0.30 (tcc 0.9.26) compiled with 32-bit MXE GCC 5.4.0, or with the official build [L3 ] (tried with [L4 ]), I get the following error:

% tcc4tcl::cproc test {Tcl_WideInt a int b} Tcl_WideInt {return a << b;}
tcc: error: undefined symbol '__ashldi3'
relocating failed

This is only a problem in Windows, not Linux. The cause is twofold. (1) gcc is prefixing all the symbol names in libtcc1.a with an extra underscore that tcc is not expecting, and (2) gcc is producing PE-format object files, and tcc only knows how to link ELF object files.

The solution is to apply this patch:

diff -ur tcc4tcl-0.30-old/ tcc4tcl-0.30-new/
--- tcc4tcl-0.30-old/        2017-10-13 15:37:05.000000000 -0500
+++ tcc4tcl-0.30-new/        2019-03-13 15:04:27.510092464 -0500
@@ -8,6 +8,8 @@
 CPP = @[email protected]
 AR = @[email protected]
 RANLIB = @[email protected]
+OBJCOPY = @[email protected]
+OBJDUMP = @[email protected]
 CFLAGS = @[email protected] @[email protected]
 CPPFLAGS = @[email protected] -I$(shell cd @[email protected] && pwd) -I$(shell cd @[email protected] && pwd)/tcc -I$(shell pwd)/tcc @[email protected] @[email protected]
 LDFLAGS = @[email protected]
@@ -29,7 +31,7 @@
 host_os = @[email protected]
 @[email protected]
-all: $(TARGET) tcc/libtcc1.a
+all: $(TARGET) tcc/libtcc1-elf.a
         if [ "$(srcdir)" = "." ]; then \
@@ -46,6 +48,17 @@
         -$(MAKE) -C tcc [email protected]@
         $(MAKE) -C tcc libtcc1.a
+# tcc supports dynamically loading object code from ELF, not from PE, so on some
+# platforms it is necessary to convert its runtime support library to ELF.
+tcc/libtcc1-elf.a: tcc/libtcc1.a
+        if $(OBJDUMP) -a $< | grep -q ' file format pei\?-x86-64$$'; then \
+                $(OBJCOPY) --remove-leading-char -O elf64-x86-64 $< [email protected]; \
+        elif $(OBJDUMP) -a $< | grep -q ' file format pei\?-i386$$'; then \
+                $(OBJCOPY) --remove-leading-char -O elf32-i386 $< [email protected]; \
+        else \
+                cp -f $< [email protected]; \
+        fi
 tcc4tcl.o: $(srcdir)/tcc4tcl.c $(srcdir)/tcc/tcc.h $(srcdir)/tcc/libtcc.h tcc/config.h
         $(CC) $(CPPFLAGS) $(CFLAGS) -o tcc4tcl.o -c $(srcdir)/tcc4tcl.c
@@ -60,7 +73,7 @@
         mv tcc4tcl-static.a
-install: $(TARGET) pkgIndex.tcl $(srcdir)/tcc4tcl.tcl $(srcdir)/tcc4critcl.tcl tcc/libtcc1.a $(shell echo $(srcdir)/tcc/include/*) $(shell echo $(srcdir)/tcc/win32/lib/*.c) $(srcdir)/headers.awk $(srcdir)/
+install: $(TARGET) pkgIndex.tcl $(srcdir)/tcc4tcl.tcl $(srcdir)/tcc4critcl.tcl tcc/libtcc1-elf.a $(shell echo $(srcdir)/tcc/include/*) $(shell echo $(srcdir)/tcc/win32/lib/*.c) $(srcdir)/headers.awk $(srcdir)/
         $(INSTALL) -d "$(DESTDIR)$(PACKAGE_INSTALL_DIR)/include"
@@ -68,7 +81,7 @@
         $(INSTALL) -m 0644 pkgIndex.tcl "$(DESTDIR)$(PACKAGE_INSTALL_DIR)"
         $(INSTALL) -m 0644 $(srcdir)/tcc4tcl.tcl "$(DESTDIR)$(PACKAGE_INSTALL_DIR)"
         $(INSTALL) -m 0644 $(srcdir)/tcc4critcl.tcl "$(DESTDIR)$(PACKAGE_INSTALL_DIR)"
-        $(INSTALL) -m 0644 tcc/libtcc1.a "$(DESTDIR)$(PACKAGE_INSTALL_DIR)/lib"
+        $(INSTALL) -m 0644 tcc/libtcc1-elf.a "$(DESTDIR)$(PACKAGE_INSTALL_DIR)/lib/libtcc1.a"
         $(INSTALL) -m 0644 $(shell echo $(srcdir)/tcc/win32/lib/*.c) "$(DESTDIR)$(PACKAGE_INSTALL_DIR)/lib"
         $(INSTALL) -m 0644 $(shell echo $(srcdir)/tcc/include/*) "$(DESTDIR)$(PACKAGE_INSTALL_DIR)/include"
         @if ! echo "_WIN32" | $(CPP) $(CPPFLAGS) - | grep '^_WIN32$$' >/dev/null; then \
diff -ur tcc4tcl-0.30-old/ tcc4tcl-0.30-new/
--- tcc4tcl-0.30-old/        2017-10-13 15:37:16.000000000 -0500
+++ tcc4tcl-0.30-new/        2019-03-13 15:06:59.156101031 -0500
@@ -39,6 +39,7 @@
+AC_CHECK_TOOL([OBJDUMP], [objdump])
diff -ur tcc4tcl-0.30-old/test.tcl tcc4tcl-0.30-new/test.tcl
--- tcc4tcl-0.30-old/test.tcl        2017-10-13 15:37:05.000000000 -0500
+++ tcc4tcl-0.30-new/test.tcl        2019-03-13 15:09:06.231108210 -0500
@@ -15,6 +15,9 @@
 # This should work
 tcc4tcl::cproc test3 {int i} int { return(i+42); }
+# Check for libtcc1 functionality
+tcc4tcl::cproc testlibtcc1 {double x} int { return(x); }
 # Multiple arguments
 tcc4tcl::cproc add {int a int b} int { return(a+b); }

Be cautious with this patch since it intentionally contains tabs and end-of-line whitespace, all of which have been stripped off by the wiki. Let me know if you want the original file with correct formatting.

After applying the patch, run "autoconf" to regenerate the configure script.

I have confirmed that the above patch fixes 32-bit Windows and does not break 64-bit Windows, 32-bit Linux, or 64-bit Linux.

Note: Even though I say that the error only happens on 32-bit Windows, as far as I can tell, this is only because libtcc1 is not needed as much on 64-bit Windows. If I dug deeper, I could probably find a test case that would break 64-bit Windows too, i.e. a case where libtcc1 is used on both 32- and 64-bit CPUs.

Variadic arguments

AMG: Is there any facility to access a cproc command's full objv argument list?

Answer: No. Bypass cproc and instead use ccode directly to create functions matching the signature shown below. Then use linktclcommand to arrange for Tcl_CreateObjCommand to be called on those functions. Note that the created Tcl commands are in the global namespace by default, not in the current namespace eval namespace, unless the namespace is explicitly given in the command name.

typedef int Tcl_ObjCmdProc(
        ClientData clientData,
        Tcl_Interp *interp,
        int objc,
        Tcl_Obj *const objv[]);


dbohdan 2015-03-16: As of version 0.23 you can compile tcc4tcl on Linux thus:

set -e
curl "$url" -o "${release}.tar.gz" # -O may not be unavailable.
tar zxvf "$release.tar.gz"
cd $release

dbohdan: Updated the script once more to hammer Roy Keene's server less.