This page was prompted by recent developments (see [CriTcl builds C extensions on-the-fly]) and occasional discussions over the past years with a number of people who are heavily into scripting - [JCW]

----

The issue addressed here could be summarized as:

   * '''All policy issues in software must be scripted'''
   * Or even more succinctly: '''No poli-C, please!'''

What I mean by this is that code falls into a number of categories, including: general structure, administration of details, core domain-specific logic, and performance issues. And much more, clearly.

What seems to happen all too frequently is that decisions one makes about how software should behave end up being spread out over many parts of the code. When scripting with some parts coded as C extensions, there is the risk that such choices end up solidifying... ''in C''. When software is deployed to other machines, and when this is done with compiled code, that code may become quite hard to adjust: one has to recompile, test locally, re-distribute, and test remotely.

Here's a little case study which illustrates the problem (this is not a reconstruction of facts, but a loose summary of what I ''think'' happened):

   1. Long ago, someone coded a "base64" encoding/decoding algorithm in C. Probably because a Tcl implementation was slow (especially in the pre-8.x days).
   2. The Tcl code remains useful, e.g. when there is no compiler, on new platforms, for old installations, as documentation, etc.
   3. At some point, someone wanted/needed to add a few options to encoding: maximum line length, and a customizable separator between lines.
   4. The Tcl version is easily changed, the results are instantly available.
   5. The C code takes more work (to code, but especially to re-deploy).
   6. The state today is a bit of C code which does some things, and a Tcl wrapper around it which does other things and goes out of its way to make both the pure-Tcl and the C-optimized scenarios work in the same way.
   7. The pure-Tcl version is now part of TclLib, and has been further optimized to take advantage of how Tcl works nowadays (some idioms change gradually, once underlying commands become substantially faster).

While the effort is laudable, understandable, and in fact quite logical, it leads to a lot of (IMNSHO) ... "cruft". There is complexity in here which can easily be avoided. Worse still, this complexity will continue to make it hard to add more features - if this is ever needed.

What this example illustrates is what happens all too often: when mixing scripting and C code, the switch to C (usually for performance reasons) ends up being full-scale. '''Policy''' decisions, e.g. how to split / join text resulting from base64 encoding, end up being coded in C as well - even though they do not impact performance. Handling of special cases, and errors, often also ends up being fully duplicated in C.

The consequence is that a substantial part of extension logic is coded in C, thus killing one of the key benefits of scripting. What if I wanted to separate lines by more than a single character (say "\n ", for presentation purposes perhaps)? Or make different trade-offs w.r.t. speed (knowing that bad input cannot happen, for example)? The C code can't do it without a recompilation (on each platform), whereas in Tcl it would be trivial.

This could very easily have been avoided. Let's assume that the C code exists solely for performance reasons. The performance, in the case of base64, comes from a bit of mapping and bit-fiddling, to perform the 3-byte vs. 4-byte mapping of strings.
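To make the "mapping and bit-fiddling" concrete, here is a minimal sketch in plain C of the 3-byte to 4-character step. The function name `b64_encode_blocks` and its signature are hypothetical, chosen only for illustration; it is not the code of any existing base64 extension. It assumes the standard base64 alphabet and deliberately does no padding, line splitting, or input checking.

```c
#include <stddef.h>

/* Standard base64 alphabet (assumed; the page does not spell it out). */
static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Hypothetical helper: encode nblocks * 3 input bytes into nblocks * 4
 * output characters. It writes no padding, no line breaks and no NUL
 * terminator; the caller is assumed to provide a large enough buffer
 * and to deal with any leftover bytes.
 */
static void b64_encode_blocks(const unsigned char *in, size_t nblocks,
                              char *out)
{
    while (nblocks-- > 0) {
        /* Pack three bytes into one 24-bit value... */
        unsigned long v = ((unsigned long)in[0] << 16)
                        | ((unsigned long)in[1] << 8)
                        |  (unsigned long)in[2];
        /* ...and emit it as four 6-bit indices into the alphabet. */
        out[0] = b64_alphabet[(v >> 18) & 0x3F];
        out[1] = b64_alphabet[(v >> 12) & 0x3F];
        out[2] = b64_alphabet[(v >> 6) & 0x3F];
        out[3] = b64_alphabet[v & 0x3F];
        in += 3;
        out += 4;
    }
}
```

Everything that is a matter of choice (what to do with a trailing 1 or 2 bytes, where to break lines, which separator to use) is absent from this loop, which is the point: none of it affects how fast the mapping runs.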
Here's how one "ought to" code the base64 extension, IMNSHO:

   * a support function in C which encodes N*3 bytes into N*4 bytes
   * a support function in C which decodes N*4 bytes into N*3 bytes
   * a support function in C which turns a string into a list of X-byte substrings

The rest, and that includes all option handling, special cases when sizes aren't a multiple of 3 or 4, and line length splitting/joining, can ''easily'' be coded in Tcl.

Each of the above 3 C functions is no more than two dozen lines of C code (Tcl arg checking and all). They do just one thing each, and they do it fast. In fact, they will do it faster than the current C implementations, because there is no special-case checking at all. They just race through input strings and construct results in a tight loop. If validation of correct input is an issue, a regexp call can easily be added in the Tcl layer.

But base64 is really the tip of the iceberg, and not even an important one. What would really be needed is a structured and concerted effort to apply the rule of "no policy gets coded in C" to the Tcl and Tk cores. There are huge amounts of C code in the core which reduce the flexibility of scripting:

   * it is not possible to override sub-commands in Tcl ("info source" was hard to fix in the first Tcl-based VFS implementations)
   * the channel design does not (yet?) allow coding channels in Tcl (again, VFS critically depends on this)
   * Tk's "text" widget is huge and has things like a b-tree implementation which is totally inaccessible for any other purpose

Perl implements far more of itself in Perl (without being a slow system at all). Python does not need channels to be stackable, it uses its standard object protocol to provide a "file interface".

To wrap it all up, I propose to look at scripting and C with a different mindset. Extensions coded in C should not aim to be useful as top-level modules, but only focus on doing small things well.
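As an illustration of "doing small things well" for the base64 case, the decoding counterpart might look like the sketch below. The names `b64_value` and `b64_decode_blocks` are hypothetical, and the sketch assumes an ASCII-compatible character set and the standard base64 alphabet. It performs no validation: characters outside the alphabet silently decode as 0, on the assumption that the script layer has already checked the input (with a regexp, say) and stripped padding and line separators.

```c
#include <stddef.h>

/*
 * Hypothetical helper: map one base64 character to its 6-bit value.
 * Assumes ASCII ordering of letters and digits. Unknown characters
 * map to 0; checking is deliberately left to the caller.
 */
static unsigned long b64_value(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return (unsigned long)(c - 'A');
    if (c >= 'a' && c <= 'z') return (unsigned long)(c - 'a') + 26;
    if (c >= '0' && c <= '9') return (unsigned long)(c - '0') + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return 0;
}

/*
 * Hypothetical helper: decode nblocks * 4 input characters into
 * nblocks * 3 output bytes. Padding ("=") and line separators must
 * have been removed by the caller beforehand.
 */
static void b64_decode_blocks(const char *in, size_t nblocks,
                              unsigned char *out)
{
    while (nblocks-- > 0) {
        /* Reassemble four 6-bit values into one 24-bit value... */
        unsigned long v = (b64_value((unsigned char)in[0]) << 18)
                        | (b64_value((unsigned char)in[1]) << 12)
                        | (b64_value((unsigned char)in[2]) << 6)
                        |  b64_value((unsigned char)in[3]);
        /* ...and split it back into three bytes. */
        out[0] = (unsigned char)((v >> 16) & 0xFF);
        out[1] = (unsigned char)((v >> 8) & 0xFF);
        out[2] = (unsigned char)(v & 0xFF);
        in += 4;
        out += 3;
    }
}
```

A 256-entry lookup table would likely be faster than the chain of comparisons in `b64_value`; the comparisons are used here only to keep the sketch short.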
With a Tcl layer wrapped on top, one can then build the API that gets used everywhere. This makes the C code smaller, simpler, and sometimes even faster. Note that even pure-Tcl implementations become simpler: only simple bits of C need to be recoded in Tcl, most of the code was already Tcl anyhow.

It is time to let go of the idea that the center of the universe is C, ''even'' for Tcl and Tk themselves. The "CriTcl" package makes it feasible to fully leave that approach behind one day. But even in small ways, reducing the number of policy decisions that are coded in C can quickly pay off, in that C code changes (and fixes, and [testing]) become far less frequent, and that far more fixes/changes/improvements end up in Tcl.

'''JCW'''

----

[RS] Though sometimes I feel so, the center of the universe would surely also not be Tcl/Tk - rather, an empty point around which C(++)?, Tcl, Python, Perl, etc. rotate... Fascinating thoughts, well worth continuing. By separating core functionality from policy/configuration/syntax specifics, work spent on Tk and even Tcl internals could be of interest to the other [scripting language]s, similar to how they embrace Tk already... (vague visions of "the open source answer to .net"...)