Small Core == Good

The New Tcl Philosophy


 Small core         == Good.
 Static executables == Bad.

 Core only distros    == Bad.
 Fully loaded distros == Good.

What bugs me about this slogan is that we do not necessarily see it in practice. For instance, many of the string features added over the past few years could certainly have been implemented in an extension shipped with the core distribution, but instead they went into the core.

In fact, how much of what is currently in the core has to be there? It seems to me that a small core interpreter, able to load scripts, dynamically load extensions, and interpret code, would be possible without sockets, regular expressions, the history command, etc. in the true core code. All of those features would still ship in the core distribution, so that by default Tcl would include them, but as dynamically loaded extensions.

The core itself might need some new features - to provide the ability for dynamically loaded commands to be treated as byte codes, for instance.


escargo 16 Jan 2003 - But what are the costs and benefits of the different approaches?

For example, for every feature that might or might not be currently loaded, code has to either load it, test that it is loaded, or handle the error that results if it is not loaded. That's a burden on the programmer (who writes the code) and on execution (since it adds some run-time overhead and code volume).
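
To make that burden concrete, here is a minimal sketch of the "test that it is loaded" style. The helper name haveFeature and the package name myext are made up for illustration; only [package require] and [catch] are real Tcl commands.

```tcl
# Hypothetical helper: try to load an optional feature package and report
# whether it is available, instead of letting a later command fail.
proc haveFeature {name} {
    # [package require] raises an error when the package cannot be found;
    # [catch] turns that error into a non-zero result code.
    return [expr {![catch {package require $name}]}]
}

# Every caller of an optional feature ends up with a branch like this one.
# "myext" is a placeholder name, not a real package.
if {[haveFeature myext]} {
    puts "myext loaded"
} else {
    puts "myext missing; falling back"
}
```

The check itself is short, but it has to be repeated (or cached) wherever an optional feature is used, which is the code volume and run-time overhead mentioned above.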

Part of the problem is where the cost is paid. Is it paid when code gets loaded? Is it paid when code gets byte-compiled? Is it paid when code gets executed?

I personally think dynamic linking is great. I liked it ever since I used it in Multics many years ago. But there are advantages to statically binding all parts of a program together. (In some ways this is similar to the discussions about the Linux kernel and its loadable modules.)


KBK 17 Jan 2003 - Regular expressions seem to be everyone's favorite example of a feature that could be unbundled from the Core, but they're a really difficult example. The problem with attempting to unbundle them is that so very many other commands, which might otherwise ideally be bundled into their own packages, depend on them. [switch -regexp], [lsearch -regexp], [$textWidget search -regexp] all have a dependency on it. Arguably, these "options" might have been better implemented as "subcommands", but escargo is right that willy-nilly unbundling isn't a good idea.

When people ask for something to be "added to the Core", they usually are really requesting that it be "universally available wherever the Core is available". I wonder if what that means would become clearer if we did some small amount of restructuring, perhaps by doing something like unbundling [text] and [canvas] into supplied packages.

jcw - Kevin, as [L1 ] illustrates, regexp can be built as an extension with relatively little effort.

The dependency on it can be dealt with IMO: let the C calls in the Tcl core go through a stubs table, and have this new extension export a stubs table.

The one remaining issue is what to do when regexp is not available (or not yet available: startup might need it while, say, regexp is an extension stored in VFS). IMO, this too is solvable in a relatively straightforward manner: the commands which offer regexp options need to check, and then generate an error if regexp is requested but missing.
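
A script-level sketch of that check might look like the following. The real thing would live in C behind a stubs table; the proc name match and its error message are invented here purely to show the shape of the idea.

```tcl
# Hypothetical command offering -exact, -glob and -regexp matching modes.
# Only the -regexp branch depends on the regexp command being present.
proc match {mode pattern string} {
    switch -exact -- $mode {
        -exact {
            return [expr {$pattern eq $string}]
        }
        -glob {
            return [string match $pattern $string]
        }
        -regexp {
            # Check availability first, and fail with a clear message
            # rather than an obscure "invalid command name" error.
            if {[llength [info commands ::regexp]] == 0} {
                error "regular expressions are not available"
            }
            return [regexp -- $pattern $string]
        }
        default {
            error "bad mode \"$mode\": must be -exact, -glob, or -regexp"
        }
    }
}
```

Callers that never ask for -regexp pay nothing beyond the dispatch; callers that do get a well-defined error when the engine is absent.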

This would be a nice start to reduce the core (by about 10%, I expect). There are simpler engines, which may be sufficient for some cases. With a stubs-connected regexp approach, it would be up to every developer to decide what to include.

Other biiig candidates for splicing off, IMO:

  • the file system (we're so close with VFS already!)
  • sockets (may be trickier, with all event tie-ins)

Reasons for doing so: I would be interested in a smaller core which makes it possible to embed Tcl in far more things (other scripting languages, as "engine"?). It's not just for environments which have limited capabilities (no filesystem/network), it's at least as important for environments where these capabilities are already present, but not needed on the Tcl end of things (Tcl as a systems-scripting language).

escargo 17 Jan 2003 - How hard would it be for different commands (or parts of commands) to declare in some way which parts of other commands they rely on? (Or is there a way to derive this automatically?) Given the list of dependency relations, it would be possible to compute a topologically ordered dependency graph in which the most independent features would have no incoming connections and the most frequently used features would have many incoming connections. One could then compute strongly connected components and apply other transformations that could tell you how different features should be bundled together.
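
As a rough sketch of the ordering step, assuming the dependency relations have somehow been collected into a dict: the entries below are illustrative only (loosely taken from the examples on this page), and topoOrder is a made-up name. The ordering uses Kahn's algorithm; it does not compute strongly connected components, it only reports that a cycle exists.

```tcl
# deps maps each feature to the list of features it relies on.
# Illustrative data only, not a real analysis of the Tcl core.
set deps {
    regexp  {}
    switch  {regexp}
    lsearch {regexp}
    socket  {event}
    event   {}
}

# Return the features ordered so that every feature appears after
# everything it depends on (Kahn's algorithm). Raises an error on a cycle.
proc topoOrder {deps} {
    set pending [dict create]   ;# feature -> number of unmet dependencies
    set users   [dict create]   ;# feature -> features that rely on it
    dict for {feature needs} $deps {
        dict set pending $feature [llength $needs]
        if {![dict exists $users $feature]} { dict set users $feature {} }
        foreach need $needs {
            if {![dict exists $deps $need]} {
                error "unknown dependency \"$need\" of \"$feature\""
            }
            dict lappend users $need $feature
        }
    }
    # Start with the features that depend on nothing.
    set ready {}
    dict for {feature count} $pending {
        if {$count == 0} { lappend ready $feature }
    }
    set order {}
    while {[llength $ready] > 0} {
        set ready [lassign $ready feature]
        lappend order $feature
        # Releasing a feature satisfies one dependency of each of its users.
        foreach user [dict get $users $feature] {
            dict incr pending $user -1
            if {[dict get $pending $user] == 0} { lappend ready $user }
        }
    }
    if {[llength $order] != [dict size $deps]} {
        error "dependency cycle detected"
    }
    return $order
}

puts [topoOrder $deps]
```

Features that come out early are candidates for the smallest core; features with many users (regexp, in this toy data) are the ones that are hardest to unbundle.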

Certainly for documentation purposes it is useful for a unit of code to specify what other units of code it depends on (and perhaps even their versions).


US 17. Jan. 2003 - The burden on the programmer is minimal - that's what unknown already does.
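
For readers who have not looked at it, a minimal sketch of the mechanism: wrap the stock unknown handler so that a missing command is made available on first use. The lazyCommands table and the greet command are invented for the example; in a real setup the table entry would be a [package require] or [load] call. This assumes the standard unknown proc from init.tcl is present, as it is in a normal tclsh.

```tcl
# Hypothetical table: command name -> script that makes the command
# available. Here the script just defines a proc.
set lazyCommands {
    greet { proc greet {who} { return "hello, $who" } }
}

# Keep the stock handler and fall back to it for everything else.
rename unknown stockUnknown
proc unknown {cmd args} {
    global lazyCommands
    if {[dict exists $lazyCommands $cmd]} {
        # "Load" the feature in the global scope, once.
        uplevel #0 [dict get $lazyCommands $cmd]
        dict unset lazyCommands $cmd
        # Re-run the original call in the caller's context.
        return [uplevel 1 [list $cmd {*}$args]]
    }
    return [uplevel 1 [list stockUnknown $cmd {*}$args]]
}

# The caller writes nothing special; the first call triggers the load.
puts [greet world]
```

The calling code stays free of availability checks, but the cost moves to the first execution of each missing command.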

escargo 17 Jan 2003 - But unknown does this at execution time for running programs. Is there any way to do it statically, on the implementation of Tcl itself? (A call graph and a dependency graph are not the same thing.)