tarpack

JMN 2005-08-22


tarpack is an experimental Tcl-only package for loading & creating Tcl Modules that are also valid tar archives.

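This is done by giving the first file in the archive a special name: one that starts with the Tcl comment character #, or one that is itself a short piece of Tcl script. Because a tar archive begins with the name of its first entry, the file then starts out as valid Tcl. I chose "#tarpack-loadscript" as the first file's name, and it also acts as a sort of file-type 'signature'.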

This first file also ends with a ctrl-z (\u001A) character. Tcl's source command treats ctrl-z as end-of-file, so it stops reading there and never sees the rest of the tar archive.
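To make the mechanism concrete, here is a minimal sketch that builds such a file by hand and then sources it. It is not tarpack's own code: the helper names (tarHeader, writeDemo), the file name demo-0.1.tm and the header field values are assumptions made for illustration. It assumes Tcl 8.5 or later and an ASCII-only script. Note that the script body starts with a newline, because the comment opened by # runs through the rest of the tar header up to the first newline.

```tcl
# Build one 512-byte ustar header for a regular file (hypothetical helper).
proc tarHeader {name size} {
    set hdr [binary format a100a8a8a8a12a12a8a1a100a6a2a32a32a8a8a155a12 \
        $name 0000644 0000000 0000000 \
        [format %011o $size] [format %011o [clock seconds]] \
        "        " 0 "" ustar 00 "" "" "" "" "" ""]
    # The checksum is the byte sum of the header with the checksum field
    # (offsets 148..155) set to spaces, stored as 6 octal digits, NUL, space.
    binary scan $hdr cu* bytes
    set sum 0
    foreach b $bytes {incr sum $b}
    set field [binary format a6xa1 [format %06o $sum] " "]
    return [string replace $hdr 148 155 $field]
}

# Write a one-entry tar archive whose only file is the load-script.
proc writeDemo {path script} {
    # The leading newline ends the comment that the header starts;
    # the trailing ctrl-z (\x1A) makes [source] stop before the tar padding.
    set body "\n$script\n\x1A"
    set len [string length $body]
    set out [open $path w]
    fconfigure $out -translation binary
    puts -nonewline $out [tarHeader "#tarpack-loadscript" $len]
    puts -nonewline $out $body
    # Pad the entry to a 512-byte boundary, then append the two
    # all-zero blocks that mark the end of a tar archive.
    set pad [expr {(512 - $len % 512) % 512}]
    puts -nonewline $out [string repeat \0 [expr {$pad + 1024}]]
    close $out
}

writeDemo demo-0.1.tm {set ::demo_loaded "sourced from a tar archive"}
source demo-0.1.tm
puts $::demo_loaded
```

The resulting file should also be listable with an ordinary tar tool (for example tar tf demo-0.1.tm), which is the dual nature described above.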

You can examine the .tm files with a text editor and any standard tar archiving program to get a feel for what's going on.

It is available here: http://vectorstream.com/tcl/packages/ along with some other packages, such as Thread, trofs & tdom, wrapped using tarpack. I refer to them as 'tarpacks'.


Some points about the tarpack approach:

1) In the simplest case of a single Tcl script package, wrapping it in a tarpack only adds a small comment at the beginning of the file (the Tcl comment absorbs the tar header data), plus a ctrl-z and a little tar padding at the end of the file. i.e. there should be next to no performance hit for loading such a package: 'sourcing' such a tar file doesn't even require that Tcl understand or invoke any tar code. (See the 'uri' module at the site above for an example of a trivially tarpacked package.)

2) With a tarpacked set of modules, a repository could easily add & maintain metadata as specially named files added to the archive.

3) Especially with modern gui-based tar tools, a repository maintainer/developer can easily inspect and modify tarpacked modules.
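As an illustration of points 2 and 3, the entries of a tarpack (including any specially named metadata files) can be listed with a few lines of plain Tcl. This is only a sketch under stated assumptions: tarEntryNames is a hypothetical helper, not part of tarpack, and it handles plain tar headers only, with no long-name extensions.

```tcl
# List the entry names in a tar archive by walking its 512-byte headers.
proc tarEntryNames {path} {
    set in [open $path r]
    # Binary mode also disables ctrl-z end-of-file handling on the channel.
    fconfigure $in -translation binary
    set names {}
    while {1} {
        set hdr [read $in 512]
        # Stop at a short read or at the all-zero end-of-archive block.
        if {[string length $hdr] < 512 || [string index $hdr 0] eq "\0"} break
        # Name: first 100 bytes, NUL-padded. Size: octal text at offset 124.
        set name [lindex [split [string range $hdr 0 99] \0] 0]
        set size 0
        scan [string range $hdr 124 134] %o size
        lappend names $name
        # Skip the file data, which is padded to a multiple of 512 bytes.
        seek $in [expr {(($size + 511) / 512) * 512}] current
    }
    close $in
    return $names
}

# Usage, assuming a tarpack is present in the working directory:
#   puts [tarEntryNames Thread-2.6.1.tm]
```

For a tarpack, the first name in the result would be expected to be #tarpack-loadscript.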


I've been playing around with Tcl Modules in an attempt to reduce interp startup time. For installations with a large number of packages on auto_path and/or with network connections on the auto_path, load time for packages can be significant.

I don't know how useful tarpack will end up being. It was intended to be Tcl-only so that an installation that includes binary packages can be set up with every package as a Tcl Module, rather than as the usual packages on auto_path.

I was initially playing around with trofs as an archiver - and presumably trofs as a binary package will be much more efficient, but by wrapping trofs with tarpack, I can at least bootstrap trofs into the system without relying on auto_path.

Like trofs, tarpack doesn't depend on vfs, but unlike trofs, tarpack doesn't actually 'mount' the module as part of the filesystem.

LV Why avoid VFS?

JMN It's not so much about avoiding VFS altogether. I just wanted to remove dependencies on other packages for the main use cases, so that a basic tarpack can be loaded simply by placing the tarpack-<ver>.tm file and the particular tarpack on the 'Module Path'. Indeed, in the simplest cases, such as a single trivially wrapped Tcl script as mentioned above, the tarpack-<ver>.tm package itself is not even needed!
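For readers who haven't used the Module Path, the following sketch shows the general mechanism with a plain single-file module rather than a real tarpack. The directory name tm-demo and the greet module are made up for illustration, and Tcl 8.5 or later is assumed.

```tcl
# Create a throwaway module directory with one trivial module in it.
set moddir [file join [pwd] tm-demo]
file mkdir $moddir
set out [open [file join $moddir greet-1.0.tm] w]
puts $out {
    package provide greet 1.0
    namespace eval greet {
        proc hello {} { return "hello from a Tcl Module" }
    }
}
close $out

# Put the directory on the Module Path. No pkgIndex.tcl and no
# auto_path entry is involved; the file name carries name and version.
# (tcl::tm::path add rejects a directory that is an ancestor or
# descendant of one already on the path.)
::tcl::tm::path add $moddir
puts [package require greet]
puts [greet::hello]
```

A trivially wrapped tarpack dropped into the same directory would be found in the same way, since Tcl only needs to be able to source the file.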

For slightly more complicated tarpacks, e.g. packages that need access to their folder structure to load images etc., I may add a tarpack::mount command (that does use VFS) so that the load-script can more easily make the package file structure available. Even as is, however, you could tarpack more complex packages and call vfs::tar yourself from the load-script.


Here's a rough example of how the Thread package was wrapped as a tarpack:

 % package require tarpack
 1.0
 % tarpack::create Thread-2.6.1.tm [list ttrace.tcl Windows-x86 FreeBSD-x86]
 Thread-2.6.1.tm

At this point the tar archive (and Tcl Module!) Thread-2.6.1.tm has been created with a default basic load-script. This default load-script will simply source any .tcl files we gave in the list to tarpack::create above. If there had been .so or .dll files directly listed there, it would have called 'load' on them (after copying them out to the file system).
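The rule the default load-script follows can be sketched as a small classifier. This is not tarpack's actual code: loadPlan and the file name example.dll are hypothetical, and the sketch only decides what would be done with each directly listed name without sourcing or loading anything.

```tcl
# Decide, per listed name, whether a default load-script would
# source it, load it, or leave it alone (hypothetical helper).
proc loadPlan {names} {
    set plan {}
    foreach name $names {
        switch -- [string tolower [file extension $name]] {
            .tcl       { lappend plan [list source $name] }
            .so - .dll { lappend plan [list load $name] }
            default    { lappend plan [list skip $name] }
        }
    }
    return $plan
}

puts [loadPlan {ttrace.tcl Windows-x86 example.dll}]
```

Under this rule the platform directories in the Thread example are skipped, which is why the load-script then needs editing by hand to reach the binaries inside them.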

The load-script is also left as a separate file in the working directory. It's named #tarpack-loadscript

This separate load-script can now be edited to make 'tarpack::source' & 'tarpack::load' calls to any of the wrapped files.

A subsequent tarpack::create call with the same arguments as above will now use the modified load-script instead of the default. Alternatively,

 tarpack::update Thread-2.6.1.tm "" 

will read the tar contents list of Thread-2.6.1.tm and re-archive those files along with the modified load-script. (The final argument is an empty list, indicating that we are not adding any extra files.)


Sorry there's no separate documentation - just some comments in the source.