Version 19 of trofs

Updated 2007-07-24 19:34:37 by pcam

See http://math.nist.gov/~DPorter/tcltk/trofs/


Tcl package that provides commands to create, mount, and unmount archives containing read-only filesystems.


File format (version 1, as found in trofs-0.4.3)

(reconstructed from trofs-0.4.3/library/procs.tcl by PS)

The trofs archive file format is a concatenation of file contents grouped by directory. Each directory has a TOC holding file, link and directory entries. File and directory entries point to an absolute location in the archive file and specify the length of the item; link entries hold the link target instead. All offsets are from the start of the archive file. Each TOC is a UTF-8 encoded dict.

 <arbitrary header (optional)>
 0x1a
 <file1 data>
 <file2 data>
 <file3 data>
 <dir1/file1 data>
 <dir1/file2 data>
 <dir1/dirA/file1 data>
 <dir1/dirA TOC>
 <dir1 TOC>
 <dir2/file1 data>
 <dir2 TOC>
 <root TOC>
 0x1a
 trofs01 (literal string - trofs file signature)
 <size of root TOC (big endian 32-bit int)>

  • Arbitrary header: For example, a Tcl script for mounting the trofs file as a Tcl Module.
  • File data: A verbatim copy of the contents of the file.
  • root/dir TOC: A dict keyed by filename (see example below).
  • size of root TOC: Used to locate the root TOC, which starts at offset filesize-12-rootTocSize. The 12 trailing bytes are the 0x1a byte, the 7-byte trofs01 signature and this 4-byte size.
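The trailer arithmetic above can be sketched in a few lines of Python. This is only an illustration of the layout as reconstructed on this page, not code taken from trofs. The function names locate_root_toc and read_root_toc are made up for the example, and reading the size as an unsigned integer is an assumption.

```python
import struct

SIGNATURE = b"trofs01"
# 0x1a byte + 7-byte signature + 4-byte size = 12 trailing bytes
TRAILER_LEN = 1 + len(SIGNATURE) + 4


def locate_root_toc(archive: bytes):
    """Return (offset, size) of the root TOC (hypothetical helper)."""
    if len(archive) < TRAILER_LEN:
        raise ValueError("too short to be a trofs archive")
    trailer = archive[-TRAILER_LEN:]
    if trailer[0:1] != b"\x1a" or trailer[1:8] != SIGNATURE:
        raise ValueError("missing trofs01 signature")
    # Big-endian 32-bit size; treating it as unsigned is an assumption.
    (toc_size,) = struct.unpack(">I", trailer[8:12])
    offset = len(archive) - TRAILER_LEN - toc_size
    if offset < 0:
        raise ValueError("root TOC size exceeds archive size")
    return offset, toc_size


def read_root_toc(archive: bytes) -> str:
    """Return the root TOC text, decoded as UTF-8."""
    offset, size = locate_root_toc(archive)
    return archive[offset:offset + size].decode("utf-8")
```

Because everything is found by working backwards from the end of the file, the optional header at the front can be of any length without the reader needing to know about it.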

TOC:

 file1 {F <length> <offset>}
 file2 {F 15 32}
 dir1 {D <size> <offset>}
 dir2 {D 203 12542}
 link1 {L <target>}
 link2 {L ../file1}
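As a rough illustration of how such a TOC could be read outside Tcl, here is a Python sketch. It assumes names and targets contain no whitespace, backslashes or quotes, so only brace grouping is handled; real Tcl list syntax is richer than this. The helper names split_tcl_list and parse_toc are invented for the example.

```python
def split_tcl_list(text):
    """Split a Tcl list into elements, honouring {brace} grouping only."""
    items, i, n = [], 0, len(text)
    while i < n:
        if text[i].isspace():
            i += 1
            continue
        if text[i] == "{":
            depth, start = 1, i + 1
            i += 1
            while i < n and depth:
                if text[i] == "{":
                    depth += 1
                elif text[i] == "}":
                    depth -= 1
                i += 1
            if depth:
                raise ValueError("unbalanced braces")
            items.append(text[start:i - 1])  # drop the closing brace
        else:
            start = i
            while i < n and not text[i].isspace():
                i += 1
            items.append(text[start:i])
    return items


def parse_toc(text):
    """Map each name to ('F'|'D', length, offset) or ('L', target)."""
    words = split_tcl_list(text)
    if len(words) % 2:
        raise ValueError("dict needs an even number of elements")
    toc = {}
    for name, value in zip(words[0::2], words[1::2]):
        fields = split_tcl_list(value)
        kind = fields[0]
        if kind in ("F", "D"):
            toc[name] = (kind, int(fields[1]), int(fields[2]))
        elif kind == "L":
            toc[name] = (kind, fields[1])
        else:
            raise ValueError("unknown entry type: " + kind)
    return toc
```

A D entry found this way gives the length and offset of another TOC, which can be sliced out of the archive and parsed the same way to walk down the tree.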

What is the reason for inventing Yet Another Archive Format And Making It Tcl Specific, and not using - say - ZIP? -jcw

PWQ 27 Mar 05, I would go one further. Since the FS is to be read only, why don't you adopt ext2-fs or dos-fs, or even cram-fs. The advantages of these is that you can mount these under linux and create/modify them easily. When you re-invent the wheel, it should be a better wheel.

That would make no sense on Windows -jcw

Jcw: Hence the dos-fs option. I would rephrase: Windows makes no sense. -PWQ

PS 27Mar05 The short answer I got from dgp was that it is open for discussion. The main reason for me writing this format down here is to familiarize myself with the trofs code. As far as I can tell, a lot of effort went into making the trofs code thread safe. The next step will be adding compression to trofs, and after that I am likely to hack it to read (a subset of) zipfiles. The significant difference between the zip file format and the trofs file format is the directory structure. In zipfiles, a main TOC holds all files, whereas trofs has a TOC for each directory. I think the simplest trick will be to turn the zip TOC into a nested dict; that would preserve most of the trofs internals.
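The nested-dict idea can be sketched as follows, in Python for illustration only. It assumes a well-formed flat listing with "/"-separated paths, where directory entries end in "/", as zip archives commonly store them. The function name nest_entries and the tuple values are hypothetical and do not reflect what a zip central directory actually records.

```python
def nest_entries(flat):
    """Turn a flat {'a/b/c': entry} listing into one dict per directory."""
    root = {}
    for path, entry in flat.items():
        parts = [p for p in path.split("/") if p]
        if not parts:
            continue
        node = root
        # Walk (and create as needed) the chain of parent directories.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        if path.endswith("/"):
            # Explicit directory entry: make sure the directory exists.
            node.setdefault(parts[-1], {})
        else:
            node[parts[-1]] = entry
    return root
```

Each inner dict then plays the role of one per-directory TOC, so lookups can proceed one path component at a time as they do with the native format.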

SEH 28Mar05 -- If you were to use the Debian .deb format, then you would get the benefit of re-use, and to boot every existing Debian package could be trivially converted to a Tcl module. Then Tcl could be used as a general-purpose software/install management tool, rather than simply a Tcl package tool. I dream of using Debian-style archives on Windows as well as Unix. That would be another opportunity to widen the appeal and use of Tcl in the computer-user community.

30Mar05 -- The tar virtual filesystem included in the Tclvfs package has pure-Tcl code for reading from standard tar files. So why not simply use the tar format for your filesystem and adapt the existing code, or better yet use the tar vfs?

DGP One of the main motivations was to write code that provided a Tcl_Filesystem that did not make use of the tclvfs package. This was in part to prove that the Tcl_Filesystem interface was usable by someone other than the person who wrote it, and in part to avoid some (perceived?) limitations imposed by the tclvfs package itself. In particular, I believe that the scheme tclvfs uses to determine what paths belong to what filesystem is entirely prefix based, which prevents nested mounts. trofs does not have that limitation. Also, because tclvfs defines filesystems in terms of callback scripts, there must be an interp in order to use tclvfs filesystems. trofs filesystems can remain mounted even after all interps have been destroyed. This suggests that with further work, trofs filesystems will be able to be mounted before any interp is created, and this will offer a solution to some of the tricky startup problems experienced by programs like Tclkit that wish to access data from a virtual filesystem as part of interp initialization.

Regarding the archive format suggestions, I've never thought the archive format was the point. I actively don't care what the format is. Completely open to changing it, so long as doing so actually provides some benefit.

jcw - I appreciate the fact that trofs addresses startup issues. It's most unfortunate that this was done using Yet More C Code, since compiled code prevents tinkering/evolution (see my "a critical mindset about policy" rant). None of this is likely to be a performance bottleneck if it were to be coded in Tcl. Is there no way to decouple the startup logic further so interps can be inited without depending on a file system, and then "injecting" some Tcl scripting logic to start things off?

As for the trofs format - isn't this turning the argument around a bit? There is a perfectly valid (ubiquitous, in fact) ZIP archive format and zipvfs driver; it is not clear to me how trofs improves on it (other than w.r.t. the current startup issues).

TclVFS could be improved further if it has limitations - AFAICT it is merely a thin reflect-to-tcl wrapper.

DGP Further decoupling and other improvements to Tcl's startup are better addressed elsewhere. Yes, that's in progress on the HEAD of Tcl. That said, if you want to evaluate Tcl code, you need a Tcl interp, so anything that must be done as part of creating a Tcl interp and preparing it to evaluate scripts must be written in C.

Turning the argument around -- yes, I'd agree if I already knew the zip format, and if I actively rejected it in favor of the trofs format instead. That's not the case. I needed an archive, and I lacked the patience to learn about any existing ones. The one that's in place now is pretty much a trivial copy of data structures to disk file. I did leave the door open to change it later, once I or someone else had the patience to make a better choice. Note that I didn't even bother to document the format; Pascal did that as an aid to his own modifications, and then (unwisely?) posted it here. Then the piling on began.

At one level TclVFS is a thin wrapper, but where I believe it acquires its limitations is in its C-coded part that implements the Tcl_Filesystem it provides. VfsPathInFilesystem imposes the "no nested mounts" constraint. No matter how clever each Tcl-coded client filesystem of tclvfs may be, the innards of tclvfs get in the way.

pcam 24/07/2007 I have just tried "package require trofs" with ActiveState 8.5a8, and both tclsh and wish exit. Surely this is a bug.


Category Deployment | Category Package