Metakit Documentation

Overview

Metakit is a machine- and language-independent toolkit for storing and managing structured data. This is a description of the Mk4tcl extension, which allows you to create, access, and manipulate Metakit datafiles using Tcl. This document is derived from http://www.equi4.com/metakit/tcl.html


Definitions

A Store contains a set of named views. A Tag denotes an open Store. A Store may be a file, or a temporary in-RAM structure.

A View is an indexable collection of Rows (equivalently: a table of records, an array of elements). Views are homogeneous: each row in a view contains the same type of information (this also implies that all Subviews within the same View always have the same structure). Top-level Views are denoted by tag.viewname.

Each Row is an ordered set of named Properties. An Index is a position in a View denoting a Row (the first Row is at Index zero). A Row is denoted by tag.viewname!N (where N is the Row's Index.)

Property values can be strings, numbers, untyped data, or a nested View (called a Subview). A (view, index, property) tuple denotes a single data value.

Subviews extend the Row denotation, e.g. tag.viewname!N.subview. Subview Rows continue in the same way, e.g. tag.viewname!N.subview!M.

A Cursor is a reference to a specific Row in a specific View, i.e. a (view, index) tuple.

The specification of a View (either top-level or Subview) is called a Path. Thus, both tag.viewname and tag.viewname!N.subview are Paths. A trailing Row Index is allowed and ignored wherever a Path is expected.

A Cursor placed at the Nth Row is denotationally equivalent to the string "path!N". As a result, Cursors are allowed (and frequently used) as Path arguments. A Cursor need not point to an existing Row (its current position may be out of range).
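The notation above can be tried out with a short sketch. The tag db, the view name people, and the property names are made up for illustration, and the store is a temporary in-memory one:

```tcl
package require Mk4tcl

# In-memory store under the hypothetical tag "db" (no filename given).
mk::file open db

# "phones" is a Subview: a sublist holding its name and its own property list.
mk::view layout db.people {name {phones {category phone}}}

# db.people is a Path to a top-level View; db.people!0 denotes its first Row.
mk::row append db.people name "Ann"

# db.people!0.phones is a Path to the Subview inside Row 0.
mk::row append db.people!0.phones category home phone 555-1234
mk::row append db.people!0.phones category work phone 555-9876

# db.people!0.phones!1 denotes the second Row of that Subview.
set workPhone  [mk::get db.people!0.phones!1 phone]
set phoneCount [mk::view size db.people!0.phones]

mk::file close db
```

Every string of the form path!N used here could equally be held in a cursor variable.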


Opening, closing, and saving stores

The mk::file command is used to open and close Metakit Stores.

It is also used to force pending changes to disk (commit), to cancel the last changes (rollback), and to send/receive the entire contents of a store over a Tcl channel, including sockets (load/save).

mk::file open
mk::file open tag
mk::file open tag filename ?-readonly? ?-nocommit? ?-extend? ?-shared?

Without arguments, mk::file open returns the list of tags and filenames of all datasets which are currently open (of the form tag1 name1 tag2 name2 ...).

The mk::file open command associates a store with a unique symbolic tag. A tag must consist of alphanumeric characters, and is used in the other commands to refer to a specific open store. If filename is omitted, a temporary in-memory dataset is created (which cannot use commit, but which you could save to an I/O channel).

When a store is closed, all pending changes will be written to file, unless the -nocommit option is specified. In that case, only an explicit commit will save changes.

To open a file only for reading, use the -readonly option. Stores can be opened read-only by any number of readers, or by a single writer (no other combinations are allowed).

There is an additional mode, specified by the -extend option: in this case changes are always written at the end of the store. This allows modifications by one writer without affecting readers. Readers can adjust to new changes made that way by doing a "rollback" (see below). The term is slightly confusing in this case, since it really is a "roll-forward".

The -shared option causes an open store to be visible in every Tcl interpreter, with thread locking as needed. The store is still tied to the current interpreter and will be closed when that interpreter is terminated.
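As a rough sketch of the open variants described above: the file name demo_open.mk and the tags db, tmp, and ro are assumptions chosen for this example only.

```tcl
package require Mk4tcl

# Hypothetical data file in the current directory; start from a clean slate.
set fn [file join [pwd] demo_open.mk]
file delete $fn

# File-backed store under tag "db", plus an in-memory store under tag "tmp".
mk::file open db $fn
mk::file open tmp

# Without arguments the result is a flat list: tag1 name1 tag2 name2 ...
array set openStores [mk::file open]
set tags [lsort [array names openStores]]

# Store one row; closing without -nocommit saves pending changes to file.
mk::view layout db.notes {text}
mk::row append db.notes text "hello"
mk::file close tmp
mk::file close db

# Reopen the same file for reading only and check what was saved.
mk::file open ro $fn -readonly
set savedRows [mk::view size ro.notes]
mk::file close ro

file delete $fn
```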

mk::file views tag

The mk::file views command returns a list with the views currently defined in the open store associated with tag. You can use the mk::view layout command to determine the current structure of each view.

mk::file close tag

The mk::file close command closes the store and releases all associated resources. If not opened with -readonly or -nocommit, all pending changes will be saved to file before closing it. A tag loses its special meaning after the corresponding store has been closed.

mk::file commit tag ?-full?

The mk::file commit command flushes all pending changes to disk. It should not be used on a file opened with the -readonly option. The optional -full argument is only useful when a commit-aside is active (see below). In that case, changes are merged back into the main store instead of being saved separately. The aside dataset is cleared.

mk::file rollback tag ?-full?

The mk::file rollback command cancels all pending changes and reverts the situation to match what was last stored on file. When commit-aside is active, a full rollback causes the state to be rolled back to what it was without the aside changes. The aside dataset will be ignored from then on.
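The interplay of commit and rollback might look like the following sketch. The file name demo_commit.mk, the tag db, and the view log are hypothetical; -nocommit is used so that only the explicit commit saves anything.

```tcl
package require Mk4tcl

# Hypothetical data file; remove any leftover from an earlier run.
set fn [file join [pwd] demo_commit.mk]
file delete $fn

# With -nocommit, nothing is saved unless commit is called explicitly.
mk::file open db $fn -nocommit
mk::view layout db.log {msg}

mk::row append db.log msg "first"
mk::file commit db                       ;# "first" is now on file

mk::row append db.log msg "second"       ;# pending, not yet committed
set before [mk::view size db.log]

mk::file rollback db                     ;# drop the pending change
set after [mk::view size db.log]

mk::file close db
file delete $fn
```

Here before counts both rows, while after reflects only what had been committed.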

mk::file load tag channel
mk::file save tag channel

The mk::file save command sends the entire contents of a store to any Tcl channel, and the mk::file load command replaces all views with data read from any Tcl channel. The data to load must have been generated using mk::file save. Loaded changes are made permanent when commit is called (explicitly or implicitly, when a store is closed), or they can be reverted by calling rollback.
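A minimal sketch of a save/load round trip through an ordinary file channel follows. The tags src and dst, the view items, and the dump file name are assumptions for illustration; putting the channels in binary mode is a precaution taken here, not something the text above prescribes.

```tcl
package require Mk4tcl

# Hypothetical scratch file used as the transport between the two stores.
set dump [file join [pwd] demo_dump.bin]

# Source: an in-memory store with a little data in it.
mk::file open src
mk::view layout src.items {name qty:I}
mk::row append src.items name bolt qty 12

# Write the whole store to a channel opened for binary output.
set out [open $dump w]
fconfigure $out -translation binary
mk::file save src $out
close $out

# Destination: a second in-memory store, filled from the same data.
mk::file open dst
set in [open $dump r]
fconfigure $in -translation binary
mk::file load dst $in
close $in

set copied [mk::get dst.items!0 name qty]

mk::file close src
mk::file close dst
file delete $dump
```

The same calls should work with a socket in place of the file channel.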

mk::file aside tag tag2

The mk::file aside command starts a special "commit-aside" mode, whereby changes are saved to a second database file. This can be much faster than standard commits, because only changes are saved. In commit-aside mode, the main store will not be modified at all; in fact, it can be opened in read-only mode.

mk::file autocommit tag

The mk::file autocommit command sets up a database file to automatically issue a commit when the file is closed later. This is useful if the file was initially opened in -nocommit mode, but you now want to change this setting (there is no way to return to -nocommit, although a rollback has a similar effect).


View structure and size operations

The mk::view command is used to query or alter the structure of a view in a store (layout, delete), as well as the number of rows it contains (size). The last command (info) returns the list of properties currently defined for a view.

Note that the layout and delete sub-commands operate only on top-level views (of the form tag.view), whereas size and info take a path as argument, which is either a top-level view or a nested subview (of the form 'tag.view!index.subview!subindex...etc...subview').

mk::view layout tag.view
mk::view layout tag.view {structure}

The mk::view layout command returns a description of the current datastructure of tag.view. If a structure is specified, the current data is restructured to match that, by adding new properties with a default value, deleting obsolete ones, and reordering them.

Structure definitions consist of a list of properties. Subviews are specified as a sublist of two entries: the name and the list of properties in that subview. Note that subviews add two levels of nesting: for example, {name {phones {category phone}}} defines a phones subview with the properties category and phone. The type of a property is specified by appending a suffix to the property name (the default type is string):

  • :S - A string property for storing strings of any size, but no null bytes.
  • :I - An integer property for efficiently storing values as integers (1..32 bits).
  • :L - A long property for storing values as 64-bit integers.
  • :F - A float property for storing single-precision floating point values (32 bits).
  • :D - A double property for storing double-precision floating point values (64 bits).
  • :B - A binary property for untyped binary data (including null bytes).

Properties which are not listed in the layout will only remain set while the store is open, but not be stored. To make properties persist, you must list them in the layout definition, and do so before setting them.
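To make the structure syntax concrete, here is a small sketch that defines a view with typed properties and a subview, then restructures it. The view name people and all property names are illustrative assumptions.

```tcl
package require Mk4tcl

mk::file open db

# Two string properties, one integer property, and a "phones" subview.
mk::view layout db.people {first last shoesize:I {phones {category phone}}}
mk::row append db.people first John last Lennon shoesize 44

# Restructure: add a double property "height"; existing rows get a default.
mk::view layout db.people {first last shoesize:I height:D {phones {category phone}}}

# Query what the store now contains.
set views     [mk::file views db]
set structure [mk::view layout db.people]
set stillJohn [mk::get db.people!0 first]

mk::file close db
```

The exact text returned in structure is not shown here, since its formatting of type suffixes may differ between versions.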

mk::view delete tag.view

The mk::view delete command completely removes a view and all the data it contains from a store.

mk::view size path
mk::view size path size

The mk::view size command returns the number of rows contained in the view identified by path. If a size argument is specified, the size of the view is adjusted accordingly, dropping the highest rows if the size is decreased or adding new empty ones if the size is increased. The command mk::view size path 0 deletes all rows from a view, but keeps the view in the store so rows can be added again later (unlike mk::view delete).

mk::view info path

The mk::view info command returns the list of properties which are currently defined for path.
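The size and info sub-commands can be sketched as follows, using a hypothetical view named tasks in an in-memory store:

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.tasks {title done:I}

# Grow the view to three empty rows, then read the size back.
mk::view size db.tasks 3
set grown [mk::view size db.tasks]

# Shrink to zero: all rows go away, but the view itself remains defined.
mk::view size db.tasks 0
set emptied [mk::view size db.tasks]

# The property list is still available for the (now empty) view.
set props [mk::view info db.tasks]

mk::file close db
```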


Cursor variables for positioning

The mk::cursor command is used to manipulate 'cursor variables', which offer an efficient means of iterating and repositioning a 'reference to a row in a view'. Though cursors are equivalent to strings of the form somepath!N, it is much more efficient to keep a cursor around in a variable and to adjust it (using the position subcommand) than to evaluate a 'somepath!$index' expression every time a cursor is expected.

mk::cursor create name ?path? ?index?

The mk::cursor create command defines (or redefines) a cursor variable. The index argument defaults to zero. This is a convenience function, since mk::cursor create X somePath N is equivalent to set X somePath!N.

When both path and index arguments are omitted from the mk::cursor create command, a cursor pointing to an empty temporary view is created, which can be used as buffer for data not stored on file.

mk::cursor position name
mk::cursor position name 0
mk::cursor position name end
mk::cursor position name index

The mk::cursor position command returns the current position of a cursor, i.e. the 0-based index of the row it is pointing to. If an extra argument is specified, the cursor position will be adjusted accordingly.

The 'end' pseudo-position is the index of the last row (or -1 if the view is currently empty). Note that if 'X' is a cursor equivalent to somePath!N, then mk::cursor position X M is equivalent to the far less efficient 'set X somePath!M'.

mk::cursor incr name ?step?

The mk::cursor incr command adjusts the current position of a cursor with a specified relative step, which can be positive as well as negative. If step is zero, then this command does nothing. The command mk::cursor incr X N is equivalent to mk::cursor position X [expr {[mk::cursor position X] + N}].
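A short sketch of the three cursor sub-commands working together; the view nums, its property n, and the cursor variable name c are made up for this example.

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.nums {n:I}
foreach v {10 20 30 40} {
    mk::row append db.nums n $v
}

# Cursor variable "c" starts at row 1 of db.nums.
mk::cursor create c db.nums 1
set atOne [mk::get $c n]

# Move two rows forward, then ask where the cursor is.
mk::cursor incr c 2
set posAfterIncr [mk::cursor position c]

# Jump to the last row, then back to the first one.
mk::cursor position c end
set posAtEnd [mk::cursor position c]
mk::cursor position c 0
set atZero [mk::get $c n]

mk::file close db
```

Note that the sub-commands take the variable name (c), while commands expecting a cursor take its value ($c).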


Create, insert, and delete rows

The mk::row command deals with one or more rows of information. There is a command to allocate a temporary row which is not part of any store (create), and the usual set of container operations: appending, inserting, deleting, and replacing rows.

mk::row create ?prop value ...?

The mk::row create command creates an empty temporary row, which is not stored in any store. Each temporary row starts out without any properties. Setting a property in a row will implicitly add that property if necessary. The return value is a unique cursor, pointing to this temporary row. The row (and all data stored in it) will cease to exist when no cursor references to it remain.

mk::row append path ?prop value ...?

The mk::row append command extends the view with a new row, optionally setting some properties in it to the specified values.

mk::row insert cursor count ?cursor2?

The mk::row insert command is similar to the append sub-command, inserting the new row in a specified position instead of at the end. The count argument can be used to efficiently insert multiple copies of a row.

mk::row delete cursor ?count?

The mk::row delete command deletes one or more rows from a view, starting at the row pointed to by cursor.

mk::row replace cursor ?cursor2?

The mk::row replace command replaces one row with a copy of another one, or clears its contents if cursor2 is not specified.
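The row operations above can be combined as in this sketch. The view names and its single property name are assumptions, and the comments track the expected contents after each step.

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.names {name}

# append: rows are now {alpha gamma}
mk::row append db.names name alpha
mk::row append db.names name gamma

# insert: one empty row at index 1, filled in afterwards -> {alpha beta gamma}
mk::row insert db.names!1 1
mk::set db.names!1 name beta

# create + replace: overwrite row 0 with a copy of a temporary row
# -> {omega beta gamma}
set tmp [mk::row create name omega]
mk::row replace db.names!0 $tmp

# delete: remove one row starting at index 2 -> {omega beta}
mk::row delete db.names!2 1

# Collect what is left, in row order.
set names {}
mk::loop c db.names {
    lappend names [mk::get $c name]
}

mk::file close db
```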


Fetch values

The mk::get command fetches values from the row specified by cursor.

mk::get cursor ?-size?
mk::get cursor ?-size? prop ...

Without property arguments, mk::get returns a list of 'prop1 value1 prop2 value2 ...'. This format is most convenient for setting an array variable, since the result can be passed directly to array set.

If the -size option is specified, the size of property values is returned instead of their contents. This is normally in bytes, but for integers it can be a negative value indicating the number of bits used to store ints (-1, -2, or -4). This is an efficient way to determine the sizes of property values without fetching them.

If arguments are specified in the get command, they are interpreted as property names and a list will be returned containing the values of these properties in the specified order.

If cursor does not point to a valid row, default values are returned instead (no properties, and empty strings or numeric zeros, according to the property types).
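The different forms of mk::get might be used as in the sketch below; the view people and its properties are hypothetical.

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.people {name age:I}
mk::row append db.people name Ann age 31

# No property names: a 'prop value ...' list, handy for array set.
array set rec [mk::get db.people!0]

# One property name: just that value.
set age [mk::get db.people!0 age]

# Several property names: a list of values in the order requested.
set pair [mk::get db.people!0 name age]

# -size: storage sizes instead of contents (the result is not interpreted here).
set sizes [mk::get db.people!0 -size name age]

mk::file close db
```

After array set, the row can be read as $rec(name) and $rec(age).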


Store values

The mk::set command stores values into the row specified by cursor.

mk::set cursor ?prop value ...?

If a property is specified which does not exist, it will be appended as a new definition for the containing view. As an important side effect, all other rows in this view will now also have such a property, with an appropriate default value for the property. Note that when new properties are defined in this way, they will be created as string properties unless qualified by a type suffix (see 'mk::view layout' for details on property types and their default values).

Using the mk::set command without specifying properties returns the current value and is identical to mk::get.

If cursor points to a non-existent row past the end of the view, an appropriate number of empty rows will be inserted first.
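As a sketch of storing values, including the automatic insertion of empty rows described above (the view cfg and its properties are assumptions for this example):

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.cfg {key value}

# The view is empty, so setting row 2 first creates empty rows 0 and 1.
mk::set db.cfg!2 key color value blue
set size [mk::view size db.cfg]

# Fill in one of the rows that was created empty.
mk::set db.cfg!0 key width value 80
set row0 [mk::get db.cfg!0 key value]

# Row 1 was never set, so it still holds the string default.
set row1key [mk::get db.cfg!1 key]

mk::file close db
```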


Iterate over the rows of a view

mk::loop cursor {body}
mk::loop cursor path {body}
mk::loop cursor path first ?limit? ?step? {body}

The mk::loop command offers a convenient way to iterate over the rows of a view. Iteration can be restricted to a certain range, and can optionally use a forward or backward step. This is a convenience function which is more efficient than performing explicit iteration over an index and positioning a cursor.

When called with just a path argument, the loop will iterate over all the rows in the corresponding view. The cursor loop variable will be set (or reset) on each iteration, and is created if it did not yet exist.

When path is not specified, the cursor variable must exist and be a valid cursor, although its current position will be ignored. The command mk::loop X {...} is identical to mk::loop X $X {...}.

The first argument specifies the first index position to use (default 0), the limit argument specifies the index position at which the loop ends (default 'end'), and the step argument specifies the increment (default 1). If step is negative and limit exceeds first, then the loop body will never be executed. A zero step value can lead to infinite looping unless the break command is called inside the loop.

The first, limit, and step arguments may be arbitrary integer expressions and are evaluated exactly once when the loop is entered.

Note that you cannot easily use a loop to insert or delete rows, since changes to views do not adjust cursors pointing into that view. Instead, you can use tricks like moving backwards (for deletions), or splitting the work into two separate passes.
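The sketch below iterates forward with mk::loop and then deletes rows while moving backwards. The backward pass uses a plain Tcl for loop over indices, a choice made here to keep the index arithmetic explicit; the view nums and property n are hypothetical.

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.nums {n:I}
foreach v {1 2 3 4 5 6} {
    mk::row append db.nums n $v
}

# Forward pass over all rows; "c" is created as the loop cursor.
set seen {}
mk::loop c db.nums {
    lappend seen [mk::get $c n]
}

# Delete even values, highest index first, so that removing a row
# never shifts the position of rows still to be visited.
for {set i [expr {[mk::view size db.nums] - 1}]} {$i >= 0} {incr i -1} {
    if {[mk::get db.nums!$i n] % 2 == 0} {
        mk::row delete db.nums!$i
    }
}

# Second forward pass to see what remains.
set remaining {}
mk::loop c db.nums {
    lappend remaining [mk::get $c n]
}

mk::file close db
```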


Selection and sorting

The mk::select command combines a flexible selection operation with a way to sort the resulting set of rows. The result is a list of row index numbers (possibly empty), which can be used to reposition a cursor and to address rows directly.

mk::select path ?options ...?

A selection is specified using any combination of these criteria:

Selection criteria

  • prop value: Numeric or case-insensitive match
  • -min prop value: Property must be greater than or equal to value (case is ignored)
  • -max prop value: Property must be less than or equal to value (case is ignored)
  • -exact prop value: Exact case-sensitive string match
  • -glob prop pattern: Match "glob-style" wildcard expression
  • -globnc prop pattern: Match "glob-style" expression, ignoring case
  • -regexp prop pattern: Match specified regular expression
  • -keyword prop word: Match word as free text or partial prefix

If multiple criteria are specified, then selection succeeds only if all criteria are satisfied. If prop is a list, selection succeeds if any of the given properties satisfies the corresponding match.

Selection Constraints

  • -first pos: Selection starts at specified row index
  • -count num: Return no more than this many results

Note: these constraints are not very useful in combination with sorting, because sorting is done after they have been applied.

Selection Sorting

To sort the set of rows (with or without preliminary selection), use:

  • -sort prop or -sort {prop ...}: Sort on one or more properties, ascending
  • -rsort prop or -rsort {prop ...}: Sort on one or more properties, descending

Multiple sort options are combined in the order given.
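A few of the selection and sorting options can be sketched on a three-row view. The view people and its data are invented for the example; the comments give the row indices each call is expected to return.

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.people {name country age:I}
mk::row append db.people name Mark country Canada      age 40   ;# row 0
mk::row append db.people name Jean country Netherlands age 45   ;# row 1
mk::row append db.people name Jeff country Canada      age 30   ;# row 2

# Plain 'prop value' match: rows 0 and 2.
set canadians [mk::select db.people country Canada]

# No criteria, just sorting by age (ascending): rows 2, 0, 1.
set byAge [mk::select db.people -sort age]

# Range criterion combined with a descending sort: rows 1, 0.
set older [mk::select db.people -min age 35 -rsort age]

# Glob match on names starting with J: rows 1 and 2.
set jNames [mk::select db.people -glob name J*]

# Row indices from a selection can be used to address rows directly.
set oldest [mk::get db.people![lindex $older 0] name]

mk::file close db
```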


Channel interface

The mk::channel command provides a channel interface to binary fields. It needs the Path of a Row and the name of a binary Prop, and returns a channel descriptor which can be used for reading or writing.

mk::channel path prop ?mode?

Channels are opened in one of three modes:

  • read: open for reading existing contents
  • write: clear contents and start saving data
  • append: keep contents, set seek pointer to end

Note: do not insert or delete rows in a view within which there are open channels, because subsequent reads and writes may end up going to the wrong property.
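The three modes could be exercised as in this sketch. The view docs and the binary property body are assumptions, and switching the channels to binary translation is a precaution taken in this example rather than a documented requirement.

```tcl
package require Mk4tcl

mk::file open db
mk::view layout db.docs {title body:B}
mk::row append db.docs title readme

# write: clear the property and store new data through the channel.
set ch [mk::channel db.docs!0 body write]
fconfigure $ch -translation binary
puts -nonewline $ch "hello"
close $ch

# append: keep the existing contents and continue at the end.
set ch [mk::channel db.docs!0 body append]
fconfigure $ch -translation binary
puts -nonewline $ch " channel"
close $ch

# read: fetch the complete contents back.
set ch [mk::channel db.docs!0 body read]
fconfigure $ch -translation binary
set text [read $ch]
close $ch

mk::file close db
```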

Examples

Adapted from the Mark Roseman tutorial.

# load Metakit into our application; Mk4tcl is the name of the Tcl extension
package require Mk4tcl

# open a datafile named mydata.db; we'll refer to it with the tag 'db'
mk::file open db mydata.db

# create a view within the datafile which describes what we'll store
set view [mk::view layout db.addressbook "name country"]

# create a bunch of new rows in the view to store our data
mk::row append $view name "Mark Roseman" country "Canada"
mk::row append $view name "Jean-Claude Wippler" country "The Netherlands"
mk::row append $view name "Jeff Hobbs" country "Canada"
# commit takes the store's tag, not a view path
mk::file commit db

# search for all living in Canada and print their names
foreach row [mk::select $view country "Canada"] {
    puts [mk::get $view!$row name]
}

# close the datafile
mk::file close db