Version 15 of e4graph

Updated 2003-05-15 23:02:52

Get it from: http://sourceforge.net/projects/e4graph/ Home page at: http://www.e4graph.com/e4graph/

e4graph 1.0a7 is out. Get it from the above URLs!

e4Graph is a C++, Tcl and Java library that allows programs to store graph-like data persistently and to access and manipulate that data efficiently. With e4Graph, you can arrange your data in the most natural form that reflects the relationships between its parts, rather than having to force it into a table-like format. The e4Graph library also allows you to concentrate on the relationships you want to represent, and not on how to store them in a database. You can modify data items, and add and remove connections and relationships between pieces of data on the fly. e4Graph allows you to represent an unlimited number of different connections between pieces of data, and your program can selectively manipulate the data according to the relationships it cares about, not having to know about other connections represented in the data set. [Cool things: swappable, even scriptable, back end, at least architecturally. Arbitrary data stores. Conceptual power. Embeddability. Portability of data and source. Screaming performance. ...]


Jacob Levy 05/15/2003. This is a proposal for some API changes to e4Graph. Please send comments to jyl at mod3 dot net.

I wanted everyone to have an opportunity to have a say in this discussion, so that I can make the right decision. The background is that I'm finding the current terminology of e4Graph hard to explain and understand, and non-standard as far as graph theory goes. So the following changes are intended to make the APIs more conformant to the common terminology as well as making e4Graph richer, more capable and easier to understand. I want to make the changes *now*, before e4Graph 1.0final comes out, so I will not have to support two very different APIs forever. But I will do what the majority of those responding ask for, so your input is very important.

Some of you may know already that e4Graph is a way to store arbitrary directed graphs of objects in persistent storage. I use Metakit as the storage mechanism, with great success -- Metakit has been extremely robust, fast, portable and useful, and besides, it's supported by the world's most responsive programmer, JCW :). e4Graph has language bindings for Tcl and Java, and now Python is being added. I'm also planning to make e4Graph use other DBs as storage, e.g. MySQL, PostgreSQL, etc.

e4Graph currently has just two concepts, nodes and vertices. Nodes are ordered collections of vertices, and vertices are a mechanism to transition from a node to a value (be it an integer, double, string, binary or another node). You'll note that the terminology is not graph-theory-standard; what e4Graph calls a vertex is usually called an edge in graph-theoretic terminology.

The new e4Graph API will be based on three concepts (instead of the existing two):

 1. Nodes stay the same.
 2. Vertices that lead from a node to another node are now edges.
 3. Vertices whose value is a scalar, string or binary are now attributes.

An attribute is simply an association between a string name and a typed value. In the new API, attributes can have *anything* as their value, including scalars, strings, binary values, edges or nodes.

Nodes thus now become ordered collections of edges. Additionally, edges and nodes can have any number of attributes. Previously vertices were named. The name now becomes one of the attributes on the equivalent edge.
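
As a rough illustration of the three proposed concepts, here is a minimal sketch that models them with plain Tcl dicts (Tcl 8.5 or later). It is not the e4Graph API: the procs newNode, newEdge and addEdge and the dict keys edges, attrs and target are hypothetical names invented for this example only.

```tcl
# Illustrative model only: plain Tcl dicts, NOT the e4Graph API.
# All proc names and dict keys below are hypothetical.

# A node is an ordered list of outgoing edges plus a dict of attributes.
proc newNode {} {
    return [dict create edges {} attrs {}]
}

# An edge leads to a target node and carries its own attributes.
# What used to be the vertex name becomes the "name" attribute of the edge.
proc newEdge {name target} {
    return [dict create target $target attrs [dict create name $name]]
}

# Append an edge to the end of a node's ordered edge list.
proc addEdge {graphVar from name to} {
    upvar 1 $graphVar g
    set edges [dict get $g $from edges]
    lappend edges [newEdge $name $to]
    dict set g $from edges $edges
}

# The graph maps node ids to nodes.
set graph [dict create root [newNode] child [newNode]]

addEdge graph root bar child           ;# edge named "bar" from root to child
dict set graph root attrs color red    ;# attribute attached to a node

set firstEdge [lindex [dict get $graph root edges] 0]
puts [dict get $firstEdge attrs name]   ;# bar
puts [dict get $firstEdge target]       ;# child
puts [dict get $graph root attrs color] ;# red
```

The point of the sketch is only that the edge name is no longer special: it is an ordinary attribute, stored the same way as any other attribute on an edge or a node.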

All of this implies large changes in the APIs of all language bindings. Formally, I can do this now, because e4Graph is still in Alpha (and I kept it in Alpha for a very long time for the express purpose of getting everything right the first time). However, I recognize that many projects already use e4Graph, some commercially. Therefore I solicit feedback.

Two alternatives, besides making the changes now:

 1. The first is to bring e4Graph 1.0 to the release version with the
    current API and then immediately abandon it. New development will go into
    e4Graph 2.0 which will be finished much faster than 1.0 was.
 2. Not make any changes, ever. e4Graph 1.0 is the only version, and we live
    with the non-standard terminology and without the flexibility afforded by
    the new attribute mechanism.

As might be obvious, I prefer the proposal I'm making here. My second choice, less desirable, is to finish 1.0 and move on immediately to 2.0 while supporting 1.0 with bug fixes (but NO NEW DEVELOPMENT).

Please send in your comments. I will wait for a couple of weeks before deciding anything.


Jacob Levy 05/15/2003 Has anyone thought about making the struct::graph and struct::tree packages in tcllib be wrappers for e4Graph storage? Is there any impedance mismatch? Would this be a cool thing to do?

AK May 15, 2003. Cool: Yes, IMHO. Impedance mismatch: Yes, for the current interface. For the proposed interface there is less mismatch. The concepts are the same (nodes, edges, and attributes for both nodes and edges). Missing in e4Graph are attributes associated with a graph as a whole instead of with its components; tcllib has them. Another possible mismatch is the addressing of nodes and edges in tcllib and e4Graph. This might need storage in the Tcl layer for translation, or maybe special attributes. Definitely feasible.

See also cgraph.

Jacob Levy 05/15/2003 OK, for attributes for whole graphs in tcllib, in e4Graph they could be attached to the root node. That way there's a simple way of finding them. Can you expand on the addressing issues?
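
The root-node convention suggested above might look roughly like the following sketch. Plain Tcl dicts stand in for a storage here; this is neither the e4Graph nor the tcllib API, and the procs graphSet and graphGet and the key names are hypothetical.

```tcl
# Hypothetical sketch: graph-wide attributes kept on the root node.
# Plain Tcl dicts stand in for a storage; NOT the e4Graph or tcllib API.

# A storage with a designated root node; every node has its own attribute dict.
set storage [dict create root [dict create attrs {}] n1 [dict create attrs {}]]

# Set a graph-wide attribute by writing it on the root node.
proc graphSet {storageVar key value} {
    upvar 1 $storageVar s
    dict set s root attrs $key $value
}

# Get a graph-wide attribute by reading it back from the root node.
proc graphGet {storage key} {
    return [dict get $storage root attrs $key]
}

graphSet storage title "family tree"
dict set storage n1 attrs label "Alice"   ;# ordinary per-node attribute

puts [graphGet $storage title]            ;# family tree
puts [dict get $storage n1 attrs label]   ;# Alice
```

One thing a wrapper would have to settle is how to keep graph-wide attributes from clashing with attributes that belong to the root node itself, for instance by reserving a name prefix for them.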


e4Graph makes a prominent appearance in a developerWorks article [L1 ] on high-performance XML (which subject, incidentally, ought also to focus attention on tDOM).

28sep02 jcw - The above link to the article appears to be broken, unfortunately... -- Dot in the wrong place SC


27Sep02 Jacob Levy e4Graph has a very powerful XML binding: it is able to store XML natively and allows natural manipulation of XML data as tree-structured data. It also has an excellent Tcl binding, and provides an extensible object model where Tcl procedures can be stored inside the graph and extend the functionality of nodes.

Here's an example of this idea:

 % load tgraph                   (1)
 % set s [tgraph::open foo.db]   (2)
 % set r [$s root]               (3)
 % $r add bar last hello         (4)
 % $r method square {x} {        (5)
         return [expr {$x * $x}]
   }
 % $r call square 5              (6)
 => 25

Explanation: line 1 loads the tgraph library, which contains the Tcl binding for e4Graph. Line 2 opens the storage 'foo.db' and stores the result in 's'. Line 3 makes 'r' refer to the root node of this storage. Line 4 adds a new vertex named 'bar' as the last vertex with the string value 'hello'. Line 5 defines the method 'square' which takes one argument, a number, and returns the square of it. Finally, line 6 invokes the method to compute the square of 5.


[ Category Package | Category Mathematics | Category Data Structure ]