Version 7 of rope

Updated 2008-01-25 11:32:44 by dkf

A rope is a data structure used to represent strings. Ropes are typically implemented with B-trees [L1 ] of (possibly immutable) strings. The potential immutability of source strings makes it a good candidate for a mutable string structure on systems where the native string representation is immutable, such as Java. Also, ropes can be used to mix data from disjoint sources, e.g. stack, heap, memory-mapped files, on-demand content, etc.

The Tk text widget uses a B-tree to store textual data along with tags, so it's some sort of augmented rope, where each node has associated metadata. TIP #169 [L2 ] (Add Peer Text Widgets), implemented in Tk8.5, already involved uncoupling the B-Tree from the widget. A native rope object type could be used to push this idea even further, by allowing data sharing outside of text widgets. For example, a comment dated 24Jan2004 on the text page expresses the wish to share this structure with tdom.

The B-Tree code used in the text widget could serve as a basis for a rope object type, as the code is mature and widely tested.

Cloverfield suggests that ropes be used in place of flat strings as the primary string representation in Tcl.

Each substring node could have metadata attached. For example, the string representation of a proc body could point to the original sourced string, with metadata giving the file name. This could be used to generate meaningful stack traces with line numbers and file names. See also Tcl9 and annotated strings.

See [L3 ] for a real-world implementation of ropes in C ("C Cords").

SGI provides a rope template class in its STL implementation [L4 ].


RS 2008-01-25: Could someone give an example of a string rep for a rope - i.e. demonstrate how the internals can be turned outside? I roughly assume it to be a relation like of text widget's get method (text content only) vs. dump method ("including the text and information about marks, tags, and embedded windows")... Even if metadata is kept hidden at runtime, it has to be serialized for channel transport (files, sockets, pipes, etc.)


DKF 2008-01-25: The text widget B-tree is only partially decoupled from the text widget, and it's not as widely tested as it should be (it had major changes in 8.5 to support smooth-scrolling and other related changes [L5 ]). It's not at all clear to me that it's the best place to start from for your own implementation either; it's very focussed on being a model for 2-D strings (it's a collection of paragraphs containing rich text).


See also Abstract Data Types.