huddle

Tcl's containers are very simple, but they cannot be distinguished from one another: a list, a dict, and a plain string can all have the same string representation.

This makes code concise to write, but it becomes a serious drawback when data has to be read from or written to a file.


Huddle provides a generic Tcl-based serialization/intermediary format. Currently, each node is wrapped in a tag with simple type information. A huddle object can contain a mix of dicts and lists. Other types can be added through user-supplied callbacks. http://sourceforge.net/tracker/index.php?func=detail&aid=1970893&group_id=12883&atid=362883


AK: Let me try to describe it in my own words.

  1. Huddle provides a generic Tcl-based serialization format
  2. The entries in that format are tagged with simple type information
  3. The currently known types are 'L' for list and 'D' for dict (AMG: also 's' for string, 'num' for number, 'b' for true/false, and 'null')
  4. When converting huddle-notation to other serialization formats like JSON or YAML this type information is used to select the proper notation.
  5. And when going from JSON/YAML/... to huddle their notation can be used to select the proper huddle type.
  6. In that manner huddle can serve as a common intermediary format.
  7. The nice thing about its notation is that Tcl can read this format directly (list/dict commands) without the need for a special parser.
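
To illustrate the last point, here is a minimal sketch that takes a huddle value apart with nothing but core list and dict commands. The value is copied from the working sample on this page; the variable names are only for illustration, and the layout (label, then a tag/payload pair) is assumed from the output shown here rather than from a format specification.

```tcl
# A huddle value as printed by [huddle create a b c d] in the sample.
set h {HUDDLE {D {a {s b} c {s d}}}}

# The value is a two-element list: the HUDDLE label and the tagged root node.
lassign $h label root

# The root node is again a two-element list: type tag and payload.
lassign $root tag payload
puts "label=$label tag=$tag"

# For tag D the payload is a dict whose values are themselves tagged nodes.
dict for {key node} $payload {
    lassign $node childTag childValue
    puts "$key -> $childValue (type $childTag)"
}
```

This needs Tcl 8.5 or later for lassign and dict. For real code the huddle commands (get, gets, strip) are the better choice, since they do not depend on the internal layout.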

Working Sample:

# create as a dict
% set bb [huddle create a b c d]
HUDDLE {D {a {s b} c {s d}}}

# create as a list
% set cc [huddle list e f g h]
HUDDLE {L {{s e} {s f} {s g} {s h}}}
% set bbcc [huddle create bb $bb cc $cc]
HUDDLE {D {bb {D {a {s b} c {s d}}} cc {L {{s e} {s f} {s g} {s h}}}}}
% set folding [huddle list $bbcc p [huddle list q r] s]
HUDDLE {L {{D {bb {D {a {s b} c {s d}}} cc {L {{s e} {s f} {s g} {s h}}}}} {s p} {L {{s q} {s r}}} {s s}}}

# normal Tcl's notation
% huddle strip $folding
{bb {a b c d} cc {e f g h}} p {q r} s

# get a sub node
% huddle get $folding 0 bb
HUDDLE {D {a {s b} c {s d}}}
% huddle gets $folding 0 bb
a b c d

# overwrite a node
% huddle set folding 0 bb c kkk
HUDDLE {L {{D {bb {D {a {s b} c {s kkk}}} cc {L {{s e} {s f} {s g} {s h}}}}} {s p} {L {{s q} {s r}}} {s s}}}

# remove a node
% huddle remove $folding 2 1
HUDDLE {L {{D {bb {D {a {s b} c {s kkk}}} cc {L {{s e} {s f} {s g} {s h}}}}} {s p} {L {{s q}}} {s s}}}
% huddle strip $folding
{bb {a b c kkk} cc {e f g h}} p {q r} s

# dump as a JSON stream
% huddle jsondump $folding
[
  {
    "bb": {
      "a": "b",
      "c": "kkk"
    },
    "cc": [
      "e",
      "f",
      "g",
      "h"
    ]
  },
  "p",
  [
    "q",
    "r"
  ],
  "s"
]
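
The JSON dump above depends on the type tags to pick the right notation: 'D' becomes an object, 'L' an array, 's' a quoted string. The following is a rough sketch of that idea in plain Tcl. It is not the huddle implementation: nodeToJson and quoteJson are hypothetical helpers, only the tags D, L and s are handled, string escaping covers just backslash and double quote, and the output is compact rather than indented.

```tcl
# Hypothetical helper: quote a string for JSON (backslash and quote only).
proc quoteJson {str} {
    set escaped [string map [list \\ \\\\ \" \\\"] $str]
    return "\"$escaped\""
}

# Hypothetical helper: convert one tagged node (without the HUDDLE label).
proc nodeToJson {node} {
    lassign $node tag payload
    switch -- $tag {
        D {
            # Dict payload: keys map to tagged child nodes.
            set parts {}
            dict for {key child} $payload {
                lappend parts "[quoteJson $key]: [nodeToJson $child]"
            }
            return "{[join $parts {, }]}"
        }
        L {
            # List payload: every element is a tagged child node.
            set parts {}
            foreach child $payload {
                lappend parts [nodeToJson $child]
            }
            return "\[[join $parts {, }]\]"
        }
        s {
            return [quoteJson $payload]
        }
        default {
            error "unsupported type tag: $tag"
        }
    }
}

# Strip the HUDDLE label with lindex, then convert the root node.
puts [nodeToJson [lindex {HUDDLE {D {a {s b} c {s d}}}} 1]]
```

Without the tags the converter would have to guess whether "a b c d" is a string, a list of four items, or a dict with two keys, which is exactly the ambiguity huddle is meant to remove.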

Currently, the library is available in the tcllib sources (https://core.tcl-lang.org/tcllib/dir?name=modules/yaml/), where it is used to implement the YAML library.


Comments

Lars H: The feature demonstrated in the folding example strikes me as somewhat dangerous; it seems to imply that you cannot store a string that looks like a huddle inside a huddle without having it interpreted as such and fused with the huddle. I suppose it is by design, and an edge case that "won't happen accidentally in real life", but it's a kind of thing that worries me deeply. To protect against it, one would probably have to put all strings in some kind of string container.

kanryu 20080610: As pointed out, the label "HUDDLE" has a special meaning in huddle objects. Care is needed when storing a string that starts with this label as a child node.

# a string that looks like a huddle
% set hh {HUDDLE {like string}}
HUDDLE {like string}

# This is not correct: the string is taken for a huddle node
% huddle create p q r $hh
HUDDLE {D {p {s q} r {like string}}}

# The string needs to be wrapped as a node first.
% set ff [huddle wrap s $hh]
HUDDLE {s {HUDDLE {like string}}}
% huddle create p q r $ff
HUDDLE {D {p {s q} r {s {HUDDLE {like string}}}}}

In all other cases, huddle assumes nothing about the content of a node except that it can be handled as an element of a Tcl list. Therefore not only simple English words but also multi-byte characters, binary data, and so on can be stored.
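
The ambiguity discussed in this thread can be reproduced in plain Tcl. The sketch below assumes, based on the sample output above, that a value is recognised as a huddle by its leading "HUDDLE" label; the variable names are made up for illustration and no huddle command is called.

```tcl
# A plain string that merely looks like a huddle, and the wrapped form
# printed by [huddle wrap s $hh] in the example above.
set plain   {HUDDLE {like string}}
set wrapped {HUDDLE {s {HUDDLE {like string}}}}

# At the Tcl level both are two-element lists with the same first element,
# so a check on the label alone cannot tell data from structure.
puts [expr {[lindex $plain 0] eq [lindex $wrapped 0]}]
puts [list [llength $plain] [llength $wrapped]]

# Only the wrapped form carries the original string intact as a payload.
puts [lindex $wrapped 1 1]
```

The last line prints the original string, label included, which is why wrapping it in an 's' node keeps it from being fused into the surrounding structure.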


dbohdan 2014-08-02: It's worth pointing out that the yaml package uses its own set of huddle type tags, which correspond to a subset of those in the YAML spec.

eltclsh > package req yaml
0.3.6
eltclsh > package req huddle
0.1.5
eltclsh > ::yaml::yaml2huddle [::yaml::huddle2yaml {HUDDLE {L {{s a} {D {b {s c}}}}}}]
HUDDLE {!!seq {{!!str a} {!!map {b {!!str c}}}}}
eltclsh > tail -n 13 /usr/share/tcl8.5/tcllib-1.15/yaml/yaml.tcl
huddle addType ::yaml::_huddle_mapping
huddle addType ::yaml::_huddle_sequence
huddle addType [::yaml::_makeChildType string !!str]
huddle addType [::yaml::_makeChildType string !!timestamp]
huddle addType [::yaml::_makeChildType string !!float]
huddle addType [::yaml::_makeChildType string !!int]
huddle addType [::yaml::_makeChildType string !!null]
huddle addType [::yaml::_makeChildType string !!true]
huddle addType [::yaml::_makeChildType string !!false]
huddle addType [::yaml::_makeChildType string !!binary]
huddle addType [::yaml::_makeChildType plain !!plain]

I think using those longer tags actually makes the format more readable for humans.

See Also

Alternative JSON
JSON codec with a similar tagged data format