Tclon

Sarnold: An attempt to produce a Tcl-equivalent to JSON, a data format easily parsable by Web applications. Here we have a simple code to handle this format at Tcl-level, with a little sugar to handle types, that is required by the Javascript version.

Lars H 2008-02-06: Do you have an example of what the data format looks like? Also, what is the point of doing a more Tclish format in this case? I'd expect it to be easier to teach Tcl to parse a data structure with Javascript syntax than vice versa, but on the other hand it is probably good for the mountain to come to Muhammed every once in a while too.

Sarnold: The data format maps Tcl's list. The structure of the data is separate from the data itself, due to EIAS.

The data structure is also a list.

Data structure syntax for Tclon:

list : type ?type ...?
   | list '*' # the star matches one or more values
   | '*' # same remark
   | 'list' type # this is a list where all elements have the same type (the argument) 
type : name | list | dict
dict : 'dict'  # a non-formal dict (keys and values are just strings)
            | 'dict' type type; # a formal dict : second and third types are the types of keys and values, resp.

The data structure is a 'list' as the syntax shows.

Data format examples:

Structure:                  Data sample:
==========                  ============
*                           {a b {c d e}}
{dict}                      {{opt value opt2 value2}}
{{dict key {a b c}}}        {{colors {red green blue} values {100 200 300}}}

Javascript code:

var tclon = new Object();
tclon.tclon=function (input, types) {
    return tclon._tclon(tclon.parse(input), tclon.parse(types));
}

tclon._tclon = function (input, types) {
    if (types.length == 0 || (types.length==1 && types[0]=="*"))
        return input;
    var res=new Array(), i, tab, elt;
    for (i=0; i<input.length && i<types.length; i++) {
        var t = types[i];
        var d = input[i];
        if (typeof t == "string") {
            if (t == "dict")
                res.push(tclon.toDict(d));
            else if (t == 'list')
                res.push(d);
            else if (typeof d == "Array")
                res.push(d.join());
            else
                res.push(d);
        } else if (t[0]=="list") {
            tab = new Array();
            for (j in d) {
                tab.push(tclon._tclon(d[j], t[1]));
            }
            res.push(tab);
        } else if (t[0]=="dict") {
            if (t.length != 3) throw "invalid dict type";
            res.push(tclon.toDictWithType(d,t[1],t[2]));
        }
    }
    return res;
}

tclon.toDict = function (arr) {
    var dict = new Object();
    if(arr.length%2!=0) {
        alert(arr);
        throw "not a dict";
    }
    for (var i = 0; i <arr.length;i+=2)
        dict[arr[i]]=arr[i+1];
    return dict;
}

tclon.toDictWithType = function (arr, ktype, vtype) {
    var dict = new Object();
    if(arr.length%2!=0) {
        alert(arr);
        throw "not a dict";
    }
    for (var i = 0; i <arr.length;i+=2)
        dict[tclon._tclon(arr[i], ktype)]=tclon._tclon(arr[i+1], vtype);
    return dict;
}

tclon.StdEscapeSequences = {
    "{" : "{", "}": "}", "\"": "\"", "r": "\r", "t": "\t",
    "a": "\a", "b": "\b", "n": "\n", "v": "\v", "f": "\f"
};

tclon.getSeq = function (str, pos) {
    var my = str.substr(pos+1, 10);
    if (my.charAt(0)=="x")
        return String.fromCharCode(parseInt(my.substr(1,2), 16));
    if (my.charAt(0)=="u")
        return String.fromCharCode(parseInt(my.substr(1,4), 16));
    return tclon.StdEscapeSequences[my];
}

tclon.seqLen = function (str, pos) {
    var my = str.substr(pos+1, 10);
    if (my.charAt(0)=="x")
        return 3;
    if (my.charAt(0)=="u")
        return 4;
    return 1;
}

tclon.parse = function (str) {
    var tab;
    var res = new Array();
    var i = 0, level = 0, work="", c = 0, inString = false;
    res.push(new Array());
    for (i = 0; i <str.length; i++) {
        switch(str.charAt(i)) {
            case "{":
                if (inString) {
                    work+="{";
                    break;
                }
                level++;
                res.push(new Array());
                break;
            case "}":
                if (inString) {
                    work+="}";
                    break;
                }
                level--;
                if (level < 0) throw "unmatched braces";
                tab = res[res.length-1];
                if (str.charAt(i-1)!="}") tab.push(work);
                res.pop();
                res[res.length-1].push(tab);
                work = "";
                break;
            case "\\":
                // Escape sequence
                work+=tclon.getSeq(str, i+1);
                i+=tclon.escapeSeq(str, i+1);
                break;
            case "\"":
                inString = !instring;
                break;
            default:
                if (str.charAt(i).match("[ \t\r\n\v\f]")) {
                    if (str.charAt(i-1)!="}") res[res.length-1].push(work);
                    work = "";
                }        else
                    work+=str.charAt(i);
                break;
        }
    }
    if (str.charAt(str.length-1)!="}") res[0].push(work);
    if (level != 0)
        throw "unmatched braces";
    if (inString)
        throw "missing double-quotes";
    return res[0];
}

alert(tclon.tclon("a {{b c} {d e}} f", "s {list {a b}} s"));

Tcl code:

proc tclon {data {type *}} {
    if {![llength $type] || $type eq "*"} {return $data}
    set res ""
    foreach t $type d $data {
        if {[llength $t]<=1} {
            lappend res $d
        } elseif {[lindex $t 0] eq "dict"} {
            set dict [list]
            foreach {k v} $d {
                lappend dict [tclon $k [lindex $t 1]] [tclon $v [lindex $t 2]]
            }
            lappend res $dict
        } elseif {[lindex $t 0] eq "list"} {
            set l [list]
            foreach v $d {lappend l [tclon $v [lindex $t 1]]}
            lappend res $l
        } else {
            lappend res [tclon $d $t]
        }
    }
    set res
}



proc fromtclon {data {type *}} {
    if {![llength $type] || $type eq "*"} {return $data}
    set res [list]
    foreach t $type d $data {
        if {[llength $t]<=1} {
            lappend res $d
        } elseif {[lindex $t 0] eq "dict"} {
            set dict [list]
            foreach {k v} $d {
                lappend dict [fromtclon $k [lindex $t 1]] [fromtclon $v [lindex $t 2]]
            }
            lappend res $dict
        } elseif {[lindex $t 0] eq "list"} {
            set l [list]
            foreach v $d {lappend l [fromtclon $v [lindex $t 1]]}
            lappend res $l
        } else {                
            lappend res [fromtclon $d $t]
        }
    }
    set res
}

Some testing:

proc ? {s r} {
    if {![string equal $r [uplevel 1 $s]]} {error "assertion failed"}
}

proc validate {data type} {
    set out [fromtclon [tclon $data $type] $type]
    if {$out ne $data} {puts "Difference:"
    puts "1: $data"
    puts "2: [tclon $data $type]"
    puts "3: $out"
    error "assertion failed"
    }
}

? {tclon 1 int} 1
? {tclon {a b}} {a b}
? {tclon {{a b} b c} {{list s} *}} {{a b} b c}
validate 1 int
validate {a b} {s s}
validate {{a b} b c} {{list s} *}
validate {{x {1 2} y {23 4}}} {{dict key {a b}}}