Version 5 of Tclon

Updated 2008-02-07 17:03:56 by sarnold

Sarnold: An attempt to produce a Tcl-equivalent to JSON, a data format easily parsable by Web applications. Here we have a simple code to handle this format at Tcl-level, with a little sugar to handle types, that is required by the Javascript version.

Lars H, 2008-02-06: Do you have an example of what the data format looks like? Also, what is the point of doing a more Tclish format in this case? I'd expect it to be easier to teach Tcl to parse a data structure with Javascript syntax than vice versa, but on the other hand it is probably good for the mountain to come to Muhammed every once in a while too.

Sarnold: The data format maps Tcl's list. The structure of the data is separate from the data itself, due to EIAS. The data structure is also a list.

 Data structure syntax for Tclon:

 list : type ?type ...?
    | list '*' # the star matches one or more values
    | '*' # same remark
    | 'list' type # this is a list where all elements have the same type (the argument) 
 type : name | list | dict
 dict : 'dict'  # a non-formal dict (keys and values are just strings)
             | 'dict' type type; # a formal dict : second and third types are the types of keys and values, resp.

The data structure is a 'list' as the syntax shows.

 Data format examples:

 Structure:                  Data sample:
 ==========                  ============
 *                           {a b {c d e}}
 {dict}                      {{opt value opt2 value2}}
 {{dict key {a b c}}}        {{colors {red green blue} values {100 200 300}}}

Javascript code:

var tclon = new Object();
tclon.tclon=function (input, types) {
	return tclon._tclon(tclon.parse(input), tclon.parse(types));
}

tclon._tclon = function (input, types) {
	if (types.length == 0 || (types.length==1 && types[0]=="*"))
		return input;
	var res=new Array(), i, tab, elt;
	for (i=0; i<input.length && i<types.length; i++) {
		var t = types[i];
		var d = input[i];
		if (typeof t == "string") {
			if (t == "dict")
				res.push(tclon.toDict(d));
			else if (t == 'list')
				res.push(d);
			else if (typeof d == "Array")
				res.push(d.join());
			else
				res.push(d);
		} else if (t[0]=="list") {
			tab = new Array();
			for (j in d) {
				tab.push(tclon._tclon(d[j], t[1]));
			}
			res.push(tab);
		} else if (t[0]=="dict") {
			if (t.length != 3) throw "invalid dict type";
			res.push(tclon.toDictWithType(d,t[1],t[2]));
		}
	}
	return res;
}

tclon.toDict = function (arr) {
	var dict = new Object();
	if(arr.length%2!=0) {
		alert(arr);
		throw "not a dict";
	}
	for (var i = 0; i <arr.length;i+=2)
		dict[arr[i]]=arr[i+1];
	return dict;
}

tclon.toDictWithType = function (arr, ktype, vtype) {
	var dict = new Object();
	if(arr.length%2!=0) {
		alert(arr);
		throw "not a dict";
	}
	for (var i = 0; i <arr.length;i+=2)
		dict[tclon._tclon(arr[i], ktype)]=tclon._tclon(arr[i+1], vtype);
	return dict;
}

tclon.StdEscapeSequences = {
	"{" : "{", "}": "}", "\"": "\"", "r": "\r", "t": "\t",
	"a": "\a", "b": "\b", "n": "\n", "v": "\v", "f": "\f"
};

tclon.getSeq = function (str, pos) {
	var my = str.substr(pos+1, 10);
	if (my.charAt(0)=="x")
		return String.fromCharCode(parseInt(my.substr(1,2), 16));
	if (my.charAt(0)=="u")
		return String.fromCharCode(parseInt(my.substr(1,4), 16));
	return tclon.StdEscapeSequences[my];
}

tclon.seqLen = function (str, pos) {
	var my = str.substr(pos+1, 10);
	if (my.charAt(0)=="x")
		return 3;
	if (my.charAt(0)=="u")
		return 4;
	return 1;
}

tclon.parse = function (str) {
	var tab;
	var res = new Array();
	var i = 0, level = 0, work="", c = 0, inString = false;
	res.push(new Array());
	for (i = 0; i <str.length; i++) {
		switch(str.charAt(i)) {
			case "{":
				if (inString) {
					work+="{";
					break;
				}
				level++;
				res.push(new Array());
				break;
			case "}":
				if (inString) {
					work+="}";
					break;
				}
				level--;
				if (level < 0) throw "unmatched braces";
				tab = res[res.length-1];
				if (str.charAt(i-1)!="}") tab.push(work);
				res.pop();
				res[res.length-1].push(tab);
				work = "";
				break;
			case "\\":
				// Escape sequence
				work+=tclon.getSeq(str, i+1);
				i+=tclon.escapeSeq(str, i+1);
				break;
			case "\"":
				inString = !instring;
				break;
			default:
				if (str.charAt(i).match("[ \t\r\n\v\f]")) {
					if (str.charAt(i-1)!="}") res[res.length-1].push(work);
					work = "";
				}	else
					work+=str.charAt(i);
				break;
		}
	}
	if (str.charAt(str.length-1)!="}") res[0].push(work);
	if (level != 0)
		throw "unmatched braces";
	if (inString)
		throw "missing double-quotes";
	return res[0];
}

alert(tclon.tclon("a {{b c} {d e}} f", "s {list {a b}} s"));

Tcl code:

proc tclon {data {type *}} {
	if {![llength $type] || $type eq "*"} {return $data}
	set res ""
	foreach t $type d $data {
		if {[llength $t]<=1} {
			lappend res $d
		} elseif {[lindex $t 0] eq "dict"} {
			set dict [list]
			foreach {k v} $d {
				lappend dict [tclon $k [lindex $t 1]] [tclon $v [lindex $t 2]]
			}
			lappend res $dict
		} elseif {[lindex $t 0] eq "list"} {
			set l [list]
			foreach v $d {lappend l [tclon $v [lindex $t 1]]}
			lappend res $l
		} else {
			lappend res [tclon $d $t]
		}
	}
	set res
}



proc fromtclon {data {type *}} {
	if {![llength $type] || $type eq "*"} {return $data}
	set res [list]
	foreach t $type d $data {
		if {[llength $t]<=1} {
			lappend res $d
		} elseif {[lindex $t 0] eq "dict"} {
			set dict [list]
			foreach {k v} $d {
				lappend dict [fromtclon $k [lindex $t 1]] [fromtclon $v [lindex $t 2]]
			}
			lappend res $dict
		} elseif {[lindex $t 0] eq "list"} {
			set l [list]
			foreach v $d {lappend l [fromtclon $v [lindex $t 1]]}
			lappend res $l
		} else {		
			lappend res [fromtclon $d $t]
		}
	}
	set res
}

Some testing:

proc ? {s r} {
	if {![string equal $r [uplevel 1 $s]]} {error "assertion failed"}
}

proc validate {data type} {
	set out [fromtclon [tclon $data $type] $type]
	if {$out ne $data} {puts "Difference:"
	puts "1: $data"
	puts "2: [tclon $data $type]"
	puts "3: $out"
	error "assertion failed"
	}
}

? {tclon 1 int} 1
? {tclon {a b}} {a b}
? {tclon {{a b} b c} {{list s} *}} {{a b} b c}
validate 1 int
validate {a b} {s s}
validate {{a b} b c} {{list s} *}
validate {{x {1 2} y {23 4}}} {{dict key {a b}}}