[Lars H]? For a more robust (but also heavier) Tcl parser in Tcl, see [parsetcl]. [GPS]: Yours is much more complex than mine. I won't comment on the issue of robustness. I made some changes and now version 5 passes the test you proposed at the end of this page. I also added an optional parray test, which it also passes now. ---- #Copyright 2003 George Peter Staplin #You may use this under the same terms as Tcl. #A Mini Tcl Parser in Tcl. #version 5 interp alias {} loop {} while 1 set s { A [B 123] [C 456 [C2 789]] set x [+ 20 [* 30 200]] list a b {a b} $c {$c} [d e; f g] } if 0 { #We need to have autoloading source the parray code. array set ar {} parray ar append s "\n[info body parray]" } #1 or 0 proc get.token {s iPtr tokPtr typePtr} { upvar $iPtr i upvar $tokPtr tok upvar $typePtr type set tok "" set sLen [string length $s] set lastChar "" #unknown, brace, bracket, quote, end set type unknown set braceCount 0 set bracketCount 0 set quote 0 for {} {$i < $sLen} {incr i} { set c [string index $s $i] if {$braceCount} { if {("\{" == $c) && ("\\" != $lastChar)} { incr braceCount } elseif {("\}" == $c) && ("\\" != $lastChar)} { incr braceCount -1 } if {0 == $braceCount} { incr i return 1 } append tok $c } elseif {$bracketCount} { if {("\[" == $c) && ("\\" != $lastChar)} { incr bracketCount } elseif {("\]" == $c) && ("\\" != $lastChar)} { incr bracketCount -1 } if {0 == $bracketCount} { incr i return 1 } append tok $c } elseif {$quote} { if {("\"" == $c) && ("\\" != $lastChar)} { incr i return 1 } append tok $c } else { if {("\{" == $c) && ("\\" != $lastChar)} { set type brace incr braceCount } elseif {("\[" == $c) && ("\\" != $lastChar)} { set type bracket incr bracketCount } elseif {("\"" == $c) && ("\\" != $lastChar)} { set type quote set quote 1 } elseif {(" " == $c) || ("\t" == $c)} { if {[string length $tok]} { return 1 } } elseif {("\n" == $c) || ("\r" == $c) || (";" == $c)} { if {[string length $tok]} { return 1 } else { set type end set tok $c incr i return 1 } } else { append tok $c } } set lastChar $c } if {"unknown" ne $type} { puts stderr "incomplete command: still in state of $type" return 0 } if {[string length $tok]} { return 1 } return 0 } set ::level 0 proc parse s { global level set i 0 set tok "" set type "" loop { set r [get.token $s i tok type] if {!$r} break #puts "TOK:$tok TYPE:$type" if {"end" == $type} { puts [string repeat " " $level]SEP } elseif {"bracket" == $type} { incr level puts [string repeat " " $level]BRACK parse $tok incr level -1 } else { puts [string repeat " " $level]TOK:$tok } } } proc main {} { parse $::s } main ---- Example output: $ tclsh85g.exe mini_tcl_parser-3.tcl TOK:A TOK:B TOK:123 TOK:C TOK:456 TOK:C2 TOK:789 TOK:set TOK:x TOK:+ TOK:20 TOK:* TOK:30 TOK:200 ---- [Lars H] (19 aug 2003): It seems a bit simplistic. Consider: % parse {list a b {a b} $c {$c} [d e; f g]} TOK:list TOK:a TOK:b TOK:a b TOK:$c TOK:$c TOK:d TOK:e TOK:f TOK:g That $c and {$c} are the same can be an artifact of the way that the parse procedure presents the result (type information not shown), but the level of the f and g tokens is simply wrong. ---- [GPS]: I've fixed this problem. Thanks for pointing it out. Here's the output with version 5: $ tclsh85g.exe mini_tcl_parser-5.tcl SEP TOK:A BRACK TOK:B TOK:123 BRACK TOK:C TOK:456 BRACK TOK:C2 TOK:789 SEP TOK:set TOK:x BRACK TOK:+ TOK:20 BRACK TOK:* TOK:30 TOK:200 SEP TOK:list TOK:a TOK:b TOK:a b TOK:$c TOK:$c BRACK TOK:d TOK:e SEP TOK:f TOK:g SEP ----