Actually several packages: * [grammar::peg] - Construction and manipulation of parsing expression grammars * grammar::peg::interp - Interpreter for parsing expression grammars. Documentation: * http://tcllib.sourceforge.net/doc/peg.html * http://tcllib.sourceforge.net/doc/peg_interp.html ---- 3-9-8: I hope the following may help others who are trying to use the "grammar::peg" package. These were tested under "[etcl]" (Evolane Tcl/Tk Engine Version 1.0-rc26). ====== 1% # For now, just load the 'grammar::peg^ package 2% package require grammar::peg 0.1 3% #initialize a new (abeit empty) grammar. The start symbol will be, by default, 'epsilon^. 4% grammar::peg myGrammar ::myGrammar 5% # add a nonterminal to it 6% ::myGrammar nonterminal add myDigit {/ {t 0} {t 1} {t 2}} 7% # show its serialization 8% ::myGrammar serialize grammar::pegc {myDigit {/ {t 0} {t 1} {t 2}}} {myDigit value} epsilon 9% # change the start symbol to the nonterminal 'myDigit^ 10% ::myGrammar start {/ {n myDigit}} 11% # show its serialization 12% ::myGrammar serialize grammar::pegc {myDigit {/ {t 0} {t 1} {t 2}}} {myDigit value} {/ {n myDigit}} 13% # Now we show how you could, instead, have started with a list representing 14% # the grammar. Since "myGrammar" exists, we will make "myGrammarFromList". 15% package require struct::list 1.6.1 16% # (Incidentally, Andreas Kupries states: 'struct::list' is a package in Tcllib, like grammar::peg, and 17% # guesses that grammar::peg forgot to 'package require' it.) 18% set mySerialization [list grammar::pegc \ > {myDigit {/ {t 0} {t 1} {t 2}}} {myDigit value} {/ {n myDigit}} \ > ] grammar::pegc {myDigit {/ {t 0} {t 1} {t 2}}} {myDigit value} {/ {n myDigit}} 19% grammar::peg myGrammarFromList deserialize $mySerialization ::myGrammarFromList 20% # Next we attempt to use (interpret) the grammar on an input string. 21% #Load all relevant packages (this may be overkill) 22% set pattern {grammar::me|grammar::peg}; foreach packageName [package names] {if [regexp $pattern $packageName] {puts "[catch {package require $packageName}] $packageName"}} 0 grammar::peg::interp 0 grammar::peg 0 grammar::me::tcl 0 grammar::me::cpu 0 grammar::me::util 0 grammar::me::cpu::core 0 grammar::me::cpu::gasm 23% # Initialize the interpreter. 24% grammar::peg::interp::setup ::myGrammar 25% # As of 3-9-8, that's as far as I have gotten. The documentation at http://tcllib.sourceforge.net/doc/peg_interp.html 26% # suggests using 27% # ::grammar::peg::interp::parse nextcmd errorvar astvar 28% #" which "interprets the loaded grammar and tries to match it against the stream of characters represented by the command prefix nextcmd". 29% # so I put the string "102" in "myFile.txt" and tried the following, but, as you can see, was unsuccessful. 30% set f [open c:/myFile.txt r+] file37bf6c8 31% chan configure $f -encoding cp1252 32% set offset 0 0 33% ::grammar::peg::interp::parse $f myErrorVar myAstVar invalid command name "file37bf6c8" 34% chan close $f ====== ---------- [JBR] I've made my own attempt to create a grammar::peg example. Maybe someone who knows can make this example work. Thanks. ====== #!/usr/bin/env tclsh8.6 # package require grammar::peg package require grammar::peg::interp proc parse-string { string } { coroutine next-char apply { { string } { yield set i 1 foreach ch [split $string {}] { yield [list $ch 1 1 $i] incr i } while { 1 } { yield {} } } } $string } ::grammar::peg parser parser nonterminal add Digit { / { t 1 } { t 2 } { t 3 } { t 4 } { t 5 } } parser nonterminal add Int { + { n Digit } } parser start { n Int } parse-string 54321 grammar::peg::interp::setup parser grammar::peg::interp::parse next-char err ast ====== When run I get this: ======none wrong # args: should be "ict_match_token tok msg" while executing "ict_match_token "Expected $ch"" (procedure "MatchExpr" line 40) invoked from within "MatchExpr $e" (procedure "MatchExpr" line 236) invoked from within "MatchExpr $ru($nt)" (procedure "MatchExpr" line 75) invoked from within "MatchExpr $sub" (procedure "MatchExpr" line 159) invoked from within "MatchExpr $ru($nt)" (procedure "MatchExpr" line 75) invoked from within "MatchExpr $se" (procedure "grammar::peg::interp::parse" line 9) invoked from within "grammar::peg::interp::parse next-char err ast" (file "./ISBL" line 30) ====== This appears to be a bug in grammar::peg::interp I changed line 116 of `peg_interp.tcl` to: ====== ict_match_token $ch "Expected $ch" ====== NOTE [tomas] 2010-01-25: [http://sourceforge.net/tracker/?func=detail&atid=112883&aid=3163541&group_id=12883%|%This bug] is fixed from release 1.13 onwards (thanks, [AK], you are my hero). [AK]: Note that the above sentence is a bit ambiguous. To clarify, the bug fix is not in Tcllib 1.13. It is in the CVS, just after the release, and will be found in the next release after 1.13. That being said, the 'Parser Tools' supercede grammar_peg and grammar_me. Then changing my coroutine to return EOF multiple times after the input is exhausted the string parses returning: ALL 0 4 {Int 0 4 {Digit 0 0 {{} 0 0}} {Digit 1 1 {{} 1 1}} {Digit 2 2 {{} 2 2}} {Digit 3 3 {{} 3 3}} {Digit 4 4 {{} 4 4}}} Now to learn how to do something useful with this. The input tokens don't appear to be represented in the output AST? ---- '''[AK] - 2011-01-21 14:40:21''' Regarding the representation of input tokens. No, they are not represented directly. The numbers in the AST tell you the character range covered by the symbol, as offsets from the beginning of the string. This allows you to extract the lexemes/token from the input string. See http://docs.activestate.com/activetcl/8.5/tcllib/pt/pt_parser_api.html for an example of the tree and its contents. --------- '''[tomas] - 2011-01-23''' Here's a more complete example. It tries to implement the "classical" expression grammar, with (integer decimal) numbers, the four arithmetical operations. To note: * PEGs look a lot like BNF. They are not the same (as noted in the comments). See [http://en.wikipedia.org/wiki/Parsing_expression_grammar%|%this wikipedia page] for a discussion. I still rode on this similarity to make clear in the comments what the PEG statements "mean". * I stumbled upon the bug mentioned above and tried to file in the [https://sourceforge.net/tracker/?func=detail&atid=112883&aid=3163541&group_id=12883%|%SourceForge bug tracker] * The program tries to visualize what's going on by printing the AST's structure with the (recursive) function print_nested. * Grokking whitespace is left as an exercise for the reader :-) Comments very welcome! ====== #!/usr/bin/tclsh # # peg: playing around with peg package require grammar::peg package require grammar::me::tcl package require grammar::peg::interp grammar::peg expression # PEGs ain't BNF, but the similarities are striking # One important difference: the choices {/ ...} are _ordered_ # choices; no backtracking in * and +. See # # for te gory details # Therefore the comments evoking BNF are just to tickle the # reader's gentle associative memory (or was it the associative's # reader gentle memory? Ah -- you know what I mean. # Sign ::= '+' | '-' expression nonterminal add Sign {/ {t +} {t -}} # Digit ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' expression nonterminal add Digit {/ {t 0} {t 1} {t 2} {t 3} {t 4} {t 5} {t 6} {t 7} {t 8} {t 9}} # Addop ::= '+' | '-' expression nonterminal add Addop {/ {t +} {t -}} # Mulop ::= '*' | '/' expression nonterminal add Mulop {/ {t *} {t /}} # Number ::= Sign? Digit+ expression nonterminal add Number {x {? {n Sign}} {+ {n Digit}}} # Factor ::= '(' Expression ')' | Number expression nonterminal add Factor {/ {x {t (} {n Expression} {t )}} {n Number}} # Term ::= Factor ( Mulop Factor )* expression nonterminal add Term {x {n Factor} {* {x {n Mulop} {n Factor}}}} # Expression ::= Term ( Addop Term )* expression nonterminal add Expression {x {n Term} {* {x {n Addop} {n Term}}}} # Start here: expression start {n Expression} puts [expression serialize] grammar::peg::interp::setup expression set i -1 # The expression to be parsed. No whitespaces! set src {1024+256} proc nexttok {} { global src global i return [list [string index $src [incr i]] {} $i 0] } set errs {} set ast {} grammar::peg::interp::parse nexttok errs ast puts "errs = $errs" puts "ast = $ast" proc print_nested {ast {prefix ""}} { global src puts "$prefix [lindex $ast 0]: [string range $src [lindex $ast 1] [lindex $ast 2]]" foreach sub [lrange $ast 3 end] { print_nested $sub "$prefix " } } print_nested $ast ====== ---- !!!!!! %| [Category Package], a part of [Tcllib] | [Category Parsing] |% !!!!!!