Areas | Language parsing, computer arithmetic |
Good if student knows | variable, based on interests |
Priority | Low |
Difficulty | low to medium |
Benefits to the student | introduction to grammar and parsing concepts |
Benefits to Tcl | Leverage ease of use of existing tools |
Mentor | Steve Huntley |
Perl enables arbitrary precision math by means of a pragma, which overloads the standard math operators and thus transparently accepts large integers as arguments for mathematical expressions. In this way Perl is able to do not only large integer math transparently as Tcl does, but also floating-point, rational number, vector, etc. calculations [L1 ] [L2 ].
This approach cannot work in Tcl, which has no math operators per se; instead it has the command expr, with its own syntax for math expressions. However, it should be straightforward to overload the expr command itself with a command that can parse existing valid math expressions and can be extended to accept a wider range of operations.
The goal of this project would be to write a selection of parsers/lexers for mathematical expressions, and incorporate them into a replacement for the expr command, thus allowing for transparent use of a wider range of numerical types (similar to Perl's usage), and experiments on new ways to execute expressions that are currently valid, such as:
The exact mix of features would be dependent on the skills and interests of the student.
Existing pure-Tcl parsing tools such as Yeti or taccle should be suitable for the task. Evaluation of parser features and selection of tools would be part of the project.
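As a rough illustration of the kind of parser the project might produce, here is a minimal sketch of an evaluator that does exact rational arithmetic on integer expressions. It is a hand-written recursive-descent parser, not one generated with Yeti or taccle, and every name in it (the ratexpr namespace, calc, make and so on) is hypothetical. It only handles integers, + - * /, unary minus and parentheses.

```tcl
namespace eval ratexpr {
    variable tokens {}
    variable pos 0
}

# Split the text into integer and operator tokens; reject anything else.
proc ratexpr::tokenize {text} {
    if {[regexp {[^0-9 +*/()-]} $text]} {
        error "unsupported character in expression \"$text\""
    }
    return [regexp -all -inline {\d+|[-+*/()]} $text]
}

proc ratexpr::gcd {a b} {
    while {$b != 0} { lassign [list $b [expr {$a % $b}]] a b }
    return [expr {abs($a)}]
}

# Build a normalized {numerator denominator} pair with a positive denominator.
proc ratexpr::make {n d} {
    if {$d == 0} { error "division by zero" }
    if {$d < 0} { set n [expr {-$n}]; set d [expr {-$d}] }
    set g [gcd $n $d]
    return [list [expr {$n / $g}] [expr {$d / $g}]]
}

proc ratexpr::peek {} {
    variable tokens; variable pos
    return [lindex $tokens $pos]
}

proc ratexpr::advance {} {
    variable tokens; variable pos
    set tok [lindex $tokens $pos]
    incr pos
    return $tok
}

# sum := product (('+' | '-') product)*
proc ratexpr::sum {} {
    lassign [product] n d
    while {[peek] in {+ -}} {
        set op [advance]
        lassign [product] n2 d2
        if {$op eq "-"} { set n2 [expr {-$n2}] }
        lassign [make [expr {$n*$d2 + $n2*$d}] [expr {$d*$d2}]] n d
    }
    return [list $n $d]
}

# product := unary (('*' | '/') unary)*
proc ratexpr::product {} {
    lassign [unary] n d
    while {[peek] in {* /}} {
        set op [advance]
        lassign [unary] n2 d2
        # Division is multiplication by the reciprocal.
        if {$op eq "/"} { lassign [list $d2 $n2] n2 d2 }
        lassign [make [expr {$n*$n2}] [expr {$d*$d2}]] n d
    }
    return [list $n $d]
}

# unary := '-' unary | primary
proc ratexpr::unary {} {
    if {[peek] eq "-"} {
        advance
        lassign [unary] n d
        return [list [expr {-$n}] $d]
    }
    return [primary]
}

# primary := integer | '(' sum ')'
proc ratexpr::primary {} {
    set tok [advance]
    if {$tok eq "("} {
        set value [sum]
        if {[advance] ne ")"} { error "expected )" }
        return $value
    }
    if {![string is digit -strict $tok]} { error "unexpected token \"$tok\"" }
    # scan avoids reading a leading zero as an octal prefix.
    scan $tok %lld n
    return [list $n 1]
}

# Evaluate text and return an integer or a fraction in the form "n/d".
proc ratexpr::calc {text} {
    variable tokens [tokenize $text]
    variable pos 0
    lassign [sum] n d
    if {$pos != [llength $tokens]} { error "trailing input in \"$text\"" }
    if {$d == 1} { return $n }
    return "$n/$d"
}

set half [ratexpr::calc {1/3 + 1/6}]
set five [ratexpr::calc {2 * (3 - 1/2)}]
```

Here half comes out as 1/2 and five as 5, where the stock expr would use integer division and give 0 and 6. A generated parser would replace the hand-written sum/product/unary/primary procedures but could keep the same rational helpers.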
SEH -- In previous years, Tcl mentors have received criticism from Google that idea pages were messy, discursive and confusing to prospective students. Thus I am trying to keep all comments below the collapsible discussion header.
JBR - I've had very good experiences with grammar_peg which is included in tcllib. Look for my examples posted on the wiki.
Larry Smith: incorporate the Tcl expr patch allowing us to eliminate the dereferencing operator and outer braces? Please? Finally? allowing [ expr x*x ] instead of [ expr {$x*$x} ]
AMG: There are a lot of syntax conflicts which require variable names to be preceded by $. As for removing the outer braces, you can do so currently, though it is unsafe and slow.
However, these two changes at the same time would avoid part of the safety problem, since [expr] (not the Tcl parser) would be performing the variable substitution. But the speed problem would remain. So long as there are any spaces in the expression, there's more than one argument to [expr], and it has to [concat] them, which makes it impossible to bytecode the expression.
Larry Smith It's not encouraged because it's not implemented and it's not implemented because it is not encouraged? The idea should be to make Tcl's expr syntax simpler, more flexible, easier to use and easier to read. Restricting it to one argument is counterproductive to these goals and silly if we just implement your suggested method.
% tcl::unsupported::disassemble lambda {{} {expr 2+2}}
ByteCode 0x0174E2A8, refCt 1, epoch 4, interp 0x016144F8 (epoch 4)
  Source "expr 2+2"
  Cmds 1, src 8, inst 3, litObjs 1, aux 0, stkDepth 1, code/src 0.00
  Proc 0x015D6168, refCt 1, args 0, compiled locals 0
  Commands 1:
      1: pc 0-1, src 0-7
  Command 1: "expr 2+2"
    (0) push1 0     # "4"
    (2) done

% tcl::unsupported::disassemble lambda {{} {expr 2 + 2}}
ByteCode 0x0174E6A8, refCt 1, epoch 4, interp 0x016144F8 (epoch 4)
  Source "expr 2 + 2"
  Cmds 1, src 10, inst 14, litObjs 3, aux 0, stkDepth 5, code/src 0.00
  Proc 0x015D6568, refCt 1, args 0, compiled locals 0
  Commands 1:
      1: pc 0-12, src 0-9
  Command 1: "expr 2 + 2"
    (0) push1 0     # "2"
    (2) push1 1     # " "
    (4) push1 2     # "+"
    (6) push1 1     # " "
    (8) push1 0     # "2"
    (10) concat1 5
    (12) exprStk
    (13) done
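The safety half of the "unsafe and slow" point can be seen with a short sketch. The variable names here are made up for illustration; the behavior shown is the usual double substitution that happens when an expression is not braced.

```tcl
set ::sideEffect 0
set userInput {[incr ::sideEffect]}

# Unbraced: the Tcl parser substitutes $userInput first, then expr parses
# the resulting text and runs the command embedded in it.
expr $userInput
set afterUnbraced $::sideEffect

# Braced: expr performs the variable substitution itself and treats the
# value as plain data, so the embedded command is not run again.
expr {$userInput}
set afterBraced $::sideEffect
```

After the unbraced call the counter has been incremented once; the braced call leaves it unchanged.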
AMG: The remaining safety issue is due to [square bracket script substitution], which would still be performed by the Tcl parser instead of internal to [expr].
Some syntax conflicts, ambiguities, and difficulties:
Larry Smith =) This dichotomy was something of an artifact of the first implementation of Tcl. Arguably, the whole thing is over-engineered in this respect. Rather than "proc foo { bar } {...$bar...}", Ousterhout could have simply done "set foo {...$0...}" and executed it with an optional "$" prefix - this mechanism would have made the whole lambda thing a doddle. Granted, such a change goes far astray from the objective under discussion, but if the objective under discussion has such far-ranging implications, perhaps we should talk about it.
Command 1: "expr {$asdf(188)}"
  (0) push1 0     # "188"
  (2) loadArray1 %v0     # var "asdf"
  (4) tryCvtToNumeric
  (5) done

Command 1: "expr {asdf(188)}"
  (0) push1 0     # "tcl::mathfunc::asdf"
  (2) push1 1     # "188"
  (4) invokeStk1 2
  (6) tryCvtToNumeric
  (7) done
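The same conflict can be reproduced at the script level. In this sketch the array contents and the doubling function are invented for illustration; the point is only that name(...) already means a math function call, so dropping the $ would make the array read ambiguous.

```tcl
# An array element and a math function that share the name "asdf".
array set asdf {188 42}
proc ::tcl::mathfunc::asdf {n} { return [expr {$n * 2}] }

# With the $ prefix this is an array read.
set viaArray [expr {$asdf(188)}]

# Without it, the same text is a call to tcl::mathfunc::asdf.
set viaFunction [expr {asdf(188)}]
```

viaArray is 42 and viaFunction is 376, so the two spellings cannot be merged without some rule for choosing between them.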
Larry Smith No, I don't. I was simply pointing out the most correct solution but, yes, it does imply quoting hell in the current way of doing things.
It isn't really new, it merely regularizes referencing the null variable.
AMG: Nevertheless, here's a simple example demonstrating an overloaded [expr] that behaves as you ask. Note that it still has the safety problem, since variable substitution is performed by Tcl before calling [expr].
rename expr _expr
proc expr {args} {
    uplevel 1 _expr [regsub -all -nocase {[a-z:][a-z0-9_:]*\M(?!\()} [concat $args] {$&}]
}
This code doesn't get all cases, it doesn't support arrays, and it screws up the ?: ternary operator. Since it doesn't support arrays, which are used by the [history] mechanism, it won't work with an interactive Tcl session. For interactive use, try this:
proc expr2 {args} {
    uplevel 1 expr [regsub -all -nocase {[a-z:][a-z0-9_:]*\M(?!\()} [concat $args] {$&}]
}
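A short usage sketch of expr2 follows, with the definition repeated so the snippet runs on its own. The variables x and y are made up. The last call shows one of the cases the regular expression gets wrong: the exponent in 1e3 is taken for a variable name.

```tcl
proc expr2 {args} {
    uplevel 1 expr [regsub -all -nocase {[a-z:][a-z0-9_:]*\M(?!\()} [concat $args] {$&}]
}

set x 3
set y 4

# Bare names are rewritten to $x and $y before the real expr runs.
set square [expr2 x*x]
set spaced [expr2 x * x + y]

# A name followed by "(" is left alone, so math functions still work.
set hyp [expr2 sqrt(x*x+y*y)]

# "1e3" is rewritten to "1$e3", which fails because e3 does not exist.
set failed [catch {expr2 1e3 + x} message]
```

The first three results are 9, 13 and 5.0. The last call raises an error, so failed is 1.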
AMG: Larry, don't get me wrong, I think this is a great idea, I'm just trying to be realistic about it so that it can get implemented without breaking everything.
Larry Smith I understand that, and, fundamentally, I support it. But someday we are going to have to either break backward compatibility or move on to another programming language. These are the sort of issues that will drive that discussion, and I'd like to keep them in mind.
SEH Since we're considering parsing and grammars here, the grammar devised could be more C-like and include keywords and everything. Thus trying to use a variable named e.g. cos could throw an error. But I was more interested in adding functionality than providing syntactic sugar. Another possibility: a parser specialized for currency calculations, thus solving a real problem.
Larry Smith I would argue that this change does add functionality, since I find a more readable and typeable syntax a win-win situation. However, if you look into the possibility of stacking various expr parsers these changes add a lot of new functionality. One preprocessor might translate "$1.34" as "[dollars 1 34]", for example, before handing it off to the more general case (likewise £¥₠ to pound sterling, yen, and Euro). Another might be able to recognize vectors or numbers in some format - something like 2 3 ρ ι6 from APL say - permitting array operations in the same concise manner. Each layer would further specialize the underlying mathematical engine of expr to handle new problem domains in a concise manner.
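One way the stacking idea could look in practice is sketched below. Everything here is hypothetical: the xexpr namespace, the layer procedures, and the choice to have dollars return integer cents. Each layer is a plain text rewrite, and the stock expr evaluates whatever comes out of the last layer. The second layer redefines ^ as exponentiation to show a layer changing the meaning of a currently valid operator.

```tcl
namespace eval xexpr {
    # Ordered list of command prefixes; each takes an expression string
    # and returns a rewritten expression string.
    variable layers {}
}

proc xexpr::addLayer {cmdPrefix} {
    variable layers
    lappend layers $cmdPrefix
}

proc xexpr::calc {expression} {
    variable layers
    foreach layer $layers {
        set expression [{*}$layer $expression]
    }
    # Hand the rewritten text to the stock expr in the caller's scope.
    return [uplevel 1 [list expr $expression]]
}

# Helper invoked from the rewritten expression: returns integer cents.
# scan avoids reading "08" as an invalid octal number.
proc dollars {whole cents} {
    scan $whole %d w
    scan $cents %d c
    return [expr {$w * 100 + $c}]
}

# Layer 1: "$1.34" becomes "[dollars 1 34]".
proc currencyLayer {expression} {
    return [regsub -all {\$(\d+)\.(\d\d)} $expression {[dollars \1 \2]}]
}

# Layer 2: treat ^ as exponentiation instead of bitwise XOR.
proc caretLayer {expression} {
    return [string map {^ **} $expression]
}

xexpr::addLayer currencyLayer
xexpr::addLayer caretLayer

set total [xexpr::calc {$1.34 + $2.50}]
set power [xexpr::calc {2^10}]
```

total is 384 (cents) and power is 1024. Because the layers work on raw text they share the fragility of the regsub approach above; a real implementation would more likely stack grammar rules than string rewrites.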