Optimizations

Big subject...

I won't write here about Pegged internal optimizations since, as of this writing, there aren't many of them. Pegged simplifies sequences and choices when they have one element or are sequences-of-sequences or choices-of-choices. That's about it.

The thematic of this page is more 'how can I make my grammar faster?'.

First, if you're satisfied with the way a grammar is, you can compile it and drop the resulting code into a file, to make it an independent module. Use asModule(string moduleName, string grammar) for that. That'll create a D module that imports pegged.peg and contains your grammar. I might add an asAutonomousModule function that'd also write the necessary Pegged infrastructure in the file, thus making it totally autonomous (you wouldn't need Pegged to use it elsewhere).

Once written in a file, the grammar-compiling step can be avoided. It won't make the grammar any faster but does prevent recompiling many times something that is stable.

Second, note there is a tension between a very complete grammar (and hence, parse tree) and a fast one: parsing speed depends (at least partially) on the grammar depth, its number of rules. So inlining could maybe be used to accelerate the parsing somewhat:

From:

mixin(grammar(" 
    Expr     <- Factor AddExpr*
    AddExpr  <- ('+'/'-') Factor
    Factor   <- Primary MulExpr*
    MulExpr  <- ('*'/'/') Primary
    Primary  <- Parens / Number / Variable / '-' Primary

    Parens   <- '(' Expr ')'
    Number   <~ [0-9]+
    Variable <- Identifier
"));

to:

mixin(grammar(" 
    Expr     <- (Primary (('*'/'/') Primary)*) (('+'/'-') Primary (('*'/'/') Primary)*)*
    Primary  <- '(' Expr ')' / [0-9]+ / Identifier / '-' Primary
"));

I most probably botched the manual inlining, but you get the idea. Note the process stopped at Primary, since its recursive. In general, any recursion (direct, indirect or mutual) will stop the inlining process.

Notice how the second grammar is far shorter, but also much less readable. Suppressing levels like this will make grammars look like regular expressions, with all that entails. Also, the parse tree will only contain Expr and Primary nodes, so to calculate the result (for example), it will be a bit more difficult.

Another advice would be to use the PEG operators when possible. Do not do BNF-like sequences like this:

A <- Elements
Elements <- Element Elements / Element
Element <- (...)

Use + (or *, ?):

A <- Element+
Element <- (...)

Not only will that make the grammar more readable, it'll also make it more efficient, since + and co are directly coded as D while loops instead of recursive rules.

Pegged Tutorial

Home

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimizations

Optimizations

Clone this wiki locally