
Using the Parse Tree

PhilippeSigaud edited this page Mar 18, 2012 · 25 revisions

Pegged parse trees are described here: Parse Trees.

As you can see, they are a deliberately simple structure. There are no multiple types to encode AST nodes and whatnot. Since D has wonderful compile-time evaluation capabilities and code can be mixed in, Pegged has all the power it needs:

Since parse trees are structs, they are easy to use at compile-time (classes are a bit trickier for now, so I chose structs): you can iterate over them, search for specific nodes, delete nodes or simplify them, etc.
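To make the compile-time claim concrete, here is a minimal sketch. The `ParseTree` struct below is a simplified stand-in that I am assuming for illustration (the real Pegged struct has more fields), and `countNodes` and `hasNode` are hypothetical helpers, not part of Pegged. The point is only that plain structs and plain recursive functions can be evaluated by the compiler inside a `static assert`.

```d
import std.stdio : writeln;

// Simplified stand-in for a Pegged parse tree (assumption: the real
// struct has more fields; only the ones used here are modelled).
struct ParseTree
{
    string name;
    string[] capture;
    ParseTree[] children;
}

// Hypothetical helper: counts the nodes in a tree, root included.
size_t countNodes(ParseTree p)
{
    size_t n = 1;
    foreach (child; p.children)
        n += countNodes(child);
    return n;
}

// Hypothetical helper: is there a node called `name` anywhere in the tree?
bool hasNode(ParseTree p, string name)
{
    if (p.name == name)
        return true;
    foreach (child; p.children)
        if (hasNode(child, name))
            return true;
    return false;
}

// A small hand-built tree, available as a compile-time constant.
enum tree = ParseTree("Expr", ["1", "+", "2"], [
    ParseTree("Number", ["1"]),
    ParseTree("Op", ["+"]),
    ParseTree("Number", ["2"])
]);

// These checks are evaluated by the compiler, not at run time.
static assert(countNodes(tree) == 4);
static assert(hasNode(tree, "Op"));
static assert(!hasNode(tree, "Term"));

void main()
{
    // The same functions work unchanged at run time.
    writeln(countNodes(tree)); // prints 4
}
```

If one of the `static assert`s were false, compilation itself would fail, which is what makes this style convenient for code generation.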

See Semantic Actions, User-Defined Parsers and Generating Code for more on this.

An Output contains a ParseTree member called parseTree. Since Output uses alias this, the parse tree is directly accessible:

auto p = Expr.parse(input);
writeln(p.capture); // really p.parseTree.capture
foreach(ref child; p.children) // in truth, we are iterating on p.parseTree.children
    doSomething(child);
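For readers who have not met `alias this` before, the forwarding can be sketched as follows. Both structs are simplified stand-ins that I am assuming for illustration, and the `text` field in particular is hypothetical; only the `parseTree` member and the `alias this` declaration come from the description above.

```d
import std.stdio : writeln;

// Simplified stand-ins (assumptions): the real Pegged types carry more fields.
struct ParseTree
{
    string name;
    string[] capture;
    ParseTree[] children;
}

struct Output
{
    string text;          // hypothetical extra field, for illustration only
    ParseTree parseTree;

    // Member lookups that fail on Output are retried on parseTree.
    alias parseTree this;
}

void main()
{
    auto p = Output("", ParseTree("Expr", ["1", "+", "2"], [
        ParseTree("Number", ["1"]),
        ParseTree("Number", ["2"])
    ]));

    // Both spellings reach the same data.
    assert(p.capture == p.parseTree.capture);
    assert(p.children.length == p.parseTree.children.length);

    // Iterating on p.children really iterates on p.parseTree.children.
    foreach (ref child; p.children)
        writeln(child.name);
}
```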

Since parse trees are run-of-the-mill N-ary trees, I didn't add tree-walking algorithms in Pegged, but I may do so to group useful tree functions with the related structure:

  • searching for all nodes with a particular name
  • getting the leaves (parse trees with no children)
  • filtering a tree: cutting the nodes that do not obey a particular predicate.
  • modifying a tree.

I'll see. You can find tree-related algorithms in another GitHub project of mine: https://github.com/PhilippeSigaud/dranges (see for example https://github.com/PhilippeSigaud/dranges/blob/master/treerange.d)
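In the meantime, the four operations listed above are short enough to write by hand. The following is a rough sketch and not Pegged or dranges code: `ParseTree` is a simplified stand-in, and `findAll`, `leaves`, `filterTree` and `mapTree` are names I am making up for the example.

```d
import std.stdio : writeln;

// Simplified stand-in for a Pegged parse tree (assumption).
struct ParseTree
{
    string name;
    string[] capture;
    ParseTree[] children;
}

// Searching: all nodes with the given name, in depth-first order.
ParseTree[] findAll(ParseTree p, string name)
{
    ParseTree[] found;
    if (p.name == name)
        found ~= p;
    foreach (child; p.children)
        found ~= findAll(child, name);
    return found;
}

// Leaves: the nodes that have no children.
ParseTree[] leaves(ParseTree p)
{
    if (p.children.length == 0)
        return [p];
    ParseTree[] result;
    foreach (child; p.children)
        result ~= leaves(child);
    return result;
}

// Filtering: returns a copy in which every subtree whose root fails
// `pred` is cut. The root itself is always kept.
ParseTree filterTree(alias pred)(ParseTree p)
{
    ParseTree[] kept;
    foreach (child; p.children)
        if (pred(child))
            kept ~= filterTree!pred(child);
    return ParseTree(p.name, p.capture, kept);
}

// Modifying: applies `fun` to every node, children first.
ParseTree mapTree(alias fun)(ParseTree p)
{
    ParseTree[] kids;
    foreach (child; p.children)
        kids ~= mapTree!fun(child);
    return fun(ParseTree(p.name, p.capture, kids));
}

void main()
{
    auto tree = ParseTree("Expr", ["1", "+", "2"], [
        ParseTree("Number", ["1"]),
        ParseTree("Op", ["+"]),
        ParseTree("Number", ["2"])
    ]);

    writeln(findAll(tree, "Number").length); // prints 2
    writeln(leaves(tree).length);            // prints 3

    auto noOps = filterTree!(n => n.name != "Op")(tree);
    writeln(noOps.children.length);          // prints 2

    auto renamed = mapTree!(n => ParseTree("My" ~ n.name, n.capture, n.children))(tree);
    writeln(renamed.name);                   // prints MyExpr
}
```

All four functions return new values instead of mutating their argument. That is a design choice for the sketch: it keeps them usable at compile-time, where in-place mutation of a shared tree is more awkward.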


Once you have a parse tree, an external function can be called to transform the tree into user-defined output. Say for example we want a parser to transform a wiki-like syntax into LaTeX code:

  • From ===Title=== to \section{Title}

  • From ==Title== to \subsection{Title}

  • From _text_ to \emph{text}

  • From * Text1 (LF) * Text2 (LF) * Text3 (LF) to \begin{itemize} \item Text1 \item Text2 \item Text3 \end{itemize}

And so on... First, let's define a small grammar for such a wiki syntax:

a

I'll use the following input:

enum input = "
=== This is the Title===

A _very_ important introductory text.

== A nice small section ==

And some text.

== Another section ==

And a list:

* A
* B
";

Used on input, Wiki gives the following parse tree:

a

Now, to generate the wanted LaTeX code, we will switch on the node names:

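string toLaTeX(Output o)
{
    // Recursively converts one node (and its subtree) into LaTeX code.
    string parseToCode(ParseTree p)
    {
        string result;
        switch(p.name)
        {
            case "Section":
                // TODO: emit the LaTeX code for a section node.
                break;
            default:
                // D requires a default case in a switch.
                // For any other node, just convert the children.
                foreach(child; p.children)
                    result ~= parseToCode(child);
                break;
        }
        return result;
    }

    return parseToCode(o.parseTree);
}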

(to be continued)
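Until this section is completed, here is a rough sketch of where the switch is heading. Everything in it is an assumption on my part: the node names ("Document", "Section", "Subsection", "Emph", "List", "Item", "Text") are hypothetical and do not come from the Wiki grammar, the `ParseTree` struct is a simplified stand-in, and the tree is built by hand instead of being produced by a parser.

```d
import std.stdio : writeln;

// Simplified stand-in for a Pegged parse tree (assumption).
struct ParseTree
{
    string name;
    string[] capture;
    ParseTree[] children;
}

// Converts a node and its subtree into LaTeX code.
// All node names below are hypothetical.
string toLaTeX(ParseTree p)
{
    // Convert the children first; most nodes just wrap this string.
    string inner;
    foreach (child; p.children)
        inner ~= toLaTeX(child);

    string result;
    switch (p.name)
    {
        case "Section":
            result = "\\section{" ~ inner ~ "}\n";
            break;
        case "Subsection":
            result = "\\subsection{" ~ inner ~ "}\n";
            break;
        case "Emph":
            result = "\\emph{" ~ inner ~ "}";
            break;
        case "List":
            result = "\\begin{itemize}\n" ~ inner ~ "\\end{itemize}\n";
            break;
        case "Item":
            result = "\\item " ~ inner ~ "\n";
            break;
        case "Text":
            // Leaf: assume the matched text is the first capture.
            result = p.capture.length ? p.capture[0] : "";
            break;
        default:
            // Structural or unknown nodes: keep only the children's code.
            result = inner;
            break;
    }
    return result;
}

void main()
{
    // Hand-built tree standing in for a real parse result.
    auto tree = ParseTree("Document", null, [
        ParseTree("Section", null, [ParseTree("Text", ["Title"])]),
        ParseTree("Emph", null, [ParseTree("Text", ["very"])]),
        ParseTree("List", null, [
            ParseTree("Item", null, [ParseTree("Text", ["A"])]),
            ParseTree("Item", null, [ParseTree("Text", ["B"])])
        ])
    ]);

    writeln(toLaTeX(tree));
}
```

Converting the children before the switch keeps each case down to one line of wrapping. A real grammar would probably need more cases (whitespace, line breaks, nested markup), so take this as a starting point only.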


Next lesson: Extended PEG Syntax


