Groovy Syntax Tree
Jeremy Rayner
Groovy Engineer
Member of JSR 241 Expert Group
Senior Developer, FT Interactive Data
Parser History
- Initially the parser was 'hand written' during conceptual phase
- Grammar rewritten using ANTLR v2.6.2.
- Based directly on the Java 5 grammar by Michael Studman.
- Good tools and support allow language edge cases to be recognized during language development (e.g. non-determinism)
- ANTLR is used to parse Groovy source code into the Groovy Concrete Syntax Tree
Lexer
- The input program is split up in a stream of Tokens by the Lexer.
- Notice how closely the tokens match up between Java and Groovy.
- There is no structure yet, just line and column information
GroovyLexer
java -cp groovy-all-1.0.jar org.codehaus.groovy.antlr.LexerFrame
Parser
- Parser recognizes valid streams of Tokens
- ...and then rearrange these tokens into the Syntax Tree
GroovyRecognizer
Java Syntax Tree
Groovy Syntax Tree
$ export JAVA_OPTS=-Dantlr.ast=mindmap
$ groovyc Foo.groovy
open
Foo.groovy.mm
in
freemind
( http://freemind.sf.net )
accept ast
accept()
no visit
CLASS_DEF ---> accept_FirstChild_v_RestOfTheChildren()
MODIFIERS:
CLASS_DEF: class
IDENT: GroovyExample GroovyExample
EXTENDS_CLAUSE:
IMPLEMENTS_CLAUSE:
OBJBLOCK ---> accept_v_AllChildren_v()
OBJBLOCK -> opening visit {
VARIABLE_DEF: x
VARIABLE_DEF: y ...
CTOR_IDENT:
METHOD_DEF: main
OBJBLOCK -> closing visit }
accept class_def
accept_FirstChild_v_RestOfTheChildren()
no visit
accept & visit modifiers
accept_v_FirstChild_v_RestOfTheChildren()
public
visit class_def
accept_FirstChild_v_RestOfTheChildren()
public class
visit identifier
accept_v_FirstChild_v()
public class GroovyExample
visit extends clause
accept_v_FirstChild_v()
public class GroovyExample
visit implements clause
accept_v_FirstChild_v()
public class GroovyExample
opening visit to object block
accept_v_AllChildren_v()
public class GroovyExample {
... and so on...
public class GroovyExample {
...
closing visit to object block
accept_v_AllChildren_v()
public class GroovyExample {
...
}
Tree Walker and Visitor
Tree Walkers
PreOrderTraversal
- visits node then subtrees are traversed recursively
Post Order Traversal *
- traverses subtrees then visits node
Euler Tour Traversal *
- walks around the edge of the tree, keeping the 'walls' on our left.
SourceCodeTraversal
- treats each type of AST node differently
traversals typically run in O(n) time
Visitors
NodePrinter
- simple xml representation of node
MindMapPrinter
- in format suitable for viewing in freemind
NodeAsHTMLPrinter
- html markup used around source printer
SourcePrinter
- outputs groovy source code
More Visitors
CompositeVisitor
- a collection of visitors (composite pattern)
SummaryCollector
- remembers any public classes defined
Eclipse AST Builder *
IntelliJ PSI Builder *
- pretend Groovy source is Java and take part in the refactoring, code completion
GroovyDoc *
- c.f. javadoc
AntlrParserPlugin *
- the main transformation from Groovy Source to Java bytecode
Language Conversion - Java2Groovy
Book List
- Data Structures and Algorithms in Java - Goodrich / Tamassia
- Building Parsers in Java - Metsker
- Groovy in Action - Koenig et all