LLK

Grammar Syntax

Sections

Each grammar file may contain up to three types of grammar (lexer, parser and tree parser). If there are multiple grammars, they should be specified in the following order so that token types from earlier grammars can be passed implicitly to the later ones:

Global options
Lexer
Parser
TreeParser

Each grammar consists of sections in the following order.

%(LEXER|PARSER|TREEPARSER)(grammar_name) {
    ... // Header section.
}
// Any valid code with the lexer/parser class at the end, before the %OPTIONS section.
...
class grammar_name implements ... {
    ...
} 
%OPTIONS {
    // Grammar options
}
%KEYWORDS {
    // Keywords
}
rules
...
  • The header section is optional. It should contain code, such as import statements, that is emitted after the generated package and import declarations but before the class section.
  • The class section is the main class for the lexer, parser or tree parser and is mandatory.
  • The %OPTIONS section is mandatory. It specifies options for the grammar (the valid options are documented separately). It also works as a separator between the class code and the rules, so it is required even if no options are specified.
  • The %KEYWORDS section is optional and applies to lexer grammars only. It declares the keywords recognized by the lexer. Keyword declarations are documented separately.
  • The rules section specifies the rules for the grammar.
  • Each rule consists of a header, a declaration block, a definition block and an optional return action block. Actions in the declaration block are performed before any rule matching operations. Actions in the return action block are performed after the implicitly generated token/AST creation code, if there is any. See the rule syntax section below for details.
    void rule(): {
        // Declaration block
        Token t;
    }
    {
        // Definition block
        t=ID()
    }
    {
        // Optional return action block.
        return llkThis;
    }
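
The order in which the blocks of a rule take effect can be illustrated with a small hand-written Java sketch. This is not actual LLK output: the class name RuleOrderSketch, the helpers matchId() and createNode(), the String types and the trace list are all hypothetical stand-ins chosen for illustration. The sketch only mirrors the sequence described above: declaration block first, then rule matching, then the implicitly generated token/AST creation, and finally the return action block.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written illustration of block ordering; not generated by LLK. */
public class RuleOrderSketch {
    /** Records the order in which each phase of the rule runs. */
    static final List<String> trace = new ArrayList<>();

    /** Hypothetical stand-in for matching the ID rule. */
    static String matchId() {
        trace.add("definition block: match ID");
        return "someIdentifier";
    }

    /** Hypothetical stand-in for implicitly generated token/AST creation. */
    static String createNode(String text) {
        trace.add("implicit token/AST creation");
        return "Node(" + text + ")";
    }

    /** Mirrors the phases of the example rule (return type is an assumption). */
    static String rule() {
        // Declaration block: runs before any matching takes place.
        trace.add("declaration block");
        String t;

        // Definition block: the actual matching operations.
        t = matchId();

        // Implicit creation code runs after matching ...
        String llkThis = createNode(t);

        // ... and the optional return action block runs last.
        trace.add("return action block");
        return llkThis;
    }

    public static void main(String[] args) {
        String result = rule();
        for (String step : trace) {
            System.out.println(step);
        }
        System.out.println(result);
    }
}
```

Running main prints the four phases in the order listed above, followed by the value returned from the return action block. The point of the sketch is that code in the return action block can rely on llkThis already being populated, whereas code in the declaration block cannot.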
    
[Figure: LLKNodeGraph.png]
See LLK.ll for details.