Lexer Part of Anglr File


Introduction

A lexical analyser is the part of a text analyser (e.g. a compiler) that passes terminal symbols to the syntax analyser. It is composed of any number of scanners, which it keeps on a stack. One of them must be the initial scanner, and it must never be removed from the stack unless it is immediately replaced by another scanner.
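
The stack rule can be illustrated with a minimal Python sketch. The class and method names here (ScannerStack, push, pop, replace) are hypothetical and are not part of the anglr runtime; the sketch only models the constraint that the stack never becomes empty, with scanners represented by their names.

```python
class ScannerStack:
    """Hypothetical model of a lexer's scanner stack (not the anglr API)."""

    def __init__(self, initial_scanner):
        # The stack starts with the initial scanner and is never empty.
        self._stack = [initial_scanner]

    @property
    def current(self):
        # The scanner on top of the stack is the active one.
        return self._stack[-1]

    def push(self, scanner):
        self._stack.append(scanner)

    def pop(self):
        # Refuse to remove the last remaining scanner.
        if len(self._stack) == 1:
            raise RuntimeError("the bottom scanner can only be replaced, not removed")
        return self._stack.pop()

    def replace(self, scanner):
        # Swap the active scanner in a single step, so the stack
        # always holds at least one scanner.
        old = self._stack[-1]
        self._stack[-1] = scanner
        return old
```

In this model, a stack created with ScannerStack("mathScanner") accepts push("commentScanner") followed by pop(), but a second pop() raises an error, while replace("otherScanner") succeeds because a scanner is always left in place.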

Syntax

The syntax of the lexer part is very simple. It has no content of its own: the list of scanners it needs for its operation is defined by attributes. The following syntax rule defines the structure of the lexer part:


RULE L-1

<lexer part>
    : <attribute list> ? '%lexer' <identifier> '%{' <attribute list> ? '%}'
    ;
                    

Here is an example of a lexer part:

[ Description Text='Lexer for anglr file' Hover='true' ]
[
    UseScanner
        ScannerId='commentScanner'
        InitialScanner='mathScanner'
        Hover='true'
]
[ CompilationInfo ClassName='MathLexer' NameSpace='Math.Lexer' Access='public' Hover='true' ]
%lexer mathLexer
%{

%}
                    

Discussion

RULE L-1 - Structure of lexer part

RULE L-1 defines the structure of the lexer part:

  • The lexer part is preceded by a possibly empty attribute list.
  • The attribute list is followed by the reserved word %lexer and an identifier that names the lexer.
  • Then comes another possibly empty attribute list, enclosed between the part delimiters %{ and %}.

In the example above, the lexer part is preceded by this attribute list:

[ Description Text='Lexer for anglr file' Hover='true' ]
[
    UseScanner
        ScannerId='commentScanner'
        InitialScanner='mathScanner'
        Hover='true'
]
[ CompilationInfo ClassName='MathLexer' NameSpace='Math.Lexer' Access='public' Hover='true' ]
                    

The name of the lexer part is given by:

%lexer mathLexer
                    

and its body is empty:

%{

%}
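
The three pieces identified above (leading attributes, the %lexer line, and the body) can be recognised mechanically. The following Python sketch checks only the overall shape of RULE L-1 with a regular expression. It assumes that each attribute is a bracketed group containing no ']' character and that identifiers look like typical programming-language identifiers; neither assumption comes from the anglr specification, and the function name lexer_name is hypothetical.

```python
import re

# One bracketed attribute; assumes no ']' appears inside attribute values.
ATTRIBUTE = r"\[[^\]]*\]"

LEXER_PART = re.compile("".join([
    r"\s*(?:", ATTRIBUTE, r"\s*)*",           # possibly empty attribute list
    r"%lexer\s+(?P<name>[A-Za-z_]\w*)\s*",    # reserved word and lexer name
    r"%\{\s*(?:", ATTRIBUTE, r"\s*)*",        # opening delimiter and body attributes
    r"%\}\s*",                                # closing delimiter
]))


def lexer_name(text):
    """Return the lexer name if text has the shape of RULE L-1, else None."""
    match = LEXER_PART.fullmatch(text)
    return match.group("name") if match else None


print(lexer_name("%lexer mathLexer %{ %}"))  # prints: mathLexer
```

A real parser would tokenise attribute values properly; the regular expression is only meant to make the order of the three pieces concrete.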
                    

Attributes

The attribute list specified before the lexer part must contain attributes describing which scanners the lexical analyser contains. Attributes located between %{ and %} are not important.

These attributes are mandatory for the lexer part:

Attribute | Value Name | Value | Description
UseScanner | ScannerId | scanner id | The name of a scanner part defined in the anglr file. Multiple ScannerId values may be specified, as many as needed; one ScannerId value must be specified for each scanner contained in the lexical analyzer.
UseScanner | InitialScanner | scanner id | The id of the scanner that the lexical analyzer loads at the beginning of its operation. If no initial scanner is given, the first ScannerId value is used as the initial scanner. The InitialScanner value need not be one of the ScannerId values; if it is not, the scanner with the InitialScanner id is added to the set of scanners used by the lexical analyzer.
CompilationInfo | ClassName | name of C# class | The name the anglr compiler gives to the generated class that implements this lexical analyzer.
CompilationInfo | NameSpace | namespace of C# class | The namespace of the class generated by the anglr compiler.
CompilationInfo | Access | class access | The access of the generated class; it should be one of these keywords: public, private or internal.
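
The interaction between ScannerId and InitialScanner described in the table can be summarised in a short Python sketch. The function resolve_scanners is hypothetical and only restates the rules above; the anglr compiler's actual implementation, including the order in which it keeps the scanners, may differ.

```python
def resolve_scanners(scanner_ids, initial_scanner=None):
    """Return (initial, scanners) according to the UseScanner description.

    Hypothetical helper, not part of the anglr compiler.
    """
    if initial_scanner is None:
        if not scanner_ids:
            raise ValueError("a ScannerId or an InitialScanner value is required")
        # No InitialScanner given: the first ScannerId value is used.
        initial_scanner = scanner_ids[0]

    scanners = list(scanner_ids)
    if initial_scanner not in scanners:
        # InitialScanner need not be listed as a ScannerId; if it is not,
        # it is added to the set of scanners used by the lexer.
        scanners.append(initial_scanner)
    return initial_scanner, scanners
```

For the mathLexer example, resolve_scanners(["commentScanner"], "mathScanner") yields mathScanner as the initial scanner and a scanner set containing both commentScanner and mathScanner.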