antlr

Using antlr to parse a | separated file

So I think this should be easy, but I'm having a tough time with it. I'm trying to parse a | delimited file, and any line that doesn't start with a | is a comment. I guess I don't understand how comments work. It always errors out on a comment line. This is a legacy file, so there's no changing it. Here's my grammar. grammar Route;...

ANTLR how to detect rubbish data at end of input

When using grammars written in ANTLR, the parser correctly recognizes data from an input stream, but if I have some rubbish text at the end of the input (which is not supposed to be parsed by the grammar) the parser does not complain. I guess this behavior is all right (I mean the parser did its job and parsed whatever I said it should ...

Antlr beginner mismatchedtoken question

I'm just starting out with Antlr, so please forgive the noob question here. I'm lost. Any help is appreciated. This is my grammar script: grammar test; script : 'begin script' IDENT ':' 'end script' IDENT ; IDENT : ('a'..'z' | 'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9')* ; This is the script I'm trying to run it agai...

How can my ANTLR lexer match a token made of characters that are subset of another kind of token?

I have what I think is a simple ANTLR question. I have two token types: ident and special_ident. I want my special_ident to match a single letter followed by a single digit. I want the generic ident to match a single letter, optionally followed by any number of letters or digits. My (incorrect) grammar is below: expr : special_...

Antlr3 - HIDDEN token in the parser

Can you use a token defined in the lexer in a hidden channel in a single rule of the parser as if it were a normal token? The generated code is Java... thanks ...

How can I execute an ANTLR parser action for each item in a rule that can match more than one item?

I am trying to write an ANTLR parser rule that matches a list of things, and I want to write a parser action that can deal with each item in the list independently. Some example input for these rules is: $(A1 A2 A3) I'd like this to result in an evaluator that contains a list of three MyIdentEvaluator objects -- one for each of A1, A...

Hidden token into default channel - AntlrV3

Suppose I have whitespace (WS) on the hidden channel. For one particular rule alone, I want whitespace to be considered as well. Is it possible to bring WS to the default channel for that particular rule alone in the parser? ...

Antlr3 parser path command shell

I need to parse shell commands such as: cp /home/test /home/test2 My problem is parsing the path correctly. I defined a rule (I cannot use a token for the path; I need to define it in the parser): path : ('/' ID)+; with ID: (A.. Z | a.. z) +; WS: (' ') {$channel = HIDDEN;}; I need to keep the token WS hidden, but this gives ...

Will rewriting a multipurpose log file parser to use formal grammars improve maintainability?

TLDR: if I built a multipurpose parser by hand with different code for each format, will it work better in the long run using one chunk of parser code and an ANTLR, PyParsing or similar grammar to specify each format? Context: My job involves lots of benchmark log files from ~50 different benchmarks. There are a few in XML, a few HTML,...

How can I transform a functional language in XML to Java?

I'm working with a DSL based on an XML schema that supports functional language features such as loops, variable state with context, and calls to external Java classes. I'd like to write a tool which takes the XML document and converts it to, at the very least, something that looks like Java, where the <set> tags get converted to variabl...

Interpreting a variable number of tree nodes in ANTLR Tree Grammar

Whilst creating an inline ANTLR Tree Grammar interpreter I have come across an issue regarding the multiplicity of procedure call arguments. Consider the following (faulty) tree grammar definition. procedureCallStatement : ^(PROCEDURECALL procedureName=NAME arguments=expression*) { if(procedureName.equals("foo...

Using $ as delimiter in StringTemplate from ANTLR rewriter grammars

I'm trying to write an ANTLR3 grammar that generates HTML output using StringTemplate. To avoid having to escape all the HTML tags in the template rules (e.g. \<p\><variable>\</p\>), I'd prefer to use dollar as the delimiter for StringTemplate (e.g. <p>$variable$</p>). While the latter seems to be the default when StringTemplate is used...

ANTLR - basic grammar including unexpected characters?

I've got a really simple ANTLR grammar that I'm trying to get working, but failing miserably at the moment. Would really appreciate some pointers on this... root : (keyword|ignore)*; keyword : KEYWORD; ignore : IGNORE; KEYWORD : ABBRV|WORD; fragment WORD : ALPHA+; fragment ALPHA : 'a'..'z'|'A'..'Z'; fragment ABBRV : WOR...

Converting Antlr syntax tree into useful objects

Hello everyone, I'm currently pondering how best to take an AST generated using Antlr and convert it into useful objects which I can use in my program. The purpose of my grammar (apart from learning) is to create an executable (runtime-interpreted) language. For example, how would I take an attribute sub-tree and have a specific A...

Regular expression token antlrV3

Can I write a rule where the initial token is partly fixed and partly generic? rule: ID '=' NUMBER ; ID: (A.. Z | a.. Z) + NUMBER: (0 .. 9) + But only if the token ID is in the form var* (var is fixed) Thanks ...

Combined grammar ANTLR option filter

I have a combined grammar (lexer and parser on the same file). How do I set the filter = true to the lexer? Thanks ...

How can I modify the text of tokens in a CommonTokenStream with ANTLR?

I'm trying to learn ANTLR and at the same time use it for a current project. I've gotten to the point where I can run the lexer on a chunk of code and output it to a CommonTokenStream. This is working fine, and I've verified that the source text is being broken up into the appropriate tokens. Now, I would like to be able to modify the...

Equal (not a token) in an ANTLR grammar. What does this mean?

What does the construct basename = mean in the following rule? tabname: (ID'.')? basename = ID ; There is this single occurrence of basename in the grammar. Thanks ...

ANTLR - Writing a tree grammar for an AST

I have an AST outputted for some Lua code by my grammar file, which currently does parsing and lexing for me. I want to add a tree grammar to this, but since I'm using C# I'm not sure how to do it. What's the basic process for generating tree grammar code when you already have a parser and lexer written? UPDATE: I have the following gra...

Improving ANTLR DSL parse-error messages

I'm working on a domain-specific language (DSL) for non-programmers. Non-programmers make a lot of grammar mistakes: they misspell keywords, they don't close parentheses, they don't terminate blocks, etc. I'm using ANTLR to generate my parser; it provides a nifty mechanism for handling RecognitionExceptions to improve error handling. ...