antlr

What is a 'semantic predicate' in ANTLR?

What is a semantic predicate in ANTLR? ...

Can a parser tell the lexer to ignore a newline?

I'm writing a preprocessor for my language. In the preprocessor I've output a line that wasn't in the source file. This causes the line numbers in any error messages that ANTLR creates to be off by one. The lexer handles the line count, so I'm wondering if there is a way for the parser to tell the lexer to decrement the line count, or to ign...

Processing OCRed text

Hi, I am extracting text from OCRed TIFF files using a library and dumping it into a database. The text I am extracting comes from forms with fields like NAME, DOB, COUNTRY, etc. Since OCR does not know the difference between an actual value and its label, it just dumps all the text. Now I have text in the DB in the following format: Name: MyName Ad...

Parsing ambiguous input with Antlr

I have been trying for a few days to parse some text that consists of text and numbers (I've called it a sentence in my grammar). sentence options { greedy=false; } : (ANY_WORD | INT)+; I have a rule that needs to parse a sentence that finishes with an INT sentence_with_int : sentence INT; ...

scanning binary files in antlr3

I would like to parse a binary file and specify the characters in hex format instead of Unicode. Is this possible? For instance: rule: '\x7F' ; instead of: rule: '\u007F' ; since I do not understand how Unicode maps to one byte. ...
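On the byte-mapping part of this question, a minimal sketch (in Python, chosen only for illustration; the sample bytes are made up): when input is decoded as ISO-8859-1 (Latin-1), each byte maps to the Unicode code point with the same numeric value, so '\u007F' would correspond to the single byte 0x7F. Whether a given ANTLR 3 character stream decodes its input this way depends on the encoding it is opened with, which is an assumption here.

```python
# Hypothetical sample bytes standing in for binary file content.
data = bytes([0x7F, 0x00, 0xFF])

# ISO-8859-1 (Latin-1) decodes each byte to the code point of the same value.
text = data.decode("latin-1")
code_points = [ord(ch) for ch in text]

# Show each decoded character as a Unicode code point.
print([f"U+{cp:04X}" for cp in code_points])
```

Under that assumption, a grammar literal written as '\u00NN' would match the byte with hex value NN.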

MismatchedTokenException in HTML subset grammar

I am writing an ANTLR grammar to recognize HTML block-level elements within plain text. Here is a relevant snippet, limited to the div tag: grammar Test; blockElement : div ; div : '<' D I V HTML_ATTRIBUTES? '>' (blockElement | TEXT)* '</' D I V '>' ; D : ('d' | 'D') ; I : ('i' | 'I') ; V : ('v' | 'V') ; HTML_ATTRIBUTES : ...

Is there a detailed tutorial on creating a multipass language translator using ANTLR?

I have been through the ANTLR tutorial series by Scott Stanchfield and am also going through the ANTLR book... My project requires me to build a multipass language translator to translate code from one language to another. Both languages are similar in syntax, with the difference being that the source language has support for classes a...

How to create a parser which tokenizes a list of words taken from a file?

Hi, I am trying to write a syntax corrector for text for my compilers class. The idea is: I have some rules, which are inherent to the language (in my case, Portuguese), like "A valid phrase is SUBJECT VERB ADJECTIVE", as in "Ruby is great". OK, so first I have to tokenize the input "Ruby is great". So I have a text file "verbs", with a lot ...

ANTLR 3, Parsing, Single Quote and LookAhead

A very long subject for an annoying issue. I am trying to parse this: $A['B', 3] for example, so that I can invoke a Java class with the 3 arguments, something along the lines of: simpleTermTwoParams returns [IEvaluator e]: v1=VAR '[' '\'' s1=STR '\'' ',' i1=INTEGER ']' {$e = new VarEvaluator ($v1.text, $s1.text, Integer.parseInt($i1.tex...

Interpreting IF statements in ANTLR

I'm implementing a BASIC-like language. The syntax of IF statements is almost the same as in BASIC: IF a == b THEN PRINT "EQUAL" ELSE PRINT "UNEQUAL" ENDIF I have written a grammar file to parse the language and a tree walker to interpret it: [Expr.g] options { language=Python; output=AST; ASTLabelType=CommonTree; } to...

How do I exclude characters / symbols using ANTLR grammar?

I'm trying to write a grammar for various time formats (12:30, 0945, 1:30-2:45, ...) using ANTLR. So far it works like a charm as long as I don't type in characters that haven't been defined in the grammar file. I'm using the following JUnit test for example: final CharStream stream = new ANTLRStringStream("12:40-1300,15:123-18:59"...

Antlr - Object reference not set to an instance of an object

Hey, Does anyone know anything about how Antlr works? I'm getting an error on a dev server: [NullReferenceException: Object reference not set to an instance of an object.] Antlr.StringTemplate.CommonGroupLoader.LocateFile(String filename) +19 Antlr.StringTemplate.CommonGroupLoader.LoadGroup(String groupName, StringTemplateGroup s...

How can I configure Visual Studio 2010 to build ANTLR grammars for C++ projects?

The .rules files provided with the distribution aren't recognized by VS2010, and I'd really like to avoid having to write a whole MSBuild task and all that for what should be a simple tool. Currently I've been using the pre-build event and writing the command line manually... but that kind of sucks when there's more than one grammar to wo...

ANTLR: how to convert BNF/EBNF into an ANTLR grammar?

I have to generate a parser for CSV data. Somehow I managed to write BNF and EBNF for CSV data, but I don't know how to convert this into a grammar for ANTLR (which is a parser generator). For example, in EBNF we write: [{header entry}newline]newline but when I write this in ANTLR to generate a parser, it's giving an error and not taking brack...
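For the bracket notation in this question, a rough sketch of the usual mapping, assuming ISO-style EBNF where `[x]` means optional and `{x}` means zero or more: in ANTLR these are written `(x)?` and `(x)*`. The helper `ebnf_to_antlr` below is hypothetical and only rewrites the brackets; it does not handle brackets inside quoted literals, and the multi-word name `header entry` is written as the made-up single rule name `header_entry`.

```python
# Hypothetical helper: rewrites EBNF option/repetition brackets into
# ANTLR's parenthesized subrules with ? and * suffixes.
BRACKET_MAP = {
    "[": "(",   # start of optional group
    "]": ")?",  # end of optional group
    "{": "(",   # start of repetition group
    "}": ")*",  # end of repetition group
}

def ebnf_to_antlr(fragment: str) -> str:
    """Replace [ ] and { } brackets; every other character is copied as-is."""
    return "".join(BRACKET_MAP.get(ch, ch) for ch in fragment)

converted = ebnf_to_antlr("[{header_entry} newline] newline")
print(converted)
```

A character-by-character mapping is enough here because the brackets nest the same way in both notations; a real converter would also need to translate rule separators and terminators.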

ANTLR Trees necessary?

What is the purpose of using an AST while building a compiler (with ANTLR)? Is it necessary to have one? What is the so-called TreeParser and how can one use it? Is it possible to build a compiler without any trees? If not, are there any good tutorials describing the topic in detail? ...

ANTLR, steps order

I am trying to design a compiler for a language like C# in ANTLR, but I don't fully comprehend the proper order of steps that should be undertaken. This is how I see it: First, I define lexer tokens. Then, grammar rules (with rewrite rules to build the AST) with actions that gather information about class and method declarations (so that...

ANTLR: rule Tokens has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2

grammar AdifyMapReducePredicate; PREDICATE : PREDICATE_BRANCH | EXPRESSION ; PREDICATE_BRANCH : '(' PREDICATE (('&&' PREDICATE)+ | ('||' PREDICATE)+) ')' ; EXPRESSION : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ; Trying to interpret this in ANTLRWorks 1.4 and getting the followin...

Returning multiple values in ANTLR rule

Hi, I have an ANTLR rule like this: receive returns[Evaluator e,String message] : RECEIVE FILENAME {$e= new ReceiveEvaluator($FILENAME.text);} ; I have added a new return value, message, and I want to put the file content in it. One way I could do this is to make the evaluator return the String when I walk the tree by calling the evaluate() met...

An antlr problem with embedded comments

Hi, I am trying to implement a nested comment in D. nestingBlockComment : '/+' (options {greedy=false;} :nestingBlockCommentCharacters)* '+/' {$channel=HIDDEN;}; // line 58 nestingBlockCommentCharacters : (nestingBlockComment| '/'~'+' | ~'/' ) ; //line 61 For me, it would be logical that this should work... This is the error...

Using the ANTLR C target, how can I get the previously matched token in the Lexer?

I have a relatively complicated lexer problem. Given the following input: -argument -argument#with hashed data# #plainhashedData# I need these tokens: ARGUMENT (Text = "argument") ARGUMENT (Text = "argument") EXTRADATA (Text = "with hashed data") OTHER (Text = "#plainhasheddata#") I've been able to take care of the text manipulatio...