Recommendations for a good C#/.NET-based lexical analyser
Can anyone recommend a good .NET based lexical analyser, preferably written in C#? ...
Lexical analyzers are quite easy to write when you have regexes. Today I wanted to write a simple general analyzer in Python, and came up with: import re import sys class Token(object): """ A simple Token structure. Contains the token type, value and position. """ def __init__(self, type, val, pos): self.ty...
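The snippet above is cut off, so the following is not the poster's code. It is only a minimal sketch of the general idea (a regex-driven lexer), assuming each rule is a (name, pattern) pair and that the patterns contain no capturing groups of their own. The names `tokenize`, `LexerError`, and `RULES` are hypothetical and chosen for illustration.

```python
import re


class Token(object):
    """A simple token: its type, the matched text, and the start offset."""

    def __init__(self, type, val, pos):
        self.type = type
        self.val = val
        self.pos = pos

    def __repr__(self):
        return "%s(%r) at %d" % (self.type, self.val, self.pos)


class LexerError(Exception):
    """Raised when no rule matches at the current position."""

    def __init__(self, pos):
        Exception.__init__(self, "no rule matches at position %d" % pos)
        self.pos = pos


def tokenize(text, rules, skip=("WS",)):
    # Combine all rules into one alternation of named groups, so a single
    # match call tells us both the text and which rule produced it.
    master = re.compile(
        "|".join("(?P<%s>%s)" % (name, pattern) for name, pattern in rules)
    )
    pos = 0
    tokens = []
    while pos < len(text):
        m = master.match(text, pos)
        # Treat an empty match as an error to avoid looping forever.
        if m is None or m.end() == pos:
            raise LexerError(pos)
        if m.lastgroup not in skip:
            tokens.append(Token(m.lastgroup, m.group(), pos))
        pos = m.end()
    return tokens


# Hypothetical rule set; earlier rules win when several could match.
RULES = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("WS", r"\s+"),
]
```

Calling `tokenize("x = 42 + y1", RULES)` yields identifier, operator, and number tokens with their offsets, and whitespace is dropped. The same design carries over to C# with `System.Text.RegularExpressions` and named groups, though that port is not shown here.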
I'm looking for a decent lexical scanner generator for C#/.NET -- something that supports Unicode character categories, and generates somewhat readable & efficient code. Anyone know of one? EDIT: I need support for Unicode categories, not just Unicode characters. There are currently 1421 characters in just the Lu (Letter, Uppercase)...
I'm working on some code generation tools, and a lot of complexity comes from doing scope analysis. I frequently find myself wanting to know things like What are the free variables of a function or block? Where is this symbol declared? What does this declaration mask? Does this usage of a symbol potentially occur before initialization?...
I want to be able to predicate pattern matches on whether they occur after word characters or after non-word characters. In other words, I want to simulate the \b word break regex char at the beginning of the pattern which flex/lex does not support. Here's my attempt below (which does not work as desired): %{ #include <stdio.h> %} %x ...
I have a Python regular expression that contains a group which can occur zero or many times - but when I retrieve the list of groups afterwards, only the last one is present. Example: re.search("(\w)*", "abcdefg").groups() this returns the list ('g',) I need it to return ('a','b','c','d','e','f','g',) Is that possible? How can I do i...
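For reference, a repeated capturing group in Python's `re` keeps only its last iteration, which is why only `'g'` comes back. One common workaround, sketched here under the assumption that every word character in the string is wanted, is to match the single-character pattern repeatedly with `re.findall` instead of repeating the group:

```python
import re

text = "abcdefg"

# A repeated group only remembers the last iteration it matched.
last_only = re.search(r"(\w)*", text).groups()

# findall returns every non-overlapping match of the one-character pattern.
all_chars = tuple(re.findall(r"\w", text))

print(last_only)
print(all_chars)
```

If only a leading run of word characters is wanted, matching `\w*` once and splitting the matched string into characters is an alternative.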
It seems that flex doesn't support UTF-8 input. Whenever the scanner encounters a non-ASCII char, it stops scanning as if it were an EOF. Is there a way to force flex to eat my UTF-8 chars? I don't want it to actually match UTF-8 chars, just eat them when using the '.' pattern. Any suggestions? EDIT: The simplest solution would be: ...
I need a simple lexical analyzer that reports for-loop errors in C/C++. ...
I have a project where I need to compare multi-chapter documents to a second document to determine their similarity. The issue is I have no idea how to go about doing this, what approaches exist or if there are any libraries available. My first question is... what is similar? The number of words that match, the number of consecutive wo...
I am programming a lexer in C and I read somewhere about the header file tokens.h. Is it there? If so, what is its use? ...
Suppose I have a lex regular expression like [aA][0-9]{2,2}[pP][sS][nN]? { return TOKEN; } If a user enters A75PsN A75PS It will match But if a user says something like A75PKN I would like it to error and say "Character K not recognized, expecting S" What I am doing right now is just writing it like let [a-zA-Z] num [0-9] {l...
I already made a scanner, now I'm supposed to make a parser. What's the difference? ...
I have a Makefile so that when I type make the following commands run: yacc -d parser.y gcc -c y.tab.c flex calclexer.l gcc -c lex.yy.c But then after this I get the following error messages: calclexer.l:10: error: parse error before '[' token calclexer.l:10: error: stray '\' in program calclexer.l:15: error: stray '\' in program cal...
When I run make on the following Makefile, when is the symbol table built, if it even is? LEX = flex YACC = yacc CC = gcc calcu: y.tab.o lex.yy.o $(CC) -o calcu y.tab.o lex.yy.o -ly -lfl y.tab.c y.tab.h: parser.y $(YACC) -d parser.y y.tab.o: y.tab.c parser.h $(CC) -c y.tab.c lex.yy.o: y.tab.h lex.yy.c $(CC) -c lex.y...
I need to make a scanner in lex/flex to find tokens and a parser in yacc/bison to process those tokens based on the following grammar. When I was in the middle of making the scanner, it appeared to me that variables, functions, and arrays in this language can only have the name 'ID'. Am I misreading this yacc file? /* C-Minus BNF Gram...
Is it always necessary to do so? What does it look like? ...
I made a program that is supposed to recognize a simple grammar. When I input what I think is supposed to be a valid statement, I get an error. Specifically, if I type int a; int b; it doesn't work. After I type int a; the program echoes ; for some reason. Then when I type int b; I get syntax error. The lex file: %{ #include <st...
In my yacc file I have things like the following: var_declaration : type_specifier ID ';' | type_specifier ID '[' NUM ']' ';' ; type_specifier : INT | VOID ; ID, NUM, INT, and VOID are tokens that get returned from flex, so yacc has no problems recognizing them. The problem is that in the above there are things like ...
If I have the following in my flex file, what does it do? [\\[\\];] { return yytext[0]; } ...
I made a program that is supposed to recognize a simple grammar. When I input what I think is supposed to be a valid statement, I get an error. Specifically, if I start out with an identifier, I automatically get a syntax error. However, I noticed that using an identifier won't generate an error if it is preceded by 'int'. If a is an...