Hello,
I'm building my own language using Flex, but I want to know some things:

  • Why use lexical analyzers?
  • Are they going to help me with anything?
  • Are they obligatory?

Thanks.

+4  A: 

Lexical analysis helps simplify parsing because the lexemes can be treated as abstract entities rather than concrete character sequences.

You'll need more than flex to build your language, though: Lexical analysis is just the first step.
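To make the "abstract entities" point concrete, here is a minimal hand-written scanner sketch in C, roughly the kind of code flex would otherwise generate for you. The token kinds, the `Token` struct, and `next_token` are hypothetical names chosen for illustration, and the token set (numbers, identifiers, single-character operators) is an assumption, not part of any real language.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for a tiny expression language. */
typedef enum { TOK_NUMBER, TOK_IDENT, TOK_OP, TOK_END } TokenKind;

typedef struct {
    TokenKind kind;
    char text[32];   /* the lexeme, truncated if longer than 31 chars */
} Token;

/* Read one token starting at *src and advance *src past it. */
Token next_token(const char **src)
{
    Token tok = { TOK_END, "" };
    const char *p = *src;
    size_t n = 0;

    while (isspace((unsigned char)*p))      /* skip whitespace */
        p++;

    if (*p == '\0') {                       /* end of input */
        *src = p;
        return tok;
    }

    if (isdigit((unsigned char)*p)) {       /* run of digits */
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) {
            if (n < sizeof tok.text - 1)
                tok.text[n++] = *p;
            p++;
        }
    } else if (isalpha((unsigned char)*p) || *p == '_') {   /* identifier */
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (n < sizeof tok.text - 1)
                tok.text[n++] = *p;
            p++;
        }
    } else {                                /* anything else: one-char operator */
        tok.kind = TOK_OP;
        tok.text[n++] = *p++;
    }

    tok.text[n] = '\0';
    *src = p;
    return tok;
}
```

Calling `next_token` repeatedly on `"x1 = 42+y"` would yield an identifier, an operator, a number, an operator, an identifier, and then `TOK_END`. A parser built on top of this only ever sees token kinds, never raw characters.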

Richard Pennington
Can you show an example of code without the simplicity of Flex?
Nathan Campos
+1  A: 

You would consider using a lexical analyzer because you can describe your language (the grammar) declaratively in BNF (or EBNF), then use a parser to read a program written in your language into an in-memory structure that you can manipulate freely.

It's not obligatory and you can of course write your own, but that depends on how complex the language is and how much time you have to reinvent the wheel.

Also, because you can describe your language in BNF without changing the lexical analyzer itself, you can experiment freely and keep changing the grammar until you have exactly what works for you.

Petros
You can write your own lexer, yes. However, to do it *right* like flex does, fast and efficient, you have to check each character against every character at that token position in your entire grammar one by one until one (and only one) is recognized. Implementing a state machine like that is most efficiently done with extensive use of gotos. That may be OK for someone who knows what they are doing on a simple grammar, but generally such things are best left to tools.
T.E.D.
@T.E.D. I agree. That's why I said "...how much time you have to reinvent the wheel". Anyway, I certainly don't encourage someone to write something like this from scratch.
Petros
+2  A: 

Any time you are converting an input string into space-separated strings and/or numeric values, you are performing lexical analysis. Writing a cascading series of else if (strcmp(...) == 0) ... statements counts as lexical analysis. Even such nasty tools as sscanf and strtok are lexical analysis tools.
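As a rough illustration of that kind of ad hoc lexical analysis, the sketch below splits a line with `strtok` and classifies each word with a `strcmp` cascade. The `Keyword` enum, `classify_word`, and `count_keywords` are hypothetical names, and the three keywords are an assumption made only for this example.

```c
#include <string.h>

/* Hypothetical keyword set for illustration. */
typedef enum { KW_IF, KW_ELSE, KW_WHILE, KW_UNKNOWN } Keyword;

/* Classify one word with a cascade of strcmp calls. */
Keyword classify_word(const char *word)
{
    if (strcmp(word, "if") == 0)
        return KW_IF;
    else if (strcmp(word, "else") == 0)
        return KW_ELSE;
    else if (strcmp(word, "while") == 0)
        return KW_WHILE;
    return KW_UNKNOWN;
}

/* Split a line on whitespace and count the recognized keywords.
 * Note: strtok modifies the buffer, so `line` must be writable. */
int count_keywords(char *line)
{
    int count = 0;
    char *word;

    for (word = strtok(line, " \t\n"); word != NULL;
         word = strtok(NULL, " \t\n")) {
        if (classify_word(word) != KW_UNKNOWN)
            count++;
    }
    return count;
}
```

This works for whitespace-separated input, but it cannot split something like `if(x)` into separate tokens, and every new keyword adds another comparison per word.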

You'd want to use a tool like flex instead of one of the above for one of several reasons:

  • The error handling can be made much better.
  • You can be much more flexible in what different things you recognize with flex. For instance, it is tough to parse a C-format hexadecimal value properly with scanf routines. scanf pretty much has to know the hex value is coming. Lex can figure it out for you.
  • Lex scanners are faster. If you are parsing a lot of files, and/or large ones, this could become important.
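To show what "figuring it out" means for the hexadecimal case, here is a small hand-written C sketch that decides from the characters themselves whether a lexeme is a C-style hex literal, a decimal literal, or something else. This is roughly the decision a generated scanner makes from its patterns. `LexKind` and `scan_number` are hypothetical names; the sketch ignores overflow, suffixes such as `UL`, and octal, and treats a leading zero as plain decimal.

```c
#include <ctype.h>

typedef enum { LEX_HEX, LEX_DEC, LEX_OTHER } LexKind;

/* Classify a complete lexeme and store its numeric value in *value.
 * Assumption: no overflow checking, no octal, no integer suffixes. */
LexKind scan_number(const char *s, long *value)
{
    long v = 0;
    const char *p;

    /* "0x" or "0X" followed by at least one hex digit */
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X') &&
        isxdigit((unsigned char)s[2])) {
        for (p = s + 2; isxdigit((unsigned char)*p); p++) {
            int c = tolower((unsigned char)*p);
            v = v * 16 + (isdigit(c) ? c - '0' : c - 'a' + 10);
        }
        if (*p != '\0')
            return LEX_OTHER;       /* trailing junk, e.g. "0x1G" */
        *value = v;
        return LEX_HEX;
    }

    /* one or more decimal digits */
    if (isdigit((unsigned char)s[0])) {
        for (p = s; isdigit((unsigned char)*p); p++)
            v = v * 10 + (*p - '0');
        if (*p != '\0')
            return LEX_OTHER;       /* e.g. "12abc" or "0xZZ" */
        *value = v;
        return LEX_DEC;
    }

    return LEX_OTHER;
}
```

With flex, the same distinction would be expressed as two patterns rather than hand-written branches, and the tool would build the state machine for you.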
T.E.D.