views:

632

answers:

10

Recently I started learning ANTLR, and learned that a lexer and parser together can be used to construct programming languages.

Other than DSLs and programming languages, have you ever directly or indirectly used lexer/parser tools (and knowledge) to solve a real-world problem? Could an average programmer solve the same problem without lexer/parser knowledge?

+7  A: 

Yes, I've used them. Yes, you can do things without them--but any time you choose the wrong tool for the job, you'll make needless pain for yourself.

Some examples of the non-standard uses I've personally put the technology to:

  • scraping data from reports generated by legacy systems
  • picking out patterns in data too complex for a regexp
  • protocol analysis
  • text based adventure games
  • the metaprogramming API that ate Toledo (not its real name)
  • code analysis / log analysis
  • picking apart "freeform" fields in a database
  • and scads more I'm forgetting (I'm old)
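Most of the items above share the same core mechanism: a token specification plus a scanner loop. A minimal sketch in Python (the token names and the sample report line are invented for illustration):

```python
import re

# Hypothetical token spec for scraping fields out of a legacy report line.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("WORD",   r"[A-Za-z_]\w*"),
    ("SKIP",   r"\s+"),
    ("OTHER",  r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(text):
    """Yield (kind, value) pairs, skipping whitespace."""
    for match in MASTER.finditer(text):
        if match.lastgroup != "SKIP":
            yield (match.lastgroup, match.group())

print(list(tokenize("TOTAL 1042.50")))
# [('WORD', 'TOTAL'), ('NUMBER', '1042.50')]
```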
MarkusQ
+1  A: 

A great example of a lexer/parser in use in many systems is Apache Lucene (an open source search index library). Both the query parser and the document tokenizer use these techniques. While you could arguably categorize Lucene's query parser as a DSL parser, it is still being used to help solve a real-world problem.

For that matter, I'm sure that Google employs some sort of lexer/parser for its own query syntax and document parsing.

dustyburwell
+3  A: 

Syntax highlighting. The SciTE text editor allows you to write your own lexer (in C++) to provide syntax highlighting for any custom language. I wrote my own custom lexer for SciTE as a refresher on this topic (I studied it a while ago at university).

Regular expressions are often used as an alternative for pattern matching and simple language processing. This has become even more common in recent years thanks to improved regex support in frameworks such as .NET. In many cases developers may not even know of lexing/parsing techniques and so fall back on regexes by default.

However, as another answer says, regexes can quickly become inefficient, slow, and difficult to maintain for anything more than a simple grammar/language. In that situation lexers/parsers are generally the better choice.
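A classic illustration of where a lone regex gives out is nested structure such as balanced parentheses, which a hand-written check handles in a few lines (a Python sketch, not tied to any particular tool mentioned above):

```python
def balanced(s):
    """Return True if every '(' has a matching ')' in order -- a property a
    single classic regular expression cannot verify, but a trivial parser can."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # a ')' arrived before its '('
                return False
    return depth == 0

print(balanced("(a(b)c)"))   # True
print(balanced("(a(b)c"))    # False
```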

Ash
+1  A: 

Yes, I've used them in real-world work--but mostly, the creation of custom languages that you would use lexers and parsers for has been supplanted by languages defined in XML. More verbose, but then you don't have to do all that work...

Kendall Helmstetter Gelner
+1  A: 

This is interesting -

I just wrote a lexer/parser by hand to allow simple string-based query expressions to be handled by an IBindingListView implementation. That was the first useful thing I have actually been able to apply the technique to, rather than just hearing about it.
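For flavor, the heart of such a string-filter parser can be quite small; a hedged Python sketch (the grammar here--comparisons joined by AND--is my assumption, not necessarily the one described above):

```python
import re

# Assumed grammar: expr := comparison ("AND" comparison)*
#                  comparison := FIELD OP VALUE
TOKEN = re.compile(r"\s*(AND|[A-Za-z_]\w*|[<>=]+|'[^']*'|\d+)")

def parse_filter(text):
    """Parse "Name = 'Bob' AND Age > 30" into a list of (field, op, value)."""
    tokens = TOKEN.findall(text)
    clauses, i = [], 0
    while i < len(tokens):
        field, op, value = tokens[i:i + 3]
        clauses.append((field, op, value.strip("'")))
        i += 3
        if i < len(tokens) and tokens[i] == "AND":
            i += 1
    return clauses

print(parse_filter("Name = 'Bob' AND Age > 30"))
# [('Name', '=', 'Bob'), ('Age', '>', '30')]
```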

Pretty pedestrian example, but I'm pretty pedestrian in my experience with them.

codekaizen
+2  A: 

Yes, I've used them. I'm a big fan of ANTLR. I give some tips and tricks on using ANTLR here and a brief endorsement of it here. It's possible to hand-write your own parser using ad hoc methods, but it's a lot harder, and it will take a lot longer to work out how to make changes when you need to grow the language your parser is supposed to parse.

Glenn
Agreed - I never want to write one by hand again. ANTLR has a bit of a learning curve on how to integrate it into your language, though.
codekaizen
A: 

I have not used one of the big tools for lexical analysis yet; I have, however, written my own lexer by hand for a project I worked on. We had to parse data that came back from a Near Space project's data computer, which was written to an SD card in binary. I had to pull the bits apart, convert them from binary to decimal, and then write the entire contents out as a comma-separated file.

It is a lot of fun to sit and think through it logically and write a state machine for the task at hand!
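The general shape of that job can be sketched in Python with the `struct` module (the record layout below is entirely invented; the real telemetry format isn't given here):

```python
import csv
import io
import struct

# Hypothetical fixed-width record: uint16 sensor id, int32 reading (big-endian).
RECORD = struct.Struct(">Hi")

def binary_to_csv(raw, out):
    """Walk the byte stream record by record and emit comma-separated rows."""
    writer = csv.writer(out)
    writer.writerow(["sensor", "reading"])
    for offset in range(0, len(raw), RECORD.size):
        sensor, reading = RECORD.unpack_from(raw, offset)
        writer.writerow([sensor, reading])

buf = io.StringIO()
binary_to_csv(RECORD.pack(1, -42) + RECORD.pack(2, 1000), buf)
print(buf.getvalue())
```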

X-Istence
A: 

Any place you handle text input ends up using some kind of lexer/parser, although sometimes it is the degenerate case (lex anything but a comma as one token type and a comma as another; parse a number, a name, a number, and an end of line--that sort of thing). In one way of looking at it, sscanf could be considered the most degenerate case of a lexer/parser generator.
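That degenerate case is easy to make concrete; in Python terms (standing in for C's sscanf):

```python
def parse_line(line):
    """Degenerate lexer/parser: commas are one token type, everything else
    another, and the whole 'grammar' is number, name, number."""
    fields = [f.strip() for f in line.split(",")]   # the lexer
    num1, name, num2 = fields                        # the parser
    return int(num1), name, int(num2)

print(parse_line("42, widget, 7"))   # (42, 'widget', 7)
```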

As for a full-blown lex/yacc operation? I expect that gets used mostly for GPLs (general-purpose languages) and for things that fall under the loose definition of DSLs.

BCS
+1  A: 

If you want to process a computer language, you need lexers and parsers as a starting place. They aren't enough; you have to do something with the parser's result.

A really spectacular use of lexing and parsing that we did was translating JOVIAL, a 1960s language, into C for the B-2 stealth bomber. See http://www.semdesigns.com/Products/Services/NorthropGrummanB2.html

Ira Baxter
A: 

Yes! The team I work with has implemented a document generation framework which, among other things, allows (mostly arithmetic) expressions to be evaluated. We're using a parser to extract expressions from the inputs/definitions for the generated documents and to build expression trees for them. Afterwards those trees are evaluated and the results are written to the final document.
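As an illustration of that last step--evaluating an expression tree--here is a minimal Python sketch (the tuple-based node shape is an assumption, not the framework's actual representation):

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(node):
    """Evaluate a tree of (op, left, right) tuples; leaves are plain numbers."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left), evaluate(right))
    return node

# (2 + 3) * 4 as a tree:
tree = ("*", ("+", 2, 3), 4)
print(evaluate(tree))   # 20
```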

andyp