views:

491

answers:

8
+11  Q: 

What is parsing?

Parsing is something I come across a lot in development, but as a junior it's one of those things I assume I will get the hang of at some point, when it's needed. In my current project I've been told to find and use an HTML parser for a certain function. I have found a couple on the web, but what does an HTML parser actually do? And what does it mean to parse an object?

+21  A: 

Parsing usually applies to text - the act of reading text and converting it into a more useful in-memory format, "understanding" what it means to some extent. So for example, an XML parser will take the sequence of characters (or bytes) and convert them into elements, attributes etc.
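To make the XML case concrete, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`. The sample document and its element names are made up for illustration; the point is only that a string of characters goes in and a tree of elements and attributes comes out.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a short XML document held as a plain string.
xml_text = '<book id="42"><title>Parsing 101</title><author>A. Writer</author></book>'

# fromstring() parses the characters and returns the root element of the tree.
root = ET.fromstring(xml_text)

print(root.tag)                 # name of the root element
print(root.attrib["id"])        # value of the "id" attribute
print(root.find("title").text)  # text inside the <title> child element
```

Once the text is parsed you work with elements and attributes rather than with raw characters.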

In some cases (particularly compilers) there's a separation between lexical analysis and syntactic analysis, so the real "understanding" part of the parser works on a sequence of tokens (identifiers, operators etc) rather than on the raw characters.
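As a rough illustration of that separation, the sketch below splits a tiny arithmetic language into a lexer and a parser. The grammar, the token names (`NUM`, `OP`) and the nested-tuple tree shape are assumptions chosen for this example, not how any particular compiler does it.

```python
import re

# Lexical analysis: raw characters -> tokens.
TOKEN_RE = re.compile(r"(?P<NUM>[0-9]+)|(?P<OP>[-+*/()])|(?P<BAD>\S)")

def tokenize(text):
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind, lexeme = match.lastgroup, match.group()
        if kind == "BAD":
            raise SyntaxError(f"unexpected character {lexeme!r}")
        tokens.append((kind, int(lexeme) if kind == "NUM" else lexeme))
    return tokens

# Syntactic analysis: tokens -> tree (recursive descent).
def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", None)

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():    # expr := term (("+" | "-") term)*
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():    # term := factor (("*" | "/") factor)*
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():  # factor := NUM | "(" expr ")"
        kind, value = advance()
        if kind == "NUM":
            return value
        if (kind, value) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError("unexpected trailing input")
    return tree

tokens = tokenize("1 + 2 * (3 - 4)")
tree = parse(tokens)
print(tokens)
print(tree)
```

Note that `parse` never looks at individual characters: it only sees the tokens that `tokenize` produced, which is the separation described above.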

Jon Skeet
+1 Textbook answer. You should write a book! :p
Mike
He has already written a book. C# in depth
rahul
@Mike - he already did
RobV
Jon Skeet doesn't write books, they just form around him
TWith2Sugars
Thanks adamantium and RobV. Here's a picture for you :) http://tinyurl.com/65j4f5
Mike
`Jon Skeet doesn't write books, they just form around him` lolz, JS facts :D
Rakesh Juyal
+4  A: 

You can start here: http://en.wikipedia.org/wiki/Parsing

Konamiman
+3  A: 

I think this Wikipedia article is a good starting point.

KB22
+1  A: 

It is the process of identifying the tokens (tags, attributes) inside an HTML document.

rahul
+3  A: 

Parsing is taking a set of data and extracting the meaningful information from it. With HTML parsing, you're looking to read some HTML and return a structured set of tags and text.
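A small sketch of that idea, using Python's standard-library `html.parser.HTMLParser`. The `OutlineParser` class name, the sample markup and the shape of the `events` list are assumptions made for this example; a real project would more likely use a full HTML parsing library.

```python
from html.parser import HTMLParser

class OutlineParser(HTMLParser):
    """Record the tags and text found in a piece of HTML, in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip runs that contain only whitespace.
        if data.strip():
            self.events.append(("text", data.strip()))

parser = OutlineParser()
parser.feed('<p class="intro">Hello <b>world</b></p>')
parser.close()

for event in parser.events:
    print(event)
```

The input is a flat string; the output is a structured sequence of tags, attributes and text that the rest of the program can work with.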

adam
+1  A: 

In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens (for example, words), to determine its grammatical structure with respect to a given (more or less) formal grammar.

:0)

Wikipedia

Mongus Pong
+2  A: 

Parse (computers), by Dictionary.com:

To analyze (a string of characters) in order to associate groups of characters with the syntactic units of the underlying grammar.

Igor Oks
+1  A: 

Don't attempt to write anything but a trivial parser yourself. There are good tools for this; ANTLR and bison are two I can think of.

If you use the tools you'll be able to ask for help when you hit a problem.

cheers, Martin.

martsbradley