parsing

How to rewrite a stream of HTML tokens into a new document?

Suppose I have an HTML document that I have tokenized, how could I transform it into a new document or apply some other transformations? For example, suppose I have this HTML: <html> <body> <p><a href="/foo">text</a></p> <p>Hello <span class="green">world</span></p> </body> </html> What I have currently written is a tokenizer t...
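One possible shape for this, sketched in Python: treat each transformation as a generator that consumes tokens and yields tokens, then serialize at the end. The excerpt does not show the asker's tokenizer, so the token tuples used here (`("start", name, attrs)`, `("text", data)`, `("end", name)`) and the names `rewrite_links` and `serialize` are assumptions made for illustration.

```python
from html import escape

def rewrite_links(tokens, prefix):
    """Yield tokens unchanged, except that root-relative hrefs on <a> gain a prefix."""
    for token in tokens:
        if token[0] == "start" and token[1] == "a":
            attrs = [
                (name, prefix + value if name == "href" and value.startswith("/") else value)
                for name, value in token[2]
            ]
            yield ("start", "a", attrs)
        else:
            yield token

def serialize(tokens):
    """Turn a token stream back into an HTML string."""
    parts = []
    for token in tokens:
        kind = token[0]
        if kind == "start":
            attrs = "".join(f' {name}="{escape(value)}"' for name, value in token[2])
            parts.append(f"<{token[1]}{attrs}>")
        elif kind == "end":
            parts.append(f"</{token[1]}>")
        else:  # "text" token
            parts.append(escape(token[1], quote=False))
    return "".join(parts)

# Hypothetical token stream for part of the HTML in the question.
tokens = [
    ("start", "p", []),
    ("start", "a", [("href", "/foo")]),
    ("text", "text"),
    ("end", "a"),
    ("end", "p"),
]
print(serialize(rewrite_links(tokens, "https://example.com")))
# <p><a href="https://example.com/foo">text</a></p>
```

Because each stage is a generator, several transformations can be chained without building an intermediate document.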

How to get files from 1 to n in a particular pattern?

Suppose you have files like: NewFile.part01.zip NewFile.part02.zip NewFile.part04.zip NewFile.part06.zip NewFile.part07.zip How do you match the files against this pattern so that you get only a SINGLE entry called "NewFile", and also get the missing parts as integers, in this case (3, 5)? Right now I am checking files one by one and if the name only...
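A minimal sketch in Python, assuming the names always follow `<base>.partNN.zip` as in the listing. The regular expression and the function name `summarize_parts` are illustrative choices, not something taken from the question.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <base>.partNN.zip
PART_RE = re.compile(r"^(?P<base>.+)\.part(?P<num>\d+)\.zip$")

def summarize_parts(filenames):
    """Map each base name to the sorted part numbers missing from 1..highest seen."""
    seen = defaultdict(set)
    for name in filenames:
        match = PART_RE.match(name)
        if match:  # names that do not follow the scheme are ignored
            seen[match.group("base")].add(int(match.group("num")))
    return {
        base: [n for n in range(1, max(nums) + 1) if n not in nums]
        for base, nums in seen.items()
    }

files = [
    "NewFile.part01.zip",
    "NewFile.part02.zip",
    "NewFile.part04.zip",
    "NewFile.part06.zip",
    "NewFile.part07.zip",
]
print(summarize_parts(files))  # {'NewFile': [3, 5]}
```

Note that n is taken to be the highest part number present, so a missing final part cannot be detected this way unless n is known from elsewhere.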

How do you construct a parse tree from a stream of tokens?

Does anyone know what is the general method for turning a stream (or a list) of tokens into a parse tree? I am dying to know this. I haven't programmed stuff like this and would love to learn it! Thanks, Boda Cydo. ...
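One common general method is recursive descent: write one function per grammar rule, and have each function consume tokens and return a tree node. The sketch below is in Python and assumes a tiny illustrative grammar (numbers, `+`, and parentheses) that is not part of the question.

```python
def parse(tokens):
    """Build a parse tree (nested tuples) from a list of token strings."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def parse_expr():
        # expr := term ('+' term)*
        node = parse_term()
        while peek() == "+":
            advance()
            node = ("+", node, parse_term())
        return node

    def parse_term():
        # term := NUMBER | '(' expr ')'
        token = advance()
        if token == "(":
            node = parse_expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return ("num", int(token))
        raise SyntaxError(f"unexpected token {token!r}")

    tree = parse_expr()
    if peek() is not None:
        raise SyntaxError(f"trailing token {peek()!r}")
    return tree

print(parse(["1", "+", "(", "2", "+", "3", ")"]))
# ('+', ('num', 1), ('+', ('num', 2), ('num', 3)))
```

The nesting of the returned tuples mirrors the nesting of the rule functions that were called, which is what makes the result a parse tree.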

PHP DOMParser Help

As recommended in a previous question of mine, I am using the PHP Simple HTML DOM Parser as a way to search an HTML document and grab contents from a specific element (in my case, a textarea). I was wondering if anyone had used this before and was able to give me any advice for my problem (or recommend another solution). The code I'm usi...

XSLT parse escaped HTML stored in an attribute and convert that attribute's content into element's content

Hello Guys, I'm stuck on what I think should be a simple thing to do. I've been looking around, but didn't find the solution. Hope you will help me. What I have is an XML element with an attribute that contains escaped HTML elements: <Booking> <BookingComments Type="RAM" comment="RAM name fred&lt;br/&gt;Tel 09876554&lt;br/&gt;Emai...

Parsing mathematical functions of custom types

I'm about to start developing a sub-component of an application to evaluate math functions with operands of C++ objects. This will be accessed via a user interface to provide drag and drop, feedback of appropriate types followed by an execute button. I'm quite interested in using flex and bison for this having looked at equation parsing...

What is the best way to parse C++ class architecture in Python?

I want to parse C++ sources in Python, namely information about the structure of the classes. I need to construct some object in Python that should contain information about a C++ class. Something like this: [namespace] function name [: parent] [constructor([parameters])] [destructor([parameters])] public methods([parameters]) private...

Is it possible to factor out bin_op nonterminal in grammar specification?

Inconvenience in specifying grammars: we cannot factor out bin_op in the following example (Bison): expr : expr bin_op expr ; bin_op : Add | Mul ; because of shift/reduce conflicts. Is there a parsing technique or parser generator that allows this? ...

Python+parsing custom config file

I have quite a big custom-made config file I need to extract data from once a week. This is an "in house" config file which doesn't comply with any known standard like INI or such. My quick and dirty approach was to use re to search for the section header I want and then extract the one or 2 lines of information under this header that I wa...

Library to parse C/C++ source code

Which library should I use to parse a C/C++ source code file? I need to parse a source file and count how many useful strings, 'for' blocks, 'if' blocks, and comments it contains. I may also need to insert a comment or a small piece of code after each 'for' block. Are there any libraries for this? Maybe a library included in Micros...

Can VTD-XML take a String as an input?

Hey, I'm trying to use VTD-XML to parse XML given to it as a String, but I can't find how to do it. Any help would be appreciated. http://vtd-xml.sourceforge.net ...

MIME parser for iPhone app

Hi, Can anyone recommend a MIME parser for iPhone? My app should receive messages in this format and parse the text and attachments. I found some open source libraries like libvmime, but I ran into compile problems on Mac/iPhone. Any ideas are welcome! Thanks ...

XSLT: Handling numeric values that use exponent notation

We have to transform some XML that contains numbers in exponent (aka scientific) notation, e.g. <Value>12.34e12</Value> <Value>-12.34e-12</Value> Rather irritatingly, we cannot use the sum() function and the like because the XSLT parser expects numbers to be in decimal format. [We are using the .NET XslCompiledTransform class...

How should I go about building a simple LR parser?

I am trying to build a simple LR parser for a type of template (configuration) file that will be used to generate some other files. I've read and read about LR parsers, but I just can't seem to understand it! I understand that there is a parse stack, a state stack and a parsing table. Tokens are read onto the parse stack, and when a rule...
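To make the shift/reduce loop concrete, here is a minimal Python sketch of a table-driven LR parser. The grammar (`E -> E + n | n`) and the hand-built table are assumptions chosen for illustration; they are not the asker's template format. Only the state stack is kept here. A value or parse stack would be pushed and popped in the same places.

```python
# Illustrative grammar:
#   rule 1: E -> E + n
#   rule 2: E -> n
# Hand-built SLR(1) table for that grammar.
ACTION = {
    (0, "n"): ("shift", 2),
    (1, "+"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "+"): ("reduce", 2),
    (2, "$"): ("reduce", 2),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1),
    (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 1}
RULES = {1: ("E", 3), 2: ("E", 1)}  # rule number -> (left-hand side, right-hand side length)

def lr_parse(tokens):
    """Return the rule numbers in the order they were reduced, or raise SyntaxError."""
    stream = list(tokens) + ["$"]  # "$" marks end of input
    states = [0]
    reductions = []
    i = 0
    while True:
        key = (states[-1], stream[i])
        if key not in ACTION:
            raise SyntaxError(f"unexpected {stream[i]!r} in state {states[-1]}")
        kind, arg = ACTION[key]
        if kind == "shift":
            states.append(arg)  # consume the token and enter the new state
            i += 1
        elif kind == "reduce":
            lhs, rhs_len = RULES[arg]
            del states[-rhs_len:]  # pop one state per right-hand-side symbol
            states.append(GOTO[(states[-1], lhs)])
            reductions.append(arg)
        else:  # accept
            return reductions

print(lr_parse(["n", "+", "n"]))  # [2, 1]
```

A reduce does not consume input: it pops as many states as the rule has right-hand-side symbols, then uses the GOTO table on the state that is uncovered.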

Creating FIRST and FOLLOW sets for all non-terminals

If somebody could help me with the rules of FIRST and FOLLOW sets, that would be awesome. The question is to calculate the FOLLOW sets for all of the non-terminals in the following grammar: S ::= S b T a E ¦ a T b ¦ c T a c R ::= E T ¦ a E T ::= a c E ¦ epsilon E ::= R ¦ T a d ¦ epsilon I have read the rules of crea...
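The usual rules can be applied as a fixed-point iteration: keep re-applying them until no set grows. The Python sketch below does that. Because the grammar in the excerpt is flattened onto one line, the rule boundaries in `GRAMMAR` are one reading of it and should be treated as an assumption, as should the choice of S as the start symbol.

```python
EPS = "epsilon"
END = "$"

# Assumed reading of the flattened grammar; [] stands for the epsilon production.
GRAMMAR = {
    "S": [["S", "b", "T", "a", "E"], ["a", "T", "b"], ["c", "T", "a", "c"]],
    "R": [["E", "T"], ["a", "E"]],
    "T": [["a", "c", "E"], []],
    "E": [["R"], ["T", "a", "d"], []],
}

def first_of_sequence(symbols, first):
    """FIRST of a string of symbols, given the FIRST sets of the non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # a terminal is its own FIRST
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol was nullable, or the string was empty
    return result

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                new = first_of_sequence(prod, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for prod in productions:
                for i, sym in enumerate(prod):
                    if sym not in grammar:
                        continue  # only non-terminals have FOLLOW sets
                    rest = first_of_sequence(prod[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:  # the rest can vanish, so FOLLOW(nt) applies too
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

first = compute_first(GRAMMAR)
follow = compute_follow(GRAMMAR, "S", first)
for nt in GRAMMAR:
    print(nt, sorted(follow[nt]))
```

Under that reading of the grammar, the sketch gives FOLLOW(S) = {$, b} and FOLLOW(R) = FOLLOW(T) = FOLLOW(E) = {$, a, b}.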

Arithmetical operations with void* pointers to numerical data.

Hi, I am working on a small parser and "equation solver" in C. Part of this process is to do arithmetical operations on tokens. Each token contains a void* pointer to numerical data and an enum that defines the type of the data. This is an example of a function that creates a new token by adding two other tokens. In order to do so, I need to che...

Using C++ types in an ANTLR-generated C parser

I'm trying to use an ANTLR v3.2-generated parser in a C++ project using C as the output language. The generated parser can, in theory, be compiled as C++, but I'm having trouble dealing with C++ types inside parser actions. Here's a C++ header file defining a few types I'd like to use in the parser: /* expr.h */ enum Kind { PLUS, MI...

Haskell: syntax error when adding new line in pattern matching

Basically I'm modifying a parser to handle additional operators. Before my changes, one part of the parser looked like this: parseExpRec e1 (op : ts) = let (e2, ts') = parsePrimExp ts in case op of T_Plus -> parseExpRec (BinOpApp Plus e1 e2) ts' T_Minus -> parseExpRec (BinOpApp Minus e1 e2) ts' T_Times -...

Operator precedence and associativity in a parser (Haskell)

I am trying to extend a recursive-descent parser to handle new operators and make them associate correctly. Originally there were only four operators (+ - / *) and they all had the same precedence. The function I am looking at is the parseExpRec function: parseExpRec :: Exp -> [Token] -> (Exp, [Token]) parseExpRec e [...
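One technique that fits a recursive-descent parser is precedence climbing: each operator gets a precedence and an associativity, and the recursive call for the right operand is given a minimum precedence it may accept. The sketch below is in Python rather than Haskell, and it is not the asker's `parseExpRec`. The operator table is hypothetical, including the right-associative `^`, which the question does not mention.

```python
# Hypothetical operator table: (precedence, associativity). Higher binds tighter.
OPERATORS = {
    "+": (1, "left"),
    "-": (1, "left"),
    "*": (2, "left"),
    "/": (2, "left"),
    "^": (3, "right"),
}

def parse_expression(tokens, min_prec=1):
    """Parse tokens (a list consumed from the front) into a nested tuple tree."""
    left = parse_primary(tokens)
    while tokens and tokens[0] in OPERATORS and OPERATORS[tokens[0]][0] >= min_prec:
        op = tokens.pop(0)
        prec, assoc = OPERATORS[op]
        # Left-associative: the right operand may only contain tighter operators.
        # Right-associative: it may also contain operators of the same precedence.
        next_min = prec + 1 if assoc == "left" else prec
        right = parse_expression(tokens, next_min)
        left = (op, left, right)
    return left

def parse_primary(tokens):
    """Parse a number or a parenthesized expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        node = parse_expression(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return node
    if token.isdigit():
        return int(token)
    raise SyntaxError(f"unexpected token {token!r}")

print(parse_expression("1 - 2 - 3".split()))  # ('-', ('-', 1, 2), 3)
print(parse_expression("2 ^ 3 ^ 2".split()))  # ('^', 2, ('^', 3, 2))
print(parse_expression("1 + 2 * 3".split()))  # ('+', 1, ('*', 2, 3))
```

Adding an operator then means adding one row to the table instead of another case in the parsing function.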

Parse a log4j log file

We have several applications that use log4j for logging. I need to get a log4j parser working so we can combine multiple log files and run automated analysis on them. I'm not looking to reinvent the wheel, so can someone point me to a decent pre-existing parser? I do have the log4j conversion pattern if that helps. If not, I'll have to ...