parsing

PHP - complete URL parser help

I have been trying to find an effective URL parser; PHP's own does not return the subdomain or extension. On php.net, a number of users contributed this: function parseUrl($url) { $r = "^(?:(?P<scheme>\w+)://)?"; $r .= "(?:(?P<login>\w+):(?P<pass>\w+)@)?"; $r .= "(?P<host>(?:(?P<subdomain>[-\w\.]+)\.)?" . "(?P<doma...
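For comparison, the same split can be sketched on top of a standard URL parser rather than one large regex. This is a minimal Python sketch, not the php.net function: the name `parse_url` and the sample URL are hypothetical, the URL is assumed to include a scheme, and the host split naively assumes a single-label extension (so it would mislabel hosts ending in something like `co.uk`).

```python
from urllib.parse import urlsplit

def parse_url(url: str) -> dict:
    """Split a URL into parts, including a naive subdomain/domain/extension split."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    # Naive assumption: the last label is the extension and the one before it
    # is the domain; everything in front of those is the subdomain.
    extension = labels[-1] if len(labels) > 1 else ""
    domain = labels[-2] if len(labels) > 1 else host
    subdomain = ".".join(labels[:-2])
    return {
        "scheme": parts.scheme,
        "login": parts.username,
        "pass": parts.password,
        "host": host,
        "subdomain": subdomain,
        "domain": domain,
        "extension": extension,
        "path": parts.path,
        "query": parts.query,
    }

# Hypothetical sample URL
print(parse_url("http://user:secret@www.blog.example.com/path?x=1"))
```

A correct split for multi-label suffixes needs a list of public suffixes, which this sketch deliberately leaves out.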

How to extract data from a raw HTML file

Is there a way to extract specific data from raw HTML that was written unsemantically, with no IDs or classes? Suppose there is a saved HTML file of a web page (a profile) and I want to extract data such as 'hobbies'. Is it possible to do this using PHP? ...

Regex, writing a toy compiler, parsing, comment remover

Hi, I'm currently working my way through this book: http://www1.idc.ac.il/tecs/ I'm on a section where the exercise is to create a compiler for a very simple Java-like language. The book always states what is required but not the how (which is a good thing). I should also mention that it talks about yacc and lex and s...

Lexer written in Javascript?

I have a project where a user needs to define a set of instructions for a UI that is written entirely in JavaScript. I need to be able to parse a string of instructions and then translate them into instructions. Are there any libraries out there for parsing that are 100% JavaScript? Or a generator that will generate in javascri...

Parsing a binary file in C#

I have a binary file, which I stored in a byte array. The file size can be 20 MB or more. I then want to parse the file or find a particular value in it. I am doing it in 2 ways: 1. By converting the full file to a char array. 2. By converting the full file to a hex string (I also have hex values). What is the best way to parse the full file, or should I do in binar...
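One option for this kind of search is to scan the raw bytes directly instead of converting the whole file to characters or a hex string. Below is a minimal sketch in Python rather than C#, assuming the hex values identify a fixed byte sequence; the function name `find_all` and the sample pattern are hypothetical.

```python
def find_all(data: bytes, pattern: bytes) -> list:
    """Return the offset of every occurrence of pattern in data (overlaps included)."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        # Resume one byte past the last hit so overlapping matches are found too.
        start = data.find(pattern, start + 1)
    return offsets

# Hypothetical sample: the hex string is converted to raw bytes once,
# and the file contents are never converted at all.
needle = bytes.fromhex("DEADBEEF")
haystack = b"\x00\x01" + needle + b"\x02" + needle
print(find_all(haystack, needle))  # [2, 7]
```

Converting only the short search pattern keeps memory use close to the size of the file itself, which matters at 20 MB or more.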

Best JSON parser for Qt?

I'm using Qt for Symbian and need a simple JSON parser. I need to be able to go from JSON to Qt-variant and the other way around. Is there a simple JSON parser that I can use? I don't want to write my own. What is the best way to go? Thanks! ...

How to parse a URI like this in Java

I'm trying to parse the following URI: http://translate.google.com/#zh-CN|en|你 but got this error message: java.net.URISyntaxException: Illegal character in fragment at index 34: http://translate.google.com/#zh-CN|en|你 at java.net.URI$Parser.fail(URI.java:2809) at java.net.URI$Parser.checkChars(URI.java:2982) ...
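Index 34 in that message is the `|` character in the fragment, which a strict URI parser rejects when it is not percent-encoded. One possible workaround is to percent-encode the fragment before parsing. This is a minimal sketch in Python rather than Java, assuming UTF-8 encoding is acceptable to the target site; the helper name `encode_fragment` is hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def encode_fragment(url: str) -> str:
    """Percent-encode the fragment of url, leaving the other parts untouched."""
    parts = urlsplit(url)
    # safe="" encodes every reserved character, including "|";
    # non-ASCII text is encoded as UTF-8 bytes.
    fragment = quote(parts.fragment, safe="")
    return urlunsplit(parts._replace(fragment=fragment))

print(encode_fragment("http://translate.google.com/#zh-CN|en|你"))
# http://translate.google.com/#zh-CN%7Cen%7C%E4%BD%A0
```

Whether the receiving page interprets the encoded fragment the same way as the raw one is a separate question that this sketch does not settle.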

Program capable of opening a large XML file in Windows

Hiya all, I need to parse and process an XML feed. Unfortunately the feed is about 110 MB in size (and I cannot do anything about it), but to be able to parse it I need to see the structure (or if anyone has any other ideas, I'd love to hear them). For some reason I've been unable to open the file using EditPlus. I'm on a 64-bit Vista Ma...

LALR(2) dangling else

Hello, is LALR(2) able to handle the dangling else case naturally (without any special rules, such as those needed with LALR(1))? Thanks ...

Automatic Documentation - Best method for creating a quick parser

I have a large script that end users need to edit, so it requires somewhat redundant commenting. I use a layout for my files similar to this: //******************** // // FileName // This script contains: // - Function X - does something // - Function Y - does something else // //******************** //******************** // Fu...

What is the copy constructor bug causing parsing errors?

I'm writing a compiler for a small language, and my Parser class is currently in charge of building an AST for use later. However, recursive expressions are not working correctly because the vector in each AST node that holds child nodes is not working correctly. Currently my AST's header file looks like this: class AST { public: e...

PHP SoapClient call response missing parts of answer

I am having trouble with PHP parsing of a SoapClient call's response. For some types of answers, it returns arrays of empty stdClass objects instead of initialized stdClass objects. The server is a Java web service deployed with Axis2 on Tomcat 6. The Java signature of the problematic service call is public Course getCourseDetails(...

Is C++ code generation in ANTLR 3.2 ready?

Hi, I was trying hard to make ANTLR 3.2 generate a parser/lexer in C++. It was fruitless. Things went well with Java and C, though. I was using this tutorial to get started: http://www.ibm.com/developerworks/aix/library/au-c%5Fplusplus%5Fantlr/index.html When I checked the *.stg files, I found that: CPP has only ./tool/src/main/resources/...

How to use XMLReader in PHP?

I have the following XML file. The file is rather large and I haven't been able to get SimpleXML to open and read it, so I'm trying XMLReader in PHP, with no success. <?xml version="1.0" encoding="ISO-8859-1"?> <products> <last_updated>2009-11-30 13:52:40</last_updated> <product> <element_1>foo</element_1> <element_...
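The general idea behind a streaming reader is to handle one `<product>` at a time instead of loading the whole document. Below is a minimal sketch of that pattern in Python rather than PHP, using the standard library's `iterparse`. The function name `iter_products` is hypothetical, and the sample data is an assumption shaped like the excerpt above, not the real file.

```python
import io
import xml.etree.ElementTree as ET

def iter_products(source):
    """Yield one dict per <product> element, releasing its contents after use."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "product":
            yield {child.tag: child.text for child in elem}
            # Drop the element's children and text; the emptied element itself
            # stays attached to the root, which is small by comparison.
            elem.clear()

# Hypothetical sample shaped like the excerpt
sample = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<products>
  <last_updated>2009-11-30 13:52:40</last_updated>
  <product><element_1>foo</element_1></product>
  <product><element_1>bar</element_1></product>
</products>"""

for product in iter_products(io.BytesIO(sample)):
    print(product)
```

For a real file, `source` would be a path or an open binary file object instead of the in-memory buffer.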

Grammatica Parsing Error, Wrong Expected Encoding?

I'm trying to use Grammatica to generate a C# parser for a language I'm attempting to build (hobby project). However, every time I run the Grammatica parser, I get an error at line 1 position 1, "unexpected character 'x'", where x is some strange ASCII character (looks kind of like an 'n'). The Grammatica output shows 3 such strange charac...

Cocoa: Parse NSString by character length

I have an NSString I'm working with, and I would like to parse it by character length: break it apart into an NSArray, with each object in the array being x characters from that string. Basically, break up the string into substrings of a certain length. How do I do it? Example: NSString *string = @"Here is my string" NS...
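The operation itself is a fixed-stride slice over the string. Here is a minimal sketch in Python rather than Objective-C; the function name `split_by_length` is hypothetical, and the last chunk is assumed to be allowed to come up short.

```python
def split_by_length(text: str, size: int) -> list:
    """Split text into consecutive chunks of at most size characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the string in strides of size; the final slice may be shorter.
    return [text[i:i + size] for i in range(0, len(text), size)]

print(split_by_length("Here is my string", 4))
# ['Here', ' is ', 'my s', 'trin', 'g']
```

Python slices count code points, whereas NSString lengths count UTF-16 units, so a direct port would need care with characters outside the Basic Multilingual Plane.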

How can I learn about compiler theory - online/free resources

I'm interested in learning, in depth, about compiler theory. Parsing, EBNF, LALR: these are all terms I'm familiar with, but I don't really understand how to actually implement or use them. I'm looking for links, tutorials, and other resources; all online, preferably all free. I'm more interested in simple / complete implementations, than comp...

How to copy only the body of a PDF file using iTextSharp or PDFsharp in C#

I am still waiting for a reply from you all. I really need your support, please. I am developing a project called pdf recovery. In it, I want to copy just the body of a file into another PDF file using the iTextSharp C# library. I want to parse the entire file as soon as I get the body, i.e. " 1 0 obj<<....>>endobj " according to the PDF reference. t...

External entity in XML causing null pointer exception during DocumentBuilder.parse("file");

Hello all, I was trying to parse an XML document using a DOM parser. I got a null pointer exception while executing doc = builder.parse(xmlDataFile); There were a few entities in the XML data file. On removing a particular entity, I was able to parse the file successfully. The entity was something like this: <!ENTITY SAMPLE.TIF SYSTEM "S...

In yacc, how to put a definition into a variable?

In my yacc file I have the following code: fun_declaration : type_specifier ID '(' params ')' {$2->type = "function"; $2->args = params; } params : param_list | VOID ; Do you see what I'm trying to do? args is a string. I'm trying to put the function parameters into this string. How to do that...