parsing

How to modify XML on Objective-C?

Hi, I'm working on a project for the iPad. I need to read and write an XML file that is also used by the counterpart of the application on Windows. The problem is that I've been looking around but haven't found a way to modify an element or attribute in an XML file without having to build the whole XML again. I saw thi...

Where can I find a file/struct layout for a tcpdump() file?

We are capturing packets to a file using tcpdump(). I need to write a program to parse it. Does anyone know where I could find a file layout for a dump file created by this tool? ...
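A minimal sketch of reading such a capture, assuming the file is in the classic libpcap savefile format (what `tcpdump -w` writes by default) and not pcapng. In that layout a 24-byte global header is followed by packet records, each with a 16-byte record header. The names `PcapGlobalHeader`, `parse_pcap_global_header`, and `iter_packets` are made up for this illustration.

```python
import struct
from typing import Iterator, NamedTuple, Tuple


class PcapGlobalHeader(NamedTuple):
    byte_order: str      # "<" or ">", as understood by the struct module
    version_major: int
    version_minor: int
    thiszone: int        # GMT-to-local correction, normally 0
    sigfigs: int         # timestamp accuracy, normally 0
    snaplen: int         # maximum captured length per packet
    linktype: int        # link-layer type, e.g. 1 for Ethernet


def parse_pcap_global_header(data: bytes) -> PcapGlobalHeader:
    """Decode the 24-byte global header of a classic pcap file."""
    if len(data) < 24:
        raise ValueError("need at least 24 bytes for the global header")
    magic = data[:4]
    # The magic number 0xa1b2c3d4 reveals the byte order of the writer.
    if magic == b"\xd4\xc3\xb2\xa1":
        order = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        order = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    fields = struct.unpack(order + "HHiIII", data[4:24])
    return PcapGlobalHeader(order, *fields)


def iter_packets(data: bytes) -> Iterator[Tuple[int, int, int, bytes]]:
    """Yield (ts_sec, ts_usec, original_length, captured_bytes) per record."""
    header = parse_pcap_global_header(data)
    offset = 24
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, orig_len = struct.unpack_from(
            header.byte_order + "IIII", data, offset)
        offset += 16
        yield ts_sec, ts_usec, orig_len, data[offset:offset + incl_len]
        offset += incl_len
```

The captured bytes of each record still need to be decoded according to `linktype`; that part is not covered here.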

Extract known pattern substring from NSString (without regex)

I'm really tempted to drop RegexKit (or my own libpcre wrapper) into my project in order to do this, but before I do that I want to know how Cocoa developers manage to do half of this basic stuff without really convoluted code or without linking with RegexKit or another regular expression library. I find it gobsmacking that Cocoa does n...

Python/YACC Lexer: Token priority?

I'm trying to use reserved words in my grammar: reserved = { 'if' : 'IF', 'then' : 'THEN', 'else' : 'ELSE', 'while' : 'WHILE', } tokens = [ 'DEPT_CODE', 'COURSE_NUMBER', 'OR_CONJ', 'ID', ] + list(reserved.values()) t_DEPT_CODE = r'[A-Z]{2,}' t_COURSE_NUMBER = r'[0-9]{4}' t_OR_CONJ = r'or' t_ignore = ' \t' def t_ID(t...

Fast, lightweight HTML parser for C++

I'm looking for a fast, lightweight open-source HTML parser -- something along the lines of a non-validating SAX parser (except, of course, for HTML). The answers to this question cover a parser that generates a DOM (don't want that), and these answers suggest conforming the HTML to XML before sending it to Xerces (can't do that in my c...

How can I create the XML::Simple data structure using a Perl XML SAX parser?

Summary: I am looking for a fast XML parser (most likely a wrapper around some standard SAX parser) which will produce per-record data structures 100% identical to those produced by XML::Simple. Details: We have a large code infrastructure which depends on processing records one-by-one and expects the record to be a data structure in a form...

Parsing Complex Text File with C#

Hello, I need to parse a text file that has a lot of levels and characters. I've been trying different ways to parse it but I haven't been able to get anything to work. I've included a sample of the text file I'm dealing with. Any suggestions on how I can parse this file? I have denoted the parts of the file I need with TEXTINEED. (...

parse content away from structure in a binary file

Using C#, I need to read a packed binary file created using FORTRAN. The file is stored in an "Unformatted Sequential" format as described here (about half-way down the page in the "Unformatted Sequential Files" section): http://www.tacc.utexas.edu/services/userguides/intel8/fc/f_ug1/pggfmsp.htm As you can see from the URL, the file i...

Detecting regular expression in content during parse

I am writing a simple parser for C. I was just running it with some other language files (for fun - to see the extent of C-likeness and laziness - don't wanna really write separate parsers for each language if I can avoid it). However the parser seems to break down for JavaScript if the code being parsed contains regular expressions.....

Are there any PHP code libraries with function(s) to read and parse information from CSV files?

I am looking for some routines that will read and parse CSV files. I have written some code to do this, but the data files I download are not always evenly formatted for data extraction. I generally have to clean up the file manually before I can run my parser. ...

How would I code a complex formula parser manually?

Hm, this is language-agnostic. I would prefer doing it in C# or F#, but I'm more interested this time in the question "how would that work anyway". What I want to accomplish is: a) I want to LEARN it - it's about my ego this time, it's for a fun project where I want to show myself that I'm really good at this stuff b) I know a ti...

Lack of IsNumeric function in C#

One thing that has bothered me about C# since its release is the lack of a generic IsNumeric function. I know it is difficult to create a one-stop solution to determine if a value is numeric. I have used the following solution in the past, but it is not the best practice because I am generating an exception to determine if the value ...

PHP string parsing

I am trying to parse a list of operating system instances with their unique identifiers. I am looking for a solution to parse a text string, and pass the values into two variables. The string to be parsed is as follows: "Ubuntu 9.10" {40f2324d-a6b2-44e4-90c3-0c5fa82c987d} ...

What libraries will parse a DTD using PHP

I need to parse DTDs using PHP and am hoping there's a simple library to help out. Each DTD has numerous <!ENTITY... and <!-- Comment... elements, which I need to act upon. Note that I do not need to validate anything against these DTDs, simply parse them as data files themselves. A few options I've looked at: James Clarke's SD, which...

Parsing complicated query parameters

My Python server receives jobs that contain a list of the items to act against, rather like a search query term; an example input: (Customer:24 OR Customer:25 OR (Group:NW NOT Customer:26)) So when a job is submitted, I have to parse this recipient pattern and resolve all those customers that match, and create the job with that input....
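One way to approach a pattern like this is a small hand-written recursive-descent parser that turns the text into a tree, which can then be resolved against the customer data. The sketch below is only an illustration under stated assumptions: the grammar (terms of the form `Key:Value`, the operators `OR`, `AND`, and `NOT` treated as left-associative binary operators with equal precedence, and parentheses for grouping) is guessed from the single example above, and the names `tokenize` and `parse` are hypothetical.

```python
import re

# One token per match: a parenthesis, an operator keyword, or a Key:Value term.
TOKEN_RE = re.compile(r"\s*(?:(\(|\))|(OR|AND|NOT)\b|(\w+):(\w+))")


def tokenize(text):
    """Split the pattern into tuples: ("(",), (")",), ("OP", name), ("TERM", key, value)."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at position {pos}")
        paren, op, key, value = m.groups()
        if paren:
            tokens.append((paren,))
        elif op:
            tokens.append(("OP", op))
        else:
            tokens.append(("TERM", key, value))
        pos = m.end()
    return tokens


def parse(text):
    """Return a nested-tuple tree such as ("OR", left, right)."""
    tokens = tokenize(text)
    node, pos = _parse_expr(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return node


def _parse_expr(tokens, pos):
    # expr := term (OP term)*   (assumed: left-associative, equal precedence)
    left, pos = _parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos][0] == "OP":
        op = tokens[pos][1]
        right, pos = _parse_term(tokens, pos + 1)
        left = (op, left, right)
    return left, pos


def _parse_term(tokens, pos):
    # term := "(" expr ")" | Key:Value
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok[0] == "(":
        node, pos = _parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos][0] != ")":
            raise ValueError("missing closing parenthesis")
        return node, pos + 1
    if tok[0] == "TERM":
        return tok, pos + 1
    raise ValueError(f"unexpected token {tok!r}")


tree = parse("(Customer:24 OR Customer:25 OR (Group:NW NOT Customer:26))")
```

Resolving the tree into actual customers (for example, mapping `OR` to set union and `NOT` to set difference) depends on the data model and is left out.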

Packrat parsing HTTP

Hello, could somebody give me a start on how to parse the HTTP protocol with Scala 2.8 packrat parsing? I need to parse the attached example HTTP response into ResponseStatusCode:Int Headers:List[(String,String)] Body: String, Array[Byte], CharBuffer or whatever. A short example usage of a Packrat-Parser would be very much appreciated. Thanks! ...

How to write a bison grammar for WDI?

I need some help in bison grammar construction. From another question of mine: I'm trying to make a meta-language for writing markup code (such as XML and HTML) which can be directly embedded into C/C++ code. Here is a simple sample written in this language, I call it WDI (Web Development Interface): /* * Simple wdi/html sample source cod...

Is it valid to have more than one question mark in a URL?

I came across the following URL today: http://www.sfgate.com/cgi-bin/blogs/inmarin/detail??blogid=122&entry_id=64497 Notice the doubled question mark at the beginning of the query string: ??blogid=122&entry_id=64497 My browser didn't seem to have any trouble with it, and running a quick bookmarklet: javascript:alert(document.l...
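For what it's worth, RFC 3986 allows a literal "?" inside the query component, so a generic parser splits at the first question mark and treats the second one as ordinary query data. A quick check with Python's standard `urllib.parse` (used here only as one example of a generic URL parser; how the server itself interprets the parameter is up to the server):

```python
from urllib.parse import parse_qs, urlsplit

url = "http://www.sfgate.com/cgi-bin/blogs/inmarin/detail??blogid=122&entry_id=64497"
parts = urlsplit(url)

# The split happens at the first "?"; the second one stays in the query.
print(parts.path)             # /cgi-bin/blogs/inmarin/detail
print(parts.query)            # ?blogid=122&entry_id=64497

# A naive key/value split therefore sees the first key as "?blogid".
params = parse_qs(parts.query)
print(params)                 # {'?blogid': ['122'], 'entry_id': ['64497']}
```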

Are there libraries or techniques for collecting and weighing keywords from a block of text?

I have a field in my database that can contain large blocks of text. I need to make this searchable but don't have the ability to use full text searching. Instead, on update, I want my business layer to process the block of text and extract keywords from it which I can save as searchable metadata. Ideally, these keywords could then be...

Extract information from javascript counter via PHP

Hi, I'm looking for a way to extract some information from this site via PHP: http://www.mycitydeal.co.uk/deals/london There is a counter where the time left is displayed, but the information is within the JavaScript. Since I'm really a JavaScript rookie, I didn't really know how to get the information. Normally I would extract the ...