parsing

Need help using json-framework on iPhone

So I've got json-framework up and running on my project, but need help figuring out how to use it to parse this json string: [ { "id":"0", "name":"name", "info":"This is info", "tags":[ { "id":"36", "tag":"test tag" }, { "id":"37", "tag":" tag 2" } ], "other":"nil" }, { "id":"1", "name":"name", "info":"This is info", "tags":[ { "id":"36...

Am I parsing this HTTP POST request properly?

Let me start off by saying, I'm using the twisted.web framework. Twisted.web's file uploading didn't work like I wanted it to (it only included the file data, and not any other information), cgi.parse_multipart doesn't work like I want it to (same thing, twisted.web uses this function), cgi.FieldStorage didn't work ('cause I'm getting th...

How can I use the python HTMLParser library to extract data from a specific div tag?

I am trying to get a value out of a HTML page using the python HTMLParser library. The value I want to get hold of is within this html element: ... <div id="remository">20</div> ... This is my HTMLParser class so far: class LinksParser(HTMLParser.HTMLParser): def __init__(self): HTMLParser.HTMLParser.__init__(self) self.see...
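
One possible approach, sketched here under assumptions: track whether the parser is currently inside the target div and collect text only while it is. This uses Python 3's `html.parser` module rather than the Python 2 `HTMLParser` module from the question, and the class name `DivTextParser` and its attributes are hypothetical, not the asker's code.

```python
from html.parser import HTMLParser


class DivTextParser(HTMLParser):
    """Collects the text found inside <div id="target_id">...</div>."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # > 0 while we are inside the target div
        self.chunks = []    # text fragments seen inside the target div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            # A nested div inside the target: remember it so that its
            # closing tag does not end collection too early.
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


page = '<html><body><div id="other">x</div><div id="remository">20</div></body></html>'
parser = DivTextParser("remository")
parser.feed(page)
value = "".join(parser.chunks)
```

The depth counter is there only so that nested divs do not confuse the end-tag handling; for a flat page a simple boolean flag would likely do.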

How to see progress when parsing large XML file with XML::Parser?

I'm using the following code to parse a rather large XML file (> 50GB): use XML::Parser; my $p = new XML::Parser( 'Handlers' => { 'Start' => \&handle_start, 'End' => \&handle_end, 'Char' => \&handle_char, } ); $p->parsefile( 'source.xml' ); ... sub handle_start { ... } The problem is tha...

How to make this PHP URL parsing function nearly perfect?

This function is great, but its main flaw is that it doesn't handle domains ending with .co.uk or .com.au. How can it be modified to handle this? function parseUrl($url) { $r = "^(?:(?P<scheme>\w+)://)?"; $r .= "(?:(?P<login>\w+):(?P<pass>\w+)@)?"; $r .= "(?P<host>(?:(?P<subdomain>[-\w\.]+)\.)?" . "(?P<domain>[-\w]+\.(?P<ex...

Groovy xml parse then rebuild

With more than a little help from daviderossi.blogspot.com I have managed to get some code working to replace an xml value with another def fm_xml = '''<?xml version="1.0" encoding="UTF-8"?> <MAlong> <Enquiry.ID>SC11147</Enquiry.ID> <student.name_middle></student.name_middle> <student.name_known></student.name_known> <student.name_previ...

Using Lambdas to build executable functions from string expressions

I'm using python, and I want a function that takes a string containing a mathematical expression of one variable (x) and returns a function that evaluates that expression using lambdas. Syntax should be such: f = f_of_x("sin(pi*x)/(1+x**2)") print f(0.5) 0.8 syntax should allow ( ) as well as [ ] and use standard operator precedence. ...
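
A minimal sketch of one way to get this behaviour, assuming it is acceptable to lean on Python's own expression parser through `compile` and `eval` instead of writing a parser by hand. The body of `f_of_x` is an illustration, not a vetted implementation, and `eval` should not be fed untrusted input even with builtins removed.

```python
import math


def f_of_x(expression):
    """Return a one-argument function that evaluates `expression` at x."""
    # Treat [ ] as an alternative spelling of ( ), as the question asks.
    expression = expression.replace("[", "(").replace("]", ")")
    # Expose the public names of the math module (sin, pi, ...) to the expression.
    names = {name: getattr(math, name)
             for name in dir(math) if not name.startswith("_")}
    # Compile once so that repeated calls do not re-parse the string.
    code = compile(expression, "<expression>", "eval")
    return lambda x: eval(code, {"__builtins__": {}}, dict(names, x=x))


f = f_of_x("sin(pi*x)/(1+x**2)")
result = f(0.5)
```

Operator precedence comes for free here because Python's grammar does the parsing; the trade-off is that anything Python accepts as an expression is accepted, not just arithmetic.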

How can you parse excel CSV data that contains linebreaks in the data?

I'm attempting to parse a set of CSV data using PHP, but having a major issue. One of the fields is a long description field, which itself contains linebreaks within the enclosures. My primary issue is writing a piece of code that can split the data line by line, but also recognize when linebreaks within the data should not be used. Th...
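
The usual way around this is to let a CSV-aware reader track the quoting state instead of splitting on newlines first. The question is about PHP; the sketch below uses Python's standard `csv` module purely to illustrate the idea, and the sample data is made up.

```python
import csv
import io

# Made-up sample: the second record has a line break inside a quoted field.
data = 'id,description\r\n1,"first line\nsecond line"\r\n2,"plain"\r\n'

# newline='' leaves line endings untouched so the csv module can decide
# which ones end a record and which ones belong to a quoted field.
rows = list(csv.reader(io.StringIO(data, newline="")))
```

Here `rows[1][1]` keeps the embedded line break, because the reader only ends a record on a newline that falls outside the quotes.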

fast auto-guessing of date strings

For a huge number of huge csv files (100M lines+) from different sources I need a fast snippet or library to auto-guess the date format and convert it to broken-down time or unix time-stamp. Once successfully guessed the snippet must be able to check subsequent occurrences of the date field for validity because it is likely that the dat...

Customize Rails Hash.from_xml with custom conversion

I want to convert my XML document to a Hash in Ruby/Rails. The default conversion by Hash.from_xml in Rails works for me except in one case. I have a list of items contained in an <item-list> element, but these items can be of different types, for instance standard-item and special-item, each of which has a different set of child...

Recovering error tokens in parsing (Lemon).

I'm using Lemon as a parser generator, its error handling is the same as yacc's and bison's if you don't know Lemon. Lemon has an option to define the error token in a set of rules in order to catch parsing errors. The default behavior of the generated parser is to destroy the token causing the error; is there any way to override this b...

I want to parse multiple strings into variables from a .txt file in java. What's the easiest way to do it?

What's the best way to do it? Should I use the File class and Scanner? I've never done it before and can't seem to find a solid guide for it online, so I figured I would ask here. Thanks! ...

Parse XML from String using XPath in Bada?

I have read the tutorial on XML parsing in Bada. But I don't want to use a file. I need to parse my XML from a Osp::Base::String. Any ideas which methods should I use? So far I have replaced xpathCtx = xmlXPathNewContext(doc); if(xpathCtx == NULL) { AppLog("Error: unable to create new XPath context"); xmlFreeDoc(doc); return(E...

How to tell Nokogiri when parsing a document not to convert it to a different encoding (in my case not to convert &pound; to anything else)

Question: how to tell Nokogiri when parsing a document not to convert it to a different encoding (in my case not to convert &pound; to anything else) I have a file with the following contents: <html> <head> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /> </head> <body> <span>&pound;</span> </body> </html> I p...

Ruby libxml parsing and inserting to database

Hello, I am currently trying to read from an XML file which records the jobs on a PBS. I have successfully managed to parse the code, but am unable to insert the objects into my database; I receive this error: "You have a nil object when you didn't expect it! You might have expected an instance of ActiveRecord::Base. The error occurred...

How to parse css (stylesheet) comments (annotations)?

Hi! I have this idea: the user defines a set of CSS rules with some comments (the comments are simple annotations): /* @name Page style */ body { font: 16px/1.5 Arial; /* @editable */ background-color: #fff; /* @editable */ } /* @name Section header */ h1 { font: 20px/1.2 Arial; /* @editable */ color: #c44 } I can apply this ...

How to parse raw POST data into array?

I have a raw form-data that look like this: ------------V2ymHFg03ehbqgZCaKO6jy Content-Disposition: form-data; name="intro" O ------------V2ymHFg03ehbqgZCaKO6jy Content-Disposition: form-data; name="title" T ------------V2ymHFg03ehbqgZCaKO6jy Content-Disposition: form-data; name="apiKey" 98d32fdsa ------------V2ymHFg03ehbqgZCaKO6jy C...

Is it possible to write a JSON parser without using recursion?

Is it possible to write a JSON parser without using recursion? If it is, what approach would you suggest? ...
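
One approach is to replace the call stack with an explicit stack of open containers. The sketch below is a simplified illustration, not a conforming parser: all names (`parse_json`, `TOKEN`, `unescape`) are hypothetical, and it does not validate where commas and colons appear.

```python
import re

# One token: punctuation, a string, a number, or a literal.
TOKEN = re.compile(
    r'\s*([\[\]{}:,]|"(?:[^"\\]|\\.)*"'
    r'|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?|true|false|null)')
LITERALS = {"true": True, "false": False, "null": None}
ESCAPES = {'"': '"', "\\": "\\", "/": "/", "b": "\b",
           "f": "\f", "n": "\n", "r": "\r", "t": "\t"}


def unescape(body):
    """Decode backslash escapes in the body of a JSON string token."""
    out, i = [], 0
    while i < len(body):
        if body[i] != "\\":
            out.append(body[i])
            i += 1
        elif body[i + 1] == "u":
            out.append(chr(int(body[i + 2:i + 6], 16)))
            i += 6
        else:
            out.append(ESCAPES[body[i + 1]])
            i += 2
    return "".join(out)


def parse_json(text):
    stack = []   # open containers, innermost last
    keys = []    # pending key per open container (always None for lists)
    root = None
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError("unexpected input at offset %d" % pos)
        tok, pos = match.group(1), match.end()
        if tok == "," or tok == ":":
            continue  # separators carry no value; placement is not checked
        if tok == "]" or tok == "}":
            if not stack:
                raise ValueError("unbalanced %r" % tok)
            stack.pop()
            keys.pop()
            continue
        if tok == "[":
            value = []
        elif tok == "{":
            value = {}
        elif tok[0] == '"':
            value = unescape(tok[1:-1])
        elif tok in LITERALS:
            value = LITERALS[tok]
        else:
            value = float(tok) if any(c in tok for c in ".eE") else int(tok)
        if not stack:
            root = value
        elif isinstance(stack[-1], list):
            stack[-1].append(value)
        elif keys[-1] is None:
            if not isinstance(value, str):
                raise ValueError("object keys must be strings")
            keys[-1] = value  # first value seen in an object slot is the key
            continue
        else:
            stack[-1][keys[-1]] = value
            keys[-1] = None
        if tok == "[" or tok == "{":
            stack.append(value)
            keys.append(None)
    if stack:
        raise ValueError("unclosed container")
    return root


parsed = parse_json('[{"id":"0","tags":[{"id":"36","tag":"test tag"}],"other":null}, 1.5, true]')
```

Because nesting depth only grows the `stack` list, deeply nested input does not hit Python's recursion limit, which is the main practical reason to avoid a recursive descent here.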

Shunting-yard: missing argument to operator

I'm implementing the shunting-yard algorithm. I'm having trouble detecting when there are missing arguments to operators. The wikipedia entry is very bad on this topic, and their code also crashes for the example below. For instance 3 - (5 + ) is incorrect because the + is missing an argument. Just before the algorithm reaches the ), t...
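
A common way to catch this is to track, alongside the operator stack, whether the next token should be an operand or an operator. The sketch below is one possible version under stated assumptions: binary `+ - * /` only, no unary minus or functions, characters the tokenizer does not recognise are silently skipped, and the name `to_rpn` is made up for illustration.

```python
import re

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_rpn(expression):
    """Convert an infix expression to RPN, rejecting missing operands."""
    tokens = re.findall(r"\d+(?:\.\d+)?|[-+*/()]", expression)
    output, ops = [], []
    expect_operand = True  # True at the start, after an operator, and after "("
    for tok in tokens:
        if tok[0].isdigit():
            if not expect_operand:
                raise ValueError("missing operator before %r" % tok)
            output.append(tok)
            expect_operand = False
        elif tok == "(":
            if not expect_operand:
                raise ValueError("missing operator before '('")
            ops.append(tok)
        elif tok == ")":
            if expect_operand:
                # Covers "(5 + )" and "()": something was still owed an operand.
                raise ValueError("missing argument before ')'")
            while ops and ops[-1] != "(":
                output.append(ops.pop())
            if not ops:
                raise ValueError("unbalanced ')'")
            ops.pop()  # discard the matching "("
        else:
            if expect_operand:
                raise ValueError("missing argument before %r" % tok)
            # Left-associative: pop operators of equal or higher precedence.
            while ops and ops[-1] != "(" and PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]:
                output.append(ops.pop())
            ops.append(tok)
            expect_operand = True
    if expect_operand:
        raise ValueError("missing argument at end of input")
    while ops:
        op = ops.pop()
        if op == "(":
            raise ValueError("unbalanced '('")
        output.append(op)
    return output


rpn = to_rpn("3 - (5 + 2)")
```

With this flag, `to_rpn("3 - (5 + )")` raises at the `)` because the `+` is still waiting for its right-hand operand.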

Any multi-format document reading lib for Python or C?

Is there any good document-parsing library in C or Python? I am trying to parse strings from documents: PDF, Word DOC/DOCX, Excel XLS/XLSX, PPT, ODF, and also Mac formats. Please recommend solutions that would also work in a Linux/Unix environment. ...