parsing

Parsing JSON using Json.NET

I'm trying to parse some JSON using the Json.NET library. The documentation seems a little sparse and I'm confused as to how to accomplish what I need. Here is the format for the JSON I need to parse through. { "displayFieldName" : "OBJECT_NAME", "fieldAliases" : { "OBJECT_NAME" : "OBJECT_NAME", "OBJECT_TYPE" : "OBJECT_T...
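For orientation only, here is a minimal Python sketch of reading a payload of this shape. Json.NET itself is a C# library, so this shows the structure of the task rather than its API. The sample string uses only the fields fully visible in the excerpt; the rest of the payload is elided and not reconstructed, and `parse_payload` is a hypothetical helper name.

```python
import json

# Sample built only from the fields fully visible in the excerpt (assumption:
# the real payload contains more keys, which are elided and not guessed here).
raw = '{"displayFieldName": "OBJECT_NAME", "fieldAliases": {"OBJECT_NAME": "OBJECT_NAME"}}'


def parse_payload(text):
    """Return the display field name and the alias mapping from the JSON text."""
    doc = json.loads(text)
    return doc["displayFieldName"], doc["fieldAliases"]


display_field, aliases = parse_payload(raw)
```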

How to parse a binary file with floats (Java generated) using Cocoa Touch?

Given the following Java code for generating a binary file: DataOutputStream out = new DataOutputStream(new FileOutputStream("foo.dat")); out.writeInt(1234); out.writeShort(30000); out.writeFloat(256.384f); I'm using the following Objective-C code and manage to parse the int and the short values: NSString *path = [[NSBundle mainBundl...
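The byte layout in question can be sketched in Python as follows. Java's `DataOutputStream` writes values in big-endian order, so the record is a 4-byte int, a 2-byte short, and a 4-byte IEEE 754 float. The sample bytes are built in memory rather than read from `foo.dat`, and `parse_record` is a hypothetical helper name.

```python
import struct

# ">" selects big-endian with no padding: int (4 bytes), short (2), float (4).
RECORD = struct.Struct(">ihf")


def parse_record(data):
    """Unpack one (int, short, float) record from the start of data."""
    return RECORD.unpack_from(data, 0)


# Stand-in for the file contents, using the values written by the Java code.
sample = struct.pack(">ihf", 1234, 30000, 256.384)
int_value, short_value, float_value = parse_record(sample)
```

The float comes back as the nearest single-precision value, so it should be compared with a tolerance rather than for exact equality.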

Show a list of repeated words in Haskell

I need to write a function that shows the repeated words in a string and returns a list of strings in order of occurrence, ignoring non-letters, e.g. at the Hugs prompt: repetitions :: String -> [String] repetitions > "My bag is is action packed packed." output> ["is","packed"] repetitions > "My name name name is Sean ." output...
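One possible reading of the task, sketched in Python rather than Haskell: split the string into runs of letters, then report a word whenever it immediately follows itself. Two points are assumptions, since the excerpt is cut off: the comparison is case-sensitive, and a run of three or more identical words is reported once.

```python
import re


def repetitions(text):
    """Return words that immediately repeat, in order of occurrence."""
    words = re.findall(r"[A-Za-z]+", text)  # non-letters act as separators
    repeated = []
    run_reported = False
    for prev, cur in zip(words, words[1:]):
        if cur == prev:
            if not run_reported:  # assumption: one entry per run of repeats
                repeated.append(cur)
                run_reported = True
        else:
            run_reported = False
    return repeated


first_example = repetitions("My bag is is action packed packed.")
```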

What is the simplest way to parse binary files in Cocoa Touch?

Say I have a binary file (generated with Java) containing a 32 bit int value. I'm currently using the following Objective-C code for parsing it: NSString *path = [[NSBundle mainBundle] pathForResource:@"foo" ofType:@"dat"]; NSFileHandle *file = [NSFileHandle fileHandleForReadingAtPath:path]; unsigned long intValue; memcpy(&intValue, [[...

Parse HTML code to find a field

Hi, I have this page: http://www.elseptimoarte.net/. The page has a search field. If I enter, for instance, "batman", it gives me some search results, each with its own URL: http://www.elseptimoarte.net/busquedas.html?cx=003284578463992023034%3Alraatm7pya0&cof=FORID%3A11&ie=ISO-8859-1&oe=ISO-8859-1&q=batman#978 I would ...

Fixing malformed HTML in PHP?

I am constructing a large HTML document from fragments supplied by users that have the annoying habit of being malformed in various ways. Browsers are robust and forgiving enough but I want to be able to validate and (ideally) fix any malformed HTML if at all possible. For example: <td><b>Title</td> can be reasonably fixed to: <td>...

C# - convert string to int and test success

How can you check whether a string is convertible to an int? Let's say we have data like "House", "50", "Dog", "45.99". I want to know whether I should just use the string or use the parsed int value instead. In JavaScript there is the parseInt() function; if the string can't be parsed, it returns NaN. ...
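The "try to convert, report success" pattern asked about here can be sketched in Python as below. In C# the built-in counterpart is `Int32.TryParse`; `try_parse_int` is a hypothetical name used only for this illustration.

```python
def try_parse_int(text):
    """Return (True, value) if text is a valid integer, else (False, None)."""
    try:
        return True, int(text)
    except ValueError:
        return False, None


# The sample data from the question: only "50" converts.
results = [try_parse_int(s) for s in ["House", "50", "Dog", "45.99"]]
```

Note that Python's `int()` rejects "45.99" outright, whereas JavaScript's `parseInt()` would stop at the decimal point and return 45.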

parsing HTML on the iPhone

Can anyone recommend a C or Objective-C library for HTML parsing? It needs to handle messy HTML code that won't quite validate. Does such a library exist, or am I better off just trying to use regular expressions? ...

Reading the fileset from a torrent

I want to (quickly) put a program/script together to read the fileset from a .torrent file. I want to then use that set to delete any files from a specific directory that do not belong to the torrent. Any recommendations on a handy library for reading this index from the .torrent file? Whilst I don't object to it, I don't want to be dig...
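A .torrent file is a bencoded dictionary, so a library is not strictly required. The sketch below is a minimal bencode decoder plus a helper that lists the file paths from the `info` dictionary. The sample bytes are a hand-built stand-in for a real .torrent file, the names `bdecode` and `torrent_files` are hypothetical, and UTF-8 path encoding is an assumption.

```python
def bdecode(data, pos=0):
    """Decode one bencoded value at pos; return (value, next_pos)."""
    lead = data[pos:pos + 1]
    if lead == b"i":                      # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":                      # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = bdecode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":                      # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = bdecode(data, pos)
            value, pos = bdecode(data, pos)
            result[key] = value
        return result, pos + 1
    if lead.isdigit():                    # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError("invalid bencoded data at offset %d" % pos)


def torrent_files(meta):
    """List relative file paths described by a decoded torrent dictionary."""
    info = meta[b"info"]
    name = info[b"name"].decode("utf-8")  # assumption: UTF-8 names
    if b"files" not in info:              # single-file torrent
        return [name]
    return ["/".join([name] + [part.decode("utf-8") for part in entry[b"path"]])
            for entry in info[b"files"]]


# Hand-built sample: a directory "dir" containing a/b.txt and c.txt.
sample = (b"d4:infod5:filesld6:lengthi3e4:pathl1:a5:b.txtee"
          b"d6:lengthi5e4:pathl5:c.txteee4:name3:diree")
meta, _ = bdecode(sample)
paths = torrent_files(meta)
```

With the paths in a set, files in the target directory that are not in the set are the candidates for deletion.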

Implement word boundary states in flex/lex (parser-generator)

I want to be able to predicate pattern matches on whether they occur after word characters or after non-word characters. In other words, I want to simulate the \b word break regex char at the beginning of the pattern which flex/lex does not support. Here's my attempt below (which does not work as desired): %{ #include <stdio.h> %} %x ...

Parsing fixed-format data embedded in HTML in python

I am using Google's App Engine API (from google.appengine.api import urlfetch) to fetch a webpage. The result of result = urlfetch.fetch("http://www.example.com/index.html") is a string of the HTML content (in result.content). The problem is that the data I want to parse is not really in HTML form, so I don't think using a Python HT...

I need to parse XML from a Google App Engine app

Any examples? Thanks ...

How can I parse dates and convert time zones in Perl?

I've used the localtime function in Perl to get the current date and time but need to parse in existing dates. I have a GMT date in the following format: "20090103 12:00" I'd like to parse it into a date object I can work with and then convert the GMT time/date into my current time zone which is currently Eastern Standard Time. So I'd li...
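The two steps described (parse the GMT string, then shift to Eastern Standard Time) look like this in a Python sketch; the question itself concerns Perl. The fixed UTC-5 offset is an assumption that matches EST only and ignores daylight saving time, and `gmt_to_est` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset stands in for EST (no daylight saving).
EST = timezone(timedelta(hours=-5), "EST")


def gmt_to_est(stamp):
    """Parse a 'YYYYMMDD HH:MM' GMT timestamp and convert it to EST."""
    parsed = datetime.strptime(stamp, "%Y%m%d %H:%M").replace(tzinfo=timezone.utc)
    return parsed.astimezone(EST)


local = gmt_to_est("20090103 12:00")
formatted = local.strftime("%Y-%m-%d %H:%M %Z")
```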

Parsing data from txt file in J2ME

Basically I'm creating an indoor navigation system in J2ME. I've put the location details in a .txt file, i.e. location names and their coordinates, plus edges with their respective start node and end node as well as the weight (length of the node). I put both details in the same file so users don't have to download multiple files to get their ma...

Can you preserve leading and trailing whitespace in XML?

How does one tell the XML parser to honor leading and trailing whitespace? Dim xml: Set xml = CreateObject("MSXML2.DOMDocument") xml.async = False xml.loadxml "<xml>1 2</xml>" wscript.echo len(xml.documentelement.text) Above prints out 3. Dim xml: Set xml = CreateObject("MSXML2.DOMDocument") xml.async = False xml.loadxml "<xml> 2</xm...

What is the optimal (speed) way of parsing a large (> 4GB) text file with many (millions of) lines?

I'm trying to determine the fastest way to read in large text files with many rows, do some processing, and write them to a new file. In C#/.NET, StreamReader appears to be a quick way of doing this, but when I try to use it for this file (reading line by line), it runs at about 1/3 the speed of Python's I/O (which worries me...
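For reference, the Python side of that comparison usually looks like the sketch below: iterate over the file object so only one line is held in memory at a time, and write each processed line straight out. The `transform_lines` name, the UTF-8 encoding, and the `str.upper` processing step are placeholders, and the demo runs on a tiny temporary file rather than a multi-gigabyte one.

```python
import os
import tempfile


def transform_lines(src_path, dst_path, process):
    """Stream src to dst one line at a time; return the number of lines."""
    count = 0
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:  # the file object yields lines lazily
            dst.write(process(line))
            count += 1
    return count


# Small demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "in.txt")
    dst_path = os.path.join(tmp, "out.txt")
    with open(src_path, "w", encoding="utf-8") as f:
        f.write("alpha\nbeta\n")
    line_count = transform_lines(src_path, dst_path, str.upper)
    with open(dst_path, encoding="utf-8") as f:
        output_text = f.read()
```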

[.NET] Should I roll my own version of ParseInt32?

I am writing a high performance parser, and it seems to me that Int32.Parse can be too slow. I wrote a simple version that assumes correct input, and it's performing much better. So should I create my own version instead? Or is there another faster method already available? My method is like this: // parse simple int, assuming relative...
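The excerpt cuts off before the method body, so the following is not the author's code. It is a Python sketch of the general technique such parsers tend to use: accumulate digits with multiply-by-ten, and skip validation on the assumption that the input is well formed. It shows the logic only; in Python the built-in `int()` would be faster, and a C# version would also need to consider Int32 overflow.

```python
def parse_int_fast(text):
    """Parse a decimal integer, assuming well-formed input.

    Assumption: text is an optional leading '-' followed by ASCII digits only;
    anything else produces a meaningless result instead of an error.
    """
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    value = 0
    for ch in digits:
        value = value * 10 + (ord(ch) - 48)  # 48 is ord("0")
    return -value if negative else value


parsed_values = [parse_int_fast(s) for s in ["12345", "-42", "0"]]
```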

Parsing C#, finding methods and putting try/catch to all methods

I know it sounds weird but I am required to put a wrapping try catch block to every method to catch all exceptions. We have thousands of methods and I need to do it in an automated way. What do you suggest? I am planning to parse all cs files and detect methods and insert a try catch block with an application. Can you suggest me any par...

What are the rules for file extensions in Windows and Unix?

Hi, I'm currently using File::Basename fileparse to separate out a file's directory, base file name and its extension using something like this: my($myfile_name,$mydirectory, $file_extension) = fileparse($$rhash_params{'storage_full_path_location'},'\..{1,4}'); But I see that there's a variation where you can actually provide an array o...

Could you please review my Quick Int Parser implementation?

I wrote some utility methods that can parse a 32-bit signed integer much faster than Int32.Parse. I hope you could give me some of your experience by reviewing my code and suggesting enhancements or pointing out bugs. If you are interested (and have the time of course), I would be very grateful. I have posted my code on the "Refactor My...