parsing

Natural language processing / text structure analysis starting point

I need to parse & process a big set of semi-structured text (basically, legal documents - law texts, addendums to them, treaties, judge's decisions, ...). The most fundamental thing I'm trying to do is extract information on how subparts are structured - chapters, articles, subheadings, ... plus some metadata. My question is if anyone ca...

efficient and flexible binary data parsing

I have an external device that spits out UDP packets of binary data and software running on an embedded system that needs to read this data stream, parse it and do something useful. The binary data gets logged to a file as well. I would like to write a parser that can easily take the input directly from either the UDP stream, or a file...

LISP Parser C++

Is there an existing LISP parser written in C++? I just want the parser, not a full interpreter, but an interpreter to go along with it would be a plus. ...

linq to xml - read hibernate file

How can I get the connection.connection_string value from the following hibernate xml file using linq? <?xml version="1.0" encoding="utf-8" ?> <hibernate-configuration xmlns="urn:nhibernate-configuration-2.2"> <session-factory> <property name="connection.provider">NHibernate.Connection.DriverConnectionProvider</property> <p...

Parsing XML with JQuery

I am querying the Microsoft Office SharePoint Server Search Service to write some results into a web part. I have the query working correctly but am having some trouble parsing the xml response via JQuery. Below is the XML response <ResponsePacket xmlns="urn:Microsoft.Search.Response"> <Response domain="QDomain"> <Range> <StartAt...

Parsing a decimal from a DataReader.

Hey all, I found a workaround for this error, but I am now really curious why it happens and whether anyone else has run into it. My function is as follows: public void Blog_GetRating(int blogID, ref decimal rating, ref int voteCount) { // Sql statements // Sql commands if(DataReader.Read()) ...

XML Parsing Error: junk after document element

I am getting this error when I try to view my dynamically generated (PHP) XML document: XML Parsing Error: junk after document element Location: http://dev.leisurepublishing.com/vtc/master.xml.php Line Number 17, Column 1: ^ I have googled and looked through the document and I can't figure out what's wrong, can someone help me spot the...

ANTLR vs. Happy vs. other parser generators

I want to write a translator between two languages, and after some reading on the Internet I've decided to go with ANTLR. I had to learn it from scratch, but besides some trouble with eliminating left recursion everything went fine until now. However, today some guy told me to check out Happy, a Haskell based parser generator. I have no...

FasterCSV - raising MalformedCSVError when it shouldn't

Sample data: "iWine","Barcode","Location","Bin","Size","Valuation","Price","StoreName",\ "PurchaseDate","Note","Vintage","Wine","Locale","Country","Region","SubRegion",\ "Appellation","Producer","SortProducer","Type","Color","Category","Varietal",\ "MasterVarietal","Designation","Vineyard","WA","WS","IWC","BG","WE","JR",\ "RH","JG","GV"...

How do I use this regular expression? (Aweber Email Parser)

Hello All, I'm having a bit of difficulty figuring out how to use this Regular Expression which Aweber provides for email parsing. I'm supposed to be able to send an email to aweber, with this ruleset, and aweber will add the email to my list. Here is the rule: Trigger Rule: From:[^\n|.]+user\@domain\.com | MATCH HEADERS Rule 1: ...

HTML parsing error

I don't know what's causing this on my page at http://www.jazmyn.com/media.html. I keep getting this message: HTML Parsing Error: Unable to modify the parent container element before ...

Problem with HTML Agility Pack and Visual Studio C++

I need a very simple HTML parser that can extract text and tables from well-formed HTML documents in the .NET environment. I found several references to HTMLAgilityPack. My problem is that I am using the Visual C++ environment in the .NET framework. Can anyone help me with instructions on how to add a "reference" to the C# genera...

Can someone suggest how this Perl script works?

I have to maintain the following Perl script: #!/usr/bin/perl -w die "Usage: $0 <file1> <file2>\n" unless scalar(@ARGV)>1; undef $/; my @f1 = split(/(?=(?:SERIAL NUMBER:\s+\d+))/, <>); my @f2 = split(/(?=(?:SERIAL NUMBER:\s+\d+))/, <>); die "Error: file1 has $#f1 serials, file2 has $#f2\n" if ($#f1 != $#f2); foreach my $g (0 .. $#f1...
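
The visible part of the script turns on one idea: `split` with a zero-width lookahead cuts the input just before every "SERIAL NUMBER: <digits>" marker without consuming the marker. A minimal sketch of that idea in Python follows. It is an illustration, not a port of the script: the sample text and the name `split_records` are made up, and Python (3.7+) and Perl differ in how they treat an empty leading field.

```python
import re

# Zero-width lookahead: matches the empty position just before each
# "SERIAL NUMBER: <digits>" marker, so the marker itself is kept.
RECORD_START = re.compile(r"(?=SERIAL NUMBER:\s+\d+)")


def split_records(text):
    """Split text so that each serial-number record starts its own chunk."""
    # Requires Python 3.7+, where re.split() splits on empty matches.
    return RECORD_START.split(text)


# Hypothetical input; the real file format is not shown in the question.
sample = "header\nSERIAL NUMBER: 1\nfoo\nSERIAL NUMBER: 2\nbar\n"
chunks = split_records(sample)

for chunk in chunks:
    print(repr(chunk))
```

Any text before the first marker ends up in the first chunk, which may be why the script compares `$#f1` and `$#f2` (last indices) rather than record counts.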

how to access xml nodes in flex

A web service returns this custom exception to my Flex 3 client: <SOAP-ENV:Fault xmlns:ro="urn:Gov2gLibrary" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:HNS="http://tempuri.org/" xmlns:SOAP-ENC="http://schemas.xmlsoap.or...

Get HTML element information in .NET

Hi, I'm wondering whether there is any way to get information about an HTML element in my .NET application. The input is an HTML page and the paths to its CSS files etc. I want to take e.g. an H1 tag and find out which CSS applies to it. Is there any code or can I use IE and try to take this information from it automatically inside of my applicati...

Why can't DateTime.ParseExact() parse "9/1/2009" using "M/d/yyyy"

Hi, I have a string that looks like this: "9/1/2009". I want to convert it to a DateTime object (using C#). This works: DateTime.Parse("9/1/2009", new CultureInfo("en-US")); But I don't understand why this doesn't work: DateTime.ParseExact("9/1/2009", "M/d/yyyy", null); There's no word in the date (like "September"), and I know t...

Background reading for parsing sloppy / quirky / "almost structured" data?

I'm maintaining a program that needs to parse out data that is present in an "almost structured" form in text. i.e. various programs that produce it use slightly different formats, it may have been printed out and OCR'd back in (yeah, I know) with errors, etc. so I need to use heuristics that guess how it was produced and apply differen...

XML Parser for Ruby

Looking for something similar to xerces for parsing an xml file in ruby. I saw the native processor REXML and another called hpricot (though I can't find any documentation on hpricot, the links all appear to be dead). I'm looking for something that would parse an xml document via SAX2 in ruby. TIA. ...

Moving C++ Method Declarations from .hh to .cc File

I'm working on a C++ project in which there are a lot of classes that have classes, methods and includes all in a single file. This is a big problem, because frequently the method implementations require #include statements, and any file that wants to use a class inherits these #includes transitively. I was just thinking that it would be...

Parsing large text files with Adobe AIR

Hello, I am trying to do the following in AIR: browse to a text file; read the text file and store it in a string (ultimately in an array); split the string by the delimiter \n and put the resulting strings in an array; manipulate that data before sending it to a website (MySQL database). The text files I am dealing with will be anywher...