parsing

perl code that makes a list of all words that follow a given string in a text file

This is difficult to describe, but it is useful for extracting data from the output I am dealing with (I hope to use this code for a large number of purposes). Here is an example: Say I have a text file with words and some special characters ($, #, !, etc.) that reads: blah blah blah add this word to the list: 1234.56 blah blah blah blah b...
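A minimal sketch of one reading of this task, written in Python rather than Perl: collect the whitespace-delimited token that follows each occurrence of a marker string. The function name `words_after` and the sample text are illustrative assumptions, not taken from the question.

```python
import re

def words_after(text, marker):
    """Return the token that follows each occurrence of `marker` in `text`."""
    # re.escape keeps special characters in the marker ($, #, !) literal.
    pattern = re.escape(marker) + r"\s*(\S+)"
    return re.findall(pattern, text)

# Hypothetical sample input modeled on the excerpt above.
sample = (
    "blah blah add this word to the list: 1234.56 blah "
    "blah add this word to the list: $abc! blah"
)
print(words_after(sample, "add this word to the list:"))
```

Reading a file would only change how `text` is obtained, for example by passing the result of `open(path).read()`.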

Tokenize valid words from a long string

Suppose you have a dictionary that contains valid words. Given an input string with all spaces removed, determine whether the string is composed of valid words or not. You can assume the dictionary is a hashtable that provides O(1) lookup. Some examples: helloworld-> hello world (valid) isitniceinhere-> is it nice in here (valid) zx...
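One common way to approach this is dynamic programming over prefixes of the string. The sketch below assumes the dictionary is a Python `set` (which gives average O(1) lookup); the name `segment` and the invalid test string are hypothetical, chosen only for illustration.

```python
def segment(s, dictionary):
    """Return one split of `s` into dictionary words, or None if none exists."""
    # best[i] holds a list of words covering s[:i], or None if s[:i] cannot be split.
    best = [None] * (len(s) + 1)
    best[0] = []
    for end in range(1, len(s) + 1):
        for start in range(end):
            # Extend any valid split of s[:start] with the word s[start:end].
            if best[start] is not None and s[start:end] in dictionary:
                best[end] = best[start] + [s[start:end]]
                break
    return best[len(s)]

words = {"hello", "world", "is", "it", "nice", "in", "here"}
print(segment("helloworld", words))      # ['hello', 'world']
print(segment("isitniceinhere", words))  # ['is', 'it', 'nice', 'in', 'here']
print(segment("qqqq", words))            # None (hypothetical invalid input)
```

This performs a quadratic number of dictionary lookups in the length of the string and returns only the first split it finds, not every possible one.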

How to read an XML file with an undefined namespace with XMLReader?

I'm relatively new to parsing XML files and am attempting to read a large XML file with XMLReader. <?xml version="1.0" encoding="UTF-8"?> <ShowVehicleRemarketing environment="Production" lang="en-CA" release="8.1-Lite" xsi:schemaLocation="http://www.starstandards.org/STAR /STAR/Rev4.2.4/BODs/Standalone/ShowVehicleRemarketing.xsd"> <...

Steps and involvement of implementing a parser (in .Net - and in this case XPath 2.0)

Hi, for lack of any good free XPath 2.0 implementations for .Net built upon Linq to XML, I have thought about implementing my own (also for the experience). But just to be clear (and to avoid building something that already exists), these are the XPath 2.0 implementations I have found: Saxon .Net Query Machine - I had problems with this - excepti...

How to resolve parse error in Splint

Splint does not continue its checking after finding parse errors. I've also tried the +trytorecover option, but nothing changed. Please let me know how to use +trytorecover to make Splint attempt to continue after a parse error. Here is what I'm receiving: 161: splint +trytorecover spy.c Splint 3.1.1 --- 19 Jul 2006 spy.c:41:12: Parse ...

Boolean Query / Expression to a Concrete syntax tree

I'm creating a search form that allows boolean expressions, like: "foo AND bar" or "foo AND NOT bar". Is there a library for PHP, Ruby or Java that can transform boolean expressions to a concrete syntax tree? (I could write my own lexer/parser, but I'd rather use something tried and tested.) EDIT: To clarify, I'm not parsing arithmetic...

regex to eliminate field in bibtex file

I am trying to slim down the bib text files I get from my reference manager because it leaves extra fields that end up getting mangled when I put it into LaTeX. A characteristic entry that I want to clean up is: @Article{Kholmurodov:2001p113, author = {K Kholmurodov and I Puzynin and W Smith and K Yasuoka and T Ebisuzaki}, journal = {...

How to write parser for unified diff syntax

Should I use RegexParsers or StandardTokenParsers, or are these unsuitable for parsing this kind of syntax? An example of the syntax can be found here. ...

Read and delete text between two strings in perl

I need a way to read and delete text between two different strings found in some file, then delete the two strings. Like a "cut command." I would like to have the text stored in a variable. I saw the post about reading text between two strings, but I could not figure out how to delete it as well. I intend to execute the stored text i...
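A minimal sketch of the "cut" behaviour described above, in Python rather than Perl: remove the first span bounded by two marker strings (markers included) and return the text that sat between them. The function name `cut_between` and the `BEGIN`/`END` markers are assumptions made for illustration.

```python
def cut_between(text, start, end):
    """Remove the first start...end span; return (remaining_text, captured_text)."""
    i = text.find(start)
    if i == -1:
        return text, None  # start marker not found: leave the text unchanged
    j = text.find(end, i + len(start))
    if j == -1:
        return text, None  # no end marker after the start marker
    captured = text[i + len(start):j]
    # Drop both markers and everything between them.
    remaining = text[:i] + text[j + len(end):]
    return remaining, captured

remaining, captured = cut_between("keep BEGIN echo hi END this", "BEGIN", "END")
print(repr(remaining))
print(repr(captured))
```

Plain `str.find` is used instead of a regular expression so that markers containing special characters need no escaping.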

Native JSON support in iOS?

Is there a class to parse JSON from a server in the iOS SDK? (similar to NSXML for XML and by extension RSS.) ...

How can I do modeling in reverse by parsing a C program and turning it into a circuit diagram to be displayed?

How can I do modeling in reverse by parsing a C program and turning it into a circuit diagram to be displayed? Example Except this is pseudocode. ...

HTML tags in XML file, how to ignore HTML tags while XML parsing

Hi all, thanks in advance. I am using NSXMLParser to parse an XML file in my application. My XML file looks like this: <item> <ID>123456</ID> <category>Films</category> <Heading>HollyWood films</Heading> <Author>samule</Author> <imageFull>http://tree_one.jpg</image...

dojo datePatterns and parsing

How do I make dojo parse dates without the slashes, while still respecting the current locale? Example: Dates that must be parseable if locale is: en-us 12/24/2010 12/24/10 12242010 122410 da-dk 24/12/2010 24/12/10 24122010 241210 Currently dojo only parses the dates containing slashes. The dates without slashes return null when...

Parse String to Organize By Row with PHP

Hello all, this is probably something easy to accomplish. I have some dates in my MySQL database (and I am using PHP). Some are stored as 2010-08-25 11:00:00, while others are stored as 2010-08-25T08:00:00. My question is: I am selecting the dates from the database and then using ORDER BY start_date. However, I have noticed tha...

Scala packrat parser

Hi, I have some questions about the Packrat parser combinators introduced in Scala 2.8. Unfortunately, I wasn't able to find any tutorials on how to use this new feature other than the Scaladoc description of the PackratParsers trait, which is rather short. Would it be possible to get an example of how to use it? Actually, I have no experience in Scala....

parse url and title from string of multiple href tags in coldfusion

I need to parse the URL and title from multiple href tags in a string with a regex... I need to get each URL and title into a variable, e.g. <DT><A HREF="http://www.partyboatnj.com/" ADD_DATE="1210713679" LAST_VISIT="1225055180" LAST_MODIFIED="1210713679">NJ Party Boat - Sea Devil of Point Pleasant Beach, NJ</A> <DT><A HREF="http://www....

Parsing HTML with Lxml

I need help parsing out some text from a page with lxml. I tried BeautifulSoup, but the HTML of the page I am parsing is so broken that it wouldn't work. So I have moved on to lxml, but the docs are a little confusing and I was hoping someone here could help me. Here is the page I am trying to parse: http://bit.ly/bf1T12. I need to get ...

How to automate generating and running input files and parsing output files.

I have tried to do it myself, asking for help on specific functions, but the more I think about the possibilities, the more lost I get. I have some software (quantum chemistry packages) that reads input files and generates output files that are basically clumps of data of the form: Energy:[many spaces]6.3432 H 5 O 33 OHO 32 And weird st...

using xpath on single Nokogiri node returns elements in all nodes

I am parsing an XML doc that looks something like this: <MyBook> <title>Favorite Poems</title> <issn>123-456</issn> <pages>45</pages> </MyBook> <MyBook> <title>Chocolate Desserts</title> <issn>654-098</issn> <pages>100</pages> </MyBook> <MyBook> <title>Jabberwocky</title> <issn>454-545</issn> <pages>19</pages>...

Binary parser or serialization?

I want to store a graph of different objects for a game. Their classes may or may not be related, and they may or may not contain vectors of simple structures. I want the parsing operation to be fast, and the data can be pretty big. Adding new things should not be hard, and it should not break backward compatibility. Smaller file size is kind of impor...