parsing

LR1 Parser and Epsilon

I'm trying to understand how LR1 parsers work, but I ran into a strange problem: what if the grammar contains epsilon productions? For instance, given the grammar S -> A; A -> a A | B; B -> a, it's clear how to start: S -> .A; A -> .a A; A -> .B; ... and so on. But I don't know how to do it for a grammar such as S -> A; A -> a A a | \epsilon...
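A minimal sketch of the LR(1) closure step for the second grammar, assuming the usual textbook treatment where an epsilon production A -> ε yields the item A -> . (dot already at the end, so it is immediately reducible on its lookahead). The item layout `(head, body, dot, lookahead)` and the names `GRAMMAR`, `compute_first`, `first_of` and `closure` are illustrative choices, not taken from the question.

```python
# Grammar: S -> A ; A -> a A a | epsilon. An empty tuple is the epsilon body.
GRAMMAR = {
    "S": [("A",)],
    "A": [("a", "A", "a"), ()],
}


def compute_first(grammar):
    """Fixed-point computation of FIRST sets and the set of nullable nonterminals."""
    first = {nt: set() for nt in grammar}
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                all_nullable = True
                for sym in body:
                    syms = first[sym] if sym in grammar else {sym}
                    if not syms <= first[head]:
                        first[head] |= syms
                        changed = True
                    if sym not in nullable:  # terminals are never nullable
                        all_nullable = False
                        break
                if all_nullable and head not in nullable:
                    nullable.add(head)
                    changed = True
    return first, nullable


def first_of(seq, lookahead, grammar, first, nullable):
    """FIRST of the symbol string `seq` followed by the terminal `lookahead`."""
    result = set()
    for sym in seq:
        result |= first[sym] if sym in grammar else {sym}
        if sym not in nullable:
            return result
    result.add(lookahead)  # every symbol in seq can vanish
    return result


def closure(items, grammar):
    """LR(1) closure of a set of items (head, body, dot, lookahead)."""
    first, nullable = compute_first(grammar)
    result = set(items)
    work = list(items)
    while work:
        head, body, dot, la = work.pop()
        if dot == len(body) or body[dot] not in grammar:
            continue  # dot at the end, or in front of a terminal
        for b in first_of(body[dot + 1:], la, grammar, first, nullable):
            for prod in grammar[body[dot]]:
                item = (body[dot], prod, 0, b)  # epsilon body gives A -> . directly
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result


# Initial state: closure of [S -> .A, $]
start = closure({("S", ("A",), 0, "$")}, GRAMMAR)
for item in sorted(start):
    print(item)
```

In this sketch the epsilon production needs no special case in the closure itself: its item simply starts with the dot at the end of an empty body, so the state can reduce by A -> ε whenever the lookahead matches.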

Design advice for file parser

I am trying to design an EDIFACT parser. I was planning on having one class to read the file, one class to map the data, and one other class to deal with data storage. The part I am having a major problem with is how instances of those classes should communicate with each other. Any advice would be appreciated. ...

Parsing XML result from a Web Service

I'm new to Web Services and XML and have been tasked with parsing the XML response packet that a service returns. What's the best way to parse an XML result in C#.NET? I need to bind the results of a search query to a data grid. ...

Spreadsheet Parser in Java/Groovy

Hi I'm looking to parse spreadsheets (xls/ods) in Groovy. I have been using the Roo library for Ruby and was looking to try the same tasks in Groovy, as Java is already installed on a development server I use, and I would like to keep the number of technologies on the server to a simple core few. I am aware that the ods format is zipped...

Is there a good *strict* date parser for Java?

Is there a good, strict date parser for Java? I have access to Joda-Time but I have yet to see this option. I found the "Is there a good date parser for Java" question, and while this is related it is sort of the opposite. Whereas that question was asking for a lenient, more fuzzy-logic and prone to human error parser, I would like a ...

Are Scala's/Haskell's parser combinators sufficient?

I'm wondering if Scala's/Haskell's parser combinators are sufficient for parsing a programming language, more specifically the language MiniJava. I'm currently studying compiler construction, and JFlex and Java CUP are quite painful to work with, so I'm wondering if I could/should use parser combinators instead. The MiniJava syntax is very sm...

Dotnetnuke 3.0.12 installing site "Could not load type 'dotnetnuke.common.global'"

I received a dotnetnuke 3.0.12 installation in a zip file and made a web site under c:\inetpub\wwwroot and copied the files. When I access default.aspx, I get the error: Could not load type 'DotNetNuke.Common.Global'. ...

File and space in Python

I have a file like: <space> <space> line1 <space> column 1 column 2 column 3 ... . . . <space> <space> How do I remove these extra spaces? I need to extract the heading, which will be on line1. I also need to extract column 1, column 2, column 3, etc. At the end of the last column's content there is a '\n'. How do I get rid of it? H...

Why are parsing tools needed for DSLs?

Couldn't a DSL be as simple as an API and therefore not need a parser? Or am I misunderstanding what a domain specific language really is? I thought it referred to any organized set of rules for solving a particular domain problem. An API would seem to fit that definition, right? ...

How to parse Nagios status.dat file?

I'd like to parse the status.dat file for nagios3 and output it as XML with a Python script. The XML part is the easy one, but how do I go about parsing the file? Should I use a multi-line regex? The file may be large, as many hosts and services are monitored, so would loading the whole file into memory be wise? I only need to extract services tha...
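One way to avoid both a multi-line regex and loading the whole file is to read it line by line and yield one block at a time. This is only a sketch: it assumes status.dat uses the layout of `blocktype {`, one `key=value` per line, and a closing `}` on its own line. The function name `iter_blocks`, the sample lines, and the choice to keep only `servicestatus` blocks are illustrative assumptions, since the filter the question needs is cut off.

```python
from typing import Dict, Iterable, Iterator, Tuple


def iter_blocks(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (block_type, fields) pairs one block at a time.

    Assumed layout (hypothetical sample, not taken from a real file):
        servicestatus {
            host_name=web01
            current_state=2
            }
    """
    block_type = None
    fields: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if block_type is None:
            if line.endswith("{"):
                block_type = line[:-1].strip()
                fields = {}
        elif line == "}":
            yield block_type, fields
            block_type = None
        else:
            key, sep, value = line.partition("=")
            if sep:  # ignore lines that are not key=value
                fields[key] = value


# Hypothetical sample input; with a real file, pass the open file object instead.
sample = [
    "# comment line\n",
    "hoststatus {\n",
    "\thost_name=web01\n",
    "\t}\n",
    "\n",
    "servicestatus {\n",
    "\thost_name=web01\n",
    "\tservice_description=HTTP\n",
    "\tcurrent_state=2\n",
    "\t}\n",
]

services = [f for kind, f in iter_blocks(sample) if kind == "servicestatus"]
```

Because `iter_blocks` is a generator, calling it on a file object opened with `open(path)` keeps only one block in memory at a time.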

How should I deserialize this string data?

My application receives strings that represent objects. I have to parse the strings to get the property values for the object I'm creating. Each object will have to know the specifics about how many attributes there are, what each attribute means, etc. However, I want to avoid having each class know about how to parse the string. I'd...

Parsing email headers

Hi, we need to parse email headers. We need to extract the domains/IPs through which the mail has traversed. We also need to figure out whether an IP is an internal IP. Is there already a library that can help out, especially in C/C++? For example, Received: from server.mymailhost.com (mail.mymailhost.com [126.43.75.123]) by pilot01.cl.msu.ed...

Big XML file and OutOfMemoryError

Hello, I’m trying to parse an XML file of up to 500 MB in Java. I tried to use SAX, but it gives me this error: java.lang.OutOfMemoryError: Java heap space at com.sun.org.apache.xerces.internal.util.XMLStringBuffer.append(Unknown Source) Can you help me? Thanks a lot. P.S. Smaller XML files work just fine ...

How can I parse command-line switches in Perl?

In order to extend my "grep" emulator in Perl I have added support for a -r switch which enables recursive searching in sub-directories. Now the command line invocation looks something like this: perl pgrep.pl -r <directory> <expression> Both -r and the directory arguments are optional (directory defaults to '.'). As of now I simply c...

How can I remove all characters from string starting at first non-alpha character?

This would have been a lot easier if not for certain situations. Sample data: KENP989SD KENP913E KENPX189R KENP913 What regular expression can I use to remove all characters from the string starting at the first non-alpha character? Basically, I want to find the first non-alpha character and chop everything off after that regardless ...
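A minimal sketch in Python (the question does not name a language, and its remaining requirements are cut off), assuming "alpha" means ASCII letters: match the first non-letter and everything after it, then replace the match with nothing. The helper name `strip_from_first_non_alpha` is illustrative.

```python
import re


def strip_from_first_non_alpha(s: str) -> str:
    """Keep the leading run of ASCII letters; drop everything from the first non-letter on."""
    # [^A-Za-z] matches the first non-letter, .* consumes the rest of the string.
    return re.sub(r"[^A-Za-z].*", "", s, flags=re.DOTALL)


samples = ["KENP989SD", "KENP913E", "KENPX189R", "KENP913"]
results = [strip_from_first_non_alpha(s) for s in samples]
print(results)  # ['KENP', 'KENP', 'KENPX', 'KENP']
```

The same pattern, `[^A-Za-z].*` replaced with an empty string, should carry over to most regex flavours.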

How can I find a specific line in a file without using regular expressions in Perl?

I have a file that I want to read in using the File::Slurp module and then search for a specific line. I know Perl has regular expressions, but the strings I’m searching for are user-generated, so I don’t want to have to worry about escaping everything. I could write a foreach loop to do this but was wondering if there’s a function in ...

Is there a CSS parser for C#?

I've seen CSS parsers in other languages, and they don't look very complex (the grammar is remarkably simple). I want one in C# (er, .NET), but I can't seem to find one, and I'd rather not write one if I can reasonably avoid it. Does one exist? ...

T-SQL check constraint for .NET TimeSpan?

I have an nvarchar(max) column in a SQL Server 2005 table which is used for storing string representations of .NET TimeSpan objects. Sometimes the table is manually edited. I want to add a check constraint to verify that the string can be parsed by TimeSpan.Parse(). How should I do this? I guess I could use one of the methods for enabling...

PHP XML Parsing

Which is the best way to parse an XML file in PHP? First, using the DOM object: //code $dom = new DOMDocument(); $dom->load("xml.xml"); $root = $dom->getElementsByTagName("tag"); foreach($root as $tag) { $subChild = $tag->getElementsByTagName("child"); // extract values and loop again if needed } Second, using the simplexml_load Me...

PHP XMLReader getting parent node?

While parsing an XML file using the XMLReader method, how do I get the parent node of an element? $xml = new XMLReader(); $xml->XML($xmlString); while($xml->read()) { $xml->localName; // gives tag name $xml->value; // gives tag value // how do I access the parent of this element } ...