parsing

What's the best way to parse a body of text against multiple (15+) regexes on each line?

I have a body of text that I have to scan, and each line contains at least two and sometimes four pieces of information. The problem is that each line can be one of 15-20 different actions. In Ruby the current code looks somewhat like this: text.split("\n").each do |line| #around 20 times.. .............. expressions['actions']...
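One common way to avoid running 15-20 separate regexes per line is to join them into a single alternation and check which branch matched. The sketch below is in Python rather than Ruby, and the three action patterns are hypothetical stand-ins, since the real expressions are not shown in the question.

```python
import re

# Hypothetical action patterns; the question's real 15-20 expressions are not shown.
# Inner group names are prefixed with the action name so they stay unique
# once all patterns are joined into one regex.
ACTION_PATTERNS = {
    "login": r"(?P<login_user>\w+) logged in",
    "move": r"(?P<move_user>\w+) moved to (?P<move_room>\w+)",
    "say": r'(?P<say_user>\w+) said "(?P<say_text>[^"]*)"',
}

# One compiled alternation; each branch is wrapped in a group named after its action.
COMBINED = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in ACTION_PATTERNS.items())
)


def parse_line(line):
    """Return (action, fields) for a recognised line, or None otherwise."""
    match = COMBINED.fullmatch(line.strip())
    if match is None:
        return None
    for name in ACTION_PATTERNS:
        # Only the branch that matched has a non-None outer group.
        if match.group(name) is not None:
            prefix = name + "_"
            fields = {
                key[len(prefix):]: value
                for key, value in match.groupdict().items()
                if key.startswith(prefix) and value is not None
            }
            return name, fields
    return None


def parse_text(text):
    """Parse every line of the text, skipping lines that match no action."""
    results = []
    for line in text.split("\n"):
        parsed = parse_line(line)
        if parsed is not None:
            results.append(parsed)
    return results
```

Whether this is actually faster than looping over separate patterns depends on the regex engine and the patterns themselves, so it is worth measuring on real input.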

Getting sqlcmd output into a GridView

I am designing an ASP.NET website that will run sqlcmd, get some output and put this into a grid on screen. I was wondering if there is a method for reading the results of a query from sqlcmd into some kind of format that I can work with, XML, DataSet etc. Is there a friendly switch in sqlcmd that will output it in a nice format or wil...

A good OCaml parser?

Hi, I'm looking for a good OCaml parsing library that isn't a derivative of flex/bison. Ideally, I'd like a monadic combinator library along the lines of Parsec, but I can't find anything. I would use Haskell, but making LLVM bindings for Haskell is proving more tiresome than I originally thought. Cheers, Duane ...

How to specify the exact number of occurrences of a token in ANTLR?

I have to define the grammar of a file like the one shown below. //Sample file NameCount = 4 Name = a Name = b Name = c Name = d //End of file Now I am able to define tokens for NameCount and Name. But I have to define the file structure including the valid number of instances of the token Name, which is the value after NameCount. I have ...
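A count that depends on an earlier value in the file is awkward to express in the grammar itself, so one option is to parse the lines loosely and check the count afterwards. The sketch below shows that post-parse check in plain Python, not ANTLR; the function name `parse_names` and the line regex are assumptions made for illustration, based only on the sample file quoted in the question.

```python
import re

# Matches "NameCount = 4" or "Name = a"; the layout is assumed from the sample file.
LINE_RE = re.compile(r"^(NameCount|Name)\s*=\s*(\S+)$")


def parse_names(text):
    """Collect Name values and verify that their number equals NameCount."""
    expected = None
    names = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # skip blank lines and // comments
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised line: {line!r}")
        key, value = match.groups()
        if key == "NameCount":
            expected = int(value)
        else:
            names.append(value)
    if expected is None:
        raise ValueError("missing NameCount")
    if len(names) != expected:
        raise ValueError(f"expected {expected} names, found {len(names)}")
    return names
```

The same idea should carry over to a generated parser: let the grammar accept any number of Name entries, then compare the count in an action or in code that runs after parsing.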

Compiling ANTLRWorks generated class files

I am using ANTLRWorks to create ANTLR grammars. I have a valid grammar and the parser and lexer source files are generated as well. I have also tried debugging the generated code and the output is as expected in the debugger output. But when I try to invoke the __Test__ class generated by the debugger nothing is coming up in the conso...

How do I stop the Sun JDK 1.6 built-in StAX parser from resolving DTD entities

I'm using the StAX event-based APIs to modify an XML stream. The stream represents an HTML document, complete with DTD declaration. I would like to copy this DTD declaration into the output document (written using an XMLEventWriter). When I ask the factory to disregard DTDs it will not download the DTD, but removes the whole statement a...

Parsing a Date Range in C# - ASP.NET

Given, say, 11/13/2008 - 12/11/2008 as the value posted back in a TextBox, what would be the best way to parse out the start and end dates using C#? I know I could use: DateTime startDate = Convert.ToDateTime(TextBoxDateRange.Text.Substring(0, 10)); DateTime endDate = Convert.ToDateTime(TextBoxDateRange.Text.Substring(13, 10)); Is there a...
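Fixed Substring offsets break as soon as a date is written without zero padding, so splitting on the separator and parsing each half against an explicit format tends to be more robust. The sketch below shows the idea in Python rather than C#; the function name `parse_date_range`, the " - " separator and the month/day/year format are assumptions taken from the single example value in the question.

```python
from datetime import datetime


def parse_date_range(text, fmt="%m/%d/%Y"):
    """Split 'start - end' on the separator and parse both halves with one format."""
    start_text, separator, end_text = text.partition(" - ")
    if not separator:
        raise ValueError(f"no ' - ' separator in {text!r}")
    # strptime raises ValueError itself if either half does not match the format.
    start = datetime.strptime(start_text.strip(), fmt)
    end = datetime.strptime(end_text.strip(), fmt)
    if end < start:
        raise ValueError("end date is before start date")
    return start, end
```

In C# the equivalent would presumably be a string Split followed by DateTime.ParseExact or DateTime.TryParseExact on each part, which also makes the expected culture and format explicit.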

Parsing GPS receiver output via regex in Python

I have a friend who is finishing up his master's degree in aerospace engineering. For his final project, he is on a small team tasked with writing a program for tracking weather balloons, rockets and satellites. The program receives input from a GPS device, does calculations with the data, and uses the results of those calculations to con...

C#.NET Importing a registry hive and parsing its contents

I have been given a .Hive file from a registry which I have to parse and use the contents as part of an HTML report (from this I assume I have to convert to text somehow). The whole thing must be done within the program, so I can't just convert the hive file and then run it through my program. I currently have no idea how to even start this ...

Code editor with autocomplete

I need to create a code editor for my own simple language: className.MethodName(parameterName = 2, ... ) I've created the appropriate grammar and autogenerated a parser using the ANTLR tool. Now I would like to have autocomplete for class, method, variable and parameter names. This list should be context dependent, e.g. for "class." it sh...

CSV Parsing

I am trying to use C# to parse CSV. I used regular expressions to find "," and read string if my header counts were equal to my match count. Now this will not work if I have a value like: "a",""b","x","y"","c" then my output is: 'a' '"b' 'x' 'y"' 'c' but what I want is: 'a' '"b","x","y"' 'c' Is there any regex or any other logi...

Parsing Binary Data in C?

Are there any libraries or guides for how to read and parse binary data in C? I am looking at some functionality that will receive TCP packets on a network socket and then parse that binary data according to a specification, turning the information into a more useable form by the code. Are there any libraries out there that do this, or...
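The usual pattern for this kind of task is to describe the fixed-size header as a packed layout, unpack it in network byte order, and then slice the payload using the length field. The sketch below illustrates that pattern with Python's struct module rather than C. The 8-byte header (version, message type, payload length, sequence number) is a made-up layout for illustration and does not correspond to any real specification.

```python
import struct
from collections import namedtuple

# Hypothetical header layout, network byte order ("!"), no padding:
#   B  version         (1 byte)
#   B  message type    (1 byte)
#   H  payload length  (2 bytes)
#   I  sequence number (4 bytes)
HEADER = struct.Struct("!BBHI")

Packet = namedtuple("Packet", "version msg_type seq payload")


def parse_packet(data):
    """Unpack the fixed header, then slice out the payload it describes."""
    if len(data) < HEADER.size:
        raise ValueError("buffer shorter than header")
    version, msg_type, length, seq = HEADER.unpack_from(data, 0)
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("payload truncated")
    return Packet(version, msg_type, seq, payload)
```

In C the same steps would typically be done by copying fields out of the receive buffer and converting them with ntohs and ntohl, rather than casting the buffer to a struct, since padding and alignment vary between compilers.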

How does Gmail recognize email signatures (alternatively, "What's the best way to recognize email signatures?")

Gmail automatically greys text that looks like a signature. Anyone have any guesses how it does this? (I've noticed that it depends on the presence of the sender's name, but I think that's only part of the story). I ask because I'm working on a web application that has an email interface, and I'd like to remove users' signatures before ...

How to parse XML in JavaScript from Google

I want to parse this XML document: http://www.google.de/ig/api?weather=Braunschweig,%20Deutschland I want to be able to read out condition, temp_c and humidity. I want to do all this inside JavaScript without using any server-side scripts such as PHP, and I want it to work on modern browsers as well as IE7 and if without many problems ...

What is the fastest way to parse large XML docs in Python?

I am currently using the following code based on Chapter 12.5 of the Python Cookbook: from xml.parsers import expat class Element(object): def __init__(self, name, attributes): self.name = name self.attributes = attributes self.cdata = '' self.children = [] def addChild(self, element): self.chi...
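An alternative to building a full tree by hand on top of expat is the standard library's incremental parser, xml.etree.ElementTree.iterparse, which lets each element be discarded once it has been handled. This is a different technique from the Cookbook code quoted above, not a completion of it, and the function name `count_elements` is made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def count_elements(source, tag):
    """Stream through an XML source and count elements with the given tag."""
    count = 0
    # "end" events fire once an element, including its children, is fully parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
        # Drop the element's text, attributes and children to keep memory use low.
        elem.clear()
    return count


# Usage with an in-memory document; a filename or open file works the same way.
document = io.BytesIO(b"<root><item>a</item><item>b</item><other/></root>")
print(count_elements(document, "item"))
```

Whether this beats a hand-written expat handler on a given document would need measuring; the main benefit is usually bounded memory on large files.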

How can I get a frame's content with mshtml?

Here's the issue: I have a hook in IE that reacts to the WebBrowser.OnNavigateComplete2 event to parse the content of the document for some precise info. That document contains frames, so I look into the HTMLDocument.frames. For each one, I look into the document.body.outerHTML property to check for the content. Problem is, the string I'...

Open-source parser code for Mediawiki markup

I'm interested in selectively parsing Mediawiki XML markup to generate a customized HTML page that's some subset of the HTML produced by the actual PHP Mediawiki render engine. I want it for BzReader, an offline Mediawiki compressed dump reader written in C#. So a C# parser would be ideal, but any good code would help. Of course, if n...

Best way to parse a dynamic text list in PHP.

I have below a list of text, it is from a popular online game called EVE Online and this basically gets mailed to you when you kill a person in-game. I'm building a tool to parse these using PHP to extract all relevant information. I will need all pieces of information shown and I'm writing classes to nicely break it into relevant encap...

Dynamic logical expression parsing/evaluation in PHP?

I have a need to evaluate user-defined logical expressions of arbitrary complexity on some PHP pages. Assuming that form fields are the primary variables, it would need to: substitute "variables" for form field values; handle comparison operators, minimally ==, <, <=, >= and > by symbol, name (e.g. eq, lt, le, ge, gt respectively); handl...

XML parsing and transformation in PHP?

I have a custom XML schema defined for page display that puts elements on the page by evaluating XML elements on the page. This is currently implemented using the preg regex functions, primarily the excellent preg_replace_callback function, eg: ... $s = preg_replace_callback("!<field>(.*?)</field>!", replace_field, $s); ... function r...