parsing

What is the best way to parse strings in Java?

I have some friends making a text-based game in Java (what the hell?), and they're looking for the best way to parse strings for commands. They've come across many methods and are wondering what would be the best way to go about things....

Split a string ignoring quoted sections

Given a string like this: a,"string, with",various,"values, and some",quoted What is a good algorithm to split this based on commas while ignoring the commas inside the quoted sections? The output should be an array: [ "a", "string, with", "various", "values, and some", "quoted" ] ...
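One possible approach to the split described above is a single pass that tracks whether the current character is inside quotes. This is only a minimal sketch: the function name `split_quoted` is made up for illustration, and it assumes quotes are never escaped or doubled inside a field.

```python
def split_quoted(line, delimiter=",", quote='"'):
    """Split line on delimiter, ignoring delimiters inside quoted sections.

    Assumption: quotes are plain toggles (no escaped or doubled quotes).
    """
    fields = []
    current = []
    in_quotes = False
    for ch in line:
        if ch == quote:
            # Entering or leaving a quoted section; the quote itself is dropped.
            in_quotes = not in_quotes
        elif ch == delimiter and not in_quotes:
            # A delimiter outside quotes ends the current field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))  # the last field has no trailing delimiter
    return fields


print(split_quoted('a,"string, with",various,"values, and some",quoted'))
# ['a', 'string, with', 'various', 'values, and some', 'quoted']
```

For real CSV data with escaped quotes or embedded newlines, Python's standard `csv` module is probably the safer choice than a hand-rolled loop like this.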

Looking for algorithm that reverses the sprintf() function output

I am working on a project that requires the parsing of log files. I am looking for a fast algorithm that would take groups of messages like this: Input: The temperature at P1 is 35F. The temperature at P1 is 40F. The temperature at P3 is 35F. Logger stopped. Logger started. The temperature at P1 is 40F. and puts out something in th...

Delimited string parsing framework for .NET

I'm looking at parsing a delimited string, something on the order of a,b,c But this is a very simple example, and parsing delimited data can get complex; for instance 1,"Your simple algorithm, it fails",True would blow your naive string.Split implementation to bits. Is there anything I can freely use/steal/copy and paste that offer...

C# Save Dialogs

What would be the easiest way to separate the directory name from the file name when dealing with SaveFileDialog.FileName in C#? ...

Parse usable Street Address, City, State, Zip from a string

Problem: I have an address field from an Access database which has been converted to SQL Server 2005. This field has everything all in one field. I need to parse out the individual sections of the address into their appropriate fields in a normalized table. I need to do this for approximately 4,000 records and it needs to be repeatable. ...

How can I learn about parser combinators?

I've found a few resources on the subject, but they all require a deep understanding of Smalltalk or Haskell, neither of which I know. ...

.Net Parse versus Convert

In .Net you can read a string value into another data type using either .parse or Convert.To. I'm not familiar with the fundamentals of parse versus convert so I am always at a loss when asked which one is better/faster/more appropriate. So - which way is best in what type of circumstances? ...

Best Approach to Parse for SQL in PHP Files?

For my senior thesis, I developed a program that would automatically detect and suggest fixes to SQL injection vulnerabilities using prepared statements. Specifically the mysqli extension for PHP. My question for the SO community is this: What would your preferred approach be to detect the SQL in PHP source code? I used an enum contai...

Resolving reduce/reduce conflict in yacc/ocamlyacc

I'm trying to parse a grammar in ocamlyacc (pretty much the same as regular yacc) which supports function application with no operators (like in OCaml or Haskell), and the normal assortment of binary and unary operators. I'm getting a reduce/reduce conflict with the '-' operator, which can be used both for subtraction and negation. Here ...

How do I put unicode characters in my Antlr grammar?

I'm trying to build a grammar with the following: NUMERIC: INTEGER | FLOAT | INFINITY | PI ... INFINITY: '∞' PI: 'π' But Antlr refuses to load the grammar. ...

Where do I get the Antlr Ant task?

I'm trying to call an Antlr task in my Ant build.xml as follows: <path id="classpath.build"> <fileset dir="${dir.lib.build}" includes="**/*.jar" /> </path> ... <target name="generate-lexer" depends="init"> <antlr target="${file.antlr.lexer}"> <classpath refid="classpath.build"/> </antlr> </target> But Ant can't find the ta...

What HTML parsing libraries do you recommend in Java?

I want to parse some HTML in order to find the values of some attributes/tags etc. What HTML parsers do you recommend? Any pros and cons? ...

Equation (expression) parser with precedence?

I've developed an equation parser using a simple stack algorithm that will handle binary (+, -, |, &, *, /, etc) operators, unary (!) operators, and parentheses. Using this method, however, leaves me with everything having the same precedence - it's evaluated left to right regardless of operator, although precedence can be enforced usin...
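One common way to get operator precedence without a grammar generator is precedence climbing. The sketch below is not the asker's stack algorithm; it is a swapped-in technique shown for comparison. The precedence table, the integer-only operands, and the reading of `!` as logical not are all assumptions made for illustration.

```python
import operator
import re

# Hypothetical precedence table: symbol -> (precedence, function).
# A higher number binds tighter; the exact levels are an assumption.
BINARY_OPS = {
    "|": (1, operator.or_),
    "&": (2, operator.and_),
    "+": (3, operator.add),
    "-": (3, operator.sub),
    "*": (4, operator.mul),
    "/": (4, operator.floordiv),  # integer division, to keep the sketch integer-only
}

TOKEN = re.compile(r"\s*(\d+|[-+*/|&!()])")


def tokenize(text):
    """Turn the input into a list of number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text):
    """Evaluate an integer expression using precedence climbing."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_atom():
        if peek() is None:
            raise ValueError("unexpected end of input")
        token = advance()
        if token == "(":
            value = parse_expression(1)
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return value
        if token == "!":
            # Assumption: '!' is logical not, yielding 0 or 1.
            return int(not parse_atom())
        if token.isdigit():
            return int(token)
        raise ValueError(f"unexpected token {token!r}")

    def parse_expression(min_prec):
        left = parse_atom()
        # Consume operators only while they bind at least as tightly as min_prec.
        while peek() in BINARY_OPS and BINARY_OPS[peek()][0] >= min_prec:
            prec, func = BINARY_OPS[advance()]
            # prec + 1 makes every binary operator left-associative.
            right = parse_expression(prec + 1)
            left = func(left, right)
        return left

    result = parse_expression(1)
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result


print(evaluate("1 + 2 * 3"))    # 7
print(evaluate("(1 + 2) * 3"))  # 9
```

Changing precedence or adding an operator only means editing the table, which is the main reason to prefer this shape over hard-coding each level in its own function.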

Parsing XML using unix terminal

Sometimes I need to quickly extract some arbitrary data from XML files to put into a CSV format. What are your best practices for doing this in the Unix terminal? I would love some code examples, so for instance how can I get the following problem solved? Example XML input: <root> <myel name="Foo" /> <myel name="Bar" /> </root> My desi...

Parsing: where can I learn about it?

I've been given a job of 'translating' one language into another. The source is too flexible (complex) for a simple line by line approach with regex. Where can I go to learn more about lexical analysis and parsers? ...

Resources for lexing, tokenising and parsing in python

Can people point me to resources on lexing, parsing and tokenising with Python? I'm doing a little hacking on an open source project (hotwire) and wanted to do a few changes to the code that lexes, parses and tokenises the commands entered into it. As it is real working code it is fairly complex and a bit hard to work out. I haven't w...

Does C# have built-in support for parsing page-number strings?

The C# newbie has another simple question! Does C# have built-in support for parsing strings of page numbers? By page numbers, I mean the format you might enter into a print dialog that's a mixture of comma and dash-delimited. Something like this: 1,3,5-10,12 What would be really nice is a solution that gave me back some kind of li...
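The page-number format described above is simple enough to parse by hand. Here is a rough sketch, written in Python rather than C# purely for illustration. The function name `parse_page_ranges` is hypothetical, and returning a flat list of integers is an assumption, since the question text is cut off before it states the desired result.

```python
def parse_page_ranges(spec):
    """Expand a print-dialog style spec such as '1,3,5-10,12' into page numbers.

    Assumptions: pages are positive integers and ranges are inclusive.
    """
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as '1,,3'
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages


print(parse_page_ranges("1,3,5-10,12"))
# [1, 3, 5, 6, 7, 8, 9, 10, 12]
```

Malformed parts such as `a-b` surface as the `ValueError` raised by `int`, which may or may not be the error handling you want.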

Learning Resources on Parsers, Interpreters, and Compilers

I've been wanting to play around with writing my own language for a while now (ostensibly for the learning experience) and as such need to be relatively grounded in the construction of Parsers, Interpreters, and Compilers. So: Does anyone know of any good resources on constructing Parsers, Interpreters, and Compilers? EDIT: I'm not l...

An easy way to diff log files, ignoring the time stamps?

I need to diff two log files but ignore the time stamp part of each line (the first 12 characters to be exact). Is there a good tool, or a clever awk command, that could help me out? ...
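If a script is acceptable instead of a dedicated tool, one option is to drop the first 12 characters of every line and diff what is left. This is a minimal sketch using Python's standard `difflib`; the helper names and the sample log lines are invented for illustration.

```python
import difflib


def strip_prefix(lines, width=12):
    """Drop the first `width` characters (the time stamp) from every line."""
    return [line[width:] for line in lines]


def diff_ignoring_timestamps(left_lines, right_lines, width=12):
    """Return unified-diff lines for two logs, ignoring the time stamp prefix."""
    return list(
        difflib.unified_diff(
            strip_prefix(left_lines, width),
            strip_prefix(right_lines, width),
            fromfile="left",
            tofile="right",
            lineterm="",
        )
    )


# Made-up sample data; each time stamp is exactly 12 characters wide.
left = ["12:00:00.000 started", "12:00:01.000 temp 35"]
right = ["13:10:00.000 started", "13:10:01.000 temp 40"]

for line in diff_ignoring_timestamps(left, right):
    print(line)
```

For real files, read each one with `open(path).read().splitlines()` and pass the two lists in. Lines that differ only in their time stamp compare as equal, so they show up as context rather than as changes.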