parsing

How can fractional number expressions be parsed using pyparsing?

We've just started to kick the tires on pyparsing and like it so far, but we've been unable to get it to help us parse fractional number strings and turn them into numeric data types. For example, if a column value in a database table contained the string: 1 1/2 We'd like some way to convert it into the numeric Python equivalent: 1.5 We...
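
A minimal sketch of the conversion described above, using the standard library's fractions module rather than pyparsing. The helper name parse_mixed_number is hypothetical, and the sketch assumes non-negative input whose whitespace-separated parts are either whole numbers or simple fractions ("2", "3/4", "1 1/2"):

```python
from fractions import Fraction


def parse_mixed_number(text):
    """Convert strings such as '1 1/2', '3/4', or '2' to a float.

    Hypothetical helper: assumes non-negative, whitespace-separated parts.
    """
    parts = text.split()
    if not parts:
        raise ValueError("empty string")
    # Fraction accepts both '1' and '1/2' as string input,
    # so the whole part and the fractional part can simply be summed.
    return float(sum(Fraction(part) for part in parts))


print(parse_mixed_number("1 1/2"))  # 1.5
```

The same per-part conversion could be attached to a pyparsing grammar as a parse action; that wiring is not shown here.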

Alternate XML parser implementations for Android

I am looking for an XML parsing solution for Android besides the built-in kXML pull parser. I am trying to parse a large (4MB+) XML file downloaded from a server and the kXML parser throws an OutOfMemory error after trying to allocate a 1MB+ byte array while parsing. A good streaming XML parser shouldn't be allocating such a big array!! ...

BaseAdapter Class used with Json Parsing

Hi all! I have parsed the JSON response and now I want to use the BaseAdapter class in my application. I have a rough idea about the BaseAdapter class but am not very clear on it. Can anybody please tell me what exactly the base class does? Also, do I need to use the getter and setter methods if I am using the BaseAdapter class i...

Parsing Nested Text in C#

If I have a series of strings that have this base format: "[id value]"//id and value are space delimited. id will never have spaces They can then be nested like this: [a] [a [b value]] [a [b [c [value]]] So every item can have 0 or 1 value entries. What is the best approach to go about parsing this format? Do I just use stuff li...
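
One possible approach to the bracket format above is a small recursive-descent parser. The sketch below is in Python rather than C#, the function name parse_item is hypothetical, and it assumes each item is "[id]", "[id value]", or "[id [nested item]]" with a single space after the id:

```python
def parse_item(text, pos=0):
    """Parse one bracketed item starting at text[pos].

    Returns ((id, value), next_pos), where value is None, a string,
    or a nested (id, value) tuple. Hypothetical sketch of the format.
    """
    if pos >= len(text) or text[pos] != "[":
        raise ValueError(f"expected '[' at position {pos}")
    pos += 1

    # The id runs up to the first space or closing bracket.
    start = pos
    while pos < len(text) and text[pos] not in " ]":
        pos += 1
    ident = text[start:pos]

    value = None
    if pos < len(text) and text[pos] == " ":
        pos += 1
        if pos < len(text) and text[pos] == "[":
            # Nested item: recurse and continue after it.
            value, pos = parse_item(text, pos)
        else:
            # Plain value: everything up to the closing bracket.
            start = pos
            while pos < len(text) and text[pos] != "]":
                pos += 1
            value = text[start:pos]

    if pos >= len(text) or text[pos] != "]":
        raise ValueError(f"expected ']' at position {pos}")
    return (ident, value), pos + 1


item, _ = parse_item("[a [b value]]")
print(item)  # ('a', ('b', 'value'))
```

Unbalanced input raises ValueError instead of being silently accepted.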

Free POP3 mail component

Right now I'm working on a project which will read mail from a POP3 inbox and save the attachments to a specified folder. I'm looking for a free POP3 mail component which can be used with .NET 3.5. Please recommend an easy-to-use POP3 component; it would be great if it is open source. ...

How to create a Date object from a date string in JavaScript for the first years A.D.?

Hi folks, I have the following date string: "Thu Nov 14 0002 01:01:00 GMT+0200 (GTB Standard Time)" and I'm trying to convert it to a Date object: date = new Date("Thu Nov 14 0002 01:01:00 GMT+0200 (GTB Standard Time)") => Invalid Date {} and it doesn't work. And date = new Date("Thu Nov 14 2 01:01:00 GMT+0200 (GTB Standard Time)")...

AsyncTask in Android

Hi team! Can anybody tell me about the AsyncTask class used in Android applications? Currently I am working on an application where I have to create a class which just gets the response from a given URL. In this particular class I was told to perform this task by making use of AsyncTask. I had been getting very quick r...

Parsing a PDF file using regular expressions in Python

Hello, I am trying to parse some object elements from a PDF file using the re module of Python. My goal is to parse each PDF object using a regular expression. A PDF object example is the following: 1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj 2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj ... When I u...
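
A minimal sketch of matching objects like the ones quoted above with the re module. The names PDF_OBJECT and find_objects are hypothetical, and the pattern assumes plain-text objects; it is not a general PDF parser and would likely break on binary streams that happen to contain "endobj":

```python
import re

# "<object number> <generation> obj ... endobj".
# re.DOTALL lets "." span newlines; the lazy ".*?" stops at the first "endobj".
PDF_OBJECT = re.compile(r"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)

sample = (
    "1 0 obj << /Type /Catalog /Pages 2 0 R >> endobj "
    "2 0 obj << /Type /Pages /Kids [ 3 0 R ] /Count 1 >> endobj"
)


def find_objects(data):
    """Return (object number, generation, body) for each object found."""
    return [
        (int(num), int(gen), body.strip())
        for num, gen, body in PDF_OBJECT.findall(data)
    ]


for obj in find_objects(sample):
    print(obj)
```

The lazy quantifier matters here: a greedy ".*" would swallow everything from the first "obj" to the last "endobj" as a single match.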

C# Strict Int/Float parsing

I'm wondering how to implement strict parsing of a string in C#. Specifically, I want to require that the string contains just the number (in culture-specific format), and no other characters. "1.0" -> 1.0 "1.5 " -> fail " 1.5" -> fail " 1,5" -> fail (if culture is with ".") "1,5" -> fail Cheers! ...

Java Date Parse from one to another format

Is there a way whereby the date format can be determined in Java similar to .Net? Consider the following example: private String reformatDateString(String dateParam){ if(dateParam == null || dateParam.isEmpty()){ return null; } try{ SimpleDateFormat inDateFormat = new SimpleDateFormat("yyyy-MM-dd"); ...

Human- and computer-readable hierarchical data format with inheritance across files

I'm looking for a data format for text files with hierarchical information. These files will be created primarily by human input (rather than generated by programs), but will be read primarily by programs. The main requirements are: Very simple and uncluttered syntax. (Example: levels of hierarchy defined by tabs would work fine.) So s...

Picking out symbols from a code base with Python

Hi all. Given a code base (say, for example, a large C or Objective-C project) I would like to analyze the source code files and pick out symbols of interest. They might be class declarations, variable names or types, or method names. Is there a Python module that could help me with this? The only approach I can see going forward is to u...

Requiring existence of XML elements without schema in TinyXML

I am trying to implement a short converter using TinyXML that will take an XML file (with fixed format), parse it, and populate a protobuf object with the elements. Problem is, some elements are optional in the protobuf definition and TinyXML does not have schema support. What would be a simple way to parse the elements robustly taking...

Question on how to develop and then parse a data structure

I'm designing a weather program where I need to keep track of certain things and allow the user to add data which will be saved and subsequently read later. My fields are City State Zip Metar I might have more I want to do with this configuration file later, so I would like it to have something like this: [LOCATIONS] Phoenix:AZ:85001:...

Parse Lines of Single Words and Groups of Words Inside Quotes Using Regular Expressions in Ruby

I'm trying to figure out how to better parse lines of text that have values that look like this: line1 'Line two' fudgy whale 'rolly polly' fudgy 'line three' whale fudgy whale 'line four' 'line five' 'fish heads' line six I wish to use a single regular expression to display the desired output. I already know how to kludge ...
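
A single alternation can handle both cases described above: a quoted group or a bare word. The sketch below is in Python rather than Ruby, the names TOKEN and tokenize are hypothetical, and it assumes the desired output is one token per bare word or per quoted group with the quotes removed (the question's expected output is not shown in the excerpt):

```python
import re

# First alternative: a single-quoted group, captured without the quotes.
# Second alternative: a run of non-whitespace characters (a bare word).
TOKEN = re.compile(r"'([^']*)'|(\S+)")


def tokenize(line):
    """Split a line into bare words and quoted groups (quotes removed)."""
    # findall yields (quoted, bare) pairs; exactly one side is non-empty
    # unless the input contains an empty pair of quotes.
    return [quoted or bare for quoted, bare in TOKEN.findall(line)]


print(tokenize("line1 'Line two' fudgy whale 'rolly polly'"))
# ['line1', 'Line two', 'fudgy', 'whale', 'rolly polly']
```

The same pattern should carry over to Ruby's String#scan with little change, since it relies only on capture groups and alternation.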

What is the runtime difference between different parsing algorithms?

There are lots of different parsing algorithms out there (recursive descent, LL(k), LR(k), LALR, ...). I find a lot of information about the different grammars different types of parser can accept. But how do they differ in runtime behavior? Which algorithm is faster, uses less memory or stack space? Or to put this differently - which ...

(x)HTML: Parsing bizarre tags

I am building my own humble (x)html parser. All is ok, but some doctype tags break it. Let me show you: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" [ <!ENTITY D "&#x2014;"> <!ENTITY o "&#x2018;"> <!ENTITY c "&#x2019;"> <!ENTITY O "&#x201C;"> <!ENTITY C "&#x201D...

Example using WikipediaTokenizer in Lucene

Hi, I want to use WikipediaTokenizer in a Lucene project - http://lucene.apache.org/java/3_0_2/api/contrib-wikipedia/org/apache/lucene/wikipedia/analysis/WikipediaTokenizer.html But I have never used Lucene. I just want to convert a Wikipedia string into a list of tokens. But I see that there are only four methods available in this class, end...

Groovy: How can I change an XmlParser to an XML format?

Hi, I've used XmlParser to change the attributes of some nodes in my XML file. Some code: def temp = groovyUtils.getXmlHolder( "testAddress CUY#ResponseAsXML") def aux = temp.getXml(); def lang = new XmlParser().parseText(aux) lang.prov[0].description[0].setValue('newDesciption') After doing that I have something like " root[at...

Trying to use HPSG PET Parser

Hi, I'm trying to use the PET Parser, but the documentation given for usage is insufficient. Can anyone point me to a good article or tutorial on using PET? Does it support UTF-8? ...