parsing

Parse text file and assign parts to columns

I receive a tab-delimited text file that must be parsed. Once parsed, the parts must be assigned to specific columns. Here is an example of the code I'm using to do this: string path = "C:\\Users\\Robert\\Desktop\\Test.txt"; FileInfo fileInfo = new FileInfo(path); using (StreamReader streamReader = fileInfo.OpenText()) ...

Protocol charset conflict, ESMTP vs. XML in an email body

We have a process in which XML is transferred to us via ESMTP in an email body. The character set of the email body is specified as ISO-8859-1, and no encoding is specified for the XML. According to the protocol, the default is UTF-8. The problem is our XML parser is throwing an exception when it encounters the ® character because it ...
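A minimal sketch of the kind of mismatch described above, assuming (the excerpt is cut off, so this is only one plausible reading) that the parser decodes the body as UTF-8 while the bytes are actually ISO-8859-1. The helper name `decode_body` is hypothetical.

```python
# In ISO-8859-1 the registered sign (U+00AE) is the single byte 0xAE.
raw = "\u00ae".encode("iso-8859-1")

def decode_body(raw_bytes, declared_charset="iso-8859-1"):
    """Hypothetical helper: decode with the charset declared for the email body."""
    return raw_bytes.decode(declared_charset)

# Decoding the same byte as UTF-8 fails, because 0xAE alone is not valid UTF-8.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(utf8_ok)                          # False
print(decode_body(raw) == "\u00ae")     # True
```

If this is the cause, decoding the body with the declared charset before handing text to the XML parser is one way around it.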

Ruby on Rails: How to parse a user-entered string for URLs and safely display it?

I'm adding a feature to a web app where users can click a button to enter a link, and then paste in an address. I then want to parse the string entered and extract the domain from the URL so that I can display the domain separately next to the link. The idea here is something similar to what Slashdot does, where links are display...

Are there any tutorials on building a simple interpreter using Alex + Happy?

I'm working on a school project where I have to build an interpreter for a simple language using Alex + Happy in Haskell. After looking through the documentation I understand most of it, but I would like to see a full-blown example of using the tools. ...

Parsing a line with a variable number of entries in C or C++ (no boost)

I have a file containing lines of the form:
double mass, string seq, int K, int TS, int M, [variable number of ints]
688.83 AFTDSK 1 1 0 3384 2399 1200
790.00 MDSSTK 1 3 1 342 2
I need a (preferably simple) way of parsing this file without boost. If the number of values per line ...

build error with boost spirit grammar (boost 1.43 and g++ 4.4.1) part III

OK, I am trying to build a grammar, and currently it looks like this: #ifndef _INPUTGRAMMAR_H #define _INPUTGRAMMAR_H #include <boost/config/warning_disable.hpp> #include <boost/spirit/include/qi.hpp> #include <boost/spirit/include/phoenix_core.hpp> #include <boost/spirit/include/phoenix_operator.hpp> #include <boost/spirit/include/phoe...

Interpreter in Python: Making your own programming language?

Remember, this is using Python. Well, I was fiddling around today with an app I made called Pyline. It is a command-line-like interface with some cool features. However, I had an idea while making it: since it's like an "OS", won't it have its own language? Well, I have seen some articles online on how to make an interpreter, and parser, ...

python search from tag

Hi, I need help with Python programming: I need a command that can find all the words between tags in a text file. For example, the text file has <concept> food </concept>. I need to find all the words between <concept> and </concept> and display them. Can anybody help, please? ...
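One way this could be approached is with a non-greedy regular expression. The sketch below assumes the tags are never nested and always appear exactly as `<concept>` and `</concept>`. The function name `find_concepts` and the sample text are made up for illustration.

```python
import re

# Non-greedy match; DOTALL lets the content span several lines.
CONCEPT_RE = re.compile(r"<concept>(.*?)</concept>", re.DOTALL)

def find_concepts(text):
    """Return the whitespace-stripped text between each <concept>...</concept> pair."""
    return [match.strip() for match in CONCEPT_RE.findall(text)]

sample = "I like <concept> food </concept> and <concept>music</concept>."
print(find_concepts(sample))  # ['food', 'music']
```

For a real file, the text could be read with `open(path).read()` and passed to the same function. For arbitrary or nested markup, a proper parser is usually a safer choice than a regex.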

Error with parse function in lxml

Hi all! I have installed lxml 2.2.2 on Windows (I'm using Python 2.6.5). I tried this simple command: from lxml.html import parse p = parse('http://www.google.com').getroot() but I am getting the following error: Traceback (most recent call last): File "", line 1, in p=parse('http://www.google.com').getroot() File "C:...

Are there any C# libraries for screen scraping?

Hi, there are lots of open source screen scraping libraries for Python and PHP. However, I couldn't find any .NET counterpart. Could you recommend a library for screen scraping, or just HTML parsing, that makes life easier? ...

Problem parsing a datetime

Hi, I have a problem when trying to parse a datetime in a format like "1.00:29:00" (1 = days, 29 = minutes). After invoking DateTime.Parse I get "String was not recognized as a valid DateTime". Thanks in advance for any suggestions. ...

C++ program parsing arguments

I want to make a program to be called from the command line like this: myprogram.exe -F/-T FILE/TEXT -W FILE.WAV -P FILE.PHO -A There are 3 parts: myprogram.exe -F OR -T and the File or text -W FILE -P FILE and -A (At least one, up to 3, in any order (or not, if it's complicated)) So it can be: myprogram.exe -T "Text Text, test te...

Do any of the C# RSS readers support reading custom fields?

I have an RSS feed that has a number of other custom fields. Do any of the C# libraries (RSS.NET, etc.) support reading these fields? I can't seem to find any reference to this. What is the easiest way to parse XML from an RSS feed and include custom fields? ...

Can we access the Qt Creator internal parser from a plugin?

Hi, I would like to make a plugin for Qt Creator, and I want access to the parsed files (AST) in Qt Creator, for example to retrieve the type of a variable when you right-click on it. I looked at the docs and did not find anything very significant, and I fear having to parse the page myself. Has anyone tried this and succeeded? :p ...

Parsing Java Class From Perl or Python

I want to take a .java file, recognize the first class in the file, and get information about annotations, methods and attributes from this class. Is there any module in either language that already does that? I could also build a simple regexp to do it, but I don't know how to recognize in the regexp the braces indicating the end ...

parsing a line of text to get a specific number

I have a line of text in the form " some spaces variable = 7 = '0x07' some more data" I want to parse it and get the number 7 from "some variable = 7". How can this be done in Python? ...
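A minimal sketch of one possible approach, assuming the wanted value is always the first integer that directly follows an `=` sign. The function name `first_assigned_int` is hypothetical.

```python
import re

# Assumption: the number of interest is the first integer right after an "=".
ASSIGN_RE = re.compile(r"=\s*(-?\d+)")

def first_assigned_int(line):
    """Return the first integer following '=', or None if there is none."""
    match = ASSIGN_RE.search(line)
    return int(match.group(1)) if match else None

line = "   some spaces variable = 7 = '0x07' some more data"
print(first_assigned_int(line))  # 7
```

The second `=` is skipped because it is followed by a quote character, not a digit. If the layout of the line can vary, the pattern would need to be tightened, for example by anchoring on the variable name.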

Better practice to strcpy() or point to another data structure?

Because it's always easier to see code... My parser fills this object: typedef struct pair { char* elementName; char* elementValue; } pair; My interpreter wants to read that object and fill this one: typedef struct thing { char* label; } thing; Should I do this: thing.label = pair.elementName; or this: thing.label = (char*)...

What is the best HTML parser for Java?

Assuming we have to use Java, what is the best HTML parser that is flexible enough to parse lots of different HTML content and does not require a lot of code for complex kinds of parsing? ...

How to create a parser which tokenizes a list of words taken from a file?

Hi, I am trying to write a syntax text corrector for my compilers class. The idea is: I have some rules, which are inherent to the language (in my case, Portuguese), like "A valid phrase is SUBJECT VERB ADJECTIVE", as in "Ruby is great". OK, so first I have to tokenize the input "Ruby is great". So I have a text file "verbs", with a lot ...

Parse a custom file in C#

Should I be using regular expressions to do this? Is it possible to structure the results as something queryable (IEnumerable, etc.)? I have a file, and I cannot change how it is generated. I wish to create a parser class to extract all the data. Ideally, I would like to then use this class to open the file and have it return a queryable array type structur...