parsing

clever way to conditionally split this string?

I've got a string that could be in one of two forms: prefix=key=value (which could have any characters, including '=') or key=value. So I need to split it on either the first or the second equals sign, based on a boolean that gets set elsewhere. I'm doing this: if ($split_on_second) { $parts = explode('=', $str, 3); $key = $par...
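A minimal Python sketch of one way to approach this, not the asker's PHP: split at most n times, then rejoin everything before the n-th '='. The function name split_on_nth_equals is hypothetical, and the sketch assumes that splitting on the second sign should keep prefix=key together as the left-hand part.

```python
def split_on_nth_equals(text, split_on_second):
    """Split text at the first or second '=' (hypothetical helper).

    Everything after the chosen '=' is returned untouched, so it may
    itself contain '=' characters.
    """
    n = 2 if split_on_second else 1
    parts = text.split("=", n)  # at most n splits, so at most n + 1 pieces
    if len(parts) <= n:
        raise ValueError(f"expected at least {n} '=' in {text!r}")
    left = "=".join(parts[:n])  # rejoin the pieces before the n-th '='
    right = parts[n]
    return left, right


if __name__ == "__main__":
    print(split_on_nth_equals("prefix=key=va=lue", True))
    print(split_on_nth_equals("key=va=lue", False))
```

Capping the number of splits leaves the remainder of the string intact, which avoids any special handling for '=' inside the value.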

Arabica's documentation (XML and HTML processing toolkit)

Beyond the doxygen file and the code examples, is there any documentation or tutorials for Arabica? I just can't find anything. Update: I gave up on Arabica. In the meantime I've also tried Xerces; the docs are better but the interface is just awful. So I settled on rapidXML and I'll look into pugixml later. ...

Problem while fetching xml through some rss feed

I was fetching XML from an RSS feed. I am unable to reach nested items the way I easily reached "channel -> description" with: NSString *resultValue=[[responseDictionary valueForKeyPath:@"rss.channel.description"] textContent]; The result of the above is: YouTube RSS Feed. My question is how I can parse .... item -> description... i.e. (Music vi...

Config file format

Hello, does anyone know a file format for configuration files that is easy for humans to read? I want to have something like tag = value, where value may be: String, Number (int or float), Boolean (true/false), Array (of String values, Number values, Boolean values), or another structure (it will be clearer what I mean in the following example). Now I...

Java static source analysis/parsing (possibly with antlr), what is a good tool to do this?

I need to perform static source analysis on Java code. Ideally, I want the system to work out of the box without much modification from me. For example, I have used Antlr in the past, but I spent a lot of time building grammar files and still didn't get what I wanted. I want to be able to parse a java file and have return the charact...

How do i make my web browser made in wx.python to parse pages(ex.Google.ro)

Can somebody help me? Please, I really need to parse at least Google. I need to parse the page at a URL. I've made a web browser, and this web browser doesn't parse pages. It's made in wxPython. ...

parsing of mathematical expressions

(in C90, on Linux) input: sqrt(2 - sin(3*A/B)^2.5) + 0.5*(C*~(D) + 3.11 +B) a b /*there are values for a,b,c,d */ c d input: cos(2 - asin(3*A/B)^2.5) +cos(0.5*(C*~(D)) + 3.11 +B) a b /*there are values for a,b,c,d */ c d input: sqrt(2 - sin(3*A/B)^2.5)/(0.5*(C*~(D)) + sin(3.11) +ln(B)) /*max length of formula is 250 characters...

Sorting a google calendar feed (parsing with DOM)

I'm embedding dates from google calendar into a website, and it's all working, with the exception of sorting. For some reason, it sorts into reverse-chronological order, when I'd really just like it to be normal chronological (first event first). this is the output: August 11th: Intern depart August 6th: Last Day of Summer Camp July ...

What's the best way to explain parsing to a new programmer?

I am a college student getting my Computer Science degree. A lot of my fellow students really haven't done a lot of programming. They've done their class assignments, but let's be honest here those questions don't really teach you how to program. I have had several other students ask me questions about how to parse things, and I'm n...

C# parsing txt files IF name format is desired format

OK, I have txt files that I am parsing and saving into a sql db. The names are formatted like R306025COMP_272A4075_20090929_080159.txt However, there are a select few (out of thousands of files) with names that are formatted differently (particularly files that were generated as tests), example R306025COMP_SU2_TestBottom_20090915_101...

Trouble parsing some RSS feeds using Java and Sax

I've written an RSS feed parser in Java (running on Android) and it parses some feeds perfectly, and others not at all. I get the following error when it tries to parse Slashdot (http://rss.slashdot.org/Slashdot/slashdot) org.apache.harmony.xml.ExpatParser$ParseException: At line 1, column 0: unbound prefix If I try to parse Wired (h...

Loading data from file to Vector structure

I'm trying to parse a fixed-width formatted file, extracting the x,y values of points from it and then storing them as int[] arrays inside a Vector. The text file looks as follows: 0006 0015 0125 0047 0250 0131 This is the code: Vector<int[]> vc = new Vector<int[]>(); try { BufferedReader file = new BufferedR...

Parse Text using scanner useDelimiter

Looking to parse the following text file: Sample text file: <2008-10-07>text entered by user<Ted Parlor><2008-11-26>additional text entered by user<Ted Parlor> I would like to parse the above text so that I can have three variables: v1 = 2008-10-07 v2 = text entered by user v3 = Ted Parlor v1 = 2008-11-26 v2 = additional text entered...
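As a rough illustration, here is a Python sketch that uses a regular expression in place of the Java Scanner.useDelimiter approach named in the title. It assumes every record has exactly the shape <date>text<name> and that neither the text nor the name contains angle brackets. The function name parse_entries is hypothetical.

```python
import re

# One record: <date>free text<name>. The character class [^<>] keeps each
# group from running past the next angle bracket.
RECORD = re.compile(r"<([^<>]*)>([^<>]*)<([^<>]*)>")


def parse_entries(text):
    """Return a list of (date, text, name) tuples (hypothetical helper)."""
    return RECORD.findall(text)


if __name__ == "__main__":
    sample = (
        "<2008-10-07>text entered by user<Ted Parlor>"
        "<2008-11-26>additional text entered by user<Ted Parlor>"
    )
    for v1, v2, v3 in parse_entries(sample):
        print(v1, "|", v2, "|", v3)
```

If user-entered text may itself contain '<' or '>', this pattern would need to change, for example by anchoring on the date format.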

parse youtube video id using preg_match

Hi, I am attempting to parse the video ID of a youtube URL using preg_match. I found a regular expression on this site that appears to work; (?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+ As shown in this pic: My PHP is as follows, but it doesn't work (gives Unknown modifier '[' error)... <? $subject = "http://www....

Top-Down Parsing Implement in c#

Hi, I want to implement top-down parsing in C#. Is there any source to show me the way? I mean a good method and description to implement, or algorithms. For example, here are some steps to implement: 1- developing a grammar 2- LL(1) parsing 3- constructing the parser. These steps are in order, and order is very important to give you...

Regex to extract portions of file name

I have text files with names formatted like this: R156484COMP_004A7001_20100104_065119.txt I need to consistently extract the R****COMP part, the 004A7001 number, and 20100104 (the date); I don't care about the 065119 number. The problem is that not ALL of the files being parsed follow the exact naming convention. Some may be like this: R168166CRIT_156B2075_S...
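A hedged Python sketch of the extraction for the one file name that is shown in full. The pattern is an assumption inferred from that single example (an 'R', digits and uppercase letters, then an alphanumeric field, then an 8-digit date, then a trailing number). It does not attempt the variant naming scheme, which is cut off in the excerpt. The function name parse_name is hypothetical.

```python
import re

# Assumed layout: R<digits><LETTERS>_<ALNUM>_<YYYYMMDD>_<digits>.txt
NAME = re.compile(r"(R\d+[A-Z]+)_([0-9A-Z]+)_(\d{8})_\d+\.txt")


def parse_name(filename):
    """Return (unit, number, date), or None if the name does not fit."""
    match = NAME.fullmatch(filename)
    return match.groups() if match else None


if __name__ == "__main__":
    print(parse_name("R156484COMP_004A7001_20100104_065119.txt"))
    print(parse_name("notes.txt"))  # hypothetical non-matching name
```

Returning None for names that do not fit lets the caller skip or log them instead of storing bad values.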

xml parsing in adobe alchemy

Hello, can you provide an example of how to parse an XML file in Adobe Alchemy? I'm trying to work with expat, but I've had no luck passing byte arrays to and from the C code. Do I need to pass the byte array of the file to Alchemy, or is it enough to pass the filename? Thanks. cbs ...

YQL Open Data Table for Wikipedia

Has anyone written a YQL open data table for accessing Wikipedia? I've had a hunt around the internet and found mention of people using YQL for extracting various bits of information from Wikipedia pages such as microformats, links or content but I haven't been able to find an open data table that ties it all together. ...

Lexing partial SQL in C#

I need to parse partial SQL queries (it's for a SQL injection auditing tool). For example '1' AND 1=1-- should break down into tokens like [0] => [SQL_STRING, '1'] [1] => [SQL_AND] [2] => [SQL_INT, 1] [3] => [SQL_AND] [4] => [SQL_INT, 1] [5] => [SQL_COMMENT] [6] => [SQL_QUERY_END] Are there any lexers for SQL, at least, that I base...

Python/YACC: Resolving a shift/reduce conflict

I'm using PLY. Here is one of my states from parser.out: state 3 (5) course_data -> course . (6) course_data -> course . course_list_tail (3) or_phrase -> course . OR_CONJ COURSE_NUMBER (7) course_list_tail -> . , COURSE_NUMBER (8) course_list_tail -> . , COURSE_NUMBER course_list_tail ! shift/reduce conflict for...