parsing

Extra items on stack when parsing CFF font data

I've written a few routines to parse CFF font data. Occasionally I am getting extra items on the stack when processing an hvcurveto and vvcurveto command. For these two commands the stack depth should be either 4, 5, 12, 13, 20, 21, ... or 8, 9, 16, 17, 24, 25, ... For some fonts I'm getting a stack size of 10. There's an extra...

What is the best way to manually parse an XElement into custom objects?

I have an XElement variable named content which consists of the following XML: <content> <title>Contact Data</title> <p>This is a paragraph this will be displayed in front of the first form.</p> <form idCode="contactData" query="limit 10; category=internal"/> <form idCode="contactDataDetail" query="limit 10; category=int...

JSON parse error with double quotes

A double quote, even if escaped, is throwing a parse error. Look at the code below: //parse the json in javascript var testJson = '{"result": ["lunch", "\"Show\""] }'; var tags = JSON.parse(testJson); alert (tags.result[1]); This is throwing a parse error because of the double quotes (which are already escaped). Even eval() won't work ...

Help with Shift/Reduce conflict - Trying to model (X A)* (X B)*

I'm trying to model the EBNF expression ("declare" "namespace" ";")* ("declare" "variable" ";")* I have built up the yacc (I'm using MPPG) grammar, which seems to represent this, but it fails to match my test expression. The test case I'm trying to match is declare variable; The token stream from the lexer is KW_Declare KW_Varia...

How do I get the sum of all content when parsing an XML tag in Ruby?

I have some XHTML (but really any XML will do) like this: <h1> Hello<span class='punctuation'>,</span> <span class='noun'>World<span class='punctuation'>!</span> </h1> How do I get the full content of the <h1/> as a String in Ruby? As in: assert_equal "Hello, World!", h1_node.some_method_that_aggregates_all_content Do any of t...

Making YACC output an AST (token tree)

Is it possible to make YACC (or in my case MPPG) output an Abstract Syntax Tree (AST)? All the stuff I'm reading suggests it's simple to make YACC do this, but I'm struggling to see how you know when to move up a node in the tree as you're building it. ...

Regex help, greedy vs. non-greedy

Hey all, I have a large HTML string like <a style="background: rgb(100, 101, 43) none repeat scroll 0% 0%; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial;" href="#">swatch4</a> <a style="background: rgb(34, 68, 33) none repeat scroll 0% 0%; -moz-background-clip...

General Address Parser for Freeform Text

We have a program that displays map data (think Google Maps, but with much more interactivity and custom layers for our clients). We allow navigation via a set of combo boxes that prefill certain fields with a bunch of data (ie: Country: Canada, the Province field is filled in. Select Ontario, and a list of Counties/Regions is filled i...

Composable Grammars

There are so many programming languages which support the inclusion of mini-languages. PHP is embedded within HTML. XML can be embedded within JavaScript. Linq can be embedded within C#. Regular expressions can be embedded in Perl. // JavaScript example var a = <node><child/></node> Come to think of it, most programming languages can ...

How can I use Perl to get a list of CSS elements with a color or background color attribute?

Hello, I asked a question earlier today regarding using Perl to search in a CSS document. I have since refined my requirements a little bit, and have a better idea of what I am trying to do. The document I am searching through is actually an .html doc with CSS as a style in the <head>, if that makes sense. Basically, what I need to d...

Parse VCALENDAR (ics) with Objective-C

I'm looking for an easy way to parse VCALENDAR data with Objective-C. Specifically, all I am concerned with is the FREEBUSY data (see below): BEGIN:VCALENDAR VERSION:2.0 METHOD:REPLY PRODID:-//CALENDARSERVER.ORG//NONSGML Version 1//EN BEGIN:VFREEBUSY UID:XYZ-DONT-CARE DTSTART:20090605T070000Z DTEND:20090606T070000Z ATTENDEE:/principals/...

How to resolve a shift-reduce conflict in unambiguous grammar

I'm trying to parse a simple grammar using an LALR(1) parser generator (Bison, but the problem is not specific to that tool), and I'm hitting a shift-reduce conflict. The docs and other sources I've found about fixing these tend to say one or more of the following: If the grammar is ambiguous (e.g. if-then-else ambiguity), change the l...

Handling Token Ambiguity in JavaCC

I'm attempting to write a parser in JavaCC that can recognize a language that has some ambiguity at the token level. In this particular case the language supports the "/" token by itself as a division operator while it also supports regular expression literals. Consider the following JavaCC grammar: TOKEN : { ... < VAR : "var...

Parsing numbers in Python

I want to take inputs like this: 10 12 13 14 15 16 .. How do I take this input as two different integers so that I can multiply them in Python? After every 10 and 12 there is a newline ...

Are there faster XML parsers in Java than Xalan/Xerces?

I haven't found many ways to increase the performance of a Java application that does intensive XML processing other than to leverage hardware such as Tarari or Datapower. Does anyone know of any open source ways to accelerate XML parsing? ...

Read Lua-like code in PHP

I have a question. I have code like this, and I want to read it with PHP. NAME { title ( A_STRING ); settings { SetA( 15, 15 ); SetB( "test" ); } desc { Desc ( A_STRING ); Cond ( A_STRING ); } } I want: $arr['NAME']['title'] = "A_STRI...

Determine if a String is a valid date before parsing

I have this situation where I am reading about 130K records containing dates stored as String fields. Some records contain blanks (nulls), some contain strings like this: 'dd-MMM-yy' and some contain this 'dd/MM/yyyy'. I have written a method like this: public Date parsedate(String date){ if(date !== null){ try{ 1. cr...

XML Validation in configuration parsing code with XMLSchema: Best Practices? (Java)

Hi everyone! I have to parse an XML configuration file. I decided to do the basic validation with XMLSchema to avoid massive amounts of boilerplate code for validation. Since the parsing class should work by itself, I was wondering: "How can I validate the incoming XML file with XMLSchema without having the XMLSchema stored on disk where...

Produce a sentence from a grammar with a given number of terminals

Say you've got a toy grammar, like: (updated so the output looks more natural) S -> ${NP} ${VP} | ${S} and ${S} | ${S}, after which ${S} NP -> the ${N} | the ${A} ${N} | the ${A} ${A} ${N} VP -> ${V} ${NP} N -> dog | fish | bird | wizard V -> kicks | meets | marries A -> red | striped | spotted e.g., "the dog kicks the red wizard...

Different ways for parsing XML

I'm a beginner when it comes to XML. I created a simple XML file and tried to parse it and assign the values into variables. It worked but the method I used made me wonder if there're better ways, more elegant if you will, for this task. Are there any? Here's my XML file: <start> <record> <var1>hello</var1> <var2>world</var2> </record>...