I need to quickly build a parser for a very simplified version of an HTML-like markup language in Java. In Python, I would use the pyparsing library to do this. Is there something similar for Java? Please don't suggest existing HTML-parsing libraries; my application is a school assignment which will demonstrate walking a tree of...
I am trying to use Python's pyparsing library and got stuck while building a recursive parser.
Let me explain the problem.
I want to compute the Cartesian product of the elements. The syntax is:
cross({elements },{element})
To put it more specifically:
cross({a},{c1}) or cross({a,b},{c1}) or cross({a,b,c,d},{c1}) or
So the general f...
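For the flat forms listed above, a minimal sketch could look like the following. It uses only the standard library (a regular expression plus itertools.product) rather than pyparsing, and parse_cross is a hypothetical helper name. It does not attempt the general form, which is cut off in the question.

```python
import itertools
import re

# Hypothetical helper: handles only the flat form cross({...},{...}).
# Nested or more general forms are deliberately not covered.
_CROSS_RE = re.compile(r"\s*cross\(\s*\{([^{}]*)\}\s*,\s*\{([^{}]*)\}\s*\)\s*")


def parse_cross(text):
    """Return the Cartesian product of the two brace-delimited element lists."""
    match = _CROSS_RE.fullmatch(text)
    if match is None:
        raise ValueError("not a flat cross() expression: %r" % text)
    # Split each group on commas and drop surrounding whitespace.
    left, right = (
        [item.strip() for item in group.split(",") if item.strip()]
        for group in match.groups()
    )
    return list(itertools.product(left, right))


print(parse_cross("cross({a,b},{c1})"))
```

Each result is a tuple pairing one element from the first set with one from the second.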
I saw in the Google App Engine documentation that ANTLR 3 (http://www.antlr.org/) is used as the third-party parsing library.
But from what I know, pyparsing seems easier to use, and I am only aiming to parse some simple syntax.
Is there an alternative? Can I get pyparsing working on the App Engine?
...
I've tried taking this code and converting it to something for a project I'm working on for programming language processing, but I'm running into an issue with a simplified version:
op = oneOf( '+ - / *')
lparen, rparen = Literal('('), Literal(')')
expr = Forward()
expr << ( Word(nums) | ( expr + op + expr ) | ( lparen + expr + rparen)...
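One likely culprit in the grammar above is the alternative `expr + op + expr`: it starts with expr itself (left recursion), which top-down parsers generally cannot handle without special support. A common workaround is to restate the rule as "an operand followed by zero or more (operator, operand) pairs". The sketch below shows that shape as a hand-written recursive-descent parser in plain Python, not pyparsing. All function names are hypothetical, and operator precedence is ignored.

```python
import re

# One token is a run of digits, an operator, or a parenthesis.
TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")
OPERATORS = ("+", "-", "*", "/")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError("unexpected character at position %d" % pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_expr(tokens, pos=0):
    # expr := operand (op operand)*   -- no rule starts with expr itself
    node, pos = parse_operand(tokens, pos)
    result = [node]
    while pos < len(tokens) and tokens[pos] in OPERATORS:
        op = tokens[pos]
        node, pos = parse_operand(tokens, pos + 1)
        result.extend([op, node])
    return result, pos


def parse_operand(tokens, pos):
    # operand := number | "(" expr ")"
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        inner, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        return inner, pos + 1
    if tok.isdigit():
        return tok, pos + 1
    raise ValueError("unexpected token %r" % tok)


def parse(text):
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise ValueError("unexpected token %r" % tokens[pos])
    return tree


print(parse("1 + (2 * 3)"))
```

Parenthesised sub-expressions come back as nested lists; everything at one nesting level stays in a flat list.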
I'm trying to write something that will parse some code. I'm able to successfully parse foo(spam) and spam+eggs, but foo(spam+eggs) (recursive descent? my terminology from compilers is a bit rusty) fails.
I have the following code:
from pyparsing_py3 import *
myVal = Word(alphas+nums+'_')
myFunction = myVal + '(' + delimitedList( ...
I want to create a SQL interface on top of a non-relational data store. The data store is non-relational, but it makes sense to access the data in a relational manner.
I am looking into using ANTLR to produce an AST that represents the SQL as a relational algebra expression. I would then return data by evaluating/walking the tree.
I have never implem...
What I'm especially interested in is the ability to define the grammar in the code as ordinary code without any unnecessary cruft.
I'm aware I could use IronPython. I don't want to.
UPDATE:
To further explain what I'm looking for, I'm including some sample pyparsing code. This is an incomplete parser to convert Emacs shortcut keys to ...
Here is a subset of the Python grammar:
single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
stmt: simple_stmt | compound_stmt
simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
small_stmt: pass_stmt
pass_stmt: 'pass'
compound_stmt: if_stmt
if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
suite: s...
I'm trying to parse words which can be broken up over multiple lines with a backslash-newline combination ("\\n") using pyparsing. Here's what I have done:
from pyparsing import *
continued_ending = Literal('\\') + lineEnd
word = Word(alphas)
split_word = word + Suppress(continued_ending)
multi_line_word = Forward()
multi_line_word << ...
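As a point of comparison, the same goal can be reached without a grammar by removing every backslash-newline pair before splitting into words. This is a plain-Python sketch, not the pyparsing approach from the question, and join_continued_words is a hypothetical name.

```python
import re


def join_continued_words(text):
    """Glue words split by backslash-newline, then return all words."""
    # "\\\n" is a literal backslash followed by a newline character.
    joined = text.replace("\\\n", "")
    # Words are runs of ASCII letters, mirroring Word(alphas).
    return re.findall(r"[A-Za-z]+", joined)


sample = "hel\\\nlo wor\\\nld plain"
print(join_continued_words(sample))
```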
I'm using pyparsing to parse HTML. I'm grabbing all embed tags, but in some cases there's an a tag directly following that I also want to grab if it's available.
example:
import pyparsing
target = pyparsing.makeHTMLTags("embed")[0]
target.setParseAction(pyparsing.withAttribute(src=pyparsing.withAttribute.ANY_VALUE))
target.ignore(pypar...
Using pyparsing, I am trying to create a very simple parser for the S-Expression language. I have written a very small grammar.
Here is my code:
from pyparsing import *
alphaword = Word(alphas)
integer = Word(nums)
sexp = Forward()
LPAREN = Suppress("(")
RPAREN = Suppress(")")
sexp << ( alphaword | integer | ( LPAREN + ZeroOr...
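To make the recursive structure of the grammar concrete, here is a hand-written sketch in plain Python (not pyparsing). It accepts the same three shapes: an alphabetic word, an integer, or a parenthesised list of further S-expressions. The names tokenize, parse_sexp and read are hypothetical.

```python
import re

# Tokens: a parenthesis, a run of letters, or a run of digits.
# Note: findall silently skips any other characters.
TOKEN_RE = re.compile(r"[()]|[A-Za-z]+|\d+")


def tokenize(text):
    return TOKEN_RE.findall(text)


def parse_sexp(tokens, pos=0):
    """Parse one S-expression starting at tokens[pos]; return (tree, next_pos)."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        items = []
        pos += 1
        # Zero or more nested expressions until the closing parenthesis.
        while pos < len(tokens) and tokens[pos] != ")":
            item, pos = parse_sexp(tokens, pos)
            items.append(item)
        if pos >= len(tokens):
            raise ValueError("missing ')'")
        return items, pos + 1
    if tok == ")":
        raise ValueError("unexpected ')'")
    return (int(tok) if tok.isdigit() else tok), pos + 1


def read(text):
    tokens = tokenize(text)
    tree, pos = parse_sexp(tokens)
    if pos != len(tokens):
        raise ValueError("trailing input after expression")
    return tree


print(read("(add 1 (mul 2 3))"))
```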
I am new to Python and pyparsing. I need to accomplish the following.
My sample line of text is like this:
12 items - Ironing Service 11 Mar 2009 to 10 Apr 2009
Washing service (3 Shirt) 23 Mar 2009
I need to extract the item description, period
tok_date_in_ddmmmyyyy = Combine(Word(nums,min=1,max=2)+ " " + Word(alphas, exact=3)...
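For the two sample lines above, a rough sketch with the standard library might look like this. It is not pyparsing, and it assumes that the description is everything before the first date, that dates always look like "11 Mar 2009", and that month abbreviations are English. The name extract is hypothetical.

```python
import re
from datetime import datetime

# One or two digits, a three-letter month abbreviation, a four-digit year.
DATE_RE = re.compile(r"\b\d{1,2} [A-Za-z]{3} \d{4}\b")


def extract(line):
    """Return (description, list_of_dates) for one line of text."""
    first = DATE_RE.search(line)
    if first is None:
        raise ValueError("no date found in %r" % line)
    description = line[:first.start()].strip()
    # %b is locale-dependent; English month names are assumed here.
    period = [
        datetime.strptime(text, "%d %b %Y").date()
        for text in DATE_RE.findall(line)
    ]
    return description, period


for sample in (
    "12 items - Ironing Service 11 Mar 2009 to 10 Apr 2009",
    "Washing service (3 Shirt) 23 Mar 2009",
):
    print(extract(sample))
```

A line with two dates yields a two-element period (start and end); a line with one date yields a single-element list.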
I'm building a parser for an imaginary programming language called C-- (not the actual C-- language). I've gotten to the stage where I need to translate the language's grammar into something pyparsing can accept. Unfortunately, when I come to parse my input string (which is correct and should not cause pyparsing to error) it's not parsing ...
python/pyparsing
When I use the scanString method, it gives the start and end locations of the matched token in the text.
e.g.
line = "cat bat"
pat = Word(alphas)
for i in pat.scanString(line):
    print(i)
I get the following:
((['cat'], {}), 0, 3)
((['bat'], {}), 4, 7)
But the end location of cat should be "2", right? Why is it repor...
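The numbers shown above are consistent with the usual Python convention of half-open ranges: the end location is the index just past the match, so it can be used directly in a slice. The sketch below illustrates that convention with re.finditer standing in for scanString (an assumption made so that the example needs only the standard library).

```python
import re

line = "cat bat"

# re.finditer is a stand-in for scanString here; it reports spans with
# an exclusive end index, the same convention as Python slices.
spans = [(m.group(), m.start(), m.end()) for m in re.finditer(r"[A-Za-z]+", line)]
print(spans)

for word, start, end in spans:
    # The end index is exclusive, so slicing recovers the token exactly...
    assert line[start:end] == word
    # ...and end - start is the token's length.
    assert end - start == len(word)
```

With an inclusive end of 2, `line[0:2]` would give only "ca".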
I am parsing a file with Python and pyparsing (it's the report file for PSAT in MATLAB, but that isn't important). Here is what I have so far. I think it's a mess and would like some advice on how to improve it. Specifically, how should I organise my grammar definitions with pyparsing?
Should I have all my grammar definitions in one fun...
Pythonistas:
Suppose you want to parse the following string using Pyparsing:
'ABC_123_SPEED_X 123'
where ABC_123 is an identifier, SPEED_X is a parameter, and 123 is a value. I thought of the following BNF using Pyparsing:
Identifier = Word( alphanums + '_' )
Parameter = Keyword('SPEED_X') or Keyword('SPEED_Y') or Keyword('SPEED_Z')
...
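One thing worth checking in the Parameter line: Python's `or` is a boolean operator that simply returns its first truthy operand, so it cannot combine objects into alternatives. pyparsing builds alternatives with the overloaded `|` operator instead. The sketch below shows the difference using a hypothetical stand-in class (Pattern is not a pyparsing class).

```python
class Pattern:
    """Hypothetical stand-in for a parser element; not a pyparsing class."""

    def __init__(self, *names):
        self.names = names

    def __or__(self, other):
        # "|" is overloadable, so it can build a combined alternative.
        return Pattern(*(self.names + other.names))

    def matches(self, word):
        return word in self.names


# "or" cannot be overloaded: it returns the first truthy operand unchanged.
with_or = Pattern("SPEED_X") or Pattern("SPEED_Y") or Pattern("SPEED_Z")
# "|" calls __or__ and combines all three alternatives.
with_pipe = Pattern("SPEED_X") | Pattern("SPEED_Y") | Pattern("SPEED_Z")

print(with_or.names)
print(with_pipe.names)
print(with_or.matches("SPEED_Y"), with_pipe.matches("SPEED_Y"))
```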
I need to be able to take a formula that uses the OpenDocument formula syntax, parse it into syntax that Python can understand, but without evaluating the variables, and then be able to evaluate the formula many times with changing values for the variables.
Formulas can be user input, so pyparsing allows me to both effectively handle ...
The majority of pyparsing examples that I have seen have dealt with linear expressions.
a = 1 + 2
I'd like to parse MediaWiki headlines and hash them to their sections.
e.g.
Introduction goes here
==Hello==
foo
foo
===World===
bar
bar
Dict would look like:
{'Introduction':'Whoot introduction goes here', 'Hello':"foo\nfoo", 'World...
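A line-by-line sketch of that mapping, using only the standard library rather than pyparsing, could look like this. It assumes that headlines sit alone on a line with matching runs of "=" on both sides, and that text before the first headline is filed under a default "Introduction" key. The name sections and the intro_key parameter are hypothetical.

```python
import re

# A headline is "=" repeated, a title, then the same run of "=" again.
HEADLINE_RE = re.compile(r"^(=+)\s*(.+?)\s*\1\s*$")


def sections(text, intro_key="Introduction"):
    """Map each headline title to the text that follows it."""
    result = {}
    current, body = intro_key, []
    for line in text.splitlines():
        match = HEADLINE_RE.match(line)
        if match:
            # Close the previous section and start a new one.
            result[current] = "\n".join(body)
            current, body = match.group(2), []
        else:
            body.append(line)
    result[current] = "\n".join(body)
    return result


wiki = "Introduction goes here\n==Hello==\nfoo\nfoo\n===World===\nbar\nbar"
print(sections(wiki))
```

Headline depth is ignored and a repeated title overwrites the earlier entry, so this only covers the flat dictionary described in the question.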
I want to be able to pull out the type and count of letters from a piece of text where the letters could be in any order. There is some other parsing going on which I have working, but this bit has me stumped!
input -> result
"abc" -> [['a',1], ['b',1],['c',1]]
"bbbc" -> [['b',3],['c',1]]
"cccaa" -> [['a',2],['c',3]]
I could use se...
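For reference, the input/result pairs above can be reproduced outside of pyparsing with collections.Counter. This is a minimal sketch of the expected output only, not of how to integrate it into an existing grammar; letter_counts is a hypothetical name.

```python
from collections import Counter


def letter_counts(text):
    """Return [[letter, count], ...] sorted by letter, ignoring non-letters."""
    counts = Counter(ch for ch in text if ch.isalpha())
    return [[letter, counts[letter]] for letter in sorted(counts)]


for sample in ("abc", "bbbc", "cccaa"):
    print(sample, "->", letter_counts(sample))
```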
TLDR: if I build a multipurpose parser by hand with different code for each format, will it work better in the long run than using one chunk of parser code and an ANTLR, pyparsing or similar grammar to specify each format?
Context:
My job involves lots of benchmark log files from ~50 different benchmarks. There are a few in XML, a few HTML,...