I'm using the Lucene Highlighter to highlight the matches that I have found in a Lucene index. My problem is this: if I search multiple fields of a document and need to display the matching text, how can I find out which field the hit occurred in?
The code which I am using for the highlighter is basically the second functi...
How can I implement an eliminator for this?
A := AB |
AC |
D |
E ;
...
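Assuming the "eliminator" asked about here means the standard removal of immediate left recursion (the excerpt is truncated, so this is a guess), the rule above would be rewritten as `A := D A' | E A'` plus `A' := B A' | C A' | epsilon`. A minimal Python sketch of that transformation follows; the function name and the list-of-symbols grammar representation are hypothetical, chosen only for illustration.

```python
def eliminate_immediate_left_recursion(name, alternatives):
    """Rewrite  A := A a1 | ... | b1 | ...  as
         A  := b1 A' | ...
         A' := a1 A' | ... | epsilon
    Each alternative is a list of symbols; [] stands for epsilon.
    """
    # Alternatives that start with the rule's own name, minus that leading symbol.
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == name]
    # Alternatives that do not start with the rule's own name.
    others = [alt for alt in alternatives if not alt or alt[0] != name]
    if not recursive:
        return {name: alternatives}  # nothing to eliminate
    tail = name + "'"  # hypothetical name for the helper rule
    return {
        name: [alt + [tail] for alt in others],
        tail: [alt + [tail] for alt in recursive] + [[]],
    }


rules = eliminate_immediate_left_recursion(
    "A", [["A", "B"], ["A", "C"], ["D"], ["E"]]
)
for lhs, alts in rules.items():
    print(lhs, ":=", " | ".join(" ".join(a) or "epsilon" for a in alts))
```

This handles only immediate left recursion in a single rule; indirect left recursion across several rules would need an extra substitution step first.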
Hello everyone,
I have searched the web for days now but I can't seem to find a good solution to my problem:
For one of my projects I'm looking for a good (lightweight) MIME parser. My customer provides MIME formatted files (linear, no hierarchy) which contain 3-4 "parts". The application must be able to split those parts and process t...
For example, we have an XML file with this format:
<A>
<B>
<C></C>
<D></D>
<D></D>
</B>
</A>
I need the following:
if all "D" elements are empty, then the whole "A" element must be deleted,
and, of course, this must be done for every "A" element in the XML.
...
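One way this rule could be expressed is sketched below with Python's `xml.etree.ElementTree`. The excerpt does not say which language or library is in use, so this is only an illustration. It assumes the "A" elements are direct children of a wrapper element (here a hypothetical `<root>`), that "empty" means no text and no child elements, and that an "A" with no "D" elements at all is kept.

```python
import xml.etree.ElementTree as ET


def drop_a_with_only_empty_d(root):
    """Remove every <A> child of root whose <D> descendants are all empty."""
    for a in list(root.findall("A")):  # copy the list: we mutate root below
        ds = a.findall(".//D")
        # Assumption: "empty" = no text (ignoring whitespace) and no children.
        if ds and all(not (d.text or "").strip() and len(d) == 0 for d in ds):
            root.remove(a)
    return root


# Hypothetical sample: the first <A> has only empty <D>s, the second does not.
doc = ET.fromstring(
    "<root>"
    "<A><B><C></C><D></D><D></D></B></A>"
    "<A><B><C></C><D>x</D><D></D></B></A>"
    "</root>"
)
drop_a_with_only_empty_d(doc)
print(ET.tostring(doc, encoding="unicode"))
```

If the "A" elements can be nested deeper in the real file, the parent of each one would have to be tracked, since `ElementTree` elements do not keep a reference to their parent.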
I want to write a parser for EDIFACT messages with JavaCC.
My problem is that I cannot define all terminal symbols before parsing a message because at the beginning of each message there is a so-called "Advice Segment" ("UNA" Segment) which defines things like element separator symbol, escape symbol, segment terminator symbol and decimal ...
I have written a small function which uses ElementTree and XPath to extract the text contents of certain elements in an XML file:
#!/usr/bin/env python2.5
import doctest
from xml.etree import ElementTree
from StringIO import StringIO
def parse_xml_etree(sin, xpath):
"""
Takes as input a stream containing XML and an XPath expression...
I am trying to parse a remote XML document (from Amazon AWS):
<ItemLookupResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2009-03-31">
<OperationRequest>
<RequestId>011d32c5-4fab-4c7d-8785-ac48b9bda6da</RequestId>
<Arguments>
<Argument Name="Condition" Value="New"></Argument>
...
Hi! I am trying to make an app that displays an RSS feed, with text and images, in a table, but I am really struggling with it!
I found a really good [sample code-project][1] that I can really recommend, but I'm struggling to get it to display images in the table cells instead of only text.
I would be really happy with any help!
...
I have a situation where something can appear in a format as follows:
---id-H--
Header: data
Another Header: more data
Message: sdasdasdasd
Message: asdasdasdasd
Message: asdasdasd
There may be many messages, or just a couple. I'd prefer not having to step outside of RegEx, because I am using the RegEx to parse some header information...
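As a rough illustration of staying inside a regular expression, the sketch below applies one "Name: value" pattern repeatedly and collects every `Message` line. It uses Python's `re` module purely for demonstration (the excerpt does not name the regex flavor), and the variable names are hypothetical.

```python
import re

text = """---id-H--
Header: data
Another Header: more data
Message: sdasdasdasd
Message: asdasdasdasd
Message: asdasdasd
"""

# One pattern, matched once per line: everything before the first colon is
# the name, the rest of the line is the value. Lines without a colon
# (such as the "---id-H--" marker) simply do not match.
line_re = re.compile(r"^(?P<name>[^:\n]+):[ \t]*(?P<value>.*)$", re.MULTILINE)

headers = {}
messages = []
for m in line_re.finditer(text):
    if m.group("name") == "Message":
        messages.append(m.group("value"))  # any number of messages
    else:
        headers[m.group("name")] = m.group("value")

print(headers)
print(messages)
```

Iterating over matches avoids having to capture a variable number of groups in a single match, which most regex engines do not support directly.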
I'm having issues getting a small spirit/qi grammar to compile. I am using boost 1.43 and g++ 4.4.1.
The input grammar header:
The build error seems to be pointing to the definition of the 'instruction' rule; maybe it is the '[sp::_val = sp::_1]' that somehow breaks it, but this is more or less based on what the spirit documentation tuto...
Hi Guys,
I'm looking to construct a script that would go through an XML file, find specific tags in it, put them in a table, and fill the table with specific tags within them. I'm using MySQL 5.1, so loadXML isn't an option, and I think that the ExtractData() method won't be much use either, but I don't really know. What would be the bes...
In ANTLR, if I have a rule, for example:
someRule : TOKENA TOKENB;
it would accept: "tokena tokenb"
If I would like TOKENA to be optional, I can say,
someRule : TOKENA* TOKENB;
Then I can have: "tokena tokenb" or "tokenb" or "tokena tokena tokenb"
but this also means it can be repeated more than once. Is there any way I can say t...
Hello
I've got supervisor's status output, looking like this.
frontend RUNNING pid 16652, uptime 2:11:17
nginx RUNNING pid 16651, uptime 2:11:17
redis RUNNING pid 16607, uptime 2:11:32
I need to extract nginx's PID. I've done it via grep -P comman...
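For comparison, the same extraction can be sketched in Python with a single anchored regular expression over the status text shown above. The helper name `pid_of` is hypothetical, and the pattern assumes the status lines always have the `name RUNNING pid N,` shape seen in the sample.

```python
import re

status = """frontend RUNNING pid 16652, uptime 2:11:17
nginx RUNNING pid 16651, uptime 2:11:17
redis RUNNING pid 16607, uptime 2:11:32
"""


def pid_of(process_name, status_output):
    """Return the PID of a RUNNING process from the status text, or None."""
    # Anchor at the start of a line so "nginx" does not match inside
    # another process name; escape the name in case it has regex characters.
    pattern = r"^%s\s+RUNNING\s+pid\s+(\d+)," % re.escape(process_name)
    match = re.search(pattern, status_output, re.MULTILINE)
    return int(match.group(1)) if match else None


print(pid_of("nginx", status))  # 16651
```

A process that is not in the RUNNING state has no PID in this output, so the function returns `None` for it.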
I'm using an LL(k) EBNF grammar to parse a character stream. I need three different types of tokens:
CHARACTERS
letter = 'A'..'Z' + 'a'..'z' .
digit = "0123456789" .
messageChar = '\u0020'..'\u007e' - ' ' - '(' - ')' .
TOKENS
num = ['-'] digit { digit } [ '.' digit { digit } ] .
ident = letter { letter | digit | '_' } .
...
My regex skills are not very good, and recently a new data element has thrown my parser for a loop.
Take the following string:
"+USER=Bob Smith-GROUP=Admin+FUNCTION=Read/FUNCTION=Write"
Previously I had the following for my regex : [+\\-/]
Which would turn the result into
USER=Bob Smith
GROUP=Admin
FUNCTION=Read
FUNCTION=Write
FUNCT...
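To make the existing behavior concrete, here is a small sketch of the character-class split applied to the sample string, written in Python for illustration only (the doubled backslash in the original pattern suggests another host language, which the excerpt does not name). It reproduces the current splitting and does not attempt to address the new data element, which is cut off in the excerpt.

```python
import re

s = "+USER=Bob Smith-GROUP=Admin+FUNCTION=Read/FUNCTION=Write"

# Split on any one of the delimiter characters '+', '-' or '/'.
# The string starts with a delimiter, so the first piece is empty;
# filter empty pieces out.
parts = [p for p in re.split(r"[+\-/]", s) if p]

for part in parts:
    print(part)
```

Because every `+`, `-` and `/` is treated as a delimiter, any value that itself contains one of those characters would be broken apart by this pattern.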
I am testing various methods to read (possibly large, and very often) XML configuration files in PHP. No writing is ever needed. I have two successful implementations, one using SimpleXML (which I know is a DOM parser) and one using XMLReader.
I know that a DOM reader must read the whole tree and therefore uses more memory. My tests ...
Hey awesome SO users,
I have an Android application that parses an XML file for users and displays results in a much more mobile friendly format. The app works great for most users, but some users have lots and lots of data and the app crashes on them because it runs out of memory.
Is there any way I can have a DOM-style XML parser quit pa...
I'm writing a Python script to process emails returned from Procmail. As suggested in this question, I'm using the following Procmail config:
:0:
|$HOME/process_mail.py
My process_mail.py script is receiving an email via stdin like this:
From hostname Tue Jun 15 21:43:30 2010
Received: (qmail 8580 invoked from network); 15 Jun 2010 2...
I'm working with a service that provides data as a Lisp-like S-Expression string. This data is arriving thick and fast, and I want to churn through it as quickly as possible, ideally directly on the byte stream (it's only single-byte characters) without any backtracking. These strings can be quite lengthy and I don't want the GC churn ...
Given an arbitrary string, for example "I'm going to play croquet next Friday" or "Gadzooks, is it 17th June already?", how would you go about extracting the dates from it?
If this is looking like a good candidate for the too-hard basket, perhaps you could suggest an alternative. I want to be able to parse Twitter messages for date...