I need to parse recipe ingredients into amount, measurement, item, and description, as applicable to the line, for example "1 cup flour", "the peel of 2 lemons" and "1 cup packed brown sugar". What would be the best way of doing this? I am interested in using Python for the project, so I am assuming NLTK is the best bet, but I am open t...
Our web application sends e-mails. We have lots of users, and we get lots of bounces. For example, a user changes companies and their company e-mail is no longer valid.
To find bounces, I parse the SMTP log file with log parser.
Some bounces are great, like 550+#[email protected]. There is [email protected] in the bounce.
But som...
Since I've started to use jQuery, I have been doing a lot more JavaScript development.
I need to parse different date formats and then display them in another format.
Do you know of any good tool to do this?
Which one would you recommend?
...
I've been wondering about how hard it would be to write some Python code to search a string for the index of a substring of the form ${expr}, for example, where expr is meant to be a Python expression or something resembling one. Given such a thing, one could easily imagine going on to check the expression's syntax with compile(), evalu...
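One way the search described above could be sketched, under stated assumptions: `find_placeholders` is a hypothetical helper name, braces are matched by simple depth counting (so a brace inside a string literal within the expression would confuse it), and the only syntax check is the built-in `compile()` in `"eval"` mode.

```python
def find_placeholders(text):
    """Return (start_index, expression) pairs for each ${...} span in text
    whose body compiles as a Python expression.

    Assumption: braces are balanced by depth counting only; braces inside
    string literals in the expression are not treated specially.
    """
    results = []
    pos = 0
    while True:
        start = text.find("${", pos)
        if start == -1:
            break
        depth = 1
        end = start + 2
        # Walk forward until the brace opened by "${" is closed.
        while end < len(text) and depth:
            if text[end] == "{":
                depth += 1
            elif text[end] == "}":
                depth -= 1
            end += 1
        if depth:
            break  # unterminated placeholder: stop scanning
        # Strip surrounding whitespace so compile() does not see an indent.
        expr = text[start + 2:end - 1].strip()
        try:
            compile(expr, "<placeholder>", "eval")
        except (SyntaxError, ValueError):
            pass  # not a valid expression: skip it
        else:
            results.append((start, expr))
        pos = end
    return results
```

For instance, `find_placeholders("x = ${a + b}!")` reports the index of the `$` together with the expression text, which could then be passed on to whatever evaluation step follows.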
How can I convert a Google search query to something I can feed PostgreSQL's to_tsquery()?
If there's no existing library out there, how should I go about parsing a Google search query in a language like PHP?
For example, I'd like to take the following Google-ish search query:
("used cars" OR "new cars") -ford -mistubishi
And turn ...
There are a number of email regexp questions popping up here, and I'm honestly baffled why people are using these insanely obtuse matching expressions rather than a very simple parser that splits the email up into the name and domain tokens, and then validates those against the valid characters allowed for name (there's no further check ...
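The split-then-validate idea described above could look roughly like the following sketch. The helper name `split_email` and the two character sets are assumptions made for illustration (the excerpt is cut off before it says exactly which characters are allowed), so this is not a complete address validator.

```python
import string

# Assumed character sets, for illustration only.
LOCAL_CHARS = set(string.ascii_letters + string.digits + "!#$%&'*+/=?^_`{|}~.-")
DOMAIN_CHARS = set(string.ascii_letters + string.digits + "-.")


def split_email(address):
    """Split an address into (name, domain) at the last '@' and check each
    token against its allowed character set. Returns None if either check fails."""
    name, sep, domain = address.rpartition("@")
    if not sep or not name or not domain:
        return None  # no '@', or an empty token on one side of it
    if not set(name) <= LOCAL_CHARS:
        return None  # name contains a character outside the assumed set
    if not set(domain) <= DOMAIN_CHARS:
        return None  # domain contains a character outside the assumed set
    return name, domain
```

Calling `split_email("jane.doe@example.com")` yields the two tokens, while an address with a space or without an `@` yields `None`.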
I have read the GOLD Homepage ( http://www.devincook.com/goldparser/ ) docs, FAQ and Wikipedia to find out what practical application there could possibly be for GOLD. I was thinking along the lines of having a programming language (easily) available to my systems such as ABAP on SAP or X++ on Axapta - but it doesn't look feasible to me,...
Hi,
I am wondering - What's the most effective way of parsing something like:
{{HEADER}}
Hello my name is {{NAME}}
{{#CONTENT}}
This is the content ...
{{#PERSONS}}
<p>My name is {{NAME}}.</p>
{{/PERSONS}}
{{/CONTENT}}
{{FOOTER}}
Of course this is intended to be somewhat of a templating system in the end, s...
I'm storing an ArrayList of Ids in a processing script that I want to spit out as a comma delimited list for output to the debug log. Is there a way I can get this easily without looping through things?
EDIT: Thanks to Joel for pointing out the List(Of T) that is available in .net 2.0 and above. That makes things TONS easier if you have...
Does anyone know of any code or tools that can strip literal values out of SQL statements?
The reason for asking is that I want to correctly judge the SQL workload in our database, and I'm worried I might miss bad statements whose resource usage gets masked because they are displayed as separate statements. When, in reality, they are p...
Hi
I am trying to develop a script to pull some data from a large number of html tables. One problem is that the number of rows that contain the information to create the column headings is indeterminate. I have discovered that the last row of the set of header rows has the attribute border-bottom for each cell with a value. Thus I de...
I had a problem a week or so ago. Since I think the solution was cool I am sharing it here while I am waiting for an answer to the question I posted earlier. I need to know the relative position for the column headings in a table so I know how to match the column heading up with the data in the rows below. I found some of my tables ha...
I am parsing an input text file. If I grab the input one line at a time using getline(), is there a way that I can search through the string to get an integer? I was thinking something similar to getNextInt() in Java.
I know there has to be 2 numbers in that input line; however, these values will be separated by one or more white spa...
I'm using the following regex to capture a fixed width "description" field that is always 50 characters long:
(?<description>.{50})
My problem is that the descriptions sometimes contain a lot of whitespace, e.g.
"FLUID COMPRESSOR "
Can somebody provide a regex that:
Trims all whitespace off the end
Collapses an...
One of our providers sometimes sends XML feeds that are tagged as UTF-8 encoded documents but include characters that are not valid in UTF-8. This causes the parser to throw an exception and stop building the DOM object when these characters are encountered:
DocumentBuilder.parse(ByteArrayInputStream bais)
throws...
I want to digest a multipart response in C++ sent back from a PHP script. Anyone know of a very lightweight MIME parser which can do this for me?
Regards
Robert
...
I have a database full of small HTML documents and I need to programmatically insert several into, say, a PDF document with iText or a Word document with Aspose.Words. I need to preserve any formatting within the HTML documents (within reason: honouring <b> tags is a must, CSS like <span style="blah"> is a nice-to-have).
Both iText and ...
Is there a best way to turn an integer into its month name in .NET?
Obviously I can spin up a DateTime, convert it to a string, and parse the month name out of that. That just seems like a gigantic waste of time.
...
I'm not the best at PHP and would be extremely grateful if somebody could help. Basically I need to parse each line of a datafeed and just get each bit of information between each "|" - then I can add it to a database. I think I can handle getting the information from between the "|"'s by using explode but I need a bit of help with parsi...
Does anyone know a good date parser for different languages/locales? The built-in parser of Java (SimpleDateFormat) is very strict. The parser should complete missing parts with the current date.
For example
if I do not enter the year (only day and month) then the current year should be used.
if the year is 08 then it should not parse 0008...