I have a weird date format in some files I'm parsing. Here are some examples:
1954203
2012320
2010270
The first four digits are the year and the next three digits are day of year. For example, the first date is the 203rd day of 1954, or 7/22/1954.
My questions are:
What's this date format called?
Is there a pre-canned way to parse ...
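A layout of four year digits followed by a three-digit day of year matches what ISO 8601 calls an ordinal date (often loosely called a "Julian date"). As a minimal sketch, assuming Python and strings that are always exactly seven digits, the conversion could look like this. The function name `parse_ordinal_date` is made up for illustration.

```python
import calendar
from datetime import date, timedelta


def parse_ordinal_date(text: str) -> date:
    """Convert a YYYYDDD string (year + day of year) into a calendar date."""
    if len(text) != 7 or not text.isdigit():
        raise ValueError(f"expected 7 digits (YYYYDDD), got {text!r}")
    year = int(text[:4])
    day_of_year = int(text[4:])
    # Day 366 only exists in leap years.
    days_in_year = 366 if calendar.isleap(year) else 365
    if not 1 <= day_of_year <= days_in_year:
        raise ValueError(f"day of year {day_of_year} is out of range for {year}")
    # Day 1 is January 1st, so add (day_of_year - 1) days to it.
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


for sample in ("1954203", "2012320", "2010270"):
    print(sample, "->", parse_ordinal_date(sample).isoformat())
```

Slicing the string and adding a `timedelta` keeps the range check explicit, so a value such as day 366 in a non-leap year is rejected rather than rolling over into the next year.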
I'm pulling my hair out, and I might pull a tooth out next; that's how frustrated I am.
I have deleted (for the purpose of proving a point) ALL the RSS files in my WordPress site:
http://baked-beans.tv
No matter what I edit, Google Reader reads what it wants, i.e. the posts and all their content!
So how on earth am I supposed to edit the conten...
I am using Nokogiri, which works well for small documents. But for a 180 KB HTML file I have to increase the process stack size (via ulimit -s), and the parsing takes a long time, let alone XPath queries.
Are there faster alternatives available using a stock Ruby 1.8 distribution?
I am getting used to XPath, but the alternative does not ...
I'm having some trouble with BNF. I can't tell what seems to be the standard way of doing things (if there is one), and whether or not there are types like char or int or whatever already built in.
However, my main problem is that I don't understand how the part of the BNF in curly braces works.
Given something like:
exp : term ...
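Assuming the braces carry their usual EBNF meaning ("repeat the enclosed part zero or more times"), they map directly onto a loop in a hand-written parser. The sketch below uses a made-up rule, `total : NUMBER { "+" NUMBER }`, purely for illustration. It is not the grammar from the question, which is cut off above, and the function name is hypothetical too.

```python
from typing import Sequence


def parse_total(tokens: Sequence[str]) -> int:
    """Parse and evaluate the hypothetical rule: total : NUMBER { "+" NUMBER }"""
    pos = 0

    def number() -> int:
        # NUMBER: a single token made only of digits.
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected NUMBER at token {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    result = number()
    # { "+" NUMBER } : zero or more repetitions, so it becomes a while loop.
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1  # consume "+"
        result += number()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r} at position {pos}")
    return result


print(parse_total(["4"]))                      # braces matched zero times
print(parse_total(["1", "+", "2", "+", "3"]))  # braces matched twice
```

A single `NUMBER` is valid because the braced part may match zero times. Optional parts (at most once) are normally written with square brackets instead.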
Hi, I have a Google custom search included on an HTML page, like:
http://www.*.com/search.htm?cx=partner-pub--00000000000-c77&cof=FORID%3A10&ie=ISO-8ds3-1&q=software&sa=Search&siteurl=www.*.com%2#1342
When I use the same URL in a browser, I get results. When I call it with the simple HTML DOM parser, it returns blank...
Hi, I want to parse a BibTeX publications file, sort by specific fields (e.g. year), and filter certain content, to then put it on a website. I came across pybtex, which works as far as reading and parsing the BibTeX file, but it is basically undocumented and I can't figure out how to sort the entries.
Is pybtex the way to go (how c...
I have the following lemon grammar (simplified from the real grammar):
%right ASSIGN .
%nonassoc FN_CALL .
program ::= expression .
expression ::= expression ASSIGN expression .
expression ::= function_call . [FN_CALL]
expression ::= IDENTIFIER .
function_call ::= expression LPAREN RPAREN . [FN_CALL]
I'm not able to fix the shift-r...
Hello,
I'm trying to write a Python script that takes in one or two XML files and outputs one or two new files based on the contents of the input files. I was trying to write this script using the minidom module. However, the input files contain a number of instances of the escape character
inside node attributes. Unfortunately, in...
Hi All,
Is there any way to find out all the status codes a host received when it tried to access a particular website?
Something like
28-10-2010 192.168.1.1 HTTP 404 http://localhost/BAC/default.aspx
28-10-2010 192.168.1.10 HTTP 200 //localhost/BAC/default2.aspx1
I tried using some free log analysers like : IIS Log Analyser,I...
In the middle of an XML document I'm transforming, there is a CDATA node which I know itself is composed of XML. I would like to have that "recursively parsed" as XML so that I can transform it too. Upon searching, I think my question is very similar to http://stackoverflow.com/questions/1927522/handling-node-with-inner-xml-in-xslt.
T...
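Outside XSLT, the two-pass idea (parse the outer document, take the CDATA text, then parse that text as a document of its own) can be sketched as follows. This is a Python illustration only, not an XSLT solution, and the element names `envelope`, `payload`, `order` and `item` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical outer document whose <payload> holds XML wrapped in CDATA.
OUTER = """<envelope>
  <payload><![CDATA[<order id="7"><item>tea</item></order>]]></payload>
</envelope>"""


def parse_embedded_xml(outer_xml: str, element_path: str) -> ET.Element:
    """Parse outer_xml, then parse the text of element_path as XML again."""
    outer_root = ET.fromstring(outer_xml)
    holder = outer_root.find(element_path)
    if holder is None or not (holder.text or "").strip():
        raise ValueError(f"no embedded XML found at {element_path!r}")
    # ElementTree exposes CDATA content as ordinary text, so it can be
    # handed straight to a second parse.
    return ET.fromstring(holder.text.strip())


inner = parse_embedded_xml(OUTER, "payload")
print(inner.tag, inner.get("id"), inner.findtext("item"))
```

The second parse fails with `ET.ParseError` if the CDATA text is not well-formed XML on its own, which is worth handling when the embedded content comes from another system.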
What is the purpose of the Parse::CPAN::Authors module?
use Parse::CPAN::Authors;
# must have downloaded
my $p = Parse::CPAN::Authors->new("01mailrc.txt.gz");
# either a filename as above or pass in the contents of the file
my $p = Parse::CPAN::Authors->new($mailrc_contents);
my $author = $p->author('LBROCARD');
# $a is ...
Hi to all.
I have a problem reading .xlsx files in an ASP.NET MVC 2.0 application using C#. The problem occurs when reading an empty cell from the .xlsx file: my code simply skips the cell and reads the next one.
For example, if the contents of .xlsx file are:
FirstName LastName Age
John 36
They will be read as:
First...
Hi,
On my web app, I look at the current URL, and if it has a form like this:
http://www.domain.com:11000/invite/abcde16989/root/index.html
-> All I need is to extract the ID, which consists of 5 letters and 5 numbers (abcde16989), into another variable for further use.
So I need this:
var current_url = "the whole p...
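The extraction itself comes down to one regular expression. Here is a minimal sketch of the pattern in Python; the same expression should carry over to a JavaScript regex largely unchanged. It assumes the ID always sits directly after `/invite/` and that the letters may be upper or lower case. The names `INVITE_ID` and `extract_invite_id` are made up for illustration.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Exactly 5 letters followed by exactly 5 digits, right after "/invite/".
INVITE_ID = re.compile(r"^/invite/([A-Za-z]{5}[0-9]{5})(?:/|$)")


def extract_invite_id(url: str) -> Optional[str]:
    """Return the invite ID from the URL path, or None if the URL has no such ID."""
    path = urlparse(url).path  # ignores scheme, host and port
    match = INVITE_ID.match(path)
    return match.group(1) if match else None


url = "http://www.domain.com:11000/invite/abcde16989/root/index.html"
print(extract_invite_id(url))
```

Matching against the path only, rather than the whole URL, avoids accidental hits in the host name or query string.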
Hi,
I am facing a problem.
I want my application to pick up resources from the framework. Here is a snippet of my XML.
To achieve this, the following changes were made in attrs.xml
and themes.xml at the framework level
@android:drawable/btn_minus_ss
The drawable btn_minus_ss.png is added to drawable-hdpi folder at ...
I'm making a Python 3 program that uses a SQLite database with several tables. I want to create a selector module that lets me choose which table to pull data from.
I have found out that I can't use parameter substitution for a table name, as shown below, so I'm looking for alternative methods to accomplish this.
c.ex...
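One common workaround is to check the requested name against the tables that actually exist in the database and only then build it into the SQL text. The sketch below assumes the standard `sqlite3` module; the function name `fetch_all_rows` and the demo table `books` are hypothetical.

```python
import sqlite3


def fetch_all_rows(conn: sqlite3.Connection, table_name: str) -> list:
    """Return every row of table_name, refusing names that are not real tables."""
    known_tables = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    if table_name not in known_tables:
        raise ValueError(f"unknown table: {table_name!r}")
    # The name came from the database's own table list, so it is safe to
    # interpolate; quoting also covers names with spaces or keywords.
    quoted = '"' + table_name.replace('"', '""') + '"'
    return conn.execute(f"SELECT * FROM {quoted}").fetchall()


# Demo with an in-memory database and a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, year INTEGER)")
conn.execute("INSERT INTO books VALUES (?, ?)", ("Dune", 1965))
print(fetch_all_rows(conn, "books"))
```

Values should still go through `?` placeholders as usual. Only the identifier is handled by the whitelist check.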
Hello!
So, I am using the Html Agility Pack (http://htmlagilitypack.codeplex.com/) to parse a script node, and then I use regular expressions to parse out an object definition.
The string I end up with is plain JavaScript that defines an object.
Here is the sample JavaScript I am trying to parse:
<!--Module 328 Buying Options Table-->
<...
I am writing a programming language text parser, out of curiosity. Say I want to define an immutable (at runtime) graph of tokens as vertices/nodes. These are naturally of different types: some tokens are keywords, some are identifiers, etc. However, they all share the common trait that each token in the graph points to another. This pro...
I've got a set of documents which have a semi-regular format. Rows are typically separated by new line characters, and the main components of each row are separated by spaces. Some examples are a set of furniture assembly instructions, a set of table of contents, a set of recipes and a set of bank statements.
The problem is that each s...
Hello.
I'm trying to use the uTorrent WebUI API. I think this is a pretty n00b question, but there's little documentation about this API on the web, sorry.
My server uses file_get_contents($url) and I get the data I want, but in a format I do not understand.
For example:
{
"build": BUILD NUMBER (integer),
"label": [
[
...
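The braces, brackets and quoted keys above look like JSON, which most languages can decode into native data structures (PHP has `json_decode` for this). A minimal Python sketch of the decoding step follows. The sample payload is invented, and the shape of the `label` entries is an assumption, since the real response is cut off above.

```python
import json

# Hypothetical payload: the real response is truncated above, so the values
# and the shape of the "label" entries here are assumptions.
SAMPLE_RESPONSE = '{"build": 12345, "label": [["example-label", 2]]}'


def decode_response(body: str) -> dict:
    """Decode a JSON response body and make sure the top level is an object."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


data = decode_response(SAMPLE_RESPONSE)
print(data["build"])
for entry in data["label"]:
    print(entry)
```

After decoding, JSON objects become dictionaries and JSON arrays become lists, so nested values are reached with ordinary indexing.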
Sample PDF file that I cannot parse (2.6MB Zip File)
Note: I am not interested in using a parsing library. This is for my own entertainment.
I've been experimenting with ripping text out of PDF files for a search gizmo, but I am unable to extract text from some PDF files.
Note that this is a much easier problem than straight up parsing...