parsing

Parse/deconstruct SQL with VBA-Access

Does anyone know of a way to deconstruct a SQL statement, including subqueries, using VBA? That is: take a SELECT statement and extract the columns from each SELECT, the tables from each FROM and each JOIN, and the filtering criteria from each WHERE. I can then put this data into a BOM table to create a "map" of the query. I have a project to map Teradata v...

How to make asp.net parse markup tag property as list

Is there a way that I can extend asp.net to accept the markup <c:MyControl runat="server" MyList="1,2,6,7,22" /> Where MyList is a List<int> or List<string> or even List<someEnum>? So I want asp.net to parse automatically all lists (that can be parsed) generically. I know I could take the way around it and make MyList a string, the...

Converting a pdf to text/html in python so I can parse it

Dear Python Experts, I have the following sample code where I download a pdf from the European Parliament website on a given legislative proposal: EDIT: I ended up just getting the link and feeding it to Adobe's online conversion tool (see the code below): import mechanize import urllib2 import re from BeautifulSoup import * adobe = "...

Parsing Text Data File With Linq

I have a large text file of records, each delimited by a newline. Each record is prefixed by a two-digit number which specifies its type. Here's an example: .... 30AA ALUMINIUM ALLOY LMELMEUSD2.00 0.35 5101020100818 40AADFALUMINIUM ALLOY USD USD 100 1 0.20000 1.00 0 100 140003 50201008180.999993 0.00 0.0...
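
The record layout described above (a two-character type code followed by the payload) can be sketched in Python rather than LINQ. This is only an illustration of the grouping step: the function name `group_by_record_type` and the sample records are hypothetical, and the field layout inside each payload is not known from the excerpt, so the payload is kept as an unparsed string.

```python
from collections import defaultdict


def group_by_record_type(lines):
    """Group records by their two-character type prefix.

    Returns a dict mapping the prefix (e.g. "30") to the list of
    payload strings that followed it, in file order.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if len(line) < 2:
            continue  # skip blank or malformed lines
        groups[line[:2]].append(line[2:])
    return dict(groups)


# Hypothetical records, shaped like the ones in the question.
sample = ["30AA ALUMINIUM ALLOY", "40AADFALUMINIUM ALLOY", "30BB COPPER"]
print(group_by_record_type(sample))
```

Because the function accepts any iterable of lines, an open file object can be passed directly, so a large file is read one line at a time.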

git mergers full path to files

Good day. I need a way to find the full path of changed files as reported after a git merge. Git generally prints .../rest/of/path/to/the/file if the path is too long. But I am trying to parse that output, and depending on the location of the file I want to be able to decide a suitable action while building (I am writing build scripts). Is ...

Writing a parser for regular expressions

Even after years of programming, I'm ashamed to say that I've never really fully grasped regular expressions. In general, when a problem calls for a regex, I can usually (after a bunch of referring to syntax) come up with an appropriate one, but it's a technique that I find myself using increasingly often. So, to teach myself and under...

Intelligently extracting tags from blogs and other web pages

I'm not talking about HTML tags, but tags used to describe blog posts, or youtube videos or questions on this site. If I was crawling just a single website, I'd just use an xpath to extract the tag out, or even a regex if it's simple. But I'd like to be able to throw any web page at my extract_tags() function and get the tags listed. I...

Basic input file parsing in R

I'm used to perl and new to R. I know you can read whole tables using read.table() but I wonder how can I use R to parse a single line from an input file. Specifically, what is the equivalent to the following perl snippet: open my $fh, $filename or die 'can't open file $filename'; my $line = <$fh>; my ($first, $second, $third) = split ...

How to access comments using lxml

I am trying to remove comments from a list of elements that were obtained by using lxml. The best I have been able to do is: no_comments=[element for element in element_list if 'HtmlComment' not in str(type(element))] I am wondering if there is a more direct way? I am going to add something based on Matthew's answer - he got me almost t...

Liskov Substitution Principle and the directionality of the original statement

I came across the original statement of the Liskov Substitution Principle on Ward's wiki tonight: What is wanted here is something like the following substitution property: If for each object o1 of type S there is an object o2 of type T such that for all programs P defined in terms of T, the behavior of P is unchanged when o1 is subs...

Objective C parse hex string to integer

I would like to know how to parse a hex string, representing a number, in Objective-C. I am willing to use either an Objective-C or a C-based method; either is fine. Example: #01FFFFAB should parse into the integer: 33554347 Any help would be appreciated! ...
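
As a cross-check of the arithmetic in the example, here is a short sketch in Python (not Objective-C): it strips the leading `#` and interprets the remaining digits in base 16. The function name `parse_hex_string` is hypothetical. In C, the standard `strtoul` function called with base 16 performs the same conversion once the `#` has been skipped.

```python
def parse_hex_string(text):
    """Parse a hex string such as '#01FFFFAB' into an integer."""
    digits = text.strip().lstrip("#")  # drop whitespace and the leading '#'
    return int(digits, 16)             # raises ValueError on non-hex input


print(parse_hex_string("#01FFFFAB"))  # 33554347
```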

Large file parsing

I wasn't very bright and I deleted some files that I need for school. I need to parse a dump of my hard drive for the files. The file will be hundreds of gigabytes in size. Can anyone tell me a way to do this in C or Python? I know that C can only handle files up to 2 gigabytes in size; is there any way I can get around this? My parsing w...
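
One way to approach a dump of this size is to read it in fixed-size chunks and search each chunk for a byte signature, carrying a small overlap between chunks so that a match straddling a chunk boundary is not missed. The sketch below is in Python and is only an assumption about what the parsing needs, since the question is cut off. The function name `find_signature`, the 1 MiB default chunk size, and the example signature are all hypothetical. Memory use stays bounded by the chunk size regardless of file size.

```python
def find_signature(path, signature, chunk_size=1024 * 1024):
    """Yield the absolute byte offset of every occurrence of `signature`.

    The file is read in chunks; the last len(signature) - 1 bytes of each
    buffer are kept and prepended to the next chunk so that matches
    spanning a chunk boundary are still found exactly once.
    """
    if not signature:
        raise ValueError("signature must not be empty")
    overlap = len(signature) - 1
    with open(path, "rb") as fh:
        tail = b""
        base = 0  # absolute file offset of the first byte of `buf`
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            start = 0
            while True:
                idx = buf.find(signature, start)
                if idx == -1:
                    break
                yield base + idx
                start = idx + 1
            # Keep only the bytes that could begin a boundary-spanning match.
            tail = buf[-overlap:] if overlap else b""
            base += len(buf) - len(tail)


# Usage (hypothetical path and signature):
# for offset in find_signature("disk.img", b"%PDF-"):
#     print(offset)
```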

Unable to use regex to search in PHP?

Hi, I'm trying to get the code of an HTML document in specific tags. My method works for some tags, but not all, and it does not work for the content of the tag I want to get. Here is my code: <html> <head></head> <body> <?php $url = "http://sf.backpage.com/MusicInstruction/"; $data = file_get_contents($url); $pattern = "/<di...

How to read large XML file consisting of large number of small items efficiently in Java?

I have a large XML file that consists of relatively fixed-size items, i.e. <rootElem> <item>...</item> <item>...</item> <item>...</item> </rootElem> The item elements are relatively shallow and typically rather small (<100 KB), but there may be a lot of them (hundreds of thousands). The items are completely independent of each o...

how to parse a JSON string into JsonNode in Jackson?

It should be so simple, but I just cannot find it after trying for an hour #embarrassing. I need to get a JSON string, e.g. {"k1":v1,"k2":v2}, parsed as a JsonNode. JsonFactory factory = new JsonFactory(); JsonParser jp = factory.createJsonParser("{\"k1\":\"v1\"}"); JsonNode actualObj = jp.readValueAsTree(); gives java.lang....

Getting the JSON string from a Twitter search result into Java to be parsed

I am trying to get the JSON data from a Twitter search request such as this link text into my Java program so that I can parse it using Gson. How would I get the data from that URL into Java? Would I use an http request or something else? I've seen JSONRequest.get, but I can't see where that's coming from at all. ...

Access xml feed with HTAUTH in PHP

Hello All, How can I access an XML feed with HTAUTH in PHP? Dummy link format is http://username:[email protected]/www.domain.com/trends/001.xml I'm using the code below to access it. $doc = new DOMDocument(); $doc->load($source); Thanks in advance, steamboy ...

FB.XFBML renders only first element in a div

So I have a div in which I load all my friends; the div's name is 'myFriendList'. I have their names and profile pictures with and , then I call FB.XFBML.parse(document.getElementById('myFriendList')); but it only shows me the name of the first friend in the list. Why doesn't it show all names and pictures? ...

javascript plain text url parsing

I'm trying to search plain old strings for urls that begin with http, but all the regex I find doesn't seem to work in javascript nor can I seem to find an example of this in javascript. This is the one I'm trying to use from here and here: var test = /\b(?:(?:https?|ftp|file)://www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]...

Get text within div.

<div class="plugin-block"> <h3><a href="http://wordpress.org/extend/plugins/sailthru-triggermail/">Sailthru</a></h3> **Intergrate Sailthru API functionality into your WordPress blog.** <ul class="plugin-meta"> <li><span class="info-marker">Version</span> 1.0</li> <li><span class="info-marker">Updated</span> 20...