scrape

Excel to XML for data stripping

I am trying to strip data from thousands of identical Excel 2007/2010 files. I would prefer to do this using scraping techniques. Is it possible to scrape an Excel file since, as far as I know, the file is basically some sort of XML format. So, is it possible to convert an Excel file to XML or some other markup format? ...

Using SoupStrainer to parse selectively

Im trying to parse a list of video game titles from a shopping site. however as the item list is all stored inside a tag . This section of the documentation supposedly explains how to parse only part of the document but i cant work it out. my code: from BeautifulSoup import BeautifulSoup import urllib import re url = "Some Shopping ...

Http Agility Pack - Accessing Siblings?

Using the HTML Agility Pack is great for getting descendants and whole tables etc... but how can you use it in the below situation ...Html Code above... <dl> <dt>Location:</dt> <dd>City, London</dd> <dt style="padding-bottom:10px;">Distance:</dt> <dd style="padding-bottom:10px;">0 miles</dd> <dt>Date Issued:</dt> <dd>26/10/2010</dd> <d...