Hi, I am having a problem scraping code that I need in order to extract information for a web mashup I'm creating.
Basically, I am trying to scrape code from:
http://yellowpages.com.mt/Meranti-Ltd-In-Malta-Gozo;/Hair-Accessories;Hijjhkikke=Hiojhhfokje.aspx
This is just one of the pages I will need to scrape, and hence I cannot feed the program d...
If I screen scrape a form from a site secured with SSL, and put that form on my site (which is also secured by SSL), do I still get the benefits of SSL?
...
Looking around for a solution to this, I have found different methods. Some use regex, some use DOM scripting or something.
I want to go to a site, log in, fill out a form and then check if the form sent. The logging in part is the part I can't find anything on.
Anyone know of an easy way to do this?
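One possible approach, sketched in Python with only the standard library: keep a cookie jar so the session created by the login request carries over to the form submission. The URLs, the field names (`username`, `password`) and the `success_marker` check are all hypothetical placeholders, not details of any real site, and many sites also require hidden tokens that this sketch does not handle.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical endpoints; replace with the real site's login and form URLs.
LOGIN_URL = "https://example.com/login"
FORM_URL = "https://example.com/contact"


def make_opener():
    """Return an opener that remembers cookies between requests, plus its jar."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def encode_form(fields):
    """Encode a dict of form fields as a URL-encoded POST body."""
    return urllib.parse.urlencode(fields).encode("utf-8")


def login_and_submit(username, password, form_fields, success_marker):
    """Log in, post the form, and report whether success_marker is in the reply."""
    opener, _ = make_opener()
    # Passing a body makes this a POST; any session cookie is stored in the jar.
    credentials = {"username": username, "password": password}  # assumed field names
    opener.open(LOGIN_URL, encode_form(credentials)).close()
    # The same opener sends the stored cookie along with the form submission.
    with opener.open(FORM_URL, encode_form(form_fields)) as response:
        body = response.read().decode("utf-8", errors="replace")
    return success_marker in body
```

Checking for a known piece of text in the response is only one way to decide whether the form "sent"; the HTTP status code or a redirect target may be a better signal on some sites.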
...
Hi, I've come across an issue that I can't seem to get past. I'm also new to Ruby on Rails, hence the number of questions.
I am attempting to scrape a webpage such as the following:
http://www.yellowpages.com.mt/Malta/Grocers-Mini-Markets-Retail-In-Malta-Gozo.aspx
I would like to scrape The Addres...
I'd like to extract a few pieces of system information from a Kronos clock programmatically. I can scrape the web-based interface, but there's got to be a cleaner interface.
Does anyone have experience querying a Kronos 4500 clock for status info?
...
Dear everyone,
I am using Scrapy for scraping.
I decided to write my own scheduler middleware to store some requests, in order to reduce the number of requests held in memory.
Here is my code:
def enqueue_request_into_scheduler(self, spider, request):
    print "ENQUEUE SCHEDULER with request %s" % str(request)
    scrapyengine.scheduler.enqueue_reques...
I would like to save a web page programmatically.
I don't mean merely saving the HTML. I would also like to automatically store all associated files (images, CSS files, maybe embedded SWF, etc.), and ideally rewrite the links for local browsing.
The intended usage is a personal bookmarks application, in which link content is cached in c...
I want to display on a WordPress page the total volume of shares traded on the NYSE over the last two weeks that it has been open. What is the best way to go about doing this?
...
I'm trying to scrape using Google's "I'm Feeling Lucky" button.
For a simple query like 'iteminfo.ca' it works, because it redirects me to iteminfo.ca.
This is the query url:
http://www.google.com/search?hl=en&source=hp&q=iteminfo.ca&btnI=I%27m+Feeling+Lucky
But for a query like '061754020164 site:iteminfo.ca' it doesn't wo...
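For reference, a minimal Python sketch of how the query URL above can be built for an arbitrary search string. The function name `lucky_url` is made up for illustration; the parameters are the ones visible in the URL quoted in the question. This only shows the encoding step and does not explain or change how Google responds to the second query.

```python
from urllib.parse import urlencode


def lucky_url(query):
    """Build the "I'm Feeling Lucky" URL from the question for any query string."""
    params = {
        "hl": "en",
        "source": "hp",
        "q": query,
        "btnI": "I'm Feeling Lucky",
    }
    # urlencode escapes spaces as "+" and reserved characters such as ":" and "'".
    return "http://www.google.com/search?" + urlencode(params)


print(lucky_url("iteminfo.ca"))
print(lucky_url("061754020164 site:iteminfo.ca"))
```

Encoding the query this way at least rules out a malformed URL (for example an unescaped space) as the cause of the difference between the two queries.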
I want to create a C++ cross-platform (Windows and MacOS X) application that sends the screen as a video stream to a server.
The application is needed in the context of lecture capture. The end result will be a Flash-based web page that plays back the lecture (presenter video and audio + slides/desktop).
I am currently exploring a few ...
Hello!
I'm trying to figure out a way to strip all HTML tags from records in a database and then create XML from the result.
Any ideas?
It is built on ASP.NET 2.0 with SQL Server.
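The two steps (strip the tags, then emit XML) can be sketched as follows. This is written in Python with the standard library rather than ASP.NET, purely to illustrate the idea; the `records` input, the element names `records` and `record`, and the `id` attribute are assumptions standing in for whatever the database actually returns.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextExtractor(HTMLParser):
    """Collect only the text nodes of a document, discarding every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    """Return the text content of an HTML fragment with all tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def records_to_xml(records):
    """Turn (id, html) pairs, e.g. rows read from the database, into one XML string."""
    root = ET.Element("records")
    for record_id, html in records:
        item = ET.SubElement(root, "record", id=str(record_id))
        # ElementTree escapes &, < and > in the text when serializing.
        item.text = strip_tags(html)
    return ET.tostring(root, encoding="unicode")


print(records_to_xml([(1, "<p>Hello <b>world</b> &amp; co</p>")]))
```

Using a parser rather than a regular expression means entities such as `&amp;` are decoded before the text is re-escaped for XML. Note that this sketch keeps the text inside `<script>` and `<style>` elements, which may need filtering for real data.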
...
I need to obtain delivery tracking details from the Canada Post website, which does not offer an API.
I've formulated a URL that, when entered into a browser, correctly returns the tracking information, but I can't get the request to work with cURL (it returns a 500 "We're Sorry" page).
class cURL {
    var $headers;
    var $user_agent;
    v...
How do I repair malformed HTML using C#? A great answer would be an HTML Agility Pack sample!
I'm scraping a site (for legitimate use). The site's HTML is OK but there are some annoying problems.
One way I could go would be through regular expressions. I used Expression Web to analyse the problems and the regular expressions needed t...
Hello,
Is it possible to scrape this applet? http://www.text118118.com/livefeed.aspx
It's not possible to do it the traditional way, as the text is inside the applet. However, is it possible to do it with a macro?
The feed loops after 8 questions and the text stays highlighted.
...
Hi,
Assuming I have an Amazon product URL like so:
http://www.amazon.com/Kindle-Wireless-Reading-Display-Generation/dp/B0015T963C/ref=amb_link_86123711_2?pf_rd_m=ATVPDKIKX0DER&pf_rd_s=center-1&pf_rd_r=0AY9N5GXRYHCADJP5P0V&pf_rd_t=101&pf_rd_p=500528151&pf_rd_i=507846
How could I scrape just the ASIN using javascript...
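A minimal sketch of the extraction, shown in Python rather than JavaScript; the same regular expression idea carries over. It assumes the ASIN is the 10-character uppercase alphanumeric segment that follows `/dp/` in the path, which holds for the URL above but may not cover every Amazon URL shape. The name `extract_asin` is made up for illustration.

```python
import re
from urllib.parse import urlparse

# Assumption: the ASIN is the 10-character segment directly after "/dp/".
ASIN_PATTERN = re.compile(r"/dp/([A-Z0-9]{10})(?:/|$)")


def extract_asin(url):
    """Return the ASIN found in the URL path, or None if there is no match."""
    # Only the path is searched, so query-string values cannot match by accident.
    match = ASIN_PATTERN.search(urlparse(url).path)
    return match.group(1) if match else None


url = (
    "http://www.amazon.com/Kindle-Wireless-Reading-Display-Generation/dp/"
    "B0015T963C/ref=amb_link_86123711_2?pf_rd_m=ATVPDKIKX0DER&pf_rd_s=center-1"
)
print(extract_asin(url))  # B0015T963C
```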
I'm working on an SEO app that (among other things) shows the number of incoming links to your site over time.
There are a few ways to get this data. Scraping Google "link:yoursite.com" results gives you some (not all) of the links they know about, but they aren't too happy if you are doing lots of scraping. Similarly Yahoo has their ...
Hi,
How can I convert a retrieved XHTML string to an XML file? Are there any FCL libraries to do this?
...
Which one is better for screen scraping: Simple HTML DOM or Snoopy?
I use Simple HTML DOM and find it comfortable.
Does Snoopy have any advantage over Simple HTML DOM?
My requirement: I want to scrape content from a page (after login).
Simple HTML DOM is easy, but it takes a lot of time to print the results.
...
A recent blog entry by Jeff Atwood says that you should never parse HTML using regular expressions, yet it doesn't give an alternative.
I want to scrape search results, extracting values:
<div class="used_result_container">
...
...
<div class="vehicleInfo">
...
...
...
Hey,
It's not really scraping, I'm just trying to find the URLs in a web page where the class has a specific value. For example:
<a class="myClass" href="/url/7df028f508c4685ddf65987a0bd6f22e">
I want to get the href value. Any ideas on how to do this? Maybe regex? Could you post some example code?
I'm guessing html scraping libs, su...
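As one illustration of the parser-based route, here is a minimal Python sketch using only the standard library's `html.parser`. The names `ClassLinkParser` and `find_links` are made up for this example, and the class name `myClass` is taken from the snippet above.

```python
from html.parser import HTMLParser


class ClassLinkParser(HTMLParser):
    """Collect href values of <a> tags that carry a given CSS class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # The class attribute may hold several space-separated class names.
        classes = (attributes.get("class") or "").split()
        href = attributes.get("href")
        if self.wanted_class in classes and href:
            self.hrefs.append(href)


def find_links(html, wanted_class):
    """Return every matching href in document order."""
    parser = ClassLinkParser(wanted_class)
    parser.feed(html)
    parser.close()
    return parser.hrefs


page = '<a class="myClass" href="/url/7df028f508c4685ddf65987a0bd6f22e">x</a>'
print(find_links(page, "myClass"))
```

Unlike a regular expression, this does not depend on the order of the attributes or on the exact quoting used in the page.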