I'm using cURL to scrape web pages, but I can only seem to fetch top-level URLs. For example, if I request the URL "http://www.businessweek.com/news/2010-09-29/flaherty-says-canada-july-gdp-report-tomorrow-may-be-negative.html", it returns nothing (as if the page were blank).

This is my C code:

#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
  CURL *curl;
  CURLcode res;

  curl = curl_easy_init();
  if(curl) {
    //THIS WORKS
    //curl_easy_setopt(curl, CURLOPT_URL, "news.google.com");

    //THIS DOESN'T WORK
    curl_easy_setopt(curl, CURLOPT_URL, "http://www.businessweek.com/news/2010-09-29/flaherty-says-canada-july-gdp-report-tomorrow-may-be-negative.html");
    res = curl_easy_perform(curl);

    curl_easy_cleanup(curl);
  }
  return 0;
}

If I could get some input on this issue that would be great.

+4  A: 

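It's because the site is sending a 301 redirect, and libcurl does not follow redirects by default. Set CURLOPT_FOLLOWLOCATION to 1 to follow redirects automatically. The option takes a long, so pass 1L:

curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);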
Matthew Flaschen
+1 @Matthew Exactly!
karlphillip