I am experimenting with scraping certain pages from an RSS feed using curl and PHP. The page scraping was working fine when I used the actual links, but not with links taken from the RSS feed. I realize now that links in RSS feeds are usually just redirects to the actual page (at least that is what it seems like), because when I scrape a page using the RSS link, it doesn't actually get the information I am looking for.

Has anyone encountered this and knows of a workaround? Is there any way to see where the RSS link is redirecting to and capture that value?

A: 

I think you might need to use the -L switch to tell curl to follow redirects. I'm not sure whether you can do this directly from PHP or whether you need to take the approach described here: http://php.net/manual/en/function.curl-setopt.php#95027. It is also possible that the site you are scraping blocks by user agent or something similar. Maybe try one of the links in a browser while running Fiddler or a similar tool to see whether any redirection is actually taking place.

Martin Smith
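
For what it's worth, in PHP's curl extension the equivalent of the command-line -L switch is CURLOPT_FOLLOWLOCATION, and curl_getinfo() can report the URL the request finally landed on. Below is a minimal sketch along those lines; the feed URL and user-agent string are placeholders, and on older PHP versions CURLOPT_FOLLOWLOCATION is refused when open_basedir or safe_mode is in effect, which is the situation the linked manual comment works around.

<?php
// Minimal sketch: fetch one RSS item link, follow redirects, and capture
// the final URL. The URL and user-agent string below are placeholders.
$url = 'http://example.com/some-rss-item-link';

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of echoing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // PHP equivalent of curl's -L switch
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);          // guard against redirect loops
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; FeedScraper/1.0)'); // some sites block the default agent

$html = curl_exec($ch);

// The URL the content was actually served from, after all redirects.
$finalUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);

echo "Final URL: $finalUrl\n";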
Thanks, yeah, I managed to find a script that loops through the redirects and finds the last one. It seems like most sites don't block by user agent, but some do.
pfunc
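
For reference, here is a rough sketch of the kind of redirect-chasing loop pfunc describes, assuming the feed links redirect with ordinary 3xx responses and Location headers. The function name and example URL are made up for illustration; a real version would also need to resolve relative Location values against the current URL, and some servers treat the HEAD-style request used here differently from a normal GET.

<?php
// Rough sketch: follow redirects one hop at a time and return the last URL.
// Handy when CURLOPT_FOLLOWLOCATION is unavailable (e.g. open_basedir set).
// resolve_redirects() is a hypothetical helper name, not from the thread.
function resolve_redirects($url, $maxHops = 10)
{
    for ($i = 0; $i < $maxHops; $i++) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true);          // headers only, skip the body
        curl_setopt($ch, CURLOPT_HEADER, true);          // include headers in the returned string
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $headers = curl_exec($ch);
        $code    = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        // Stop if the request failed or we got a non-redirect response.
        if ($headers === false || $code < 300 || $code >= 400) {
            return $url;
        }

        // Pull the next hop out of the Location header (assumes an absolute URL).
        if (!preg_match('/^Location:\s*(\S+)/mi', $headers, $m)) {
            return $url;
        }
        $url = trim($m[1]);
    }
    return $url; // gave up after $maxHops hops
}

echo resolve_redirects('http://example.com/some-rss-item-link'), "\n";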