Hey guys,

(Not a native English speaker)

I'm doing a personal project in PHP in which I use the Simple HTML Parser to parse the HTML of a given URL and retrieve the first image in a DIV that has a specific ID or class (maincontent, content, main, wrapper, etc. - it's all in an array), ignoring ads. The goal is to take this image and make a thumbnail from it, pretty much like on Digg and others.

I thought everything was working fine until I tried my script with the website Snopes ("http://www.snopes.com/photos/animals/luckycoyote.asp" <- this page more exactly).

The source of the first image it gets is: "graphics/luckycoyote1.jpg". So far, to correct this problem, I created a little function that gets the domain name of the given URL and inserts it before the IMG's source attribute. So for sites like Snopes.com, it gives me: "http://www.snopes.com/graphics/luckycoyote1.jpg" ... while the real URL on Snopes for this image is "http://www.snopes.com*/photos/animals/*graphics/luckycoyote1.jpg" (or, more precisely: "http://graphics1.snopes.com/photos/animals/graphics/luckycoyote1.jpg" - note the subdomain here).

So my main question is: how can I externally/dynamically retrieve the full URL of an image (the "absolute path") when I am only given the "relative path"? I'm pretty sure this is possible, since when I paste the link in Facebook's "What are you doing?" field, for example, it gives me the correct path to the image, while on the website the source of the image is only (example) "image/photo/example.jpg".

Thank you for your time.

A: 

In your case, my guess is that there is a server redirect going on. The only reliable way to find the final URL would be to make a web request for the image using the "default domain" URL you initially built, and then see where it gets redirected to during the process.

Mitchel Sellers
Thank you for taking the time to respond. I think your solution is "out of my league", but I guess I'll have to do some research to see if this could work.
+3  A: 

When you get a relative image URL such as graphics/luckycoyote1.jpg, meaning the src="" attribute DOESN'T start with a /, you should prepend the path of the page you're currently browsing instead of just the domain name.

To get this in PHP, run dirname('http://www.snopes.com/photos/animals/luckycoyote.asp') and it will return the path you need (without a trailing slash). Stick that, plus a /, in front of graphics/luckycoyote1.jpg and you'll get your image.

The redirect to graphics1.snopes.com happens automatically on the server and you shouldn't need to worry about it. When the image src="" starts with a /, use the domain name http://www.snopes.com instead.
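
The two rules above (path-relative vs. root-relative) can be sketched in one small function. This is only a minimal sketch: the name resolve_image_url is made up for illustration, and it uses parse_url() plus the page's directory rather than dirname(), because dirname() treats the URL as a file path and drops the last directory when the page URL ends in a slash. It does not collapse "../" or "./" segments and ignores any base tag in the page.

```php
<?php
// Hypothetical helper (not from the thread): turn an <img> src into an
// absolute URL, given the URL of the page the image was found on.
function resolve_image_url(string $pageUrl, string $src): string
{
    $src = trim($src);

    // Already absolute (starts with a scheme such as http://): keep as-is.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $src)) {
        return $src;
    }

    $parts = parse_url($pageUrl);
    $root  = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $root .= ':' . $parts['port'];
    }

    // Protocol-relative src (//host/path): reuse the page's scheme.
    if (strpos($src, '//') === 0) {
        return $parts['scheme'] . ':' . $src;
    }

    // Root-relative src (starts with /): prepend scheme and host only.
    if (strpos($src, '/') === 0) {
        return $root . $src;
    }

    // Path-relative src: prepend the page's directory, including its
    // trailing slash. A page URL with no path is treated as "/".
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);

    return $root . $dir . $src;
}

$page = 'http://www.snopes.com/photos/animals/luckycoyote.asp';
echo resolve_image_url($page, 'graphics/luckycoyote1.jpg'), "\n";
// http://www.snopes.com/photos/animals/graphics/luckycoyote1.jpg
echo resolve_image_url($page, '/graphics/luckycoyote1.jpg'), "\n";
// http://www.snopes.com/graphics/luckycoyote1.jpg
```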

Matt S
Thanks a million for your response. I'll try this as soon as possible and give feedback if it works correctly.
Hey Matt, I just had a chance to make the needed modifications, and the dirname function gives me the appropriate path. Sadly, it seems like Snopes doesn't want people to direct link to their pictures (even though that's not what I want to do), since when I display the image from my script, it gives me this image: http://67.19.222.106/club21.gif . Strange, since Facebook has no problem getting the right images. Is there something I can do about this? Thanks again.
This is likely an issue with referrer checking on Snopes' side. If that is the case, there is no simple fix, though workarounds do exist. Google will be your friend here, as unfortunately I don't know them.
Matt S
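
If referrer checking really is the cause (that is a guess, not something confirmed in this thread), one thing to try is downloading the image on the server while sending the page URL as the Referer header, then building the thumbnail from the downloaded copy instead of linking to the remote file. A rough sketch, assuming allow_url_fopen is enabled; the function names image_request_options and fetch_image are made up for illustration:

```php
<?php
// Hypothetical helper: HTTP stream-context options that send the page
// URL as the Referer and follow redirects (for example to a subdomain).
function image_request_options(string $pageUrl): array
{
    return [
        'http' => [
            'method'          => 'GET',
            'header'          => "Referer: {$pageUrl}\r\n",
            'follow_location' => 1,
            'timeout'         => 10,
        ],
    ];
}

// Hypothetical helper: return the raw image bytes, or null on failure.
function fetch_image(string $imageUrl, string $pageUrl): ?string
{
    $context = stream_context_create(image_request_options($pageUrl));
    $bytes   = @file_get_contents($imageUrl, false, $context);

    return $bytes === false ? null : $bytes;
}
```

A call such as fetch_image($absoluteImageUrl, $pageUrl) would return the bytes to save locally and resize. Whether a given site accepts the request still depends on what that site actually checks.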