views: 140

answers: 4

A few years ago I helped someone put together a webpage (for local personal use only, not served to the world) that aggregates outdoor webcam photos from several of his favorite websites. It's a time-saver for viewing multiple websites at once. We had it easy when the images on those websites had fixed URLs, and we were able to write some JavaScript when the URLs changed predictably (e.g., when the URL had a date in it). But now he'd like to add an image whose filename changes seemingly at random, and I don't know how to handle that. Basically, I'd like to:

  1. Programmatically visit another website to find the URL of a particular image.
  2. Insert that URL into my webpage with an <img> tag.

I realize this is probably a confusing and unusual question. I'm willing to help clarify as much as possible. I'm just not sure how to ask for what this guy wants to do.

Update: David Dorward mentioned that doing this with JavaScript violates the Same Origin Policy. I'm open to suggestions for other ways to approach this problem.

A: 

If you use PHP in your project, you can use the cURL library to fetch the other website's content, then parse it with a regex to extract the image URL from the source code.
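A minimal sketch of the same fetch-and-regex idea in Python instead of PHP (standard library only; the page URL and the regex are placeholders to adapt to the real page's markup):

```python
import re
import urllib.request

def extract_image_url(html):
    """Pull the first <img> src out of a known, stable chunk of HTML.

    A plain regex like this can work for one predictable page, but it
    is not a general HTML parser.
    """
    match = re.search(r'<img[^>]+src=["\']([^"\']+)["\']', html)
    return match.group(1) if match else None

def fetch_image_url(page_url):
    # page_url is a placeholder; point it at the page hosting the image.
    with urllib.request.urlopen(page_url) as resp:
        return extract_image_url(resp.read().decode("utf-8", errors="replace"))
```

Running this server-side (PHP, Python, etc.) also sidesteps the Same Origin Policy problem mentioned in the question's update, since the cross-site fetch happens outside the browser.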

antyrat
I wouldn't use regex to parse HTML.
Oren
Can we stop with that knee-jerk response? I wouldn't use regular expressions to parse any possible arbitrary HTML, but why wouldn't you use it to parse an expected HTML string?
Tom
Because it's 'expected' and sometimes, our expectations let us down.
belugabob
@belugabob If the HTML is not "as expected" then you are going to fail to extract the URL, whatever method you use. That's not an argument against regexes.
MarkJ
+1  A: 

It's probably a big fat violation of copyright.

The picture is most likely contained within a page - just regularly visit that page and parse its img tags. Make sure that the random bit you mentioned is not just a cache-busting parameter added to force browsers to fetch a fresh image instead of retrieving a cached version.

kime waza
Yes, that's why I noted "personal use only", not served out to the world. It's a time-saver to look at several webpages at the same time.
Kristo
It's not a copyright violation if the image is offered to the world, i.e., placed on a publicly accessible website. There are enough ways to deny access. Some are even considered effective enough that bypassing them would violate the DMCA, but for a basic copyright claim that's not even needed.
MSalters
Being on a publicly available website does not mean that the images cannot be copyrighted. 'Copyright' means just that: the right to copy. It is not - nor should it be expected to be - superseded by the image being publicly visible. If you took a photograph of a copyrighted painting that was temporarily displayed in a public place, you wouldn't have any right to sell copies of that photograph - the same applies to the web.
belugabob
A: 
  1. Fetch html of remote page using Cross Domain AJAX.
  2. Then parse it to get urls of images of interest.
  3. Then for each url do <img src=url />
TheMachineCharmer
Why the downvote?
TheMachineCharmer
I'm guessing it's because the question is about photos on various other web pages, not the one he has control over.
Tom
This will also get every other image on the page, which could be LOTS of images - your problem has then changed to 'Out of a load of URLs, how do I find the ones that I'm interested in?'
belugabob
:) Yeah, got it. Answer changed, friends. Give it a try. Two more downvotes and I'll delete it. :)
TheMachineCharmer
A: 

You have a Python question in your profile, so I'll just say that if I were trying to do this, I'd go with Python and Beautiful Soup. It has the added advantage of being able to handle invalid HTML.
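For example, a rough sketch with Beautiful Soup (the page URL and the "webcam" filename filter are assumptions - substitute whatever actually identifies the image on the real page):

```python
import urllib.request

from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

def find_webcam_url(html, marker):
    """Return the src of the first <img> whose src contains marker.

    marker is a hypothetical distinguishing substring (e.g. "webcam");
    adjust the filter to match whatever is stable in the real markup.
    """
    soup = BeautifulSoup(html, "html.parser")
    for img in soup.find_all("img"):
        src = img.get("src", "")
        if marker in src:
            return src
    return None

def fetch_webcam_url(page_url, marker):
    # page_url is a placeholder for the page hosting the webcam image.
    with urllib.request.urlopen(page_url) as resp:
        return find_webcam_url(resp.read().decode("utf-8", errors="replace"), marker)
```

The parser-based approach keeps working even if attribute order or whitespace changes, which is where hand-rolled regexes tend to break.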

Tom