I'm trying to take a URL entered by a user and determine whether it points to an image or a video.

Example use cases:

When a user pastes in the URL of a YouTube video, on save the page automatically displays the embedded YouTube player.

When a user pastes in the URL of a picture on Flickr, on save the page automatically displays a smaller version of the Flickr image.

Btw: I'm working in Grails and Java.

+3  A: 

Hit the link and inspect the Content-Type header. If the result is an HTML page, you could look for the largest image or embedded Flash file on the page and display that.

Sam
Thanks! This is indeed a very helpful clue.
Seymour Cakes
Most definitely, I was about to suggest the same. As for YouTube, you will get redirected to another place, so you should use an HTTP client to follow the redirects. +1
OscarRyz
Instead of doing a full GET request and downloading the file, you might try issuing a HEAD request. That should return only the HTTP headers, including the MIME type. Here's the spec: http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
Tim Howland
+6  A: 

You can fetch the URL and read the Content-Type header from the response.

You can use Apache HttpClient: it fetches the content of the URL and can also follow the redirects for you. For instance, try to fetch the following:

http://www.youtube.com/watch?v=d4LkTstvUL4

This returns an HTML page containing the video. With some digging you'll find that the video itself is here:

http://www.youtube.com/v/d4LkTstvUL4

But if you fetch that page you will get a redirect:

HTTP/1.0 302 Redirect
Date: Fri, 23 Jan 2009 02:25:37 GMT
Content-Type: text/plain
Expires: Fri, 23 Jan 2009 02:25:37 GMT
Cache-Control: no-cache
Server: Apache
X-Content-Type-Options: nosniff
Set-Cookie: VISITOR_INFO1_LIVE=sQc75zc-QSU; path=/; domain=.youtube.com; expires=
Set-Cookie: VISITOR_INFO1_LIVE=sQc75zc-QSU; path=/; domain=.youtube.com; expires=
Location: http://www.youtube.com/swf/l.swf?swf=http%3A//s.ytimg.com/yt/swf/cps-vf
L4&rel=1&eurl=&iurl=http%3A//i1.ytimg.com/vi/d4LkTstvUL4/hqdefault.jpg&sk=Z_TM3JF
e_get_video_info=1&load_modules=1

So what you have to do is fetch the URL and examine the response, following each redirect until you get the final content.

This section explains how to handle the redirects.
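For illustration, here is a minimal sketch of following the redirects by hand using only the JDK's `HttpURLConnection` rather than Apache HttpClient. The class name `RedirectFollower`, the hop limit, and the timeouts are assumptions made for this example, and it uses HEAD requests, which some servers may not answer the same way as GET.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: follows Location headers manually until a non-redirect response.
final class RedirectFollower {
    private static final int MAX_HOPS = 5; // arbitrary cap (assumption) to avoid redirect loops

    /** True for the status codes that normally carry a Location header to follow. */
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
            || status == 307 || status == 308;
    }

    /** Resolves a Location value (absolute or relative) against the URL that returned it. */
    static String resolve(String currentUrl, String location) {
        return URI.create(currentUrl).resolve(location).toString();
    }

    /** Returns the first URL in the chain that answers without redirecting. */
    static String finalUrl(String startUrl) throws IOException {
        String url = startUrl;
        for (int hop = 0; hop <= MAX_HOPS; hop++) {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            try {
                conn.setInstanceFollowRedirects(false); // we want to see each hop ourselves
                conn.setRequestMethod("HEAD");
                conn.setConnectTimeout(5000);
                conn.setReadTimeout(5000);
                int status = conn.getResponseCode();
                String location = conn.getHeaderField("Location");
                if (!isRedirect(status) || location == null) {
                    return url; // final content reached
                }
                url = resolve(url, location);
            } finally {
                conn.disconnect();
            }
        }
        throw new IOException("Too many redirects starting from " + startUrl);
    }
}
```

Calling `RedirectFollower.finalUrl("http://www.youtube.com/v/d4LkTstvUL4")` would walk the chain shown above; what it returns depends on how the server responds today.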

OscarRyz
+6  A: 

Issue an HTTP HEAD request so you can examine the HTTP headers that come back without having to first download the entire document. Here is a non-programmatic example under Linux using curl:

$ curl --head http://stackoverflow.com/Content/Img/stackoverflow-logo-250.png
HTTP/1.1 200 OK
Cache-Control: max-age=28800
Content-Length: 3428
Content-Type: image/png
Last-Modified: Fri, 16 Jan 2009 09:35:30 GMT
Accept-Ranges: bytes
ETag: "98f590c5bd77c91:0"
Server: Microsoft-IIS/7.0
Date: Fri, 23 Jan 2009 03:55:39 GMT

You can see from the Content-Type that this is an image. From Java, you can use Apache HttpClient to issue the HTTP HEAD request.

If you want to download the content anyway, just issue an HTTP GET (using HttpClient) and use the same Content-Type header to determine the content type.
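As a rough Java counterpart to the curl example, the sketch below sends a HEAD request with the JDK's built-in `HttpURLConnection` (swapped in here instead of Apache HttpClient to keep the example dependency-free) and sorts the Content-Type into a few buckets. The class name `MediaTypeProbe`, the `Kind` enum, and the timeouts are assumptions for illustration only.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Locale;

// Hypothetical helper: classifies a URL by the Content-Type of a HEAD response.
final class MediaTypeProbe {
    enum Kind { IMAGE, VIDEO, HTML, OTHER }

    /** Classifies a raw Content-Type value such as "text/html; charset=utf-8". */
    static Kind classify(String contentType) {
        if (contentType == null) {
            return Kind.OTHER;
        }
        // Drop any parameters after ';' and normalise the case.
        String type = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        if (type.startsWith("image/")) {
            return Kind.IMAGE;
        }
        if (type.startsWith("video/")) {
            return Kind.VIDEO;
        }
        if (type.equals("text/html") || type.equals("application/xhtml+xml")) {
            return Kind.HTML; // a page; it may still embed a player or an image
        }
        return Kind.OTHER;
    }

    /** Sends a HEAD request and classifies the Content-Type header that comes back. */
    static Kind probe(String url) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestMethod("HEAD"); // headers only, no body download
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            return classify(conn.getContentType());
        } finally {
            conn.disconnect();
        }
    }
}
```

For the logo URL above, `MediaTypeProbe.probe(...)` would be expected to report an image as long as the server still answers with `image/png`. A YouTube watch page comes back as HTML, so it needs the extra handling described in the other answers.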

Eddie
excellent point, especially if it's a 100MB video.
Jason S
A: 

I suggest using curl with a range header to allow you to inspect the file type itself.

$ curl -s -v -r0-499 -o test http://stackoverflow.com/content/img/so/logo.png
* About to connect() to stackoverflow.com port 80 (#0)
*   Trying 69.59.196.211... connected
* Connected to stackoverflow.com (69.59.196.211) port 80 (#0)
> GET /content/img/so/logo.png HTTP/1.1
> Range: bytes=0-499
> User-Agent: curl/7.19.4 (i386-apple-darwin9.6.0) libcurl/7.19.4 zlib/1.2.3
> Host: stackoverflow.com
> Accept: */*
> 
< HTTP/1.1 206 Partial Content
< Cache-Control: max-age=604800
< Content-Type: image/png
< Content-Range: bytes 0-499/3438
< Last-Modified: Fri, 05 Jun 2009 06:52:35 GMT
< Accept-Ranges: bytes
< ETag: "25dd4b35aae5c91:0"
< Server: Microsoft-IIS/7.0
< Date: Fri, 19 Jun 2009 19:39:43 GMT
< Content-Length: 500
< 
{ [data not shown]
* Connection #0 to host stackoverflow.com left intact
* Closing connection #0

Then execute:

$ file test
test: PNG image data, 250 x 61, 8-bit colormap, non-interlaced

Now you know the MIME type (image/png), the file size (3438 bytes), and that the file is a 250 x 61 color PNG image.
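Since the question mentions Java, here is a hedged sketch of the same idea without curl or `file`: request only the first bytes with a Range header, then look at the well-known signatures of PNG, JPEG and GIF. The class `MagicSniffer` and its method names are made up for this example, and it recognises only those three formats, far fewer than `file` does.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper: guesses an image type from the leading bytes of a file.
final class MagicSniffer {
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns a MIME type for PNG, JPEG or GIF data, or null if the signature is unknown. */
    static String sniff(byte[] head) {
        if (startsWith(head, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) {
            return "image/png";
        }
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) {
            return "image/jpeg";
        }
        if (startsWith(head, 'G', 'I', 'F', '8', '7', 'a')
                || startsWith(head, 'G', 'I', 'F', '8', '9', 'a')) {
            return "image/gif";
        }
        return null;
    }

    /** Reads {width, height} from a PNG IHDR chunk (bytes 16..23), or null if not a PNG. */
    static int[] pngDimensions(byte[] head) {
        if (!"image/png".equals(sniff(head)) || head.length < 24) {
            return null;
        }
        ByteBuffer buf = ByteBuffer.wrap(head, 16, 8); // big-endian by default
        return new int[] { buf.getInt(), buf.getInt() };
    }

    /** Fetches at most maxBytes from the start of the resource using a Range request. */
    static byte[] fetchHead(String url, int maxBytes) throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            conn.setRequestProperty("Range", "bytes=0-" + (maxBytes - 1));
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            byte[] buf = new byte[maxBytes];
            int total = 0;
            try (InputStream in = conn.getInputStream()) {
                int n;
                // Stop at maxBytes even if the server ignores Range and sends everything.
                while (total < maxBytes
                        && (n = in.read(buf, total, maxBytes - total)) != -1) {
                    total += n;
                }
            }
            return Arrays.copyOf(buf, total);
        } finally {
            conn.disconnect();
        }
    }
}
```

A call such as `MagicSniffer.sniff(MagicSniffer.fetchHead(url, 500))` mirrors the `curl -r0-499` plus `file` steps above. Checking the bytes is useful when a server sends a wrong or generic Content-Type.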

brianegge
A: 

Fast video indexer is a video capture software that can capture video frames automatically from a list of videos and create index web pages, index pictures or a list of images.