I'm getting most of the music on Rap Exegesis from YouTube (in the form of embedded players). Unfortunately, there's always the risk that one of the videos I'm using will be taken down (due to copyright issues or whatever), thereby breaking the corresponding page on my site.

Ideally I would have a cron job that checks (nightly, say) whether any videos have been removed and notifies me. What's the best way to do this?

+9  A: 

The information you need is available via the YouTube API, specifically in the yt:state tag.

Depending on what language you are programming in, there is plenty of code around for interacting with the YouTube API.
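As a rough illustration in Ruby, the sketch below pulls the yt:state element out of a video entry's XML using REXML. The helper name video_state is made up for this example, and the name and reasonCode attributes are an assumption about how the feed marks up the tag, so check them against a real response before relying on them.

```ruby
require 'rexml/document'

# Namespace URI that the yt: prefix is bound to in GData feeds.
YT_NS = 'http://gdata.youtube.com/schemas/2007'

# Hypothetical helper: takes the XML of a single video entry and returns
# nil when there is no yt:state tag, or a hash describing the state.
def video_state(entry_xml)
  doc = REXML::Document.new(entry_xml)
  state = REXML::XPath.first(doc, '//yt:state', 'yt' => YT_NS)
  return nil if state.nil?

  {
    name: state.attributes['name'],              # assumed attribute
    reason_code: state.attributes['reasonCode'], # assumed attribute
    message: state.text.to_s.strip
  }
end

# Possible usage (network call, so left commented out):
#   require 'open-uri'
#   xml = URI.open("http://gdata.youtube.com/feeds/api/videos/#{video_id}").read
#   p video_state(xml)
```

Returning the whole state rather than a boolean lets the caller decide which states actually count as "broken" for an embedded player.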

Post here with more details if you are still having issues getting this to work.

seengee
To be specific, you have to look for the yt:state tag in the XML document located at "http://gdata.youtube.com/feeds/api/videos/#{video_id}"
Horace Loeb
+1  A: 

A hacky way to do it would be to use cURL to fetch the HTML of the video page you are wondering about, and then look for the error-box div that shows up at the top saying the video has been removed. If it exists and it's visible, the video has probably been removed.

Hacky, but I betcha it would work.

Jakobud
Well, it would work today. But don't build something that relies on a div tag unless YouTube says that you can trust it. It seems too easy for that to change. I'd definitely recommend using the API instead.
marcc
I definitely agree. Just wanted to throw out another option, even though it's a total hack.
Jakobud
+2  A: 

In addition to checking the yt:state tag, keep in mind that the uploader may not allow the video to be embedded. If the list of songs on the front page comes from a playlist that you maintain on YouTube, for example, you can make sure you only get embeddable videos by including the "&format=5" parameter when retrieving the list. E.g.

http://gdata.youtube.com/feeds/api/playlists/8BCDD04DE8F771B2?v=2&format=5

Also, if you are worried about country-level restrictions, then use the "&restriction=[two-letter country code]" parameter.

See the 'Developer's Guide: Data API Protocol – API Query Parameters'.
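For what it's worth, here is a small Ruby sketch of building that feed URL with the two parameters mentioned above. The helper name playlist_feed_url and its country keyword are invented for this example; only the URL shape and the format/restriction parameters come from this answer.

```ruby
require 'uri'

# Hypothetical helper: builds the playlist feed URL, always asking for
# embeddable videos (format=5) and optionally filtering by country.
def playlist_feed_url(playlist_id, country: nil)
  params = { 'v' => '2', 'format' => '5' }
  params['restriction'] = country if country
  "http://gdata.youtube.com/feeds/api/playlists/#{playlist_id}?#{URI.encode_www_form(params)}"
end

# Possible usage:
#   playlist_feed_url('8BCDD04DE8F771B2')
#   playlist_feed_url('8BCDD04DE8F771B2', country: 'DE')
```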

Rafe Lavelle
A: 

As @seengee says, the "right" way to do this is to look for the yt:state tag in the XML representation of a YouTube video, which you get via the YouTube API.

To get this XML representation, you GET http://gdata.youtube.com/feeds/api/videos/VIDEO_ID (more details here). So implementing this check should be as easy as:

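require 'open-uri'
require 'hpricot'

# True when the API entry has no yt:state tag (blank? assumes ActiveSupport, e.g. Rails).
def valid_embed_link?
  doc = Hpricot(open("http://gdata.youtube.com/feeds/api/videos/#{youtube_video_id}"))
  doc.at('yt:state').blank?
end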

Unfortunately this yields false positives, i.e. it flags videos that actually play fine. For example, http://www.youtube.com/watch?v=MX6rC1krGp0 plays fine, but http://gdata.youtube.com/feeds/api/videos/MX6rC1krGp0 contains a yt:state tag. Therefore, I've gone with this hackier method:

# True when the watch page shows no alert box (relies on YouTube's current markup).
def valid_embed_link?
  doc = Hpricot(open("http://www.youtube.com/watch?v=#{youtube_video_id}"))
  doc.at('.yt-alert-content').blank?
end
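
To run a check like this nightly, something along these lines could work. It is only a sketch: broken_video_ids and report are hypothetical helper names, the check itself is passed in as a block, and a fetch or parse error is treated as "needs a look" rather than as proof the video is gone.

```ruby
# Hypothetical helper: returns the IDs for which the supplied check
# (the block) returns false or raises an error.
def broken_video_ids(video_ids)
  video_ids.reject do |id|
    begin
      yield(id)
    rescue StandardError
      false # could not verify the video, so report it as well
    end
  end
end

# Hypothetical helper: builds the notification text, or nil when
# every video passed the check.
def report(broken)
  return nil if broken.empty?

  "#{broken.size} YouTube video(s) need attention:\n" +
    broken.map { |id| "http://www.youtube.com/watch?v=#{id}" }.join("\n")
end

# Possible usage in a script run from cron:
#   message = report(broken_video_ids(all_ids) { |id| check(id) })
#   puts message if message
```

Since cron normally mails whatever a job prints to the address in MAILTO, printing the report only when something is broken may be all the notification you need.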
Horace Loeb