So, for example, the YouTube video ID from a YouTube page, a tweet ID from a Twitter page, or a Facebook uid from a Facebook profile...
views: 30
answers: 2
A:
You don't need an open source project for that. Lifting the ID from the page is usually a matter of parsing the URL that got you there. In YouTube's case, the "v" query string parameter holds the video ID. The other examples have similar answers.
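As a rough sketch of the YouTube case, assuming a watch-style URL that carries the ID in "v": the function name and the sample URL below are made up for illustration, and only the Python standard library is used.

```python
from urllib.parse import urlparse, parse_qs

def youtube_video_id(url):
    """Return the value of the "v" query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)  # maps each key to a list of values
    values = query.get("v")
    return values[0] if values else None

# Hypothetical URL, used only to show the call.
print(youtube_video_id("https://www.youtube.com/watch?v=abc123&t=10s"))
```

Other sites would need their own rule (a path segment instead of a query parameter, for instance), so this only covers the one pattern described above.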
Scott Stafford
2010-07-14 02:55:56
Scott, doing it for YouTube alone is easy. What if I want to do that for 100 site types?
David Haddad
2010-07-14 03:47:50
@David Haddad: Can you clarify your question then? You want a generic way to extract what exactly from arbitrary web pages? Just the identifying ID? Semantic information?
Scott Stafford
2010-07-14 04:17:24
@Scott Stafford It's kind of hard to explain. The main content of a page changes from one page type to another. So if you pass it the link to a tweet page, the main output would be the tweet_id, the twitterer, and the tweet text. It would vary from one site to the next: if you do the same with a YouTube video link, it would give you the YouTube video ID, title, etc.
David Haddad
2010-07-14 19:23:26
@David Haddad: I am pretty sure you're not going to find any project that is prewritten that just knows all the specific formats of every popular social networking/web 2.0 site and can parse it for you.
Scott Stafford
2010-07-15 02:54:49
A:
The oEmbed protocol specifies how to retrieve structured, relevant data for a page based on its URL. embed.ly is a company that provides an API based on that standard.
http://www.oembed.com/ http://embed.ly
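To give an idea of what an oEmbed exchange looks like, here is a minimal sketch. The endpoint `https://provider.example/oembed`, the page URL, and the JSON response are all invented for illustration; a real provider publishes its own endpoint, and embed.ly's API is not shown here. The `url` and `format` request parameters and the `type`, `version`, `title`, and `author_name` response fields come from the oEmbed spec. No network call is made.

```python
import json
from urllib.parse import urlencode

def oembed_request_url(endpoint, page_url, fmt="json"):
    """Build an oEmbed consumer request: <endpoint>?url=<page>&format=<fmt>."""
    return endpoint + "?" + urlencode({"url": page_url, "format": fmt})

# Hypothetical endpoint and page URL (assumptions, not a real provider).
ENDPOINT = "https://provider.example/oembed"
request_url = oembed_request_url(ENDPOINT, "https://provider.example/watch?v=abc123")

# Hand-written response in the shape the spec describes for a "video" resource.
sample_response = (
    '{"version": "1.0", "type": "video", "title": "Example clip", '
    '"author_name": "someone", "html": "<iframe></iframe>", '
    '"width": 480, "height": 270}'
)
data = json.loads(sample_response)

print(request_url)
print(data["type"], data["title"], data["author_name"])
```

The point is that the consumer never parses the page itself: it hands the URL to the provider's endpoint and reads the fields it needs from the response.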
David Haddad
2010-10-11 17:29:02