views: 109

answers: 1

I posted a URL to a blog post (http://www.autoblog.com/2009/06/22/we-are-all-bumblebee-beijing-transformers-fans-gather-to-celebr/) in a Facebook message, and Facebook inlined the title and abbreviated text as if it had fetched them from the RSS feed (http://www.autoblog.com/rss.xml). But by the time I submitted the link, the blog post had already expired out of the feed - I checked.

see this screenshot: http://i43.tinypic.com/nwbu4m.jpg

Is it using a FeedBurner search? How can something similar be accomplished?

cheers

+3  A: 

I think they do some advanced scraping, looking for the most significant blocks of data in the HTML and using those. Basically, they analyze everything quickly, toss out ads and the like, and keep the big blobs of data.

Digg does similar things as well.

Here is how I would implement it:

  1. Scan for meta tags, rss feed tags, and the title tag.
  2. Find large "areas" with a lot of content, including p tags. Weight or grade them on the likelihood of their being content. Look for keywords in CSS classes/ids (e.g. rate "content" higher than "ads" or "navigation").
  3. Look for large images
  4. Store information about the site for future use and improved heuristics

All of this is likely done server-side and served to the browser via AJAX.
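The steps above could be sketched roughly like this, using only Python's standard library. The keyword lists, scoring weights, and class names are illustrative assumptions, not Facebook's actual heuristics:

```python
from html.parser import HTMLParser

# Hypothetical keywords that make a block look like content vs. chrome/ads.
BOOST = ("content", "article", "post", "story")
PENALTY = ("ad", "ads", "nav", "navigation", "sidebar", "footer")

class BlockScorer(HTMLParser):
    """Collects the text inside each <div>/<p> and grades the blocks."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self._in_title = False
        self._stack = []   # open blocks: (class/id hint, [text chunks])
        self.blocks = []   # closed blocks: (score, text)

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("div", "p"):
            hint = " ".join((v or "") for k, v in attrs if k in ("class", "id"))
            self._stack.append((hint.lower(), []))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("div", "p") and self._stack:
            hint, chunks = self._stack.pop()
            text = " ".join(c for c in chunks if c)
            score = len(text)              # bigger blobs of text score higher
            if any(word in hint for word in BOOST):
                score *= 2                 # e.g. class="content" -> promote
            if any(word in hint for word in PENALTY):
                score //= 10               # e.g. class="ads" -> demote
            if text:
                self.blocks.append((score, text))

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        for _, chunks in self._stack:      # text counts toward every open block
            chunks.append(data.strip())

def preview(html):
    """Return (title, abbreviated text of the best-scoring block)."""
    scorer = BlockScorer()
    scorer.feed(html)
    best = max(scorer.blocks, default=(0, ""))
    return scorer.title.strip(), best[1][:160]
```

Feeding this a page with an ads div and a `class="content"` div returns the title tag plus the first 160 characters of the content block, which is roughly the title-plus-snippet shape Facebook shows.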

Daniel A. White
I think you're right, it's definitely served to the browser via Ajax (confirmed using Firebug). Certainly the server-side stuff is pretty complicated. For some pages that don't have big "blobs" of textual data, their algorithm seems to fall back to some simpler things, like <meta> tags. For example, for this link http://www.theweathernetwork.com/weather/caon0493 the <meta name="description"> is used.
Peter
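A minimal sketch of that kind of <meta> fallback, using only Python's standard library. The regex is illustrative (it assumes the common `name`-before-`content` attribute order) and is not how Facebook actually parses pages:

```python
import re

def meta_description(html):
    """Pull the content attribute from <meta name="description" ...>,
    or return None if the page has no such tag. Assumes name appears
    before content, which covers most real pages."""
    match = re.search(
        r'<meta\s[^>]*name=["\']description["\'][^>]*'
        r'content=["\']([^"\']*)["\']',
        html, re.IGNORECASE)
    return match.group(1) if match else None
```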
thanks for the suggestion. I was hoping screen scraping could be avoided - dang, that's not a fun thing to implement scalably.
john
I actually know someone that was working on something like this using part of WebKit.
Daniel A. White
