Have you ever submitted a link in your Facebook status? When you do, Facebook does something very nice: it fetches a title, a summary, and a bunch of relevant images from that page, and you can choose one of them as a thumbnail.
I need something like that right now. Is there any open-source code that does this? (It needs to be in Python, because it's a Python app I'm working on.) Or maybe just a guide or a blog post on the subject? I would really like to learn from other people's experience with this.
Given the URL of a web page, I want to get:
- The title: probably just the `<title>` tag, but possibly the `<h1>`, not sure.
- A one-paragraph summary of the page.
- A bunch of relevant images that could be used as a thumbnail. (The tricky part is to filter out irrelevant images like banners or rounded corners.)
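To make the three requirements concrete, here is a rough sketch of what I have in mind, using only the standard library. Everything in it is an assumption on my part, not a known-good method: the names `PreviewParser`, `looks_relevant`, and `extract_preview` are made up, the size thresholds are guesses, and the image filter only looks at `width`/`height` attributes in the markup (it never downloads the image). It also prefers Open Graph `<meta>` tags (`og:title`, `og:description`, `og:image`) when a page provides them, and falls back to `<title>`, `<h1>`, and the first `<p>` otherwise.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


def looks_relevant(attrs, min_side=100, max_ratio=3.0):
    """Heuristic filter: drop tiny images and banner-shaped ones.

    Only the width/height attributes are checked; thresholds are guesses.
    """
    try:
        width, height = int(attrs["width"]), int(attrs["height"])
    except (KeyError, ValueError, TypeError):
        return True  # size unknown: keep it as a candidate
    if min(width, height) < min_side:
        return False  # spacers, icons, rounded corners, thin banners
    return max(width, height) / min(width, height) <= max_ratio


class PreviewParser(HTMLParser):
    """Collects title, summary, and image candidates from an HTML string."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.meta = {}  # first value seen for each <meta> name/property
        self.text = {"title": "", "h1": "", "p": ""}
        self.images = []  # absolute URLs of images that passed the filter
        self._current = None  # tag whose text is currently being captured

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content"):
                self.meta.setdefault(key.lower(), attrs["content"].strip())
        elif tag == "img" and attrs.get("src"):
            if looks_relevant(attrs):
                self.images.append(urljoin(self.base_url, attrs["src"]))
        elif tag in self.text:
            # Capture only the first non-empty <title>, <h1>, and <p>.
            self._current = tag if not self.text[tag].strip() else None

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.text[self._current] += data


def extract_preview(html, base_url):
    """Return a dict with 'title', 'summary', and 'images' for a page."""
    parser = PreviewParser(base_url)
    parser.feed(html)
    parser.close()
    meta, text = parser.meta, parser.text
    images = list(parser.images)
    if "og:image" in meta:
        # The page's own suggested thumbnail goes first.
        images.insert(0, urljoin(base_url, meta["og:image"]))
    first_paragraph = " ".join(text["p"].split())
    return {
        "title": meta.get("og:title") or text["title"].strip() or text["h1"].strip(),
        "summary": meta.get("og:description") or meta.get("description") or first_paragraph,
        "images": images,
    }
```

The HTML itself could come from something like `urllib.request.urlopen(url).read()`, decoded to a string, and then be passed in as `extract_preview(html, url)`. I suspect a real solution would need to inspect the actual image dimensions and pick a better summary than the first paragraph, which is exactly the part I'd like advice on.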
I may have to implement it myself, but I would at least like to know how other people have approached these kinds of tasks.