I'm embedding Wikipedia pages in my app, and I'd like to show the same simplified abstract that Google Earth shows. (It gives the first several paragraphs and a link to the full content, without any serious layout.)

I know about the printable=true option, but that's not what I'm looking for.

+4  A: 

You might want to consider using the API: you can grab a "text" version of any article. Afterwards, it is up to you to extract the summary.

Another option is just to request the page in raw format:

Raw (wikitext) page processing: sending an action=raw or an action=raw&templates=expand GET request to index.php will give the unprocessed wikitext source code of a page.

E.g.

http://en.wikipedia.org/wiki/Main_Page?action=raw

Of course, you'll need to do a bit of scraping. Going through the API might prove more efficient, as you have better control over what you pull from the database directly (wikitext, if you wish).
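
To illustrate the raw-format route, here is a minimal Python sketch. It assumes index.php is served at https://en.wikipedia.org/w/index.php, and the helper names (raw_url, fetch_raw, lead_section) are made up for this example. It only cuts the wikitext at the first section heading; templates and link markup are left in place.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: Wikipedia serves index.php under /w/.
INDEX_PHP = "https://en.wikipedia.org/w/index.php"


def raw_url(title, expand_templates=False):
    """Build a URL that asks index.php for a page's unprocessed wikitext."""
    params = {"title": title, "action": "raw"}
    if expand_templates:
        params["templates"] = "expand"
    return INDEX_PHP + "?" + urlencode(params)


def fetch_raw(title):
    """Download the wikitext of a page (needs network access)."""
    # The User-Agent value is a placeholder; use one that identifies your app.
    req = Request(raw_url(title), headers={"User-Agent": "abstract-demo/0.1"})
    with urlopen(req) as resp:
        return resp.read().decode("utf-8")


def lead_section(wikitext):
    """Return everything before the first '== Heading ==' line."""
    lines = []
    for line in wikitext.splitlines():
        if line.startswith("=="):
            break
        lines.append(line)
    return "\n".join(lines).strip()
```

Calling lead_section(fetch_raw("Some_Article")) would give you the intro as wikitext, which still needs its markup stripped before display.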

jldupont
FYI - Broken link
Greg
jldupont
I was hoping for a magic keyword (like printable)...but thanks.
koops
A: 

Did you look at the Wikipedia API? MediaWiki (and so Wikipedia) has a feature-rich and flexible API, which is documented at http://www.mediawiki.org/wiki/API

eyazici
A: 

Use the MediaWiki API with action=query and prop=revisions to fetch a given revision, remove the wikitext markup (images, infoboxes), and extract the content of the first sentence.
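
A rough Python sketch of those three steps follows. The function names are invented for this example, the regexes are a crude approximation of wikitext cleanup rather than a real parser, and the response layout in fetch_wikitext assumes format=json with formatversion=2 and rvslots=main. The sentence splitter is naive and will trip over abbreviations.

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://en.wikipedia.org/w/api.php"


def revision_url(title):
    """Build an action=query / prop=revisions URL for the latest revision."""
    params = {
        "action": "query", "prop": "revisions", "rvprop": "content",
        "rvslots": "main", "titles": title,
        "format": "json", "formatversion": "2",
    }
    return API + "?" + urlencode(params)


def fetch_wikitext(title):
    """Fetch the wikitext via the API (needs network access)."""
    # The User-Agent value is a placeholder; use one that identifies your app.
    req = Request(revision_url(title), headers={"User-Agent": "abstract-demo/0.1"})
    with urlopen(req) as resp:
        data = json.load(resp)
    # Assumed layout for formatversion=2 with rvslots=main.
    return data["query"]["pages"][0]["revisions"][0]["slots"]["main"]["content"]


def strip_markup(text):
    """Crudely remove references, templates, images, and link syntax."""
    text = re.sub(r"<ref[^>]*/>|<ref[^>]*>.*?</ref>", "", text, flags=re.S)
    # Remove templates (infoboxes etc.) from the innermost outwards.
    previous = None
    while previous != text:
        previous = text
        text = re.sub(r"\{\{[^{}]*\}\}", "", text)
    # [[target|label]] -> label and [[target]] -> target, skipping images.
    text = re.sub(r"\[\[(?!(?:File|Image):)(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]", r"\1", text)
    # Drop image links, whose captions are now free of nested links.
    text = re.sub(r"\[\[(?:File|Image):[^\[\]]*\]\]", "", text)
    text = re.sub(r"'{2,}", "", text)  # bold and italic quote runs
    return re.sub(r"\s+", " ", text).strip()


def first_sentence(text):
    """Return the text up to the first '.', '!' or '?' followed by a space or the end."""
    match = re.search(r"(.+?[.!?])(?:\s|$)", text)
    return match.group(1) if match else text
```

With network access, first_sentence(strip_markup(fetch_wikitext("Some_Article"))) chains the three steps together.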

Pierre