views: 442
answers: 4
I want to use PHP (possibly with cURL/XPath?) to extract data from Wikipedia pages. What would be the best way to go about this? I'll be using CakePHP for this project, although I just need to figure out how to get this working first.

A: 

This has been asked before; see http://stackoverflow.com/questions/627594/is-there-a-wikipedia-api, where a few options for interacting with Wikipedia are listed.
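
If you go the API route, a plain cURL call against the MediaWiki web API is usually all you need. Here is a minimal sketch (not code from that thread), assuming the standard api.php endpoint and its extracts module; check the parameters against the current API documentation:

    <?php
    // Minimal sketch: ask the MediaWiki API for the plain-text intro of an
    // article via cURL. The extracts parameters are assumptions to verify
    // against the current API docs.
    function fetchWikipediaIntro($title)
    {
        $params = array(
            'action'      => 'query',
            'prop'        => 'extracts',
            'exintro'     => 1,
            'explaintext' => 1,
            'format'      => 'json',
            'titles'      => $title,
        );
        $url = 'https://en.wikipedia.org/w/api.php?' . http_build_query($params);

        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        // Wikipedia asks API clients to send a descriptive User-Agent.
        curl_setopt($ch, CURLOPT_USERAGENT, 'MyCakePHPApp/0.1 (you@example.com)');
        $json = curl_exec($ch);
        curl_close($ch);

        if ($json === false) {
            return null;
        }

        $data  = json_decode($json, true);
        $pages = isset($data['query']['pages']) ? $data['query']['pages'] : array();
        $page  = reset($pages);  // the result is keyed by page id; take the first
        return isset($page['extract']) ? $page['extract'] : null;
    }

    echo fetchWikipediaIntro('PHP');

json_decode then gives you a normal PHP array, which is easy to hand off to a CakePHP model or controller.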

Alistair
A: 

Alternatively, you can download snapshots of the Wikipedia database and work with them locally on your own disk. Depending on how much data you need, this may be the better solution.

You can find the Wikipedia database snapshots at: http://dumps.wikimedia.org/
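
If you take this route, you can stream through a downloaded dump with XMLReader instead of loading it all at once. A minimal sketch, assuming an English pages-articles dump and PHP's bz2 extension (the file name is only an example):

    <?php
    // Minimal sketch, assuming a pages-articles dump has already been
    // downloaded from http://dumps.wikimedia.org/ (the file name below is
    // only an example) and that the bz2 extension is available. XMLReader
    // streams the XML, so the multi-gigabyte dump never has to fit in memory.
    $dump = 'compress.bzip2://enwiki-latest-pages-articles.xml.bz2';

    $reader = new XMLReader();
    $reader->open($dump);

    $doc   = new DOMDocument();
    $count = 0;

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->localName === 'page') {
            // Pull a single <page> element into the DOM so it can be inspected.
            $page  = $doc->importNode($reader->expand(), true);
            $title = $page->getElementsByTagName('title')->item(0)->nodeValue;
            echo $title, "\n";

            if (++$count >= 10) {  // stop early while experimenting
                break;
            }
        }
    }

    $reader->close();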

Svisstack
A: 

Several options (search on Google for them):
1. DBpedia (a minimal query sketch follows this list)
2. Freebase Wikipedia Extraction (WEX)
3. There is a Wikipedia link dataset as well
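
For option 1, DBpedia exposes the extracted Wikipedia data through a public SPARQL endpoint (http://dbpedia.org/sparql). A minimal sketch, assuming that endpoint and the http://dbpedia.org/ontology/abstract property; verify both against the DBpedia documentation:

    <?php
    // Minimal sketch of option 1: query DBpedia's public SPARQL endpoint
    // for the English abstract of a resource. The endpoint URL and the
    // abstract property are assumptions to verify against DBpedia's docs.
    $query = 'SELECT ?abstract WHERE { '
           . '<http://dbpedia.org/resource/PHP> <http://dbpedia.org/ontology/abstract> ?abstract . '
           . 'FILTER (lang(?abstract) = "en") }';

    $url = 'http://dbpedia.org/sparql?' . http_build_query(array('query' => $query));

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept: application/sparql-results+json'));
    $json = curl_exec($ch);
    curl_close($ch);

    if ($json !== false) {
        $data = json_decode($json, true);
        // Standard SPARQL JSON results: one binding per matching row.
        foreach ($data['results']['bindings'] as $row) {
            echo $row['abstract']['value'], "\n";
        }
    }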

RandomVector
+1  A: 

You can fetch some data with this PHP function that uses cURL:

http://www.barattalo.it/2010/08/29/php-bot-to-get-wikipedia-definitions/
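
In the same spirit (this is not the code from that link), here is a minimal cURL plus DOMXPath sketch that fetches a rendered article and pulls out its first paragraph; Wikipedia's HTML markup changes over time, so treat the XPath expression as an illustration only:

    <?php
    // Minimal sketch: fetch a rendered article with cURL and extract the
    // first paragraph with DOMXPath. The div id used below reflects
    // Wikipedia's markup and may change, so verify it before relying on it.
    function fetchFirstParagraph($title)
    {
        $ch = curl_init('https://en.wikipedia.org/wiki/' . rawurlencode($title));
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_USERAGENT, 'MyCakePHPApp/0.1 (you@example.com)');
        $html = curl_exec($ch);
        curl_close($ch);

        if ($html === false) {
            return null;
        }

        $doc = new DOMDocument();
        libxml_use_internal_errors(true);  // real-world HTML is rarely valid XML
        $doc->loadHTML($html);
        libxml_clear_errors();

        $xpath = new DOMXPath($doc);
        $nodes = $xpath->query('//div[@id="mw-content-text"]//p');

        return $nodes->length ? trim($nodes->item(0)->textContent) : null;
    }

    echo fetchFirstParagraph('CakePHP');

Scraping the rendered HTML is more fragile than querying the API, so it is mainly useful when you need the rendered page itself.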

Pons