Take this for example: http://en.wikipedia.org/wiki/United_States_Bill_of_Rights

Under the "Amendments" section, I want to get what each bullet point says and display the results in a nice list in an Android application. I know there's a Wikimedia API, but I have no idea how to use it. From what I understand, you can get the text under a section, but I'm not sure whether you can get each bullet point separately.

What would be the best way to do this? Or should I instead just spend my time copying the text from over 300 pages into a text file and reading that in the application?

+1  A: 

I am sure you would have already thought of this:

  • If your goal is to display the wiki page in your app, you can use a WebView.
  • If your goal is to capture specific data elements, you can download the HTML page and string-process it (div >> ol/ul >> li).
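
For the second option, a rough sketch of the string processing might look like the following. It assumes the HTML has already been downloaded into a String and that the list items are not nested; the class and method names (ListItemExtractor, extractListItems) are made up for illustration. Regular expressions are fragile against real-world HTML, so treat this as a starting point rather than a robust solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the visible text out of each <li> element.
final class ListItemExtractor {
    // Assumption: list items are not nested, so a non-greedy match is enough.
    private static final Pattern LI = Pattern.compile(
            "<li[^>]*>(.*?)</li>", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    static List<String> extractListItems(String html) {
        List<String> items = new ArrayList<>();
        Matcher m = LI.matcher(html);
        while (m.find()) {
            // Drop any inner tags (links, bold, ...) and keep only the text.
            String text = m.group(1).replaceAll("<[^>]+>", "").trim();
            if (!text.isEmpty()) {
                items.add(text);
            }
        }
        return items;
    }

    public static void main(String[] args) {
        // Made-up sample markup, not copied from the real page.
        String html = "<ul><li><a href=\"/x\">First item</a>: sample text</li>"
                + "<li>Second item: sample text</li></ul>";
        for (String item : extractListItems(html)) {
            System.out.println(item);
        }
    }
}
```

To limit the result to one section, you would first have to cut the HTML down to the part between that section's heading and the next one.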
Sameer Segal
+2  A: 

This link uses the MediaWiki API to query the page from your question (based on this wiki article):

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&titles=United_States_Bill_of_Rights

As you can see, it returns an XML document, and the page text is found under the <rev> tag. It is the plain editor text in MediaWiki markup.
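
As a minimal sketch, the text under the <rev> tag could be pulled out with the standard DOM parser, assuming the XML response is already in a String (the download itself is left out here). The class and method names (RevisionText, extractWikitext) and the sample XML are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper: returns the text of the first <rev> element.
final class RevisionText {
    static String extractWikitext(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList revs = doc.getElementsByTagName("rev");
            // No <rev> element means the page was not found or has no content.
            return revs.getLength() == 0 ? null : revs.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse API response", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample that only mimics the overall shape of the response.
        String sample = "<api><query><pages><page><revisions>"
                + "<rev>* first bullet\n* second bullet</rev>"
                + "</revisions></page></pages></query></api>";
        System.out.println(extractWikitext(sample));
    }
}
```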

So to extract the information from this text, you should use a parser. Here is a list of alternative parsers; some are written in Java.
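
If a full parser turns out to be more than you need, a very rough sketch for just the bullet points could look like this. It relies on the fact that bullet items in MediaWiki markup are lines starting with "*", and it only strips [[link]] and bold/italic quote markup; templates, references, and nested markup are not handled. The class and method names (WikitextBullets, extractBullets) are made up for illustration, and this is not a replacement for a real parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects the bullet lines from MediaWiki markup.
final class WikitextBullets {
    static List<String> extractBullets(String wikitext) {
        List<String> bullets = new ArrayList<>();
        for (String line : wikitext.split("\\r?\\n")) {
            if (!line.startsWith("*")) {
                continue; // not a bullet item
            }
            // Remove the leading asterisks and the whitespace after them.
            String text = line.replaceFirst("^\\*+\\s*", "");
            // [[Target|label]] becomes "label".
            text = text.replaceAll("\\[\\[[^\\]|]*\\|([^\\]]*)\\]\\]", "$1");
            // [[Target]] becomes "Target".
            text = text.replaceAll("\\[\\[([^\\]]*)\\]\\]", "$1");
            // Drop the quote runs used for '''bold''' and ''italic''.
            text = text.replaceAll("'{2,}", "");
            bullets.add(text.trim());
        }
        return bullets;
    }

    public static void main(String[] args) {
        // Made-up sample markup, not copied from the real article.
        String sample = "Intro paragraph\n"
                + "* '''[[Some long page title|First item]]''': sample text\n"
                + "* [[Plain]] item\n"
                + "Closing paragraph";
        for (String bullet : extractBullets(sample)) {
            System.out.println(bullet);
        }
    }
}
```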

Andreas_D
Thank you! I tested it and it works perfectly.
magicman