wikipedia

How to access Wikipedia

Hi, I want to access HTML content from Wikipedia, but it is showing access denied. How can I access Wikipedia? Please give some suggestions. ...

Get first lines of Wikipedia Article

I have a Wikipedia article and I want to fetch the first z lines (or the first x characters, or the first y words; it doesn't matter) from the article. The problem: I can get either the source wikitext (via the API) or the parsed HTML (via a direct HTTP request, possibly of the print version), but how can I find the first lines displayed? Normally t...

Are partially updated values when multithreading still a concern on modern CPUs?

From the Wikipedia article on Read-Copy-Update: The reason that it is safe to run the removal phase concurrently with readers is the semantics of modern CPUs guarantee that readers will see either the old or the new version of the data structure rather than a partially updated reference. Is this true for all modern CPUs (ARM, x86, ...

Truncate MediaWiki

Hi all, I'm working with the MediaWiki API (e.g. http://en.wikipedia.org/w/api.php) and I would like to be able to 'truncate' the MySQL tables in order to reset the local installation while keeping some tables (users, ?...). What would the SQL queries be? I would say: truncate all the tables but ${PREFIX}_user and update ${PREFIX}_us...

Scraping and Parsing a Wikipedia Page

Hey guys. I'm wondering if there are any existing libraries in or accessible from Objective-C that would allow me to scrape pages formatted like this one. Specifically, all of the dates and all of the text next to each date. If not, what would be the best way to go about doing this? Regular expressions? I heard that NSString might alread...

How does Wikipedia avoid duplicate entries?

How can websites as big as Wikipedia sort out duplicate entries? I need to know the exact procedure from the moment a user creates the duplicate entry onward. If you don't know it but you know a method, please send it. ----update---- Suppose there is wikipedia.com/horse and somebody afterward creates wikipedia.com/the_horse this...

Blacklist IP database

Hi, is there an open database of blacklisted IP addresses for the Web, one that covers a lot of public web proxies, such as the blacklist used by Wikipedia's global blocking? Thanks in advance. ...

How to strip all tags from Wikipedia pages or make the pages more readable

I want to strip all tags and remove the [show][Hide] items from Wikipedia pages, or is there some website that presents the pages in a more readable format? I am aware of the Wikipedia printable version, but I don't need any tags in that, as I have some other use. So please answer the original question only, about any website or webservice or c...

Linking to Wikipedia abstracts (the way Google Earth does it)

I'm embedding Wikipedia pages in my app, and I'd like to show the same simplified abstract that Google Earth shows. (It gives the first several paragraphs and a link to the full content, without any serious layout.) I know about the printable=true option, but that's not what I'm looking for. ...

How does Wikipedia's "What links here" work?

I recently used Wikipedia's "What links here" function (which is found under the "Toolbox" element in any entry's left menu) and it got me wondering how this function actually works. I'm guessing that searching through all the article entries for links isn't very efficient, so are all the links stored in a separate database? If...

Loading a Wikipedia page

I want to make a button in a UIAlertView that opens a Wikipedia page, with a subject stored in my array "array". Here is how I'm doing it. Wikipedia follows the format of http://en.wikipedia.org/wiki/<subject>. In my array, I have text entries of subjects. I want it to open in mobile Safari when tapped. So far, no luck :( Help ...
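One thing that can matter when building a URL in the http://en.wikipedia.org/wiki/<subject> format is escaping the subject text. The idea is language-agnostic; below is a minimal sketch in Python rather than Objective-C. The function name `wikipedia_url` is hypothetical, and replacing spaces with underscores is an assumption about how the subject strings are stored.

```python
from urllib.parse import quote


def wikipedia_url(subject: str) -> str:
    """Build an article URL in the http://en.wikipedia.org/wiki/<subject> format."""
    # Assumption: subjects are plain text such as "Splay tree"; spaces
    # become underscores, as in article URLs.
    title = subject.strip().replace(" ", "_")
    # Percent-encode any remaining characters that are not URL-safe.
    return "http://en.wikipedia.org/wiki/" + quote(title)


print(wikipedia_url("Splay tree"))  # http://en.wikipedia.org/wiki/Splay_tree
```

The same two steps (normalize the title, then percent-encode it) would apply before handing the URL to whatever opens the browser.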

How to remove blocks surrounded by curly brackets via python

Sample text: String -> content within the rev tag (via lxml). I'm trying to remove the {{BLOCKS}} within the text. I've used the following regex to remove simple, one-line blocks: p = re.compile('\{\{*.*\}\}') nonBracketedString = p.sub('', bracketedString) However, this does not remove the first multi-line bracketed section at the b...
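In Python's `re`, `.` does not match newlines unless `re.DOTALL` is set, and a single regex also struggles with blocks nested inside other blocks. A possible alternative is to track the nesting depth by hand. The following is only a sketch under that assumption; `strip_template_blocks` is a hypothetical name, and the sample text is made up for illustration.

```python
def strip_template_blocks(text: str) -> str:
    """Remove every {{...}} block, including nested and multi-line ones."""
    kept = []
    depth = 0  # how many {{ are currently open
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
        else:
            # Keep characters only while outside every block.
            if depth == 0:
                kept.append(text[i])
            i += 1
    return "".join(kept)


sample = "{{Infobox\n| name = {{PAGENAME}}\n}}\nHorse is an animal."
print(repr(strip_template_blocks(sample)))  # '\nHorse is an animal.'
```

Unbalanced input is handled loosely here: a stray `}}` outside any block is kept, and an unclosed `{{` swallows the rest of the text.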

Getting a large number (but not all) Wikipedia pages

For an NLP project of mine, I want to download a large number of pages (say, 10000) at random from Wikipedia. Without downloading the entire XML dump, this is what I can think of:
1. Open a Wikipedia page
2. Parse the HTML for links in a breadth-first search fashion and open each page
3. Recursively open links on the pages obtained in 2
In ste...
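The breadth-first idea in the steps above can be sketched roughly as follows. This is not a complete crawler: `collect_pages` and `get_links` are hypothetical names, and `get_links` stands in for whatever fetches a page and extracts its links. In the usage example it is replaced by a small in-memory dictionary so the sketch runs without network access.

```python
from collections import deque
from typing import Callable, Iterable, List


def collect_pages(
    start: str,
    get_links: Callable[[str], Iterable[str]],
    limit: int,
) -> List[str]:
    """Visit pages breadth-first from `start`, stopping after `limit` pages."""
    seen = {start}          # pages already queued, to avoid revisiting
    queue = deque([start])  # FIFO queue gives breadth-first order
    visited: List[str] = []
    while queue and len(visited) < limit:
        page = queue.popleft()
        visited.append(page)
        for link in get_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


# Hypothetical link graph standing in for real pages.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["D"], "D": []}
print(collect_pages("A", lambda page: graph.get(page, []), limit=3))  # ['A', 'B', 'C']
```

Note that a breadth-first walk from one page yields pages near the starting point, not a uniformly random sample.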

Documentation Management tool

I've been considering installing MediaWiki (the software that runs Wikipedia) as part of the documentation for our small software system, for a small development team. I heard that Drupal has a book-editing add-on from which the documentation can be extracted, so I was wondering what other developers might suggest. I'll be using the system f...

Splay tree insertion

Going through some exercises to hone my binary tree skills, I decided to implement a splay tree, as outlined in Wikipedia: Splay tree. One thing I'm not getting is the part about insertion. It says: First, we search x in the splay tree. If x does not already exist, then we will not find it, but its parent node y. Second, we perfor...

MediaWiki styling for iPhone

When you visit en.wikipedia.org with an iPhone you are forwarded to en.m.wikipedia.org which is formatted beautifully for the device. I have MediaWiki on my own server and I'd love to have this formatting available when I visit my site with my iPhone. Is there an easy way to enable this? I've gotten as far as www.mediawiki.org/wiki/Manua...

Where to find a "bug-free" HTML-to-wiki converter

While googling for one, I stumbled upon html2wiki, which seems to do the job (I will try it after posting this question). Other than that, many other choices came up during the search. A word on which app to choose would be appreciated! Thanks ...

Where can I find the source of the open-source Wikipedia iPhone application?

Where can I find the source of the open-source Wikipedia iPhone application? ...

Why am I getting the error "cannot import name Scanner" when I try to use the mwclient module for Python?

I'm using Python 2.5.2 (because mwclient still only works for 2.x). I've copied the mwclient folder into the /usr/lib/python2.5/site-packages/mwclient folder, and when I run a program that imports mwclient I get this: Traceback (most recent call last): File "get_wiki.py", line 2, in <module> import mwclient File "/usr/lib/pyth...

How to crawl entire Wikipedia?

I've tried the WebSphinx application. I found that if I put wikipedia.org as the starting URL, it will not crawl further. So how can I actually crawl the entire Wikipedia? Can anyone give me some guidelines? Do I need to find those URLs specifically and supply multiple starting URLs? Does anyone have suggestions for a good website with the tutori...