A: 

You can start by taking a look at PHP's strip_tags function.

Konamiman
Looks cool. Is there something similar in C#, or some sort of web service, as I don't want to route each page request through my web servers?
Priyank Bolia
A: 

What about the Html Agility Pack?

A similar thread is available on Stack Overflow:

Is there a Wikipedia API?

Try this function.

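Function StripTags(ByVal strHtmlString As String) As String ' wrapper name is illustrative
    Dim pattern As String = "<(.|\n)*?>" ' lazily matches anything that looks like a tag
    Return System.Text.RegularExpressions.Regex.Replace(strHtmlString, pattern, String.Empty).Trim()
End Function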
Anuraj
Bad choice: regular expressions are not suited to parsing HTML. There are plenty of questions and articles online that explain why, for example http://www.codinghorror.com/blog/archives/001311.html
Priyank Bolia
That would create another problem of its own: how to build a web page from the XML. I would then have to write even more code to generate the HTML from the parsed XML.
Priyank Bolia
A: 

I want to strip all tags and remove the [show][Hide] stuff from Wikipedia. Or is there some website that renders the pages in a more readable format?

You should take a look at DBpedia: it is Wikipedia, but just the data.

http://dbpedia.org/About

Cups
That doesn't look like the right thing. It is more of a semantic web resource: it only has the headings, the links, and meta information about the articles. I don't need the meta or semantic information. I need a very simple web page, similar to a text file, without many tags other than images, paragraphs, etc.
Priyank Bolia
A: 

You could use an HTML parser, for example BeautifulSoup (Python) or Simple HTML DOM (PHP). Or you could try using an XML parser.
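
To give a rough idea of what the parser approach looks like, here is a minimal sketch. It uses Python's standard-library html.parser instead of BeautifulSoup or Simple HTML DOM, purely so that it runs without installing anything. The names TextExtractor and strip_html are made up for this example, and the list of block-level tags is an assumption you would probably need to tune for real Wikipedia pages.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, dropping tags and the contents of script/style."""

    # Elements whose contents are not prose (assumed list, extend as needed).
    SKIP = {"script", "style"}
    # Elements that should start a new line in the output (assumed list).
    BLOCK = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &amp; and friends
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self.skip_depth == 0:
            self.parts.append(data)

    def text(self):
        # Collapse runs of whitespace and drop empty lines.
        raw_lines = "".join(self.parts).splitlines()
        lines = (" ".join(line.split()) for line in raw_lines)
        return "\n".join(line for line in lines if line)


def strip_html(html):
    """Return the visible text of an HTML fragment, one block per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.text()


if __name__ == "__main__":
    sample = '<p>Hello <b>world</b></p><script>var a = "<p>";</script><p>Second</p>'
    print(strip_html(sample))
```

Unlike a tag-matching regex, the parser knows that the "<p>" inside the script element is not markup, so it is skipped along with the rest of the script. Keeping images or other selected tags would mean emitting them from handle_starttag instead of discarding them.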

Vinz
I think Simple HTML DOM looks the best: easy and extensible.
Priyank Bolia