Hi all, when I use:

NSString *myText = [webView stringByEvaluatingJavaScriptFromString:@"document.documentElement.innerText"];
NSLog(@"my text -> %@",myText);

I get all the JavaScript for the webView, but what I want is to save only the body text of the web page. Can anybody help me with some code or ideas? Thanks.

A: 

Take the innerText of a specific element in the document, e.g. the body element.
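
As a minimal sketch of that idea, the helper below reads the text of one element rather than the whole document. The name `textOf` and the selector `#article` in the usage note are hypothetical, not taken from any real page; the script would need to be injected into the page before it is called.

```javascript
// Hypothetical helper: returns the text of the first element matching
// `selector`, falling back to the body when no selector is given or
// nothing matches.
function textOf(doc, selector) {
  var el = selector ? doc.querySelector(selector) : null;
  var text = (el || doc.body).innerText;
  return (text || "").trim();
}
```

In a web view this could be evaluated as `textOf(document, "#article")`, assuming the page wraps its story in an element with that id.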

adf88
Thanks for the reply. I tried it, but I still get all the links, so I will do some searching. If you can help me, I will be grateful.
jissa
`"document.body.innerText"`
adf88
Thanks again for the reply. What I do (like the Vienna application) is let the user enter the desired site, then get the RSS feed and store the body. When I use document.body.innerText I get the body and the other links as well. I haven't found the solution yet, but I'm still looking.
jissa
So what do you want to store exactly?
adf88
I want to store the whole body text. When I use - (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict, I can store the description, the title, etc. The description is only a few lines from a long text, so what I need is to go to the original site and get all of the text. When I use NSString *content = [webView stringByEvaluatingJavaScriptFromString:@"document.body.innerText"]; I get the text, the links, the date... So can you help me get "only the original text"?
jissa
What is "only the original text"? You see, that's the problem: you must specify precisely which part of the page you want. "Only the original text" is very confusing. So what do you want to store **exactly**? Generally speaking, you must traverse the DOM tree and pick out what you want. Is your app dedicated to one or more chosen websites, or is it general purpose (if so, what is the purpose)? If you want to know how a certain website is built (what its DOM tree looks like), I suggest using a browser debugger such as Firebug for Firefox.
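
To illustrate what traversing the DOM tree could look like, here is a rough sketch that walks a node recursively and collects text while skipping some elements. The function name `collectText` and the list of skipped tags are assumptions for illustration; which tags to skip depends entirely on the site.

```javascript
// Tags whose contents are ignored. This list is an assumption; adjust per site.
var SKIPPED = { A: true, BUTTON: true, SCRIPT: true, STYLE: true };

// Recursively gathers the non-empty text nodes under `node` into an array.
function collectText(node, out) {
  out = out || [];
  if (node.nodeType === 3) {                       // 3 = text node
    var text = node.nodeValue.replace(/\s+/g, " ").trim();
    if (text) out.push(text);
  } else if (node.nodeType === 1 &&                // 1 = element node
             !SKIPPED[node.nodeName.toUpperCase()]) {
    for (var i = 0; i < node.childNodes.length; i++) {
      collectText(node.childNodes[i], out);
    }
  }
  return out;
}
```

Something like `collectText(document.body).join("\n")` would then give the text without link and button labels. Note that skipping every `A` element also drops links that appear inside the article text itself.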
adf88
If you go to this page: http://www.boston.com/news/nation/articles/2010/08/17/obama_sharpens_message_for_fall_election/ you see the title, the image and the text, and also some links and buttons (minuButton, plusButton, printButton...). What I want is to take the text as a string, copy it into my app and use it.
jissa
But tell me, what is your goal? A generic way to extract the text of an article on the boston.com site?
adf88
I'll give you an example: my app lets the user decide where he wants to read the news from. The user enters the site, and the app retrieves the title, the text and the picture and shows them in my app with my own interface design.
jissa
A: 

It sounds like you want to get the text of the document excluding the tags.

If the page you're visiting uses jQuery, you could simply use $('body').text() to achieve this.

If not, you may need to strip off the tags with a regular expression yourself. This post seems to have an answer for this problem.
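
A rough sketch of the regular-expression approach, assuming you already have the page's HTML as a string, might look like this. The name `stripTags` is hypothetical, and a regex is not a real HTML parser: this does not decode entities such as `&amp;` and can be fooled by unusual markup.

```javascript
// Hypothetical helper: removes tags from an HTML string and tidies whitespace.
function stripTags(html) {
  return html
    // Drop script and style blocks together with their contents.
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    // Replace every remaining tag with a space.
    .replace(/<[^>]+>/g, " ")
    // Collapse runs of whitespace.
    .replace(/\s+/g, " ")
    .trim();
}
```

Because each tag becomes a space, a word split by inline markup (for example `wor<b>ld</b>`) ends up as two words, so treat the output as approximate.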

William