Hey

I'm developing an iPhone app that downloads the content of a web page and stores it so the user can access it offline.

I'm using NSURLConnection to download the page, as described in the docs. But it downloads only the HTML code, without extra content such as images.

Even if the images are not delivered in an NSData structure, I would like to get at least some references to them or some delegate method call. Any ideas?

+3  A: 

You have to parse the HTML code and download all referenced files yourself (and then modify the HTML to use relative URLs if it doesn't already). This is not a trivial problem. You might want to look at the source of Unix tools like wget to get an idea of how they do it. I believe libxml2 can parse HTML so that should probably be the library you want to look into.
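As a rough illustration of the "modify the HTML to use relative URLs" step, here is a minimal, language-agnostic sketch in Python (on the iPhone the same logic would be written against whatever parser you pick). It assumes the references have already been extracted from the page. The names `local_path` and `rewrite_references` and the `assets/<host>/<file>` layout are made up for this example, and the plain string replacement is only a stand-in for editing the parsed document properly.

```python
import posixpath
from urllib.parse import urljoin, urlparse


def local_path(absolute_url):
    """Map an absolute URL to a relative path for the offline copy.

    The assets/<host>/<file> layout is an assumption for this sketch.
    Two URLs with the same host and file name would collide.
    """
    parts = urlparse(absolute_url)
    name = posixpath.basename(parts.path) or "index"
    return posixpath.join("assets", parts.netloc, name)


def rewrite_references(html, page_url, references):
    """Return the rewritten HTML and a {absolute_url: local_path} download list."""
    downloads = {}
    for ref in references:
        # Resolve relative references against the URL of the page itself.
        absolute = urljoin(page_url, ref)
        downloads[absolute] = local_path(absolute)
        # Naive replacement of the double-quoted attribute value only.
        html = html.replace('"%s"' % ref, '"%s"' % downloads[absolute])
    return html, downloads
```

The returned dictionary tells you which files to fetch and where to store them. The returned HTML is what you would save next to the `assets` folder.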

Ole Begemann
For parsing HTML, I found a very good API called ElementParser. Take a look at http://touchtank.wordpress.com/element-parser/!
Martin
A: 

Ouch. That is the answer I was afraid to read. If somebody knows any other method, I would be very interested!

But for now, I'm going to think about this solution.

I know how to get the HTML source code into an NSString variable. So I'd just parse the code and detect the content to download. But what counts as content?

  • external JavaScript
  • external CSS
  • images (img)
  • iframes (recursively)
  • ...

What else?
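To make the list above concrete, here is a small sketch in Python (used only as a language-agnostic illustration, using the standard library's `html.parser`) that collects the four kinds of references named above. The class name `ResourceCollector` and the helper `collect_resources` are made up for this example. It does not follow iframes recursively, and it does not look inside downloaded CSS files, which can themselves pull in more files through `url(...)` and `@import`.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collects references to external scripts, stylesheets, images and iframes."""

    def __init__(self):
        super().__init__()
        self.found = {"script": [], "css": [], "img": [], "iframe": []}

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; values may be None.
        attrs = dict(attrs)
        if tag == "link":
            # Only <link rel="stylesheet"> counts, not icons or feeds.
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel and attrs.get("href"):
                self.found["css"].append(attrs["href"])
        elif tag in ("script", "img", "iframe") and attrs.get("src"):
            # Inline <script> blocks have no src and are skipped.
            self.found[tag].append(attrs["src"])


def collect_resources(html):
    """Return a dict mapping each resource kind to the references found."""
    collector = ResourceCollector()
    collector.feed(html)
    collector.close()
    return collector.found
```

The references come back exactly as written in the page, so relative ones still have to be resolved against the page URL before downloading.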

Martin
Please edit your question or add a comment rather than add new answers.
Stephen Darlington