Background

  • I'm building an app that links together web pages you've recently visited.

  • To do this, I need to get the HTML for recent URLs using Cocoa.

  • Right now, I'm using an invisible WebView to do this.

  • As I understand it, if the URL isn't in my app's cache, the WebView fetches the page from the web server.

What I want

Chances are high that the URL I'm grabbing is already in Safari's cache, since the page has already been visited.

I want my app to check Safari's cache for the URL first. If it's there, it should just use this data. If not, it should hit the web server and store the page in my app's cache.

I don't really want to parse Safari's cache.db file using sqlite3, as I've no idea whether the format will stay the same. I'm after something simpler and higher-level.

What I've tried

I know that you can set up your own NSURLCache using the method initWithMemoryCapacity:diskCapacity:diskPath: but I don't want to try pointing this to the Safari cache in case it screws up Safari by writing to it.
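For reference, a private cache of the kind that initializer creates might be set up as in the minimal Swift sketch below. The capacities, the directory name "PageLinkCache" and the helper makePrivateCache are all hypothetical, chosen only for illustration; the point is that the disk path belongs to this app and not to Safari.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

// Hypothetical sizes; tune them for the real app.
let memoryCapacity = 4 * 1024 * 1024      // 4 MB held in memory
let diskCapacity = 64 * 1024 * 1024       // 64 MB allowed on disk

// Build a cache in a location this app owns, never Safari's cache directory.
// "PageLinkCache" is an assumed name, not something from the question.
func makePrivateCache() -> URLCache {
    return URLCache(memoryCapacity: memoryCapacity,
                    diskCapacity: diskCapacity,
                    diskPath: "PageLinkCache")
}

// Install it as the cache used by default for URL loading in this process.
let privateCache = makePrivateCache()
URLCache.shared = privateCache
```

Any page the app loads afterwards through the default URL loading machinery can be stored here, which covers the "store the page in my app's cache" half of the goal without touching Safari's files.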

Is there an easy, high-level way of sharing the Safari cache?

UPDATE

Aha. I've just realised there may be a way to do this I've been missing.

I could make a new instance of NSURLCache with initWithMemoryCapacity:diskCapacity:diskPath:, point it at the Safari cache, then specify a cache policy of NSURLRequestReturnCacheDataDontLoad for the URL Request when loading the page.

When this fails, I could just try and load the page as normal. I'll try this out and update the question when I know more.
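The cache-first, network-second flow described here could look roughly like the Swift sketch below. It runs against whatever cache the session is configured with (the app's own cache by default), not Safari's. The function names cacheOnlyRequest, networkRequest and fetchHTML, and the 30-second timeout, are assumptions made for this example.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking
#endif

// Request that is answered only from the cache; it fails rather than hit the network.
func cacheOnlyRequest(for url: URL) -> URLRequest {
    return URLRequest(url: url,
                      cachePolicy: .returnCacheDataDontLoad,
                      timeoutInterval: 30)
}

// Ordinary request that follows the protocol's normal caching rules.
func networkRequest(for url: URL) -> URLRequest {
    return URLRequest(url: url,
                      cachePolicy: .useProtocolCachePolicy,
                      timeoutInterval: 30)
}

// Try the cache first; if nothing comes back, load the page as normal.
func fetchHTML(from url: URL,
               session: URLSession = .shared,
               completion: @escaping (Data?) -> Void) {
    session.dataTask(with: cacheOnlyRequest(for: url)) { data, _, _ in
        if let data = data {
            completion(data)          // served from the cache
            return
        }
        // Cache miss: fall back to a normal load, which may populate the cache.
        session.dataTask(with: networkRequest(for: url)) { data, _, _ in
            completion(data)
        }.resume()
    }.resume()
}
```

A caller would invoke fetchHTML(from:) with the page URL and convert the returned Data to a string using the page's encoding.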

+1  A: 

To be honest, you just can't do this.

Firstly, I'm pretty certain -[NSURLCache initWithMemoryCapacity:diskCapacity:diskPath:] won't work as you expect. It will instead blow away the old cache file to create its own, potentially upsetting Safari badly.

Secondly, NSURLCache is a composite cache. That is, it caches data first in memory, and then moves it out to disk at some point. So even if you could properly access Safari's cache file (which you can't), you'd only be able to access the older cached data, not the most recent.

Mike Abdullah
Sigh. I've not tried it out yet, but I'm pretty certain you're right. That would make complete sense.
John Gallagher
An alternative: Safari saves copies of visited pages into its History so that Spotlight can search them. Can you pull out what you need that way?
Mike Abdullah