In a server-side application running on Tomcat, I am generating full HTML pages (with header) based on random user-requested sites pulled down from the Internet. The client-side application uses asynchronous callbacks for requesting processing of a particular web page. Since processing can take a while, I want to inform the user about progress via polling, hence the callbacks.

On the server side, after the web page is retrieved, it is processed and an "enhanced" version is created. This version then has to go back to the user. Displaying it as part of the client-side application's own page is not an option.

Currently, the server generates a temporary file and sends back a link to it. This is clearly suboptimal.

The next best solution I can come up with involves creating a caching DB that stores the HTML content keyed by its MD5 or SHA-1 hash, and then sending back a link to a servlet with the hash ID as an argument. The servlet then fetches the page from the caching DB.
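To make the hash-ID idea concrete, here is a minimal sketch that keeps pages in an in-memory map instead of a real DB. The class name `HashedPageStore` and its methods are hypothetical, and it assumes SHA-1 over the UTF-8 bytes of the page; a servlet would call `get` with the ID taken from the link.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: stores processed pages under their SHA-1 hex digest.
// An in-memory map stands in for the caching DB described above.
class HashedPageStore {
    private final Map<String, String> pages = new ConcurrentHashMap<>();

    /** Stores the page and returns the hash ID to put into the link. */
    String put(String html) {
        String id = sha1Hex(html);
        pages.put(id, html);
        return id;
    }

    /** Returns the stored page, or null if no page has this ID. */
    String get(String id) {
        return pages.get(id);
    }

    /** Computes the lowercase hex SHA-1 digest of the UTF-8 encoded text. */
    static String sha1Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}
```

Because the ID is derived from the content, storing the same page twice yields the same link and only one stored copy.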

Is there a better solution? If not, which DB backend would you propose? I'm thinking of SQLite. Part of the problem to be solved is: how do I push a complete page, from <html> to </html>, back to the client side?

+1  A: 

If true persistence isn't required, how about using something more transient like memcached instead of SQL? The calling semantics are pretty clean and easy, and of course you can expire the data manually, via a TTL, or at restart.
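As a rough illustration of those semantics (put, get, manual delete, TTL expiry), here is a small in-process sketch. It is not a memcached client; `ExpiringPageCache` and the injected clock are assumptions made so the expiry logic can be shown and tested without waiting.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical in-process stand-in for a memcached-style store with a TTL.
class ExpiringPageCache {
    private static final class Entry {
        final String html;
        final long expiresAtMillis;

        Entry(String html, long expiresAtMillis) {
            this.html = html;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    ExpiringPageCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Stores the page; it expires ttlMillis after this call. */
    void put(String id, String html) {
        entries.put(id, new Entry(html, clock.getAsLong() + ttlMillis));
    }

    /** Returns the page, or null if it is missing or has expired. */
    String get(String id) {
        Entry entry = entries.get(id);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            // Drop the stale entry, but only if nobody replaced it meanwhile.
            entries.remove(id, entry);
            return null;
        }
        return entry.html;
    }

    /** Manual expiry. */
    void delete(String id) {
        entries.remove(id);
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`; since everything lives in the JVM heap, a restart clears the cache as well.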

stephbu
+1  A: 

Instead of creating a temporary file, filling it up, and then sending a link, you can create a memory buffer, fill it up, and then send that as the response (serve it with the MIME type 'text/html'). If you don't want to send page buffers immediately, you can save them for later in the user's session. If you're worried about taking up too much memory that way, you may want to keep only a certain number of page buffers in memory and write the rest to disk for later retrieval. Using a DB sounds like overkill (after all, there's no relational information involved), but it would solve the caching problem nicely.
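A minimal sketch of the "keep a few buffers in memory, write the rest to disk" idea, assuming least-recently-used eviction (the answer does not say which buffers to evict). `SpillingPageBuffer` and all of its methods are hypothetical names; a servlet would write the string returned by `get` to the response with content type 'text/html'.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded page buffer: the least recently used pages are
// written to files in spillDir once more than maxInMemory pages are held.
class SpillingPageBuffer {
    private final Path spillDir;
    private final Map<String, Path> onDisk = new HashMap<>();
    private final LinkedHashMap<String, String> inMemory;

    SpillingPageBuffer(final int maxInMemory, Path spillDir) {
        this.spillDir = spillDir;
        // accessOrder = true makes iteration order least-recently-used first.
        this.inMemory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() <= maxInMemory) {
                    return false;
                }
                spill(eldest.getKey(), eldest.getValue());
                return true; // drop the spilled page from memory
            }
        };
    }

    /** Convenience factory that spills into a fresh temporary directory. */
    static SpillingPageBuffer inTempDir(int maxInMemory) {
        try {
            return new SpillingPageBuffer(maxInMemory, Files.createTempDirectory("pages"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    synchronized void put(String id, String html) {
        deleteSpilled(onDisk.remove(id)); // discard any stale copy on disk
        inMemory.put(id, html);
    }

    /** Returns the page, reloading it from disk if needed; null if unknown. */
    synchronized String get(String id) {
        String html = inMemory.get(id);
        if (html != null) {
            return html;
        }
        Path file = onDisk.remove(id);
        if (file == null) {
            return null;
        }
        try {
            html = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        deleteSpilled(file);
        inMemory.put(id, html); // may spill another page in turn
        return html;
    }

    synchronized boolean isInMemory(String id) {
        return inMemory.containsKey(id);
    }

    private void spill(String id, String html) {
        try {
            Path file = Files.createTempFile(spillDir, "page-", ".html");
            Files.write(file, html.getBytes(StandardCharsets.UTF_8));
            onDisk.put(id, file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private void deleteSpilled(Path file) {
        if (file == null) {
            return;
        }
        try {
            Files.deleteIfExists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Compared with the original temp-file approach, only pages that don't fit the in-memory limit ever touch the disk.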

tucuxi