My application pulls a large amount of data from an external source - normally across the internet - and stores it locally on the application server.

Currently, when a user starts a new project, we aggressively pull the data from the external source in the order we predict the user will want to access it. This process can take 2-3 hours.

It seems like a smarter approach would be to provide access to the data in a lazy-loading fashion. For example, if a user wants to access entity A, we try to grab it from our database first. If it's not there yet, we fetch it from the remote source and populate the database at the same time.
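To make the idea concrete, here is a minimal sketch of that read-through lookup. Everything in it is an assumption for illustration: `ReadThroughRepository` is a hypothetical name, the `ConcurrentHashMap` stands in for the local database, and `remoteFetcher` stands in for whatever call reaches the remote source.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical read-through repository: check the local store first,
// fall back to the remote source on a miss and keep the result locally.
class ReadThroughRepository<K, V> {
    // Stand-in for the local database (assumption, not the real persistence layer).
    private final Map<K, V> localStore = new ConcurrentHashMap<>();
    // Stand-in for the remote call (web service, remote database, ...).
    private final Function<K, V> remoteFetcher;

    ReadThroughRepository(Function<K, V> remoteFetcher) {
        this.remoteFetcher = remoteFetcher;
    }

    V get(K id) {
        // computeIfAbsent is atomic per key on ConcurrentHashMap, so two
        // concurrent requests for the same id trigger only one remote fetch.
        return localStore.computeIfAbsent(id, remoteFetcher);
    }

    boolean isLocal(K id) {
        return localStore.containsKey(id);
    }
}
```

One caveat with this sketch: `computeIfAbsent` holds an internal lock while the fetcher runs, so a slow remote call can delay other updates that hash to the same bin. A real implementation backed by a database would need its own strategy for that.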

This, combined with continuing to populate the database in the background, would give a much slicker experience for the user.
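A rough sketch of how on-demand access and background population might share one loader, so that an entity is fetched only once whichever path asks for it first. The names (`PrefetchingLoader`, `remoteFetcher`) are hypothetical, the in-memory map stands in for the database, and the threading choices are assumptions rather than a recommendation.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical loader: a background thread walks the predicted order,
// while user requests can trigger an immediate fetch for anything not started yet.
class PrefetchingLoader<K, V> implements AutoCloseable {
    // One future per id, shared by the background walker and user requests.
    private final Map<K, CompletableFuture<V>> loads = new ConcurrentHashMap<>();
    private final ExecutorService walker = Executors.newSingleThreadExecutor();
    private final ExecutorService workers = Executors.newCachedThreadPool();
    private final Function<K, V> remoteFetcher; // stand-in for the remote call

    PrefetchingLoader(Function<K, V> remoteFetcher) {
        this.remoteFetcher = remoteFetcher;
    }

    // Start populating in the background, one entity at a time, in predicted order.
    void prefetch(List<K> predictedOrder) {
        walker.submit(() -> {
            for (K id : predictedOrder) {
                try {
                    load(id).join(); // returns at once if a user request already loaded it
                } catch (RuntimeException e) {
                    // A failed fetch should not stop the rest of the prefetch.
                }
            }
        });
    }

    // User-facing access: blocks until the entity is available.
    V get(K id) {
        return load(id).join();
    }

    private CompletableFuture<V> load(K id) {
        // The fetch itself runs on the worker pool; the map only stores the future.
        return loads.computeIfAbsent(id,
                k -> CompletableFuture.supplyAsync(() -> remoteFetcher.apply(k), workers));
    }

    @Override
    public void close() {
        walker.shutdownNow();
        workers.shutdown();
    }
}
```

Because the background walker only creates a future when it reaches an id, a user request for something far down the predicted order starts its own fetch right away instead of waiting in the queue.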

Are there frameworks which manage this level of abstraction? (My application is in Java).

There are several considerations here. For example, my database currently enforces relational integrity, which might have to be turned off to facilitate this lazy-loading approach. Concurrency also seems likely to cause problems.

Also, it seems like entities and collections could exist in a partially populated state, which would require additional schema data to distinguish the complete ones from the partially populated ones.
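As an illustration of what that extra state might look like, here is a small sketch of a collection that tracks how complete it is. It assumes the remote source can report the total number of items up front, which may not hold for every source; `LoadState` and `TrackedCollection` are hypothetical names, and in a real schema the state would probably be a column rather than a computed value.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker for how much of a collection has been loaded locally.
enum LoadState { STUB, PARTIAL, COMPLETE }

class TrackedCollection<T> {
    private final List<T> items = new ArrayList<>();
    // Assumption: the remote source tells us the total item count in advance.
    private final int expectedSize;

    TrackedCollection(int expectedSize) {
        this.expectedSize = expectedSize;
    }

    synchronized void add(T item) {
        items.add(item);
    }

    synchronized LoadState state() {
        if (items.size() >= expectedSize) {
            return LoadState.COMPLETE; // also covers collections that are empty remotely
        }
        return items.isEmpty() ? LoadState.STUB : LoadState.PARTIAL;
    }

    synchronized List<T> loadedItems() {
        // Copy so callers cannot modify the internal list.
        return Collections.unmodifiableList(new ArrayList<>(items));
    }
}
```

Callers could then check `state()` before trusting a query result, and fall back to the remote source when the answer is anything other than `COMPLETE`.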

As I understand it, this is just an aggregated repository pattern. Is this correct, or is there a more appropriate pattern I should study?

A: 

Have you tried JPA/Hibernate? This seems easily possible in Hibernate.

Padmarag
I'm using Hibernate for my database layer on the app server. However, the challenge is in loading data which may not yet be in the database. If that's the case, we need to fetch the data from the remote source and populate the database. Essentially, this is an abstraction layer above Hibernate.
Marty Pitt
OK, got it. What is the remote source? Is it a database, XML, or a web service?
Padmarag
It's a server somewhere else in the world -- for the purposes of this discussion, it could be any of the protocols you mentioned - a webserver, a remote database, or a remote application server.
Marty Pitt