My application pulls a large amount of data from an external source - normally across the internet - and stores it locally on the application server.
Currently, when a user starts a new project, we aggressively pull the data from the external source in the order we predict the user will want to access it. This process can take 2-3 hours.
A smarter approach seems to be lazy loading. For example, if a user wants to access entity A, try to get it from our database first. If it is not there yet, fetch it from the remote source and populate the database at the same time.
This, combined with continuing to populate the database in the background, would give a much slicker experience for the user.
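To make the idea concrete, here is a minimal sketch of the lookup I have in mind. All names are hypothetical: `ReadThroughRepository` is not from any framework, the `ConcurrentHashMap` stands in for the local database, and `remoteFetcher` stands in for whatever client talks to the external source.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical read-through repository: check the local store first, go remote on a miss. */
class ReadThroughRepository<K, V> {
    // Stand-in for the local database (assumption: a real version would wrap a DAO).
    private final Map<K, V> localStore = new ConcurrentHashMap<>();
    // Stand-in for the remote client (assumption: a blocking call that returns null if not found).
    private final Function<K, V> remoteFetcher;

    ReadThroughRepository(Function<K, V> remoteFetcher) {
        this.remoteFetcher = remoteFetcher;
    }

    public V get(K key) {
        V local = localStore.get(key);
        if (local != null) {
            return local; // already populated, no remote call needed
        }
        V fetched = remoteFetcher.apply(key);
        if (fetched == null) {
            return null; // not available remotely either
        }
        // Populate the local store; if another thread stored a value first, keep that one.
        V existing = localStore.putIfAbsent(key, fetched);
        return existing != null ? existing : fetched;
    }

    public boolean isStoredLocally(K key) {
        return localStore.containsKey(key);
    }
}
```

The background job could call the same `get` method for each entity in the predicted order, so both paths populate the store the same way.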
Are there frameworks that manage this level of abstraction? (My application is written in Java.)
There are several considerations here. For example, my database currently enforces relational integrity, which might have to be turned off to make lazy loading possible. Concurrency also seems likely to cause problems, for example if a user request and the background job both try to load the same entity.
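One way I could imagine handling the concurrency part is to make sure only one fetch per entity is ever started, and let every caller share its result. The sketch below is only an illustration under that assumption; `SingleFlightLoader` and `remoteFetcher` are hypothetical names, and the fetch runs on the common `ForkJoinPool` purely for brevity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical loader that starts at most one remote fetch per key. */
class SingleFlightLoader<K, V> {
    // One future per key; concurrent callers for the same key share it.
    private final ConcurrentMap<K, CompletableFuture<V>> fetches = new ConcurrentHashMap<>();
    // Stand-in for the remote client (assumption: a blocking call).
    private final Function<K, V> remoteFetcher;

    SingleFlightLoader(Function<K, V> remoteFetcher) {
        this.remoteFetcher = remoteFetcher;
    }

    public CompletableFuture<V> load(K key) {
        // computeIfAbsent is atomic, so only one caller creates the future for a given key.
        CompletableFuture<V> future = fetches.computeIfAbsent(
                key, k -> CompletableFuture.supplyAsync(() -> remoteFetcher.apply(k)));
        // Registered outside computeIfAbsent: drop a failed fetch so a later call can retry.
        future.whenComplete((value, error) -> {
            if (error != null) {
                fetches.remove(key, future);
            }
        });
        return future;
    }
}
```

A user request and the background job would both call `load`, so whichever arrives second simply waits on the fetch that is already running. Successful results stay in the map here, which a real version would probably replace with the database write.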
Also, it seems like entities and collections could exist in a partially populated state, which would require additional schema data to distinguish complete ones from partially populated ones.
As I understand it, this is just an aggregated repository pattern. Is this correct, or is there a more appropriate pattern I should study?
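As a rough illustration of the extra state I mean, a collection could carry an explicit load status next to its items. This is a sketch only: `LoadState` and `TrackedCollection` are hypothetical names, and in the database the status would presumably be a column on the parent row.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical marker for how much of a collection has been pulled from the remote source. */
enum LoadState { NOT_LOADED, PARTIAL, COMPLETE }

/** Hypothetical collection wrapper that records whether its contents are complete. */
class TrackedCollection<T> {
    private final List<T> items = new ArrayList<>();
    private LoadState state = LoadState.NOT_LOADED;

    /** Adds one item fetched so far; the collection is then at least partially populated. */
    public synchronized void add(T item) {
        items.add(item);
        if (state == LoadState.NOT_LOADED) {
            state = LoadState.PARTIAL;
        }
    }

    /** Called by the loader once the remote source has no more items for this collection. */
    public synchronized void markComplete() {
        state = LoadState.COMPLETE;
    }

    public synchronized LoadState state() {
        return state;
    }

    /** Returns a copy so callers never see the list change underneath them. */
    public synchronized List<T> snapshot() {
        return new ArrayList<>(items);
    }
}
```

Callers that need the full collection would check for `COMPLETE` and trigger a fetch otherwise, while callers that can work with partial data would just take the snapshot.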