Hi,
I am looking for some input please. I am designing a multi-user, mutli-vendor, n-tier database application.
Each local installation of this system must replicate its records to a single central server. This server is not directly available to the users of the installed database app and the Internet is the only route to it. The central db is only used for reporting so replication needs only be one-way to the central server.
The primary objective is to get the records to the central db as fast as possible to ensure a “pseudo real-time” snapshot.
Here what I’ve got so far:
Use web server on Internet (SOAP service) to upload compressed/encrypted files (XML data). “Central” db server polls the web server to collect and import the data at regular intervals.
Local db systems do not delete records, but mark them as deleted.
Each table in the database has a primary key using a GUID-like surrogate key (i.e. no auto-inc fields) to prevent key violations on the central db.
I have discounted using the client’s db server to run the extraction and upload process, as there will be precious little IT support for this system at customers sites so if the upload fails (due to local firewall/proxy server issues) then I want the user to know a.s.a.p. so they can get it sorted.
The extraction process needs only be run by one client application at a time as it will query the whole local database to extract changes.
The bit I am struggling with is the database structure to enable efficient (and accurate) extraction of recently changed records. Also this will happen whilst users are working i.e. need to be careful about a user having a record open for editing whilst the extraction is going on and need it to run “in the background” so as not to interrupt the user too much.
I.e. do I use a date & time based approach? Do I use some kind of status field, or even a seperate change-log table that can be queried and then cleared?
Apologies for the long post, but I’d value any experience anyone has about developing this solution.
TIA,
Stuart