views: 85
answers: 4

I have run into a slight problem. The story goes as follows:

I have a document archive system (written in PHP) which runs at multiple client sites (23 at present). Each client's system holds only their own documents. Every night, they all need to be synced to a master database on our site (the central server). I have access to each client's MySQL database from the central server, so connecting to them is no problem.

I have a script that connects to the client database and selects all the entries from a table where the sync column = '0000-00-00 00:00:00' (the default, indicating the record hasn't been synced). It then iterates through each record, inserts it into the central server, and sets the sync time on the client record to the time the script was executed. This works, but it obviously has a large overhead from all the individual queries, and I have only just noticed the problem.

Each client can generate 2,000 to 3,000-odd documents a day. With numbers like that, the sync is taking far too long (roughly 1 second per 2 documents).

Is there a better solution to my problem? Preferably a PHP-scripted solution, as I need to log whether everything was successful.

Thanks

EDIT: My current process is:

  1. Select all the un-synced data
  2. Begin transaction
  3. Insert record into central database server
  4. Select the document record from the client
  5. Insert the document into the central database server
  6. Update sync column on client
  7. Update sync column on server
  8. Commit transaction

This is a script run on the central server. Now that I come to think of it, I can remove step 7 and have it be part of step 5, but that won't reduce the processing time by much.
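
In PHP terms, the loop is roughly the sketch below (the DSNs and column names are placeholders, not my exact code; only the table names and the sequence of steps are real):

    <?php
    // Sketch of the per-record sync loop described above.
    $client  = new PDO('mysql:host=client1.example;dbname=docarch', 'user', 'pass');
    $central = new PDO('mysql:host=localhost;dbname=docarch_master', 'user', 'pass');
    $client->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $central->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $now = date('Y-m-d H:i:s');

    // 1. Select all the un-synced records on the client
    $unsynced = $client->query(
        "SELECT * FROM docarch_printout WHERE sync = '0000-00-00 00:00:00'"
    );

    foreach ($unsynced as $row) {
        // 2. Begin transaction on the central server
        $central->beginTransaction();

        // 3. (+ 7.) Insert the detail record centrally, already marked as synced
        $central->prepare(
            "INSERT INTO docarch_printout (client_id, ref, created, sync)
             VALUES (?, ?, ?, ?)"
        )->execute([$row['client_id'], $row['ref'], $row['created'], $now]);
        $newId = $central->lastInsertId();

        // 4. Select the matching document record from the client
        $doc = $client->prepare("SELECT * FROM docarch_printout_docs WHERE id = ?");
        $doc->execute([$row['id']]);
        $docRow = $doc->fetch(PDO::FETCH_ASSOC);

        // 5. Insert the document into the central database
        $central->prepare(
            "INSERT INTO docarch_printout_docs (printout_id, body) VALUES (?, ?)"
        )->execute([$newId, $docRow['body']]);

        // 6. Flag the record as synced on the client
        $client->prepare("UPDATE docarch_printout SET sync = ? WHERE id = ?")
               ->execute([$now, $row['id']]);

        // 8. Commit
        $central->commit();
    }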

+1  A: 

I'd suggest using auto_increment_increment to keep all the ids unique over all of the servers. Then, all you need to do is a SELECT * FROM blah WHERE sync = '0000-00-00 00:00:00', and then generate the insert statements and execute them. You won't have to deal with any kind of conflict resolution for conflicting primary keys...
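
In practice that means setting auto_increment_increment together with auto_increment_offset on each server. For example, with 23 clients plus the master (the values below are purely illustrative):

    -- In each server's my.cnf, or at runtime as shown. With 24 servers in total,
    -- client #1 generates ids 1, 25, 49, ..., client #2 generates 2, 26, 50, ...
    SET GLOBAL auto_increment_increment = 24;
    SET GLOBAL auto_increment_offset    = 1;   -- 2 on client #2, 3 on client #3, ...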

As for the long query times, you need to look at the size of your data. If each record is sizable (a few hundred kb +), it's going to take time...

One option may be to create a federated table for each child server's table. Then do the whole thing in SQL on the master. INSERT INTO master_table SELECT * FROM child_1_table WHERE sync = '0000-00-00 00:00:00'... You get to avoid pulling all of the data into PHP. You can still run some checks to make sure everything went well, and you can still log since everything is still executed from PHP land...
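
Something along these lines (the connection string and column list are made up; a FEDERATED table has to mirror the remote table's definition exactly):

    -- A FEDERATED table on the master that points at one child's table.
    CREATE TABLE child_1_table (
        id      INT NOT NULL,
        ref     VARCHAR(64),
        created DATETIME,
        sync    DATETIME,
        PRIMARY KEY (id)
    ) ENGINE=FEDERATED
      CONNECTION='mysql://sync_user:secret@child1.example:3306/docarch/docarch_printout';

    -- The transfer then becomes a single set-based statement, still issued from PHP:
    INSERT INTO master_table
    SELECT * FROM child_1_table
    WHERE sync = '0000-00-00 00:00:00';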

ircmaxell
The IDs are not a problem; those do not need to be synced. I get the correct record via various other columns. I just use the ID to link the docarch_printout table (all details regarding the document) to the docarch_printout_docs table (1-1, contains just the document). The other problem is that we do not have a permanent connection to the clients; some are on dial-on-demand ISDN lines. Because of that I don't think the federated table will work. Nice idea though, I never knew MySQL had that option.
Surim
Well, I suppose you could store the CREATE statement for the federated table in your program. Then, when you connect to the client, run the create script and drop the table once you're done (so it only uses the connection while you're actively syncing)...
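A rough sketch of that create-on-connect / drop-when-done flow from PHP (DSN, credentials and the table definition are placeholders):

    <?php
    // Create the federated table only while the link to the client is up,
    // run the set-based copy, then drop it again.
    $master = new PDO('mysql:host=localhost;dbname=docarch_master', 'user', 'pass');
    $master->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $master->exec("
        CREATE TABLE child_1_table (
            id      INT NOT NULL,
            ref     VARCHAR(64),
            created DATETIME,
            sync    DATETIME,
            PRIMARY KEY (id)
        ) ENGINE=FEDERATED
          CONNECTION='mysql://sync_user:secret@child1.example:3306/docarch/docarch_printout'");

    try {
        $copied = $master->exec(
            "INSERT INTO master_table
             SELECT * FROM child_1_table WHERE sync = '0000-00-00 00:00:00'"
        );
        error_log(date('c') . " synced $copied rows from child 1");  // audit log
    } finally {
        // Drop the federated table so the dial-on-demand link is only used
        // while a sync is actually running.
        $master->exec("DROP TABLE IF EXISTS child_1_table");
    }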
ircmaxell
True. I really like that idea. Everything seems great, but I need to sync two tables (a 1-1 relationship) which are referenced by the id field, one being the details, the other being the actual document. It would be easy peasy if it was all just a single table. Any further thoughts on that? Thanks.
Surim
Create two federated tables, and just adjust your insert statement to only move the relevant columns... It shouldn't be too hard (especially since you can join the federated tables onto the local tables to determine the unique identifier if you need to)...
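For the 1-1 pair it might look something like this, with federated copies of both client tables and a join back onto the local details table to pick up the new master-side id (every name here is a guess at the real schema):

    -- Copy the un-synced detail rows, stamping the sync time on the master side.
    INSERT INTO docarch_printout (client_id, ref, created, sync)
    SELECT client_id, ref, created, NOW()
    FROM fed_printout
    WHERE sync = '0000-00-00 00:00:00';

    -- Copy the matching documents. The two federated tables join on the client's
    -- id; the join onto the local table resolves the new master-side id via the
    -- columns that uniquely identify a record.
    INSERT INTO docarch_printout_docs (printout_id, body)
    SELECT m.id, d.body
    FROM fed_printout p
    JOIN fed_printout_docs d ON d.id = p.id
    JOIN docarch_printout m  ON m.client_id = p.client_id AND m.ref = p.ref
    WHERE p.sync = '0000-00-00 00:00:00';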
ircmaxell
I spent last night rewriting the database structure and managed to get the federated tables working. Thanks. It seems to be loads faster, probably a combination of the new structure and the better sync process. Thank you.
Surim
A: 

The basic method sounds OK - but taking 0.5 seconds to do one operation is ridiculously excessive - how much data are you pulling across the network? The entire image? Are you doing anything else in the operation? Is there an index on the sync column?

You could get a small benefit by doing an export of the un-synced data on the database:

1) mark all records available for sync with a transaction id in a new column
2) extract all records flagged in first step into a flat file
3) copy the file across the network
4) load the data into the master DB
5) if successful notify the origin server
6) origin server then sets the sync time for all records flagged with that transaction id

This would require 3 scripts: two on the origin server (one to prepare and send the data, one to flag it as complete) and one on the replicated server to pick up the data and notify the outcome.
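
The origin-side half might be sketched like this (the sync_txn column, the file paths and the copy command are just placeholders):

    <?php
    // Steps 1-3 on the origin (client) server.
    $db = new PDO('mysql:host=localhost;dbname=docarch', 'user', 'pass');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $txnId = uniqid('sync_', true);

    // 1) Flag every record available for sync with this transaction id
    $db->prepare("UPDATE docarch_printout
                  SET sync_txn = ?
                  WHERE sync = '0000-00-00 00:00:00'")
       ->execute([$txnId]);

    // 2) Extract the flagged records into a flat file (written by the MySQL server)
    $db->prepare("SELECT * INTO OUTFILE '/tmp/{$txnId}.csv'
                  FIELDS TERMINATED BY ',' ENCLOSED BY '\"'
                  FROM docarch_printout
                  WHERE sync_txn = ?")
       ->execute([$txnId]);

    // 3) Compress and copy the file across the link in one go
    shell_exec(sprintf(
        'gzip -c %s | ssh master.example "cat > /incoming/%s.csv.gz"',
        escapeshellarg("/tmp/{$txnId}.csv"),
        $txnId
    ));

    // Steps 4-6 happen on the master: LOAD DATA INFILE into the master table,
    // then a call back to this server, which sets the sync time on every record
    // carrying $txnId.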

But this is probably not going to make big inroads into the processing time, which seems absurdly high if you are only replicating metadata about the image (rather than the image itself).

C.

symcbean
Updated my initial post with a step-by-step run-through of the script so you can see my logic.
Surim
This still doesn't explain why it's taking 0.5 seconds per record. Since a transaction cannot span two independent DBMSs, it's not adding any value here. How big is the record?
symcbean
It would be due to the fact that the clients are connected to us by ADSL (a few via ISDN). The documents are only a few kilobytes of text.
Surim
hmmm, if it really is the bandwidth that's the problem then the solution is probably to add more bandwidth - you're still being very vague about the network traffic - certainly if updates of each record are coordinated across a slow link then latency may be the problem. If you try my method (single file) then you would eliminate the latency problem and could compress the file before transmitting it.
symcbean
A: 

I know you prefer a PHP-based solution, but you might want to check out the Microsoft Sync Framework -

http://msdn.microsoft.com/en-in/sync/default(en-us).aspx

This would require the sync module to be written in .NET, but there is a huge advantage in terms of sync logic and exception handling (network failures, sync conflicts, etc.), which will save you time.

The framework handles non-SQL Server databases as well, as long as there is a database connector for .NET. MySQL should be supported quite easily - just take a sample from the following link -

http://code.msdn.microsoft.com/sync/Release/ProjectReleases.aspx?ReleaseId=4835

and adapt it to MySQL.

Roopesh Shenoy
That would be great, but it has to run on CentOS. We don't have any Microsoft servers.
Surim
Hmm, that's a problem alright, though it would be cheap to actually get one running just for this purpose! We saved a lot of dev effort with this, so you could do some cost/benefit analysis and decide.
Roopesh Shenoy
A: 

There's another possibility if you can't use the Sync Framework:

Is it possible for you to distribute the load throughout the day instead of doing it all at the end of the day? Say, trigger a synchronization every time 10 new documents come in or 10 edits are done? (This can be done if the synchronization is initiated from the client side.)
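
For the client-initiated variant, even something as simple as this, run after each import batch, would do it (the threshold, DSN and sync command are illustrative):

    <?php
    // After importing new documents on the client, check how many are still
    // un-synced and kick off the existing sync once a threshold is reached.
    $db = new PDO('mysql:host=localhost;dbname=docarch', 'user', 'pass');

    $pending = (int) $db->query(
        "SELECT COUNT(*) FROM docarch_printout WHERE sync = '0000-00-00 00:00:00'"
    )->fetchColumn();

    if ($pending >= 10) {
        // Placeholder for invoking the existing sync script.
        shell_exec('php /path/to/sync.php >> /var/log/docarch_sync.log 2>&1');
    }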

If you want to move the sync logic to the server side, you can consider using message queues to send notifications from the clients to the server whenever a client needs to synchronize; the server can then pull the data. You can use an in-house service bus or on-demand platforms like Azure AppFabric or Amazon SQS for this.

Roopesh Shenoy
The documents that get archived are generated by another application at its day end. The document archive system monitors the directory for new files, processes them, etc. Because of this, the imports are done in batches as well. The sync needs to be done at the end of the day to tie in with when the clients on ISDN lines connect to us for other purposes.
Surim
Okay.. since you've already got an answer that works.. cheers!
Roopesh Shenoy