I have an interesting problem that I am sure has a simple answer, but I can't seem to find it in the docs.

I have two separate database tables on different servers. They have identical schemas and the same primary keys.

I want to merge the tables together on one server. But if a row in Server1.Table1 also exists in Server2.Table2, I want to sum the values in the columns I specify.

Table1 { column_pk, counter }
"test1", 3
"test2", 4

Table2 { column_pk, counter }
"test1", 5
"test2", 6

So after I merge, I want:

"test1", 8
"test2", 10

Basically, I need to do a mysqldump, but instead of it emitting raw INSERT statements, I need it to emit INSERT ... ON DUPLICATE KEY UPDATE statements.

What are my options?

Appreciate any input, thank you

A: 
INSERT INTO table_new (column_pk, counter) VALUES ('test', 4)
  ON DUPLICATE KEY UPDATE counter = counter + 4;

Here, 'test' and 4 are replaced by your actual data.

As for how to produce an export file containing those queries, I'd suggest using regular expressions to modify a standard export. Or insert both tables into the same table and then sum them up, as suggested by Anax.
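A rough sketch of the regular-expression route, in Python. The choice of language, the helper name `to_upsert`, and the pattern are my own assumptions, not something mysqldump provides. It also assumes each INSERT statement in the dump sits on a single line ending in `;`, and that `counter` is the only column to sum.

```python
import re

# Matches a whole INSERT statement on one line and captures everything
# before the trailing semicolon. Assumes one statement per line.
INSERT_RE = re.compile(r"^(INSERT INTO .*);\s*$")


def to_upsert(dump_sql: str, counter_col: str = "counter") -> str:
    """Append an ON DUPLICATE KEY UPDATE clause to every INSERT line.

    Hypothetical helper: `counter_col` is the column to add up when the
    primary key already exists in the destination table.
    """
    suffix = (
        f" ON DUPLICATE KEY UPDATE {counter_col} = "
        f"{counter_col} + VALUES({counter_col});"
    )
    out = []
    for line in dump_sql.splitlines():
        match = INSERT_RE.match(line)
        # Non-INSERT lines (comments, LOCK/UNLOCK, SET ...) pass through as-is.
        out.append(match.group(1) + suffix if match else line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "INSERT INTO `Table1` VALUES ('test1',3),('test2',4);"
    print(to_upsert(sample))
```

You would probably want to take the dump with `--no-create-info` so it does not drop and recreate the destination table. Note that `VALUES(col)` inside ON DUPLICATE KEY UPDATE is deprecated in recent MySQL 8.0 releases in favor of row aliases, so treat this as a starting point.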

thisMayhem
I was hoping not to have to do a lot of fiddling with the SQL output, but if that's the only way, so be it.
Alan Williamson
A: 

You can use a query to select the result you want into a new file:

SELECT column_pk, SUM(counter) AS counter FROM
(
  SELECT column_pk, counter FROM Server1.Table1
  UNION ALL
  SELECT column_pk, counter FROM Server2.Table2
) data
GROUP BY column_pk
INTO OUTFILE 'dump.txt';

You can then load this file into the destination server using LOAD DATA INFILE.

EDIT: Here "Server1.Table1" is a placeholder; you can't really write that. To read from tables on other servers as if they were local, use the FEDERATED storage engine.

mdma
How can he access server1 and server2 over the same connection? As far as I know, you can query multiple tables at the same time, but not multiple servers. Please clarify.
thisMayhem
You can use the FEDERATED storage engine.
mdma
I don't want to read all the servers from the one -- in this environment that won't be possible. We have lots of worker tasks all producing their independent results, and the main database is the aggregated view of it all.
Alan Williamson
A: 

I would throw both tables into a new one (new_pk, old_column_pk, old_counter) and then run

SELECT old_column_pk AS column_pk, SUM(old_counter) AS counter
FROM newtable GROUP BY old_column_pk

and use the result as my final data.

Anax