views: 400
answers: 4

I am working on a hobby app which will contain a large slew of basically-hardcoded data, as well as dynamic user data once I deploy it. I want the ability to update the hardcoded data locally (more UPDATEs than INSERTs), and then export these data to the server. In other words, I need to dump data to a file, and import it in such a way that new rows (which will be relatively few) are INSERTed, and existing rows (as identified by the PK) are UPDATEd. Clearly, new rows can't be INSERTed on the server (or the PKs would potentially clash and issue erroneous UPDATEs); this is an acceptable limitation. However, I cannot DELETE the rows to be UPDATEd, let alone drop the synchronised tables, because the user-accessible tables will have FK constraints on the "static" tables.

Unfortunately, this seems to be very difficult to do. Google and Postgres mailing lists inform me that the "MySQL-like" feature on_duplicate_key_update "will be in a new version" (stale information; is it there?), along with a suggestion that "pg_loader can do it", with no indication of how.

In a worst-case scenario, I suppose I can come up with a home-brewed solution (dump a data file, write a custom import script that checks for PK conflicts and issues INSERT or UPDATE statements appropriately), but that seems like an awfully clumsy solution for a problem that others have surely encountered before me.

Any thoughts?

A: 

Have you considered running postgresql locally?

  1. Dump server data
  2. Import locally
  3. Turn on query logging
  4. Make updates
  5. Take query log and run on server
  6. Clear query log

Or am I misunderstanding what you're trying to do?
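
A minimal sketch of step 3, assuming superuser access on the local instance and a database named appdb (the name is hypothetical); steps 1, 2 and 5 are an ordinary pg_dump/psql round trip. Note that the resulting server log will need some cleanup (log line prefixes, non-replayable statements) before it can be run against the server with psql:

-- log only data-modifying statements (INSERT/UPDATE/DELETE) on the local copy,
-- so the statement log can later be replayed against the server
alter database appdb set log_statement = 'mod';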

Chad Birch
No misunderstanding: That is definitely a workable solution—I am already running a local postgres server (two, in fact, on laptop and desktop, respectively). It sounds like a solution that requires a little more baby-sitting than ideal (what with having to manage and reset the logs on every server update, and sync that between my two local machines), but it will definitely *work*.
A: 
create temporary table newdata as select * from table1 where false;

Fill newdata with new data, then:

start transaction;
create function fill_table1() returns void as
$$
declare
    data record;
begin
    -- walk the staging table and upsert each row into table1
    for data in select * from newdata
    loop
        update table1 set
            column1 = data.column1,
            column2 = data.column2
        where id = data.id;
        -- FOUND is set by the UPDATE above; if no row matched, insert instead
        if not found then
            insert into table1 (id, column1, column2)
            values (data.id, data.column1, data.column2);
        end if;
    end loop;
end;
$$ language plpgsql;
select fill_table1();
drop function fill_table1();
commit;

This is not tested, so it will probably need some tweaking to work. It requires that the table not change while it is running.
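
One way to fill newdata (an assumption about the workflow, not part of the answer) is to export the static table from the local database with psql's \copy and load the CSV into the staging table on the server before calling the function. The temporary table, the load, and the function call all have to happen in the same session:

-- locally: \copy table1 to 'table1.csv' with csv header
-- on the server, in the session that created the temporary table:
\copy newdata from 'table1.csv' with csv header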

Tometzky
+1  A: 

Yeah, a temp table plus some combination of insert/update is the way to go (you don't need to do it in a function, but you can).
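
For example, a rough sketch of the two-statement variant, reusing the newdata staging table and the example columns (id, column1, column2) from the previous answer; the exact column list is an assumption:

begin;
-- first update every row whose PK already exists in table1
update table1 t set
    column1 = n.column1,
    column2 = n.column2
from newdata n
where t.id = n.id;
-- then insert the rows whose PK is not present yet
insert into table1 (id, column1, column2)
select n.id, n.column1, n.column2
from newdata n
where not exists (select 1 from table1 t where t.id = n.id);
commit;

As with the function above, this assumes nothing else writes to table1 between the two statements.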

For the record, the feature you're looking for is the MERGE command. Someone wrote a patch for it, but it was never finished. AFAIK no one is working on it now; it certainly isn't in the soon-to-be-released Postgres 8.4.

xzilla
A: 

This question seems similar to this:

Maybe you could check the solution I provided there.

Andrea Bertani