ansaurus

Question

Insert fail then update OR Load and then decide if insert or update.

Answer 1

+1 A:

If performance is your goal, first get rid of the word "iterate" from your vocabulary! Learn to do things in sets.

If you need to update or insert, always do the update first. Otherwise it is easy to find yourself accidentally updating the record you just inserted. It helps to have an identifier you can check to see whether the record exists: if the identifier exists, do the update; otherwise, do the insert.

HLGEM 2008-12-19 16:00:33

Answer 2

+1 A:

The important thing is to understand the ratio of inserts to updates in the list you receive. IMHO you should implement an abstract strategy that says "persist this to the database". Then create concrete strategies that (for example):

Check for the primary key; if zero records are found, do the insert, else do the update.
Do the update and, if it fails (no rows updated), do the insert.
Others.

Then pull the strategy to use (for example, the fully qualified class name) from a configuration file. This way you can switch from one strategy to another easily. If it is feasible, which may depend on your domain, you can add a heuristic that selects the best strategy based on the input entities in the set.
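A minimal sketch of that idea, assuming Python with the standard sqlite3 module in place of whatever stack you actually use. The class names, the `persist_strategy` config key and the `my_table (id, col1)` schema are all made up for illustration:

```python
import sqlite3
from abc import ABC, abstractmethod


class PersistStrategy(ABC):
    """Abstract strategy: persist one (id, col1) entity to the database."""

    @abstractmethod
    def persist(self, conn, entity_id, col1):
        ...


class CheckThenWrite(PersistStrategy):
    """Look up the primary key first, then insert or update accordingly."""

    def persist(self, conn, entity_id, col1):
        found = conn.execute(
            "SELECT 1 FROM my_table WHERE id = ?", (entity_id,)
        ).fetchone()
        if found is None:
            conn.execute(
                "INSERT INTO my_table (id, col1) VALUES (?, ?)", (entity_id, col1)
            )
        else:
            conn.execute(
                "UPDATE my_table SET col1 = ? WHERE id = ?", (col1, entity_id)
            )


class UpdateThenInsert(PersistStrategy):
    """Try the update; if it touched no row, insert instead."""

    def persist(self, conn, entity_id, col1):
        cur = conn.execute(
            "UPDATE my_table SET col1 = ? WHERE id = ?", (col1, entity_id)
        )
        if cur.rowcount == 0:
            conn.execute(
                "INSERT INTO my_table (id, col1) VALUES (?, ?)", (entity_id, col1)
            )


# Hypothetical registry; a real setup might load a class by its qualified name.
STRATEGIES = {
    "check_then_write": CheckThenWrite,
    "update_then_insert": UpdateThenInsert,
}


def load_strategy(config):
    """Pick the strategy named in a configuration mapping."""
    return STRATEGIES[config["persist_strategy"]]()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, col1 INTEGER)")
strategy = load_strategy({"persist_strategy": "update_then_insert"})
for entity_id, col1 in [(1, 10), (2, 20), (1, 11)]:
    strategy.persist(conn, entity_id, col1)
rows = conn.execute("SELECT id, col1 FROM my_table ORDER BY id").fetchall()
```

Switching strategies is then a one-line config change, so you can measure each one against your real insert/update ratio.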

smink 2008-12-19 16:08:11

Answer 3

+4 A:

If your database supports MERGE, I would have thought that was most efficient (and treats all the data as a single set).

See:

http://www.oracle.com/technology/products/oracle9i/daily/Aug24.html

http://blogs.techrepublic.com.com/datacenter/?p=194

DanSingerman 2008-12-19 16:16:59

YES! It supports merge :) Thank god :)

sakana 2008-12-19 16:22:25

Answer 4

+1 A:

MySQL supports this:

INSERT INTO foo
SET bar='baz', howmanybars=1
ON DUPLICATE KEY UPDATE howmanybars=howmanybars+1
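For comparison, other engines have similar single-statement forms. The sketch below is not MySQL: it uses SQLite's `ON CONFLICT ... DO UPDATE` clause (SQLite 3.24 or later) through Python's sqlite3 module, and it assumes `bar` is the unique key on `foo`, which the statement above implies but does not show:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: bar is the key that triggers the conflict.
conn.execute(
    "CREATE TABLE foo (bar TEXT PRIMARY KEY, howmanybars INTEGER NOT NULL)"
)

UPSERT = """
    INSERT INTO foo (bar, howmanybars) VALUES (?, 1)
    ON CONFLICT(bar) DO UPDATE SET howmanybars = howmanybars + 1
"""

# First call inserts the row; the next two hit the key and increment it.
for _ in range(3):
    conn.execute(UPSERT, ("baz",))

count = conn.execute(
    "SELECT howmanybars FROM foo WHERE bar = 'baz'"
).fetchone()[0]
```

Either way the database resolves the insert-or-update decision in one statement and one round trip.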

chaos 2008-12-19 17:30:20

Answer 5

+1 A:

Option 2 is not going to be the most efficient. The database will already be making this check for you when you do the actual insert or update in order to enforce the primary key. By making this check yourself you are incurring the overhead of a table lookup twice as well as an extra round trip from your Java code. Choose which case is the most likely and code optimistically.
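Coding optimistically from the client side could look roughly like this. It is only a sketch using Python's sqlite3 module rather than Java/JDBC, and the `my_table (id, col1)` schema is assumed; here the insert is treated as the likely case:

```python
import sqlite3


def insert_or_update(conn, row_id, col1):
    """Insert optimistically; on a key collision, update instead."""
    try:
        conn.execute(
            "INSERT INTO my_table (id, col1) VALUES (?, ?)", (row_id, col1)
        )
        return "inserted"
    except sqlite3.IntegrityError:
        # Note: IntegrityError covers any constraint violation, not only
        # duplicate keys; this sketch assumes the key is the only constraint.
        conn.execute(
            "UPDATE my_table SET col1 = ? WHERE id = ?", (col1, row_id)
        )
        return "updated"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, col1 INTEGER)")
outcomes = [insert_or_update(conn, 1, 10), insert_or_update(conn, 1, 11)]
final_rows = conn.execute("SELECT id, col1 FROM my_table").fetchall()
```

Be aware that on some databases (PostgreSQL, for one) a failed statement aborts the surrounding transaction, so a client-side version there would need a savepoint around the insert.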

Expanding on option 1, you can use a stored procedure to handle the insert/update. This example with PostgreSQL syntax assumes the insert is the normal case.

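CREATE FUNCTION insert_or_update(_id INTEGER, _col1 INTEGER) RETURNS void
AS $$
BEGIN
    -- Inner block: a unique_violation raised by the INSERT is trapped below.
    BEGIN
        INSERT INTO
            my_table (id, col1)
        SELECT
            _id, _col1;
    EXCEPTION WHEN unique_violation THEN
        -- The row already exists, so update it instead.
        UPDATE
            my_table
        SET
            col1 = _col1
        WHERE
            id = _id;
    END;
END;
$$
LANGUAGE plpgsql;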

You could also make the update the normal case and then check the number of rows affected by the update statement to determine if the row is actually new and you need to do an insert.

As alluded to in some other answers, the most efficient way to handle this operation is in one batch:

Take all of the rows passed to the web service and bulk insert them into a temporary table
Update rows in the master table from the temp table
Insert new rows in the master table from the temp table
Dispose of the temp table

The type of temporary table to use and most efficient way to manage it will depend on the database you are using.
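The four steps above can be sketched as follows. This uses SQLite via Python's sqlite3 module purely so the example is runnable; the table names `master_table` and `staging` and the `(id, col1)` columns are assumptions, and a real bulk load would use your database's own loader instead of `executemany`:

```python
import sqlite3


def batch_upsert(conn, rows):
    """Stage rows in a temp table, then update and insert as two set operations."""
    # Step 1: bulk insert the incoming rows into a temporary table.
    conn.execute(
        "CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, col1 INTEGER)"
    )
    conn.executemany("INSERT INTO staging (id, col1) VALUES (?, ?)", rows)

    # Step 2: update existing rows. Doing this before the insert means the
    # rows added in step 3 are not touched a second time.
    updated = conn.execute(
        """
        UPDATE master_table
        SET col1 = (SELECT s.col1 FROM staging s WHERE s.id = master_table.id)
        WHERE id IN (SELECT id FROM staging)
        """
    ).rowcount

    # Step 3: insert the rows that are not in the master table yet.
    inserted = conn.execute(
        """
        INSERT INTO master_table (id, col1)
        SELECT s.id, s.col1 FROM staging s
        WHERE NOT EXISTS (SELECT 1 FROM master_table m WHERE m.id = s.id)
        """
    ).rowcount

    # Step 4: dispose of the temp table.
    conn.execute("DROP TABLE staging")
    conn.commit()
    return updated, inserted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master_table (id INTEGER PRIMARY KEY, col1 INTEGER)")
conn.execute("INSERT INTO master_table VALUES (1, 10)")
counts = batch_upsert(conn, [(1, 11), (2, 20), (3, 30)])
final = conn.execute("SELECT id, col1 FROM master_table ORDER BY id").fetchall()
```

However many rows arrive, the master table is touched by exactly two statements, which is where the batch approach gets its advantage over row-by-row calls.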

cope360 2008-12-20 22:05:01
