tags:

views:

62

answers:

2

I am making a PHP utility that imports and analyzes a CSV file into $data, and whether or not to INSERT "new" rows into the database ($saveNew).

Right now, I have a bit of an ugly mess: (in generalized pseudo-PHP)

function synchronize($data,$saveNew) {
    $existing_ids = $table->find_all('ID'); //array of ID's in the table
    $incoming_ids = get_key('ID',$data); //grabs the ID field from each record
    $new_ids = array_diff($incoming_ids,$existing_ids);

    foreach ($data as $record) {
        if (in_array($record['ID'],$new_ids)) { //it's new
            if ($saveNew) $table->insert($record);
            else continue;
        } else {
            $table->update($record);
        }
     }
 }

To me, this just has a smell, and I think I could accomplish this in just a single query, except that I'm not that familiar with SQL.

I'm using a simple ORM in my app, but I can easily just use straight SQL.

Oh, and I'm using MySQL.

Can this be done with just one query? It seems like this would be a common problem with a simple solution, but I just can't figure it out.

+3  A: 

Have a look at MySQL's INSERT ... ON DUPLICATE KEY ... syntax, which allows you to do just that: http://dev.mysql.com/doc/refman/5.1/en/insert-on-duplicate.html

Whether you want to implement that in your ORM or in that specific subroutine of yours is up to you...

mfontani
This looks much better, but wouldn't this only apply if `$saveNew` is true? If we want to `synchronize($data,false)`, then we don't want to `INSERT` at all. Is there such a thing as `UPDATE IF EXISTS`?
Austin Hyde
There's no need for an `UPDATE IF EXISTS`... when you call `UPDATE ... WHERE id = 5`, if id `5` doesn't exist, it just won't update anything... No error, just no updates...
ircmaxell
@ircmaxwell: Oh wow, talk about a "duh!" moment for me. I'm apparently not thinking very straight... :D
Austin Hyde
A: 

If you have a unique row identifier, you can use the INSERT ... ON DUPLICATE KEY UPDATE syntax:

INSERT INTO foo (id, col1, col2) VALUES (?, ?, ?) 
    ON DUPLICATE KEY UPDATE col1 = VALUES(col1), col2 = VALUES(col2);

Then, when you bind NULL, 4, 5 it'll insert a new row (assuming id is an autoincrement column)

When you bind 1, 4, 5 it'll insert a new row if there is no row 1, otherwise it'll update row 1's col1 and col2 fields to 4 and 5 respectively...

ircmaxell