First of all, let me just say that I'm using the PHP framework Yii, so I'd like to stay within its defined set of SQL statements if possible. I know I could probably create one huge SQL statement that would do everything, but I'd rather not go there.

OK, imagine I have a table Users and a table FavColors. Then I have a form where users can select their color preferences by checking one or more checkboxes from a large list of possible colors.

Those results are stored as multiple rows in the FavColors table like this (id, user_id, color_id).

Now imagine the user goes in and changes their color preference. In this scenario, what would be the most efficient way to get the new color preferences into the database?

Option 1:

  • Do a mass delete of all rows where user_id matches
  • Then do a mass insert of all new rows
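A minimal sketch of Option 1, for illustration only. It uses Python's built-in sqlite3 module as a stand-in for PHP and MySQL so that it runs on its own; the table name `fav_colors` and the helpers `replace_colors` and `colors_of` are hypothetical, not part of Yii.

```python
import sqlite3

# Hypothetical schema mirroring the (id, user_id, color_id) layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fav_colors ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " user_id INTEGER NOT NULL,"
    " color_id INTEGER NOT NULL)"
)

def replace_colors(conn, user_id, color_ids):
    """Option 1: wipe the user's rows, then insert the new selection."""
    # `with conn` runs both statements in one transaction, so a failed
    # insert also rolls back the delete.
    with conn:
        conn.execute("DELETE FROM fav_colors WHERE user_id = ?", (user_id,))
        conn.executemany(
            "INSERT INTO fav_colors (user_id, color_id) VALUES (?, ?)",
            [(user_id, c) for c in color_ids],
        )

def colors_of(conn, user_id):
    """Return the user's color ids in ascending order."""
    rows = conn.execute(
        "SELECT color_id FROM fav_colors WHERE user_id = ? ORDER BY color_id",
        (user_id,),
    )
    return [r[0] for r in rows]

replace_colors(conn, 1, [3, 5, 8])
replace_colors(conn, 1, [3, 5, 9])  # the user swaps color 8 for color 9
print(colors_of(conn, 1))
```

Note that the second call re-creates the unchanged rows for colors 3 and 5 under new ids, which is the auto-increment growth this approach is prone to.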

Option 2:

  • Go through each current row to see what's changed, and update accordingly
  • If more rows need to be inserted, do that.
  • If rows need to be deleted, do that.
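Since each row holds only a user_id and a color_id, Option 2 can arguably be reduced to a set difference: nothing needs updating, only deleting what was unchecked and inserting what was newly checked. A rough sketch, again using Python's sqlite3 as a stand-in for PHP and MySQL; the `fav_colors` table, its UNIQUE constraint, and the `sync_colors` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fav_colors ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " user_id INTEGER NOT NULL,"
    " color_id INTEGER NOT NULL,"
    " UNIQUE (user_id, color_id))"  # assumed: one row per user/color pair
)

def sync_colors(conn, user_id, new_ids):
    """Option 2: touch only the rows that actually changed."""
    current = {
        row[0]
        for row in conn.execute(
            "SELECT color_id FROM fav_colors WHERE user_id = ?", (user_id,)
        )
    }
    wanted = set(new_ids)
    to_delete = current - wanted  # unchecked since last save
    to_insert = wanted - current  # newly checked
    with conn:  # one transaction for both batches
        conn.executemany(
            "DELETE FROM fav_colors WHERE user_id = ? AND color_id = ?",
            [(user_id, c) for c in to_delete],
        )
        conn.executemany(
            "INSERT INTO fav_colors (user_id, color_id) VALUES (?, ?)",
            [(user_id, c) for c in to_insert],
        )
    return to_delete, to_insert

sync_colors(conn, 1, [3, 5, 8])
removed, added = sync_colors(conn, 1, [3, 5, 9])
print(removed, added)
```

Rows for colors that stay selected are never touched, so they keep their original ids.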

I like option one because it only requires two statements, but something just feels wrong about deleting a row only to put almost exactly the same data back in. There's also the issue of making the ids auto-increment to higher values more quickly, and I don't know if that should be avoided whenever possible.

Option 2 will require a lot more programming work, but would prevent situations where I'd delete a row just to create it again. However, adding more load in PHP may not be worth the decrease in load for MySQL.

Any thoughts? What would you all do?

+4  A: 

UPDATE is much faster. When you UPDATE, the existing table records are simply rewritten with new data. With DELETE followed by INSERT, all of that work has to be done again on the INSERT.

When you DELETE, the indexes must be updated (remember, you delete the whole row, not only the columns you need to modify) and data blocks may be moved (if you hit the PCTFREE limit). Also, deleting and re-inserting changes the records' auto_increment IDs, so any relationships that reference those records would be broken or would need updating too. I'd go for UPDATE.

That's why you should always use INSERT ... ON DUPLICATE KEY UPDATE instead of REPLACE.

The former is an UPDATE operation in case of a key violation, while the latter is a DELETE followed by an INSERT.

UPDATE: Here's an example: `INSERT INTO table (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=c+1;`
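To see the same idea run end to end, here is a hedged sketch using Python's sqlite3. SQLite has no `ON DUPLICATE KEY UPDATE`; its `ON CONFLICT ... DO UPDATE` clause (available from SQLite 3.24) is used here as the closest analogue, so the syntax differs from the MySQL statement. The table `t` with `a` as its primary key is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: `a` is the key whose duplication triggers the update.
conn.execute("CREATE TABLE t (a INTEGER PRIMARY KEY, b INTEGER, c INTEGER)")

# SQLite analogue of MySQL's INSERT ... ON DUPLICATE KEY UPDATE c = c + 1.
UPSERT = (
    "INSERT INTO t (a, b, c) VALUES (?, ?, ?) "
    "ON CONFLICT(a) DO UPDATE SET c = c + 1"
)

with conn:
    conn.execute(UPSERT, (1, 2, 3))  # no row with a = 1 yet: plain insert
    conn.execute(UPSERT, (1, 2, 3))  # key already exists: row is updated in place

rows = conn.execute("SELECT a, b, c FROM t").fetchall()
print(rows)
```

The second statement leaves a single row whose `c` has been incremented, rather than deleting and re-creating it.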

For more details, read the update documentation.

MovieYoda
If updating an indexed field, the index is updated too
OMG Ponies
@OMG yeah! Forgot to mention that. Good that you did...
MovieYoda
Thanks, that makes sense. Unfortunately, when you're using a framework based on ActiveRecord, it's not always easy to create a custom SQL statement and include "INSERT ... ON DUPLICATE KEY UPDATE". If I do that, I'll have to manually create a query, and that query most likely won't translate if I need to switch DBMS in the future.
Philip Walton
Check whether your framework provides any feature that will help you pull this off. Otherwise, just do an `update`.
MovieYoda
Would updating 10 rows individually with 10 SQL statements really be faster than deleting 10 and then adding 10 with two SQL statements?
Philip Walton
Try to do all your updates in a single SQL statement. That helps.
MovieYoda
I can't do it all in a single SQL statement because the values I'm updating them to are not consistent across all updated rows. Given that, is it STILL better to update 10 rows one by one than to do a mass delete and then insert?
Philip Walton
MovieYoda
@movieyoda, thanks, I appreciate the dialog!
Philip Walton
@philip in keeping with good practices at StackOverflow, please upvote any answer that has helped you and finally mark the 'one' answer that has helped you the most as 'correct' (green check).
MovieYoda
@movieyoda, I would certainly give you credit, but I can't upvote yet since I don't have enough points. I also want to hear what others have to say before I pick a correct answer, but it will most likely be yours :)
Philip Walton
A: 

To anyone else looking for a solution to this problem, I found this page to be very helpful as well: http://www.karlrixon.co.uk/articles/sql/update-multiple-rows-with-different-values-and-a-single-sql-query/

It shows you how to update multiple rows with different values in a single SQL statement.
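One common way to do this is a CASE expression keyed on the row id. The sketch below is my own illustration rather than code from the linked page; it uses Python's sqlite3 as a stand-in for PHP and MySQL, and the `fav_colors` table and `update_many` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fav_colors ("
    " id INTEGER PRIMARY KEY, user_id INTEGER, color_id INTEGER)"
)
conn.executemany(
    "INSERT INTO fav_colors VALUES (?, ?, ?)",
    [(1, 1, 3), (2, 1, 5), (3, 1, 8)],
)
conn.commit()

def update_many(conn, new_color_by_id):
    """Build one UPDATE ... CASE statement from a {row_id: color_id} mapping."""
    if not new_color_by_id:
        return
    # One "WHEN ? THEN ?" pair per row, all values bound as parameters.
    whens = " ".join("WHEN ? THEN ?" for _ in new_color_by_id)
    marks = ", ".join("?" for _ in new_color_by_id)
    sql = (
        f"UPDATE fav_colors SET color_id = CASE id {whens} END "
        f"WHERE id IN ({marks})"  # restrict to listed ids so others are untouched
    )
    params = [v for pair in new_color_by_id.items() for v in pair]
    params += list(new_color_by_id)
    with conn:
        conn.execute(sql, params)

update_many(conn, {2: 6, 3: 9})  # row 2 gets color 6, row 3 gets color 9
rows = conn.execute("SELECT id, color_id FROM fav_colors ORDER BY id").fetchall()
print(rows)
```

The WHERE clause matters: without it, rows not named in the CASE would have their color_id set to NULL.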

Philip Walton
A: 

Philip, have you tried prepared statements? With prepared statements you can prepare one query and call it multiple times with different parameters. At the end of your loop, you can execute all of them with a minimal amount of network latency. I have used prepared statements with PHP and they work great, though they are a little more confusing than Java prepared statements.
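As a rough illustration of the pattern (one statement, many parameter sets), here is a sketch using Python's sqlite3, where `executemany` binds each tuple to the same parameterized statement. In PHP the equivalent would be preparing a statement once and executing it in a loop. SQLite is in-process, so this shows only the shape of the code, not any network saving; the `fav_colors` table is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fav_colors ("
    " id INTEGER PRIMARY KEY, user_id INTEGER, color_id INTEGER)"
)
conn.executemany(
    "INSERT INTO fav_colors VALUES (?, ?, ?)",
    [(1, 1, 3), (2, 1, 5), (3, 1, 8)],
)
conn.commit()

# Each tuple is (new color_id, row id); every row gets a different value.
changes = [(4, 1), (6, 2), (9, 3)]

with conn:  # a single transaction around the whole batch
    conn.executemany(
        "UPDATE fav_colors SET color_id = ? WHERE id = ?",
        changes,
    )

rows = conn.execute("SELECT id, color_id FROM fav_colors ORDER BY id").fetchall()
print(rows)
```

Wrapping the batch in one transaction means the updates are committed together rather than one at a time.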

Amir Raminfar