views:

172

answers:

8

I have a table that holds information about cities in a game, you can build one building each turn and this is recorded with the value "usedBuilding".

Each turn I will run a script that alters usedBuilding to 0, the question is, which of the following two ways is faster and does it actually matter which way is used?

UPDATE cities SET usedBuilding = 0;
UPDATE cities SET usedBuilding = 0 WHERE usedBuilding = 1;
+3  A: 

If usedBuilding is indexed, it will be quicker to use the where clause since it will only access/update rows where usedBuilding is true. If it's not indexed, you'd be doing a full table scan anyway, so it wouldn't make much (any?) difference.

mopoke
+1  A: 

It seems like there would be a lower number of transactions to make the "UPDATE cities SET usedBuilding = 0;" execute than the more specific query. The main reason I can think of against this would be if you had more than one state to your column. If its merely a boolean then it would be fine, but you may want to spend some time thinking if that will always be the case.

Indexing could also cause the execution plan to be more efficient using the WHERE clause.

crackity_jones
+1  A: 

The best way to get a definitive answer would be to profile using a lot of sample data under differing scenarios.

mattlant
+3  A: 

Try both ways in a loop a few thousand times and time them! It probably depends on: how many records are actually in this table, and whether they all fit in memory or have to be paged to disk. How many buildings are at value 1 before you run the update (I'm guessing this might be 1).

It doesn't matter which way is used, but the shortest one's probably got the least that can go wrong with it. Code that you don't write can't have bugs.

Scott Langham
"Traditional" profiling is generally less than useless in RDBMS land. You *have* to look at query plans and stats, rather than just blindly timing. Also, in general, the *more* data you can give an RDBMS, the better optimization it can do on your behalf.
Mark Brackett
+1  A: 

indexing won't help you at all unless you have something like maybe 2% of the usedBuilding = 1 values.

however these 2 statements are logically different and can mean totally different things. but if for your case they are the same then use the one without the where clause.

Mladen
+1  A: 

How often are these turns happening? How many rows do you expect to have in this table? If the answers are 'less than once a second' and 'less than 10000', just stop worrying.

Unless if you happen to have some sort of academic interest in this, of course.

Internet Friend
+4  A: 

In general, the 2nd case (with the WHERE) clause would be faster - as it won't cause trigger evaluation, transaction logging, index updating, etc. on the unused rows.

Potentially - depending on the distribution of 0/1 values, it could actually be faster to update all rows rather than doing the comparison - but that's a pretty degenerate case.

Since ~95% of your query costs are I/O, using the WHERE clause will either make no difference (since the column is not indexed, and you're doing a table scan) or a huge difference (if the column is indexed, or the table partitioned, etc.). Either way, it doesn't hurt.

I'd suspect that for the amount of data you're talking, you won't notice a difference in either execution plans or speed - which makes it academic at best, premature optimization at worst. So, I'd advise to go with whatever logically makes sense for your app.

Mark Brackett
+1  A: 

How many rows exactly will you have? I suspect that for a smallish online game, you really don't care.

If you're doing several updates to the "cities" table, it might be a good idea to do them all in one UPDATE statement if possible.

Making any change to a row probably takes just as much I/O as writing the entire row (except of course updating indexed columns also requires index writes), so you lose out by making several UPDATEs which hit lots of rows.

But if you have, say, <1000 rows, you really don't care :)

MarkR