tags:
views: 230
answers: 9

I have a table that contains maybe 10k to 100k rows, and I need varying sets of up to one or two thousand rows, but often far fewer. I want these queries to be as fast as possible, and I would like to know which approach is generally smarter:

  1. Always query for exactly the rows I need, with a WHERE clause that's different every time.
  2. Load the whole table into an in-memory cache inside my app, search there, and sync the cache regularly.
  3. Always query the whole table (no WHERE clause), let the SQL server handle the caching (it's always the same query, so it can cache the result), and filter the output as needed.

I'd like to be agnostic of a specific DB engine for now.
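For concreteness, option 1 might look like the following minimal sketch (SQLite via Python here purely as an example; the table, columns, and data are made up):

```python
import sqlite3

# Hypothetical table, columns, and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, category TEXT, price REAL)")
conn.executemany("INSERT INTO items (category, price) VALUES (?, ?)",
                 [("book", 9.99), ("book", 24.50), ("toy", 4.25)])

# Option 1: parameterize the varying WHERE clause; only the matching
# rows come back, and the engine can use an index to find them.
rows = conn.execute(
    "SELECT id, price FROM items WHERE category = ? AND price < ?",
    ("book", 20.0),
).fetchall()
print(rows)
```

The same parameterized-query shape works in any engine; only the placeholder syntax (`?`, `%s`, `:name`) differs.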

+2  A: 

In my experience it is best to query for exactly what you want and let the database figure out the best way to do it. You can also examine the query plan to see whether you have any bottlenecks that could be helped by indexes.
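Examining the query plan can be sketched like this in SQLite, which exposes it via `EXPLAIN QUERY PLAN` (other engines have `EXPLAIN` / `EXPLAIN ANALYZE` equivalents; the schema here is made up):

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

# EXPLAIN QUERY PLAN reveals how SQLite intends to run the query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?", ("a@b.c",)
).fetchall()
detail = plan[0][3]
print(detail)  # a full table scan, since email has no index
```

Seeing a scan where you expected an index lookup is exactly the kind of bottleneck this technique surfaces.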

Otávio Décio
+7  A: 

With 10K to 100K rows, number 1 is the clear winner to me. If it were <1K rows I might say keep it cached in the application, but with this many rows, let the DB do what it was designed to do. With the proper indexes, number 1 is the best bet.

If you were pulling the same set of data over and over, caching the results might be a better bet, but when you are going to have a different WHERE clause every time, it is best to let the DB take care of it.

Like I said, though, just make sure you index all the appropriate fields well.
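Indexing the fields that appear in the varying WHERE clause can be sketched like this (SQLite again; table and index names are made up):

```python
import sqlite3

# Hypothetical table and index names, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
# Index the column the varying WHERE clause filters on.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index in place, the query becomes an index lookup
# instead of a scan over all 10k-100k rows.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = ?", (42,)
).fetchall()
detail = plan[0][3]
print(detail)
```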

Ryan Guill
+3  A: 

I firmly believe option 1 should be preferred initially. When you encounter performance problems, you can look at how to optimize it using caching. ("Premature optimization is the root of all evil," as Dijkstra once said.)

Also, remember that if you choose option 3, you'll be sending the complete table contents over the network as well, which also hurts performance.

Frederik Gheysels
It was actually Knuth that said that. :-)
Jason Baker
+4  A: 

Seems to me that a system designed for rapid searching, slicing, and dicing of information is going to be a lot faster at it than the average developer's code. On the other hand, some factors you don't mention include the location or potential location of the database server in relation to the application - returning large data sets over slower networks would certainly tip the scales in favor of the "grab it all and search locally" option. I think that, in the 'general' case, I'd recommend querying for just what you want, but that in special circumstances, other options may be better.

Harper Shelby
A: 

If you do this:

SELECT * FROM users;

MySQL should perform two queries: one to find out the fields in the table, and another to bring back the data you asked for.

Doing

SELECT id, email, password FROM users;

MySQL only reaches for the data, since the fields are explicit.

About limits: it is always best to query only the rows you will need, no more, no less. More data means more time to transfer it.

Gabriel Sosa
Untrue. Any SQL interpreter will have to fetch the attribute information in order to tell if your list is correct. Otherwise, how would it know that SELECT asdkfljdsalkfjsdlakf isn't a valid attribute?
Matt Rogish
+1  A: 

The only reason to use anything other than option 1 is if the WHERE clause itself is huge (i.e. if your WHERE clause identifies each row individually, e.g. WHERE id = 3 OR id = 4 OR id = 32 OR ...).
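When the row set really is an arbitrary list of ids, the long chain of OR conditions can at least be written as a single parameterized IN list (a sketch with made-up names, using SQLite for illustration):

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO t (id, val) VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(1, 101)])

# Instead of "WHERE id = 3 OR id = 4 OR ...", bind the ids into one IN list.
wanted = [3, 4, 32]
placeholders = ",".join("?" * len(wanted))
rows = conn.execute(
    f"SELECT id, val FROM t WHERE id IN ({placeholders})", wanted
).fetchall()
print(rows)
```

For very large id lists, loading the ids into a temporary table and joining against it is a common alternative, since some engines limit the number of bound parameters.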

j_random_hacker
+2  A: 

First of all, let us dismiss #2. Searching tables is a database server's reason for existence, and it will almost certainly do a better job of it than any ad hoc search you cook up.

For #3, you just say "filter the output as needed" without saying where that filtering is done. If it's done in the application code, as in #2, then you have the same problem as #2.

Databases were created specifically to handle this exact problem. They are very good at it. Let them do it.

James Curran
A: 

Is anything else changing your data? The point about letting the SQL engine optimally slice and dice is a good one. But it would be surprising if you were working with a database without the possibility of "someone else" changing the data. If changes can be made elsewhere, you certainly want to re-query frequently.

sfuqua
A: 

Trust that the SQL server will do a better job of both caching and filtering than you can afford to do yourself (unless performance testing shows otherwise).

Note that I said "afford to do" not just "do". You may very well be able to do it better but you are being paid (presumably) to provide functionality not caching.

Ask yourself this: is spending time writing cache-management code helping you fulfil your requirements document?

Chris Nava