views:

221

answers:

5

How much of a performance benefit is there by selecting only required field in query instead of querying the entire row? For example, if I have a row of 10 fields but only need 5 fields in the display, is it worth querying only those 5? what is the performance benefit with this limitation vs the risk of having to go back and add fields in the sql query later if needed?

+5  A: 

It depends on how many rows are selected, and how much memory do those extra fields consume. It can run much slower if several text/blobs fields are present for example, or many rows are selected.

How is adding fields later a risk? modifying queries to fit changing requirements is a natural part of the development process.

Eran Galperin
+2  A: 

The only benefit I know of explicitly naming your columns in your select statement is that if a column your code is using gets renamed your select statement will fail before your code. Even better if your select statement is within a proc, your proc and the DB script would not compile. This is very handy if you are using tools like VS DB edition to compile/verify DB scripts. Otherwise the performance difference would be negligible.

FutureGuy
+2  A: 

The number of fields retrieved is a second order effect on performance relative to the large overhead of the SQL request itself -- going out of process, across the network to another host, and possibly to disk on that host takes many more cycles than shoveling a few extra bytes of data.

Obviously if the extra fields include a megabyte blob the equation is skewed. But my experience is that the transaction overhead is of the same order, or larger, than the actual data retreived. I remember vaguely from many years ago than an "empty" NOP TNS request is about 100 bytes on the wire.

Tom A
+2  A: 

If the SQL server is not the same machine from which you're querying, then selecting the extra columns transfers more data over the network (which can be a bottleneck), not forgetting that it has to read more data from the disk, allocate more memory to hold the results.

There's not one thing that would cause a problem by itself, but add things up and they all together cause performance issues. Every little bit helps when you have lots of either queries or data.

The risk I guess would be that you have to add the fields to the query later which possibly means changing code, but then you generally have to add more code to handle extra fields anyway.

skirmish
+6  A: 

It's not just the extra data aspect that you need to consider. Selecting all columns will negate the usefulness of covering indexes, since a bookmark lookup into the clustered index (or table) will be required.

Mitch Wheat
He is asking about columns per row, this has no effect on the execution plan. Indexing works as usual
Eran Galperin
@Eran Galperin: Some DBMSs can avoid going to the actual table if all columns you use are in an index. This can save some I/O.
derobert
@Eran Galperi: That's incorrect. The columns selected affects the choice of index.
Mitch Wheat
Plus, it doesn't just save on I/O (logical and possibly physical), it completly alters the locking landscape, when processes have to visit the table (or clustered index) as oppose to satisfying a query from a covering index.
Mitch Wheat
I stand corrected :)
Eran Galperin