I've heard using 'select *' will add to the time SQL Server takes to build the query execution plan as it has to find out which columns are present on the referenced tables. Does the same apply for a query based on a view instead of a table?

+1  A: 

Yes; the column discovery process still has to take place. "select *" should never be used in a production application or process. You should ALWAYS explicitly define the specific data you want to retrieve.

Adam Robinson
+1  A: 

I would wager that the time is negligible in either case; however, you should avoid select * for other reasons.

Using select * can return more data than you actually need, which, depending on the data, could be significant. There is also the problem that if you remove a column, your query might still work while the consuming code fails. You really lose the ability to determine whether someone is using that column.

JoshBerke
+6  A: 

Negligible time. Yes, if it applies to a table it applies to a view.

But in 12 years of doing SQL, I have yet to see a query sped up by explicitly naming columns instead of using *.

In production code, I don't use *, but that's for reasons of making code mean what it says, not efficiency, and because order can matter when binding to a result set.

In a production view, I'll use * if the intent of the view (the "what I mean to say") is "bring in all columns"; that way, recompiling the view will pick up table changes. Order doesn't matter in a view; only in the client-side query that may use the view.

On edit: let me note that a view definition is parsed once (until it is recompiled), when the view is created, not when it is used. So the tiny amount of time to get from * to column1, c2, d3, foobar happens once, at the time the create view is sent to the db server.
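To make the "parsed once" point concrete, here is a minimal T-SQL sketch. The names t_view_demo and v_view_demo are hypothetical, invented for illustration; sp_refreshview is the standard system procedure for refreshing the metadata of a view that is not schema-bound. GO is the batch separator used by tools such as SSMS and sqlcmd, and is needed here because CREATE VIEW must be the first statement in its batch.

```sql
-- Hypothetical demo objects, not part of the original question.
CREATE TABLE t_view_demo (id INT NOT NULL PRIMARY KEY, value1 INT);
GO

-- The * is expanded to the table's current column list when the view is created.
CREATE VIEW v_view_demo AS
SELECT * FROM t_view_demo;
GO

-- Change the underlying table after the view already exists.
ALTER TABLE t_view_demo ADD value2 INT NULL;
GO

-- The view still reflects the column list recorded at creation time.
SELECT * FROM v_view_demo;
GO

-- Refresh the view's metadata so it picks up the table change.
EXEC sp_refreshview 'v_view_demo';
GO

SELECT * FROM v_view_demo;
GO
```

Comparing the two result sets before and after the refresh should show whether the new column is visible through the view on your server.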

Now, bringing back all the columns to the client will be slower than bringing back only one (though usually not by much). But that's a different issue.

tpdi
+3  A: 

There is no difference in terms of the query plan, but explicitly defining columns is considered good practice for a number of reasons, including:

  • Adding columns to the table will not break legacy code that depends on the old column setup.
  • Selecting data from columns that you don't need means more data transfer, which is often the slowest part of getting data from the database.
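
As an illustration of the first point, here is a small T-SQL sketch of legacy code that depends on the old column setup. The table names orders_demo and orders_archive_demo are hypothetical and exist only for this example.

```sql
-- Hypothetical demo tables with identical column lists.
CREATE TABLE orders_demo         (id INT NOT NULL PRIMARY KEY, amount INT);
CREATE TABLE orders_archive_demo (id INT NOT NULL PRIMARY KEY, amount INT);

-- Works only while both tables have the same columns in the same order.
INSERT INTO orders_archive_demo
SELECT * FROM orders_demo;

-- Someone later adds a column to the source table.
ALTER TABLE orders_demo ADD note VARCHAR(50) NULL;

-- The same statement would now be rejected, because the select list
-- supplies three columns and the target table has two:
-- INSERT INTO orders_archive_demo SELECT * FROM orders_demo;

-- Naming the columns keeps the statement valid after the change.
INSERT INTO orders_archive_demo (id, amount)
SELECT id, amount FROM orders_demo;
```
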
Terrapin
+2  A: 

Imagine you have a table like this:

CREATE TABLE t_test (id INT NOT NULL PRIMARY KEY, value1 INT, value2 INT, aux_value VARCHAR(200))
CREATE INDEX ix_test_values ON t_test (value1, value2)

Then you want to select all values inside a certain range:

SELECT  value1, value2
FROM    t_test
WHERE   value1 BETWEEN 10 AND 20

In this case, SQL Server will just scan the index ix_test_values. Everything you want to know is contained in this index, so nothing but the index scan is required.

Now you issue:

SELECT  *
FROM    t_test
WHERE   value1 BETWEEN 10 AND 20

SQL Server now needs to select id and aux_value along with value1 and value2. These values are not contained in the index, so for each matching index entry SQL Server has to look into the table itself and retrieve the values from the table pages.

This can take 4 to 10 times longer than a simple index scan, depending on how complex your table structure is and how many pages fit into memory.
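
One way to check this on your own data is to compare the I/O the two queries report. The sketch below assumes the t_test table and ix_test_values index from above already exist and hold a representative number of rows; SET STATISTICS IO is a standard session option that prints logical and physical reads per table to the messages output. The actual numbers, and the plan the optimizer picks, will depend on your data and statistics.

```sql
-- Report page reads for each statement in this session.
SET STATISTICS IO ON;

-- Only columns that are part of ix_test_values are requested.
SELECT  value1, value2
FROM    t_test
WHERE   value1 BETWEEN 10 AND 20;

-- All columns are requested, including aux_value, which the index does not hold.
SELECT  *
FROM    t_test
WHERE   value1 BETWEEN 10 AND 20;

SET STATISTICS IO OFF;
```

Looking at the actual execution plans for the two statements side by side is a useful complement to the read counts.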

Quassnoi
+2  A: 

I've heard using 'select *' will add to the time SQL Server takes to build the query execution plan as it has to find out which columns are present on the referenced tables.

Think about it. What does SQL Server do when you send:

SELECT ID, col1, col2, col3
  FROM table

The statement above implies it won't check and see if ID, col1, col2, etc exist in the table -- that it'll just take your word for it? Gosh no! :D

It's going to look in the system catalogs to see if those columns exist. Something which, internally, is virtually identical to saying:

SELECT * FROM table

There are plenty of reasons to not use SELECT * (they've been enumerated in this list so far) but claiming that it adds measurable, meaningful overhead to query parse time is just downright silly. So if anyone ever tries to tell you that, please correct them :)
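
For a feel of what "looking in the system catalogs" involves, the column list of a table can be read from the sys.columns catalog view. This is only an illustration of where the metadata lives, not a claim about the exact lookup the engine performs internally; the table name dbo.t_test is an assumption borrowed from another answer in this thread.

```sql
-- List the columns the catalog records for one table, in definition order.
SELECT  c.column_id, c.name
FROM    sys.columns AS c
WHERE   c.object_id = OBJECT_ID('dbo.t_test')
ORDER BY c.column_id;
```

Whether the select list is * or an explicit list, the server has to consult this kind of metadata to resolve it.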

Matt Rogish