views:

196

answers:

5

I've Googled this question and can't seem to find a consistent opinion, or many opinions that are based on solid data. I simply would like to know if using the wildcard in a SQL SELECT statement incurs additional overhead than calling each item out individually. I have compared the execution plans of both in several different test queries, and it seems that the estimates always read the same. Is it possible that some overhead is incurred elsewhere, or are they truly handled identically?

What I am referring to specifically:

SELECT *

vs.

SELECT item1, item2, etc.
+1  A: 

EDIT: If you are talking about the asterisk wild card as in Select * From ... then see other responses...

If you are talking about wildcards in predicate clauses, or other query expressions using Like operator, (_ , % ) as described below, then:

This has to do with whether using the Wildcard affects whether the SQL is "SARG-ABLE" or not. SARGABLE, (Search-ARGument-able)means whether or not the query's search or sort arguments can be used as entry parameters to an existing index. If you prepend the wild card to the beginning of an argument

 Where Name Like '%ing'

Then there is no way to traverse an index on the name field to find the nodes that end in 'ing'.

If otoh you append the wildcard to the end,

Where Name like 'Donald%'

then the optimizer can still use an index on the name column, and the query is still SARG-able

Charles Bretana
I should have been more specific. I edited to do so. I was referring to SELECT * vs. a SELECT list. However, this answer gives me some insight into an issue I hadn't considered. Thanks!
Geo Ego
+1  A: 

Do you mean "select * from ..." instead of "select col1, col2, col3 from .."?

I think it's always better to name the column and retrieve the minimal amount of information, because

  • your code will work independently of the physical order of the columns in the db. The column order should not impact your application, but it will be the case if you use *. It can be dangerous in case of db migration, etc.
  • if you name the columns, the dbms can optimize further the execution. For instance, if there is an index that contains all the data your are interested in, the table will not be accessed at all.

If you mean something else with "wildcard", just ignore my answer...

ewernli
That is the case I was referring to, although I was looking to see where the performance may be incurred other than in the execution plan. I agree with your reasons for not using it, though.
Geo Ego
+6  A: 
SELECT * FROM...

and

SELECT every, column, list, ... FROM...

will perform the same because both are an unoptimised scan

The difference is:

  • the extra lookup in sys.columns to resolve *
  • the contract/signature change when the table schema changes
  • inability to create a covering index. In fact, no tuning options at all, really
  • have to refresh views needed if non schemabound
  • can not index or schemabind a view using *
  • ...and other stuff

Other SO questions on the same subject...

gbn
To add, I would assume that the "overhead" in this wildcard resolution is negligible when compared to the actual work needed to fetch the data off disk, etc.
Mark Canlas
True, but folk often mention it. of course, for 300 columns and 10 rows.. ;-)
gbn
Thanks for the information!
Geo Ego
A: 

If that you call SQL wild car is *. It does not imply performance overhead by it self. However, if the table is extended you could find yourself retrieving fields you doesn't search. In general not being specific in the fields you search or insert is a bad habit. Consider

insert into mytable values(1,2)

What happen if the table is extended to three fields?

Xavier Combelle
A: 

It may not be more work from an execution plan standpoint. But if you're fetching columns you don't actually need, that's additional network bandwidth being used between the database and your application. Also if you're using a high-level client API that performs some work on the returned data (for example, Perl's selectall_hashref) then those extra columns will impose performance cost on the client side. How much? Depends.

Dan