ansaurus

Question

Answer 1

+2 A:

It's because you don't always need every variable, and also to make sure that you are thinking about what you specifically need.

There's no point getting all the hashed passwords out of the database when building a list of users on your site for instance, so a select * would be unproductive.

Rich Bradshaw 2008-11-04 16:55:42

Answer 2

+20 A:

I don't think there's much in software that is "just bad", but there's plenty of stuff that is misused in bad ways :-)

The example you give is a reason why * might not give you what you expect, and I think there are others. For example, if the underlying tables change, maybe columns are added or removed, a view that uses * will continue to be valid, but might break any applications that use it. If your view had named the columns explicitly then there was more chance that someone would spot the problem when making the schema change.

On the other hand, you might actually want your view to blithely accept all changes to the underlying tables, in which case a * would be just what you want.

Martin 2008-11-04 16:56:57

Please check my other comment, on SQL Server all changes are not necessarily accepted.

2008-11-04 23:03:57

Additionally, the code using the view should not use *, so it may be quite reasonable to use * in the view, and the onus will be on the code using the view to make sure it is not using *.

Jamie Love 2008-11-06 03:52:43

Answer 3

+2 A:

Once upon a time, I created a view against a table in another database (on the same server) with

Select * From dbname..tablename

Then one day, a column was added to the targetted table. The view started returning totally incorrect results until it was redeployed.

Totally incorrect : no rows.

This was on Sql Server 2000.

I speculate that this is because of syscolumns values that the view had captured, even though I used *.

David B 2008-11-04 16:57:20

Answer 4

+7 A:

Using '*' for anything production is bad. It's great for one-off queries, but in production code you should always be as explicit as possible.

For views in particular, if the underlying tables have columns added or removed, the view will either be wrong or broken until it is recompiled.

Dave 2008-11-04 16:57:40

Answer 5

+13 A:

Another reason why "*" is risky, not only in views but in queries, is that columns can change name or change position in the underlying tables. Using a wildcard means that your view accommodates such changes easily without needing to be changed. But if your application references columns by position in the result set, or if you use a dynamic language that returns result sets keyed by column name, you could experience problems that are hard to debug.

I avoid using the wildcard at all times. That way if a column changes name, I get an error in the view or query immediately, and I know where to fix it. If a column changes position in the underlying table, specifying the order of the columns in the view or query compensates for this.

Bill Karwin 2008-11-04 17:01:30

Please read my other answer, changes in underlying tables are not always reflected, at least on SQL server.

2008-11-04 23:04:34

Answer 6

+16 A:

Although many of the comments here are very good and reference one common problem of using wildcards in queries, such as causing errors or different results if the underlying tables change, another issue that hasn't been covered is optimization. A query that pulls every column of a table tends to not be quite as efficient as a query that pulls only those columns you actually need. Granted, there are those times when you need every column and it's a major PIA having to reference them all, especially in a large table, but if you only need a subset, why bog down your query with more columns than you need.

Anne Porosoff 2008-11-04 17:05:51

Excellent points!

Bill Karwin 2008-11-04 17:22:21

This is my main reason to not use SELECT *. The other reasons are mainly relevant if you do other things which are dangerous, such as SELECT * with a JOIN or referencing columns by index rather than by name. Duh, OF COURSE those are dangerous combinations. But by itself select * is not evil.

Mr. Shiny and New 2008-12-01 16:03:04

I agree it's not generally dangerous. I frequently use SELECT * when doing data checks on a db. However, I tend to avoid using SELECT * when writing production code so as to avoid encountering the problems illustrated above as well as to continue to enforce good programming habits.

Anne Porosoff 2008-12-01 22:03:06

Answer 7

+2 A:

It's generally a bad idea to use *. Some code certification engines mark this as a warning and advise you to explicitly refer only the necessary columns. The use of * can lead to performance louses as you might only need some columns and not all. But, on the other hand, there are some cases where the use of * is ideal. Imagine that, no matter what, using the example you provided, for this view (aview) you would always need all the columns in these tables. In the future, when a column is added, you wouldn't need to alter the view. This can be good or bad depending the case you are dealing with.

bruno conde 2008-11-04 17:19:04

Answer 8

+3 A:

Using SELECT * within the view does not incur much of a performance overhead if columns aren't used outside the view - the optimizer will optimize them out; SELECT * FROM TheView can perhaps waste bandwidth, just like any time you pull more columns across a network connection.

In fact, I have found that views which link almost all the columns from a number of huge tables in my datawarehouse have not introduced any performance issues at all, even through relatively few of those columns are requested from outside the view. The optimizer handles that well and is able to push the external filter criteria down into the view very well.

However, for all the reasons given above, I very rarely use SELECT *.

I have some business processes where a number of CTEs are built on top of each other, effectively building derived columns from derived columns from derived columns (which will hopefully one day being refactored as the business rationalizes and simplifies these calculations), and in that case, I need all the columns to drop through each time, and I use SELECT * - but SELECT * is not used at the base layer, only in between the first CTE and the last.

Cade Roux 2008-11-04 17:42:23

Answer 9

+1 A:

I think it depends on the language you are using. I prefer to use select * when the language or DB driver returns a dict(Python, Perl, etc.) or associative array(PHP) of the results. It makes your code alot easier to understand if you are referring to the columns by name instead of as an index in an array.

Brian C. Lane 2008-11-04 18:32:12

If the order of field is modified, associative array continue to work.

Luc M 2008-11-04 18:35:00

Answer 10

+5 A:

These other answers all have good points, but on SQL server at least they also have some wrong points. Try this:

create table temp (i int, j int)
go
create view vtemp as select * from temp
go
insert temp select 1, 1
go
alter table temp add k int
go
insert temp select 1, 1, 1
go
select * from vtemp

SQL Server doesn't learn about the "new" column when it is added. Depending on what you want this could be a good thing or a bad thing, but either way it's probably not good to depend on it. So avoiding it just seems like a good idea.

To me this weird behavior is the most compelling reason to avoid select * in views.

The comments have taught me that MySQL has similar behavior and Oracle does not (it will learn about changes to the table). This inconsistency to me is all the more reason not to use select * in views.

2008-11-04 20:30:16

Same behavior in MySQL - the new column does not become part of the view. Apparently the wildcard is converted to a list of columns at the time the view is created. Also renaming a column in the base table makes the view unusable.

Bill Karwin 2008-11-04 23:49:37

Oracle does not work like this. The view would be invalidated by the table change and recompiled on next access with the new column.

WW 2008-11-05 23:47:16

Answer 11

+2 A:

No one else seems to have mentioned it, but within SQL Server you can also set up your view with the schemabinding attribute.

This prevents modifications to any of the base tables (including dropping them) that would affect the view definition.

This may be useful to you for some situations. I realise that I haven't exactly answered your question, but thought I would highlight it nonetheless.

Russ Cam 2008-11-05 23:44:03

Answer 12

+1 A:

A SQL query is basically a functional unit designed by a programmer for use in some context. For long-term stability and supportability (possibly by someone other than you) everything in a functional unit should be there for a purpose, and it should be reasonably evident (or documented) why it's there - especially every element of data.

If I were to come along two years from now with the need or desire to alter your query, I would expect to grok it pretty thoroughly before I would be confident that I could mess with it. Which means I would need to understand why all the columns are called out. (This is even more obviously true if you are trying to reuse the query in more than one context. Which is problematic in general, for similar reasons.) If I were to see columns in the output that I couldn't relate to some purpose, I'd be pretty sure that I didn't understand what it did, and why, and what the consequences would be of changing it.

le dorfier 2008-11-06 03:41:32

ansaurus

tags:

views:

answers:

Why is using '*' to build a view bad ?

related questions