All of us who work with relational databases have learned (or are learning) that SQL is different. Getting the desired results, and getting them efficiently, involves a tedious process of learning unfamiliar paradigms and discovering that some of our most familiar programming patterns don't work here. What are the common antipatterns you've seen (or committed yourself)?
Using sp_ as the prefix of a stored procedure name: SQL Server searches the system procedures in the master database first, so every call to an sp_-prefixed procedure pays for that extra lookup before your own procedure is found.
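For example (a sketch; the procedure and table names are invented):
-- Bad: the sp_ prefix sends SQL Server to master's system procedures first
CREATE PROCEDURE dbo.sp_GetCustomers
AS
    SELECT CustomerID, Name FROM dbo.Customers;
GO
-- Better: any other prefix (or none at all) avoids the extra lookup
CREATE PROCEDURE dbo.usp_GetCustomers
AS
    SELECT CustomerID, Name FROM dbo.Customers;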
Here's my top 3.
Number 1. Failure to specify a field list. (Edit: to prevent confusion: this is a production code rule. It doesn't apply to one-off analysis scripts - unless I'm the author.)
SELECT *
Insert Into blah SELECT *
should be
SELECT fieldlist
Insert Into blah (fieldlist) SELECT fieldlist
Number 2. Using a cursor and while loop, when a while loop with a loop variable will do.
DECLARE @LoopVar int
SET @LoopVar = (SELECT MIN(TheKey) FROM TheTable)
WHILE @LoopVar IS NOT NULL
BEGIN
    -- Do stuff with the current value of @LoopVar
    ...
    -- Ok, done; now get the next value
    SET @LoopVar = (SELECT MIN(TheKey) FROM TheTable
                    WHERE @LoopVar < TheKey)
END
Number 3. DateLogic through string types.
--Trim the time
Convert(datetime, Convert(varchar(10), theDate, 121))
Should be
--Trim the time
DateAdd(dd, DateDiff(dd, 0, theDate), 0)
(DateDiff(dd, 0, theDate) counts whole days from day zero, 1900-01-01; adding that many days back onto day zero lands on midnight of the same date, with no string round-trip.)
Using meaningless table aliases:
from employee t1,
department t2,
job t3,
...
Makes reading a large SQL statement so much harder than it needs to be.
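Even single letters drawn from the table names fix it:
from employee e,
department d,
job j,
...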
1) I don't know whether it's an "official" anti-pattern, but I dislike and try to avoid string literals as magic values in a database column.
An example from MediaWiki's table 'image':
img_media_type ENUM("UNKNOWN", "BITMAP", "DRAWING", "AUDIO", "VIDEO",
"MULTIMEDIA", "OFFICE", "TEXT", "EXECUTABLE", "ARCHIVE") default NULL,
img_major_mime ENUM("unknown", "application", "audio", "image", "text",
"video", "message", "model", "multipart") NOT NULL default "unknown",
(I just noticed the different casing, another thing to avoid.)
I design such cases as int lookups into tables ImageMediaType and ImageMajorMime with int primary keys.
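A sketch of that design (names are mine, not MediaWiki's):
CREATE TABLE ImageMediaType (
    ImageMediaTypeID int NOT NULL PRIMARY KEY,
    Name varchar(20) NOT NULL UNIQUE   -- 'BITMAP', 'DRAWING', ...
);
-- image then stores img_media_type int NULL REFERENCES ImageMediaType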
2) date/string conversion that relies on specific NLS settings
CONVERT(NVARCHAR, GETDATE())
without a format identifier, so the result depends on the session's language and date settings.
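Pinning the style makes it deterministic; style 126, for instance, always yields ISO 8601:
CONVERT(NVARCHAR(30), GETDATE(), 126)   -- e.g. 2009-01-15T13:45:30.123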
Human-readable password fields, egad. Self-explanatory.
Using LIKE against indexed columns (a leading wildcard such as '%foo' makes the index useless), and I'm almost tempted to just say LIKE in general.
Recycling SQL-generated PK values.
Surprised nobody has mentioned the god-table yet. Nothing says "organic" like 100 columns of bit flags, large strings and integers.
Then there's the "I miss .ini files" pattern: storing CSVs, pipe-delimited strings or other data that needs parsing in large text fields.
And for MS SQL server the use of cursors at all. There's a better way to do any given cursor task.
Edited because there's so many!
My bugbears are the 450-column Access tables that have been put together by the 8-year-old son of the Managing Director's best friend's dog groomer, and the dodgy lookup table that only exists because somebody doesn't know how to normalise a data structure properly.
Typically, this lookup table looks like this:
ID INT,
Name NVARCHAR(132),
IntValue1 INT,
IntValue2 INT,
CharValue1 NVARCHAR(255),
CharValue2 NVARCHAR(255),
Date1 DATETIME,
Date2 DATETIME
I've lost count of the number of clients I've seen who have systems that rely on abominations like this.
I am consistently disappointed by most programmers' tendency to mix their UI logic into the data access layer:
SELECT
    FirstName + ' ' + LastName as "Full Name",
    case UserRole
        when 2 then 'Admin'
        when 1 then 'Moderator'
        else 'User'
    end as "User's Role",
    case SignedIn
        when 0 then 'Logged out'
        else 'Logged in'
    end as "User signed in?",
    Convert(varchar(100), LastSignOn, 101) as "Last Sign On",
    DateDiff(d, LastSignOn, getDate()) as "Days since last sign on",
    AddrLine1 + ' ' + AddrLine2 + ' ' + AddrLine3 + ' ' +
        City + ', ' + State + ' ' + Zip as "Address",
    'XXX-XX-' + Substring(Convert(varchar(9), SSN), 6, 4) as "Social Security #"
FROM Users
Normally, programmers do this because they intend to bind the dataset directly to a grid, and it's just convenient to have SQL Server format server-side rather than format on the client.
Queries like the one shown above are extremely brittle because they tightly couple the data layer to the UI layer. On top of that, this style of programming thoroughly prevents stored procedures from being reusable.
select some_column, ...
from some_table
group by some_column
and assuming that the result will be sorted by some_column. I've seen this a bit with Sybase, where the assumption holds (for now).
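If the ordering matters, say so explicitly; it costs nothing when the rows already come back in that order:
select some_column, ...
from some_table
group by some_column
order by some_column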
Using SQL as a glorified ISAM (Indexed Sequential Access Method) package. In particular, nesting cursors instead of combining SQL statements into a single, albeit larger, statement. This also counts as 'abuse of the optimizer' since in fact there isn't much the optimizer can do. This can be combined with non-prepared statements for maximum inefficiency:
DECLARE c1 CURSOR FOR SELECT Col1, Col2, Col3 FROM Table1
FOREACH c1 INTO a.col1, a.col2, a.col3
DECLARE c2 CURSOR FOR
SELECT Item1, Item2, Item3
FROM Table2
WHERE Table2.Item1 = a.col2
FOREACH c2 INTO b.item1, b.item2, b.item3
...process data from records a and b...
END FOREACH
END FOREACH
The correct solution (almost always) is to combine the two SELECT statements into one:
DECLARE c1 CURSOR FOR
SELECT Col1, Col2, Col3, Item1, Item2, Item3
FROM Table1, Table2
WHERE Table2.Item1 = Table1.Col2
-- ORDER BY Table1.Col1, Table2.Item1
FOREACH c1 INTO a.col1, a.col2, a.col3, b.item1, b.item2, b.item3
...process data from records a and b...
END FOREACH
The only advantage to the double loop version is that you can easily spot the breaks between values in Table1 because the inner loop ends. This can be a factor in control-break reports.
Also, sorting in the application is usually a no-no.
The ones that I dislike the most are
Using spaces when creating tables, sprocs etc. I'm fine with CamelCase or under_scores and singular or plurals and UPPERCASE or lowercase but having to refer to a table or column [with spaces], especially if [ it is oddly spaced] (yes, I've run into this) really irritates me.
Denormalized data. A table doesn't have to be perfectly normalized, but when I run into a table of employees that has information about their current evaluation score or their primary anything, it tells me that I will probably need to make a separate table at some point and then try to keep them synced. I will normalize the data first and then if I see a place where denormalization helps, I'll consider it.
Overuse of either views or cursors. Views have a purpose, but when each table is wrapped in a view it's too much. I've had to use cursors a few times, but generally you can use other mechanisms for this.
Access. Can a program be an anti-pattern? We have SQL Server at my work, but a number of people use Access due to its availability, "ease of use" and "friendliness" to non-technical users. There is too much here to go into, but if you've been in a similar environment, you know.
The
FROM TableA, TableB WHERE
syntax for joins, rather than
FROM TableA INNER JOIN TableB ON
Making assumptions that a query will be returned sorted a certain way without putting an ORDER BY clause in, just because that was the way it showed up during testing in the query tool.
var query = "select COUNT(*) from Users where UserName = '" + tbUser.Text + "' and Password = '" + tbPassword.Text +"'";
1) Not sanitizing user input (!!!!!)
2) Query via concatenation, aka not using parameterized queries
3) Cleartext passwords
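For contrast, a minimal parameterized sketch using sp_executesql (the variable names are stand-ins for values passed in properly from the application, and the password is assumed to be stored as a hash):
DECLARE @UserNameInput nvarchar(128) = N'alice';   -- would come from the UI
DECLARE @PasswordHashInput varbinary(64) = 0x00;   -- hash of the entered password
EXEC sp_executesql
    N'SELECT COUNT(*) FROM Users WHERE UserName = @u AND PasswordHash = @p',
    N'@u nvarchar(128), @p varbinary(64)',
    @u = @UserNameInput,       -- bound, never concatenated
    @p = @PasswordHashInput;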
I need to put my own current favorite here, just to make the list complete. My favorite antipattern is not testing your queries.
This applies when:
- Your query involves more than one table.
- You think you have an optimal design for a query, but don't bother to test your assumptions.
- You accept the first query that works, with no clue about whether it's even close to optimized.
And any tests run against atypical or insufficient data don't count. If it's a stored procedure, put the test statement into a comment and save it, with the results. Otherwise, put it into a comment in the code with the results.
The two I find most often, and which can have a significant performance cost, are:
Using cursors instead of a set-based expression. I guess this one occurs frequently when the programmer is thinking procedurally.
Using correlated sub-queries, when a join to a derived table can do the job.
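A sketch of that rewrite (the table and column names are illustrative):
-- Correlated sub-query: re-evaluated for every outer row
SELECT e.empno, e.salary
FROM emp e
WHERE e.salary > (SELECT AVG(x.salary) FROM emp x WHERE x.deptno = e.deptno);
-- Join to a derived table: the aggregate is computed once per department
SELECT e.empno, e.salary
FROM emp e
JOIN (SELECT deptno, AVG(salary) AS avg_sal
      FROM emp GROUP BY deptno) d ON d.deptno = e.deptno
WHERE e.salary > d.avg_sal;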
The Altered View - a view that is altered too often, without notice or reason. The change will either be noticed at the most inappropriate time or, worse, be wrong and never noticed. Maybe your application will break because someone thought of a better name for that column. As a rule, views should extend the usefulness of base tables while maintaining a contract with consumers. Fix problems, but don't add features or, worse, change behavior; for that, create a new view. To mitigate, do not share views with other projects, and use CTEs where the platform allows. If your shop has a DBA you probably can't change views, but in that case all your views will be outdated and/or useless.
The !Paramed - can a query have more than one purpose? Probably, but the next person who reads it won't know without deep meditation. Even if you don't need parameters right now, chances are you will, even if it's "just" to debug. Adding parameters lowers maintenance time and keeps things DRY. If you have a where clause, you should have parameters.
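A minimal sketch of the idea (procedure and table names invented):
CREATE PROCEDURE dbo.GetOrdersByCustomer
    @CustomerID int
AS
    SELECT OrderID, OrderDate, Total
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;   -- the filter is visible, reusable and debuggable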
The case for no CASE -
SELECT
CASE @problem
WHEN 'Need to replace column A with this medium to large collection of strings hanging out in my code.'
THEN 'Create a table for lookup and add to your from clause.'
WHEN 'Scrubbing values in the result set based on some business rules.'
THEN 'Fix the data in the database'
WHEN 'Formatting dates or numbers.'
THEN 'Apply formatting in the presentation layer.'
WHEN 'Creating a cross tab'
THEN 'Good, but in reporting you should probably be using cross tab, matrix or pivot templates'
ELSE 'You probably found another case for no CASE but now I have to edit my code instead of enriching the data...' END
Contrarian view: over-obsession with normalization.
Most SQL/RDBMS systems give you lots of features (transactions, replication) that are quite useful, even with unnormalized data. Disk space is cheap, and sometimes it can be simpler (easier code, faster development time) to manipulate/filter/search fetched data than it is to write up a 1NF schema and deal with all the hassles therein (complex joins, nasty subselects, etc.).
I have found that over-normalized systems are often a premature optimization, especially during early development stages.
(more thoughts on it... http://writeonly.wordpress.com/2008/12/05/simple-object-db-using-json-and-python-sqlite/)
Maybe not an anti-pattern, but it annoys me when DBAs of certain DBs (OK, I'm talking about Oracle here) write SQL Server code using Oracle style and code conventions and complain when it runs so badly. Enough with the cursors, Oracle people! SQL is meant to be set-based.
Putting stuff in temporary tables: people who switch from SQL Server to Oracle especially have a habit of overusing temporary tables. Just use nested select statements.
Just to support Point 1 of David's answer, check out this example of a situation where using select * may produce some unexpected results.
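One classic failure mode (a sketch of my own, not the linked example; source is an assumed table):
CREATE TABLE blah (id INT, name NVARCHAR(50));
INSERT INTO blah SELECT * FROM source;   -- works today
-- Later someone adds or reorders a column in source: the INSERT now fails
-- with a column-count error, or silently loads values into the wrong columns.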
Joining redundant tables into a query like this:
select emp.empno, dept.deptno
from emp
join dept on dept.deptno = emp.deptno;
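Since deptno is already a column of emp, the join adds nothing:
select emp.empno, emp.deptno
from emp;
-- (note the original inner join would also silently drop employees with a NULL
-- or unmatched deptno, so the two differ in that edge case)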
I just put this one together, based on some of the SQL responses here on SO.
It is a serious antipattern to think that triggers are to databases as event handlers are to OOP. There's this perception that just any old logic can be put into triggers, to be fired off when a transaction (event) happens on a table.
Not true. One of the big differences is that triggers are synchronous - with a vengeance, because they are synchronous on a set operation, not on a row operation. On the OOP side it's exactly the opposite - events are an efficient way to implement asynchronous transactions.
using @@IDENTITY instead of SCOPE_IDENTITY()
Quoted from this answer:
- @@IDENTITY returns the last identity value generated for any table in the current session, across all scopes. You need to be careful here, since it's across scopes. You could get a value from a trigger, instead of your current statement.
- SCOPE_IDENTITY returns the last identity value generated for any table in the current session and the current scope. Generally what you want to use.
- IDENT_CURRENT returns the last identity value generated for a specific table in any session and any scope. This lets you specify which table you want the value from, in case the two above aren't quite what you need (very rare). You could use this if you want to get the current IDENTITY value for a table that you have not inserted a record into.
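A quick illustration (Orders and its identity column are assumed names):
INSERT INTO Orders (CustomerID) VALUES (42);
SELECT SCOPE_IDENTITY();   -- the identity value your INSERT generated
SELECT @@IDENTITY;         -- may instead return a value generated by a trigger on Orders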
SELECT FirstName + ' ' + LastName as "Full Name", case UserRole when 2 then 'Admin' when 1 then 'Moderator' else 'User' end as "User's Role", case SignedIn when 0 then 'Logged out' else 'Logged in' end as "User signed in?", Convert(varchar(100), LastSignOn, 101) as "Last Sign On", DateDiff(d, LastSignOn, getDate()) as "Days since last sign on", AddrLine1 + ' ' + AddrLine2 + ' ' + AddrLine3 + ' ' + City + ', ' + State + ' ' + Zip as "Address", 'XXX-XX-' + Substring(Convert(varchar(9), SSN), 6, 4) as "Social Security #" FROM Users
Or, cramming everything into one line.
Temporary Table abuse.
Specifically this sort of thing:
SELECT personid, firstname, lastname, age
INTO #tmpPeople
FROM People
WHERE lastname like 's%'
DELETE FROM #tmpPeople
WHERE firstname = 'John'
DELETE FROM #tmpPeople
WHERE firstname = 'Jon'
DELETE FROM #tmpPeople
WHERE age > 35
UPDATE People
SET firstname = 'Fred'
WHERE personid IN (SELECT personid from #tmpPeople)
Don't build a temporary table from a query, only to delete the rows you don't need.
And yes, I have seen pages of code in this form in production DBs.
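The whole thing collapses to one statement (a sketch; the NULL guards are needed to match the temp-table version exactly):
UPDATE People
SET firstname = 'Fred'
WHERE lastname LIKE 's%'
  AND (firstname IS NULL OR firstname NOT IN ('John', 'Jon'))
  AND (age IS NULL OR age <= 35)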
For storing time values, only the UTC timezone should be used; local time should not be stored. In T-SQL that means reaching for GETUTCDATE() or SYSUTCDATETIME() rather than GETDATE().
Having 1 table
code_1
value_1
code_2
value_2
...
code_10
value_10
Instead of having 3 tables
code, value and code_value
You never know when you may need more than 10 code/value pairs.
And you don't waste disk space if you only need one pair.
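One normalized shape is a single child table (a sketch with invented names; the three-table split above just factors code and value out further):
CREATE TABLE code_value (
    parent_id INT NOT NULL,   -- FK back to the original table
    code INT NOT NULL,
    value NVARCHAR(255) NOT NULL,
    PRIMARY KEY (parent_id, code)
);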
Learning SQL in the first six months of their career and never learning anything else for the next 10 years. In particular, never learning or effectively using windowing/analytical SQL features such as OVER() and PARTITION BY.
Window functions, like aggregate functions, perform an aggregation on a defined set (a group) of rows, but rather than returning one value per group, window functions can return multiple values for each group.
See O'Reilly SQL Cookbook Appendix A for a nice overview of windowing functions.
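A small example of OVER() with PARTITION BY (emp and its columns are illustrative):
SELECT empno,
       deptno,
       salary,
       AVG(salary) OVER (PARTITION BY deptno) AS dept_avg_salary   -- one value per row, not per group
FROM emp;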
Not using the WITH clause or a proper join, and relying on sub-queries instead.
Anti-Pattern:
select
...
from data
where data.state IN (
SELECT STATEID
FROM STATE
WHERE NAME IN
('Published to test',
'Approved for public',
'Published to public',
'Archived'
))
Better:
I like using the with clause to make my intent more readable.
with valid_states as (
    SELECT STATEID as state
    FROM STATE
    WHERE NAME IN
        ('Published to test',
         'Approved for public',
         'Published to public',
         'Archived')
)
select ... from data, valid_states
where data.state = valid_states.state
Best:
select
...
from data join state on state.stateid = data.state
where
state.name in ('Published to test',
               'Approved for public',
               'Published to public',
               'Archived')
Re-using a 'dead' field for something it wasn't intended for (e.g. storing user data in a 'Fax' field) - very tempting as a quick fix though!
re: using @@IDENTITY instead of SCOPE_IDENTITY()
you should use neither; use the OUTPUT clause instead
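A sketch of the OUTPUT approach (the Orders table is assumed, as above):
DECLARE @NewIds TABLE (OrderID int);
INSERT INTO Orders (CustomerID)
OUTPUT inserted.OrderID INTO @NewIds
VALUES (42);
-- @NewIds now holds every generated key, and it works for multi-row inserts too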
Using primary keys as a surrogate for record addresses and using foreign keys as a surrogate for pointers embedded in records.