All of us who work with relational databases have learned (or are learning) that SQL is different. Eliciting the desired results, and doing so efficiently, involves a tedious process partly characterized by learning unfamiliar paradigms, and finding out that some of our most familiar programming patterns don't work here. What are the common antipatterns you've seen (or yourself committed)?

+29  A: 

Don't have to dig deep for it: Not using prepared statements.
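
For instance, in T-SQL the parameterized form lets the server cache and reuse one plan (a minimal sketch; the table and column names are made up):

DECLARE @sql nvarchar(200)
SET @sql = N'SELECT UserId, UserName FROM Users WHERE UserName = @name'
-- the plan for @sql is compiled once and reused for any @name value
EXEC sp_executesql @sql, N'@name nvarchar(50)', @name = N'alice'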

stesch
Yup. Followed closely in the same context, in my experience, with "not trapping errors".
le dorfier
+14  A: 

Using "sp_" as the prefix of a stored procedure name: SQL Server will first search the system procedures location rather than the custom ones.

Oscar Cabrero
"usp" FTW :)
annakata
Can also be extended to using any other common prefix for all stored procedures, making it more difficult to pick through a sorted list.
le dorfier
+1 for doofledorfer's comment!! I've seen this a lot; I find it idiotic, and it does indeed make searching for a particular SP *very* difficult!!! Also extended to "vw_" for views, "tbl_" for tables and the like; how I hate them!
Joe Pineda
The prefixes can be useful if you're scripting the objects to files (eg: for source control, deployments or migration)
Rick
Why on earth would it be useful to prefix every single stored procedure with sp *or* usp? It just makes it harder to scan the list for the one you want.
Kyralessa
A: 

SELECT * FROM TABLE;

  1. Don't use *

  2. Doesn't have to be capitals!

Rich Bradshaw
I actually like capitals very much. They let you easily filter out the "SQLy" part of the statement, and focus on the important part (fields, tables, etc.)
Daniel Magliola
Agree with #1, not #2
Michael Haren
#2 is not convincingly demonstrated: SELECT COL1, COL2, COL3, SOMETHING AS "ELSE" FROM SOMEHWERE, SOMEWHERE WHERE SOMEHWERE.SOMETHING = SOMEWHERE.ANOTHERTHING is objectionable; I use capitals for keywords, and mixed case or lower case for database objects (tables, columns, etc).
Jonathan Leffler
John's example demonstrates much better what I'm talking about with 2.
Rich Bradshaw
SELECT * FROM table is only bad if the code makes assumptions about the fields (or their order) that come back.
staticsan
It's difficult to reliably predict what assumptions you or other developers will make about the fields (or their order). Enumerating columns is both safer and more importantly a better reflection of your design intent. (Unless you don't *know* what columns you use, which is worse.)
le dorfier
@staticsan: SELECT * FROM table is almost always wrong, performance-wise, since you almost never need all fields. Also, it is cleaner to state the fields you want (easier to know what you get when you look at a complex query)
Martin
If you indent properly and use a text editor with syntax coloring, I see absolutely no need to capitalize the SQL keywords. But, I suppose it could be useful if you have SQL directly in your codebase as strings.
dotjoe
+43  A: 

Here's my top 3.

Number 1. Failure to specify a field list. (Edit: to prevent confusion: this is a production code rule. It doesn't apply to one-off analysis scripts - unless I'm the author.)

SELECT *
Insert Into blah SELECT *

should be

SELECT fieldlist
Insert Into blah (fieldlist) SELECT fieldlist

Number 2. Using a cursor and while loop, when a while loop with a loop variable will do.

DECLARE @LoopVar int

SET @LoopVar = (SELECT MIN(TheKey) FROM TheTable)
WHILE @LoopVar is not null
BEGIN
  -- Do Stuff with current value of @LoopVar
  ...
  --Ok, done, now get the next value
  SET @LoopVar = (SELECT MIN(TheKey) FROM TheTable
    WHERE @LoopVar < TheKey)
END

Number 3. Date logic through string types.

--Trim the time
Convert(datetime, Convert(varchar(10), theDate, 121))

Should be

--Trim the time
DateAdd(dd, DateDiff(dd, 0, theDate), 0)
David B
hmmm, I'll give you a +1 for points 2 and 3 alone, but developers overplay rule 1. It has its place sometimes.
annakata
What is the reasoning behind #1?
jalf
When you use select *, you get whatever is in the table. Those columns may change names and order. Client code frequently relies on names and order. Every 6 months I'm asked how to preserve column order when modifying a table. If the rule was followed it wouldn't matter.
David B
I've used #2 sometimes; other times I've gone the cursor route (though then I first save the results of the query to a table var and open the cursor on that). I've always wondered if someone has done a performance test of both.
Joe Pineda
@Joe, A cursor can allow parallelism, while a "loop over keys" or a table var can't. In that scenario, the cursor wins for performance. Here's another stackoverflow article about cursors: http://stackoverflow.com/questions/172526/sql-cursorsany-use-cases-you-would-defend#173216
David B
another side effect of using "select *" approach described in point 1: http://stackoverflow.com/questions/321468/select-from-table-vs-select-cola-colb-etc-from-table-interesting-behaviour-in-s
kristof
-1 because point 2 is a crock of shit. Cursors often outperform this loop scenario. Especially cursors declared for only going forward on a recordset (as opposed to backward, arbitrarily relocatable). The worst part about number 2 is that so many developers are deluded into thinking that it is not a cursor. So they'll go with a cursor approach, then apply 2 and say it's not a cursor. Instead of re-thinking the problem in a set-based way (which is not always possible, but a LARGE portion of the time it is....).
Cervo
+20  A: 

Using meaningless table aliases:

from employee t1,
department t2,
job t3,
...

Makes reading a large SQL statement so much harder than it needs to be
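
Short aliases that track the table names read fine (a sketch; the join columns are assumed):

from employee e
join department d on d.deptno = e.deptno
join job j on j.jobno = e.jobno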

Tony Andrews
aliases? hell I've seen actual column names like that
annakata
terse aliases are OKAY. If you want a meaningful name then don't use an alias at all.
Joel Coehoorn
@Joel, you need aliases for your sub-selects.
Zoredache
He didn't say "terse," he said "meaningless." In my book there would be nothing wrong with using e, d, and j as the aliases in the example query.
Robert Rossney
Absolutely, Robert - e, d, and j would be fine with me.
Tony Andrews
I would use emp for employee, dep for department and job for job (or maybe jb) :)
Andrei Rinea
Aliasing fields/tables (to something sensible) is very useful in large and well organised data warehouses. It allows you to copy and paste a query, then just change one table name, relying on the alias for all its references.
Dems
+3  A: 

1) I don't know if it's an "official" anti-pattern, but I dislike and try to avoid string literals as magic values in a database column.

An example from MediaWiki's table 'image':

img_media_type ENUM("UNKNOWN", "BITMAP", "DRAWING", "AUDIO", "VIDEO", 
    "MULTIMEDIA", "OFFICE", "TEXT", "EXECUTABLE", "ARCHIVE") default NULL,
img_major_mime ENUM("unknown", "application", "audio", "image", "text", 
    "video", "message", "model", "multipart") NOT NULL default "unknown",

(I just noticed the different casing, another thing to avoid)

I design such cases as int lookups into tables ImageMediaType and ImageMajorMime with int primary keys.

2) date/string conversion that relies on specific NLS settings

CONVERT(NVARCHAR, GETDATE())

without format identifier
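
With an explicit style the result no longer depends on the session's language settings; 120 is SQL Server's ODBC canonical yyyy-mm-dd hh:mi:ss style:

CONVERT(NVARCHAR(19), GETDATE(), 120)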

devio
And no syntactical indentation, either. Argghh.
le dorfier
Why is this bad? Surely if you are trying to express a set of values this works just as well as a lookup table, and fits better with code that calls it. I'd rather have an enum in my app code that maps to an enum constraint in my DB than an enum in my app code that maps to specific rows of a lookup table. It just feels cleaner.
Jack Ryan
+14  A: 

Overuse of temporary tables and cursors.

Mr. Brownstone
Good evidence that "all I know is procedural languages".
le dorfier
Overuse of anything is by definition unwanted. A specific example of where using temp tables/cursors would not be needed would be helpful.
Jace Rhea
Mostly I see temp tables under-used. With SQL Server you often get performance gains by doing stuff with a bunch of temp tables instead of one monolithic query.
Cervo
+26  A: 
  • Human readable password fields, egad. Self explanatory.

  • Using LIKE against indexed columns, and I'm almost tempted to just say LIKE in general.

  • Recycling SQL-generated PK values.

  • Surprised nobody has mentioned the god-table yet. Nothing says "organic" like 100 columns of bit flags, large strings and integers.

  • Then there's the "I miss .ini files" pattern: storing CSVs, pipe-delimited strings or other data that must be parsed, in large text fields.

  • And for MS SQL server the use of cursors at all. There's a better way to do any given cursor task.

Edited because there's so many!
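
On the LIKE point: a prefix search can still seek an index, but a leading wildcard forces a scan (a sketch, assuming an index on LastName):

SELECT LastName FROM Person WHERE LastName LIKE 'Smi%'  -- sargable: index seek possible
SELECT LastName FROM Person WHERE LastName LIKE '%mith' -- leading wildcard: full scan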

annakata
I don't understand the LIKE argument. I never used LIKE because I was bored, but only because users want wildcard search. In my current job every column in search is LIKEd. I bet it will get optimized when it's too slow, but I don't get the LIKE hate.
01
wrong about cursors, i would be hesitant about saying doing any particular thing is 100% right or 100% wrong
Shawn Simon
At least in SQL Server, you can parse delimited strings faster than you can get data out of an XML column/object. So they do have their place, if you care about performance! And I can't understand your aversion to LIKE; I try hard and can't think of why to stay away from LIKE.
Joe Pineda
Sometimes you just have to use a cursor! Unless you are willing to do your processing out of the database, like say in a specific purpose, home-brewed tiny utility... Think of a way to send a mail to a group of people, for instance, without either cursors or an external app
Joe Pineda
Or generating a result set with running totals.
Robert Rossney
So far every cursor defense example I've seen is using the wrong tool for the job. But if all you know is SQL, you either use it inappropriately, or you learn to write other kinds of software.
le dorfier
I would also defend the "parseable data" inside sql (in some cases). Sometimes the "plob" (parseable large object) is a type that you don't want to deal with in sql (each has different fields, etc.)... cf: http://writeonly.wordpress.com/2008/12/05/simple-object-db-using-json-and-python-sqlite/
Gregg Lind
LIKE can use an index (at least in Oracle).
tuinstoel
@tuinstoel: How does LIKE '%blah%' get to use an index? Indexing relies on ordering and this example searches a random middle position of a string. (Indexes order by the 1st character 1st, and so looking at the middle 4 characters gives a virtually random order...)
Dems
On most database servers (at least the ones I've used), LIKE can use indexes.. as long as it's a prefix-search (LIKE 'xxx%') -- that is, as long as the wildcard characters don't come first in the search string. I think you might be talking at cross-purposes here a little.
Cowan
+5  A: 

Identical subqueries in a query.
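
On platforms with common table expressions (SQL Server 2005+, Oracle 9i+), a WITH clause states the subquery once (a sketch with made-up names):

WITH dept_totals AS (
    SELECT deptno, SUM(sal) AS total_sal
    FROM emp
    GROUP BY deptno
)
SELECT deptno, total_sal
FROM dept_totals
WHERE total_sal > (SELECT AVG(total_sal) FROM dept_totals)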

EvilTeach
Unfortunately, sometimes you just can't avoid that - in SQL 2000 there was no "WITH" keyword, and using UDFs to encapsulate common subqueries sometimes leads to performance penalties, blame MS for that...
Joe Pineda
Well, hopefully they will get around to adding it one of these days.
EvilTeach
In SQL 2000, you can use table variables.
recursive
@recursive: you can't have indexes on a table variable, which will often make it slower than a subquery. However you could use a temporary table with custom indexes.
Rick
Cool, have been working with SQL for years, and didn't even know Common Table Expressions exist (though I would have needed them). Now I do! Thanks!
sleske
+23  A: 

My bugbears are the 450 column Access tables that have been put together by the 8 year old son of the Managing Director's best friend's dog groomer, and the dodgy lookup table that only exists because somebody doesn't know how to normalise a data structure properly.

Typically, this lookup table looks like this:

ID INT,
Name NVARCHAR(132),
IntValue1 INT,
IntValue2 INT,
CharValue1 NVARCHAR(255),
CharValue2 NVARCHAR(255),
Date1 DATETIME,
Date2 DATETIME

I've lost count of the number of clients I've seen who have systems that rely on abominations like this.

Pete OHanlon
Worse yet, I read that in the newest version of Access that's actually supported automatically, which I fear will *encourage* more of this Value1, Value2, Value3... column fetishism
Joe Pineda
LOL... Beautiful.
Cj Anderson
+54  A: 

I am consistently disappointed by most programmers' tendency to mix their UI logic into the data access layer:

SELECT
    FirstName + ' ' + LastName as "Full Name",
    case UserRole
        when 2 then 'Admin'
        when 1 then 'Moderator'
        else 'User'
    end as "User's Role",
    case SignedIn
        when 0 then 'Logged in'
        else 'Logged out'
    end as "User signed in?",
    Convert(varchar(100), LastSignOn, 101) as "Last Sign On",
    DateDiff(d, LastSignOn, getDate()) as "Days since last sign on",
    AddrLine1 + ' ' + AddrLine2 + ' ' + AddrLine3 + ' ' +
        City + ', ' + State + ' ' + Zip as "Address",
    'XXX-XX-' + Substring(
        Convert(varchar(9), SSN), 6, 4) as "Social Security #"
FROM Users

Normally, programmers do this because they intend to bind their dataset directly to a grid, and it's just convenient to have SQL Server format the data server-side rather than format it on the client.

Queries like the one shown above are extremely brittle because they tightly couple the data layer to the UI layer. On top of that, this style of programming thoroughly prevents stored procedures from being reusable.

Juliet
A good poster-child pattern for maximum coupling across the largest possible number of tiers/abstraction layers.
le dorfier
It may not be good for de-coupling, though for performance reasons I've done stuff like that often; iterative changes done by SQL Server are faster than done by code in the mid-tier. I don't get your reusability point - nothing stops you from running the SP and renaming the cols if you so wish.
Joe Pineda
My favorite is when people embed HTML AND javascript, e.g. SELECT '<a href=... onclick="">' + name + '</a>'
Matt Rogish
With queries like this, you can edit the grid in a website with a simple alter statement. Or change the content of an export, or reformat a date in a report. This makes clients happy, and saves me time. So thanks, but no thanks, I'll stick with queries like this.
Andomar
+15  A: 
select some_column, ...
from some_table
group by some_column

and assuming that the result will be sorted by some_column. I've seen this a bit with Sybase where the assumption holds (for now).
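
The fix costs one line; if you need the order, say so:

select some_column, ...
from some_table
group by some_column
order by some_column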

Adrian Pronk
upvote for EVER assuming sort order, just because that was the way it showed up in the query tool that one time
Joel Coehoorn
I've even seen this reported as a bug more than once.
le dorfier
in MySQL, it is documented to sort. <http://dev.mysql.com/doc/refman/5.0/en/select.html>. So blame MySQL (again).
derobert
In Oracle, the unsorted results (almost) always matched the grouping - until version 10g. Lots of rework for the developers who used to leave out the ORDER BY!
Tony Andrews
I was even in a training class where this was stated as a fact for SQL Server. I had to protest really loudly. Just to save typing 20 characters, you rely on obscure or undocumented behavior.
erikkallen
+1  A: 

Using SQL as a glorified ISAM (Indexed Sequential Access Method) package. In particular, nesting cursors instead of combining SQL statements into a single, albeit larger, statement. This also counts as 'abuse of the optimizer' since in fact there isn't much the optimizer can do. This can be combined with non-prepared statements for maximum inefficiency:

DECLARE c1 CURSOR FOR SELECT Col1, Col2, Col3 FROM Table1

FOREACH c1 INTO a.col1, a.col2, a.col3
    DECLARE c2 CURSOR FOR
        SELECT Item1, Item2, Item3
            FROM Table2
            WHERE Table2.Item1 = a.col2
    FOREACH c2 INTO b.item1, b.item2, b.item3
        ...process data from records a and b...
    END FOREACH
END FOREACH

The correct solution (almost always) is to combine the two SELECT statements into one:

DECLARE c1 CURSOR FOR
    SELECT Col1, Col2, Col3, Item1, Item2, Item3
        FROM Table1, Table2
        WHERE Table2.Item1 = Table1.Col2
        -- ORDER BY Table1.Col1, Table2.Item1

FOREACH c1 INTO a.col1, a.col2, a.col3, b.item1, b.item2, b.item3
    ...process data from records a and b...
END FOREACH

The only advantage to the double loop version is that you can easily spot the breaks between values in Table1 because the inner loop ends. This can be a factor in control-break reports.

Also, sorting in the application is usually a no-no.

Jonathan Leffler
The style, although not this syntax, is particularly rampant in PHP in my experience.
le dorfier
The syntax is actually IBM Informix-4GL - but it is clear enough not to need much in the way of explanation (I think). And the style is rampant in a lot of SQL programs - regardless of programming language.
Jonathan Leffler
+15  A: 

The ones that I dislike the most are

  1. Using spaces when creating tables, sprocs etc. I'm fine with CamelCase or under_scores and singular or plurals and UPPERCASE or lowercase but having to refer to a table or column [with spaces], especially if [ it is oddly spaced] (yes, I've run into this) really irritates me.

  2. Denormalized data. A table doesn't have to be perfectly normalized, but when I run into a table of employees that has information about their current evaluation score or their primary anything, it tells me that I will probably need to make a separate table at some point and then try to keep them synced. I will normalize the data first and then if I see a place where denormalization helps, I'll consider it.

  3. Overuse of either views or cursors. Views have a purpose, but when each table is wrapped in a view it's too much. I've had to use cursors a few times, but generally you can use other mechanisms for this.

  4. Access. Can a program be an anti-pattern? We have SQL Server at my work, but a number of people use Access due to its availability, "ease of use" and "friendliness" to non-technical users. There is too much here to go into, but if you've been in a similar environment, you know.

Jamal Hansen
#4 - there is another thread just for Access: http://stackoverflow.com/questions/327199/what-will-we-do-after-access :)
le dorfier
Access is NOT a DBMS. It's a RAD environment, with a very simple database manager included. SQL Server, Oracle, et al. will *never* replace it, unless you add a VB-like language and a Crystal Reports like facility.
Joe Pineda
+6  A: 
  • The FROM TableA, TableB WHERE syntax for JOINs, rather than FROM TableA INNER JOIN TableB ON (the two forms are contrasted in the sketch after this list)

  • Making assumptions that a query will be returned sorted a certain way without putting an ORDER BY clause in, just because that was the way it showed up during testing in the query tool.
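
A sketch of the contrast (table and column names are made up):

-- Implicit join: the join condition hides among the filters
SELECT a.id, b.name
FROM TableA a, TableB b
WHERE a.b_id = b.id

-- Explicit join: the condition stays with the join, and a forgotten ON
-- is a syntax error instead of a silent cross join
SELECT a.id, b.name
FROM TableA a
INNER JOIN TableB b ON a.b_id = b.id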

Joel Coehoorn
My Oracle DBAs always complain that I use "ANSI joins", that is, what you present as the correct way. But I keep doing it, and I suspect that deep down they know it's better.
Steve McLeod
I suspect that Oracle wishes standard SQL would go away. :-) Also, you can't mix implicit and explicit JOINs (aka ANSI JOINs) in MySQL 5 - it doesn't work. Which is another argument for explicit JOINs.
staticsan
I would say that even A INNER JOIN B ON is an anti-pattern. I prefer A INNER JOIN B USING.
John Nilsson
Oracle supports ANSI syntax now, but they used to have this really weird syntax for outer joins in the past and there are too many people still using it.
Cervo
+24  A: 
var query = "select COUNT(*) from Users where UserName = '" + tbUser.Text + "' and Password = '" + tbPassword.Text +"'";

1) Not sanitizing user input (!!!!!)
2) Query via concatenation, aka not using parameterized queries
3) Cleartext passwords
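
A hedged sketch of a fix in T-SQL (the procedure, columns and algorithm are assumptions; SHA2_256 needs SQL Server 2012+, so use what your version offers):

CREATE PROCEDURE dbo.usp_CheckLogin
    @UserName nvarchar(50),
    @Password nvarchar(50)
AS
-- Parameterized: user input never becomes SQL text, and only a salted
-- hash is stored and compared, never the cleartext password
SELECT COUNT(*)
FROM Users
WHERE UserName = @UserName
  AND PasswordHash = HASHBYTES('SHA2_256', PasswordSalt + @Password)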

Will
All of which can usefully be dealt with by using a database abstracton layer of some (any) kind.
le dorfier
@doofledorfer: Agree, a middle tier would be definitely better in a case like this, plus providing results caching as a nice side effect.
Joe Pineda
Awesome example. If a dev groks how to replace that with a good solution, they are half-way to becoming a decent SQL dev.
Steve McLeod
+6  A: 

I need to put my own current favorite here, just to make the list complete. My favorite antipattern is not testing your queries.

This applies when:

  1. Your query involves more than one table.
  2. You think you have an optimal design for a query, but don't bother to test your assumptions.
  3. You accept the first query that works, with no clue about whether it's even close to optimized.

And any tests run against atypical or insufficient data don't count. If it's a stored procedure, put the test statement into a comment and save it, with the results. Otherwise, put it into a comment in the code with the results.
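
One way that might look in practice (the procedure name and expectations are hypothetical):

/*
Test, kept with the procedure definition:
EXEC dbo.usp_OrdersByCustomer @CustomerID = 42
-- expected: one row per order for customer 42, sorted by OrderDate
*/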

le dorfier
A very useful technique for minimal T-SQL test: In the .SQL file where you define your SP, UDF, etc., immediately after it create a block test like IF 1=2 BEGIN (sample cases for your code, with expected results as comments) END
Joe Pineda
SQL Server does parse the code within the test block, even though it's never executed. So when your object gets modified and receives more parameters, or of different type, etc. or an objects it depends on is modified you'll receive an error just by asking for an execution plan!
Joe Pineda
It's not always possible to test with real data. Often the dev/"test" server is underpowered and gets a fraction of the live server's data. Generally tests are frowned on against the live server. Some places are better and have a test or staging server with live data.
Cervo
+2  A: 

The two I find the most, and can have a significant cost in terms of performance are:

  • Using cursors instead of a set based expression. I guess this one occurs frequently when the programmer is thinking procedurally.

  • Using correlated sub-queries, when a join to a derived table can do the job.
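
A sketch of the second point (Orders is a made-up table):

-- Correlated sub-query: conceptually re-runs for every outer row
SELECT o.CustomerID, o.Amount
FROM Orders o
WHERE o.Amount = (SELECT MAX(i.Amount) FROM Orders i
                  WHERE i.CustomerID = o.CustomerID)

-- Join to a derived table: one set-based aggregation
SELECT o.CustomerID, o.Amount
FROM Orders o
INNER JOIN (SELECT CustomerID, MAX(Amount) AS MaxAmount
            FROM Orders
            GROUP BY CustomerID) m
    ON m.CustomerID = o.CustomerID
   AND o.Amount = m.MaxAmount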

Mitch Wheat
I agree if you mean what I think you mean; although a correlated sub-query is a type of derived table IIRC.
le dorfier
A derived table is a set operation, whereas a correlated subquery runs for each row in the outer query, making it less efficient (9 times out of 10)
Mitch Wheat
A couple years ago I found to my surprise that SQL S. is somehow optimized for handling correlated queries: for simple ones you get the same execution plan as with a logically equivalent query using a JOIN! Also, correlated queries that bring Oracle to its knees run only slowly on SQL S.!
Joe Pineda
That's why I always test it both ways. And I *do* usually try it both ways. In practice, for SQL Server anyway, I've usually found the correlated sq to be no slower.
le dorfier
I cheated and googled "correlated subquery derived table". That's where I discovered two sources saying a csq is one type of derived table. (It also mentioned the common misapprehension among SQL Server users. I didn't even know enough to be confused; so Mitch is up on points.)
le dorfier
But a csq is, in fact, a table that is derived as well - just repeatedly for each row. (Unless it's optimized away, of course, which does happen.)
le dorfier
I'm very guilty of the 2nd, but I'm working on that.
Joel Coehoorn
PLEASE understand that a correlated subquery and a join are IDENTICAL (in most cases). They are not even different things that are optimized to one another, but just different textual representations of the same operation.
erikkallen
Related to Mitch's answer: our app had one operation that used a CSQ in another CSQ (the dynamic nature of this part of the app demands the setup. it really does.) When one client got to a certain size dataset, the query exploded to 4.9 BILLION reads. By unrolling the inner query, we reduced the runtime of this particular bit by 99%.
DaveE
+4  A: 
  • The Altered View - A view that is altered too often, without notice or reason. The change will either be noticed at the most inappropriate time, or worse, be wrong and never noticed. Maybe your application will break because someone thought of a better name for that column. As a rule, views should extend the usefulness of base tables while maintaining a contract with consumers. Fix problems, but don't add features or, worse, change behavior; for that, create a new view. To mitigate, do not share views with other projects, and use CTEs where platforms allow. If your shop has a DBA you probably can't change views, but all your views will be outdated and/or useless in that case.

  • The !Paramed - Can a query have more than one purpose? Probably, but the next person who reads it won't know until deep meditation. Even if you don't need them right now, chances are you will, even if it's "just" to debug. Adding parameters lowers maintenance time and keeps things DRY. If you have a WHERE clause, you should have parameters.

  • The case for no CASE -

    SELECT
    CASE @problem
    WHEN 'Need to replace column A with this medium to large collection of strings hanging out in my code.'
    THEN 'Create a table for lookup and add to your from clause.'
    WHEN 'Scrubbing values in the result set based on some business rules.'
    THEN 'Fix the data in the database'
    WHEN 'Formatting dates or numbers.'
    THEN 'Apply formatting in the presentation layer.'
    WHEN 'Creating a cross tab'
    THEN 'Good, but in reporting you should probably be using cross tab, matrix or pivot templates'
    ELSE 'You probably found another case for no CASE but now I have to edit my code instead of enriching the data...' END

jms
Loved that third one. I'm already using it locally...
alphadogg
Thanks for the props. :)
jms
+4  A: 

Contrarian view: over-obsession with normalization.

Most SQL/RDBMS systems give you lots of features (transactions, replication) that are quite useful, even with unnormalized data. Disk space is cheap, and sometimes it can be simpler (easier code, faster development time) to manipulate/filter/search fetched data than to write up a fully normalized schema and deal with all the hassles therein (complex joins, nasty subselects, etc).

I have found that over-normalized systems are often a premature optimization, especially during early development stages.

(more thoughts on it... http://writeonly.wordpress.com/2008/12/05/simple-object-db-using-json-and-python-sqlite/)

Gregg Lind
I think non-normalization is often premature optimization.
tuinstoel
Sometimes it is, sometimes it isn't. Luckily, it's often easy to test, and different options work with different db needs.
Gregg Lind
+1  A: 

Maybe not an anti-pattern, but it annoys me when DBAs of certain DBs (OK, I'm talking about Oracle here) write SQL Server code using Oracle style and code conventions and complain when it runs so badly. Enough with the cursors, Oracle people! SQL is meant to be set based.

Craig
I think this is more related to your DBA than it is to Oracle. Oracle advises people to think and act set based too, instead of row by row procedural thinking with cursors.
tuinstoel
You are probably right tuinstoel. But we have numerous DBA's in my company and they all seem to love cursors.
Craig
Then they're not very good DBA's.... You don't happen to work in the same place as me do you? ;)
Andrew Rollings
+3  A: 

Putting stuff in temporary tables. People who switch from SQL Server to Oracle especially have a habit of overusing temporary tables. Just use nested select statements.

tuinstoel
+1  A: 

Joining redundant tables into a query like this, where dept contributes nothing because deptno is already available on emp:

select emp.empno, dept.deptno
from emp
join dept on dept.deptno = emp.deptno;
Tony Andrews
+2  A: 

I just put this one together, based on some of the SQL responses here on SO.

It is a serious antipattern to think that triggers are to databases as event handlers are to OOP. There's this perception that just any old logic can be put into triggers, to be fired off when a transaction (event) happens on a table.

Not true. One of the big differences is that triggers are synchronous - with a vengeance, because they are synchronous on a set operation, not on a row operation. On the OOP side, exactly the opposite - events are an efficient way to implement asynchronous transactions.

le dorfier
+6  A: 

using @@IDENTITY instead of SCOPE_IDENTITY()

Quoted from this answer:

  • @@IDENTITY returns the last identity value generated for any table in the current session, across all scopes. You need to be careful here, since it's across scopes. You could get a value from a trigger, instead of your current statement.
  • SCOPE_IDENTITY returns the last identity value generated for any table in the current session and the current scope. Generally what you want to use.
  • IDENT_CURRENT returns the last identity value generated for a specific table in any session and any scope. This lets you specify which table you want the value from, in case the two above aren't quite what you need (very rare). You could use this if you want to get the current IDENTITY value for a table that you have not inserted a record into.
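
In practice (dbo.Widgets with an IDENTITY key is a made-up example):

INSERT INTO dbo.Widgets (Name) VALUES (N'gizmo')

SELECT SCOPE_IDENTITY()  -- identity from this statement, in this scope
SELECT @@IDENTITY        -- could instead return a value generated by an audit trigger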
Brann
+5  A: 
SELECT FirstName + ' ' + LastName as "Full Name", case UserRole when 2 then "Admin" when 1 then "Moderator" else "User" end as "User's Role", case SignedIn when 0 then "Logged in" else "Logged out" end as "User signed in?", Convert(varchar(100), LastSignOn, 101) as "Last Sign On", DateDiff('d', LastSignOn, getDate()) as "Days since last sign on", AddrLine1 + ' ' + AddrLine2 + ' ' + AddrLine3 + ' ' + City + ', ' + State + ' ' + Zip as "Address", 'XXX-XX-' + Substring(Convert(varchar(9), SSN), 6, 4) as "Social Security #" FROM Users

Or, cramming everything into one line.

Jasper Bekkers
Used a previous comment's query, just because that was the first SQL-statement I had available.
Jasper Bekkers
+4  A: 

Temporary Table abuse.

Specifically this sort of thing:

SELECT personid, firstname, lastname, age
INTO #tmpPeople
FROM People
WHERE lastname like 's%'

DELETE FROM #tmpPeople
WHERE firstname = 'John'

DELETE FROM #tmpPeople
WHERE firstname = 'Jon'

DELETE FROM #tmpPeople
WHERE age > 35

UPDATE People
SET firstname = 'Fred'
WHERE personid IN (SELECT personid from #tmpPeople)

Don't build a temporary table from a query, only to delete the rows you don't need.

And yes, I have seen pages of code in this form in production DBs.
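
For contrast, the same logic as a single set-based statement (a sketch; it assumes age is never NULL):

UPDATE People
SET firstname = 'Fred'
WHERE lastname LIKE 's%'
  AND firstname NOT IN ('John', 'Jon')
  AND age <= 35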

geofftnz
+1, I agree. Although, I have found at least one or two cases where this technique has improved performance - the queries involved were complex to say the least.
ar
True - they do have a place, just not in every query :)
geofftnz
Sometimes you have to do that if the conditions are super complicated. True, it can be abused to extremes, but many times a simple delete is much simpler than the logic for getting the case right in the initial query. Also, sometimes if the clause is not sargable the initial query will slow down, but doing it on the smaller temp table is more efficient. And other times you keep adding cases that business people keep adding after the fact.
Cervo
+2  A: 

For storing time values, only UTC timezone should be used. Local time should not be used.

Frank Schwieterman
I've still not found a good simple solution for converting from UTC to local time for dates in the past, when daylight saving has to be considered, with varying change dates across years and countries, as well as all exceptions within countries. So UTC doesn't save you from conversion complexity. However, it's important to have a way to know the timezone of every stored datetime.
ckarras
A: 

Having 1 table

code_1
value_1
code_2
value_2
...
code_10
value_10

Instead of having 3 tables

code, value and code_value

You never know when you may need more than 10 code/value pairs.

And you don't waste disk space if you only need one pair.
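
A sketch of the normalized shape (the column details are assumptions):

CREATE TABLE code (
    code_id int PRIMARY KEY,
    code varchar(50) NOT NULL
)
CREATE TABLE value (
    value_id int PRIMARY KEY,
    value varchar(50) NOT NULL
)
CREATE TABLE code_value (
    code_id int NOT NULL REFERENCES code (code_id),
    value_id int NOT NULL REFERENCES value (value_id),
    PRIMARY KEY (code_id, value_id)
)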

Luc M
A: 

Learning SQL in the first six months of a career and never learning anything else for the next 10 years. In particular, not learning or effectively using windowing/analytical SQL features, such as over() and partition by.

Window functions, like aggregate functions, perform an aggregation on a defined set (a group) of rows, but rather than returning one value per group, window functions can return multiple values for each group.

See O'Reilly SQL Cookbook Appendix A for a nice overview of windowing functions.
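
A small taste (an emp table with deptno and sal is assumed):

SELECT empno,
       deptno,
       sal,
       AVG(sal) OVER (PARTITION BY deptno) AS dept_avg_sal,
       RANK() OVER (PARTITION BY deptno ORDER BY sal DESC) AS sal_rank_in_dept
FROM emp

Each row keeps its detail columns while carrying the department-level aggregate alongside, which a plain GROUP BY can't do without a self-join.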

Brian
A: 

Not using the WITH clause or a proper join, and relying on sub-queries instead.

Anti-Pattern:

select 
 ...
from data
where data.state IN (
          SELECT STATEID
            FROM STATE
           WHERE NAME IN
                    ('Published to test',
                     'Approved for public',
                     'Published to public',
                     'Archived'
                    ))

Better:
I like using the with clause to make my intent more readable.

with valid_states as (
          SELECT STATEID as state
            FROM STATE
           WHERE NAME IN
                    ('Published to test',
                     'Approved for public',
                     'Published to public',
                     'Archived'
                    )
)
select  ... from data, valid_states
where data.state = valid_states.state

Best:

select
  ...
from data join state on data.state = state.stateid
where
state.name in ('Published to test',
               'Approved for public',
               'Published to public',
               'Archived'
              )
Brian
Why? Could you please describe the difference behind the scenes?
abatishchev
+1  A: 

Re-using a 'dead' field for something it wasn't intended for (e.g. storing user data in a 'Fax' field) - very tempting as a quick fix though!

MarcP
+1  A: 

re: using @@IDENTITY instead of SCOPE_IDENTITY()

You should use neither; use the OUTPUT clause instead.

cf. https://connect.microsoft.com/SQLServer/feedback/details/328811/scope-identity-sometimes-returns-incorrect-value
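
A sketch of the OUTPUT approach (dbo.Orders is a made-up table):

DECLARE @NewIds TABLE (OrderID int)

INSERT INTO dbo.Orders (CustomerID)
OUTPUT inserted.OrderID INTO @NewIds
VALUES (42)

SELECT OrderID FROM @NewIds  -- reliable even with triggers and multi-row inserts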

Denis Valeev
A: 

Using primary keys as a surrogate for record addresses and using foreign keys as a surrogate for pointers embedded in records.

Walter Mitty