ansaurus

Question

SQL hidden techniques?

Answer 1

+6 A:

SQL GROUP BY - CUBE, ROLLUP clauses

Analytic (AKA ranking, AKA windowing) functions, e.g.:

ROW_NUMBER
RANK
DENSE_RANK
NTILE
OVER

Views: Normal and Materialized

It's difficult to say much without referencing vendor specific syntax
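As a rough sketch of what ROLLUP adds to GROUP BY: the query below should return one row per (region, product), one subtotal row per region (product is NULL), and one grand-total row (both NULL). The sales data and column names are made up for illustration, and the inline VALUES table assumes SQL Server 2008+ or a PostgreSQL version with ROLLUP support.

```sql
-- Hypothetical sales data, inlined so the query is self-contained.
SELECT region, product, SUM(amount) AS total
FROM (VALUES
        ('East', 'Widget', 10),
        ('East', 'Gadget', 20),
        ('West', 'Widget', 30)
     ) AS sales (region, product, amount)
-- ROLLUP adds subtotal rows per region and a grand-total row.
GROUP BY ROLLUP (region, product)
ORDER BY region, product;
```

Swapping ROLLUP for CUBE would also add per-product subtotals across all regions.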

OMG Ponies 2010-05-28 16:10:35

Ranking functions are very useful.

Philip Kelley 2010-05-28 16:14:36

Answer 2

+3 A:

People don't use built-in functions enough and like to reinvent the wheel. See "Ten SQL Server Functions That You Have Ignored Until Now".

Using NEWSEQUENTIALID() instead of NEWID() on a clustered uniqueidentifier column will usually perform much better, since new rows land at the end of the index instead of at random positions, which avoids most page splits and the resulting fragmentation.

Using an auxiliary table of numbers so that you can quickly do some set-based logic

for example

select DATEADD(m,number,'20010101')
from master..spt_values
where type = 'P'
order by 1

ANY, ALL and SOME
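A minimal sketch of ALL versus ANY (SOME is a synonym for ANY). The product names, prices and thresholds are invented, and the inline VALUES tables assume SQL Server 2008+ or PostgreSQL.

```sql
-- > ALL: price must exceed every threshold (10 and 15), so only 'C' qualifies.
SELECT p.item, p.price
FROM (VALUES ('A', 5), ('B', 12), ('C', 20)) AS p (item, price)
WHERE p.price > ALL (SELECT t.limit_price
                     FROM (VALUES (10), (15)) AS t (limit_price));

-- > ANY: price must exceed at least one threshold, so 'B' and 'C' qualify.
SELECT p.item, p.price
FROM (VALUES ('A', 5), ('B', 12), ('C', 20)) AS p (item, price)
WHERE p.price > ANY (SELECT t.limit_price
                     FROM (VALUES (10), (15)) AS t (limit_price));
```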

SQLMenace 2010-05-28 16:10:50

I haven't used ANY/ALL/SOME in *ages*; I use JOINs instead.

OMG Ponies 2010-05-28 16:17:32

The third item is often referred to as using a "tally" table, and can be very useful for finding gaps in sequences.
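For instance, a gap search could look roughly like this in T-SQL (SQL Server 2008+). The @Used table and its values are hypothetical; the numbers come from master..spt_values as in the answer, but any tally table would do.

```sql
-- Hypothetical set of used IDs; 3 and 6 are missing.
DECLARE @Used TABLE (Id int PRIMARY KEY);
INSERT INTO @Used (Id) VALUES (1), (2), (4), (5), (7);

-- Walk the numbers 1..MAX(Id) and keep those with no matching row.
SELECT n.number AS MissingId
FROM master..spt_values AS n
WHERE n.type = 'P'
  AND n.number BETWEEN 1 AND (SELECT MAX(Id) FROM @Used)
  AND NOT EXISTS (SELECT 1 FROM @Used AS u WHERE u.Id = n.number)
ORDER BY n.number;
```

With the sample data this should list 3 and 6.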

Philip Kelley 2010-05-28 16:19:48

Do people really not know about GETUTCDATE()?

Joe Philllips 2010-05-28 20:17:08

Answer 3

+6 A:

OVER Clause (SQL Server) a.k.a. Window functions (PostgreSQL) or analytic functions (Oracle)

This has been very nice to know for me. You can do all sorts of handy things like counting, partitioning, ranking, etc.
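A small sketch of counting, partitioning and ranking in one pass. The employee data is made up, and the inline VALUES table assumes SQL Server 2008+ or PostgreSQL 8.4+.

```sql
SELECT dept, emp_name, salary,
       -- Unique sequence within each department, highest salary first.
       ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS row_num,
       -- Ties share a rank; RANK leaves a gap after ties, DENSE_RANK does not.
       RANK()       OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk,
       DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS dense_rnk,
       -- Per-department head count repeated on every row, no GROUP BY needed.
       COUNT(*)     OVER (PARTITION BY dept)                      AS dept_size
FROM (VALUES
        ('Sales', 'Ann', 500),
        ('Sales', 'Bob', 500),
        ('Sales', 'Cy',  300),
        ('IT',    'Dee', 400)
     ) AS emp (dept, emp_name, salary)
ORDER BY dept, row_num;
```

Ann and Bob tie on salary, so both get rank 1; Cy then gets RANK 3 but DENSE_RANK 2.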

Joe Philllips 2010-05-28 16:13:16

Same goes for PostgreSQL window functions - http://www.postgresql.org/docs/current/static/tutorial-window.html - with much the same syntax. Seems to be the answer to half of the sql questions I've answered recently...

araqnid 2010-05-28 16:26:43

Answer 4

+2 A:

Lately I have been using CROSS APPLY a lot.
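One common use, sketched here in T-SQL (SQL Server 2008+ for the multi-row INSERTs), is "top N per group". The table variables and their contents are hypothetical.

```sql
DECLARE @Customers TABLE (CustomerId int, Name varchar(20));
DECLARE @Orders    TABLE (OrderId int, CustomerId int, Amount int);

INSERT INTO @Customers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO @Orders    VALUES (10, 1, 50), (11, 1, 70), (12, 1, 20), (13, 2, 40);

-- For each customer, run the correlated subquery and keep its two largest orders.
SELECT c.Name, o.OrderId, o.Amount
FROM @Customers AS c
CROSS APPLY (
    SELECT TOP (2) ord.OrderId, ord.Amount
    FROM @Orders AS ord
    WHERE ord.CustomerId = c.CustomerId
    ORDER BY ord.Amount DESC
) AS o
ORDER BY c.Name, o.Amount DESC;
```

Using OUTER APPLY instead would keep customers that have no orders at all.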

THEn 2010-05-28 16:16:25

Answer 5

+5 A:

SELECT... EXCEPT SELECT...

and

SELECT... INTERSECT SELECT...

can be useful (and disturbingly efficient) at picking out differing or common rows between sets, comparing all columns in the row. This is extremely useful when you have lots of columns.
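A minimal sketch with two made-up row sets (inline VALUES tables, so SQL Server 2008+ or PostgreSQL is assumed):

```sql
-- Rows in the first set with no identical row in the second: (1, 'a') and (3, 'c').
SELECT id, val FROM (VALUES (1, 'a'), (2, 'b'), (3, 'c')) AS a (id, val)
EXCEPT
SELECT id, val FROM (VALUES (2, 'b'), (3, 'x')) AS b (id, val);

-- Rows that appear in both sets: only (2, 'b').
SELECT id, val FROM (VALUES (1, 'a'), (2, 'b'), (3, 'c')) AS a (id, val)
INTERSECT
SELECT id, val FROM (VALUES (2, 'b'), (3, 'x')) AS b (id, val);
```

Row (3, 'c') shows up in the EXCEPT result because the comparison covers every column, not just the key.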

Philip Kelley 2010-05-28 16:17:23

Answer 6

+2 A:

Pivot

It's new in SQL Server 2005 (which I know was a long time ago, but there are loads of people still using 2000). Saves doing a bunch of "case when name = 'tim' then value else 0 end" to build your aggregates this weekend.
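To make the comparison concrete, here is a rough T-SQL sketch (SQL Server 2008+ for the multi-row INSERT) of the same aggregate written both ways. The @Scores table and its rows are hypothetical.

```sql
DECLARE @Scores TABLE (Name varchar(10), Value int);
INSERT INTO @Scores VALUES ('tim', 5), ('tim', 7), ('bob', 3);

-- The hand-written CASE version: one SUM(CASE ...) per output column.
SELECT SUM(CASE WHEN Name = 'tim' THEN Value ELSE 0 END) AS tim,
       SUM(CASE WHEN Name = 'bob' THEN Value ELSE 0 END) AS bob
FROM @Scores;

-- The PIVOT version: the IN list names the values that become columns.
SELECT [tim], [bob]
FROM (SELECT Name, Value FROM @Scores) AS src
PIVOT (SUM(Value) FOR Name IN ([tim], [bob])) AS p;
```

Both should return tim = 12 and bob = 3. One difference to watch: for a name with no rows, PIVOT yields NULL while the CASE version with ELSE 0 yields 0.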

Bob 2010-05-28 16:27:09

PIVOT/UNPIVOT is supported by SQL Server 2005+ and Oracle 11g+. As far as I know it is vendor syntax rather than ANSI SQL. CASE is more widely supported, making it my preference over PIVOT for the sake of portability.

OMG Ponies 2010-05-28 17:03:08

Answer 7

+6 A:

EXISTS. I'm amazed how many people still use COUNT(*) when checking existence or IN (SELECT...) clauses when EXISTS can do the job much quicker.

Most frequently you might see:

SELECT @MyVar = Count(*) FROM Table1 WHERE....
If @MyVar <> 0
BEGIN
   --do something
END

when

IF EXISTS(SELECT 1 FROM Table1 WHERE...)
BEGIN
    --do something
END

is almost always better, since EXISTS can stop at the first matching row instead of counting them all.

CodeByMoonlight 2010-05-28 16:30:14

Isn't it better to join and aggregate than use subqueries? :D

AlexRednic 2010-05-28 16:32:48

Outer joins are typically faster (at least were in SQL Server 2000) than doing NOT EXISTS unless the set not fulfilling the existential predicate was much smaller than the table. However, overall performance will largely be dependent on whether the query optimiser gets the query right - which it will probably do in most cases. With this being the case EXISTS is probably better if it makes the query more legible.

ConcernedOfTunbridgeWells 2010-05-28 16:40:40

Answer 8

+1 A:

Under MySQL, using the keyword "STRAIGHT_JOIN". Sometimes the optimizer picks one of the smaller lookup tables as the basis of the join and works from its "less record" count toward your "bigger" table, which can take significantly more time. If you know your data and the relationships of the lookup tables you are joining to, put your primary table first in the "from" with its "criteria" up front. The straight join then reads the tables in the order listed, so it hits the primary table first, joins to the rest of the tables and is done in no time.

I've had to do this dealing with gov't data of 10+ million records joined to about 15+ lookup tables. Without straight-join, the system choked after 20+ hours. Adding Straight-join, it was done in about 2 hrs.
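The shape of such a query, as a rough MySQL sketch. The two temporary tables, their columns and the date filter are hypothetical stand-ins for the primary table and one lookup table.

```sql
-- Hypothetical primary table and lookup table.
CREATE TEMPORARY TABLE facts  (id INT, code INT, created DATE);
CREATE TEMPORARY TABLE lookup (code INT, label VARCHAR(20));

INSERT INTO facts  VALUES (1, 10, '2010-01-15'), (2, 20, '2009-12-31');
INSERT INTO lookup VALUES (10, 'ten'), (20, 'twenty');

-- STRAIGHT_JOIN makes MySQL join the tables in the order they are listed:
-- facts is read first (with its filter), then lookup is probed per row.
SELECT STRAIGHT_JOIN f.id, l.label
FROM facts AS f
JOIN lookup AS l ON l.code = f.code
WHERE f.created >= '2010-01-01';
```

Comparing EXPLAIN output with and without the keyword is a reasonable way to check whether the forced order actually helps on your data.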

DRapp 2010-05-28 17:24:20

Answer 9

+3 A:

Two from PostgreSQL: DISTINCT ON (see example) and the new WITH.
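A small sketch that uses both together (PostgreSQL 8.4+). The readings data and column names are made up; the query keeps the latest reading per city.

```sql
-- WITH names an inline row set so the main query stays short.
WITH readings (city, reading_time, temp_c) AS (
    VALUES ('Oslo', 1, 10),
           ('Oslo', 2, 12),
           ('Rome', 1, 20)
)
-- DISTINCT ON (city) keeps the first row per city in ORDER BY order,
-- which here is the one with the highest reading_time.
SELECT DISTINCT ON (city) city, reading_time, temp_c
FROM readings
ORDER BY city, reading_time DESC;
```

This should return ('Oslo', 2, 12) and ('Rome', 1, 20).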

leonbloy 2010-05-28 17:31:00

dude these are kick@$$ :D +1 clearly

AlexRednic 2010-05-28 19:52:28

Answer 10

A:

In SQL Server, the HAVING clause. Particularly, HAVING COUNT(DISTINCT Foo) > @SomeNumber to quickly find groups with more than a given number of distinct values for a given grouping.

From MSDN:

USE AdventureWorks2008R2 ;
GO
SELECT SalesOrderID, SUM(LineTotal) AS SubTotal
FROM Sales.SalesOrderDetail
GROUP BY SalesOrderID
HAVING SUM(LineTotal) > 100000.00
ORDER BY SalesOrderID ;
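The MSDN sample filters on a SUM; a sketch of the COUNT(DISTINCT ...) form mentioned above could look like this in T-SQL (SQL Server 2008+). The @Addresses table and its rows are hypothetical.

```sql
DECLARE @Addresses TABLE (CustomerId int, City varchar(20));
INSERT INTO @Addresses VALUES (1, 'Oslo'), (1, 'Oslo'), (2, 'Rome'), (2, 'Paris');

-- Keep only customers that appear with more than one distinct city.
SELECT CustomerId, COUNT(DISTINCT City) AS DistinctCities
FROM @Addresses
GROUP BY CustomerId
HAVING COUNT(DISTINCT City) > 1;
```

Customer 1 has two rows but only one distinct city, so only customer 2 should come back.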

wraith 2010-05-28 19:07:50

Answer 11

A:

From PostgreSQL Docs:

Table partitioning

Partitioning refers to splitting what is logically one large table into smaller physical pieces. Partitioning can provide several benefits:

Query performance can be improved dramatically for certain kinds of queries.
Update performance can be improved too, since each piece of the table has indexes smaller than an index on the entire data set would be. When an index no longer fits easily in memory, both read and write operations on the index take progressively more disk accesses.
Bulk deletes may be accomplished by simply removing one of the partitions, if that requirement is planned into the partitioning design. DROP TABLE is far faster than a bulk DELETE, to say nothing of the ensuing VACUUM overhead.
Seldom-used data can be migrated to cheaper and slower storage media.
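As a rough illustration of the bulk-delete point, here is a sketch using the declarative partitioning syntax of PostgreSQL 10 and later, which is an assumption on my part: the documentation of the time described inheritance-based partitioning instead. Table and column names are made up.

```sql
-- Parent table, partitioned by date range.
CREATE TABLE measurements (
    logdate date NOT NULL,
    reading int
) PARTITION BY RANGE (logdate);

-- One physical partition per month (upper bound is exclusive).
CREATE TABLE measurements_2010_04 PARTITION OF measurements
    FOR VALUES FROM ('2010-04-01') TO ('2010-05-01');
CREATE TABLE measurements_2010_05 PARTITION OF measurements
    FOR VALUES FROM ('2010-05-01') TO ('2010-06-01');

-- Rows are routed to the matching partition automatically.
INSERT INTO measurements VALUES ('2010-04-15', 1), ('2010-05-28', 2);

-- Removing a whole month is a DROP TABLE rather than a bulk DELETE.
DROP TABLE measurements_2010_04;
```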

pcent 2010-05-28 19:51:30

Answer 12

A:

Derived tables to create "variables" and reduce repeated code.

Something like this but can be expanded upon. Obviously "Average Value" can be a much more complex calculation, and if you have several it helps clean up code.

Select *, case when AverageValue > 50 then 'Pass' Else 'Fail' end
From
(
 Select ColA, ColB, AverageValue = (ColA+ColB)/2
 From InnerMostTable
) AverageValues
Order By AverageValue Desc

Jeremy 2010-05-28 20:13:38

See leonbloy's answer. Much more elegant.

AlexRednic 2010-05-28 21:24:18

Answer 13

+2 A:

Common Table Expressions (SQL Server 2005+)

WITH x AS (
    SELECT 1 as A, 2 as B, 3 as C
),
y AS (
    SELECT 4 as A, 5 as B, 6 as C
    UNION
    SELECT 7 as A, 8 as B, 9 as C
)
SELECT A, B, C FROM x
UNION
SELECT A, B, C FROM y

They are really nice for breaking your queries into steps.

Joe Philllips 2010-05-28 20:21:05

Answer 14

A:

In SQL Server, using the Convert() function with a style code instead of the Cast() function to handle dates in the format mm/dd/yyyy (style 101):

SELECT convert(datetime,  '1/1/2010', 101)

I use this all the time.

GeoSQL 2010-05-30 00:39:43
