views:

3561

answers:

8

Given the following table in SQL Server 2005:

ID   Col1   Col2   Col3
--   ----   ----   ----
1       3     34     76  
2      32    976     24
3       7    235      3
4     245      1    792

What is the best way to write the query that yields the following result (i.e. one that yields the final column - a column containing the minium values out of Col1, Col2, and Col 3 for each row)?

ID   Col1   Col2   Col3  TheMin
--   ----   ----   ----  ------
1       3     34     76       3
2      32    976     24      24
3       7    235      3       3
4     245      1    792       1

UPDATE:

For clarification (as I have said in the coments) in the real scenario the database is properly normalized. These "array" columns are not in an actual table but are in a result set that is required in a report. And the new requirement is that the report also needs this MinValue column. I can't change the underlying result set and therefore I was looking to T-SQL for a handy "get out of jail card".

I tried the CASE approach mentioned below and it works, although it is a bit cumbersome. It is also more complicated than stated in the answers because you need to cater for the fact that there are two min values in the same row.

Anyway, I thought I'd post my current solution which, given my constraints, works pretty well. It uses the UNPIVOT operator:

with cte (ID, Col1, Col2, Col3)
as
(
    select ID, Col1, Col2, Col3
    from TestTable
)
select cte.ID, Col1, Col2, Col3, TheMin from cte
join
(
    select
     ID, min(Amount) as TheMin
    from 
     cte 
     UNPIVOT (Amount for AmountCol in (Col1, Col2, Col3)) as unpvt
    group by ID
) as minValues
on cte.ID = minValues.ID

I'll say upfront that I don't expect this to offer the best performance, but given the circumstances (I can't redesign all the queries just for the new MinValue column requirement), it is a pretty elegant "get out of jail card".

+3  A: 

There are likely to be many ways to accomplish this. My suggestion is to use Case/When to do it. With 3 columns, it's not too bad.

Select Id,
       Case When Col1 < Col2 And Col1 < Col3 Then Col1
            When Col2 < Col1 And Col2 < Col3 Then Col2 
            Else Col3
            End As TheMin
From   YourTableNameHere
G Mastros
This was my initial thought. But the real query needs 5 columns, and the number of columns could grow. So the CASE approach becomes a little unwieldy. But it does work.
stucampbell
If the number of columns could grow, you're *definitely* doing it wrong - see my post (the rant on why you shouldn't have your DB schema set up this way :-).
paxdiablo
Thanks. As I mentioned in another comment. I'm not querying actual tables. The tables are normalised correctly. This query is part of a particularly complex query and is working on intermediate results from derived tables.
stucampbell
In that case, can you derive them differently so they look normalized?
Kev
I'd be very circumspect about using this on any decent-sized table. SELECTs that perform functions on every row are notoriously unscalable. Assuming your row count is small enough, this won't matter but it will cause troubles on large databases.
paxdiablo
+1  A: 

This is brute force but works

 select case when col1 <= col2 and col1 <= col3 then col1
           case when col2 <= col1 and col2 <= col3 then col2
           case when col3 <= col1 and col3 <= col2 then col3
    as 'TheMin'
           end

from Table T

... because min() works only on one column and not across columns.

Learning
+5  A: 

The best way is to not do it - I don't know why people insist on storing their data in a way that requires SQL gymnastics to extract meaningful information. The right way to do this, in my opinion, is to have the following table:

ID    Col    Val
--    ---    ---
 1      1      3
 1      2     34
 1      3     76

 2      1     32
 2      2    976
 2      3     24

 3      1      7
 3      2    235
 3      3      3

 4      1    245
 4      2      1
 4      3    792

with ID/Col as the primary key and possibly Col as a secondary key, depending on your needs. Then your query becomes a simple

select min(val) from tbl

and you can still treat the individual 'old columns' separately by using

where col = 2

in your other queries. This also allows for easy expansion should the number of 'old columns' grow.

This makes your queries so much easier. The guideline I tend to use is, if you ever have something that looks like an array in a database row, you're probably doing something wrong.

However, if you can't change the database schema, I'd suggest using insert and update triggers and add another column which these triggers set to the minimum on Col1/2/3. This will move the 'cost' of the operation away from the select to the update/insert where it belongs.

In short, since the minimum for a row only changes when one of the other columns change, that's when you should be calculating it, not every time you select (which is wasted if the data isn't changing). You would then end up with a table like:

ID   Col1   Col2   Col3   MinVal
--   ----   ----   ----   ------
 1      3     34     76        3
 2     32    976     24       24
 3      7    235      3        3
 4    245      1    792        1

Although that's also changing the schema, so why not do it properly?

Any other option that has to make decisions on select is a bad idea performance-wise, since the data only changes on insert/update - the addition of another column takes up more space in the DB and will be slightly slower for the inserts and updates but will be faster for selects - the preferred approach should depend on your priorities there but, in my experience, most tables are read far more often than they're written.

paxdiablo
Um. Thanks for the diatribe. The real database is properly normalized. This was a simple example. The actual query is complicated and the 5 columns I am interested in are intermediate results from derived tables.
stucampbell
The diatribe still stands unfortunately. Making intermediate tables of the form you suggest is every bit as problematic as making permanent tables like that. This is evidenced by the fact that you have to perform what I like to call SQL gymnastics to get the result you want.
paxdiablo
If there are genuine reasons for needing the 'array' in a single row, feel free to enlighten us, but using it to select the minimum value is not one of them.
paxdiablo
Look you don't understand the context so get off your high horse. The 5 columns in the result set are needed by the business and are the results of complex financial computations. They also need the min values per row. These computations are at a different level of abstraction from the data storage.
stucampbell
You're free to choose the answer you think suitable, that's your right. My solution can easily give you min val per row and across all rows. If people thought more about DB design, they'd have a lot less trouble. It's amateur DBAs thinking they know it all that cause that trouble. 'Nuff said.
paxdiablo
+1 for the trigger suggestion to preserve the original (if flawed) table structure.
Scott Ferguson
+1  A: 

If you're able to make a stored procedure, it could take an array of values, and you could just call that.

Kev
Oracle has a function called LEAST() that does exactly what you want.
Kev
Thanks for rubbing that in :)I can't believe that SQL Server doesn't have an equivalent!
stucampbell
I was even going to say, "Hey, my favourite pgsql doesn't have it either," but it actually does. ;) The function itself wouldn't be tough to write though.
Kev
Oh, except that T-SQL doesn't even have array support (???) Well, I guess you could have a five-parameter function and if you need more just extend it...
Kev
+1  A: 
select *,
case when column1 < columnl2 And column1 < column3 then column1
when columnl2 < column1 And columnl2 < column3 then columnl2
else column3
end As minValue
from   tbl_example
Phil Corcoran
This is a duplicate of G Mastros' answer, so if you should wonder: I guess that's where down-vote comes from.
Tomalak
+2  A: 

You could also do this with a union query. As the number of columns increase, you would need to modify the query, but at least it would be a straight forward modification.

Select T.Id, T.Col1, T.Col2, T.Col3, A.TheMin
From   YourTable T
       Inner Join (
         Select A.Id, Min(A.Col1) As TheMin
         From   (
                Select Id, Col1
                From   YourTable

                Union All

                Select Id, Col2
                From   YourTable

                Union All

                Select Id, Col3
                From   YourTable
                ) As A
         Group By A.Id
       ) As A
       On T.Id = A.Id
G Mastros
This works but performance will degrade when the row count rises.
Tomalak
Thanks. Yes this works. As Tomalak says, in my realword query this would be quite nasty for performance. But +1 for effort. :)
stucampbell
+1  A: 

If you use SQL 2005 you can do something neat like this:

;WITH res AS (
    SELECT 
        t.YourID
        , CAST(
              (
                SELECT 
                      Col1 as c01
                    , Col2 as c02
                    , Col3 as c03
                    , Col4 as c04
                    , Col5 as c05
                FROM YourTable AS cols
                WHERE YourID = t.YourID
                FOR XML AUTO, ELEMENTS
              ) AS XML
        ) AS colslist
    FROM YourTable AS t
)
SELECT YourID
,colslist.query('for $c in //cols return min(data($c/*))').value('.', 'real') AS YourMin
,colslist.query('for $c in //cols return avg(data($c/*))').value('.', 'real') AS YourAvg
,colslist.query('for $c in //cols return max(data($c/*))').value('.', 'real') AS YourMax
FROM res

This way you don't get lost in so many operators :)

However, this could be slower than the other choice.

It's your choice...

leoinfo
Holy cow . . . .
dreamlax
Well, like I said, this could be slow, but if you have too many columns (obviously, as a result of a really bad DB design!), it could worth using this (at least for AVG). You didn't give me any hint if it's a good *holy cow* or a bad one:) Maybe you should use the up/down vote to help me figure out.
leoinfo
It wasn't really a good or a bad one ;). I'm no database expert, so I was just saying "holy cow" because the question seemed like it would have a trivial answer. I guess it's a good one since you managed to provide a flexible, extensible solution to the problem!
dreamlax
+1  A: 

Both this question And this question try to answer this.

The recap is that Oracle has a built in function for this, with Sql Server you are stuck either defining a user-defined-function or using case statements.

Sam Saffron