views:

51

answers:

4

There is a table of currencies rates in MS SQL Server 2005:

ID | CURR | RATE | DATE
1   | USD   | 30      | 01.10.2010
3   | GBP   | 45      | 07.10.2010
5   | USD   | 31      | 08.10.2010
7   | GBP   | 46      | 09.10.2010
9   | USD   | 32      | 12.10.2010
11 | GBP   | 48      | 03.10.2010

Rate are updated in real time and there are more than 1 billion rows in the table.

It needs to write a SQL-query, wich will provide latest rates per each currency.
My decision is:

SELECT c.[id],c.[curr],c.[rate],c.[date]    
FROM [curr_rate] c, (SELECT curr, MAX(date) AS rate_date FROM [curr_rate] 
      GROUP BY curr) t
WHERE c.date = t.rate_date AND c.curr = t.curr
ORDER BY c.[curr] ASC

Is it possible to write a query without sub-queries and join's with derived tables?

+1  A: 

No, I think not. Do you have an index on CURR, and date?

ck
To choose a PRIMARY INDEX for the table - it's a sub task for this issue.I think PRIMARY INDEX should contains from CURR and DATE fileds.
Sirius Lampochkin
A: 

I don't know that you can avoid the sub-select, but you can avoid the join and even the group by:

SELECT id, curr, rate, date
FROM curr_rate r
WHERE date = (
       SELECT MAX(date)
         FROM curr_rate
        WHERE curr = r.curr
      )
ORDER BY curr ASC

I have no idea how this will perform.

Marcelo Cantos
This query was my first decision, but I guess that my query would be better in perfomance (more faster).Who knows, which one is more perfomance (better)?
Sirius Lampochkin
It is extremely difficult to tell, @Sirius. You should always test queries for performance.
Marcelo Cantos
+1  A: 

Sorting out your indexes is probably more critical than SQL syntax in tuning this query.

It might be worth comparing the subquery approach against a CTE:

;WITH currCTE
AS
(
        SELECT id
               ,curr
               ,rate
               ,date
               ,ROW_NUMBER() OVER (PARTITION BY curr
                                   ORDER BY date desc
                                  ) AS rn
        FROM [curr_rate]
)
SELECT id
       ,curr
       ,rate
       ,date
FROM currCTE
WHERE rn = 1

If the table PK is on id, and rows are always added to the table in date order, you may get better performance by using id in the ORDER BY clause of the ranking function rather than date.

Ed Harper
it was just a question/task - pre-interview for the job.Now, I think there is no a decision withot sub-query and join's for the task.
Sirius Lampochkin
+1  A: 

Your query does not use a subquery, so no need to change that. A subquery is a SELECT query that returns a single value and is nested inside a SELECT, INSERT, UPDATE, or DELETE statement, or inside another subquery. A subquery can be used anywhere an expression is allowed. See Subquery Fundamentals

Your query is using a derived table, also called an inline view, which you have named "t".

I would start by getting rid of the ancient join syntax:

SELECT 
    c.[id],c.[curr],c.[rate],c.[date]    
    FROM [curr_rate] c
        INNER JOIN (SELECT
                        curr, MAX(date) AS rate_date
                        FROM [curr_rate] 
                        GROUP BY curr
                   ) t ON c.curr = t.curr AND c.date = t.rate_date 
    ORDER BY c.[curr] ASC

but it will have the same execution plan. You could just move the derived table into a CTE, but that is just about the same as a derived table.

The query would probably run faster if you created an indexed view on:

SELECT
    curr, MAX(date) AS rate_date
    FROM [curr_rate] 
    GROUP BY curr

with an index on curr+MAX(date). If you would then have an index on [curr_rate].curr+date, your query would have better performance and be:

SELECT 
    c.[id],c.[curr],c.[rate],c.[date]    
    FROM [curr_rate] c
        INNER JOIN [curr_rate_max_view] t ON c.curr = t.curr AND c.date = t.rate_date 
    ORDER BY c.[curr] ASC
KM