I have a table with a lot of rows. It is indexed. One of the operations that I routinely perform is selecting a random record from this table. To do this, I use the following SQL statement:

SELECT TOP 1 * 
FROM 
 ( SELECT TOP (@RecNo) * FROM Table ORDER BY Date ASC ) AS subquery1 
ORDER BY 
 Date DESC ;

where @RecNo is the random number. The query takes an annoyingly long time to run. Any ideas on what could be optimized here?

+1  A: 

Try ordering it by the clustered primary key. Or include the clustered primary key in your ORDER BY clause.

Kirtan
Date actually is the clustered primary key.
galets
A: 

What are you really trying to accomplish? I suppose you just want to get a single random row. The easiest way to do this would be

 SELECT TOP 1 * FROM Table ORDER BY newid()
splattne
I had posted this answer before, but it's not working :)
Kirtan
That's because before it can return the first row, it must first generate a GUID for each row and find the lowest valued GUID.
Brannon
There's got to be a way to do this; this thing drives me crazy!
galets
+1  A: 

Since you are on SQL 2005, try using the ROW_NUMBER() function:

http://msdn.microsoft.com/en-us/library/ms186734.aspx

Basically something like:

SELECT * FROM Table WHERE (ROW_NUMBER() OVER (ORDER BY Date ASC)) = @RecNo

You might need to use a sub-query or CTE to use the ROW_NUMBER() value in a predicate.
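As a rough sketch of the CTE form mentioned above (not a tested or tuned solution), the row number can be computed inside a common table expression and filtered in the outer query. The names Numbered and RowNum and the value 42 are made up for illustration; [Table] and [Date] are bracketed because both are reserved words in T-SQL.

```sql
-- Assumption: @RecNo already holds a value between 1 and the table's row count.
DECLARE @RecNo int;
SET @RecNo = 42;  -- hypothetical value, for illustration only

-- The statement before WITH must end with a semicolon.
WITH Numbered AS (
    SELECT *,
           ROW_NUMBER() OVER (ORDER BY [Date] ASC) AS RowNum  -- RowNum is an illustrative alias
    FROM [Table]
)
SELECT *
FROM Numbered
WHERE RowNum = @RecNo;
```

Whether this is any faster than the nested TOP query depends on the plan the optimizer chooses, so it is worth comparing the actual execution plans of the two.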

I don't know if this will end up being faster than the NEWID() approach. It depends on whether or not SQL Server will short-circuit the ROW_NUMBER() operation when it finds the value it's looking for. In the worst case it would produce a ROW_NUMBER() for every row; in the best case it would stop as soon as it found the row (which could be the first row).

It's also possible that producing the ROW_NUMBER() for each row is significantly faster than generating a GUID, or otherwise sorting the entire table.

Brannon
This is an excellent idea. I will try it later tonight and see if it works. Even if it doesn't, great thinking, thanks!
galets
Nope... Doesn't work. First, ROW_NUMBER() is not allowed in a WHERE clause, so I need to make a subquery, like `SELECT * FROM (SELECT *, ROW_NUMBER()... AS dd) as xx WHERE xx.dd = @RandID`. The speed increase is non-existent :(
galets
A: 

How much control do you have over the table?

Could you introduce a new "row_id" column? Then index on the new column, and simply do:

SELECT * FROM Table WHERE row_id = @RecNo
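A minimal sketch of what this could look like, assuming an IDENTITY column is acceptable for row_id. The index name IX_Table_row_id and the value 42 are hypothetical, and the GO separators assume a client such as SSMS or sqlcmd, since the new column cannot be referenced in the same batch that adds it.

```sql
-- Add an auto-numbered column; existing rows are numbered in no guaranteed order.
ALTER TABLE [Table] ADD row_id int IDENTITY(1,1) NOT NULL;
GO

-- Index the new column so the lookup below becomes a seek rather than a scan.
-- IX_Table_row_id is an illustrative name.
CREATE UNIQUE NONCLUSTERED INDEX IX_Table_row_id ON [Table] (row_id);
GO

DECLARE @RecNo int;
SET @RecNo = 42;  -- hypothetical value, for illustration only

SELECT * FROM [Table] WHERE row_id = @RecNo;
```

Note that IDENTITY values develop gaps once rows are deleted, so a randomly chosen @RecNo may match no row; the caller would then need to retry with another value or pick the nearest existing row_id.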
Nicolai
Yes, I have control, but adding a column is not enough; there must also be an index on this column, which brings all the crap that an extra index on a huge table involves.
galets