Selecting data in clustered index order without ORDER BY

+2 Q:

I know there is no guarantee without an ORDER BY clause, but are there any techniques to tune SQL Server tables so they're more likely to return rows in clustered index order, without having to specify ORDER BY every single time I want to run a super-quick ad hoc query? For example, would rebuilding my clustered index or updating statistics help?

Out of all the many tables we use, I've only noticed two that ever give me results in an unexpected order. This is really just an annoyance, but it would be nice to be able to minimize it. In case it is relevant because of page boundary issues or something like that, I should mention that one of the tables with inconsistent ordering is the longest table we have that has a clustered index on an identity column.

Edit: deleted the specific example, because I think the example made this read more like a "help me with this query" question, when I just wanted to use the example to illustrate the abstract question.

+3 A:

If you want the result set to be ordered in a specific way, use ORDER BY.
If you want the table to be accessed through a specific index, you can use an INDEX table hint, but hints are strongly discouraged.
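
As a minimal sketch of these two options, assume a hypothetical table dbo.AuditLog with an identity column AuditLogId and a non-clustered index named IX_AuditLog_UserId. None of these names come from the question; they are placeholders for illustration only.

```sql
-- Hypothetical names: dbo.AuditLog, AuditLogId, UserId, IX_AuditLog_UserId.

-- 1) Guaranteed order: state it explicitly.
SELECT *
FROM dbo.AuditLog
WHERE UserId = 42
ORDER BY AuditLogId;

-- 2) Forcing an access path with a table hint (discouraged).
--    The hint only controls which index is read. It does not
--    guarantee the order of the result, so ORDER BY is still required.
SELECT *
FROM dbo.AuditLog WITH (INDEX(IX_AuditLog_UserId))
WHERE UserId = 42
ORDER BY AuditLogId;
```

Note that even the hinted query keeps its ORDER BY: the hint addresses the access plan, not the result order.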

In your particular example, accessing a very large accesslog table that is clustered by an IDENTITY to query all rows for a particular UserId is the last thing you want. You want to have an index by UserId and you want the query to use that index. As a side note, access log tables clustered by an IDENTITY are usually not a good choice, because the typical access is by time range (accesses between t1 and t2) or by other attribute (by userId, like in your example, or by machineId, or similar). None of these is ever satisfied by a clustered index on an IDENTITY column.
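
To make the design point concrete, here is one possible layout for a log table that is read mostly by time range or by user. This is only a sketch: the table, column, and index names are assumptions, and the right choice depends on the actual workload.

```sql
-- Hypothetical access log table; all names are illustrative.
CREATE TABLE dbo.AccessLog (
    AccessLogId INT IDENTITY(1,1) NOT NULL,
    AccessTime  DATETIME          NOT NULL,
    UserId      INT               NOT NULL,
    MachineId   INT               NOT NULL,
    ActionText  NVARCHAR(200)     NOT NULL,
    -- Keep the identity as the primary key, but do not cluster on it.
    CONSTRAINT PK_AccessLog PRIMARY KEY NONCLUSTERED (AccessLogId)
);

-- Cluster by time so that "accesses between t1 and t2" is a range scan.
CREATE CLUSTERED INDEX CIX_AccessLog_AccessTime
    ON dbo.AccessLog (AccessTime, AccessLogId);

-- Support lookups by user, ordered by time within each user.
CREATE NONCLUSTERED INDEX IX_AccessLog_UserId
    ON dbo.AccessLog (UserId, AccessTime);

-- A typical time-range query against this layout.
SELECT AccessTime, UserId, MachineId, ActionText
FROM dbo.AccessLog
WHERE AccessTime >= '2010-06-01' AND AccessTime < '2010-06-08'
ORDER BY AccessTime;
```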

Remus Rusanu 2010-06-07 21:39:07

+1: An audit table is a poor example, because you aren't going to backdate a log entry.

OMG Ponies 2010-06-07 21:43:35

I do have a non-clustered index on UserId. Our most common use case for querying the AuditLog is simply selecting the last 50 actions that were audited. A query by UserId is less common, and only ad hoc. As I said, this question is simply about an annoyance -- about saving me 20 keystrokes, three times a week.
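
For illustration, the "last 50 actions" case might look like the following, assuming the identity column is called AuditLogId (a hypothetical name). With the clustered index on that column, the ORDER BY is typically satisfied by reading the end of the clustered index rather than by a separate sort, so spelling it out should cost keystrokes but little else.

```sql
-- Hypothetical names: dbo.AuditLog, AuditLogId.
-- The ORDER BY is what guarantees "last 50"; without it,
-- TOP may return any 50 rows.
SELECT TOP (50) *
FROM dbo.AuditLog
ORDER BY AuditLogId DESC;
```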

kcrumley 2010-06-07 21:50:25

The biggest issue with a non-clustered index and a `SELECT *` is that it can reach the tipping point where a table scan will be preferred. On something like UserId I can easily see this happening. See http://www.sqlskills.com/BLOGS/KIMBERLY/post/The-Tipping-Point-Query-Answers.aspx.

Remus Rusanu 2010-06-07 21:54:14

Thanks for the link; very interesting. The execution plan shows the non-clustered index still being used in that example query, though. I'm pretty sure none of our users has enough records to push them over the tipping point. Should I take your answer as meaning "no", then: there isn't a way to tune the table or encourage results to come out in clustered index order without an ORDER BY (or a technique that requires even more keystrokes)?

kcrumley 2010-06-07 22:18:32

I think the gist of my response is that there are two different issues: 1) the order of the results (which requires ORDER BY, period), and 2) the query access plan (use this index vs. that index). The latter is a vast topic and goes well beyond a simple answer like mine. Ultimately, SQL is declarative: describe what you want, and let the engine figure out how to get it. The reason I went into detail about the structure of the accesslog table is that good table design goes a long way toward obtaining fast results, usually much further than anything you can do by decorating the SQL text of the query.

Remus Rusanu 2010-06-07 22:23:47
