I was playing with the Stack Exchange Data Explorer and ran this query:
http://odata.stackexchange.com/stackoverflow/q/2828/rising-stars-top-50-users-ordered-on-rep-per-day

Notice down in the results, rows 11 and 12 have the same value and so are mis-numbered, even though the row_number() function takes the same order by parameter as the query.

I know the correct fix here is to specify an additional tie-breaker column in the order by clauses, but I'm more curious as to why/how the row_number() function returned different results on the same data?

If it makes a difference anywhere, this runs on Azure.

A: 

Is the row number just the number of the row that the data happens to be residing on in some temp table that holds the result of the query? If so, the results are arbitrary, and are usually the same only because of how the db engine processes the query and how the data exists in the source tables.

Khorkrak
+2  A: 

They aren't misnumbered - your ORDER BY is for a different column. Though they evaluate the same value ultimately, the ORDER BY in the ROW_NUMBER is not to be considered in sync with the ORDER BY for the query.

OMG Ponies
True. If you want to order by the order in row_number(), project row_number first in a subquery (or a CTE) and then order by the projected row_number.
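
A minimal sketch of that suggestion, not the original Data Explorer query: the table name UserStats and the columns DisplayName, Reputation and Days are hypothetical stand-ins. The row number is projected once inside a CTE, and the outer query sorts by that projected value, so the numbering and the output order cannot drift apart.

```sql
-- Hypothetical table and column names, for illustration only.
WITH Ranked AS (
    SELECT
        DisplayName,
        Reputation / Days AS RepPerDays,
        -- Number the rows once, inside the CTE.
        ROW_NUMBER() OVER (ORDER BY Reputation / Days DESC) AS RowNum
    FROM UserStats
)
SELECT TOP 50 RowNum, DisplayName, RepPerDays
FROM Ranked
-- Sort by the projected number, not by a second expression.
ORDER BY RowNum;
```

Note that this only keeps the output in row-number order; ties in the ordering expression are still numbered arbitrarily unless a tie-breaker column is added.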
Remus Rusanu
A: 

How do DENSE_RANK, RANK and ROW_NUMBER compare in that query - still inconsistent behavior?

ROW_NUMBER() is evidently assigned first, but the query's ORDER BY doesn't specify that the output be sorted by ROW_NUMBER, so the rows need not come back in row-number order.

Do this:

ORDER BY
RepPerDays DESC, ROW_NUMBER() OVER (ORDER BY Reputation/Days DESC)

And it's ordered to match.

Cade Roux
hmm, it does match but it's still out of order. I have to cast as a float to get it right. And at this point it's simpler to just use the original "Reputation/Days" expression and cast on one operand of that.
Joel Coehoorn
+2  A: 

The problem seems to be with significant digits. E.g., polygenelubricants gained 22281 reputation in 101 days, and KennyTM gained 39257 reputation in 178 days. The integer part of RepPerDays is 220 for both, but the floating-point value of Reputation/Days is 220.603#### for polygenelubricants and 220.544#### for KennyTM.

You should try to order by Reputation / Days both times.
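
A small sketch of the effect, using only the two users' figures quoted above fed in through an inline VALUES list (the column aliases are made up for illustration). With integer operands, both quotients truncate to 220 and tie; casting one operand to float keeps the fractional part, so the two rows sort unambiguously.

```sql
SELECT
    Name,
    Reputation / Days AS IntRepPerDay,                   -- integer division: 220 for both rows
    CAST(Reputation AS float) / Days AS FloatRepPerDay,  -- keeps the fractional part
    ROW_NUMBER() OVER (ORDER BY CAST(Reputation AS float) / Days DESC) AS RowNum
FROM (VALUES
    ('polygenelubricants', 22281, 101),
    ('KennyTM',            39257, 178)
) AS u (Name, Reputation, Days)
-- Same float expression as in ROW_NUMBER(), so numbering and output order agree.
ORDER BY CAST(Reputation AS float) / Days DESC;
```

Using the identical float expression in both ORDER BY clauses is what makes the numbering and the output order agree here.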

Fede
That works, but only if also cast as float in both places.
Joel Coehoorn