ansaurus

Question

Is there a better way to sort this query?

Answer 1

+1 A:

Use:

SELECT x.*
  FROM (SELECT po.id, 
               po.status,
               ROW_NUMBER() OVER( ORDER BY po.id) AS rno__row__index
          FROM [PurchaseOrders] po
          JOIN [POLineItems] li ON li.id = po.id
         WHERE li.pat_id = ?) x
 WHERE x.rno__row__index BETWEEN 1 AND 25
ORDER BY po.id ASC

Unless you've omitted details in order to simplify the example, there's no need for all your subqueries in what you provided.

OMG Ponies 2009-12-16 17:07:07

The OP does say "we *must* use the subqueries" although it's not clear why.

LukeH 2009-12-16 17:10:40

In order to reference ROW_NUMBER() output, that's correct - which is preserved in the answer.

OMG Ponies 2009-12-16 17:11:57

There's also no reason I see to have the TOP clause, just move the ORDER BY to the outer query. I think you're actually messing up the query optimizer by adding it to the inner query - there's a reason they don't let you specify ORDER in subqueries by default.

Chris Haas 2009-12-16 17:22:46

@OMG Ponies: If that's the case then your query does the job, although I'd drop the `TOP`/`ORDER BY` from the subquery and just add an `ORDER BY x.rno__row__index` to the outermost query. But as I said, the OP isn't clear on *why* the subqueries are required; maybe because they're autogenerated in some crazy way.

LukeH 2009-12-16 17:25:59

@Luke: You *will* have a single subquery at minimum in order to use pagination with the row_number function call.

OMG Ponies 2009-12-16 18:13:09

Re: TOP use - on reflection, I agree - it's unnecessary.

OMG Ponies 2009-12-16 18:13:46

@OMG Ponies: I know that a subquery is required in order to use `ROW_NUMBER()` for pagination, but maybe there are also *other* reasons why the OP requires them (difficult to see what those reasons might be though).

LukeH 2009-12-16 18:23:36

See clarification in post.

Frew 2009-12-16 18:28:13

Answer 2

+3 A:

If you want top rows by me.id, just ask for that in the ROW_NUMBER's ORDER BY. Don't chase your tail around subqueries and TOP.
If you have a WHERE clause on a joined table field, you can have an outer JOIN. All the outer fields will be NULL and filtered out by the WHERE, so is effectively an inner join.

.

WITH cteRowNumbered AS (
  SELECT [me].id, [me].status 
  ROW_NUMBER() OVER (ORDER BY me.id ASC) AS rno__row__index 
  FROM [PurchaseOrders] [me]           
  JOIN [POLineItems] [line_items] ON [line_items].[id] = [me].[id]          
  WHERE [line_items].[part_id] = ?)
SELECT me.id, me.status 
FROM cteRowNumbered
WHERE rno__row__index BETWEEN 1 and 25

I use CTEs instead of subqueries just because I find them more readable.

Remus Rusanu 2009-12-16 18:22:10

LEFT JOIN vs. inner join

Remus Rusanu 2009-12-16 18:55:27

@Remus: Thanks - in my haste, I didn't correct the join. Still, given the where clause null line_item rows would be filtered out.

OMG Ponies 2009-12-16 19:12:44

@Ponies: yes, that was exactly my point. I wanted to call out that a WHERE clause and a LEFT JOIN just don't mix, they cancel each other out to make an ordinary inner JOIN.

Remus Rusanu 2009-12-16 19:15:13

As for why I posted a query almost identical to yours, w/o reading yours first and perhaps just adding a comment... well step 1 is admition: "Hello, my name is Remus and I'm a SO rep addict"

Remus Rusanu 2009-12-16 19:16:57

Note that although this returns the first 25 rows (according to `PurchaseOrders.id`), it doesn't guarantee the order in which those rows will appear in the resultset. This is "stronger" than the code given in the question, which doesn't even guarantee which 25 rows will be returned. (The only guarantee in the OP's code is that the 25 rows will be selected from somewhere in the top 4294967296.)

LukeH 2009-12-17 01:02:26

Answer 3

A:

Your update makes things much clearer. I think that the approach which you're using is seriously flawed. While it's nice to be able to have encapsulated, reusable code in your applications, front-end applications are a much different animal than a database. They typically deal with small structures and small, discrete process that run against those structures. Databases on the other hand often deal with tables that are measured in the millions of rows and sometimes more than that. Using the same methodologies will often result in code that simply performs so badly as to be unusable. Even if it works now, it's very likely that it won't scale and will cause major problems down the road.

Best of luck to you, but I don't think that this approach will end well in all but the smallest of databases.

Tom H. 2009-12-16 20:34:07

I'm gonna vote this down because it doesn't really help our question, but you can feel free to vote my flawed question down too :-)

Frew 2009-12-16 20:37:05

I'm not too worried about helping you. I'm hoping it helps the people who see this question so that they don't fall into the same trap as you have. That's worth the -2 rep to me, so guess I'll just have to live with it."A fool despises good counsel, but a wise man takes it to heart." - Confucius

Tom H. 2009-12-16 21:10:05

Answer 4

+1 A:

Kudos to the only person who saw through naysaying and actually tried the query on a large table we do not have access to. To all the rest saying this simply will not work (will return random rows) - we know what the manual says, and we know it is a hack - this is why we ask the question in the first place. However outright dismissing a query without even trying it is rather shallow. Can someone provide us with a real example (with preceeding CREATE/INSERT statements) demonstrating the above query malfunctioning?

Peter Rabbitson 2009-12-17 09:19:03

ansaurus

tags:

views:

answers:

Is there a better way to sort this query?

related questions