views:

313

answers:

4

So my syntax is apparently correct in all three cases (PostgreSQL isn't grousing about anything) but the results come back in the same order with all three of these queries. Even stranger when I add/remove DESC from any of the following it has no impact either. Is it possible to sort results based on elements of a sub query or not?

Sort by affiliation
SELECT * FROM articles_view WHERE (1=1) 
AND spubid IN 
  (SELECT people.spubid FROM people WHERE (people.slast ilike 'doe') 
    GROUP BY people.spubid, people.slast, people.saffil) 
AND spubid IN 
  (SELECT status.spubid FROM status WHERE ((status.imonth >= 01 OR status.imonth IS NULL) AND status.iyear >= 2000) AND ((status.imonth <= 01 OR status.imonth IS NULL) AND status.iyear <= 2008) ORDER BY status.iyear, status.imonth)

Sort by last name, descending order
SELECT * FROM articles_view WHERE (1=1) 
AND spubid IN 
  (SELECT people.spubid FROM people WHERE (people.slast ilike 'doe') 
    GROUP BY people.spubid, people.slast, people.saffil ORDER BY people.slast DESC) 
AND spubid IN 
  (SELECT status.spubid FROM status WHERE ((status.imonth >= 01 OR status.imonth IS NULL) AND status.iyear >= 2000) AND ((status.imonth <= 01 OR status.imonth IS NULL) AND status.iyear <= 2008))

Sort by year/month descending order
SELECT * FROM articles_view WHERE (1=1) 
AND spubid IN 
  (SELECT people.spubid FROM people WHERE (people.slast ilike 'doe') 
    GROUP BY people.spubid, people.slast, people.saffil ) 
AND spubid IN 
  (SELECT status.spubid FROM status WHERE ((status.imonth >= 01 OR status.imonth IS NULL) AND status.iyear >= 2000) AND ((status.imonth <= 01 OR status.imonth IS NULL) AND status.iyear <= 2008) ORDER BY status.iyear, status.imonth DESC)

I am just not sure why the ORDER BY conditions are having no impact on the order of the results.

***** UPDATE:

What I ended up doing was using the array column in my view (in this case articles_view) to do all my sorting. That way I do all my sorts on a "column" in the primary query and totally avoid having to use JOINS. The way the view is defined, all the columns matching a given pubid (primary key) in the people/status table (both have a 1->many) are stored in array columns in the view. My query with the sort looks like this:

SELECT * FROM articles_view WHERE 
  ((articles_view.skeywords_auto ilike '%ice%') OR (articles_view.skeywords_manual ilike '%ice%')) 
  ORDER BY (articles_view.authors[1]).slast

The reason this works is because I always know that the first member of the array (in Postgres the first index is 1 rather than the usual 0), is the primary author (or primary status) which is what I need for sorting.

+2  A: 

All the sub queries are doing is providing a set of results for the condition to check for the existence of spubid. You need to actually join to the status table and then use the columns in an order by clause on the outer query.

Something like:

SELECT * 
FROM articles_view
       INNER JOIN status ON articles_view.spubid = status.spubid
       INNER JOIN people ON articles_view.spubid = people.spubid
WHERE ((status.imonth >= 01 OR status.imonth IS NULL) AND status.iyear >= 2000)
       AND ((status.imonth <= 01 OR status.imonth IS NULL)
       AND status.iyear <= 2008 AND people.slast ilike 'doe')
ORDER BY status.iyear, status.imonth
Simon Fox
Ugh that is what I was afraid of, I really want to avoid using a join (the product would get way too big, hence the use of sub queries instead).
Nicholas Kreidberg
Using an inner join you will only get the appropriate results, and the query optimizer should determine a plan which strips out anything not satisfying the where clause before the join is applied.
Simon Fox
+1  A: 

You are only sorting the data that is used by the IN statements. You need to sort your top level Select statement.

Edit:

And since the Select statements inside of the IN clauses are not contributing to the overall sorting of your results, you should remove the order by clauses from them, thus preventing the server from having to do unneeded processing.

instanceofTom
+2  A: 

You're not ordering the outer query; you're only ordering the inner query. It's perfectly legal, but all you're doing with those inner results is comparing spubid against them, and it doesn't much matter what order you do that in.

What you're looking for is a JOIN.

SELECT * 
FROM articles_view
INNER JOIN status ON (status.spubid = articles_view.spubid AND ((status.imonth >= 01 OR status.imonth IS NULL) AND status.iyear >= 2000) AND ((status.imonth <= 01 OR status.imonth IS NULL) AND status.iyear <= 2008))
WHERE spubid IN 
  (SELECT people.spubid FROM people WHERE (people.slast ilike 'doe') 
   GROUP BY people.spubid, people.slast, people.saffil ) 
ORDER BY status.iyear, status.imonth DESC

(You could rewrite the the other lookup as a join also, but for simplicity I left that one alone.)

VoteyDisciple
Since I don't know ahead of time whether or not the user will want to sort on a field in the people OR status table I will have to join both. The problem is the product that I am left with, there is just no need to incorporate that much data into the query but it sounds like that is my only option if I want to allow for a wide-variety of sort options.
Nicholas Kreidberg
You're incorporating that much data into the query *processing* whether you `JOIN` or not. If your concern is simply returning that many *columns*, that's *easily* corrected: simply do `SELECT articles_view.*` instead of `SELECT *`
VoteyDisciple
A: 

What I ended up doing was using the array column in my view (in this case articles_view) to do all my sorting. That way I do all my sorts on a "column" in the primary query and totally avoid having to use JOINS. The way the view is defined, all the columns matching a given pubid (primary key) in the people/status table (both have a 1->many) are stored in array columns in the view. My query with the sort looks like this:

SELECT * FROM articles_view WHERE 
  ((articles_view.skeywords_auto ilike '%ice%') OR (articles_view.skeywords_manual ilike '%ice%')) 
  ORDER BY (articles_view.authors[1]).slast

The reason this works is because I always know that the first member of the array (in Postgres the first index is 1 rather than the usual 0), is the primary author (or primary status) which is what I need for sorting.

Nicholas Kreidberg