views:

145

answers:

1
SELECT PLD_LINK.ID, PLD_LINK.TITLE, PLD_LINK.URL, PLD_CATEGORY.TITLE, TLL_SORT_STATUS.status, PLD_LINK_COMMENT.DATE_ADDED
    FROM PLD_LINK, PLD_CATEGORY, TLL_SORT_STATUS, PLD_LINK_COMMENT 
    WHERE PLD_LINK.CATEGORY_ID = PLD_CATEGORY.ID 
        AND PLD_LINK.ID = TLL_SORT_STATUS.link_id 
        AND PLD_LINK.PAGERANK BETWEEN -1 AND 2
        AND PLD_LINK.CATEGORY_ID IN (1291,2113,2664)
        ORDER BY COUNT(PLD_LINK.ID = PLD_LINK_COMMENT.LINK_ID) ASC

This is my query. I would like to get all the results that match before ORDER BY. Now, some of the id's from PLD_LINK.ID may be in the table PLD_LINK_COMMENT but some may not be. SO when mysql reaches ORDER BY i would like to order the results in an ascending order based on the following criteria: the id's that are not in PLD_LINK_COMMENT should be the first and following after that the ones's that are based on the number of comments per ID. (an ID can have none, 1, 2 ... comments.)

+1  A: 

You need an outer join between the three main tables and PLD_LINK_COMMENT.

The outer join will mean you get a null for the PLD_LINK_COMMENT.LINK_ID, which means that the COUNT will return 0 for a PLD_LINK.ID with no matching link comments, and the appropriate number of comments for the others.

You need to group by the non-aggregated columns...but that then shows that M.Date_Added probably isn't what you want to GROUP BY; and the question/answer in the comments shows that we need the most recent comment date, or some preposterously early data like '0001-01-01' (the earliest valid date in SQL) or '0000-00-00' (a pseudo-date used by MySQL).

Hence:

SELECT K.ID, K.Title, K.URL, C.Title, S.Status,
       MAX(NVL(M.Date_Added, DATE '0001-01-01')) AS MostRecentComment
  FROM Pld_Link AS K JOIN Pld_Category AS C ON K.Category_ID = C.ID
       JOIN Tll_Sort_Status AS C ON K.ID = S.Link_ID
       OUTER JOIN Pld_Link_Comment AS M ON K.ID = M.Link_ID
 WHERE K.PageRank BETWEEN -1 AND 2
   AND K.Category_ID IN (1291,2113,2664)
 GROUP BY K.ID, K.Title, K.URL, C.Title, S.Status
 ORDER BY COUNT(M.Link_ID) ASC

You also need your DBMS (MySQL you said) to allow you to order by columns that are not listed in the SELECT-list.


Tested SQL

Tested using IBM Informix Dynamic Server 11.50 on MacOS X 10.6.2. I've used DATE('0001-01-01') because IDS has a function called DATE that requires parentheses, and I used '0001-01-01' because IDS does not recognize '0000-00-00' as a valid date.

SELECT K.ID, K.Title, K.URL, C.Title, S.STATUS,
       MAX(NVL(M.Date_Added, DATE('0001-01-01'))) AS MostRecentComment,
  FROM PLD_Link AS K JOIN PLD_Category AS C ON K.Category_ID = C.ID
       JOIN Tll_Sort_Status AS S ON K.ID = S.Link_ID
       LEFT OUTER JOIN PLD_Link_Comment AS M ON K.ID = M.Link_ID
 WHERE K.PageRank BETWEEN -1 AND 2
   AND K.Category_ID IN (1291,2113,2664)
 GROUP BY K.ID, K.Title, K.URL, C.Title, S.STATUS
 ORDER BY COUNT(M.Link_ID)

Result:

id  title                 url                       title   status  mostrecent
4   #4: This should show  example.com/?id=4;c=2113  Cat 2113  0  0001-01-01
3   #3: This should show  example.com/?id=3;c=1291  Cat 1291 -1  2010-03-10
7   #7: This should show  example.com/?id=7;c=2664  Cat 2664  3  2010-07-11
5   #5: This should show  example.com/?id=5;c=2113  Cat 2113  1  2010-05-17
2   #2: This should show  example.com/?id=2;c=1291  Cat 1291 -2  2010-02-13

Alternative, selecting the comment count, to show that the data is presented in the correct order.

SELECT K.ID, K.Title, K.URL, C.Title, S.STATUS,
       MAX(NVL(M.Date_Added, DATE('0001-01-01'))) AS MostRecentComment,
       COUNT(M.Link_ID) AS CommentCount
  FROM PLD_Link AS K JOIN PLD_Category AS C ON K.Category_ID = C.ID
       JOIN Tll_Sort_Status AS S ON K.ID = S.Link_ID
       LEFT OUTER JOIN PLD_Link_Comment AS M ON K.ID = M.Link_ID
 WHERE K.PageRank BETWEEN -1 AND 2
   AND K.Category_ID IN (1291,2113,2664)
 GROUP BY K.ID, K.Title, K.URL, C.Title, S.STATUS
 ORDER BY CommentCount ASC;

Result set:

id title                 url                       title  status  recent  count
4  #4: This should show  example.com/?id=4;c=2113  Cat 2113  0  0001-01-01  0
3  #3: This should show  example.com/?id=3;c=1291  Cat 1291 -1  2010-03-10  1
7  #7: This should show  example.com/?id=7;c=2664  Cat 2664  3  2010-07-11  1
5  #5: This should show  example.com/?id=5;c=2113  Cat 2113  1  2010-05-17  2
2  #2: This should show  example.com/?id=2;c=1291  Cat 1291 -2  2010-02-13  3

Sample data base

PLD_Category

CREATE TABLE PLD_Category
(
    ID      SERIAL NOT NULL PRIMARY KEY,
    Title   VARCHAR(32) NOT NULL
);

INSERT INTO PLD_Category VALUES(3961, 'Cat 3961');
INSERT INTO PLD_Category VALUES(1291, 'Cat 1291');
INSERT INTO PLD_Category VALUES(2113, 'Cat 2113');
INSERT INTO PLD_Category VALUES(2664, 'Cat 2664');

PLD_Link

CREATE TABLE PLD_Link
(
    ID          SERIAL NOT NULL PRIMARY KEY,
    Category_ID INTEGER NOT NULL REFERENCES PLD_Category,
    Title       VARCHAR(32) NOT NULL,
    PageRank    INTEGER NOT NULL,
    URL         VARCHAR(64) NOT NULL
);

INSERT INTO PLD_Link
    VALUES(1, 3961, '#1: This should not',  0, 'example.com/?id=1;c=3961');
INSERT INTO PLD_Link
    VALUES(2, 1291, '#2: This should show', 0, 'example.com/?id=2;c=1291');
INSERT INTO PLD_Link
    VALUES(3, 1291, '#3: This should show', 0, 'example.com/?id=3;c=1291');
INSERT INTO PLD_Link
    VALUES(4, 2113, '#4: This should show', 0, 'example.com/?id=4;c=2113');
INSERT INTO PLD_Link
    VALUES(5, 2113, '#5: This should show', 0, 'example.com/?id=5;c=2113');
INSERT INTO PLD_Link
    VALUES(6, 3961, '#6: This should not',  0, 'example.com/?id=6;c=3961');
INSERT INTO PLD_Link
    VALUES(7, 2664, '#7: This should show', 0, 'example.com/?id=7;c=2664');
INSERT INTO PLD_Link
    VALUES(8, 2664, '#8: This should show', 0, 'example.com/?id=8;c=2664');

TLL_Sort_Status

CREATE TABLE TLL_Sort_Status
(
    STATUS      INTEGER NOT NULL,
    Link_ID     INTEGER NOT NULL REFERENCES PLD_Link
);

INSERT INTO TLL_Sort_Status VALUES(-3, 1);
INSERT INTO TLL_Sort_Status VALUES(-2, 2);
INSERT INTO TLL_Sort_Status VALUES(-1, 3);
INSERT INTO TLL_Sort_Status VALUES( 0, 4);
INSERT INTO TLL_Sort_Status VALUES(+1, 5);
INSERT INTO TLL_Sort_Status VALUES(+2, 6);
INSERT INTO TLL_Sort_Status VALUES(+3, 7);
INSERT INTO TLL_Sort_Status VALUES(+4, 1);

PLD_Link_Comment

CREATE TABLE PLD_Link_Comment
(
    Link_ID     INTEGER NOT NULL REFERENCES PLD_Link,
    Date_Added  DATE NOT NULL,
    COMMENT     VARCHAR(64) NOT NULL
);

INSERT INTO PLD_Link_Comment VALUES(1, DATE('2010-01-11'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(2, DATE('2010-02-11'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(2, DATE('2010-02-12'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(2, DATE('2010-02-13'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(3, DATE('2010-03-10'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(5, DATE('2010-05-15'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(5, DATE('2010-05-17'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(6, DATE('2010-06-10'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(6, DATE('2010-06-12'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(6, DATE('2010-06-14'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(6, DATE('2010-06-19'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(7, DATE('2010-07-11'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(8, DATE('2010-08-12'), 'Vacuous comment');
INSERT INTO PLD_Link_Comment VALUES(8, DATE('2010-08-13'), 'Vacuous comment');
Jonathan Leffler
if the COUNT value is 0 can I set the value for DATE_ADDED to 0000-00-00 in the sql query?
In MySQL, probably; in other DBMS, no - 0000-00-00 isn't a valid date. You'd probably use the NVL or COALESCE functions: `NVL(M.Date_Added, DATE '0000-00-00')`.
Jonathan Leffler
@user253530: Suppose there are three comments for a PLD_LINK entry with three different dates; which date should be shown - oldest, newest, something else?
Jonathan Leffler
the newest date should be shown
ok, since the question i asked is part of a bigger problem can you help me with this: i have search patterns, variable number of search patterns, which i want to find in the PLD_LINK.TITLE OR PLD_LINK.URL. I need to perform that query to return me only the results that match those search patterns (and in the same time those other conditions have to be satisfied)
That complicates life. With Informix, I'd put the search patterns into a temporary table (call it SearchPatterns, with a single column Pattern), and then add a condition along the lines of: `AND K.Link_ID IN (SELECT K2.Link_ID FROM PLD_Link AS K2 CROSS JOIN SearchPatterns AS P WHERE K2.Title LIKE P.Pattern OR K2.URL LIKE P.Pattern)`.
Jonathan Leffler
Thank you for all the help. You helped me more than i would have expected.
@user253350: The preferred way of saying 'thank you' is an upvote and/or an 'accept' vote.
Jonathan Leffler