views:

138

answers:

2

Hello all, first of all, sorry for the non descriptive title, I'm just too rushed so I couldn't come up with a better one.
Second:
I have a portion of my database the looks like the following diagram:
alt text I have contributors on the system, each write to many sources, and a source can have many working contributors. Users can subscribe to as many contributors as they like and as many sources as they like. Now, what I want to do is simply retrieve all the articles for certain user. These articles are either coming through a contributor or a source the user subscribes to. To make it easy, when a user subscribes to a source I simply copy all the sources contributors to the users_contributors table. One tricky piece, when I retrieve the user's articles I retrieve all the articles that he his contributors write, and all the articles that were published in the sources he follows where those articles doesn't have a valid contributor on the system. (I.E contributorID is null).
I created the following query:

 Select Articles.ArticleID, Articles.ContributorId, Contributors.Name, 
    Sources.Name, Articles.ArticleTitle
From Articles 
      Inner Join Contributors On Articles.ContributorId = Contributors.ContributorId
      Inner Join Sources On Articles.SourceId = Sources.SourceID   
Where Articles.ContributorId in (
    Select ContributorId from Users_Contributors
    Where UserID = 3
    )
    OR (

     Articles.SourceId in (
      Select SourceId from Users_Sources
      Where UserID = 3
      )

      and 
      Articles.ContributorId is null
    )

The problem with the above query is that, it doesn't return any article with contributorID null. I understand this is because of the join on the contributors table. What should I do in such a case?

  1. Should I consider denormalization?
  2. What are the prober fields to index on each table for this query to run fast (Rowset returned are approximately 10000)?
  3. I need to support paging on this query, will "With { }" clause be appropriate to me, or should I consider another strategy?


    Thanks in advance.
    Ps: I'm using SQL Server 2008

+2  A: 

Why don't you just make this join

Inner Join Contributors On Articles.ContributorId = Contributors.ContributorID

an outer join?

Left Join Contributors On Articles.ContributorId = Contributors.ContributorID

That will cause it to return all Articles, whether or not there is a matching SourceID (includes cases where ContributorID is null).

Robert Harvey
oops I mistyped .. it's Join contributors on Articles.ContributorID = Contributors.ContributorId .. I edited, plz check again!
Galilyou
I edited my answer to correct.
Robert Harvey
+4  A: 
SELECT  a.*, s.Name AS SourceName, NULL AS ContributorName
FROM    User_Sources us
JOIN    Articles a
ON      a.SourceID = us.SourceID
JOIN    Source s
ON      s.SourceID = us.SourceID
WHERE   us.UserID = 3
        AND a.ContributorID IS NULL
UNION
SELECT  a.*, s.Name AS SourceName, c.Name AS ContributorName
FROM    User_Contributor uc
JOIN    Articles a
ON      a.ContributorID = uc.ContributorID
JOIN    Contirbutors c
ON      c.ContributorID = uc.ContributorID
JOIN    Sources s
ON      s.SourceID = a.SourceID
WHERE   uc.UserID = 3

If you need paging, use this (to get the pages from 80 to 100):

WITH    q AS (
        SELECT  TOP 100 
                a.*, s.Name AS SourceName, NULL AS ContributorName
        FROM    User_Sources us
        JOIN    Articles a
        ON      a.SourceID = us.SourceID
        JOIN    Source s
        ON      s.SourceID = us.SourceID
        WHERE   us.UserID = 3
                AND a.ContributorID IS NULL
        ORDER BY
                OrderDate
        UNION
        SELECT  TOP 100
                a.*, s.Name AS SourceName, c.Name AS ContributorName
        FROM    User_Contributor uc
        JOIN    Articles a
        ON      a.ContributorID = uc.ContributorID
        JOIN    Contirbutors c
        ON      c.ContributorID = uc.ContributorID
        JOIN    Sources s
        ON      s.SourceID = a.SourceID
        WHERE   uc.UserID = 3
        ORDER BY
                OrderDate
        ),
        page AS
        (
        SELECT  TOP 100 *, ROW_NUMBER() OVER (ORDER BY OrderDate) AS rn
        FROM    q
        )
SELECT  *
FROM    page
WHERE   rn >= 80
Quassnoi
I need to join in Contributors and Sources in order to get contributor name and source name. so .. !!
Galilyou
I'm not sure of the union would work for me. Here are my specific needs: I need to order the return records by Articles.OrderDate and support paging on those. Also on the second part of the union I have to join on the sources table to get SourceName (as it's not going to be null).
Galilyou
@7alwagy : The union should work. Its supported in sql server 2008. See http://technet.microsoft.com/en-us/library/ms180026.aspx.
Nrj
That worked, thanks @Quassnoi .. Winner!
Galilyou