Hi, I'm having trouble with a query that combines records when it shouldn't.

I have two tables, Authors and Publications, related by PublicationID in a many-to-many relationship: each author can have many publications, and each publication can have many authors. I want my query to return every publication for a set of authors and include the IDs of the other authors who contributed to each publication, grouped into one field. (I am working with MySQL.)

I have tried to picture it graphically below

    Table: Authors               Table: Publications
AuthorID | PublicationID        PublicationID | PublicationName
    1    |   123                       123    |       A
    1    |   456                       456    |       B
    2    |   123                       789    |       C
    2    |   789
    3    |   123
    3    |   456

I want my result set to be the following

 AuthorID | PublicationID | PublicationName | AllAuthors
     1    |       123     |        A        |    1,2,3
     1    |       456     |        B        |    1,3
     2    |       123     |        A        |    1,2,3
     2    |       789     |        C        |     2
     3    |       123     |        A        |    1,2,3
     3    |       456     |        B        |    1,3

This is my query

Select   Author1.AuthorID,
    Publications.PublicationID,
    Publications.PubName,
    GROUP_CONCAT(TRIM(Author2.AuthorID)ORDER BY Author2.AuthorID ASC)AS 'AuthorsAll'
FROM Authors AS Author1
LEFT JOIN Authors AS Author2
ON Author1.PublicationID = Author2.PublicationID
INNER JOIN Publications
ON Author1.PublicationID = Publications.PublicationID
WHERE Author1.AuthorID ="1" OR Author1.AuthorID ="2" OR Author1.AuthorID ="3" 
GROUP BY Author2.PublicationID

But it returns the following instead

 AuthorID | PublicationID | PublicationName | AllAuthors
     1    |       123     |        A        |    1,1,1,2,2,2,3,3,3
     1    |       456     |        B        |    1,1,3,3
     2    |       789     |        C        |     2

It does deliver the desired output when there is only one AuthorID in the WHERE clause. I have not been able to figure it out. Does anyone know where I'm going wrong?

A: 

To eliminate duplicate authors, change:

ON Author1.PublicationID = Author2.PublicationID

to:

ON Author1.PublicationID = Author2.PublicationID AND
   Author1.AuthorID <> Author2.AuthorID

Also, change:

GROUP BY Author2.PublicationID

to:

GROUP BY Author1.AuthorID, Author1.PublicationID
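
For context, the duplicates arise because the self-join yields one row per (Author1, Author2) pair for each publication, and grouping on the publication alone merges the rows of every selected Author1. Below is a minimal sketch of the adjusted join, run from Python against an in-memory SQLite database rather than MySQL (an assumption made only so the example can be run as-is). The helper names build_db and other_authors_per_row are hypothetical, and the sample rows are the ones from the question. ORDER BY inside GROUP_CONCAT is dropped because older SQLite versions lack it, so the IDs are sorted in Python instead. The sketch groups on Author1's columns so that rows where the LEFT JOIN finds no co-author (Author2 is NULL) stay separate.

```python
import sqlite3


def build_db():
    """Create an in-memory SQLite database holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Authors (AuthorID INTEGER, PublicationID INTEGER);
        CREATE TABLE Publications (PublicationID INTEGER, PubName TEXT);
        INSERT INTO Authors VALUES (1,123),(1,456),(2,123),(2,789),(3,123),(3,456);
        INSERT INTO Publications VALUES (123,'A'),(456,'B'),(789,'C');
    """)
    return conn


# Self-join that skips the row's own author, grouped per (author, publication).
QUERY = """
    SELECT Author1.AuthorID,
           Publications.PublicationID,
           Publications.PubName,
           GROUP_CONCAT(Author2.AuthorID) AS OtherAuthors
    FROM Authors AS Author1
    LEFT JOIN Authors AS Author2
           ON Author1.PublicationID = Author2.PublicationID
          AND Author1.AuthorID <> Author2.AuthorID
    INNER JOIN Publications
           ON Author1.PublicationID = Publications.PublicationID
    WHERE Author1.AuthorID IN (1, 2, 3)
    GROUP BY Author1.AuthorID, Publications.PublicationID, Publications.PubName
    ORDER BY Author1.AuthorID, Publications.PublicationID
"""


def other_authors_per_row(conn):
    """Return (AuthorID, PublicationID, PubName, sorted co-author IDs) tuples."""
    result = []
    for author, pub, name, others in conn.execute(QUERY):
        # GROUP_CONCAT yields NULL (None) when the LEFT JOIN matched nobody.
        ids = sorted(int(x) for x in others.split(",")) if others else []
        result.append((author, pub, name, ids))
    return result


if __name__ == "__main__":
    for row in other_authors_per_row(build_db()):
        print(row)
```

With the `<>` condition each row lists only the co-authors, so the solo publication 789 gets an empty list. Dropping that condition while keeping the two-column GROUP BY should list every author of the publication, including the row's own.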
vladr
Hi, thanks for your help. When I ran the query on a large data sample and increased the number of authors to look for, your solution was the fastest.
Matt
A: 

I suppose I'm not sure why you need the GROUP BY in the first place. Why couldn't you use a correlated subquery like so:

Select   Author1.AuthorID
    ,  Publications.PublicationID
    , Publications.PubName
    , (
        Select GROUP_CONCAT(TRIM(Author2.AuthorID) ORDER BY Author2.AuthorID ASC) 
        From Authors As Author2
        Where Author2.PublicationID = Publications.PublicationID
        ) AS 'AuthorsAll'
FROM Authors AS Author1
    INNER JOIN Publications
        ON Author1.PublicationID = Publications.PublicationID
Where Author1.AuthorId In("1","2","3")
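
As a rough check of this approach, here is a small sketch that runs the same correlated subquery from Python against an in-memory SQLite database loaded with the question's sample rows. SQLite stands in for MySQL purely as an assumption to keep the example self-contained, and the function name all_authors_by_subquery is hypothetical. The ORDER BY inside GROUP_CONCAT is omitted because older SQLite versions do not accept it, so the IDs are sorted in Python.

```python
import sqlite3


def all_authors_by_subquery():
    """Run the correlated-subquery form on the question's sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Authors (AuthorID INTEGER, PublicationID INTEGER);
        CREATE TABLE Publications (PublicationID INTEGER, PubName TEXT);
        INSERT INTO Authors VALUES (1,123),(1,456),(2,123),(2,789),(3,123),(3,456);
        INSERT INTO Publications VALUES (123,'A'),(456,'B'),(789,'C');
    """)
    rows = conn.execute("""
        SELECT Author1.AuthorID,
               Publications.PublicationID,
               Publications.PubName,
               (SELECT GROUP_CONCAT(Author2.AuthorID)
                FROM Authors AS Author2
                WHERE Author2.PublicationID = Publications.PublicationID
               ) AS AuthorsAll
        FROM Authors AS Author1
        INNER JOIN Publications
                ON Author1.PublicationID = Publications.PublicationID
        WHERE Author1.AuthorID IN (1, 2, 3)
        ORDER BY Author1.AuthorID, Publications.PublicationID
    """).fetchall()
    conn.close()
    # Sort each ID list numerically, since SQLite's concatenation order is
    # not guaranteed without an explicit ORDER BY.
    return [
        (author, pub, name, ",".join(sorted(ids.split(","), key=int)))
        for author, pub, name, ids in rows
    ]


if __name__ == "__main__":
    for row in all_authors_by_subquery():
        print(row)
```

Because the subquery is evaluated once per outer row and no outer GROUP BY is involved, each (author, publication) row keeps its own line, and the author list is built independently of which authors the WHERE clause selects.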
Thomas