tags:

views:

152

answers:

2

The following query returns strange results for me:

SELECT
    `Statistics`.`StatisticID`,
    COUNT(`Votes`.`StatisticID`) AS `Score`,
    COUNT(`Views`.`StatisticID`) AS `Views`,
    COUNT(`Comments`.`StatisticID`) AS `Comments`
FROM `Statistics`
LEFT JOIN `Votes` ON `Votes`.`StatisticID` = `Statistics`.`StatisticID`
LEFT JOIN `Views` ON `Views`.`StatisticID` = `Statistics`.`StatisticID`
LEFT JOIN `Comments` ON `Comments`.`StatisticID` = `Statistics`.`StatisticID`
GROUP BY `Statistics`.`StatisticID`
LIMIT 0, 10

I'm querying this on a table structure like the following:

(only data relevant to Statistics.StatisticID = 8)

Votes

StatisticID
    8

Views

StatisticID
    8
    8

Comments

StatisticID
    8
    8
    8
    8
    8

Now if I run this query I get the following result set:

StatisticID    Score    Views   Comments
     8           5        5        5

I knwo where the 5 comes from - the number of Comments - and this works if I take the comments statement out. Can anyone debug this as this is out of my reach (I'm relatively new with SQL).

Thanks, Ross

+4  A: 

When joining like this, you will duplicate the data as many times as you find mathing rows in the other tables. This is fine, if you only have 1 corresponding row in each table.

Run this query without the grouping, and you will get an idea on why you get the same result on all counts. I would however have guessed that you'll get 10 as the count for every field (1*2*5) If you want to solve this, you need to call a subselect for every count.

SELECT
    s.`StatisticID`,
    (SELECT COUNT(*) FROM Votes WHERE Votes.StatisticID = s.StatisticID) AS Score,
    (SELECT COUNT(*) FROM Views WHERE Views.StatisticID = s.StatisticID) AS Views,
    (SELECT COUNT(*) FROM Comments WHERE Comments.StatisticID = s.StatisticID) AS Comments,
FROM `Statistics` s
LIMIT 0, 10

There are certain performance-issues with this if the outher result is big. You might optimize it a little by joining one of the tables however I'm uncertain if the queryparser would be smart enough to only run 1 single time for each grouped item. Hopefully it would. Otherwise you could split it into different queries.

jishi
I wouldn't worry about performance issues - this sub-query solution will execute perfectly, and not perform any spurious actions!
Kieveli
I was more thinking if the outer result would have like 1000 rows, then it would need to run 3000 sub-queries aswell. In that case it would probably be faster to join one table at the time, and call the query 3 times.
jishi
agreed on the subquery thing - if there are a lot of outer results. But we do have limit 0, 10 in it at the moment i suppose.Also, you've selected from votes twice, third suybquery should be from comments
benlumley
Hm, I did change that, somehow my edit was reverted, or it was a glitch in the update.
jishi
Thanks, I wouldn't have been aware of the perf issues if you hadn't pointed that out to me :)
Ross
+2  A: 

Assuming you have an id field or similar on the votes/views/comments:

SELECT
    `Statistics`.`StatisticID`,
    COUNT(DISTINCT `Votes`.`VoteID`) AS `Score`,
    COUNT(DISTINCT `Views`.`ViewID`) AS `Views`,
    COUNT(DISTINCT `Comments`.`CommentID`) AS `Comments`
FROM `Statistics`
LEFT JOIN `Votes` ON `Votes`.`StatisticID` = `Statistics`.`StatisticID`
LEFT JOIN `Views` ON `Views`.`StatisticID` = `Statistics`.`StatisticID`
LEFT JOIN `Comments` ON `Comments`.`StatisticID` = `Statistics`.`StatisticID`
GROUP BY `Statistics`.`StatisticID`
LIMIT 0, 10

Not tested it, but think it should work. (We have to use the different field because the statisticID will always be the same within a given group ...)

benlumley
Actually, this is a better solution. Didn't know that you could use DISTINCT within a count, just tested it and it works.
jishi