views:

84

answers:

3

Hi, I am trying to pull multiple columns from 3 joined tables, but I want the result set to contain only one distinct "data entry" per p.id (the pet id which is a foreign key in the data entry table). The issue i have is that there could be two data entries, numbered 1 and 2, belonging to a pet - the query has to just pick the data entry with the highest number - that's why i was trying to use max and group by but it doesn't quite work. Can anyone see where i'm going wrong? Many thanks

SELECT `p`.`id`, `o`.`id`, `o`.`email`, MAX(d.number), `d`.`number`
FROM (`pets` AS `p`, `owners` AS `o`, `data_entries` AS `d`) 
WHERE `p`.`owner_id` = `o`.`id` 
AND `p`.`id` = `d`.`pet_id` 
GROUP BY `p`.`id` 
ORDER BY `d`.`number` DESC

EDIT: following iniju's suggestion, I tried the following query, but this doesn't not filter out the data entries where the number is lower than another data entry for the same pet :

SELECT `p`.`id`, `o`.`id`, `o`.`email`, `d`.`number`
FROM (`pets` AS `p`, `owners` AS `o`, `data_entries` AS `d`) 
WHERE `p`.`owner_id` = `o`.`id` 
AND `p`.`id` = `d`.`pet_id` 
GROUP BY `p`.`id`, `o`.`id`, `o`.`email`, `d`.`number`
HAVING `d`.`number`=MAX(`d`.`number`)
A: 

If I'm understanding your question correctly, you need

LIMIT 1

at the end.

Will
Ah but that would limit the query to just one result in total
franko75
Ok. I thought that was overly simple, I'm definitely just not understanding your question.
Will
I'll try and update the orginal post to make it a bit clearer
franko75
A: 

Your GROUP BY should contain all the non-aggregated columns that is:
GROUP BY 'p'.'id', 'o'.'id', 'o'.'email', 'd'.'number'

Full query:

SELECT `p`.`id`, `o`.`id`, `o`.`email`, MAX(d.number)  
FROM (`pets` AS `p`, `owners` AS `o`, `data_entries` AS `d`)  
WHERE `p`.`owner_id` = `o`.`id`  
    AND `p`.`id` = `d`.`pet_id`  
GROUP BY `p`.`id`, `o`.`id`, `o`.`email`  
ORDER BY `d`.`number` DESC 
iniju
Actually, you should probably remove 'd'.'number' both from the SELECT and from the GROUP BY
iniju
Hi iniju - i've updated the query with your suggestion, but i need to filter out any of the results which have the same pet_id - by that i mean only select the data_entry with the highest number
franko75
Oh I see now. Then keep the 'd'.'number' in both locations and add the following after the GROUP BY part:HAVING `d`.`number`=MAX(d.number)
iniju
Please see the adjusted answer
iniju
Thanks for the help so far iniju, but this still doesn't filter out the the unwanted data_entries where the number is lower than another data entry for the same pet_id. I'll edit my original post to show the query I just ran with your suggestions
franko75
Sorry try this then, see above
iniju
Hmm the order by has no effect - the query now only retrieves data_entries with the number 1. I'm wondering now if this requires an inner join to make this work
franko75
I didn't touch the order by, I just removed the \`d\`.\`number\` from SELECT and GROUP BY. Please see the full query above.
iniju
The query above needs the d.number to also be selected, otherwise it fails in the HAVING clause - when i add the d.number to the columns selected, the query returns the same results as i mentioned in my last column (p.s thanks again for your help so far)
franko75
What do you get if you remove the HAVING line completely? As in my newly edited answer above (full query)
iniju
Ah that seems to work perfectly! Great help iniju! I now need to figure out how to add the extra clauses i had into that query (stripped it down for this post) - at first glance the results don't seem right but that's for me to worry about
franko75
We finally fixed it! About the extensions just remember that any extra columns in the select that don't use max(), count() etc should also appear in the GROUP BY
iniju
+1  A: 

You cannot simultaneously select both the number and the MAX number while grouping. This is because the MAX function is based on the aggregate of your grouping. In your case you are also selecting the column itself, which will create a row for each distinct number... not each distinct MAX number.

Let's say you have 4 rows:

PET    NUMBER
A      1
A      2
B      4
B      2

Do you want your result set to be:

PET  MAX
A    2
B    4

Or do you want it to be:

PET  NUM  MAX
A    1    2
A    2    2
B    4    4
B    2    4

The difference is that one is grouped and aggregated while the other is NOT grouped but contains the aggregated column.

I suggest you remove the d.number from your SELECT statement, leaving only MAX(d.number) This will solve your problem if, and only if, the o.id and o.email are unique to the joined record for p.id

Try this:

SELECT `p`.`id`, `o`.`id`, `o`.`email`, MAX(d.number)
FROM `data_entries` AS `d`
JOIN `pets` `p` ON `p`.`ID` = `d`.`pet_id`
JOIN `owners` `o` ON `o`.`ID` = `p`.`owner_id`
GROUP BY  `p`.`id`, `o`.`id`, `o`.`email`
ORDER BY MAX(d.number) DESC
Matthew PK