tags:

views:

93

answers:

5

Can someone help me with this query:

SELECT su.name,su.imdb , su.id, su.owner, COUNT(*) AS count 
FROM subhits AS h LEFT JOIN subtitles AS su ON h.subid=su.id 
WHERE su.added BETWEEN '" . $lweek . "' AND '" . $today . "' 
GROUP BY h.subid 
ORDER BY count DESC LIMIT 15

RETURNS:

name      imdb    id  owner  count
*Angels and Demons WTV  /title/tt0808151/  3337  3055  120
Terminator Salvation 2009   /title/tt0438488/  3539  26  120
Friday The 13th-WTV   /title/tt0758746/  3334  26  82
Night at the Museum     /title/tt1078912/  3548  7170  75
*Angels and Demons WTV2 /title/tt0808151/       3512    12 10

*DUPLICATED IMDB

As you see it returns results with the same imdb. What i would like to do is get the highest count, but exclude the lowest imdb count from the result.

A: 

This may help.

http://www.artfulsoftware.com/infotree/queries.php?&bw=1280#101

yieldvs
Thanks, i will take a look at that.
A: 

Try this:

SELECT su.name
        , su.imdb
        , su.id
        , su.owner
        , (SELECT COUNT(*) FROM dbo.subhits AS h WHERE h.subid = su.id) AS count
    FROM dbo.subtitles AS su
    ORDER BY count DESC LIMIT 15

Edit: Using the additional information provided I had an almost working solution but got stuck. Combining Chris Simpson's answer with the partial solution I came up with this. It avoids temp tables and has the missing pieces that the other Chris didn't have time to complete. I'm building this in MSSQL so it might need tweaking to work in MySQL. I also can't guarantee the information will be correct without your data but this might help.

SELECT a.id
        , (SELECT su.name FROM dbo.subtitles su WHERE a.id = su.id LIMIT 1)
        , a.imdb
        , (SELECT su.owner FROM dbo.subtitles su WHERE a.id = su.id LIMIT 1)
        , a.count
    FROM 
        (
            SELECT su.id
                    , su.imdb
                    , COUNT(*) AS count
                FROM dbo.subtitles su
                INNER JOIN dbo.subhits sh ON su.id = sh.subid
                GROUP BY su.id, su.imdb
        ) a
    INNER JOIN 
        (
            SELECT c.imdb
                    , MAX(c.count) AS maxCount
                FROM 
                    (
                        SELECT su.id
                                , su.imdb
                                , COUNT(*) AS count
                            FROM dbo.subtitles su
                            INNER JOIN dbo.subhits sh ON su.id = sh.subid
                            GROUP BY su.id, su.imdb
                    ) c
                GROUP BY c.imdb
        ) b ON a.imdb = b.imdb AND a.count = b.maxCount
    ORDER BY a.count DESC LIMIT 15
Chris Porter
That's what my query does, but it returns duplicated 'su.imdb', and i don't want that to happen.Thank you anyway
Can you update the question with a more detailed explanation of what you expect the data result set to look like? I assumed imdb would be unique by title. If it is not, and you group by imdb, you could have ids and titles that are not included in the result set and the hits that relate to them will be excluded as well. I'm sure there is a way to write this query but you need a better explanation of what you want from the data.
Chris Porter
I edited the question, hope its more clear now. Thank you
A: 

I hate to say this, but this is one of those rare occasions where a temp table might help. If you put the results of this:

select 
  imdb, 
  id, 
  count(*) as cnt
from subtitles  
group by 
  imdb, 
  id

..into a temp table, you can then get the rows you really want:

select
  t.imdb,
  t.id,
  t.cnt
from MyTempTable t
  inner join
  (
    select
      imdb, 
      max(cnt) as maxcnt 
    from MyTempTable
    group by  
      imdb
  ) as drv on t.imdb = drv.imdb and t.cnt = drv.maxcnt

..and the results of the above can be joined back into your main table without having to be grouped again.

Does this help?

Apologies for not putting all the SQL in but I'm pressed for time. This will do what you want but will still duplicate if the counts are the same. The query could be modified to deal with this if you care about it.

On a separate note, I'm not sure why you are left joining in the statement when you are not showing any data from the left hand table?

Chris Simpson
A: 

What I like about MySQL compared to Oracle, is how other fields not part of GROUP BY can be selected without using grouping functions. What the query below does is take your query and groups it by imdb, which selects a unique imdb with the highest count (which would appear first in the descending sort in the subselect). Try it and see if it works for you.

SELECT name, imdb, id, owner, count
FROM 
(
 SELECT su.name,su.imdb , su.id, su.owner, COUNT(*) AS count 
 FROM subhits AS h LEFT JOIN subtitles AS su ON h.subid=su.id 
 WHERE su.added BETWEEN '" . $lweek . "' AND '" . $today . "' 
 GROUP BY h.subid
 ORDER BY count DESC
)
GROUP BY imdb
LIMIT 15
Vladiat0r
With a few tweaks i made that work. Thank you much appreciated.And thanks to everyone else :)
A: 

I'd try to test this myself, but I don't have your data or a good sample to test against.

SELECT su.name,su.imdb , su.id, su.owner, COUNT(*) AS counter
FROM subhits AS h LEFT JOIN subtitles AS su ON h.subid=su.id 
WHERE su.added BETWEEN '" . $lweek . "' AND '" . $today . "' 
GROUP BY h.subid 
having max(counter)
ORDER BY count DESC LIMIT 15
Justin