tags:

views:

109

answers:

2

There are users that can be referenced by an id as seen in the table below:

users:
user_id

I have attempted a solution for keeping track of show views:

tvshow_hits:
user_id|show_id

There is also a table with details in it:

tvshows:
show_id|name

But that is not required; I'm open to change. I don't think it's necessary to track individual views beyond whether a user has watched a show, since, unlike music, TV shows and movies are only watched once, and if they are watched again the repeat views don't carry any "weight".

So now I'll work backwards: The resulting page should look like this:

People Who Watch Also Watch

South Park 40%
Family Guy 20%
Something 10%

So, what I need is a query and (if needed) PHP code to

Select the shows (in order of P DESC), the amount of people who watch it compared to the other shows (that have been watched by users who have watched ) in percentage form AS P WHERE the show(s) have been watched by users who have watched LIMIT x

I hope that makes sense; that is my best way of explaining it. Feel free to think of another way to do what I'm trying to do. All suggestions and help are welcome.

+1  A: 

Not tested but something like this:

-- @show_id holds the id of the show being viewed
SELECT name, COUNT(1) AS no_users
    FROM
        tvshowhitdetails
    WHERE
        -- only count hits from users who watched the selected show
        user_id IN (
            SELECT user_id
               FROM
                   tvshow_hits
               WHERE
                   show_id = @show_id
              )
        AND
        -- leave the selected show itself out of the list
        show_id <> @show_id
GROUP BY
    name
ORDER BY
    no_users DESC

will give you the name of each other show (tvshowhitdetails here is a view which joins your tvshow_hits and tvshows tables) and the number of those users who have watched it. You can then get the total number of users in another query to work out your percentage.

Update
Your tvshowhitdetails view looks like this:

-- one row per hit, with the show name attached
CREATE VIEW tvshowhitdetails
AS
SELECT     tvshow_hits.user_id, tvshow_hits.show_id, tvshows.name
     FROM         tvshow_hits LEFT OUTER JOIN
                  tvshows ON tvshow_hits.show_id = tvshows.show_id
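
To make the two-query idea concrete, here is a minimal sketch, not run against the asker's data. It uses Python's built-in sqlite3 module instead of MySQL/PHP so that it runs standalone. The sample rows, the placeholder name 'Selected Show', and the choice of denominator (users who watched the selected show) are all assumptions for illustration.

```python
import sqlite3

# In-memory database with the table layout from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tvshows (show_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tvshow_hits (user_id INTEGER, show_id INTEGER);
    -- Made-up sample data: users 1 to 5 watched show 1.
    INSERT INTO tvshows VALUES
        (1, 'Selected Show'), (2, 'South Park'), (3, 'Family Guy');
    INSERT INTO tvshow_hits VALUES
        (1, 1), (2, 1), (3, 1), (4, 1), (5, 1),
        (1, 2), (2, 2), (6, 2), (3, 3);
    CREATE VIEW tvshowhitdetails AS
        SELECT h.user_id, h.show_id, s.name
        FROM tvshow_hits h LEFT OUTER JOIN tvshows s ON h.show_id = s.show_id;
""")

show_id = 1

# Query 1: other shows watched by people who watched the selected show.
counts = conn.execute(
    """
    SELECT name, COUNT(1) AS no_users
    FROM tvshowhitdetails
    WHERE user_id IN (SELECT user_id FROM tvshow_hits WHERE show_id = :sid)
      AND show_id <> :sid
    GROUP BY name
    ORDER BY no_users DESC
    """,
    {"sid": show_id},
).fetchall()

# Query 2: how many users watched the selected show (assumed denominator).
(total,) = conn.execute(
    "SELECT COUNT(*) FROM tvshow_hits WHERE show_id = :sid", {"sid": show_id}
).fetchone()

# Combine both results in application code.
result = [(name, 100.0 * n / total) for name, n in counts]
for name, pct in result:
    print(f"{name} {pct:.0f}%")
```

With this sample data, user 6 watched South Park but not the selected show, so that hit is not counted.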
jammus
SQL query: SELECT tvshows.name, Count(1) FROM tvshows WHERE user_id IN (SELECT user_id FROM tvshow_hits WHERE show_id = @show_id) GROUP BY tvshows.name LIMIT 0, 30. MySQL said: #1054 - Unknown column 'user_id' in 'IN/ALL/ANY subquery'. However, tvshow_hits and users both have a user_id col.
steven
I think this is because you're selecting from the tvshows table rather than the tvshowhitdetails view. (I've updated my answer to show this.)
jammus
Thanks for your reply: I edited my question to reflect my actual table names and I fixed some cols. Here is the query I just used: SELECT tvshows.name, Count(1) AS no_users FROM tvshows WHERE user_id IN ( SELECT user_id FROM tvshow_hits WHERE show_id = @show_id ) AND show_id <> @show_id GROUP BY tvshows.name ORDER BY no_users DESC But it still gives me the same error as before.
steven
You're still selecting from the tvshows table rather than the view. I have updated my answer a little.
jammus
Whoops. Had a few table/column names wrong. Should be fine now.
jammus
For the view: CREATE VIEW [dbo].[tvshowhitdetails] AS SELECT dbo.tvshow_hits.UserId, dbo.tvshow_hitss.ShowId, dbo.tvshows.Name FROM dbo.tvshow_hits LEFT OUTER JOIN dbo.tvshows ON dbo.tvshow_hits.ShowId = dbo.tvshows.ShowId. #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[dbo].[tvshowhitdetails] AS SELECT dbo.tvshow_hits.UserId, dbo.tvshow_hits' at line 1. I don't know anything about views; should I have edited something?
steven
I've made a couple of changes. Try it now.
jammus
hm.. the query: MySQL returned an empty result set (i.e. zero rows). (Query took 0.0001 sec)
steven
Can you give me a link to your dataset?
jammus
+1  A: 

Starting from the show, you can look up other users who watched that show. Then you can look up the shows that those other users watched. Group it by the other show, and calculate the percentage by dividing by the total number of watchers for the selected show.

Here's an example query:

select
    other_shows.showid as ShowId
    -- watchers of this other show / all watchers of the selected show
,   COUNT(other_shows.userid) / (
        select count(*) from tvshow_hits where showid = @showid 
        ) as Percentage
-- other_users: hits on the selected show
from tvshow_hits other_users
-- other_shows: everything else those same users watched
inner join tvshow_hits other_shows
    on other_shows.userid = other_users.userid
    and other_shows.showid <> @showid
where other_users.showid = @showid
and other_users.userid <> @userid
group by other_shows.showid
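
As a runnable illustration of the self-join above, here is a minimal sketch using Python's built-in sqlite3 module with the question's column names (user_id, show_id). The sample rows are made up, the filter on the current user is left out on the assumption that the current viewer should count, and the `* 1.0` is only there because SQLite would otherwise do integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tvshow_hits (user_id INTEGER, show_id INTEGER);
    -- Made-up sample data: users 1 to 5 watched show 1.
    INSERT INTO tvshow_hits VALUES
        (1, 1), (2, 1), (3, 1), (4, 1), (5, 1),
        (1, 2), (2, 2), (6, 2), (3, 3);
""")

rows = conn.execute(
    """
    SELECT other_shows.show_id,
           COUNT(other_shows.user_id) * 1.0 / (
               SELECT COUNT(*) FROM tvshow_hits WHERE show_id = :sid
           ) AS percentage
    FROM tvshow_hits other_users          -- hits on the selected show
    INNER JOIN tvshow_hits other_shows    -- other hits by the same users
        ON other_shows.user_id = other_users.user_id
       AND other_shows.show_id <> :sid
    WHERE other_users.show_id = :sid
    GROUP BY other_shows.show_id
    ORDER BY percentage DESC
    """,
    {"sid": 1},
).fetchall()

for show_id, percentage in rows:
    print(show_id, percentage)
```

Here two of the five watchers of show 1 also watched show 2, and one of the five watched show 3.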
Andomar
should Other_users and Other_shows be: users and tvshows_hits?
steven
@steven: other_users and other_shows are aliases; both refer to the table that has the `userid, showid` columns
Andomar
I don't understand what I am meant to change other_shows and other_users to?
steven
@steven: In order to use one table in multiple places, SQL allows you to give it an alias. I'm using the `tvshow_hits` table twice, once aliasing it to `other_users`, and once to `other_shows`. You could also write it as `tvshow_hits as other_users` if that's clearer. The `as` works much like a column alias
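
A small sketch of table aliasing, separate from the percentage query: the same table is joined to itself under two names. It uses Python's built-in sqlite3 module; the aliases `mine` and `theirs` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tvshow_hits (user_id INTEGER, show_id INTEGER);
    INSERT INTO tvshow_hits VALUES (1, 1), (1, 2), (2, 1), (3, 2), (4, 3);
""")

# The same table appears twice: "mine" holds the rows of user 1,
# "theirs" holds rows of other users for the same shows.
rows = conn.execute("""
    SELECT DISTINCT theirs.user_id
    FROM tvshow_hits AS mine
    INNER JOIN tvshow_hits AS theirs
        ON theirs.show_id = mine.show_id
       AND theirs.user_id <> mine.user_id
    WHERE mine.user_id = 1
    ORDER BY theirs.user_id
""").fetchall()

# Users who share at least one show with user 1.
shared_with = [user_id for (user_id,) in rows]
print(shared_with)
```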
Andomar
Ah, I understand your code better now. The current user should be counted in the total, so I'm assuming I can just get rid of "and other_users.userid <> @userid", and "@showid" means the show_id of the show I choose, right? The percentages come up as 0.4, 0.2, 0.2; I'm guessing that's OK.
steven
@steven: That's right.
Andomar
The problem I have with the query is that the percentages don't add up to 1 (100%). Any idea why? The limit is 20 and it only shows three rows.
steven
@steven: There's no reason they should add up to 100. If you had 10 people in the database, and they all watch the same shows, that would be 3 shows with 100% each!
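
A minimal sketch of that scenario, using Python's built-in sqlite3 module with made-up data: 10 users who all watched the same four shows, with show 1 as the selected one. The column names follow the question, and `* 1.0` avoids SQLite's integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tvshow_hits (user_id INTEGER, show_id INTEGER)")
# 10 users who all watched the same four shows (ids 1 to 4).
conn.executemany(
    "INSERT INTO tvshow_hits VALUES (?, ?)",
    [(user, show) for user in range(1, 11) for show in range(1, 5)],
)

rows = conn.execute(
    """
    SELECT other_shows.show_id,
           COUNT(other_shows.user_id) * 1.0 / (
               SELECT COUNT(*) FROM tvshow_hits WHERE show_id = :sid
           ) AS percentage
    FROM tvshow_hits other_users
    INNER JOIN tvshow_hits other_shows
        ON other_shows.user_id = other_users.user_id
       AND other_shows.show_id <> :sid
    WHERE other_users.show_id = :sid
    GROUP BY other_shows.show_id
    ORDER BY other_shows.show_id
    """,
    {"sid": 1},
).fetchall()

# Each of the three other shows was watched by every watcher of show 1.
total = sum(percentage for _, percentage in rows)
print(rows, total)
```

Each row is measured against the same denominator independently, so the column total here is 3.0 rather than 1.0.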
Andomar
Hm, after reviewing your query I'm not sure we are on the same page. The percentage should be: number of users who watch that show (among users who watch @showid) / total number of "shows/views" made by users who watch @showid.
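
One possible reading of this definition, sketched with Python's built-in sqlite3 module and made-up data: count the co-watchers per show, then divide each count by the sum of those counts. Two assumptions here: views of the selected show itself are left out of the total, and tvshow_hits holds at most one row per user and show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tvshow_hits (user_id INTEGER, show_id INTEGER);
    -- Made-up sample data: users 1 to 5 watched show 1.
    INSERT INTO tvshow_hits VALUES
        (1, 1), (2, 1), (3, 1), (4, 1), (5, 1),
        (1, 2), (2, 2), (6, 2), (3, 3);
""")

# Per other show: how many watchers of the selected show also watched it.
counts = conn.execute(
    """
    SELECT o.show_id, COUNT(*) AS no_users
    FROM tvshow_hits w
    INNER JOIN tvshow_hits o
        ON o.user_id = w.user_id AND o.show_id <> :sid
    WHERE w.show_id = :sid
    GROUP BY o.show_id
    ORDER BY no_users DESC
    """,
    {"sid": 1},
).fetchall()

# Total views of other shows made by watchers of the selected show.
total_views = sum(n for _, n in counts)
shares = [(show_id, n / total_views) for show_id, n in counts]
print(shares)
```

With this denominator the shares add up to 1 across all listed shows, unless a LIMIT cuts some rows off.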
steven