views:

647

answers:

11

I'm trying to get the percentage of each video I have in my database based on its view count against all other videos.

I'm then trying to display all the videos from highest view count to lowest, displaying its percentage on its side inside a nice HTML page.

Obviously the percentage would range from 0 - 100% (and not over) and the most popular video would probably have 100% I assume..

I have about 3,400 videos in the database. My attempts are laughable and have been scratching my head for about days now..

My table looks something similar to this.

video_public

*id, video_title, video_views*

Attempt:

SELECT id, video_views * 100 / (SELECT COUNT(*) FROM video_public) FROM video_public stat

To be honest I don't even know if this SQL query is right.

I haven't even taken into consideration the videos views against all video views and total videos..

Really stuck..

+2  A: 

I think it's better to calculate your percentage in your script, and just retreive the view-counts from the database.

Ben Fransen
I'd go this route if I really wasn't sure about my SQL skills, but I feel they are good to cultivate. Usually if you're doing stuff with databases though, having the database do the math for you and limiting large volume data transfer can pay off in the future.
sheepsimulator
I'd rather calculate it through SQL, I can then later manage everything alot better.
BeaversAreDamned
Why? You can (probably) get this information in the same SQL command you use to retrieve the counts with the addition of perhaps 60 characters of text. If encapsulated as a view, that information would then be available to any database reader. Why write extra routines and confine the results to a single application?
Larry Lustig
+3  A: 

Well, I'd start by thinking about what your'e looking for:

percentage of each video I have in my database based on its view count against all other videos.

Basically, you want to find it's view rank first. Why not sort the records based on video views:

SELECT id, video_title, video_views
FROM video_public order by video_views DESCENDING

Now, and I think this is what you want to do, is to only show a portion of these, say, the top 10%? You want to then assign each of your records a percentile. This means that for the ordering you've assigned to the videos, you want to make the "top row" (first one returned) be given 100% and the last row returned 0%. It gives a number between 0 and 100 to each item in your resultset.

Your'e percentile is computed:

SELECT id, 
       video_title, 
       video_views, 
       ((video_views * 100) / (select max(video_views) from video_public)) video_percentile
FROM video_public order by video_views DESCENDING

If you only want to show then the top 10%, try the following:

SELECT id, 
       video_title, 
       video_views, 
       ((video_views * 100) / (select max(video_views) from video_public)) video_percentile
FROM video_public
WHERE ((video_views * 100) / (select max(video_views) from video_public)) > 90
ORDER BY video_views DESCENDING

It isn't totally clear what you're looking for, but I think this could be helpful.


EDIT: After looking over the comments, specifically Riven's and Larry Lustig's, and re-reading the question, I'd have to say that the sum() of the video_views is incorrect, so I went back and changed the sum()s to max()s. This will give you a percentage based upon the video viewed the most.

sheepsimulator
But that won't give you any percentage. Just the order of most viewed, or less, depending on your sortingdirection.
Ben Fransen
True, I'm getting there.
sheepsimulator
Done............
sheepsimulator
Ok, i rest my case ;)
Ben Fransen
I've accepted this because its closest to resolving my question. The only -tiny- problem I have is the top viewed videos are a little over 100%. But I think you've just about done it sheepsimulator. Thank you.
BeaversAreDamned
Beavers, that shouldn't be the case with this solution. Every video should come in under 100%, *unless* there's just one video in the table.
peacedog
According to the clarification made by the OP above, this doesn't answer the question he or she meant to ask.
Larry Lustig
@ Larry - After reviewing other answers, I have changed mine to reflect your (and Riven's) percentage calculation, which is what the OP was looking for. I upvoted your answer.
sheepsimulator
+1  A: 

Run two selects:

 select max(video_views) FROM video_public

to get the maximum number of views and then run your other select and calculate the percentage in your script. Otherwise, the subselect might run for each result row ... which is not what you want, performance wise.

Also note that you must use max, not sum or count(*) since you want to know "how often has this video been watched compared to the one which I watched most". Imaging you watched every video once and one twice. What's the percentage going to be? 100%? Or 0.0000001%?

Aaron Digulla
For the other answers this statement doesn't make sense "the most popular video would probably have 100% I assume.." but for this answer it does.
Tim
+1  A: 
select id, ((video_views * 100) / (select sum(views) from videos)) view_percent   from video_public

this will give you what percentage of the total views has each video.

Drevak
Don't forget TSQL will round integer / integer to integer.
Broam
This question is MySQL.
Larry Lustig
+4  A: 

To modify what sheepsimulator suggests, you might try:

SELECT 
    id, 
    video_title, 
    video_views, 
    (select sum(video_views) from video_public)) as TotalViews,
    ((100 * video_views)/(select sum(video_views) from video_public)) as PercentOfViews
FROM 
    video_public 
order by 
    video_views.

Change ordering to suit your tastes, of course.

peacedog
Sheepsimulator had not added the calculated field on the end yet when I posted this, but he's got it right.
peacedog
I would suggest to do the percentage computation in the middleware (php?); reason would be to use the RDBMS only to get the data and the middleware to work on the visualization.but I guess it's also a matter of personal preference.
call me Steve
I wouldn't argue with that, Steve. It would also depend on what I was doing. If this was a walk-up report request (or I was in a serious hurry to get this ready for someone to look at), I'd probably do itin the database. If I was adding this to one of our published reports, we'd have the data generated nightly and stored neatly in a table storing the data about vidoes we wanted to report on (we don't have a datawarehouse per se, yet). If I was serving this up in an application, I would probably have the business layer do it.
peacedog
A: 

Try this it works in MySQL now.

select id, video, views, ((views / (select sum(views) from video)) * 100) as Percentagess
from video

This is the table:

CREATE TABLE IF NOT EXISTS `video` (
  `ID` int(11) NOT NULL,
  `Video` varchar(50) NOT NULL,
  `Views` int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

--
-- Dumping data for table `video`
--

INSERT INTO `video` (`ID`, `Video`, `Views`) VALUES
(1, 'Hulk', 20),
(2, 'Jack', 30),
(3, 'The King', 24);

The Code below works in SQL Server:

It is a cursor with a temporary table i just wrote.

declare @total int set @total = (select sum(Views) from video)

declare @videoid int
declare @video varchar(50)
declare @views int

declare @percentage decimal(18, 2)


IF  EXISTS (SELECT * FROM tempdb.sys.tables WHERE NAME LIKE '%tmp%')
DROP TABLE #tmp


--create temporary table
CREATE TABLE #tmp (VideoID int, VideoTitle varchar(50), Views int, Percentage decimal(18, 2))

DECLARE @videoPercent CURSOR
SET @videoPercent = CURSOR FOR 
    select id, video, views
    from video

OPEN @videoPercent

FETCH NEXT FROM @videoPercent INTO @videoid, @video, @views
WHILE @@FETCH_STATUS = 0
begin

    set @percentage = (convert(decimal(18,2), @views) / convert(decimal(18,2), @total)) * 100; 

    insert into #tmp(VideoID, VideoTitle, Views, Percentage)
    values(@videoid, @video, @views, @percentage);

    FETCH NEXT FROM @videoPercent INTO @videoid, @video, @views
end

select * from #tmp

CLOSE @videoPercent
DEALLOCATE @videoPercent

This is the table:

CREATE TABLE [dbo].[Video](
    [ID] [int] NOT NULL,
    [Video] [varchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
    [Views] [int] NOT NULL
)

Fill it in with data and u are ready to go. Enjoy yourself.

Colour Blend
Is that going to work with MySQL?
sheepsimulator
Im sorry about that. It may be of use to someone else hopefully.
Colour Blend
A: 

Something to bear in mind... so far all answers include subqueries, which I don't think are necessary. These will be evaluated for EACH row in your table!

Rather do this externally, e.g.

Instead of this:

select id, ((video_views * 100) / (select sum(views) from videos)) view_percent
from video_public

Do this:

declare @totalviews int
select @totalviews = sum(views) from videos

select id, (video_views * 100 / @totalviews) view_percent
from video_public

That should run faster, and have less of an impact on your database.

Stuart
Is this MySQL compatible? It looks an awful lot like TSQL.
sheepsimulator
Sorry... this is TSQL. I have a background in TSQL, but thought the problem is generic enough that it can be easily translated.Actually, I only saw the MySQL tag after commenting. Will look at the tags properly before answering again. Sorry!
Stuart
Thanks for giving it an attempt anyway. I appreciate your help.
BeaversAreDamned
Why do you think that sub-query will be executed once per row? It's not correlated to the surrounding query in any way, so any reasonably good optimizer should recognize that it only needs to be evaluated once. Are you saying that MSSQL won't make that optimization (honest question, I don't use MSSQL much).
Larry Lustig
Good point Larry. I was working on correlated subqueries when I wrote this, so probably just got my brain wires crossed. But I decided to do very basic analysis of this, and it seems that the subquery version is marginally less optimised... by about 5 - 6% in the query plans I tried. That said... I haven't really been digging deep, but I am now intrigued as your comments are perfectly valid!
Stuart
A: 

You will not end up with your most viewed video at 100% unless only one of your videos gets viewed, but it will give you the view count, ratio of views of this video over all viewings, and a percent of times this has been viewed compared to all videos.


SELECT 
    Videos.id                     AS VideoID,
    video_views                   AS ViewCount,
    video_views / Total.ViewCount AS ViewRatio,
    Convert(varchar(6),Round(100.00 * video_views / Total.ViewCount, 2)) + '%' AS ViewPercentage
FROM 
    video_public AS Videos
CROSS JOIN 
    (SELECT Convert(float,Sum(video_views)) AS ViewCount FROM video_public) AS Total
ORDER BY
    video_views DESC
Dennis Allen
A: 

The easiest way to accomplish this is actually with two separate queries. I agree with Stuart that subqueries will have a negative impact on database performance because the subquery will be run for each record returned.

What I would do is run a query to get the total video views and store the result in a local variable.

SELECT SUM(video_views) FROM VIDEO_PUBLIC
if you want the percentage viewed of all total views

or

SELECT MAX(video_view) FROM VIDEO_PUBLIC
if you want the percentage viewed relative to the most popular video.

Next, run a query to get the videos, order by most views:

SELECT id, video_title, video_views FROM VIDEO_PUBLIC ORDER BY video_views DESC

Then in code you can calculate the percentage viewed of each video by using the result from the first query while looping through the results of the second query.

I'm not sure what language your using for your display, but in C# it would look like this:

int totalViews = FirstQuery();
IEnumerable<Videos> videos = SecondQuery();
foreach (Video video in videos) {
    double totalPercentage = ((double)video.Video_Views / (double)totalViews) * 100)
}
Jason
+1  A: 

If you'd want calculate the percentage in 1 query, use:

SELECT `vp1`.`id`, `vp1`.`video_views` * 100 / (SELECT MAX(`vp2`.`video_views`) FROM video_public AS `vp2`) FROM video_public AS `vp1`

Ofcourse it would be much more efficient to store the intermediate result in PHP (or a SQL variable) and pass it to the next query

$phpmax <= SELECT MAX(`vp2`.`video_views`) FROM video_public AS `vp2`
SELECT `vp1`.`id`, `vp1`.`video_views` * 100 / {$phpmax} ) FROM video_public AS `vp1`

==> Everybody using SUM(views) in the query has the wrong results !! The highest ranking video should result in 100%, not a percentage of the view count of all videos combined, therefore you must use MAX(views)

Riven
+1  A: 

Okay, based on the clarification of your question:

You want to calculated (video_views * 100) / (largest_views_for_any_single_video) for each video in the database.

The numerator is easy, it's just the video_views column. The denominator is

SELECT MAX(video_views) FROM video_public

So, put it together and you get:

SELECT video_title, ((video_views * 100) / (SELECT MAX(video_views) 
FROM video_public)) FROM video_public

That should produce 100 for the most-viewed video(s), and lower percentages for other videos, down to 0 for anything never viewed.

Larry Lustig
This is what Im looking for. I didn't even think of the MAX syntax! However I would give some credit to sheepsimulator, since he seemed to crack most part of it. But in terms of what I needed. This definitely hit it home. Kudos.
BeaversAreDamned