My head is hurting trying to figure out a SQL query.

I have a table of vote data with team_id, ip_address and date_voted fields, and I need to return a count of votes for each team_id, but only count the first 10 rows per IP address in any 24-hour period.

Any help will be greatly appreciated and may stop my head from pounding!

A: 

Hmmm... it would be helpful to know what "first" means in this case. Earliest in the day? Random? If there are 12 votes for a given day how do you pick 2 to not count?

Hogan
this is a comment not an answer
KM
hi hogan, it really doesn't matter. the earliest, random, all are correct.
David
@KM - You're right. I was working on the sql when I put this question up -- then got called to a meeting before I finished the sql, looks like others have put up good answers.
Hogan
A: 

Haven't had time to check, but the following should do the trick.

SELECT Yr, DoY, team_id, SUM(IF(NbVotes < 10, NbVotes, 10)) AS FilteredVoteCount
FROM (
  SELECT YEAR(date_voted) AS Yr, DAYOFYEAR(date_voted) AS DoY, 
    team_id, 
    ip_address,
    COUNT(*) AS NbVotes
  FROM myTable
  -- WHERE here for some possible extra condition.
  GROUP BY YEAR(date_voted), DAYOFYEAR(date_voted), team_id, ip_address
) AS PerIpCounts   -- MySQL requires an alias on a derived table
GROUP BY Yr, DoY, team_id
ORDER BY Yr, DoY, team_id   -- or some other order may be desired.
mjv
Putting a `DISTINCT` statement and `ORDER BY` clause in the sub-select should make this solution work.
Sonny
sorry mjv, i should of course have been more specific. i need the total count for every 24 hour period that exists in the table. so, for example, i need a total count of the rows, up to a maximum of 10 rows per day per ip address, for monday, tuesday, wednesday, etc. does that make sense at all or am i waffling on only making sense to myself? :)
David
@David, see my edits (BTW, aside from adding support to compute totals for each day, I added the GROUP BYs, which I had forgotten initially...). If you are interested in the day of the week rather than in single days, just replace `YEAR(date_voted), DAYOFYEAR(date_voted)` with `DAYOFWEEK(date_voted)`.
mjv
ah mjv, you superstar! that put me on the right track, although there were a couple of small errors: the IF statement needed a set of brackets and the internal select needed an alias. otherwise that has saved me hours of head scratching! can't thank you enough!
David
A: 

Assumption: Only the first ten votes for a team (each row in the votes table is a vote for team_id) from a given IP address should count for a given date.

So here are the raw votes per team, per day, per IP address.

select team_id, vote_date, ip_address, count(*) as raw_vote_count
  from votes
 group by team_id, vote_date, ip_address

Now, using that, correct the number of votes down to no more than ten with a CASE expression:

select team_id, vote_date, ip_address,
       case when raw_vote_count > 10 
            then 10 
            else raw_vote_count 
        end as adjusted_vote_count
  from (select team_id, vote_date, ip_address, count(*) as raw_vote_count
          from votes
         group by team_id, vote_date, ip_address
       ) sub1

If you then want the total votes per team for a given day, it's:

select team_id, sum(adjusted_vote_count)
  from (
       select team_id, vote_date, ip_address,
              case when raw_vote_count > 10 
                   then 10 
                   else raw_vote_count 
               end as adjusted_vote_count
         from (select team_id, vote_date, ip_address, count(*) as raw_vote_count
                 from votes
                group by team_id, vote_date, ip_address
              ) sub1
       ) sub2
 where vote_date = :mydate
 group by team_id
 order by team_id
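
To sanity-check the capping logic, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The sample rows, IP addresses and dates are made up for illustration, and it assumes vote_date holds a plain date with no time part. It folds the CASE into the grouped subquery and keeps vote_date in the output, so every day in the table gets its own row instead of filtering on one date:

```python
import sqlite3

# Hypothetical in-memory table and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table votes (team_id integer, ip_address text, vote_date text)"
)

rows = (
    [(1, "10.0.0.1", "2010-01-04")] * 12    # 12 votes from one IP: capped at 10
    + [(1, "10.0.0.2", "2010-01-04")] * 3   # 3 votes from another IP: all 3 count
    + [(2, "10.0.0.1", "2010-01-04")] * 5   # 5 votes for team 2: all 5 count
    + [(1, "10.0.0.1", "2010-01-05")] * 11  # next day: the cap starts over
)
conn.executemany("insert into votes values (?, ?, ?)", rows)

# Count votes per team, day and IP, cap each count at 10,
# then sum the capped counts per team and day.
capped_totals_sql = """
select team_id, vote_date, sum(adjusted_vote_count) as total_votes
  from (select team_id, vote_date, ip_address,
               case when count(*) > 10
                    then 10
                    else count(*)
                end as adjusted_vote_count
          from votes
         group by team_id, vote_date, ip_address
       ) sub1
 group by team_id, vote_date
 order by team_id, vote_date
"""

totals = conn.execute(capped_totals_sql).fetchall()
for team_id, vote_date, total_votes in totals:
    print(team_id, vote_date, total_votes)
```

With these rows, team 1 ends up with 13 votes on the first day (10 + 3) and 10 on the second, and team 2 with 5. Note that this treats a "24 hour period" as a calendar day, same as the queries above.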
Adam Musch