views:

43

answers:

2

Hello all and thanks in advance I have the tables accounts, votes and contests
A vote consists of an author ID, a winner ID, and a contest ID, so as to stop people voting twice
Id like to show for any given account, how many times theyve won a contest, how many times theyve come second and how many times theyve come third
Whats the fastest (execution time) way to do this? (Im using MySQL)

A: 

If I'm interpreting things correctly, to stop people voting twice I think you only need a unique index on the votes table by author (account?) ID and contestID. It won't prevent people from having multiple accounts and voting twice but it will prevent anyone from casting a vote in a contest twice from the same account. To prevent fraud (sock puppet accounts) you'd need to examine voting patterns and detect when an account votes for another account more often then statistically likely. Unless you have a lot of contests that might actually be hard.

tvanfosson
yeah i already have them indexed by author and contest thanks :) huge gratitude to bill, ill try that out in a sec
Pez
+1  A: 

After using MySQL for a long time I'm coming to the conclusion that virtually any use of GROUP BY is really bad for performance, so here's a solution with a couple of temporary tables.

CREATE TEMPORARY TABLE VoteCounts (
  accountid INT,
  contestid INT,
  votecount INT DEFAULT 0
);

INSERT INTO VoteCounts (accountid, contestid)
  SELECT DISTINCT v2.accountid, v2.contestid
  FROM votes v1 JOIN votes v2 USING (contestid)
  WHERE v1.accountid = ?; -- the given account

Make sure you have an index on votes(accountid, contestid).

Now you have a table of every contest that your given user was in, with all the other accounts who were in the same contests.

UPDATE Votes AS v JOIN VoteCounts AS vc USING (accountid, contestid)
SET vc.votecount = vc.votecount+1;

Now you have the count of votes for each account in each contest.

CREATE TEMPORARY TABLE Placings (
  accountid INT,
  contestid INT,
  placing INT
);

SET @prevcontest := 0;
SET @placing := 0;
INSERT INTO Placings (accountid, placing, contestid)
  SELECT accountid, 
    IF(contestid=@prevcontest, @placing:=@placing+1, @placing:=1) AS placing,
    @prevcontest:=contestid AS contestid
  FROM VoteCounts
  ORDER BY contestid, votecount DESC;

Now you have a table with each account paired with their respective placing in each contest. It's easy to get the count for a given placing:

SELECT accountid, COUNT(*) AS count_first_place
FROM Placings
WHERE accountid = ? AND placing = 1;

And you can use a MySQL trick to do all three in one query. A boolean expression always returns an integer value 0 or 1 in MySQL, so you can use SUM() to count up the 1's.

SELECT accountid, 
  SUM(placing=1) AS count_first_place,
  SUM(placing=2) AS count_second_place,
  SUM(placing=3) AS count_third_place
FROM Placings
WHERE accountid = ?; -- the given account

Re your comment:

Yes, it's a complex task no matter what to go from the normalized data you have to the results you want. You want it aggregated (summed), ranked, and aggregated (counted) again. That's a heap of work! :-)

Also, a single query is not always the fastest way to do a given task. It's a common misconception among programmers that shorter code is implicitly faster code.

Note I have not tested this so your mileage may vary.


Re your question about the UPDATE:

It's a tricky way of getting the COUNT() of votes per account without using GROUP BY. I've added table aliases v and vc so it may be more clear now. In the votes table, there are N rows for a given account/contest. In the votescount table, there's one row per account/contest. When I join, the UPDATE is evaluated against the N rows, so if I add 1 for each of those N rows, I get the count of N stored in votescount in the row corresponding to each respective account/contest.

Bill Karwin
so am i right to assume that this is going to be quite intensive now matter how i do it and theres no way to do it in a single query? or is it your opinion that this is the *fastest* way to do it? thanks again btw
Pez
cheers man it works perfectly, although i dont understand the syntax and exactly what this does:`UPDATE Votes JOIN VoteCounts USING (accountid, contestid)SET votecount = votecount+1;`does this set the vote count on table `votes` to the vote count fromo table `VoteCounts`? I noticed that `votes` isnt referenced again so i took those two lines out and it seems to work just fine without it
Pez