I have two tables of concern here: users and race_weeks. A user has many race_weeks, and each race_week belongs to a user, so user_id is a foreign key in the race_weeks table.

I need to perform some challenging math on fields in the race_weeks table in order to return users with the most all-time points.

Here are the fields that we need to manipulate in the race_weeks table.

races_won (int)
races_lost (int)
races_tied (int)
points_won (int, pos or neg)
recordable_type (varchar; robots can race too, but we're only concerned with type 'User')

Just so that you fully understand the business logic at work here, over the course of a week a user can participate in many races. The race_week record represents the summary results of the user's races for that week. A user is considered active for the week if races_won, races_lost, or races_tied is greater than 0. Otherwise the user is inactive.

So here's what we need to do in our query in order to return users with the most points won (actually net_points_won):

  1. Calculate each user's net_points_won (not a field in the DB).

  2. To calculate net_points_won, take (1000 * count_of_active_weeks) - sum(points_won). (Why 1000? Imagine that every week the user is spotted 1000 points to compete and enter races. We want to factor out what we spot the user, because a user could enter only one race for the week for 100 points and be sitting on 900, which would skew who actually EARNED the most points.)
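To make the arithmetic concrete, here is a minimal Python sketch of the formula exactly as stated above. The sample rows, the `STAKE` constant, and the `net_points_won` function are made up for illustration, and it assumes that points_won is summed over active weeks only.

```python
# Hypothetical weekly summaries for one user:
# (races_won, races_lost, races_tied, points_won)
weeks = [
    (2, 1, 0, 1300),  # active week
    (0, 0, 0, 1000),  # inactive week: no races, so it is ignored
    (0, 3, 0, 900),   # active week
]

STAKE = 1000  # points spotted to the user each week


def net_points_won(rows):
    """Formula as stated: (1000 * count_of_active_weeks) - sum(points_won)."""
    # A week is active if any of the three race counters is above zero.
    active = [r for r in rows if r[0] > 0 or r[1] > 0 or r[2] > 0]
    return STAKE * len(active) - sum(r[3] for r in active)


result = net_points_won(weeks)
print(result)  # 2 active weeks: 1000 * 2 - (1300 + 900) = -200
```

With the subtraction in this order, a user whose points_won exceeds the stake ends up with a negative value.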

This one is a little convoluted, so let me know if I can clarify further.

A: 
SELECT  user_id, 1000 * COUNT(*) - SUM(points_won) AS net_points
FROM    race_weeks
WHERE   races_won + races_lost + races_tied
        AND recordable_type = 'User'
GROUP BY
        user_id
ORDER BY
        net_points DESC
Quassnoi
Since there are multiple types of competitors (called recordable_type) that can race, namely User and Robot, we want to narrow the result to just User. Both Robot and User have a user_id, so the way we tell them apart is recordable_type.
keruilin
If a user didn't participate in a race for a given week, a record is still created in the race_weeks table, but with races_won = 0, races_lost = 0, and races_tied = 0. That's how we can tell whether the user was active or not. So we can't just count all race_week records; we can only count the active ones.
keruilin
@keruilin: I added the bot test into the `WHERE` condition. As for the active users, `races_won + races_lost + races_tied` is a shortcut for `races_won + races_lost + races_tied <> 0` in `MySQL`, so only the active users will be selected.
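A rough way to check the shortcut is the sketch below. It uses Python's built-in `sqlite3` as a stand-in for MySQL (an assumption: both engines treat a nonzero numeric value in `WHERE` as true), and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE race_weeks ("
    "user_id INTEGER, races_won INTEGER, races_lost INTEGER, races_tied INTEGER)"
)
# Hypothetical rows: users 1 and 2 have an active week, user 3 does not.
conn.executemany(
    "INSERT INTO race_weeks VALUES (?, ?, ?, ?)",
    [(1, 2, 1, 0), (1, 0, 0, 0), (2, 0, 0, 1), (3, 0, 0, 0)],
)

# Implicit form: the sum itself is used as the condition.
implicit = conn.execute(
    "SELECT user_id FROM race_weeks "
    "WHERE races_won + races_lost + races_tied ORDER BY rowid"
).fetchall()

# Explicit form: the sum is compared against zero.
explicit = conn.execute(
    "SELECT user_id FROM race_weeks "
    "WHERE races_won + races_lost + races_tied <> 0 ORDER BY rowid"
).fetchall()

print(implicit, explicit)
```

Both forms keep only the rows where at least one counter is nonzero. If any of the three columns were NULL, the sum would be NULL and the row would be dropped by either form.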
Quassnoi
dang, you smart!
keruilin
A: 

I believe that your business logic is incorrect: net_points should be the sum of points won for that user minus the number of points the user was spotted.

In addition, the check for active weeks should test races_won, races_lost, and races_tied against zero explicitly to give the system the opportunity to use indexes on those columns when the table becomes large.

SELECT user_id
     , SUM(points_won) - 1000 * COUNT(*) AS net_points
  FROM race_weeks
 WHERE recordable_type = 'User'
   AND (races_won > 0 OR races_lost > 0 OR races_tied > 0)
 GROUP BY user_id
 ORDER BY net_points DESC
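
As a quick sanity check, the query above can be run against a handful of made-up rows. This is only a sketch: it uses Python's built-in `sqlite3` instead of MySQL, and the table definition and sample data are assumptions based on the columns listed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE race_weeks (
           user_id INTEGER, recordable_type TEXT,
           races_won INTEGER, races_lost INTEGER, races_tied INTEGER,
           points_won INTEGER)"""
)
# Hypothetical sample data.
rows = [
    (1, "User", 2, 1, 0, 1300),
    (1, "User", 0, 0, 0, 1000),   # inactive week, filtered out
    (1, "User", 0, 3, 0, 900),
    (2, "User", 1, 0, 0, 1500),
    (2, "Robot", 5, 0, 0, 9000),  # robot row, filtered out
    (3, "User", 0, 0, 0, 1000),   # never active, absent from the result
]
conn.executemany("INSERT INTO race_weeks VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT user_id
     , SUM(points_won) - 1000 * COUNT(*) AS net_points
  FROM race_weeks
 WHERE recordable_type = 'User'
   AND (races_won > 0 OR races_lost > 0 OR races_tied > 0)
 GROUP BY user_id
 ORDER BY net_points DESC
"""
ranking = conn.execute(query).fetchall()
print(ranking)  # [(2, 500), (1, 200)]
```

User 2 has one active week (1500 - 1000 = 500) and user 1 has two (1300 + 900 - 2000 = 200), so user 2 ranks first.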
Vadim K.
Yes, you're absolutely right about the business logic.
keruilin