I have three tables in MySQL that are related, yet not technically linked to each other by a foreign key. They are: users, levels, and classes.

The users table has a karma column, a numeric type. Based on this karma number, I want to know the user's level, which I retrieve from the levels table. This is not a hard relationship, since a level is associated with a range of karma values, like so:

  • level 1 (minimum karma: 0)
  • level 2 (minimum karma: 100)
  • level 3 (minimum karma: 500)
  • ...

So a user with a karma of 400 should get level 2. To make things slightly more complex, the level number determines the user's class, whose definitions are stored in the classes table. Once again this is a range relationship:

  • class ant (minimum level 1)
  • class owl (minimum level 5)
  • class lion (minimum level 100)
  • ...

In summary, we're talking about three tables that have an implicit relationship with each other, based on range values. My question is how to query these tables effectively. A very common need for me is to get information for one or more users based on a condition; the result set should contain the user details as well as the user's level and class.
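
To make the intended mapping concrete, here is a minimal sketch in Python of the two range lookups described above. It only uses the three levels and three classes listed as examples (the real tables have more rows), and the helper names `lookup` and `level_and_class` are made up for illustration.

```python
from bisect import bisect_right

# Thresholds copied from the examples above; the real tables hold more rows.
LEVELS = [(0, 1), (100, 2), (500, 3)]               # (minimum karma, level number)
CLASSES = [(1, "ant"), (5, "owl"), (100, "lion")]   # (minimum level, class title)


def lookup(thresholds, value):
    """Return the payload of the last threshold whose minimum is <= value."""
    minimums = [minimum for minimum, _ in thresholds]
    index = bisect_right(minimums, value) - 1
    if index < 0:
        raise ValueError(f"{value} is below the lowest threshold")
    return thresholds[index][1]


def level_and_class(karma):
    """Resolve karma to a level, then the level to a class."""
    level = lookup(LEVELS, karma)
    return level, lookup(CLASSES, level)


print(level_and_class(400))  # (2, 'ant')
```

Whatever query is used should return exactly this pair for each user, one row per user.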

For a single user, I managed to write this query which works fine:

SELECT u.*, lv.num as level, lvc.title as class, lvc.id as classid
FROM user as u, level as lv, levelclass as lvc
WHERE u.id = ? AND lv.min_karma <= u.karma AND lv.num <= lvc.minlevel_num
ORDER BY lv.num DESC LIMIT 1;

However, if I broaden the result set by dropping u.id = ? from the WHERE clause and removing LIMIT 1, thus querying for a list of users, I get the combination of all three tables. Typically you would bring the row count down with an inner join on keys, but since this is a range check, that does not work. I tried putting the range check in an inner join condition, but that gives the same result. Even with grouping I cannot get the result I want.

In a desperate attempt, I came up with this query, which works:

SELECT usr.*, 
(SELECT lv.num as level
FROM user as u, level as lv, levelclass as lvc
WHERE u.id = usr.id AND lv.min_karma <= u.karma AND lv.num <= lvc.minlevel_num
ORDER BY lv.num DESC LIMIT 1) as level,
(SELECT lvc.title as class
FROM user as u, level as lv, levelclass as lvc
WHERE u.id = usr.id AND lv.min_karma <= u.karma AND lv.num <= lvc.minlevel_num
ORDER BY lv.num DESC LIMIT 1) as class,
(SELECT lvc.image as class_image
FROM user as u, level as lv, levelclass as lvc
WHERE u.id = usr.id AND lv.min_karma <= u.karma AND lv.num <= lvc.minlevel_num
ORDER BY lv.num DESC LIMIT 1) as class_image,
(SELECT lvc.id as classid
FROM user as u, level as lv, levelclass as lvc
WHERE u.id = usr.id AND lv.min_karma <= u.karma AND lv.num <= lvc.minlevel_num
ORDER BY lv.num DESC LIMIT 1) as classid
FROM user as usr
ORDER BY usr.$sortby $direction LIMIT ?,?

However, it seems highly inefficient to me. Basically, I am writing a subquery for each column(!) that I need from the level and class tables. If I query for multiple columns of the level or class tables in one subquery, it once again returns the combinations of all values. I feel there is a gap in my DB skills, something obvious I am missing, a function I do not know about...

Can you help me? How do I write an efficient query for a set of rows that combines columns from three tables that are linked not by keys but by ranges?

PS: I know that I could greatly simplify the scenario by denormalizing this schema and combining levels and classes into the users table, but there is a specific reason why I need this, trust me.

+5  A: 

If you can modify the Level and Class tables to have the actual range in the row (i.e. have minkarma and maxkarma in the Level table, minlevel and maxlevel in the Class table), you should be able to just use that in your join condition and it will only return a single row for each user.

Example:

SELECT * FROM user as u
INNER JOIN level as lv on
    u.karma >= lv.min_karma AND u.karma <= lv.max_karma
INNER JOIN levelclass as lvc on
    lv.num >= lvc.min_level AND lv.num <= lvc.max_level

The reason you are getting multiple rows is that you are matching multiple levels and classes by only filtering on the min value.
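
The difference can be reproduced with a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL; the sample users, the max values, and the sentinel 2147483647 for the open-ended top range are assumptions made for illustration, not data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, karma INTEGER);
    CREATE TABLE level (num INTEGER PRIMARY KEY, min_karma INTEGER, max_karma INTEGER);
    CREATE TABLE levelclass (id INTEGER PRIMARY KEY, title TEXT,
                             min_level INTEGER, max_level INTEGER);
    INSERT INTO user VALUES (1, 'alice', 50), (2, 'bob', 400), (3, 'carol', 900);
    INSERT INTO level VALUES (1, 0, 99), (2, 100, 499), (3, 500, 2147483647);
    INSERT INTO levelclass VALUES (1, 'ant', 1, 4), (2, 'owl', 5, 99),
                                  (3, 'lion', 100, 2147483647);
""")

# Filtering on the minimum only: every level at or below the user's karma matches.
min_only = conn.execute("""
    SELECT u.name, lv.num
      FROM user AS u
     INNER JOIN level AS lv ON lv.min_karma <= u.karma
     ORDER BY u.id, lv.num
""").fetchall()

# Filtering on both ends of the range: exactly one level and one class per user.
ranged = conn.execute("""
    SELECT u.name, lv.num, lvc.title
      FROM user AS u
     INNER JOIN level AS lv
        ON u.karma >= lv.min_karma AND u.karma <= lv.max_karma
     INNER JOIN levelclass AS lvc
        ON lv.num >= lvc.min_level AND lv.num <= lvc.max_level
     ORDER BY u.id
""").fetchall()

print(len(min_only))  # 6 rows for 3 users
print(ranged)         # [('alice', 1, 'ant'), ('bob', 2, 'ant'), ('carol', 3, 'ant')]
```

The ranges must not overlap and must not leave gaps; otherwise a user can match two rows or none.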

David Archer
+1, excellent speedup and the denormalization is really quite minor.
Alex Martelli
Presumably based on the end of his question he doesn't really have modification power over the tables.
Jherico
+1 if he has min and max values then absolutely the way to go
Robin Day
So simple, yet perfect. I did not have max columns for levels and classes, but I added them and tested your query. Works great!
Ferdy
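
If adding max columns is not an option, the upper bound of each range can probably be derived from the next row's minimum at query time. The sketch below shows the idea for the level table only, again using Python's sqlite3 module as a stand-in for MySQL; the alias `nxt` and the decision to leave the top level's upper bound as NULL are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE level (num INTEGER PRIMARY KEY, min_karma INTEGER);
    INSERT INTO level VALUES (1, 0), (2, 100), (3, 500);
""")

# For each level, the upper bound is one less than the next higher minimum.
# The highest level has no successor, so its max_karma comes back as NULL.
DERIVED_RANGES = """
    SELECT lv.num,
           lv.min_karma,
           (SELECT MIN(nxt.min_karma) - 1
              FROM level AS nxt
             WHERE nxt.min_karma > lv.min_karma) AS max_karma
      FROM level AS lv
     ORDER BY lv.num
"""

ranges = conn.execute(DERIVED_RANGES).fetchall()
print(ranges)  # [(1, 0, 99), (2, 100, 499), (3, 500, None)]
```

A join against such a derived range has to treat the NULL upper bound as open-ended, for example with an extra `OR max_karma IS NULL` condition. Storing the max columns, as in the answer above, avoids running this subquery for every row.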
A: 

I'd need to see your exact data structure to be sure, but how about something like this?

SELECT
 *
FROM
 user u
LEFT OUTER JOIN
 level lv
ON
 lv.num =
(
 SELECT
  MAX(lv2.num)
 FROM
  level lv2
 WHERE
  lv2.min_karma <= u.karma
)
LEFT OUTER JOIN
 levelclass lvc
ON
 lvc.id =
(
 SELECT
  MAX(lvc2.id)
 FROM
  levelclass lvc2
 WHERE
  lvc2.minlevel_num <= lv.num
)
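
A runnable sketch of this approach, using Python's sqlite3 module as a stand-in for MySQL. Two things here are assumptions rather than part of the question: levels 4 and 5 and the sample users are invented so that more than one class shows up, and the class join keys on MAX(minlevel_num) instead of MAX(id), because MAX(id) is only correct if class ids increase together with the minimum level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, karma INTEGER);
    CREATE TABLE level (num INTEGER PRIMARY KEY, min_karma INTEGER);
    CREATE TABLE levelclass (id INTEGER PRIMARY KEY, title TEXT, minlevel_num INTEGER);
    -- eve has negative karma, so no level matches her
    INSERT INTO user VALUES (1, 'bob', 400), (2, 'dave', 2500), (3, 'eve', -5);
    -- levels 4 and 5 are hypothetical values
    INSERT INTO level VALUES (1, 0), (2, 100), (3, 500), (4, 1000), (5, 2000);
    INSERT INTO levelclass VALUES (1, 'ant', 1), (2, 'owl', 5), (3, 'lion', 100);
""")

QUERY = """
    SELECT u.name, lv.num, lvc.title
      FROM user AS u
      LEFT OUTER JOIN level AS lv
        ON lv.num = (SELECT MAX(lv2.num)
                       FROM level AS lv2
                      WHERE lv2.min_karma <= u.karma)
      LEFT OUTER JOIN levelclass AS lvc
        ON lvc.minlevel_num = (SELECT MAX(lvc2.minlevel_num)
                                 FROM levelclass AS lvc2
                                WHERE lvc2.minlevel_num <= lv.num)
     ORDER BY u.id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [('bob', 2, 'ant'), ('dave', 5, 'owl'), ('eve', None, None)]
```

The outer joins keep users for whom no level matches, which is why eve still appears with NULL level and class.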
Robin Day
A: 

You want to use a join on a CASE statement.

SELECT 
 * 
FROM 
 users u
JOIN 
 levels l
ON 
 l.id = 
CASE 
    WHEN u.karma <= 10
    THEN 1
    WHEN u.karma > 10 AND u.karma <= 100
    THEN 2
    WHEN u.karma > 100 AND u.karma <= 1000
    THEN 3
    WHEN u.karma > 1000
    THEN 4
END
JOIN
 class c
ON
 c.id =
CASE
    WHEN l.id <= 5
    THEN 'owl'
    WHEN l.id > 5
    THEN 'lion'
END

Season to taste.

Jherico
I wish someone would notice this answer. It requires no table mods or subselects.
Jherico
You have the data hard coded into your case statements. If the levels ever change you have to do a coding change instead of updating some records in a file. Bad idea.
Paul Morgan