views:

66

answers:

3

I need a query to prevent a join that produces 1.34218E+35 results!

I have a table item (approx 8k items; e.g. Shield of Foo, Weapon of Bar), and each item is one of 9 different item_type (Armor, Weapon, etc). Each item has multiple entries in item_attribute (e.g. Damage, Defense). Here is a pseudo-code representation:

Table item (
 item_id autoincrement,
 ...
 item_type_id char,    --- e.g. Armor, Weapon, etc
 level int             --- Must be at least this level to wear this item
);

Table item_attribute (
 item_id int references item(item_id),
 ...
 attribute char        --- e.g. Damage, Defense, etc
 amount int            --- e.g. 100
)

Now, a character wears 9 total items at once (one each of Armor, Weapon, Shield, etc) that I call a setup. I want to build a list of setups that maximizes an attribute, but has a minimum of another attribute. In example terms: for a character level 100, present the top 10 setups by damage where sum(defense of all items) >= 100.

The naïve approach is:

select top 10
 q1.item_id,q2.item_id,q3.item_id,..., q1.damage+q2.damage+q3.damage... as damage
from
 (select item_id from item where item_type = 'Armor' 
     and level <= 100) as q1
 inner join (select item_id from item where item_type = 'Shield' 
     and level <= 100) as q2 on 1 = 1
 inner join (select item_id from item where item_type = 'Weapon' 
     and level <= 100) as q3 on 1 = 1
 ...
where
 q1.defense+q2.defense+q3.defense+... >= 100
order by
 q1.damage+q2.damage+q3.damage,... descending

But, because there are approx 8k items in item, that means the magnitude of results for the DBMS to sort through is close to 8000^9 = 1.34218E+35 different setups! Is there a better way?

A: 

Can't you join with only the # most powerful items? Should reduce the collection size drastically. Logically the sum of the highest items should deliver the highest combinations.

Joachim VR
Because the item attributes are not monotonically increasing with level, that proposed solution has the flaw that the "best" setup is below the specified # of items to return...
Derek Frye
+1  A: 

I think your problem can be solved using integer linear programming. I'd suggest pulling your data out of the database and giving it to one of the highly optimized solvers that have been written by people who have spent a long time working on their algorithms, rather than trying to write your own solver in SQL.

Mark Byers
Integer linear programming may provide a faster solution to the above query by translating the problem into a program "optimized" to solve it. But, unless I'm missing something (as I don't understand integer linear programming) it doesn't reduce the complexity or restate the problem in simpler-to-solve terms?
Derek Frye
@Derek Frye: I will be surprised if you or someone else here can write a working solution for this in SQL. If they do they will certainly get my upvote. But it seems to me that SQL is the wrong tool to solve this problem. The time complexity of the problem is not improved by using a better tool - it is NP-Hard either way, but that's not a reason to avoid using the right tool. You can hammer in a nail using a hammer in constant time. You can also hammer in a nail using a screwdriver in constant time. But that doesn't mean that the tools are equally good for the task.
Mark Byers
@Mark: You're right, it looks like there isn't anyone with a solution that reduces the problem complexity. Probably translating this into some sort of optimized solver is the best way fwd.
Derek Frye
A: 

The first thing I would do is isolate your items. Instead of looking at the setup as a whole, look at the sum of the individual items. Unless your items interact with eachother (set bonuses) you're going to go a long way by merely maximizing stat A and minimizing stat B for just one slot, and repeating that process for each item slot in your setup. This will drastically reduce the complexity of the query, even if it will mean more queries. It should make things faster in the long run.

Another thing to ask yourself is how much is it worth gaining stat B (the one you want to lose) to gain stat A? If you could gain 1000 A, and only have to gain 1 B, that might be worth it. But, what about gaining 10 A, but you'd have to gain 9 B to do it? Now things change a bit.

If you stick with a A:B ratio, you could probably do each slot separately, and join each of those separate results into one query.

glowcoder