views:

100

answers:

2

This question is about performance, not about possible solutions.

My system holds many items of different categories. Each category has its own table since each table has many rows AND the fields are different.

ItemA - id, fld1, fld2
ItemB - id, fld1, fld3, fld4
ItemC - id, fld1, fld3, fld5
....

Now there's a need to manage user inventory, meaning the user has an item or not. One option is using a single table:

Inventory - category_id, item_id, user_id

category_id is different for ItemA, ItemB, ... rows and that's how we differentiate.

Second option is to have:

InventoryA - item_id, user_id
InventoryB - item_id, user_id
...

The first option is probably the easiest to manage BUT the inventory table is huge (order of magnitude: number of items on all categories times number of users) and frequently updated and frequently queried.

The second option would be a bit harder to manage (as we create a new inventory table for each category) but may introduce a performance gain as it might prevent race conditions. No single query is likely to require involving more than one of the inventory tables as the categories are quite segregated.

Currently the system uses MySQL and InnoDB engine. There are ~10 categories but expected to grow to a few dozens in the near future. Biggest category has > 200k items and most have > 10k items. The single inventory table has > 10M rows and is expected to get MUCH bigger as more users join.

I know the best is to test the performance of both methods and decide but the truth is that it won't be so quick and painless to move to the multiple table design.

If you have personal experience with a similar issue, please share it.

Thanks

+3  A: 

Normalizing the database is normally better for performance and for maintainability.

This approach would create a table Items that has a 1:1 relation with ItemA, ItemB, etc. Then you can create an Inventory table that has a relation with the base Items table.

According to the documentation, InnoDB supports row level locks, so there's no need to use multiple tables to prevent deadlocks.

Andomar
Thank you Andomar for your answer. IMHO, however, what you are suggesting would actually be a very bad idea that would further decrease performance. Currently, the first solution I've presented is implemented. Each query on the database relates only to one of the categories and I really can't understand how adding an extra Items table would improve performance. Normalization is a good concept but the question here is about practical performance implications.Yes, InnoDB supports row level locks but it's not deadlocks that I'm afraid of but rather the performance decrease.
Colnector
@Colnector: Adding a base table allows you to have a foreign key relation between `inventory` and `item`
Andomar
Yes @Andomar, I understand that but I really don't see how this helps me with what I'm asking about. Currently I have inventory(category_id, item_id, user_id) and with your suggestion I'll have inventory(item_id, user_id) BUT I'll have to JOIN on the new Item table for EACH AND EVERY query. This will decrease performance.
Colnector
A: 

Here is my take on this story, hope this helps a bit.

  • Item table has fields common to all items.
  • Category (A,B,C) tables have fields specific to each one.
  • One user has many items, one item can be used by many users.

    alt text
Damir Sudarevic
Thanks Damir, this is similar to what Andomar suggested above. Please see my comment there. Nice sketch :)
Colnector