I am building a website wherein the users can perform a variety of actions and they get a variable number of "points" or "badges" for performing specific actions. Certain data must be stored no matter which type of action the user performs, such as the user ID, the action type, a timestamp, the current point total, and any badge awarded. But depending on the type of action that a user performs, some action-type-specific data must be stored, including image data in BLOBs.

One option is to include all of the fields for all of the action types in the actions table. Unfortunately, each of these columns would only store data for the small proportion of actions matching the appropriate action type. So I would have a large number of empty fields (including BLOBs) with this approach.

The other option is to include a table for each action type in addition to the above actions table. Each action-type table would have a foreign key to the relevant action in the actions table. This would keep the actions table better organized, but it introduces the possibility of the actions table going out of sync with the action-type tables. I also wonder about the performance implications of having to do a large number of joins on the different action-type tables whenever I get data from the actions table.

Finally, I am optimizing for speed rather than size. How should I approach this dilemma?

+1  A: 

Usually, avoiding joins in large tables is a good practice for speed, but it really depends on your usage.

If you are planning on performing aggregations over the action table, I'd strongly recommend using the single table approach.
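To make the single-table case concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names (`points_awarded`, `comment_text`, `image`) and the action types are made up for illustration, and I'm assuming you store the points awarded per action. The point is only that an aggregation touches one table and needs no joins:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single-table layout: shared columns plus nullable
# type-specific columns that stay NULL for other action types.
conn.execute("""
    CREATE TABLE actions (
        id             INTEGER PRIMARY KEY,
        user_id        INTEGER NOT NULL,
        action_type    TEXT    NOT NULL,
        created_at     TEXT    NOT NULL,
        points_awarded INTEGER NOT NULL,
        badge          TEXT,
        comment_text   TEXT,  -- only used by 'comment' actions
        image          BLOB   -- only used by 'upload' actions
    )
""")

rows = [
    (1, "comment", "2024-01-01", 5, None, "hello", None),
    (1, "upload", "2024-01-02", 10, "first_upload", None, b"\x89PNG"),
    (2, "comment", "2024-01-03", 5, None, "hi", None),
]
conn.executemany(
    "INSERT INTO actions (user_id, action_type, created_at, points_awarded,"
    " badge, comment_text, image) VALUES (?, ?, ?, ?, ?, ?, ?)",
    rows,
)

# Aggregating over every action type is a single-table scan.
totals = conn.execute(
    "SELECT user_id, SUM(points_awarded) FROM actions"
    " GROUP BY user_id ORDER BY user_id"
).fetchall()
print(totals)
```

With per-type tables, the same total would still come from the shared table, but any aggregate that needs type-specific columns would have to join or union across them.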

If all you do is single-row index fetches (has the user done this particular action?), then perhaps using different tables will be more efficient. You'll be able to query the specific table, and since it is smaller, it may be more responsive.
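A rough sketch of the separate-table variant, again with sqlite3. The table name `upload_actions`, the index name, and the decision to copy `user_id` into the type table (so the lookup needs no join) are my assumptions, not something from your schema. The foreign key is one way to address the out-of-sync worry, provided your database actually enforces it (SQLite needs the pragma shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.executescript("""
    CREATE TABLE actions (
        id          INTEGER PRIMARY KEY,
        user_id     INTEGER NOT NULL,
        action_type TEXT    NOT NULL,
        created_at  TEXT    NOT NULL
    );
    -- Hypothetical per-type table; user_id is duplicated here on purpose
    -- so that "has this user uploaded?" can be answered without a join.
    CREATE TABLE upload_actions (
        action_id INTEGER PRIMARY KEY REFERENCES actions(id) ON DELETE CASCADE,
        user_id   INTEGER NOT NULL,
        image     BLOB    NOT NULL
    );
    CREATE INDEX idx_upload_user ON upload_actions(user_id);
""")

cur = conn.execute(
    "INSERT INTO actions (user_id, action_type, created_at) VALUES (?, ?, ?)",
    (1, "upload", "2024-01-02"),
)
conn.execute(
    "INSERT INTO upload_actions (action_id, user_id, image) VALUES (?, ?, ?)",
    (cur.lastrowid, 1, b"\x89PNG"),
)


def has_uploaded(conn, user_id):
    """Index lookup on the small type-specific table only."""
    row = conn.execute(
        "SELECT 1 FROM upload_actions WHERE user_id = ? LIMIT 1", (user_id,)
    ).fetchone()
    return row is not None


# A detail row pointing at a non-existent action is rejected by the FK.
try:
    conn.execute(
        "INSERT INTO upload_actions (action_id, user_id, image) VALUES (?, ?, ?)",
        (999, 2, b"x"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Here `has_uploaded(conn, 1)` hits only the small table through its index. The price is that `user_id` now lives in two places, so whether that duplication is worth it depends on how hot the lookup is.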

A practice I see a lot is having generic fields (number1, number2, ... string1, string2, ...) and a mapping table that describes each field according to the action type. The benefit of this practice is that the table is more densely populated. The downside is that understanding the data in the table becomes difficult and keeping the mapping in sync is hard work. I'd only use it if there's a good reason, for example if you have more than fifty different action types (in which case managing fifty tables isn't a picnic either).
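For what it's worth, the generic-field layout might look something like the sketch below. The `field_map` table, the `decode_action` helper, and the example meanings (`comment_text`, `parent_post_id`, `vote_value`) are all hypothetical names of mine, just to show how the mapping is used to interpret a row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actions (
        id          INTEGER PRIMARY KEY,
        user_id     INTEGER NOT NULL,
        action_type TEXT    NOT NULL,
        number1     INTEGER,
        string1     TEXT
    );
    -- Describes what each generic column means for each action type.
    CREATE TABLE field_map (
        action_type TEXT NOT NULL,
        column_name TEXT NOT NULL,
        meaning     TEXT NOT NULL,
        PRIMARY KEY (action_type, column_name)
    );
""")

conn.executemany(
    "INSERT INTO field_map VALUES (?, ?, ?)",
    [
        ("comment", "string1", "comment_text"),
        ("comment", "number1", "parent_post_id"),
        ("vote", "number1", "vote_value"),
    ],
)
conn.executemany(
    "INSERT INTO actions (user_id, action_type, number1, string1)"
    " VALUES (?, ?, ?, ?)",
    [(1, "comment", 42, "nice post"), (2, "vote", 1, None)],
)


def decode_action(conn, action_id):
    """Translate generic columns into named fields using field_map."""
    cur = conn.execute("SELECT * FROM actions WHERE id = ?", (action_id,))
    row = cur.fetchone()
    if row is None:
        return None
    record = dict(zip([d[0] for d in cur.description], row))
    mapping = conn.execute(
        "SELECT column_name, meaning FROM field_map WHERE action_type = ?",
        (record["action_type"],),
    ).fetchall()
    return {meaning: record[column] for column, meaning in mapping}
```

Note that nothing in the database stops a row from putting the wrong kind of value into `number1`. That is the "hard work" part: the mapping is only a convention your application code has to respect.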

OmerGertel