views:

57

answers:

2

Hello everyone.

Lets assume we have a User model. And user can plan some activities. The number of types of activities is about 50. All activities have common properties, such as start_time, end_time, user_id, etc. But each of them has some unique properties.

Now we have each activity living in its own table in DB. And thats why we have such terrible sql queries like

  SELECT * FROM `first_activities_table` WHERE (`first_activity`.`id` IN (17,18)) 
  SELECT * FROM `second_activities_table` WHERE (`second_activity`.`id` = 17)
  .....
  SELECT * FROM `n_activities_table` WHERE (`n_activity`.`id` = 44)

About 50 queries. That's terrible.

There are different ways to solve this.

  1. Choose the activity type with the biggest number of properties, create the table 'Activities' and have STI model. But this way we must name our columns in uncomfortable way and often the record in that table would have some NULL fields.
  2. Also STI model, but having columns, common for all of activity types and some blob column with serialized properties. But we have to do some search on activities - there can be a problem. And serialization is quite slow.

Please help me dealing with this. Maybe my problem has quite different solution that will fit my needs.

Thanks for help.

A: 

Perhaps a relational database is not the ideal solution here.

Have a look at a document oriented database such as Mongo DB.

From Wikipedia :

As opposed to relational databases, document-based databases do not store data in tables with uniform sized fields for each record. Instead, each record is stored as a document that has certain characteristics. Any number of fields of any length can be added to a document. Fields can also contain multiple pieces of data.

Edit

Why all the Mongo hate? Someone explain please.

With the exception of the fact that the chap has already started down the road with a relational database - which was not specified as fixed in the original question, I cant see any reason why this would be a bad idea.

There are 50 different representations for Activity. They share some data, but largely the fields for each type will be different. The large number of different activities suggests to me that there really is no fixed representation for an activity. Maintaining a fixed column database to handle a non-fixed set of data is going to be painful.

The second posibility mentioned by the poster, that of serializing the activities into blob data is essentially an extremely inefficient and non-queryable version of a document based database.

Not dissing the downvoters, I just wanna educate myself!

Mongus Pong
Thank you. I am thinking about it. But we have some logic work with current relational database, only there we have problems.Will it be difficult to move to MongoDB? And which realization is working best with rails?Anyway I'm still searching for 'relational' way to deal with it
kshchepelin
I cant say without knowing your whole db how difficult it would be to move to Mongo. I didnt realise you were already down the road with a standard relational database. Was just throwing it out as an idea to consider, which if you were asking from an educational / theoretical viewpoint would still be worth considering for the type of problem you have. There is more information on using it with Ruby here http://www.mongodb.org/display/DOCS/Ruby+Language+Center.
Mongus Pong
+1  A: 

Break your activities out into a couple of tables, and build them out of a few models.

For example:

table activity_types: name

table activity activity_type_id user_id [common fields]

table activity_details activity_id key value

Where you use activity details to store the non common elements, one per row.

You'll lose the simplicity of a single model/table for each activity, but you'll get a unified activity model that you can add some methods to that make fetching the "details" data easier, maybe overloading #method_missing? or something.

You'll still have more than one set of queries, but you won't have fifty, ending up with something like:

select * from activities where user_id = 1; select * from activity_types where id in (...) select * from activity_details where activity_id in (...)

You'll get more rows returned, just because you'll have multiple detail rows for each activity, but I'd guess it'll be faster overall.

Jon Moses