views:

94

answers:

2

Hi all,

Thanks for your time first. After all the searching on Google, GitHub and here, I only got more confused by the big words (partition/shard/federate), so I figure I have to describe the specific problem I ran into and ask around.

My company's databases deal with a massive number of users and orders, so we split databases and tables in various ways. Some are described below:

way          database and table name    sharded by (or should it be called "partitioned by"?)
YZ.X         db_YZ.tb_X                 last three digits of the order serial number
YYYYMMDD.    db_YYYYMMDD.tb             date
YYYYMM.DD    db_YYYYMM.tb_DD            date too
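To make the three schemes concrete, here is a minimal sketch of how a key could be mapped to a database and table name. The method names are hypothetical, and the split of the three serial-number digits (first two for the database, last one for the table) is only an assumption read off the "YZ.X" pattern; the real rule may differ.

```ruby
require "date"

# Hypothetical helpers; none of these names come from the original post.

# Scheme "YZ.X": keyed by the last three digits of the order serial number.
# ASSUMPTION: the first two digits pick the database, the last picks the table.
def locate_by_serial(serialno)
  digits = serialno.to_s[-3, 3]
  raise ArgumentError, "serial number too short" if digits.nil?
  ["db_#{digits[0, 2]}", "tb_#{digits[2]}"]
end

# Scheme "YYYYMMDD.": one database per day, each with a single table "tb".
def locate_by_day(date)
  [date.strftime("db_%Y%m%d"), "tb"]
end

# Scheme "YYYYMM.DD": one database per month, one table per day of the month.
def locate_by_month_and_day(date)
  [date.strftime("db_%Y%m"), date.strftime("tb_%d")]
end
```

Each helper returns a `[database, table]` pair, so the caller can treat all three schemes the same way.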

The basic idea is that databases and tables are separated according to a field (not necessarily the primary key). There are so many databases and tables that writing, or magically generating, one database.yml entry for each database and one model for each table isn't possible, or at least isn't the best solution.

I looked into drnic's magic solutions, DataFabric, and even the ActiveRecord source code. Maybe I could use ERB to generate database.yml and switch the database connection in an around filter, and maybe I could use named_scope to decide the table name dynamically for find. But update/create operations are bound to "self.class.quoted_table_name", so I couldn't easily solve my problem that way. I could even generate one model for each table, since there are at most 30 of them.
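For the ERB idea, a rough sketch of generating the per-database entries might look like the following. Rails already runs database.yml through ERB, so a loop like this could live in the file itself; here it is rendered by hand so it runs standalone. The entry names, the host, the adapter and the list of shard ids are all made up for illustration.

```ruby
require "erb"
require "yaml"

# ASSUMPTION: entry names (shard_NN), host and adapter are placeholders,
# not the real configuration from the question.
TEMPLATE = <<~YAML
  <% shard_ids.each do |id| -%>
  shard_<%= id %>:
    adapter: mysql
    host: localhost
    database: db_<%= id %>
  <% end -%>
YAML

# Renders one config entry per shard id.
def render_database_yml(shard_ids)
  ERB.new(TEMPLATE, trim_mode: "-").result_with_hash(shard_ids: shard_ids)
end

config = YAML.safe_load(render_database_yml(%w[00 01 02]))
puts config["shard_01"]["database"]
```

This only removes the repetition in the config file; it does nothing about choosing the connection or the table at query time.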

But this is just not DRY!

What I need is a clean solution like the following DSL:

class Order < ActiveRecord::Base
  shard_by :order_serialno do |key|
    [
      # Some or all of the databases might share the same machine in a regular way,
      # so the config could come from a hash of regexes, or even be a constant.
      get_db_config_by(key),
      get_db_name_by(key),
      get_tb_name_by(key)
    ]
  end
end
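The lookup half of such a DSL can be sketched in plain Ruby without touching ActiveRecord. Everything here is hypothetical: `Shardable`, `ShardLocation` and `shard_for` are invented names, `Order` is a stand-in class rather than a real model, and the naming rule inside the block is an assumption. Hooking the result up to the actual connection and table name is the hard part and is deliberately left out.

```ruby
# Plain-Ruby sketch: no ActiveRecord involved, all names are hypothetical.
ShardLocation = Struct.new(:db_config, :db_name, :table_name)

module Shardable
  # Class-level macro: remembers the sharding field and the resolver block.
  def shard_by(field, &resolver)
    raise ArgumentError, "shard_by needs a block" unless resolver
    @shard_field = field
    @shard_resolver = resolver
  end

  attr_reader :shard_field

  # Runs the resolver for a key and memoizes the resulting location per key.
  def shard_for(key)
    @shard_cache ||= {}
    @shard_cache[key] ||= ShardLocation.new(*@shard_resolver.call(key))
  end
end

# Stand-in for a model; the host and naming rule below are made up.
class Order
  extend Shardable

  shard_by :order_serialno do |key|
    digits = key.to_s[-3, 3]
    [{ host: "localhost" }, "db_#{digits[0, 2]}", "tb_#{digits[2]}"]
  end
end

loc = Order.shard_for("2010031500123")
puts "#{loc.db_name}.#{loc.table_name}"
```

Memoizing per key keeps the resolver cheap to call on every query, at the price of a cache that grows with the number of distinct keys.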

Can anybody enlighten me? Any help would be greatly appreciated~~~~

+1  A: 

If you want that particular DSL, or something that matches the logic behind the legacy sharding, you are going to need to dig into ActiveRecord and write a gem to give you that kind of capability. The existing solutions you mention were not necessarily written with your situation in mind. You may be able to bend any number of them to your will, but in the end you're probably gonna have to write custom code to get what you are looking for.

James Thompson
Thanks for your answer! I did dig into the AR source, and into the source of the different solutions I found, all to find a way to write my own plugin for AR. One of the closest solutions is the sharded_database plugin, though it doesn't support table partitioning... I guess "class << self; set_table_name; end" might give me the ability to change the table name for each AR instance, but I don't know what that would cost. And changing the database connection each time would give up all the benefits of the pool... By asking the question, I'm trying to find people in a similar situation...
Utensil
If there happens to be an elegant design that touches and hacks AR only a little, I would greatly appreciate a pointer to it...
Utensil
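One way to avoid resetting the table name on every instance is to build one subclass per table and cache it, so the name is set once per class. The sketch below uses a stand-in `FakeRecord` base class so it runs on its own; `for_table` is a hypothetical method, and how real ActiveRecord copes with anonymous subclasses like these is not something this thread settles.

```ruby
# Stand-in for ActiveRecord::Base so the sketch is self-contained; only a
# class-level table_name accessor is imitated. ASSUMPTION: not the real API.
class FakeRecord
  class << self
    attr_writer :table_name

    # Falls back to the parent class's table name when none is set here.
    def table_name
      @table_name || (superclass.respond_to?(:table_name) ? superclass.table_name : nil)
    end
  end
end

class Order < FakeRecord
  self.table_name = "tb"

  # One subclass per table, built on first use and reused afterwards.
  def self.for_table(name)
    @table_classes ||= {}
    @table_classes[name] ||= Class.new(self) { self.table_name = name }
  end
end

puts Order.for_table("tb_3").table_name
```

The cost is then paid once per table rather than once per record, and the base class keeps its own table name untouched.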
+1  A: 

Sounds like, in this case, you should consider not using SQL.

If the data sets are that big and can be expressed as key/value pairs (with a little denormalization), you should look into CouchDB or other NoSQL solutions. These solutions are fast, fully scalable, and REST-based, so they are easy to grow, back up and replicate.

We have all gotten into the habit of solving all our problems with the same tool (believe me, I do it too).

It would be much easier to switch to a NoSQL solution than to rewrite ActiveRecord.

bstiles
Thanks for your answer! It's always great to have open solutions~ CouchDB is a great thing. I *love* Erlang. Unfortunately, non-SQL isn't an option in my case. The MySQL databases and many other C/C++ programs make up the majority of the real, serious business; I'm just trying to use Rails to talk to them.
Utensil