We're developing a new eCommerce website and are using NHibernate for the first time. At present we are splitting our data into multiple SQL Server databases, divided by area of functionality. So we have one for UserInfo, one for Orders, one for ProductCatalogue, and so on...

Our justification for this decision is twofold really:

  1. the website has the potential to be HUGE (it is a new website for one of the largest online brands in the UK), and we feel that by partitioning our data along functional lines we will be able to move each database onto its own server, which would give us an easy scaling route should we need it;

  2. my team has always worked this way - partly as a consequence of following the MS Commerce Server pattern from previous projects.

However, reading up on this decision online, we find that the usual response to this sort of model is extremely scathing. "Creating more work for the devs now in order to create more work for the devs later" is one sample comment from Stack Overflow!

In addition, NHibernate is much easier to use with only one database (just one SessionFactory needed). And knowing that Stack Overflow ran off just one box for a long time makes me think that maybe we should not try to be so clever.
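To illustrate the plumbing we're weighing up, here is a rough sketch of the per-database SessionFactory setup we would need. All connection strings, assembly names, and settings below are hypothetical, not our real configuration:

    using NHibernate;
    using NHibernate.Cfg;
    using Environment = NHibernate.Cfg.Environment;

    // One SessionFactory per functional database: three sets of configuration,
    // and any use case that touches two areas needs two open sessions.
    public static class SessionFactories
    {
        public static readonly ISessionFactory Users =
            Build("Server=db1;Database=UserInfo;Integrated Security=SSPI", "Shop.Users");
        public static readonly ISessionFactory Orders =
            Build("Server=db2;Database=Orders;Integrated Security=SSPI", "Shop.Orders");
        public static readonly ISessionFactory Catalogue =
            Build("Server=db3;Database=ProductCatalogue;Integrated Security=SSPI", "Shop.Catalogue");

        static ISessionFactory Build(string connectionString, string mappingAssembly)
        {
            var cfg = new Configuration();
            cfg.SetProperty(Environment.ConnectionString, connectionString);
            cfg.SetProperty(Environment.Dialect, "NHibernate.Dialect.MsSql2005Dialect");
            cfg.SetProperty(Environment.ConnectionDriver, "NHibernate.Driver.SqlClientDriver");
            cfg.AddAssembly(mappingAssembly); // hbm.xml mappings for this area only
            return cfg.BuildSessionFactory();
        }
    }

With a single database, all of this collapses to one Configuration and one SessionFactory.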

So, my question is: are we correct in thinking that using fine-grained databases might increase our ability to scale, or should we sacrifice this for easier development?

A: 

I've never worked on a project like this. I'm used to databases with several hundred tables, which has never been a problem.

Therefore I can't say whether your idea is a good one; I've never tried it. The "my team has always worked this way" argument is a major driver for many decisions, and I can't even say that it is always wrong.

With NHibernate you organize your data as classes, which can live in different namespaces and assemblies. You usually don't work much with the database directly, so you don't need this kind of structure there.

About the scalability argument: I'm not sure it really scales well when you need to hit several databases on every request. I mean, you always need users and orders, and probably more, so you end up gathering all of that data from several databases.
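To sketch what I mean (all class and mapping names here are invented, and I assume the hbm.xml mappings are embedded in the one assembly):

    using NHibernate;
    using NHibernate.Cfg;

    // Classes from different namespaces, all mapped into the same database.
    namespace Shop.Users
    {
        public class User { public virtual int Id { get; set; } }
    }

    namespace Shop.Orders
    {
        public class Order
        {
            public virtual int Id { get; set; }
            public virtual Shop.Users.User PlacedBy { get; set; }
        }
    }

    public static class Example
    {
        public static void Main()
        {
            var cfg = new Configuration().Configure(); // one hibernate.cfg.xml, one database
            cfg.AddAssembly(typeof(Shop.Users.User).Assembly); // mappings for every area
            using (var factory = cfg.BuildSessionFactory())
            using (var session = factory.OpenSession())
            {
                // One unit of work spanning two "areas". With one database per
                // area this becomes two sessions plus hand-written stitching.
                var order = session.Get<Shop.Orders.Order>(42);
                var user = order.PlacedBy; // the same session follows the association
            }
        }
    }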

Stefan Steinegger
+4  A: 

Why don't you just design your database properly and put the files on appropriate disks? Use a cluster if necessary. Creating multiple databases is not an inherently scalable solution. Also: cross-database referential integrity? Good luck.

What's your definition of "HUGE"? SQL Server can handle massive databases, but one thing I've learnt is that people often have no idea what constitutes a lot of data.

starskythehutch
" people often have no idea what constitutes a lot of data." - exactly! +1
Mitch Wheat
It's not so much the size of the dataset as the amount of usage that we are thinking about. So you would suggest using one database and just mapping different tables to different storage? Is that because disk I/O is always the main bottleneck?
Gaz
Agreed on the referential integrity issue - you can have orders for non-existent products and non-existent users. That sounds like fun.
Damien_The_Unbeliever
In essence, yes. Remember that you can map different parts of a table to different disks using table partitioning, so you can put your non-changing data on a RAID array that is very fast to read, and your changing data on a RAID array with good read/write performance.
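Roughly the sort of thing I mean, as a sketch: every name here is invented, and the FG_Archive/FG_Current filegroups must first be created (ALTER DATABASE ... ADD FILEGROUP / ADD FILE) on the fast-read and read/write disks respectively:

    using System.Data.SqlClient;

    // Partition dbo.Orders by date: old, non-changing rows go to a filegroup
    // on the fast-read disks, current rows to one on the read/write disks.
    static class PartitionOrders
    {
        const string Ddl = @"
            CREATE PARTITION FUNCTION pfOrderDate (datetime)
                AS RANGE RIGHT FOR VALUES ('20090101');

            -- rows before 2009 -> FG_Archive, rows from 2009 on -> FG_Current
            CREATE PARTITION SCHEME psOrderDate
                AS PARTITION pfOrderDate TO (FG_Archive, FG_Current);

            CREATE TABLE dbo.Orders (
                OrderId int      NOT NULL,
                Placed  datetime NOT NULL,
                Total   money    NOT NULL,
                CONSTRAINT PK_Orders PRIMARY KEY (OrderId, Placed)
            ) ON psOrderDate (Placed);";

        static void Main()
        {
            using (var conn = new SqlConnection("Server=.;Database=Shop;Integrated Security=SSPI"))
            using (var cmd = new SqlCommand(Ddl, conn))
            {
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }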
starskythehutch
@Gaz: if your OLTP database server hardware and software are properly set up, then 80% of page reads should be served from memory, with almost no disk access. Data write-backs are done lazily.
Mitch Wheat
That's a good rule of thumb Mitch. Thanks for that.
starskythehutch
Well that's all pretty conclusive... we are refactoring to a one-db pattern as we speak. Thanks guys.
Gaz