We have built a hosted web application, developed so that each customer is connected to an individual database. The application is a web platform/publishing system, and it has worked very well so far with this design. We also have a main database.

Now we are about to change our pricing model to introduce free accounts, which should (hopefully) generate a lot more accounts.

Is there a problem with having a lot of databases, say many thousands (currently about twenty)? There are lots of advantages: security by separation, scalability, easy extraction of customer-specific data (part of scalability too), etc.
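
Roughly, our per-customer connection logic amounts to something like this (a simplified sketch, not our actual code; the table and column names are made up):

    import mysql.connector

    MAIN_DB = dict(host="main-db.example.com", user="app",
                   password="app-password", database="main")

    def connect_for_customer(customer_id):
        # Look up the customer's database coordinates in the main database.
        main = mysql.connector.connect(**MAIN_DB)
        cur = main.cursor()
        cur.execute("SELECT db_host, db_name, db_user, db_password "
                    "FROM customers WHERE id = %s", (customer_id,))
        host, name, user, password = cur.fetchone()
        main.close()
        # Then open a connection to that customer's own database.
        return mysql.connector.connect(host=host, database=name,
                                       user=user, password=password)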

A: 

I think there is a problem in your design. Why are you using a database per customer instead of a set of tables in one database? The advantages you mentioned aside, as far as I can see many databases mean a big overhead: more connection strings, separate credentials, and so on.

Chathuranga Chandrasekara
Definitely; all the mentioned advantages are just about peace of mind, no real 'hard' demonstrable advantages. Just add a 'user' field to most records and put them all in the same database.
Javier
And what happens if one client gets hacked -- then everyone's information is available. Why not keep them in separate databases with separate DB users so that if one customer is hacked, they all aren't?
George Stocker
I believe two separate DBs are more secure than one table with a customer field. Am I really wrong here? With separate database servers we have the possibility to easily migrate customers between servers, or let a customer "take their DB and go". We can also export a customer's data by just dumping that DB, for use in local development and such. What is the "big overhead" with "connection strings + separate credentials"?
Znarkus
George, that just seems like really backwards logic. If your DB server is hacked, odds are - you're screwed anyway.
Jack Marchetti
@George Stocker: If by "separate DB users" you mean MySQL users, I agree, that would increase security. I'll look into that in the future, thanks!
Znarkus
+4  A: 

Two problems that I can see:

Maintenance - it's going to be a pain to make changes to the database (change table structure, modify SPs, etc.) if you have thousands of databases. Sure, you can script the changes, but that many databases means that many more chances for something to go wrong. What are you going to do if your script fails halfway through and you're left with some databases that have the changes and some that don't? (One way to guard against this is sketched below.) Also, as another poster has mentioned, what about maintenance of things like connection strings, logins/passwords, etc.?

Resources - I'm not sure what resources each instance of a database uses but there has got to be some overhead to running that many databases. If you split it over several machines you again run into the Maintenance problem above.
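
On the maintenance point: if you do go down this road, the change script should at least record its progress per database, so a failed run can be restarted instead of leaving you guessing which databases were changed. A rough sketch (MySQL assumed; the customer list, the DDL, and the schema_migrations bookkeeping table are inventions for illustration):

    import mysql.connector

    MIGRATION_ID = "add-invoice-due-date"
    DDL = "ALTER TABLE invoices ADD COLUMN due_date DATE"

    def migrate_all(customer_dbs, server):
        failed = []
        for db_name in customer_dbs:
            conn = mysql.connector.connect(database=db_name, **server)
            cur = conn.cursor()
            # Skip databases that already carry this migration.
            cur.execute("SELECT 1 FROM schema_migrations WHERE id = %s",
                        (MIGRATION_ID,))
            if cur.fetchone():
                conn.close()
                continue
            try:
                cur.execute(DDL)
                cur.execute("INSERT INTO schema_migrations (id) VALUES (%s)",
                            (MIGRATION_ID,))
                conn.commit()
            except mysql.connector.Error as exc:
                failed.append((db_name, exc))  # keep going, report at the end
            finally:
                conn.close()
        return failed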

TLiebe
Maintenance - valid point, even though we have it half covered with DB structure migration scripts. This has worked well so far, but I haven't calculated the impact of massive numbers of databases. Account-specific configuration options are stored in XML files; is this a problem? Resources - that is exactly my question: whether there is overhead, and how much :-)
Znarkus
I don't know how much each individual database consumes. All I can suggest is that you get a test machine, measure the baseline resource usage for a single database (make sure it has the same amount of data, tables, etc. as you would expect for a typical customer) and then write a script to make a few hundred copies of it. Compare your memory and CPU usage before and after. It's not going to be a great comparison, since the test databases aren't actually being used, so you'll also miss things like disk I/O, but it's a start.
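
The copying step could be as simple as this (assuming the mysqldump/mysql command-line tools with credentials in ~/.my.cnf; names are made up):

    import subprocess

    TEMPLATE = "customer_template"  # a representative customer database
    COPIES = 300

    dump = subprocess.run(["mysqldump", TEMPLATE],
                          check=True, capture_output=True).stdout
    for i in range(COPIES):
        name = f"loadtest_{i:04d}"
        subprocess.run(["mysql", "-e", "CREATE DATABASE " + name], check=True)
        subprocess.run(["mysql", name], input=dump, check=True)
    # Now compare memory and CPU usage against the single-database baseline.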
TLiebe
+1  A: 

I agree with TLiebe: in the future, maintenance is going to be very, very hard with hundreds, if not thousands, of databases.

A better solution might be to partition your databases by purpose rather than by user. You could have profile, content, site, etc. as your logical partitions, and then limit the number of users on each partition to a certain number. You would then go forward with profile1, profile2 and so on until you run out of users. This way your data is spread out over a number of different machines, but still related by function.
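
As a sketch of how the routing could work (names and the partition size are purely illustrative):

    USERS_PER_PARTITION = 500_000

    def partition_for(area, user_id):
        # Map a user to the numbered database holding this functional area
        # for them, assuming user ids are assigned sequentially.
        shard = user_id // USERS_PER_PARTITION + 1
        return f"{area}{shard}"

    # partition_for("profile", 1_200_000) -> "profile3"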

MySpace has used this concept for their database setup, which seems to have been pretty successful so far. Article about MySpace's Architecture.

Good luck and hope this helps.

Chris
Thanks, interesting read. But I don't see the advantage for us, as we don't have billions of users :-) This feels like taking the cons from both approaches: having all customers' data mixed AND using multiple DBs.
Znarkus
+1  A: 

I can understand the arguments about peace of mind and scalability, even though there may be no hard "evidence" to support them. Maintenance tasks like table changes could easily be automated. Database errors or break-ins would be contained to one customer, and moving a database to a different server would be a breeze.

On the other hand, the critics of the model may have a point.

Why not simply try it? Set up a separate credentials system, replicate your database 10,000 times, and see whether overhead and performance are still OK.
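
To judge the overhead, you could snapshot a few server counters before and after creating the copies (MySQL assumed; the counter list is just a starting point):

    import mysql.connector

    WATCHED = ("Open_tables", "Open_files", "Threads_connected")

    def snapshot(server):
        conn = mysql.connector.connect(**server)
        cur = conn.cursor()
        cur.execute("SHOW GLOBAL STATUS")
        status = dict(cur.fetchall())
        conn.close()
        return {name: status[name] for name in WATCHED}

    # Compare snapshot(...) taken before and after replicating the database.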

Pekka
Thanks, maybe I should, if no one has experience with this or anything more concrete than "there has to be" :)
Znarkus
A: 

As mentioned, database maintenance (including security) can be scripted (how depends on your database server).
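
With MySQL, for instance, scripting security could mean generating a dedicated user per customer that can only reach its own database (a sketch; the names and password scheme are illustrative):

    import secrets

    def grant_statements(db_name):
        # A per-customer user confined to that customer's database.
        user = db_name + "_app"
        password = secrets.token_urlsafe(16)  # store with the connection info
        return [
            f"CREATE USER '{user}'@'%' IDENTIFIED BY '{password}'",
            f"GRANT SELECT, INSERT, UPDATE, DELETE ON {db_name}.* TO '{user}'@'%'",
        ]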

You actually gain scalability with this scenario too. The reason is that when load becomes too great, you can add another server and move a fraction of your databases to the new server. No need to re-design the application.
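
The move itself can be scripted too, for example (mysqldump/mysql command-line tools assumed; error handling and the final cleanup omitted):

    import subprocess

    def move_database(db_name, old_host, new_host):
        # Dump on the old server, load on the new one.
        dump = subprocess.run(["mysqldump", "-h", old_host, db_name],
                              check=True, capture_output=True).stdout
        subprocess.run(["mysql", "-h", new_host, "-e",
                        "CREATE DATABASE " + db_name], check=True)
        subprocess.run(["mysql", "-h", new_host, db_name],
                       input=dump, check=True)
        # Then repoint the customer's entry in the main database and drop
        # the copy on the old server.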

Ron Leisti
What do you mean by security being scripted? Hehe yeah I realized that as I wrote the pros, but that would be a lot more work.
Znarkus