views:

98

answers:

4

Hi,

We are designing a new version of our existing product on a new schema. It's an internal web application with possibly 100 concurrent users (max). This will run on a SQL Server 2008 database.

One of the recent discussion items is whether we should have a single database or split the database across 2 separate databases for performance reasons.

The database could grow anywhere from 50-100GB over 5 years.

We are developers, not DBAs, so it would be nice to get some general guidance.

[I know the answer is not simple as it depends on the schema, archiving policy, amount of data etc. ]

Option 1: Single main database [this is my preferred option].

The plan would be to have all the tables in a single database and possibly to use filegroups and partitioning to separate the data across multiple disks if required [using schemas if appropriate]. This should deal with the performance concerns. One of the comments on this was that a single server instance would still be processing all this data, so there would still be a processing bottleneck.
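To make the filegroup idea concrete, here is a minimal T-SQL sketch of placing a large table on a second disk; all names (MainDB, FG_Tracking, the file path, dbo.VehicleTracking) are illustrative assumptions, not anything from our actual schema:

```sql
-- Add a filegroup backed by a file on a second disk,
-- then create the big table on that filegroup.
ALTER DATABASE MainDB ADD FILEGROUP FG_Tracking;

ALTER DATABASE MainDB ADD FILE
(
    NAME = MainDB_Tracking,
    FILENAME = 'E:\SQLData\MainDB_Tracking.ndf',  -- second physical disk
    SIZE = 10GB
) TO FILEGROUP FG_Tracking;

CREATE TABLE dbo.VehicleTracking
(
    TrackingId BIGINT IDENTITY PRIMARY KEY,
    VehicleId  INT      NOT NULL,
    RecordedAt DATETIME NOT NULL
) ON FG_Tracking;  -- table data lives on the second disk
```

The point is that I/O can be spread across spindles without splitting the logical database.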

For reporting we could have a separate reporting DB but this is still being discussed.

Option 2: Split the database into 2 separate databases

DB1 - Customers, Accounts, Customer resources etc

DB2 - This would contain the bulk of the data [i.e. Vehicle tracking data, financial transaction tables etc].

These tables would typically contain a lot of data. [It could reside on a separate server if required]

This plan would involve keeping the main data in a smaller database [DB1] and the [mainly] read-only transaction-type data in a separate database [DB2]. The UI would mainly read from DB1 and thus be more responsive. [I'm aware that this option makes referential integrity harder to enforce.]
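On the referential-integrity point: SQL Server does not enforce foreign keys across databases, so a DB2 table referencing DB1 customers would need a trigger (or application-level checks) instead. A rough sketch, with all table and column names being assumptions:

```sql
-- Run in DB2. Rejects inserts/updates whose CustomerId has no
-- matching row in DB1 (a substitute for a cross-database FK).
CREATE TRIGGER dbo.trg_Transaction_CheckCustomer
ON dbo.FinancialTransaction
AFTER INSERT, UPDATE
AS
BEGIN
    IF EXISTS (
        SELECT 1
        FROM inserted i
        LEFT JOIN DB1.dbo.Customer c ON c.CustomerId = i.CustomerId
        WHERE c.CustomerId IS NULL
    )
    BEGIN
        RAISERROR('CustomerId not found in DB1.dbo.Customer', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;
```

This works but is extra code to write, test, and keep in sync, which is part of why it's a drawback of option 2.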

Points for consideration: as we are at the design stage, we can at least make proper use of indexes to deal with performance issues, which is why option 1 is attractive to me; it's also more of a standard approach. For both options we are considering implementing an archiving database.

Apologies for the long question. In summary: 1 DB or 2?

Thanks in advance,

Liam

+5  A: 

Option 1 in my opinion is the way to go.

CPU is very unlikely to be your bottleneck with 100 concurrent users generating your workload. You could acquire a single multi-socket server with additional CPU capacity available via hot-swap technology, giving you room to grow should you wish. Depending on your availability requirements, you could also consider a clustering solution, which lets you bring in additional CPU resources by forcing a failover to another node.

The performance of your disk subsystem is going to be your biggest concern. Your design decisions will be influenced by the storage solution you use, which I assume will be SAN technology.

At a minimum you will want to place your LOG (RAID 1) and DATA files (RAID 10 or 5, depending on workload) on separate LUNs.

Depending on your table access patterns, you may wish to consider placing different filegroups on separate LUNs. Partitioning your table data could prove advantageous, but only for large tables.
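A rough sketch of that partitioning approach, splitting a large table by date across two filegroups; it assumes filegroups FG2008 and FG2009 already exist on separate LUNs, and all names are illustrative:

```sql
-- Rows before 2009-01-01 go to FG2008, the rest to FG2009.
CREATE PARTITION FUNCTION pfByYear (DATETIME)
AS RANGE RIGHT FOR VALUES ('2009-01-01');

CREATE PARTITION SCHEME psByYear
AS PARTITION pfByYear TO (FG2008, FG2009);

CREATE TABLE dbo.TrackingHistory
(
    RecordedAt DATETIME NOT NULL,  -- partitioning column
    VehicleId  INT      NOT NULL
) ON psByYear (RecordedAt);
```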

John Sansom
+2  A: 

50 to 100GB and 100 users is a pretty small database by most standards today. Don't over-engineer your solution by trying to solve problems that you haven't even seen yet. Splitting it into two databases, especially on two different servers, will create a mountain of headaches that you're better off without. Concentrate your efforts on creating a useful product instead.

Tom H.
+2  A: 

I agree with the other comments stating that 50 to 100GB is small these days. I'd also agree that you shouldn't over-engineer.

But if there is an obvious (or not so obvious) logical separation between the entities you store (as you say, one part being read-write and the other mainly read-only), I'd still split it into different databases. At the very least, I would design it so that one piece could easily be factored out later. Security would be one reason; management/backup/restore another; easier serviceability a third (because the design will inherently be better factored, with the parts better isolated from each other); and, in SQL Server, the ability to scale out (or the lack thereof with a single database). Separating login and content databases, for example, often makes sense for bigger web applications.

And if you really want a sound design within a single database, separating your entities into different schemas with proper permissions on the objects ends up being almost the same effort in my eyes.
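That single-database, schema-per-area alternative could look something like this; the schema and role names (Core, Tracking, ReportingRole) are made up for illustration:

```sql
-- Two schemas for the two logical areas.
CREATE SCHEMA Core;      -- customers, accounts (read-write)
GO
CREATE SCHEMA Tracking;  -- bulk, mostly read-only data
GO
-- A reporting role that can read Tracking but not Core.
CREATE ROLE ReportingRole;
GRANT SELECT ON SCHEMA::Tracking TO ReportingRole;
DENY  SELECT ON SCHEMA::Core     TO ReportingRole;
```

You keep a single database (so foreign keys still work everywhere) while getting much of the isolation the two-database split was after.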

Microsoft products like SharePoint, TFS and BizTalk all use several different databases (though I don't pretend to know the reasons; it's probably just the outcome of the way they organize their teams).

Especially given that you cannot scale out a single database instance on SQL Server (clustering needs multiple instances), I'd be tempted to split it.

@John: I would never use RAID 5. It serves no purpose other than to hurt performance. I agree with the RAID 10 approach.

I base the choice between RAID 5 and RAID 10 on the type of workload to be performed. For example, a workload consisting of more than 70-90% read activity (depending on the storage vendor), such as a search-based application, will see superior overall performance from a RAID 5 disk configuration.
John Sansom
+1  A: 

Putting data in another database is not going to make the slightest difference to performance. Performance is a factor of other things entirely.

The reasons to create a separate database are maintenance and administration: for example, if one set of data needs a different backup and recovery policy, or has higher availability requirements.

dportas