views:

78

answers:

5

I am new to web programming, so excuse the ignorance... ;-)

I have a web application that in many ways can be considered to be a multi-tenant environment. By this I mean that each user of the application gets their own 'custom' environment, with absolutely no interaction between those users.

So far I have built the web application as a 'single user' environment. In other words, I haven't actually done anything to support multi-users, but only worked on the functionality I want from the app. Here is my problem... What's the best way to build a multi-user environment:

  1. All users point to the same 'core' backend. In other words, I build the logic to separate users via appropriate SQL queries (eg. select * from table where user='123' and attribute='456').
  2. Each user points to a unique tablespace, which is built separately as they join the system. In this case I would simply generate ALL the relevant SQL tables per user, with some sort of suffix for the user. (eg. now a query would look like 'select * from table_ where attribute ='456').

In short, it's a difference between "select * from table where USER=" and "select * from table_USER".

+4  A: 

Creating tables dynamically is rather dirty and confusing. Additionally, if you have lots of users it'll be a complete chaos if you have tons of tables - especially if you need to change something in n tables instead of a single table.

--> Use one table and add some user_id column. With proper indexes this will be as fast or even faster than separate tables.

ThiefMaster
A: 

The only way separate tables for each tenant makes any sense is if you have a separate database for each tenant, in which case the tables will still have the same names.

Otherwise, use a single table for each entity, and filter them by tenant id.

Robert Harvey
+2  A: 

the first option is better.

In general tables should contain normalized data, you shouldn't duplicate the same table.

Also the 1st option is safer, as you don't need to grant the ability to create or drop real tables to the program

Brian
A: 

If you are on SQL Server, I recommend using single tables for all tenants, give no access to the base tables to whatever login the application is using and restrict access to inline table valued functions. These are just like parameterized views, and mean that no one with access to these can ever retrieve a set for more than one tenant in a single call (so no accidental joins to someone else's product catalog):

CREATE TABLE [dbo].[mt](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [TenantID] [int] NOT NULL,
    [BusinessKey] [varchar](50) NOT NULL,
 CONSTRAINT [PK_mt] PRIMARY KEY CLUSTERED 
(
    [ID] ASC
),
 CONSTRAINT [IX_mt] UNIQUE NONCLUSTERED 
(
    [TenantID] ASC,
    [BusinessKey] ASC
))

CREATE FUNCTION f_mt ( @TenantID AS INT )
RETURNS TABLE
AS
RETURN
    ( SELECT    *
      FROM      mt
      WHERE     TenantID = @TenantID
    )

If you have the TenantID stored somewhere in the connection (using CONTEXT_INFO()), it is also possible to have simple views which wrap this:

CREATE VIEW vw_mt
AS
    SELECT *
    FROM f_mt(CONTEXT_INFO())

It all depends how much abstraction you want to put around this.

Cade Roux
+1  A: 

I would say that your choice depends. You really have three choices:

One database to rule them all...(Your choice 1)

Certainly adding a TenantId column makes it easier to add new tenants (users) but there are some downsides:

  1. You must be careful to ensure that every query filters against TenantId. It will be very easy to accidentally forget the TenantId in the right place and return other tenant's data.
  2. All top-level, parent tables must include a TenantId. It is not just your main data, but all tenant-specific parent data.
  3. Tenants cannot be on different schema versions at different times. For example, suppose you make some data schema changes in version 1.1 of your application. If all tenants are in the same database, everyone must be updated simultaneously whether you want them to or not. In addition, if you are using a single database, you are almost compelled to use a single site as you want to ensure that the site and schema stay in sync. If you do this, you cannot, for example, charge someone to upgrade in order to get a new feature. The features must be built as plug-ins as opposed to version specific updates which may not be a bad thing but it has to be a conscious decision from the outset.
  4. It can be a chore to separate tenant's data should they wish to have a copy of their data or wish to host their own data or if you wish to move them to another database server. Because all resources are shared, you may run into a situation where one tenant is chewing up resources through reports or traffic such that you want to move them to their own database server (and up-sell them this benefit). In addition, I've run into situations where tenants want a copy of their data that they can download themselves. If all data is in a single database, this can be a chore.

If you are going to be selling to corporate customers then I would not go down this path. However, if you plan on adding thousands upon thousands of end users as tenants where you do not need to provide them with their data, then using a single database is probably the right approach.

Segment tenants by schema (e.g. Tenant1.Table1, Tenant1.Table2...Tenant2.Table1, Tenant2.Table2...) (I believe your choice 2)

IMO, this is a harder version of simply using separate databases. It has the advantage that maintenance of the one database is a little easier but beyond that has all the same problems as using separate databases.

Segment tenant per database

For corporate customers, I've found that in the end this turns out to be the simplest. It eliminates the possibility of tenant's seeing the wrong data (unless the connection string is wrong). It allows for corporations to host their own system. It allows for tenants to be on different versions if you have different virtual applications per tenant. It makes it easy to do resource allocation, backups and restores. It's only (but not insignificant) downside is the time cost of setup (and thus financial cost). It can be a pain to add databases when you get a new client. Technically, it could be automated but it is still a pain.

So, in the end it depends on your target customer. If they are standard users, then I would go with the "One database to rule them all" approach and make sure you do lots of code reviews and automated testing. If they are corporate customers, especially large corporate customers, then I would consider separate databases per tenant.

Thomas