A web application we wrote intended for one customer is going to be product-ized and sold to dozens of companies, and we will be doing the hosting.

I could use some guidance about the pros and cons of rolling out a separate instance for each customer versus going with a single (or very small number of) multi-tenant instances.

At first, as we ramp up, I will have to roll out a separate instance of the application for each new customer (they will come online one at a time) because it's the only immediate option. I imagine this won't scale very well as far as maintenance goes - rolling out changes will become very tedious and possibly error-prone once there are more than 4 or 5 instances out there, unless we automate that somehow.

Also, the single-instance philosophy seems like it might lead to a bunch of forks if people need customizations. And it would be nice to avoid that.

So what has your experience been with this?

Bonus question #1: What's the performance difference between 10 SQL Servers with 2m records each versus one huge one with 20m? Let's say they are all in one table and we're mainly doing inserts and selects on single records. Sometimes the selects are on an indexed varchar(12) or date field.

Bonus Question #2: I imagine that to avoid forking, we would have to make the customizations configurable, or build a plug-in architecture. However, that might increase the cost of doing customizations, and I don't want to be one of those shops that takes a week to resize a textbox, and I don't want to over-invest in infrastructure. Any thoughts on that?

Scale Details

Each customer will have a decent amount of data -- up to a few million records.

There will be a very small number of concurrent users, only a few per customer, plus a handful of internal reps on our end.

It's unclear whether each customer will require customizations, but I would say some of them probably will, and maybe some of those changes will be things that other customers will not want to see.

+1  A: 

The big advantage of individual instances will be scaling out as each customer's demand increases. For example, if you're running on a single server and one customer suddenly needs more performance, you're stuffed. But if they're all individual, then moving that customer to a shiny new server is relatively easy.

The big disadvantage will be in managing the instances individually (regardless of whether they're all running on the same server or not).

Regardless, you should only ever have one instance of the codebase, and customisation should all be controlled through plugins and configuration. Front end should naturally be separate from content. Although the cost of making a change may be higher, the benefit in terms of features you can offer your other customers (which will just be customisations you've been asked to do) will pay off, I'm sure. That is to say nothing of how much easier it'll be to manage a single codebase as opposed to several.

Massif
+2  A: 

I don't see a good reason for either of your two options. I think the real answer lies somewhere in the middle: having multiple instances, each hosting multiple clients.

This adds another layer of automation processing, but it means you can keep the hosting cheap (you won't need to go out and buy a Cray any time soon) and (hopefully) this sort of mentality means you could do failover backups fairly easily.

But let's not get ahead of ourselves... We're talking about a webapp, right? Get your database(s) and aspnet on different machines. Cluster your databases and you'll have a much happier time playing around with various front-end scenarios. You'll also be able to upscale whichever area runs out of puff first.

By the sounds of it, you'll end up with one clustered database spread over half a dozen, if not a full dozen, database machines and only a couple of front-end boxes.

As for customisations, you've nailed it. You either provide a completely database-hosted set of editable templates or you have to customise whole instances. I'm all for the first. It's a lot of work (without much in return) but it's well worth it, as you should only need to change the core code when (you will!) you do upgrades. Hunting through a hundred customers' custom instances to make sure they upgrade safely will kill a developer! Templates are the answer. At the very, very least, you could allow custom CSS without much pain (but they'd need somebody who knew their stuff).

Edit: I've seen a couple of posts going for the all-in-one method. Splitting the instances over multiple machines insulates you from a couple of things:

  • If you introduce a bug not caught in testing, only a few clients are affected at once

  • Hardware fails. Having one mega-server fall over will annoy a lot of people at once. Having a failover mega-server is massively expensive. Having a spare failover box per three or four running servers is much cheaper and annoys fewer people.

  • Performance can be balanced between boxes on a client-by-client basis, so you can put a few light-use clients with a heavy client, or just fill a box with a few medium-use clients, etc.

  • On the same idea, usage spikes or other slowdowns only affect clients on the same box. Of course this doesn't hold for the database, but you can split that up into a cluster of clusters when you get there.
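The client-by-client balancing described above could be sketched roughly as follows. This is only an illustration, not something from this answer: the load numbers, the client names, and the `place_clients` helper are all made up, and the greedy "heaviest client goes to the least-loaded box" rule is just one simple heuristic.

```python
import heapq


def place_clients(client_loads, box_count):
    """Assign each client to the currently least-loaded box (greedy heuristic)."""
    # Heap entries are (total_load, box_index); the smallest total is popped first.
    heap = [(0, i) for i in range(box_count)]
    heapq.heapify(heap)
    boxes = [[] for _ in range(box_count)]
    # Placing the heaviest clients first tends to give a more even spread.
    ordered = sorted(client_loads.items(), key=lambda kv: kv[1], reverse=True)
    for name, load in ordered:
        total, idx = heapq.heappop(heap)
        boxes[idx].append(name)
        heapq.heappush(heap, (total + load, idx))
    return boxes


# Hypothetical relative load figures, e.g. requests per minute.
loads = {"heavy": 70, "medium_a": 40, "medium_b": 35, "light_a": 10, "light_b": 5}
layout = place_clients(loads, box_count=2)
```

With these made-up numbers the heavy client shares a box with one light client, and the two medium clients share the other box with the remaining light one.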

Oli
I agree with you -- I did specifically say one huge monolith, but in my mind I was thinking about potentially numerous monoliths. As many as we need, but probably just one in the short term.
Brian MacKay
Hey, just curious - how strongly are you behind clustering? I tend to avoid that because it seems like such a pain. I've done some hot backup scenarios that seemed to work pretty well but required a lot less work. Downside is that the backup wasn't actually doing any work, it was just sitting there waiting for a problem.
Brian MacKay
Most real clustered SQL systems allow for part of the system to go down without losing access to any data. They're a lot better than traditional failover-only systems where you do just have a spare machine sitting there waiting for something to break. My experience is with MySQL but I know MSSQL server has similar clustering options.
Oli
And let's be clear: it *is* a pain to set up but you get a pretty robust and fast database system at the end of it.
Oli
One year later, I can say that this strategy was definitely the right choice.
Brian MacKay
+1  A: 

I would strongly advise going with the single instance hosted by your company. This has the following advantages:

  • You have physical access to all code and databases to make changes and updates.
  • You control the quality of the hardware it is running on.
  • When you fix a bug in common code, you have fixed it once for all customers.
  • You can refactor the application design to better support customer specific code and avoid forking.
  • As the number of customers grows, you can scale-up and scale-out your servers to meet performance/responsiveness requirements.
  • Your application code and databases cannot be tampered with by "inquisitive" customers.

I would have to say it is almost more important where your application is running as opposed to how many separate instances there are of it.

Sure, maintaining multiple separate instances is not ideal due to the support/maintenance overhead, but if these apps are all on servers you control, life is much easier than needing remote/physical access to different customers' networks and servers.

Joel Spolsky also talks about exactly this on StackOverflow podcast 67.

One thing Joel has learned from selling Fogbugz: software designed to be installed on a server in-house at a customer’s site, under full control of that customer, is almost never worth the hassle

20 million records is, relatively speaking, not a huge SQL Server database. A single well-provisioned SQL Server could handle this size comfortably. More important, however, is the number of concurrent accesses to the database. You say that there will be only a few users per customer, so this is unlikely to hit you until the level of concurrency grows.

Ash
Quick comment: in all cases, we will be the ones doing the hosting. Fortunately, for once, these guys don't want to mess with it and we can keep it in our datacenter.
Brian MacKay
+2  A: 

when faced with a similar challenge, here's what we did:

  1. we have one code base with multiple sql servers. we do maintain multiple iis servers with copies of the same code base. we are free to move clients around from sql server to sql server to maximize performance.

  2. if a customer has the $ for it, we will install them on their own server and maintain a separate iis server for them. this accommodates the largest customers, for whom paying much more money every month (10 fold more) is acceptable. we do not, however, give them a separate code base. if they need a mod, we make it visible on a per client basis (see #3)

  3. custom programming usually results in a configurable option. even the people who pay us to have their own server get the same version of the code. sometimes it's as simple as a clause in the code that says "if the customer = 'ourbigcustomer' then turn on this option". yes, that's kludgy hard-coding, but if the customer has enough money, that is fine with me.

  4. i didn't quite get from your question whether you wanted to mix different customers' data into one big database .. our rule is we never do that (never ever). it is one of the wisest choices we ever made. it makes data manipulation much less risky and restores of data easier.
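to make points 1 and 3 a bit more concrete, here is a minimal sketch of a per-client settings table that holds both the feature switches and the sql server each client lives on. it is not our actual code: `TenantSettings`, `feature_enabled`, `connection_for`, the server names and the `bulk_export` option are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TenantSettings:
    # Hypothetical fields; a real product would define its own options.
    connection_name: str
    features: frozenset = field(default_factory=frozenset)


# Assumption: in practice this table would be loaded from a config store,
# so moving a client to another SQL server is a data change, not a code change.
TENANTS = {
    "ourbigcustomer": TenantSettings("sql-dedicated-01", frozenset({"bulk_export"})),
    "smallco": TenantSettings("sql-shared-02"),
}


def feature_enabled(tenant_id, feature):
    """Return True only if the tenant exists and has the option switched on."""
    settings = TENANTS.get(tenant_id)
    return settings is not None and feature in settings.features


def connection_for(tenant_id):
    """Look up which database server currently hosts this tenant's data."""
    return TENANTS[tenant_id].connection_name
```

the application code then asks `feature_enabled(customer, "bulk_export")` instead of comparing against a customer name, so everyone still runs the same version of the code.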

Don Dickinson
I was definitely intending to mix the customers' data in the monolithic solution, but of course it would be isolated by key, etc. I can see both sides of that one. Fortunately, in our case the customer data is not that complex. Just a few tables.
Brian MacKay
+1  A: 

All of the above are good points but you are missing two key questions. What price point is the service offered at, and how many customers (order of magnitude) will you ultimately have to support (i.e. market size)? In 3 years will you have a maximum of 10 customers each of which will pay you $500,000 per year, or 500 customers each paying you $10,000 per year? For a small set of high-paying premium customers the advantages of individual deployments are clear, whereas lower prices and a larger customer base mean a shared solution (a la Oli's comment) is the best way to go. Or go with a cloud platform, although I've only read the hype and tinkered rather than deployed that in the field.

Bonus Question 1: table layout, indexing, number of reads / writes, efficiency and complexity of stored procedures (you are using procs or at least prepared statements, right?) all matter a heck of a lot more than the number of physical records in the database, up to a point. Beyond that you will likely find yourself needing to either provide individual SQL Server instances for each customer or for a pool of customers, once again depending on some of the questions I raised above.
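As a rough back-of-the-envelope illustration of why row count alone matters less than indexing for single-record lookups: the depth of a B-tree index grows only logarithmically with the number of rows. The sketch below is an approximation, not a measurement; the fanout of 300 keys per index page is an assumed figure picked for illustration, and `btree_levels` is a made-up helper.

```python
def btree_levels(row_count, fanout):
    """Smallest number of levels whose capacity (fanout ** levels) covers row_count."""
    levels, capacity = 1, fanout
    while capacity < row_count:
        capacity *= fanout
        levels += 1
    return levels


FANOUT = 300  # assumption: keys per index page, illustrative only

per_customer = btree_levels(2_000_000, FANOUT)   # one customer's table
combined = btree_levels(20_000_000, FANOUT)      # ten customers in one table
```

Under that assumed fanout both cases come out at 3 levels, so an indexed lookup of one record touches about the same number of index pages either way.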

Bonus Question 2: Putting the time into your design for templating and a plugin architecture is essential in this situation and you need to do it sooner rather than later. Once you're in the grind of customizing code for paying customers you will likely not have the time to do it right. This point cannot be stressed enough. Templates and admin tools that give you quick and deep access to data-driven changes in your product will save you a lot of time down the road. As your company / group expands you can then add less technical staff that can be "product experts" who can perform 90% of customizations and maintenance, freeing up your core to continue development or move on to other projects. Finally, don't neglect your data tier in this planning process. Having a core data tier of (almost) immutable stored procs and tables is very important, with custom tables and stored procs clearly demarcated using a good naming convention.

Good luck, feel free to provide more details if you'd like more specific suggestions.

Tom Crowe
That's a very thoughtful answer Tom, thanks.
Brian MacKay
A: 

Based on some of the advice received here, we did end up implementing a monolithic multi-tenant version of our application.

I'm glad we did. By the time it was done, we had 3 or 4 forks of the code base (mainly custom skins and things we didn't have n-level support for, but also some actual features), and it was only getting crazier.

We got the multi-tenant version up and successfully folded everything in. There ended up being a lot to think about and a lot to keep track of, but our customers never even knew they had been moved to a new system.

I will say that the actual customer migration was a bit of a bear. I thought at first that we would be able to do it by hand in the backend, but I ended up having to write some fairly involved scripts to get the job done. There were just too many identity columns, and it's not like you can just turn off constraints temporarily when you're importing into a live production system.
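To give a flavor of what the scripts had to deal with, here is a heavily simplified sketch of the identity-column problem: each old instance numbered its rows from 1, so rows have to be given new ids on the way in and every reference to them has to be rewritten to match. This is not the actual migration code; the parent/child tables, the column names, and `merge_with_new_ids` are invented for illustration.

```python
def merge_with_new_ids(parents, children, next_parent_id, next_child_id):
    """Re-key one tenant's rows so they cannot collide with ids already in the target."""
    id_map = {}  # old parent id -> new parent id
    new_parents = []
    for row in parents:
        id_map[row["id"]] = next_parent_id
        new_parents.append({**row, "id": next_parent_id})
        next_parent_id += 1

    new_children = []
    for row in children:
        # The foreign key must follow the parent to its new id.
        new_children.append(
            {**row, "id": next_child_id, "parent_id": id_map[row["parent_id"]]}
        )
        next_child_id += 1
    return new_parents, new_children


# Hypothetical data from one old single-customer instance.
old_parents = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
old_children = [{"id": 1, "parent_id": 2}, {"id": 2, "parent_id": 1}]
merged_parents, merged_children = merge_with_new_ids(old_parents, old_children, 500, 9000)
```

The starting values 500 and 9000 stand in for wherever the target system's identity values currently are. Multiply this by every table with an identity column and it adds up quickly.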

Brian MacKay