Engineering scalability into an application

+1 A:

Very broadly, scalability means an increase in system load can be handled with a proportionally smaller increase in assets that must be marshaled to serve that load.

If the load on your web app increased by a factor of 100, what would you do?

One fundamental principle of scalability is identifying and eliminating potential processing bottlenecks, including parallelizing the tasks that constrict throughput. But that's just a taste; I'm sure you'll get lots of other equally valid answers.

Edit: note that bottlenecks occur not just in actual task processing. They can be in overall process setup, necessary hardware operations, maintenance tasks, redesign/refactoring, you name it.

John Pirie 2009-07-02 12:44:33

This is true; however, scalability is also an issue when system load decreases significantly. You may often come across a performance problem when "scaling down" your system. Thus scalability, in a broad sense, concerns a change, whether an increase or a decrease, in the quantity of resources or clients.

jtbradle 2009-07-02 12:53:06

+2 A:

When I think about "large scale applications" I think of three very different things:

1) Applications that will run across a large scale-out cluster (much larger than 1024 cores).
2) Applications that will deal with data sets that are much larger than physical memory.
3) Applications that have a very large source base for the code.

Each kind of "scalability" introduces a different kind of complexity, and requires a different set of compromises.

Scale-out applications typically rely on libraries that use MPI to coordinate the various processes. Some applications are "embarrassingly parallel" and require very little (or even no) communication between the different processes in order to complete the task (e.g. rendering different frames of an animated movie). This style of application tends to be bound by CPU clock rate or memory bandwidth, and adding more cores will almost always increase its "scalability". Other applications require a great deal of message traffic between the different processes in order to ensure progress toward a solution. This style of application tends to be bound by the overall performance of the interconnect between nodes. These message-intensive applications may benefit from a very high bandwidth, low latency interconnect (e.g. InfiniBand). Engineering scalability into this style of application begins with minimizing the use of files or other resources that are shared by all the processes.
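As a rough illustration of the "embarrassingly parallel" case, here is a minimal Python sketch. The `render_frame` function is a hypothetical stand-in for real rendering work, and a thread pool stands in for the separate processes or cluster nodes (typically coordinated through MPI) that a real scale-out job would use. The point is only that each task depends on nothing but its own input, so no communication between tasks is needed.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(frame_number):
    """Hypothetical stand-in for rendering one frame.

    The result depends only on frame_number, never on another frame.
    """
    return sum((frame_number * i) % 255 for i in range(1000))


def render_all(frame_numbers, workers=4):
    """Render every frame independently and return results in input order.

    No task reads or writes state owned by another task, which is what
    makes the job embarrassingly parallel.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(render_frame, frame_numbers))
```

Because the tasks share nothing, the parallel result should match a plain sequential loop over the same frames, whatever the worker count.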

The second style of scalability concerns applications that run on a small number of servers (including a single SMP-style server), but that deal with a very large dataset or a very large number of transactions. Adding physical memory to the system can often increase the scalability of the application. However, at some point physical memory will be exhausted. In most cases, the performance bottleneck will then be the disk I/O of the system. In these cases, adding high performance persistent storage (e.g. striped hard drive arrays), or even adding a high performance interconnect to some kind of SAN, can help to increase the scalability of the application. Engineering scalability into this style of application begins with algorithmic decisions that minimize the need to touch the same data (or set up the same infrastructure) more often than is necessary to complete the task (e.g. open a persistent connection to a database, instead of opening a new connection for each transaction).
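The persistent-connection advice can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module with an in-memory database, and the `payments` table and the helper names are hypothetical. The connection is opened once and reused, while each insert is still committed as its own transaction.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the connection once; the table name is a made-up example."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS payments (amount INTEGER)")
    return conn


def record_payments(conn, amounts):
    """Reuse one open connection for many transactions.

    The connection setup cost is paid once, not once per transaction.
    """
    for amount in amounts:
        # Used as a context manager, the connection commits on success
        # and rolls back if the block raises.
        with conn:
            conn.execute("INSERT INTO payments (amount) VALUES (?)", (amount,))


def total(conn):
    """Sum of all recorded amounts (0 when the table is empty)."""
    row = conn.execute("SELECT COALESCE(SUM(amount), 0) FROM payments").fetchone()
    return row[0]
```

With a networked database the saving is usually larger than with SQLite, since each new connection would also involve a network handshake and authentication.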

Finally, there is the case of scalability related to the overall size of the source code base. In these instances, good software engineering practices can help to minimize conflicts and to keep the code base clean. The book Large-Scale C++ Software Design was the first one I encountered that really took on the challenge of providing best practices for large source base software development. The book focuses on C++ as the implementation language, but the guidelines and practices can be applied to any project or language. Engineering scalability into this style of application involves making high-level decisions about the structure of the code to minimize dependencies within the code base (e.g. do not have a single .h that, when changed, will force a rebuild of the entire code base; use a build system that will reuse .o's whenever possible).

semiuseless 2009-07-02 13:09:55

+1 A:

Here are some great resources on web application scalability to get you started: Todd Hoff's highscalability.com, Scalable Internet Architectures by Theo Schlossnagle, and Building Scalable Web Sites by Cal Henderson. Highscalability.com will point you to a lot of presentations and articles well worth reading, including this one from Danga about how they scaled LiveJournal as it grew.

Jim Ferrans 2009-07-02 13:38:34

+1 A:

I think when you're talking about the web, you're mainly concerned with:

Partitioning your code such that it can, if necessary, be divided vertically (for one request) across many servers.
Adjusting your code such that all data (especially session data) is persisted in some sort of global store (like a database) rather than locally in the filesystem.
Load balancing.

With this, you can stretch one server into however many tiers (application tier, caching tier, database tier) and expand those horizontally should a scaling problem arise.
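A toy Python sketch of the second and third points follows. All class names are hypothetical, and the in-memory `SessionStore` only stands in for a real global store such as a database. Because no application server keeps session data locally, a simple round-robin balancer can send each request to any server.

```python
import itertools


class SessionStore:
    """Stand-in for a shared global store (e.g. a database)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def put(self, session_id, session):
        self._data[session_id] = session


class AppServer:
    """Stateless application server: session state lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id):
        session = dict(self.store.get(session_id))
        session["hits"] = session.get("hits", 0) + 1
        self.store.put(session_id, session)
        return self.name, session["hits"]


class RoundRobinBalancer:
    """Sends each request to the next server in turn."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def route(self, session_id):
        return next(self._cycle).handle(session_id)
```

In this sketch the hit counter for a session keeps increasing even though consecutive requests land on different servers, which is the property that lets the application tier grow horizontally.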

Stefan Mai 2009-07-02 13:45:18

A:

Scalability means that if the load or data size can be measured by some metric N (e.g. number of users or total number of daily transactions), and there is a fixed response-time requirement t, then the application can be reconfigured to handle an arbitrary N within the same response time t given an increase in resources of O(f(N)), where f(N) is a linear or close-to-linear function of N.

Typically this means that the application uses a distributed architecture, so that more servers (application servers, web servers, database servers) can be added linearly to handle more users. For example, to handle twice as many users, you would need twice as many database servers, web servers, machines, etc.

Even theoretically, strictly linear scaling is not usually possible, because distributing the requests usually requires a tree-like structure, which makes the scaling factor O(N * log(N)). In practice, because you can use a large branching factor in the tree and the distribution cost is small compared to the overall transaction cost, the log(N) factor is not significant.
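To put rough numbers on the log(N) factor, the following Python sketch computes how many levels a distribution tree needs to reach a given number of servers. The function name and the example figures are illustrative assumptions, not measurements from any real system.

```python
def tree_depth(leaf_count, branching_factor):
    """Smallest depth d such that branching_factor ** d >= leaf_count.

    Uses integer arithmetic to avoid floating-point log rounding.
    """
    if leaf_count < 1 or branching_factor < 2:
        raise ValueError("need leaf_count >= 1 and branching_factor >= 2")
    depth, capacity = 0, 1
    while capacity < leaf_count:
        capacity *= branching_factor
        depth += 1
    return depth


# One million servers behind a binary tree need 20 levels of routing,
# but only 3 levels if each distributor fans out to 100 children.
binary_levels = tree_depth(1_000_000, 2)
wide_levels = tree_depth(1_000_000, 100)
```

With a fan-out of 100, growing from ten thousand to one million servers adds a single routing level, which is why the logarithmic term rarely dominates.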

Larry Watanabe 2009-07-02 13:54:33
