I am choosing a platform for a web application.

I understand how cloud computing can scale front end servers, but what do they do with the database servers?

Is there something that the developer has to do to allow for this?

A: 

This depends on the database.

Slicehost uses MySQL Cluster, Google uses its much-hyped map-reduce approach, and others use something else again. It depends on the cloud provider and the database they use.

Others just provide a VM, and you set up your own database on virtual machines that have private IPs.

Aiden Bell
Google calls it BigTable.
Jonas Elfström
Google renames a lot of things :P
Aiden Bell
+1  A: 

Short Answer: Yes.

Long Answer: It depends. What kind of processing needs to be done? Can it be map-reduced? There are many solutions for this sort of thing. Distributed caching à la memcached can also help scale many backend services.
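To make the "can it be map-reduced?" question concrete, here is a minimal sketch in plain Python. The data and the function names (`map_partition`, `merge_counts`, `count_by_company`) are made up for illustration; each inner list stands in for the rows held by one database node. It is not how any particular provider implements it, only the shape of the idea.

```python
from collections import Counter
from functools import reduce

# Hypothetical data: each inner list stands in for rows stored on one node.
partitions = [
    ["acme", "globex", "acme"],
    ["globex", "initech"],
    ["acme"],
]

def map_partition(rows):
    """Map step: count rows per company using only one node's data."""
    return Counter(rows)

def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right

def count_by_company(parts):
    # Each map call is independent, so in a real system they could run
    # in parallel, one per node. Here they simply run in a loop.
    partials = [map_partition(p) for p in parts]
    return reduce(merge_counts, partials, Counter())

totals = count_by_company(partitions)
```

If a query can be split this way (independent work per node, then a cheap merge), it tends to scale out well. Queries that need every node to see every row do not fit this pattern as easily.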

Joey Robert
A: 

If you're using a cloud provider that simply gives you SSH access to a virtual box, you'll need to implement your own database scaling. If you run on Google App Engine, the Intuit Partner Platform or something similar, the scalability is built into the datastore provided to you.

Basically, there's nothing magical about cloud computing. In order to gain this built-in scalability, you give up some freedom. Google's datastore doesn't provide all the aspects of a full relational database, but you can scale to ridiculous amounts of traffic.
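As one illustration of what "implementing your own database scaling" can involve, here is a minimal sketch of hash-based sharding, where the application decides which database server holds a given key. Sharding is only one possible approach and is an assumption here, not something the platforms above are confirmed to do. The addresses in `SHARDS` and the `shard_for` helper are hypothetical.

```python
import hashlib

# Hypothetical private addresses of the database VMs.
SHARDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

def shard_for(key, shards=SHARDS):
    """Pick a shard from a stable hash of the key.

    hashlib is used instead of the built-in hash(), because hash() of a
    string is salted per process and would route the same key differently
    after a restart.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

# The application would open its connection to this host for user 42's rows.
host = shard_for("user:42")
```

This also shows the freedom you give up: a join or transaction that spans keys on different shards can no longer be handled by a single database server. Note too that with plain modulo routing, changing the number of shards moves most keys to a different server.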

Dan Lorenc
A: 

Amazon and Google use data stores, which are different from a traditional RDBMS.

You can find some more background information by following this link.

And you can find a short list of datastores here.

HeDinges
+1  A: 

In general, yes. A common way to scale a DB across multiple machines is to use a column store. That way, each column in a table can be stored on a separate machine, dramatically increasing the CPU power and bandwidth available for searching. Searches can also run in parallel: a search on the company column would only hit one server, so a simultaneous search on the year column would not be any slower.

From what I've read, this is similar to how Google's BigTable stores its data, with MapReduce being the framework used to process it in parallel.

The benefits section of Wikipedia's column store page is particularly informative.
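A small sketch may help show the difference between row and column layout. The sample rows and the helpers `to_columns` and `matching_row_ids` are invented for illustration; in a real column store each column list would live in its own file or on its own machine rather than in one Python dictionary.

```python
# Row-oriented layout: every record carries all of its fields.
rows = [
    {"company": "acme", "year": 2007, "revenue": 120},
    {"company": "globex", "year": 2008, "revenue": 95},
    {"company": "acme", "year": 2008, "revenue": 140},
]

def to_columns(records):
    """Pivot records into one list per column, keeping row order."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns

def matching_row_ids(column, predicate):
    """Scan a single column and return the positions of matching values."""
    return [i for i, value in enumerate(column) if predicate(value)]

columns = to_columns(rows)

# This search reads only the company column; year and revenue are untouched.
acme_ids = matching_row_ids(columns["company"], lambda v: v == "acme")

# Other columns are fetched only for the rows that matched.
acme_revenue = [columns["revenue"][i] for i in acme_ids]
```

The scan over `columns["company"]` never touches the other columns, which is why a search on one column need not slow down a search on another when they are stored separately.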

Along similar lines, OLAP is interesting. OLAP changes the read/write tradeoff completely: querying and reading are fast even for large and complicated queries, but writing new data requires a time-consuming rebuild process.
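The read/write tradeoff can be sketched with a toy precomputed aggregate table. This is a heavily simplified assumption about how an OLAP cube behaves, not a description of any real product; the `facts` data and `build_cube` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (company, year, revenue).
facts = [("acme", 2007, 120), ("globex", 2008, 95), ("acme", 2008, 140)]

def build_cube(rows):
    """Precompute revenue totals for every (company, year) combination.

    None in a key position means "all values" for that dimension.
    """
    cube = defaultdict(int)
    for company, year, revenue in rows:
        for key in ((company, year), (company, None), (None, year), (None, None)):
            cube[key] += revenue
    return dict(cube)

cube = build_cube(facts)

# Reads are single lookups, no matter how many fact rows exist.
acme_total_before = cube[("acme", None)]

# Writes are the expensive part: in this sketch new data only becomes
# visible after the whole cube is rebuilt.
facts.append(("initech", 2008, 60))
cube = build_cube(facts)
```

Answering "total revenue for 2008" is now `cube[(None, 2008)]`, a constant-time lookup, while every batch of new facts pays for a full pass over the data.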

shapr
A: 

As far as the how, I recently came upon a paper dedicated to exactly this. It was discussed in a lecture, so although I'm familiar with the paper's contents, I haven't read it myself. Still, the lecture had very interesting ideas: http://reports-archive.adm.cs.cmu.edu/anon/2008/CMU-CS-08-150.pdf

Junier