Hello

I have a question for the DBAs out there: if I scale from a single web/DB server setup to a two web/two DB server setup, with a load balancer in front of the web servers to route incoming queries evenly, how do solutions like MySQL Cluster work so that a change made on one DB server is immediately known to the other? Otherwise, users routed to the other DB server won't see the data, or will see outdated data. Failing that, how is the other web server at least made aware that it's reading "dirty data" and should try again in X seconds to get up-to-date data?

Thank you.

+1  A: 

There are two ways of doing this: Active/Active or Active/Passive. Active/Passive is the most prevalent. The data is kept in sync on the passive node. The cluster is a useful configuration inasmuch as, if the active node goes down, the passive node is switched in immediately, so there is no downtime. The clustering software continuously synchronises the two nodes in the cluster.
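To make the Active/Passive idea concrete, here is a toy sketch in Python. It is not how MySQL Cluster or SQL Server is implemented; `Node`, `ActivePassiveCluster`, and the `alive` flag are hypothetical names made up for illustration, and the "synchronisation" is just an in-memory copy. It only shows the shape of the behaviour: clients talk to the cluster, writes are copied to the standby, and the standby takes over when the active node fails.

```python
class Node:
    """A hypothetical database node holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ActivePassiveCluster:
    """Toy active/passive pair: clients talk to the cluster, never to a node."""

    def __init__(self, active, passive):
        self.active = active
        self.passive = passive

    def _ensure_active(self):
        # Fail over: promote the passive node if the active one is down.
        if not self.active.alive:
            if not self.passive.alive:
                raise RuntimeError("no healthy node available")
            self.active, self.passive = self.passive, self.active

    def write(self, key, value):
        self._ensure_active()
        self.active.data[key] = value
        # Synchronous copy to the standby so it is current at failover time.
        if self.passive.alive:
            self.passive.data[key] = value

    def read(self, key):
        self._ensure_active()
        return self.active.data[key]


cluster = ActivePassiveCluster(Node("db1"), Node("db2"))
cluster.write("user:1", "alice")
cluster.active.alive = False  # simulate a hardware failure on db1
value_after_failover = cluster.read("user:1")
active_after_failover = cluster.active.name
```

After the simulated failure, the read is served by the former standby, which already holds the row because every write was copied to it.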

I work with SQL Server, but I think the basic premise of clustering is the same for MySQL: no (or no noticeable) downtime on hardware failure.

EDIT: Additionally, the clustering software handles the synchronisation, so you don't need to worry about it. You view the cluster nodes as a virtual directory, which behaves like one server in Windows.

Here is a document explaining this:

http://www.sql-server-performance.com/articles/clustering/clustering_intro_p1.aspx

Stuart
At least a much lower probability of noticeable downtime. There's still the possibility of it.
Sev
A great deal lower probability than alternatives. This happened to me, no one noticed.
Stuart
+1  A: 

In Windows Server clustering (to be distinguished from High Performance Clustering), there is a shared external storage array. The active node takes ownership/control of the storage, and when that node fails, the storage 'fails over' to the previously passive node (which is now the active node). There are also different schemes that allow for independent storage at each node rather than shared storage. However, these require the application to have enough intelligence to know that it is clustered, and to keep the two storage sets in sync.
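As a rough illustration of the shared-storage case, the ownership hand-off could be modelled like this in Python. This is a simplified sketch, not the Windows clustering API: `SharedStorage`, `fail_over`, and the node names are hypothetical. The point is that there is only one copy of the data, so nothing has to be synchronised; only the right to use the storage moves.

```python
class SharedStorage:
    """Hypothetical external storage array; only the owning node may use it."""

    def __init__(self, owner):
        self.owner = owner
        self.blocks = {}

    def _check_owner(self, node):
        if node != self.owner:
            raise PermissionError(f"{node} does not own the storage")

    def write(self, node, key, value):
        self._check_owner(node)
        self.blocks[key] = value

    def read(self, node, key):
        self._check_owner(node)
        return self.blocks[key]


def fail_over(storage, failed_node, standby_node):
    """Move ownership of the storage from the failed node to the standby."""
    if storage.owner == failed_node:
        storage.owner = standby_node
    return storage.owner


storage = SharedStorage(owner="node-a")
storage.write("node-a", "order:42", "shipped")

# node-a fails; node-b takes over the same storage array.
new_owner = fail_over(storage, failed_node="node-a", standby_node="node-b")
value_seen_by_standby = storage.read("node-b", "order:42")

# The old owner can no longer touch the storage after failover.
try:
    storage.write("node-a", "order:42", "cancelled")
    old_owner_blocked = False
except PermissionError:
    old_owner_blocked = True
```

With independent storage at each node, by contrast, something would have to copy each write to both sets, which is the extra intelligence mentioned above.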

Jay
A: 

Clustering can also mean that a number of nodes handle the workload together. This is sometimes called an active/active cluster, i.e. all the nodes are active and share the workload. This is normally handled by specialist software such as Oracle RAC for the Oracle RDBMS. RAC allows Oracle to scale to very large workloads.

Daniel