tags:

views:

41

answers:

1

Hi!

I've read much of the MySQL Cluster docs and some tutorials yet I still have some things not clear, and the major of them right now are:

  1. When a data node restarts (crash and goes up again) will it's data still be available? Updates/Additions will work as usual? Will it "sync"?
  2. Does a cluster work faster than standalone? In particular, I update rows many times, but one at a time, meaning network latency might have impact on the performance. Is there any pattern I can follow to make things faster, such as adding more SQL nodes or adding mode data nodes?

Regarding question #2, an update of a row is in the following syntax:

UPDATE db_accounts.tbl_items SET items=items+%lld WHERE id_account=%u

"id_account" is an index (unique).

A: 

MySQL Cluster is an in-memory database (although some columns can be stored on disk, indexed columns cannot). If the plug gets pulled, your data goes away. The recovery process for a node re-joining the cluster is that it pulls the data it lost from a surviving node (requiring good fast links between nodes) and then carefully applies replication events until it catches up and can actively participate. If a recent backup is available, it can be rebuilt from that instead of another node, but the principle is the same: the node has to be repopulated with data from scratch.

MySQL Cluster is basically a distributed hash table. The NDB node that holds a particular row of data is determined by a hash algorithm applied to the primary key. Performance increases by adding nodes, assuming that your data spreads nicely across the nodes. Performance can be badly affected if queries have to touch multiple nodes - ie. complex joins - but is lightning fast for retrieving a specific row given its primary key.

Obviously, given that the nodes are distributed, a slow or congested network will badly affect performance.

Martin
Thanks a lot, I think that'll do!
Poni