Hi.

I'm looking to create a load-balanced MySQL cluster, but without the actual load-balancer, in order not to add another point of failure or complexity.

What I was thinking was to have the following:

1) Have a master-master setup for MySQL

2) On every client, place a simple round-robin proxy which would rotate the requests between servers.

Is this possible? Or are there better ways to achieve this?

Regards.

+2  A: 

Personally, I think the better way would be to use a load balancer!

Yes, it does add another point of failure, but any routine that you put in place, or install on EVERY client, adds a lot more complexity than a standard load balancer....

Mitchel Sellers
It makes sense, but the problem is the single point of failure - even with 2 LBs. If one of the clients goes down, only that client is impacted and no one else.
SyRenity
It's hard to maintain an LB on every node. If You install an LB on 12 servers and then want to change something (the address of one of the DBs, or add a DB, or something) - You will notice the trouble. I did.
Reef
+2  A: 

Master-master replication is not as good as You might think, and the same goes for the round-robin proxy and similar 'easy' solutions. If You commit colliding data to separate servers fast enough (faster than the delay between the servers, which on production servers might be up to a full second*), both will accept the data. If You have an auction server, You just sold the same car twice. Who bought it? It depends on which DB You ask!

The application must be aware that there are actually 2 databases out there, and it has to know both of their IP addresses. If You want to "sell", You should, for example, compute

DB_number = `auction_number` % `number_of_databases`

(% is the modulo operator)

... and commit it to the DB_number database. If You get a connection error, then perhaps do it with the other one (but in the case of an auction server, I'd just display an error).
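
As a minimal sketch of that routing idea (assuming plain JDBC; the host names and credentials below are placeholders, not from the thread):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class AuctionRouter {
        // Hypothetical master addresses - substitute Your real ones.
        private static final String[] DB_HOSTS = { "db0.example.com", "db1.example.com" };

        public static Connection connectFor(long auctionNumber) throws SQLException {
            // Every client picks the same master for a given auction number,
            // so the same car can never be sold on two different servers.
            int dbNumber = (int) (auctionNumber % DB_HOSTS.length);
            String url = "jdbc:mysql://" + DB_HOSTS[dbNumber] + ":3306/auctions";
            return DriverManager.getConnection(url, "app_user", "app_password");
        }
    }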

Also, the IP addresses should be wackamole-d between both servers. In a disaster scenario, where one database server goes down for a couple of hours at peak usage time, You will find that the application will try to connect to the absent server and hang until the timeout, say 3s. Suddenly half of Your queries run 3s longer (and they all go to the same database eventually - which doesn't make it run any faster than before the disaster). This doesn't make Your httpd happy, as it probably has a limited pool of concurrent request handler threads...
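
One way to soften that failure mode (a sketch, assuming Connector/J; the 3-second value and credentials are arbitrary examples) is to cap the connect timeout so a dead server fails fast instead of waiting for the default TCP timeout:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class FastFailConnect {
        public static Connection connect(String host) throws SQLException {
            // connectTimeout / socketTimeout are standard Connector/J URL properties (in ms).
            String url = "jdbc:mysql://" + host + ":3306/auctions"
                       + "?connectTimeout=3000&socketTimeout=10000";
            return DriverManager.getConnection(url, "app_user", "app_password");
        }
    }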

* replication delay on production servers might be up to a full second - I have tested this in a remote colocation and in our datacenter, and about 99% of the time it's 0, but sometimes mysql shows 1s. Under massive traffic I had many collisions due to the client application making two requests that resulted in two queries, an insert and a select. In some cases the row just wasn't there yet, so We used a hash of the userID and it fixed the problem.

I hope You will learn from my mistakes ;-)

Reef
Hi. Thanks for sharing. I thought about Wackamole, which is actually good for HA. My issue with it is that all the load would be on one of the master servers while the second would be idle, basically creating active/passive, whereas I'm looking for active/active. Perhaps it's better to place some light LB solution on each client, to allow it to switch requests between the servers? Any idea if such a tool exists?
SyRenity
If You need redundancy, then "one working, one idle" is good. Let's say one of the 2 servers dies (I remind You that You bought the other one so that if the first one breaks You can still function). If the second server can't handle all the traffic on its own, then it's for scale, not for HA! Also: relying only on Wackamole is a bad solution (ping ok != mysqld ok).
Reef
+1  A: 

Please read my other answer to this question before actually using a MySQL proxy of any kind. If You have 2 master-master servers that a CMS is writing to, and 10 httpds that only read from them, You'll be fine, but (as pointed out in the other answer) that's not always the case. You've been warned.

MySQL Proxy is a simple program that sits between your client and MySQL server(s) that can monitor, analyze or transform their communication. Its flexibility allows for unlimited uses; common ones include: load balancing; failover; query analysis; query filtering and modification; and many more.


HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications.

If You run it in TCP mode, it could be even better than Wackamole. If I had to choose between them, I would use HAProxy. Also, HAProxy can have a lot of backends, while Wackamole can have only 2. Note that HAProxy is "dumb": it connects sockets without looking at what's inside the stream - a dedicated MySQL proxy might, for example, have an option to route various requests to specified servers.
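
For illustration, a minimal haproxy.cfg sketch for balancing MySQL in TCP mode (addresses, timeouts and the check user are placeholders; mysql-check needs a recent HAProxy and a matching MySQL user):

    defaults
        mode tcp
        timeout connect 3s
        timeout client  60s
        timeout server  60s

    listen mysql
        bind *:3306
        balance roundrobin
        # mysql-check logs in as this user, so it notices a dead mysqld,
        # not just a dead machine (ping ok != mysqld ok)
        option mysql-check user haproxy_check
        server db0 10.0.0.10:3306 check
        server db1 10.0.0.11:3306 check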

Reef
Just to verify: 1) HAProxy would require an additional machine / 2 machines for HA? 2) Wackamole can only support 2 servers per setup? Regards.
SyRenity
Wackamole's standard usage pattern (in fact the only one that I know) is to have serverA and serverB watch each other and take the other one's IP if it dies. The Wackamole website says that it can be used to guard a pool of IPs... but I must say that Wackamole does not give the stability one would like, so I don't recommend that. As for HAProxy, You would put 2 of them on 2 dedicated machines for redundancy, or You could even put one on each node, as You said in the question. If Your queries are mostly reads, then I think it'll work pretty well.
Reef
Hi Reef. Just a last bit about Wackamole - in your experience, is it not stable enough even on two machines?
SyRenity
2 machines ping each other OK, but one of them has a load of 200, all CPUs at 100% usage, all RAM used, and MySQL has crashed - wackamole will NOT work there. HAProxy can check whether the remote APPLICATION is up; Wackamole only checks whether the server is up and application_uptime < server_uptime. We had many cases where We relied on wackamole and it let us down.
Reef
A: 

Splitting writes won't take the load off of the servers because the writes still must be replicated.

If you're using only 2 servers, use Heartbeat with DRBD and let DRBD handle the replication. If the first server fails, the second server will take over. If you would like to put the second server to use, you can use GFS over DRBD and then run the second server as read-only, using it as a read server. When failover occurs, change the server to read/write.

re: wackamole - wackamole isn't limited to 2 servers

I'm working on a tutorial series covering this, but it's really simple to set up.

Yes, in theory, wackamole can support more than 2 servers, but have You ever tried this in production? We did. We now regret it.
Reef
So far I've had no issues, other than the fact that I can't get it to compile under CentOS 5 64-bit.
A: 

A load-balanced MySQL (or other) database cluster is pretty futile. If you're writing to more than one server, then you will either run into trouble or have to use synchronous replication (which MySQL doesn't support anyway), and that hurts performance badly as it needs to synchronise locks.

I recommend you split read/write loads, load balance the reads amongst MySQL slaves, and have either a single master for writes or an active/passive failover pair for your master.
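
If the application happens to be Java, one way to sketch that split is Connector/J's ReplicationDriver, which sends read-only work to the slaves and everything else to the master (host names and credentials below are placeholders):

    import java.sql.Connection;
    import java.util.Properties;

    import com.mysql.jdbc.ReplicationDriver;

    public class ReadWriteSplit {
        public static void main(String[] args) throws Exception {
            // The first host in the URL is treated as the master, the rest as slaves.
            ReplicationDriver driver = new ReplicationDriver();
            Properties props = new Properties();
            props.put("user", "app_user");              // placeholder credentials
            props.put("password", "app_password");
            props.put("roundRobinLoadBalance", "true"); // spread reads across the slaves

            Connection conn = driver.connect(
                    "jdbc:mysql://master-host,slave-1,slave-2/dbname", props);

            conn.setReadOnly(false);   // statements go to the master
            // ... INSERT / UPDATE statements here ...

            conn.setReadOnly(true);    // statements are balanced across the slaves
            // ... SELECT statements here ...

            conn.close();
        }
    }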

Essentially, you cannot scale writes by putting more servers in a database as slaves, as each still has to write the entire write-load of your application.

To scale writes you need to split your data logically across multiple servers, by partitioning or "sharding" etc. This typically requires nontrivial (think very difficult to test) changes to your application, so you don't want to do this unless you REALLY need it.


You can of course use MySQL Cluster if you really want, but it's a completely different engine with its own features and drawbacks - it's a bit complicated to set up, but it really does provide an HA, load-balanced database on commodity hardware. It still suffers from the write performance penalties of synchronous replication, but it does allow you to scale writes, as it has built-in partitioning across servers.

MarkR
A: 

Connector/J has the ability to load balance queries across several servers. This is primarily intended for MySQL NDB Cluster, where all SQL nodes have a consistent view of the data, but if you can ensure that the data will stay reasonably consistent between the two masters, it might be safe for your application.

The connect string would look something like:

jdbc:mysql:loadbalance://host-1,host-2,... host-n/dbname?loadBalanceStrategy=random&loadBalanceBlacklistTimeout=5000
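
For completeness, a small example of opening such a connection from Java (credentials and the SELECT are placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class LoadBalancedClient {
        public static void main(String[] args) throws Exception {
            Class.forName("com.mysql.jdbc.Driver"); // older Connector/J needs explicit registration
            // Connector/J re-picks a host at transaction boundaries; hosts that fail
            // are blacklisted and retried after loadBalanceBlacklistTimeout ms.
            String url = "jdbc:mysql:loadbalance://host-1,host-2/dbname"
                       + "?loadBalanceStrategy=random&loadBalanceBlacklistTimeout=5000";
            Connection conn = DriverManager.getConnection(url, "app_user", "app_password");
            Statement st = conn.createStatement();
            ResultSet rs = st.executeQuery("SELECT 1");
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
            rs.close();
            st.close();
            conn.close();
        }
    }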

Matt Montgomery