PDA

View Full Version : Database replication, multiple clusters, hardware loadbalancers



prosperent brian
07-08-2011, 02:44 PM
Infrastructure is probably one of the most difficult topics to find information on, so hopefully I can share some of my findings and help someone else down the road with the same issues.

Right now, prosperent has 13 dual quad core xeon "application servers" sitting behind a software haproxy loadbalancer on a dedicated server, then backed with a large "image" storage array, and finally a large mysql database server.

Our application servers connect to the large image array via nfs mounts which serve our source code from a single point. This is great for deployment, but gives us a single point of failure (downtime of the image server node).

MySQL used to be another point of issue. We had a single MySQL server with a large raid array for redundancy. We currently have two large MySQL servers in a master/slave configuration so there are always two live copies of all of our data. On top of that, the master server is backed up every 3 hours and the slave hourly. This gives us expansion for reads, but we would have to shard the database to get more write capacity.

This is where cassandra comes in. 99 percent of our write load is analytics. These queries can be pushed to a cassandra cluster that can scale linearly for writes and also handle multi datacenter replication which will be important down the road.


NOW, the changes. The software load balancer (haproxy) that we use now works fine, but a single point of failure doesn't make me happy. It's easy enough to divert traffic to our second load balancer, but there are other issues deeper in the system, so the new infrastructure will keep a global layer 7 hardware load balancer in front of everything. These typically have 5 9's of uptime, so we're looking good already. Sitting behind the hardware load balancer will now be multiple complete clusters in multiple datacenters. Each arm of the cluster will have a haproxy dedicated load balancer at the very front taking requests from the hardware load balancer. The haproxy nodes can then monitor all of the servers under them and shift traffic to different datacenters/clusters based on network latency, and server failure. If an entire node fails, we know at a very high level and can reroute all traffic to the other clusters with no downtime.

Each of these clusters will be complete with application servers, an image server for images, and our nfs mounts, and of course a load balancer to make sure it is all up and running.

Behind the clusters are several large mysql servers in multi master configuration. While you can only write to one node at a time, you can run selects across all of them, and further, if our primary mysql node goes down, we can failover to one of the slaves in under 1 second.

The cassandra clusters will all be built per datacenter and configured to replicate data within the datacenter as well as to each other datacenter. Again, no single point of failure.

Now that we have this mapped out, we start the migration. Luckily, all of the big pieces are already in place. All that is left is setting up the second "image" server which is underway now, then we configure the second loadbalancer and switch dns over to our hardware load balancer. The rest is already in place and running, and the last bits can be completed with zero downtime.

What this means to you? Even if an entire arm of our cluster goes down, we still have capacity left in the other arms to serve up requests with minimal or zero downtime.

Alright, back to linux shell to make this all happen :)

AcidRaZor
07-08-2011, 03:01 PM
What do you guys use to backup mysql on the live databases?

prosperent brian
07-08-2011, 03:09 PM
percona xtrabackup. It's the best thing since sliced bread. It doesn't lock any tables unlike mysqldump. Instead, it notes the current binary log position when it starts, copies all of the raw innodb files to a new directory, then applies the rest of the binary log updates to the files, compresses the results and finishes. When you do a restore, you just delete your entire mysql directory, copy over the raw files from the backup and start mysql. It does a crash recovery at startup and you are up and running. It literally takes less than 5 minutes for us to backup a couple hundred GB of data with our micron ssd's. A restore can be done in 5-10 minutes as well. Pretty amazing piece of software.

AcidRaZor
07-08-2011, 03:14 PM
Sweet, thanks for the tip

AcidRaZor
07-08-2011, 04:04 PM
Do you guys run "prepare" each time you backup or do you leave it out of the 3 hour cycle?

prosperent brian
07-08-2011, 04:24 PM
No, we leave it out. Not really necessary.

mister
07-09-2011, 10:40 AM
Thanks for the share Brian :)

Pretty incredible infrastructure!