Server virtualization is a big thing these days, so I've been tasked at work with installing some of our software on a virtualized server to see what happens. Long story short: an rsync transfer promptly brings the virtualized server to its knees. The virtualization host is a beefy machine with no other load, so I don't think this should be happening. top shows high load averages and CPU iowait near 100%. There's a huge bottleneck somewhere.

I'm more of a programmer than a sysadmin, and I lack the knowledge to go about fixing this beyond random Googling. I suspect I'm not alone in this.

What I'd like to see here is general advice on virtualization, and pointers to good articles and other resources, which I and others could use to educate ourselves.

  • What tools (even standard unix tools) can be used to pinpoint bottlenecks?
  • What metrics should be followed to ensure things run smoothly?
  • What kind of things can be efficiently virtualized?
  • What kind of setups are doomed to fail?

I apologize for the broadness of the question. I just don't have the knowledge to ask useful specific questions about this.

Edit: More on my specific problem:

  • Xen paravirtualization, 3 CentOS guests
  • All guests on local SCSI disks behind a full hardware RAID controller
  • rsyncd running on one guest OS, transfer initiated from a remote non-virtualized server over a 100 Mbps LAN

Like I said before, I really can't provide a ton of useful data. I'm not expecting a direct solution to this problem; I'd be happy with pointers on where to start building the skill set required to better understand these kinds of problems.

+2  A: 

I use rsync to keep some parts of our (very new) virtual environment in sync without any issues. I don't think it's a virtualization issue as much as it is an I/O issue, which you appear to have already identified.

I've found that virtualization is very, very taxing on hard disks, and this only gets worse the more guests you have on the host box. For machines that are very I/O intensive, consider segmenting their disk access away from the other guests. Are you using any kind of SAN technology? We've found that to be very useful at my workplace (we're using two 8-core Sun Intel servers and a 1 TB, 12-disk iSCSI array).

Is your hardware fully supported by the virtualization software provider? If you're trying to run on unsupported hardware then there's a good chance that your disk controller is not going to be using the best drivers, which would explain your slow disk accesses.

You can use iostat on Linux/Unix to get some feedback on the I/O, and there's iotop too, though it's not packaged in many distros yet.
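As a rough illustration of where the iowait figure comes from, here is a minimal Python sketch that computes it from two samples of the aggregate `cpu` line in `/proc/stat`. It assumes a Linux guest whose kernel exposes the iowait counter as the fifth value on that line; the function names are made up for this example and are not part of any tool.

```python
import time


def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat as ints."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    counters = [int(value) for value in fields[1:]]
    if len(counters) < 5:
        raise ValueError("this kernel does not report an iowait counter")
    return counters


def iowait_percent(before, after):
    """Share of CPU time spent waiting on I/O between two samples, in percent."""
    # Order: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already included in user time, so only the first eight
    # counters are summed to avoid counting it twice.
    start = parse_cpu_line(before)[:8]
    end = parse_cpu_line(after)[:8]
    deltas = [e - s for s, e in zip(start, end)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval_s=5.0, path="/proc/stat"):
    """Take two samples interval_s seconds apart and return the iowait percentage."""
    with open(path) as handle:
        first = handle.readline()
    time.sleep(interval_s)
    with open(path) as handle:
        second = handle.readline()
    return iowait_percent(first, second)
```

Calling `sample_iowait()` inside the guest while the rsync transfer is running should give a number comparable to the "wa" value in top. A value that stays high there suggests the guest is waiting on disk rather than being short of CPU.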

mlambie
I'll look into SAN, thanks. The host machine has SCSI drives with a full hardware RAID controller. I'm afraid I don't know the specifics; like I said, I'm not a sysadmin.
Internet Friend
+1  A: 

I was going to put this in a comment, but I think it's more useful in the open:

Could you add more detail about your setup:

  • Which VM server?
    (VMware Server, VMware ESX, MS VirtualServer, MS Hyper-V, something else?)
  • Which OS for the guest(s)?
    (Windows, Linux, 32-bit, 64-bit?)
  • Where are the guest(s) stored?
    (Local disk, or on a NAS or SAN?)
  • Were you rsyncing between guests on the same VM server, or between a guest and a physical server?
  • If across a network, how fast is the network?

Performance tuning in any environment is 90% collecting data and 10% analysis. Virtualized environments have more variables to consider than physical environments, but more importantly, they have a different response curve. Some applications can perform better in a virtualized environment than in a physical one; others will not. You have to understand the requirements of the application as well as the constraints of the implementation.

I don't believe that there are any software-only applications that cannot be successfully deployed on virtual servers, if you pay attention to the details. (Applications that require custom hardware that can't be successfully virtualized are a different problem.)
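To make the "collecting data" part concrete, here is a small Python sketch that estimates how busy each block device was between two snapshots of `/proc/diskstats`, which is roughly what iostat reports as %util. It assumes a Linux kernel that writes at least eleven statistics per device line; the function names are hypothetical.

```python
def parse_diskstats(text):
    """Map device name -> total milliseconds the device has spent doing I/O."""
    busy_ms = {}
    for line in text.splitlines():
        fields = line.split()
        # Layout: major minor name, then the statistics. The tenth statistic
        # (index 12 overall) is the time spent doing I/O, in milliseconds.
        if len(fields) < 14:
            continue  # short lines (e.g. partitions on older kernels) are skipped
        busy_ms[fields[2]] = int(fields[12])
    return busy_ms


def utilization(before, after, interval_s):
    """Percentage of the interval each device was busy, keyed by device name."""
    start = parse_diskstats(before)
    end = parse_diskstats(after)
    result = {}
    for name, end_ms in end.items():
        if name in start:
            result[name] = 100.0 * (end_ms - start[name]) / (interval_s * 1000.0)
    return result
```

To use it, read `/proc/diskstats` twice, a known number of seconds apart, and pass both snapshots along with that interval. Taking the same measurement on the host and inside each guest can help show whether one guest is saturating the shared disks.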

Craig Trader
At the moment I'm still very much stuck on the 'collecting data' part, as I don't have experience with the tools and I don't know exactly where to look. But your post has me thinking that perhaps I should read up on general performance tuning first and forget about the virtualization aspect for now.
Internet Friend
A: 

Offhand, I'd say this is an I/O problem. In virtual environments, one of the biggest factors affecting performance is the state of the host machine's disk. The things that we do to optimize performance are:

  1. Fixed disk allocation. This way you get a contiguous block of drive space for the VM to live in.
  2. Schedule a defrag of the VM slice OS and the host server drive. Fragmentation is your enemy.
  3. Be sure that you are ending any server sessions gracefully. When bouncing the virtual OS, DO NOT just shut down the VM slice, as this causes huge disk fragmentation. The virtual OS needs to perform the shutdown/restart process itself.
David Robbins