I am in the process of setting up a storage network for our VMware ESXi servers. It is set up as follows:

  • Dell P200 Dual-Core Xeon @ 3GHz server running Openfiler 2.3
  • Dell MD1000 DAS Unit
  • Perc 6 SAS Controller, 512MB Battery Backed Cache
  • 7 x 2TB SAS 7.2K Disks in RAID-10 (one hot-spare)
  • 8 x Gigabit Ethernet Ports (Intel 1000/Pro cards)

The Ethernet ports are directly attached (no switch) to the VMware servers. It is presently set up for two servers, but will eventually serve 4 or 5, with a single Ethernet port to each, or more if greater network performance is later required.

Jumbo frames at 9000 bytes are enabled and working (I have tested this) on the vmkernel and virtual interfaces, and on Openfiler at the other end.
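For reference, a test along these lines confirms the 9000-byte MTU end to end (the IP addresses are just placeholders for our storage subnet, and the -d don't-fragment flag may not exist in every version of vmkping):

# From the ESXi console: 8972-byte payload (9000 minus 28 bytes of IP/ICMP headers), don't fragment
vmkping -d -s 8972 192.168.10.1

# From the Openfiler side back towards the vmkernel port
ping -M do -s 8972 -c 4 192.168.10.10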

I am testing the setup. I am not expecting remarkable speeds; it will be used for a variety of development systems, and we don't need huge throughput. I do, however, want to get as close to the maximum speed of the hardware as possible.

Ideally, we would use NFS, because this gives us some useful advantages for backup, and also allows moving VMs between servers extremely easily without needing fancy vSphere licenses, which dwarf the cost of the hardware. I am considering iSCSI too, but we lose the management advantage. I am also not overly impressed with the VMFS filesystem so far.

Ok, here is my problem.

I run performance tests directly in the VMware 'unsupported' console using dd commands. This allows me to see the raw throughput from the array in different configurations.

I then run the same dd command on the Openfiler machine to see the raw performance I am achieving from the array without the network stage.

The problem is that the disks are not being fully utilised; there is a bottleneck somewhere. I can see this is the case when I look at iostat on the Openfiler box.
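For reference, the view that shows this is something like the following (assuming a reasonably recent sysstat on the Openfiler box):

# Extended per-device stats in MB, refreshed every 5 seconds
iostat -xm 5
# The interesting columns are %util (how busy the device is),
# await (average I/O wait in ms), and rMB/s / wMB/s.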

If I use iSCSI, the bottleneck seems to be the VMFS filesystem. I can tell this because if I use a raw device in a guest VM, performance is much faster. Same iSCSI target, nothing else going on, and performance is better when bypassing VMFS. The difference is astounding: 12MB/s reading through VMware directly, versus 120MB/s with the Linux guest pointing at a raw VMware device. I have read that VMFS is a very slow filesystem, so this does not surprise me. What does surprise me is the NFS performance.

I have run several tests with commands like the following, averaging over three runs of the same command:

time dd if=test1 of=/dev/null bs=8k count=2000000

This is a 15,625MB file, and here are the timings with the various methods of connecting to the share on the Openfiler box:

  • iSCSI: 20m 54s (12.46MB/s)
  • NFS: 12m 24s (20MB/s)
  • iSCSI from Guest OS: 2m 30s (109MB/s)
  • NFS from Guest OS: 3m 23s (80.3MB/s)
  • Raw disk speed (no network): 29s (400MB/s)
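As a sanity check on the read numbers, a variant of the same dd that avoids the page cache can be run like this (iflag=direct needs a reasonably recent GNU dd; the figures above are from the plain command):

# Drop the page cache first so the read actually hits the array
sync; echo 3 > /proc/sys/vm/drop_caches

# Or bypass the cache entirely with direct I/O
time dd if=test1 of=/dev/null bs=8k count=2000000 iflag=direct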

Write performance is faster:

time dd if=/dev/zero of=test1 bs=8k count=2000000

  • iSCSI: 6m 12s (42MB/s)
  • NFS: 8m 12s (31.75MB/s)
  • iSCSI from Guest OS: 2m 19s (112MB/s)
  • NFS from Guest OS: 3m 6s (88.3MB/s)
  • Raw disk speed (no network): 1m 8s (229.77MB/s)
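Similarly for writes, the numbers above include whatever the page cache and the PERC's battery-backed cache absorb; forcing the data out is the stricter version of the same test (again assuming GNU dd):

# Timing includes a final fdatasync, so the data has actually left RAM
time dd if=/dev/zero of=test1 bs=8k count=2000000 conv=fdatasync

# Stricter still: direct I/O on every 8k write
time dd if=/dev/zero of=test1 bs=8k count=2000000 oflag=direct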

It seems to me that VMware is causing a huge performance hit by the way it mounts the NFS and iSCSI storage. What gives? I don't seem to have any option as to how I mount the NFS share through ESXi; it just chooses mount options for me. iSCSI is obviously not a direct comparison, as I am also comparing VMFS to ext3. Even more strangely, the write performance is massively better than the read performance.
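The only options I can control are on the Openfiler side. An export stanza along these lines is what I mean (the path and subnet are placeholders, and whether async is appropriate here is a separate question even with the battery-backed cache):

# /etc/exports on the Openfiler box (illustrative values only)
/mnt/vg0/vmstore  192.168.10.0/24(rw,no_root_squash,async,no_subtree_check)

# Re-export and confirm the options actually in effect
exportfs -ra
exportfs -v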

I know the network is bottlenecking the disk speed, but this does not explain the differences between the different mounting options, which is what I am trying to understand.
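To rule the wire itself in or out, a raw TCP test between the two ends is probably the cleanest check; something like iperf (run from a guest VM or the Openfiler box, since getting extra tools onto the ESXi console is awkward) shows what a single gigabit link can actually carry:

# On the Openfiler box
iperf -s

# From a guest VM on the same storage link: 30-second test, 4 parallel streams
iperf -c 192.168.10.1 -t 30 -P 4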

When I choose larger block sizes, performance increases. I chose 8k here because this is the typical Oracle block size, and I intend to use the storage network for Oracle data files (not transaction logs, just data files).
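For completeness, the block-size effect can be reproduced with a quick loop like this on the write side (each pass writes roughly 2GB; the read equivalent just swaps if and of):

# Roughly 2GB per pass, count adjusted so the total size stays comparable
for args in "8k 262144" "64k 32768" "256k 8192" "1M 2048"; do
    set -- $args
    echo "== bs=$1 =="
    dd if=/dev/zero of=testfile bs=$1 count=$2 2>&1 | tail -1
    rm -f testfile
done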

Any ideas why VMware is creating such an artificial bottleneck here?

Note: it is not the speed of the disks or the array that is the problem; you can see the raw cached speeds are high enough. It also doesn't appear to be a fundamental network issue, since it is clearly possible to achieve 112MB/s from within a guest VM. I just want to know what is bottlenecking it so much when accessing the shares through ESXi.
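One thing that might help pin the blame is watching latency from the ESXi side with esxtop while a dd runs, since it splits latency into time spent at the device versus time added inside the VMkernel (my reading of the fields, which may be slightly off, and I am not sure how much of this is visible for NFS datastores on this ESXi version):

# On the ESXi console, during a dd run
esxtop
# Press 'd' for the disk-adapter view (or 'u' for per-device).
# DAVG/cmd = latency at the device/array, KAVG/cmd = latency added by the
# VMkernel itself, GAVG/cmd = total latency as seen by the guest.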

It's not CPU either; there are plenty of spare cycles on both boxes, and it is hardly stressing the Openfiler server. Any bright ideas?