Performance Recommendations for Virtualizing Zimbra with VMware vSphere

Introduction

VMware vSphere’s capability to deliver computing and IO resources far exceeds the resource requirements of most x86 applications, including Zimbra. This is what allows multiple application workloads to be consolidated onto the vSphere platform and benefit from reduced server cost, improved availability, and simplified operations.

However, there are some common misconfiguration and design issues that administrators often encounter when virtualizing applications, especially enterprise workloads with higher resource demands than smaller departmental workloads.

We have compiled a short list of essential best practices and recommendations to ensure a high-performing Zimbra deployment on the vSphere platform. We have also provided a list of highly recommended reference material for building and deploying a vSphere platform with performance in mind, as well as for troubleshooting and resolving performance-related issues.

CPU Resources

  • Make sure hardware-assisted virtualization is enabled in the BIOS of your hardware platform.
  • Make sure CPU/MMU Virtualization is configured correctly in your VM for your hardware platform.
    • To configure CPU/MMU Virtualization: vSphere Client -> ‘myZimbraVM’ -> Summary Tab -> Edit Settings -> Options -> CPU/MMU Virtualization
  • Reduce the number of vCPUs allocated to your Zimbra VM to the fewest required to sustain your workload. Over-allocating vCPUs causes unnecessary CPU scheduling overhead and idle time on the physical host. When memory and disk resources are sized appropriately, Zimbra is not a CPU-bound workload. If your Zimbra VM sustains less than 45% utilization during peak workloads, we recommend reducing the allocated vCPUs to half the current number.
  • If you see periods of high, sustained CPU utilization on your Zimbra VM, the cause may actually be memory backpressure or a poorly performing disk subsystem. It is recommended to first increase the memory allocated to your Zimbra VM (and, as a Java workload best practice, set the VM memory reservation equal to the total memory allocated to Zimbra Mailbox VMs). Then monitor VM CPU utilization, VM disk IO, and in-guest swapping (which can cause excessive disk IO) for signs of improvement or other issues before increasing the number of vCPUs allocated to your Zimbra VM. A quick in-guest check for swapping is sketched after this list.
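
A minimal sketch of that in-guest check, assuming a standard Linux guest OS on the Zimbra VM with the usual vmstat utility available:

# Inside the Zimbra guest OS: sample memory and swap activity every 5 seconds, 12 times.
# Sustained non-zero values in the 'si'/'so' (swap-in/swap-out) columns under load
# indicate the VM is swapping and likely needs more memory before it needs more vCPUs.
vmstat 5 12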

NUMA

Non-Uniform Memory Access (NUMA) is a memory architecture used in multi-processor systems. A NUMA node consists of a processor and the bank of memory local to that processor. In a NUMA architecture, a processor can access its own local memory faster than non-local memory (memory local to another processor). A phenomenon known as NUMA “crosstalk” occurs when a processor accesses memory local to another processor, incurring a performance penalty.

VMware ESX is NUMA aware and will schedule all of a VM's vCPUs on a ‘home’ NUMA node. However, if the VM container size (vCPU and RAM) is larger than the size of a NUMA node on the physical host, NUMA crosstalk will occur. It is recommended, but not required, to configure your maximum Zimbra VM container size to fit on a single NUMA node.

For example:

  • ESX host with 4 sockets, 4 cores per socket, and 64 GB of RAM
  • NUMA nodes are 4 cores with 16 GB of RAM (1 socket and local memory)
  • Recommended maximum VM container is 4 vCPU with 16GB of RAM
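
As a rough check, a sketch using esxtop on the ESX host (assuming the NUMA statistics fields are available in your ESX build):

# On the ESX host: start esxtop and switch to the memory view.
esxtop            # press 'm' for the memory view, then 'f' to enable the NUMA statistics fields
# Watch NHN (the VM's NUMA home node) and N%L (percentage of the VM's memory that is
# local to that node); N%L consistently below 100 suggests NUMA crosstalk.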

CPU Over Commit

It is OK to overcommit CPU resources; it is not OK to over-utilize them. In other words, you can allocate more virtual CPUs (vCPUs) than there are physical cores (pCores) in an ESX host, as long as the aggregate workload does not exceed the physical processor's capabilities. Over-utilizing the physical host can cause excessive wait states for VMs and their applications while the ESX scheduler is busy scheduling processor time for other VMs.

Zimbra is not CPU bound when disk and memory resources are sized correctly. It is perfectly fine to overcommit vCPUs to pCores on ESX hosts where Zimbra workloads will be running. However, in any overcommitted deployment it is recommended to monitor host CPU utilization and VM ready time, and to use the Distributed Resource Scheduler (DRS) to load balance VMs across hosts in a vSphere cluster.

VM Ready Time, host CPU utilization, and other important resource statistics can be monitored using ESXtop or from the Performance tab in the vSphere Client. You can also configure Alarms and Triggers to email administrators and perform other automated actions when performance counters reach critical thresholds that would affect the end user experience.
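
A sketch of checking the key counter from the ESX host with esxtop (assuming console or resxtop access):

# On the ESX host: open the CPU view and watch ready time per VM.
esxtop            # press 'c' for the CPU view
# %RDY is the percentage of time a VM's vCPUs were ready to run but could not be
# scheduled on a physical core. As a common rule of thumb, sustained %RDY above
# roughly 5-10% per vCPU indicates CPU contention on an over-utilized host.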

Memory Resources

  • It is recommended to size VM memory so that it does not exceed the amount of memory local to a single NUMA node. For example:
    • ESX host with 4 sockets, 4 cores per socket, and 64 GB of RAM
    • NUMA nodes are 4 cores with 16 GB of RAM (1 socket and local memory)
    • Recommended maximum VM container is 4 vCPU with 16GB of RAM
  • Set the memory reservation for your Zimbra mailbox server (MBS) VMs to the total amount of memory allocated to the VM. For example, if you allocate 8192MB of memory to the Zimbra Mailbox VM, set the memory reservation to 8192MB; a matching .vmx sketch follows this list.
    • To configure memory reservations: vSphere Client -> ‘myZimbraVM’ -> Summary Tab -> Edit Settings -> Resources - > Memory -> Reservation
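
For reference, the same settings appear in the VM's .vmx file. A minimal sketch for an 8 GB Zimbra Mailbox VM, assuming standard ESX 4 .vmx parameter conventions (edit the file only while the VM is powered off, or use the vSphere Client path above):

# Allocated memory (MB) and a matching full reservation for the Zimbra Mailbox VM.
memsize = "8192"
sched.mem.min = "8192"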

Network Resources

  • Use the VMXNET3 paravirtualized network adapter if supported by your guest Operating System.
  • Use separate physical NIC ports, NIC teams, and VLANs for VM network traffic, vMotion, and IP based storage traffic (i.e. iSCSI storage or NFS datastores). This will avoid contention between client/server IO, storage IO, and vMotion traffic.
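
A sketch of the corresponding host-side separation, using the standard ESX 4 esxcfg-* service console commands with hypothetical vSwitch, port group, and VLAN values:

# List existing vSwitches, port groups, and uplinks.
esxcfg-vswitch -l

# Example separation (hypothetical names/VLANs): VM traffic stays on vSwitch0, while
# vMotion and IP storage get their own port groups, uplinks, and VLANs.
esxcfg-vswitch -A "vMotion" vSwitch1          # add a vMotion port group
esxcfg-vswitch -v 20 -p "vMotion" vSwitch1    # tag it with VLAN 20
esxcfg-vswitch -A "IPStorage" vSwitch2        # add an IP storage port group
esxcfg-vswitch -v 30 -p "IPStorage" vSwitch2  # tag it with VLAN 30

# In the guest, VMXNET3 is selected per virtual NIC, e.g. ethernet0.virtualDev = "vmxnet3"
# in the .vmx file (or via Edit Settings in the vSphere Client).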

Storage Resources

  • Do not oversubscribe VMFS datastores. Disk IO latency is primarily determined by storage design and has the same impact on Zimbra performance when virtualized as it does when running natively. Design your Zimbra storage with the appropriate number of spindles to satisfy the IO requirements of the Zimbra DBs, indexes, redologs, blob stores, etc.
  • Insufficient memory allocation can cause excessive memory swapping and disk IO. See the memory resources section of the Performance Troubleshooting for VMware vSphere 4 guide for information on tuning VM memory resources.
  • Use the PVSCSI paravirtualized SCSI adapter if supported by your guest Operating System.
  • There is no performance benefit to using RDM devices versus VMFS datastores. It is recommended to use VMFS datastores unless you have specific storage vendor requirements to support hardware snapshots or replications in a virtual environment.
  • Configure your Zimbra VM’s VMDK disk devices as eager-zeroed thick so that each block is zeroed when the VMDK is created. By default, new thick VMDK disk devices are created lazy-zeroed, which causes duplicate IO the first time each block is written: the block is zeroed first, then your application data is written. This can add significant overhead for disk IO intensive applications.
    • To configure eager-zeroed thick VMDK disk devices:
      • Check the box ‘Support clustering features such as Fault Tolerance’ when creating the VM. This does not enable FT, but it does eager-zero the disks.
      • Or, from the ESX CLI:
vmkfstools -k /vmfs/volumes/path/to/vmdk
  • If using Fibre Channel storage, configure the maximum queue depth on the FC HBA, as sketched after this list.
  • Do not oversubscribe network interfaces or switches when using IP based storage (i.e. iSCSI or NFS). Use EtherChannel with ESX NIC teams and IP storage targets, or 10 GbE, if storage IO requirements exceed a single 1 Gb network interface.
  • Use dedicated physical NIC ports, teams, and VLANs for IP based storage traffic (i.e. iSCSI storage or NFS datastores). This avoids contention between client/server IO, storage IO, and vMotion traffic.
  • Use jumbo frames to increase storage IO throughput and performance when using IP based storage (i.e. iSCSI or NFS); see the sketch after this list.
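
A sketch of the last two items, assuming a QLogic FC HBA and the standard ESX 4 service console commands (adjust module names, values, and MTU support for your hardware and your storage vendor’s guidance):

# Set the maximum queue depth on a QLogic FC HBA (Emulex and other HBAs use different
# module names and options), then reboot the host for the change to take effect.
esxcfg-module -s ql2xmaxqdepth=64 qla2xxx

# Enable jumbo frames for IP based storage: raise the MTU on the vSwitch and create the
# VMkernel interface with a matching 9000-byte MTU (hypothetical names and addresses).
# The physical switches and storage array must also be configured for jumbo frames end to end.
esxcfg-vswitch -m 9000 vSwitch2
esxcfg-vmknic -a -i 192.168.30.10 -n 255.255.255.0 -m 9000 "IPStorage"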

Reference Materials

Performance Best Practices for VMware vSphere 4.0

VMware vSphere 4 Performance with Extreme I/O Workloads

Performance Troubleshooting for VMware vSphere 4

Understanding Memory Resource Management in VMware ESX Server

Comparison of Storage Protocol Performance in VMware vSphere 4

Best Practices for Running vSphere on NFS Storage

Configuration Maximums for VMware vSphere 4.0

What's New in VMware vSphere 4: Performance Enhancements

Verified Against: Zimbra Collaboration Suite 6.0.x and VMware vSphere 4.0
Date Created: 5/15/2010
Article ID: https://wiki.zimbra.com/index.php?title=Performance_Recommendations_for_Virtualizing_Zimbra_with_VMware_vSphere
Date Modified: 2010-05-15


