Ajcody-Notes-ServerPlanning

Server Performance Planning & Infrastructure - The Big Picture

Actual Server Planning Homepage

Please see Ajcody-Notes-ServerPlanning

Initial Comments

These are just my random thoughts. They have not been peer-reviewed yet.

Actual DR and server restore issues are listed elsewhere, in my Ajcody-Server-Topics page.

Items For Your Review


Redundancy - High Availability - And Other Related Questions

One might ask, "Is there any other way to have redundant zimbra email servers?"

The short answer is No, but if you have millions of dollars you can get really close.

Remember, redundancy always comes with a dollar amount cost. And this dollar amount will determine your range of options to implement redundant technologies or processes.

Redundancy isn't magical: every choice has a "time" for the failover operation to complete, and the application being used can restrict what can be done or lengthen that "time". The duration of that operation is the window where the service/server is "unavailable" from the client's point of view.

So break the server and services down into components and put them in a chart or spreadsheet. This will help make sure you're not "over-engineering" the system.

For example, disk and data.

Disks can use different RAID levels to provide redundancy. Channels to the disk (remote) can also have redundant paths. The disk chassis can have redundant power supplies which go to different UPSs on different circuit breakers. The data can also be sent to tape, rsync'd/copied to another disk subsystem on another server, or "flashed" (snapshotted) to another location if your filesystem or SAN unit supports it. A lot has to fail for you to completely lose the data. When exploring these items, you want the copies, rsyncs, and snapshots to travel over different paths than the "production" path - for example, have your network backup occur on a different ethernet device.

Two big-picture objectives for "redundancy" are data redundancy for DR situations and application redundancy for service availability. Our "cluster" situation is an active-passive arrangement for the purposes of the mailstore application layer. The RAID level of the disks on the SAN serves the "redundancy" in regards to data recovery for DR.

When you introduce the objective of having off-site redundancy, the costs and issues become huge. Remote locations introduce speed issues for data transfers, which will also impact performance for most applications as they try to stay in sync between the two machines for write and roll-back purposes.

My intent wasn't so much to give specific answers to this question but rather to demonstrate that to answer these questions you have to get down to the real specifics - it's simply impossible to answer the broad question of "how do I make redundant servers". It would take a book to fully explore all the issues and possibilities, and it would still come down to how much money you have to spend on the project. I used to work with High Performance Clusters and HA Linux+Oracle+SAP global deployments with TBs of data - this issue would arise daily for us in those environments.

HA Clustering Software

From the Single Node Cluster Installation Guide - Rev 1 June 2008:

For ZCS 5.0.7 and later
For cluster integration to provide high availability, Zimbra Collaboration Suite (ZCS) can integrate with either of the following:
  • Red Hat® Enterprise Linux® Cluster Suite version 4, Update 5 or later update release. In the single-node cluster implementation, all Zimbra servers are part of a cluster under the control of the Red Hat Cluster Manager.
Note: Red Hat Cluster Suite consists of Red Hat Cluster Manager and Linux Virtual Server Cluster. For ZCS, only Red Hat Cluster Manager is used. In this guide, Red Hat Cluster Suite refers only to Cluster Manager.
  • Veritas™ Cluster Server by Symantec (VCS) version 5.0 with maintenance pack 1 or later.

References:

CPU And Motherboards

Other references: Performance_Tuning_Guidelines_for_Large_Deployments#RAM_and_CPU

This is your mail server; there is most likely no higher-profile application in your environment than this box. You'll want to think 3 to 5 years down the road when you spec out the box. Make sure to confirm the following (a few quick hardware-inspection commands follow this list):

  • It will scale or offer:
    • Hotpluggable technology
      • You'll need to confirm the OS can handle this
    • Redundant and Hot-swap power supplies
      • You might need to upgrade your power supplies depending on the "accessories" you put in it.
    • CPU's (There is NO reason to use a 32bit chip - the memory limitations will kill you)
      • By default, a 32-bit Linux kernel only allows each process to address 2GB of space. Through PAE (Physical Address Extension), a feature available in some CPUs, and a special 32-bit kernel that supports a large address space for processes, it is possible to get a 32-bit kernel that really uses > 4GB of RAM, and get a per-process 3-4GB address range
    • Memory
      • What are your onboard cache options?
      • How many slots are available? Does it force you to buy large memory sticks to max out total memory? (This increases cost.)
      • Can you mix & match memory sticks of different sizes? (This affects cost when you go to scale.)
      • Understand how memory interacts with multiple CPUs if your server has them
      • Understand the front side bus (FSB)
      • Does the motherboard support detecting and offlining bad memory?
    • Have growth for expansion cards and the right kind
      • What is PCI Express?
      • How PCI Express Works?
      • Know what slots you need for your network card, raid card, san card, etc.
        • Then do a sanity check against the motherboard and understand whether the CPU and channels can allow the full throughput from all channels.
      • Is there room for redundancy with your expansion cards?
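
A quick way to sanity-check a candidate box against the list above is with standard Linux inspection tools. A minimal sketch, assuming a reasonably current distribution (output formats vary by vendor and release):

# confirm the kernel is 64-bit and look at CPU model and core count
uname -m
grep -c processor /proc/cpuinfo
grep "model name" /proc/cpuinfo | sort -u
# list physical expansion slots and what's on the bus (dmidecode needs root)
dmidecode -t slot
lspci | grep -i -e ethernet -e raid -e fibre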

Memory

Other references: Performance_Tuning_Guidelines_for_Large_Deployments#RAM_and_CPU

When I was working with HPC, I found there was never a good reason NOT to start with at least 32GB of RAM. Now, a mail server isn't an HPC compute node - I understand that. But I would still try to spec out a system that has at least 8 memory slots (usually a ratio of x slots per CPU on the system) but allows me to use 4 4GB DIMMs, giving me 16GB of memory, with an upgrade path of 4 more 4GB or 8GB DIMMs. Get the higher-speed DIMMs; you can't mix memory of different speeds.
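
To see what a box actually has populated before you plan that upgrade path, the commands below are a minimal sketch (dmidecode needs root and working SMBIOS data; the exact fields reported vary by vendor):

# overall memory, then per-slot DIMM size, speed, and slot label
free -m
dmidecode -t memory | grep -E "Size|Speed|Locator"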

Chart Of Memory Speeds
Memory Interconnect Buses Bit Bytes
PC2100 DDR-SDRAM (single channel) 16.8 Gbit/s 2.1 GB/s
PC1200 RDRAM (single-channel) 19.2 Gbit/s 2.4 GB/s
PC2700 DDR-SDRAM (single channel) 21.6 Gbit/s 2.7 GB/s
PC800 RDRAM (dual-channel) 25.6 Gbit/s 3.2 GB/s
PC1600 DDR-SDRAM (dual channel) 25.6 Gbit/s 3.2 GB/s
PC3200 DDR-SDRAM (single channel) 25.6 Gbit/s 3.2 GB/s
PC2-3200 DDR2-SDRAM (single channel) 25.6 Gbit/s 3.2 GB/s
PC1066 RDRAM (dual-channel) 33.6 Gbit/s 4.2 GB/s
PC2100 DDR-SDRAM (dual channel) 33.6 Gbit/s 4.2 GB/s
PC2-4200 DDR2-SDRAM (single channel) 34.136 Gbit/s 4.267 GB/s
PC4000 DDR-SDRAM (single channel) 34.3 Gbit/s 4.287 GB/s
PC1200 RDRAM (dual-channel) 38.4 Gbit/s 4.8 GB/s
PC2-5300 DDR2-SDRAM (single channel) 42.4 Gbit/s 5.3 GB/s
PC2-5400 DDR2-SDRAM (single channel) 42.664 Gbit/s 5.333 GB/s
PC2700 DDR-SDRAM (dual channel) 43.2 Gbit/s 5.4 GB/s
PC3200 DDR-SDRAM (dual channel) 51.2 Gbit/s 6.4 GB/s
PC2-3200 DDR2-SDRAM (dual channel) 51.2 Gbit/s 6.4 GB/s
PC2-6400 DDR2-SDRAM (single channel) 51.2 Gbit/s 6.4 GB/s
PC4000 DDR-SDRAM (dual channel) 67.2 Gbit/s 8.4 GB/s
PC2-4200 DDR2-SDRAM (dual channel) 67.2 Gbit/s 8.4 GB/s
PC2-5300 DDR2-SDRAM (dual channel) 84.8 Gbit/s 10.6 GB/s
PC2-5400 DDR2-SDRAM (dual channel) 85.328 Gbit/s 10.666 GB/s
PC2-6400 DDR2-SDRAM (dual channel) 102.4 Gbit/s 12.8 GB/s
PC2-8000 DDR2-SDRAM (dual channel) 128.0 Gbit/s 16.0 GB/s
PC2-8500 DDR2-SDRAM (dual channel) 136.0 Gbit/s 17 GB/s
PC3-8500 DDR3-SDRAM (dual channel) 136.0 Gbit/s 17 GB/s
PC3-10600 DDR3-SDRAM (dual channel) 165.6 Gbit/s 21.2 GB/s
PC3-12800 DDR3-SDRAM (dual channel) 204.8 Gbit/s 25.6 GB/s

Bus For Expansion Cards [Peripheral buses]

Typical uses are for network, SAN, and RAID cards. I'll deal with them separately under each section below.

Chart Of Bus Speeds
Interconnect Max speed (MB/s) Comments
PCI 2.0 132.0 MB/s
PCI 2.1 264.0 MB/s
PCI 2.2 528 MB/s
PCI-X 1.0 1 GB/s
PCI-X 2.0 4 GB/s
PCI-E (Express) Ver. 1.1 250 MB/s x2 - bi-directional These speeds are bi-directional per "lane". Meaning that they are the same going both ways and not shared.
Ver. 1.1 @ 1x 256 MB/s x2 - bi-directional PCI-E Ver. 1.1 notes.
Ver. 1.1 @ 2x 512 MB/s x2 - bi-directional PCI-E Ver. 1.1 notes.
Ver. 1.1 @ 4x 1 GB/s x2 - bi-directional PCI-E Ver. 1.1 notes.
Ver. 1.1 @ 8x 2 GB/s x2 - bi-directional PCI-E Ver. 1.1 notes.
Ver. 1.1 @ 16x 4 GB/s x2 - bi-directional PCI-E Ver. 1.1 notes.
PCI-E (Express) Ver. 2.0 400 MB/s x2 - bi-directional 500 MB/s, but there's a 20% overhead hit. These speeds are bi-directional per "lane". Meaning that they are the same going both ways and not shared.
Ver. 2 @ 1x 400 MB/s x2 - bi-directional PCI-E Ver. 2 notes.
Ver. 2 @ 4x 1600 MB/s x2 - bi-directional PCI-E Ver. 2 notes.
Ver. 2 @ 8x 3200 MB/s x2 - bi-directional PCI-E Ver. 2 notes.
Ver. 2 @ 16x 6400 MB/s x2 - bi-directional PCI-E Ver. 2 notes.
PCI-E (Express) Ver. 3.0 1 GB/s x2 - bi-directional The final spec is due in 2009. These speeds are bi-directional per "lane". Meaning that they are the same going both ways and not shared.
Front-side Bus replacements below
HyperTransport 1.x 6.4 GB/s x2 - bi-directional Bidirectional per 32-bit link at 800MHz. Front-side bus replacement, see HyperTransport for more details.
HyperTransport 2.0 11.2 GB/s x2 - bi-directional Bidirectional per 32-bit link at 1.4GHz. Front-side bus replacement, see HyperTransport for more details.
HyperTransport 3.0 20.8 GB/s x2 - bi-directional Bidirectional per 32-bit link at 2.6GHz. Front-side bus replacement, see HyperTransport for more details.
HyperTransport 3.1 25.6 GB/s x2 - bi-directional Bidirectional per 32-bit link at 3.2GHz. Front-side bus replacement, see HyperTransport for more details.
QuickPath Interconnect 12.8 GB/s x2 - bi-directional Intel competitor to HyperTransport. Everything you wanted to know about QuickPath (QPI)

Network Infrastructure

Network Cards

Most motherboards will have integrated ethernet ports. If they use the same chipset, you might want to just channel bond these. What I have done in the past is use them as management ports and add network cards for production activity.

Ideally, I would buy two dual-port gigabit (Gb) cards. I would then channel bond two of the ports across the cards, for two separate bonds. I would use one of those bonds for "front facing" traffic and the other for "backup" traffic. Remember to consider the bus infrastructure and other cards when deciding which ones to get.

Channel Bonding

This will require you to confirm your switches can do this. There are different choices when it comes to channel bonding and some of them require switch support - it usually involves "bonding" on the switch side by configuring the ports in question.
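
As a minimal sketch of the Linux side of a bond on a RHEL-style system - the bond1/eth2/eth3 names and addresses are assumptions, 802.3ad (LACP) mode needs the switch-side configuration mentioned above, and on older releases the mode/miimon options go in /etc/modprobe.conf instead of BONDING_OPTS:

# /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
IPADDR=192.168.10.20
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=802.3ad miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth2 (repeat for eth3)
DEVICE=eth2
MASTER=bond1
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none

After a network restart, cat /proc/net/bonding/bond1 will show the bond mode and the link status of each slave, which is a quick way to confirm the failover behavior before you rely on it.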

The Production Channel Bond

The "production" bond is more in regards to failover. Make sure this happens as expected through the application layer. That proxies, firewall, and so forth allow on of the ports in the channel bond to go down without any end-user impact.

The Backup Channel Bond

The backup bond is for throughput. You'll want to map out the network path and switches to the end host the data is being moved to. You'll also need to confirm the network path actually gives you a gain in performance by doing the channel bonding.

This will give you an excellent way to off-load your tape backups, rsyncs (as shown at Ajcody-Notes-ServerMove), and maybe NFS mounts if you're using them.

Disk Infrastructure

Other references: Performance_Tuning_Guidelines_for_Large_Deployments#Disk

Disk Types

References


Description of Items Used Below For Specific HDD's:

  • Notable Items
    • Benefits
    • Downsides
  • Bus Interface
    • Types:
    • Maximum Devices:
    • Bus Speeds:
  • Performance
    • I/Os per second & Sustained Transfer Rate:
    • Spindle Speed: This is the speed the drive's platters actually spin at. This ranges from 5400rpm to 15,000rpm. The higher the speed, the more often the data on the disk will be in the right position to be read by the drive heads, and the faster data can be transferred.
    • Average Access Time: This is the average time it takes to position the heads so that data can be read. The faster the better.
    • Cache Size: This is the size of the cache on board the disk drive itself. This does reach a point where doubling the size generates a very small boost in performance and is not worth the cost, but generally the bigger the better. It also assists in small bursts of data where you may actually achieve near maximum performance as data is read from cache rather than the drive.
    • Internal Transfer Rate: This is the speed that data can be transferred within the drive. This speed will be higher than the actual transfer rate of the drive as there is some overhead for protocol handling as data is transferred to the SCSI or IDE bus.
    • Latency: the time it takes for the selected sector to be positioned under the read/write head. Latency is directly related to the spindle speed of the drive and as such is influenced solely by the drive's spindle characteristics. (Average rotational latency is half a revolution - for example, about 2ms at 15,000rpm and about 4.2ms at 7200rpm.)
  • Reliability (MTBF or unrecoverable read error rate)
  • Capacity
  • Price
Ultra DMA ATA HDD's
  1. Notable Items
    • Benefits
    • Downsides
  2. Bus Interface
    • Types : ATA
  3. Performance
    • I/Os per second & Sustained transfer rate:
      1. Ultra DMA ATA 33 - 264 Mbit/s 33 MB/s
      2. Ultra DMA ATA 66 - 528 Mbit/s 66 MB/s
      3. Ultra DMA ATA 100 - 800 Mbit/s 100 MB/s
      4. Ultra DMA ATA 133 - 1,064 Mbit/s 133 MB/s
    • Spindle Speed:
    • Average Access Time:
    • Cache Size:
    • Internal Transfer Rate:
    • Latency:
  4. Reliability (MTBF or unrecoverable read error rate)
  5. Capacity
  6. Price
SATA HDD's
  1. Notable Items
    1. Benefits
      • SATA drives typically draw less power than traditional SAS HDDs due to slower RPM speeds.
      • SATA drives have the best dollar per gigabyte compared to SAS drives.
      • SATA HDDs can work on a SAS interface.
    2. Downsides
      • SATA HDDs are single port and not capable of being utilized in dual port environments without the addition of an interposer designed for this purpose.
  2. Bus Interface Type
    • Types: SATA , SAS
  3. Performance
    • I/Os per second & Sustained transfer rate:
      1. Serial ATA (SATA-150) - 1,500 Mbit/s 187.5 MB/s
        • Real speed: 150 MB/s
      2. Serial ATA (SATA-300) - 3,000 Mbit/s 375 MB/s
        • Real speed: 300 MB/s
        • (alternate names: SATA II or SATA2)
      3. Serial ATA (SATA-600) - 4,800 Mbit/s 600 MB/s
        • I've seen this listed as well though, SATA 6.0 Gbit/s (SATA 6Gb/s).
        • Standard is expected to be available before the end of 2008.
    • Spindle Speed: 7200 RPM , 5400 RPM
    • Average Access Time:
    • Cache Size:
    • Internal Transfer Rate:
    • Latency:
  4. Reliability (MTBF or unrecoverable read error rate)
  5. Capacity
  6. Price
SCSI (Parallel SCSI) HDD's
  1. Notable Items
    • Benefits
    • Downsides
  2. Bus Interface Type
    • Types: SCSI
    • Maximum Devices (On Single Channel):
      • Ultra Wide SCSI - 16
      • Ultra2 SCSI - 8
      • Ultra2 Wide SCSI - 16
      • Ultra3 SCSI - 16
        • (alternate names: Ultra-160, Fast-80 wide)
      • Ultra-320 SCSI - 16
        • (alternate name: Ultra4 SCSI)
      • Ultra-640 SCSI - 16
    • Bus Speeds:
      • The total amount of data that can be transferred throughout the whole channel.
      • Data transfers will step down to the rated speed of the drive. If you have an Ultra160 HDD on an Ultra320 controller, it will operate at 160 MB/s.
      • SCSI-3 : Also known as Ultra SCSI and fast-20 SCSI. Bus speeds at 20 MB/s for narrow (8 bit) systems and 40 MB/s for wide (16-bit).
      • Ultra-2 : Also known as LVD SCSI. Data transfer to 80 MB/s.
      • Ultra-3 : Also known as Ultra-160 SCSI. Data transfer to 160 MB/s.
      • Ultra-320 : Data transfer to 320 MB/s.
      • Ultra-640 : Also known as Fast-320. Data transfer to 640 MB/s.
  3. Performance
    • I/Os per second & Sustained Transfer Rate:
      • Ultra Wide SCSI 40 (16 bits/20MHz) - 320 Mbit/s 40 MB/s
        • Real speed: 40 MB/s
      • Ultra2 SCSI
        • Real speed: 40 MB/s
      • Ultra2 Wide SCSI 80 (16 bits/40 MHz) - 640 Mbit/s 80 MB/s
        • Real speed: 80 MB/s
      • Ultra3 SCSI 160 (16 bits/80 MHz DDR) - 1,280 Mbit/s 160 MB/s
        • (alternate names: Ultra-160, Fast-80 wide)
        • Real speed: 160 MB/s
      • Ultra-320 SCSI (16 bits/80 MHz DDR) - 2,560 Mbit/s 320 MB/s
        • (alternate name: Ultra4 SCSI)
        • Real speed: 320 MB/s
      • Ultra-640 SCSI (16 bits/160 MHz DDR) - 5,120 Mbit/s 640 MB/s
        • Real speed: 640 MB/s
    • Spindle Speed:
    • Average Access Time:
    • Cache Size:
    • Internal Transfer Rate:
    • Latency:
  4. Reliability (MTBF or unrecoverable read error rate)
  5. Capacity
  6. Price
SAS (Serial Attached SCSI) HDD's
  1. Notable Items
    1. Benefits
      • SAS HDDs are true dual port, full duplex devices. This means SAS HDDs can simultaneously process commands on both ports.
      • All SAS HDDs are hot-swap capable. Users can add or remove an HDD without disrupting the enterprise environment.
      • SAS HDDs can support online firmware update (check vendor). This allows users to update firmware on the SAS HDD without having to schedule downtime.
    2. Downsides
      • SAS HDDs cannot be used on the older architecture SCSI backplanes or cables.
      • SAS HDDs typically draw more power than the equivalent SATA counterparts.
  2. Bus Interface Type
    • Types: SAS
    • Maximum Devices:
      • SAS 16,256
        • 128 devices per port expanders
    • Bus Speeds:
  3. Performance
    • I/O per second & Sustained transfer rate:
      • Serial Attached SCSI (SAS) - 3,000 Mbit/s 375 MB/s
        • Real Speed: 300 MB/s (full duplex , per direction)
      • Serial Attached SCSI 2 - 6,000 Mbit/s 750 MB/s
        • Planned
    • Spindle Speed: 15000 RPM , 10000 RPM , 7200 RPM
    • Average Access Time:
    • Cache Size:
    • Internal Transfer Rate:
    • Latency:
  4. Reliability (MTBF or unrecoverable read error rate)
    • 10K & 15K SAS HDDs have been rated at 1.6 million hours MTBF
  5. Capacity
  6. Price
Fibre Channel - Arbitrated Loop (FC-AL) HDD's
  1. Notable Items
    1. Benefits
      • FC-AL HDDs are dual ported, providing two simultaneous input/output sessions
      • FC-AL HDDs are hot-swap capable so users can add and remove hard drives without interrupting system operation.
    2. Downsides
      • FC-AL HDDs are typically utilized in unique environments and are not compatible with SAS or SATA interfaces
      • Long term, SAS is projected to replace FC-AL HDDs within the IT industry
  2. Bus Interface Type
    • Types: FC-AL
    • Maximum Devices:
      • FC-AL in a private loop with 8-bit ID's - 127
      • FC-AL in a public loop with 24-bit ID's - +16 million
    • Bus Speeds:
  3. Performance
    • I/Os per second & Sustained transfer rate:
      • FC-AL 1Gb 100 MB/s (full duplex , per direction)
      • FC-AL 2Gb 200 MB/s (full duplex , per direction)
      • FC-AL 4Gb 400 MB/s (full duplex , per direction)
    • Spindle Speed:
    • Average Access Time:
    • Cache Size:
    • Internal Transfer Rate:
    • Latency:
  4. Reliability (MTBF or unrecoverable read error rate)
  5. Capacity
  6. Price

Raid Cards (Not Raid Level)

  • If the RAID chip on your motherboard goes out, what is your expected downtime to resolve it?
    • Isn't it easier to buy a hot-spare RAID card and swap it in if a RAID card fails?

SAN Topologies - Interfaces

General Questions
SAN Cards
  • Have you planned on multipathing?
    • You do have more than one port/card, right?
      • If not (because of cost), do you at least have an open slot on the motherboard to allow another one?
      • Remember to consider this with your SAN switch purchase
    • Confirm that your HBAs, HBA drivers, and SAN switches allow for the options you want
    • Multipathing can be for failover
    • Multipathing can increase throughput for the same partition mount
Multipathing
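
On Linux hosts, device-mapper multipath covers both the failover and the throughput cases raised above. A minimal sketch - package names, defaults, and recommended settings vary by distribution and by storage vendor, so treat this as illustrative and check your array vendor's multipath.conf guidance:

# /etc/multipath.conf - minimal example
defaults {
    user_friendly_names yes
}

# start the daemon and list the discovered paths
service multipathd start
multipath -ll

multipath -ll should show each LUN once with the paths from both HBA ports grouped under it; if you only see one path per LUN, revisit the HBA, zoning, and switch questions above.
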
iSCSI Interface

So far, physical devices have not featured native iSCSI interfaces on a component level. Instead, devices with SCSI Parallel Interface or Fibre Channel interfaces are bridged by using iSCSI target software, external bridges, or controllers internal to the device enclosure. Source

  • iSCSI over Fast Ethernet 100 Mbit/s 12.5 MB/s
  • iSCSI over Gigabit Ethernet 1,000 Mbit/s 125 MB/s
  • iSCSI over 10G Ethernet (Very few products exist) 10,000 Mbit/s 1,250 MB/s
  • iSCSI over 100G Ethernet (Planned) 100,000 Mbit/s 12,500 MB/s
Serial Attached SCSI (SAS) Interface
  1. Notable Items
    1. Benefits
      • SAS interface protocol supports both SAS and SATA Hard Disk Drives allowing tiered storage management.
      • SAS supports seamless scalability through port expansion enabling customers to daisy chain multiple storage enclosures.
      • SAS supports port aggregation via a x4 wide-link for a full external connection bandwidth of up to 12.0 Gbps (1200MBps) on a single cable and a single connection.
      • SAS is a point-to-point interface allowing each device on a connection to have the entire bandwidth available. The current bandwidth of each SAS port is 3Gb/sec. with future generations aimed at 6Gb/sec and beyond.
    2. Downsides
      • SAS is not backwards compatible with U320 SCSI or previous SCSI generations
Fibre Channel - Arbitrated Loop (FC-AL)
  1. Notable Items
    1. Benefits
      • FC-AL devices can be dual ported, providing two simultaneous input/output sessions that doubles maximum throughput
      • FC-AL enables "hot swapping," so you can add and remove hard drives without interrupting system operation, an important option in server environments.
    2. Downsides
      • FC-AL adapters tend to cost more than SAS adapters
      • FC-AL is currently the fastest interface at 4Gb but is expected to be passed in maximum bandwidth by the next generation of SAS interface at 6Gb

Raid Levels For Disks

References:

Raid 10

RAID 10 (or 1+0) uses both striping and mirroring.

Reference - http://bugzilla.zimbra.com/show_bug.cgi?id=10700 - I fixed the statement based upon a private comment in this RFE.

Zimbra recommends Raid 10 for the database and Lucene volumes.
Raid 10 is NOT Raid 0+1
RAID 0+1 takes 2 RAID0 volumes and mirrors them. If one drive in one of the underlying RAID0 groups dies, that RAID0 group is unreadable. If a single drive simultaneously dies in the other RAID0 group, the whole RAID 0+1 volume is unreadable.
RAID 1+0 takes "mirrored slices" and builds a RAID0 on top of them. A RAID10 volume can lose a maximum of half of its underlying disks at once without data loss (as long as only one disk from each "slice pair" is lost).
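
As a minimal sketch of the concept, here is RAID 10 built with Linux software RAID, assuming four unused local disks sdb through sde (most deployments will do this on a hardware RAID or SAN controller instead, but the layout is the same):

# create a 4-disk RAID 10 array and watch it sync
mdadm --create /dev/md1 --level=10 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
cat /proc/mdstat
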
Raid 5

RAID 5 (striped disks with parity) combines three or more disks in a way that protects data against loss of any one disk; the storage capacity of the array is reduced by one disk.

Reference - http://bugzilla.zimbra.com/show_bug.cgi?id=10700

Zimbra does NOT recommend RAID 5 for the database and Lucene volumes (RAID 5 has poor write performance, and as such is generally not recommended by MySQL, Oracle, and other database vendors for anything but read-only datastores. We have seen an order of magnitude performance degradation for the embedded Zimbra database running on RAID 5!).
Raid 6

RAID 6 (less common) can recover from the loss of two disks. It is basically an extension of RAID 5.

Filesystem Choices

Other references:

Bugs/RFE's that reference filesystems:

Most configurations go with ext3 because it's the implied default. My own experience has me using XFS everywhere except for the root ( / ) partition.
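
As an illustrative example of that choice, creating and mounting an XFS filesystem for /opt/zimbra might look like the following - the LVM volume name is an assumption, and noatime is a common mail-store mount tuning that you should verify against the Performance Tuning guide for your release:

# make the filesystem and mount it
mkfs.xfs /dev/vg_zimbra/lv_opt_zimbra
mount -o noatime /dev/vg_zimbra/lv_opt_zimbra /opt/zimbra

# matching /etc/fstab entry
/dev/vg_zimbra/lv_opt_zimbra  /opt/zimbra  xfs  noatime  0 0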

Volume Managers
LVM / LVM2 (Linux)
EVMS (IBM's Enterprise Volume Management System)
Veritas Volume Manager
Regular File Systems
XFS
EXT3
Reiser
Veritas
Network File Systems
NFS
Samba - SMB/CIFS
Global , Distributed , Cluster-Type Filesystems
GFS
Lustre [Acquired by Sun]
Hadoop Distributed File System (HDFS) - Cluster Filesystem Project From Apache & Yahoo!
SGI CXFS
IBM GPFS
Veritas Storage Foundation Cluster File System
Other Products - Applications (DRBD, Veritas Volume Replicator, etc.)


Example Filesystem Layouts

Local (Server) Disk Configuration

I would have 3 disk subsystems & partitions at a minimum.

OS
  1. For OS / - this could be whatever - local mirrored SATA/SCSI drives
    • A. Make sure /tmp has enough space for operations
    • B. Setup a proper mirror for /
      • 1. Most likely you'll use Linux software RAID for this (see the sketch after this list)
        • A. Make sure to put in a commented configuration line in /etc/fstab to use the non-active mirror
      • 2. Most motherboards will have at least two SATA ports and drive slots.
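
A minimal sketch of that OS mirror with Linux software RAID, assuming the first partition of two local disks is set aside for / (most installers can build this for you during setup):

# mirror sda1 and sdb1 into md0 for the root filesystem
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
# check mirror health at any time
cat /proc/mdstat
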
Backup
  1. /opt/zimbra/backup
    • A. I would make sure the disk I/O is separate from /opt/zimbra . This way you minimize performance hits to your end-users. Review that the disk I/O bus path is as clean as possible to the CPU/memory. Motherboard specs should tell you which "slots" are on shared buses. Make sure you're maximizing your RAID/SAN card's performance on the bus path to the CPU.
    • B. I would make this some multiple of the /opt/zimbra space that you're spec'ing. The default backup schedule is to purge sessions older than one month, which means you'll have 4 fulls and 24 incremental sessions on disk.
    • C. If disks are local
      • 1. Raid Choices
        • A.
        • B.
      • 2. LVM
        • A.
        • B.
      • 3. Other Topics
        • A.
        • B.
    • D. If disks are on SAN
      • 1. Raid Choices
        • A.
        • B.
      • 2. LVM
        • A.
        • B.
      • 3. Other Topics
        • A.
        • B.
Zimbra
  1. /opt/zimbra
    • A. Remember you have logging data in here as well. If this partition becomes full, Zimbra will hang and it could cause database corruption as well.
    • C. If disks are local
      • 1. Raid Choices
        • A.
        • B.
      • 2. LVM
        • A.
        • B.
    • D. If disks are on SAN
      • 1. Raid Choices
        • A.
        • B.
      • 2. LVM
        • A.
        • B.
      • 3. Other Topics
        • A.
        • B.

SAN Layout As Recommend For Clusters

This is from Multi Node Cluster Installation Guide - PDF

Preparing the SAN

You can place all service data on a single volume or choose to place the service data in multiple volumes. Configure the SAN device and create the partitions for the volumes.

  • Single Volume SAN Mount Point - /opt or /opt/zimbra
    • If you choose to configure the SAN as one volume with subdirectories, all service data goes under a single SAN volume.
  • Multiple Volumes For SAN Mount Points
    • If you choose to partition the SAN into multiple volumes, the SAN device is partitioned to provide multiple volumes for each Zimbra mailbox server in the cluster. Examples of the types of volumes that can be created follow (a hypothetical single-server fstab sketch follows this list).
      • /opt Volume for ZCS software (or really, /opt/zimbra/ )
        • Directories under /opt/zimbra/
          • conf Volume for the service-specific configuration files
          • log Volume for the local logs for Zimbra mailbox server
          • redolog Volume for the redo logs for the Zimbra mailbox server
          • db/data Volume for the MySQL data files for the data store
          • store Volume for the message files
          • index Volume for the search index files
          • backup Volume for the backup files
          • logger/db/data Volume for the MySQL data files for logger service’s MySQL instance
          • openldap-data Volume for OpenLDAP data
  • Note: for a multi-volume SAN cluster, you'll actually create the directory paths differently. [ /opt/zimbra-cluster/mountpoints <clusterservicename.com> ] Please see the Cluster Installation Guides for the full planning recommendations and steps if this is what you're going to do.
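
For comparison, on a single (non-clustered) mailbox server the same split is just a set of mount points under /opt/zimbra. A hypothetical /etc/fstab sketch - the device names, filesystem, and the exact set of volumes are assumptions to illustrate the layout, not a recommendation:

/dev/mapper/san-zimbra    /opt/zimbra          xfs  noatime  0 0
/dev/mapper/san-db        /opt/zimbra/db       xfs  noatime  0 0
/dev/mapper/san-index     /opt/zimbra/index    xfs  noatime  0 0
/dev/mapper/san-store     /opt/zimbra/store    xfs  noatime  0 0
/dev/mapper/san-redolog   /opt/zimbra/redolog  xfs  noatime  0 0
/dev/mapper/san-backup    /opt/zimbra/backup   xfs  noatime  0 0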

Virtualization

Please see Ajcody-Notes#Virtualization_Issues

What About Backups? I Need A Plan

Please Review Other Backup-Restore As Well

Please review the Ajcody-Backup-Restore-Issues section as well.

What Might Be Wrong?

First thing to check is the log file and see if anything notable stands out.

grep -i backup /opt/zimbra/log/mailbox.log

Use Auto-Group Backups Rather Than Default Style Backup

Having trouble completing that entire full backup during off-hours? Enter the hybrid auto-grouped mode, which combines the concept of full and incremental backup functions - you’re completely backing up a target number of accounts daily rather than running incrementals.

Auto-grouped mode automatically pulls in the redologs since the last run, so you get incremental coverage of the remaining accounts, although the accounts captured via the redologs are not listed specifically in the backup account list. This still allows you to do a point-in-time restore for any account.

Administrative manual page:

http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html

Compare the sections called "Standard Backup Method" & "Auto-Grouped Backup Method"

http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html#1080744

Configuration details are here on that page:

http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html#1084777

Good explanation:

http://www.zimbrablog.com/blog/archives/2008/08/recent-admin-tidbits-part-1.html

Simply divide your total accounts by the number of groups you choose (zimbraBackupAutoGroupedNumGroups is 7 by default) and that's how many will get a full backup session each night - for example, 7,000 accounts across 7 groups means roughly 1,000 full account backups per night. Newly provisioned accounts, and accounts whose last backup is older than a specified number of days, are picked first. (zimbraBackupAutoGroupedInterval defaults to 1d.)

Think of auto-grouped mode as a full backup for the scheduled group as well as an incremental (via redologs) for all other accounts at the same time.
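
Switching modes is done with the global config attributes shown in the "Some Variables For Auto-Group" section below. Roughly, as the zimbra user (confirm the exact values and any per-server overrides in the administration guide for your release):

zmprov mcf zimbraBackupMode Auto-Grouped
zmprov mcf zimbraBackupAutoGroupedNumGroups 7
zmprov mcf zimbraBackupAutoGroupedInterval 1d
# verify
zmprov gacf | grep Backup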

Enabling Auto-Group For Backups - Schedule In Crontab

Make sure your backup schedule was adjusted for this change (zmschedulebackup, which basically writes zimbra's crontab). The url above mentions this and gives an example.

First, basic reference:

  • Run zmschedulebackup without arguments to see your current schedule.
  • Here is a schedule without auto-grouped backups enabled:
    • The Default Schedule For Normal Backups [ zmschedulebackup --help ]:
f 0 1 * * 6
i 0 1 * * 0-5
d 1m 0 0 * * *
    • The crontab would look like [ zmschedulebackup -D ]:
0 1 * * 6 /opt/zimbra/bin/zmbackup -f -a all
0 1 * * 0-5 /opt/zimbra/bin/zmbackup -i -a all
0 0 * * * /opt/zimbra/bin/zmbackup -del 1m
  • Here is a schedule with auto-grouped enabled:
    • The Default Schedule For Auto-Group Backups [ zmschedulebackup --help ]:
f 0 1 * * 0-6
d 1m 0 0 * * *
    • The crontab would look like [ zmschedulebackup -D ]:
0 1 * * 0-6 /opt/zimbra/bin/zmbackup -f
0 0 * * * /opt/zimbra/bin/zmbackup -del 1m
  • You'll need to make the variable changes as listed in the administration guide and then do the following to set the crontab correctly.
  • Run zmschedulebackup --help to see a list of options.
  • Run zmschedulebackup -D , which will now set a new default schedule (crontab) that uses auto-group settings now that your variables are set for it.
  • Incremental backups aren't performed the way they were with the standard method.
    • You'll see there's no longer a -i entry in cron.
    • Incrementals are performed, in a manner, but the redologs are just copied into the full backup session

Two bugs to look at as well:

Some Variables For Auto-Group

The list below might not be complete or show the defaults; I just wanted to save this before I forget it. I'll try to get more complete details on these later.

zmprov gacf | grep Backup
zimbraBackupAutoGroupedInterval: 1d
zimbraBackupAutoGroupedNumGroups: 7
zimbraBackupAutoGroupedThrottled: FALSE
zimbraBackupMode: Auto-Grouped
Auto-group And Redologs

Please see Ajcody-Backup-Restore-Issues#Redologs_And_Auto-group_In_Regards_To_Backups

Need To Write Fewer Files - Add The Zip Option To Your Backup Commands

RFE to make zip option the default for backups:

There are very few details in the official documentation on this option, unfortunately. This does have a really good explanation though:

http://www.zimbrablog.com/blog/archives/2008/08/recent-admin-tidbits-part-1.html

From the administrative manual on the Backup section:

http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html

It says,

"-zip can be added to the command line to zip the message files during backup. Zipping these can save backup storage space."

The implication is that instead of writing all the individual message files to the backup, it bunches them together into zip files. The body of a shared blob is added once to a shared-blobs zip file, then a small pointer-only entry is added to a mailbox's zip file - the same effect as in the non-zipped case. This is useful when the number of message files is causing disk I/O issues - maybe you're trying to rsync the backup session directories off to another server, or you're running a third-party backup on them to save to tape. The default use of -zip will use compression; if that also causes overhead you need to avoid, you can use the -zipStore option.

Note about -zipStore:

"when used with the -zip option, it allows the backup to write fewer files (-zip), but not incur the compression overhead as well"
How To Use As A Default Option?

You'll add the options to the zimbra crontab file. This can be done with the zmschedulebackup command.

Run zmschedulebackup with help option:

zmschedulebackup --help

You'll see:

-z: compress - compress email blobs with zip

It appears that you'll need to manually add the -zipStore option, if you want it, to the crontab file (an illustrative line follows the bug link below).

See bug :

http://bugzilla.zimbra.com/show_bug.cgi?id=30981
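
Until that happens, an illustrative zimbra crontab line for an auto-grouped full backup with both options added would look roughly like this (the flags are as quoted from the administration guide above - confirm them against zmbackup --help on your release):

0 1 * * 0-6 /opt/zimbra/bin/zmbackup -f -zip -zipStore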

What Does It Look Like When I Use Zip?

Shared blobs are zipped and blobs (messages) are zipped per root store directory.

mail:~/backup/sessions/full-20080820.160003.770/accounts/115/988/11598896-a89b-4b9d-bedb-1ed1afcb6c87/blobs
zimbra$ ls
blobs-1.zip    blobs-2.zip    blobs-3.zip    blobs-4.zip

To Tape?

I would then use the non-rsync network ports for your traditional network backup software to dump the data to tape. This way that activity doesn't affect production performance at all. A full DR would use the backup/ data anyway (offsite DR).


Creating A Hot-spare Server

Set it up along the same lines... though you could cut out some of the HA/performance items if you only see this box being used for "short-term" use. Rsyncs will occur over the backup network port.

A sanity check is needed in regards to LDAP data. With a normal DR restore, one would do a zmrestoreldap. zmrestoreldap looks to a backup session; there is no real option in regards to a "redolog" directory. Things that are "LDAP only" are COSs, DLs, etc.

  1. Setup the hot-spare according to http://wiki.zimbra.com/index.php?title=Ajcody-Notes-ServerMove
    • Basically install the zimbra packages. You'll want to do this with upgrades to production as the system evolves.
  2. I would do an initial rsync of /opt/zimbra (remember to use the nice flag so you don't affect prod too much); see the sketch after this list
  3. I would then setup 2 daily rsync jobs (following the same wiki instructions)
    1. rsync /opt/zimbra/backup
      • This could be integrated within the backup cron job so it kicks off after the backup is done. You'll need to monitor the times of backups and also the times of the syncs so you can make sure you stay within the rotation window - backup, rsync, next backup. Times will differ between incremental and full backups.
      • rsync other necessary files:
        • /opt/zimbra/conf
        • /opt/zimbra/redolog
        • /opt/zimbra/log
      • This will give some "sanity" in case issues arise with the full restore. This part could use some better feedback from more experienced Zimbra staff. I can think of some circumstances where this effort would prove useful.
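
A minimal sketch of the nightly sync jobs - the hot-spare hostname "spare-bak" on the backup bond is an assumption; run them as the zimbra user, keep ZCS stopped on the spare, and nice them so production isn't hurt:

# push the backup sessions and the supporting directories to the spare
nice -n 19 rsync -aH --delete /opt/zimbra/backup/ zimbra@spare-bak:/opt/zimbra/backup/
nice -n 19 rsync -aH /opt/zimbra/conf/    zimbra@spare-bak:/opt/zimbra/conf/
nice -n 19 rsync -aH /opt/zimbra/redolog/ zimbra@spare-bak:/opt/zimbra/redolog/
nice -n 19 rsync -aH /opt/zimbra/log/     zimbra@spare-bak:/opt/zimbra/log/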

A Real Hot-spare DR RFE

Note: Please add your votes to this RFE!

If It All Blows Up, Now What?

References:

http://wiki.zimbra.com/index.php?title=Ajcody-Notes-ServerMove

http://wiki.zimbra.com/index.php?title=Network_Edition_Disaster_Recovery
