Ajcody-Server-Plan-Move-Migration-Upgrade-DR

Revision as of 21:13, 21 July 2009 by Ajcody (talk | contribs).

Attention: This article is NOT official Zimbra documentation. It is a user contribution and may include unsupported customizations, references, suggestions, or information.

I moved the following pages from Ajcody-Server-Topics to this page, Ajcody-Server-Plan-Move-Migration-Upgrade-DR:


Server Planning, Moves, Migrations, Upgrades, And DR

Actual Server Planning, Moves, Migrations, Upgrades, And DR Topics Homepage

Please see Ajcody-Server-Plan-Move-Migration-Upgrade-DR

Ajcody Notes Server Planning

   KB 2963        Last updated on 2009-07-21  




This is Zeta Alliance Certified Documentation. The content has been tested by the Community.


Server Performance Planning & Infrastructure - The Big Picture

Actual Server Planning Homepage

Please see Ajcody-Notes-ServerPlanning

Initial Comments

These are just my random thoughts. They have not been peer-reviewed yet.

Actual DR and Server restore issues are listed in my Ajcody-Server-Topics page.

Items For Your Review


Redundancy - High Availability - And Other Related Questions

One might ask, "Is there any other way to have redundant Zimbra email servers?"

The short answer is No, but if you have millions of dollars you can get really close.

Remember, redundancy always comes with a dollar cost, and that dollar amount will determine your range of options for implementing redundant technologies or processes.

Redundancy isn't magical: every option has a failover "time", and the application being used can restrict what can be done or lengthen that "time". That failover window is when the service/server is "unavailable" from the client's perspective.

So break the server and services down into components and put them in a chart or spreadsheet. This will help to make sure you're not "over-engineering" the system.

For example, disk and data.

Disks can use different RAID levels for redundancy. Channels to the disk (remote) can also have redundant paths. The disk chassis can have redundant power supplies which go to different UPSs on different circuit breakers. The data can also be sent to tape, rsync'd/copied to another disk subsystem on another server, or "flashed" to another location if your filesystem or SAN unit supports it. A lot has to fail before you completely lose the data. When exploring these items, you want multiple channel paths so that copies, rsyncs, and flash operations travel a different route than the "production" path. Have your network backup occur on a different ethernet device.
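As a concrete sketch of the "rsync'd/copied to another disk subsystem" idea: a minimal example, assuming a hypothetical backup-nic.example.com that resolves to the IP bound to the dedicated backup ethernet device rather than the production one.

```shell
# Mirror the backup tree over the dedicated backup interface
# (backup-nic.example.com and the destination path are hypothetical):
#   rsync -aH --delete /opt/zimbra/backup/ backup-nic.example.com:/srv/zimbra-mirror/
#     -a        preserve permissions/ownership/timestamps
#     -H        preserve hard links
#     --delete  keep the destination an exact mirror
# The same flags demonstrated locally between two scratch directories:
src=$(mktemp -d) && dst=$(mktemp -d)
echo "sample" > "$src/msg.eml"
rsync -aH --delete "$src/" "$dst/"
ls "$dst"                     # msg.eml
rm -rf "$src" "$dst"
```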

Two big-picture objectives for "redundancy" are "data" redundancy for DR situations and "application" redundancy for service availability. Our "cluster" situation is an active-passive "redundancy" for the purposes of the mailstore application layer. The RAID level of the disks on the SAN serves the "redundancy" in regards to data recovery for DR.

When you introduce the objective of off-site redundancy, the costs and issues become huge. Remote locations introduce speed issues for data transfers, which will also impact performance for most applications as they try to stay in sync between the two machines for write and roll-back purposes.

My intent wasn't so much to give specific answers to this question but rather to demonstrate that to answer these questions you have to get down to the real specifics - it's simply impossible to answer the broad question of "how do I make redundant servers". It would take a book to fully explore all the issues and possibilities, but it still comes down to how much money you have to spend on the project. I used to work with High Performance Clusters and HA Linux+Oracle+SAP global deployments with TBs of data - this issue would arise daily for us in those environments.

HA Clustering Software

From the Single Node Cluster Installation Guide - Rev 1 June 2008:

For ZCS 5.0.7 to ZCS 7. With ZCS 8, only VMware HA is supported by Zimbra. RHCS can still work, but it would be hardware-only monitoring, and support should be directed to Red Hat for the specifics.
For cluster integration to provide high availability, Zimbra Collaboration Suite (ZCS) can integrate with either of the following:
  • Red Hat® Enterprise Linux® Cluster Suite version 4, Update 5 or later update release. In the single-node cluster implementation, all Zimbra servers are part of a cluster under the control of the Red Hat Cluster Manager.
Note: Red Hat Cluster Suite consists of Red Hat Cluster Manager and Linux Virtual Server Cluster. For ZCS, only Red Hat Cluster Manager is used. In this guide, Red Hat Cluster Suite refers only to Cluster Manager.
  • Veritas™ Cluster Server by Symantec (VCS) version 5.0 with maintenance pack 1 or later.

References:

CPU And Motherboards

Other references: Performance_Tuning_Guidelines_for_Large_Deployments#RAM_and_CPU

This is your mail server; most likely there is no higher-profile application in your environment than this box. You'll want to think 3 to 5 years down the road when you spec out the box. Make sure to confirm:

  • It will scale or offer:
    • Hotpluggable technology
      • You'll need to confirm the OS can handle this
    • Redundant and Hot-swap power supplies
      • You might need to upgrade your power supplies depending on the "accessories" you put in it.
    • CPU's (There is NO reason to use a 32bit chip - the memory limitations will kill you)
      • By default, a 32-bit Linux kernel only allows each process to address 2GB of space. Through PAE (Physical Address Extension), a feature available in some CPUs, and a special 32-bit kernel that supports a large address space for processes, it is possible to get a 32-bit mode kernel that really uses > 4GB of RAM, with a per-process 3-4GB address range
    • Memory
      • What are your onboard cache options?
      • How many slots are available? Does it force you to buy large memory sticks to max out total memory? (This increases cost.)
      • Can you mix & match different-size memory sticks? (This will increase costs when you go to scale)
      • Understand how memory interacts with multiple CPUs if your server has them
      • Understand the front side bus (FSB)
      • Does the motherboard support bad-memory detection and offlining?
    • Have growth for expansion cards and the right kind
      • What is PCI Express?
      • How PCI Express Works?
      • Know what slots you need for your network card, raid card, san card, etc.
        • Then do a sanity check against the motherboard and understand whether the CPU and channels can allow the full throughput from all channels.
      • Is there room for redundancy with your expansion cards?
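On the 32-bit vs. 64-bit point above, a quick sketch for checking what an existing Linux box offers (these flag names are the standard /proc/cpuinfo fields on x86):

```shell
# lm  = long mode: the CPU is 64-bit capable
# pae = Physical Address Extension: >4GB physical RAM on a 32-bit kernel
grep -m1 '^flags' /proc/cpuinfo | grep -qw lm  && echo "CPU is 64-bit capable" || true
grep -m1 '^flags' /proc/cpuinfo | grep -qw pae && echo "CPU supports PAE"      || true
# And what the running kernel itself is:
uname -m   # x86_64 means you are already on a 64-bit kernel
```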

Memory

Other references: Performance_Tuning_Guidelines_for_Large_Deployments#RAM_and_CPU

When I was working with HPC, I found there was never a good reason NOT to start with at least 32GB of RAM. Now, a mail server isn't an HPC compute node - I understand that. But I would still try to spec out a system that has at least 8 memory slots (usually a ratio of x slots per CPU on the system) but allows me to use 4 4GB DIMMs, giving me 16GB of memory and an upgrade path of 4 more 4GB or 8GB DIMMs. Get the higher-speed DIMMs; you can't mix memory of different speeds.
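A quick way to confirm what the box actually sees once the DIMMs are fitted (Linux; the slot-level detail via dmidecode needs root):

```shell
# Total RAM as the kernel reports it, from /proc/meminfo (value is in kB):
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_gb=$((mem_kb / 1024 / 1024))
echo "Detected ~${mem_gb} GB of RAM"
# Per-slot sizes, speeds, and empty slots (run as root):
#   dmidecode -t memory | grep -E 'Size|Speed|Locator'
```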

Chart Of Memory Speeds
Memory Interconnect Buses Bit Bytes
PC2100 DDR-SDRAM (single channel) 16.8 Gbit/s 2.1 GB/s
PC1200 RDRAM (single-channel) 19.2 Gbit/s 2.4 GB/s
PC2700 DDR-SDRAM (single channel) 21.6 Gbit/s 2.7 GB/s
PC800 RDRAM (dual-channel) 25.6 Gbit/s 3.2 GB/s
PC1600 DDR-SDRAM (dual channel) 25.6 Gbit/s 3.2 GB/s
PC3200 DDR-SDRAM (single channel) 25.6 Gbit/s 3.2 GB/s
PC2-3200 DDR2-SDRAM (single channel) 25.6 Gbit/s 3.2 GB/s
PC1066 RDRAM (dual-channel) 33.6 Gbit/s 4.2 GB/s
PC2100 DDR-SDRAM (dual channel) 33.6 Gbit/s 4.2 GB/s
PC2-4200 DDR2-SDRAM (single channel) 34.136 Gbit/s 4.267 GB/s
PC4000 DDR-SDRAM (single channel) 34.3 Gbit/s 4.287 GB/s
PC1200 RDRAM (dual-channel) 38.4 Gbit/s 4.8 GB/s
PC2-5300 DDR2-SDRAM (single channel) 42.4 Gbit/s 5.3 GB/s
PC2-5400 DDR2-SDRAM (single channel) 42.664 Gbit/s 5.333 GB/s
PC2700 DDR-SDRAM (dual channel) 43.2 Gbit/s 5.4 GB/s
PC3200 DDR-SDRAM (dual channel) 51.2 Gbit/s 6.4 GB/s
PC2-3200 DDR2-SDRAM (dual channel) 51.2 Gbit/s 6.4 GB/s
PC2-6400 DDR2-SDRAM (single channel) 51.2 Gbit/s 6.4 GB/s
PC4000 DDR-SDRAM (dual channel) 67.2 Gbit/s 8.4 GB/s
PC2-4200 DDR2-SDRAM (dual channel) 67.2 Gbit/s 8.4 GB/s
PC2-5300 DDR2-SDRAM (dual channel) 84.8 Gbit/s 10.6 GB/s
PC2-5400 DDR2-SDRAM (dual channel) 85.328 Gbit/s 10.666 GB/s
PC2-6400 DDR2-SDRAM (dual channel) 102.4 Gbit/s 12.8 GB/s
PC2-8000 DDR2-SDRAM (dual channel) 128.0 Gbit/s 16.0 GB/s
PC2-8500 DDR2-SDRAM (dual channel) 136.0 Gbit/s 17 GB/s
PC3-8500 DDR3-SDRAM (dual channel) 136.0 Gbit/s 17 GB/s
PC3-10600 DDR3-SDRAM (dual channel) 165.6 Gbit/s 21.2 GB/s
PC3-12800 DDR3-SDRAM (dual channel) 204.8 Gbit/s 25.6 GB/s

Bus For Expansion Cards [Peripheral buses]

Typical uses are for network, SAN, and RAID cards. I'll deal with them separately under each section below.

Chart Of Bus Speeds
Interconnect Max speed (MB/s) Comments
PCI 2.0 132.0 MB/s
PCI 2.1 264.0 MB/s
PCI 2.2 528 MB/s
PCI-X 1.0 1 GB/s
PCI-X 2.0 4 GB/s
PCI-E (Express) Ver. 1.1 250 MB/s x2 - bi-directional These speeds are bi-directional per "lane". Meaning that they are the same going both ways and not shared.
Ver. 1.1 @ 1x 256 MB/s x2 - bi-directional PCI-E Ver. 1.1 notes.
Ver. 1.1 @ 2x 512 MB/s x2 - bi-directional PCI-E Ver. 1.1 notes.
Ver. 1.1 @ 4x 1 GB/s x2 - bi-directional PCI-E Ver. 1.1 notes.
Ver. 1.1 @ 8x 2 GB/s x2 - bi-directional PCI-E Ver. 1.1 notes.
Ver. 1.1 @ 16x 4 GB/s x2 - bi-directional PCI-E Ver. 1.1 notes.
PCI-E (Express) Ver. 2.0 400 MB/s x2 - bi-directional 500 MBs but there's a 20% overhead hit. These speeds are bi-directional per "lane". Meaning that they are the same going both ways and not shared.
Ver. 2 @ 1x 400 MB/s x2 - bi-directional PCI-E Ver. 2 notes.
Ver. 2 @ 4x 1600 MB/s x2 - bi-directional PCI-E Ver. 2 notes.
Ver. 2 @ 8x 3200 MB/s x2 - bi-directional PCI-E Ver. 2 notes.
Ver. 2 @ 16x 6400 MB/s x2 - bi-directional PCI-E Ver. 2 notes.
PCI-E (Express) Ver. 3.0 1 GB/s x2 - bi-directional The final spec is due in 2009. These speeds are bi-directional per "lane". Meaning that they are the same going both ways and not shared.
Front-side Bus replacements below
HyperTransport 1.x 6.4 GB/s x2 - bi-directional Bidirectional per 32 bit link at 800MHz. Front-side bus replacement, see HyperTransport for more details .
HyperTransport 2.0 11.2 GB/s x2 - bi-directional Bidirectional per 32 bit link at 1.4GHz. Front-side bus replacement, see HyperTransport for more details .
HyperTransport 3.0 20.8 GB/s x2 - bi-directional Bidirectional per 32 bit link at 2.6GHz/ Front-side bus replacement, see HyperTransport for more details .
HyperTransport 3.1 25.6 GB/s x2 - bi-directional Bidirectional per 32 bit link at 3.2GHz. Front-side bus replacement, see HyperTransport for more details .
QuickPath Interconnect 12.8 GB/s x2 - bi-directional Intel competitor to HyperTransport. Everything you wanted to know about QuickPath (QPI)

Network Infrastructure

Network Cards

Most motherboards will have integrated ethernet ports. If they are the same chipset, you might want to just channel-bond these. What I have done in the past is use them for management ports and add in network cards for production activity.

Ideally, I would buy two cards that each have two Gb ports. I would then channel-bond 2 of the ports across the cards, for 2 separate bonds. I would use one of those bonds for "front-facing" traffic and the other for "backup" traffic. Remember to consider the bus infrastructure and the other cards when deciding which ones to get.

Channel Bonding

This will require you to confirm your switches can do this. There are different choices when it comes to channel bonding and some of them require the switch to support it - it usually involves "bonding" on the switch side by configuring the ports in question.
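A minimal RHEL-style configuration sketch for one bond; the interface names, address, and 802.3ad (LACP) mode are assumptions, and LACP is one of the modes that requires the switch-side port configuration described above:

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond0
#   DEVICE=bond0
#   IPADDR=10.0.0.10
#   NETMASK=255.255.255.0
#   ONBOOT=yes
#   BOOTPROTO=none
#   BONDING_OPTS="mode=802.3ad miimon=100"
#
# /etc/sysconfig/network-scripts/ifcfg-eth2 (and likewise ifcfg-eth3):
#   DEVICE=eth2
#   MASTER=bond0
#   SLAVE=yes
#   ONBOOT=yes
#   BOOTPROTO=none
#
# After a network restart, verify the bond and its slave states with:
#   cat /proc/net/bonding/bond0
```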

The Production Channel Bond

The "production" bond is more about failover. Make sure this happens as expected through the application layer - that proxies, firewalls, and so forth allow one of the ports in the channel bond to go down without any end-user impact.

The Backup Channel Bond

The backup bond is for throughput. You'll want to map out the network path and switches to the end host the data is being moved to. You'll also need to confirm the network path actually gives you a gain in performance from the channel bonding.

This will give you an excellent way to offload your tape backups, rsyncs (as shown at Ajcody-Notes-ServerMove ), and maybe NFS mounts if you're using them.


Disk And Data Infrastructure

Other references: Performance_Tuning_Guidelines_for_Large_Deployments#Disk

Disk Types


References


Description of Items Used Below For Specific HDD's:

  • Notable Items
    • Benefits
    • Downsides
  • Bus Interface
    • Types:
    • Maximum Devices:
    • Bus Speeds:
  • Performance
    • I/Os per second & Sustained Transfer Rate:
    • Spindle Speed: This is the speed the drive's platters actually spin at, ranging from 5,400rpm to 15,000rpm. The higher the speed, the more often the data on the disk will be in the right position to be read by the drive heads, and the faster data can be transferred.
    • Average Access Time: This is the average time it takes to position the heads so that data can be read. The faster the better.
    • Cache Size: This is the size of the cache on board the disk drive itself. This does reach a point where doubling the size generates a very small boost in performance and is not worth the cost, but generally the bigger the better. It also assists in small bursts of data where you may actually achieve near maximum performance as data is read from cache rather than the drive.
    • Internal Transfer Rate: This is the speed that data can be transferred within the drive. This speed will be higher than the actual transfer rate of the drive as there is some overhead for protocol handling as data is transferred to the SCSI or IDE bus.
    • Latency: The time it takes for the selected sector to be positioned under the read/write head. Latency is directly related to the spindle speed of the drive and as such is influenced solely by the drive's spindle characteristics.
  • Reliability (MTBF or unrecoverable read error rate)
  • Capacity
  • Price
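The spindle-speed/latency relationship described above is just half a revolution on average, which is easy to sketch:

```shell
# Average rotational latency = half a revolution:
#   latency_ms = (60 / RPM) / 2 * 1000
for rpm in 5400 7200 10000 15000; do
  awk -v r="$rpm" 'BEGIN { printf "%5d RPM -> %.2f ms avg rotational latency\n", r, (60 / r) / 2 * 1000 }'
done
# 7200 RPM works out to 4.17 ms; 15,000 RPM to 2.00 ms.
```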
Ultra DMA ATA HDD's
  1. Notable Items
    • Benefits
    • Downsides
  2. Bus Interface
    • Types : ATA
  3. Performance
    • I/Os per second & Sustained transfer rate:
      1. Ultra DMA ATA 33 - 264 Mbit/s 33 MB/s
      2. Ultra DMA ATA 66 - 528 Mbit/s 66 MB/s
      3. Ultra DMA ATA 100 - 800 Mbit/s 100 MB/s
      4. Ultra DMA ATA 133 - 1,064 Mbit/s 133 MB/s
    • Spindle Speed:
    • Average Access Time:
    • Cache Size:
    • Internal Transfer Rate:
    • Latency:
  4. Reliability (MTBF or unrecoverable read error rate)
  5. Capacity
  6. Price
SATA HDD's
  1. Notable Items
    1. Benefits
      • SATA drives typically draw less power than traditional SAS HDDs due to slower RPM speeds.
      • SATA drives have the best dollar per gigabyte compared to SAS drives.
      • SATA HDDs can work on a SAS interface.
    2. Downsides
      • SATA HDDs are single port and not capable of being utilized in dual port environments without the addition of an interposer designed for this purpose.
  2. Bus Interface Type
    • Types: SATA , SAS
  3. Performance
    • I/Os per second & Sustained transfer rate:
      1. Serial ATA (SATA-150) - 1,500 Mbit/s 187.5 MB/s
        • Real speed: 150 MB/s
      2. Serial ATA (SATA-300) - 3,000 Mbit/s 375 MB/s
        • Real speed: 300 MB/s
        • (alternate names: SATA II or SATA2)
      3. Serial ATA (SATA-600) - 4,800 Mbit/s 600 MB/s
        • I've also seen this listed as SATA 6.0 Gbit/s (SATA 6Gb/s).
        • Standard is expected to be available before the end of 2008.
    • Spindle Speed: 7200 RPM , 5400 RPM
    • Average Access Time:
    • Cache Size:
    • Internal Transfer Rate:
    • Latency:
  4. Reliability (MTBF or unrecoverable read error rate)
  5. Capacity
  6. Price
SCSI (Parallel SCSI) HDD's
  1. Notable Items
    • Benefits
    • Downsides
  2. Bus Interface Type
    • Types: SCSI
    • Maximum Devices (On Single Channel):
      • Ultra Wide SCSI - 16
      • Ultra2 SCSI - 8
      • Ultra2 Wide SCSI - 16
      • Ultra3 SCSI - 16
        • (alternate names: Ultra-160, Fast-80 wide)
      • Ultra-320 SCSI - 16
        • (alternate name: Ultra4 SCSI)
      • Ultra-640 SCSI - 16
    • Bus Speeds:
      • The total amount of data that can be transferred throughout the whole channel.
      • Data transfers will step down to the rated speed of the drive. If you have an Ultra160 HDD on an Ultra320 controller, it will operate at 160 MB/s.
      • SCSI-3 : Also known as Ultra SCSI and fast-20 SCSI. Bus speeds at 20 MB/s for narrow (8 bit) systems and 40 MB/s for wide (16-bit).
      • Ultra-2 : Also known as LVD SCSI. Data transfer to 80 MB/s.
      • Ultra-3 : Also known as Ultra-160 SCSI. Data transfer to 160 MB/s.
      • Ultra-320 : Data transfer to 320 MB/s.
      • Ultra-640 : Also known as Fast-320. Data transfer to 640 MB/s.
  3. Performance
    • I/Os per second & Sustained Transfer Rate:
      • Ultra Wide SCSI 40 (16 bits/20MHz) - 320 Mbit/s 40 MB/s
        • Real speed: 40 MB/s
      • Ultra2 SCSI
        • Real speed: 40 MB/s
      • Ultra2 Wide SCSI 80 (16 bits/40 MHz) - 640 Mbit/s 80 MB/s
        • Real speed: 80 MB/s
      • Ultra3 SCSI 160 (16 bits/80 MHz DDR) - 1,280 Mbit/s 160 MB/s
        • (alternate names: Ultra-160, Fast-80 wide)
        • Real speed: 160 MB/s
      • Ultra-320 SCSI (16 bits/80 MHz DDR) - 2,560 Mbit/s 320 MB/s
        • (alternate name: Ultra4 SCSI)
        • Real speed: 320 MB/s
      • Ultra-640 SCSI (16 bits/160 MHz DDR) - 5,120 Mbit/s 640 MB/s
        • Real speed: 640 MB/s
    • Spindle Speed:
    • Average Access Time:
    • Cache Size:
    • Internal Transfer Rate:
    • Latency:
  4. Reliability (MTBF or unrecoverable read error rate)
  5. Capacity
  6. Price
SAS (Serial Attached SCSI) HDD's
  1. Notable Items
    1. Benefits
      • SAS HDDs are true dual port, full duplex devices. This means SAS HDDs can simultaneously process commands on both ports.
      • All SAS HDDs are hot-swap capable. Users can add or remove an HDD without disrupting the enterprise environment.
      • SAS HDDs can support online firmware update (check vendor). This allows users to update firmware on the SAS HDD without having to schedule downtime.
    2. Downsides
      • SAS HDDs cannot be used on the older architecture SCSI backplanes or cables.
      • SAS HDDs typically draw more power than the equivalent SATA counterparts.
  2. Bus Interface Type
    • Types: SAS
    • Maximum Devices:
      • SAS 16,256
        • 128 devices per port expanders
    • Bus Speeds:
  3. Performance
    • I/O per second & Sustained transfer rate:
      • Serial Attached SCSI (SAS) - 3,000 Mbit/s 375 MB/s
        • Real Speed: 300 MB/s (full duplex , per direction)
      • Serial Attached SCSI 2 - 6,000 Mbit/s 750 MB/s
        • Planned
    • Spindle Speed: 15000 RPM , 10000 RPM , 7200 RPM
    • Average Access Time:
    • Cache Size:
    • Internal Transfer Rate:
    • Latency:
  4. Reliability (MTBF or unrecoverable read error rate)
    • 10K & 15K SAS HDDs have been rated at 1.6 million hours MTBF
  5. Capacity
  6. Price
Fibre Channel - Arbitrated Loop (FC-AL) HDD's
  1. Notable Items
    1. Benefits
      • FC-AL HDDs are dual ported, providing two simultaneous input/output sessions
      • FC-AL HDDs are hot-swap capable, so users can add and remove hard drives without interrupting system operation.
    2. Downsides
      • FC-AL HDDs are typically utilized in unique environments and are not compatible with SAS or SATA interfaces
      • Long term, SAS is projected to replace FC-AL HDDs within the IT industry
  2. Bus Interface Type
    • Types: FC-AL
    • Maximum Devices:
      • FC-AL in a private loop with 8-bit ID's - 127
      • FC-AL in a public loop with 24-bit ID's - +16 million
    • Bus Speeds:
  3. Performance
    • I/Os per second & Sustained transfer rate:
      • FC-AL 1Gb 100 MB/s (full duplex , per direction)
      • FC-AL 2Gb 200 MB/s (full duplex , per direction)
      • FC-AL 4GB 400 MB/s (full duplex , per direction)
    • Spindle Speed:
    • Average Access Time:
    • Cache Size:
    • Internal Transfer Rate:
    • Latency:
  4. Reliability (MTBF or unrecoverable read error rate)
  5. Capacity
  6. Price

Raid Cards (Not Raid Level)


  • If the raid chip on your motherboard goes out, what is your expected downtime to resolve it?
    • Isn't it easier to buy a hot-spare raid card and have it ready in case of a raid card failure?

SAN Topologies - Interfaces


General Questions
San Cards
  • Have you planned on multipathing?
    • You do have more than one port/card, right?
      • If not (because of cost), do you have an open slot on the motherboard to allow another one?
      • Remember to consider this with your SAN switch purchase
    • Confirm that your HBAs, HBA drivers, and SAN switches allow for the options you want
    • multipathing can be for failover
    • multipathing can increase throughput for same partition mount
iSCSI Interface

So far, physical devices have not featured native iSCSI interfaces on a component level. Instead, devices with SCSI Parallel Interface or Fibre Channel interfaces are bridged by using iSCSI target software, external bridges, or controllers internal to the device enclosure. Source

  • iSCSI over Fast Ethernet 100 Mbit/s 12.5 MB/s
  • iSCSI over Gigabit Ethernet 1,000 Mbit/s 125 MB/s
  • iSCSI over 10G Ethernet (Very few products exist) 10,000 Mbit/s 1,250 MB/s
  • iSCSI over 100G Ethernet (Planned) 100,000 Mbit/s 12,500 MB/s
iSCSI Performance Topics

One will find the following recommendations in regards to iSCSI:

  • Private network for iscsi traffic
  • Jumbo frames enabled with MTU size of 9000
    • All equipment must support and be enabled for this from point to point.
      • Remember, point to point means the "server" to the host serving the iSCSI disk array.
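A sketch of verifying the jumbo-frame path end to end (the interface name and array address are assumptions, and the MTU change needs root):

```shell
# Set MTU 9000 on the dedicated iSCSI interface (root required):
#   ip link set eth1 mtu 9000
# Then prove a full-size frame survives the whole path with Don't-Fragment
# set. The ICMP payload is 9000 minus 20 (IP header) minus 8 (ICMP header):
echo $((9000 - 20 - 8))   # 8972
#   ping -M do -s 8972 -c 3 10.1.1.20
# If any switch or NIC between the two points is not jumbo-enabled, that
# ping fails instead of silently fragmenting.
```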
iSCSI References
Serial Attached SCSI (SAS) Interface
  1. Notable Items
    1. Benefits
      • SAS interface protocol supports both SAS and SATA Hard Disk Drives allowing tiered storage management.
      • SAS supports seamless scalability through port expansion enabling customers to daisy chain multiple storage enclosures.
      • SAS supports port aggregation via a x4 wide-link for a full external connection bandwidth of up to 12.0 Gbps (1200MBps) on a single cable and a single connection.
      • SAS is a point-to-point interface allowing each device on a connection to have the entire bandwidth available. The current bandwidth of each SAS port is 3Gb/sec. with future generations aimed at 6Gb/sec and beyond.
    2. Downsides
      • SAS is not backwards compatible with U320 SCSI or previous SCSI generations
Fibre Channel - Arbitrated Loop (FC-AL)
  1. Notable Items
    1. Benefits
      • FC-AL devices can be dual ported, providing two simultaneous input/output sessions that doubles maximum throughput
      • FC-AL enables "hot swapping," so you can add and remove hard drives without interrupting system operation, an important option in server environments.
    2. Downsides
      • FC-AL adapters tend to cost more than SAS adapters
      • FC-AL is currently the fastest interface at 4Gb but is expected to be passed in maximum bandwidth by the next generation of SAS interface at 6Gb

Multipathing


General References
Multipathing And SAN Persistent Binding
Multipathing And LVM
Multipathing - Redundant Disk Array Controller [RDAC] (LSI mpp driver)

References:

Multipathing - Fibreutils - sg3_utils (QLogic)

References:

Multipathing - lpfc (Emulex)

References:

Multipathing - device-mapper [dm] & multipath-tools (Linux OSS driver & tools)

DM is the open source multipath driver.

References:

Multipathing - Veritas (Symantec)

References:

Multipathing - HP & Linux

References:

Multipathing - NetApp

References:

Multipathing - EMC (PowerPath)

References:

Raid Levels For Disks


General References

References:

Raid 10

RAID 10 (or 1+0) uses both striping and mirroring.

Reference - http://bugzilla.zimbra.com/show_bug.cgi?id=10700 - I fixed statement based upon a private comment in this RFE.

Zimbra recommends Raid 10 for the :
Mailstore and Logger [MySQL] databases - /opt/zimbra/db & /opt/zimbra/logger
Indexing Volume [Lucene] database - /opt/zimbra/index
Raid 10 is NOT Raid 0+1
RAID 0+1 takes 2 RAID0 volumes and mirrors them. If one drive in one of the underlying RAID0 groups dies, that RAID0 group is unreadable. If a drive then also dies in the other RAID0 group, the whole RAID 0+1 volume is unreadable.
RAID 1+0 takes "mirrored slices" and builds a RAID0 on top of them. A RAID10 volume can lose up to half of its underlying disks at once without data loss (as long as only one disk from each "slice pair" is lost).
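For illustration, here is how the two layouts would be built with Linux mdadm (device names are hypothetical, and the commands are destructive, so they are shown as comments only):

```shell
# Native RAID10 (the recommended 1+0 behavior) in one step:
#   mdadm --create /dev/md0 --level=10 --raid-devices=4 \
#         /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
# RAID 0+1 has to be stacked by hand: two stripes, then a mirror of them:
#   mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sda1 /dev/sdb1
#   mdadm --create /dev/md2 --level=0 --raid-devices=2 /dev/sdc1 /dev/sdd1
#   mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/md1 /dev/md2
# In the stacked form, one failed disk degrades an entire stripe, which is
# why the --level=10 form is preferred.
```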
Raid 5

RAID 5 (striped disks with parity) combines three or more disks in a way that protects data against loss of any one disk; the storage capacity of the array is reduced by one disk.

Reference - http://bugzilla.zimbra.com/show_bug.cgi?id=10700

Zimbra does NOT recommend RAID 5 for the MySQL database and Lucene volumes (RAID 5 has poor write performance, and as such is generally not recommended by MySQL, Oracle, and other database vendors for anything but read-only datastores. We have seen order of magnitude performance degradation for the embedded Zimbra database running on RAID 5!).
Mailstore and Logger [MySQL] databases - /opt/zimbra/db & /opt/zimbra/logger
Indexing Volume [Lucene] database - /opt/zimbra/index
Raid 6

RAID 6 (less common) can recover from the loss of two disks; it is basically an extension of RAID 5 with a second parity block.

Filesystem Choices


Other references:

Bugs/RFE's that reference filesystems:

Most configurations go with ext3 because it's the implied default. My own experience has me always using xfs, except for the root ( / ) partition.

Filesystem Feature And Option Descriptions
Journaling
Inodes

To see existing inode use:

# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/mapper/VolGroup00-LogVol00
                     19005440  158715 18846725    1% /
/dev/sda1              26104      39   26065    1% /boot
Block Size
Volume Managers

LVM / LVM2 (Linux)
EVMS (IBM's Enterprise Volume Management System)
Veritas Volume Manager
Regular File Systems

XFS

Supported

  • 1. Bugs and RFE's
  • 2. Journaling
    • A. "XFS uses what is called a metadata journal. Basically, this means that every disc transaction is written in a journal before it is written to the disc and then marked as "done" in the journal when it finishes. If the system crashes during the writing of the journal entry, that incomplete entry can be ignored since the data on the disc has not been touched yet and if the journal entry is not marked done, then that operation can be rolled back to preserve disc integrity. Its a very nice system. As stated above, XFS practices a type of journaling called "metadata journaling." This means only the inodes are journaled, not the actual data. This will preserve the integrity of the file system, but does not preserve the integrity of the data." reference: Filesystem Design Part 1 : XFS
    • B.
  • 3. Inodes
    • A. "FS considers dynamic allocation of inodes and keeps track of such inodes using B+Trees. Each allocation group uses a B+Tree to index the locations of inodes in it. This allows to create millions of inodes in each allocation group and thus supporting large number of files" reference: PDF Warning - Failure Analysis of SGI XFS File System
    • B.
  • Resources:
EXT3 - EXT4

Supported

Reiser
vxfs - Veritas
  • 1.
    • A.
    • B.
  • 2.
    • A.
    • B.
  • 3. Dynamic inode allocation
    • A. A description of sorts of what this means, "Vxfs is dynamic, so the inodes are basically created on the fly. However, it's a bit inaccurate to actually state that inodes are dynamic in VxFS and leave it at that (somewhat confusing I know). Vxfs really doesn't use them per se. It creates them to be compatible with UFS. VxFS uses extent-based allocation rather than the block-based UFS controlled by inodes. So, the question of how many inodes for vxfs, is as many as it needs."
    • B. http://www.docs.hp.com/en/B3929-90011/ch02s04.html
  • Resources:
BTRFS - Oracle's Better File System for Linux (with Redhat, Intel, HP)


ZFS
  • 1.
    • A.
    • B.
  • 2.
    • A.
    • B.
  • Resources:
Network File Systems

NFS
  • 1. Only supported for the Zimbra backup directory or when used internally to VMware vSphere at this time. Please see the Bugs/RFE's and comments below.
  • 2.
    • A.
    • B.
  • Resources:
  • Bugs/RFE's:
    • "Need clarity on supporting nfs mounted zimbra directories - report error/msg if nfs mount is present"
    • "Zimbra on NFS Storage through VMware ESX"
      • http://bugzilla.zimbra.com/show_bug.cgi?id=50635
      • Note - I asked this rfe to become private since it is really an internal request to do testing/qa'ing of nfs with vSphere. I asked another rfe for public viewing to be made that will let customers know when that can deploy with it under production use.

Proposed inclusion to the release notes from the Bug/RFE above:

ZCS & NFS:
Zimbra will support customers that store backups (e.g. /opt/zimbra/backup) on an NFS-mounted partition. Please note that this does not relieve the customer from the responsibility of providing a storage system with a performance level appropriate to their desired backup and restore times. In our experience, network-based storage access is more likely to encounter latency or disconnects than is equivalent storage attached by fiber channel or direct SCSI.
Zimbra continues to view NFS storage of other parts of the system as unsupported. Our testing has shown poor read and write performance for small files over NFS implementations, and as such we view it unlikely that this policy will change for the index and database stores. We will continue to evaluate support for NFS for the message store as customer demand warrants.
When working with Zimbra Support on related issues, the customer must disclose that the backup storage used is NFS.
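A sketch of mounting just the backup area over NFS, per the support statement above (the server name, export path, and mount options are assumptions; tune them to your environment):

```shell
# Mount only /opt/zimbra/backup from NFS (root required):
#   mount -t nfs -o rw,hard,intr nfs-server:/export/zimbra-backup /opt/zimbra/backup
# Persist it across reboots in /etc/fstab:
#   nfs-server:/export/zimbra-backup  /opt/zimbra/backup  nfs  rw,hard,intr  0 0
# Verify the mount and its free space before pointing backups at it:
df -hP /opt/zimbra/backup 2>/dev/null || df -hP /
```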
Samba - SMB/CIFS
Global , Distributed , Cloud, Cluster-Type Filesystems - Unsupported

Currently, Zimbra does not support or recommend the use of the various filesystems listed under this section. Please see the specific section to see if I've identified any existing bugs/RFE's against them. One general RFE for this topic is:


GFS
Lustre [Acquired by Sun]
Hadoop Distributed File System (HDFS) - Cluster Filesystem Project From Apache & Yahoo!
SGI CXFS
IBM GPFS
Veritas Storage Foundation Cluster File System
Google Computing

Unsupported at this time by Zimbra Support.

References:

Amazon S3 - And Amazon EC2 Information

Unsupported at this time by Zimbra Support.

References:

Still going to try this even though it's not supported? See the following if you run into trouble and if you're doing a new setup:

Other Products - Applications (DRBD, Veritas Volume Replicator, etc.)

Unsupported at this time by Zimbra Support.


Background Items To Review Before Drawing Up A Plan


When you start putting all this together you'll find there are a lot of "exceptions" to work out based upon your needs, equipment, kernel version, distro choice, SAN equipment, HBA's, OSS drivers vs. vendor ones, and so forth. You must TEST all your assumptions when you're planning on using methods that will provide online filesystem growth. Don't deploy based just on an assumption. Some situations will not allow a true online filesystem growth.

References to review:

Online Resizing Of LUN Example 1 - Multipathing Required
Warning: this process reminds me of what I was doing over a year ago [before Zimbra]. I don't have the necessary hardware or my old notes to go through the steps and confirm them.
  1. Resize the SAN volume.
  2. Reinitialize the HBA, e.g. using sg_reset or some module specific method.
  3. Rescan the SCSI device:
    • echo 1 > /sys/block/[sdx]/device/rescan.
  4. Confirm that /proc/partitions now contains the updated values. The fdisk and sfdisk commands may still (most likely) see the old values.
  5. Remove and readd the SCSI Devices.
    • Warning: Without multipathing you are going to lose access to your disk!! Multipathing gives you several paths to the volume, so you will not lose access. Make sure multipathd is up and running.
      • echo "scsi remove-single-device <Host> <Channel> <SCSI ID> <LUN>" > /proc/scsi/scsi
      • echo "scsi add-single-device <Host> <Channel> <SCSI ID> <LUN>" > /proc/scsi/scsi
    • This will also issue new device names!
      • Multipathing updated its device size automatically after I reloaded the last path-device.
  6. Run:
      • If you have an LVM setup: pvresize, then lvresize.
        • Double-check first whether vgextend is also needed.
      • Your filesystem's online-grow command if you have just a filesystem.
Q&A
  • Q. The use of multipathing failover has the block device being changed in size, potentially during a write (this was, after all, online). What happens if data is being written to the device while you're doing the disconnect/reconnect operations? Do you have two conflicting pieces of information about the device?
    • A. The write is being done at the "filesystem" level, which is unaware of the block device size change.
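The resize sequence above can be sketched as a dry-run script. All device and volume names here (sdf, /dev/rootvg/rootlv) are placeholders of my own, and the function only prints the commands so you can review them before touching a live system; the pvresize/lvextend pair corresponds to the LVM case of step 6.

```shell
# Dry-run sketch of the online LUN-grow steps above. Device names are
# assumptions for this example; the function PRINTS each command rather
# than executing it.
lun_grow_cmds() {
  dev=$1    # one SCSI path to the resized LUN, e.g. sdf
  lv=$2     # logical volume on it, e.g. /dev/rootvg/rootlv
  echo "echo 1 > /sys/block/$dev/device/rescan"   # step 3: rescan the device
  echo "cat /proc/partitions"                     # step 4: confirm the new size
  echo "pvresize /dev/$dev"                       # step 6, LVM case: grow the PV
  echo "lvextend -l +100%FREE $lv"                # ...then the LV
  echo "resize2fs $lv"                            # ...then grow ext3 online
}
lun_grow_cmds sdf /dev/rootvg/rootlv
```

Pipe the output into a scratch file first, sanity-check it, and only then run the commands by hand, one at a time.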

Example Filesystem Layouts


Local (Server) Disk Configuration

I would have 3 disk subsystems at a minimum. Meaning, three distinct groups of disks on their own disk I/O backplane, if possible. For example:

  1. Two disks using the motherboard provided SATA/ATA ports. (OS)
  2. Multiple disks accessed through a dedicated SAN or SCSI card. Maybe these are the hot swap disk array available on the front of your server. (Backup)
  3. Multiple disks accessed through another dedicated SAN or SCSI card. Maybe these are the disks available from an external disk array on your server or to external disks via SAN/iSCSI. (zimbra)
OS - /boot , swap , /tmp , /
Your first disk is referred to as disk1 or sda. Your second disk is referred to as disk2 or sdb. Partitions will start at 1 , rather than 0, throughout my examples below.
For the OS partitions, there's no reason to use anything but "default" for the mounting options in the /etc/fstab. If you reviewed the Performance_Tuning_Guidelines_for_Large_Deployments#File_System wiki page, those settings would apply to a dedicated partition(s) used for zimbra - not the root OS level partitions.
  1. For OS ( /boot , swap, /tmp , / )
    • A. These could be whatever disks you have (SATA, SCSI, etc.).
      • 1. Two Disk minimum for mirroring. Four Disks would allow raid 10.
        • A. I prefer to use software raid, because then the "raid" will move with the disks with fewer complications if my server dies. I can simply move the disks into another server chassis.
        • B. Most motherboards will have at least two SATA ports and drive slots. Make sure you put your drives on different channels if you have the option, rather than on the same one.
    • B. /boot partition
      • 1. I would set up a /boot partition (128MB or 256MB) on each drive. This would be my first partition on each drive.
        • A. Disk 1 > partition type linux > partition 1 for 128MB or 256MB > filesystem ext3 > no "raiding" > NO LVM > mount point /boot
        • B. Disk 2 > partition type linux > partition 1 for 128MB or 256MB > filesystem ext3 > no "raiding" > NO LVM > mount point /boot2
        • 2. After OS install, I would then "rsync/copy" the contents of /boot into /boot2
        • 3. Configure grub or your bootloader to have the option to boot from /boot2 (disk2/partition1)
          • A. This will give you a failover in case something goes wrong with /boot (disk1/partition1). A bad kernel upgrade or a whacked partition, maybe.
        • 4. After any changes to /boot (disk1/partition1), confirm everything works right (confirm it reboots fine), then do another manual rsync/copy of /boot to /boot2 (disk2/partition1).
    • C. swap
      • 1. Determine how much swap you "need". I will use an example of 2GB's below. Notice both "swap" partitions will get 2 GB's.
      • 2. Setup a swap partition on disk1/partition2 and disk2/partition2 for 2 GB's.
        • A. Disk 1 > partition type - swap > partition 2 for 2GB > filesystem swap > no "raiding" > NO LVM > mount point swap
        • B. Disk 2 > partition type - swap > partition 2 for 2GB > filesystem swap > no "raiding" > NO LVM > not set to mount
      • 3. *Default Suggestion* Configure the OS to only use/mount swap on disk1/partition2. You can configure /etc/fstab for the other partition but just comment out the line for now.
      • 4. Reasoning for this.
        • A. Swap partitions that are mirrored can cause undue complications.
        • B. I configured a "same"-sized swap on the other drive really for the sake of the third partition - the one for /. This way the blocking/sizes are as close as possible for the / mirror.
        • C. It isn't really a "loss" of space, but rather allows for adjustments you might need later.
          • 1. It can serve as a "failover" swap partition in case things go bad on disk1/partition2.
          • 2. It can serve as more "production" swap if you find you need it.
          • 3. It allows for a complete disk failover, or simplicity in case you need to move the "one" disk to another box.
        • 5. You could also configure the two swap partitions into a raid0 if you would like. Swap partitions can be turned on/off (swapon , swapoff). It's easy enough to reformat them as well, if something goes wrong, after a swapoff.
    • D. /tmp
      • 1. I generally never setup a partition for /tmp , but if you did decide to do this make sure it's following the swap partitions. ( IMO )
        • A. If you do setup /tmp , I would go with ext3 or xfs and put it within LVM. See notes below about LVM use.
    • E. / - Introduction of LVM
      • 1. You'll now place the rest of the "free" disks under a "software" raid1 / mirror partition.
        • A. You should then see this "mirror" as a new disk/partition to be used within the OS installer.
      • 2. Place the "mirror" under LVM - this will be a "partition" type. Let's assume this new mirror is now /dev/md0.
      • 3. Now configure LVM on that partition. I'm assuming that a separate partition wasn't made for /tmp.
        • A. General concepts of what happens with the LVM setup and recommendations on the naming scheme. sda3 is disk1/partition3 and sdb3 is disk2/partition3.
          • 1. pvcreate /dev/md0
          • 2. vgcreate -s 128M rootvg /dev/md0
            • A. vgchange -a y rootvg
          • 3. lvcreate -L 150GB -n rootlv rootvg
            • A. I put in 150GB above as an example. I would normally put in about 90% of the available space left and leave myself some room to create a new partition/mount point if it becomes necessary. For example, let's say / kept filling up because of lazy admins not cleaning up in /root. I could create a new logical volume (say, roothomelv) and mount it as /root, keeping the messy admins from affecting /.
          • 4. Now you would make the filesystem.
            • A. Example of setting up ext3 filesystem using defaults.
              • 1. mkfs.ext3 /dev/rootvg/rootlv
          • 5. Now you need to set up the new filesystem in /etc/fstab to mount. I'll use /root as an example, considering that / would have been done through the installer and you would most likely have used the GUI tools for all of this.
            • A. mkdir /root (if it didn't exist - in this example it would, of course)
            • B. vi /etc/fstab
              • 1. /dev/rootvg/rootlv /root ext3 defaults 1 2
                • A. Adjust for your system defaults in regards to the device naming convention that's being used.
      • 4. So why did I bother with LVM if I used all the disk space anyways?
        • 1. If you setup the / with LVM at the start, even if you use all the disk space, it allows you in the future to add more disk to the underlying LVM setup to grow the / filesystem - online.
          • A. For example, let's say I have 2 open drive bays that weren't initially used when I setup my server. And two years later I find my / becoming very close to 100%. I can throw in two new drives into those bays (assuming hot-swap drives). Setup a mirror (mdadm) between the two drives. Set the new mirror partition type to LVM. Then run through the pvcreate , vgextend , lvresize , and then online grow the filesystem (ext2online/resize2fs , xfs_growfs , etc.)
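The two-new-drives scenario just described can be sketched as the following command sequence. The device names (/dev/md1, /dev/sdc1, /dev/sdd1) and the rootvg/rootlv names from the example above are assumptions, and the function only prints the commands rather than executing them.

```shell
# Sketch of growing / online with two new hot-swap drives, per the scenario
# above: mirror the new drives, add the mirror to rootvg, grow the LV, then
# grow the ext3 filesystem online. Names are placeholders; print-only.
grow_root_cmds() {
  cat <<'EOF'
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
pvcreate /dev/md1
vgextend rootvg /dev/md1
lvextend -l +100%FREE /dev/rootvg/rootlv
resize2fs /dev/rootvg/rootlv
EOF
}
grow_root_cmds
```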
Backup
  1. /opt/zimbra/backup
    • A. I would make sure the disk I/O is separate from /opt/zimbra. This way you minimize performance hits to your end-users. Review that the disk I/O bus path is as clean as possible to the CPU/memory. Motherboard specs should tell you what "slots" are on shared buses. Make sure you're maximizing your raid/san card's performance along the bus path to the CPU.
    • B. I would make this some multiple of the /opt/zimbra space that you're spec'ing. The default backup schedule is to purge sessions older than one month. This means you'll have 4 fulls and 24 incremental sessions.
      • 1. Make sure you investigate the auto-group backup method as well and the zip option. This could have a huge effect on your disk space requirements for the backup partition.
    • C. If disks are local
      • 1. Raid Choices
        • A. If you'll be sending this to tape as well, you could go with a raid0 for performance.
        • B. If you don't plan on sending to tape or another remote storage system, maybe raid5.
      • 2. LVM
        • A. Please encapsulate the backup partition under LVM. This will give you some flexibility later in case you need more space.
      • 3. Other Topics
    • D. If disks are on SAN
      • 1. Raid Choices
        • A. If you'll be sending this to tape as well, you could go with a raid0 for performance.
        • B. If you don't plan on sending to tape or another remote storage system, maybe raid5.
        • C. Note, if you have a lot of disks to construct your raid with, you can still achieve good performance with raid5. This way you don't have to lose all the "disks" as you would when doing raid10. It would be worth benchmarking your SAN using x number of disks configured as raid10 vs. the same x number of disks as raid5. Remember to consider the i/o backplanes involved in which "disks" you choose to use throughout your different disk chassis - going up and down your disk rack vs. left to right.
      • 2. LVM
        • A. Please encapsulate the backup partition under LVM. This will give you some flexibility later in case you need more space.
        • B. If your NAS/SAN system is going to do block-level snapshots, the choice to use LVM or not becomes more complicated. A block-level snapshot across multiple LUN's will generally not work when the top-level filesystem is using LVM. If you plan on only using one LUN and growing that same LUN as needed, then LVM will still prove useful even if you're also using the SAN/NAS block-level snapshots.
      • 3. Other Topics
Zimbra
  1. /opt/zimbra
    • A. Remember you have logging data in here as well. If this partition becomes full, zimbra will hang and it could cause database corruption as well.
    • C. If disks are local
      • 1. Raid Choices
        • A. Raid 0 or Raid 10
        • B.
      • 2. LVM
        • A. Please encapsulate the /opt/zimbra partition under LVM. This will give you some flexibility later in case you need more space.
        • B.
    • D. If disks are on SAN / NAS
      • 1. Raid Choices
        • A. Raid 0 or Raid 10
        • B. Note, if you have a lot of disks to construct your raid with, you can still achieve good performance with raid5. This way you don't have to lose all the "disks" as you would when doing raid10. It would be worth benchmarking your SAN using x number of disks configured as raid10 vs. the same x number of disks as raid5. Remember to consider the i/o backplanes involved in which "disks" you choose to use throughout your different disk chassis - going up and down your disk rack vs. left to right.
      • 2. LVM
        • A. Please encapsulate the partitions under LVM. This will give you some flexibility later in case you need more space.
        • B. If your NAS/SAN system is going to do block-level snapshots, the choice to use LVM or not becomes more complicated. A block-level snapshot across multiple LUN's will generally not work when the top-level filesystem is using LVM. If you plan on only using one LUN and growing that same LUN as needed, then LVM will still prove useful even if you're also using the SAN/NAS block-level snapshots.
      • 3. Other Topics
More Details About LVM Use

The notes below were gathered from the "Zimbra Admins in Universities" mailing list.

Contents of Post by Matt on Date: Mon, 27 Oct 2008 14:24:50 -0500

We've found however that we are able to grow storage on the fly with LVM.  
It basically works like this for us...

    *  Grow the LUN on the SAN
          o wait for the LUN to finish growing by checking the 'Jobs / Current Jobs' 
            display until the "Volume Initialization" job is finished. 
    * reboot the host to see the new partition size.
          o (partprobe -s is supposed to do this, but it doesn't) 
    * find the device name:
          o pvs | grep VolumeName? 
    * grow the Volume Group:
          o pvresize -v /dev/XXX 
    * Verify the new size:
          o pvs | grep VolumeName? 
    * grow the logical volume:
          o Grow by a specific size: lvextend --size +NNg /dev/VgName/LvName
          o Grow to use all free space: lvextend -l +100%FREE /dev/VgName/LvName 
    * grow the file system:
          o Online method (dangerous?)
                + ext2online /dev/VgName/LvName 
          o Offline method (safer?)
                + umount /mountpoint
                + e2fsck -f /dev/Vgname/Lvname
                + resize2fs /dev/Vgname/LvName
                + mount /dev/VgName/LvName /mountpoint 
    * Verify new filesystem size:
          o df -h /mountpoint 

I've always used the online method (marked "dangerous?" by one of my cohorts) and 
never had a problem.  One other thing we've been able to do with LVM that has been 
a benefit is migrating data to a new LUN...

   1.  Find the new physical volume that is associated with the correct LUN#. On 
       the Zimbra servers you can use this MPP (linuxrdac) tool.

      # /opt/mpp/lsvdev

   2. Prepare the physical volume with PVCREATE.

      # pvcreate /dev/sdX

   3. Extend the logical volume to the new physical volume with VGEXTEND.

      # vgextend /dev/VolGroupName /dev/sdX

   4. Use LVDISPLAY to make sure you are moving from the right physical volume.

      # lvdisplay /dev/VolGroupName/LogVolName -m

      Example Results
      ===========
        --- Logical volume ---
        LV Name                /dev/VgMbs03Backup/LvMbs03Backup
        VG Name                VgMbs03Backup
        LV UUID                0vZQx3-5A22-a4ZO-4VmV-2naM-jwoi-yc6r6k
        LV Write Access        read/write
        LV Status              available
        # open                 0
        LV Size                580.00 GB
        Current LE             148479
        Segments               1
        Allocation             inherit
        Read ahead sectors     0
        Block device           253:6
         
        --- Segments ---
        Logical extent 0 to 148478:
          Type                linear
          Physical volume     /dev/sdab
          Physical extents    0 to 148478

   5. Move the Volume Group to the new physical volume using PVMOVE.

      # pvmove -i 60 -v /dev/sdZ /dev/sdX

      -i 60     : Show progress every 60 seconds
      -v         : Verbose
      /dev/sdZ     : Physical volume we are moving from
      /dev/sdX     : Physical volume we are moving to

   6. When the move is completed use VGREDUCE to reduce the volume group down to 
      the new physical volume.

      # vgreduce /dev/VolGroupName /dev/sdZ

A reply to the post above noted the issue of the reboot in those steps. Rich, Mon, 27 Oct 2008 15:24:06 -0500, wrote:

Any process including the term "reboot" isn't "on the fly." :-)

Current proprietary OSes can rescan and use expanded LUNs on the fly while 
filesystems are mounted. Apparently, so can the latest development Linux kernel,
but stacked device-mapper and LVM layers will need major changes, so don't 
expect to see this capability in enterprise linices for 2 years.

You can save some time, though, by replacing "reboot" with the minimum steps 
required to clear all holders of the device:

umount /file/system
vgchange -an VG
service multipathd stop
multipath -ll # note the physical devices involved, here assuming sd[fg]
multipath -f mpathVG
echo 1 > /sys/block/sdf/device/rescan # partprobe -s might also do this
echo 1 > /sys/block/sdg/device/rescan
multipath -v2
service multipathd start

...and continue with the pvresize. But simply adding a new LUN and marking it 
active with the admin console (or zmvolume) can be done with zero downtime, 
so that's my new model.
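Rich's zero-downtime alternative above (add a new LUN instead of growing one) can be sketched with zmvolume. The mount point and volume name are placeholders of mine, and the flags shown (-a/-n/-t/-p, -l, -sc/-id) should be verified against your ZCS version's zmvolume help before use; the function only prints the commands.

```shell
# Sketch of adding a freshly mounted LUN as a new message volume with
# zero downtime. Names are assumptions; verify zmvolume flags on your
# ZCS version. Print-only for review.
add_volume_cmds() {
  path=$1
  echo "zmvolume -a -n store2 -t secondaryMessage -p $path"  # add the volume
  echo "zmvolume -l"                                         # list volumes, note the new id
  echo "zmvolume -sc -id <id>"                               # make it the current write volume
}
add_volume_cmds /opt/zimbra/store2
```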

SAN Layout As Recommended For Clusters

This is from Multi Node Cluster Installation Guide - PDF

Preparing the SAN

You can place all service data on a single volume or choose to place the service data in multiple volumes. Configure the SAN device and create the partitions for the volumes.

  • Single Volume SAN Mount Point - /opt or /opt/zimbra
    • If you select to configure the SAN in one volume with subdirectories, all service data goes under a single SAN volume.
  • Multiple Volumes For SAN Mount Points
    • If you select to partition the SAN into multiple volumes, the SAN device is partitioned to provide the multiple volumes for each Zimbra mailbox server in the cluster. Example of the type of volumes that can be created follows.
      • /opt Volume for ZCS software (or really, /opt/zimbra/ )
        • Directories under /opt/zimbra/
          • conf Volume for the service-specific configuration files
          • log Volume for the local logs for Zimbra mailbox server
          • redolog Volume for the redo logs for the Zimbra mailbox server
          • db/data Volume for the MySQL data files for the data store
          • store Volume for the message files
          • index Volume for the search index files
          • backup Volume for the backup files
          • logger/db/data Volume for the MySQL data files for logger service’s MySQL instance
          • openldap-data Volume for OpenLDAP data
  • Note, for a multi-volume SAN Cluster, you'll actually create the directory path differently. [ /opt/zimbra-cluster/mountpoints <clusterservicename.com> ] Please see the Cluster Installation Guides for the full planning recommendations and steps if this is what you're going to do.

Zimbra Directory Layout & FHS (Filesystem Hierarchy Standard)

FHS

References:

Bugs/RFE's

Community Feedback
  • Feedback Reference One
We are following FHS[1] standard for our deployments (or at least trying 
our best to follow it). It would be nice to reflect on the possibilities 
of mostly FHS-compliant Zimbra deploy. Here's what we've came up with so 
far:

/etc/opt/zimbra for configs
/opt/zimbra - binaries
/var/opt/zimbra - message store, OpenLDAP db, MySQL db's etc.
/var/log/zimbra - logs

Going by the FHS standards (in our case) means deploying a well-documented
system and that its layout is consistent across the board. 

Benefits:
* A paranoid type setup like mounting /opt read-only, and /var as no-exec. 
* For the uber-paranoia, including /etc as read-only. 
* You could tune each FS for specific needs which are consistent across 
  the board. 
** Different FS / or differently tuned FS/ used for each generic case.
* Migrations would be fairly simple, as it's easy to rip out configs (/etc) 
  or data (/var) or logs (/var/log) and copy/move it someplace else. 
* It opens the door to possibility of mounting volume with binaries on 
  multiple machines that only have local configs and data (not that we plan
  on it at the moment).

Disk Or Disk Layout Performance Testing

hdparm - Read or set the hard drive parameters

References:

Using DD

1GB file test

sync ; time bash -c "(dd if=/dev/zero of=largefile bs=1024 count=1000000; sync)"

Now time the removal of the large file.

sync ; time bash -c "(rm largefile; sync)"
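A variant of the dd test above worth knowing: GNU dd's oflag=direct bypasses the page cache, so the timing reflects the disk rather than RAM. The larger block size is deliberate, since direct I/O with 1k blocks is very slow; this is a sketch (GNU coreutils only, not portable dd) and the function just prints the command.

```shell
# Sketch of a page-cache-free variant of the 1GB dd test above.
# GNU dd only; oflag=direct is not portable. Print-only for review.
dd_direct_cmd() {
  echo "dd if=/dev/zero of=largefile bs=1M count=1000 oflag=direct"
}
dd_direct_cmd
```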
Bonnie++

References:

dbench - generate load patterns

References:

IOzone

References:

Stress

References:

Postmark (Netapp)

References:

LTP - Linux Test Project

This suite of tools has filesystem performance tests.

References:

Clustering Software

Please see Ajcody-Clustering

Virtualization

Please see Ajcody-Virtualization

What About Backups? I Need A Plan

Please Review Other Backup-Restore As Well

Please review the Ajcody-Backup-Restore-Issues section as well.

What Might Be Wrong?

First thing to check is the log file and see if anything notable stands out.

grep -i backup /opt/zimbra/log/mailbox.log

Use Auto-Group Backups Rather Than Default Style Backup

Having trouble completing that entire full backup during off-hours? Enter the hybrid auto-grouped mode, which combines the concept of full and incremental backup functions - you’re completely backing up a target number of accounts daily rather than running incrementals.

Auto-grouped mode automatically pulls in the redologs since the last run so you get incremental backups of the remaining accounts; although the incremental accounts captured via the redologs are not listed specifically in the backup account list. This still allows you to do a point in time restore for any account.

Please see the following for more detailed information:
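The mode switch can be sketched with zmprov. The attribute names (zimbraBackupMode, zimbraBackupAutoGroupedNumGroups) are the ones I believe the NE backup system uses; the hostname and the group count of 7 (roughly one-seventh of mailboxes fully backed up per night) are example values only, and the function prints the commands rather than running them.

```shell
# Sketch of switching a mailbox server to auto-grouped backup mode.
# Attribute names assumed from ZCS NE; hostname and group count are
# example values. Print-only for review.
autogroup_cmds() {
  server=$1; groups=$2
  echo "zmprov ms $server zimbraBackupMode Auto-Grouped"
  echo "zmprov ms $server zimbraBackupAutoGroupedNumGroups $groups"
}
autogroup_cmds mail.example.com 7
```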

Need To Write Fewer Files - Add The Zip Option To Your Backup Commands

Using the zip option will compress all those thousands of single files that exist under a user's backup, decreasing the performance issues that arise from writing out thousands of small files as compared to large ones. This is often seen when one is:

  • Using nfs for the backup directory
  • Copying/rsyncing backups to a remote server
  • Using some third-party backup software (to tape) to archive/backup the zimbra backup sessions.

Please see the following for more information about using the Zip option:
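As a sketch, a full backup of all accounts with the zip option would look like the following; verify the flag spelling (-z/--zip) with `zmbackup --help` on your ZCS version before relying on it. The function only prints the command.

```shell
# Sketch of a zipped full backup of all accounts. The -z/--zip flag is
# an assumption to verify against your zmbackup version. Print-only.
zip_backup_cmd() {
  echo "zmbackup -f -z -a all"
}
zip_backup_cmd
```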

SAN Snapshots For Backups

Please see:

Cloud Backups

Please see:

Tape Backups

I would then use the non-rsync network ports for your traditional network backup software to dump the data to tape. This way that activity doesn't affect prod performance at all. A full DR would use the backup/ data anyway (offsite DR). I've created another section that will deal with this in more detail - specifically handling the hard links that are used by Zimbra.

Please see:

Test Environments And Managing Customizations

I have some suggestions on this in the RFE below. The first comment has a recommended layout for your test/qa/dev environments:

Using Vmware ESX For A DEV/QA/Prod Test Environment

Please see Ajcody-Virtualization#Using_VMWare_ESX_For_ZCS_Test_Servers_-_How-To

Creating A Hot-spare Server

Set up along the same lines... though you could cut out some of the HA/performance items if you only see this box being used for "short-term" use. Rsyncs will occur over the backup network port.

Need to do a sanity check in regards to ldap data. With a normal DR restore, one would do a zmrestoreldap. zmrestoreldap restores from a backup session; there is no real option in regards to a "redolog" directory. Things that are "ldap"-only are COS's, DL's, etc.

  1. Setup the hot-spare according to http://wiki.zimbra.com/index.php?title=Ajcody-Notes-ServerMove
    • Basically install the zimbra packages. You'll want to do this with upgrades to production as the system evolves.
  2. I would do an initial rsync of /opt/zimbra (remember to use the nice flag to not affect prod too much)
  3. I would then setup 2 daily rsync jobs (following the same wiki instructions)
    1. rsync /opt/zimbra/backup
      • This could be integrated within the backup cron job so it kicks off after the backup is done. You'll need to monitor the times of backups and also the time for syncs so you can make sure you stay within the window of rotation - backup, rsync, next backup. Times will be different for incremental and full backups.
      • rsync other necessary files:
        • /opt/zimbra/conf
        • /opt/zimbra/redolog
        • /opt/zimbra/log
      • This will give some "sanity" in case issues arise with the full restore. This part could use some better feedback from more experienced Zimbra staff. I can think of some circumstances where this effort would prove useful.
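The nightly sync jobs above can be sketched as a small cron-driven script run on production after the backup session finishes. The spare hostname and the nice throttling are my assumptions, and the function only prints the rsync commands for review.

```shell
# Sketch of the hot-spare sync jobs described above: backup plus the
# extra directories, throttled with nice. Hostname is a placeholder;
# print-only for review.
spare_sync_cmds() {
  spare=$1
  for dir in backup conf redolog log; do
    echo "nice -n 19 rsync -a --delete /opt/zimbra/$dir/ $spare:/opt/zimbra/$dir/"
  done
}
spare_sync_cmds spare.example.com
```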

A Real Hot-spare DR RFE

Note: Please add your votes to this RFE!

If It All Blows Up, Now What?

References:

http://wiki.zimbra.com/index.php?title=Ajcody-Notes-ServerMove

http://wiki.zimbra.com/index.php?title=Network_Edition_Disaster_Recovery




Migration Issues



Actual Migration Notes Homepage

Please see Ajcody-Migration-Notes

RFE/Bug's For Migration Tools

I'll try to keep track of items here when I find them.

Server Migrations Terminology

Staged (Migration Over Time Rather Than Single Instance)

Staged describes a migration that takes place over time (days, weeks, months), where users are moved to the "new" server across those time frames. This process might involve a Split-Domain or Sub-Domain configuration, or the use of LDAP replicas to integrate a new ZCS mailstore server.
Reasons For Staged Migrations

Normally, a staged migration is used for one of these two reasons:

  • Very large user base that can't be migrated over in one downtime window. You must migrate over the course of multiple downtime windows spanning weeks or months.
    • One of the mail servers might be over a remote network link that prevents the migration data from being moved fast enough in one downtime window.
  • The "old" mail system must stay up because some users will not be migrated to Zimbra.

Split-Domain (Shared Mail Domain Between Multiple Email Servers)

Split-Domain means that you're going to have mail servers from different vendors (Zimbra, MS Exchange, Sendmail, Postfix, etc.). One of these server+vendor types will be authoritative (Primary) for the email domain. The other servers (Secondaries) will still be in the same email domain, rather than a sub-domain, and use the Primary server for authoritative-type functions.
Reasons For Split-Domains
  • Allows the highest possible merger between the two systems to minimize the impact to the end users.
  • Doesn't require a domain rename as a final step to the server migration.
    • A domain rename can cause some disruption to user items like calendars, appointments, client reconfiguration, etc.

Sub-Domain (New Email Domain To Integrate Into Existing Mail Environment - Multiple Email Servers)

Sub-Domain means we'll add a new email sub-domain to the primary email domain to migrate users over to. It doesn't have to be an actual dns sub-domain, i.e. company.com > sub.company.com . This configuration will use forwarding & domain masquerading rules between the two email systems to allow users to move to the new email system over time. Once everyone is moved over, it can be reconfigured to use the primary domain only and drop the use of the sub-domain.
Reasons For Sub-Domains
  • Administrator wants to avoid the possibly very complex configurations of the other options.
  • Can be setup fairly fast and easily.
  • Good for testing since it requires very little modifications to the existing system.

Required Reading Overview & Resources

Review the following articles, because the how-to's in this area require an understanding of the content within them.

Zimbra And Linux Resources

  • Other Zimbra Related Steps That Might Be Necessary
    • Specifically "Domain Masquerading"
    • Special Note About These Variables And Calendar Invites & Email Replies
      • Please confirm replies are working correctly as well as calendar invites acceptance to those outside your zimbra environment.
      • [http://bugzilla.zimbra.com/show_bug.cgi?id=9545 See RFE 9545 - calendar organizer shows as account name, not "reply-to" address or zimbraMailCanonicalAddress]
        • CLI commands that might be necessary to modify for the situation:
          • zmprov ma user@domain.com zimbraMailCanonicalAddress otheraddess@domain.com
          • zmprov ma user@domain.com zimbraPrefReplyToAddress otheraddess@domain.com
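After setting the two attributes above, it's worth reading them back to confirm; zmprov ga accepts a list of attribute names to limit the output. This is a print-only sketch using the same example account.

```shell
# Sketch of verifying the canonical/reply-to attributes set above.
# Print-only for review.
check_reply_attrs() {
  echo "zmprov ga $1 zimbraMailCanonicalAddress zimbraPrefReplyToAddress"
}
check_reply_attrs user@domain.com
```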

MS Exchange Resources

Different Email Server Type/Vendor To ZCS - Using Split-Domain To Migrate

Split-Domain Example 1 - Zimbra Primary & Exchange Secondary

My attempt to rewrite Split_Domain for customers that are having problems following it for whatever reasons.

Rough Notes - Working On It Now

Hostnames used in server configurations when the servers reference each other MUST resolve to internal IP addresses/hostnames. This will avoid unnecessary routing out of the local network and then back in through your firewall, thus getting a new IP address.


Details of Firewall and/or External Mail Filter:

  • Inbound email goes first to:
    • Firewall : hostname =
      • external ip's =
      • internal ip's =
    • External Mail Filter Server : hostname = mailfilter.YYY.state.XX.us
      • external ip's =
      • internal ip's =
    • Will then forward email to primary.YYY.state.XX.us via ip address after implementation


Configuring The New Zimbra Server As The Primary :

  1. Primary Mail Server ("authoritative" system for domain YYY.state.XX.us)
    • Zimbra Server
      • Hostname is primary.YYY.state.XX.us
      • mail domain = YYY.state.XX.us
    • The primary MTA must
      1. Be aware of, and accept mail for, all accounts on the domain
        1. Create ALL accounts for Zimbra that are on existing Exchange box
          • $ zmprov ca secondary-user@YYY.state.XX.us <some_random_password>
          • Review the wiki on Bulk_Create. Don't forget about system accounts that you have configured for email.
      2. Forward mail to the secondary MTA for users hosted there.
        1. Setup Transport for users to point to the Exchange Server
          • $ zmprov ma secondary-user@YYY.state.XX.us zimbraMailTransport smtp:secondary.YYY.state.XX.us:25
          • Again, review the wiki on Bulk_Create to make this easier.
      3. Reject on RCPT TO or configure a catchall account for those accounts that do not exist in either system
      4. LAST STEP - Configure DNS, MX records, firewall to make Zimbra server authoritative for domain.
        1. You will bounce mail if you make this change before configuring the entire system and testing that mail flow is working as desired.
        2. Change your MX record so mail from the internet flows into the Zimbra MTA first.
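Steps 1 and 2 above can be sketched as a generator that emits a zmprov batch: one account creation with a random password and one transport change per Exchange-hosted user. The username list and hostnames are assumptions; in practice you would redirect the output to a file, review it, and feed it to zmprov in one batch (e.g. `zmprov -f file`), which is much faster than one zmprov invocation per account.

```shell
# Sketch of bulk-generating zmprov commands for the split-domain primary:
# create each secondary-hosted account, then point its transport at the
# Exchange server. Reads bare usernames from stdin; print-only.
split_domain_cmds() {
  domain=$1; secondary=$2
  while read -r user; do
    pw=$(head -c 8 /dev/urandom | od -An -tx1 | tr -d ' \n')   # random password
    echo "ca $user@$domain $pw"
    echo "ma $user@$domain zimbraMailTransport smtp:$secondary:25"
  done
}
printf 'alice\nbob\n' | split_domain_cmds YYY.state.XX.us secondary.YYY.state.XX.us
```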


Re-configuring The Old Exchange Server As The Secondary :

  1. Secondary Mail Server
    • Exchange Server
      • Hostname is secondary.YYY.state.XX.us
      • mail domain = YYY.state.XX.us
    • The secondary MTA must
      1. Accept mail for accounts that are hosted on the secondary
        1. See Ajcody-Migration-Notes#MS_Exchange_Resources for now. I'll add specific steps as I get confirmation of them.
      2. We also highly recommend that you turn off DNS lookups and internet wide message routing from the secondary host and route all mail through the primary.
        1. See Ajcody-Migration-Notes#MS_Exchange_Resources for now. I'll add specific steps as I get confirmation of them.
      3. Forward mail to primary.YYY.state.XX.us for accounts hosted on the primary server
        1. See Ajcody-Migration-Notes#MS_Exchange_Resources for now. I'll add specific steps as I get confirmation of them.

Migrating Users To Be On the Zimbra Server :

  • When you are ready to move a user from the old system to the new system just run this command
    1. (where primary.YYY.state.XX.us is the name of your Zimbra server):
    2. $ zmprov ma secondary-user@YYY.state.XX.us zimbraMailTransport lmtp:primary.YYY.state.XX.us:7025


If there is no account on either server, the mail will bounce back to primary.YYY.state.XX.us, to catchall@YYY.state.XX.us


Outbound email:

Either of the mail servers will be able to send email out independent of the other by allowing outbound smtp traffic from their respective ip addresses through the firewall via policy rule.

Split-Domain Example 2 [Mnichols Write Up] - Exchange Primary & Zimbra Secondary

Please see the following for excellent write-ups on doing this with MS Exchange acting as the primary. I'll leave my older write-up below for historical reasons.

Split-Domain Example 2 [My Attempt] - Exchange Primary & Zimbra Secondary

My attempt to rewrite Split_Domain for customers that are having problems following it for whatever reasons.

Rough Notes - Working On It Now

Hostnames used in server configurations when servers reference each other MUST resolve to internal IP addresses/hostnames. This avoids unnecessary routing out of the local network and back in through your firewall, which would present a different IP address.


Details of Firewall and/or External Mail Filter :

  • Inbound email goes first to:
    • Firewall : hostname =
      • external ip's =
      • internal ip's =
    • External Mail Filter Server : hostname = mailfilter.YYY.state.XX.us
      • external ip's =
      • internal ip's =
    • Will then forward email to primary.YYY.state.XX.us via ip address after implementation


Configuring The Old Exchange Server As The Primary :

  1. Primary Mail Server ("authoritative" system for domain YYY.state.XX.us)
    • Exchange Server
      • Hostname is primary.YYY.state.XX.us
      • mail domain = YYY.state.XX.us
    • The primary MTA must
      1. Be aware of, and accept mail for, all accounts on the domain
        1. See Ajcody-Migration-Notes#MS_Exchange_Resources for now. I'll add specific steps as I get confirmation of them.
        2. You'll have to grant relay permission on the Exchange server [Primary Server] to the Zimbra Server [Secondary Server] in the Exchange SMTP properties so that outbound email from Zimbra going through Exchange will work.
      2. Forward mail to the secondary MTA for users hosted there.
        1. Setup Transport for "migrated" [if any are done] users to point to the Zimbra Server
      3. Reject on RCPT TO or configure a catchall account for those accounts that do not exist in either system
        1. Determine if you want this on the Exchange or Zimbra Server.


Configuring The Zimbra Server As The Secondary :

  1. Secondary Mail Server
    • Zimbra Server
      • Hostname is secondary.YYY.state.XX.us
      • mail domain = YYY.state.XX.us
    • The secondary MTA must
      1. Accept mail for accounts that are hosted on the secondary
        1. On the Zimbra Server, run these commands (domain already exists in below example of course):
          • The first two commands (in combination) tell the Zimbra postfix to accept all addresses in the @YYY.state.XX.us domain as valid addresses.
            • $ zmprov md YYY.state.XX.us zimbraMailCatchAllAddress @YYY.state.XX.us
            • $ zmprov md YYY.state.XX.us zimbraMailCatchAllForwardingAddress @YYY.state.XX.us
      2. But must forward all other mail for accounts on this domain to the primary system
        1. On the Zimbra Server, run this command (domain already exists in below example of course).
          • This third command establishes default mail routing for the domain. Any users that do not exist on the Zimbra system will have their mail routed according to this rule.
            • $ zmprov md YYY.state.XX.us zimbraMailTransport smtp:primary.YYY.state.XX.us:25
      3. We also highly recommend that you turn off DNS lookups and internet wide message routing from the secondary host and route all mail through the primary.
        1. On the Zimbra Server you would:
          • Relay mail to Primary with:
            • $ zmprov mcf zimbraMtaRelayHost primary.YYY.state.XX.us:25
              • At the end, 25 is the port number for smtp on the targeted system. Adjust this number if you changed the smtp port.
          • Turn off DNS lookups with:
            • $ zmprov mcf zimbraMtaDnsLookupsEnabled FALSE
              • If you disable DNS Lookups (under the MTA tab of the admin console, or with zmprov), Zimbra will end up using (according to the postconf man page) the "gethostbyname() system library routine which normally also looks in /etc/hosts" (based on the entries on the "hosts" line in /etc/nsswitch.conf). If you do this but don't also specify an SMTP relay host (typically your ISP's SMTP server), which would take care of checking DNS, you will lose the ability to send mail to the internet: you can still send to other users on the Zimbra server, and you can still receive mail from the internet either way, but outbound internet mail will fail.
      4. Accept mail relayed by primary.YYY.state.XX.us
      5. Forward mail to primary.YYY.state.XX.us for accounts hosted on the primary server
      6. After configuration changes, restart services/server if needed.
        1. On the Zimbra Server, run these commands (as zimbra):
          • $ postfix stop
          • $ postfix start


If there is no account on either server, the mail will bounce back to a catchall account if you configured one.

Outbound email:

Either of the mail servers will be able to send email out independent of the other by allowing outbound smtp traffic from their respective ip addresses through the firewall via policy rule.

As You Migrate Accounts From Exchange To Zimbra

Need to determine if these steps apply only to certain Exchange versions or to all versions.

After an account is migrated to the Zimbra server from the Exchange server, you might need to delete the mailbox of the migrated account on the Exchange server so it will forward the mail to the Zimbra Server. You might also need to "Run Cleanup Agent" in the Exchange System Manager Console as well.

Unsetting Split-Domain Configurations (Migration is done and Split-Domain is no longer needed)

Simply null out the values or return them to their previous settings:

zmprov md domain.com zimbraMailCatchAllAddress 
zmprov md domain.com zimbraMailCatchAllForwardingAddress 
zmprov md domain.com zimbraMailTransport lmtp:primary.YYY.state.XX.us:7025

Accounts should already have been adjusted through the migration steps:

zmprov ma secondary-user@YYY.state.XX.us zimbraMailTransport lmtp:primary.YYY.state.XX.us:7025

Unset Mail Relay and DNS Lookups False:

zmprov mcf zimbraMtaRelayHost 
zmprov mcf zimbraMtaDnsLookupsEnabled TRUE

Different Email Server Type/Vendor To ZCS - Using Sub-Domain To Migrate

Yet to be written.

Sub-Domain Example 1 - Exchange Primary Domain & ZCS With Sub-Domain

Yet to be written.

Sub-Domain Example 1 - ZCS Primary Domain & Exchange With Sub-Domain

Yet to be written.

ZCS User to Another ZCS Server - With Rest & TGZ

Before Import - Confirm Your TGZ Is Good

On the server you'll be importing to, do the following.

  • Have a copy of the tgz file on the system, for example : /tmp/user_domain.tgz
  • Confirm the following doesn't give any errors
    • tar tzf /tmp/user_domain.tgz

Rest And The FMT= Option

See also:

Please consult /opt/zimbra/docs/rest.txt on your server for any version differences, the below is from 6.0.8.

  • fmt="tar", "tgz", "zip"
    • stream multiple items in an archive format. These forms are a superset of most other REST formatters.
  • Download options: export item data in raw or interchange format
    • body=0
      • only include msg headers, not bodies
    • charset=name
      • character set for tar file and directory names as well as ical or vcard text. UTF-8 is default
    • emptyname=name
      • filename to set in content-disposition header if no data items are found in query instead of returning HTTP 204 error
    • file=name
      • filename to set in content-disposition header
    • lock=1
      • lock mailbox before running query to assure a complete, consistent snapshot
    • meta=
      • meta=1 (default)
        • item metadata from the DB is JSON encoded as .meta files. Associated data blobs follow immediately after
      • meta=0
        • data items saved in common interchange formats including eml for mail, ical for appointments and vcard for contacts. Briefcase files saved unmodified
  • Upload options: POST to the original folder or a subfolder in raw or MIME encoded data stream. If .meta files are included, data will be imported with original details. Otherwise, eml, ics and vcards are imported the same as with other formatters and all other files are imported to the briefcase
    • callback=name
      • name of javascript function used to return status when doing a direct browser upload.
      • window.parent.name("exception string", "exception class name", exception_code) is called when the upload completes.
      • plain html is returned by default if callback unspecified
    • charset=name
      • character set for archive file and directory names as well as ical or vcard text
    • resolve=[modfy|replace|reset|skip]
      • how to handle duplicate data conflicts
        • modify : modify old item in place
        • replace : delete old item and recreate
        • reset : reset folder by deleting all items before importing
        • skip (default) : skip duplicate items completely
    • subfolder=name
      • create subfolder under the destination to import items into
    • timestamp=0
      • do not use the archive entry date as the received date for msgs or the creation date for documents. Msg dates will be inferred from the Date header if available
    • timeout=msec
      • update browser client with a text newline to prevent upload timeouts

The Basics

Please see [ZCS-to-ZCS Migrations] for use of this new tool option set.

Teaser, to export an entire mailbox:

/opt/zimbra/bin/zmmailbox -z -m user@domain.com getRestURL "//?fmt=tgz" > /tmp/account.tgz

Create account on new server to import data to and then:

/opt/zimbra/bin/zmmailbox -z -m user@domain.com postRestURL "//?fmt=tgz&resolve=reset" /tmp/account.tgz

Note, this example ends with "resolve=reset". Resolve is how to handle duplicate data conflicts; the options are: modify, replace, reset, skip (skip is the default). Please know what you want to do with the resolve= option, or test against a test account rather than a production one. See /opt/zimbra/docs/rest.txt or Ajcody-Migration-Notes#Rest_And_The_FMT.3D_Option for more information on the fmt= options.
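For many accounts you can wrap getRestURL in a loop. The sketch below is a dry run under assumed names: it only writes the export commands to a file for review, and the hard-coded account list stands in for real output from zmprov -l gaa.

```shell
#!/bin/sh
# Sketch: dry-run loop that writes out one export command per account.
# Swap the hard-coded list for $(zmprov -l gaa) and execute the generated
# file once it looks right.
BACKUP_DIR="/tmp/exports"                      # assumed destination
mkdir -p "$BACKUP_DIR"
ACCOUNTS="user1@domain.com user2@domain.com"   # illustrative list
: > "$BACKUP_DIR/export-commands.sh"
for acct in $ACCOUNTS; do
    echo "zmmailbox -z -m $acct getRestURL '//?fmt=tgz' > $BACKUP_DIR/$acct.tgz" \
        >> "$BACKUP_DIR/export-commands.sh"
done
cat "$BACKUP_DIR/export-commands.sh"
```

After reviewing, you could run the generated file with sh on the mailstore as the zimbra user.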

Time Out Errors

If you see errors about timing out, especially with large tgz file exports, include the -t 0 option for zmmailbox. -t sets the timeout, and 0 tells it NOT to have a timeout period. For example:

$ zmmailbox -z -m USER@DOMAIN -t 0 getRestURL "//?fmt=tgz" > /tmp/USER@DOMAIN-20100825.tgz

Problems Importing Data From Large Accounts

You might want to use the addMessage option instead of the postRestURL. For example:

tar zxvf user001.tgz
cd Inbox
zmmailbox -z -m user001@domain.com addMessage /Inbox *.eml
cd Sent
zmmailbox -z -m user001@domain.com addMessage /Sent *.eml
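The per-folder pattern above can be generated rather than typed by hand. This is a dry-run sketch with the account name and extraction directory as stand-in assumptions; it creates sample Inbox/Sent directories only so the loop has something to walk in place of a real 'tar zxvf user001.tgz'.

```shell
#!/bin/sh
# Sketch: after extracting the tgz, walk each top-level folder and print the
# matching addMessage command. Dry run: prints commands instead of executing.
USER="user001@domain.com"          # assumed account
EXTRACT_DIR="/tmp/user001"         # assumed extraction directory
mkdir -p "$EXTRACT_DIR/Inbox" "$EXTRACT_DIR/Sent"   # stand-ins for the tar extract
for dir in "$EXTRACT_DIR"/*/; do
    folder=$(basename "$dir")
    # The * stays unexpanded here; the shell that runs the command expands it.
    echo "zmmailbox -z -m $USER addMessage /$folder $dir*.eml"
done > /tmp/addmessage-commands.txt
cat /tmp/addmessage-commands.txt
```

Note that folder names containing spaces would need quoting before running the generated commands.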

Work Around For Possible Proxy Or PreAuth Problems

Change With ZCS6+ vs ZCS5 - If Account Is Closed

This appears to be a ZCS 6+ change in behavior: when an account is in closed status [zimbraAccountStatus: closed], the zmmailbox export steps above don't work and fail with auth-failure type errors. As a workaround, change the account status or use the curl steps listed here.

Example of error you'll see:

$ zmmailbox -z -m USER@DOMAIN getRestURL "//?fmt=tgz" > /tmp/USER@DOMAIN-20100825.tgz

ERROR: service.FAILURE (system failure: GET from 
https://www.DOMAIN/home/USER@DOMAIN/?fmt=tgz failed, status=401.  must authenticate)
Using Curl To EXPORT TGZ Data

This is a workaround if you see the following error when attempting the export via zmmailbox: "status=401. must authenticate". For admin:PASSWORD below you could use admin or a domain admin username and the appropriate password. You can also use either your 'global' URL or the specific mailstore the user resides on. Notice the use of https and :7071; using an admin or domain admin account over this port is critical for the export.

curl -k -u admin:PASSWORD https://MAILSTORE-FQDN:7071/service/home/USER@DOMAIN/?fmt=tgz > ./USER-account.tgz
Using WGET To EXPORT TGZ Data

This is a workaround if you see the following error when attempting the export via zmmailbox: "status=401. must authenticate". For admin:PASSWORD below you could use admin or a domain admin username and the appropriate password. You can also use either your 'global' URL or the specific mailstore the user resides on. Notice the use of https and :7071; using an admin or domain admin account over this port is critical for the export.

wget --no-check-certificate -O ./account.tgz https://admin:PASSWORD@MAILSTORE-FQDN:7071/service/home/USER@DOMAIN/?fmt=tgz
Using ZWC Via Admin Port To EXPORT TGZ Data

In your browser, do something like the following:

https://admin:PASSWORD@MAILSTORE-FQDN:7071/service/home/USER@DOMAIN/?fmt=tgz
Using Curl To IMPORT TGZ Data

To import a tgz file into Zimbra using curl, use the format shown below:

curl -k -u admin:PASSWORD --data-binary @/path/to/USER-account.tgz "https://MAILSTORE-FQDN:7071/service/home/USER@DOMAIN/?fmt=tgz&resolve=skip"

Note, this ends with "resolve=skip"; skip is the default option. Resolve is how to handle duplicate data conflicts; the options are: modify, replace, reset, skip. Please know what you want to do with the resolve= option, or test against a test account rather than a production one. See /opt/zimbra/docs/rest.txt or Ajcody-Migration-Notes#Rest_And_The_FMT.3D_Option for more information on the fmt= options.

Too Large Or Timeout Errors With Proxy

Update - rest_request_max_upload_size does not exist in code. This key was removed with change 399480 as part of a code revert. See bug :

Apply the LC key rest_request_max_upload_size instead of zimbraFileMaxUploadSize for imports when using the nginx proxy or if you're using curl [ZCS 7.2.1+].

See:

  • "unable to import .tgz using ImportUI through the Proxy "
    • https://bugzilla.zimbra.com/show_bug.cgi?id=74622#c35
      • Introduce LC config "rest_request_max_upload_size" with default value of 1073741824 (1GB). Limit the file uploads of tar/zip/tgz/ics formatters to this size.
      • To set to 1GB [as zimbra] :
        • zmlocalconfig -e rest_request_max_upload_size=1073741824

See also:

To Just Export & Import Contents Of "Something"

To export/import just the Inbox:

zmmailbox -z -m user@domain.com getRestURL "//?fmt=tgz&query=in:inbox/inboxsub" > /tmp/userinboxsubin.tgz
zmmailbox -z -m userinboxsubin@domain.com postRestURL "//?fmt=tgz&resolve=replace" /tmp/userinboxsubin.tgz

To export/import Inbox and Subdirectories. If the query is [ in: ] , it just gets from that folder. If it's [ under: ] , it'll grab anything including and beneath that folder.

zmmailbox -z -m user@domain.com getRestURL "//?fmt=tgz&query=under:inbox/inboxsub" > /tmp/userinboxsubunder.tgz
zmmailbox -z -m userinboxsubunder@domain.com postRestURL "//?fmt=tgz&resolve=replace" /tmp/userinboxsubunder.tgz

Export Using Query String - Before And After Dates

Note - Make sure you scroll the web page to the sides to see the complete command examples!


Tested against ZCS 8. This is an example [user@`zmhostname` should be replaced below with the targeted user@domain information] :

**Updated to include this new example at the top.**

zmmailbox -z -m user@`zmhostname` gru '//?fmt=tgz&query=under:/ after:"4/8/15" AND before:"4/11/15"' > /tmp/test_export/test_export.tgz

Or:

 zmmailbox -z -m user@`zmhostname` getRestURL '//?fmt=tgz&meta=0&query=after:2/1/2014 before:3/15/2014 type:message' > /tmp/test_export/test_export.tgz

Or :

zmmailbox -z -m user@`zmhostname` gru '/?fmt=tgz&query=after:"10/01/14" and before:"10/15/14"' > /tmp/filename.tgz

Or:

zmmailbox -z -m user@`zmhostname` gru -u https://localhost '/?fmt=tgz&query=after:"4/30/13" and before:"8/1/13"' > /tmp/filename.tgz

Using ZD To Import Zimbra TGZ Export User Data

See :

Old ZCS To New ZCS - Using LDAP Replica To Migrate

If you're moving users to a new mailstore, please see King0770-Notes#Preferred_Method_Moving_Users_To_New_Machine for a possible solution. This process sets up a new LDAP replica on the new ZCS mailstore. Over time you migrate the accounts to the new ZCS with zmmailboxmove. Once the accounts have all been moved, you promote the LDAP replica on the new ZCS server to be the LDAP master. Adjust the MTA configuration and then retire the old box.

Old ZCS To New ZCS - Using Rsync Method To Migrate

The rsync method, described in Ajcody-Notes-Server-Move, would require downtime.

Old ZCS Email Domains To Another ZCS With Existing Domains

LDAP Method - Untested & Not Finished

The process would involve migrating the LDAP data for the domain from the old Zimbra server into the other existing Zimbra system (using slapcat or ldapsearch). You'll need to get the server entries, any custom COS entries, the domain entry, and all users under that domain. You would then load the information (ldif) into the existing Zimbra server with slapadd. You could then reconfigure the old server to become an LDAP slave to the existing Zimbra box and to serve simply as a mailstore for that domain's data (users). You could also proceed with zmmailboxmove if you need to migrate the user data onto the other box as well.

Adjustments would also need to be done in regards to auth keys and server certs.
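As a rough starting point for the ldapsearch half, something like the following could pull the domain's account entries into an LDIF file. Everything here (hostname, bind DN, domain, filter) is an assumption to illustrate the shape of the query; adjust it to your deployment and verify against your ZCS version. The script only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch: build (and print, rather than run) an ldapsearch command that dumps
# all account entries for one domain to an LDIF file for later review/slapadd.
LDAP_HOST="oldzimbra.company.com"         # assumed old LDAP master
BIND_DN="uid=zimbra,cn=admins,cn=zimbra"  # standard Zimbra LDAP bind DN
DOMAIN="olddomain.com"                    # assumed domain being migrated
CMD="ldapsearch -x -H ldap://${LDAP_HOST}:389 -D ${BIND_DN} -W -b '' '(&(objectClass=zimbraAccount)(zimbraMailDeliveryAddress=*@${DOMAIN}))' > ${DOMAIN}-users.ldif"
echo "$CMD" | tee /tmp/ldap-export-cmd.txt   # dry run: print instead of executing
```

The resulting LDIF would still need cleanup (server-specific attributes, zimbraId collisions) before a slapadd into the target system.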

Sub-Domain Method - Untested & Not Finished

Situation

Let's say you own a hosting company and a new customer says they are running Zimbra Network edition (same version as yours) and wants you to host them. They only have one domain and want to minimize downtime. Their server is in another country or state. They have 50 users.

Realistically, during the migration time frame, the accounts using the temporary sub-domain until the switch is complete should only use the Zimbra web client. Phones, thick clients, and other sync clients will introduce synchronization issues that are better simply avoided.

Split-Domain Setup

Setup a split domain that you can use through the migration time frame.

They are company.com

You'll be migrate.company.com

See Ajcody-Migration-Notes#Zimbra_And_Linux_Resources about Split-Domains and Domain Masquerading. Other adjustments might need to be applied.

The situation will be this:

  • All emails maintain their company.com domain, and email comes and goes through the customer's ZCS server.
  • The customer's ZCS will use a forward/alias per account to go to @migrate.company.com
    • User Forward/Alias
      • Need to figure out the right way to do this for this situation still.
    • Relaying/Domain Forwarding
      • zmprov
      • md company.com zimbraMailCatchAllAddress @migrate.company.com
      • md company.com zimbraMailCatchAllForwardingAddress @migrate.company.com
      • md company.com zimbraMailTransport smtp:mta.HOSTING-COMPANY.com
  • Accounts migrated to the hosting company's server will have rewrite rules, so migrated accounts will not be exposed to the "outside" world.
    • zmprov md migrate.company.com zimbraMailCatchAllAddress @migrate.company.com zimbraMailCatchAllCanonicalAddress @company.com
    • Need instructions to configure MTA rules for this particular domain to do basically the below WITHOUT affecting the other domains within the hosting company's ZCS infrastructure.
      • $ zmprov mcf zimbraMtaRelayHost mta.COMPANY.com
      • $ zmprov mcf zimbraMtaDnsLookupsEnabled FALSE
Break Up User Migration

Determine groups of users that make sense to migrate based upon their functions and mailbox sizes. Some items might not work 100% as expected while in transition - i.e. Calendars, Resources, Wiki, etc.

Set Accounts To Maintenance

Set the targeted accounts to Maintenance status. They should stay this way after the backup is completed and remain that way until the migration to the new server is complete and the split-domain is set up.

Backup

On the customer's ZCS server, do a backup of the selected users. Be careful where you direct the backup and make sure you'll have enough space.

mkdir /tmp/backup-accounts
zmbackup -f -a userA@company.com userB@company.com userC@company.com -t /tmp/backup-accounts
Problem With Backup

Ideally, we would then do something like this:

zmbackup -i -a userA@company.com userB@company.com userC@company.com -t /tmp/backup-accounts

But this will fail, with:

Error occurred: system failure: Custom backup target is not allowed for incremental backup

And if you don't use the custom target, it seems the incremental picks up all the accounts. The point of wanting this would be to send a full backup via mail if needed and to allow the incrementals to be small enough to transfer over the network between the customer site and the hosting company.

Set Company Domain Onto Hosting Server

You'll need to set up the company's domain on your server before the zmrestore will work. You might as well configure the COS to match what the customer is using.

If you're using a non-default COS:

  • zmprov cc companyCOS
    • adjust in web admin console if needed
  • zmprov gc companyCOS | grep zimbraId
    • zimbraId: 420447db-ea43-4d92-b62d-0164767f516d
  • zmprov cd company.com
    • adjust in web admin console if needed
    • zmprov md company.com zimbraDomainDefaultCOSId 420447db-ea43-4d92-b62d-0164767f516d

Set this domain to Maintenance mode.

Restore Account Onto Hosting Server
zmrestore -a userA@company.com userB@company.com userC@company.com -t /tmp/backup-accounts
Move Restored Accounts Into Split-Domain

If split domain is already setup:

zmprov ra userA@company.com userA@migrate.company.com

If not, you could do a domain rename:

zmprov -l rd company.com migrate.company.com
Forward Accounts That Were Moved

On the customer's server, you'll now set up those accounts to forward to the split-domain.

Email for userA@company.com forwards to userA@migrate.company.com

Here's another problem though. We need to find a way to switch to a forward/alias on the company's ZCS so that emails don't get bounced with "no such account". You can't set up an alias as long as the account exists on the box.

Split-Domain Method - Untested & Not Finished

Summary Of Idea
I'm sure there's a way you could have both servers hosting the company.com domain at the same time with separate users, where all mail goes to a shared non-Zimbra MTA host that directs delivery to either the hosting or the client ZCS server depending on where the user is. Of course, you would need DNS set up to handle this (both internal and external). Could your Postini, MailProtector, or other such device do this? You would then need to figure out a way to handle internal domain delivery between the two systems. By default, LMTP would reject the delivery, thinking the account doesn't exist, because each server believes it's authoritative for the domain. Maybe the use of subdomains for aliases/forwarding would be the workaround for this.
Diagram Of Flow
                            ALL Company.com
                           emails to and from

                               MTA HEAD
                     lookups username to determine
                          server to deliver to

                                either/or
 userA@company.com is                             userB@company.com is
 userA@hosting.company.com                       userB@client.company.com
                    (above is done on MTA device)

 hosting server has:                             client server has:

 hosting.company.com has a                        client.company.com has a
 domain alias pointing to company.com             domain alias pointing to company.com
  * see http://wiki.zimbra.com/index.php?title=ManagingDomains#Creating_a_Domain_Alias *

 hosting server knows (MX record)                client server knows (MX record)
 client server for delivery of                   hosting server for delivery of
 client.company.com accounts (alias)             hosting.company.com accounts (alias)
  * also see http://wiki.zimbra.com/index.php?title=ManagingDomains#Relaying.2FDomain_Forwarding *

 Summary of hosting.company.com ZCS server
 Has a domain called company.com
 Has a domain alias called hosting.company.com that points to company.com
 All users are setup within company.com
 Migrate user(s) via zmrestore command into the domain (company.com)
 Non-migrated users will have alias setup to forward to username@client.company.com
 Will have mail-relaying/forwarding setup for client.company.com

 Summary of client.company.com ZCS server
 Has a domain called company.com
 Has a domain alias called client.company.com that points to company.com
 All users already exist within company.com
 Non-migrated user(s) will remain untouched
 Migrate user(s) via zmbackup with individual or list of users as flag
 Migrated users will have alias setup to forward to username@hosting.company.com
 Will have mail-relaying/forwarding setup for hosting.company.com

I've only just started to think through this one possible solution; maybe you can pick up what I have and go with it, to see if it might or might not work. Of course there's going to be performance overhead in regards to mail delivery, because of the MTA head being needed for routing. And there are also security (and spam) issues to think through.

Some Zimbra Users From DomainA to DomainB On Same Zimbra Server

All Accounts From DomainA to DomainB

You would use [renameDomain]:

zmprov -l rd domainA.com domainB.com

Please review domain rename though; there are some outstanding issues with it depending on your ZCS version.

Account From DomainA to DomainB

You want to use [renameAccount]:

zmprov ra userA@domainA.com userA@domainB.com

This is the same command used to rename an account within the same domain:

zmprov ra userA@domainA.com userB@domainA.com

Non-Zimbra IMAP Accounts To Zimbra

IMAPSYNC References

General IMAPSYNC References:

IMAPSYNC with admin login

Example of using admin login:

imapsync --buffersize 8192000 --nosyncacls --subscribe --syncinternaldates \
--host1 server1.test.org --user1 yourAccount --password1 yourPassword \
--host2 zimbra.test.org --user2 yourZimbraAccount --authuser2 admin \
--password2 adminZimbraPassword --authmech2 LOGIN

You can also try --authmech2 PLAIN

I found this description in one of the imapsync files:

"You may authenticate as one user (typically an admin user), but be authorized as someone else, which means you don't need to know every user's personal password. Specify --authuser1 "adminuser" to enable this on host1. In this case, --authmech1 PLAIN will be used, but otherwise, --authmech1 CRAM-MD5 is the default. Same behavior with the --authuser2 option."

Imapsync During A Certain Time Frame

You might want to check out the imapsync options of --maxage and --minage
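Building on the admin-login example above, --maxage and --minage take a number of days, so a time-boxed sync might look like this. The hosts and credentials are the placeholder values from the earlier example; this is a command fragment to adapt, not a tested recipe.

```shell
# Sync only messages between 1 and 30 days old (values are illustrative):
# --maxage skips messages older than N days, --minage skips younger ones.
imapsync --host1 server1.test.org --user1 yourAccount --password1 yourPassword \
         --host2 zimbra.test.org --user2 yourZimbraAccount --password2 yourPassword \
         --minage 1 --maxage 30
```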

Performance Issues With IMAPSYNC

MS Exchange, IMAPSYNC, And ExMerge [Microsoft Exchange Server Mailbox Merge Wizard] For Email & Other Data

See Ajcody-Migration-Notes#IMAPSYNC_References for IMAPSYNC references.

ExMerge References:

MS Exchange To Zimbra By Way Of Zimbra's Migration Wizard for Exchange

References:

Office 365 Specifics

We also have these two RFE requests:

PST Import Options

References:

Zimbra Data Back To Another Server (Exchange)

I understand there are numerous reasons why a company might need to change its mail server software; most of them are probably not technical. To help where I can, I'll start documenting the information, tricks, and tools I discover to make this process easier for you - unfortunate as it is.

The Zimbra Support team doesn't support exporting data into a non-zimbra server. You would normally get support from the company of the non-zimbra server, since you would be using their tools and processes for importing. We can only offer the references below and also point out different "export" abilities and the standard formats within Zimbra. The one migration wiki article goes over those examples.

  • Exchange Server - We don't have anything specific about moving data from Zimbra to Exchange, but the following should help or apply.
    • If you enable IMAP on Exchange, you could use IMAPSYNC. This tool is mentioned in the below url.
      • There's also the possibility of exporting the data to PST/OST files from Outlook as well.
    • You'll most likely need to manually recreate all users, resources, and DLs in Exchange - unless you can work out a way to import the LDAP data from the Zimbra server into the Exchange/Active Directory server. Calendar data can be exported in iCal format - I think there might be third-party tools for Exchange to import this.

I did a number of examples about migrations in general on the wiki; these could be adjusted for your situation. They also include URL resources on the Exchange configuration.

I would highly recommend reaching out to the sales group and exploring other options with them as well. Contact Information



Ajcody Notes Server Move

Attention.png - This article describes the steps to move a ZCS server to a new physical or virtual server. Technically, this is not 100% supported as it is not a method that the developers or QA teams test against. But because of customer demands and needs, this method is often preferred compared to Network_Edition_Disaster_Recovery.

THIS IS THE ONLY OTHER DOCUMENTED METHOD BESIDES THE Network_Edition_Disaster_Recovery WIKI FOR A ZCS SERVER MOVE PROCESS THAT ZIMBRA WILL SUPPORT AND ACCEPT SUPPORT CASES FOR.

Support cases should reference this wiki page and include a copy of all the steps and their output as you work through this how-to, noting the issue and the step in the how-to where you're stuck.

Discussion on this document is located here:

Server Move Notes

Actual Server Move Home Page

Please see: Ajcody-Notes-Server-Move

Server Move To Same Platform (32/64bit) And OS Type & Version


References

Please review the main reference for this page. It will have additional information you should be aware of before following these steps.

To do a server move via zmrestoreoffline, see:

OS Changes Also Happening

Please see these wiki articles:

Related RFE's

The following, if and when done, should directly address changes in the OS or architecture of the "new" server as compared to the old production server you are moving from.

Assumptions


Basic Assumptions

  • 1. OS Type & Version
    • This article assumes you're moving to the same OS type & version. For example, if your OLDPROD machine is running RHEL4-64bit, your new machine would run the same and be brought to the same patch level as well.
    • Note This wiki has been used many times to successfully move to a new server that was running a different OS than the OLDPROD. The one condition that does stay true though is the ZCS version must be the same between the two servers. Any ZCS "upgrades" should occur after the migration has been successfully done to the new server using the same ZCS version as the old one. Experience has shown it's generally a non-issue, but I would advise doing a 'test' move first to confirm this prior to attempting a real production server switch over. I can't address all the various issues/tweaks that might or might not need to be addressed with an OS change in this article though - hence, my strong recommendation of a proper "test" before you schedule your production server switch over. ---Ajc , Jan 27, 2015
  • 2. Same HOSTNAME
    • New server is setup with the same HOSTNAME information as OLDPROD but it will use a different IP until OLDPROD can be shutoff/reconfigured (if needed)
  • 3. Different IP Issues - Either For Testing Or As A Permanent Change
    • Problems caused by different IPs being referenced in ZCS variables will usually show up on the NEWHOST as services [usually mailboxd] failing to start.
      • Note about command examples below.
        • I used `zmhostname` rather than typing out an explicit hostname.
        • I also included the -l option for the zmprov ms examples since the NEWHOST most likely will only be able to start the ldap service until you have these variables changed.
      • One example variable to look for:
        • zmprov gs `zmhostname` zimbraLmtpBindAddress
        • To modify on NEWHOST - as zimbra
          • zmprov -l ms `zmhostname` zimbraLmtpBindAddress "XXX.XXX.XXX.XXX"
            • replace XXX.XXX.XXX.XXX with the ip address on the NEWHOST
      • Another example variable to look for - this would be an issue if the NEWHOST is on a different subnet than OLDPROD :
        • zmprov gs `zmhostname` zimbraMtaMyNetworks
        • An example to modify on the NEWHOST - as zimbra
          • zmprov -l ms `zmhostname` zimbraMtaMyNetworks '127.0.0.0/8 10.10.130.0/24'
    • Since there are many possible variables that might include a specific IP reference, it's best to run something like the following on OLDPROD and review any variables you have set that use an explicit IP address.
su - zimbra
ip addr
ifconfig -a
[ NOTE THE IP's BEING USED - You'll grep for the first octet below ]
[ For example, my server's eth0 is using 192.168.0.71 ]
zmlocalconfig | grep 192
[192 is based upon my example]
zmprov gacf | grep 192
zmprov gs `zmhostname` | grep 192
  • 4. SAME AMOUNT OF MEMORY
    • If you're moving from 32bit to 32bit and the new system has more than 4GB of memory while the older one didn't, you will most likely need to adjust mailboxd_java_heap_memory_percent . This problem will show up as mailboxd failing to start and /opt/zimbra/log/zmmailboxd.out logging errors about the JVM memory heap. Try the following:
su - zimbra
zmlocalconfig -e mailboxd_java_heap_memory_percent=25
zmmailboxdctl restart
If You're Attempting A Switch From 32 to 64

Update - 02/01/2014 - I created https://wiki.zimbra.com/wiki/Ajcody-Notes-Server-Move-32-To-64 so it's less confusing to do the steps on this page when you have a 32>64 transition. - Adam Cody

Note - If you skipped this section for whatever reason and run into the problem later as
you're attempting to start/install on the new box, you can still fix it. Use rpm2cpio to extract the
needed data for the zimbramon/ directory. Move the existing one at /opt/zimbra/ to a tmp
location and then copy/move the one from your rpm2cpio extraction.
Get the path to the zimbra-core package within the extracted tarball you have for the zimbra installation :
ex: /root/zcs-NETWORK-6.0.6_GA_2324.RHEL5_64.20100406133038/packages
Change directory to a temporary location that you'll extract the data to :
ex: rpm2cpio /root/zcs-NETWORK-6.0.6_GA_2324.RHEL5_64.20100406133038/packages/zimbra-core-6.0.6_GA_2324.RHEL5_64-20100406133038.x86_64.rpm | cpio -idmv
You'll be able to find the zimbramon directory under there now to copy over to the /opt/zimbra path.
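The recovery described above can be sketched as one short script. All paths and the package filename are the examples from this section, not universal values; substitute your own version. With DRYRUN=echo set, the script only prints the commands - unset it when you're ready to execute as root:

```shell
# Sketch only -- paths/filenames are examples from this page; adjust for your version.
# DRYRUN=echo prints the commands instead of executing them; unset it to run as root.
DRYRUN=echo
PKG=/root/zcs-NETWORK-6.0.6_GA_2324.RHEL5_64.20100406133038/packages/zimbra-core-6.0.6_GA_2324.RHEL5_64-20100406133038.x86_64.rpm
EXTRACT=/tmp/zimbra-core-extract

# Extract the rpm contents into a temporary location.
$DRYRUN mkdir -p "$EXTRACT"
$DRYRUN sh -c "cd $EXTRACT && rpm2cpio $PKG | cpio -idmv"

# Set the broken zimbramon aside, then copy in the freshly extracted one.
$DRYRUN mv /opt/zimbra/zimbramon /opt/zimbra/zimbramon-old
$DRYRUN cp -rp "$EXTRACT/opt/zimbra/zimbramon" /opt/zimbra/
```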


The details of this document assume that both servers are of the same chip type - 32bit or 64bit. Below are notes to "tweak" this how-to to handle a switch as well from 32 to 64.

The two exceptions to this how-to's instructions would be handling the zimbramon directory and the inclusion of the 32>64 ldap export/import steps listed in Network_Edition:_Moving_from_32-bit_to_64-bit_Server :

  1. On the "new server" , install the 64-bit version of ZCS that is the same version your old server is running. The only difference is that it's the 64-bit rather than the 32-bit tarball of ZCS.
  2. Copy the 64-bit install's /opt/zimbra/zimbramon to a tmp dir
    • [as root] cp -rp /opt/zimbra/zimbramon /opt/zimbramon-64bit
  3. On the "old server" do the zmldapcat steps listed in Network_Edition:_Moving_from_32-bit_to_64-bit_Server
  4. Proceed with the steps listed in the main body of this how-to
    1. When you start step 7 in Ajcody-Notes-Server-Move#The_Big_Day_-_PROD_Downtime_For_Switch section stop.
  5. Copy the saved 64-bit zimbramon over the /opt/zimbra/zimbramon dir
    • [as root] cp -rp /opt/zimbramon-64bit/* /opt/zimbra/zimbramon 
      • This is to bypass a reported error, shown below, during the ./install.sh step. Reports so far have been against Ubuntu.
        • Can't load '/opt/zimbra/zimbramon/lib/i486-linux-gnu-thread-multi/auto/IO/IO.so' for module IO: 
          /opt/zimbra/zimbramon/lib/i486-linux-gnu-thread-multi/auto/IO/IO.so: wrong ELF class:
          ELFCLASS32 at /usr/lib/perl/5.8/XSLoader.pm line 70.
          at /opt/zimbra/zimbramon/lib/i486-linux-gnu-thread-multi/IO.pm line 11
  6. Save a copy of your .saveconfig directory and /opt/zimbra/conf/localconfig.xml in case you need to fall back on them. It seems that ./install.sh -s removes the files in .saveconfig and alters localconfig.xml , which means when you go to run ./install.sh later it will not see your server's configuration variables.
    • [as root] cp -rp /opt/zimbra/.saveconfig /opt/saveconfig-bak
    • [as root] cp -p /opt/zimbra/conf/localconfig.xml /opt/localconfig.xml 
  7. Do a ./install.sh -s from the zimbra installation extract directory as root and then copy back your files from the saveconfig-bak directory.
    • [as root] cp -rp  /opt/saveconfig-bak/* /opt/zimbra/.saveconfig/
    • [as root] cp -p /opt/localconfig.xml /opt/zimbra/conf/ 
  8. Then follow the steps listed in the 32>64 bit wiki page about restoring the ldap data
    • Either the 5.x or 6.x section entitled:
      • "5.0.x or previous LDAP setup: 1. Restore the LDAP data to the 64-bit server. As zimbra, type"
      • "6.0.x and later LDAP setup: 1. Restore the LDAP data to the 64-bit server. As zimbra, type"
    • Stop at the point you just finished doing the
      • "/opt/zimbra/openldap/sbin/slapadd -q -b "" -F /opt/zimbra/data/ldap/config -cv -l /backup/ldap.bak"
  9. Now proceed with the step 7 section of the server move wiki, which is the full install/upgrade -- ./install.sh .
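The LDAP export/import bracketing steps 3 and 8 can be sketched as below. The /backup path is a placeholder, and the two commands are the ones this page and the 32>64 wiki quote (zmslapcat on the old box, slapadd on the new) - double-check against that wiki for your version. DRYRUN=echo only prints the commands:

```shell
# Sketch only -- /backup is a placeholder path; verify both commands against
# Network_Edition:_Moving_from_32-bit_to_64-bit_Server for your ZCS version.
# DRYRUN=echo prints the commands; unset it to run them as zimbra.
DRYRUN=echo

# Step 3, on the old 32-bit server: dump LDAP, producing /backup/ldap.bak .
$DRYRUN /opt/zimbra/libexec/zmslapcat /backup

# Step 8, on the new 64-bit server, after ./install.sh -s (6.0.x-style layout):
$DRYRUN /opt/zimbra/openldap/sbin/slapadd -q -b "" -F /opt/zimbra/data/ldap/config -cv -l /backup/ldap.bak
```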
Moving To A New VMware Image Using Different OS

A short overview I wrote:

Optional Uses


For DR Prior To Upgrades

You could also use this method to make a "copy" of your old installation prior to doing an upgrade.

For Local Disk Partition Changes Rather Than A Server Move

Let's say you're going to add additional storage and you want to move /opt/zimbra , which was on the root partition / . You could use the rsync steps listed here to make that move and decrease the downtime window versus shutting down Zimbra to do a move or copy operation to the new partition. Remember to adjust your /etc/fstab afterwards, otherwise Zimbra might not come up after your next server restart.

A typical example of this situation is a customer who also wants to change the underlying OS/distro, needs to move the zimbra directory off the / partition onto a new partition they now have available, etc.

Note - if you switch the OS type/version, you'll need to include the steps below about getting the right version of the ZCS binaries on the system and so forth since your 'rsync' copy would have the older version. For the re-install to work, you need the correct binaries - mostly so ldap can run during the installation/upgrade.

Other Sync Method Processes


BackupPC

Some customers have been using BackupPC for server moves and roll backs from ZCS upgrades.

Please see:

Important Advice About Testing Or Practicing This How-To


Attention.png We strongly encourage customers to 'test' this how-to prior to scheduling
your final downtime for the production server to be moved.

If you plan accordingly, you'll only need two downtime windows on the production server to include this testing. These downtime windows will be much shorter using rsync versus other means to do your server move.

To allow for this recommended testing, please adjust the how-to steps below with the following information. Since at some point you'll have to shut down Zimbra on the production server to get your final rsync, you should preserve the rsync'd data on the NEWHOST if you have the available disk space.

Alteration to the rsync targets:

  1. rsync OLDPROD:/opt/zimbra data to NEWHOST:/opt/zimbra-BACKUP rather than NEWHOST:/opt/zimbra .
  2. Then rsync or cp [with appropriate option flags] to NEWHOST:/opt/zimbra .
    • Leaving the NEWHOST:/opt/zimbra-BACKUP intact for later reuse for testing, etc.

For retesting purposes, you have two possible methods.

  1. First - just rsync over the NEWHOST:/opt/zimbra directory
    • You would use the rsync command that includes the --delete option, since we want NEWHOST:/opt/zimbra to be exactly the same as NEWHOST:/opt/zimbra-BACKUP .
  2. Second - remove NEWHOST:/opt/zimbra and then rsync from NEWHOST:/opt/zimbra-BACKUP [copying everything over again into a new and empty NEWHOST:/opt/zimbra ]
    1. run the uninstall option via the zimbra installation script - install.sh -u - on the NEWHOST.
    2. [NEWHOST] rm -rf /opt/zimbra . We want to confirm all files are gone.
    3. Remove any 'zimbra' files in /tmp
    4. And now restart the how-to with a clean zcs installation followed by another rsync of NEWHOST:/opt/zimbra-BACKUP to NEWHOST:/opt/zimbra .
  3. Note - I just used /opt/zimbra above, you'll need to include other data/zmvolume paths as well if you have them.
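The staging pattern above can be sketched as two rsync passes. NEWHOSTIP is a hypothetical address; with DRYRUN=echo set, the commands are only printed:

```shell
# Sketch only -- NEWHOSTIP is a placeholder; DRYRUN=echo prints rather than executes.
DRYRUN=echo
NEWHOSTIP=192.0.2.10   # hypothetical NEWHOST address

# 1. From OLDPROD: sync into a preserved staging copy on NEWHOST.
$DRYRUN nice -n 19 rsync -avzH -e ssh --progress /opt/zimbra/ root@$NEWHOSTIP:/opt/zimbra-BACKUP

# 2. On NEWHOST: seed (or re-seed, with --delete) the working tree from the
#    staging copy, leaving /opt/zimbra-BACKUP intact for later test runs.
$DRYRUN rsync -aH --delete /opt/zimbra-BACKUP/ /opt/zimbra
```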

A final word of advice: please consider installing bind or some other DNS service on your NEWHOST so you can handle resolution locally on that server to resolve to the different ip address the NEWHOST is using. Consult Ajcody-Hostname-DNS and Split_DNS for more details.
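The local-DNS suggestion is easiest to verify with a quick resolution check on NEWHOST before running any zm* commands. Here, mail.example.com stands in for your real ZCS hostname, and the host utility is assumed to be installed:

```shell
# Sketch: verify local name resolution on NEWHOST. mail.example.com is a
# placeholder for your ZCS hostname; extend the list with any other names you use.
for name in mail.example.com "$(hostname)"; do
    echo "== $name =="
    host -W 2 "$name" || echo "WARNING: could not resolve $name"
done
# Compare the answers against NEWHOST's own IP -- they must NOT point at OLDPROD.
```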

The Actual Steps


Preparing NEWHOST Server


Attention.png Please Note - DO NOT USE CRON WHEN SYNCING YOUR SERVER!
You should always run the rsync command manually. If you fail to do so, then complete the migration without removing the cron job that did the syncs, and leave the old server running, you will lose data. The sync will kick off again when cron runs and overwrite your production data! This will very likely corrupt your installation and leave you with an unstable system!


  1. Install Newer (supported) Operating System that matches OLDPROD
  2. Set up newer ZCS Server’s Hostname as it was on the older server
  3. Configure BIND locally on NEWHOST to handle resolution issues (A, MX, etc.)
    • Attention.png Please Note - The NEWHOST MUST RESOLVE THE ZMHOSTNAME TO ITS OWN IP ADDRESS!
      • If DNS resolution of the zmhostname on the NEWHOST resolves to OLDPROD's ip address, this will most likely cause a production outage on OLDPROD, since all zm* commands run on NEWHOST will actually be sent to OLDPROD. You will most likely need to do a DR recovery on OLDPROD if you don't confirm the ip resolution is correct on the NEWHOST.
    • On NEWHOST, confirm /etc/hosts and /etc/resolv.conf are set correctly to resolve to NEWHOST's ip address and NOT OLDPROD's ip address.
  4. Download the EXACT version of ZCS that your OLDPROD is using - meaning the same ZCS release number [e.g. 8.0.7] and the build for your distro/OS.
  5. ZIMBRA User And UID Match
    • On OLDPROD as ZIMBRA:
      zmlocalconfig zimbra_uid
    • On OLDPROD, note the zimbra entry in /etc/passwd and /etc/group.
    • On NEWHOST, configure /etc/passwd so the zimbra UID matches what OLDPROD had.
    • On NEWHOST, configure the zimbra entry in /etc/group to match OLDPROD's as well.
  6. On NEWHOST as ROOT: Run the installer with the -s option:
    ./install.sh -s
    • This tells the installer to only install the software, and not to configure the installation.
    • To see what is installed and enabled on the OLDPROD server, do the following on OLDPROD:
    • zmprov gs `zmhostname` | grep zimbraService
    • Save this list, you'll need it also when you rerun the installer later on the NEWHOST to confirm the "upgrade" does the right package upgrades/installs.
  7. On NEWHOST as Root: Remove the dummy install:
    rm -rf /opt/zimbra ; mkdir /opt/zimbra
  8. On NEWHOST, make any other mounts or directories you'll need as to match the OLDPROD server.
    • Secondary mailstores, alternative backup directory paths, etc.
    • On OLDPROD, double check for these additional mounts by doing:
      • reviewing /etc/fstab and output of the df command.
      • run, as zimbra, the following: zmvolume -l and review the output and directory paths.
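The UID check in step 5 above can be scripted as a quick comparison. The passwd entries below are hypothetical stand-ins for the real lines you'd pull from each server's /etc/passwd:

```shell
# Sketch: compare the zimbra UID between OLDPROD and NEWHOST before syncing.
# These entries are hypothetical examples -- paste in the real lines from each host.
OLD_ENTRY="zimbra:x:500:500::/opt/zimbra:/bin/bash"   # from OLDPROD's /etc/passwd
NEW_ENTRY="zimbra:x:500:500::/opt/zimbra:/bin/bash"   # from NEWHOST's /etc/passwd

old_uid=$(echo "$OLD_ENTRY" | cut -d: -f3)
new_uid=$(echo "$NEW_ENTRY" | cut -d: -f3)

if [ "$old_uid" = "$new_uid" ]; then
    echo "zimbra UID matches: $old_uid"
else
    echo "MISMATCH: OLDPROD=$old_uid NEWHOST=$new_uid -- fix NEWHOST's /etc/passwd"
fi
```

Repeat the same comparison for the group entry (field 4, or the line in /etc/group).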
Sync OLDPROD Data While OLDPROD Is Still In Production Use


Attention.png Please Note - DO NOT USE CRON WHEN SYNCING YOUR SERVER!
You should always run the rsync command manually. If you fail to do so, then complete the migration without removing the cron job that did the syncs, and leave the old server running, you will lose data. The sync will kick off again when cron runs and overwrite your production data! This will very likely corrupt your installation and leave you with an unstable system!


Preliminary Comments
Rsync options used below.
I've added the -H option to the rsync command to preserve hard links.
-H, --hard-links , preserves hard links
-a, --archive . This is equivalent to -rlptgoD .
From man page:
It is a quick way of saying you want recursion and want to preserve almost everything (with -H being a notable omission). The only exception to the above equivalence is when --files-from is specified, in which case -r is not implied. Note that -a does not preserve hardlinks, because finding multiply-linked files is expensive. You must separately specify -H.
-z, --compress
From man page:
With this option, rsync compresses the file data as it is sent to the destination machine, which reduces the amount of data being transmitted -- something that is useful over a slow connection.
You should upgrade rsync on both servers to version 3+. It addresses some performance issues. Please see the following for more details.
http://www.samba.org/rsync/FAQ.html#4
Nice and the numeric range, from the RHEL5 nice man page:
Nicenesses range from -20 (most favorable scheduling) to 19 (least favorable).
-n, --adjustment=N add integer N to the niceness (default 10)


  1. First initial sync of OLDPROD to NEWHOST
    • on OLDPROD as Root
      nice -n +19 rsync -avzH -e ssh --progress /opt/zimbra/ root@NEWHOSTIP:/opt/zimbra
      • For ZCS8+ , YOU HAVE TO EXCLUDE THE LDAP DB DIRECTORY [default path is /opt/zimbra/data/ldap ] in your RSYNC command.
      • On my test box, Ubuntu 12.10 , the syntax for rsync to exclude the default ldap path would be :
        • nice -n +19 rsync -avzH --exclude='data/ldap/*' -e ssh --progress /opt/zimbra/ root@NEWHOSTIP:/opt/zimbra
      • Do the same with other paths you might have - i.e. secondary volumes.
        • [as zimbra] zmvolume -l
      • Some distros might want this format for nice :
        nice +19 rsync ....
  2. Sync daily until schedule downtime is available
    • On OLDPROD as Root
      nice -n +19 rsync -avzH -e ssh --progress /opt/zimbra/ root@NEWHOSTIP:/opt/zimbra
      • For ZCS8+ , YOU HAVE TO EXCLUDE THE LDAP DB DIRECTORY [default path is /opt/zimbra/data/ldap ] in your RSYNC command.
      • On my test box, Ubuntu 12.10 , the syntax for rsync to exclude the default ldap path would be :
        • nice -n +19 rsync -avzH --exclude='data/ldap/*' -e ssh --progress /opt/zimbra/ root@NEWHOSTIP:/opt/zimbra
      • Do the same with other paths you might have - i.e. secondary volumes.
      • Some distros might want this format for nice :
        nice +19 rsync ....
    • Extra Point Comment
      • One could also use a more aggressive nice value to start a job after hours but then include a renice command via cron prior to your normal business hours the next day. You would need to script this out though since it would need the process # of the command you wanted to renice.
    • Copy your crontab over to the new system - you should be able to use scp or rsync. The file is /var/spool/cron/zimbra .
      • You can then compare the 'old' one to the new one that was created by the dummy install. You might have customizations that aren't in the new one provided by the basic install - ie, backup schedule, hsm runs, etc.
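The crontab copy-and-compare above might look like the following; NEWHOSTIP is a placeholder, and with DRYRUN=echo set the commands are only printed:

```shell
# Sketch only -- NEWHOSTIP is a placeholder; DRYRUN=echo prints rather than executes.
DRYRUN=echo
NEWHOSTIP=192.0.2.10

# Stash OLDPROD's zimbra crontab on NEWHOST for comparison, rather than
# overwriting the one the dummy install created.
$DRYRUN scp /var/spool/cron/zimbra root@$NEWHOSTIP:/root/zimbra.crontab.oldprod

# Later, on NEWHOST: diff against the freshly installed one and merge any
# customizations (backup schedule, HSM runs, etc.) by hand.
$DRYRUN diff /root/zimbra.crontab.oldprod /var/spool/cron/zimbra
```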
The Big Day - OLDPROD Downtime For Switch


  1. Commercial Certificates
    • Are you prepared for the impact this server move has for your commercial certificate? This document doesn't address that.
  2. Block client access to the old server's IP address with firewall rules
    • If you're remote, make sure to keep your access port open. We are just trying to prevent any changes to the machines while they are being reconfigured.
  3. Shut down Zimbra on OLDPROD
    • On OLDPROD as ZIMBRA
      su - zimbra
      zmcontrol stop
  4. Attention.png Please Note - Prior To Final Rsync
    • YOU MUST PERFORM THIS STEP WITH ZIMBRA DOWN!
      Please ensure that you have the Zimbra server STOPPED before performing the final rsync. If you're using this method to test your server move, you must schedule downtime to stop the server before performing the final sync. This is required, and failure to do so will result in corruption of your mysql/ldap databases on the test server.
  5. Get LDAP in a good state prior to the final rsync [as recommended by our LDAP developer - 12/1/2012]
    1. For Versions Prior to ZCS8 , use db_recover - see the following reference for your version:
      1. http://wiki.zimbra.com/wiki/Ajcody-LDAP-Topics#Attempt_To_Cover_Versions_Higher_Than_ZCS5_-_I.27ve_yet_to_confirm_the_below
    2. For ZCS8 , db_recover isn't needed. You must include the -S option though for rsync.
  6. Last rsync of OLDPROD to NEWHOST
    • On OLDPROD as ROOT
      nice -n -20 rsync -avzHS -e ssh --delete --progress /opt/zimbra/ root@NEWHOSTIP:/opt/zimbra 
      UPDATE : I've added the -H option to the rsync command to preserve hard links.
      UPDATE : I've added the -S option to the rsync command to handle sparse files. With ZCS8's ldap setup, this is necessary. See OpenLDAP_Performance_Tuning_8.0#Notes_on_MDB for details. 12/1/2012
      • Do the same with other paths you might have - i.e. secondary volumes.
      • Some distros might want this format for nice :
        nice -20 rsync ....
  7. Fix permissions on NEWHOST
    • On NEWHOST as ROOT run
      /opt/zimbra/libexec/zmfixperms --verbose --extended
  8. Turn off OLDPROD and reconfigure NEWHOST
    • This is a good time to turn off OLDPROD.
      • Reconfigure the network interfaces so if someone turns on OLDPROD later, it will not use the ip addresses that will now be used on NEWHOST.
      • Reconfigure any mounts (san, nfs, iscsi, etc.) so it will not mount anything that should only be mounted on our NEWHOST. Again, in case the machine is powered on accidentally later.
    • Reconfigure NEWHOST to take over ip addresses of OLDPROD.
      • Reconfigure the following now to use the production ip address etc:
        • The network interface configuration files per your distro
        • /etc/hosts
        • /etc/resolv.conf to use your production DNS servers
          • disable BIND on this server if you set it up temporarily until you could use the production DNS servers
      • Make any firewall or other network changes that are necessary.
        • Remember about arp tables.
    • Reconfigure for any mounts that were on OLDPROD that will be needed for NEWHOST.
  9. Install of Zimbra on NEWHOST
    • If you have a Split DNS install or use private LAN addresses on the server with a firewall front-ending the public addresses, you'll want to verify both logical hostname and actual hostname resolution
      • In some cases, you can move Zimbra to another server with a different hostname but keep the logical hostname the same. The logical hostname is what users know this server as, and it doesn't necessarily have to match the actual hostname.
        • For example, you might have "mail.mydomain.com" as the DNS name for the server, but the hostname is "web11233"
        • You need to have the server itself resolve the logical hostname, the old hostname and the new hostname as the internal private LAN address.
        • host `hostname`
        • host (logical hostname)
        • nslookup `hostname`
        • nslookup (logical hostname)
        • nslookup (old hostname)
        • nslookup (new hostname)
      • You want all these to look the same. You can follow instructions at Split_dns. Essentially what that does is have a local copy of bind (named) running that resolves just those names and forwards all other lookups to your normal DNS servers.
    • On NEWHOST as ROOT, rerun the installer without the -s option. Again, this uses the EXACT version of ZCS that your OLDPROD is using - the same ZCS release number [e.g. 8.0.7] and the build for your distro/OS. Any attempt to upgrade to a newer version of ZCS must be done AFTER you've completed this wiki page in full.
      ./install.sh --skip-activation-check   #OR This If your version doesn't support the skip activation option#  ./install.sh
      • It will detect ZCS already installed, and ask if you want to upgrade. Select Yes.
      • You can confirm the package installation/upgrade selection by comparing the output from the OLDPROD host by running this on the OLDPROD host
        • zmprov gs `zmhostname` | grep zimbraService
      • Attention.png Note - If the configuration screen looks like it doesn't have your old values, then the localconfig.xml in /opt/zimbra/conf must have been altered. To fix:
        • Stop/cancel the install - you'll see an option for it from the list of options.
        • Copy from your OLDPROD's /opt/zimbra/conf/ directory the localconfig.xml file to the NEWHOST's /opt/zimbra/conf/ directory.
        • Restart the installation again with ./install.sh --skip-activation-check or ./install.sh , depending on your version
        • If that fails to fix the install configuration screen, redo the above but also remove /opt/zimbra/.saveconfig from NEWHOST and copy that directory from OLDPROD.
      • Within the installation script, you might want to choose the option that tells Zimbra to NOT automatically start upon completion of upgrade/install.
      • Confirm your license is valid and activated:
        •  zmlicense -c ; zmlicense -p | grep -A4 Activation
  10. Post-Install on NEWHOST
    • This document assumes you were going to keep the same hostname and ip address once the final move was done. In case this isn't true, below are some follow-up issues you might want to check. You might have done some of these already.
      • Do you need to make adjustments for commercial certificates?
      • Reconfigure any network interface/ip information that you need because of hardware move.
      • Make necessary adjustments you might need because of hostname changes. ( see ZmSetServerName )
      • Adjust any firewall settings
        • If ip address is going to be different, make sure you know the settings you'll need to adjust within Zimbra (if any).
        • If the ip address is going to be the same, remember your network will take a while to see the change as the new MAC address gets propagated to other devices' ARP tables.
          • If you can, you can speed this along with changes on your switches.
  11. Start zimbra once you think everything is ready.
    • Do some client access tests within your LAN.
    • If testing looks good, then remove any firewall rules you might have added to block access from outside. Then confirm outside access and functionality.
      • Remember to check those mobile devices, certificates, and other access software/devices besides just the Zimbra webclient.
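A post-switch smoke test along the lines of step 11 might look like this. mail.example.com stands in for your ZCS hostname and the port list is just the ZCS defaults - adjust both for your site; DRYRUN=echo prints the commands instead of running them:

```shell
# Sketch only -- hostname and ports are placeholders/defaults; DRYRUN=echo
# prints the commands. Unset it to run for real once services are up.
DRYRUN=echo

# As zimbra on NEWHOST: all services should report Running.
$DRYRUN zmcontrol status

# From a LAN client: check the usual service ports (SMTP, HTTP, HTTPS, admin).
for port in 25 80 443 7071; do
    $DRYRUN nc -z -w 5 mail.example.com $port
done
```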

Things To Check Before Going Production Again

See the post checks from the DR page:

Additional Topics


Zimbra Domains Into An Existing Zimbra Server

See Ajcody-Migration-Notes . There are a couple of possibilities being worked out there.

Verified Against: unknown Date Created: 7/23/2008
Article ID: https://wiki.zimbra.com/index.php?title=Ajcody-Server-Plan-Move-Migration-Upgrade-DR Date Modified: 2009-07-21




Ajcody Notes Multi Server Restore Disaster Recovery (DR)



Actual Multi-Server Restore-DR Notes Homepage

Please see Ajcody-Notes-Multi-Server-Restore-DR

Bugs I've Filed That Might Apply

Issues I Still Need To Confirm

As reported from one customer's experience:

"The zmmyinit command does not work when used as directed. I found that for it to work
the mailstore has to be started then stopped then reinitialize the db, this sets some
configs somewhere allowing zmmyinit to create/start the db and connect. After that the
rest of the restore procedure works."

Multi-Server Restore-DR How-To

Introduction

This document is for disaster recovery, where the /opt/zimbra data is destroyed [hardware failure], missing, or completely inaccessible for some reason. There are better options if it’s a matter of some data being corrupt or some function not working properly.

Full restores can take a while, so plan accordingly.

More To Do

  • Separate sections for LDAP+MTA versus LDAP-only and MTA-only
    • Include (need to get details still) MTA only steps
  • I still need to adjust this document to allow administrators to practice a DR situation.
    • Special consideration to take when production system is still live
    • Can production multi-server be DR'd to a single server?
      • Commands needed to adjust configurations for this situation
    • Private network
    • Internal DNS server for practice DR servers to use
    • How to handle different hostnames if one wanted/needed to.

Note: whenever an application is moved to a different physical host, one must account for the DNS TTL on hostname/ip addresses and also ARP table caching on clients and network devices with regard to the old MAC address of the network interfaces. Down the road, I hope to add another wiki write-up on this topic and the different solutions & practices to handle it.

Pre-Install

Confirm base OS restore/configuration details.

  • Hostname  :
    cat /etc/hostname /etc/hosts
  • IP configuration  :
    ip addr
    • or  :
      ifconfig -a
  • DNS resolution  :
    cat /etc/resolv.conf ; host -a `hostname -f`
  • Routing information  :
    netstat -rn
  • Local and remote filesystems :
    cat /etc/fstab ; df -k
  • Access to backup data

The information you need from the old server including:

  • Installation packages - options to the yes/no questions
  • License file
  • Admin account name and password
  • Spam training and non-spam training user account names
  • Exact domain name
  • The global document account name
  1. If the server is not running.
    1. Block client access to the server IP address with firewall rules.
    2. Find the latest full ZCS backup session to use.
  2. If ZCS is still running, prepare the move to the new server:
    1. Block client access to the server’s IP address with firewall rules.
    2. Run a full backup of the old server, or if the backup is recent, run an incremental backup to get the most current incremental backup session.
    3. Run zmcontrol stop, to stop ZCS. In order to restore to the most current state, no new mail should be received after the last incremental backup has run.
    4. Change the hostname and IP address on the old server to something else. Do not turn off the server.
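The pre-install checks above can be gathered into one report with something like the following. The commands are the generic Linux ones; some may need substitutes on your distro (e.g. ip route in place of netstat):

```shell
# Sketch: collect the pre-install facts into one report on the restore target.
# Fallbacks (|| ...) cover distros missing a given tool; run as root.
REPORT=/tmp/preinstall-report.txt
{
    echo "== hostname ==";    cat /etc/hostname /etc/hosts 2>/dev/null || true
    echo "== interfaces ==";  ip addr 2>/dev/null || ifconfig -a 2>/dev/null || true
    echo "== dns ==";         cat /etc/resolv.conf 2>/dev/null || true
    echo "== routes ==";      netstat -rn 2>/dev/null || ip route 2>/dev/null || true
    echo "== filesystems =="; cat /etc/fstab 2>/dev/null || true; df -k || true
} > "$REPORT"
echo "wrote $REPORT"
```

Keep the report with your DR notes alongside the license file and account names listed above.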

LDAP Server (Everything but Mailstore)

  1. Reinstall/Restore the Zimbra Collaboration Suite
    1. Ensure the host hostname and MX DNS records resolve to the new server.
    2. Untar the matching version of the old server or restore the /opt/zimbra directory
    3. If this is NOT a new server, then :
      1. sudo -u zimbra /opt/zimbra/bin/zmcontrol stop
      2. rpm -qa | grep zimbra
      3. rpm -e zimbra-<package-name>.rpm
    4. Copy the License file (ZCSLicense.xml) to a path on your server
    5. Find and run the install.sh script
      1. Choose all options as original install for modules
        1. Mailstore shouldn't be selected, nor MTA if you had a separate MTA server
      2. Make sure the same domain, hostname, passwords as on the old server are used.
      3. Configure the SMTP server to the LDAP server hostname or a separate MTA server you might be using.
      4. Disable auto-backup and starting of servers after configuration in the configuration menu
      5. Apply configuration changes
  2. Confirm ZCS isn’t running.
    1. su - zimbra ; /opt/zimbra/bin/zmcontrol stop
  3. Copy the backup files to /opt/zimbra/backup if not already present.
  4. Restore the LDAP server (zmrestoreldap -lb <latest_session_label>)
    1. Note: To find the LDAP session label to restore, type zmrestoreldap -lbs
    2. If you are restoring a large number of accounts, you may want to run this command with nohup so that the session does not terminate. (Observe whether the LDAP server is started successfully after the restore; it must be running for the next steps.)
    3. Note: The zmrestoreldap script included in ZCS 4.5.7 through ZCS 4.5.10 and ZCS 5.0 through ZCS 5.0.1 is broken. This is being tracked as Bug 23644: zmrestoreldap not taking accesslog db into consideration. The fix will be included in ZCS 4.5.11 and ZCS 5.0.2. You can also download an updated script with the fix from these links:
      1. ZCS 4.5.x: http://files.zimbra.com/downloads/4.5.10_GA/zmrestoreldap_4511
      2. ZCS 5.0.x: http://files.zimbra.com/downloads/5.0.1_GA/zmrestoreldap_502
    4. Type zmconvertctl start. This is required before running zmrestoreoffline.
    5. Because some ZCS services are running at this point, type zmcontrol stop to stop all services.
    6. Remove any old backup sessions because these sessions are no longer valid.
      1. Type
        rm -rf /opt/zimbra/redolog/* /opt/zimbra/backup/*
    7. Start the Zimbra server (zmcontrol startup)
    8. Confirm and get the root LDAP password for the mailstore setup.
      1. zmlocalconfig -s ldap_root_password
      2. Go to the Zimbra administration console to verify that the accounts are set to active.
        1. (Accounts>General tab)
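The LDAP restore sequence above, condensed into one sketch. The session label is a made-up placeholder (list the real ones with zmrestoreldap -lbs), and DRYRUN=echo prints the commands instead of running them:

```shell
# Condensed sketch of the LDAP restore order above. SESSION is a hypothetical
# label; DRYRUN=echo prints the commands -- unset it to run them as zimbra.
DRYRUN=echo
SESSION=full-20090721.120000.123   # placeholder: pick the label zmrestoreldap -lbs shows

$DRYRUN zmrestoreldap -lb $SESSION        # restore LDAP from the chosen session
$DRYRUN zmconvertctl start                # required before zmrestoreoffline later
$DRYRUN zmcontrol stop                    # some services come up during the restore
$DRYRUN rm -rf /opt/zimbra/redolog/* /opt/zimbra/backup/*   # old sessions now invalid
$DRYRUN zmcontrol startup
```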

Mailstore Server (Nothing but Mailstore option in the installer)

  1. Reinstall/Restore the Zimbra Collaboration Suite
    1. Ensure the host hostname and MX DNS records resolve to the new server.
    2. Untar the matching version of the old server or restore the /opt/zimbra directory
    3. If this is not a new server, then :
      1. sudo -u zimbra /opt/zimbra/bin/zmcontrol stop
      2. rpm -qa | grep zimbra
      3. rpm -e zimbra-<package-name>.rpm
    4. Copy the License file (ZCSLicense.xml) to a path on your server
    5. Find and run the install.sh script
      1. Choose Mailstore for module to install
        1. Install other modules you might have had for your Mailstore server.
      2. Make sure the same domain, hostname, and passwords as on the old server are used.
      3. Configure LDAP using the LDAP server hostname and LDAP ROOT password found with:
        1. zmlocalconfig -s ldap_root_password
      4. Configure the SMTP server to the LDAP server hostname or a separate MTA server you might be using.
      5. Disable auto-backup and starting of servers after configuration in the configuration menu
      6. Apply configuration changes
  2. Confirm ZCS isn’t running.
    1. su - zimbra; /opt/zimbra/bin/zmcontrol stop
  3. Re-initialize MySQL
    1. rm -rf /opt/zimbra/db/data/*
    2. /opt/zimbra/libexec/zmmyinit
  4. Copy the backup files to /opt/zimbra/backup if not already present.
  5. Type zmconvertctl start. This is required before running zmrestoreoffline.
  6. Run zmrestoreoffline to restore all account mailboxes and messages
    1. To start the offline restore, type zmrestoreoffline -sys -a all -c -br. You may want to run it with nohup here also. To watch the progress, tail /opt/zimbra/log/mailbox.log. Note: Use -c on the command line so that accounts will be restored even if some accounts encounter errors during the offline restore process.
    2. Because some ZCS services are running at this point, type zmcontrol stop to stop all services.
    3. Remove any old backup sessions because these sessions are no longer valid.
    4. Type
      rm -rf /opt/zimbra/redolog/* /opt/zimbra/backup/*
    5. Start the Zimbra server (zmcontrol startup)

Post-Server setup

  1. Remove the firewall rules and allow client access to the new server.
Verified Against: Zimbra Collaboration 8.0, 7.0 Date Created: 04/16/2014
Article ID: https://wiki.zimbra.com/index.php?title=Ajcody-Server-Plan-Move-Migration-Upgrade-DR Date Modified: 2009-07-21



Ajcody Disaster Recovery Specific Notes



Actual Disaster Recovery Specific Notes Homepage

Please see: Ajcody-Disaster-Recovery-Specific-Notes

Bugs Filed That Might Apply

DR Is Taking A Very Long Time - Too Long

Please see:

Missing accounts.xml File

Please see:

Multi-server DR / Restore Documentation

Please see:

ZCS to ZCS Online Sync For DR

Please see:

Please add your votes to this RFE!

My Disaster Recovery Failed

Main Reference

Please see: Network_Edition_Disaster_Recovery#Restoring_to_the_new_server

What To Do Or Check

Time Issues

First, make sure your TIME is set right! See Time_Zones_in_ZCS#The_server_OS

Authentication Password Issues

Though I'm still investigating why this is happening for our customers, the root issues seem to be resolved by the following. Some of the auth errors will be logged to /var/log/zimbra.log and /opt/zimbra/log/mailbox.log.

Put the output of this command in a text file:

zmlocalconfig -s | grep password

These should match what's in /opt/zimbra/conf/localconfig.xml

Now, compare the passwords with what is in your restore. Substitute the path of your restore-specific directory.

vi /opt/zimbra/backup/sessions/YOUR_DIR/sys/localconfig.xml

In vi, search with /password (or the full variable name) to see what the old passwords are. You'll need to adjust the values below accordingly. Remember to make a backup copy first.

cp /opt/zimbra/conf/localconfig.xml /opt/zimbra/conf/localconfig.xml.DR
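As a concrete illustration, the comparison can be scripted instead of eyeballed in vi. This is a hedged sketch using temporary stand-in files; on a real server you would substitute /opt/zimbra/conf/localconfig.xml and your backup session's sys/localconfig.xml, and the key names shown here are examples only.

```shell
# Sketch: diff the password entries of the live localconfig against the
# backup's copy. Temp files stand in for the real paths.
live=$(mktemp)   # stands in for /opt/zimbra/conf/localconfig.xml
old=$(mktemp)    # stands in for .../sessions/YOUR_DIR/sys/localconfig.xml
cat > "$live" <<'EOF'
<key name="ldap_root_password"><value>new-secret</value></key>
<key name="mysql_root_password"><value>same-secret</value></key>
EOF
cat > "$old" <<'EOF'
<key name="ldap_root_password"><value>old-secret</value></key>
<key name="mysql_root_password"><value>same-secret</value></key>
EOF
# Extract only the password lines, then diff them.
grep password "$live" > "$live.pw"
grep password "$old"  > "$old.pw"
diff "$live.pw" "$old.pw" || echo "password values differ - reconcile before restoring"
```

Any line that shows up in the diff is a password you need to reconcile between the restored data and the current config.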
accounts.xml file - restore error

The mailbox.log or the restore command will output errors such as:

  • Accounts don't actually get restored, though the restore command appears to complete without error.
zimbra@xxxx:~/log$ zmrestore -all -rf
Error occurred: invalid request: invalid account email address: ll
zimbra@xxxx:~/log$ zmrestore -a bdavis@XXXXXXX.com -rf
Error occurred: no such account: Account ID for bdavis@XXXXXXXXX.com not found in backup
  • "no such backup session label" errors
com.zimbra.cs.backup.BackupServiceException: backup full-20080920.050014.406 not found: 
/opt/zimbra/backup/sessions/full-20080920.050014.406 not found ExceptionId:main:1222811246316:7615eea74222489d

If you're missing the /opt/zimbra/backup/accounts.xml file, grab it from your saved backup directory or from the backup session you're using (/opt/zimbra/backup/sessions/full-XXXXXX/sessions.xml). If you need or want to change the referenced session label in the accounts.xml file, just vi it and use a global search and replace (:%s/search_string/replacement_string/g).
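The vi global search-and-replace can also be done non-interactively with sed. A minimal sketch against a throwaway copy of the file; the session labels and file contents here are illustrative, not your actual ones.

```shell
# Sketch: rewrite the session label referenced in accounts.xml without
# opening vi. Runs on a throwaway temp file; labels are made up.
f=$(mktemp)
cat > "$f" <<'EOF'
<account label="full-20080920.050014.406" email="user@example.com"/>
EOF
cp "$f" "$f.bak"   # always keep a backup copy first
# Equivalent of vi's :%s/search_string/replacement_string/g
sed -i 's/full-20080920\.050014\.406/full-20081001.040000.123/g' "$f"
grep label "$f"
```

Note that sed -i edits in place (GNU sed, as found on the Linux distributions Zimbra supports), which is why the backup copy is taken first.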

Errors about mysql or ldap not running

You will most likely encounter this if you had to adjust some items because the restore command was not working. Run, or confirm you have already run, the following:

su - zimbra
zmconvertctl stop
mysql.server start
ldap start
zmconvertctl start

Then try the restore command again.




How Do I Make Sure I Don't Lose Emails During Upgrade? If I Need To Fail Back To Old Version?



Actual Upgrade Options Notes Homepage

Please see: Ajcody-Notes-Upgrade-Options

Warning - Must Read!

These steps are the start of my notes/ideas. They have not been peer-reviewed or gone through any QA process.

Update: I'll need to update some of the steps below regarding MTA errors that occur until the MTA can communicate with the mailstores. I went through one dry run (July 23, 2008) with a customer and will incorporate changes soon. See Ajcody-Notes-Of-Customer-Cluster-Upgrade and the follow-up bug I filed:

Bug filed for Documentation and QA issues:

Normal Order Of Upgrades

Update the servers in the following order:

All LDAP servers, MTAs, and dedicated Proxy servers must be updated in the same downtime window.

Update in this order.

  1. LDAP Master server
  2. LDAP replica servers. For upgrade details see LDAP_Replicas_4.5.x_to_5.0.x
  3. Zimbra MTA servers
  4. Dedicated Proxy servers
  5. Next upgrade the mailbox servers.
    • The first mailbox server to upgrade is the server where the Documents wiki account was created. If you do not upgrade and get this server working first, you will need to manually install the Documents templates on each of the other servers.
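The ordering above can be captured in a simple wrapper skeleton so each server class is handled in sequence. This is illustrative only: the loop just echoes the roles; a real script would connect to each host and run the installer there, and the role list assumes your topology matches the one described.

```shell
# Illustrative skeleton of the upgrade order; echoes instead of upgrading.
order=$(
  for role in \
    "LDAP master" \
    "LDAP replicas" \
    "MTA servers" \
    "Dedicated proxy servers" \
    "Mailbox servers (Documents wiki host first)"
  do
    echo "upgrade: $role"
  done
)
echo "$order"
```

The point of the skeleton is simply that the LDAP master always comes first and the mailbox servers always come last.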

Starting Resources

Upgrade Steps for Multi-Servers (Non-Cluster)

  • Please review ALL documentation and make sure this process fits and makes sense for your environment. This is untested at this time.
  • Start or have a very recent full backup.
  • Incremental Backups
    • If you have a short time frame for your incremental backups and you've confirmed the scheduled one ran the night before:
      • You can proceed to Step 4
    • If you have a large time frame for your incremental backups and you've confirmed the scheduled one ran the night before:
      • Figure out a time to start another incremental mid-day that will complete shortly before your scheduled downtime.
        • This is just to minimize the amount of time the MTAs are offline.
      • Proceed to Step 4
  • Start of scheduled downtime
  • Stop MTAs from delivering to the mailstore
    • There are a number of ways this can be done
      • Simply shut down the Zimbra MTA server/services
        • We are relying on other MTAs to queue up your mail until your MTAs can be reached again. This grace period is usually fairly long (hours/days) before MTAs will actually bounce back messages as undeliverable.
      • Use a non-zimbra device or server that can hold message queue, configure it to hold queue and not forward to zimbra MTAs
      • Other ideas?
        • Alter DNS/MX/Firewall to direct to a holding machine ?
    • Do a sanity check on the disk space needed to queue up the expected messages over the duration of the upgrade.
      • Monitor space and be prepared to shut the MTAs back down again if disk space becomes a problem
  • Do your last incremental backup of mailstores
  • Reference the steps in the multi-server upgrade docs for the technical details below
    • These will recommend shutting down ALL Zimbra servers
    • The release notes don't mention a particular order, but following the "Rolling Upgrade" wiki, this order makes sense
      • Upgrade LDAP Master server first
      • Upgrade LDAP replica servers second
      • Upgrade Zimbra MTA servers third
      • Upgrade dedicated Proxy servers fourth
        • In this case, the mailstores are assumed to be down already, since they are the last to be upgraded.
      • Upgrade the Mailstore servers last
        • The first mailbox server to upgrade is the server where the Documents wiki account was created. If you do not upgrade and get this server working first, you will need to manually install the Documents templates on each of the other servers.
  • Follow up with the items listed in the official documentation about post-upgrade steps.
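For the disk-space sanity check called out above, a small watch script can warn before the queue partition fills. A hedged sketch: /tmp stands in for the real queue path (on a Zimbra MTA the queue lives under /opt/zimbra, but verify the exact path on your system), and the threshold value is arbitrary.

```shell
# Sketch: report free space on the partition holding the mail queue.
QUEUE_DIR=/tmp           # stand-in; point this at your MTA queue directory
MIN_FREE_KB=1024         # example threshold, tune to expected queue growth
# df -P gives one stable line per filesystem; column 4 is available KB.
free_kb=$(df -Pk "$QUEUE_DIR" | awk 'NR==2 {print $4}')
if [ "$free_kb" -lt "$MIN_FREE_KB" ]; then
  echo "LOW SPACE: only ${free_kb} KB free under $QUEUE_DIR"
else
  echo "OK: ${free_kb} KB free under $QUEUE_DIR"
fi
```

Run it from cron (or a watch loop) during the upgrade window so you get warned in time to shut the queueing MTAs back down.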

Upgrade Steps for Multi-Servers with Clustered Mailstores

  • Please review ALL documentation and make sure this process fits and makes sense for your environment. This is untested at this time.
  • Start or have a very recent full backup.
  • Incremental Backups
    • If you have a short time frame for your incremental backups and you've confirmed the scheduled one ran the night before:
      • You can proceed to Step 4
    • If you have a large time frame for your incremental backups and you've confirmed the scheduled one ran the night before:
      • Figure out a time to start another incremental mid-day that will complete shortly before your scheduled downtime.
        • This is just to minimize the amount of time the MTAs are offline.
      • Proceed to Step 4
  • Start of scheduled downtime
  • Stop MTAs from delivering to the mailstore
    • There are a number of ways this can be done
      • Simply shut down the Zimbra MTA server/services
        • We are relying on other MTAs to queue up your mail until your MTAs can be reached again. This grace period is usually fairly long (hours/days) before MTAs will actually bounce back messages as undeliverable.
      • Use a non-Zimbra device or server that can hold message queue, configure it to hold queue and not forward to Zimbra MTAs
      • Other ideas?
        • Alter DNS/MX/Firewall to direct to a holding machine?
    • Do a sanity check on the disk space needed to queue up the expected messages over the duration of the upgrade.
      • Monitor space and be prepared to shut the MTAs back down again if disk space becomes a problem
  • Do your last incremental backup of mailstores
  • Reference the steps in the multi-server and cluster upgrade docs for the technical details below
  • Upgrade LDAP Master server first
  • Upgrade LDAP replica servers second
  • Upgrade Zimbra MTA servers third
  • Upgrade any dedicated Proxy servers fourth
    • Choose the installer option to not start the servers after configuration finishes. We'll want them to stay down.
      • Confirm they are still down
  • Upgrade the "clustered" mailstore servers last
    • Follow the steps as outlined in the cluster (single/multi-server) or multi-server upgrade docs
      • The summary below is too brief to use on its own; you MUST review the cluster upgrade documentation.
      • Please note: "The first mailbox server to upgrade is the server where the Documents wiki account was created. If you do not upgrade and get this server working first, you will need to manually install the Documents templates on each of the other servers." Account for this in case there are multiple clustered mailstores.
        • Active Nodes are upgraded first
          • The first one will effectively bring down the mailstore cluster - meaning it can't accept any mail deliveries
        • Standby Nodes are upgraded second
        • When all are upgraded, you'll re-enable the cluster
    • Once the upgrade has started (where the cluster is brought down with the first upgrade), the "mailstore" should be offline and the MTAs should now be able to queue up your messages
      • Start up your MTAs (or undo whatever method you used to delay delivery to the Zimbra MTAs)
  • Follow up with items listed in the official documentation about post upgrade steps.

Notes I Took While Doing These Steps With A Customer

Please see Ajcody-Notes-Of-Customer-Cluster-Upgrade

Rolling Upgrade Outline

  • Please review ALL documentation and make sure this process fits and makes sense for your environment. This is untested at this time.
  • Start or have a very recent full backup.
  • Incremental Backups
    • If you have a short time frame for your incremental backups and you've confirmed the scheduled one ran the night before:
      • You can proceed to Step 4
    • If you have a large time frame for your incremental backups and you've confirmed the scheduled one ran the night before:
      • Figure out a time to start another incremental mid-day that will complete shortly before your scheduled downtime.
        • This is just to minimize the amount of time the MTAs are offline.
      • Proceed to Step 4
  • Start of scheduled downtime
  • Stop MTAs from delivering to the mailstore
    • There are a number of ways this can be done
      • Simply shut down the Zimbra MTA server/services
        • We are relying on other MTAs to queue up your mail until your MTAs can be reached again. This grace period is usually fairly long (hours/days) before MTAs will actually bounce back messages as undeliverable.
      • Use a non-Zimbra device or server that can hold message queue, configure it to hold queue and not forward to Zimbra MTAs
      • Other ideas?
        • Alter DNS/MX/Firewall to direct to a holding machine?
      • Do a sanity check on the disk space needed to queue up the expected messages over the duration of the upgrade.
        • Monitor space and be prepared to shut the MTAs back down again if disk space becomes a problem
  • Do your last incremental backup of mailstores
  • Reference the steps in the rolling upgrade wiki and the multi-server and cluster upgrade docs for the technical details below
  • Start of First Group Of Upgrades
    • From Rolling_Upgrades_for_ZCS: "All LDAP servers and MTA servers must be updated in the same downtime window." I'm including proxy servers below as well.
    • Update in this order
      • Upgrade LDAP Master server first
      • Upgrade LDAP replica servers second
      • Upgrade Zimbra MTA servers third
      • Upgrade all dedicated Proxy servers fourth
    • Allow messages to be delivered to mailstores (still running old version) once things look right.
  • Start of Second Group of Upgrades (Mailstores)
    • Repeat the steps above about backups, stopping delivery from the MTAs to the mailstores, and then the final incremental backup.
      • Do a sanity check on the disk space needed to queue up the expected messages over the duration of the upgrade.
        • Monitor space and be prepared to shut the MTAs back down again if disk space becomes a problem
    • Upgrade your mailstores based on your needs/configuration
      • The first mailbox server to upgrade is the server where the Documents wiki account was created. If you do not upgrade and get this server working first, you will need to manually install the Documents templates on each of the other servers.
    • Once the upgrade is done and things look right, allow delivery of mail from the MTAs to the mailstores.
  • Follow up with the items listed in the official documentation about post-upgrade steps.
