Ajcody-Notes-ServerPlanning
What About Backups? I Need A Plan
Actual Backup Plans Homepage
Please see Ajcody-Notes-BackupPlans
Initial Comments
These are just my random thoughts. They have not been peer-reviewed yet.
What Might Be Wrong?
The first thing to check is the log file, to see whether anything notable stands out.
grep -i backup /opt/zimbra/log/mailbox.log
Use Auto-Group Backups Rather Than Default Style Backup
Please see the administrative manual page on this:
http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html
Compare the sections called "Standard Backup Method" & "Auto-Grouped Backup Method"
http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html#1080744
Configuration details are here on that page:
http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html#1084777
Make sure your backup schedule was adjusted for this change (see zmschedulebackup, which basically writes to the zimbra user's crontab).
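A quick way to double-check the schedule is to compare what zmschedulebackup reports with what is actually in the crontab. The sketch below is only that, a sketch: list_backup_cron_lines is a helper name I made up, and it simply filters crontab text for lines that call zmbackup.

```shell
# Hypothetical helper (not part of Zimbra): print only the active crontab
# lines that invoke zmbackup. It reads crontab text on stdin, so it can be
# tried against a saved copy of the crontab first.
list_backup_cron_lines() {
    grep 'zmbackup' | grep -v '^[[:space:]]*#'
}

# Usage as the zimbra user, left commented out so nothing runs by accident:
#   zmschedulebackup -q                    # show the current backup schedule
#   crontab -l | list_backup_cron_lines    # cross-check against the crontab
```

If the two views disagree after switching to auto-grouped mode, the schedule probably was not regenerated.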
Two bugs to look at as well:
- cannot backup using admin ui in autogrouped mode
- Expose autogroup backup configuration to admin UI
Add The Zip Option To Your Backup Commands
There is very little detail on this option, unfortunately.
From the administrative manual on the Backup section:
http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html
It says,
- "-zip can be added to the command line to zip the message files during backup. Zipping these can save backup storage space."
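As a rough illustration, here is how the option might be added to a full backup of all accounts. Treat the flag spelling as an assumption: the manual text writes it as "-zip", while I believe the command-line long form is --zip, so check `zmbackup --help` on your release before putting it in cron. full_backup_cmd and ZIP_FLAG are names made up for this sketch, and the helper only prints the command line rather than running it.

```shell
# Assumption: the long form of the option is --zip. Override ZIP_FLAG if
# `zmbackup --help` on your release shows a different spelling.
ZIP_FLAG=${ZIP_FLAG:---zip}

# Hypothetical helper: print the full-backup command line with the zip
# option appended. Nothing is executed here.
full_backup_cmd() {
    echo "/opt/zimbra/bin/zmbackup -f -a all $ZIP_FLAG"
}

full_backup_cmd
```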
Local Disk Configuration
I would have 3 disk subsystems.
1. For OS / - this could be whatever - local mirrored SATA/SCSI drives
A. Make sure /tmp has enough space for operations
2. /opt/zimbra/backup - I would make sure the disk I/O is separate from /opt/zimbra. This way you minimize performance hits to your end-users. Review the disk I/O path and make sure the bus is as clean as possible to the CPU/memory. Motherboard specs should tell you which "slots" are on shared buses. Make sure you are maximizing your RAID/SAN cards' performance along the bus path to the CPU.
A. I would make this 4x the /opt/zimbra space that you're spec'ing. This would allow for a couple of weeks of fulls and the daily diffs.
B. Remember this when you consider your disk chassis infrastructure.
C. As an example, for /opt/zimbra I might go with a 16-disk enclosure that uses very fast, low-latency disks, which also means "smaller" disks: 16x73GB FB 10,000 RPM. Then fill another 16-disk enclosure with price/performance disks for /opt/zimbra/backup: 16 SATA disks @ 250GB at 10,000 or 7,200 RPM (latency is different). [This is more of a price point question]
3. /opt/zimbra - same as #2 considerations
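The 4x sizing rule for the backup volume is simple arithmetic, but it is worth running against your own numbers. A minimal sketch, where backup_volume_gb is a made-up helper name and the figures are raw capacity from the 16x73GB example (before any RAID overhead):

```shell
# Hypothetical helper: size the backup volume as 4x the mail store,
# following the rule of thumb in these notes.
backup_volume_gb() {
    echo $(( $1 * 4 ))
}

# Raw capacity of the 16 x 73 GB enclosure used as the example above.
store_gb=$(( 16 * 73 ))
echo "store: ${store_gb} GB raw, backup target: $(backup_volume_gb "$store_gb") GB"
# prints: store: 1168 GB raw, backup target: 4672 GB
```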
Network
A. I would be on Gb and have one dedicated port for moving data off of the system and another port (channel-bonded or not) dedicated to production (end-user) use. I would rsync over the backup network port (as shown on the wiki - http://wiki.zimbra.com/index.php?title=Ajcody-Notes-ServerMove ). I would actually get two separate dual-port cards and place each in the best "bus" for performance [motherboard spec situation here]. For one card, I would channel-bond the ports for /opt/zimbra/backup traffic. This requires you to map out the network path and switches to the end host the data is being moved to. You'll also need to confirm that the network path actually gives you a performance gain from the channel bonding. For the other card, I would set up IP takeover for the production port, in case of port failure. Your customers most likely will not saturate this port, but if they do, you can also channel-bond it later.
Hot-spare server
Set it up along the same lines, though you could cut out some of the HA/performance items if you only see this box being used "short-term". Rsyncs will occur over the backup network port.
A. Set up the hot-spare according to http://wiki.zimbra.com/index.php?title=Ajcody-Notes-ServerMove
Basically, install the Zimbra packages. You'll want to keep doing this as production is upgraded and the system evolves.
B. I would do an initial rsync of /opt/zimbra (remember to run it under nice so it doesn't affect prod too much)
C. I would then set up 2 daily rsync jobs (following the same wiki instructions)
1. rsync /opt/zimbra/backup
This could be integrated into the backup cron job so it kicks off after the backup is done. You'll need to monitor how long the backups and the syncs take so you can make sure everything fits within the rotation window - backup, rsync, next backup. Times will differ between diff and full backups.
2. rsync :
/opt/zimbra/conf /opt/zimbra/redolog /opt/zimbra/log
This will give some "sanity" in case issues arise with the full restore. This part could use some feedback from more experienced Zimbra staff. I can think of some circumstances where this effort would prove useful.
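The two rsync jobs and the rotation-window check could be sketched roughly like this. Everything specific here is an assumption on my part, not something from the ServerMove page: the hostname in SPARE is a placeholder for the hot-spare's address on the backup network, the rsync flags are just a common choice, WINDOW_SECS assumes one backup per day, and sync_dir / fits_window are made-up helper names. By default the script only prints the commands (DRY_RUN=1).

```shell
# Assumptions: SPARE is a placeholder hostname on the backup network, and
# WINDOW_SECS is the gap between two scheduled backups (one day here).
SPARE=${SPARE:-spare-backup.example.com}
WINDOW_SECS=${WINDOW_SECS:-86400}

# Print (DRY_RUN=1, the default) or run a low-priority rsync of one directory
# to the same path on the spare.
sync_dir() {
    dir=$1
    set -- nice -n 19 rsync -aH "$dir/" "$SPARE:$dir/"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Succeed only if backup time plus sync time (in seconds) fits in the window.
fits_window() {
    [ $(( $1 + $2 )) -lt "$WINDOW_SECS" ]
}

# Job 1: the backup directory, meant to run right after the backup finishes.
sync_dir /opt/zimbra/backup

# Job 2: configuration, redo logs and logs.
for d in /opt/zimbra/conf /opt/zimbra/redolog /opt/zimbra/log; do
    sync_dir "$d"
done

# Example check: a 3 hour backup plus a 2 hour sync against the window.
if fits_window 10800 7200; then
    echo "backup + rsync fit inside the window"
else
    echo "backup + rsync overrun the window"
fi
```

In practice the two durations would come from timing the real backup and sync runs, and full backups will need a separate measurement from the diffs.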
To Tape?
I would then use the non-rsync network ports for your traditional network backup software to dump the data to tape. This way that activity doesn't affect prod performance at all. Any full DR would use the backup/ data anyway (offsite DR).
It All Blew Up, Now What?
References:
http://wiki.zimbra.com/index.php?title=Ajcody-Notes-ServerMove
http://wiki.zimbra.com/index.php?title=Network_Edition_Disaster_Recovery