Ajcody-Notes-ServerPlanning
What About Backups? I Need A Plan
Actual Backup Plans Homepage
Please see Ajcody-Notes-BackupPlans
Initial Comments
These are just my random thoughts. They have not been peer-reviewed yet.
What Might Be Wrong?
The first thing to do is check the log file and see whether anything notable stands out.
grep -i backup /opt/zimbra/log/mailbox.log
Use Auto-Group Backups Rather Than Default Style Backup
Please see the administrative manual page on this:
http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html
Compare the sections called "Standard Backup Method" & "Auto-Grouped Backup Method"
http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html#1080744
Configuration details are here on that page:
http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html#1084777
Make sure your backup schedule was adjusted for this change (zmschedulebackup, which basically writes to zimbra's crontab). The URL above mentions this and gives an example.
More details about the cron schedule item here for reference:
- Run zmschedulebackup without arguments to see your current schedule.
- Here is a schedule without auto-grouped backups enabled:
- Current Schedule:
- f 0 1 * * 6 -a all
- i 0 1 * * 0-5 -a all
- d 1m 0 0 * * *
- Here is a schedule with auto-grouped enabled:
- Current Schedule:
- f 0 23 * * * -z
- d 7d 0 21 * * *
- Run zmschedulebackup --help to see a list of options.
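To make the schedule listings above easier to read, here is a minimal sketch that labels each line by its leading letter. It assumes the output format shown above (f = full, i = incremental, d = delete old sessions); the function name describe_schedule is made up for this example and is not part of Zimbra.

```shell
#!/bin/sh
# Label each line of zmschedulebackup-style output by its leading type letter.
# Assumption: f = full, i = incremental, d = delete sessions older than the given age.
describe_schedule() {
    while read -r kind rest; do
        case "$kind" in
            f) printf '%s\n' "full backup: $rest" ;;
            i) printf '%s\n' "incremental backup: $rest" ;;
            d) printf '%s\n' "delete sessions older than: $rest" ;;
            *) ;;  # skip headers such as "Current Schedule:" and blank lines
        esac
    done
}

# Sample input copied from the non-auto-grouped schedule above.
describe_schedule <<'EOF'
Current Schedule:
f 0 1 * * 6 -a all
i 0 1 * * 0-5 -a all
d 1m 0 0 * * *
EOF
```

On a live server you would pipe the real output in instead, for example zmschedulebackup | describe_schedule.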
Two bugs to look at as well:
- cannot backup using admin ui in autogrouped mode
- Expose autogroup backup configuration to admin UI
Some Variables For Auto-Group
The list below might not be complete, and the values might not be the defaults. I just wanted to save them before I forget. I'll try to get more complete details on these later.
zimbraBackupAutoGroupedInterval: 1d
zimbraBackupAutoGroupedNumGroups: 7
zimbraBackupAutoGroupedThrottled: FALSE
zimbraBackupMode: Auto-Grouped
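As a rough sketch of how these values might be applied, the snippet below prints (it does not run) one zmprov command per variable. It assumes the attributes are set in global config with zmprov mcf, which is how I remember the administration guide doing it. Verify against the guide for your version before running anything. The function name print_autogroup_commands is only for this example.

```shell
#!/bin/sh
# Print, without executing, the zmprov commands for the auto-group values listed above.
# Assumption: these attributes live in global config, hence "zmprov mcf".
print_autogroup_commands() {
    while read -r attr value; do
        [ -n "$attr" ] || continue
        printf 'zmprov mcf %s %s\n' "$attr" "$value"
    done <<'EOF'
zimbraBackupMode Auto-Grouped
zimbraBackupAutoGroupedInterval 1d
zimbraBackupAutoGroupedNumGroups 7
zimbraBackupAutoGroupedThrottled FALSE
EOF
}

print_autogroup_commands
```

Review the printed commands, then run them as the zimbra user if they match what you want. Something like zmprov gacf | grep -i backup should show the current values for comparison.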
Need To Write Fewer Files - Add The Zip Option To Your Backup Commands
There are very few details on this option, unfortunately.
From the administrative manual on the Backup section:
http://www.zimbra.com/docs/ne/latest/administration_guide/10_Backup_Restore.15.1.html
It says,
- "-zip can be added to the command line to zip the message files during backup. Zipping these can save backup storage space."
It's implied that, instead of keeping all the individual message files in the backup, it will bunch them together into zip files. This is useful when the number of message files is causing disk I/O issues. Maybe you're trying to rsync the backup session directories off to another server, or you're running a third-party backup on them to save to tape. By default -zip uses compression. If that also causes overhead you need to avoid, you can use the -zipStore option.
Note about -zipStore:
- "when used with the -zip option, it allows the backup to write fewer files (-zip), but not incur the compression overhead as well"
How To Use As A Default Option?
You'll add the options you want to the zimbra crontab file for the different backup schedules you're running.
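Here is a small sketch of that crontab change done as a text filter, so you can preview the result before touching the real crontab. The crontab lines in the sample are an assumed layout based on the schedules shown earlier, not copied from a live system. The option spelling -zip is taken from the manual quote above, so check the zmbackup help on your version. The function name add_backup_option is made up for this example.

```shell
#!/bin/sh
# Append an option to the full (-f) and incremental (-i) zmbackup lines read
# from stdin, leaving other lines (such as the delete job) untouched.
# Usage: add_backup_option OPTION < crontab-text
add_backup_option() {
    opt=$1
    while IFS= read -r line; do
        case "$line" in
            *" $opt"|*" $opt "*)
                printf '%s\n' "$line" ;;             # option already present
            *"zmbackup -f"*|*"zmbackup -i"*)
                printf '%s %s\n' "$line" "$opt" ;;   # backup job: add the option
            *)
                printf '%s\n' "$line" ;;             # anything else: pass through
        esac
    done
}

# Assumed sample crontab entries, for illustration only.
add_backup_option -zip <<'EOF'
0 1 * * 6 /opt/zimbra/bin/zmbackup -f -a all
0 1 * * 0-5 /opt/zimbra/bin/zmbackup -i -a all
0 0 * * * /opt/zimbra/bin/zmbackup -del 1m
EOF
```

You could feed it crontab -l as the zimbra user and compare the output against the original before installing it. Keep in mind that zmschedulebackup may rewrite these lines the next time the schedule is changed.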
What Does It Look Like When I Use Zip?
Shared blobs are zipped and blobs (messages) are zipped per root store directory.
mail:~/backup/sessions/full-20080820.160003.770/accounts/115/988/11598896-a89b-4b9d-bedb-1ed1afcb6c87/blobs zimbra$ ls
blobs-1.zip blobs-2.zip blobs-3.zip blobs-4.zip
Local Disk Configuration
I would have 3 disk subsystems.
1. For OS / - this could be whatever - local mirrored SATA/SCSI drives
A. Make sure /tmp has enough space for operations
2. /opt/zimbra/backup - I would make sure the disk I/O is separate from /opt/zimbra . This way you minimize performance hits to your end-users. Review that the disk I/O bus path to the CPU/memory is as clean as possible. Motherboard specs should tell you what "slots" are on shared buses. Make sure you're maximizing your RAID/SAN card's performance on the bus path to the CPU.
A. I would make this 4 times the /opt/zimbra space that you're spec'ing. This would allow for a couple of weeks of fulls and the daily diffs.
B. Remember this when you consider your disk chassis infrastructure.
C. As an example, for /opt/zimbra I might go with a 16 disk enclosure that uses very fast & low latency disks, which also means "smaller" disks: 16x73GB FB 10000 RPM. And then fill another 16 disk enclosure with price/performance disks for /opt/zimbra/backup: 16xSATA @ 250GB and 10,000 or 7200 RPM (latency is different). [This is more of a price point question]
3. /opt/zimbra - same as #2 considerations
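The sizing rule of thumb for the backup volume is simple arithmetic, sketched here. The 500 GB figure is a made-up example, and the function name backup_volume_gb is only for illustration.

```shell
#!/bin/sh
# Suggested backup volume size: a multiple (default 4) of the /opt/zimbra size.
# Arguments: size of /opt/zimbra in GB, optional multiple.
backup_volume_gb() {
    store_gb=$1
    multiple=${2:-4}
    echo $(( store_gb * multiple ))
}

# Hypothetical 500 GB /opt/zimbra volume.
echo "backup volume: $(backup_volume_gb 500) GB"
```

Adjust the multiple if your retention period or your mix of full and incremental backups differs from the "couple of weeks" assumed in these notes.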
Network
A. I would be on Gb and have one dedicated port for moving data off of the system and another port (channel-bonded or not) dedicated for production (end-user) use. I would rsync over the backup network port (as shown on wiki - http://wiki.zimbra.com/index.php?title=Ajcody-Notes-ServerMove ).
I would actually get two separate dual-port cards and place each in the best "bus" for performance [motherboard spec situation here].
For one card, I would channel-bond the ports for /opt/zimbra/backup. This requires you to map out the network path and switches to the end host the data is being moved to. You'll also need to confirm that the network path actually gives you a gain in performance from the channel bonding.
For the other card, I would set up IP takeover for the production port, in case of port failure. Your customers most likely will not saturate this port, but if they do, you can also channel-bond it later.
Hot-spare server
Set up along the same lines, though you could cut out some of the HA/performance items if you only see this box being used for "short-term" use. Rsyncs will occur over the backup network port.
A. Set up the hot-spare according to http://wiki.zimbra.com/index.php?title=Ajcody-Notes-ServerMove
Basically install the zimbra packages. You'll want to repeat this whenever production is upgraded, so the hot-spare keeps pace as the system evolves.
B. I would do an initial rsync of /opt/zimbra (remember to use nice so you don't affect prod too much)
C. I would then set up 2 daily rsync jobs (following the same wiki instructions)
1. rsync /opt/zimbra/backup
This could be integrated into the backup cron job so it kicks off after the backup is done. You'll need to monitor how long the backups and the syncs take, so you can make sure everything fits within the rotation window - backup, rsync, next backup. Times will differ between diff and full backups.
2. rsync :
/opt/zimbra/conf /opt/zimbra/redolog /opt/zimbra/log
This will give some "sanity" in case issues arise with the full restore. This part could use some better feedback from more experienced Zimbra staff. I can think of some circumstances where this effort would prove useful.
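The rotation window mentioned above (backup, rsync, next backup) can be checked with a quick calculation, sketched below alongside a helper that prints a low-priority rsync command. The durations and the host name spare-host are placeholders, and the rsync flags are a plain guess at a reasonable starting point rather than what the ServerMove page prescribes. Both function names are made up for this example.

```shell
#!/bin/sh
# Succeed when a backup plus the rsync that follows it finishes before the
# next backup starts. All three arguments are durations in minutes.
fits_window() {
    backup_min=$1
    rsync_min=$2
    interval_min=$3
    [ $(( backup_min + rsync_min )) -lt "$interval_min" ]
}

# Print (do not run) a niced rsync of the backup directory to the hot-spare.
# Argument: destination host (placeholder name in the example below).
print_backup_rsync() {
    printf 'nice -n 19 rsync -a /opt/zimbra/backup/ %s:/opt/zimbra/backup/\n' "$1"
}

# Hypothetical numbers: 5 hour full backup, 3 hour rsync, backups every 24 hours.
if fits_window 300 180 1440; then
    echo "ok: backup and rsync fit before the next backup"
else
    echo "warning: rsync would still be running when the next backup starts"
fi

print_backup_rsync spare-host
```

Feed it the times you actually measure, and check the full-backup day separately from the diff days since those take longer.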
To Tape?
I would then use the non-rsync network ports for your traditional network backup software to dump the data to tape. This way that activity doesn't affect prod performance at all. Any full DR would use the backup/ data anyway (offsite DR).
It All Blew Up, Now What?
References:
http://wiki.zimbra.com/index.php?title=Ajcody-Notes-ServerMove
http://wiki.zimbra.com/index.php?title=Network_Edition_Disaster_Recovery