Difference between revisions of "Troubleshooting Course Content Rough Drafts-Recover Missing Data - User"

m (Backup Directory Partition & Filesystem)
m (Server Restores - Disaster Recoveries)
Line 128: Line 128:
====Server Restores - Disaster Recoveries====
====Server Restores - Disaster Recoveries====
'''Need to create summary statement'''
* Upgrades And Compatibility Of Older Backups
** "Backups must be compatible across patch releases"
*** http://bugzilla.zimbra.com/show_bug.cgi?id=16239
** "Incorrect upgrade documentation regarding backups"
*** http://bugzilla.zimbra.com/show_bug.cgi?id=30218
** "support for restore across major versions"
*** http://bugzilla.zimbra.com/show_bug.cgi?id=15750
** "Add conversion tool to upgrade backup versions to allow restore on later zcs versions"
*** http://bugzilla.zimbra.com/show_bug.cgi?id=14002
* From 5.0.7 - 5.0.10 You Might See minor version upgrades moving your backups into a subdirectory
** "upgrade incorrectly invalidates backups."
*** http://bugzilla.zimbra.com/show_bug.cgi?id=304
====Server Cloning - Server Moves/Testing Purposes====
====Server Cloning - Server Moves/Testing Purposes====

Revision as of 16:47, 28 January 2015

Need To Do

  • review training materials for sys admin course
  • determine what should be 'video' and how it can all be built upon preceding steps in a hands on training manner

General Overview Of Backup Environment

Backup Directory Partition & Filesystem

dedicated backup partition - only use supported methods:

NFS Use For Backups

Please see this RFE:

This is the proposed statement to be included in the release notes following the RFE:

Zimbra will support customers that store backups (e.g. /opt/zimbra/backup) on an NFS-mounted partition. Please note that this does not relieve the customer from the responsibility of providing a storage system with a performance level appropriate to their desired backup and restore times. In our experience, network-based storage access is more likely to encounter latency or disconnects than is equivalent storage attached by fiber channel or direct SCSI.
Zimbra continues to view NFS storage of other parts of the system as unsupported. Our testing has shown poor read and write performance for small files over NFS implementations, and as such we view it unlikely that this policy will change for the index and database stores. We will continue to evaluate support for NFS for the message store as customer demand warrants.
When working with Zimbra Support on related issues, the customer must please disclose that the backup storage used is NFS.
Things To Check When Using NFS
  • Check the /var/log/messages on both the zimbra server and the nfs server for nfs related errors during the time frame of your backup.
  • Check /opt/zimbra/log/mailbox.log for error messages about folders/files not being able to be written or missing directory errors.
  • Is root_squash configured on the nfs server? If it's changed to no_root_squash , does the behavior of the backup change?
  • Is the */backup directory owned by zimbra:zimbra with at least 750 or 755 permissions?
    • This parent directory as given in:
      • zmprov gs `zmhostname` zimbraBackupTarget
  • Does zimbraBackupTarget have at least the subdirectories of : sessions and tmp  : and are owned by zimbra:zimbra with 750 or 755 permissions?
    • If not, try manually creating them and then running a test backup.
    • If your backup session directory shows something like : drwxrwx---+ 2 zimbra zimbra 4096 Sep 14 00:00 TO_DELETE-full-XXXXXX , that + sign indicated extended acls are in use.
Debugging Example When Using NFS

Steps I wrote for one customer, where saving out the information as you walk through all the commands would give enough information [hopefully] to submit a good rfe/bug:

 1. make a test partition on nfs server - /nfs-test

 2. mount on zimbra server
 2A. mkdir /nfs-test
 2B. chmod 755 /nfs-test
 2C. mount nfs-server:/nfs-test /nfs-test
 2D. ls -la /nfs-test
 2E. mkdir /nfs-test/backup
 2F. chown zimbra:zimbra /nfs-test/backup
 2G. chmod 755 /nfs-test/backup
 2H. su - zimbra ; touch /nfs-test/backup/testfile 
 2I. ls -laR /nfs-test/
 2J. rm /nfs-test/backup/testfile

 3. Set zimbraBackupTarget
 3A. zmprov ms `zmhostname` zimbraBackupTarget /nfs-test/backup

 4. Run a full backup against one account
 4A. ex. zmbackup -f -a user@domain.com

 5. ls -laR /nfs-test/

 6. If you again, run into the same problem. You could also repeat the backup after increasing the backup 
    logging variable for the account your trying to backup. If you didn't run into the same problem, it might 
    had to do with the initial setup of the nfs mount and permissions being used during the directory creation.
 6A. zmprov aal user@domain.com zimbra.backup debug
 6B. logging will show up in /opt/zimbra/log/mailbox.log
 6C. Remove account logging when your done.  zmprov ral user@domain.com zimbra.backup

 8. Change zimbraBackupTarget back to your production path.
Setup A Fast Test NOT Using NFS

A way to "avoid" the NFS issues for testing purposes would be to setup a new zimbraBackupTarget to try doing a full backup of a couple of user accounts. I DON'T recommend this if your using auto-group for zimbraBackupMode [ zmprov gs `zmhostname` zimbraBackupMode ] , only if your using Standard mode.

[as root]
** adjust your new "backup" directory path as needed - mine is just an example**
mkdir /mnt/usb1/backup-test
chown zimbra:zimbra /mnt/usb1/backup-test
chmod 750 /mnt/usb1/backup-test
su - zimbra
**confirm backup mode as standard**
zmprov gs `zmhostname` zimbraBackupMode
**if not stand, please stop**
zmprov ms `zmhostname` zimbraBackupTarget /mnt/usb1/backup-test
**Find a couple of test accounts to do a full for. Make sure you'll have space to do the backup regards to need free space.**
zmbackup -f -a user1@domain.com user2@domain.com user3@domain.com
**Watch and confirm status of backup you just started.**
**Confirm files were backed up in right location**
ls /mnt/usb1/backup-test/sessions/
**Failed backups would most likely results in left over directory in tmp directory**
ls /mnt/usb1/backup-test/tmp/

Standard Backup or Autogroup

Backup Schedule


Backup CLI Commands

zmbackup zmbackupquery

Monitoring And Maintenance Of Backups

monitoring backups and disk usage

Server Issues

Questions To Ask & Address Prior To Doing A Server Restore

Should I shutdown ZCS services or block client access until issue and resolution are identified? What data needs to really be restored - can you restore just that or does it need a full restore?

Server Restores

Server Restores - Disaster Recoveries

Server Cloning - Server Moves/Testing Purposes

User Issues

Questions To Ask & Address Prior To Doing A User Restore

  • Can data be restored from the user's dumpster [if enabled] ?
  • Is there a copy of the data in a third party client ? [POP email client that downloaded it, for example]
  • Does the data maybe exist in another user's account ? [For example, the senders account or others in the To/CC]
  • Does the data reside in a redolog/s ?
  • Finally, restore from backups ?
    • Do I need to overwrite the existing account with the restore or should I restore to a new account name?
      • Using the option -ca [create account] and -pre restore_ [prefix to the new account name, restore_user@ ]
    • What are your retention policies and how it might determine the backup set you need?

User Restores

  • Admin Console
    • Restore/Backup
    • Redirected restore : -ca -pre restored_
    • View Mail > TGZ export/import
  • Crossmailboxsearch
  • CLI
    • zmrestore
      • to latest point in time
      • to specific point in time in past
      • to incremental session label
      • to full backup label
  • Importing Restored Data Into Parent Account

Trouble-shooting Backup/Restore Issues And Other General Questions

Restore Compatibility Between ZCS Versions



Basic Backup Information To Submit To Support

Disk Space Usage Issues
Trend Data

If there is concerns about disks/partitions getting full, this command would be helpful for trending data on your server. Send support the resulting df.tar file . Note - adjust the tail command if you want more than 20 day's worth of trending data, the -n 20 option.

[zimbra@zcs806 tmp]$ /tmp

[zimbra@zcs806 tmp]$ tar cvf /tmp/df.tar `find /opt/zimbra/zmstat -name df.cs\* | sort | tail -n 20`
tar: Removing leading `/' from member names
 [cut - Ajcody]

[zimbra@zcs806 tmp]$ ls -lah /tmp/df.tar
-rw-r----- 1 zimbra zimbra 80K Mar 29 06:44 /tmp/df.tar

[zimbra@zcs806 tmp]$ tar tvf /tmp/df.tar
-rw-r----- zimbra/zimbra  2566 2014-03-11 00:00 opt/zimbra/zmstat/2014-03-10/df.csv.gz
-rw-r----- zimbra/zimbra  2553 2014-03-12 00:00 opt/zimbra/zmstat/2014-03-11/df.csv.gz
 [cut - Ajcody]
-rw-r----- zimbra/zimbra  2513 2014-03-28 00:00 opt/zimbra/zmstat/2014-03-27/df.csv.gz
-rw-r----- zimbra/zimbra  2531 2014-03-29 00:00 opt/zimbra/zmstat/2014-03-28/df.csv.gz
-rw-r----- zimbra/zimbra  8013 2014-03-29 06:40 opt/zimbra/zmstat/df.csv
Directory Sizes In /opt/zimbra

Please see the following and provide the output to support. Note, even though this method is faster than doing a du it still can take awhile.

* Ajcody-Server-Misc-Topics#Faster_Way_To_Get_Directory_Size_On_Filesytem_-_find_vs_du
Adjusting The Disk Alert Threshold

Note - zmlocalconfig smtp_notify must return yes if you want to receive the notifications.

If you just need to adjust the disk alert threshold, then see the following:

See current values:

 zmlocalconfig | grep zmdisklog

Example adjustment:

 su - zimbra
 zmlocalconfig -e zmdisklog_critical_threshold=98
 zmlocalconfig -e zmdisklog_warn_threshold=95

To exclude a partition from the checks [example of two being excluded]:

 su - zimbra
 zmlocalconfig -e zmstat_df_excludes="/mount/point:/mount/point2"

They might be a bug on this, where you'll keep getting email until a logrotate happens [zimbra.log?].

Some things to do to confirm and share with support or in bug. As zimbra

su - zimbra

ls -la /var/log/zimbra.log

df -h
                      5.5G  3.5G  1.7G  68% /
 tmpfs                 939M     0  939M   0% /dev/shm
 /dev/sda1             485M   79M  381M  18% /boot
 /dev/sdb1              30G  6.2G   23G  22% /opt


zmlocalconfig | grep zmdisklog
 zmdisklog_critical_threshold = 80
 zmdisklog_warn_threshold = 85  

zmlocalconfig -e zmdisklog_critical_threshold=95

zmlocalconfig -e zmdisklog_warn_threshold=90

zmlocalconfig | grep zmdisklog
 zmdisklog_critical_threshold = 95
 zmdisklog_warn_threshold = 90

zmstatctl restart


ps -eaf | grep zmstat-df

ls -la /var/log/zimbra.log

date ; grep "Disk warning" /var/log/zimbra* ; zmmailbox -z -m admin@`zmhostname` s -l 100 -t message "Subject: Disk and after:yesterday"

##Note - Emails by default go out every 10 minutes - for example:

[zimbra@zcs803 ~]$ date ; grep "Disk warning" /var/log/zimbra* ; zmmailbox -z -m admin@`zmhostname` s -l 100 -t message "Subject: Disk and after:yesterday"
Thu May 22 09:40:08 PDT 2014
/var/log/zimbra.log:May 22 08:30:00 zcs803 zimbramon[18826]: 18826:err: Disk warning: zcs803.DOMAIN.com: / on device /dev/mapper/vg_rhel664-lv_root at 82%
/var/log/zimbra.log:May 22 08:40:00 zcs803 zimbramon[22970]: 22970:err: Disk warning: zcs803.DOMAIN.com: / on device /dev/mapper/vg_rhel664-lv_root at 82%
/var/log/zimbra.log:May 22 08:50:00 zcs803 zimbramon[22970]: 22970:err: Disk warning: zcs803.DOMAIN.com: / on device /dev/mapper/vg_rhel664-lv_root at 82%
/var/log/zimbra.log:May 22 09:00:00 zcs803 zimbramon[22970]: 22970:err: Disk warning: zcs803.DOMAIN.com: / on device /dev/mapper/vg_rhel664-lv_root at 82%
  ## Note - I had readjusted the variable to not warn during this time segment ##
/var/log/zimbra.log:May 22 09:20:00 zcs803 zimbramon[8322]: 8322:err: Disk warning: zcs803.DOMAIN.com: / on device /dev/mapper/vg_rhel664-lv_root at 82%
/var/log/zimbra.log:May 22 09:30:00 zcs803 zimbramon[8322]: 8322:err: Disk warning: zcs803.DOMAIN.com: / on device /dev/mapper/vg_rhel664-lv_root at 82%
/var/log/zimbra.log:May 22 09:40:00 zcs803 zimbramon[8322]: 8322:err: Disk warning: zcs803.DOMAIN.com: / on device /dev/mapper/vg_rhel664-lv_root at 82%
num: 7, more: false

     Id  Type   From                  Subject                                             Date
   ----  ----   --------------------  --------------------------------------------------  --------------
1.  328  mess   admin                 Disk / at 82% on zcs803.DOMAIN.com:           05/22/14 09:40
2.  327  mess   admin                 Disk / at 82% on zcs803.DOMAIN.com:           05/22/14 09:30
3.  326  mess   admin                 Disk / at 82% on zcs803.DOMAIN.com:           05/22/14 09:20
  ## Note - I had readjusted the variable to not warn during this time segment ##
4.  325  mess   admin                 Disk / at 82% on zcs803.DOMAIN.com:           05/22/14 09:00
5.  324  mess   admin                 Disk / at 82% on zcs803.DOMAIN.com:           05/22/14 08:50
6.  323  mess   admin                 Disk / at 82% on zcs803.DOMAIN.com:           05/22/14 08:40
7.  320  mess   admin                 Disk / at 82% on zcs803.DOMAIN.com:           05/22/14 08:31

Continue to monitor your zmmailbox search results for an hour.

The Basic Information Support Needs

as root:

  • cat /etc/fstab
    • Shows us what is mounted upon boot
  • cat /proc/mounts
    • Shows us what is currently mounted and its status - you can see if a mount is read-only here.
  • df -hT
    • Lists current mounts using human-readable size information and also notes the filesystem type.

as zimbra:

  • zmprov -l gs `zmhostname` | egrep 'Back|Redo'
    • Will show us a number of variables related to backup and redologs. Also tell us if your using auto-group or the default method.
  • du -sh /opt/zimbra/redolog
    • Will might notice your redolog logs aren't rolling over, causing a possible issue.
  • ls -latr /opt/zimbra/backup
    • This is the default backup target, please adjust this path here and below if you are using a different zimbraBackupTarget value.
    • zmprov gs `zmhostname` zimbraBackupTarget
    • We'll be able to confirm permissions are right.
  • ls -latr /opt/zimbra/backup/tmp
    • This will show us if you have failed backup jobs and confirm tmp is being cleaned appropriately after the backup is done.
  • ls -latr /opt/zimbra/backup/sessions
    • This will show us what backup sessions are available and confirm permissions are correct.
    • Adjust path if your zimbraBackupTarget value is not the default path.
  • Some directory sizes in the backup directory:
    • Default path first
      • du -sh `find /opt/zimbra/backup -maxdepth 2 -type d`
    • If your using a different backup target, check that directory also. Replace /opt/zimbra/backup above with your backup path.
  • zmbackupquery
    • This should match what's in the sessions directory and it will also tell us if status of each backup and how many accts were done.
  • crontab -l | grep -i back
    • This will show use when backups are support to run and with what options they are running with.
  • zmlocalconfig | grep -i back
    • This is useful to see a number of backup options not exposed in the crontab, things related to the zip options.
  • zmvolume -l
    • This is useful to see how many volumes are being used, if HSM is being used, and if compression is being done at the volume level.
Additional Log Files Support Might Need

And send the following logs:

  • /var/log/messages
    • Filesystem issues often times are noted here and also in syslog. This might explain an interruption in the backup process. Server restarts, filesystem going full, filesystem going read-only, etc.
  • /var/log/syslog
  • /opt/zimbra/log/mailbox.log
    • The backup activity is logged here.
    • And any other mailbox.log file that would cover the event

Additional Checks For Performance Specific Issues

If Your Using a SAN or NFS For Your Backup Target - Please Check Your IOWait

Ideally, you would compare iowait and performance data from the target backup host as well as the stats available on the ZCS servers. To get graphs and stats on this from ZCS, please see Ajcody-Testing-Debugging#zmstat_and_zmstat-chart . You should submit this data and iowait conclusions if you still need to submit a support case about backup performance issues.

Is HSM Running During Your Backup Window
Are You Using --zipStore

--zipStore zips the blobs vs. keeping the blobs as individual files. --zipStore does not use compression either. For most circumstances, this will give the best performance, especially with NFS. This should be the default behavior of the backups, the following RFE is when it became the default [ZCS6+] :

To see if zip's are being used for backups for example, in the backup/session directory you'll know if it is by seeing .zip files:

zimbra$ ls blobs-1.zip    blobs-2.zip    blobs-3.zip    blobs-4.zip

To see if the zip file is using compression [-Z option for unzip will indicate whether or not the archive is actually compressed] :

unzip -Z blobs-4.zip
293 files, 5982984 bytes uncompressed, 5982984 bytes compressed:  0.0%

Also, if your zmvolume has compression enabled the blobs will remain compressed within the zip also upon backup. The point being, they are uncompressed to be then put into a zip file when the backup is using --zipStore.

Verified Against: Zimbra Collaboration Suite 8.6 Date Created: 01/22/2015
Article ID: https://wiki.zimbra.com/index.php?title=Troubleshooting_Course_Content_Rough_Drafts-Recover_Missing_Data_-_User Date Modified: 2015-01-28

Try Zimbra

Try Zimbra Collaboration with a 60-day free trial.
Get it now »

Want to get involved?

You can contribute in the Community, Wiki, Code, or development of Zimlets.
Find out more. »

Looking for a Video?

Visit our YouTube channel to get the latest webinars, technology news, product overviews, and so much more.
Go to the YouTube channel »

Jump to: navigation, search