Troubleshooting Performance Issues
Zimbra Support has recently seen an increase in the number of support cases relating to server performance. These typically consist of a complaint that the server is too slow, with little supporting data. This page exists to help administrators gather the data that Zimbra Support will request to resolve the issue. The data listed here is by no means exhaustive, but it provides a good starting point for troubleshooting the problem.
mailbox.log and zimbra.log
Zimbra Support will likely need to see the /opt/zimbra/log/mailbox.log file and possibly /var/log/zimbra.log from the system. These files contain the basic operation data from the system, and can tell us if the server has something seriously wrong with it. They can also be correlated against other collected data to give a complete picture of the workings of the system.
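As a rough sketch of how these files might be bundled for a support case, the following function archives whichever of the named files are readable. The function name collect_logs and the archive path in the example are illustrative assumptions, not part of any Zimbra tooling.

```bash
#!/bin/bash
# Sketch: bundle log files into one compressed archive.
# collect_logs is a hypothetical helper, not a Zimbra command.

collect_logs() {
    local archive="$1"; shift
    local files=()
    local f
    for f in "$@"; do
        # Skip files that are missing or unreadable instead of failing.
        if [ -r "$f" ]; then
            files+=("$f")
        else
            echo "skipping unreadable file: $f" >&2
        fi
    done
    if [ "${#files[@]}" -eq 0 ]; then
        echo "no log files found" >&2
        return 1
    fi
    tar -czf "$archive" "${files[@]}"
}

# Example (run as a user that can read both files):
# collect_logs "/tmp/zimbra-logs-$(date '+%Y%m%d').tar.gz" \
#     /opt/zimbra/log/mailbox.log /var/log/zimbra.log
```

Skipping unreadable files keeps the helper usable on servers where /var/log/zimbra.log is only readable by root.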
mysql_error.log and myslow.log
Both files are located in /opt/zimbra/log on the mail store server and contain information about the health of the MySQL database. If data corruption or another problem causes direct MySQL errors, the events are logged in mysql_error.log. If certain search requests take longer to complete than others, they are logged in myslow.log.
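A quick first look at these two files is to count the error entries and list the longest query times. The sketch below assumes the files follow the standard MySQL error log and slow query log layouts (entries tagged [ERROR], and header lines beginning with "# Query_time:"); both function names are hypothetical.

```bash
#!/bin/bash
# Sketch: summarize MySQL log files. Assumes standard MySQL log layouts.

# count_mysql_errors FILE: print the number of lines tagged [ERROR].
count_mysql_errors() {
    grep -cF '[ERROR]' "$1"
}

# slowest_queries FILE [N]: print the N largest Query_time values (default 10).
slowest_queries() {
    local logfile="$1" count="${2:-10}"
    # Field 3 of a "# Query_time: N ..." line is the elapsed time in seconds.
    awk '/^# Query_time:/ { print $3 }' "$logfile" \
        | LC_ALL=C sort -rn \
        | head -n "$count"
}

# Example:
# count_mysql_errors /opt/zimbra/log/mysql_error.log
# slowest_queries /opt/zimbra/log/myslow.log 5
```

The numbers only indicate where to look; the surrounding log lines are still needed to see which statements were slow.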
The /opt/zimbra/log/zmmailboxd.out file contains startup information for mailboxd and the thread dumps created whenever mailboxd is shut down. If a server becomes completely unresponsive and is restarted, the thread dump captured here will tell us whether certain threads are blocking other threads' access to critical data elements. Slow behavior is frequently caused by these thread locks.
In some cases, it may be necessary to monitor garbage collection and other operations at the Java VM level. To enable this logging, add the following to mailboxd_java_options using zmlocalconfig: "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime"
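Because mailboxd_java_options usually already holds other JVM options, the new flags should be appended rather than written over the existing value. The following is a minimal sketch of one way to do that; add_java_options is a hypothetical helper, and the commented zmlocalconfig calls assume the -m nokey output format and the -e edit option behave as on current Zimbra releases.

```bash
#!/bin/bash
# Sketch: append GC logging flags to an existing option string.
# add_java_options is a hypothetical helper, not a Zimbra command.

# add_java_options CURRENT EXTRA...: print CURRENT with each EXTRA option
# appended, skipping options that are already present.
add_java_options() {
    local result="$1"; shift
    local opt
    for opt in "$@"; do
        case " $result " in
            *" $opt "*) ;;                              # already set
            *) result="${result:+$result }$opt" ;;      # append
        esac
    done
    printf '%s\n' "$result"
}

GC_OPTS=(-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
         -XX:+PrintGCApplicationStoppedTime)

# As the zimbra user:
# current=$(zmlocalconfig -m nokey mailboxd_java_options)
# zmlocalconfig -e mailboxd_java_options="$(add_java_options "$current" "${GC_OPTS[@]}")"
```

JVM options are read when the Java process starts, so mailboxd has to be restarted before the new logging appears.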
Zmstats runs constantly on all current Zimbra systems. It monitors various Zimbra components as well as the system as a whole to give a good picture of how the system is performing over time. Stats charts are extremely useful for troubleshooting performance issues and can often point to a bottleneck on the system itself or to specific problems in the mailboxd Java VM.
To generate a stat chart, run the following:
zmstat-chart -s /opt/zimbra/zmstat/<date> -d <output directory>
The date is in the format 'YYYY-MM-DD'. Generally when troubleshooting a performance problem, Zimbra Support will need to see several days' worth of statistics data, as well as the log files and possibly thread dumps from the same time period.
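Since several days of charts are usually needed, the command can be wrapped in a loop. This is only a sketch: it assumes GNU date (for the -d option) and that each day's statistics sit in /opt/zimbra/zmstat/<date> as shown in the command above. The function names and the output layout are illustrative.

```bash
#!/bin/bash
# Sketch: generate stat charts for the most recent days.
# Function names are hypothetical; requires GNU date.

# recent_dates N: print the last N dates, today first, as YYYY-MM-DD.
recent_dates() {
    local n="$1" i
    for ((i = 0; i < n; i++)); do
        date -d "$i days ago" '+%Y-%m-%d'
    done
}

# chart_recent_days N OUTDIR: write one chart set per day to OUTDIR/<date>.
chart_recent_days() {
    local n="$1" outdir="$2" day
    for day in $(recent_dates "$n"); do
        # Skip days for which no statistics directory exists.
        [ -d "/opt/zimbra/zmstat/$day" ] || continue
        mkdir -p "$outdir/$day"
        zmstat-chart -s "/opt/zimbra/zmstat/$day" -d "$outdir/$day"
    done
}

# Example: chart_recent_days 5 /tmp/zmstat-charts
```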
A thread dump is a printout of the status of all the running threads in the mailboxd process at a specific point in time. A thread dump allows Zimbra engineers to see how the system is operating, what each thread is doing, and what data elements are being accessed by individual threads. If a performance bottleneck is not identifiable from the stat charts alone, it may be necessary to generate a periodic thread dump.
The following script generates five thread dumps within one minute:
#!/bin/bash
#
# Dump 5 threads and proc stats for mailboxd Java PID in one minute.
# Daily output written to /tmp/zmperiodic-YYYYMMDD
#
# Execute the script with cron
# crontab: * * * * * /tmp/zmperiodic.sh

DUMPDIR="/tmp/zmperiodic-$(date '+%Y%m%d')"
if [ ! -d "$DUMPDIR" ]
then
    mkdir "$DUMPDIR"
fi

for ((i=0; i<5; i++))
do
    echo "" > /opt/zimbra/log/zmmailboxd.out
    STAMP=$(date '+%Y%m%d.%H%M%S')
    JPID=$(cat /opt/zimbra/log/zmmailboxd_java.pid)
    kill -3 "$JPID"
    sleep 1
    cp /opt/zimbra/log/zmmailboxd.out "$DUMPDIR/zmmailboxd.out-$STAMP"
    cat /proc/"$JPID"/task/*/stat > "$DUMPDIR/proc-stats-$STAMP"
    if [ "$i" -ne 4 ]
    then
        sleep 11
    else
        exit
    fi
done
Save the script as /tmp/zmperiodic.sh, make it executable, and add it to cron so that it runs every minute. The output is written to a directory named /tmp/zmperiodic-YYYYMMDD and consists of thread dumps along with per-thread data from /proc.
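Before sending the collected dumps, it can help to see which of them show lock contention. The sketch below counts blocked threads in each captured dump. It assumes HotSpot-style dumps in which every thread has a "java.lang.Thread.State:" line (Java 6 and later); blocked_per_dump is a hypothetical helper name.

```bash
#!/bin/bash
# Sketch: count BLOCKED threads in each thread dump of one output directory.
# Assumes HotSpot-style dumps; blocked_per_dump is a hypothetical helper.

# blocked_per_dump DIR: print "<dump file> <blocked thread count>" per dump.
blocked_per_dump() {
    local dir="$1" f
    for f in "$dir"/zmmailboxd.out-*; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        printf '%s %s\n' "$(basename "$f")" \
            "$(grep -cF 'java.lang.Thread.State: BLOCKED' "$f")"
    done
}

# Example: blocked_per_dump "/tmp/zmperiodic-$(date '+%Y%m%d')"
```

Dumps in which the count stays high across consecutive timestamps are the ones most worth a closer look.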