Ajcody-Testing-Debugging
Revision as of 12:29, 31 January 2011
- This article is NOT official Zimbra documentation. It is a user contribution and may include unsupported customizations, references, suggestions, or information.
Testing & Debugging Tools
Actual Testing & Debugging Homepage
Please see Ajcody-Testing-Debugging
Other Debug References
Other Zimbra Wiki Pages
Various Errors And Log Items One Might See
Logging
See Ajcody-Logging for more complete logging information.
mailbox.log and zimbra.log
- Zimbra Support will likely need to see the /opt/zimbra/log/mailbox.log file and possibly /var/log/zimbra.log from the system. These files contain the basic operation data from the system, and can tell us if the server has something seriously wrong with it. They can also be correlated against other collected data to give a complete picture of the workings of the system.
mysql_error.log and myslow.log
- These files are both located in /opt/zimbra/log/ on the mail store server. They contain information about the health of the mysql database. If there is data corruption or another problem causing direct mysql errors, events will be logged in mysql_error.log. If certain search requests are taking longer to complete than others, they will be logged in myslow.log.
zmmailboxd.out
- This file contains startup information for mailboxd and thread dumps created whenever mailboxd is shut down. If a server goes completely nonresponsive and is restarted, the thread dump captured here will tell us if there are certain threads blocking other threads' access to critical data elements. Frequently, slow behavior can be caused by these thread locks.
- In some cases, it may be necessary to monitor garbage collection and other operations at the Java VM level. To enable this logging, add the following to mailboxd_java_options.
- To get current options:
zmlocalconfig mailboxd_java_options
- Then append the following to your current set of options: "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime". PLEASE NOTE FOR BELOW, "YOUR EXISTING OPTIONS" should be replaced with the output above.
zmlocalconfig -e mailboxd_java_options="YOUR EXISTING OPTIONS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime"
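The append step above can be sketched in shell. This is a minimal sketch, not the official procedure: EXISTING is a hard-coded stand-in for your real options (in practice you would capture the zmlocalconfig output first), and the final zmlocalconfig call is left commented out.

```shell
# Sketch: build the new mailboxd_java_options value by appending the GC flags
# to the existing options, so nothing already configured is lost.
# EXISTING is a stand-in here; in practice capture it from:
#   zmlocalconfig -m nokey mailboxd_java_options
EXISTING="-client -XX:NewRatio=2 -Djava.awt.headless=true"
GC_FLAGS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime"
NEW_OPTS="$EXISTING $GC_FLAGS"
echo "$NEW_OPTS"
# Then apply it (commented out in this sketch):
# zmlocalconfig -e mailboxd_java_options="$NEW_OPTS"
```

Building the string in a variable first avoids retyping (and possibly mistyping) the existing options inside the quoted zmlocalconfig argument.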
Connections And Or Threads Mentioned In mailbox.log
Tuning Variables Related To Threads And Connections
Resources to "tune" the below variables:
zimbraHttpNumThreads
- ZCS 5.0 has a single thread pool for both HTTP and HTTPS
- To see current setting
- zmprov gs `zmhostname` zimbraHttpNumThreads
- To modify setting
- zmprov ms `zmhostname` zimbraHttpNumThreads 500
zimbraPop3NumThreads
- This setting of 300 is able to support a few tens of thousands of users checking mail every 8 minutes.
- To see current setting
- zmprov gs `zmhostname` zimbraPop3NumThreads
- To modify setting
- zmprov ms `zmhostname` zimbraPop3NumThreads 300
zimbraImapNumThreads
- IMAP thread settings.
- To see current setting
- zmprov gs `zmhostname` zimbraImapNumThreads
- To modify setting
- zmprov ms `zmhostname` zimbraImapNumThreads 500
zimbraLmtpNumThreads
- LMTP thread settings.
- To see current setting
- zmprov gs `zmhostname` zimbraLmtpNumThreads
- To modify setting
- zmprov ms `zmhostname` zimbraLmtpNumThreads 40
mailboxd_java_options
- JVM (Java Virtual Machine) options, ZCS 5.0 and later
- To see current setting
- zmlocalconfig mailboxd_java_options
- To modify setting
- zmlocalconfig -e mailboxd_java_options="-client -XX:NewRatio=2 -Djava.awt.headless=true -XX:MaxPermSize=128m -XX:SoftRefLRUPolicyMSPerMB=1"
- The above is using the "default" options as an example to modify.
- Other various configurations for mailboxd_java_options. See sections related to heap and thread dumps below for more details.
- mailboxd_java_options="-client -XX:NewRatio=2 -XX:MaxPermSize=128m -Djava.awt.headless=true -XX:SoftRefLRUPolicyMSPerMB=1 -XX:+UseParallelGC"
- mailboxd_java_options="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/some/directory/that/exists/and/is/zimbra/writable"
mailboxd_java_heap_memory_percent
- JVM Cache setting
- To see current setting
- zmlocalconfig mailboxd_java_heap_memory_percent
- To modify setting
- zmlocalconfig -e mailboxd_java_heap_memory_percent=40
zimbraMessageCacheSize
- Mailbox message cache. The mailbox server maintains a cache of message bodies (blobs on disk under /opt/zimbra/store). This cache speeds up retrieval of message content for mail clients such as Mail.app.
- To see current setting
- zmprov gs `zmhostname` zimbraMessageCacheSize
- To modify setting
- zmprov ms `zmhostname` zimbraMessageCacheSize 104857600
zimbra_mysql_connector_maxActive
- Maximum number of active connections allowed in the Zimbra mysql connection pool.
- To see current setting
- zmlocalconfig zimbra_mysql_connector_maxActive
- To modify setting
- zmlocalconfig -e zimbra_mysql_connector_maxActive=100
Example Of mailbox.log event:
dbconn - Connection pool is 75% utilized (88 connections out of a maximum of 100 in use). Turn on debug logging for zimbra.dbconn to see stack traces of connections not returned to the pool.
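When watching for this event, the utilization percentage can be pulled out of the log line with standard tools. A minimal sketch; the log line is hard-coded here for illustration, whereas a real run would grep it out of /opt/zimbra/log/mailbox.log:

```shell
# Extract the pool-utilization percentage from a dbconn warning line.
LOG_LINE='dbconn - Connection pool is 75% utilized (88 connections out of a maximum of 100 in use).'
PCT=$(printf '%s\n' "$LOG_LINE" | sed -n 's/.* is \([0-9][0-9]*\)% utilized.*/\1/p')
echo "$PCT"
```

A number that stays near 100 suggests raising zimbra_mysql_connector_maxActive or hunting for connections not returned to the pool, as the log message itself advises.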
zimbraXMPPEnabled
There is a potential problem with XMPP [Instant Messaging]. XMPP is a beta feature and is not supported in a production environment today; if you're not using XMPP anyway, disabling it is strongly recommended.
- Confirm if variable is set to False or True
- zmprov gacf zimbraXMPPEnabled
- Set to FALSE to turn off
- zmprov mcf zimbraXMPPEnabled FALSE
- Then at some point restart the mailstores.
Bug reference:
- "IM service connection leak,memory leak"
Zimbra Tools And Commands
zmstat and zmstat-chart
Resources:
Down And Dirty Example zmstat
zmstat runs constantly on all current Zimbra systems. It monitors various Zimbra components as well as the whole system to give a good picture of how the system is performing over time. Stats charts are extremely useful for troubleshooting performance issues and can often point to a bottleneck on the system itself or to specific problems in the mailboxd Java VM.
To generate a stat chart, run the following:
zmstat-chart -d <output directory> -s /opt/zimbra/zmstat/<date> --title "<server> <date>"
The date is in the format 'YYYY-MM-DD'. Generally when troubleshooting a performance problem, Zimbra Support will need to see several days' worth of statistics data, as well as the log files and possibly thread dumps from the same time period.
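As a convenience, the chart command for a given day can be assembled in shell. A minimal sketch: the /tmp/chart-$DAY output directory and the use of hostname for the title are this sketch's assumptions, not required by zmstat-chart.

```shell
# Build a zmstat-chart invocation for a given day (YYYY-MM-DD).
DAY=${DAY:-$(date '+%Y-%m-%d')}   # defaults to today; set DAY to chart another day
CMD="zmstat-chart -d /tmp/chart-$DAY -s /opt/zimbra/zmstat/$DAY --title \"$(hostname) $DAY\""
echo "$CMD"
```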
zmdiaglog
See also : Zmdiaglog
Running a full zmdiaglog collects many diagnostics, such as zmstats, a thread dump, lsof output, a core dump, and a heap dump. The heap dump portion can take longer than the other steps and is invasive on the system. If possible, it should only be run late in the evening or if the server is not responding anyway. Prior to 5.0.16, this tool may run a heap dump, which is disruptive on a running server. From 5.0.16 on, a heap dump requires "zmdiaglog -a". Also, you can CTRL-C out of zmdiaglog during the heap dump process without affecting the rest of the dump; you will just need to compress the files manually.
As root:
/opt/zimbra/libexec/zmdiaglog
This will create a zip file with a lot of relevant data for troubleshooting issues.
zmdiaglog To Get Full Heap Dump
In order to get a full heap dump, one has to make sure that HeapDumps have been enabled in the JVM options. In order to do this, please first check your mailboxd_java_options on that server:
# su - zimbra
$ zmlocalconfig mailboxd_java_options
Then, append these additional arguments:
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/some/directory/that/exists/and/is/zimbra/writable
It is very important that the HeapDumpPath chosen is large enough to dump a multi-GB core file and heap dump. It is also very important to make sure that these options are appended to your existing options.
Example [You might need to adjust based upon your existing mailboxd_java_options]:
$ zmlocalconfig mailboxd_java_options
mailboxd_java_options = -client -XX:NewRatio=2 -Djava.awt.headless=true -XX:MaxPermSize=128m -XX:SoftRefLRUPolicyMSPerMB=1
$ zmlocalconfig -e mailboxd_java_options="-client -XX:NewRatio=2 -Djava.awt.headless=true -XX:MaxPermSize=128m -XX:SoftRefLRUPolicyMSPerMB=1 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp"
Then restart ZCS on that mailstore.
Caution: If the chosen HeapDumpPath (e.g. /tmp) does not have enough space, the dump could fill the filesystem/server and cause very bad problems!
Then, if any problem with performance occurs again, run the zmdiaglog as root:
[Prior to 5.0.16]
/opt/zimbra/libexec/zmdiaglog
[For 5.0.16 and forward]
/opt/zimbra/libexec/zmdiaglog -a
Note: there are options to zmdiaglog that allow it to write to locations other than /tmp; again, it is very important not to fill up the filesystem. Run zmdiaglog -h to get the list of options:
# /opt/zimbra/libexec/zmdiaglog -h
Usage: zmdiaglog [-dth]
    -d - Log destination (Default /tmp/zmdiaglog.PID)
    -t - Timeout in seconds for hanging commands (Default 120)
    -h - This help message
This may take a few minutes to complete, as it is writing a multiple GB file. Once complete, then restart mailboxd:
# su - zimbra
$ zmmailboxdctl restart
You will then provide the zmdiaglog information written to /tmp to Zimbra via FTP. Our developers, rather than support staff, will most likely need to analyze the information provided. Please be patient; support will provide feedback as they hear conclusions from the developers.
There are lots of steps requested here, but this information is critical if we are to diagnose this reported problem. The data provided by zmdiaglog is very important.
Using JDK 1.6 For zmdiaglog
See Ajcody-Testing-Debugging#jmap_And_JDK_1.6
Note About Memory And Linux
In general, memory used by the JVM can occasionally spike above the maximum heap size under normal usage, but we would expect that it would go back down again eventually. Also, please note that all Linux systems will utilize all memory available on the system - this is because the kernel will cache any data it can in order for possible reuse; this is perfectly normal, and in the case that memory is needed, older cache will be deleted from memory and memory freed for other use. No Linux system will show much memory as "free" after being up for some period of time, for example when using top or vmstat. All memory will get utilized by the system, and this is a good thing for performance - you paid for the memory, might as well use it. Where it could be a problem is if one process grows uncontrolled, or if swapping is occurring.
Thread Dumps And Specific Commands To Use
A thread dump is a printout of the status of all the running threads in the mailboxd process at a specific point in time. A thread dump allows Zimbra engineers to see how the system is operating, what each thread is doing, and what data elements are being accessed by individual threads. If a performance bottleneck is not identifiable from the stat charts alone, it may be necessary to generate a periodic thread dump.
zmthrdump - ZCS 5.0.10+
The thread dump from zmthrdump will show which threads are currently in process within the JVM. zmthrdump is useful when you want to grab several thread dumps in short intervals in order to track and look for locked threads. Typically, the thread dumps are useful for tracking what is waiting on what in the thread stack. If the server is also experiencing GC [garbage collection] issues, it is generally better to just get a full zmdiaglog (with valid Java memory heap dump) along with the rest of the data it collects.
Resources:
- zmthrdump help
- zmthrdump [-h] [-i] [-t <timeout seconds>] [-p <pid file>] [-f <file>] [-o <out-file>]
zmmailboxdmgr And Threaddump - Prior to 5.0.10
Run this command as root
for i in 1 2 3; do /opt/zimbra/libexec/zmmailboxdmgr threaddump; sleep 30; done
/opt/zimbra/log/zmmailboxd.out will reflect the thread dumps.
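Once the dumps are in zmmailboxd.out, a quick way to spot contention is to tally thread states. A minimal sketch: the here-doc below stands in for real "java.lang.Thread.State" lines, whereas a real run would read /opt/zimbra/log/zmmailboxd.out.

```shell
# Tally thread states from a dump; the here-doc mimics zmmailboxd.out content.
SAMPLE=$(cat <<'EOF'
   java.lang.Thread.State: RUNNABLE
   java.lang.Thread.State: BLOCKED (on object monitor)
   java.lang.Thread.State: BLOCKED (on object monitor)
   java.lang.Thread.State: TIMED_WAITING (sleeping)
EOF
)
STATES=$(printf '%s\n' "$SAMPLE" | awk '/Thread.State:/ {count[$2]++} END {for (s in count) print s, count[s]}' | sort)
echo "$STATES"
```

A large or growing BLOCKED count across successive dumps is the usual sign of lock contention worth investigating with the analyzer script below.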
Thread Dump Script Example 1
Usage of the below script:
- $0 FILENAME -dl [-frames #_ stack_frames] [-id REGEXP] [-all]
- $0 FILENAME -dt [-frames #_ stack_frames] [-sort id|state] [-stack REGEXP] [-id REGEXP] [-state REGEXP] [-waiting NUMBER]
- frames: controls the # lines of stack trace included in the output
- id: only include where the id matches REGEXP
- all: include ALL locks (not just ones that other threads are blocked-on)
- sort: controls the order threads are printed out (locks always printed in lock ID order)
- stack: only include threads where the thread's stack output matches REGEXP
- state: only include threads where the thread state (e.g. RUNNABLE) matches REGEXP
- waiting: only include threads that are blocking other threads
Examples:
- $0 threads.txt -dt -stack MailboxIndex.java -state RUNNABLE -f 20
- -- dumps all RUNNABLE threads with MailboxIndex.java on the stack (1st 20 lines of the stack)
- $0 threads.txt -dl -f 0
- -- dumps the list of locks in the system that are blocking at least one thread
- $0 threads.txt -dt -f 0 -sort state
- -- dumps a list of all the threads in the system and tells you their run state
- $0 threads.txt -dt -f 20 -w 1
- -- dumps a list of all the threads in the system that are blocking other threads (quick way to hunt for contention)
Working Example:
- threadDumpAnalyzer.pl zmmailboxd.out -dl -f 100 ~/Desktop
threadDumpAnalyzer.pl Script
#!/usr/bin/perl -w
#
# ***** BEGIN LICENSE BLOCK *****
#
# Zimbra Collaboration Suite Server
# Copyright (C) 2004, 2005, 2006, 2007 Zimbra, Inc.
#
# The contents of this file are subject to the Yahoo! Public License
# Version 1.0 ("License"); you may not use this file except in
# compliance with the License. You may obtain a copy of the License at
# http://www.zimbra.com/license.
#
# Software distributed under the License is distributed on an "AS IS"
# basis, WITHOUT WARRANTY OF ANY KIND, either express or implied.
#
# ***** END LICENSE BLOCK *****
#
use strict;
use warnings;
use Getopt::Long;

my %locks;   # by LockID - hash w/ 'owner','type'
my %threads; # by threadId - hash w/ 'state' 'waitingOnLock'

my $filename = $ARGV[0];

my ($dumpLocks, $dumpThreads, $searchThreadStack, $searchThreadId, $stackFrames, $sort, $filterByState, $allLocks);
my ($waiting);

$stackFrames = 10;
$sort = "state";

GetOptions(
    "dl"        => \$dumpLocks,
    "dt"        => \$dumpThreads,
    "frames=s"  => \$stackFrames,
    "sort=s"    => \$sort,
    "id=s"      => \$searchThreadId,
    "state=s"   => \$filterByState,
    "stack=s"   => \$searchThreadStack,
    "waiting=s" => \$waiting,
    "all"       => \$allLocks,
);

sub dumpLocks();
sub dumpThreads();
sub usage();
sub readFile($);
sub getBlockedThreads($);

if (!defined $filename) { usage(); }

readFile($filename);

if (defined $dumpLocks) {
    dumpLocks();
} elsif (defined $dumpThreads) {
    dumpThreads();
} else {
    usage();
}
exit(0);

sub usage() {
    my $usage = <<END_OF_USAGE;
USAGE:
  $0 FILENAME -dl [-frames #_ stack_frames] [-id REGEXP] [-all]
  $0 FILENAME -dt [-frames #_ stack_frames] [-sort id|state] [-stack REGEXP] [-id REGEXP] [-state REGEXP] [-waiting NUMBER]

  frames: controls the # lines of stack trace included in the output
  id: only include where the id matches REGEXP
  all: include ALL locks (not just ones that other threads are blocked-on)
  sort: controls the order threads are printed out (locks always printed in lock ID order)
  stack: only include threads where the thread's stack output matches REGEXP
  state: only include threads where the thread state (e.g. RUNNABLE) matches REGEXP
  waiting: only include threads that are blocking other threads

Examples:
  $0 threads.txt -dt -stack MailboxIndex.java -state RUNNABLE -f 20
     -- dumps all RUNNABLE threads with MailboxIndex.java on the stack (1st 20 lines of the stack)
  $0 threads.txt -dl -f 0
     -- dumps the list of locks in the system that are blocking at least one thread
  $0 threads.txt -dt -f 0 -sort state
     -- dumps a list of all the threads in the system and tells you their run state
  $0 threads.txt -dt -f 20 -w 1
     -- dumps a list of all the threads in the system that are blocking other threads (quick way to hunt for contention)
END_OF_USAGE
    die $usage;
}

sub mySort($$) {
    my ($a, $b) = @_;
    if ($sort eq "state") {
        my $state1 = $threads{$a}{state};
        my $state2 = $threads{$b}{state};
        if (!defined $state1) { return 1; }
        elsif (!defined $state2) { return -1; }
        return $state1 cmp $state2;
    } else {
        return $a cmp $b;
    }
}

sub padToWidth($$) {
    my ($str, $width) = @_;
    if (!defined($str)) { $str = ""; }
    return sprintf "%-*s", $width, $str;
}

sub formatStackTrace($$) {
    my ($stack, $indent) = @_;
    my $ret;
    if ($stackFrames == 0) { return ""; }
    my $num = $stackFrames;
    foreach my $line (split /\n/, $stack) {
        if ($num <= 0) { return $ret; }
        $ret .= $indent.$line."\n";
        $num--;
    }
    return $ret;
}

sub formatLock($) {
    my $lockId = shift;
    my $output = "$lockId - ";
    my $numData = 0;
    while (my ($k,$v) = each %{ $locks{$lockId}}) {
        $output .= " $k=$v";
        $numData++;
    }
    if ($numData == 0) { $output .= " UNKNOWN"; }
    return $output;
}

sub formatThread($) {
    my $threadId = shift;
    if (!defined $threadId) { return ""; }
    my $foo = padToWidth($threadId, 50);
    if (!defined($foo) || $foo eq "") { $foo = "ASDF"; }
    my $bar = $threads{$threadId}{state};
    if (!defined($bar) || $bar eq "") { $bar = "HJKL"; }
    my $ret = $foo.$bar."\n";
    if (defined $threads{$threadId}{waitingOnLock}) {
        $ret .= "\tWaiting for: ".formatLock($threads{$threadId}{waitingOnLock})."\n";
    }
    my @blockedThreads = getBlockedThreads($threadId);
    for my $blockedThread (sort @blockedThreads) {
        $ret .= "\t$blockedThread is waiting on this thread\n";
    }
    $ret .= formatStackTrace($threads{$threadId}{stack}, "\t ");
    if ($stackFrames > 0) { $ret .= "\n"; }
    return $ret;
}

# given a lockId, return a list of the threads that are blocked on it
sub getLockWaiters($) {
    my $lockId = shift;
    my @ret;
    while (my ($threadId,$thread) = each %threads ) {
        if (defined $thread->{waitingOnLock} && $thread->{waitingOnLock} eq $lockId) {
            push @ret, $threadId;
        }
    }
    return @ret;
}

sub getLockOwner($) {
    my $lockId = shift;
    if (defined $locks{$lockId} && defined $locks{$lockId}{owner}) {
        return $locks{$lockId}{owner};
    } else {
        return "";
    }
}

# given a threadId, get a list of all other threads that are blocked
# on locks it is holding
sub getBlockedThreads($) {
    my $threadId = shift;
    my @ret;
    foreach my $lockId ( sort keys %locks ) {
        if (getLockOwner($lockId) eq $threadId) {
            # a lock we own
            my @blockedThreads = getLockWaiters($lockId);
            foreach my $blockedThread (@blockedThreads) {
                push @ret, $blockedThread;
                push @ret, getBlockedThreads($blockedThread); # recurse!
            }
        }
    }
    return @ret;
}

sub dumpLocks() {
    foreach my $lockId ( sort keys %locks ) {
        my $ret = "";
        my $numWaiters = 0;
        $ret .= "LOCK: $lockId ";
        while (my ($k,$v) = each %{ $locks{$lockId}}) {
            $ret .= "$k=$v, ";
        }
        $ret .= "\n";
        while (my ($threadId,$thread) = each %threads ) {
            if (defined $thread->{waitingOnLock} && $thread->{waitingOnLock} eq $lockId) {
                $ret .= "\tThread $threadId is waiting for this lock\n";
                $numWaiters++;
            }
        }
        $ret .= formatStackTrace($threads{getLockOwner($lockId)}{stack}, "\t");
        if (!defined $searchThreadId || $lockId =~ /$searchThreadId/) {
            if ($numWaiters > 0 || defined $allLocks) {
                if ($stackFrames > 0) { $ret .= "\n"; }
                print $ret;
            }
        }
    }
}

sub dumpThreads() {
    foreach my $threadId ( sort { mySort($a, $b) } keys %threads ) {
        if (!defined $threadId) {
            # continue
        } elsif (defined $searchThreadStack && $threads{$threadId}{stack} !~ /$searchThreadStack/) {
            # continue
        } elsif (defined $searchThreadId && $threadId !~ /$searchThreadId/) {
            # continue
        } elsif (defined $filterByState && $threads{$threadId}{state} !~ /$filterByState/) {
            # continue
        } elsif (defined $waiting && getBlockedThreads($threadId) < $waiting) {
            # continue
        } else {
            print formatThread($threadId);
        }
    }
}

sub readFile($) {
    my $filename = shift;
    open IN, '<', $filename or die "couldn't open $filename";
    my @curThread;
    while (<IN>) {
        chomp;
        s/\r//g;
        if ($_ eq "") {
            if (@curThread) {
                my $threadId;
                my @locksHeld;
                my $waitingOnLock;
                my $threadState;
                my $output;
                # 1st line
                my $line = shift @curThread;
                $output .= "$line\n";
                if ($line =~ /"(.*)"/) {
                    $threadId = $1;
                    if ($line =~ /nid=0x[0-9a-f]+\s([a-zA-Z\s\.()]+)/) {
                        $threads{$threadId}{state} = $1;
                    }
                } else {
                    $threadId = $line;
                }
                if ($threadId eq "") { $threadId = "none"; }
                # 2nd line
                $line = shift @curThread;
                if (defined $line) {
                    $output .= "$line\n";
                    if ($line =~ /State: ([A-Z_]+)/) {
                        $threadState = $1;
                        $threads{$threadId}{state} = $1;
                    }
                    foreach $line (@curThread) {
                        $output .= "$line\n";
                        if ($line =~ /locked <(0x[0-9a-f]+)>\s?(.*)?/) {
                            push @locksHeld, $1;
                            $locks{$1}{owner} = $threadId;
                            $locks{$1}{type} = $2;
                        } elsif ($line =~ /- waiting to lock <(0x[0-9a-f]+)>/) {
                            $waitingOnLock = $1;
                            $threads{$threadId}{waitingOnLock} = $1;
                        }
                    }
                } else {
                    $threads{$threadId}{state} = "unknown";
                }
                $threads{$threadId}{stack} = $output;
                @curThread = ();
            }
        } else {
            push @curThread, $_;
        }
    }
    close IN;
}
Thread Dump Script Example 2
The following script generates five thread dumps within one minute:
#!/bin/bash
#
# Dump 5 threads and proc stats for mailboxd Java PID in one minute.
# Daily output written to /tmp/zmperiodic-YYYYMMDD
#
# Execute the script with cron
# crontab: * * * * * /tmp/zmperiodic.sh

DUMPDIR="/tmp/zmperiodic-$(date '+%Y%m%d')"

if [ ! -d $DUMPDIR ]
then
    mkdir $DUMPDIR
fi

for ((i=0; i<5; i++))
do
    echo "" > /opt/zimbra/log/zmmailboxd.out
    STAMP=$(date '+%Y%m%d.%H%M%S')
    JPID=$(cat /opt/zimbra/log/zmmailboxd_java.pid)
    kill -3 $JPID
    sleep 1
    cp /opt/zimbra/log/zmmailboxd.out $DUMPDIR/zmmailboxd.out-$STAMP
    cat /proc/$JPID/task/*/stat > $DUMPDIR/proc-stats-$STAMP
    if [ $i -ne 4 ]
    then
        sleep 11
    else
        exit
    fi
done
It should be placed in cron so that it runs every minute. The output data will be placed in a directory named /tmp/zmperiodic-YYYYMMDD. This data will consist of thread dumps along with thread data from /proc.
Heap Dump Specific Commands
jmap And JDK 1.6
Go here:
http://java.sun.com/javase/downloads/index.jsp
Select:
- Java SE Development Kit (JDK)
- JDK 6 Update 13
Choose Example Is Using x64 Binary - Adjust If Using x32:
- Linux x64
- jdk-6u13-linux-x64.bin *DO NOT choose RPM download*
Place jdk-6u13-linux-x64.bin in /tmp on the ZCS server.
Install JDK 1.6 as root. Executing the bin file prompts you to agree to a license agreement. The installer will create a directory named jdk1.6.0_13 and extract its contents to that directory. This process does not interfere with ZCS.
cd /opt/zimbra
/tmp/jdk-6u13-linux-x64.bin
chown -R zimbra:zimbra jdk1.6.0_13
Next, stop ZCS services with zmcontrol, destroy the java symlink and symlink JDK 1.6 to java.
su - zimbra
zmcontrol stop
exit
rm -rf java
ln -s jdk1.6.0_13 java
zmcontrol start
You can now run your zmdiaglog with greater efficiency. Alternatively, you can generate heap dumps manually with jmap (as zimbra this time).
su - zimbra -c '/opt/zimbra/java/bin/jmap -dump:live,file=/tmp/heapdump.live $(cat /opt/zimbra/log/zmmailboxd_java.pid)'
Rollback at any time by shutting down services with zmcontrol and pointing the java symlink back to jdk1.5.
zmdbintegrityreport
Resources:
Troubleshooting Memory Issues
Basic Commands To Gather Data
PS
Try adjusting the ps output:
ps -eo pid,ppid,rss,vsize,pcpu,pmem,cmd -ww --sort=pid
- RSS (Resident Set Size) - The portion of a process that exists in physical memory (RAM). The rest of the program exists in swap. If the computer has not used swap, this number will be equal to VSIZE. RSS per unit time will show how much RAM the process is using over time.
- VSIZE (Virtual memory SIZE) - The amount of memory the process is currently using. This includes the amount in RAM and the amount in swap. VSIZE per unit time will show how large the process is over time.
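To track a specific process such as mailboxd, the RSS column from that ps format can be converted to megabytes. A minimal sketch over canned output: the here-doc, PIDs, and process names below are made up for illustration, whereas a real run would pipe ps itself into awk.

```shell
# Convert the RSS column (KB) of the mailboxd java process to MB.
# The here-doc mimics output of: ps -eo pid,ppid,rss,vsize,pcpu,pmem,cmd -ww
PS_OUT=$(cat <<'EOF'
  PID  PPID    RSS     VSZ %CPU %MEM CMD
 1234     1 524288 2097152  5.0 12.0 /opt/zimbra/java/bin/java ... mailboxd
 1300  1234  10240  204800  0.1  0.2 /opt/zimbra/libexec/zmlogger
EOF
)
RSS_MB=$(printf '%s\n' "$PS_OUT" | awk '/mailboxd/ {printf "%.1f", $3/1024}')
echo "$RSS_MB MB"
```

Sampling this value from cron and graphing it over time is a cheap way to see whether mailboxd's resident size grows without bound.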
Free
Using Free:
free -o
- [ -o ] Disables the display of a "buffer adjusted" line. If the -o option is not specified, free subtracts buffer memory from the used memory and adds it to the free memory reported.
- "Memory Used" per unit time will show the "Memory Used" asymptotically approach the total amount of memory in the system under heavy use. This is normal, as RAM unused is RAM wasted.
- "Memory Used - Memory Buffered - Memory Cached" per unit time will give a good sense of the memory use of your applications minus the effects of your operating system. As you start new applications, this value should go up. As you quit applications, this value should go down. If an application has a severe memory leak, this line will have a positive slope.
- "Swap Used" per unit time will display the swap usage. When the system is low on RAM, a program called kswapd will swap parts of process if they haven't been used for some time. If the amount of swap continues to climb at a steady rate, you may have a memory leak or you might need more RAM.
You could also use watch with free to poll it:
watch free
- or
watch -n 1 -d free
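The "used minus buffers minus cached" figure described above can be computed directly from free's output. A minimal sketch over canned output (the numbers are made up); a real run would pipe free itself into awk.

```shell
# Application memory = used - buffers - cached, from the Mem: line of free.
FREE_OUT=$(cat <<'EOF'
             total       used       free     shared    buffers     cached
Mem:       4040360    4012200      28160          0     176628    3571348
Swap:      4200956      12184    4188772
EOF
)
APP_KB=$(printf '%s\n' "$FREE_OUT" | awk '/^Mem:/ {print $3 - $6 - $7}')
echo "$APP_KB kB used by applications"
```

Watching this number (rather than raw "used") is what reveals an application leak, since kernel cache is excluded from it.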
Top
Top is nice, but you want to save the output. Use the [-b batch] and [-n number of iterations] options:
top -b -n1 > /tmp/top-output.txt
You could add this to cron if you want to capture top output every 5 minutes, for example [you would need to confirm the directory structure exists and the permissions to write to it are correct]:
crontab -e and add the following line:
*/5 * * * * top -b -n1 > /opt/zimbra/support/data/tops/top-`date +\%F-\%H-\%M`.txt
This is a nice addition to include in the Ajcody-Logging#Like_To_Have_zmstat-chart_Data_Integrated_With_Zimbra setup, where you would place the data underneath the root http directory.
Vmstat
Gives a good overview as well:
vmstat
- The process-related fields are:
- r — The number of runnable processes waiting for access to the CPU
- b — The number of processes in an uninterruptible sleep state
- w — The number of processes swapped out, but runnable
- The memory-related fields are:
- swpd — The amount of virtual memory used
- free — The amount of free memory
- buff — The amount of memory used for buffers
- cache — The amount of memory used as page cache
- The swap-related fields are:
- si — The amount of memory swapped in from disk
- so — The amount of memory swapped out to disk
- The I/O-related fields are:
- bi — Blocks received from a block device (reads)
- bo — Blocks sent to a block device (writes)
- The system-related fields are:
- in — The number of interrupts per second
- cs — The number of context switches per second
- The CPU-related fields are:
- us — The percentage of the time the CPU ran user-level code
- sy — The percentage of the time the CPU ran system-level code
- id — The percentage of the time the CPU was idle
Polling data with vmstat:
vmstat 1 10
- Displays one new line per second, but only for the next ten seconds
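The si/so columns are the ones to watch for swapping. A minimal sketch that extracts them from canned vmstat output; the column positions (si = 7th field, so = 8th) assume the classic vmstat layout listed above, and a real run would pipe vmstat itself.

```shell
# Pull the swap-in/swap-out columns from vmstat-style output.
VMSTAT_OUT=$(cat <<'EOF'
procs -----------memory---------- ---swap-- -----io---- -system- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0  12184  28160 176628 3571348    0    0     5    12  100  200  3  1 95  1
EOF
)
SWAP_IO=$(printf '%s\n' "$VMSTAT_OUT" | awk 'NR>2 {print "si="$7, "so="$8}')
echo "$SWAP_IO"
```

Sustained non-zero si/so across samples means the box is actively swapping, which on a mailbox server usually calls for more RAM or a smaller Java heap.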
Meminfo
See below for description of the output for meminfo:
Featured Article: /proc/meminfo Explained
Do the following:
cat /proc/meminfo
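A couple of the most useful fields can be pulled out of that output. A minimal sketch over a trimmed sample (values made up); a real run would read /proc/meminfo directly instead of the here-doc.

```shell
# Extract MemTotal and MemFree from /proc/meminfo-style output.
MEMINFO=$(cat <<'EOF'
MemTotal:      4040360 kB
MemFree:         28160 kB
Buffers:        176628 kB
Cached:        3571348 kB
EOF
)
SUMMARY=$(printf '%s\n' "$MEMINFO" | awk '/^MemTotal:|^MemFree:/ {print $1, $2}')
echo "$SUMMARY"
```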
Non-Zimbra Tools
Caution: None of this is supported by the Zimbra Support team. However, it might prove useful, and depending on how much I can document these tools - who knows, maybe we can start using them more directly in the support process.
Debugging Tools
- Opensourcetesting.org - General Site with News and Tools
- MAT - Memory Analyzer Tool
- MAT w/ Eclipse
- http://www.eclipse.org/mat/
- Adding this to the startup settings of zimbra
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/opt/zimbra/log/dumps/
- dumps the heap on out-of-memory; that file you can then load into MAT.
- Hat - the Haskell Tracer
- Linux
- By setting /proc/sys/kernel/core_pattern, one can run a user-level program before a process dies to gather additional debugging information.
- A pipe is opened and attached to your script. The kernel writes the coredump to the pipe. Your script can read it and write it to storage.
- Your script can attach GDB, get a backtrace, run pstack, dump open file descriptors, etc.
- For detailed explanation + script: http://timetobleed.com/an-obscure-kernel-feature-to-get-more-info-about-dying-processes/
Automate Testing / Load Testing
- Mstone - is a multi-protocol stress and performance measurement tool. Mstone can test multiple protocols (e.g. POP and SMTP) simultaneously and measures the performance of every transaction. The performance can be graphed throughout the duration of the test.
- Selenium - is a suite of tools to automate web app testing across many platforms.
- Postal - to benchmark SMTP servers.