Ajcody-Logging

The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

Logging

Actual Logging Homepage

Please see Ajcody-Logging

Server

zmdiaglog

As root:

/opt/zimbra/libexec/zmdiaglog

This will create a zip file with allot of relevant data for troubleshooting issues.

Syslog Items

tail /etc/syslog.conf
local0.*                -/var/log/zimbra.log
auth.*                  -/var/log/zimbra.log
mail.*                -/var/log/zimbra.log

Restarting syslog

/etc/init.d/syslog restart

Where's It All Go?

Any stacktrace files will have the process id of the JVM appended to the filename.

In all releases 5.0.x and 4.5.x, this is the main mailbox application log:

  • /opt/zimbra/log/mailbox.log

In ZCS 5.0.x, the JVM output log and stacktrace files will be in /opt/zimbra/log/:

  • /opt/zimbra/log/zmmailboxd.out (jvm output log)
  • /opt/zimbra/log/stacktrace.<pid>

Meaning, changes to /opt/zimbra/conf/log4j.properties.in

This file, /opt/zimbra/conf/log4j.properties.in , also controls where some events are logged to.

Example:

# Appender AUDIT writes to the file "audit.log".
log4j.appender.AUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.AUDIT.File=/opt/zimbra/log/audit.log
log4j.appender.AUDIT.DatePattern='.'yyyy-MM-dd
log4j.appender.AUDIT.Append=true
log4j.appender.AUDIT.layout=com.zimbra.common.util.ZimbraPatternLayout
log4j.appender.AUDIT.layout.ConversionPattern=%d %-5p [%t] [%z] %c{1} - %m%n

Log Rotation and Removal

Please look at the output of your crontab - AS ZIMBRA, look for the "Log pruning" line.

crontab -l

And also the config files for logrotate:

/etc/logrotate.conf

/etc/logrotate.d/zimbra

What's up with all the logs?

Logging page resources:

http://www.zimbra.com/docs/ne/latest/administration_guide/9_Monitoring.14.1.html

http://wiki.zimbra.com/index.php?title=Server_Monitoring

http://wiki.zimbra.com/index.php?title=Monitoring_Zimbra_Servers

http://wiki.zimbra.com/index.php?title=Log_Files

http://wiki.zimbra.com/index.php?title=Unresponsive_Server_Troubleshooting

The Bread And Butter Logs

  • /opt/zimbra/log/mailbox.log
    • This log is a mailboxd log4j server log containing the logs from the mailbox server. This includes the mailbox store, LMTP server, IMAP and POP servers, and Index server.
  • /opt/zimbra/log/zmmailbox.out
    • Mailstore not coming up and nothing is being logged in mailbox.log, check here for errors.
  • /var/log/zimbra.log
    • The Zimbra syslog details the activities of the Zimbra MTA (Postfix, amavisd, antispam, antivirus), Logger, Authentication (cyrus-sasl), and Directory (OpenLDAP). By default LDAP activity is logged to Zimbra.log.
  • /opt/zimbra/db/data/YOURHOSTNAME.err
    • This is the message store database error log.
  • /var/log/messages

Increase Logging

Increase Logging Per User

Depending on the ZCS version the below will clear all the per account loggers on zimbra reboot. IMAP is used for these examples.

  • addAccoutLogger
zmprov aal user@domain.com zimbra.imap debug
  • removeAccountLogger
zmprov ral user@domain.com zimbra.imap

Example usuage:

zmprov aal user@domain.com zimbra.soap debug 

And tail the log file, while attempting to login to the AJAX client.:

tail -f /opt/zimbra/log/mailbox.log

See Ajcody-Logging#Log4J_Variables for variables that can be used.

Increase Logging Per Server

If you want to do it globally edit this file:

vi /opt/zimbra/conf/log4j.properties

Add a line at the end that's similar to something like this: log4j.logger.zimbra.imap=DEBUG.

No restart of any service is needed, BUT if something happens that causes a regeneration of the log4j.properties file then your changes will be overwritten. A regeneration of this file takes the contents of /opt/zimbra/conf/log4.properties.in .

To permanently make a change, modify the /opt/zimbra/conf/log4.properties.in file, then a restart would be necessary.

$ zmcontrol stop
$ zmcontrol start

or you could run this on the server:

$ zmmtaconfig mailboxd

Wait for a minute for the server to pick up your change.

See Ajcody-Logging#Log4J_Variables for variables that can be used.

Log4J Variables

"zimbra.misc" logger. For all events that don't have a specific-category.

"zimbra.index" logger. For indexing-related events.

"zimbra.index.lucene" logger. For logging of low-level lucene operations (debug-level only)

"zimbra.searchstat" logger. For logging statistics about what kinds of searches are run

"zimbra.redolog" logger. For redolog-releated events.

"zimbra.lmtp" logger. For LMTP-related events.

"zimbra.smtp" logger. For SMTP-related events.

"zimbra.nio" logger. For NIO-related events.

"zimbra.imap" logger. For IMAP-related events.

"zimbra.pop" logger. For POP-related events.

"zimbra.mailbox" logger. For mailbox-related events.

"zimbra.calendar" logger. For calendar-related events.

"zimbra.im" logger. For instant messaging-related events.

"zimbra.account" logger. For account-related events.

"zimbra.gal" logger. For account-related events.

"zimbra.ldap" logger. For ldap-related events.

"zimbra.security" logger. For security-related events

"zimbra.soap" logger. For soap-related events

"zimbra.test" logger. For testing-related events

"zimbra.sqltrace" logger. For tracing SQL statements sent to the database

"zimbra.dbconn" logger. For tracing database connections

"zimbra.perf" logger. For logging performance statistics

"zimbra.cache" logger. For tracing object cache activity

"zimbra.filter" logger. For filter-related logs.

"zimbra.session" logger. For session- and notification-related logs.

"zimbra.backup" logger. For backup/restore-related logs.

"zimbra.system" logger. For startup/shutdown and other related logs.

"zimbra.sync" logger. For sync client interface logs.

"zimbra.synctrace" logger. For sync client interface logs.

"zimbra.syncstate" logger. For sync client interface logs.

"zimbra.wbxml" logger. For wbxml client interface logs.

"zimbra.extensions" logger. For logging extension loading related info.

"zimbra.zimlet" logger. For logging zimlet related info.

"zimbra.wiki" logger. For wiki and document sharing.

"zimbra.op" logger. Logs server operations

"zimbra.dav" logger. Logs dav operations

"zimbra.io" logger. Logs file IO operations.

"zimbra.datasource" logger. Logs data source operations.

remote management.

public static final Log rmgmt = LogFactory.getLog("zimbra.rmgmt");

"zimbra.webclient" logger. Logs ZimbraWebClient servlet and jsp operations.

"zimbra.scheduler" logger. Logs scheduled task operations.

"zimbra.store" logger. Logs filesystem storage operations.

"zimbra.fb" logger. Logs free/busy operations.

"zimbra.purge" logger. Logs mailbox purge operations.

"zimbra.mailop" logger. Logs changes to items in the mailbox.

Thread Dump To Mailbox.log

Run this command as root

for i in 1 2 3; do /opt/zimbra/libexec/zmmailboxdmgr threaddump; sleep 30; done

Hopefully the zmmailboxd.out will reflect the thread dumps.

A Better Way To Thread Dump?

Usage of the below script:

$0 FILENAME -dl [-frames #_ stack_frames] [-id REGEXP] [-all]
$0 FILENAME -dt [-frames #_ stack_frames] [-sort id|state] [-stack REGEXP] [-id REGEXP] [-state REGEXP] [-waiting NUMBER]
frames: controls the # lines of stack trace included in the output
id: only include where the id matches REGEXP
all: include ALL locks (not just ones that other threads are blocked-on)
sort: controls the order threads are printed out (locks always printed in lock ID order)
stack: only include threads where the thread's stack output matches REGEXP
state: only include threads where the thread state (e.g. RUNNABLE) matches REGEXP
waiting: only inclure threads are blocking other threads

Examples:

$0 threads.txt -dt -stack MailboxIndex.java -state RUNNABLE -f 20
-- dumps all RUNNABLE threads with MailboxIndex.java on the stack (1st 20 lines of the stack)
$0 threads.txt -dl -f 0
-- dumps the list of locks in the system that are blocking at least one thread
$0 threads.txt -dt -f 0 -sort state
-- dumps a list of all the threads in the system and tells you their run state
$0 threads.txt -dt -f 20 -w 1
-- dumps a list of all the threads in the system that are blocking other threads (quick way to hunt for contention)

Working Example:

threadDumpAnalyzer.pl zmmailboxd.out -dl -f 100 ~/Desktop
threadDumpAnalyzer.pl Script
#!/usr/bin/perl -w
# 
# ***** BEGIN LICENSE BLOCK *****
# 
# Zimbra Collaboration Suite Server
# Copyright (C) 2004, 2005, 2006, 2007 Zimbra, Inc.
# 
# The contents of this file are subject to the Yahoo! Public License
# Version 1.0 ("License"); you may not use this file except in
# compliance with the License.  You may obtain a copy of the License at
# http://www.zimbra.com/license.
# 
# Software distributed under the License is distributed on an "AS IS"
# basis, WITHOUT WARRANTY OF ANY KIND, either express or implied.
# 
# ***** END LICENSE BLOCK *****
# 
use strict;
use warnings;
use Getopt::Long;

my %locks;                      # by LockID - hash w/ 'owner','type'
my %threads;           # by threadId - hash w/ 'state' 'waitingOnLock'

my $filename = $ARGV[0];
my ($dumpLocks, $dumpThreads, $searchThreadStack, $searchThreadId, $stackFrames, $sort, $filterByState, $allLocks);
my ($waiting);

$stackFrames = 10;
$sort = "state";

GetOptions(
           "dl" => \$dumpLocks,
           "dt" => \$dumpThreads,
           "frames=s" => \$stackFrames,
           "sort=s" => \$sort,
           "id=s" => \$searchThreadId,
           "state=s" => \$filterByState,
           "stack=s" => \$searchThreadStack,
           "waiting=s" => \$waiting,
           "all" => \$allLocks,
          );

sub dumpLocks();
sub handleThread();
sub dumpThreads();
sub usage();
sub readFile($);
sub getBlockedThreads($);

if (!defined $filename) {
  usage();
}

readFile($filename);

if (defined $dumpLocks) {
  dumpLocks();
} elsif (defined $dumpThreads) {
  dumpThreads();
} else {
  usage();
}

exit(0);

sub usage() {
  my $usage = <<END_OF_USAGE;
    
USAGE:
    $0 FILENAME -dl [-frames #_ stack_frames] [-id REGEXP] [-all]
    $0 FILENAME -dt [-frames #_ stack_frames] [-sort id|state] [-stack REGEXP] [-id REGEXP] [-state REGEXP] [-waiting NUMBER]

    frames:  controls the # lines of stack trace included in the output
    id:      only include where the id matches REGEXP

    all:     include ALL locks (not just ones that other threads are blocked-on)

    sort:    controls the order threads are printed out (locks always printed in lock ID order)
    stack:   only include threads where the thread's stack output matches REGEXP
    state:   only include threads where the thread state (e.g. RUNNABLE) matches REGEXP
    waiting: only inclure threads are blocking other threads

 Examples:

    $0 threads.txt -dt -stack MailboxIndex.java -state RUNNABLE -f 20
        -- dumps all RUNNABLE threads with MailboxIndex.java on the stack (1st 20 lines of the stack)

    $0 threads.txt -dl -f 0
        -- dumps the list of locks in the system that are blocking at least one thread

    $0 threads.txt -dt -f 0 -sort state
        -- dumps a list of all the threads in the system and tells you their run state

    $0 threads.txt -dt -f 20 -w 1
        -- dumps a list of all the threads in the system that are blocking other threads (quick way to hunt for contention)
    

END_OF_USAGE
  die $usage;
}

sub mySort($$) {
  my ($a, $b) = @_;
  if ($sort eq "state") {
    my $state1 = $threads{$a}{state};
    my $state2 = $threads{$b}{state};
    if (!defined $state1) {
      return 1;
    } elsif (!defined $state2) {
      return -1;
    }
    return $state1 cmp $state2;
  } else {
    return $a cmp $b;
  }
}

sub padToWidth($$) {
  (my $str, my $width) = @_;
  if (!defined($str)) {
    $str = "";
  }
  return sprintf "%-".$width.".".$width."s", $str;
}

sub formatStackTrace($$) {
  my ($stack, $indent) = @_;
  my $ret;
  if ($stackFrames == 0) { return ""; }
  my $num = $stackFrames;
  
  foreach my $line (split /\n/, $stack) {
    if ($num <= 0) {
      return $ret;
    }
    $ret .= $indent.$line."\n";
    $num--;
  }
  return $ret;
}

sub formatLock($) {
  my $lockId = shift;
  my $output = "$lockId - ";
  my $numData = 0;
  for my $data (keys %{ $locks{$lockId}}) {
    $output .= " $data=";
    $output .= $locks{$lockId}{$data};
    $numData++;
  }
  if ($numData == 0) { $output .= " UNKNOWN"; }
  return $output;
}

sub formatThread($) {
  my $threadId = shift;
  if (!defined $threadId) { return ""; }
  my $foo = padToWidth($threadId, 50);
  if (!defined($foo) || $foo eq "") {
    $foo = "ASDF";
  }
  my $bar = $threads{$threadId}{state};
  if (!defined($bar) || $bar eq "") {
    $bar = "HJKL";
  }
  my $ret = $foo.$bar."\n";
  if (defined $threads{$threadId}{waitingOnLock}) {
    $ret .= "\tWaiting for: ".formatLock($threads{$threadId}{waitingOnLock})."\n";
  }

  my @blockedThreads = getBlockedThreads($threadId);
  for my $blockedThread (sort @blockedThreads) {
    $ret .= "\t$blockedThread is waiting on this thread\n";
  }
  
  $ret .= formatStackTrace($threads{$threadId}{stack}, "\t  ");
  if ($stackFrames > 0) {
    $ret .= "\n";
  }
  return $ret;
}

# given a lockId, return a list of the threads that are blocked on it
sub getLockWaiters($) {
  my $lockId = shift;
  my @ret;

  foreach my $threadId ( keys %threads ) {
    if (defined $threads{$threadId}{waitingOnLock}) {
      if ($threads{$threadId}{waitingOnLock} eq $lockId) {
        push @ret, $threadId;
      }
    }
  }
  return @ret;
}

sub getLockOwner($) {
  my $lockId = shift();
  if (defined $locks{$lockId}) {
    if (defined $locks{$lockId}{owner}) {
      return $locks{$lockId}{owner};
    } else {
      return "";
    }
  } else {
    return "";
  }
}

# given a threadId, get a list of all other threads that are blocked
# on locks it is holding
sub getBlockedThreads($) {
  my $threadId = shift;
  my @ret;
  foreach my $lockId ( sort keys %locks ) {
    if (getLockOwner($lockId) eq $threadId) { # a lock we own
      my @blockedThreads = getLockWaiters($lockId); 
      foreach my $blockedThread (@blockedThreads) {
        push @ret, $blockedThread;
        push @ret, getBlockedThreads($blockedThread); #recurse!
      }
    }
  }
  return @ret;
}

sub dumpLocks() {
  foreach my $lockId ( sort keys %locks ) {
    my $ret = "";
    my $numWaiters = 0;
    $ret .=  "LOCK: $lockId   ";
    for my $data (keys %{ $locks{$lockId}}) {
      $ret .=  "$data=";
      $ret .=  $locks{$lockId}{$data};
      $ret .=  ", ";
    }
    $ret .= "\n";
    foreach my $threadId ( keys %threads ) {
      if (defined $threads{$threadId}{waitingOnLock}) {
        if ($threads{$threadId}{waitingOnLock} eq $lockId) {
          $ret .= "\tThread $threadId is waiting for this lock\n";
          $numWaiters++;
        }
      }
    }
    $ret .= formatStackTrace($threads{getLockOwner($lockId)}{stack}, "\t");

    if ((!defined $searchThreadId) || ($lockId =~ /$searchThreadId/)) {
      if ($numWaiters > 0 || defined $allLocks) {
        if ($stackFrames > 0) {
          $ret .= "\n";
        }
        print $ret;
      }
    }
  }
}

sub dumpThreads() {
  foreach my $threadId ( sort { mySort($a, $b) } keys %threads ) {
    if (!defined $threadId) {
      # continue
    } elsif (defined $searchThreadStack && !($threads{$threadId}{stack} =~ /$searchThreadStack/)) {
      # continue
    } elsif (defined $searchThreadId && !($threadId =~ /$searchThreadId/)) {
      # continue
    } elsif (defined $filterByState && !($threads{$threadId}{state} =~ /$filterByState/)) {
      # continue
    } elsif (defined $waiting && (getBlockedThreads($threadId) < $waiting)) {
      # continue
    } else {
      print formatThread($threadId);
    }
  }
}

sub readFile($) {
  my $filename = shift;
  
  open IN, "<$filename" or die "couldn't open $filename";
  
  my @curThread;

  while (<>) {
    chomp;
    s/\r//g;
    if ($_ eq "") {
      if (@curThread) {
        my $threadId;
        my @locksHeld;
        my $waitingOnLock;
        my $threadState;
        my $output;
        
        my $firstLineState;
        
        # 1stline
        my $line = shift @curThread;
        $output .= $line."\n";
        if ($line =~ /"(.*)"/) {
          $threadId = $1;
          if ($line =~/nid=0x[0-9a-f]+\s([a-zA-Z\s\.()]+)/) {
            $threads{$threadId}{state} = $1;
          }

        } else {
          $threadId = $line;
        }
        if ($threadId eq "") { $threadId = "none"; }
        
        # 2nd line
        $line = shift @curThread;
        if (defined $line) {
          $output .= $line."\n";
          if ($line =~ /State: ([A-Z_]+)/) {
            $threadState = $1;
            $threads{$threadId}{state} = $1;
          }
          
          foreach $line (@curThread) {
            $output .= $line."\n";
            if ($line =~ /locked <(0x[0-9a-f]+)>\s?(.*)?/) {
              push @locksHeld, $1;
              
              $locks{$1}{owner} = $threadId;
              $locks{$1}{type} = $2;
              
            } elsif ($line =~ /- waiting to lock <(0x[0-9a-f]+)>/) {
              $waitingOnLock = $1;
              $threads{$threadId}{waitingOnLock} = $1;
            } elsif ($line =~ /- waiting to lock <(0x[0-9a-f]+)>/) {
              $waitingOnLock = $1;
              $threads{$threadId}{waitingOnLock} = $1;              
            } 
          }
        } else {
          $threads{$threadId}{state} = "unknown";
        }
        
        $threads{$threadId}{stack} = $output;
        undef @curThread;
      }
    } else {
      push @curThread, $_;
    }
  }
  
  close IN;
}

Want To See What's In Redolog Files

If you suspect there's too much redolog activity during a time window or have another need to inspect the contents of the redolog, dump it and examine it:

$ zmjava com.zimbra.cs.redolog.util.RedoLogVerify /opt/zimbra/redolog/redo.log > out.file

Pick the right redolog file, either redo.log or one of the files under archive/, based on timestamp.

See Ajcody-Notes#Redolog_Files for more details.

Internal Zimbra Charting - zmstat-chart

RFE/Bugs You Might Be Interested In

zmstat IS NOT Logger (graphs in admin console)

Please note, zmstat is different than the processes and graphs involved with Logger. Logger is behind the graphs that show up in the admin web console.

Please see Logger for details and trouble-shooting steps with Logger.

Rick has some other steps for King0770-Notes#Reinitialize_the_Logger_DB

Main References For zmstat-chart Command

Please see the following:

http://www.zimbra.com/docs/ne/latest/administration_guide/A_app-command-line.17.24.html

http://wiki.zimbra.com/index.php?title=Server_Monitoring

http://www.zimbra.com/docs/os/latest/administration_guide/9_Monitoring.12.1.html

JFreeChart is the project/binary behind the charting tool. You'll find the jar files here [ZCS 5.0.9 example]:

  • /opt/zimbra/lib/jars/jfreechart-1.0.1.jar
  • /opt/zimbra/jetty-6.1.5/common/lib/jfreechart-1.0.1.jar

Issues Being Investigated With zmstat And Other "Charting" Items

I currently have an internal thread going with the dev's about some issues with zmstat and also with attempts of getting the active sessions data from the admin web console to be seen in trends.

I'll just be dumping my notes here and then cleaning them up once I'm done with the internal conversation.

  • Question: (mailboxd: active connections by client protocol) What exactly is this tracking? I'm on a test box with no connections and it has the different threads in the 100's/1000's. Customer was hoping (as was I) that this would be giving data much like is shown in the admin console about user connections. Also, there doesn't seem to be a comparable "active connections" for the web client (soap)?
    • Answer: For the connections issue, it's the number of clients that are currently connected. There's no analog for SOAP because SOAP doesn't have persistent connections. We may be able to log stats for number of SOAP sessions, but that would include noise for cases where the user closes the browser window and the session hasn't timed out yet. If you think this is useful, please file an RFE.
  • Question: (graph time plot) If I run zmstat-chart with the --aggregate-start-at option with something like "08/07/2008 01:00:00" I notice that the graphs don't actually reference "days" at the bottom but still use what looks to be "hours". Is this a bug where the bottom reference of the graph doesn't get adjusted or is the syntax to run zmstat-chart to do this more tricky that I think.
    • Answer: zmstat-chart doesn't support multi-day ranges, it really only supports a single directory with a single day of charts, the aggregate options are for specifying ranges at which to calculate the max(col)/min(col)/avg(col) values. Our usual practice is to generate one set of charts for each day we want to look at; I think it gives us a better look at when peaks occur and how they compare day-to-day (it's much easier to look at them in an overlay fashion).
    • Answer: Seems like a reasonable request to be able to generate charts for multiple days. Please file an RFE if you think this would be useful.
  • Question: How reliable are these numbers are in admin console....
    • Answer: I imagine they should be quite reliable; but HTTP (Web) is stateless, so you can't really determine whether a session is really active or not. I believe multiple sessions appearing for a single user would be the result of reloading or closing and then reopening the browser to log in again.
      • Perhaps it is best to ask the users in question what their usage is like? Is there any concern over the high number of sessions? I don't think it should affect server performance.
  • Question: Could you please explain the following:
    • Mailboxd Mailbox Add Rate (Delivery Rate)
      • Answer: The rate at which messages are being added to the mailbox server, e.g. 1 message being delivered per second is... 1 m/s
    • Mailboxd Mailbox Add Latency (Delivery Speed)
      • Answer: How long it takes on average to add a message to a mailbox (averaged over the period of a minute)
    • Mailboxd Active Connections by Client Protocol- This one does NOT give us counts for web client. Host zcs2.mail.XXXX.xxx: Mailboxd: Active Connections by Client Protocol avg(IMAP) = 169.86 avg(IMAP SSL) = 0.00 avg(POP) = 0.21 avg(POP SSL) = 0.00
      • Answer: It doesn't give counts for web client because the web client does not have persistent connections (being HTTP-based). SOAP requests (used by the Web client) use transient connections that often close after they are used; i.e. if you took the number of "SOAP" connections and compared it to the number of "active" sessions, it'd likely be something like 0-1% of the latter figure.
    • Mailboxd Mailbox Get Count
      • Answer: The number of mailboxes that got opened in the last minute (I don't recall if this counting cache hits or not?)
    • Mailboxd Mailbox Get Latency
      • Answer: How long it takes to open a mailbox, on average (over a minute)
    • Also, we don't get stats for any of the soap parameters:
      • SOAP Invocation Count Summary (Top 10 max)
      • SOAP Average Call Duration AuthRequest
        • Answer: I don't know why this could be; perhaps zmstat-chart-config.xml needs to be regenerated, is /opt/zimbra/zmstat/soap.csv available and does it contain data?

Running zmstat-chart

Confirm it's running:

zmprov gs [mailserver hostname] | grep -i stats
  zimbraServiceEnabled:  stats
  zimbraServiceinstalled: stats

Default location of stat files is:

/opt/zimbra/zmstat/

Example command to run:

mkdir /tmp/stats
zmstat-chart -s /opt/zimbra/zmstat -d /tmp/stats/

Like To Have zmstat-chart Data Integrated With Zimbra

Ok, this is just down and dirty. You'll need to handle security issues yourself.

mkdir /opt/zimbra/zmstat-chart
chown zimbra:zimbra /opt/zimbra/zmstat-chart

Let's configure apache to show this directory.

vi /opt/zimbra/conf/httpd.conf
** Towards the end, add the following**
# Include zmstat-chart directory
Include /opt/zimbra/httpd/conf/extra/zmstat-chart.conf

Let's now make the apache conf file for zmstat-chart.conf

vi /opt/zimbra/httpd/conf/extra/zmstat-chart.conf
Alias /zmstat-chart "/opt/zimbra/zmstat-chart"
<Directory "/opt/zimbra/zmstat-chart">
  Options Indexes
  IndexOptions FancyIndexing VersionSort
  AllowOverride None
  Order allow,deny
  Allow from all
</Directory>

Restart apache so the changes take effect:

zmapachectl stop
zmapachectl start

Now, let's dump some graphing directories in there:

zmstat-chart -s /opt/zimbra/zmstat -d /opt/zimbra/zmstat-chart/`date +%F-%H-%M`

To now view the charts, adjust for your servername.

http://servername:7780/zmstat-chart/

You'll see there's the directory and when you go into it - there's your charts.

You can now place this command in your crontab to run on a schedule basis.

Things you'll want to check before deploying:

  1. Place charting directory in a location where space is available and will not create a risk to the mail services if it's full.
  2. Look at the zmstat-chart options to see what other parameters you want to use - especially if it's going in your crontab.
  3. Make sure your following and security guidelines for your environment.

Tweaking zmstat-chart

zmstat-chart also has the option to use a configuration file. The default configuration file is:

/opt/zimbra/conf/zmstat-chart.xml

Copy this to another location and edit away. Then add the -c /location/to/config/file to use your new configuration.

Default Items Charted

Here's a list of the default items that will be charted with zmstat-chart:

  • Postfix Queue Size
  • convertd CPU time used
  • convertd Resident Memory
  • convertd Processes and Threads
  • Total CPU
  • Process CPU
  • Process Total Memory
  • Process Resident Memory
  • Virtual Memory
  • Context Switches
  • Run/Blocked Process Queue Size
  • Disk Partition Throughput
  • Disk Utilization
  • Disk Throughput
  • Disk IOPs
  • Swap Activity
  • Total file descriptors open
  • SOAP Invocation Count Summary (Top 10 max)
  • SOAP Average Call Duration Summary (Top 10 avg)
  • SOAP Invocation Count GetServerRequest
  • SOAP Invocation Count BackupRequest
  • SOAP Invocation Count GetAllConfigRequest
  • SOAP Invocation Count GetDomainRequest
  • SOAP Invocation Count GetAllDomainsRequest
  • SOAP Invocation Count AuthRequest
  • SOAP Average Call Duration GetServerRequest
  • SOAP Average Call Duration BackupRequest
  • SOAP Average Call Duration GetAllConfigRequest
  • SOAP Average Call Duration GetDomainRequest
  • SOAP Average Call Duration GetAllDomainsRequest
  • SOAP Average Call Duration AuthRequest
  • MySQL Database Connections
  • MySQL InnoDB Buffer Pool Pages
  • MySQL InnoDB Buffer Pool Hit Rate
  • MySQL Tables Open/Opened
  • MySQL Total Slow Queries Count
  • Mailboxd Connection Pool Get Latency
  • Mailboxd Dirty Lucene Index Writers
  • Mailboxd Lucene IndexWriterCache Hitrate
  • Mailboxd Lucene IO
  • Mailboxd LMTP Delivery Throughput
  • Mailboxd LMTP Delivery Rate
  • Mailboxd Mailbox Add Rate (Delivery Rate)
  • Mailboxd Mailbox Add Latency (Delivery Speed)
  • Mailboxd Request Rate by Client Protocol
  • Mailboxd Response Time by Client Protocol
  • Mailboxd Active Connections by Client Protocol
  • Mailboxd Mailbox Get Count
  • Mailboxd Mailbox Get Latency
  • Mailboxd Mailbox Cache Hit Rate
  • Mailboxd Mailbox Item/Blob Cache Hit Rate
  • Mailboxd Garbage Collection Time
  • Mailboxd Garbage Collection Count
  • Mailboxd JVM Heap Used
  • Mailboxd JVM Heap Free
  • Mailboxd JVM Permanent Generation and Code Cache

Getting All User Quota Data (not zmstat related really)

To get quota stats, run. The output is a little bit difficult to read since it's in bytes:

zmprov gqu [server-name]

Here's a script to put into user readable format below [by Tony P.]. You'll need to modify the $hostname variable near the top:

#!/usr/bin/perl

$hostname = localhost;

@quotas = `zmprov gqu $hostname`;

foreach $quotaline(@quotas)

{

chop($quotaline);

($email, $quota, $usage) = split(/\s/,$quotaline);

$quota=utils_convert_bytes_to_optimal_unit($quota);

$usage=utils_convert_bytes_to_optimal_unit($usage);

print "$email - $usage\n";

}

sub utils_convert_bytes_to_optimal_unit{

my($bytes) = @_;

return '' if ($bytes eq '');

my($size);

$size = $bytes . ' Bytes' if ($bytes < 1024);

$size = sprintf("%.2f", ($bytes/1024)) . ' KB' if ($bytes >= 1024 && $bytes < 1048576);

$size = sprintf("%.2f", ($bytes/1048576)) . ' MB' if ($bytes >= 1048576 && $bytes < 1073741824);

$size = sprintf("%.2f", ($bytes/1073741824)) . ' GB' if ($bytes >= 1073741824 && $bytes < 1099511627776);

$size = sprintf("%.2f", ($bytes/1099511627776)) . ' TB' if ($bytes >= 1099511627776);

return $size;

}

Third Party Tools (Monitoring & Logging)

FYI - Support For Real Time Counters (snmp,etc.)

This is for those that need to go beyond what zmstat-chart is doing or need to intergrate within a third party monitoring systems.

Please see RFE:

Currently, this RFE has only resolved support for JMX, not SNMP. Customers who need SNMP can use a product like jManage to do the translation from JMX to SNMP.

Some Choices

Charting & Graphing The Data
Montoring Software

Nagios On Zimbra

This is a really rough draft for ideas I have in background. Shouldn't be used by anyone.

Configure Nagios to run on single server Zimbra box - Centos 5.x

Configure yum with repo and install nagios

vi /etc/yum.repos.d/Dag.repo
yum update
yum install nagios nagios-plugins nagios-devel nagios-plugins-nrpe

Move nagios.conf http file into main zimbra directory.

cp /etc/httpd/conf.d/nagios.conf /opt/zimbra/httpd/conf/extra/

Setup nagios to run as zimbra

vi /etc/nagios/nagios.cfg
**Change nagios user to zimbra**
nagios_user=zimbra
nagios_group=zimbra

Change ownership of directories from nagios to zimbra.

chown -R 500:500 /var/log/nagios/
chown -R 500:500 /etc/nagios/
chown -R 500:500 /usr/share/nagios/

Configure authentication within Nagios

vi /etc/nagios/nagios.cfg
# AUTHENTICATION USAGE
use_authentication=1
# SYSTEM/PROCESS INFORMATION ACCESS
authorized_for_system_information=nagiosadmin
# CONFIGURATION INFORMATION ACCESS
authorized_for_configuration_information=nagiosadmin
# SYSTEM/PROCESS COMMAND ACCESS
authorized_for_system_commands=nagiosadmin
# GLOBAL HOST/SERVICE VIEW ACCESS
authorized_for_all_services=nagiosadmin
authorized_for_all_hosts=nagiosadmin
# GLOBAL HOST/SERVICE COMMAND ACCESS
authorized_for_all_service_commands=nagiosadmin
authorized_for_all_host_commands=nagiosadmin

Set up httpasswd's for the accounts for Nagios

htpasswd -c /etc/nagios/htpasswd.users nagiosadmin
htpasswd /etc/nagios/htpasswd.users guest

Configure Zimbra's http/apache to use nagios http config file

vi /opt/zimbra/conf/httpd.conf
**Add the following towards bottom**
# Include Nagios
Include /opt/zimbra/httpd/conf/extra/nagios.conf

Starting nagios is done as root

/etc/init.d/nagios start

Restarting apache for nagios issues would be done with (as zimbra)

zmapachectl stop
zmapachectl start

The webpage address to view Nagios will be like this:

http://IP_OF_SERVER:7780/nagios/

Use the rest of this how-to to configure it now: http://wiki.centos.org/HowTos/Nagios

MRTG - SNMP On Zimbra

This is a really rough draft for ideas I have in background. Shouldn't be used by anyone.

Configure yum with repo and install mrtg, net-snmp, net-snmp-utils

vi /etc/yum.repos.d/Dag.repo
yum update
yum instal mrtg net-snmp net-snmp-utils

Follow some how-to on setting up the basics.

Create a http config:

vi /opt/zimbra/httpd/conf/extra/mrtg.conf
Alias /mrtg "/opt/zimbra/mrtg"
<Directory "/opt/zimbra/mrtg">
#  SSLRequireSSL
  Options None
  AllowOverride None
  Order allow,deny
  Allow from all
#  Order deny,allow
#  Deny from all
#  Allow from 127.0.0.1
</Directory>

Add mrtg to http configuration within zimbra:

vi /opt/zimbra/conf/httpd.conf
# Include Mrtg
Include /opt/zimbra/httpd/conf/extra/mrtg.conf

Restart apache:

zmapachectl stop
zmapachectl start

Create directory to hold mrtg data:

mkdir /opt/zimbra/mrtg

Address will be something like:

http://IP_OF_SERVER:7780/mrtg/index.html

Mailq Pointing To Right Binary

# ls -la /usr/bin/mailq
 lrwxrwxrwx 1 root root 27 Sep  3 17:00 /usr/bin/mailq -> /etc/alternatives/mta-mailq
# ls -la /etc/alternatives/mta-mailq
 lrwxrwxrwx 1 root root 23 Apr  1 10:17 /etc/alternatives/mta-mailq -> /usr/bin/mailq.sendmail
# rm /usr/bin/mailq
# ln -s /opt/zimbra/postfix/sbin/mailq /usr/bin/mailq
# mailq
 Mail queue is empty

Web Client Logging

Debug (SOAP) via Browser

See http://wiki.zimbra.com/index.php?title=Web_Client_URL_Tricks&redirect=no

Admins To View Client Issues

Within the admin console, you can view users mail.

  • Goto accounts and highlight the user having the problem.
  • Click on the View Mail button above that frame.
  • Then goto the url field of that new window and modify it to look like this [replace mailserver with yours]:
  • Hit your return key to cause the browser to reload.
    • If you get warning about pop-up, accept it.
    • If the debug window doesn't show, just mouse in the url field and hit the return key again. It should now pop up.
Jump to: navigation, search