Zmstats: Difference between revisions

{{Article Infobox|{{admin}}||{{ZCS 5.0}}|}}= Zmstats =
{{BC|Community Sandbox}}
__FORCETOC__
<div class="col-md-12 ibox-content">
=Zmstats=
{{KB|{{Unsupported}}|{{ZCS 8.6}}|{{ZCS 5.0}}|}}
{{WIP}}
= Zmstats =
Zmstats is how Zimbra exposes its performance metrics and statistics. The data covers disk usage, CPU utilization, Java (JVM) statistics, Zimbra counters, and more.


Zmstats consists of the following components and scripts:

* zmstatctl (all zmstat collection scripts are located in /opt/zimbra/libexec)
** zmstat-allprocs (since ZCS 6.0)
** zmstat-convertd
** zmstat-cpu
** zmstat-df (since ZCS 6.0)
** zmstat-fd
** zmstat-io
** zmstat-ldap (since ZCS 8.0)
** zmstat-mtaqueue
** zmstat-mysql
** zmstat-nginx (since ZCS 6.0)
** zmstat-proc
** zmstat-vm
* zmstat-chart
** zmstat-chart-config
== Running zmstats ==
* On the server where the stats were produced, make sure that zmstat-chart.xml is available. Running [[zmdiaglog]] produces the zmstats data and the zmstat-chart-config output automatically.
<pre>
# su - zimbra
$ zmstat-chart-config > /tmp/zmstat-chart-`zmhostname`.xml
</pre>
* Make a charts directory:
<pre>
$ mkdir ~/zmstat/2010-06-01/charts
</pre>
* Then produce the stats:
<pre>
$ zmstat-chart -c /tmp/zmstat-chart-`zmhostname`.xml -s ~/zmstat/2010-06-01 -d ~/zmstat/2010-06-01/charts
</pre>
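
The steps above can be wrapped in a small script. The Python sketch below only assembles the zmstat-chart argument list from the -c, -s and -d options shown above. The function name build_chart_argv and the example paths are assumptions for illustration, and nothing is executed against a Zimbra install.

```python
from pathlib import PurePosixPath


def build_chart_argv(stats_dir, conf_path):
    """Return the zmstat-chart argv for one day's stats directory.

    Uses only the -c, -s and -d options shown in the steps above; the
    charts go to a "charts" subdirectory of the stats directory.
    """
    stats = PurePosixPath(stats_dir)
    return [
        "zmstat-chart",
        "-c", str(conf_path),
        "-s", str(stats),
        "-d", str(stats / "charts"),
    ]


if __name__ == "__main__":
    # Hypothetical paths; adjust to the host being charted.
    argv = build_chart_argv("/opt/zimbra/zmstat/2010-06-01",
                            "/tmp/zmstat-chart.xml")
    print(" ".join(argv))
```

Returning a list rather than a shell string keeps the result usable with subprocess.run without extra quoting.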


== zmstatctl ==

See /opt/zimbra/conf/zmstat-chart.xml for more examples

= Individual csv files and their related counters =
 
=== allprocs.csv (since ZCS 6.0) ===

Written by zmstat-allprocs<br>
Interval: LC.zmstat_interval<br>
Reference: proc(5) man page. /proc/[pid]/stat, /proc/[pid]/io
 
{|
| timestamp || Time when sample was collected
|-
| process || process name
|-
| utime || total user time
|-
| stime || total system time
|-
| cputime || user + system time total
|-
| rchar || bytes read (*1)
|-
| wchar || bytes written (*1)
|-
| read_bytes || bytes read from disk (*1)
|-
| write_bytes || bytes written to disk (*1)
|-
| rss || resident-set-size memory usage (kiloBytes)
|-
| processes || number of processes
|-
| threads || number of threads
|-
|}
 
(*1) Only if IO Accounting is enabled in the Linux kernel
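
As the table notes, cputime is the sum of utime and stime. The Python sketch below recomputes that sum per process from CSV text. It assumes the file starts with a header row that uses the column names from the table; the sample rows, including the timestamp format, are made up.

```python
import csv
import io

# Made-up sample; a real allprocs.csv is assumed to begin with a header
# row that uses the column names listed in the table above.
SAMPLE = """timestamp, process, utime, stime, cputime
06/01/2010 00:00:30, java, 120, 30, 150
06/01/2010 00:00:30, mysqld, 40, 10, 50
"""


def cputime_by_process(csv_text):
    """Map process name -> utime + stime (the last row per process wins)."""
    rows = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return {r["process"]: float(r["utime"]) + float(r["stime"]) for r in rows}


if __name__ == "__main__":
    for name, total in cputime_by_process(SAMPLE).items():
        print(name, total)
```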
 
=== convertd.csv ===
 
Collects CPU, I/O, and memory statistics for the convertd process (Network Edition only).
 
Written by zmstat-convertd<br>
Interval: LC.zmstat_interval<br>
Reference: proc(5) man page. /proc/[pid]/stat, /proc/[pid]/io
 
{|
| timestamp || Time when sample was collected
|-
| utime || user time for convertd
|-
| stime || system time for convertd
|-
| cputime || user + system time total
|-
| rchar || bytes read (*1)
|-
| wchar || bytes written (*1)
|-
| read_bytes || bytes read from disk (*1)
|-
| write_bytes || bytes written to disk (*1)
|-
| rss || resident-set-size memory usage (kiloBytes)
|-
| processes || number of processes
|-
| threads || number of threads
|-
|}
 
(*1) Only if IO Accounting is enabled in the Linux kernel
 
=== cpu.csv ===
 
Written by zmstat-cpu<br>
Interval: LC.zmstat_interval<br>
Reference: proc(5) man page. /proc/stat
 
{|
| timestamp || Time when sample was collected
|-
| cpu:user || total user time
|-
| cpu:nice || total nice process time
|-
| cpu:sys || total system time
|-
| cpu:idle || total idle time
|-
| cpu:iowait || total time in iowait
|-
| cpu:irq || total time in irq
|-
| cpu:softirq || total time in softirq
|-
| cpu-N:XXX || same as above, but per individual core/cpu
|-
|}
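
A common use of these counters is to express how busy the CPU was in one sample. The sketch below treats everything except idle and iowait as busy time. It assumes a row is available as a mapping from column name to value, and that per-core columns follow the prefix:field naming shown in the table; the function name and the prefix parameter are illustrative.

```python
CPU_FIELDS = ("user", "nice", "sys", "idle", "iowait", "irq", "softirq")


def busy_fraction(row, prefix="cpu"):
    """Share of accounted CPU time that is neither idle nor iowait.

    `row` maps column names such as "cpu:user" to numbers; pass another
    prefix to read the per-core columns instead.
    """
    values = {f: float(row[f"{prefix}:{f}"]) for f in CPU_FIELDS}
    total = sum(values.values())
    if total == 0:
        return 0.0
    return 1.0 - (values["idle"] + values["iowait"]) / total
```

Because the result is a ratio within one sample, the same function applies whether the columns hold percentages or raw time counters.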
 
=== df.csv (since ZCS 6.0) ===
 
Captures disk usage
 
Written by zmstat-df<br>
Interval: LC.zmstat_disk_interval<br>
Reference: df(1) man page
 
{|
| timestamp || Time when sample was collected
|-
| path || mount point
|-
| disk || device
|-
| disk_use || space used (kiloBytes)
|-
| disk_space || total space (kiloBytes)
|-
| disk_pct_used || percentage used
|-
|}
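
The disk_pct_used column can be cross-checked from disk_use and disk_space, which is also a simple way to flag volumes that are filling up. The sketch below assumes rows are available as dictionaries keyed by the column names above; the 90 percent threshold is an arbitrary example value.

```python
def pct_used(disk_use_kb, disk_space_kb):
    """Percentage of the volume in use, from the two kilobyte columns."""
    return 100.0 * disk_use_kb / disk_space_kb


def nearly_full(rows, threshold_pct=90.0):
    """Return the mount points whose computed usage meets the threshold."""
    return [
        r["path"]
        for r in rows
        if pct_used(float(r["disk_use"]), float(r["disk_space"])) >= threshold_pct
    ]
```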


=== fd.csv ===

Captures file descriptor usage on the system

Written by zmstat-fd<br>
Interval: LC.zmstat_interval<br>
Reference: proc(5) man page. /proc/sys/fs/file-nr, /proc/[pid]/fd/
 
{|
| timestamp || Time when sample was collected
|-
| fd_count || current number of open file descriptors
|-
| mailboxd_fd_count || current number of open file descriptors by mailboxd
|-
|}
 
=== imap.csv ===
 
Written by mailboxd<br>
Interval: 1 minute


{|
| timestamp || Time when sample was collected
|-
| command || executed command
|-
| exec_count || number of executions
|-
| exec_ms_avg || average execution time
|-
|}
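
Because each row reports a per-command average, an overall average has to be weighted by exec_count rather than taken as a plain mean of exec_ms_avg. The sketch below shows that calculation. It assumes rows are dictionaries keyed by the column names above and that exec_ms_avg is in milliseconds, as the column name suggests.

```python
def overall_avg_ms(rows):
    """Average execution time across commands, weighted by exec_count."""
    total_count = sum(int(r["exec_count"]) for r in rows)
    if total_count == 0:
        return 0.0
    total_ms = sum(int(r["exec_count"]) * float(r["exec_ms_avg"]) for r in rows)
    return total_ms / total_count
```

The same function applies to any of the csv files that share the command, exec_count, exec_ms_avg layout.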


=== io.csv and io-x.csv ===

Written by zmstat-io<br>
Interval: LC.zmstat_interval<br>
Reference: iostat(1) man page

{|
| timestamp || Time when sample was collected
|-
| dev:tps || transactions per second
|-
| dev:kB_read/s || read rate (kilobytes per second)
|-
| dev:kB_wrtn/s || write rate (kilobytes per second)
|-
| dev:kB_read || kilobytes read
|-
| dev:kB_wrtn || kilobytes written
|-
| dev:rrqm/s || read requests merged per second queued to device
|-
| dev:wrqm/s || write requests merged per second queued to device
|-
| dev:r/s || reads per second
|-
| dev:w/s || writes per second
|-
| dev:rkB/s || read rate (kilobytes per second)
|-
| dev:wkB/s || write rate (kilobytes per second)
|-
| dev:avgrq-sz || average size (sectors) of requests
|-
| dev:avgqu-sz || average queue length
|-
| dev:await || average wait time for requests to be served
|-
| dev:svctm || average time to service requests
|-
| dev:%util || percentage of CPU time / bandwidth utilization of device
|-
|}
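
The column names in the table combine a device and a metric. The sketch below regroups one row by device. It assumes that "dev" in the table stands for an actual device name (for example sda) and that a row is available as a mapping from column name to value; the function name is illustrative.

```python
from collections import defaultdict


def group_by_device(row):
    """Split "device:metric" column names into {device: {metric: value}}."""
    devices = defaultdict(dict)
    for column, value in row.items():
        if column == "timestamp" or ":" not in column:
            continue  # skip anything that is not a device metric
        device, metric = column.split(":", 1)
        devices[device][metric] = float(value)
    return dict(devices)
```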


=== ldap.csv (since ZCS 8.0) ===

<u>LDAP server</u>

Written by zmstat-ldap<br>
Interval: LC.zmstat_interval<br>
Reference: OpenLDAP Software 2.4 Administrator's Guide: Monitoring http://www.openldap.org/doc/admin24/monitoringslapd.html

{|
| timestamp || Time when sample was collected
|-
| abandon_ops || number of completed Abandon operations
|-
| add_ops || number of completed Add operations
|-
| bind_ops || number of completed Bind operations
|-
| bytes_sent || bytes sent by the server
|-
| compare_ops || number of completed Compare operations
|-
| completed_ops || See [http://bugzilla.zimbra.com/show_bug.cgi?id=100731 Bug 100731]
|-
| connections || number of connections
|-
| delete_ops || number of completed Delete operations
|-
| entries_sent || entries sent by the server
|-
| extended_ops || number of completed Extended operations
|-
| initiated_ops || See [http://bugzilla.zimbra.com/show_bug.cgi?id=100731 Bug 100731]
|-
| modify_ops || number of completed Modify operations
|-
| modrdn_ops || number of completed Modrdn operations
|-
| read_waiters || number of current read waiters
|-
| referrals_sent || referrals sent by the server
|-
| search_ops || number of completed Search operations
|-
| unbind_ops || number of completed Unbind operations
|-
| write_waiters || number of current write waiters
|-
|}


<u>Mailbox server</u>

Written by mailboxd<br>
Interval: 1 minute

{|
| timestamp || Time when sample was collected
|-
| command || executed command
|-
| exec_count || number of executions
|-
| exec_ms_avg || average execution time
|-
|}


Known bug: [http://bugzilla.zimbra.com/show_bug.cgi?id=99936 Bug 99936] - mailboxd overwrites ldap.csv generated by zmstat-ldap

=== mailboxd.csv ===

Written by mailboxd<br>
Interval: 1 minute

{|
| account_cache_hit_rate || LDAP account cache hit rate
|-
| account_cache_size || LDAP account cache size
|-
| acl_cache_hit_rate || LDAP ACL cache hit rate
|-
| bis_read || Number of times that the file descriptor cache read message data from disk
|-
| bis_seek_rate || Percentage of file descriptor cache disk reads that required a seek
|-
| calcache_hit || Hit rate of calendar summary cache, counting cache hit from both memory and file
|-
| calcache_lru_size || Number of calendars (folders) in the calendar summary cache LRU in Java heap
|-
| calcache_mem_hit || Hit rate of calendar summary cache, counting cache hit from memory only
|-
| cos_cache_hit_rate || LDAP COS cache hit rate
|-
| cos_cache_size || LDAP COS cache size
|-
| db_conn_count || Number of times that the server got a database connection from the pool
|-
| db_conn_ms_avg || Average latency (ms) of getting a database connection from the pool
|-
| db_pool_size || Number of database connections in use
|-
| domain_cache_hit_rate || LDAP domain cache hit rate
|-
| domain_cache_size || LDAP domain cache size
|-
| ews_syncstate_cache_hit_rate || EWS Syncstate Cache Hit Rate
|-
| ews_syncstate_cache_size || EWS Syncstate Cache Size
|-
| fd_cache_hit_rate || File descriptor cache hit rate
|-
| fd_cache_size || Number of open file descriptors that reference message content
|-
| gc_concurrentmarksweep_count || Number of times that concurrentmarksweep GC was invoked
|-
| gc_concurrentmarksweep_ms || Time (ms) spent on concurrentmarksweep GC
|-
| gc_major_count || Number of times that major GC was invoked
|-
| gc_major_ms || Time (ms) spent on major GC
|-
| gc_minor_count || Number of times that minor GC was invoked
|-
| gc_minor_ms || Time (ms) spent on minor GC
|-
| gc_parnew_count || Number of times that parnew GC was invoked
|-
| gc_parnew_ms || Time (ms) spent on parnew GC
|-
| group_cache_hit_rate || LDAP group cache hit rate
|-
| group_cache_size || LDAP group cache size
|-
| heap_free || Number of bytes free in the entire JVM heap
|-
| heap_used || Number of bytes used in the entire JVM heap
|-
| http_idle_threads || Number of HTTP idle threads
|-
| http_threads || Number of HTTP threads
|-
| idx_bytes_read || Accumulated bytes read by Lucene
|-
| idx_bytes_read_avg || Average of idx_bytes_read
|-
| idx_bytes_written || Accumulated bytes written by Lucene
|-
| idx_bytes_written_avg || Average of idx_bytes_written
|-
| idx_wrt_avg || Average number of concurrent index writers
|-
| idx_wrt_opened || Accumulated number of index writers opened
|-
| idx_wrt_opened_cache_hit || Accumulated number of cache hits when opening an index writer
|-
| imap_conn || Number of cleartext IMAP connections
|-
| imap_count || Number of IMAP requests received
|-
| imap_ms_avg || Average processing time (ms) of IMAP requests
|-
| imap_ssl_conn || Number of SSL IMAP connections
|-
| imap_ssl_threads || Number of SSL IMAP threads
|-
| imap_threads || Number of IMAP threads
|-
| innodb_bp_hit_rate || InnoDB buffer pool hit rate
|-
| ldap_dc_count || Number of times that the server got an LDAP directory context
|-
| ldap_dc_ms_avg || Average latency (ms) of getting an LDAP directory context
|-
| lmtp_conn || Number of LMTP connections
|-
| lmtp_dlvd_bytes || Number of bytes of data delivered to mailboxes as a result of LMTP delivery
|-
| lmtp_dlvd_msgs || Number of messages delivered to mailboxes as a result of LMTP delivery
|-
| lmtp_rcvd_bytes || Number of bytes received over LMTP
|-
| lmtp_rcvd_msgs || Number of messages received over LMTP
|-
| lmtp_rcvd_rcpt || Number of LMTP recipients
|-
| lmtp_threads || Number of LMTP threads
|-
| mbox_add_msg_count || Number of messages that were added to a mailbox
|-
| mbox_add_msg_ms_avg || Average latency (ms) of adding a message to a mailbox
|-
| mbox_cache || Mailbox cache hit rate
|-
| mbox_cache_size || Number of mailboxes cached in memory
|-
| mbox_get_count || Number of times that the server got a mailbox from the cache
|-
| mbox_get_ms_avg || Average latency (ms) of getting a mailbox from the cache
|-
| mbox_item_cache || Item cache hit rate
|-
| mbox_msg_cache || Message cache hit rate
|-
| mobile_ping_cache_hit_rate || Mobile ping cache hit rate
|-
| mobile_ping_cache_size || Mobile ping cache size
|-
| mobile_syncstate_cache_hit_rate || Mobile sync state cache hit rate
|-
| mobile_syncstate_cache_size || Mobile sync state cache size
|-
| mpool_cms_old_gen_free || Number of bytes free in the CMS old generation memory pool
|-
| mpool_cms_old_gen_used || Number of bytes used in the CMS old generation memory pool
|-
| mpool_code_cache_free || Number of bytes free in the code cache memory pool
|-
| mpool_code_cache_used || Number of bytes used in the code cache memory pool
|-
| mpool_compressed_class_space_free || Number of bytes free in the compressed class space memory pool
|-
| mpool_compressed_class_space_used || Number of bytes used in the compressed class space memory pool
|-
| mpool_metaspace_free || Number of bytes free in the metaspace memory pool
|-
| mpool_metaspace_used || Number of bytes used in the metaspace memory pool
|-
| mpool_par_eden_space_free || Number of bytes free in the eden space memory pool
|-
| mpool_par_eden_space_used || Number of bytes used in the eden space memory pool
|-
| mpool_par_survivor_space_free || Number of bytes free in the survivor space memory pool
|-
| mpool_par_survivor_space_used || Number of bytes used in the survivor space memory pool
|-
| msg_cache_size || Number of message structures cached in memory
|-
| pop_conn || Number of cleartext POP3 connections
|-
| pop_count || Number of POP3 requests received
|-
| pop_ms_avg || Average processing time (ms) of POP3 requests
|-
| pop_ssl_conn || Number of SSL POP3 connections
|-
| pop_ssl_threads || Number of SSL POP3 threads
|-
| pop_threads ||  Number of POP3 threads
|-
| server_cache_hit_rate || LDAP server cache hit rate
|-
| server_cache_size || LDAP server cache size
|-
| soap_count || Number of SOAP requests received
|-
| soap_ms_avg || Average processing time (ms) of SOAP requests
|-
| soap_sessions || Number of SOAP sessions
|-
| timestamp || Time when sample was collected
|-
| ucservice_cache_hit_rate || UC service cache hit rate
|-
| ucservice_cache_size || UC service cache size
|-
| xmpp_cache_hit_rate || LDAP XMPP cache hit rate
|-
| xmpp_cache_size || LDAP XMPP cache size
|-
| zimlet_cache_hit_rate || LDAP zimlet cache hit rate
|-
| zimlet_cache_size || LDAP zimlet cache size
|-
|}
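
Several of these counters are easier to read as ratios. As one example, the sketch below derives JVM heap utilization from heap_used and heap_free. It assumes the two columns together account for the whole heap at sample time and that a row is available as a mapping from column name to value; the function name is illustrative.

```python
def heap_used_pct(row):
    """Percentage of the JVM heap in use for one mailboxd.csv sample."""
    used = float(row["heap_used"])
    free = float(row["heap_free"])
    total = used + free
    if total == 0:
        return 0.0
    return 100.0 * used / total
```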


=== mtaqueue.csv ===

Written by zmstat-mtaqueue<br>
Interval: LC.zmstat_interval<br>
Reference: postqueue(1) man page http://www.postfix.org/postqueue.1.html

{|
| timestamp || Time when sample was collected
|-
| KBytes || kilobytes queued by the mta
|-
| requests || number of items queued by the mta
|-
|}


=== mysql.csv ===

Columns for mysql.csv are derived from the output of the query "SHOW GLOBAL STATUS". Refer to the MySQL or MariaDB reference manual for the meaning of each counter.

Written by zmstat-mysql<br>
Interval: LC.zmstat_interval<br>
Reference:
: Server Status Variables - MariaDB Knowledge Base https://mariadb.com/kb/en/mariadb/server-status-variables/
: MySQL ::  MySQL 5.0 Reference Manual :: 5.1.6 Server Status Variables https://dev.mysql.com/doc/refman/5.0/en/server-status-variables.html
=== nginx.csv (since ZCS 6.0) ===
Written by zmstat-nginx<br>
Interval: LC.zmstat_interval<br>
Reference: proc(5) man page. /proc/[pid]/stat, /proc/[pid]/io
{|
| timestamp || Time when sample was collected
|-
| utime || user time for nginx
|-
| stime || system time for nginx
|-
| cputime || user + system time total
|-
| rchar || bytes read (*1)
|-
| wchar || bytes written (*1)
|-
| read_bytes || bytes read from disk (*1)
|-
| write_bytes || bytes written to disk (*1)
|-
| rss || resident-set-size memory usage (kiloBytes)
|-
| processes || number of processes
|-
| threads || number of threads
|-
|}
(*1) Only if IO Accounting is enabled in the Linux kernel
=== pop3.csv ===
Written by mailboxd<br>
Interval: 1 minute
{|
| timestamp || Time when sample was collected
|-
| command || executed command
|-
| exec_count || number of executions
|-
| exec_ms_avg || average execution time
|-
|}
=== proc.csv ===
Written by zmstat-proc<br>
Interval: LC.zmstat_interval<br>
Reference: proc(5) man page. /proc/stat, /proc/[pid]/stat, /proc/[pid]/statm
{|
| timestamp || Time when sample was collected
|-
| system || label
|-
| user || user time (percent)
|-
| sys || system time (percent)
|-
| idle || idle time (percent)
|-
| iowait || iowait time (percent)
|-
| PROC || PROC name
|-
| PROC-total-cpu || user + system time total for PROC
|-
| PROC-utime || user time for PROC
|-
| PROC-stime || system time for PROC
|-
| PROC-totalMB || total memory footprint for PROC
|-
| PROC-rssMB || resident-set-size of PROC
|-
| PROC-sharedMB || shared memory of PROC
|-
| PROC-process-count || number of threads/subprocesses
|-
|}
=== soap.csv ===
Written by mailboxd<br>
Interval: 1 minute
{|
| timestamp || Time when sample was collected
|-
| command || executed command
|-
| exec_count || number of executions
|-
| exec_ms_avg || average execution time
|-
|}
=== sql.csv (since ZCS 8.5) ===
Written by mailboxd<br>
Interval: 1 minute
{|
| timestamp || Time when sample was collected
|-
| command || executed command
|-
| exec_count || number of executions
|-
| exec_ms_avg || average execution time
|-
|}
=== sync.csv (since ZCS 8.0) ===
Written by mailboxd<br>
Interval: 1 minute
{|
| timestamp || Time when sample was collected
|-
| command || executed command
|-
| exec_count || number of executions
|-
| exec_ms_avg || average execution time
|-
|}
=== threads.csv ===
Written by mailboxd<br>
Interval: 1 minute<br>
Relative configuration: zimbraStatThreadNamePrefix (since ZCS 6.0)
{|
| timestamp || Time when sample was collected
|-
| THREAD || number of threads
|-
| total || total number of threads
|-
|}
=== vm.csv ===
Records the output of vmstat, together with values from /proc/meminfo and /proc/loadavg, so that it can be reviewed later.
Written by zmstat-vm<br>
Interval: LC.zmstat_interval<br>
Reference:
: vmstat(8) man page
: proc(5) man page. /proc/meminfo, /proc/loadavg
Columns taken from vmstat:
* r
* b
* swpd
* free
* buff
* cache
* si
* so
* bi
* bo
* in
* cs
* us
* sy
* id
* wa
* st
* other counters reported by vmstat
Columns taken from /proc/meminfo and /proc/loadavg:
* MemTotal
* MemFree
* Buffers
* Cached
* SwapCached
* Active
* Inactive
* Active(anon)
* Inactive(anon)
* Active(file)
* Inactive(file)
* Unevictable
* Mlocked
* SwapTotal
* SwapFree
* Dirty
* Writeback
* AnonPages
* Mapped
* Shmem
* Slab
* SReclaimable
* SUnreclaim
* KernelStack
* PageTables
* NFS_Unstable
* Bounce
* WritebackTmp
* CommitLimit
* Committed_AS
* VmallocTotal
* VmallocUsed
* VmallocChunk
* HardwareCorrupted
* AnonHugePages
* HugePages_Total
* HugePages_Free
* HugePages_Rsvd
* HugePages_Surp
* Hugepagesize
* DirectMap4k
* DirectMap2M
* DirectMap1G
* Loadavg
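
The /proc/meminfo columns can be combined into a rough memory-pressure figure. The sketch below treats MemFree, Buffers and Cached as available memory. It assumes the csv columns carry the same names and units as the /proc/meminfo fields listed above and that a row is available as a mapping from column name to value.

```python
def mem_used_pct(row):
    """Percent of MemTotal not accounted for by MemFree, Buffers or Cached."""
    total = float(row["MemTotal"])
    available = (
        float(row["MemFree"]) + float(row["Buffers"]) + float(row["Cached"])
    )
    return 100.0 * (total - available) / total
```

This is a simplification; it ignores the other fields in the list, such as Shmem and SReclaimable.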


Related: [[Server Monitoring]]

[[Category:Monitoring]]
[[Category:Command Line Interface]]
[[Category:ZCS 8.6]]
[[Category:ZCS 8.5]]
[[Category:ZCS 8.0]]
[[Category:ZCS 7.0]]
[[Category:ZCS 6.0]]
[[Category:ZCS 5.0]]

Latest revision as of 06:37, 14 August 2015

Zmstats

   KB 2807        Last updated on 2015-08-14  





Zmstats

Zmstats is how Zimbra exposes its performance metrics and statistics. The data covers disk usage, CPU utilization, Java (JVM) statistics, Zimbra counters, and more.



Zmstats consists of the following components and scripts:

  • zmstatctl (all zmstat collection scripts are located in /opt/zimbra/libexec)
    • zmstat-allprocs (since ZCS 6.0)
    • zmstat-convertd
    • zmstat-cpu
    • zmstat-df (since ZCS 6.0)
    • zmstat-fd
    • zmstat-io
    • zmstat-ldap (since ZCS 8.0)
    • zmstat-mtaqueue
    • zmstat-mysql
    • zmstat-nginx (since ZCS 6.0)
    • zmstat-proc
    • zmstat-vm
  • zmstat-chart
    • zmstat-chart-config

Running zmstats

  • On the server where the stats were produced, make sure that zmstat-chart.xml is available. Running zmdiaglog produces the zmstats data and the zmstat-chart-config output automatically.
# su - zimbra
$ zmstat-chart-config > /tmp/zmstat-chart-`zmhostname`.xml
  • Make a charts directory:
$ mkdir ~/zmstat/2010-06-01/charts
  • Then produce the stats:
$ zmstat-chart -c /tmp/zmstat-chart-`zmhostname`.xml -s ~/zmstat/2010-06-01 -d ~/zmstat/2010-06-01/charts

zmstatctl

zmstatctl is used to start and stop the various zmstat-* data logging scripts.

zmstat-chart

zmstat-chart reads an XML configuration file generated by zmstat-chart-config and generates a set of HTML and PNG graph images suitable for rapidly diagnosing problems and load issues.

zmstat-chart-config.xml

Used to control what is graphed and how

Examples:

 <chart title="Mailboxd: JVM Heap Used"
        category="mailboxd"
        infile="mailboxd.csv"
        outfile="mboxd-heap-used.png"
        yAxis="MB">
   <plot data="heap_used" legend="total" divisor="1m"/>
 </chart>


The above defines a chart that reads the counter "heap_used" from mailboxd.csv and graphs it to the file "mboxd-heap-used.png". The resulting graph has a y-axis labelled "MB" and plots "heap_used" divided by 1 million (that is, in megabytes) over time.

Multiple plots can be placed onto a single chart through the use of additional <plot> elements.
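
To make the mapping from a chart definition to csv data concrete, the Python sketch below parses the example chart and applies the divisor to a sample value. It assumes, following the description above, that "1m" means one million; other divisor suffixes are not handled, and the names plots and scaled are illustrative rather than part of zmstat-chart.

```python
import xml.etree.ElementTree as ET

CHART_XML = """<chart title="Mailboxd: JVM Heap Used"
       category="mailboxd"
       infile="mailboxd.csv"
       outfile="mboxd-heap-used.png"
       yAxis="MB">
  <plot data="heap_used" legend="total" divisor="1m"/>
</chart>"""

# Assumption: "1m" is one million, as the description above states.
DIVISORS = {"1m": 1_000_000}


def plots(chart_xml):
    """Return (infile, column, divisor) for each <plot> in a <chart>."""
    chart = ET.fromstring(chart_xml)
    return [
        (chart.get("infile"), p.get("data"), DIVISORS.get(p.get("divisor"), 1))
        for p in chart.findall("plot")
    ]


def scaled(value, divisor):
    """Value as it would appear on the chart's y-axis."""
    return value / divisor


if __name__ == "__main__":
    for infile, column, divisor in plots(CHART_XML):
        print(infile, column, scaled(512_000_000, divisor))
```

A chart with several plot elements yields one tuple per plot, all sharing the same infile.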

See /opt/zimbra/conf/zmstat-chart.xml for more examples

Individual csv files and their related counters

allprocs.csv (sinze ZCS 6.0)

Written by zmstat-allprocs
Interval: LC.zmstat_interval
Reference: proc(5) man page. /proc/[pid]/stat, /pro/[pid]/io

timestamp Time when sample was collected
process process name
utime total user time
stime total system time
cputime user + system time total
rchar bytes read (*1)
wchar bytes written (*1)
read_bytes bytes read from disk (*1)
write_bytes bytes written to disk (*1)
rss resident-set-size memory usage (kiloBytes)
processes number of processes
threads number of threads

(*1) Only if IO Accounting is enabled in the Linux kernel

convertd.csv

Collects CPU statistics for the convertd process (NE only).

Written by zmstat-convertd
Interval: LC.zmstat_interval
Reference: proc(5) man page. /proc/[pid]/stat, /pro/[pid]/io

timestamp Time when sample was collected
utime user time for convertd
stime system time for convertd
cputime user + system time total
rchar bytes read (*1)
wchar bytes written (*1)
read_bytes bytes read from disk (*1)
write_bytes bytes written to disk (*1)
rss resident-set-size memory usage (kiloBytes)
processes number of processes
theards number of threads

(*1) Only if IO Accounting is enabled in the Linux kernel

cpu.csv

Written by zmstat-cpu
Interval: LC.zmstat_interval
Reference: proc(5) man page. /proc/stat

timestamp Time when sample was collected
cpu:user total user time
cpu:nice total nice process time
cpu:sys total system time
cpu:idle total idle time
cpu:iowait total time in iowait
cpu:irq total time in irq
cpu:softirq total time in softirq
cpu-N:XXX same as above, but per individual core/cpu

df.csv (since ZCS 6.0)

Captures disk usage

Written by zmstat-df
Interval: LC.zmstat_disk_interval
Reference: df(1) man page

timestamp Time when sample was collected
path mount point
disk device
disk_use space used (kiloBytes)
disk_space total space (kiloBytes)
disk_pct_used percentage used

fd.csv

Captures file descriptor usage on the system

Written by zmstat-fd
Interval: LC.zmstat_interval
Reference: proc(5) man page. /proc/sys/fs/file-nr, /proc/[pid]/fs/

timestamp Time when sample was collected
fd_count current number of open file descriptors
mailboxd_fd_count current number of open file descriptors by mailboxd

imap.csv

Written by mailboxd
Interval: 1 minute

timestamp Time when sample was collected
command executed command
exec_count number of executions
exec_ms_avg average execution time

io.csv and io-x.csv

Written by zmstat-io
Interval: LC.zmstat_interval
Reference: iostat(1) man page

timestamp Time when sample was collected
dev:tps transactions per second
dev:kB_read/s read rate
dev:kB_wrtn/s write rate
dev:kB_read bytes read
dev:kB_wrtn bytes written
dev:rrqm/s read requests merged per second queued to device
dev:wrqm/s write requests merged per second queued to device
dev:r/s reads per second
dev:w/s writes per second
dev:rkB/s read rate
dev:wkB/s write rate
dev:avgrq-sz average size (sectors) of requests
dev:avgqu-sz average queue length
dev:await average wait time for requests to be served
dev:svctm average time to service requests
dev:%util percentage of CPU time / bandwidth utilization of device

ldap.csv (since ZCS 8.0)

LDAP server

Written by zmstat-ldap
Interval: LC.zmstat_interval
Reference: OpenLDAP Software 2.4 Administrator's Guide: Monitoring http://www.openldap.org/doc/admin24/monitoringslapd.html

timestamp Time when sample was collected
abandon_ops number of completed Abandon operations
add_ops number of completed Add operations
bind_ops number of completed Bind operations
bytes_sent bytes sent by the server
compare_ops number of completed Compare operations
completed_ops See Bug 100731
connections number of connections
delete_ops number of completed Delete operations
entries_sent entries sent by the server
extended_ops number of completed Extended operations
initiated_ops See Bug 100731
modify_ops number of completed Modify operations
modrdn_ops number of completed Modrdn operations
read_waiters number of current read waiters
referrals_sent referrals sent by the server
search_ops number of completed Search operations
unbind_ops number of completed Unbind operations
write_waiters number of current write waiters


Mailbox server

Written by mailboxd
Interval: 1 minute

timestamp Time when sample was collected
command executed command
exec_count number of executions
exec_ms_avg average execution time


Known bug: Bug 99936 - mailboxd overwrites ldap.csv generated by zmstat-ldap

mailboxd.csv

Written by mailboxd
Interval: 1 minute

account_cache_hit_rate LDAP account cache hit rate
account_cache_size LDAP account cache size
acl_cache_hit_rate LDAP ACL cache hit rate
bis_read Number of times that the file descriptor cache read message data from disk
bis_seek_rate Percentage of file descriptor cache disk reads that required a seek
calcache_hit Hit rate of calendar summary cache, counting cache hit from both memory and file
calcache_lru_size Number of calendars (folders) in the calendar summary cache LRU in Java heap
calcache_mem_hit Hit rate of calendar summary cache, counting cache hit from memory only
cos_cache_hit_rate LDAP COS cache hit rate
cos_cache_size LDAP COS cache size
db_conn_count Number of times that the server got a database connection from the pool
db_conn_ms_avg Average latency (ms) of getting a database connection from the pool
db_pool_size Number of database connections in use
domain_cache_hit_rate LDAP domain cache hit rate
domain_cache_size LDAP domain cache size
ews_syncstate_cache_hit_rate EWS Syncstate Cache Hit Rate
ews_syncstate_cache_size EWS Syncstate Cache Size
fd_cache_hit_rate File descriptor cache hit rate
fd_cache_size Number of open file descriptors that reference message content
gc_concurrentmarksweep_count Number of times that concurrentmarksweep GC was invoked
gc_concurrentmarksweep_ms Time (ms) spent on concurrentmarksweep GC
gc_major_count Number of times that major GC was invoked
gc_major_ms Time (ms) spent on major GC
gc_minor_count Number of times that minor GC was invoked
gc_minor_ms Time (ms) spent on minor GC
gc_parnew_count Number of times that parnew GC was invoked
gc_parnew_ms Time (ms) spent on parnew GC
group_cache_hit_rate LDAP group cache hit rate
group_cache_size LDAP group cache size
heap_free Number of bytes free in the entire JVM heap
heap_used Number of bytes used in the entire JVM heap
http_idle_threads Number of HTTP idle threads
http_threads Number of HTTP threads
idx_bytes_read Accumulated bytes read by Lucene
idx_bytes_read_avg Average of idx_bytes_read
idx_bytes_written Accumulated bytes written by Lucene
idx_bytes_written_avg Average of idx_bytes_written
idx_wrt_avg Average number of concurrent index writers
idx_wrt_opened Accumulated number of index writers opened
idx_wrt_opened_cache_hit Accumulated number of cache hits when opening an index writer
imap_conn Number of cleartext IMAP connections
imap_count Number of IMAP requests received
imap_ms_avg Average processing time (ms) of IMAP requests
imap_ssl_conn Number of SSL IMAP connections
imap_ssl_threads Number of SSL IMAP threads
imap_threads Number of IMAP threads
innodb_bp_hit_rate InnoDB buffer pool hit rate
ldap_dc_count Number of times that the server got an LDAP directory context
ldap_dc_ms_avg Average latency (ms) of getting an LDAP directory context
lmtp_conn Number of LMTP connections
lmtp_dlvd_bytes Number of bytes of data delivered to mailboxes as a result of LMTP delivery
lmtp_dlvd_msgs Number of messages delivered to mailboxes as a result of LMTP delivery
lmtp_rcvd_bytes Number of bytes received over LMTP
lmtp_rcvd_msgs Number of messages received over LMTP
lmtp_rcvd_rcpt Number of LMTP recipients
lmtp_threads Number of LMTP threads
mbox_add_msg_count Number of messages that were added to a mailbox
mbox_add_msg_ms_avg Average latency (ms) of adding a message to a mailbox
mbox_cache Mailbox cache hit rate
mbox_cache_size Number of mailboxes cached in memory
mbox_get_count Number of times that the server got a mailbox from the cache
mbox_get_ms_avg Average latency (ms) of getting a mailbox from the cache
mbox_item_cache Item cache hit rate
mbox_msg_cache Message cache hit rate
mobile_ping_cache_hit_rate Mobile ping cache hit rate
mobile_ping_cache_size Mobile ping cache size
mobile_syncstate_cache_hit_rate Mobile sync state cache hit rate
mobile_syncstate_cache_size Mobile sync state cache size
mpool_cms_old_gen_free Number of bytes free in the CMS old generation memory pool
mpool_cms_old_gen_used Number of bytes used in the CMS old generation memory pool
mpool_code_cache_free Number of bytes free in the code cache memory pool
mpool_code_cache_used Number of bytes used in the code cache memory pool
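mpool_compressed_class_space_free Number of bytes free in the compressed class space memory pool
mpool_compressed_class_space_used Number of bytes used in the compressed class space memory pool
mpool_metaspace_free Number of bytes free in the metaspace memory pool
mpool_metaspace_used Number of bytes used in the metaspace memory pool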
mpool_par_eden_space_free Number of bytes free in the eden space memory pool
mpool_par_eden_space_used Number of bytes used in the eden space memory pool
mpool_par_survivor_space_free Number of bytes free in the survivor space memory pool
mpool_par_survivor_space_used Number of bytes used in the survivor space memory pool
msg_cache_size Number of message structures cached in memory
pop_conn Number of cleartext POP3 connections
pop_count Number of POP3 requests received
pop_ms_avg Average processing time (ms) of POP3 requests
pop_ssl_conn Number of SSL POP3 connections
pop_ssl_threads Number of SSL POP3 threads
pop_threads Number of POP3 threads
server_cache_hit_rate LDAP server cache hit rate
server_cache_size LDAP server cache size
soap_count Number of SOAP requests received
soap_ms_avg Average processing time (ms) of SOAP requests
soap_sessions Number of SOAP sessions
timestamp Time when sample was collected
ucservice_cache_hit_rate UC service cache hit rate
ucservice_cache_size UC service cache size
xmpp_cache_hit_rate LDAP XMPP cache hit rate
xmpp_cache_size LDAP XMPP cache size
zimlet_cache_hit_rate LDAP zimlet cache hit rate
zimlet_cache_size LDAP zimlet cache size
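
As an illustration of how these columns can be combined, the following sketch derives JVM heap utilization from heap_used and heap_free. It assumes that the file starts with a header row carrying the column names listed above and that heap_used + heap_free equals the current heap size; the sample values are made up, and the function name heap_utilization is hypothetical.

```python
import csv
import io


def heap_utilization(csv_text):
    """Return a list of (timestamp, fraction of the heap in use) per sample."""
    # skipinitialspace tolerates "a, b" style separators (an assumption
    # about the file layout, harmless if there are no spaces).
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    result = []
    for row in reader:
        used = float(row["heap_used"])
        free = float(row["heap_free"])
        total = used + free
        # Guard against an empty or malformed sample.
        fraction = used / total if total else 0.0
        result.append((row["timestamp"], fraction))
    return result


# Made-up sample in the assumed layout: header row, then one line per sample.
sample = (
    "timestamp,heap_used,heap_free\n"
    "06/01/2010 00:00:00,750,250\n"
    "06/01/2010 00:01:00,600,400\n"
)

for timestamp, fraction in heap_utilization(sample):
    print(timestamp, f"{fraction:.0%}")
```

The timestamp is treated as an opaque string so the sketch does not depend on a particular date format.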

mtaqueue.csv

Written by zmstat-mtaqueue
Interval: LC.zmstat_interval
Reference: postqueue(1) man page http://www.postfix.org/postqueue.1.html

timestamp Time when sample was collected
KBytes Kilobytes queued by the MTA
requests Number of items queued by the MTA
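
A minimal sketch of reading this file, assuming it begins with a header row containing the three column names above. It finds the sample with the largest queue; the function name peak_queue and the sample values are hypothetical.

```python
import csv
import io


def peak_queue(csv_text):
    """Return (timestamp, requests, KBytes) for the sample with the most
    queued items, or None if the file contains no samples."""
    rows = list(csv.DictReader(io.StringIO(csv_text), skipinitialspace=True))
    if not rows:
        return None
    peak = max(rows, key=lambda row: int(row["requests"]))
    return peak["timestamp"], int(peak["requests"]), int(peak["KBytes"])


# Made-up sample data in the assumed layout.
sample = (
    "timestamp,KBytes,requests\n"
    "06/01/2010 00:00:00,120,4\n"
    "06/01/2010 00:00:30,9800,310\n"
    "06/01/2010 00:01:00,450,12\n"
)

print(peak_queue(sample))
```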

mysql.csv

Columns for mysql.csv are derived from the output of the query "SHOW GLOBAL STATUS". Refer to the MySQL administration manual for the meaning of each counter.

Written by zmstat-mysql
Interval: LC.zmstat_interval
Reference:

Server Status Variables - MariaDB Knowledge Base https://mariadb.com/kb/en/mariadb/server-status-variables/
MySQL :: MySQL 5.0 Reference Manual :: 5.1.6 Server Status Variables https://dev.mysql.com/doc/refman/5.0/en/server-status-variables.html

nginx.csv (since ZCS 6.0)

Written by zmstat-nginx
Interval: LC.zmstat_interval
Reference: proc(5) man page. /proc/[pid]/stat, /proc/[pid]/io

timestamp Time when sample was collected
utime user time for nginx
stime system time for nginx
cputime user + system time total
rchar bytes read (*1)
wchar bytes written (*1)
read_bytes bytes read from disk (*1)
write_bytes bytes written to disk (*1)
rss resident-set-size memory usage (kiloBytes)
processes number of processes
threads number of threads

(*1) Only if IO Accounting is enabled in the Linux kernel

pop3.csv

Written by mailboxd
Interval: 1 minute

timestamp Time when sample was collected
command executed command
exec_count number of executions
exec_ms_avg average execution time (ms)

proc.csv

Written by zmstat-proc
Interval: LC.zmstat_interval
Reference: proc(5) man page. /proc/stat, /proc/[pid]/stat, /proc/[pid]/statm

timestamp Time when sample was collected
system label
user user time (percent)
sys system time (percent)
idle idle time (percent)
iowait iowait time (percent)
PROC PROC name
PROC-total-cpu user + system time total for PROC
PROC-utime user time for PROC
PROC-stime system time for PROC
PROC-totalMB total memory footprint for PROC
PROC-rssMB resident-set-size of PROC
PROC-sharedMB shared memory of PROC
PROC-process-count number of threads/subprocesses
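
Because the PROC columns repeat once per monitored process, it can help to group them by process name before charting. The sketch below does this from the header row alone. It assumes the per-process columns are named exactly as listed above, with PROC replaced by the process name; the names mailbox and mysql in the sample header are only illustrative, and group_process_columns is a hypothetical helper.

```python
# Per-process column suffixes, taken from the list above.
SUFFIXES = (
    "total-cpu", "utime", "stime",
    "totalMB", "rssMB", "sharedMB", "process-count",
)


def group_process_columns(header):
    """Map each process name to a {suffix: column name} dictionary."""
    groups = {}
    for name in header:
        for suffix in SUFFIXES:
            if name.endswith("-" + suffix):
                # Strip "-<suffix>" to recover the process name.
                process = name[: -len(suffix) - 1]
                groups.setdefault(process, {})[suffix] = name
                break
    return groups


# Illustrative header; the process names are assumptions.
header = [
    "timestamp", "system", "user", "sys", "idle", "iowait",
    "mailbox", "mailbox-total-cpu", "mailbox-rssMB",
    "mysql", "mysql-total-cpu", "mysql-rssMB",
]

for process, columns in group_process_columns(header).items():
    print(process, sorted(columns))
```

System-wide columns such as user, sys, idle and iowait carry no suffix and are left out of the grouping.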

soap.csv

Written by mailboxd
Interval: 1 minute

timestamp Time when sample was collected
command executed command
exec_count number of executions
exec_ms_avg average execution time (ms)

sql.csv (since ZCS 8.5)

Written by mailboxd
Interval: 1 minute

timestamp Time when sample was collected
command executed command
exec_count number of executions
exec_ms_avg average execution time (ms)

sync.csv (since ZCS 8.0)

Written by mailboxd
Interval: 1 minute

timestamp Time when sample was collected
command executed command
exec_count number of executions
exec_ms_avg average execution time (ms)
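
The per-command files written by mailboxd (pop3.csv, soap.csv, sql.csv and sync.csv) share the same four columns, so one routine can summarize any of them. The following sketch assumes a header row with those column names and one line per command per interval. Since exec_ms_avg is an average over a single interval, the overall average is weighted by exec_count rather than taken as a plain mean. The function name summarize_commands and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarize_commands(csv_text):
    """Return {command: (total executions, overall average ms)}."""
    totals = defaultdict(lambda: [0, 0.0])  # command -> [count, total ms]
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        count = int(row["exec_count"])
        entry = totals[row["command"]]
        entry[0] += count
        # Recover the total time spent in this interval from its average.
        entry[1] += count * float(row["exec_ms_avg"])
    return {
        command: (count, total_ms / count if count else 0.0)
        for command, (count, total_ms) in totals.items()
    }


# Made-up sample data in the assumed layout.
sample = (
    "timestamp,command,exec_count,exec_ms_avg\n"
    "06/01/2010 00:00:00,SearchRequest,10,20.0\n"
    "06/01/2010 00:00:00,AuthRequest,5,8.0\n"
    "06/01/2010 00:01:00,SearchRequest,30,40.0\n"
)

for command, (count, avg_ms) in sorted(summarize_commands(sample).items()):
    print(command, count, avg_ms)
```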

threads.csv

Written by mailboxd
Interval: 1 minute
Relative configuration: zimbraStatThreadNamePrefix (since ZCS 6.0)

timestamp Time when sample was collected
THREAD number of threads whose name starts with the prefix THREAD (one column per configured prefix)
total total number of threads

vm.csv

Records the output of vmstat, together with values from /proc/meminfo and /proc/loadavg, for later review.

Written by zmstat-vm
Interval: LC.zmstat_interval
Reference:

vmstat(8) man page
proc(5) man page. /proc/meminfo, /proc/loadavg

vmstat

  • r
  • b
  • swpd
  • free
  • buff
  • cache
  • si
  • so
  • bi
  • bo
  • in
  • cs
  • us
  • sy
  • id
  • wa
  • st
  • other counters reported by vmstat

proc

  • MemTotal
  • MemFree
  • Buffers
  • Cached
  • SwapCached
  • Active
  • Inactive
  • Active(anon)
  • Inactive(anon)
  • Active(file)
  • Inactive(file)
  • Unevictable
  • Mlocked
  • SwapTotal
  • SwapFree
  • Dirty
  • Writeback
  • AnonPages
  • Mapped
  • Shmem
  • Slab
  • SReclaimable
  • SUnreclaim
  • KernelStack
  • PageTables
  • NFS_Unstable
  • Bounce
  • WritebackTmp
  • CommitLimit
  • Committed_AS
  • VmallocTotal
  • VmallocUsed
  • VmallocChunk
  • HardwareCorrupted
  • AnonHugePages
  • HugePages_Total
  • HugePages_Free
  • HugePages_Rsvd
  • HugePages_Surp
  • Hugepagesize
  • DirectMap4k
  • DirectMap2M
  • DirectMap1G
  • Loadavg
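
As one way to use the /proc/meminfo columns, the sketch below estimates the memory held by applications as MemTotal - MemFree - Buffers - Cached. This is a common rule of thumb rather than anything defined by zmstat. It assumes the file has a header row with the names listed above and that the values are plain integers in kB, as in /proc/meminfo; memory_in_use_kb is a hypothetical helper and the sample numbers are made up.

```python
import csv
import io


def memory_in_use_kb(row):
    """Estimate application memory in kB from one vm.csv sample."""
    return (
        int(row["MemTotal"])
        - int(row["MemFree"])
        - int(row["Buffers"])
        - int(row["Cached"])
    )


# Made-up sample restricted to the columns the estimate needs.
sample = (
    "timestamp,MemTotal,MemFree,Buffers,Cached\n"
    "06/01/2010 00:00:00,8000000,1000000,500000,2500000\n"
)

for row in csv.DictReader(io.StringIO(sample), skipinitialspace=True):
    print(row["timestamp"], memory_in_use_kb(row), "kB")
```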


Related: Server Monitoring

Verified Against: unknown
Date Created: 3/07/2009
Article ID: https://wiki.zimbra.com/index.php?title=Zmstats
Date Modified: 2015-08-14


