Server Monitoring: Difference between revisions

mNo edit summary
No edit summary
 
(18 intermediate revisions by 14 users not shown)
Line 1: Line 1:
= Server Monitoring =
{{BC|Community Sandbox}}
 
__FORCETOC__
<div class="col-md-12 ibox-content">
=Server Monitoring=
{{KB|{{Unsupported}}|{{ZCS 5.0}}||}}
{{WIP}}
== Stats Collection ==
== Stats Collection ==


One of the coolest things about working with ZCS is that it exposes you to many technologies such as Java, Postfix, OpenLDAP, and MySQL.  An administrator of a ZCS system should have a working knowledge of these technologies, in order to monitor system performance and solve performance problems.
One of the coolest things about working with ZCS is that it exposes you to many technologies such as Java, Postfix, OpenLDAP, and MySQL.  An administrator of a ZCS system should have a working knowledge of these technologies, in order to monitor the system and solve performance problems.


The ZCS server collects many performance-related statistics.  The data is stored in CSV files in '''/opt/zimbra/zmstat''' and its subdirectories:
The ZCS server collects many performance-related statistics.  The data is stored in the following CSV files in '''/opt/zimbra/zmstat''':


* '''cpu.csv''': CPU utilization
* '''cpu.csv''': CPU utilization
Line 16: Line 20:
* '''vm.csv''': Linux VM statistics (from the ''vmstat'' command)
* '''vm.csv''': Linux VM statistics (from the ''vmstat'' command)


These files are in a standard CSV format that can be loaded into Excel for viewing and charting.  We also provide a command-line utility called '''zmstat-chart''' that generates charts from the CSV data:
These files are in a standard CSV format that can be loaded into Excel for viewing and charting.  They are archived to subdirectories of ''/opt/zimbra/zmstat'' every day at midnight.
 
== The zmstat-chart Utility ==
 
Zimbra provides a command-line utility called '''zmstat-chart''' that is used to generate charts from the CSV data.  The following command:
 
$ zmstat-chart -s /opt/zimbra/zmstat/2008-04-03 -d ~/zmstat/2008-04-03/charts
 
will read data from CSV files in ''/opt/zimbra/zmstat/2008-04-03/'' and write HTML and PNG files to the ''~/zmstat/2008-04-03/charts/'' directory (which would be /opt/zimbra/zmstat/2008-04-03/charts/ if run as the zimbra user).  Default chart parameters are specified in '''/opt/zimbra/conf/zmstat-chart.xml'''.  If you'd like to skip certain charts or add charts that aren't generated by default, you can specify an alternate chart conf file with the '''-c''' option.


$ zmstat-chart -s /opt/zimbra/2008-04-03 -d ~/charts
The zmstat-chart configuration file may need to be generated before running zmstat-chart.  To generate it, run:


will read data from CSV files in '''/opt/zimbra/2008-04-03''' and write HTML and PNG files to the '''~/charts''' directory.  Default chart parameters are specified in '''/opt/zimbra/conf/zmstat-chart.xml'''.  An alternate chart conf file can optionally be specified with the '''-c''' option.
$ zmstat-chart-config > /opt/zimbra/conf/zmstat-chart.xml


== Chart Analysis ==
== Chart Analysis ==
Line 27: Line 39:
CPU utilization is tracked both at the server level and the process level.  Here's a sample process CPU graph:<br>
CPU utilization is tracked both at the server level and the process level.  Here's a sample process CPU graph:<br>
[[Image:ProcCpu.png | 600px]]<br>
[[Image:ProcCpu.png | 600px]]<br>
This chart shows how server CPU increases in the morning as users come to work and a spike at 9:00AM.  To further investigate the problem, you could look at other charts or the server logs to determine what happened at 9:00AM to cause the heightened system load.
This chart shows that server CPU increases in the morning as users come to work, followed by a spike at 9:00AM.  To further investigate the problem, you could look at other charts or the server logs to determine the cause of the spike.


=== Disk Utilization ===
=== Disk Utilization ===
Disk utilization is tracked for each disk partition:<br>
Disk utilization is tracked for each disk partition:<br>
[[Image:DiskUtil.png | 600px]]<br>
[[Image:DiskUtil.png | 600px]]<br>
This chart shows that disk activity also increases along with the increased utilization shown in the CPU chart.  It also shows that the ''sda'' partition is experiencing more load than the others.  When laying out disk partitions for a ZCS installation, it's a good idea to put different system components (''/opt/zimbra/store'', ''/opt/zimbra/db'', ''/opt/zimbra/index'') on separate partitions.  This makes it much easier to determine which system component is performing more disk access.<br>
This chart shows that disk activity also goes up along with the increased CPU utilization.  It also shows that the ''sda'' partition is experiencing more load than the others.  When laying out disk partitions for a ZCS installation, it's a good idea to put different system components (''/opt/zimbra/store'', ''/opt/zimbra/db'', ''/opt/zimbra/index'') on separate partitions.  This makes it much easier to determine which system component is performing more I/O.
 
=== Memory Consumption ===
ZCS stats track the amount of memory used by each process in the system:<br>
[[Image:ProcTotalMem.png | 600px]]<br>
This information can be used to determine how system memory is being allocated between the various processes.


=== JVM Garbage Collection ===
=== JVM Garbage Collection ===
Line 44: Line 61:
Higher numbers indicate that MySQL is able to get data from memory instead of going to disk.  If your hit rate is below 990, MySQL is hitting the disk harder than it should.  Investigate the following issues:
Higher numbers indicate that MySQL is able to get data from memory instead of going to disk.  If your hit rate is below 990, MySQL is hitting the disk harder than it should.  Investigate the following issues:
* Consider increasing the buffer pool size in '''my.cnf'''.
* Consider increasing the buffer pool size in '''my.cnf'''.
* Run '''EXPLAIN''' on some of the SQL statements in '''/opt/zimbra/log/myslow.log''' to see if they are causing InnoDB to read a large amount of data into memory.
* Run ''EXPLAIN'' on some of the SQL statements in '''/opt/zimbra/log/myslow.log''' to see if they are causing InnoDB to read a large amount of data into memory.
 
== Summary ==
This article describes just a few of the statistics tracked and charted by ZCS.  For more details, spend some time looking at your server's performance data either in the ''zmstat-chart'' output or Excel.
 
=Related Articles=
*[[Zmstats]]
 
{{Article Footer|5.0.x (not 6.0)|4/4/2008}}
 
[[Category:Monitoring]]
[[Category:Performance and Tuning]]
[[Category:ZCS 5.0]]

Latest revision as of 22:01, 12 July 2015

Server Monitoring

   KB 2408        Last updated on 2015-07-12  




Stats Collection

One of the coolest things about working with ZCS is that it exposes you to many technologies such as Java, Postfix, OpenLDAP, and MySQL. An administrator of a ZCS system should have a working knowledge of these technologies, in order to monitor the system and solve performance problems.

The ZCS server collects many performance-related statistics. The data is stored in the following CSV files in /opt/zimbra/zmstat:

  • cpu.csv: CPU utilization
  • fd.csv: file descriptor count
  • mailboxd.csv: ZCS server and JVM statistics
  • mtaqueue.csv: Postfix queue
  • proc.csv: disk utilization
  • soap.csv: SOAP request processing time
  • threads.csv: JVM thread counts
  • vm.csv: Linux VM statistics (from the vmstat command)

These files are in a standard CSV format that can be loaded into Excel for viewing and charting. They are archived to subdirectories of /opt/zimbra/zmstat every day at midnight.
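
Before loading a file into Excel or charting it, it can help to see which columns it actually contains. The following is a minimal sketch, not part of ZCS: ''csv_columns'' is a hypothetical helper name, and it assumes that the first line of the file is a comma-separated header row with no quoted commas.

```shell
# csv_columns FILE
# Print the header fields of a CSV file, one per line, each prefixed by its
# 1-based column number. Assumes the first line is the header row and that
# header fields contain no quoted commas (an assumption, not verified
# against every zmstat file).
csv_columns() {
    head -n 1 "$1" | awk -F',' '{
        for (i = 1; i <= NF; i++) {
            field = $i
            gsub(/^[ \t]+|[ \t\r]+$/, "", field)   # trim spaces, tabs, CR
            printf "%d %s\n", i, field
        }
    }'
}
```

For example, `csv_columns /opt/zimbra/zmstat/cpu.csv` would list the columns recorded in the current CPU statistics file.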

The zmstat-chart Utility

Zimbra provides a command-line utility called zmstat-chart that is used to generate charts from the CSV data. The following command:

$ zmstat-chart -s /opt/zimbra/zmstat/2008-04-03 -d ~/zmstat/2008-04-03/charts

will read data from CSV files in /opt/zimbra/zmstat/2008-04-03/ and write HTML and PNG files to the ~/zmstat/2008-04-03/charts/ directory (which would be /opt/zimbra/zmstat/2008-04-03/charts/ if run as the zimbra user). Default chart parameters are specified in /opt/zimbra/conf/zmstat-chart.xml. If you'd like to skip certain charts or add charts that aren't generated by default, you can specify an alternate chart conf file with the -c option.
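
Because the CSV files are archived into one subdirectory per day, the same command can be repeated for every archived day. The loop below is a sketch under a few assumptions: the archive subdirectories are named YYYY-MM-DD (as in the example above), ''chart_all_days'' is a hypothetical helper name, and ''ZMSTAT_CHART'' is a hypothetical variable introduced here only so the commands can be previewed. Only the '''-s''' and '''-d''' options described above are used.

```shell
# chart_all_days SRC_ROOT DEST_ROOT
# Run zmstat-chart once for every archived YYYY-MM-DD subdirectory of
# SRC_ROOT, writing each day's charts to DEST_ROOT/<day>/charts.
# ZMSTAT_CHART is a hypothetical override for the command name; set it to
# "echo" to print the arguments instead of generating charts. The
# destination directories are created either way.
chart_all_days() {
    src_root=$1
    dest_root=$2
    for day_dir in "$src_root"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
        [ -d "$day_dir" ] || continue      # skip when the pattern matched nothing
        day=$(basename "$day_dir")
        mkdir -p "$dest_root/$day/charts"
        "${ZMSTAT_CHART:-zmstat-chart}" -s "$day_dir" -d "$dest_root/$day/charts"
    done
}
```

A preview run such as `ZMSTAT_CHART=echo chart_all_days /opt/zimbra/zmstat ~/zmstat` prints the arguments that would be passed for each day; dropping the override runs zmstat-chart itself.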

The zmstat-chart configuration file may need to be generated before running zmstat-chart. To generate it, run:

$ zmstat-chart-config > /opt/zimbra/conf/zmstat-chart.xml

Chart Analysis

CPU Utilization

CPU utilization is tracked both at the server level and the process level. Here's a sample process CPU graph:
ProcCpu.png
This chart shows that server CPU increases in the morning as users come to work, followed by a spike at 9:00AM. To further investigate the problem, you could look at other charts or the server logs to determine the cause of the spike.
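
To pin down when a spike like this occurred without reading it off the chart, the peak can also be located directly in the CSV data. This is a rough sketch rather than a ZCS tool: ''peak_row'' is a hypothetical helper, and it assumes that the first row is a header, that the first column holds the timestamp, and that fields contain no quoted commas. The column name you pass must match one in your file's header.

```shell
# peak_row FILE COLUMN
# Print the first-column value (assumed to be the timestamp) and the value
# of COLUMN for the data row in which COLUMN is largest. COLUMN is looked
# up by name in the header row.
peak_row() {
    awk -F',' -v col="$2" '
        function trim(s) { gsub(/^[ \t]+|[ \t\r]+$/, "", s); return s }
        NR == 1 {
            for (i = 1; i <= NF; i++) if (trim($i) == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        NF >= idx {
            value = trim($idx) + 0
            if (!seen || value > max) { max = value; when = trim($1); seen = 1 }
        }
        END { if (seen) print when, max }
    ' "$1"
}
```

Once the timestamp of the peak is known, the server logs for that time are the natural next place to look.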

Disk Utilization

Disk utilization is tracked for each disk partition:
DiskUtil.png
This chart shows that disk activity also goes up along with the increased CPU utilization. It also shows that the sda partition is experiencing more load than the others. When laying out disk partitions for a ZCS installation, it's a good idea to put different system components (/opt/zimbra/store, /opt/zimbra/db, /opt/zimbra/index) on separate partitions. This makes it much easier to determine which system component is performing more I/O.

Memory Consumption

ZCS stats track the amount of memory used by each process in the system:
ProcTotalMem.png
This information can be used to determine how system memory is being allocated between the various processes.

JVM Garbage Collection

ZCS tracks the percentage of time that the Java Virtual Machine spends on garbage collection:
GcTime.png
If the JVM is spending more than a few percent of its time on garbage collection, consider increasing the amount of memory allocated to the server Java process.

InnoDB Buffer Pool Hit Rate

This chart tracks the buffer pool hit rate for the InnoDB storage engine in MySQL:
MysqlBufpoolHitRate.png
Higher numbers indicate that MySQL is able to get data from memory instead of going to disk. InnoDB reports the hit rate on a scale of 0 to 1000. If your hit rate is below 990 (that is, below 99%), MySQL is hitting the disk harder than it should. Investigate the following issues:

  • Consider increasing the buffer pool size in my.cnf.
  • Run EXPLAIN on some of the SQL statements in /opt/zimbra/log/myslow.log to see if they are causing InnoDB to read a large amount of data into memory.
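
To see when the hit rate dropped below the threshold, the CSV data can be filtered directly. The helper below is a sketch only: ''rows_below'' is a hypothetical name, the file and column that hold the hit rate on your system need to be checked against the CSV header, and the code assumes a header row, a timestamp in the first column, and no quoted commas.

```shell
# rows_below FILE COLUMN THRESHOLD
# Print the first-column value (assumed to be the timestamp) and the COLUMN
# value for every data row whose COLUMN value is numerically below
# THRESHOLD. Rows with an empty value in COLUMN are skipped.
rows_below() {
    awk -F',' -v col="$2" -v limit="$3" '
        function trim(s) { gsub(/^[ \t]+|[ \t\r]+$/, "", s); return s }
        NR == 1 {
            for (i = 1; i <= NF; i++) if (trim($i) == col) idx = i
            if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        NF >= idx && trim($idx) != "" && trim($idx) + 0 < limit + 0 {
            print trim($1), trim($idx)
        }
    ' "$1"
}
```

Calling it with a threshold of 990 lists the sample times at which the buffer pool was under pressure, which can then be compared against the timestamps in /opt/zimbra/log/myslow.log.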

Summary

This article describes just a few of the statistics tracked and charted by ZCS. For more details, spend some time looking at your server's performance data either in the zmstat-chart output or Excel.

Related Articles

  • Zmstats

Verified Against: 5.0.x (not 6.0) Date Created: 4/4/2008
Article ID: https://wiki.zimbra.com/index.php?title=Server_Monitoring Date Modified: 2015-07-12


