Monitoring Zimbra Collaboration with Nagios
Overview
This tutorial is aimed at IT admins who already have a Nagios environment and want to monitor a Zimbra environment. It does not cover installing and configuring Nagios; we will publish a tutorial about that soon. It is a real example of how to monitor a Zimbra system and what to monitor. If you want to share your own plugins or configuration, that would be great for the rest of the Community.
Preparing our Nagios Server
Before we start, we are going to prepare our config files. We must create a Zimbra host and host group:
root@firewall:~# vim /etc/nagios3/conf.d/lab_zimbra.cfg
define host{
        use             generic-host
        host_name       mail.zimbra.lab
        alias           mail.zimbra.lab
        address         192.168.100.20
        }
root@firewall:~# vim /etc/nagios3/conf.d/hostgroups_nagios2.cfg
define hostgroup {
        hostgroup_name  zimbra-servers
        alias           Zimbra
        members         mail.zimbra.lab
        }
Installing NRPE in our Zimbra Server
We are going to install the NRPE client (Nagios Remote Plugin Executor) on our Zimbra server. In my case the server runs CentOS, and by default this operating system doesn't ship the Nagios plugins in its official repositories, so we need the EPEL repository (Extra Packages for Enterprise Linux) to install the client. The attached diagram shows how Nagios works with NRPE:
To install the client, we first add the EPEL repository:
[root@zimbra ~]# rpm -Uhv http://dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
Once we've added the repository, we can install the NRPE client on our Zimbra server:
[root@zimbra ~]# yum install nagios-plugins-nrpe nagios-plugins-all nrpe
Now we need to configure the NRPE client to accept arguments and to allow our Nagios server to query this Zimbra server, so that it can run the active checks:
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
allowed_hosts=192.168.100.1
dont_blame_nrpe=1
[root@zimbra ~]# service nrpe restart
[root@zimbra ~]# vim /etc/selinux/config
SELINUX=disabled
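Before defining any service, it can help to confirm that the Nagios server can actually talk to the NRPE daemon. The sketch below wraps a check_nrpe call and translates the plugin exit code into the usual Nagios state names (0 OK, 1 WARNING, 2 CRITICAL, anything else UNKNOWN). The helper name nrpe_probe is hypothetical, and the plugin path in the usage note is an assumption based on the plugin directory used elsewhere in this tutorial.

```shell
# nrpe_probe PLUGIN HOST [ARGS...]
# Hypothetical helper (not part of Nagios or NRPE): runs the given
# check_nrpe binary against HOST and prefixes its output with the
# Nagios state name that matches the exit code.
nrpe_probe() {
    probe_plugin="$1"
    probe_host="$2"
    shift 2
    # Capture stdout and stderr so connection errors are shown too.
    probe_output=$("$probe_plugin" -H "$probe_host" "$@" 2>&1)
    probe_rc=$?
    case "$probe_rc" in
        0) probe_state="OK" ;;
        1) probe_state="WARNING" ;;
        2) probe_state="CRITICAL" ;;
        *) probe_state="UNKNOWN" ;;
    esac
    printf '%s: %s\n' "$probe_state" "$probe_output"
    return "$probe_rc"
}
```

On the Nagios server it could be called as nrpe_probe /usr/lib/nagios/plugins/check_nrpe 192.168.100.20. Without a -c option, check_nrpe normally just reports the version of the remote NRPE daemon, which is enough to prove that allowed_hosts is set correctly.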
Monitoring the Services
In the next steps, we will see the configuration for each service that we want to monitor.
Monitoring the SMTP service
The Simple Mail Transfer Protocol (SMTP) is the network protocol used to exchange email between computers or devices. We are going to monitor, from outside the Zimbra server, whether the SMTP service is up and ready:
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service{
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  SMTP
        check_command        check_smtp
        }
Monitoring the SUBMISSION service
Submission (port 587) is the network service used to hand outgoing email to the server; this port is the alternative to the standard SMTP port. We are going to monitor, from outside the Zimbra server, whether the Submission service is up and ready:
root@firewall:~# vim /etc/nagios3/commands.cfg
define command{
        command_name  check_submission
        command_line  /usr/lib/nagios/plugins/check_smtp -H $HOSTADDRESS$ -p 587
        }
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service{
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Submission
        check_command        check_submission
        }
Monitoring the IMAP service
The Internet Message Access Protocol (IMAP) is an application protocol that gives us access to the messages stored on an Internet server. Through IMAP we can reach our email from any device that has an Internet connection. IMAP has some advantages over POP3: the client sees a mirror of the mailbox, a perfect copy of the mails and folders on the server, and we can choose which folders to view and synchronize. We are going to monitor, from outside the Zimbra server, whether the IMAP service is up and ready:
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service{
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  IMAP
        check_command        check_imap
        }
Monitoring POP3 service
The Post Office Protocol (POP3) lets us download email messages from the server, and we can choose whether or not to leave a copy on the server. We are going to monitor, from outside the Zimbra server, whether the POP3 service is up and ready:
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service{
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  POP3
        check_command        check_pop
        }
Monitoring IMAP SSL service
This protocol is the same as IMAP, but secured with an SSL certificate. We are going to monitor, from outside the Zimbra server, whether the IMAP SSL service is up and ready:
root@firewall:~# vim /etc/nagios3/commands.cfg
define command{
        command_name  check_imaps
        command_line  /usr/lib/nagios/plugins/check_imap -H $HOSTADDRESS$ -p 993 -S
        }
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service{
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  IMAP SSL
        check_command        check_imaps
        }
Monitoring POP3 SSL service
This protocol is the same as POP3, but secured with an SSL certificate. We are going to monitor, from outside the Zimbra server, whether the POP3 SSL service is up and ready:
root@firewall:~# vim /etc/nagios3/commands.cfg
define command{
        command_name  check_pops
        command_line  /usr/lib/nagios/plugins/check_pop -H $HOSTADDRESS$ -p 995 -S
        }
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service{
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  POP3 SSL
        check_command        check_pops
        }
Monitoring ClamAV service
ClamAV is the antivirus system that Zimbra uses. We are going to monitor whether ClamAV has an open socket, which tells us that clamd is working and ready:
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_clamd]=/usr/lib64/nagios/plugins/check_clamd /opt/zimbra/data/clamav/clamav.sock
[root@zimbra ~]# service nrpe restart
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  ClamAV
        check_command        check_nrpe_1arg!check_clamd
        }
Monitoring LMTP service
The Local Mail Transfer Protocol (LMTP) is a derivative of SMTP, the Simple Mail Transfer Protocol. LMTP was designed as an alternative to SMTP for situations where the receiving side doesn't have a mail queue, such as a Mail Delivery Agent (MDA) that understands SMTP conversations. We are going to monitor, through NRPE on the Zimbra server itself, whether the LMTP service is up and ready:
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_lmtp]=/usr/lib64/nagios/plugins/check_smtp -H localhost -p 7025
[root@zimbra ~]# service nrpe restart
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  LMTP
        check_command        check_nrpe_1arg!check_lmtp
        }
Monitoring SpellCheck service
SpellCheck is the spell checker of Zimbra. We are going to monitor, through NRPE on the Zimbra server itself, whether the SpellCheck service is up and ready:
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_spell]=/usr/lib64/nagios/plugins/check_http -H localhost -p 7780
[root@zimbra ~]# service nrpe restart
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  SpellCheck
        check_command        check_nrpe_1arg!check_spell
        }
Monitoring DNS service
We will verify that the server can resolve an external DNS name, for example google.com:
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_dns]=/usr/lib64/nagios/plugins/check_dns -H google.com
[root@zimbra ~]# service nrpe restart
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  DNS
        check_command        check_nrpe_1arg!check_dns
        }
Monitoring the Certificate
The certificate is a core piece of our Zimbra environment: without it, or if it fails, everything will stop. We are going to monitor whether the Zimbra certificate is valid and OK (the -C 30 option sets the minimum number of days the certificate must still be valid):
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_cert]=/usr/lib64/nagios/plugins/check_http -S -H localhost -C 30
[root@zimbra ~]# service nrpe restart
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Cert HTTPS
        check_command        check_nrpe_1arg!check_cert
        }
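To cross-check by hand what the certificate check reports, a rough equivalent can be built with the openssl command line. This is only a sketch under assumptions: the helper names are hypothetical, it assumes openssl is installed, and it is not how check_http is implemented. It relies on the -checkend option of openssl x509, which takes a number of seconds.

```shell
# days_to_seconds DAYS: convert a day count into the seconds that
# "openssl x509 -checkend" expects.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# cert_valid_for HOST PORT DAYS
# Hypothetical helper: succeeds (exit 0) if the certificate served on
# HOST:PORT will still be valid DAYS days from now. It also fails when
# the connection cannot be made, because no certificate reaches x509.
cert_valid_for() {
    echo | openssl s_client -connect "$1:$2" -servername "$1" 2>/dev/null |
        openssl x509 -noout -checkend "$(days_to_seconds "$3")" >/dev/null
}
```

For example, cert_valid_for mail.zimbra.lab 443 30 && echo "certificate OK" mirrors the 30-day window used above; port 443 is assumed here because it is the default HTTPS port.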
Monitoring logged users
It is very important to know how many users are logged in to our environment, through SSH or on the console. We can have problems if two or more users edit the same config file, for example:
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_users]=/usr/lib64/nagios/plugins/check_users -w 2 -c 3
[root@zimbra ~]# service nrpe restart
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Usuarios
        check_command        check_nrpe_1arg!check_users
        }
Monitoring Load Average
The load average tells us about the health of the server: the state of the CPU, the processes, and so on. Thanks to the load average we can spot bottlenecks:
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Load Average
        check_command        check_nrpe!check_load!5!10
        }
Monitoring PING
Monitoring ping tells us whether our system has a bottleneck or high latency; if it does, it will impact our environment.
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Ping6
        check_command        check_ping!100.0,20%!500.0,60%
        }
Monitoring Disk Space
As you can see in the screenshot, my Zimbra server has three important partitions: OPT, HSM and Backups. We are going to monitor every partition so that we get an email alert in case a disk is getting full.
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_opt]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /opt
command[check_backup]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /opt/zimbra/backup
command[check_hsm]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /opt/zimbra/hsm
[root@zimbra ~]# service nrpe restart
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Espacio OPT
        check_command        check_nrpe_1arg!check_opt
        }
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Espacio HSM
        check_command        check_nrpe_1arg!check_hsm
        }
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Espacio Backup
        check_command        check_nrpe_1arg!check_backup
        }
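The percentages passed to check_disk refer to free space, so lower values are worse: with -w 20% -c 10% the check warns when less than 20% is free and goes critical when less than 10% is free. The sketch below illustrates that logic with two hypothetical helpers; the exact rounding and boundary behaviour of the real plugin may differ, and free_percent derives its value from the "Capacity" column of df -P, which is itself rounded.

```shell
# disk_state FREE_PERCENT WARN CRIT
# Hypothetical helper mirroring "-w 20% -c 10%": thresholds are minimum
# percentages of FREE space, so a smaller FREE_PERCENT is worse.
disk_state() {
    if [ "$1" -lt "$3" ]; then
        echo "CRITICAL"
    elif [ "$1" -lt "$2" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

# free_percent MOUNTPOINT
# Prints the free space of MOUNTPOINT as an integer percentage, computed
# as 100 minus the used percentage in column 5 of POSIX "df -P" output.
free_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}
```

On the Zimbra server, disk_state "$(free_percent /opt)" 20 10 gives a quick idea of what the check_opt command should report.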
In the end, all of these checks look like the next screenshot in our Nagios server:
Plugin check_zmstatus.pl
This plugin checks the state of the Zimbra services by asking zmcontrol status directly, as the zimbra user. You can download the plugin from https://raw.githubusercontent.com/gmykhailiuta/check_zmstatus/master/check_zmstatus.pl and you need to put it into your plugins folder:
* /usr/lib64/nagios/plugins (for CentOS 64bit)
* /usr/lib/nagios/plugins (for Ubuntu 12.04 64bit)
Thanks to gmykhailiuta (https://github.com/gmykhailiuta) for adapting the plugin.
We need some additional configuration. First, add the following line to the sudoers file, /etc/sudoers:
%nagios ALL=(zimbra) NOPASSWD:/opt/zimbra/bin/zmcontrol
Then add the following to our nrpe.cfg file:
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_zmstatus]=/usr/lib64/nagios/plugins/check_zmstatus.pl -b $ARG1$
[root@zimbra ~]# service nrpe restart
Now, on our Nagios server, add this service:
root@srvnagios:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Zimbra Status
        check_command        check_nrpe_zimbra
        }
And we need to add the following to our commands.cfg file:
define command{
        command_name  check_nrpe_zimbra
        command_line  $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_zmstatus -u
        }
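To get a feel for what a status check of this kind has to do, the sketch below filters zmcontrol status output and lists the services that are reported as stopped. It is not the code of check_zmstatus.pl; the helper name not_running is hypothetical, and the line format it expects (service name followed by Running or Stopped) is an assumption about the zmcontrol output.

```shell
# not_running: reads "zmcontrol status" output on stdin and prints the
# name of every service whose last column is "Stopped".
# Assumed line format: "<service name>   Running|Stopped".
not_running() {
    awk 'NF > 1 && $NF == "Stopped" {
        # Rebuild the service name from every field except the state,
        # so multi-word names are kept intact.
        name = $1
        for (i = 2; i < NF; i++) name = name " " $i
        print name
    }'
}
```

On the Zimbra server it could be fed with sudo -u zimbra /opt/zimbra/bin/zmcontrol status | not_running, which uses the same sudoers rule as the plugin; empty output means that no service is reported as stopped.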
After all of this you will see the following; if any service fails, it will turn red:
Monitoring MySQL
We also can't forget a key piece of our Zimbra environment, MySQL. Monitoring it is easy and quick; on our Zimbra server we need to do the following:
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_zimbra_mysql]=/usr/lib64/nagios/plugins/check_mysql -s /opt/zimbra/db/mysql.sock
command[check_zimbra_mysql_logger]=/usr/lib64/nagios/plugins/check_mysql -s /opt/zimbra/logger/db/mysql.sock
[root@zimbra ~]# service nrpe restart
On our Nagios server, add the following services:
root@firewall:~# vim /etc/nagios3/conf.d/services_nagios2.cfg
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Comprobar MySQL
        check_command        check_nrpe_1arg!check_zimbra_mysql
        }
define service {
        use                  generic-service
        hostgroup_name       zimbra-servers
        service_description  Comprobar MySQL Logger
        check_command        check_nrpe_1arg!check_zimbra_mysql_logger
        }
We may run into problems with the MySQL credentials used for monitoring. In that case we must create a MySQL user without any privileges:
mysql> select password('YOURPASSWORD');
+-------------------------------------------+
| password('YOURPASSWORD')                  |
+-------------------------------------------+
| *AB6F6BD001383BE123123123123fdssdf7379334 |
+-------------------------------------------+
1 row in set (0.00 sec)
mysql> CREATE USER nagios IDENTIFIED BY PASSWORD '*AB6F6BD001383BE123123123123fdssdf7379334';
Query OK, 0 rows affected (0.52 sec)
And modify the NRPE config like this:
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_zimbra_mysql]=/usr/lib64/nagios/plugins/check_mysql -s /opt/zimbra/db/mysql.sock -u nagios -p YOURPASSWORD
[root@zimbra ~]# service nrpe restart
Monitoring Mailq
Another important thing to monitor is the mail queue (mailq), which shows how many messages are waiting in the queue. With it we can detect a spam attack, or simply a problem sending emails:
[root@zimbra ~]# vim /etc/nagios/nrpe.cfg
command[check_zimbra_mailq]=/usr/lib/nagios/plugins/check_mailq -w 100 -c 150 -M postfix
In this case we need to modify a Nagios file too, because the default path to mailq is incorrect:
[root@zimbra ~]# vim /usr/lib/nagios/plugins/utils.pm
And change the value of
$PATH_TO_MAILQ
so that it looks like
/usr/bin/mailq
We need to look for the Postfix path in our Zimbra Collaboration Server, under /opt/zimbra/postfix-XXXXXXXX
If we run the check again, we will see how many mails we have in the queue: 6 in my case.
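The number that the check compares against -w 100 and -c 150 is the message count from the summary line that Postfix prints at the end of the mailq output. As an illustration only, and not the code of check_mailq, the hypothetical helper below extracts that count; the sample lines mentioned in the comments are assumptions about the usual Postfix format.

```shell
# queue_size: reads Postfix "mailq" output on stdin and prints the number
# of queued messages. It looks at the summary line, for example
# "-- 23 Kbytes in 6 Requests.", or at "Mail queue is empty".
queue_size() {
    awk '/^Mail queue is empty/ { print 0 }
         /^-- .* in [0-9]+ Request/ { print $(NF - 1) }'
}
```

With the mailq binary found under the Zimbra Postfix directory, piping its output through queue_size should give the same count that the Nagios check reports.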