Ajcody-Clustering

Revision as of 16:19, 16 June 2010 by Ajcody (talk | contribs) (HA-Linux (Heartbeat))

This article is NOT official Zimbra documentation. It is a user contribution and may include unsupported customizations, references, suggestions, or information.

Clustering Topics

Actual Clustering Topics Homepage


Please see Ajcody-Clustering

My Other Clustering Pages


RFE I made based upon the experience above:

Critical Bugs/RFE's - False Restarts And So Forth

Log Rotation Causes Cluster Failover

See:

Recommendations for working around the bug until a fix is released (two methods):

  • First Method - disable log rotation
    • Remove the zmmtaconfigctl restart from the /etc/logrotate.d/zimbra file so that log rotation no longer restarts the service, which the cluster then detects as down. Remove the line that reads:
su - zimbra -c "/opt/zimbra/bin/zmmtaconfigctl restart"
    • That will stop these failures, but it will affect logging for zmmtaconfigctl, probably causing it to write to a nonexistent file. We haven't seen problems in zmmtaconfig for a long time, so this is a fairly low-risk workaround.
  • Second Method - disable software monitoring in general
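For the first method, a minimal sketch of the edit follows. It comments the line out instead of deleting it, so the change is easy to revert; that choice, the function name disable_mtaconfig_restart, and the .bak backup suffix are assumptions made for illustration and are not part of the original recommendation.

```shell
#!/bin/sh
# Sketch only: comment out the zmmtaconfigctl restart line in a logrotate file.
# The function name and the ".bak" backup suffix are illustrative assumptions.
disable_mtaconfig_restart() {
    file="$1"
    # Keep a copy of the current file so the change can be reverted.
    cp -p "$file" "$file.bak" || return 1
    # Prefix every line that contains the restart command, and is not
    # already commented out, with '#'. All other lines pass through unchanged.
    sed '/^[[:space:]]*#/!s|^\(.*zmmtaconfigctl restart.*\)$|#\1|' "$file.bak" > "$file"
}

# Usage (as root), with the path named in the text above:
#   disable_mtaconfig_restart /etc/logrotate.d/zimbra
```

Test this on a copy of the file first. To revert, copy the .bak file back over the original.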

Software Monitoring Causes Problems

  • To disable software monitoring:
    • This will prevent failover if zmcluctl finds a service down. It will not prevent failover if there is a hardware fault detected by the cluster software.
    • To disable it, check /opt/zimbra-cluster/bin/zmcluctl and find the 'status' section. It's the last of the three (start, stop, status). You will need to find the lines that read 'exit($rc);' and change them to read 'exit(0);'.
    • To also increase the chance of getting more information in the log on what might be going on:
      • In /opt/zimbra-cluster/bin/zmcluctl you should see a line like:
        • my @output = `su - zimbra -c 'zmcontrol status'`;
      • Change that to:
        • my @output = `su - zimbra -c 'date >> /opt/zimbra/log/zmcluster-status.log ; zmcontrol status >> /opt/zimbra/log/zmcluster-status.log 2>> /opt/zimbra/log/zmcluster-status.log'`;
      • That should give us more logging. I believe zmcluctl is read every time from disk when it does the check, so no restart of services should be needed.
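One thing to be aware of: in the changed line above, both stdout and stderr are redirected into the log file, so the backticks capture nothing into @output. If you want the status output logged and still returned to the caller, something like the following sketch may help. The helper name log_status is hypothetical and is not part of zmcluctl or Zimbra.

```shell
#!/bin/sh
# Sketch only: timestamp a log file, then run a command, appending its
# stdout and stderr to the log while still printing them for the caller.
# "log_status" is a hypothetical helper name, not a Zimbra tool.
log_status() {
    log="$1"
    shift
    # Write a timestamp line so each check can be told apart in the log.
    date >> "$log"
    # tee -a appends to the log and also passes the output through.
    # Note: the pipeline's exit status is tee's, not the command's.
    "$@" 2>&1 | tee -a "$log"
}

# Usage, with the log path from the text above:
#   log_status /opt/zimbra/log/zmcluster-status.log zmcontrol status
```

Because the exit status comes from tee, this only suits the case described above, where the status section has already been changed to exit(0).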

RHEL 5 Clusters And Cisco Switches

Please see the following:

Mysql Related Items Impacting Cluster

Please see:

Other Misc Bug/RFEs

Other bugs/RFEs you might be interested in looking at:

Good Summary For RHEL Clustering


This is a good solid summary about RHEL clustering:

http://www.linuxjournal.com/article/9759

Active-Active Clustering


There is a bug (RFE) for active-active configuration. Please see:

http://bugzilla.zimbra.com/show_bug.cgi?id=19700

Non-San Based Fail Over HA/Cluster Type Configuration


This RFE covers the case where you want a "copy" of the data to reside on an independent server, over LAN or WAN.

Please see:

RFE's/Bug Related To Supporting Clustering Options


Other Clustering RFE's And Bugs


HA-Linux (Heartbeat)

---


HA-Linux How-To For Testing And Educational Use

References:


Actual HA-Linux How-To For Testing And Educational Use Homepage

Please see Ajcody-Notes-HA-Linux-How-To

Motive Behind How-To

I hope this gives administrators who currently have no clustering experience an easy way to work through some clustering concepts and gain some real-world experience. I plan on walking through each "function" behind clustering rather than jumping straight to an end setup (Linux-HA, shared storage, and Zimbra).

The structure will be:

  • Set up two machines (physical or virtual).
    • Emphasize the physical hostname / IP vs. the hostname and IP address that will be used for HA.
  • Set up a virtual hostname and IP address for HA.
    • Explain and do IP failover between the two machines.
  • Set up a disk mount; we'll probably use an NFS export from a third machine.
    • This will give us an example of expanding the HA conf files to move beyond IP address failover.
    • Adjust the HA conf files to export a local directory from each server via NFS. This will not be a shared physical disk, of course.
  • Set up a shared disk between the two servers and include it in the HA conf files.
    • Can use DRBD, or maybe figure out a way to share a virtual disk between the two VMs.
  • Set up a very simple application to include between the two machines, something like Apache or CUPS.
  • Go back and readjust all variables between monitoring-type (automatic) failover and simple, manually initiated failover.
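
To make the IP failover step concrete, here is a rough sketch of what moving a virtual IP by hand could look like before Linux-HA automates it. The address 192.168.1.50/24, the interface eth0, and the function names vip_up and vip_down are placeholders assumed for illustration. By default the functions only print the commands they would run.

```shell
#!/bin/sh
# Sketch only: manually bring a virtual IP up or down on one node.
# By default the commands are printed (dry run). Set RUN to an empty
# value to execute them for real (requires root).
vip_up() {
    # $1 = address with prefix length, $2 = network interface
    ${RUN-echo} ip addr add "$1" dev "$2" &&
    # Send unsolicited ARP so neighbors learn the new owner of the address.
    ${RUN-echo} arping -U -c 3 -I "$2" "${1%/*}"
}

vip_down() {
    # Release the address so the other node can take it over.
    ${RUN-echo} ip addr del "$1" dev "$2"
}

# Dry run on the node taking over (placeholder values):
#   vip_up 192.168.1.50/24 eth0
# Real run on the node giving up the address:
#   RUN= vip_down 192.168.1.50/24 eth0
```

A manual failover is then vip_down on the old node followed by vip_up on the new one. The arping options shown are those of the iputils version; other arping implementations may differ.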
