Revision as of 18:55, 12 March 2010

This article is NOT official Zimbra documentation. It is a user contribution and may include unsupported customizations, references, suggestions, or information.

Clustering Topics

Actual Clustering Topics Homepage


Please see Ajcody-Clustering

My Other Clustering Pages


RFE I made based on the stuff above:

Critical Bugs/RFE's - False Restarts And So Forth

Please see:

Recommendations to work around the bug until a fix is released - 2 methods:

  • First Method
    • Remove the zmmtaconfigctl restart from the /etc/logrotate.d/zimbra file so that it will not attempt to restart the service that is being detected as down. We can remove the line that reads:
su - zimbra -c "/opt/zimbra/bin/zmmtaconfigctl restart"
    • That will stop these failures, but it will affect logging for zmmtaconfigctl, probably causing it to write to a nonexistent file. We haven't seen problems in zmmtaconfig for a long time, so this is a fairly low-risk workaround.
  • Second Method - If you want to disable software monitoring in general:
    • Disable software monitoring. This will prevent failover if zmcluctl finds a service down. It will not prevent failover if there is a hardware fault detected by the cluster software.
    • To disable it, check /opt/zimbra-cluster/bin/zmcluctl and find the 'status' section. It's the last of the three (start, stop, status). You will need to find the lines that read 'exit($rc);' and change them to read 'exit(0);'.
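The first method above can be sketched as a small shell edit. This is demonstrated on a sample copy so nothing real is touched; on a live system the target file would be /etc/logrotate.d/zimbra (back it up first), and the file contents below are a made-up stand-in, not the actual Zimbra logrotate config.

```shell
# Create a demo stand-in for /etc/logrotate.d/zimbra (contents are
# hypothetical; only the restart line matters for this sketch).
cat > /tmp/zimbra-logrotate-demo <<'EOF'
/var/log/zimbra.log {
    postrotate
su - zimbra -c "/opt/zimbra/bin/zmmtaconfigctl restart"
    endscript
}
EOF

# Comment the restart line out instead of deleting it, so the change
# is easy to revert once a fixed release is installed.
sed -i 's|^\(.*zmmtaconfigctl restart.*\)$|#\1|' /tmp/zimbra-logrotate-demo

# Show the now-commented line.
grep 'zmmtaconfigctl' /tmp/zimbra-logrotate-demo
```

The same approach does not apply cleanly to the second method: the 'exit($rc);' lines in /opt/zimbra-cluster/bin/zmcluctl should only be changed in the 'status' section, so that edit is best done by hand in an editor.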

Other bugs/RFEs you might be interested in looking at:

Note: Beginning in ZCS 6.0, hardware-only failover with Red Hat Cluster Manager is supported.

RHEL 5 Clusters And Cisco Switches

Please see the following:

  • "Openais appears to fail, causing cluster member to fence"
    • https://bugzilla.redhat.com/show_bug.cgi?id=469874
      • The last comment mentions a Cisco issue being the cause (Cisco switches are used internally for IBM blades).
      • Comment 9 states it is most likely a hardware configuration issue with the switch or iptables.
  • "Cman kills first node in initial cluster setup"
    • https://bugzilla.redhat.com/show_bug.cgi?id=485026

Good Summary For RHEL Clustering


This is a good, solid summary of RHEL clustering:

http://www.linuxjournal.com/article/9759

Active-Active Clustering


There is a bug (RFE) for an active-active configuration. Please see:

http://bugzilla.zimbra.com/show_bug.cgi?id=19700

Non-San Based Fail Over HA/Cluster Type Configuration


This RFE covers the case where you want a "copy" of the data to reside on an independent server - LAN/WAN.

Please see:

RFE's/Bug Related To Supporting Clustering Options


HA-Linux (Heartbeat)


HA-Linux How-To For Testing And Educational Use

References:


Actual HA-Linux How-To For Testing And Educational Use Homepage

Please see Ajcody-Notes-HA-Linux-How-To

Motive Behind How-To

I hope this gives administrators with no prior clustering experience an easy way to step through some clustering concepts and gain real-world experience. I plan on walking through each "function" behind clustering rather than jumping straight to an end setup (Linux-HA, shared storage, and Zimbra).

The structure will be:

  • Set up two machines (physical or virtual).
    • Emphasize the physical hostname/IP vs. the hostname and IP address that will be used for HA.
  • Set up the virtual hostname and IP address for HA.
    • Explain and perform IP failover between the two machines.
  • Set up a disk mount; we'll probably use an NFS export from a third machine.
    • This will give us an example of expanding the HA conf files to move beyond IP address failover.
    • Adjust the HA confs to export a local directory from each server via NFS. This will not be a shared physical disk, of course.
  • Set up a shared disk between the two servers and include it in the HA conf files.
    • We can use DRBD, or maybe figure out a way to share a virtual disk between the two VMs.
  • Set up a very simple application to include between the two machines, something like Apache or CUPS.
  • Go back and readjust all variables, comparing monitoring-type (automatic) failover with simple manually initiated failover.
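For the IP-failover and shared-disk steps above, a Heartbeat v1 style configuration might end up looking like this minimal sketch. The node names, the 192.168.1.100 address, the /dev/drbd0 device, and the /data mount point are all hypothetical placeholders for this how-to, not values from a real setup:

```
# /etc/ha.d/haresources (kept identical on both nodes) -- all
# names, IPs, and devices here are hypothetical examples.
# node1 preferentially owns the floating IP, the DRBD-backed
# filesystem, and the apache service; Heartbeat moves them to
# node2 on failover.
node1 192.168.1.100 Filesystem::/dev/drbd0::/data::ext3 apache

# /etc/ha.d/ha.cf (excerpt)
auto_failback on
node node1 node2
```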
