King0770-Notes-ldap-fragmentation

Latest revision as of 19:00, 20 December 2018

Have you ever noticed LDAP fragmentation on your LDAP replica nodes before?


[zimbra@ldap-replica002 ~]$ date;mdb_stat -a -e -f /opt/zimbra/data/ldap/mdb/db | grep "Free pages" | awk '{print $3 * 4096/1024/1024 " MB"}'
Tue Nov 20 12:12:34 MST 2018
3419.11 MB

[zimbra@ldap-replica002 ~]$ date;mdb_stat -a -e -f /opt/zimbra/data/ldap/mdb/db | grep "Free pages" | awk '{print $3 * 4096/1024/1024 " MB"}'
Tue Nov 20 12:14:16 MST 2018
3554.54 MB

[zimbra@ldap-replica002 ~]$ date;mdb_stat -a -e -f /opt/zimbra/data/ldap/mdb/db | grep "Free pages" | awk '{print $3 * 4096/1024/1024 " MB"}'
Tue Nov 20 12:15:14 MST 2018
3627.49 MB

[zimbra@ldap-replica002 ~]$ date;mdb_stat -a -e -f /opt/zimbra/data/ldap/mdb/db | grep "Free pages" | awk '{print $3 * 4096/1024/1024 " MB"}'
Tue Nov 20 12:16:19 MST 2018
3721.03 MB

[zimbra@ldap-replica002 ~]$ date;mdb_stat -a -e -f /opt/zimbra/data/ldap/mdb/db | grep "Free pages" | awk '{print $3 * 4096/1024/1024 " MB"}'
Tue Nov 20 12:19:13 MST 2018
3932.64 MB

[zimbra@ldap-replica002 ~]$ date;mdb_stat -a -e -f /opt/zimbra/data/ldap/mdb/db | grep "Free pages" | awk '{print $3 * 4096/1024/1024 " MB"}'
Tue Nov 20 12:20:24 MST 2018
4031.6 MB

[zimbra@ldap-replica002 ~]$ date;mdb_stat -a -e -f /opt/zimbra/data/ldap/mdb/db | grep "Free pages" | awk '{print $3 * 4096/1024/1024 " MB"}'
Tue Nov 20 12:21:58 MST 2018
4160.29 MB
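
The one-liner above hard-codes a 4096-byte page size and reports only the free space. As a minimal sketch, the helper below reads the page size from the `mdb_stat` output itself and also reports free pages as a percentage of used pages. The function name `mdb_free_summary` is made up for illustration, and the labels it matches ("Page size:", "Number of pages used:", "Free pages:") are assumed to appear as in typical `mdb_stat -e -f` output; check them against your own output first.

```shell
#!/bin/sh
# Summarise `mdb_stat -e -f` output read from stdin.
# Assumption: the labels below match your mdb_stat output; adjust if not.
mdb_free_summary() {
    awk '
        /Page size:/            { page = $NF }
        /Number of pages used:/ { used = $NF }
        /Free pages:/           { free = $NF }
        END {
            if (page == "" || used == "" || free == "" || used == 0) {
                print "could not parse mdb_stat output" > "/dev/stderr"
                exit 1
            }
            printf "free=%.2f MB used=%.2f MB free_pct=%.1f\n",
                   free * page / 1048576, used * page / 1048576,
                   100 * free / used
        }'
}

# Example (as the zimbra user, on a replica):
#   mdb_stat -e -f /opt/zimbra/data/ldap/mdb/db | mdb_free_summary
```

A steadily climbing free percentage between runs is the same symptom the timestamps above show.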

If so, then do the following as the zimbra user

1) Make sure your LDAP environment is configured to fail over LDAP read traffic to another replica or the master   <<== you must do this FIRST

2) zmcontrol stop

3) mv /opt/zimbra/data/ldap/mdb/ /opt/zimbra/data/ldap/mdb.BIG

4) cd /opt/zimbra/data/ldap

5) mkdir -p mdb/db

6) mdb_copy -c /opt/zimbra/data/ldap/mdb.BIG/db /opt/zimbra/data/ldap/mdb/db

7) zmcontrol start

*Note: run this on the replicas, not on the master.*
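
The steps above can be sketched as a small script. This is only an illustration under stated assumptions, not a supported Zimbra tool: `LDAP_DATA`, `DRY_RUN`, `run` and `compact_replica` are names invented here, and by default the script only prints the commands it would run. It uses absolute paths, so the `cd` in step 4 is not needed, and it refuses to continue if an `mdb.BIG` directory is already present so that an earlier copy is not overwritten.

```shell
#!/bin/sh
# Sketch of the compaction steps for ONE replica (never the master).
# LDAP_DATA and DRY_RUN are illustrative names, not Zimbra settings.
LDAP_DATA="${LDAP_DATA:-/opt/zimbra/data/ldap}"

# Print the command when DRY_RUN is 1 (the default); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

compact_replica() {
    if [ -e "$LDAP_DATA/mdb.BIG" ]; then
        echo "refusing: $LDAP_DATA/mdb.BIG already exists" >&2
        return 1
    fi
    run zmcontrol stop &&
    run mv "$LDAP_DATA/mdb" "$LDAP_DATA/mdb.BIG" &&
    run mkdir -p "$LDAP_DATA/mdb/db" &&
    run mdb_copy -c "$LDAP_DATA/mdb.BIG/db" "$LDAP_DATA/mdb/db" &&
    run zmcontrol start
}

# Usage, as the zimbra user, after read traffic has been failed over:
#   compact_replica              # prints the plan only
#   DRY_RUN=0 compact_replica    # actually runs the steps
```

Chaining with `&&` stops at the first failing step, so services are not started on top of a half-copied database.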

What was the cause of the fragmentation?

Most likely there was an event, and that event will be in the access log. Export your access log from the ldap-master, and inspect it.

/opt/zimbra/libexec/zmslapcat -a /tmp/
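
One possible way to inspect the export is to count which attributes were modified most often and how many operations were logged per minute. The sketch below relies on the `reqMod` and `reqStart` attributes of the OpenLDAP accesslog schema; the function names are invented for this example. It skips base64-encoded values (`reqMod::`), and the name of the export file written under /tmp/ is not assumed here, so pass whichever file zmslapcat produced.

```shell
#!/bin/sh
# Both helpers read an accesslog LDIF export on stdin.

# Rank attributes by how often they appear in reqMod lines.
# A reqMod value looks like "attrName:+ value", so strip from the first ":".
top_modified_attrs() {
    awk '/^reqMod: / { sub(/:.*/, "", $2); print $2 }' |
        sort | uniq -c | sort -rn
}

# Count logged operations per minute; reqStart begins YYYYmmddHHMMSS.
ops_per_minute() {
    awk '/^reqStart: / { print substr($2, 1, 12) }' | sort | uniq -c
}

# Usage (EXPORT is the file written by zmslapcat -a /tmp/):
#   top_modified_attrs < "$EXPORT" | head
#   ops_per_minute < "$EXPORT" | sort -rn | head
```

A minute with an unusually high count is a reasonable place to start looking for the event.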

A way to detect LDAP updates that take over 5 seconds

Set the minimum log level required

zmlocalconfig -e ldap_common_loglevel="stats none"

Now find updates that are taking longer than a certain period of time, let's say 5 seconds (5000 ms) or more.

tail -f /var/log/zimbra.log | egrep --line-buffered 'duration=([5-9][0-9]{3}|[0-9]{5,})\.' | grep --line-buffered -o 'conn=[^ ]* op=[^ ]*' > /tmp/5secplus.log
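
If you want a different threshold, a numeric comparison is easier to adjust than a digit-pattern regular expression. The following is a minimal sketch, assuming log lines carry `duration=<milliseconds>` and `conn=... op=...` fields as the pipeline above expects; `slow_ops` is a name made up for this example.

```shell
#!/bin/sh
# Print "conn=X op=Y" for every log line on stdin whose duration is at
# least the given number of milliseconds (default 5000).
slow_ops() {
    awk -v min_ms="${1:-5000}" '
        match($0, /duration=[0-9.]+/) {
            # "duration=" is 9 characters long; keep only the number.
            ms = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if (ms >= min_ms && match($0, /conn=[^ ]* op=[^ ]*/))
                print substr($0, RSTART, RLENGTH)
        }'
}

# Usage against an existing log rather than a live tail:
#   slow_ops 5000 < /var/log/zimbra.log > /tmp/5secplus.log
```

Because the comparison is numeric, durations of 10 seconds and longer are included without any extra pattern.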

Run that for a bit during a time of slowness, or just generally through a busy period during the day (8:30-9:00am for example), and then set the log level back to "none sync" (49152), because the stats log grows quickly.

zmlocalconfig -e ldap_common_loglevel="none sync"
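
For reference, the number 49152 is the sum of the OpenLDAP loglevel values for `sync` (16384) and `none` (32768). The small sketch below does that arithmetic for the three keywords used on this page; `loglevel_sum` is an illustrative helper, not a Zimbra or OpenLDAP command.

```shell
#!/bin/sh
# Sum OpenLDAP loglevel keywords into the equivalent integer.
# Only the keywords used on this page are mapped here.
loglevel_sum() {
    total=0
    for kw in "$@"; do
        case "$kw" in
            stats) total=$((total + 256)) ;;
            sync)  total=$((total + 16384)) ;;
            none)  total=$((total + 32768)) ;;
            *) echo "unknown keyword: $kw" >&2; return 1 ;;
        esac
    done
    echo "$total"
}

loglevel_sum none sync     # prints 49152
loglevel_sum stats none    # prints 33024
```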

Now find the attributes associated with the long durations (5 seconds or more). These are the updates causing syncrepl slowness; for example, you will likely see CSRF token data or a gentime attribute.

fgrep -f /tmp/5secplus.log /var/log/zimbra.log |grep -o 'MOD attr=.*' | sort | uniq -c
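
One caveat with `fgrep -f` is that it matches substrings, so a key such as `conn=10 op=2` also matches `conn=10 op=25`. As a hedged alternative, the sketch below compares the whole `conn=... op=...` pair exactly. `slow_mod_attrs` is a hypothetical helper name, and the log layout is assumed to be the same as in the commands above.

```shell
#!/bin/sh
# usage: slow_mod_attrs KEYFILE < logfile
# KEYFILE holds one "conn=X op=Y" pair per line (e.g. /tmp/5secplus.log).
slow_mod_attrs() {
    awk -v keyfile="$1" '
        # First input file: remember every slow conn/op pair.
        FILENAME == keyfile { slow[$0] = 1; next }
        # Log lines from stdin: keep MOD lines whose pair matches exactly.
        match($0, /conn=[^ ]* op=[^ ]*/) {
            key = substr($0, RSTART, RLENGTH)
            if ((key in slow) && match($0, /MOD attr=.*/))
                print substr($0, RSTART, RLENGTH)
        }' "$1" - | sort | uniq -c
}

# Usage:
#   slow_mod_attrs /tmp/5secplus.log < /var/log/zimbra.log
```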

Much Thanks to Karl Buchner & John Holder for the mdb_copy syntax!

More articles written by me, https://wiki.zimbra.com/wiki/King0770-Notes
