<div class="col-md-12"><br></div>
__NOTOC__
<div class="col-md-12"><br /></div>
<div class="col-md-9">
<h2 class="title-header" style="padding-bottom: 9px; border-bottom: 4px solid #0087c3;">Zimbra NG HSM - Item Deduplication</h2>
<div class="col-md-12">
<div class="ibox-content">
<div class="post animated fadeInLeft animation-delay-8" style="padding-top:5px">
<div class="panel panel-default">
<div class="panel-body">
<div class="row">
== What is Item Deduplication ==

Item Deduplication is a technique that saves disk space by storing a single copy of an item and referencing it multiple times, instead of storing multiple copies of the same item and referencing each copy only once.

This might seem a minor improvement in theory, but in practice it can make a huge difference. Think of the user who sends an unnecessary 15 MB "motivational" or "funny" presentation to more than a hundred recipients, all in the "To:" field: without deduplication, a hundred copies on the same volume take up about 1.5 GB, while a single deduplicated copy takes up 15 MB.

=== Item Deduplication in Zimbra ===

Zimbra performs Item Deduplication at the moment a new item is stored in the [[Zimbra_Next_Generation_Modules/Zimbra_NG_HSM/Zimbra_Stores|Primary Volume]].

When a new item is created, its "message ID" is compared against a list of cached items. If there is a match, a hardlink to the cached message's BLOB is created instead of a whole new BLOB for the message.

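The hardlink mechanism described above can be illustrated with a minimal Python sketch. This is not Zimbra's code: the <code>DedupeStore</code> class, the per-mailbox directory layout, the <code>.msg</code> file names and the "drop the oldest entry" eviction are all assumptions made for the example. It only shows that a second delivery with a cached Message-ID becomes a new directory entry pointing at the same data on disk.

```python
import os
import tempfile
from collections import OrderedDict


class DedupeStore:
    """Toy blob store (hypothetical) that hardlinks repeat deliveries of a Message-ID."""

    def __init__(self, root, cache_size=3000):
        self.root = root
        self.cache_size = cache_size
        self.cache = OrderedDict()  # Message-ID -> path of the first stored blob

    def store(self, mailbox, message_id, payload):
        """Store one copy for `mailbox`; return (path, was_hardlinked)."""
        box_dir = os.path.join(self.root, mailbox)
        os.makedirs(box_dir, exist_ok=True)
        target = os.path.join(box_dir, "%d.msg" % (len(os.listdir(box_dir)) + 1))
        cached = self.cache.get(message_id)
        if cached is not None and os.path.exists(cached):
            os.link(cached, target)  # new directory entry, same data on disk
            return target, True
        with open(target, "wb") as handle:
            handle.write(payload)
        self.cache[message_id] = target
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # drop the oldest entry (assumed policy)
        return target, False


def demo():
    """Deliver the same message to two mailboxes and inspect the result."""
    with tempfile.TemporaryDirectory() as root:
        store = DedupeStore(root)
        body = b"x" * 1024
        first, linked_first = store.store("alice", "<1@example.com>", body)
        second, linked_second = store.store("bob", "<1@example.com>", body)
        same_inode = os.stat(first).st_ino == os.stat(second).st_ino
        return linked_first, linked_second, same_inode, os.stat(first).st_nlink
```

Calling <code>demo()</code> stores the first copy as a regular file and the second as a hardlink, so both paths share one inode and the payload occupies disk space only once.
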
The dedupe cache is managed in Zimbra 8 through the following config attributes:

'''zimbraPrefDedupeMessagesSentToSelf'''

Used to set the deduplication behaviour for sent-to-self messages.
<pre>
<attr id="144" name="zimbraPrefDedupeMessagesSentToSelf" type="enum" value="dedupeNone,secondCopyifOnToOrCC,dedupeAll" cardinality="single"
      optionalIn="account,cos" flags="accountInherited,domainAdminModifiable">
  <defaultCOSValue>dedupeNone</defaultCOSValue>
  <desc>dedupeNone|secondCopyIfOnToOrCC|moveSentMessageToInbox|dedupeAll</desc>
</attr>
</pre>

'''zimbraMessageIdDedupeCacheSize'''

Number of cached Message IDs.
<pre>
<attr id="334" name="zimbraMessageIdDedupeCacheSize" type="integer" cardinality="single" optionalIn="globalConfig" min="0">
  <globalConfigValue>3000</globalConfigValue>
  <desc>
    Number of Message-Id header values to keep in the LMTP dedupe cache.
    Subsequent attempts to deliver a message with a matching Message-Id
    to the same mailbox will be ignored. A value of 0 disables deduping.
  </desc>
</attr>
</pre>

'''zimbraPrefMessageIdDedupingEnabled'''

Enables or disables deduplication at the account or COS level.
<pre>
<attr id="1198" name="zimbraPrefMessageIdDedupingEnabled" type="boolean" cardinality="single" optionalIn="account,cos" flags="accountInherited"
      since="8.0.0">
  <defaultCOSValue>TRUE</defaultCOSValue>
  <desc>
    Account-level switch that enables message deduping. See zimbraMessageIdDedupeCacheSize for more details.
  </desc>
</attr>
</pre>

'''zimbraMessageIdDedupeCacheTimeout'''

Timeout for each entry in the dedupe cache.
<pre>
<attr id="1340" name="zimbraMessageIdDedupeCacheTimeout" type="duration" cardinality="single" optionalIn="globalConfig" since="7.1.4">
  <globalConfigValue>0</globalConfigValue>
  <desc>
    Timeout for a Message-Id entry in the LMTP dedupe cache. A value of 0 indicates no timeout.
    zimbraMessageIdDedupeCacheSize limit is ignored when this is set to a non-zero value.
  </desc>
</attr>
</pre>
Older Zimbra versions might use different attributes or lack some of them.

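How the size and timeout attributes interact can be sketched as follows. This is an illustration based only on the attribute descriptions above, not Zimbra's implementation: the <code>MessageIdCache</code> class is hypothetical, evicting the oldest entry when the cache is full is an assumed policy, and letting a non-zero timeout take precedence over a size of 0 is one possible reading of the descriptions. The current time is passed in explicitly to keep the example deterministic.

```python
from collections import OrderedDict


class MessageIdCache:
    """Hypothetical model of a Message-ID dedupe cache with a size or a timeout."""

    def __init__(self, size=3000, timeout=0.0):
        self.size = size        # models zimbraMessageIdDedupeCacheSize
        self.timeout = timeout  # models zimbraMessageIdDedupeCacheTimeout (seconds)
        self.entries = OrderedDict()  # Message-ID -> time it was first seen

    def seen_before(self, message_id, now):
        """Return True on a cache hit; otherwise remember the ID and return False."""
        if self.timeout > 0:
            # Timeout mode: expire old entries; the size limit is ignored.
            expired = [k for k, t in self.entries.items() if now - t >= self.timeout]
            for key in expired:
                del self.entries[key]
        elif self.size == 0:
            return False  # a size of 0 disables deduping
        if message_id in self.entries:
            return True
        self.entries[message_id] = now
        if self.timeout == 0 and len(self.entries) > self.size:
            self.entries.popitem(last=False)  # evict the oldest ID (assumed policy)
        return False
```

With <code>size=2</code> and no timeout, a third distinct Message-ID pushes the first one out, so a later copy of that first message is no longer recognized as a duplicate. This is the limitation that a volume-wide deduplication run addresses.
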
== Item Deduplication and Zimbra NG HSM ==

The Zimbra NG HSM module features a "doDeduplicate" operation that scans a target volume to find and deduplicate any duplicated items.

This saves even more disk space: Zimbra's automatic deduplication is bound to a limited cache, whereas Zimbra NG HSM deduplication finds and takes care of multiple copies of the same email regardless of any cache or timing.

Running the "doDeduplicate" operation is also highly recommended after a migration or a large data import, in order to optimize your storage usage.

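The general idea of a cache-independent, volume-wide deduplication can be sketched in Python. This is not the NG HSM implementation: the <code>deduplicate</code> function, the use of SHA-256, and the returned statistics are assumptions chosen for illustration. The sketch walks a directory tree, groups files by content digest, and replaces later copies with hardlinks to the first one; with <code>dry_run=True</code> it only reports what it would do.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def deduplicate(volume, dry_run=False):
    """Hardlink files with identical content under `volume`; return stats."""
    first_seen = {}  # digest -> path of the first file with that content
    stats = {"deduplicated": 0, "already": 0, "saved_bytes": 0}
    for folder, _subdirs, names in os.walk(volume):
        for name in sorted(names):
            path = os.path.join(folder, name)
            digest = file_digest(path)
            original = first_seen.setdefault(digest, path)
            if original == path:
                continue  # first file seen with this content
            if os.path.samefile(original, path):
                stats["already"] += 1  # already a hardlink to the same data
                continue
            stats["deduplicated"] += 1
            stats["saved_bytes"] += os.path.getsize(path)
            if not dry_run:
                temp = path + ".dedupe"
                os.link(original, temp)  # create the link first...
                os.replace(temp, path)   # ...then swap it in atomically
    return stats
```

Running the function a second time on the same tree reports the previously linked files under <code>already</code> rather than <code>deduplicated</code>, which mirrors the distinction the operation's statistics make between newly and previously deduplicated BLOBs.
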
=== Running a Volume Deduplication ===

==== Via the Zimbra Next Generation Modules Administration Zimlet ====

To run a volume deduplication via the Zimbra Next Generation Modules Administration Zimlet, click the "Zimbra NG HSM" tab, select the volume you wish to deduplicate, and press the "Deduplicate" button.

==== Via the Zimbra Next Generation Modules CLI ====
<pre>
zimbra@mailserver:~$ zxsuite powerstore doDeduplicate

command doDeduplicate requires more parameters

Syntax:
   zxsuite powerstore doDeduplicate {volume_name} [attr1 value1 [attr2 value2...]]

PARAMETER LIST

NAME              TYPE           EXPECTED VALUES    DEFAULT
volume_name(M)    String[,..]
dry_run(O)        Boolean        true|false         false

(M) == mandatory parameter, (O) == optional parameter

Usage example:

zxsuite powerstore dodeduplicate secondvolume
Starts a deduplication on volume secondvolume
</pre>

To list all available volumes, you can use the ''`zxsuite powerstore getAllVolumes`'' command.

=== "doDeduplicate" stats ===

The "doDeduplicate" operation is a valid target for the "monitor" command, so you can watch its statistics while it is running with the `zxsuite powerstore monitor [operationID]` command.

''Sample Output''
<pre>
Current Pass (Digest Prefix): 63/64
Checked Mailboxes: 148/148
Deduplicated/duplicated Blobs: 64868/137089
Already Deduplicated Blobs: 71178
Skipped Blobs: 0
Invalid Digests: 0
Total Space Saved: 21.88 GB
</pre>

* "Current Pass (Digest Prefix)" - The "doDeduplicate" command analyzes the BLOBs in groups based on the first character of their digest (name).
* "Checked Mailboxes" - The number of mailboxes analyzed for the current pass.
* "Deduplicated/duplicated Blobs" - Number of BLOBs deduplicated by the current operation / Total number of duplicated items on the volume.
* "Already Deduplicated Blobs" - Number of deduplicated BLOBs on the volume (duplicated BLOBs that have been deduplicated by a previous run).
* "Skipped Blobs" - BLOBs that have not been analyzed, usually because of a read error or missing file.
* "Invalid Digests" - BLOBs with a bad digest (name different from the actual digest of the file).
* "Total Space Saved" - Amount of disk space freed by the doDeduplicate operation.

Looking at the sample output above we can see that:
* The operation is running the second-to-last pass on the last mailbox.
* 137089 duplicated BLOBs have been found, 71178 of which had already been deduplicated by a previous run.
* The current operation has deduplicated 64868 BLOBs, for a total disk space saving of 21.88 GB.
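The reading of the sample output above can be reproduced with a short Python sketch that parses the "Label: value" lines and derives two figures from them. The <code>parse_stats</code> and <code>summarize</code> helpers are hypothetical, and the "remaining duplicates" formula (total duplicates minus those deduplicated now and previously) is an interpretation of the field descriptions rather than a documented metric.

```python
SAMPLE = """\
Current Pass (Digest Prefix): 63/64
Checked Mailboxes: 148/148
Deduplicated/duplicated Blobs: 64868/137089
Already Deduplicated Blobs: 71178
Skipped Blobs: 0
Invalid Digests: 0
Total Space Saved: 21.88 GB
"""


def parse_stats(text):
    """Turn 'Label: value' lines into a dict of label -> raw value string."""
    stats = {}
    for line in text.splitlines():
        label, sep, value = line.partition(": ")
        if sep:
            stats[label.strip()] = value.strip()
    return stats


def summarize(stats):
    """Derive the remaining work from the parsed statistics (assumed formula)."""
    done, total = map(int, stats["Deduplicated/duplicated Blobs"].split("/"))
    already = int(stats["Already Deduplicated Blobs"])
    current, passes = map(int, stats["Current Pass (Digest Prefix)"].split("/"))
    return {
        "remaining_duplicates": total - done - already,
        "passes_left": passes - current,
    }
```

For the sample, <code>summarize(parse_stats(SAMPLE))</code> gives 1043 duplicated BLOBs still to be handled and one pass left.
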
</div>
</div>
<div class="col-md-9">
<div class="panel-footer">
<p><i class="fa fa-clock-o"></i> Aug 25, 2016 - [https://www.zimbra.com/email-server-software/ Know more »]</p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="col-md-3"><br /></div>
<div class="col-md-3">
<div class="panel panel-zimbrared-light-border">
<div class="panel-heading">
<h3 class="panel-title"><i class="fa fa-gear pull-left"></i> Zimbra Next Generation Modules</h3>
</div>
<div class="panel-body">
{{ZNG}}
</div>
</div>
</div>
<div class="col-md-3">
<div class="panel panel-primary-light-border">
<div class="panel-heading">
<h3 class="panel-title"><i class="fa fa-info-circle pull-left"></i> Zimbra Next Generation Modules Resources</h3>
</div>
<div class="panel-body">
{{ZNGL}}
</div>
</div>
</div>
<div class="clearfix"></div>
<div class="col-md-12"><br></div>
{{FH}}