Ajcody-HSM-Notes

From Zimbra :: Wiki

Attention: This article is NOT official Zimbra documentation. It is a user contribution and may include unsupported customizations, references, suggestions, or information.


HSM Or Hierarchical Storage Management

Actual HSM Or Hierarchical Storage Management Homepage

Please see Ajcody-HSM-Notes

General Q&A

What's It Look Like - Big Picture

HSM requires a new "volume" on each mailstore, i.e. the mailbox server. A Zimbra mailbox server starts with dedicated volumes for the following (default paths in brackets):

Reference is Zimbra Mailbox Server

  • Message Store [/opt/zimbra/store]
    • All email messages reside, including the message body and any file attachments.
      • Messages are stored in MIME format.
    • Each mailbox has a dedicated directory named after its internal Zimbra mailbox ID.
      • Note: Mailbox IDs are unique per server, not system-wide.
    • HSM Message Store (optional) [there is no default path, you use whatever partition you make for it]
      • Hierarchical Storage Management (HSM) allows you to configure storage volumes for older messages.
      • To manage your email storage resources, you can implement a different HSM policy for each message server.
        • Messages and attachments are moved from a primary volume to the current secondary volume based on the age of the message.
        • The messages are still accessible.
    • Single-Copy Message Storage
      • Single copy storage allows messages with multiple recipients to be stored only once in the file system.
  • Index Store [/opt/zimbra/index]
  • Data (MySQL) Store [/opt/zimbra/db]
  • Backup [/opt/zimbra/backup]
  • Log files [/opt/zimbra/log]

Backup And HSM

Please see Bugs/RFE:

HSM Impact With Backups

Zimbra mailbox servers cannot see, read, or write to another Zimbra server. The HSM store data is integrated into the mailbox server's normal backup process; they are NOT separate processes. If you find that the HSM disks are lengthening your backup times and you need shorter backups, please look at the following:

HSM Impact To Server Performance

HSM currently iterates all mailboxes without pausing. This can result in a big disk/CPU hit. Please see below for more details.

HSM Running During ZCS Restarts

I created this RFE:

A customer reported that mailboxd was manually restarted during their HSM operation. This resulted in extended downtime during the Zimbra start process, because the HSM activity was in the redologs and had to be processed that way before mailboxd was fully running.
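
Until that RFE is addressed, one way to avoid the situation is to check for an active HSM session before restarting. The sketch below is only an illustration: the function name is made up, and it assumes that "Not currently running." (as seen in the zmhsm -u output later on this page) is the idle marker. Any other output is treated as "possibly running".

```shell
#!/bin/sh
# Reads `zmhsm -u` output on stdin.
# Succeeds only if the output contains the idle marker line.
# Assumption: "Not currently running." is what zmhsm prints when idle,
# as in the sample output on this page.
hsm_is_idle() {
    grep -q '^Not currently running\.'
}

# Hypothetical usage, as the zimbra user:
#   if zmhsm -u | hsm_is_idle; then
#       zmmailboxdctl restart
#   else
#       echo "HSM session may be active; not restarting" >&2
#   fi
```

The check errs on the side of caution: if zmhsm fails or prints something unexpected, the restart is skipped.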

How Does HSM Determine When To Move Message?

Taken from an internal thread within bugzilla.

Question is:

What date/timestamp does HSM use to determine when to move the message? The time a message is injected or the date/time in the Date: header?

Answer is:

HSM uses the date that's stored in the database. This is either the time that the message was added or the value specified in the X-Zimbra-Received header.

HSM And Attachments - Any Options?

If you would like to remove attachments (to another type of storage), please see the RFE/bug below. Comment #9 says:

Today most customers use our built-in HSM to allow for very large mailboxes but
use cheaper storage. We don't have a way to strip attachments but are looking
at some options of providing a way to move large attachments optionally to
either online or offline storage.

Please see the RFE/bug below. Vote on this if you like it.

Aging Policy Options For HSM Data

This is needed otherwise your HSM volume would grow indefinitely.

Please see the following RFE and vote on it.

What Doesn't Get HSM'd?

It's basically only messages that get HSM'd. Here are some RFEs/bugs I've found for missing items.

Wiki Items

RFE filed, please see:

Document & Wiki Version Items

Please see:

Briefcase Items

RFE filed, please see:

Does The Mailbox Go Into Maintenance Mode?

This was fixed in 5.0.3. Please see the following bug.

HSM Logging

I found this RFE, it might prove useful.

More Than One HSM Volume (Secondary Message Store)

There is an RFE for this, please see the following:

HSM/Secondary Volume for Spam & Junk

I'm not sure exactly what the details and dependencies are with this bug. I added a comment for clarity.

Please see the following:

Consistency Checking Tool For HSM

This is available with zmblobchk, which checks messages in general.

For more information, please see the RFE it was built for:

What If HSM Volume Becomes Full?

Q: What would happen if the HSM volume filled up while the HSM process was moving messages from the primary store to HSM store? Would it detect the full volume and abort the transaction(s) or would it keep trying? Is it possible that any mail would be lost if the HSM store filled up during an HSM run?

A: It's transactional, so it will fail gracefully. More specifically, if anything goes wrong during the file copying process, we delete any copied files and abort. The volume id of messages processed before the failure remains the same. HSM runs one mailbox at a time, so the rollback only happens for the last mailbox.
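
Even though a failed run rolls back cleanly, it is easy to check the free space on the HSM volume before starting a session. This is a minimal sketch using plain df; the function names and the 90 percent threshold in the usage comment are my own choices, not anything zmhsm provides.

```shell
#!/bin/sh
# Print the used-space percentage (integer, no % sign) of the
# filesystem that holds the path given as $1.
volume_use_pct() {
    df -P "$1" 2>/dev/null | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if the filesystem holding $1 is less than $2 percent full.
volume_has_room() {
    local used
    used=$(volume_use_pct "$1")
    [ -n "$used" ] && [ "$used" -lt "$2" ]
}

# Hypothetical usage, with the test path used on this page:
#   volume_has_room /opt/zimbra/hsm 90 && zmhsm -t
```

If the path does not exist, volume_use_pct prints nothing and volume_has_room fails, so the HSM run is not started.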

Restoring HSM Volumes - RFE

Please see:

A How-To Example - CLI

Introduction

This is a testing example. You should adjust these steps to use a REAL PARTITION and NOT A DIRECTORY in the steps listed below.

Create The HSM Volume

Normally the HSM volume would be a separate partition on the server, on lower-performance disks than the partition the mailstore is using.

To do a "dry" test, I did the following.

As root.

[root@mail3 ~]# mkdir /opt/zimbra/hsm
[root@mail3 ~]# chown zimbra:zimbra /opt/zimbra/hsm

Then switch to zimbra.

 [root@mail3 ~]# su - zimbra
 [zimbra@mail3 ~]$ zmvolume -a -n hsm-volume -t secondaryMessage -p /opt/zimbra/hsm
 [zimbra@mail3 ~]$ zmvolume -l
   Volume id: 2
        name: index1
        type: index
        path: /opt/zimbra/index
  compressed: false
     current: true

   Volume id: 1
        name: message1
        type: primaryMessage
        path: /opt/zimbra/store
  compressed: false
     current: true

   Volume id: 3
        name: hsm-volume
        type: secondaryMessage
        path: /opt/zimbra/hsm
  compressed: false
     current: false

Set HSM Volume To Current

Now let's set the hsm-volume to "current". Otherwise, if you try to run zmhsm you'll get an error of "invalid request: None of the secondary message volumes are marked as current."

 [zimbra@mail3 ~]$ zmvolume -sc -id 3
 [zimbra@mail3 ~]$ zmvolume -l
   Volume id: 2
        name: index1
        type: index
        path: /opt/zimbra/index
  compressed: false
     current: true

   Volume id: 1
        name: message1
        type: primaryMessage
        path: /opt/zimbra/store
  compressed: false
     current: true

   Volume id: 3
        name: hsm-volume
        type: secondaryMessage
        path: /opt/zimbra/hsm
  compressed: false
     current: true
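
If you want to script the "is a secondary volume marked current?" check rather than read the listing by eye, the zmvolume -l output above is easy to parse. The sketch below assumes the listing format stays exactly as shown here (Volume id, type, and current lines); the function name is hypothetical.

```shell
#!/bin/sh
# Reads `zmvolume -l` output on stdin and prints the id of every volume
# whose type equals $1 and whose "current" flag is true.
# Prints nothing if no volume matches.
current_volume_id() {
    awk -v want="$1" '
        $1 == "Volume" && $2 == "id:" { id = $3; type = "" }
        $1 == "type:"                 { type = $2 }
        $1 == "current:" && $2 == "true" && type == want { print id }
    '
}

# Hypothetical usage, as the zimbra user:
#   zmvolume -l | current_volume_id secondaryMessage
```

An empty result for secondaryMessage means zmhsm would fail with the "None of the secondary message volumes are marked as current" error quoted above.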

Starting HSM For First Time

This example is on a test server of mine. I don't have any messages older than 30 days. The default global configuration for HSM Age is 30 days.

[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
[zimbra@mail3 ~]$ crontab -l | grep -i hsm
[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 15:48:58 EDT 2008
End time:   Fri Aug 15 15:48:58 EDT 2008
Not currently running.
Moved 0 blobs dated earlier than Wed Jul 16 15:48:58 EDT 2008
  to volume 3.
Mailboxes processed: 7 out of 7.
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 30d

So it worked, but there was nothing old enough to move.
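
For monitoring, the interesting numbers in the zmhsm -u output are the blob count and the mailbox counts. Here is a rough sketch that pulls them out, assuming the "Moved N blobs ..." and "Mailboxes processed: A out of B." lines keep the wording shown above. The function name and the output format are my own.

```shell
#!/bin/sh
# Reads `zmhsm -u` output on stdin and prints one summary line:
#   moved=<blobs moved> processed=<mailboxes done> total=<mailboxes total>
# Fields that are not found in the input are left empty.
hsm_last_stats() {
    awk '
        $1 == "Moved" && $3 == "blobs" { moved = $2 }
        $1 == "Mailboxes" && $2 == "processed:" {
            proc = $3
            total = $6
            sub(/\.$/, "", total)   # strip the trailing period
        }
        END { printf "moved=%s processed=%s total=%s\n", moved, proc, total }
    '
}

# Hypothetical usage, as the zimbra user:
#   zmhsm -u | hsm_last_stats
```

A processed count lower than the total would suggest the session is still running or was aborted.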

Adjusting the zimbraHsmAge

So let's adjust the zimbraHsmAge variable and try again.

zimbraHsmAge must be a valid duration of: nnn[hsmd]

[zimbra@mail3 ~]$ zmprov mcf zimbraHsmAge 30d
### example initially had to reproduce issue for my RFE - please don't use 1d ###
###  [zimbra@mail3 ~]$ zmprov mcf zimbraHsmAge 1d ###
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1
[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 15:56:39 EDT 2008
End time:   Fri Aug 15 15:56:40 EDT 2008
Not currently running.
Moved 63 blobs dated earlier than Fri Aug 15 15:56:38 EDT 2008
  to volume 3.
Mailboxes processed: 7 out of 7.
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
0
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/0/
1  14  15  3
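
To see what cutoff a given zimbraHsmAge implies, you can convert the nnn[hsmd] duration to seconds and subtract it from the session start time. The sketch below is my own helper, not part of ZCS. It assumes h, s, m and d mean hours, seconds, minutes and days, and that a bare number is read as seconds; that last assumption matches the one-second gap between the 15:56:39 start time and the 15:56:38 cutoff in the run above, but it is not confirmed anywhere on this page.

```shell
#!/bin/sh
# Convert a duration such as 30d, 12h, 90m or 45s to seconds.
# Assumption: a value with no unit letter is treated as seconds.
# Returns non-zero for anything that is not digits plus an optional unit.
age_to_seconds() {
    local value number unit
    value=$1
    case "$value" in
        *[hsmd]) number=${value%?}; unit=${value#"$number"} ;;
        *)       number=$value; unit=s ;;
    esac
    case "$number" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # Drop leading zeros so the shell does not read the number as octal.
    while [ "${#number}" -gt 1 ] && [ "${number#0}" != "$number" ]; do
        number=${number#0}
    done
    case "$unit" in
        d) echo $(( number * 86400 )) ;;
        h) echo $(( number * 3600 )) ;;
        m) echo $(( number * 60 )) ;;
        s) echo "$number" ;;
    esac
}

# Print the cutoff as epoch seconds: start time ($1, epoch) minus age ($2).
hsm_cutoff_epoch() {
    local secs
    secs=$(age_to_seconds "$2") || return 1
    echo $(( $1 - secs ))
}

# Hypothetical usage (GNU date):
#   date -d "@$(hsm_cutoff_epoch "$(date +%s)" 30d)"
```

With 30d this reproduces the first run on this page: a session started on Aug 15 reports a cutoff of Jul 16.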

The above commands modified zimbraHsmAge as a global setting. You can also modify it on a per-server basis.

[zimbra@mail3 ~]$ zmprov gs mail3.internal.homeunix.com | grep zimbraHsmAge
zimbraHsmAge: 1d
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1d
[zimbra@mail3 ~]$ zmprov ms mail3.internal.homeunix.com zimbraHsmAge 30d
[zimbra@mail3 ~]$ zmprov gs mail3.internal.homeunix.com | grep zimbraHsmAge
zimbraHsmAge: 30d
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1d

Moving zimbraHsmAge Back To A Lower Number

Continuing after the above steps, let's see what happens if we now run zmhsm.

[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 16:09:05 EDT 2008
End time:   Fri Aug 15 16:09:05 EDT 2008
Not currently running.
Moved 0 blobs dated earlier than Fri Aug 15 16:08:35 EDT 2008
  to volume 3.
Mailboxes processed: 7 out of 7.

Did the message files get moved back?

 [zimbra@mail3 ~]$ find /opt/zimbra/store/0/15 -name '*.msg' -print
 [zimbra@mail3 ~]$ find /opt/zimbra/hsm/0/15 -name '*.msg' -print
 /opt/zimbra/hsm/0/15/msg/0/269-21.msg
 /opt/zimbra/hsm/0/15/msg/0/263-10.msg
 /opt/zimbra/hsm/0/15/msg/0/268-14.msg
 /opt/zimbra/hsm/0/15/msg/0/261-8.msg
 /opt/zimbra/hsm/0/15/msg/0/266-13.msg
 /opt/zimbra/hsm/0/15/msg/0/259-4.msg
 /opt/zimbra/hsm/0/15/msg/0/265-12.msg
 /opt/zimbra/hsm/0/15/msg/0/257-2.msg

Nope. I'm not sure what else to add here. Don't know if there's actually a way to move them back.
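
To compare the two stores for a mailbox without scrolling through file lists, you can simply count the .msg files on each side. A small sketch, with a made-up function name; the paths in the usage comment are the ones from the test above.

```shell
#!/bin/sh
# Print the number of *.msg files found under the directory given as $1.
# A missing directory counts as 0.
count_msg_blobs() {
    find "$1" -type f -name '*.msg' 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical usage for mailbox 15 on this test server:
#   echo "primary: $(count_msg_blobs /opt/zimbra/store/0/15)"
#   echo "hsm:     $(count_msg_blobs /opt/zimbra/hsm/0/15)"
```

Note the quoted '*.msg' pattern: it stops the shell from expanding the glob before find sees it.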

RFE Filed For This Behavior - To Move Messages Back To Primary Store

Please see:

It is possible to move blobs between volumes using the zmsoap command for ZCS 6 and above. Please read the source documents on this on your ZCS server in /opt/zimbra/docs; the two files are soap-admin.txt and soap.txt. The reference you're looking for is MoveBlobsRequest, in the soap-admin.txt guide. To build the proper query, you'll also want to consult the soap.txt guide. As of ZCS 8, it reads as:

<MoveBlobsRequest types="{types}" sourceVolumeIds="{volume-ids}" destVolumeId="{volume-id}" [maxBytes="{n}"]>
  [<query>{search-query}</query>]
</MoveBlobsRequest>

<MoveBlobsResponse numBlobsMoved="{n}" numBytesMoved="{n}" totalMailboxes="{n}"/>

Moves blobs between volumes.  Unlike HsmRequest, this request is synchronous,
and reads parameters from the request attributes instead of zimbraHsmPolicy.

types: a comma-separated list of item types, or "all" for all types.  See the spec for
  <SearchRequest> for details.
volume-ids: a comma-separated list of volume ids.
query: if specified, only items that match this query will be moved.
maxBytes: Limit for the total number of bytes of data to move.  Blob move will abort
  if this threshold is exceeded.

Some examples are listed below. Please check your volume IDs with zmvolume -l first; you will NOT want to run these examples without updating the volume IDs. The HSM policy and MoveBlobsRequest are search based, so you can add a query like this:

zmsoap -z MoveBlobsRequest @types=all @sourceVolumeIds=3 @destVolumeId=4 query=is:anywhere

An "is:anywhere" query will look in all folders, including /Trash for messages to move.


A query restricted to the junk folder would look like the following. This will move every blob under the junk folder, for every mailbox, from volume 1 to volume 3.

zmsoap -z MoveBlobsRequest @sourceVolumeIds=1 @destVolumeId=3 query=in:junk


Or you can specify a date:

zmsoap -z MoveBlobsRequest @sourceVolumeIds=1 @destVolumeId=3 query=before:1/1/2012


MoveBlobsRequest will move the blob and update mysql mail_item table accordingly.
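
Because a wrong volume id moves blobs to the wrong place, it can help to build the command line first and read it before running it. The sketch below only prints a zmsoap command in the same form as the examples above; it does not run anything. The function name is hypothetical, and for simplicity it accepts a single source volume id even though sourceVolumeIds may be a comma-separated list.

```shell
#!/bin/sh
# Print (do not run) a MoveBlobsRequest command line.
#   $1 = source volume id, $2 = destination volume id, $3 = search query
# Fails if an id is not numeric or if source and destination are the same.
build_move_blobs_cmd() {
    local src dest query id
    src=$1
    dest=$2
    query=$3
    for id in "$src" "$dest"; do
        case "$id" in
            ''|*[!0-9]*) echo "volume ids must be numeric" >&2; return 1 ;;
        esac
    done
    if [ "$src" = "$dest" ]; then
        echo "source and destination volume are the same" >&2
        return 1
    fi
    printf 'zmsoap -z MoveBlobsRequest @types=all @sourceVolumeIds=%s @destVolumeId=%s query=%s\n' \
        "$src" "$dest" "$query"
}

# Hypothetical usage: print the command that would move older mail from
# volume 3 back to volume 1, then run it by hand once the ids are verified.
#   build_move_blobs_cmd 3 1 before:1/1/2012
```

Printing instead of executing keeps the final decision with the administrator, who can confirm the ids against zmvolume -l first.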


Old Note Below


The desperate could review the following. This would be an unsupported procedure. I'll try to engage developers on this and get some feedback though:

Now What? Place In Cron

You will have to manually add a line to the zimbra crontab file to run the zmhsm command.

Add something like the following at the end of the crontab (su - zimbra, then crontab -e):

# HSM
0 6 * * * /opt/zimbra/bin/zmhsm -t

This will run every morning at 0600. The question for each administrator is what the right time to run this is. My initial thought is to try and kick it off after your backups. One might want to have two entries for the zmhsm command, one after your daily incremental and another time frame for your full backup days.
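
As a sketch of the two-entry idea, the helper below prints crontab lines for zmhsm with a given minute, hour and day-of-week field. The helper name is made up, and the times and days in the usage comment are placeholders: adjust them to whenever your own incremental and full backups actually finish.

```shell
#!/bin/sh
# Print one crontab line that runs zmhsm -t.
#   $1 = minute (0-59), $2 = hour (0-23), $3 = day-of-week field (e.g. '*', 0-5, 6)
# Fails if minute or hour is not a number in range.
hsm_cron_entry() {
    local minute hour days
    minute=$1
    hour=$2
    days=$3
    case "$minute" in ''|*[!0-9]*) return 1 ;; esac
    case "$hour" in ''|*[!0-9]*) return 1 ;; esac
    [ "$minute" -le 59 ] && [ "$hour" -le 23 ] || return 1
    printf '%s %s * * %s /opt/zimbra/bin/zmhsm -t\n' "$minute" "$hour" "$days"
}

# Hypothetical schedule: 06:00 on incremental days, 10:00 on the full
# backup day. The day ranges here are placeholders, not ZCS defaults.
#   hsm_cron_entry 0 6 0-5
#   hsm_cron_entry 0 10 6
```

Calling hsm_cron_entry 0 6 '*' reproduces the single daily entry shown above.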

Checking For The HSM Zimlet

In the Admin web console, you'll see the reference to the HSM zimlets in this path:

Configuration > Admin Extensions > com_zimbra_hsm

Deploying The HSM Zimlet

To deploy the zimlet for HSM:

su - zimbra
zmzimletctl deploy /opt/zimbra/zimlets-network/com_zimbra_hsm.zip

Then log into the admin web console in a fresh session.

What Can I Do For HSM In The Admin Web Console?

Confirm the HSM zimlet is installed first.

Setup The HSM Volume

Configuration > Servers > [Server For HSM Volume]

Then select the Volumes tab on the right-hand section.

Clicking on "Add", you'll be given a drop-down chooser for "Volume Type". For HSM, you would select "Secondary Message".

The "Assign Current Volumes" section will show you what volume is in use for what function.

You will also see "HSM" on this page to set the zimbraHsmAge variable for the SERVER - rather than globally.

Adjust The zimbraHsmAge Date

Per Server

Configuration > Servers > [Server For HSM Volume]

Then select the Volumes tab on the right-hand section.

You will see "HSM" on this page to set the zimbraHsmAge variable for the SERVER - rather than globally.

Global

Configuration > Global Settings

Then select the HSM tab on the right-hand section.

This will set the global (default) message age for HSM.

Starting & Stopping HSM (zmhsm)

Configuration > Servers > [Server With HSM Volume]

Then, in the right-hand section, above the details area for the server, you'll see an HSM button/tab. Click on this and you'll be given the option to Start HSM Session. It will output progress details.


References From Official Documentation

zmhsm - command for HSM

Please see:

CLI_zmhsm_Network_Edition_only

zmvolume - command for volumes

Please see:

http://wiki.zimbra.com/index.php?title=CLI_zmvolume

Global HSM Session Setting

Reference: http://www.zimbra.com/docs/ne/latest/administration_guide/Managing_ZCS.10.1.html#1111022

Global Settings HSM (Hierarchical Storage Management) sets the default message age threshold to 30 days. The HSM global setting is the default unless you change the schedule in the Server configuration. See “Scheduling HSM Sessions”.

Scheduling HSM Sessions

Reference: http://www.zimbra.com/docs/ne/latest/administration_guide/Managing_ZCS.10.1.html#1111045

HSM can be configured for secondary storage volumes for older messages. Messages and attachments are moved from a primary volume to the current secondary volume based on the age of the message. Users are not aware of any change and do not see any noticeable difference when opening an older message that has been moved.

To manage your email storage resources, you can implement a different HSM policy for each mailbox server. The message age threshold for HSM is set globally on the HSM tab or for individual servers from the Server Volume tab. The default is 30 days. The thresholds configured on individual servers override the threshold configured as the global setting. Sessions to move messages to the secondary volume are scheduled in your cron table. From the administration console, when you select a server, you can manually start a session, monitor sessions, and abort sessions that are in progress from the Volumes tab.

When you abort a session and then restart the process, the HSM session looks for entries in the primary store that meet the HSM age criteria. Any entries that were moved in the previous run would be excluded, as they would no longer exist in the primary store.
