Ajcody-HSM-Notes

Revision as of 21:50, 15 August 2008 by Ajcody (talk | contribs) (Adjust The zimbraHsmAge Date)

HSM or Hierarchical Storage Management

Actual General Notes Homepage

Please see Ajcody-HSM-Notes

What's It Look Like - Big Picture

HSM requires a new "volume" on the mailstore(s), i.e. the mailbox server. The Zimbra mailbox server(s) starts with dedicated volumes for the [Default paths listed]:

Reference is Zimbra Mailbox Server

  • Message Store [/opt/zimbra/store]
    • All email messages reside, including the message body and any file attachments.
      • Messages are stored in MIME format.
    • Each mailbox has a dedicated directory named after its internal Zimbra mailbox ID.
      • Note: Mailbox IDs are unique per server, not system-wide.
    • HSM Message Store (optional) [there is no default path, you use whatever partition you make for it]
      • Hierarchical Storage Management (HSM) allows you to configure storage volumes for older messages.
      • To manage your email storage resources, you can implement a different HSM policy for each message server.
        • Messages and attachments are moved from a primary volume to the current secondary volume based on the age of the message.
        • The messages are still accessible.
    • Single-Copy Message Storage
      • Single copy storage allows messages with multiple recipients to be stored only once in the file system.
        • On UNIX systems, the mailbox directory for each user contains a hard link to the actual file.
  • Index Store [/opt/zimbra/index]
  • Data (MySQL) Store [/opt/zimbra/db]
  • Backup [/opt/zimbra/backup]
  • Log files [/opt/zimbra/log].

HSM Impact With Backups

Zimbra mailbox servers cannot see, read, or write to another Zimbra server. The HSM store data is integrated into the mailbox servers normal backup process - they are NOT separate processes. If you find the HSM disks are having an impact on your backup times and you need to get the backup times shorter please look at the following:

HSM Impact To Server Performance

HSM currently iterates all mailboxes without pausing. This can result in a big disk/CPU hit. Please see below for more details.

How Does HSM Determine When To Move Message?

Taken from an internal thread within bugzilla.

Question is:

What date/timestamp does HSM use to determine when to move the message? The time a message is injected or the date/time in the Date: header?

Answer is:

HSM uses the date that's stored in the database. This is either the time that the message was added or the value specified in the X-Zimbra-Received header.

HSM And Attachments - Any Options?

If you would like to remove attachments (to another type of storage), please see the RFE/bug below. Comment #9 says:

Today most customers use our built-in HSM to allow for very large mailboxes but
use cheaper storage. We don't have a way to strip attachments but are looking
at some options of providing a way to move large attachments optionally to
either online or offline storage.

Please see the RFE/bug below. Vote on this if you like it.

A How-To Example - CLI

Introduction

This is a testing example. You should adjust these steps to use a REAL PARTITION and NOT A DIRECTORY in the steps listed below.

Create The HSM Volume

Normally you would have the HSM volume as a separate partition on the server - the lower performance disks when compared to the partition the mailstore is using.

To do a "dry" test, I did the following.

As root.

[root@mail3 ~]mkdir /opt/zimbra/hsm
[root@mail3 ~]chown zimbra:zimbra /opt/zimbra/hsm

Then switch to zimbra.

 [zimbra@mail3 ~]su - zimbra
 [zimbra@mail3 ~]zmvolume -a -n hsm-volume -t secondaryMessage -p /opt/zimbra/hsm
 [zimbra@mail3 ~]zmvolume -l
   Volume id: 2
        name: index1
        type: index
        path: /opt/zimbra/index
  compressed: false
     current: true

   Volume id: 1
        name: message1
        type: primaryMessage
        path: /opt/zimbra/store
  compressed: false
     current: true

   Volume id: 3
        name: hsm-volume
        type: secondaryMessage
        path: /opt/zimbra/hsm
  compressed: false
     current: false
Set HSM Volume To Current

Now let's set the hsm-volume to "current". Otherwise, if you try to run zmhsm you'll get a error of "invalid request: None of the secondary message volumes are marked as current."

 [zimbra@mail3 ~]$zmvolume -sc -id 3
 [zimbra@mail3 ~]$zmvolume -l
   Volume id: 2
        name: index1
        type: index
        path: /opt/zimbra/index
  compressed: false
     current: true

   Volume id: 1
        name: message1
        type: primaryMessage
        path: /opt/zimbra/store
  compressed: false
     current: true

   Volume id: 3
        name: hsm-volume
        type: secondaryMessage
        path: /opt/zimbra/hsm
  compressed: false
     current: true
Starting HSM For First Time

This is example is on a test server of mine. I don't have any messages older than 30 days. The default global configuration for HSM Age is 30 days.

[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
[zimbra@mail3 ~]$ crontab -l | grep -i hsm
[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 15:48:58 EDT 2008
End time:   Fri Aug 15 15:48:58 EDT 2008
Not currently running.
Moved 0 blobs dated earlier than Wed Jul 16 15:48:58 EDT 2008
  to volume 3.
Mailboxes processed: 7 out of 7.
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 30d

So, it worked but didn't have anything to Age.

Adjusting the zimbraHsmAge

So let's adjust the zimbraHsmAge variable and try again.

[zimbra@mail3 ~]$ zmprov mcf zimbraHsmAge 1
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1
[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 15:56:39 EDT 2008
End time:   Fri Aug 15 15:56:40 EDT 2008
Not currently running.
Moved 63 blobs dated earlier than Fri Aug 15 15:56:38 EDT 2008
  to volume 3.
Mailboxes processed: 7 out of 7.
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
0
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/0/
1  14  15  3

The above commands modified the zimbraHsmAge as a global setting. You can also modify it on a server basis.

[zimbra@mail3 ~]$ zmprov gs mail3.internal.homeunix.com | grep zimbraHsmAge
zimbraHsmAge: 1
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1
[zimbra@mail3 ~]$ zmprov ms mail3.internal.homeunix.com zimbraHsmAge 30
[zimbra@mail3 ~]$ zmprov gs mail3.internal.homeunix.com | grep zimbraHsmAge
zimbraHsmAge: 30
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1
Moving zimbraHsmAge Back To A Lower Number

Continuing after the above steps, let's see what happens if we now run zmhsm.

[zimbra@mail3 ~]$ zmhsm -t HSM process started. [zimbra@mail3 ~]$ zmhsm -u Last HSM Session Stats


Start time: Fri Aug 15 16:09:05 EDT 2008 End time: Fri Aug 15 16:09:05 EDT 2008 Not currently running. Moved 0 blobs dated earlier than Fri Aug 15 16:08:35 EDT 2008

 to volume 3.

Mailboxes processed: 7 out of 7.

Did the message files get moved back?

[zimbra@mail3 ~]$ find /opt/zimbra/store/0/15 -name *.msg -print
[zimbra@mail3 ~]$ find /opt/zimbra/hsm/0/15 -name *.msg -print
/opt/zimbra/hsm/0/15/msg/0/269-21.msg
/opt/zimbra/hsm/0/15/msg/0/263-10.msg
/opt/zimbra/hsm/0/15/msg/0/268-14.msg
/opt/zimbra/hsm/0/15/msg/0/261-8.msg
/opt/zimbra/hsm/0/15/msg/0/266-13.msg
/opt/zimbra/hsm/0/15/msg/0/259-4.msg
/opt/zimbra/hsm/0/15/msg/0/265-12.msg
/opt/zimbra/hsm/0/15/msg/0/257-2.msg

Nope. I'm not sure what else to add here. Don't know if there's actually a way to move them back.

Now What? Place In Cron

You will have to manually put in the zimbra crontab file a line to run the zmhsm command.

Something like the following at the end of the crontab [ su - zimbra ; crontab -e ] :

# HSM
0 6 * * * /opt/zimbra/bin/zmhsm -t

This will run every morning at 0600. The question of each administrator is what is the right time to run this. I'm initial thought is to try and kick it off after your backups. One might want to have two entries for the zmhsm command, one after your daily incremental and another time frame for you full backup days.

Checking For The HSM Zimlet

In the Admin web console, you'll see the reference to the HSM zimlets in this path:

Configuration > Admin Extensions > com_zimbra_hsm

Deploying The HSM Zimlet

To deploy the zimlet for HSM

su - zimbra
zmzimletctl deploy /opt/zimbra/zimlets-network/com_zimbra_hsm.zip

Log into the admin web console - fresh session.

What Can I Do For HSM In The Admin Web Console?

Confirm the HSM zimlet is installed first.

Setup The HSM Volume

Configuration > Servers > [Server For HSM Volume]

Then select the Volumes tab on the right-hand section.

Clicking on "Add", you'll be given a drop down chooser for "Volume Type". For HSM, you would select "Secondary Message".

The "Assign Current Volumes" section will show you what volume is in use for what function.

You will also see "HSM" on this page to set the zimbraHsmAge variable for the SERVER - rather than globally.

Adjust The zimbraHsmAge Date

Per Server

Configuration > Servers > [Server For HSM Volume]

Then select the Volumes tab on the right-hand section.

You will see "HSM" on this page to set the zimbraHsmAge variable for the SERVER - rather than globally.

Global

Configuration > Global Settings

The select the HSM tab on the right-hand section.

This will set the global (default) message age for HSM.

Starting & Stopping HSM (zmhsm)

Configurations > Servers > [Server With HSM Volume]

The on the right-hand section, above the details area for the server you'll see a button/tab HSM. Click on this and you'll be given the option to Start HSM Session. It will output progress details.

References From Official Documentation

zmhsm - command for HSM

Please see:

http://www.zimbra.com/docs/ne/latest/administration_guide/A_app-command-line.17.19.html

zmvolume - command for volumes

Please see:

http://wiki.zimbra.com/index.php?title=CLI_zmvolume

Global HSM Session Setting

Reference: http://www.zimbra.com/docs/ne/latest/administration_guide/Managing_ZCS.10.1.html#1111022

Global Settings HSM (Hierarchical Storage Management) sets the default message age threshold to 30 days. The HSM global setting is the default unless you change the schedule in the Server configuration. See “Scheduling HSM Sessions” .

Scheduling HSM Sessions

Reference: http://www.zimbra.com/docs/ne/latest/administration_guide/Managing_ZCS.10.1.html#1111045

HSM can be configured for secondary storage volumes for older messages. Messages and attachments are moved from a primary volume to the current secondary volume based on the age of the message. Users are not aware of any change and do not see any noticeable difference when opening an older message that has been moved.

To manage your email storage resources, you can implement a different HSM policy for each mailbox server. The message age threshold for HSM is set globally on the HSM tab or for individual servers from the Server Volume tab. The default is 30 days. The thresholds configured on individual servers override the threshold configured as the global setting. Sessions to move messages to the secondary volume are scheduled in your cron table. From the administration console, when you select a server, you can manually start a session, monitor sessions, and abort sessions that are in progress from the Volumes tab.

When you abort a session and then restart the process, the HSM session looks for entries in the primary store that meet the HSM age criteria. Any entries that were moved in the previous run would be excluded, as they would no longer exist in the primary store.

Jump to: navigation, search