Ajcody-HSM-Notes: Difference between revisions


Revision as of 21:29, 15 August 2008

HSM or Hierarchical Storage Management

Actual General Notes Homepage

Please see Ajcody-HSM-Notes

What's It Look Like - Big Picture

HSM requires a new "volume" on each mailstore, i.e. the mailbox server. A Zimbra mailbox server starts with dedicated volumes for the following (default paths listed in brackets):

Reference is Zimbra Mailbox Server

  • Message Store [/opt/zimbra/store]
    • All email messages reside here, including the message body and any file attachments.
      • Messages are stored in MIME format.
    • Each mailbox has a dedicated directory named after its internal Zimbra mailbox ID.
      • Note: Mailbox IDs are unique per server, not system-wide.
    • HSM Message Store (optional) [there is no default path, you use whatever partition you make for it]
      • Hierarchical Storage Management (HSM) allows you to configure storage volumes for older messages.
      • To manage your email storage resources, you can implement a different HSM policy for each mailbox server.
        • Messages and attachments are moved from a primary volume to the current secondary volume based on the age of the message.
        • The messages are still accessible.
    • Single-Copy Message Storage
      • Single copy storage allows messages with multiple recipients to be stored only once in the file system.
        • On UNIX systems, the mailbox directory for each user contains a hard link to the actual file.
  • Index Store [/opt/zimbra/index]
  • Data (MySQL) Store [/opt/zimbra/db]
  • Backup [/opt/zimbra/backup]
  • Log files [/opt/zimbra/log].
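To make the single-copy idea from the list above concrete, here is a minimal sketch of what a hard link looks like on disk. It uses a scratch directory rather than a real Zimbra store, and the directory and file names (mailbox_a, mailbox_b, 1.msg, 7.msg) are made up for illustration; they are not Zimbra's actual layout.

```shell
#!/bin/sh
# Sketch only: plain files in a scratch directory, not a real Zimbra store.

# Print the inode number of a file (first field of `ls -i`).
inode_of() {
    ls -i "$1" | awk '{ print $1 }'
}

# Store one copy of a message and hard-link it into a second mailbox directory.
# Directory and file names here are hypothetical.
demo_single_copy() {
    root=$1
    mkdir -p "$root/mailbox_a" "$root/mailbox_b"
    printf 'Subject: hello\n\nbody\n' > "$root/mailbox_a/1.msg"
    ln "$root/mailbox_a/1.msg" "$root/mailbox_b/7.msg"
}

scratch=$(mktemp -d)
demo_single_copy "$scratch"
if [ "$(inode_of "$scratch/mailbox_a/1.msg")" = "$(inode_of "$scratch/mailbox_b/7.msg")" ]; then
    echo "both names point at the same file on disk"
fi
rm -rf "$scratch"
```

Hard links only work within one filesystem, which is worth keeping in mind when thinking about where blobs can share storage.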

HSM Impact With Backups

Zimbra mailbox servers cannot see, read, or write to another Zimbra server. The HSM store data is integrated into the mailbox server's normal backup process; they are NOT separate processes. If you find the HSM disks are lengthening your backup times and you need to shorten them, please look at the following:

HSM Impact To Server Performance

HSM currently iterates all mailboxes without pausing. This can result in a big disk/CPU hit. Please see below for more details.

How Does HSM Determine When To Move Message?

Taken from an internal thread within Bugzilla.

Question is:

What date/timestamp does HSM use to determine when to move the message? The time a message is injected or the date/time in the Date: header?

Answer is:

HSM uses the date that's stored in the database. This is either the time that the message was added or the value specified in the X-Zimbra-Received header.
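If you are curious whether a given message blob carries that header, a small sketch like the following could pull it out of the file. This only reads the header text; the date HSM actually uses is the one in the database, and the function name and the sample header value format used when trying it out are my own assumptions.

```shell
#!/bin/sh
# Print the value of the X-Zimbra-Received header from a message file,
# or nothing if the header is absent. Only the header block is scanned.
zimbra_received_header() {
    awk '
        /^$/ { exit }    # a blank line ends the headers
        tolower($0) ~ /^x-zimbra-received:/ {
            sub(/^[^:]*:[ \t]*/, "")
            print
            exit
        }
    ' "$1"
}
```

For example, `zimbra_received_header /opt/zimbra/store/0/15/msg/0/257-2.msg` would print the header value if that blob has one, and nothing otherwise.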

HSM And Attachments - Any Options?

If you would like to remove attachments (to another type of storage), please see the RFE/bug below. Comment #9 says:

Today most customers use our built-in HSM to allow for very large mailboxes but
use cheaper storage. We don't have a way to strip attachments but are looking
at some options of providing a way to move large attachments optionally to
either online or offline storage.

Please see the RFE/bug below. Vote on this if you like it.

A How-To Example

Introduction

This is a testing example. You should adjust these steps to use a REAL PARTITION and NOT A DIRECTORY in the steps listed below.

Create The HSM Volume

Normally the HSM volume would be a separate partition on the server, on lower-performance disks than the partition the mailstore is using.
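On a production box you may want to confirm that the HSM path really is on its own filesystem before adding it as a volume. A rough sketch, assuming a POSIX `df` and mount points without spaces in their names; the helper names are made up for this example:

```shell
#!/bin/sh
# Print the mount point that holds the given path.
# `df -P` prints one header line and one data line; the last field is the mount point.
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# Succeeds when the two paths live on different mounted filesystems.
on_separate_filesystem() {
    [ "$(mount_point_of "$1")" != "$(mount_point_of "$2")" ]
}
```

Usage would be something like `on_separate_filesystem /opt/zimbra/store /opt/zimbra/hsm || echo "HSM path shares a filesystem with the primary store"`. In the dry test below, that warning is exactly what you would expect to see.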

To do a "dry" test, I did the following.

As root.

[root@mail3 ~]# mkdir /opt/zimbra/hsm
[root@mail3 ~]# chown zimbra:zimbra /opt/zimbra/hsm

Then switch to zimbra.

 [root@mail3 ~]# su - zimbra
 [zimbra@mail3 ~]$ zmvolume -a -n hsm-volume -t secondaryMessage -p /opt/zimbra/hsm
 [zimbra@mail3 ~]$ zmvolume -l
   Volume id: 2
        name: index1
        type: index
        path: /opt/zimbra/index
  compressed: false
     current: true

   Volume id: 1
        name: message1
        type: primaryMessage
        path: /opt/zimbra/store
  compressed: false
     current: true

   Volume id: 3
        name: hsm-volume
        type: secondaryMessage
        path: /opt/zimbra/hsm
  compressed: false
     current: false

Set HSM Volume To Current

Now let's set the hsm-volume to "current". Otherwise, if you try to run zmhsm you'll get an error of "invalid request: None of the secondary message volumes are marked as current."

 [zimbra@mail3 ~]$ zmvolume -sc -id 3
 [zimbra@mail3 ~]$ zmvolume -l
   Volume id: 2
        name: index1
        type: index
        path: /opt/zimbra/index
  compressed: false
     current: true

   Volume id: 1
        name: message1
        type: primaryMessage
        path: /opt/zimbra/store
  compressed: false
     current: true

   Volume id: 3
        name: hsm-volume
        type: secondaryMessage
        path: /opt/zimbra/hsm
  compressed: false
     current: true

Starting HSM For First Time

This example is on a test server of mine. I don't have any messages older than 30 days. The default global configuration for HSM Age is 30 days.

[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
[zimbra@mail3 ~]$ crontab -l | grep -i hsm
[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 15:48:58 EDT 2008
End time:   Fri Aug 15 15:48:58 EDT 2008
Not currently running.
Moved 0 blobs dated earlier than Wed Jul 16 15:48:58 EDT 2008
  to volume 3.
Mailboxes processed: 7 out of 7.
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 30d

So it worked, but there was nothing old enough to move.
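The session report above is consistent with the cutoff being the session start time minus zimbraHsmAge: Aug 15 minus 30 days is Jul 16. Here is a small sketch of that arithmetic. The function names are mine, and treating a bare number as seconds is an assumption about how the attribute is parsed, not something I have confirmed in the documentation.

```shell
#!/bin/sh
# Convert a duration such as 30d, 12h, 45m or 90s to seconds.
# ASSUMPTION: a bare number is treated as seconds.
hsm_age_seconds() {
    case $1 in
        *d) echo $(( ${1%d} * 86400 )) ;;
        *h) echo $(( ${1%h} * 3600 )) ;;
        *m) echo $(( ${1%m} * 60 )) ;;
        *s) echo $(( ${1%s} )) ;;
        *)  echo $(( $1 )) ;;
    esac
}

# Cutoff = session start (epoch seconds) minus the age threshold.
hsm_cutoff_epoch() {
    echo $(( $1 - $(hsm_age_seconds "$2") ))
}

# 1218829738 is Fri Aug 15 15:48:58 EDT 2008, the start time in the session above.
hsm_cutoff_epoch 1218829738 30d
```

The result, 1216237738, is exactly 30 days earlier, which matches the "dated earlier than Wed Jul 16 15:48:58 EDT 2008" line in the report.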

Adjusting the zimbraHsmAge

So let's adjust the zimbraHsmAge variable and try again.

[zimbra@mail3 ~]$ zmprov mcf zimbraHsmAge 1
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1
[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 15:56:39 EDT 2008
End time:   Fri Aug 15 15:56:40 EDT 2008
Not currently running.
Moved 63 blobs dated earlier than Fri Aug 15 15:56:38 EDT 2008
  to volume 3.
Mailboxes processed: 7 out of 7.
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
0
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/0/
1  14  15  3

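The above commands modified zimbraHsmAge as a global setting. Note that the value 1 was given without a unit suffix (the default is written as 30d), and the session stats above show a cutoff one second before the start time, so a bare number appears to be interpreted as seconds rather than days. You can also modify it on a per-server basis.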

[zimbra@mail3 ~]$ zmprov gs mail3.internal.homeunix.com | grep zimbraHsmAge
zimbraHsmAge: 1
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1
[zimbra@mail3 ~]$ zmprov ms mail3.internal.homeunix.com zimbraHsmAge 30
[zimbra@mail3 ~]$ zmprov gs mail3.internal.homeunix.com | grep zimbraHsmAge
zimbraHsmAge: 30
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1

Moving zimbraHsmAge Back To A Lower Number

Continuing after the above steps, let's see what happens if we now run zmhsm.

[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 16:09:05 EDT 2008
End time:   Fri Aug 15 16:09:05 EDT 2008
Not currently running.
Moved 0 blobs dated earlier than Fri Aug 15 16:08:35 EDT 2008
  to volume 3.
Mailboxes processed: 7 out of 7.

Did the message files get moved back?

[zimbra@mail3 ~]$ find /opt/zimbra/store/0/15 -name '*.msg' -print
[zimbra@mail3 ~]$ find /opt/zimbra/hsm/0/15 -name '*.msg' -print
/opt/zimbra/hsm/0/15/msg/0/269-21.msg
/opt/zimbra/hsm/0/15/msg/0/263-10.msg
/opt/zimbra/hsm/0/15/msg/0/268-14.msg
/opt/zimbra/hsm/0/15/msg/0/261-8.msg
/opt/zimbra/hsm/0/15/msg/0/266-13.msg
/opt/zimbra/hsm/0/15/msg/0/259-4.msg
/opt/zimbra/hsm/0/15/msg/0/265-12.msg
/opt/zimbra/hsm/0/15/msg/0/257-2.msg

Nope. I'm not sure what else to add here. Don't know if there's actually a way to move them back.
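To keep an eye on where a mailbox's blobs currently live, the find commands above can be wrapped into a small counter. This is just a convenience sketch; the function name is made up, and it only counts files, it does not move anything.

```shell
#!/bin/sh
# Count message blobs (*.msg files) under a directory.
# The pattern is quoted so the shell hands it to find unexpanded.
count_blobs() {
    find "$1" -type f -name '*.msg' | wc -l | tr -d ' '
}
```

For mailbox 15 on this test server, `count_blobs /opt/zimbra/store/0/15` and `count_blobs /opt/zimbra/hsm/0/15` would give the primary and secondary counts side by side.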

Now What? Place In Cron

You will have to manually add a line to the zimbra crontab to run the zmhsm command.

Something like the following at the end of the crontab (edit it with su - zimbra, then crontab -e):

# HSM
0 6 * * * /opt/zimbra/bin/zmhsm -t

This will run every morning at 0600. Each administrator has to decide what the right time to run this is. My initial thought is to kick it off after your backups. You might want two entries for the zmhsm command: one timed after your daily incremental backups and another for your full backup days.
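If zmhsm is going to run unattended, a wrapper script might first check that a secondary volume is marked current (to avoid the "None of the secondary message volumes are marked as current" error) and later log how many blobs were moved. The two helpers below are a sketch of that idea. Their names are made up, and they simply parse the zmvolume -l and zmhsm -u output formats shown earlier on this page, which may differ between ZCS versions.

```shell
#!/bin/sh
# Both helpers read command output on stdin, so they can be tried without Zimbra.

# Succeeds if `zmvolume -l` output lists a secondaryMessage volume with current: true.
# Relies on "type:" preceding "current:" within each volume, as in the listings above.
has_current_secondary() {
    awk '
        $1 == "type:"    { type = $2 }
        $1 == "current:" && type == "secondaryMessage" && $2 == "true" { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Prints the number from the "Moved N blobs ..." line of `zmhsm -u` output.
hsm_moved_count() {
    sed -n 's/^Moved \([0-9][0-9]*\) blobs.*/\1/p'
}
```

In a wrapper this might look like `zmvolume -l | has_current_secondary && zmhsm -t`, with `zmhsm -u | hsm_moved_count` run some time later, since zmhsm -t only starts the session and returns right away.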

Checking For Or Deploying The HSM Zimlet

In the Admin web console, you'll see the reference to the HSM zimlets in this path:

Configuration > Admin Extensions > com_zimbra_hsm

To deploy the zimlet for HSM:

su - zimbra
zmzimletctl deploy /opt/zimbra/zimlets-network/com_zimbra_hsm.zip

What Can I Do For HSM In The Admin Web Console?

Setup The HSM Volume

Adjust The zimbraHsmAge Date

Per Server
Global

References From Official Documentation

zmhsm - command for HSM

Please see:

http://www.zimbra.com/docs/ne/latest/administration_guide/A_app-command-line.17.19.html

zmvolume - command for volumes

Please see:

http://wiki.zimbra.com/index.php?title=CLI_zmvolume

Global HSM Session Setting

Reference: http://www.zimbra.com/docs/ne/latest/administration_guide/Managing_ZCS.10.1.html#1111022

Global Settings HSM (Hierarchical Storage Management) sets the default message age threshold to 30 days. The HSM global setting is the default unless you change the schedule in the Server configuration. See "Scheduling HSM Sessions".

Scheduling HSM Sessions

Reference: http://www.zimbra.com/docs/ne/latest/administration_guide/Managing_ZCS.10.1.html#1111045

HSM can be configured for secondary storage volumes for older messages. Messages and attachments are moved from a primary volume to the current secondary volume based on the age of the message. Users are not aware of any change and do not see any noticeable difference when opening an older message that has been moved.

To manage your email storage resources, you can implement a different HSM policy for each mailbox server. The message age threshold for HSM is set globally on the HSM tab or for individual servers from the Server Volume tab. The default is 30 days. The thresholds configured on individual servers override the threshold configured as the global setting. Sessions to move messages to the secondary volume are scheduled in your cron table. From the administration console, when you select a server, you can manually start a session, monitor sessions, and abort sessions that are in progress from the Volumes tab.

When you abort a session and then restart the process, the HSM session looks for entries in the primary store that meet the HSM age criteria. Any entries that were moved in the previous run would be excluded, as they would no longer exist in the primary store.
