Ajcody-HSM-Notes: Difference between revisions
Revision as of 18:17, 9 February 2015
HSM Or Hierarchical Storage Management
Actual HSM Or Hierarchical Storage Management Homepage
Please see Ajcody-HSM-Notes
General Q&A
What's It Look Like - Big Picture
HSM requires a new "volume" on each mailstore, i.e. the mailbox server. A Zimbra mailbox server starts with dedicated volumes for the following (default paths in brackets):
Reference is Zimbra Mailbox Server
- Message Store [/opt/zimbra/store]
- All email messages reside, including the message body and any file attachments.
- Messages are stored in MIME format.
- Each mailbox has a dedicated directory named after its internal Zimbra mailbox ID.
- Note: Mailbox IDs are unique per server, not system-wide.
- HSM Message Store (optional) [there is no default path, you use whatever partition you make for it]
- Hierarchical Storage Management (HSM) allows you to configure storage volumes for older messages.
- To manage your email storage resources, you can implement a different HSM policy for each message server.
- Messages and attachments are moved from a primary volume to the current secondary volume based on the age of the message.
- The messages are still accessible.
- Single-Copy Message Storage
- Single copy storage allows messages with multiple recipients to be stored only once in the file system.
- On UNIX systems, the mailbox directory for each user contains a hard link to the actual file.
- Note: this is limited by the Postfix variable default_destination_recipient_limit, which defaults to 50. For more details, see Ajcody-Hardlinks-And-Postfix-default_destination_recipient_limit
- Index Store [/opt/zimbra/index]
- Data (MySQL) Store [/opt/zimbra/db]
- Backup [/opt/zimbra/backup]
- Log files [/opt/zimbra/log].
Backup And HSM
Please see Bugs/RFE:
- "RFE: HSM and backup should not run at the same time if initated."
- "Separate Restore Mechanism for Primary/Secondary Volumes" [Helix]
- http://bugzilla.zimbra.com/show_bug.cgi?id=18566
- "add backup/restore for HSM only" [marked as duplicate of above]
HSM Impact With Backups
Zimbra mailbox servers cannot see, read, or write to another Zimbra server. The HSM store data is integrated into the mailbox server's normal backup process; they are NOT separate processes. If the HSM disks are lengthening your backup times and you need shorter backups, please look at the following:
- Improving backup performance issues
- See Ajcody-Notes-ServerPlanning#What_About_Backups.3F_I_Need_A_Plan
- Specifically look at Auto-group Backups
- User-submitted bug below. I don't think this is going to be possible, but activity on the bug might push the devs and PMs to pursue other options. I believe the current answer to the issue is the one above, Auto-group backups.
- Add backup/restore for HSM only
- http://bugzilla.zimbra.com/show_bug.cgi?id=28200
- Which was marked as a duplicate of the following RFE
- "Separate Restore Mechanism for Primary/Secondary Volumes"
HSM Impact To Server Performance
HSM currently iterates all mailboxes without pausing. This can result in a big disk/CPU hit. Please see below for more details.
- HSM should run as a background thread
HSM Running During ZCS Restarts
I created this RFE:
- "RFE: batch HSM option to avoid large redolog operations if interrupted"
A customer reported that mailboxd was manually restarted during their HSM operation. This resulted in extended downtime during the Zimbra start process, because the HSM activity was in the redologs and had to be replayed before mailboxd was fully running.
How Does HSM Determine When To Move Message?
Taken from an internal thread within bugzilla.
Question is:
- What date/timestamp does HSM use to determine when to move the message? The time a message is injected or the date/time in the Date: header?
Answer is:
- HSM uses the date that's stored in the database. This is either the time that the message was added or the value specified in the X-Zimbra-Received header.
HSM And Attachments - Any Options?
If you would like to remove attachments (to another type of storage), please see the RFE/bug below. Comment #9 says:
- Today most customers use our built-in HSM to allow for very large mailboxes but
- use cheaper storage. We don't have a way to strip attachments but are looking
- at some options of providing a way to move large attachments optionally to
- either online or offline storage.
Please see the RFE/bug below. Vote on this if you like it.
- Ability to remove attachment from received message
Aging Policy Options For HSM Data
This is needed; otherwise your HSM volume would grow indefinitely.
Please see the following RFE and vote on it.
- Zimbra Message Store & HSM Aging Policies
What Doesn't Get HSM'd?
It's basically messages that get HSM'd. Here are some RFEs/bugs I've found for missing items.
Wiki Items
RFE filed, please see:
- Zmhsm does not move wiki blobs
Document & Wiki Version Items
Please see:
- HSM should be able to handle Document revisions.
Briefcase Items
RFE filed, please see:
- RFE: moving briefcase items to secondary storage during HSM process
Does The Mailbox Go Into Maintenance Mode?
This was fixed in 5.0.3. Please see the following bug.
- HSM should not put mailbox in maintenance mode
HSM Logging
I found this RFE, it might prove useful.
- HSM logging improvements
More Than One HSM Volume (Secondary Message Store)
There is an RFE for this, please see the following:
- Add support for more than one current secondary storage volume in HSM
HSM/Secondary Volume for Spam & Junk
I'm not sure exactly what the details and dependencies are with this bug. I added a comment for clarity.
Please see the following:
- Junk mail storage (secondary/HSM volume for Spam messages)
Consistency Checking Tool For HSM
This is available with zmblobchk, which checks messages in general.
Please see the RFE that it was built for, for more information:
- Tool to do consistency checks and repair for missing blob for ID x
What If HSM Volume Becomes Full?
Q: What would happen if the HSM volume filled up while the HSM process was moving messages from the primary store to HSM store? Would it detect the full volume and abort the transaction(s) or would it keep trying? Is it possible that any mail would be lost if the HSM store filled up during an HSM run?
- A. It's transactional, so it will fail gracefully. More specifically, if anything goes wrong during the file copying process, we delete any copied files and abort. The volume id of messages processed before the failure remains the same. HSM runs one mailbox at a time, so the rollback only happens for the last mailbox.
Restoring HSM Volumes - RFE
Please see:
- "Separate Restore Mechanism for Primary/Secondary Volumes"
- http://bugzilla.zimbra.com/show_bug.cgi?id=18566
- Work around in the meantime is in bug notes [private]. Support can help you with this.
A How-To Example - CLI
Introduction
This is a testing example. You should adjust these steps to use a REAL PARTITION and NOT A DIRECTORY in the steps listed below.
Create The HSM Volume
Normally the HSM volume would be a separate partition on the server, on lower-performance disks than the partition the mailstore is using.
To do a "dry" test, I did the following.
As root.
[root@mail3 ~]# mkdir /opt/zimbra/hsm
[root@mail3 ~]# chown zimbra:zimbra /opt/zimbra/hsm
Then switch to zimbra.
[root@mail3 ~]# su - zimbra
[zimbra@mail3 ~]$ zmvolume -a -n hsm-volume -t secondaryMessage -p /opt/zimbra/hsm
[zimbra@mail3 ~]$ zmvolume -l
   Volume id: 2
        name: index1
        type: index
        path: /opt/zimbra/index
  compressed: false
     current: true

   Volume id: 1
        name: message1
        type: primaryMessage
        path: /opt/zimbra/store
  compressed: false
     current: true

   Volume id: 3
        name: hsm-volume
        type: secondaryMessage
        path: /opt/zimbra/hsm
  compressed: false
     current: false
Set HSM Volume To Current
Now let's set the hsm-volume to "current". Otherwise, if you try to run zmhsm you'll get an error of "invalid request: None of the secondary message volumes are marked as current."
[zimbra@mail3 ~]$ zmvolume -sc -id 3
[zimbra@mail3 ~]$ zmvolume -l
   Volume id: 2
        name: index1
        type: index
        path: /opt/zimbra/index
  compressed: false
     current: true

   Volume id: 1
        name: message1
        type: primaryMessage
        path: /opt/zimbra/store
  compressed: false
     current: true

   Volume id: 3
        name: hsm-volume
        type: secondaryMessage
        path: /opt/zimbra/hsm
  compressed: false
     current: true
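If you want to script the "is there a current secondary volume?" check before calling zmhsm, something like the sketch below might do. It only parses text shaped like the zmvolume -l listing shown above; the function name current_hsm_volume_id is my own invention (not a Zimbra tool), and the field layout is assumed to match this example, which may differ between ZCS versions.

```shell
# Hypothetical helper: reads `zmvolume -l` style output on stdin and prints
# the id of every secondaryMessage volume that is marked current.
# Returns non-zero when no such volume is found.
current_hsm_volume_id() {
    awk '
        /Volume id:/ { id = $NF; type = "" }   # start of a new volume record
        /type:/      { type = $NF }            # remember the volume type
        /current:/   {
            if (type == "secondaryMessage" && $NF == "true") {
                print id
                found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Intended use on a mailbox server, as the zimbra user (assumption):
#   zmvolume -l | current_hsm_volume_id || echo "no current HSM volume"
```

The function never calls zmvolume itself, so you can try it against a saved copy of the listing first.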
Starting HSM For First Time
This example is on a test server of mine. I don't have any messages older than 30 days. The default global configuration for HSM Age is 30 days.
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
[zimbra@mail3 ~]$ crontab -l | grep -i hsm
[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 15:48:58 EDT 2008
End time: Fri Aug 15 15:48:58 EDT 2008
Not currently running.
Moved 0 blobs dated earlier than Wed Jul 16 15:48:58 EDT 2008 to volume 3.
Mailboxes processed: 7 out of 7.
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 30d
So, it worked but didn't have anything to Age.
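For monitoring, it can be handy to pull the blob count out of the status text rather than eyeball it. The sketch below assumes the "Moved N blobs ..." line looks exactly like the zmhsm -u output in this example; hsm_moved_blobs is a made-up helper name, and other ZCS versions may word the line differently.

```shell
# Hypothetical helper: reads `zmhsm -u` style output on stdin and prints
# the number from the "Moved N blobs ..." line (nothing if no such line).
hsm_moved_blobs() {
    sed -n 's/^ *Moved \([0-9][0-9]*\) blobs.*/\1/p'
}

# Intended use, as the zimbra user (assumption):
#   zmhsm -u | hsm_moved_blobs
```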
Adjusting the zimbraHsmPolicy variable
Default policy [ZCS 8.6] :
$ zmprov gacf zimbraHsmPolicy
zimbraHsmPolicy: message,document:before:-30days

$ zmprov desc -a zimbraHsmPolicy
zimbraHsmPolicy
    The policy that determines which mail items get moved to secondary
    storage during HSM. Each value specifies a comma-separated list of
    item types and the search query used to select items to move. See the
    spec for <SearchRequest> for the complete list of item types and
    query.txt for the search query spec.

               type : string
              value :
           callback :
          immutable : false
        cardinality : multi
         requiredIn :
         optionalIn : globalConfig,server
              flags : serverInherited
           defaults : message,document:before:-30days
                min :
                max :
                 id : 1024
    requiresRestart :
              since : 6.0.0_BETA2
    deprecatedSince :
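The default value above has the shape "item types, a colon, then a search query". As a rough sketch, a policy string of that same shape could be assembled like this before handing it to zmprov. The helper name hsm_policy is made up, and it only covers the simple "older than N days" case seen in the default; anything more elaborate should be checked against query.txt.

```shell
# Hypothetical helper: build a zimbraHsmPolicy value of the same shape as
# the default ("message,document:before:-30days").
#   $1 = comma-separated item types, $2 = age in whole days
hsm_policy() {
    case "$2" in
        ''|*[!0-9]*)
            echo "age must be a whole number of days" >&2
            return 1
            ;;
    esac
    printf '%s:before:-%sdays\n' "$1" "$2"
}

# Possible use, as the zimbra user (assumption; 60 is just an example age):
#   zmprov mcf zimbraHsmPolicy "$(hsm_policy message,document 60)"
```

Printing the string first and pasting it into zmprov by hand is the safer way to try this on a test server.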
Adjusting the zimbraHsmAge variable - Deprecated since 6.0.0_BETA2 in favor of zimbraHsmPolicy
$ zmprov desc -a zimbraHsmAge
zimbraHsmAge
    Deprecated since: 6.0.0_BETA2. deprecated in favor for
    zimbraHsmPolicy. Orig desc: Minimum age of mail items whose filesystem
    data will be moved to secondary storage.. Must be in valid duration
    format: {digits}{time-unit}. digits: 0-9, time-unit: [hmsd]|ms. h -
    hours, m - minutes, s - seconds, d - days, ms - milliseconds. If time
    unit is not specified, the default is s(seconds).

               type : duration
              value :
           callback :
          immutable : false
        cardinality : single
         requiredIn :
         optionalIn : globalConfig,server
              flags : serverInherited
           defaults : 30d
                min : 0
                max :
                 id : 8
    requiresRestart :
              since :
    deprecatedSince : 6.0.0_BETA2
zimbraHsmAge must be a valid duration of: nnn[hsmd]
- "zimbraHsmAge variables unclear from output"
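Since a value with no unit is read as seconds, it is easy to set a far shorter age than intended. Here is a small sketch of a sanity check based only on the duration format quoted in the attribute description ({digits}{time-unit}, units h, m, s, d or ms). Both function names are hypothetical; this is not how Zimbra itself validates the value.

```shell
# Hypothetical helper: succeed if $1 is digits followed by an optional
# unit (h, m, s, d or ms), per the attribute description.
is_valid_duration() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(ms|[hmsd])?$'
}

# Hypothetical helper: spell out how a value would be read.
# A bare number falls through to the default unit, seconds.
describe_duration() {
    if ! is_valid_duration "$1"; then
        echo "invalid"
        return 1
    fi
    case "$1" in
        *ms) echo "${1%ms} milliseconds" ;;
        *h)  echo "${1%h} hours" ;;
        *m)  echo "${1%m} minutes" ;;
        *s)  echo "${1%s} seconds" ;;
        *d)  echo "${1%d} days" ;;
        *)   echo "$1 seconds" ;;
    esac
}
```

For example, describe_duration 30d reports days, while describe_duration 1 reports seconds, which is the trap this section is about.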
[zimbra@mail3 ~]$ zmprov mcf zimbraHsmAge 30d
### example initially had to reproduce issue for my RFE - please don't use 1d ###
### [zimbra@mail3 ~]$ zmprov mcf zimbraHsmAge 1d ###
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1
[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 15:56:39 EDT 2008
End time: Fri Aug 15 15:56:40 EDT 2008
Not currently running.
Moved 63 blobs dated earlier than Fri Aug 15 15:56:38 EDT 2008 to volume 3.
Mailboxes processed: 7 out of 7.
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/
0
[zimbra@mail3 ~]$ ls /opt/zimbra/hsm/0/
1 14 15 3
The above commands modified zimbraHsmAge as a global setting. You can also modify it per server.
[zimbra@mail3 ~]$ zmprov gs mail3.internal.homeunix.com | grep zimbraHsmAge
zimbraHsmAge: 1d
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1d
[zimbra@mail3 ~]$ zmprov ms mail3.internal.homeunix.com zimbraHsmAge 30d
[zimbra@mail3 ~]$ zmprov gs mail3.internal.homeunix.com | grep zimbraHsmAge
zimbraHsmAge: 30d
[zimbra@mail3 ~]$ zmprov gacf | grep zimbraHsmAge
zimbraHsmAge: 1d
Moving zimbraHsmAge Back To A Lower Number
Continuing after the above steps, let's see what happens if we now run zmhsm.
[zimbra@mail3 ~]$ zmhsm -t
HSM process started.
[zimbra@mail3 ~]$ zmhsm -u
Last HSM Session Stats
----------------------
Start time: Fri Aug 15 16:09:05 EDT 2008
End time: Fri Aug 15 16:09:05 EDT 2008
Not currently running.
Moved 0 blobs dated earlier than Fri Aug 15 16:08:35 EDT 2008 to volume 3.
Mailboxes processed: 7 out of 7.
Did the message files get moved back?
[zimbra@mail3 ~]$ find /opt/zimbra/store/0/15 -name '*.msg' -print
[zimbra@mail3 ~]$ find /opt/zimbra/hsm/0/15 -name '*.msg' -print
/opt/zimbra/hsm/0/15/msg/0/269-21.msg
/opt/zimbra/hsm/0/15/msg/0/263-10.msg
/opt/zimbra/hsm/0/15/msg/0/268-14.msg
/opt/zimbra/hsm/0/15/msg/0/261-8.msg
/opt/zimbra/hsm/0/15/msg/0/266-13.msg
/opt/zimbra/hsm/0/15/msg/0/259-4.msg
/opt/zimbra/hsm/0/15/msg/0/265-12.msg
/opt/zimbra/hsm/0/15/msg/0/257-2.msg
Nope. I'm not sure what else to add here. Don't know if there's actually a way to move them back.
RFE - To Move Msg's Back To Primary Store
Please see:
- "HSM - should move messages back to main store if date to HSM was changed" [marked as WONTFIX]
It is possible to move blobs between volumes using the zmsoap command for ZCS 6 and above. Please read the source documents for this on your ZCS server in /opt/zimbra/docs; the two files are soap-admin.txt and soap.txt. The reference you're looking for is MoveBlobsRequest, in soap-admin.txt. To build the proper query, you'll also want to consult soap.txt. As of ZCS 8, it reads as:
<MoveBlobsRequest types="{types}" sourceVolumeIds="{volume-ids}" destVolumeId="{volume-id}" [maxBytes="{n}"]>
  [<query>{search-query}</query>]
</MoveBlobsRequest>

<MoveBlobsResponse numBlobsMoved="{n}" numBytesMoved="{n}" totalMailboxes="{n}"/>

Moves blobs between volumes. Unlike HsmRequest, this request is synchronous, and reads parameters from the request attributes instead of zimbraHsmPolicy.

types: a comma-separated list of item types, or "all" for all types. See the spec for <SearchRequest> for details.
volume-ids: a comma-separated list of volume ids.
query: if specified, only items that match this query will be moved.
maxBytes: Limit for the total number of bytes of data to move. Blob move will abort if this threshold is exceeded.
Some examples are listed below. Please check your volume IDs with zmvolume -l first; you will NOT want to run these examples without updating the volume IDs. The HSM policy and MoveBlobsRequest are search based, so you can add a query like this:
zmsoap -z MoveBlobsRequest @types=all @sourceVolumeIds=3 @destVolumeId=4 query=is:anywhere
An "is:anywhere" query will look in all folders, including /Trash for messages to move.
A query just in junk would be like the following below. This will move every blob for every mailbox located in volume 1 to volume 3 that is under the junk folder.
zmsoap -z MoveBlobsRequest @sourceVolumeIds=1 @destVolumeId=3 query=in:junk
Or you can specify a date:
zmsoap -z MoveBlobsRequest @sourceVolumeIds=1 @destVolumeId=3 query=before:1/1/2012
MoveBlobsRequest will move the blob and update mysql mail_item table accordingly.
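Because the volume IDs are so easy to get wrong, one option is to print the zmsoap command for review instead of running it directly. The sketch below does only that. The function name move_blobs_cmd is made up, the argument layout simply mirrors the examples above, and the IDs in the usage comment assume volume 3 is the HSM volume and volume 1 the primary store, as in this test setup.

```shell
# Hypothetical helper: print (do not run) a MoveBlobsRequest command line.
#   $1 = source volume id(s), $2 = destination volume id, $3 = search query
move_blobs_cmd() {
    if [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
        echo "usage: move_blobs_cmd SOURCE_IDS DEST_ID QUERY" >&2
        return 1
    fi
    if [ "$1" = "$2" ]; then
        echo "source and destination volume must differ" >&2
        return 1
    fi
    printf 'zmsoap -z MoveBlobsRequest @types=all @sourceVolumeIds=%s @destVolumeId=%s query=%s\n' \
        "$1" "$2" "$3"
}

# Example: show the command that would move junk-folder blobs from
# volume 3 back to volume 1 (IDs are assumptions, check zmvolume -l):
#   move_blobs_cmd 3 1 in:junk
```

A query containing spaces would need quoting when the printed command is pasted back into a shell; this sketch does not handle that.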
Old Note Below
The desperate could review the following. This would be an unsupported procedure. I'll try to engage developers on this and get some feedback though:
Now What? Place In Cron
You will have to manually put in the zimbra crontab file a line to run the zmhsm command.
Something like the following at the end of the crontab [ su - zimbra ; crontab -e ] :
# HSM
0 6 * * * /opt/zimbra/bin/zmhsm -t
This will run every morning at 06:00. Each administrator has to decide what the right time to run this is. My initial thought is to kick it off after your backups. You might want two entries for the zmhsm command: one timed after your daily incremental backups and another for your full backup days.
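As a sketch of the "two entries" idea, the helper below just prints crontab lines for a given hour and day-of-week field. The name hsm_cron_line and the hours and days in the usage comments are placeholders of mine; pick values that actually fall after your own incremental and full backups.

```shell
# Hypothetical helper: print one crontab line that starts an HSM session.
#   $1 = hour (0-23), $2 = cron day-of-week field (e.g. "*", "1-6", "0")
hsm_cron_line() {
    printf '0 %s * * %s /opt/zimbra/bin/zmhsm -t\n' "$1" "$2"
}

# Example schedule (times are assumptions, not recommendations):
#   hsm_cron_line 6 1-6    # Monday-Saturday, after the incremental backup
#   hsm_cron_line 10 0     # Sunday, later, after the full backup
```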
Checking For The HSM Zimlet
In the Admin web console, you'll see the reference to the HSM zimlets in this path:
Configuration > Admin Extensions > com_zimbra_hsm
Deploying The HSM Zimlet
To deploy the zimlet for HSM
su - zimbra
zmzimletctl deploy /opt/zimbra/zimlets-network/com_zimbra_hsm.zip
Log into the admin web console - fresh session.
What Can I Do For HSM In The Admin Web Console?
Confirm the HSM zimlet is installed first.
Setup The HSM Volume
Configuration > Servers > [Server For HSM Volume]
Then select the Volumes tab on the right-hand section.
Clicking on "Add", you'll be given a drop down chooser for "Volume Type". For HSM, you would select "Secondary Message".
The "Assign Current Volumes" section will show you what volume is in use for what function.
You will also see "HSM" on this page to set the zimbraHsmAge variable for the SERVER - rather than globally.
Adjust The zimbraHsmAge Date
Per Server
Configuration > Servers > [Server For HSM Volume]
Then select the Volumes tab on the right-hand section.
You will see "HSM" on this page to set the zimbraHsmAge variable for the SERVER - rather than globally.
Global
Configuration > Global Settings
Then select the HSM tab on the right-hand section.
This will set the global (default) message age for HSM.
Starting & Stopping HSM (zmhsm)
Configuration > Servers > [Server With HSM Volume]
Then, in the right-hand section, above the details area for the server, you'll see an HSM button/tab. Click on this and you'll be given the option to Start HSM Session. It will output progress details.
References From Official Documentation
zmhsm - command for HSM
Please see:
CLI_zmhsm_Network_Edition_only
zmvolume - command for volumes
Please see:
http://wiki.zimbra.com/index.php?title=CLI_zmvolume
Global HSM Session Setting
Reference: http://www.zimbra.com/docs/ne/latest/administration_guide/Managing_ZCS.10.1.html#1111022
Global Settings HSM (Hierarchical Storage Management) sets the default message age threshold to 30 days. The HSM global setting is the default unless you change the schedule in the Server configuration. See “Scheduling HSM Sessions” .
Scheduling HSM Sessions
Reference: http://www.zimbra.com/docs/ne/latest/administration_guide/Managing_ZCS.10.1.html#1111045
HSM can be configured for secondary storage volumes for older messages. Messages and attachments are moved from a primary volume to the current secondary volume based on the age of the message. Users are not aware of any change and do not see any noticeable difference when opening an older message that has been moved.
To manage your email storage resources, you can implement a different HSM policy for each mailbox server. The message age threshold for HSM is set globally on the HSM tab or for individual servers from the Server Volume tab. The default is 30 days. The thresholds configured on individual servers override the threshold configured as the global setting. Sessions to move messages to the secondary volume are scheduled in your cron table. From the administration console, when you select a server, you can manually start a session, monitor sessions, and abort sessions that are in progress from the Volumes tab.
When you abort a session and then restart the process, the HSM session looks for entries in the primary store that meet the HSM age criteria. Any entries that were moved in the previous run would be excluded, as they would no longer exist in the primary store.