Ajcody Draft 8.8 Beta Missing Blob Walkthrough Example


Walkthrough for fixing missing blobs

We'll be using the following NG commands:

  • [1st] zxsuite hsm docheckblobs
  • [2nd] zxsuite backup dorestoreblobs
  • [3rd] zxsuite hsm dodeduplicate


NOTE - “hsm” should be thought of as Zimbra Volume Management. It is NOT specific to HSM volumes only; it’s for ALL volume management.

Initial Statements

  • NG Backup must be enabled, initialized, and have good backups.
  • NG HSM must be enabled. You can do this from the admin console or from the CLI with:
    • zxsuite hsm setProperty ZxPowerstore_EnableHsmOnNe true
  • The zxsuite backup doItemRestore is NOT the appropriate command to resolve missing blobs.
    • The zxsuite backup dorestoreblobs command is used to fix blob issues.
  • The following older NE commands aren’t needed and probably shouldn’t be run at all for this situation:
    • zmblobchk
    • zmmetadump
    • zmhsm
    • zmvolume
    • zmbackupquery
    • zmrestore
  • Old method of dealing with ‘missing blobs’ - Ajcody-Notes-No-Such-Blob
  • Useful reference on the zxsuite commands: Ajcody-Draft-8.8-Beta-zxsuite-NG-full-help-description

Start by manually removing a user’s blobs to replicate a “missing blob” error

THIS WALK THROUGH SHOULD BE DONE ON A TEST BOX - NOT A PRODUCTION SERVER

[zimbra@zcs86c7 0]$ zmprov gmi tester@example.com
mailboxId: 9
quotaUsed: 9530
  • Ajcody: zmprov gmi only accepts user@domain. zxsuite has no account query tool at the moment, but ideally either zmprov gmi gets expanded or zxsuite gains an account query tool that accepts the account name {tester@example.com}, the id/mailbox_id {9}, or the zimbra_id. It should also report the user's mailbox server and the message blob paths on the mailstore's volumes. zxsuite core getaccountstats and zxsuite backup getaccountinfo are nice tools but don’t report this info.
  • Current operation looks like this:


[zimbra@zcs86c7 0]$ zmprov gmi 9
ERROR: account.NO_SUCH_ACCOUNT (no such account: 9)
[zimbra@zcs86c7 0]$ zmprov gmi tester@example.com
mailboxId: 9
quotaUsed: 9530


  • If zmprov gmi is modified for this, maybe do it as an option, -verbose, so as not to break backward compatibility [other commands or scripting on systems].
  • Ajcody: RFE: zxsuite help output is inconsistent in its usage of {Account name or id}, {account}, {account_name}, {account_id}.


[zimbra@zcs86c7 0]$ cd /opt/zimbra/store/0/9/msg/0/

[zimbra@zcs86c7 0]$ rm -rf *.msg

[zimbra@zcs86c7 0]$ ls -la
total 0
drwxr-x---. 2 zimbra zimbra  6 Aug  6 15:34 .
drwxr-x---. 3 zimbra zimbra 14 Aug  6 04:49 ..

Doing a zxsuite hsm docheckblobs [Optional]

  • Ajcody: Confirm the zextras based tools can detect the missing blobs - zxsuite hsm docheckblobs.
  • Ajcody: The first time, I’ll run it with the --progress option. This is an option of the base zxsuite command, which is why it comes before the module name.


[zimbra@zcs86c7 0]$ zxsuite --progress hsm docheckblobs start mailbox_ids 9

        operationId                                         56e929bf-b5c1-4af8-b790-3bc207baf6c8
        server                                              zcs86c7.example.com
        log path                                            /opt/zimbra/log/op_CheckBlobs_56e929bf-b5c1-4af8-b790-3bc207baf6c8.log
== Notifications ==
Subject: HSM Notification, CheckBlobs started.
Date: 06/08/2017 15:40:00
Level: Information
Server: zcs86c7.example.com
Text:
This is an automated notification from ZxPowerstore about CheckBlobs.
Operation CheckBlobs Started.
Operation Id: 56e929bf-b5c1-4af8-b790-3bc207baf6c8
Operation Host: zcs86c7.example.com
Operation Log path: /opt/zimbra/log/op_CheckBlobs_56e929bf-b5c1-4af8-b790-3bc207baf6c8.log
Monitor Command: zxsuite hsm monitor 56e929bf-b5c1-4af8-b790-3bc207baf6c8
Operation requested by: zimbra
Network Modules NG Version: 2.5.4
commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zal Version: 2.0.0
Zal commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zimbra version: 8.8.0_BETA1_1712 20170615062344 20170615-0633 NETWORK
Volumes to check: all
Mailboxes to check: [9]
Search all volumes: true
Traced: false
Subject: HSM Notification, CheckBlobs completed.
Date: 06/08/2017 15:40:00
Level: Information
Server: zcs86c7.example.com
Text:
This is an automated notification from ZxPowerstore about CheckBlobs.
Operation CheckBlobs Completed.
Operation Id: 56e929bf-b5c1-4af8-b790-3bc207baf6c8
Operation Host: zcs86c7.example.com
Operation Log path: /opt/zimbra/log/op_CheckBlobs_56e929bf-b5c1-4af8-b790-3bc207baf6c8.log
Operation requested by: zimbra
Network Modules NG Version: 2.5.4
commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zal Version: 2.0.0
Zal commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zimbra version: 8.8.0_BETA1_1712 20170615062344 20170615-0633 NETWORK
Volumes to check: all
Mailboxes to check: [9]
Search all volumes: true
Traced: false
- stats -
                    checked volumes : 2/2
                    checked mailboxes : 0/8
                    checked items : 5
                    checked blobs : 0
                    correct items : 0
                    missing blobs : 5
                 unexpected blobs : 0
          blobs with wrong digest : 0
                    items per sec : 1000
  volumes with empty or no folder : 1
- notes -
Skipped index volume 2 - index1
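
Since the summary only appears in the --progress output, one option is to capture that output and pull the "- stats -" block out of it. A minimal sketch, assuming the output has been saved as text and keeps the layout shown above; parse_stats is a hypothetical helper, not a zxsuite feature.

```python
def parse_stats(progress_output):
    """Return {label: value} for the lines between '- stats -' and the next '- ... -' marker."""
    stats = {}
    in_stats = False
    for line in progress_output.splitlines():
        stripped = line.strip()
        if stripped == "- stats -":
            in_stats = True
            continue
        # Any other '- something -' marker (for example '- notes -') ends the block.
        if stripped.startswith("- ") and stripped.endswith(" -"):
            in_stats = False
            continue
        if in_stats and ":" in stripped:
            label, _, value = stripped.partition(":")
            stats[label.strip()] = value.strip()
    return stats


sample = """\
- stats -
                    checked volumes : 2/2
                    checked items : 5
                    missing blobs : 5
- notes -
Skipped index volume 2 - index1
"""

stats = parse_stats(sample)
if int(stats["missing blobs"]) > 0:
    print("missing blobs found:", stats["missing blobs"])
```

Values are kept as strings because some of them, such as "2/2", are not plain numbers.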


  • Ajcody: Now I’ll run it without --progress. Notice that the CLI output is not very helpful and doesn’t show the summary like --progress does. The summary data is also NOT written to the log file listed under log path, with or without --progress.


[zimbra@zcs86c7 0]$ zxsuite hsm docheckblobs start mailbox_ids 9

        operationId                                         611a93cc-29a3-49c5-be1a-1e568f3e6ae2
        server                                              zcs86c7.example.com
        log path                                            /opt/zimbra/log/op_CheckBlobs_611a93cc-29a3-49c5-be1a-1e568f3e6ae2.log


  • Ajcody: RFE: Commands should present enough usable information in the CLI output, as would normally be expected when there’s an ‘issue’.
  • Ajcody: RFE: If ‘summary’ output is given on the CLI, it should also be written to the log file when the command uses one or logs to a log [mailbox.log for example].
  • Ajcody: RFE: Summary info should list which Zimbra volumes have blobs with problems, since this is required for: zxsuite backup doRestoreBlobs {volume_id} [attr1 value1 [attr2 value2...]]
  • Ajcody: RFE: Note that doRestoreBlobs from the backup module wants volumes by their id [ {volume_id} ], but the hsm module commands report and want volumes by their name [ {volume_name} ].


[zimbra@zcs86c7 0]$ grep "WARN Missing Blob" /opt/zimbra/log/op_CheckBlobs_611a93cc-29a3-49c5-be1a-1e568f3e6ae2.log

2017-08-06 15:40:24,429 WARN Missing Blob file: Mailbox id: 9 - Item id: 257 - Revision: 5 - Blob path: /opt/zimbra/store/0/9/msg/0/257-5.msg - Size: 798 - Digest: AMNQuJ8+mlZ30r14k3mFuBbCBVh2qNjpJYnzLGpd8I8=
2017-08-06 15:40:24,429 WARN Missing Blob file: Mailbox id: 9 - Item id: 260 - Revision: 100 - Blob path: /opt/zimbra/store/0/9/msg/0/260-100.msg - Size: 2163 - Digest: 6o3ziCVjX4nMppLqzpOcpqbFlW6Vue,GgFYEZUesovY=
2017-08-06 15:40:24,429 WARN Missing Blob file: Mailbox id: 9 - Item id: 261 - Revision: 101 - Blob path: /opt/zimbra/store/0/9/msg/0/261-101.msg - Size: 2178 - Digest: qkwXGkwD0M3ONswoTsH9GnbjF8FTRfCFnJ9PrZo3VEY=
2017-08-06 15:40:24,429 WARN Missing Blob file: Mailbox id: 9 - Item id: 262 - Revision: 102 - Blob path: /opt/zimbra/store/0/9/msg/0/262-102.msg - Size: 2179 - Digest: aAgpjPfqz4WrYC,+aNhlavR8w8zCraErpxEak59loRQ=
2017-08-06 15:40:24,429 WARN Missing Blob file: Mailbox id: 9 - Item id: 263 - Revision: 103 - Blob path: /opt/zimbra/store/0/9/msg/0/263-103.msg - Size: 2212 - Digest: DUlwGRa9,MEySRithtp3JIlUt6bVcrrezmp+0JxMMy8=
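
Because the CLI output carries no summary, the log is the only place to see which items are affected. The grep above can be taken a step further by turning each "WARN Missing Blob" line into a record. A minimal sketch, assuming the log line format shown above stays the same; the regular expression and parse_missing_blobs are hypothetical and only cover this one message type.

```python
import re

# One "WARN Missing Blob file" line, as it appears in the op_CheckBlobs log above.
MISSING_BLOB = re.compile(
    r"WARN Missing Blob file: Mailbox id: (?P<mailbox_id>\d+) - "
    r"Item id: (?P<item_id>\d+) - Revision: (?P<revision>\d+) - "
    r"Blob path: (?P<path>\S+) - Size: (?P<size>\d+) - Digest: (?P<digest>\S+)"
)


def parse_missing_blobs(log_text):
    """Return one dict per missing-blob warning found in the log text."""
    records = []
    for match in MISSING_BLOB.finditer(log_text):
        record = match.groupdict()
        for key in ("mailbox_id", "item_id", "revision", "size"):
            record[key] = int(record[key])
        records.append(record)
    return records


sample = (
    "2017-08-06 15:40:24,429 WARN Missing Blob file: Mailbox id: 9 - Item id: 257 - "
    "Revision: 5 - Blob path: /opt/zimbra/store/0/9/msg/0/257-5.msg - Size: 798 - "
    "Digest: AMNQuJ8+mlZ30r14k3mFuBbCBVh2qNjpJYnzLGpd8I8=\n"
)

for record in parse_missing_blobs(sample):
    print(record["mailbox_id"], record["item_id"], record["path"])
# 9 257 /opt/zimbra/store/0/9/msg/0/257-5.msg
```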


Getting Zimbra Volume IDs

  • Ajcody: RFE: Since the blob problems found by the query commands are resolved with ‘zxsuite backup dorestoreblobs’, which requires the Zimbra volume id [not name], it seems reasonable for the query commands to state the volumes where they identified blob issues. Otherwise, you have to do another step [zxsuite hsm getallvolumes] before you can resolve the issues.
  • Note - Grab the volume name and the volume id since the next commands require both.


[zimbra@zcs86c7 0]$ zxsuite hsm getAllVolumes

        primaries                               
                id                                                          1
                name                                                        message1
                path                                                        /opt/zimbra/store
                compressed                                                  false
                threshold                                                   4096
                storeType                                                   LOCAL
                isCurrent                                                   true
                volumeType                                                  primary
        secondaries                             
        indexes                                 
                id                                                          2
                name                                                        index1
                path                                                        /opt/zimbra/index
                compressed                                                  false
                threshold                                                   4096
                storeType                                                   LOCAL
                isCurrent                                                   true
                volumeType                                                  index
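
Since one of the next commands takes the volume id and the other the volume name, it can help to turn the getAllVolumes output into a name-to-id lookup. A minimal sketch, assuming the output keeps the layout shown above (section names alone on a line, then key/value pairs starting with id); parse_volumes is a hypothetical helper, not a zxsuite feature.

```python
def parse_volumes(output):
    """Parse 'zxsuite hsm getAllVolumes' text into a list of dicts, one per volume."""
    volumes = []
    section = None
    for line in output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if len(tokens) == 1:
            # A lone word such as 'primaries' or 'indexes' starts a new section.
            # (A key with an empty value would be misread as a section here.)
            section = tokens[0]
            continue
        key, value = tokens[0], " ".join(tokens[1:])
        if key == "id":
            # Each volume's attribute list starts with its id.
            volumes.append({"section": section})
        if volumes:
            volumes[-1][key] = value
    return volumes


sample = """\
        primaries
                id                                                          1
                name                                                        message1
                path                                                        /opt/zimbra/store
        secondaries
        indexes
                id                                                          2
                name                                                        index1
                path                                                        /opt/zimbra/index
"""

name_to_id = {v["name"]: v["id"] for v in parse_volumes(sample)}
print(name_to_id)
# {'message1': '1', 'index1': '2'}
```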


Repairing The Blobs By Running - zxsuite backup dorestoreblobs

  • Ajcody: I’ll run zxsuite backup dorestoreblobs twice, once with --progress and once without. As already noted above, the zxsuite commands need to present more useful information in the CLI output.
  • Ajcody: RFE: Remove the “clutter” from --progress output. It makes it difficult to identify key/critical information. [see next rfe]
  • Ajcody: RFE: Please note below that only --progress reveals the problem with this command: it is a dry run by default, and the plain CLI output gives no indication of that.
    • Line from --progress: “This was a dry-run, nothing was actually restored, please run with parameter dryrun false to actually restore blobs”
    • I might argue that dryrun should be mandatory BUT have no default value. This way, the admin clearly knows what is about to happen prior to running the command.
  • Ajcody: RFE: Note, zxsuite backup doRestoreBlobs uses dryrun for the variable name and defaults to TRUE. BUT zxsuite hsm doDeduplicate uses dry_run and defaults to FALSE. These are the only two commands that use “dry*run”; I recommend making the variable name and the behavior consistent. I also recommend it be “required” with no default value, so the CLI tells the admin that they must explicitly set dry*run to FALSE or TRUE. Allow the config command to override this with a sticky/global default value if they want one, though.


[zimbra@zcs86c7 0]$ zxsuite --progress  backup dorestoreblobs message1

        operationId                                         ace0b03d-5444-4968-83a2-c5408f0aad1f
        server                                              zcs86c7.example.com
        monitorCommand                                      zxsuite backup monitor ace0b03d-5444-4968-83a2-c5408f0aad1f
== Notifications ==
Subject: Backup Notification, Restore Broken Blobs started.
Date: 06/08/2017 15:45:18
Level: Information
Server: zcs86c7.example.com
Text:
This is an automated notification from ZxBackup about Restore Broken Blobs.
Operation Restore Broken Blobs Started.
Operation Id: ace0b03d-5444-4968-83a2-c5408f0aad1f
Operation Host: zcs86c7.example.com
Monitor Command: zxsuite backup monitor ace0b03d-5444-4968-83a2-c5408f0aad1f
Operation requested by: zimbra
Network Modules NG Version: 2.5.4
commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zal Version: 2.0.0
Zal commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zimbra version: 8.8.0_BETA1_1712 20170615062344 20170615-0633 NETWORK
Subject: Backup Notification, Restore Broken Blobs completed.
Date: 06/08/2017 15:45:18
Level: Information
Server: zcs86c7.example.com
Text:
This is an automated notification from ZxBackup about Restore Broken Blobs.
Operation Restore Broken Blobs Completed.
Operation Id: ace0b03d-5444-4968-83a2-c5408f0aad1f
Operation Host: zcs86c7.example.com
Operation requested by: zimbra
Network Modules NG Version: 2.5.4
commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zal Version: 2.0.0
Zal commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zimbra version: 8.8.0_BETA1_1712 20170615062344 20170615-0633 NETWORK
This was a dry-run, nothing was actually restored, please run with parameter dryrun false to actually restore blobs
Total blobs volume fixed: 0
Total blobs checked: 0
Total broken blobs: 0
Total restorable blobs: 0
Total unrestorable blobs: 0
Total failed blob restores: 0
Total restored blob: 0

[zimbra@zcs86c7 0]$ zxsuite backup dorestoreblobs message1

        operationId                                         49bf9aac-1795-4805-8d5c-181ff924fcf0
        server                                              zcs86c7.example.com
        monitorCommand                                      zxsuite backup monitor 49bf9aac-1795-4805-8d5c-181ff924fcf0


  • Ajcody: Note the lack of the warning “This was a dry-run, nothing was actually restored, please run with parameter dryrun false to actually restore blobs” that the --progress output showed us.
  • Ajcody: I’ll now run the command with dryrun false, which will restore/fix the missing blobs.
  • Ajcody: Note - this command, zxsuite backup dorestoreblobs, does NOT work against a user target but rather against a Zimbra volume. The variable description for the volume target does NOT appear to accept multiple volumes or ALL as an option.
    • volume_id(M) String
  • Ajcody: RFE: Adjust volume_id in zxsuite backup dorestoreblobs to be {volume ID or name} and also accept the value ALL [do we exclude index volumes?].
  • Ajcody: RFE: Add a new parameter to zxsuite backup dorestoreblobs that allows accounts [account name, account id, zimbra id] as the target, so dorestoreblobs is performed only for the stated accounts.
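
Until the CLI forces the choice, a small wrapper can make the dry-run decision explicit on the admin's side. This is only a sketch: restore_blobs_command is a hypothetical helper that builds the argument list for the command used in this walkthrough and does not run anything. I'm assuming "dryrun true" is accepted in the same way as the "dryrun false" form named in the --progress message.

```python
def restore_blobs_command(volume_id, *, dryrun, progress=True):
    """Build the argv for 'zxsuite backup dorestoreblobs' (hypothetical helper).

    dryrun is keyword-only and has no default, so the caller must state it.
    """
    if not isinstance(dryrun, bool):
        raise TypeError("dryrun must be True or False")
    command = ["zxsuite"]
    if progress:
        # --progress belongs to the base zxsuite command, so it goes before the module name.
        command.append("--progress")
    command += ["backup", "dorestoreblobs", str(volume_id),
                "dryrun", "true" if dryrun else "false"]
    return command


print(" ".join(restore_blobs_command(1, dryrun=False)))
# zxsuite --progress backup dorestoreblobs 1 dryrun false
```

On a test box the returned list could be handed to subprocess.run; calling the helper without dryrun raises a TypeError instead of silently falling back to a default.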


[zimbra@zcs86c7 0]$ zxsuite --progress backup dorestoreblobs 1 dryrun false

        operationId                                         beee315d-9054-4fa9-beb6-a95237be0392
        server                                              zcs86c7.example.com
        monitorCommand                                      zxsuite backup monitor beee315d-9054-4fa9-beb6-a95237be0392
== Notifications ==
Subject: Backup Notification, Restore Broken Blobs started.
Date: 06/08/2017 15:49:33
Level: Information
Server: zcs86c7.example.com
Text:
This is an automated notification from ZxBackup about Restore Broken Blobs.
Operation Restore Broken Blobs Started.
Operation Id: beee315d-9054-4fa9-beb6-a95237be0392
Operation Host: zcs86c7.example.com
Monitor Command: zxsuite backup monitor beee315d-9054-4fa9-beb6-a95237be0392
Operation requested by: zimbra
Network Modules NG Version: 2.5.4
commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zal Version: 2.0.0
Zal commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zimbra version: 8.8.0_BETA1_1712 20170615062344 20170615-0633 NETWORK
Subject: Backup Notification, Restore Broken Blobs completed.
Date: 06/08/2017 15:49:33
Level: Information
Server: zcs86c7.example.com
Text:
This is an automated notification from ZxBackup about Restore Broken Blobs.
Operation Restore Broken Blobs Completed.
Operation Id: beee315d-9054-4fa9-beb6-a95237be0392
Operation Host: zcs86c7.example.com
Operation requested by: zimbra
Network Modules NG Version: 2.5.4
commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zal Version: 2.0.0
Zal commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zimbra version: 8.8.0_BETA1_1712 20170615062344 20170615-0633 NETWORK
Total blobs volume fixed: 0
Total blobs checked: 135
Total broken blobs: 5
Total restorable blobs: 5
Total unrestorable blobs: 0
Total failed blob restores: 0
Total restored blob: 5
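
The two --progress runs differ in just a few lines that are easy to miss among the notifications: the dry-run notice and the "Total ..." counters. A minimal sketch that picks those out of captured output, assuming the wording shown above; summarize_restore is a hypothetical helper.

```python
DRY_RUN_MARKER = "This was a dry-run, nothing was actually restored"


def summarize_restore(progress_output):
    """Return the dry-run flag and the 'Total ...' counters from --progress output."""
    totals = {}
    for line in progress_output.splitlines():
        if line.startswith("Total ") and ":" in line:
            label, _, value = line.partition(":")
            totals[label.strip()] = int(value)
    return {"dry_run": DRY_RUN_MARKER in progress_output, "totals": totals}


real_run = """\
Total blobs volume fixed: 0
Total blobs checked: 135
Total broken blobs: 5
Total restorable blobs: 5
Total unrestorable blobs: 0
Total failed blob restores: 0
Total restored blob: 5
"""

summary = summarize_restore(real_run)
totals = summary["totals"]
all_fixed = (not summary["dry_run"]
             and totals["Total restored blob"] == totals["Total broken blobs"])
print(all_fixed)
# True
```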


Checking The Repairs

  • Ajcody: Let’s now check what the user’s directory looks like and if the blobs were indeed restored.


[zimbra@zcs86c7 0]$ ls -la /opt/zimbra/store/0/9/msg/0/
total 20
drwxr-x---. 2 zimbra zimbra  94 Aug  6 15:49 .
drwxr-x---. 3 zimbra zimbra  14 Aug  6 04:49 ..
-rw-r-----. 1 zimbra zimbra 463 Aug  6 15:49 257-5.msg
-rw-r-----. 1 zimbra zimbra 908 Aug  6 15:49 260-100.msg
-rw-r-----. 1 zimbra zimbra 904 Aug  6 15:49 261-101.msg
-rw-r-----. 1 zimbra zimbra 899 Aug  6 15:49 262-102.msg
-rw-r-----. 1 zimbra zimbra 913 Aug  6 15:49 263-103.msg


  • Ajcody: Notice that the files have no additional hard links [the link count column shows 1]. I know 4 of these 5 messages used to be hard links. Let’s repair that with zxsuite hsm dodeduplicate.
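
The link count that ls -la prints in its second column is the same value os.stat reports as st_nlink. A small self-contained illustration using temporary files, not the real store; the file names are made up for the example.

```python
import os
import tempfile


def link_count(path):
    """Number of directory entries pointing at the file's inode (the ls -la link column)."""
    return os.stat(path).st_nlink


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "263-103.msg")
    with open(original, "w") as handle:
        handle.write("example message body\n")
    before = link_count(original)

    # A second name for the same inode, as deduplication would create.
    os.link(original, os.path.join(tmp, "second-name.msg"))
    after = link_count(original)

print(before, after)
# 1 2
```

A freshly restored blob behaves like the file before os.link: it is a separate copy with a link count of 1, even if identical content exists elsewhere on the volume.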


Deduplicating The Blobs That Were Restored With - zxsuite hsm dodeduplicate

[zimbra@zcs86c7 0]$ zxsuite --progress hsm dodeduplicate message1

        operationId                                         0b0643dc-973c-470d-8271-693306ab02e8
        server                                              zcs86c7.example.com
== Notifications ==
Subject: HSM Notification, Deduplicate started.
Date: 06/08/2017 15:52:01
Level: Information
Server: zcs86c7.example.com
Text:
This is an automated notification from ZxPowerstore about Deduplicate.
Operation Deduplicate Started.
Operation Id: 0b0643dc-973c-470d-8271-693306ab02e8
Operation Host: zcs86c7.example.com
Monitor Command: zxsuite hsm monitor 0b0643dc-973c-470d-8271-693306ab02e8
Operation requested by: zimbra
Network Modules NG Version: 2.5.4
commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zal Version: 2.0.0
Zal commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zimbra version: 8.8.0_BETA1_1712 20170615062344 20170615-0633 NETWORK
Following volumes selected for deduplication: [message1]
Subject: HSM Notification, Deduplicate completed.
Date: 06/08/2017 15:52:02
Level: Information
Server: zcs86c7.example.com
Text:
This is an automated notification from ZxPowerstore about Deduplicate.
Operation Deduplicate Completed.
Operation Id: 0b0643dc-973c-470d-8271-693306ab02e8
Operation Host: zcs86c7.example.com
Operation requested by: zimbra
Network Modules NG Version: 2.5.4
commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zal Version: 2.0.0
Zal commit: fe5605f0742a1dd820b5edb089f46e3157559ac1
Zimbra version: 8.8.0_BETA1_1712 20170615062344 20170615-0633 NETWORK
Following volumes selected for deduplication: [message1]
- stats -
         total duplicated blobs: 6
                   linked blobs: 5
           already linked blobs: 1
                  skipped blobs: 0
                invalid digests: 0
                    bytes saved: 5.7 KB

[zimbra@zcs86c7 0]$ ls -la /opt/zimbra/store/0/9/msg/0/
total 20
drwxr-x---. 2 zimbra zimbra   94 Aug  6 15:52 .
drwxr-x---. 3 zimbra zimbra   14 Aug  6 04:49 ..
-rw-r-----. 1 zimbra zimbra  463 Aug  6 15:49 257-5.msg
-rw-r-----. 2 zimbra zimbra  908 Aug  6 15:49 260-100.msg
-rw-r-----. 2 zimbra zimbra 2178 Aug  6 05:17 261-101.msg
-rw-r-----. 2 zimbra zimbra 2179 Aug  6 05:17 262-102.msg
-rw-r-----. 3 zimbra zimbra 2212 Aug  6 05:17 263-103.msg


  • Ajcody: Notice that the file now has 3 hard links instead of 1:
    • -rw-r-----. 1 zimbra zimbra 913 Aug 6 15:49 263-103.msg
    • vs.
    • -rw-r-----. 3 zimbra zimbra 2212 Aug 6 05:17 263-103.msg


  • Ajcody: But why did the file size change? The change in the file’s date/time makes sense, but the size?


2017-08-06 15:40:24,429 WARN Missing Blob file: Mailbox id: 9 - Item id: 263 - Revision: 103 - Blob path: /opt/zimbra/store/0/9/msg/0/263-103.msg - Size: 2212 - Digest: DUlwGRa9,MEySRithtp3JIlUt6bVcrrezmp+0JxMMy8=

[zimbra@zcs86c7 0]$ ls -la /opt/zimbra/backup/zextras/items/DU/DUlwGRa9,MEySRithtp3JIlUt6bVcrrezmp+0JxMMy8=
-rw-r-----. 1 zimbra zimbra 913 Aug  6 05:17 /opt/zimbra/backup/zextras/items/DU/DUlwGRa9,MEySRithtp3JIlUt6bVcrrezmp+0JxMMy8=

[zimbra@zcs86c7 0]$ file /opt/zimbra/backup/zextras/items/DU/DUlwGRa9,MEySRithtp3JIlUt6bVcrrezmp+0JxMMy8=
/opt/zimbra/backup/zextras/items/DU/DUlwGRa9,MEySRithtp3JIlUt6bVcrrezmp+0JxMMy8=: gzip compressed data, from FAT filesystem (MS-DOS, OS/2, NT)

[zimbra@zcs86c7 0]$ file /opt/zimbra/store/0/9/msg/0/263-103.msg 
/opt/zimbra/store/0/9/msg/0/263-103.msg: SMTP mail, ASCII text, with CRLF line terminators

[zimbra@zcs86c7 0]$ zxsuite hsm getAllVolumes

        primaries                               

                id                                                          1
                name                                                        message1
                path                                                        /opt/zimbra/store
                compressed                                                  false
                threshold                                                   4096
                storeType                                                   LOCAL
                isCurrent                                                   true
                volumeType                                                  primary
        secondaries                             
        indexes                                 

                id                                                          2
                name                                                        index1
                path                                                        /opt/zimbra/index
                compressed                                                  false
                threshold                                                   4096
                storeType                                                   LOCAL
                isCurrent                                                   true
                volumeType                                                  index


  • Ajcody: Note that the backup uses compression for its blobs and the message1 volume doesn't.
  • Ajcody: The size differs because ‘zxsuite backup doRestoreBlobs’ by default uses: compress(O) Boolean true|false true
  • Ajcody: Something to note: this will override whatever compression value the volume is set to.
  • Ajcody: The new hard link inherits whatever compression value is set on the volume.
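
The size difference is what you would expect between a gzip copy and the plain message. The check that file performs can be approximated by looking at the first two bytes, which are 1f 8b for gzip data. A self-contained sketch on made-up message content, not on the real store or backup files; is_gzip is a hypothetical helper.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(data):
    """True if the bytes start with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


# Made-up message content with CRLF line endings.
message = b"Subject: test\r\n\r\n" + b"hello world\r\n" * 100
compressed = gzip.compress(message)

print(is_gzip(message), is_gzip(compressed))   # False True
print(len(compressed) < len(message))          # True
print(gzip.decompress(compressed) == message)  # True
```

The same check on the first bytes of a blob would show whether it was written compressed or plain.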


  • Ajcody: Checking with the following commands seems to indicate that compression doesn't have an on/off variable:
  • Ajcody: zxsuite backup getserverinfo
  • Ajcody: zxsuite backup getserverconfig standard query / | grep -i compress
  • Ajcody: zxsuite backup getproperty verbose true | grep -i compress