Hard links

Introduction

I recently implemented a system for storing cold backups of Zimbra on S3 using Duplicity. The problem is that Duplicity doesn't preserve hard links at the time of writing. This is also a problem when backing up the /opt/zimbra directory using tar, for example.

Since Zimbra makes extensive use of hard links, I wrote a script that records the hard link information before the backup and recreates the hard links when the backup is restored. The hard link information file must be included in the backup.
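To make the idea concrete, the following sketch creates a hard link in a throwaway directory and lists it in the "inode path" form that the script's scan mode records. It assumes GNU stat and GNU find (for -c and -printf); the directory and file names are made up for illustration.

```shell
# Sketch only: throwaway temp directory, hypothetical file names.
demo=$(mktemp -d)
echo "message body" > "$demo/original"
ln "$demo/original" "$demo/second name"   # second directory entry, same inode

# Both names report the same inode number and a link count of 2.
stat -c '%i %h %n' "$demo/original" "$demo/second name"

# List every regular file with more than one link as "inode path",
# sorted numerically so entries sharing an inode sit next to each other.
( cd "$demo" && find . -type f -links +1 -printf '%i %h/%f\n' | sort -k1,1n )
```

A backup tool that ignores hard links would store these two names as two independent files, which is the information the link list is meant to carry across.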

The script can also purge a directory of the hard links it has found. Never do this on /opt/zimbra itself, only on a mirror copy of /opt/zimbra made, for example, with rsync! Purging the hard links saves some space in the backup but can take a long time. In my case hard links accounted for only an extra 1GB on an 8GB database, so I opted not to purge them before running the backup.
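To judge whether purging is worth the time, the space that hard links would add to a link-unaware backup can be estimated first. This is a rough sketch, not part of the hardlinks script: the function name hardlink_savings is hypothetical, it assumes GNU find, and it assumes the whole tree lives on one filesystem so that inode numbers are unique.

```shell
# Sketch: estimate the bytes a purge would free under a directory.
# Assumes GNU find (-printf); the function name is hypothetical.
hardlink_savings() {
  # Print "inode size" for each multiply-linked regular file, then add up
  # the size of every entry after the first one seen for each inode.
  find "$1" -type f -links +1 -printf '%i %s\n' |
    awk 'seen[$1]++ { extra += $2 } END { print extra + 0 }'
}
```

Called as hardlink_savings /path/to/mirror, it prints a byte count. Files whose other links lie outside the given directory are seen only once and therefore contribute nothing to the total.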

Example

Backup procedure

service zimbra stop
cd /opt/zimbra
hardlinks scan . .hardlinks
tar -czpf /var/backup/zimbra.tgz .
service zimbra start

Restore procedure

service zimbra stop
cd /opt/zimbra
rm -rf /opt/zimbra/*
tar zxf /var/backup/zimbra.tgz
hardlinks restore . .hardlinks
service zimbra start

In this example the hard link list is stored in the file /opt/zimbra/.hardlinks, so it is included in the archive.
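After a restore it can be reassuring to confirm that the links really were recreated. The following sketch reads a link list in the "inode ./path" format and checks that all paths recorded under one inode share an inode again. The function name verify_hardlinks is hypothetical and not part of the hardlinks script, and GNU stat is assumed. Inode numbers normally change across a restore, so only the grouping is compared, not the numbers themselves.

```shell
# Sketch: check that paths recorded under one inode in the list file
# share an inode again after a restore. Function name is hypothetical.
# Assumes GNU stat and the "inode ./path" line format written by scan.
verify_hardlinks() {
  local rootDir=$1 listFile=$2
  local line recorded path actual lastRecorded="" baseInode="" status=0
  while IFS= read -r line; do
    recorded=${line%% *}            # inode number at scan time
    path=$rootDir/${line#* }        # everything after the first space
    actual=$(stat -c %i "$path" 2>/dev/null) || {
      echo "missing: $path" >&2
      status=1
      continue
    }
    if [ "$recorded" != "$lastRecorded" ]; then
      baseInode=$actual             # first file of a new group
      lastRecorded=$recorded
    elif [ "$actual" != "$baseInode" ]; then
      echo "not linked: $path" >&2
      status=1
    fi
  done < "$listFile"
  return $status
}
```

For the example above this would be run as verify_hardlinks /opt/zimbra /opt/zimbra/.hardlinks, with a non-zero exit status indicating at least one missing or unlinked file.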


hardlinks script

#!/bin/bash

# Program name: hardlinks

#    This program is free software: you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation, either version 3 of the License, or
#    (at your option) any later version.

#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.

#    You should have received a copy of the GNU General Public License
#    along with this program.  If not, see <http://www.gnu.org/licenses/>

##########################################################################
# Title      :  hardlinks - list hardlinks or rebuild from list
# Author     :  Simon Blandford <simon -at- onepointltd -dt- com>
# Date       :  2008-07-31
# Requires   :  awk
# Category   :  Administration
# Version    :  1.0.0
# Copyright  :  Simon Blandford, Onepoint Consulting Limited
# License    :  GPLv3 (see above)
##########################################################################
# Description
# Store hardlink information in a file which can then be used to
# remove hardlinks or restore the hardlinks.
# This is intended to be used with backup and archiving systems that do
# not support hardlinks. Simply store the hardlink information as part of
# the archive so that the hardlinks can be regenerated when the archive is
# unpacked.
# The -s option is used to suppress the progress count when purging or
# restoring links.
# The -p option suppresses errors when a file in the list isn't found.
##########################################################################



function usage()
{
    echo "Usage: $(basename "$0") scan rootdir linklistfile"
    echo "       $(basename "$0") [-sp] purge rootdir linklistfile"
    echo "       $(basename "$0") [-sp] restore rootdir linklistfile"
    echo "Options: -s no counter display"
    echo "         -p no error on file in list not found"
    exit 0
}

function act_on_list()
{
  if [ ! -f "$linkListFile" ]; then
    echo "Error: Unable to open input link list file: $linkListFile." >&2
    exit 1
  fi
  totFiles=$(wc -l < "$linkListFile")
  while IFS= read -r currentEntry; do
    #Each entry is "inode ./path": the inode number is everything before the first space
    inum="${currentEntry%% *}"
    #The filename is everything after the first space, so embedded spaces are kept intact
    fileName="${currentEntry#* }"
    #Remove leading . and prepend path
    fileName="$rootDir""${fileName:1}"
    if [ ! -f "$fileName" ]; then
      if [ "$suppressNoFileErr" != "yes" ]; then
        echo "Error: File $fileName listed in entry '$currentEntry' cannot be found." >&2
        exitCode=1
        continue
      fi
      #With -p a missing file is tolerated (for example a link removed by purge); count it as zero bytes
      fileSize=0
    else
      fileSize=`ls -l "$fileName" | awk '{ print $5 }'`
    fi
    sizeUnLinkedTotal=$(( ${sizeUnLinkedTotal:-0} + fileSize ))
    if [ "$inum" != "$lastInum" ]; then
      sizeLinkedTotal=$(( $sizeLinkedTotal + $fileSize ))
      baseFile="$fileName"
    else
      rm -f "$fileName"
      if [ "$1" = "restore" ]; then
        ln "$baseFile" "$fileName"
      fi
    fi
    lastInum=$inum
    if [ $(( counter++ % 100 )) -eq 99 ] && [ "$noiseLevel" != "silent" ]; then
      echo "Processed $counter files of $totFiles"
    fi
  done < "$linkListFile"
  if [ "$noiseLevel" != "silent" ]; then
    echo "Processed $counter files of $totFiles"
  fi
}

while getopts hps c
do
  case $c in
    h)
      usage
      ;;
    s)
      noiseLevel="silent"
      ;;
    p)
      suppressNoFileErr="yes"
      ;;
    ?)
      usage
      ;;
  esac
done
shift $(($OPTIND - 1))

if [ $# -ne 3 ]; then usage; fi

rootDir="$2"
linkListFile="$3"

if [ ! -d "$rootDir" ]; then
  echo "Error: Can't find root directory: $rootDir." >&2
  exit 1
fi

exitCode=0
case $1 in
  scan)
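    #Run find from inside rootDir so the stored paths are relative to it;
    #sort numerically on the inode so entries for the same inode are adjacent in any locale
    ( cd "$rootDir" && find . -type f -links +1 -printf "%i %h/%f\n" | sort -k1,1n ) > "$linkListFile"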
    ;;
  purge)
    act_on_list "purge"
    ;;
  restore)
    act_on_list "restore"
    ;;
  *)
    usage
    ;;
esac

exit $exitCode


Verified Against: Unknown
Date Created: 8/11/2008
Article ID: https://wiki.zimbra.com/index.php?title=Hard_links
Date Modified: 03/24/2015


