Re: Do you do HSM?



On 08/25/2010 03:10 PM, Xueshan Feng wrote:
I am interested to know the real world experience of Zimbra HSM
deployment. I found some archived articles on this mailing list from
2008 about using HSM, but things have changed so much since then that I
believe there must be more customers actively using HSM now.

There is a good wiki page
http://wiki.zimbra.com/wiki/Ajcody-HSM-Notes that covers various
implementation considerations and setup, but I'd like to know whether
anyone who uses HSM in their production environment has lessons,
concerns and 'howtos' to share. Here are my specific questions (with
some answers for our university, when applicable). All in all, is it
worth trading a potential system performance impact for cheaper
storage?

Yes.

1. How many user accounts per mailbox server

We have 9 servers, each has about 5000 accounts.

Two store servers with 1,200 fac/staff on one and 3,500 students on
the other. However, students with an odd numbered class year will be
migrated to a second store server this October.

2. How much mailbox quota per account

1GB by default. Additional quota can be purchased. Mailstore disk
usage is around 1.5TB on each server.

1G by default. Most of /opt/zimbra is mounted on a 600G LUN, while HSM
is on a 900G disk and backup is on a 1.5T disk.
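
As a rough aid for sizing, the first set of figures above (about 5000
accounts per server, 1GB default quota, around 1.5TB of mailstore in
use) can be turned into a utilization ratio. This is only a sketch: the
function name is hypothetical, binary units are assumed, and purchased
extra quota is ignored.

```python
def quota_utilization(accounts: int, quota_gb: float, used_tb: float) -> float:
    """Return disk actually used as a fraction of the quota committed to users."""
    committed_tb = accounts * quota_gb / 1024  # GB -> TB, binary units assumed
    return used_tb / committed_tb


# One mailbox server as described above: ~5000 accounts, 1GB quota, ~1.5TB used.
ratio = quota_utilization(accounts=5000, quota_gb=1.0, used_tb=1.5)
print(f"{ratio:.0%} of committed quota is in use")
```

A low ratio suggests that most of the promised quota is never filled, so
primary and secondary volumes can be sized from measured usage rather
than from quota totals.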

3. Storage architecture: do you use different storage for zimbra
root /opt/zimbra, and zimbra backup /opt/zimbra/backup?

We have /opt/zimbra (database, redolog, logs) on fast disk
connected through Fibre Channel to SAN disks. /opt/zimbra/backup is
an NFS mount over an Ethernet connection to a NetApp server with
slower disks.

If you use HSM, what kind of secondary storage do you use? Do you
use NFS mounted file system for secondary volume?

We *had* /opt/zimbra (database, redolog, logs) on fast disk connected
through Fibre Channel to an EMC SAN with 15k RPM disks in RAID10, with
HSM on the same CLARiiON but on 7200 RPM SATA disks in RAID5. Backup
was mounted across campus to a second EMC with 7200 RPM SATA disks in
RAID5. Backups ran quickly enough this way (8 hour full backup).

We have since replaced the two EMC CX3-20s with two XIVs. Since the
entire SAN is SATA there are a lot of differences. Note that it does
more IOPS than the 15k disks because of how XIV works, so the backups
and HSM do run faster, though I might eventually look into moving
backups and HSM off of the XIV and onto cheaper storage. Previously I
was mainly using HSM with cheaper disks but not a cheaper array; now
the entire SAN has that philosophy. I might eventually do the backups
over iSCSI to cheaper disk, but I'm not in a hurry to do that since I
currently have the space on the XIV and the backups take long enough
to run even with a SAN capable of 52,000 IOPS (at 6 nodes). I still
think it's nice to have HSM (the part that might grow) so that older
mail is on a separate partition from your live mail store.

4. Do you have dedicated network connection to the HSM storage?

We are looking into HSM, but our servers have only one Ethernet
interface, and we may need to share that connection used by the
backup volume.

No, but it uses the same SAN, though another volume within that SAN.

Will it be a performance problem if both backup and the secondary
HSM volume share the same network connection?

We haven't seen that, though everything is using 4G fibre.

5. Impact with backups

Our current backups (NFS mounted partition) take more than 10 hours
already, even though we use a 7-group backup policy. Offering more
storage will obviously make the backups run longer. If the HSM process
has to run at off-peak hours and overlaps with the backup process, how
bad can it be? Do you use a higher number of auto-groups, like 14 or
31?

No comment, as I have had HSM since day 1.

6. Impact to server

Obviously moving data from primary disk to secondary disk will have
an impact on server performance. Do you let your HSM process run
continuously? Does the performance impact affect the end-user email
experience during business hours?

No, cron starts it after backups are run and it is done before
the start of business.

  John
--
John Fulton
Associate Director of Network Services
Senior Systems Programmer
Lafayette College
610-330-5650