
Re: Do you do HSM?



Answers inline below.

On 2010-08-25 at 14:10:37, Xueshan Feng wrote:
>I am interested to know the real world experience of Zimbra HSM
>deployment. I found some archived articles on this mailing list from
>2008 about using HSM, but things have changed so much since and I
>believe there must be more customers actively using HSM now.
>
>There is a good wiki page http://wiki.zimbra.com/wiki/Ajcody-HSM-Notes
>that covers various implementation considerations and setup, but I'd
>like to know anyone who uses HSM in their production environment has
>lessons, concerns and 'howtos' to share. Here are my specific
>questions (with some answers for our university, when applicable).
>All in all, is it worth trading a potential system performance
>impact for cheaper storage?
>
>1. How many user accounts per mailbox server

2 servers, 6000 accounts each.

>2. How much mailbox quota per account

256 MB students, 1 GB employees

Disk usage on the mailbox servers is approximately:
115 GB /opt/zimbra
450 GB /opt/zimbra/backup
125 GB /opt/zimbra/hsm
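
For a rough sense of how much of the mail store sits on the cheaper disk, 
the figures above can be plugged into a small calculation. This is only a 
sketch: the function name is made up for illustration, /opt/zimbra holds 
more than message blobs, and the HSM volume is compressed, so the result 
is a ballpark figure rather than a measurement.

```python
def hsm_share(primary_gb, hsm_gb):
    """Fraction of primary-plus-HSM disk usage that lives on the HSM volume.

    Hypothetical helper for illustration; backup space is left out.
    """
    return hsm_gb / (primary_gb + hsm_gb)


# Approximate usage from the listing above, in GB.
share = hsm_share(primary_gb=115, hsm_gb=125)
print(f"{share:.0%} of non-backup usage is on the SATA-backed HSM volume")
```

By that estimate, roughly half of the non-backup data is on SATA rather 
than fibre channel disk.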

>3. Storage architecture: do you use different storage for zimbra root
>/opt/zimbra, and zimbra backup /opt/zimbra/backup?

/opt/zimbra is on an EMC SAN with fibre channel disk.

/opt/zimbra/backup and /opt/zimbra/hsm are on the same EMC SAN but a 
different volume that is backed by SATA (much cheaper) disk.

>/opt/zimbra/backup is mounted as NFS partition through Ethernet
>connection to NetApp server, slower disks.

I can see two problems with that. First, don't use NFS; the Zimbra docs 
say not to. Second, don't use NetApp. We used to, and performance was 
abysmal until we went to the EMC SAN.

>If you use HSM, what kind of secondary storage do you use? Do you use
>NFS mounted file system for secondary volume?

Don't do NFS. We use a SAN with SATA disk for HSM.

>4. Do you have a dedicated network connection to the HSM storage?

All SAN connections are over a pair (mostly for reliability, partially 
for performance) of gigabit ethernet connections. HSM is not separated 
out from the rest of the SAN traffic.

>We are looking into HSM, but our servers have only one Ethernet
>interface, and we may need to share that connection used by the
>backup volume.

Ethernet cards are cheap. I'd suggest adding one or more Ethernet 
interfaces that are dedicated to storage, and using the existing 
interface for client traffic.

>Will it be a performance problem if both backup and the secondary HSM
>volume share the same network connection?

It isn't a problem here. Primary storage, backup, and HSM are all on the 
same network interface.

>5. Impact with backups
>
>Our current backups (NFS mounted partition) take more than 10 hours
>already, even though we use a 7-group backup policy. Offering more
>storage obviously will make the backups run longer. If the HSM process
>has to run at off-peak hours and overlaps with the backup process, how
>bad can it be? Do you use a higher number of auto-groups, like 14 or 31?

Get rid of the NFS and the NetApp and I bet your backup times will 
improve. We used to see ridiculously long backup times as well (up to 20 
hours). Now our backups take 1-2 hours.

>6. Impact to server
>
>Do you let your HSM process run continuously?

Yes. That's the default.

>Does it impact the end-user email experience during business
>hours?

No noticeable impact.

>7. What HSM Age do you use? If you adjusted it from the default 30
>days, why?

We use the default 30 days. It seems to work well.
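
To illustrate what the age setting means in practice, here is a minimal 
sketch of an age-based selection rule: anything on the primary volume 
older than the cutoff becomes a candidate for the HSM volume. The tuple 
layout, the volume labels, and the function name are all made up for the 
example; this is not Zimbra's code or schema.

```python
from datetime import datetime, timedelta

HSM_AGE = timedelta(days=30)  # the default age discussed above


def items_to_move(items, now, age=HSM_AGE):
    """Return ids of primary-volume items older than `age`.

    `items` is an iterable of (item_id, received_at, volume) tuples.
    This layout is hypothetical and only serves the illustration.
    """
    cutoff = now - age
    return [
        item_id
        for item_id, received_at, volume in items
        if volume == "primary" and received_at < cutoff
    ]


# Example: one old item on primary, one recent item, one already moved.
sample = [
    (1, datetime(2010, 7, 1), "primary"),
    (2, datetime(2010, 8, 20), "primary"),
    (3, datetime(2010, 6, 1), "secondary"),
]
candidates = items_to_move(sample, now=datetime(2010, 8, 25))
```

Only the first item qualifies here: it is older than 30 days and still on 
the primary volume.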

>8. When you initially turned on HSM, how long did it take for a complete
>run? How long does it take after the initial run?

I don't recall how long the initial run took. Since it ran in the 
background, it didn't really affect server performance. I'm not sure how 
long a run takes now either; I think it just continuously scans accounts 
looking for messages to move to the HSM volume. The HSM scan seems to 
operate at a low priority and doesn't really affect anything else, 
performance-wise. But it lets us get away with having a lot less fast, 
expensive disk. We also have the HSM volume set to compress blobs. Since 
it is on slower disk, the compression probably helps retrieval time, so 
users don't notice that their older mail is on slower storage. We've 
not gotten a single complaint about access to old messages being slow, 
so I'd say HSM Just Works.

-- 
Daniel A. Ramaley
Network Engineer 2

Dial Center 118, Drake University
2407 Carpenter Ave / Des Moines IA 50311 USA
Tel: +1 515 271-4540
Fax: +1 515 271-1938
E-mail: daniel.ramaley@drake.edu