
Re: Do you do HSM?



Rick is correct on all points.  During the initial tests of turning on HSM, the first run took what I think was about half a day.  Of course, we did this in our test environment first, which at the time was not at exact parity with production's capabilities.

After this long initial test, we started by setting the HSM age to a year or more.  I don't recall the exact numbers.  As expected, it took much less time.  Then we dialed back a little more, ran it, and watched it again.  Eventually we got to our current policy and, like Rick says, it runs pretty quickly now that it's moved the bulk of the files to the HSM.

The one concern I had with HSM was backing out.  I think your link to Adam's wiki shows an unofficial way: moving the blobs and changing MySQL to point to the right directory for each blob.  I think we put in an RFE to code this unsupported backout into the Zimbra stack.  So far we have found no reason to back out, but it was a concern at the time.


--Kurtus


----- Original Message -----
From: "rick brown" <rick.brown@oit.gatech.edu>
To: "Xueshan Feng" <sfeng@stanford.edu>
Cc: zimbra-hied-admins@sfu.ca
Sent: Wednesday, August 25, 2010 3:46:45 PM GMT -05:00 US/Canada Eastern
Subject: Re: Do you do  HSM?

I'm not on the Zimbra team proper, but from the storage perspective I have a bit of insight, so take my comments with a grain of salt. 

----- "Xueshan Feng" <sfeng@stanford.edu> wrote:

> From: "Xueshan Feng" <sfeng@stanford.edu>
> To: zimbra-hied-admins@sfu.ca
> Sent: Wednesday, August 25, 2010 3:10:37 PM GMT -05:00 US/Canada Eastern
> Subject: Do you do  HSM?
>
> I am interested to know the real world experience of Zimbra HSM
> deployment. I found some archived articles on this mailing list from
> 2008 about using HSM, but things have changed so much since then, and I
> believe there must be more customers actively using HSM now.
> 
> There is a good wiki page http://wiki.zimbra.com/wiki/Ajcody-HSM-Notes
> that covers various implementation considerations and setup, but I'd
> like to know whether anyone who uses HSM in their production
> environment has lessons, concerns, and 'howtos' to share. Here are my
> specific questions (with some answers for our university, when
> applicable). All in all, is it worth trading a potential system
> performance impact for cheaper storage?
> 
> 1. How many user accounts per mailbox server
> 
> We have 9 servers, each has about 5000 accounts. 

8 servers, around 7800 users per server. 

> 2. How much mailbox quota per account 
> 
> 1GB  by default. Additional quota can be purchased. Mailstore disk
> usage is around 1.5TB on each server. 

2GB students, 5GB faculty/staff. 

# df -kh /opt/zimbra/store /opt/zimbra/backup /opt/zimbra/hsm_store
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/mail1_vg-mail1_store_vol
                      1.8T  1.4T  323G  82% /opt/zimbra/store
/dev/mapper/mail1_backup_vg-mail1_backup_vol
                      5.9T  3.4T  2.4T  59% /opt/zimbra/backup
/dev/mapper/mail1_hsm_vg-mail1_hsm_vol
                      1.7T  617G  1.1T  37% /opt/zimbra/hsm_store
> 
> 3. Storage architecture: do you use different storage for zimbra root
> /opt/zimbra, and zimbra backup /opt/zimbra/backup?

Indeed.  Different storage for /opt/zimbra, /opt/zimbra/backup, and the HSM volume (/opt/zimbra/hsm_store).

> 
> We have /opt/zimbra  (database, redolog, logs) on fast disk connected
> through fiber channel to SAN disks. /opt/zimbra/backup is mounted as
> NFS partition through Ethernet connection to NetApp server, slower
> disks.

We used to let zmbackup run to a NAS appliance over NFS and saw it take forever as well.  We changed to iSCSI (using the same NAS appliance) and saw the backup time drop by an order of magnitude.  Since then we've moved the arrays behind the NAS directly onto our Fiber Channel fabric and have seen a bit more improvement (but nowhere near the improvement of iSCSI over NFS).
 
> If you use HSM, what kind of secondary storage do you use? Do you use
> NFS mounted file system for secondary volume? 

Primary storage is 15Krpm FC disks on a 4Gb/s FC array, HSM is 7200rpm SATA disks in a 4Gb/s FC attached array, and backups are 7200rpm SATA disks in a 2Gb/s FC attached array.   
 
> 4. Do you have dedicated network connection  to the HSM storage?

We are all Fiber Channel now.  When we initially implemented backups over NFS and then iSCSI, we used the same network connection as production traffic and didn't see a terribly large impact on the servers themselves.  However, it was more traffic than our load balancers could handle, so we did move backup traffic to a separate Ethernet interface.
 
> We are looking into HSM, but our servers have only one Ethernet
> interface, and we may need to share that connection used by the backup
> volume. 
> 
> Will it be a performance problem if both backup and the secondary HSM
> volume share the same network connection?

And the production web/imap/mail traffic all on the same NIC?   That may be a bit much. 

> 
> 5. Impact with backups
> 
> Our current backups (NFS mounted partition) take more than 10 hours
> already, even though we use a 7-group backup policy. Offering more
> storage obviously will make the backups run longer. If the HSM process
> has to run at off-peak hours and overlaps with the backup process, how
> bad can it be?  Do you use a higher-numbered auto-group policy, like 14
> or 31 auto-grouping?

I believe we are using 28 as our auto-grouping policy, but try iSCSI for your backup volumes instead of NFS.  The overhead of open/create in NFS is where your slowdown is, as backups create millions of small files.
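
To make the small-file argument concrete, here is a rough back-of-the-envelope sketch in Python.  The function name and every number in the usage lines are illustrative assumptions, not measurements from any site in this thread; the point is only that with millions of files the per-file term can dwarf the bulk transfer term.

```python
def estimated_backup_hours(file_count, per_file_overhead_ms, data_gb, throughput_mb_s):
    """Rough backup duration in hours.

    Adds a fixed per-file cost (open/create round trips) to the time
    needed to stream the data itself. All inputs are assumptions the
    caller supplies; nothing here is measured.
    """
    overhead_s = file_count * per_file_overhead_ms / 1000.0
    transfer_s = data_gb * 1024.0 / throughput_mb_s
    return (overhead_s + transfer_s) / 3600.0


if __name__ == "__main__":
    # Hypothetical inputs: 5 million small files, 200 GB, 80 MB/s.
    # Only the assumed per-file overhead differs between the two runs.
    slow = estimated_backup_hours(5_000_000, 5.0, 200, 80)
    fast = estimated_backup_hours(5_000_000, 0.5, 200, 80)
    print(f"high per-file overhead: {slow:.1f} h")
    print(f"low per-file overhead:  {fast:.1f} h")
```

Plugging in your own file counts and a per-file cost timed on your own mount should tell you quickly whether file creation or raw throughput dominates your backup window.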

> 6. Impact to server
> 
> Obviously moving data from primary disk to secondary disk will have
> an impact on server performance. Do you let your HSM process run
> continuously? Does the performance hit affect the end-user email
> experience during business hours?

It looks like we run it twice a day, at 7AM and 7PM, and it finishes fairly quickly.

> 
> 7. What HSM Age do you use? If you adjusted it from the default 30
> days, why? 

We started with a very high value (e.g., 6 months) and dialed it back down until we found a happy medium where /opt/zimbra/store stays around 80% full.
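
The dial-back approach above can be sketched as a small Python helper.  This is a minimal illustration, not the procedure anyone here actually ran: the function names, the 30-day step, and the 30-day floor (taken from the default age mentioned in the question) are assumptions, and the code only reports a suggested age; it does not change any Zimbra setting.

```python
import shutil


def percent_used(path):
    """Percent of the filesystem at `path` in use, computed the way df's Use% is."""
    usage = shutil.disk_usage(path)
    denominator = usage.used + usage.free
    return 100.0 * usage.used / denominator if denominator else 0.0


def next_hsm_age_days(current_age_days, used_pct, target_pct=80.0,
                      step_days=30, min_age_days=30):
    """Suggest the next HSM age to try (hypothetical helper).

    While the primary store is fuller than the target, lower the age by
    one step so older mail moves to secondary storage; otherwise keep
    the current age. The age never drops below `min_age_days`.
    """
    if used_pct > target_pct:
        return max(min_age_days, current_age_days - step_days)
    return current_age_days


if __name__ == "__main__":
    # Start high (about 6 months) and check the primary store after each run.
    # "/" stands in for /opt/zimbra/store so the sketch runs anywhere.
    used = percent_used("/")
    print(f"store is {used:.0f}% full; next age to try: "
          f"{next_hsm_age_days(180, used)} days")
```

Changing the age one step at a time and watching a full HSM run between changes keeps each individual run short, which matches what both of us saw in practice.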

> 
> 8. When you initially turned on HSM, how long did it take for a complete
> run? How long does it take after the initial run?

I want to say around 6 hours for the first run; the other GT folks can correct me if I'm wrong.  And it looks like runs take about 20 minutes nowadays.

> 
> Thanks!
> 
> Xueshan
> 
> -- 
> 
> Xueshan Feng <sfeng@stanford.edu>
> Technical Lead, IT Services, Stanford University

-- 
Rick Brown
Office of Information Technology
Georgia Institute of Technology
258 4th Street N.W.  Atlanta, GA  30332-0715
rick@gatech.edu
(404) 894-6175