
Re: Performance Tuning



----- "Tim Ross" <tross@calpoly.edu> wrote:

> Hi All,
> 
> We have been on a tight deadline for our Zimbra go-live, which is
> happening over Labor Day weekend, and haven't had the time to do much
> in the way of performance/load testing.  We purchased servers which
> were about twice the Zimbra recommended specs, so we are not too
> concerned about hardware.  What I would like to know is if any of you
> out there who have either gone live or done some performance testing
> have any tips on settings which have helped out your setups.
> 
> Our general setup is 3 mailstore/app servers, 1 logger server, 2 MTA
> servers, 1 Master LDAP server, and 1 Replica LDAP server.  There are
> dual quad core Intel Xeon X5450 processors on all but the logger
> server.  We have 16 GB of RAM on the mailstore and ldap servers and 8
> GB on the MTAs.  We have Red Hat Linux 5, 64 bit on all boxes.  The
> logger server is a virtual server with 2 virtual CPUs and 4 GB of RAM.
> 

What are you using for storage? That's where we felt the first pain point. Even though we started small (we just passed 1000 migrated users a few days ago), we very quickly saturated the SATA drives on our NetApp. We had expected to have to upgrade to FCAL shelves anyway, but we thought we'd be able to handle a few hundred users on SATA to get us through the "Pilot" phase. Nope. Luckily we had 2 shelves of older FCAL drives that we were able to put into production. They won't give us enough capacity long-term, but they give us the IOPS that we need.

We've broken storage into 4 separate areas, all mounted over iSCSI. The first 3 are on the NetApp, the last is on a Sun Thumper (rough mount-layout sketch after the list):
  - /opt/zimbra (includes db and index storage) - FCAL
  - /opt/zimbra/redolog - FCAL: Will mirror using iSCSI to a remote data centre
  - /opt/zimbra/store - SATA: for message blobs
  - /opt/zimbra/backup - SATA
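
In case it's useful, here is roughly what that layout looks like in fstab terms. The device names are made-up multipath aliases rather than our actual ones, and the mount options are just the obvious defaults for iSCSI-backed ext3 on RHEL 5:

    # /etc/fstab (sketch -- device names are hypothetical multipath aliases)
    /dev/mapper/zimbra-opt      /opt/zimbra          ext3  defaults,_netdev  0 0
    /dev/mapper/zimbra-redolog  /opt/zimbra/redolog  ext3  defaults,_netdev  0 0
    /dev/mapper/zimbra-store    /opt/zimbra/store    ext3  defaults,_netdev  0 0
    /dev/mapper/zimbra-backup   /opt/zimbra/backup   ext3  defaults,_netdev  0 0

The _netdev option keeps the boot sequence from trying to mount these before the network and iSCSI services are up, and the order matters because the last three mount points live underneath /opt/zimbra.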

We use multipathd for multipathing to the NetApp, and that has let us ride through NetApp reboots and network outages seamlessly. Before multipathd, the default iSCSI timeout of 2 minutes would cause our mounted volumes to switch to read-only, which made for lots of fun.
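
For anyone who hits the same thing, the two knobs involved are the iSCSI replacement timeout in /etc/iscsi/iscsid.conf and the queueing behaviour in /etc/multipath.conf. A rough sketch only, not our exact config -- follow NetApp's recommended host settings for your ONTAP version:

    # /etc/iscsi/iscsid.conf
    # default is 120 seconds; once it expires an unreachable target starts
    # returning I/O errors, which is what flips ext3 volumes to read-only
    node.session.timeo.replacement_timeout = 120

    # /etc/multipath.conf (sketch)
    defaults {
        user_friendly_names yes
    }
    devices {
        device {
            vendor               "NETAPP"
            product              "LUN"
            # or group_by_prio with NetApp's prio callout -- see their
            # host utilities documentation
            path_grouping_policy multibus
            # queue I/O instead of failing it while all paths are down,
            # so a filer reboot looks like a pause rather than an error
            features             "1 queue_if_no_path"
            failback             immediate
        }
    }

With queue_if_no_path the I/O just stalls during a reboot or network blip and completes when a path comes back, so the filesystem never sees the error that triggers the read-only remount.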

In terms of Zimbra tuning, so far the only thing we've had to play with is the LDAP lock limit. I tripled it last week and doubled it again last night because we were still hitting the ceiling. Even though we only have 1000 users on Zimbra so far, we're replicating our mailing lists into Zimbra, complete with all their members (so we can use them in Zimbra ACLs), and I think that's what causes the lock usage to spike.
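
For anyone else chasing this: what gets raised is the Berkeley DB lock limit behind OpenLDAP. Assuming the stock BDB backend (the path below is from our install and the numbers are placeholders, so verify both on yours), the limits live in DB_CONFIG and you can compare them against actual usage with the bundled db_stat:

    # /opt/zimbra/openldap-data/DB_CONFIG (excerpt -- values are placeholders)
    set_lk_max_locks    10000
    set_lk_max_lockers  10000
    set_lk_max_objects  10000

    # compare the configured maximums against peak usage so far
    db_stat -c -h /opt/zimbra/openldap-data | grep -i max

Note that these limits are applied when the DB environment is created, so just editing the file isn't enough -- restart ldap and check db_stat afterwards to confirm the new maximums actually took effect.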

-- 
Steve Hillman                                IT Architect
hillman@sfu.ca                               IT Infrastructure
778-782-3960                                 Simon Fraser University