An Open Source EqualLogic Replacement: Part 2

Pipe Dream?

Part 1 laid out my goals: move from my EqualLogic array to something cheaper, faster, with more storage.  It sounds impossible, but there are some acceptable (and affordable) solutions out there.

SSD + Hard Drive Options

After some research, I came up with three possible ways to accomplish my goals, looking strictly at SSD/Hard drive integration, and leaving failover until later in my thinking.

Option One: Keep SSD and Hard Drive Volumes Separate

This is the simplest option.  When configuring the RAID card I create two separate volumes: one is made up only of SSDs, and the other of hard drives.  The volumes can be exported as iSCSI or NFS shares so that I can choose which repository to use for each virtual machine.  File storage goes on the hard-drive backed share, and databases go on the SSD backed share.  Easy.
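
The manual placement described above boils down to a lookup from workload type to share.  Here is a minimal sketch, assuming two hypothetical share names (`ssd-share` and `hdd-share`) and made-up workload labels; none of these come from a real configuration.

```python
# Hypothetical share names; the real ones would be whatever iSCSI/NFS exports you create.
SSD_SHARE = "ssd-share"
HDD_SHARE = "hdd-share"

# Workloads treated as sensitive to storage latency (an assumption for illustration).
FAST_WORKLOADS = {"database"}


def pick_share(workload: str) -> str:
    """Return the share a virtual machine's disk should live on."""
    return SSD_SHARE if workload in FAST_WORKLOADS else HDD_SHARE


# Hypothetical virtual machines and their workload types.
placements = {vm: pick_share(kind) for vm, kind in [
    ("forum-db", "database"),
    ("forum-files", "file-storage"),
]}
print(placements)
```

Every new server means another decision like this one, made by hand.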

Advantages:

  • Simple.  This is easy, using any hardware RAID card.
  • Easy to understand.  Never underestimate the value of minimal complexity.  This will help avoid stupid mistakes later on, and will greatly ease problem diagnosis when something fails.
  • Cheap.  No expensive hardware required.
  • Any platform will do.  A storage-focused FreeBSD distribution, plain Linux, plain FreeBSD, or even Windows will work fine in this role.  File serving has been around for a long time, is well understood, and is solid.

Disadvantages:

  • Manual Setup and Maintenance.  If I want to increase performance for a server, I need to migrate it to the fast share manually.
  • All or nothing.  When I host forums there is a file serving component, a web serving component, and a database component.  Large forums have their component servers broken down along these lines, but smaller forums start on one server until they can justify the additional complexity.  I’d prefer a more nuanced solution in which all servers benefit from the performance increases of hybrid storage, rather than only those few servers that are most sensitive to storage performance.  Even with database servers I’m interested in having the database stored on the SSD and the operating system and logs stored elsewhere.  I can do this by manually configuring shares on a per-server basis, but I don’t like the additional complexity.

Overall this doesn’t seem like a bad solution, but it’s not ideal.

Option Two: Use a Smart RAID Card

LSI’s CacheCade technology and its competitors offer some compelling features at a higher cost than a normal RAID card.  In these cards you add both SSD and traditional hard drives to your storage server, configure the hard drives as a normal RAID array, and tell the controller to use the SSDs as cache.

Now the controller will watch your file access pattern, and “hot” data that is frequently accessed will be moved to the SSDs, while “cool” data stays on the hard drives.
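
To make the hot/cool idea concrete, here is a toy model of frequency-based promotion.  It is only a sketch of the general technique, not LSI’s actual algorithm: the class name, the read-count threshold, and the eviction rule are all assumptions made for illustration.

```python
from collections import Counter


class HotDataCache:
    """Toy model of a controller that promotes frequently read blocks to SSD."""

    def __init__(self, ssd_slots: int, hot_threshold: int):
        self.ssd_slots = ssd_slots          # how many blocks fit in the SSD cache
        self.hot_threshold = hot_threshold  # reads needed before a block counts as hot
        self.reads = Counter()              # read count per block
        self.on_ssd = set()                 # blocks currently cached on SSD

    def read(self, block: int) -> str:
        """Record a read and report which tier served it."""
        self.reads[block] += 1
        if block in self.on_ssd:
            return "ssd"
        if self.reads[block] >= self.hot_threshold:
            self._promote(block)
        return "hdd"

    def _promote(self, block: int) -> None:
        """Copy a hot block to SSD, evicting the coolest cached block if full."""
        if len(self.on_ssd) >= self.ssd_slots:
            if not self.on_ssd:
                return  # no SSD cache configured at all
            coolest = min(self.on_ssd, key=lambda b: self.reads[b])
            if self.reads[coolest] >= self.reads[block]:
                return  # everything cached is at least as hot; leave it alone
            self.on_ssd.remove(coolest)
        self.on_ssd.add(block)


cache = HotDataCache(ssd_slots=2, hot_threshold=3)
served = [cache.read(7) for _ in range(4)]
print(served)  # ['hdd', 'hdd', 'hdd', 'ssd']
```

The third read crosses the threshold and triggers the promotion, so the fourth read is the first one served from SSD.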

Advantages:

  • Smart.  Only the data that benefits most from the fast access offered by SSDs will be moved there, while the remaining data will stay on the hard drives.  This makes better use of SSD storage than the simpler option above.
  • Simple.  Configure your shares as necessary, and don’t worry about data access speed.  The two volumes that were separated based on perceived performance requirements in the simpler solution above are combined into one, and the RAID controller allocates storage based on your actual, demonstrated performance needs.  This is a huge improvement over the earlier solution.
  • Easy.  Spend more money on an SSD-capable RAID card and you get these benefits, without needing to learn or manage anything more than that.
  • Simple Maintenance.  If a hard drive fails, pull it out and replace it and the RAID card will spin the new drive up and get it online as part of the RAID array.  I love keeping things simple, and drive replacement is as simple as it gets using hardware RAID.

Disadvantages:

  • Cost.  This is the big one.  Even experimenting with this technology requires buying a compatible RAID card plus the additional license to enable the CacheCade system.  Even so, we’re talking about less than $1,200 per server to go this route, which is pretty reasonable in the grand scheme of things.  After all, a new EqualLogic that does all this will cost somewhere in excess of $40,000…

Option Three: Skip Hardware RAID and use ZFS Instead

I should say up-front that I started off opposed to this idea, because my experiences with software RAID a decade ago in both Windows and Linux were vastly inferior to my experiences with hardware RAID.

After some research though, this actually started to make a lot more sense.  ZFS is a lot more complex than traditional filesystems running on top of hardware RAID, but there are some significant advantages here.

Advantages:

  • Speed.  The goal here is to somehow merge SSDs into the storage array to get more IOPS.  ZFS can go one step better: it uses RAM as the primary cache for frequently used data, then falls back to SSD or other fast storage when there isn’t enough RAM.  In theory this is significantly faster than something like CacheCade, as a server may have 50 to 100 times more RAM available for caching than one can put on a RAID controller.
  • Really, speed.  Some ZFS users are reporting in excess of 100,000 IOPS in their installations using fairly generic hardware.  That’s insane, but well-designed and appropriately-tuned ZFS systems can pull this off.
  • Cost.  No RAID card necessary.  For people like me who have spare (or underutilized) server-class hardware sitting around this is a very inexpensive solution.  If buying new hardware, each node costs one RAID card and two drives (the mirrored volume used for the OS) less than it would with the other options.
  • Smart.  You receive the same advantages with regard to resource utilization and speed boosts for all servers that you do with CacheCade technology.
  • Flexible.  A RAID card is powered by a low-powered CPU designed to run calculations on storage to manage your arrays.  ZFS uses the processors on the computer itself which offer much more computing power, and this additional power can translate into useful features.  Want to compress data?  Done.  Want deduplication on your network shares?  Just enable it.  Do you want to use parity RAID but you’re smart enough not to trust RAID5 any more?  There are some solid options in ZFS, though I’m still a RAID10 advocate.
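
The RAM-then-SSD read path from the Speed bullet can be sketched as a two-level cache in front of the disks.  ZFS calls these caches the ARC (in RAM) and the L2ARC (on SSD); the code below is a simplified model under my own assumptions, not how ZFS implements them.  In particular, plain LRU eviction stands in for the real ARC’s adaptive policy, and the class names and slot counts are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        """Store a value; return the evicted (key, value) pair, if any."""
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            return self.items.popitem(last=False)
        return None


class TieredStore:
    """Read path: RAM cache first, then SSD cache, then the hard drives."""

    def __init__(self, disk: dict, ram_slots: int, ssd_slots: int):
        self.disk = disk
        self.ram = LRUCache(ram_slots)
        self.ssd = LRUCache(ssd_slots)

    def read(self, block):
        """Return (data, name of the tier that served the read)."""
        data = self.ram.get(block)
        if data is not None:
            return data, "ram"
        data = self.ssd.get(block)
        tier = "ssd"
        if data is None:
            data, tier = self.disk[block], "disk"
        evicted = self.ram.put(block, data)
        if evicted is not None:
            self.ssd.put(*evicted)  # blocks pushed out of RAM spill to the SSD cache
        return data, tier


store = TieredStore(disk={n: f"block-{n}" for n in range(10)}, ram_slots=2, ssd_slots=4)
tiers = [store.read(n)[1] for n in (0, 1, 0, 2, 1)]
print(tiers)  # ['disk', 'disk', 'ram', 'disk', 'ssd']
```

Block 1 is pushed out of RAM when block 2 arrives, so its next read comes from the SSD tier instead of going all the way back to the hard drives.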

Disadvantages:

  • Different enough to need retraining.  Forget what you know about RAID, as this is a whole new system.  You need to think differently and drop your preconceived ideas at the door.  Decisions you make up-front are decisions you may be living with for a long time.
  • Complex.  There is a whole lot of flexibility here, though there are some significant gotchas surrounding a system with this level of complexity.  With a RAID card you can do some searching on the web to help choose between RAID10 and RAID6, turn it on in hardware, and ignore it.  You do this with ZFS at your peril.
  • Requires Planning.  If you turn on deduplication and later decide to disable it for performance reasons, you may be out of luck, as data that has already been deduplicated stays that way until it is rewritten.  You can keep expanding a ZPool (the ZFS storage pool, roughly the equivalent of a volume) by adding virtual devices, but if you have a mirrored virtual device at 80% capacity and then expand the pool with another mirrored virtual device, the first will stay at 80% capacity: the ZPool won’t automatically redistribute existing data onto the new virtual device.  If you create a 5-drive RAIDZ2 virtual device (RAID6 equivalent) and want to grow it by adding a drive, you’re out of luck.  You’ll need to add another virtual device (or another ZPool) instead.  This really isn’t a solution you can learn as you go.
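
The pool expansion gotcha is easier to see with numbers.  This is a back-of-the-envelope sketch only: the 4 TB sizes and the device names are hypothetical, and it models nothing about how ZFS places new writes.

```python
from dataclasses import dataclass


@dataclass
class Vdev:
    """A virtual device in the pool, sizes in terabytes."""
    name: str
    capacity_tb: float
    used_tb: float

    @property
    def percent_full(self) -> float:
        return 100 * self.used_tb / self.capacity_tb


def pool_percent_full(vdevs: list[Vdev]) -> float:
    """Utilization of the pool as a whole."""
    total_used = sum(v.used_tb for v in vdevs)
    total_capacity = sum(v.capacity_tb for v in vdevs)
    return 100 * total_used / total_capacity


# Hypothetical sizes: one 4 TB mirror that is 80% full.
pool = [Vdev("mirror-0", capacity_tb=4.0, used_tb=3.2)]

# Expand the pool with a second, empty 4 TB mirror.  Existing data does not move.
pool.append(Vdev("mirror-1", capacity_tb=4.0, used_tb=0.0))

for v in pool:
    print(f"{v.name}: {v.percent_full:.0f}% full")
print(f"pool: {pool_percent_full(pool):.0f}% full")
```

The pool as a whole reports 40% full, which hides the fact that the original mirror is still sitting at 80%.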

Conclusion

At this point the best I can say is “it looks like it is very possible to get better-than-EqualLogic performance on standard hardware if I do my part.”  Since I have some server hardware lying around, I am more tempted by a ZFS-based solution than by one that requires buying new hardware.

I’ll detail more on my thinking in the next post in this series.
