ZFS Resources

Recently I was given the task of putting together a storage solution to house a large amount of our digital assets, with enough headroom to meet our needs over the next few years.  The project called for a solution that could scale up to around 120TB of usable space.  Depending on the price, this solution might also be used to store a majority of our digital archive (audio and video).

I will go into the specific hardware and software details of the project in another post; after about a month of research, however, we decided to go with a solution built around the ZFS filesystem.

Here are a few documents that I found invaluable during my setup and overall planning:

ZFS Best Practices Guide

ZFS Configuration Guide

ZFS Troubleshooting Guide

ZFS Troubleshooting and Cheatsheet Guide

These links are a good starting point for anyone who wants to gain a better overall understanding of how best to administer a server running ZFS.  The ‘best practices guide’ is also a great resource to consult during the initial project planning stages.
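To make the planning advice concrete, here is a minimal sketch of the kind of pool layout those guides steer you toward; the pool name and device names are hypothetical placeholders:

    # Create a double-parity (raidz2) pool named "tank" from six disks.
    # Device names are hypothetical; substitute the disks on your system.
    zpool create tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0

    # Verify the layout and health of the pool.
    zpool status tank

    # Carve out a filesystem within the pool and enable compression on it.
    zfs create tank/assets
    zfs set compression=on tank/assets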

Poor LSI SAS1068E Write Performance with Linux

While doing research into poor write performance with Oracle, I discovered that the server was using the LSI SAS1068E. We had a RAID1 setup with 300GB 10K RPM SAS drives. Google provided some possible insight into why the write performance was so bad (1, 2). The main problem with this card is that it has no battery-backed write cache, which means the write cache is disabled by default. I was able to turn the write cache on using the LSI utility.

This change, however, did not seem to make any difference in performance.  At this point I came to the conclusion that the card itself was to blame.  I believe this is an inexpensive RAID card that is fine for general RAID0 and RAID1 use; for anything where write throughput is important, though, it might be better to spring for something a little more expensive.
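For what it's worth, a crude sequential write test is enough to show whether a change like this made a dent; here is a hedged sketch, where the mountpoint and sizes are placeholders:

    # Rough sequential write test against the array; /u01 is a hypothetical
    # mountpoint.  oflag=direct bypasses the page cache so the reported
    # throughput reflects the controller and disks, not RAM.
    dd if=/dev/zero of=/u01/ddtest bs=1M count=4096 oflag=direct
    rm /u01/ddtest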

When all was said and done, we ended up replacing all of these LSI cards with Dell PERC 6i cards.  The PERC 6i does come with a battery-backed cache, which allowed us to enable the write cache; needless to say, performance improved significantly.

Poor Write Performance with Oracle

We recently deployed an Oracle virtual machine for development and testing purposes. Imports and database migration scripts were taking several hours on existing VMs, so we hoped this new machine, with more RAM (32 GB) and more CPU horsepower (quad-core Intel Xeons), would allow those operations to move along much more quickly.

We soon got reports from users that this server was in fact much slower than the existing, less powerful Oracle VMs. After doing some poking around (with vztop) we discovered that there were no issues with CPU or memory resources; however, the server was performing terribly when it came to I/O.
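As a sketch of the kind of follow-up check that surfaces this (assuming the sysstat package is installed on the hardware node), iostat's extended statistics separate device saturation from raw throughput:

    # Extended device statistics every 5 seconds; sustained high %util and
    # await values alongside low throughput point at an I/O bottleneck
    # rather than a CPU or memory problem.
    iostat -x 5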


Nexenta

I came across an interesting project last week while doing some research on OpenSolaris and ZFS: a distribution called Nexenta.  The Nexenta kernel is based on OpenSolaris, while the userspace tools are based on Debian/Ubuntu.

There is also a commercial offshoot, the Nexenta storage appliance, which is the Nexenta distribution packaged as a ZFS-based storage server.  Pricing depends on the maximum size of the storage pool.

I have downloaded the free version and am currently planning to test this distro with Gluster as well.  The FUSE project (which a Gluster client requires in order to mount the filesystem) is currently not stable on OpenSolaris, so I plan to use Nexenta for the server bricks of the Gluster cluster and Linux for the clients, since FUSE has no issues running on Linux.

Kontrollbase

Anyone looking for a free tool to monitor MySQL should have a look at Kontrollbase. I have contributed a few patches to Matt Reid and the project over the past few months, and I am currently using it to monitor several MySQL 5.x servers.  It is a good alternative to the MySQL Enterprise Dashboard tool that MySQL offers.

Gluster 2.0

I recently went looking to see what sort of open source scalable filesystem projects existed.  I wanted to put together a storage solution that would scale upwards of 100 TB using open source software and commodity hardware.  During the search I became reacquainted with the GlusterFS project.

I had configured a 3-brick ‘unify’ cluster a while back with one of their 1.3.x builds; however, I had not gotten an opportunity to play with it much after that.

After looking at the various other options out there, spending a considerable amount of time on IRC, and reviewing the contents of their mailing lists, I ended up settling on GlusterFS due to its seemingly simple design, management, and configuration, as well as its future roadmap goals.

As it turns out, a few days after I started my search the Gluster team released version 2.0 of their software.  At this point I have set up a 5-brick ‘distribute’ (DHT) cluster on a few of our Proxmox (OpenVZ) servers.
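For flavor, here is a minimal sketch of what a client volfile for a distribute setup looks like in Gluster 2.0; the hostnames are hypothetical, and only two of the five bricks are shown:

    # /etc/glusterfs/client.vol (hostnames are hypothetical)
    volume brick1
      type protocol/client
      option transport-type tcp
      option remote-host vz1.example.com
      option remote-subvolume brick
    end-volume

    volume brick2
      type protocol/client
      option transport-type tcp
      option remote-host vz2.example.com
      option remote-subvolume brick
    end-volume

    # Aggregate the bricks with the DHT translator.
    volume dht
      type cluster/distribute
      subvolumes brick1 brick2
    end-volume

The client then mounts it with something like glusterfs -f /etc/glusterfs/client.vol /mnt/glusterfs.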

I now have 5 independent 4GB bricks, which are presented to the client as a single 20GB mountpoint.  In this case I am currently exporting CIFS (Samba) on top of the gluster mountpoint; I found some very useful instructions on setup, etc. here.  I plan to test NFS as well at some point on real physical hardware, due to current OpenVZ limitations on running NFS servers inside of a container.
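The Samba side of that is just an ordinary share pointed at the Gluster mountpoint; a minimal sketch, with a hypothetical share name and path:

    # /etc/samba/smb.conf (share name and path are hypothetical)
    [gluster]
        path = /mnt/glusterfs
        read only = no
        guest ok = no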

One thing I was unable to get working at this point is running the glusterfs client and server on the same machine.  The single client/server setup worked flawlessly on my Ubuntu laptop, so I suspect this is just an OpenVZ issue that I need to work out.

Commencement post!

Welcome to shainmiley.com. I plan to use this blog to discuss some of the technological issues that I encounter on a day-to-day basis.  Topics will include Linux, scaling infrastructure, cloud computing, MySQL, open source, storage, etc.