Category Archives: Linux

All things Linux.

Performance and scalability improvements in MySQL 5.5

Oracle’s MySQL blog has a very good post that provides an overview of some of the improvements that you can expect in the upcoming MySQL 5.5 release.

This writeup focuses mainly on the changes as they relate to performance and scalability; however, the author (Rob Young) says he plans to discuss other aspects as well in the near future.

Here are just a few of the topics covered by Rob:

  • Improved Default Thread Concurrency
  • Improved Recovery Performance
  • Multiple Buffer Pool Instances
  • Native Asynchronous I/O for Linux
  • Improved Metadata Locking Within Transactions
  • Better performance on Windows-based installs

At some point I hope to continue my testing and benchmarking of several different MySQL versions and variants, such as MariaDB, Percona and MySQL 5.5. However, for production databases we will be sticking with the MySQL 5.1.x code branch for the foreseeable future.

Gluster 3.1 released

Today the team over at Gluster.com announced the availability of version 3.1 of their software. There are currently two different offerings available from Gluster. The first is the Gluster Storage Platform, known as ‘GlusterSP’, which provides a Linux-based bare metal installer, a web-based front end, etc.

They also offer ‘GlusterFS’, which is released as open source and provides the same functionality as GlusterSP. Unlike GlusterSP, it does not require a fresh install; you can use it on an existing Linux or Solaris based system.

The 3.1 release brings the following new features:

Elastic Volume Management: logical storage volumes are decoupled from physical hardware, allowing administrators to grow, shrink and migrate storage volumes without any application downtime. As storage is added, storage volumes are automatically rebalanced across the cluster making it always available online regardless of changes to the underlying hardware.

New Gluster Console Manager: the Command Line Interface (CLI), Application Programming Interface (API) and shell are merged into a single powerful interface, enabling automation by giving the CLI higher level APIs and scripting capabilities. Languages such as Python, Ruby or PHP can be used to script a series of commands that are invoked through the command line. This new tool requires no new APIs and is able to script out and rapidly automate any information inserted in the CLI, giving cloud administrators the ability to simply automate large scale operations.
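As a rough illustration of that scripting idea, here is a minimal Python sketch that builds a series of `gluster volume` commands and, by default, only prints them. The helper names `volume_commands` and `run_commands`, the volume name and the brick paths are all hypothetical; only the `gluster volume create`, `start` and `info` subcommands come from the Gluster CLI itself.

```python
import subprocess


def volume_commands(name, bricks):
    """Build the gluster CLI calls to create, start and inspect a volume.

    `name` and `bricks` (strings like "host:/path") are supplied by the caller.
    """
    return [
        ["gluster", "volume", "create", name] + list(bricks),
        ["gluster", "volume", "start", name],
        ["gluster", "volume", "info", name],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the script at the first failing command.
            subprocess.run(command, check=True)


# Hypothetical volume name and brick paths, for illustration only.
commands = volume_commands(
    "testvol",
    ["server1:/export/brick1", "server2:/export/brick1"],
)
run_commands(commands)  # dry run: prints the commands without executing them
```

Passing `dry_run=False` on a machine where the `gluster` binary is installed would actually run the commands, so it is worth reviewing the printed output first.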

Native Network File System (NFS): the release includes a native NFS v3 module that lets NFS clients communicate directly with any storage server in the cluster, with servers able to speak NFS and the Gluster protocol simultaneously. NFS requires no specialized training, making it simple and easy to deploy.

To find out more about Gluster you can visit Gluster.com. You can also visit Gluster.org if you want to get more familiar with the open source side of the Gluster house.

Ext4 vs ZFS Kernel Module: benchmarks so far.

Well, I have finally set aside some time to test performance using the ZFS kernel module that I blogged about a bit ago.

Overall, the ZFS kernel module produced results that were similar to the ones I saw while using ext4. However, most real-world ZFS setups are not limited to a single disk, so it will be very interesting to see what kind of performance numbers we get when we start benchmarking setups that have many disks.

Although the ZFS results were slower in almost every case, ext4 was not much faster in most of those cases, and I suspect that there are lots of people out there who would be more than willing to take a tiny hit in speed in order to gain the substantial benefits that come with having ZFS as the underlying filesystem.

Here are some of the benchmarks I got doing the following:

a) create 10,000 files using touch
b) create 10,000 directories using mkdir
c) untar the latest stable Linux kernel
d) create a 1GB file using dd
e) find 10,000 files
f) delete 10,000 files
g) find 10,000 directories
h) delete 10,000 directories
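
A minimal Python sketch of how most of the steps above could be timed. This is not the script used to produce the numbers in this post; the function name `run_benchmark`, its parameters and the directory layout are assumptions made for illustration. The untar step is left out because it depends on having a kernel tarball on hand, and the large file is written without an explicit sync, so the page cache may absorb much of the write.

```python
import os
import time


def run_benchmark(base_dir, count=10000, size_mb=1024):
    """Time basic file operations under base_dir (empty or not yet created).

    Returns a dict mapping each step name to elapsed wall-clock seconds.
    """
    files_dir = os.path.join(base_dir, "files")
    dirs_dir = os.path.join(base_dir, "dirs")
    os.makedirs(files_dir)
    os.makedirs(dirs_dir)
    big_file = os.path.join(base_dir, "bigfile")

    def touch_files():
        # Equivalent of `touch`: create empty files.
        for i in range(count):
            open(os.path.join(files_dir, "f%d" % i), "w").close()

    def make_dirs():
        for i in range(count):
            os.mkdir(os.path.join(dirs_dir, "d%d" % i))

    def write_big_file():
        # Write size_mb megabytes of zeros in 1 MB chunks, similar to dd.
        chunk = b"\0" * (1024 * 1024)
        with open(big_file, "wb") as handle:
            for _ in range(size_mb):
                handle.write(chunk)

    def find_entries(path):
        # Walk the directory once, counting every entry.
        return sum(1 for _ in os.scandir(path))

    def delete_files():
        for name in os.listdir(files_dir):
            os.remove(os.path.join(files_dir, name))

    def delete_dirs():
        for name in os.listdir(dirs_dir):
            os.rmdir(os.path.join(dirs_dir, name))

    steps = [
        ("touch", touch_files),
        ("mkdir", make_dirs),
        ("write_file", write_big_file),
        ("find_files", lambda: find_entries(files_dir)),
        ("delete_files", delete_files),
        ("find_dirs", lambda: find_entries(dirs_dir)),
        ("delete_dirs", delete_dirs),
    ]

    results = {}
    for label, step in steps:
        start = time.perf_counter()
        step()
        results[label] = time.perf_counter() - start
    return results
```

Pointing `base_dir` at a fresh directory on the filesystem under test and printing the returned dict gives one timing per step; the defaults match the 10,000 entries and 1 GB file used above.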

At some point soon I plan to add values for raidz2, btrfs, iozone results, etc.

Various File Operations in Seconds

Operation            Ext4     Zfs      Zfs-mirror
Touch x 10000        12.669   13.009   13.044
Mkdir x 10000        14.276   13.015   13.352
Untar kernel          4.997    6.577    9.787
Create 1 GB file      1.110    6.084   12.208
Delete files          0.122    0.577    0.526
Find files            0.036    0.096    0.141
Delete directories    0.163    0.247    0.261
Find directories      0.295    0.764    0.690

ZFS kernel module for Linux

UPDATE: If you are interested in ZFS on Linux you have two options at this point: the zfsonlinux project and the zfs-fuse project.

I have been actively following the zfsonlinux project because, once stable and ready, it should offer superior performance, since it avoids the extra overhead that the zfs-fuse project incurs by going through FUSE.

You can see another one of my posts concerning zfsonlinux here.

————————————————————————————————————————————————————-

KQ Infotech has released (currently in closed beta) code that brings ZFS to Linux via a loadable kernel module.

Here is a link to the current and future feature set. The reason that this is exciting is that although other ZFS implementations for Linux already exist, each of the available options has significant drawbacks. For example, ZFS-FUSE is implemented in userspace using FUSE, which has additional overhead due to the context switching that is required when moving back and forth between kernel space and user space.

Another option is ZFS on Linux, which provides a stable SPA, DMU and ZVOL layer, but does not provide a POSIX layer (ZPL) that would enable you to actually mount a ZFS filesystem from inside Linux. From what I understand, KQ Infotech has basically taken some of the ZFS on Linux code that was developed by the Lawrence Livermore National Laboratory (LLNL) and implemented the missing ZPL layer.

NPR was recently accepted into the closed beta program, and I took some time last week to get this module installed on a Dell PowerEdge 2950 running a 64-bit version of Ubuntu 10.04. We are currently testing ZFS under kernel version 2.6.32-24. I have not had a ton of time to test things out, but I would say so far so good. I plan on posting some ZFS and Btrfs benchmarks in the next few weeks after I get some time to better test performance, throughput, etc.

Btrfs: The Story So Far

Here is a link to a video presentation given by Josef Bacik, one of the 3 lead developers currently working on Btrfs. This presentation was given at LinuxCon Brazil 2010. The video lasts about an hour and according to the description provides:

‘A look at the features that currently exist in Btrfs and what features are left to be done. We’ll look at stability and what things testers need to look out for. There will be plenty of benchmarks and use cases for the different features of Btrfs. We will also discuss what testing needs to be done, and how testers can help us developers.’

If you have questions about the current state of Btrfs, the current and future development roadmap, benchmarks, etc., you should take some time to watch this video.

Current state of “Btrfs” File System for Linux

Oracle has provided a link to a webcast (registration required) on the state of Btrfs, given by lead Btrfs developer Chris Mason. Here is an excerpt from the webcast description:

‘Join Chris Mason, Director of Software Development at Oracle, the principal author of Btrfs file system, and our own resident Linux kernel guru, as he discusses the development, features, benefits of the “Btrfs” file system (pronounced “Butter F S”, “B-tree F S”) in Linux.’

The video lasts about 1 hour and provides a very good overview of the current state of the file system, some of the pros and cons of Btrfs under various workloads, some of the features that have been implemented thus far, as well as some of the tools and features that are slated for future releases.

O’Reilly MySQL Conference & Expo 2010 Keynote Videos

For those of us who were not able to attend this year’s MySQL Conference in Santa Clara, CA, the keynote videos have been posted online for your viewing pleasure!

UPDATE:

Here is a link to another location (YouTube) that has some more of the videos from the MySQL Conference, including the talk on Oracle to MySQL migration given by NPR’s own Joanne Garlow.

Raid Controller Caching Options

Here is a quick link to a blog post that talks about RAID caching for various database workloads. The post also lists some of the popular RAID cards that are in use today, as well as some interesting features that the author feels are missing from the current lineup of available RAID cards.