Advanced Hard Drive Caching Techniques

With the introduction of the solid-state Flash drive, performance came to the forefront for data storage technologies. Prior to that, software developers and server administrators needed to devise methods by which they could increase I/O throughput to storage, most of which resulted in low-capacity caching to random access memory (RAM) or a RAM drive. Although not as fast as RAM, the Flash drive was almost a dream come true, but it had its limitations, one of which was the low capacities packaged in its NAND-based chips. The traditional spinning disk drive provided the capacities users desired but lacked speedy accessibility. Even with the 6Gb/s SATA protocol, sequential data access at best performed at approximately 150MB per second (MB/s) for both read and write operations, while random access varied between 2–5MB/s, as seeking across multiple sectors laid out in multiple tracks across multiple spinning platters proved to be an extremely disruptive bottleneck. The solid-state drive (SSD), with no movable components, significantly decreased these access latencies, rendering this bottleneck almost nonexistent.

Even today, the consumer SSD cannot compare to the capacities provided by the magnetic hard disk drive (HDD), which is why in this article I intend to introduce readers to proven methods for obtaining near-SSD performance from the traditional HDD. Multiple open-source projects exist that can achieve this; all but one use an SSD as the caching device, and the remaining one caches to RAM. The device drivers I cover here are dm-cache, FlashCache and the RapidDisk/RapidCache suite; I also briefly discuss bcache and EnhanceIO.

Note:

To build the kernel modules shown in this article, you need to have either the full kernel source or the kernel headers installed for your current kernel image revision.
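
On a Debian- or Ubuntu-based system, for example, something like the following usually suffices (the package name is an assumption and varies by distribution; Red Hat-based systems use the kernel-devel package instead):


$ sudo apt-get install linux-headers-$(uname -r)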

In my examples, I am using a commercial SATA III (6Gbps) SSD with an average performance of the following:

  • Sequential read: 231MB/s

  • Sequential write: 74MB/s

  • Random read: 230MB/s

  • Random write: 72MB/s

This SSD provides the caching layer for a slower mechanical SATA III HDD that performs at the following:

  • Sequential read: 115MB/s

  • Sequential write: 72MB/s

  • Random read: 2MB/s

  • Random write: 2MB/s

In my environment, the SSD is labeled /dev/sdb, and the HDD is /dev/sda3. These are non-intrusive, transparent caching solutions intended to achieve the performance benefits of SSDs. They can be added to and removed from existing storage targets without issue or data loss (assuming that all cached data has been flushed to disk successfully). Also, all the examples here showcase a write-back caching scheme, with the exception of RapidCache, which instead will be used in write-through mode. In write-back mode, newly written data is cached but not immediately written to the destination target. Write-through mode always writes new data to the target while still maintaining it in cache for future reads.

Note:

The benchmarks shown here were obtained using FIO, a file I/O benchmarking and test tool designed for data storage technologies. It is maintained by Linux kernel developer Jens Axboe. Unless noted otherwise, all captured I/O is written at the typical 4KB page size, asynchronously, to the storage target at a queue depth of 32 (that is, 32 transfers in flight at a time).
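
For reference, an FIO invocation resembling the random read workload used here might look like the following (a sketch only; the actual job options behind the numbers above are not listed in this article):


$ sudo fio --name=random-read --rw=randread --bs=4k --iodepth=32 \
   --ioengine=libaio --direct=1 --runtime=60 --filename=/dev/sdb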

dm-cache

dm-cache has been around for quite some time, at least since 2006. It originally made its debut as a research project developed by Dr Ming Zhao during his summer internship at IBM Research. The dm-cache module was integrated into the Linux kernel tree only recently, as of version 3.9. Whether you choose to enable it in a recently downloaded kernel or compile it from the official project site, the results will be the same. To load the module, invoke modprobe or insmod:


$ sudo modprobe dm-cache
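
To confirm that the module loaded successfully (the exact module name can differ between the in-kernel build and the project-site build, so this is only a quick sanity check):


$ lsmod | grep -i cache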

Now that the module is loaded, you need to tell it which drive to use as the cache and which as the destination. The dm-cache project site provides a Perl script, dmc-setup.pl, to simplify this process. For example, if I wanted to use the entire SSD in write-back caching mode with a 4KB block size, I would type:


$ sudo perl dmc-setup.pl -o /dev/sda3 -c /dev/sdb -n cache -b 8 -w

This script is a wrapper around the equivalent dmsetup command below:


$ echo 0 20971520 cache /dev/sda3 /dev/sdb 0 8 65536 16 1 | \
   sudo dmsetup create cache

The dm-cache documentation hosted on the project site provides details on each parameter field, so I don't cover all of them in depth here.
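
That said, as a rough guide, here is my annotated reading of that echoed line. This is an interpretation based on the status and table output shown below, so defer to the project documentation for the authoritative field definitions:


# 0          start sector of the mapped device
# 20971520   length in 512-byte sectors (10GB here)
# cache      the device-mapper target type
# /dev/sda3  the source (slow) device
# /dev/sdb   the cache (SSD) device
# 0          cache device start offset
# 8          cache block size in 512-byte sectors (8 x 512 = 4KB)
# 65536      cache size in blocks (65536 x 4KB = 256MB)
# 16         set associativity
# 1          write policy (1 = write-back)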

You may notice that in both examples, I named the mapping of the two drives "cache", so whenever I need to access the drive mapping, I must refer to it as "cache".

The following mapping passes all data requests to the caching driver, which in turn performs the necessary magic to process each request either entirely out of the cache or from both the cache and the slower device:


$ ls -l /dev/mapper
total 0
lrwxrwxrwx 1 root root       7 Jun 30 12:10 cache -> ../dm-0
crw------- 1 root root 10, 236 Jun 30 11:52 control

Just like with any other device-mapper-enabled target, I can also pull up detailed mapping data:


$ sudo dmsetup status cache
0 20971520 cache stats: reads(83), writes(0), cache hits(0, 0.0), replacement(0), replaced dirty blocks(0)

$ sudo dmsetup table cache
0 20971520 cache conf: capacity(256M), associativity(16), block size(4K), write-back

If the target drive is already formatted with data on it, you just need to mount it; otherwise, format it with your filesystem of choice:


$ sudo mke2fs -F /dev/mapper/cache

Remember, these solutions are non-intrusive, so if you have existing data that needs to remain on that disk drive, skip the above step and go straight to mounting it for data accessibility:


$ sudo mount /dev/mapper/cache /mnt/cache
$ df|grep cache
/dev/mapper/cache  10321208 1072632   8724288  11% /mnt/cache
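
Removing the caching layer is just as straightforward. Once all pending writes have been flushed to the backing disk (see the caveat above), unmount the filesystem and remove the device-mapper target:


$ sudo umount /mnt/cache
$ sudo dmsetup remove cache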

______________________

Petros Koutoupis is a full-time Linux kernel, device-driver and application developer for embedded and server platforms. He has been working in the data storage industry for more than six years and enjoys discussing the same technologies.

Comments



Rapid[Disk,Cache] better than native ram caching?

Bucky:

The kernel has been doing RAM caching for a loooong time. It would take a lot to convince me that the RAM caching these guys have come up with is somehow better, ESPECIALLY when you consider all the other RAM needs of the system. If it's hoarding memory while other processes have to do without or dip into swap, then that's not a good thing.

The SSD stuff looks cool, though. I'll have to play with this soon.

The kernel doesn't really

Anonymous:

The kernel doesn't really cache block-level data, just file-level data within the VFS, and even that is limited. Besides, when writing to a physical device, it is written like Direct I/O. With RapidDisk (and, in turn, RapidCache) mapping a RAM drive to your slower disk, you can allocate multiple gigabytes or even terabytes of cache to slower volumes. The OS does need memory, but nowadays we are packing more than the typical OS uses. What are you going to do with all 16GB on your laptop? Or 512GB to 1TB on your servers?

Correction (Page 3)

Petros Koutoupis:

There is a typo, one I wish I had caught before submitting, so it is my fault alone. On page 3, following the example of inserting the rxdsk module, the next line is not supposed to show the removal of that module, but instead the insertion of the rxcache module. So the following:

$ sudo modprobe rxdsk
$ sudo modprobe -r rxdsk

Should read:

$ sudo modprobe rxdsk
$ sudo modprobe rxcache

