Advanced Hard Drive Caching Techniques

With the introduction of the solid-state Flash drive, performance came to the forefront for data storage technologies. Prior to that, software developers and server administrators needed to devise methods for which they could increase I/O throughput to storage, most of which resulted in low capacity caching to random access memory (RAM) or a RAM drive. Although not as fast as RAM, the Flash drive was almost a dream come true, but it had its limitations—one of which was its low capacities packaged in the NAND-based chips. The traditional spinning disk drive provided users' desired capacities but lacked in speedy accessibility. Even with the 6Gb SATA protocol, sequential data access at best performed at approximately 150MB per second (or MB/s) for both read and write operations, while random access varied between 2–5MB/s as the seeking across multiple sectors laid out in multiple tracks across multiple spinning platters proved to be an extremely disruptive bottleneck. The solid-state drive (SSD) with no movable components significantly decreased these access latencies, thus rendering this bottleneck almost nonexistent.

Even today, the consumer SSD cannot compare to the capacities provided by the magnetic hard disk drive (or HDD), which is why in this article I intend to introduce readers to proven methods for obtaining near SSD performance with the traditional HDD. Multiple open-source projects exist that can achieve this, all but one of which utilizes an SSD as a caching node, and the other caches to RAM. The device drivers I cover here are dm-cache, FlashCache and the RapidDisk/RapidCache suite; I also briefly discuss bcache and EnhanceIO.


To build the kernel modules shown in this article, you need to have either the full kernel source or the kernel headers installed for your current kernel image revision.

In my examples, I am using a commercial SATA III (6Gbps) SSD with an average performance of the following:

  • Sequential read: 231MB/s

  • Sequential write: 74MB/s

  • Random read: 230MB/s

  • Random write: 72MB/s

This SSD provides the caching layer for a slower mechanical SATA III HDD that performs at the following:

  • Sequential read: 115MB/s

  • Sequential write: 72MB/s

  • Random read: 2MB/s

  • Random write: 2MB/s

In my environment, the SSD is labeled as /dev/sdb, and the HDD is /dev/sda3. These are non-intrusive transparent caching solutions intended to achieve the performance benefits of SSDs. They can be added and removed to existing storage targets without issue or data loss (assuming that all cached data has been flushed to disk successfully). Also, all the examples here showcase a write-back caching scheme with the exception of RapidCache, which instead will be used in write-through mode. In write-back mode, newly written data is cached but not immediately written to the destination target. Write-through mode always will write new data to the target while still maintaining it in cache for future reads.


The benchmarks shown here were obtained by using FIO, a file I/O benchmarking and test tool designed for data storage technologies. It is maintained by Linux kernel developer Jens Axboe. Unless noted otherwise, all captured I/O is written at the typical 4KB page size, asynchronously to the storage target 32 transfers at a time (that is, queue depth).


dm-cache has been around for quite some time—at least since 2006. It originally made its debut as a research project developed by Dr Ming Zhao through his summer internship at IBM research. The dm-cache module just recently was integrated into the Linux kernel tree as of version 3.9. Whether you choose to enable it in a recently downloaded kernel or compile it from the official project site, the results will be the same. To load the module, you need to invoke modprobe or insmod:

$ sudo modprobe dm-cache

Now that the module is loaded, you need to inform that module about which drive to point to for the cache and which to point to for the destination. The dm-cache project site provides a Perl script to simplify this process called For example, if I wanted to use the entire SSD in write-back caching mode with a 4KB block size, I would type:

$ sudo perl -o /dev/sda3 -c /dev/sdb -n cache -b 8 -w

This script is a wrapper to the equivalent dmsetup command below:

$ echo 0 20971520 cache /dev/sda3 /dev/sdb 0 8 65536 16 1 | 
 ↪dmsetup create cache

The dm-cache documentation hosted on the project site provides details on each parameter field, so I don't cover them here.

You may notice that in both examples, I named the mapping to both drives "cache". So, when I need to access the drive mapping, I must refer to it as "cache".

The following mapping passes all data requests to the caching driver, which in turn performs the necessary magic to process the requests either by handling it entirely out of cache or both the cache and the slower device:

$ ls -l /dev/mapper
total 0
lrwxrwxrwx 1 root root       7 Jun 30 12:10 cache -> ../dm-0
crw------- 1 root root 10, 236 Jun 30 11:52 control

Just like with any other device-mapper-enabled target, I also can pull up detailed mapping data:

$ sudo dmsetup status cache
0 20971520 cache stats: reads(83), writes(0), 
 ↪cache hits(0, 0.0),replacement(0), replaced dirty blocks(0)

$ sudo dmsetup table cache
0 20971520 cache conf: capacity(256M), associativity(16), 
 ↪block size(4K), write-back

If the target drive already is formatted with data on it, you just need to mount it; otherwise, format it to your specified filesystem:

$ sudo mke2fs -F /dev/mapper/cache

Remember, these solutions are non-intrusive, so if you have existing data that needs to remain on that disk drive, skip the above step and go straight to mounting it for data accessibility:

$ sudo mount /dev/mapper/cache /mnt/cache
$ df|grep cache
/dev/mapper/cache  10321208 1072632   8724288  11% /mnt/cache


Petros Koutoupis is a full-time Linux kernel, device-driver and application developer for embedded and server platforms. He has been working in the data storage industry for more than six years and enjoys discussing the same technologies.


Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Also on my list is new spark

sollen's picture

Also on my list is new spark plugs and wires. The engine 27W Forklift driving lamps LED has some minor leaks, so I’m going to go ahead and replace the valve cover gasket and intake manifold gasket.

Rapid[Disk,Cache] better than native ram caching?

Bucky's picture

The kernel has been doing ram caching for a loooong time. It would take a lot to convince me that the ram caching these guys have come up with is somehow better, ESPECIALLY when you consider all the other RAM needs of the system. If it's hoarding memory while other processes have to do without or dip into swap then that's not a good thing.

The SSD stuff looks cool, though. I'll have to play with this soon.

The kernel doesn't really

Anonymous's picture

The kernel doesn't really cache block level data just file level within VFS and even that is limited. Besides, when writing to a physical device, it is written like Direct I/O. With RapidDisk and in turn RapidCache mapping a RAM drive to your slower disk, you can allocate multiple gigabytes and even terabytes of data to slower volumes. The OS does need memory but nowadays, we are packing more than the typical OS uses. What are you going to do with all 16 GBytes on your laptop? Or 512 GBytes to 1 Terabyte on your servers?

Correction (Page 3)

Petros Koutoupis's picture

There is a typo, one I wish I would have caught before submitting; so it is my fault alone. On page 3, following the example of inserting the rxdsk module, the next line is not supposed to read the removal of the module, but instead the insertion of the rxcache module. So the following:

$ sudo modprobe rxdsk
$ sudo modprobe -r rxdsk

Should read:

$ sudo modprobe rxdsk
$ sudo modprobe rxcache

Petros Koutoupis is a full-time Linux kernel, device-driver and
application developer for embedded and server platforms. He has been
working in the data storage industry for more than six years and enjoys discussing the same technologies.

Geek Guide
The DevOps Toolbox

Tools and Technologies for Scale and Reliability
by Linux Journal Editor Bill Childers

Get your free copy today

Sponsored by IBM

Upcoming Webinar
8 Signs You're Beyond Cron

Scheduling Crontabs With an Enterprise Scheduler
11am CDT, April 29th
Moderated by Linux Journal Contributor Mike Diehl

Sign up now

Sponsored by Skybot