Advanced Hard Drive Caching Techniques
With the introduction of the solid-state Flash drive, performance came to the forefront for data storage technologies. Prior to that, software developers and server administrators had to devise their own methods for increasing I/O throughput to storage, most of which resulted in low-capacity caching to random access memory (RAM) or a RAM drive. Although not as fast as RAM, the Flash drive was almost a dream come true, but it had its limitations, one of which was the low capacity of its NAND-based chips. The traditional spinning disk drive provided the capacities users wanted but lacked speedy access. Even with the 6Gb/s SATA interface, sequential data access at best performed at approximately 150MB per second (MB/s) for both read and write operations, while random access varied between 2–5MB/s, because seeking across multiple sectors laid out in multiple tracks across multiple spinning platters proved to be an extremely disruptive bottleneck. The solid-state drive (SSD), with no moving components, significantly decreased these access latencies, rendering this bottleneck almost nonexistent.
Even today, the consumer SSD cannot compare to the capacities provided by the magnetic hard disk drive (HDD), which is why in this article I introduce readers to proven methods for obtaining near-SSD performance with the traditional HDD. Multiple open-source projects exist that can achieve this. All but one of them use an SSD as the caching node, and the remaining one caches to RAM. The device drivers I cover here are dm-cache, FlashCache and the RapidDisk/RapidCache suite; I also briefly discuss bcache and EnhanceIO.
To build the kernel modules shown in this article, you need to have either the full kernel source or the kernel headers installed for your current kernel image revision.
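A quick way to confirm that prerequisite is to look for the build directory that out-of-tree module builds are normally pointed at. The following is a minimal sketch; the helper name have_kernel_build_tree is hypothetical, and the /lib/modules/<release>/build location is the conventional path on most distributions rather than a guarantee:

```shell
#!/bin/sh
# Sketch: report whether a kernel build tree exists for a given release.
# have_kernel_build_tree is a hypothetical helper name, not part of any project.
have_kernel_build_tree() {
    release="${1:-$(uname -r)}"   # kernel release; defaults to the running kernel
    root="${2:-/lib/modules}"     # modules root; overridable for testing
    # The "build" entry is usually a symlink to the headers or full source tree.
    [ -f "$root/$release/build/Makefile" ]
}

if have_kernel_build_tree; then
    echo "build tree found for $(uname -r)"
else
    echo "no build tree for $(uname -r); install the kernel headers or source"
fi
```

If the check fails, installing your distribution's kernel headers package for the running kernel is typically enough.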
In my examples, I am using a commercial SATA III (6Gbps) SSD with an average performance of the following:
Sequential read: 231MB/s
Sequential write: 74MB/s
Random read: 230MB/s
Random write: 72MB/s
This SSD provides the caching layer for a slower mechanical SATA III HDD that performs at the following:
Sequential read: 115MB/s
Sequential write: 72MB/s
Random read: 2MB/s
Random write: 2MB/s
In my environment, the SSD is labeled as /dev/sdb, and the HDD is /dev/sda3. These are non-intrusive, transparent caching solutions intended to achieve the performance benefits of SSDs. They can be added to and removed from existing storage targets without issue or data loss (assuming that all cached data has been flushed to disk successfully). Also, all the examples here showcase a write-back caching scheme, with the exception of RapidCache, which instead is used in write-through mode. In write-back mode, newly written data is cached but not immediately written to the destination target. In write-through mode, new data always is written to the target while also being kept in cache for future reads.
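The difference between the two schemes can be illustrated with a toy model. This is only a sketch of the idea and not how any of the drivers covered here are implemented: two temporary directories stand in for the cache device and the backing disk, and the function names are hypothetical.

```shell
#!/bin/sh
# Toy model: directories stand in for the cache device and the backing disk.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/cache" "$work/disk" "$work/dirty"

write_through() {   # $1 = block name, $2 = data
    printf '%s' "$2" > "$work/cache/$1"
    printf '%s' "$2" > "$work/disk/$1"    # the target is updated immediately
}

write_back() {      # $1 = block name, $2 = data
    printf '%s' "$2" > "$work/cache/$1"
    : > "$work/dirty/$1"                  # remember that the disk copy is stale
}

flush() {           # copy every dirty block to the disk and clear its marker
    for marker in "$work"/dirty/*; do
        [ -e "$marker" ] || continue      # nothing is dirty
        name=${marker##*/}
        cp "$work/cache/$name" "$work/disk/$name"
        rm "$marker"
    done
}

write_through a one
write_back b two
[ -e "$work/disk/b" ] || echo "b not on disk yet"
flush
echo "b on disk after flush: $(cat "$work/disk/b")"
```

The window between write_back and flush is why a write-back cache must be flushed before it is detached: until then, the only current copy of the data lives in the cache.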
The benchmarks shown here were obtained by using FIO, a file I/O benchmarking and test tool designed for data storage technologies. It is maintained by Linux kernel developer Jens Axboe. Unless noted otherwise, all captured I/O is issued in 4KB transfers (the typical page size), asynchronously, with 32 transfers in flight at a time (that is, a queue depth of 32).
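For readers who want to run comparable tests, the parameters above map onto fio options roughly as follows. This is a sketch and not the exact job used for the numbers in this article: the helper name fio_cmd, the 60-second runtime, the libaio engine and the use of direct I/O are all assumptions. The script only prints the command so that you can review it first.

```shell
#!/bin/sh
# Sketch: print a fio command line for 4KB asynchronous I/O at queue depth 32.
# fio_cmd is a hypothetical helper; runtime, engine and direct I/O are assumptions.
fio_cmd() {
    target="$1"     # device or file to test, for example /dev/mapper/cache
    pattern="$2"    # read, write, randread or randwrite
    printf 'fio --name=%s --filename=%s --rw=%s --bs=4k --ioengine=libaio --iodepth=32 --direct=1 --runtime=60 --time_based\n' \
        "$pattern" "$target" "$pattern"
}

fio_cmd /dev/mapper/cache randread
```

Be careful with the write and randwrite patterns: pointed at a raw device, they overwrite whatever is stored on it.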
dm-cache has been around for quite some time, at least since 2006. It made its debut as a research project developed by Dr. Ming Zhao during his summer internship at IBM Research. The dm-cache module only recently was integrated into the Linux kernel tree, as of version 3.9. Whether you choose to enable it in a recently downloaded kernel or compile it from the official project site, the results will be the same. To load the module, you need to invoke modprobe or insmod:
$ sudo modprobe dm-cache
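One way to confirm that the load succeeded is to check that device-mapper now lists a target named "cache", because that is the target type used in the mapping table later on. The sketch below filters output in the shape printed by dmsetup targets (a target name followed by a version). The helper name has_dm_target is hypothetical, and the sample version strings are made up for the demonstration:

```shell
#!/bin/sh
# Sketch: succeed if a target name appears in "dmsetup targets"-style output.
# has_dm_target is a hypothetical helper name.
has_dm_target() {
    awk -v want="$1" '$1 == want { found = 1 } END { exit !found }'
}

# Real use (requires root):
#   sudo dmsetup targets | has_dm_target cache && echo "cache target registered"

# Demonstration on sample text; the version numbers are illustrative only.
printf 'cache v1.0.0\nlinear v1.0.0\n' | has_dm_target cache \
    && echo "cache target registered"
```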
Now that the module is loaded, you need to tell it which drive to use for the cache and which to use as the destination. The dm-cache project site provides a Perl script called dmc-setup.pl to simplify this process. For example, if I wanted to use the entire SSD in write-back caching mode with a 4KB block size (-b 8, that is, eight 512-byte sectors), I would type:
$ sudo perl dmc-setup.pl -o /dev/sda3 -c /dev/sdb -n cache -b 8 -w
This script is a wrapper to the equivalent dmsetup command below:
$ echo 0 20971520 cache /dev/sda3 /dev/sdb 0 8 65536 16 1 | sudo dmsetup create cache
The dm-cache documentation hosted on the project site provides details on each parameter field, so I don't cover them here.
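That said, a little arithmetic helps when reading the table line. Device-mapper tables count in 512-byte sectors, and the dmsetup table output in this walkthrough reports a 4K block size and a 256M capacity. Those values line up with the 8 and 65536 fields if they are read as sectors per block and cache blocks, respectively. That reading is my inference from the output and not a substitute for the project documentation. A minimal sketch of the conversion:

```shell
#!/bin/sh
# Sketch: convert the numbers in the dmsetup table line into familiar units.
# The meaning assigned to each field is inferred from the "dmsetup table" output.
sectors_to_bytes() { echo $(( $1 * 512 )); }   # device-mapper sectors are 512 bytes

block_bytes=$(sectors_to_bytes 8)                                   # 8 sectors per block
cache_mb=$(( 65536 * block_bytes / 1024 / 1024 ))                   # 65536 cache blocks
mapped_gb=$(( $(sectors_to_bytes 20971520) / 1024 / 1024 / 1024 ))  # mapping length

echo "block size: $block_bytes bytes"   # prints: block size: 4096 bytes
echo "cache capacity: ${cache_mb}MB"    # prints: cache capacity: 256MB
echo "mapped length: ${mapped_gb}GB"    # prints: mapped length: 10GB
```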
You may notice that in both examples, I named the mapping to both drives "cache". So, when I need to access the drive mapping, I must refer to it as "cache".
The following mapping passes all data requests to the caching driver, which in turn performs the necessary magic to process the requests, handling them either entirely out of cache or from both the cache and the slower device:
$ ls -l /dev/mapper
total 0
lrwxrwxrwx 1 root root 7 Jun 30 12:10 cache -> ../dm-0
crw------- 1 root root 10, 236 Jun 30 11:52 control
Just like with any other device-mapper-enabled target, I also can pull up detailed mapping data:
$ sudo dmsetup status cache
0 20971520 cache stats: reads(83), writes(0), cache hits(0, 0.0),replacement(0), replaced dirty blocks(0)
$ sudo dmsetup table cache
0 20971520 cache conf: capacity(256M), associativity(16), block size(4K), write-back
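When watching the cache warm up, it can be handy to pull individual counters out of that status line. The following is a small sketch that assumes the exact "name(value" layout shown above. The helper name dmc_stat is hypothetical, and the sample line is copied from the output in this article:

```shell
#!/bin/sh
# Sketch: extract one counter from a dm-cache status line read on stdin.
# dmc_stat is a hypothetical helper; it assumes the "name(value" layout above.
dmc_stat() {
    # $1: counter name as printed in the status line, e.g. "reads" or "cache hits"
    sed -n "s/.*$1(\([0-9][0-9]*\).*/\1/p"
}

# Sample line taken from the status output shown in this article.
status='0 20971520 cache stats: reads(83), writes(0), cache hits(0, 0.0),replacement(0), replaced dirty blocks(0)'

echo "reads=$(echo "$status" | dmc_stat reads) hits=$(echo "$status" | dmc_stat 'cache hits')"
# prints: reads=83 hits=0
```

On a live system, the same helper can read from sudo dmsetup status cache instead of the sample string.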
If the target drive already is formatted and holds data, you just need to mount it; otherwise, format it with the filesystem of your choice:
$ sudo mke2fs -F /dev/mapper/cache
Remember, these solutions are non-intrusive, so if you have existing data that needs to remain on that disk drive, skip the above step and go straight to mounting it for data accessibility:
$ sudo mount /dev/mapper/cache /mnt/cache
$ df|grep cache
/dev/mapper/cache 10321208 1072632 8724288 11% /mnt/cache
Petros Koutoupis is a full-time Linux kernel, device-driver and application developer for embedded and server platforms. He has been working in the data storage industry for more than six years and enjoys discussing the same technologies.