Advanced Hard Drive Caching Techniques
RapidDisk and RapidCache
Currently at version 2.9, RapidDisk is an advanced Linux RAM disk. It can allocate RAM dynamically as a block device, and that device can be used as a standalone disk drive or mapped as a caching node to a slower local disk drive via RapidCache (which was inspired by FlashCache and uses the device-mapper framework). RapidDisk stores data in RAM, allocating memory pages only as they are needed. RAM is volatile, so if power is removed or the computer is rebooted, none of the data stored in it is preserved. This is why the RapidCache module was designed to handle only read-through/write-through caching: whatever is written to the slower storage device is cached in RapidCache and written immediately to the hard drive. If data is requested from the hard drive and does not already exist in the RapidCache node, it is read from the slower device and then cached in the RapidCache node. This method retains the same write performance as the hard drive, but significantly increases sequential and random read performance for cached data.
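To make the read-through/write-through behavior concrete, here is a minimal Python sketch. It is not RapidCache code: the class name, the dictionaries standing in for the RAM cache and the hard drive, and the lose_power helper are all assumptions made purely for illustration.

```python
class WriteThroughCache:
    """Toy read-through/write-through cache in front of a slower store."""

    def __init__(self, backing):
        self.backing = backing  # dict standing in for the slow hard drive
        self.cache = {}         # dict standing in for the volatile RAM cache
        self.hits = 0
        self.misses = 0

    def write(self, block, data):
        # Write-through: the cache and the backing store are both updated
        # before the write is considered complete.
        self.cache[block] = data
        self.backing[block] = data

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            return self.cache[block]
        # Read-through: fetch from the slow device, then populate the cache.
        self.misses += 1
        data = self.backing[block]
        self.cache[block] = data
        return data

    def lose_power(self):
        # The RAM cache is volatile; the backing store is unaffected.
        self.cache.clear()


disk = {0: b"boot"}
c = WriteThroughCache(disk)
c.write(1, b"data")
first = c.read(0)    # miss: served from the disk, then cached
second = c.read(0)   # hit: served from the cache
c.lose_power()
after = c.read(1)    # still readable, because the write reached the disk
```

Because every write lands on the backing store immediately, losing the cache costs only performance, never data. That is the property that makes a volatile RAM cache safe in this mode.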
Once the package, which consists of two kernel modules and an administration utility, is built and installed, you need to insert the modules by typing the following on the command line:
$ sudo modprobe rxdsk
$ sudo modprobe rxcache
Let's assume that you're running on a computer that contains 4GB of RAM, and you confidently can say that at least 1GB of that RAM is never used by the operating system and its applications. Using RapidDisk to create a RAM drive of 1GB in size, you would type:
$ sudo rxadm --attach 1024
Remember, RapidDisk will not pre-allocate this storage. It will allocate RAM only as it is used.
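The idea of allocating memory only on first use can be sketched in a few lines of Python. This is an illustrative model rather than how the rxdsk module is implemented: the LazyRamDisk class, its method names and the 4096-byte page size are all assumptions.

```python
PAGE_SIZE = 4096  # assumed page size for this illustration


class LazyRamDisk:
    """Toy RAM disk that allocates a page only when it is first written."""

    def __init__(self, size_bytes):
        self.size = size_bytes
        self.pages = {}  # page index -> bytearray, created on first write

    def write(self, offset, data):
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside device")
        pos = 0
        while pos < len(data):
            index, start = divmod(offset + pos, PAGE_SIZE)
            chunk = data[pos:pos + PAGE_SIZE - start]
            page = self.pages.setdefault(index, bytearray(PAGE_SIZE))
            page[start:start + len(chunk)] = chunk
            pos += len(chunk)

    def read(self, offset, length):
        if offset < 0 or offset + length > self.size:
            raise ValueError("read outside device")
        out = bytearray()
        pos = 0
        while pos < length:
            index, start = divmod(offset + pos, PAGE_SIZE)
            n = min(PAGE_SIZE - start, length - pos)
            page = self.pages.get(index)
            # Pages that were never written read back as zeros.
            out += page[start:start + n] if page is not None else bytes(n)
            pos += n
        return bytes(out)

    def allocated_bytes(self):
        return len(self.pages) * PAGE_SIZE


disk = LazyRamDisk(1024 * 1024 * 1024)  # a 1GB device that uses no pages yet
```

In this model, a freshly created 1GB device reports zero allocated bytes, and a 10-byte write that straddles a page boundary allocates exactly two pages.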
A quick benchmark test of just the RAM drive produces some overwhelmingly fast results with 4KB I/O transfers:
Sequential read: 1.6GB/s
Sequential write: 1.6GB/s
Random read: 1.3GB/s
Random write: 1.1GB/s
It produces the following with 1MB I/O transfers:
Sequential read: 4.9GB/s
Sequential write: 4.3GB/s
Random read: 4.9GB/s
Random write: 4.0GB/s
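To compare the two transfer sizes, it can help to convert throughput into operations per second. The short Python sketch below does that for the sequential read figures above. It assumes that GB means 10^9 bytes and that 4KB and 1MB mean 4,096 and 1,048,576 bytes; the benchmark output does not state which convention was used, so treat the results as approximate.

```python
GB = 10**9        # assumption: decimal gigabytes
KB = 1024         # assumption: binary kilobytes for transfer sizes
MB = 1024 * KB


def iops(throughput_bytes_per_s, transfer_bytes):
    """Convert a throughput figure into I/O operations per second."""
    return throughput_bytes_per_s / transfer_bytes


seq_read_4k = iops(1.6 * GB, 4 * KB)   # about 390,000 operations per second
seq_read_1m = iops(4.9 * GB, 1 * MB)   # about 4,700 operations per second

print(f"4KB sequential read: {seq_read_4k:,.0f} IOPS")
print(f"1MB sequential read: {seq_read_1m:,.0f} IOPS")
```

Under these assumptions, the larger transfers move roughly three times as many bytes per second while issuing far fewer operations, which suggests that per-request overhead dominates at 4KB.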
Impressive, right? To utilize such a speedy RAM drive as a caching node to a slower drive, a mapping must be created, where /dev/rxd0 is the node used to access the RAM drive, and /dev/mapper/rxc0 is the node used to access the mapping of the two drives:
$ sudo rxadm --rxc-map rxd0 /dev/sda3 4
You can get a list of attached devices and mappings by typing:
$ sudo rxadm --list
rxadm 2.9
Copyright 2011-2013 Petros Koutoupis

List of rxdsk device(s):

 RapidDisk Device 1: rxd0
        Size: 1073741824

List of rxcache mapping(s):

 RapidCache Target 1: rxc0
        0 20971519 rxcache conf:
        rxd dev (/dev/rxd0), disk dev (/dev/sda3) mode (WRITETHROUGH)
        capacity(1024M), associativity(512), block size(4K)
        total blocks(262144), cached blocks(0)
 Size Hist: 512:663
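The numbers in this listing can be cross-checked with a little arithmetic, sketched here in Python. Two points are assumptions rather than facts taken from the RapidDisk documentation: that associativity(512) means 512 blocks per cache set, as in a set-associative design, and the variable names themselves. The 512-byte sector unit for the "0 20971519" start and length pair is the standard device-mapper convention.

```python
MIB = 1024 * 1024
SECTOR = 512                         # device-mapper tables count 512-byte sectors

size_bytes = 1024 * MIB              # from "rxadm --attach 1024"
block_size = 4 * 1024                # block size(4K)
total_blocks = size_bytes // block_size

associativity = 512                  # assumed to mean blocks per cache set
cache_sets = total_blocks // associativity

mapped_sectors = 20971519            # length field of the rxc0 table
mapped_bytes = mapped_sectors * SECTOR

print(size_bytes)                    # matches "Size: 1073741824"
print(total_blocks)                  # matches "total blocks(262144)"
print(cache_sets)
print(mapped_bytes / 1024**3)        # just under 10 (GiB) for /dev/sda3
```

In other words, a 1GB cache made of 4K blocks is fronting a partition of roughly 10GB, so at most about a tenth of the partition can be cached at any one time.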
As with the previous device-mapper-based solutions, you even can list detailed information of the mapping by typing:
$ sudo dmsetup table rxc0
0 20971519 rxcache conf:
        rxd dev (/dev/rxd0), disk dev (/dev/sda3) mode (WRITETHROUGH)
        capacity(1024M), associativity(512), block size(4K)
        total blocks(262144), cached blocks(0)
 Size Hist: 512:663
$ sudo dmsetup status rxc0
0 20971519 rxcache stats:
        reads(663), writes(0)
        cache hits(0) replacement(0), write replacement(0)
        read invalidates(0), write invalidates(0)
        uncached reads(663), uncached writes(0)
        disk reads(663), disk writes(0)
        cache reads(0), cache writes(0)
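If you want to track how well the cache is working over time, the status counters are easy to parse. The following Python sketch pulls them into a dictionary and computes a read hit ratio. The parse_stats and hit_ratio helpers are hypothetical names, the status text is copied from the output above, and treating "cache hits" divided by "reads" as the hit ratio is an assumption about what those counters mean.

```python
import re

# Status line as printed by "dmsetup status rxc0" above, joined into one string.
STATUS = (
    "0 20971519 rxcache stats: reads(663), writes(0) "
    "cache hits(0) replacement(0), write replacement(0) "
    "read invalidates(0), write invalidates(0) "
    "uncached reads(663), uncached writes(0) "
    "disk reads(663), disk writes(0) "
    "cache reads(0), cache writes(0)"
)


def parse_stats(line):
    """Return a dict of counter name -> integer value."""
    _, _, counters = line.partition("stats:")
    pairs = re.findall(r"([a-z ]+)\((\d+)\)", counters)
    return {name.strip(): int(value) for name, value in pairs}


def hit_ratio(stats):
    """Fraction of reads served from the cache (assumed interpretation)."""
    reads = stats["reads"]
    return stats["cache hits"] / reads if reads else 0.0


stats = parse_stats(STATUS)
print(stats["reads"], stats["uncached reads"], hit_ratio(stats))
```

On a freshly created mapping, as here, every read is uncached, so the ratio starts at zero and should climb as the working set is read a second time.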
Format the mapping if needed and mount it:
$ sudo mount /dev/mapper/rxc0 /mnt/cache
A benchmark test produces the following results:
Sequential read: 794MB/s
Sequential write: 70MB/s
Random read: 901MB/s
Random write: 2MB/s
Notice that the write performance is not very great, and that's because it is not meant to be. Write-through mode promises only faster read performance of cached data and consistent write performance to the original drive. The read performance, however, shows significant improvement when accessing cached data.
To remove the mapping and detach the RAM drive, type the following:
$ sudo umount /mnt/cache
$ sudo rxadm --rxc-unmap rxc0
$ sudo rxadm --detach rxd0
Other Solutions Worth Mentioning
bcache is relatively new to the hard drive caching scene. It offers the same features and functionality as the previous solutions, with one addition: it can map one or more SSDs as the cache for one or more HDDs, instead of only one volume to one volume. The project's maintainer does, however, tout its superiority over the other solutions when it comes to data access performance from the cache. From what I can tell, bcache differs from the previous solutions in that it does not rely on the device-mapper framework and instead is a standalone module. At the time of this writing, it is set to be integrated into release 3.10 of the Linux kernel tree. Unfortunately, I haven't had the opportunity or the appropriate setup to test bcache. As a result, I haven't been able to dive any deeper into this solution and benchmark its performance.
EnhanceIO is an SSD caching solution produced by STEC, Inc., and hosted on GitHub. It was greatly inspired by the work done by Facebook for FlashCache, and although it's open-source, a commercial version is offered by the company for those seeking additional support. STEC did not simply modify a few lines of code of FlashCache and republish it. Instead, STEC rewrote the write-back caching logic while also improving other areas, such as memory footprint, failure handling and more. As with bcache, I haven't had the opportunity to install and test EnhanceIO.
These solutions are intended to provide users with near-SSD speeds and HDD capacities at a significantly reduced cost. From the data center to your home office, these solutions can be deployed almost anywhere. They also can be tuned to operate more appropriately in their intended environments. Some of them even offer a variety of caching algorithm options, such as Least Recently Used (LRU), Most Recently Used (MRU), hybrids of the two or just a simple first-in first-out (FIFO) caching scheme. The first three options can be expensive regarding performance, as they require tracking which cached data sets have been accessed and how recently in order to determine which to discard. FIFO, however, functions as a circular buffer in which the oldest cached data set will be discarded first. Unlike RapidCache, the SSD-focused modules also preserve metadata of the cache to ensure that any disruptions, including power cycles/outages, don't compromise the integrity of the data.
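The difference in bookkeeping between FIFO and LRU is easy to see in a small Python sketch. These two classes are illustrative toys, not the eviction code of any of the modules discussed here, and their names and interfaces are assumptions.

```python
from collections import OrderedDict, deque


class FifoCache:
    """Evicts the oldest inserted entry; reads need no bookkeeping."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}
        self.order = deque()  # insertion order only

    def get(self, key):
        return self.data.get(key)

    def put(self, key, value):
        if key not in self.data:
            if len(self.data) >= self.capacity:
                oldest = self.order.popleft()
                del self.data[oldest]
            self.order.append(key)
        self.data[key] = value


class LruCache:
    """Evicts the least recently used entry; every hit updates the order."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # extra work on every hit
            return self.data[key]
        return None

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # drop the least recently used
        self.data[key] = value


fifo, lru = FifoCache(2), LruCache(2)
for cache in (fifo, lru):
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")     # touch "a" so that it is the most recently used
    cache.put("c", 3)  # forces one eviction
```

After this sequence, the FIFO cache has discarded "a" because it was inserted first, while the LRU cache has discarded "b" because "a" was touched more recently. The LRU version pays for that smarter choice by updating its ordering on every read.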
FIO Git Repository: http://git.kernel.dk/?p=fio.git;a=summary
Wikipedia Page on Caching Algorithms: http://en.wikipedia.org/wiki/Cache_algorithms
Petros Koutoupis is a full-time Linux kernel, device-driver and application developer for embedded and server platforms. He has been working in the data storage industry for more than six years and enjoys discussing the same technologies.