Advanced Hard Drive Caching Techniques

RapidDisk and RapidCache

Currently at version 2.9, RapidDisk is an advanced Linux RAM disk. It can allocate RAM dynamically as a block device, which you then can use as a standalone disk drive or map as a caching node to a slower local disk drive via RapidCache (which was inspired by FlashCache and uses the device-mapper framework). RapidDisk stores data in RAM, allocating memory pages only as they are needed. RAM is volatile, so if power is removed or the computer is rebooted, everything stored in it is lost. This is why the RapidCache module was designed to handle only read-through/write-through caching: whatever is written to the slower storage device is cached in RapidCache and written immediately to the hard drive. And, if data requested from the hard drive does not already exist in the RapidCache node, RapidCache reads it from the slower device and then caches it in the RapidCache node. This method retains the same write performance as the hard drive, but significantly increases sequential and random read performance for cached data.
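
To make the read-through/write-through behavior concrete, here is a toy model in shell. It is only a sketch and is not how RapidCache is implemented: two temporary directories stand in for the slow disk and the RAM cache, and the function names wt_write and rt_read are hypothetical.

```shell
# Toy model only: two directories stand in for the slow disk and the RAM cache.
DISK=$(mktemp -d)
CACHE=$(mktemp -d)

# Write-through: update the cache and the backing disk in the same operation.
wt_write() {
    block=$1; data=$2
    printf '%s' "$data" > "$CACHE/$block"
    printf '%s' "$data" > "$DISK/$block"
}

# Read-through: serve from the cache on a hit; on a miss, read the disk
# and keep a copy in the cache for next time.
rt_read() {
    block=$1
    if [ -f "$CACHE/$block" ]; then
        echo "hit" >&2
    else
        echo "miss" >&2
        cp "$DISK/$block" "$CACHE/$block"
    fi
    cat "$CACHE/$block"
}
```

Because every write reaches the disk directory before wt_write returns, deleting the cache directory (the equivalent of a power loss) never loses data in this model. It only makes the next reads slower.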

Once the package, which consists of two kernel modules and an administration utility, is built and installed, you need to insert the modules by typing the following on the command line:


$ sudo modprobe rxdsk
$ sudo modprobe rxcache

Let's assume that you're running on a computer that contains 4GB of RAM, and you confidently can say that at least 1GB of that RAM is never used by the operating system and its applications. Using RapidDisk to create a RAM drive of 1GB in size, you would type:


$ sudo rxadm --attach 1024

Remember, RapidDisk will not pre-allocate this storage. It will allocate RAM only as it is used.
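
Before attaching a RAM drive, it can help to check the "at least 1GB is never used" assumption rather than trust it. The following is a minimal sketch, assuming a kernel whose /proc/meminfo reports a MemAvailable line (older kernels may not). The helper name can_spare_mb is hypothetical and is not part of the RapidDisk package.

```shell
# Hypothetical helper: succeeds if the MemAvailable figure in meminfo-style
# text on stdin is at least the requested number of megabytes.
# Exit status: 0 = enough memory, 1 = not enough, 2 = field not found.
can_spare_mb() {
    want_mb=$1
    awk -v want="$want_mb" '
        /^MemAvailable:/ { avail_kb = $2; found = 1 }
        END {
            if (!found) exit 2          # field missing (older kernels)
            exit (avail_kb / 1024 >= want) ? 0 : 1
        }
    '
}
```

One possible use is `can_spare_mb 1024 < /proc/meminfo && sudo rxadm --attach 1024`, which attaches the drive only when the check passes.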

A quick benchmark test of just the RAM drive produces some overwhelmingly fast results with 4KB I/O transfers:

  • Sequential read: 1.6GB/s

  • Sequential write: 1.6GB/s

  • Random read: 1.3GB/s

  • Random write: 1.1GB/s

It produces the following with 1MB I/O transfers:

  • Sequential read: 4.9GB/s

  • Sequential write: 4.3GB/s

  • Random read: 4.9GB/s

  • Random write: 4.0GB/s

Impressive, right? To utilize such a speedy RAM drive as a caching node to a slower drive, a mapping must be created, where /dev/rxd0 is the node used to access the RAM drive, and /dev/mapper/rxc0 is the node used to access the mapping of the two drives:


$ sudo rxadm --rxc-map rxd0 /dev/sda3 4

You can get a list of attached devices and mappings by typing:


$ sudo rxadm --list
rxadm 2.9
Copyright 2011-2013 Petros Koutoupis

List of rxdsk device(s):

 RapidDisk Device 1: rxd0
    Size: 1073741824

List of rxcache mapping(s):

 RapidCache Target 1: rxc0
0 20971519 rxcache conf:
    rxd dev (/dev/rxd0), disk dev (/dev/sda3) mode (WRITETHROUGH)
    capacity(1024M), associativity(512), block size(4K)
    total blocks(262144), cached blocks(0)
 Size Hist: 512:663 
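
The figures in this listing are consistent with one another, which you can confirm with a little shell arithmetic. This is only a sanity check on the numbers shown above; the variable names are illustrative.

```shell
capacity_mb=1024   # capacity(1024M) from the listing
block_kb=4         # block size(4K) from the listing

# Number of 4K cache blocks that fit in 1024M
total_blocks=$(( capacity_mb * 1024 / block_kb ))

# The same capacity expressed in bytes, as reported for rxd0
size_bytes=$(( capacity_mb * 1024 * 1024 ))

echo "total blocks: $total_blocks"
echo "size in bytes: $size_bytes"
```

The first value matches total blocks(262144), and the second matches the Size of 1073741824 reported for rxd0.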

As with the previous device-mapper-based solutions, you even can list detailed information of the mapping by typing:


$ sudo dmsetup table rxc0
0 20971519 rxcache conf:
    rxd dev (/dev/rxd0), disk dev (/dev/sda3) mode (WRITETHROUGH)
    capacity(1024M), associativity(512), block size(4K)
    total blocks(262144), cached blocks(0)
 Size Hist: 512:663 

$ sudo dmsetup status rxc0
0 20971519 rxcache stats: 
    reads(663), writes(0)
    cache hits(0) replacement(0), write replacement(0)
    read invalidates(0), write invalidates(0)
    uncached reads(663), uncached writes(0)
    disk reads(663), disk writes(0)
    cache reads(0), cache writes(0)
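
The counters in the status output can be turned into a cache hit percentage. Here is a rough sketch, assuming the output keeps the layout shown above. The helper name rxc_hit_ratio is hypothetical and is not part of rxadm or dmsetup.

```shell
# Hypothetical helper: reads `dmsetup status` text on stdin and prints
# cache hits as a percentage of total reads, or "n/a" if there were no reads.
rxc_hit_ratio() {
    awk '
        # Total reads: the line that starts with "reads(" after any indentation.
        # This skips "uncached reads(", "disk reads(" and "cache reads(".
        /^[[:space:]]*reads\(/ {
            split($0, a, /[()]/)
            reads = a[2]
        }
        # Cache hits: take the number between "cache hits(" and the next ")".
        /cache hits\(/ {
            s = $0
            sub(/.*cache hits\(/, "", s)
            sub(/\).*/, "", s)
            hits = s
        }
        END {
            if (reads + 0 > 0) printf "%.1f\n", 100 * hits / reads
            else print "n/a"
        }
    '
}
```

Running `sudo dmsetup status rxc0 | rxc_hit_ratio` against the freshly created mapping above would print 0.0, because all 663 reads so far were served by the disk.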

Format the mapping if needed and mount it:


$ sudo mount /dev/mapper/rxc0 /mnt/cache

A benchmark test produces the following results:

  • Sequential read: 794MB/s

  • Sequential write: 70MB/s

  • Random read: 901MB/s

  • Random write: 2MB/s

Notice that the write performance is not great, and that's because it is not meant to be. Write-through mode promises only faster read performance for cached data and write performance consistent with the original drive. The read performance, however, shows significant improvement when accessing cached data.

To remove the mapping and detach the RAM drive, type the following:


$ sudo umount /mnt/cache
$ sudo rxadm --rxc-unmap rxc0
$ sudo rxadm --detach rxd0

Other Solutions Worth Mentioning

bcache:

bcache is relatively new to the hard drive caching scene. It offers all the same features and functionality as the previous solutions, and adds the capability to map one or more SSDs as the cache for one or more HDDs instead of mapping one volume to one volume. The project's maintainer also touts its superiority over the other solutions when it comes to data access performance from the cache. From what I can tell, bcache differs from the previous solutions in that it does not rely on the device-mapper framework and instead is a standalone module. At the time of this writing, it is set to be integrated into release 3.10 of the Linux kernel tree. Unfortunately, I haven't had the opportunity or the appropriate setup to test bcache, so I haven't been able to dive any deeper into this solution or benchmark its performance.

EnhanceIO:

EnhanceIO is an SSD caching solution produced by STEC, Inc., and hosted on GitHub. It was greatly inspired by the work done by Facebook for FlashCache, and although it's open-source, a commercial version is offered by the company for those seeking additional support. STEC did not simply modify a few lines of code of FlashCache and republish it. Instead, STEC rewrote the write-back caching logic while also improving other areas, such as memory footprint, failure handling and more. As with bcache, I haven't had the opportunity to install and test EnhanceIO.

Summary

These solutions are intended to provide users with near-SSD speeds and HDD capacities at a significantly reduced cost. From the data center to your home office, these solutions can be deployed almost anywhere. They also can be tuned to operate more appropriately in their intended environments. Some of them even offer a variety of caching algorithm options, such as Least Recently Used (LRU), Most Recently Used (MRU), hybrids of the two or just a simple first-in first-out (FIFO) caching scheme. The first three options can be expensive in terms of performance, as they require tracking which cached data sets have been accessed and how recently in order to determine which to discard. FIFO, however, functions as a circular buffer in which the oldest cached data set is discarded first. Unlike RapidCache, the SSD-focused modules also preserve metadata of the cache to ensure that any disruptions, including power cycles/outages, don't compromise the integrity of the data.
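
The difference between FIFO and LRU eviction can be seen in a small simulation. This is a sketch for illustration only and is not taken from any of the modules discussed here. The function name simulate is hypothetical, the capacity must be a positive number, and the linear scan for a victim is chosen for brevity, not speed.

```shell
# Hypothetical helper: simulate POLICY CAPACITY BLOCK...
# POLICY is "fifo" or "lru". Prints the number of cache hits for the
# given sequence of block accesses.
simulate() {
    policy=$1; capacity=$2; shift 2
    printf '%s\n' "$@" | awk -v policy="$policy" -v cap="$capacity" '
        {
            blk = $1
            if (blk in stamp) {
                hits++
                # LRU refreshes the timestamp on every hit; FIFO keeps
                # the original insertion time.
                if (policy == "lru") stamp[blk] = NR
                next
            }
            if (count == cap) {
                # Cache is full: evict the block with the oldest stamp.
                victim = ""
                for (b in stamp)
                    if (victim == "" || stamp[b] < stamp[victim]) victim = b
                delete stamp[victim]
                count--
            }
            stamp[blk] = NR
            count++
        }
        END { print hits + 0 }
    '
}
```

For the access sequence A B C A D A with room for three blocks, `simulate fifo 3 A B C A D A` prints 1 and `simulate lru 3 A B C A D A` prints 2. FIFO evicts A when D arrives because A was inserted first, whereas LRU evicts B because A was touched more recently. That extra bookkeeping on every hit is the tracking cost mentioned above.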

Resources

dm-cache: http://visa.cs.fiu.edu/tiki/dm-cache

FlashCache: https://github.com/facebook/flashcache

EnhanceIO: https://github.com/stec-inc/EnhanceIO

bcache: http://bcache.evilpiepirate.org

RapidDisk: http://www.rapiddisk.org

FIO Git Repository: http://git.kernel.dk/?p=fio.git;a=summary

Wikipedia Page on Caching Algorithms: http://en.wikipedia.org/wiki/Cache_algorithms

______________________

Petros Koutoupis is a full-time Linux kernel, device-driver and application developer for embedded and server platforms. He has been working in the data storage industry for more than six years and enjoys discussing the same technologies.

Comments

Rapid[Disk,Cache] better than native ram caching?

Bucky's picture

The kernel has been doing ram caching for a loooong time. It would take a lot to convince me that the ram caching these guys have come up with is somehow better, ESPECIALLY when you consider all the other RAM needs of the system. If it's hoarding memory while other processes have to do without or dip into swap then that's not a good thing.

The SSD stuff looks cool, though. I'll have to play with this soon.

The kernel doesn't really

Anonymous's picture

The kernel doesn't really cache block level data just file level within VFS and even that is limited. Besides, when writing to a physical device, it is written like Direct I/O. With RapidDisk and in turn RapidCache mapping a RAM drive to your slower disk, you can allocate multiple gigabytes and even terabytes of data to slower volumes. The OS does need memory but nowadays, we are packing more than the typical OS uses. What are you going to do with all 16 GBytes on your laptop? Or 512 GBytes to 1 Terabyte on your servers?

Correction (Page 3)

Petros Koutoupis's picture

There is a typo, one I wish I would have caught before submitting; so it is my fault alone. On page 3, following the example of inserting the rxdsk module, the next line is not supposed to read the removal of the module, but instead the insertion of the rxcache module. So the following:

$ sudo modprobe rxdsk
$ sudo modprobe -r rxdsk

Should read:

$ sudo modprobe rxdsk
$ sudo modprobe rxcache
