Although protecting the data in the cache is always important, cache safety is especially important when caching writes. A write cache acknowledges the write before it is safely committed to hard disk. In write I/O-intensive environments, the cache always has data that has been acknowledged to the application but is not safely written to hard disk. If the cache storage area fails, this data could be lost and corruption might ensue.
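To make the risk concrete, here is a minimal sketch of write-back behavior in Python. It is an illustration only, not any vendor's implementation: the class `WriteBackCache`, its methods, and the use of a plain dict as the "hard disk" are all assumptions made for the example.

```python
class WriteBackCache:
    """Toy write-back cache: acknowledges writes before they reach disk.

    Hypothetical sketch; `backing_store` is a dict standing in for hard disk.
    """

    def __init__(self, backing_store):
        self.backing_store = backing_store
        self.cache = {}      # block number -> data held in the cache
        self.dirty = set()   # blocks acknowledged but not yet on disk

    def write(self, block, data):
        # The write lands only in the cache, yet the caller is told it succeeded.
        self.cache[block] = data
        self.dirty.add(block)
        return True

    def read(self, block):
        # Serve from the cache when possible, otherwise fall back to disk.
        if block in self.cache:
            return self.cache[block]
        return self.backing_store.get(block)

    def flush(self):
        # Commit every acknowledged-but-unwritten block to disk.
        for block in sorted(self.dirty):
            self.backing_store[block] = self.cache[block]
        self.dirty.clear()

    def fail(self):
        # Simulate losing the cache storage area; report which
        # acknowledged writes never made it to disk.
        lost = sorted(self.dirty)
        self.cache.clear()
        self.dirty.clear()
        return lost


disk = {1: b"old"}
cache = WriteBackCache(disk)
cache.write(1, b"new")        # acknowledged immediately
stale_on_disk = disk[1]       # disk still holds b"old" until flush() runs
```

The window between `write()` returning and `flush()` completing is the exposure: if `fail()` happens in that window, the application believes data is safe that the disk never received.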
The three key circumstances against which you should protect a write cache are power failure, cache device failure and server failure.
1. Power Failure.
Power failure was a bigger concern when caches were mostly made out of server DRAM, but as I discussed in my article "The Need For Server Grade SSDs," most flash devices also use a small amount of DRAM to organize inbound data. If the device loses power, data in that DRAM area could be lost. Flash devices should use capacitors to keep that DRAM powered long enough for the data it holds to be flushed to the flash area of the device before the drive shuts down. The drive also needs enough intelligence to sense a power loss and take this corrective action.
2. Cache Device Failure.
Failure of the entire flash device can be a larger problem, because many caches are built on a single drive to keep costs down. A write-cached environment should provide greater redundancy. For server-side write caching, consider drive mirroring or PCIe card mirroring. Caches built from shared storage systems can leverage the mirroring or RAID built into the storage system.
3. Server Failure.
Server failure normally is not a caching problem, because the server hardware would typically be returned to operation and caching should pick up where it left off. In a virtual server environment, though, it is entirely possible to restart a virtual machine on another host if the primary host fails. If the caching is server side, the write data in the cache would need to be flushed somehow before the VM is restarted elsewhere.
To protect against uncommitted writes held captive in a failed server, you must get those writes outside of the host. This can be done by either mirroring the cache externally to a shared solid state device or by mirroring one server's cache to another server. Although both of these methods do introduce some latency because a network of some sort has to be traversed, they should still provide better write performance than hard disks, and reads would still be serviced from within the local physical host.
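The server-to-server mirroring approach can be sketched as follows. This is a simplified model under stated assumptions, not a description of any product: `CacheNode`, `MirroredWriteCache`, and `recover_from_peer` are hypothetical names, the network hop is reduced to a method call, and shared storage is a plain dict.

```python
class CacheNode:
    """One server's cache device (hypothetical stand-in for a flash cache)."""

    def __init__(self, name):
        self.name = name
        self.dirty = {}     # block -> data not yet written to shared storage
        self.alive = True

    def store(self, block, data):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.dirty[block] = data


class MirroredWriteCache:
    """Acknowledges a write only after the local and peer caches both hold it."""

    def __init__(self, local, peer, shared_disk):
        self.local = local
        self.peer = peer
        self.shared_disk = shared_disk

    def write(self, block, data):
        self.local.store(block, data)
        self.peer.store(block, data)   # the extra network hop adds latency
        return True                    # two copies exist, so it is safe to acknowledge

    def flush(self):
        # Commit dirty blocks to shared storage, then release both copies.
        for block, data in list(self.local.dirty.items()):
            self.shared_disk[block] = data
            del self.local.dirty[block]
            self.peer.dirty.pop(block, None)


def recover_from_peer(peer, shared_disk):
    """After the primary host fails, replay the peer's mirrored writes to disk."""
    recovered = sorted(peer.dirty)
    for block in recovered:
        shared_disk[block] = peer.dirty[block]
    peer.dirty.clear()
    return recovered


shared = {}
host_a, host_b = CacheNode("host-a"), CacheNode("host-b")
cache = MirroredWriteCache(host_a, host_b, shared)
cache.write(10, b"payload")                   # acknowledged, not yet on shared storage
host_a.alive = False                          # primary host fails before flushing
replayed = recover_from_peer(host_b, shared)  # surviving copy is committed
```

In this sketch the VM could be restarted on another host only after `recover_from_peer` finishes, so the shared storage it sees already contains every acknowledged write.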
Flash storage has become reliable enough to sustain the higher write traffic of write-back caching, and the surrounding technology has improved to make sure that the cache is protected in case of a device or server failure. With proper design, writes can be safely cached and both sides of the I/O equation can benefit from memory-based storage.