Holey Copy-on-Write

About

The usual approach for migrating an application to a shared-storage cluster is to leverage distributed shared memory. This is the approach taken by projects such as OCFS2 for file systems and Real Application Clusters for database servers. However, it has the drawback of requiring profound code re-engineering as well as a distributed locking mechanism. Moreover, keeping the shared storage usable when the owner of a lock crashes requires additional distributed recovery mechanisms.

This project proposes an entirely different approach to achieve the same goal. It requires only that a replication mechanism aimed at shared-nothing clusters is already in place, which is true for many servers. In particular, every major database server has multiple built-in or supported replication mechanisms that satisfy this requirement.

At the core of HoleyCoW is a surprisingly simple modification to the common copy-on-write storage management technique: one of the servers (the writer) is allowed to punch through and write directly to the original backing store after running a simple distributed coordination protocol with the other replicas (the copiers). Besides working with unchanged application software, this approach has the following advantages:
  • Copier nodes don't need to invoke fsync on their local logs, except when taking over as writer after a failure.
  • Copier nodes never block waiting for the writer, other than during initialization.
  • There is no constraint on writer throughput, only a configurable impact on latency.
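
The writer/copier split described above can be illustrated with a minimal in-memory sketch. It is only one possible reading of the idea, not the project's actual protocol: the class names (BackingStore, Copier, Writer), the preserve() step standing in for the coordination protocol, and the use of Python dictionaries as block stores are all assumptions made for illustration. In this reading, each copier keeps a private copy-on-write overlay, and before the writer overwrites a block in the shared store, every copier first saves the block's old contents so that its own view does not change underneath it.

```python
class BackingStore:
    """Hypothetical stand-in for the shared backing store (block number -> bytes)."""

    def __init__(self, blocks=None):
        self.blocks = dict(blocks or {})


class Copier:
    """Replica that never writes to the shared store, only to a local overlay."""

    def __init__(self, backing):
        self.backing = backing
        self.overlay = {}  # local copy-on-write area (assumed to be volatile)

    def preserve(self, block_no):
        # Assumed coordination step: keep the old contents of a block
        # before the writer overwrites it in the shared store.
        if block_no not in self.overlay:
            self.overlay[block_no] = self.backing.blocks.get(block_no)

    def write(self, block_no, data):
        # Ordinary copy-on-write: updates applied by the replication
        # mechanism stay in the local overlay.
        self.overlay[block_no] = data

    def read(self, block_no):
        # The overlay shadows the shared store.
        if block_no in self.overlay:
            return self.overlay[block_no]
        return self.backing.blocks.get(block_no)


class Writer:
    """The single replica allowed to punch through to the shared store."""

    def __init__(self, backing, copiers):
        self.backing = backing
        self.copiers = list(copiers)

    def write(self, block_no, data):
        # Coordinate with every copier first, then write in place.
        for copier in self.copiers:
            copier.preserve(block_no)
        self.backing.blocks[block_no] = data

    def read(self, block_no):
        return self.backing.blocks.get(block_no)
```

In this sketch a copier keeps reading the old contents of a block after the writer has overwritten it, until the copier applies the corresponding replicated update to its own overlay. A real implementation would run the coordination step over the network and would have to handle writer failure and takeover, which the sketch leaves out.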

Roadmap

  • A proof-of-concept implementation (done).
  • A research paper detailing the approach and including an experimental evaluation with TPC-W.
  • A prototype implementation as a block device, working with any replicated application.