NetVault FastRecover's Real-Time Event Journaling for Data Consistency in Recovery

As Product Manager, I often field questions about NetVault FastRecover and its ability to ensure data consistency. While several components work together to provide this functionality (e.g., Real/True CDP, 1st and 2nd Stage Delta Reduction, and the Filter Level Driver), the primary element is Real-Time Event Journaling. In the following paragraphs I will describe Real-Time Event Journaling and how it contributes to FastRecover's ability to provide data consistency during a recovery.

Capturing data in real time does not by itself guarantee a successful recovery to any point in time. Suppose, for example, that an application crashes after writing only a partial record to a file. A partial update may cause the application to treat the file as corrupted and unusable. Applications that use multiple files simultaneously, such as databases, may have built-in crash recovery mechanisms, but these mechanisms require data to be written in a specific order. The data that is backed up must therefore preserve the write order for crash recovery to work correctly. When it’s time for a recovery, if the data being restored is not in a consistent state, recoverability is not guaranteed. A sound recovery solution should not rely on crash recovery. It must be able to reconstruct data from a point in time with guaranteed consistency so that the associated application(s) can access and modify the data. To guarantee recovery, key events must be captured along with the data as they happen.
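To make the write-order point concrete, here is a minimal sketch in Python. It is not FastRecover code: the `Write` record and the `replay` function are hypothetical names invented for illustration, and the assumption is simply that each captured write carries a sequence number reflecting the order in which the application issued it.

```python
from typing import Dict, Iterable, NamedTuple


class Write(NamedTuple):
    """One captured write (hypothetical record layout, for illustration only)."""
    sequence: int   # order in which the application issued the write
    path: str       # file the write targeted
    offset: int     # byte offset within that file
    data: bytes     # bytes written


def replay(writes: Iterable[Write], up_to_sequence: int) -> Dict[str, bytes]:
    """Rebuild file contents by applying writes in their original order,
    stopping after the write numbered up_to_sequence."""
    files: Dict[str, bytearray] = {}
    for w in sorted(writes, key=lambda w: w.sequence):
        if w.sequence > up_to_sequence:
            break
        buf = files.setdefault(w.path, bytearray())
        end = w.offset + len(w.data)
        if len(buf) < end:
            # Grow the file with zero bytes so the write fits.
            buf.extend(b"\x00" * (end - len(buf)))
        buf[w.offset:end] = w.data
    return {path: bytes(buf) for path, buf in files.items()}
```

Because the writes are replayed in the order they were issued, stopping at any sequence number yields a state the application itself passed through, which is the kind of state a crash recovery mechanism is designed to handle. Replaying the same writes in a different order could produce a data file that is ahead of its log, a state the application never produced.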

NetVault FastRecover is designed to be data aware and application aware. It not only captures the changed data and file system events, it also recognizes the applications’ consistency and system events. FastRecover tracks file system events such as OPEN, WRITE, FLUSH, CLOSE, MOVE, and DELETE. In a general purpose file system, events such as CLOSE, MOVE, and DELETE indicate a strong consistency state for the associated file. FLUSH indicates a weak consistency state in some cases. For applications that use multiple files, like databases, FastRecover is aware of the file types (binary, control, and log files). It captures database CHECKPOINT events by recognizing the specific checkpoint I/O sequence, and it can also force a CHECKPOINT event to occur when necessary. What’s more, FastRecover recognizes log file updates and units of TRANSACTION, application STARTUP, application SHUTDOWN, and so on.
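The file system event classification described above can be sketched as follows. This is an illustrative Python model, not FastRecover's implementation: the `Consistency` enum, the `FileEvent` record, and the `latest_consistent_point` function are hypothetical names, and FLUSH is treated as always weak here even though the text says this holds only in some cases.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Consistency(Enum):
    NONE = 0
    WEAK = 1
    STRONG = 2


# Mapping taken from the description above; the structure itself is an assumption.
EVENT_CONSISTENCY = {
    "OPEN": Consistency.NONE,
    "WRITE": Consistency.NONE,
    "FLUSH": Consistency.WEAK,     # simplification: "weak ... in some cases"
    "CLOSE": Consistency.STRONG,
    "MOVE": Consistency.STRONG,
    "DELETE": Consistency.STRONG,
}


@dataclass(frozen=True)
class FileEvent:
    timestamp: float
    path: str
    kind: str


def latest_consistent_point(
    events: Iterable[FileEvent],
    path: str,
    target_time: float,
    minimum: Consistency = Consistency.STRONG,
) -> Optional[FileEvent]:
    """Return the most recent event for `path` at or before `target_time`
    whose consistency level is at least `minimum`, or None if there is none."""
    best: Optional[FileEvent] = None
    for event in events:
        if event.path != path or event.timestamp > target_time:
            continue
        if EVENT_CONSISTENCY[event.kind].value < minimum.value:
            continue
        if best is None or event.timestamp >= best.timestamp:
            best = event
    return best
```

With events like these recorded, a request to recover a file "as of" some time can be mapped to the nearest earlier point at which the file was known to be consistent, rather than to an arbitrary moment in the middle of a write.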

NetVault FastRecover captures data in real time and synchronizes it with the events and metadata. It then indexes and stores all that information together. Metadata includes user Access Control Lists (ACLs) and attributes of data objects, the date and time of events, and the users who accessed the data. Finally, the data is verified with a checksum to ensure integrity. This streaming, indexing, and storing of real-time information is known as Real-Time Event Journaling. It allows NetVault FastRecover to ensure data consistency during recovery.
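As a rough illustration of the idea, the sketch below stores data, event, and metadata together in one journal entry, indexes entries by file, and verifies each entry with a checksum. It is a simplified model under stated assumptions, not the product's actual format: `JournalEntry`, `EventJournal`, and their fields are hypothetical, and SHA-256 is used only as an example since the checksum algorithm is not specified here.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class JournalEntry:
    sequence: int    # position in the journal (capture order)
    timestamp: float # date and time of the event
    event: str       # e.g. "WRITE", "CLOSE", "CHECKPOINT"
    path: str        # data object the event applies to
    user: str        # user who accessed the data
    data: bytes      # changed data captured with the event (may be empty)
    checksum: str    # hex digest of `data`, used to verify integrity


class EventJournal:
    """Append-only journal that keeps data, events, and metadata together."""

    def __init__(self) -> None:
        self._entries: List[JournalEntry] = []
        self._by_path: Dict[str, List[int]] = {}  # index: path -> sequence numbers

    def append(self, timestamp: float, event: str, path: str,
               user: str, data: bytes = b"") -> JournalEntry:
        entry = JournalEntry(
            sequence=len(self._entries),
            timestamp=timestamp,
            event=event,
            path=path,
            user=user,
            data=data,
            checksum=hashlib.sha256(data).hexdigest(),
        )
        self._entries.append(entry)
        self._by_path.setdefault(path, []).append(entry.sequence)
        return entry

    def entries_for(self, path: str) -> List[JournalEntry]:
        """Return the entries for one file, in capture order."""
        return [self._entries[i] for i in self._by_path.get(path, [])]


def verify(entry: JournalEntry) -> bool:
    """Check that the stored data still matches its recorded checksum."""
    return hashlib.sha256(entry.data).hexdigest() == entry.checksum
```

Keeping the event type and metadata in the same record as the data is what lets a recovery pick a consistent point for a given file and confirm that the bytes being restored are the bytes that were captured.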