The demands placed on modern backup and recovery solutions are challenging. It is not unreasonable for small and medium-sized companies to accumulate hundreds of terabytes of backup data. If backup administrators are not paying attention, they can easily wander into expensive territory.
Besides the initial investment in repository storage, additional expenses include power and cooling, data center footprint, maintenance, and management.
Because of this, deduplication technology was developed specifically to reduce backup repository capacity, helping data centers meet today's backup challenges. Its impact is significant: it reduces spending on repository storage, management, maintenance, power, and cooling. Deduplication allows IT administrators to do more with less.
When evaluating deduplication solutions, a clear understanding of an environment's needs, such as scalability, ingest rates (performance), deduplication abilities (source side, target side, in-line, post-process), and replication abilities, is extremely important to finding the right solution. But when working with customers, we have discovered that one critical piece of information is typically not accounted for: a proper understanding of the deduplication solution's cleaner process. Let me explain why this understanding is important.
A deduplication cleaner's job is to reclaim unused space. It does so by processing data block reference counts and removing data as it ages out. A data block's reference count is updated when new data is ingested or when data expires. Over time, when a data block's reference count reaches zero (no backup references this data anymore), the cleaner removes the unreferenced data, which frees additional space. Because the cleaner can be an IO-intensive process, it can run for long periods when large amounts of data have aged out (or have been manually deleted). Running the cleaner during an active backup or recovery is not recommended, as it introduces contention for the deduplication solution's disk resources, which results in slower backup performance and longer backup windows. For this reason, cleaner processes are scheduled to run outside of backup windows, when the deduplication solution is idle.
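The reference-counting behavior described above can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular product implements its cleaner: the `DedupStore` class, its method names, and the use of SHA-256 fingerprints over fixed chunks are all assumptions made for the example.

```python
import hashlib
from collections import Counter


class DedupStore:
    """Toy deduplicating store (hypothetical; not any vendor's implementation)."""

    def __init__(self):
        self.blocks = {}            # fingerprint -> block bytes (stored once)
        self.refcount = Counter()   # fingerprint -> number of references
        self.backups = {}           # backup name -> list of fingerprints

    def ingest(self, name, chunks):
        """Store a backup; duplicate chunks only raise a reference count."""
        fingerprints = []
        for chunk in chunks:
            fp = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(fp, chunk)
            self.refcount[fp] += 1
            fingerprints.append(fp)
        self.backups[name] = fingerprints

    def expire(self, name):
        """Age out a backup: counts drop, but no space is freed yet."""
        for fp in self.backups.pop(name):
            self.refcount[fp] -= 1

    def clean(self):
        """Cleaner pass: remove blocks whose count is zero, return bytes freed."""
        dead = [fp for fp in self.blocks if self.refcount[fp] <= 0]
        reclaimed = 0
        for fp in dead:
            reclaimed += len(self.blocks.pop(fp))
            del self.refcount[fp]
        return reclaimed


store = DedupStore()
store.ingest("monday", [b"AAAA", b"BBBB"])
store.ingest("tuesday", [b"AAAA", b"CCCC"])   # b"AAAA" is deduplicated
store.expire("monday")                        # b"BBBB" is now unreferenced
blocks_before_clean = len(store.blocks)       # still 3: expiry alone frees nothing
reclaimed = store.clean()                     # only b"BBBB" is removed
```

Note that expiring a backup changes only the counts. Space comes back when the cleaner pass runs, which is why a cleaner that never gets to finish leaves expired data on disk.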
Because some backup environments run 24x7 (or close to it), administrators may not have an idle period long enough for the cleaner to complete. If this occurs, the deduplication solution's storage begins to fill up rapidly, creating a backup logjam. To assist, some vendors offer cleaner strategies that help ensure the cleaner process adequately reclaims space.
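As a rough illustration of the scheduling constraint, the check below decides whether a cleaner is allowed to run at a given time of day, given a list of backup windows. The function names and the window format are hypothetical and do not come from any product; the sketch also assumes windows are expressed as daily start and end times, where a start later than the end means the window crosses midnight.

```python
from datetime import time


def in_window(now, start, end):
    """Return True if `now` falls inside the daily window [start, end)."""
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, e.g. 20:00 to 06:00.
    return now >= start or now < end


def cleaner_may_run(now, backup_windows):
    """The cleaner may run only when no backup window is active."""
    return not any(in_window(now, start, end) for start, end in backup_windows)


# Assumed example: one nightly backup window from 20:00 to 06:00.
windows = [(time(20, 0), time(6, 0))]
midday_ok = cleaner_may_run(time(12, 0), windows)     # outside the window
night_ok = cleaner_may_run(time(23, 30), windows)     # inside the window
```

In an environment whose windows together cover the whole day, `cleaner_may_run` would never return True, which is the situation that calls for a vendor-provided cleaner strategy.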
Thus, when evaluating a deduplication solution, understanding the solution's cleaner process is critical to achieving backup SLAs. An efficient cleaner is important for any backup environment that uses deduplication.
Quest DR Series appliances implement a number of cleaner efficiencies so that the cleaner finishes as quickly as possible. For example:
Understanding and defining a deduplication cleaning schedule is a critical part of choosing the right deduplication solution for a backup environment. Quest DR Series appliances offer a cleaner process that is flexible, configurable, and intelligent. Additional information about the DR Series appliance cleaner process can be found here.