With Season 7 of AMC's hit series "The Walking Dead" slowly advancing toward us, we are now left not with the question of when... but of how, or who? Avid watchers will understand the what or the who to which I'm referring; as for the rest of you, you had better start binge-watching and catch up. In the case of NetVault SmartDisk, however, the question is not how, but when.
By this loosely referenced analogy, I am simply pointing out that, if left unchecked, the data within your SmartDisk instance will grow exponentially, leaving your instance perpetually entrenched in a battle to fight off (expire) the dead (expired) backup indexes.
Deduplication in NVSD is a process that runs continuously, working to minimize the data hordes coming in. However, NVSD is only capable of post-process deduplication, which means the incoming data is deduplicated only after it has been backed up by SmartDisk and the index stored. Think of it like a horde of zombies at the gates: a few trickle through at a time, but they are kept under control. The data is queued up and processed in blocks, and the process keeps churning away until everything in the queue has been deduplicated. In NVSD, this requires a decent amount of processor speed on a fairly beefy system, along with fast disks. It is not uncommon, however, to see sub-standard systems running NVSD, and the outcome is often undesirable.
Even the best systems, though, can face the same issues if the deployment strategy is miscalculated or underestimated. While deduplication in any capacity (pre- or post-process) is an absolutely fantastic technology to build into your storage and data protection strategy, it can come at the cost of a major performance hit, or a data hit, if left unchecked. What this means is that you can be happily backing up terabyte after terabyte of data until one day everything comes to a halt. In the case of SmartDisk, this is where you either a) run out of space, or b) have so much data in your pre-dedupe queue that it becomes a virtual horde: the system can no longer deduplicate fast enough, the horde keeps growing until it is too overwhelming, and eventually the horde has consumed you.
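The runaway-backlog scenario comes down to simple arithmetic: with post-process deduplication, backups land in a queue first and are deduplicated later, so whenever the ingest rate exceeds the dedup throughput, the backlog grows without bound. Here is a minimal back-of-the-envelope sketch of that dynamic; the rates are purely illustrative numbers, not SmartDisk's actual engine or benchmarks.

```python
# Minimal sketch of a post-process dedup backlog (illustrative only,
# not SmartDisk's actual implementation). New backups are queued each
# day; the dedup process chews through as much as its throughput allows.

def backlog_after(days, ingest_tb_per_day, dedup_tb_per_day):
    """Return the pre-dedupe backlog (TB) after `days` of steady load."""
    backlog = 0.0
    for _ in range(days):
        backlog += ingest_tb_per_day               # new backups queued
        backlog -= min(backlog, dedup_tb_per_day)  # dedup drains what it can
    return backlog

# A system that deduplicates faster than it ingests stays healthy:
print(backlog_after(30, ingest_tb_per_day=2.0, dedup_tb_per_day=3.0))  # 0.0

# An undersized system falls 1 TB further behind every single day:
print(backlog_after(30, ingest_tb_per_day=2.0, dedup_tb_per_day=1.0))  # 30.0
```

The lesson is that the failure mode is not gradual: the undersized system looks fine on day one and is a month behind by day thirty, which is exactly the multi-week backlog scenario described above.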
It is a dark analogy, I know, but I have seen it all too often. Misguided and misinformed, undersized and under-specced, customers fall victim to NVSD backlogs so massive that they take several weeks to clear before normal operation can resume, or worse, they must wipe everything out and start over. There are other options, of course, such as adding more storage or adding another instance, but in many cases this could all be avoided with a proper assessment of your data sizes and expected yearly growth.
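That assessment is also just arithmetic. As a sketch of the kind of projection worth doing before deployment (every figure below is an assumption for illustration, not vendor guidance, and real dedup ratios vary widely with data type):

```python
# Illustrative capacity-planning arithmetic (all figures are assumptions,
# not vendor sizing guidance): project front-end data growth and the
# rough post-dedupe footprint for each year of a deployment.

def projected_storage(initial_tb, yearly_growth, years, dedup_ratio):
    """Return (year, front-end TB, approx. post-dedupe TB) per year."""
    rows = []
    data = initial_tb
    for year in range(1, years + 1):
        data *= (1 + yearly_growth)
        rows.append((year, round(data, 1), round(data / dedup_ratio, 1)))
    return rows

# 50 TB of protected data, growing 20% per year, assuming a 10:1 dedup ratio:
for year, front_end, on_disk in projected_storage(50, 0.20, 3, 10):
    print(f"Year {year}: {front_end} TB front-end -> ~{on_disk} TB on disk")
```

Running the numbers like this up front tells you whether one instance with a given disk budget survives three years of growth, or whether you are planning the very horde described above.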
So try to avoid this "Alexandria" of sorts, and be sure to refer to the NetVault SmartDisk Installation/Upgrade Guide and/or the NetVault Backup 10 Server Sizing Guide. Both of these resources will help you plan a strategy that avoids the pitfalls and keeps the data horde under some effective "crowd control."