
Difference in recovery points between core and replication server

I have a core server and a replication server, both running 6.1.1.137.

I started the replication process, but not all recovery points were copied.

 

Here is the list on the main core server:

 

This is my replication server:

 

 

How can I have the same recovery points on both?

  • In Rapid Recovery the source core and target core can have separate retention policies. To keep the same recovery points on both sides, make sure the retention policy is configured identically on both cores. Once both are set identically, the data will fall under the same retention policy and the two sides should be close to identical.

    The reason I say close to identical is that the retention policy is affected by the success of incoming replication and by when the rollup job starts each day. If, for instance, you have replication failures for a couple of machines over a few days, the two cores can drift slightly out of sync because rollup runs on the source but not on the target (rollup only runs for agents that have a new recovery point in the previous 24 hours).

    The other potential cause for disparity in your recovery points is the time that rollup starts. Rollup sets the start point for the retention policy by taking the date/time stamp for the most recent recovery point for that agent and then setting the start point to 12 AM of that same day. This way all recovery points created after 12 AM of that day are not touched by rollup. So if the source and target core run rollup at different times it is possible that the calculation of the retention policy will be different for each agent.
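The start-of-day calculation described above can be sketched in a few lines of Python (a minimal illustration; the dates are made up and the actual Rapid Recovery logic is internal):

```python
from datetime import datetime

def retention_start(latest_recovery_point: datetime) -> datetime:
    """Rollup anchors the retention policy at 12 AM (midnight) of the day
    of the agent's most recent recovery point; recovery points created
    after that moment are left untouched by rollup."""
    return latest_recovery_point.replace(hour=0, minute=0, second=0, microsecond=0)

# If the source core's newest recovery point is 09:30 on March 14 but the
# target's newest (due to replication lag) is 23:50 on March 13, the two
# cores anchor their retention windows a full day apart:
src = retention_start(datetime(2017, 3, 14, 9, 30))   # 2017-03-14 00:00
tgt = retention_start(datetime(2017, 3, 13, 23, 50))  # 2017-03-13 00:00
```

This is why two cores that run rollup at different times (or in different time zones) can compute different retention windows for the same agent.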

    The last thing you need to know is that once rollup has occurred and the recovery point chains differ, there is no way to force replication to make the target core identical to the source. The only thing you can do is delete all the recovery points for an agent on the target core and then re-replicate all of them. I don't suggest doing that unless your retention policies were significantly different and you are missing a lot of data on the target core.
  • Thanks for the reply.

    The retention policies are identical.

    Here are the recovery points for a specific VM.

    On the core:

     

    On the replication core:

     

    Anyway, if the situation is normal then it's fine; I was just wondering why it is different.

     

    Another difference that puzzles me is the repository usage.

     

    On the main core (compression rate is 4%):

     

    On the replication core (compression rate is 55%):

     

     

  • To me, the recovery point difference between the two cores looks like a difference in how the retention policy is applied. Are the two cores in separate time zones?

    The compression rate difference is going to be a function of your dedupe cache sizing. By default RR uses a 1.5 GB dedupe cache. This cache is capable of 100% dedupe for about 500 GB of unique data. After you have reached 500 GB of unique data the dedupe cache is maxed out and you start to get duplicate blocks of data written to the repository. So on the source side the data is deduplicated as it is stored in the repository. When you replicate, that same dedupe process is used again to improve the dedupe even more on the target core. So what this tells me is that your dedupe cache was not large enough to get efficient dedupe on the source core, which allowed replication to save you even more repository space.

    In this situation I would recommend increasing your dedupe cache size on both the source and target cores. We have a KB regarding how to size the dedupe cache properly here - support.quest.com/.../134726. Generally we recommend 1 GB of dedupe cache per 1 TB of protected data. In your situation you only have 1.5 TB of protected data so a 1.5 GB dedupe cache would be our recommendation. But we have real world data proving that you actually need more dedupe cache. So I'd probably set the dedupe cache to 3 or 4 GB on each core. Please note that the increase in dedupe cache directly increases the amount of RAM consumed by the core since the dedupe cache is loaded in RAM.
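The sizing rule of thumb above (roughly 1 GB of dedupe cache per 1 TB of protected data, multiplied by a real-world factor of 2-3x) is easy to capture in a quick helper. This is only a sketch of the guidance in this thread, not an official Quest formula; the safety factor and the 1.5 GB floor (the product default) are assumptions taken from the reply above:

```python
def recommended_dedupe_cache_gb(protected_data_tb: float,
                                safety_factor: float = 2.0) -> float:
    """Rule of thumb from this thread: ~1 GB of dedupe cache per 1 TB of
    protected data, multiplied by a real-world safety factor (2-3x).
    Never size below the 1.5 GB product default."""
    return max(1.5, protected_data_tb * safety_factor)

# 1.5 TB of protected data with a 2x factor -> 3.0 GB,
# matching the "3 or 4 GB" suggestion above.
print(recommended_dedupe_cache_gb(1.5))  # 3.0
```

Remember that whatever value this produces is loaded into RAM on the core, so check available memory before raising the setting.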

    The other thing to note as you increase the dedupe cache is that it doesn't retroactively deduplicate the data already in the repository; it only affects new data coming into the core. So it will probably take a few months of backups, with rollup removing duplicate data, before its effects are truly seen. The other option would be to increase the dedupe cache and then run the repository optimization job, which re-processes all of the data in the repository and, with the increased dedupe cache size, should decrease overall repository usage. However, that job takes a significant amount of time, and while it is running it blocks other jobs, so it may not be practical given your time constraints.
  • Tim, great info!

    I have a question about the dedupe cache. I run into a lot of Cores that don't have much memory available, so I don't like to guess at the sizing. The recommendations are a good guideline, but there are always Cores or data sets that fall outside them, like this Core, it seems. You mention his Core config falls within the guidelines but recommend he double the cache to deal with his data.

    The Core is obviously aware that it has filled the cache up and is sending un-deduplicated data across the wire; does it log this anywhere?

    If it's not logged, can you think of any way that we may be able to find this out? I am thinking of looking at the PrimaryCache\cache.data file directly (size or some other query method).

    Even if your Core has tons of memory, no one wants to over-commit memory to a process that does not need it. And if your cache settings are too low, it has a large impact on traffic across the wire, so I would expect the Core to give us some feedback on what it needs.
  • So I have a question regarding this advice. The total "protected data" amount is ever growing. Should I be increasing my cache to match?

    Is this just a performance issue or am I actually getting less deduplication because my cache is smaller than my total repository?
  • The core does not log when the dedupe cache is full and it starts overwriting values. That would be a very helpful feature to have. I'll talk with our Product Managers about it. However, I can't promise anything especially since the repository storage engine will hopefully be changing in the next major release of the software.

    I do not know of any way to identify when the cache is filled, even by looking at the cache files directly. They are thick-provisioned when the core is created, so the file size never changes, only the content in the file (which is in a proprietary format). So I can't come up with a way to figure out when that occurs.
  • 1. When I say total protected data I mean the total amount of data that you are backing up with your initial base images. So basically add up the used space on all your protected machines and use that for the calculation.

    2. This is a guideline, not a rule. The type of data and your growth rate of data should be factored in as well. The more unique the data, the more dedupe cache you will need. The larger your data growth rate, the more buffer you should probably build in.

    3. The total protected data amount on your core should stabilize once you have reached the end of your retention policy. This is because old data will be deleted while new data is brought in. The only thing you have to account for then is total data growth over time. So when calculating your dedupe cache it may be smart to build in a buffer based on that projected growth.

    4. Dedupe occurs as data comes into the core. We hash the block to get a unique value for it then use that to validate against the dedupe cache to see if we have seen that block of data recently and already have a value for it in the repository. If we do have a value for it in the cache, we dedupe it. If we don't, we write the block as new data to the repo. So the more dedupe cache you have, the more values you can store in it, the more likely you are to get higher dedupe capability.

    5. To reach perfect deduplication you have to have enough cache to store every possible block of data that could be experienced. I'm not gonna do the math here, but suffice it to say that it would take hundreds of GBs of cache in order to get perfect dedupe. That's just not possible.

    6. So if you have a lot of highly unique data (3D video, pictures, CAD files, etc.) your dedupe is going to be poor. If you have lots of easily deduped data (text files, word documents, duplicate OS builds, etc.) you are going to get very good dedupe. It is possible if you have highly dedupable data that a 1.5 GB dedupe cache could be more than enough for your environment to experience near perfect dedupe.

    7. There is a performance hit to increasing your cache too much. The core flushes that cache to disk once per hour. The larger the cache is, the longer that takes. During this flush job, no data processing is done and all jobs are basically on hold. I've seen some cores with 32 GB of dedupe cache take 5 minutes to perform the flush. On a busy core, that can really impact your performance. The other issue is the larger the cache is, the longer the time it will take to look through all the values in the dedupe cache. We've seen transfer rates decrease by as much as 4 MB/s on large cores with dedupe caches in the 30+ GB range. So you have to decide if you want to sacrifice additional storage savings for speed of backups.
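Points 4 and 5 describe a classic fixed-size hash cache. A toy model (hypothetical block handling and an LRU eviction policy chosen purely for illustration; the real engine's internals are proprietary) shows how a too-small cache causes previously seen blocks to be re-written as duplicates:

```python
import hashlib
from collections import OrderedDict

class DedupeCache:
    """Toy inline-dedupe cache: maps a block's hash to its location in the
    repository. When the cache is full, the oldest entry is evicted (LRU
    here, purely for illustration), so a previously seen block can be
    written to the repository again as a duplicate."""

    def __init__(self, max_entries: int):
        self.entries = OrderedDict()   # block hash -> repository offset
        self.max_entries = max_entries
        self.next_offset = 0
        self.blocks_written = 0

    def ingest(self, block: bytes) -> int:
        key = hashlib.sha256(block).hexdigest()
        if key in self.entries:              # seen recently: dedupe hit
            self.entries.move_to_end(key)
            return self.entries[key]
        offset = self.next_offset            # cache miss: write new block
        self.next_offset += len(block)
        self.blocks_written += 1
        self.entries[key] = offset
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the oldest hash
        return offset

cache = DedupeCache(max_entries=2)
for block in [b"aaaa", b"bbbb", b"aaaa", b"cccc", b"bbbb"]:
    cache.ingest(block)
# "bbbb" was evicted before it repeated, so it was written twice:
print(cache.blocks_written)  # 4
```

With a larger cache the repeated "bbbb" block would still be resident and would dedupe instead of being written again, which is the same effect a bigger dedupe cache has on the repository.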

    At the end of the day, the only metric you have to go by to judge your dedupe effectiveness is the compression rate. You can get a really good idea of the efficiency of your dedupe by comparing the compression rate on the source core with the compression rate on the target core, provided that you have all the same data on both.

    I hope I didn't put you to sleep with that explanation...
  • No that's great. Thanks for the info.