
Difference in recovery points between core and replication server

I have a core server and a replication server, both running 6.1.1.137.

I started the replication process, but not all recovery points are being copied.

 

Here is the list of recovery points on the main core server:

 

This is the list on my replication server:

 

 

How can I have the same recovery points on both?

  • 1. When I say total protected data I mean the total amount of data that you are backing up with your initial base images. So basically add up the used space on all your protected machines and use that for the calculation.

    2. This is a guideline, not a rule. The type of data and your growth rate of data should be factored in as well. The more unique the data, the more dedupe cache you will need. The larger your data growth rate, the more buffer you should probably build in.

    3. The total protected data amount on your core should stabilize once you have reached the end of your retention policy, because old data is deleted as new data is brought in. The only thing you have to account for after that is total data growth over time, so when calculating your dedupe cache it may be smart to build in a buffer based on that projected growth (there is a rough sizing sketch along these lines at the end of this reply).

    4. Dedupe occurs as data comes into the core. We hash the block to get a unique value for it, then check that value against the dedupe cache to see whether we have seen that block of data recently and already have it in the repository. If there is a match in the cache, we dedupe the block; if not, we write it to the repo as new data. So the more dedupe cache you have, the more values you can store in it and the more likely you are to get a higher dedupe rate (a minimal code sketch of this flow is included at the end of this reply).

    5. To reach perfect deduplication you would need enough cache to store a value for every possible block of data you could encounter. I'm not going to do the math here, but suffice it to say it would take hundreds of GB of cache to get perfect dedupe. That's just not practical.

    6. So if you have a lot of highly unique data (3D video, pictures, CAD files, etc.), your dedupe is going to be poor. If you have a lot of easily deduped data (text files, Word documents, duplicate OS builds, etc.), you are going to get very good dedupe. If your data dedupes that well, a 1.5 GB dedupe cache could be more than enough for your environment to see near-perfect dedupe.

    7. There is a performance hit to increasing your cache too much. The core flushes the cache to disk once per hour, and the larger the cache is, the longer that takes. During the flush job no data processing is done and all jobs are essentially on hold; I've seen cores with a 32 GB dedupe cache take 5 minutes to perform the flush, which on a busy core can really impact performance. The other issue is that the larger the cache, the longer it takes to look through all the values in it. We've seen transfer rates decrease by as much as 4 MB/s on large cores with dedupe caches in the 30+ GB range. So you have to weigh additional storage savings against the speed of your backups.

    At the end of the day, the only metric you have to go by to judge your dedupe effectiveness is the compression rate. You can get a really good idea of the efficiency of your dedupe by comparing the compression rate on the source core with the compression rate on the target core, provided that you have all the same data on both.

    I hope I didn't put you to sleep with that explanation...
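Here is a rough Python sketch of the sizing arithmetic described in points 1-3. It is an illustration only: the growth rate, the buffer period and especially the cache-GB-per-TB ratio are placeholder assumptions, not published sizing figures, so substitute whatever guideline applies to your Core version.

```python
# Back-of-the-envelope dedupe cache sizing, following points 1-3 above.
# All three tunables below are assumptions for the example, not vendor numbers.

def recommended_dedupe_cache_gb(
    used_space_per_machine_gb: list[float],
    annual_growth_rate: float = 0.20,   # assumed 20% data growth per year
    years_of_buffer: float = 1.0,       # assumed one year of headroom
    cache_gb_per_tb: float = 0.5,       # placeholder ratio of cache to protected data
) -> float:
    """Estimate dedupe cache size from total protected data plus a growth buffer."""
    # 1. Total protected data = sum of the used space on all protected machines.
    total_protected_gb = sum(used_space_per_machine_gb)

    # 3. Build in a buffer for projected growth over the retention window.
    projected_gb = total_protected_gb * (1 + annual_growth_rate) ** years_of_buffer

    # Scale to a cache size using the assumed GB-of-cache-per-TB ratio.
    return (projected_gb / 1024) * cache_gb_per_tb


if __name__ == "__main__":
    # Example: three protected servers with 800 GB, 1.2 TB and 400 GB of used space.
    print(round(recommended_dedupe_cache_gb([800, 1200, 400]), 2), "GB of dedupe cache")
```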
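And a minimal sketch of the inline dedupe flow from point 4. This is not AppAssure/Rapid Recovery internals; the block size, hash algorithm, cache size and eviction policy are all assumptions chosen to keep the example short.

```python
# Illustration of block-level inline dedupe: hash each incoming block,
# check the dedupe cache, and only write blocks we have not seen recently.

import hashlib
from collections import OrderedDict

BLOCK_SIZE = 8 * 1024        # assumed block size for the example
CACHE_ENTRIES = 1_000_000    # how many hashes the dedupe cache can hold

dedupe_cache = OrderedDict()   # block hash -> repository index
repository = []                # stand-in for the repository


def ingest_block(block: bytes) -> int:
    """Return the repository index for this block, writing it only if it is new."""
    digest = hashlib.sha256(block).hexdigest()  # the "unique value" for the block

    if digest in dedupe_cache:
        # Seen recently: reference the existing copy instead of storing it again.
        dedupe_cache.move_to_end(digest)
        return dedupe_cache[digest]

    # Not in the cache: write the block to the repository as new data.
    repository.append(block)
    dedupe_cache[digest] = len(repository) - 1

    # A larger cache holds more hashes and therefore catches more duplicates;
    # here the oldest entry is simply evicted once the cache is full.
    if len(dedupe_cache) > CACHE_ENTRIES:
        dedupe_cache.popitem(last=False)
    return dedupe_cache[digest]


def ingest_image(data: bytes) -> list:
    """Split an incoming image into fixed-size blocks and ingest each one."""
    return [ingest_block(data[i:i + BLOCK_SIZE]) for i in range(0, len(data), BLOCK_SIZE)]
```

The sketch also shows the tradeoff from point 7: raising CACHE_ENTRIES catches more duplicates, but a bigger cache takes longer to flush and to search.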