
Virtual Standby exports to ESXi are slow

Appliance: DL4300

Software Version: Rapid Recovery 6.1.2.115

ESXi Version: 6.0

ESXi Server Spec: Dell R730x, 24 x 15K SAS drives in RAID 6 (Adaptive Read Ahead, Write Back), 64 GB RAM, dual Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz

Problem: Exports from Rapid Recovery to ESXi run below 10 MB/s on a 10 Gb/s network. Many of our standbys are in the TB-per-day range, meaning most exports take approximately 20-30 hours to complete. During this time, backups of that agent do not take place, leaving the server vulnerable.
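
Sanity check on those numbers: at 10 MB/s, 1 TB is 1,000,000 MB / 10 MB/s = 100,000 seconds, or roughly 28 hours, so the 20-30 hour export times line up with the observed throughput.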

 

Troubleshooting steps taken with Support:

 - Completed all software and hardware updates

 - Connected the Core and the ESXi host with a direct 10 Gb/s fibre link

 - Paused all other tasks on the Core and restarted the server

 - Changed the RAID policy to Adaptive Read Ahead and Write Back

 

Performing an export of an agentless backup last night seemed to run in the 100 MB/s+ range; however, after checking the logs this morning, I found it had dropped back down to 18 MB/s after an hour.
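
If anyone wants to chart the drop-off in their own environment, something like the sketch below works once you have pulled (timestamp, cumulative bytes transferred) samples out of the Core logs into a CSV. The file name and two-column layout here are just placeholders, not anything Rapid Recovery produces directly:

    # Compute export throughput between consecutive progress samples.
    # "export_progress.csv" and its layout (ISO timestamp, cumulative bytes)
    # are placeholders for whatever you can extract from the Core logs.
    import csv
    from datetime import datetime

    samples = []
    with open("export_progress.csv") as f:
        for ts, total_bytes in csv.reader(f):
            samples.append((datetime.fromisoformat(ts), int(total_bytes)))

    # MB/s between consecutive samples; a healthy export should hold steady,
    # while the problem described above shows up as a sharp sustained drop.
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        elapsed = (t1 - t0).total_seconds()
        if elapsed > 0:
            print(f"{t1:%H:%M}  {(b1 - b0) / elapsed / 1e6:6.1f} MB/s")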

 

Any help is appreciated; even a comment to say you are experiencing the same issue, along with info about your setup, will assist.

 

James

  • Let me give some more detail on D-34758. This defect was created for a specific customer environment in which the Quest Dev team identified that the data in the repository was highly fragmented, so during the export job many small requests were being sent to the repository. These small requests created a bottleneck on the Core that slowed the export down. To optimize this, the Quest Dev team improved the logic to group those small requests together so that they no longer created a bottleneck in the same way (see the sketch after the list below). The issue was therefore resolved in that one specific customer environment, and the defect was closed after further testing confirmed that the fix had no negative performance consequences in other environments.

    The issue here is that the defect title is very generic and makes it seem like it covers every situation with a fragmented repository. It does not.

    To isolate your specific problem, what we need to do is:
    1) Confirm that the repository is actually highly fragmented. This is not easy to do and requires the Quest Dev team to review; I'll follow up with the team on my side and confirm we have done this properly.
    2) Confirm the bottleneck that is causing the export slowness. If it is the same as D-34758, then the fix we implemented was not as complete as it needed to be. If it is not the same as D-34758, then the Quest Dev team will need to work through other options for optimizing the job. As with any complex issue, this requires patience and time while we find the root cause and then figure out how to work around it.
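
    To make that grouping idea concrete, here is a minimal sketch in Python. This is illustrative only, not the actual Core implementation (which is not public); the (offset, length) request tuples and the 1 MB batch limit are assumptions:

        # Coalesce small adjacent repository reads into larger batched requests.
        # Illustrative only: the request format and the 1 MB limit are
        # assumptions, not Rapid Recovery Core internals.
        MAX_BATCH = 1_000_000  # flush a batch once it reaches roughly 1 MB

        def coalesce(requests):
            """requests: iterable of (offset, length) pairs, sorted by offset.
            Yields merged (offset, length) spans covering the same data."""
            batch_start, batch_len = None, 0
            for offset, length in requests:
                if batch_start is None:
                    batch_start, batch_len = offset, length
                elif offset == batch_start + batch_len and batch_len < MAX_BATCH:
                    batch_len += length            # contiguous: extend the batch
                else:
                    yield batch_start, batch_len   # gap or size limit: flush
                    batch_start, batch_len = offset, length
            if batch_start is not None:
                yield batch_start, batch_len

        # Five scattered 4 KB reads collapse into two larger I/Os.
        reads = [(0, 4096), (4096, 4096), (8192, 4096),
                 (1_048_576, 4096), (1_052_672, 4096)]
        print(list(coalesce(reads)))   # [(0, 12288), (1048576, 8192)]

    On a highly fragmented repository, the win comes from turning thousands of tiny random reads into a much smaller number of larger sequential ones, which is what relieves the pressure on the Core during the export.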