Poor Backup Performance with AppAssure

Has anyone come up with AppAssure performance tweaks they can share?

I frequently have many jobs that run at sub-100K transfer speeds and take tens of hours to finish. During these times the server is using less than 10% of its processor capacity, and the network shows no more than 5% utilization. I am concerned there is something more going on. I talked to support and they have had me make tweaks, but their last response was, "Why don't you try upgrading to 15K SAS drives and see if this helps?" That's not really a solution I want to try without hard numbers to back it up.

I have tested with ATTO and Iometer benchmarks and can get line speed transferring data. I even went as far as testing Unitrends backup software, and with it I can sustain transfer speeds of 50MB+ for backups. It's not making much sense, so I thought I would check with the group and see if anyone else is experiencing this as well.
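For anyone who wants a quick baseline number alongside ATTO or Iometer, here is a rough sketch of a sequential write test. It is not a replacement for those tools; the function name, the 64 MB size, and the 1 MB block size are arbitrary choices for illustration, and the result includes the time to flush to disk.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_mb=64, block_kb=1024):
    """Write total_mb of random data in block_kb chunks and return MB/s.

    The timing includes the final flush and fsync, so the figure reflects
    data reaching the disk rather than just the OS cache.
    """
    block = os.urandom(block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the scratch file
    return total_mb / elapsed


if __name__ == "__main__":
    # Point this at a folder on the repository volume to test that disk.
    rate = sequential_write_mb_per_s(tempfile.gettempdir())
    print(f"{rate:.1f} MB/s")
```

Running it against a folder on the repository volume and comparing the figure with the job transfer rate gives a simple before/after reference point.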

  • I realize this is an old thread, but I wasted a lot of time with support only to discover they know nothing about the registry tweaks that boost performance. Dell knew nothing, and Quest just recycled the Dell KBs. Dell fired the engineers who came up with this program long ago, so no one really knows how it works anymore. The core settings are VERY hardware specific.

    By default AA/RR sets your bytes per sector to 4096. If your physical drives are 512e this is OK, but you can get slightly better I/O by setting it to 512 so it doesn't need to run through the emulation.

    You're probably fine with 4096, but it is one thing to check: HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\Repositories\{$repoidstring}\FileConfigurations\0\Specification\BytesPerSector. This can also be configured during initial repo setup in the GUI.

    Write cache policy: any time you talk to support complaining about speed, they will google and come up with an old Dell KB saying to set the write cache policy to 3. They aren't going to think about what they are doing; they'll just see the article from AppAssure saying do this to boost speed. (Criticism accurate as of 7/11/18, and I'm running Rapid Recovery 6.2.) If you are using Server 2012 or later, set it to 4 so it is synchronous. You want Windows controlling the I/O. This changed my speeds dramatically. HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\Repositories\{$repoidstring}\FileConfigurations\0\Specification\WriteCachingPolicy
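Before changing either value it helps to see what is currently set. The following is a minimal sketch, assuming Python's standard `winreg` module on the Core server. The helper names are made up for illustration, the repository ID must be supplied by you, and the `BytesPerSector` spelling (without a space) is an assumption; `WriteCachingPolicy` is as written above.

```python
import sys

CORE_ROOT = r"SOFTWARE\AppRecovery\Core"  # under HKEY_LOCAL_MACHINE, per the post


def file_spec_key(repo_id, file_index=0):
    """Build the subkey that holds the per-file repository settings."""
    parts = [CORE_ROOT, "Repositories", repo_id,
             "FileConfigurations", str(file_index), "Specification"]
    return "\\".join(parts)


def read_values(subkey, names):
    """Read the named values from HKLM\\<subkey>. Works on Windows only."""
    import winreg  # standard library, available only on Windows
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, subkey) as key:
        return {name: winreg.QueryValueEx(key, name)[0] for name in names}


if __name__ == "__main__" and sys.platform == "win32" and len(sys.argv) > 1:
    # Pass your own repository ID string as the first argument.
    key = file_spec_key(sys.argv[1])
    # "BytesPerSector" is an assumed spelling; "WriteCachingPolicy" is as posted.
    print(read_values(key, ["BytesPerSector", "WriteCachingPolicy"]))
```

This only reads the values, so it is safe to run while the Core service is up. Writing them back is left out on purpose.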

    Next, if you're seeing lag when running rollups or other resource-intensive jobs, there are two fun settings to increase the I/O. Again, this is very hardware specific. If you are running this on 5K drives it's probably not going to do much for you, but if you have 15K or SSD drives, you will like this. No one at Quest or Dell has put effort into tuning the repo in 10 years, so all the default recommendations are from 2008. Under HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\CoreSettings, change max threads and max I/O threads from 500 to 1000. Why they would statically set this is beyond me. Feel free to experiment, but I doubt you will need more than 1000. If you check the logs and see an I/O operation taking too long, this setting will help fix that.

    This one I have zero explanation for other than trial and error. By default max concurrent operations is set to either 32 or 64. The original Dell documentation, which admittedly is kinda worthless, said to tweak this in multiples of 8: 16, 72, 80, etc. Don't ask me why, but 98, even though it isn't a multiple of 8, was my magic number. At 99, 97, or 96, performance jumped off a cliff and landed in Cleveland. HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\Repositories\{$repoidstring}\Specification\MaxconcurrentOperations
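If you go the trial-and-error route, it is easier to keep track with a fixed list of values and a record of how long the same job took at each one. This is only a bookkeeping sketch: the function names are hypothetical, and the timings have to come from your own job runs.

```python
def candidate_values(low, high, step=8, extras=()):
    """List multiples of `step` between low and high, plus any extra values."""
    first = -(-low // step) * step  # round low up to the next multiple of step
    return sorted(set(range(first, high + 1, step)) | set(extras))


def fastest(timings):
    """Given {setting: seconds for the same job}, return the quickest setting."""
    return min(timings, key=timings.get)


if __name__ == "__main__":
    # Multiples of 8 from 16 to 104, plus 98 as an off-grid value to try.
    print(candidate_values(16, 104, extras=(98,)))
```

Timing the same job at each value keeps the comparison fair, since different jobs move very different amounts of data.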

    While you are in that key, you might want to change HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\Repositories\{$repoidstring}\Specification\enabledirectoryautorepair from 0 to 1. In general, I like it when something bad gets fixed as soon as it is detected.

    Last tip: try to make sure your Meta folder is on a different drive than your data folder. If you can put the Meta folder on an SSD, even better.

    Use all these tweaks at your own risk. You need to stop and restart the core service for most of them to take effect. If you break something, just revert to the old setting and restart the core service; there is next to no harm in experimenting. Worst case, the repo doesn't load in the web GUI, you undo your change, and restart the service. The DVM check will clean up any side effects.
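Reverting is simpler if the old settings are saved before anything is touched. Here is a small sketch that wraps the standard Windows `reg export` and `reg import` commands. The helper names and the backup file name are placeholders, and stopping and starting the core service is deliberately left out because the service name is not given here.

```python
import subprocess
import sys

CORE_KEY = r"HKLM\SOFTWARE\AppRecovery\Core"  # key named in the post


def export_command(backup_file, key=CORE_KEY):
    """Command that saves the key and its subkeys to a .reg file (/y overwrites)."""
    return ["reg", "export", key, backup_file, "/y"]


def import_command(backup_file):
    """Command that writes the saved values back into the registry."""
    return ["reg", "import", backup_file]


if __name__ == "__main__" and sys.platform == "win32" and len(sys.argv) > 1:
    # Usage: python backup_core_key.py core-before-tweaks.reg
    # Run from an elevated prompt, before making any changes.
    subprocess.run(export_command(sys.argv[1]), check=True)
```

Note that `reg import` merges values back in; it restores changed values but does not delete values added after the export, which is fine for the edits described here since they only change existing values.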

    Snootchie Boochies

  • Hi Ken:

    I'm the Product Manager for Rapid Recovery, and I also led the engineering team that built AppAssure in the 5.x releases. 

    Thanks for taking the time to write this up, and I'm sorry to hear that you had to go through all of these permutations to optimize the performance of your system.  I shared this post with the development team to get their take on it.  The results you're seeing are surprising given the way some of these parameters actually work.  I'll elaborate on each parameter one by one below, but I also wanted to make a general remark about these parameters.

    AppAssure/RapidRecovery has to be supported across a wide range of hardware and software configs, so over time we've evolved some tuning parameters which don't rise to the level of exposing in a GUI but which need to be configurable for those environments where the default value isn't optimal.  Whenever possible we try to make the right performance choice automatically, though given the complex surface area we support that's not always possible.  I'd caution anyone else trying to optimize performance of their Cores to be careful manipulating some of these parameters unless absolutely necessary or directed by support.  We definitely do not intend for most RR customers to need to engage in trial-and-error registry manipulation in order to get decent performance out of the solution.

    Now, regarding your specific suggestions:

    By default AA/RR sets your bytes per sector to 4096. If your physical drives are 512e this is OK, but you can get slightly better I/O by setting it to 512 so it doesn't need to run through the emulation.

    You're probably fine with 4096, but it is one thing to check: HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\Repositories\{$repoidstring}\FileConfigurations\0\Specification\BytesPerSector. This can also be configured during initial repo setup in the GUI.

    The default for this parameter is actually 512.  We've encountered some drives which need 4096, but most drives with 4K sector sizes have performant enough emulation layers that this doesn't often have a material impact on performance.

    Write cache policy: any time you talk to support complaining about speed, they will google and come up with an old Dell KB saying to set the write cache policy to 3. They aren't going to think about what they are doing; they'll just see the article from AppAssure saying do this to boost speed. (Criticism accurate as of 7/11/18, and I'm running Rapid Recovery 6.2.) If you are using Server 2012 or later, set it to 4 so it is synchronous. You want Windows controlling the I/O. This changed my speeds dramatically. HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\Repositories\{$repoidstring}\FileConfigurations\0\Specification\WriteCachingPolicy

    The write cache policy parameter has a complex history, and depending upon the OS version sometimes changing this parameter can yield some performance benefits.  I'm very surprised that you see a performance improvement using Synchronous.  This removes all asynchrony  from the write path and performs blocking I/O; it doesn't make sense that this would be faster on most modern storage subsystems.  We recommend using a WriteCache policy of "Off" for most workloads because in our performance tests this yields the best overall performance, balancing write and read performance.  Writing to the write cache can sometimes improve the performance of read workloads because sometimes those reads can be served from the OS cache instead of the disk, but in our tests the best value for overall performance is "Off". It's great that you're getting good performance from tuning this parameter to Sync but I wouldn't expect that to generalize to most customers.

    Next, if you're seeing lag when running rollups or other resource-intensive jobs, there are two fun settings to increase the I/O. Again, this is very hardware specific. If you are running this on 5K drives it's probably not going to do much for you, but if you have 15K or SSD drives, you will like this. No one at Quest or Dell has put effort into tuning the repo in 10 years, so all the default recommendations are from 2008. Under HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\CoreSettings, change max threads and max I/O threads from 500 to 1000. Why they would statically set this is beyond me. Feel free to experiment, but I doubt you will need more than 1000. If you check the logs and see an I/O operation taking too long, this setting will help fix that.

    This is another surprise to me.  This parameter determines the maximum number of thread pool threads per processor.  The default value of 500 is itself very very high; if you have configured a Write Cache Policy of "Sync" that might explain why increasing the number of threads would improve your performance, but for the default policy with asynchronous I/O there's no reason that more threads should equate to more performance.  RR performance tends to be bound by I/O speed and not computation workloads, and more threads actually can incur additional compute as the OS must context-switch between more threads at once.  I'd be curious to know more about the environment in which setting this to a non-default value actually made anything measurably faster.

    This one I have zero explanation for other than trial and error. By default max concurrent operations is set to either 32 or 64. The original Dell documentation, which admittedly is kinda worthless, said to tweak this in multiples of 8: 16, 72, 80, etc. Don't ask me why, but 98, even though it isn't a multiple of 8, was my magic number. At 99, 97, or 96, performance jumped off a cliff and landed in Cleveland. HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\Repositories\{$repoidstring}\Specification\MaxconcurrentOperations

    This is another puzzling suggestion.  The repository engine maintains one I/O worker thread per extent; this parameter determines how many outstanding operations that thread will maintain on the disk before additional I/O operations are queued.  In theory, very fast storage might be able to sustain more than 64 such operations, but I have no explanation for why 98 was faster while 99 and 97 were slow.  Neither of these should make any material difference.  Again I'd be curious to know more about the environment in which this was the case.
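To make the idea of a cap on outstanding operations concrete, here is a toy model. It is not the repository engine's code; it only illustrates the general pattern of a worker keeping at most N operations in flight and queuing the rest. The function name is made up, and `asyncio.sleep` stands in for a disk read or write.

```python
import asyncio


async def run_with_limit(durations, max_outstanding):
    """Run simulated I/O operations with at most max_outstanding in flight.

    Returns the peak number of operations that were in flight at once.
    """
    slots = asyncio.Semaphore(max_outstanding)
    in_flight = 0
    peak = 0

    async def one_op(seconds):
        nonlocal in_flight, peak
        async with slots:  # waits here (queued) when every slot is taken
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(seconds)  # stand-in for a disk read or write
            in_flight -= 1

    await asyncio.gather(*(one_op(s) for s in durations))
    return peak


if __name__ == "__main__":
    # 200 short operations with a cap of 64 outstanding at a time.
    print(asyncio.run(run_with_limit([0.01] * 200, 64)))
```

In this model, raising the cap only helps while the simulated device can actually service more operations at once, which matches the point that very fast storage is where a higher value might matter.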

    HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\Repositories\{$repoidstring}\Specification\enabledirectoryautorepair from 0 to 1. In general, I like it when something bad gets fixed as soon as it is detected.

    I strongly caution against setting this parameter to 1.  This is used to control a one-time remediation process that was only necessary on much older versions of the software.  There's nothing to be gained by setting this manually, other than slower repository check times.

    Last tip: try to make sure your Meta folder is on a different drive than your data folder. If you can put the Meta folder on an SSD, even better.

    This is excellent advice.  The I/O patterns on the metadata files are much smaller and more random so they benefit greatly from any storage with lower latency, especially flash storage.


    I really do appreciate you taking the time to write up these recommendations, and I hope having some more details on what these parameters do will take some of the mystery out of what sounds like a frustrating performance tuning process.  I'm sorry to hear you're intending to switch to Veeam, I hope you'll reconsider.

    Thanks,

    Adam
