
Core memory usage in Hyper-V VMs making VMs unresponsive

We run most of our Rapid Recovery Cores in Hyper-V VMs. Since upgrading from AppAssure 5.4 to Rapid Recovery we've had problems with the Core service gradually using more and more memory until it eventually consumes all the available memory resources. At that point the VM becomes unresponsive and eventually crashes. This first started happening with Rapid Recovery 6.0 and has continued with all the subsequent releases. Most of our Core VMs are running Windows Server 2012 R2, a few are on 2012 or 2008 R2, and the problem seems to happen with all of them. We don't see this problem on Cores that run on bare metal, however.

Are there any known issues with Rapid Recovery running on Hyper-V VMs, or workarounds for this problem?

  • At this point in time I can't say that I have noticed this occurring, although I will admit that I tend to test with VMs in VMware more often than not. I will purposely build out some cores in Hyper-V to see if I notice the behavior you have mentioned, as this is news to me.

    If I may ask, though, what kind of resources are we providing these cores? Are they within our sizing specs? (I only ask because if these were VMs with, say, 6 or 10 GB of RAM and a 4 GB dedupe cache, then yes, I'd expect this to happen.)

    support.quest.com/.../185962

    Or are they a little bit of everything? Or perhaps there's just no rhyme or reason to it?

    Either way, I'll spin up some HV cores and attempt just this very thing myself.
  • The cores we have with small data sets and smaller dedupe caches take longer for the memory usage to balloon enough to become a problem. This particular core has 45 GB of RAM and a 7.5 GB dedupe cache. The memory usage is almost 100% right now. To fix it we have to restart the Core service, and the memory usage drops back down to what it should be.


  • Are the VMs using static or dynamic memory?
  • Is the host also showing that memory as used when this occurs? How much uptime are we talking about before this happens?

  • I believe the best first step in troubleshooting this is disabling the write caching policy and setting the allocation policy to sequential:
    HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\Repositories\<RepositoryId>\FileConfigurations\<extentId>\Specification\WriteCachingPolicy = 3
    HKEY_LOCAL_MACHINE\SOFTWARE\AppRecovery\Core\Repositories\<RepositoryId>\Specification\AllocationPolicy = 1

    *RepositoryId = Guid
    ** extentId = [int]0...maxNumberofExtents

    You need to bounce the core service for these settings to take effect.
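    If it's easier to script than to edit the keys by hand, here's a rough sketch using Python's built-in winreg module. This is just a sketch, not an official Quest tool: it assumes the values are DWORDs (check the existing value type on your core and match it), it needs to run elevated from a 64-bit Python so it writes the 64-bit registry view, and the repository GUID and extent index are placeholders you need to fill in.

    import winreg

    REPO_ID = "<RepositoryId>"   # placeholder -- your repository GUID
    EXTENT_ID = 0                # placeholder -- repeat for each extent

    base = r"SOFTWARE\AppRecovery\Core\Repositories\{}".format(REPO_ID)

    # WriteCachingPolicy = 3 on each extent (assumed DWORD)
    extent_key = r"{}\FileConfigurations\{}\Specification".format(base, EXTENT_ID)
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, extent_key, 0, winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, "WriteCachingPolicy", 0, winreg.REG_DWORD, 3)

    # AllocationPolicy = 1 (sequential) on the repository (assumed DWORD)
    repo_key = r"{}\Specification".format(base)
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, repo_key, 0, winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, "AllocationPolicy", 0, winreg.REG_DWORD, 1)

    # Restart the Core service afterwards so the settings take effect.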

    In my experience this would help.

    Please let us know how it works for you.
  • The host always shows the amount of assigned memory as being in use.
    It usually takes between a week and a month for the memory usage to balloon to the point where it's a problem.
  • I've made the changes you suggested to one of the Cores in our test environment, and left a similar core unchanged. I restarted the core service on both to compare how the memory behaves.
  • Making those changes to the registry doesn't seem to have helped much. The core I changed and the one I left alone are both about to run out of memory resources.
  • Try dynamic memory. It may reclaim memory for the buffer in a way that static memory cannot. You can also increase it on the fly and see if usage ever caps out.