Replication to Core in Azure

I would be interested to hear experiences with replicating to Azure and the general performance. We have had, and continue to have, several issues with the speed of replication and virtual standbys.

Issue 1 - Core Service on Azure VM crashing.

Originally we were running a core in Azure using the Rapid Recovery Core Virtual Machine in the Azure Marketplace - running Server 2016 OS.

This had 4 vCPUs and 32GB of RAM, with an Azure general purpose v2 storage account (standard & hot tier).

Every week or so the core service would randomly crash. After back and forth with support and numerous log files, we never really got to the bottom of it. CPU usage on the VM was around 70%, so we decided to increase to 8 vCPUs and 64GB memory. The crashes became less frequent but were still happening, and support still haven't got to the bottom of this issue.

The core versions have also been upgraded to the latest 6.7.

Issue 2 - Replication and Virtual Standby speeds.

While battling with issue 1, the general performance of the core has been slow. Replication is conducted from an on-prem core on Server 2019 OS with 16 CPUs and 64GB memory, over a 200MB internet line. The replication and rollup speeds on the core in Azure fluctuate massively, from 100 KB/s up to 7 MB/s. The resources on the source core are fine; there are no issues with high CPU or memory usage.
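To put the 100 KB/s to 7 MB/s range in context, here is a rough sketch of how much of the line those speeds would use. It assumes the "200MB" line means 200 megabits per second, which is my guess and not something confirmed above. It also ignores protocol overhead and any compression or deduplication on the replication stream. The function names are just for illustration.

```python
# Rough line-utilisation arithmetic for the replication speeds quoted above.
# ASSUMPTION: the "200MB internet line" is 200 megabits per second.

def mbit_to_mbyte_per_s(mbit_per_s: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8


def line_utilisation(observed_mbyte_per_s: float, line_mbit_per_s: float) -> float:
    """Fraction of the theoretical line rate used by the observed transfer speed."""
    return observed_mbyte_per_s / mbit_to_mbyte_per_s(line_mbit_per_s)


LINE_MBIT = 200  # assumed, see note above
for label, speed in (("slowest", 0.1), ("fastest", 7.0)):  # MB/s, from the post
    share = line_utilisation(speed, LINE_MBIT)
    print(f"{label}: {speed} MB/s is about {share:.1%} of the line")
```

If that assumption holds, even the best observed speed is well under the line's theoretical ceiling, which would suggest the internet link itself is not the limiting factor.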

One thing I noticed on the Azure storage account was that the average success E2E latency was 135ms while server latency was 16ms.

So I have started from scratch and built a new core in Azure on Server 2019 with the same spec using Azure Blob Premium storage. E2E latency seemed fine for the first few hours and was tracking server latency within a few ms; however, it is now averaging 88ms while server latency is 5ms. This leads me to believe it's an application issue. Thoughts appreciated.
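For reference, this is the arithmetic behind that conclusion, as a minimal sketch. As I understand the Azure storage metrics, E2E latency includes the time to read the request, send the response and receive the acknowledgement, while server latency covers only processing inside the storage service. The difference is therefore time spent on the network or on the client (the core VM). The labels and function name below are just for illustration.

```python
# Gap between Azure Storage end-to-end latency and server latency.
# The gap is time spent outside the storage service (network plus client).

def client_side_gap_ms(e2e_ms: float, server_ms: float) -> float:
    """Latency not accounted for by the storage service itself."""
    return e2e_ms - server_ms


# (E2E ms, server ms) averages quoted in the post
samples = {
    "original core, standard GPv2": (135, 16),
    "rebuilt core, premium blob": (88, 5),
}

for name, (e2e, server) in samples.items():
    gap = client_side_gap_ms(e2e, server)
    print(f"{name}: {gap} ms outside the storage service ({gap / e2e:.0%} of E2E)")
```

In both builds most of the end-to-end time falls outside the storage service, which is consistent with the delay being on the core VM or the path to it, though on its own it does not prove an application fault.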

  • I'll start by saying that, overall, Azure replication VMs should work; however, I can't speak to their Blob storage. Given how the RR product line works, I would assume (I could be wrong) that a blob storage setup would end up problematic at best. I haven't tried the blob storage option with Azure for RR, and given the amount of I/O chatter and the fact that it is DP, I simply wouldn't. Secondly, this is specifically why partners (such as ourselves) provide turnkey RR cores for Quest customers, to potentially alleviate some of these experiences and headaches. If you're looking to get a hosted Core anyhow, then you might as well lease one that lets you escape these situations of wonder. Not only is it a reality, but most of the time, it's just easy.

    That being said, you mention virtual standby too, so are you trying to replicate to Azure and then build Azure VMs from the RR within Azure? Are you replicating straight over the internet, or through a secure tunnel? Replicating through a tunnel wouldn't be ideal if this is meant as your failover, but it does raise the question. Are you trying to do 1 replication at a time, or 2, 3, 4? Are the virtual standby jobs competing with the replication jobs at the same time?

    I can't help but wonder if this is a job contention and/or resource allocation (storage) issue. Look forward to hearing your reply, cheers.