
Network Performance Issues

After a recent switch issue that caused my exported VMs to fail, I have become aware of what seems to be an RR-only problem with network speed.

While exporting base images (and I assume snapshots) I am experiencing what I would call "bursting". That is to say, a data/pause/data/pause condition that is slowing the entire process down. This can be seen visually in the status window, which alternates between a progress bar and the

//////////////////////////////

bar that you see while it is preparing something to move. The overall transfer times also indicate about half the transfer rate of comparable data moved across the same network, including by other backup software.

I have dropped the firewalls and killed the firewall service. I have added the machines and users to each other at full permissions. This is a dedicated fiber network and Windows 2012 servers with RAID6 arrays.

For reference, it took me 12 hours to move a 1.17 TB base image using RR, about twice as long as similarly sized moves across the same network.
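
To put the numbers above in throughput terms, here is a rough sketch of the arithmetic. It assumes decimal units (1 TB = 10^12 bytes) and takes "about twice as long" to mean the comparison jobs finished in roughly 6 hours; the function name is just for illustration.

```python
def throughput_mb_per_s(size_tb: float, hours: float) -> float:
    """Average throughput in MB/s, assuming decimal units (1 TB = 10**12 bytes)."""
    size_bytes = size_tb * 10**12
    seconds = hours * 3600
    return size_bytes / seconds / 10**6


# 1.17 TB in 12 hours is the figure reported for RR.
rr_rate = throughput_mb_per_s(1.17, 12)

# Assumption: the "similarly sized moves" took about half as long (6 hours).
other_rate = throughput_mb_per_s(1.17, 6)

print(f"RR: {rr_rate:.1f} MB/s, comparison: {other_rate:.1f} MB/s")
```

That works out to roughly 27 MB/s for the RR export against roughly 54 MB/s for the comparison, both averaged over the whole job including any pauses.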

  • What you are describing is correct, and it is a common situation. Regardless of whether you are using VMware or HV, the data transfer rates into both hypervisors are not the speeds that you would expect from standard Windows traffic (also keep in mind that writing directly into a VMware datastore from an outside "non-VMware" entity is not fast; that is something the entire industry has noted over the years). In fact, this is why VMware has the LAN-Free SAN and HotAdd transport methods, which try to 'speed' things up where possible by reducing the number of network hops:

    support.quest.com/.../195634

    Having said that, there are other things that affect your speed within RR that relate to the product specifically, and that you might not face when you do a standard network copy. Those are block size and dedupe. Unlike VMware, which has a block size standard of 1 MB, or Windows, which uses 64k, we use an 8k block size. While this helps with granularity, it does not help performance when you are trying to copy out large amounts of data; it will in fact slow the copy down. I also mention dedupe because the solution is a compressed/deduped solution, so when you do a restore, not only do you have to wait for files to uncompress, the core also has to rehydrate the files from the dedupe cache.
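
As a rough illustration of the block size point, the sketch below counts how many blocks a 1.17 TB image breaks into at each of the block sizes mentioned above. The 8k, 64k, and 1 MB figures come from this post; the image size is the one from the original question (taken as 1.17 x 10^12 bytes). It is only a count of blocks, not a model of how any of these products actually schedules its I/O.

```python
# Block sizes as quoted in the post, in bytes.
BLOCK_SIZES = {
    "RR (8k)": 8 * 1024,
    "Windows (64k)": 64 * 1024,
    "VMware (1 MB)": 1024 * 1024,
}


def blocks_needed(size_bytes: int, block_size: int) -> int:
    """Number of blocks needed to hold size_bytes (ceiling division)."""
    return -(-size_bytes // block_size)


# Assumption: 1.17 TB in decimal units, as in the original question.
image_bytes = 1_170_000_000_000

for name, block_size in BLOCK_SIZES.items():
    count = blocks_needed(image_bytes, block_size)
    print(f"{name}: {count:,} blocks")
```

For the same amount of data, the 8k block size means 8 times as many blocks as 64k and 128 times as many as 1 MB, which is where the per-block overhead adds up.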

    All of which comes down to yes, the restores will take, and are expected to take, longer than if you were to just copy the files via something like Windows. Though this isn't all that uncommon for DP solutions, we may be unique in this situation, as you hit all the factors that slow things down:

    1) writing into a Hypervisor
    2) dedupe
    3) compression
    4) block size

    I see your other posts regarding virtual standby, that is precisely why we recommend that customers have it created ahead of time, as restores take time, usually longer than you'd like them to when you're in dire need of them.
  • Hello Again,

    This makes sense to some degree; however, I am basing my comparison not on simply moving files in Windows Explorer but on other backup products like Symantec and Altaro, both of which use dedupe and compression (and probably custom block sizes as well).

    Symantec is just awful from the navel out in every direction, but Altaro, for example, also writes directly to the hypervisor across the same network and does so at almost twice the rate of RR.

    I know you guys have been beat up for network speed a lot and I'm not piling on, but it does seem like something is up when it comes to network overhead and RR.

    Thanks
  • I looked at Altaro, and I can't find anything publicly facing (yet) that states their block sizes. Symantec, however, still recommends 64k, which I believe is their default, though I could certainly be wrong there. That is one huge factor: if we are comparing block sizes, RR may potentially have to read 2x, 4x, or 8x the number of blocks to restore the same file.

    The only other question I would have is whether we are comparing apples to apples, and whether the transports are the same when you are testing these (I only mention it because I agree that 2x as fast is nothing to ignore). For example, if this was Altaro and they are hypervisor aware and you had it running on a VMware VM, you might be taking advantage of HotAdd, who knows. However, if RR was on a physical machine then you wouldn't have had that option (but you could use the aforementioned LAN-Free SAN option). Without knowing the exact specifics it's hard to do a 1v1 comparison. However, if you were to call in and say "I have 1 TB to restore into a virtual platform" and asked how long it would take, I'd ballpark 10-16 hours depending on the environment.
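
To show how that ballpark lines up with the export in the original question, here is a minimal sketch. It assumes the 10-16 hours per TB estimate scales linearly with size, which is a simplification; the function name and defaults are only for illustration.

```python
def restore_hours(size_tb: float,
                  hours_per_tb_low: float = 10.0,
                  hours_per_tb_high: float = 16.0) -> tuple[float, float]:
    """Estimated (low, high) restore time in hours, assuming linear scaling."""
    return size_tb * hours_per_tb_low, size_tb * hours_per_tb_high


# 1.17 TB is the base image size from the original question.
low, high = restore_hours(1.17)
observed_hours = 12  # export time reported in the original question

print(f"Estimate: {low:.1f} to {high:.1f} hours, observed: {observed_hours} hours")
print("Within ballpark:", low <= observed_hours <= high)
```

Under that assumption, the 12 hours reported for the 1.17 TB export sits near the fast end of the estimated range.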

    Restore speeds are a common theme across nearly every DP product. I'm certainly not disagreeing with you by any means, and that's not to say that they can't be improved upon, or that they won't be. Having said that, you do have support there to assist with taking a look at the environment and making sure that you're getting the best 'bang for your buck', so to speak (LAN-Free SAN, HotAdd, networking). Calling in and logging a case to validate your installation is something that I'd suggest if you are concerned about the backup/restore functionality of the product.

    On that note it's time to test some virtual exports for the other forum post my friend.