
Data Transfer Speeds Between Core and Client

Hi All

I have a dedicated 10Gb link between my core and a protected server. I note the transfer speed is maxing out at around 120MB/s. I have investigated a little and can see that I can amend the transfer rates, but I can only find articles referring to throttling the speed down.

Can I increase settings to improve data transfer rates? If so, any recommendations?

Many Thanks!

  • Hi paulw,

    I will assume that the default of 3 concurrent transfers is still untouched. Therefore you would roughly have 3 machines transferring at 120MBps each, for a total of around 360MBps. You should also review what other jobs are running on the Core, such as virtual standbys, integrity checks, replication, or rollups. All of these read from and write to your repository storage.

    The next step would be to consider the maximum read and write speeds the disks hosting your repository can reach before troubleshooting the transfer question any further. If you have completed benchmarks of reads and writes, please share the results. One tool that I like to use is CrystalDiskMark: crystalmark.info/.../index-e.html
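
    If you would rather script a quick sanity check than run a GUI tool, a rough sketch along these lines can give ballpark sequential numbers for the repository volume (this is only an illustration, not an official utility; the path and file size are placeholders to adjust):

    ```python
    # Rough sequential write/read throughput check for a repository volume.
    # Illustrative sketch only; TEST_FILE and SIZE_MB are placeholders.
    import os, time

    TEST_FILE = r"E:\Repository\throughput_test.bin"  # change to your repository volume
    SIZE_MB = 4096          # use a file larger than the RAID controller cache
    CHUNK = 1024 * 1024     # 1 MiB per write/read

    def bench():
        data = os.urandom(CHUNK)
        # Sequential write
        start = time.time()
        with open(TEST_FILE, "wb") as f:
            for _ in range(SIZE_MB):
                f.write(data)
            f.flush()
            os.fsync(f.fileno())    # ensure the data is on disk, not just in RAM
        write_mbps = SIZE_MB / (time.time() - start)
        # Sequential read
        # Note: this figure may be inflated by the Windows file cache if the file fits in RAM.
        start = time.time()
        with open(TEST_FILE, "rb") as f:
            while f.read(CHUNK):
                pass
        read_mbps = SIZE_MB / (time.time() - start)
        os.remove(TEST_FILE)
        print(f"Sequential write: {write_mbps:.0f} MB/s, read: {read_mbps:.0f} MB/s")

    if __name__ == "__main__":
        bench()
    ```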

    You can also review your dedupe cache sizing and see if it needs to be increased: support.quest.com/.../134726. If more transferred data can be deduplicated, then the transfer rate will increase because less data is being committed to the repository.
  • Hi Derek,
    Thanks for the response. With the default 3 concurrent connections, the transfer rate per machine actually drops to around 40MBps, indicating a maximum of 120MBps in total rather than the 360MBps I would expect to see. I can copy an ISO over the network to the appliance at 950MBps, which further suggests there is some throttling in place within Rapid Recovery. The transfer settings are currently at the defaults of a standard RR installation. If you have any further ideas, that would be much appreciated! I am also experiencing this issue between my core and its replication partner, which again has a dedicated 10Gb fibre link.
    Thanks

  • Hi Paulw:
    I would like to add my "tuppence" to the conversation.

    When sizing Rapid Recovery deployments, a common approach is to focus on the network connection characteristics and give very little importance, if any, to storage performance on both the protected machine and the core. In fact, the most important characteristic of any system transferring huge amounts of data is the available IOPS on the storage systems at both ends of the transfer pipe. For instance, if the source system hosts a SQL Server that is hammered with I/O requests, there may be very little headroom left to transfer the backup data. On the repository side, attachability and mountability checks, data transfers, replication and exports, plus "invisible" background jobs such as deferred deletes, all running at the same time, subtract IOPS from the available pool; it is not uncommon to see 100% disk active time and queue lengths of 5 or more (according to Microsoft, a queue length of 2 already indicates a bottleneck).
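
    A quick way to see that contention for yourself is to watch the disk queue length while backups and background jobs are running. The counter and typeperf ship with Windows; the snippet below is just an illustrative wrapper and assumes an English-language install:

    ```python
    # Sample the average disk queue length on the core while jobs are running.
    # Illustrative wrapper only; typeperf and the counter are standard Windows,
    # but the counter name assumes an English-language installation.
    import subprocess

    COUNTER = r"\PhysicalDisk(_Total)\Avg. Disk Queue Length"

    # 60 samples, one per second; sustained values well above ~2 per disk
    # (Microsoft's rule of thumb) point to a storage bottleneck.
    subprocess.run(["typeperf", COUNTER, "-si", "1", "-sc", "60"], check=True)
    ```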

    Another limitation, inherent to Rapid Recovery, is the fixed block size of 8KB (while Windows typically works with 64KB chunks). This was chosen in order to achieve high deduplication/compression (despite its simplicity, it is one of the best in the business).
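
    To put the 8KB block size into perspective, here is a back-of-the-envelope calculation (my own illustration, not vendor figures) of the IOPS the repository needs to sustain a given transfer rate:

    ```python
    # Back-of-the-envelope IOPS needed to sustain a given transfer rate
    # at 8KB (Rapid Recovery) versus 64KB (typical Windows) block sizes.
    target_mb_per_s = 360          # e.g. three concurrent 120 MB/s transfers

    for block_kb in (8, 64):
        iops = target_mb_per_s * 1024 / block_kb
        print(f"{block_kb}KB blocks: ~{iops:,.0f} IOPS to sustain {target_mb_per_s} MB/s")

    # 8KB blocks : ~46,080 IOPS, far beyond a small spinning-disk array
    # 64KB blocks: ~5,760 IOPS
    ```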

    Another factor diminishing performance is the size of the inline dedupe cache: the larger it is, the more time passes between data being received on the wire and the moment it is committed to storage. Moreover, the dedupe cache is flushed to disk every hour or so, and during this process all repository jobs are suspended. In some cases this operation can take 5, 10, even 20 minutes every hour (for example, a large dedupe cache on slow storage, such as 64GB being flushed to 5400 RPM / 512MB cache RAID 1 disks). Last but not least, data read on the protected machine and committed to the repository is read and written in a random pattern, which considerably diminishes performance compared with sequential operations.
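
    As a rough illustration of why a large cache on slow disks hurts (my own arithmetic, with an assumed flush rate for the RAID 1 pair mentioned above):

    ```python
    # Rough estimate of the hourly dedupe-cache flush time, using the figures above.
    cache_gb = 64            # inline dedupe cache size
    flush_mb_per_s = 60      # assumed sustained rate for a 5400 RPM RAID 1 pair under load

    minutes = cache_gb * 1024 / flush_mb_per_s / 60
    print(f"Flushing a {cache_gb}GB cache at {flush_mb_per_s} MB/s takes ~{minutes:.0f} minutes")
    # ~18 minutes during which repository jobs are suspended
    ```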

    It does not make much sense to compare a Windows file copy with a backup data transfer. First, Windows copy caches data and reports data still sitting in memory as already transferred, whereas a backup operation should report only committed data. Second, Windows deduplicates data (if enabled) post-process, when the system is otherwise idle, while Rapid Recovery does it inline, before committing the data to the repository. Finally, Windows copy is optimized for smaller files and cannot sustain copying large volumes of data for long periods without errors, whereas backup jobs need to be able to perform terabytes of transfers without error.
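
    If you want to validate the 10Gb path itself without the caching effects of a Windows copy, a raw TCP test takes the disks out of the picture entirely. A proper tool such as iperf is the better choice; the sketch below is only a minimal stand-in, and the port, host and duration are placeholders:

    ```python
    # Minimal raw TCP throughput test between core and protected server.
    # A stand-in for a proper tool such as iperf; PORT and host are placeholders.
    import socket, sys, time

    PORT = 5201
    CHUNK = b"\0" * (1024 * 1024)   # 1 MiB send buffer

    def server():
        with socket.create_server(("", PORT)) as srv:
            conn, addr = srv.accept()
            total, start = 0, time.time()
            with conn:
                while True:
                    data = conn.recv(len(CHUNK))
                    if not data:
                        break
                    total += len(data)
            secs = time.time() - start
            print(f"Received {total / 2**20:.0f} MiB at {total / 2**20 / secs:.0f} MB/s from {addr[0]}")

    def client(host, seconds=10):
        with socket.create_connection((host, PORT)) as sock:
            end = time.time() + seconds
            while time.time() < end:
                sock.sendall(CHUNK)

    if __name__ == "__main__":
        # Run "python nettest.py server" on the core, then
        # "python nettest.py client <core-ip>" on the protected machine.
        if sys.argv[1] == "server":
            server()
        else:
            client(sys.argv[2])
    ```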

    So, returning to the question of how to improve performance, the first thing to do is to make sure that the storage systems on the core and the protected agents perform properly (fast disks, SAS rather than SATA, write-back caching and adaptive read-ahead on RAID arrays, etc.). Second, see if jobs can be staggered in such a way that most of the time they run with minimal "competition". Third, on the networking side, set end-to-end jumbo frames, avoid VLANs if possible, and use proven-quality switches. If you have the chance, build a RAID 1 out of 2 SSDs with high sustained write capability and move the dedupe cache location there. For improved reliability I would choose SSDs at least twice the optimal size of the dedupe cache (which won't be difficult, given that the maximum recommended size is around 32GB). Keep in mind that there are two copies of the cache, so plan for double the storage size.
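
    Putting numbers to that sizing advice (my own arithmetic, using the figures above):

    ```python
    # Rough SSD sizing for a dedicated dedupe-cache volume, using the figures above.
    cache_gb = 32        # maximum recommended dedupe cache size
    copies = 2           # the cache is kept in two copies
    headroom = 2         # at least twice the cache footprint for reliability

    ssd_gb = cache_gb * copies * headroom
    print(f"Plan for roughly {ssd_gb}GB of usable space on the SSD RAID 1 pair")
    # ~128GB usable, i.e. two mirrored SSDs of at least that size
    ```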

    Last but not least, depending on the quality of your core storage, you may attempt to increase the agents' transfer settings (e.g. the segment size). The benefits of changing the transfer settings are rather inconsistent, so I won't go into details. Suffice to say that there are better chances of success on iSCSI storage.

    Hope that this helps.