Rapid Recovery

Can we clarify - Replication rate becomes extremely slow if you start a virtual export job while replication job is running

So, the release notes for 6.1 state that replication can become very slow if an export is running at the same time, and that the fix is to schedule the two at different times.

  • Is this only when the replication and the export are for the same machine, or does it apply to any export / replication?

Scheduling is actually quite difficult, and perhaps some improvement should be included in the logic for RR, such as

  • If the current replication job includes multiple RPs, don't start an export until the whole replication job has completed. At present it starts the export as soon as a new RP has been replicated, which slows the remainder of the replication, and everything else has to sit in a queue
  • If we suspend exports during the replication then the export doesn't join the queue
    • Could potentially watch for event log entries so that an export is forced when replication finishes. However, forced exports always join the top of the queue, and as such some systems may take forever to export
    • We have systems set up so they only export once a day (well, used to), and the fix for this would complicate matters.


Guess we need

  • when an export starts, suspend replication for that machine
    • if it is currently replicating, this cancels that job at the point it had reached
  • when the export finishes, suspend exports and re-enable replication
  • when a replication job has completed, re-enable exports for that machine and force an export (though I guess for me I could just leave that option out)
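
The three rules above can be sketched as a small event handler. This is only an illustration of the proposed logic: `MachinePolicy`, `handle_event`, the event names and the action strings are all hypothetical and are not part of any Rapid Recovery API. A real implementation would have to map them onto whatever suspend/resume controls the Core actually exposes.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MachinePolicy:
    """Hypothetical per-machine flags (not a Rapid Recovery object)."""
    replication_enabled: bool = True
    exports_enabled: bool = True


def handle_event(policy: MachinePolicy, event: str, force_export: bool = True) -> list[str]:
    """Update the flags for one machine and return the actions to carry out."""
    actions: list[str] = []
    if event == "export_started":
        # Suspending replication also cancels an in-flight job where it stands.
        policy.replication_enabled = False
        actions.append("suspend_replication")
    elif event == "export_finished":
        policy.exports_enabled = False
        policy.replication_enabled = True
        actions += ["suspend_exports", "resume_replication"]
    elif event == "replication_completed":
        policy.exports_enabled = True
        actions.append("resume_exports")
        if force_export:  # optional, as noted in the last rule
            actions.append("force_export")
    else:
        raise ValueError(f"unknown event: {event}")
    return actions
```

With `force_export=False`, the last rule only re-enables exports and leaves the export to the normal schedule.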


When exports get re-enabled daily across the board,

  • check if a replication job is running for that machine; if it is, re-suspend exports
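
A minimal sketch of that daily check, assuming we can obtain the set of machines that currently have a replication job running. The function name and both inputs are hypothetical; nothing here calls the product.

```python
def daily_export_reenable(machines, replicating):
    """Return {machine: exports_enabled} after the daily re-enable pass.

    machines:    iterable of machine names (hypothetical input)
    replicating: set of machine names with a replication job in progress
    """
    # Exports stay suspended for any machine that is still replicating.
    return {name: name not in replicating for name in machines}
```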


Am I mistaken in this?  I can't see a realistic way to schedule replications at a different time from exports, since how are you supposed to know when one has happened?

  • My rudimentary examination seems to show that if we have any replication going on, then all exports run really slowly.
    If we have any export running, then all replications run slowly.

    Basically, the system is incapable of working as it should.

    Looking at the repository disk performance seems to show latencies of <10ms, and we're talking disk throughput of 10MB/s when both are running, so I'm not sure what it is waiting for. Active time is quite high, though.

    The only answer I can think of is to exclude all replications between 00:00 and 06:00, force all systems to update their virtual standby at 00:00, and hope it completes in time.

    Not that great for those customers that can only really replicate overnight (or at least at high bandwidth, though that is a moot point).
  • That is true. Replication and exports running at the same time result in poor performance.

    Replication jobs consist of many small, light sequential operations, while export jobs issue a few heavy parallel requests to the repository. Both compete for the same queue, and as a result the export job consumes almost all of the repository active time.

    Decreasing the number of parallel streams of an export job may increase replication speed by a factor of 10. Increasing the buffer size of exports (and thus hitting the repository less frequently) could improve the overall performance of both replication and exports.

    However, these operations belong to the category of "do not try this at home", as unexpected results may follow.

    Our Dev team is working on it.

    I suggest that you open a case with us.
  • I have a support case open; in fact I have several. I can't believe that an export should drop to KB/s when I have an inbound replication of 1MB/s. In fact, that replication usually drops to KB/s when there is an export running. The new 6.1.2 repository structure appears to be very inefficient.

    Disk latencies are still low. Hopefully dev get back sooner rather than later.
  • How can we get this secret sauce? It's worthless at present, so it won't be any worse anyway.
  • my ticket for slow exports is 4077105 if that helps
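
For reference, the stopgap mentioned earlier in the thread (no replication between 00:00 and 06:00, with exports forced at 00:00) comes down to a time-window check. The sketch below is generic Python with hypothetical names; it does not configure Rapid Recovery, it only shows the window test, including the case where a window wraps past midnight.

```python
from datetime import time

# Window taken from the thread: exports get 00:00 to 06:00 to themselves.
BLACKOUT_START = time(0, 0)
BLACKOUT_END = time(6, 0)


def replication_allowed(now, start=BLACKOUT_START, end=BLACKOUT_END):
    """Return True if replication may run at wall-clock time `now`."""
    if start <= end:
        in_blackout = start <= now < end
    else:
        # Window wraps past midnight, e.g. 22:00 to 02:00.
        in_blackout = now >= start or now < end
    return not in_blackout
```

The end of the window is treated as exclusive, so replication is allowed again at exactly 06:00.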