Rapid Recovery

SQL Locks

Hello,

We are using Microsoft SQL Server 2014 and Rapid Recovery (RR) 6.1.1.137.

Our database server has four volumes: C (170 GB), D (2.2 TB), E (360 GB), and F (5.6 TB).

Currently all databases use the simple recovery model, and our backup schedule is set to every 2 hours.

Unfortunately, at least once a day we have a period where our application (a custom-built .NET web app) throws a bunch of errors saying it cannot write to the database because of locks. These errors happen only at scheduled backup times, and when they do, the backup typically retries 2-3 times.

This is, of course, causing client impact.

Our database is not particularly busy. I feel like either we are doing something wrong, or Rapid Recovery would never really work for live databases, in which case I'm unsure how they could be selling it.

One suspicion: we do have some fairly long-running transactions in the application. These cannot be changed at this time. Could they be causing the DB locks I've described?

Looking for any tips at all.

  • Take everything with a grain of salt, since I don't know your environment.

    I think there are 2 issues:

    1) VSS Snapshot Fails
    2) Database is locked

    SQL Server has a VSS writer. I would expect this writer to be able to pause DB access for a few milliseconds even if a long job is running. But maybe there is a certain type of SQL job that will not respond, or maybe it can't respond fast enough. I don't know enough about SQL to help with the first possibility, but for the second, make sure the disk can handle the IO of this job plus the snapshots. Maybe try moving the snapshots to another disk. Check whether the logs and the DB are on the same drive, and maybe move one. There are a lot of possible causes and fixes.
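
    One way to check whether the SQL writer is the part that is failing is to run `vssadmin list writers` from an elevated prompt right after a failed backup and look at each writer's state and last error. The sketch below only parses text in that layout; the SAMPLE text is made up for illustration, and the helper names are hypothetical. A writer that is mid-backup can legitimately report a state other than Stable, so treat the result as a hint, not a verdict.

```python
import re

# Illustrative text in the layout printed by `vssadmin list writers`.
# These values are invented for the example, not taken from the poster's server.
SAMPLE = """
Writer name: 'System Writer'
   Writer Id: {e8132975-6f93-4464-a53e-1050253ae220}
   State: [1] Stable
   Last error: No error

Writer name: 'SqlServerWriter'
   Writer Id: {a65faa63-5ea8-4ebc-9dbd-a0c4db26912a}
   State: [8] Failed
   Last error: Non-retryable error
"""


def parse_vss_writers(text):
    """Return one dict per writer with its name, state and last error."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name: '(.+)'$", line)
        if match:
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            current["state"] = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy_writers(writers):
    """Writers that are not Stable or that report a last error."""
    return [
        w for w in writers
        if w["state"] is None
        or "Stable" not in w["state"]
        or w["last_error"] != "No error"
    ]


if __name__ == "__main__":
    for writer in unhealthy_writers(parse_vss_writers(SAMPLE)):
        print(writer["name"], "|", writer["state"], "|", writer["last_error"])
```

    If SqlServerWriter shows up in that list after a retry, that is useful detail to include in a support case.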

    For the DB lock, I have no idea. Maybe ping Microsoft, since both SQL Server and VSS are their products.
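
    Before opening a case, it may help to capture who is blocking whom during a backup window. A minimal sketch, assuming you run the query below against `sys.dm_exec_requests` yourself (through SSMS or a driver such as pyodbc, not shown) and feed the `(session_id, blocking_session_id)` pairs to the hypothetical `head_blockers` helper. Whether the stall during a snapshot shows up as ordinary lock blocking at all is an open question, so an empty result is also informative.

```python
# T-SQL to run on the server while the application is throwing errors.
# sys.dm_exec_requests reports wait_time in milliseconds.
BLOCKING_QUERY = """
SELECT session_id, blocking_session_id, wait_type, wait_time, command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;
"""


def head_blockers(rows):
    """Count how many sessions are stuck behind each head blocker.

    rows: iterable of (session_id, blocking_session_id) pairs, one per
    blocked request. A head blocker is a session at the end of a blocking
    chain, i.e. one that blocks others but is not itself in the blocked set.
    """
    blocked_by = {sid: blocker for sid, blocker in rows}
    counts = {}
    for sid in blocked_by:
        seen = {sid}
        current = blocked_by[sid]
        # Follow the chain until we reach a session that is not blocked.
        # The 'seen' set stops the walk if the chain loops back on itself.
        while current in blocked_by and current not in seen:
            seen.add(current)
            current = blocked_by[current]
        counts[current] = counts.get(current, 0) + 1
    return counts


if __name__ == "__main__":
    # Invented example: 61 waits on 60, which waits on 55; 70 also waits on 55.
    example = [(60, 55), (61, 60), (70, 55), (80, 75)]
    print(head_blockers(example))
```

    If the same head blocker session turns out to be one of the long-running application transactions every time, that points at the application side; if nothing is blocked while the errors occur, the IO freeze during the snapshot is the more likely suspect.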

    Open a case with Quest and see if their people can help. It is not their product that is failing, but since their product calls VSS, they should have a fair amount of knowledge about it. They should also be willing to help, since in the end their backups are failing because a call they are making (to VSS) is failing.