Deferred Delete / Repo space

Does anyone from RR have any insight into the Deferred Delete process and/or managing full repos?

 

1) Deferred Delete

I can't find much info on deferred delete beyond https://support.quest.com/rapid-recovery/kb/198715/understanding-deferred-delete-in-nightly-jobs

- How is it different from the normal Deleting Index RPFS Files jobs? Does it just run right after a delete is performed instead of waiting for the nightly jobs?

- Is there any reason not to enable this (and if not, why is it not the default)?

- If we upgrade to v6 and enable deferred delete (or the normal nightly job), will this clean up v5 repos that were affected by the v5 bugs, such as the ones documented here:

support.quest.com/.../understanding-0-compression-in-an-appassure-rapid-recovery-repository

 

2) Full Repos

I wish you guys would take a look at this entire repo situation. We constantly have sites call us with full repos, and it is a pain to deal with. The Cores are always in this catch-22 where you need space to free up space. We always seem to end up doing something we don't want to do just to get it running again, and each time what we have to do seems different. These types of posts go back years without any change; the forum suggested 2 of them as "solutions" to me when I was creating this one.

Some ideas:

- Have the Repo reserve some space for performing jobs that clear up space, like deletes, rollup, etc. Make it so this space cannot be consumed long term, EVER.

- Have the Core stop backups when the repo reaches a certain fullness, ideally expressed as a % of the total size, so that a working reserve is kept whether the repo is 500GB or 50TB. Make this configurable.

- Have some sort of external space that can be used for jobs like rollup and RP deletes. Even if we had to add a storage location to the repo that was used for this process ONLY, we could add it, delete data, run rollup and then DELETE the storage location from the repo (long shot, probably).
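
To make the first two ideas concrete, here is a rough sketch of the arithmetic in Python. None of this is a Rapid Recovery API: the function names are made up, and the 5% reserve and 90% stop threshold are placeholder values chosen only for illustration.

```python
GB = 1024 ** 3
TB = 1024 ** 4


def reserved_bytes(total_bytes: int, reserve_pct: float) -> int:
    """Space held back for maintenance jobs (deletes, rollup), as a % of repo size."""
    return int(total_bytes * reserve_pct / 100)


def should_pause_backups(total_bytes: int, used_bytes: int, stop_pct: float) -> bool:
    """True once the repo is at or above the configured % full."""
    return used_bytes * 100 >= stop_pct * total_bytes


# Hypothetical settings: keep 5% back for maintenance on repos of two sizes.
for total in (500 * GB, 50 * TB):
    reserve = reserved_bytes(total, reserve_pct=5)
    print(f"{total / GB:,.0f} GB repo -> {reserve / GB:,.0f} GB reserved")
```

Because the reserve is a percentage rather than a fixed size, the same 5% setting works out to 25 GB on a 500GB repo and 2,560 GB on a 50TB one.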

 

Or maybe I am missing something that is already there? If so, I would love a bit of guidance

 

Thanks in advance

  • 1) Deferred delete in the nightly jobs is an option that you can enable. It reserves time for the deleting index RPFS file jobs to run. So if you have a very busy core where jobs are constantly running and blocking the deletes, you can enable this job and specify a duration for it, which lets the deletes run and blocks all other jobs from running during that time period. The goal is to allow the repository deletes to run unhindered by other tasks for a defined amount of time.

    One reason not to enable it is that it blocks other jobs for the duration that you specify. Another reason not to enable it is that it changes the job compatibility logic for deletes. Currently Deleting Index RPFS files jobs run in the background and do not affect other jobs. When you enable this nightly job not only do you get a window at night in which the deletes can run, but outside of that window the delete jobs are moved into the job queue and block other jobs from running. So if you have a single delete job that is large and takes a couple hours to complete, while it is running it will block backups and replications.

    Upgrade from version 5 to version 6 does not fix any issues that may exist in the repository related to defects in version 5 such as the ones that create white space. Once white space exists in the repository there is no way to remove it. You have to archive the data, delete the repository, import the data, and add the agents back to the repository. That is the only solution at this time. I highly recommend upgrading all your systems to Rapid Recovery 6.x since version 5 has reached end of support and all of the repository impacting defects in version 5 have been fixed in version 6.

    2) The core has alerts built in that can be enabled to warn of the repository filling up when certain thresholds are passed. I want to say they are at 80% and 90% full (but don't quote me on that). At this time we have chosen not to put any limits on how full we will allow a repository to get, or to set aside storage space for maintenance tasks only. So our recommendation is to enable the alerts and then, when storage starts to fill up, make sure to manage that space and don't overfill the repository. I'm sure that's not what you want to hear, but that's how the software has been designed as it exists now.

    I have sent your forum post over to our Product Manager to ensure he sees it as I think you have some good ideas here and he is adding these to his list of items to review related to repositories filling up and managing data.

  • Thanks for the reply and info!

    All our Cores are v6 now but many were v5, so they have the space issues created by v5. You know how long it would take to archive, delete and then import data. The performance issues with this have been discussed for years, so this is not really an option in the enterprise. But since RR does not give us any other option, we are stuck: live with RR bugs that may be filling up your repo, or take some indeterminable amount of time to complete this process, all the while no backups are running.

    We have the warnings enabled but they are just warnings and this product sends a TON of warnings, so users get kinda desensitized to it. Not saying it is right, but it is certainly what happens.

    Also with the base image issues, what often seems to happen is they have a data center wide issue (loss of power) so the Core takes full snapshots of every client. So the Core may go from 80% to 100% in one night.

    The Repo configuration just needs a ton of work. I would love to see it more flexible (this was my feedback on v5 when we were trying to beta v6). And while we are on it, a repo should be a ton easier to move.
  • "you know how long it would take..." - Yeah, I do know. Seen it done. Helped do it. Painful. Like you said, no other option really...

    I'm a big fan of this report - support.quest.com/.../powershell-unified-at-a-glance-reporting-user-guide-and-script-. Send it once a day and at the top are the repo stats and a single breakdown of all the jobs that occurred. Don't enable any notifications on the core unless you absolutely need them, otherwise it will spam you into oblivion.

    Also, dataprotection.rapidrecovery.com is a great tool for managing cores and watching repo stats, events, alerts, etc. Not sure if you've set it up or not, but I highly recommend you do.

    Base images are a definite problem. I've been complaining about them for years too. Hopefully we'll get some traction on them one of these days...

    If you haven't gone to https://ideas.labs.quest.com/ and submitted feature requests, voted on existing ones, etc., I recommend you do. The RR Product Team is using this tool to garner feedback and prioritize new features. It also allows you to track your ideas and suggestions and get feedback on them.
  • Great discussion and thanks Tim

    A few comments

    This is the first I have heard about ideas.labs.quest (there is no other mention on the forum that I can see). I just took a quick look and honestly it does not seem that useful. It kinda looks like a black hole where ideas go in and are all but forgotten, or something that is directed towards sales vs technical people. The replies don't seem helpful from a technical standpoint either. For example:

    ideas.labs.quest.com/.../RR-I-191

    The user took the time to write up a long and detailed post and the official reply seems pretty vague. What improvements were made, how do they address the problems mentioned, what version has those improvements, and how do I track what improvements are scheduled for what release?

    It also does not look like the RR PMs are very involved. There are 74 ideas and only 1 is planned (selecting Rapid Recovery + planned shows 0 results, so that appears broken), and it is just CSV support, so that was going to be in the product at some point anyhow. And it has been an issue on the forums for a very long time. This came up in the 5.x days and you no longer even support that version.

    I used to chat with the old AppAssure PMs and nothing we talked about ever made it into the product. It's not like this problem is unique to RR, but flashy sales items get added every release, while fixes to make it more supportable and reliable once it is installed are a battle.

    I think a lot of the issues that most of us (you included) would like to see addressed are very deep in the product and would take a HUGE re-write of the entire product, which is not something that comes along often. That is why v6 was so disappointing to me. It was billed as this huge re-write and everyone told me how great it was going to be, but I can't think of a single core issue that was changed. It was all GUI and Sales stuff. I suppose I could include dedup cache; it was a GREAT feature, but how it was implemented still causes a ton of issues.

    As for the report that Tudor created (I think), I agree it is great, but there are several issues with it.

    1) We have a fair number of Cores and I don't directly manage any of them. So getting this report installed on every core is impossible.

    2) Installing this report actually creates more work: the report did not run for some reason, there is a question on the output or it is off, can we change this or that. So now I get to spend my day messing with the report vs having this data in the core product like every other enterprise software.

    3) This is a PowerShell report that RR wants me to install to help manage their product. Yet if there are ANY issues with the report, guess what? It is not supported.

    I don't want to sound negative. RR is a great product and has some amazing features/options, or else we would not be using it. But if you have ever used another enterprise backup product or dealt with RR for any amount of time, you know it also has these gaping long-term supportability/reliability issues. I think my posting history shows I am trying to help both the users and the product, and I hope my post is taken that way vs just being a nuisance.
  • First of all, the old PMs don't work here anymore. So we have a new PM. I have faith you're going to see things change going forward. I know you have heard that like every other month since you started working with us, but honestly, this is the first time I've had hope in a long time. So that should tell you something.

    Second, the new PM is digging through the backlog of ideas and updating it as quickly as he can. It takes time to go through 200-odd items, but he's working at it and you should start seeing updates. Also, I'm not sure how many public changes have been made yet. Public feedback is important and the goal is to get there as quickly as we can.

    Third, the feature request you pointed out about replication defines how replication already works for the most part. It's very similar to what is outlined there. So the reply should have been "we already do this". I am happy to explain how replication works in another post when I have more time if you want to hear it.

    Fourth, the alerts for a repository filling up start at 80%. The notification for a full repository occurs at 98% which is when the repo is basically full. So RR-I-223 is only partially accurate.
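
    As a rough illustration of those two thresholds (this is not the Core's actual code; the function name and labels are made up for this sketch, and only the 80% and 98% figures come from the paragraph above):

```python
WARN_PCT = 80  # alerts for a repository filling up start here
FULL_PCT = 98  # "repository full" notification


def repo_alert_level(used_bytes: int, total_bytes: int) -> str:
    """Classify repository usage against the two thresholds.

    Assumption: each threshold is treated as inclusive (>=).
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    pct = used_bytes * 100 / total_bytes
    if pct >= FULL_PCT:
        return "full"
    if pct >= WARN_PCT:
        return "warning"
    return "ok"
```

    A repo that jumps from 80% to 100% overnight, as in the base image scenario mentioned earlier in the thread, goes from "warning" straight to "full" with very little time to react.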

    I'm providing you with information regarding the tools we are going to be using going forward. This ideas portal is what we are going to use. If you open a support case and ask to add a feature request, support will point you there.
  • PMs have a tough job. Sales drives every organization I have ever seen; support is a cost center. When the choice is between a huge re-write of the repository that will take 100s of dev hours, have a ripple effect through every part of the product, and entice exactly ZERO people to buy the product, or adding support for X OS/Application and a flashy new GUI, we all know what gets picked. None of what we ask for is ever going to be high on the PM's list without some type of feature champion. This is what support used to do at an unnamed large backup software company I used to work with. Support would track its own enhancements and prioritize them. Then Support would assign champions to the enhancements they picked to try and shepherd them through the process. Support has to be responsible for items that are important to support's users, or you end up in this situation where year after year we ask for the most basic function. I would like to be able to shut down the Core with a supported and easy-to-use option. And we still don't have it.

    I would love to see a detailed write up on replication. In fact an ongoing deep dive on certain functions of RR every week/month would be great. I think most of us know replication has some sort of resume function (but the details are basic at best) and we know it can be configured to handle duplicate bases but it seems like we kinda have to piece it together here.

    You mention this is how the product works, but this enhancement request is the 5th most popular (out of 74), so RR users certainly don't know or understand this. I can tell you from my experience that it certainly does not feel like replication handles interruptions or network issues very well, even though I know there has been a resume function and block tracking for years.

    www.quest.com/.../why-does-replication-need-to-restart

    Maybe the idea posts could be a great jumping-off point for PMs to hand the write-ups over to support. If a user requests something that is already in the product, instead of just saying it is already in the product, have a detailed write-up done on how to accomplish what they want. The 5th item in the replication enhancement request is a great example. If I was one of the 14 other people that had this issue and the reply back was that it was already in the product, it would not be as helpful as it could be.

    We are getting WAY off topic, so if you want to remove all of the replies after the first one from Tim that answered this, go ahead.
  • Hi:
    This is Sasha, a member of the new PM group for Rapid Recovery. You’re right, the Aha! Ideation portal has in the past been a black hole, but my boss Adam Nelson has made it a priority within the team to work through the deep backlog of ideas already in the queue, and to respond to new ideas promptly. We only took over the PM office within the last month, so it will take some time for us to identify and work through these areas that were neglected in the past. I want to assure you we are very respectful of the time customers spend writing up ideas and feedback and are committed to showing that respect by responding promptly and honestly to all of these submissions.

    I’m going to bring Adam’s attention to this thread also, he may have more to add.

    Thanks,
    Sasha