Base images and retention policy

How often should a core be taking base images of agents? For some reason I was under the impression that one base image happens, then EVERY snapshot after that is incremental, and they all just get rolled into each other when the time comes. I have machines here with 3 or 4 base images over the last few months. Some seem to line up with my retention policy, and some don't.

Anybody have some insight?

  • If any volume changes by more than 50% between snapshots, Replay will automatically force a base image. This is at least true in 4.7.

  • AppAssure is designed to take ONE base image and incrementals afterwards from that point on; however there are a few conditions that can trigger a new base spontaneously.

    1. Dirty Shutdown. If a transfer is interrupted by a spontaneous, ungraceful shutdown, it will break the backup chain. That transfer will be inconsistent with the rest of the chain, and any backup moving forward will not match up to the baseline. To correct this, AppAssure will automatically trigger a baseline and start a new backup chain.

    2. 50% Data Change. If 50% or more of the blocks on a given volume change, it will automatically trigger a baseline. Once a backup reaches 50% of the total blocks for a volume, it cannot be considered incremental anymore. The most common culprits in triggering this are anti-virus and defrags, as they have a tendency to change the location of mass amounts of blocks.

    3. System Vol. Information Logs. AppAssure writes to a log in the hidden SVI folder that tracks and matches information to the previous backup to keep the chains contiguous. If this folder becomes full or AppAssure cannot read/write to that log, then it assumes the EPOCH chain has been broken and takes a new base to correct it.

    These are the most common ways this behavior can occur, though not the only ones. If you are pretty sure that NONE of these apply to your situation, I would urge you to get a ticket in to support to have them look at it.
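
    The three triggers above can be sketched as a simple per-volume check. This is only an illustration of the logic as described in this thread, not AppAssure's actual code: the names `VolumeState`, `base_image_reasons`, and `CHANGE_THRESHOLD` are made up, and the 50% figure is taken from the posts here rather than from documentation.

```python
from dataclasses import dataclass

# Threshold as quoted in this thread (50% of a volume's blocks).
# Assumption: not a documented constant.
CHANGE_THRESHOLD = 0.5


@dataclass
class VolumeState:
    """Hypothetical summary of one protected volume at snapshot time."""
    name: str
    total_blocks: int
    changed_blocks: int
    dirty_shutdown: bool = False  # transfer cut short by an ungraceful shutdown
    log_usable: bool = True       # tracking log in the SVI folder can be read/written


def base_image_reasons(vol: VolumeState) -> list[str]:
    """Return the reasons (if any) this volume would fall back to a new base."""
    reasons = []
    if vol.dirty_shutdown:
        reasons.append("dirty shutdown")
    if vol.total_blocks > 0 and vol.changed_blocks / vol.total_blocks >= CHANGE_THRESHOLD:
        reasons.append("block change at or above threshold")
    if not vol.log_usable:
        reasons.append("tracking log unusable")
    return reasons


if __name__ == "__main__":
    for vol in (
        VolumeState("C:", total_blocks=1000, changed_blocks=120),
        VolumeState("D:", total_blocks=1000, changed_blocks=640),
    ):
        reasons = base_image_reasons(vol)
        print(vol.name, "new base" if reasons else "incremental", reasons)
```

    An empty list means the next snapshot can stay incremental; anything else means a new base is expected.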

  • Only 1 base image should ever be taken. Any other base images are on demand (forced)

    But we have had agents running multiple base images for a while, and since the logging is so terrible there is no way for support to find out what happened. And since it's rare and intermittent, we just delete the base as it happens and move on.

  • Replying to @MattS-AppAssure's post:

    Why would it trigger a full after a "dirty" shutdown? I would think it would just dump the partial and take another incremental from the start.

    I'm not sure about number two to be honest. I wish there was a way to view this percentage per agent so I could see if particular ones are experiencing high volumes of change for some odd reason. Plus, something like this really throws a wrench into planning storage for backup with the software. I don't think it was mentioned during sales or anything.

    And I assume number three won't be an issue unless the disk is full, right?

    Replying to @scashman's post:

    I wouldn't want to delete them: if any incrementals happened in between the bases, they become useless. Plus, it would screw up the retention that I'm telling my boss we have.

  • Replying to @itcaswg's post:
    @itcaswg said "Why would it trigger a full after a "dirty" shutdown? I would think it would just dump the partial and take another incremental from the start."

    So when you're taking a backup we create a log, which gets replaced each snap, in the System Volume Information folder. This is where we judge what has changed since the last backup and which chain we are linking it to at the core. If the agent was in the middle of a write to this file and a dirty shutdown occurs, then the log file could become corrupt. This would make it impossible for the agent to judge what's changed. So it triggers a baseline to repair the chain.

    @itcaswg said "... I assume number three won't be an issue unless the disk is full, right?"

    Yes and no. Permissions or Group Policy can play into this trigger. Also, when you set up your incremental interval and place all volumes on the same time, all the volumes lock into that time as one job. The SRP partition is relatively small and can fill up. This can trigger the base for the whole machine, as all the volumes are linked through the interval job. Often SRPs don't even have an assigned drive letter, so it's often overlooked, but it too has an SVI folder and is snapped like any other drive.

    Also, food for thought: your retention only applies to incrementals down to your most recent baseline. Anything before that is ignored and must be cleaned up manually.
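
    To make that last point concrete, here is a rough sketch of how you could split a machine's recovery points into "covered by retention" and "needs manual cleanup". It only models the rule as stated above; `RecoveryPoint` and `split_by_latest_base` are hypothetical names, the dates are invented sample data, and how you actually list recovery points from the core is left out.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPoint:
    """Hypothetical record for one recovery point."""
    taken: str  # ISO date string, so plain string sorting is chronological
    kind: str   # "base" or "inc"


def split_by_latest_base(points):
    """Return (manual_cleanup, covered_by_retention).

    Everything from the most recent base onward is treated as covered by
    the retention policy; everything older is left for manual cleanup.
    """
    ordered = sorted(points, key=lambda p: p.taken)
    base_positions = [i for i, p in enumerate(ordered) if p.kind == "base"]
    if not base_positions:
        # No base at all: nothing anchors a chain, so flag everything.
        return ordered, []
    latest = base_positions[-1]
    return ordered[:latest], ordered[latest:]


if __name__ == "__main__":
    # Invented sample data for illustration only.
    points = [
        RecoveryPoint("2013-01-01", "base"),
        RecoveryPoint("2013-01-02", "inc"),
        RecoveryPoint("2013-02-01", "base"),
        RecoveryPoint("2013-02-02", "inc"),
    ]
    manual, covered = split_by_latest_base(points)
    print("manual cleanup:", [p.taken for p in manual])
    print("under retention:", [p.taken for p in covered])
```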

  • Replying to @MattS-AppAssure's post:

    Gah, well that's a mess. Should I just not back up the SRP volume then? That needs to be fixed pretty badly; I shouldn't have to go and manually clean up base images that logically shouldn't be there.

    Also, man, you guys really need to put notices like that last sentence blatantly in the interface somewhere. I have the user guide open in front of me, and it's not mentioned anywhere in there.

  • Replying to @MattS-AppAssure's post:

    If the dirty shutdown does happen and the log file becomes corrupt, does this affect previous recovery points or just that recovery point? I would assume it's just that recovery point, but since you say the system has to create a new base, it sounds like it affects the base and therefore all INCs after that, making the entire chain useless for restore?

    Does the same thing happen if the SVI fills up (all recovery points are corrupt and not able to restore)?

  • Replying to @itcaswg's post:

    It seems that way with the SRP; however, these are possible causes and do not present 100% of the time. Usually cleaning the SRP one time will reset the cycle and it does not present again. Protecting the SRP is necessary because it contains all your boot info.

  • Replying to @scashman's post:

    This does not affect INCs previous to the dirty shutdown. The log inconsistency only affects the one INC. The ability to restore all previous backups persists. The new baseline is triggered to correct the information moving forward.

    Think of it as a train. Each INC is represented by a train car, and the log is the hitch that connects them. If the hitch is missing it does not affect the train moving forward, only the ability to add another car.
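
    The train picture can be written down as a tiny model. This is just a sketch of the idea in this post, assuming a chain is a base followed by incrementals and that only the point with the broken "hitch" is unusable; `restorable_points` is a made-up helper, not anything in the product.

```python
def restorable_points(chain):
    """Return the labels in a chain that can still be restored.

    `chain` is a list of (label, intact) pairs in order, starting with the
    base. Points before the first broken link stay restorable; the broken
    point and anything that would have been hitched after it do not.
    """
    restorable = []
    for label, intact in chain:
        if not intact:
            break  # the hitch is missing: no further cars can be attached
        restorable.append(label)
    return restorable


if __name__ == "__main__":
    # The third incremental was interrupted by a dirty shutdown.
    chain = [("base", True), ("inc1", True), ("inc2", True), ("inc3", False)]
    print(restorable_points(chain))
```

    In this model the next snapshot would start a new chain with its own base rather than attach to `inc2`.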

  • Yea, but you don't junk the train all the way to the locomotive because one car wasn't complete. You chuck the partial car and start building another car starting where the last good car left off.

    It's crazy to me that they have to snap another base image.