Another Shut down a Core post

Is this even under discussion for any version of the product? It has been a huge source of discussion and anger since version 5 came out, and zero progress has been made.

And now in v6, it seems to be getting WORSE.

1) It seems like the changes to the repository in v6 cause the checks that run after a dirty shutdown to take longer (just a guess).

2) The unsupported PowerShell script that Dell provided for v5 does not work for v6, and Support just told me there is no script for v6 and never will be.

3) Long-running jobs. With the introduction of cloud archives, exports, etc., we are seeing massive jobs that run for days (and block other jobs like backups, but that is another issue), meaning scheduling a restart becomes even harder.

I am not even asking for the Core to be able to shut down cleanly during an OS shutdown (something that every other application in the world seems to manage). But how is this not a pair of buttons in the GUI: "Prepare for Shutdown" and "Core has been restarted" (to un-pause jobs)?

Or at the very least, a single supported PowerShell script to perform this basic function?

We have dozens of Cores in various geographic locations and managing simple shutdowns is such a massive problem for us.

  • One of the main issues to be aware of with doing that arises when you have increased the size of the deduplication cache on your core. During core shutdown we must stop all jobs (some jobs are not cancellable, so they must complete before the core shuts down), then we flush the deduplication cache from memory to disk, and only then does the service stop completely. If, say, you configure a 20 GB deduplication cache, the core must write 20 GB of data to disk before the service stops. On a RAID 1 array with 2 x 10K SAS drives, this generally takes somewhere between 3 and 4 minutes, so the core software will need at least that long to stop. Windows generally does not allow such a long timeout and will forcibly terminate the process before its graceful shutdown completes. This in turn can cause corruption in the deduplication cache. So making sure the service stops before shutting down Windows is important.

    Another potential pitfall is when other jobs that modify data in the repository are running and cannot be cancelled at the moment you initiate a shutdown. The core service continues those jobs even as Windows tells it to stop. So when Windows finally reaches its timeout and forcibly terminates the core, it is very possible for corruption to be introduced in the repository, since the core will not know which write was the last one committed to disk.

    These are two of the most common reasons the core does not stop gracefully during a Windows shutdown in Rapid Recovery and I'm sure are just some of the underlying concerns that generated scashman's post here.
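
To put rough numbers on the cache flush described above, here is a minimal Python sketch of the arithmetic. It is only an illustration: the function names are hypothetical, it assumes a steady sequential write rate, and the timeout is a parameter you would supply yourself rather than a value taken from Windows or Rapid Recovery.

```python
def implied_throughput_mb_s(cache_gb: float, seconds: float) -> float:
    """Sustained write rate (MB/s) implied by flushing cache_gb in `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return cache_gb * 1024 / seconds


def flush_seconds(cache_gb: float, write_mb_per_s: float) -> float:
    """Seconds needed to write cache_gb to disk at a steady write_mb_per_s."""
    if write_mb_per_s <= 0:
        raise ValueError("write_mb_per_s must be positive")
    return cache_gb * 1024 / write_mb_per_s


def fits_in_timeout(cache_gb: float, write_mb_per_s: float, timeout_s: float) -> bool:
    """True if the flush alone would finish within timeout_s (assumed value)."""
    return flush_seconds(cache_gb, write_mb_per_s) <= timeout_s


if __name__ == "__main__":
    # 20 GB flushed in 3 to 4 minutes, as in the example above.
    slow = implied_throughput_mb_s(20, 4 * 60)
    fast = implied_throughput_mb_s(20, 3 * 60)
    print(f"Implied write rate: {slow:.1f} to {fast:.1f} MB/s")

    # Hypothetical check: would a 20 GB flush at 100 MB/s fit in a 30 s window?
    print(fits_in_timeout(20, 100, 30))
```

The point of the sketch is that flush time grows linearly with cache size, so a larger deduplication cache makes it more likely that the service is still writing when the operating system gives up waiting. It ignores the time needed for non-cancellable jobs to finish, which comes on top of the flush.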