Resiliency isn’t just a buzzword in childrearing; you also need to build it into your Active Directory (AD) disaster recovery plan, including how you back up your domain controllers. This is my sixth prediction in the blog series 10 predictions for 2019: What’s in store for Windows and Office 365 pros, and we’ll explore why devastating malware compromises will change how you think about AD in your business continuity plan.
In my fifth prediction on ransomware, I explored how devastating attacks like NotPetya can trash entire data centers. I’ve talked to customers who call this a scorched earth scenario: everything is wiped out, from the domain controllers (DCs) down to the operating system and the virtual machine (VM) it runs on. Not a single DC around the globe is left standing. Pretty scary stuff.
Why do you need an Active Directory disaster recovery plan?
Simply put, if you don’t have an Active Directory disaster recovery plan and disaster strikes, your users won’t be able to log in to the applications they need to do their jobs. There is no business. If healthcare workers can’t access the data and applications they need to prescribe medicine, review patient records or collaborate with colleagues on upcoming surgeries, then people’s lives are at risk.
I could go on and on for each industry, but it’s essentially the same thing across the board: AD is the lifeblood for your apps, files and users. Without it, nothing else works.
How likely are you to use an Active Directory recovery plan?
Let’s explore two scenarios: the scorched earth scenario, where you actually have to execute the plan, and the audit-ready scenario, where you have to prove that you could.
Shipping giant Maersk lived through a scorched earth scenario — NotPetya ransomware left the shipping company dead in the water. Andy Greenberg’s account of the events of June 27, 2017 in WIRED paints the picture perfectly:
- Maersk’s network was so deeply corrupted that even IT staffers were helpless…[M]any employees — rendered entirely idle without computers, servers, routers or desk phones — simply left.
And of the impact, Greenberg continues:
- For days to come, one of the world’s most complex and interconnected distributed machines, underpinning the circulatory system of the global economy itself, would remain broken.
The Maersk teams responded quickly to rebuild their infrastructure in order to bring back to life the 76 ports they are responsible for across the globe as well as the 800 ships carrying a fifth of the world’s shipping capacity.
While Maersk had found almost all the backups for their individual servers, they couldn’t find the backups for their DCs. These DCs are the gatekeepers for user authentication and authorization to Maersk’s systems. Each of their 150 DCs was synced with the others, so each acted as a backup for the rest. But they never imagined all their DCs would go down at once. According to an IT staffer Greenberg quotes:
- If we can’t recover our domain controllers…we can’t recover anything.
This line of thinking isn’t unique to Maersk. For years, we were told that if we had global data centers, we would just build a DC in each location with enough capacity to carry the entire load if another data center had a disaster. All traffic would re-route to the other. We planned for localized natural disasters. We didn’t plan for malicious, nation-state cyber weapons worming their way through all our global data centers.
Let’s hope you never have to use an Active Directory recovery plan the way Maersk did. In many industries, though, you must prove you can execute this plan if the need arises. Many of our clients talk about being audit ready. This means more than a tabletop exercise: they must show, in a test environment, that their AD disaster recovery plan is extensively tested and ready to go at a moment’s notice.
Factors to consider in an Active Directory disaster recovery plan
An AD disaster recovery plan for the scorched earth scenario above requires rebuilding your AD from the ground up, from bare metal if you will. No DCs are available in other global data centers because they are all compromised. Everything from the server, the VM and the operating system to AD itself needs to be rebuilt. In some cases, as many as ten different departments could be responsible for getting your users up and running again.
Some clients rely on a paging system of first responders from various departments (server, storage, AD, OS, network, security, etc.) to respond and jump on an emergency call. But things happen: people go on vacation or miss the alert. In these cases, you’re stuck waiting for someone to provision your operating system.
To shorten your recovery time objective and mitigate the risk of a vital disaster recovery team missing their alert, here are our recommendations to build into your AD disaster recovery plan:
- Make copies of your DCs on a regular basis and store those in a completely separate network from AD with a non-Microsoft API going into that network so attacks like WannaCry and NotPetya can’t compromise your DC backups.
- Limit access to those backups to AD admins to reduce the likelihood of errors and malicious manipulation of the images.
- Build a kit of everything you would need and keep it on standby at your disaster recovery cold sites. This includes ready, offline VMs/servers with critical data, volumes and the OS so the AD team can quickly rebuild and deploy your DCs. When minutes count for your business, the last thing you want to do is call the server team to get a host.
- For virtualized DCs, most service delivery organizations consider it a best practice to keep physical DCs offline and ready in case the virtualized ones crash.
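To make the first two recommendations testable rather than aspirational, you could script a simple readiness check against an inventory of your DC backups. The sketch below is illustrative only: the `DcBackup` record, the `audit_backup` function, the account names and the seven-day age limit are all hypothetical and not taken from any Microsoft or vendor tool. It assumes you already have some way to export when each backup was taken, where it is stored and who can read it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DcBackup:
    """Hypothetical inventory record for one domain controller backup."""
    dc_name: str
    taken_at: datetime
    on_isolated_network: bool  # stored on a network separate from AD?
    readers: frozenset         # accounts that can access the backup


def audit_backup(backup, now, max_age, ad_admins):
    """Return a list of findings; an empty list means the backup passes."""
    findings = []
    # Recommendation 1: copies are made on a regular basis...
    if now - backup.taken_at > max_age:
        findings.append("backup is older than the allowed maximum age")
    # ...and stored on a completely separate network from AD.
    if not backup.on_isolated_network:
        findings.append("backup is not stored on a network separate from AD")
    # Recommendation 2: access is limited to AD admins.
    extra = sorted(backup.readers - ad_admins)
    if extra:
        findings.append("non-AD-admin accounts have access: " + ", ".join(extra))
    return findings


if __name__ == "__main__":
    now = datetime(2019, 1, 15, 12, 0)
    ad_admins = frozenset({"alice", "bob"})  # hypothetical admin accounts
    inventory = [
        DcBackup("dc01", now - timedelta(days=1), True, frozenset({"alice"})),
        DcBackup("dc02", now - timedelta(days=30), False,
                 frozenset({"alice", "mallory"})),
    ]
    for backup in inventory:
        findings = audit_backup(backup, now, timedelta(days=7), ad_admins)
        status = "OK" if not findings else "; ".join(findings)
        print(f"{backup.dc_name}: {status}")
```

A check like this does not replace a full recovery test, but running it on a schedule gives you a dated record that the backups your plan depends on actually exist, are isolated and are locked down.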
Learn more about building out and testing your Active Directory disaster recovery plan in this informational tech brief: Testing Your Active Directory Disaster Recovery Plan — The Hard Way vs. The Easy Way.