Disaster Recovery Use Case #4 – Natural Disasters

The Datrium Automatrix platform offers a comprehensive set of features that enable fast and cost-efficient disaster recovery (DR) for enterprises. In this blog post series, we’re looking at the four most recognizable use cases that cause organizations to trigger a DR plan.

Previously, we discussed how Automatrix provides resilience for partial or entire application stack recovery. We also covered full data center recovery, including using on-demand Datrium DRaaS with VMware Cloud on AWS as the DR target, which offers 10x more cost-effective DR.

In this article, we’ll cover how Automatrix helps organizations recover from natural disasters.

 

In one of the previous posts, we talked about the 2016 report by the Ponemon Institute, which found that weather-related disasters cause 10% of all data center downtime. According to The State of Enterprise Data Resiliency and Disaster Recovery 2019 report, natural disasters were responsible for 16.6% of DR events.

 

Many natural disasters can be predicted at least some time in advance, which allows organizations to prepare before the disaster strikes.

However, even when there is time to prepare, organizations must know that their DR plans are working correctly. If they cannot be confident that failover and failback will complete cleanly, DR should be initiated only when urgently needed.

I have seen an organization fail over its workloads to a DR site and then be unable to move them back. It ran its production environment out of its DR data center for more than five months.

One of Datrium’s customers was affected by the Sonoma County wildfires in 2017. Because of his careful planning, he was able to transition his workloads safely to his DR target site and then back to production.

In the case of natural disasters, the most common use case is a preemptive data center migration. Natural disasters that destroy data centers are also plausible. Still, in such cases, the recovery to a target site or the cloud is no different than a total power failure, which we discussed in our previous blog post, “Disaster Recovery Use Case #3 – Power Failure.”

 

Preemptive Data Center Migration

Preemptive migration implies that the disaster is yet to happen, and you need to migrate applications with minimal effort and ensure that they start operating quickly from a DR target site.

Datrium ControlShift is a DR orchestration SaaS application, which is driven by the same policy and snapshot system that enables backups in Automatrix. Using ControlShift, it’s straightforward to fail over applications running on a DVX production site to a DR site, and it’s also easy to fail back.

Here’s an overview of ControlShift features:

  • Runbook orchestration for VMs to restart correctly in a different data center.
  • Restart from current data or older backups. Automatrix is built to incorporate both current and old VM snapshots, so it’s ideal for ransomware or point-in-time recoveries.
  • RCO (Recovery Compliance Objective) of 30 minutes. Because Automatrix is a consolidated data plane with a focus on VMware and Kubernetes, it’s built to perform compliance tests of all required failover/failback resources every 30 minutes. It also offers a full test bubble system.
  • DRaaS (DR-as-a-Service) provides a subscription model and is fully integrated with VMware Cloud on AWS for on-demand disaster recovery. Datrium provides fully integrated purchasing, support, and billing for all components and services, including VMware Cloud on AWS and AWS. It’s delivered as a SaaS solution that eliminates all the complexity of packaged software.
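
To make the runbook idea above more concrete, here is a minimal sketch of how a restart order could be derived from VM dependencies. It is not ControlShift code or its API: the VM names, the `runbook` dictionary, and the `restart_waves` function are all hypothetical, and the sketch simply assumes that a runbook can be modeled as a dependency graph.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical runbook: each VM maps to the set of VMs that must be running first.
runbook = {
    "db01": set(),
    "app01": {"db01"},
    "app02": {"db01"},
    "web01": {"app01", "app02"},
}


def restart_waves(dependencies):
    """Group VMs into waves; all VMs in one wave can be powered on in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # VMs whose prerequisites are all up
        waves.append(ready)
        sorter.done(*ready)  # mark this wave as started so the next one unlocks
    return waves


waves = restart_waves(runbook)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

In this toy example the database starts first, the two application servers start together, and the web tier starts last. A real runbook also has to handle things this sketch ignores, such as network remapping and health checks between waves.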

 

DRaaS with VMware Cloud on AWS

Another option for preemptive data center migration is to use DRaaS with VMware Cloud on AWS. VMware Cloud provides a vSphere-based execution environment as a DR target, and the VMware Cloud SDDC is managed using the familiar vCenter interface – so there’s nothing new to learn.

VMware is the de facto standard for the enterprise data center, and AWS has the broadest set of cloud services with global scale and reach. VMware Cloud on AWS has a massive global presence, and Datrium has been helping many global organizations use VMware Cloud for DR, in many cases enabling them to shut down expensive, idle DR data centers.

At a high level, DRaaS is a complete, fully orchestrated DR solution for the VMware ecosystem that is offered as a subscription and leverages VMware Cloud and Datrium Cloud DVX backups.

A VMware Cloud SDDC is provisioned on demand, and a provisioned SDDC incurs only hourly charges. ControlShift performs automated network configurations for both AWS and VMware Cloud to make S3 backups from Cloud DVX available for spin-up in an SDDC.

When it’s time to restore operations at your primary data center, ControlShift efficiently fails back applications with minimal AWS egress charges by transferring only changed and globally deduplicated data. Like failover, failback is fully automated. Data changes that occur while executing in the VMware Cloud are captured and stored as a Cloud DVX snapshot in S3. Finally, upon DR completion, the SDDC is automatically decommissioned.
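
The egress savings come from sending only data the primary site does not already have. The sketch below illustrates that general idea with content fingerprints. It is not a description of how Cloud DVX works internally: the fixed 4-byte block size, the use of SHA-256, and the function names are all assumptions chosen to keep the example small.

```python
import hashlib

# Tiny block size so the example is easy to follow (an assumption for
# illustration; real systems use far larger blocks).
BLOCK_SIZE = 4


def fingerprints(data: bytes) -> list[str]:
    """Split data into fixed-size blocks and return one SHA-256 digest per block."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def blocks_to_transfer(cloud_copy: bytes, known_on_prem: set[str]) -> dict[str, bytes]:
    """Return only the unique blocks that the on-premises site does not hold yet."""
    needed: dict[str, bytes] = {}
    for i in range(0, len(cloud_copy), BLOCK_SIZE):
        block = cloud_copy[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        # Skip blocks already on-prem, and send each new block only once.
        if digest not in known_on_prem and digest not in needed:
            needed[digest] = block
    return needed


before_failover = b"AAAABBBBCCCCDDDD"             # state the primary site still has
after_running_in_cloud = b"AAAAXXXXCCCCXXXXDDDD"  # state after changes made in the cloud

known = set(fingerprints(before_failover))
delta = blocks_to_transfer(after_running_in_cloud, known)
sent = sum(len(block) for block in delta.values())
print(f"{sent} of {len(after_running_in_cloud)} bytes need to cross the network")
```

Here the changed block appears twice in the cloud copy but is transferred once, and the unchanged blocks are not transferred at all.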

 

When using DRaaS, there are two primary modes to choose from:

Just-in-Time – This mode eliminates upfront infrastructure CapEx and drastically cuts OpEx. You pay for VMware Cloud only when a disaster occurs. However, when your DR plan is triggered, you may need to wait approximately 90 minutes for an SDDC to be created. After the DR event is over, the changes are synchronized back, and the SDDC is torn down.

 

 

Pilot-Light – In this mode, ControlShift creates an SDDC on VMware Cloud with a minimal number of hosts to fail over the most critical VMs with very low RTO. Then, on-demand, new hosts are added to the SDDC to complete the failover of less essential VMs. In this mode, you pay for just a minimal number of hosts until the DR plan is triggered, and then full capacity only when DR is in full effect.
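
The trade-off between the two modes can be sketched as simple arithmetic. Everything numeric below is a placeholder: the hourly rate, host counts, and hours spent failed over are invented for illustration and are not Datrium, VMware, or AWS pricing. The model also leaves out the steady-state cost of S3 backup storage and any subscription fees, and the always-on figure is included only as a baseline.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class Scenario:
    host_hourly_rate: float   # placeholder price per host-hour, not a real quote
    full_hosts: int           # hosts needed to run everything during a DR event
    pilot_hosts: int          # hosts kept running all year in Pilot-Light mode
    dr_hours_per_year: float  # hours per year spent failed over (tests and events)


def just_in_time_cost(s: Scenario) -> float:
    """Hosts are paid for only while the on-demand SDDC exists."""
    return s.host_hourly_rate * s.full_hosts * s.dr_hours_per_year


def pilot_light_cost(s: Scenario) -> float:
    """Pilot hosts are paid for all year; the extra hosts only during DR."""
    steady = s.host_hourly_rate * s.pilot_hosts * HOURS_PER_YEAR
    burst = s.host_hourly_rate * (s.full_hosts - s.pilot_hosts) * s.dr_hours_per_year
    return steady + burst


def always_on_cost(s: Scenario) -> float:
    """Baseline: full capacity running all year in a cloud DR environment."""
    return s.host_hourly_rate * s.full_hosts * HOURS_PER_YEAR


example = Scenario(host_hourly_rate=10.0, full_hosts=8, pilot_hosts=2, dr_hours_per_year=100)
for name, cost in [
    ("just-in-time", just_in_time_cost(example)),
    ("pilot-light", pilot_light_cost(example)),
    ("always-on", always_on_cost(example)),
]:
    print(f"{name:>12}: {cost:,.0f} per year (placeholder units)")
```

The shape of the result matters more than the numbers: Just-in-Time is cheapest but accepts the SDDC creation wait, while Pilot-Light pays a small standing cost in exchange for a very low RTO on the most critical VMs.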

Our on-demand model for DRaaS radically improves the economics for DR. Customers have reported saving almost 90% over traditional DR approaches, such as having a secondary physical site or an “always-on” DR environment in the cloud.

DRaaS enables you to provision an on-demand SDDC in VMware Cloud on AWS and pay as you go – for testing or in the event of a disaster. The only steady-state cost is storing data-reduced backups on S3. You get protection from power failures, ransomware, and natural disasters in a single solution.

Unlike other DR solutions, we keep virtual machines in their native vSphere format, which eliminates brittle, time-consuming VM disk format conversions.

While your production site is up, Cloud DVX continuously backs up your data to AWS S3 with low RPO and global dedupe to minimize costs. When disaster strikes, ControlShift immediately executes your fully compliant DR plan to fail over your workloads to an on-demand SDDC created in VMware Cloud on AWS. You get a consistent operational experience with vSphere both on premises and in the cloud, so you and your team don’t have to learn a new set of tools.

 

Disaster Recovery without DVX (DRaaS Connect)

Until now, only customers with DVX on-premises infrastructure could leverage the benefits of DRaaS. DRaaS Connect changes that, so the functionality is available for existing storage platforms.

DRaaS Connect, a feature of DRaaS, is downloadable, lightweight software for any vSphere infrastructure that enables customers to protect VMs in just minutes. DRaaS Connect for vSphere On-Prem extends Datrium DRaaS to any vSphere on-premises infrastructure, including SANs, NAS, HCI, and DHCI.

 

 

DRaaS Connect gives organizations whose existing storage solutions are not yet ready to be replaced with DVX an option to implement orchestrated DR to VMware Cloud on AWS.

 

Conclusion

In this series, we’ve covered the four most common types of DR events, their causes, and how Automatrix helps to avoid or recover from such disasters. Each blog post in this series gives you more insight into the high-level architecture and business benefits of using Automatrix and DRaaS. If you’re ready to learn more, please contact us to see how we can help you create an on-demand DR strategy and reduce your costs.