Adam, an IT administrator I’ve known for a while, always had doubts about his prior DR service provider. “I don’t know,” he recalled, “it always seemed as if they were looking for a different kind of customer. DR tests were a LOT harder than they should have been.”
Unfortunately, this is not uncommon. Cloud DR has always had some big advantages, such as pay-as-you-go compute during a failover. And these days, it’s obviously attractive to rely on already-scaled public clouds as your DR site vs. more limited private data centers.
But DRaaS is a sophisticated solution, so implementations vary widely. You need to make sure your provider shares your values.
At Datrium, we have taken a fresh look at the problem for the hybrid cloud world. We listened carefully to how our customers prioritized scalable simplicity, at low cost, for VMware VMs. For most data centers, it’s the obvious list. But it turns out we’re unique in the market.
To celebrate the general availability of our DRaaS Connect feature, below is a Top 10 list of the red flags to note if you are working with a vendor that doesn’t share Datrium’s vision for DRaaS.
- VMware Cloud for failover. If you’re thinking about failover for VMware VMs, this decision is the most critical one. Once failed over, you operate using vCenter, just like normal. Failover is rare, and administration needs to be obvious. That is the center of Datrium’s design. → Red Flags: Your provider asks you to learn a different cloud’s methods of operation (see more below). Also, your provider asks you to prepare to convert your VMs to a different kind of VM; this can work, but you need to fully test it to discover when it doesn’t.
- Ransomware recovery, cost-optimized. Ransomware is the most common reason for data center outages, approaching 50% of all failovers. A simple recovery solution requires fast, low-cost access to old immutable images, so those images need to sit on a service like S3, with aggressive data reduction. Datrium Cloud Backup vaulting uses industry-leading global deduplication and compression to support years of retention, if required, on S3. → Red Flags: Failover images on cloud block storage, sometimes at 10x the $/TB, and/or lack of global dedupe. Or, cost optimization that hurts RTO (see below).
- Fast RTO, cost-optimized. When a data center fails, simplicity of experience is about fast time to recovery. Any other answer makes life complicated and possibly dangerous. Since ransomware recovery requires fast access to old images at low cost, this is hard to solve technically. Datrium offers it through Live Mount technology, using scalable instance flash caching for any selected backup on S3 (caching is on demand only during failover, for cost optimization). → Red Flags: Only shallow retention, such as most-recent-only, is available at the faster RTO, or recovery requires a slow data copy from cost-optimized backup storage into primary storage. Or both.
- RPO. Some DR software optimizes for continuous replication or small snapshot gaps to favor frequent restore points. No one offers this feature to a low-cost, globally deduped cloud store on S3 (yet), so the currently available method is in conflict with our cost-optimized vision. Datrium offers backup-style RPOs today and recommends VMware functions such as vSphere Replication for (the typically small number of) more mission-critical apps. If these use a small pilot-light 3-host cluster, the rest of the cluster can be built on demand for the rest of the VMs with Datrium DRaaS. → Red Flags: High bandwidth and capacity required for all VMs.
- Hiding AWS console and billing. AWS has more than 200 discrete services. If you’re doing DR for on-prem VMs to VMware Cloud, you don’t really need to understand any of them; a SaaS DR service like Datrium’s can provide whatever is required and hide the rest. AWS native billing is also a whole new world of surprises. Datrium hides AWS (unless transparency and direct billing are specifically requested by large AWS customers). → Red Flags: You are required to have an AWS account and billing in addition to the price you pay for DRaaS.
- No AWS instance conversion training. For VMware Cloud failover, you’ll need to know the vCPUs required, the RAM required, and the maximum cluster capacity. There is no conversion. That’s Datrium’s focus, so there’s very little to learn. → Red Flags: For AWS instance conversion, you’ll need to understand EC2, map to the correct Windows or Linux instance types, possibly install new drivers or agents in guests on prem, and possibly change Windows registries.
- Eliminating the need to install and maintain cloud software. A promise of SaaS is that you don’t have to manage infrastructure or software. With Datrium DRaaS, the service and its availability were born in the cloud, and Datrium manages uptime. There is no software to install in the cloud. → Red Flags: Products are called DRaaS, but you have to install and maintain software in the cloud.
- Eliminating software upgrades on prem. Some software connectors may need to be installed on prem. That is true for the Datrium DRaaS Connect software appliance for vSphere on prem. But true to our SaaS philosophy with Datrium DRaaS, it needs no administrative attention to upgrade over time. All upgrades are designed to be over-the-air and automatic. → Red Flags: Software on prem requiring upgrade processes by administrators over time.
- No new cloud security models. Clouds have their own security models, which differ from vSphere’s. Datrium focuses on well-understood vSphere models, while allowing for powerful enhancements such as NSX and AWS Direct Connect. → Red Flags: Long lists of security options required by AWS services. For example, some require more than 30 settings for each instance, and still more for S3, IAM, and CloudTrail. No typos allowed!
- Continuous compliance checking. If you have to install software, maintain it, and manage resources independently, it’s harder to validate that resources are in place and ready for failover. An advantage of a true service is that all resources can be reviewed regularly and consistently, including on-demand compute infrastructure. Datrium offers continuous compliance checks every 30 minutes, checking all resources across all Plans. Continuous compliance checking vastly reduces errors caused by data center changes that make DR plans drift. (Datrium still recommends periodic DR exercises to ensure current staff is familiar with all processes required.) → Red Flags: Not having this feature.
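To make the idea of plan drift from the last item concrete, here is a minimal Python sketch of the kind of check a compliance pass might run. It is not Datrium’s implementation: the `VM` record, the `check_plan` function, and the capacity limits are all hypothetical names invented for illustration. It assumes a DR plan is just a list of VM names, and that the only sizing inputs are the ones named in the list above: vCPUs, RAM, and maximum cluster capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    """Hypothetical inventory record for one on-prem VM."""
    name: str
    vcpus: int
    ram_gb: int


def check_plan(plan_vm_names, inventory, max_vcpus, max_ram_gb):
    """Return a list of problems; an empty list means the plan passes.

    Assumption: a plan is a list of VM names, and the failover cluster
    is described only by its maximum vCPU and RAM capacity.
    """
    problems = []
    by_name = {vm.name: vm for vm in inventory}
    planned = set(plan_vm_names)

    # Drift: the plan names a VM that no longer exists on prem.
    for name in sorted(planned - by_name.keys()):
        problems.append(f"plan references missing VM: {name}")

    # Drift: a VM exists on prem but no plan protects it.
    for name in sorted(by_name.keys() - planned):
        problems.append(f"VM not covered by plan: {name}")

    # Capacity: total demand of the planned VMs that still exist.
    present = [by_name[n] for n in planned if n in by_name]
    total_vcpus = sum(vm.vcpus for vm in present)
    total_ram_gb = sum(vm.ram_gb for vm in present)
    if total_vcpus > max_vcpus:
        problems.append(f"plan needs {total_vcpus} vCPUs, cluster allows {max_vcpus}")
    if total_ram_gb > max_ram_gb:
        problems.append(f"plan needs {total_ram_gb} GB RAM, cluster allows {max_ram_gb}")
    return problems


if __name__ == "__main__":
    inventory = [VM("web01", 4, 16), VM("db01", 8, 64), VM("new01", 2, 8)]
    plan = ["web01", "db01", "old01"]
    for problem in check_plan(plan, inventory, max_vcpus=16, max_ram_gb=64):
        print(problem)
```

Run on a schedule (every 30 minutes, say), a check like this would surface a renamed or newly added VM long before a failover depends on the plan being current.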