VM Private Cloud: Managing VMs at Scale

Want to know what’s under the hood of the new rackscale private cloud? It starts with a converged system for consolidating VMs with very high performance and low latency. On top of that, we provide simple and efficient secondary storage, with features such as built-in erasure coding, global dedupe, and compression. This combination gives you seamless data management for fine-grained objects like VMs, virtual disks, and ISOs, resulting in better life-cycle orchestration for business applications at scale in your private cloud. We call it DVX, and here’s a closer look.

No more hand-crafting policies for each VM

With hundreds or thousands of VMs to manage, configuring each VM’s policy individually is impractical. Protection groups (analogous to policy groups) let you group VMs under a common policy for snapshots, retention, and replication across the private cloud. You can, for example, create a protection group called “Finance-Group” to protect all VMs in the finance department. New VMs can come and go, and the protection group automatically detects the changes. It does this by letting the administrator define a policy that includes VMs (or other objects like virtual disks and ISOs) whose names match a certain pattern. The pattern is evaluated each time an action is taken (such as taking a snapshot), which removes the burden of configuring a policy for each VM individually.
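To make the pattern-matching idea concrete, here is a minimal Python sketch. It is not the DVX interface: the `ProtectionGroup` class, its fields, the glob-style pattern syntax, and the `fin-*` naming convention are all assumptions made for illustration. What it shows is that membership is recomputed from the name pattern each time an action runs, so VMs that appear or disappear are picked up without touching the policy.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class ProtectionGroup:
    """Hypothetical policy object; field names are illustrative only."""
    name: str
    pattern: str          # glob-style name pattern (an assumption)
    retention_days: int

    def members(self, inventory):
        # Membership is never stored. It is re-evaluated against the
        # current inventory every time an action is about to run.
        return sorted(obj for obj in inventory if fnmatchcase(obj, self.pattern))


finance = ProtectionGroup(name="Finance-Group", pattern="fin-*", retention_days=30)

# First scheduled snapshot: two finance VMs exist.
inventory = ["fin-db-01", "fin-web-01", "hr-app-01"]
first_run = finance.members(inventory)

# A VM is added and another is retired before the next snapshot.
inventory.append("fin-report-01")
inventory.remove("fin-web-01")
second_run = finance.members(inventory)

print(first_run)
print(second_run)
```

The policy object itself never changes between the two runs; only the inventory does, which is the point of matching by name rather than by an explicit list of VMs.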

An integrated snapshot catalog

The foundation is a very efficient snapshot mechanism capable of handling fine-grained objects. All of these snapshots are stored in SnapStore, which is built into every system. Think of SnapStore as a backup catalog (like S3 in AWS) that provides a few very important things. SnapStore separates copy data management from the main data path for live VMs. The DVX architecture, which separates capacity from performance, ensures the objects in SnapStore do not interfere with the performance of your live VMs. In addition, SnapStore has integrated search capabilities. When you have thousands of objects, browsing them is no longer tenable, so SnapStore delivers a fast search function to quickly locate the objects of interest. From SnapStore, you can also restore or clone your VMs or virtual disks. Multiple SnapStores across your private cloud work together to provide an integrated data cloud.

Data management without knobs

There could be hundreds of VMs in a protection group policy. We had two design choices when taking scheduled snapshots of all the VMs in the protection group: either snapshot one VM at a time sequentially, or snapshot them all together in a consistent way. We observed that some vendors provide a concept called “Consistency Groups,” where the administrator selects a set of VMs and then takes additional steps to include these VMs in the consistency group. For example, there might be a 3-tiered application running in three different VMs, all of which need to be snapped together for application consistency. We decided to significantly reduce the burden on the end user by eliminating these cumbersome steps and snapshotting all VMs in a protection group in a consistent and atomic way. Our philosophy has been to provide features that just work and to avoid knobs as much as possible. This, for us, is the only reasonable way to manage a system at scale.
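The difference between the two design choices can be illustrated with a toy model. The sketch below is not how DVX implements group snapshots; `ToyStore` and its methods are hypothetical, and a single in-process lock stands in for whatever mechanism the real system uses. It only shows the property being described: every member of the group is captured at the same point in time, or the snapshot does not happen at all.

```python
import threading
from copy import deepcopy


class ToyStore:
    """Hypothetical in-memory stand-in for VM state; illustration only."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = {}

    def write(self, vm, data):
        with self._lock:
            self._state[vm] = data

    def snapshot_group(self, vms):
        # Hold one lock across the whole group so no write can land
        # between capturing one member and the next.
        with self._lock:
            missing = [vm for vm in vms if vm not in self._state]
            if missing:
                # All or nothing: refuse a partial group snapshot.
                raise KeyError(f"unknown VMs: {missing}")
            return {vm: deepcopy(self._state[vm]) for vm in vms}


store = ToyStore()
for tier in ("web", "app", "db"):
    store.write(tier, {"version": 1})

# One call captures the three tiers of the application together.
snap = store.snapshot_group(["web", "app", "db"])

# Later writes do not leak into the snapshot already taken.
store.write("db", {"version": 2})
print(snap)
```

With sequential per-VM snapshots, a write to `db` could land after `web` was captured but before `db` was, leaving the three tiers mutually inconsistent; capturing the group in one step rules that out.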

Bring all your objects, not just VMs

Business applications include not only VMs, but all the files that go with them. Therefore, the protection group allows you to have a policy for other objects besides VMs and virtual disks, such as ISOs, OVAs, and VM templates. You’re not forced to use or create another tool just to manage the policy and mobility of these objects; we make it convenient to include all other object types in the protection group.

It’s a private cloud: high performance and data management must co-exist

You can clone or restore VMs or virtual disks from SnapStore on any connected DVX. This makes it easy to enable test/dev workflows where applications are cloned quickly into new VMs or virtual disks. Some systems discourage running production and test/dev VMs in the same system. Because of the unique host isolation properties of a DVX, it is easy to run test/dev and production applications in the same Datrium system with high performance. By using different hosts for different workloads, each isolated from neighboring application workloads, you get high and predictable application performance.