by Murli Thirumale

How do you back up containerized apps?

analysis
Dec 18, 2019 · 4 mins
Software Development

Containers have transformed enterprise applications. Now backup systems need to catch up.

As more enterprises adopt a containerized approach for applications, the need for an effective backup system becomes critical. Containers have unique characteristics that set them apart from other deployment models, and without the right backup architecture in place, enterprises run the risk of significant downtime, data loss, or both. 

IDC has found that 76 percent of enterprises are making broad use of containers for mission-critical apps, with improved security and operations management cited as key drivers. Our own survey found that 87 percent of enterprises are using containers, with 90 percent of them in production. 

Today, backups are often implemented at the virtual machine level. That’s fine if an application runs on a single VM, but containerized apps are often distributed over multiple VMs. Conversely, a single VM often runs container pods associated with multiple different applications. In such environments, backups need to be more precisely targeted than they are today. 

In short, businesses need a backup architecture that can support the automated, highly distributed application model enabled by Kubernetes. With that in mind, here are four key requirements for building an effective backup solution for containerized applications.  

Take a container-granular approach

Instead of backing up everything that runs on a VM or bare-metal server, operators need the ability to back up specific containers or groups of containers running on specific nodes. That means being able to select just the application they need to back up, even if other apps share the same node or the app is spread across multiple nodes.

By backing up at the container level, IT teams can avoid the complications of the extract, transform, and load (ETL) procedures that would be required if they backed up a group of VMs in their entirety. Backing up only the specific applications required also minimizes storage costs and keeps recovery time objectives (RTOs) low.
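To make the difference concrete, here is a minimal sketch in Python. The pod inventory, the names, and both helper functions are hypothetical and not taken from any particular backup product. The sketch contrasts selecting one application's volumes across nodes with taking everything that happens to sit on one node.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Pod:
    name: str
    app: str      # application the pod belongs to
    node: str     # VM or bare-metal host the pod runs on
    volume: str   # persistent volume that holds the pod's data


# Hypothetical inventory: two apps interleaved across two nodes.
PODS = [
    Pod("cassandra-0", "cassandra", "node-a", "vol-cass-0"),
    Pod("cassandra-1", "cassandra", "node-b", "vol-cass-1"),
    Pod("postgres-0", "postgres", "node-a", "vol-pg-0"),
]


def volumes_for_app(pods: List[Pod], app: str) -> List[str]:
    """Return only the volumes of one application, whichever node they sit on."""
    return sorted(p.volume for p in pods if p.app == app)


def volumes_on_node(pods: List[Pod], node: str) -> List[str]:
    """Return everything on one node, which is what a VM-level backup captures."""
    return sorted(p.volume for p in pods if p.node == node)


print(volumes_for_app(PODS, "cassandra"))  # both Cassandra volumes, across two nodes
print(volumes_on_node(PODS, "node-a"))     # a mix of Cassandra and Postgres data
```

The node-level selection misses half of the Cassandra data and drags in an unrelated Postgres volume, which is the mismatch the container-granular approach is meant to remove.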

Make it Kubernetes namespace-aware

This granular concept can be extended to entire namespaces. Within Kubernetes, a namespace typically holds multiple applications that are related in some way. For example, an enterprise might have a namespace for one branch of the company. Often, operators want to back up the entire namespace, not just a single application in that namespace. Traditional backup solutions run into the same problems outlined above, because namespaces span VM boundaries. Teams need the ability to back up entire namespaces no matter where the containers that make up that namespace run.
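A namespace-level selection might look like the following Python sketch. The workload records, the namespace names, and the function are illustrative assumptions. The point is only that the selection key is the namespace, while the nodes involved are a by-product.

```python
from typing import Dict, List, Tuple

# Hypothetical workload records: (namespace, app, node, volume).
WORKLOADS: List[Tuple[str, str, str, str]] = [
    ("branch-east", "orders-db", "node-a", "vol-orders-east"),
    ("branch-east", "inventory-db", "node-b", "vol-inventory-east"),
    ("branch-west", "orders-db", "node-a", "vol-orders-west"),
]


def plan_namespace_backup(
    workloads: List[Tuple[str, str, str, str]], namespace: str
) -> Dict[str, List[str]]:
    """Collect every volume in a namespace and the nodes those volumes span."""
    volumes = []
    nodes = set()
    for ns, _app, node, volume in workloads:
        if ns == namespace:
            volumes.append(volume)
            nodes.add(node)
    return {"volumes": sorted(volumes), "nodes": sorted(nodes)}


plan = plan_namespace_backup(WORKLOADS, "branch-east")
print(plan["volumes"])  # every volume in the namespace
print(plan["nodes"])    # the namespace spans more than one node
```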

Build application-consistent backups

Snapshotting a multi-container app so that it can be recovered without risking data corruption requires all of its containers to be locked during the snapshot operation. VM-based snapshots can’t achieve this, because the containers can run on different servers, and neither can individual snapshots executed one after another. What is needed is a rules engine that lets operators automatically execute the required commands for each particular data service. For Cassandra, for instance, admins run the nodetool flush command on each node to write in-memory data to disk, so that a snapshot taken across multiple Cassandra containers is application-consistent. This type of rules-based snapshot is essential for each data service an enterprise runs.
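The shape of such a rules engine can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the rule table, the function names, and the injected `run` and `snapshot` callables, which stand in for whatever mechanism actually executes a command inside a container and snapshots a group of volumes. Only the nodetool flush command comes from the text above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical rule table: data service -> commands to run before the snapshot.
PRE_SNAPSHOT_RULES: Dict[str, List[str]] = {
    "cassandra": ["nodetool flush"],
}


def snapshot_group(
    containers: List[Tuple[str, str]],
    run: Callable[[str, str], None],
    snapshot: Callable[[List[str]], None],
) -> None:
    """Apply each service's pre-snapshot rule to every container, then take
    one snapshot of the whole group rather than one per container."""
    for name, service in containers:
        for command in PRE_SNAPSHOT_RULES.get(service, []):
            run(name, command)
    snapshot([name for name, _service in containers])


# Stand-in executors that only record what would happen.
log: List[str] = []
snapshot_group(
    [("cassandra-0", "cassandra"), ("cassandra-1", "cassandra")],
    run=lambda name, command: log.append(f"{name}: {command}"),
    snapshot=lambda names: log.append("snapshot " + ",".join(names)),
)
print("\n".join(log))
```

Passing the executors in as arguments keeps the rule logic separate from the cluster tooling, so the same table could be extended with entries for other data services without touching the ordering logic.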

Configuration backups are essential, too

In a container environment, operators need to back up the application configuration as well as the data. With data only, admins would need to rebuild the app configuration by hand, which means recovery times would be unacceptably long. Backing up the configuration requires capturing the YAML files that define a deployment, as well as other Kubernetes objects such as service accounts and PersistentVolumeClaims (PVCs). Backing up only the configuration would leave operators without the application data, so clearly both are required.
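As a rough illustration, the Python sketch below bundles configuration objects and data references together and refuses to produce a backup that has only one of the two. The object dictionaries, the set of kinds, and the function are hypothetical. The bundle is written as JSON only to keep the example within the standard library, not because the configuration must be stored that way.

```python
import json
from typing import Dict, List

# Kinds treated as configuration in this sketch (an illustrative subset).
CONFIG_KINDS = {"Deployment", "ServiceAccount", "PersistentVolumeClaim"}


def build_backup_bundle(objects: List[Dict], volume_snapshots: List[str]) -> str:
    """Bundle configuration objects with data snapshots; reject partial backups."""
    config = [o for o in objects if o.get("kind") in CONFIG_KINDS]
    if not config or not volume_snapshots:
        raise ValueError("a restorable backup needs both configuration and data")
    bundle = {"config": config, "data": sorted(volume_snapshots)}
    return json.dumps(bundle, indent=2, sort_keys=True)


# Hypothetical objects for one application.
OBJECTS = [
    {"kind": "Deployment", "metadata": {"name": "orders"}},
    {"kind": "ServiceAccount", "metadata": {"name": "orders-sa"}},
    {"kind": "PersistentVolumeClaim", "metadata": {"name": "orders-data"}},
    {"kind": "Event", "metadata": {"name": "transient"}},  # not configuration
]

print(build_backup_bundle(OBJECTS, ["snap-orders-data"]))
```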

Containers have dramatically reshaped the data center, allowing enterprises to deploy applications more quickly, support multi-cloud environments, and reduce infrastructure costs. But this innovation creates pressure for change in other areas such as data backup. I hope the guidance above helps you build an effective backup architecture, so that you can deploy containerized apps confident that critical data will always be available, protected, and secure.

Murli Thirumale is co-founder and CEO of Portworx, provider of cloud-native storage and data management solutions for Kubernetes. 

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.