by Eric Han

How to run stateful applications on Kubernetes

how-to

Apr 10, 2019 | 15 mins

Cloud Computing, Databases, Software Development

Take advantage of Portworx PX-Enterprise to simplify management of data-rich workloads on Kubernetes

[Image: Virtual data center servers. Credit: Henrik5000 / Getty Images]

Kubernetes has many core abstractions, sometimes called primitives, that make the experience of deploying and managing applications so much better than what came before. Understanding these abstractions helps you take full advantage of Kubernetes and also avoid complexity—especially when running stateful applications like databases, data analytics, big data applications, streaming engines, machine learning, and AI apps.

In this article, I’ll review some of the fundamental abstractions in Kubernetes storage, and walk through how Portworx PX-Enterprise helps solve important challenges that arise with the need for persistent storage in Kubernetes.

Kubernetes abstractions and Kubernetes storage

The Pod is a great example of a core Kubernetes abstraction. It’s actually the first example—the starting point.

Back in 2015, other container orchestration systems started with a single container as the fundamental abstraction; Kubernetes started with Pods. A Pod is a group of one or more containers that need to run together to be useful. One simple analogy is that a Pod is like an outfit of clothing. It’s great to have a shirt and socks on, but let’s not walk out the door without pants!

Pods are like that—they let us focus on what’s needed to be useful (a running outfit) and not overload us with bookkeeping minutiae (a shoelace, one sock). Don’t get me wrong, the minutiae are tracked by the scheduler and the Kubelet (the Kubernetes agent). But it’s this abstraction that allows the ecosystem to build on Kubernetes and administrators to automate their infrastructure. And today, we see that most other schedulers have adopted the Pod concept, a sure sign of its usefulness.

The world of storage in Kubernetes has its primitives too, some of which may sound complex at first glance. These abstractions come together to take a complex problem—how to schedule efficiently when application demand is unpredictable—and provide a reliable solution. At the end of the day, you wouldn’t want to run in production without these abstractions.

Here are the Kubernetes abstractions that describe and control storage:

  • PersistentVolume (PV) – the representation for where data is held. Your infrastructure provider or storage vendor implements this. PVs are what you protect through standard means like backup, replication, and encryption.
  • PersistentVolumeClaim (PVC) – how a Pod requests a PersistentVolume, including describing the size of PV needed. After the request, the PVC becomes the reference between a Pod and its PersistentVolume.
    • Now, you might ask, why not skip PVCs and have Pods directly use PersistentVolume? Without a PVC concept, applications would be less portable, which we’ll explain later.
  • StorageClass (SC) – describes the types of storage that your infrastructure offers. For example, your provider may offer two flavors: fast SSD with encryption and slow HDD without encryption.
    • Just like with PVC, you might ask why is this needed? Again, these abstractions help with portability and let administrators prevent abuse from sloppy applications. We’ll also explain this point below.

Here are the Kubernetes abstractions that describe and control applications:

  • Pod – one or more containers that run on the same server, work together, and together form a basic unit of work.
  • Deployment – a controller that ensures that the desired number of application Pods are running and that manages the Pod’s lifecycle. A lifecycle event might be adding more Pods or updating the version. A Pod definition is included within the written specification of a Deployment.
    • A common question from customers is when to use a Deployment versus a StatefulSet. This is a good question that we’ll expand upon.
  • StatefulSet – manages the entirety of the database, instead of individual Pods and their PVCs.
    • It’s important to remember that a horizontally scaling database, like Cassandra, runs with multiple Pods that work together. With a StatefulSet, you don’t have to think about how each Cassandra node (instance) relates to another node. Kubernetes does that for you.
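To make this concrete, here is a minimal sketch of what a Cassandra StatefulSet might look like. The names, image tag, and sizes are illustrative, not a production configuration:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra
  # Three Pods that together form one Cassandra ring
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
      - name: cassandra
        image: "cassandra:3.11"
        volumeMounts:
        - name: cassandra-data
          mountPath: /var/lib/cassandra
  # One PVC is stamped out per Pod, so each Cassandra node
  # keeps its own stable volume across reschedules
  volumeClaimTemplates:
  - metadata:
      name: cassandra-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi
```

Note the volumeClaimTemplates section: instead of one PVC written by hand, the StatefulSet creates a PVC per Pod and keeps each Pod bound to its own volume—that is what it means for the StatefulSet to manage the Pods and their PVCs as a whole.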

These are the fundamental Kubernetes primitives that enable portability and scalability. There is a subtle and powerful beauty to how these abstractions work together. Since StatefulSets wrap a lot of the underlying primitives, let’s start with a more basic example using PostgreSQL. Then we will directly touch on some of the primitives, starting with a PVC, and build upwards.

Deploying PostgreSQL on Kubernetes

Customers love how Kubernetes manages applications on their infrastructure. As we walk through an example with PostgreSQL, we see that the Kubernetes primitives were designed to be as portable as possible. Even before we try to equate portability with multi-cloud, we see that portability means that the proper primitives enable apps to run, re-run, and re-re-run across servers.

Portability and robustness are thus two sides of the same coin, which makes sense if we think about it. Apps have to be portable across servers if apps are to survive failures.

Back to our PostgreSQL example. The Pod identifies the container image and a PVC. Here, we will run the PostgreSQL database, so the container image is for version 10.1 of PostgreSQL. The Pod is written as a section within a Deployment specification in this example. Had we chosen Cassandra, we would have written our Pod as part of a StatefulSet.

The Deployment not only holds a Pod definition, but also allows us to make updates to that Pod as it runs. Let’s look at all of this within the Deployment specification. I’ve added comments for explanation.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: postgres
spec:
  template:
    # Pod definition portion of this deployment specification
    metadata:
      labels:
        app: postgres
    spec:
      # Container to use the application image PostgreSQL 10.1
      containers:
      - image: "postgres:10.1"
        name: postgres
        envFrom:
        - configMapRef:
            name: example-config
        ports:
        - containerPort: 5432
          name: postgres
        volumeMounts:
        # Container to use the PVC below called 'postgres-data'
        - name: postgres-data
          # Container sees itself as writing to the directory below
          mountPath: /var/lib/postgresql/data
      volumes:
      - name: postgres-data
        # PVC available to any containers in this Pod spec
        persistentVolumeClaim:
          claimName: postgres-data-claim

In the above example, the Pod knows about its PVC but does not—and need not—know about the PersistentVolume. This part may feel a little roundabout, so bear with me. The PVC requests an amount of storage capacity and the type of storage to use. The PVC looks like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data-claim
  # Create a PersistentVolume using this StorageClass definition
  annotations:
    volume.beta.kubernetes.io/storage-class: px-postgres-sc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    # Request a PersistentVolume with 5 GB of storage capacity
    requests:
      storage: 5Gi

Up to this point, the application owner has been describing their app and its requirements. Now the infrastructure administrator gets involved, defining the types of storage available by publishing StorageClasses.

It’s important to separate the concerns that the application is addressing from those of the infrastructure: the infrastructure admin needs to define what is sustainable in a shared cluster. Without such primitives, applications could trash each other—a problem that some Kubernetes alternatives are susceptible to.

In the StorageClass specification below, all PVCs that specify this StorageClass will have replication and encryption and will be configured for database I/O workloads. StorageClasses are storage provider specific. Under the covers, the vendor who provides the PersistentVolume implements these features.

apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: px-postgres-sc
provisioner: kubernetes.io/portworx-volume
parameters:
  # Replicate three copies using Portworx
  repl: "3"
  # Tune the volume I/O for databases
  io_profile: "db"
  # Encrypt the data using a key from the key management system
  secure: "true"

Running PostgreSQL on Kubernetes

Now to install all of the above primitives, the administrator starts with the StorageClass. Typically, administrators will design several StorageClasses, allowing for tradeoffs between what different apps require and what the infrastructure can support.
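For instance, alongside the database-tuned class above, an administrator might publish a second, cheaper class for less critical apps. The class name and parameter values below are illustrative; the parameters are Portworx-specific, as with the earlier example:

```yaml
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: px-general-sc
provisioner: kubernetes.io/portworx-volume
parameters:
  # Only two replicas: cheaper, but less resilient
  repl: "2"
  # No encryption for non-sensitive workloads
  secure: "false"
```

Application owners then pick the class that matches their needs, and the administrator keeps control over what the shared cluster actually offers.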

To publish the first StorageClass, the administrator runs the following command with the corresponding YAML file:

$ kubectl create -f px-storage-class.yaml
storageclass.storage.k8s.io "px-postgres-sc" created

The application owner can now use storage as defined by the StorageClass. An application Pod will use a PVC to request storage. Since this is a new application, a new PersistentVolume will be created that satisfies the PVC. To create the PVC, we run the following command with our PVC file:

$ kubectl create -f pvc.yaml
persistentvolumeclaim "postgres-data-claim" created

Now we are ready to deploy the PostgreSQL database. Our database will run as a Pod with a PostgreSQL container inside it. Since the Pod was defined within a Deployment specification, we simply create all of this by running the command on the Deployment YAML file:

$ kubectl create -f postgres-deployment.yaml
deployment.extensions "postgres" created

We can look backwards to see what was created. First, we ask for all PVCs by running the following command. Below, we see that the PVC we created is in a Bound status, meaning that it’s using the PersistentVolume. In other words, the storage primitives are ready for use.

$ kubectl get pvc
NAME          STATUS   VOLUME     CAPACITY   STORAGE CLASS   AGE
postgres...   Bound    pvc-3...   5Gi        px-...          17s

Next, we can look at the Pod that is running our PostgreSQL container. Below, we see that our PostgreSQL Pod is ready to serve requests.

$ kubectl get pods
NAME                     READY     STATUS    RESTARTS   AGE
postgres-dff54d66d...    1/1       Running   0          6s

Taking a step back, we see that we created all of the primitives needed to run a database. More than that, we standardized on the storage our infrastructure offers to applications, which improves the experience for all subsequent Pods. And we can now control the database Pod using our Deployment object, automating parts of the upgrade process.

The entire stack and set of primitives are shown in the figure below.

[Figure: The full PostgreSQL stack of Kubernetes primitives. Credit: Portworx]

The net result is that we have a language for expressing our desired state in production. We have Kubernetes managing the applications to meet that intent. And we have a way to share our storage infrastructure with other applications.

How Portworx addresses Kubernetes storage challenges

We just walked through the deployment of a stateful service using Kubernetes. Now it’s important to delve into the production requirements of data-rich workloads. How do we resize a PersistentVolume? How do we encrypt microservices data while preserving the portability benefits? These are the topics that matter in production.

The Kubernetes primitives are powerful because each application can now scale out (handle more requests) easily and independently of other applications. Moreover, you can update particular components with similar fine-grained control. But as with any application platform, you need an infrastructure that supports this flexibility. For stateful workloads that seek the benefits of Kubernetes, there are a number of common (and vendor neutral) gaps in the storage infrastructure that present challenges.

Infrastructure limitations that impact stateful workloads on Kubernetes:

  • Application discrimination — isolate and tune I/O behavior based on applications, especially as servers are now shared among apps. Example: control over when Elasticsearch deletes or Cassandra compacts.
  • Clustered operations — allow for data access as Pods scale out across servers and across availability zones (when in public clouds). Often, slow storage access becomes the bottleneck.
  • Monitoring and visibility — understanding performance as applications share infrastructure. Here, we can benefit from how labels can be used to tag across Pods and then down to disks.
  • Protection — ensuring that backup, snapshots, and data protection mechanisms handle applications that are now dozens of Pods instead of a few large VMs.
  • Portability — helping teams move their data as they move their compute either for development clusters getting promoted to test environments or for multi-cloud workloads.

There are many ways storage and infrastructure solutions can make Kubernetes the best way to run stateful applications. At Portworx, we have been working on making stateful workloads as easy and resilient as stateless workloads with Kubernetes.

Below are some of the ways we have been investing in that.

Microservices first

Unlike past enterprise storage systems, PX-Enterprise is designed from the ground up for microservice applications. Much of our work has gone into extending the experience for Kubernetes users, and Portworx itself can be installed, extended, and controlled using Kubernetes.

As a result, Portworx benefits from the same advantages that are driving microservices elsewhere in the enterprise: fast to deploy, free of hardware/vendor lock-in, easy to update, and highly available with managed uptime.

Kubernetes-driven

Because PX-Enterprise runs as a Pod, it can be installed directly via a container scheduler, like Kubernetes, using the standard kubectl apply -f "[configuration spec]" command. Portworx has a spec generator that customers can use to automatically generate the configuration based on their own environment.

[Figure: The Portworx spec generator. Credit: Portworx]

Hardware-independent

As a software-only storage and data management solution, it’s important that Portworx be able to run in any environment on any hardware. Our customers run in the public cloud, on premises, and in hybrid deployments. Portworx supports all of these configurations because enterprises require this flexibility.

Application-aware

Historically, storage products focused on providing storage capacity and performance (such as bandwidth and IOPS) from a centralized set of storage hardware, without getting involved in understanding the application. Container schedulers radically change the demands on storage.

We now need to understand how data for dozens to thousands of application Pods needs to be prioritized, managed, and snapshotted — all on a Kubernetes-based infrastructure. The solution also needs to provide automation and data protection to be usable in production. One complicating factor is that microservices architecture encourages applications to operate independently in some cases and as a group in others.

As an example of operating independently, a relational database like PostgreSQL or MySQL will often be deployed as a single application Pod. Before upgrading the database version, teams need to take a snapshot and a backup so that a failsafe exists. One concern is how to make these operations fast, safe, and automatable.

From a Portworx perspective, this is handled by making sure applications send (flush) their contents to the data volume before taking a snapshot. (See left panel below.)

[Figure: Application-consistent MySQL snapshots with Portworx. Credit: Portworx]
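With the STORK integration of that era, such a snapshot could be requested declaratively. A sketch, assuming the external-storage snapshot CRD that STORK supported and a hypothetical PVC named mysql-data-claim:

```yaml
apiVersion: volumesnapshot.external-storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  # Name for the snapshot object; illustrative
  name: mysql-snapshot
spec:
  # The PVC whose volume should be snapshotted (hypothetical name)
  persistentVolumeClaimName: mysql-data-claim
```

The flush-before-snapshot behavior described above happens underneath this request, so the resulting snapshot is application-consistent rather than merely crash-consistent.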

Without such application-level tooling, a data volume is only crash-consistent, not application-consistent. This means that an application (MySQL in this example) must run recovery steps. In addition, often admins must do manual verification before allowing the app to serve workloads again. This all takes more time and limits us from realizing the automation that we seek from schedulers.

In other cases, scale-out applications like Cassandra run Pods across many servers. Together, the Pods form a single Cassandra ring and work together to provide higher throughput. It becomes important in these scale-out cases to be able to handle all of the Cassandra data volumes as a single group. Acting on each volume independently would otherwise introduce unwanted rebalancing that reduces predictability in production.

In this scale-out case, the steps will start the same (by flushing the memory) but now end with a snapshot of all the data volumes as a group, as shown below.

[Figure: Group snapshots of a Cassandra ring with Portworx. Credit: Portworx]
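Group snapshots can likewise be requested declaratively. A sketch using STORK’s GroupVolumeSnapshot resource, assuming the Cassandra PVCs carry an app: cassandra label:

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: GroupVolumeSnapshot
metadata:
  name: cassandra-group-snapshot
spec:
  # Snapshot every PVC matching this label as one consistent group
  pvcSelector:
    matchLabels:
      app: cassandra
```

Because all of the ring’s volumes are captured together, restoring from this snapshot avoids the unwanted rebalancing that per-volume operations would trigger.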

Unlike legacy storage approaches, this distributed set of operations represents new data management functionality that must discriminate based on the use case (MySQL, Cassandra) while running on a shared infrastructure, just as Kubernetes does. At the same time, the experience needs to be integrated with Kubernetes in order to provide both automation and the intended data protection. Portworx provides this functionality and a Kubernetes-native experience by integrating through Kubernetes scheduler extensions and a set of storage custom resources.

Multi-cloud and hybrid-cloud ready

Portworx installs itself as a Pod, can be managed by Kubernetes, deploys on almost any hardware, and is application aware. It is a natural fit to support multi-cloud and hybrid-cloud workloads. The key to multi-cloud operations for stateful services is overcoming data gravity, the idea that stateless components like load balancers and app containers are trivial to “move,” while stateful components like data volumes are difficult because data has mass (figuratively).

Portworx overcomes data gravity, in part, by giving users the ability to snapshot application data, with full application consistency, even across multiple nodes, and to move that data to a secondary environment along with its configuration. With the ability to move data and configuration, Portworx supports multi-environment workloads such as burst-to-cloud, blue-green deployments of stateful applications, and copy-data management for the purposes of reproducibility and debugging, as well as more traditional backup and recovery.
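As a sketch of what moving both data and configuration can look like, STORK exposes a Migration custom resource. The cluster pair name and namespace below are illustrative, and assume a ClusterPair to the secondary environment has already been created:

```yaml
apiVersion: stork.libopenstorage.org/v1alpha1
kind: Migration
metadata:
  name: postgres-migration
spec:
  # Previously configured pairing with the destination cluster (hypothetical name)
  clusterPair: remote-cluster
  # Copy both the Kubernetes objects and the underlying volumes
  includeResources: true
  includeVolumes: true
  # Namespaces whose applications should be migrated
  namespaces:
  - default
```

One declarative object thus carries a stateful application, data and all, to a secondary environment, which is the building block for the burst-to-cloud and blue-green scenarios described above.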

Data is as important as ever. If containers are to become as popular in the enterprise as VMs have been in the previous decade, then a solid storage and data management solution will be a requirement. Just as I couldn’t imagine a world in which VMware couldn’t run a database, I can’t imagine a world in which databases and other stateful services don’t run on Kubernetes. But containers, which are more dynamic and numerous than VMs by an order of magnitude, create problems for stateful services that traditional storage and data management solutions don’t solve. I’m excited to be working at Portworx to tackle these problems head-on. It’s an important mission.

Eric Han is vice president of product management at Portworx, the cloud-native storage company. He previously worked at Google, where he was the first product manager for Kubernetes.

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.