Kubernetes operators: for whom and why

In our previous article we showed how to model meaningful business entities, such as WordPress sites, using a Kubernetes feature called Custom Resource Definitions (CRD). The CRD defines the specification, validation and status of our WordPress resource, an instance of which could look like this:

apiVersion: wp-hosting-company.com/v1
kind: Wordpress
metadata:
  generation: 1
  labels:
    app.kubernetes.io/name: wordpress
    app.kubernetes.io/instance: client-x
  name: website-client-x
spec:
  autoScalingEnabled: true
  image: client-x-wp:latest
  configuration:
    siteUrl: "http://example.com/wordpress"
    database:
      host: mysql
      user: test
      password:
        valueFrom:
          secretKeyRef:
            name: db-password
            key: password
    media:
      storage: s3
      s3:
        url: "S3://bucket-name/key-name"
status:
  available: true
  observedGeneration: 1
  last-modified: "2023-08-04T11:44:34Z"

The example is somewhat artificial and simplified, but it shows how we can treat our WordPress instances as first-class citizens inside Kubernetes, instead of assembling each one by hand from regular Deployments, Services and Ingresses.
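In a Golang operator, a resource like this is usually mirrored by plain structs. The following is a minimal sketch using only the standard library; the struct names are our own, only a handful of fields from the example are modeled, and a project scaffolded with Operator SDK would normally embed the Kubernetes metadata types instead of the hand-written Metadata struct assumed here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Metadata holds the subset of object metadata used in this sketch.
// (Assumption: a real operator would use the Kubernetes library types.)
type Metadata struct {
	Name       string            `json:"name"`
	Generation int64             `json:"generation"`
	Labels     map[string]string `json:"labels,omitempty"`
}

// WordpressSpec is the desired state, written by the user.
type WordpressSpec struct {
	AutoScalingEnabled bool   `json:"autoScalingEnabled"`
	Image              string `json:"image"`
}

// WordpressStatus is the observed state, written by the operator.
type WordpressStatus struct {
	Available          bool  `json:"available"`
	ObservedGeneration int64 `json:"observedGeneration"`
}

// Wordpress mirrors the top-level layout of the example resource.
type Wordpress struct {
	APIVersion string          `json:"apiVersion"`
	Kind       string          `json:"kind"`
	Metadata   Metadata        `json:"metadata"`
	Spec       WordpressSpec   `json:"spec"`
	Status     WordpressStatus `json:"status"`
}

// parseWordpress decodes the JSON form of a Wordpress resource.
func parseWordpress(data []byte) (Wordpress, error) {
	var wp Wordpress
	err := json.Unmarshal(data, &wp)
	return wp, err
}

func main() {
	// JSON equivalent of a few fields from the example above.
	manifest := []byte(`{
		"apiVersion": "wp-hosting-company.com/v1",
		"kind": "Wordpress",
		"metadata": {"name": "website-client-x", "generation": 1},
		"spec": {"autoScalingEnabled": true, "image": "client-x-wp:latest"},
		"status": {"available": true, "observedGeneration": 1}
	}`)

	wp, err := parseWordpress(manifest)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s wants image %s\n", wp.Metadata.Name, wp.Spec.Image)
}
```

Keeping spec and status in separate structs reflects the split between what the user asks for and what the operator reports back.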

The CRD only defines our data model. To act on this input, an operator must run a reconciliation controller that creates and manages the Deployments, Services and other objects that materialize our site. This is called the control loop pattern, and it is common in Kubernetes: a controller continuously compares the desired state with the actual state and acts to close the gap. Kubernetes ships with several built-in controllers, such as the Deployment and Job controllers, which reconcile the built-in resources. In simple terms, when a user creates a Deployment, the Deployment controller creates and manages ReplicaSets for it. The ReplicaSet controller in turn watches ReplicaSets and manages Pods for them, and so on.
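The control loop idea can be sketched in a few lines of Golang. This is a toy model, not how the built-in controllers are implemented: the Cluster struct is a hypothetical in-memory stand-in for the Kubernetes API, and real controllers react to watch events rather than spinning in a loop.

```go
package main

import "fmt"

// Cluster is a hypothetical in-memory stand-in for the Kubernetes API.
type Cluster struct {
	DesiredReplicas int // what the user asked for (spec)
	RunningPods     int // what actually exists (observed state)
}

// reconcileOnce takes one step toward the desired state and reports
// whether it had to change anything.
func reconcileOnce(c *Cluster) bool {
	switch {
	case c.RunningPods < c.DesiredReplicas:
		c.RunningPods++ // "create a Pod"
		return true
	case c.RunningPods > c.DesiredReplicas:
		c.RunningPods-- // "delete a Pod"
		return true
	default:
		return false // already converged, nothing to do
	}
}

// runControlLoop reconciles until the observed state matches the desired
// state and returns the number of steps it took.
func runControlLoop(c *Cluster) int {
	steps := 0
	for reconcileOnce(c) {
		steps++
	}
	return steps
}

func main() {
	c := &Cluster{DesiredReplicas: 3}
	fmt.Println("scale up steps:", runControlLoop(c))

	// The user edits the spec; the same loop converges again.
	c.DesiredReplicas = 1
	fmt.Println("scale down steps:", runControlLoop(c))
}
```

The important property is that reconcileOnce is idempotent: once the states match, running it again changes nothing, so the controller can safely be triggered as often as needed.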

Creating custom operators

There are a few ways to create operators; a common starting point is the Operator SDK framework. The SDK helps with scaffolding and CRD creation, and lets us write our operator using Golang, Helm or Ansible.

[Figure: Kubernetes operator capability levels for Golang, Helm and Ansible operators]

Golang operators are the most common and the most powerful in terms of features. Helm operators essentially wrap a Helm chart and turn it into an operator; their capabilities are limited to what Helm can do.

When picking a technology, it is important to consider the team's current skill set and the features required. If the operator needs more advanced features, such as managing the lifecycle and tracking the state of several different components, or taking actions beyond creating regular Kubernetes objects, then either Golang or Ansible is a must. Golang is the usual choice: it is the native programming language of the ecosystem and interfaces well with existing libraries, which makes things like a custom kubectl plugin for your custom entity possible.

The scope of an operator should be kept as small as possible. For example, we would not want our WordPress operator to create and manage MySQL databases on Kubernetes, even though we know the application needs one. For that we might use a different operator that brings its own CRD to describe what a MySQL database should look like. Our operator can then simply create a resource of that MySQL CRD to trigger creation of the database, keeping our own operator as simple as possible.
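As a rough illustration of this delegation, the sketch below builds the object our operator would hand over to a separate MySQL operator. The API group mysql.example.com/v1, the kind MySQLDatabase and the databaseName field are all hypothetical; an actual MySQL operator defines its own schema, and a real operator would submit the object through a Kubernetes client rather than print it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// desiredDatabaseFor returns the custom resource our WordPress operator
// would create so that a separate MySQL operator provisions the database.
// Group, kind and spec fields are hypothetical placeholders.
func desiredDatabaseFor(siteName string) map[string]interface{} {
	return map[string]interface{}{
		"apiVersion": "mysql.example.com/v1", // hypothetical API group
		"kind":       "MySQLDatabase",        // hypothetical kind
		"metadata": map[string]interface{}{
			"name": siteName + "-db",
			"labels": map[string]interface{}{
				// Same label key as in the Wordpress example, so the
				// database can be traced back to its site.
				"app.kubernetes.io/instance": siteName,
			},
		},
		"spec": map[string]interface{}{
			"databaseName": "wordpress", // hypothetical field
		},
	}
}

func main() {
	out, err := json.MarshalIndent(desiredDatabaseFor("website-client-x"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Our operator only describes the database it wants; how the database is created, clustered or backed up remains the MySQL operator's concern.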

When to create your own operator

There are several considerations for deciding to create an operator:

You have a piece of software that requires complex, stateful orchestration to set up and operate. Examples include the various database operators for clustering Postgres and MySQL.
You have the skills and/or resources to create and maintain it.
You prefer to use CRDs over, for example, Helm charts.

My personal recommendation is to avoid writing Golang-based custom operators if the same thing can be done with a regular Helm chart or a Kustomize deployment plus some stateless logic in container initialization scripts. Code is itself a liability; less code to maintain is almost always better!

Cartman has extensive experience in using and building operators. Reach out to us for assistance on this topic!