
Node maintenance for Aerospike on Kubernetes

When performing Kubernetes node maintenance (such as version upgrades, patching, or hardware changes), you need to safely migrate Aerospike pods off the affected nodes. The Aerospike Kubernetes Operator (AKO) provides multiple approaches to handle this:

Approach            | Use Case                             | Storage Type
Safe Pod Eviction   | kubectl drain operations             | Any
Scheduling Policies | Planned migrations of Aerospike pods | Network-attached
K8sNodeBlockList    | Planned migrations of Aerospike pods | Any

Safe pod eviction webhook

AKO provides a webhook that intercepts pod eviction API calls triggered by commands such as kubectl drain or by Kubernetes node scale-down from cluster autoscalers such as Karpenter. The webhook blocks eviction API calls for Aerospike pods and safely migrates those pods to other Kubernetes nodes, ensuring all data-migration safety checks pass.

This feature works with both network-attached and local-attached storage configurations. It is disabled by default.

Enabling safe pod eviction

To enable the safe pod eviction webhook, set the ENABLE_SAFE_POD_EVICTION environment variable to true in the operator deployment.
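
If you installed the operator without Helm, one way to set the variable is directly on the operator Deployment. This is a minimal sketch; the namespace and deployment name below are assumptions, so verify them against your installation first:

# Assumed namespace and deployment name; verify with: kubectl get deployments -A | grep aerospike
kubectl -n operators set env deployment/aerospike-kubernetes-operator \
  ENABLE_SAFE_POD_EVICTION=true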

If you installed the operator using Helm, enable it by setting the value during installation or upgrade:

helm upgrade aerospike-kubernetes-operator aerospike/aerospike-kubernetes-operator \
--set safePodEviction.enable="true"

Or add it to your values.yaml:

# Enable the eviction webhook to safely block Aerospike pod evictions during node maintenance
# Also enables Prometheus metrics: aerospike_ako_eviction_webhook_requests_total (labels: eviction_namespace, decision)
safePodEviction:
  enable: "true"
  # Eviction webhook timeout in seconds
  timeoutSeconds: "20"
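
Then apply the values file during the Helm upgrade (same release and chart names as the command above):

helm upgrade aerospike-kubernetes-operator aerospike/aerospike-kubernetes-operator \
  -f values.yaml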

Using kubectl drain

Once the safe pod eviction webhook is enabled, you can use standard Kubernetes commands to drain nodes:

kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

The webhook intercepts the eviction request for pods that belong to an AerospikeCluster and denies it. For non-Aerospike pods, the eviction request is passed through without modification.

If the eviction is blocked, the webhook sets an annotation aerospike.com/eviction-blocked on the pod. AKO receives this event and starts migrating the Aerospike pods safely. Wait for the AerospikeCluster to reach the Completed phase before retrying the drain command:

kubectl -n NAMESPACE wait --for=jsonpath='{.status.phase}'=Completed aerospikecluster/CLUSTER_NAME --timeout=300s
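
To see whether the webhook blocked an eviction for a particular pod, you can read the annotation it sets. This is a sketch; the namespace and pod name are placeholders:

# Prints the annotation value if the webhook blocked the eviction for this pod (empty if not set)
kubectl -n NAMESPACE get pod POD_NAME \
  -o jsonpath='{.metadata.annotations.aerospike\.com/eviction-blocked}'

Once the cluster reports Completed, rerun the same kubectl drain command to finish draining the node.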

Scheduling Policy

Network-attached storage

For clusters using network-attached storage (such as cloud provider block storage), you can migrate pods by updating scheduling policies in the CR. The pods can move freely between nodes since the storage follows them.

Setting scheduling policies such as affinity, taints and tolerations, or nodeSelectors migrates the pods to a different node pool, after which the current node pool can be brought down. Set rollingUpdateBatchSize to expedite this process by migrating pods in batches (see the sketch after the example below).

For example, you can set the following nodeAffinity in the podSpec section of the Custom Resource (CR) file. AKO performs a rolling restart of the cluster and migrates the pods based on the scheduling policies.

The following nodeAffinity ensures that pods are migrated to a node pool named upgrade-pool. AKO restarts the pods and moves them to nodes with the node label cloud.google.com/gke-nodepool: upgrade-pool.

podSpec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                  - upgrade-pool
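
To migrate the pods in batches during this rolling restart, you can also set rollingUpdateBatchSize in the CR. A minimal sketch, assuming the field sits under rackConfig as in current AKO CRs; the value 2 is only an example:

spec:
  rackConfig:
    # Restart and migrate up to 2 pods per rack at a time
    rollingUpdateBatchSize: 2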

K8sNodeBlockList

Local-attached storage

When Aerospike pods use local-attached storage, they cannot move to different Kubernetes nodes because of volume affinity. As a result, a rolling restart with a different scheduling policy does not migrate them.

However, you can use the K8sNodeBlockList feature to migrate the pods out of the given Kubernetes nodes when using local storage.

K8sNodeBlockList specifies the list of Kubernetes node names from which you want to migrate pods. AKO reads this configuration and safely migrates pods off these nodes.

If pods are using network-attached storage, AKO migrates the pods off their Kubernetes nodes without additional configuration. If pods are using local-attached storage, you must specify those local storage classes in the spec.storage.localStorageClasses field of the CR, as shown in the sketch after the example CR below. AKO uses this field to delete the corresponding local volumes so that the pods can be migrated off the Kubernetes nodes.

This process uses the rollingUpdateBatchSize parameter defined in your CR to migrate pods in batches for efficiency.

The following example CR includes a spec.k8sNodeBlockList section with two nodes defined:

apiVersion: asdb.aerospike.com/v1
kind: AerospikeCluster
metadata:
  name: aerocluster
  namespace: aerospike
spec:
  k8sNodeBlockList:
    - gke-test-default-pool-b6f71594-1w85
    - gke-test-default-pool-b6f71594-9vm2
  size: 4
  image: aerospike/aerospike-server-enterprise:8.1.0.0
  rackConfig:
    namespaces:
      - test
    racks:
      - id: 1
      - id: 2
...
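
If these pods used local-attached storage, you would also list the relevant storage classes so AKO can delete the local volumes before migrating the pods. A minimal sketch, assuming a storage class named local-ssd; keep the rest of your storage configuration unchanged:

spec:
  storage:
    # Storage classes backing local (node-attached) volumes; local-ssd is an assumed name
    localStorageClasses:
      - local-ssd
  k8sNodeBlockList:
    - gke-test-default-pool-b6f71594-1w85
    - gke-test-default-pool-b6f71594-9vm2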