Handle Ops stuff like a developer would
Everything in version control…
…because YAML is text
Use branches for stages (e.g. dev, qa, live)
Pipeline to deploy to stages
Integrate changes using pull/merge requests
Add automated tests to pipeline
Changes are pushed into Kubernetes cluster
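A rough sketch of such a deploy step (the CI variable is GitLab-style and the stage-per-namespace mapping is an assumption):
# hypothetical: deploy the current branch to the stage of the same name
case "$CI_COMMIT_BRANCH" in
  dev|qa|live) STAGE="$CI_COMMIT_BRANCH" ;;
  *) echo "no deployment for branch $CI_COMMIT_BRANCH"; exit 0 ;;
esac
helm upgrade --install my-ntpd ../helm/ntpd/ --namespace "$STAGE"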
Different approaches to access the cluster from a pipeline
Pipeline runs inside the target cluster
Direct API access with RBAC
Pipeline runs somewhere else…
…or does not have direct access to Kubernetes API
Pipeline fetches (encrypted) kubeconfig
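A minimal sketch (file names and the use of GPG are assumptions):
# decrypt the stored kubeconfig and point kubectl at it
gpg --batch --decrypt kubeconfig.gpg >kubeconfig
export KUBECONFIG="$PWD/kubeconfig"
kubectl get nodes   # verify access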
helm template my-ntpd ../helm/ntpd/ >ntpd.yaml
yamllint ntpd.yaml
cat <<EOF >.yamllint
rules:
  indentation:
    indent-sequences: consistent
EOF
yamllint ntpd.yaml
Validate against official schemas using kubeval:
kubeval ntpd.yaml
Static analysis using kube-linter
kube-linter lint ntpd.yaml
kube-linter lint ../helm/ntpd/
kube-linter checks list
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
Manually scaling pods is time-consuming
HPA changes replicas automagically
Supports CPU and memory usage
Deploy nginx and HPA
Create load and watch the HPA scale nginx
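A minimal sketch of the demo (chart values, names and thresholds are assumptions; utilization-based scaling needs CPU requests on the deployment):
helm upgrade --install my-nginx bitnami/nginx \
  --set service.type=ClusterIP \
  --set resources.requests.cpu=100m
kubectl autoscale deployment my-nginx --cpu-percent=50 --min=1 --max=10
# in a second terminal: generate load against the service...
kubectl run load --rm -it --restart=Never --image=busybox:1.36 -- \
  /bin/sh -c 'while true; do wget -q -O- http://my-nginx >/dev/null; done'
# ...and watch the HPA react
kubectl get hpa --watch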
Checks every 15 seconds
Calculates the required number of replicas:
desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
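Example: 2 replicas at 90% current CPU utilization with a 50% target gives ceil[2 * (90 / 50)] = ceil[3.6] = 4 replicas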
Configurable behaviour:
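For example (the window length is an assumption), scale-down can be slowed via the autoscaling/v2 behavior field:
kubectl patch hpa my-app-hpa --type merge \
  -p '{"spec":{"behavior":{"scaleDown":{"stabilizationWindowSeconds":300}}}}'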
Control where pods are placed
Resource requests are important for scheduling
Limits are important for eviction
Guaranteed QoS class (requests == limits)
Pods will not be evicted…
…because resource consumption is known at all times
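A minimal sketch (deployment name and sizes are assumptions):
# identical requests and limits put the pods into the Guaranteed QoS class
kubectl set resources deployment my-nginx \
  --requests=cpu=100m,memory=128Mi \
  --limits=cpu=100m,memory=128Mi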
Control where pods are placed
Force pods onto specific nodes (nodeSelector, node affinity)
Force pods on the same node or on different nodes (pod affinity / anti-affinity)
Reserve nodes for specific pods (taints)
Pods must accept taints (tolerations)
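A minimal sketch (node name, label and taint key are assumptions):
# reserve the node for dedicated workloads
kubectl label node worker-1 dedicated=ntp
kubectl taint node worker-1 dedicated=ntp:NoSchedule
# nodeSelector targets the reserved node, the toleration accepts its taint
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: ntp-test
spec:
  nodeSelector:
    dedicated: ntp
  tolerations:
  - key: dedicated
    operator: Equal
    value: ntp
    effect: NoSchedule
  containers:
  - name: ntpd
    image: busybox:1.36
    command: ["sleep", "3600"]
EOF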
kubectl create <resource>
kubectl create is not idempotent
Next pipeline run will fail because resource already exists
Instead create resource definition on-the-fly:
kubectl create secret generic foo \
  --from-literal=bar=baz \
  --dry-run=client \
  --output=yaml \
  | kubectl apply -f -
Reconciliation takes time
Do not use sleep after apply, scale, delete
Let kubectl do the waiting:
helm upgrade --install my-nginx bitnami/nginx \
  --set service.type=ClusterIP
kubectl rollout status deployment my-nginx --timeout=15m
kubectl wait pods \
  --for=condition=ready \
  --selector app.kubernetes.io/instance=my-nginx
Works for jobs as well:
kubectl wait --for=condition=complete job/baz
Finding the pod name is error-prone
Filter by label:
helm upgrade --install my-nginx bitnami/nginx \
  --set service.type=ClusterIP \
  --set replicaCount=2
kubectl delete pod --selector app.kubernetes.io/instance=my-nginx
Show logs of the first pod of a deployment:
kubectl logs deployment/my-nginx
Show logs of multiple pods at once with stern:
stern --selector app.kubernetes.io/instance=my-nginx
When a pod is broken, it can be investigated
Remove a label to exclude it from ReplicaSet, Deployment and Service
helm upgrade --install my-nginx bitnami/nginx \
  --set service.type=ClusterIP \
  --set replicaCount=2
kubectl get pods -l app.kubernetes.io/instance=my-nginx -o name \
  | head -n 1 \
  | xargs -I{} kubectl label {} app.kubernetes.io/instance-
ReplicaSet replaces the missing pod
Remove the pod after troubleshooting
kubectl logs --selector '!app.kubernetes.io/instance'
kubectl delete pod \
  -l 'app.kubernetes.io/name=nginx,!app.kubernetes.io/instance'
Templating a Secret becomes easier when inserting plaintext (stringData instead of base64-encoded data):
#...
stringData:
  foo: bar
Do not store resource descriptions after templating
cat secret.yaml \
  | envsubst \
  | kubectl apply -f -
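A minimal sketch of such a template (names and the variable are assumptions):
cat <<'EOF' >secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: foo
stringData:
  bar: ${BAR}
EOF
export BAR='s3cr3t'   # provided by the pipeline, filled in by envsubst above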
Outdated Ops dependencies are also a (security) risk
Tools will be missing useful features
Services can contain vulnerabilities
Let bots do the work for you
Doing updates regularly is easier than catching up after a long time
Automerge for patches can help stay on top of things
Automated tests help decide whether an update is safe