Handle Ops stuff like a developer would

Everything in version control…

…because YAML is text

Use branches for stages (e.g. dev, qa, live)

Pipeline to deploy to stages

Integrate changes using pull/merge requests

Add automated tests to pipeline

Changes are pushed into Kubernetes cluster
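A minimal sketch of such a deploy stage, assuming GitLab CI; the job name, image and manifest path are illustrative, not taken from the original:

deploy-dev:
  stage: deploy
  image: bitnami/kubectl:latest
  script:
    - kubectl apply -f manifests/
  rules:
    - if: '$CI_COMMIT_BRANCH == "dev"'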

Cluster access 1/

Different approaches to access the cluster from a pipeline

Inside cluster

Pipeline runs inside the target cluster

Direct API access with RBAC
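A sketch of the in-cluster RBAC setup, assuming a dedicated ServiceAccount for the pipeline bound to the built-in edit ClusterRole in a dev namespace:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: ci-deployer
  namespace: dev
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ci-deployer
  namespace: dev
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
- kind: ServiceAccount
  name: ci-deployer
  namespace: dev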

Next to cluster

Pipeline runs somewhere else…

…or does not have direct access to Kubernetes API

Pipeline fetches (encrypted) kubeconfig
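One possible shape of that step, assuming the kubeconfig is stored encrypted with sops and the decryption key is provided as a CI secret:

# decryption key comes from a CI secret, e.g. SOPS_AGE_KEY
sops --decrypt kubeconfig.enc.yaml > kubeconfig.yaml
export KUBECONFIG="$PWD/kubeconfig.yaml"
kubectl get nodes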

Useful tools

Validate YAML using yamllint

helm template my-ntpd ../helm/ntpd/ >ntpd.yaml
yamllint ntpd.yaml
cat <<EOF >.yamllint
extends: default
rules:
  indentation:
    indent-sequences: consistent
EOF
yamllint ntpd.yaml

Validate against the official schemas using kubeval:

kubeval ntpd.yaml

Static analysis using kube-linter

kube-linter lint ntpd.yaml
kube-linter lint ../helm/ntpd/
kube-linter checks list

Horizontal pod autoscaler (HPA) 1/2

apiVersion: autoscaling/v2   # the metrics block below requires the v2 API
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

Manually scaling pods is time consuming

HPA changes replicas automagically

Supports CPU and memory usage


Deploy nginx and HPA

Create load and watch the HPA scale nginx
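A possible way to run this demo, assuming an nginx Deployment and Service named my-nginx (with a CPU request set) are already deployed; the load generator is a throwaway busybox pod:

kubectl autoscale deployment my-nginx --cpu-percent=50 --min=1 --max=10
kubectl run load-generator --image=busybox:1.36 --restart=Never -- \
    /bin/sh -c 'while true; do wget -q -O- http://my-nginx; done'
kubectl get hpa --watch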

Horizontal pod autoscaler (HPA) 2/2


Prerequisite: metrics-server must be installed in the cluster

Checks metrics every 15 seconds (default sync period)

Calculates the required number of replicas:

desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
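For example, 2 current replicas at 80 % average CPU utilization with a 50 % target give ceil[2 * (80 / 50)] = ceil[3.2] = 4 replicas.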

Configurable scaling behaviour (spec.behavior), as sketched below
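A sketch of such a behaviour block; the values are illustrative, not taken from the original slides:

spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60
    scaleUp:
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15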

Scheduling 1/2

Control where pods are placed


Resource requests are important for scheduling

Limits are important for eviction

You want requests == limits (Guaranteed QoS class)

Pods will not be evicted…

…because resource consumption is known at all times
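A minimal sketch of a container spec with requests equal to limits; image and values are illustrative:

containers:
- name: app
  image: nginx:1.25
  resources:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      cpu: 250m
      memory: 256Mi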

Scheduling 2/2

Control where pods are placed

Node selector

Force pods onto specific nodes


Pod affinity / anti-affinity

Force pods onto the same node or onto different nodes

Taints / tolerations

Reserve nodes for specific pods (taints)

Pods must accept taints (tolerations)
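A sketch combining these mechanisms in a pod spec; the node label disktype=ssd, the taint dedicated=batch:NoSchedule and the app label are made-up examples:

spec:
  nodeSelector:
    disktype: ssd
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
        labelSelector:
          matchLabels:
            app: my-app
  tolerations:
  - key: dedicated
    operator: Equal
    value: batch
    effect: NoSchedule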

Lessons Learnt 1/

Avoid kubectl create <resource>

kubectl create is not idempotent

Next pipeline run will fail because resource already exists

Instead, generate the resource definition on the fly and pipe it to kubectl apply:

kubectl create secret generic foo \
    --from-literal=bar=baz \
    --dry-run=client \
    --output=yaml \
| kubectl apply -f -

Lessons Learnt 2/

Wait for reconciliation

Reconciliation takes time

Do not use sleep after apply, scale, delete

Let kubectl do the waiting:

helm upgrade --install my-nginx bitnami/nginx \
    --set service.type=ClusterIP
kubectl rollout status deployment my-nginx --timeout=15m
# pods of the release carry the label app.kubernetes.io/instance=my-nginx
kubectl wait pods \
    --for=condition=ready \
    --selector app.kubernetes.io/instance=my-nginx \
    --timeout=15m

Works for jobs as well:

kubectl wait --for=condition=complete job/baz

Lessons Learnt 3/

Avoid hardcoded names

Finding the pod name manually is error-prone

Filter by label:

helm upgrade --install my-nginx bitnami/nginx \
    --set service.type=ClusterIP \
    --set replicaCount=2
kubectl delete pod --selector app.kubernetes.io/instance=my-nginx

Show logs of the first pod of a deployment:

kubectl logs deployment/my-nginx

Show logs of multiple pods at once with stern:

stern --selector app.kubernetes.io/instance=my-nginx

Lessons Learnt 4/

Troubleshooting individual pods

When a pod is broken, it can be investigated

Remove a label to exclude it from ReplicaSet, Deployment, Service

helm upgrade --install my-nginx bitnami/nginx \
    --set service.type=ClusterIP \
    --set replicaCount=2
kubectl get pods -l app.kubernetes.io/instance=my-nginx -o name \
| head -n 1 \
| xargs -I{} kubectl label {} app.kubernetes.io/instance-

ReplicaSet replaces missing pod

Remove the orphaned pod after troubleshooting:

kubectl logs --selector '!app.kubernetes.io/instance'
kubectl delete pod \
    -l 'app.kubernetes.io/name=nginx,!app.kubernetes.io/instance'

Lessons Learnt 5/

Use plaintext in Secrets (stringData instead of base64-encoded data)

Templating becomes easier when inserting plaintext

stringData:
  foo: bar

Do not store the rendered resource definitions after templating:

cat secret.yaml \
| envsubst \
| kubectl apply -f -
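A sketch of what such a secret.yaml could look like; the Secret name, the key and the ${BAR} variable are hypothetical and get filled in by envsubst from the pipeline environment:

apiVersion: v1
kind: Secret
metadata:
  name: foo
stringData:
  bar: ${BAR}

With BAR exported in the pipeline, the rendered Secret goes straight into kubectl apply and is never written to disk or committed.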

Lessons Learnt 6/

Update dependencies

Outdated Ops dependencies are also a (security) risk

Tools will be missing useful features

Services can contain vulnerabilities

Renovate/Dependabot FTW

Let bots do the work for you

Doing updates regularly is easier than catching up after falling behind

Automerge for patches can help stay on top of things

Automated tests help decide whether an update is safe
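A minimal Renovate configuration sketch along those lines; the preset and rule are assumptions, automerging only patch-level updates:

{
  "extends": ["config:recommended"],
  "packageRules": [
    {
      "matchUpdateTypes": ["patch"],
      "automerge": true
    }
  ]
}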