Container metrics primer

Collecting metrics for containers

Containers are based on cgroups, which the kernel exposes as a filesystem under /sys/fs/cgroup

Start a container and retrieve ID:

docker run -d --name nginx nginx
ID="$(docker container inspect nginx | jq -r '.[].Id')""

Using high-level tools:

docker stats
systemd-cgtop
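
Both refresh continuously; for a one-shot snapshot, docker stats supports --no-stream:

docker stats --no-stream nginx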

Directly from the kernel:

cat "/sys/fs/cgroup/pids/docker/${ID}/cgroup.procs"
cat "/sys/fs/cgroup/cpuacct/docker/${ID}/cpuacct.usage"
cat "/sys/fs/cgroup/memory/docker/${ID}/memory.usage_in_bytes"

Container metrics in Kubernetes

kubelet is responsible for maintaining pods/containers on a node

Metrics are offered by the kubelet as well

kubelet ships with cadvisor

Published under /metrics/cadvisor

Demo: cadvisor with Docker

Run cadvisor in compose

Alternative: docker-exporter (untested)
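
A minimal compose sketch for running cadvisor (the published port and image tag are assumptions; the mounts follow cadvisor's documented docker run example):

services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest   # pin a release tag in practice
    ports:
      - "8080:8080"                          # UI and /metrics live here
    volumes:
      - /:/rootfs:ro                         # host filesystem, read-only
      - /var/run:/var/run:ro                 # docker socket
      - /sys:/sys:ro                         # cgroup and sysfs data
      - /var/lib/docker/:/var/lib/docker:ro  # container layers
    devices:
      - /dev/kmsg:/dev/kmsg                  # kernel log, used for OOM events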


Demo: Container metrics in k8s 1/2

Explore metrics using kubeletctl

IP="$(
    docker inspect \
        --format '' \
        kind-worker
)"
TOKEN="$(
    kubectl get secrets kubelet -o json \
    | jq --raw-output '.data.token' \
    | base64 -d
)"
kubectl get secrets kubelet -o json \
| jq --raw-output '.data."ca.crt"' \
| base64 -d >ca.crt
kubeletctl \
    --server "${IP}" \
    --cacert ca.crt \
    --token "${TOKEN}" \
    metrics cadvisor | less

Demo: Container metrics in k8s 2/2

Explore metrics using curl

IP="$(
    docker inspect \
        --format ''\
        kind-worker
)"
TOKEN="$(
    kubectl get secrets kubelet -o json \
    | jq --raw-output '.data.token' \
    | base64 -d
)"
kubectl get secrets kubelet -o json \
| jq --raw-output '.data."ca.crt"' \
| base64 -d >ca.crt
curl -skH "Authorization: Bearer ${TOKEN}" \
    "https://${IP}:10250/metrics/cadvisor" | less
curl -skH "Authorization: Bearer ${TOKEN}" \
    "https://${IP}:10250/metrics/cadvisor" \
| grep container_memory_usage_bytes | grep kube-proxy

OpenMetrics 1/3

“…today’s de-facto standard for transmitting cloud-native metrics at scale.”

Specification

Types: gauge, counter, histogram, summary, info, stateset, …

Metadata: TYPE, UNIT, HELP

Metrics can have labels

Labels provide metadata for filtering


OpenMetrics 2/3

Example output of a metrics endpoint:

# TYPE acme_http_router_request_seconds summary
# UNIT acme_http_router_request_seconds seconds
# HELP acme_http_router_request_seconds Latency through all of ACME's HTTP request router.
acme_http_router_request_seconds_sum{path="/api/v1",method="GET"} 9036.32
acme_http_router_request_seconds_count{path="/api/v1",method="GET"} 807283.0
acme_http_router_request_seconds_created{path="/api/v1",method="GET"} 1605281325.0
acme_http_router_request_seconds_sum{path="/api/v2",method="POST"} 479.3
acme_http_router_request_seconds_count{path="/api/v2",method="POST"} 34.0
acme_http_router_request_seconds_created{path="/api/v2",method="POST"} 1605281325.0
# TYPE go_goroutines gauge
# HELP go_goroutines Number of goroutines that currently exist.
go_goroutines 69
# TYPE process_cpu_seconds counter
# UNIT process_cpu_seconds seconds
# HELP process_cpu_seconds Total user and system CPU time spent in seconds.
process_cpu_seconds_total 4.20072246e+06
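
A quick way to check that an endpoint emits well-formed metrics is promtool, which lints exposition data from stdin (the URL is an assumption, matching the cadvisor compose demo on localhost:8080):

curl -s http://localhost:8080/metrics | promtool check metrics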

OpenMetrics 3/3

Format:

name{labels} value [timestamp]

Labels provide context for filtering

For example:

container_memory_usage_bytes{
    namespace="kube-system",
    pod="kube-proxy-68mp4",
    container="kube-proxy"
} 1.4917632e+07 1669235346213
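
The same labels drive selection in PromQL; for example, to fetch just the kube-proxy series shown above:

container_memory_usage_bytes{namespace="kube-system", container="kube-proxy"}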