Collecting metrics for containers
Containers are based on cgroups
Start a container and retrieve its ID:
docker run -d --name nginx nginx
ID="$(docker container inspect nginx | jq -r '.[].Id')"
Using high-level tools:
docker stats
systemd-cgtop
Directly from the kernel (cgroup v1 paths):
cat "/sys/fs/cgroup/pids/docker/${ID}/cgroup.procs"
cat "/sys/fs/cgroup/cpuacct/docker/${ID}/cpuacct.usage"
cat "/sys/fs/cgroup/memory/docker/${ID}/memory.usage_in_bytes"
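The paths above follow the cgroup v1 layout. On hosts running cgroup v2 the files have different names and locations, so a lookup that tries several candidates can help. The following is a minimal sketch under stated assumptions: the function name cgroup_memory_file is hypothetical, and the two cgroup v2 locations are assumptions about Docker's systemd and cgroupfs cgroup drivers that may not match every host.

```shell
# Print the path of the memory usage file for a Docker container.
# $1: full container ID
# $2: cgroup root (optional, defaults to /sys/fs/cgroup)
cgroup_memory_file() {
    id="$1"
    root="${2:-/sys/fs/cgroup}"
    # Candidates, in order: cgroup v1, then cgroup v2 with the systemd
    # driver, then cgroup v2 with the cgroupfs driver (the v2 paths are
    # assumptions, verify them on the target host).
    for candidate in \
        "${root}/memory/docker/${id}/memory.usage_in_bytes" \
        "${root}/system.slice/docker-${id}.scope/memory.current" \
        "${root}/docker/${id}/memory.current"
    do
        if [ -f "${candidate}" ]; then
            printf '%s\n' "${candidate}"
            return 0
        fi
    done
    # No known layout matched
    return 1
}
```

Possible usage with the ID retrieved earlier: cat "$(cgroup_memory_file "${ID}")"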
kubelet is responsible for maintaining pods/containers on a node
…are offered by kubelet as well
Published under /metrics/cadvisor/
Run cadvisor in compose
Alternative: docker-exporter (untested)
Explore metrics using kubeletctl
IP="$(
docker inspect \
--format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' \
kind-worker
)"
TOKEN="$(
kubectl get secrets kubelet -o json \
| jq --raw-output '.data.token' \
| base64 -d
)"
kubectl get secrets kubelet -o json \
| jq --raw-output '.data."ca.crt"' \
| base64 -d >ca.crt
kubeletctl \
--server "${IP}" \
--cacert ca.crt \
--token "${TOKEN}" \
metrics cadvisor | less
Explore metrics using curl
IP="$(
docker inspect \
--format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' \
kind-worker
)"
TOKEN="$(
kubectl get secrets kubelet -o json \
| jq --raw-output '.data.token' \
| base64 -d
)"
kubectl get secrets kubelet -o json \
| jq --raw-output '.data."ca.crt"' \
| base64 -d >ca.crt
curl -skH "Authorization: Bearer ${TOKEN}" \
"https://${IP}:10250/metrics/cadvisor" | less
curl -skH "Authorization: Bearer ${TOKEN}" \
"https://${IP}:10250/metrics/cadvisor" \
| grep container_memory_usage_bytes | grep kube-proxy
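The grep chain above matches both strings anywhere in a line, so comment lines and unrelated labels can slip through. A stricter filter can be sketched with awk. This is an illustration only: the function name filter_metric is hypothetical, and the sketch assumes that label values contain no escaped quotes.

```shell
# Print the samples of one metric that carry a given label="value" pair.
# $1: metric name, $2: label name, $3: label value
# Reads the metrics text from stdin.
filter_metric() {
    awk -v metric="$1" -v pair="$2=\"$3\"" '
        # The line must start with the metric name followed by "{" and
        # the pair must follow "{" or "," so that partial label names
        # do not match.
        index($0, metric "{") == 1 &&
        (index($0, "{" pair) > 0 || index($0, "," pair) > 0) { print }
    '
}
```

Possible usage: pipe the curl output into filter_metric container_memory_usage_bytes container kube-proxy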
“…today’s de-facto standard for transmitting cloud-native metrics at scale.”
Metrics can have labels
Labels provide metadata for filtering
Example output of a metrics endpoint:
# TYPE acme_http_router_request_seconds summary
# UNIT acme_http_router_request_seconds seconds
# HELP acme_http_router_request_seconds Latency through all of ACME's HTTP request router.
acme_http_router_request_seconds_sum{path="/api/v1",method="GET"} 9036.32
acme_http_router_request_seconds_count{path="/api/v1",method="GET"} 807283.0
acme_http_router_request_seconds_created{path="/api/v1",method="GET"} 1605281325.0
acme_http_router_request_seconds_sum{path="/api/v2",method="POST"} 479.3
acme_http_router_request_seconds_count{path="/api/v2",method="POST"} 34.0
acme_http_router_request_seconds_created{path="/api/v2",method="POST"} 1605281325.0
# TYPE go_goroutines gauge
# HELP go_goroutines Number of goroutines that currently exist.
go_goroutines 69
# TYPE process_cpu_seconds counter
# UNIT process_cpu_seconds seconds
# HELP process_cpu_seconds Total user and system CPU time spent in seconds.
process_cpu_seconds_total 4.20072246e+06
Format:
name{labels} value [timestamp]
Labels provide context for…
For example:
container_memory_usage_bytes{
namespace="kube-system",
pod="kube-proxy-68mp4",
container="kube-proxy"
} 1.4917632e+07 1669235346213
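To make the format name{labels} value [timestamp] more tangible, a sample line can be split into its parts. The following is a rough sketch, not a complete parser: the function name parse_sample and the "|" separated output are hypothetical, and it assumes one sample per line (as an endpoint emits it, unlike the wrapped example above) and no "}" inside label values.

```shell
# Split metric samples into name|labels|value|timestamp.
# Reads the metrics text from stdin; comment and blank lines are skipped.
parse_sample() {
    awk '
        /^#/ || /^[ \t]*$/ { next }
        {
            line = $0
            labels = ""
            open_brace = index(line, "{")
            if (open_brace > 0) {
                # Labels present: take the text between the braces
                close_brace = index(line, "}")
                name = substr(line, 1, open_brace - 1)
                labels = substr(line, open_brace + 1, close_brace - open_brace - 1)
                rest = substr(line, close_brace + 1)
            } else {
                # No labels: the name ends at the first whitespace
                name = line
                sub(/[ \t].*$/, "", name)
                rest = substr(line, length(name) + 1)
            }
            # rest holds the value and the optional timestamp
            n = split(rest, fields, " ")
            timestamp = (n >= 2) ? fields[2] : ""
            printf "%s|%s|%s|%s\n", name, labels, fields[1], timestamp
        }
    '
}
```

For the unlabeled sample "go_goroutines 69" the labels and timestamp fields stay empty.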