Data types
count, sum, min, max, avg
stddev, stdvar, quantile
topk, bottomk
rate()
round(), floor(), ceil()
Gauges represent a value that can arbitrarily go up and down
Usually used for temperatures or memory usage
Queries:
# Explore container memory
container_memory_usage_bytes
container_memory_usage_bytes{namespace="ingress-nginx"}
container_memory_usage_bytes{namespace="ingress-nginx",container!=""}
# Show memory usage per pod
avg by (pod) (container_memory_usage_bytes{namespace="kube-system",container!=""})
# Show memory usage for a namespace
sum(container_memory_usage_bytes{namespace="kube-system",container!=""})
# Show memory usage for all namespaces
sum by (namespace) (container_memory_usage_bytes{container!=""})
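The aggregation operators and functions listed at the start of this section combine naturally with the gauge queries above. A minimal sketch, reusing the same metric; the namespace, the value 3, and the 0.9 quantile are arbitrary choices for illustration:

```promql
# Three pods with the highest memory usage in kube-system
topk(3, sum by (pod) (container_memory_usage_bytes{namespace="kube-system",container!=""}))

# Memory usage per namespace in MiB, rounded to whole numbers
round(sum by (namespace) (container_memory_usage_bytes{container!=""}) / 1024 / 1024)

# 90th percentile of per-container memory usage in each namespace
quantile by (namespace) (0.9, container_memory_usage_bytes{container!=""})
```

Unlike sum or avg, topk and bottomk keep the labels of the series they select, so the inner sum by (pod) decides which labels remain in the result.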
Counters represent a cumulative value that only increases (or resets to zero on restart)
Usually used for the number of requests served or the CPU usage
Queries:
# Explore counter
container_cpu_usage_seconds_total{namespace="ingress-nginx",container!=""}
# Show per-second rate of change over different time windows
rate(container_cpu_usage_seconds_total{namespace="ingress-nginx",container!=""}[5m])
rate(container_cpu_usage_seconds_total{namespace="ingress-nginx",container!=""}[10m])
# Idle node CPU
node_cpu_seconds_total{mode="idle"}
rate(node_cpu_seconds_total{mode="idle"}[5m])
sum by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))
# Error reasons for pods (sum the 0/1 values; count would also include series with value 0)
sum by (reason) (kube_pod_status_reason)
sum by (reason) (kube_pod_container_status_terminated_reason)
# Number of replicas
count by (exported_container) (kube_pod_container_status_ready)
# Number of running pods
count(count by (exported_container) (kube_pod_container_status_running))
# Pods referencing Helm chart
count by (exported_pod) (
kube_pod_labels{label_chart!=""} or
kube_pod_labels{label_helm_sh_chart!=""}
)
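The counter queries above can be taken one step further, for example to turn the idle CPU rate into a utilization figure. A sketch under the assumption that the metrics come from node_exporter and cAdvisor with their usual labels; the 5m window and the 0.01 rounding step are arbitrary:

```promql
# Number of idle cores per node (idle seconds per second, summed over all CPUs)
sum by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))

# CPU utilization per node in percent
100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])))

# CPU cores used per pod, rounded to the nearest 0.01
round(sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="ingress-nginx",container!=""}[5m])), 0.01)
```

rate() is applied before aggregating so that counter resets are handled per series; aggregating first and taking the rate afterwards would hide them.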
You are using dockershim (Docker runtime)
Filter out sleeping POD container:
container_memory_usage_bytes{namespace="ingress-nginx",container!="",container!="POD"}
Filtering on container!="" also removes the cumulative pod metrics
kube_pod_labels is empty
kube-state-metrics does not aggregate labels anymore #1501
Set --metric-labels-allowlist=pods=[*] in arguments
Or metricLabelsAllowlist in Helm chart
Metrics can be joined to add labels from another metric
Joined metrics must have at least one common label
Node metrics only reference the instance and do not contain the nodename
node_uname_info helps to get the nodename
node_memory_Active_bytes * on(instance) group_left(nodename) node_uname_info
Use label instance to join metrics
Keep all labels from the left metric (group_left) and add label nodename
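The same join pattern carries over to other node metrics and can be combined with aggregation. A short sketch, assuming both sides share the instance label as in the query above:

```promql
# Active memory per node name instead of per instance
sum by (nodename) (node_memory_Active_bytes * on(instance) group_left(nodename) node_uname_info)

# Idle CPU rate per node name (many CPU series on the left, one info series on the right)
sum by (nodename) (rate(node_cpu_seconds_total{mode="idle"}[5m]) * on(instance) group_left(nodename) node_uname_info)
```

The multiplication leaves the values unchanged because node_uname_info always has the value 1; it is only used as a carrier for the nodename label.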