containerd Logging & Monitoring
Effective logging and monitoring are essential for production containerd deployments. This guide covers daemon and container logs, log rotation, collecting metrics, integrating with Prometheus, and setting up Grafana dashboards and alerts for container observability.
containerd is the backbone of your container infrastructure. Without proper logging and monitoring, you're flying blind. Logging helps you debug issues when containers fail to start or behave unexpectedly. Monitoring gives you visibility into resource usage, performance trends, and capacity planning.
This guide covers two aspects: logging (capturing containerd daemon logs and container logs) and monitoring (collecting metrics like CPU, memory, and operation counts). Both are essential for maintaining healthy container infrastructure.
The containerd daemon itself generates logs that help debug issues with the runtime. These logs show startup events, errors, and internal operations.
# View containerd daemon logs
sudo journalctl -u containerd -f
# View recent logs
sudo journalctl -u containerd --since "1 hour ago"
# View logs with timestamps
sudo journalctl -u containerd --output short-full
# Enable debug logging in /etc/containerd/config.toml
[debug]
  level = "debug"
  format = "text" # or "json"
# After enabling debug, restart containerd
sudo systemctl restart containerd
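When the daemon log format is set to "json", each entry can be processed as a JSON object instead of being searched as free text. The following is a minimal sketch, assuming each entry carries `level`, `msg`, and `time` fields; the `filter_by_level` helper and the sample lines are hypothetical and only illustrate the idea.

```python
import json


def filter_by_level(lines, levels=("error", "warning", "fatal")):
    """Return (time, level, msg) tuples for JSON log lines at the given levels."""
    matches = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue  # valid JSON, but not a log object
        if entry.get("level") in levels:
            matches.append((entry.get("time"), entry.get("level"), entry.get("msg")))
    return matches


# Illustrative sample lines, not real containerd output.
SAMPLE = [
    '{"level":"info","msg":"starting containerd","time":"2024-01-01T00:00:00Z"}',
    '{"level":"error","msg":"failed to load plugin","time":"2024-01-01T00:00:01Z"}',
    "not json",
]

if __name__ == "__main__":
    for when, level, msg in filter_by_level(SAMPLE):
        print(when, level, msg)
```

In practice the lines could come from `journalctl -u containerd -o cat`, which prints only the message part of each journal entry.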
containerd captures stdout and stderr from containers. These logs are stored in the container's log directory and can be accessed using `crictl logs` or by reading the log files directly.
# View container logs with crictl
crictl logs <container-id>
# Follow logs
crictl logs -f <container-id>
# View last 50 lines
crictl logs --tail 50 <container-id>
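# Container log location (Linux, Kubernetes nodes)
/var/log/containers/
# runc runtime log for a container (runtime messages, not the container's stdout/stderr)
/run/containerd/io.containerd.runtime.v2.task/k8s.io/<container-id>/log.json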
# View raw log file
sudo cat /var/log/containers/*.log
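Reading the raw files directly means dealing with the CRI log format, where each line is `<timestamp> <stream> <tag> <content>` and the tag is `P` for a partial line or `F` for the final piece. Below is a rough sketch of a parser that reassembles partial lines; the `parse_cri_log` name and the sample lines are assumptions made for illustration.

```python
def parse_cri_log(lines):
    """Parse CRI-format log lines into (timestamp, stream, message) records.

    Partial pieces (tag "P") are buffered per stream and joined with the
    next full piece (tag "F"). The record keeps the first piece's timestamp.
    """
    records = []
    pending = {}  # stream -> (first timestamp, list of buffered pieces)
    for raw in lines:
        parts = raw.rstrip("\n").split(" ", 3)
        if len(parts) < 3:
            continue  # malformed line
        timestamp, stream, tag = parts[0], parts[1], parts[2]
        content = parts[3] if len(parts) == 4 else ""
        if stream not in ("stdout", "stderr") or tag not in ("P", "F"):
            continue  # not a line this sketch understands
        first_ts, pieces = pending.pop(stream, (timestamp, []))
        pieces.append(content)
        if tag == "P":
            pending[stream] = (first_ts, pieces)
        else:
            records.append((first_ts, stream, "".join(pieces)))
    return records


# Illustrative sample lines, not taken from a real container.
SAMPLE = [
    "2024-01-01T00:00:00.000000001Z stdout F hello world",
    "2024-01-01T00:00:01.000000001Z stderr P partial ",
    "2024-01-01T00:00:01.000000002Z stderr F line",
]

if __name__ == "__main__":
    for record in parse_cri_log(SAMPLE):
        print(record)
```

Buffering per stream keeps interleaved stdout and stderr output from being merged into one message.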
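Log rotation prevents disk space exhaustion from container logs. containerd does not rotate container log files itself: on Kubernetes nodes the kubelet rotates them (see its `containerLogMaxSize` and `containerLogMaxFiles` settings), and elsewhere you can use an external tool such as logrotate. The CRI plugin in config.toml only limits the size of individual log lines.
# CRI plugin log settings in config.toml
[plugins."io.containerd.grpc.v1.cri"]
  # Maximum size of a single log line in bytes; longer lines are split
  max_container_log_line_size = 16384
[plugins."io.containerd.grpc.v1.cri".containerd]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  # Use the systemd cgroup driver (this setting does not affect log rotation)
  SystemdCgroup = true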
# External logrotate configuration
# /etc/logrotate.d/containerd
/var/log/containers/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
    maxsize 100M
}
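Before relying on a size threshold such as `maxsize 100M`, it can help to check which log files already exceed it. This is a small sketch under stated assumptions: the `find_oversized_logs` helper is hypothetical, and the demo runs against a temporary directory rather than a real log directory.

```python
import os
import tempfile
from pathlib import Path


def find_oversized_logs(log_dir, max_bytes):
    """Return (path, size) pairs for *.log files larger than max_bytes, largest first."""
    oversized = []
    for path in Path(log_dir).glob("*.log"):
        try:
            size = path.stat().st_size  # stat() follows symlinks
        except OSError:
            continue  # dangling symlink or file removed during the scan
        if size > max_bytes:
            oversized.append((str(path), size))
    return sorted(oversized, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Demo against a temporary directory with two made-up log files.
    with tempfile.TemporaryDirectory() as demo_dir:
        Path(demo_dir, "big.log").write_bytes(b"x" * 2048)
        Path(demo_dir, "small.log").write_bytes(b"x" * 10)
        for path, size in find_oversized_logs(demo_dir, max_bytes=1024):
            print(os.path.basename(path), size)
```

Pointing it at `/var/log/containers` with `max_bytes=100 * 1024 * 1024` would mirror the logrotate threshold above.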
containerd exposes metrics that help monitor performance and resource usage. Enable the metrics endpoint to start collecting data.
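# Enable metrics in config.toml
[metrics]
  # Address to serve metrics on; setting an address enables the endpoint.
  # 0.0.0.0 listens on all interfaces; use 127.0.0.1 to restrict access to the local host.
  address = "0.0.0.0:1338"
  # Also record gRPC latency histograms (adds overhead)
  grpc_histogram = false
# Restart containerd, then verify the metrics endpoint
sudo systemctl restart containerd
curl http://localhost:1338/v1/metrics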
# Example metric names (illustrative; exact names vary by containerd version, so check your endpoint's output)
# container_runtime_operations_total{type="create"}
# container_memory_usage_bytes
# container_cpu_usage_seconds_total
Metric                                     Description
container_runtime_operations_total         Total number of container operations (create, start, stop, delete)
container_runtime_operations_errors_total  Failed container operations
container_image_pull_duration_seconds      Time to pull images
container_memory_usage_bytes               Memory usage per container
container_cpu_usage_seconds_total          CPU usage per container
container_network_receive_bytes_total      Network receive bytes
container_network_transmit_bytes_total     Network transmit bytes
go_goroutines                              Number of goroutines in containerd
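The metrics endpoint returns plain text in the Prometheus exposition format, so a quick look at a few values does not require a full monitoring stack. The sketch below parses that text by hand; `parse_metrics` and `total` are hypothetical helpers, the sample text is made up, and optional sample timestamps are not handled.

```python
import re

# Matches label pairs such as type="create" inside the braces.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition text into {metric_name: [(labels_dict, value), ...]}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        series, _, raw_value = line.rpartition(" ")
        if not series:
            continue  # no value on this line
        name, brace, label_part = series.partition("{")
        labels = dict(LABEL_RE.findall(label_part)) if brace else {}
        try:
            value = float(raw_value)
        except ValueError:
            continue  # value is not numeric
        samples.setdefault(name.strip(), []).append((labels, value))
    return samples


def total(samples, name):
    """Sum a metric's values across all of its label sets."""
    return sum(value for _, value in samples.get(name, []))


# Illustrative sample, using metric names from the table above.
SAMPLE = """\
# TYPE go_goroutines gauge
go_goroutines 42
container_runtime_operations_total{type="create"} 10
container_runtime_operations_total{type="start"} 7
"""

if __name__ == "__main__":
    parsed = parse_metrics(SAMPLE)
    print(total(parsed, "container_runtime_operations_total"))
```

For anything beyond a spot check, an established Prometheus client library is likely a better choice than a hand-written parser.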
Configure Prometheus to scrape containerd metrics for long-term storage and alerting.
# prometheus.yml
scrape_configs:
  - job_name: 'containerd'
    # containerd serves its metrics under /v1/metrics
    metrics_path: '/v1/metrics'
    scheme: 'http'
    static_configs:
      - targets: ['localhost:1338']
# Check Prometheus target status
# http://localhost:9090/targets
# Sample PromQL queries
# Container operations rate
rate(container_runtime_operations_total[5m])
# Memory usage by container
container_memory_usage_bytes{container="nginx"}
# Container operation errors
container_runtime_operations_errors_total
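To make the `rate(...[5m])` query above more concrete, here is a simplified sketch of a per-second rate over counter samples. It treats a drop in value as a counter reset, but unlike Prometheus it does not extrapolate to the window boundaries, so results will differ slightly from real `rate()` output. The `per_second_rate` function and sample values are hypothetical.

```python
def per_second_rate(samples):
    """Average per-second increase over (timestamp_seconds, value) counter samples.

    Returns None when fewer than two samples exist or no time has elapsed.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            increase += cur  # counter reset: count from zero again
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    return increase / elapsed


if __name__ == "__main__":
    # 150 operations over a 300 second window.
    window = [(0, 100), (150, 160), (300, 250)]
    print(per_second_rate(window))
```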
Visualize containerd metrics with Grafana dashboards. There are pre-built dashboards available for container monitoring.
# Install Grafana (Debian/Ubuntu; requires the Grafana APT repository to be configured first)
sudo apt-get update
sudo apt-get install -y grafana
# Start Grafana
sudo systemctl start grafana-server
# Import a dashboard
# Dashboard ID: 14285 (Kubernetes / Containers)
# Dashboard ID: 13295 (containerd Monitoring)
# Sample dashboard panels:
# - Container CPU usage (top 10)
# - Container memory usage (top 10)
# - Container restart count
# - Image pull duration
# - Container operations per second
Set up Prometheus alerts to notify you when issues occur.
# Prometheus alerting rules
groups:
  - name: containerd_alerts
    rules:
      - alert: ContainerHighCPU
        expr: rate(container_cpu_usage_seconds_total[5m]) * 100 > 80
        for: 5m
        annotations:
          summary: "High CPU usage on container {{ $labels.container }}"
      - alert: ContainerHighMemory
        expr: container_memory_usage_bytes / container_memory_limit_bytes > 0.9
        for: 5m
        annotations:
          summary: "High memory usage on container {{ $labels.container }}"
      - alert: ContainerRestarting
        expr: changes(container_start_time_seconds[10m]) > 2
        annotations:
          summary: "Container {{ $labels.container }} is restarting frequently"
      - alert: ImagePullSlow
        expr: histogram_quantile(0.95, rate(container_image_pull_duration_seconds_bucket[5m])) > 30
        annotations:
          summary: "Image pull is taking longer than 30 seconds"
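The `for: 5m` clauses above mean an alert fires only after its expression has stayed true for the whole period, which filters out short spikes. The following sketch approximates that behavior for a simple threshold; `is_firing` and the sample values are hypothetical, and real Prometheus evaluates rules at its own evaluation interval.

```python
def is_firing(samples, threshold, for_seconds):
    """Return True if the value has exceeded threshold continuously for for_seconds.

    samples is a list of (timestamp_seconds, value) pairs sorted by time.
    The condition must hold at every sample from some point up to the latest one.
    """
    pending_since = None
    for timestamp, value in samples:
        if value > threshold:
            if pending_since is None:
                pending_since = timestamp  # condition became true here
        else:
            pending_since = None  # condition cleared, start over
    if pending_since is None:
        return False
    return samples[-1][0] - pending_since >= for_seconds


if __name__ == "__main__":
    # Memory usage ratios sampled every 100 seconds (made-up values).
    sustained = [(0, 0.91), (100, 0.92), (200, 0.95), (300, 0.97)]
    spike = [(0, 0.95), (60, 0.96), (120, 0.50), (180, 0.92), (240, 0.93)]
    print(is_firing(sustained, threshold=0.9, for_seconds=300))
    print(is_firing(spike, threshold=0.9, for_seconds=300))
```

In the second series the dip at 120 seconds resets the pending period, so the alert would not fire yet.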
# Check if containerd logs are being generated
sudo journalctl -u containerd --since "5 minutes ago" | wc -l
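# Verify metrics endpoint
curl -s http://localhost:1338/v1/metrics | head -20
# Check if Prometheus can scrape metrics (quote and URL-encode the query so the shell and curl leave the braces alone)
curl -s -G http://prometheus:9090/api/v1/query --data-urlencode 'query=up{job="containerd"}'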
# Check containerd config for metrics
containerd config dump | grep -A 5 metrics
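# Increase log level temporarily (assumes a level key already exists in the [debug] section)
sudo sed -i '/^\[debug\]/,/^\[/ s/^\([[:space:]]*level[[:space:]]*=[[:space:]]*\).*/\1"debug"/' /etc/containerd/config.toml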
sudo systemctl restart containerd
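The Prometheus query check above contains braces and quotes, which must be URL-encoded when the request is built by a script rather than typed into a shell. A minimal sketch using only the standard library follows; `build_query_url` is a hypothetical helper and the host name is the one used in the commands above.

```python
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API with the query encoded."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


if __name__ == "__main__":
    print(build_query_url("http://prometheus:9090", 'up{job="containerd"}'))
```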
Logging and monitoring belong in every production containerd deployment: collect daemon and container logs, expose and scrape metrics, and alert on the signals that matter. Set up these practices early to maintain healthy container infrastructure.