This guide covers seven essential Linux monitoring tools (vmstat, iostat, free, sar, dstat, iotop, and iftop) with practical examples, real-world scenarios, and tips for effective system performance analysis.
Why These Tools Matter
These command-line tools report system performance in real time, and each one covers a different part of the system. Together they offer:
- Real-time Analysis: Get immediate visibility into system health
- Low Overhead: Minimal impact on system performance
- Granular Metrics: Detailed statistics for troubleshooting
- Historical Data: Track performance trends over time
- Script Integration: Easy to integrate into monitoring scripts
1. vmstat - Virtual Memory Statistics
Key Metrics Explained
Practical Examples
# Basic usage - single snapshot
vmstat
# Monitor every 2 seconds
vmstat 2
# Monitor every second, 10 times
vmstat 1 10
# Show active and inactive memory
vmstat -a
# Show disk statistics
vmstat -d
# Show statistics in MB
vmstat -S M
# Show summary since boot
vmstat -s
# Show statistics for a specific partition
vmstat -p /dev/sda1
Real Output Analysis
$ vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 5012344 102344 3056789 0 0 12 24 456 1234 12 5 83 0 0
2 0 0 5011234 102356 3057890 0 0 0 32 567 1345 15 7 78 0 0
0 1 0 5009876 102345 3054567 0 0 456 0 678 1456 8 10 82 0 0
1 0 0 5013456 102378 3058901 0 0 0 16 789 1567 20 3 77 0 0
2 0 0 5012345 102390 3056789 0 0 24 8 890 1678 18 6 76 0 0
1. High 'r': More runnable processes than CPU cores means processes are waiting for CPU (CPU bottleneck)
2. High 'b': Processes blocked in uninterruptible sleep, usually on I/O (I/O bottleneck)
3. High 'wa': CPU idle while waiting for I/O (I/O bottleneck)
4. Nonzero 'si/so' over several samples: Active swapping (memory pressure)
5. High 'cs': Frequent context switching (possible scheduling overhead)
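The checks in this list can be sketched as a small filter over vmstat output. This is only an illustration: the function name vmstat_flags and the 20% threshold for 'wa' are assumptions, not values defined by vmstat. It skips the first sample because vmstat's first report gives averages since boot rather than current activity.

```shell
# Sketch: flag common bottleneck patterns in vmstat output read from stdin.
# vmstat_flags and the wa >= 20 threshold are illustrative assumptions.
vmstat_flags() {
  cores="${1:-1}"   # number of CPU cores, e.g. from nproc
  awk -v cores="$cores" '
    $1 !~ /^[0-9]+$/ { next }       # skip the two header lines
    ++n == 1         { next }       # first sample is the average since boot
    {
      msg = ""
      if ($1 > cores)       msg = msg " run-queue(r=" $1 ")"
      if ($2 > 0)           msg = msg " blocked(b=" $2 ")"
      if ($7 > 0 || $8 > 0) msg = msg " swapping(si=" $7 ",so=" $8 ")"
      if ($16 >= 20)        msg = msg " iowait(wa=" $16 ")"
      if (msg != "") print "sample " n ":" msg
    }'
}

# Usage: vmstat 1 5 | vmstat_flags "$(nproc)"
```

Column positions follow the header shown above (r=1, b=2, si=7, so=8, wa=16), so the sketch assumes the default vmstat layout rather than -a or -d output.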
2. iostat - Input/Output Statistics
Essential Commands
# Basic CPU and device statistics
iostat
# Monitor every 2 seconds
iostat 2
# Show extended statistics
iostat -x
# Show statistics in megabytes
iostat -m
# Show only CPU statistics
iostat -c
# Show only device statistics
iostat -d
# Show with timestamp
iostat -t
# Show statistics for a specific device and its partitions
iostat -p sda
# Combine options
iostat -x -m 2 5
Extended Statistics Explained
Real Output Analysis
$ iostat -x -m 1 3
Linux 5.4.0-91-generic (server) 12/07/2025 _x86_64_ (4 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
12.45 0.00 3.56 1.23 0.00 82.76
Device r/s w/s rMB/s wMB/s await svctm %util
sda 45.23 23.45 5.67 3.45 2.34 1.23 12.34
sdb 1.23 0.45 0.12 0.04 1.56 0.89 0.56
nvme0n1 123.45 89.67 15.67 11.23 1.23 0.45 45.67
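1. %util > 70%: Device is busy most of the time (on SSDs, NVMe drives, and RAID arrays that serve requests in parallel, high %util alone does not prove saturation)
2. await well above svctm: Requests spend most of their time queued, so the device is saturated (svctm is deprecated and missing from newer sysstat releases)
3. avgqu-sz > 1 (named aqu-sz in newer sysstat): Requests are queuing
4. High %iowait in avg-cpu: CPUs are idle while waiting for I/O
5. High svctm: Slow device response (where svctm is no longer reported, compare r_await and w_await instead)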
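As a rough sketch of the %util check, the filter below lists devices above a threshold. It finds the %util column by its header name because column order differs between sysstat versions. The name busy_devices and the default of 70 are assumptions for illustration, and the sketch expects decimal points in the output (for example under LC_ALL=C).

```shell
# Sketch: list devices whose %util exceeds a threshold in `iostat -x` output.
# busy_devices and the default limit of 70 are illustrative assumptions.
busy_devices() {
  limit="${1:-70}"
  awk -v limit="$limit" '
    $1 ~ /^Device/ {                 # header row: locate the %util column
      col = 0
      for (i = 1; i <= NF; i++) if ($i == "%util") col = i
      next
    }
    NF == 0 { col = 0; next }        # a blank line ends the device table
    col && $col + 0 > limit { print $1, $col }
  '
}

# Usage (-y omits the first since-boot report): iostat -dxy 1 3 | busy_devices 70
```

Each matching device is printed once per report, so a device that stays busy across three reports appears three times.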
3. free - Memory Usage Statistics
Essential Commands
# Basic memory display
free
# Human readable format (KB, MB, GB)
free -h
# Show in megabytes
free -m
# Show in gigabytes
free -g
# Show total line
free -t
# Wide output (shows buffers and cache in separate columns)
free -w
# Continuous monitoring (every 2 seconds)
free -h -s 2
# Show totals in human-readable format
free -h -t
# Combine options
free -h -t -s 5
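free does not print timestamps itself, so logged output needs them added separately. One possible approach, sketched here with a hypothetical helper named with_timestamp, is to prefix each line with the current time as it is read:

```shell
# Sketch: prefix every line read from stdin with the current time.
# with_timestamp is a hypothetical helper name, not part of free.
with_timestamp() {
  while IFS= read -r line; do
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$line"
  done
}

# Usage: free -h | with_timestamp >> /tmp/free.log
```

The same filter works for any of the tools in this guide that write plain text to standard output.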
Memory Metrics Explained
Real Output Analysis
$ free -h -t
total used free shared buff/cache available
Mem: 15Gi 4.2Gi 2.1Gi 123Mi 8.7Gi 10Gi
Swap: 2.0Gi 512Mi 1.5Gi
Total: 17Gi 4.7Gi 3.6Gi
1. Don't panic about low 'free': Linux uses free memory for cache
2. Watch 'available': This is memory truly available for apps
3. High buff/cache is good: Means Linux is using memory efficiently
4. Swap usage: Some swap usage is normal, high usage indicates memory pressure
5. Memory leak detection: Monitor 'used' memory growth over time
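To track memory pressure by 'available' rather than 'free', a script can compute the share of RAM that is not available. The sketch below assumes a free version that prints an 'available' column and is fed plain `free` output (no -h); mem_used_pct is a hypothetical name.

```shell
# Sketch: percentage of RAM in use, counting only memory that is not "available".
# Reads plain `free` output on stdin; mem_used_pct is a hypothetical helper.
mem_used_pct() {
  awk '
    # The header has no label above the row names, so header field i
    # lines up with data field i + 1.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "available") col = i + 1 }
    $1 == "Mem:" && col { printf "%.1f\n", ($2 - $col) * 100 / $2 }
  '
}

# Usage: free | mem_used_pct
```

Logging this value at regular intervals gives a simple trend line for spotting the steady growth that suggests a memory leak.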
4. sar - System Activity Reporter
Installation and Setup
# Install sar (Ubuntu/Debian)
sudo apt-get install sysstat
# Install sar (RHEL/CentOS)
sudo yum install sysstat
# Enable data collection
sudo systemctl enable sysstat
sudo systemctl start sysstat
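# Debian/Ubuntu: enable collection in /etc/default/sysstat
sudo vi /etc/default/sysstat
# Change: ENABLED="true"
# Configure retention (/etc/sysstat/sysstat on Debian/Ubuntu, /etc/sysconfig/sysstat on RHEL/CentOS)
sudo vi /etc/sysstat/sysstat
# Set: HISTORY=7 (days to keep data)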
Essential sar Commands
# CPU usage every 1 second, 3 times
sar 1 3
# Show CPU statistics
sar -u
# Show memory statistics
sar -r
# Show swap statistics
sar -S
# Show I/O statistics
sar -b
# Show paging statistics
sar -B
# Show network statistics
sar -n DEV
# Show queue length and load averages
sar -q
# Show process creation and context switches
sar -w
# Show block device statistics
sar -d
# Read from system activity file (files live in /var/log/sa/ on RHEL/CentOS)
sar -f /var/log/sysstat/sa07 # For day 07
# Show specific time range
sar -s 10:00:00 -e 11:00:00
# Show all available reports
sar -A
Historical Data Analysis
# View yesterday's CPU usage
sar -u -f /var/log/sysstat/sa$(date -d yesterday +%d)
# View a specific day (the 7th of the month)
sar -u -f /var/log/sysstat/sa07
# View memory usage for specific day
sar -r -f /var/log/sysstat/sa07
# Generate daily report
sar -A -f /var/log/sysstat/sa07 > /tmp/daily-report.txt
# Compare two days
sar -u -f /var/log/sysstat/sa06 > day6.txt
sar -u -f /var/log/sysstat/sa07 > day7.txt
diff day6.txt day7.txt
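A raw diff of two days is noisy because every timestamped line differs. A summary per day is often easier to compare. The sketch below reduces `sar -u` output to the busiest interval and the daily average, both computed as 100 minus %idle. It assumes the default all-CPU report with decimal points in the numbers; sar_cpu_summary is a hypothetical name.

```shell
# Sketch: summarise `sar -u` output read from stdin.
# Prints the busiest interval and the average, both as 100 - %idle.
# sar_cpu_summary is a hypothetical helper name.
sar_cpu_summary() {
  awk '
    $NF !~ /^[0-9.]+$/ { next }                  # skip banner, headers, blanks
    $1 == "Average:"   { avg = 100 - $NF; next }
    min == "" || $NF + 0 < min {
      min = $NF + 0
      when = $1 ($2 ~ /^(AM|PM)$/ ? " " $2 : "")
    }
    END {
      if (min != "") printf "peak %.2f%% at %s\n", 100 - min, when
      if (avg != "") printf "average %.2f%%\n", avg
    }'
}

# Usage: sar -u -f /var/log/sysstat/sa07 | sar_cpu_summary
```

Running it against sa06 and sa07 gives two short summaries that can be compared at a glance.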
5. dstat - Versatile Resource Statistics
Installation
# Ubuntu/Debian
sudo apt-get install dstat
# RHEL/CentOS
sudo yum install dstat
# Install from source
git clone https://github.com/dagwieers/dstat
cd dstat
sudo make install
Essential dstat Commands
# Basic statistics (CPU, disk, network, paging, system)
dstat
# Update every 2 seconds
dstat 2
# Show only CPU and disk
dstat -c -d
# Show with top processes
dstat -c --top-cpu
# Show with top memory processes
dstat -c -d -n -m --top-mem
# Show disk by device
dstat -cdl -D sda,sdb,total
# Show network by interface
dstat -cdn -N eth0,eth1,total
# Output to CSV file
dstat --output /tmp/dstat.csv 2 10
# Show TCP connections
dstat --tcp
# Show memory usage details
dstat -m --vmstat
# Show all available plugins
dstat --list
Custom dstat Views
# Custom columns: CPU, Memory, Disk, Network
dstat -tc -m -d -n
# With process information
dstat -tc -m -d -n --proc-count --top-cpu
# Monitor specific metrics only
dstat --time --cpu --mem --net --disk --tcp --vm
# Show integer values (dstat scales units automatically)
dstat -cdngy --integer
# Monitor for 30 seconds, update every 2
dstat -tcmd 2 15
6. iotop - I/O Monitor by Process
Installation
# Ubuntu/Debian
sudo apt-get install iotop
# RHEL/CentOS
sudo yum install iotop
# Run with necessary privileges
sudo iotop
Essential iotop Commands
# Basic interactive mode (requires sudo)
sudo iotop
# Batch mode for scripting
sudo iotop -b
# Update every 2 seconds
sudo iotop -d 2
# Show only processes doing I/O
sudo iotop -o
# Show accumulated I/O
sudo iotop -a
# Show processes only, not individual threads
sudo iotop -P
# Show in kilobytes
sudo iotop -k
# Show timestamp
sudo iotop -t
# Combine options
sudo iotop -bot -d 2 -n 5
Interactive Mode Controls
7. iftop - Network Bandwidth Monitor
Installation
# Ubuntu/Debian
sudo apt-get install iftop
# RHEL/CentOS
sudo yum install iftop
# Run (specify interface if multiple)
sudo iftop -i eth0
Essential iftop Commands
# Monitor specific interface
sudo iftop -i eth0
# Show port numbers
sudo iftop -P
# Show in bytes per second (default is bits)
sudo iftop -B
# Don't resolve hostnames (faster)
sudo iftop -n
# Don't resolve port names
sudo iftop -N
# Combine options
sudo iftop -i eth0 -n -N -B
# Filter by network
sudo iftop -F 192.168.1.0/24
# Filter by port
sudo iftop -f "port 80 or port 443"
Interactive Mode Controls
Tool Comparison Guide
Quick Reference Cheat Sheet
Real-World Troubleshooting Scenarios
# Step 1: Check CPU and memory
vmstat 1 5
# Step 2: Check I/O wait
iostat -x 1 3
# Step 3: Check which process is causing I/O
sudo iotop -o
# Step 4: Check memory pressure
free -h
sar -r 1 3
# Step 1: Identify top talkers
sudo iftop -i eth0 -n
# Step 2: Check network errors
sar -n EDEV 1 3
# Step 3: Check TCP connections
sar -n TCP 1 3
# Step 4: Monitor overall bandwidth
dstat -n -N eth0,total 1
# Step 1: Check disk utilization
iostat -x -m 1
# Step 2: Check await times (the await columns; r_await and w_await in newer sysstat)
iostat -dx 1 3
# Step 3: Identify process causing disk I/O
sudo iotop -o -b -d 2
# Step 4: Check filesystem cache
free -h
sar -r 1 3
Monitoring Script Examples
Comprehensive Monitoring Script
#!/bin/bash
# comprehensive-monitor.sh
# Collects data from multiple monitoring tools
LOG_DIR="/var/log/monitoring"
mkdir -p "$LOG_DIR"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
echo "=== Comprehensive System Monitoring - $(date) ==="
echo "Logging to: $LOG_DIR/monitor_$TIMESTAMP.log"
# Collect vmstat data
echo "=== vmstat (system overview) ===" >> "$LOG_DIR/monitor_$TIMESTAMP.log"
vmstat 1 3 >> "$LOG_DIR/monitor_$TIMESTAMP.log"
# Collect iostat data
echo "" >> "$LOG_DIR/monitor_$TIMESTAMP.log"
echo "=== iostat (disk I/O) ===" >> "$LOG_DIR/monitor_$TIMESTAMP.log"
iostat -x -m 1 3 >> "$LOG_DIR/monitor_$TIMESTAMP.log"
# Collect memory data
echo "" >> "$LOG_DIR/monitor_$TIMESTAMP.log"
echo "=== Memory Usage ===" >> "$LOG_DIR/monitor_$TIMESTAMP.log"
free -h >> "$LOG_DIR/monitor_$TIMESTAMP.log"
# Collect process I/O (requires sudo)
if [ "$EUID" -eq 0 ]; then
echo "" >> "$LOG_DIR/monitor_$TIMESTAMP.log"
echo "=== Top I/O Processes ===" >> "$LOG_DIR/monitor_$TIMESTAMP.log"
iotop -b -n 2 -d 1 >> "$LOG_DIR/monitor_$TIMESTAMP.log" 2>/dev/null
fi
# Collect network stats
echo "" >> "$LOG_DIR/monitor_$TIMESTAMP.log"
echo "=== Network Statistics ===" >> "$LOG_DIR/monitor_$TIMESTAMP.log"
sar -n DEV 1 1 >> "$LOG_DIR/monitor_$TIMESTAMP.log" 2>/dev/null
echo "Monitoring completed. Report: $LOG_DIR/monitor_$TIMESTAMP.log"
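A script like this quietly produces incomplete reports when a tool is missing. A preflight check is one way to catch that early. The sketch below uses a hypothetical helper named missing_tools, which prints each command that is not on PATH and returns non-zero if any are missing.

```shell
# Sketch: report which of the given commands are not on PATH.
# missing_tools is a hypothetical helper; it prints one missing name per line
# and returns non-zero if anything is missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Usage near the top of the script:
#   missing_tools vmstat iostat free sar iotop || echo "Install the tools listed above" >&2
```

On most distributions iostat and sar come from the sysstat package, while vmstat and free are part of procps.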
Automated Alert Script
#!/bin/bash
# alert-monitor.sh
# Monitors system and sends alerts for critical conditions
THRESHOLD_CPU=90
THRESHOLD_MEMORY=90
THRESHOLD_DISK=85
ALERT_EMAIL="admin@example.com"
check_cpu() {
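# Use the second vmstat sample; the first reports averages since boot.
# Parsing top by field position breaks when idle is printed as "ni,100.0 id".
CPU_USAGE=$(vmstat 1 2 | tail -1 | awk '{print 100 - $15}')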
if (( $(echo "$CPU_USAGE > $THRESHOLD_CPU" | bc -l) )); then
echo "ALERT: High CPU usage: ${CPU_USAGE}%"
return 1
fi
return 0
}
check_memory() {
MEM_PERCENT=$(free | grep Mem | awk '{print $3/$2 * 100.0}')
if (( $(echo "$MEM_PERCENT > $THRESHOLD_MEMORY" | bc -l) )); then
echo "ALERT: High memory usage: ${MEM_PERCENT}%"
return 1
fi
return 0
}
check_disk() {
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')
if [ "$DISK_USAGE" -ge "$THRESHOLD_DISK" ]; then
echo "ALERT: High disk usage on /: ${DISK_USAGE}%"
return 1
fi
return 0
}
# Run checks
ALERTS=""
if ! check_cpu; then ALERTS="$ALERTS\n- High CPU"; fi
if ! check_memory; then ALERTS="$ALERTS\n- High Memory"; fi
if ! check_disk; then ALERTS="$ALERTS\n- High Disk"; fi
# Send alert if any issues
if [ -n "$ALERTS" ]; then
echo -e "System alerts detected on $(hostname) at $(date):$ALERTS" | \
mail -s "SYSTEM ALERT: $(hostname)" "$ALERT_EMAIL"
echo "Alerts sent to $ALERT_EMAIL"
fi
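The threshold checks above depend on bc, which is not always installed on minimal systems. As an alternative sketch, awk can do the same floating-point comparison; exceeds is a hypothetical helper that succeeds when the first value is greater than the second.

```shell
# Sketch: floating-point threshold test using awk instead of bc.
# exceeds is a hypothetical helper: exit status 0 when value > threshold.
exceeds() {
  awk -v v="$1" -v t="$2" 'BEGIN { exit !(v + 0 > t + 0) }'
}

# Usage inside check_cpu:
#   if exceeds "$CPU_USAGE" "$THRESHOLD_CPU"; then
#     echo "ALERT: High CPU usage: ${CPU_USAGE}%"
#     return 1
#   fi
```

Adding 0 to each operand forces a numeric comparison, so values such as 9.5 and 10 are compared as numbers rather than as strings.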
Monitoring Best Practices Checklist
- Use vmstat for quick system overview
- Use iostat for disk I/O analysis
- Use free for memory status
- Use sar for historical data
- Use dstat for customizable views
- Use iotop to identify I/O intensive processes
- Use iftop for network traffic analysis
- Establish baseline metrics
- Set appropriate monitoring intervals
- Log data for trend analysis
Getting Started with Monitoring
Follow this step-by-step approach to master Linux monitoring tools:
- Start with basics: Master free, vmstat, and iostat first
- Install sysstat: Set up sar for historical data collection
- Learn interactive tools: Practice with iotop and iftop
- Create custom views: Use dstat to build custom monitoring dashboards
- Establish baselines: Monitor your systems during normal operation
- Set up alerts: Create scripts to notify you of issues
- Practice troubleshooting: Use the scenarios in this guide
- Document findings: Keep notes on normal ranges and thresholds
- Automate reporting: Schedule regular system health reports
- Stay updated: Learn new tools and techniques regularly
Master the Art of System Monitoring
These monitoring tools are essential for any Linux administrator or DevOps professional. By mastering vmstat, iostat, free, sar, dstat, iotop, and iftop, you gain deep visibility into system performance and can quickly identify and resolve issues.
Remember: Effective monitoring is about asking the right questions. Start with "What's the overall system health?" (vmstat), then drill down to specific areas: "Is disk I/O the bottleneck?" (iostat), "Which process is causing it?" (iotop).
Next Steps: Install all these tools on your system and spend 30 minutes each day practicing with them. Create a monitoring script that collects data from all tools and generates a daily health report. Soon, you'll be diagnosing system issues with confidence and precision.