Essential Linux Monitoring Tools: Complete Guide

Master the essential Linux monitoring tools used by DevOps professionals worldwide. This comprehensive guide covers vmstat, iostat, free, sar, dstat, iotop, and iftop with practical examples, real-world scenarios, and expert tips for effective system performance analysis.

Why These Tools Matter

These command-line monitoring tools provide real-time insight into system performance. Each serves a specific purpose, and together they offer:

  • Real-time Analysis: Get immediate visibility into system health
  • Low Overhead: Minimal impact on system performance
  • Granular Metrics: Detailed statistics for troubleshooting
  • Historical Data: Track performance trends over time (using data collected by sar)
  • Script Integration: Easy to integrate into monitoring scripts

1. vmstat - Virtual Memory Statistics

Reports information about processes, memory, paging, block I/O, traps, disks, and CPU activity.

Syntax: vmstat [options] [delay [count]]
Covers: processes, memory, swap, I/O, system, CPU

Key Metrics Explained

procs: r = runnable (running or waiting for CPU), b = blocked in uninterruptible sleep (usually I/O)
memory: swpd = used swap, free, buff, cache
swap: si = swap in, so = swap out
io: bi = blocks in, bo = blocks out
system: in = interrupts, cs = context switches
cpu: us, sy, id, wa, st = user, system, idle, wait, steal

Practical Examples

# Basic usage - single report of averages since boot
vmstat

# Monitor every 2 seconds
vmstat 2

# Monitor every second, 10 times
vmstat 1 10

# Show active and inactive memory
vmstat -a

# Show disk statistics
vmstat -d

# Show statistics in MB
vmstat -S M

# Show summary since boot
vmstat -s

# Show statistics for a specific partition
vmstat -p /dev/sda1

Real Output Analysis

$ vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 5012344 102344 3056789   0    0    12    24  456 1234 12  5 83  0  0
 2  0      0 5011234 102356 3057890   0    0     0    32  567 1345 15  7 78  0  0
 0  1      0 5009876 102345 3054567   0    0   456     0  678 1456  8 10 82  0  0
 1  0      0 5013456 102378 3058901   0    0     0    16  789 1567 20  3 77  0  0
 2  0      0 5012345 102390 3056789   0    0    24     8  890 1678 18  6 76  0  0

Interpreting vmstat Output:
1. High 'r' (consistently above the CPU count): Processes are waiting for CPU (CPU bottleneck)
2. High 'b': Processes blocked on I/O (I/O bottleneck)
3. High 'wa': CPU waiting for I/O (I/O bottleneck)
4. High 'si/so': Active swapping (memory pressure)
5. High 'cs': Frequent context switching (possible overhead)
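
The checks above can be sketched as a small filter over vmstat output. This is a minimal illustration, not a tuned detector: the function name vmstat_flags, the message wording, and the default iowait threshold of 20% are assumptions chosen for the example. Columns are located by header name rather than by position.

```shell
# Sketch: flag common bottleneck signals in vmstat output read from stdin.
# $1 = CPU count to compare the run queue against (default 1)
# $2 = iowait threshold in percent (default 20, an illustrative assumption)
vmstat_flags() {
    local ncpu="${1:-1}"
    local wa_max="${2:-20}"
    awk -v ncpu="$ncpu" -v wa_max="$wa_max" '
        # Header row: remember the position of every column by name
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        # Data rows start with a number; the banner row does not
        ("r" in col) && $1 ~ /^[0-9]+$/ {
            n++
            if ($(col["r"]) > ncpu)
                print "sample " n ": run queue " $(col["r"]) " exceeds " ncpu " CPU(s)"
            if ($(col["b"]) > 0)
                print "sample " n ": " $(col["b"]) " process(es) blocked on I/O"
            if ($(col["wa"]) > wa_max)
                print "sample " n ": iowait " $(col["wa"]) "% above " wa_max "%"
            if ($(col["si"]) > 0 || $(col["so"]) > 0)
                print "sample " n ": swapping (si=" $(col["si"]) ", so=" $(col["so"]) ")"
        }
    '
}
```

A possible invocation is vmstat 1 5 | vmstat_flags "$(nproc)". Keep in mind that the first vmstat sample reports averages since boot, so flags on sample 1 describe history rather than the current moment.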

2. iostat - Input/Output Statistics

Reports CPU and I/O statistics. Monitors system input/output device loading by observing the time devices are active.

Syntax: iostat [options] [interval [count]]
Covers: CPU utilization, device tps, KB read/s, KB written/s, await time, utilization %

Essential Commands

# Basic CPU and device statistics
iostat

# Monitor every 2 seconds
iostat 2

# Show extended statistics
iostat -x

# Show statistics in megabytes
iostat -m

# Show only CPU statistics
iostat -c

# Show only device statistics
iostat -d

# Show with timestamp
iostat -t

# Show statistics for specific device
iostat -p sda

# Combine options
iostat -x -m 2 5

Extended Statistics Explained

rrqm/s: Read requests merged per second
wrqm/s: Write requests merged per second
r/s, w/s: Read/write requests per second
rkB/s, wkB/s: KB read/written per second
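avgrq-sz: Average request size in sectors (areq-sz, in KB, on newer sysstat versions)
avgqu-sz: Average queue length (aqu-sz on newer sysstat versions)
await: Average I/O response time in ms, including time spent in the queue
svctm: Average service time (ms); deprecated and removed in newer sysstat versions
%util: Percentage of elapsed time the device was busy servicing I/O requests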

Real Output Analysis

$ iostat -x -m 1 3
Linux 5.4.0-91-generic (server)     12/07/2025     _x86_64_    (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          12.45    0.00    3.56    1.23    0.00   82.76

Device            r/s     w/s     rMB/s   wMB/s   await  svctm  %util
sda              45.23   23.45     5.67    3.45    2.34   1.23  12.34
sdb               1.23    0.45     0.12    0.04    1.56   0.89   0.56
nvme0n1         123.45   89.67    15.67   11.23    1.23   0.45  45.67
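
I/O Bottleneck Indicators:
1. %util > 70%: Device is heavily used (less reliable for SSD, NVMe, and RAID devices, which serve requests in parallel)
2. await much greater than svctm: Requests are waiting in the queue, so the device is saturated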
3. avgqu-sz > 1: Requests are queuing
4. High iowait in CPU: System waiting for I/O
5. High svctm: Slow device response
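
As a rough sketch of the first indicator, the filter below lists devices whose %util exceeds a threshold. The function name iostat_busy is hypothetical, and the default of 70 simply mirrors the figure in the list above. The %util column is found by header name because the column layout differs between sysstat versions.

```shell
# Sketch: print "device %util" for every device above a utilization threshold.
# Reads `iostat -x` output from stdin; $1 = threshold in percent (default 70).
iostat_busy() {
    local util_max="${1:-70}"
    awk -v util_max="$util_max" '
        # Device table header ("Device" or "Device:" depending on version)
        $1 ~ /^Device/ {
            util = 0
            for (i = 1; i <= NF; i++) if ($i == "%util") util = i
            next
        }
        # A blank line ends the current device table
        NF == 0 { util = 0; next }
        # Inside a device table: compare the %util field numerically
        util && ($util + 0) > util_max { print $1, $util }
    '
}
```

For example, LC_ALL=C iostat -x 1 3 | iostat_busy 70. Setting LC_ALL=C keeps the decimal separator a dot so the numeric comparison behaves as expected.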

3. free - Memory Usage Statistics

Displays the amount of free and used memory in the system, including physical and swap memory.

Syntax: free [options]
Columns: total, used, free, shared, buff/cache, available

Essential Commands

# Basic memory display
free

# Human readable format (KB, MB, GB)
free -h

# Show in megabytes
free -m

# Show in gigabytes
free -g

# Show total line
free -t

# Wide output (shows buffers and cache in separate columns)
free -w

# Continuous monitoring (every 2 seconds)
free -h -s 2

# Human-readable output with a totals line
free -h -t

# Combine options
free -h -t -s 5

Memory Metrics Explained

total: Total installed memory
used: Memory used by processes
free: Unused memory
shared: Memory used by tmpfs
buff/cache: Memory used by kernel buffers/cache
available: Memory available for new applications

Real Output Analysis

$ free -h -t
              total        used        free      shared  buff/cache   available
Mem:           15Gi        4.2Gi       2.1Gi       123Mi       8.7Gi        10Gi
Swap:         2.0Gi       512Mi       1.5Gi
Total:         17Gi        4.7Gi       3.6Gi

Understanding Linux Memory:
1. Don't panic about low 'free': Linux uses free memory for cache
2. Watch 'available': This is memory truly available for apps
3. High buff/cache is good: Means Linux is using memory efficiently
4. Swap usage: Some swap usage is normal, high usage indicates memory pressure
5. Memory leak detection: Monitor 'used' memory growth over time
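
To put a number on the advice about watching 'available', the sketch below computes the share of RAM that is not available to new applications. The function name mem_unavailable_pct is an assumption for illustration. It expects the default numeric output of free (not -h) and finds the 'available' column by header name.

```shell
# Sketch: percentage of total RAM that is NOT available to new applications.
# Reads plain `free` output (numeric units, not -h) from stdin.
mem_unavailable_pct() {
    awk '
        # The header row has no row label, so data rows are shifted right by one field
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "available") avail = i + 1
            next
        }
        $1 == "Mem:" && avail { printf "%.1f\n", ($2 - $avail) / $2 * 100 }
    '
}
```

A possible invocation is LC_ALL=C free | mem_unavailable_pct. Because the calculation is based on 'available' rather than 'free', memory held in reclaimable cache does not inflate the result.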

4. sar - System Activity Reporter

Collects, reports, and saves system activity information. Part of the sysstat package.

Syntax: sar [options] [interval [count]]
Covers: CPU, memory, I/O, network, process, paging

Installation and Setup

# Install sar (Ubuntu/Debian)
sudo apt-get install sysstat

# Install sar (RHEL/CentOS)
sudo yum install sysstat

# Enable data collection
sudo systemctl enable sysstat
sudo systemctl start sysstat

# Configure data retention (Debian/Ubuntu path; RHEL/CentOS uses /etc/sysconfig/sysstat)
sudo vi /etc/sysstat/sysstat
# Set: HISTORY=7 (days to keep data)
# On Debian/Ubuntu, also set ENABLED="true" in /etc/default/sysstat

Essential sar Commands

# CPU usage every 1 second, 3 times
sar 1 3

# Show CPU statistics
sar -u

# Show memory statistics
sar -r

# Show swap statistics
sar -S

# Show I/O statistics
sar -b

# Show paging statistics
sar -B

# Show network statistics
sar -n DEV

# Show queue length and load averages
sar -q

# Show process creation and context switches
sar -w

# Show block device statistics
sar -d

# Read from system activity file (RHEL/CentOS stores these under /var/log/sa/)
sar -f /var/log/sysstat/sa07  # For day 07 of the month

# Show specific time range
sar -s 10:00:00 -e 11:00:00

# Show all available reports
sar -A

Historical Data Analysis

# View yesterday's CPU usage
sar -u -f /var/log/sysstat/sa$(date -d yesterday +%d)

# View a specific day (the 7th of the month; the file name holds only the day number)
sar -u -f /var/log/sysstat/sa07

# View memory usage for specific day
sar -r -f /var/log/sysstat/sa07

# Generate daily report
sar -A -f /var/log/sysstat/sa07 > /tmp/daily-report.txt

# Compare two days
sar -u -f /var/log/sysstat/sa06 > day6.txt
sar -u -f /var/log/sysstat/sa07 > day7.txt
diff day6.txt day7.txt
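
When reviewing a day of history, the busiest interval is often more useful than a line-by-line diff. The sketch below finds it in sar -u output. The function name sar_peak_cpu is hypothetical, and the code assumes %idle is the last column of each row, as in the default sar -u report.

```shell
# Sketch: print the time and CPU busy percentage (100 - %idle) of the busiest
# interval in `sar -u` output read from stdin.
sar_peak_cpu() {
    awk '
        BEGIN { peak = -1 }
        # Keep per-interval rows for the "all" aggregate; skip headers and the Average: row.
        # "all" is field 3 with a 12-hour clock (time + AM/PM) and field 2 with a 24-hour clock.
        $1 != "Average:" && $NF ~ /^[0-9.]+$/ && ($2 == "all" || $3 == "all") {
            busy = 100 - $NF
            if (busy > peak) {
                peak = busy
                when = ($3 == "all") ? ($1 " " $2) : $1
            }
        }
        END { if (when != "") printf "%s %.2f\n", when, peak }
    '
}
```

For example, LC_ALL=C sar -u -f /var/log/sysstat/sa07 | sar_peak_cpu prints the timestamp of the busiest interval followed by its busy percentage.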

5. dstat - Versatile Resource Statistics

Flexible replacement for vmstat, iostat, netstat, and ifstat with color output and CSV export.

Syntax: dstat [options] [delay [count]]
Covers: CPU, disk, network, memory, system, process

Installation

# Ubuntu/Debian
sudo apt-get install dstat

# RHEL/CentOS
sudo yum install dstat

# Install from source (the repository ships a Makefile)
git clone https://github.com/dagwieers/dstat
cd dstat
sudo make install

Essential dstat Commands

# Basic statistics (CPU, disk, network, paging, system)
dstat

# Update every 2 seconds
dstat 2

# Show only CPU and disk
dstat -c -d

# Show with top processes
dstat -c --top-cpu

# Show with top memory processes
dstat -c -d -n -m --top-mem

# Show disk by device
dstat -cdl -D sda,sdb,total

# Show network by interface
dstat -cdn -N eth0,eth1,total

# Output to CSV file
dstat --output /tmp/dstat.csv 2 10

# Show TCP connections
dstat --tcp

# Show memory usage details
dstat -m --vmstat

# Show all available plugins
dstat --list

Custom dstat Views

# Custom columns: CPU, Memory, Disk, Network
dstat -tc -m -d -n

# With process information
dstat -tc -m -d -n --proc-count --top-cpu

# Monitor specific metrics only
dstat --time --cpu --mem --net --disk --tcp --vm

# Show integer values only (no decimals)
dstat -cdngy --integer

# Monitor for 30 seconds, update every 2
dstat -tcmd 2 15
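
The count in the last example is simply the duration divided by the delay (30 / 2 = 15). Since vmstat, iostat, sar, and dstat all accept a delay and a count, a tiny helper can do the arithmetic. The name samples_for is an assumption for illustration, and it rounds up when the duration is not an exact multiple of the delay.

```shell
# Sketch: number of samples needed to cover a duration at a given delay.
# $1 = duration in seconds, $2 = delay in seconds; rounds up.
samples_for() {
    local duration="$1" interval="$2"
    echo $(( (duration + interval - 1) / interval ))
}
```

For example, dstat -tcmd 2 "$(samples_for 30 2)" is equivalent to the command above. The covered time is approximate, because some tools print their first report immediately.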

6. iotop - I/O Monitor by Process

Shows I/O usage by processes/threads in real time, similar to top for I/O.

Syntax: iotop [options]
Columns: process, PID, read/s, write/s, swapin, IO%

Installation

# Ubuntu/Debian
sudo apt-get install iotop

# RHEL/CentOS
sudo yum install iotop

# Run with necessary privileges
sudo iotop

Essential iotop Commands

# Basic interactive mode (requires sudo)
sudo iotop

# Batch mode for scripting
sudo iotop -b

# Update every 2 seconds
sudo iotop -d 2

# Show only processes doing I/O
sudo iotop -o

# Show accumulated I/O
sudo iotop -a

# Show processes only instead of all threads
sudo iotop -P

# Show in kilobytes
sudo iotop -k

# Show timestamp
sudo iotop -t

# Combine options
sudo iotop -bot -d 2 -n 5

Interactive Mode Controls

r: Reverse sort order
o: Only show active I/O
p: Show processes only
a: Show accumulated I/O
q: Quit
Left/Right: Change sort column

7. iftop - Network Bandwidth Monitor

Displays bandwidth usage on an interface by host, similar to top for network.

Syntax: iftop [options] [interface]
Columns: source host, destination host, 2s rate, 10s rate, 40s rate, cumulative

Installation

# Ubuntu/Debian
sudo apt-get install iftop

# RHEL/CentOS
sudo yum install iftop

# Run (specify interface if multiple)
sudo iftop -i eth0

Essential iftop Commands

# Monitor specific interface
sudo iftop -i eth0

# Show port numbers
sudo iftop -P

# Show in bytes per second (default is bits)
sudo iftop -B

# Don't resolve hostnames (faster)
sudo iftop -n

# Don't resolve port names
sudo iftop -N

# Combine options
sudo iftop -i eth0 -n -N -B

# Filter by network
sudo iftop -F 192.168.1.0/24

# Filter by port
sudo iftop -f "port 80 or port 443"

Interactive Mode Controls

h: Show help
n: Toggle DNS resolution
s: Show source host
d: Show destination host
t: Cycle through display modes
p: Toggle port display
P: Pause display
1/2/3: Sort by 1st/2nd/3rd column
</>: Sort by source/destination name
j/k: Scroll display
q: Quit

Tool Comparison Guide

Tool     Best For               Real-time   Historical   Overhead   Ease of Use
vmstat   System-wide overview   ✅          ❌           Very Low   Easy
iostat   Disk I/O analysis      ✅          ❌           Low        Medium
free     Memory usage           ✅          ❌           Very Low   Very Easy
sar      Historical analysis    ✅          ✅           Low        Medium
dstat    Custom monitoring      ✅          ❌           Medium     Medium
iotop    Process I/O            ✅          ❌           Medium     Easy
iftop    Network traffic        ✅          ❌           Medium     Easy

Quick Reference Cheat Sheet

vmstat 1 10
System stats every second for 10 iterations
iostat -x -m 2
Extended disk stats in MB every 2 seconds
free -h -s 5
Memory usage in human format every 5 seconds
sar -u -f /var/log/sysstat/sa07
CPU usage for the 7th of the month from historical data
dstat -tc -m -d -n
CPU, memory, disk, network with timestamp
sudo iotop -o -b
Batch mode showing only active I/O processes
sudo iftop -i eth0 -n -N
Network traffic on eth0 without DNS/port resolution

Real-World Troubleshooting Scenarios

Scenario 1: Slow System Response
# Step 1: Check CPU and memory
vmstat 1 5

# Step 2: Check I/O wait
iostat -x 1 3

# Step 3: Check which process is causing I/O
sudo iotop -o

# Step 4: Check memory pressure
free -h
sar -r 1 3

Scenario 2: High Network Usage
# Step 1: Identify top talkers
sudo iftop -i eth0 -n

# Step 2: Check network errors
sar -n EDEV 1 3

# Step 3: Check TCP connections
sar -n TCP 1 3

# Step 4: Monitor overall bandwidth
dstat -n -N eth0,total 1

Scenario 3: Disk Performance Issues
# Step 1: Check disk utilization
iostat -x -m 1

# Step 2: Check await times for every device (device report only)
iostat -dx 1 3

# Step 3: Identify process causing disk I/O
sudo iotop -o -b -d 2

# Step 4: Check filesystem cache
free -h
sar -r 1 3

Monitoring Script Examples

Comprehensive Monitoring Script

#!/bin/bash
# comprehensive-monitor.sh
# Collects data from multiple monitoring tools into one timestamped log

LOG_DIR="/var/log/monitoring"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
LOG_FILE="$LOG_DIR/monitor_$TIMESTAMP.log"

# /var/log normally requires root; stop early if the directory is unusable
mkdir -p "$LOG_DIR" || { echo "Cannot create $LOG_DIR" >&2; exit 1; }

echo "=== Comprehensive System Monitoring - $(date) ==="
echo "Logging to: $LOG_FILE"

# Everything inside the braces is appended to the log file
{
    # Collect vmstat data
    echo "=== vmstat (system overview) ==="
    vmstat 1 3

    # Collect iostat data
    echo ""
    echo "=== iostat (disk I/O) ==="
    iostat -x -m 1 3

    # Collect memory data
    echo ""
    echo "=== Memory Usage ==="
    free -h

    # Collect process I/O (requires root)
    if [ "$EUID" -eq 0 ]; then
        echo ""
        echo "=== Top I/O Processes ==="
        iotop -b -n 2 -d 1 2>/dev/null
    fi

    # Collect network stats
    echo ""
    echo "=== Network Statistics ==="
    sar -n DEV 1 1 2>/dev/null
} >> "$LOG_FILE"

echo "Monitoring completed. Report: $LOG_FILE"

Automated Alert Script

#!/bin/bash
# alert-monitor.sh
# Monitors system and sends alerts for critical conditions

THRESHOLD_CPU=90
THRESHOLD_MEMORY=90
THRESHOLD_DISK=85
ALERT_EMAIL="admin@example.com"

# Succeeds when value $1 is greater than threshold $2 (handles decimals without bc)
exceeds() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v > t) }'
}

check_cpu() {
    # Use the second vmstat sample (the first is the average since boot);
    # column 15 is 'id' (idle) in the default vmstat layout
    CPU_USAGE=$(vmstat 1 2 | tail -1 | awk '{print 100 - $15}')
    if exceeds "$CPU_USAGE" "$THRESHOLD_CPU"; then
        echo "ALERT: High CPU usage: ${CPU_USAGE}%"
        return 1
    fi
    return 0
}

check_memory() {
    # Base the figure on 'available' (column 7) so reclaimable cache is not counted as used
    MEM_PERCENT=$(free | awk '/^Mem:/ {printf "%.1f", ($2 - $7) / $2 * 100}')
    if exceeds "$MEM_PERCENT" "$THRESHOLD_MEMORY"; then
        echo "ALERT: High memory usage: ${MEM_PERCENT}%"
        return 1
    fi
    return 0
}

check_disk() {
    # -P forces one line per filesystem so long device names cannot wrap
    DISK_USAGE=$(df -P / | awk 'NR==2 {print $5}' | sed 's/%//')
    if [ "$DISK_USAGE" -ge "$THRESHOLD_DISK" ]; then
        echo "ALERT: High disk usage on /: ${DISK_USAGE}%"
        return 1
    fi
    return 0
}

# Run checks (each function returns 1 when its threshold is exceeded)
ALERTS=""
if ! check_cpu; then ALERTS="$ALERTS\n- High CPU"; fi
if ! check_memory; then ALERTS="$ALERTS\n- High Memory"; fi
if ! check_disk; then ALERTS="$ALERTS\n- High Disk"; fi

# Send alert if any issues (requires a working 'mail' command)
if [ -n "$ALERTS" ]; then
    echo -e "System alerts detected on $(hostname) at $(date):$ALERTS" | \
    mail -s "SYSTEM ALERT: $(hostname)" "$ALERT_EMAIL"
    echo "Alerts sent to $ALERT_EMAIL"
fi

Monitoring Best Practices Checklist

  • Use vmstat for quick system overview
  • Use iostat for disk I/O analysis
  • Use free for memory status
  • Use sar for historical data
  • Use dstat for customizable views
  • Use iotop to identify I/O intensive processes
  • Use iftop for network traffic analysis
  • Establish baseline metrics
  • Set appropriate monitoring intervals
  • Log data for trend analysis
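
One way to establish a baseline is to average a single vmstat column over a sampling window taken during normal operation. The sketch below does that. The function name vmstat_avg is hypothetical, and skipping the first sample is a design choice made here because that sample reports averages since boot.

```shell
# Sketch: average one vmstat column (for example us, wa, or cs) over the samples
# read from stdin, ignoring the first sample (averages since boot).
# $1 = column name as shown in the vmstat header row
vmstat_avg() {
    local name="$1"
    awk -v name="$name" '
        # Header row (vmstat reprints it periodically): locate the requested column
        $1 == "r" && $2 == "b" {
            c = 0
            for (i = 1; i <= NF; i++) if ($i == name) c = i
            next
        }
        # Data rows start with a number
        c && $1 ~ /^[0-9]+$/ {
            if (++seen == 1) next
            sum += $c
            n++
        }
        END {
            if (n) { printf "%.2f\n", sum / n } else { exit 1 }
        }
    '
}
```

For example, vmstat 5 13 | vmstat_avg cs averages context switches over roughly one minute. Recording such figures at several times of day gives a reference point for the thresholds used in alert scripts.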

Getting Started with Monitoring

Follow this step-by-step approach to master Linux monitoring tools:

  1. Start with basics: Master free, vmstat, and iostat first
  2. Install sysstat: Set up sar for historical data collection
  3. Learn interactive tools: Practice with iotop and iftop
  4. Create custom views: Use dstat to build custom monitoring dashboards
  5. Establish baselines: Monitor your systems during normal operation
  6. Set up alerts: Create scripts to notify you of issues
  7. Practice troubleshooting: Use the scenarios in this guide
  8. Document findings: Keep notes on normal ranges and thresholds
  9. Automate reporting: Schedule regular system health reports
  10. Stay updated: Learn new tools and techniques regularly

Master the Art of System Monitoring

These monitoring tools are essential for any Linux administrator or DevOps professional. By mastering vmstat, iostat, free, sar, dstat, iotop, and iftop, you gain deep visibility into system performance and can quickly identify and resolve issues.

Remember: Effective monitoring is about asking the right questions. Start with "What's the overall system health?" (vmstat), then drill down to specific areas: "Is disk I/O the bottleneck?" (iostat), "Which process is causing it?" (iotop).

Next Steps: Install all these tools on your system and spend 30 minutes each day practicing with them. Create a monitoring script that collects data from all tools and generates a daily health report. Soon, you'll be diagnosing system issues with confidence and precision.