Linux Log Monitoring: journalctl & /var/log/ Mastery

Master the art of Linux log analysis with this comprehensive guide to journalctl and the /var/log/ directory. Learn to monitor, filter, analyze, and troubleshoot system issues using the log analysis techniques relied on by DevOps professionals worldwide.

Why Log Monitoring is Critical

Logs are the primary source of truth for system behavior, security incidents, and performance issues. Effective log monitoring enables you to:

  • Debug Issues: Identify root causes of system failures
  • Monitor Security: Detect intrusion attempts and suspicious activities
  • Track Performance: Identify bottlenecks and optimize system resources
  • Comply with Regulations: Maintain audit trails for compliance requirements
  • Predict Problems: Spot patterns that indicate impending issues

1. Understanding /var/log/ Directory Structure

/var/log/
├── auth.log   - Authentication logs
├── syslog     - General system messages
├── kern.log   - Kernel messages
├── dpkg.log   - Package management logs
├── boot.log   - Boot process messages
├── apache2/   - Apache web server logs
│   ├── access.log - HTTP requests
│   └── error.log  - Error messages
├── nginx/     - Nginx web server logs
├── mysql/     - Database server logs
└── journal/   - Systemd journal (binary)
Authentication Logs: /var/log/auth.log

Tracks authentication attempts, sudo usage, SSH logins, failed logins, PAM events, and other security activity.

System Logs: /var/log/syslog

General system messages from services and system components: service events, system errors, cron jobs, and daemon logs.

Kernel Logs: /var/log/kern.log

Kernel-level messages including hardware errors, driver problems, kernel panics, and dmesg output.

Essential Log File Commands

# View last 100 lines of syslog
tail -100 /var/log/syslog

# Follow log file in real-time
tail -f /var/log/auth.log

# Search for errors in log file
grep -i error /var/log/syslog

# Filter log entries by date
grep "Dec  7" /var/log/syslog

# Count occurrences of a pattern
grep -c "Failed password" /var/log/auth.log

# View compressed log files
zcat /var/log/syslog.1.gz | grep error

# Monitor multiple log files
tail -f /var/log/syslog /var/log/auth.log

# Find large log files
find /var/log -type f -size +100M

# Check log rotation status
ls -la /var/log/*.log

2. journalctl: The Modern Systemd Journal

journalctl is the primary tool for interacting with systemd's journal, a centralized logging system that collects logs from the kernel, system services, and user applications.

Basic journalctl Commands

# View entire journal (oldest entries first)
journalctl

# View newest entries first (reverse order)
journalctl -r

# Follow new entries in real-time
journalctl -f

# Show logs from today
journalctl --since today

# Show logs from yesterday
journalctl --since yesterday --until today

# Show logs for specific time range
journalctl --since "2025-12-07 09:00:00" --until "2025-12-07 17:00:00"

# Show logs from the current boot
journalctl -b

# Show logs from the previous boot
journalctl -b -1

# Show kernel messages only
journalctl -k

# Show emergency priority messages
journalctl -p emerg

# Show logs with full output (no paging)
journalctl --no-pager

Filtering and Searching

# Show logs for specific service
journalctl -u nginx.service
journalctl -u ssh.service

# Show logs for multiple services
journalctl -u nginx.service -u mysql.service

# Filter by priority (shows the given level and anything more severe)
journalctl -p err          # err, crit, alert, emerg
journalctl -p warning      # warning and above
journalctl -p info         # info and above
journalctl -p debug        # everything

# Combine filters
journalctl -u nginx -p err --since today

# Show logs for specific user
journalctl _UID=1000

# Show logs for specific process ID
journalctl _PID=1234

# Show logs from specific executable
journalctl /usr/sbin/sshd

# Filter on the _SYSTEMD_UNIT journal field directly (equivalent to -u)
journalctl _SYSTEMD_UNIT=sshd.service

# Search for specific text
journalctl --grep="error"
journalctl --grep="failed"
journalctl --grep="authentication failure"

# Case-insensitive search (all-lowercase patterns already match
# case-insensitively; force it explicitly with --case-sensitive=false)
journalctl --grep "error" --case-sensitive=false

# Show logs with specific field
journalctl FIELD=value
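Before filtering on an arbitrary field, it helps to discover which fields and values the journal actually contains. A minimal sketch (the snippet skips itself on non-systemd hosts, and `head` just keeps the output short):

```shell
# List the field names and values available for FIELD=value filtering.
# No-op on hosts without journalctl (e.g. non-systemd systems).
if command -v journalctl >/dev/null 2>&1; then
    journalctl -N | sort | head -20         # all field names (e.g. _PID, _UID)
    journalctl -F _SYSTEMD_UNIT | head -20  # every value seen for one field
fi
fields_demo_done=1
```

`-N`/`--fields` and `-F FIELD`/`--field` are standard journalctl options; any name they report can be dropped straight into a `journalctl FIELD=value` filter.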

Output Formatting Options

# Show with full timestamps
journalctl --utc           # UTC timestamps
journalctl -o short-full   # Full-format timestamps (same syntax as --since/--until)
journalctl -o verbose      # Verbose output with all fields
journalctl -o json         # JSON output (for scripting)
journalctl -o json-pretty  # Pretty JSON output
journalctl -o cat          # Simple output (no metadata)
journalctl -o with-unit    # Show unit names

# Show specific fields only
journalctl -o json --output-fields=MESSAGE,_PID,_UID

# Export logs to file
journalctl --since today > /tmp/today-logs.txt
journalctl --since today --output=json > /tmp/today-logs.json

# Show disk usage
journalctl --disk-usage

# Vacuum/clean journal
journalctl --vacuum-size=1G   # Keep only 1GB of logs
journalctl --vacuum-time=30d  # Keep only last 30 days
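Vacuuming is a one-off cleanup; to cap journal growth permanently, the same limits can be set in a journald drop-in. A sketch (the file name and values below are illustrative, not defaults):

```ini
# /etc/systemd/journald.conf.d/size-limits.conf
# Example values -- tune for your disk budget.
[Journal]
# Cap total persistent journal size
SystemMaxUse=1G
# Drop entries older than 30 days
MaxRetentionSec=30day
```

Apply the new limits with `systemctl restart systemd-journald`.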

3. Real-World Log Analysis Examples

Sample log entries showing common event types:

Dec 7 09:30:45 server1 sshd[1234]: Accepted password for root from 192.168.1.100 port 54322 ssh2
Dec 7 09:31:12 server1 sudo: root : TTY=pts/0 ; PWD=/root ; USER=www-data ; COMMAND=/bin/systemctl restart nginx
Dec 7 09:32:05 server1 nginx: 2025/12/07 09:32:05 [error] 2345#2345: *1234 upstream timed out (110: Connection timed out)
Dec 7 09:33:22 server1 kernel: [12345.67890] CPU0: Core temperature above threshold, cpu clock throttled
Dec 7 09:34:15 server1 cron[3456]: (root) CMD (/usr/local/bin/backup.sh)

Practical Analysis Commands

# Find failed SSH login attempts
grep "Failed password" /var/log/auth.log
journalctl -u ssh --grep="Failed password"

# Count failed SSH attempts by IP
grep "Failed password" /var/log/auth.log | awk '{print $11}' | sort | uniq -c | sort -nr

# Find root login attempts
grep "root" /var/log/auth.log | grep "Failed password"
journalctl -u ssh --grep="Failed password for root"

# Check for brute force attacks
journalctl -u ssh --since "1 hour ago" | grep -c "Failed password"

# Find service failures
journalctl -p err --since today
journalctl -p err -u nginx -u mysql -u apache2

# Check disk space warnings
journalctl --grep "disk full|no space|filesystem full"

# Monitor OOM killer activity
journalctl --grep="killed process"
dmesg | grep -i "oom\|out of memory"

# Find hardware errors
journalctl -k -p err
dmesg | grep -i "error\|fail\|warning"

# Check system reboots
journalctl --list-boots
last reboot

# Monitor cron job executions
grep "CMD" /var/log/syslog | grep -i "cron"
journalctl -u cron --since today
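The "count failed attempts by IP" pipeline above can be sanity-checked against sample data. Note that awk field $11 holds the source address only for valid usernames; for unknown accounts sshd logs "invalid user NAME", which shifts the IP to $13:

```shell
# Demo of the failed-password pipeline on sample auth.log lines.
sample='Dec  7 09:30:01 server1 sshd[101]: Failed password for root from 192.168.1.100 port 50000 ssh2
Dec  7 09:30:05 server1 sshd[102]: Failed password for root from 192.168.1.100 port 50001 ssh2
Dec  7 09:30:09 server1 sshd[103]: Failed password for admin from 10.0.0.5 port 50002 ssh2'

# Extract field 11 (the source IP for valid-user failures),
# then count and rank by frequency
top_ip=$(printf '%s\n' "$sample" | grep "Failed password" | \
    awk '{print $11}' | sort | uniq -c | sort -nr | head -1 | awk '{print $2}')
echo "$top_ip"   # 192.168.1.100
```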

4. Log Rotation Management

Log rotation prevents log files from consuming all disk space. Linux uses logrotate to manage log rotation.

logrotate Configuration

# View logrotate configuration files
ls -la /etc/logrotate.d/
cat /etc/logrotate.conf

# Common logrotate configuration file
cat /etc/logrotate.d/rsyslog

# Example logrotate configuration
cat > /etc/logrotate.d/myapp << 'EOF'
/var/log/myapp/*.log {
    daily
    missingok
    rotate 30
    compress
    delaycompress
    notifempty
    create 640 www-data www-data
    sharedscripts
    postrotate
        systemctl reload myapp > /dev/null 2>&1 || true
    endscript
}
EOF

# Test logrotate configuration
logrotate -d /etc/logrotate.d/rsyslog

# Force log rotation
logrotate -f /etc/logrotate.conf

# View rotation status
ls -la /var/log/*.log
ls -la /var/log/*.gz

Manual Log Rotation Script

#!/bin/bash
# manual-log-rotate.sh
# Manual log rotation with compression and cleanup

LOG_DIR="/var/log/myapp"
RETENTION_DAYS=30
COMPRESS_AFTER_DAYS=7

# Create backup directory with timestamp
BACKUP_DIR="/backups/logs/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"

echo "Starting log rotation at $(date)"

# Rotate current logs
for log_file in "$LOG_DIR"/*.log; do
    if [ -f "$log_file" ] && [ -s "$log_file" ]; then
        # Create backup (keep the .log suffix so the compress/cleanup
        # find patterns below still match)
        backup_file="$BACKUP_DIR/$(basename "$log_file" .log)_$(date +%H%M%S).log"
        cp "$log_file" "$backup_file"
        
        # Truncate original log
        > "$log_file"
        
        echo "Rotated: $log_file -> $backup_file"
    fi
done

# Compress old backups
find "/backups/logs" -name "*.log" -type f -mtime +$COMPRESS_AFTER_DAYS ! -name "*.gz" | \
while read -r old_log; do
    if gzip "$old_log"; then
        echo "Compressed: $old_log.gz"
    fi
done

# Cleanup old backups
find "/backups/logs" -name "*.log.gz" -type f -mtime +$RETENTION_DAYS -delete
echo "Cleaned up backups older than $RETENTION_DAYS days"

echo "Log rotation completed at $(date)"
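The script above lends itself to scheduling with cron (the install path /usr/local/bin/manual-log-rotate.sh and the 02:30 schedule are assumptions, not part of the script):

```cron
# Run the manual rotation daily at 02:30 and keep its output for audit
30 2 * * * /usr/local/bin/manual-log-rotate.sh >> /var/log/myapp/rotate.log 2>&1
```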

5. Advanced Log Monitoring Scripts

Real-time Log Monitoring with Alerting

#!/bin/bash
# log-monitor.sh
# Real-time log monitoring with alerting

LOG_FILES=(
    "/var/log/auth.log"
    "/var/log/syslog"
    "/var/log/nginx/error.log"
    "/var/log/mysql/error.log"
)

ALERT_PATTERNS=(
    "Failed password"
    "authentication failure"
    "error"
    "critical"
    "fatal"
    "panic"
    "segmentation fault"
    "out of memory"
    "disk full"
    "connection refused"
)

ALERT_EMAIL="admin@example.com"
ALERT_THRESHOLD=5  # Number of occurrences before alert
CHECK_INTERVAL=60  # Seconds between checks

# Function to check logs
check_logs() {
    local log_file=$1
    local pattern=$2
    
    # Approximate recent activity: count matches in the last 1000 lines
    # (grep -c already prints a bare count)
    local count=$(tail -n 1000 "$log_file" 2>/dev/null | grep -c "$pattern")
    
    if [ "$count" -ge "$ALERT_THRESHOLD" ]; then
        send_alert "$log_file" "$pattern" "$count"
    fi
}

# Function to send alert
send_alert() {
    local log_file=$1
    local pattern=$2
    local count=$3
    
    local subject="LOG ALERT: $pattern detected in $log_file"
    local message="Pattern: $pattern\nCount: $count\nLog File: $log_file\nTime: $(date)\n\nLast 10 matches:\n$(grep "$pattern" "$log_file" | tail -10)"
    
    echo -e "$message" | mail -s "$subject" "$ALERT_EMAIL"
    echo "[$(date)] Alert sent: $pattern in $log_file ($count occurrences)"
}

# Main monitoring loop
echo "Starting log monitoring at $(date)"
echo "Monitoring files: ${LOG_FILES[*]}"
echo "Alert patterns: ${ALERT_PATTERNS[*]}"

while true; do
    for log_file in "${LOG_FILES[@]}"; do
        if [ -f "$log_file" ]; then
            for pattern in "${ALERT_PATTERNS[@]}"; do
                check_logs "$log_file" "$pattern"
            done
        fi
    done
    sleep "$CHECK_INTERVAL"
done
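To keep the monitoring loop running across reboots, it can be wrapped in a systemd service. A sketch; the unit name and the script path /usr/local/bin/log-monitor.sh are assumptions:

```ini
# /etc/systemd/system/log-monitor.service
[Unit]
Description=Real-time log monitoring with alerting
After=network.target

[Service]
ExecStart=/usr/local/bin/log-monitor.sh
Restart=on-failure
User=root

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable --now log-monitor.service`.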

Comprehensive Log Analysis Report

#!/bin/bash
# log-analysis-report.sh
# Generate comprehensive log analysis report

REPORT_FILE="/var/log/analysis/$(date +%Y%m%d_%H%M%S)-report.txt"
mkdir -p "/var/log/analysis"

echo "=== Log Analysis Report - $(date) ===" > "$REPORT_FILE"
echo "Hostname: $(hostname)" >> "$REPORT_FILE"
echo "" >> "$REPORT_FILE"

# 1. Authentication Analysis
echo "1. AUTHENTICATION ANALYSIS" >> "$REPORT_FILE"
echo "==========================" >> "$REPORT_FILE"

# Failed SSH attempts
FAILED_SSH=$(grep -c "Failed password" /var/log/auth.log 2>/dev/null)
FAILED_SSH=${FAILED_SSH:-0}  # default to 0 if the log file is missing
echo "Failed SSH attempts: $FAILED_SSH" >> "$REPORT_FILE"

# Failed SSH by IP
echo "" >> "$REPORT_FILE"
echo "Top 10 IPs with failed SSH attempts:" >> "$REPORT_FILE"
grep "Failed password" /var/log/auth.log 2>/dev/null | \
    awk '{print $11}' | sort | uniq -c | sort -nr | head -10 >> "$REPORT_FILE"

# Successful root logins
ROOT_LOGINS=$(grep "Accepted password for root" /var/log/auth.log 2>/dev/null | wc -l)
echo "" >> "$REPORT_FILE"
echo "Successful root logins: $ROOT_LOGINS" >> "$REPORT_FILE"

# 2. System Errors
echo "" >> "$REPORT_FILE"
echo "2. SYSTEM ERRORS" >> "$REPORT_FILE"
echo "================" >> "$REPORT_FILE"

# Journal errors today
echo "Systemd journal errors (today):" >> "$REPORT_FILE"
journalctl -p err --since today 2>/dev/null | tail -20 >> "$REPORT_FILE"

# Kernel errors
echo "" >> "$REPORT_FILE"
echo "Kernel errors (last 24h):" >> "$REPORT_FILE"
journalctl -k -p err --since "24 hours ago" 2>/dev/null | tail -10 >> "$REPORT_FILE"

# 3. Service Status
echo "" >> "$REPORT_FILE"
echo "3. SERVICE STATUS" >> "$REPORT_FILE"
echo "=================" >> "$REPORT_FILE"

SERVICES=("nginx" "mysql" "ssh" "docker" "cron")
for service in "${SERVICES[@]}"; do
    echo "" >> "$REPORT_FILE"
    echo "Service: $service" >> "$REPORT_FILE"
    journalctl -u "$service" --since "1 hour ago" 2>/dev/null | \
        grep -E "(error|fail|stop|start)" | tail -5 >> "$REPORT_FILE"
done

# 4. Disk Space and Log Sizes
echo "" >> "$REPORT_FILE"
echo "4. DISK AND LOG STATISTICS" >> "$REPORT_FILE"
echo "==========================" >> "$REPORT_FILE"

echo "Disk usage for /var/log:" >> "$REPORT_FILE"
du -sh /var/log/* 2>/dev/null | sort -hr >> "$REPORT_FILE"

echo "" >> "$REPORT_FILE"
echo "Large log files (>100MB):" >> "$REPORT_FILE"
find /var/log -type f -size +100M 2>/dev/null | xargs ls -lh >> "$REPORT_FILE"

# 5. Summary
echo "" >> "$REPORT_FILE"
echo "5. SUMMARY" >> "$REPORT_FILE"
echo "==========" >> "$REPORT_FILE"

TOTAL_ERRORS=$((FAILED_SSH + $(journalctl -p err --since today 2>/dev/null | wc -l)))
echo "Total errors detected: $TOTAL_ERRORS" >> "$REPORT_FILE"

if [ "$TOTAL_ERRORS" -gt 100 ]; then
    echo "⚠️  HIGH ERROR COUNT: System may have issues" >> "$REPORT_FILE"
elif [ "$TOTAL_ERRORS" -gt 10 ]; then
    echo "⚠️  Moderate error count: Monitor closely" >> "$REPORT_FILE"
else
    echo "✅ Normal error count: System appears healthy" >> "$REPORT_FILE"
fi

echo "" >> "$REPORT_FILE"
echo "Report generated: $REPORT_FILE" >> "$REPORT_FILE"
echo "Analysis completed at $(date)" >> "$REPORT_FILE"

echo "Report generated: $REPORT_FILE"

6. Troubleshooting Common Issues

SSH Connection Refused
  Log location:  /var/log/auth.log
  Diagnose:      journalctl -u ssh; grep "sshd" /var/log/auth.log
  Common causes: SSH service stopped, firewall blocking, configuration error

Website Not Loading
  Log location:  /var/log/nginx/*.log, /var/log/apache2/*.log
  Diagnose:      tail -f error.log; journalctl -u nginx
  Common causes: Web server down, port conflict, permissions issue

Database Connection Issues
  Log location:  /var/log/mysql/error.log
  Diagnose:      tail -f error.log; journalctl -u mysql
  Common causes: MySQL service down, max connections reached, disk full

High Memory Usage
  Log location:  /var/log/syslog, dmesg
  Diagnose:      grep -i "oom\|kill" /var/log/syslog; dmesg | grep -i memory
  Common causes: Memory leak, insufficient RAM, swap misconfigured

Disk Full
  Log location:  /var/log/syslog
  Diagnose:      grep -i "disk full\|no space" /var/log/syslog; df -h
  Common causes: Log files growing, large temporary files, backup failures

7. Security Monitoring Scripts

SSH Intrusion Detection

#!/bin/bash
# ssh-intrusion-detector.sh
# Monitor SSH logs for intrusion attempts

LOG_FILE="/var/log/auth.log"
ALERT_EMAIL="security@example.com"
THRESHOLD=10  # Failed attempts threshold
TIME_WINDOW=300  # 5 minutes in seconds

# Function to check for brute force attacks
check_ssh_bruteforce() {
    local current_time=$(date +%s)
    local window_start=$((current_time - TIME_WINDOW))
    
    # Build a syslog-style timestamp for a lexicographic comparison.
    # NOTE: this string comparison is approximate and misorders entries
    # across midnight and month boundaries.
    local start_date=$(date -d "@$window_start" "+%b %-d %H:%M")
    
    # Count failed attempts in time window
    local failed_count=$(grep "Failed password" "$LOG_FILE" 2>/dev/null | \
        awk -v start="$start_date" '$1" "$2" "$3 >= start' | wc -l)
    
    if [ "$failed_count" -ge "$THRESHOLD" ]; then
        # Get attacking IPs
        local attacking_ips=$(grep "Failed password" "$LOG_FILE" 2>/dev/null | \
            awk -v start="$start_date" '$1" "$2" "$3 >= start' | \
            awk '{print $11}' | sort | uniq -c | sort -nr)
        
        send_ssh_alert "$failed_count" "$attacking_ips"
        
        # Optional: Block IPs with iptables
        # echo "$attacking_ips" | while read count ip; do
        #     iptables -A INPUT -s "$ip" -j DROP
        # done
    fi
}

send_ssh_alert() {
    local count=$1
    local ips=$2
    
    local subject="SSH BRUTE FORCE DETECTED: $count failed attempts"
    local message="SSH brute force attack detected!\n\n"
    message+="Failed attempts in last 5 minutes: $count\n\n"
    message+="Attacking IPs:\n$ips\n\n"
    message+="Time: $(date)\n"
    message+="Host: $(hostname)\n"
    
    echo -e "$message" | mail -s "$subject" "$ALERT_EMAIL"
    echo "[$(date)] SSH alert sent: $count failed attempts"
}

# Main monitoring loop
echo "Starting SSH intrusion detection at $(date)"
while true; do
    check_ssh_bruteforce
    sleep 60  # Check every minute
done
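The syslog-timestamp string comparison above is only approximate. On systemd hosts, a simpler and exact alternative is to let journalctl bound the time window (guarded here so it is a no-op where journalctl is absent):

```shell
# Count failed SSH logins in an exact 5-minute window via journalctl,
# avoiding fragile lexicographic timestamp comparisons.
if command -v journalctl >/dev/null 2>&1; then
    failed_count=$(journalctl -u ssh --since "5 minutes ago" 2>/dev/null | \
        grep -c "Failed password")
else
    failed_count=0   # non-systemd host: fall back to the log-file approach
fi
echo "Failed SSH attempts in last 5 minutes: $failed_count"
```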

File Integrity Monitoring

#!/bin/bash
# file-integrity-monitor.sh
# Monitor critical files for unauthorized changes

CRITICAL_FILES=(
    "/etc/passwd"
    "/etc/shadow"
    "/etc/sudoers"
    "/etc/ssh/sshd_config"
    "/etc/hosts"
    "/etc/crontab"
)

BASELINE_DIR="/var/log/security/baseline"
mkdir -p "$BASELINE_DIR"

# Create baseline hashes
create_baseline() {
    for file in "${CRITICAL_FILES[@]}"; do
        if [ -f "$file" ]; then
            sha256sum "$file" > "$BASELINE_DIR/$(basename "$file").baseline"
        fi
    done
    echo "Baseline created at $(date)"
}

# Check for changes
check_integrity() {
    local changes_detected=false
    
    for file in "${CRITICAL_FILES[@]}"; do
        local baseline_file="$BASELINE_DIR/$(basename "$file").baseline"
        
        if [ -f "$file" ] && [ -f "$baseline_file" ]; then
            local current_hash=$(sha256sum "$file" | awk '{print $1}')
            local baseline_hash=$(awk '{print $1}' "$baseline_file")
            
            if [ "$current_hash" != "$baseline_hash" ]; then
                echo "ALERT: File modified: $file"
                echo "  Old hash: $baseline_hash"
                echo "  New hash: $current_hash"
                echo "  Date: $(date)"
                echo "  Last modified: $(stat -c %y "$file")"
                echo ""
                
                changes_detected=true
            fi
        fi
    done
    
    if [ "$changes_detected" = true ]; then
        # Send alert
        echo "Critical file changes detected at $(date)" | \
            mail -s "FILE INTEGRITY ALERT: $(hostname)" "security@example.com"
    fi
}

# Create baseline if it doesn't exist
if [ ! -f "$BASELINE_DIR/.created" ]; then
    create_baseline
    touch "$BASELINE_DIR/.created"
fi

# Check integrity
check_integrity
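The baseline/compare flow can be exercised end-to-end on a throwaway file, without touching anything under /etc:

```shell
# Self-contained demo of the baseline/compare logic on a temp file.
tmpdir=$(mktemp -d)
echo "original contents" > "$tmpdir/demo.conf"

# 1. Record the baseline hash
sha256sum "$tmpdir/demo.conf" | awk '{print $1}' > "$tmpdir/demo.baseline"

# 2. Simulate an unauthorized edit
echo "tampered" >> "$tmpdir/demo.conf"

# 3. Compare the current hash against the baseline
current=$(sha256sum "$tmpdir/demo.conf" | awk '{print $1}')
baseline=$(cat "$tmpdir/demo.baseline")
if [ "$current" != "$baseline" ]; then
    verdict="MODIFIED"
else
    verdict="UNCHANGED"
fi
echo "$verdict"   # MODIFIED
rm -rf "$tmpdir"
```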

Quick Reference Cheat Sheet

tail -f /var/log/syslog
Follow system log in real-time
journalctl -f
Follow systemd journal in real-time
journalctl -u nginx --since today
Show nginx logs from today
grep "Failed password" /var/log/auth.log
Find failed SSH login attempts
journalctl -p err -b
Show errors from current boot
journalctl --disk-usage
Show journal disk usage
journalctl --vacuum-size=500M
Limit journal to 500MB
dmesg | grep -i error
Show kernel errors
logrotate -f /etc/logrotate.conf
Force log rotation
find /var/log -size +100M
Find log files larger than 100MB

Log Monitoring Best Practices

  • Regularly monitor authentication logs for failed attempts
  • Set up log rotation to prevent disk space issues
  • Use centralized logging for multiple servers
  • Implement alerting for critical errors
  • Regularly review and archive old logs
  • Secure log files with appropriate permissions
  • Use journalctl for systemd-based systems
  • Monitor disk space used by logs
  • Create baseline hashes for critical configuration files
  • Document log analysis procedures

Getting Started with Log Monitoring

Follow this step-by-step approach to master Linux log monitoring:

  1. Explore /var/log: Familiarize yourself with the directory structure
  2. Learn journalctl: Master basic commands and filtering
  3. Set up monitoring: Create scripts to monitor critical logs
  4. Implement alerting: Set up email alerts for critical events
  5. Configure log rotation: Ensure logs don't fill disk space
  6. Create baselines: Establish normal patterns for your system
  7. Practice analysis: Use the scenarios in this guide
  8. Automate reports: Schedule daily log analysis reports
  9. Secure logs: Set appropriate permissions and encryption
  10. Stay current: Keep up with new logging technologies

Master Log Analysis for Proactive Operations

Effective log monitoring transforms reactive troubleshooting into proactive system management. By mastering journalctl and /var/log/ analysis, you gain deep visibility into system behavior, security events, and performance issues.

Remember: Logs tell the story of your system. Learn to read them like a book, and you'll be able to predict problems before they occur and quickly resolve issues when they arise.

Next Steps: Start by monitoring your authentication logs for failed SSH attempts. Create a simple script that alerts you to brute force attacks. Then expand to monitoring application logs and creating daily health reports. Soon, you'll be a log analysis expert.