Linux Interview Questions: Complete Guide with Answers

Master Linux interviews with this comprehensive guide covering basic to advanced questions, practical scenarios, and detailed explanations. Understand not just the "what" but also the "why" and "how" behind each concept.

1. Basic Linux Concepts & Fundamentals

Start with foundational Linux concepts that every candidate should understand regardless of experience level.

Q1: What is Linux and how does it differ from Unix?

Beginner

What the interviewer wants to know:

This question tests your understanding of Linux's origins and its relationship with Unix. Interviewers want to see if you understand:

  • The historical context of Linux
  • Key technical differences
  • Licensing and distribution models
  • Practical implications for system administration

Complete Answer:

Linux: A free, open-source, Unix-like operating system kernel created by Linus Torvalds in 1991. Linux distributions combine the kernel with GNU utilities and other software.

Unix: A family of multitasking, multiuser computer operating systems originally developed at AT&T Bell Labs in the 1970s.

Key Differences:

Aspect           | Linux                                    | Unix
-----------------|------------------------------------------|----------------------------------------------
License          | GPL (free and open source)               | Proprietary (except BSD variants)
Development      | Community-driven, open development       | Vendor-specific development
Kernel           | Monolithic kernel with loadable modules  | Mostly monolithic, some microkernel variants
Cost             | Free to use and modify                   | Expensive licensing fees
Hardware Support | Extensive, especially for x86 systems    | Limited to vendor hardware
Distributions    | Ubuntu, RHEL, CentOS, Debian, etc.       | AIX, Solaris, HP-UX, macOS

Why this matters: Understanding these differences helps in making informed decisions about which OS to use for specific workloads, understanding compatibility issues, and troubleshooting cross-platform problems.
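A quick practical aside: when you land on an unfamiliar box, two standard commands tell you whether you are on Linux and which distribution; the output shown here is illustrative:

# Identify the kernel and the distribution
uname -s -r           # e.g. "Linux 6.1.0-18-amd64"
cat /etc/os-release   # Distribution name and version on most modern distros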

Interview Tip:

Don't just list differences; explain their implications. When discussing differences, mention the practical consequences. For example:

  • "Linux's open-source nature means we can customize it for our specific needs, which is why we use it for our container infrastructure."
  • "Unix systems often have better support contracts, which is important for critical banking systems."
This shows you understand not just the facts, but their business implications.

Q2: Explain the Linux filesystem hierarchy

Beginner

What the interviewer is evaluating:

This question tests your practical knowledge of Linux system organization. Interviewers want to see if you:

  • Know where to find system files
  • Understand the purpose of each directory
  • Can navigate the filesystem efficiently
  • Know where to store application data
  • Understand permissions and ownership implications

Complete Answer:

The Linux Filesystem Hierarchy Standard (FHS) defines the directory structure and directory contents. Here are the key directories and their purposes:

šŸ“ LINUX FILESYSTEM HIERARCHY ================================ / # Root directory ā”œā”€ā”€ /bin/ # Essential user binaries (ls, cp, rm, etc.) ā”œā”€ā”€ /boot/ # Boot loader files (vmlinuz, initrd, grub) ā”œā”€ā”€ /dev/ # Device files (sda, tty, null, random) ā”œā”€ā”€ /etc/ # System configuration files ā”œā”€ā”€ /home/ # User home directories ā”œā”€ā”€ /lib/ # Essential shared libraries ā”œā”€ā”€ /media/ # Mount point for removable media ā”œā”€ā”€ /mnt/ # Temporary mount points ā”œā”€ā”€ /opt/ # Optional application software packages ā”œā”€ā”€ /proc/ # Virtual filesystem for process information ā”œā”€ā”€ /root/ # Home directory for root user ā”œā”€ā”€ /run/ # Runtime variable data ā”œā”€ā”€ /sbin/ # System binaries (fdisk, fsck, init) ā”œā”€ā”€ /srv/ # Service data (web, FTP, CVS) ā”œā”€ā”€ /sys/ # Virtual filesystem for system information ā”œā”€ā”€ /tmp/ # Temporary files (cleaned on reboot) ā”œā”€ā”€ /usr/ # User utilities and applications │ ā”œā”€ā”€ /bin/ # Non-essential user binaries │ ā”œā”€ā”€ /lib/ # Libraries for /usr/bin and /usr/sbin │ ā”œā”€ā”€ /local/ # Locally installed software │ └── /share/ # Architecture-independent data └── /var/ # Variable data (logs, spool, cache) ā”œā”€ā”€ /log/ # System and application logs ā”œā”€ā”€ /spool/ # Queued files (mail, print) └── /cache/ # Application cache data

Practical Examples:

  • /etc/: Contains system-wide configuration files. For example, /etc/passwd stores user information, /etc/fstab defines filesystem mounts.
  • /var/log/: Where system logs are stored. /var/log/syslog contains general system messages, /var/log/auth.log stores authentication logs.
  • /proc/: A virtual filesystem providing process and kernel information. /proc/cpuinfo shows CPU details, /proc/meminfo shows memory information.
  • /dev/: Contains device files. /dev/sda represents the first hard disk, /dev/null is a null device that discards data.
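To make these locations concrete, here are a few harmless commands you can run to explore them (output varies by system):

head -5 /proc/cpuinfo       # CPU details from the virtual /proc filesystem
head -3 /etc/passwd         # User account records
ls /var/log | head -5       # Where system and application logs live
echo "gone" > /dev/null     # /dev/null silently discards whatever is written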

Why this matters: Knowing the filesystem hierarchy is essential for:

  • Troubleshooting: You know where to look for logs and configuration files
  • Security: You understand which directories need strict permissions
  • Application deployment: You know where to install applications and store data
  • System maintenance: You understand what can be safely cleaned up

Common Follow-up Questions:

Interviewers often ask:

  1. "What's the difference between /bin and /usr/bin?"
    Answer: /bin contains essential binaries needed even in single-user mode, while /usr/bin contains non-essential user binaries. Note that many modern distributions have merged the two (the "usrmerge" layout), making /bin a symlink to /usr/bin.
  2. "Where should you install third-party applications?"
    Answer: /opt/ for self-contained applications, /usr/local/ for locally compiled software.
  3. "What's special about /proc and /sys?"
    Answer: They're virtual filesystems that provide kernel and process information in real-time.
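You can verify the first follow-up on your own machine: distributions that have adopted the merged-/usr layout make /bin a symlink to /usr/bin:

ls -ld /bin
# On a merged system:   lrwxrwxrwx 1 root root 7 ... /bin -> usr/bin
# On an unmerged system: drwxr-xr-x ... /bin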

Q3: What are inodes in Linux?

Intermediate

Understanding the question:

This question tests your understanding of Linux filesystem internals. Interviewers want to see if you understand:

  • How files are stored and referenced
  • Filesystem metadata concepts
  • Troubleshooting disk issues
  • Performance implications
  • Difference between filesystem types

Complete Answer:

Inodes (Index Nodes) are data structures in Unix/Linux filesystems that store all of a file's or directory's metadata except its name and its actual data contents.

What information is stored in an inode:

  • File type (regular file, directory, symbolic link, device file)
  • Permissions (read, write, execute for owner, group, others)
  • Owner and group IDs
  • File size in bytes
  • Timestamps (access, modification, and change; some filesystems also record creation time)
  • Link count (number of hard links to the inode)
  • Pointers to data blocks (direct, indirect, double indirect)
  • Device ID (for device files)

Visual Representation:

📄 INODE STRUCTURE
=========================
Inode Number: 123456
├── File Type: Regular file (-)
├── Permissions: rw-r--r--
├── Owner: uid=1000 (john)
├── Group: gid=1000 (john)
├── File Size: 4096 bytes
├── Timestamps:
│   ├── Access: 2023-12-18 10:30:00
│   ├── Modify: 2023-12-18 10:25:00
│   └── Change: 2023-12-18 10:25:00
├── Link Count: 1
├── Pointers to Data Blocks:
│   ├── Direct Blocks: 1001, 1002, 1003
│   ├── Single Indirect: 2001
│   ├── Double Indirect: 3001
│   └── Triple Indirect: 4001
└── Device ID: 0 (not a device file)

Key Commands for Working with Inodes:

# 1. Check inode usage on a filesystem
df -i
# Output:
# Filesystem      Inodes IUsed  IFree IUse% Mounted on
# /dev/sda1       524288 84712 439576   17% /

# 2. Find the inode number of a file
ls -i filename.txt
# Output: 123456 filename.txt

# 3. Find a file by inode number
find / -inum 123456 2>/dev/null

# 4. Check inode information (stat command)
stat filename.txt
# Output shows all inode metadata

# 5. Create many small files to test inode limits
for i in {1..1000}; do touch file$i.txt; done

Common Scenarios and Solutions:

  1. "No space left on device" but df shows free space:
    This indicates inode exhaustion. Use df -i to check inode usage.
    # Solution: Find and clean up small files # Find directories with many files find /path -type f -name "*.log" -size +0c | xargs rm # Or use find to delete old small files find /path -type f -size -1k -mtime +30 -delete
  2. Hard links vs Symbolic links:
    Hard links share the same inode, symbolic links have their own inode.
    # Create a hard link (shares the same inode)
    ln original.txt hardlink.txt
    ls -i original.txt hardlink.txt    # Same inode number

    # Create a symbolic link (gets its own inode)
    ln -s original.txt symlink.txt
    ls -i original.txt symlink.txt     # Different inode numbers
  3. Filesystem differences:
    EXT4 vs XFS vs BTRFS handle inodes differently:
    • EXT4: Fixed number of inodes at format time
    • XFS: Dynamic inode allocation
    • BTRFS: No fixed inode tables; inode items are stored in B-trees
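Because EXT4 fixes its inode count at format time, a volume intended for millions of small files should be formatted with that in mind. A sketch, with placeholder device names:

# Lower bytes-per-inode (-i) yields more inodes on ext4
mkfs.ext4 -i 8192 /dev/sdb1

# Inspect the inode geometry of an existing ext4 filesystem
tune2fs -l /dev/sda1 | grep -i inode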

Why this matters:

  • Troubleshooting: Understanding inodes helps diagnose "disk full" errors
  • Performance: Inode-heavy operations affect filesystem performance
  • Storage planning: Choosing appropriate filesystem for workload
  • File recovery: Understanding how files are stored and referenced

Common Mistakes to Avoid:

  • Mistake: "Inodes store file content"
    Correction: Inodes store metadata, not file content. File content is stored in data blocks.
  • Mistake: "All filesystems handle inodes the same way"
    Correction: Different filesystems (EXT4, XFS, BTRFS) handle inodes differently.
  • Mistake: "Inode exhaustion is rare"
    Correction: Common in systems with many small files (email servers, log files).

2. Process Management & Signals

Understanding processes, job control, and signals is crucial for effective system administration and troubleshooting.

Q4: Explain the difference between a process and a thread

Intermediate

What the interviewer is testing:

This question evaluates your understanding of operating system concepts and their practical implications:

  • Multiprocessing vs multithreading concepts
  • Resource allocation and management
  • Performance implications
  • Concurrency and parallelism
  • Troubleshooting application issues

Complete Answer:

Process: An independent execution unit with its own memory space, resources, and state. Each process has its own address space, file descriptors, and security context.

Thread: A lightweight unit of execution within a process. Threads share the same memory space and resources as their parent process but have their own stack and register state.

Aspect            | Process                                        | Thread
------------------|------------------------------------------------|---------------------------------------------
Memory Space      | Separate memory space                          | Shares memory with its parent process
Creation Time     | Slower (full new address space)                | Faster (lightweight clone)
Context Switching | Expensive (address-space switch)               | Cheaper (within the same address space)
Communication     | IPC mechanisms (pipes, sockets, shared memory) | Shared memory (variables, objects)
Isolation         | High (crashes don't affect others)             | Low (one thread crash can kill the process)
Resource Overhead | High (separate memory, file tables)            | Low (shares process resources)

Visual Representation:

🔧 PROCESS VS THREAD
===========================
PROCESS A (PID: 1001)          PROCESS B (PID: 1002)
├── Memory Space A             ├── Memory Space B
├── File Descriptors A         ├── File Descriptors B
├── Security Context A         ├── Security Context B
├── Thread 1                   ├── Thread 1
│   ├── Stack A1               │   ├── Stack B1
│   ├── Registers A1           │   ├── Registers B1
│   └── PC A1                  │   └── PC B1
└── Thread 2                   └── Thread 2
    ├── Stack A2                   ├── Stack B2
    ├── Registers A2               ├── Registers B2
    └── PC A2                      └── PC B2

Note: Threads in the same process share its memory space.
Processes have completely isolated memory spaces.

Practical Examples in Linux:

# 1. View processes and their threads (-L shows threads;
#    LWP = Lightweight Process ID = thread ID)
ps -efL

# 2. View the threads of a specific process
ps -T -p <PID>

# 3. Count threads in a process
cat /proc/<PID>/status | grep Threads
# or
ls /proc/<PID>/task/ | wc -l

# 4. Real-world example: Apache web server
# Apache uses multiple processes (prefork MPM) or
# multiple threads (worker/event MPM)
#
# Prefork MPM (process-based): each connection gets its own process;
# more stable but heavier
#
# Worker MPM (hybrid process/thread): multiple processes, each with
# multiple threads; better performance for high concurrency

# 5. Nginx example (process-based with worker processes)
# Master process + worker processes; each worker handles
# many connections (event-driven)

When to use processes vs threads:

  • Use Processes when:
    • You need strong isolation between tasks
    • Tasks don't need to share much data
    • Security is critical (different privileges needed)
    • You want to leverage multiple CPUs effectively
  • Use Threads when:
    • Tasks need to share data frequently
    • You need lightweight concurrency
    • Tasks are I/O bound and waiting often
    • You need to create many concurrent tasks

Common Interview Scenarios:

  1. "Why does my application crash when one thread fails?"
    Because threads share memory space. A thread crashing can corrupt shared memory, affecting all threads in the process.
  2. "Should I use multiprocessing or multithreading for my web scraper?"
    Depends on requirements:
    • If scraping many independent sites → Processes (isolation)
    • If scraping one site with many pages → Threads (share session/cookies)
    • If I/O bound (waiting for network) → Threads are more efficient
  3. "How would you debug a memory leak in a multithreaded application?"
    Use tools like valgrind, check for:
    • Thread-local storage not being freed
    • Race conditions causing double allocation
    • Shared resources not being released properly

Why this matters in DevOps:

  • Container design: Containers are essentially isolated processes
  • Microservices: Each service runs as separate processes
  • Monitoring: Need to monitor both process and thread counts
  • Scaling: Understanding when to scale horizontally (processes) vs vertically (threads)
  • Troubleshooting: High thread counts can indicate thread leaks, high process counts can indicate fork bombs

Advanced Concepts:

For senior roles, be prepared to discuss:

  1. User threads vs Kernel threads:
    • User threads: scheduled by a user-space threading library (e.g., early "green thread" implementations)
    • Kernel threads: scheduled by the kernel (modern NPTL)
    • Linux uses 1:1 model (each user thread maps to kernel thread)
  2. Thread pools vs spawning threads:
    • Thread pools: Reuse threads to avoid creation overhead
    • Important for high-performance servers
  3. CPU affinity and scheduling:
    • taskset command to set CPU affinity
    • chrt command to change scheduling policy
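A quick illustration of those two commands (the PID and program name below are placeholders):

# Pin an existing process (PID 1234) to CPUs 0 and 1
taskset -cp 0,1 1234

# Start a program restricted to CPU 2
taskset -c 2 ./my_app

# Start a program under the real-time FIFO policy at priority 10
chrt -f 10 ./my_app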

Q5: Explain Linux signals with examples

Intermediate

What the interviewer wants to see:

This question tests your practical knowledge of process control and signal handling:

  • Understanding of inter-process communication
  • Graceful shutdown procedures
  • Troubleshooting stuck processes
  • Writing robust scripts and applications
  • Understanding default signal behavior

Complete Answer:

Signals are software interrupts delivered to a process to notify it of an event. They are a form of inter-process communication (IPC) used by the kernel or other processes.

Common Linux Signals:

Signal              | Number    | Default Action | Purpose
--------------------|-----------|----------------|------------------------------------------------------
SIGHUP (Hangup)     | 1         | Terminate      | Terminal disconnect; often used to reload configuration
SIGINT (Interrupt)  | 2         | Terminate      | Ctrl+C from the keyboard
SIGQUIT (Quit)      | 3         | Core dump      | Ctrl+\ from the keyboard
SIGKILL (Kill)      | 9         | Terminate      | Forceful kill; cannot be caught or ignored
SIGTERM (Terminate) | 15        | Terminate      | Graceful shutdown request
SIGSTOP (Stop)      | 17,19,23* | Stop           | Pause process execution; cannot be caught or ignored
SIGCONT (Continue)  | 18,20,24* | Continue       | Resume a stopped process
SIGUSR1             | 10        | Terminate      | User-defined signal 1
SIGUSR2             | 12        | Terminate      | User-defined signal 2

* Numbers vary by architecture; on x86-64, SIGSTOP is 19 and SIGCONT is 18.

Practical Signal Usage:

# 1. Sending signals to processes
kill -SIGNAL PID

# Examples:
kill -9 1234    # SIGKILL - forceful termination
kill -15 1234   # SIGTERM - graceful termination
kill -1 1234    # SIGHUP - reload configuration
kill -2 1234    # SIGINT - same as Ctrl+C

# 2. Using signal numbers
kill -9 1234    # Same as SIGKILL
kill -15 1234   # Same as SIGTERM

# 3. Kill by process name
pkill -9 nginx     # Kill all nginx processes with SIGKILL
pkill -HUP nginx   # Send SIGHUP to reload nginx

# 4. Kill all processes by name
killall -9 python  # Kill all python processes

# 5. Trap signals in shell scripts
#!/bin/bash
trap "echo 'Caught SIGINT'; cleanup; exit 1" SIGINT
trap "echo 'Caught SIGTERM'; cleanup; exit 0" SIGTERM

# 6. List all available signals
kill -l
# Output: 1) SIGHUP 2) SIGINT 3) SIGQUIT ... 64) SIGRTMAX

Real-World Scenarios:

🎯 PRACTICAL SIGNAL SCENARIOS
===================================

SCENARIO 1: Graceful shutdown of a web server
---------------------------------------------
# Instead of SIGKILL (which drops connections)
kill -15 $(cat /var/run/nginx.pid)   # SIGTERM
# Nginx receives SIGTERM, finishes current requests,
# then exits gracefully

SCENARIO 2: Reload configuration without downtime
-------------------------------------------------
kill -1 $(cat /var/run/nginx.pid)    # SIGHUP
# The nginx master process re-reads its config and spawns
# new workers; old workers finish their requests

SCENARIO 3: Debugging a stuck process
-------------------------------------
kill -15 PID   # First try graceful termination
kill -3 PID    # If no response after 30 seconds: SIGQUIT for a core dump
kill -9 PID    # Finally, SIGKILL as a last resort

SCENARIO 4: Pause and resume a long-running job
-----------------------------------------------
kill -19 PID   # SIGSTOP: pause the process
# Do some maintenance...
kill -18 PID   # SIGCONT: resume the process

Signal Handling in Applications:

# Python signal handling example
import signal
import sys
import time

def signal_handler(sig, frame):
    print(f'\nReceived signal {sig}')
    print('Performing cleanup...')
    # Cleanup code here
    sys.exit(0)

# Register signal handlers
signal.signal(signal.SIGINT, signal_handler)   # Ctrl+C
signal.signal(signal.SIGTERM, signal_handler)  # kill command
signal.signal(signal.SIGHUP, signal_handler)   # Reload config

print('Running... Press Ctrl+C to exit')
while True:
    time.sleep(1)


# Bash script signal handling
#!/bin/bash

cleanup() {
    echo "Cleaning up..."
    rm -f /tmp/tempfile.$$
    echo "Cleanup complete"
}

reload_config() {
    echo "Reloading configuration..."
    source /etc/app.conf
}

# Trap signals
trap cleanup EXIT                               # Run on any exit
trap 'echo "SIGINT caught"; exit 1' INT
trap 'echo "SIGTERM caught"; exit 0' TERM
trap 'echo "SIGHUP caught"; reload_config' HUP

Important Concepts:

  1. Signal vs Interrupt:
    • Interrupt: Hardware to CPU
    • Signal: Software to process
  2. Catchable vs Uncatchable Signals:
    • SIGKILL (9) and SIGSTOP cannot be caught or ignored
    • All other signals can be handled by the process
  3. Signal Delivery:
    • Synchronous: Caused by process itself (SIGSEGV, SIGFPE)
    • Asynchronous: From outside the process (SIGINT, SIGTERM)
  4. Signal Masks:
    • Processes can block signals temporarily
    • Useful in critical sections of code
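A shell script cannot manipulate signal masks directly, but temporarily ignoring a signal is a rough equivalent of blocking it around a critical section; a minimal sketch:

#!/bin/bash
trap '' INT      # Ignore SIGINT (Ctrl+C) from this point on
echo "Entering critical section..."
sleep 5          # Work that must not be interrupted by Ctrl+C
trap - INT       # Restore default SIGINT handling
echo "Critical section done; Ctrl+C works again"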

Why this matters for DevOps:

  • Graceful shutdowns: Critical for zero-downtime deployments
  • Configuration management: SIGHUP for reloading configs without restart
  • Troubleshooting: Understanding signal behavior helps debug process issues
  • Container orchestration: Kubernetes uses SIGTERM then SIGKILL for pod termination
  • Writing robust scripts: Proper signal handling prevents orphaned processes and resources

Common Pitfalls:

  • Using SIGKILL as first resort: Always try SIGTERM first to allow graceful shutdown
  • Not handling signals in long-running processes: Can leave resources locked
  • Race conditions in signal handlers: Keep signal handlers simple and reentrant
  • Assuming signals are delivered immediately: Delivery can be delayed, and standard (non-realtime) signals do not queue, so repeated signals may be merged
  • Not considering zombie processes: SIGCHLD handling to prevent zombies
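To see the last pitfall concretely, this sketch deliberately creates a zombie by letting a child exit while its parent never calls wait:

# The inner 'sleep 0.1' exits immediately; its parent (replaced by
# 'sleep 30' via exec) never reaps it, so it lingers as a zombie
bash -c 'sleep 0.1 & exec sleep 30' &
sleep 1
ps -o pid,ppid,stat,cmd --ppid "$!"   # The defunct child shows state 'Z'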

3. Filesystem & Permissions

Master filesystem operations, permissions, and security concepts essential for system administration.

Q6: Explain Linux file permissions in detail

Intermediate

What the interviewer is assessing:

This question evaluates your understanding of Linux security model:

  • Understanding of permission bits
  • Knowledge of special permissions
  • Ability to troubleshoot permission issues
  • Understanding of security implications
  • Practical usage in scripts and automation

Complete Answer:

Linux file permissions control access to files and directories through a three-tiered system: Owner, Group, and Others.

Permission Components:

šŸ” PERMISSION BREAKDOWN ============================ -rwxr-xr-- 1 user group 4096 Dec 18 10:30 script.sh │ │ │ │ │ │ │ └── Others: r-- (read only) │ │ └───── Group: r-x (read & execute) │ └──────── Owner: rwx (read, write, execute) └────────── File type: - (regular file) FILE TYPES: - : Regular file d : Directory l : Symbolic link c : Character device b : Block device s : Socket p : Named pipe PERMISSION BITS: r : Read (4) w : Write (2) x : Execute (1) - : No permission (0)

Octal vs Symbolic Notation:

Symbolic  | Octal | Binary    | Meaning
----------|-------|-----------|--------------------------------------------------
rwxrwxrwx | 777   | 111111111 | Full permissions for everyone
rwxr-xr-x | 755   | 111101101 | Owner: full; group and others: read + execute
rw-r--r-- | 644   | 110100100 | Owner: read + write; group and others: read only
rwx------ | 700   | 111000000 | Only the owner has access
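To translate between the two notations on a real file, GNU stat prints both forms:

stat -c '%a %A %n' script.sh
# Example output: 755 -rwxr-xr-x script.sh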

Special Permissions (Setuid, Setgid, Sticky Bit):

# Special permission bits

# SetUID (s): execute with the file owner's privileges
chmod u+s /usr/bin/passwd
# Shows as: -rwsr-xr-x

# SetGID (s): execute with the group's privileges
chmod g+s /usr/bin/write
# Shows as: -rwxr-sr-x
# On directories: new files inherit the directory's group

# Sticky bit (t): only a file's owner can delete it
chmod +t /tmp
# Shows as: drwxrwxrwt
# Important for shared directories like /tmp

Practical Permission Management:

# 1. Changing permissions
chmod 755 script.sh               # Octal notation
chmod u+rwx,g+rx,o+rx script.sh   # Symbolic notation
chmod a+x script.sh               # Add execute for all (a = all)

# 2. Changing ownership
chown user:group file.txt
chown user file.txt               # Change user only
chown :group file.txt             # Change group only
chown -R user:group /path/        # Recursive change

# 3. Default permissions with umask
umask 022   # Directories: 777 - 022 = 755
umask 027   # Directories: 777 - 027 = 750
# Files: 666 - umask, directories: 777 - umask

# 4. View the current umask
umask
# or
umask -S    # Symbolic format: u=rwx,g=rx,o=rx

# 5. Find files with specific permissions
find / -type f -perm -4000         # Find setuid files
find / -type f -perm -2000         # Find setgid files
find / -perm /o=w ! -user root     # World-writable files not owned by root

# 6. ACLs (Access Control Lists) for advanced permissions
setfacl -m u:john:rwx file.txt        # Add user john with rwx
setfacl -m g:developers:rx file.txt   # Add a group
getfacl file.txt                      # View ACLs

Real-World Scenarios:

🎯 PRACTICAL PERMISSION SCENARIOS
=====================================

SCENARIO 1: Web server permissions
----------------------------------
# Nginx/Apache needs read access to web files
chown -R www-data:www-data /var/www/html
find /var/www/html -type d -exec chmod 755 {} +   # Directories
find /var/www/html -type f -exec chmod 644 {} +   # Files

# Upload directory needs write access
chmod 775 /var/www/html/uploads
# Or use an ACL for more granular control
setfacl -Rm u:www-data:rwx /var/www/html/uploads

SCENARIO 2: Shared development directory
----------------------------------------
mkdir /shared
chown :developers /shared
chmod 2775 /shared   # SetGID so new files inherit the developers group

SCENARIO 3: Secure configuration files
--------------------------------------
# Configuration files should not be world-readable
chmod 600 /etc/secret.conf     # Only the owner can read/write
chmod 640 /etc/database.conf   # Owner: rw, group: r

SCENARIO 4: Troubleshooting "Permission denied"
-----------------------------------------------
ls -la /path/to/file    # Check current permissions
id                      # Check your user and group IDs
groups                  # Check group membership
getfacl /path/to/file   # Check ACLs if present

Permission Concepts for Directories:

  • Read (r): List directory contents (ls)
  • Write (w): Create/delete files in directory
  • Execute (x): Access files in directory (cd into it)
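A quick experiment in a throwaway directory makes these semantics concrete:

mkdir demo && touch demo/file.txt

chmod 644 demo        # Read but no execute on the directory
ls demo               # Listing names still works (r)
cat demo/file.txt     # Fails: x is required to access entries
cd demo               # Also fails without x

chmod 755 demo        # Restore normal permissions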

Common Permission Issues and Solutions:

  1. "Permission denied" when running script:
    # The script needs execute permission
    chmod +x script.sh
    # Also check the shebang line: #!/bin/bash
  2. Can't delete file in directory:
    # You need write permission on the directory, not the file
    ls -ld /path/to/directory     # Check directory permissions
    chmod +w /path/to/directory   # Fix
  3. Apache/Nginx can't read files:
    # The web server runs as the www-data user, so files must be
    # readable by www-data or world-readable
    chmod o+r file.html        # World-readable (not secure)
    chown www-data file.html   # Change ownership (better)
    # Or use ACLs for granular control

Security Best Practices:

  • Principle of Least Privilege: Give minimum necessary permissions
  • Regular audits: Find world-writable files, setuid binaries
  • Use groups: Instead of making files world-readable
  • Secure umask: Use 027 or 077 for sensitive environments
  • Limit setuid/setgid: Only essential binaries should have these

Why this matters for DevOps:

  • Container security: Understanding permissions for containerized applications
  • CI/CD pipelines: Setting correct permissions in deployment scripts
  • Infrastructure as Code: Managing permissions through configuration
  • Security compliance: Meeting security standards and audits
  • Troubleshooting: Quickly resolving permission-related issues

Security Considerations:

For senior roles, be prepared to discuss:

  1. SELinux/AppArmor: Mandatory Access Control beyond standard permissions
  2. Capabilities: Breaking root privileges into smaller units
  3. Filesystem attributes: Immutable files (chattr +i), append-only logs
  4. SUID/SGID risks: Security implications and auditing
  5. Container root vs non-root: Running containers as non-root users

4. Shell Scripting & Automation

Essential shell scripting concepts, best practices, and automation techniques for DevOps engineers.

Q7: Write a script to monitor disk usage and send alerts

Scenario

What the interviewer is evaluating:

This scenario tests multiple skills:

  • Practical shell scripting ability
  • Understanding of system monitoring
  • Error handling and robustness
  • Automation and scheduling knowledge
  • Production-ready coding practices
  • Communication and alerting mechanisms

Complete Solution:

Here's a production-ready disk monitoring script with explanations:

#!/usr/bin/env bash
# ==============================================================================
# Script: disk-monitor.sh
# Description: Monitor disk usage and send alerts
# Author: DevOps Team
# Version: 2.0.0
# ==============================================================================

set -euo pipefail
IFS=$'\n\t'

# ==============================================================================
# Configuration
# ==============================================================================
readonly SCRIPT_NAME="$(basename "${0}")"
readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly LOG_FILE="/var/log/${SCRIPT_NAME%.*}.log"
readonly LOCK_FILE="/tmp/${SCRIPT_NAME}.lock"

# Thresholds (percentage)
readonly CRITICAL_THRESHOLD=90
readonly WARNING_THRESHOLD=80
readonly INODE_CRITICAL_THRESHOLD=85

# Alert recipients
readonly ALERT_EMAIL="admin@example.com"
readonly SLACK_WEBHOOK="https://hooks.slack.com/services/XXX/YYY/ZZZ"
readonly PAGERDUTY_KEY="your-pagerduty-key"

# ==============================================================================
# Colors and Formatting
# ==============================================================================
readonly RED='\033[0;31m'
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly BLUE='\033[0;34m'
readonly NC='\033[0m'   # No Color

# ==============================================================================
# Logging Functions
# ==============================================================================
log_info() {
    echo -e "${BLUE}[INFO]${NC} $(date '+%Y-%m-%d %H:%M:%S') - $*" | tee -a "${LOG_FILE}"
}

log_warning() {
    echo -e "${YELLOW}[WARNING]${NC} $(date '+%Y-%m-%d %H:%M:%S') - $*" | tee -a "${LOG_FILE}"
}

log_error() {
    echo -e "${RED}[ERROR]${NC} $(date '+%Y-%m-%d %H:%M:%S') - $*" | tee -a "${LOG_FILE}"
}

log_success() {
    echo -e "${GREEN}[SUCCESS]${NC} $(date '+%Y-%m-%d %H:%M:%S') - $*" | tee -a "${LOG_FILE}"
}

# ==============================================================================
# Utility Functions
# ==============================================================================
check_dependencies() {
    local dependencies=("mailx" "curl" "jq" "df" "awk")
    local missing=()

    for cmd in "${dependencies[@]}"; do
        if ! command -v "${cmd}" &> /dev/null; then
            missing+=("${cmd}")
        fi
    done

    if [[ ${#missing[@]} -gt 0 ]]; then
        log_error "Missing dependencies: ${missing[*]}"
        log_error "Install missing packages and try again."
        exit 1
    fi
}

acquire_lock() {
    if [[ -e "${LOCK_FILE}" ]]; then
        local pid=$(cat "${LOCK_FILE}")
        if kill -0 "${pid}" 2>/dev/null; then
            log_error "Script is already running (PID: ${pid})"
            exit 1
        else
            log_warning "Stale lock file found (PID: ${pid}), removing..."
            rm -f "${LOCK_FILE}"
        fi
    fi
    echo "$$" > "${LOCK_FILE}"
    trap 'release_lock' EXIT
}

release_lock() {
    rm -f "${LOCK_FILE}"
}

# ==============================================================================
# Monitoring Functions
# ==============================================================================
get_disk_usage() {
    # Get disk usage using df, excluding tmpfs, squashfs, etc.
    df -h --output=source,fstype,pcent,ipcent,target 2>/dev/null | \
        awk 'NR>1 && $2 !~ /(tmpfs|squashfs|devtmpfs|overlay)/ {print $0}'
}

get_largest_directories() {
    local mount_point="${1}"
    local depth="${2:-3}"

    # Find the top 10 largest directories
    du -h --max-depth="${depth}" "${mount_point}" 2>/dev/null | \
        sort -rh | \
        head -11 | \
        tail -10
}

get_largest_files() {
    local mount_point="${1}"

    # Find the top 10 largest files
    find "${mount_point}" -type f -exec du -h {} + 2>/dev/null | \
        sort -rh | \
        head -10
}

# ==============================================================================
# Alert Functions
# ==============================================================================
send_email_alert() {
    local subject="${1}"
    local body="${2}"
    local recipient="${3}"

    echo "${body}" | mailx -s "${subject}" "${recipient}"
    log_info "Email alert sent to ${recipient}"
}

send_slack_alert() {
    local message="${1}"
    local channel="${2:-#alerts}"
    local severity="${3:-warning}"

    local color
    case "${severity}" in
        critical) color="#FF0000" ;;
        warning)  color="#FFA500" ;;
        *)        color="#008000" ;;
    esac

    local payload=$(jq -n \
        --arg channel "${channel}" \
        --arg text "${message}" \
        --arg color "${color}" \
        '{
            "channel": $channel,
            "attachments": [{
                "color": $color,
                "text": $text,
                "ts": now
            }]
        }')

    curl -s -X POST -H 'Content-type: application/json' \
        --data "${payload}" "${SLACK_WEBHOOK}" > /dev/null
    log_info "Slack alert sent to ${channel}"
}

send_pagerduty_alert() {
    local summary="${1}"
    local severity="${2:-critical}"

    local payload=$(jq -n \
        --arg summary "${summary}" \
        --arg severity "${severity}" \
        --arg routing_key "${PAGERDUTY_KEY}" \
        '{
            "routing_key": $routing_key,
            "event_action": "trigger",
            "payload": {
                "summary": $summary,
                "severity": $severity,
                "source": "disk-monitor",
                "custom_details": {
                    "script": "disk-monitor.sh",
                    "timestamp": now | todate
                }
            }
        }')

    curl -s -X POST \
        -H "Content-Type: application/json" \
        -H "Accept: application/vnd.pagerduty+json;version=2" \
        --data "${payload}" \
        "https://events.pagerduty.com/v2/enqueue" > /dev/null
    log_info "PagerDuty alert triggered"
}

# ==============================================================================
# Analysis Functions
# ==============================================================================
analyze_disk_usage() {
    local disk_info
    disk_info=$(get_disk_usage)

    while IFS= read -r line; do
        [[ -z "${line}" ]] && continue

        local device fstype usage inode_usage mount_point
        read -r device fstype usage inode_usage mount_point <<< "${line}"

        # Remove % signs
        usage=${usage%\%}
        inode_usage=${inode_usage%\%}

        log_info "Checking ${device} (${mount_point}): Usage=${usage}%, Inodes=${inode_usage}%"

        # Check disk usage
        if [[ "${usage}" -ge "${CRITICAL_THRESHOLD}" ]]; then
            handle_critical_disk "${device}" "${usage}" "${mount_point}"
        elif [[ "${usage}" -ge "${WARNING_THRESHOLD}" ]]; then
            handle_warning_disk "${device}" "${usage}" "${mount_point}"
        fi

        # Check inode usage
        if [[ "${inode_usage}" -ge "${INODE_CRITICAL_THRESHOLD}" ]]; then
            handle_inode_critical "${device}" "${inode_usage}" "${mount_point}"
        fi
    done <<< "${disk_info}"
}

handle_critical_disk() {
    local device="${1}"
    local usage="${2}"
    local mount_point="${3}"

    local message="🚨 CRITICAL: Disk ${device} (${mount_point}) is ${usage}% full!"
    log_error "${message}"

    # Get analysis for troubleshooting
    local analysis=$(generate_disk_analysis "${mount_point}")

    # Send alerts
    send_email_alert "CRITICAL: Disk Space Alert - ${device}" "${message}\n\n${analysis}" "${ALERT_EMAIL}"
    send_slack_alert "${message}\n${analysis}" "#critical-alerts" "critical"
    send_pagerduty_alert "Disk ${device} at ${usage}%"

    # Log the analysis
    echo "${analysis}" >> "${LOG_FILE}"
}

handle_warning_disk() {
    local device="${1}"
    local usage="${2}"
    local mount_point="${3}"

    local message="⚠️ WARNING: Disk ${device} (${mount_point}) is ${usage}% full"
    log_warning "${message}"

    # Send warning alerts
    send_slack_alert "${message}" "#alerts" "warning"

    # Log the warning
    echo "WARNING: ${device} at ${usage}%" >> "${LOG_FILE}"
}

handle_inode_critical() {
    local device="${1}"
    local inode_usage="${2}"
    local mount_point="${3}"

    local message="🚨 CRITICAL: Inode usage on ${device} (${mount_point}) is ${inode_usage}%!"
    log_error "${message}"

    send_email_alert "CRITICAL: Inode Usage Alert - ${device}" "${message}" "${ALERT_EMAIL}"
    send_slack_alert "${message}" "#critical-alerts" "critical"
}

generate_disk_analysis() {
    local mount_point="${1}"
    local analysis="\n📊 DISK ANALYSIS FOR ${mount_point}\n"
    analysis+="========================================\n\n"

    # Top 10 largest directories
    analysis+="📁 TOP 10 LARGEST DIRECTORIES:\n"
    analysis+="$(get_largest_directories "${mount_point}" 2)\n\n"

    # Top 10 largest files
    analysis+="📄 TOP 10 LARGEST FILES:\n"
    analysis+="$(get_largest_files "${mount_point}")\n\n"

    # Large files modified in the last 24 hours
    analysis+="🕐 LARGE FILES MODIFIED IN LAST 24 HOURS:\n"
    analysis+="$(find "${mount_point}" -type f -size +100M -mtime -1 -exec ls -lh {} + 2>/dev/null | head -5 || echo "None found")\n\n"

    # Log files over 100MB
    analysis+="📋 LOG FILES OVER 100MB:\n"
    analysis+="$(find "${mount_point}" -type f -name "*.log" -size +100M -exec ls -lh {} + 2>/dev/null | head -5 || echo "None found")\n"

    echo -e "${analysis}"
}

# ==============================================================================
# Cleanup Functions
# ==============================================================================
cleanup_old_logs() {
    log_info "Cleaning up old log files..."

    # Remove log files older than 30 days
    find /var/log -name "*.log" -type f -mtime +30 -delete 2>/dev/null || true

    # Clean up /tmp
    find /tmp -type f -atime +7 -delete 2>/dev/null || true
    find /tmp -type d -empty -atime +7 -delete 2>/dev/null || true

    log_success "Cleanup completed"
}

# ==============================================================================
# Main Execution
# ==============================================================================
main() {
    log_info "Starting disk monitoring check..."

    acquire_lock
    check_dependencies

    # Perform disk usage analysis
    analyze_disk_usage

    # Perform cleanup on the first day of the month
    if [[ $(date +%d) == "01" ]]; then
        cleanup_old_logs
    fi

    log_success "Disk monitoring check completed successfully"
}

# ==============================================================================
# Script Entry Point
# ==============================================================================
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
    # Run the main function
    main

    # Exit with an appropriate code
    if grep -q "ERROR\|CRITICAL" "${LOG_FILE}" 2>/dev/null; then
        exit 1
    else
        exit 0
    fi
fi

Key Features Explained:

  1. Robust Error Handling:
    • set -euo pipefail: Exit on error, treat unset variables as errors, fail pipeline if any command fails
    • Lock file mechanism: Prevents multiple simultaneous runs
    • Dependency checking: Verifies required tools are available
  2. Configuration Management:
    • Thresholds as variables for easy adjustment
    • Multiple alert channels (email, Slack, PagerDuty)
    • Logging with timestamps and colors
  3. Comprehensive Monitoring:
    • Disk space usage monitoring
    • Inode usage monitoring (often overlooked)
    • Filesystem type filtering (excludes tmpfs, etc.)
  4. Actionable Alerts:
    • Different severity levels (warning, critical)
    • Includes troubleshooting information
    • Multiple notification channels
  5. Troubleshooting Assistance:
    • Identifies largest directories and files
    • Finds recently modified large files
    • Identifies large log files
  6. Automated Cleanup:
    • Optional cleanup of old logs
    • Scheduled cleanup (first day of month)

How to Deploy and Use:

# 1. Save the script
sudo nano /usr/local/bin/disk-monitor.sh
sudo chmod +x /usr/local/bin/disk-monitor.sh

# 2. Test the script manually
sudo /usr/local/bin/disk-monitor.sh

# 3. Configure cron for regular checks
sudo crontab -e
# Add (runs every 30 minutes):
# */30 * * * * /usr/local/bin/disk-monitor.sh

# 4. Test with simulated high disk usage
# (create a large file to trigger an alert)
dd if=/dev/zero of=/tmp/testfile bs=1G count=5

# 5. Check the logs
tail -f /var/log/disk-monitor.log

# 6. Test alert channels
# (temporarily lower the thresholds to trigger alerts during testing)
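On systemd-based hosts, a timer unit is a common alternative to the cron entry above; a minimal sketch (unit names and paths are illustrative):

# /etc/systemd/system/disk-monitor.service
[Unit]
Description=Disk usage monitor

[Service]
Type=oneshot
ExecStart=/usr/local/bin/disk-monitor.sh

# /etc/systemd/system/disk-monitor.timer
[Unit]
Description=Run the disk monitor every 30 minutes

[Timer]
OnCalendar=*:0/30
Persistent=true

[Install]
WantedBy=timers.target

# Enable it:
sudo systemctl daemon-reload
sudo systemctl enable --now disk-monitor.timer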

Advanced Features to Mention:

  • Predictive Analysis: Could add trend analysis to predict when disk will be full
  • Auto-remediation: Automatically clean up certain file types (old core dumps, temp files)
  • Integration: Integrate with monitoring systems like Prometheus, Nagios
  • Container Support: Monitor Docker/container disk usage
  • Cloud Integration: Auto-expand volumes in AWS/GCP/Azure

Why this is a good interview answer:

  1. Shows production thinking: Error handling, logging, locking
  2. Demonstrates DevOps mindset: Automation, monitoring, alerting
  3. Shows understanding of scale: Handles multiple disks, multiple alert channels
  4. Demonstrates troubleshooting skills: Provides analysis to help fix issues
  5. Shows knowledge of tools: Uses standard Unix tools effectively

Alternative Solutions:

For different interview scenarios:

  1. Simple version (beginner):
    #!/bin/bash
    THRESHOLD=90

    df -h | awk '{print $5 " " $6}' | grep -v Use | while read -r output; do
        usage=$(echo "$output" | awk '{print $1}' | cut -d'%' -f1)
        partition=$(echo "$output" | awk '{print $2}')
        if [ "$usage" -ge "$THRESHOLD" ]; then
            echo "Warning: $partition is $usage% full"
        fi
    done
  2. Python version (if asked for non-bash):
    #!/usr/bin/env python3
    import shutil
    import smtplib
    from email.mime.text import MIMEText

    THRESHOLD = 90
    partition = '/'

    usage = shutil.disk_usage(partition)
    percent_used = (usage.used / usage.total) * 100

    if percent_used > THRESHOLD:
        msg = MIMEText(f"Partition {partition} is {percent_used:.1f}% full")
        msg['Subject'] = f'Disk Space Alert: {partition}'
        msg['From'] = 'monitor@example.com'
        msg['To'] = 'admin@example.com'
        with smtplib.SMTP('localhost') as server:
            server.send_message(msg)
  3. Using monitoring tools:
    • Prometheus + node_exporter + Alertmanager
    • Nagios/Icinga checks
    • Commercial: Datadog, New Relic

5. Networking & Troubleshooting

Essential networking concepts and troubleshooting techniques for Linux system administration.

Q8: How would you troubleshoot a website that's slow to load?

Scenario

What the interviewer is testing:

This scenario evaluates your systematic troubleshooting approach:

  • Methodical problem-solving skills
  • Knowledge of networking and web stack
  • Ability to use diagnostic tools
  • Understanding of performance metrics
  • Communication of technical issues
  • Prioritization of investigation steps

Complete Troubleshooting Guide:

I would follow a systematic approach, starting from the client side and moving toward the server side:

šŸ” SYSTEMATIC TROUBLESHOOTING APPROACH ========================================= 1. LOCAL CLIENT CHECK ↓ 2. NETWORK CONNECTIVITY ↓ 3. DNS RESOLUTION ↓ 4. WEB SERVER RESPONSE ↓ 5. APPLICATION PERFORMANCE ↓ 6. DATABASE & BACKEND ↓ 7. SYSTEM RESOURCES ↓ 8. EXTERNAL DEPENDENCIES

Step-by-Step Investigation:

# ==============================================================================
# 1. LOCAL CLIENT CHECK
# ==============================================================================
# Check whether it's just your machine or widespread
echo "Testing from local machine..."
# Try different browsers, clear the browser cache,
# check the browser developer tools (Network tab)

# Test from the command line
time curl -v https://example.com

# Check the timing breakdown
curl -o /dev/null -s -w "\n\nTiming:\n--------\n\
 time_namelookup:    %{time_namelookup}\n\
 time_connect:       %{time_connect}\n\
 time_appconnect:    %{time_appconnect}\n\
 time_pretransfer:   %{time_pretransfer}\n\
 time_redirect:      %{time_redirect}\n\
 time_starttransfer: %{time_starttransfer}\n\
 time_total:         %{time_total}\n" https://example.com

# ==============================================================================
# 2. NETWORK CONNECTIVITY
# ==============================================================================
echo "Checking network connectivity..."

# Basic ping (check latency)
ping -c 4 example.com

# Traceroute to identify network hops
traceroute example.com
# or
mtr --report example.com

# Check packet loss
ping -c 100 -i 0.2 example.com | grep "packet loss"

# Check MTU issues (test with large packets)
ping -s 1472 -M do example.com

# ==============================================================================
# 3. DNS RESOLUTION
# ==============================================================================
echo "Checking DNS resolution..."

# Check DNS resolution time
time nslookup example.com
time dig example.com

# Check different DNS servers
dig @8.8.8.8 example.com
dig @1.1.1.1 example.com

# Check DNS query statistics
dig +stats example.com

# Check for DNSSEC issues
delv example.com

# ==============================================================================
# 4. WEB SERVER RESPONSE
# ==============================================================================
echo "Checking web server..."

# Check the SSL/TLS handshake
echo | openssl s_client -connect example.com:443 -servername example.com

# Check the SSL certificate
openssl s_client -connect example.com:443 -servername example.com 2>/dev/null | \
    openssl x509 -noout -dates -subject

# Check HTTP headers
curl -I https://example.com

# Check keep-alive behaviour
curl -H "Connection: close" -I https://example.com

# Check from different locations
# Use online tools: ping.pe, downforeveryoneorjustme.com

# ==============================================================================
# 5. SERVER-SIDE CHECKS (if you have access)
# ==============================================================================
echo "Checking server-side issues..."

# Check web server error logs
tail -100 /var/log/nginx/error.log
tail -100 /var/log/apache2/error.log

# Check web server status
systemctl status nginx
systemctl status apache2

# Check current connections
ss -tan | grep :80 | wc -l
ss -tan | grep :443 | wc -l

# Check web server configuration
nginx -t
apachectl configtest

# ==============================================================================
# 6. APPLICATION PERFORMANCE
# ==============================================================================
echo "Checking application performance..."

# Check application logs
tail -100 /var/log/application.log

# Check PHP-FPM/Apache worker status
systemctl status php-fpm
ps aux | grep php-fpm | wc -l

# Check for slow queries (if applicable)
tail -100 /var/log/mysql/slow.log

# Check application metrics:
# - Response time percentiles
# - Error rates
# - Request rates

# ==============================================================================
# 7. SYSTEM RESOURCES
# ==============================================================================
echo "Checking system resources..."

# CPU usage
top -bn1 | grep "Cpu(s)"
mpstat -P ALL

# Memory usage
free -h
cat /proc/meminfo

# Disk I/O
iostat -x 1 3
iotop -o

# Disk space and inodes
df -h
df -i

# Network connections
netstat -tan | grep ESTABLISHED | wc -l

# Check for swapping
vmstat 1 5
cat /proc/vmstat | grep pgpg

# ==============================================================================
# 8. DATABASE & BACKEND SERVICES
# ==============================================================================
echo "Checking database and backend services..."

# Database connections
mysql -e "SHOW PROCESSLIST;"
mysql -e "SHOW STATUS LIKE 'Threads_connected';"

# Database performance
mysql -e "SHOW ENGINE INNODB STATUS\G"

# Redis/Memcached
redis-cli info | grep -E "(connected_clients|used_memory|instantaneous_ops_per_sec)"

# External API dependencies:
# check the response time of external services

# ==============================================================================
# 9. ADVANCED PROFILING
# ==============================================================================
echo "Advanced profiling..."

# Load test with ab (Apache Bench)
ab -n 100 -c 10 https://example.com/

# More realistic testing with siege
siege -c 10 -t 30S https://example.com

# Modern load testing with wrk
wrk -t4 -c100 -d30s https://example.com

# Profile PHP (if applicable): Xdebug profiling, Blackfire.io
# Profile Python (if applicable): cProfile, py-spy

# ==============================================================================
# 10. MONITORING & ALERTING CHECK
# ==============================================================================
echo "Checking monitoring systems..."

# Check whether alerts are firing:
# - Prometheus alerts
# - Grafana dashboards
# - Nagios/Icinga checks
# - CloudWatch metrics (if AWS)

# Compare historical performance with the baseline
# Check for recent deployments/changes

Common Issues and Their Solutions:

🎯 COMMON SLOW WEBSITE ISSUES
===================================

ISSUE 1: High DNS lookup time (>200 ms)
---------------------------------------
SYMPTOMS: curl shows a high time_namelookup
SOLUTIONS:
- Implement local DNS caching (dnsmasq, systemd-resolved)
- Use faster DNS servers (Cloudflare 1.1.1.1, Google 8.8.8.8)
- Reduce DNS TTL for frequent changes
- Use DNS prefetching in HTML

ISSUE 2: Slow SSL/TLS handshake
-------------------------------
SYMPTOMS: High time_appconnect in curl
SOLUTIONS:
- Enable TLS session resumption
- Use modern TLS 1.3
- Optimize cipher suites
- Consider SSL termination at the load balancer

ISSUE 3: High server response time
----------------------------------
SYMPTOMS: High time_starttransfer in curl
SOLUTIONS:
- Check application code for bottlenecks
- Optimize database queries
- Implement caching (Redis, Memcached)
- Add more application servers

ISSUE 4: Network latency issues
-------------------------------
SYMPTOMS: High ping times, packet loss
SOLUTIONS:
- Use a CDN for static assets
- Implement HTTP/2 or HTTP/3
- Enable compression (gzip, brotli)
- Reduce the number of HTTP requests

ISSUE 5: Resource exhaustion
----------------------------
SYMPTOMS: High CPU, memory, or I/O usage
SOLUTIONS:
- Vertical scaling (more resources)
- Horizontal scaling (more servers)
- Optimize application code
- Implement rate limiting

Performance Optimization Checklist:

  1. Frontend Optimization:
    • Minify and compress assets (CSS, JS)
    • Optimize images (WebP format, proper sizing)
    • Implement lazy loading
    • Use browser caching headers
  2. Backend Optimization:
    • Implement caching at multiple levels
    • Optimize database queries and indexes
    • Use connection pooling
    • Implement asynchronous processing
  3. Infrastructure Optimization:
    • Use CDN for static content
    • Implement load balancing
    • Enable HTTP/2 or HTTP/3
    • Use keep-alive connections
  4. Monitoring & Alerting:
    • Set up real-time monitoring
    • Define performance SLOs/SLAs
    • Implement alerting for degradation
    • Regular performance testing

Tools for Different Layers:

Layer       | Diagnostic Tools               | Monitoring Tools
------------|--------------------------------|------------------------------
Network     | ping, traceroute, mtr, tcpdump | SmokePing, LibreNMS
DNS         | dig, nslookup, delv            | DNSSEC monitoring
HTTP/SSL    | curl, openssl, ab, siege       | Pingdom, GTmetrix
Server      | top, vmstat, iostat, netstat   | Prometheus, Grafana
Application | strace, perf, Xdebug           | New Relic, AppDynamics
Database    | EXPLAIN, slow query log        | pt-query-digest, VividCortex

Communication Strategy:

  • Immediate Actions: Document what you checked and found
  • Stakeholder Updates: Provide regular updates on investigation
  • Root Cause Analysis: Document findings and lessons learned
  • Prevention: Implement monitoring to detect issues earlier

Why this is a strong interview answer:

  1. Shows systematic approach: Methodical troubleshooting from client to server
  2. Demonstrates technical depth: Knowledge of tools at each layer
  3. Shows practical experience: Real commands that can be used immediately
  4. Communicates effectively: Clear explanation of what and why
  5. Shows preventative thinking: Goes beyond fixing to preventing recurrence

How to Present Your Answer:

During the interview:

  1. Start with methodology: "I would follow a systematic approach, starting from..."
  2. Explain your thinking: "First, I'd check if it's client-side or server-side because..."
  3. Use real examples: "For DNS issues, I'd use dig to check resolution time..."
  4. Discuss trade-offs: "If it's database-related, I might add indexes, but that has trade-offs..."
  5. End with prevention: "To prevent this in future, I'd implement monitoring for..."
Avoid: Jumping straight to solutions without explaining your diagnostic process.

Interview Preparation Strategy

30-Day Study Plan

Week 1: Fundamentals (Days 1-7)

  • Day 1-2: Linux filesystem hierarchy, basic commands
  • Day 3-4: File permissions, ownership, special permissions
  • Day 5-6: Process management, signals, job control
  • Day 7: Review and practice basic scenarios

Week 2: Intermediate Topics (Days 8-14)

  • Day 8-9: Shell scripting fundamentals
  • Day 10-11: Networking commands and concepts
  • Day 12-13: System monitoring and performance
  • Day 14: Review and intermediate practice

Week 3: Advanced & DevOps Topics (Days 15-21)

  • Day 15-16: Containerization (Docker)
  • Day 17-18: Orchestration (Kubernetes basics)
  • Day 19-20: Infrastructure as Code (Terraform basics)
  • Day 21: CI/CD concepts and tools

Week 4: Practice & Mock Interviews (Days 22-30)

  • Day 22-24: Practice common interview questions
  • Day 25-27: Solve practical scenarios and problems
  • Day 28-29: Mock interviews with peers
  • Day 30: Final review and relaxation

Essential Resources

Books:

  • "The Linux Command Line" by William Shotts
  • "How Linux Works" by Brian Ward
  • "Linux Bible" by Christopher Negus
  • "UNIX and Linux System Administration Handbook" by Evi Nemeth

Online Resources:

  • Linux Journey: linuxjourney.com
  • Explain Shell: explainshell.com
  • TLDP: tldp.org guides and HOWTOs
  • Kernel.org Documentation

Practice Platforms:

  • LeetCode: Linux/database problems
  • HackerRank: Linux shell challenges
  • OverTheWire: Bandit wargame for Linux practice
  • Codewars: Shell scripting katas

Interview Success Tips

Before the Interview:

  1. Research the company: Understand their tech stack and infrastructure
  2. Review the job description: Tailor your answers to their requirements
  3. Prepare your environment: Have a Linux VM ready for practical tests
  4. Practice aloud: Explain concepts as you would in the interview
  5. Prepare questions: Have intelligent questions ready for the interviewer

During the Interview:

  1. Think aloud: Explain your thought process as you solve problems
  2. Ask clarifying questions: Don't assume requirements
  3. Admit what you don't know: Be honest, but show how you'd find out
  4. Use examples: Reference real experiences when possible
  5. Stay calm under pressure: Take a moment to think if needed

For Technical Questions:

  1. Start simple: Give basic answer first, then elaborate
  2. Use the STAR method: Situation, Task, Action, Result for scenarios
  3. Draw diagrams: Visual explanations help (if remote, use digital whiteboard)
  4. Check your work: Review your answers for errors
  5. Consider edge cases: Show you think about failure scenarios

Common Mistakes to Avoid:

  • āŒ Memorizing answers: Understand concepts instead
  • āŒ Being too brief: Provide sufficient detail
  • āŒ Getting defensive: Accept constructive criticism
  • āŒ Focusing only on tech: Show communication skills too
  • āŒ Not preparing questions: Shows lack of interest

Key Areas to Master

Must-Know Commands (Be able to explain each):

# Process Management
ps, top, htop, kill, pkill, killall, jobs, bg, fg, nice, renice

# File Operations
ls, find, locate, which, whereis, grep, awk, sed, cut, sort, uniq
chmod, chown, chgrp, umask, setfacl, getfacl

# Disk Management
df, du, mount, umount, fdisk, parted, lsblk, blkid, fsck

# Networking
ping, traceroute, netstat, ss, curl, wget, dig, nslookup
ifconfig, ip, route, iptables, firewall-cmd, tcpdump

# System Monitoring
free, vmstat, iostat, mpstat, sar, dmesg, journalctl

# Text Processing
cat, less, more, tail, head, vi, nano, tee, xargs

# Package Management
apt, yum, dnf, rpm, dpkg (know your distro's tools)

# User Management
useradd, usermod, userdel, groupadd, passwd, su, sudo

Must-Understand Concepts:

  • Linux boot process (BIOS/UEFI → Bootloader → Kernel → Init)
  • Process states (running, sleeping, stopped, zombie)
  • File descriptors and redirection (stdin, stdout, stderr)
  • Shell expansion and quoting (variable, command, arithmetic)
  • Environment variables and shell configuration
  • System logging (syslog, journald, log rotation)
  • Service management (systemd vs init)
  • Network configuration and troubleshooting
  • Security basics (firewalls, SELinux/AppArmor)
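File descriptors and redirection in particular show up in nearly every practical round; a quick refresher:

command > out.txt             # Redirect stdout (fd 1) to a file
command 2> err.txt            # Redirect stderr (fd 2)
command > all.txt 2>&1        # Point stderr at wherever stdout goes
command < input.txt           # Feed a file to stdin (fd 0)
command 2>/dev/null           # Discard error output
ls /nonexistent 2>&1 | wc -l  # Stderr flows through the pipeline too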

DevOps-Specific Knowledge:

  • Container basics (Docker commands, Dockerfile)
  • Orchestration basics (kubectl commands, pod/deployment concepts)
  • Infrastructure as Code (Terraform/CloudFormation basics)
  • CI/CD concepts (Jenkins, GitLab CI, GitHub Actions)
  • Monitoring stack (Prometheus, Grafana, Alertmanager)
  • Log aggregation (ELK stack, Loki)
  • Configuration management (Ansible, Puppet, Chef basics)