This comprehensive guide covers everything from basic file backups to enterprise disaster recovery strategies. Learn what to back up, how and when to back it up, and, most importantly, how to restore when disaster strikes.
1. Backup Fundamentals & Principles
Understand the core principles of data protection before implementing any backup strategy. These fundamentals ensure your backups are reliable and effective.
The 3-2-1 Backup Rule
Critical
What is the 3-2-1 rule?
3 Copies of your data: Original + 2 backups
2 Different Media: Hard drive + Cloud/Tape
1 Off-site Copy: Geographic separation
Why this matters:
Protects against multiple failure scenarios: hardware failure, theft, fire, ransomware, and human error. A single backup is never enough.
Implementation example:
1. Primary: Live data on server
2. Local backup: External HDD/NAS (daily)
3. Cloud backup: AWS S3/Backblaze (weekly)
4. Off-site: Tape at different location (monthly)
Critical Warning
Having only local backups means you lose everything if your building burns down. Always maintain at least one geographically separated copy.
RPO vs RTO Explained
Intermediate
RPO - Recovery Point Objective
Definition: Maximum acceptable data loss measured in time.
Example: RPO of 1 hour means you can afford to lose up to 1 hour of data.
Implies: You need backups at least every hour.
RTO - Recovery Time Objective
Definition: Maximum acceptable downtime for restoration.
Example: RTO of 4 hours means system must be restored within 4 hours.
Implies: You need efficient, automated restore procedures.
Business impact:
• E-commerce: RPO=minutes, RTO=hours (critical revenue loss)
• Development server: RPO=1 day, RTO=1 day (acceptable delay)
• Personal computer: RPO=1 week, RTO=2 days (low priority)
Quick Assessment Tool
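Ask these questions to determine your RPO/RTO: how much data, measured in time, can you afford to lose, and how long can the system stay down before the impact becomes unacceptable?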
2. Essential Backup Tools & Commands
Master the fundamental Linux commands for creating reliable backups. Each tool has specific use cases and advantages.
tar - Tape Archive
Beginner
What each flag does:
• c: Create new archive
• z: Compress with gzip
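• v: Verbose (list each file as it is archived)
• f: Archive file name follows
• $(date +%Y%m%d): Not a tar flag but shell command substitution; it inserts the current date (for example 20240131) into the archive name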
When to use tar:
• Creating compressed archives of directories
• Long-term storage backups
• Transferring multiple files as one
• Simple, reliable file-level backups
Practical tar Examples
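A minimal sketch of a dated directory backup with tar, assuming tar and gzip are installed. The function name make_tar_backup and its two arguments are illustrative, not part of any standard tool; -v is left out so that the only output is the archive path.

```shell
# Archive SRC_DIR into DEST_DIR as <name>-YYYYMMDD.tar.gz and print the path.
make_tar_backup() (
    src_dir=$1
    dest_dir=$2
    name=$(basename "$src_dir")
    archive="$dest_dir/$name-$(date +%Y%m%d).tar.gz"
    mkdir -p "$dest_dir" || exit 1
    # -C switches to the parent directory so the archive stores relative paths
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$name" || exit 1
    # Listing the archive confirms that it can be read back
    tar -tzf "$archive" > /dev/null || exit 1
    printf '%s\n' "$archive"
)
```

For example, make_tar_backup /var/www /mnt/backup would write /mnt/backup/www-<date>.tar.gz; both paths are placeholders.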
rsync - Remote Synchronization
Intermediate
Why rsync suits backups:
• Incremental: Transfers only new or changed files, and only the changed parts of a file when copying over a network
• Resumable: Can continue interrupted transfers (with --partial)
• Bandwidth efficient: Compresses data in transit (with -z)
• Versatile: Local, remote, SSH, and more
Essential rsync flags:
• -a: Archive mode (recursive; preserves permissions, ownership, timestamps, and symlinks)
• -v: Verbose output
• -z: Compress during transfer
• --delete: Delete files at the destination that no longer exist at the source
• --progress: Show transfer progress
• --exclude: Exclude patterns
Production rsync Script
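A minimal sketch of a mirroring backup built from the flags listed above, assuming rsync is installed. The function name mirror_backup and the two exclude patterns are illustrative choices; a production script would add logging and locking around it.

```shell
# Mirror SRC into DEST. Extra arguments are passed through to rsync,
# for example --dry-run to preview what --delete would remove.
mirror_backup() (
    src=$1
    dest=$2
    shift 2
    # The trailing slash on the source copies its contents, not the directory itself
    rsync -a --delete --exclude='*.tmp' --exclude='.cache/' "$@" "${src%/}/" "$dest/"
)
```

Because --delete removes files from the destination, running it once with --dry-run -v against a new destination is a cheap safeguard.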
dd - Disk Duplicator
Advanced
What dd does:
Creates an exact byte-for-byte copy of an entire disk or partition, including empty space, partition tables, and boot sectors.
When to use dd:
• Complete system imaging
• Disaster recovery preparation
• Disk cloning/migration
• Forensic analysis
• Boot disk creation
DANGER Warning!
dd is nicknamed "Data Destroyer" for a reason: it overwrites the target without asking for confirmation, so swapping source and target wipes the disk you meant to save. Always double-check the if= (input) and of= (output) parameters before pressing Enter.
Safe dd Practices
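One way to reduce the risk is to wrap dd in a function that refuses dangerous targets. The sketch below assumes GNU dd (for bs=4M and conv=fsync); the function name safe_image is hypothetical, and /dev/sdX in the usage note is a placeholder for a device you have confirmed with lsblk.

```shell
# Image SRC (a device or file) into a new file DEST, then verify the copy.
safe_image() (
    src=$1
    dest=$2
    if [ ! -r "$src" ]; then
        echo "source not readable: $src" >&2; exit 1
    fi
    # Writing to a block device or over an existing file is never done implicitly
    if [ -b "$dest" ] || [ -e "$dest" ]; then
        echo "refusing to overwrite: $dest" >&2; exit 1
    fi
    dd if="$src" of="$dest" bs=4M conv=fsync || exit 1
    # Byte-for-byte comparison of source and image
    cmp "$src" "$dest"
)
```

A typical call would be safe_image /dev/sdX /mnt/backup/sdX.img, run as root with the source unmounted so it does not change during the copy.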
3. Backup Strategies & Scheduling
Different data requires different backup frequencies and retention policies. Implement a tiered approach for optimal protection.
Grandfather-Father-Son (GFS) Strategy
Intermediate
How GFS works:
Son (Daily): Keep 7 daily backups
Father (Weekly): Keep 4 weekly backups
Grandfather (Monthly): Keep 12 monthly backups
Yearly: Keep 3-7 yearly backups
Advantages:
• Provides multiple recovery points over time
• Efficient storage usage
• Protects against gradual data corruption
• Meets compliance requirements
GFS Rotation Schedule
Daily (Son): Monday-Sunday → daily.0 to daily.6
Keep for 7 days, then discard or promote
Weekly (Father): Every Sunday → weekly.0
Keep for 4 weeks, then promote to monthly
Monthly (Grandfather): Last backup of each month → monthly.0
Keep for 12 months, then promote to yearly
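The rotation can be sketched as a function that decides which tier a given day's backup belongs to. This assumes GNU date (for the -d option), treats every Sunday as the weekly backup and the last calendar day of a month as the monthly one, and leaves the yearly tier out; the name gfs_tier is illustrative.

```shell
# Print the GFS tier (daily, weekly or monthly) for a date in YYYY-MM-DD form.
gfs_tier() (
    day=$1
    dow=$(TZ=UTC date -d "$day" +%u) || exit 1             # 1 = Monday ... 7 = Sunday
    next_dom=$(TZ=UTC date -d "$day +1 day" +%d) || exit 1
    if [ "$next_dom" = "01" ]; then
        echo monthly      # tomorrow is the 1st, so this is the last day of the month
    elif [ "$dow" = "7" ]; then
        echo weekly       # Sunday
    else
        echo daily
    fi
)
```

A backup script could call gfs_tier "$(date +%F)" and write its archive into the matching directory, pruning each tier to its retention count afterwards.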
Incremental vs Differential
Intermediate
Incremental Backup:
How it works: Backs up only files changed since the last backup (of any type)
Storage: Minimal - only changes
Restore: Requires the last full backup plus every incremental taken since, applied in order
Best for: Frequent backups, limited storage
Differential Backup:
How it works: Backs up changes since last full backup
Storage: Grows over time
Restore: Requires full + latest differential
Best for: Medium frequency, faster restore
Backup Type Comparison
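A small worked comparison, as a sketch: it assumes one full backup followed by one backup per day, that each day changes a fixed amount of previously unchanged data, and that every backup is kept. Under those simplifying assumptions the differential on day n holds n days of changes. The function name and the figures are illustrative.

```shell
# Compare storage and restore effort: FULL_GB, CHANGE_GB per day, DAYS since the full.
compare_backup_types() (
    full_gb=$1
    change_gb=$2
    days=$3
    # Incrementals: each day stores only that day's changes
    inc_total=$(( full_gb + change_gb * days ))
    # Differentials: day n stores n days of changes, so the sum is 1 + 2 + ... + days
    diff_total=$(( full_gb + change_gb * days * (days + 1) / 2 ))
    echo "incremental: storage=${inc_total}GB restore_archives=$(( days + 1 ))"
    echo "differential: storage=${diff_total}GB restore_archives=2"
)

compare_backup_types 100 5 6
# incremental: storage=130GB restore_archives=7
# differential: storage=205GB restore_archives=2
```

The trade-off is visible in the numbers: incrementals use less space but need seven archives to restore, while differentials use more space and need only two.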
Automated Scheduling with Cron
Beginner
Cron syntax explained:
Format: minute hour day-of-month month day-of-week command
• * = any value
• */5 = every 5th value of the field (for example, every 5 minutes)
• 1,3,5 = specific values
• 1-5 = range of values
Complete Backup Schedule
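A minimal sketch of a schedule that matches a daily/weekly/monthly rotation. The three script paths are hypothetical placeholders, and the times are arbitrary off-peak choices; the function only prints the entries so they can be reviewed before installation.

```shell
# Print crontab entries for a tiered backup schedule.
print_backup_crontab() {
    cat <<'EOF'
# minute hour day-of-month month day-of-week command
0 2 * * 1-6 /usr/local/bin/backup-daily.sh
0 3 * * 0 /usr/local/bin/backup-weekly.sh
0 4 1 * * /usr/local/bin/backup-monthly.sh
EOF
}
```

The entries run the daily job at 02:00 Monday to Saturday, the weekly job at 03:00 on Sunday, and the monthly job at 04:00 on the 1st. To append them to the current user's crontab: (crontab -l 2>/dev/null; print_backup_crontab) | crontab -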
4. Database Backup & Recovery
Databases require special handling for consistent backups. Learn application-aware backup techniques for MySQL, PostgreSQL, and MongoDB.
MySQL/MariaDB Backup
Intermediate
Critical flags explained:
• --single-transaction: Creates consistent backup without locking (InnoDB)
• --routines: Includes stored procedures/functions
• --triggers: Includes database triggers
• --events: Includes scheduled events
• --master-data: Records the binary log file name and position in the dump (needed for replication); MySQL 8.0.26 and later call this option --source-data
Production MySQL Backup Script
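A minimal sketch of a dump function using the flags above. It assumes credentials come from an option file such as ~/.my.cnf rather than the command line; the function name mysql_backup and the MYSQLDUMP override variable are hypothetical, the latter added only so the dump command can be swapped out in tests.

```shell
# Dump database DB into DEST_DIR as <db>-YYYYMMDD.sql.gz and print the path.
mysql_backup() (
    db=$1
    dest_dir=$2
    dump_cmd=${MYSQLDUMP:-mysqldump}   # hypothetical override hook
    out="$dest_dir/$db-$(date +%Y%m%d).sql"
    mkdir -p "$dest_dir" || exit 1
    # Dump to a plain file first so a failed dump is detected before compressing
    if ! "$dump_cmd" --single-transaction --routines --triggers --events \
            "$db" > "$out"; then
        rm -f "$out"
        exit 1
    fi
    gzip -f "$out" && printf '%s\n' "$out.gz"
)
```

Dumping before compressing avoids the common pitfall where a pipeline into gzip hides a mysqldump failure and leaves a truncated archive that looks valid.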
PostgreSQL Backup
Intermediate
PostgreSQL backup methods:
• pg_dump: Logical backup of single database
• pg_dumpall: Logical backup of all databases + roles
• pg_basebackup: Physical backup (filesystem level)
• WAL Archiving: Continuous backup for PITR
PostgreSQL Continuous Backup
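Continuous backup combines a base backup with archived WAL segments. The sketch below shows an archive helper of the kind PostgreSQL's archive_command can call; the function name, the script path in the comment, and the default directory /mnt/wal_archive are assumptions, while wal_level, archive_mode, archive_command, %p and %f are standard PostgreSQL settings and placeholders.

```shell
# Assumed postgresql.conf settings:
#   wal_level = replica
#   archive_mode = on
#   archive_command = '/usr/local/bin/archive_wal.sh %p %f'
# where archive_wal.sh contains this function followed by: archive_wal "$@"
archive_wal() (
    wal_path=$1                        # %p: path of the segment to archive
    wal_name=$2                        # %f: file name of the segment
    archive_dir=${3:-/mnt/wal_archive}
    # Never overwrite an archived segment; a non-zero status makes PostgreSQL retry later
    [ ! -e "$archive_dir/$wal_name" ] || exit 1
    # Copy under a temporary name, then rename, so partial files are never mistaken for segments
    cp "$wal_path" "$archive_dir/$wal_name.tmp" || exit 1
    mv "$archive_dir/$wal_name.tmp" "$archive_dir/$wal_name"
)
```

The matching base backup can be taken with pg_basebackup -D /backup/base -Ft -z -P, where /backup/base is a placeholder directory.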
5. Disaster Recovery Procedures
When disaster strikes, having documented recovery procedures is crucial. Test these regularly to ensure they work when needed.
Complete System Recovery
Critical
Step-by-step recovery:
Step 1: Assess the damage
File-Level Recovery
Beginner
Common recovery scenarios:
• Accidental deletion: Restore from backup
• File corruption: Restore previous version
• Ransomware: Restore from clean backup
• Permission issues: Restore original permissions
Quick File Recovery Commands
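A minimal sketch of single-file recovery from a gzip-compressed tar archive. The function name restore_file is illustrative; it restores into a scratch directory rather than over the live file, so the recovered copy can be inspected first.

```shell
# Extract one MEMBER from ARCHIVE into RESTORE_DIR.
restore_file() (
    archive=$1
    member=$2        # path exactly as stored in the archive (see: tar -tzf "$archive")
    restore_dir=$3
    mkdir -p "$restore_dir" || exit 1
    # Only the named member is extracted; -p keeps the stored permissions
    tar -xzpf "$archive" -C "$restore_dir" "$member"
)
```

To find the member name first, list the archive and filter it, for example tar -tzf backup.tar.gz | grep report, where backup.tar.gz and report are placeholders.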
6. Cloud & Remote Backup Strategies
Cloud storage provides geographic redundancy and scalability. Implement secure, automated cloud backup solutions.
AWS S3 Backup
Advanced
AWS CLI setup:
1. Install AWS CLI: apt install awscli
2. Configure credentials: aws configure
3. Create S3 bucket: aws s3 mb s3://my-backup-bucket
4. Enable versioning: aws s3api put-bucket-versioning --bucket my-backup-bucket --versioning-configuration Status=Enabled
Automated S3 Backup Script
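A minimal sketch of an upload step, assuming the AWS CLI is installed and configured as described above. The function name, the backups/YYYY/MM key layout, and the AWS_CLI override variable are hypothetical; the storage class and server-side encryption options are choices to adapt, not requirements.

```shell
# Upload ARCHIVE to s3://BUCKET/backups/YYYY/MM/<file name>.
s3_upload_backup() (
    archive=$1
    bucket=$2                    # bucket name without the s3:// prefix
    aws_cmd=${AWS_CLI:-aws}      # hypothetical override hook
    if [ ! -f "$archive" ]; then
        echo "no such file: $archive" >&2; exit 1
    fi
    key="backups/$(date +%Y/%m)/$(basename "$archive")"
    # Infrequent-access storage and server-side encryption for backup data
    "$aws_cmd" s3 cp "$archive" "s3://$bucket/$key" \
        --storage-class STANDARD_IA --sse AES256 --only-show-errors
)
```

The function returns the exit status of aws s3 cp, so a calling script can alert on failure instead of assuming the upload succeeded.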
Rsync over SSH (Remote Backup)
Intermediate
SSH key setup for automation:
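A minimal sketch of creating a dedicated key for unattended backups, assuming OpenSSH is installed. The function name, the key comment, and the host and paths in the trailing comments are placeholders.

```shell
# Create a dedicated, passphrase-less Ed25519 key pair at KEY_PATH.
make_backup_key() (
    key_path=$1
    # Refuse to overwrite an existing key
    if [ -e "$key_path" ]; then
        echo "key already exists: $key_path" >&2; exit 1
    fi
    # -N '' sets an empty passphrase so scheduled jobs can use the key unattended
    ssh-keygen -q -t ed25519 -N '' -C 'backup-automation' -f "$key_path"
)

# Typical follow-up steps (placeholders):
#   ssh-copy-id -i ~/.ssh/backup_ed25519.pub backup@backup.example.com
#   rsync -az -e "ssh -i ~/.ssh/backup_ed25519" /srv/data/ backup@backup.example.com:/backups/data/
```

Using a separate key for backups, rather than a personal one, means it can be restricted or revoked without affecting interactive logins.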
Security Considerations
Passwordless SSH keys are convenient but risky. Implement additional security:
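One common hardening measure is to restrict what the backup key may do on the server through options in authorized_keys. The sketch below builds such a line; it assumes an OpenSSH version that supports the restrict option, and the function name is hypothetical. The allowed command must not itself contain double quotes.

```shell
# Print an authorized_keys line that limits PUBKEY_FILE to a single forced command.
restricted_key_line() (
    pubkey_file=$1
    allowed_cmd=$2
    [ -r "$pubkey_file" ] || exit 1
    # restrict turns off forwarding and PTY allocation;
    # command= runs this command no matter what the client asks for
    printf 'restrict,command="%s" %s\n' "$allowed_cmd" "$(cat "$pubkey_file")"
)
```

On the backup server this could be used as restricted_key_line backup_ed25519.pub "rrsync /backups/data" >> ~/.ssh/authorized_keys, where rrsync is the restricted-rsync helper distributed with rsync (its install path varies) and the paths are placeholders.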
7. Backup Verification & Testing
A backup is only as good as your ability to restore from it. Regular testing is non-negotiable for reliable data protection.
Backup Integrity Checks
Critical
What to verify:
• File integrity: No corruption in backup files
• Completeness: All required files are backed up
• Consistency: Databases are transactionally consistent
• Accessibility: Backup media is readable
• Encryption: Encrypted backups can be decrypted
Automated Verification Script
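A minimal sketch of an integrity check for a gzip-compressed tar archive. It assumes that a checksum file named <archive>.sha256 was written next to the archive at backup time (for example with sha256sum backup.tar.gz > backup.tar.gz.sha256, run in the archive's directory); the function name is illustrative. It covers file integrity and readability only, not database consistency or completeness.

```shell
# Return 0 only if ARCHIVE passes all three checks.
verify_backup() (
    archive=$1
    # 1. The gzip stream must be intact
    gzip -t "$archive" || exit 1
    # 2. The tar structure must be readable to the end
    tar -tzf "$archive" > /dev/null || exit 1
    # 3. The checksum recorded at backup time must still match
    cd "$(dirname "$archive")" || exit 1
    sha256sum -c "$(basename "$archive").sha256" > /dev/null
)
```

A non-zero exit status from verify_backup is a natural trigger for an alert in a scheduled job.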
Restore Testing Schedule
Recovery
Testing frequency:
• Weekly: Test single file restore
• Monthly: Test database restore
• Quarterly: Test full application restore
• Annually: Full disaster recovery drill
• After changes: Test whenever backup process changes
Quarterly DR Test Procedure
• Schedule maintenance window
• Notify stakeholders
• Prepare test environment
• Simulate disaster scenario
• Restore from latest backup
• Verify system functionality
• Measure recovery time
• Record actual RTO/RPO
• Note issues encountered
• Update recovery procedures
• Share lessons learned
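The "measure recovery time" step can be sketched as a wrapper that times any restore command and compares the result with the RTO target. The function name timed_restore is hypothetical, and the resolution is whole seconds.

```shell
# Run a restore command and succeed only if it works AND finishes within RTO_SECONDS.
timed_restore() (
    rto_seconds=$1
    shift
    start=$(date +%s)
    if "$@"; then status=0; else status=$?; fi
    elapsed=$(( $(date +%s) - start ))
    echo "restore exit status: $status, elapsed: ${elapsed}s, RTO target: ${rto_seconds}s"
    [ "$status" -eq 0 ] && [ "$elapsed" -le "$rto_seconds" ]
)
```

For a 4-hour RTO a drill might run timed_restore 14400 ./restore-app.sh, where restore-app.sh is a placeholder for the documented restore procedure; the printed line goes straight into the test record.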
8. Complete Backup Implementation
Putting it all together: A complete, production-ready backup solution with monitoring, alerting, and documentation.
Complete Backup Architecture
Advanced
Multi-tier backup architecture:
Level 1: Local Snapshots
• Technology: LVM/ZFS/Btrfs snapshots
• Frequency: Hourly
• Retention: 24 hours
• Purpose: Quick file recovery
Level 2: Local Backup Server
• Technology: rsync + hard links
• Frequency: Daily
• Retention: 30 days (GFS rotation)
• Purpose: Server recovery
Level 3: Cloud Storage
• Technology: AWS S3/Backblaze
• Frequency: Weekly
• Retention: 1 year
• Purpose: Disaster recovery
Level 4: Off-site Archive
• Technology: Tape/LTO
• Frequency: Monthly
• Retention: 7 years
• Purpose: Compliance/archival
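The "rsync + hard links" technique of Level 2 can be sketched with rsync's --link-dest option: every snapshot looks like a full copy, but unchanged files are hard links into the previous snapshot and take no extra space. The function name, the latest symlink, and the timestamp format are assumptions; backup_root should be an absolute path.

```shell
# Create a new snapshot of SRC under BACKUP_ROOT, hard-linking unchanged files.
snapshot_backup() (
    src=$1
    backup_root=$2
    stamp=${3:-$(date +%Y%m%d-%H%M%S)}
    latest="$backup_root/latest"
    mkdir -p "$backup_root" || exit 1
    if [ -d "$latest" ]; then
        # Unchanged files become hard links into the previous snapshot
        rsync -a --link-dest="$latest/" "${src%/}/" "$backup_root/$stamp/" || exit 1
    else
        # First run: nothing to link against, so make a plain copy
        rsync -a "${src%/}/" "$backup_root/$stamp/" || exit 1
    fi
    # Repoint "latest" at the snapshot just taken
    ln -sfn "$stamp" "$latest"
)
```

Old snapshots can be removed with a plain rm -rf on their directory; files still referenced by newer snapshots survive because their hard links remain.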
Monitoring & Alerting
Intermediate
What to monitor:
• Backup success/failure status
• Backup duration and size
• Storage capacity trends
• Verification test results
• Recovery time measurements
Nagios/Icinga Backup Check
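A minimal sketch of a check that follows the Nagios/Icinga plugin convention of exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL). It judges a backup only by the age of its file, using find's -mmin test; the function name and the thresholds in the usage note are illustrative.

```shell
# Check the age of BACKUP_FILE against WARN_MIN and CRIT_MIN (both in minutes).
check_backup_age() (
    backup_file=$1
    warn_min=$2
    crit_min=$3
    if [ ! -f "$backup_file" ]; then
        echo "BACKUP CRITICAL - $backup_file not found"; exit 2
    fi
    # find prints the path only if it was modified less than N minutes ago
    if [ -n "$(find "$backup_file" -mmin "-$warn_min")" ]; then
        echo "BACKUP OK - $backup_file is newer than $warn_min minutes"; exit 0
    elif [ -n "$(find "$backup_file" -mmin "-$crit_min")" ]; then
        echo "BACKUP WARNING - $backup_file is older than $warn_min minutes"; exit 1
    else
        echo "BACKUP CRITICAL - $backup_file is older than $crit_min minutes"; exit 2
    fi
)
```

For a daily backup, check_backup_age /mnt/backup/latest.tar.gz 1500 2880 would warn after 25 hours and go critical after 48; the path is a placeholder.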
Backup Success Checklist
Implementation Checklist:
1. Define RPO/RTO for each system
2. Implement 3-2-1 rule with off-site copy
3. Choose appropriate tools (rsync, tar, database-specific)
4. Set up GFS rotation with proper retention
5. Automate scheduling with cron/systemd timers
6. Enable monitoring and alerting
7. Document procedures for restoration
8. Test regularly - backup without restore is useless
9. Review annually - update with system changes
10. Train staff - ensure multiple people can restore
Pro Tips for Success
• Start small: Protect critical data first, expand later
• Automate everything: Manual backups are forgotten backups
• Test restores: The only way to know backups work
• Monitor proactively: Don't wait for failure to check backups
• Keep it simple: Complex systems fail in complex ways
• Document thoroughly: You won't remember details during crisis
• Budget appropriately: Backup storage costs money, but data loss costs more
Common Backup Failures to Avoid
1. Backing up to same disk: Disk failure loses original AND backup
2. No off-site copy: Fire/theft/ransomware takes everything
3. Untested restores: Backups that don't work are worthless
4. Insufficient retention: Can't recover from corruption discovered weeks later
5. No monitoring: Silent failures go unnoticed until needed
6. Single point of knowledge: Only one person knows how to restore
7. Ignoring database consistency: File-level backup of live database
8. No encryption for cloud: Sensitive data exposed
9. Backing up everything: Wasting storage on non-critical data
10. Forgetting to update: Not adapting backup strategy to system changes