Data is Irreplaceable
Data loss can be catastrophic. Whether it's customer data, application databases, or configuration files, losing data means losing business. Automated backup systems aren't optional—they're essential infrastructure.
Backup Strategy
3-2-1 Rule
The industry standard:
- 3 copies: Original + 2 backups
- 2 different media: Different storage types
- 1 off-site: Geographic separation
Backup Types
Different backup strategies:
Full Backups
- Complete copy of all data
- Slow to create and storage-heavy, but recovery is a single restore
- Use for weekly/monthly backups
Incremental Backups
- Only changed data since last backup
- Fast to create, but a restore needs the last full backup plus every incremental since
- Use for daily backups
Differential Backups
- Changed data since last full backup
- Restores need only the full backup plus the latest differential
- A middle ground between backup speed and storage use
Database Backups
PostgreSQL
# Full backup
pg_dump -h localhost -U user -d database -F c -f backup.dump
# Automated daily backup script
#!/bin/bash
# Non-interactive auth comes from ~/.pgpass or the PGPASSWORD environment variable
BACKUP_DIR="/backups/postgres"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"
pg_dump -h localhost -U user -d mydb -F c -f "$BACKUP_DIR/mydb_$DATE.dump"
# Keep only the last 30 days of dumps
find "$BACKUP_DIR" -name "*.dump" -mtime +30 -delete
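Restoring a custom-format dump goes through pg_restore; a quick sketch, assuming the target database already exists and uses the same names as above:
# Restore the dump into an existing database, dropping objects that are being re-created
pg_restore -h localhost -U user -d mydb --clean --if-exists backup.dump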
MySQL
# Full backup
mysqldump -u user -p database > backup.sql
# Automated backup: put credentials in ~/.my.cnf so cron isn't blocked by a password prompt
# (--single-transaction gives a consistent snapshot for InnoDB tables)
mysqldump --all-databases --single-transaction | gzip > backup_$(date +%Y%m%d).sql.gz
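Restoring is just feeding the dump back into the mysql client; a quick sketch using the compressed all-databases dump from above (the filename is a placeholder):
# Restore a compressed all-databases dump (credentials again come from ~/.my.cnf)
gunzip < backup_YYYYMMDD.sql.gz | mysql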
MongoDB
# Full backup
mongodump --host localhost --db mydb --out /backups/mongodb
# With compression
mongodump --host localhost --db mydb --archive=/backups/mydb.archive --gzip
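The counterpart for restores is mongorestore; a sketch covering both backups created above:
# Restore from the gzipped archive
mongorestore --host localhost --gzip --archive=/backups/mydb.archive
# Restore from the dump directory
mongorestore --host localhost /backups/mongodb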
File System Backups
rsync for Incremental Backups
# Mirror the source tree to the backup destination (--delete removes files deleted from the source)
rsync -avz --delete \
  --exclude 'node_modules' \
  --exclude '.git' \
  /source/directory/ \
  /backup/destination/
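For true incremental history rather than a single rolling mirror, rsync's --link-dest hard-links unchanged files against the previous snapshot, so each dated directory looks like a full copy while only changed files take new space. A minimal sketch, assuming snapshots live under /backup with a latest symlink pointing at the newest one:
# Snapshot-style incremental backup using hard links to the previous run
# (on the very first run there is no previous snapshot; rsync simply copies everything)
DATE=$(date +%Y%m%d)
rsync -a --delete \
  --link-dest=/backup/latest \
  /source/directory/ \
  "/backup/$DATE/"
# Point "latest" at the new snapshot
ln -sfn "/backup/$DATE" /backup/latest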
Tar Archives
# Create compressed archive
tar -czf backup_$(date +%Y%m%d).tar.gz /path/to/backup
# Automated weekly full backup
tar -czf /backups/weekly_$(date +%Y%m%d).tar.gz \
  --exclude=/backups \
  /home /var/www /etc
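GNU tar can also produce incrementals on its own via --listed-incremental, which tracks file state in a snapshot file; a minimal sketch (paths are illustrative):
# Level-0 (full) backup; tar records file state in the snapshot file
tar -czf /backups/full_$(date +%Y%m%d).tar.gz \
  --listed-incremental=/backups/home.snar /home
# Later runs against the same snapshot file archive only what changed
tar -czf /backups/incr_$(date +%Y%m%d).tar.gz \
  --listed-incremental=/backups/home.snar /home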
Off-Site Backups
Cloud Storage
Popular options:
AWS S3
- Highly durable (99.999999999%)
- Lifecycle policies for cost optimization
- Glacier for long-term storage
- Cross-region replication
Backblaze B2
- Cost-effective
- S3-compatible API
- Good performance
- Simpler pricing
Google Cloud Storage
- Integrated with GCP services
- Multiple storage classes
- Strong consistency
Backup to Cloud
# Upload to S3
aws s3 sync /local/backups s3://my-backup-bucket/backups/
# With lifecycle policy (move to Glacier after 30 days)
# Lifecycle rules are configured on the bucket (console or CLI; see the sketch below)
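A sketch of applying such a rule from the CLI, assuming the bucket above and a backups/ prefix:
# Create the lifecycle rule: move objects under backups/ to Glacier after 30 days
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-old-backups",
      "Status": "Enabled",
      "Filter": { "Prefix": "backups/" },
      "Transitions": [{ "Days": 30, "StorageClass": "GLACIER" }]
    }
  ]
}
EOF
# Apply it to the bucket
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-backup-bucket \
  --lifecycle-configuration file://lifecycle.json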
Encryption
Encrypt backups before uploading:
# Encrypt before upload
# Encrypt before upload (-pbkdf2 selects a modern key-derivation function)
tar -czf - /data | openssl enc -aes-256-cbc -salt -pbkdf2 \
  -out backup_encrypted.tar.gz.enc \
  -pass file:/path/to/password
# Decrypt (flags must match the ones used to encrypt)
openssl enc -d -aes-256-cbc -pbkdf2 -in backup_encrypted.tar.gz.enc \
  -out backup.tar.gz \
  -pass file:/path/to/password
Backup Automation
Cron Jobs
Schedule regular backups:
# Daily database backup at 2 AM
0 2 * * * /usr/local/bin/backup-database.sh
# Weekly full backup on Sunday at 1 AM
0 1 * * 0 /usr/local/bin/full-backup.sh
# Monthly archive backup
0 3 1 * * /usr/local/bin/monthly-archive.sh
Backup Scripts
Comprehensive backup script:
#!/bin/bash
# Exit on any error, unset variable, or failed pipeline stage
set -euo pipefail
BACKUP_DIR="/backups"
DATE=$(date +%Y%m%d_%H%M%S)
LOG_FILE="/var/log/backups.log"
log() {
  echo "[$(date)] $1" >> "$LOG_FILE"
}
# Database backup
log "Starting database backup..."
pg_dump -h localhost -U user -d mydb \
  -F c -f "$BACKUP_DIR/db_$DATE.dump"
log "Database backup completed"
# File backup
log "Starting file backup..."
tar -czf "$BACKUP_DIR/files_$DATE.tar.gz" /var/www
log "File backup completed"
# Upload to S3
log "Uploading to S3..."
aws s3 cp "$BACKUP_DIR/db_$DATE.dump" \
  s3://my-backup-bucket/database/
aws s3 cp "$BACKUP_DIR/files_$DATE.tar.gz" \
  s3://my-backup-bucket/files/
log "Upload completed"
# Clean up local backups older than 7 days
find "$BACKUP_DIR" -name "*.dump" -mtime +7 -delete
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +7 -delete
log "Backup process completed successfully"
Backup Verification
Verify Backups
Never assume backups work—verify them:
- Test restores: Regularly restore from backups
- Checksum verification: Verify file integrity
- Automated testing: Script restore tests
- Documentation: Document restore procedures
Verification Script
#!/bin/bash
# Verify backup integrity (structural checks only; a real restore test is still required)
BACKUP_FILE="$1"
# Check that the file exists and is readable
if [ ! -r "$BACKUP_FILE" ]; then
  echo "Error: Cannot read backup file"
  exit 1
fi
# Verify a PostgreSQL custom-format dump by listing its contents
if [[ "$BACKUP_FILE" == *.dump ]]; then
  if pg_restore --list "$BACKUP_FILE" > /dev/null; then
    echo "PostgreSQL backup is valid"
  else
    echo "Error: PostgreSQL backup is corrupted"
    exit 1
  fi
fi
# Verify a tar archive by listing its contents
if [[ "$BACKUP_FILE" == *.tar.gz ]]; then
  if tar -tzf "$BACKUP_FILE" > /dev/null; then
    echo "Tar archive is valid"
  else
    echo "Error: Tar archive is corrupted"
    exit 1
  fi
fi
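Listing an archive only proves it is structurally intact; an automated restore test goes a step further by restoring into a scratch database and running a sanity query. A sketch, assuming a local PostgreSQL instance and the dump naming used above (restore_test is a hypothetical scratch database name):
#!/bin/bash
set -e
# Restore the newest dump into a throwaway database and sanity-check it
LATEST_DUMP=$(ls -t /backups/*.dump | head -n 1)
dropdb --if-exists restore_test
createdb restore_test
pg_restore --no-owner -d restore_test "$LATEST_DUMP"
# The restored database should contain at least one table
TABLE_COUNT=$(psql -d restore_test -t -A -c \
  "SELECT count(*) FROM information_schema.tables WHERE table_schema = 'public'")
if [ "$TABLE_COUNT" -gt 0 ]; then
  echo "Restore test passed ($TABLE_COUNT tables)"
else
  echo "Restore test failed: no tables restored"
  exit 1
fi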
Disaster Recovery
Recovery Procedures
Document recovery steps:
- Assess damage: What data is lost?
- Choose backup: Select appropriate backup point
- Prepare environment: Set up recovery environment
- Restore data: Execute restore procedures
- Verify restoration: Confirm data integrity
- Resume operations: Bring systems back online
- Post-mortem: Document what happened
Recovery Time Objectives (RTO)
Define acceptable downtime:
- Critical systems: < 1 hour
- Important systems: < 4 hours
- Non-critical: < 24 hours
Recovery Point Objectives (RPO)
Define acceptable data loss:
- Critical data: < 15 minutes (see the WAL archiving sketch after this list)
- Important data: < 1 hour
- Non-critical: < 24 hours
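A nightly dump alone cannot meet a 15-minute RPO; for PostgreSQL, a target that tight usually means continuous WAL archiving (point-in-time recovery) on top of periodic base backups. A minimal sketch of the relevant postgresql.conf settings, assuming a local /backups/wal archive directory:
# postgresql.conf excerpt: archive every completed WAL segment
wal_level = replica
archive_mode = on
archive_command = 'test ! -f /backups/wal/%f && cp %p /backups/wal/%f'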
Monitoring & Alerts
Backup Monitoring
Monitor backup health:
- Backup success/failure: Alert on failures
- Backup size: Detect anomalies
- Backup duration: Performance monitoring
- Storage usage: Prevent disk full
- Restore tests: Regular verification
Alerting
Set up alerts for:
- Backup failures: Immediate notification (see the sample check after this list)
- Storage issues: Disk space warnings
- Restore test failures: Backup corruption
- Cloud upload failures: Off-site backup issues
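A minimal sketch of a freshness check that can run from cron and page on failure; the webhook URL is a placeholder for whatever alerting endpoint you use (Slack, PagerDuty, etc.):
#!/bin/bash
# Alert if the newest database dump is missing or older than 24 hours
WEBHOOK_URL="https://example.com/backup-alerts"  # placeholder endpoint
LATEST=$(find /backups -name "*.dump" -mmin -1440 | head -n 1)
if [ -z "$LATEST" ]; then
  curl -s -X POST -H 'Content-Type: application/json' \
    -d '{"text": "Backup alert: no database dump newer than 24 hours"}' \
    "$WEBHOOK_URL"
fi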
Real-World Implementation
For a client's infrastructure, I set up:
- Daily incremental backups: Databases and files
- Weekly full backups: Complete system snapshots
- Monthly archives: Long-term retention
- S3 off-site storage: Geographic redundancy
- Automated verification: Daily restore tests
- Monitoring and alerts: Backup health tracking
- Documented procedures: Recovery runbooks
Results:
- 100% backup success rate (monitored and verified)
- < 1 hour RTO for critical systems
- < 15 minute RPO for critical data
- Zero data loss incidents
- Automated 95% of backup process
Best Practices
- Automate everything: Manual backups are unreliable
- Test restores regularly: Backups are useless if you can't restore
- Encrypt sensitive data: Protect backups from unauthorized access
- Off-site storage: Geographic separation is essential
- Monitor backup health: Know immediately if backups fail
- Document procedures: Recovery needs clear documentation
- Regular reviews: Update backup strategy as needs change
Backup systems are insurance policies you hope to never use, but when disaster strikes, they're the difference between business continuity and business failure. Invest in robust, automated backup systems and verify they work regularly.
