- Update README.md version badge from v4.0.1 to v4.1.4 - Remove emoticons from CHANGELOG.md (rocket, potato, shield) - Add missing command documentation to QUICK.md (engine, blob stats) - Remove emoticons from RESTORE_PROFILES.md - Fix ENGINES.md command syntax to match actual CLI - Complete METRICS.md with PITR metric examples - Create docs/CATALOG.md - Complete backup catalog reference - Create docs/DRILL.md - Disaster recovery drilling guide - Create docs/RTO.md - Recovery objectives analysis guide All documentation now follows conservative, professional style without emoticons.
7.8 KiB
Disaster Recovery Drilling
Complete guide for automated disaster recovery testing with dbbackup.
Overview
DR drills automate the process of validating backup integrity through actual restore testing. Instead of hoping backups work when needed, automated drills regularly restore backups in isolated containers to verify:
- Backup file integrity
- Database compatibility
- Restore time estimates (RTO)
- Schema validation
- Data consistency
Quick Start
# Run single DR drill on latest backup
dbbackup drill /mnt/backups/databases
# Drill specific database
dbbackup drill /mnt/backups/databases --database myapp
# Drill multiple databases
dbbackup drill /mnt/backups/databases --database myapp,mydb
# Schedule daily drills
dbbackup drill /mnt/backups/databases --schedule daily
How It Works
- Select backup - Picks latest or specified backup
- Create container - Starts isolated database container
- Extract backup - Decompresses to temporary storage
- Restore - Imports data to test database
- Validate - Runs integrity checks
- Cleanup - Removes test container
- Report - Stores results in catalog
Drill Configuration
Select Specific Backup
# Latest backup for database
dbbackup drill /mnt/backups/databases --database myapp
# Backup from specific date
dbbackup drill /mnt/backups/databases --database myapp --date 2026-01-23
# Oldest backup (best test)
dbbackup drill /mnt/backups/databases --database myapp --oldest
Drill Options
# Full validation (slower)
dbbackup drill /mnt/backups/databases --full-validation
# Quick validation (schema only, faster)
dbbackup drill /mnt/backups/databases --quick-validation
# Store results in catalog
dbbackup drill /mnt/backups/databases --catalog
# Send notification on failure
dbbackup drill /mnt/backups/databases --notify-on-failure
# Custom test database name
dbbackup drill /mnt/backups/databases --test-database dr_test_prod
Scheduled Drills
Run drills automatically on a schedule.
Configure Schedule
# Daily drill at 03:00
dbbackup drill /mnt/backups/databases --schedule "03:00"
# Weekly drill (Sunday 02:00)
dbbackup drill /mnt/backups/databases --schedule "sun 02:00"
# Monthly drill (1st of month)
dbbackup drill /mnt/backups/databases --schedule "monthly"
# Install as systemd timer
sudo dbbackup install drill \
--backup-path /mnt/backups/databases \
--schedule "03:00"
Verify Schedule
# Show next 5 scheduled drills
dbbackup drill list --upcoming
# Check drill history
dbbackup drill list --history
# Show drill statistics
dbbackup drill stats
Drill Results
View Drill History
# All drill results
dbbackup drill list
# Recent 10 drills
dbbackup drill list --limit 10
# Drills from last week
dbbackup drill list --after "$(date -d '7 days ago' +%Y-%m-%d)"
# Failed drills only
dbbackup drill list --status failed
# Passed drills only
dbbackup drill list --status passed
Detailed Drill Report
dbbackup drill report myapp_2026-01-23.dump.gz
# Output includes:
# - Backup filename
# - Database version
# - Extract time
# - Restore time
# - Row counts (before/after)
# - Table verification results
# - Data integrity status
# - Pass/Fail verdict
# - Warnings/errors
Validation Types
Full Validation
Deep integrity checks on restored data.
dbbackup drill /mnt/backups/databases --full-validation
# Checks:
# - All tables restored
# - Row counts match original
# - Indexes present and valid
# - Constraints enforced
# - Foreign key references valid
# - Sequence values correct (PostgreSQL)
# - Triggers present (if not system-generated)
Quick Validation
Schema-only validation (fast).
dbbackup drill /mnt/backups/databases --quick-validation
# Checks:
# - Database connects
# - All tables present
# - Column definitions correct
# - Indexes exist
Custom Validation
Run custom SQL checks.
# Add custom validation query
dbbackup drill /mnt/backups/databases \
--validation-query "SELECT COUNT(*) FROM users" \
--validation-expected 15000
# Example for multiple tables
dbbackup drill /mnt/backups/databases \
--validation-query "SELECT COUNT(*) FROM orders WHERE status='completed'" \
--validation-expected 42000
Reporting
Generate Drill Report
# HTML report (email-friendly)
dbbackup drill report --format html --output drill-report.html
# JSON report (for CI/CD pipelines)
dbbackup drill report --format json --output drill-results.json
# Markdown report (GitHub integration)
dbbackup drill report --format markdown --output drill-results.md
Example Report Format
Disaster Recovery Drill Results
================================
Backup: myapp_2026-01-23_14-30-00.dump.gz
Date: 2026-01-25 03:15:00
Duration: 5m 32s
Status: PASSED
Details:
Extract Time: 1m 15s
Restore Time: 3m 42s
Validation Time: 34s
Tables Restored: 42
Rows Verified: 1,234,567
Total Size: 2.5 GB
Validation:
Schema Check: OK
Row Count Check: OK (all tables)
Index Check: OK (all 28 indexes present)
Constraint Check: OK (all 5 foreign keys valid)
Warnings: None
Errors: None
Integration with CI/CD
GitHub Actions
name: Daily DR Drill
on:
schedule:
- cron: '0 3 * * *' # Daily at 03:00
jobs:
dr-drill:
runs-on: ubuntu-latest
steps:
- name: Run DR drill
run: |
dbbackup drill /backups/databases \
--full-validation \
--format json \
--output results.json
- name: Check results
run: |
if grep -q '"status":"failed"' results.json; then
echo "DR drill failed!"
exit 1
fi
- name: Upload report
uses: actions/upload-artifact@v2
with:
name: drill-results
path: results.json
Jenkins Pipeline
pipeline {
triggers {
cron('H 3 * * *') // Daily at 03:00
}
stages {
stage('DR Drill') {
steps {
sh 'dbbackup drill /backups/databases --full-validation --format json --output drill.json'
}
}
stage('Validate Results') {
steps {
script {
def results = readJSON file: 'drill.json'
if (results.status != 'passed') {
error("DR drill failed!")
}
}
}
}
}
}
Troubleshooting
Drill Fails with "Out of Space"
# Check available disk space
df -h
# Clean up old test databases
docker system prune -a
# Use faster storage for test
dbbackup drill /mnt/backups/databases --temp-dir /ssd/drill-temp
Drill Times Out
# Increase timeout (minutes)
dbbackup drill /mnt/backups/databases --timeout 30
# Skip certain validations to speed up
dbbackup drill /mnt/backups/databases --quick-validation
Drill Shows Data Mismatch
Indicates a problem with the backup - investigate immediately:
# Get detailed diff report
dbbackup drill report --show-diffs myapp_2026-01-23.dump.gz
# Regenerate backup
dbbackup backup single myapp --force-full
Best Practices
-
Run weekly drills minimum - Catch issues early
-
Test oldest backups - Verify full retention chain works
dbbackup drill /mnt/backups/databases --oldest -
Test critical databases first - Prioritize by impact
-
Store results in catalog - Track historical pass/fail rates
-
Alert on failures - Automatic notification via email/Slack
-
Document RTO - Use drill times to refine recovery objectives
-
Test cross-major-versions - Use test environment with different DB version
# Test PostgreSQL 15 backup on PostgreSQL 16 dbbackup drill /mnt/backups/databases --target-version 16