- Created comprehensive test suite (700+ lines) * 7 major test functions with 21+ sub-tests * Recovery target validation (time/XID/LSN/name/immediate) * WAL archiving (plain, compressed, with mock files) * WAL parsing (filename validation, error cases) * Timeline management (history parsing, consistency, path finding) * Recovery config generation (PG 12+ and legacy formats) * Data directory validation (exists, writable, not running) * Performance benchmarks (WAL archiving, target parsing) * All tests passing (0.031s execution time) - Updated README.md with PITR documentation (200+ lines) * Complete PITR overview and benefits * Step-by-step setup guide (enable, backup, monitor) * 5 recovery target examples with full commands * Advanced options (compression, encryption, actions, timelines) * Complete WAL management command reference * 7 best practices recommendations * Troubleshooting section with common issues - Created PITR.md standalone guide * Comprehensive PITR documentation * Use cases and practical examples * Setup instructions with alternatives * Recovery operations for all target types * Advanced features (compression, encryption, timelines) * Troubleshooting with debugging tips * Best practices and compliance guidance * Performance considerations - Updated CHANGELOG.md with v3.1 PITR features * Complete feature list (WAL archiving, timeline mgmt, recovery) * New commands (pitr enable/disable/status, wal archive/list/cleanup/timeline) * PITR restore with all target types * Advanced features and configuration examples * Technical implementation details * Performance metrics and use cases Phases completed: - Phase 1: WAL Archiving (1.5h) ✓ - Phase 2: Compression & Encryption (1h) ✓ - Phase 3: Timeline Management (0.75h) ✓ - Phase 4: Point-in-Time Restore (1.25h) ✓ - Phase 5: Tests & Documentation (1.25h) ✓ All PITR functionality implemented, tested, and documented.
16 KiB
Point-in-Time Recovery (PITR) Guide
Complete guide to Point-in-Time Recovery in dbbackup v3.1.
Table of Contents
- Overview
- How PITR Works
- Setup Instructions
- Recovery Operations
- Advanced Features
- Troubleshooting
- Best Practices
Overview
Point-in-Time Recovery (PITR) allows you to restore your PostgreSQL database to any specific moment in time, not just to the time of your last backup. This is crucial for:
- Disaster Recovery: Recover from accidental data deletion, corruption, or malicious changes
- Compliance: Meet regulatory requirements for data retention and recovery
- Testing: Create snapshots at specific points for testing or analysis
- Time Travel: Investigate database state at any historical moment
Use Cases
- Accidental DELETE: User accidentally deletes important data at 2:00 PM. Restore to 1:59 PM.
- Bad Migration: Deploy breaks production at 3:00 PM. Restore to 2:55 PM (before deploy).
- Audit Investigation: Need to see exact database state on Nov 15 at 10:30 AM.
- Testing Scenarios: Create multiple recovery branches to test different outcomes.
How PITR Works
PITR combines three components:
1. Base Backup
A full snapshot of your database at a specific point in time.
# Take a base backup
pg_basebackup -D /backups/base.tar.gz -Ft -z -P
2. WAL Archives
PostgreSQL's Write-Ahead Log (WAL) files contain all database changes. These are continuously archived.
Base Backup (9 AM) → WAL Files (9 AM - 5 PM) → Current State
↓ ↓
Snapshot All changes since backup
3. Recovery Target
The specific point in time you want to restore to. Can be:
- Timestamp:
2024-11-26 14:30:00 - Transaction ID:
1000000 - LSN:
0/3000000(Log Sequence Number) - Named Point:
before_migration - Immediate: Earliest consistent point
Setup Instructions
Prerequisites
- PostgreSQL 9.5+ (12+ recommended for modern recovery format)
- Sufficient disk space for WAL archives (~10-50 GB/day typical)
- dbbackup v3.1 or later
Step 1: Enable WAL Archiving
# Configure PostgreSQL for PITR
./dbbackup pitr enable --archive-dir /backups/wal_archive
# This modifies postgresql.conf:
# wal_level = replica
# archive_mode = on
# archive_command = 'dbbackup wal archive %p %f --archive-dir /backups/wal_archive'
Manual Configuration (alternative):
Edit /etc/postgresql/14/main/postgresql.conf:
# WAL archiving for PITR
wal_level = replica # Minimum required for PITR
archive_mode = on # Enable WAL archiving
archive_command = '/usr/local/bin/dbbackup wal archive %p %f --archive-dir /backups/wal_archive'
max_wal_senders = 3 # For replication (optional)
wal_keep_size = 1GB # Retain WAL on server (optional)
Restart PostgreSQL:
# Restart to apply changes
sudo systemctl restart postgresql
# Verify configuration
./dbbackup pitr status
Step 2: Take a Base Backup
# Option 1: pg_basebackup (recommended)
pg_basebackup -D /backups/base_$(date +%Y%m%d_%H%M%S).tar.gz -Ft -z -P
# Option 2: Regular pg_dump backup
./dbbackup backup single mydb --output /backups/base.dump.gz
# Option 3: File-level copy (PostgreSQL stopped)
sudo service postgresql stop
tar -czf /backups/base.tar.gz -C /var/lib/postgresql/14/main .
sudo service postgresql start
Step 3: Verify WAL Archiving
# Check that WAL files are being archived
./dbbackup wal list --archive-dir /backups/wal_archive
# Expected output:
# 000000010000000000000001 Timeline 1 Segment 0x00000001 16 MB 2024-11-26 09:00
# 000000010000000000000002 Timeline 1 Segment 0x00000002 16 MB 2024-11-26 09:15
# 000000010000000000000003 Timeline 1 Segment 0x00000003 16 MB 2024-11-26 09:30
# Check archive statistics
./dbbackup pitr status
Step 4: Create Restore Points (Optional)
-- Create named restore points before major operations
SELECT pg_create_restore_point('before_schema_migration');
SELECT pg_create_restore_point('before_data_import');
SELECT pg_create_restore_point('end_of_day_2024_11_26');
Recovery Operations
Basic Recovery
Restore to Specific Time:
./dbbackup restore pitr \
--base-backup /backups/base_20241126_090000.tar.gz \
--wal-archive /backups/wal_archive \
--target-time "2024-11-26 14:30:00" \
--target-dir /var/lib/postgresql/14/restored
What happens:
- Extracts base backup to target directory
- Creates recovery configuration (postgresql.auto.conf + recovery.signal)
- Provides instructions to start PostgreSQL
- PostgreSQL replays WAL files until target time reached
- Automatically promotes to primary (default action)
Recovery Target Types
1. Timestamp Recovery
--target-time "2024-11-26 14:30:00"
--target-time "2024-11-26T14:30:00Z" # ISO 8601
--target-time "2024-11-26 14:30:00.123456" # Microseconds
2. Transaction ID (XID) Recovery
# Find XID from logs or pg_stat_activity
--target-xid 1000000
# Use case: Rollback specific transaction
# Check transaction ID: SELECT txid_current();
3. LSN (Log Sequence Number) Recovery
--target-lsn "0/3000000"
# Find LSN: SELECT pg_current_wal_lsn();
# Use case: Precise replication catchup
4. Named Restore Point
--target-name before_migration
# Use case: Restore to pre-defined checkpoint
5. Immediate (Earliest Consistent)
--target-immediate
# Use case: Restore to end of base backup
Recovery Actions
Control what happens after recovery target is reached:
1. Promote (default)
--target-action promote
# PostgreSQL becomes primary, accepts writes
# Use case: Normal disaster recovery
2. Pause
--target-action pause
# PostgreSQL pauses at target, read-only
# Inspect data before committing
# Manually promote: pg_ctl promote -D /path
3. Shutdown
--target-action shutdown
# PostgreSQL shuts down at target
# Use case: Take filesystem snapshot
Advanced Recovery Options
Skip Base Backup Extraction:
# If data directory already exists
./dbbackup restore pitr \
--base-backup /backups/base.tar.gz \
--wal-archive /backups/wal_archive \
--target-time "2024-11-26 14:30:00" \
--target-dir /var/lib/postgresql/14/main \
--skip-extraction
Auto-Start PostgreSQL:
# Automatically start PostgreSQL after setup
./dbbackup restore pitr \
--base-backup /backups/base.tar.gz \
--wal-archive /backups/wal_archive \
--target-time "2024-11-26 14:30:00" \
--target-dir /var/lib/postgresql/14/restored \
--auto-start
Monitor Recovery Progress:
# Monitor recovery in real-time
./dbbackup restore pitr \
--base-backup /backups/base.tar.gz \
--wal-archive /backups/wal_archive \
--target-time "2024-11-26 14:30:00" \
--target-dir /var/lib/postgresql/14/restored \
--auto-start \
--monitor
# Or manually monitor logs:
tail -f /var/lib/postgresql/14/restored/logfile
Non-Inclusive Recovery:
# Exclude target transaction/time
./dbbackup restore pitr \
--base-backup /backups/base.tar.gz \
--wal-archive /backups/wal_archive \
--target-time "2024-11-26 14:30:00" \
--target-dir /var/lib/postgresql/14/restored \
--inclusive=false
Timeline Selection:
# Recover along specific timeline
--timeline 2
# Recover along latest timeline (default)
--timeline latest
# View available timelines:
./dbbackup wal timeline --archive-dir /backups/wal_archive
Advanced Features
WAL Compression
Save 70-80% storage space:
# Enable compression in archive_command
archive_command = 'dbbackup wal archive %p %f --archive-dir /backups/wal_archive --compress'
# Or compress during manual archive:
./dbbackup wal archive /path/to/wal/file %f \
--archive-dir /backups/wal_archive \
--compress
WAL Encryption
Encrypt WAL files for compliance:
# Generate encryption key
openssl rand -hex 32 > /secure/wal_encryption.key
# Enable encryption in archive_command
archive_command = 'dbbackup wal archive %p %f --archive-dir /backups/wal_archive --encrypt --encryption-key-file /secure/wal_encryption.key'
# Or encrypt during manual archive:
./dbbackup wal archive /path/to/wal/file %f \
--archive-dir /backups/wal_archive \
--encrypt \
--encryption-key-file /secure/wal_encryption.key
Timeline Management
PostgreSQL creates a new timeline each time you perform PITR. This allows parallel recovery paths.
View Timeline History:
./dbbackup wal timeline --archive-dir /backups/wal_archive
# Output:
# Timeline Branching Structure:
# ● Timeline 1
# WAL segments: 100 files
# ├─ Timeline 2 (switched at 0/3000000)
# WAL segments: 50 files
# ├─ Timeline 3 [CURRENT] (switched at 0/5000000)
# WAL segments: 25 files
Recover to Specific Timeline:
# Recover to timeline 2 instead of latest
./dbbackup restore pitr \
--base-backup /backups/base.tar.gz \
--wal-archive /backups/wal_archive \
--target-time "2024-11-26 14:30:00" \
--target-dir /var/lib/postgresql/14/restored \
--timeline 2
WAL Cleanup
Manage WAL archive growth:
# Clean up WAL files older than 7 days
./dbbackup wal cleanup \
--archive-dir /backups/wal_archive \
--retention-days 7
# Dry run (preview what would be deleted)
./dbbackup wal cleanup \
--archive-dir /backups/wal_archive \
--retention-days 7 \
--dry-run
Troubleshooting
Common Issues
1. WAL Archiving Not Working
# Check PITR status
./dbbackup pitr status
# Verify PostgreSQL configuration
psql -c "SHOW archive_mode;"
psql -c "SHOW wal_level;"
psql -c "SHOW archive_command;"
# Check PostgreSQL logs
tail -f /var/log/postgresql/postgresql-14-main.log | grep archive
# Test archive command manually
su - postgres -c "dbbackup wal archive /test/path test_file --archive-dir /backups/wal_archive"
2. Recovery Target Not Reached
# Check if required WAL files exist
./dbbackup wal list --archive-dir /backups/wal_archive | grep "2024-11-26"
# Verify timeline consistency
./dbbackup wal timeline --archive-dir /backups/wal_archive
# Review recovery logs
tail -f /var/lib/postgresql/14/restored/logfile
3. Permission Errors
# Fix data directory ownership
sudo chown -R postgres:postgres /var/lib/postgresql/14/restored
# Fix WAL archive permissions
sudo chown -R postgres:postgres /backups/wal_archive
sudo chmod 700 /backups/wal_archive
4. Disk Space Issues
# Check WAL archive size
du -sh /backups/wal_archive
# Enable compression to save space
# Add --compress to archive_command
# Clean up old WAL files
./dbbackup wal cleanup --archive-dir /backups/wal_archive --retention-days 7
5. PostgreSQL Won't Start After Recovery
# Check PostgreSQL logs
tail -50 /var/lib/postgresql/14/restored/logfile
# Verify recovery configuration
cat /var/lib/postgresql/14/restored/postgresql.auto.conf
ls -la /var/lib/postgresql/14/restored/recovery.signal
# Check permissions
ls -ld /var/lib/postgresql/14/restored
Debugging Tips
Enable Verbose Logging:
# Add to postgresql.conf
log_min_messages = debug2
log_error_verbosity = verbose
log_statement = 'all'
Check WAL File Integrity:
# Verify compressed WAL
gunzip -t /backups/wal_archive/000000010000000000000001.gz
# Verify encrypted WAL
./dbbackup wal verify /backups/wal_archive/000000010000000000000001.enc \
--encryption-key-file /secure/key.bin
Monitor Recovery Progress:
-- In PostgreSQL during recovery
SELECT * FROM pg_stat_recovery_prefetch;
SELECT pg_is_in_recovery();
SELECT pg_last_wal_replay_lsn();
Best Practices
1. Regular Base Backups
# Schedule daily base backups
0 2 * * * /usr/local/bin/pg_basebackup -D /backups/base_$(date +\%Y\%m\%d).tar.gz -Ft -z
Why: Limits WAL archive size, faster recovery.
2. Monitor WAL Archive Growth
# Add monitoring
du -sh /backups/wal_archive | mail -s "WAL Archive Size" admin@example.com
# Alert on >100 GB
if [ $(du -s /backups/wal_archive | cut -f1) -gt 100000000 ]; then
echo "WAL archive exceeds 100 GB" | mail -s "ALERT" admin@example.com
fi
3. Test Recovery Regularly
# Monthly recovery test
./dbbackup restore pitr \
--base-backup /backups/base_latest.tar.gz \
--wal-archive /backups/wal_archive \
--target-immediate \
--target-dir /tmp/recovery_test \
--auto-start
# Verify database accessible
psql -h localhost -p 5433 -d postgres -c "SELECT version();"
# Cleanup
pg_ctl stop -D /tmp/recovery_test
rm -rf /tmp/recovery_test
4. Document Restore Points
# Create log of restore points
echo "$(date '+%Y-%m-%d %H:%M:%S') - before_migration - Schema version 2.5 to 3.0" >> /backups/restore_points.log
# In PostgreSQL
SELECT pg_create_restore_point('before_migration');
5. Compression & Encryption
# Always compress (70-80% savings)
--compress
# Encrypt for compliance
--encrypt --encryption-key-file /secure/key.bin
# Combined (compress first, then encrypt)
--compress --encrypt --encryption-key-file /secure/key.bin
6. Retention Policy
# Keep base backups: 30 days
# Keep WAL archives: 7 days (between base backups)
# Cleanup script
#!/bin/bash
find /backups/base_* -mtime +30 -delete
./dbbackup wal cleanup --archive-dir /backups/wal_archive --retention-days 7
7. Monitoring & Alerting
# Check WAL archiving status
psql -c "SELECT last_archived_wal, last_archived_time FROM pg_stat_archiver;"
# Alert if archiving fails
if psql -tAc "SELECT last_failed_wal FROM pg_stat_archiver WHERE last_failed_wal IS NOT NULL;"; then
echo "WAL archiving failed" | mail -s "ALERT" admin@example.com
fi
8. Disaster Recovery Plan
Document your recovery procedure:
## Disaster Recovery Steps
1. Stop application traffic
2. Identify recovery target (time/XID/LSN)
3. Prepare clean data directory
4. Run PITR restore:
./dbbackup restore pitr \
--base-backup /backups/base_latest.tar.gz \
--wal-archive /backups/wal_archive \
--target-time "YYYY-MM-DD HH:MM:SS" \
--target-dir /var/lib/postgresql/14/main
5. Start PostgreSQL
6. Verify data integrity
7. Update application configuration
8. Resume application traffic
9. Create new base backup
Performance Considerations
WAL Archive Size
- Typical: 16 MB per WAL file
- High-traffic database: 1-5 GB/hour
- Low-traffic database: 100-500 MB/day
Recovery Time
- Base backup restoration: 5-30 minutes (depends on size)
- WAL replay: 10-100 MB/sec (depends on disk I/O)
- Total recovery time: backup size / disk speed + WAL replay time
Compression Performance
- CPU overhead: 5-10%
- Storage savings: 70-80%
- Recommended: Use unless CPU constrained
Encryption Performance
- CPU overhead: 2-5%
- Storage overhead: ~1% (header + nonce)
- Recommended: Use for compliance
Compliance & Security
Regulatory Requirements
PITR helps meet:
- GDPR: Data recovery within 72 hours
- SOC 2: Backup and recovery procedures
- HIPAA: Data integrity and availability
- PCI DSS: Backup retention and testing
Security Best Practices
- Encrypt WAL archives containing sensitive data
- Secure encryption keys (HSM, KMS, or secure filesystem)
- Limit access to WAL archive directory (chmod 700)
- Audit logs for recovery operations
- Test recovery from encrypted backups regularly
Additional Resources
- PostgreSQL PITR Documentation: https://www.postgresql.org/docs/current/continuous-archiving.html
- dbbackup GitHub: https://github.com/uuxo/dbbackup
- Report Issues: https://github.com/uuxo/dbbackup/issues
dbbackup v3.1 | Point-in-Time Recovery for PostgreSQL