Files
dbbackup/CHANGELOG.md
Alexander Renz 9d52f43d29
All checks were successful
CI/CD / Test (push) Successful in 2m59s
CI/CD / Lint (push) Successful in 1m11s
CI/CD / Integration Tests (push) Successful in 52s
CI/CD / Native Engine Tests (push) Successful in 49s
CI/CD / Build Binary (push) Successful in 42s
CI/CD / Test Release Build (push) Successful in 1m15s
CI/CD / Release Binaries (push) Successful in 10m31s
v5.6.0: Native Engine Performance Optimizations - 3.5x Faster Backup
PERFORMANCE BENCHMARKS (1M rows, 205 MB):
- Backup: 4.0s native vs 14.1s pg_dump = 3.5x FASTER
- Restore: 8.7s native vs 9.9s pg_restore = 13% FASTER
- Throughput: 250K rows/sec backup, 115K rows/sec restore

CONNECTION POOL OPTIMIZATIONS:
- MinConns = Parallel (warm pool, no connection setup delay)
- MaxConns = Parallel + 2 (headroom for metadata queries)
- Health checks every 1 minute
- Max lifetime 1 hour, idle timeout 5 minutes

RESTORE SESSION OPTIMIZATIONS:
- synchronous_commit = off (async WAL commits)
- work_mem = 256MB (faster sorts and hashes)
- maintenance_work_mem = 512MB (faster index builds)
- session_replication_role = replica (bypass triggers/FK checks)

Files changed:
- internal/engine/native/postgresql.go: Pool optimization
- internal/engine/native/restore.go: Session performance settings
- main.go: v5.5.3 → v5.6.0
- CHANGELOG.md: Performance benchmark results
2026-02-02 20:48:56 +01:00

95 KiB
Raw Blame History

Changelog

All notable changes to dbbackup will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[5.6.0] - 2026-02-02

Performance Optimizations 🚀

  • Native Engine Outperforms pg_dump/pg_restore!
    • Backup: 3.5x faster than pg_dump (250K vs 71K rows/sec)
    • Restore: 13% faster than pg_restore (115K vs 101K rows/sec)
    • Tested with 1M row database (205 MB)

Enhanced

  • Connection Pool Optimizations

    • Optimized min/max connections for warm pool
    • Added health check configuration
    • Connection lifetime and idle timeout tuning
  • Restore Session Optimizations

    • synchronous_commit = off for async commits
    • work_mem = 256MB for faster sorts
    • maintenance_work_mem = 512MB for faster index builds
    • session_replication_role = replica to bypass triggers/FK checks
  • TUI Improvements

    • Fixed separator line placement in Cluster Restore Progress view

Technical Details

  • internal/engine/native/postgresql.go: Pool optimization with min/max connections
  • internal/engine/native/restore.go: Session-level performance settings

[5.5.3] - 2026-02-02

Fixed

  • Fixed TUI separator line to appear under title instead of after it

[5.5.2] - 2026-02-02

Fixed

  • CRITICAL: Native Engine Array Type Support
    • Fixed: Array columns (e.g., INTEGER[], TEXT[]) were exported as just ARRAY
    • Now properly exports array types using PostgreSQL's udt_name from information_schema
    • Supports all common array types: integer[], text[], bigint[], boolean[], bytea[], json[], jsonb[], uuid[], timestamp[], etc.

Verified Working

  • Full BLOB/Binary Data Round-Trip Validated
    • BYTEA columns with NULL bytes (0x00) preserved correctly
    • Unicode data (emoji 🚀, Chinese 中文, Arabic العربية) preserved
    • JSON/JSONB with Unicode preserved
    • Integer and text arrays restored correctly
    • 10,002 row test with checksum verification: PASS

Technical Details

  • internal/engine/native/postgresql.go:
    • Added udt_name to column query
    • Updated formatDataType() to convert PostgreSQL internal array names (_int4, _text, etc.) to SQL syntax

[5.5.1] - 2026-02-02

Fixed

  • CRITICAL: Native Engine Restore Fixed - Restore now connects to target database correctly

    • Previously connected to source database, causing data to be written to wrong database
    • Now creates engine with target database for proper restore
  • CRITICAL: Native Engine Backup - Sequences Now Exported

    • Fixed: Sequences were silently skipped due to type mismatch in PostgreSQL query
    • Cast information_schema.sequences string values to bigint
    • Sequences now properly created BEFORE tables that reference them
  • CRITICAL: Native Engine COPY Handling

    • Fixed: COPY FROM stdin data blocks now properly parsed and executed
    • Replaced simple line-by-line SQL execution with proper COPY protocol handling
    • Uses pgx CopyFrom for bulk data loading (100k+ rows/sec)
  • Tool Verification Bypass for Native Mode

    • Skip pg_restore/psql check when --native flag is used
    • Enables truly zero-dependency deployment
  • Panic Fix: Slice Bounds Error

    • Fixed runtime panic when logging short SQL statements during errors

Technical Details

  • internal/engine/native/manager.go: Create new engine with target database for restore
  • internal/engine/native/postgresql.go: Fixed Restore() to handle COPY protocol, fixed getSequenceCreateSQL() type casting
  • cmd/restore.go: Skip VerifyTools when cfg.UseNativeEngine is true
  • internal/tui/restore_preview.go: Show "Native engine mode" instead of tool check

[5.5.0] - 2026-02-02

Added

  • 🚀 Native Engine Support for Cluster Backup/Restore

    • NEW: --native flag for cluster backup creates SQL format (.sql.gz) using pure Go
    • NEW: --native flag for cluster restore uses pure Go engine for .sql.gz files
    • Zero external tool dependencies when using native mode
    • Single-binary deployment now possible without pg_dump/pg_restore installed
  • Native Cluster Backup (dbbackup backup cluster --native)

    • Creates .sql.gz files instead of .dump files
    • Uses pgx wire protocol for data export
    • Parallel gzip compression with pgzip
    • Automatic fallback to pg_dump if --fallback-tools is set
  • Native Cluster Restore (dbbackup restore cluster --native --confirm)

    • Restores .sql.gz files using pure Go (pgx CopyFrom)
    • No psql or pg_restore required
    • Automatic detection: uses native for .sql.gz, pg_restore for .dump
    • Fallback support with --fallback-tools

Updated

  • NATIVE_ENGINE_SUMMARY.md - Complete rewrite with accurate documentation
  • Native engine matrix now shows full cluster support with --native flag

Technical Details

  • internal/backup/engine.go: Added native engine path in BackupCluster()
  • internal/restore/engine.go: Added restoreWithNativeEngine() function
  • cmd/backup.go: Added --native and --fallback-tools flags to cluster command
  • cmd/restore.go: Added --native and --fallback-tools flags with PreRunE handlers
  • Version bumped to 5.5.0 (new feature release)

[5.4.6] - 2026-02-02

Fixed

  • CRITICAL: Progress Tracking for Large Database Restores

    • Fixed "no progress" issue where TUI showed 0% for hours during large single-DB restore
    • Root cause: Progress only updated after database completed, not during restore
    • Heartbeat now reports estimated progress every 5 seconds (was 15s, text-only)
    • Time-based progress estimation: ~10MB/s throughput assumption
    • Progress capped at 95% until actual completion (prevents jumping to 100% too early)
  • Improved TUI Feedback During Long Restores

    • Shows spinner + elapsed time when byte-level progress not available
    • Displays "pg_restore in progress (progress updates every 5s)" message
    • Better visual feedback that restore is actively running

Technical Details

  • reportDatabaseProgressByBytes() now called during restore, not just after completion
  • Heartbeat interval reduced from 15s to 5s for more responsive feedback
  • TUI gracefully handles CurrentDBTotal=0 case with activity indicator

[5.4.5] - 2026-02-02

Fixed

  • Accurate Disk Space Estimation for Cluster Archives
    • Fixed WARNING showing 836GB for 119GB archive - was using wrong compression multiplier
    • Cluster archives (.tar.gz) contain pre-compressed .dump files → now uses 1.2x multiplier
    • Single SQL files (.sql.gz) still use 5x multiplier (was 7x, slightly optimized)
    • New CheckSystemMemoryWithType(size, isClusterArchive) method for accurate estimates
    • 119GB cluster archive now correctly estimates ~143GB instead of ~833GB

[5.4.4] - 2026-02-02

Fixed

  • TUI Header Separator Fix - Capped separator length at 40 chars to prevent line overflow on wide terminals

[5.4.3] - 2026-02-02

Fixed

  • Bulletproof SIGINT Handling - Zero zombie processes guaranteed

    • All external commands now use cleanup.SafeCommand() with process group isolation
    • KillCommandGroup() sends signals to entire process group (-pgid)
    • No more orphaned pg_restore/pg_dump/psql/pigz processes on Ctrl+C
    • 16 files updated with proper signal handling
  • Eliminated External gzip Process - The zgrep command was spawning gzip -cdfq

    • Replaced with in-process pgzip decompression in preflight.go
    • estimateBlobsInSQL() now uses pure Go pgzip.NewReader
    • Zero external gzip processes during restore

[5.1.22] - 2026-02-01

Added

  • Restore Metrics for Prometheus/Grafana - Now you can monitor restore performance!

    • dbbackup_restore_total{status="success|failure"} - Total restore count
    • dbbackup_restore_duration_seconds{profile, parallel_jobs} - Restore duration
    • dbbackup_restore_parallel_jobs{profile} - Jobs used (shows if turbo=8 is working!)
    • dbbackup_restore_size_bytes - Restored archive size
    • dbbackup_restore_last_timestamp - Last restore time
  • Grafana Dashboard: Restore Operations Section

    • Total Successful/Failed Restores
    • Parallel Jobs Used (RED if 1=SLOW, GREEN if 8=TURBO)
    • Last Restore Duration with thresholds
    • Restore Duration Over Time graph
    • Parallel Jobs per Restore bar chart
  • Restore Engine Metrics Recording

    • All single database and cluster restores now record metrics
    • Stored in ~/.dbbackup/restore_metrics.json
    • Prometheus exporter reads and exposes these metrics

[5.1.21] - 2026-02-01

Fixed

  • Complete verification of profile system - Full code path analysis confirms TURBO works:
    • CLI: --profile turboconfig.ApplyProfile()cfg.Jobs=8pg_restore --jobs=8
    • TUI: Settings → ApplyResourceProfile()cpu.GetProfileByName("turbo")cfg.Jobs=8
    • Updated help text for restore cluster command to show turbo example
    • Updated flag description to list all profiles: conservative, balanced, turbo, max-performance

[5.1.20] - 2026-02-01

Fixed

  • CRITICAL: "turbo" and "max-performance" profiles were NOT recognized in restore command!
    • profile.go only had: conservative, balanced, aggressive, potato
    • "turbo" profile returned ERROR "unknown profile" and SILENTLY fell back to "balanced"
    • "balanced" profile has Jobs: 0 which became Jobs: 1 after default fallback
    • Result: --profile turbo was IGNORED and restore ran with --jobs=1 (single-threaded)
    • Added turbo profile: Jobs=8, ParallelDBs=2
    • Added max-performance profile: Jobs=8, ParallelDBs=4
    • NOW --profile turbo correctly uses pg_restore --jobs=8

[5.1.19] - 2026-02-01

Fixed

  • CRITICAL: pg_restore --jobs flag was NEVER added when Parallel <= 1 - Root cause finally found and fixed:
    • In BuildRestoreCommand() the condition was if options.Parallel > 1 which meant --jobs flag was NEVER added when Parallel was 1 or less
    • Changed to if options.Parallel > 0 so --jobs is ALWAYS set when Parallel > 0
    • This was THE root cause why restores took 12+ hours instead of ~4 hours
    • Now pg_restore --jobs=8 is correctly generated for turbo profile

[5.1.18] - 2026-02-01

Fixed

  • CRITICAL: Profile Jobs setting now ALWAYS respected - Removed multiple code paths that were overriding user's profile Jobs setting:
    • restoreSection() for phased restores now uses --jobs flag (was missing entirely!)
    • Removed auto-fallback that forced Jobs=1 when PostgreSQL locks couldn't be boosted
    • Removed auto-fallback that forced Jobs=1 on low memory detection
    • User's profile choice (turbo, performance, etc.) is now respected - only warnings are logged
    • This was causing restores to take 9+ hours instead of ~4 hours with turbo profile

[5.1.17] - 2026-02-01

Fixed

  • TUI Settings now persist to disk - Settings changes in TUI are now saved to .dbbackup.conf file, not just in-memory
  • Native Engine is now the default - Pure Go engine (no external tools required) is now the default instead of external tools mode

[5.1.16] - 2026-02-01

Fixed

  • Critical: pg_restore parallel jobs now actually used - Fixed bug where --jobs flag and profile Jobs setting were completely ignored for pg_restore. The code had hardcoded Parallel: 1 instead of using e.cfg.Jobs, causing all restores to run single-threaded regardless of configuration. This fix enables 3-4x faster restores matching native pg_restore -j8 performance.
    • Affected functions: restorePostgreSQLDump(), restorePostgreSQLDumpWithOwnership()
    • Now logs parallel_jobs value for visibility
    • Turbo profile with Jobs: 8 now correctly passes --jobs=8 to pg_restore

[5.1.15] - 2026-01-31

Fixed

  • Fixed go vet warning for Printf directive in shell command output (CI fix)

[5.1.14] - 2026-01-31

Added - Quick Win Features

  • Cross-Region Sync (cloud cross-region-sync)

    • Sync backups between cloud regions for disaster recovery
    • Support for S3, MinIO, Azure Blob, Google Cloud Storage
    • Parallel transfers with configurable concurrency
    • Dry-run mode to preview sync plan
    • Filter by database name or backup age
    • Delete orphaned files with --delete flag
  • Retention Policy Simulator (retention-simulator)

    • Preview retention policy effects without deleting backups
    • Simulate simple age-based and GFS retention strategies
    • Compare multiple retention periods side-by-side (7, 14, 30, 60, 90 days)
    • Calculate space savings and backup counts
    • Analyze backup frequency and provide recommendations
  • Catalog Dashboard (catalog dashboard)

    • Interactive TUI for browsing backup catalog
    • Sort by date, size, database, or type
    • Filter backups with search
    • Detailed view with backup metadata
    • Keyboard navigation (vim-style keys supported)
  • Parallel Restore Analysis (parallel-restore)

    • Analyze system for optimal parallel restore settings
    • Benchmark disk I/O performance
    • Simulate restore with different parallelism levels
    • Provide recommendations based on CPU and memory
  • Progress Webhooks (progress-webhooks)

    • Configure webhook notifications for backup/restore progress
    • Periodic progress updates during long operations
    • Test mode to verify webhook connectivity
    • Environment variable configuration (DBBACKUP_WEBHOOK_URL)
  • Encryption Key Rotation (encryption rotate)

    • Generate new encryption keys (128, 192, 256-bit)
    • Save keys to file with secure permissions (0600)
    • Support for base64 and hex output formats

Changed

  • Updated version to 5.1.14
  • Removed development files from repository (.dbbackup.conf, TODO_SESSION.md, test-backups/)

[5.1.0] - 2026-01-30

Fixed

  • CRITICAL: Fixed PostgreSQL native engine connection pooling issues that caused "conn busy" errors
  • CRITICAL: Fixed PostgreSQL table data export - now properly captures all table schemas and data using COPY protocol
  • CRITICAL: Fixed PostgreSQL native engine to use connection pool for all metadata queries (getTables, getViews, getSequences, getFunctions)
  • Fixed gzip compression implementation in native backup CLI integration
  • Fixed exitcode package syntax errors causing CI failures

Added

  • Enhanced PostgreSQL native engine with proper connection pool management
  • Complete table data export using COPY TO STDOUT protocol
  • Comprehensive testing with complex data types (JSONB, arrays, foreign keys)
  • Production-ready native engine performance and stability

Changed

  • All PostgreSQL metadata queries now use connection pooling instead of shared connection
  • Improved error handling and debugging output for native engines
  • Enhanced backup file structure with proper SQL headers and footers

[5.0.1] - 2026-01-30

Fixed - Quality Improvements

  • PostgreSQL COPY Format: Fixed format mismatch - now uses native TEXT format compatible with COPY FROM stdin
  • MySQL Restore Security: Fixed potential SQL injection in restore by properly escaping backticks in database names
  • MySQL 8.0.22+ Compatibility: Added fallback for SHOW BINARY LOG STATUS (MySQL 8.0.22+) with graceful fallback to SHOW MASTER STATUS for older versions
  • Duration Calculation: Fixed backup duration tracking to accurately capture elapsed time

[5.0.0] - 2026-01-30

MAJOR RELEASE - Native Engine Implementation

BREAKTHROUGH: We Built Our Own Database Engines

This is a really big step. We're no longer calling external tools - we built our own machines.

dbbackup v5.0.0 represents a fundamental architectural revolution. We've eliminated ALL external tool dependencies by implementing pure Go database engines that speak directly to PostgreSQL and MySQL using their native wire protocols. No more pg_dump. No more mysqldump. No more shelling out. Our code, our engines, our control.

Added - Native Database Engines

  • Native PostgreSQL Engine (internal/engine/native/postgresql.go)

    • Pure Go implementation using pgx/v5 driver
    • Direct PostgreSQL wire protocol communication
    • Native SQL generation and COPY data export
    • Advanced data type handling (arrays, JSON, binary, timestamps)
    • Proper SQL escaping and PostgreSQL-specific formatting
  • Native MySQL Engine (internal/engine/native/mysql.go)

    • Pure Go implementation using go-sql-driver/mysql
    • Direct MySQL protocol communication
    • Batch INSERT generation with advanced data types
    • Binary data support with hex encoding
    • MySQL-specific escape sequences and formatting
  • Advanced Engine Framework (internal/engine/native/advanced.go)

    • Extensible architecture for multiple backup formats
    • Compression support (Gzip, Zstd, LZ4)
    • Configurable batch processing (1K-10K rows per batch)
    • Performance optimization settings
    • Future-ready for custom formats and parallel processing
  • Engine Manager (internal/engine/native/manager.go)

    • Pluggable architecture for engine selection
    • Configuration-based engine initialization
    • Unified backup orchestration across all engines
    • Automatic fallback mechanisms
  • Restore Framework (internal/engine/native/restore.go)

    • Native restore engine architecture (basic implementation)
    • Transaction control and error handling
    • Progress tracking and status reporting
    • Foundation for complete restore implementation

Added - CLI Integration

  • New Command Line Flags
    • --native: Use pure Go native engines (no external tools)
    • --fallback-tools: Fallback to external tools if native engine fails
    • --native-debug: Enable detailed native engine debugging

Added - Advanced Features

  • Production-Ready Data Handling

    • Proper handling of complex PostgreSQL types (arrays, JSON, custom types)
    • Advanced MySQL binary data encoding and type detection
    • NULL value handling across all data types
    • Timestamp formatting with microsecond precision
    • Memory-efficient streaming for large datasets
  • Performance Optimizations

    • Configurable batch processing for optimal throughput
    • I/O streaming with buffered writers
    • Connection pooling integration
    • Memory usage optimization for large tables

Changed - Core Architecture

  • Zero External Dependencies: No longer requires pg_dump, mysqldump, pg_restore, mysql, psql, or mysqlbinlog
  • Native Protocol Communication: Direct database protocol usage instead of shelling out to external tools
  • Pure Go Implementation: All backup and restore operations now implemented in Go
  • Backward Compatibility: All existing configurations and workflows continue to work

Technical Impact

  • Build Size: Reduced dependencies and smaller binaries
  • Performance: Eliminated process spawning overhead and improved data streaming
  • Reliability: Removed external tool version compatibility issues
  • Maintenance: Simplified deployment with single binary distribution
  • Security: Eliminated attack vectors from external tool dependencies

Migration Guide

Existing users can continue using dbbackup exactly as before - all existing configurations work unchanged. The new native engines are opt-in via the --native flag.

Recommended: Test native engines with --native --native-debug flags, then switch to native-only operation for improved performance and reliability.


[4.2.9] - 2026-01-30

Added - MEDIUM Priority Features

  • #11: Enhanced Error Diagnostics with System Context (MEDIUM priority)
    • Automatic environmental context collection on errors
    • Real-time system diagnostics: disk space, memory, file descriptors
    • PostgreSQL diagnostics: connections, locks, shared memory, version
    • Smart root cause analysis based on error + environment
    • Context-specific recommendations (e.g., "Disk 95% full" → cleanup commands)
    • Comprehensive diagnostics report with actionable fixes
    • Problem: Errors showed symptoms but not environmental causes
    • Solution: Diagnose system state + error pattern → root cause + fix

Diagnostic Report Includes:

  • Disk space usage and available capacity
  • Memory usage and pressure indicators
  • File descriptor utilization (Linux/Unix)
  • PostgreSQL connection pool status
  • Lock table capacity calculations
  • Version compatibility checks
  • Contextual recommendations based on actual system state

Example Diagnostics:

═══════════════════════════════════════════════════════════
  DBBACKUP ERROR DIAGNOSTICS REPORT
═══════════════════════════════════════════════════════════

Error Type: CRITICAL
Category:   locks
Severity:   2/3

Message:
  out of shared memory: max_locks_per_transaction exceeded

Root Cause:
  Lock table capacity too low (32,000 total locks). Likely cause: 
  max_locks_per_transaction (128) too low for this database size

System Context:
  Disk Space:  45.3 GB / 100.0 GB (45.3% used)
  Memory:      3.2 GB / 8.0 GB (40.0% used)
  File Descriptors: 234 / 4096

Database Context:
  Version:     PostgreSQL 14.10
  Connections: 15 / 100
  Max Locks:   128 per transaction
  Total Lock Capacity: ~12,800

Recommendations:
  Current lock capacity: 12,800 locks (max_locks_per_transaction × max_connections)
  WARNING: max_locks_per_transaction is low (128)
  • Increase: ALTER SYSTEM SET max_locks_per_transaction = 4096;
  • Then restart PostgreSQL: sudo systemctl restart postgresql

Suggested Action:
  Fix: ALTER SYSTEM SET max_locks_per_transaction = 4096; then 
  RESTART PostgreSQL

Functions:

  • GatherErrorContext() - Collects system + database metrics
  • DiagnoseError() - Full error analysis with environmental context
  • FormatDiagnosticsReport() - Human-readable report generation
  • generateContextualRecommendations() - Smart recommendations based on state
  • analyzeRootCause() - Pattern matching for root cause identification

Integration:

  • Available for all backup/restore operations
  • Automatic context collection on critical errors
  • Can be manually triggered for troubleshooting
  • Export as JSON for automated monitoring

[4.2.8] - 2026-01-30

Added - MEDIUM Priority Features

  • #10: WAL Archive Statistics (MEDIUM priority)
    • dbbackup pitr status now shows comprehensive WAL archive statistics
    • Displays: total files, total size, compression rate, oldest/newest WAL, time span
    • Auto-detects archive directory from PostgreSQL archive_command
    • Supports compressed (.gz, .zst, .lz4) and encrypted (.enc) WAL files
    • Problem: No visibility into WAL archive health and growth
    • Solution: Real-time stats in PITR status command, helps identify retention issues

Example Output:

WAL Archive Statistics:
======================================================
  Total Files:      1,234
  Total Size:       19.8 GB
  Average Size:     16.4 MB
  Compressed:       1,234 files (68.5% saved)
  Encrypted:        1,234 files

  Oldest WAL:       000000010000000000000042
    Created:        2026-01-15 08:30:00
  Newest WAL:       000000010000000000004D2F
    Created:        2026-01-30 17:45:30
  Time Span:        15.4 days

Files Modified:

  • internal/wal/archiver.go: Extended ArchiveStats struct with detailed fields
  • internal/wal/archiver.go: Added GetArchiveStats(), FormatArchiveStats() functions
  • cmd/pitr.go: Integrated stats into pitr status command
  • cmd/pitr.go: Added extractArchiveDirFromCommand() helper

[4.2.7] - 2026-01-30

Added - HIGH Priority Features

  • #9: Auto Backup Verification (HIGH priority)
    • Automatic integrity verification after every backup (default: ON)
    • Single DB backups: Full SHA-256 checksum verification
    • Cluster backups: Quick tar.gz structure validation (header scan)
    • Prevents corrupted backups from being stored undetected
    • Can disable with --no-verify flag or VERIFY_AFTER_BACKUP=false
    • Performance overhead: +5-10% for single DB, +1-2% for cluster
    • Problem: Backups not verified until restore time (too late to fix)
    • Solution: Immediate feedback on backup integrity, fail-fast on corruption

Fixed - Performance & Reliability

  • #5: TUI Memory Leak in Long Operations (HIGH priority)
    • Throttled progress speed samples to max 10 updates/second (100ms intervals)
    • Fixed memory bloat during large cluster restores (100+ databases)
    • Reduced memory usage by ~90% in long-running operations
    • No visual degradation (10 FPS is smooth enough for progress display)
    • Applied to: internal/tui/restore_exec.go, internal/tui/detailed_progress.go
    • Problem: Progress callbacks fired on every 4KB buffer read = millions of allocations
    • Solution: Throttle sample collection to prevent unbounded array growth

[4.2.5] - 2026-01-30

[4.2.6] - 2026-01-30

Security - Critical Fixes

  • SEC#1: Password exposure in process list

    • Removed --password CLI flag to prevent passwords appearing in ps aux
    • Use environment variables (PGPASSWORD, MYSQL_PWD) or config file instead
    • Enhanced security for multi-user systems and shared environments
  • SEC#2: World-readable backup files

    • All backup files now created with 0600 permissions (owner-only read/write)
    • Prevents unauthorized users from reading sensitive database dumps
    • Affects: internal/backup/engine.go, incremental_mysql.go, incremental_tar.go
    • Critical for GDPR, HIPAA, and PCI-DSS compliance
  • #4: Directory race condition in parallel backups

    • Replaced os.MkdirAll() with fs.SecureMkdirAll() that handles EEXIST gracefully
    • Prevents "file exists" errors when multiple backup processes create directories
    • Affects: All backup directory creation paths

Added

  • internal/fs/secure.go: New secure file operations utilities

    • SecureMkdirAll(): Race-condition-safe directory creation
    • SecureCreate(): File creation with 0600 permissions
    • SecureMkdirTemp(): Temporary directories with 0700 permissions
    • CheckWriteAccess(): Proactive detection of read-only filesystems
  • internal/exitcode/codes.go: BSD-style exit codes for automation

    • Standard exit codes for scripting and monitoring systems
    • Improves integration with systemd, cron, and orchestration tools

Fixed

  • Fixed multiple file creation calls using insecure 0644 permissions
  • Fixed race conditions in backup directory creation during parallel operations
  • Improved security posture for multi-user and shared environments

Fixed - TUI Cluster Restore Double-Extraction

  • TUI cluster restore performance optimization
    • Eliminated double-extraction: cluster archives were scanned twice (once for DB list, once for restore)
    • internal/restore/extract.go: Added ListDatabasesFromExtractedDir() to list databases from disk instead of tar scan
    • internal/tui/cluster_db_selector.go: Now pre-extracts cluster once, lists from extracted directory
    • internal/tui/archive_browser.go: Added ExtractedDir field to ArchiveInfo for passing pre-extracted path
    • internal/tui/restore_exec.go: Reuses pre-extracted directory when available
    • Performance improvement: 50GB cluster archive now processes once instead of twice (saves 5-15 minutes)
    • Automatic cleanup of extracted directory after restore completes or fails

[4.2.4] - 2026-01-30

Fixed - Comprehensive Ctrl+C Support Across All Operations

  • System-wide context-aware file operations

    • All long-running I/O operations now respond to Ctrl+C
    • Added CopyWithContext() to cloud package for S3/Azure/GCS transfers
    • Partial files are cleaned up on cancellation
  • Fixed components:

    • internal/restore/extract.go: Single DB extraction from cluster
    • internal/wal/compression.go: WAL file compression/decompression
    • internal/restore/engine.go: SQL restore streaming (2 paths)
    • internal/backup/engine.go: pg_dump/mysqldump streaming (3 paths)
    • internal/cloud/s3.go: S3 download interruption
    • internal/cloud/azure.go: Azure Blob download interruption
    • internal/cloud/gcs.go: GCS upload/download interruption
    • internal/drill/engine.go: DR drill decompression

[4.2.3] - 2026-01-30

Fixed - Cluster Restore Performance & Ctrl+C Handling

  • Removed redundant gzip validation in cluster restore

    • ValidateAndExtractCluster() no longer calls ValidateArchive() internally
    • Previously validation happened 2x before extraction (caller + internal)
    • Eliminates duplicate gzip header reads on large archives
    • Reduces cluster restore startup time
  • Fixed Ctrl+C not working during extraction

    • Added CopyWithContext() function for context-aware file copying
    • Extraction now checks for cancellation every 1MB of data
    • Ctrl+C immediately interrupts large file extractions
    • Partial files are cleaned up on cancellation
    • Applies to both ExtractTarGzParallel and extractArchiveWithProgress

[4.2.2] - 2026-01-30

Fixed - Complete pgzip Migration (Backup Side)

  • Removed ALL external gzip/pigz calls from backup engine

    • internal/backup/engine.go: executeWithStreamingCompression now uses pgzip
    • internal/parallel/engine.go: Fixed stub gzipWriter to use pgzip
    • No more gzip/pigz processes visible in htop during backup
    • Uses klauspost/pgzip for parallel multi-core compression
  • Complete pgzip migration status:

    • Backup: All compression uses in-process pgzip
    • Restore: All decompression uses in-process pgzip
    • Drill: Decompress on host with pgzip before Docker copy
    • WARNING: PITR only: PostgreSQL's restore_command must remain shell (PostgreSQL limitation)

[4.2.1] - 2026-01-30

Fixed - Complete pgzip Migration

  • Removed ALL external gunzip/gzip calls - Systematic audit and fix

    • internal/restore/engine.go: SQL restores now use pgzip stream → psql/mysql stdin
    • internal/drill/engine.go: Decompress on host with pgzip before Docker copy
    • No more gzip/gunzip/pigz processes visible in htop during restore
    • Uses klauspost/pgzip for parallel multi-core decompression
  • PostgreSQL PITR exception - restore_command in recovery config must remain shell

    • PostgreSQL itself runs this command to fetch WAL files
    • Cannot be replaced with Go code (PostgreSQL limitation)

[4.2.0] - 2026-01-30

Added - Quick Wins Release

  • dbbackup health command - Comprehensive backup infrastructure health check

    • 10 automated health checks: config, DB connectivity, backup dir, catalog, freshness, gaps, verification, file integrity, orphans, disk space
    • Exit codes for automation: 0=healthy, 1=warning, 2=critical
    • JSON output for monitoring integration (Prometheus, Nagios, etc.)
    • Auto-generates actionable recommendations
    • Custom backup interval for gap detection: --interval 12h
    • Skip database check for offline mode: --skip-db
    • Example: dbbackup health --format json
  • TUI System Health Check - Interactive health monitoring

    • Accessible via Tools → System Health Check
    • Runs all 10 checks asynchronously with progress spinner
    • Color-coded results: green=healthy, yellow=warning, red=critical
    • Displays recommendations for any issues found
  • dbbackup restore preview command - Pre-restore analysis and validation

    • Shows backup format, compression type, database type
    • Estimates uncompressed size (3x compression ratio)
    • Calculates RTO (Recovery Time Objective) based on active profile
    • Validates backup integrity without actual restore
    • Displays resource requirements (RAM, CPU, disk space)
    • Example: dbbackup restore preview backup.dump.gz
  • dbbackup diff command - Compare two backups and track changes

    • Flexible input: file paths, catalog IDs, or database:latest/previous
    • Shows size delta with percentage change
    • Calculates database growth rate (GB/day)
    • Projects time to reach 10GB threshold
    • Compares backup duration and compression efficiency
    • JSON output for automation and reporting
    • Example: dbbackup diff mydb:latest mydb:previous
  • dbbackup cost analyze command - Cloud storage cost optimization

    • Analyzes 15 storage tiers across 5 cloud providers
    • AWS S3: Standard, IA, Glacier Instant/Flexible, Deep Archive
    • Google Cloud Storage: Standard, Nearline, Coldline, Archive
    • Azure Blob Storage: Hot, Cool, Archive
    • Backblaze B2 and Wasabi alternatives
    • Monthly/annual cost projections
    • Savings calculations vs S3 Standard baseline
    • Tiered lifecycle strategy recommendations
    • Shows potential savings of 90%+ with proper policies
    • Example: dbbackup cost analyze --database mydb

Enhanced

  • TUI restore preview - Added RTO estimates and size calculations
    • Shows estimated uncompressed size during restore confirmation
    • Displays estimated restore time based on current profile
    • Helps users make informed restore decisions
    • Keeps TUI simple (essentials only), detailed analysis in CLI

Documentation

  • Updated README.md with new commands and examples
  • Created QUICK_WINS.md documenting the rapid development sprint
  • Added backup diff and cost analysis sections

[4.1.4] - 2026-01-29

Added

  • New turbo restore profile - Maximum restore speed, matches native pg_restore -j8

    • ClusterParallelism = 2 (restore 2 DBs concurrently)
    • Jobs = 8 (8 parallel pg_restore jobs)
    • BufferedIO = true (32KB write buffers for faster extraction)
    • Works on 16GB+ RAM, 4+ cores
    • Usage: dbbackup restore cluster backup.tar.gz --profile=turbo --confirm
  • Restore startup performance logging - Shows actual parallelism settings at restore start

    • Logs profile name, cluster_parallelism, pg_restore_jobs, buffered_io
    • Helps verify settings before long restore operations
  • Buffered I/O optimization - 32KB write buffers during tar extraction (turbo profile)

    • Reduces system call overhead
    • Improves I/O throughput for large archives

Fixed

  • TUI now respects saved profile settings - Previously TUI forced conservative profile on every launch, ignoring user's saved configuration. Now properly loads and respects saved settings.

Changed

  • TUI default profile changed from forced conservative to balanced (only when no profile configured)
  • LargeDBMode no longer forced on TUI startup - user controls it via settings

[4.1.3] - 2026-01-27

Added

  • --config / -c global flag - Specify config file path from anywhere
    • Example: dbbackup --config /opt/dbbackup/.dbbackup.conf backup single mydb
    • No longer need to cd to config directory before running commands
    • Works with all subcommands (backup, restore, verify, etc.)

[4.1.2] - 2026-01-27

Added

  • --socket flag for MySQL/MariaDB - Connect via Unix socket instead of TCP/IP
    • Usage: dbbackup backup single mydb --db-type mysql --socket /var/run/mysqld/mysqld.sock
    • Works for both backup and restore operations
    • Supports socket auth (no password required with proper permissions)

Fixed

  • Socket path as --host now works - If --host starts with /, it's auto-detected as a socket path
    • Example: --host /var/run/mysqld/mysqld.sock now works correctly instead of DNS lookup error
    • Auto-converts to --socket internally

[4.1.1] - 2026-01-25

Added

  • dbbackup_build_info metric - Exposes version and git commit as Prometheus labels
    • Useful for tracking deployed versions across a fleet
    • Labels: server, version, commit

Fixed

  • Documentation clarification: The pitr_base value for backup_type label is auto-assigned by dbbackup pitr base command. CLI --backup-type flag only accepts full or incremental. This was causing confusion in deployments.

[4.1.0] - 2026-01-25

Added

  • Backup Type Tracking: All backup metrics now include a backup_type label (full, incremental, or pitr_base for PITR base backups)
  • PITR Metrics: Complete Point-in-Time Recovery monitoring
    • dbbackup_pitr_enabled - Whether PITR is enabled (1/0)
    • dbbackup_pitr_archive_lag_seconds - Seconds since last WAL/binlog archived
    • dbbackup_pitr_chain_valid - WAL/binlog chain integrity (1=valid)
    • dbbackup_pitr_gap_count - Number of gaps in archive chain
    • dbbackup_pitr_archive_count - Total archived segments
    • dbbackup_pitr_archive_size_bytes - Total archive storage
    • dbbackup_pitr_recovery_window_minutes - Estimated PITR coverage
  • PITR Alerting Rules: 6 new alerts for PITR monitoring
    • PITRArchiveLag, PITRChainBroken, PITRGapsDetected, PITRArchiveStalled, PITRStorageGrowing, PITRDisabledUnexpectedly
  • dbbackup_backup_by_type metric - Count backups by type

Changed

  • dbbackup_backup_total type changed from counter to gauge for snapshot-based collection

[3.42.110] - 2026-01-24

Improved - Code Quality & Testing

  • Cleaned up 40+ unused code items found by staticcheck:

    • Removed unused functions, variables, struct fields, and type aliases
    • Fixed SA4006 warning (unused value assignment in restore engine)
    • All packages now pass staticcheck with zero warnings
  • Added golangci-lint integration to Makefile:

    • New make golangci-lint target with auto-install
    • Updated lint target to include golangci-lint
    • Updated install-tools to install golangci-lint
  • New unit tests for improved coverage:

    • internal/config/config_test.go - Tests for config initialization, database types, env helpers
    • internal/security/security_test.go - Tests for checksums, path validation, rate limiting, audit logging

[3.42.109] - 2026-01-24

Added - Grafana Dashboard & Monitoring Improvements

  • Enhanced Grafana dashboard with comprehensive improvements:

    • Added dashboard description for better discoverability
    • New collapsible "Backup Overview" row for organization
    • New Verification Status panel showing last backup verification state
    • Added descriptions to all 17 panels for better understanding
    • Enabled shared crosshair (graphTooltip=1) for correlated analysis
    • Added "monitoring" tag for dashboard discovery
  • New Prometheus alerting rules (grafana/alerting-rules.yaml):

    • DBBackupRPOCritical - No backup in 24+ hours (critical)
    • DBBackupRPOWarning - No backup in 12+ hours (warning)
    • DBBackupFailure - Backup failures detected
    • DBBackupNotVerified - Backup not verified in 24h
    • DBBackupDedupRatioLow - Dedup ratio below 10%
    • DBBackupDedupDiskGrowth - Rapid storage growth prediction
    • DBBackupExporterDown - Metrics exporter not responding
    • DBBackupMetricsStale - Metrics not updated in 10+ minutes
    • DBBackupNeverSucceeded - Database never backed up successfully

Changed

  • Grafana dashboard layout fixes:

    • Fixed overlapping dedup panels (y: 31/36 → 22/27/32)
    • Adjusted top row panel widths for better balance (5+5+5+4+5=24)
  • Added Makefile for streamlined development workflow:

    • make build - optimized binary with ldflags
    • make test, make race, make cover - testing targets
    • make lint - runs vet + staticcheck
    • make all-platforms - cross-platform builds

Fixed

  • Removed deprecated netErr.Temporary() call in cloud retry logic (Go 1.18+)
  • Fixed staticcheck warnings for redundant fmt.Sprintf calls
  • Logger optimizations: buffer pooling, early level check, pre-allocated maps
  • Clone engine now validates disk space before operations

[3.42.108] - 2026-01-24

Added - TUI Tools Expansion

  • Table Sizes - view top 100 tables sorted by size with row counts, data/index breakdown

    • Supports PostgreSQL (pg_stat_user_tables) and MySQL (information_schema.TABLES)
    • Shows total/data/index sizes, row counts, schema prefix for non-public schemas
  • Kill Connections - manage active database connections

    • List all active connections with PID, user, database, state, query preview, duration
    • Kill single connection or all connections to a specific database
    • Useful before restore operations to clear blocking sessions
    • Supports PostgreSQL (pg_terminate_backend) and MySQL (KILL)
  • Drop Database - safely drop databases with double confirmation

    • Lists user databases (system DBs hidden: postgres, template0/1, mysql, sys, etc.)
    • Requires two confirmations: y/n then type full database name
    • Auto-terminates connections before drop
    • Supports PostgreSQL and MySQL

[3.42.107] - 2026-01-24

Added - Tools Menu & Blob Statistics

  • New "Tools" submenu in TUI - centralized access to utility functions

    • Blob Statistics - scan database for bytea/blob columns with size analysis
    • Blob Extract - externalize large objects (coming soon)
    • Dedup Store Analyze - storage savings analysis (coming soon)
    • Verify Backup Integrity - backup verification
    • Catalog Sync - synchronize local catalog (coming soon)
  • New dbbackup blob stats CLI command - analyze blob/bytea columns

    • Scans information_schema for binary column types
    • Shows row counts, total size, average size, max size per column
    • Identifies tables storing large binary data for optimization
    • Supports both PostgreSQL (bytea, oid) and MySQL (blob, mediumblob, longblob)
    • Provides recommendations for databases with >100MB blob data

[3.42.106] - 2026-01-24

Fixed - Cluster Restore Resilience & Performance

  • Fixed cluster restore failing on missing roles - harmless "role does not exist" errors no longer abort restore

    • Added role-related errors to isIgnorableError() with warning log
    • Removed ON_ERROR_STOP=1 from psql commands (pre-validation catches real corruption)
    • Restore now continues gracefully when referenced roles don't exist in target cluster
    • Previously caused 12h+ restores to fail at 94% completion
  • Fixed TUI output scrambling in screen/tmux sessions - added terminal detection

    • Uses go-isatty to detect non-interactive terminals (backgrounded screen sessions, pipes)
    • Added viewSimple() methods for clean line-by-line output without ANSI escape codes
    • TUI menu now shows warning when running in non-interactive terminal

Changed - Consistent Parallel Compression (pgzip)

  • Migrated all gzip operations to parallel pgzip - 2-4x faster compression/decompression on multi-core systems
    • Systematic audit found 17 files using standard compress/gzip
    • All converted to github.com/klauspost/pgzip for consistent performance
    • Files updated:
      • internal/backup/: incremental_tar.go, incremental_extract.go, incremental_mysql.go
      • internal/wal/: compression.go (CompressWALFile, DecompressWALFile, VerifyCompressedFile)
      • internal/engine/: clone.go, snapshot_engine.go, mysqldump.go, binlog/file_target.go
      • internal/restore/: engine.go, safety.go, formats.go, error_report.go
      • internal/pitr/: mysql.go, binlog.go
      • internal/dedup/: store.go
      • cmd/: dedup.go, placeholder.go
    • Benefit: Large backup/restore operations now fully utilize available CPU cores

[3.42.105] - 2026-01-23

Changed - TUI Visual Cleanup

  • Removed ASCII box characters from backup/restore success/failure banners
    • Replaced ╔═╗║╚╝ boxes with clean ═══ horizontal line separators
    • Cleaner, more modern appearance in terminal output
  • Consolidated duplicate styles in TUI components
    • Unified check status styles (passed/failed/warning/pending) into global definitions
    • Reduces code duplication across restore preview and diagnose views

[3.42.98] - 2025-01-23

Fixed - Critical Bug Fixes for v3.42.97

  • Fixed CGO/SQLite build issue - binaries now work when compiled with CGO_ENABLED=0

    • Switched from github.com/mattn/go-sqlite3 (requires CGO) to modernc.org/sqlite (pure Go)
    • All cross-compiled binaries now work correctly on all platforms
    • No more "Binary was compiled with 'CGO_ENABLED=0', go-sqlite3 requires cgo to work" errors
  • Fixed MySQL positional database argument being ignored

    • dbbackup backup single <dbname> --db-type mysql now correctly uses <dbname>
    • Previously defaulted to 'postgres' regardless of positional argument
    • Also fixed in backup sample command

[3.42.97] - 2025-01-23

Added - Bandwidth Throttling for Cloud Uploads

  • New --bandwidth-limit flag for cloud operations - prevent network saturation during business hours
    • Works with S3, GCS, Azure Blob Storage, MinIO, Backblaze B2
    • Supports human-readable formats:
      • 10MB/s, 50MiB/s - megabytes per second
      • 100KB/s, 500KiB/s - kilobytes per second
      • 1GB/s - gigabytes per second
      • 100Mbps - megabits per second (for network-minded users)
      • unlimited or 0 - no limit (default)
    • Environment variable: DBBACKUP_BANDWIDTH_LIMIT
    • Example usage:
      # Limit upload to 10 MB/s during business hours
      dbbackup cloud upload backup.dump --bandwidth-limit 10MB/s
      
      # Environment variable for all operations
      export DBBACKUP_BANDWIDTH_LIMIT=50MiB/s
      
    • Implementation: Token-bucket style throttling with 100ms windows for smooth rate limiting
    • DBA requested feature: Avoid saturating production network during scheduled backups

[3.42.96] - 2025-02-01

Changed - Complete Elimination of Shell tar/gzip Dependencies

  • All tar/gzip operations now 100% in-process - ZERO shell dependencies for backup/restore
    • Removed ALL remaining exec.Command("tar", ...) calls
    • Removed ALL remaining exec.Command("gzip", ...) calls
    • Systematic code audit found and eliminated:
      • diagnose.go: Replaced tar -tzf test with direct file open check
      • large_restore_check.go: Replaced gzip -t and gzip -l with in-process pgzip verification
      • pitr/restore.go: Replaced tar -xf with in-process tar extraction
    • Benefits:
      • No external tool dependencies (works in minimal containers)
      • 2-4x faster on multi-core systems using parallel pgzip
      • More reliable error handling with Go-native errors
      • Consistent behavior across all platforms
      • Reduced attack surface (no shell spawning)
    • Verification: strace and ps aux show no tar/gzip/gunzip processes during backup/restore
    • Note: Docker drill container commands still use gunzip for in-container operations (intentional)

[Unreleased]

Added - Single Database Extraction from Cluster Backups (CLI + TUI)

  • Extract and restore individual databases from cluster backups - selective restore without full cluster restoration
    • CLI Commands:
      • List databases: dbbackup restore cluster backup.tar.gz --list-databases
        • Shows all databases in cluster backup with sizes
        • Fast scan without full extraction
      • Extract single database: dbbackup restore cluster backup.tar.gz --database myapp --output-dir /tmp/extract
        • Extracts only the specified database dump
        • No restore, just file extraction
      • Restore single database from cluster: dbbackup restore cluster backup.tar.gz --database myapp --confirm
        • Extracts and restores only one database
        • Much faster than full cluster restore when you only need one database
      • Rename on restore: dbbackup restore cluster backup.tar.gz --database myapp --target myapp_test --confirm
        • Restore with different database name (useful for testing)
      • Extract multiple databases: dbbackup restore cluster backup.tar.gz --databases "app1,app2,app3" --output-dir /tmp/extract
        • Comma-separated list of databases to extract
    • TUI Support:
      • Press 's' on any cluster backup in archive browser to select individual databases
      • New ClusterDatabaseSelector view shows all databases with sizes
      • Navigate with arrow keys, select with Enter
      • Automatic handling when cluster backup selected in single restore mode
      • Full restore preview and confirmation workflow
    • Benefits:
      • Faster restores (extract only what you need)
      • Less disk space usage during restore
      • Easy database migration/copying
      • Better testing workflow
      • Selective disaster recovery

Performance - Cluster Restore Optimization

  • Eliminated duplicate archive extraction in cluster restore - saves 30-50% time on large restores
    • Previously: Archive was extracted twice (once in preflight validation, once in actual restore)
    • Now: Archive extracted once and reused for both validation and restore
    • Time savings:
      • 50 GB cluster: ~3-6 minutes faster
      • 10 GB cluster: ~1-2 minutes faster
      • Small clusters (<5 GB): ~30 seconds faster
    • Optimization automatically enabled when --diagnose flag is used
    • New ValidateAndExtractCluster() performs combined validation + extraction
    • RestoreCluster() accepts optional preExtractedPath parameter to reuse extracted directory
    • Disk space checks intelligently skipped when using pre-extracted directory
    • Maintains backward compatibility - works with and without pre-extraction
    • Log output shows optimization: "Using pre-extracted cluster directory ... optimization: skipping duplicate extraction"

Improved - Archive Validation

  • Enhanced tar.gz validation with stream-based checks
    • Fast header-only validation (validates gzip + tar structure without full extraction)
    • Checks gzip magic bytes (0x1f 0x8b) and tar header signature
    • Reduces preflight validation time from minutes to seconds on large archives
    • Falls back to full extraction only when necessary (with --diagnose)

Added - PostgreSQL lock verification (CLI + preflight)

  • dbbackup verify-locks — new CLI command that probes PostgreSQL GUCs (max_locks_per_transaction, max_connections, max_prepared_transactions) and prints total lock capacity plus actionable restore guidance.
  • Integrated into preflight checks — preflight now warns/fails when lock settings are insufficient and provides exact remediation commands and recommended restore flags (e.g. --jobs 1 --parallel-dbs 1).
  • Implemented in Go (replaces verify_postgres_locks.sh) with robust parsing, sudo/psql fallback and unit-tested decision logic.
  • Files: cmd/verify_locks.go, internal/checks/locks.go, internal/checks/locks_test.go, internal/checks/preflight.go.
  • Why: Prevents repeated parallel-restore failures by surfacing lock-capacity issues early and providing bulletproof guidance.

[3.42.74] - 2026-01-20 "Resource Profile System + Critical Ctrl+C Fix"

Critical Bug Fix

  • Fixed Ctrl+C not working in TUI backup/restore - Context cancellation was broken in TUI mode
    • executeBackupWithTUIProgress() and executeRestoreWithTUIProgress() created new contexts with WithCancel(parentCtx)
    • When user pressed Ctrl+C, model.cancel() was called on parent context but execution had separate context
    • Fixed by using parent context directly instead of creating new one
    • Ctrl+C/ESC/q now properly propagate cancellation to running operations
    • Users can now interrupt long-running TUI operations

Added - Resource Profile System

  • --profile flag for restore operations with three presets:
    • Conservative (--profile=conservative): Single-threaded (--parallel=1), minimal memory usage
      • Best for resource-constrained servers, shared hosting, or when "out of shared memory" errors occur
      • Automatically enables LargeDBMode for better resource management
    • Balanced (default): Auto-detect resources, moderate parallelism
      • Good default for most scenarios
    • Aggressive (--profile=aggressive): Maximum parallelism, all available resources
      • Best for dedicated database servers with ample resources
    • Potato (--profile=potato): Easter egg, same as conservative
  • Profile system applies to both CLI and TUI:
    • CLI: dbbackup restore cluster backup.tar.gz --profile=conservative --confirm
    • TUI: Automatically uses conservative profile for safer interactive operation
  • User overrides supported: --jobs and --parallel-dbs flags override profile settings
  • New internal/config/profile.go module:
    • GetRestoreProfile(name) - Returns profile settings
    • ApplyProfile(cfg, profile, jobs, parallelDBs) - Applies profile with overrides
    • GetProfileDescription(name) - Human-readable descriptions
    • ListProfiles() - All available profiles

Added - PostgreSQL Diagnostic Tools

  • diagnose_postgres_memory.sh - Comprehensive memory and resource analysis script:
    • System memory overview with usage percentages and warnings
    • Top 15 memory consuming processes
    • PostgreSQL-specific memory configuration analysis
    • Current locks and connections monitoring
    • Shared memory segments inspection
    • Disk space and swap usage checks
    • Identifies other resource consumers (Nessus, Elastic Agent, monitoring tools)
    • Smart recommendations based on findings
    • Detects temp file usage (indicator of low work_mem)
  • fix_postgres_locks.sh - PostgreSQL lock configuration helper:
    • Automatically increases max_locks_per_transaction to 4096
    • Shows current configuration before applying changes
    • Calculates total lock capacity
    • Provides restart commands for different PostgreSQL setups
    • References diagnostic tool for comprehensive analysis

Added - Documentation

  • RESTORE_PROFILES.md - Complete profile guide with real-world scenarios:
    • Profile comparison table
    • When to use each profile
    • Override examples
    • Troubleshooting guide for "out of shared memory" errors
    • Integration with diagnostic tools
  • email_infra_team.txt - Admin communication template (German):
    • Analysis results template
    • Problem identification section
    • Three solution variants (temporary, permanent, workaround)
    • Includes diagnostic tool references

Changed - TUI Improvements

  • TUI mode defaults to conservative profile for safer operation
    • Interactive users benefit from stability over speed
    • Prevents resource exhaustion on shared systems
    • Can be overridden with environment variable: export RESOURCE_PROFILE=balanced

Fixed

  • Context cancellation in TUI backup operations (critical)
  • Context cancellation in TUI restore operations (critical)
  • Better error diagnostics for "out of shared memory" errors
  • Improved resource detection and management

Technical Details

  • Profile system respects explicit user flags (--jobs, --parallel-dbs)
  • Conservative profile sets cfg.LargeDBMode = true automatically
  • TUI profile selection logged when Debug mode enabled
  • All profiles support both single and cluster restore operations

[3.42.50] - 2026-01-16 "Ctrl+C Signal Handling Fix"

Fixed - Proper Ctrl+C/SIGINT Handling in TUI

  • Added tea.InterruptMsg handling - Bubbletea v1.3+ sends InterruptMsg for SIGINT signals instead of a KeyMsg with "ctrl+c", causing cancellation to not work
  • Fixed cluster restore cancellation - Ctrl+C now properly cancels running restore operations
  • Fixed cluster backup cancellation - Ctrl+C now properly cancels running backup operations
  • Added interrupt handling to main menu - Proper cleanup on SIGINT from menu
  • Orphaned process cleanup - cleanup.KillOrphanedProcesses() called on all interrupt paths

Changed

  • All TUI execution views now handle both tea.KeyMsg ("ctrl+c") and tea.InterruptMsg
  • Context cancellation properly propagates to child processes via exec.CommandContext
  • No zombie pg_dump/pg_restore/gzip processes left behind on cancellation

[3.42.49] - 2026-01-16 "Unified Cluster Backup Progress"

Added - Unified Progress Display for Cluster Backup

  • Combined overall progress bar for cluster backup showing all phases:
    • Phase 1/3: Backing up Globals (0-15% of overall)
    • Phase 2/3: Backing up Databases (15-90% of overall)
    • Phase 3/3: Compressing Archive (90-100% of overall)
  • Current database indicator - Shows which database is currently being backed up
  • Phase-aware progress tracking - New fields in backup progress state:
    • overallPhase - Current phase (1=globals, 2=databases, 3=compressing)
    • phaseDesc - Human-readable phase description
  • Dual progress bars for cluster backup:
    • Overall progress bar showing combined operation progress
    • Database count progress bar showing individual database progress

Changed

  • Cluster backup TUI now shows unified progress display matching restore
  • Progress callbacks now include phase information
  • Better visual feedback during entire cluster backup operation

[3.42.48] - 2026-01-15 "Unified Cluster Restore Progress"

Added - Unified Progress Display for Cluster Restore

  • Combined overall progress bar showing progress across all restore phases:
    • Phase 1/3: Extracting Archive (0-60% of overall)
    • Phase 2/3: Restoring Globals (60-65% of overall)
    • Phase 3/3: Restoring Databases (65-100% of overall)
  • Current database indicator - Shows which database is currently being restored
  • Phase-aware progress tracking - New fields in progress state:
    • overallPhase - Current phase (1=extraction, 2=globals, 3=databases)
    • currentDB - Name of database currently being restored
    • extractionDone - Boolean flag for phase transition
  • Dual progress bars for cluster restore:
    • Overall progress bar showing combined operation progress
    • Phase-specific progress bar (extraction bytes or database count)

Changed

  • Cluster restore TUI now shows unified progress display
  • Progress callbacks now set phase and current database information
  • Extraction completion triggers automatic transition to globals phase
  • Database restore phase shows current database name with spinner

Improved

  • Better visual feedback during entire cluster restore operation
  • Clear phase indicators help users understand restore progress
  • Overall progress percentage gives better time estimates

[3.42.35] - 2026-01-15 "TUI Detailed Progress"

Added - Enhanced TUI Progress Display

  • Detailed progress bar in TUI restore - schollz-style progress bar with:
    • Byte progress display (e.g., 245 MB / 1.2 GB)
    • Transfer speed calculation (e.g., 45 MB/s)
    • ETA prediction for long operations
    • Unicode block-based visual bar
  • Real-time extraction progress - Archive extraction now reports actual bytes processed
  • Go-native tar extraction - Uses Go's archive/tar + compress/gzip when progress callback is set
  • New DetailedProgress component in TUI package:
    • NewDetailedProgress(total, description) - Byte-based progress
    • NewDetailedProgressItems(total, description) - Item count progress
    • NewDetailedProgressSpinner(description) - Indeterminate spinner
    • RenderProgressBar(width) - Generate schollz-style output
  • Progress callback API in restore engine:
    • SetProgressCallback(func(current, total int64, description string))
    • Allows TUI to receive real-time progress updates from restore operations
  • Shared progress state pattern for Bubble Tea integration

Changed

  • TUI restore execution now shows detailed byte progress during archive extraction
  • Cluster restore shows extraction progress instead of just spinner
  • Falls back to shell tar command when no progress callback is set (faster)

Technical Details

  • progressReader wrapper tracks bytes read through gzip/tar pipeline
  • Throttled progress updates (every 100ms) to avoid UI flooding
  • Thread-safe shared state pattern for cross-goroutine progress updates

[3.42.34] - 2026-01-14 "Filesystem Abstraction"

Added - spf13/afero for Filesystem Abstraction

  • New internal/fs package for testable filesystem operations
  • In-memory filesystem for unit testing without disk I/O
  • Global FS interface that can be swapped for testing:
    fs.SetFS(afero.NewMemMapFs())  // Use memory
    fs.ResetFS()                    // Back to real disk
    
  • Wrapper functions for all common file operations:
    • ReadFile, WriteFile, Create, Open, Remove, RemoveAll
    • Mkdir, MkdirAll, ReadDir, Walk, Glob
    • Exists, DirExists, IsDir, IsEmpty
    • TempDir, TempFile, CopyFile, FileSize
  • Testing helpers:
    • WithMemFs(fn) - Execute function with temp in-memory FS
    • SetupTestDir(files) - Create test directory structure
  • Comprehensive test suite demonstrating usage

Changed

  • Upgraded afero from v1.10.0 to v1.15.0

[3.42.33] - 2026-01-14 "Exponential Backoff Retry"

Added - cenkalti/backoff for Cloud Operation Retry

  • Exponential backoff retry for all cloud operations (S3, Azure, GCS)
  • Retry configurations:
    • DefaultRetryConfig() - 5 retries, 500ms→30s backoff, 5 min max
    • AggressiveRetryConfig() - 10 retries, 1s→60s backoff, 15 min max
    • QuickRetryConfig() - 3 retries, 100ms→5s backoff, 30s max
  • Smart error classification:
    • IsPermanentError() - Auth/bucket errors (no retry)
    • IsRetryableError() - Timeout/network errors (retry)
  • Retry logging - Each retry attempt is logged with wait duration

Changed

  • S3 simple upload, multipart upload, download now retry on transient failures
  • Azure simple upload, download now retry on transient failures
  • GCS upload, download now retry on transient failures
  • Large file multipart uploads use AggressiveRetryConfig() (more retries)

[3.42.32] - 2026-01-14 "Cross-Platform Colors"

Added - fatih/color for Cross-Platform Terminal Colors

  • Windows-compatible colors - Native Windows console API support
  • Color helper functions in logger package:
    • Success(), Error(), Warning(), Info() - Status messages with icons
    • Header(), Dim(), Bold() - Text styling
    • Green(), Red(), Yellow(), Cyan() - Colored text
    • StatusLine(), TableRow() - Formatted output
    • DisableColors(), EnableColors() - Runtime control
  • Consistent color scheme across all log levels

Changed

  • Logger CleanFormatter now uses fatih/color instead of raw ANSI codes
  • All progress indicators use fatih/color for [OK]/[FAIL] status
  • Automatic color detection (disabled for non-TTY)

[3.42.31] - 2026-01-14 "Visual Progress Bars"

Added - schollz/progressbar for Enhanced Progress Display

  • Visual progress bars for cloud uploads/downloads with:
    • Byte transfer display (e.g., 245 MB / 1.2 GB)
    • Transfer speed (e.g., 45 MB/s)
    • ETA prediction
    • Color-coded progress with Unicode blocks
  • Checksum verification progress - visual progress while calculating SHA-256
  • Spinner for indeterminate operations - Braille-style spinner when size unknown
  • New progress types: NewSchollzBar(), NewSchollzBarItems(), NewSchollzSpinner()
  • Progress bar Writer() method for io.Copy integration

Changed

  • Cloud download shows real-time byte progress instead of 10% log messages
  • Cloud upload shows visual progress bar instead of debug logs
  • Checksum verification shows progress for large files

[3.42.30] - 2026-01-09 "Better Error Aggregation"

Added - go-multierror for Cluster Restore Errors

  • Enhanced error reporting - Now shows ALL database failures, not just a count
  • Uses hashicorp/go-multierror for proper error aggregation
  • Each failed database error is preserved with full context
  • Bullet-pointed error output for readability:
    cluster restore completed with 3 failures:
    3 database(s) failed:
      • db1: restore failed: max_locks_per_transaction exceeded
      • db2: restore failed: connection refused
      • db3: failed to create database: permission denied
    

Changed

  • Replaced string slice error collection with proper *multierror.Error
  • Thread-safe error aggregation with dedicated mutex
  • Improved error wrapping with %w for error chain preservation

[3.42.10] - 2026-01-08 "Code Quality"

Fixed - Code Quality Issues

  • Removed deprecated io/ioutil usage (replaced with os)
  • Fixed os.DirEntry.ModTime()file.Info().ModTime()
  • Removed unused fields and variables
  • Fixed ineffective assignments in TUI code
  • Fixed error strings (no capitalization, no trailing punctuation)

[3.42.9] - 2026-01-08 "Diagnose Timeout Fix"

Fixed - diagnose.go Timeout Bugs

More short timeouts that caused large archive failures:

  • diagnoseClusterArchive(): tar listing 60s → 5 minutes
  • verifyWithPgRestore(): pg_restore --list 60s → 5 minutes
  • DiagnoseClusterDumps(): archive listing 120s → 10 minutes

Impact: These timeouts caused "context deadline exceeded" errors when diagnosing multi-GB backup archives, preventing TUI restore from even starting.

[3.42.8] - 2026-01-08 "TUI Timeout Fix"

Fixed - TUI Timeout Bugs Causing Backup/Restore Failures

ROOT CAUSE of 2-3 month TUI backup/restore failures identified and fixed:

Critical Timeout Fixes:

  • restore_preview.go: Safety check timeout increased from 60s → 10 minutes
    • Large archives (>1GB) take 2+ minutes to diagnose
    • Users saw "context deadline exceeded" before backup even started
  • dbselector.go: Database listing timeout increased from 15s → 60 seconds
    • Busy PostgreSQL servers need more time to respond
  • status.go: Status check timeout increased from 10s → 30 seconds
    • SSL negotiation and slow networks caused failures

Stability Improvements:

  • Panic recovery added to parallel goroutines in:
    • backup/engine.go:BackupCluster() - cluster backup workers
    • restore/engine.go:RestoreCluster() - cluster restore workers
    • Prevents single database panic from crashing entire operation

Bug Fix:

  • restore/engine.go: Fixed variable shadowing errcmdErr for exit code detection

[3.42.7] - 2026-01-08 "Context Killer Complete"

Fixed - Additional Deadlock Bugs in Restore & Engine

All remaining cmd.Wait() deadlock bugs fixed across the codebase:

internal/restore/engine.go:

  • executeRestoreWithDecompression() - gunzip/pigz pipeline restore
  • extractArchive() - tar extraction for cluster restore
  • restoreGlobals() - pg_dumpall globals restore

internal/backup/engine.go:

  • createArchive() - tar/pigz archive creation pipeline

internal/engine/mysqldump.go:

  • Backup() - mysqldump backup operation
  • BackupToWriter() - streaming mysqldump to writer

All 6 functions now use proper channel-based context handling with Process.Kill().

[3.42.6] - 2026-01-08 "Deadlock Killer"

Fixed - Backup Command Context Handling

Critical Bug: pg_dump/mysqldump could hang forever on context cancellation

The executeCommand, executeCommandWithProgress, executeMySQLWithProgressAndCompression, and executeMySQLWithCompression functions had a race condition where:

  1. A goroutine was spawned to read stderr
  2. cmd.Wait() was called directly
  3. If context was cancelled, the process was NOT killed
  4. The goroutine could hang forever waiting for stderr

Fix: All backup execution functions now use proper channel-based context handling:

// Wait for command with context handling
cmdDone := make(chan error, 1)
go func() {
    cmdDone <- cmd.Wait()
}()

select {
case cmdErr = <-cmdDone:
    // Command completed
case <-ctx.Done():
    // Context cancelled - kill process
    cmd.Process.Kill()
    <-cmdDone
    cmdErr = ctx.Err()
}

Affected Functions:

  • executeCommand() - pg_dump for cluster backup
  • executeCommandWithProgress() - pg_dump for single backup with progress
  • executeMySQLWithProgressAndCompression() - mysqldump pipeline
  • executeMySQLWithCompression() - mysqldump pipeline

This fixes: Backup operations hanging indefinitely when cancelled or timing out.

[3.42.5] - 2026-01-08 "False Positive Fix"

Fixed - Encryption Detection Bug

IsBackupEncrypted False Positive:

  • BUG FIX: IsBackupEncrypted() returned true for ALL files, blocking normal restores
  • Root cause: Fallback logic checked if first 12 bytes (nonce size) could be read - always true
  • Fix: Now properly detects known unencrypted formats by magic bytes:
    • Gzip: 1f 8b
    • PostgreSQL custom: PGDMP
    • Plain SQL: starts with --, SET, CREATE
  • Returns false if no metadata present and format is recognized as unencrypted
  • Affected file: internal/backup/encryption.go

[3.42.4] - 2026-01-08 "The Long Haul"

Fixed - Critical Restore Timeout Bug

Removed Arbitrary Timeouts from Backup/Restore Operations:

  • CRITICAL FIX: Removed 4-hour timeout that was killing large database restores
  • PostgreSQL cluster restores of 69GB+ databases no longer fail with "context deadline exceeded"
  • All backup/restore operations now use context.WithCancel instead of context.WithTimeout
  • Operations run until completion or manual cancellation (Ctrl+C)

Affected Files:

  • internal/tui/restore_exec.go: Changed from 4-hour timeout to context.WithCancel
  • internal/tui/backup_exec.go: Changed from 4-hour timeout to context.WithCancel
  • internal/backup/engine.go: Removed per-database timeout in cluster backup
  • cmd/restore.go: CLI restore commands use context.WithCancel

exec.Command Context Audit:

  • Fixed exec.Command without Context in internal/restore/engine.go:730
  • Added proper context handling to all external command calls
  • Added timeouts only for quick diagnostic/version checks (not restore path):
    • restore/version_check.go: 30s timeout for pg_restore --version check only
    • restore/error_report.go: 10s timeout for tool version detection
    • restore/diagnose.go: 60s timeout for diagnostic functions
    • pitr/binlog.go: 10s timeout for mysqlbinlog --version check
    • cleanup/processes.go: 5s timeout for process listing
    • auth/helper.go: 30s timeout for auth helper commands

Verification:

  • 54 total exec.CommandContext calls verified in backup/restore/pitr path
  • 0 exec.Command without Context in critical restore path
  • All 14 PostgreSQL exec calls use CommandContext (pg_dump, pg_restore, psql)
  • All 15 MySQL/MariaDB exec calls use CommandContext (mysqldump, mysql, mysqlbinlog)
  • All 14 test packages pass

Technical Details

  • Large Object (BLOB/BYTEA) restores are particularly affected by timeouts
  • 69GB database with large objects can take 5+ hours to restore
  • Previous 4-hour hard timeout was causing consistent failures
  • Now: No timeout - runs until complete or user cancels

[3.42.1] - 2026-01-07 "Resistance is Futile"

Added - Content-Defined Chunking Deduplication

Deduplication Engine:

  • New dbbackup dedup command family for space-efficient backups
  • Gear hash content-defined chunking (CDC) with 92%+ overlap on shifted data
  • SHA-256 content-addressed storage - chunks stored by hash
  • AES-256-GCM per-chunk encryption (optional, via --encrypt)
  • Gzip compression enabled by default
  • SQLite index for fast chunk lookups
  • JSON manifests track chunks per backup with full verification

Dedup Commands:

dbbackup dedup backup <file>              # Create deduplicated backup
dbbackup dedup backup <file> --encrypt    # With encryption
dbbackup dedup restore <id> <output>      # Restore from manifest
dbbackup dedup list                       # List all backups
dbbackup dedup stats                      # Show deduplication statistics
dbbackup dedup delete <id>                # Delete a backup manifest
dbbackup dedup gc                         # Garbage collect unreferenced chunks

Storage Structure:

<backup-dir>/dedup/
  chunks/           # Content-addressed chunk files (sharded by hash prefix)
  manifests/        # JSON manifest per backup
  chunks.db         # SQLite index for fast lookups

Test Results:

  • First 5MB backup: 448 chunks, 5MB stored
  • Modified 5MB file: 448 chunks, only 1 NEW chunk (1.6KB), 100% dedup ratio
  • Restore with SHA-256 verification

Added - Documentation Updates

  • Prometheus alerting rules added to SYSTEMD.md
  • Catalog sync instructions for existing backups

[3.41.1] - 2026-01-07

Fixed

  • Enabled CGO for Linux builds (required for SQLite catalog)

[3.41.0] - 2026-01-07 "The Operator"

Added - Systemd Integration & Prometheus Metrics

Embedded Systemd Installer:

  • New dbbackup install command installs as systemd service/timer
  • Supports single-database (--backup-type single) and cluster (--backup-type cluster) modes
  • Automatic dbbackup user/group creation with proper permissions
  • Hardened service units with security features (NoNewPrivileges, ProtectSystem, CapabilityBoundingSet)
  • Templated timer units with configurable schedules (daily, weekly, or custom OnCalendar)
  • Built-in dry-run mode (--dry-run) to preview installation
  • dbbackup install --status shows current installation state
  • dbbackup uninstall cleanly removes all systemd units and optionally configuration

Prometheus Metrics Support:

  • New dbbackup metrics export command writes textfile collector format
  • New dbbackup metrics serve command runs HTTP exporter on port 9399
  • Metrics: dbbackup_last_success_timestamp, dbbackup_rpo_seconds, dbbackup_backup_total, etc.
  • Integration with node_exporter textfile collector
  • Metrics automatically updated via ExecStopPost in service units
  • --with-metrics flag during install sets up exporter as systemd service

New Commands:

# Install as systemd service
sudo dbbackup install --backup-type cluster --schedule daily

# Install with Prometheus metrics
sudo dbbackup install --with-metrics --metrics-port 9399

# Check installation status
dbbackup install --status

# Export metrics for node_exporter
dbbackup metrics export --output /var/lib/dbbackup/metrics/dbbackup.prom

# Run HTTP metrics server
dbbackup metrics serve --port 9399

Technical Details

  • Systemd templates embedded with //go:embed for self-contained binary
  • Templates use ReadWritePaths for security isolation
  • Service units include proper OOMScoreAdjust (-100) to protect backups
  • Metrics exporter caches with 30-second TTL for performance
  • Graceful shutdown on SIGTERM for metrics server

[3.41.0] - 2026-01-07 "The Pre-Flight Check"

Added - Pre-Restore Validation

Automatic Dump Validation Before Restore:

  • SQL dump files are now validated BEFORE attempting restore
  • Detects truncated COPY blocks that cause "syntax error" failures
  • Catches corrupted backups in seconds instead of wasting 49+ minutes
  • Cluster restore pre-validates ALL dumps upfront (fail-fast approach)
  • Custom format .dump files now validated with pg_restore --list

Improved Error Messages:

  • Clear indication when dump file is truncated
  • Shows which table's COPY block was interrupted
  • Displays sample orphaned data for diagnosis
  • Provides actionable error messages with root cause

Fixed

  • P0: SQL Injection - Added identifier validation for database names in CREATE/DROP DATABASE to prevent SQL injection attacks; uses safe quoting and regex validation (alphanumeric + underscore only)
  • P0: Data Race - Fixed concurrent goroutines appending to shared error slice in notification manager; now uses mutex synchronization
  • P0: psql ON_ERROR_STOP - Added -v ON_ERROR_STOP=1 to psql commands to fail fast on first error instead of accumulating millions of errors
  • P1: Pipe deadlock - Fixed streaming compression deadlock when pg_dump blocks on full pipe buffer; now uses goroutine with proper context timeout handling
  • P1: SIGPIPE handling - Detect exit code 141 (broken pipe) and report compressor failure as root cause
  • P2: .dump validation - Custom format dumps now validated with pg_restore --list before restore
  • P2: fsync durability - Added outFile.Sync() after streaming compression to prevent truncation on power loss
  • Truncated .sql.gz dumps no longer waste hours on doomed restores
  • "syntax error at or near" errors now caught before restore begins
  • Cluster restores abort immediately if any dump is corrupted

Technical Details

  • Integrated Diagnoser into restore pipeline for pre-validation
  • Added quickValidateSQLDump() for fast integrity checks
  • Pre-validation runs on all .sql.gz and .dump files in cluster archives
  • Streaming compression uses channel-based wait with context cancellation
  • Zero performance impact on valid backups (diagnosis is fast)

[3.40.0] - 2026-01-05 "The Diagnostician"

Added - Restore Diagnostics & Error Reporting

Backup Diagnosis Command:

  • restore diagnose <archive> - Deep analysis of backup files before restore
  • Detects truncated dumps, corrupted archives, incomplete COPY blocks
  • PGDMP signature validation for PostgreSQL custom format
  • Gzip integrity verification with decompression test
  • pg_restore --list validation for custom format archives
  • --deep flag for exhaustive line-by-line analysis
  • --json flag for machine-readable output
  • Cluster archive diagnosis scans all contained dumps

Detailed Error Reporting:

  • Comprehensive error collector captures stderr during restore
  • Ring buffer prevents OOM on high-error restores (2M+ errors)
  • Error classification with actionable hints and recommendations
  • --save-debug-log <path> saves JSON report on failure
  • Reports include: exit codes, last errors, line context, tool versions
  • Automatic recommendations based on error patterns

TUI Restore Enhancements:

  • Dump validity safety check runs automatically before restore
  • Detects truncated/corrupted backups in restore preview
  • Press d to toggle debug log saving in Advanced Options
  • Debug logs saved to /tmp/dbbackup-restore-debug-*.json on failure
  • Press d in archive browser to run diagnosis on any backup

New Commands:

  • restore diagnose - Analyze backup file integrity and structure

New Flags:

  • --save-debug-log <path> - Save detailed JSON error report on failure
  • --diagnose - Run deep diagnosis before cluster restore
  • --deep - Enable exhaustive diagnosis (line-by-line analysis)
  • --json - Output diagnosis in JSON format
  • --keep-temp - Keep temporary files after diagnosis
  • --verbose - Show detailed diagnosis progress

Technical Details

  • 1,200+ lines of new diagnostic code
  • Error classification system with 15+ error patterns
  • Ring buffer stderr capture (1MB max, 10K lines)
  • Zero memory growth on high-error restores
  • Full TUI integration for diagnostics

[3.2.0] - 2025-12-13 "The Margin Eraser"

Added - Physical Backup Revolution

MySQL Clone Plugin Integration:

  • Native physical backup using MySQL 8.0.17+ Clone Plugin
  • No XtraBackup dependency - pure Go implementation
  • Real-time progress monitoring via performance_schema
  • Support for both local and remote clone operations

Filesystem Snapshot Orchestration:

  • LVM snapshot support with automatic cleanup
  • ZFS snapshot integration with send/receive
  • Btrfs subvolume snapshot support
  • Brief table lock (<100ms) for consistency
  • Automatic snapshot backend detection

Continuous Binlog Streaming:

  • Real-time binlog capture using MySQL replication protocol
  • Multiple targets: file, compressed file, S3 direct streaming
  • Sub-second RPO without impacting database server
  • Automatic position tracking and checkpointing

Parallel Cloud Streaming:

  • Direct database-to-S3 streaming (zero local storage)
  • Configurable worker pool for parallel uploads
  • S3 multipart upload with automatic retry
  • Support for S3, GCS, and Azure Blob Storage

Smart Engine Selection:

  • Automatic engine selection based on environment
  • MySQL version detection and capability checking
  • Filesystem type detection for optimal snapshot backend
  • Database size-based recommendations

New Commands:

  • engine list - List available backup engines
  • engine info <name> - Show detailed engine information
  • backup --engine=<name> - Use specific backup engine

Technical Details

  • 7,559 lines of new code
  • Zero new external dependencies
  • 10/10 platform builds successful
  • Full test coverage for new engines

[3.1.0] - 2025-11-26

Added - 🔄 Point-in-Time Recovery (PITR)

Complete PITR Implementation for PostgreSQL:

  • WAL Archiving: Continuous archiving of Write-Ahead Log files with compression and encryption support
  • Timeline Management: Track and manage PostgreSQL timeline history with branching support
  • Recovery Targets: Restore to specific timestamp, transaction ID (XID), LSN, named restore point, or immediate
  • PostgreSQL Version Support: Both modern (12+) and legacy recovery configuration formats
  • Recovery Actions: Promote to primary, pause for inspection, or shutdown after recovery
  • Comprehensive Testing: 700+ lines of tests covering all PITR functionality with 100% pass rate

New Commands:

PITR Management:

  • pitr enable - Configure PostgreSQL for WAL archiving and PITR
  • pitr disable - Disable WAL archiving in PostgreSQL configuration
  • pitr status - Display current PITR configuration and archive statistics

WAL Archive Operations:

  • wal archive <wal-file> <filename> - Archive WAL file (used by archive_command)
  • wal list - List all archived WAL files with details
  • wal cleanup - Remove old WAL files based on retention policy
  • wal timeline - Display timeline history and branching structure

Point-in-Time Restore:

  • restore pitr - Perform point-in-time recovery with multiple target types:
    • --target-time "YYYY-MM-DD HH:MM:SS" - Restore to specific timestamp
    • --target-xid <xid> - Restore to transaction ID
    • --target-lsn <lsn> - Restore to Log Sequence Number
    • --target-name <name> - Restore to named restore point
    • --target-immediate - Restore to earliest consistent point

Advanced PITR Features:

  • WAL Compression: gzip compression (70-80% space savings)
  • WAL Encryption: AES-256-GCM encryption for archived WAL files
  • Timeline Selection: Recover along specific timeline or latest
  • Recovery Actions: Promote (default), pause, or shutdown after target reached
  • Inclusive/Exclusive: Control whether target transaction is included
  • Auto-Start: Automatically start PostgreSQL after recovery setup
  • Recovery Monitoring: Real-time monitoring of recovery progress

Configuration Options:

# Enable PITR with compression and encryption
./dbbackup pitr enable --archive-dir /backups/wal_archive \
  --compress --encrypt --encryption-key-file /secure/key.bin

# Perform PITR to specific time
./dbbackup restore pitr \
  --base-backup /backups/base.tar.gz \
  --wal-archive /backups/wal_archive \
  --target-time "2024-11-26 14:30:00" \
  --target-dir /var/lib/postgresql/14/restored \
  --auto-start --monitor

Technical Details:

  • WAL file parsing and validation (timeline, segment, extension detection)
  • Timeline history parsing (.history files) with consistency validation
  • Automatic PostgreSQL version detection (12+ vs legacy)
  • Recovery configuration generation (postgresql.auto.conf + recovery.signal)
  • Data directory validation (exists, writable, PostgreSQL not running)
  • Comprehensive error handling and validation

Documentation:

  • Complete PITR section in README.md (200+ lines)
  • Dedicated PITR.md guide with detailed examples and troubleshooting
  • Test suite documentation (tests/pitr_complete_test.go)

Files Added:

  • internal/pitr/wal/ - WAL archiving and parsing
  • internal/pitr/config/ - Recovery configuration generation
  • internal/pitr/timeline/ - Timeline management
  • cmd/pitr.go - PITR command implementation
  • cmd/wal.go - WAL management commands
  • cmd/restore_pitr.go - PITR restore command
  • tests/pitr_complete_test.go - Comprehensive test suite (700+ lines)
  • PITR.md - Complete PITR guide

Performance:

  • WAL archiving: ~100-200 MB/s (with compression)
  • WAL encryption: ~1-2 GB/s (streaming)
  • Recovery replay: 10-100 MB/s (disk I/O dependent)
  • Minimal overhead during normal operations

Use Cases:

  • Disaster recovery from accidental data deletion
  • Rollback to pre-migration state
  • Compliance and audit requirements
  • Testing and what-if scenarios
  • Timeline branching for parallel recovery paths

Changed

  • Licensing: Added Apache License 2.0 to the project (LICENSE file)
  • Version: Updated to v3.1.0
  • Enhanced metadata format with PITR information
  • Improved progress reporting for long-running operations
  • Better error messages for PITR operations

Production

  • Production Validated: 2 production hosts
  • Databases backed up: 8 databases nightly
  • Retention policy: 30-day retention with minimum 5 backups
  • Backup volume: ~10MB/night
  • Schedule: 02:09 and 02:25 CET
  • Impact: Resolved 4-day backup failure immediately
  • User feedback: "cleanup command is SO gut" | "--dry-run: chef's kiss!" 💋

Documentation

  • Added comprehensive PITR.md guide (complete PITR documentation)
  • Updated README.md with PITR section (200+ lines)
  • Updated CHANGELOG.md with v3.1.0 details
  • Added NOTICE file for Apache License attribution
  • Created comprehensive test suite (tests/pitr_complete_test.go - 700+ lines)

[3.0.0] - 2025-11-26

Added - AES-256-GCM Encryption (Phase 4)

Secure Backup Encryption:

  • Algorithm: AES-256-GCM authenticated encryption (prevents tampering)
  • Key Derivation: PBKDF2-SHA256 with 600,000 iterations (OWASP 2024 recommended)
  • Streaming Encryption: Memory-efficient for large backups (O(buffer) not O(file))
  • Key Sources: File (raw/base64), environment variable, or passphrase
  • Auto-Detection: Restore automatically detects and decrypts encrypted backups
  • Metadata Tracking: Encrypted flag and algorithm stored in .meta.json

CLI Integration:

  • --encrypt - Enable encryption for backup operations
  • --encryption-key-file <path> - Path to 32-byte encryption key (raw or base64 encoded)
  • --encryption-key-env <var> - Environment variable containing key (default: DBBACKUP_ENCRYPTION_KEY)
  • Automatic decryption on restore (no extra flags needed)

Security Features:

  • Unique nonce per encryption (no key reuse vulnerabilities)
  • Cryptographically secure random generation (crypto/rand)
  • Key validation (32 bytes required)
  • Authenticated encryption prevents tampering attacks
  • 56-byte header: Magic(16) + Algorithm(16) + Nonce(12) + Salt(32)

Usage Examples:

# Generate encryption key
head -c 32 /dev/urandom | base64 > encryption.key

# Encrypted backup
./dbbackup backup single mydb --encrypt --encryption-key-file encryption.key

# Restore (automatic decryption)
./dbbackup restore single mydb_backup.sql.gz --encryption-key-file encryption.key --confirm

Performance:

  • Encryption speed: ~1-2 GB/s (streaming, no memory bottleneck)
  • Overhead: 56 bytes header + 16 bytes GCM tag per file
  • Key derivation: ~1.4s for 600k iterations (intentionally slow for security)

Files Added:

  • internal/crypto/interface.go - Encryption interface and configuration
  • internal/crypto/aes.go - AES-256-GCM implementation (272 lines)
  • internal/crypto/aes_test.go - Comprehensive test suite (all tests passing)
  • cmd/encryption.go - CLI encryption helpers
  • internal/backup/encryption.go - Backup encryption operations
  • Total: ~1,200 lines across 13 files

Added - Incremental Backups (Phase 3B)

MySQL/MariaDB Incremental Backups:

  • Change Detection: mtime-based file modification tracking
  • Archive Format: tar.gz containing only changed files since base backup
  • Space Savings: 70-95% smaller than full backups (typical)
  • Backup Chain: Tracks base → incremental relationships with metadata
  • Checksum Verification: SHA-256 integrity checking
  • Auto-Detection: CLI automatically uses correct engine for PostgreSQL vs MySQL

MySQL-Specific Exclusions:

  • Relay logs (relay-log, relay-bin*)
  • Binary logs (mysql-bin*, binlog*)
  • InnoDB redo logs (ib_logfile*)
  • InnoDB undo logs (undo_*)
  • Performance schema (in-memory)
  • Temporary files (#sql*, *.tmp)
  • Lock files (*.lock, auto.cnf.lock)
  • PID files (*.pid, mysqld.pid)
  • Error logs (*.err, error.log)
  • Slow query logs (slow.log)
  • General logs (general.log, query.log)

CLI Integration:

  • --backup-type <full|incremental> - Backup type (default: full)
  • --base-backup <path> - Path to base backup (required for incremental)
  • Auto-detects database type (PostgreSQL vs MySQL) and uses appropriate engine
  • Same interface for both database types

Usage Examples:

# Full backup (base)
./dbbackup backup single mydb --db-type mysql --backup-type full

# Incremental backup
./dbbackup backup single mydb \
  --db-type mysql \
  --backup-type incremental \
  --base-backup /backups/mydb_20251126.tar.gz

# Restore incremental
./dbbackup restore incremental \
  --base-backup mydb_base.tar.gz \
  --incremental-backup mydb_incr_20251126.tar.gz \
  --target /restore/path

Implementation:

  • Copy-paste-adapt from Phase 3A PostgreSQL (95% code reuse)
  • Interface-based design enables sharing tests between engines
  • internal/backup/incremental_mysql.go - MySQL incremental engine (530 lines)
  • All existing tests pass immediately (interface compatibility)
  • Development time: 30 minutes (vs 5-6h estimated) - 10x speedup!

Combined Features:

# Encrypted + Incremental backup
./dbbackup backup single mydb \
  --backup-type incremental \
  --base-backup mydb_base.tar.gz \
  --encrypt \
  --encryption-key-file key.txt

Changed

  • Version: Bumped to 3.0.0 (major feature release)
  • Backup Engine: Integrated encryption and incremental capabilities
  • Restore Engine: Added automatic decryption detection
  • Metadata Format: Extended with encryption and incremental fields

Testing

  • Encryption tests: 4 tests passing (TestAESEncryptionDecryption, TestKeyDerivation, TestKeyValidation, TestLargeData)
  • Incremental tests: 2 tests passing (TestIncrementalBackupRestore, TestIncrementalBackupErrors)
  • Roundtrip validation: Encrypt → Decrypt → Verify (data matches perfectly)
  • Build: All platforms compile successfully
  • Interface compatibility: PostgreSQL and MySQL engines share test suite

Documentation

  • Updated README.md with encryption and incremental sections
  • Added PHASE4_COMPLETION.md - Encryption implementation details
  • Added PHASE3B_COMPLETION.md - MySQL incremental implementation report
  • Usage examples for encryption, incremental, and combined workflows

Performance

  • Phase 4: Completed in ~1h (encryption library + CLI integration)
  • Phase 3B: Completed in 30 minutes (vs 5-6h estimated)
  • Total: 2 major features delivered in 1 day (planned: 6 hours, actual: ~2 hours)
  • Quality: Production-ready, all tests passing, no breaking changes

Commits

  • Phase 4: 3 commits (7d96ec7, f9140cf, dd614dd, 8bbca16)
  • Phase 3B: 2 commits (357084c, a0974ef)
  • Docs: 1 commit (3b9055b)

[2.1.0] - 2025-11-26

Added - Cloud Storage Integration

  • S3/MinIO/B2 Support: Native S3-compatible storage backend with streaming uploads
  • Azure Blob Storage: Native Azure integration with block blob support for files >256MB
  • Google Cloud Storage: Native GCS integration with 16MB chunked uploads
  • Cloud URI Syntax: Direct backup/restore using --cloud s3://bucket/path URIs
  • TUI Cloud Settings: Configure cloud providers directly in interactive menu
    • Cloud Storage Enabled toggle
    • Provider selector (S3, MinIO, B2, Azure, GCS)
    • Bucket/Container configuration
    • Region configuration
    • Credential management with masking
    • Auto-upload toggle
  • Multipart Uploads: Automatic multipart uploads for files >100MB (S3/MinIO/B2)
  • Streaming Transfers: Memory-efficient streaming for all cloud operations
  • Progress Tracking: Real-time upload/download progress with ETA
  • Metadata Sync: Automatic .sha256 and .info file upload alongside backups
  • Cloud Verification: Verify backup integrity directly from cloud storage
  • Cloud Cleanup: Apply retention policies to cloud-stored backups

Added - Cross-Platform Support

  • Windows Support: Native binaries for Windows Intel (amd64) and ARM (arm64)
  • NetBSD Support: Full support for NetBSD amd64 (disk checks use safe defaults)
  • Platform-Specific Implementations:
    • resources_unix.go - Linux, macOS, FreeBSD, OpenBSD
    • resources_windows.go - Windows stub implementation
    • disk_check_netbsd.go - NetBSD disk space stub
  • Build Tags: Proper Go build constraints for platform-specific code
  • All Platforms Building: 10/10 platforms successfully compile
    • Linux (amd64, arm64, armv7)
    • macOS (Intel, Apple Silicon)
    • Windows (Intel, ARM)
    • FreeBSD amd64
    • OpenBSD amd64
      • NetBSD amd64

Changed

  • Cloud Auto-Upload: When CloudEnabled=true and CloudAutoUpload=true, backups automatically upload after creation
  • Configuration: Added cloud settings to TUI settings interface
  • Backup Engine: Integrated cloud upload into backup workflow with progress tracking

Fixed

  • BSD Syscall Issues: Fixed syscall.Rlimit type mismatches (int64 vs uint64) on BSD platforms
  • OpenBSD RLIMIT_AS: Made RLIMIT_AS check Linux-only (not available on OpenBSD)
  • NetBSD Disk Checks: Added safe default implementation for NetBSD (syscall.Statfs unavailable)
  • Cross-Platform Builds: Resolved Windows syscall.Rlimit undefined errors

Documentation

  • Updated README.md with Cloud Storage section and examples
  • Enhanced CLOUD.md with setup guides for all providers
  • Added testing scripts for Azure and GCS
  • Docker Compose files for Azurite and fake-gcs-server

Testing

  • Added scripts/test_azure_storage.sh - Azure Blob Storage integration tests
  • Added scripts/test_gcs_storage.sh - Google Cloud Storage integration tests
  • Docker Compose setups for local testing (Azurite, fake-gcs-server, MinIO)

[2.0.0] - 2025-11-25

Added - Production-Ready Release

  • 100% Test Coverage: All 24 automated tests passing
  • Zero Critical Issues: Production-validated and deployment-ready
  • Backup Verification: SHA-256 checksum generation and validation
  • JSON Metadata: Structured .info files with backup metadata
  • Retention Policy: Automatic cleanup of old backups with configurable retention
  • Configuration Management:
    • Auto-save/load settings to .dbbackup.conf in current directory
    • Per-directory configuration for different projects
    • CLI flags always take precedence over saved configuration
    • Passwords excluded from saved configuration files

Added - Performance Optimizations

  • Parallel Cluster Operations: Worker pool pattern for concurrent database operations
  • Memory Efficiency: Streaming command output eliminates OOM errors
  • Optimized Goroutines: Ticker-based progress indicators reduce CPU overhead
  • Configurable Concurrency: CLUSTER_PARALLELISM environment variable

Added - Reliability Enhancements

  • Context Cleanup: Proper resource cleanup with sync.Once and io.Closer interface
  • Process Management: Thread-safe process tracking with automatic cleanup on exit
  • Error Classification: Regex-based error pattern matching for robust error handling
  • Performance Caching: Disk space checks cached with 30-second TTL
  • Metrics Collection: Structured logging with operation metrics

Fixed

  • Configuration Bug: CLI flags now correctly override config file values
  • Memory Leaks: Proper cleanup prevents resource leaks in long-running operations

Changed

  • Streaming Architecture: Constant ~1GB memory footprint regardless of database size
  • Cross-Platform: Native binaries for Linux (x64/ARM), macOS (x64/ARM), FreeBSD, OpenBSD

[1.2.0] - 2025-11-12

Added

  • Interactive TUI: Full terminal user interface with progress tracking
  • Database Selector: Interactive database selection for backup operations
  • Archive Browser: Browse and restore from backup archives
  • Configuration Settings: In-TUI configuration management
  • CPU Detection: Automatic CPU detection and optimization

Changed

  • Improved error handling and user feedback
  • Enhanced progress tracking with real-time updates

[1.1.0] - 2025-11-10

Added

  • Multi-Database Support: PostgreSQL, MySQL, MariaDB
  • Cluster Operations: Full cluster backup and restore for PostgreSQL
  • Sample Backups: Create reduced-size backups for testing
  • Parallel Processing: Automatic CPU detection and parallel jobs

Changed

  • Refactored command structure for better organization
  • Improved compression handling

[1.0.0] - 2025-11-08

Added

  • Initial release
  • Single database backup and restore
  • PostgreSQL support
  • Basic CLI interface
  • Streaming compression

Version Numbering

  • Major (X.0.0): Breaking changes, major feature additions
  • Minor (0.X.0): New features, non-breaking changes
  • Patch (0.0.X): Bug fixes, minor improvements

Upcoming Features

See ROADMAP.md for planned features:

  • Phase 3: Incremental Backups
  • Phase 4: Encryption (AES-256)
  • Phase 5: PITR (Point-in-Time Recovery)
  • Phase 6: Enterprise Features (Prometheus metrics, remote restore)