Critical fix: Large DB Mode settings (reduced parallelism, increased locks)
were not being reapplied when user changed resource profiles in Settings.
This caused cluster restores to fail with 'out of shared memory' errors on
low-resource VMs even when Large DB Mode was enabled.
Now ApplyResourceProfile() properly applies LargeDBMode modifiers after
setting base profile values, ensuring parallelism stays reduced.
Example: balanced profile with Large DB Mode:
- ClusterParallelism: 4 → 2 (halved)
- MaxLocksPerTxn: 2048 → 8192 (increased)
This allows successful cluster restores on low-spec VMs.
BREAKING CHANGE: Removed 'large-db' as standalone profile
New Design:
- Resource Profiles now purely represent VM capacity:
conservative, balanced, performance, max-performance
- LargeDBMode is a separate boolean toggle that modifies any profile:
- Reduces ClusterParallelism and Jobs by 50%
- Forces MaxLocksPerTxn = 8192
- Increases MaintenanceWorkMem
TUI Changes:
- 'l' key now toggles LargeDBMode ON/OFF instead of applying large-db profile
- New 'Large DB Mode' setting in settings menu
- Settings are persisted to .dbbackup.conf
This allows any resource profile to be combined with large database
optimization, giving users more flexibility on both small and large VMs.
- ClusterParallelism now defaults to recommended profile (1 for small VMs)
- Jobs and DumpJobs now use profile values instead of raw CPU counts
- Small VMs (4 cores, 32GB) will now get 'conservative' or 'balanced' profile
with lower parallelism to avoid 'out of shared memory' errors
This fixes the issue where small VMs defaulted to ClusterParallelism=2
regardless of detected resources, causing restore failures on large DBs.
- Add memory detection for Linux, macOS, Windows
- Add 5 predefined profiles: conservative, balanced, performance, max-performance, large-db
- Add Resource Profile and Cluster Parallelism settings in TUI
- Add quick hotkeys: 'l' for large-db, 'c' for conservative, 'p' for recommendations
- Display system resources (CPU cores, memory) and recommended profile
- Auto-detect and recommend profile based on system resources
Fixes 'out of shared memory' errors when restoring large databases on small VMs.
Use 'large-db' or 'conservative' profile for large databases on constrained systems.
CRITICAL FIXES:
- Encryption detection false positive (IsBackupEncrypted returned true for ALL files)
- 12 cmd.Wait() deadlocks fixed with channel-based context handling
- TUI timeout bugs: 60s->10min for safety checks, 15s->60s for DB listing
- diagnose.go timeouts: 60s->5min for tar/pg_restore operations
- Panic recovery added to parallel backup/restore goroutines
- Variable shadowing fix in restore/engine.go
These bugs caused pg_dump backups to fail through TUI for months.
This is a critical bugfix release addressing multiple hardcoded temporary directory paths
that prevented proper use of the WorkDir configuration option.
PROBLEM:
Users configuring WorkDir (e.g., /u01/dba/tmp) for systems with small root filesystems
still experienced failures because critical operations hardcoded /tmp instead of respecting
the configured WorkDir. This made the WorkDir option essentially non-functional.
FIXED LOCATIONS:
1. internal/restore/engine.go:632 - CRITICAL: Used BackupDir instead of WorkDir for extraction
2. cmd/restore.go:354,834 - CLI restore/diagnose commands ignored WorkDir
3. cmd/migrate.go:208,347 - Migration commands hardcoded /tmp
4. internal/migrate/engine.go:120 - Migration engine ignored WorkDir
5. internal/config/config.go:224 - SwapFilePath hardcoded /tmp
6. internal/config/config.go:519 - Backup directory fallback hardcoded /tmp
7. internal/tui/restore_exec.go:161 - Debug logs hardcoded /tmp
8. internal/tui/settings.go:805 - Directory browser default hardcoded /tmp
9. internal/tui/restore_preview.go:474 - Display message hardcoded /tmp
NEW FEATURES:
- Added Config.GetEffectiveWorkDir() helper method
- WorkDir now respects WORK_DIR environment variable
- All temp operations now consistently use configured WorkDir with /tmp fallback
IMPACT:
- Restores on systems with small root disks now work properly with WorkDir configured
- Admins can control disk space usage for all temporary operations
- Debug logs, extraction dirs, swap files all respect WorkDir setting
Version: 3.42.1 (Critical Fix Release)
- Added WorkDir to Config for custom temp directory
- TUI Settings: new 'Work Directory' option to set alternative temp location
- Restore Preview: press 'w' to toggle work directory (uses backup dir as default)
- Diagnose View: now uses configured WorkDir for cluster extraction
- Config persistence: WorkDir saved to .dbbackup.conf
This fixes diagnosis/restore failures when /tmp is too small for large archives.
Use cases: servers with limited /tmp, 70GB+ archives needing 280GB+ extraction space.
- Add --dry-run/-n flag for backup commands with comprehensive preflight checks
- Database connectivity validation
- Required tools availability check
- Storage target and permissions verification
- Backup size estimation
- Encryption and cloud storage configuration validation
- Implement GFS (Grandfather-Father-Son) retention policies
- Daily/Weekly/Monthly/Yearly tier classification
- Configurable retention counts per tier
- Custom weekly day and monthly day settings
- ISO week handling for proper week boundaries
- Add notification system with SMTP and webhook support
- SMTP email notifications with TLS/STARTTLS
- Webhook HTTP notifications with HMAC-SHA256 signing
- Slack-compatible webhook payload format
- Event types: backup/restore started/completed/failed, cleanup, verify, PITR
- Configurable severity levels and retry logic
- Update README.md with documentation for all new features
- Add .golangci.yml with minimal linters (govet, ineffassign)
- Run gofmt -s and goimports on all files to fix formatting
- Disable fieldalignment and copylocks checks in govet
Implemented full native support for Azure Blob Storage and Google Cloud Storage:
**Azure Blob Storage (internal/cloud/azure.go):**
- Native Azure SDK integration (github.com/Azure/azure-sdk-for-go)
- Block blob upload for large files (>256MB with 100MB blocks)
- Azurite emulator support for local testing
- Production Azure authentication (account name + key)
- SHA-256 integrity verification with metadata
- Streaming uploads with progress tracking
**Google Cloud Storage (internal/cloud/gcs.go):**
- Native GCS SDK integration (cloud.google.com/go/storage)
- Chunked upload for large files (16MB chunks)
- fake-gcs-server emulator support for local testing
- Application Default Credentials support
- Service account JSON key file support
- SHA-256 integrity verification with metadata
- Streaming uploads with progress tracking
**Backend Integration:**
- Updated NewBackend() factory to support azure/azblob and gs/gcs providers
- Added Name() methods to both backends
- Fixed ProgressReader usage across all backends
- Updated Config comments to document Azure/GCS support
**Testing Infrastructure:**
- docker-compose.azurite.yml: Azurite + PostgreSQL + MySQL test environment
- docker-compose.gcs.yml: fake-gcs-server + PostgreSQL + MySQL test environment
- scripts/test_azure_storage.sh: 8 comprehensive Azure integration tests
- scripts/test_gcs_storage.sh: 8 comprehensive GCS integration tests
- Both test scripts validate upload/download/verify/cleanup/restore operations
**Documentation:**
- AZURE.md: Complete guide (600+ lines) covering setup, authentication, usage
- GCS.md: Complete guide (600+ lines) covering setup, authentication, usage
- Updated CLOUD.md with Azure and GCS sections
- Updated internal/config/config.go with Azure/GCS field documentation
**Test Coverage:**
- Large file uploads (300MB for Azure, 200MB for GCS)
- Block/chunked upload verification
- Backup verification with SHA-256 checksums
- Restore from cloud URIs
- Cleanup and retention policies
- Emulator support for both providers
**Dependencies Added:**
- Azure: github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.6.3
- GCS: cloud.google.com/go/storage v1.57.2
- Plus transitive dependencies (~50+ packages)
**Build:**
- Compiles successfully: 68MB binary
- All imports resolved
- No compilation errors
Sprint 4 closes the multi-cloud gap identified in Sprint 3 evaluation.
Users can now use Azure and GCS URIs that were previously parsed but unsupported.
- Add cloud configuration to Config struct
- Integrate automatic upload into backup flow
- Add --cloud-auto-upload flag to all backup commands
- Support environment variables for cloud credentials
- Upload both backup file and metadata to cloud
- Non-blocking: backup succeeds even if cloud upload fails
Usage:
dbbackup backup single mydb --cloud-auto-upload \
--cloud-bucket my-backups \
--cloud-provider s3
Or via environment:
export CLOUD_ENABLED=true
export CLOUD_AUTO_UPLOAD=true
export CLOUD_BUCKET=my-backups
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
dbbackup backup single mydb
Credentials from AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
- Implement context cleanup with sync.Once and io.Closer interface
- Add regex-based error classification for robust error handling
- Create ProcessManager with thread-safe process tracking
- Add disk space caching with 30s TTL for performance
- Implement metrics collection with structured logging
- Add config persistence (.dbbackup.conf) for directory-local settings
- Auto-save/auto-load configuration with --no-config and --no-save-config flags
- Successfully tested with 42GB d7030 database (35K large objects, 36min backup)
- All cross-platform builds working (9/10 platforms)
1. Parallel Cluster Operations (3-5x speedup):
- Added ClusterParallelism config option (default: 2 concurrent operations)
- Implemented worker pool pattern for cluster backup/restore
- Thread-safe progress tracking with sync.Mutex and atomic counters
- Configurable via CLUSTER_PARALLELISM env var
2. Progress Indicator Optimizations:
- Replaced busy-wait select+sleep with time.Ticker in Spinner
- Replaced busy-wait select+sleep with time.Ticker in Dots
- More CPU-efficient, cleaner shutdown pattern
3. Signal Handler Cleanup:
- Added signal.Stop() to properly deregister signal handlers
- Prevents goroutine leaks on long-running operations
- Applied to both single and cluster restore commands
Benefits:
- Cluster backup/restore 3-5x faster with 2-4 workers
- Reduced CPU usage in progress spinners
- Cleaner goroutine lifecycle management
- No breaking changes - sequential by default if parallelism=1
Phase 1: Detection & Guidance
- Detect OS user vs DB user mismatch
- Identify PostgreSQL authentication method (peer/ident/md5)
- Show helpful error messages with 4 solutions:
1. sudo -u <user> (for peer auth)
2. ~/.pgpass file (recommended)
3. PGPASSWORD env variable
4. --password flag
Phase 2: pgpass Support
- Auto-load passwords from ~/.pgpass file
- Support standard PostgreSQL pgpass format
- Check file permissions (must be 0600)
- Support wildcard matching (host:port:db:user:pass)
Tested on CentOS Stream 10 with PostgreSQL 16
HIGH PRIORITY FIXES:
1. Remove unused progressCallback mechanism (dead code cleanup)
2. Add unit tests for restore package (formats, safety checks)
- Test coverage for archive format detection
- Test coverage for safety validation
- Added NullLogger for testing
3. Fix ignored errors in backup pipeline
- Handle StdoutPipe() errors properly
- Log stderr pipe errors
- Document CPU detection errors
IMPROVEMENTS:
- formats_test.go: 8 test functions, all passing
- safety_test.go: 6 test functions for validation
- logger/null.go: Test helper for unit tests
- Proper error handling in streaming compression
- Fixed indentation in stderr handling
- PostgreSQL and MySQL support
- Interactive TUI with fixed menu navigation
- Line-by-line progress display
- CPU-aware parallel processing
- Cross-platform build support
- Configuration settings menu
- Silent mode for TUI operations