Previously, the ETA during phase 3 (database restores) would appear to
hang because dbPhaseElapsed was only updated when a new database started
restoring, not during the restore operation itself.
Fixed by:
- Added phase3StartTime to track when phase 3 begins
- Calculate dbPhaseElapsed in realtime using time.Since(phase3StartTime)
- ETA now updates every 100ms tick instead of only on database transitions
This ensures the elapsed time and ETA display continuously update during
long-running database restores.
BREAKING CHANGE: Removed 'large-db' as standalone profile
New Design:
- Resource Profiles now purely represent VM capacity:
conservative, balanced, performance, max-performance
- LargeDBMode is a separate boolean toggle that modifies any profile:
- Reduces ClusterParallelism and Jobs by 50%
- Forces MaxLocksPerTxn = 8192
- Increases MaintenanceWorkMem
TUI Changes:
- 'l' key now toggles LargeDBMode ON/OFF instead of applying large-db profile
- New 'Large DB Mode' setting in settings menu
- Settings are persisted to .dbbackup.conf
This allows any resource profile to be combined with large database
optimization, giving users more flexibility on both small and large VMs.
- Displays current Resource Profile with parallelism settings
- Shows CPU Workload type (balanced/cpu-intensive/io-intensive)
- Shows Cluster Parallelism (number of concurrent database restores)
This helps users understand what performance settings will be used
before starting a cluster restore operation.
- ClusterParallelism now defaults to recommended profile (1 for small VMs)
- Jobs and DumpJobs now use profile values instead of raw CPU counts
- Small VMs (4 cores, 32GB) will now get 'conservative' or 'balanced' profile
with lower parallelism to avoid 'out of shared memory' errors
This fixes the issue where small VMs defaulted to ClusterParallelism=2
regardless of detected resources, causing restore failures on large DBs.
- Parse and structure error messages for clean TUI display
- Extract error type, message, hint, and failed databases
- Show specific recommendations based on error type
- Fix for 'out of shared memory' - suggest profile settings
- Limit line width to prevent scrambled display
- Add structured sections: Error Details, Diagnosis, Recommendations
- Add memory detection for Linux, macOS, Windows
- Add 5 predefined profiles: conservative, balanced, performance, max-performance, large-db
- Add Resource Profile and Cluster Parallelism settings in TUI
- Add quick hotkeys: 'l' for large-db, 'c' for conservative, 'p' for recommendations
- Display system resources (CPU cores, memory) and recommended profile
- Auto-detect and recommend profile based on system resources
Fixes 'out of shared memory' errors when restoring large databases on small VMs.
Use 'large-db' or 'conservative' profile for large databases on constrained systems.
- Allow cleanup toggle even when preview detection failed
- Show 'detection pending' message instead of blocking the toggle
- Will re-detect databases at restore execution time
- Always show cleanup toggle option for cluster restores
- Better messaging: 'enabled/disabled' instead of showing 0 count
- Detection in preview may fail or return stale results
- Re-detect user databases when cleanup is enabled at execution time
- Fall back to preview list if re-detection fails
- Ensures actual databases are dropped, not just what was detected earlier
- Fixed psql connection for database detection (always use -h flag)
- Use CombinedOutput() to capture stderr for better diagnostics
- Added existingDBError tracking in restore preview
- Show 'Unable to detect' instead of misleading 'None' when listing fails
- Disable cleanup toggle when database detection failed
- Add product branding header to main menu (version + tagline)
- Fix backup success/error report formatting consistency
- Remove extra newline before error box in backup_exec
- Align backup and restore completion screens
Three high-value improvements for cluster restore:
1. Weighted progress by database size
- Progress now shows percentage by data volume, not just count
- Phase 3/3: Databases (2/7) - 45.2% by size
- Gives more accurate ETA for clusters with varied DB sizes
2. Pre-extraction disk space check
- Checks workdir has 3x archive size before extraction
- Prevents partial extraction failures when disk fills mid-way
- Clear error message with required vs available GB
3. --parallel-dbs flag for concurrent restores
- dbbackup restore cluster archive.tar.gz --parallel-dbs=4
- Overrides CLUSTER_PARALLELISM config setting
- Set to 1 for sequential restore (safest for large objects)
Bubbletea v1.3+ sends InterruptMsg for SIGINT signals instead of
KeyMsg with 'ctrl+c', causing cancellation to not work properly.
- Add tea.InterruptMsg handling to restore_exec.go
- Add tea.InterruptMsg handling to backup_exec.go
- Add tea.InterruptMsg handling to menu.go
- Call cleanup.KillOrphanedProcesses on all interrupt paths
- No zombie pg_dump/pg_restore/gzip processes left behind
Fixes Ctrl+C not working during cluster restore/backup operations.
v3.42.50
- Add box-style headers for success/failure states
- Display comprehensive summary with archive info, type, database count
- Show timing section with total time, throughput, and average per-DB stats
- Use consistent styling and formatting across all result views
- Improve visual hierarchy with section separators
- Add DatabaseProgressWithTimingCallback for timing-aware progress reporting
- Track elapsed time and average duration per database during restore phase
- Display ETA based on completed database restore times
- Show restore phase elapsed time in progress bar
- Enhance cluster restore progress bar with [elapsed / ETA: remaining] format
- Add ProgressCallback and DatabaseProgressCallback to backup engine
- Add SetProgressCallback() and SetDatabaseProgressCallback() methods
- Wire database progress callback in backup_exec.go
- Show progress bar (Database: [████░░░] X/Y) during cluster backups
- Hide spinner when progress bar is displayed
- Use shared progress state pattern for thread-safe TUI updates
- CheckDiskSpace now uses GetEffectiveWorkDir() instead of BackupDir
- Dynamic timeout calculation based on file size:
- diagnoseClusterArchive: 5 + (GB/3) min, max 60 min
- verifyWithPgRestore: 5 + (GB/5) min, max 30 min
- DiagnoseClusterDumps: 10 + (GB/3) min, max 120 min
- TUI safety checks: 10 + (GB/5) min, max 120 min
- Timeout vs corruption differentiation (no false CORRUPTED on timeout)
- Streaming tar listing to avoid OOM on large archives
For 119GB archives: ~45 min timeout instead of 5 min false-positive
- verifyArchiveCmd now uses restore.Safety and restore.Diagnoser
- Same validation logic in backup manager verify and restore safety checks
- No more discrepancy between verify showing valid and restore failing
- Block cluster restore if existing databases found and cleanup not enabled
- User must press 'c' to enable 'Clean All First' before proceeding
- Prevents accidental data conflicts during disaster recovery
- Bug #24: Missing safety gate for cluster restore
CRITICAL FIXES:
- Encryption detection false positive (IsBackupEncrypted returned true for ALL files)
- 12 cmd.Wait() deadlocks fixed with channel-based context handling
- TUI timeout bugs: 60s->10min for safety checks, 15s->60s for DB listing
- diagnose.go timeouts: 60s->5min for tar/pg_restore operations
- Panic recovery added to parallel backup/restore goroutines
- Variable shadowing fix in restore/engine.go
These bugs caused pg_dump backups to fail through TUI for months.
This is a critical bugfix release addressing multiple hardcoded temporary directory paths
that prevented proper use of the WorkDir configuration option.
PROBLEM:
Users configuring WorkDir (e.g., /u01/dba/tmp) for systems with small root filesystems
still experienced failures because critical operations hardcoded /tmp instead of respecting
the configured WorkDir. This made the WorkDir option essentially non-functional.
FIXED LOCATIONS:
1. internal/restore/engine.go:632 - CRITICAL: Used BackupDir instead of WorkDir for extraction
2. cmd/restore.go:354,834 - CLI restore/diagnose commands ignored WorkDir
3. cmd/migrate.go:208,347 - Migration commands hardcoded /tmp
4. internal/migrate/engine.go:120 - Migration engine ignored WorkDir
5. internal/config/config.go:224 - SwapFilePath hardcoded /tmp
6. internal/config/config.go:519 - Backup directory fallback hardcoded /tmp
7. internal/tui/restore_exec.go:161 - Debug logs hardcoded /tmp
8. internal/tui/settings.go:805 - Directory browser default hardcoded /tmp
9. internal/tui/restore_preview.go:474 - Display message hardcoded /tmp
NEW FEATURES:
- Added Config.GetEffectiveWorkDir() helper method
- WorkDir now respects WORK_DIR environment variable
- All temp operations now consistently use configured WorkDir with /tmp fallback
IMPACT:
- Restores on systems with small root disks now work properly with WorkDir configured
- Admins can control disk space usage for all temporary operations
- Debug logs, extraction dirs, swap files all respect WorkDir setting
Version: 3.42.1 (Critical Fix Release)
PROBLEM: Users could not interrupt backup or restore operations through
the TUI interface. Pressing Ctrl+C or ESC did nothing during execution.
ROOT CAUSE:
- BackupExecutionModel ignored ALL key presses while running (only handled when done)
- RestoreExecutionModel returned tea.Quit but didn't cancel the context
- The operation goroutine kept running in the background with its own context
FIX:
- Added cancel context.CancelFunc to both execution models
- Create child context with WithCancel in New*Execution constructors
- Handle ctrl+c and esc during execution to call cancel()
- Show 'Cancelling...' status while waiting for graceful shutdown
- Show cancel hint in View: 'Press Ctrl+C or ESC to cancel'
The fix works because:
- exec.CommandContext(ctx) will SIGKILL the subprocess when ctx is cancelled
- pg_dump, pg_restore, psql, mysql all get terminated properly
- User sees immediate feedback that cancellation is in progress
- Added WorkDir to Config for custom temp directory
- TUI Settings: new 'Work Directory' option to set alternative temp location
- Restore Preview: press 'w' to toggle work directory (uses backup dir as default)
- Diagnose View: now uses configured WorkDir for cluster extraction
- Config persistence: WorkDir saved to .dbbackup.conf
This fixes diagnosis/restore failures when /tmp is too small for large archives.
Use cases: servers with limited /tmp, 70GB+ archives needing 280GB+ extraction space.
- Added 'Diagnose Backup File' as menu option in TUI
- Archive browser now supports 'diagnose' mode
- Allows users to run deep diagnosis on backups before restore
- Helps identify truncation/corruption issues in large backups
Features:
- restore diagnose command for backup file analysis
- Deep COPY block verification for truncated dump detection
- PGDMP signature and gzip integrity validation
- Detailed error reports with --save-debug-log flag
- Ring buffer stderr capture (prevents OOM on 2M+ errors)
- Error classification with actionable recommendations
TUI Enhancements:
- Automatic dump validity safety check before restore
- Press 'd' in archive browser to diagnose backups
- Press 'd' in restore preview for debug log toggle
- Debug logs saved to /tmp on failure when enabled
Documentation:
- Updated README with diagnose command and examples
- Updated CHANGELOG with full feature list
- Updated restore preview screenshots
- Fix format detection to read database_type from .meta.json metadata file
- Add ensureMySQLDatabaseExists() for MySQL/MariaDB database creation
- Route database creation to correct implementation based on db type
- Add TUI auto-forward in auto-confirm mode (no input required for debugging)
- All TUI components now exit automatically when --auto-confirm is set
- Fix status view to skip loading in auto-confirm mode
- Add WithField and WithFields methods to NullLogger to implement Logger interface
- Change MenuModel to use pointer receivers to avoid copying sync.Once
- Add .golangci.yml with minimal linters (govet, ineffassign)
- Run gofmt -s and goimports on all files to fix formatting
- Disable fieldalignment and copylocks checks in govet
Added cloud storage configuration to TUI settings interface:
- Cloud Storage Enabled toggle
- Cloud Provider selector (S3, MinIO, B2, Azure, GCS)
- Bucket/Container name configuration
- Region configuration
- Access/Secret key management with masking
- Auto-upload toggle
Users can now configure cloud backends directly from the
interactive menu instead of only via command-line flags.
Cloud auto-upload works when CloudEnabled + CloudAutoUpload
are enabled - backups automatically upload after creation.
- Implement context cleanup with sync.Once and io.Closer interface
- Add regex-based error classification for robust error handling
- Create ProcessManager with thread-safe process tracking
- Add disk space caching with 30s TTL for performance
- Implement metrics collection with structured logging
- Add config persistence (.dbbackup.conf) for directory-local settings
- Auto-save/auto-load configuration with --no-config and --no-save-config flags
- Successfully tested with 42GB d7030 database (35K large objects, 36min backup)
- All cross-platform builds working (9/10 platforms)
- Created internal/cleanup package for orphaned process management
- KillOrphanedProcesses(): Finds and kills pg_dump, pg_restore, gzip, pigz
- killProcessGroup(): Kills entire process groups (handles pipelines)
- Pass parent context through all TUI operations (backup/restore inherit cancellation)
- Menu cancel now kills all child processes before exit
- Fixed context chain: menu.ctx → backup/restore operations
- No more zombie processes when user quits TUI mid-operation
Context chain:
- signal.NotifyContext in main.go → menu.ctx
- menu.ctx → backup_exec.ctx, restore_exec.ctx
- Child contexts inherit cancellation via context.WithTimeout(parentCtx)
- All exec.CommandContext use proper parent context
Prevents: Orphaned pg_dump/pg_restore eating CPU/disk after TUI quit
- Removed workloadOption struct and workload-related fields from MenuModel
- Removed workload initialization and cursor tracking
- Removed keyboard handlers (Shift+←/→, 'w') for workload switching
- Removed workload selector display from main menu view
- Removed applyWorkloadSelection() function
- CPU workload type now only configurable via Configuration Settings
- Cleaner main menu focused on actions rather than configuration