CRITICAL FIXES:
- Encryption detection false positive (IsBackupEncrypted returned true for ALL files)
- 12 cmd.Wait() deadlocks fixed with channel-based context handling
- TUI timeout bugs: 60s->10min for safety checks, 15s->60s for DB listing
- diagnose.go timeouts: 60s->5min for tar/pg_restore operations
- Panic recovery added to parallel backup/restore goroutines
- Variable shadowing fix in restore/engine.go
These bugs caused pg_dump backups to fail through TUI for months.
Removed duplicate CI triggers:
- Before: Ran on push to branches AND on tag push (doubled)
- After: Runs on push to branches OR when release is published
This prevents wasted CI resources and confusion.
This is a critical bugfix release addressing multiple hardcoded temporary directory paths
that prevented proper use of the WorkDir configuration option.
PROBLEM:
Users configuring WorkDir (e.g., /u01/dba/tmp) for systems with small root filesystems
still experienced failures because critical operations hardcoded /tmp instead of respecting
the configured WorkDir. This made the WorkDir option essentially non-functional.
FIXED LOCATIONS:
1. internal/restore/engine.go:632 - CRITICAL: Used BackupDir instead of WorkDir for extraction
2. cmd/restore.go:354,834 - CLI restore/diagnose commands ignored WorkDir
3. cmd/migrate.go:208,347 - Migration commands hardcoded /tmp
4. internal/migrate/engine.go:120 - Migration engine ignored WorkDir
5. internal/config/config.go:224 - SwapFilePath hardcoded /tmp
6. internal/config/config.go:519 - Backup directory fallback hardcoded /tmp
7. internal/tui/restore_exec.go:161 - Debug logs hardcoded /tmp
8. internal/tui/settings.go:805 - Directory browser default hardcoded /tmp
9. internal/tui/restore_preview.go:474 - Display message hardcoded /tmp
NEW FEATURES:
- Added Config.GetEffectiveWorkDir() helper method
- WorkDir now respects WORK_DIR environment variable
- All temp operations now consistently use configured WorkDir with /tmp fallback
IMPACT:
- Restores on systems with small root disks now work properly with WorkDir configured
- Admins can control disk space usage for all temporary operations
- Debug logs, extraction dirs, swap files all respect WorkDir setting
Version: 3.42.1 (Critical Fix Release)
- Remove invalid --config flag from exporter service template
- Change ReadOnlyPaths to ReadWritePaths for catalog access
- Add copyBinary() to install binary to /usr/local/bin (ProtectHome compat)
- Fix exporter status detection using direct systemctl check
- Add os/exec import for status check
- Add install/uninstall and metrics commands to command table
- Add Systemd Integration section with install examples
- Add Prometheus Metrics section with textfile and HTTP exporter docs
- Update Features list with systemd and metrics highlights
- Rebuild all platform binaries
Systemd Integration:
- New 'dbbackup install' command creates service/timer units
- Supports single-database and cluster backup modes
- Automatic dbbackup user/group creation with proper permissions
- Hardened service units with security features
- Template units with configurable OnCalendar schedules
- 'dbbackup uninstall' for clean removal
Prometheus Metrics:
- 'dbbackup metrics export' for textfile collector format
- 'dbbackup metrics serve' runs HTTP exporter on port 9399
- Metrics: last_success_timestamp, rpo_seconds, backup_total, etc.
- Integration with node_exporter textfile collector
- --with-metrics flag during install
Technical:
- Systemd templates embedded with //go:embed
- Service units include ReadWritePaths, OOMScoreAdjust
- Metrics exporter caches with 30s TTL
- Graceful shutdown on SIGTERM
PROBLEM: Users could not interrupt backup or restore operations through
the TUI interface. Pressing Ctrl+C or ESC did nothing during execution.
ROOT CAUSE:
- BackupExecutionModel ignored ALL key presses while running (only handled when done)
- RestoreExecutionModel returned tea.Quit but didn't cancel the context
- The operation goroutine kept running in the background with its own context
FIX:
- Added cancel context.CancelFunc to both execution models
- Create child context with WithCancel in New*Execution constructors
- Handle ctrl+c and esc during execution to call cancel()
- Show 'Cancelling...' status while waiting for graceful shutdown
- Show cancel hint in View: 'Press Ctrl+C or ESC to cancel'
The fix works because:
- exec.CommandContext(ctx) will SIGKILL the subprocess when ctx is cancelled
- pg_dump, pg_restore, psql, mysql all get terminated properly
- User sees immediate feedback that cancellation is in progress
- Add identifier validation for database names in PostgreSQL and MySQL
- validateIdentifier() rejects names with invalid characters
- quoteIdentifier() safely quotes identifiers with proper escaping
- Max length: 63 chars (PostgreSQL), 64 chars (MySQL)
- Only allows alphanumeric + underscores, must start with letter/underscore
- Fix data race in notification manager
- Multiple goroutines were appending to shared error slice
- Added errMu sync.Mutex to protect concurrent error collection
- Security improvements prevent:
- SQL injection via malicious database names
- CREATE DATABASE `foo`; DROP DATABASE production; --`
- Race conditions causing lost or corrupted error data
- Validate SQL dump files BEFORE attempting restore
- Detect unterminated COPY blocks that cause 'syntax error' failures
- Cluster restore now pre-validates ALL dumps upfront (fail-fast)
- Saves hours of wasted restore time on corrupted backups
The truncated resydb.sql.gz was causing 49min restore attempts
that failed with 2.6M errors. Now fails immediately with clear
error message showing which table's COPY block was truncated.
- Added WorkDir to Config for custom temp directory
- TUI Settings: new 'Work Directory' option to set alternative temp location
- Restore Preview: press 'w' to toggle work directory (uses backup dir as default)
- Diagnose View: now uses configured WorkDir for cluster extraction
- Config persistence: WorkDir saved to .dbbackup.conf
This fixes diagnosis/restore failures when /tmp is too small for large archives.
Use cases: servers with limited /tmp, 70GB+ archives needing 280GB+ extraction space.
- Better error messages when tar extraction fails
- Detect truncated/corrupted archives without full extraction
- Show archive contents even when extraction fails
- Provide helpful hints for disk space and corruption issues
- Exit status 2 from tar now shows detailed diagnostics
- Added 'Diagnose Backup File' as menu option in TUI
- Archive browser now supports 'diagnose' mode
- Allows users to run deep diagnosis on backups before restore
- Helps identify truncation/corruption issues in large backups
Features:
- restore diagnose command for backup file analysis
- Deep COPY block verification for truncated dump detection
- PGDMP signature and gzip integrity validation
- Detailed error reports with --save-debug-log flag
- Ring buffer stderr capture (prevents OOM on 2M+ errors)
- Error classification with actionable recommendations
TUI Enhancements:
- Automatic dump validity safety check before restore
- Press 'd' in archive browser to diagnose backups
- Press 'd' in restore preview for debug log toggle
- Debug logs saved to /tmp on failure when enabled
Documentation:
- Updated README with diagnose command and examples
- Updated CHANGELOG with full feature list
- Updated restore preview screenshots