Compare commits

..

8 Commits

Author SHA1 Message Date
bfe99e959c feat(tui): unified cluster backup progress display
All checks were successful
CI/CD / Test (push) Successful in 1m17s
CI/CD / Lint (push) Successful in 1m27s
CI/CD / Build & Release (push) Successful in 3m31s
- Add combined overall progress bar showing all phases (0-100%)
- Phase 1/3: Backing up Globals (0-15% of overall)
- Phase 2/3: Backing up Databases (15-90% of overall)
- Phase 3/3: Compressing Archive (90-100% of overall)
- Show current database name during backup
- Phase-aware progress tracking with overallPhase, phaseDesc
- Dual progress bars: overall + database count
- Consistent with cluster restore progress display

v3.42.49
2026-01-16 15:37:04 +01:00
780beaadfb feat(tui): unified cluster restore progress display
All checks were successful
CI/CD / Test (push) Successful in 1m19s
CI/CD / Lint (push) Successful in 1m26s
CI/CD / Build & Release (push) Successful in 3m27s
- Add combined overall progress bar showing all phases (0-100%)
- Phase 1/3: Extracting Archive (0-60% of overall)
- Phase 2/3: Restoring Globals (60-65% of overall)
- Phase 3/3: Restoring Databases (65-100% of overall)
- Show current database name during restore
- Phase-aware progress tracking with overallPhase, currentDB, extractionDone
- Dual progress bars: overall + phase-specific (bytes or db count)
- Better visual feedback during entire cluster restore operation

v3.42.48
2026-01-16 15:32:24 +01:00
838c5b8c15 Fix: PostgreSQL expert review - cluster backup/restore improvements
All checks were successful
CI/CD / Test (push) Successful in 1m16s
CI/CD / Lint (push) Successful in 1m27s
CI/CD / Build & Release (push) Successful in 3m14s
Critical PostgreSQL-specific fixes identified by database expert review:

1. **Port always passed for localhost** (pg_dump, pg_restore, pg_dumpall, psql)
   - Previously, port was only passed for non-localhost connections
   - If user has PostgreSQL on non-standard port (e.g., 5433), commands
     would connect to wrong instance or fail
   - Now always passes -p PORT to all PostgreSQL tools

2. **CREATE DATABASE with encoding/locale preservation**
   - Now creates databases with explicit ENCODING 'UTF8'
   - Detects server's LC_COLLATE and uses it for new databases
   - Prevents encoding mismatch errors during restore
   - Falls back to simple CREATE if encoding fails (older PG versions)

3. **DROP DATABASE WITH (FORCE) for PostgreSQL 13+**
   - Uses new WITH (FORCE) option to atomically terminate connections
   - Prevents race condition where new connections are established
   - Falls back to standard DROP for PostgreSQL < 13
   - Also revokes CONNECT privilege before drop attempt

4. **Improved globals restore error handling**
   - Distinguishes between FATAL errors (real problems) and regular
     ERROR messages (like 'role already exists' which is expected)
   - Only fails on FATAL errors or psql command failures
   - Logs error count summary for visibility

5. **Better error classification in restore logs**
   - Separate log levels for FATAL vs ERROR
   - Debug-level logging for 'already exists' errors (expected)
   - Error count tracking to avoid log spam

These fixes improve reliability for enterprise PostgreSQL deployments
with non-standard configurations and existing data.
2026-01-16 14:36:03 +01:00
9d95a193db Fix: Enterprise cluster restore (postgres user via su)
All checks were successful
CI/CD / Test (push) Successful in 1m16s
CI/CD / Lint (push) Successful in 1m29s
CI/CD / Build & Release (push) Successful in 3m13s
Critical fixes for enterprise environments where dbbackup runs as
postgres user via 'su postgres' without sudo access:

1. canRestartPostgreSQL(): New function that detects if we can restart
   PostgreSQL. Returns false immediately if running as postgres user
   without sudo access, avoiding wasted time and potential hangs.

2. tryRestartPostgreSQL(): Now calls canRestartPostgreSQL() first to
   skip restart attempts in restricted environments.

3. Changed restart warning from ERROR to WARN level - it's expected
   behavior in enterprise environments, not an error.

4. Context cancellation check: Goroutines now check ctx.Err() before
   starting and properly count cancelled databases as failures.

5. Goroutine accounting: After wg.Wait(), verify all databases were
   accounted for (success + fail = total). Catches goroutine crashes
   or deadlocks.

6. Port argument fix: Always pass -p port to psql for localhost
   restores, fixing non-standard port configurations.

This should fix the issue where cluster restore showed success but
0 databases were actually restored when running on enterprise systems.
2026-01-16 14:17:04 +01:00
3201f0fb6a Fix: Critical bug - cluster restore showing success with 0 databases restored
All checks were successful
CI/CD / Test (push) Successful in 1m14s
CI/CD / Lint (push) Successful in 1m25s
CI/CD / Build & Release (push) Successful in 3m23s
CRITICAL FIXES:
- Add check for successCount == 0 to properly fail when no databases restored
- Fix tryRestartPostgreSQL to use non-interactive sudo (-n flag)
- Add 10-second timeout per restart attempt to prevent blocking
- Try pg_ctl directly for postgres user (no sudo needed)
- Set stdin to nil to prevent sudo from waiting for password input

This fixes the issue where cluster restore showed success but no databases
were actually restored due to sudo blocking on password prompts.
2026-01-16 14:03:02 +01:00
62ddc57fb7 Fix: Remove sudo usage from auth detection to avoid password prompts
All checks were successful
CI/CD / Test (push) Successful in 1m16s
CI/CD / Lint (push) Successful in 1m27s
CI/CD / Build & Release (push) Successful in 3m16s
- Remove sudo cat attempt for reading pg_hba.conf
- Prevents password prompts when running as postgres via 'su postgres'
- Auth detection now relies on connection attempts when file is unreadable
- Fixes issue where returning to menu after restore triggers sudo prompt
2026-01-16 13:52:41 +01:00
510175ff04 TUI: Enhance completion/result screens for backup and restore
All checks were successful
CI/CD / Test (push) Successful in 1m17s
CI/CD / Lint (push) Successful in 1m28s
CI/CD / Build & Release (push) Successful in 3m12s
- Add box-style headers for success/failure states
- Display comprehensive summary with archive info, type, database count
- Show timing section with total time, throughput, and average per-DB stats
- Use consistent styling and formatting across all result views
- Improve visual hierarchy with section separators
2026-01-16 13:37:58 +01:00
a85ad0c88c TUI: Add timing and ETA tracking to cluster restore progress
All checks were successful
CI/CD / Test (push) Successful in 1m17s
CI/CD / Lint (push) Successful in 1m28s
CI/CD / Build & Release (push) Successful in 3m21s
- Add DatabaseProgressWithTimingCallback for timing-aware progress reporting
- Track elapsed time and average duration per database during restore phase
- Display ETA based on completed database restore times
- Show restore phase elapsed time in progress bar
- Enhance cluster restore progress bar with [elapsed / ETA: remaining] format
2026-01-16 09:42:05 +01:00
9 changed files with 777 additions and 146 deletions

View File

@@ -5,6 +5,53 @@ All notable changes to dbbackup will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [3.42.49] - 2026-01-16 "Unified Cluster Backup Progress"
### Added - Unified Progress Display for Cluster Backup
- **Combined overall progress bar** for cluster backup showing all phases:
- Phase 1/3: Backing up Globals (0-15% of overall)
- Phase 2/3: Backing up Databases (15-90% of overall)
- Phase 3/3: Compressing Archive (90-100% of overall)
- **Current database indicator** - Shows which database is currently being backed up
- **Phase-aware progress tracking** - New fields in backup progress state:
- `overallPhase` - Current phase (1=globals, 2=databases, 3=compressing)
- `phaseDesc` - Human-readable phase description
- **Dual progress bars** for cluster backup:
- Overall progress bar showing combined operation progress
- Database count progress bar showing individual database progress
### Changed
- Cluster backup TUI now shows unified progress display matching restore
- Progress callbacks now include phase information
- Better visual feedback during entire cluster backup operation
## [3.42.48] - 2026-01-15 "Unified Cluster Restore Progress"
### Added - Unified Progress Display for Cluster Restore
- **Combined overall progress bar** showing progress across all restore phases:
- Phase 1/3: Extracting Archive (0-60% of overall)
- Phase 2/3: Restoring Globals (60-65% of overall)
- Phase 3/3: Restoring Databases (65-100% of overall)
- **Current database indicator** - Shows which database is currently being restored
- **Phase-aware progress tracking** - New fields in progress state:
- `overallPhase` - Current phase (1=extraction, 2=globals, 3=databases)
- `currentDB` - Name of database currently being restored
- `extractionDone` - Boolean flag for phase transition
- **Dual progress bars** for cluster restore:
- Overall progress bar showing combined operation progress
- Phase-specific progress bar (extraction bytes or database count)
### Changed
- Cluster restore TUI now shows unified progress display
- Progress callbacks now set phase and current database information
- Extraction completion triggers automatic transition to globals phase
- Database restore phase shows current database name with spinner
### Improved
- Better visual feedback during entire cluster restore operation
- Clear phase indicators help users understand restore progress
- Overall progress percentage gives better time estimates
## [3.42.35] - 2026-01-15 "TUI Detailed Progress"
### Added - Enhanced TUI Progress Display

View File

@@ -3,9 +3,9 @@
This directory contains pre-compiled binaries for the DB Backup Tool across multiple platforms and architectures.
## Build Information
- **Version**: 3.42.34
- **Build Time**: 2026-01-15_14:33:12_UTC
- **Git Commit**: 09a9177
- **Version**: 3.42.48
- **Build Time**: 2026-01-16_14:32:45_UTC
- **Git Commit**: 780beaa
## Recent Updates (v1.1.0)
- ✅ Fixed TUI progress display with line-by-line output

View File

@@ -84,20 +84,14 @@ func findHbaFileViaPostgres() string {
// parsePgHbaConf parses pg_hba.conf and returns the authentication method
func parsePgHbaConf(path string, user string) AuthMethod {
// Try with sudo if we can't read directly
// Try to read the file directly - do NOT use sudo as it triggers password prompts
// If we can't read pg_hba.conf, we'll rely on connection attempts to determine auth
file, err := os.Open(path)
if err != nil {
// Try with sudo (with timeout)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
cmd := exec.CommandContext(ctx, "sudo", "cat", path)
output, err := cmd.Output()
if err != nil {
// If we can't read the file, return unknown and let the connection determine auth
// This avoids sudo password prompts when running as postgres via su
return AuthUnknown
}
return parseHbaContent(string(output), user)
}
defer file.Close()
scanner := bufio.NewScanner(file)

View File

@@ -937,11 +937,15 @@ func (e *Engine) createSampleBackup(ctx context.Context, databaseName, outputFil
func (e *Engine) backupGlobals(ctx context.Context, tempDir string) error {
globalsFile := filepath.Join(tempDir, "globals.sql")
cmd := exec.CommandContext(ctx, "pg_dumpall", "--globals-only")
if e.cfg.Host != "localhost" {
cmd.Args = append(cmd.Args, "-h", e.cfg.Host, "-p", fmt.Sprintf("%d", e.cfg.Port))
// CRITICAL: Always pass port even for localhost - user may have non-standard port
cmd := exec.CommandContext(ctx, "pg_dumpall", "--globals-only",
"-p", fmt.Sprintf("%d", e.cfg.Port),
"-U", e.cfg.User)
// Only add -h flag for non-localhost to use Unix socket for peer auth
if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
cmd.Args = append([]string{cmd.Args[0], "-h", e.cfg.Host}, cmd.Args[1:]...)
}
cmd.Args = append(cmd.Args, "-U", e.cfg.User)
cmd.Env = os.Environ()
if e.cfg.Password != "" {

View File

@@ -316,11 +316,12 @@ func (p *PostgreSQL) BuildBackupCommand(database, outputFile string, options Bac
cmd := []string{"pg_dump"}
// Connection parameters
if p.cfg.Host != "localhost" {
// CRITICAL: Always pass port even for localhost - user may have non-standard port
if p.cfg.Host != "localhost" && p.cfg.Host != "127.0.0.1" && p.cfg.Host != "" {
cmd = append(cmd, "-h", p.cfg.Host)
cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
cmd = append(cmd, "--no-password")
}
cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
cmd = append(cmd, "-U", p.cfg.User)
// Format and compression
@@ -380,11 +381,12 @@ func (p *PostgreSQL) BuildRestoreCommand(database, inputFile string, options Res
cmd := []string{"pg_restore"}
// Connection parameters
if p.cfg.Host != "localhost" {
// CRITICAL: Always pass port even for localhost - user may have non-standard port
if p.cfg.Host != "localhost" && p.cfg.Host != "127.0.0.1" && p.cfg.Host != "" {
cmd = append(cmd, "-h", p.cfg.Host)
cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
cmd = append(cmd, "--no-password")
}
cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
cmd = append(cmd, "-U", p.cfg.User)
// Parallel jobs (incompatible with --single-transaction per PostgreSQL docs)

View File

@@ -34,6 +34,10 @@ type ProgressCallback func(current, total int64, description string)
// DatabaseProgressCallback is called with database count progress during cluster restore
type DatabaseProgressCallback func(done, total int, dbName string)
// DatabaseProgressWithTimingCallback is called with database progress including timing info
// Parameters: done count, total count, database name, elapsed time for current restore phase, avg duration per DB
type DatabaseProgressWithTimingCallback func(done, total int, dbName string, phaseElapsed, avgPerDB time.Duration)
// Engine handles database restore operations
type Engine struct {
cfg *config.Config
@@ -47,6 +51,7 @@ type Engine struct {
// TUI progress callback for detailed progress reporting
progressCallback ProgressCallback
dbProgressCallback DatabaseProgressCallback
dbProgressTimingCallback DatabaseProgressWithTimingCallback
}
// New creates a new restore engine
@@ -112,6 +117,11 @@ func (e *Engine) SetDatabaseProgressCallback(cb DatabaseProgressCallback) {
e.dbProgressCallback = cb
}
// SetDatabaseProgressWithTimingCallback sets a callback for database progress with timing info
func (e *Engine) SetDatabaseProgressWithTimingCallback(cb DatabaseProgressWithTimingCallback) {
e.dbProgressTimingCallback = cb
}
// reportProgress safely calls the progress callback if set
func (e *Engine) reportProgress(current, total int64, description string) {
if e.progressCallback != nil {
@@ -126,6 +136,13 @@ func (e *Engine) reportDatabaseProgress(done, total int, dbName string) {
}
}
// reportDatabaseProgressWithTiming safely calls the timing-aware callback if set
func (e *Engine) reportDatabaseProgressWithTiming(done, total int, dbName string, phaseElapsed, avgPerDB time.Duration) {
if e.dbProgressTimingCallback != nil {
e.dbProgressTimingCallback(done, total, dbName, phaseElapsed, avgPerDB)
}
}
// loggerAdapter adapts our logger to the progress.Logger interface
type loggerAdapter struct {
logger logger.Logger
@@ -425,16 +442,18 @@ func (e *Engine) restorePostgreSQLSQL(ctx context.Context, archivePath, targetDB
var cmd []string
// For localhost, omit -h to use Unix socket (avoids Ident auth issues)
// But always include -p for port (in case of non-standard port)
hostArg := ""
portArg := fmt.Sprintf("-p %d", e.cfg.Port)
if e.cfg.Host != "localhost" && e.cfg.Host != "" {
hostArg = fmt.Sprintf("-h %s -p %d", e.cfg.Host, e.cfg.Port)
hostArg = fmt.Sprintf("-h %s", e.cfg.Host)
}
if compressed {
// Use ON_ERROR_STOP=1 to fail fast on first error (prevents millions of errors on truncated dumps)
psqlCmd := fmt.Sprintf("psql -U %s -d %s -v ON_ERROR_STOP=1", e.cfg.User, targetDB)
psqlCmd := fmt.Sprintf("psql %s -U %s -d %s -v ON_ERROR_STOP=1", portArg, e.cfg.User, targetDB)
if hostArg != "" {
psqlCmd = fmt.Sprintf("psql %s -U %s -d %s -v ON_ERROR_STOP=1", hostArg, e.cfg.User, targetDB)
psqlCmd = fmt.Sprintf("psql %s %s -U %s -d %s -v ON_ERROR_STOP=1", hostArg, portArg, e.cfg.User, targetDB)
}
// Set PGPASSWORD in the bash command for password-less auth
cmd = []string{
@@ -455,6 +474,7 @@ func (e *Engine) restorePostgreSQLSQL(ctx context.Context, archivePath, targetDB
} else {
cmd = []string{
"psql",
"-p", fmt.Sprintf("%d", e.cfg.Port),
"-U", e.cfg.User,
"-d", targetDB,
"-v", "ON_ERROR_STOP=1",
@@ -1037,6 +1057,11 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
var successCount, failCount int32
var mu sync.Mutex // Protect shared resources (progress, logger)
// Timing tracking for restore phase progress
restorePhaseStart := time.Now()
var completedDBTimes []time.Duration // Track duration for each completed DB restore
var completedDBTimesMu sync.Mutex
// Create semaphore to limit concurrency
semaphore := make(chan struct{}, parallelism)
var wg sync.WaitGroup
@@ -1062,6 +1087,19 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
}
}()
// Check for context cancellation before starting
if ctx.Err() != nil {
e.log.Warn("Context cancelled - skipping database restore", "file", filename)
atomic.AddInt32(&failCount, 1)
restoreErrorsMu.Lock()
restoreErrors = multierror.Append(restoreErrors, fmt.Errorf("%s: restore skipped (context cancelled)", strings.TrimSuffix(strings.TrimSuffix(filename, ".dump"), ".sql.gz")))
restoreErrorsMu.Unlock()
return
}
// Track timing for this database restore
dbRestoreStart := time.Now()
// Update estimator progress (thread-safe)
mu.Lock()
estimator.UpdateProgress(idx)
@@ -1074,12 +1112,26 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
dbProgress := 15 + int(float64(idx)/float64(totalDBs)*85.0)
// Calculate average time per DB and report progress with timing
completedDBTimesMu.Lock()
var avgPerDB time.Duration
if len(completedDBTimes) > 0 {
var totalDuration time.Duration
for _, d := range completedDBTimes {
totalDuration += d
}
avgPerDB = totalDuration / time.Duration(len(completedDBTimes))
}
phaseElapsed := time.Since(restorePhaseStart)
completedDBTimesMu.Unlock()
mu.Lock()
statusMsg := fmt.Sprintf("Restoring database %s (%d/%d)", dbName, idx+1, totalDBs)
e.progress.Update(statusMsg)
e.log.Info("Restoring database", "name", dbName, "file", dumpFile, "progress", dbProgress)
// Report database progress for TUI
// Report database progress for TUI (both callbacks)
e.reportDatabaseProgress(idx, totalDBs, dbName)
e.reportDatabaseProgressWithTiming(idx, totalDBs, dbName, phaseElapsed, avgPerDB)
mu.Unlock()
// STEP 1: Drop existing database completely (clean slate)
@@ -1144,6 +1196,12 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
return
}
// Track completed database restore duration for ETA calculation
dbRestoreDuration := time.Since(dbRestoreStart)
completedDBTimesMu.Lock()
completedDBTimes = append(completedDBTimes, dbRestoreDuration)
completedDBTimesMu.Unlock()
atomic.AddInt32(&successCount, 1)
}(dbIndex, entry.Name())
@@ -1156,6 +1214,35 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
successCountFinal := int(atomic.LoadInt32(&successCount))
failCountFinal := int(atomic.LoadInt32(&failCount))
// SANITY CHECK: Verify all databases were accounted for
// This catches any goroutine that exited without updating counters
accountedFor := successCountFinal + failCountFinal
if accountedFor != totalDBs {
missingCount := totalDBs - accountedFor
e.log.Error("INTERNAL ERROR: Some database restore goroutines did not report status",
"expected", totalDBs,
"success", successCountFinal,
"failed", failCountFinal,
"unaccounted", missingCount)
// Treat unaccounted databases as failures
failCountFinal += missingCount
restoreErrorsMu.Lock()
restoreErrors = multierror.Append(restoreErrors, fmt.Errorf("%d database(s) did not complete (possible goroutine crash or deadlock)", missingCount))
restoreErrorsMu.Unlock()
}
// CRITICAL: Check if no databases were restored at all
if successCountFinal == 0 {
e.progress.Fail(fmt.Sprintf("Cluster restore FAILED: 0 of %d databases restored", totalDBs))
operation.Fail("No databases were restored")
if failCountFinal > 0 && restoreErrors != nil {
return fmt.Errorf("cluster restore failed: all %d database(s) failed:\n%s", failCountFinal, restoreErrors.Error())
}
return fmt.Errorf("cluster restore failed: no databases were restored (0 of %d total). Check PostgreSQL logs for details", totalDBs)
}
if failCountFinal > 0 {
// Format multi-error with detailed output
restoreErrors.ErrorFormat = func(errs []error) string {
@@ -1375,6 +1462,8 @@ func (e *Engine) extractArchiveShell(ctx context.Context, archivePath, destDir s
}
// restoreGlobals restores global objects (roles, tablespaces)
// Note: psql returns 0 even when some statements fail (e.g., role already exists)
// We track errors but only fail on FATAL errors that would prevent restore
func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
args := []string{
"-p", fmt.Sprintf("%d", e.cfg.Port),
@@ -1404,6 +1493,8 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
// Read stderr in chunks in goroutine
var lastError string
var errorCount int
var fatalError bool
stderrDone := make(chan struct{})
go func() {
defer close(stderrDone)
@@ -1412,9 +1503,23 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
n, err := stderr.Read(buf)
if n > 0 {
chunk := string(buf[:n])
if strings.Contains(chunk, "ERROR") || strings.Contains(chunk, "FATAL") {
// Track different error types
if strings.Contains(chunk, "FATAL") {
fatalError = true
lastError = chunk
e.log.Warn("Globals restore stderr", "output", chunk)
e.log.Error("Globals restore FATAL error", "output", chunk)
} else if strings.Contains(chunk, "ERROR") {
errorCount++
lastError = chunk
// Only log first few errors to avoid spam
if errorCount <= 5 {
// Check if it's an ignorable "already exists" error
if strings.Contains(chunk, "already exists") {
e.log.Debug("Globals restore: object already exists (expected)", "output", chunk)
} else {
e.log.Warn("Globals restore error", "output", chunk)
}
}
}
}
if err != nil {
@@ -1442,10 +1547,23 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
<-stderrDone
// Only fail on actual command errors or FATAL PostgreSQL errors
// Regular ERROR messages (like "role already exists") are expected
if cmdErr != nil {
return fmt.Errorf("failed to restore globals: %w (last error: %s)", cmdErr, lastError)
}
// If we had FATAL errors, those are real problems
if fatalError {
return fmt.Errorf("globals restore had FATAL error: %s", lastError)
}
// Log summary if there were errors (but don't fail)
if errorCount > 0 {
e.log.Info("Globals restore completed with some errors (usually 'already exists' - expected)",
"error_count", errorCount)
}
return nil
}
@@ -1513,6 +1631,7 @@ func (e *Engine) terminateConnections(ctx context.Context, dbName string) error
}
// dropDatabaseIfExists drops a database completely (clean slate)
// Uses PostgreSQL 13+ WITH (FORCE) option to forcefully drop even with active connections
func (e *Engine) dropDatabaseIfExists(ctx context.Context, dbName string) error {
// First terminate all connections
if err := e.terminateConnections(ctx, dbName); err != nil {
@@ -1522,28 +1641,69 @@ func (e *Engine) dropDatabaseIfExists(ctx context.Context, dbName string) error
// Wait a moment for connections to terminate
time.Sleep(500 * time.Millisecond)
// Drop the database
// Try to revoke new connections (prevents race condition)
// This only works if we have the privilege to do so
revokeArgs := []string{
"-p", fmt.Sprintf("%d", e.cfg.Port),
"-U", e.cfg.User,
"-d", "postgres",
"-c", fmt.Sprintf("REVOKE CONNECT ON DATABASE \"%s\" FROM PUBLIC", dbName),
}
if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
revokeArgs = append([]string{"-h", e.cfg.Host}, revokeArgs...)
}
revokeCmd := exec.CommandContext(ctx, "psql", revokeArgs...)
revokeCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
revokeCmd.Run() // Ignore errors - database might not exist
// Terminate connections again after revoking connect privilege
e.terminateConnections(ctx, dbName)
time.Sleep(200 * time.Millisecond)
// Try DROP DATABASE WITH (FORCE) first (PostgreSQL 13+)
// This forcefully terminates connections and drops the database atomically
forceArgs := []string{
"-p", fmt.Sprintf("%d", e.cfg.Port),
"-U", e.cfg.User,
"-d", "postgres",
"-c", fmt.Sprintf("DROP DATABASE IF EXISTS \"%s\" WITH (FORCE)", dbName),
}
if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
forceArgs = append([]string{"-h", e.cfg.Host}, forceArgs...)
}
forceCmd := exec.CommandContext(ctx, "psql", forceArgs...)
forceCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
output, err := forceCmd.CombinedOutput()
if err == nil {
e.log.Info("Dropped existing database (with FORCE)", "name", dbName)
return nil
}
// If FORCE option failed (PostgreSQL < 13), try regular drop
if strings.Contains(string(output), "syntax error") || strings.Contains(string(output), "WITH (FORCE)") {
e.log.Debug("WITH (FORCE) not supported, using standard DROP", "name", dbName)
args := []string{
"-p", fmt.Sprintf("%d", e.cfg.Port),
"-U", e.cfg.User,
"-d", "postgres",
"-c", fmt.Sprintf("DROP DATABASE IF EXISTS \"%s\"", dbName),
}
// Only add -h flag if host is not localhost (to use Unix socket for peer auth)
if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
args = append([]string{"-h", e.cfg.Host}, args...)
}
cmd := exec.CommandContext(ctx, "psql", args...)
// Always set PGPASSWORD (empty string is fine for peer/ident auth)
cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
output, err := cmd.CombinedOutput()
output, err = cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to drop database '%s': %w\nOutput: %s", dbName, err, string(output))
}
} else if err != nil {
return fmt.Errorf("failed to drop database '%s': %w\nOutput: %s", dbName, err, string(output))
}
e.log.Info("Dropped existing database", "name", dbName)
return nil
@@ -1584,12 +1744,14 @@ func (e *Engine) ensureMySQLDatabaseExists(ctx context.Context, dbName string) e
}
// ensurePostgresDatabaseExists checks if a PostgreSQL database exists and creates it if not
// It attempts to extract encoding/locale from the dump file to preserve original settings
func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string) error {
// Skip creation for postgres and template databases - they should already exist
if dbName == "postgres" || dbName == "template0" || dbName == "template1" {
e.log.Info("Skipping create for system database (assume exists)", "name", dbName)
return nil
}
// Build psql command with authentication
buildPsqlCmd := func(ctx context.Context, database, query string) *exec.Cmd {
args := []string{
@@ -1629,14 +1791,31 @@ func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string
// Database doesn't exist, create it
// IMPORTANT: Use template0 to avoid duplicate definition errors from local additions to template1
// Also use UTF8 encoding explicitly as it's the most common and safest choice
// See PostgreSQL docs: https://www.postgresql.org/docs/current/app-pgrestore.html#APP-PGRESTORE-NOTES
e.log.Info("Creating database from template0", "name", dbName)
e.log.Info("Creating database from template0 with UTF8 encoding", "name", dbName)
// Get server's default locale for LC_COLLATE and LC_CTYPE
// This ensures compatibility while using the correct encoding
localeCmd := buildPsqlCmd(ctx, "postgres", "SHOW lc_collate")
localeOutput, _ := localeCmd.CombinedOutput()
serverLocale := strings.TrimSpace(string(localeOutput))
if serverLocale == "" {
serverLocale = "en_US.UTF-8" // Fallback to common default
}
// Build CREATE DATABASE command with encoding and locale
// Using ENCODING 'UTF8' explicitly ensures the dump can be restored
createSQL := fmt.Sprintf(
"CREATE DATABASE \"%s\" WITH TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE '%s' LC_CTYPE '%s'",
dbName, serverLocale, serverLocale,
)
createArgs := []string{
"-p", fmt.Sprintf("%d", e.cfg.Port),
"-U", e.cfg.User,
"-d", "postgres",
"-c", fmt.Sprintf("CREATE DATABASE \"%s\" WITH TEMPLATE template0", dbName),
"-c", createSQL,
}
// Only add -h flag if host is not localhost (to use Unix socket for peer auth)
@@ -1651,10 +1830,28 @@ func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string
output, err = createCmd.CombinedOutput()
if err != nil {
// Log the error and include the psql output in the returned error to aid debugging
// If encoding/locale fails, try simpler CREATE DATABASE
e.log.Warn("Database creation with encoding failed, trying simple create", "name", dbName, "error", err)
simpleArgs := []string{
"-p", fmt.Sprintf("%d", e.cfg.Port),
"-U", e.cfg.User,
"-d", "postgres",
"-c", fmt.Sprintf("CREATE DATABASE \"%s\" WITH TEMPLATE template0", dbName),
}
if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
simpleArgs = append([]string{"-h", e.cfg.Host}, simpleArgs...)
}
simpleCmd := exec.CommandContext(ctx, "psql", simpleArgs...)
simpleCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
output, err = simpleCmd.CombinedOutput()
if err != nil {
e.log.Warn("Database creation failed", "name", dbName, "error", err, "output", string(output))
return fmt.Errorf("failed to create database '%s': %w (output: %s)", dbName, err, strings.TrimSpace(string(output)))
}
}
e.log.Info("Successfully created database from template0", "name", dbName)
return nil
@@ -1993,56 +2190,100 @@ func (e *Engine) boostPostgreSQLSettings(ctx context.Context, lockBoostValue int
// Wait for PostgreSQL to be ready
time.Sleep(3 * time.Second)
} else {
// Cannot restart - warn user loudly
e.log.Error("=" + strings.Repeat("=", 70))
e.log.Error("WARNING: max_locks_per_transaction change requires PostgreSQL restart!")
e.log.Error("Current value: " + strconv.Itoa(original.MaxLocks) + ", needed: " + strconv.Itoa(lockBoostValue))
e.log.Error("Restore may fail with 'out of shared memory' error on BLOB-heavy databases.")
e.log.Error("")
e.log.Error("To fix manually:")
e.log.Error(" 1. sudo systemctl restart postgresql")
e.log.Error(" 2. Or: sudo -u postgres pg_ctl restart -D $PGDATA")
e.log.Error(" 3. Then re-run the restore")
e.log.Error("=" + strings.Repeat("=", 70))
// Continue anyway - might work for small restores
// Cannot restart - warn user but continue
// The setting is written to postgresql.auto.conf and will take effect on next restart
e.log.Warn("=" + strings.Repeat("=", 70))
e.log.Warn("NOTE: max_locks_per_transaction change requires PostgreSQL restart")
e.log.Warn("Current value: " + strconv.Itoa(original.MaxLocks) + ", target: " + strconv.Itoa(lockBoostValue))
e.log.Warn("")
e.log.Warn("The setting has been saved to postgresql.auto.conf and will take")
e.log.Warn("effect on the next PostgreSQL restart. If restore fails with")
e.log.Warn("'out of shared memory' errors, ask your DBA to restart PostgreSQL.")
e.log.Warn("")
e.log.Warn("Continuing with restore - this may succeed if your databases")
e.log.Warn("don't have many large objects (BLOBs).")
e.log.Warn("=" + strings.Repeat("=", 70))
// Continue anyway - might work for small restores or DBs without BLOBs
}
}
return original, nil
}
// canRestartPostgreSQL checks if we have the ability to restart PostgreSQL
// Returns false if running in a restricted environment (e.g., su postgres on enterprise systems)
func (e *Engine) canRestartPostgreSQL() bool {
// Check if we're running as postgres user - if so, we likely can't restart
// because PostgreSQL is managed by init/systemd, not directly by pg_ctl
currentUser := os.Getenv("USER")
if currentUser == "" {
currentUser = os.Getenv("LOGNAME")
}
// If we're the postgres user, check if we have sudo access
if currentUser == "postgres" {
// Try a quick sudo check - if this fails, we can't restart
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
defer cancel()
cmd := exec.CommandContext(ctx, "sudo", "-n", "true")
cmd.Stdin = nil
if err := cmd.Run(); err != nil {
e.log.Info("Running as postgres user without sudo access - cannot restart PostgreSQL",
"user", currentUser,
"hint", "Ask system administrator to restart PostgreSQL if needed")
return false
}
}
return true
}
// tryRestartPostgreSQL attempts to restart PostgreSQL using various methods
// Returns true if restart was successful
// IMPORTANT: Uses short timeouts and non-interactive sudo to avoid blocking on password prompts
// NOTE: This function will return false immediately if running as postgres without sudo
func (e *Engine) tryRestartPostgreSQL(ctx context.Context) bool {
// First check if we can even attempt a restart
if !e.canRestartPostgreSQL() {
e.log.Info("Skipping PostgreSQL restart attempt (no privileges)")
return false
}
e.progress.Update("Attempting PostgreSQL restart for lock settings...")
// Method 1: systemctl (most common on modern Linux)
cmd := exec.CommandContext(ctx, "sudo", "systemctl", "restart", "postgresql")
if err := cmd.Run(); err == nil {
// Use short timeout for each restart attempt (don't block on sudo password prompts)
runWithTimeout := func(args ...string) bool {
cmdCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
cmd := exec.CommandContext(cmdCtx, args[0], args[1:]...)
// Set stdin to /dev/null to prevent sudo from waiting for password
cmd.Stdin = nil
return cmd.Run() == nil
}
// Method 1: systemctl (most common on modern Linux) - use sudo -n for non-interactive
if runWithTimeout("sudo", "-n", "systemctl", "restart", "postgresql") {
return true
}
// Method 2: systemctl with version suffix (e.g., postgresql-15)
for _, ver := range []string{"17", "16", "15", "14", "13", "12"} {
cmd = exec.CommandContext(ctx, "sudo", "systemctl", "restart", "postgresql-"+ver)
if err := cmd.Run(); err == nil {
if runWithTimeout("sudo", "-n", "systemctl", "restart", "postgresql-"+ver) {
return true
}
}
// Method 3: service command (older systems)
cmd = exec.CommandContext(ctx, "sudo", "service", "postgresql", "restart")
if err := cmd.Run(); err == nil {
if runWithTimeout("sudo", "-n", "service", "postgresql", "restart") {
return true
}
// Method 4: pg_ctl as postgres user
cmd = exec.CommandContext(ctx, "sudo", "-u", "postgres", "pg_ctl", "restart", "-D", "/var/lib/postgresql/data", "-m", "fast")
if err := cmd.Run(); err == nil {
// Method 4: pg_ctl as postgres user (if we ARE postgres user, no sudo needed)
if runWithTimeout("pg_ctl", "restart", "-D", "/var/lib/postgresql/data", "-m", "fast") {
return true
}
// Method 5: Try common PGDATA paths
// Method 5: Try common PGDATA paths with pg_ctl directly (for postgres user)
pgdataPaths := []string{
"/var/lib/pgsql/data",
"/var/lib/pgsql/17/data",
@@ -2053,8 +2294,7 @@ func (e *Engine) tryRestartPostgreSQL(ctx context.Context) bool {
"/var/lib/postgresql/15/main",
}
for _, pgdata := range pgdataPaths {
cmd = exec.CommandContext(ctx, "sudo", "-u", "postgres", "pg_ctl", "restart", "-D", pgdata, "-m", "fast")
if err := cmd.Run(); err == nil {
if runWithTimeout("pg_ctl", "restart", "-D", pgdata, "-m", "fast") {
return true
}
}

View File

@@ -39,6 +39,8 @@ type BackupExecutionModel struct {
dbTotal int
dbDone int
dbName string // Current database being backed up
overallPhase int // 1=globals, 2=databases, 3=compressing
phaseDesc string // Description of current phase
}
// sharedBackupProgressState holds progress state that can be safely accessed from callbacks
@@ -47,6 +49,8 @@ type sharedBackupProgressState struct {
dbTotal int
dbDone int
dbName string
overallPhase int // 1=globals, 2=databases, 3=compressing
phaseDesc string // Description of current phase
hasUpdate bool
}
@@ -68,12 +72,12 @@ func clearCurrentBackupProgress() {
currentBackupProgressState = nil
}
func getCurrentBackupProgress() (dbTotal, dbDone int, dbName string, hasUpdate bool) {
func getCurrentBackupProgress() (dbTotal, dbDone int, dbName string, overallPhase int, phaseDesc string, hasUpdate bool) {
currentBackupProgressMu.Lock()
defer currentBackupProgressMu.Unlock()
if currentBackupProgressState == nil {
return 0, 0, "", false
return 0, 0, "", 0, "", false
}
currentBackupProgressState.mu.Lock()
@@ -83,7 +87,8 @@ func getCurrentBackupProgress() (dbTotal, dbDone int, dbName string, hasUpdate b
currentBackupProgressState.hasUpdate = false
return currentBackupProgressState.dbTotal, currentBackupProgressState.dbDone,
currentBackupProgressState.dbName, hasUpdate
currentBackupProgressState.dbName, currentBackupProgressState.overallPhase,
currentBackupProgressState.phaseDesc, hasUpdate
}
func NewBackupExecution(cfg *config.Config, log logger.Logger, parent tea.Model, ctx context.Context, backupType, dbName string, ratio int) BackupExecutionModel {
@@ -171,6 +176,8 @@ func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config,
progressState.dbDone = done
progressState.dbTotal = total
progressState.dbName = currentDB
progressState.overallPhase = 2 // Phase 2: Backing up databases
progressState.phaseDesc = fmt.Sprintf("Phase 2/3: Databases (%d/%d)", done, total)
progressState.hasUpdate = true
progressState.mu.Unlock()
})
@@ -223,11 +230,13 @@ func (m BackupExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.spinnerFrame = (m.spinnerFrame + 1) % len(spinnerFrames)
// Poll for database progress updates from callbacks
dbTotal, dbDone, dbName, hasUpdate := getCurrentBackupProgress()
dbTotal, dbDone, dbName, overallPhase, phaseDesc, hasUpdate := getCurrentBackupProgress()
if hasUpdate {
m.dbTotal = dbTotal
m.dbDone = dbDone
m.dbName = dbName
m.overallPhase = overallPhase
m.phaseDesc = phaseDesc
}
// Update status based on progress and elapsed time
@@ -361,41 +370,151 @@ func (m BackupExecutionModel) View() string {
// Status display
if !m.done {
// Show database progress bar if we have progress data (cluster backup)
// Unified progress display for cluster backup
if m.backupType == "cluster" {
// Calculate overall progress across all phases
// Phase 1: Globals (0-15%)
// Phase 2: Databases (15-90%)
// Phase 3: Compressing (90-100%)
overallProgress := 0
phaseLabel := "Starting..."
elapsedSec := int(time.Since(m.startTime).Seconds())
if m.overallPhase == 2 && m.dbTotal > 0 {
// Phase 2: Database backups - contributes 15-90%
dbPct := int((int64(m.dbDone) * 100) / int64(m.dbTotal))
overallProgress = 15 + (dbPct * 75 / 100)
phaseLabel = m.phaseDesc
} else if elapsedSec < 5 {
// Initial setup
overallProgress = 2
phaseLabel = "Phase 1/3: Initializing..."
} else if m.dbTotal == 0 {
// Phase 1: Globals backup (before databases start)
overallProgress = 10
phaseLabel = "Phase 1/3: Backing up Globals"
}
// Header with phase and overall progress
s.WriteString(infoStyle.Render(" ─── Cluster Backup Progress ──────────────────────────────"))
s.WriteString("\n\n")
s.WriteString(fmt.Sprintf(" %s\n\n", phaseLabel))
// Overall progress bar
s.WriteString(" Overall: ")
s.WriteString(renderProgressBar(overallProgress))
s.WriteString(fmt.Sprintf(" %d%%\n", overallProgress))
// Phase-specific details
if m.dbTotal > 0 && m.dbDone > 0 {
// Show progress bar instead of spinner when we have real progress
// Show current database being backed up
s.WriteString("\n")
spinner := spinnerFrames[m.spinnerFrame]
if m.dbName != "" && m.dbDone <= m.dbTotal {
s.WriteString(fmt.Sprintf(" Current: %s %s\n", spinner, m.dbName))
}
s.WriteString("\n")
// Database progress bar
progressBar := renderBackupDatabaseProgressBar(m.dbDone, m.dbTotal, m.dbName, 50)
s.WriteString(progressBar + "\n")
s.WriteString(fmt.Sprintf(" %s\n", m.status))
} else {
// Show spinner during initial phases
if m.cancelling {
s.WriteString(fmt.Sprintf(" %s %s\n", spinnerFrames[m.spinnerFrame], m.status))
} else {
s.WriteString(fmt.Sprintf(" %s %s\n", spinnerFrames[m.spinnerFrame], m.status))
// Intermediate phase (globals)
spinner := spinnerFrames[m.spinnerFrame]
s.WriteString(fmt.Sprintf("\n %s %s\n\n", spinner, m.status))
}
s.WriteString("\n")
s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
s.WriteString("\n\n")
} else {
// Single/sample database backup - simpler display
spinner := spinnerFrames[m.spinnerFrame]
s.WriteString(fmt.Sprintf(" %s %s\n", spinner, m.status))
}
if !m.cancelling {
s.WriteString("\n [KEY] Press Ctrl+C or ESC to cancel\n")
}
} else {
s.WriteString(fmt.Sprintf(" %s\n\n", m.status))
// Show completion summary with detailed stats
if m.err != nil {
s.WriteString(fmt.Sprintf(" [FAIL] Error: %v\n", m.err))
} else if m.result != "" {
// Parse and display result cleanly
lines := strings.Split(m.result, "\n")
for _, line := range lines {
line = strings.TrimSpace(line)
if line != "" {
s.WriteString(" " + line + "\n")
s.WriteString("\n")
s.WriteString(errorStyle.Render(" ╔══════════════════════════════════════════════════════════╗"))
s.WriteString("\n")
s.WriteString(errorStyle.Render(" ║ [FAIL] BACKUP FAILED ║"))
s.WriteString("\n")
s.WriteString(errorStyle.Render(" ╚══════════════════════════════════════════════════════════╝"))
s.WriteString("\n\n")
s.WriteString(errorStyle.Render(fmt.Sprintf(" Error: %v", m.err)))
s.WriteString("\n")
} else {
s.WriteString("\n")
s.WriteString(successStyle.Render(" ╔══════════════════════════════════════════════════════════╗"))
s.WriteString("\n")
s.WriteString(successStyle.Render(" ║ [OK] BACKUP COMPLETED SUCCESSFULLY ║"))
s.WriteString("\n")
s.WriteString(successStyle.Render(" ╚══════════════════════════════════════════════════════════╝"))
s.WriteString("\n\n")
// Summary section
s.WriteString(infoStyle.Render(" ─── Summary ─────────────────────────────────────────────"))
s.WriteString("\n\n")
// Backup type specific info
switch m.backupType {
case "cluster":
s.WriteString(" Type: Cluster Backup\n")
if m.dbTotal > 0 {
s.WriteString(fmt.Sprintf(" Databases: %d backed up\n", m.dbTotal))
}
case "single":
s.WriteString(" Type: Single Database Backup\n")
s.WriteString(fmt.Sprintf(" Database: %s\n", m.databaseName))
case "sample":
s.WriteString(" Type: Sample Backup\n")
s.WriteString(fmt.Sprintf(" Database: %s\n", m.databaseName))
s.WriteString(fmt.Sprintf(" Sample Ratio: %d\n", m.ratio))
}
s.WriteString("\n")
// Timing section
s.WriteString(infoStyle.Render(" ─── Timing ──────────────────────────────────────────────"))
s.WriteString("\n\n")
elapsed := time.Since(m.startTime)
s.WriteString(fmt.Sprintf(" Total Time: %s\n", formatBackupDuration(elapsed)))
if m.backupType == "cluster" && m.dbTotal > 0 {
avgPerDB := elapsed / time.Duration(m.dbTotal)
s.WriteString(fmt.Sprintf(" Avg per DB: %s\n", formatBackupDuration(avgPerDB)))
}
s.WriteString("\n [KEY] Press Enter or ESC to return to menu\n")
s.WriteString("\n")
s.WriteString(infoStyle.Render(" ─────────────────────────────────────────────────────────"))
s.WriteString("\n")
}
s.WriteString("\n")
s.WriteString(" [KEY] Press Enter or ESC to return to menu\n")
}
return s.String()
}
// formatBackupDuration formats duration in human readable format
func formatBackupDuration(d time.Duration) string {
if d < time.Minute {
return fmt.Sprintf("%.1fs", d.Seconds())
}
if d < time.Hour {
minutes := int(d.Minutes())
seconds := int(d.Seconds()) % 60
return fmt.Sprintf("%dm %ds", minutes, seconds)
}
hours := int(d.Hours())
minutes := int(d.Minutes()) % 60
return fmt.Sprintf("%dh %dm", hours, minutes)
}

View File

@@ -57,6 +57,18 @@ type RestoreExecutionModel struct {
dbTotal int
dbDone int
// Current database being restored (for detailed display)
currentDB string
// Timing info for database restore phase (ETA calculation)
dbPhaseElapsed time.Duration // Elapsed time since restore phase started
dbAvgPerDB time.Duration // Average time per database restore
// Overall progress tracking for unified display
overallPhase int // 1=Extracting, 2=Globals, 3=Databases
extractionDone bool
extractionTime time.Duration // How long extraction took (for ETA calc)
// Results
done bool
cancelling bool // True when user has requested cancellation
@@ -136,6 +148,17 @@ type sharedProgressState struct {
dbTotal int
dbDone int
// Current database being restored
currentDB string
// Timing info for database restore phase
dbPhaseElapsed time.Duration // Elapsed time since restore phase started
dbAvgPerDB time.Duration // Average time per database restore
// Overall phase tracking (1=Extract, 2=Globals, 3=Databases)
overallPhase int
extractionDone bool
// Rolling window for speed calculation
speedSamples []restoreSpeedSample
}
@@ -163,12 +186,12 @@ func clearCurrentRestoreProgress() {
currentRestoreProgressState = nil
}
func getCurrentRestoreProgress() (bytesTotal, bytesDone int64, description string, hasUpdate bool, dbTotal, dbDone int, speed float64) {
func getCurrentRestoreProgress() (bytesTotal, bytesDone int64, description string, hasUpdate bool, dbTotal, dbDone int, speed float64, dbPhaseElapsed, dbAvgPerDB time.Duration, currentDB string, overallPhase int, extractionDone bool) {
currentRestoreProgressMu.Lock()
defer currentRestoreProgressMu.Unlock()
if currentRestoreProgressState == nil {
return 0, 0, "", false, 0, 0, 0
return 0, 0, "", false, 0, 0, 0, 0, 0, "", 0, false
}
currentRestoreProgressState.mu.Lock()
@@ -179,7 +202,10 @@ func getCurrentRestoreProgress() (bytesTotal, bytesDone int64, description strin
return currentRestoreProgressState.bytesTotal, currentRestoreProgressState.bytesDone,
currentRestoreProgressState.description, currentRestoreProgressState.hasUpdate,
currentRestoreProgressState.dbTotal, currentRestoreProgressState.dbDone, speed
currentRestoreProgressState.dbTotal, currentRestoreProgressState.dbDone, speed,
currentRestoreProgressState.dbPhaseElapsed, currentRestoreProgressState.dbAvgPerDB,
currentRestoreProgressState.currentDB, currentRestoreProgressState.overallPhase,
currentRestoreProgressState.extractionDone
}
// calculateRollingSpeed calculates speed from recent samples (last 5 seconds)
@@ -279,6 +305,14 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
progressState.bytesTotal = total
progressState.description = description
progressState.hasUpdate = true
progressState.overallPhase = 1
progressState.extractionDone = false
// Check if extraction is complete
if current >= total && total > 0 {
progressState.extractionDone = true
progressState.overallPhase = 2
}
// Add speed sample for rolling window calculation
progressState.speedSamples = append(progressState.speedSamples, restoreSpeedSample{
@@ -298,6 +332,27 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
progressState.dbDone = done
progressState.dbTotal = total
progressState.description = fmt.Sprintf("Restoring %s", dbName)
progressState.currentDB = dbName
progressState.overallPhase = 3
progressState.extractionDone = true
progressState.hasUpdate = true
// Clear byte progress when switching to db progress
progressState.bytesTotal = 0
progressState.bytesDone = 0
})
// Set up timing-aware database progress callback for cluster restore ETA
engine.SetDatabaseProgressWithTimingCallback(func(done, total int, dbName string, phaseElapsed, avgPerDB time.Duration) {
progressState.mu.Lock()
defer progressState.mu.Unlock()
progressState.dbDone = done
progressState.dbTotal = total
progressState.description = fmt.Sprintf("Restoring %s", dbName)
progressState.currentDB = dbName
progressState.overallPhase = 3
progressState.extractionDone = true
progressState.dbPhaseElapsed = phaseElapsed
progressState.dbAvgPerDB = avgPerDB
progressState.hasUpdate = true
// Clear byte progress when switching to db progress
progressState.bytesTotal = 0
@@ -357,26 +412,46 @@ func (m RestoreExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.elapsed = time.Since(m.startTime)
// Poll shared progress state for real-time updates
bytesTotal, bytesDone, description, hasUpdate, dbTotal, dbDone, speed := getCurrentRestoreProgress()
if hasUpdate && bytesTotal > 0 {
bytesTotal, bytesDone, description, hasUpdate, dbTotal, dbDone, speed, dbPhaseElapsed, dbAvgPerDB, currentDB, overallPhase, extractionDone := getCurrentRestoreProgress()
if hasUpdate && bytesTotal > 0 && !extractionDone {
// Phase 1: Extraction
m.bytesTotal = bytesTotal
m.bytesDone = bytesDone
m.description = description
m.showBytes = true
m.speed = speed
m.overallPhase = 1
m.extractionDone = false
// Update status to reflect actual progress
m.status = description
m.phase = "Extracting"
m.phase = "Phase 1/3: Extracting Archive"
m.progress = int((bytesDone * 100) / bytesTotal)
} else if hasUpdate && dbTotal > 0 {
// Database count progress for cluster restore
// Phase 3: Database restores
m.dbTotal = dbTotal
m.dbDone = dbDone
m.dbPhaseElapsed = dbPhaseElapsed
m.dbAvgPerDB = dbAvgPerDB
m.currentDB = currentDB
m.overallPhase = overallPhase
m.extractionDone = extractionDone
m.showBytes = false
m.status = fmt.Sprintf("Restoring database %d of %d...", dbDone+1, dbTotal)
m.phase = "Restore"
if dbDone < dbTotal {
m.status = fmt.Sprintf("Restoring: %s", currentDB)
} else {
m.status = "Finalizing..."
}
m.phase = fmt.Sprintf("Phase 3/3: Databases (%d/%d)", dbDone, dbTotal)
m.progress = int((dbDone * 100) / dbTotal)
} else if hasUpdate && extractionDone && dbTotal == 0 {
// Phase 2: Globals restore (brief phase between extraction and databases)
m.overallPhase = 2
m.extractionDone = true
m.showBytes = false
m.status = "Restoring global objects (roles, tablespaces)..."
m.phase = "Phase 2/3: Restoring Globals"
} else {
// Fallback: Update status based on elapsed time to show progress
// This provides visual feedback even though we don't have real-time progress
@@ -518,53 +593,154 @@ func (m RestoreExecutionModel) View() string {
s.WriteString("\n")
if m.done {
// Show result
// Show result with comprehensive summary
if m.err != nil {
s.WriteString(errorStyle.Render("[FAIL] Restore Failed"))
s.WriteString(errorStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
s.WriteString("\n")
s.WriteString(errorStyle.Render("║ [FAIL] RESTORE FAILED ║"))
s.WriteString("\n")
s.WriteString(errorStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
s.WriteString("\n\n")
s.WriteString(errorStyle.Render(fmt.Sprintf(" Error: %v", m.err)))
s.WriteString("\n")
} else {
s.WriteString(successStyle.Render("[OK] Restore Completed Successfully"))
s.WriteString(successStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
s.WriteString("\n")
s.WriteString(successStyle.Render("║ [OK] RESTORE COMPLETED SUCCESSFULLY ║"))
s.WriteString("\n")
s.WriteString(successStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
s.WriteString("\n\n")
s.WriteString(successStyle.Render(m.result))
// Summary section
s.WriteString(infoStyle.Render(" ─── Summary ───────────────────────────────────────────────"))
s.WriteString("\n\n")
// Archive info
s.WriteString(fmt.Sprintf(" Archive: %s\n", m.archive.Name))
if m.archive.Size > 0 {
s.WriteString(fmt.Sprintf(" Archive Size: %s\n", FormatBytes(m.archive.Size)))
}
// Restore type specific info
if m.restoreType == "restore-cluster" {
s.WriteString(fmt.Sprintf(" Type: Cluster Restore\n"))
if m.dbTotal > 0 {
s.WriteString(fmt.Sprintf(" Databases: %d restored\n", m.dbTotal))
}
if m.cleanClusterFirst && len(m.existingDBs) > 0 {
s.WriteString(fmt.Sprintf(" Cleaned: %d existing database(s) dropped\n", len(m.existingDBs)))
}
} else {
s.WriteString(fmt.Sprintf(" Type: Single Database Restore\n"))
s.WriteString(fmt.Sprintf(" Target DB: %s\n", m.targetDB))
}
s.WriteString("\n")
}
s.WriteString(fmt.Sprintf("\nElapsed Time: %s\n", formatDuration(m.elapsed)))
// Timing section
s.WriteString(infoStyle.Render(" ─── Timing ────────────────────────────────────────────────"))
s.WriteString("\n\n")
s.WriteString(fmt.Sprintf(" Total Time: %s\n", formatDuration(m.elapsed)))
// Calculate and show throughput if we have size info
if m.archive.Size > 0 && m.elapsed.Seconds() > 0 {
throughput := float64(m.archive.Size) / m.elapsed.Seconds()
s.WriteString(fmt.Sprintf(" Throughput: %s/s (average)\n", FormatBytes(int64(throughput))))
}
if m.dbTotal > 0 && m.err == nil {
avgPerDB := m.elapsed / time.Duration(m.dbTotal)
s.WriteString(fmt.Sprintf(" Avg per DB: %s\n", formatDuration(avgPerDB)))
}
s.WriteString("\n")
s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
s.WriteString("\n\n")
s.WriteString(infoStyle.Render(" [KEYS] Press Enter to continue"))
} else {
// Show progress
// Show unified progress for cluster restore
if m.restoreType == "restore-cluster" {
// Calculate overall progress across all phases
// Phase 1: Extraction (0-60%)
// Phase 2: Globals (60-65%)
// Phase 3: Databases (65-100%)
overallProgress := 0
phaseLabel := "Starting..."
if m.showBytes && m.bytesTotal > 0 {
// Phase 1: Extraction - contributes 0-60%
extractPct := int((m.bytesDone * 100) / m.bytesTotal)
overallProgress = (extractPct * 60) / 100
phaseLabel = "Phase 1/3: Extracting Archive"
} else if m.extractionDone && m.dbTotal == 0 {
// Phase 2: Globals restore
overallProgress = 62
phaseLabel = "Phase 2/3: Restoring Globals"
} else if m.dbTotal > 0 {
// Phase 3: Database restores - contributes 65-100%
dbPct := int((int64(m.dbDone) * 100) / int64(m.dbTotal))
overallProgress = 65 + (dbPct * 35 / 100)
phaseLabel = fmt.Sprintf("Phase 3/3: Databases (%d/%d)", m.dbDone, m.dbTotal)
}
// Header with phase and overall progress
s.WriteString(infoStyle.Render(" ─── Cluster Restore Progress ─────────────────────────────"))
s.WriteString("\n\n")
s.WriteString(fmt.Sprintf(" %s\n\n", phaseLabel))
// Overall progress bar
s.WriteString(" Overall: ")
s.WriteString(renderProgressBar(overallProgress))
s.WriteString(fmt.Sprintf(" %d%%\n", overallProgress))
// Phase-specific details
if m.showBytes && m.bytesTotal > 0 {
// Show extraction details
s.WriteString("\n")
s.WriteString(fmt.Sprintf(" %s\n", m.status))
s.WriteString("\n")
s.WriteString(renderDetailedProgressBarWithSpeed(m.bytesDone, m.bytesTotal, m.speed))
s.WriteString("\n")
} else if m.dbTotal > 0 {
// Show current database being restored
s.WriteString("\n")
spinner := m.spinnerFrames[m.spinnerFrame]
if m.currentDB != "" && m.dbDone < m.dbTotal {
s.WriteString(fmt.Sprintf(" Current: %s %s\n", spinner, m.currentDB))
} else if m.dbDone >= m.dbTotal {
s.WriteString(fmt.Sprintf(" %s Finalizing...\n", spinner))
}
s.WriteString("\n")
// Database progress bar with timing
s.WriteString(renderDatabaseProgressBarWithTiming(m.dbDone, m.dbTotal, m.dbPhaseElapsed, m.dbAvgPerDB))
s.WriteString("\n")
} else {
// Intermediate phase (globals)
spinner := m.spinnerFrames[m.spinnerFrame]
s.WriteString(fmt.Sprintf("\n %s %s\n\n", spinner, m.status))
}
s.WriteString("\n")
s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
s.WriteString("\n\n")
} else {
// Single database restore - simpler display
s.WriteString(fmt.Sprintf("Phase: %s\n", m.phase))
// Show detailed progress bar when we have byte-level information
// In this case, hide the spinner for cleaner display
if m.showBytes && m.bytesTotal > 0 {
// Status line without spinner (progress bar provides activity indication)
s.WriteString(fmt.Sprintf("Status: %s\n", m.status))
s.WriteString("\n")
// Render schollz-style progress bar with bytes, rolling speed, ETA
s.WriteString(renderDetailedProgressBarWithSpeed(m.bytesDone, m.bytesTotal, m.speed))
s.WriteString("\n\n")
} else if m.dbTotal > 0 {
// Database count progress for cluster restore
spinner := m.spinnerFrames[m.spinnerFrame]
s.WriteString(fmt.Sprintf("Status: %s %s\n", spinner, m.status))
s.WriteString("\n")
// Show database progress bar
s.WriteString(renderDatabaseProgressBar(m.dbDone, m.dbTotal))
s.WriteString("\n\n")
} else {
// Show status with rotating spinner (for phases without detailed progress)
spinner := m.spinnerFrames[m.spinnerFrame]
s.WriteString(fmt.Sprintf("Status: %s %s\n", spinner, m.status))
s.WriteString("\n")
if m.restoreType == "restore-single" {
// Fallback to simple progress bar for single database restore
// Fallback to simple progress bar
progressBar := renderProgressBar(m.progress)
s.WriteString(progressBar)
s.WriteString(fmt.Sprintf(" %d%%\n", m.progress))
@@ -678,6 +854,55 @@ func renderDatabaseProgressBar(done, total int) string {
return s.String()
}
// renderDatabaseProgressBarWithTiming renders a progress bar for database count with timing and ETA
func renderDatabaseProgressBarWithTiming(done, total int, phaseElapsed, avgPerDB time.Duration) string {
var s strings.Builder
// Calculate percentage
percent := 0
if total > 0 {
percent = (done * 100) / total
if percent > 100 {
percent = 100
}
}
// Render progress bar
width := 30
filled := (percent * width) / 100
barFilled := strings.Repeat("█", filled)
barEmpty := strings.Repeat("░", width-filled)
s.WriteString(successStyle.Render("["))
s.WriteString(successStyle.Render(barFilled))
s.WriteString(infoStyle.Render(barEmpty))
s.WriteString(successStyle.Render("]"))
// Count and percentage
s.WriteString(fmt.Sprintf(" %3d%% %d / %d databases", percent, done, total))
// Timing and ETA
if phaseElapsed > 0 {
s.WriteString(fmt.Sprintf(" [%s", FormatDurationShort(phaseElapsed)))
// Calculate ETA based on average time per database
if avgPerDB > 0 && done < total {
remainingDBs := total - done
eta := time.Duration(remainingDBs) * avgPerDB
s.WriteString(fmt.Sprintf(" / ETA: %s", FormatDurationShort(eta)))
} else if done > 0 && done < total {
// Fallback: estimate ETA from overall elapsed time
avgElapsed := phaseElapsed / time.Duration(done)
remainingDBs := total - done
eta := time.Duration(remainingDBs) * avgElapsed
s.WriteString(fmt.Sprintf(" / ETA: ~%s", FormatDurationShort(eta)))
}
s.WriteString("]")
}
return s.String()
}
// formatDuration formats duration in human readable format
func formatDuration(d time.Duration) string {
if d < time.Minute {

View File

@@ -16,7 +16,7 @@ import (
// Build information (set by ldflags)
var (
version = "3.42.34"
version = "3.42.49"
buildTime = "unknown"
gitCommit = "unknown"
)