Compare commits
21 Commits
| SHA1 |
|---|
| 62d58c77af |
| c5be9bcd2b |
| b120f1507e |
| dd1db844ce |
| 4ea3ec2cf8 |
| 9200024e50 |
| 698b8a761c |
| dd7c4da0eb |
| b2a78cad2a |
| 5728b465e6 |
| bfe99e959c |
| 780beaadfb |
| 838c5b8c15 |
| 9d95a193db |
| 3201f0fb6a |
| 62ddc57fb7 |
| 510175ff04 |
| a85ad0c88c |
| 4938dc1918 |
| 09a917766f |
| eeacbfa007 |

(The capture preserved only the SHA1 column; author and date values were lost.)
CHANGELOG.md (+92)

@@ -5,6 +5,98 @@ All notable changes to dbbackup will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [3.42.50] - 2026-01-16 "Ctrl+C Signal Handling Fix"
+
+### Fixed - Proper Ctrl+C/SIGINT Handling in TUI
+
+- **Added tea.InterruptMsg handling** - Bubbletea v1.3+ sends `InterruptMsg` for SIGINT signals
+  instead of a `KeyMsg` with "ctrl+c", causing cancellation to not work
+- **Fixed cluster restore cancellation** - Ctrl+C now properly cancels running restore operations
+- **Fixed cluster backup cancellation** - Ctrl+C now properly cancels running backup operations
+- **Added interrupt handling to main menu** - Proper cleanup on SIGINT from menu
+- **Orphaned process cleanup** - `cleanup.KillOrphanedProcesses()` called on all interrupt paths
+
+### Changed
+
+- All TUI execution views now handle both `tea.KeyMsg` ("ctrl+c") and `tea.InterruptMsg`
+- Context cancellation properly propagates to child processes via `exec.CommandContext`
+- No zombie pg_dump/pg_restore/gzip processes left behind on cancellation
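A minimal sketch of the dual-path interrupt handling this entry describes, assuming Bubble Tea v1.3+ — `execModel` and its `cancel` field are illustrative stand-ins, not the project's actual view code:

```go
package tui

import (
	"context"

	tea "github.com/charmbracelet/bubbletea"
)

// execModel stands in for one of the TUI execution views; only the
// context.CancelFunc wired into exec.CommandContext matters here.
type execModel struct {
	cancel context.CancelFunc
}

func (m execModel) Init() tea.Cmd { return nil }
func (m execModel) View() string  { return "" }

// Update must handle both interrupt paths: Bubble Tea v1.3+ delivers SIGINT
// as tea.InterruptMsg, while keyboard-captured Ctrl+C still arrives as a KeyMsg.
func (m execModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
	switch msg := msg.(type) {
	case tea.InterruptMsg:
		m.cancel() // cancels the context driving child pg_dump/pg_restore processes
		return m, tea.Quit
	case tea.KeyMsg:
		if msg.String() == "ctrl+c" {
			m.cancel()
			return m, tea.Quit
		}
	}
	return m, nil
}
```

Per the changelog, orphaned-process cleanup (`cleanup.KillOrphanedProcesses()`) runs on these same paths before quitting.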
+
+## [3.42.49] - 2026-01-16 "Unified Cluster Backup Progress"
+
+### Added - Unified Progress Display for Cluster Backup
+
+- **Combined overall progress bar** for cluster backup showing all phases:
+  - Phase 1/3: Backing up Globals (0-15% of overall)
+  - Phase 2/3: Backing up Databases (15-90% of overall)
+  - Phase 3/3: Compressing Archive (90-100% of overall)
+- **Current database indicator** - Shows which database is currently being backed up
+- **Phase-aware progress tracking** - New fields in backup progress state:
+  - `overallPhase` - Current phase (1=globals, 2=databases, 3=compressing)
+  - `phaseDesc` - Human-readable phase description
+- **Dual progress bars** for cluster backup:
+  - Overall progress bar showing combined operation progress
+  - Database count progress bar showing individual database progress
+
+### Changed
+
+- Cluster backup TUI now shows unified progress display matching restore
+- Progress callbacks now include phase information
+- Better visual feedback during entire cluster backup operation
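The phase ranges above define a piecewise-linear mapping from per-phase progress onto the single overall bar. A small illustrative sketch of that arithmetic (an assumption about the implementation, not code from the repository):

```go
package main

import "fmt"

// Phase slices of the overall bar for cluster backup:
// globals 0-15%, databases 15-90%, compression 90-100%.
var phaseBounds = [3][2]float64{{0, 15}, {15, 90}, {90, 100}}

// overallPercent maps (phase 1..3, fraction 0..1 within that phase)
// onto the combined 0..100 scale.
func overallPercent(phase int, frac float64) float64 {
	b := phaseBounds[phase-1]
	return b[0] + frac*(b[1]-b[0])
}

func main() {
	// Halfway through the database phase: 15 + 0.5*(90-15) = 52.5% overall.
	fmt.Println(overallPercent(2, 0.5))
}
```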
+
+## [3.42.48] - 2026-01-15 "Unified Cluster Restore Progress"
+
+### Added - Unified Progress Display for Cluster Restore
+
+- **Combined overall progress bar** showing progress across all restore phases:
+  - Phase 1/3: Extracting Archive (0-60% of overall)
+  - Phase 2/3: Restoring Globals (60-65% of overall)
+  - Phase 3/3: Restoring Databases (65-100% of overall)
+- **Current database indicator** - Shows which database is currently being restored
+- **Phase-aware progress tracking** - New fields in progress state:
+  - `overallPhase` - Current phase (1=extraction, 2=globals, 3=databases)
+  - `currentDB` - Name of database currently being restored
+  - `extractionDone` - Boolean flag for phase transition
+- **Dual progress bars** for cluster restore:
+  - Overall progress bar showing combined operation progress
+  - Phase-specific progress bar (extraction bytes or database count)
+
+### Changed
+
+- Cluster restore TUI now shows unified progress display
+- Progress callbacks now set phase and current database information
+- Extraction completion triggers automatic transition to globals phase
+- Database restore phase shows current database name with spinner
+
+### Improved
+
+- Better visual feedback during entire cluster restore operation
+- Clear phase indicators help users understand restore progress
+- Overall progress percentage gives better time estimates
+
+## [3.42.35] - 2026-01-15 "TUI Detailed Progress"
+
+### Added - Enhanced TUI Progress Display
+
+- **Detailed progress bar in TUI restore** - schollz-style progress bar with:
+  - Byte progress display (e.g., `245 MB / 1.2 GB`)
+  - Transfer speed calculation (e.g., `45 MB/s`)
+  - ETA prediction for long operations
+  - Unicode block-based visual bar
+- **Real-time extraction progress** - Archive extraction now reports actual bytes processed
+- **Go-native tar extraction** - Uses Go's `archive/tar` + `compress/gzip` when progress callback is set
+- **New `DetailedProgress` component** in TUI package:
+  - `NewDetailedProgress(total, description)` - Byte-based progress
+  - `NewDetailedProgressItems(total, description)` - Item count progress
+  - `NewDetailedProgressSpinner(description)` - Indeterminate spinner
+  - `RenderProgressBar(width)` - Generate schollz-style output
+- **Progress callback API** in restore engine:
+  - `SetProgressCallback(func(current, total int64, description string))`
+  - Allows TUI to receive real-time progress updates from restore operations
+- **Shared progress state** pattern for Bubble Tea integration
+
+### Changed
+
+- TUI restore execution now shows detailed byte progress during archive extraction
+- Cluster restore shows extraction progress instead of just spinner
+- Falls back to shell `tar` command when no progress callback is set (faster)
+
+### Technical Details
+
+- `progressReader` wrapper tracks bytes read through gzip/tar pipeline
+- Throttled progress updates (every 100ms) to avoid UI flooding
+- Thread-safe shared state pattern for cross-goroutine progress updates
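The "shared progress state" bullet is the glue between the engine's callback and Bubble Tea's message loop. A self-contained reduction of the pattern (names such as `sharedProgress` are hypothetical):

```go
package main

import (
	"fmt"
	"sync"
)

// ProgressCallback mirrors the signature introduced in this release.
type ProgressCallback func(current, total int64, description string)

// sharedProgress: the engine goroutine writes through the callback,
// the UI goroutine reads snapshots on its own tick.
type sharedProgress struct {
	mu      sync.Mutex
	current int64
	total   int64
	desc    string
}

func (s *sharedProgress) callback() ProgressCallback {
	return func(current, total int64, description string) {
		s.mu.Lock()
		defer s.mu.Unlock()
		s.current, s.total, s.desc = current, total, description
	}
}

func (s *sharedProgress) snapshot() (int64, int64, string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.current, s.total, s.desc
}

func main() {
	sp := &sharedProgress{}
	cb := sp.callback()
	cb(512, 1024, "Extracting archive") // engine side
	cur, tot, desc := sp.snapshot()     // UI side
	fmt.Printf("%s: %d/%d\n", desc, cur, tot)
}
```

Neither side blocks the other: the engine only ever takes the lock briefly, and the UI polls at its own refresh rate.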
+
 ## [3.42.34] - 2026-01-14 "Filesystem Abstraction"
 
 ### Added - spf13/afero for Filesystem Abstraction
Binaries README:

@@ -3,9 +3,9 @@
 This directory contains pre-compiled binaries for the DB Backup Tool across multiple platforms and architectures.
 
 ## Build Information
-- **Version**: 3.42.34
-- **Build Time**: 2026-01-14_16:06:08_UTC
-- **Git Commit**: ba6e8a2
+- **Version**: 3.42.50
+- **Build Time**: 2026-01-17_12:25:20_UTC
+- **Git Commit**: c5be9bc
 
 ## Recent Updates (v1.1.0)
 - ✅ Fixed TUI progress display with line-by-line output
Restore command (cobra CLI):

@@ -28,6 +28,7 @@ var (
     restoreClean       bool
     restoreCreate      bool
     restoreJobs        int
+    restoreParallelDBs int // Number of parallel database restores
     restoreTarget      string
     restoreVerbose     bool
     restoreNoProgress  bool
@@ -289,6 +290,7 @@ func init() {
     restoreClusterCmd.Flags().BoolVar(&restoreForce, "force", false, "Skip safety checks and confirmations")
     restoreClusterCmd.Flags().BoolVar(&restoreCleanCluster, "clean-cluster", false, "Drop all existing user databases before restore (disaster recovery)")
     restoreClusterCmd.Flags().IntVar(&restoreJobs, "jobs", 0, "Number of parallel decompression jobs (0 = auto)")
+    restoreClusterCmd.Flags().IntVar(&restoreParallelDBs, "parallel-dbs", 0, "Number of databases to restore in parallel (0 = use config default, 1 = sequential, -1 = auto-detect based on CPU/RAM)")
     restoreClusterCmd.Flags().StringVar(&restoreWorkdir, "workdir", "", "Working directory for extraction (use when system disk is small, e.g. /mnt/storage/restore_tmp)")
     restoreClusterCmd.Flags().BoolVar(&restoreVerbose, "verbose", false, "Show detailed restore progress")
     restoreClusterCmd.Flags().BoolVar(&restoreNoProgress, "no-progress", false, "Disable progress indicators")
@@ -783,6 +785,17 @@ func runRestoreCluster(cmd *cobra.Command, args []string) error {
         }
     }
+
+    // Override cluster parallelism if --parallel-dbs is specified
+    if restoreParallelDBs == -1 {
+        // Auto-detect optimal parallelism based on system resources
+        autoParallel := restore.CalculateOptimalParallel()
+        cfg.ClusterParallelism = autoParallel
+        log.Info("Auto-detected optimal parallelism for database restores", "parallel_dbs", autoParallel, "mode", "auto")
+    } else if restoreParallelDBs > 0 {
+        cfg.ClusterParallelism = restoreParallelDBs
+        log.Info("Using custom parallelism for database restores", "parallel_dbs", restoreParallelDBs)
+    }
+
     // Create restore engine
     engine := restore.New(cfg, log, db)
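`restore.CalculateOptimalParallel()` itself is not part of this diff, so the following is purely an assumed sketch of the kind of CPU/RAM heuristic the `--parallel-dbs -1` help text implies — not the project's actual implementation:

```go
package restore

import "runtime"

// CalculateOptimalParallel picks a parallelism level from system resources.
// Hypothetical heuristic: use half the CPUs, clamped to [1, 8], leaving
// headroom for the PostgreSQL server doing the actual restore work.
func CalculateOptimalParallel() int {
	n := runtime.NumCPU() / 2
	if n < 1 {
		n = 1
	}
	if n > 8 {
		n = 8 // diminishing returns beyond a handful of concurrent pg_restore runs
	}
	return n
}
```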
Grafana dashboard JSON (RPO panels):

@@ -94,7 +94,7 @@
         "uid": "${DS_PROMETHEUS}"
       },
       "editorMode": "code",
-      "expr": "dbbackup_rpo_seconds{instance=~\"$instance\"} < 86400",
+      "expr": "dbbackup_rpo_seconds{instance=~\"$instance\"} < bool 604800",
       "legendFormat": "{{database}}",
       "range": true,
       "refId": "A"
@@ -711,19 +711,6 @@
       },
       "pluginVersion": "10.2.0",
       "targets": [
-        {
-          "datasource": {
-            "type": "prometheus",
-            "uid": "${DS_PROMETHEUS}"
-          },
-          "editorMode": "code",
-          "expr": "dbbackup_rpo_seconds{instance=~\"$instance\"} < 86400",
-          "format": "table",
-          "instant": true,
-          "legendFormat": "__auto",
-          "range": false,
-          "refId": "Status"
-        },
         {
           "datasource": {
             "type": "prometheus",
@@ -769,26 +756,30 @@
           "Time": true,
           "Time 1": true,
           "Time 2": true,
-          "Time 3": true,
           "__name__": true,
           "__name__ 1": true,
          "__name__ 2": true,
-          "__name__ 3": true,
           "instance 1": true,
           "instance 2": true,
-          "instance 3": true,
           "job": true,
           "job 1": true,
           "job 2": true,
-          "job 3": true
+          "engine 1": true,
+          "engine 2": true
         },
-        "indexByName": {},
+        "indexByName": {
+          "Database": 0,
+          "Instance": 1,
+          "Engine": 2,
+          "RPO": 3,
+          "Size": 4
+        },
         "renameByName": {
           "Value #RPO": "RPO",
           "Value #Size": "Size",
-          "Value #Status": "Status",
           "database": "Database",
-          "instance": "Instance"
+          "instance": "Instance",
+          "engine": "Engine"
         }
       }
     }
@@ -1275,7 +1266,7 @@
         "query": "label_values(dbbackup_rpo_seconds, instance)",
         "refId": "StandardVariableQuery"
       },
-      "refresh": 1,
+      "refresh": 2,
      "regex": "",
       "skipUrlSync": false,
       "sort": 1,
pg_hba.conf auth detection:

@@ -84,19 +84,13 @@ func findHbaFileViaPostgres() string {
 
 // parsePgHbaConf parses pg_hba.conf and returns the authentication method
 func parsePgHbaConf(path string, user string) AuthMethod {
-    // Try with sudo if we can't read directly
+    // Try to read the file directly - do NOT use sudo as it triggers password prompts
+    // If we can't read pg_hba.conf, we'll rely on connection attempts to determine auth
     file, err := os.Open(path)
     if err != nil {
-        // Try with sudo (with timeout)
-        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
-        defer cancel()
-
-        cmd := exec.CommandContext(ctx, "sudo", "cat", path)
-        output, err := cmd.Output()
-        if err != nil {
-            return AuthUnknown
-        }
-        return parseHbaContent(string(output), user)
+        // If we can't read the file, return unknown and let the connection determine auth
+        // This avoids sudo password prompts when running as postgres via su
+        return AuthUnknown
     }
     defer file.Close()
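The removed fallback shelled out to `sudo cat`, which can trigger an interactive password prompt when no sudo credentials are cached — exactly the situation when dbbackup runs as `postgres` via `su` — so even with its 10-second timeout the probe could stall or spam prompts. Returning `AuthUnknown` instead defers the question to the first real connection attempt, which reveals the effective authentication method anyway.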
Backup engine:

@@ -28,14 +28,22 @@ import (
     "dbbackup/internal/swap"
 )
 
+// ProgressCallback is called with byte-level progress updates during backup operations
+type ProgressCallback func(current, total int64, description string)
+
+// DatabaseProgressCallback is called with database count progress during cluster backup
+type DatabaseProgressCallback func(done, total int, dbName string)
+
 // Engine handles backup operations
 type Engine struct {
     cfg      *config.Config
     log      logger.Logger
     db       database.Database
     progress progress.Indicator
     detailedReporter *progress.DetailedReporter
     silent   bool // Silent mode for TUI
+    progressCallback   ProgressCallback
+    dbProgressCallback DatabaseProgressCallback
 }
 
 // New creates a new backup engine
@@ -86,6 +94,30 @@ func NewSilent(cfg *config.Config, log logger.Logger, db database.Database, prog
     }
 }
 
+// SetProgressCallback sets a callback for detailed progress reporting (for TUI mode)
+func (e *Engine) SetProgressCallback(cb ProgressCallback) {
+    e.progressCallback = cb
+}
+
+// SetDatabaseProgressCallback sets a callback for database count progress during cluster backup
+func (e *Engine) SetDatabaseProgressCallback(cb DatabaseProgressCallback) {
+    e.dbProgressCallback = cb
+}
+
+// reportProgress reports progress to the callback if set
+func (e *Engine) reportProgress(current, total int64, description string) {
+    if e.progressCallback != nil {
+        e.progressCallback(current, total, description)
+    }
+}
+
+// reportDatabaseProgress reports database count progress to the callback if set
+func (e *Engine) reportDatabaseProgress(done, total int, dbName string) {
+    if e.dbProgressCallback != nil {
+        e.dbProgressCallback(done, total, dbName)
+    }
+}
+
 // loggerAdapter adapts our logger to the progress.Logger interface
 type loggerAdapter struct {
     logger logger.Logger
@@ -465,6 +497,8 @@ func (e *Engine) BackupCluster(ctx context.Context) error {
             estimator.UpdateProgress(idx)
             e.printf(" [%d/%d] Backing up database: %s\n", idx+1, len(databases), name)
             quietProgress.Update(fmt.Sprintf("Backing up database %d/%d: %s", idx+1, len(databases), name))
+            // Report database progress to TUI callback
+            e.reportDatabaseProgress(idx+1, len(databases), name)
             mu.Unlock()
 
             // Check database size and warn if very large
@@ -903,11 +937,15 @@ func (e *Engine) createSampleBackup(ctx context.Context, databaseName, outputFil
 func (e *Engine) backupGlobals(ctx context.Context, tempDir string) error {
     globalsFile := filepath.Join(tempDir, "globals.sql")
 
-    cmd := exec.CommandContext(ctx, "pg_dumpall", "--globals-only")
-    if e.cfg.Host != "localhost" {
-        cmd.Args = append(cmd.Args, "-h", e.cfg.Host, "-p", fmt.Sprintf("%d", e.cfg.Port))
-    }
-    cmd.Args = append(cmd.Args, "-U", e.cfg.User)
+    // CRITICAL: Always pass port even for localhost - user may have non-standard port
+    cmd := exec.CommandContext(ctx, "pg_dumpall", "--globals-only",
+        "-p", fmt.Sprintf("%d", e.cfg.Port),
+        "-U", e.cfg.User)
+
+    // Only add -h flag for non-localhost to use Unix socket for peer auth
+    if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
+        cmd.Args = append([]string{cmd.Args[0], "-h", e.cfg.Host}, cmd.Args[1:]...)
+    }
 
     cmd.Env = os.Environ()
     if e.cfg.Password != "" {
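The practical effect of the `backupGlobals` change: on a localhost instance listening on a non-default port (say 5433), the constructed command is now `pg_dumpall --globals-only -p 5433 -U postgres` — no `-h`, so libpq uses the Unix socket and peer auth keeps working — whereas before, the port was silently dropped for localhost and `pg_dumpall` connected to 5432. For a remote host the `-h` pair is spliced in right after the program name: `pg_dumpall -h db.example.com --globals-only -p 5433 -U postgres` (host and user names here are illustrative).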
Error classification:

@@ -68,8 +68,8 @@ func ClassifyError(errorMsg string) *ErrorClassification {
             Type:     "critical",
             Category: "locks",
             Message:  errorMsg,
-            Hint:     "Lock table exhausted - typically caused by large objects (BLOBs) during restore",
-            Action:   "Option 1: Increase max_locks_per_transaction to 1024+ in postgresql.conf (requires restart). Option 2: Update dbbackup and retry - phased restore now auto-enabled for BLOB databases",
+            Hint:     "Lock table exhausted. Total capacity = max_locks_per_transaction × (max_connections + max_prepared_transactions). If you reduced VM size or max_connections, you need higher max_locks_per_transaction to compensate.",
+            Action:   "Fix: ALTER SYSTEM SET max_locks_per_transaction = 4096; then RESTART PostgreSQL. For smaller VMs with fewer connections, you need higher max_locks_per_transaction values.",
             Severity: 2,
         }
     case "permission_denied":
@@ -142,8 +142,8 @@ func ClassifyError(errorMsg string) *ErrorClassification {
             Type:     "critical",
             Category: "locks",
             Message:  errorMsg,
-            Hint:     "Lock table exhausted - typically caused by large objects (BLOBs) during restore",
-            Action:   "Option 1: Increase max_locks_per_transaction to 1024+ in postgresql.conf (requires restart). Option 2: Update dbbackup and retry - phased restore now auto-enabled for BLOB databases",
+            Hint:     "Lock table exhausted. Total capacity = max_locks_per_transaction × (max_connections + max_prepared_transactions). If you reduced VM size or max_connections, you need higher max_locks_per_transaction to compensate.",
+            Action:   "Fix: ALTER SYSTEM SET max_locks_per_transaction = 4096; then RESTART PostgreSQL. For smaller VMs with fewer connections, you need higher max_locks_per_transaction values.",
             Severity: 2,
         }
     }
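To put the reworked Hint's formula into numbers: with PostgreSQL's defaults of `max_connections = 100`, `max_prepared_transactions = 0`, and `max_locks_per_transaction = 64`, the shared lock table holds 64 × (100 + 0) = 6,400 locks. A BLOB-heavy restore can need far more than that, and shrinking a VM (and with it `max_connections`) shrinks the table further — hence the suggested `ALTER SYSTEM SET max_locks_per_transaction = 4096`, which at 100 connections yields 409,600 slots.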
PostgreSQL command builders:

@@ -316,11 +316,12 @@ func (p *PostgreSQL) BuildBackupCommand(database, outputFile string, options Bac
     cmd := []string{"pg_dump"}
 
     // Connection parameters
-    if p.cfg.Host != "localhost" {
+    // CRITICAL: Always pass port even for localhost - user may have non-standard port
+    if p.cfg.Host != "localhost" && p.cfg.Host != "127.0.0.1" && p.cfg.Host != "" {
         cmd = append(cmd, "-h", p.cfg.Host)
-        cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
         cmd = append(cmd, "--no-password")
     }
+    cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
     cmd = append(cmd, "-U", p.cfg.User)
 
     // Format and compression
@@ -380,11 +381,12 @@ func (p *PostgreSQL) BuildRestoreCommand(database, inputFile string, options Res
     cmd := []string{"pg_restore"}
 
     // Connection parameters
-    if p.cfg.Host != "localhost" {
+    // CRITICAL: Always pass port even for localhost - user may have non-standard port
+    if p.cfg.Host != "localhost" && p.cfg.Host != "127.0.0.1" && p.cfg.Host != "" {
         cmd = append(cmd, "-h", p.cfg.Host)
-        cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
         cmd = append(cmd, "--no-password")
     }
+    cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
     cmd = append(cmd, "-U", p.cfg.User)
 
     // Parallel jobs (incompatible with --single-transaction per PostgreSQL docs)
Chunker test (content-defined chunking):

@@ -4,6 +4,7 @@ import (
     "bytes"
     "crypto/rand"
     "io"
+    mathrand "math/rand"
     "testing"
 )
 
@@ -100,12 +101,15 @@ func TestChunker_Deterministic(t *testing.T) {
 
 func TestChunker_ShiftedData(t *testing.T) {
     // Test that shifted data still shares chunks (the key CDC benefit)
+    // Use deterministic random data for reproducible test results
+    rng := mathrand.New(mathrand.NewSource(42))
+
     original := make([]byte, 100*1024)
-    rand.Read(original)
+    rng.Read(original)
 
     // Create shifted version (prepend some bytes)
     prefix := make([]byte, 1000)
-    rand.Read(prefix)
+    rng.Read(prefix)
     shifted := append(prefix, original...)
 
     // Chunk both
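Seeding `math/rand` with a fixed value makes the test buffers identical on every run, so the chunk-sharing assertion can never flake on unlucky data; `crypto/rand` — the right source for unpredictable production inputs — generated different bytes each run and made any failure unreproducible.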
Restore engine:

@@ -1,9 +1,12 @@
 package restore
 
 import (
+    "archive/tar"
+    "compress/gzip"
     "context"
     "database/sql"
     "fmt"
+    "io"
     "os"
     "os/exec"
     "path/filepath"
@@ -24,6 +27,21 @@ import (
     _ "github.com/jackc/pgx/v5/stdlib" // PostgreSQL driver
 )
 
+// ProgressCallback is called with progress updates during long operations
+// Parameters: current bytes/items done, total bytes/items, description
+type ProgressCallback func(current, total int64, description string)
+
+// DatabaseProgressCallback is called with database count progress during cluster restore
+type DatabaseProgressCallback func(done, total int, dbName string)
+
+// DatabaseProgressWithTimingCallback is called with database progress including timing info
+// Parameters: done count, total count, database name, elapsed time for current restore phase, avg duration per DB
+type DatabaseProgressWithTimingCallback func(done, total int, dbName string, phaseElapsed, avgPerDB time.Duration)
+
+// DatabaseProgressByBytesCallback is called with progress weighted by database sizes (bytes)
+// Parameters: bytes completed, total bytes, current database name, databases done count, total database count
+type DatabaseProgressByBytesCallback func(bytesDone, bytesTotal int64, dbName string, dbDone, dbTotal int)
+
 // Engine handles database restore operations
 type Engine struct {
     cfg *config.Config
@@ -33,6 +51,12 @@ type Engine struct {
     detailedReporter *progress.DetailedReporter
     dryRun bool
     debugLogPath string // Path to save debug log on error
+
+    // TUI progress callback for detailed progress reporting
+    progressCallback ProgressCallback
+    dbProgressCallback DatabaseProgressCallback
+    dbProgressTimingCallback DatabaseProgressWithTimingCallback
+    dbProgressByBytesCallback DatabaseProgressByBytesCallback
 }
 
 // New creates a new restore engine
@@ -88,6 +112,54 @@ func (e *Engine) SetDebugLogPath(path string) {
     e.debugLogPath = path
 }
 
+// SetProgressCallback sets a callback for detailed progress reporting (for TUI mode)
+func (e *Engine) SetProgressCallback(cb ProgressCallback) {
+    e.progressCallback = cb
+}
+
+// SetDatabaseProgressCallback sets a callback for database count progress during cluster restore
+func (e *Engine) SetDatabaseProgressCallback(cb DatabaseProgressCallback) {
+    e.dbProgressCallback = cb
+}
+
+// SetDatabaseProgressWithTimingCallback sets a callback for database progress with timing info
+func (e *Engine) SetDatabaseProgressWithTimingCallback(cb DatabaseProgressWithTimingCallback) {
+    e.dbProgressTimingCallback = cb
+}
+
+// SetDatabaseProgressByBytesCallback sets a callback for progress weighted by database sizes
+func (e *Engine) SetDatabaseProgressByBytesCallback(cb DatabaseProgressByBytesCallback) {
+    e.dbProgressByBytesCallback = cb
+}
+
+// reportProgress safely calls the progress callback if set
+func (e *Engine) reportProgress(current, total int64, description string) {
+    if e.progressCallback != nil {
+        e.progressCallback(current, total, description)
+    }
+}
+
+// reportDatabaseProgress safely calls the database progress callback if set
+func (e *Engine) reportDatabaseProgress(done, total int, dbName string) {
+    if e.dbProgressCallback != nil {
+        e.dbProgressCallback(done, total, dbName)
+    }
+}
+
+// reportDatabaseProgressWithTiming safely calls the timing-aware callback if set
+func (e *Engine) reportDatabaseProgressWithTiming(done, total int, dbName string, phaseElapsed, avgPerDB time.Duration) {
+    if e.dbProgressTimingCallback != nil {
+        e.dbProgressTimingCallback(done, total, dbName, phaseElapsed, avgPerDB)
+    }
+}
+
+// reportDatabaseProgressByBytes safely calls the bytes-weighted callback if set
+func (e *Engine) reportDatabaseProgressByBytes(bytesDone, bytesTotal int64, dbName string, dbDone, dbTotal int) {
+    if e.dbProgressByBytesCallback != nil {
+        e.dbProgressByBytesCallback(bytesDone, bytesTotal, dbName, dbDone, dbTotal)
+    }
+}
+
 // loggerAdapter adapts our logger to the progress.Logger interface
 type loggerAdapter struct {
     logger logger.Logger
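Of the four callback flavours, the bytes-weighted one exists because a count-based bar misleads badly when dump sizes are skewed: finishing two of three databases sounds like 67%, but if the third holds nearly all the data, almost everything remains. A self-contained illustration of the arithmetic (assumed, not project code):

```go
package main

import "fmt"

func percent(done, total int64) float64 {
	return 100 * float64(done) / float64(total)
}

func main() {
	// One 90 GB dump plus two 1 GB dumps; the two small ones finish first.
	sizes := []int64{90 << 30, 1 << 30, 1 << 30}
	var totalBytes int64
	for _, s := range sizes {
		totalBytes += s
	}
	bytesDone := sizes[1] + sizes[2]

	fmt.Printf("count-based: %.0f%%  bytes-based: %.1f%%\n",
		percent(2, 3), percent(bytesDone, totalBytes))
	// count-based: 67%  bytes-based: 2.2%
}
```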
@@ -387,16 +459,18 @@ func (e *Engine) restorePostgreSQLSQL(ctx context.Context, archivePath, targetDB
     var cmd []string
 
     // For localhost, omit -h to use Unix socket (avoids Ident auth issues)
+    // But always include -p for port (in case of non-standard port)
     hostArg := ""
+    portArg := fmt.Sprintf("-p %d", e.cfg.Port)
     if e.cfg.Host != "localhost" && e.cfg.Host != "" {
-        hostArg = fmt.Sprintf("-h %s -p %d", e.cfg.Host, e.cfg.Port)
+        hostArg = fmt.Sprintf("-h %s", e.cfg.Host)
     }
 
     if compressed {
         // Use ON_ERROR_STOP=1 to fail fast on first error (prevents millions of errors on truncated dumps)
-        psqlCmd := fmt.Sprintf("psql -U %s -d %s -v ON_ERROR_STOP=1", e.cfg.User, targetDB)
+        psqlCmd := fmt.Sprintf("psql %s -U %s -d %s -v ON_ERROR_STOP=1", portArg, e.cfg.User, targetDB)
         if hostArg != "" {
-            psqlCmd = fmt.Sprintf("psql %s -U %s -d %s -v ON_ERROR_STOP=1", hostArg, e.cfg.User, targetDB)
+            psqlCmd = fmt.Sprintf("psql %s %s -U %s -d %s -v ON_ERROR_STOP=1", hostArg, portArg, e.cfg.User, targetDB)
         }
         // Set PGPASSWORD in the bash command for password-less auth
         cmd = []string{
@@ -417,6 +491,7 @@ func (e *Engine) restorePostgreSQLSQL(ctx context.Context, archivePath, targetDB
     } else {
         cmd = []string{
             "psql",
+            "-p", fmt.Sprintf("%d", e.cfg.Port),
             "-U", e.cfg.User,
             "-d", targetDB,
             "-v", "ON_ERROR_STOP=1",
@@ -803,6 +878,25 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
     // Create temporary extraction directory in configured WorkDir
     workDir := e.cfg.GetEffectiveWorkDir()
     tempDir := filepath.Join(workDir, fmt.Sprintf(".restore_%d", time.Now().Unix()))
+
+    // Check disk space for extraction (need ~3x archive size: compressed + extracted + working space)
+    if archiveInfo != nil {
+        requiredBytes := uint64(archiveInfo.Size()) * 3
+        extractionCheck := checks.CheckDiskSpace(workDir)
+        if extractionCheck.AvailableBytes < requiredBytes {
+            operation.Fail("Insufficient disk space for extraction")
+            return fmt.Errorf("insufficient disk space for extraction in %s: need %.1f GB, have %.1f GB (archive size: %.1f GB × 3)",
+                workDir,
+                float64(requiredBytes)/(1024*1024*1024),
+                float64(extractionCheck.AvailableBytes)/(1024*1024*1024),
+                float64(archiveInfo.Size())/(1024*1024*1024))
+        }
+        e.log.Info("Disk space check for extraction passed",
+            "workdir", workDir,
+            "required_gb", float64(requiredBytes)/(1024*1024*1024),
+            "available_gb", float64(extractionCheck.AvailableBytes)/(1024*1024*1024))
+    }
+
     if err := os.MkdirAll(tempDir, 0755); err != nil {
         operation.Fail("Failed to create temporary directory")
         return fmt.Errorf("failed to create temp directory in %s: %w", workDir, err)
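As a worked example of the ×3 rule just added: a 10 GB archive demands 30 GB free in the workdir — room for the compressed archive being read, the fully extracted dumps, and transient working space. With only 20 GB available, the restore now fails immediately with the exact shortfall in the error message instead of dying partway through extraction.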
@@ -816,6 +910,16 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
|||||||
return fmt.Errorf("failed to extract archive: %w", err)
|
return fmt.Errorf("failed to extract archive: %w", err)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Check context validity after extraction (debugging context cancellation issues)
|
||||||
|
if ctx.Err() != nil {
|
||||||
|
e.log.Error("Context cancelled after extraction - this should not happen",
|
||||||
|
"context_error", ctx.Err(),
|
||||||
|
"extraction_completed", true)
|
||||||
|
operation.Fail("Context cancelled unexpectedly")
|
||||||
|
return fmt.Errorf("context cancelled after extraction completed: %w", ctx.Err())
|
||||||
|
}
|
||||||
|
e.log.Info("Extraction completed, context still valid")
|
||||||
|
|
||||||
// Check if user has superuser privileges (required for ownership restoration)
|
// Check if user has superuser privileges (required for ownership restoration)
|
||||||
e.progress.Update("Checking privileges...")
|
e.progress.Update("Checking privileges...")
|
||||||
isSuperuser, err := e.checkSuperuser(ctx)
|
isSuperuser, err := e.checkSuperuser(ctx)
|
||||||
@@ -966,12 +1070,27 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
|||||||
var restoreErrorsMu sync.Mutex
|
var restoreErrorsMu sync.Mutex
|
||||||
totalDBs := 0
|
totalDBs := 0
|
||||||
|
|
||||||
// Count total databases
|
// Count total databases and calculate total bytes for weighted progress
|
||||||
|
var totalBytes int64
|
||||||
|
dbSizes := make(map[string]int64) // Map database name to dump file size
|
||||||
for _, entry := range entries {
|
for _, entry := range entries {
|
||||||
if !entry.IsDir() {
|
if !entry.IsDir() {
|
||||||
totalDBs++
|
totalDBs++
|
||||||
|
dumpFile := filepath.Join(dumpsDir, entry.Name())
|
||||||
|
if info, err := os.Stat(dumpFile); err == nil {
|
||||||
|
dbName := entry.Name()
|
||||||
|
dbName = strings.TrimSuffix(dbName, ".dump")
|
||||||
|
dbName = strings.TrimSuffix(dbName, ".sql.gz")
|
||||||
|
dbSizes[dbName] = info.Size()
|
||||||
|
totalBytes += info.Size()
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
e.log.Info("Calculated total restore size", "databases", totalDBs, "total_bytes", totalBytes)
|
||||||
|
|
||||||
|
// Track bytes completed for weighted progress
|
||||||
|
var bytesCompleted int64
|
||||||
|
var bytesCompletedMu sync.Mutex
|
||||||
|
|
||||||
// Create ETA estimator for database restores
|
// Create ETA estimator for database restores
|
||||||
estimator := progress.NewETAEstimator("Restoring cluster", totalDBs)
|
estimator := progress.NewETAEstimator("Restoring cluster", totalDBs)
|
||||||
@@ -999,6 +1118,23 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
|||||||
var successCount, failCount int32
|
var successCount, failCount int32
|
||||||
var mu sync.Mutex // Protect shared resources (progress, logger)
|
var mu sync.Mutex // Protect shared resources (progress, logger)
|
||||||
|
|
||||||
|
// CRITICAL: Check context before starting database restore loop
|
||||||
|
// This helps debug issues where context gets cancelled between extraction and restore
|
||||||
|
if ctx.Err() != nil {
|
||||||
|
e.log.Error("Context cancelled before database restore loop started",
|
||||||
|
"context_error", ctx.Err(),
|
||||||
|
"total_databases", totalDBs,
|
||||||
|
"parallelism", parallelism)
|
||||||
|
operation.Fail("Context cancelled before database restores could start")
|
||||||
|
return fmt.Errorf("context cancelled before database restore: %w", ctx.Err())
|
||||||
|
}
|
||||||
|
e.log.Info("Starting database restore loop", "databases", totalDBs, "parallelism", parallelism)
|
||||||
|
|
||||||
|
// Timing tracking for restore phase progress
|
||||||
|
restorePhaseStart := time.Now()
|
||||||
|
var completedDBTimes []time.Duration // Track duration for each completed DB restore
|
||||||
|
var completedDBTimesMu sync.Mutex
|
||||||
|
|
||||||
// Create semaphore to limit concurrency
|
// Create semaphore to limit concurrency
|
||||||
semaphore := make(chan struct{}, parallelism)
|
semaphore := make(chan struct{}, parallelism)
|
||||||
var wg sync.WaitGroup
|
var wg sync.WaitGroup
|
||||||
@@ -1024,6 +1160,19 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
|||||||
}
|
}
|
||||||
}()
|
}()
|
||||||
|
|
||||||
|
// Check for context cancellation before starting
|
||||||
|
if ctx.Err() != nil {
|
||||||
|
e.log.Warn("Context cancelled - skipping database restore", "file", filename)
|
||||||
|
atomic.AddInt32(&failCount, 1)
|
||||||
|
restoreErrorsMu.Lock()
|
||||||
|
restoreErrors = multierror.Append(restoreErrors, fmt.Errorf("%s: restore skipped (context cancelled)", strings.TrimSuffix(strings.TrimSuffix(filename, ".dump"), ".sql.gz")))
|
||||||
|
restoreErrorsMu.Unlock()
|
||||||
|
return
|
||||||
|
}
|
||||||
|
|
||||||
|
// Track timing for this database restore
|
||||||
|
dbRestoreStart := time.Now()
|
||||||
|
|
||||||
// Update estimator progress (thread-safe)
|
// Update estimator progress (thread-safe)
|
||||||
mu.Lock()
|
mu.Lock()
|
||||||
estimator.UpdateProgress(idx)
|
estimator.UpdateProgress(idx)
|
||||||
@@ -1036,10 +1185,26 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
|||||||
|
|
||||||
dbProgress := 15 + int(float64(idx)/float64(totalDBs)*85.0)
|
dbProgress := 15 + int(float64(idx)/float64(totalDBs)*85.0)
|
||||||
|
|
||||||
|
// Calculate average time per DB and report progress with timing
|
||||||
|
completedDBTimesMu.Lock()
|
||||||
|
var avgPerDB time.Duration
|
||||||
|
if len(completedDBTimes) > 0 {
|
||||||
|
var totalDuration time.Duration
|
||||||
|
for _, d := range completedDBTimes {
|
||||||
|
totalDuration += d
|
||||||
|
}
|
||||||
|
avgPerDB = totalDuration / time.Duration(len(completedDBTimes))
|
||||||
|
}
|
||||||
|
phaseElapsed := time.Since(restorePhaseStart)
|
||||||
|
completedDBTimesMu.Unlock()
|
||||||
|
|
||||||
mu.Lock()
|
mu.Lock()
|
||||||
statusMsg := fmt.Sprintf("Restoring database %s (%d/%d)", dbName, idx+1, totalDBs)
|
statusMsg := fmt.Sprintf("Restoring database %s (%d/%d)", dbName, idx+1, totalDBs)
|
||||||
e.progress.Update(statusMsg)
|
e.progress.Update(statusMsg)
|
||||||
e.log.Info("Restoring database", "name", dbName, "file", dumpFile, "progress", dbProgress)
|
e.log.Info("Restoring database", "name", dbName, "file", dumpFile, "progress", dbProgress)
|
||||||
|
// Report database progress for TUI (both callbacks)
|
||||||
|
e.reportDatabaseProgress(idx, totalDBs, dbName)
|
||||||
|
e.reportDatabaseProgressWithTiming(idx, totalDBs, dbName, phaseElapsed, avgPerDB)
|
||||||
mu.Unlock()
|
mu.Unlock()
|
||||||
|
|
||||||
// STEP 1: Drop existing database completely (clean slate)
|
// STEP 1: Drop existing database completely (clean slate)
|
||||||
@@ -1104,7 +1269,27 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
|||||||
return
|
return
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Track completed database restore duration for ETA calculation
|
||||||
|
dbRestoreDuration := time.Since(dbRestoreStart)
|
||||||
|
completedDBTimesMu.Lock()
|
||||||
|
completedDBTimes = append(completedDBTimes, dbRestoreDuration)
|
||||||
|
completedDBTimesMu.Unlock()
|
||||||
|
|
||||||
|
// Update bytes completed for weighted progress
|
||||||
|
dbSize := dbSizes[dbName]
|
||||||
|
bytesCompletedMu.Lock()
|
||||||
|
bytesCompleted += dbSize
|
||||||
|
currentBytesCompleted := bytesCompleted
|
||||||
|
currentSuccessCount := int(atomic.LoadInt32(&successCount)) + 1 // +1 because we're about to increment
|
||||||
|
bytesCompletedMu.Unlock()
|
||||||
|
|
||||||
|
// Report weighted progress (bytes-based)
|
||||||
|
e.reportDatabaseProgressByBytes(currentBytesCompleted, totalBytes, dbName, currentSuccessCount, totalDBs)
|
||||||
|
|
||||||
atomic.AddInt32(&successCount, 1)
|
atomic.AddInt32(&successCount, 1)
|
||||||
|
|
||||||
|
// Small delay to ensure PostgreSQL fully closes connections before next restore
|
||||||
|
time.Sleep(100 * time.Millisecond)
|
||||||
}(dbIndex, entry.Name())
|
}(dbIndex, entry.Name())
|
||||||
|
|
||||||
dbIndex++
|
dbIndex++
|
||||||
@@ -1116,6 +1301,35 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
|||||||
successCountFinal := int(atomic.LoadInt32(&successCount))
|
successCountFinal := int(atomic.LoadInt32(&successCount))
|
||||||
failCountFinal := int(atomic.LoadInt32(&failCount))
|
failCountFinal := int(atomic.LoadInt32(&failCount))
|
||||||
|
|
||||||
|
// SANITY CHECK: Verify all databases were accounted for
|
||||||
|
// This catches any goroutine that exited without updating counters
|
||||||
|
accountedFor := successCountFinal + failCountFinal
|
||||||
|
if accountedFor != totalDBs {
|
||||||
|
missingCount := totalDBs - accountedFor
|
||||||
|
e.log.Error("INTERNAL ERROR: Some database restore goroutines did not report status",
|
||||||
|
"expected", totalDBs,
|
||||||
|
"success", successCountFinal,
|
||||||
|
"failed", failCountFinal,
|
||||||
|
"unaccounted", missingCount)
|
||||||
|
|
||||||
|
// Treat unaccounted databases as failures
|
||||||
|
failCountFinal += missingCount
|
||||||
|
restoreErrorsMu.Lock()
|
||||||
|
restoreErrors = multierror.Append(restoreErrors, fmt.Errorf("%d database(s) did not complete (possible goroutine crash or deadlock)", missingCount))
|
||||||
|
restoreErrorsMu.Unlock()
|
||||||
|
}
|
||||||
|
|
||||||
|
// CRITICAL: Check if no databases were restored at all
|
||||||
|
if successCountFinal == 0 {
|
||||||
|
e.progress.Fail(fmt.Sprintf("Cluster restore FAILED: 0 of %d databases restored", totalDBs))
|
||||||
|
operation.Fail("No databases were restored")
|
||||||
|
|
||||||
|
if failCountFinal > 0 && restoreErrors != nil {
|
||||||
|
return fmt.Errorf("cluster restore failed: all %d database(s) failed:\n%s", failCountFinal, restoreErrors.Error())
|
||||||
|
}
|
||||||
|
return fmt.Errorf("cluster restore failed: no databases were restored (0 of %d total). Check PostgreSQL logs for details", totalDBs)
|
||||||
|
}
|
||||||
|
|
||||||
if failCountFinal > 0 {
|
if failCountFinal > 0 {
|
||||||
// Format multi-error with detailed output
|
// Format multi-error with detailed output
|
||||||
restoreErrors.ErrorFormat = func(errs []error) string {
|
restoreErrors.ErrorFormat = func(errs []error) string {
|
||||||
@@ -1146,8 +1360,144 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
|||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
|
||||||
// extractArchive extracts a tar.gz archive
|
// extractArchive extracts a tar.gz archive with progress reporting
|
||||||
func (e *Engine) extractArchive(ctx context.Context, archivePath, destDir string) error {
|
func (e *Engine) extractArchive(ctx context.Context, archivePath, destDir string) error {
|
||||||
|
// If progress callback is set, use Go's archive/tar for progress tracking
|
||||||
|
if e.progressCallback != nil {
|
||||||
|
return e.extractArchiveWithProgress(ctx, archivePath, destDir)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Otherwise use fast shell tar (no progress)
|
||||||
|
return e.extractArchiveShell(ctx, archivePath, destDir)
|
||||||
|
}
|
||||||
|
|
||||||
|
// extractArchiveWithProgress extracts using Go's archive/tar with detailed progress reporting
|
||||||
|
func (e *Engine) extractArchiveWithProgress(ctx context.Context, archivePath, destDir string) error {
|
||||||
|
// Get archive size for progress calculation
|
||||||
|
archiveInfo, err := os.Stat(archivePath)
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("failed to stat archive: %w", err)
|
||||||
|
}
|
||||||
|
totalSize := archiveInfo.Size()
|
||||||
|
|
||||||
|
// Open the archive file
|
||||||
|
file, err := os.Open(archivePath)
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("failed to open archive: %w", err)
|
||||||
|
}
|
||||||
|
defer file.Close()
|
||||||
|
|
||||||
|
// Wrap with progress reader
|
||||||
|
progressReader := &progressReader{
|
||||||
|
reader: file,
|
||||||
|
totalSize: totalSize,
|
||||||
|
callback: e.progressCallback,
|
||||||
|
desc: "Extracting archive",
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create gzip reader
|
||||||
|
gzReader, err := gzip.NewReader(progressReader)
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("failed to create gzip reader: %w", err)
|
||||||
|
}
|
||||||
|
defer gzReader.Close()
|
||||||
|
|
||||||
|
// Create tar reader
|
||||||
|
tarReader := tar.NewReader(gzReader)
|
||||||
|
|
||||||
|
// Extract files
|
||||||
|
for {
|
||||||
|
select {
|
||||||
|
case <-ctx.Done():
|
||||||
|
return ctx.Err()
|
||||||
|
default:
|
||||||
|
}
|
||||||
|
|
||||||
|
header, err := tarReader.Next()
|
||||||
|
if err == io.EOF {
|
||||||
|
break // End of archive
|
||||||
|
}
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("failed to read tar header: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Sanitize and validate path
|
||||||
|
targetPath := filepath.Join(destDir, header.Name)
|
||||||
|
|
||||||
|
// Security check: ensure path is within destDir (prevent path traversal)
|
||||||
|
if !strings.HasPrefix(filepath.Clean(targetPath), filepath.Clean(destDir)) {
|
||||||
|
e.log.Warn("Skipping potentially malicious path in archive", "path", header.Name)
|
||||||
|
continue
|
||||||
|
}
|
||||||
|
|
||||||
|
switch header.Typeflag {
|
||||||
|
case tar.TypeDir:
|
||||||
|
if err := os.MkdirAll(targetPath, 0755); err != nil {
|
||||||
|
return fmt.Errorf("failed to create directory %s: %w", targetPath, err)
|
||||||
|
}
|
||||||
|
case tar.TypeReg:
|
||||||
|
// Ensure parent directory exists
|
||||||
|
if err := os.MkdirAll(filepath.Dir(targetPath), 0755); err != nil {
|
||||||
|
return fmt.Errorf("failed to create parent directory: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create the file
|
||||||
|
outFile, err := os.OpenFile(targetPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.FileMode(header.Mode))
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("failed to create file %s: %w", targetPath, err)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Copy file contents
|
||||||
|
if _, err := io.Copy(outFile, tarReader); err != nil {
|
||||||
|
outFile.Close()
|
||||||
|
return fmt.Errorf("failed to write file %s: %w", targetPath, err)
|
||||||
|
}
|
||||||
|
outFile.Close()
|
||||||
|
case tar.TypeSymlink:
|
||||||
|
// Handle symlinks (common in some archives)
|
||||||
|
if err := os.Symlink(header.Linkname, targetPath); err != nil {
|
||||||
|
// Ignore symlink errors (may already exist or not supported)
|
||||||
|
e.log.Debug("Could not create symlink", "path", targetPath, "target", header.Linkname)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Final progress update
|
||||||
|
e.reportProgress(totalSize, totalSize, "Extraction complete")
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// progressReader wraps an io.Reader to report read progress
|
||||||
|
type progressReader struct {
|
||||||
|
reader io.Reader
|
||||||
|
totalSize int64
|
||||||
|
bytesRead int64
|
||||||
|
callback ProgressCallback
|
||||||
|
desc string
|
||||||
|
lastReport time.Time
|
||||||
|
reportEvery time.Duration
|
||||||
|
}
|
||||||
|
|
||||||
|
func (pr *progressReader) Read(p []byte) (n int, err error) {
|
||||||
|
n, err = pr.reader.Read(p)
|
||||||
|
pr.bytesRead += int64(n)
|
||||||
|
|
||||||
|
// Throttle progress reporting to every 100ms
|
||||||
|
if pr.reportEvery == 0 {
|
||||||
|
pr.reportEvery = 100 * time.Millisecond
|
||||||
|
}
|
||||||
|
if time.Since(pr.lastReport) > pr.reportEvery {
|
||||||
|
if pr.callback != nil {
|
||||||
|
pr.callback(pr.bytesRead, pr.totalSize, pr.desc)
|
||||||
|
}
|
||||||
|
pr.lastReport = time.Now()
|
||||||
|
}
|
||||||
|
|
||||||
|
return n, err
|
||||||
|
}
|
||||||
|
|
||||||
|
// extractArchiveShell extracts using shell tar command (faster but no progress)
|
||||||
|
func (e *Engine) extractArchiveShell(ctx context.Context, archivePath, destDir string) error {
|
||||||
cmd := exec.CommandContext(ctx, "tar", "-xzf", archivePath, "-C", destDir)
|
cmd := exec.CommandContext(ctx, "tar", "-xzf", archivePath, "-C", destDir)
|
||||||
|
|
||||||
// Stream stderr to avoid memory issues - tar can produce lots of output for large archives
|
// Stream stderr to avoid memory issues - tar can produce lots of output for large archives
|
||||||
@@ -1199,6 +1549,8 @@ func (e *Engine) extractArchive(ctx context.Context, archivePath, destDir string
|
|||||||
}
|
}
|
||||||
|
|
||||||
// restoreGlobals restores global objects (roles, tablespaces)
|
// restoreGlobals restores global objects (roles, tablespaces)
|
||||||
|
// Note: psql returns 0 even when some statements fail (e.g., role already exists)
|
||||||
|
// We track errors but only fail on FATAL errors that would prevent restore
|
||||||
func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
|
func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
|
||||||
args := []string{
|
args := []string{
|
||||||
"-p", fmt.Sprintf("%d", e.cfg.Port),
|
"-p", fmt.Sprintf("%d", e.cfg.Port),
|
||||||
@@ -1228,6 +1580,8 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
|
|||||||
|
|
||||||
// Read stderr in chunks in goroutine
|
// Read stderr in chunks in goroutine
|
||||||
var lastError string
|
var lastError string
|
||||||
|
var errorCount int
|
||||||
|
var fatalError bool
|
||||||
stderrDone := make(chan struct{})
|
stderrDone := make(chan struct{})
|
||||||
go func() {
|
go func() {
|
||||||
defer close(stderrDone)
|
defer close(stderrDone)
|
||||||
@@ -1236,9 +1590,23 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
|
|||||||
n, err := stderr.Read(buf)
|
n, err := stderr.Read(buf)
|
||||||
if n > 0 {
|
if n > 0 {
|
||||||
chunk := string(buf[:n])
|
chunk := string(buf[:n])
|
||||||
if strings.Contains(chunk, "ERROR") || strings.Contains(chunk, "FATAL") {
|
// Track different error types
|
||||||
|
if strings.Contains(chunk, "FATAL") {
|
||||||
|
fatalError = true
|
||||||
lastError = chunk
|
lastError = chunk
|
||||||
e.log.Warn("Globals restore stderr", "output", chunk)
|
e.log.Error("Globals restore FATAL error", "output", chunk)
|
||||||
|
} else if strings.Contains(chunk, "ERROR") {
|
||||||
|
errorCount++
|
||||||
|
lastError = chunk
|
||||||
|
// Only log first few errors to avoid spam
|
||||||
|
if errorCount <= 5 {
|
||||||
|
// Check if it's an ignorable "already exists" error
|
||||||
|
if strings.Contains(chunk, "already exists") {
|
||||||
|
e.log.Debug("Globals restore: object already exists (expected)", "output", chunk)
|
||||||
|
} else {
|
||||||
|
e.log.Warn("Globals restore error", "output", chunk)
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
if err != nil {
|
if err != nil {
|
||||||
@@ -1266,10 +1634,23 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
|
|||||||
|
|
||||||
<-stderrDone
|
<-stderrDone
|
||||||
|
|
||||||
|
// Only fail on actual command errors or FATAL PostgreSQL errors
|
||||||
|
// Regular ERROR messages (like "role already exists") are expected
|
||||||
if cmdErr != nil {
|
if cmdErr != nil {
|
||||||
return fmt.Errorf("failed to restore globals: %w (last error: %s)", cmdErr, lastError)
|
return fmt.Errorf("failed to restore globals: %w (last error: %s)", cmdErr, lastError)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// If we had FATAL errors, those are real problems
|
||||||
|
if fatalError {
|
||||||
|
return fmt.Errorf("globals restore had FATAL error: %s", lastError)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Log summary if there were errors (but don't fail)
|
||||||
|
if errorCount > 0 {
|
||||||
|
e.log.Info("Globals restore completed with some errors (usually 'already exists' - expected)",
|
||||||
|
"error_count", errorCount)
|
||||||
|
}
|
||||||
|
|
||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -1337,6 +1718,7 @@ func (e *Engine) terminateConnections(ctx context.Context, dbName string) error
 }
 
 // dropDatabaseIfExists drops a database completely (clean slate)
+// Uses PostgreSQL 13+ WITH (FORCE) option to forcefully drop even with active connections
 func (e *Engine) dropDatabaseIfExists(ctx context.Context, dbName string) error {
 	// First terminate all connections
 	if err := e.terminateConnections(ctx, dbName); err != nil {
@@ -1346,26 +1728,67 @@ func (e *Engine) dropDatabaseIfExists(ctx context.Context, dbName string) error
 	// Wait a moment for connections to terminate
 	time.Sleep(500 * time.Millisecond)
 
-	// Drop the database
-	args := []string{
+	// Try to revoke new connections (prevents race condition)
+	// This only works if we have the privilege to do so
+	revokeArgs := []string{
 		"-p", fmt.Sprintf("%d", e.cfg.Port),
 		"-U", e.cfg.User,
 		"-d", "postgres",
-		"-c", fmt.Sprintf("DROP DATABASE IF EXISTS \"%s\"", dbName),
+		"-c", fmt.Sprintf("REVOKE CONNECT ON DATABASE \"%s\" FROM PUBLIC", dbName),
 	}
 
-	// Only add -h flag if host is not localhost (to use Unix socket for peer auth)
 	if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
-		args = append([]string{"-h", e.cfg.Host}, args...)
+		revokeArgs = append([]string{"-h", e.cfg.Host}, revokeArgs...)
+	}
+	revokeCmd := exec.CommandContext(ctx, "psql", revokeArgs...)
+	revokeCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
+	revokeCmd.Run() // Ignore errors - database might not exist
+
+	// Terminate connections again after revoking connect privilege
+	e.terminateConnections(ctx, dbName)
+	time.Sleep(200 * time.Millisecond)
+
+	// Try DROP DATABASE WITH (FORCE) first (PostgreSQL 13+)
+	// This forcefully terminates connections and drops the database atomically
+	forceArgs := []string{
+		"-p", fmt.Sprintf("%d", e.cfg.Port),
+		"-U", e.cfg.User,
+		"-d", "postgres",
+		"-c", fmt.Sprintf("DROP DATABASE IF EXISTS \"%s\" WITH (FORCE)", dbName),
+	}
+	if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
+		forceArgs = append([]string{"-h", e.cfg.Host}, forceArgs...)
+	}
+	forceCmd := exec.CommandContext(ctx, "psql", forceArgs...)
+	forceCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
+
+	output, err := forceCmd.CombinedOutput()
+	if err == nil {
+		e.log.Info("Dropped existing database (with FORCE)", "name", dbName)
+		return nil
 	}
 
-	cmd := exec.CommandContext(ctx, "psql", args...)
-
-	// Always set PGPASSWORD (empty string is fine for peer/ident auth)
-	cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
-
-	output, err := cmd.CombinedOutput()
-	if err != nil {
+	// If FORCE option failed (PostgreSQL < 13), try regular drop
+	if strings.Contains(string(output), "syntax error") || strings.Contains(string(output), "WITH (FORCE)") {
+		e.log.Debug("WITH (FORCE) not supported, using standard DROP", "name", dbName)
+
+		args := []string{
+			"-p", fmt.Sprintf("%d", e.cfg.Port),
+			"-U", e.cfg.User,
+			"-d", "postgres",
+			"-c", fmt.Sprintf("DROP DATABASE IF EXISTS \"%s\"", dbName),
+		}
+		if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
+			args = append([]string{"-h", e.cfg.Host}, args...)
+		}
+
+		cmd := exec.CommandContext(ctx, "psql", args...)
+		cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
+
+		output, err = cmd.CombinedOutput()
+		if err != nil {
+			return fmt.Errorf("failed to drop database '%s': %w\nOutput: %s", dbName, err, string(output))
+		}
+	} else if err != nil {
 		return fmt.Errorf("failed to drop database '%s': %w\nOutput: %s", dbName, err, string(output))
 	}
@@ -1408,12 +1831,14 @@ func (e *Engine) ensureMySQLDatabaseExists(ctx context.Context, dbName string) e
 }
 
 // ensurePostgresDatabaseExists checks if a PostgreSQL database exists and creates it if not
+// It attempts to extract encoding/locale from the dump file to preserve original settings
 func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string) error {
 	// Skip creation for postgres and template databases - they should already exist
 	if dbName == "postgres" || dbName == "template0" || dbName == "template1" {
 		e.log.Info("Skipping create for system database (assume exists)", "name", dbName)
 		return nil
 	}
 
 	// Build psql command with authentication
 	buildPsqlCmd := func(ctx context.Context, database, query string) *exec.Cmd {
 		args := []string{
@@ -1453,14 +1878,31 @@ func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string
 	// Database doesn't exist, create it
 	// IMPORTANT: Use template0 to avoid duplicate definition errors from local additions to template1
+	// Also use UTF8 encoding explicitly as it's the most common and safest choice
 	// See PostgreSQL docs: https://www.postgresql.org/docs/current/app-pgrestore.html#APP-PGRESTORE-NOTES
-	e.log.Info("Creating database from template0", "name", dbName)
+	e.log.Info("Creating database from template0 with UTF8 encoding", "name", dbName)
+
+	// Get server's default locale for LC_COLLATE and LC_CTYPE
+	// This ensures compatibility while using the correct encoding
+	localeCmd := buildPsqlCmd(ctx, "postgres", "SHOW lc_collate")
+	localeOutput, _ := localeCmd.CombinedOutput()
+	serverLocale := strings.TrimSpace(string(localeOutput))
+	if serverLocale == "" {
+		serverLocale = "en_US.UTF-8" // Fallback to common default
+	}
+
+	// Build CREATE DATABASE command with encoding and locale
+	// Using ENCODING 'UTF8' explicitly ensures the dump can be restored
+	createSQL := fmt.Sprintf(
+		"CREATE DATABASE \"%s\" WITH TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE '%s' LC_CTYPE '%s'",
+		dbName, serverLocale, serverLocale,
+	)
+
 	createArgs := []string{
 		"-p", fmt.Sprintf("%d", e.cfg.Port),
 		"-U", e.cfg.User,
 		"-d", "postgres",
-		"-c", fmt.Sprintf("CREATE DATABASE \"%s\" WITH TEMPLATE template0", dbName),
+		"-c", createSQL,
 	}
 
 	// Only add -h flag if host is not localhost (to use Unix socket for peer auth)
@@ -1475,9 +1917,27 @@ func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string
 
 	output, err = createCmd.CombinedOutput()
 	if err != nil {
-		// Log the error and include the psql output in the returned error to aid debugging
-		e.log.Warn("Database creation failed", "name", dbName, "error", err, "output", string(output))
-		return fmt.Errorf("failed to create database '%s': %w (output: %s)", dbName, err, strings.TrimSpace(string(output)))
+		// If encoding/locale fails, try simpler CREATE DATABASE
+		e.log.Warn("Database creation with encoding failed, trying simple create", "name", dbName, "error", err)
+
+		simpleArgs := []string{
+			"-p", fmt.Sprintf("%d", e.cfg.Port),
+			"-U", e.cfg.User,
+			"-d", "postgres",
+			"-c", fmt.Sprintf("CREATE DATABASE \"%s\" WITH TEMPLATE template0", dbName),
+		}
+		if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
+			simpleArgs = append([]string{"-h", e.cfg.Host}, simpleArgs...)
+		}
+
+		simpleCmd := exec.CommandContext(ctx, "psql", simpleArgs...)
+		simpleCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
+
+		output, err = simpleCmd.CombinedOutput()
+		if err != nil {
+			e.log.Warn("Database creation failed", "name", dbName, "error", err, "output", string(output))
+			return fmt.Errorf("failed to create database '%s': %w (output: %s)", dbName, err, strings.TrimSpace(string(output)))
+		}
 	}
 
 	e.log.Info("Successfully created database from template0", "name", dbName)
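For a concrete sense of what the new path sends to psql: with a database named app and a server locale of en_US.UTF-8, the generated statement looks like this. A standalone sketch of the same fmt.Sprintf call (the sample values are invented for illustration):

package main

import "fmt"

func main() {
	dbName, serverLocale := "app", "en_US.UTF-8" // example values

	// Same format string as the createSQL built above
	createSQL := fmt.Sprintf(
		"CREATE DATABASE \"%s\" WITH TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE '%s' LC_CTYPE '%s'",
		dbName, serverLocale, serverLocale,
	)
	fmt.Println(createSQL)
	// Output:
	// CREATE DATABASE "app" WITH TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE 'en_US.UTF-8' LC_CTYPE 'en_US.UTF-8'
}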
@@ -1665,9 +2125,10 @@ func (e *Engine) quickValidateSQLDump(archivePath string, compressed bool) error
 	return nil
 }
 
-// boostLockCapacity temporarily increases max_locks_per_transaction to prevent OOM
-// during large restores with many BLOBs. Returns the original value for later reset.
-// Uses ALTER SYSTEM + pg_reload_conf() so no restart is needed.
+// boostLockCapacity checks and reports on max_locks_per_transaction capacity.
+// IMPORTANT: max_locks_per_transaction requires a PostgreSQL RESTART to change!
+// This function now calculates total lock capacity based on max_connections and
+// warns the user if capacity is insufficient for the restore.
 func (e *Engine) boostLockCapacity(ctx context.Context) (int, error) {
 	// Connect to PostgreSQL to run system commands
 	connStr := fmt.Sprintf("host=%s port=%d user=%s password=%s dbname=postgres sslmode=disable",
@@ -1685,7 +2146,7 @@ func (e *Engine) boostLockCapacity(ctx context.Context) (int, error) {
 	}
 	defer db.Close()
 
-	// Get current value
+	// Get current max_locks_per_transaction
 	var currentValue int
 	err = db.QueryRowContext(ctx, "SHOW max_locks_per_transaction").Scan(&currentValue)
 	if err != nil {
@@ -1698,22 +2159,56 @@ func (e *Engine) boostLockCapacity(ctx context.Context) (int, error) {
 		fmt.Sscanf(currentValueStr, "%d", &currentValue)
 	}
 
-	// Skip if already high enough
-	if currentValue >= 2048 {
-		e.log.Info("max_locks_per_transaction already sufficient", "value", currentValue)
-		return currentValue, nil
+	// Get max_connections to calculate total lock capacity
+	var maxConns int
+	if err := db.QueryRowContext(ctx, "SHOW max_connections").Scan(&maxConns); err != nil {
+		maxConns = 100 // default
 	}
 
-	// Boost to 2048 (enough for most BLOB-heavy databases)
-	_, err = db.ExecContext(ctx, "ALTER SYSTEM SET max_locks_per_transaction = 2048")
-	if err != nil {
-		return currentValue, fmt.Errorf("failed to set max_locks_per_transaction: %w", err)
+	// Get max_prepared_transactions
+	var maxPreparedTxns int
+	if err := db.QueryRowContext(ctx, "SHOW max_prepared_transactions").Scan(&maxPreparedTxns); err != nil {
+		maxPreparedTxns = 0
 	}
 
-	// Reload config without restart
-	_, err = db.ExecContext(ctx, "SELECT pg_reload_conf()")
-	if err != nil {
-		return currentValue, fmt.Errorf("failed to reload config: %w", err)
+	// Calculate total lock table capacity:
+	// Total locks = max_locks_per_transaction × (max_connections + max_prepared_transactions)
+	totalLockCapacity := currentValue * (maxConns + maxPreparedTxns)
+
+	e.log.Info("PostgreSQL lock table capacity",
+		"max_locks_per_transaction", currentValue,
+		"max_connections", maxConns,
+		"max_prepared_transactions", maxPreparedTxns,
+		"total_lock_capacity", totalLockCapacity)
+
+	// Minimum recommended total capacity for BLOB-heavy restores: 200,000 locks
+	minRecommendedCapacity := 200000
+	if totalLockCapacity < minRecommendedCapacity {
+		recommendedMaxLocks := minRecommendedCapacity / (maxConns + maxPreparedTxns)
+		if recommendedMaxLocks < 4096 {
+			recommendedMaxLocks = 4096
+		}
+
+		e.log.Warn("Lock table capacity may be insufficient for BLOB-heavy restores",
+			"current_total_capacity", totalLockCapacity,
+			"recommended_capacity", minRecommendedCapacity,
+			"current_max_locks", currentValue,
+			"recommended_max_locks", recommendedMaxLocks,
+			"note", "max_locks_per_transaction requires PostgreSQL RESTART to change")
+
+		// Write suggested fix to ALTER SYSTEM but warn about restart
+		_, err = db.ExecContext(ctx, fmt.Sprintf("ALTER SYSTEM SET max_locks_per_transaction = %d", recommendedMaxLocks))
+		if err != nil {
+			e.log.Warn("Could not set recommended max_locks_per_transaction (needs superuser)", "error", err)
+		} else {
+			e.log.Warn("Wrote recommended max_locks_per_transaction to postgresql.auto.conf",
+				"value", recommendedMaxLocks,
+				"action", "RESTART PostgreSQL to apply: sudo systemctl restart postgresql")
+		}
+	} else {
+		e.log.Info("Lock table capacity is sufficient",
+			"total_capacity", totalLockCapacity,
+			"max_locks_per_transaction", currentValue)
 	}
 
 	return currentValue, nil
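The arithmetic behind these warnings is worth spelling out: with PostgreSQL's shipped defaults the lock table is far below the 200,000-lock threshold used above. A worked example using the same formulas (values are the stock defaults, chosen for illustration):

package main

import "fmt"

func main() {
	// PostgreSQL shipped defaults
	maxLocks, maxConns, maxPrepared := 64, 100, 0

	// Total locks = max_locks_per_transaction × (max_connections + max_prepared_transactions)
	total := maxLocks * (maxConns + maxPrepared)
	fmt.Println("total lock capacity:", total) // 6400 - far below 200000

	// Recommendation per the logic above: capacity target / connections, floored at 4096
	recommended := 200000 / (maxConns + maxPrepared) // 2000
	if recommended < 4096 {
		recommended = 4096
	}
	fmt.Println("recommended max_locks_per_transaction:", recommended) // 4096
}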
@@ -1761,6 +2256,8 @@ type OriginalSettings struct {
 }
 
 // boostPostgreSQLSettings boosts multiple PostgreSQL settings for large restores
+// NOTE: max_locks_per_transaction requires a PostgreSQL RESTART to take effect!
+// maintenance_work_mem can be changed with pg_reload_conf().
 func (e *Engine) boostPostgreSQLSettings(ctx context.Context, lockBoostValue int) (*OriginalSettings, error) {
 	connStr := e.buildConnString()
 	db, err := sql.Open("pgx", connStr)
@@ -1780,30 +2277,156 @@ func (e *Engine) boostPostgreSQLSettings(ctx context.Context, lockBoostValue int
 	// Get current maintenance_work_mem
 	db.QueryRowContext(ctx, "SHOW maintenance_work_mem").Scan(&original.MaintenanceWorkMem)
 
-	// Boost max_locks_per_transaction (if not already high enough)
+	// CRITICAL: max_locks_per_transaction requires a PostgreSQL RESTART!
+	// pg_reload_conf() is NOT sufficient for this parameter.
+	needsRestart := false
 	if original.MaxLocks < lockBoostValue {
 		_, err = db.ExecContext(ctx, fmt.Sprintf("ALTER SYSTEM SET max_locks_per_transaction = %d", lockBoostValue))
 		if err != nil {
-			e.log.Warn("Could not boost max_locks_per_transaction", "error", err)
+			e.log.Warn("Could not set max_locks_per_transaction", "error", err)
+		} else {
+			needsRestart = true
+			e.log.Warn("max_locks_per_transaction requires PostgreSQL restart to take effect",
+				"current", original.MaxLocks,
+				"target", lockBoostValue)
 		}
 	}
 
 	// Boost maintenance_work_mem to 2GB for faster index creation
+	// (this one CAN be applied via pg_reload_conf)
 	_, err = db.ExecContext(ctx, "ALTER SYSTEM SET maintenance_work_mem = '2GB'")
 	if err != nil {
 		e.log.Warn("Could not boost maintenance_work_mem", "error", err)
 	}
 
-	// Reload config to apply changes (no restart needed for these settings)
+	// Reload config to apply maintenance_work_mem
 	_, err = db.ExecContext(ctx, "SELECT pg_reload_conf()")
 	if err != nil {
 		return original, fmt.Errorf("failed to reload config: %w", err)
 	}
 
+	// If max_locks_per_transaction needs a restart, try to do it
+	if needsRestart {
+		if restarted := e.tryRestartPostgreSQL(ctx); restarted {
+			e.log.Info("PostgreSQL restarted successfully - max_locks_per_transaction now active")
+			// Wait for PostgreSQL to be ready
+			time.Sleep(3 * time.Second)
+		} else {
+			// Cannot restart - warn user but continue
+			// The setting is written to postgresql.auto.conf and will take effect on next restart
+			e.log.Warn("=" + strings.Repeat("=", 70))
+			e.log.Warn("NOTE: max_locks_per_transaction change requires PostgreSQL restart")
+			e.log.Warn("Current value: " + strconv.Itoa(original.MaxLocks) + ", target: " + strconv.Itoa(lockBoostValue))
+			e.log.Warn("")
+			e.log.Warn("The setting has been saved to postgresql.auto.conf and will take")
+			e.log.Warn("effect on the next PostgreSQL restart. If restore fails with")
+			e.log.Warn("'out of shared memory' errors, ask your DBA to restart PostgreSQL.")
+			e.log.Warn("")
+			e.log.Warn("Continuing with restore - this may succeed if your databases")
+			e.log.Warn("don't have many large objects (BLOBs).")
+			e.log.Warn("=" + strings.Repeat("=", 70))
+			// Continue anyway - might work for small restores or DBs without BLOBs
+		}
+	}
+
 	return original, nil
 }
 
+// canRestartPostgreSQL checks if we have the ability to restart PostgreSQL
+// Returns false if running in a restricted environment (e.g., su postgres on enterprise systems)
+func (e *Engine) canRestartPostgreSQL() bool {
+	// Check if we're running as postgres user - if so, we likely can't restart
+	// because PostgreSQL is managed by init/systemd, not directly by pg_ctl
+	currentUser := os.Getenv("USER")
+	if currentUser == "" {
+		currentUser = os.Getenv("LOGNAME")
+	}
+
+	// If we're the postgres user, check if we have sudo access
+	if currentUser == "postgres" {
+		// Try a quick sudo check - if this fails, we can't restart
+		ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
+		defer cancel()
+		cmd := exec.CommandContext(ctx, "sudo", "-n", "true")
+		cmd.Stdin = nil
+		if err := cmd.Run(); err != nil {
+			e.log.Info("Running as postgres user without sudo access - cannot restart PostgreSQL",
+				"user", currentUser,
+				"hint", "Ask system administrator to restart PostgreSQL if needed")
+			return false
+		}
+	}
+
+	return true
+}
+
+// tryRestartPostgreSQL attempts to restart PostgreSQL using various methods
+// Returns true if restart was successful
+// IMPORTANT: Uses short timeouts and non-interactive sudo to avoid blocking on password prompts
+// NOTE: This function will return false immediately if running as postgres without sudo
+func (e *Engine) tryRestartPostgreSQL(ctx context.Context) bool {
+	// First check if we can even attempt a restart
+	if !e.canRestartPostgreSQL() {
+		e.log.Info("Skipping PostgreSQL restart attempt (no privileges)")
+		return false
+	}
+
+	e.progress.Update("Attempting PostgreSQL restart for lock settings...")
+
+	// Use short timeout for each restart attempt (don't block on sudo password prompts)
+	runWithTimeout := func(args ...string) bool {
+		cmdCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+		defer cancel()
+		cmd := exec.CommandContext(cmdCtx, args[0], args[1:]...)
+		// Set stdin to /dev/null to prevent sudo from waiting for password
		cmd.Stdin = nil
+		return cmd.Run() == nil
+	}
+
+	// Method 1: systemctl (most common on modern Linux) - use sudo -n for non-interactive
+	if runWithTimeout("sudo", "-n", "systemctl", "restart", "postgresql") {
+		return true
+	}
+
+	// Method 2: systemctl with version suffix (e.g., postgresql-15)
+	for _, ver := range []string{"17", "16", "15", "14", "13", "12"} {
+		if runWithTimeout("sudo", "-n", "systemctl", "restart", "postgresql-"+ver) {
+			return true
+		}
+	}
+
+	// Method 3: service command (older systems)
+	if runWithTimeout("sudo", "-n", "service", "postgresql", "restart") {
+		return true
+	}
+
+	// Method 4: pg_ctl as postgres user (if we ARE postgres user, no sudo needed)
+	if runWithTimeout("pg_ctl", "restart", "-D", "/var/lib/postgresql/data", "-m", "fast") {
+		return true
+	}
+
+	// Method 5: Try common PGDATA paths with pg_ctl directly (for postgres user)
+	pgdataPaths := []string{
+		"/var/lib/pgsql/data",
+		"/var/lib/pgsql/17/data",
+		"/var/lib/pgsql/16/data",
+		"/var/lib/pgsql/15/data",
+		"/var/lib/postgresql/17/main",
+		"/var/lib/postgresql/16/main",
+		"/var/lib/postgresql/15/main",
+	}
+	for _, pgdata := range pgdataPaths {
+		if runWithTimeout("pg_ctl", "restart", "-D", pgdata, "-m", "fast") {
+			return true
+		}
+	}
+
+	return false
+}
+
 // resetPostgreSQLSettings restores original PostgreSQL settings
+// NOTE: max_locks_per_transaction changes are written but require restart to take effect.
+// We don't restart here since we're done with the restore.
 func (e *Engine) resetPostgreSQLSettings(ctx context.Context, original *OriginalSettings) error {
 	connStr := e.buildConnString()
 	db, err := sql.Open("pgx", connStr)
@@ -1812,25 +2435,28 @@ func (e *Engine) resetPostgreSQLSettings(ctx context.Context, original *Original
 	}
 	defer db.Close()
 
-	// Reset max_locks_per_transaction
+	// Reset max_locks_per_transaction (will take effect on next restart)
 	if original.MaxLocks == 64 { // Default
 		db.ExecContext(ctx, "ALTER SYSTEM RESET max_locks_per_transaction")
 	} else if original.MaxLocks > 0 {
 		db.ExecContext(ctx, fmt.Sprintf("ALTER SYSTEM SET max_locks_per_transaction = %d", original.MaxLocks))
 	}
 
-	// Reset maintenance_work_mem
+	// Reset maintenance_work_mem (takes effect immediately with reload)
 	if original.MaintenanceWorkMem == "64MB" { // Default
 		db.ExecContext(ctx, "ALTER SYSTEM RESET maintenance_work_mem")
 	} else if original.MaintenanceWorkMem != "" {
 		db.ExecContext(ctx, fmt.Sprintf("ALTER SYSTEM SET maintenance_work_mem = '%s'", original.MaintenanceWorkMem))
 	}
 
-	// Reload config
+	// Reload config (only maintenance_work_mem will take effect immediately)
 	_, err = db.ExecContext(ctx, "SELECT pg_reload_conf()")
 	if err != nil {
		return fmt.Errorf("failed to reload config: %w", err)
 	}
 
+	e.log.Info("PostgreSQL settings reset queued",
+		"note", "max_locks_per_transaction will revert on next PostgreSQL restart")
+
 	return nil
 }
@@ -16,6 +16,57 @@ import (
 	"github.com/shirou/gopsutil/v3/mem"
 )
 
+// CalculateOptimalParallel returns the recommended number of parallel workers
+// based on available system resources (CPU cores and RAM).
+// This is a standalone function that can be called from anywhere.
+// It always returns at least 1, falling back to CPU count alone if memory
+// cannot be detected.
+func CalculateOptimalParallel() int {
+	cpuCores := runtime.NumCPU()
+
+	vmem, err := mem.VirtualMemory()
+	if err != nil {
+		// Fallback: use half of CPU cores if memory detection fails
+		if cpuCores > 1 {
+			return cpuCores / 2
+		}
+		return 1
+	}
+
+	memAvailableGB := float64(vmem.Available) / (1024 * 1024 * 1024)
+
+	// Each pg_restore worker needs approximately 2-4GB of RAM
+	// Use conservative 3GB per worker to avoid OOM
+	const memPerWorkerGB = 3.0
+
+	// Calculate limits
+	maxByMem := int(memAvailableGB / memPerWorkerGB)
+	maxByCPU := cpuCores
+
+	// Use the minimum of memory and CPU limits
+	recommended := maxByMem
+	if maxByCPU < recommended {
+		recommended = maxByCPU
+	}
+
+	// Apply sensible bounds
+	if recommended < 1 {
+		recommended = 1
+	}
+	if recommended > 16 {
+		recommended = 16 // Cap at 16 to avoid diminishing returns
+	}
+
+	// If memory pressure is high (>80%), reduce parallelism
+	if vmem.UsedPercent > 80 && recommended > 1 {
+		recommended = recommended / 2
+		if recommended < 1 {
+			recommended = 1
+		}
+	}
+
+	return recommended
+}
+
 // PreflightResult contains all preflight check results
 type PreflightResult struct {
 	// Linux system checks
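To make the sizing rule concrete: the worker count is min(CPU cores, available RAM / 3 GB), clamped to [1, 16] and halved under memory pressure. A worked example with invented machine specs (a standalone re-statement of the rule, not the function above):

package main

import "fmt"

// workers applies the same sizing rule as CalculateOptimalParallel.
func workers(cpuCores int, memAvailGB, usedPercent float64) int {
	const memPerWorkerGB = 3.0
	n := int(memAvailGB / memPerWorkerGB)
	if cpuCores < n {
		n = cpuCores
	}
	if n < 1 {
		n = 1
	}
	if n > 16 {
		n = 16
	}
	if usedPercent > 80 && n > 1 {
		n /= 2
		if n < 1 {
			n = 1
		}
	}
	return n
}

func main() {
	fmt.Println(workers(8, 20, 40))   // min(8, 6) = 6
	fmt.Println(workers(32, 128, 40)) // min(32, 42) = 32, capped at 16
	fmt.Println(workers(4, 10, 90))   // min(4, 3) = 3, halved under pressure = 1
}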
@@ -35,25 +86,29 @@ type PreflightResult struct {
 
 // LinuxChecks contains Linux kernel/system checks
 type LinuxChecks struct {
 	ShmMax         int64   // /proc/sys/kernel/shmmax
 	ShmAll         int64   // /proc/sys/kernel/shmall
 	MemTotal       uint64  // Total RAM in bytes
 	MemAvailable   uint64  // Available RAM in bytes
 	MemUsedPercent float64 // Memory usage percentage
+	CPUCores            int // Number of CPU cores
+	RecommendedParallel int // Auto-calculated optimal parallel count
 	ShmMaxOK       bool // Is shmmax sufficient?
 	ShmAllOK       bool // Is shmall sufficient?
 	MemAvailableOK bool // Is available RAM sufficient?
 	IsLinux        bool // Are we running on Linux?
 }
 
 // PostgreSQLChecks contains PostgreSQL configuration checks
 type PostgreSQLChecks struct {
 	MaxLocksPerTransaction int // Current setting
+	MaxPreparedTransactions int // Current setting (affects lock capacity)
+	TotalLockCapacity       int // Calculated: max_locks × (max_connections + max_prepared)
 	MaintenanceWorkMem string // Current setting
 	SharedBuffers      string // Current setting (info only)
 	MaxConnections     int    // Current setting
 	Version            string // PostgreSQL version
 	IsSuperuser        bool   // Can we modify settings?
 }
 
 // ArchiveChecks contains analysis of the backup archive
@@ -98,6 +153,7 @@ func (e *Engine) RunPreflightChecks(ctx context.Context, dumpsDir string, entrie
 // checkSystemResources uses gopsutil for cross-platform system checks
 func (e *Engine) checkSystemResources(result *PreflightResult) {
 	result.Linux.IsLinux = runtime.GOOS == "linux"
+	result.Linux.CPUCores = runtime.NumCPU()
 
 	// Get memory info (works on Linux, macOS, Windows, BSD)
 	if vmem, err := mem.VirtualMemory(); err == nil {
@@ -116,6 +172,9 @@ func (e *Engine) checkSystemResources(result *PreflightResult) {
 		e.log.Warn("Could not detect system memory", "error", err)
 	}
 
+	// Calculate recommended parallel based on resources
+	result.Linux.RecommendedParallel = e.calculateRecommendedParallel(result)
+
 	// Linux-specific kernel checks (shmmax, shmall)
 	if result.Linux.IsLinux {
 		e.checkLinuxKernel(result)
@@ -201,10 +260,70 @@ func (e *Engine) checkPostgreSQL(ctx context.Context, result *PreflightResult) {
 		result.PostgreSQL.IsSuperuser = isSuperuser
 	}
 
-	// Add info/warnings
+	// Check max_prepared_transactions for lock capacity calculation
+	var maxPreparedTxns string
+	if err := db.QueryRowContext(ctx, "SHOW max_prepared_transactions").Scan(&maxPreparedTxns); err == nil {
+		result.PostgreSQL.MaxPreparedTransactions, _ = strconv.Atoi(maxPreparedTxns)
+	}
+
+	// CRITICAL: Calculate TOTAL lock table capacity
+	// Formula: max_locks_per_transaction × (max_connections + max_prepared_transactions)
+	// This is THE key capacity metric for BLOB-heavy restores
+	maxConns := result.PostgreSQL.MaxConnections
+	if maxConns == 0 {
+		maxConns = 100 // default
+	}
+	maxPrepared := result.PostgreSQL.MaxPreparedTransactions
+	totalLockCapacity := result.PostgreSQL.MaxLocksPerTransaction * (maxConns + maxPrepared)
+	result.PostgreSQL.TotalLockCapacity = totalLockCapacity
+
+	e.log.Info("PostgreSQL lock table capacity",
+		"max_locks_per_transaction", result.PostgreSQL.MaxLocksPerTransaction,
+		"max_connections", maxConns,
+		"max_prepared_transactions", maxPrepared,
+		"total_lock_capacity", totalLockCapacity)
+
+	// CRITICAL: max_locks_per_transaction requires PostgreSQL RESTART to change!
+	// Warn users loudly about this - it's the #1 cause of "out of shared memory" errors
 	if result.PostgreSQL.MaxLocksPerTransaction < 256 {
-		e.log.Info("PostgreSQL max_locks_per_transaction is low - will auto-boost",
-			"current", result.PostgreSQL.MaxLocksPerTransaction)
+		e.log.Warn("PostgreSQL max_locks_per_transaction is LOW",
+			"current", result.PostgreSQL.MaxLocksPerTransaction,
+			"recommended", "256+",
+			"note", "REQUIRES PostgreSQL restart to change!")
+
+		result.Warnings = append(result.Warnings,
+			fmt.Sprintf("max_locks_per_transaction=%d is low (recommend 256+). "+
+				"This setting requires PostgreSQL RESTART to change. "+
+				"BLOB-heavy databases may fail with 'out of shared memory' error. "+
+				"Fix: Edit postgresql.conf, set max_locks_per_transaction=2048, then restart PostgreSQL.",
+				result.PostgreSQL.MaxLocksPerTransaction))
+	}
+
+	// NEW: Check total lock capacity is sufficient for typical BLOB operations
+	// Minimum recommended: 200,000 for moderate BLOB databases
+	minRecommendedCapacity := 200000
+	if totalLockCapacity < minRecommendedCapacity {
+		recommendedMaxLocks := minRecommendedCapacity / (maxConns + maxPrepared)
+		if recommendedMaxLocks < 4096 {
+			recommendedMaxLocks = 4096
+		}
+
+		e.log.Warn("Total lock table capacity is LOW for BLOB-heavy restores",
+			"current_capacity", totalLockCapacity,
+			"recommended", minRecommendedCapacity,
+			"current_max_locks", result.PostgreSQL.MaxLocksPerTransaction,
+			"current_max_connections", maxConns,
+			"recommended_max_locks", recommendedMaxLocks,
+			"note", "VMs with fewer connections need higher max_locks_per_transaction")
+
+		result.Warnings = append(result.Warnings,
+			fmt.Sprintf("Total lock capacity=%d is low (recommend %d+). "+
+				"Capacity = max_locks_per_transaction(%d) × max_connections(%d). "+
+				"If you reduced VM size/connections, increase max_locks_per_transaction to %d. "+
+				"Fix: ALTER SYSTEM SET max_locks_per_transaction = %d; then RESTART PostgreSQL.",
+				totalLockCapacity, minRecommendedCapacity,
+				result.PostgreSQL.MaxLocksPerTransaction, maxConns,
+				recommendedMaxLocks, recommendedMaxLocks))
 	}
 
 	// Parse shared_buffers and warn if very low
@@ -315,20 +434,113 @@ func (e *Engine) calculateRecommendations(result *PreflightResult) {
 	if result.Archive.TotalBlobCount > 50000 {
 		lockBoost = 16384
 	}
+	if result.Archive.TotalBlobCount > 100000 {
+		lockBoost = 32768
+	}
+	if result.Archive.TotalBlobCount > 200000 {
+		lockBoost = 65536
+	}
 
-	// Cap at reasonable maximum
-	if lockBoost > 16384 {
-		lockBoost = 16384
+	// For extreme cases, calculate actual requirement
+	// Rule of thumb: ~1 lock per BLOB, divided by max_connections (default 100)
+	// Add 50% safety margin
+	maxConns := result.PostgreSQL.MaxConnections
+	if maxConns == 0 {
+		maxConns = 100 // default
+	}
+	calculatedLocks := (result.Archive.TotalBlobCount / maxConns) * 3 / 2 // 1.5x safety margin
+	if calculatedLocks > lockBoost {
+		lockBoost = calculatedLocks
 	}
 
 	result.Archive.RecommendedLockBoost = lockBoost
 
+	// CRITICAL: Check if current max_locks_per_transaction is dangerously low for this BLOB count
+	currentLocks := result.PostgreSQL.MaxLocksPerTransaction
+	if currentLocks > 0 && result.Archive.TotalBlobCount > 0 {
+		// Estimate max BLOBs we can handle: locks * max_connections
+		maxSafeBLOBs := currentLocks * maxConns
+
+		if result.Archive.TotalBlobCount > maxSafeBLOBs {
+			severity := "WARNING"
+			if result.Archive.TotalBlobCount > maxSafeBLOBs*2 {
+				severity = "CRITICAL"
+				result.CanProceed = false
+			}
+
+			e.log.Error(fmt.Sprintf("%s: max_locks_per_transaction too low for BLOB count", severity),
+				"current_max_locks", currentLocks,
+				"total_blobs", result.Archive.TotalBlobCount,
+				"max_safe_blobs", maxSafeBLOBs,
+				"recommended_max_locks", lockBoost)
+
+			result.Errors = append(result.Errors,
+				fmt.Sprintf("%s: Archive contains %s BLOBs but max_locks_per_transaction=%d can only safely handle ~%s. "+
+					"Increase max_locks_per_transaction to %d in postgresql.conf and RESTART PostgreSQL.",
+					severity,
+					humanize.Comma(int64(result.Archive.TotalBlobCount)),
+					currentLocks,
+					humanize.Comma(int64(maxSafeBLOBs)),
+					lockBoost))
+		}
+	}
+
 	// Log recommendation
 	e.log.Info("Calculated recommended lock boost",
 		"total_blobs", result.Archive.TotalBlobCount,
 		"recommended_locks", lockBoost)
 }
 
+// calculateRecommendedParallel determines optimal parallelism based on system resources
+// Returns the recommended number of parallel workers for pg_restore
+func (e *Engine) calculateRecommendedParallel(result *PreflightResult) int {
+	cpuCores := result.Linux.CPUCores
+	if cpuCores == 0 {
+		cpuCores = runtime.NumCPU()
+	}
+
+	memAvailableGB := float64(result.Linux.MemAvailable) / (1024 * 1024 * 1024)
+
+	// Each pg_restore worker needs approximately 2-4GB of RAM
+	// Use conservative 3GB per worker to avoid OOM
+	const memPerWorkerGB = 3.0
+
+	// Calculate limits
+	maxByMem := int(memAvailableGB / memPerWorkerGB)
+	maxByCPU := cpuCores
+
+	// Use the minimum of memory and CPU limits
+	recommended := maxByMem
+	if maxByCPU < recommended {
+		recommended = maxByCPU
+	}
+
+	// Apply sensible bounds
+	if recommended < 1 {
+		recommended = 1
+	}
+	if recommended > 16 {
+		recommended = 16 // Cap at 16 to avoid diminishing returns
+	}
+
+	// If memory pressure is high (>80%), reduce parallelism
+	if result.Linux.MemUsedPercent > 80 && recommended > 1 {
+		recommended = recommended / 2
+		if recommended < 1 {
+			recommended = 1
+		}
+	}
+
+	e.log.Info("Calculated recommended parallel",
+		"cpu_cores", cpuCores,
+		"mem_available_gb", fmt.Sprintf("%.1f", memAvailableGB),
+		"max_by_mem", maxByMem,
+		"max_by_cpu", maxByCPU,
+		"recommended", recommended)
+
+	return recommended
+}
+
 // printPreflightSummary prints a nice summary of all checks
 func (e *Engine) printPreflightSummary(result *PreflightResult) {
 	fmt.Println()
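The BLOB heuristics above reduce to simple arithmetic. For example, an archive with 300,000 large objects against default settings triggers the CRITICAL path — a sketch using the same formulas (values invented for illustration):

package main

import "fmt"

func main() {
	totalBlobs, maxConns, currentLocks := 300000, 100, 64 // example values

	// Tiered boost from the thresholds above: >200000 BLOBs -> 65536 ...
	lockBoost := 65536
	// ... unless the rule-of-thumb requirement is even higher (4500 here, so it isn't)
	if calc := (totalBlobs / maxConns) * 3 / 2; calc > lockBoost {
		lockBoost = calc
	}

	maxSafeBLOBs := currentLocks * maxConns // 6400
	fmt.Println("recommended boost:", lockBoost)           // 65536
	fmt.Println("safe BLOB estimate today:", maxSafeBLOBs) // 6400
	fmt.Println("blocking?", totalBlobs > maxSafeBLOBs*2)  // true -> CanProceed = false
}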
@@ -341,6 +553,8 @@ func (e *Engine) printPreflightSummary(result *PreflightResult) {
 	printCheck("Total RAM", humanize.Bytes(result.Linux.MemTotal), true)
 	printCheck("Available RAM", humanize.Bytes(result.Linux.MemAvailable), result.Linux.MemAvailableOK || result.Linux.MemAvailable == 0)
 	printCheck("Memory Usage", fmt.Sprintf("%.1f%%", result.Linux.MemUsedPercent), result.Linux.MemUsedPercent < 85)
+	printCheck("CPU Cores", fmt.Sprintf("%d", result.Linux.CPUCores), true)
+	printCheck("Recommended Parallel", fmt.Sprintf("%d (auto-calculated)", result.Linux.RecommendedParallel), true)
 
 	// Linux-specific kernel checks
 	if result.Linux.IsLinux && result.Linux.ShmMax > 0 {
@@ -356,6 +570,13 @@ func (e *Engine) printPreflightSummary(result *PreflightResult) {
 		humanize.Comma(int64(result.PostgreSQL.MaxLocksPerTransaction)),
 		humanize.Comma(int64(result.Archive.RecommendedLockBoost))),
 		true)
+	printCheck("max_connections", humanize.Comma(int64(result.PostgreSQL.MaxConnections)), true)
+	// Show total lock capacity with warning if low
+	totalCapacityOK := result.PostgreSQL.TotalLockCapacity >= 200000
+	printCheck("Total Lock Capacity",
+		fmt.Sprintf("%s (max_locks × max_conns)",
+			humanize.Comma(int64(result.PostgreSQL.TotalLockCapacity))),
+		totalCapacityOK)
 	printCheck("maintenance_work_mem", fmt.Sprintf("%s → 2GB (auto-boost)",
 		result.PostgreSQL.MaintenanceWorkMem), true)
 	printInfo("shared_buffers", result.PostgreSQL.SharedBuffers)
@@ -377,6 +598,14 @@ func (e *Engine) printPreflightSummary(result *PreflightResult) {
 		}
 	}
 
+	// Errors (blocking issues)
+	if len(result.Errors) > 0 {
+		fmt.Println("\n ✗ ERRORS (must fix before proceeding):")
+		for _, e := range result.Errors {
+			fmt.Printf(" • %s\n", e)
+		}
+	}
+
 	// Warnings
 	if len(result.Warnings) > 0 {
 		fmt.Println("\n ⚠ Warnings:")
@@ -385,6 +614,23 @@ func (e *Engine) printPreflightSummary(result *PreflightResult) {
 		}
 	}
 
+	// Final status
+	fmt.Println()
+	if !result.CanProceed {
+		fmt.Println(" ┌─────────────────────────────────────────────────────────┐")
+		fmt.Println(" │ ✗ PREFLIGHT FAILED - Cannot proceed with restore │")
+		fmt.Println(" │ Fix the errors above and try again. │")
+		fmt.Println(" └─────────────────────────────────────────────────────────┘")
+	} else if len(result.Warnings) > 0 {
+		fmt.Println(" ┌─────────────────────────────────────────────────────────┐")
+		fmt.Println(" │ ⚠ PREFLIGHT PASSED WITH WARNINGS - Proceed with care │")
+		fmt.Println(" └─────────────────────────────────────────────────────────┘")
+	} else {
+		fmt.Println(" ┌─────────────────────────────────────────────────────────┐")
+		fmt.Println(" │ ✓ PREFLIGHT PASSED - Ready to restore │")
+		fmt.Println(" └─────────────────────────────────────────────────────────┘")
+	}
+
 	fmt.Println(strings.Repeat("─", 60))
 	fmt.Println()
 }
285
internal/tui/backup_exec.go
Executable file → Normal file
@@ -4,6 +4,7 @@ import (
 	"context"
 	"fmt"
 	"strings"
+	"sync"
 	"time"
 
 	tea "github.com/charmbracelet/bubbletea"
@@ -33,6 +34,61 @@ type BackupExecutionModel struct {
 	startTime    time.Time
 	details      []string
 	spinnerFrame int
+
+	// Database count progress (for cluster backup)
+	dbTotal      int
+	dbDone       int
+	dbName       string // Current database being backed up
+	overallPhase int    // 1=globals, 2=databases, 3=compressing
+	phaseDesc    string // Description of current phase
+}
+
+// sharedBackupProgressState holds progress state that can be safely accessed from callbacks
+type sharedBackupProgressState struct {
+	mu           sync.Mutex
+	dbTotal      int
+	dbDone       int
+	dbName       string
+	overallPhase int    // 1=globals, 2=databases, 3=compressing
+	phaseDesc    string // Description of current phase
+	hasUpdate    bool
+}
+
+// Package-level shared progress state for backup operations
+var (
+	currentBackupProgressMu    sync.Mutex
+	currentBackupProgressState *sharedBackupProgressState
+)
+
+func setCurrentBackupProgress(state *sharedBackupProgressState) {
+	currentBackupProgressMu.Lock()
+	defer currentBackupProgressMu.Unlock()
+	currentBackupProgressState = state
+}
+
+func clearCurrentBackupProgress() {
+	currentBackupProgressMu.Lock()
+	defer currentBackupProgressMu.Unlock()
+	currentBackupProgressState = nil
+}
+
+func getCurrentBackupProgress() (dbTotal, dbDone int, dbName string, overallPhase int, phaseDesc string, hasUpdate bool) {
+	currentBackupProgressMu.Lock()
+	defer currentBackupProgressMu.Unlock()
+
+	if currentBackupProgressState == nil {
+		return 0, 0, "", 0, "", false
+	}
+
+	currentBackupProgressState.mu.Lock()
+	defer currentBackupProgressState.mu.Unlock()
+
+	hasUpdate = currentBackupProgressState.hasUpdate
+	currentBackupProgressState.hasUpdate = false
+
+	return currentBackupProgressState.dbTotal, currentBackupProgressState.dbDone,
+		currentBackupProgressState.dbName, currentBackupProgressState.overallPhase,
+		currentBackupProgressState.phaseDesc, hasUpdate
 }
 
 func NewBackupExecution(cfg *config.Config, log logger.Logger, parent tea.Model, ctx context.Context, backupType, dbName string, ratio int) BackupExecutionModel {
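The pattern here — engine callbacks write into a mutex-guarded struct, and the TUI tick polls it via getCurrentBackupProgress with a has-update flag — keeps backup goroutines decoupled from Bubbletea's message loop. A self-contained sketch of the same idea (types simplified and renamed; not the actual code above):

package main

import (
	"fmt"
	"sync"
	"time"
)

type progress struct {
	mu        sync.Mutex
	done      int
	hasUpdate bool
}

// set is what an engine callback would do.
func (p *progress) set(done int) {
	p.mu.Lock()
	p.done, p.hasUpdate = done, true
	p.mu.Unlock()
}

// poll returns the latest value and clears the flag, like getCurrentBackupProgress.
func (p *progress) poll() (int, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	upd := p.hasUpdate
	p.hasUpdate = false
	return p.done, upd
}

func main() {
	var p progress
	go func() { // stands in for the backup engine's callback
		for i := 1; i <= 3; i++ {
			time.Sleep(50 * time.Millisecond)
			p.set(i)
		}
	}()
	for i := 0; i < 5; i++ { // stands in for the TUI tick
		time.Sleep(60 * time.Millisecond)
		if done, ok := p.poll(); ok {
			fmt.Println("progress:", done)
		}
	}
}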
@@ -55,7 +111,6 @@ func NewBackupExecution(cfg *config.Config, log logger.Logger, parent tea.Model,
 }
 
 func (m BackupExecutionModel) Init() tea.Cmd {
-	// TUI handles all display through View() - no progress callbacks needed
 	return tea.Batch(
 		executeBackupWithTUIProgress(m.ctx, m.config, m.logger, m.backupType, m.databaseName, m.ratio),
 		backupTickCmd(),
@@ -91,6 +146,11 @@ func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config,
 
 	start := time.Now()
 
+	// Setup shared progress state for TUI polling
+	progressState := &sharedBackupProgressState{}
+	setCurrentBackupProgress(progressState)
+	defer clearCurrentBackupProgress()
+
 	dbClient, err := database.New(cfg, log)
 	if err != nil {
 		return backupCompleteMsg{
@@ -110,6 +170,18 @@ func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config,
 	// Pass nil as indicator - TUI itself handles all display, no stdout printing
 	engine := backup.NewSilent(cfg, log, dbClient, nil)
 
+	// Set database progress callback for cluster backups
+	engine.SetDatabaseProgressCallback(func(done, total int, currentDB string) {
+		progressState.mu.Lock()
+		progressState.dbDone = done
+		progressState.dbTotal = total
+		progressState.dbName = currentDB
+		progressState.overallPhase = 2 // Phase 2: Backing up databases
+		progressState.phaseDesc = fmt.Sprintf("Phase 2/3: Databases (%d/%d)", done, total)
+		progressState.hasUpdate = true
+		progressState.mu.Unlock()
+	})
+
 	var backupErr error
 	switch backupType {
 	case "single":
@@ -157,10 +229,23 @@ func (m BackupExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		// Increment spinner frame for smooth animation
 		m.spinnerFrame = (m.spinnerFrame + 1) % len(spinnerFrames)
 
-		// Update status based on elapsed time to show progress
+		// Poll for database progress updates from callbacks
+		dbTotal, dbDone, dbName, overallPhase, phaseDesc, hasUpdate := getCurrentBackupProgress()
+		if hasUpdate {
+			m.dbTotal = dbTotal
+			m.dbDone = dbDone
+			m.dbName = dbName
+			m.overallPhase = overallPhase
+			m.phaseDesc = phaseDesc
+		}
+
+		// Update status based on progress and elapsed time
 		elapsedSec := int(time.Since(m.startTime).Seconds())
 
-		if elapsedSec < 2 {
+		if m.dbTotal > 0 && m.dbDone > 0 {
+			// We have real progress from cluster backup
+			m.status = fmt.Sprintf("Backing up database: %s", m.dbName)
+		} else if elapsedSec < 2 {
 			m.status = "Initializing backup..."
 		} else if elapsedSec < 5 {
 			if m.backupType == "cluster" {
@@ -210,6 +295,20 @@ func (m BackupExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 		}
 		return m, nil
 
+	case tea.InterruptMsg:
+		// Handle Ctrl+C signal (SIGINT) - Bubbletea v1.3+ sends this instead of KeyMsg for ctrl+c
+		if !m.done && !m.cancelling {
+			m.cancelling = true
+			m.status = "[STOP] Cancelling backup... (please wait)"
+			if m.cancel != nil {
+				m.cancel()
+			}
+			return m, nil
+		} else if m.done {
+			return m.parent, tea.Quit
+		}
+		return m, nil
+
 	case tea.KeyMsg:
 		switch msg.String() {
 		case "ctrl+c", "esc":
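For reference, the minimal shape of this fix: a Bubbletea model must now branch on tea.InterruptMsg in addition to the legacy "ctrl+c" key string. A stripped-down sketch under the assumption of Bubbletea v1.3+ (the model and its cancel field are hypothetical, not this repository's types):

package main

import tea "github.com/charmbracelet/bubbletea"

type model struct {
	cancel     func() // cancels the running operation's context
	cancelling bool
}

func (m model) Init() tea.Cmd { return nil }

func (m model) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
	switch msg := msg.(type) {
	case tea.InterruptMsg: // SIGINT delivered as a message in Bubbletea v1.3+
		if !m.cancelling {
			m.cancelling = true
			if m.cancel != nil {
				m.cancel()
			}
		}
		return m, nil // stay alive so cleanup can finish
	case tea.KeyMsg: // terminal Ctrl+C may still arrive as a key event
		if msg.String() == "ctrl+c" {
			if m.cancel != nil {
				m.cancel()
			}
			return m, tea.Quit
		}
	}
	return m, nil
}

func (m model) View() string { return "running... Ctrl+C to cancel\n" }

func main() {
	if _, err := tea.NewProgram(model{}).Run(); err != nil {
		panic(err)
	}
}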
@@ -234,6 +333,34 @@ func (m BackupExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
 	return m, nil
 }
 
+// renderBackupDatabaseProgressBar renders a progress bar for database count progress
+func renderBackupDatabaseProgressBar(done, total int, dbName string, width int) string {
+	if total == 0 {
+		return ""
+	}
+
+	// Calculate progress percentage
+	percent := float64(done) / float64(total)
+	if percent > 1.0 {
+		percent = 1.0
+	}
+
+	// Calculate filled width
+	barWidth := width - 20 // Leave room for label and percentage
+	if barWidth < 10 {
+		barWidth = 10
+	}
+	filled := int(float64(barWidth) * percent)
+	if filled > barWidth {
+		filled = barWidth
+	}
+
+	// Build progress bar
+	bar := strings.Repeat("█", filled) + strings.Repeat("░", barWidth-filled)
+
+	return fmt.Sprintf(" Database: [%s] %d/%d", bar, done, total)
+}
+
 func (m BackupExecutionModel) View() string {
 	var s strings.Builder
 	s.Grow(512) // Pre-allocate estimated capacity for better performance
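The renderer above reserves 20 columns for the label and counter, so the call with width 50 leaves a 30-cell bar body. A quick standalone check of that math (same formula, sample values invented):

package main

import (
	"fmt"
	"strings"
)

func main() {
	done, total, width := 7, 32, 50 // sample values

	barWidth := width - 20 // as in renderBackupDatabaseProgressBar: 30
	percent := float64(done) / float64(total)
	filled := int(float64(barWidth) * percent) // 30 * 0.21875 -> 6

	bar := strings.Repeat("█", filled) + strings.Repeat("░", barWidth-filled)
	fmt.Printf(" Database: [%s] %d/%d\n", bar, done, total)
}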
@@ -255,31 +382,153 @@ func (m BackupExecutionModel) View() string {
 	s.WriteString(fmt.Sprintf(" %-10s %s\n", "Duration:", time.Since(m.startTime).Round(time.Second)))
 	s.WriteString("\n")
 
-	// Status with spinner
+	// Status display
 	if !m.done {
-		if m.cancelling {
-			s.WriteString(fmt.Sprintf(" %s %s\n", spinnerFrames[m.spinnerFrame], m.status))
+		// Unified progress display for cluster backup
+		if m.backupType == "cluster" {
+			// Calculate overall progress across all phases
+			// Phase 1: Globals (0-15%)
+			// Phase 2: Databases (15-90%)
+			// Phase 3: Compressing (90-100%)
+			overallProgress := 0
+			phaseLabel := "Starting..."
+
+			elapsedSec := int(time.Since(m.startTime).Seconds())
+
+			if m.overallPhase == 2 && m.dbTotal > 0 {
+				// Phase 2: Database backups - contributes 15-90%
+				dbPct := int((int64(m.dbDone) * 100) / int64(m.dbTotal))
+				overallProgress = 15 + (dbPct * 75 / 100)
+				phaseLabel = m.phaseDesc
+			} else if elapsedSec < 5 {
+				// Initial setup
+				overallProgress = 2
+				phaseLabel = "Phase 1/3: Initializing..."
+			} else if m.dbTotal == 0 {
+				// Phase 1: Globals backup (before databases start)
+				overallProgress = 10
+				phaseLabel = "Phase 1/3: Backing up Globals"
+			}
+
+			// Header with phase and overall progress
+			s.WriteString(infoStyle.Render(" ─── Cluster Backup Progress ──────────────────────────────"))
+			s.WriteString("\n\n")
+			s.WriteString(fmt.Sprintf(" %s\n\n", phaseLabel))
+
+			// Overall progress bar
+			s.WriteString(" Overall: ")
+			s.WriteString(renderProgressBar(overallProgress))
+			s.WriteString(fmt.Sprintf(" %d%%\n", overallProgress))
+
+			// Phase-specific details
+			if m.dbTotal > 0 && m.dbDone > 0 {
+				// Show current database being backed up
+				s.WriteString("\n")
+				spinner := spinnerFrames[m.spinnerFrame]
+				if m.dbName != "" && m.dbDone <= m.dbTotal {
+					s.WriteString(fmt.Sprintf(" Current: %s %s\n", spinner, m.dbName))
+				}
+				s.WriteString("\n")
+
+				// Database progress bar
+				progressBar := renderBackupDatabaseProgressBar(m.dbDone, m.dbTotal, m.dbName, 50)
+				s.WriteString(progressBar + "\n")
+			} else {
+				// Intermediate phase (globals)
+				spinner := spinnerFrames[m.spinnerFrame]
+				s.WriteString(fmt.Sprintf("\n %s %s\n\n", spinner, m.status))
+			}
+
+			s.WriteString("\n")
+			s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
+			s.WriteString("\n\n")
 		} else {
-			s.WriteString(fmt.Sprintf(" %s %s\n", spinnerFrames[m.spinnerFrame], m.status))
+			// Single/sample database backup - simpler display
+			spinner := spinnerFrames[m.spinnerFrame]
+			s.WriteString(fmt.Sprintf(" %s %s\n", spinner, m.status))
+		}
+
+		if !m.cancelling {
 			s.WriteString("\n [KEY] Press Ctrl+C or ESC to cancel\n")
 		}
 	} else {
-		s.WriteString(fmt.Sprintf(" %s\n\n", m.status))
+		// Show completion summary with detailed stats
 		if m.err != nil {
-			s.WriteString(fmt.Sprintf(" [FAIL] Error: %v\n", m.err))
-		} else if m.result != "" {
-			// Parse and display result cleanly
-			lines := strings.Split(m.result, "\n")
-			for _, line := range lines {
-				line = strings.TrimSpace(line)
-				if line != "" {
-					s.WriteString(" " + line + "\n")
+			s.WriteString("\n")
+			s.WriteString(errorStyle.Render(" ╔══════════════════════════════════════════════════════════╗"))
+			s.WriteString("\n")
+			s.WriteString(errorStyle.Render(" ║ [FAIL] BACKUP FAILED ║"))
+			s.WriteString("\n")
+			s.WriteString(errorStyle.Render(" ╚══════════════════════════════════════════════════════════╝"))
+			s.WriteString("\n\n")
+			s.WriteString(errorStyle.Render(fmt.Sprintf(" Error: %v", m.err)))
+			s.WriteString("\n")
+		} else {
+			s.WriteString("\n")
+			s.WriteString(successStyle.Render(" ╔══════════════════════════════════════════════════════════╗"))
+			s.WriteString("\n")
+			s.WriteString(successStyle.Render(" ║ [OK] BACKUP COMPLETED SUCCESSFULLY ║"))
+			s.WriteString("\n")
+			s.WriteString(successStyle.Render(" ╚══════════════════════════════════════════════════════════╝"))
+			s.WriteString("\n\n")
+
+			// Summary section
+			s.WriteString(infoStyle.Render(" ─── Summary ─────────────────────────────────────────────"))
+			s.WriteString("\n\n")
+
+			// Backup type specific info
+			switch m.backupType {
+			case "cluster":
+				s.WriteString(" Type: Cluster Backup\n")
+				if m.dbTotal > 0 {
+					s.WriteString(fmt.Sprintf(" Databases: %d backed up\n", m.dbTotal))
 				}
+			case "single":
+				s.WriteString(" Type: Single Database Backup\n")
+				s.WriteString(fmt.Sprintf(" Database: %s\n", m.databaseName))
+			case "sample":
+				s.WriteString(" Type: Sample Backup\n")
+				s.WriteString(fmt.Sprintf(" Database: %s\n", m.databaseName))
+				s.WriteString(fmt.Sprintf(" Sample Ratio: %d\n", m.ratio))
 			}
+
+			s.WriteString("\n")
+
+			// Timing section
+			s.WriteString(infoStyle.Render(" ─── Timing ──────────────────────────────────────────────"))
|
||||||
|
s.WriteString("\n\n")
|
||||||
|
|
||||||
|
elapsed := time.Since(m.startTime)
|
||||||
|
s.WriteString(fmt.Sprintf(" Total Time: %s\n", formatBackupDuration(elapsed)))
|
||||||
|
|
||||||
|
if m.backupType == "cluster" && m.dbTotal > 0 {
|
||||||
|
avgPerDB := elapsed / time.Duration(m.dbTotal)
|
||||||
|
s.WriteString(fmt.Sprintf(" Avg per DB: %s\n", formatBackupDuration(avgPerDB)))
|
||||||
|
}
|
||||||
|
|
||||||
|
s.WriteString("\n")
|
||||||
|
s.WriteString(infoStyle.Render(" ─────────────────────────────────────────────────────────"))
|
||||||
|
s.WriteString("\n")
|
||||||
}
|
}
|
||||||
s.WriteString("\n [KEY] Press Enter or ESC to return to menu\n")
|
|
||||||
|
s.WriteString("\n")
|
||||||
|
s.WriteString(" [KEY] Press Enter or ESC to return to menu\n")
|
||||||
}
|
}
|
||||||
|
|
||||||
return s.String()
|
return s.String()
|
||||||
}
|
}
|
||||||
|
|
||||||
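To make the phase weighting above concrete (values invented, not from the diff):

	// dbDone=6, dbTotal=12  =>  dbPct = (6*100)/12 = 50
	// overallProgress = 15 + (50*75)/100 = 15 + 37 = 52
	// i.e. just past the midpoint of the 15-90% band reserved for the database phase.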
// formatBackupDuration formats duration in human readable format
func formatBackupDuration(d time.Duration) string {
	if d < time.Minute {
		return fmt.Sprintf("%.1fs", d.Seconds())
	}
	if d < time.Hour {
		minutes := int(d.Minutes())
		seconds := int(d.Seconds()) % 60
		return fmt.Sprintf("%dm %ds", minutes, seconds)
	}
	hours := int(d.Hours())
	minutes := int(d.Minutes()) % 60
	return fmt.Sprintf("%dh %dm", hours, minutes)
}
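Sample outputs, computed from the function above:

	// formatBackupDuration(45 * time.Second)            => "45.0s"
	// formatBackupDuration(150 * time.Second)           => "2m 30s"
	// formatBackupDuration(2*time.Hour + 5*time.Minute) => "2h 5m"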
406	internal/tui/detailed_progress.go	(new file)
@@ -0,0 +1,406 @@
package tui

import (
	"fmt"
	"strings"
	"sync"
	"time"
)

// DetailedProgress provides schollz-like progress information for TUI rendering.
// This is a data structure that can be queried by Bubble Tea's View() method.
type DetailedProgress struct {
	mu sync.RWMutex

	// Core progress
	Total   int64 // Total bytes or items
	Current int64 // Current bytes or items done

	// Display info
	Description string // What operation is happening
	Unit        string // "bytes", "files", "databases", etc.

	// Timing for ETA/speed calculation
	StartTime   time.Time
	LastUpdate  time.Time
	SpeedWindow []speedSample // Rolling window for speed calculation

	// State
	IsIndeterminate bool // True if total is unknown (spinner mode)
	IsComplete      bool
	IsFailed        bool
	ErrorMessage    string
}

type speedSample struct {
	timestamp time.Time
	bytes     int64
}

// NewDetailedProgress creates a progress tracker with known total
func NewDetailedProgress(total int64, description string) *DetailedProgress {
	return &DetailedProgress{
		Total:           total,
		Description:     description,
		Unit:            "bytes",
		StartTime:       time.Now(),
		LastUpdate:      time.Now(),
		SpeedWindow:     make([]speedSample, 0, 20),
		IsIndeterminate: total <= 0,
	}
}

// NewDetailedProgressItems creates a progress tracker for item counts
func NewDetailedProgressItems(total int, description string) *DetailedProgress {
	return &DetailedProgress{
		Total:           int64(total),
		Description:     description,
		Unit:            "items",
		StartTime:       time.Now(),
		LastUpdate:      time.Now(),
		SpeedWindow:     make([]speedSample, 0, 20),
		IsIndeterminate: total <= 0,
	}
}

// NewDetailedProgressSpinner creates an indeterminate progress tracker
func NewDetailedProgressSpinner(description string) *DetailedProgress {
	return &DetailedProgress{
		Total:           -1,
		Description:     description,
		Unit:            "",
		StartTime:       time.Now(),
		LastUpdate:      time.Now(),
		SpeedWindow:     make([]speedSample, 0, 20),
		IsIndeterminate: true,
	}
}

// Add adds to the current progress
func (dp *DetailedProgress) Add(n int64) {
	dp.mu.Lock()
	defer dp.mu.Unlock()

	dp.Current += n
	dp.LastUpdate = time.Now()

	// Add speed sample
	dp.SpeedWindow = append(dp.SpeedWindow, speedSample{
		timestamp: dp.LastUpdate,
		bytes:     dp.Current,
	})

	// Keep only last 20 samples for speed calculation
	if len(dp.SpeedWindow) > 20 {
		dp.SpeedWindow = dp.SpeedWindow[len(dp.SpeedWindow)-20:]
	}
}

// Set sets the current progress to a specific value
func (dp *DetailedProgress) Set(n int64) {
	dp.mu.Lock()
	defer dp.mu.Unlock()

	dp.Current = n
	dp.LastUpdate = time.Now()

	// Add speed sample
	dp.SpeedWindow = append(dp.SpeedWindow, speedSample{
		timestamp: dp.LastUpdate,
		bytes:     dp.Current,
	})

	if len(dp.SpeedWindow) > 20 {
		dp.SpeedWindow = dp.SpeedWindow[len(dp.SpeedWindow)-20:]
	}
}

// SetTotal updates the total (useful when total becomes known during operation)
func (dp *DetailedProgress) SetTotal(total int64) {
	dp.mu.Lock()
	defer dp.mu.Unlock()

	dp.Total = total
	dp.IsIndeterminate = total <= 0
}

// SetDescription updates the description
func (dp *DetailedProgress) SetDescription(desc string) {
	dp.mu.Lock()
	defer dp.mu.Unlock()
	dp.Description = desc
}

// Complete marks the progress as complete
func (dp *DetailedProgress) Complete() {
	dp.mu.Lock()
	defer dp.mu.Unlock()

	dp.IsComplete = true
	dp.Current = dp.Total
}

// Fail marks the progress as failed
func (dp *DetailedProgress) Fail(errMsg string) {
	dp.mu.Lock()
	defer dp.mu.Unlock()

	dp.IsFailed = true
	dp.ErrorMessage = errMsg
}

// GetPercent returns the progress percentage (0-100)
func (dp *DetailedProgress) GetPercent() int {
	dp.mu.RLock()
	defer dp.mu.RUnlock()

	if dp.IsIndeterminate || dp.Total <= 0 {
		return 0
	}
	percent := int((dp.Current * 100) / dp.Total)
	if percent > 100 {
		return 100
	}
	return percent
}

// GetSpeed returns the current transfer speed in bytes/second
func (dp *DetailedProgress) GetSpeed() float64 {
	dp.mu.RLock()
	defer dp.mu.RUnlock()

	if len(dp.SpeedWindow) < 2 {
		return 0
	}

	// Use first and last samples in window for smoothed speed
	first := dp.SpeedWindow[0]
	last := dp.SpeedWindow[len(dp.SpeedWindow)-1]

	elapsed := last.timestamp.Sub(first.timestamp).Seconds()
	if elapsed <= 0 {
		return 0
	}

	bytesTransferred := last.bytes - first.bytes
	return float64(bytesTransferred) / elapsed
}

// GetETA returns the estimated time remaining
func (dp *DetailedProgress) GetETA() time.Duration {
	dp.mu.RLock()
	defer dp.mu.RUnlock()

	if dp.IsIndeterminate || dp.Total <= 0 || dp.Current >= dp.Total {
		return 0
	}

	speed := dp.getSpeedLocked()
	if speed <= 0 {
		return 0
	}

	remaining := dp.Total - dp.Current
	seconds := float64(remaining) / speed
	return time.Duration(seconds) * time.Second
}

func (dp *DetailedProgress) getSpeedLocked() float64 {
	if len(dp.SpeedWindow) < 2 {
		return 0
	}

	first := dp.SpeedWindow[0]
	last := dp.SpeedWindow[len(dp.SpeedWindow)-1]

	elapsed := last.timestamp.Sub(first.timestamp).Seconds()
	if elapsed <= 0 {
		return 0
	}

	bytesTransferred := last.bytes - first.bytes
	return float64(bytesTransferred) / elapsed
}

// GetElapsed returns the elapsed time since start
func (dp *DetailedProgress) GetElapsed() time.Duration {
	dp.mu.RLock()
	defer dp.mu.RUnlock()
	return time.Since(dp.StartTime)
}

// GetState returns a snapshot of the current state for rendering
func (dp *DetailedProgress) GetState() DetailedProgressState {
	dp.mu.RLock()
	defer dp.mu.RUnlock()

	return DetailedProgressState{
		Description:     dp.Description,
		Current:         dp.Current,
		Total:           dp.Total,
		Percent:         dp.getPercentLocked(),
		Speed:           dp.getSpeedLocked(),
		ETA:             dp.getETALocked(),
		Elapsed:         time.Since(dp.StartTime),
		Unit:            dp.Unit,
		IsIndeterminate: dp.IsIndeterminate,
		IsComplete:      dp.IsComplete,
		IsFailed:        dp.IsFailed,
		ErrorMessage:    dp.ErrorMessage,
	}
}

func (dp *DetailedProgress) getPercentLocked() int {
	if dp.IsIndeterminate || dp.Total <= 0 {
		return 0
	}
	percent := int((dp.Current * 100) / dp.Total)
	if percent > 100 {
		return 100
	}
	return percent
}

func (dp *DetailedProgress) getETALocked() time.Duration {
	if dp.IsIndeterminate || dp.Total <= 0 || dp.Current >= dp.Total {
		return 0
	}

	speed := dp.getSpeedLocked()
	if speed <= 0 {
		return 0
	}

	remaining := dp.Total - dp.Current
	seconds := float64(remaining) / speed
	return time.Duration(seconds) * time.Second
}

// DetailedProgressState is an immutable snapshot for rendering
type DetailedProgressState struct {
	Description     string
	Current         int64
	Total           int64
	Percent         int
	Speed           float64 // bytes/sec
	ETA             time.Duration
	Elapsed         time.Duration
	Unit            string
	IsIndeterminate bool
	IsComplete      bool
	IsFailed        bool
	ErrorMessage    string
}

// RenderProgressBar renders a TUI-friendly progress bar string.
// Returns something like: "Extracting archive [████████░░░░░░░░░░░░] 45% 12.5 MB/s ETA: 2m 30s"
func (s DetailedProgressState) RenderProgressBar(width int) string {
	if s.IsIndeterminate {
		return s.renderIndeterminate()
	}

	// Progress bar
	barWidth := 30
	if width < 80 {
		barWidth = 20
	}
	filled := (s.Percent * barWidth) / 100
	if filled > barWidth {
		filled = barWidth
	}

	bar := strings.Repeat("█", filled) + strings.Repeat("░", barWidth-filled)

	// Format bytes
	currentStr := FormatBytes(s.Current)
	totalStr := FormatBytes(s.Total)

	// Format speed
	speedStr := ""
	if s.Speed > 0 {
		speedStr = fmt.Sprintf("%s/s", FormatBytes(int64(s.Speed)))
	}

	// Format ETA
	etaStr := ""
	if s.ETA > 0 && !s.IsComplete {
		etaStr = fmt.Sprintf("ETA: %s", FormatDurationShort(s.ETA))
	}

	// Build the line
	parts := []string{
		fmt.Sprintf("[%s]", bar),
		fmt.Sprintf("%3d%%", s.Percent),
	}

	if s.Unit == "bytes" && s.Total > 0 {
		parts = append(parts, fmt.Sprintf("%s/%s", currentStr, totalStr))
	} else if s.Total > 0 {
		parts = append(parts, fmt.Sprintf("%d/%d", s.Current, s.Total))
	}

	if speedStr != "" {
		parts = append(parts, speedStr)
	}
	if etaStr != "" {
		parts = append(parts, etaStr)
	}

	return strings.Join(parts, " ")
}

func (s DetailedProgressState) renderIndeterminate() string {
	elapsed := FormatDurationShort(s.Elapsed)
	return fmt.Sprintf("[spinner] %s Elapsed: %s", s.Description, elapsed)
}

// RenderCompact renders a compact single-line progress string
func (s DetailedProgressState) RenderCompact() string {
	if s.IsComplete {
		return fmt.Sprintf("[OK] %s completed in %s", s.Description, FormatDurationShort(s.Elapsed))
	}
	if s.IsFailed {
		return fmt.Sprintf("[FAIL] %s: %s", s.Description, s.ErrorMessage)
	}
	if s.IsIndeterminate {
		return fmt.Sprintf("[...] %s (%s)", s.Description, FormatDurationShort(s.Elapsed))
	}

	return fmt.Sprintf("[%3d%%] %s - %s/%s", s.Percent, s.Description,
		FormatBytes(s.Current), FormatBytes(s.Total))
}

// FormatBytes formats bytes in human-readable format
func FormatBytes(b int64) string {
	const unit = 1024
	if b < unit {
		return fmt.Sprintf("%d B", b)
	}
	div, exp := int64(unit), 0
	for n := b / unit; n >= unit; n /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %cB", float64(b)/float64(div), "KMGTPE"[exp])
}
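Sample values, computed from FormatBytes above:

	// FormatBytes(512)      => "512 B"
	// FormatBytes(1536)     => "1.5 KB"
	// FormatBytes(10 << 20) => "10.0 MB"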
// FormatDurationShort formats duration in short form
func FormatDurationShort(d time.Duration) string {
	if d < time.Second {
		return "<1s"
	}
	if d < time.Minute {
		return fmt.Sprintf("%ds", int(d.Seconds()))
	}
	if d < time.Hour {
		m := int(d.Minutes())
		s := int(d.Seconds()) % 60
		if s > 0 {
			return fmt.Sprintf("%dm %ds", m, s)
		}
		return fmt.Sprintf("%dm", m)
	}
	h := int(d.Hours())
	m := int(d.Minutes()) % 60
	return fmt.Sprintf("%dh %dm", h, m)
}
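A minimal usage sketch for the new API, assuming it sits next to detailed_progress.go in package tui; the 1 MiB total and 64 KiB chunk size are invented for illustration and the function is hypothetical, not part of the diff:

	package tui

	import "fmt"

	// ExampleDetailedProgress demonstrates the update/snapshot/render cycle.
	func ExampleDetailedProgress() {
		dp := NewDetailedProgress(1<<20, "Extracting archive")
		for i := 0; i < 16; i++ {
			dp.Add(64 << 10) // simulate reading a 64 KiB chunk
		}
		dp.Complete()

		state := dp.GetState()          // immutable snapshot, safe to use in View()
		_ = state.RenderProgressBar(80) // e.g. "[██...██] 100% 1.0 MB/1.0 MB"
		fmt.Println(state.RenderCompact())
		// prints something like: [OK] Extracting archive completed in <1s
	}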
@@ -188,6 +188,21 @@ func (m *MenuModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
		}
		return m, nil

+	case tea.InterruptMsg:
+		// Handle Ctrl+C signal (SIGINT) - Bubbletea v1.3+ sends this
+		if m.cancel != nil {
+			m.cancel()
+		}
+
+		// Clean up any orphaned processes before exit
+		m.logger.Info("Cleaning up processes before exit (SIGINT)")
+		if err := cleanup.KillOrphanedProcesses(m.logger); err != nil {
+			m.logger.Warn("Failed to clean up all processes", "error", err)
+		}
+
+		m.quitting = true
+		return m, tea.Quit
+
	case tea.KeyMsg:
		switch msg.String() {
		case "ctrl+c", "q":
@@ -6,6 +6,7 @@ import (
	"os/exec"
	"path/filepath"
	"strings"
+	"sync"
	"time"

	tea "github.com/charmbracelet/bubbletea"
@@ -45,6 +46,29 @@ type RestoreExecutionModel struct {
	spinnerFrame  int
	spinnerFrames []string

+	// Detailed byte progress for schollz-style display
+	bytesTotal  int64
+	bytesDone   int64
+	description string
+	showBytes   bool    // True when we have real byte progress to show
+	speed       float64 // Rolling window speed in bytes/sec
+
+	// Database count progress (for cluster restore)
+	dbTotal int
+	dbDone  int
+
+	// Current database being restored (for detailed display)
+	currentDB string
+
+	// Timing info for database restore phase (ETA calculation)
+	dbPhaseElapsed time.Duration // Elapsed time since restore phase started
+	dbAvgPerDB     time.Duration // Average time per database restore
+
+	// Overall progress tracking for unified display
+	overallPhase   int // 1=Extracting, 2=Globals, 3=Databases
+	extractionDone bool
+	extractionTime time.Duration // How long extraction took (for ETA calc)
+
	// Results
	done       bool
	cancelling bool // True when user has requested cancellation
@@ -97,10 +121,13 @@ func restoreTickCmd() tea.Cmd {
}

type restoreProgressMsg struct {
	status   string
	phase    string
	progress int
	detail   string
+	bytesTotal  int64
+	bytesDone   int64
+	description string
}

type restoreCompleteMsg struct {
@@ -109,6 +136,121 @@ type restoreCompleteMsg struct {
	elapsed time.Duration
}

+// sharedProgressState holds progress state that can be safely accessed from callbacks
+type sharedProgressState struct {
+	mu          sync.Mutex
+	bytesTotal  int64
+	bytesDone   int64
+	description string
+	hasUpdate   bool
+
+	// Database count progress (for cluster restore)
+	dbTotal int
+	dbDone  int
+
+	// Current database being restored
+	currentDB string
+
+	// Timing info for database restore phase
+	dbPhaseElapsed time.Duration // Elapsed time since restore phase started
+	dbAvgPerDB     time.Duration // Average time per database restore
+
+	// Overall phase tracking (1=Extract, 2=Globals, 3=Databases)
+	overallPhase   int
+	extractionDone bool
+
+	// Weighted progress by database sizes (bytes)
+	dbBytesTotal int64 // Total bytes across all databases
+	dbBytesDone  int64 // Bytes completed (sum of finished DB sizes)
+
+	// Rolling window for speed calculation
+	speedSamples []restoreSpeedSample
+}
+
+type restoreSpeedSample struct {
+	timestamp time.Time
+	bytes     int64
+}
+
+// Package-level shared progress state for restore operations
+var (
+	currentRestoreProgressMu    sync.Mutex
+	currentRestoreProgressState *sharedProgressState
+)
+
+func setCurrentRestoreProgress(state *sharedProgressState) {
+	currentRestoreProgressMu.Lock()
+	defer currentRestoreProgressMu.Unlock()
+	currentRestoreProgressState = state
+}
+
+func clearCurrentRestoreProgress() {
+	currentRestoreProgressMu.Lock()
+	defer currentRestoreProgressMu.Unlock()
+	currentRestoreProgressState = nil
+}
+
+func getCurrentRestoreProgress() (bytesTotal, bytesDone int64, description string, hasUpdate bool, dbTotal, dbDone int, speed float64, dbPhaseElapsed, dbAvgPerDB time.Duration, currentDB string, overallPhase int, extractionDone bool, dbBytesTotal, dbBytesDone int64) {
+	currentRestoreProgressMu.Lock()
+	defer currentRestoreProgressMu.Unlock()
+
+	if currentRestoreProgressState == nil {
+		return 0, 0, "", false, 0, 0, 0, 0, 0, "", 0, false, 0, 0
+	}
+
+	currentRestoreProgressState.mu.Lock()
+	defer currentRestoreProgressState.mu.Unlock()
+
+	// Calculate rolling window speed
+	speed = calculateRollingSpeed(currentRestoreProgressState.speedSamples)
+
+	return currentRestoreProgressState.bytesTotal, currentRestoreProgressState.bytesDone,
+		currentRestoreProgressState.description, currentRestoreProgressState.hasUpdate,
+		currentRestoreProgressState.dbTotal, currentRestoreProgressState.dbDone, speed,
+		currentRestoreProgressState.dbPhaseElapsed, currentRestoreProgressState.dbAvgPerDB,
+		currentRestoreProgressState.currentDB, currentRestoreProgressState.overallPhase,
+		currentRestoreProgressState.extractionDone,
+		currentRestoreProgressState.dbBytesTotal, currentRestoreProgressState.dbBytesDone
+}
+
+// calculateRollingSpeed calculates speed from recent samples (last 5 seconds)
+func calculateRollingSpeed(samples []restoreSpeedSample) float64 {
+	if len(samples) < 2 {
+		return 0
+	}
+
+	// Use samples from last 5 seconds for smoothed speed
+	now := time.Now()
+	cutoff := now.Add(-5 * time.Second)
+
+	var firstInWindow, lastInWindow *restoreSpeedSample
+	for i := range samples {
+		if samples[i].timestamp.After(cutoff) {
+			if firstInWindow == nil {
+				firstInWindow = &samples[i]
+			}
+			lastInWindow = &samples[i]
+		}
+	}
+
+	// Fall back to first and last if window is empty
+	if firstInWindow == nil || lastInWindow == nil || firstInWindow == lastInWindow {
+		firstInWindow = &samples[0]
+		lastInWindow = &samples[len(samples)-1]
+	}
+
+	elapsed := lastInWindow.timestamp.Sub(firstInWindow.timestamp).Seconds()
+	if elapsed <= 0 {
+		return 0
+	}
+
+	bytesTransferred := lastInWindow.bytes - firstInWindow.bytes
+	return float64(bytesTransferred) / elapsed
+}
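Worked example for the rolling-speed helper (values invented, not from the diff): two samples 4 s apart with 8,000 bytes transferred between them yield 8000/4 = 2,000 B/s.

	// Hypothetical check:
	t0 := time.Now()
	samples := []restoreSpeedSample{
		{timestamp: t0, bytes: 1000},
		{timestamp: t0.Add(4 * time.Second), bytes: 9000},
	}
	speed := calculateRollingSpeed(samples) // (9000-1000)/4 = 2000 B/s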
+// restoreProgressChannel allows sending progress updates from the restore goroutine
+type restoreProgressChannel chan restoreProgressMsg
+
func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string, cleanFirst, createIfMissing bool, restoreType string, cleanClusterFirst bool, existingDBs []string, saveDebugLog bool) tea.Cmd {
	return func() tea.Msg {
		// NO TIMEOUT for restore operations - a restore takes as long as it takes
@@ -156,6 +298,91 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
		// STEP 2: Create restore engine with silent progress (no stdout interference with TUI)
		engine := restore.NewSilent(cfg, log, dbClient)

+		// Set up progress callback for detailed progress reporting
+		// We use a shared pointer that can be queried by the TUI ticker
+		progressState := &sharedProgressState{
+			speedSamples: make([]restoreSpeedSample, 0, 100),
+		}
+		engine.SetProgressCallback(func(current, total int64, description string) {
+			progressState.mu.Lock()
+			defer progressState.mu.Unlock()
+			progressState.bytesDone = current
+			progressState.bytesTotal = total
+			progressState.description = description
+			progressState.hasUpdate = true
+			progressState.overallPhase = 1
+			progressState.extractionDone = false
+
+			// Check if extraction is complete
+			if current >= total && total > 0 {
+				progressState.extractionDone = true
+				progressState.overallPhase = 2
+			}
+
+			// Add speed sample for rolling window calculation
+			progressState.speedSamples = append(progressState.speedSamples, restoreSpeedSample{
+				timestamp: time.Now(),
+				bytes:     current,
+			})
+			// Keep only last 100 samples
+			if len(progressState.speedSamples) > 100 {
+				progressState.speedSamples = progressState.speedSamples[len(progressState.speedSamples)-100:]
+			}
+		})
+
+		// Set up database progress callback for cluster restore
+		engine.SetDatabaseProgressCallback(func(done, total int, dbName string) {
+			progressState.mu.Lock()
+			defer progressState.mu.Unlock()
+			progressState.dbDone = done
+			progressState.dbTotal = total
+			progressState.description = fmt.Sprintf("Restoring %s", dbName)
+			progressState.currentDB = dbName
+			progressState.overallPhase = 3
+			progressState.extractionDone = true
+			progressState.hasUpdate = true
+			// Clear byte progress when switching to db progress
+			progressState.bytesTotal = 0
+			progressState.bytesDone = 0
+		})
+
+		// Set up timing-aware database progress callback for cluster restore ETA
+		engine.SetDatabaseProgressWithTimingCallback(func(done, total int, dbName string, phaseElapsed, avgPerDB time.Duration) {
+			progressState.mu.Lock()
+			defer progressState.mu.Unlock()
+			progressState.dbDone = done
+			progressState.dbTotal = total
+			progressState.description = fmt.Sprintf("Restoring %s", dbName)
+			progressState.currentDB = dbName
+			progressState.overallPhase = 3
+			progressState.extractionDone = true
+			progressState.dbPhaseElapsed = phaseElapsed
+			progressState.dbAvgPerDB = avgPerDB
+			progressState.hasUpdate = true
+			// Clear byte progress when switching to db progress
+			progressState.bytesTotal = 0
+			progressState.bytesDone = 0
+		})
+
+		// Set up weighted (bytes-based) progress callback for accurate cluster restore progress
+		engine.SetDatabaseProgressByBytesCallback(func(bytesDone, bytesTotal int64, dbName string, dbDone, dbTotal int) {
+			progressState.mu.Lock()
+			defer progressState.mu.Unlock()
+			progressState.dbBytesDone = bytesDone
+			progressState.dbBytesTotal = bytesTotal
+			progressState.dbDone = dbDone
+			progressState.dbTotal = dbTotal
+			progressState.currentDB = dbName
+			progressState.overallPhase = 3
+			progressState.extractionDone = true
+			progressState.hasUpdate = true
+		})
+
+		// Store progress state in a package-level variable for the ticker to access
+		// This is a workaround because tea messages can't be sent from callbacks
+		setCurrentRestoreProgress(progressState)
+		defer clearCurrentRestoreProgress()
+
		// Enable debug logging if requested
		if saveDebugLog {
			// Generate debug log path using configured WorkDir
@@ -165,9 +392,6 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
			log.Info("Debug logging enabled", "path", debugLogPath)
		}

-		// Set up progress callback (but it won't work in goroutine - progress is already sent via logs)
-		// The TUI will just use spinner animation to show activity
-
		// STEP 3: Execute restore based on type
		var restoreErr error
		if restoreType == "restore-cluster" {
@@ -206,39 +430,90 @@ func (m RestoreExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
		m.spinnerFrame = (m.spinnerFrame + 1) % len(m.spinnerFrames)
		m.elapsed = time.Since(m.startTime)

-		// Update status based on elapsed time to show progress
-		// This provides visual feedback even though we don't have real-time progress
-		elapsedSec := int(m.elapsed.Seconds())
-
-		if elapsedSec < 2 {
-			m.status = "Initializing restore..."
-			m.phase = "Starting"
-		} else if elapsedSec < 5 {
-			if m.cleanClusterFirst && len(m.existingDBs) > 0 {
-				m.status = fmt.Sprintf("Cleaning %d existing database(s)...", len(m.existingDBs))
-				m.phase = "Cleanup"
-			} else if m.restoreType == "restore-cluster" {
-				m.status = "Extracting cluster archive..."
-				m.phase = "Extraction"
-			} else {
-				m.status = "Preparing restore..."
-				m.phase = "Preparation"
-			}
-		} else if elapsedSec < 10 {
-			if m.restoreType == "restore-cluster" {
-				m.status = "Restoring global objects..."
-				m.phase = "Globals"
-			} else {
-				m.status = fmt.Sprintf("Restoring database '%s'...", m.targetDB)
-				m.phase = "Restore"
-			}
-		} else {
-			if m.restoreType == "restore-cluster" {
-				m.status = "Restoring cluster databases..."
-				m.phase = "Restore"
-			} else {
-				m.status = fmt.Sprintf("Restoring database '%s'...", m.targetDB)
-				m.phase = "Restore"
-			}
-		}
+		// Poll shared progress state for real-time updates
+		bytesTotal, bytesDone, description, hasUpdate, dbTotal, dbDone, speed, dbPhaseElapsed, dbAvgPerDB, currentDB, overallPhase, extractionDone, dbBytesTotal, dbBytesDone := getCurrentRestoreProgress()
+		if hasUpdate && bytesTotal > 0 && !extractionDone {
+			// Phase 1: Extraction
+			m.bytesTotal = bytesTotal
+			m.bytesDone = bytesDone
+			m.description = description
+			m.showBytes = true
+			m.speed = speed
+			m.overallPhase = 1
+			m.extractionDone = false
+
+			// Update status to reflect actual progress
+			m.status = description
+			m.phase = "Phase 1/3: Extracting Archive"
+			m.progress = int((bytesDone * 100) / bytesTotal)
+		} else if hasUpdate && dbTotal > 0 {
+			// Phase 3: Database restores
+			m.dbTotal = dbTotal
+			m.dbDone = dbDone
+			m.dbPhaseElapsed = dbPhaseElapsed
+			m.dbAvgPerDB = dbAvgPerDB
+			m.currentDB = currentDB
+			m.overallPhase = overallPhase
+			m.extractionDone = extractionDone
+			m.showBytes = false
+
+			if dbDone < dbTotal {
+				m.status = fmt.Sprintf("Restoring: %s", currentDB)
+			} else {
+				m.status = "Finalizing..."
+			}
+
+			// Use weighted progress by bytes if available, otherwise use count
+			if dbBytesTotal > 0 {
+				weightedPercent := int((dbBytesDone * 100) / dbBytesTotal)
+				m.phase = fmt.Sprintf("Phase 3/3: Databases (%d/%d) - %.1f%% by size", dbDone, dbTotal, float64(dbBytesDone*100)/float64(dbBytesTotal))
+				m.progress = weightedPercent
+			} else {
+				m.phase = fmt.Sprintf("Phase 3/3: Databases (%d/%d)", dbDone, dbTotal)
+				m.progress = int((dbDone * 100) / dbTotal)
+			}
+		} else if hasUpdate && extractionDone && dbTotal == 0 {
+			// Phase 2: Globals restore (brief phase between extraction and databases)
+			m.overallPhase = 2
+			m.extractionDone = true
+			m.showBytes = false
+			m.status = "Restoring global objects (roles, tablespaces)..."
+			m.phase = "Phase 2/3: Restoring Globals"
+		} else {
+			// Fallback: Update status based on elapsed time to show progress
+			// This provides visual feedback even though we don't have real-time progress
+			elapsedSec := int(m.elapsed.Seconds())
+
+			if elapsedSec < 2 {
+				m.status = "Initializing restore..."
+				m.phase = "Starting"
+			} else if elapsedSec < 5 {
+				if m.cleanClusterFirst && len(m.existingDBs) > 0 {
+					m.status = fmt.Sprintf("Cleaning %d existing database(s)...", len(m.existingDBs))
+					m.phase = "Cleanup"
+				} else if m.restoreType == "restore-cluster" {
+					m.status = "Extracting cluster archive..."
+					m.phase = "Extraction"
+				} else {
+					m.status = "Preparing restore..."
+					m.phase = "Preparation"
+				}
+			} else if elapsedSec < 10 {
+				if m.restoreType == "restore-cluster" {
+					m.status = "Restoring global objects..."
+					m.phase = "Globals"
+				} else {
+					m.status = fmt.Sprintf("Restoring database '%s'...", m.targetDB)
+					m.phase = "Restore"
+				}
+			} else {
+				if m.restoreType == "restore-cluster" {
+					m.status = "Restoring cluster databases..."
+					m.phase = "Restore"
+				} else {
+					m.status = fmt.Sprintf("Restoring database '%s'...", m.targetDB)
+					m.phase = "Restore"
+				}
+			}
+		}
@@ -250,6 +525,15 @@ func (m RestoreExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
		m.status = msg.status
		m.phase = msg.phase
		m.progress = msg.progress
+
+		// Update byte-level progress if available
+		if msg.bytesTotal > 0 {
+			m.bytesTotal = msg.bytesTotal
+			m.bytesDone = msg.bytesDone
+			m.description = msg.description
+			m.showBytes = true
+		}
+
		if msg.detail != "" {
			m.details = append(m.details, msg.detail)
			// Keep only last 5 details
@@ -279,6 +563,21 @@ func (m RestoreExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
		}
		return m, nil

+	case tea.InterruptMsg:
+		// Handle Ctrl+C signal (SIGINT) - Bubbletea v1.3+ sends this instead of KeyMsg for ctrl+c
+		if !m.done && !m.cancelling {
+			m.cancelling = true
+			m.status = "[STOP] Cancelling restore... (please wait)"
+			m.phase = "Cancelling"
+			if m.cancel != nil {
+				m.cancel()
+			}
+			return m, nil
+		} else if m.done {
+			return m.parent, tea.Quit
+		}
+		return m, nil
+
	case tea.KeyMsg:
		switch msg.String() {
		case "ctrl+c", "esc":
@@ -336,38 +635,159 @@ func (m RestoreExecutionModel) View() string {
	s.WriteString("\n")

	if m.done {
-		// Show result
+		// Show result with comprehensive summary
		if m.err != nil {
-			s.WriteString(errorStyle.Render("[FAIL] Restore Failed"))
+			s.WriteString(errorStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
+			s.WriteString("\n")
+			s.WriteString(errorStyle.Render("║                     [FAIL] RESTORE FAILED                      ║"))
+			s.WriteString("\n")
+			s.WriteString(errorStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
			s.WriteString("\n\n")
-			s.WriteString(errorStyle.Render(fmt.Sprintf("Error: %v", m.err)))
+			s.WriteString(errorStyle.Render(fmt.Sprintf(" Error: %v", m.err)))
			s.WriteString("\n")
		} else {
-			s.WriteString(successStyle.Render("[OK] Restore Completed Successfully"))
+			s.WriteString(successStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
+			s.WriteString("\n")
+			s.WriteString(successStyle.Render("║              [OK] RESTORE COMPLETED SUCCESSFULLY               ║"))
+			s.WriteString("\n")
+			s.WriteString(successStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
			s.WriteString("\n\n")
-			s.WriteString(successStyle.Render(m.result))
+
+			// Summary section
+			s.WriteString(infoStyle.Render(" ─── Summary ───────────────────────────────────────────────"))
+			s.WriteString("\n\n")
+
+			// Archive info
+			s.WriteString(fmt.Sprintf(" Archive: %s\n", m.archive.Name))
+			if m.archive.Size > 0 {
+				s.WriteString(fmt.Sprintf(" Archive Size: %s\n", FormatBytes(m.archive.Size)))
+			}
+
+			// Restore type specific info
+			if m.restoreType == "restore-cluster" {
+				s.WriteString(fmt.Sprintf(" Type: Cluster Restore\n"))
+				if m.dbTotal > 0 {
+					s.WriteString(fmt.Sprintf(" Databases: %d restored\n", m.dbTotal))
+				}
+				if m.cleanClusterFirst && len(m.existingDBs) > 0 {
+					s.WriteString(fmt.Sprintf(" Cleaned: %d existing database(s) dropped\n", len(m.existingDBs)))
+				}
+			} else {
+				s.WriteString(fmt.Sprintf(" Type: Single Database Restore\n"))
+				s.WriteString(fmt.Sprintf(" Target DB: %s\n", m.targetDB))
+			}
+
			s.WriteString("\n")
		}

-		s.WriteString(fmt.Sprintf("\nElapsed Time: %s\n", formatDuration(m.elapsed)))
+		// Timing section
+		s.WriteString(infoStyle.Render(" ─── Timing ────────────────────────────────────────────────"))
+		s.WriteString("\n\n")
+		s.WriteString(fmt.Sprintf(" Total Time: %s\n", formatDuration(m.elapsed)))
+
+		// Calculate and show throughput if we have size info
+		if m.archive.Size > 0 && m.elapsed.Seconds() > 0 {
+			throughput := float64(m.archive.Size) / m.elapsed.Seconds()
+			s.WriteString(fmt.Sprintf(" Throughput: %s/s (average)\n", FormatBytes(int64(throughput))))
+		}
+
+		if m.dbTotal > 0 && m.err == nil {
+			avgPerDB := m.elapsed / time.Duration(m.dbTotal)
+			s.WriteString(fmt.Sprintf(" Avg per DB: %s\n", formatDuration(avgPerDB)))
+		}
+
		s.WriteString("\n")
-		s.WriteString(infoStyle.Render("[KEYS] Press Enter to continue"))
+		s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
+		s.WriteString("\n\n")
+		s.WriteString(infoStyle.Render(" [KEYS] Press Enter to continue"))
	} else {
-		// Show progress
-		s.WriteString(fmt.Sprintf("Phase: %s\n", m.phase))
-
-		// Show status with rotating spinner (unified indicator for all operations)
-		spinner := m.spinnerFrames[m.spinnerFrame]
-		s.WriteString(fmt.Sprintf("Status: %s %s\n", spinner, m.status))
-		s.WriteString("\n")
-
-		// Only show progress bar for single database restore
-		// Cluster restore uses spinner only (consistent with CLI behavior)
-		if m.restoreType == "restore-single" {
-			progressBar := renderProgressBar(m.progress)
-			s.WriteString(progressBar)
-			s.WriteString(fmt.Sprintf(" %d%%\n", m.progress))
-			s.WriteString("\n")
-		}
+		// Show unified progress for cluster restore
+		if m.restoreType == "restore-cluster" {
+			// Calculate overall progress across all phases
+			// Phase 1: Extraction (0-60%)
+			// Phase 2: Globals (60-65%)
+			// Phase 3: Databases (65-100%)
+			overallProgress := 0
+			phaseLabel := "Starting..."
+
+			if m.showBytes && m.bytesTotal > 0 {
+				// Phase 1: Extraction - contributes 0-60%
+				extractPct := int((m.bytesDone * 100) / m.bytesTotal)
+				overallProgress = (extractPct * 60) / 100
+				phaseLabel = "Phase 1/3: Extracting Archive"
+			} else if m.extractionDone && m.dbTotal == 0 {
+				// Phase 2: Globals restore
+				overallProgress = 62
+				phaseLabel = "Phase 2/3: Restoring Globals"
+			} else if m.dbTotal > 0 {
+				// Phase 3: Database restores - contributes 65-100%
+				dbPct := int((int64(m.dbDone) * 100) / int64(m.dbTotal))
+				overallProgress = 65 + (dbPct * 35 / 100)
+				phaseLabel = fmt.Sprintf("Phase 3/3: Databases (%d/%d)", m.dbDone, m.dbTotal)
+			}
+
+			// Header with phase and overall progress
+			s.WriteString(infoStyle.Render(" ─── Cluster Restore Progress ─────────────────────────────"))
+			s.WriteString("\n\n")
+			s.WriteString(fmt.Sprintf(" %s\n\n", phaseLabel))
+
+			// Overall progress bar
+			s.WriteString(" Overall: ")
+			s.WriteString(renderProgressBar(overallProgress))
+			s.WriteString(fmt.Sprintf(" %d%%\n", overallProgress))
+
+			// Phase-specific details
+			if m.showBytes && m.bytesTotal > 0 {
+				// Show extraction details
+				s.WriteString("\n")
+				s.WriteString(fmt.Sprintf(" %s\n", m.status))
+				s.WriteString("\n")
+				s.WriteString(renderDetailedProgressBarWithSpeed(m.bytesDone, m.bytesTotal, m.speed))
+				s.WriteString("\n")
+			} else if m.dbTotal > 0 {
+				// Show current database being restored
+				s.WriteString("\n")
+				spinner := m.spinnerFrames[m.spinnerFrame]
+				if m.currentDB != "" && m.dbDone < m.dbTotal {
+					s.WriteString(fmt.Sprintf(" Current: %s %s\n", spinner, m.currentDB))
+				} else if m.dbDone >= m.dbTotal {
+					s.WriteString(fmt.Sprintf(" %s Finalizing...\n", spinner))
+				}
+				s.WriteString("\n")
+
+				// Database progress bar with timing
+				s.WriteString(renderDatabaseProgressBarWithTiming(m.dbDone, m.dbTotal, m.dbPhaseElapsed, m.dbAvgPerDB))
+				s.WriteString("\n")
+			} else {
+				// Intermediate phase (globals)
+				spinner := m.spinnerFrames[m.spinnerFrame]
+				s.WriteString(fmt.Sprintf("\n %s %s\n\n", spinner, m.status))
+			}
+
+			s.WriteString("\n")
+			s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
+			s.WriteString("\n\n")
+		} else {
+			// Single database restore - simpler display
+			s.WriteString(fmt.Sprintf("Phase: %s\n", m.phase))
+
+			// Show detailed progress bar when we have byte-level information
+			if m.showBytes && m.bytesTotal > 0 {
+				s.WriteString(fmt.Sprintf("Status: %s\n", m.status))
+				s.WriteString("\n")
+				s.WriteString(renderDetailedProgressBarWithSpeed(m.bytesDone, m.bytesTotal, m.speed))
+				s.WriteString("\n\n")
+			} else {
+				spinner := m.spinnerFrames[m.spinnerFrame]
+				s.WriteString(fmt.Sprintf("Status: %s %s\n", spinner, m.status))
+				s.WriteString("\n")
+
+				// Fallback to simple progress bar
+				progressBar := renderProgressBar(m.progress)
+				s.WriteString(progressBar)
+				s.WriteString(fmt.Sprintf(" %d%%\n", m.progress))
+				s.WriteString("\n")
+			}
+		}
	}

	// Elapsed time
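As a quick sanity check of the restore weighting above (values invented, not from the diff):

	// extractPct = 50        =>  overall = (50*60)/100      = 30
	// dbDone=3, dbTotal=10   =>  dbPct = 30
	//                            overall = 65 + (30*35)/100 = 75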
@@ -390,6 +810,141 @@ func renderProgressBar(percent int) string {
	return successStyle.Render(bar) + infoStyle.Render(empty)
}

+// renderDetailedProgressBar renders a schollz-style progress bar with bytes, speed, and ETA.
+// Uses elapsed time for speed calculation (fallback).
+func renderDetailedProgressBar(done, total int64, elapsed time.Duration) string {
+	speed := 0.0
+	if elapsed.Seconds() > 0 {
+		speed = float64(done) / elapsed.Seconds()
+	}
+	return renderDetailedProgressBarWithSpeed(done, total, speed)
+}
+
+// renderDetailedProgressBarWithSpeed renders a schollz-style progress bar with pre-calculated rolling speed
+func renderDetailedProgressBarWithSpeed(done, total int64, speed float64) string {
+	var s strings.Builder
+
+	// Calculate percentage
+	percent := 0
+	if total > 0 {
+		percent = int((done * 100) / total)
+		if percent > 100 {
+			percent = 100
+		}
+	}
+
+	// Render progress bar
+	width := 30
+	filled := (percent * width) / 100
+	barFilled := strings.Repeat("█", filled)
+	barEmpty := strings.Repeat("░", width-filled)
+
+	s.WriteString(successStyle.Render("["))
+	s.WriteString(successStyle.Render(barFilled))
+	s.WriteString(infoStyle.Render(barEmpty))
+	s.WriteString(successStyle.Render("]"))
+
+	// Percentage
+	s.WriteString(fmt.Sprintf(" %3d%%", percent))
+
+	// Bytes progress
+	s.WriteString(fmt.Sprintf(" %s / %s", FormatBytes(done), FormatBytes(total)))
+
+	// Speed display (using rolling window speed)
+	if speed > 0 {
+		s.WriteString(fmt.Sprintf(" %s/s", FormatBytes(int64(speed))))
+
+		// ETA calculation based on rolling speed
+		if done < total {
+			remaining := total - done
+			etaSeconds := float64(remaining) / speed
+			eta := time.Duration(etaSeconds) * time.Second
+			s.WriteString(fmt.Sprintf(" ETA: %s", FormatDurationShort(eta)))
+		}
+	}
+
+	return s.String()
+}
+
+// renderDatabaseProgressBar renders a progress bar for database count (cluster restore)
+func renderDatabaseProgressBar(done, total int) string {
+	var s strings.Builder
+
+	// Calculate percentage
+	percent := 0
+	if total > 0 {
+		percent = (done * 100) / total
+		if percent > 100 {
+			percent = 100
+		}
+	}
+
+	// Render progress bar
+	width := 30
+	filled := (percent * width) / 100
+	barFilled := strings.Repeat("█", filled)
+	barEmpty := strings.Repeat("░", width-filled)
+
+	s.WriteString(successStyle.Render("["))
+	s.WriteString(successStyle.Render(barFilled))
+	s.WriteString(infoStyle.Render(barEmpty))
+	s.WriteString(successStyle.Render("]"))
+
+	// Count and percentage
+	s.WriteString(fmt.Sprintf(" %3d%% %d / %d databases", percent, done, total))
+
+	return s.String()
+}
+
+// renderDatabaseProgressBarWithTiming renders a progress bar for database count with timing and ETA
+func renderDatabaseProgressBarWithTiming(done, total int, phaseElapsed, avgPerDB time.Duration) string {
+	var s strings.Builder
+
+	// Calculate percentage
+	percent := 0
+	if total > 0 {
+		percent = (done * 100) / total
+		if percent > 100 {
+			percent = 100
+		}
+	}
+
+	// Render progress bar
+	width := 30
+	filled := (percent * width) / 100
+	barFilled := strings.Repeat("█", filled)
+	barEmpty := strings.Repeat("░", width-filled)
+
+	s.WriteString(successStyle.Render("["))
+	s.WriteString(successStyle.Render(barFilled))
+	s.WriteString(infoStyle.Render(barEmpty))
+	s.WriteString(successStyle.Render("]"))
+
+	// Count and percentage
+	s.WriteString(fmt.Sprintf(" %3d%% %d / %d databases", percent, done, total))
+
+	// Timing and ETA
+	if phaseElapsed > 0 {
+		s.WriteString(fmt.Sprintf(" [%s", FormatDurationShort(phaseElapsed)))
+
+		// Calculate ETA based on average time per database
+		if avgPerDB > 0 && done < total {
+			remainingDBs := total - done
+			eta := time.Duration(remainingDBs) * avgPerDB
+			s.WriteString(fmt.Sprintf(" / ETA: %s", FormatDurationShort(eta)))
+		} else if done > 0 && done < total {
+			// Fallback: estimate ETA from overall elapsed time
+			avgElapsed := phaseElapsed / time.Duration(done)
+			remainingDBs := total - done
+			eta := time.Duration(remainingDBs) * avgElapsed
+			s.WriteString(fmt.Sprintf(" / ETA: ~%s", FormatDurationShort(eta)))
+		}
+		s.WriteString("]")
+	}
+
+	return s.String()
+}
+
// formatDuration formats duration in human readable format
func formatDuration(d time.Duration) string {
	if d < time.Minute {