Compare commits
45 Commits
| SHA1 |
|---|
| b305d1342e |
| 5456da7183 |
| f9ff45cf2a |
| 72c06ba5c2 |
| a0a401cab1 |
| 59a717abe7 |
| 490a12f858 |
| ea4337e298 |
| bbd4f0ceac |
| f6f8b04785 |
| 670c9af2e7 |
| e2cf9adc62 |
| 29e089fe3b |
| 9396c8e605 |
| e363e1937f |
| df1ab2f55b |
| 0e050b2def |
| 62d58c77af |
| c5be9bcd2b |
| b120f1507e |
| dd1db844ce |
| 4ea3ec2cf8 |
| 9200024e50 |
| 698b8a761c |
| dd7c4da0eb |
| b2a78cad2a |
| 5728b465e6 |
| bfe99e959c |
| 780beaadfb |
| 838c5b8c15 |
| 9d95a193db |
| 3201f0fb6a |
| 62ddc57fb7 |
| 510175ff04 |
| a85ad0c88c |
| 4938dc1918 |
| 09a917766f |
| eeacbfa007 |
| 7711a206ab |
| ba6e8a2b39 |
| ec5e89eab7 |
| e24d7ab49f |
| 721e53fe6a |
| 4e09066aa5 |
| 6a24ee39be |
CHANGELOG.md (+151)
@@ -5,6 +5,157 @@ All notable changes to dbbackup will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [3.42.50] - 2026-01-16 "Ctrl+C Signal Handling Fix"

### Fixed - Proper Ctrl+C/SIGINT Handling in TUI
- **Added tea.InterruptMsg handling** - Bubbletea v1.3+ sends `InterruptMsg` for SIGINT
  instead of a `KeyMsg` with "ctrl+c", which previously caused cancellation not to work
- **Fixed cluster restore cancellation** - Ctrl+C now properly cancels running restore operations
- **Fixed cluster backup cancellation** - Ctrl+C now properly cancels running backup operations
- **Added interrupt handling to main menu** - Proper cleanup on SIGINT from the menu
- **Orphaned process cleanup** - `cleanup.KillOrphanedProcesses()` called on all interrupt paths

### Changed
- All TUI execution views now handle both `tea.KeyMsg` ("ctrl+c") and `tea.InterruptMsg`
- Context cancellation properly propagates to child processes via `exec.CommandContext`
- No zombie pg_dump/pg_restore/gzip processes left behind on cancellation
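This distinction is easy to trip over: on Bubbletea v1.3+ a SIGINT arrives as `tea.InterruptMsg`, not as a `KeyMsg`. A minimal sketch of the dual-path handling described above — the model and `cancel` wiring are illustrative, not dbbackup's actual types:

```go
package main

import (
	"context"

	tea "github.com/charmbracelet/bubbletea"
)

// execModel stands in for a TUI execution view; cancel aborts the
// context passed to exec.CommandContext for pg_dump/pg_restore children.
type execModel struct {
	cancel context.CancelFunc
}

func (m execModel) Init() tea.Cmd { return nil }

func (m execModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
	switch msg := msg.(type) {
	case tea.InterruptMsg: // SIGINT as delivered by Bubbletea v1.3+
		m.cancel() // cancellation propagates to child processes
		return m, tea.Quit
	case tea.KeyMsg: // legacy path: Ctrl+C arriving as a key event
		if msg.String() == "ctrl+c" {
			m.cancel()
			return m, tea.Quit
		}
	}
	return m, nil
}

func (m execModel) View() string { return "running...\n" }

func main() {
	_, cancel := context.WithCancel(context.Background())
	defer cancel() // the ctx side would feed exec.CommandContext
	if _, err := tea.NewProgram(execModel{cancel: cancel}).Run(); err != nil {
		panic(err)
	}
}
```

Both paths funnel into the same `cancel()`, which is what lets `exec.CommandContext` kill child processes; the orphan cleanup mentioned in the entry above would hang off the same branches.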
## [3.42.49] - 2026-01-16 "Unified Cluster Backup Progress"

### Added - Unified Progress Display for Cluster Backup
- **Combined overall progress bar** for cluster backup showing all phases:
  - Phase 1/3: Backing up Globals (0-15% of overall)
  - Phase 2/3: Backing up Databases (15-90% of overall)
  - Phase 3/3: Compressing Archive (90-100% of overall)
- **Current database indicator** - Shows which database is currently being backed up
- **Phase-aware progress tracking** - New fields in backup progress state:
  - `overallPhase` - Current phase (1=globals, 2=databases, 3=compressing)
  - `phaseDesc` - Human-readable phase description
- **Dual progress bars** for cluster backup:
  - Overall progress bar showing combined operation progress
  - Database count progress bar showing individual database progress

### Changed
- Cluster backup TUI now shows unified progress display matching restore
- Progress callbacks now include phase information
- Better visual feedback during the entire cluster backup operation

## [3.42.48] - 2026-01-15 "Unified Cluster Restore Progress"

### Added - Unified Progress Display for Cluster Restore
- **Combined overall progress bar** showing progress across all restore phases (see the sketch after this entry):
  - Phase 1/3: Extracting Archive (0-60% of overall)
  - Phase 2/3: Restoring Globals (60-65% of overall)
  - Phase 3/3: Restoring Databases (65-100% of overall)
- **Current database indicator** - Shows which database is currently being restored
- **Phase-aware progress tracking** - New fields in progress state:
  - `overallPhase` - Current phase (1=extraction, 2=globals, 3=databases)
  - `currentDB` - Name of database currently being restored
  - `extractionDone` - Boolean flag for phase transition
- **Dual progress bars** for cluster restore:
  - Overall progress bar showing combined operation progress
  - Phase-specific progress bar (extraction bytes or database count)

### Changed
- Cluster restore TUI now shows unified progress display
- Progress callbacks now set phase and current database information
- Extraction completion triggers automatic transition to globals phase
- Database restore phase shows current database name with spinner

### Improved
- Better visual feedback during the entire cluster restore operation
- Clear phase indicators help users understand restore progress
- Overall progress percentage gives better time estimates
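The phase ranges above are simple linear remappings of per-phase progress onto the overall bar; a sketch of the arithmetic using the cluster-restore ranges (illustrative, not the actual dbbackup code):

```go
package main

import "fmt"

// overallPercent maps a phase-local done/total fraction onto the combined
// ranges listed above: extract 0-60%, globals 60-65%, databases 65-100%.
func overallPercent(phase int, done, total float64) float64 {
	spans := map[int][2]float64{
		1: {0, 60},   // Phase 1/3: Extracting Archive
		2: {60, 65},  // Phase 2/3: Restoring Globals
		3: {65, 100}, // Phase 3/3: Restoring Databases
	}
	s, ok := spans[phase]
	if !ok || total <= 0 {
		return 0
	}
	frac := done / total
	if frac > 1 {
		frac = 1
	}
	return s[0] + frac*(s[1]-s[0])
}

func main() {
	// 7 of 20 databases restored in phase 3 → 65 + 0.35×35 = 77.25% overall
	fmt.Printf("%.2f%%\n", overallPercent(3, 7, 20))
}
```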
## [3.42.35] - 2026-01-15 "TUI Detailed Progress"

### Added - Enhanced TUI Progress Display
- **Detailed progress bar in TUI restore** - schollz-style progress bar with:
  - Byte progress display (e.g., `245 MB / 1.2 GB`)
  - Transfer speed calculation (e.g., `45 MB/s`)
  - ETA prediction for long operations
  - Unicode block-based visual bar
- **Real-time extraction progress** - Archive extraction now reports actual bytes processed
- **Go-native tar extraction** - Uses Go's `archive/tar` + `compress/gzip` when a progress callback is set
- **New `DetailedProgress` component** in TUI package:
  - `NewDetailedProgress(total, description)` - Byte-based progress
  - `NewDetailedProgressItems(total, description)` - Item count progress
  - `NewDetailedProgressSpinner(description)` - Indeterminate spinner
  - `RenderProgressBar(width)` - Generate schollz-style output
- **Progress callback API** in restore engine:
  - `SetProgressCallback(func(current, total int64, description string))`
  - Allows TUI to receive real-time progress updates from restore operations
- **Shared progress state** pattern for Bubble Tea integration

### Changed
- TUI restore execution now shows detailed byte progress during archive extraction
- Cluster restore shows extraction progress instead of just a spinner
- Falls back to the shell `tar` command when no progress callback is set (faster)

### Technical Details
- `progressReader` wrapper tracks bytes read through the gzip/tar pipeline
- Throttled progress updates (every 100ms) to avoid UI flooding
- Thread-safe shared state pattern for cross-goroutine progress updates
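The `progressReader` and throttling mentioned above fit in a few lines; a sketch using only the standard library (the archive path and callback body are illustrative — the real code feeds `archive/tar` after the gzip stage):

```go
package main

import (
	"compress/gzip"
	"io"
	"os"
	"time"
)

// progressReader counts bytes pulled through the gzip/tar pipeline and
// reports them at most every 100ms so the TUI isn't flooded with updates.
type progressReader struct {
	r        io.Reader
	read     int64
	total    int64
	last     time.Time
	callback func(current, total int64)
}

func (p *progressReader) Read(b []byte) (int, error) {
	n, err := p.r.Read(b)
	p.read += int64(n)
	if time.Since(p.last) >= 100*time.Millisecond { // throttle updates
		p.last = time.Now()
		p.callback(p.read, p.total)
	}
	return n, err
}

func main() {
	f, err := os.Open("cluster_backup.tar.gz") // illustrative path
	if err != nil {
		panic(err)
	}
	defer f.Close()
	st, _ := f.Stat()

	pr := &progressReader{r: f, total: st.Size(), callback: func(c, t int64) {
		// a TUI would write c/t into shared progress state here
	}}
	gz, err := gzip.NewReader(pr) // compressed bytes are counted as gzip reads
	if err != nil {
		panic(err)
	}
	defer gz.Close()
	_, _ = io.Copy(io.Discard, gz) // stand-in for archive/tar extraction
}
```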
## [3.42.34] - 2026-01-14 "Filesystem Abstraction"

### Added - spf13/afero for Filesystem Abstraction
- **New `internal/fs` package** for testable filesystem operations
- **In-memory filesystem** for unit testing without disk I/O
- **Global FS interface** that can be swapped for testing:

```go
fs.SetFS(afero.NewMemMapFs()) // Use memory
fs.ResetFS()                  // Back to real disk
```

- **Wrapper functions** for all common file operations:
  - `ReadFile`, `WriteFile`, `Create`, `Open`, `Remove`, `RemoveAll`
  - `Mkdir`, `MkdirAll`, `ReadDir`, `Walk`, `Glob`
  - `Exists`, `DirExists`, `IsDir`, `IsEmpty`
  - `TempDir`, `TempFile`, `CopyFile`, `FileSize`
- **Testing helpers**:
  - `WithMemFs(fn)` - Execute function with temp in-memory FS
  - `SetupTestDir(files)` - Create test directory structure
- **Comprehensive test suite** demonstrating usage

### Changed
- Upgraded afero from v1.10.0 to v1.15.0
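For illustration, the same in-memory pattern using spf13/afero directly — the `internal/fs` wrappers above presumably delegate to an `afero.Fs` in this way:

```go
package main

import (
	"fmt"

	"github.com/spf13/afero"
)

func main() {
	fs := afero.NewMemMapFs() // no disk I/O: ideal for unit tests

	if err := afero.WriteFile(fs, "/backups/globals.sql", []byte("CREATE ROLE app;"), 0o644); err != nil {
		panic(err)
	}
	data, err := afero.ReadFile(fs, "/backups/globals.sql")
	if err != nil {
		panic(err)
	}
	ok, _ := afero.Exists(fs, "/backups/globals.sql")
	fmt.Printf("exists=%v content=%q\n", ok, data)
}
```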
## [3.42.33] - 2026-01-14 "Exponential Backoff Retry"

### Added - cenkalti/backoff for Cloud Operation Retry
- **Exponential backoff retry** for all cloud operations (S3, Azure, GCS)
- **Retry configurations**:
  - `DefaultRetryConfig()` - 5 retries, 500ms→30s backoff, 5 min max
  - `AggressiveRetryConfig()` - 10 retries, 1s→60s backoff, 15 min max
  - `QuickRetryConfig()` - 3 retries, 100ms→5s backoff, 30s max
- **Smart error classification**:
  - `IsPermanentError()` - Auth/bucket errors (no retry)
  - `IsRetryableError()` - Timeout/network errors (retry)
- **Retry logging** - Each retry attempt is logged with its wait duration

### Changed
- S3 simple upload, multipart upload, and download now retry on transient failures
- Azure simple upload and download now retry on transient failures
- GCS upload and download now retry on transient failures
- Large file multipart uploads use `AggressiveRetryConfig()` (more retries)
## [3.42.32] - 2026-01-14 "Cross-Platform Colors"

### Added - fatih/color for Cross-Platform Terminal Colors
- **Windows-compatible colors** - Native Windows console API support
- **Color helper functions** in the `logger` package:
  - `Success()`, `Error()`, `Warning()`, `Info()` - Status messages with icons
  - `Header()`, `Dim()`, `Bold()` - Text styling
  - `Green()`, `Red()`, `Yellow()`, `Cyan()` - Colored text
  - `StatusLine()`, `TableRow()` - Formatted output
  - `DisableColors()`, `EnableColors()` - Runtime control
- **Consistent color scheme** across all log levels

### Changed
- Logger `CleanFormatter` now uses fatih/color instead of raw ANSI codes
- All progress indicators use fatih/color for `[OK]`/`[FAIL]` status
- Automatic color detection (disabled for non-TTY output)
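As a sketch of how such helpers can be built on fatih/color (the real `logger` signatures are not shown in this diff, so treat these as assumptions):

```go
package main

import "github.com/fatih/color"

// Success prints a green [OK] status line.
func Success(msg string) { color.New(color.FgGreen).Println("[OK]", msg) }

// Error prints a red [FAIL] status line.
func Error(msg string) { color.New(color.FgRed).Println("[FAIL]", msg) }

// DisableColors turns coloring off at runtime; fatih/color also disables
// itself automatically when output is not a TTY.
func DisableColors() { color.NoColor = true }

func main() {
	Success("backup completed")
	Error("restore failed")
}
```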
## [3.42.31] - 2026-01-14 "Visual Progress Bars"

### Added - schollz/progressbar for Enhanced Progress Display
README.md (+44)
@@ -56,7 +56,7 @@ Download from [releases](https://git.uuxo.net/UUXO/dbbackup/releases):

```bash
# Linux x86_64
-wget https://git.uuxo.net/UUXO/dbbackup/releases/download/v3.42.1/dbbackup-linux-amd64
+wget https://git.uuxo.net/UUXO/dbbackup/releases/download/v3.42.35/dbbackup-linux-amd64
chmod +x dbbackup-linux-amd64
sudo mv dbbackup-linux-amd64 /usr/local/bin/dbbackup
```
@@ -194,21 +194,51 @@ r: Restore | v: Verify | i: Info | d: Diagnose | D: Delete | R: Refresh | Esc: Back

```
Configuration Settings

+[SYSTEM] Detected Resources
+CPU: 8 physical cores, 16 logical cores
+Memory: 32GB total, 28GB available
+Recommended Profile: balanced
+→ 8 cores and 32GB RAM supports moderate parallelism
+
+[CONFIG] Current Settings
+Target DB: PostgreSQL (postgres)
+Database: postgres@localhost:5432
+Backup Dir: /var/backups/postgres
+Compression: Level 6
+Profile: balanced | Cluster: 2 parallel | Jobs: 4
+
> Database Type: postgres
CPU Workload Type: balanced
-Backup Directory: /root/db_backups
-Work Directory: /tmp
+Resource Profile: balanced (P:2 J:4)
+Cluster Parallelism: 2
+Backup Directory: /var/backups/postgres
+Work Directory: (system temp)
Compression Level: 6
-Parallel Jobs: 16
-Dump Jobs: 8
+Parallel Jobs: 4
+Dump Jobs: 4
Database Host: localhost
Database Port: 5432
-Database User: root
+Database User: postgres
SSL Mode: prefer

-s: Save | r: Reset | q: Menu
+[KEYS] ↑↓ navigate | Enter edit | 'l' toggle LargeDB | 'c' conservative | 'p' recommend | 's' save | 'q' menu
```
**Resource Profiles for Large Databases:**

When restoring large databases on VMs with limited resources, use the resource profile settings to prevent "out of shared memory" errors:

| Profile | Cluster Parallel | Jobs | Best For |
|---------|------------------|------|----------|
| conservative | 1 | 1 | Small VMs (<16GB RAM) |
| balanced | 2 | 2-4 | Medium VMs (16-32GB RAM) |
| performance | 4 | 4-8 | Large servers (32GB+ RAM) |
| max-performance | 8 | 8-16 | High-end servers (64GB+) |

**Large DB Mode:** Toggle with the `l` key. Reduces parallelism by 50% and sets max_locks_per_transaction=8192 for complex databases with many tables/LOBs.

**Quick shortcuts:** Press `l` to toggle Large DB Mode, `c` for conservative, `p` to show the recommendation.

**Database Status:**
```
Database Status & Health Check
```
@@ -3,9 +3,9 @@
This directory contains pre-compiled binaries for the DB Backup Tool across multiple platforms and architectures.

## Build Information
-- **Version**: 3.42.30
-- **Build Time**: 2026-01-14_14:59:20_UTC
-- **Git Commit**: 7b4ab76
+- **Version**: 3.42.50
+- **Build Time**: 2026-01-18_17:52:44_UTC
+- **Git Commit**: f9ff45c

## Recent Updates (v1.1.0)
- ✅ Fixed TUI progress display with line-by-line output
@@ -28,6 +28,7 @@ var (
	restoreClean       bool
	restoreCreate      bool
	restoreJobs        int
+	restoreParallelDBs int // Number of parallel database restores
	restoreTarget      string
	restoreVerbose     bool
	restoreNoProgress  bool

@@ -289,6 +290,7 @@ func init() {
	restoreClusterCmd.Flags().BoolVar(&restoreForce, "force", false, "Skip safety checks and confirmations")
	restoreClusterCmd.Flags().BoolVar(&restoreCleanCluster, "clean-cluster", false, "Drop all existing user databases before restore (disaster recovery)")
	restoreClusterCmd.Flags().IntVar(&restoreJobs, "jobs", 0, "Number of parallel decompression jobs (0 = auto)")
+	restoreClusterCmd.Flags().IntVar(&restoreParallelDBs, "parallel-dbs", 0, "Number of databases to restore in parallel (0 = use config default, 1 = sequential, -1 = auto-detect based on CPU/RAM)")
	restoreClusterCmd.Flags().StringVar(&restoreWorkdir, "workdir", "", "Working directory for extraction (use when system disk is small, e.g. /mnt/storage/restore_tmp)")
	restoreClusterCmd.Flags().BoolVar(&restoreVerbose, "verbose", false, "Show detailed restore progress")
	restoreClusterCmd.Flags().BoolVar(&restoreNoProgress, "no-progress", false, "Disable progress indicators")

@@ -783,6 +785,17 @@ func runRestoreCluster(cmd *cobra.Command, args []string) error {
		}
	}

+	// Override cluster parallelism if --parallel-dbs is specified
+	if restoreParallelDBs == -1 {
+		// Auto-detect optimal parallelism based on system resources
+		autoParallel := restore.CalculateOptimalParallel()
+		cfg.ClusterParallelism = autoParallel
+		log.Info("Auto-detected optimal parallelism for database restores", "parallel_dbs", autoParallel, "mode", "auto")
+	} else if restoreParallelDBs > 0 {
+		cfg.ClusterParallelism = restoreParallelDBs
+		log.Info("Using custom parallelism for database restores", "parallel_dbs", restoreParallelDBs)
+	}
+
	// Create restore engine
	engine := restore.New(cfg, log, db)
go.mod (+4)
@@ -57,6 +57,7 @@ require (
	github.com/aws/aws-sdk-go-v2/service/sts v1.41.2 // indirect
	github.com/aws/smithy-go v1.23.2 // indirect
	github.com/aymanbagabas/go-osc52/v2 v2.0.1 // indirect
+	github.com/cenkalti/backoff/v4 v4.3.0 // indirect
	github.com/cespare/xxhash/v2 v2.3.0 // indirect
	github.com/charmbracelet/colorprofile v0.2.3-0.20250311203215-f60798e515dc // indirect
	github.com/charmbracelet/x/ansi v0.10.1 // indirect

@@ -66,6 +67,7 @@ require (
	github.com/envoyproxy/go-control-plane/envoy v1.32.4 // indirect
	github.com/envoyproxy/protoc-gen-validate v1.2.1 // indirect
	github.com/erikgeiser/coninput v0.0.0-20211004153227-1c3628e74d0f // indirect
+	github.com/fatih/color v1.18.0 // indirect
	github.com/felixge/httpsnoop v1.0.4 // indirect
	github.com/go-jose/go-jose/v4 v4.1.2 // indirect
	github.com/go-logr/logr v1.4.3 // indirect

@@ -83,6 +85,7 @@ require (
	github.com/jackc/puddle/v2 v2.2.2 // indirect
	github.com/lucasb-eyer/go-colorful v1.2.0 // indirect
	github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect
+	github.com/mattn/go-colorable v0.1.13 // indirect
	github.com/mattn/go-isatty v0.0.20 // indirect
	github.com/mattn/go-localereader v0.0.1 // indirect
	github.com/mattn/go-runewidth v0.0.16 // indirect

@@ -94,6 +97,7 @@ require (
	github.com/power-devops/perfstat v0.0.0-20210106213030-5aafc221ea8c // indirect
	github.com/rivo/uniseg v0.4.7 // indirect
+	github.com/schollz/progressbar/v3 v3.19.0 // indirect
	github.com/spf13/afero v1.15.0 // indirect
	github.com/spiffe/go-spiffe/v2 v2.5.0 // indirect
	github.com/tklauser/go-sysconf v0.3.12 // indirect
	github.com/tklauser/numcpus v0.6.1 // indirect
go.sum (+10)
@@ -84,6 +84,8 @@ github.com/aws/smithy-go v1.23.2 h1:Crv0eatJUQhaManss33hS5r40CG3ZFH+21XSkqMrIUM=
github.com/aws/smithy-go v1.23.2/go.mod h1:LEj2LM3rBRQJxPZTB4KuzZkaZYnZPnvgIhb4pu07mx0=
github.com/aymanbagabas/go-osc52/v2 v2.0.1 h1:HwpRHbFMcZLEVr42D4p7XBqjyuxQH5SMiErDT4WkJ2k=
github.com/aymanbagabas/go-osc52/v2 v2.0.1/go.mod h1:uYgXzlJ7ZpABp8OJ+exZzJJhRNQ2ASbcXHWsFqH8hp8=
github.com/cenkalti/backoff/v4 v4.3.0 h1:MyRJ/UdXutAwSAT+s3wNd7MfTIcy71VQueUuFK343L8=
github.com/cenkalti/backoff/v4 v4.3.0/go.mod h1:Y3VNntkOUPxTVeUxJ/G5vcM//AlwfmyYozVcomhLiZE=
github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
github.com/charmbracelet/bubbles v0.21.0 h1:9TdC97SdRVg/1aaXNVWfFH3nnLAwOXr8Fn6u6mfQdFs=

@@ -119,6 +121,8 @@ github.com/envoyproxy/protoc-gen-validate v1.2.1 h1:DEo3O99U8j4hBFwbJfrz9VtgcDfU
github.com/envoyproxy/protoc-gen-validate v1.2.1/go.mod h1:d/C80l/jxXLdfEIhX1W2TmLfsJ31lvEjwamM4DxlWXU=
github.com/erikgeiser/coninput v0.0.0-20211004153227-1c3628e74d0f h1:Y/CXytFA4m6baUTXGLOoWe4PQhGxaX0KpnayAqC48p4=
github.com/erikgeiser/coninput v0.0.0-20211004153227-1c3628e74d0f/go.mod h1:vw97MGsxSvLiUE2X8qFplwetxpGLQrlU1Q9AUEIzCaM=
github.com/fatih/color v1.18.0 h1:S8gINlzdQ840/4pfAwic/ZE0djQEH3wM94VfqLTZcOM=
github.com/fatih/color v1.18.0/go.mod h1:4FelSpRwEGDpQ12mAdzqdOukCy4u8WUtOY6lkT/6HfU=
github.com/felixge/httpsnoop v1.0.4 h1:NFTV2Zj1bL4mc9sqWACXbQFVBBg2W3GPvqp8/ESS2Wg=
github.com/felixge/httpsnoop v1.0.4/go.mod h1:m8KPJKqk1gH5J9DgRY2ASl2lWCfGKXixSwevea8zH2U=
github.com/go-jose/go-jose/v4 v4.1.2 h1:TK/7NqRQZfgAh+Td8AlsrvtPoUyiHh0LqVvokh+1vHI=

@@ -169,6 +173,9 @@ github.com/lucasb-eyer/go-colorful v1.2.0 h1:1nnpGOrhyZZuNyfu1QjKiUICQ74+3FNCN69
github.com/lucasb-eyer/go-colorful v1.2.0/go.mod h1:R4dSotOR9KMtayYi1e77YzuveK+i7ruzyGqttikkLy0=
github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 h1:6E+4a0GO5zZEnZ81pIr0yLvtUWk2if982qA3F3QD6H4=
github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0/go.mod h1:zJYVVT2jmtg6P3p1VtQj7WsuWi/y4VnjVBn7F8KPB3I=
github.com/mattn/go-colorable v0.1.13 h1:fFA4WZxdEF4tXPZVKMLwD8oUnCTTo08duU7wxecdEvA=
github.com/mattn/go-colorable v0.1.13/go.mod h1:7S9/ev0klgBDR4GtXTXX8a3vIGJpMovkB8vQcUbaXHg=
github.com/mattn/go-isatty v0.0.16/go.mod h1:kYGgaQfpe5nmfYZH+SKPsOc2e4SrIfOl2e/yFXSvRLM=
github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY=
github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
github.com/mattn/go-localereader v0.0.1 h1:ygSAOl7ZXTx4RdPYinUpg6W99U8jWvWi9Ye2JC/oIi4=

@@ -204,6 +211,8 @@ github.com/shirou/gopsutil/v3 v3.24.5 h1:i0t8kL+kQTvpAYToeuiVk3TgDeKOFioZO3Ztz/i
github.com/shirou/gopsutil/v3 v3.24.5/go.mod h1:bsoOS1aStSs9ErQ1WWfxllSeS1K5D+U30r2NfcubMVk=
github.com/sirupsen/logrus v1.9.3 h1:dueUQJ1C2q9oE3F7wvmSGAaVtTmUizReu6fjN8uqzbQ=
github.com/sirupsen/logrus v1.9.3/go.mod h1:naHLuLoDiP4jHNo9R0sCBMtWGeIprob74mVsIT4qYEQ=
github.com/spf13/afero v1.15.0 h1:b/YBCLWAJdFWJTN9cLhiXXcD7mzKn9Dm86dNnfyQw1I=
github.com/spf13/afero v1.15.0/go.mod h1:NC2ByUVxtQs4b3sIUphxK0NioZnmxgyCrfzeuq8lxMg=
github.com/spf13/cobra v1.10.1 h1:lJeBwCfmrnXthfAupyUTzJ/J4Nc1RsHC/mSRU2dll/s=
github.com/spf13/cobra v1.10.1/go.mod h1:7SmJGaTHFVBY0jW4NXGluQoLvhqFQM+6XSKD+P4XaB0=
github.com/spf13/pflag v1.0.9 h1:9exaQaMOCwffKiiiYk6/BndUBv+iRViNW+4lEMi0PvY=

@@ -259,6 +268,7 @@ golang.org/x/sys v0.0.0-20190916202348-b4ddaad3f8a3/go.mod h1:h1NjWce9XRLGQEsW7w
golang.org/x/sys v0.0.0-20201204225414-ed752295db88/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210809222454-d867a43fc93e/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220715151400-c0bba94af5f8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220811171246-fbc7d0a398ab/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.11.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
@@ -94,7 +94,7 @@
	"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
-"expr": "dbbackup_rpo_seconds{instance=~\"$instance\"} < 86400",
+"expr": "dbbackup_rpo_seconds{instance=~\"$instance\"} < bool 604800",
"legendFormat": "{{database}}",
"range": true,
"refId": "A"
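Note on the change above: the `bool` modifier makes the comparison return 0/1 per series instead of filtering out non-matching series, which keeps every database row visible in table panels, and the threshold moves from 86400 s (24 hours) to 604800 s (7 days).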
@@ -711,19 +711,6 @@
},
"pluginVersion": "10.2.0",
"targets": [
-	{
-		"datasource": {
-			"type": "prometheus",
-			"uid": "${DS_PROMETHEUS}"
-		},
-		"editorMode": "code",
-		"expr": "dbbackup_rpo_seconds{instance=~\"$instance\"} < 86400",
-		"format": "table",
-		"instant": true,
-		"legendFormat": "__auto",
-		"range": false,
-		"refId": "Status"
-	},
	{
		"datasource": {
			"type": "prometheus",

@@ -769,26 +756,30 @@
"Time": true,
"Time 1": true,
"Time 2": true,
"Time 3": true,
"__name__": true,
"__name__ 1": true,
"__name__ 2": true,
"__name__ 3": true,
"instance 1": true,
"instance 2": true,
"instance 3": true,
"job": true,
"job 1": true,
"job 2": true,
"job 3": true
"engine 1": true,
"engine 2": true
},
"indexByName": {
"Database": 0,
"Instance": 1,
"Engine": 2,
"RPO": 3,
"Size": 4
},
"indexByName": {},
"renameByName": {
"Value #RPO": "RPO",
"Value #Size": "Size",
"Value #Status": "Status",
"database": "Database",
"instance": "Instance"
"instance": "Instance",
"engine": "Engine"
}
}
}

@@ -1275,7 +1266,7 @@
"query": "label_values(dbbackup_rpo_seconds, instance)",
"refId": "StandardVariableQuery"
},
-"refresh": 1,
+"refresh": 2,
"regex": "",
"skipUrlSync": false,
"sort": 1,
@@ -84,20 +84,14 @@ func findHbaFileViaPostgres() string {

// parsePgHbaConf parses pg_hba.conf and returns the authentication method
func parsePgHbaConf(path string, user string) AuthMethod {
-	// Try with sudo if we can't read directly
+	// Try to read the file directly - do NOT use sudo as it triggers password prompts
+	// If we can't read pg_hba.conf, we'll rely on connection attempts to determine auth
	file, err := os.Open(path)
	if err != nil {
-		// Try with sudo (with timeout)
-		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
-		defer cancel()
-
-		cmd := exec.CommandContext(ctx, "sudo", "cat", path)
-		output, err := cmd.Output()
-		if err != nil {
+		// If we can't read the file, return unknown and let the connection determine auth
+		// This avoids sudo password prompts when running as postgres via su
		return AuthUnknown
	}
-		return parseHbaContent(string(output), user)
-	}
	defer file.Close()

	scanner := bufio.NewScanner(file)
@@ -28,6 +28,12 @@ import (
	"dbbackup/internal/swap"
)

+// ProgressCallback is called with byte-level progress updates during backup operations
+type ProgressCallback func(current, total int64, description string)
+
+// DatabaseProgressCallback is called with database count progress during cluster backup
+type DatabaseProgressCallback func(done, total int, dbName string)
+
// Engine handles backup operations
type Engine struct {
	cfg *config.Config

@@ -36,6 +42,8 @@ type Engine struct {
	progress         progress.Indicator
	detailedReporter *progress.DetailedReporter
	silent           bool // Silent mode for TUI
+	progressCallback   ProgressCallback
+	dbProgressCallback DatabaseProgressCallback
}

// New creates a new backup engine

@@ -86,6 +94,30 @@ func NewSilent(cfg *config.Config, log logger.Logger, db database.Database, prog
	}
}

+// SetProgressCallback sets a callback for detailed progress reporting (for TUI mode)
+func (e *Engine) SetProgressCallback(cb ProgressCallback) {
+	e.progressCallback = cb
+}
+
+// SetDatabaseProgressCallback sets a callback for database count progress during cluster backup
+func (e *Engine) SetDatabaseProgressCallback(cb DatabaseProgressCallback) {
+	e.dbProgressCallback = cb
+}
+
+// reportProgress reports progress to the callback if set
+func (e *Engine) reportProgress(current, total int64, description string) {
+	if e.progressCallback != nil {
+		e.progressCallback(current, total, description)
+	}
+}
+
+// reportDatabaseProgress reports database count progress to the callback if set
+func (e *Engine) reportDatabaseProgress(done, total int, dbName string) {
+	if e.dbProgressCallback != nil {
+		e.dbProgressCallback(done, total, dbName)
+	}
+}
+
// loggerAdapter adapts our logger to the progress.Logger interface
type loggerAdapter struct {
	logger logger.Logger

@@ -465,6 +497,8 @@ func (e *Engine) BackupCluster(ctx context.Context) error {
	estimator.UpdateProgress(idx)
	e.printf("  [%d/%d] Backing up database: %s\n", idx+1, len(databases), name)
	quietProgress.Update(fmt.Sprintf("Backing up database %d/%d: %s", idx+1, len(databases), name))
+	// Report database progress to TUI callback
+	e.reportDatabaseProgress(idx+1, len(databases), name)
	mu.Unlock()

	// Check database size and warn if very large
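These callbacks are designed for the thread-safe shared-state pattern from v3.42.35: the engine goroutine writes through the callback and the TUI reads the same state under a mutex on each render tick. A self-contained sketch (`progressState` and both loops are illustrative, not the actual dbbackup code):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// progressState is shared between the engine goroutine and the TUI.
type progressState struct {
	mu          sync.Mutex
	current     int64
	total       int64
	description string
}

func main() {
	st := &progressState{}

	// This closure has the ProgressCallback shape from the diff:
	// func(current, total int64, description string)
	callback := func(current, total int64, description string) {
		st.mu.Lock()
		st.current, st.total, st.description = current, total, description
		st.mu.Unlock()
	}

	// Stand-in for engine.SetProgressCallback(callback) + a running backup.
	go func() {
		for i := int64(1); i <= 5; i++ {
			callback(i, 5, "Backing up databases")
			time.Sleep(50 * time.Millisecond)
		}
	}()

	// Stand-in for the Bubble Tea tick that re-renders the progress bar.
	for i := 0; i < 6; i++ {
		time.Sleep(60 * time.Millisecond)
		st.mu.Lock()
		fmt.Printf("%s: %d/%d\n", st.description, st.current, st.total)
		st.mu.Unlock()
	}
}
```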
@@ -903,11 +937,15 @@ func (e *Engine) createSampleBackup(ctx context.Context, databaseName, outputFil
func (e *Engine) backupGlobals(ctx context.Context, tempDir string) error {
	globalsFile := filepath.Join(tempDir, "globals.sql")

-	cmd := exec.CommandContext(ctx, "pg_dumpall", "--globals-only")
-	if e.cfg.Host != "localhost" {
-		cmd.Args = append(cmd.Args, "-h", e.cfg.Host, "-p", fmt.Sprintf("%d", e.cfg.Port))
+	// CRITICAL: Always pass port even for localhost - user may have non-standard port
+	cmd := exec.CommandContext(ctx, "pg_dumpall", "--globals-only",
+		"-p", fmt.Sprintf("%d", e.cfg.Port),
+		"-U", e.cfg.User)
+
+	// Only add -h flag for non-localhost to use Unix socket for peer auth
+	if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
+		cmd.Args = append([]string{cmd.Args[0], "-h", e.cfg.Host}, cmd.Args[1:]...)
	}
-	cmd.Args = append(cmd.Args, "-U", e.cfg.User)

	cmd.Env = os.Environ()
	if e.cfg.Password != "" {
@@ -68,8 +68,8 @@ func ClassifyError(errorMsg string) *ErrorClassification {
	Type:     "critical",
	Category: "locks",
	Message:  errorMsg,
-	Hint:     "Lock table exhausted - typically caused by large objects (BLOBs) during restore",
-	Action:   "Option 1: Increase max_locks_per_transaction to 1024+ in postgresql.conf (requires restart). Option 2: Update dbbackup and retry - phased restore now auto-enabled for BLOB databases",
+	Hint:     "Lock table exhausted. Total capacity = max_locks_per_transaction × (max_connections + max_prepared_transactions). If you reduced VM size or max_connections, you need higher max_locks_per_transaction to compensate.",
+	Action:   "Fix: ALTER SYSTEM SET max_locks_per_transaction = 4096; then RESTART PostgreSQL. For smaller VMs with fewer connections, you need higher max_locks_per_transaction values.",
	Severity: 2,
}
case "permission_denied":

@@ -142,8 +142,8 @@ func ClassifyError(errorMsg string) *ErrorClassification {
	Type:     "critical",
	Category: "locks",
	Message:  errorMsg,
-	Hint:     "Lock table exhausted - typically caused by large objects (BLOBs) during restore",
-	Action:   "Option 1: Increase max_locks_per_transaction to 1024+ in postgresql.conf (requires restart). Option 2: Update dbbackup and retry - phased restore now auto-enabled for BLOB databases",
+	Hint:     "Lock table exhausted. Total capacity = max_locks_per_transaction × (max_connections + max_prepared_transactions). If you reduced VM size or max_connections, you need higher max_locks_per_transaction to compensate.",
+	Action:   "Fix: ALTER SYSTEM SET max_locks_per_transaction = 4096; then RESTART PostgreSQL. For smaller VMs with fewer connections, you need higher max_locks_per_transaction values.",
	Severity: 2,
}
}
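Worked example for the new hint: with PostgreSQL's defaults of max_locks_per_transaction = 64, max_connections = 100, and max_prepared_transactions = 0, the lock table holds 64 × (100 + 0) = 6,400 object locks in total; the suggested `ALTER SYSTEM SET max_locks_per_transaction = 4096` raises that to 4096 × 100 = 409,600, which is also why the same restore can start failing after max_connections is lowered on a smaller VM.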
@@ -151,8 +151,14 @@ func (a *AzureBackend) Upload(ctx context.Context, localPath, remotePath string,
	return a.uploadSimple(ctx, file, blobName, fileSize, progress)
}

-// uploadSimple uploads a file using simple upload (single request)
+// uploadSimple uploads a file using simple upload (single request) with retry
func (a *AzureBackend) uploadSimple(ctx context.Context, file *os.File, blobName string, fileSize int64, progress ProgressCallback) error {
+	return RetryOperationWithNotify(ctx, DefaultRetryConfig(), func() error {
+		// Reset file position for retry
+		if _, err := file.Seek(0, 0); err != nil {
+			return fmt.Errorf("failed to reset file position: %w", err)
+		}
+
	blockBlobClient := a.client.ServiceClient().NewContainerClient(a.containerName).NewBlockBlobClient(blobName)

	// Wrap reader with progress tracking

@@ -182,6 +188,9 @@ func (a *AzureBackend) uploadSimple(ctx context.Context, file *os.File, blobName
	}

	return nil
+	}, func(err error, duration time.Duration) {
+		fmt.Printf("[Azure] Upload retry in %v: %v\n", duration, err)
+	})
}

// uploadBlocks uploads a file using block blob staging (for large files)

@@ -251,7 +260,7 @@ func (a *AzureBackend) uploadBlocks(ctx context.Context, file *os.File, blobName
	return nil
}

-// Download downloads a file from Azure Blob Storage
+// Download downloads a file from Azure Blob Storage with retry
func (a *AzureBackend) Download(ctx context.Context, remotePath, localPath string, progress ProgressCallback) error {
	blobName := strings.TrimPrefix(remotePath, "/")
	blockBlobClient := a.client.ServiceClient().NewContainerClient(a.containerName).NewBlockBlobClient(blobName)

@@ -264,6 +273,7 @@ func (a *AzureBackend) Download(ctx context.Context, remotePath, localPath strin

	fileSize := *props.ContentLength

+	return RetryOperationWithNotify(ctx, DefaultRetryConfig(), func() error {
	// Download blob
	resp, err := blockBlobClient.DownloadStream(ctx, nil)
	if err != nil {

@@ -271,7 +281,7 @@ func (a *AzureBackend) Download(ctx context.Context, remotePath, localPath strin
	}
	defer resp.Body.Close()

-	// Create local file
+	// Create/truncate local file
	file, err := os.Create(localPath)
	if err != nil {
		return fmt.Errorf("failed to create file: %w", err)

@@ -288,6 +298,9 @@ func (a *AzureBackend) Download(ctx context.Context, remotePath, localPath strin
	}

	return nil
+	}, func(err error, duration time.Duration) {
+		fmt.Printf("[Azure] Download retry in %v: %v\n", duration, err)
+	})
}

// Delete deletes a file from Azure Blob Storage
@@ -89,7 +89,7 @@ func (g *GCSBackend) Name() string {
	return "gcs"
}

-// Upload uploads a file to Google Cloud Storage
+// Upload uploads a file to Google Cloud Storage with retry
func (g *GCSBackend) Upload(ctx context.Context, localPath, remotePath string, progress ProgressCallback) error {
	file, err := os.Open(localPath)
	if err != nil {

@@ -106,6 +106,12 @@ func (g *GCSBackend) Upload(ctx context.Context, localPath, remotePath string, p
	// Remove leading slash from remote path
	objectName := strings.TrimPrefix(remotePath, "/")

+	return RetryOperationWithNotify(ctx, DefaultRetryConfig(), func() error {
+		// Reset file position for retry
+		if _, err := file.Seek(0, 0); err != nil {
+			return fmt.Errorf("failed to reset file position: %w", err)
+		}
+
	bucket := g.client.Bucket(g.bucketName)
	object := bucket.Object(objectName)

@@ -142,9 +148,12 @@ func (g *GCSBackend) Upload(ctx context.Context, localPath, remotePath string, p
	}

	return nil
+	}, func(err error, duration time.Duration) {
+		fmt.Printf("[GCS] Upload retry in %v: %v\n", duration, err)
+	})
}

-// Download downloads a file from Google Cloud Storage
+// Download downloads a file from Google Cloud Storage with retry
func (g *GCSBackend) Download(ctx context.Context, remotePath, localPath string, progress ProgressCallback) error {
	objectName := strings.TrimPrefix(remotePath, "/")

@@ -159,6 +168,7 @@ func (g *GCSBackend) Download(ctx context.Context, remotePath, localPath string,

	fileSize := attrs.Size

+	return RetryOperationWithNotify(ctx, DefaultRetryConfig(), func() error {
	// Create reader
	reader, err := object.NewReader(ctx)
	if err != nil {

@@ -166,7 +176,7 @@ func (g *GCSBackend) Download(ctx context.Context, remotePath, localPath string,
	}
	defer reader.Close()

-	// Create local file
+	// Create/truncate local file
	file, err := os.Create(localPath)
	if err != nil {
		return fmt.Errorf("failed to create file: %w", err)

@@ -183,6 +193,9 @@ func (g *GCSBackend) Download(ctx context.Context, remotePath, localPath string,
	}

	return nil
+	}, func(err error, duration time.Duration) {
+		fmt.Printf("[GCS] Download retry in %v: %v\n", duration, err)
+	})
}

// Delete deletes a file from Google Cloud Storage
internal/cloud/retry.go (new file, +257)
@@ -0,0 +1,257 @@
package cloud

import (
	"context"
	"fmt"
	"net"
	"strings"
	"time"

	"github.com/cenkalti/backoff/v4"
)

// RetryConfig configures retry behavior
type RetryConfig struct {
	MaxRetries      int           // Maximum number of retries (0 = unlimited)
	InitialInterval time.Duration // Initial backoff interval
	MaxInterval     time.Duration // Maximum backoff interval
	MaxElapsedTime  time.Duration // Maximum total time for retries
	Multiplier      float64       // Backoff multiplier
}

// DefaultRetryConfig returns sensible defaults for cloud operations
func DefaultRetryConfig() *RetryConfig {
	return &RetryConfig{
		MaxRetries:      5,
		InitialInterval: 500 * time.Millisecond,
		MaxInterval:     30 * time.Second,
		MaxElapsedTime:  5 * time.Minute,
		Multiplier:      2.0,
	}
}

// AggressiveRetryConfig returns config for critical operations that need more retries
func AggressiveRetryConfig() *RetryConfig {
	return &RetryConfig{
		MaxRetries:      10,
		InitialInterval: 1 * time.Second,
		MaxInterval:     60 * time.Second,
		MaxElapsedTime:  15 * time.Minute,
		Multiplier:      1.5,
	}
}

// QuickRetryConfig returns config for operations that should fail fast
func QuickRetryConfig() *RetryConfig {
	return &RetryConfig{
		MaxRetries:      3,
		InitialInterval: 100 * time.Millisecond,
		MaxInterval:     5 * time.Second,
		MaxElapsedTime:  30 * time.Second,
		Multiplier:      2.0,
	}
}

// RetryOperation executes an operation with exponential backoff retry
func RetryOperation(ctx context.Context, cfg *RetryConfig, operation func() error) error {
	if cfg == nil {
		cfg = DefaultRetryConfig()
	}

	// Create exponential backoff
	expBackoff := backoff.NewExponentialBackOff()
	expBackoff.InitialInterval = cfg.InitialInterval
	expBackoff.MaxInterval = cfg.MaxInterval
	expBackoff.MaxElapsedTime = cfg.MaxElapsedTime
	expBackoff.Multiplier = cfg.Multiplier
	expBackoff.Reset()

	// Wrap with max retries if specified
	var b backoff.BackOff = expBackoff
	if cfg.MaxRetries > 0 {
		b = backoff.WithMaxRetries(expBackoff, uint64(cfg.MaxRetries))
	}

	// Add context support
	b = backoff.WithContext(b, ctx)

	// Track attempts for logging
	attempt := 0

	// Wrap operation to handle permanent vs retryable errors
	wrappedOp := func() error {
		attempt++
		err := operation()
		if err == nil {
			return nil
		}

		// Check if error is permanent (should not retry)
		if IsPermanentError(err) {
			return backoff.Permanent(err)
		}

		return err
	}

	return backoff.Retry(wrappedOp, b)
}

// RetryOperationWithNotify executes an operation with retry and calls notify on each retry
func RetryOperationWithNotify(ctx context.Context, cfg *RetryConfig, operation func() error, notify func(err error, duration time.Duration)) error {
	if cfg == nil {
		cfg = DefaultRetryConfig()
	}

	// Create exponential backoff
	expBackoff := backoff.NewExponentialBackOff()
	expBackoff.InitialInterval = cfg.InitialInterval
	expBackoff.MaxInterval = cfg.MaxInterval
	expBackoff.MaxElapsedTime = cfg.MaxElapsedTime
	expBackoff.Multiplier = cfg.Multiplier
	expBackoff.Reset()

	// Wrap with max retries if specified
	var b backoff.BackOff = expBackoff
	if cfg.MaxRetries > 0 {
		b = backoff.WithMaxRetries(expBackoff, uint64(cfg.MaxRetries))
	}

	// Add context support
	b = backoff.WithContext(b, ctx)

	// Wrap operation to handle permanent vs retryable errors
	wrappedOp := func() error {
		err := operation()
		if err == nil {
			return nil
		}

		// Check if error is permanent (should not retry)
		if IsPermanentError(err) {
			return backoff.Permanent(err)
		}

		return err
	}

	return backoff.RetryNotify(wrappedOp, b, notify)
}

// IsPermanentError returns true if the error should not be retried
func IsPermanentError(err error) bool {
	if err == nil {
		return false
	}

	errStr := strings.ToLower(err.Error())

	// Authentication/authorization errors - don't retry
	permanentPatterns := []string{
		"access denied",
		"forbidden",
		"unauthorized",
		"invalid credentials",
		"invalid access key",
		"invalid secret",
		"no such bucket",
		"bucket not found",
		"container not found",
		"nosuchbucket",
		"nosuchkey",
		"invalid argument",
		"malformed",
		"invalid request",
		"permission denied",
		"access control",
		"policy",
	}

	for _, pattern := range permanentPatterns {
		if strings.Contains(errStr, pattern) {
			return true
		}
	}

	return false
}

// IsRetryableError returns true if the error is transient and should be retried
func IsRetryableError(err error) bool {
	if err == nil {
		return false
	}

	// Network errors are typically retryable
	var netErr net.Error
	if ok := isNetError(err, &netErr); ok {
		return netErr.Timeout() || netErr.Temporary()
	}

	errStr := strings.ToLower(err.Error())

	// Transient errors - should retry
	retryablePatterns := []string{
		"timeout",
		"connection reset",
		"connection refused",
		"connection closed",
		"eof",
		"broken pipe",
		"temporary failure",
		"service unavailable",
		"internal server error",
		"bad gateway",
		"gateway timeout",
		"too many requests",
		"rate limit",
		"throttl",
		"slowdown",
		"try again",
		"retry",
	}

	for _, pattern := range retryablePatterns {
		if strings.Contains(errStr, pattern) {
			return true
		}
	}

	return false
}

// isNetError checks if err wraps a net.Error
func isNetError(err error, target *net.Error) bool {
	for err != nil {
		if ne, ok := err.(net.Error); ok {
			*target = ne
			return true
		}
		// Try to unwrap
		if unwrapper, ok := err.(interface{ Unwrap() error }); ok {
			err = unwrapper.Unwrap()
		} else {
			break
		}
	}
	return false
}

// WithRetry is a helper that wraps a function with default retry logic
func WithRetry(ctx context.Context, operationName string, fn func() error) error {
	notify := func(err error, duration time.Duration) {
		// Log retry attempts (caller can provide their own logger if needed)
		fmt.Printf("[RETRY] %s failed, retrying in %v: %v\n", operationName, duration, err)
	}

	return RetryOperationWithNotify(ctx, DefaultRetryConfig(), fn, notify)
}

// WithRetryConfig is a helper that wraps a function with custom retry config
func WithRetryConfig(ctx context.Context, cfg *RetryConfig, operationName string, fn func() error) error {
	notify := func(err error, duration time.Duration) {
		fmt.Printf("[RETRY] %s failed, retrying in %v: %v\n", operationName, duration, err)
	}

	return RetryOperationWithNotify(ctx, cfg, fn, notify)
}
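A usage sketch for the helpers above (assuming the `dbbackup` module path seen in this diff's other imports; the failing operation is simulated):

```go
package main

import (
	"context"
	"errors"
	"fmt"

	"dbbackup/internal/cloud"
)

func main() {
	ctx := context.Background()
	attempts := 0

	// Transient failures are retried on DefaultRetryConfig's 500ms→30s
	// exponential schedule; errors matched by IsPermanentError (auth,
	// missing bucket, ...) short-circuit via backoff.Permanent instead.
	err := cloud.WithRetry(ctx, "demo upload", func() error {
		attempts++
		if attempts < 3 {
			return errors.New("connection reset by peer") // retryable
		}
		return nil
	})
	fmt.Println("attempts:", attempts, "err:", err)
}
```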
@@ -7,6 +7,7 @@ import (
	"os"
	"path/filepath"
	"strings"
+	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"

@@ -123,8 +124,14 @@ func (s *S3Backend) Upload(ctx context.Context, localPath, remotePath string, pr
	return s.uploadSimple(ctx, file, key, fileSize, progress)
}

-// uploadSimple performs a simple single-part upload
+// uploadSimple performs a simple single-part upload with retry
func (s *S3Backend) uploadSimple(ctx context.Context, file *os.File, key string, fileSize int64, progress ProgressCallback) error {
+	return RetryOperationWithNotify(ctx, DefaultRetryConfig(), func() error {
+		// Reset file position for retry
+		if _, err := file.Seek(0, 0); err != nil {
+			return fmt.Errorf("failed to reset file position: %w", err)
+		}
+
	// Create progress reader
	var reader io.Reader = file
	if progress != nil {

@@ -143,10 +150,19 @@ func (s *S3Backend) uploadSimple(ctx context.Context, file *os.File, key string,
	}

	return nil
+	}, func(err error, duration time.Duration) {
+		fmt.Printf("[S3] Upload retry in %v: %v\n", duration, err)
+	})
}

-// uploadMultipart performs a multipart upload for large files
+// uploadMultipart performs a multipart upload for large files with retry
func (s *S3Backend) uploadMultipart(ctx context.Context, file *os.File, key string, fileSize int64, progress ProgressCallback) error {
+	return RetryOperationWithNotify(ctx, AggressiveRetryConfig(), func() error {
+		// Reset file position for retry
+		if _, err := file.Seek(0, 0); err != nil {
+			return fmt.Errorf("failed to reset file position: %w", err)
+		}
+
	// Create uploader with custom options
	uploader := manager.NewUploader(s.client, func(u *manager.Uploader) {
		// Part size: 10MB

@@ -177,9 +193,12 @@ func (s *S3Backend) uploadMultipart(ctx context.Context, file *os.File, key stri
	}

	return nil
+	}, func(err error, duration time.Duration) {
+		fmt.Printf("[S3] Multipart upload retry in %v: %v\n", duration, err)
+	})
}

-// Download downloads a file from S3
+// Download downloads a file from S3 with retry
func (s *S3Backend) Download(ctx context.Context, remotePath, localPath string, progress ProgressCallback) error {
	// Build S3 key
	key := s.buildKey(remotePath)

@@ -190,6 +209,12 @@ func (s *S3Backend) Download(ctx context.Context, remotePath, localPath string,
		return fmt.Errorf("failed to get object size: %w", err)
	}

+	// Create directory for local file
+	if err := os.MkdirAll(filepath.Dir(localPath), 0755); err != nil {
+		return fmt.Errorf("failed to create directory: %w", err)
+	}
+
+	return RetryOperationWithNotify(ctx, DefaultRetryConfig(), func() error {
	// Download from S3
	result, err := s.client.GetObject(ctx, &s3.GetObjectInput{
		Bucket: aws.String(s.bucket),

@@ -200,11 +225,7 @@ func (s *S3Backend) Download(ctx context.Context, remotePath, localPath string,
	}
	defer result.Body.Close()

-	// Create local file
-	if err := os.MkdirAll(filepath.Dir(localPath), 0755); err != nil {
-		return fmt.Errorf("failed to create directory: %w", err)
-	}
-
+	// Create/truncate local file
	outFile, err := os.Create(localPath)
	if err != nil {
		return fmt.Errorf("failed to create local file: %w", err)

@@ -223,6 +244,9 @@ func (s *S3Backend) Download(ctx context.Context, remotePath, localPath string,
	}

	return nil
+	}, func(err error, duration time.Duration) {
+		fmt.Printf("[S3] Download retry in %v: %v\n", duration, err)
+	})
}

// List lists all backup files in S3
@@ -36,9 +36,14 @@ type Config struct {
	AutoDetectCores bool
	CPUWorkloadType string // "cpu-intensive", "io-intensive", "balanced"

+	// Resource profile for backup/restore operations
+	ResourceProfile string // "conservative", "balanced", "performance", "max-performance"
+	LargeDBMode     bool   // Enable large database mode (reduces parallelism, increases max_locks)
+
	// CPU detection
	CPUDetector *cpu.Detector
	CPUInfo     *cpu.CPUInfo
+	MemoryInfo  *cpu.MemoryInfo // System memory information

	// Sample backup options
	SampleStrategy string // "ratio", "percent", "count"

@@ -178,6 +183,13 @@ func New() *Config {
		sslMode = ""
	}

+	// Detect memory information
+	memInfo, _ := cpu.DetectMemory()
+
+	// Determine recommended resource profile
+	recommendedProfile := cpu.RecommendProfile(cpuInfo, memInfo, false)
+	defaultProfile := getEnvString("RESOURCE_PROFILE", recommendedProfile.Name)
+
	cfg := &Config{
		// Database defaults
		Host: host,

@@ -189,18 +201,21 @@ func New() *Config {
		SSLMode:  sslMode,
		Insecure: getEnvBool("INSECURE", false),

-		// Backup defaults
+		// Backup defaults - use recommended profile's settings for small VMs
		BackupDir:        backupDir,
		CompressionLevel: getEnvInt("COMPRESS_LEVEL", 6),
-		Jobs:             getEnvInt("JOBS", getDefaultJobs(cpuInfo)),
-		DumpJobs:         getEnvInt("DUMP_JOBS", getDefaultDumpJobs(cpuInfo)),
+		Jobs:             getEnvInt("JOBS", recommendedProfile.Jobs),
+		DumpJobs:         getEnvInt("DUMP_JOBS", recommendedProfile.DumpJobs),
		MaxCores:         getEnvInt("MAX_CORES", getDefaultMaxCores(cpuInfo)),
		AutoDetectCores:  getEnvBool("AUTO_DETECT_CORES", true),
		CPUWorkloadType:  getEnvString("CPU_WORKLOAD_TYPE", "balanced"),
+		ResourceProfile:  defaultProfile,
+		LargeDBMode:      getEnvBool("LARGE_DB_MODE", false),

-		// CPU detection
+		// CPU and memory detection
		CPUDetector: cpuDetector,
		CPUInfo:     cpuInfo,
+		MemoryInfo:  memInfo,

		// Sample backup defaults
		SampleStrategy: getEnvString("SAMPLE_STRATEGY", "ratio"),

@@ -220,8 +235,8 @@ func New() *Config {
		// Timeouts - default 24 hours (1440 min) to handle very large databases with large objects
		ClusterTimeoutMinutes: getEnvInt("CLUSTER_TIMEOUT_MIN", 1440),

-		// Cluster parallelism (default: 2 concurrent operations for faster cluster backup/restore)
-		ClusterParallelism: getEnvInt("CLUSTER_PARALLELISM", 2),
+		// Cluster parallelism - use recommended profile's setting for small VMs
+		ClusterParallelism: getEnvInt("CLUSTER_PARALLELISM", recommendedProfile.ClusterParallelism),

		// Working directory for large operations (default: system temp)
		WorkDir: getEnvString("WORK_DIR", ""),

@@ -409,6 +424,56 @@ func (c *Config) OptimizeForCPU() error {
	return nil
}

+// ApplyResourceProfile applies a resource profile to the configuration
+// This adjusts parallelism settings based on the chosen profile
+func (c *Config) ApplyResourceProfile(profileName string) error {
+	profile := cpu.GetProfileByName(profileName)
+	if profile == nil {
+		return &ConfigError{
+			Field:   "resource_profile",
+			Value:   profileName,
+			Message: "unknown profile. Valid profiles: conservative, balanced, performance, max-performance",
+		}
+	}
+
+	// Validate profile against current system
+	isValid, warnings := cpu.ValidateProfileForSystem(profile, c.CPUInfo, c.MemoryInfo)
+	if !isValid {
+		// Log warnings but don't block - user may know what they're doing
+		_ = warnings // In production, log these warnings
+	}
+
+	// Apply profile settings
+	c.ResourceProfile = profile.Name
+	c.ClusterParallelism = profile.ClusterParallelism
+	c.Jobs = profile.Jobs
+	c.DumpJobs = profile.DumpJobs
+
+	return nil
+}
+
+// GetResourceProfileRecommendation returns the recommended profile and reason
+func (c *Config) GetResourceProfileRecommendation(isLargeDB bool) (string, string) {
+	profile, reason := cpu.RecommendProfileWithReason(c.CPUInfo, c.MemoryInfo, isLargeDB)
+	return profile.Name, reason
+}
+
+// GetCurrentProfile returns the current resource profile details
+// If LargeDBMode is enabled, returns a modified profile with reduced parallelism
+func (c *Config) GetCurrentProfile() *cpu.ResourceProfile {
+	profile := cpu.GetProfileByName(c.ResourceProfile)
+	if profile == nil {
+		return nil
+	}
+
+	// Apply LargeDBMode modifier if enabled
+	if c.LargeDBMode {
+		return cpu.ApplyLargeDBMode(profile)
+	}
+
+	return profile
+}
+
// GetCPUInfo returns CPU information, detecting if necessary
func (c *Config) GetCPUInfo() (*cpu.CPUInfo, error) {
	if c.CPUInfo != nil {
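A usage sketch tying the profile plumbing together — it assumes the Config shown above lives at `dbbackup/internal/config`, which this diff does not state explicitly:

```go
package main

import (
	"fmt"

	"dbbackup/internal/config"
)

func main() {
	cfg := config.New() // picks the recommended profile for this machine

	if err := cfg.ApplyResourceProfile("conservative"); err != nil {
		panic(err) // unknown name → ConfigError listing valid profiles
	}
	cfg.LargeDBMode = true // halves parallelism, max_locks_per_txn → 8192

	p := cfg.GetCurrentProfile()
	fmt.Printf("%s: cluster=%d jobs=%d locks=%d\n",
		p.Name, p.ClusterParallelism, p.Jobs, p.MaxLocksPerTxn)
}
```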
@@ -31,6 +31,8 @@ type LocalConfig struct {
	CPUWorkload    string
	MaxCores       int
	ClusterTimeout int // Cluster operation timeout in minutes (default: 1440 = 24 hours)
+	ResourceProfile string
+	LargeDBMode     bool // Enable large database mode (reduces parallelism, increases locks)

	// Security settings
	RetentionDays int

@@ -126,6 +128,10 @@ func LoadLocalConfig() (*LocalConfig, error) {
			if ct, err := strconv.Atoi(value); err == nil {
				cfg.ClusterTimeout = ct
			}
+		case "resource_profile":
+			cfg.ResourceProfile = value
+		case "large_db_mode":
+			cfg.LargeDBMode = value == "true" || value == "1"
		}
	case "security":
		switch key {

@@ -207,6 +213,12 @@ func SaveLocalConfig(cfg *LocalConfig) error {
	if cfg.ClusterTimeout != 0 {
		sb.WriteString(fmt.Sprintf("cluster_timeout = %d\n", cfg.ClusterTimeout))
	}
+	if cfg.ResourceProfile != "" {
+		sb.WriteString(fmt.Sprintf("resource_profile = %s\n", cfg.ResourceProfile))
+	}
+	if cfg.LargeDBMode {
+		sb.WriteString("large_db_mode = true\n")
+	}
	sb.WriteString("\n")

	// Security section

@@ -280,6 +292,14 @@ func ApplyLocalConfig(cfg *Config, local *LocalConfig) {
	if local.ClusterTimeout != 0 {
		cfg.ClusterTimeoutMinutes = local.ClusterTimeout
	}
+	// Apply resource profile settings
+	if local.ResourceProfile != "" {
+		cfg.ResourceProfile = local.ResourceProfile
+	}
+	// LargeDBMode is a boolean - apply if true in config
+	if local.LargeDBMode {
+		cfg.LargeDBMode = true
+	}
	if cfg.RetentionDays == 30 && local.RetentionDays != 0 {
		cfg.RetentionDays = local.RetentionDays
	}

@@ -308,6 +328,8 @@ func ConfigFromConfig(cfg *Config) *LocalConfig {
	CPUWorkload:    cfg.CPUWorkloadType,
	MaxCores:       cfg.MaxCores,
	ClusterTimeout: cfg.ClusterTimeoutMinutes,
+	ResourceProfile: cfg.ResourceProfile,
+	LargeDBMode:     cfg.LargeDBMode,
	RetentionDays:  cfg.RetentionDays,
	MinBackups:     cfg.MinBackups,
	MaxRetries:     cfg.MaxRetries,
internal/cpu/profiles.go (new file, +475)
@@ -0,0 +1,475 @@
package cpu

import (
	"bufio"
	"fmt"
	"os"
	"os/exec"
	"runtime"
	"strconv"
	"strings"
)

// MemoryInfo holds system memory information
type MemoryInfo struct {
	TotalBytes     int64  `json:"total_bytes"`
	AvailableBytes int64  `json:"available_bytes"`
	FreeBytes      int64  `json:"free_bytes"`
	UsedBytes      int64  `json:"used_bytes"`
	SwapTotalBytes int64  `json:"swap_total_bytes"`
	SwapFreeBytes  int64  `json:"swap_free_bytes"`
	TotalGB        int    `json:"total_gb"`
	AvailableGB    int    `json:"available_gb"`
	Platform       string `json:"platform"`
}

// ResourceProfile defines a resource allocation profile for backup/restore operations
type ResourceProfile struct {
	Name                string `json:"name"`
	Description         string `json:"description"`
	ClusterParallelism  int    `json:"cluster_parallelism"`   // Concurrent databases
	Jobs                int    `json:"jobs"`                  // Parallel jobs within pg_restore
	DumpJobs            int    `json:"dump_jobs"`             // Parallel jobs for pg_dump
	MaintenanceWorkMem  string `json:"maintenance_work_mem"`  // PostgreSQL recommendation
	MaxLocksPerTxn      int    `json:"max_locks_per_txn"`     // PostgreSQL recommendation
	RecommendedForLarge bool   `json:"recommended_for_large"` // Suitable for large DBs?
	MinMemoryGB         int    `json:"min_memory_gb"`         // Minimum memory for this profile
	MinCores            int    `json:"min_cores"`             // Minimum cores for this profile
}

// Predefined resource profiles
var (
	// ProfileConservative - Safe for constrained VMs, avoids shared memory issues
	ProfileConservative = ResourceProfile{
		Name:                "conservative",
		Description:         "Safe for small VMs (2-4 cores, <16GB). Sequential operations, minimal memory pressure. Best for large DBs on limited hardware.",
		ClusterParallelism:  1,
		Jobs:                1,
		DumpJobs:            2,
		MaintenanceWorkMem:  "256MB",
		MaxLocksPerTxn:      4096,
		RecommendedForLarge: true,
		MinMemoryGB:         4,
		MinCores:            2,
	}

	// ProfileBalanced - Default profile, works for most scenarios
	ProfileBalanced = ResourceProfile{
		Name:                "balanced",
		Description:         "Balanced for medium VMs (4-8 cores, 16-32GB). Moderate parallelism with good safety margin.",
		ClusterParallelism:  2,
		Jobs:                2,
		DumpJobs:            4,
		MaintenanceWorkMem:  "512MB",
		MaxLocksPerTxn:      2048,
		RecommendedForLarge: true,
		MinMemoryGB:         16,
		MinCores:            4,
	}

	// ProfilePerformance - Aggressive parallelism for powerful servers
	ProfilePerformance = ResourceProfile{
		Name:                "performance",
		Description:         "Aggressive for powerful servers (8+ cores, 32GB+). Maximum parallelism for fast operations.",
		ClusterParallelism:  4,
		Jobs:                4,
		DumpJobs:            8,
		MaintenanceWorkMem:  "1GB",
		MaxLocksPerTxn:      1024,
		RecommendedForLarge: false, // Large DBs may still need conservative
		MinMemoryGB:         32,
		MinCores:            8,
	}

	// ProfileMaxPerformance - Maximum parallelism for high-end servers
	ProfileMaxPerformance = ResourceProfile{
		Name:                "max-performance",
		Description:         "Maximum for high-end servers (16+ cores, 64GB+). Full CPU utilization.",
		ClusterParallelism:  8,
		Jobs:                8,
		DumpJobs:            16,
		MaintenanceWorkMem:  "2GB",
		MaxLocksPerTxn:      512,
		RecommendedForLarge: false, // Large DBs should use LargeDBMode
		MinMemoryGB:         64,
		MinCores:            16,
	}

	// AllProfiles contains all available profiles (VM resource-based)
	AllProfiles = []ResourceProfile{
		ProfileConservative,
		ProfileBalanced,
		ProfilePerformance,
		ProfileMaxPerformance,
	}
)

// GetProfileByName returns a profile by its name
func GetProfileByName(name string) *ResourceProfile {
	for _, p := range AllProfiles {
		if strings.EqualFold(p.Name, name) {
			return &p
		}
	}
	return nil
}

// ApplyLargeDBMode modifies a profile for large database operations.
// This is a modifier that reduces parallelism and increases max_locks_per_transaction
// to prevent "out of shared memory" errors with large databases (many tables, LOBs, etc.).
// It returns a new profile with adjusted settings, leaving the original unchanged.
func ApplyLargeDBMode(profile *ResourceProfile) *ResourceProfile {
	if profile == nil {
		return nil
	}

	// Create a copy with adjusted settings
	modified := *profile

	// Append a "+large-db" suffix to indicate this is a modified profile
	modified.Name = profile.Name + " +large-db"
	modified.Description = fmt.Sprintf("%s [LargeDBMode: reduced parallelism, high locks]", profile.Description)

	// Reduce parallelism to avoid lock exhaustion
	// Rule: halve parallelism (minimum 1; DumpJobs minimum 2)
	modified.ClusterParallelism = max(1, profile.ClusterParallelism/2)
	modified.Jobs = max(1, profile.Jobs/2)
	modified.DumpJobs = max(2, profile.DumpJobs/2)

	// Force high max_locks_per_transaction for large schemas
	modified.MaxLocksPerTxn = 8192

	// Keep or boost maintenance_work_mem for complex operations
	modified.MaintenanceWorkMem = "1GB"
	if profile.MinMemoryGB >= 32 {
		modified.MaintenanceWorkMem = "2GB"
	}

	modified.RecommendedForLarge = true

	return &modified
}

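A minimal sketch of how profile lookup and the large-DB modifier compose (the surrounding CLI wiring is assumed, not shown in this diff):

	profile := GetProfileByName("balanced")
	if profile == nil {
		profile = &ProfileConservative // unknown name: fall back to the safest profile
	}
	if cfg.LargeDBMode { // hypothetical flag, as applied by ApplyLocalConfig earlier
		profile = ApplyLargeDBMode(profile) // halves parallelism, forces MaxLocksPerTxn to 8192
	}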
// max returns the larger of two integers
func max(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// DetectMemory detects system memory information
func DetectMemory() (*MemoryInfo, error) {
	info := &MemoryInfo{
		Platform: runtime.GOOS,
	}

	switch runtime.GOOS {
	case "linux":
		if err := detectLinuxMemory(info); err != nil {
			return info, fmt.Errorf("linux memory detection failed: %w", err)
		}
	case "darwin":
		if err := detectDarwinMemory(info); err != nil {
			return info, fmt.Errorf("darwin memory detection failed: %w", err)
		}
	case "windows":
		if err := detectWindowsMemory(info); err != nil {
			return info, fmt.Errorf("windows memory detection failed: %w", err)
		}
	default:
		// Fallback: use Go runtime memory stats
		var memStats runtime.MemStats
		runtime.ReadMemStats(&memStats)
		info.TotalBytes = int64(memStats.Sys)
		info.AvailableBytes = int64(memStats.Sys - memStats.Alloc)
	}

	// Calculate GB values
	info.TotalGB = int(info.TotalBytes / (1024 * 1024 * 1024))
	info.AvailableGB = int(info.AvailableBytes / (1024 * 1024 * 1024))

	return info, nil
}

// detectLinuxMemory reads memory info from /proc/meminfo
func detectLinuxMemory(info *MemoryInfo) error {
	file, err := os.Open("/proc/meminfo")
	if err != nil {
		return err
	}
	defer file.Close()

	scanner := bufio.NewScanner(file)
	for scanner.Scan() {
		line := scanner.Text()
		parts := strings.Fields(line)
		if len(parts) < 2 {
			continue
		}

		key := strings.TrimSuffix(parts[0], ":")
		value, err := strconv.ParseInt(parts[1], 10, 64)
		if err != nil {
			continue
		}

		// Values are in kB
		valueBytes := value * 1024

		switch key {
		case "MemTotal":
			info.TotalBytes = valueBytes
		case "MemAvailable":
			info.AvailableBytes = valueBytes
		case "MemFree":
			info.FreeBytes = valueBytes
		case "SwapTotal":
			info.SwapTotalBytes = valueBytes
		case "SwapFree":
			info.SwapFreeBytes = valueBytes
		}
	}

	info.UsedBytes = info.TotalBytes - info.AvailableBytes

	return scanner.Err()
}

// detectDarwinMemory detects memory on macOS
func detectDarwinMemory(info *MemoryInfo) error {
	// Use sysctl for total memory
	if output, err := runCommand("sysctl", "-n", "hw.memsize"); err == nil {
		if val, err := strconv.ParseInt(strings.TrimSpace(output), 10, 64); err == nil {
			info.TotalBytes = val
		}
	}

	// Use vm_stat for available memory (more complex parsing required)
	if output, err := runCommand("vm_stat"); err == nil {
		pageSize := int64(4096) // Default page size
		var freePages, inactivePages int64

		lines := strings.Split(output, "\n")
		for _, line := range lines {
			if strings.Contains(line, "page size of") {
				parts := strings.Fields(line)
				for i, p := range parts {
					if p == "of" && i+1 < len(parts) {
						if ps, err := strconv.ParseInt(parts[i+1], 10, 64); err == nil {
							pageSize = ps
						}
					}
				}
			} else if strings.Contains(line, "Pages free:") {
				val := extractNumberFromLine(line)
				freePages = val
			} else if strings.Contains(line, "Pages inactive:") {
				val := extractNumberFromLine(line)
				inactivePages = val
			}
		}

		info.FreeBytes = freePages * pageSize
		info.AvailableBytes = (freePages + inactivePages) * pageSize
	}

	info.UsedBytes = info.TotalBytes - info.AvailableBytes
	return nil
}

// detectWindowsMemory detects memory on Windows
func detectWindowsMemory(info *MemoryInfo) error {
	// Use wmic for memory info
	if output, err := runCommand("wmic", "OS", "get", "TotalVisibleMemorySize,FreePhysicalMemory", "/format:list"); err == nil {
		lines := strings.Split(output, "\n")
		for _, line := range lines {
			line = strings.TrimSpace(line)
			if strings.HasPrefix(line, "TotalVisibleMemorySize=") {
				val := strings.TrimPrefix(line, "TotalVisibleMemorySize=")
				if v, err := strconv.ParseInt(strings.TrimSpace(val), 10, 64); err == nil {
					info.TotalBytes = v * 1024 // KB to bytes
				}
			} else if strings.HasPrefix(line, "FreePhysicalMemory=") {
				val := strings.TrimPrefix(line, "FreePhysicalMemory=")
				if v, err := strconv.ParseInt(strings.TrimSpace(val), 10, 64); err == nil {
					info.FreeBytes = v * 1024
					info.AvailableBytes = v * 1024
				}
			}
		}
	}

	info.UsedBytes = info.TotalBytes - info.AvailableBytes
	return nil
}

// RecommendProfile recommends a resource profile based on system resources and workload
func RecommendProfile(cpuInfo *CPUInfo, memInfo *MemoryInfo, isLargeDB bool) *ResourceProfile {
	cores := 0
	if cpuInfo != nil {
		cores = cpuInfo.PhysicalCores
		if cores == 0 {
			cores = cpuInfo.LogicalCores
		}
	}
	if cores == 0 {
		cores = runtime.NumCPU()
	}

	memGB := 0
	if memInfo != nil {
		memGB = memInfo.TotalGB
	}

	// Special case: large databases should use the conservative profile.
	// The caller should also enable LargeDBMode for increased MaxLocksPerTxn.
	if isLargeDB {
		// For large DBs, recommend conservative regardless of resources;
		// the LargeDBMode flag will handle the lock settings separately.
		return &ProfileConservative
	}

	// Resource-based selection
	if cores >= 16 && memGB >= 64 {
		return &ProfileMaxPerformance
	} else if cores >= 8 && memGB >= 32 {
		return &ProfilePerformance
	} else if cores >= 4 && memGB >= 16 {
		return &ProfileBalanced
	}

	// Default to conservative for constrained systems
	return &ProfileConservative
}

// RecommendProfileWithReason returns a profile recommendation with explanation
func RecommendProfileWithReason(cpuInfo *CPUInfo, memInfo *MemoryInfo, isLargeDB bool) (*ResourceProfile, string) {
	cores := 0
	if cpuInfo != nil {
		cores = cpuInfo.PhysicalCores
		if cores == 0 {
			cores = cpuInfo.LogicalCores
		}
	}
	if cores == 0 {
		cores = runtime.NumCPU()
	}

	memGB := 0
	if memInfo != nil {
		memGB = memInfo.TotalGB
	}

	// Build reason string
	var reason strings.Builder
	reason.WriteString(fmt.Sprintf("System: %d cores, %dGB RAM. ", cores, memGB))

	profile := RecommendProfile(cpuInfo, memInfo, isLargeDB)

	if isLargeDB {
		reason.WriteString("Large database mode - using conservative settings. Enable LargeDBMode for higher max_locks.")
	} else if profile.Name == "conservative" {
		reason.WriteString("Limited resources detected - using conservative profile for stability.")
	} else if profile.Name == "max-performance" {
		reason.WriteString("High-end server detected - using maximum parallelism.")
	} else if profile.Name == "performance" {
		reason.WriteString("Good resources detected - using performance profile.")
	} else {
		reason.WriteString("Using balanced profile for optimal performance/stability trade-off.")
	}

	return profile, reason.String()
}

// ValidateProfileForSystem checks if a profile is suitable for the current system
func ValidateProfileForSystem(profile *ResourceProfile, cpuInfo *CPUInfo, memInfo *MemoryInfo) (bool, []string) {
	var warnings []string

	cores := 0
	if cpuInfo != nil {
		cores = cpuInfo.PhysicalCores
		if cores == 0 {
			cores = cpuInfo.LogicalCores
		}
	}
	if cores == 0 {
		cores = runtime.NumCPU()
	}

	memGB := 0
	if memInfo != nil {
		memGB = memInfo.TotalGB
	}

	// Check minimum requirements
	if cores < profile.MinCores {
		warnings = append(warnings,
			fmt.Sprintf("Profile '%s' recommends %d+ cores (system has %d)", profile.Name, profile.MinCores, cores))
	}

	if memGB < profile.MinMemoryGB {
		warnings = append(warnings,
			fmt.Sprintf("Profile '%s' recommends %dGB+ RAM (system has %dGB)", profile.Name, profile.MinMemoryGB, memGB))
	}

	// Check for potential issues
	if profile.ClusterParallelism > cores {
		warnings = append(warnings,
			fmt.Sprintf("Cluster parallelism (%d) exceeds CPU cores (%d) - may cause contention",
				profile.ClusterParallelism, cores))
	}

	// Memory pressure warning
	memPerWorker := 2 // Rough estimate: 2GB per parallel worker for large DB operations
	requiredMem := profile.ClusterParallelism * profile.Jobs * memPerWorker
	if memGB > 0 && requiredMem > memGB {
		warnings = append(warnings,
			fmt.Sprintf("High parallelism may require ~%dGB RAM (system has %dGB) - risk of OOM",
				requiredMem, memGB))
	}

	return len(warnings) == 0, warnings
}

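A sketch of the detect-recommend-validate flow these functions enable; passing nil for cpuInfo is legal and falls back to runtime.NumCPU(), as the code above shows:

	memInfo, err := DetectMemory()
	if err != nil {
		fmt.Println("memory detection (best-effort):", err) // partial info is still usable
	}
	profile, reason := RecommendProfileWithReason(nil, memInfo, false)
	fmt.Printf("Using profile %q: %s\n", profile.Name, reason)
	if ok, warnings := ValidateProfileForSystem(profile, nil, memInfo); !ok {
		for _, w := range warnings {
			fmt.Println("warning:", w)
		}
	}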
// FormatProfileSummary returns a formatted summary of a profile
func (p *ResourceProfile) FormatProfileSummary() string {
	return fmt.Sprintf("[%s] Parallel: %d DBs, %d jobs | Recommended for large DBs: %v",
		strings.ToUpper(p.Name),
		p.ClusterParallelism,
		p.Jobs,
		p.RecommendedForLarge)
}

// PostgreSQLRecommendations returns PostgreSQL configuration recommendations for this profile
func (p *ResourceProfile) PostgreSQLRecommendations() []string {
	return []string{
		fmt.Sprintf("ALTER SYSTEM SET max_locks_per_transaction = %d;", p.MaxLocksPerTxn),
		fmt.Sprintf("ALTER SYSTEM SET maintenance_work_mem = '%s';", p.MaintenanceWorkMem),
		"-- Restart PostgreSQL after changes to max_locks_per_transaction",
	}
}

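As a worked example, calling this on ProfileConservative (MaxLocksPerTxn 4096, MaintenanceWorkMem "256MB") yields:

	ALTER SYSTEM SET max_locks_per_transaction = 4096;
	ALTER SYSTEM SET maintenance_work_mem = '256MB';
	-- Restart PostgreSQL after changes to max_locks_per_transaction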
// Helper functions

func runCommand(name string, args ...string) (string, error) {
	cmd := exec.Command(name, args...)
	output, err := cmd.Output()
	if err != nil {
		return "", err
	}
	return string(output), nil
}

func extractNumberFromLine(line string) int64 {
	// Extract the number before the period at end (e.g., "Pages free: 123456.")
	parts := strings.Fields(line)
	for _, p := range parts {
		p = strings.TrimSuffix(p, ".")
		if val, err := strconv.ParseInt(p, 10, 64); err == nil && val > 0 {
			return val
		}
	}
	return 0
}
@@ -316,11 +316,12 @@ func (p *PostgreSQL) BuildBackupCommand(database, outputFile string, options Bac
	cmd := []string{"pg_dump"}

	// Connection parameters
	if p.cfg.Host != "localhost" {
	// CRITICAL: Always pass port even for localhost - user may have non-standard port
	if p.cfg.Host != "localhost" && p.cfg.Host != "127.0.0.1" && p.cfg.Host != "" {
		cmd = append(cmd, "-h", p.cfg.Host)
		cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
		cmd = append(cmd, "--no-password")
	}
	cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
	cmd = append(cmd, "-U", p.cfg.User)

	// Format and compression
@@ -380,11 +381,12 @@ func (p *PostgreSQL) BuildRestoreCommand(database, inputFile string, options Res
	cmd := []string{"pg_restore"}

	// Connection parameters
	if p.cfg.Host != "localhost" {
	// CRITICAL: Always pass port even for localhost - user may have non-standard port
	if p.cfg.Host != "localhost" && p.cfg.Host != "127.0.0.1" && p.cfg.Host != "" {
		cmd = append(cmd, "-h", p.cfg.Host)
		cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
		cmd = append(cmd, "--no-password")
	}
	cmd = append(cmd, "-p", strconv.Itoa(p.cfg.Port))
	cmd = append(cmd, "-U", p.cfg.User)

	// Parallel jobs (incompatible with --single-transaction per PostgreSQL docs)

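The practical effect: against a localhost server listening on a non-default port (say 5433, a hypothetical value), the built command now starts with pg_dump -p 5433 -U <user> ... instead of omitting -p and silently targeting the default 5432; pg_restore gets the same treatment.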
@@ -4,6 +4,7 @@ import (
	"bytes"
	"crypto/rand"
	"io"
	mathrand "math/rand"
	"testing"
)

@@ -100,12 +101,15 @@ func TestChunker_Deterministic(t *testing.T) {

func TestChunker_ShiftedData(t *testing.T) {
	// Test that shifted data still shares chunks (the key CDC benefit)
	// Use deterministic random data for reproducible test results
	rng := mathrand.New(mathrand.NewSource(42))

	original := make([]byte, 100*1024)
	rand.Read(original)
	rng.Read(original)

	// Create shifted version (prepend some bytes)
	prefix := make([]byte, 1000)
	rand.Read(prefix)
	rng.Read(prefix)
	shifted := append(prefix, original...)

	// Chunk both

223
internal/fs/fs.go
Normal file
@@ -0,0 +1,223 @@
// Package fs provides filesystem abstraction using spf13/afero for testability.
// It allows swapping the real filesystem with an in-memory mock for unit tests.
package fs

import (
	"io"
	"os"
	"path/filepath"
	"time"

	"github.com/spf13/afero"
)

// FS is the global filesystem interface used throughout the application.
// By default, it uses the real OS filesystem.
// For testing, use SetFS(afero.NewMemMapFs()) to use an in-memory filesystem.
var FS afero.Fs = afero.NewOsFs()

// SetFS sets the global filesystem (useful for testing)
func SetFS(fs afero.Fs) {
	FS = fs
}

// ResetFS resets to the real OS filesystem
func ResetFS() {
	FS = afero.NewOsFs()
}

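A minimal sketch of using SetFS/ResetFS directly in a test; the WithMemFs helper further down wraps this same swap-and-restore pattern:

	func TestWithFakeFS(t *testing.T) {
		SetFS(afero.NewMemMapFs())
		defer ResetFS()

		// WriteFile and Exists are the package-level wrappers defined below.
		if err := WriteFile("/backups/db.dump", []byte("data"), 0644); err != nil {
			t.Fatal(err)
		}
		if ok, _ := Exists("/backups/db.dump"); !ok {
			t.Fatal("expected file in the in-memory filesystem")
		}
	}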
// NewMemMapFs creates a new in-memory filesystem for testing
func NewMemMapFs() afero.Fs {
	return afero.NewMemMapFs()
}

// NewReadOnlyFs wraps a filesystem to make it read-only
func NewReadOnlyFs(base afero.Fs) afero.Fs {
	return afero.NewReadOnlyFs(base)
}

// NewBasePathFs creates a filesystem rooted at a specific path
func NewBasePathFs(base afero.Fs, path string) afero.Fs {
	return afero.NewBasePathFs(base, path)
}

// --- File Operations (use global FS) ---

// Create creates a file
func Create(name string) (afero.File, error) {
	return FS.Create(name)
}

// Open opens a file for reading
func Open(name string) (afero.File, error) {
	return FS.Open(name)
}

// OpenFile opens a file with specified flags and permissions
func OpenFile(name string, flag int, perm os.FileMode) (afero.File, error) {
	return FS.OpenFile(name, flag, perm)
}

// Remove removes a file or empty directory
func Remove(name string) error {
	return FS.Remove(name)
}

// RemoveAll removes a path and any children it contains
func RemoveAll(path string) error {
	return FS.RemoveAll(path)
}

// Rename renames (moves) a file
func Rename(oldname, newname string) error {
	return FS.Rename(oldname, newname)
}

// Stat returns file info
func Stat(name string) (os.FileInfo, error) {
	return FS.Stat(name)
}

// Chmod changes file mode
func Chmod(name string, mode os.FileMode) error {
	return FS.Chmod(name, mode)
}

// Chown changes file ownership (may not work on all filesystems)
func Chown(name string, uid, gid int) error {
	return FS.Chown(name, uid, gid)
}

// Chtimes changes file access and modification times
func Chtimes(name string, atime, mtime time.Time) error {
	return FS.Chtimes(name, atime, mtime)
}

// --- Directory Operations ---

// Mkdir creates a directory
func Mkdir(name string, perm os.FileMode) error {
	return FS.Mkdir(name, perm)
}

// MkdirAll creates a directory and all parents
func MkdirAll(path string, perm os.FileMode) error {
	return FS.MkdirAll(path, perm)
}

// ReadDir reads a directory
func ReadDir(dirname string) ([]os.FileInfo, error) {
	return afero.ReadDir(FS, dirname)
}

// --- File Content Operations ---

// ReadFile reads an entire file
func ReadFile(filename string) ([]byte, error) {
	return afero.ReadFile(FS, filename)
}

// WriteFile writes data to a file
func WriteFile(filename string, data []byte, perm os.FileMode) error {
	return afero.WriteFile(FS, filename, data, perm)
}

// --- Existence Checks ---

// Exists checks if a file or directory exists
func Exists(path string) (bool, error) {
	return afero.Exists(FS, path)
}

// DirExists checks if a directory exists
func DirExists(path string) (bool, error) {
	return afero.DirExists(FS, path)
}

// IsDir checks if path is a directory
func IsDir(path string) (bool, error) {
	return afero.IsDir(FS, path)
}

// IsEmpty checks if a directory is empty
func IsEmpty(path string) (bool, error) {
	return afero.IsEmpty(FS, path)
}

// --- Utility Functions ---

// Walk walks a directory tree
func Walk(root string, walkFn filepath.WalkFunc) error {
	return afero.Walk(FS, root, walkFn)
}

// Glob returns the names of all files matching pattern
func Glob(pattern string) ([]string, error) {
	return afero.Glob(FS, pattern)
}

// TempDir creates a temporary directory
func TempDir(dir, prefix string) (string, error) {
	return afero.TempDir(FS, dir, prefix)
}

// TempFile creates a temporary file
func TempFile(dir, pattern string) (afero.File, error) {
	return afero.TempFile(FS, dir, pattern)
}

// CopyFile copies a file from src to dst
func CopyFile(src, dst string) error {
	srcFile, err := FS.Open(src)
	if err != nil {
		return err
	}
	defer srcFile.Close()

	srcInfo, err := srcFile.Stat()
	if err != nil {
		return err
	}

	dstFile, err := FS.OpenFile(dst, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, srcInfo.Mode())
	if err != nil {
		return err
	}
	defer dstFile.Close()

	_, err = io.Copy(dstFile, srcFile)
	return err
}

// FileSize returns the size of a file
func FileSize(path string) (int64, error) {
	info, err := FS.Stat(path)
	if err != nil {
		return 0, err
	}
	return info.Size(), nil
}

// --- Testing Helpers ---

// WithMemFs executes a function with an in-memory filesystem, then restores the original
func WithMemFs(fn func(fs afero.Fs)) {
	original := FS
	memFs := afero.NewMemMapFs()
	FS = memFs
	defer func() { FS = original }()
	fn(memFs)
}

// SetupTestDir creates a test directory structure in-memory
func SetupTestDir(files map[string]string) afero.Fs {
	memFs := afero.NewMemMapFs()
	for path, content := range files {
		dir := filepath.Dir(path)
		if dir != "." && dir != "/" {
			_ = memFs.MkdirAll(dir, 0755)
		}
		_ = afero.WriteFile(memFs, path, []byte(content), 0644)
	}
	return memFs
}
191
internal/fs/fs_test.go
Normal file
@@ -0,0 +1,191 @@
package fs

import (
	"os"
	"testing"

	"github.com/spf13/afero"
)

func TestMemMapFs(t *testing.T) {
	// Use in-memory filesystem for testing
	WithMemFs(func(memFs afero.Fs) {
		// Create a file
		err := WriteFile("/test/file.txt", []byte("hello world"), 0644)
		if err != nil {
			t.Fatalf("WriteFile failed: %v", err)
		}

		// Read it back
		content, err := ReadFile("/test/file.txt")
		if err != nil {
			t.Fatalf("ReadFile failed: %v", err)
		}

		if string(content) != "hello world" {
			t.Errorf("expected 'hello world', got '%s'", string(content))
		}

		// Check existence
		exists, err := Exists("/test/file.txt")
		if err != nil {
			t.Fatalf("Exists failed: %v", err)
		}
		if !exists {
			t.Error("file should exist")
		}

		// Check non-existent file
		exists, err = Exists("/nonexistent.txt")
		if err != nil {
			t.Fatalf("Exists failed: %v", err)
		}
		if exists {
			t.Error("file should not exist")
		}
	})
}

func TestSetupTestDir(t *testing.T) {
	// Create test directory structure
	testFs := SetupTestDir(map[string]string{
		"/backups/db1.dump":     "database 1 content",
		"/backups/db2.dump":     "database 2 content",
		"/config/settings.json": `{"key": "value"}`,
	})

	// Verify files exist
	content, err := afero.ReadFile(testFs, "/backups/db1.dump")
	if err != nil {
		t.Fatalf("ReadFile failed: %v", err)
	}
	if string(content) != "database 1 content" {
		t.Errorf("unexpected content: %s", string(content))
	}

	// Verify directory structure
	files, err := afero.ReadDir(testFs, "/backups")
	if err != nil {
		t.Fatalf("ReadDir failed: %v", err)
	}
	if len(files) != 2 {
		t.Errorf("expected 2 files, got %d", len(files))
	}
}

func TestCopyFile(t *testing.T) {
	WithMemFs(func(memFs afero.Fs) {
		// Create source file
		err := WriteFile("/source.txt", []byte("copy me"), 0644)
		if err != nil {
			t.Fatalf("WriteFile failed: %v", err)
		}

		// Copy file
		err = CopyFile("/source.txt", "/dest.txt")
		if err != nil {
			t.Fatalf("CopyFile failed: %v", err)
		}

		// Verify copy
		content, err := ReadFile("/dest.txt")
		if err != nil {
			t.Fatalf("ReadFile failed: %v", err)
		}
		if string(content) != "copy me" {
			t.Errorf("unexpected content: %s", string(content))
		}
	})
}

func TestFileSize(t *testing.T) {
	WithMemFs(func(memFs afero.Fs) {
		data := []byte("12345678901234567890") // 20 bytes
		err := WriteFile("/sized.txt", data, 0644)
		if err != nil {
			t.Fatalf("WriteFile failed: %v", err)
		}

		size, err := FileSize("/sized.txt")
		if err != nil {
			t.Fatalf("FileSize failed: %v", err)
		}
		if size != 20 {
			t.Errorf("expected size 20, got %d", size)
		}
	})
}

func TestTempDir(t *testing.T) {
	WithMemFs(func(memFs afero.Fs) {
		// Create temp dir
		dir, err := TempDir("", "test-")
		if err != nil {
			t.Fatalf("TempDir failed: %v", err)
		}

		// Verify it exists
		isDir, err := IsDir(dir)
		if err != nil {
			t.Fatalf("IsDir failed: %v", err)
		}
		if !isDir {
			t.Error("temp dir should be a directory")
		}

		// Verify it's empty
		isEmpty, err := IsEmpty(dir)
		if err != nil {
			t.Fatalf("IsEmpty failed: %v", err)
		}
		if !isEmpty {
			t.Error("temp dir should be empty")
		}
	})
}

func TestWalk(t *testing.T) {
	WithMemFs(func(memFs afero.Fs) {
		// Create directory structure
		_ = MkdirAll("/root/a/b", 0755)
		_ = WriteFile("/root/file1.txt", []byte("1"), 0644)
		_ = WriteFile("/root/a/file2.txt", []byte("2"), 0644)
		_ = WriteFile("/root/a/b/file3.txt", []byte("3"), 0644)

		var files []string
		err := Walk("/root", func(path string, info os.FileInfo, err error) error {
			if err != nil {
				return err
			}
			if !info.IsDir() {
				files = append(files, path)
			}
			return nil
		})

		if err != nil {
			t.Fatalf("Walk failed: %v", err)
		}

		if len(files) != 3 {
			t.Errorf("expected 3 files, got %d: %v", len(files), files)
		}
	})
}

func TestGlob(t *testing.T) {
	WithMemFs(func(memFs afero.Fs) {
		_ = WriteFile("/data/backup1.dump", []byte("1"), 0644)
		_ = WriteFile("/data/backup2.dump", []byte("2"), 0644)
		_ = WriteFile("/data/config.json", []byte("{}"), 0644)

		matches, err := Glob("/data/*.dump")
		if err != nil {
			t.Fatalf("Glob failed: %v", err)
		}

		if len(matches) != 2 {
			t.Errorf("expected 2 matches, got %d: %v", len(matches), matches)
		}
	})
}
118
internal/logger/colors.go
Normal file
@@ -0,0 +1,118 @@
package logger

import (
	"fmt"
	"os"

	"github.com/fatih/color"
)

// CLI output helpers using fatih/color for cross-platform support

// Success prints a success message with green checkmark
func Success(format string, args ...interface{}) {
	msg := fmt.Sprintf(format, args...)
	SuccessColor.Fprint(os.Stdout, "✓ ")
	fmt.Println(msg)
}

// Error prints an error message with red X
func Error(format string, args ...interface{}) {
	msg := fmt.Sprintf(format, args...)
	ErrorColor.Fprint(os.Stderr, "✗ ")
	fmt.Fprintln(os.Stderr, msg)
}

// Warning prints a warning message with yellow exclamation
func Warning(format string, args ...interface{}) {
	msg := fmt.Sprintf(format, args...)
	WarnColor.Fprint(os.Stdout, "⚠ ")
	fmt.Println(msg)
}

// Info prints an info message with blue arrow
func Info(format string, args ...interface{}) {
	msg := fmt.Sprintf(format, args...)
	InfoColor.Fprint(os.Stdout, "→ ")
	fmt.Println(msg)
}

// Header prints a bold header
func Header(format string, args ...interface{}) {
	msg := fmt.Sprintf(format, args...)
	HighlightColor.Println(msg)
}

// Dim prints dimmed/secondary text
func Dim(format string, args ...interface{}) {
	msg := fmt.Sprintf(format, args...)
	DimColor.Println(msg)
}

// Bold returns bold text
func Bold(text string) string {
	return color.New(color.Bold).Sprint(text)
}

// Green returns green text
func Green(text string) string {
	return SuccessColor.Sprint(text)
}

// Red returns red text
func Red(text string) string {
	return ErrorColor.Sprint(text)
}

// Yellow returns yellow text
func Yellow(text string) string {
	return WarnColor.Sprint(text)
}

// Cyan returns cyan text
func Cyan(text string) string {
	return InfoColor.Sprint(text)
}

// StatusLine prints a key-value status line
func StatusLine(key, value string) {
	DimColor.Printf("  %s: ", key)
	fmt.Println(value)
}

// ProgressStatus prints operation status with timing
func ProgressStatus(operation string, status string, isSuccess bool) {
	if isSuccess {
		SuccessColor.Print("[OK] ")
	} else {
		ErrorColor.Print("[FAIL] ")
	}
	fmt.Printf("%s: %s\n", operation, status)
}

// TableRow prints a simple formatted table row
func TableRow(cols ...string) {
	for i, col := range cols {
		if i == 0 {
			InfoColor.Printf("%-20s", col)
		} else {
			fmt.Printf("%-15s", col)
		}
	}
	fmt.Println()
}

// DisableColors disables all color output (for non-TTY or --no-color flag)
func DisableColors() {
	color.NoColor = true
}

// EnableColors enables color output
func EnableColors() {
	color.NoColor = false
}

// IsColorEnabled returns whether colors are enabled
func IsColorEnabled() bool {
	return !color.NoColor
}
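A short sketch of these helpers in use (message text is illustrative):

	logger.Header("Cluster Backup")
	logger.Info("Backing up %d databases", 12)
	logger.StatusLine("Archive", "/backups/cluster.tar.gz")
	logger.Success("Backup complete in %s", "4m32s")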
@@ -7,9 +7,29 @@ import (
	"strings"
	"time"

	"github.com/fatih/color"
	"github.com/sirupsen/logrus"
)

// Color printers for consistent output across the application
var (
	// Status colors
	SuccessColor = color.New(color.FgGreen, color.Bold)
	ErrorColor   = color.New(color.FgRed, color.Bold)
	WarnColor    = color.New(color.FgYellow, color.Bold)
	InfoColor    = color.New(color.FgCyan)
	DebugColor   = color.New(color.FgWhite)

	// Highlight colors
	HighlightColor = color.New(color.FgMagenta, color.Bold)
	DimColor       = color.New(color.FgHiBlack)

	// Data colors
	NumberColor = color.New(color.FgYellow)
	PathColor   = color.New(color.FgBlue, color.Underline)
	TimeColor   = color.New(color.FgCyan)
)

// Logger defines the interface for logging
type Logger interface {
	Debug(msg string, args ...any)
@@ -226,34 +246,32 @@ type CleanFormatter struct{}
func (f *CleanFormatter) Format(entry *logrus.Entry) ([]byte, error) {
	timestamp := entry.Time.Format("2006-01-02T15:04:05")

	// Color codes for different log levels
	var levelColor, levelText string
	// Get level color and text using fatih/color
	var levelPrinter *color.Color
	var levelText string
	switch entry.Level {
	case logrus.DebugLevel:
		levelColor = "\033[36m" // Cyan
		levelPrinter = DebugColor
		levelText = "DEBUG"
	case logrus.InfoLevel:
		levelColor = "\033[32m" // Green
		levelPrinter = SuccessColor
		levelText = "INFO "
	case logrus.WarnLevel:
		levelColor = "\033[33m" // Yellow
		levelPrinter = WarnColor
		levelText = "WARN "
	case logrus.ErrorLevel:
		levelColor = "\033[31m" // Red
		levelPrinter = ErrorColor
		levelText = "ERROR"
	default:
		levelColor = "\033[0m" // Reset
		levelPrinter = InfoColor
		levelText = "INFO "
	}
	resetColor := "\033[0m"

	// Build the message with perfectly aligned columns
	var output strings.Builder

	// Column 1: Level (with color, fixed width 5 chars)
	output.WriteString(levelColor)
	output.WriteString(levelText)
	output.WriteString(resetColor)
	output.WriteString(levelPrinter.Sprint(levelText))
	output.WriteString(" ")

	// Column 2: Timestamp (fixed format)

@@ -7,9 +7,17 @@ import (
	"strings"
	"time"

	"github.com/fatih/color"
	"github.com/schollz/progressbar/v3"
)

// Color printers for progress indicators
var (
	okColor   = color.New(color.FgGreen, color.Bold)
	failColor = color.New(color.FgRed, color.Bold)
	warnColor = color.New(color.FgYellow, color.Bold)
)

// Indicator represents a progress indicator interface
type Indicator interface {
	Start(message string)
@@ -94,13 +102,15 @@ func (s *Spinner) Update(message string) {
// Complete stops the spinner with a success message
func (s *Spinner) Complete(message string) {
	s.Stop()
	fmt.Fprintf(s.writer, "\n[OK] %s\n", message)
	okColor.Fprint(s.writer, "[OK] ")
	fmt.Fprintln(s.writer, message)
}

// Fail stops the spinner with a failure message
func (s *Spinner) Fail(message string) {
	s.Stop()
	fmt.Fprintf(s.writer, "\n[FAIL] %s\n", message)
	failColor.Fprint(s.writer, "[FAIL] ")
	fmt.Fprintln(s.writer, message)
}

// Stop stops the spinner
@@ -169,13 +179,15 @@ func (d *Dots) Update(message string) {
// Complete stops the dots with a success message
func (d *Dots) Complete(message string) {
	d.Stop()
	fmt.Fprintf(d.writer, " [OK] %s\n", message)
	okColor.Fprint(d.writer, " [OK] ")
	fmt.Fprintln(d.writer, message)
}

// Fail stops the dots with a failure message
func (d *Dots) Fail(message string) {
	d.Stop()
	fmt.Fprintf(d.writer, " [FAIL] %s\n", message)
	failColor.Fprint(d.writer, " [FAIL] ")
	fmt.Fprintln(d.writer, message)
}

// Stop stops the dots indicator
@@ -241,14 +253,16 @@ func (p *ProgressBar) Complete(message string) {
	p.current = p.total
	p.message = message
	p.render()
	fmt.Fprintf(p.writer, " [OK] %s\n", message)
	okColor.Fprint(p.writer, " [OK] ")
	fmt.Fprintln(p.writer, message)
	p.Stop()
}

// Fail stops the progress bar with failure
func (p *ProgressBar) Fail(message string) {
	p.render()
	fmt.Fprintf(p.writer, " [FAIL] %s\n", message)
	failColor.Fprint(p.writer, " [FAIL] ")
	fmt.Fprintln(p.writer, message)
	p.Stop()
}

@@ -300,12 +314,14 @@ func (s *Static) Update(message string) {

// Complete shows completion message
func (s *Static) Complete(message string) {
	fmt.Fprintf(s.writer, " [OK] %s\n", message)
	okColor.Fprint(s.writer, " [OK] ")
	fmt.Fprintln(s.writer, message)
}

// Fail shows failure message
func (s *Static) Fail(message string) {
	fmt.Fprintf(s.writer, " [FAIL] %s\n", message)
	failColor.Fprint(s.writer, " [FAIL] ")
	fmt.Fprintln(s.writer, message)
}

// Stop does nothing for static indicator
@@ -382,12 +398,14 @@ func (l *LineByLine) SetEstimator(estimator *ETAEstimator) {

// Complete shows completion message
func (l *LineByLine) Complete(message string) {
	fmt.Fprintf(l.writer, "[OK] %s\n\n", message)
	okColor.Fprint(l.writer, "[OK] ")
	fmt.Fprintf(l.writer, "%s\n\n", message)
}

// Fail shows failure message
func (l *LineByLine) Fail(message string) {
	fmt.Fprintf(l.writer, "[FAIL] %s\n\n", message)
	failColor.Fprint(l.writer, "[FAIL] ")
	fmt.Fprintf(l.writer, "%s\n\n", message)
}

// Stop does nothing for line-by-line (no cleanup needed)
@@ -410,13 +428,15 @@ func (l *Light) Update(message string) {

func (l *Light) Complete(message string) {
	if !l.silent {
		fmt.Fprintf(l.writer, "[OK] %s\n", message)
		okColor.Fprint(l.writer, "[OK] ")
		fmt.Fprintln(l.writer, message)
	}
}

func (l *Light) Fail(message string) {
	if !l.silent {
		fmt.Fprintf(l.writer, "[FAIL] %s\n", message)
		failColor.Fprint(l.writer, "[FAIL] ")
		fmt.Fprintln(l.writer, message)
	}
}

@@ -594,13 +614,15 @@ func (s *SchollzBar) ChangeMax64(max int64) {
// Complete finishes with success (Indicator interface)
func (s *SchollzBar) Complete(message string) {
	_ = s.bar.Finish()
	fmt.Printf("\n[green][OK][reset] %s\n", message)
	okColor.Print("[OK] ")
	fmt.Println(message)
}

// Fail finishes with failure (Indicator interface)
func (s *SchollzBar) Fail(message string) {
	_ = s.bar.Clear()
	fmt.Printf("\n[red][FAIL][reset] %s\n", message)
	failColor.Print("[FAIL] ")
	fmt.Println(message)
}

// Stop stops the progress bar (Indicator interface)

|
||||
}
|
||||
|
||||
// Store last line for termination check
|
||||
if lineNumber > 0 && (lineNumber%100000 == 0) && d.verbose {
|
||||
if lineNumber > 0 && (lineNumber%100000 == 0) && d.verbose && d.log != nil {
|
||||
d.log.Debug("Scanning SQL file", "lines_processed", lineNumber)
|
||||
}
|
||||
}
|
||||
@@ -430,9 +430,11 @@ func (d *Diagnoser) diagnoseClusterArchive(filePath string, result *DiagnoseResu
|
||||
}
|
||||
}
|
||||
|
||||
if d.log != nil {
|
||||
d.log.Info("Verifying cluster archive integrity",
|
||||
"size", fmt.Sprintf("%.1f GB", float64(result.FileSize)/(1024*1024*1024)),
|
||||
"timeout", fmt.Sprintf("%d min", timeoutMinutes))
|
||||
}
|
||||
|
||||
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
|
||||
defer cancel()
|
||||
@@ -561,7 +563,7 @@ func (d *Diagnoser) diagnoseClusterArchive(filePath string, result *DiagnoseResu
|
||||
}
|
||||
|
||||
// For verbose mode, diagnose individual dumps inside the archive
|
||||
if d.verbose && len(dumpFiles) > 0 {
|
||||
if d.verbose && len(dumpFiles) > 0 && d.log != nil {
|
||||
d.log.Info("Cluster archive contains databases", "count", len(dumpFiles))
|
||||
for _, df := range dumpFiles {
|
||||
d.log.Info(" - " + df)
|
||||
@@ -684,9 +686,11 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
|
||||
}
|
||||
}
|
||||
|
||||
if d.log != nil {
|
||||
d.log.Info("Listing cluster archive contents",
|
||||
"size", fmt.Sprintf("%.1f GB", float64(archiveInfo.Size())/(1024*1024*1024)),
|
||||
"timeout", fmt.Sprintf("%d min", timeoutMinutes))
|
||||
}
|
||||
|
||||
listCtx, listCancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
|
||||
defer listCancel()
|
||||
@@ -766,7 +770,9 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
|
||||
return []*DiagnoseResult{errResult}, nil
|
||||
}
|
||||
|
||||
if d.log != nil {
|
||||
d.log.Debug("Archive listing streamed successfully", "total_files", fileCount, "relevant_files", len(files))
|
||||
}
|
||||
|
||||
// Check if we have enough disk space (estimate 4x archive size needed)
|
||||
// archiveInfo already obtained at function start
|
||||
@@ -781,7 +787,9 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
|
||||
testCancel()
|
||||
}
|
||||
|
||||
if d.log != nil {
|
||||
d.log.Info("Archive listing successful", "files", len(files))
|
||||
}
|
||||
|
||||
// Try full extraction - NO TIMEOUT here as large archives can take a long time
|
||||
// Use a generous timeout (30 minutes) for very large archives
|
||||
@@ -870,11 +878,15 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
|
||||
}
|
||||
|
||||
dumpPath := filepath.Join(dumpsDir, name)
|
||||
if d.log != nil {
|
||||
d.log.Info("Diagnosing dump file", "file", name)
|
||||
}
|
||||
|
||||
result, err := d.DiagnoseFile(dumpPath)
|
||||
if err != nil {
|
||||
if d.log != nil {
|
||||
d.log.Warn("Failed to diagnose file", "file", name, "error", err)
|
||||
}
|
||||
continue
|
||||
}
|
||||
results = append(results, result)
|
||||
|
||||
@@ -1,9 +1,12 @@
package restore

import (
	"archive/tar"
	"compress/gzip"
	"context"
	"database/sql"
	"fmt"
	"io"
	"os"
	"os/exec"
	"path/filepath"
@@ -24,6 +27,21 @@ import (
	_ "github.com/jackc/pgx/v5/stdlib" // PostgreSQL driver
)

// ProgressCallback is called with progress updates during long operations.
// Parameters: current bytes/items done, total bytes/items, description
type ProgressCallback func(current, total int64, description string)

// DatabaseProgressCallback is called with database count progress during cluster restore
type DatabaseProgressCallback func(done, total int, dbName string)

// DatabaseProgressWithTimingCallback is called with database progress including timing info.
// Parameters: done count, total count, database name, elapsed time for current restore phase, avg duration per DB
type DatabaseProgressWithTimingCallback func(done, total int, dbName string, phaseElapsed, avgPerDB time.Duration)

// DatabaseProgressByBytesCallback is called with progress weighted by database sizes (bytes).
// Parameters: bytes completed, total bytes, current database name, databases done count, total database count
type DatabaseProgressByBytesCallback func(bytesDone, bytesTotal int64, dbName string, dbDone, dbTotal int)

// Engine handles database restore operations
type Engine struct {
	cfg *config.Config
@@ -32,7 +50,14 @@
	progress         progress.Indicator
	detailedReporter *progress.DetailedReporter
	dryRun           bool
	silentMode       bool   // Suppress stdout output (for TUI mode)
	debugLogPath     string // Path to save debug log on error

	// TUI progress callbacks for detailed progress reporting
	progressCallback          ProgressCallback
	dbProgressCallback        DatabaseProgressCallback
	dbProgressTimingCallback  DatabaseProgressWithTimingCallback
	dbProgressByBytesCallback DatabaseProgressByBytesCallback
}

// New creates a new restore engine
@@ -62,6 +87,7 @@ func NewSilent(cfg *config.Config, log logger.Logger, db database.Database) *Eng
		progress:         progressIndicator,
		detailedReporter: detailedReporter,
		dryRun:           false,
		silentMode:       true, // Suppress stdout for TUI
	}
}

@@ -88,6 +114,54 @@ func (e *Engine) SetDebugLogPath(path string) {
	e.debugLogPath = path
}

// SetProgressCallback sets a callback for detailed progress reporting (for TUI mode)
func (e *Engine) SetProgressCallback(cb ProgressCallback) {
	e.progressCallback = cb
}

// SetDatabaseProgressCallback sets a callback for database count progress during cluster restore
func (e *Engine) SetDatabaseProgressCallback(cb DatabaseProgressCallback) {
	e.dbProgressCallback = cb
}

// SetDatabaseProgressWithTimingCallback sets a callback for database progress with timing info
func (e *Engine) SetDatabaseProgressWithTimingCallback(cb DatabaseProgressWithTimingCallback) {
	e.dbProgressTimingCallback = cb
}

// SetDatabaseProgressByBytesCallback sets a callback for progress weighted by database sizes
func (e *Engine) SetDatabaseProgressByBytesCallback(cb DatabaseProgressByBytesCallback) {
	e.dbProgressByBytesCallback = cb
}

// reportProgress safely calls the progress callback if set
func (e *Engine) reportProgress(current, total int64, description string) {
	if e.progressCallback != nil {
		e.progressCallback(current, total, description)
	}
}

// reportDatabaseProgress safely calls the database progress callback if set
func (e *Engine) reportDatabaseProgress(done, total int, dbName string) {
	if e.dbProgressCallback != nil {
		e.dbProgressCallback(done, total, dbName)
	}
}

// reportDatabaseProgressWithTiming safely calls the timing-aware callback if set
func (e *Engine) reportDatabaseProgressWithTiming(done, total int, dbName string, phaseElapsed, avgPerDB time.Duration) {
	if e.dbProgressTimingCallback != nil {
		e.dbProgressTimingCallback(done, total, dbName, phaseElapsed, avgPerDB)
	}
}

// reportDatabaseProgressByBytes safely calls the bytes-weighted callback if set
func (e *Engine) reportDatabaseProgressByBytes(bytesDone, bytesTotal int64, dbName string, dbDone, dbTotal int) {
	if e.dbProgressByBytesCallback != nil {
		e.dbProgressByBytesCallback(bytesDone, bytesTotal, dbName, dbDone, dbTotal)
	}
}

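A sketch of how a TUI could wire the bytes-weighted callback; the message type here is hypothetical and program is assumed to be a *tea.Program:

	engine := restore.NewSilent(cfg, log, db)
	engine.SetDatabaseProgressByBytesCallback(func(bytesDone, bytesTotal int64, dbName string, dbDone, dbTotal int) {
		if bytesTotal == 0 {
			return // avoid division by zero before sizes are known
		}
		program.Send(clusterProgressMsg{ // hypothetical tea.Msg carrying progress
			Percent: float64(bytesDone) / float64(bytesTotal),
			DB:      dbName,
			Done:    dbDone,
			Total:   dbTotal,
		})
	})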
// loggerAdapter adapts our logger to the progress.Logger interface
type loggerAdapter struct {
	logger logger.Logger
@@ -387,16 +461,18 @@ func (e *Engine) restorePostgreSQLSQL(ctx context.Context, archivePath, targetDB
	var cmd []string

	// For localhost, omit -h to use Unix socket (avoids Ident auth issues)
	// But always include -p for port (in case of non-standard port)
	hostArg := ""
	portArg := fmt.Sprintf("-p %d", e.cfg.Port)
	if e.cfg.Host != "localhost" && e.cfg.Host != "" {
		hostArg = fmt.Sprintf("-h %s -p %d", e.cfg.Host, e.cfg.Port)
		hostArg = fmt.Sprintf("-h %s", e.cfg.Host)
	}

	if compressed {
		// Use ON_ERROR_STOP=1 to fail fast on first error (prevents millions of errors on truncated dumps)
		psqlCmd := fmt.Sprintf("psql -U %s -d %s -v ON_ERROR_STOP=1", e.cfg.User, targetDB)
		psqlCmd := fmt.Sprintf("psql %s -U %s -d %s -v ON_ERROR_STOP=1", portArg, e.cfg.User, targetDB)
		if hostArg != "" {
			psqlCmd = fmt.Sprintf("psql %s -U %s -d %s -v ON_ERROR_STOP=1", hostArg, e.cfg.User, targetDB)
			psqlCmd = fmt.Sprintf("psql %s %s -U %s -d %s -v ON_ERROR_STOP=1", hostArg, portArg, e.cfg.User, targetDB)
		}
		// Set PGPASSWORD in the bash command for password-less auth
		cmd = []string{
@@ -417,6 +493,7 @@ func (e *Engine) restorePostgreSQLSQL(ctx context.Context, archivePath, targetDB
	} else {
		cmd = []string{
			"psql",
			"-p", fmt.Sprintf("%d", e.cfg.Port),
			"-U", e.cfg.User,
			"-d", targetDB,
			"-v", "ON_ERROR_STOP=1",
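With hypothetical values (user postgres, database mydb, port 5433, local socket), the compressed branch now builds an inner command of the form psql -p 5433 -U postgres -d mydb -v ON_ERROR_STOP=1, where before this fix the port was dropped entirely for localhost connections.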
@@ -803,6 +880,25 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
	// Create temporary extraction directory in configured WorkDir
	workDir := e.cfg.GetEffectiveWorkDir()
	tempDir := filepath.Join(workDir, fmt.Sprintf(".restore_%d", time.Now().Unix()))

	// Check disk space for extraction (need ~3x archive size: compressed + extracted + working space)
	if archiveInfo != nil {
		requiredBytes := uint64(archiveInfo.Size()) * 3
		extractionCheck := checks.CheckDiskSpace(workDir)
		if extractionCheck.AvailableBytes < requiredBytes {
			operation.Fail("Insufficient disk space for extraction")
			return fmt.Errorf("insufficient disk space for extraction in %s: need %.1f GB, have %.1f GB (archive size: %.1f GB × 3)",
				workDir,
				float64(requiredBytes)/(1024*1024*1024),
				float64(extractionCheck.AvailableBytes)/(1024*1024*1024),
				float64(archiveInfo.Size())/(1024*1024*1024))
		}
		e.log.Info("Disk space check for extraction passed",
			"workdir", workDir,
			"required_gb", float64(requiredBytes)/(1024*1024*1024),
			"available_gb", float64(extractionCheck.AvailableBytes)/(1024*1024*1024))
	}

	if err := os.MkdirAll(tempDir, 0755); err != nil {
		operation.Fail("Failed to create temporary directory")
		return fmt.Errorf("failed to create temp directory in %s: %w", workDir, err)
@@ -816,6 +912,16 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
		return fmt.Errorf("failed to extract archive: %w", err)
	}

	// Check context validity after extraction (debugging context cancellation issues)
	if ctx.Err() != nil {
		e.log.Error("Context cancelled after extraction - this should not happen",
			"context_error", ctx.Err(),
			"extraction_completed", true)
		operation.Fail("Context cancelled unexpectedly")
		return fmt.Errorf("context cancelled after extraction completed: %w", ctx.Err())
	}
	e.log.Info("Extraction completed, context still valid")

	// Check if user has superuser privileges (required for ownership restoration)
	e.progress.Update("Checking privileges...")
	isSuperuser, err := e.checkSuperuser(ctx)
@@ -966,12 +1072,27 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
	var restoreErrorsMu sync.Mutex
	totalDBs := 0

	// Count total databases
	// Count total databases and calculate total bytes for weighted progress
	var totalBytes int64
	dbSizes := make(map[string]int64) // Map database name to dump file size
	for _, entry := range entries {
		if !entry.IsDir() {
			totalDBs++
			dumpFile := filepath.Join(dumpsDir, entry.Name())
			if info, err := os.Stat(dumpFile); err == nil {
				dbName := entry.Name()
				dbName = strings.TrimSuffix(dbName, ".dump")
				dbName = strings.TrimSuffix(dbName, ".sql.gz")
				dbSizes[dbName] = info.Size()
				totalBytes += info.Size()
			}
		}
	}
	e.log.Info("Calculated total restore size", "databases", totalDBs, "total_bytes", totalBytes)

	// Track bytes completed for weighted progress
	var bytesCompleted int64
	var bytesCompletedMu sync.Mutex

	// Create ETA estimator for database restores
	estimator := progress.NewETAEstimator("Restoring cluster", totalDBs)
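Worked example of why the bytes weighting matters: with three dumps of 8 GB, 1 GB, and 1 GB, finishing only the 8 GB database puts bytesCompleted/totalBytes at 8/10 = 80%, whereas a count-based bar would report 1/3 ≈ 33% even though most of the work is already done.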
@@ -999,6 +1120,23 @@
	var successCount, failCount int32
	var mu sync.Mutex // Protect shared resources (progress, logger)

	// CRITICAL: Check context before starting database restore loop.
	// This helps debug issues where context gets cancelled between extraction and restore.
	if ctx.Err() != nil {
		e.log.Error("Context cancelled before database restore loop started",
			"context_error", ctx.Err(),
			"total_databases", totalDBs,
			"parallelism", parallelism)
		operation.Fail("Context cancelled before database restores could start")
		return fmt.Errorf("context cancelled before database restore: %w", ctx.Err())
	}
	e.log.Info("Starting database restore loop", "databases", totalDBs, "parallelism", parallelism)

	// Timing tracking for restore phase progress
	restorePhaseStart := time.Now()
	var completedDBTimes []time.Duration // Track duration for each completed DB restore
	var completedDBTimesMu sync.Mutex

	// Create semaphore to limit concurrency
	semaphore := make(chan struct{}, parallelism)
	var wg sync.WaitGroup
@@ -1024,6 +1162,19 @@
				}
			}()

			// Check for context cancellation before starting
			if ctx.Err() != nil {
				e.log.Warn("Context cancelled - skipping database restore", "file", filename)
				atomic.AddInt32(&failCount, 1)
				restoreErrorsMu.Lock()
				restoreErrors = multierror.Append(restoreErrors, fmt.Errorf("%s: restore skipped (context cancelled)", strings.TrimSuffix(strings.TrimSuffix(filename, ".dump"), ".sql.gz")))
				restoreErrorsMu.Unlock()
				return
			}

			// Track timing for this database restore
			dbRestoreStart := time.Now()

			// Update estimator progress (thread-safe)
			mu.Lock()
			estimator.UpdateProgress(idx)
@@ -1036,10 +1187,26 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
||||
|
||||
dbProgress := 15 + int(float64(idx)/float64(totalDBs)*85.0)

        // Calculate average time per DB and report progress with timing
        completedDBTimesMu.Lock()
        var avgPerDB time.Duration
        if len(completedDBTimes) > 0 {
            var totalDuration time.Duration
            for _, d := range completedDBTimes {
                totalDuration += d
            }
            avgPerDB = totalDuration / time.Duration(len(completedDBTimes))
        }
        phaseElapsed := time.Since(restorePhaseStart)
        completedDBTimesMu.Unlock()

        mu.Lock()
        statusMsg := fmt.Sprintf("Restoring database %s (%d/%d)", dbName, idx+1, totalDBs)
        e.progress.Update(statusMsg)
        e.log.Info("Restoring database", "name", dbName, "file", dumpFile, "progress", dbProgress)
        // Report database progress for TUI (both callbacks)
        e.reportDatabaseProgress(idx, totalDBs, dbName)
        e.reportDatabaseProgressWithTiming(idx, totalDBs, dbName, phaseElapsed, avgPerDB)
        mu.Unlock()

        // STEP 1: Drop existing database completely (clean slate)
@@ -1104,7 +1271,27 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
            return
        }

        // Track completed database restore duration for ETA calculation
        dbRestoreDuration := time.Since(dbRestoreStart)
        completedDBTimesMu.Lock()
        completedDBTimes = append(completedDBTimes, dbRestoreDuration)
        completedDBTimesMu.Unlock()

        // Update bytes completed for weighted progress
        dbSize := dbSizes[dbName]
        bytesCompletedMu.Lock()
        bytesCompleted += dbSize
        currentBytesCompleted := bytesCompleted
        currentSuccessCount := int(atomic.LoadInt32(&successCount)) + 1 // +1 because we're about to increment
        bytesCompletedMu.Unlock()

        // Report weighted progress (bytes-based)
        e.reportDatabaseProgressByBytes(currentBytesCompleted, totalBytes, dbName, currentSuccessCount, totalDBs)
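        // Illustrative sizes: if 3 of 10 databases are done but they hold 7 GB of a
        // 10 GB cluster, bytes-weighted progress reports 70% where a plain count
        // would report 30%, so one large database no longer stalls the bar.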

        atomic.AddInt32(&successCount, 1)

        // Small delay to ensure PostgreSQL fully closes connections before next restore
        time.Sleep(100 * time.Millisecond)
    }(dbIndex, entry.Name())

    dbIndex++
@@ -1116,6 +1303,35 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
    successCountFinal := int(atomic.LoadInt32(&successCount))
    failCountFinal := int(atomic.LoadInt32(&failCount))

    // SANITY CHECK: Verify all databases were accounted for
    // This catches any goroutine that exited without updating counters
    accountedFor := successCountFinal + failCountFinal
    if accountedFor != totalDBs {
        missingCount := totalDBs - accountedFor
        e.log.Error("INTERNAL ERROR: Some database restore goroutines did not report status",
            "expected", totalDBs,
            "success", successCountFinal,
            "failed", failCountFinal,
            "unaccounted", missingCount)

        // Treat unaccounted databases as failures
        failCountFinal += missingCount
        restoreErrorsMu.Lock()
        restoreErrors = multierror.Append(restoreErrors, fmt.Errorf("%d database(s) did not complete (possible goroutine crash or deadlock)", missingCount))
        restoreErrorsMu.Unlock()
    }

    // CRITICAL: Check if no databases were restored at all
    if successCountFinal == 0 {
        e.progress.Fail(fmt.Sprintf("Cluster restore FAILED: 0 of %d databases restored", totalDBs))
        operation.Fail("No databases were restored")

        if failCountFinal > 0 && restoreErrors != nil {
            return fmt.Errorf("cluster restore failed: all %d database(s) failed:\n%s", failCountFinal, restoreErrors.Error())
        }
        return fmt.Errorf("cluster restore failed: no databases were restored (0 of %d total). Check PostgreSQL logs for details", totalDBs)
    }

    if failCountFinal > 0 {
        // Format multi-error with detailed output
        restoreErrors.ErrorFormat = func(errs []error) string {
@@ -1146,8 +1362,144 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
    return nil
}

// extractArchive extracts a tar.gz archive with progress reporting
func (e *Engine) extractArchive(ctx context.Context, archivePath, destDir string) error {
    // If progress callback is set, use Go's archive/tar for progress tracking
    if e.progressCallback != nil {
        return e.extractArchiveWithProgress(ctx, archivePath, destDir)
    }

    // Otherwise use fast shell tar (no progress)
    return e.extractArchiveShell(ctx, archivePath, destDir)
}

// extractArchiveWithProgress extracts using Go's archive/tar with detailed progress reporting
func (e *Engine) extractArchiveWithProgress(ctx context.Context, archivePath, destDir string) error {
    // Get archive size for progress calculation
    archiveInfo, err := os.Stat(archivePath)
    if err != nil {
        return fmt.Errorf("failed to stat archive: %w", err)
    }
    totalSize := archiveInfo.Size()

    // Open the archive file
    file, err := os.Open(archivePath)
    if err != nil {
        return fmt.Errorf("failed to open archive: %w", err)
    }
    defer file.Close()

    // Wrap with progress reader
    progressReader := &progressReader{
        reader:    file,
        totalSize: totalSize,
        callback:  e.progressCallback,
        desc:      "Extracting archive",
    }

    // Create gzip reader
    gzReader, err := gzip.NewReader(progressReader)
    if err != nil {
        return fmt.Errorf("failed to create gzip reader: %w", err)
    }
    defer gzReader.Close()

    // Create tar reader
    tarReader := tar.NewReader(gzReader)

    // Extract files
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        default:
        }

        header, err := tarReader.Next()
        if err == io.EOF {
            break // End of archive
        }
        if err != nil {
            return fmt.Errorf("failed to read tar header: %w", err)
        }

        // Sanitize and validate path
        targetPath := filepath.Join(destDir, header.Name)

        // Security check: ensure path is within destDir (prevent path traversal)
        if !strings.HasPrefix(filepath.Clean(targetPath), filepath.Clean(destDir)) {
            e.log.Warn("Skipping potentially malicious path in archive", "path", header.Name)
            continue
        }
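        // e.g. a crafted entry named "../../etc/passwd" resolves outside destDir
        // after Join+Clean, fails the prefix test above, and is skipped rather
        // than written to disk.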

        switch header.Typeflag {
        case tar.TypeDir:
            if err := os.MkdirAll(targetPath, 0755); err != nil {
                return fmt.Errorf("failed to create directory %s: %w", targetPath, err)
            }
        case tar.TypeReg:
            // Ensure parent directory exists
            if err := os.MkdirAll(filepath.Dir(targetPath), 0755); err != nil {
                return fmt.Errorf("failed to create parent directory: %w", err)
            }

            // Create the file
            outFile, err := os.OpenFile(targetPath, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.FileMode(header.Mode))
            if err != nil {
                return fmt.Errorf("failed to create file %s: %w", targetPath, err)
            }

            // Copy file contents
            if _, err := io.Copy(outFile, tarReader); err != nil {
                outFile.Close()
                return fmt.Errorf("failed to write file %s: %w", targetPath, err)
            }
            outFile.Close()
        case tar.TypeSymlink:
            // Handle symlinks (common in some archives)
            if err := os.Symlink(header.Linkname, targetPath); err != nil {
                // Ignore symlink errors (may already exist or not supported)
                e.log.Debug("Could not create symlink", "path", targetPath, "target", header.Linkname)
            }
        }
    }

    // Final progress update
    e.reportProgress(totalSize, totalSize, "Extraction complete")
    return nil
}

// progressReader wraps an io.Reader to report read progress
type progressReader struct {
    reader      io.Reader
    totalSize   int64
    bytesRead   int64
    callback    ProgressCallback
    desc        string
    lastReport  time.Time
    reportEvery time.Duration
}

func (pr *progressReader) Read(p []byte) (n int, err error) {
    n, err = pr.reader.Read(p)
    pr.bytesRead += int64(n)

    // Throttle progress reporting to every 100ms
    if pr.reportEvery == 0 {
        pr.reportEvery = 100 * time.Millisecond
    }
    if time.Since(pr.lastReport) > pr.reportEvery {
        if pr.callback != nil {
            pr.callback(pr.bytesRead, pr.totalSize, pr.desc)
        }
        pr.lastReport = time.Now()
    }

    return n, err
}
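// Usage sketch (illustrative, matching how extractArchiveWithProgress wires it up):
//
//	pr := &progressReader{reader: f, totalSize: archiveSize, callback: cb, desc: "Extracting archive"}
//	gz, err := gzip.NewReader(pr) // every decompressed read pulls compressed bytes through pr
//
// Note that progress is measured in *compressed* bytes read from the file, which
// is why totalSize is the on-disk archive size rather than the uncompressed payload.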

// extractArchiveShell extracts using shell tar command (faster but no progress)
func (e *Engine) extractArchiveShell(ctx context.Context, archivePath, destDir string) error {
    cmd := exec.CommandContext(ctx, "tar", "-xzf", archivePath, "-C", destDir)

    // Stream stderr to avoid memory issues - tar can produce lots of output for large archives
@@ -1199,6 +1551,8 @@ func (e *Engine) extractArchive(ctx context.Context, archivePath, destDir string
}

// restoreGlobals restores global objects (roles, tablespaces)
// Note: psql returns 0 even when some statements fail (e.g., role already exists)
// We track errors but only fail on FATAL errors that would prevent restore
func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
    args := []string{
        "-p", fmt.Sprintf("%d", e.cfg.Port),
@@ -1228,6 +1582,8 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {

    // Read stderr in chunks in goroutine
    var lastError string
    var errorCount int
    var fatalError bool
    stderrDone := make(chan struct{})
    go func() {
        defer close(stderrDone)
@@ -1236,9 +1592,23 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
            n, err := stderr.Read(buf)
            if n > 0 {
                chunk := string(buf[:n])
                if strings.Contains(chunk, "ERROR") || strings.Contains(chunk, "FATAL") {
                    // Track different error types
                    if strings.Contains(chunk, "FATAL") {
                        fatalError = true
                        lastError = chunk
                        e.log.Error("Globals restore FATAL error", "output", chunk)
                    } else if strings.Contains(chunk, "ERROR") {
                        errorCount++
                        lastError = chunk
                        // Only log first few errors to avoid spam
                        if errorCount <= 5 {
                            // Check if it's an ignorable "already exists" error
                            if strings.Contains(chunk, "already exists") {
                                e.log.Debug("Globals restore: object already exists (expected)", "output", chunk)
                            } else {
                                e.log.Warn("Globals restore error", "output", chunk)
                            }
                        }
                    }
                }
            }
            if err != nil {
@@ -1266,10 +1636,23 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {

    <-stderrDone

    // Only fail on actual command errors or FATAL PostgreSQL errors
    // Regular ERROR messages (like "role already exists") are expected
    if cmdErr != nil {
        return fmt.Errorf("failed to restore globals: %w (last error: %s)", cmdErr, lastError)
    }

    // If we had FATAL errors, those are real problems
    if fatalError {
        return fmt.Errorf("globals restore had FATAL error: %s", lastError)
    }

    // Log summary if there were errors (but don't fail)
    if errorCount > 0 {
        e.log.Info("Globals restore completed with some errors (usually 'already exists' - expected)",
            "error_count", errorCount)
    }

    return nil
}

@@ -1337,6 +1720,7 @@ func (e *Engine) terminateConnections(ctx context.Context, dbName string) error
}

// dropDatabaseIfExists drops a database completely (clean slate)
// Uses PostgreSQL 13+ WITH (FORCE) option to forcefully drop even with active connections
func (e *Engine) dropDatabaseIfExists(ctx context.Context, dbName string) error {
    // First terminate all connections
    if err := e.terminateConnections(ctx, dbName); err != nil {
@@ -1346,28 +1730,69 @@ func (e *Engine) dropDatabaseIfExists(ctx context.Context, dbName string) error
    // Wait a moment for connections to terminate
    time.Sleep(500 * time.Millisecond)

    // Try to revoke new connections (prevents race condition)
    // This only works if we have the privilege to do so
    revokeArgs := []string{
        "-p", fmt.Sprintf("%d", e.cfg.Port),
        "-U", e.cfg.User,
        "-d", "postgres",
        "-c", fmt.Sprintf("REVOKE CONNECT ON DATABASE \"%s\" FROM PUBLIC", dbName),
    }
    if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
        revokeArgs = append([]string{"-h", e.cfg.Host}, revokeArgs...)
    }
    revokeCmd := exec.CommandContext(ctx, "psql", revokeArgs...)
    revokeCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
    revokeCmd.Run() // Ignore errors - database might not exist

    // Terminate connections again after revoking connect privilege
    e.terminateConnections(ctx, dbName)
    time.Sleep(200 * time.Millisecond)

    // Try DROP DATABASE WITH (FORCE) first (PostgreSQL 13+)
    // This forcefully terminates connections and drops the database atomically
    forceArgs := []string{
        "-p", fmt.Sprintf("%d", e.cfg.Port),
        "-U", e.cfg.User,
        "-d", "postgres",
        "-c", fmt.Sprintf("DROP DATABASE IF EXISTS \"%s\" WITH (FORCE)", dbName),
    }
    if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
        forceArgs = append([]string{"-h", e.cfg.Host}, forceArgs...)
    }
    forceCmd := exec.CommandContext(ctx, "psql", forceArgs...)
    forceCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))

    output, err := forceCmd.CombinedOutput()
    if err == nil {
        e.log.Info("Dropped existing database (with FORCE)", "name", dbName)
        return nil
    }

    // If FORCE option failed (PostgreSQL < 13), try regular drop
    if strings.Contains(string(output), "syntax error") || strings.Contains(string(output), "WITH (FORCE)") {
        e.log.Debug("WITH (FORCE) not supported, using standard DROP", "name", dbName)

        args := []string{
            "-p", fmt.Sprintf("%d", e.cfg.Port),
            "-U", e.cfg.User,
            "-d", "postgres",
            "-c", fmt.Sprintf("DROP DATABASE IF EXISTS \"%s\"", dbName),
        }

        // Only add -h flag if host is not localhost (to use Unix socket for peer auth)
        if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
            args = append([]string{"-h", e.cfg.Host}, args...)
        }

        cmd := exec.CommandContext(ctx, "psql", args...)

        // Always set PGPASSWORD (empty string is fine for peer/ident auth)
        cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))

        output, err = cmd.CombinedOutput()
        if err != nil {
            return fmt.Errorf("failed to drop database '%s': %w\nOutput: %s", dbName, err, string(output))
        }
    } else if err != nil {
        return fmt.Errorf("failed to drop database '%s': %w\nOutput: %s", dbName, err, string(output))
    }

    e.log.Info("Dropped existing database", "name", dbName)
    return nil
@@ -1408,12 +1833,14 @@ func (e *Engine) ensureMySQLDatabaseExists(ctx context.Context, dbName string) e
}

// ensurePostgresDatabaseExists checks if a PostgreSQL database exists and creates it if not
// It attempts to extract encoding/locale from the dump file to preserve original settings
func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string) error {
    // Skip creation for postgres and template databases - they should already exist
    if dbName == "postgres" || dbName == "template0" || dbName == "template1" {
        e.log.Info("Skipping create for system database (assume exists)", "name", dbName)
        return nil
    }

    // Build psql command with authentication
    buildPsqlCmd := func(ctx context.Context, database, query string) *exec.Cmd {
        args := []string{
@@ -1453,14 +1880,31 @@ func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string

    // Database doesn't exist, create it
    // IMPORTANT: Use template0 to avoid duplicate definition errors from local additions to template1
    // Also use UTF8 encoding explicitly as it's the most common and safest choice
    // See PostgreSQL docs: https://www.postgresql.org/docs/current/app-pgrestore.html#APP-PGRESTORE-NOTES
    e.log.Info("Creating database from template0 with UTF8 encoding", "name", dbName)

    // Get server's default locale for LC_COLLATE and LC_CTYPE
    // This ensures compatibility while using the correct encoding
    localeCmd := buildPsqlCmd(ctx, "postgres", "SHOW lc_collate")
    localeOutput, _ := localeCmd.CombinedOutput()
    serverLocale := strings.TrimSpace(string(localeOutput))
    if serverLocale == "" {
        serverLocale = "en_US.UTF-8" // Fallback to common default
    }

    // Build CREATE DATABASE command with encoding and locale
    // Using ENCODING 'UTF8' explicitly ensures the dump can be restored
    createSQL := fmt.Sprintf(
        "CREATE DATABASE \"%s\" WITH TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE '%s' LC_CTYPE '%s'",
        dbName, serverLocale, serverLocale,
    )
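    // Illustrative result for a hypothetical dbName "app" on an en_US.UTF-8 server:
    //   CREATE DATABASE "app" WITH TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE 'en_US.UTF-8' LC_CTYPE 'en_US.UTF-8'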

    createArgs := []string{
        "-p", fmt.Sprintf("%d", e.cfg.Port),
        "-U", e.cfg.User,
        "-d", "postgres",
        "-c", createSQL,
    }

    // Only add -h flag if host is not localhost (to use Unix socket for peer auth)
@@ -1475,10 +1919,28 @@ func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string

    output, err = createCmd.CombinedOutput()
    if err != nil {
        // If encoding/locale fails, try simpler CREATE DATABASE
        e.log.Warn("Database creation with encoding failed, trying simple create", "name", dbName, "error", err)

        simpleArgs := []string{
            "-p", fmt.Sprintf("%d", e.cfg.Port),
            "-U", e.cfg.User,
            "-d", "postgres",
            "-c", fmt.Sprintf("CREATE DATABASE \"%s\" WITH TEMPLATE template0", dbName),
        }
        if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
            simpleArgs = append([]string{"-h", e.cfg.Host}, simpleArgs...)
        }

        simpleCmd := exec.CommandContext(ctx, "psql", simpleArgs...)
        simpleCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))

        output, err = simpleCmd.CombinedOutput()
        if err != nil {
            e.log.Warn("Database creation failed", "name", dbName, "error", err, "output", string(output))
            return fmt.Errorf("failed to create database '%s': %w (output: %s)", dbName, err, strings.TrimSpace(string(output)))
        }
    }

    e.log.Info("Successfully created database from template0", "name", dbName)
    return nil
@@ -1665,9 +2127,10 @@ func (e *Engine) quickValidateSQLDump(archivePath string, compressed bool) error
    return nil
}

// boostLockCapacity checks and reports on max_locks_per_transaction capacity.
// IMPORTANT: max_locks_per_transaction requires a PostgreSQL RESTART to change!
// This function now calculates total lock capacity based on max_connections and
// warns the user if capacity is insufficient for the restore.
func (e *Engine) boostLockCapacity(ctx context.Context) (int, error) {
    // Connect to PostgreSQL to run system commands
    connStr := fmt.Sprintf("host=%s port=%d user=%s password=%s dbname=postgres sslmode=disable",
@@ -1685,7 +2148,7 @@ func (e *Engine) boostLockCapacity(ctx context.Context) (int, error) {
    }
    defer db.Close()

    // Get current max_locks_per_transaction
    var currentValue int
    err = db.QueryRowContext(ctx, "SHOW max_locks_per_transaction").Scan(&currentValue)
    if err != nil {
@@ -1698,22 +2161,56 @@ func (e *Engine) boostLockCapacity(ctx context.Context) (int, error) {
        fmt.Sscanf(currentValueStr, "%d", &currentValue)
    }

    // Get max_connections to calculate total lock capacity
    var maxConns int
    if err := db.QueryRowContext(ctx, "SHOW max_connections").Scan(&maxConns); err != nil {
        maxConns = 100 // default
    }

    // Get max_prepared_transactions
    var maxPreparedTxns int
    if err := db.QueryRowContext(ctx, "SHOW max_prepared_transactions").Scan(&maxPreparedTxns); err != nil {
        maxPreparedTxns = 0
    }

    // Calculate total lock table capacity:
    // Total locks = max_locks_per_transaction × (max_connections + max_prepared_transactions)
    totalLockCapacity := currentValue * (maxConns + maxPreparedTxns)
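    // Worked example with stock PostgreSQL defaults: 64 locks/txn × (100
    // connections + 0 prepared transactions) = 6,400 lock slots - far below
    // the 200,000 recommended below for BLOB-heavy restores.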

    e.log.Info("PostgreSQL lock table capacity",
        "max_locks_per_transaction", currentValue,
        "max_connections", maxConns,
        "max_prepared_transactions", maxPreparedTxns,
        "total_lock_capacity", totalLockCapacity)

    // Minimum recommended total capacity for BLOB-heavy restores: 200,000 locks
    minRecommendedCapacity := 200000
    if totalLockCapacity < minRecommendedCapacity {
        recommendedMaxLocks := minRecommendedCapacity / (maxConns + maxPreparedTxns)
        if recommendedMaxLocks < 4096 {
            recommendedMaxLocks = 4096
        }
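        // e.g. 200,000 / (100 + 0) = 2,000, which the floor above raises to 4,096.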

        e.log.Warn("Lock table capacity may be insufficient for BLOB-heavy restores",
            "current_total_capacity", totalLockCapacity,
            "recommended_capacity", minRecommendedCapacity,
            "current_max_locks", currentValue,
            "recommended_max_locks", recommendedMaxLocks,
            "note", "max_locks_per_transaction requires PostgreSQL RESTART to change")

        // Write suggested fix to ALTER SYSTEM but warn about restart
        _, err = db.ExecContext(ctx, fmt.Sprintf("ALTER SYSTEM SET max_locks_per_transaction = %d", recommendedMaxLocks))
        if err != nil {
            e.log.Warn("Could not set recommended max_locks_per_transaction (needs superuser)", "error", err)
        } else {
            e.log.Warn("Wrote recommended max_locks_per_transaction to postgresql.auto.conf",
                "value", recommendedMaxLocks,
                "action", "RESTART PostgreSQL to apply: sudo systemctl restart postgresql")
        }
    } else {
        e.log.Info("Lock table capacity is sufficient",
            "total_capacity", totalLockCapacity,
            "max_locks_per_transaction", currentValue)
    }

    return currentValue, nil
@@ -1761,6 +2258,8 @@ type OriginalSettings struct {
}

// boostPostgreSQLSettings boosts multiple PostgreSQL settings for large restores
// NOTE: max_locks_per_transaction requires a PostgreSQL RESTART to take effect!
// maintenance_work_mem can be changed with pg_reload_conf().
func (e *Engine) boostPostgreSQLSettings(ctx context.Context, lockBoostValue int) (*OriginalSettings, error) {
    connStr := e.buildConnString()
    db, err := sql.Open("pgx", connStr)
@@ -1780,30 +2279,156 @@ func (e *Engine) boostPostgreSQLSettings(ctx context.Context, lockBoostValue int
    // Get current maintenance_work_mem
    db.QueryRowContext(ctx, "SHOW maintenance_work_mem").Scan(&original.MaintenanceWorkMem)

    // Boost max_locks_per_transaction (if not already high enough)
    // CRITICAL: max_locks_per_transaction requires a PostgreSQL RESTART!
    // pg_reload_conf() is NOT sufficient for this parameter.
    needsRestart := false
    if original.MaxLocks < lockBoostValue {
        _, err = db.ExecContext(ctx, fmt.Sprintf("ALTER SYSTEM SET max_locks_per_transaction = %d", lockBoostValue))
        if err != nil {
            e.log.Warn("Could not set max_locks_per_transaction", "error", err)
        } else {
            needsRestart = true
            e.log.Warn("max_locks_per_transaction requires PostgreSQL restart to take effect",
                "current", original.MaxLocks,
                "target", lockBoostValue)
        }
    }

    // Boost maintenance_work_mem to 2GB for faster index creation
    // (this one CAN be applied via pg_reload_conf)
    _, err = db.ExecContext(ctx, "ALTER SYSTEM SET maintenance_work_mem = '2GB'")
    if err != nil {
        e.log.Warn("Could not boost maintenance_work_mem", "error", err)
    }

    // Reload config to apply maintenance_work_mem
    _, err = db.ExecContext(ctx, "SELECT pg_reload_conf()")
    if err != nil {
        return original, fmt.Errorf("failed to reload config: %w", err)
    }

    // If max_locks_per_transaction needs a restart, try to do it
    if needsRestart {
        if restarted := e.tryRestartPostgreSQL(ctx); restarted {
            e.log.Info("PostgreSQL restarted successfully - max_locks_per_transaction now active")
            // Wait for PostgreSQL to be ready
            time.Sleep(3 * time.Second)
        } else {
            // Cannot restart - warn user but continue
            // The setting is written to postgresql.auto.conf and will take effect on next restart
            e.log.Warn("=" + strings.Repeat("=", 70))
            e.log.Warn("NOTE: max_locks_per_transaction change requires PostgreSQL restart")
            e.log.Warn("Current value: " + strconv.Itoa(original.MaxLocks) + ", target: " + strconv.Itoa(lockBoostValue))
            e.log.Warn("")
            e.log.Warn("The setting has been saved to postgresql.auto.conf and will take")
            e.log.Warn("effect on the next PostgreSQL restart. If restore fails with")
            e.log.Warn("'out of shared memory' errors, ask your DBA to restart PostgreSQL.")
            e.log.Warn("")
            e.log.Warn("Continuing with restore - this may succeed if your databases")
            e.log.Warn("don't have many large objects (BLOBs).")
            e.log.Warn("=" + strings.Repeat("=", 70))
            // Continue anyway - might work for small restores or DBs without BLOBs
        }
    }

    return original, nil
}

// canRestartPostgreSQL checks if we have the ability to restart PostgreSQL
// Returns false if running in a restricted environment (e.g., su postgres on enterprise systems)
func (e *Engine) canRestartPostgreSQL() bool {
    // Check if we're running as postgres user - if so, we likely can't restart
    // because PostgreSQL is managed by init/systemd, not directly by pg_ctl
    currentUser := os.Getenv("USER")
    if currentUser == "" {
        currentUser = os.Getenv("LOGNAME")
    }

    // If we're the postgres user, check if we have sudo access
    if currentUser == "postgres" {
        // Try a quick sudo check - if this fails, we can't restart
        ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
        defer cancel()
        cmd := exec.CommandContext(ctx, "sudo", "-n", "true")
        cmd.Stdin = nil
        if err := cmd.Run(); err != nil {
            e.log.Info("Running as postgres user without sudo access - cannot restart PostgreSQL",
                "user", currentUser,
                "hint", "Ask system administrator to restart PostgreSQL if needed")
            return false
        }
    }

    return true
}

// tryRestartPostgreSQL attempts to restart PostgreSQL using various methods
// Returns true if restart was successful
// IMPORTANT: Uses short timeouts and non-interactive sudo to avoid blocking on password prompts
// NOTE: This function will return false immediately if running as postgres without sudo
func (e *Engine) tryRestartPostgreSQL(ctx context.Context) bool {
    // First check if we can even attempt a restart
    if !e.canRestartPostgreSQL() {
        e.log.Info("Skipping PostgreSQL restart attempt (no privileges)")
        return false
    }

    e.progress.Update("Attempting PostgreSQL restart for lock settings...")

    // Use short timeout for each restart attempt (don't block on sudo password prompts)
    runWithTimeout := func(args ...string) bool {
        cmdCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
        defer cancel()
        cmd := exec.CommandContext(cmdCtx, args[0], args[1:]...)
        // Set stdin to /dev/null to prevent sudo from waiting for password
        cmd.Stdin = nil
        return cmd.Run() == nil
    }

    // Method 1: systemctl (most common on modern Linux) - use sudo -n for non-interactive
    if runWithTimeout("sudo", "-n", "systemctl", "restart", "postgresql") {
        return true
    }

    // Method 2: systemctl with version suffix (e.g., postgresql-15)
    for _, ver := range []string{"17", "16", "15", "14", "13", "12"} {
        if runWithTimeout("sudo", "-n", "systemctl", "restart", "postgresql-"+ver) {
            return true
        }
    }

    // Method 3: service command (older systems)
    if runWithTimeout("sudo", "-n", "service", "postgresql", "restart") {
        return true
    }

    // Method 4: pg_ctl as postgres user (if we ARE postgres user, no sudo needed)
    if runWithTimeout("pg_ctl", "restart", "-D", "/var/lib/postgresql/data", "-m", "fast") {
        return true
    }

    // Method 5: Try common PGDATA paths with pg_ctl directly (for postgres user)
    pgdataPaths := []string{
        "/var/lib/pgsql/data",
        "/var/lib/pgsql/17/data",
        "/var/lib/pgsql/16/data",
        "/var/lib/pgsql/15/data",
        "/var/lib/postgresql/17/main",
        "/var/lib/postgresql/16/main",
        "/var/lib/postgresql/15/main",
    }
    for _, pgdata := range pgdataPaths {
        if runWithTimeout("pg_ctl", "restart", "-D", pgdata, "-m", "fast") {
            return true
        }
    }

    return false
}

// resetPostgreSQLSettings restores original PostgreSQL settings
// NOTE: max_locks_per_transaction changes are written but require restart to take effect.
// We don't restart here since we're done with the restore.
func (e *Engine) resetPostgreSQLSettings(ctx context.Context, original *OriginalSettings) error {
    connStr := e.buildConnString()
    db, err := sql.Open("pgx", connStr)
@@ -1812,25 +2437,28 @@ func (e *Engine) resetPostgreSQLSettings(ctx context.Context, original *Original
    }
    defer db.Close()

    // Reset max_locks_per_transaction (will take effect on next restart)
    if original.MaxLocks == 64 { // Default
        db.ExecContext(ctx, "ALTER SYSTEM RESET max_locks_per_transaction")
    } else if original.MaxLocks > 0 {
        db.ExecContext(ctx, fmt.Sprintf("ALTER SYSTEM SET max_locks_per_transaction = %d", original.MaxLocks))
    }

    // Reset maintenance_work_mem (takes effect immediately with reload)
    if original.MaintenanceWorkMem == "64MB" { // Default
        db.ExecContext(ctx, "ALTER SYSTEM RESET maintenance_work_mem")
    } else if original.MaintenanceWorkMem != "" {
        db.ExecContext(ctx, fmt.Sprintf("ALTER SYSTEM SET maintenance_work_mem = '%s'", original.MaintenanceWorkMem))
    }

    // Reload config (only maintenance_work_mem will take effect immediately)
    _, err = db.ExecContext(ctx, "SELECT pg_reload_conf()")
    if err != nil {
        return fmt.Errorf("failed to reload config: %w", err)
    }

    e.log.Info("PostgreSQL settings reset queued",
        "note", "max_locks_per_transaction will revert on next PostgreSQL restart")

    return nil
}

@@ -16,6 +16,57 @@ import (
    "github.com/shirou/gopsutil/v3/mem"
)

// CalculateOptimalParallel returns the recommended number of parallel workers
// based on available system resources (CPU cores and RAM).
// This is a standalone function that can be called from anywhere.
// Falls back to half the CPU cores (minimum 1) if memory cannot be detected.
func CalculateOptimalParallel() int {
    cpuCores := runtime.NumCPU()

    vmem, err := mem.VirtualMemory()
    if err != nil {
        // Fallback: use half of CPU cores if memory detection fails
        if cpuCores > 1 {
            return cpuCores / 2
        }
        return 1
    }

    memAvailableGB := float64(vmem.Available) / (1024 * 1024 * 1024)

    // Each pg_restore worker needs approximately 2-4GB of RAM
    // Use conservative 3GB per worker to avoid OOM
    const memPerWorkerGB = 3.0

    // Calculate limits
    maxByMem := int(memAvailableGB / memPerWorkerGB)
    maxByCPU := cpuCores

    // Use the minimum of memory and CPU limits
    recommended := maxByMem
    if maxByCPU < recommended {
        recommended = maxByCPU
    }

    // Apply sensible bounds
    if recommended < 1 {
        recommended = 1
    }
    if recommended > 16 {
        recommended = 16 // Cap at 16 to avoid diminishing returns
    }

    // If memory pressure is high (>80%), reduce parallelism
    if vmem.UsedPercent > 80 && recommended > 1 {
        recommended = recommended / 2
        if recommended < 1 {
            recommended = 1
        }
    }

    return recommended
}
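// Worked example: 8 cores with 20 GB available RAM gives maxByMem = 6 and
// maxByCPU = 8, so 6 workers are recommended; under >80% memory pressure
// that would be halved to 3.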

// PreflightResult contains all preflight check results
type PreflightResult struct {
    // Linux system checks
@@ -40,6 +91,8 @@ type LinuxChecks struct {
    MemTotal            uint64  // Total RAM in bytes
    MemAvailable        uint64  // Available RAM in bytes
    MemUsedPercent      float64 // Memory usage percentage
    CPUCores            int     // Number of CPU cores
    RecommendedParallel int     // Auto-calculated optimal parallel count
    ShmMaxOK            bool    // Is shmmax sufficient?
    ShmAllOK            bool    // Is shmall sufficient?
    MemAvailableOK      bool    // Is available RAM sufficient?
@@ -49,6 +102,8 @@ type LinuxChecks struct {
// PostgreSQLChecks contains PostgreSQL configuration checks
type PostgreSQLChecks struct {
    MaxLocksPerTransaction  int    // Current setting
    MaxPreparedTransactions int    // Current setting (affects lock capacity)
    TotalLockCapacity       int    // Calculated: max_locks × (max_connections + max_prepared)
    MaintenanceWorkMem      string // Current setting
    SharedBuffers           string // Current setting (info only)
    MaxConnections          int    // Current setting
@@ -98,6 +153,7 @@ func (e *Engine) RunPreflightChecks(ctx context.Context, dumpsDir string, entrie
// checkSystemResources uses gopsutil for cross-platform system checks
func (e *Engine) checkSystemResources(result *PreflightResult) {
    result.Linux.IsLinux = runtime.GOOS == "linux"
    result.Linux.CPUCores = runtime.NumCPU()

    // Get memory info (works on Linux, macOS, Windows, BSD)
    if vmem, err := mem.VirtualMemory(); err == nil {
@@ -116,6 +172,9 @@ func (e *Engine) checkSystemResources(result *PreflightResult) {
        e.log.Warn("Could not detect system memory", "error", err)
    }

    // Calculate recommended parallel based on resources
    result.Linux.RecommendedParallel = e.calculateRecommendedParallel(result)

    // Linux-specific kernel checks (shmmax, shmall)
    if result.Linux.IsLinux {
        e.checkLinuxKernel(result)
@@ -201,10 +260,70 @@ func (e *Engine) checkPostgreSQL(ctx context.Context, result *PreflightResult) {
        result.PostgreSQL.IsSuperuser = isSuperuser
    }

    // Check max_prepared_transactions for lock capacity calculation
    var maxPreparedTxns string
    if err := db.QueryRowContext(ctx, "SHOW max_prepared_transactions").Scan(&maxPreparedTxns); err == nil {
        result.PostgreSQL.MaxPreparedTransactions, _ = strconv.Atoi(maxPreparedTxns)
    }

    // CRITICAL: Calculate TOTAL lock table capacity
    // Formula: max_locks_per_transaction × (max_connections + max_prepared_transactions)
    // This is THE key capacity metric for BLOB-heavy restores
    maxConns := result.PostgreSQL.MaxConnections
    if maxConns == 0 {
        maxConns = 100 // default
    }
    maxPrepared := result.PostgreSQL.MaxPreparedTransactions
    totalLockCapacity := result.PostgreSQL.MaxLocksPerTransaction * (maxConns + maxPrepared)
    result.PostgreSQL.TotalLockCapacity = totalLockCapacity
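    // e.g. with defaults (64 locks/txn, 100 connections, 0 prepared transactions)
    // this yields 64 × 100 = 6,400, well short of the 200,000 threshold checked below.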

    e.log.Info("PostgreSQL lock table capacity",
        "max_locks_per_transaction", result.PostgreSQL.MaxLocksPerTransaction,
        "max_connections", maxConns,
        "max_prepared_transactions", maxPrepared,
        "total_lock_capacity", totalLockCapacity)

    // CRITICAL: max_locks_per_transaction requires PostgreSQL RESTART to change!
    // Warn users loudly about this - it's the #1 cause of "out of shared memory" errors
    if result.PostgreSQL.MaxLocksPerTransaction < 256 {
        e.log.Warn("PostgreSQL max_locks_per_transaction is LOW",
            "current", result.PostgreSQL.MaxLocksPerTransaction,
            "recommended", "256+",
            "note", "REQUIRES PostgreSQL restart to change!")

        result.Warnings = append(result.Warnings,
            fmt.Sprintf("max_locks_per_transaction=%d is low (recommend 256+). "+
                "This setting requires PostgreSQL RESTART to change. "+
                "BLOB-heavy databases may fail with 'out of shared memory' error. "+
                "Fix: Edit postgresql.conf, set max_locks_per_transaction=2048, then restart PostgreSQL.",
                result.PostgreSQL.MaxLocksPerTransaction))
    }

    // NEW: Check total lock capacity is sufficient for typical BLOB operations
    // Minimum recommended: 200,000 for moderate BLOB databases
    minRecommendedCapacity := 200000
    if totalLockCapacity < minRecommendedCapacity {
        recommendedMaxLocks := minRecommendedCapacity / (maxConns + maxPrepared)
        if recommendedMaxLocks < 4096 {
            recommendedMaxLocks = 4096
        }

        e.log.Warn("Total lock table capacity is LOW for BLOB-heavy restores",
            "current_capacity", totalLockCapacity,
            "recommended", minRecommendedCapacity,
            "current_max_locks", result.PostgreSQL.MaxLocksPerTransaction,
            "current_max_connections", maxConns,
            "recommended_max_locks", recommendedMaxLocks,
            "note", "VMs with fewer connections need higher max_locks_per_transaction")

        result.Warnings = append(result.Warnings,
            fmt.Sprintf("Total lock capacity=%d is low (recommend %d+). "+
                "Capacity = max_locks_per_transaction(%d) × max_connections(%d). "+
                "If you reduced VM size/connections, increase max_locks_per_transaction to %d. "+
                "Fix: ALTER SYSTEM SET max_locks_per_transaction = %d; then RESTART PostgreSQL.",
                totalLockCapacity, minRecommendedCapacity,
                result.PostgreSQL.MaxLocksPerTransaction, maxConns,
                recommendedMaxLocks, recommendedMaxLocks))
    }

    // Parse shared_buffers and warn if very low
@@ -315,22 +434,128 @@ func (e *Engine) calculateRecommendations(result *PreflightResult) {
    if result.Archive.TotalBlobCount > 50000 {
        lockBoost = 16384
    }
    if result.Archive.TotalBlobCount > 100000 {
        lockBoost = 32768
    }
    if result.Archive.TotalBlobCount > 200000 {
        lockBoost = 65536
    }

    // For extreme cases, calculate actual requirement
    // Rule of thumb: ~1 lock per BLOB, divided by max_connections (default 100)
    // Add 50% safety margin
    maxConns := result.PostgreSQL.MaxConnections
    if maxConns == 0 {
        maxConns = 100 // default
    }
    calculatedLocks := (result.Archive.TotalBlobCount / maxConns) * 3 / 2 // 1.5x safety margin
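    // e.g. 500,000 BLOBs with max_connections=100: 500,000/100 = 5,000 locks,
    // ×1.5 safety margin = 7,500.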
    if calculatedLocks > lockBoost {
        lockBoost = calculatedLocks
    }

    result.Archive.RecommendedLockBoost = lockBoost

    // CRITICAL: Check if current max_locks_per_transaction is dangerously low for this BLOB count
    currentLocks := result.PostgreSQL.MaxLocksPerTransaction
    if currentLocks > 0 && result.Archive.TotalBlobCount > 0 {
        // Estimate max BLOBs we can handle: locks * max_connections
        maxSafeBLOBs := currentLocks * maxConns
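        // e.g. at the defaults, 64 locks/txn × 100 connections ≈ 6,400 BLOBs before
        // the shared lock table risks overflowing with "out of shared memory".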

        if result.Archive.TotalBlobCount > maxSafeBLOBs {
            severity := "WARNING"
            if result.Archive.TotalBlobCount > maxSafeBLOBs*2 {
                severity = "CRITICAL"
                result.CanProceed = false
            }

            e.log.Error(fmt.Sprintf("%s: max_locks_per_transaction too low for BLOB count", severity),
                "current_max_locks", currentLocks,
                "total_blobs", result.Archive.TotalBlobCount,
                "max_safe_blobs", maxSafeBLOBs,
                "recommended_max_locks", lockBoost)

            result.Errors = append(result.Errors,
                fmt.Sprintf("%s: Archive contains %s BLOBs but max_locks_per_transaction=%d can only safely handle ~%s. "+
                    "Increase max_locks_per_transaction to %d in postgresql.conf and RESTART PostgreSQL.",
                    severity,
                    humanize.Comma(int64(result.Archive.TotalBlobCount)),
                    currentLocks,
                    humanize.Comma(int64(maxSafeBLOBs)),
                    lockBoost))
        }
    }

    // Log recommendation
    e.log.Info("Calculated recommended lock boost",
        "total_blobs", result.Archive.TotalBlobCount,
        "recommended_locks", lockBoost)
}

// calculateRecommendedParallel determines optimal parallelism based on system resources
// Returns the recommended number of parallel workers for pg_restore
func (e *Engine) calculateRecommendedParallel(result *PreflightResult) int {
    cpuCores := result.Linux.CPUCores
    if cpuCores == 0 {
        cpuCores = runtime.NumCPU()
    }

    memAvailableGB := float64(result.Linux.MemAvailable) / (1024 * 1024 * 1024)

    // Each pg_restore worker needs approximately 2-4GB of RAM
    // Use conservative 3GB per worker to avoid OOM
    const memPerWorkerGB = 3.0

    // Calculate limits
    maxByMem := int(memAvailableGB / memPerWorkerGB)
    maxByCPU := cpuCores

    // Use the minimum of memory and CPU limits
    recommended := maxByMem
    if maxByCPU < recommended {
        recommended = maxByCPU
    }

    // Apply sensible bounds
    if recommended < 1 {
        recommended = 1
    }
    if recommended > 16 {
        recommended = 16 // Cap at 16 to avoid diminishing returns
    }

    // If memory pressure is high (>80%), reduce parallelism
    if result.Linux.MemUsedPercent > 80 && recommended > 1 {
        recommended = recommended / 2
        if recommended < 1 {
            recommended = 1
        }
    }

    e.log.Info("Calculated recommended parallel",
        "cpu_cores", cpuCores,
        "mem_available_gb", fmt.Sprintf("%.1f", memAvailableGB),
        "max_by_mem", maxByMem,
        "max_by_cpu", maxByCPU,
        "recommended", recommended)

    return recommended
}

// printPreflightSummary prints a nice summary of all checks
// In silent mode (TUI), this is skipped and results are logged instead
func (e *Engine) printPreflightSummary(result *PreflightResult) {
    // In TUI/silent mode, don't print to stdout - it causes scrambled output
    if e.silentMode {
        // Log summary instead for debugging
        e.log.Info("Preflight checks complete",
            "can_proceed", result.CanProceed,
            "warnings", len(result.Warnings),
            "errors", len(result.Errors),
            "total_blobs", result.Archive.TotalBlobCount,
            "recommended_locks", result.Archive.RecommendedLockBoost)
        return
    }

    fmt.Println()
    fmt.Println(strings.Repeat("─", 60))
    fmt.Println(" PREFLIGHT CHECKS")
@@ -341,6 +566,8 @@ func (e *Engine) printPreflightSummary(result *PreflightResult) {
    printCheck("Total RAM", humanize.Bytes(result.Linux.MemTotal), true)
    printCheck("Available RAM", humanize.Bytes(result.Linux.MemAvailable), result.Linux.MemAvailableOK || result.Linux.MemAvailable == 0)
    printCheck("Memory Usage", fmt.Sprintf("%.1f%%", result.Linux.MemUsedPercent), result.Linux.MemUsedPercent < 85)
    printCheck("CPU Cores", fmt.Sprintf("%d", result.Linux.CPUCores), true)
    printCheck("Recommended Parallel", fmt.Sprintf("%d (auto-calculated)", result.Linux.RecommendedParallel), true)

    // Linux-specific kernel checks
    if result.Linux.IsLinux && result.Linux.ShmMax > 0 {
@@ -356,6 +583,13 @@ func (e *Engine) printPreflightSummary(result *PreflightResult) {
        humanize.Comma(int64(result.PostgreSQL.MaxLocksPerTransaction)),
        humanize.Comma(int64(result.Archive.RecommendedLockBoost))),
        true)
    printCheck("max_connections", humanize.Comma(int64(result.PostgreSQL.MaxConnections)), true)
    // Show total lock capacity with warning if low
    totalCapacityOK := result.PostgreSQL.TotalLockCapacity >= 200000
    printCheck("Total Lock Capacity",
        fmt.Sprintf("%s (max_locks × max_conns)",
            humanize.Comma(int64(result.PostgreSQL.TotalLockCapacity))),
        totalCapacityOK)
    printCheck("maintenance_work_mem", fmt.Sprintf("%s → 2GB (auto-boost)",
        result.PostgreSQL.MaintenanceWorkMem), true)
    printInfo("shared_buffers", result.PostgreSQL.SharedBuffers)
@@ -377,6 +611,14 @@ func (e *Engine) printPreflightSummary(result *PreflightResult) {
        }
    }

    // Errors (blocking issues)
    if len(result.Errors) > 0 {
        fmt.Println("\n ✗ ERRORS (must fix before proceeding):")
        for _, e := range result.Errors {
            fmt.Printf("   • %s\n", e)
        }
    }

    // Warnings
    if len(result.Warnings) > 0 {
        fmt.Println("\n ⚠ Warnings:")
@@ -385,6 +627,23 @@ func (e *Engine) printPreflightSummary(result *PreflightResult) {
        }
    }

    // Final status
    fmt.Println()
    if !result.CanProceed {
        fmt.Println(" ┌─────────────────────────────────────────────────────────┐")
        fmt.Println(" │ ✗ PREFLIGHT FAILED - Cannot proceed with restore        │")
        fmt.Println(" │   Fix the errors above and try again.                   │")
        fmt.Println(" └─────────────────────────────────────────────────────────┘")
    } else if len(result.Warnings) > 0 {
        fmt.Println(" ┌─────────────────────────────────────────────────────────┐")
        fmt.Println(" │ ⚠ PREFLIGHT PASSED WITH WARNINGS - Proceed with care    │")
        fmt.Println(" └─────────────────────────────────────────────────────────┘")
    } else {
        fmt.Println(" ┌─────────────────────────────────────────────────────────┐")
        fmt.Println(" │ ✓ PREFLIGHT PASSED - Ready to restore                   │")
        fmt.Println(" └─────────────────────────────────────────────────────────┘")
    }

    fmt.Println(strings.Repeat("─", 60))
    fmt.Println()
}

@@ -255,7 +255,9 @@ func (s *Safety) CheckDiskSpaceAt(archivePath string, checkDir string, multiplie
    // Get available disk space
    availableSpace, err := getDiskSpace(checkDir)
    if err != nil {
        if s.log != nil {
            s.log.Warn("Cannot check disk space", "error", err)
        }
        return nil // Don't fail if we can't check
    }
@@ -278,10 +280,12 @@ func (s *Safety) CheckDiskSpaceAt(archivePath string, checkDir string, multiplie
            checkDir)
    }

    if s.log != nil {
        s.log.Info("Disk space check passed",
            "location", checkDir,
            "required", FormatBytes(requiredSpace),
            "available", FormatBytes(availableSpace))
    }

    return nil
}
@@ -330,10 +334,12 @@ func (s *Safety) checkPostgresDatabaseExists(ctx context.Context, dbName string)
        "-tAc", fmt.Sprintf("SELECT 1 FROM pg_database WHERE datname='%s'", dbName),
    }

    // Always add -h flag for explicit host connection (required for password auth)
    host := s.cfg.Host
    if host == "" {
        host = "localhost"
    }
    args = append([]string{"-h", host}, args...)

    cmd := exec.CommandContext(ctx, "psql", args...)

@@ -342,9 +348,9 @@ func (s *Safety) checkPostgresDatabaseExists(ctx context.Context, dbName string)
        cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", s.cfg.Password))
    }

    output, err := cmd.CombinedOutput()
    if err != nil {
        return false, fmt.Errorf("failed to check database existence: %w (output: %s)", err, strings.TrimSpace(string(output)))
    }

    return strings.TrimSpace(string(output)) == "1", nil
@@ -401,21 +407,29 @@ func (s *Safety) listPostgresUserDatabases(ctx context.Context) ([]string, error
        "-c", query,
    }

    // Always add -h flag for explicit host connection (required for password auth)
    // Empty or unset host defaults to localhost
    host := s.cfg.Host
    if host == "" {
        host = "localhost"
    }
    args = append([]string{"-h", host}, args...)

    cmd := exec.CommandContext(ctx, "psql", args...)

    // Set password - check config first, then environment
    env := os.Environ()
    if s.cfg.Password != "" {
        env = append(env, fmt.Sprintf("PGPASSWORD=%s", s.cfg.Password))
    }
    cmd.Env = env

    s.log.Debug("Listing PostgreSQL databases", "host", host, "port", s.cfg.Port, "user", s.cfg.User)

    output, err := cmd.CombinedOutput()
    if err != nil {
        // Include psql output in error for debugging
        return nil, fmt.Errorf("failed to list databases: %w (output: %s)", err, strings.TrimSpace(string(output)))
    }

    // Parse output
@@ -428,6 +442,8 @@ func (s *Safety) listPostgresUserDatabases(ctx context.Context) ([]string, error
        }
    }

    s.log.Debug("Found user databases", "count", len(databases), "databases", databases, "raw_output", string(output))

    return databases, nil
}

||||
@@ -251,13 +251,13 @@ func (m ArchiveBrowserModel) View() string {
|
||||
var s strings.Builder
|
||||
|
||||
// Header
|
||||
title := "[PKG] Backup Archives"
|
||||
title := "[SELECT] Backup Archives"
|
||||
if m.mode == "restore-single" {
|
||||
title = "[PKG] Select Archive to Restore (Single Database)"
|
||||
title = "[SELECT] Select Archive to Restore (Single Database)"
|
||||
} else if m.mode == "restore-cluster" {
|
||||
title = "[PKG] Select Archive to Restore (Cluster)"
|
||||
title = "[SELECT] Select Archive to Restore (Cluster)"
|
||||
} else if m.mode == "diagnose" {
|
||||
title = "[SEARCH] Select Archive to Diagnose"
|
||||
title = "[SELECT] Select Archive to Diagnose"
|
||||
}
|
||||
|
||||
s.WriteString(titleStyle.Render(title))
|
||||
|
||||
internal/tui/backup_exec.go: 375 changed lines (Executable file → Normal file)
@@ -4,6 +4,7 @@ import (
    "context"
    "fmt"
    "strings"
    "sync"
    "time"

    tea "github.com/charmbracelet/bubbletea"
@@ -12,6 +13,14 @@ import (
    "dbbackup/internal/config"
    "dbbackup/internal/database"
    "dbbackup/internal/logger"
    "path/filepath"
)

// Backup phase constants for consistency
const (
    backupPhaseGlobals     = 1
    backupPhaseDatabases   = 2
    backupPhaseCompressing = 3
)

// BackupExecutionModel handles backup execution with progress
@@ -30,9 +39,81 @@ type BackupExecutionModel struct {
    cancelling   bool // True when user has requested cancellation
    err          error
    result       string
    archivePath  string // Path to created archive (for summary)
    archiveSize  int64  // Size of created archive (for summary)
    startTime    time.Time
    elapsed      time.Duration // Final elapsed time
    details      []string
    spinnerFrame int

    // Database count progress (for cluster backup)
    dbTotal         int
    dbDone          int
    dbName          string        // Current database being backed up
    overallPhase    int           // 1=globals, 2=databases, 3=compressing
    phaseDesc       string        // Description of current phase
    phase2StartTime time.Time     // When phase 2 (databases) started (for realtime ETA)
    dbPhaseElapsed  time.Duration // Elapsed time since database backup phase started
    dbAvgPerDB      time.Duration // Average time per database backup
}

// sharedBackupProgressState holds progress state that can be safely accessed from callbacks
type sharedBackupProgressState struct {
    mu              sync.Mutex
    dbTotal         int
    dbDone          int
    dbName          string
    overallPhase    int    // 1=globals, 2=databases, 3=compressing
    phaseDesc       string // Description of current phase
    hasUpdate       bool
    phase2StartTime time.Time     // When phase 2 started (for realtime ETA calculation)
    dbPhaseElapsed  time.Duration // Elapsed time since database backup phase started
    dbAvgPerDB      time.Duration // Average time per database backup
}

// Package-level shared progress state for backup operations
var (
    currentBackupProgressMu    sync.Mutex
    currentBackupProgressState *sharedBackupProgressState
)

func setCurrentBackupProgress(state *sharedBackupProgressState) {
    currentBackupProgressMu.Lock()
    defer currentBackupProgressMu.Unlock()
    currentBackupProgressState = state
}

func clearCurrentBackupProgress() {
    currentBackupProgressMu.Lock()
    defer currentBackupProgressMu.Unlock()
    currentBackupProgressState = nil
}

func getCurrentBackupProgress() (dbTotal, dbDone int, dbName string, overallPhase int, phaseDesc string, hasUpdate bool, dbPhaseElapsed, dbAvgPerDB time.Duration, phase2StartTime time.Time) {
    currentBackupProgressMu.Lock()
    defer currentBackupProgressMu.Unlock()

    if currentBackupProgressState == nil {
        return 0, 0, "", 0, "", false, 0, 0, time.Time{}
    }

    currentBackupProgressState.mu.Lock()
    defer currentBackupProgressState.mu.Unlock()

    hasUpdate = currentBackupProgressState.hasUpdate
    currentBackupProgressState.hasUpdate = false

    // Calculate realtime phase elapsed if we have a phase 2 start time
    dbPhaseElapsed = currentBackupProgressState.dbPhaseElapsed
    if !currentBackupProgressState.phase2StartTime.IsZero() {
        dbPhaseElapsed = time.Since(currentBackupProgressState.phase2StartTime)
    }

    return currentBackupProgressState.dbTotal, currentBackupProgressState.dbDone,
        currentBackupProgressState.dbName, currentBackupProgressState.overallPhase,
        currentBackupProgressState.phaseDesc, hasUpdate,
        dbPhaseElapsed, currentBackupProgressState.dbAvgPerDB,
        currentBackupProgressState.phase2StartTime
}
|
||||
|
||||
func NewBackupExecution(cfg *config.Config, log logger.Logger, parent tea.Model, ctx context.Context, backupType, dbName string, ratio int) BackupExecutionModel {
|
||||
@@ -55,7 +136,6 @@ func NewBackupExecution(cfg *config.Config, log logger.Logger, parent tea.Model,
|
||||
}
|
||||
|
||||
func (m BackupExecutionModel) Init() tea.Cmd {
|
||||
// TUI handles all display through View() - no progress callbacks needed
|
||||
return tea.Batch(
|
||||
executeBackupWithTUIProgress(m.ctx, m.config, m.logger, m.backupType, m.databaseName, m.ratio),
|
||||
backupTickCmd(),
|
||||
@@ -79,6 +159,9 @@ type backupProgressMsg struct {
|
||||
type backupCompleteMsg struct {
|
||||
result string
|
||||
err error
|
||||
archivePath string
|
||||
archiveSize int64
|
||||
elapsed time.Duration
|
||||
}
|
||||
|
||||
func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config, log logger.Logger, backupType, dbName string, ratio int) tea.Cmd {
|
||||
@@ -91,6 +174,11 @@ func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config,
|
||||
|
||||
start := time.Now()
|
||||
|
||||
// Setup shared progress state for TUI polling
|
||||
progressState := &sharedBackupProgressState{}
|
||||
setCurrentBackupProgress(progressState)
|
||||
defer clearCurrentBackupProgress()
|
||||
|
||||
dbClient, err := database.New(cfg, log)
|
||||
if err != nil {
|
||||
return backupCompleteMsg{
|
||||
@@ -110,6 +198,22 @@ func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config,
|
||||
// Pass nil as indicator - TUI itself handles all display, no stdout printing
|
||||
engine := backup.NewSilent(cfg, log, dbClient, nil)
|
||||
|
||||
// Set database progress callback for cluster backups
|
||||
engine.SetDatabaseProgressCallback(func(done, total int, currentDB string) {
|
||||
progressState.mu.Lock()
|
||||
progressState.dbDone = done
|
||||
progressState.dbTotal = total
|
||||
progressState.dbName = currentDB
|
||||
progressState.overallPhase = backupPhaseDatabases
|
||||
progressState.phaseDesc = fmt.Sprintf("Phase 2/3: Backing up Databases (%d/%d)", done, total)
|
||||
progressState.hasUpdate = true
|
||||
// Set phase 2 start time on first callback (for realtime ETA calculation)
|
||||
if progressState.phase2StartTime.IsZero() {
|
||||
progressState.phase2StartTime = time.Now()
|
||||
}
|
||||
progressState.mu.Unlock()
|
||||
})
|
||||
|
||||
var backupErr error
|
||||
switch backupType {
|
||||
case "single":
|
||||
@@ -146,6 +250,7 @@ func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config,
|
||||
return backupCompleteMsg{
|
||||
result: result,
|
||||
err: nil,
|
||||
elapsed: elapsed,
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -157,10 +262,25 @@ func (m BackupExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
// Increment spinner frame for smooth animation
|
||||
m.spinnerFrame = (m.spinnerFrame + 1) % len(spinnerFrames)
|
||||
|
||||
// Update status based on elapsed time to show progress
|
||||
// Poll for database progress updates from callbacks
|
||||
dbTotal, dbDone, dbName, overallPhase, phaseDesc, hasUpdate, dbPhaseElapsed, dbAvgPerDB, _ := getCurrentBackupProgress()
|
||||
if hasUpdate {
|
||||
m.dbTotal = dbTotal
|
||||
m.dbDone = dbDone
|
||||
m.dbName = dbName
|
||||
m.overallPhase = overallPhase
|
||||
m.phaseDesc = phaseDesc
|
||||
m.dbPhaseElapsed = dbPhaseElapsed
|
||||
m.dbAvgPerDB = dbAvgPerDB
|
||||
}
|
||||
|
||||
// Update status based on progress and elapsed time
|
||||
elapsedSec := int(time.Since(m.startTime).Seconds())
|
||||
|
||||
if elapsedSec < 2 {
|
||||
if m.dbTotal > 0 && m.dbDone > 0 {
|
||||
// We have real progress from cluster backup
|
||||
m.status = fmt.Sprintf("Backing up database: %s", m.dbName)
|
||||
} else if elapsedSec < 2 {
|
||||
m.status = "Initializing backup..."
|
||||
} else if elapsedSec < 5 {
|
||||
if m.backupType == "cluster" {
|
||||
@@ -199,6 +319,7 @@ func (m BackupExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
m.done = true
|
||||
m.err = msg.err
|
||||
m.result = msg.result
|
||||
m.elapsed = msg.elapsed
|
||||
if m.err == nil {
|
||||
m.status = "[OK] Backup completed successfully!"
|
||||
} else {
|
||||
@@ -210,6 +331,20 @@ func (m BackupExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
}
|
||||
return m, nil
|
||||
|
||||
case tea.InterruptMsg:
|
||||
// Handle Ctrl+C signal (SIGINT) - Bubbletea v1.3+ sends this instead of KeyMsg for ctrl+c
|
||||
if !m.done && !m.cancelling {
|
||||
m.cancelling = true
|
||||
m.status = "[STOP] Cancelling backup... (please wait)"
|
||||
if m.cancel != nil {
|
||||
m.cancel()
|
||||
}
|
||||
return m, nil
|
||||
} else if m.done {
|
||||
return m.parent, tea.Quit
|
||||
}
|
||||
return m, nil
|
||||
|
||||
case tea.KeyMsg:
|
||||
switch msg.String() {
|
||||
case "ctrl+c", "esc":
|
||||
@@ -234,14 +369,80 @@ func (m BackupExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
return m, nil
|
||||
}
|
||||
|
||||
// renderDatabaseProgressBar renders a progress bar for database count progress
|
||||
func renderBackupDatabaseProgressBar(done, total int, dbName string, width int) string {
|
||||
if total == 0 {
|
||||
return ""
|
||||
}
|
||||
|
||||
// Calculate progress percentage
|
||||
percent := float64(done) / float64(total)
|
||||
if percent > 1.0 {
|
||||
percent = 1.0
|
||||
}
|
||||
|
||||
// Calculate filled width
|
||||
barWidth := width - 20 // Leave room for label and percentage
|
||||
if barWidth < 10 {
|
||||
barWidth = 10
|
||||
}
|
||||
filled := int(float64(barWidth) * percent)
|
||||
if filled > barWidth {
|
||||
filled = barWidth
|
||||
}
|
||||
|
||||
// Build progress bar
|
||||
bar := strings.Repeat("█", filled) + strings.Repeat("░", barWidth-filled)
|
||||
|
||||
return fmt.Sprintf(" Database: [%s] %d/%d", bar, done, total)
|
||||
}
|
||||
|
||||
// renderBackupDatabaseProgressBarWithTiming renders database backup progress with ETA
|
||||
func renderBackupDatabaseProgressBarWithTiming(done, total int, dbPhaseElapsed, dbAvgPerDB time.Duration) string {
|
||||
if total == 0 {
|
||||
return ""
|
||||
}
|
||||
|
||||
// Calculate progress percentage
|
||||
percent := float64(done) / float64(total)
|
||||
if percent > 1.0 {
|
||||
percent = 1.0
|
||||
}
|
||||
|
||||
// Build progress bar
|
||||
barWidth := 50
|
||||
filled := int(float64(barWidth) * percent)
|
||||
if filled > barWidth {
|
||||
filled = barWidth
|
||||
}
|
||||
bar := strings.Repeat("█", filled) + strings.Repeat("░", barWidth-filled)
|
||||
|
||||
// Calculate ETA similar to restore
|
||||
var etaStr string
|
||||
if done > 0 && done < total {
|
||||
avgPerDB := dbPhaseElapsed / time.Duration(done)
|
||||
remaining := total - done
|
||||
eta := avgPerDB * time.Duration(remaining)
|
||||
etaStr = fmt.Sprintf(" | ETA: %s", formatDuration(eta))
|
||||
} else if done == total {
|
||||
etaStr = " | Complete"
|
||||
}
|
||||
|
||||
return fmt.Sprintf(" Databases: [%s] %d/%d | Elapsed: %s%s\n",
|
||||
bar, done, total, formatDuration(dbPhaseElapsed), etaStr)
|
||||
}
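
A quick sanity check of the ETA arithmetic above, as a hedged sketch; the input values and the exact strings produced by formatDuration are illustrative, not taken from a real run:

    // Illustrative only: 3 of 10 databases finished after 90s of phase 2.
    // avgPerDB = 90s / 3 = 30s; remaining = 7; ETA = 7 * 30s = 3m30s.
    line := renderBackupDatabaseProgressBarWithTiming(3, 10, 90*time.Second, 0)
    fmt.Print(line) // " Databases: [███...░░░] 3/10 | Elapsed: <90s> | ETA: <3m30s>"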

func (m BackupExecutionModel) View() string {
    var s strings.Builder
    s.Grow(512) // Pre-allocate estimated capacity for better performance

    // Clear screen with newlines and render header
    s.WriteString("\n\n")
    header := titleStyle.Render("[EXEC] Backup Execution")
    s.WriteString(header)
    header := "[EXEC] Backing up Database"
    if m.backupType == "cluster" {
        header = "[EXEC] Cluster Backup"
    }
    s.WriteString(titleStyle.Render(header))
    s.WriteString("\n\n")

    // Backup details - properly aligned
@@ -252,33 +453,159 @@ func (m BackupExecutionModel) View() string {
    if m.ratio > 0 {
        s.WriteString(fmt.Sprintf(" %-10s %d\n", "Sample:", m.ratio))
    }
    s.WriteString(fmt.Sprintf(" %-10s %s\n", "Duration:", time.Since(m.startTime).Round(time.Second)))
    s.WriteString("\n")

    // Status with spinner
    // Status display
    if !m.done {
        if m.cancelling {
            s.WriteString(fmt.Sprintf(" %s %s\n", spinnerFrames[m.spinnerFrame], m.status))
        } else {
            s.WriteString(fmt.Sprintf(" %s %s\n", spinnerFrames[m.spinnerFrame], m.status))
            s.WriteString("\n [KEY] Press Ctrl+C or ESC to cancel\n")
        }
    } else {
        s.WriteString(fmt.Sprintf(" %s\n\n", m.status))
        // Unified progress display for cluster backup
        if m.backupType == "cluster" {
            // Calculate overall progress across all phases
            // Phase 1: Globals (0-15%)
            // Phase 2: Databases (15-90%)
            // Phase 3: Compressing (90-100%)
            overallProgress := 0
            phaseLabel := "Starting..."

            elapsedSec := int(time.Since(m.startTime).Seconds())

            if m.overallPhase == backupPhaseDatabases && m.dbTotal > 0 {
                // Phase 2: Database backups - contributes 15-90%
                dbPct := int((int64(m.dbDone) * 100) / int64(m.dbTotal))
                overallProgress = 15 + (dbPct * 75 / 100)
                phaseLabel = m.phaseDesc
            } else if m.overallPhase == backupPhaseCompressing {
                // Phase 3: Compressing archive
                overallProgress = 92
                phaseLabel = "Phase 3/3: Compressing Archive"
            } else if elapsedSec < 5 {
                // Initial setup
                overallProgress = 2
                phaseLabel = "Phase 1/3: Initializing..."
            } else if m.dbTotal == 0 {
                // Phase 1: Globals backup (before databases start)
                overallProgress = 10
                phaseLabel = "Phase 1/3: Backing up Globals"
            }

            // Header with phase and overall progress
            s.WriteString(infoStyle.Render(" ─── Cluster Backup Progress ──────────────────────────────"))
            s.WriteString("\n\n")
            s.WriteString(fmt.Sprintf(" %s\n\n", phaseLabel))

            // Overall progress bar
            s.WriteString(" Overall: ")
            s.WriteString(renderProgressBar(overallProgress))
            s.WriteString(fmt.Sprintf(" %d%%\n", overallProgress))

            // Phase-specific details
            if m.dbTotal > 0 && m.dbDone > 0 {
                // Show current database being backed up
                s.WriteString("\n")
                spinner := spinnerFrames[m.spinnerFrame]
                if m.dbName != "" && m.dbDone <= m.dbTotal {
                    s.WriteString(fmt.Sprintf(" Current: %s %s\n", spinner, m.dbName))
                }
                s.WriteString("\n")

                // Database progress bar with timing
                s.WriteString(renderBackupDatabaseProgressBarWithTiming(m.dbDone, m.dbTotal, m.dbPhaseElapsed, m.dbAvgPerDB))
                s.WriteString("\n")
            } else {
                // Intermediate phase (globals)
                spinner := spinnerFrames[m.spinnerFrame]
                s.WriteString(fmt.Sprintf("\n %s %s\n\n", spinner, m.status))
            }

            s.WriteString("\n")
            s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
            s.WriteString("\n\n")
        } else {
            // Single/sample database backup - simpler display
            spinner := spinnerFrames[m.spinnerFrame]
            s.WriteString(fmt.Sprintf(" %s %s\n", spinner, m.status))
        }

        if !m.cancelling {
            // Elapsed time
            s.WriteString(fmt.Sprintf("Elapsed: %s\n", formatDuration(time.Since(m.startTime))))
            s.WriteString("\n")
            s.WriteString(infoStyle.Render("[KEYS] Press Ctrl+C or ESC to cancel"))
        }
    } else {
        // Show completion summary with detailed stats
        if m.err != nil {
            s.WriteString(fmt.Sprintf(" [FAIL] Error: %v\n", m.err))
        } else if m.result != "" {
            // Parse and display result cleanly
            lines := strings.Split(m.result, "\n")
            for _, line := range lines {
                line = strings.TrimSpace(line)
                if line != "" {
                    s.WriteString(" " + line + "\n")
            s.WriteString(errorStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
            s.WriteString("\n")
            s.WriteString(errorStyle.Render("║ [FAIL] BACKUP FAILED ║"))
            s.WriteString("\n")
            s.WriteString(errorStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
            s.WriteString("\n\n")
            s.WriteString(errorStyle.Render(fmt.Sprintf(" Error: %v", m.err)))
            s.WriteString("\n")
        } else {
            s.WriteString(successStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
            s.WriteString("\n")
            s.WriteString(successStyle.Render("║ [OK] BACKUP COMPLETED SUCCESSFULLY ║"))
            s.WriteString("\n")
            s.WriteString(successStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
            s.WriteString("\n\n")

            // Summary section
            s.WriteString(infoStyle.Render(" ─── Summary ───────────────────────────────────────────────"))
            s.WriteString("\n\n")

            // Archive info (if available)
            if m.archivePath != "" {
                s.WriteString(fmt.Sprintf(" Archive: %s\n", filepath.Base(m.archivePath)))
            }
            if m.archiveSize > 0 {
                s.WriteString(fmt.Sprintf(" Archive Size: %s\n", FormatBytes(m.archiveSize)))
            }

            // Backup type specific info
            switch m.backupType {
            case "cluster":
                s.WriteString(" Type: Cluster Backup\n")
                if m.dbTotal > 0 {
                    s.WriteString(fmt.Sprintf(" Databases: %d backed up\n", m.dbTotal))
                }
                s.WriteString("\n [KEY] Press Enter or ESC to return to menu\n")
            case "single":
                s.WriteString(" Type: Single Database Backup\n")
                s.WriteString(fmt.Sprintf(" Database: %s\n", m.databaseName))
            case "sample":
                s.WriteString(" Type: Sample Backup\n")
                s.WriteString(fmt.Sprintf(" Database: %s\n", m.databaseName))
                s.WriteString(fmt.Sprintf(" Sample Ratio: %d\n", m.ratio))
            }

            s.WriteString("\n")
        }

        // Timing section (always shown, consistent with restore)
        s.WriteString(infoStyle.Render(" ─── Timing ────────────────────────────────────────────────"))
        s.WriteString("\n\n")

        elapsed := m.elapsed
        if elapsed == 0 {
            elapsed = time.Since(m.startTime)
        }
        s.WriteString(fmt.Sprintf(" Total Time: %s\n", formatDuration(elapsed)))

        // Calculate and show throughput if we have size info
        if m.archiveSize > 0 && elapsed.Seconds() > 0 {
            throughput := float64(m.archiveSize) / elapsed.Seconds()
            s.WriteString(fmt.Sprintf(" Throughput: %s/s (average)\n", FormatBytes(int64(throughput))))
        }

        if m.backupType == "cluster" && m.dbTotal > 0 && m.err == nil {
            avgPerDB := elapsed / time.Duration(m.dbTotal)
            s.WriteString(fmt.Sprintf(" Avg per DB: %s\n", formatDuration(avgPerDB)))
        }

        s.WriteString("\n")
        s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
        s.WriteString("\n\n")
        s.WriteString(infoStyle.Render(" [KEYS] Press Enter to continue"))
    }

    return s.String()

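The phase weighting the cluster view applies above (globals 0-15%, databases 15-90%, compression 90-100%) can be read as one small mapping; a sketch for illustration only, the diff itself inlines this logic in View() rather than using a helper:

    // overallPercent maps per-phase progress onto the combined 0-100 scale
    // used by the cluster backup view (sketch; not part of the diff).
    func overallPercent(phase, dbDone, dbTotal int) int {
        switch phase {
        case backupPhaseGlobals:
            return 10 // nominal mid-phase value, as in the view
        case backupPhaseDatabases:
            if dbTotal == 0 {
                return 15
            }
            dbPct := (dbDone * 100) / dbTotal
            return 15 + (dbPct * 75 / 100) // e.g. 5/10 done -> 15 + 37 = 52
        case backupPhaseCompressing:
            return 92
        }
        return 2 // initializing
    }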
@@ -230,7 +230,7 @@ func (m BackupManagerModel) View() string {
    var s strings.Builder

    // Title
    s.WriteString(TitleStyle.Render("[DB] Backup Archive Manager"))
    s.WriteString(TitleStyle.Render("[SELECT] Backup Archive Manager"))
    s.WriteString("\n\n")

    // Status line (no box, bold+color accents)

406 internal/tui/detailed_progress.go (Normal file)
@@ -0,0 +1,406 @@
package tui

import (
    "fmt"
    "strings"
    "sync"
    "time"
)

// DetailedProgress provides schollz-like progress information for TUI rendering
// This is a data structure that can be queried by Bubble Tea's View() method
type DetailedProgress struct {
    mu sync.RWMutex

    // Core progress
    Total int64 // Total bytes or items
    Current int64 // Current bytes or items done

    // Display info
    Description string // What operation is happening
    Unit string // "bytes", "files", "databases", etc.

    // Timing for ETA/speed calculation
    StartTime time.Time
    LastUpdate time.Time
    SpeedWindow []speedSample // Rolling window for speed calculation

    // State
    IsIndeterminate bool // True if total is unknown (spinner mode)
    IsComplete bool
    IsFailed bool
    ErrorMessage string
}

type speedSample struct {
    timestamp time.Time
    bytes int64
}

// NewDetailedProgress creates a progress tracker with known total
func NewDetailedProgress(total int64, description string) *DetailedProgress {
    return &DetailedProgress{
        Total: total,
        Description: description,
        Unit: "bytes",
        StartTime: time.Now(),
        LastUpdate: time.Now(),
        SpeedWindow: make([]speedSample, 0, 20),
        IsIndeterminate: total <= 0,
    }
}

// NewDetailedProgressItems creates a progress tracker for item counts
func NewDetailedProgressItems(total int, description string) *DetailedProgress {
    return &DetailedProgress{
        Total: int64(total),
        Description: description,
        Unit: "items",
        StartTime: time.Now(),
        LastUpdate: time.Now(),
        SpeedWindow: make([]speedSample, 0, 20),
        IsIndeterminate: total <= 0,
    }
}

// NewDetailedProgressSpinner creates an indeterminate progress tracker
func NewDetailedProgressSpinner(description string) *DetailedProgress {
    return &DetailedProgress{
        Total: -1,
        Description: description,
        Unit: "",
        StartTime: time.Now(),
        LastUpdate: time.Now(),
        SpeedWindow: make([]speedSample, 0, 20),
        IsIndeterminate: true,
    }
}

// Add adds to the current progress
func (dp *DetailedProgress) Add(n int64) {
    dp.mu.Lock()
    defer dp.mu.Unlock()

    dp.Current += n
    dp.LastUpdate = time.Now()

    // Add speed sample
    dp.SpeedWindow = append(dp.SpeedWindow, speedSample{
        timestamp: dp.LastUpdate,
        bytes: dp.Current,
    })

    // Keep only last 20 samples for speed calculation
    if len(dp.SpeedWindow) > 20 {
        dp.SpeedWindow = dp.SpeedWindow[len(dp.SpeedWindow)-20:]
    }
}

// Set sets the current progress to a specific value
func (dp *DetailedProgress) Set(n int64) {
    dp.mu.Lock()
    defer dp.mu.Unlock()

    dp.Current = n
    dp.LastUpdate = time.Now()

    // Add speed sample
    dp.SpeedWindow = append(dp.SpeedWindow, speedSample{
        timestamp: dp.LastUpdate,
        bytes: dp.Current,
    })

    if len(dp.SpeedWindow) > 20 {
        dp.SpeedWindow = dp.SpeedWindow[len(dp.SpeedWindow)-20:]
    }
}

// SetTotal updates the total (useful when total becomes known during operation)
func (dp *DetailedProgress) SetTotal(total int64) {
    dp.mu.Lock()
    defer dp.mu.Unlock()

    dp.Total = total
    dp.IsIndeterminate = total <= 0
}

// SetDescription updates the description
func (dp *DetailedProgress) SetDescription(desc string) {
    dp.mu.Lock()
    defer dp.mu.Unlock()
    dp.Description = desc
}

// Complete marks the progress as complete
func (dp *DetailedProgress) Complete() {
    dp.mu.Lock()
    defer dp.mu.Unlock()

    dp.IsComplete = true
    dp.Current = dp.Total
}

// Fail marks the progress as failed
func (dp *DetailedProgress) Fail(errMsg string) {
    dp.mu.Lock()
    defer dp.mu.Unlock()

    dp.IsFailed = true
    dp.ErrorMessage = errMsg
}

// GetPercent returns the progress percentage (0-100)
func (dp *DetailedProgress) GetPercent() int {
    dp.mu.RLock()
    defer dp.mu.RUnlock()

    if dp.IsIndeterminate || dp.Total <= 0 {
        return 0
    }
    percent := int((dp.Current * 100) / dp.Total)
    if percent > 100 {
        return 100
    }
    return percent
}

// GetSpeed returns the current transfer speed in bytes/second
func (dp *DetailedProgress) GetSpeed() float64 {
    dp.mu.RLock()
    defer dp.mu.RUnlock()

    if len(dp.SpeedWindow) < 2 {
        return 0
    }

    // Use first and last samples in window for smoothed speed
    first := dp.SpeedWindow[0]
    last := dp.SpeedWindow[len(dp.SpeedWindow)-1]

    elapsed := last.timestamp.Sub(first.timestamp).Seconds()
    if elapsed <= 0 {
        return 0
    }

    bytesTransferred := last.bytes - first.bytes
    return float64(bytesTransferred) / elapsed
}

// GetETA returns the estimated time remaining
func (dp *DetailedProgress) GetETA() time.Duration {
    dp.mu.RLock()
    defer dp.mu.RUnlock()

    if dp.IsIndeterminate || dp.Total <= 0 || dp.Current >= dp.Total {
        return 0
    }

    speed := dp.getSpeedLocked()
    if speed <= 0 {
        return 0
    }

    remaining := dp.Total - dp.Current
    seconds := float64(remaining) / speed
    return time.Duration(seconds) * time.Second
}

func (dp *DetailedProgress) getSpeedLocked() float64 {
    if len(dp.SpeedWindow) < 2 {
        return 0
    }

    first := dp.SpeedWindow[0]
    last := dp.SpeedWindow[len(dp.SpeedWindow)-1]

    elapsed := last.timestamp.Sub(first.timestamp).Seconds()
    if elapsed <= 0 {
        return 0
    }

    bytesTransferred := last.bytes - first.bytes
    return float64(bytesTransferred) / elapsed
}

// GetElapsed returns the elapsed time since start
func (dp *DetailedProgress) GetElapsed() time.Duration {
    dp.mu.RLock()
    defer dp.mu.RUnlock()
    return time.Since(dp.StartTime)
}

// GetState returns a snapshot of the current state for rendering
func (dp *DetailedProgress) GetState() DetailedProgressState {
    dp.mu.RLock()
    defer dp.mu.RUnlock()

    return DetailedProgressState{
        Description: dp.Description,
        Current: dp.Current,
        Total: dp.Total,
        Percent: dp.getPercentLocked(),
        Speed: dp.getSpeedLocked(),
        ETA: dp.getETALocked(),
        Elapsed: time.Since(dp.StartTime),
        Unit: dp.Unit,
        IsIndeterminate: dp.IsIndeterminate,
        IsComplete: dp.IsComplete,
        IsFailed: dp.IsFailed,
        ErrorMessage: dp.ErrorMessage,
    }
}

func (dp *DetailedProgress) getPercentLocked() int {
    if dp.IsIndeterminate || dp.Total <= 0 {
        return 0
    }
    percent := int((dp.Current * 100) / dp.Total)
    if percent > 100 {
        return 100
    }
    return percent
}

func (dp *DetailedProgress) getETALocked() time.Duration {
    if dp.IsIndeterminate || dp.Total <= 0 || dp.Current >= dp.Total {
        return 0
    }

    speed := dp.getSpeedLocked()
    if speed <= 0 {
        return 0
    }

    remaining := dp.Total - dp.Current
    seconds := float64(remaining) / speed
    return time.Duration(seconds) * time.Second
}

// DetailedProgressState is an immutable snapshot for rendering
type DetailedProgressState struct {
    Description string
    Current int64
    Total int64
    Percent int
    Speed float64 // bytes/sec
    ETA time.Duration
    Elapsed time.Duration
    Unit string
    IsIndeterminate bool
    IsComplete bool
    IsFailed bool
    ErrorMessage string
}

// RenderProgressBar renders a TUI-friendly progress bar string
// Returns something like: "Extracting archive [████████░░░░░░░░░░░░] 45% 12.5 MB/s ETA: 2m 30s"
func (s DetailedProgressState) RenderProgressBar(width int) string {
    if s.IsIndeterminate {
        return s.renderIndeterminate()
    }

    // Progress bar
    barWidth := 30
    if width < 80 {
        barWidth = 20
    }
    filled := (s.Percent * barWidth) / 100
    if filled > barWidth {
        filled = barWidth
    }

    bar := strings.Repeat("█", filled) + strings.Repeat("░", barWidth-filled)

    // Format bytes
    currentStr := FormatBytes(s.Current)
    totalStr := FormatBytes(s.Total)

    // Format speed
    speedStr := ""
    if s.Speed > 0 {
        speedStr = fmt.Sprintf("%s/s", FormatBytes(int64(s.Speed)))
    }

    // Format ETA
    etaStr := ""
    if s.ETA > 0 && !s.IsComplete {
        etaStr = fmt.Sprintf("ETA: %s", FormatDurationShort(s.ETA))
    }

    // Build the line
    parts := []string{
        fmt.Sprintf("[%s]", bar),
        fmt.Sprintf("%3d%%", s.Percent),
    }

    if s.Unit == "bytes" && s.Total > 0 {
        parts = append(parts, fmt.Sprintf("%s/%s", currentStr, totalStr))
    } else if s.Total > 0 {
        parts = append(parts, fmt.Sprintf("%d/%d", s.Current, s.Total))
    }

    if speedStr != "" {
        parts = append(parts, speedStr)
    }
    if etaStr != "" {
        parts = append(parts, etaStr)
    }

    return strings.Join(parts, " ")
}

func (s DetailedProgressState) renderIndeterminate() string {
    elapsed := FormatDurationShort(s.Elapsed)
    return fmt.Sprintf("[spinner] %s Elapsed: %s", s.Description, elapsed)
}

// RenderCompact renders a compact single-line progress string
func (s DetailedProgressState) RenderCompact() string {
    if s.IsComplete {
        return fmt.Sprintf("[OK] %s completed in %s", s.Description, FormatDurationShort(s.Elapsed))
    }
    if s.IsFailed {
        return fmt.Sprintf("[FAIL] %s: %s", s.Description, s.ErrorMessage)
    }
    if s.IsIndeterminate {
        return fmt.Sprintf("[...] %s (%s)", s.Description, FormatDurationShort(s.Elapsed))
    }

    return fmt.Sprintf("[%3d%%] %s - %s/%s", s.Percent, s.Description,
        FormatBytes(s.Current), FormatBytes(s.Total))
}

// FormatBytes formats bytes in human-readable format
func FormatBytes(b int64) string {
    const unit = 1024
    if b < unit {
        return fmt.Sprintf("%d B", b)
    }
    div, exp := int64(unit), 0
    for n := b / unit; n >= unit; n /= unit {
        div *= unit
        exp++
    }
    return fmt.Sprintf("%.1f %cB", float64(b)/float64(div), "KMGTPE"[exp])
}

// FormatDurationShort formats duration in short form
func FormatDurationShort(d time.Duration) string {
    if d < time.Second {
        return "<1s"
    }
    if d < time.Minute {
        return fmt.Sprintf("%ds", int(d.Seconds()))
    }
    if d < time.Hour {
        m := int(d.Minutes())
        s := int(d.Seconds()) % 60
        if s > 0 {
            return fmt.Sprintf("%dm %ds", m, s)
        }
        return fmt.Sprintf("%dm", m)
    }
    h := int(d.Hours())
    m := int(d.Minutes()) % 60
    return fmt.Sprintf("%dh %dm", h, m)
}
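
End to end, the new tracker is fed from a worker and rendered from View(); a minimal usage sketch using only the functions defined in this file (the values, and the model wiring around it, are illustrative):

    // Sketch: track a 100 MB extraction and render it.
    dp := NewDetailedProgress(100*1024*1024, "Extracting archive")
    dp.Add(25 * 1024 * 1024)                 // worker reports bytes as they land
    state := dp.GetState()                   // View() takes an immutable snapshot
    fmt.Println(state.RenderProgressBar(80)) // "[███████░...] 25% 25.0 MB/100.0 MB"
    fmt.Println(state.RenderCompact())       // "[ 25%] Extracting archive - 25.0 MB/100.0 MB"
    dp.Complete()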
@@ -160,7 +160,7 @@ func (m DiagnoseViewModel) View() string {
    var s strings.Builder

    // Header
    s.WriteString(titleStyle.Render("[SEARCH] Backup Diagnosis"))
    s.WriteString(titleStyle.Render("[CHECK] Backup Diagnosis"))
    s.WriteString("\n\n")

    // Archive info
@@ -204,132 +204,111 @@ func (m DiagnoseViewModel) View() string {
func (m DiagnoseViewModel) renderSingleResult(result *restore.DiagnoseResult) string {
    var s strings.Builder

    // Status Box
    s.WriteString("+--[ VALIDATION STATUS ]" + strings.Repeat("-", 37) + "+\n")
    // Validation Status
    s.WriteString(diagnoseHeaderStyle.Render("[STATUS] Validation"))
    s.WriteString("\n")

    if result.IsValid {
        s.WriteString("| " + diagnosePassStyle.Render("[OK] VALID - Archive passed all checks") + strings.Repeat(" ", 18) + "|\n")
        s.WriteString(diagnosePassStyle.Render(" [OK] VALID - Archive passed all checks"))
        s.WriteString("\n")
    } else {
        s.WriteString("| " + diagnoseFailStyle.Render("[FAIL] INVALID - Archive has problems") + strings.Repeat(" ", 19) + "|\n")
        s.WriteString(diagnoseFailStyle.Render(" [FAIL] INVALID - Archive has problems"))
        s.WriteString("\n")
    }

    if result.IsTruncated {
        s.WriteString("| " + diagnoseFailStyle.Render("[!] TRUNCATED - File is incomplete") + strings.Repeat(" ", 22) + "|\n")
        s.WriteString(diagnoseFailStyle.Render(" [!] TRUNCATED - File is incomplete"))
        s.WriteString("\n")
    }

    if result.IsCorrupted {
        s.WriteString("| " + diagnoseFailStyle.Render("[!] CORRUPTED - File structure damaged") + strings.Repeat(" ", 18) + "|\n")
        s.WriteString(diagnoseFailStyle.Render(" [!] CORRUPTED - File structure damaged"))
        s.WriteString("\n")
    }

    s.WriteString("+" + strings.Repeat("-", 60) + "+\n\n")
    s.WriteString("\n")

    // Details Box
    // Details
    if result.Details != nil {
        s.WriteString("+--[ DETAILS ]" + strings.Repeat("-", 46) + "+\n")
        s.WriteString(diagnoseHeaderStyle.Render("[INFO] Details"))
        s.WriteString("\n")

        if result.Details.HasPGDMPSignature {
            s.WriteString("| " + diagnosePassStyle.Render("[+]") + " PostgreSQL custom format (PGDMP)" + strings.Repeat(" ", 20) + "|\n")
            s.WriteString(diagnosePassStyle.Render(" [+]") + " PostgreSQL custom format (PGDMP)\n")
        }

        if result.Details.HasSQLHeader {
            s.WriteString("| " + diagnosePassStyle.Render("[+]") + " PostgreSQL SQL header found" + strings.Repeat(" ", 25) + "|\n")
            s.WriteString(diagnosePassStyle.Render(" [+]") + " PostgreSQL SQL header found\n")
        }

        if result.Details.GzipValid {
            s.WriteString("| " + diagnosePassStyle.Render("[+]") + " Gzip compression valid" + strings.Repeat(" ", 30) + "|\n")
            s.WriteString(diagnosePassStyle.Render(" [+]") + " Gzip compression valid\n")
        }

        if result.Details.PgRestoreListable {
            tableInfo := fmt.Sprintf(" (%d tables)", result.Details.TableCount)
            padding := 36 - len(tableInfo)
            if padding < 0 {
                padding = 0
            }
            s.WriteString("| " + diagnosePassStyle.Render("[+]") + " pg_restore can list contents" + tableInfo + strings.Repeat(" ", padding) + "|\n")
            s.WriteString(diagnosePassStyle.Render(" [+]") + fmt.Sprintf(" pg_restore can list contents (%d tables)\n", result.Details.TableCount))
        }

        if result.Details.CopyBlockCount > 0 {
            blockInfo := fmt.Sprintf("%d COPY blocks found", result.Details.CopyBlockCount)
            padding := 50 - len(blockInfo)
            if padding < 0 {
                padding = 0
            }
            s.WriteString("| [-] " + blockInfo + strings.Repeat(" ", padding) + "|\n")
            s.WriteString(fmt.Sprintf(" [-] %d COPY blocks found\n", result.Details.CopyBlockCount))
        }

        if result.Details.UnterminatedCopy {
            s.WriteString("| " + diagnoseFailStyle.Render("[-]") + " Unterminated COPY: " + truncate(result.Details.LastCopyTable, 30) + strings.Repeat(" ", 5) + "|\n")
            s.WriteString(diagnoseFailStyle.Render(" [-]") + " Unterminated COPY: " + truncate(result.Details.LastCopyTable, 30) + "\n")
        }

        if result.Details.ProperlyTerminated {
            s.WriteString("| " + diagnosePassStyle.Render("[+]") + " All COPY blocks properly terminated" + strings.Repeat(" ", 17) + "|\n")
            s.WriteString(diagnosePassStyle.Render(" [+]") + " All COPY blocks properly terminated\n")
        }

        if result.Details.ExpandedSize > 0 {
            sizeInfo := fmt.Sprintf("Expanded: %s (%.1fx)", formatSize(result.Details.ExpandedSize), result.Details.CompressionRatio)
            padding := 50 - len(sizeInfo)
            if padding < 0 {
                padding = 0
            }
            s.WriteString("| [-] " + sizeInfo + strings.Repeat(" ", padding) + "|\n")
            s.WriteString(fmt.Sprintf(" [-] Expanded: %s (%.1fx)\n", formatSize(result.Details.ExpandedSize), result.Details.CompressionRatio))
        }

        s.WriteString("+" + strings.Repeat("-", 60) + "+\n")
        s.WriteString("\n")
    }

    // Errors Box
    // Errors
    if len(result.Errors) > 0 {
        s.WriteString("\n+--[ ERRORS ]" + strings.Repeat("-", 47) + "+\n")
        s.WriteString(diagnoseFailStyle.Render("[FAIL] Errors"))
        s.WriteString("\n")
        for i, e := range result.Errors {
            if i >= 5 {
                remaining := fmt.Sprintf("... and %d more errors", len(result.Errors)-5)
                padding := 56 - len(remaining)
                s.WriteString("| " + remaining + strings.Repeat(" ", padding) + "|\n")
                s.WriteString(fmt.Sprintf(" ... and %d more errors\n", len(result.Errors)-5))
                break
            }
            errText := truncate(e, 54)
            padding := 56 - len(errText)
            if padding < 0 {
                padding = 0
                s.WriteString(" " + truncate(e, 60) + "\n")
            }
            s.WriteString("| " + errText + strings.Repeat(" ", padding) + "|\n")
        }
        s.WriteString("+" + strings.Repeat("-", 60) + "+\n")
        s.WriteString("\n")
    }

    // Warnings Box
    // Warnings
    if len(result.Warnings) > 0 {
        s.WriteString("\n+--[ WARNINGS ]" + strings.Repeat("-", 45) + "+\n")
        s.WriteString(diagnoseWarnStyle.Render("[WARN] Warnings"))
        s.WriteString("\n")
        for i, w := range result.Warnings {
            if i >= 3 {
                remaining := fmt.Sprintf("... and %d more warnings", len(result.Warnings)-3)
                padding := 56 - len(remaining)
                s.WriteString("| " + remaining + strings.Repeat(" ", padding) + "|\n")
                s.WriteString(fmt.Sprintf(" ... and %d more warnings\n", len(result.Warnings)-3))
                break
            }
            warnText := truncate(w, 54)
            padding := 56 - len(warnText)
            if padding < 0 {
                padding = 0
                s.WriteString(" " + truncate(w, 60) + "\n")
            }
            s.WriteString("| " + warnText + strings.Repeat(" ", padding) + "|\n")
        }
        s.WriteString("+" + strings.Repeat("-", 60) + "+\n")
        s.WriteString("\n")
    }

    // Recommendations Box
    // Recommendations
    if !result.IsValid {
        s.WriteString("\n+--[ RECOMMENDATIONS ]" + strings.Repeat("-", 38) + "+\n")
        s.WriteString(diagnoseInfoStyle.Render("[HINT] Recommendations"))
        s.WriteString("\n")
        if result.IsTruncated {
            s.WriteString("| 1. Re-run backup with current version (v3.42.12+) |\n")
            s.WriteString("| 2. Check disk space on backup server |\n")
            s.WriteString("| 3. Verify network stability for remote backups |\n")
            s.WriteString(" 1. Re-run backup with current version (v3.42+)\n")
            s.WriteString(" 2. Check disk space on backup server\n")
            s.WriteString(" 3. Verify network stability for remote backups\n")
        }
        if result.IsCorrupted {
            s.WriteString("| 1. Verify backup was transferred completely |\n")
            s.WriteString("| 2. Try restoring from a previous backup |\n")
            s.WriteString(" 1. Verify backup was transferred completely\n")
            s.WriteString(" 2. Try restoring from a previous backup\n")
        }
        s.WriteString("+" + strings.Repeat("-", 60) + "+\n")
    }

    return s.String()
@@ -349,10 +328,8 @@ func (m DiagnoseViewModel) renderClusterResults() string {
        }
    }

    s.WriteString(strings.Repeat("-", 60))
    s.WriteString("\n")
    s.WriteString(diagnoseHeaderStyle.Render(fmt.Sprintf("[STATS] CLUSTER SUMMARY: %d databases\n", len(m.results))))
    s.WriteString(strings.Repeat("-", 60))
    s.WriteString(diagnoseHeaderStyle.Render(fmt.Sprintf("[STATS] Cluster Summary: %d databases", len(m.results))))
    s.WriteString("\n\n")

    if invalidCount == 0 {
@@ -364,7 +341,7 @@ func (m DiagnoseViewModel) renderClusterResults() string {
    }

    // List all dumps with status
    s.WriteString(diagnoseHeaderStyle.Render("Database Dumps:"))
    s.WriteString(diagnoseHeaderStyle.Render("[LIST] Database Dumps"))
    s.WriteString("\n")

    // Show visible range based on cursor
@@ -413,9 +390,7 @@ func (m DiagnoseViewModel) renderClusterResults() string {
    if m.cursor < len(m.results) {
        selected := m.results[m.cursor]
        s.WriteString("\n")
        s.WriteString(strings.Repeat("-", 60))
        s.WriteString("\n")
        s.WriteString(diagnoseHeaderStyle.Render("Selected: " + selected.FileName))
        s.WriteString(diagnoseHeaderStyle.Render("[INFO] Selected: " + selected.FileName))
        s.WriteString("\n\n")

        // Show condensed details for selected

@@ -191,7 +191,7 @@ func (m HistoryViewModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
func (m HistoryViewModel) View() string {
    var s strings.Builder

    header := titleStyle.Render("[HISTORY] Operation History")
    header := titleStyle.Render("[STATS] Operation History")
    s.WriteString(fmt.Sprintf("\n%s\n\n", header))

    if len(m.history) == 0 {

@@ -188,6 +188,21 @@ func (m *MenuModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
        }
        return m, nil

    case tea.InterruptMsg:
        // Handle Ctrl+C signal (SIGINT) - Bubbletea v1.3+ sends this
        if m.cancel != nil {
            m.cancel()
        }

        // Clean up any orphaned processes before exit
        m.logger.Info("Cleaning up processes before exit (SIGINT)")
        if err := cleanup.KillOrphanedProcesses(m.logger); err != nil {
            m.logger.Warn("Failed to clean up all processes", "error", err)
        }

        m.quitting = true
        return m, tea.Quit

    case tea.KeyMsg:
        switch msg.String() {
        case "ctrl+c", "q":
@@ -284,9 +299,13 @@ func (m *MenuModel) View() string {

    var s string

    // Product branding header
    brandLine := fmt.Sprintf("dbbackup v%s • Enterprise Database Backup & Recovery", m.config.Version)
    s += "\n" + infoStyle.Render(brandLine) + "\n"

    // Header
    header := titleStyle.Render("[DB] Database Backup Tool - Interactive Menu")
    s += fmt.Sprintf("\n%s\n\n", header)
    header := titleStyle.Render("Interactive Menu")
    s += fmt.Sprintf("%s\n\n", header)

    if len(m.dbTypes) > 0 {
        options := make([]string, len(m.dbTypes))
@@ -334,13 +353,13 @@ func (m *MenuModel) View() string {

// handleSingleBackup opens database selector for single backup
func (m *MenuModel) handleSingleBackup() (tea.Model, tea.Cmd) {
    selector := NewDatabaseSelector(m.config, m.logger, m, m.ctx, "[DB] Single Database Backup", "single")
    selector := NewDatabaseSelector(m.config, m.logger, m, m.ctx, "[SELECT] Single Database Backup", "single")
    return selector, selector.Init()
}

// handleSampleBackup opens database selector for sample backup
func (m *MenuModel) handleSampleBackup() (tea.Model, tea.Cmd) {
    selector := NewDatabaseSelector(m.config, m.logger, m, m.ctx, "[STATS] Sample Database Backup", "sample")
    selector := NewDatabaseSelector(m.config, m.logger, m, m.ctx, "[SELECT] Sample Database Backup", "sample")
    return selector, selector.Init()
}

@@ -356,7 +375,7 @@ func (m *MenuModel) handleClusterBackup() (tea.Model, tea.Cmd) {
        return executor, executor.Init()
    }
    confirm := NewConfirmationModelWithAction(m.config, m.logger, m,
        "[DB] Cluster Backup",
        "[CHECK] Cluster Backup",
        "This will backup ALL databases in the cluster. Continue?",
        func() (tea.Model, tea.Cmd) {
            executor := NewBackupExecution(m.config, m.logger, m, m.ctx, "cluster", "", 0)

@@ -6,6 +7,7 @@ import (
    "os/exec"
    "path/filepath"
    "strings"
    "sync"
    "time"

    tea "github.com/charmbracelet/bubbletea"
@@ -45,6 +46,29 @@ type RestoreExecutionModel struct {
    spinnerFrame int
    spinnerFrames []string

    // Detailed byte progress for schollz-style display
    bytesTotal int64
    bytesDone int64
    description string
    showBytes bool // True when we have real byte progress to show
    speed float64 // Rolling window speed in bytes/sec

    // Database count progress (for cluster restore)
    dbTotal int
    dbDone int

    // Current database being restored (for detailed display)
    currentDB string

    // Timing info for database restore phase (ETA calculation)
    dbPhaseElapsed time.Duration // Elapsed time since restore phase started
    dbAvgPerDB time.Duration // Average time per database restore

    // Overall progress tracking for unified display
    overallPhase int // 1=Extracting, 2=Globals, 3=Databases
    extractionDone bool
    extractionTime time.Duration // How long extraction took (for ETA calc)

    // Results
    done bool
    cancelling bool // True when user has requested cancellation
@@ -101,6 +125,9 @@ type restoreProgressMsg struct {
    phase string
    progress int
    detail string
    bytesTotal int64
    bytesDone int64
    description string
}

type restoreCompleteMsg struct {
@@ -109,6 +136,129 @@ type restoreCompleteMsg struct {
    elapsed time.Duration
}

// sharedProgressState holds progress state that can be safely accessed from callbacks
type sharedProgressState struct {
    mu sync.Mutex
    bytesTotal int64
    bytesDone int64
    description string
    hasUpdate bool

    // Database count progress (for cluster restore)
    dbTotal int
    dbDone int

    // Current database being restored
    currentDB string

    // Timing info for database restore phase
    dbPhaseElapsed time.Duration // Elapsed time since restore phase started
    dbAvgPerDB time.Duration // Average time per database restore
    phase3StartTime time.Time // When phase 3 started (for realtime ETA calculation)

    // Overall phase tracking (1=Extract, 2=Globals, 3=Databases)
    overallPhase int
    extractionDone bool

    // Weighted progress by database sizes (bytes)
    dbBytesTotal int64 // Total bytes across all databases
    dbBytesDone int64 // Bytes completed (sum of finished DB sizes)

    // Rolling window for speed calculation
    speedSamples []restoreSpeedSample
}

type restoreSpeedSample struct {
    timestamp time.Time
    bytes int64
}

// Package-level shared progress state for restore operations
var (
    currentRestoreProgressMu sync.Mutex
    currentRestoreProgressState *sharedProgressState
)

func setCurrentRestoreProgress(state *sharedProgressState) {
    currentRestoreProgressMu.Lock()
    defer currentRestoreProgressMu.Unlock()
    currentRestoreProgressState = state
}

func clearCurrentRestoreProgress() {
    currentRestoreProgressMu.Lock()
    defer currentRestoreProgressMu.Unlock()
    currentRestoreProgressState = nil
}

func getCurrentRestoreProgress() (bytesTotal, bytesDone int64, description string, hasUpdate bool, dbTotal, dbDone int, speed float64, dbPhaseElapsed, dbAvgPerDB time.Duration, currentDB string, overallPhase int, extractionDone bool, dbBytesTotal, dbBytesDone int64, phase3StartTime time.Time) {
    currentRestoreProgressMu.Lock()
    defer currentRestoreProgressMu.Unlock()

    if currentRestoreProgressState == nil {
        return 0, 0, "", false, 0, 0, 0, 0, 0, "", 0, false, 0, 0, time.Time{}
    }

    currentRestoreProgressState.mu.Lock()
    defer currentRestoreProgressState.mu.Unlock()

    // Calculate rolling window speed
    speed = calculateRollingSpeed(currentRestoreProgressState.speedSamples)

    // Calculate realtime phase elapsed if we have a phase 3 start time
    dbPhaseElapsed = currentRestoreProgressState.dbPhaseElapsed
    if !currentRestoreProgressState.phase3StartTime.IsZero() {
        dbPhaseElapsed = time.Since(currentRestoreProgressState.phase3StartTime)
    }

    return currentRestoreProgressState.bytesTotal, currentRestoreProgressState.bytesDone,
        currentRestoreProgressState.description, currentRestoreProgressState.hasUpdate,
        currentRestoreProgressState.dbTotal, currentRestoreProgressState.dbDone, speed,
        dbPhaseElapsed, currentRestoreProgressState.dbAvgPerDB,
        currentRestoreProgressState.currentDB, currentRestoreProgressState.overallPhase,
        currentRestoreProgressState.extractionDone,
        currentRestoreProgressState.dbBytesTotal, currentRestoreProgressState.dbBytesDone,
        currentRestoreProgressState.phase3StartTime
}

// calculateRollingSpeed calculates speed from recent samples (last 5 seconds)
func calculateRollingSpeed(samples []restoreSpeedSample) float64 {
    if len(samples) < 2 {
        return 0
    }

    // Use samples from last 5 seconds for smoothed speed
    now := time.Now()
    cutoff := now.Add(-5 * time.Second)

    var firstInWindow, lastInWindow *restoreSpeedSample
    for i := range samples {
        if samples[i].timestamp.After(cutoff) {
            if firstInWindow == nil {
                firstInWindow = &samples[i]
            }
            lastInWindow = &samples[i]
        }
    }

    // Fall back to first and last if window is empty
    if firstInWindow == nil || lastInWindow == nil || firstInWindow == lastInWindow {
        firstInWindow = &samples[0]
        lastInWindow = &samples[len(samples)-1]
    }

    elapsed := lastInWindow.timestamp.Sub(firstInWindow.timestamp).Seconds()
    if elapsed <= 0 {
        return 0
    }

    bytesTransferred := lastInWindow.bytes - firstInWindow.bytes
    return float64(bytesTransferred) / elapsed
}
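
For reference, the rolling window yields an average speed over roughly the last five seconds of samples; a small worked sketch (timestamps and byte counts are illustrative):

    // Sketch: two samples 2s apart, 20 MiB transferred -> ~10 MiB/s.
    now := time.Now()
    samples := []restoreSpeedSample{
        {timestamp: now.Add(-2 * time.Second), bytes: 10 * 1024 * 1024},
        {timestamp: now, bytes: 30 * 1024 * 1024},
    }
    speed := calculateRollingSpeed(samples) // 20971520 / 2.0 = 10485760 bytes/s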
|
||||
|
||||
// restoreProgressChannel allows sending progress updates from the restore goroutine
|
||||
type restoreProgressChannel chan restoreProgressMsg
|
||||
|
||||
func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string, cleanFirst, createIfMissing bool, restoreType string, cleanClusterFirst bool, existingDBs []string, saveDebugLog bool) tea.Cmd {
|
||||
return func() tea.Msg {
|
||||
// NO TIMEOUT for restore operations - a restore takes as long as it takes
|
||||
@@ -131,7 +281,20 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
defer dbClient.Close()
|
||||
|
||||
// STEP 1: Clean cluster if requested (drop all existing user databases)
|
||||
if restoreType == "restore-cluster" && cleanClusterFirst && len(existingDBs) > 0 {
|
||||
if restoreType == "restore-cluster" && cleanClusterFirst {
|
||||
// Re-detect databases at execution time to get current state
|
||||
// The preview list may be stale or detection may have failed earlier
|
||||
safety := restore.NewSafety(cfg, log)
|
||||
currentDBs, err := safety.ListUserDatabases(ctx)
|
||||
if err != nil {
|
||||
log.Warn("Failed to list databases for cleanup, using preview list", "error", err)
|
||||
currentDBs = existingDBs // Fall back to preview list
|
||||
} else if len(currentDBs) > 0 {
|
||||
log.Info("Re-detected user databases for cleanup", "count", len(currentDBs), "databases", currentDBs)
|
||||
existingDBs = currentDBs // Update with fresh list
|
||||
}
|
||||
|
||||
if len(existingDBs) > 0 {
|
||||
log.Info("Dropping existing user databases before cluster restore", "count", len(existingDBs))
|
||||
|
||||
// Drop databases using command-line psql (no connection required)
|
||||
@@ -151,11 +314,111 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
}
|
||||
|
||||
log.Info("Cluster cleanup completed", "dropped", droppedCount, "total", len(existingDBs))
|
||||
} else {
|
||||
log.Info("No user databases to clean up")
|
||||
}
|
||||
}
|
||||
|
||||
// STEP 2: Create restore engine with silent progress (no stdout interference with TUI)
|
||||
engine := restore.NewSilent(cfg, log, dbClient)
|
||||
|
||||
// Set up progress callback for detailed progress reporting
|
||||
// We use a shared pointer that can be queried by the TUI ticker
|
||||
progressState := &sharedProgressState{
|
||||
speedSamples: make([]restoreSpeedSample, 0, 100),
|
||||
}
|
||||
engine.SetProgressCallback(func(current, total int64, description string) {
|
||||
progressState.mu.Lock()
|
||||
defer progressState.mu.Unlock()
|
||||
progressState.bytesDone = current
|
||||
progressState.bytesTotal = total
|
||||
progressState.description = description
|
||||
progressState.hasUpdate = true
|
||||
progressState.overallPhase = 1
|
||||
progressState.extractionDone = false
|
||||
|
||||
// Check if extraction is complete
|
||||
if current >= total && total > 0 {
|
||||
progressState.extractionDone = true
|
||||
progressState.overallPhase = 2
|
||||
}
|
||||
|
||||
// Add speed sample for rolling window calculation
|
||||
progressState.speedSamples = append(progressState.speedSamples, restoreSpeedSample{
|
||||
timestamp: time.Now(),
|
||||
bytes: current,
|
||||
})
|
||||
// Keep only last 100 samples
|
||||
if len(progressState.speedSamples) > 100 {
|
||||
progressState.speedSamples = progressState.speedSamples[len(progressState.speedSamples)-100:]
|
||||
}
|
||||
})
|
||||
|
||||
// Set up database progress callback for cluster restore
|
||||
engine.SetDatabaseProgressCallback(func(done, total int, dbName string) {
|
||||
progressState.mu.Lock()
|
||||
defer progressState.mu.Unlock()
|
||||
progressState.dbDone = done
|
||||
progressState.dbTotal = total
|
||||
progressState.description = fmt.Sprintf("Restoring %s", dbName)
|
||||
progressState.currentDB = dbName
|
||||
progressState.overallPhase = 3
|
||||
progressState.extractionDone = true
|
||||
progressState.hasUpdate = true
|
||||
// Set phase 3 start time on first callback (for realtime ETA calculation)
|
||||
if progressState.phase3StartTime.IsZero() {
|
||||
progressState.phase3StartTime = time.Now()
|
||||
}
|
||||
// Clear byte progress when switching to db progress
|
||||
progressState.bytesTotal = 0
|
||||
progressState.bytesDone = 0
|
||||
})
|
||||
|
||||
// Set up timing-aware database progress callback for cluster restore ETA
|
||||
engine.SetDatabaseProgressWithTimingCallback(func(done, total int, dbName string, phaseElapsed, avgPerDB time.Duration) {
|
||||
progressState.mu.Lock()
|
||||
defer progressState.mu.Unlock()
|
||||
progressState.dbDone = done
|
||||
progressState.dbTotal = total
|
||||
progressState.description = fmt.Sprintf("Restoring %s", dbName)
|
||||
progressState.currentDB = dbName
|
||||
progressState.overallPhase = 3
|
||||
progressState.extractionDone = true
|
||||
progressState.dbPhaseElapsed = phaseElapsed
|
||||
progressState.dbAvgPerDB = avgPerDB
|
||||
progressState.hasUpdate = true
|
||||
// Set phase 3 start time on first callback (for realtime ETA calculation)
|
||||
if progressState.phase3StartTime.IsZero() {
|
||||
progressState.phase3StartTime = time.Now()
|
||||
}
|
||||
// Clear byte progress when switching to db progress
|
||||
progressState.bytesTotal = 0
|
||||
progressState.bytesDone = 0
|
||||
})
|
||||
|
||||
// Set up weighted (bytes-based) progress callback for accurate cluster restore progress
|
||||
engine.SetDatabaseProgressByBytesCallback(func(bytesDone, bytesTotal int64, dbName string, dbDone, dbTotal int) {
|
||||
progressState.mu.Lock()
|
||||
defer progressState.mu.Unlock()
|
||||
progressState.dbBytesDone = bytesDone
|
||||
progressState.dbBytesTotal = bytesTotal
|
||||
progressState.dbDone = dbDone
|
||||
progressState.dbTotal = dbTotal
|
||||
progressState.currentDB = dbName
|
||||
progressState.overallPhase = 3
|
||||
progressState.extractionDone = true
|
||||
progressState.hasUpdate = true
|
||||
// Set phase 3 start time on first callback (for realtime ETA calculation)
|
||||
if progressState.phase3StartTime.IsZero() {
|
||||
progressState.phase3StartTime = time.Now()
|
||||
}
|
||||
})
|
||||
|
||||
// Store progress state in a package-level variable for the ticker to access
|
||||
// This is a workaround because tea messages can't be sent from callbacks
|
||||
setCurrentRestoreProgress(progressState)
|
||||
defer clearCurrentRestoreProgress()
|
||||
|
||||
// Enable debug logging if requested
|
||||
if saveDebugLog {
|
||||
// Generate debug log path using configured WorkDir
|
||||
@@ -165,9 +428,6 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
log.Info("Debug logging enabled", "path", debugLogPath)
|
||||
}
|
||||
|
||||
// Set up progress callback (but it won't work in goroutine - progress is already sent via logs)
|
||||
// The TUI will just use spinner animation to show activity
|
||||
|
||||
// STEP 3: Execute restore based on type
|
||||
var restoreErr error
|
||||
if restoreType == "restore-cluster" {
|
||||
@@ -206,7 +466,58 @@ func (m RestoreExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
m.spinnerFrame = (m.spinnerFrame + 1) % len(m.spinnerFrames)
|
||||
m.elapsed = time.Since(m.startTime)
|
||||
|
||||
// Update status based on elapsed time to show progress
|
||||
// Poll shared progress state for real-time updates
|
||||
// Note: dbPhaseElapsed is now calculated in realtime inside getCurrentRestoreProgress()
|
||||
bytesTotal, bytesDone, description, hasUpdate, dbTotal, dbDone, speed, dbPhaseElapsed, dbAvgPerDB, currentDB, overallPhase, extractionDone, dbBytesTotal, dbBytesDone, _ := getCurrentRestoreProgress()
|
||||
if hasUpdate && bytesTotal > 0 && !extractionDone {
|
||||
// Phase 1: Extraction
|
||||
m.bytesTotal = bytesTotal
|
||||
m.bytesDone = bytesDone
|
||||
m.description = description
|
||||
m.showBytes = true
|
||||
m.speed = speed
|
||||
m.overallPhase = 1
|
||||
m.extractionDone = false
|
||||
|
||||
// Update status to reflect actual progress
|
||||
m.status = description
|
||||
m.phase = "Phase 1/3: Extracting Archive"
|
||||
m.progress = int((bytesDone * 100) / bytesTotal)
|
||||
} else if hasUpdate && dbTotal > 0 {
|
||||
// Phase 3: Database restores
|
||||
m.dbTotal = dbTotal
|
||||
m.dbDone = dbDone
|
||||
m.dbPhaseElapsed = dbPhaseElapsed
|
||||
m.dbAvgPerDB = dbAvgPerDB
|
||||
m.currentDB = currentDB
|
||||
m.overallPhase = overallPhase
|
||||
m.extractionDone = extractionDone
|
||||
m.showBytes = false
|
||||
|
||||
if dbDone < dbTotal {
|
||||
m.status = fmt.Sprintf("Restoring: %s", currentDB)
|
||||
} else {
|
||||
m.status = "Finalizing..."
|
||||
}
|
||||
|
||||
// Use weighted progress by bytes if available, otherwise use count
|
||||
if dbBytesTotal > 0 {
|
||||
weightedPercent := int((dbBytesDone * 100) / dbBytesTotal)
|
||||
m.phase = fmt.Sprintf("Phase 3/3: Databases (%d/%d) - %.1f%% by size", dbDone, dbTotal, float64(dbBytesDone*100)/float64(dbBytesTotal))
|
||||
m.progress = weightedPercent
|
||||
} else {
|
||||
m.phase = fmt.Sprintf("Phase 3/3: Databases (%d/%d)", dbDone, dbTotal)
|
||||
m.progress = int((dbDone * 100) / dbTotal)
|
||||
}
} else if hasUpdate && extractionDone && dbTotal == 0 {
// Phase 2: Globals restore (brief phase between extraction and databases)
m.overallPhase = 2
m.extractionDone = true
m.showBytes = false
m.status = "Restoring global objects (roles, tablespaces)..."
m.phase = "Phase 2/3: Restoring Globals"
} else {
// Fallback: Update status based on elapsed time to show progress
// This provides visual feedback even though we don't have real-time progress
elapsedSec := int(m.elapsed.Seconds())

@@ -241,6 +552,7 @@ func (m RestoreExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.phase = "Restore"
}
}
}

return m, restoreTickCmd()
}
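// Aside (illustrative sketch, not part of this diff): restoreTickCmd is
// presumably a small tea.Tick wrapper that re-arms the poll loop above.
// The interval and message type are assumptions:
func restoreTickCmd() tea.Cmd {
	return tea.Tick(250*time.Millisecond, func(t time.Time) tea.Msg {
		return restoreTickMsg(t) // assumed: the tick message handled above
	})
}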
@@ -250,6 +562,15 @@ func (m RestoreExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.status = msg.status
m.phase = msg.phase
m.progress = msg.progress

// Update byte-level progress if available
if msg.bytesTotal > 0 {
m.bytesTotal = msg.bytesTotal
m.bytesDone = msg.bytesDone
m.description = msg.description
m.showBytes = true
}

if msg.detail != "" {
m.details = append(m.details, msg.detail)
// Keep only last 5 details
@@ -279,6 +600,21 @@ func (m RestoreExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
}
return m, nil

case tea.InterruptMsg:
// Handle Ctrl+C signal (SIGINT) - Bubbletea v1.3+ sends this instead of KeyMsg for ctrl+c
if !m.done && !m.cancelling {
m.cancelling = true
m.status = "[STOP] Cancelling restore... (please wait)"
m.phase = "Cancelling"
if m.cancel != nil {
m.cancel()
}
return m, nil
} else if m.done {
return m.parent, tea.Quit
}
return m, nil
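// Aside (illustrative, not part of this diff): m.cancel is assumed to be a
// context.CancelFunc. Children started via exec.CommandContext die with the
// context, which is what prevents orphaned pg_restore/gzip processes:
//
//	ctx, cancel := context.WithCancel(context.Background())
//	cmd := exec.CommandContext(ctx, "pg_restore", "--version") // illustrative args
//	_ = cmd.Start()
//	cancel()       // context cancelled: the running child is killed
//	_ = cmd.Wait() // reap the child so nothing is left behind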

case tea.KeyMsg:
switch msg.String() {
case "ctrl+c", "esc":
@@ -321,9 +657,9 @@ func (m RestoreExecutionModel) View() string {
s.Grow(512) // Pre-allocate estimated capacity for better performance

// Title
title := "[RESTORE] Restoring Database"
title := "[EXEC] Restoring Database"
if m.restoreType == "restore-cluster" {
title = "[RESTORE] Restoring Cluster"
title = "[EXEC] Restoring Cluster"
}
s.WriteString(titleStyle.Render(title))
s.WriteString("\n\n")
@@ -336,39 +672,166 @@ func (m RestoreExecutionModel) View() string {
s.WriteString("\n")

if m.done {
// Show result
// Show result with comprehensive summary
if m.err != nil {
s.WriteString(errorStyle.Render("[FAIL] Restore Failed"))
s.WriteString(errorStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
s.WriteString("\n")
s.WriteString(errorStyle.Render("║ [FAIL] RESTORE FAILED ║"))
s.WriteString("\n")
s.WriteString(errorStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
s.WriteString("\n\n")
s.WriteString(errorStyle.Render(fmt.Sprintf("Error: %v", m.err)))

// Parse and display error in a clean, structured format
errStr := m.err.Error()

// Extract key parts from the error message
errDisplay := formatRestoreError(errStr)
s.WriteString(errDisplay)
s.WriteString("\n")
} else {
s.WriteString(successStyle.Render("[OK] Restore Completed Successfully"))
s.WriteString(successStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
s.WriteString("\n")
s.WriteString(successStyle.Render("║ [OK] RESTORE COMPLETED SUCCESSFULLY ║"))
s.WriteString("\n")
s.WriteString(successStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
s.WriteString("\n\n")
s.WriteString(successStyle.Render(m.result))

// Summary section
s.WriteString(infoStyle.Render(" ─── Summary ───────────────────────────────────────────────"))
s.WriteString("\n\n")

// Archive info
s.WriteString(fmt.Sprintf(" Archive: %s\n", m.archive.Name))
if m.archive.Size > 0 {
s.WriteString(fmt.Sprintf(" Archive Size: %s\n", FormatBytes(m.archive.Size)))
}

// Restore type specific info
if m.restoreType == "restore-cluster" {
s.WriteString(" Type: Cluster Restore\n")
if m.dbTotal > 0 {
s.WriteString(fmt.Sprintf(" Databases: %d restored\n", m.dbTotal))
}
if m.cleanClusterFirst && len(m.existingDBs) > 0 {
s.WriteString(fmt.Sprintf(" Cleaned: %d existing database(s) dropped\n", len(m.existingDBs)))
}
} else {
s.WriteString(" Type: Single Database Restore\n")
s.WriteString(fmt.Sprintf(" Target DB: %s\n", m.targetDB))
}

s.WriteString("\n")
}

s.WriteString(fmt.Sprintf("\nElapsed Time: %s\n", formatDuration(m.elapsed)))
// Timing section
s.WriteString(infoStyle.Render(" ─── Timing ────────────────────────────────────────────────"))
s.WriteString("\n\n")
s.WriteString(fmt.Sprintf(" Total Time: %s\n", formatDuration(m.elapsed)))

// Calculate and show throughput if we have size info
if m.archive.Size > 0 && m.elapsed.Seconds() > 0 {
throughput := float64(m.archive.Size) / m.elapsed.Seconds()
s.WriteString(fmt.Sprintf(" Throughput: %s/s (average)\n", FormatBytes(int64(throughput))))
}

if m.dbTotal > 0 && m.err == nil {
avgPerDB := m.elapsed / time.Duration(m.dbTotal)
s.WriteString(fmt.Sprintf(" Avg per DB: %s\n", formatDuration(avgPerDB)))
}

s.WriteString("\n")
s.WriteString(infoStyle.Render("[KEYS] Press Enter to continue"))
s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
s.WriteString("\n\n")
s.WriteString(infoStyle.Render(" [KEYS] Press Enter to continue"))
} else {
// Show progress
// Show unified progress for cluster restore
if m.restoreType == "restore-cluster" {
// Calculate overall progress across all phases
// Phase 1: Extraction (0-60%)
// Phase 2: Globals (60-65%)
// Phase 3: Databases (65-100%)
overallProgress := 0
phaseLabel := "Starting..."

if m.showBytes && m.bytesTotal > 0 {
// Phase 1: Extraction - contributes 0-60%
extractPct := int((m.bytesDone * 100) / m.bytesTotal)
overallProgress = (extractPct * 60) / 100
phaseLabel = "Phase 1/3: Extracting Archive"
} else if m.extractionDone && m.dbTotal == 0 {
// Phase 2: Globals restore
overallProgress = 62
phaseLabel = "Phase 2/3: Restoring Globals"
} else if m.dbTotal > 0 {
// Phase 3: Database restores - contributes 65-100%
dbPct := int((int64(m.dbDone) * 100) / int64(m.dbTotal))
overallProgress = 65 + (dbPct * 35 / 100)
phaseLabel = fmt.Sprintf("Phase 3/3: Databases (%d/%d)", m.dbDone, m.dbTotal)
}
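// Worked example (aside, not part of this diff): extraction 50% done gives
// overallProgress = (50 * 60) / 100 = 30; the globals phase pins the bar at
// 62; with 4 of 10 databases restored, dbPct = 40 and
// overallProgress = 65 + (40 * 35) / 100 = 79.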

// Header with phase and overall progress
s.WriteString(infoStyle.Render(" ─── Cluster Restore Progress ─────────────────────────────"))
s.WriteString("\n\n")
s.WriteString(fmt.Sprintf(" %s\n\n", phaseLabel))

// Overall progress bar
s.WriteString(" Overall: ")
s.WriteString(renderProgressBar(overallProgress))
s.WriteString(fmt.Sprintf(" %d%%\n", overallProgress))

// Phase-specific details
if m.showBytes && m.bytesTotal > 0 {
// Show extraction details
s.WriteString("\n")
s.WriteString(fmt.Sprintf(" %s\n", m.status))
s.WriteString("\n")
s.WriteString(renderDetailedProgressBarWithSpeed(m.bytesDone, m.bytesTotal, m.speed))
s.WriteString("\n")
} else if m.dbTotal > 0 {
// Show current database being restored
s.WriteString("\n")
spinner := m.spinnerFrames[m.spinnerFrame]
if m.currentDB != "" && m.dbDone < m.dbTotal {
s.WriteString(fmt.Sprintf(" Current: %s %s\n", spinner, m.currentDB))
} else if m.dbDone >= m.dbTotal {
s.WriteString(fmt.Sprintf(" %s Finalizing...\n", spinner))
}
s.WriteString("\n")

// Database progress bar with timing
s.WriteString(renderDatabaseProgressBarWithTiming(m.dbDone, m.dbTotal, m.dbPhaseElapsed, m.dbAvgPerDB))
s.WriteString("\n")
} else {
// Intermediate phase (globals)
spinner := m.spinnerFrames[m.spinnerFrame]
s.WriteString(fmt.Sprintf("\n %s %s\n\n", spinner, m.status))
}

s.WriteString("\n")
s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
s.WriteString("\n\n")
} else {
// Single database restore - simpler display
s.WriteString(fmt.Sprintf("Phase: %s\n", m.phase))

// Show status with rotating spinner (unified indicator for all operations)
// Show detailed progress bar when we have byte-level information
if m.showBytes && m.bytesTotal > 0 {
s.WriteString(fmt.Sprintf("Status: %s\n", m.status))
s.WriteString("\n")
s.WriteString(renderDetailedProgressBarWithSpeed(m.bytesDone, m.bytesTotal, m.speed))
s.WriteString("\n\n")
} else {
spinner := m.spinnerFrames[m.spinnerFrame]
s.WriteString(fmt.Sprintf("Status: %s %s\n", spinner, m.status))
s.WriteString("\n")

// Only show progress bar for single database restore
// Cluster restore uses spinner only (consistent with CLI behavior)
if m.restoreType == "restore-single" {
// Fallback to simple progress bar
progressBar := renderProgressBar(m.progress)
s.WriteString(progressBar)
s.WriteString(fmt.Sprintf(" %d%%\n", m.progress))
s.WriteString("\n")
}
}

// Elapsed time
s.WriteString(fmt.Sprintf("Elapsed: %s\n", formatDuration(m.elapsed)))
@@ -390,6 +853,141 @@ func renderProgressBar(percent int) string {
return successStyle.Render(bar) + infoStyle.Render(empty)
}

// renderDetailedProgressBar renders a schollz-style progress bar with bytes, speed, and ETA
// Uses elapsed time for speed calculation (fallback)
func renderDetailedProgressBar(done, total int64, elapsed time.Duration) string {
speed := 0.0
if elapsed.Seconds() > 0 {
speed = float64(done) / elapsed.Seconds()
}
return renderDetailedProgressBarWithSpeed(done, total, speed)
}
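// Worked example (aside, not part of this diff): done = 100 MiB after
// elapsed = 10s yields speed = 10 MiB/s. This fallback averages over the
// whole run, so it reacts more slowly than the rolling-window speed passed
// to renderDetailedProgressBarWithSpeed below.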

// renderDetailedProgressBarWithSpeed renders a schollz-style progress bar with pre-calculated rolling speed
func renderDetailedProgressBarWithSpeed(done, total int64, speed float64) string {
var s strings.Builder

// Calculate percentage
percent := 0
if total > 0 {
percent = int((done * 100) / total)
if percent > 100 {
percent = 100
}
}

// Render progress bar
width := 30
filled := (percent * width) / 100
barFilled := strings.Repeat("█", filled)
barEmpty := strings.Repeat("░", width-filled)

s.WriteString(successStyle.Render("["))
s.WriteString(successStyle.Render(barFilled))
s.WriteString(infoStyle.Render(barEmpty))
s.WriteString(successStyle.Render("]"))

// Percentage
s.WriteString(fmt.Sprintf(" %3d%%", percent))

// Bytes progress
s.WriteString(fmt.Sprintf(" %s / %s", FormatBytes(done), FormatBytes(total)))

// Speed display (using rolling window speed)
if speed > 0 {
s.WriteString(fmt.Sprintf(" %s/s", FormatBytes(int64(speed))))

// ETA calculation based on rolling speed
if done < total {
remaining := total - done
etaSeconds := float64(remaining) / speed
eta := time.Duration(etaSeconds) * time.Second
s.WriteString(fmt.Sprintf(" ETA: %s", FormatDurationShort(eta)))
}
}

return s.String()
}
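// Aside (illustrative, not part of this diff): for done = 512 MiB of
// total = 1 GiB at speed = 64 MiB/s, this renders roughly
//
//	[███████████████░░░░░░░░░░░░░░░]  50% 512.0 MB / 1.0 GB 64.0 MB/s ETA: 8s
//
// (exact byte strings depend on FormatBytes; ETA = 512 MiB / 64 MiB/s = 8s).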

// renderDatabaseProgressBar renders a progress bar for database count (cluster restore)
func renderDatabaseProgressBar(done, total int) string {
var s strings.Builder

// Calculate percentage
percent := 0
if total > 0 {
percent = (done * 100) / total
if percent > 100 {
percent = 100
}
}

// Render progress bar
width := 30
filled := (percent * width) / 100
barFilled := strings.Repeat("█", filled)
barEmpty := strings.Repeat("░", width-filled)

s.WriteString(successStyle.Render("["))
s.WriteString(successStyle.Render(barFilled))
s.WriteString(infoStyle.Render(barEmpty))
s.WriteString(successStyle.Render("]"))

// Count and percentage
s.WriteString(fmt.Sprintf(" %3d%% %d / %d databases", percent, done, total))

return s.String()
}

// renderDatabaseProgressBarWithTiming renders a progress bar for database count with timing and ETA
func renderDatabaseProgressBarWithTiming(done, total int, phaseElapsed, avgPerDB time.Duration) string {
var s strings.Builder

// Calculate percentage
percent := 0
if total > 0 {
percent = (done * 100) / total
if percent > 100 {
percent = 100
}
}

// Render progress bar
width := 30
filled := (percent * width) / 100
barFilled := strings.Repeat("█", filled)
barEmpty := strings.Repeat("░", width-filled)

s.WriteString(successStyle.Render("["))
s.WriteString(successStyle.Render(barFilled))
s.WriteString(infoStyle.Render(barEmpty))
s.WriteString(successStyle.Render("]"))

// Count and percentage
s.WriteString(fmt.Sprintf(" %3d%% %d / %d databases", percent, done, total))

// Timing and ETA
if phaseElapsed > 0 {
s.WriteString(fmt.Sprintf(" [%s", FormatDurationShort(phaseElapsed)))

// Calculate ETA based on average time per database
if avgPerDB > 0 && done < total {
remainingDBs := total - done
eta := time.Duration(remainingDBs) * avgPerDB
s.WriteString(fmt.Sprintf(" / ETA: %s", FormatDurationShort(eta)))
} else if done > 0 && done < total {
// Fallback: estimate ETA from overall elapsed time
avgElapsed := phaseElapsed / time.Duration(done)
remainingDBs := total - done
eta := time.Duration(remainingDBs) * avgElapsed
s.WriteString(fmt.Sprintf(" / ETA: ~%s", FormatDurationShort(eta)))
}
s.WriteString("]")
}

return s.String()
}
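// Worked example (aside, not part of this diff): done = 3 of total = 10 with
// phaseElapsed = 6m and avgPerDB = 2m renders
// "... 30% 3 / 10 databases [6m / ETA: 14m]" (7 remaining * 2m each).
// Without avgPerDB, the fallback derives 6m / 3 = 2m per database and
// prefixes the estimate with "~".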

// formatDuration formats duration in human readable format
func formatDuration(d time.Duration) string {
if d < time.Minute {
@@ -434,3 +1032,188 @@ func dropDatabaseCLI(ctx context.Context, cfg *config.Config, dbName string) err

return nil
}

// formatRestoreError formats a restore error message for clean TUI display
func formatRestoreError(errStr string) string {
var s strings.Builder
maxLineWidth := 60

// Common patterns to extract
patterns := []struct {
key string
pattern string
}{
{"Error Type", "ERROR:"},
{"Hint", "HINT:"},
{"Last Error", "last error:"},
{"Total Errors", "total errors:"},
}

// First, try to extract a clean error summary
errLines := strings.Split(errStr, "\n")

// Find the main error message (first line or first ERROR:)
mainError := ""
hint := ""
totalErrors := ""
dbsFailed := []string{}

for _, line := range errLines {
line = strings.TrimSpace(line)
if line == "" {
continue
}

// Extract ERROR messages
if strings.Contains(line, "ERROR:") {
if mainError == "" {
// Get just the ERROR part
idx := strings.Index(line, "ERROR:")
if idx >= 0 {
mainError = strings.TrimSpace(line[idx:])
// Truncate if too long
if len(mainError) > maxLineWidth {
mainError = mainError[:maxLineWidth-3] + "..."
}
}
}
}

// Extract HINT
if strings.Contains(line, "HINT:") {
idx := strings.Index(line, "HINT:")
if idx >= 0 {
hint = strings.TrimSpace(line[idx+5:])
if len(hint) > maxLineWidth {
hint = hint[:maxLineWidth-3] + "..."
}
}
}

// Extract total errors count
if strings.Contains(line, "total errors:") {
idx := strings.Index(line, "total errors:")
if idx >= 0 {
totalErrors = strings.TrimSpace(line[idx+13:])
// Just extract the number
parts := strings.Fields(totalErrors)
if len(parts) > 0 {
totalErrors = parts[0]
}
}
}

// Extract failed database names (for cluster restore)
if strings.Contains(line, ": restore failed:") {
parts := strings.SplitN(line, ":", 2)
if len(parts) > 0 {
dbName := strings.TrimSpace(parts[0])
if dbName != "" && !strings.HasPrefix(dbName, "Error") {
dbsFailed = append(dbsFailed, dbName)
}
}
}
}

// If no structured error found, use the first line
if mainError == "" {
firstLine := errStr
if idx := strings.Index(errStr, "\n"); idx > 0 {
firstLine = errStr[:idx]
}
if len(firstLine) > maxLineWidth*2 {
firstLine = firstLine[:maxLineWidth*2-3] + "..."
}
mainError = firstLine
}

// Build structured error display
s.WriteString(infoStyle.Render(" ─── Error Details ─────────────────────────────────────────"))
s.WriteString("\n\n")

// Error type detection
errorType := "critical"
if strings.Contains(errStr, "out of shared memory") || strings.Contains(errStr, "max_locks_per_transaction") {
errorType = "critical"
} else if strings.Contains(errStr, "connection") {
errorType = "connection"
} else if strings.Contains(errStr, "permission") || strings.Contains(errStr, "access") {
errorType = "permission"
}

s.WriteString(fmt.Sprintf(" Type: %s\n", errorType))
s.WriteString(fmt.Sprintf(" Message: %s\n", mainError))

if hint != "" {
s.WriteString(fmt.Sprintf(" Hint: %s\n", hint))
}

if totalErrors != "" {
s.WriteString(fmt.Sprintf(" Total Errors: %s\n", totalErrors))
}

// Show failed databases (max 5)
if len(dbsFailed) > 0 {
s.WriteString("\n")
s.WriteString(" Failed Databases:\n")
for i, db := range dbsFailed {
if i >= 5 {
s.WriteString(fmt.Sprintf(" ... and %d more\n", len(dbsFailed)-5))
break
}
s.WriteString(fmt.Sprintf(" • %s\n", db))
}
}

s.WriteString("\n")
s.WriteString(infoStyle.Render(" ─── Diagnosis ─────────────────────────────────────────────"))
s.WriteString("\n\n")

// Provide specific recommendations based on error
if strings.Contains(errStr, "out of shared memory") || strings.Contains(errStr, "max_locks_per_transaction") {
s.WriteString(errorStyle.Render(" • PostgreSQL lock table exhausted\n"))
s.WriteString("\n")
s.WriteString(infoStyle.Render(" ─── [HINT] Recommendations ────────────────────────────────"))
s.WriteString("\n\n")
s.WriteString(" Lock capacity = max_locks_per_transaction\n")
s.WriteString(" × (max_connections + max_prepared_transactions)\n\n")
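// Worked example (aside, not part of this diff): with PostgreSQL defaults
// max_locks_per_transaction = 64, max_connections = 100 and
// max_prepared_transactions = 0, lock capacity = 64 * (100 + 0) = 6400
// shared lock slots; raising max_locks_per_transaction to 4096 lifts that
// to 409600 without touching max_connections.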
s.WriteString(" If you reduced VM size or max_connections, you need higher\n")
s.WriteString(" max_locks_per_transaction to compensate.\n\n")
s.WriteString(successStyle.Render(" FIX OPTIONS:\n"))
s.WriteString(" 1. Enable 'Large DB Mode' in Settings\n")
s.WriteString(" (press 'l' to toggle, reduces parallelism, increases locks)\n\n")
s.WriteString(" 2. Increase PostgreSQL locks:\n")
s.WriteString(" ALTER SYSTEM SET max_locks_per_transaction = 4096;\n")
s.WriteString(" Then RESTART PostgreSQL.\n\n")
s.WriteString(" 3. Reduce parallel jobs:\n")
s.WriteString(" Set Cluster Parallelism = 1 in Settings\n")
} else if strings.Contains(errStr, "connection") || strings.Contains(errStr, "refused") {
s.WriteString(" • Database connection failed\n\n")
s.WriteString(infoStyle.Render(" ─── [HINT] Recommendations ────────────────────────────────"))
s.WriteString("\n\n")
s.WriteString(" 1. Check database is running\n")
s.WriteString(" 2. Verify host, port, and credentials in Settings\n")
s.WriteString(" 3. Check firewall/network connectivity\n")
} else if strings.Contains(errStr, "permission") || strings.Contains(errStr, "denied") {
s.WriteString(" • Permission denied\n\n")
s.WriteString(infoStyle.Render(" ─── [HINT] Recommendations ────────────────────────────────"))
s.WriteString("\n\n")
s.WriteString(" 1. Verify database user has sufficient privileges\n")
s.WriteString(" 2. Grant CREATE/DROP DATABASE permissions if restoring cluster\n")
s.WriteString(" 3. Check file system permissions on backup directory\n")
} else {
s.WriteString(" See error message above for details.\n\n")
s.WriteString(infoStyle.Render(" ─── [HINT] General Recommendations ────────────────────────"))
s.WriteString("\n\n")
s.WriteString(" 1. Check the full error log for details\n")
s.WriteString(" 2. Try restoring with 'conservative' profile (press 'c')\n")
s.WriteString(" 3. For complex databases, enable 'Large DB Mode' (press 'l')\n")
}

s.WriteString("\n")

// patterns is defined above for documentation but never read; discard it explicitly so the compiler does not flag an unused variable
_ = patterns

return s.String()
}
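// Aside (hypothetical test, not part of this diff): formatRestoreError is a
// pure string-to-string function, so its classification is easy to pin down
// in isolation (assumes the standard testing and strings packages):
func TestFormatRestoreError_LockExhaustion(t *testing.T) {
	out := formatRestoreError("ERROR: out of shared memory\nHINT: You might need to increase max_locks_per_transaction.")
	if !strings.Contains(out, "Lock capacity") {
		t.Errorf("expected lock-exhaustion recommendations, got:\n%s", out)
	}
}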

@@ -55,6 +55,7 @@ type RestorePreviewModel struct {
cleanClusterFirst bool // For cluster restore: drop all user databases first
existingDBCount int // Number of existing user databases
existingDBs []string // List of existing user databases
existingDBError string // Error message if database listing failed
safetyChecks []SafetyCheck
checking bool
canProceed bool
@@ -102,6 +103,7 @@ type safetyCheckCompleteMsg struct {
canProceed bool
existingDBCount int
existingDBs []string
existingDBError string
}

func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string) tea.Cmd {
@@ -221,10 +223,12 @@ func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo,
check = SafetyCheck{Name: "Existing databases", Status: "checking", Critical: false}

// Get list of existing user databases (exclude templates and system DBs)
var existingDBError string
dbList, err := safety.ListUserDatabases(ctx)
if err != nil {
check.Status = "warning"
check.Message = fmt.Sprintf("Cannot list databases: %v", err)
existingDBError = err.Error()
} else {
existingDBCount = len(dbList)
existingDBs = dbList
@@ -238,6 +242,14 @@ func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo,
}
}
checks = append(checks, check)

return safetyCheckCompleteMsg{
checks: checks,
canProceed: canProceed,
existingDBCount: existingDBCount,
existingDBs: existingDBs,
existingDBError: existingDBError,
}
}

return safetyCheckCompleteMsg{
@@ -257,6 +269,7 @@ func (m RestorePreviewModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
m.canProceed = msg.canProceed
m.existingDBCount = msg.existingDBCount
m.existingDBs = msg.existingDBs
m.existingDBError = msg.existingDBError
// Auto-forward in auto-confirm mode
if m.config.TUIAutoConfirm {
return m.parent, tea.Quit
@@ -275,10 +288,17 @@ func (m RestorePreviewModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {

case "c":
if m.mode == "restore-cluster" {
// Toggle cluster cleanup
// Toggle cluster cleanup - databases will be re-detected at execution time
m.cleanClusterFirst = !m.cleanClusterFirst
if m.cleanClusterFirst {
if m.existingDBError != "" {
// Detection failed in preview - will re-detect at execution
m.message = checkWarningStyle.Render("[WARN] Will clean existing databases before restore (detection pending)")
} else if m.existingDBCount > 0 {
m.message = checkWarningStyle.Render(fmt.Sprintf("[WARN] Will drop %d existing database(s) before restore", m.existingDBCount))
} else {
m.message = infoStyle.Render("[INFO] Cleanup enabled (no databases currently detected)")
}
} else {
m.message = "Clean cluster first: disabled"
}
@@ -339,9 +359,9 @@ func (m RestorePreviewModel) View() string {
var s strings.Builder

// Title
title := "Restore Preview"
title := "[CHECK] Restore Preview"
if m.mode == "restore-cluster" {
title = "Cluster Restore Preview"
title = "[CHECK] Cluster Restore Preview"
}
s.WriteString(titleStyle.Render(title))
s.WriteString("\n\n")
@@ -382,7 +402,27 @@ func (m RestorePreviewModel) View() string {
s.WriteString("\n")
s.WriteString(fmt.Sprintf(" Host: %s:%d\n", m.config.Host, m.config.Port))

if m.existingDBCount > 0 {
// Show Resource Profile and CPU Workload settings
profile := m.config.GetCurrentProfile()
if profile != nil {
s.WriteString(fmt.Sprintf(" Resource Profile: %s (Parallel:%d, Jobs:%d)\n",
profile.Name, profile.ClusterParallelism, profile.Jobs))
} else {
s.WriteString(fmt.Sprintf(" Resource Profile: %s\n", m.config.ResourceProfile))
}
// Show Large DB Mode status
if m.config.LargeDBMode {
s.WriteString(" Large DB Mode: ON (reduced parallelism, high locks)\n")
}
s.WriteString(fmt.Sprintf(" CPU Workload: %s\n", m.config.CPUWorkloadType))
s.WriteString(fmt.Sprintf(" Cluster Parallelism: %d databases\n", m.config.ClusterParallelism))

if m.existingDBError != "" {
// Show warning when database listing failed - but still allow cleanup toggle
s.WriteString(checkWarningStyle.Render(" Existing Databases: Detection failed\n"))
s.WriteString(infoStyle.Render(fmt.Sprintf(" (%s)\n", m.existingDBError)))
s.WriteString(infoStyle.Render(" (Will re-detect at restore time)\n"))
} else if m.existingDBCount > 0 {
s.WriteString(fmt.Sprintf(" Existing Databases: %d found\n", m.existingDBCount))

// Show first few database names
@@ -395,16 +435,19 @@ func (m RestorePreviewModel) View() string {
}
s.WriteString(fmt.Sprintf(" - %s\n", db))
}
} else {
s.WriteString(" Existing Databases: None (clean slate)\n")
}

// Always show cleanup toggle for cluster restore
cleanIcon := "[N]"
cleanStyle := infoStyle
if m.cleanClusterFirst {
cleanIcon = "[Y]"
cleanIcon := "[Y]"
cleanStyle = checkWarningStyle
}
s.WriteString(cleanStyle.Render(fmt.Sprintf(" Clean All First: %s %v (press 'c' to toggle)\n", cleanIcon, m.cleanClusterFirst)))
s.WriteString(cleanStyle.Render(fmt.Sprintf(" Clean All First: %s enabled (press 'c' to toggle)\n", cleanIcon)))
} else {
s.WriteString(" Existing Databases: None (clean slate)\n")
s.WriteString(cleanStyle.Render(fmt.Sprintf(" Clean All First: %s disabled (press 'c' to toggle)\n", cleanIcon)))
}
s.WriteString("\n")
}
@@ -453,10 +496,18 @@ func (m RestorePreviewModel) View() string {
s.WriteString(infoStyle.Render(" All existing data in target database will be dropped!"))
s.WriteString("\n\n")
}
if m.cleanClusterFirst && m.existingDBCount > 0 {
if m.cleanClusterFirst {
s.WriteString(checkWarningStyle.Render("[DANGER] WARNING: Cluster cleanup enabled"))
s.WriteString("\n")
if m.existingDBError != "" {
s.WriteString(checkWarningStyle.Render(" Existing databases will be DROPPED before restore!"))
s.WriteString("\n")
s.WriteString(infoStyle.Render(" (Database count will be detected at restore time)"))
} else if m.existingDBCount > 0 {
s.WriteString(checkWarningStyle.Render(fmt.Sprintf(" %d existing database(s) will be DROPPED before restore!", m.existingDBCount)))
} else {
s.WriteString(infoStyle.Render(" No databases currently detected - cleanup will verify at restore time"))
}
s.WriteString("\n")
s.WriteString(infoStyle.Render(" This ensures a clean disaster recovery scenario"))
s.WriteString("\n\n")

@@ -10,6 +10,7 @@ import (
"github.com/charmbracelet/lipgloss"

"dbbackup/internal/config"
"dbbackup/internal/cpu"
"dbbackup/internal/logger"
)

@@ -101,6 +102,65 @@ func NewSettingsModel(cfg *config.Config, log logger.Logger, parent tea.Model) S
Type: "selector",
Description: "CPU workload profile (press Enter to cycle: Balanced → CPU-Intensive → I/O-Intensive)",
},
{
Key: "resource_profile",
DisplayName: "Resource Profile",
Value: func(c *config.Config) string {
profile := c.GetCurrentProfile()
if profile != nil {
return fmt.Sprintf("%s (P:%d J:%d)", profile.Name, profile.ClusterParallelism, profile.Jobs)
}
return c.ResourceProfile
},
Update: func(c *config.Config, v string) error {
profiles := []string{"conservative", "balanced", "performance", "max-performance"}
currentIdx := 0
for i, p := range profiles {
if c.ResourceProfile == p {
currentIdx = i
break
}
}
nextIdx := (currentIdx + 1) % len(profiles)
return c.ApplyResourceProfile(profiles[nextIdx])
},
Type: "selector",
Description: "Resource profile for VM capacity. Toggle 'l' for Large DB Mode on any profile.",
},
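// Aside (not part of this diff): the Update callback above cycles
// conservative → balanced → performance → max-performance and wraps around;
// e.g. from "performance" (index 2), nextIdx = (2 + 1) % 4 = 3, so pressing
// Enter moves the profile to "max-performance".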
{
Key: "large_db_mode",
DisplayName: "Large DB Mode",
Value: func(c *config.Config) string {
if c.LargeDBMode {
return "ON (↓parallelism, ↑locks)"
}
return "OFF"
},
Update: func(c *config.Config, v string) error {
c.LargeDBMode = !c.LargeDBMode
return nil
},
Type: "selector",
Description: "Enable for databases with many tables/LOBs. Reduces parallelism, increases max_locks_per_transaction.",
},
{
Key: "cluster_parallelism",
DisplayName: "Cluster Parallelism",
Value: func(c *config.Config) string { return fmt.Sprintf("%d", c.ClusterParallelism) },
Update: func(c *config.Config, v string) error {
val, err := strconv.Atoi(v)
if err != nil {
return fmt.Errorf("cluster parallelism must be a number")
}
if val < 1 {
return fmt.Errorf("cluster parallelism must be at least 1")
}
c.ClusterParallelism = val
return nil
},
Type: "int",
Description: "Concurrent databases during cluster backup/restore (1=sequential, safer for large DBs)",
},
{
Key: "backup_dir",
DisplayName: "Backup Directory",
@@ -528,12 +588,70 @@ func (m SettingsModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {

case "s":
return m.saveSettings()

case "l":
// Quick shortcut: Toggle Large DB Mode
return m.toggleLargeDBMode()

case "c":
// Quick shortcut: Apply "conservative" profile for constrained VMs
return m.applyConservativeProfile()

case "p":
// Show profile recommendation
return m.showProfileRecommendation()
}
}

return m, nil
}

// toggleLargeDBMode toggles the Large DB Mode flag
func (m SettingsModel) toggleLargeDBMode() (tea.Model, tea.Cmd) {
m.config.LargeDBMode = !m.config.LargeDBMode
if m.config.LargeDBMode {
profile := m.config.GetCurrentProfile()
m.message = successStyle.Render(fmt.Sprintf(
"[ON] Large DB Mode enabled: %s → Parallel=%d, Jobs=%d, MaxLocks=%d",
profile.Name, profile.ClusterParallelism, profile.Jobs, profile.MaxLocksPerTxn))
} else {
profile := m.config.GetCurrentProfile()
m.message = successStyle.Render(fmt.Sprintf(
"[OFF] Large DB Mode disabled: %s → Parallel=%d, Jobs=%d",
profile.Name, profile.ClusterParallelism, profile.Jobs))
}
return m, nil
}

// applyConservativeProfile applies the conservative profile for constrained VMs
func (m SettingsModel) applyConservativeProfile() (tea.Model, tea.Cmd) {
if err := m.config.ApplyResourceProfile("conservative"); err != nil {
m.message = errorStyle.Render(fmt.Sprintf("[FAIL] %s", err.Error()))
return m, nil
}
m.message = successStyle.Render("[OK] Applied 'conservative' profile: Cluster=1, Jobs=1. Safe for small VMs with limited memory.")
return m, nil
}

// showProfileRecommendation displays the recommended profile based on system resources
func (m SettingsModel) showProfileRecommendation() (tea.Model, tea.Cmd) {
profileName, reason := m.config.GetResourceProfileRecommendation(false)

var largeDBHint string
if m.config.LargeDBMode {
largeDBHint = "Large DB Mode: ON"
} else {
largeDBHint = "Large DB Mode: OFF (press 'l' to enable)"
}

m.message = infoStyle.Render(fmt.Sprintf(
"[RECOMMEND] Profile: %s | %s\n"+
" → %s\n"+
" Press 'l' to toggle Large DB Mode, 'c' for conservative",
profileName, largeDBHint, reason))
return m, nil
}

// handleEditingInput handles input when editing a setting
func (m SettingsModel) handleEditingInput(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
switch msg.String() {
@@ -688,7 +806,7 @@ func (m SettingsModel) View() string {
var b strings.Builder

// Header
header := titleStyle.Render("[CFG] Configuration Settings")
header := titleStyle.Render("[CONFIG] Configuration Settings")
b.WriteString(fmt.Sprintf("\n%s\n\n", header))

// Settings list
@@ -747,7 +865,32 @@ func (m SettingsModel) View() string {
// Current configuration summary
if !m.editing {
b.WriteString("\n")
b.WriteString(infoStyle.Render("[LOG] Current Configuration:"))
b.WriteString(infoStyle.Render("[INFO] System Resources & Configuration"))
b.WriteString("\n")

// System resources
var sysInfo []string
if m.config.CPUInfo != nil {
sysInfo = append(sysInfo, fmt.Sprintf("CPU: %d cores (physical), %d logical",
m.config.CPUInfo.PhysicalCores, m.config.CPUInfo.LogicalCores))
}
if m.config.MemoryInfo != nil {
sysInfo = append(sysInfo, fmt.Sprintf("Memory: %dGB total, %dGB available",
m.config.MemoryInfo.TotalGB, m.config.MemoryInfo.AvailableGB))
}

// Recommended profile
recommendedProfile, reason := m.config.GetResourceProfileRecommendation(false)
sysInfo = append(sysInfo, fmt.Sprintf("Recommended Profile: %s", recommendedProfile))
sysInfo = append(sysInfo, fmt.Sprintf(" → %s", reason))

for _, line := range sysInfo {
b.WriteString(detailStyle.Render(fmt.Sprintf(" %s", line)))
b.WriteString("\n")
}

b.WriteString("\n")
b.WriteString(infoStyle.Render("[CONFIG] Current Settings"))
b.WriteString("\n")

summary := []string{
@@ -755,7 +898,17 @@ func (m SettingsModel) View() string {
fmt.Sprintf("Database: %s@%s:%d", m.config.User, m.config.Host, m.config.Port),
fmt.Sprintf("Backup Dir: %s", m.config.BackupDir),
fmt.Sprintf("Compression: Level %d", m.config.CompressionLevel),
fmt.Sprintf("Jobs: %d parallel, %d dump", m.config.Jobs, m.config.DumpJobs),
fmt.Sprintf("Profile: %s | Cluster: %d parallel | Jobs: %d",
m.config.ResourceProfile, m.config.ClusterParallelism, m.config.Jobs),
}

// Show profile warnings if applicable
profile := m.config.GetCurrentProfile()
if profile != nil {
isValid, warnings := cpu.ValidateProfileForSystem(profile, m.config.CPUInfo, m.config.MemoryInfo)
if !isValid && len(warnings) > 0 {
summary = append(summary, fmt.Sprintf("⚠️ Warning: %s", warnings[0]))
}
}

if m.config.CloudEnabled {
@@ -782,9 +935,9 @@ func (m SettingsModel) View() string {
} else {
// Show different help based on current selection
if m.cursor >= 0 && m.cursor < len(m.settings) && m.settings[m.cursor].Type == "path" {
footer = infoStyle.Render("\n[KEYS] Up/Down navigate | Enter edit | Tab browse directories | 's' save | 'r' reset | 'q' menu")
footer = infoStyle.Render("\n[KEYS] ↑↓ navigate | Enter edit | Tab dirs | 'l' toggle LargeDB | 'c' conservative | 'p' recommend | 's' save | 'q' menu")
} else {
footer = infoStyle.Render("\n[KEYS] Up/Down navigate | Enter edit | 's' save | 'r' reset | 'q' menu | Tab=dirs on path fields only")
footer = infoStyle.Render("\n[KEYS] ↑↓ navigate | Enter edit | 'l' toggle LargeDB mode | 'c' conservative | 'p' recommend | 's' save | 'r' reset | 'q' menu")
}
}
}

|
||||
s.WriteString(errorStyle.Render(fmt.Sprintf("[FAIL] Error: %v\n", m.err)))
|
||||
s.WriteString("\n")
|
||||
} else {
|
||||
s.WriteString("Connection Status:\n")
|
||||
s.WriteString("[CONN] Connection Status\n")
|
||||
if m.connected {
|
||||
s.WriteString(successStyle.Render(" [+] Connected\n"))
|
||||
} else {
|
||||
@@ -181,11 +181,12 @@ func (m StatusViewModel) View() string {
|
||||
}
|
||||
s.WriteString("\n")
|
||||
|
||||
s.WriteString(fmt.Sprintf("Database Type: %s (%s)\n", m.config.DisplayDatabaseType(), m.config.DatabaseType))
|
||||
s.WriteString(fmt.Sprintf("Host: %s:%d\n", m.config.Host, m.config.Port))
|
||||
s.WriteString(fmt.Sprintf("User: %s\n", m.config.User))
|
||||
s.WriteString(fmt.Sprintf("Backup Directory: %s\n", m.config.BackupDir))
|
||||
s.WriteString(fmt.Sprintf("Version: %s\n\n", m.dbVersion))
|
||||
s.WriteString("[INFO] Server Details\n")
|
||||
s.WriteString(fmt.Sprintf(" Database Type: %s (%s)\n", m.config.DisplayDatabaseType(), m.config.DatabaseType))
|
||||
s.WriteString(fmt.Sprintf(" Host: %s:%d\n", m.config.Host, m.config.Port))
|
||||
s.WriteString(fmt.Sprintf(" User: %s\n", m.config.User))
|
||||
s.WriteString(fmt.Sprintf(" Backup Directory: %s\n", m.config.BackupDir))
|
||||
s.WriteString(fmt.Sprintf(" Version: %s\n\n", m.dbVersion))
|
||||
|
||||
if m.dbCount > 0 {
|
||||
s.WriteString(fmt.Sprintf("Databases Found: %s\n", successStyle.Render(fmt.Sprintf("%d", m.dbCount))))
|
||||
|
||||
@@ -120,12 +120,36 @@ var ShortcutStyle = lipgloss.NewStyle().
|
||||
// =============================================================================
|
||||
// HELPER PREFIXES (no emoticons)
|
||||
// =============================================================================
|
||||
// Convention for TUI titles/headers:
|
||||
// [CHECK] - Verification/diagnosis screens
|
||||
// [STATS] - Statistics/status screens
|
||||
// [SELECT] - Selection/browser screens
|
||||
// [EXEC] - Execution/running screens
|
||||
// [CONFIG] - Configuration/settings screens
|
||||
//
|
||||
// Convention for status messages:
|
||||
// [OK] - Success
|
||||
// [FAIL] - Error/failure
|
||||
// [WAIT] - In progress
|
||||
// [WARN] - Warning
|
||||
// [INFO] - Information
|
||||
|
||||
const (
|
||||
// Title prefixes (for view headers)
|
||||
PrefixCheck = "[CHECK]"
|
||||
PrefixStats = "[STATS]"
|
||||
PrefixSelect = "[SELECT]"
|
||||
PrefixExec = "[EXEC]"
|
||||
PrefixConfig = "[CONFIG]"
|
||||
|
||||
// Status prefixes
|
||||
PrefixOK = "[OK]"
|
||||
PrefixFail = "[FAIL]"
|
||||
PrefixWarn = "[!]"
|
||||
PrefixInfo = "[i]"
|
||||
PrefixWait = "[WAIT]"
|
||||
PrefixWarn = "[WARN]"
|
||||
PrefixInfo = "[INFO]"
|
||||
|
||||
// List item prefixes
|
||||
PrefixPlus = "[+]"
|
||||
PrefixMinus = "[-]"
|
||||
PrefixArrow = ">"
|
||||
|
||||