Compare commits

42 Commits

| SHA1 |
|---|
| 4486a5d617 |
| 75dee1fff5 |
| 91d494537d |
| 8ffc1ba23c |
| 8e8045d8c0 |
| 0e94dcf384 |
| 33adfbdb38 |
| af34eaa073 |
| babce7cc83 |
| ae8c8fde3d |
| 346cb7fb61 |
| 18549584b1 |
| b1d1d57b61 |
| d0e1da1bea |
| 343a8b782d |
| bc5f7c07f4 |
| 821521470f |
| 147b9fc234 |
| 6f3e81a5a6 |
| bf1722c316 |
| a759f4d3db |
| 7cf1d6f85b |
| b305d1342e |
| 5456da7183 |
| f9ff45cf2a |
| 72c06ba5c2 |
| a0a401cab1 |
| 59a717abe7 |
| 490a12f858 |
| ea4337e298 |
| bbd4f0ceac |
| f6f8b04785 |
| 670c9af2e7 |
| e2cf9adc62 |
| 29e089fe3b |
| 9396c8e605 |
| e363e1937f |
| df1ab2f55b |
| 0e050b2def |
| 62d58c77af |
| c5be9bcd2b |
| b120f1507e |
CHANGELOG.md (132 changes)
@@ -5,6 +5,138 @@ All notable changes to dbbackup will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added - Single Database Extraction from Cluster Backups (CLI + TUI)
- **Extract and restore individual databases from cluster backups** - selective restore without full cluster restoration
- **CLI Commands**:
  - **List databases**: `dbbackup restore cluster backup.tar.gz --list-databases`
    - Shows all databases in cluster backup with sizes
    - Fast scan without full extraction
  - **Extract single database**: `dbbackup restore cluster backup.tar.gz --database myapp --output-dir /tmp/extract`
    - Extracts only the specified database dump
    - No restore, just file extraction
  - **Restore single database from cluster**: `dbbackup restore cluster backup.tar.gz --database myapp --confirm`
    - Extracts and restores only one database
    - Much faster than full cluster restore when you only need one database
  - **Rename on restore**: `dbbackup restore cluster backup.tar.gz --database myapp --target myapp_test --confirm`
    - Restore with different database name (useful for testing)
  - **Extract multiple databases**: `dbbackup restore cluster backup.tar.gz --databases "app1,app2,app3" --output-dir /tmp/extract`
    - Comma-separated list of databases to extract
- **TUI Support**:
  - Press **'s'** on any cluster backup in archive browser to select individual databases
  - New **ClusterDatabaseSelector** view shows all databases with sizes
  - Navigate with arrow keys, select with Enter
  - Automatic handling when cluster backup selected in single restore mode
  - Full restore preview and confirmation workflow
- **Benefits**:
  - Faster restores (extract only what you need)
  - Less disk space usage during restore
  - Easy database migration/copying
  - Better testing workflow
  - Selective disaster recovery
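For reference, the extraction entry points behind these commands can also be driven from Go. A condensed sketch, assuming the handler context from `cmd/restore.go` further down in this diff (where `ctx`, `log`, `archivePath`, and the progress package are already set up) and the signatures shown there:

```go
// List the databases contained in a cluster archive, then extract one dump
// without restoring it. Error handling mirrors the CLI handlers shown below.
databases, err := restore.ListDatabasesInCluster(ctx, archivePath, log)
if err != nil {
    return fmt.Errorf("failed to list databases: %w", err)
}
for _, db := range databases {
    fmt.Printf("  - %s (%d bytes)\n", db.Name, db.Size)
}

// Extract a single database dump to outputDir; nothing touches the server.
prog := progress.NewIndicator(true, "dots")
dumpPath, err := restore.ExtractDatabaseFromCluster(ctx, archivePath, "myapp", outputDir, log, prog)
if err != nil {
    return fmt.Errorf("extraction failed: %w", err)
}
fmt.Println("extracted:", dumpPath)
```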

### Performance - Cluster Restore Optimization
- **Eliminated duplicate archive extraction in cluster restore** - saves 30-50% time on large restores
  - Previously: Archive was extracted twice (once in preflight validation, once in actual restore)
  - Now: Archive extracted once and reused for both validation and restore
- **Time savings**:
  - 50 GB cluster: ~3-6 minutes faster
  - 10 GB cluster: ~1-2 minutes faster
  - Small clusters (<5 GB): ~30 seconds faster
- Optimization automatically enabled when `--diagnose` flag is used
- New `ValidateAndExtractCluster()` performs combined validation + extraction
- `RestoreCluster()` accepts optional `preExtractedPath` parameter to reuse extracted directory
- Disk space checks intelligently skipped when using pre-extracted directory
- Maintains backward compatibility - works with and without pre-extraction
- Log output shows optimization: `"Using pre-extracted cluster directory ... optimization: skipping duplicate extraction"`
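In code, the reuse boils down to extracting once and passing the resulting directory through; a condensed sketch of the flow now used by the cluster restore command (the full call sites appear in the `cmd/restore.go` portion of this diff):

```go
// Extract once, then reuse the directory for both diagnosis and the restore itself.
extractedDir, err := safety.ValidateAndExtractCluster(ctx, archivePath)
if err != nil {
    return fmt.Errorf("failed to extract cluster archive: %w", err)
}
defer os.RemoveAll(extractedDir) // clean up the shared extraction at the end

// ... optional --diagnose pass reads dumps directly from extractedDir ...

// Passing the pre-extracted directory lets RestoreCluster skip a second extraction.
if err := engine.RestoreCluster(ctx, archivePath, extractedDir); err != nil {
    return fmt.Errorf("cluster restore failed: %w", err)
}
```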

### Improved - Archive Validation
- **Enhanced tar.gz validation with stream-based checks**
  - Fast header-only validation (validates gzip + tar structure without full extraction)
  - Checks gzip magic bytes (0x1f 0x8b) and tar header signature
  - Reduces preflight validation time from minutes to seconds on large archives
  - Falls back to full extraction only when necessary (with `--diagnose`)
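The header-only check needs nothing beyond the Go standard library. A minimal, self-contained sketch of the idea (an illustration of the technique, not the project's exact implementation):

```go
package main

import (
    "compress/gzip"
    "errors"
    "fmt"
    "io"
    "os"
)

// quickCheckTarGz validates only the gzip magic bytes and the first tar header,
// so large archives do not have to be fully extracted during preflight checks.
func quickCheckTarGz(path string) error {
    f, err := os.Open(path)
    if err != nil {
        return err
    }
    defer f.Close()

    // gzip streams start with the magic bytes 0x1f 0x8b.
    magic := make([]byte, 2)
    if _, err := io.ReadFull(f, magic); err != nil {
        return err
    }
    if magic[0] != 0x1f || magic[1] != 0x8b {
        return errors.New("not a gzip file")
    }

    // Rewind and read the first 512-byte tar block through the gzip reader;
    // the "ustar" signature sits at offset 257 of a POSIX/GNU tar header.
    if _, err := f.Seek(0, io.SeekStart); err != nil {
        return err
    }
    gz, err := gzip.NewReader(f)
    if err != nil {
        return err
    }
    defer gz.Close()
    header := make([]byte, 512)
    if _, err := io.ReadFull(gz, header); err != nil {
        return err
    }
    if string(header[257:262]) != "ustar" {
        return errors.New("gzip stream does not contain a tar archive")
    }
    return nil
}

func main() {
    if err := quickCheckTarGz(os.Args[1]); err != nil {
        fmt.Println("validation failed:", err)
        os.Exit(1)
    }
    fmt.Println("archive header looks valid")
}
```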

## [3.42.74] - 2026-01-20 "Resource Profile System + Critical Ctrl+C Fix"

### Critical Bug Fix
- **Fixed Ctrl+C not working in TUI backup/restore** - Context cancellation was broken in TUI mode
  - `executeBackupWithTUIProgress()` and `executeRestoreWithTUIProgress()` created new contexts with `WithCancel(parentCtx)`
  - When user pressed Ctrl+C, `model.cancel()` was called on parent context but execution had separate context
  - Fixed by using parent context directly instead of creating new one
  - Ctrl+C/ESC/q now properly propagate cancellation to running operations
  - Users can now interrupt long-running TUI operations

### Added - Resource Profile System
- **`--profile` flag for restore operations** with three presets (plus an easter egg):
  - **Conservative** (`--profile=conservative`): Single-threaded (`--parallel=1`), minimal memory usage
    - Best for resource-constrained servers, shared hosting, or when "out of shared memory" errors occur
    - Automatically enables `LargeDBMode` for better resource management
  - **Balanced** (default): Auto-detect resources, moderate parallelism
    - Good default for most scenarios
  - **Aggressive** (`--profile=aggressive`): Maximum parallelism, all available resources
    - Best for dedicated database servers with ample resources
  - **Potato** (`--profile=potato`): Easter egg 🥔, same as conservative
- **Profile system applies to both CLI and TUI**:
  - CLI: `dbbackup restore cluster backup.tar.gz --profile=conservative --confirm`
  - TUI: Automatically uses conservative profile for safer interactive operation
- **User overrides supported**: `--jobs` and `--parallel-dbs` flags override profile settings
- **New `internal/config/profile.go`** module:
  - `GetRestoreProfile(name)` - Returns profile settings
  - `ApplyProfile(cfg, profile, jobs, parallelDBs)` - Applies profile with overrides
  - `GetProfileDescription(name)` - Human-readable descriptions
  - `ListProfiles()` - All available profiles
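The CLI commands apply a profile roughly like this (condensed from the `runFullClusterRestore` changes shown later in this diff); explicit `--jobs` and `--parallel-dbs` values are passed through so they win over the profile:

```go
// Apply the selected resource profile; unknown names fall back to "balanced".
if err := config.ApplyProfile(cfg, restoreProfile, restoreJobs, restoreParallelDBs); err != nil {
    log.Warn("Invalid profile, using balanced", "error", err)
    restoreProfile = "balanced"
    _ = config.ApplyProfile(cfg, restoreProfile, restoreJobs, restoreParallelDBs)
}
```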

### Added - PostgreSQL Diagnostic Tools
- **`diagnose_postgres_memory.sh`** - Comprehensive memory and resource analysis script:
  - System memory overview with usage percentages and warnings
  - Top 15 memory consuming processes
  - PostgreSQL-specific memory configuration analysis
  - Current locks and connections monitoring
  - Shared memory segments inspection
  - Disk space and swap usage checks
  - Identifies other resource consumers (Nessus, Elastic Agent, monitoring tools)
  - Smart recommendations based on findings
  - Detects temp file usage (indicator of low work_mem)
- **`fix_postgres_locks.sh`** - PostgreSQL lock configuration helper:
  - Automatically increases `max_locks_per_transaction` to 4096
  - Shows current configuration before applying changes
  - Calculates total lock capacity
  - Provides restart commands for different PostgreSQL setups
  - References diagnostic tool for comprehensive analysis

### Added - Documentation
- **`RESTORE_PROFILES.md`** - Complete profile guide with real-world scenarios:
  - Profile comparison table
  - When to use each profile
  - Override examples
  - Troubleshooting guide for "out of shared memory" errors
  - Integration with diagnostic tools
- **`email_infra_team.txt`** - Admin communication template (German):
  - Analysis results template
  - Problem identification section
  - Three solution variants (temporary, permanent, workaround)
  - Includes diagnostic tool references

### Changed - TUI Improvements
- **TUI mode defaults to conservative profile** for safer operation
  - Interactive users benefit from stability over speed
  - Prevents resource exhaustion on shared systems
  - Can be overridden with environment variable: `export RESOURCE_PROFILE=balanced`

### Fixed
- Context cancellation in TUI backup operations (critical)
- Context cancellation in TUI restore operations (critical)
- Better error diagnostics for "out of shared memory" errors
- Improved resource detection and management

### Technical Details
- Profile system respects explicit user flags (`--jobs`, `--parallel-dbs`)
- Conservative profile sets `cfg.LargeDBMode = true` automatically
- TUI profile selection logged when `Debug` mode enabled
- All profiles support both single and cluster restore operations

## [3.42.50] - 2026-01-16 "Ctrl+C Signal Handling Fix"

### Fixed - Proper Ctrl+C/SIGINT Handling in TUI

README.md (57 changes)
@@ -56,7 +56,7 @@ Download from [releases](https://git.uuxo.net/UUXO/dbbackup/releases):

```bash
# Linux x86_64
wget https://git.uuxo.net/UUXO/dbbackup/releases/download/v3.42.35/dbbackup-linux-amd64
wget https://git.uuxo.net/UUXO/dbbackup/releases/download/v3.42.74/dbbackup-linux-amd64
chmod +x dbbackup-linux-amd64
sudo mv dbbackup-linux-amd64 /usr/local/bin/dbbackup
```
@@ -194,21 +194,59 @@ r: Restore | v: Verify | i: Info | d: Diagnose | D: Delete | R: Refresh | Esc: B
```
Configuration Settings

[SYSTEM] Detected Resources
  CPU: 8 physical cores, 16 logical cores
  Memory: 32GB total, 28GB available
  Recommended Profile: balanced
  → 8 cores and 32GB RAM supports moderate parallelism

[CONFIG] Current Settings
  Target DB: PostgreSQL (postgres)
  Database: postgres@localhost:5432
  Backup Dir: /var/backups/postgres
  Compression: Level 6
  Profile: balanced | Cluster: 2 parallel | Jobs: 4

> Database Type: postgres
  CPU Workload Type: balanced
  Backup Directory: /root/db_backups
  Work Directory: /tmp
  Resource Profile: balanced (P:2 J:4)
  Cluster Parallelism: 2
  Backup Directory: /var/backups/postgres
  Work Directory: (system temp)
  Compression Level: 6
  Parallel Jobs: 16
  Dump Jobs: 8
  Parallel Jobs: 4
  Dump Jobs: 4
  Database Host: localhost
  Database Port: 5432
  Database User: root
  Database User: postgres
  SSL Mode: prefer

s: Save | r: Reset | q: Menu
[KEYS] ↑↓ navigate | Enter edit | 'l' toggle LargeDB | 'c' conservative | 'p' recommend | 's' save | 'q' menu
```

**Resource Profiles for Large Databases:**

When restoring large databases on VMs with limited resources, use the resource profile settings to prevent "out of shared memory" errors:

| Profile | Cluster Parallel | Jobs | Best For |
|---------|------------------|------|----------|
| conservative | 1 | 1 | Small VMs (<16GB RAM) |
| balanced | 2 | 2-4 | Medium VMs (16-32GB RAM) |
| performance | 4 | 4-8 | Large servers (32GB+ RAM) |
| max-performance | 8 | 8-16 | High-end servers (64GB+) |

**Large DB Mode:** Toggle with `l` key. Reduces parallelism by 50% and sets max_locks_per_transaction=8192 for complex databases with many tables/LOBs.

**Quick shortcuts:** Press `l` to toggle Large DB Mode, `c` for conservative, `p` to show recommendation.

**Troubleshooting Tools:**

For PostgreSQL restore issues ("out of shared memory" errors), diagnostic scripts are available:
- **diagnose_postgres_memory.sh** - Comprehensive system memory, PostgreSQL configuration, and resource analysis
- **fix_postgres_locks.sh** - Automatically increase max_locks_per_transaction to 4096

See [RESTORE_PROFILES.md](RESTORE_PROFILES.md) for detailed troubleshooting guidance.

**Database Status:**
```
Database Status & Health Check
@@ -248,6 +286,9 @@ dbbackup restore single backup.dump --target myapp_db --create --confirm
# Restore cluster
dbbackup restore cluster cluster_backup.tar.gz --confirm

# Restore with resource profile (for resource-constrained servers)
dbbackup restore cluster backup.tar.gz --profile=conservative --confirm

# Restore with debug logging (saves detailed error report on failure)
dbbackup restore cluster backup.tar.gz --save-debug-log /tmp/restore-debug.json --confirm

@@ -303,6 +344,7 @@ dbbackup backup single mydb --dry-run
| `--backup-dir` | Backup directory | ~/db_backups |
| `--compression` | Compression level (0-9) | 6 |
| `--jobs` | Parallel jobs | 8 |
| `--profile` | Resource profile (conservative/balanced/aggressive) | balanced |
| `--cloud` | Cloud storage URI | - |
| `--encrypt` | Enable encryption | false |
| `--dry-run, -n` | Run preflight checks only | false |
@@ -858,6 +900,7 @@ Workload types:

## Documentation

- [RESTORE_PROFILES.md](RESTORE_PROFILES.md) - Restore resource profiles & troubleshooting
- [SYSTEMD.md](SYSTEMD.md) - Systemd installation & scheduling
- [DOCKER.md](DOCKER.md) - Docker deployment
- [CLOUD.md](CLOUD.md) - Cloud storage configuration

RESTORE_PROFILES.md (new file, 195 lines)
@@ -0,0 +1,195 @@
# Restore Profiles

## Overview

The `--profile` flag allows you to optimize restore operations based on your server's resources and current workload. This is particularly useful when dealing with "out of shared memory" errors or resource-constrained environments.

## Available Profiles

### Conservative Profile (`--profile=conservative`)
**Best for:** Resource-constrained servers, production systems with other running services, or when dealing with "out of shared memory" errors.

**Settings:**
- Single-threaded restore (`--parallel=1`)
- Single-threaded decompression (`--jobs=1`)
- Memory-conservative mode enabled
- Minimal memory footprint

**When to use:**
- Server RAM usage > 70%
- Other critical services running (web servers, monitoring agents)
- "out of shared memory" errors during restore
- Small VMs or shared hosting environments
- Disk I/O is the bottleneck

**Example:**
```bash
dbbackup restore cluster backup.tar.gz --profile=conservative --confirm
```

### Balanced Profile (`--profile=balanced`) - DEFAULT
**Best for:** Most scenarios, general-purpose servers with adequate resources.

**Settings:**
- Auto-detect parallelism based on CPU/RAM
- Moderate resource usage
- Good balance between speed and stability

**When to use:**
- Default choice for most restores
- Dedicated database server with moderate load
- Unknown or variable server conditions

**Example:**
```bash
dbbackup restore cluster backup.tar.gz --confirm
# or explicitly:
dbbackup restore cluster backup.tar.gz --profile=balanced --confirm
```

### Aggressive Profile (`--profile=aggressive`)
**Best for:** Dedicated database servers with ample resources, maintenance windows, performance-critical restores.

**Settings:**
- Maximum parallelism (auto-detect based on CPU cores)
- Maximum resource utilization
- Fastest restore speed

**When to use:**
- Dedicated database server (no other services)
- Server RAM usage < 50%
- Time-critical restores (RTO minimization)
- Maintenance windows with service downtime
- Testing/development environments

**Example:**
```bash
dbbackup restore cluster backup.tar.gz --profile=aggressive --confirm
```

### Potato Profile (`--profile=potato`) 🥔
**Easter egg:** Same as conservative, for servers running on a potato.

## Profile Comparison

| Setting | Conservative | Balanced | Aggressive |
|---------|-------------|----------|-----------|
| Parallel DBs | 1 (sequential) | Auto (2-4) | Auto (all CPUs) |
| Jobs (decompression) | 1 | Auto (2-4) | Auto (all CPUs) |
| Memory Usage | Minimal | Moderate | Maximum |
| Speed | Slowest | Medium | Fastest |
| Stability | Most stable | Stable | Requires resources |

## Overriding Profile Settings

You can override specific profile settings:

```bash
# Use conservative profile but allow 2 parallel jobs for decompression
dbbackup restore cluster backup.tar.gz \
  --profile=conservative \
  --jobs=2 \
  --confirm

# Use aggressive profile but limit to 2 parallel databases
dbbackup restore cluster backup.tar.gz \
  --profile=aggressive \
  --parallel-dbs=2 \
  --confirm
```

## Real-World Scenarios

### Scenario 1: "Out of Shared Memory" Error
**Problem:** PostgreSQL restore fails with `ERROR: out of shared memory`

**Solution:**
```bash
# Step 1: Use conservative profile
dbbackup restore cluster backup.tar.gz --profile=conservative --confirm

# Step 2: If still failing, temporarily stop monitoring agents
sudo systemctl stop nessus-agent elastic-agent
dbbackup restore cluster backup.tar.gz --profile=conservative --confirm
sudo systemctl start nessus-agent elastic-agent

# Step 3: Ask infrastructure team to increase work_mem (see email_infra_team.txt)
```

### Scenario 2: Fast Disaster Recovery
**Goal:** Restore as quickly as possible during maintenance window

**Solution:**
```bash
# Stop all non-essential services first
sudo systemctl stop nginx php-fpm
dbbackup restore cluster backup.tar.gz --profile=aggressive --confirm
sudo systemctl start nginx php-fpm
```

### Scenario 3: Shared Server with Multiple Services
**Environment:** Web server + database + monitoring all on same VM

**Solution:**
```bash
# Always use conservative to avoid impacting other services
dbbackup restore cluster backup.tar.gz --profile=conservative --confirm
```

### Scenario 4: Unknown Server Conditions
**Situation:** Restoring to a new server, unsure of resources

**Solution:**
```bash
# Step 1: Run diagnostics first
./diagnose_postgres_memory.sh > diagnosis.log

# Step 2: Choose profile based on memory usage:
# - If memory > 80%: use conservative
# - If memory 50-80%: use balanced (default)
# - If memory < 50%: use aggressive

# Step 3: Start with balanced and adjust if needed
dbbackup restore cluster backup.tar.gz --confirm
```

## Troubleshooting

### Profile Selection Guide

**Use Conservative when:**
- ✅ Memory usage > 70%
- ✅ Other services running
- ✅ Getting "out of shared memory" errors
- ✅ Restore keeps failing
- ✅ Small VM (< 4 GB RAM)
- ✅ High swap usage

**Use Balanced when:**
- ✅ Normal operation
- ✅ Moderate server load
- ✅ Unsure what to use
- ✅ Medium VM (4-16 GB RAM)

**Use Aggressive when:**
- ✅ Dedicated database server
- ✅ Memory usage < 50%
- ✅ No other critical services
- ✅ Need fastest possible restore
- ✅ Large VM (> 16 GB RAM)
- ✅ Maintenance window

## Environment Variables

You can set a default profile:

```bash
export RESOURCE_PROFILE=conservative
dbbackup restore cluster backup.tar.gz --confirm
```
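If tooling around dbbackup needs to replicate the lookup, one plausible resolution order is explicit flag, then the `RESOURCE_PROFILE` environment variable, then the built-in default; a small illustrative Go sketch (hypothetical helper, not the shipped code):

```go
// resolveProfile picks the effective profile name: a non-default flag value wins,
// then the RESOURCE_PROFILE environment variable, then the "balanced" default.
// This mirrors how the TUI only overrides the default when no explicit profile is set.
func resolveProfile(flagValue string) string {
    if flagValue != "" && flagValue != "balanced" {
        return flagValue
    }
    if env := os.Getenv("RESOURCE_PROFILE"); env != "" {
        return env
    }
    return "balanced"
}
```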

## See Also

- [diagnose_postgres_memory.sh](diagnose_postgres_memory.sh) - Analyze system resources before restore
- [fix_postgres_locks.sh](fix_postgres_locks.sh) - Fix PostgreSQL lock exhaustion
- [email_infra_team.txt](email_infra_team.txt) - Template email for infrastructure team
RESTORE_PROGRESS_PROPOSAL.md (new file, 171 lines)
@@ -0,0 +1,171 @@
# Restore Progress Bar Enhancement Proposal

## Problem
During Phase 2 cluster restore, the progress bar is not real-time because:
- `pg_restore` subprocess blocks until completion
- Progress updates only happen **before** each database restore starts
- No feedback during actual restore execution (which can take hours)
- Users see frozen progress bar during large database restores

## Root Cause
In `internal/restore/engine.go`:
- `executeRestoreCommand()` blocks on `cmd.Wait()`
- Progress is only reported at goroutine entry (line ~1315)
- No streaming progress during pg_restore execution

## Proposed Solutions

### Option 1: Parse pg_restore stderr for progress (RECOMMENDED)
**Pros:**
- Real-time feedback during restore
- Works with existing pg_restore
- No external tools needed

**Implementation:**
```go
// In executeRestoreCommand, modify stderr reader:
go func() {
    scanner := bufio.NewScanner(stderr)
    for scanner.Scan() {
        line := scanner.Text()

        // Parse pg_restore progress lines
        // Format: "pg_restore: processing item 1234 TABLE public users"
        if strings.Contains(line, "processing item") {
            e.reportItemProgress(line) // Update progress bar
        }

        // Capture errors
        if strings.Contains(line, "ERROR:") {
            lastError = line
            errorCount++
        }
    }
}()
```

**Add to RestoreCluster goroutine:**
```go
// Track sub-items within each database
var currentDBItems, totalDBItems int
e.setItemProgressCallback(func(current, total int) {
    currentDBItems = current
    totalDBItems = total
    // Update TUI with sub-progress
    e.reportDatabaseSubProgress(idx, totalDBs, dbName, current, total)
})
```

### Option 2: Verbose mode with line counting
**Pros:**
- More granular progress (row-level)
- Shows exact operation being performed

**Cons:**
- `--verbose` causes massive stderr output (OOM risk on huge DBs)
- Currently disabled for memory safety
- Requires careful memory management

### Option 3: Hybrid approach (BEST)
**Combine both:**
1. **Default**: Parse non-verbose pg_restore output for item counts
2. **Small DBs** (<500MB): Enable verbose for detailed progress
3. **Periodic updates**: Report progress every 5 seconds even without stderr changes

**Implementation:**
```go
// Add periodic progress ticker
progressTicker := time.NewTicker(5 * time.Second)
defer progressTicker.Stop()

go func() {
    for {
        select {
        case <-progressTicker.C:
            // Report heartbeat even if no stderr
            e.reportHeartbeat(dbName, time.Since(dbRestoreStart))
        case <-stderrDone:
            return
        }
    }
}()
```

## Recommended Implementation Plan

### Phase 1: Quick Win (1-2 hours)
1. Add heartbeat ticker in cluster restore goroutines
2. Update TUI to show "Restoring database X... (elapsed: 3m 45s)"
3. No code changes to pg_restore wrapper

### Phase 2: Parse pg_restore Output (4-6 hours)
1. Parse stderr for "processing item" lines
2. Extract current/total item counts
3. Report sub-progress to TUI
4. Update progress bar calculation:
   ```
   dbProgress = baseProgress + (itemsDone/totalItems) * dbWeightedPercent
   ```
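A sketch of what the parsing and weighting could look like in Go, under the proposal's assumption about the stderr format (illustrative only, using just `regexp` and `strconv` from the standard library):

```go
var itemRe = regexp.MustCompile(`processing item (\d+)`) // e.g. "pg_restore: processing item 1234 TABLE public users"

// parseItemNumber extracts the running item counter from a pg_restore stderr line.
func parseItemNumber(line string) (int, bool) {
    m := itemRe.FindStringSubmatch(line)
    if m == nil {
        return 0, false
    }
    n, err := strconv.Atoi(m[1])
    return n, err == nil
}

// weightedProgress folds per-database item progress into the overall bar:
// baseProgress is the percentage completed before this database started and
// dbWeight is the share of the bar this database accounts for.
func weightedProgress(baseProgress, dbWeight float64, itemsDone, totalItems int) float64 {
    if totalItems == 0 {
        return baseProgress
    }
    return baseProgress + (float64(itemsDone)/float64(totalItems))*dbWeight
}
```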

### Phase 3: Smart Verbose Mode (optional)
1. Detect database size before restore
2. Enable verbose for DBs < 500MB
3. Parse verbose output for detailed progress
4. Automatic fallback to item-based for large DBs

## Files to Modify

1. **internal/restore/engine.go**:
   - `executeRestoreCommand()` - add progress parsing
   - `RestoreCluster()` - add heartbeat ticker
   - New: `reportItemProgress()`, `reportHeartbeat()`

2. **internal/tui/restore_exec.go**:
   - Update `RestoreExecModel` to handle sub-progress
   - Add "elapsed time" display during restore
   - Show item counts: "Restoring tables... (234/567)"

3. **internal/progress/indicator.go**:
   - Add `UpdateSubProgress(current, total int)` method
   - Add `ReportHeartbeat(elapsed time.Duration)` method
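One possible shape for those two indicator methods, with hypothetical field names on the existing indicator type (only the signatures come from the list above; everything else is an assumption):

```go
// UpdateSubProgress records item-level progress within the current step.
func (i *Indicator) UpdateSubProgress(current, total int) {
    i.mu.Lock()
    defer i.mu.Unlock()
    i.subCurrent, i.subTotal = current, total // assumed fields; redraw picks them up
}

// ReportHeartbeat refreshes the display with elapsed time even when no new
// progress information has arrived from pg_restore.
func (i *Indicator) ReportHeartbeat(elapsed time.Duration) {
    i.mu.Lock()
    defer i.mu.Unlock()
    i.lastHeartbeat = elapsed // assumed field; shown as "(elapsed: …)" in the TUI
}
```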

## Example Output

**Before (current):**
```
[====================] Phase 2/3: Restoring Databases (1/5)
Restoring database myapp...
[frozen for 30 minutes]
```

**After (with heartbeat):**
```
[====================] Phase 2/3: Restoring Databases (1/5)
Restoring database myapp... (elapsed: 4m 32s)
[updates every 5 seconds]
```

**After (with item parsing):**
```
[=========>-----------] Phase 2/3: Restoring Databases (1/5)
Restoring database myapp... (processing item 1,234/5,678) (elapsed: 4m 32s)
[smooth progress bar movement]
```

## Testing Strategy
1. Test with small DB (< 100MB) - verify heartbeat works
2. Test with large DB (> 10GB) - verify no OOM, heartbeat works
3. Test with BLOB-heavy DB - verify phased restore shows progress
4. Test parallel cluster restore - verify multiple heartbeats don't conflict

## Risk Assessment
- **Low risk**: Heartbeat ticker (Phase 1)
- **Medium risk**: stderr parsing (Phase 2) - test thoroughly
- **High risk**: Verbose mode (Phase 3) - can cause OOM

## Estimated Implementation Time
- Phase 1 (heartbeat): 1-2 hours
- Phase 2 (item parsing): 4-6 hours
- Phase 3 (smart verbose): 8-10 hours (optional)

**Total for Phases 1+2: 5-8 hours**
@@ -4,8 +4,8 @@ This directory contains pre-compiled binaries for the DB Backup Tool across mult

## Build Information
- **Version**: 3.42.50
- **Build Time**: 2026-01-17_06:25:57_UTC
- **Git Commit**: 4ea3ec2
- **Build Time**: 2026-01-21_13:01:17_UTC
- **Git Commit**: 75dee1f

## Recent Updates (v1.1.0)
- ✅ Fixed TUI progress display with line-by-line output

bin/checksums.txt (new file, 10 lines)
@@ -0,0 +1,10 @@
674a9cdb28a6b27ebb3004b2a00330cb23708894207681405abb4774975fd92d  dbbackup_darwin_amd64
c65808a0b9a3eb5a88d4c30579aa67f10093aeb77db74c1d4747730f8bf33fa6  dbbackup_darwin_arm64
c6dd8effb74c8a69b0232c1eb603c4cebe2b6cdf5d2f764c6b9d4ecc98cff6fd  dbbackup_freebsd_amd64
c1f24b324e0afc6b6e59c846d823d09c6c193cf812a92daab103977ec605cb48  dbbackup_linux_amd64
edf31fca271a264a2a3a88c8de8ab0d4c576f5f08199fd60e68791707e0d87a1  dbbackup_linux_arm64
d699561ca3b3b40f8d463bbd3b7eade7fce052f3b4aeea8a56896e8cedab433d  dbbackup_linux_arm_armv7
f221ccc7202e425acae81acf880ea666432889ac74289031b4942bf5f9284eed  dbbackup_netbsd_amd64
f93486bc4efcc627b23c7b0c4e06ffd54e4fb85be322c58aaec3a913b90735af  dbbackup_openbsd_amd64
935dfd4b666760efdc43236d12a1098d86a1c540d9a5ca9534efd7f00d2ab541  dbbackup_windows_amd64.exe
b41d2e467d88c3e4c3fe42bf27cdd10f487564710c3802a842ca7c3e639f44df  dbbackup_windows_arm64.exe
@@ -66,6 +66,15 @@ TUI Automation Flags (for testing and CI/CD):
    cfg.TUIVerbose, _ = cmd.Flags().GetBool("verbose-tui")
    cfg.TUILogFile, _ = cmd.Flags().GetString("tui-log-file")

    // Set conservative profile as default for TUI mode (safer for interactive users)
    if cfg.ResourceProfile == "" || cfg.ResourceProfile == "balanced" {
        cfg.ResourceProfile = "conservative"
        cfg.LargeDBMode = true
        if cfg.Debug {
            log.Info("TUI mode: using conservative profile by default")
        }
    }

    // Check authentication before starting TUI
    if cfg.IsPostgreSQL() {
        if mismatch, msg := auth.CheckAuthenticationMismatch(cfg); mismatch {

cmd/restore.go (319 changes)
@@ -13,8 +13,10 @@ import (

"dbbackup/internal/backup"
"dbbackup/internal/cloud"
"dbbackup/internal/config"
"dbbackup/internal/database"
"dbbackup/internal/pitr"
"dbbackup/internal/progress"
"dbbackup/internal/restore"
"dbbackup/internal/security"

@@ -28,7 +30,8 @@ var (

restoreClean bool
restoreCreate bool
restoreJobs int
restoreParallelDBs int // Number of parallel database restores
restoreParallelDBs int // Number of parallel database restores
restoreProfile string // Resource profile: conservative, balanced, aggressive
restoreTarget string
restoreVerbose bool
restoreNoProgress bool
@@ -37,6 +40,12 @@ var (
restoreDiagnose bool // Run diagnosis before restore
restoreSaveDebugLog string // Path to save debug log on failure

// Single database extraction from cluster flags
restoreDatabase string // Single database to extract/restore from cluster
restoreDatabases string // Comma-separated list of databases to extract
restoreOutputDir string // Extract to directory (no restore)
restoreListDBs bool // List databases in cluster backup

// Diagnose flags
diagnoseJSON bool
diagnoseDeep bool
@@ -112,6 +121,9 @@ Examples:
# Restore to different database
dbbackup restore single mydb.dump.gz --target mydb_test --confirm

# Memory-constrained server (single-threaded, minimal memory)
dbbackup restore single mydb.dump.gz --profile=conservative --confirm

# Clean target database before restore
dbbackup restore single mydb.sql.gz --clean --confirm

@@ -131,6 +143,11 @@ var restoreClusterCmd = &cobra.Command{
This command restores all databases that were backed up together
in a cluster backup operation.

Single Database Extraction:
Use --list-databases to see available databases
Use --database to extract/restore a specific database
Use --output-dir to extract without restoring

Safety features:
- Dry-run by default (use --confirm to execute)
- Archive validation and listing
@@ -138,12 +155,33 @@ Safety features:
- Sequential database restoration

Examples:
# List databases in cluster backup
dbbackup restore cluster backup.tar.gz --list-databases

# Extract single database (no restore)
dbbackup restore cluster backup.tar.gz --database myapp --output-dir /tmp/extract

# Restore single database from cluster
dbbackup restore cluster backup.tar.gz --database myapp --confirm

# Restore single database with different name
dbbackup restore cluster backup.tar.gz --database myapp --target myapp_test --confirm

# Extract multiple databases
dbbackup restore cluster backup.tar.gz --databases "app1,app2,app3" --output-dir /tmp/extract

# Preview cluster restore
dbbackup restore cluster cluster_backup_20240101_120000.tar.gz

# Restore full cluster
dbbackup restore cluster cluster_backup_20240101_120000.tar.gz --confirm

# Memory-constrained server (conservative profile)
dbbackup restore cluster cluster_backup.tar.gz --profile=conservative --confirm

# Maximum performance (dedicated server)
dbbackup restore cluster cluster_backup.tar.gz --profile=aggressive --confirm

# Use parallel decompression
dbbackup restore cluster cluster_backup.tar.gz --jobs 4 --confirm

@@ -277,6 +315,7 @@ func init() {
restoreSingleCmd.Flags().BoolVar(&restoreClean, "clean", false, "Drop and recreate target database")
restoreSingleCmd.Flags().BoolVar(&restoreCreate, "create", false, "Create target database if it doesn't exist")
restoreSingleCmd.Flags().StringVar(&restoreTarget, "target", "", "Target database name (defaults to original)")
restoreSingleCmd.Flags().StringVar(&restoreProfile, "profile", "balanced", "Resource profile: conservative (--parallel=1, low memory), balanced, aggressive (max performance)")
restoreSingleCmd.Flags().BoolVar(&restoreVerbose, "verbose", false, "Show detailed restore progress")
restoreSingleCmd.Flags().BoolVar(&restoreNoProgress, "no-progress", false, "Disable progress indicators")
restoreSingleCmd.Flags().StringVar(&restoreEncryptionKeyFile, "encryption-key-file", "", "Path to encryption key file (required for encrypted backups)")
@@ -285,12 +324,17 @@ func init() {
restoreSingleCmd.Flags().StringVar(&restoreSaveDebugLog, "save-debug-log", "", "Save detailed error report to file on failure (e.g., /tmp/restore-debug.json)")

// Cluster restore flags
restoreClusterCmd.Flags().BoolVar(&restoreListDBs, "list-databases", false, "List databases in cluster backup and exit")
restoreClusterCmd.Flags().StringVar(&restoreDatabase, "database", "", "Extract/restore single database from cluster")
restoreClusterCmd.Flags().StringVar(&restoreDatabases, "databases", "", "Extract multiple databases (comma-separated)")
restoreClusterCmd.Flags().StringVar(&restoreOutputDir, "output-dir", "", "Extract to directory without restoring (requires --database or --databases)")
restoreClusterCmd.Flags().BoolVar(&restoreConfirm, "confirm", false, "Confirm and execute restore (required)")
restoreClusterCmd.Flags().BoolVar(&restoreDryRun, "dry-run", false, "Show what would be done without executing")
restoreClusterCmd.Flags().BoolVar(&restoreForce, "force", false, "Skip safety checks and confirmations")
restoreClusterCmd.Flags().BoolVar(&restoreCleanCluster, "clean-cluster", false, "Drop all existing user databases before restore (disaster recovery)")
restoreClusterCmd.Flags().IntVar(&restoreJobs, "jobs", 0, "Number of parallel decompression jobs (0 = auto)")
restoreClusterCmd.Flags().IntVar(&restoreParallelDBs, "parallel-dbs", 0, "Number of databases to restore in parallel (0 = use config default, 1 = sequential)")
restoreClusterCmd.Flags().StringVar(&restoreProfile, "profile", "balanced", "Resource profile: conservative (--parallel=1, low memory), balanced, aggressive (max performance)")
restoreClusterCmd.Flags().IntVar(&restoreJobs, "jobs", 0, "Number of parallel decompression jobs (0 = auto, overrides profile)")
restoreClusterCmd.Flags().IntVar(&restoreParallelDBs, "parallel-dbs", 0, "Number of databases to restore in parallel (0 = use profile, 1 = sequential, -1 = auto-detect, overrides profile)")
restoreClusterCmd.Flags().StringVar(&restoreWorkdir, "workdir", "", "Working directory for extraction (use when system disk is small, e.g. /mnt/storage/restore_tmp)")
restoreClusterCmd.Flags().BoolVar(&restoreVerbose, "verbose", false, "Show detailed restore progress")
restoreClusterCmd.Flags().BoolVar(&restoreNoProgress, "no-progress", false, "Disable progress indicators")
@@ -298,6 +342,8 @@ func init() {
restoreClusterCmd.Flags().StringVar(&restoreEncryptionKeyEnv, "encryption-key-env", "DBBACKUP_ENCRYPTION_KEY", "Environment variable containing encryption key")
restoreClusterCmd.Flags().BoolVar(&restoreDiagnose, "diagnose", false, "Run deep diagnosis on all dumps before restore")
restoreClusterCmd.Flags().StringVar(&restoreSaveDebugLog, "save-debug-log", "", "Save detailed error report to file on failure (e.g., /tmp/restore-debug.json)")
restoreClusterCmd.Flags().BoolVar(&restoreClean, "clean", false, "Drop and recreate target database (for single DB restore)")
restoreClusterCmd.Flags().BoolVar(&restoreCreate, "create", false, "Create target database if it doesn't exist (for single DB restore)")

// PITR restore flags
restorePITRCmd.Flags().StringVar(&pitrBaseBackup, "base-backup", "", "Path to base backup file (.tar.gz) (required)")
@@ -436,6 +482,16 @@ func runRestoreDiagnose(cmd *cobra.Command, args []string) error {
func runRestoreSingle(cmd *cobra.Command, args []string) error {
archivePath := args[0]

// Apply resource profile
if err := config.ApplyProfile(cfg, restoreProfile, restoreJobs, 0); err != nil {
log.Warn("Invalid profile, using balanced", "error", err)
restoreProfile = "balanced"
_ = config.ApplyProfile(cfg, restoreProfile, restoreJobs, 0)
}
if cfg.Debug && restoreProfile != "balanced" {
log.Info("Using restore profile", "profile", restoreProfile)
}

// Check if this is a cloud URI
var cleanupFunc func() error

@@ -657,6 +713,203 @@ func runRestoreCluster(cmd *cobra.Command, args []string) error {
return fmt.Errorf("archive not found: %s", archivePath)
}

// Handle --list-databases flag
if restoreListDBs {
return runListDatabases(archivePath)
}

// Handle single/multiple database extraction
if restoreDatabase != "" || restoreDatabases != "" {
return runExtractDatabases(archivePath)
}

// Otherwise proceed with full cluster restore
return runFullClusterRestore(archivePath)
}

// runListDatabases lists all databases in a cluster backup
func runListDatabases(archivePath string) error {
ctx := context.Background()

log.Info("Scanning cluster backup", "archive", filepath.Base(archivePath))
fmt.Println()

databases, err := restore.ListDatabasesInCluster(ctx, archivePath, log)
if err != nil {
return fmt.Errorf("failed to list databases: %w", err)
}

fmt.Printf("📦 Databases in cluster backup:\n")
var totalSize int64
for _, db := range databases {
sizeStr := formatSize(db.Size)
fmt.Printf(" - %-30s (%s)\n", db.Name, sizeStr)
totalSize += db.Size
}

fmt.Printf("\nTotal: %s across %d database(s)\n", formatSize(totalSize), len(databases))
return nil
}

// runExtractDatabases extracts single or multiple databases from cluster backup
func runExtractDatabases(archivePath string) error {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()

// Setup signal handling
sigChan := make(chan os.Signal, 1)
signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)
defer signal.Stop(sigChan)

go func() {
<-sigChan
log.Warn("Extraction interrupted by user")
cancel()
}()

// Single database extraction
if restoreDatabase != "" {
return handleSingleDatabaseExtraction(ctx, archivePath, restoreDatabase)
}

// Multiple database extraction
if restoreDatabases != "" {
return handleMultipleDatabaseExtraction(ctx, archivePath, restoreDatabases)
}

return nil
}

// handleSingleDatabaseExtraction handles single database extraction or restore
func handleSingleDatabaseExtraction(ctx context.Context, archivePath, dbName string) error {
// Extract-only mode (no restore)
if restoreOutputDir != "" {
return extractSingleDatabase(ctx, archivePath, dbName, restoreOutputDir)
}

// Restore mode
if !restoreConfirm {
fmt.Println("\n[DRY-RUN] DRY-RUN MODE - No changes will be made")
fmt.Printf("\nWould extract and restore:\n")
fmt.Printf(" Database: %s\n", dbName)
fmt.Printf(" From: %s\n", archivePath)
targetDB := restoreTarget
if targetDB == "" {
targetDB = dbName
}
fmt.Printf(" Target: %s\n", targetDB)
if restoreClean {
fmt.Printf(" Clean: true (drop and recreate)\n")
}
if restoreCreate {
fmt.Printf(" Create: true (create if missing)\n")
}
fmt.Println("\nTo execute this restore, add --confirm flag")
return nil
}

// Create database instance
db, err := database.New(cfg, log)
if err != nil {
return fmt.Errorf("failed to create database instance: %w", err)
}
defer db.Close()

// Create restore engine
engine := restore.New(cfg, log, db)

// Determine target database name
targetDB := restoreTarget
if targetDB == "" {
targetDB = dbName
}

log.Info("Restoring single database from cluster", "database", dbName, "target", targetDB)

// Restore single database from cluster
if err := engine.RestoreSingleFromCluster(ctx, archivePath, dbName, targetDB, restoreClean, restoreCreate); err != nil {
return fmt.Errorf("restore failed: %w", err)
}

fmt.Printf("\n✅ Successfully restored '%s' as '%s'\n", dbName, targetDB)
return nil
}

// extractSingleDatabase extracts a single database without restoring
func extractSingleDatabase(ctx context.Context, archivePath, dbName, outputDir string) error {
log.Info("Extracting database", "database", dbName, "output", outputDir)

// Create progress indicator
prog := progress.NewIndicator(!restoreNoProgress, "dots")

extractedPath, err := restore.ExtractDatabaseFromCluster(ctx, archivePath, dbName, outputDir, log, prog)
if err != nil {
return fmt.Errorf("extraction failed: %w", err)
}

fmt.Printf("\n✅ Extracted: %s\n", extractedPath)
fmt.Printf(" Database: %s\n", dbName)
fmt.Printf(" Location: %s\n", outputDir)
return nil
}

// handleMultipleDatabaseExtraction handles multiple database extraction
func handleMultipleDatabaseExtraction(ctx context.Context, archivePath, databases string) error {
if restoreOutputDir == "" {
return fmt.Errorf("--output-dir required when using --databases")
}

// Parse database list
dbNames := strings.Split(databases, ",")
for i := range dbNames {
dbNames[i] = strings.TrimSpace(dbNames[i])
}

log.Info("Extracting multiple databases", "count", len(dbNames), "output", restoreOutputDir)

// Create progress indicator
prog := progress.NewIndicator(!restoreNoProgress, "dots")

extractedPaths, err := restore.ExtractMultipleDatabasesFromCluster(ctx, archivePath, dbNames, restoreOutputDir, log, prog)
if err != nil {
return fmt.Errorf("extraction failed: %w", err)
}

fmt.Printf("\n✅ Extracted %d database(s):\n", len(extractedPaths))
for dbName, path := range extractedPaths {
fmt.Printf(" - %s → %s\n", dbName, filepath.Base(path))
}
fmt.Printf(" Location: %s\n", restoreOutputDir)
return nil
}

// runFullClusterRestore performs a full cluster restore
func runFullClusterRestore(archivePath string) error {

// Apply resource profile
if err := config.ApplyProfile(cfg, restoreProfile, restoreJobs, restoreParallelDBs); err != nil {
log.Warn("Invalid profile, using balanced", "error", err)
restoreProfile = "balanced"
_ = config.ApplyProfile(cfg, restoreProfile, restoreJobs, restoreParallelDBs)
}
if cfg.Debug || restoreProfile != "balanced" {
log.Info("Using restore profile", "profile", restoreProfile, "parallel_dbs", cfg.ClusterParallelism, "jobs", cfg.Jobs)
}

// Convert to absolute path
if !filepath.IsAbs(archivePath) {
absPath, err := filepath.Abs(archivePath)
if err != nil {
return fmt.Errorf("invalid archive path: %w", err)
}
archivePath = absPath
}

// Check if file exists
if _, err := os.Stat(archivePath); err != nil {
return fmt.Errorf("archive not found: %s", archivePath)
}

// Check if backup is encrypted and decrypt if necessary
if backup.IsBackupEncrypted(archivePath) {
log.Info("Encrypted cluster backup detected, decrypting...")
@@ -786,7 +1039,12 @@ func runRestoreCluster(cmd *cobra.Command, args []string) error {
}

// Override cluster parallelism if --parallel-dbs is specified
if restoreParallelDBs > 0 {
if restoreParallelDBs == -1 {
// Auto-detect optimal parallelism based on system resources
autoParallel := restore.CalculateOptimalParallel()
cfg.ClusterParallelism = autoParallel
log.Info("Auto-detected optimal parallelism for database restores", "parallel_dbs", autoParallel, "mode", "auto")
} else if restoreParallelDBs > 0 {
cfg.ClusterParallelism = restoreParallelDBs
log.Info("Using custom parallelism for database restores", "parallel_dbs", restoreParallelDBs)
}
@@ -835,22 +1093,50 @@ func runRestoreCluster(cmd *cobra.Command, args []string) error {
log.Info("Database cleanup completed")
}

// Run pre-restore diagnosis if requested
if restoreDiagnose {
log.Info("[DIAG] Running pre-restore diagnosis...")
// OPTIMIZATION: Pre-extract archive once for both diagnosis and restore
// This avoids extracting the same tar.gz twice (saves 5-10 min on large clusters)
var extractedDir string
var extractErr error

// Create temp directory for extraction in configured WorkDir
workDir := cfg.GetEffectiveWorkDir()
diagTempDir, err := os.MkdirTemp(workDir, "dbbackup-diagnose-*")
if err != nil {
return fmt.Errorf("failed to create temp directory for diagnosis in %s: %w", workDir, err)
if restoreDiagnose || restoreConfirm {
log.Info("Pre-extracting cluster archive (shared for validation and restore)...")
extractedDir, extractErr = safety.ValidateAndExtractCluster(ctx, archivePath)
if extractErr != nil {
return fmt.Errorf("failed to extract cluster archive: %w", extractErr)
}
defer os.RemoveAll(diagTempDir)
defer os.RemoveAll(extractedDir) // Cleanup at end
log.Info("Archive extracted successfully", "location", extractedDir)
}

// Run pre-restore diagnosis if requested (using already-extracted directory)
if restoreDiagnose {
log.Info("[DIAG] Running pre-restore diagnosis on extracted dumps...")

diagnoser := restore.NewDiagnoser(log, restoreVerbose)
results, err := diagnoser.DiagnoseClusterDumps(archivePath, diagTempDir)
// Diagnose dumps directly from extracted directory
dumpsDir := filepath.Join(extractedDir, "dumps")
if _, err := os.Stat(dumpsDir); err != nil {
return fmt.Errorf("no dumps directory found in extracted archive: %w", err)
}

entries, err := os.ReadDir(dumpsDir)
if err != nil {
return fmt.Errorf("diagnosis failed: %w", err)
return fmt.Errorf("failed to read dumps directory: %w", err)
}

// Diagnose each dump file
var results []*restore.DiagnoseResult
for _, entry := range entries {
if entry.IsDir() {
continue
}
dumpPath := filepath.Join(dumpsDir, entry.Name())
result, err := diagnoser.DiagnoseFile(dumpPath)
if err != nil {
log.Warn("Could not diagnose dump", "file", entry.Name(), "error", err)
continue
}
results = append(results, result)
}

// Check for any invalid dumps
@@ -890,7 +1176,8 @@ func runRestoreCluster(cmd *cobra.Command, args []string) error {
startTime := time.Now()
auditLogger.LogRestoreStart(user, "all_databases", archivePath)

if err := engine.RestoreCluster(ctx, archivePath); err != nil {
// Pass pre-extracted directory to avoid double extraction
if err := engine.RestoreCluster(ctx, archivePath, extractedDir); err != nil {
auditLogger.LogRestoreFailed(user, "all_databases", err)
return fmt.Errorf("cluster restore failed: %w", err)
}

diagnose_postgres_memory.sh (new executable file, 359 lines)
@@ -0,0 +1,359 @@
#!/bin/bash
#
# PostgreSQL Memory and Resource Diagnostic Tool
# Analyzes memory usage, locks, and system resources to identify restore issues
#

set -e

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

echo "════════════════════════════════════════════════════════════"
echo " PostgreSQL Memory & Resource Diagnostics"
echo " $(date '+%Y-%m-%d %H:%M:%S')"
echo "════════════════════════════════════════════════════════════"
echo

# Function to format bytes to human readable
bytes_to_human() {
    local bytes=$1
    if [ "$bytes" -ge 1073741824 ]; then
        echo "$(awk "BEGIN {printf \"%.2f GB\", $bytes/1073741824}")"
    elif [ "$bytes" -ge 1048576 ]; then
        echo "$(awk "BEGIN {printf \"%.2f MB\", $bytes/1048576}")"
    else
        echo "$(awk "BEGIN {printf \"%.2f KB\", $bytes/1024}")"
    fi
}

# 1. SYSTEM MEMORY OVERVIEW
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}📊 SYSTEM MEMORY OVERVIEW${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo

if command -v free &> /dev/null; then
    free -h
    echo

    # Calculate percentages
    MEM_TOTAL=$(free -b | awk '/^Mem:/ {print $2}')
    MEM_USED=$(free -b | awk '/^Mem:/ {print $3}')
    MEM_FREE=$(free -b | awk '/^Mem:/ {print $4}')
    MEM_AVAILABLE=$(free -b | awk '/^Mem:/ {print $7}')

    MEM_PERCENT=$(awk "BEGIN {printf \"%.1f\", ($MEM_USED/$MEM_TOTAL)*100}")

    echo "Memory Utilization: ${MEM_PERCENT}%"
    echo "Total: $(bytes_to_human $MEM_TOTAL)"
    echo "Used: $(bytes_to_human $MEM_USED)"
    echo "Available: $(bytes_to_human $MEM_AVAILABLE)"

    if (( $(echo "$MEM_PERCENT > 90" | bc -l) )); then
        echo -e "${RED}⚠️ WARNING: Memory usage is critically high (>90%)${NC}"
    elif (( $(echo "$MEM_PERCENT > 70" | bc -l) )); then
        echo -e "${YELLOW}⚠️ CAUTION: Memory usage is high (>70%)${NC}"
    else
        echo -e "${GREEN}✓ Memory usage is acceptable${NC}"
    fi
fi
echo

# 2. TOP MEMORY CONSUMERS
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}🔍 TOP 15 MEMORY CONSUMING PROCESSES${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo
ps aux --sort=-%mem | head -16 | awk 'NR==1 {print $0} NR>1 {printf "%-8s %5s%% %7s %s\n", $1, $4, $6/1024"M", $11}'
echo

# 3. POSTGRESQL PROCESSES
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}🐘 POSTGRESQL PROCESSES${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo

PG_PROCS=$(ps aux | grep -E "postgres.*:" | grep -v grep || true)
if [ -z "$PG_PROCS" ]; then
    echo "No PostgreSQL processes found"
else
    echo "$PG_PROCS" | awk '{printf "%-8s %5s%% %7s %s\n", $1, $4, $6/1024"M", $11}'
    echo

    # Sum up PostgreSQL memory
    PG_MEM_TOTAL=$(echo "$PG_PROCS" | awk '{sum+=$6} END {print sum/1024}')
    echo "Total PostgreSQL Memory: ${PG_MEM_TOTAL} MB"
fi
echo

# 4. POSTGRESQL CONFIGURATION
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}⚙️ POSTGRESQL MEMORY CONFIGURATION${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo

if command -v psql &> /dev/null; then
    PSQL_CMD="psql -t -A -c"

    # Try as postgres user first, then current user
    if sudo -u postgres $PSQL_CMD "SELECT 1" &> /dev/null; then
        PSQL_PREFIX="sudo -u postgres"
    elif $PSQL_CMD "SELECT 1" &> /dev/null; then
        PSQL_PREFIX=""
    else
        echo "❌ Cannot connect to PostgreSQL"
        PSQL_PREFIX="NONE"
    fi

    if [ "$PSQL_PREFIX" != "NONE" ]; then
        echo "Key Memory Settings:"
        echo "────────────────────────────────────────────────────────────"

        # Get all relevant settings (strip timing output)
        SHARED_BUFFERS=$($PSQL_PREFIX psql -t -A -c "SHOW shared_buffers;" 2>/dev/null | head -1 || echo "unknown")
        WORK_MEM=$($PSQL_PREFIX psql -t -A -c "SHOW work_mem;" 2>/dev/null | head -1 || echo "unknown")
        MAINT_WORK_MEM=$($PSQL_PREFIX psql -t -A -c "SHOW maintenance_work_mem;" 2>/dev/null | head -1 || echo "unknown")
        EFFECTIVE_CACHE=$($PSQL_PREFIX psql -t -A -c "SHOW effective_cache_size;" 2>/dev/null | head -1 || echo "unknown")
        MAX_CONNECTIONS=$($PSQL_PREFIX psql -t -A -c "SHOW max_connections;" 2>/dev/null | head -1 || echo "unknown")
        MAX_LOCKS=$($PSQL_PREFIX psql -t -A -c "SHOW max_locks_per_transaction;" 2>/dev/null | head -1 || echo "unknown")
        MAX_PREPARED=$($PSQL_PREFIX psql -t -A -c "SHOW max_prepared_transactions;" 2>/dev/null | head -1 || echo "unknown")

        echo "shared_buffers: $SHARED_BUFFERS"
        echo "work_mem: $WORK_MEM"
        echo "maintenance_work_mem: $MAINT_WORK_MEM"
        echo "effective_cache_size: $EFFECTIVE_CACHE"
        echo "max_connections: $MAX_CONNECTIONS"
        echo "max_locks_per_transaction: $MAX_LOCKS"
        echo "max_prepared_transactions: $MAX_PREPARED"
        echo

        # Calculate lock capacity
        if [ "$MAX_LOCKS" != "unknown" ] && [ "$MAX_CONNECTIONS" != "unknown" ] && [ "$MAX_PREPARED" != "unknown" ]; then
            # Ensure values are numeric
            if [[ "$MAX_LOCKS" =~ ^[0-9]+$ ]] && [[ "$MAX_CONNECTIONS" =~ ^[0-9]+$ ]] && [[ "$MAX_PREPARED" =~ ^[0-9]+$ ]]; then
                LOCK_CAPACITY=$((MAX_LOCKS * (MAX_CONNECTIONS + MAX_PREPARED)))
                echo "Total Lock Capacity: $LOCK_CAPACITY locks"

                if [ "$MAX_LOCKS" -lt 1000 ]; then
                    echo -e "${RED}⚠️ WARNING: max_locks_per_transaction is too low for large restores${NC}"
                    echo -e "${YELLOW} Recommended: 4096 or higher${NC}"
                fi
            fi
        fi
        echo
    fi
else
    echo "❌ psql not found"
fi

# 5. CURRENT LOCKS AND CONNECTIONS
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}🔒 CURRENT LOCKS AND CONNECTIONS${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo

if [ "$PSQL_PREFIX" != "NONE" ] && command -v psql &> /dev/null; then
    # Active connections
    ACTIVE_CONNS=$($PSQL_PREFIX psql -t -A -c "SELECT count(*) FROM pg_stat_activity;" 2>/dev/null | head -1 || echo "0")
    echo "Active Connections: $ACTIVE_CONNS / $MAX_CONNECTIONS"
    echo

    # Lock statistics
    echo "Current Lock Usage:"
    echo "────────────────────────────────────────────────────────────"
    $PSQL_PREFIX psql -c "
    SELECT
        mode,
        COUNT(*) as count
    FROM pg_locks
    GROUP BY mode
    ORDER BY count DESC;
    " 2>/dev/null || echo "Unable to query locks"
    echo

    # Total locks
    TOTAL_LOCKS=$($PSQL_PREFIX psql -t -A -c "SELECT COUNT(*) FROM pg_locks;" 2>/dev/null | head -1 || echo "0")
    echo "Total Active Locks: $TOTAL_LOCKS"

    if [ ! -z "$LOCK_CAPACITY" ] && [ ! -z "$TOTAL_LOCKS" ] && [[ "$TOTAL_LOCKS" =~ ^[0-9]+$ ]] && [ "$TOTAL_LOCKS" -gt 0 ] 2>/dev/null; then
        LOCK_PERCENT=$((TOTAL_LOCKS * 100 / LOCK_CAPACITY))
        echo "Lock Usage: ${LOCK_PERCENT}%"

        if [ "$LOCK_PERCENT" -gt 80 ]; then
            echo -e "${RED}⚠️ WARNING: Lock table usage is critically high${NC}"
        elif [ "$LOCK_PERCENT" -gt 60 ]; then
            echo -e "${YELLOW}⚠️ CAUTION: Lock table usage is elevated${NC}"
        fi
    fi
    echo

    # Blocking queries
    echo "Blocking Queries:"
    echo "────────────────────────────────────────────────────────────"
    $PSQL_PREFIX psql -c "
    SELECT
        blocked_locks.pid AS blocked_pid,
        blocking_locks.pid AS blocking_pid,
        blocked_activity.usename AS blocked_user,
        blocking_activity.usename AS blocking_user,
        blocked_activity.query AS blocked_query
    FROM pg_catalog.pg_locks blocked_locks
    JOIN pg_catalog.pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
    JOIN pg_catalog.pg_locks blocking_locks
        ON blocking_locks.locktype = blocked_locks.locktype
        AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
        AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
        AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
        AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
        AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
        AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
        AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
        AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
        AND blocking_locks.pid != blocked_locks.pid
    JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
    WHERE NOT blocked_locks.granted;
    " 2>/dev/null || echo "No blocking queries or unable to query"
    echo
fi

# 6. SHARED MEMORY USAGE
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}💾 SHARED MEMORY SEGMENTS${NC}"
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo

if command -v ipcs &> /dev/null; then
    ipcs -m
    echo

    # Sum up shared memory
    TOTAL_SHM=$(ipcs -m | awk '/^0x/ {sum+=$5} END {print sum}')
    if [ ! -z "$TOTAL_SHM" ]; then
        echo "Total Shared Memory: $(bytes_to_human $TOTAL_SHM)"
    fi
else
    echo "ipcs command not available"
fi
echo

# 7. DISK SPACE (relevant for temp files)
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
echo -e "${BLUE}💿 DISK SPACE${NC}"
|
||||
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo
|
||||
|
||||
df -h | grep -E "Filesystem|/$|/var|/tmp|/postgres"
|
||||
echo
|
||||
|
||||
# Check for PostgreSQL temp files
|
||||
if [ "$PSQL_PREFIX" != "NONE" ] && command -v psql &> /dev/null; then
|
||||
TEMP_FILES=$($PSQL_PREFIX psql -t -A -c "SELECT count(*) FROM pg_stat_database WHERE temp_files > 0;" 2>/dev/null | head -1 || echo "0")
|
||||
if [ ! -z "$TEMP_FILES" ] && [ "$TEMP_FILES" -gt 0 ] 2>/dev/null; then
|
||||
echo -e "${YELLOW}⚠️ Databases are using temporary files (work_mem may be too low)${NC}"
|
||||
$PSQL_PREFIX psql -c "SELECT datname, temp_files, pg_size_pretty(temp_bytes) as temp_size FROM pg_stat_database WHERE temp_files > 0;" 2>/dev/null
|
||||
echo
|
||||
fi
|
||||
fi
|
||||
|
||||
# 8. OTHER RESOURCE CONSUMERS
|
||||
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo -e "${BLUE}🔍 OTHER POTENTIAL MEMORY CONSUMERS${NC}"
|
||||
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo
|
||||
|
||||
# Check for common memory hogs
|
||||
echo "Checking for common memory-intensive services..."
|
||||
echo
|
||||
|
||||
for service in "mysqld" "mongodb" "redis" "elasticsearch" "java" "docker" "containerd"; do
|
||||
MEM=$(ps aux | grep "$service" | grep -v grep | awk '{sum+=$4} END {printf "%.1f", sum}')
|
||||
if [ ! -z "$MEM" ] && (( $(echo "$MEM > 0" | bc -l) )); then
|
||||
echo " ${service}: ${MEM}%"
|
||||
fi
|
||||
done
|
||||
echo
|
||||
|
||||
# 9. SWAP USAGE
|
||||
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo -e "${BLUE}🔄 SWAP USAGE${NC}"
|
||||
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo
|
||||
|
||||
if command -v free &> /dev/null; then
|
||||
SWAP_TOTAL=$(free -b | awk '/^Swap:/ {print $2}')
|
||||
SWAP_USED=$(free -b | awk '/^Swap:/ {print $3}')
|
||||
|
||||
if [ "$SWAP_TOTAL" -gt 0 ]; then
|
||||
SWAP_PERCENT=$(awk "BEGIN {printf \"%.1f\", ($SWAP_USED/$SWAP_TOTAL)*100}")
|
||||
echo "Swap Total: $(bytes_to_human $SWAP_TOTAL)"
|
||||
echo "Swap Used: $(bytes_to_human $SWAP_USED) (${SWAP_PERCENT}%)"
|
||||
|
||||
if (( $(echo "$SWAP_PERCENT > 50" | bc -l) )); then
|
||||
echo -e "${RED}⚠️ WARNING: Heavy swap usage detected - system may be thrashing${NC}"
|
||||
elif (( $(echo "$SWAP_PERCENT > 20" | bc -l) )); then
|
||||
echo -e "${YELLOW}⚠️ CAUTION: System is using swap${NC}"
|
||||
else
|
||||
echo -e "${GREEN}✓ Swap usage is low${NC}"
|
||||
fi
|
||||
else
|
||||
echo "No swap configured"
|
||||
fi
|
||||
fi
|
||||
echo
|
||||
|
||||
# 10. RECOMMENDATIONS
|
||||
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo -e "${BLUE}💡 RECOMMENDATIONS${NC}"
|
||||
echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
|
||||
echo
|
||||
|
||||
echo "Based on the diagnostics:"
|
||||
echo
|
||||
|
||||
# Memory recommendations
|
||||
if [ ! -z "$MEM_PERCENT" ]; then
|
||||
if (( $(echo "$MEM_PERCENT > 80" | bc -l) )); then
|
||||
echo "1. ⚠️ Memory Pressure:"
|
||||
echo " • System memory is ${MEM_PERCENT}% utilized"
|
||||
echo " • Stop non-essential services before restore"
|
||||
echo " • Consider increasing system RAM"
|
||||
echo " • Use 'dbbackup restore --parallel=1' to reduce memory usage"
|
||||
echo
|
||||
fi
|
||||
fi
|
||||
|
||||
# Lock recommendations
|
||||
if [ "$MAX_LOCKS" != "unknown" ] && [ ! -z "$MAX_LOCKS" ] && [[ "$MAX_LOCKS" =~ ^[0-9]+$ ]]; then
|
||||
if [ "$MAX_LOCKS" -lt 1000 ] 2>/dev/null; then
|
||||
echo "2. ⚠️ Lock Configuration:"
|
||||
echo " • max_locks_per_transaction is too low: $MAX_LOCKS"
|
||||
echo " • Run: ./fix_postgres_locks.sh"
|
||||
echo " • Or manually: ALTER SYSTEM SET max_locks_per_transaction = 4096;"
|
||||
echo " • Then restart PostgreSQL"
|
||||
echo
|
||||
fi
|
||||
fi
|
||||
|
||||
# Other recommendations
|
||||
echo "3. 🔧 Before Large Restores:"
|
||||
echo " • Stop unnecessary services (web servers, cron jobs, etc.)"
|
||||
echo " • Clear PostgreSQL idle connections"
|
||||
echo " • Ensure adequate disk space for temp files"
|
||||
echo " • Consider using --large-db mode for very large databases"
|
||||
echo
|
||||
|
||||
echo "4. 📊 Monitor During Restore:"
|
||||
echo " • Watch: watch -n 2 'ps aux | grep postgres | head -20'"
|
||||
echo " • Locks: watch -n 5 'psql -c \"SELECT COUNT(*) FROM pg_locks;\"'"
|
||||
echo " • Memory: watch -n 2 free -h"
|
||||
echo
|
||||
|
||||
echo "════════════════════════════════════════════════════════════"
|
||||
echo " Report generated: $(date '+%Y-%m-%d %H:%M:%S')"
|
||||
echo " Save this output: $0 > diagnosis_$(date +%Y%m%d_%H%M%S).log"
|
||||
echo "════════════════════════════════════════════════════════════"
|
||||
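As a quick cross-check of the figures above, the same lock capacity the script derives can be computed straight from pg_settings, and idle sessions can be cleared before a large restore as the recommendations suggest. A minimal sketch, assuming local psql access as the postgres user:

  # Lock capacity = max_locks_per_transaction * (max_connections + max_prepared_transactions)
  sudo -u postgres psql -t -A -c "
  SELECT (SELECT setting::int FROM pg_settings WHERE name = 'max_locks_per_transaction')
       * ((SELECT setting::int FROM pg_settings WHERE name = 'max_connections')
        + (SELECT setting::int FROM pg_settings WHERE name = 'max_prepared_transactions'))
         AS total_lock_capacity;"

  # Current lock usage against that capacity
  sudo -u postgres psql -t -A -c "SELECT count(*) AS active_locks FROM pg_locks;"

  # Optional: terminate idle sessions so the restore starts with a clean slate
  sudo -u postgres psql -c "
  SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
  WHERE state = 'idle' AND pid <> pg_backend_pid();"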
112
email_infra_team.txt
Normal file
@ -0,0 +1,112 @@
|
||||
Subject: PostgreSQL restore failures - "out of shared memory" on the RST server

Hello Infra team,

we are repeatedly seeing restore failures with "out of shared memory" errors on the RST PostgreSQL server (PostgreSQL 17.4).

═══════════════════════════════════════════════════════════
ANALYSIS (as of 2026-01-20)
═══════════════════════════════════════════════════════════

Server specs:
• RAM: 31 GB (currently 19.6 GB in use = 63.9%)
• PostgreSQL itself uses only ~118 MB for its own processes
• Swap: 4 GB (6.4% used)

Lock configuration:
• max_locks_per_transaction: 4096 ✓ (already correct)
• max_connections: 100
• Lock capacity: 409,600 ✓ (sufficient)

═══════════════════════════════════════════════════════════
PROBLEM IDENTIFICATION
═══════════════════════════════════════════════════════════

1. MEMORY CONSUMERS (non-PostgreSQL):
• Nessus Agent: ~173 MB
• Elastic Agent: ~300 MB (several components)
• Icinga: ~24 MB
• Other monitoring: ~100+ MB

2. WORK_MEM TOO LOW:
• Current: 64 MB
• 4 databases are using temp files (an indicator that work_mem is too low):
  - prodkc: 201 MB temp files
  - keycloak: 45 MB temp files
  - d7030: 6 MB temp files
  - pgbench_db: 2 MB temp files

═══════════════════════════════════════════════════════════
RECOMMENDED ACTIONS
═══════════════════════════════════════════════════════════

OPTION A - Temporary, for large restores:
-------------------------------------------
1. Stop the monitoring agents (frees ~500 MB):
   sudo systemctl stop nessus-agent
   sudo systemctl stop elastic-agent

2. Increase work_mem:
   sudo -u postgres psql -c "ALTER SYSTEM SET work_mem = '256MB';"
   sudo systemctl restart postgresql

3. Run the restore

4. Start the agents again:
   sudo systemctl start nessus-agent
   sudo systemctl start elastic-agent


OPTION B - Permanent solution:
-------------------------------------------
1. Increase work_mem to 256 MB (instead of 64 MB)
2. Optionally increase maintenance_work_mem to 4 GB (instead of 2 GB)
3. If possible: move the monitoring stack to a dedicated server

SQL commands:
   ALTER SYSTEM SET work_mem = '256MB';
   ALTER SYSTEM SET maintenance_work_mem = '4GB';
   -- Then restart PostgreSQL


OPTION C - If no config change is possible:
-------------------------------------------
• Run the restore with --profile=conservative (reduces memory pressure)
  dbbackup restore cluster backup.tar.gz --profile=conservative --confirm

• Or use the TUI mode (it automatically uses the conservative profile):
  dbbackup interactive

• Disable monitoring during the restore window

═══════════════════════════════════════════════════════════
DETAILED REPORT
═══════════════════════════════════════════════════════════

The full diagnostic report is attached and can be regenerated at any
time with this script:

/path/to/diagnose_postgres_memory.sh

The script analyzes:
• System memory usage
• PostgreSQL configuration
• Lock usage
• Temp file usage
• Blocking queries
• Shared memory segments

═══════════════════════════════════════════════════════════

Our preference would be Option B (permanently increasing work_mem), so that
future large restores run through without manual intervention.

Please let us know which option you will implement, or whether you need
any further information.

Thanks & regards
[Your Name]

---
Attachment: diagnose_postgres_memory.sh (if not already available)
Error log: /a01/dba/tmp/dbbackup-restore-debug-20260119-221730.json
|
||||
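For the restore window itself, Option A from the email can be run as one sequence. A sketch that assumes the systemd unit names mentioned above and uses an example archive name:

  # Free the memory held by the monitoring agents (unit names as listed in the email)
  sudo systemctl stop nessus-agent elastic-agent

  # Raise work_mem temporarily; restart PostgreSQL so the change takes effect
  sudo -u postgres psql -c "ALTER SYSTEM SET work_mem = '256MB';"
  sudo systemctl restart postgresql

  # Run the restore with the conservative profile (example archive name)
  dbbackup restore cluster backup.tar.gz --profile=conservative --confirm

  # Bring monitoring back afterwards
  sudo systemctl start nessus-agent elastic-agent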
140
fix_postgres_locks.sh
Executable file
@ -0,0 +1,140 @@
|
||||
#!/bin/bash
|
||||
#
|
||||
# Fix PostgreSQL Lock Table Exhaustion
|
||||
# Increases max_locks_per_transaction to handle large database restores
|
||||
#
|
||||
|
||||
set -e
|
||||
|
||||
echo "════════════════════════════════════════════════════════════"
|
||||
echo " PostgreSQL Lock Configuration Fix"
|
||||
echo "════════════════════════════════════════════════════════════"
|
||||
echo
|
||||
|
||||
# Check if running as postgres user or with sudo
|
||||
if [ "$EUID" -ne 0 ] && [ "$(whoami)" != "postgres" ]; then
|
||||
echo "⚠️ This script should be run as:"
|
||||
echo " sudo $0"
|
||||
echo " or as the postgres user"
|
||||
echo
|
||||
read -p "Continue anyway? (y/N) " -n 1 -r
|
||||
echo
|
||||
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
|
||||
exit 1
|
||||
fi
|
||||
fi
|
||||
|
||||
# Detect PostgreSQL version and config
|
||||
PSQL=$(command -v psql || echo "")
|
||||
if [ -z "$PSQL" ]; then
|
||||
echo "❌ psql not found in PATH"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo "📊 Current PostgreSQL Configuration:"
|
||||
echo "────────────────────────────────────────────────────────────"
|
||||
sudo -u postgres psql -c "SHOW max_locks_per_transaction;" 2>/dev/null || psql -c "SHOW max_locks_per_transaction;" || echo "Unable to query current value"
|
||||
sudo -u postgres psql -c "SHOW max_connections;" 2>/dev/null || psql -c "SHOW max_connections;" || echo "Unable to query current value"
|
||||
sudo -u postgres psql -c "SHOW work_mem;" 2>/dev/null || psql -c "SHOW work_mem;" || echo "Unable to query current value"
|
||||
sudo -u postgres psql -c "SHOW maintenance_work_mem;" 2>/dev/null || psql -c "SHOW maintenance_work_mem;" || echo "Unable to query current value"
|
||||
echo
|
||||
|
||||
# Recommended values
|
||||
RECOMMENDED_LOCKS=4096
|
||||
RECOMMENDED_WORK_MEM="256MB"
|
||||
RECOMMENDED_MAINTENANCE_WORK_MEM="4GB"
|
||||
|
||||
echo "🔧 Applying Fixes:"
|
||||
echo "────────────────────────────────────────────────────────────"
|
||||
echo "1. Setting max_locks_per_transaction = $RECOMMENDED_LOCKS"
|
||||
echo "2. Setting work_mem = $RECOMMENDED_WORK_MEM (improves query performance)"
|
||||
echo "3. Setting maintenance_work_mem = $RECOMMENDED_MAINTENANCE_WORK_MEM (speeds up restore/vacuum)"
|
||||
echo
|
||||
|
||||
# Apply the settings
|
||||
SUCCESS=0
|
||||
|
||||
# Fix 1: max_locks_per_transaction
|
||||
if sudo -u postgres psql -c "ALTER SYSTEM SET max_locks_per_transaction = $RECOMMENDED_LOCKS;" 2>/dev/null; then
|
||||
echo "✅ max_locks_per_transaction updated successfully"
|
||||
SUCCESS=$((SUCCESS + 1))
|
||||
elif psql -c "ALTER SYSTEM SET max_locks_per_transaction = $RECOMMENDED_LOCKS;" 2>/dev/null; then
|
||||
echo "✅ max_locks_per_transaction updated successfully"
|
||||
SUCCESS=$((SUCCESS + 1))
|
||||
else
|
||||
echo "❌ Failed to update max_locks_per_transaction"
|
||||
fi
|
||||
|
||||
# Fix 2: work_mem
|
||||
if sudo -u postgres psql -c "ALTER SYSTEM SET work_mem = '$RECOMMENDED_WORK_MEM';" 2>/dev/null; then
|
||||
echo "✅ work_mem updated successfully"
|
||||
SUCCESS=$((SUCCESS + 1))
|
||||
elif psql -c "ALTER SYSTEM SET work_mem = '$RECOMMENDED_WORK_MEM';" 2>/dev/null; then
|
||||
echo "✅ work_mem updated successfully"
|
||||
SUCCESS=$((SUCCESS + 1))
|
||||
else
|
||||
echo "❌ Failed to update work_mem"
|
||||
fi
|
||||
|
||||
# Fix 3: maintenance_work_mem
|
||||
if sudo -u postgres psql -c "ALTER SYSTEM SET maintenance_work_mem = '$RECOMMENDED_MAINTENANCE_WORK_MEM';" 2>/dev/null; then
|
||||
echo "✅ maintenance_work_mem updated successfully"
|
||||
SUCCESS=$((SUCCESS + 1))
|
||||
elif psql -c "ALTER SYSTEM SET maintenance_work_mem = '$RECOMMENDED_MAINTENANCE_WORK_MEM';" 2>/dev/null; then
|
||||
echo "✅ maintenance_work_mem updated successfully"
|
||||
SUCCESS=$((SUCCESS + 1))
|
||||
else
|
||||
echo "❌ Failed to update maintenance_work_mem"
|
||||
fi
|
||||
|
||||
if [ $SUCCESS -eq 0 ]; then
|
||||
echo
|
||||
echo "❌ All configuration updates failed"
|
||||
echo
|
||||
echo "Manual steps:"
|
||||
echo "1. Connect to PostgreSQL as superuser:"
|
||||
echo " sudo -u postgres psql"
|
||||
echo
|
||||
echo "2. Run these commands:"
|
||||
echo " ALTER SYSTEM SET max_locks_per_transaction = $RECOMMENDED_LOCKS;"
|
||||
echo " ALTER SYSTEM SET work_mem = '$RECOMMENDED_WORK_MEM';"
|
||||
echo " ALTER SYSTEM SET maintenance_work_mem = '$RECOMMENDED_MAINTENANCE_WORK_MEM';"
|
||||
echo
|
||||
exit 1
|
||||
fi
|
||||
|
||||
echo
|
||||
echo "✅ Applied $SUCCESS out of 3 configuration changes"
|
||||
|
||||
echo
|
||||
echo "⚠️ IMPORTANT: PostgreSQL restart required!"
|
||||
echo "────────────────────────────────────────────────────────────"
|
||||
echo
|
||||
echo "Restart PostgreSQL using one of these commands:"
|
||||
echo
|
||||
echo " • systemd: sudo systemctl restart postgresql"
|
||||
echo " • pg_ctl: sudo -u postgres pg_ctl restart -D /var/lib/postgresql/data"
|
||||
echo " • service: sudo service postgresql restart"
|
||||
echo
|
||||
echo "📊 Expected capacity after restart:"
|
||||
echo "────────────────────────────────────────────────────────────"
|
||||
echo " Lock capacity: max_locks_per_transaction × (max_connections + max_prepared)"
|
||||
echo " = $RECOMMENDED_LOCKS × (connections + prepared)"
|
||||
echo
|
||||
echo " Work memory: $RECOMMENDED_WORK_MEM per query operation"
|
||||
echo " Maintenance: $RECOMMENDED_MAINTENANCE_WORK_MEM for restore/vacuum/index"
|
||||
echo
|
||||
echo "After restarting, verify with:"
|
||||
echo " psql -c 'SHOW max_locks_per_transaction;'"
|
||||
echo " psql -c 'SHOW work_mem;'"
|
||||
echo " psql -c 'SHOW maintenance_work_mem;'"
|
||||
echo
|
||||
echo "💡 Benefits:"
|
||||
echo " ✓ Prevents 'out of shared memory' errors during restore"
|
||||
echo " ✓ Reduces temp file usage (better performance)"
|
||||
echo " ✓ Faster restore, vacuum, and index operations"
|
||||
echo
|
||||
echo "🔍 For comprehensive diagnostics, run:"
|
||||
echo " ./diagnose_postgres_memory.sh"
|
||||
echo
|
||||
echo "════════════════════════════════════════════════════════════"
|
||||
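One detail worth noting after running the script: ALTER SYSTEM only writes postgresql.auto.conf. work_mem and maintenance_work_mem are picked up on a configuration reload, while max_locks_per_transaction genuinely needs the restart the script asks for. A minimal check, assuming the same postgres access the script uses (pg_settings.pending_restart is available since PostgreSQL 9.5):

  # Reload the configuration, then see which of the changed settings still needs a restart
  sudo -u postgres psql -c "SELECT pg_reload_conf();"
  sudo -u postgres psql -c "
  SELECT name, setting, pending_restart
  FROM pg_settings
  WHERE name IN ('max_locks_per_transaction', 'work_mem', 'maintenance_work_mem');"
  # Expected: pending_restart = t only for max_locks_per_transaction until PostgreSQL is restarted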
@ -94,7 +94,7 @@
|
||||
"uid": "${DS_PROMETHEUS}"
|
||||
},
|
||||
"editorMode": "code",
|
||||
"expr": "dbbackup_rpo_seconds{instance=~\"$instance\"} < 86400",
|
||||
"expr": "dbbackup_rpo_seconds{instance=~\"$instance\"} < bool 604800",
|
||||
"legendFormat": "{{database}}",
|
||||
"range": true,
|
||||
"refId": "A"
|
||||
@ -711,19 +711,6 @@
|
||||
},
|
||||
"pluginVersion": "10.2.0",
|
||||
"targets": [
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
"uid": "${DS_PROMETHEUS}"
|
||||
},
|
||||
"editorMode": "code",
|
||||
"expr": "dbbackup_rpo_seconds{instance=~\"$instance\"} < 86400",
|
||||
"format": "table",
|
||||
"instant": true,
|
||||
"legendFormat": "__auto",
|
||||
"range": false,
|
||||
"refId": "Status"
|
||||
},
|
||||
{
|
||||
"datasource": {
|
||||
"type": "prometheus",
|
||||
@ -769,26 +756,30 @@
|
||||
"Time": true,
|
||||
"Time 1": true,
|
||||
"Time 2": true,
|
||||
"Time 3": true,
|
||||
"__name__": true,
|
||||
"__name__ 1": true,
|
||||
"__name__ 2": true,
|
||||
"__name__ 3": true,
|
||||
"instance 1": true,
|
||||
"instance 2": true,
|
||||
"instance 3": true,
|
||||
"job": true,
|
||||
"job 1": true,
|
||||
"job 2": true,
|
||||
"job 3": true
|
||||
"engine 1": true,
|
||||
"engine 2": true
|
||||
},
|
||||
"indexByName": {
|
||||
"Database": 0,
|
||||
"Instance": 1,
|
||||
"Engine": 2,
|
||||
"RPO": 3,
|
||||
"Size": 4
|
||||
},
|
||||
"indexByName": {},
|
||||
"renameByName": {
|
||||
"Value #RPO": "RPO",
|
||||
"Value #Size": "Size",
|
||||
"Value #Status": "Status",
|
||||
"database": "Database",
|
||||
"instance": "Instance"
|
||||
"instance": "Instance",
|
||||
"engine": "Engine"
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -1275,7 +1266,7 @@
|
||||
"query": "label_values(dbbackup_rpo_seconds, instance)",
|
||||
"refId": "StandardVariableQuery"
|
||||
},
|
||||
"refresh": 1,
|
||||
"refresh": 2,
|
||||
"regex": "",
|
||||
"skipUrlSync": false,
|
||||
"sort": 1,
|
||||
|
||||
@ -1372,6 +1372,27 @@ func (e *Engine) executeCommand(ctx context.Context, cmdArgs []string, outputFil
|
||||
// NO GO BUFFERING - pg_dump writes directly to disk
|
||||
cmd := exec.CommandContext(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
|
||||
// Start heartbeat ticker for backup progress
|
||||
backupStart := time.Now()
|
||||
heartbeatCtx, cancelHeartbeat := context.WithCancel(ctx)
|
||||
heartbeatTicker := time.NewTicker(5 * time.Second)
|
||||
defer heartbeatTicker.Stop()
|
||||
defer cancelHeartbeat()
|
||||
|
||||
go func() {
|
||||
for {
|
||||
select {
|
||||
case <-heartbeatTicker.C:
|
||||
elapsed := time.Since(backupStart)
|
||||
if e.progress != nil {
|
||||
e.progress.Update(fmt.Sprintf("Backing up database... (elapsed: %s)", formatDuration(elapsed)))
|
||||
}
|
||||
case <-heartbeatCtx.Done():
|
||||
return
|
||||
}
|
||||
}
|
||||
}()
|
||||
|
||||
// Set environment variables for database tools
|
||||
cmd.Env = os.Environ()
|
||||
if e.cfg.Password != "" {
|
||||
@ -1598,3 +1619,23 @@ func formatBytes(bytes int64) string {
|
||||
}
|
||||
return fmt.Sprintf("%.1f %cB", float64(bytes)/float64(div), "KMGTPE"[exp])
|
||||
}
|
||||
|
||||
// formatDuration formats a duration to human readable format (e.g., "3m 45s", "1h 23m", "45s")
|
||||
func formatDuration(d time.Duration) string {
|
||||
if d < time.Second {
|
||||
return "0s"
|
||||
}
|
||||
|
||||
hours := int(d.Hours())
|
||||
minutes := int(d.Minutes()) % 60
|
||||
seconds := int(d.Seconds()) % 60
|
||||
|
||||
if hours > 0 {
|
||||
return fmt.Sprintf("%dh %dm", hours, minutes)
|
||||
}
|
||||
if minutes > 0 {
|
||||
return fmt.Sprintf("%dm %ds", minutes, seconds)
|
||||
}
|
||||
return fmt.Sprintf("%ds", seconds)
|
||||
}
|
||||
|
||||
|
||||
@ -36,9 +36,14 @@ type Config struct {
|
||||
AutoDetectCores bool
|
||||
CPUWorkloadType string // "cpu-intensive", "io-intensive", "balanced"
|
||||
|
||||
// Resource profile for backup/restore operations
|
||||
ResourceProfile string // "conservative", "balanced", "performance", "max-performance"
|
||||
LargeDBMode bool // Enable large database mode (reduces parallelism, increases max_locks)
|
||||
|
||||
// CPU detection
|
||||
CPUDetector *cpu.Detector
|
||||
CPUInfo *cpu.CPUInfo
|
||||
MemoryInfo *cpu.MemoryInfo // System memory information
|
||||
|
||||
// Sample backup options
|
||||
SampleStrategy string // "ratio", "percent", "count"
|
||||
@ -178,6 +183,13 @@ func New() *Config {
|
||||
sslMode = ""
|
||||
}
|
||||
|
||||
// Detect memory information
|
||||
memInfo, _ := cpu.DetectMemory()
|
||||
|
||||
// Determine recommended resource profile
|
||||
recommendedProfile := cpu.RecommendProfile(cpuInfo, memInfo, false)
|
||||
defaultProfile := getEnvString("RESOURCE_PROFILE", recommendedProfile.Name)
|
||||
|
||||
cfg := &Config{
|
||||
// Database defaults
|
||||
Host: host,
|
||||
@ -189,18 +201,21 @@ func New() *Config {
|
||||
SSLMode: sslMode,
|
||||
Insecure: getEnvBool("INSECURE", false),
|
||||
|
||||
// Backup defaults
|
||||
// Backup defaults - use recommended profile's settings for small VMs
|
||||
BackupDir: backupDir,
|
||||
CompressionLevel: getEnvInt("COMPRESS_LEVEL", 6),
|
||||
Jobs: getEnvInt("JOBS", getDefaultJobs(cpuInfo)),
|
||||
DumpJobs: getEnvInt("DUMP_JOBS", getDefaultDumpJobs(cpuInfo)),
|
||||
Jobs: getEnvInt("JOBS", recommendedProfile.Jobs),
|
||||
DumpJobs: getEnvInt("DUMP_JOBS", recommendedProfile.DumpJobs),
|
||||
MaxCores: getEnvInt("MAX_CORES", getDefaultMaxCores(cpuInfo)),
|
||||
AutoDetectCores: getEnvBool("AUTO_DETECT_CORES", true),
|
||||
CPUWorkloadType: getEnvString("CPU_WORKLOAD_TYPE", "balanced"),
|
||||
ResourceProfile: defaultProfile,
|
||||
LargeDBMode: getEnvBool("LARGE_DB_MODE", false),
|
||||
|
||||
// CPU detection
|
||||
// CPU and memory detection
|
||||
CPUDetector: cpuDetector,
|
||||
CPUInfo: cpuInfo,
|
||||
MemoryInfo: memInfo,
|
||||
|
||||
// Sample backup defaults
|
||||
SampleStrategy: getEnvString("SAMPLE_STRATEGY", "ratio"),
|
||||
@ -220,8 +235,8 @@ func New() *Config {
|
||||
// Timeouts - default 24 hours (1440 min) to handle very large databases with large objects
|
||||
ClusterTimeoutMinutes: getEnvInt("CLUSTER_TIMEOUT_MIN", 1440),
|
||||
|
||||
// Cluster parallelism (default: 2 concurrent operations for faster cluster backup/restore)
|
||||
ClusterParallelism: getEnvInt("CLUSTER_PARALLELISM", 2),
|
||||
// Cluster parallelism - use recommended profile's setting for small VMs
|
||||
ClusterParallelism: getEnvInt("CLUSTER_PARALLELISM", recommendedProfile.ClusterParallelism),
|
||||
|
||||
// Working directory for large operations (default: system temp)
|
||||
WorkDir: getEnvString("WORK_DIR", ""),
|
||||
@ -409,6 +424,62 @@ func (c *Config) OptimizeForCPU() error {
|
||||
return nil
|
||||
}
|
||||
|
||||
// ApplyResourceProfile applies a resource profile to the configuration
|
||||
// This adjusts parallelism settings based on the chosen profile
|
||||
func (c *Config) ApplyResourceProfile(profileName string) error {
|
||||
profile := cpu.GetProfileByName(profileName)
|
||||
if profile == nil {
|
||||
return &ConfigError{
|
||||
Field: "resource_profile",
|
||||
Value: profileName,
|
||||
Message: "unknown profile. Valid profiles: conservative, balanced, performance, max-performance",
|
||||
}
|
||||
}
|
||||
|
||||
// Validate profile against current system
|
||||
isValid, warnings := cpu.ValidateProfileForSystem(profile, c.CPUInfo, c.MemoryInfo)
|
||||
if !isValid {
|
||||
// Log warnings but don't block - user may know what they're doing
|
||||
_ = warnings // In production, log these warnings
|
||||
}
|
||||
|
||||
// Apply profile settings
|
||||
c.ResourceProfile = profile.Name
|
||||
|
||||
// If LargeDBMode is enabled, apply its modifiers
|
||||
if c.LargeDBMode {
|
||||
profile = cpu.ApplyLargeDBMode(profile)
|
||||
}
|
||||
|
||||
c.ClusterParallelism = profile.ClusterParallelism
|
||||
c.Jobs = profile.Jobs
|
||||
c.DumpJobs = profile.DumpJobs
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// GetResourceProfileRecommendation returns the recommended profile and reason
|
||||
func (c *Config) GetResourceProfileRecommendation(isLargeDB bool) (string, string) {
|
||||
profile, reason := cpu.RecommendProfileWithReason(c.CPUInfo, c.MemoryInfo, isLargeDB)
|
||||
return profile.Name, reason
|
||||
}
|
||||
|
||||
// GetCurrentProfile returns the current resource profile details
|
||||
// If LargeDBMode is enabled, returns a modified profile with reduced parallelism
|
||||
func (c *Config) GetCurrentProfile() *cpu.ResourceProfile {
|
||||
profile := cpu.GetProfileByName(c.ResourceProfile)
|
||||
if profile == nil {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Apply LargeDBMode modifier if enabled
|
||||
if c.LargeDBMode {
|
||||
return cpu.ApplyLargeDBMode(profile)
|
||||
}
|
||||
|
||||
return profile
|
||||
}
|
||||
|
||||
// GetCPUInfo returns CPU information, detecting if necessary
|
||||
func (c *Config) GetCPUInfo() (*cpu.CPUInfo, error) {
|
||||
if c.CPUInfo != nil {
|
||||
|
||||
@ -28,9 +28,11 @@ type LocalConfig struct {
|
||||
DumpJobs int
|
||||
|
||||
// Performance settings
|
||||
CPUWorkload string
|
||||
MaxCores int
|
||||
ClusterTimeout int // Cluster operation timeout in minutes (default: 1440 = 24 hours)
|
||||
CPUWorkload string
|
||||
MaxCores int
|
||||
ClusterTimeout int // Cluster operation timeout in minutes (default: 1440 = 24 hours)
|
||||
ResourceProfile string
|
||||
LargeDBMode bool // Enable large database mode (reduces parallelism, increases locks)
|
||||
|
||||
// Security settings
|
||||
RetentionDays int
|
||||
@ -126,6 +128,10 @@ func LoadLocalConfig() (*LocalConfig, error) {
|
||||
if ct, err := strconv.Atoi(value); err == nil {
|
||||
cfg.ClusterTimeout = ct
|
||||
}
|
||||
case "resource_profile":
|
||||
cfg.ResourceProfile = value
|
||||
case "large_db_mode":
|
||||
cfg.LargeDBMode = value == "true" || value == "1"
|
||||
}
|
||||
case "security":
|
||||
switch key {
|
||||
@ -207,6 +213,12 @@ func SaveLocalConfig(cfg *LocalConfig) error {
|
||||
if cfg.ClusterTimeout != 0 {
|
||||
sb.WriteString(fmt.Sprintf("cluster_timeout = %d\n", cfg.ClusterTimeout))
|
||||
}
|
||||
if cfg.ResourceProfile != "" {
|
||||
sb.WriteString(fmt.Sprintf("resource_profile = %s\n", cfg.ResourceProfile))
|
||||
}
|
||||
if cfg.LargeDBMode {
|
||||
sb.WriteString("large_db_mode = true\n")
|
||||
}
|
||||
sb.WriteString("\n")
|
||||
|
||||
// Security section
|
||||
@ -280,6 +292,14 @@ func ApplyLocalConfig(cfg *Config, local *LocalConfig) {
|
||||
if local.ClusterTimeout != 0 {
|
||||
cfg.ClusterTimeoutMinutes = local.ClusterTimeout
|
||||
}
|
||||
// Apply resource profile settings
|
||||
if local.ResourceProfile != "" {
|
||||
cfg.ResourceProfile = local.ResourceProfile
|
||||
}
|
||||
// LargeDBMode is a boolean - apply if true in config
|
||||
if local.LargeDBMode {
|
||||
cfg.LargeDBMode = true
|
||||
}
|
||||
if cfg.RetentionDays == 30 && local.RetentionDays != 0 {
|
||||
cfg.RetentionDays = local.RetentionDays
|
||||
}
|
||||
@ -294,22 +314,24 @@ func ApplyLocalConfig(cfg *Config, local *LocalConfig) {
|
||||
// ConfigFromConfig creates a LocalConfig from a Config
|
||||
func ConfigFromConfig(cfg *Config) *LocalConfig {
|
||||
return &LocalConfig{
|
||||
DBType: cfg.DatabaseType,
|
||||
Host: cfg.Host,
|
||||
Port: cfg.Port,
|
||||
User: cfg.User,
|
||||
Database: cfg.Database,
|
||||
SSLMode: cfg.SSLMode,
|
||||
BackupDir: cfg.BackupDir,
|
||||
WorkDir: cfg.WorkDir,
|
||||
Compression: cfg.CompressionLevel,
|
||||
Jobs: cfg.Jobs,
|
||||
DumpJobs: cfg.DumpJobs,
|
||||
CPUWorkload: cfg.CPUWorkloadType,
|
||||
MaxCores: cfg.MaxCores,
|
||||
ClusterTimeout: cfg.ClusterTimeoutMinutes,
|
||||
RetentionDays: cfg.RetentionDays,
|
||||
MinBackups: cfg.MinBackups,
|
||||
MaxRetries: cfg.MaxRetries,
|
||||
DBType: cfg.DatabaseType,
|
||||
Host: cfg.Host,
|
||||
Port: cfg.Port,
|
||||
User: cfg.User,
|
||||
Database: cfg.Database,
|
||||
SSLMode: cfg.SSLMode,
|
||||
BackupDir: cfg.BackupDir,
|
||||
WorkDir: cfg.WorkDir,
|
||||
Compression: cfg.CompressionLevel,
|
||||
Jobs: cfg.Jobs,
|
||||
DumpJobs: cfg.DumpJobs,
|
||||
CPUWorkload: cfg.CPUWorkloadType,
|
||||
MaxCores: cfg.MaxCores,
|
||||
ClusterTimeout: cfg.ClusterTimeoutMinutes,
|
||||
ResourceProfile: cfg.ResourceProfile,
|
||||
LargeDBMode: cfg.LargeDBMode,
|
||||
RetentionDays: cfg.RetentionDays,
|
||||
MinBackups: cfg.MinBackups,
|
||||
MaxRetries: cfg.MaxRetries,
|
||||
}
|
||||
}
|
||||
|
||||
128
internal/config/profile.go
Normal file
@ -0,0 +1,128 @@
|
||||
package config
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"strings"
|
||||
)
|
||||
|
||||
// RestoreProfile defines resource settings for restore operations
|
||||
type RestoreProfile struct {
|
||||
Name string
|
||||
ParallelDBs int // Number of databases to restore in parallel
|
||||
Jobs int // Parallel decompression jobs
|
||||
DisableProgress bool // Disable progress indicators to reduce overhead
|
||||
MemoryConservative bool // Use memory-conservative settings
|
||||
}
|
||||
|
||||
// GetRestoreProfile returns the profile settings for a given profile name
|
||||
func GetRestoreProfile(profileName string) (*RestoreProfile, error) {
|
||||
profileName = strings.ToLower(strings.TrimSpace(profileName))
|
||||
|
||||
switch profileName {
|
||||
case "conservative":
|
||||
return &RestoreProfile{
|
||||
Name: "conservative",
|
||||
ParallelDBs: 1, // Single-threaded restore
|
||||
Jobs: 1, // Single-threaded decompression
|
||||
DisableProgress: false,
|
||||
MemoryConservative: true,
|
||||
}, nil
|
||||
|
||||
case "balanced", "":
|
||||
return &RestoreProfile{
|
||||
Name: "balanced",
|
||||
ParallelDBs: 0, // Use config default or auto-detect
|
||||
Jobs: 0, // Use config default or auto-detect
|
||||
DisableProgress: false,
|
||||
MemoryConservative: false,
|
||||
}, nil
|
||||
|
||||
case "aggressive", "performance", "max":
|
||||
return &RestoreProfile{
|
||||
Name: "aggressive",
|
||||
ParallelDBs: -1, // Auto-detect based on resources
|
||||
Jobs: -1, // Auto-detect based on CPU
|
||||
DisableProgress: false,
|
||||
MemoryConservative: false,
|
||||
}, nil
|
||||
|
||||
case "potato":
|
||||
// Easter egg: same as conservative but with a fun name
|
||||
return &RestoreProfile{
|
||||
Name: "potato",
|
||||
ParallelDBs: 1,
|
||||
Jobs: 1,
|
||||
DisableProgress: false,
|
||||
MemoryConservative: true,
|
||||
}, nil
|
||||
|
||||
default:
|
||||
return nil, fmt.Errorf("unknown profile: %s (valid: conservative, balanced, aggressive)", profileName)
|
||||
}
|
||||
}
|
||||
|
||||
// ApplyProfile applies profile settings to config, respecting explicit user overrides
|
||||
func ApplyProfile(cfg *Config, profileName string, explicitJobs, explicitParallelDBs int) error {
|
||||
profile, err := GetRestoreProfile(profileName)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
// Show profile being used
|
||||
if cfg.Debug {
|
||||
fmt.Printf("Using restore profile: %s\n", profile.Name)
|
||||
if profile.MemoryConservative {
|
||||
fmt.Println("Memory-conservative mode enabled")
|
||||
}
|
||||
}
|
||||
|
||||
// Apply profile settings only if not explicitly overridden
|
||||
if explicitJobs == 0 && profile.Jobs > 0 {
|
||||
cfg.Jobs = profile.Jobs
|
||||
}
|
||||
|
||||
if explicitParallelDBs == 0 && profile.ParallelDBs != 0 {
|
||||
cfg.ClusterParallelism = profile.ParallelDBs
|
||||
}
|
||||
|
||||
// Store profile name
|
||||
cfg.ResourceProfile = profile.Name
|
||||
|
||||
// Conservative profile implies large DB mode settings
|
||||
if profile.MemoryConservative {
|
||||
cfg.LargeDBMode = true
|
||||
}
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// GetProfileDescription returns a human-readable description of the profile
|
||||
func GetProfileDescription(profileName string) string {
|
||||
profile, err := GetRestoreProfile(profileName)
|
||||
if err != nil {
|
||||
return "Unknown profile"
|
||||
}
|
||||
|
||||
switch profile.Name {
|
||||
case "conservative":
|
||||
return "Conservative: --parallel=1, single-threaded, minimal memory usage. Best for resource-constrained servers or when other services are running."
|
||||
case "potato":
|
||||
return "Potato Mode: Same as conservative, for servers running on a potato 🥔"
|
||||
case "balanced":
|
||||
return "Balanced: Auto-detect resources, moderate parallelism. Good default for most scenarios."
|
||||
case "aggressive":
|
||||
return "Aggressive: Maximum parallelism, all available resources. Best for dedicated database servers with ample resources."
|
||||
default:
|
||||
return profile.Name
|
||||
}
|
||||
}
|
||||
|
||||
// ListProfiles returns a list of all available profiles with descriptions
|
||||
func ListProfiles() map[string]string {
|
||||
return map[string]string{
|
||||
"conservative": GetProfileDescription("conservative"),
|
||||
"balanced": GetProfileDescription("balanced"),
|
||||
"aggressive": GetProfileDescription("aggressive"),
|
||||
"potato": GetProfileDescription("potato"),
|
||||
}
|
||||
}
|
||||
475
internal/cpu/profiles.go
Normal file
@ -0,0 +1,475 @@
|
||||
package cpu
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"fmt"
|
||||
"os"
|
||||
"os/exec"
|
||||
"runtime"
|
||||
"strconv"
|
||||
"strings"
|
||||
)
|
||||
|
||||
// MemoryInfo holds system memory information
|
||||
type MemoryInfo struct {
|
||||
TotalBytes int64 `json:"total_bytes"`
|
||||
AvailableBytes int64 `json:"available_bytes"`
|
||||
FreeBytes int64 `json:"free_bytes"`
|
||||
UsedBytes int64 `json:"used_bytes"`
|
||||
SwapTotalBytes int64 `json:"swap_total_bytes"`
|
||||
SwapFreeBytes int64 `json:"swap_free_bytes"`
|
||||
TotalGB int `json:"total_gb"`
|
||||
AvailableGB int `json:"available_gb"`
|
||||
Platform string `json:"platform"`
|
||||
}
|
||||
|
||||
// ResourceProfile defines a resource allocation profile for backup/restore operations
|
||||
type ResourceProfile struct {
|
||||
Name string `json:"name"`
|
||||
Description string `json:"description"`
|
||||
ClusterParallelism int `json:"cluster_parallelism"` // Concurrent databases
|
||||
Jobs int `json:"jobs"` // Parallel jobs within pg_restore
|
||||
DumpJobs int `json:"dump_jobs"` // Parallel jobs for pg_dump
|
||||
MaintenanceWorkMem string `json:"maintenance_work_mem"` // PostgreSQL recommendation
|
||||
MaxLocksPerTxn int `json:"max_locks_per_txn"` // PostgreSQL recommendation
|
||||
RecommendedForLarge bool `json:"recommended_for_large"` // Suitable for large DBs?
|
||||
MinMemoryGB int `json:"min_memory_gb"` // Minimum memory for this profile
|
||||
MinCores int `json:"min_cores"` // Minimum cores for this profile
|
||||
}
|
||||
|
||||
// Predefined resource profiles
|
||||
var (
|
||||
// ProfileConservative - Safe for constrained VMs, avoids shared memory issues
|
||||
ProfileConservative = ResourceProfile{
|
||||
Name: "conservative",
|
||||
Description: "Safe for small VMs (2-4 cores, <16GB). Sequential operations, minimal memory pressure. Best for large DBs on limited hardware.",
|
||||
ClusterParallelism: 1,
|
||||
Jobs: 1,
|
||||
DumpJobs: 2,
|
||||
MaintenanceWorkMem: "256MB",
|
||||
MaxLocksPerTxn: 4096,
|
||||
RecommendedForLarge: true,
|
||||
MinMemoryGB: 4,
|
||||
MinCores: 2,
|
||||
}
|
||||
|
||||
// ProfileBalanced - Default profile, works for most scenarios
|
||||
ProfileBalanced = ResourceProfile{
|
||||
Name: "balanced",
|
||||
Description: "Balanced for medium VMs (4-8 cores, 16-32GB). Moderate parallelism with good safety margin.",
|
||||
ClusterParallelism: 2,
|
||||
Jobs: 2,
|
||||
DumpJobs: 4,
|
||||
MaintenanceWorkMem: "512MB",
|
||||
MaxLocksPerTxn: 2048,
|
||||
RecommendedForLarge: true,
|
||||
MinMemoryGB: 16,
|
||||
MinCores: 4,
|
||||
}
|
||||
|
||||
// ProfilePerformance - Aggressive parallelism for powerful servers
|
||||
ProfilePerformance = ResourceProfile{
|
||||
Name: "performance",
|
||||
Description: "Aggressive for powerful servers (8+ cores, 32GB+). Maximum parallelism for fast operations.",
|
||||
ClusterParallelism: 4,
|
||||
Jobs: 4,
|
||||
DumpJobs: 8,
|
||||
MaintenanceWorkMem: "1GB",
|
||||
MaxLocksPerTxn: 1024,
|
||||
RecommendedForLarge: false, // Large DBs may still need conservative
|
||||
MinMemoryGB: 32,
|
||||
MinCores: 8,
|
||||
}
|
||||
|
||||
// ProfileMaxPerformance - Maximum parallelism for high-end servers
|
||||
ProfileMaxPerformance = ResourceProfile{
|
||||
Name: "max-performance",
|
||||
Description: "Maximum for high-end servers (16+ cores, 64GB+). Full CPU utilization.",
|
||||
ClusterParallelism: 8,
|
||||
Jobs: 8,
|
||||
DumpJobs: 16,
|
||||
MaintenanceWorkMem: "2GB",
|
||||
MaxLocksPerTxn: 512,
|
||||
RecommendedForLarge: false, // Large DBs should use LargeDBMode
|
||||
MinMemoryGB: 64,
|
||||
MinCores: 16,
|
||||
}
|
||||
|
||||
// AllProfiles contains all available profiles (VM resource-based)
|
||||
AllProfiles = []ResourceProfile{
|
||||
ProfileConservative,
|
||||
ProfileBalanced,
|
||||
ProfilePerformance,
|
||||
ProfileMaxPerformance,
|
||||
}
|
||||
)
|
||||
|
||||
// GetProfileByName returns a profile by its name
|
||||
func GetProfileByName(name string) *ResourceProfile {
|
||||
for _, p := range AllProfiles {
|
||||
if strings.EqualFold(p.Name, name) {
|
||||
return &p
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// ApplyLargeDBMode modifies a profile for large database operations.
|
||||
// This is a modifier that reduces parallelism and increases max_locks_per_transaction
|
||||
// to prevent "out of shared memory" errors with large databases (many tables, LOBs, etc.).
|
||||
// It returns a new profile with adjusted settings, leaving the original unchanged.
|
||||
func ApplyLargeDBMode(profile *ResourceProfile) *ResourceProfile {
|
||||
if profile == nil {
|
||||
return nil
|
||||
}
|
||||
|
||||
// Create a copy with adjusted settings
|
||||
modified := *profile
|
||||
|
||||
// Add "(large-db)" suffix to indicate this is modified
|
||||
modified.Name = profile.Name + " +large-db"
|
||||
modified.Description = fmt.Sprintf("%s [LargeDBMode: reduced parallelism, high locks]", profile.Description)
|
||||
|
||||
// Reduce parallelism to avoid lock exhaustion
|
||||
// Rule: halve parallelism, minimum 1
|
||||
modified.ClusterParallelism = max(1, profile.ClusterParallelism/2)
|
||||
modified.Jobs = max(1, profile.Jobs/2)
|
||||
modified.DumpJobs = max(2, profile.DumpJobs/2)
|
||||
|
||||
// Force high max_locks_per_transaction for large schemas
|
||||
modified.MaxLocksPerTxn = 8192
|
||||
|
||||
// Increase maintenance_work_mem for complex operations
|
||||
// Keep or boost maintenance work mem
|
||||
modified.MaintenanceWorkMem = "1GB"
|
||||
if profile.MinMemoryGB >= 32 {
|
||||
modified.MaintenanceWorkMem = "2GB"
|
||||
}
|
||||
|
||||
modified.RecommendedForLarge = true
|
||||
|
||||
return &modified
|
||||
}
|
||||
|
||||
// max returns the larger of two integers
|
||||
func max(a, b int) int {
|
||||
if a > b {
|
||||
return a
|
||||
}
|
||||
return b
|
||||
}
|
||||
|
||||
// DetectMemory detects system memory information
|
||||
func DetectMemory() (*MemoryInfo, error) {
|
||||
info := &MemoryInfo{
|
||||
Platform: runtime.GOOS,
|
||||
}
|
||||
|
||||
switch runtime.GOOS {
|
||||
case "linux":
|
||||
if err := detectLinuxMemory(info); err != nil {
|
||||
return info, fmt.Errorf("linux memory detection failed: %w", err)
|
||||
}
|
||||
case "darwin":
|
||||
if err := detectDarwinMemory(info); err != nil {
|
||||
return info, fmt.Errorf("darwin memory detection failed: %w", err)
|
||||
}
|
||||
case "windows":
|
||||
if err := detectWindowsMemory(info); err != nil {
|
||||
return info, fmt.Errorf("windows memory detection failed: %w", err)
|
||||
}
|
||||
default:
|
||||
// Fallback: use Go runtime memory stats
|
||||
var memStats runtime.MemStats
|
||||
runtime.ReadMemStats(&memStats)
|
||||
info.TotalBytes = int64(memStats.Sys)
|
||||
info.AvailableBytes = int64(memStats.Sys - memStats.Alloc)
|
||||
}
|
||||
|
||||
// Calculate GB values
|
||||
info.TotalGB = int(info.TotalBytes / (1024 * 1024 * 1024))
|
||||
info.AvailableGB = int(info.AvailableBytes / (1024 * 1024 * 1024))
|
||||
|
||||
return info, nil
|
||||
}
|
||||
|
||||
// detectLinuxMemory reads memory info from /proc/meminfo
|
||||
func detectLinuxMemory(info *MemoryInfo) error {
|
||||
file, err := os.Open("/proc/meminfo")
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
scanner := bufio.NewScanner(file)
|
||||
for scanner.Scan() {
|
||||
line := scanner.Text()
|
||||
parts := strings.Fields(line)
|
||||
if len(parts) < 2 {
|
||||
continue
|
||||
}
|
||||
|
||||
key := strings.TrimSuffix(parts[0], ":")
|
||||
value, err := strconv.ParseInt(parts[1], 10, 64)
|
||||
if err != nil {
|
||||
continue
|
||||
}
|
||||
|
||||
// Values are in kB
|
||||
valueBytes := value * 1024
|
||||
|
||||
switch key {
|
||||
case "MemTotal":
|
||||
info.TotalBytes = valueBytes
|
||||
case "MemAvailable":
|
||||
info.AvailableBytes = valueBytes
|
||||
case "MemFree":
|
||||
info.FreeBytes = valueBytes
|
||||
case "SwapTotal":
|
||||
info.SwapTotalBytes = valueBytes
|
||||
case "SwapFree":
|
||||
info.SwapFreeBytes = valueBytes
|
||||
}
|
||||
}
|
||||
|
||||
info.UsedBytes = info.TotalBytes - info.AvailableBytes
|
||||
|
||||
return scanner.Err()
|
||||
}
|
||||
|
||||
// detectDarwinMemory detects memory on macOS
|
||||
func detectDarwinMemory(info *MemoryInfo) error {
|
||||
// Use sysctl for total memory
|
||||
if output, err := runCommand("sysctl", "-n", "hw.memsize"); err == nil {
|
||||
if val, err := strconv.ParseInt(strings.TrimSpace(output), 10, 64); err == nil {
|
||||
info.TotalBytes = val
|
||||
}
|
||||
}
|
||||
|
||||
// Use vm_stat for available memory (more complex parsing required)
|
||||
if output, err := runCommand("vm_stat"); err == nil {
|
||||
pageSize := int64(4096) // Default page size
|
||||
var freePages, inactivePages int64
|
||||
|
||||
lines := strings.Split(output, "\n")
|
||||
for _, line := range lines {
|
||||
if strings.Contains(line, "page size of") {
|
||||
parts := strings.Fields(line)
|
||||
for i, p := range parts {
|
||||
if p == "of" && i+1 < len(parts) {
|
||||
if ps, err := strconv.ParseInt(parts[i+1], 10, 64); err == nil {
|
||||
pageSize = ps
|
||||
}
|
||||
}
|
||||
}
|
||||
} else if strings.Contains(line, "Pages free:") {
|
||||
val := extractNumberFromLine(line)
|
||||
freePages = val
|
||||
} else if strings.Contains(line, "Pages inactive:") {
|
||||
val := extractNumberFromLine(line)
|
||||
inactivePages = val
|
||||
}
|
||||
}
|
||||
|
||||
info.FreeBytes = freePages * pageSize
|
||||
info.AvailableBytes = (freePages + inactivePages) * pageSize
|
||||
}
|
||||
|
||||
info.UsedBytes = info.TotalBytes - info.AvailableBytes
|
||||
return nil
|
||||
}
|
||||
|
||||
// detectWindowsMemory detects memory on Windows
|
||||
func detectWindowsMemory(info *MemoryInfo) error {
|
||||
// Use wmic for memory info
|
||||
if output, err := runCommand("wmic", "OS", "get", "TotalVisibleMemorySize,FreePhysicalMemory", "/format:list"); err == nil {
|
||||
lines := strings.Split(output, "\n")
|
||||
for _, line := range lines {
|
||||
line = strings.TrimSpace(line)
|
||||
if strings.HasPrefix(line, "TotalVisibleMemorySize=") {
|
||||
val := strings.TrimPrefix(line, "TotalVisibleMemorySize=")
|
||||
if v, err := strconv.ParseInt(strings.TrimSpace(val), 10, 64); err == nil {
|
||||
info.TotalBytes = v * 1024 // KB to bytes
|
||||
}
|
||||
} else if strings.HasPrefix(line, "FreePhysicalMemory=") {
|
||||
val := strings.TrimPrefix(line, "FreePhysicalMemory=")
|
||||
if v, err := strconv.ParseInt(strings.TrimSpace(val), 10, 64); err == nil {
|
||||
info.FreeBytes = v * 1024
|
||||
info.AvailableBytes = v * 1024
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
info.UsedBytes = info.TotalBytes - info.AvailableBytes
|
||||
return nil
|
||||
}
|
||||
|
||||
// RecommendProfile recommends a resource profile based on system resources and workload
|
||||
func RecommendProfile(cpuInfo *CPUInfo, memInfo *MemoryInfo, isLargeDB bool) *ResourceProfile {
|
||||
cores := 0
|
||||
if cpuInfo != nil {
|
||||
cores = cpuInfo.PhysicalCores
|
||||
if cores == 0 {
|
||||
cores = cpuInfo.LogicalCores
|
||||
}
|
||||
}
|
||||
if cores == 0 {
|
||||
cores = runtime.NumCPU()
|
||||
}
|
||||
|
||||
memGB := 0
|
||||
if memInfo != nil {
|
||||
memGB = memInfo.TotalGB
|
||||
}
|
||||
|
||||
// Special case: large databases should use conservative profile
|
||||
// The caller should also enable LargeDBMode for increased MaxLocksPerTxn
|
||||
if isLargeDB {
|
||||
// For large DBs, recommend conservative regardless of resources
|
||||
// LargeDBMode flag will handle the lock settings separately
|
||||
return &ProfileConservative
|
||||
}
|
||||
|
||||
// Resource-based selection
|
||||
if cores >= 16 && memGB >= 64 {
|
||||
return &ProfileMaxPerformance
|
||||
} else if cores >= 8 && memGB >= 32 {
|
||||
return &ProfilePerformance
|
||||
} else if cores >= 4 && memGB >= 16 {
|
||||
return &ProfileBalanced
|
||||
}
|
||||
|
||||
// Default to conservative for constrained systems
|
||||
return &ProfileConservative
|
||||
}
|
||||
|
||||
// RecommendProfileWithReason returns a profile recommendation with explanation
|
||||
func RecommendProfileWithReason(cpuInfo *CPUInfo, memInfo *MemoryInfo, isLargeDB bool) (*ResourceProfile, string) {
|
||||
cores := 0
|
||||
if cpuInfo != nil {
|
||||
cores = cpuInfo.PhysicalCores
|
||||
if cores == 0 {
|
||||
cores = cpuInfo.LogicalCores
|
||||
}
|
||||
}
|
||||
if cores == 0 {
|
||||
cores = runtime.NumCPU()
|
||||
}
|
||||
|
||||
memGB := 0
|
||||
if memInfo != nil {
|
||||
memGB = memInfo.TotalGB
|
||||
}
|
||||
|
||||
// Build reason string
|
||||
var reason strings.Builder
|
||||
reason.WriteString(fmt.Sprintf("System: %d cores, %dGB RAM. ", cores, memGB))
|
||||
|
||||
profile := RecommendProfile(cpuInfo, memInfo, isLargeDB)
|
||||
|
||||
if isLargeDB {
|
||||
reason.WriteString("Large database mode - using conservative settings. Enable LargeDBMode for higher max_locks.")
|
||||
} else if profile.Name == "conservative" {
|
||||
reason.WriteString("Limited resources detected - using conservative profile for stability.")
|
||||
} else if profile.Name == "max-performance" {
|
||||
reason.WriteString("High-end server detected - using maximum parallelism.")
|
||||
} else if profile.Name == "performance" {
|
||||
reason.WriteString("Good resources detected - using performance profile.")
|
||||
} else {
|
||||
reason.WriteString("Using balanced profile for optimal performance/stability trade-off.")
|
||||
}
|
||||
|
||||
return profile, reason.String()
|
||||
}
|
||||
|
||||
// ValidateProfileForSystem checks if a profile is suitable for the current system
|
||||
func ValidateProfileForSystem(profile *ResourceProfile, cpuInfo *CPUInfo, memInfo *MemoryInfo) (bool, []string) {
|
||||
var warnings []string
|
||||
|
||||
cores := 0
|
||||
if cpuInfo != nil {
|
||||
cores = cpuInfo.PhysicalCores
|
||||
if cores == 0 {
|
||||
cores = cpuInfo.LogicalCores
|
||||
}
|
||||
}
|
||||
if cores == 0 {
|
||||
cores = runtime.NumCPU()
|
||||
}
|
||||
|
||||
memGB := 0
|
||||
if memInfo != nil {
|
||||
memGB = memInfo.TotalGB
|
||||
}
|
||||
|
||||
// Check minimum requirements
|
||||
if cores < profile.MinCores {
|
||||
warnings = append(warnings,
|
||||
fmt.Sprintf("Profile '%s' recommends %d+ cores (system has %d)", profile.Name, profile.MinCores, cores))
|
||||
}
|
||||
|
||||
if memGB < profile.MinMemoryGB {
|
||||
warnings = append(warnings,
|
||||
fmt.Sprintf("Profile '%s' recommends %dGB+ RAM (system has %dGB)", profile.Name, profile.MinMemoryGB, memGB))
|
||||
}
|
||||
|
||||
// Check for potential issues
|
||||
if profile.ClusterParallelism > cores {
|
||||
warnings = append(warnings,
|
||||
fmt.Sprintf("Cluster parallelism (%d) exceeds CPU cores (%d) - may cause contention",
|
||||
profile.ClusterParallelism, cores))
|
||||
}
|
||||
|
||||
// Memory pressure warning
|
||||
memPerWorker := 2 // Rough estimate: 2GB per parallel worker for large DB operations
|
||||
requiredMem := profile.ClusterParallelism * profile.Jobs * memPerWorker
|
||||
if memGB > 0 && requiredMem > memGB {
|
||||
warnings = append(warnings,
|
||||
fmt.Sprintf("High parallelism may require ~%dGB RAM (system has %dGB) - risk of OOM",
|
||||
requiredMem, memGB))
|
||||
}
|
||||
|
||||
return len(warnings) == 0, warnings
|
||||
}
|
||||
|
||||
// FormatProfileSummary returns a formatted summary of a profile
|
||||
func (p *ResourceProfile) FormatProfileSummary() string {
|
||||
return fmt.Sprintf("[%s] Parallel: %d DBs, %d jobs | Recommended for large DBs: %v",
|
||||
strings.ToUpper(p.Name),
|
||||
p.ClusterParallelism,
|
||||
p.Jobs,
|
||||
p.RecommendedForLarge)
|
||||
}
|
||||
|
||||
// PostgreSQLRecommendations returns PostgreSQL configuration recommendations for this profile
|
||||
func (p *ResourceProfile) PostgreSQLRecommendations() []string {
|
||||
return []string{
|
||||
fmt.Sprintf("ALTER SYSTEM SET max_locks_per_transaction = %d;", p.MaxLocksPerTxn),
|
||||
fmt.Sprintf("ALTER SYSTEM SET maintenance_work_mem = '%s';", p.MaintenanceWorkMem),
|
||||
"-- Restart PostgreSQL after changes to max_locks_per_transaction",
|
||||
}
|
||||
}
|
||||
|
||||
// Helper functions
|
||||
|
||||
func runCommand(name string, args ...string) (string, error) {
|
||||
cmd := exec.Command(name, args...)
|
||||
output, err := cmd.Output()
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
return string(output), nil
|
||||
}
|
||||
|
||||
func extractNumberFromLine(line string) int64 {
|
||||
// Extract number before the period at end (e.g., "Pages free: 123456.")
|
||||
parts := strings.Fields(line)
|
||||
for _, p := range parts {
|
||||
p = strings.TrimSuffix(p, ".")
|
||||
if val, err := strconv.ParseInt(p, 10, 64); err == nil && val > 0 {
|
||||
return val
|
||||
}
|
||||
}
|
||||
return 0
|
||||
}
|
||||
@ -146,7 +146,7 @@ func (d *Dots) Start(message string) {
|
||||
fmt.Fprint(d.writer, message)
|
||||
|
||||
go func() {
|
||||
ticker := time.NewTicker(500 * time.Millisecond)
|
||||
ticker := time.NewTicker(100 * time.Millisecond)
|
||||
defer ticker.Stop()
|
||||
|
||||
count := 0
|
||||
|
||||
@ -50,6 +50,7 @@ type Engine struct {
|
||||
progress progress.Indicator
|
||||
detailedReporter *progress.DetailedReporter
|
||||
dryRun bool
|
||||
silentMode bool // Suppress stdout output (for TUI mode)
|
||||
debugLogPath string // Path to save debug log on error
|
||||
|
||||
// TUI progress callback for detailed progress reporting
|
||||
@ -86,6 +87,7 @@ func NewSilent(cfg *config.Config, log logger.Logger, db database.Database) *Eng
|
||||
progress: progressIndicator,
|
||||
detailedReporter: detailedReporter,
|
||||
dryRun: false,
|
||||
silentMode: true, // Suppress stdout for TUI
|
||||
}
|
||||
}
|
||||
|
||||
@ -290,6 +292,25 @@ func (e *Engine) restorePostgreSQLDump(ctx context.Context, archivePath, targetD
|
||||
|
||||
cmd := e.db.BuildRestoreCommand(targetDB, archivePath, opts)
|
||||
|
||||
// Start heartbeat ticker for restore progress
|
||||
restoreStart := time.Now()
|
||||
heartbeatCtx, cancelHeartbeat := context.WithCancel(ctx)
|
||||
heartbeatTicker := time.NewTicker(5 * time.Second)
|
||||
defer heartbeatTicker.Stop()
|
||||
defer cancelHeartbeat()
|
||||
|
||||
go func() {
|
||||
for {
|
||||
select {
|
||||
case <-heartbeatTicker.C:
|
||||
elapsed := time.Since(restoreStart)
|
||||
e.progress.Update(fmt.Sprintf("Restoring %s... (elapsed: %s)", targetDB, formatDuration(elapsed)))
|
||||
case <-heartbeatCtx.Done():
|
||||
return
|
||||
}
|
||||
}
|
||||
}()
|
||||
|
||||
if compressed {
|
||||
// For compressed dumps, decompress first
|
||||
return e.executeRestoreWithDecompression(ctx, archivePath, cmd)
|
||||
@ -818,8 +839,99 @@ func (e *Engine) previewRestore(archivePath, targetDB string, format ArchiveForm
|
||||
return nil
|
||||
}
|
||||
|
||||
// RestoreSingleFromCluster extracts and restores a single database from a cluster backup
|
||||
func (e *Engine) RestoreSingleFromCluster(ctx context.Context, clusterArchivePath, dbName, targetDB string, cleanFirst, createIfMissing bool) error {
|
||||
operation := e.log.StartOperation("Single Database Restore from Cluster")
|
||||
|
||||
// Validate and sanitize archive path
|
||||
validArchivePath, pathErr := security.ValidateArchivePath(clusterArchivePath)
|
||||
if pathErr != nil {
|
||||
operation.Fail(fmt.Sprintf("Invalid archive path: %v", pathErr))
|
||||
return fmt.Errorf("invalid archive path: %w", pathErr)
|
||||
}
|
||||
clusterArchivePath = validArchivePath
|
||||
|
||||
// Validate archive exists
|
||||
if _, err := os.Stat(clusterArchivePath); os.IsNotExist(err) {
|
||||
operation.Fail("Archive not found")
|
||||
return fmt.Errorf("archive not found: %s", clusterArchivePath)
|
||||
}
|
||||
|
||||
// Verify it's a cluster archive
|
||||
format := DetectArchiveFormat(clusterArchivePath)
|
||||
if format != FormatClusterTarGz {
|
||||
operation.Fail("Not a cluster archive")
|
||||
return fmt.Errorf("not a cluster archive: %s (format: %s)", clusterArchivePath, format)
|
||||
}
|
||||
|
||||
// Create temporary directory for extraction
|
||||
workDir := e.cfg.GetEffectiveWorkDir()
|
||||
tempDir := filepath.Join(workDir, fmt.Sprintf(".extract_%d", time.Now().Unix()))
|
||||
if err := os.MkdirAll(tempDir, 0755); err != nil {
|
||||
operation.Fail("Failed to create temporary directory")
|
||||
return fmt.Errorf("failed to create temp directory: %w", err)
|
||||
}
|
||||
defer os.RemoveAll(tempDir)
|
||||
|
||||
// Extract the specific database from cluster archive
|
||||
e.log.Info("Extracting database from cluster backup", "database", dbName, "cluster", filepath.Base(clusterArchivePath))
|
||||
e.progress.Start(fmt.Sprintf("Extracting '%s' from cluster backup", dbName))
|
||||
|
||||
extractedPath, err := ExtractDatabaseFromCluster(ctx, clusterArchivePath, dbName, tempDir, e.log, e.progress)
|
||||
if err != nil {
|
||||
e.progress.Fail(fmt.Sprintf("Extraction failed: %v", err))
|
||||
operation.Fail(fmt.Sprintf("Extraction failed: %v", err))
|
||||
return fmt.Errorf("failed to extract database: %w", err)
|
||||
}
|
||||
|
||||
e.progress.Update(fmt.Sprintf("Extracted: %s", filepath.Base(extractedPath)))
|
||||
e.log.Info("Database extracted successfully", "path", extractedPath)
|
||||
|
||||
// Now restore the extracted database file
|
||||
e.progress.Update("Restoring database...")
|
||||
|
||||
// Create database if requested and it doesn't exist
|
||||
if createIfMissing {
|
||||
e.log.Info("Checking if target database exists", "database", targetDB)
|
||||
if err := e.ensureDatabaseExists(ctx, targetDB); err != nil {
|
||||
operation.Fail(fmt.Sprintf("Failed to create database: %v", err))
|
||||
return fmt.Errorf("failed to create database '%s': %w", targetDB, err)
|
||||
}
|
||||
}
|
||||
|
||||
// Detect format of extracted file
|
||||
extractedFormat := DetectArchiveFormat(extractedPath)
|
||||
e.log.Info("Restoring extracted database", "format", extractedFormat, "target", targetDB)
|
||||
|
||||
// Restore based on format
|
||||
var restoreErr error
|
||||
switch extractedFormat {
|
||||
case FormatPostgreSQLDump, FormatPostgreSQLDumpGz:
|
||||
restoreErr = e.restorePostgreSQLDump(ctx, extractedPath, targetDB, extractedFormat == FormatPostgreSQLDumpGz, cleanFirst)
|
||||
case FormatPostgreSQLSQL, FormatPostgreSQLSQLGz:
|
||||
restoreErr = e.restorePostgreSQLSQL(ctx, extractedPath, targetDB, extractedFormat == FormatPostgreSQLSQLGz)
|
||||
case FormatMySQLSQL, FormatMySQLSQLGz:
|
||||
restoreErr = e.restoreMySQLSQL(ctx, extractedPath, targetDB, extractedFormat == FormatMySQLSQLGz)
|
||||
default:
|
||||
operation.Fail("Unsupported extracted format")
|
||||
return fmt.Errorf("unsupported extracted format: %s", extractedFormat)
|
||||
}
|
||||
|
||||
if restoreErr != nil {
|
||||
e.progress.Fail(fmt.Sprintf("Restore failed: %v", restoreErr))
|
||||
operation.Fail(fmt.Sprintf("Restore failed: %v", restoreErr))
|
||||
return restoreErr
|
||||
}
|
||||
|
||||
e.progress.Complete(fmt.Sprintf("Database '%s' restored from cluster backup", targetDB))
|
||||
operation.Complete(fmt.Sprintf("Restored '%s' from cluster as '%s'", dbName, targetDB))
|
||||
return nil
|
||||
}
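
Below is a minimal, illustrative sketch (not part of this change) of how the `--database`/`--target` CLI flow described in the changelog might drive this method. The wrapper function, archive path and flag defaults are assumptions; only `NewSilent` and `RestoreSingleFromCluster` come from this diff.

```go
package cmd // hypothetical call site inside the same module

import (
	"context"

	"dbbackup/internal/config"
	"dbbackup/internal/database"
	"dbbackup/internal/logger"
	"dbbackup/internal/restore"
)

// restoreOneFromCluster mirrors:
//   dbbackup restore cluster backup.tar.gz --database myapp --target myapp_test --confirm
func restoreOneFromCluster(ctx context.Context, cfg *config.Config, log logger.Logger, db database.Database) error {
	eng := restore.NewSilent(cfg, log, db) // silent engine variant shown in this diff
	// Restore only "myapp" from the cluster archive, renamed to "myapp_test";
	// create the target database if it does not exist yet.
	return eng.RestoreSingleFromCluster(ctx, "/backups/backup.tar.gz", "myapp", "myapp_test",
		false /* cleanFirst */, true /* createIfMissing */)
}
```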
|
||||
|
||||
// RestoreCluster restores a full cluster from a tar.gz archive
|
||||
func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
||||
// If preExtractedPath is non-empty, uses that directory instead of extracting archivePath
|
||||
// This avoids double extraction when ValidateAndExtractCluster was already called
|
||||
func (e *Engine) RestoreCluster(ctx context.Context, archivePath string, preExtractedPath ...string) error {
|
||||
operation := e.log.StartOperation("Cluster Restore")
|
||||
|
||||
// Validate and sanitize archive path
|
||||
@ -850,22 +962,32 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
||||
return fmt.Errorf("not a cluster archive: %s (detected format: %s)", archivePath, format)
|
||||
}
|
||||
|
||||
// Check disk space before starting restore
|
||||
e.log.Info("Checking disk space for restore")
|
||||
archiveInfo, err := os.Stat(archivePath)
|
||||
if err == nil {
|
||||
spaceCheck := checks.CheckDiskSpaceForRestore(e.cfg.BackupDir, archiveInfo.Size())
|
||||
// Check if we have a pre-extracted directory (optimization to avoid double extraction)
|
||||
// This check must happen BEFORE disk space checks to avoid false failures
|
||||
usingPreExtracted := len(preExtractedPath) > 0 && preExtractedPath[0] != ""
|
||||
|
||||
if spaceCheck.Critical {
|
||||
operation.Fail("Insufficient disk space")
|
||||
return fmt.Errorf("insufficient disk space for restore: %.1f%% used - need at least 4x archive size", spaceCheck.UsedPercent)
|
||||
}
|
||||
// Check disk space before starting restore (skip if using pre-extracted directory)
|
||||
var archiveInfo os.FileInfo
|
||||
var err error
|
||||
if !usingPreExtracted {
|
||||
e.log.Info("Checking disk space for restore")
|
||||
archiveInfo, err = os.Stat(archivePath)
|
||||
if err == nil {
|
||||
spaceCheck := checks.CheckDiskSpaceForRestore(e.cfg.BackupDir, archiveInfo.Size())
|
||||
|
||||
if spaceCheck.Warning {
|
||||
e.log.Warn("Low disk space - restore may fail",
|
||||
"available_gb", float64(spaceCheck.AvailableBytes)/(1024*1024*1024),
|
||||
"used_percent", spaceCheck.UsedPercent)
|
||||
if spaceCheck.Critical {
|
||||
operation.Fail("Insufficient disk space")
|
||||
return fmt.Errorf("insufficient disk space for restore: %.1f%% used - need at least 4x archive size", spaceCheck.UsedPercent)
|
||||
}
|
||||
|
||||
if spaceCheck.Warning {
|
||||
e.log.Warn("Low disk space - restore may fail",
|
||||
"available_gb", float64(spaceCheck.AvailableBytes)/(1024*1024*1024),
|
||||
"used_percent", spaceCheck.UsedPercent)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
e.log.Info("Skipping disk space check (using pre-extracted directory)")
|
||||
}
|
||||
|
||||
if e.dryRun {
|
||||
@ -879,46 +1001,56 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
||||
workDir := e.cfg.GetEffectiveWorkDir()
|
||||
tempDir := filepath.Join(workDir, fmt.Sprintf(".restore_%d", time.Now().Unix()))
|
||||
|
||||
// Check disk space for extraction (need ~3x archive size: compressed + extracted + working space)
|
||||
if archiveInfo != nil {
|
||||
requiredBytes := uint64(archiveInfo.Size()) * 3
|
||||
extractionCheck := checks.CheckDiskSpace(workDir)
|
||||
if extractionCheck.AvailableBytes < requiredBytes {
|
||||
operation.Fail("Insufficient disk space for extraction")
|
||||
return fmt.Errorf("insufficient disk space for extraction in %s: need %.1f GB, have %.1f GB (archive size: %.1f GB × 3)",
|
||||
workDir,
|
||||
float64(requiredBytes)/(1024*1024*1024),
|
||||
float64(extractionCheck.AvailableBytes)/(1024*1024*1024),
|
||||
float64(archiveInfo.Size())/(1024*1024*1024))
|
||||
// Handle pre-extracted directory or extract archive
|
||||
if usingPreExtracted {
|
||||
tempDir = preExtractedPath[0]
|
||||
// Note: Caller handles cleanup of pre-extracted directory
|
||||
e.log.Info("Using pre-extracted cluster directory",
|
||||
"path", tempDir,
|
||||
"optimization", "skipping duplicate extraction")
|
||||
} else {
|
||||
// Check disk space for extraction (need ~3x archive size: compressed + extracted + working space)
|
||||
if archiveInfo != nil {
|
||||
requiredBytes := uint64(archiveInfo.Size()) * 3
|
||||
extractionCheck := checks.CheckDiskSpace(workDir)
|
||||
if extractionCheck.AvailableBytes < requiredBytes {
|
||||
operation.Fail("Insufficient disk space for extraction")
|
||||
return fmt.Errorf("insufficient disk space for extraction in %s: need %.1f GB, have %.1f GB (archive size: %.1f GB × 3)",
|
||||
workDir,
|
||||
float64(requiredBytes)/(1024*1024*1024),
|
||||
float64(extractionCheck.AvailableBytes)/(1024*1024*1024),
|
||||
float64(archiveInfo.Size())/(1024*1024*1024))
|
||||
}
|
||||
e.log.Info("Disk space check for extraction passed",
|
||||
"workdir", workDir,
|
||||
"required_gb", float64(requiredBytes)/(1024*1024*1024),
|
||||
"available_gb", float64(extractionCheck.AvailableBytes)/(1024*1024*1024))
|
||||
}
|
||||
e.log.Info("Disk space check for extraction passed",
|
||||
"workdir", workDir,
|
||||
"required_gb", float64(requiredBytes)/(1024*1024*1024),
|
||||
"available_gb", float64(extractionCheck.AvailableBytes)/(1024*1024*1024))
|
||||
}
|
||||
|
||||
if err := os.MkdirAll(tempDir, 0755); err != nil {
|
||||
operation.Fail("Failed to create temporary directory")
|
||||
return fmt.Errorf("failed to create temp directory in %s: %w", workDir, err)
|
||||
}
|
||||
defer os.RemoveAll(tempDir)
|
||||
// Need to extract archive ourselves
|
||||
if err := os.MkdirAll(tempDir, 0755); err != nil {
|
||||
operation.Fail("Failed to create temporary directory")
|
||||
return fmt.Errorf("failed to create temp directory in %s: %w", workDir, err)
|
||||
}
|
||||
defer os.RemoveAll(tempDir)
|
||||
|
||||
// Extract archive
|
||||
e.log.Info("Extracting cluster archive", "archive", archivePath, "tempDir", tempDir)
|
||||
if err := e.extractArchive(ctx, archivePath, tempDir); err != nil {
|
||||
operation.Fail("Archive extraction failed")
|
||||
return fmt.Errorf("failed to extract archive: %w", err)
|
||||
}
|
||||
// Extract archive
|
||||
e.log.Info("Extracting cluster archive", "archive", archivePath, "tempDir", tempDir)
|
||||
if err := e.extractArchive(ctx, archivePath, tempDir); err != nil {
|
||||
operation.Fail("Archive extraction failed")
|
||||
return fmt.Errorf("failed to extract archive: %w", err)
|
||||
}
|
||||
|
||||
// Check context validity after extraction (debugging context cancellation issues)
|
||||
if ctx.Err() != nil {
|
||||
e.log.Error("Context cancelled after extraction - this should not happen",
|
||||
"context_error", ctx.Err(),
|
||||
"extraction_completed", true)
|
||||
operation.Fail("Context cancelled unexpectedly")
|
||||
return fmt.Errorf("context cancelled after extraction completed: %w", ctx.Err())
|
||||
// Check context validity after extraction (debugging context cancellation issues)
|
||||
if ctx.Err() != nil {
|
||||
e.log.Error("Context cancelled after extraction - this should not happen",
|
||||
"context_error", ctx.Err(),
|
||||
"extraction_completed", true)
|
||||
operation.Fail("Context cancelled unexpectedly")
|
||||
return fmt.Errorf("context cancelled after extraction completed: %w", ctx.Err())
|
||||
}
|
||||
e.log.Info("Extraction completed, context still valid")
|
||||
}
|
||||
e.log.Info("Extraction completed, context still valid")
|
||||
|
||||
// Check if user has superuser privileges (required for ownership restoration)
|
||||
e.progress.Update("Checking privileges...")
|
||||
@ -1227,6 +1359,25 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
||||
preserveOwnership := isSuperuser
|
||||
isCompressedSQL := strings.HasSuffix(dumpFile, ".sql.gz")
|
||||
|
||||
// Start heartbeat ticker to show progress during long-running restore
|
||||
heartbeatCtx, cancelHeartbeat := context.WithCancel(ctx)
|
||||
heartbeatTicker := time.NewTicker(5 * time.Second)
|
||||
go func() {
|
||||
for {
|
||||
select {
|
||||
case <-heartbeatTicker.C:
|
||||
elapsed := time.Since(dbRestoreStart)
|
||||
mu.Lock()
|
||||
statusMsg := fmt.Sprintf("Restoring %s (%d/%d) - elapsed: %s",
|
||||
dbName, idx+1, totalDBs, formatDuration(elapsed))
|
||||
e.progress.Update(statusMsg)
|
||||
mu.Unlock()
|
||||
case <-heartbeatCtx.Done():
|
||||
return
|
||||
}
|
||||
}
|
||||
}()
|
||||
|
||||
var restoreErr error
|
||||
if isCompressedSQL {
|
||||
mu.Lock()
|
||||
@ -1240,6 +1391,10 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
|
||||
restoreErr = e.restorePostgreSQLDumpWithOwnership(ctx, dumpFile, dbName, false, preserveOwnership)
|
||||
}
|
||||
|
||||
// Stop heartbeat ticker
|
||||
heartbeatTicker.Stop()
|
||||
cancelHeartbeat()
|
||||
|
||||
if restoreErr != nil {
|
||||
mu.Lock()
|
||||
e.log.Error("Failed to restore database", "name", dbName, "file", dumpFile, "error", restoreErr)
|
||||
@ -1482,9 +1637,9 @@ func (pr *progressReader) Read(p []byte) (n int, err error) {
|
||||
n, err = pr.reader.Read(p)
|
||||
pr.bytesRead += int64(n)
|
||||
|
||||
// Throttle progress reporting to every 100ms
|
||||
// Throttle progress reporting to every 50ms for smoother updates
|
||||
if pr.reportEvery == 0 {
|
||||
pr.reportEvery = 100 * time.Millisecond
|
||||
pr.reportEvery = 50 * time.Millisecond
|
||||
}
|
||||
if time.Since(pr.lastReport) > pr.reportEvery {
|
||||
if pr.callback != nil {
|
||||
@ -1498,6 +1653,25 @@ func (pr *progressReader) Read(p []byte) (n int, err error) {
|
||||
|
||||
// extractArchiveShell extracts using shell tar command (faster but no progress)
|
||||
func (e *Engine) extractArchiveShell(ctx context.Context, archivePath, destDir string) error {
|
||||
// Start heartbeat ticker for extraction progress
|
||||
extractionStart := time.Now()
|
||||
heartbeatCtx, cancelHeartbeat := context.WithCancel(ctx)
|
||||
heartbeatTicker := time.NewTicker(5 * time.Second)
|
||||
defer heartbeatTicker.Stop()
|
||||
defer cancelHeartbeat()
|
||||
|
||||
go func() {
|
||||
for {
|
||||
select {
|
||||
case <-heartbeatTicker.C:
|
||||
elapsed := time.Since(extractionStart)
|
||||
e.progress.Update(fmt.Sprintf("Extracting archive... (elapsed: %s)", formatDuration(elapsed)))
|
||||
case <-heartbeatCtx.Done():
|
||||
return
|
||||
}
|
||||
}
|
||||
}()
|
||||
|
||||
cmd := exec.CommandContext(ctx, "tar", "-xzf", archivePath, "-C", destDir)
|
||||
|
||||
// Stream stderr to avoid memory issues - tar can produce lots of output for large archives
|
||||
@ -2080,6 +2254,25 @@ func FormatBytes(bytes int64) string {
|
||||
return fmt.Sprintf("%.1f %cB", float64(bytes)/float64(div), "KMGTPE"[exp])
|
||||
}
|
||||
|
||||
// formatDuration formats a duration as a human-readable string (e.g., "3m 45s", "1h 23m", "45s")
|
||||
func formatDuration(d time.Duration) string {
|
||||
if d < time.Second {
|
||||
return "0s"
|
||||
}
|
||||
|
||||
hours := int(d.Hours())
|
||||
minutes := int(d.Minutes()) % 60
|
||||
seconds := int(d.Seconds()) % 60
|
||||
|
||||
if hours > 0 {
|
||||
return fmt.Sprintf("%dh %dm", hours, minutes)
|
||||
}
|
||||
if minutes > 0 {
|
||||
return fmt.Sprintf("%dm %ds", minutes, seconds)
|
||||
}
|
||||
return fmt.Sprintf("%ds", seconds)
|
||||
}
|
||||
|
||||
// quickValidateSQLDump performs a fast validation of SQL dump files
|
||||
// by checking for truncated COPY blocks. This catches corrupted dumps
|
||||
// BEFORE attempting a full restore (which could waste 49+ minutes).
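
The body of this function is not included in the hunk; a minimal sketch of the truncated-COPY-block check the comment describes could look like the following (imports assumed: bufio, fmt, os, strings; the helper name and scanner details are assumptions, only the intent comes from the comment above).

```go
// quickValidateSQLDumpSketch walks a plain-SQL dump line by line and fails if a
// "COPY ... FROM stdin;" block is never closed by the "\." end-of-data marker.
// Illustrative only - not the implementation referenced above.
func quickValidateSQLDumpSketch(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	sc.Buffer(make([]byte, 1024*1024), 1024*1024) // COPY data lines can be very long
	inCopy := false
	for sc.Scan() {
		line := sc.Text()
		switch {
		case !inCopy && strings.HasPrefix(line, "COPY ") && strings.HasSuffix(line, "FROM stdin;"):
			inCopy = true
		case inCopy && line == `\.`:
			inCopy = false
		}
	}
	if err := sc.Err(); err != nil {
		return err
	}
	if inCopy {
		return fmt.Errorf("truncated dump: COPY block missing terminating \\.")
	}
	return nil
}
```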
|
||||
|
||||
344
internal/restore/extract.go
Normal file
@ -0,0 +1,344 @@
|
||||
package restore
|
||||
|
||||
import (
|
||||
"archive/tar"
|
||||
"compress/gzip"
|
||||
"context"
|
||||
"fmt"
|
||||
"io"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"sort"
|
||||
"strings"
|
||||
|
||||
"dbbackup/internal/logger"
|
||||
"dbbackup/internal/progress"
|
||||
)
|
||||
|
||||
// DatabaseInfo represents metadata about a database in a cluster backup
|
||||
type DatabaseInfo struct {
|
||||
Name string
|
||||
Filename string
|
||||
Size int64
|
||||
}
|
||||
|
||||
// ListDatabasesInCluster lists all databases in a cluster backup archive
|
||||
func ListDatabasesInCluster(ctx context.Context, archivePath string, log logger.Logger) ([]DatabaseInfo, error) {
|
||||
file, err := os.Open(archivePath)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot open archive: %w", err)
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
gz, err := gzip.NewReader(file)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("not a valid gzip archive: %w", err)
|
||||
}
|
||||
defer gz.Close()
|
||||
|
||||
tarReader := tar.NewReader(gz)
|
||||
databases := make([]DatabaseInfo, 0)
|
||||
|
||||
for {
|
||||
select {
|
||||
case <-ctx.Done():
|
||||
return nil, ctx.Err()
|
||||
default:
|
||||
}
|
||||
|
||||
header, err := tarReader.Next()
|
||||
if err == io.EOF {
|
||||
break
|
||||
}
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error reading tar archive: %w", err)
|
||||
}
|
||||
|
||||
// Look for files in dumps/ directory
|
||||
if !header.FileInfo().IsDir() && strings.HasPrefix(header.Name, "dumps/") {
|
||||
filename := filepath.Base(header.Name)
|
||||
|
||||
// Extract database name from filename (remove .dump, .dump.gz, .sql, .sql.gz)
|
||||
dbName := filename
|
||||
dbName = strings.TrimSuffix(dbName, ".dump.gz")
|
||||
dbName = strings.TrimSuffix(dbName, ".dump")
|
||||
dbName = strings.TrimSuffix(dbName, ".sql.gz")
|
||||
dbName = strings.TrimSuffix(dbName, ".sql")
|
||||
|
||||
databases = append(databases, DatabaseInfo{
|
||||
Name: dbName,
|
||||
Filename: filename,
|
||||
Size: header.Size,
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// Sort by name for consistent output
|
||||
sort.Slice(databases, func(i, j int) bool {
|
||||
return databases[i].Name < databases[j].Name
|
||||
})
|
||||
|
||||
if len(databases) == 0 {
|
||||
return nil, fmt.Errorf("no databases found in cluster backup")
|
||||
}
|
||||
|
||||
return databases, nil
|
||||
}
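
As an illustration of how the `--list-databases` flag could sit on top of this helper (the printing function itself is hypothetical; `ListDatabasesInCluster` and `DatabaseInfo` are the ones defined above):

```go
// printClusterContents lists every database found in a cluster archive.
func printClusterContents(ctx context.Context, archivePath string, log logger.Logger) error {
	dbs, err := ListDatabasesInCluster(ctx, archivePath, log)
	if err != nil {
		return err
	}
	for _, db := range dbs {
		// Size is the size of the dump entry inside the tar, not the restored size.
		fmt.Printf("%-32s %12d bytes  (%s)\n", db.Name, db.Size, db.Filename)
	}
	return nil
}
```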
|
||||
|
||||
// ExtractDatabaseFromCluster extracts a single database dump from cluster backup
|
||||
func ExtractDatabaseFromCluster(ctx context.Context, archivePath, dbName, outputDir string, log logger.Logger, prog progress.Indicator) (string, error) {
|
||||
file, err := os.Open(archivePath)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("cannot open archive: %w", err)
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
stat, err := file.Stat()
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("cannot stat archive: %w", err)
|
||||
}
|
||||
archiveSize := stat.Size()
|
||||
|
||||
gz, err := gzip.NewReader(file)
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("not a valid gzip archive: %w", err)
|
||||
}
|
||||
defer gz.Close()
|
||||
|
||||
tarReader := tar.NewReader(gz)
|
||||
|
||||
// Create output directory if needed
|
||||
if err := os.MkdirAll(outputDir, 0755); err != nil {
|
||||
return "", fmt.Errorf("cannot create output directory: %w", err)
|
||||
}
|
||||
|
||||
targetPattern := fmt.Sprintf("dumps/%s.", dbName) // Match dbName.dump, dbName.sql, etc.
|
||||
var extractedPath string
|
||||
found := false
|
||||
|
||||
if prog != nil {
|
||||
prog.Start(fmt.Sprintf("Extracting database: %s", dbName))
|
||||
defer prog.Stop()
|
||||
}
|
||||
|
||||
var bytesRead int64
|
||||
ticker := make(chan struct{})
|
||||
stopTicker := make(chan struct{})
|
||||
go func() {
|
||||
for {
|
||||
select {
|
||||
case <-ctx.Done():
|
||||
return
|
||||
case <-stopTicker:
|
||||
return
|
||||
case <-ticker:
|
||||
if prog != nil && archiveSize > 0 {
|
||||
percentage := float64(bytesRead) / float64(archiveSize) * 100
|
||||
prog.Update(fmt.Sprintf("Scanning: %.1f%%", percentage))
|
||||
}
|
||||
}
|
||||
}
|
||||
}()
|
||||
|
||||
for {
|
||||
select {
|
||||
case <-ctx.Done():
|
||||
close(stopTicker)
|
||||
return "", ctx.Err()
|
||||
default:
|
||||
}
|
||||
|
||||
header, err := tarReader.Next()
|
||||
if err == io.EOF {
|
||||
break
|
||||
}
|
||||
if err != nil {
|
||||
close(stopTicker)
|
||||
return "", fmt.Errorf("error reading tar archive: %w", err)
|
||||
}
|
||||
|
||||
bytesRead += header.Size
|
||||
select {
|
||||
case ticker <- struct{}{}:
|
||||
default:
|
||||
}
|
||||
|
||||
// Check if this is the database we're looking for
|
||||
if strings.HasPrefix(header.Name, targetPattern) && !header.FileInfo().IsDir() {
|
||||
filename := filepath.Base(header.Name)
|
||||
extractedPath = filepath.Join(outputDir, filename)
|
||||
|
||||
// Extract the file
|
||||
outFile, err := os.Create(extractedPath)
|
||||
if err != nil {
|
||||
close(stopTicker)
|
||||
return "", fmt.Errorf("cannot create output file: %w", err)
|
||||
}
|
||||
|
||||
if prog != nil {
|
||||
prog.Update(fmt.Sprintf("Extracting: %s", filename))
|
||||
}
|
||||
|
||||
written, err := io.Copy(outFile, tarReader)
|
||||
outFile.Close()
|
||||
if err != nil {
|
||||
close(stopTicker)
|
||||
return "", fmt.Errorf("extraction failed: %w", err)
|
||||
}
|
||||
|
||||
log.Info("Database extracted successfully", "database", dbName, "size", formatBytes(written), "path", extractedPath)
|
||||
found = true
|
||||
break
|
||||
}
|
||||
}
|
||||
|
||||
close(stopTicker)
|
||||
|
||||
if !found {
|
||||
return "", fmt.Errorf("database '%s' not found in cluster backup", dbName)
|
||||
}
|
||||
|
||||
return extractedPath, nil
|
||||
}
|
||||
|
||||
// ExtractMultipleDatabasesFromCluster extracts multiple databases from cluster backup
|
||||
func ExtractMultipleDatabasesFromCluster(ctx context.Context, archivePath string, dbNames []string, outputDir string, log logger.Logger, prog progress.Indicator) (map[string]string, error) {
|
||||
file, err := os.Open(archivePath)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot open archive: %w", err)
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
stat, err := file.Stat()
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("cannot stat archive: %w", err)
|
||||
}
|
||||
archiveSize := stat.Size()
|
||||
|
||||
gz, err := gzip.NewReader(file)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("not a valid gzip archive: %w", err)
|
||||
}
|
||||
defer gz.Close()
|
||||
|
||||
tarReader := tar.NewReader(gz)
|
||||
|
||||
// Create output directory if needed
|
||||
if err := os.MkdirAll(outputDir, 0755); err != nil {
|
||||
return nil, fmt.Errorf("cannot create output directory: %w", err)
|
||||
}
|
||||
|
||||
// Build lookup map
|
||||
targetDBs := make(map[string]bool)
|
||||
for _, dbName := range dbNames {
|
||||
targetDBs[dbName] = true
|
||||
}
|
||||
|
||||
extractedPaths := make(map[string]string)
|
||||
|
||||
if prog != nil {
|
||||
prog.Start(fmt.Sprintf("Extracting %d databases", len(dbNames)))
|
||||
defer prog.Stop()
|
||||
}
|
||||
|
||||
var bytesRead int64
|
||||
ticker := make(chan struct{})
|
||||
stopTicker := make(chan struct{})
|
||||
go func() {
|
||||
for {
|
||||
select {
|
||||
case <-ctx.Done():
|
||||
return
|
||||
case <-stopTicker:
|
||||
return
|
||||
case <-ticker:
|
||||
if prog != nil && archiveSize > 0 {
|
||||
percentage := float64(bytesRead) / float64(archiveSize) * 100
|
||||
prog.Update(fmt.Sprintf("Scanning: %.1f%% (%d/%d found)", percentage, len(extractedPaths), len(dbNames)))
|
||||
}
|
||||
}
|
||||
}
|
||||
}()
|
||||
|
||||
for {
|
||||
select {
|
||||
case <-ctx.Done():
|
||||
close(stopTicker)
|
||||
return nil, ctx.Err()
|
||||
default:
|
||||
}
|
||||
|
||||
header, err := tarReader.Next()
|
||||
if err == io.EOF {
|
||||
break
|
||||
}
|
||||
if err != nil {
|
||||
close(stopTicker)
|
||||
return nil, fmt.Errorf("error reading tar archive: %w", err)
|
||||
}
|
||||
|
||||
bytesRead += header.Size
|
||||
select {
|
||||
case ticker <- struct{}{}:
|
||||
default:
|
||||
}
|
||||
|
||||
// Check if this is one of the databases we're looking for
|
||||
if strings.HasPrefix(header.Name, "dumps/") && !header.FileInfo().IsDir() {
|
||||
filename := filepath.Base(header.Name)
|
||||
|
||||
// Extract database name
|
||||
dbName := filename
|
||||
dbName = strings.TrimSuffix(dbName, ".dump.gz")
|
||||
dbName = strings.TrimSuffix(dbName, ".dump")
|
||||
dbName = strings.TrimSuffix(dbName, ".sql.gz")
|
||||
dbName = strings.TrimSuffix(dbName, ".sql")
|
||||
|
||||
if targetDBs[dbName] {
|
||||
extractedPath := filepath.Join(outputDir, filename)
|
||||
|
||||
// Extract the file
|
||||
outFile, err := os.Create(extractedPath)
|
||||
if err != nil {
|
||||
close(stopTicker)
|
||||
return nil, fmt.Errorf("cannot create output file for %s: %w", dbName, err)
|
||||
}
|
||||
|
||||
if prog != nil {
|
||||
prog.Update(fmt.Sprintf("Extracting: %s (%d/%d)", dbName, len(extractedPaths)+1, len(dbNames)))
|
||||
}
|
||||
|
||||
written, err := io.Copy(outFile, tarReader)
|
||||
outFile.Close()
|
||||
if err != nil {
|
||||
close(stopTicker)
|
||||
return nil, fmt.Errorf("extraction failed for %s: %w", dbName, err)
|
||||
}
|
||||
|
||||
log.Info("Database extracted", "database", dbName, "size", formatBytes(written))
|
||||
extractedPaths[dbName] = extractedPath
|
||||
|
||||
// Stop early if we found all databases
|
||||
if len(extractedPaths) == len(dbNames) {
|
||||
break
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
close(stopTicker)
|
||||
|
||||
// Check if all requested databases were found
|
||||
missing := make([]string, 0)
|
||||
for _, dbName := range dbNames {
|
||||
if _, found := extractedPaths[dbName]; !found {
|
||||
missing = append(missing, dbName)
|
||||
}
|
||||
}
|
||||
|
||||
if len(missing) > 0 {
|
||||
return extractedPaths, fmt.Errorf("databases not found in cluster backup: %s", strings.Join(missing, ", "))
|
||||
}
|
||||
|
||||
return extractedPaths, nil
|
||||
}
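
A possible mapping from the `--databases "app1,app2,app3"` flag onto this helper (the splitting wrapper is an assumption; the partial-result behaviour on error matches the code above):

```go
// extractSelected extracts a comma-separated list of databases to outDir.
func extractSelected(ctx context.Context, archivePath, list, outDir string, log logger.Logger, prog progress.Indicator) (map[string]string, error) {
	names := strings.Split(list, ",")
	for i := range names {
		names[i] = strings.TrimSpace(names[i])
	}
	paths, err := ExtractMultipleDatabasesFromCluster(ctx, archivePath, names, outDir, log, prog)
	if err != nil {
		// paths can still contain the databases that were found before the failure.
		return paths, err
	}
	return paths, nil
}
```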
|
||||
@ -16,6 +16,57 @@ import (
|
||||
"github.com/shirou/gopsutil/v3/mem"
|
||||
)
|
||||
|
||||
// CalculateOptimalParallel returns the recommended number of parallel workers
|
||||
// based on available system resources (CPU cores and RAM).
|
||||
// This is a standalone function that can be called from anywhere.
|
||||
// Returns 0 if resources cannot be detected.
|
||||
func CalculateOptimalParallel() int {
|
||||
cpuCores := runtime.NumCPU()
|
||||
|
||||
vmem, err := mem.VirtualMemory()
|
||||
if err != nil {
|
||||
// Fallback: use half of CPU cores if memory detection fails
|
||||
if cpuCores > 1 {
|
||||
return cpuCores / 2
|
||||
}
|
||||
return 1
|
||||
}
|
||||
|
||||
memAvailableGB := float64(vmem.Available) / (1024 * 1024 * 1024)
|
||||
|
||||
// Each pg_restore worker needs approximately 2-4GB of RAM
|
||||
// Use conservative 3GB per worker to avoid OOM
|
||||
const memPerWorkerGB = 3.0
|
||||
|
||||
// Calculate limits
|
||||
maxByMem := int(memAvailableGB / memPerWorkerGB)
|
||||
maxByCPU := cpuCores
|
||||
|
||||
// Use the minimum of memory and CPU limits
|
||||
recommended := maxByMem
|
||||
if maxByCPU < recommended {
|
||||
recommended = maxByCPU
|
||||
}
|
||||
|
||||
// Apply sensible bounds
|
||||
if recommended < 1 {
|
||||
recommended = 1
|
||||
}
|
||||
if recommended > 16 {
|
||||
recommended = 16 // Cap at 16 to avoid diminishing returns
|
||||
}
|
||||
|
||||
// If memory pressure is high (>80%), reduce parallelism
|
||||
if vmem.UsedPercent > 80 && recommended > 1 {
|
||||
recommended = recommended / 2
|
||||
if recommended < 1 {
|
||||
recommended = 1
|
||||
}
|
||||
}
|
||||
|
||||
return recommended
|
||||
}
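
A sketch of how a caller might use this as the default when no explicit `--parallel` value was given; the `cfg.Parallel` field name and the call site being in the same package are assumptions for illustration.

```go
func effectiveParallel(cfg *config.Config) int {
	parallel := cfg.Parallel // assumed field: 0 or negative means "not set by the user"
	if parallel <= 0 {
		if n := CalculateOptimalParallel(); n > 0 {
			parallel = n
		} else {
			parallel = 1 // resources could not be detected; stay conservative
		}
	}
	return parallel
}
```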
|
||||
|
||||
// PreflightResult contains all preflight check results
|
||||
type PreflightResult struct {
|
||||
// Linux system checks
|
||||
@ -35,27 +86,29 @@ type PreflightResult struct {
|
||||
|
||||
// LinuxChecks contains Linux kernel/system checks
|
||||
type LinuxChecks struct {
|
||||
ShmMax int64 // /proc/sys/kernel/shmmax
|
||||
ShmAll int64 // /proc/sys/kernel/shmall
|
||||
MemTotal uint64 // Total RAM in bytes
|
||||
MemAvailable uint64 // Available RAM in bytes
|
||||
MemUsedPercent float64 // Memory usage percentage
|
||||
ShmMaxOK bool // Is shmmax sufficient?
|
||||
ShmAllOK bool // Is shmall sufficient?
|
||||
MemAvailableOK bool // Is available RAM sufficient?
|
||||
IsLinux bool // Are we running on Linux?
|
||||
ShmMax int64 // /proc/sys/kernel/shmmax
|
||||
ShmAll int64 // /proc/sys/kernel/shmall
|
||||
MemTotal uint64 // Total RAM in bytes
|
||||
MemAvailable uint64 // Available RAM in bytes
|
||||
MemUsedPercent float64 // Memory usage percentage
|
||||
CPUCores int // Number of CPU cores
|
||||
RecommendedParallel int // Auto-calculated optimal parallel count
|
||||
ShmMaxOK bool // Is shmmax sufficient?
|
||||
ShmAllOK bool // Is shmall sufficient?
|
||||
MemAvailableOK bool // Is available RAM sufficient?
|
||||
IsLinux bool // Are we running on Linux?
|
||||
}
|
||||
|
||||
// PostgreSQLChecks contains PostgreSQL configuration checks
|
||||
type PostgreSQLChecks struct {
|
||||
MaxLocksPerTransaction int // Current setting
|
||||
MaxPreparedTransactions int // Current setting (affects lock capacity)
|
||||
TotalLockCapacity int // Calculated: max_locks × (max_connections + max_prepared)
|
||||
MaintenanceWorkMem string // Current setting
|
||||
SharedBuffers string // Current setting (info only)
|
||||
MaxConnections int // Current setting
|
||||
Version string // PostgreSQL version
|
||||
IsSuperuser bool // Can we modify settings?
|
||||
MaxLocksPerTransaction int // Current setting
|
||||
MaxPreparedTransactions int // Current setting (affects lock capacity)
|
||||
TotalLockCapacity int // Calculated: max_locks × (max_connections + max_prepared)
|
||||
MaintenanceWorkMem string // Current setting
|
||||
SharedBuffers string // Current setting (info only)
|
||||
MaxConnections int // Current setting
|
||||
Version string // PostgreSQL version
|
||||
IsSuperuser bool // Can we modify settings?
|
||||
}
|
||||
|
||||
// ArchiveChecks contains analysis of the backup archive
|
||||
@ -100,6 +153,7 @@ func (e *Engine) RunPreflightChecks(ctx context.Context, dumpsDir string, entrie
|
||||
// checkSystemResources uses gopsutil for cross-platform system checks
|
||||
func (e *Engine) checkSystemResources(result *PreflightResult) {
|
||||
result.Linux.IsLinux = runtime.GOOS == "linux"
|
||||
result.Linux.CPUCores = runtime.NumCPU()
|
||||
|
||||
// Get memory info (works on Linux, macOS, Windows, BSD)
|
||||
if vmem, err := mem.VirtualMemory(); err == nil {
|
||||
@ -118,6 +172,9 @@ func (e *Engine) checkSystemResources(result *PreflightResult) {
|
||||
e.log.Warn("Could not detect system memory", "error", err)
|
||||
}
|
||||
|
||||
// Calculate recommended parallel based on resources
|
||||
result.Linux.RecommendedParallel = e.calculateRecommendedParallel(result)
|
||||
|
||||
// Linux-specific kernel checks (shmmax, shmall)
|
||||
if result.Linux.IsLinux {
|
||||
e.checkLinuxKernel(result)
|
||||
@ -434,8 +491,71 @@ func (e *Engine) calculateRecommendations(result *PreflightResult) {
|
||||
"recommended_locks", lockBoost)
|
||||
}
|
||||
|
||||
// calculateRecommendedParallel determines optimal parallelism based on system resources
|
||||
// Returns the recommended number of parallel workers for pg_restore
|
||||
func (e *Engine) calculateRecommendedParallel(result *PreflightResult) int {
|
||||
cpuCores := result.Linux.CPUCores
|
||||
if cpuCores == 0 {
|
||||
cpuCores = runtime.NumCPU()
|
||||
}
|
||||
|
||||
memAvailableGB := float64(result.Linux.MemAvailable) / (1024 * 1024 * 1024)
|
||||
|
||||
// Each pg_restore worker needs approximately 2-4GB of RAM
|
||||
// Use conservative 3GB per worker to avoid OOM
|
||||
const memPerWorkerGB = 3.0
|
||||
|
||||
// Calculate limits
|
||||
maxByMem := int(memAvailableGB / memPerWorkerGB)
|
||||
maxByCPU := cpuCores
|
||||
|
||||
// Use the minimum of memory and CPU limits
|
||||
recommended := maxByMem
|
||||
if maxByCPU < recommended {
|
||||
recommended = maxByCPU
|
||||
}
|
||||
|
||||
// Apply sensible bounds
|
||||
if recommended < 1 {
|
||||
recommended = 1
|
||||
}
|
||||
if recommended > 16 {
|
||||
recommended = 16 // Cap at 16 to avoid diminishing returns
|
||||
}
|
||||
|
||||
// If memory pressure is high (>80%), reduce parallelism
|
||||
if result.Linux.MemUsedPercent > 80 && recommended > 1 {
|
||||
recommended = recommended / 2
|
||||
if recommended < 1 {
|
||||
recommended = 1
|
||||
}
|
||||
}
|
||||
|
||||
e.log.Info("Calculated recommended parallel",
|
||||
"cpu_cores", cpuCores,
|
||||
"mem_available_gb", fmt.Sprintf("%.1f", memAvailableGB),
|
||||
"max_by_mem", maxByMem,
|
||||
"max_by_cpu", maxByCPU,
|
||||
"recommended", recommended)
|
||||
|
||||
return recommended
|
||||
}
|
||||
|
||||
// printPreflightSummary prints a nice summary of all checks
|
||||
// In silent mode (TUI), this is skipped and results are logged instead
|
||||
func (e *Engine) printPreflightSummary(result *PreflightResult) {
|
||||
// In TUI/silent mode, don't print to stdout - it causes scrambled output
|
||||
if e.silentMode {
|
||||
// Log summary instead for debugging
|
||||
e.log.Info("Preflight checks complete",
|
||||
"can_proceed", result.CanProceed,
|
||||
"warnings", len(result.Warnings),
|
||||
"errors", len(result.Errors),
|
||||
"total_blobs", result.Archive.TotalBlobCount,
|
||||
"recommended_locks", result.Archive.RecommendedLockBoost)
|
||||
return
|
||||
}
|
||||
|
||||
fmt.Println()
|
||||
fmt.Println(strings.Repeat("─", 60))
|
||||
fmt.Println(" PREFLIGHT CHECKS")
|
||||
@ -446,6 +566,8 @@ func (e *Engine) printPreflightSummary(result *PreflightResult) {
|
||||
printCheck("Total RAM", humanize.Bytes(result.Linux.MemTotal), true)
|
||||
printCheck("Available RAM", humanize.Bytes(result.Linux.MemAvailable), result.Linux.MemAvailableOK || result.Linux.MemAvailable == 0)
|
||||
printCheck("Memory Usage", fmt.Sprintf("%.1f%%", result.Linux.MemUsedPercent), result.Linux.MemUsedPercent < 85)
|
||||
printCheck("CPU Cores", fmt.Sprintf("%d", result.Linux.CPUCores), true)
|
||||
printCheck("Recommended Parallel", fmt.Sprintf("%d (auto-calculated)", result.Linux.RecommendedParallel), true)
|
||||
|
||||
// Linux-specific kernel checks
|
||||
if result.Linux.IsLinux && result.Linux.ShmMax > 0 {
|
||||
|
||||
@ -190,7 +190,7 @@ func (s *Safety) validateSQLScriptGz(path string) error {
|
||||
return fmt.Errorf("does not appear to contain SQL content")
|
||||
}
|
||||
|
||||
// validateTarGz validates tar.gz archive
|
||||
// validateTarGz validates tar.gz archive with fast stream-based checks
|
||||
func (s *Safety) validateTarGz(path string) error {
|
||||
file, err := os.Open(path)
|
||||
if err != nil {
|
||||
@ -205,11 +205,40 @@ func (s *Safety) validateTarGz(path string) error {
|
||||
return fmt.Errorf("cannot read file header")
|
||||
}
|
||||
|
||||
if buffer[0] == 0x1f && buffer[1] == 0x8b {
|
||||
return nil // Valid gzip header
|
||||
if buffer[0] != 0x1f || buffer[1] != 0x8b {
|
||||
return fmt.Errorf("not a valid gzip file")
|
||||
}
|
||||
|
||||
return fmt.Errorf("not a valid gzip file")
|
||||
// Quick tar structure validation (stream-based, no full extraction)
|
||||
// Reset to start and decompress first few KB to check tar header
|
||||
file.Seek(0, 0)
|
||||
gzReader, err := gzip.NewReader(file)
|
||||
if err != nil {
|
||||
return fmt.Errorf("gzip corruption detected: %w", err)
|
||||
}
|
||||
defer gzReader.Close()
|
||||
|
||||
// Read first tar header to verify it's a valid tar archive
|
||||
headerBuf := make([]byte, 512) // Tar header is 512 bytes
|
||||
n, err = gzReader.Read(headerBuf)
|
||||
if err != nil && err != io.EOF {
|
||||
return fmt.Errorf("failed to read tar header: %w", err)
|
||||
}
|
||||
if n < 512 {
|
||||
return fmt.Errorf("archive too small or corrupted")
|
||||
}
|
||||
|
||||
// Check tar magic ("ustar\0" at offset 257)
|
||||
if len(headerBuf) >= 263 {
|
||||
magic := string(headerBuf[257:262])
|
||||
if magic != "ustar" {
|
||||
s.log.Debug("No tar magic found, but may still be valid tar", "magic", magic)
|
||||
// Don't fail - some tar implementations don't use magic
|
||||
}
|
||||
}
|
||||
|
||||
s.log.Debug("Cluster archive validation passed (stream-based check)")
|
||||
return nil // Valid gzip + tar structure
|
||||
}
|
||||
|
||||
// containsSQLKeywords checks if content contains SQL keywords
|
||||
@ -228,6 +257,42 @@ func containsSQLKeywords(content string) bool {
|
||||
return false
|
||||
}
|
||||
|
||||
// ValidateAndExtractCluster performs validation and pre-extraction for cluster restore
|
||||
// Returns path to extracted directory (in temp location) to avoid double-extraction
|
||||
// Caller must clean up the returned directory with os.RemoveAll() when done
|
||||
func (s *Safety) ValidateAndExtractCluster(ctx context.Context, archivePath string) (extractedDir string, err error) {
|
||||
// First validate archive integrity (fast stream check)
|
||||
if err := s.ValidateArchive(archivePath); err != nil {
|
||||
return "", fmt.Errorf("archive validation failed: %w", err)
|
||||
}
|
||||
|
||||
// Create temp directory for extraction in configured WorkDir
|
||||
workDir := s.cfg.GetEffectiveWorkDir()
|
||||
if workDir == "" {
|
||||
workDir = s.cfg.BackupDir
|
||||
}
|
||||
|
||||
tempDir, err := os.MkdirTemp(workDir, "dbbackup-cluster-extract-*")
|
||||
if err != nil {
|
||||
return "", fmt.Errorf("failed to create temp extraction directory in %s: %w", workDir, err)
|
||||
}
|
||||
|
||||
// Extract using tar command (fastest method)
|
||||
s.log.Info("Pre-extracting cluster archive for validation and restore",
|
||||
"archive", archivePath,
|
||||
"dest", tempDir)
|
||||
|
||||
cmd := exec.CommandContext(ctx, "tar", "-xzf", archivePath, "-C", tempDir)
|
||||
output, err := cmd.CombinedOutput()
|
||||
if err != nil {
|
||||
os.RemoveAll(tempDir) // Cleanup on failure
|
||||
return "", fmt.Errorf("extraction failed: %w: %s", err, string(output))
|
||||
}
|
||||
|
||||
s.log.Info("Cluster archive extracted successfully", "location", tempDir)
|
||||
return tempDir, nil
|
||||
}
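
A sketch of the call sequence this enables, matching the "extract once, restore from the extracted directory" optimisation in the changelog. The wrapper and variable names are illustrative, and it assumes `Safety` and `Engine` live in the same package; the `ValidateAndExtractCluster` and variadic `RestoreCluster` signatures are the ones shown in this diff.

```go
// restoreClusterOnce validates, extracts once, restores, then cleans up.
func restoreClusterOnce(ctx context.Context, safety *Safety, eng *Engine, archivePath string) error {
	extractedDir, err := safety.ValidateAndExtractCluster(ctx, archivePath)
	if err != nil {
		return err
	}
	defer os.RemoveAll(extractedDir) // caller owns cleanup of the pre-extracted directory

	// Passing the pre-extracted path makes RestoreCluster skip its own extraction
	// and its disk-space check for extraction.
	return eng.RestoreCluster(ctx, archivePath, extractedDir)
}
```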
|
||||
|
||||
// CheckDiskSpace verifies sufficient disk space for restore
|
||||
// Uses the effective work directory (WorkDir if set, otherwise BackupDir) since
|
||||
// that's where extraction actually happens for large databases
|
||||
@ -334,10 +399,12 @@ func (s *Safety) checkPostgresDatabaseExists(ctx context.Context, dbName string)
|
||||
"-tAc", fmt.Sprintf("SELECT 1 FROM pg_database WHERE datname='%s'", dbName),
|
||||
}
|
||||
|
||||
// Only add -h flag if host is not localhost (to use Unix socket for peer auth)
|
||||
if s.cfg.Host != "localhost" && s.cfg.Host != "127.0.0.1" && s.cfg.Host != "" {
|
||||
args = append([]string{"-h", s.cfg.Host}, args...)
|
||||
// Always add -h flag for explicit host connection (required for password auth)
|
||||
host := s.cfg.Host
|
||||
if host == "" {
|
||||
host = "localhost"
|
||||
}
|
||||
args = append([]string{"-h", host}, args...)
|
||||
|
||||
cmd := exec.CommandContext(ctx, "psql", args...)
|
||||
|
||||
@ -346,9 +413,9 @@ func (s *Safety) checkPostgresDatabaseExists(ctx context.Context, dbName string)
|
||||
cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", s.cfg.Password))
|
||||
}
|
||||
|
||||
output, err := cmd.Output()
|
||||
output, err := cmd.CombinedOutput()
|
||||
if err != nil {
|
||||
return false, fmt.Errorf("failed to check database existence: %w", err)
|
||||
return false, fmt.Errorf("failed to check database existence: %w (output: %s)", err, strings.TrimSpace(string(output)))
|
||||
}
|
||||
|
||||
return strings.TrimSpace(string(output)) == "1", nil
|
||||
@ -405,21 +472,29 @@ func (s *Safety) listPostgresUserDatabases(ctx context.Context) ([]string, error
|
||||
"-c", query,
|
||||
}
|
||||
|
||||
// Only add -h flag if host is not localhost (to use Unix socket for peer auth)
|
||||
if s.cfg.Host != "localhost" && s.cfg.Host != "127.0.0.1" && s.cfg.Host != "" {
|
||||
args = append([]string{"-h", s.cfg.Host}, args...)
|
||||
// Always add -h flag for explicit host connection (required for password auth)
|
||||
// Empty or unset host defaults to localhost
|
||||
host := s.cfg.Host
|
||||
if host == "" {
|
||||
host = "localhost"
|
||||
}
|
||||
args = append([]string{"-h", host}, args...)
|
||||
|
||||
cmd := exec.CommandContext(ctx, "psql", args...)
|
||||
|
||||
// Set password if provided
|
||||
// Set password - check config first, then environment
|
||||
env := os.Environ()
|
||||
if s.cfg.Password != "" {
|
||||
cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", s.cfg.Password))
|
||||
env = append(env, fmt.Sprintf("PGPASSWORD=%s", s.cfg.Password))
|
||||
}
|
||||
cmd.Env = env
|
||||
|
||||
output, err := cmd.Output()
|
||||
s.log.Debug("Listing PostgreSQL databases", "host", host, "port", s.cfg.Port, "user", s.cfg.User)
|
||||
|
||||
output, err := cmd.CombinedOutput()
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to list databases: %w", err)
|
||||
// Include psql output in error for debugging
|
||||
return nil, fmt.Errorf("failed to list databases: %w (output: %s)", err, strings.TrimSpace(string(output)))
|
||||
}
|
||||
|
||||
// Parse output
|
||||
@ -432,6 +507,8 @@ func (s *Safety) listPostgresUserDatabases(ctx context.Context) ([]string, error
|
||||
}
|
||||
}
|
||||
|
||||
s.log.Debug("Found user databases", "count", len(databases), "databases", databases, "raw_output", string(output))
|
||||
|
||||
return databases, nil
|
||||
}
|
||||
|
||||
|
||||
@ -214,14 +214,27 @@ func (m ArchiveBrowserModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
}
|
||||
|
||||
if m.mode == "restore-single" && selected.Format.IsClusterBackup() {
|
||||
m.message = errorStyle.Render("[FAIL] Please select a single database backup")
|
||||
return m, nil
|
||||
// Cluster backup selected in single restore mode - offer to select individual database
|
||||
clusterSelector := NewClusterDatabaseSelector(m.config, m.logger, m, m.ctx, selected, "single", false)
|
||||
return clusterSelector, clusterSelector.Init()
|
||||
}
|
||||
|
||||
// Open restore preview
|
||||
preview := NewRestorePreview(m.config, m.logger, m.parent, m.ctx, selected, m.mode)
|
||||
return preview, preview.Init()
|
||||
}
|
||||
|
||||
case "s":
|
||||
// Select single database from cluster (shortcut key)
|
||||
if len(m.archives) > 0 && m.cursor < len(m.archives) {
|
||||
selected := m.archives[m.cursor]
|
||||
if selected.Format.IsClusterBackup() {
|
||||
clusterSelector := NewClusterDatabaseSelector(m.config, m.logger, m, m.ctx, selected, "single", false)
|
||||
return clusterSelector, clusterSelector.Init()
|
||||
} else {
|
||||
m.message = infoStyle.Render("💡 [s] only works with cluster backups")
|
||||
}
|
||||
}
|
||||
|
||||
case "i":
|
||||
// Show detailed info
|
||||
@ -351,7 +364,7 @@ func (m ArchiveBrowserModel) View() string {
|
||||
s.WriteString(infoStyle.Render(fmt.Sprintf("Total: %d archive(s) | Selected: %d/%d",
|
||||
len(m.archives), m.cursor+1, len(m.archives))))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(infoStyle.Render("[KEY] ↑/↓: Navigate | Enter: Select | d: Diagnose | f: Filter | i: Info | Esc: Back"))
|
||||
s.WriteString(infoStyle.Render("[KEY] ↑/↓: Navigate | Enter: Select | s: Single DB from Cluster | d: Diagnose | f: Filter | i: Info | Esc: Back"))
|
||||
|
||||
return s.String()
|
||||
}
|
||||
|
||||
@ -13,6 +13,14 @@ import (
|
||||
"dbbackup/internal/config"
|
||||
"dbbackup/internal/database"
|
||||
"dbbackup/internal/logger"
|
||||
"path/filepath"
|
||||
)
|
||||
|
||||
// Backup phase constants for consistency
|
||||
const (
|
||||
backupPhaseGlobals = 1
|
||||
backupPhaseDatabases = 2
|
||||
backupPhaseCompressing = 3
|
||||
)
|
||||
|
||||
// BackupExecutionModel handles backup execution with progress
|
||||
@ -31,27 +39,36 @@ type BackupExecutionModel struct {
|
||||
cancelling bool // True when user has requested cancellation
|
||||
err error
|
||||
result string
|
||||
archivePath string // Path to created archive (for summary)
|
||||
archiveSize int64 // Size of created archive (for summary)
|
||||
startTime time.Time
|
||||
elapsed time.Duration // Final elapsed time
|
||||
details []string
|
||||
spinnerFrame int
|
||||
|
||||
// Database count progress (for cluster backup)
|
||||
dbTotal int
|
||||
dbDone int
|
||||
dbName string // Current database being backed up
|
||||
overallPhase int // 1=globals, 2=databases, 3=compressing
|
||||
phaseDesc string // Description of current phase
|
||||
dbTotal int
|
||||
dbDone int
|
||||
dbName string // Current database being backed up
|
||||
overallPhase int // 1=globals, 2=databases, 3=compressing
|
||||
phaseDesc string // Description of current phase
|
||||
phase2StartTime time.Time // When phase 2 (databases) started (for realtime ETA)
|
||||
dbPhaseElapsed time.Duration // Elapsed time since database backup phase started
|
||||
dbAvgPerDB time.Duration // Average time per database backup
|
||||
}
|
||||
|
||||
// sharedBackupProgressState holds progress state that can be safely accessed from callbacks
|
||||
type sharedBackupProgressState struct {
|
||||
mu sync.Mutex
|
||||
dbTotal int
|
||||
dbDone int
|
||||
dbName string
|
||||
overallPhase int // 1=globals, 2=databases, 3=compressing
|
||||
phaseDesc string // Description of current phase
|
||||
hasUpdate bool
|
||||
mu sync.Mutex
|
||||
dbTotal int
|
||||
dbDone int
|
||||
dbName string
|
||||
overallPhase int // 1=globals, 2=databases, 3=compressing
|
||||
phaseDesc string // Description of current phase
|
||||
hasUpdate bool
|
||||
phase2StartTime time.Time // When phase 2 started (for realtime ETA calculation)
|
||||
dbPhaseElapsed time.Duration // Elapsed time since database backup phase started
|
||||
dbAvgPerDB time.Duration // Average time per database backup
|
||||
}
|
||||
|
||||
// Package-level shared progress state for backup operations
|
||||
@ -72,12 +89,12 @@ func clearCurrentBackupProgress() {
|
||||
currentBackupProgressState = nil
|
||||
}
|
||||
|
||||
func getCurrentBackupProgress() (dbTotal, dbDone int, dbName string, overallPhase int, phaseDesc string, hasUpdate bool) {
|
||||
func getCurrentBackupProgress() (dbTotal, dbDone int, dbName string, overallPhase int, phaseDesc string, hasUpdate bool, dbPhaseElapsed, dbAvgPerDB time.Duration, phase2StartTime time.Time) {
|
||||
currentBackupProgressMu.Lock()
|
||||
defer currentBackupProgressMu.Unlock()
|
||||
|
||||
if currentBackupProgressState == nil {
|
||||
return 0, 0, "", 0, "", false
|
||||
return 0, 0, "", 0, "", false, 0, 0, time.Time{}
|
||||
}
|
||||
|
||||
currentBackupProgressState.mu.Lock()
|
||||
@ -86,9 +103,17 @@ func getCurrentBackupProgress() (dbTotal, dbDone int, dbName string, overallPhas
|
||||
hasUpdate = currentBackupProgressState.hasUpdate
|
||||
currentBackupProgressState.hasUpdate = false
|
||||
|
||||
// Calculate realtime phase elapsed if we have a phase 2 start time
|
||||
dbPhaseElapsed = currentBackupProgressState.dbPhaseElapsed
|
||||
if !currentBackupProgressState.phase2StartTime.IsZero() {
|
||||
dbPhaseElapsed = time.Since(currentBackupProgressState.phase2StartTime)
|
||||
}
|
||||
|
||||
return currentBackupProgressState.dbTotal, currentBackupProgressState.dbDone,
|
||||
currentBackupProgressState.dbName, currentBackupProgressState.overallPhase,
|
||||
currentBackupProgressState.phaseDesc, hasUpdate
|
||||
currentBackupProgressState.phaseDesc, hasUpdate,
|
||||
dbPhaseElapsed, currentBackupProgressState.dbAvgPerDB,
|
||||
currentBackupProgressState.phase2StartTime
|
||||
}
|
||||
|
||||
func NewBackupExecution(cfg *config.Config, log logger.Logger, parent tea.Model, ctx context.Context, backupType, dbName string, ratio int) BackupExecutionModel {
|
||||
@ -132,17 +157,18 @@ type backupProgressMsg struct {
|
||||
}
|
||||
|
||||
type backupCompleteMsg struct {
|
||||
result string
|
||||
err error
|
||||
result string
|
||||
err error
|
||||
archivePath string
|
||||
archiveSize int64
|
||||
elapsed time.Duration
|
||||
}
|
||||
|
||||
func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config, log logger.Logger, backupType, dbName string, ratio int) tea.Cmd {
|
||||
return func() tea.Msg {
|
||||
// NO TIMEOUT for backup operations - a backup takes as long as it takes
|
||||
// Large databases can take many hours
|
||||
// Only manual cancellation (Ctrl+C) should stop the backup
|
||||
ctx, cancel := context.WithCancel(parentCtx)
|
||||
defer cancel()
|
||||
// Use the parent context directly - it's already cancellable from the model
|
||||
// DO NOT create a new context here as it breaks Ctrl+C cancellation
|
||||
ctx := parentCtx
|
||||
|
||||
start := time.Now()
|
||||
|
||||
@ -176,9 +202,13 @@ func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config,
|
||||
progressState.dbDone = done
|
||||
progressState.dbTotal = total
|
||||
progressState.dbName = currentDB
|
||||
progressState.overallPhase = 2 // Phase 2: Backing up databases
|
||||
progressState.phaseDesc = fmt.Sprintf("Phase 2/3: Databases (%d/%d)", done, total)
|
||||
progressState.overallPhase = backupPhaseDatabases
|
||||
progressState.phaseDesc = fmt.Sprintf("Phase 2/3: Backing up Databases (%d/%d)", done, total)
|
||||
progressState.hasUpdate = true
|
||||
// Set phase 2 start time on first callback (for realtime ETA calculation)
|
||||
if progressState.phase2StartTime.IsZero() {
|
||||
progressState.phase2StartTime = time.Now()
|
||||
}
|
||||
progressState.mu.Unlock()
|
||||
})
|
||||
|
||||
@ -216,8 +246,9 @@ func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config,
|
||||
}
|
||||
|
||||
return backupCompleteMsg{
|
||||
result: result,
|
||||
err: nil,
|
||||
result: result,
|
||||
err: nil,
|
||||
elapsed: elapsed,
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -230,13 +261,15 @@ func (m BackupExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
m.spinnerFrame = (m.spinnerFrame + 1) % len(spinnerFrames)
|
||||
|
||||
// Poll for database progress updates from callbacks
|
||||
dbTotal, dbDone, dbName, overallPhase, phaseDesc, hasUpdate := getCurrentBackupProgress()
|
||||
dbTotal, dbDone, dbName, overallPhase, phaseDesc, hasUpdate, dbPhaseElapsed, dbAvgPerDB, _ := getCurrentBackupProgress()
|
||||
if hasUpdate {
|
||||
m.dbTotal = dbTotal
|
||||
m.dbDone = dbDone
|
||||
m.dbName = dbName
|
||||
m.overallPhase = overallPhase
|
||||
m.phaseDesc = phaseDesc
|
||||
m.dbPhaseElapsed = dbPhaseElapsed
|
||||
m.dbAvgPerDB = dbAvgPerDB
|
||||
}
|
||||
|
||||
// Update status based on progress and elapsed time
|
||||
@ -284,6 +317,7 @@ func (m BackupExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
m.done = true
|
||||
m.err = msg.err
|
||||
m.result = msg.result
|
||||
m.elapsed = msg.elapsed
|
||||
if m.err == nil {
|
||||
m.status = "[OK] Backup completed successfully!"
|
||||
} else {
|
||||
@ -361,14 +395,52 @@ func renderBackupDatabaseProgressBar(done, total int, dbName string, width int)
|
||||
return fmt.Sprintf(" Database: [%s] %d/%d", bar, done, total)
|
||||
}
|
||||
|
||||
// renderBackupDatabaseProgressBarWithTiming renders database backup progress with ETA
|
||||
func renderBackupDatabaseProgressBarWithTiming(done, total int, dbPhaseElapsed, dbAvgPerDB time.Duration) string {
|
||||
if total == 0 {
|
||||
return ""
|
||||
}
|
||||
|
||||
// Calculate progress percentage
|
||||
percent := float64(done) / float64(total)
|
||||
if percent > 1.0 {
|
||||
percent = 1.0
|
||||
}
|
||||
|
||||
// Build progress bar
|
||||
barWidth := 50
|
||||
filled := int(float64(barWidth) * percent)
|
||||
if filled > barWidth {
|
||||
filled = barWidth
|
||||
}
|
||||
bar := strings.Repeat("█", filled) + strings.Repeat("░", barWidth-filled)
|
||||
|
||||
// Calculate ETA similar to restore
|
||||
var etaStr string
|
||||
if done > 0 && done < total {
|
||||
avgPerDB := dbPhaseElapsed / time.Duration(done)
|
||||
remaining := total - done
|
||||
eta := avgPerDB * time.Duration(remaining)
|
||||
etaStr = fmt.Sprintf(" | ETA: %s", formatDuration(eta))
|
||||
} else if done == total {
|
||||
etaStr = " | Complete"
|
||||
}
|
||||
|
||||
return fmt.Sprintf(" Databases: [%s] %d/%d | Elapsed: %s%s\n",
|
||||
bar, done, total, formatDuration(dbPhaseElapsed), etaStr)
|
||||
}
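
As a worked example of the arithmetic above: if 3 of 10 databases have finished after 6 minutes of the database phase, the average is 2m per database, so the 7 remaining databases give an ETA of roughly 14m.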
|
||||
|
||||
func (m BackupExecutionModel) View() string {
|
||||
var s strings.Builder
|
||||
s.Grow(512) // Pre-allocate estimated capacity for better performance
|
||||
|
||||
// Clear screen with newlines and render header
|
||||
s.WriteString("\n\n")
|
||||
header := titleStyle.Render("[EXEC] Backup Execution")
|
||||
s.WriteString(header)
|
||||
header := "[EXEC] Backing up Database"
|
||||
if m.backupType == "cluster" {
|
||||
header = "[EXEC] Cluster Backup"
|
||||
}
|
||||
s.WriteString(titleStyle.Render(header))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
// Backup details - properly aligned
|
||||
@ -379,7 +451,6 @@ func (m BackupExecutionModel) View() string {
|
||||
if m.ratio > 0 {
|
||||
s.WriteString(fmt.Sprintf(" %-10s %d\n", "Sample:", m.ratio))
|
||||
}
|
||||
s.WriteString(fmt.Sprintf(" %-10s %s\n", "Duration:", time.Since(m.startTime).Round(time.Second)))
|
||||
s.WriteString("\n")
|
||||
|
||||
// Status display
|
||||
@ -395,11 +466,15 @@ func (m BackupExecutionModel) View() string {
|
||||
|
||||
elapsedSec := int(time.Since(m.startTime).Seconds())
|
||||
|
||||
if m.overallPhase == 2 && m.dbTotal > 0 {
|
||||
if m.overallPhase == backupPhaseDatabases && m.dbTotal > 0 {
|
||||
// Phase 2: Database backups - contributes 15-90%
|
||||
dbPct := int((int64(m.dbDone) * 100) / int64(m.dbTotal))
|
||||
overallProgress = 15 + (dbPct * 75 / 100)
|
||||
phaseLabel = m.phaseDesc
|
||||
} else if m.overallPhase == backupPhaseCompressing {
|
||||
// Phase 3: Compressing archive
|
||||
overallProgress = 92
|
||||
phaseLabel = "Phase 3/3: Compressing Archive"
|
||||
} else if elapsedSec < 5 {
|
||||
// Initial setup
|
||||
overallProgress = 2
|
||||
@ -430,9 +505,9 @@ func (m BackupExecutionModel) View() string {
|
||||
}
|
||||
s.WriteString("\n")
|
||||
|
||||
// Database progress bar
|
||||
progressBar := renderBackupDatabaseProgressBar(m.dbDone, m.dbTotal, m.dbName, 50)
|
||||
s.WriteString(progressBar + "\n")
|
||||
// Database progress bar with timing
|
||||
s.WriteString(renderBackupDatabaseProgressBarWithTiming(m.dbDone, m.dbTotal, m.dbPhaseElapsed, m.dbAvgPerDB))
|
||||
s.WriteString("\n")
|
||||
} else {
|
||||
// Intermediate phase (globals)
|
||||
spinner := spinnerFrames[m.spinnerFrame]
|
||||
@ -449,86 +524,87 @@ func (m BackupExecutionModel) View() string {
|
||||
}
|
||||
|
||||
if !m.cancelling {
|
||||
s.WriteString("\n [KEY] Press Ctrl+C or ESC to cancel\n")
|
||||
// Elapsed time
|
||||
s.WriteString(fmt.Sprintf("Elapsed: %s\n", formatDuration(time.Since(m.startTime))))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(infoStyle.Render("[KEYS] Press Ctrl+C or ESC to cancel"))
|
||||
}
|
||||
} else {
|
||||
// Show completion summary with detailed stats
|
||||
if m.err != nil {
|
||||
s.WriteString(errorStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(errorStyle.Render(" ╔══════════════════════════════════════════════════════════╗"))
|
||||
s.WriteString(errorStyle.Render("║ [FAIL] BACKUP FAILED ║"))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(errorStyle.Render(" ║ [FAIL] BACKUP FAILED ║"))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(errorStyle.Render(" ╚══════════════════════════════════════════════════════════╝"))
|
||||
s.WriteString(errorStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(errorStyle.Render(fmt.Sprintf(" Error: %v", m.err)))
|
||||
s.WriteString(errorStyle.Render(fmt.Sprintf(" Error: %v", m.err)))
|
||||
s.WriteString("\n")
|
||||
} else {
|
||||
s.WriteString(successStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(successStyle.Render(" ╔══════════════════════════════════════════════════════════╗"))
|
||||
s.WriteString(successStyle.Render("║ [OK] BACKUP COMPLETED SUCCESSFULLY ║"))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(successStyle.Render(" ║ [OK] BACKUP COMPLETED SUCCESSFULLY ║"))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(successStyle.Render(" ╚══════════════════════════════════════════════════════════╝"))
|
||||
s.WriteString(successStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
// Summary section
|
||||
s.WriteString(infoStyle.Render(" ─── Summary ─────────────────────────────────────────────"))
|
||||
s.WriteString(infoStyle.Render(" ─── Summary ───────────────────────────────────────────────"))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
// Archive info (if available)
|
||||
if m.archivePath != "" {
|
||||
s.WriteString(fmt.Sprintf(" Archive: %s\n", filepath.Base(m.archivePath)))
|
||||
}
|
||||
if m.archiveSize > 0 {
|
||||
s.WriteString(fmt.Sprintf(" Archive Size: %s\n", FormatBytes(m.archiveSize)))
|
||||
}
|
||||
|
||||
// Backup type specific info
|
||||
switch m.backupType {
|
||||
case "cluster":
|
||||
s.WriteString(" Type: Cluster Backup\n")
|
||||
s.WriteString(" Type: Cluster Backup\n")
|
||||
if m.dbTotal > 0 {
|
||||
s.WriteString(fmt.Sprintf(" Databases: %d backed up\n", m.dbTotal))
|
||||
s.WriteString(fmt.Sprintf(" Databases: %d backed up\n", m.dbTotal))
|
||||
}
|
||||
case "single":
|
||||
s.WriteString(" Type: Single Database Backup\n")
|
||||
s.WriteString(fmt.Sprintf(" Database: %s\n", m.databaseName))
|
||||
s.WriteString(" Type: Single Database Backup\n")
|
||||
s.WriteString(fmt.Sprintf(" Database: %s\n", m.databaseName))
|
||||
case "sample":
|
||||
s.WriteString(" Type: Sample Backup\n")
|
||||
s.WriteString(fmt.Sprintf(" Database: %s\n", m.databaseName))
|
||||
s.WriteString(fmt.Sprintf(" Sample Ratio: %d\n", m.ratio))
|
||||
s.WriteString(" Type: Sample Backup\n")
|
||||
s.WriteString(fmt.Sprintf(" Database: %s\n", m.databaseName))
|
||||
s.WriteString(fmt.Sprintf(" Sample Ratio: %d\n", m.ratio))
|
||||
}
|
||||
|
||||
s.WriteString("\n")
|
||||
|
||||
// Timing section
|
||||
s.WriteString(infoStyle.Render(" ─── Timing ──────────────────────────────────────────────"))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
elapsed := time.Since(m.startTime)
|
||||
s.WriteString(fmt.Sprintf(" Total Time: %s\n", formatBackupDuration(elapsed)))
|
||||
|
||||
if m.backupType == "cluster" && m.dbTotal > 0 {
|
||||
avgPerDB := elapsed / time.Duration(m.dbTotal)
|
||||
s.WriteString(fmt.Sprintf(" Avg per DB: %s\n", formatBackupDuration(avgPerDB)))
|
||||
}
|
||||
|
||||
s.WriteString("\n")
|
||||
s.WriteString(infoStyle.Render(" ─────────────────────────────────────────────────────────"))
|
||||
s.WriteString("\n")
|
||||
}
|
||||
|
||||
// Timing section (always shown, consistent with restore)
|
||||
s.WriteString(infoStyle.Render(" ─── Timing ────────────────────────────────────────────────"))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
elapsed := m.elapsed
|
||||
if elapsed == 0 {
|
||||
elapsed = time.Since(m.startTime)
|
||||
}
|
||||
s.WriteString(fmt.Sprintf(" Total Time: %s\n", formatDuration(elapsed)))
|
||||
|
||||
// Calculate and show throughput if we have size info
|
||||
if m.archiveSize > 0 && elapsed.Seconds() > 0 {
|
||||
throughput := float64(m.archiveSize) / elapsed.Seconds()
|
||||
s.WriteString(fmt.Sprintf(" Throughput: %s/s (average)\n", FormatBytes(int64(throughput))))
|
||||
}
|
||||
|
||||
if m.backupType == "cluster" && m.dbTotal > 0 && m.err == nil {
|
||||
avgPerDB := elapsed / time.Duration(m.dbTotal)
|
||||
s.WriteString(fmt.Sprintf(" Avg per DB: %s\n", formatDuration(avgPerDB)))
|
||||
}
|
||||
|
||||
s.WriteString("\n")
|
||||
s.WriteString(" [KEY] Press Enter or ESC to return to menu\n")
|
||||
s.WriteString(infoStyle.Render(" ───────────────────────────────────────────────────────────"))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(infoStyle.Render(" [KEYS] Press Enter to continue"))
|
||||
}
|
||||
|
||||
return s.String()
|
||||
}
|
||||
|
||||
// formatBackupDuration formats duration in human readable format
|
||||
func formatBackupDuration(d time.Duration) string {
|
||||
if d < time.Minute {
|
||||
return fmt.Sprintf("%.1fs", d.Seconds())
|
||||
}
|
||||
if d < time.Hour {
|
||||
minutes := int(d.Minutes())
|
||||
seconds := int(d.Seconds()) % 60
|
||||
return fmt.Sprintf("%dm %ds", minutes, seconds)
|
||||
}
|
||||
hours := int(d.Hours())
|
||||
minutes := int(d.Minutes()) % 60
|
||||
return fmt.Sprintf("%dh %dm", hours, minutes)
|
||||
}
|
||||
|
||||
281
internal/tui/cluster_db_selector.go
Normal file
281
internal/tui/cluster_db_selector.go
Normal file
@ -0,0 +1,281 @@
|
||||
package tui
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"strings"
|
||||
|
||||
tea "github.com/charmbracelet/bubbletea"
|
||||
|
||||
"dbbackup/internal/config"
|
||||
"dbbackup/internal/logger"
|
||||
"dbbackup/internal/restore"
|
||||
)
|
||||
|
||||
// ClusterDatabaseSelectorModel for selecting databases from a cluster backup
|
||||
type ClusterDatabaseSelectorModel struct {
|
||||
config *config.Config
|
||||
logger logger.Logger
|
||||
parent tea.Model
|
||||
ctx context.Context
|
||||
archive ArchiveInfo
|
||||
databases []restore.DatabaseInfo
|
||||
cursor int
|
||||
selected map[int]bool // Track multiple selections
|
||||
loading bool
|
||||
err error
|
||||
title string
|
||||
mode string // "single" or "multiple"
|
||||
extractOnly bool // If true, extract without restoring
|
||||
}
|
||||
|
||||
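// NewClusterDatabaseSelector builds the selector opened when the user presses
// 's' on a cluster backup in the archive browser. mode is "single" or
// "multiple"; extractOnly extracts the selected dump(s) without restoring them.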
func NewClusterDatabaseSelector(cfg *config.Config, log logger.Logger, parent tea.Model, ctx context.Context, archive ArchiveInfo, mode string, extractOnly bool) ClusterDatabaseSelectorModel {
|
||||
return ClusterDatabaseSelectorModel{
|
||||
config: cfg,
|
||||
logger: log,
|
||||
parent: parent,
|
||||
ctx: ctx,
|
||||
archive: archive,
|
||||
databases: nil,
|
||||
selected: make(map[int]bool),
|
||||
title: "Select Database(s) from Cluster Backup",
|
||||
loading: true,
|
||||
mode: mode,
|
||||
extractOnly: extractOnly,
|
||||
}
|
||||
}
|
||||
|
||||
func (m ClusterDatabaseSelectorModel) Init() tea.Cmd {
|
||||
return fetchClusterDatabases(m.ctx, m.archive, m.logger)
|
||||
}
|
||||
|
||||
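// clusterDatabaseListMsg carries the result of scanning a cluster backup:
// the databases found, or the error that prevented listing them.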
type clusterDatabaseListMsg struct {
|
||||
databases []restore.DatabaseInfo
|
||||
err error
|
||||
}
|
||||
|
||||
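// fetchClusterDatabases returns a tea.Cmd that scans the cluster archive via
// restore.ListDatabasesInCluster and reports the contained databases without
// performing a full extraction.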
func fetchClusterDatabases(ctx context.Context, archive ArchiveInfo, log logger.Logger) tea.Cmd {
|
||||
return func() tea.Msg {
|
||||
databases, err := restore.ListDatabasesInCluster(ctx, archive.Path, log)
|
||||
if err != nil {
|
||||
return clusterDatabaseListMsg{databases: nil, err: fmt.Errorf("failed to list databases: %w", err)}
|
||||
}
|
||||
return clusterDatabaseListMsg{databases: databases, err: nil}
|
||||
}
|
||||
}
|
||||
|
||||
func (m ClusterDatabaseSelectorModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
switch msg := msg.(type) {
|
||||
case clusterDatabaseListMsg:
|
||||
m.loading = false
|
||||
if msg.err != nil {
|
||||
m.err = msg.err
|
||||
} else {
|
||||
m.databases = msg.databases
|
||||
if len(m.databases) > 0 && m.mode == "single" {
|
||||
m.selected[0] = true // Pre-select first database in single mode
|
||||
}
|
||||
}
|
||||
return m, nil
|
||||
|
||||
case tea.KeyMsg:
|
||||
if m.loading {
|
||||
return m, nil
|
||||
}
|
||||
|
||||
switch msg.String() {
|
||||
case "q", "esc":
|
||||
// Return to parent
|
||||
return m.parent, nil
|
||||
|
||||
case "up", "k":
|
||||
if m.cursor > 0 {
|
||||
m.cursor--
|
||||
}
|
||||
|
||||
case "down", "j":
|
||||
if m.cursor < len(m.databases)-1 {
|
||||
m.cursor++
|
||||
}
|
||||
|
||||
case " ": // Space to toggle selection (multiple mode)
|
||||
if m.mode == "multiple" {
|
||||
m.selected[m.cursor] = !m.selected[m.cursor]
|
||||
} else {
|
||||
// Single mode: clear all and select current
|
||||
m.selected = make(map[int]bool)
|
||||
m.selected[m.cursor] = true
|
||||
}
|
||||
|
||||
case "enter":
|
||||
if m.err != nil {
|
||||
return m.parent, nil
|
||||
}
|
||||
|
||||
if len(m.databases) == 0 {
|
||||
return m.parent, nil
|
||||
}
|
||||
|
||||
// Get selected database(s)
|
||||
var selectedDBs []restore.DatabaseInfo
|
||||
for i, selected := range m.selected {
|
||||
if selected && i < len(m.databases) {
|
||||
selectedDBs = append(selectedDBs, m.databases[i])
|
||||
}
|
||||
}
|
||||
|
||||
if len(selectedDBs) == 0 {
|
||||
// No selection, use cursor position
|
||||
selectedDBs = []restore.DatabaseInfo{m.databases[m.cursor]}
|
||||
}
|
||||
|
||||
if m.extractOnly {
|
||||
// TODO: Implement extraction flow
|
||||
m.logger.Info("Extract-only mode not yet implemented in TUI")
|
||||
return m.parent, nil
|
||||
}
|
||||
|
||||
// For restore: proceed to restore preview/confirmation
|
||||
if len(selectedDBs) == 1 {
|
||||
// Single database restore from cluster
|
||||
// Create a temporary archive info for the selected database
|
||||
dbArchive := ArchiveInfo{
|
||||
Name: selectedDBs[0].Filename,
|
||||
Path: m.archive.Path, // Still use cluster archive path
|
||||
Format: m.archive.Format,
|
||||
Size: selectedDBs[0].Size,
|
||||
Modified: m.archive.Modified,
|
||||
DatabaseName: selectedDBs[0].Name,
|
||||
}
|
||||
|
||||
preview := NewRestorePreview(m.config, m.logger, m.parent, m.ctx, dbArchive, "restore-cluster-single")
|
||||
return preview, preview.Init()
|
||||
} else {
|
||||
// Multiple database restore - not yet implemented
|
||||
m.logger.Info("Multiple database restore not yet implemented in TUI")
|
||||
return m.parent, nil
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return m, nil
|
||||
}
|
||||
|
||||
func (m ClusterDatabaseSelectorModel) View() string {
|
||||
if m.loading {
|
||||
return TitleStyle.Render("Loading databases from cluster backup...") + "\n\nPlease wait..."
|
||||
}
|
||||
|
||||
if m.err != nil {
|
||||
var s strings.Builder
|
||||
s.WriteString(TitleStyle.Render("Error"))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(StatusErrorStyle.Render("Failed to list databases"))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(m.err.Error())
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(StatusReadyStyle.Render("Press any key to go back"))
|
||||
return s.String()
|
||||
}
|
||||
|
||||
if len(m.databases) == 0 {
|
||||
var s strings.Builder
|
||||
s.WriteString(TitleStyle.Render("No Databases Found"))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(StatusWarningStyle.Render("The cluster backup appears to be empty or invalid."))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(StatusReadyStyle.Render("Press any key to go back"))
|
||||
return s.String()
|
||||
}
|
||||
|
||||
var s strings.Builder
|
||||
|
||||
// Title
|
||||
s.WriteString(TitleStyle.Render(m.title))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
// Archive info
|
||||
s.WriteString(LabelStyle.Render("Archive: "))
|
||||
s.WriteString(m.archive.Name)
|
||||
s.WriteString("\n")
|
||||
s.WriteString(LabelStyle.Render("Databases: "))
|
||||
s.WriteString(fmt.Sprintf("%d", len(m.databases)))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
// Instructions
|
||||
if m.mode == "multiple" {
|
||||
s.WriteString(StatusReadyStyle.Render("↑/↓: navigate • space: select/deselect • enter: confirm • q/esc: back"))
|
||||
} else {
|
||||
s.WriteString(StatusReadyStyle.Render("↑/↓: navigate • enter: select • q/esc: back"))
|
||||
}
|
||||
s.WriteString("\n\n")
|
||||
|
||||
// Database list
|
||||
s.WriteString(ListHeaderStyle.Render("Available Databases:"))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
for i, db := range m.databases {
|
||||
cursor := " "
|
||||
if m.cursor == i {
|
||||
cursor = "▶ "
|
||||
}
|
||||
|
||||
checkbox := ""
|
||||
if m.mode == "multiple" {
|
||||
if m.selected[i] {
|
||||
checkbox = "[✓] "
|
||||
} else {
|
||||
checkbox = "[ ] "
|
||||
}
|
||||
} else {
|
||||
if m.selected[i] {
|
||||
checkbox = "● "
|
||||
} else {
|
||||
checkbox = "○ "
|
||||
}
|
||||
}
|
||||
|
||||
sizeStr := formatBytes(db.Size)
|
||||
line := fmt.Sprintf("%s%s%-40s %10s", cursor, checkbox, db.Name, sizeStr)
|
||||
|
||||
if m.cursor == i {
|
||||
s.WriteString(ListSelectedStyle.Render(line))
|
||||
} else {
|
||||
s.WriteString(ListNormalStyle.Render(line))
|
||||
}
|
||||
s.WriteString("\n")
|
||||
}
|
||||
|
||||
s.WriteString("\n")
|
||||
|
||||
// Selection summary
|
||||
selectedCount := 0
|
||||
var totalSize int64
|
||||
for i, selected := range m.selected {
|
||||
if selected && i < len(m.databases) {
|
||||
selectedCount++
|
||||
totalSize += m.databases[i].Size
|
||||
}
|
||||
}
|
||||
|
||||
if selectedCount > 0 {
|
||||
s.WriteString(StatusSuccessStyle.Render(fmt.Sprintf("Selected: %d database(s), Total size: %s", selectedCount, formatBytes(totalSize))))
|
||||
s.WriteString("\n")
|
||||
}
|
||||
|
||||
return s.String()
|
||||
}
|
||||
|
||||
// formatBytes formats byte count as human-readable string
|
||||
func formatBytes(bytes int64) string {
|
||||
const unit = 1024
|
||||
if bytes < unit {
|
||||
return fmt.Sprintf("%d B", bytes)
|
||||
}
|
||||
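// Find the largest 1024-based unit that keeps the displayed value below 1024.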
div, exp := int64(unit), 0
|
||||
for n := bytes / unit; n >= unit; n /= unit {
|
||||
div *= unit
|
||||
exp++
|
||||
}
|
||||
return fmt.Sprintf("%.1f %cB", float64(bytes)/float64(div), "KMGTPE"[exp])
|
||||
}
|
||||
@ -299,9 +299,13 @@ func (m *MenuModel) View() string {
|
||||
|
||||
var s string
|
||||
|
||||
// Product branding header
|
||||
brandLine := fmt.Sprintf("dbbackup v%s • Enterprise Database Backup & Recovery", m.config.Version)
|
||||
s += "\n" + infoStyle.Render(brandLine) + "\n"
|
||||
|
||||
// Header
|
||||
header := titleStyle.Render("Database Backup Tool - Interactive Menu")
|
||||
s += fmt.Sprintf("\n%s\n\n", header)
|
||||
header := titleStyle.Render("Interactive Menu")
|
||||
s += fmt.Sprintf("%s\n\n", header)
|
||||
|
||||
if len(m.dbTypes) > 0 {
|
||||
options := make([]string, len(m.dbTypes))
|
||||
|
||||
@ -152,8 +152,9 @@ type sharedProgressState struct {
|
||||
currentDB string
|
||||
|
||||
// Timing info for database restore phase
|
||||
dbPhaseElapsed time.Duration // Elapsed time since restore phase started
|
||||
dbAvgPerDB time.Duration // Average time per database restore
|
||||
dbPhaseElapsed time.Duration // Elapsed time since restore phase started
|
||||
dbAvgPerDB time.Duration // Average time per database restore
|
||||
phase3StartTime time.Time // When phase 3 started (for realtime ETA calculation)
|
||||
|
||||
// Overall phase tracking (1=Extract, 2=Globals, 3=Databases)
|
||||
overallPhase int
|
||||
@ -190,12 +191,12 @@ func clearCurrentRestoreProgress() {
|
||||
currentRestoreProgressState = nil
|
||||
}
|
||||
|
||||
func getCurrentRestoreProgress() (bytesTotal, bytesDone int64, description string, hasUpdate bool, dbTotal, dbDone int, speed float64, dbPhaseElapsed, dbAvgPerDB time.Duration, currentDB string, overallPhase int, extractionDone bool, dbBytesTotal, dbBytesDone int64) {
|
||||
func getCurrentRestoreProgress() (bytesTotal, bytesDone int64, description string, hasUpdate bool, dbTotal, dbDone int, speed float64, dbPhaseElapsed, dbAvgPerDB time.Duration, currentDB string, overallPhase int, extractionDone bool, dbBytesTotal, dbBytesDone int64, phase3StartTime time.Time) {
|
||||
currentRestoreProgressMu.Lock()
|
||||
defer currentRestoreProgressMu.Unlock()
|
||||
|
||||
if currentRestoreProgressState == nil {
|
||||
return 0, 0, "", false, 0, 0, 0, 0, 0, "", 0, false, 0, 0
|
||||
return 0, 0, "", false, 0, 0, 0, 0, 0, "", 0, false, 0, 0, time.Time{}
|
||||
}
|
||||
|
||||
currentRestoreProgressState.mu.Lock()
|
||||
@ -204,13 +205,20 @@ func getCurrentRestoreProgress() (bytesTotal, bytesDone int64, description strin
|
||||
// Calculate rolling window speed
|
||||
speed = calculateRollingSpeed(currentRestoreProgressState.speedSamples)
|
||||
|
||||
// Calculate realtime phase elapsed if we have a phase 3 start time
|
||||
dbPhaseElapsed = currentRestoreProgressState.dbPhaseElapsed
|
||||
if !currentRestoreProgressState.phase3StartTime.IsZero() {
|
||||
dbPhaseElapsed = time.Since(currentRestoreProgressState.phase3StartTime)
|
||||
}
|
||||
|
||||
return currentRestoreProgressState.bytesTotal, currentRestoreProgressState.bytesDone,
|
||||
currentRestoreProgressState.description, currentRestoreProgressState.hasUpdate,
|
||||
currentRestoreProgressState.dbTotal, currentRestoreProgressState.dbDone, speed,
|
||||
currentRestoreProgressState.dbPhaseElapsed, currentRestoreProgressState.dbAvgPerDB,
|
||||
dbPhaseElapsed, currentRestoreProgressState.dbAvgPerDB,
|
||||
currentRestoreProgressState.currentDB, currentRestoreProgressState.overallPhase,
|
||||
currentRestoreProgressState.extractionDone,
|
||||
currentRestoreProgressState.dbBytesTotal, currentRestoreProgressState.dbBytesDone
|
||||
currentRestoreProgressState.dbBytesTotal, currentRestoreProgressState.dbBytesDone,
|
||||
currentRestoreProgressState.phase3StartTime
|
||||
}
|
||||
|
||||
// calculateRollingSpeed calculates speed from recent samples (last 5 seconds)
|
||||
@ -253,11 +261,9 @@ type restoreProgressChannel chan restoreProgressMsg
|
||||
|
||||
func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string, cleanFirst, createIfMissing bool, restoreType string, cleanClusterFirst bool, existingDBs []string, saveDebugLog bool) tea.Cmd {
|
||||
return func() tea.Msg {
|
||||
// NO TIMEOUT for restore operations - a restore takes as long as it takes
|
||||
// Large databases with large objects can take many hours
|
||||
// Only manual cancellation (Ctrl+C) should stop the restore
|
||||
ctx, cancel := context.WithCancel(parentCtx)
|
||||
defer cancel()
|
||||
// Use the parent context directly - it's already cancellable from the model
|
||||
// DO NOT create a new context here as it breaks Ctrl+C cancellation
|
||||
ctx := parentCtx
|
||||
|
||||
start := time.Now()
|
||||
|
||||
@ -273,26 +279,42 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
defer dbClient.Close()
|
||||
|
||||
// STEP 1: Clean cluster if requested (drop all existing user databases)
|
||||
if restoreType == "restore-cluster" && cleanClusterFirst && len(existingDBs) > 0 {
|
||||
log.Info("Dropping existing user databases before cluster restore", "count", len(existingDBs))
|
||||
|
||||
// Drop databases using command-line psql (no connection required)
|
||||
// This matches how cluster restore works - uses CLI tools, not database connections
|
||||
droppedCount := 0
|
||||
for _, dbName := range existingDBs {
|
||||
// Create timeout context for each database drop (5 minutes per DB - large DBs take time)
|
||||
dropCtx, dropCancel := context.WithTimeout(ctx, 5*time.Minute)
|
||||
if err := dropDatabaseCLI(dropCtx, cfg, dbName); err != nil {
|
||||
log.Warn("Failed to drop database", "name", dbName, "error", err)
|
||||
// Continue with other databases
|
||||
} else {
|
||||
droppedCount++
|
||||
log.Info("Dropped database", "name", dbName)
|
||||
}
|
||||
dropCancel() // Clean up context
|
||||
if restoreType == "restore-cluster" && cleanClusterFirst {
|
||||
// Re-detect databases at execution time to get current state
|
||||
// The preview list may be stale or detection may have failed earlier
|
||||
safety := restore.NewSafety(cfg, log)
|
||||
currentDBs, err := safety.ListUserDatabases(ctx)
|
||||
if err != nil {
|
||||
log.Warn("Failed to list databases for cleanup, using preview list", "error", err)
|
||||
currentDBs = existingDBs // Fall back to preview list
|
||||
} else if len(currentDBs) > 0 {
|
||||
log.Info("Re-detected user databases for cleanup", "count", len(currentDBs), "databases", currentDBs)
|
||||
existingDBs = currentDBs // Update with fresh list
|
||||
}
|
||||
|
||||
log.Info("Cluster cleanup completed", "dropped", droppedCount, "total", len(existingDBs))
|
||||
if len(existingDBs) > 0 {
|
||||
log.Info("Dropping existing user databases before cluster restore", "count", len(existingDBs))
|
||||
|
||||
// Drop databases using command-line psql (no connection required)
|
||||
// This matches how cluster restore works - uses CLI tools, not database connections
|
||||
droppedCount := 0
|
||||
for _, dbName := range existingDBs {
|
||||
// Create timeout context for each database drop (5 minutes per DB - large DBs take time)
|
||||
dropCtx, dropCancel := context.WithTimeout(ctx, 5*time.Minute)
|
||||
if err := dropDatabaseCLI(dropCtx, cfg, dbName); err != nil {
|
||||
log.Warn("Failed to drop database", "name", dbName, "error", err)
|
||||
// Continue with other databases
|
||||
} else {
|
||||
droppedCount++
|
||||
log.Info("Dropped database", "name", dbName)
|
||||
}
|
||||
dropCancel() // Clean up context
|
||||
}
|
||||
|
||||
log.Info("Cluster cleanup completed", "dropped", droppedCount, "total", len(existingDBs))
|
||||
} else {
|
||||
log.Info("No user databases to clean up")
|
||||
}
|
||||
}
|
||||
|
||||
// STEP 2: Create restore engine with silent progress (no stdout interference with TUI)
|
||||
@ -341,6 +363,10 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
progressState.overallPhase = 3
|
||||
progressState.extractionDone = true
|
||||
progressState.hasUpdate = true
|
||||
// Set phase 3 start time on first callback (for realtime ETA calculation)
|
||||
if progressState.phase3StartTime.IsZero() {
|
||||
progressState.phase3StartTime = time.Now()
|
||||
}
|
||||
// Clear byte progress when switching to db progress
|
||||
progressState.bytesTotal = 0
|
||||
progressState.bytesDone = 0
|
||||
@ -359,6 +385,10 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
progressState.dbPhaseElapsed = phaseElapsed
|
||||
progressState.dbAvgPerDB = avgPerDB
|
||||
progressState.hasUpdate = true
|
||||
// Set phase 3 start time on first callback (for realtime ETA calculation)
|
||||
if progressState.phase3StartTime.IsZero() {
|
||||
progressState.phase3StartTime = time.Now()
|
||||
}
|
||||
// Clear byte progress when switching to db progress
|
||||
progressState.bytesTotal = 0
|
||||
progressState.bytesDone = 0
|
||||
@ -376,6 +406,10 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
progressState.overallPhase = 3
|
||||
progressState.extractionDone = true
|
||||
progressState.hasUpdate = true
|
||||
// Set phase 3 start time on first callback (for realtime ETA calculation)
|
||||
if progressState.phase3StartTime.IsZero() {
|
||||
progressState.phase3StartTime = time.Now()
|
||||
}
|
||||
})
|
||||
|
||||
// Store progress state in a package-level variable for the ticker to access
|
||||
@ -396,6 +430,9 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
var restoreErr error
|
||||
if restoreType == "restore-cluster" {
|
||||
restoreErr = engine.RestoreCluster(ctx, archive.Path)
|
||||
} else if restoreType == "restore-cluster-single" {
|
||||
// Restore single database from cluster backup
|
||||
restoreErr = engine.RestoreSingleFromCluster(ctx, archive.Path, targetDB, targetDB, cleanFirst, createIfMissing)
|
||||
} else {
|
||||
restoreErr = engine.RestoreSingle(ctx, archive.Path, targetDB, cleanFirst, createIfMissing)
|
||||
}
|
||||
@ -411,6 +448,8 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
result := fmt.Sprintf("Successfully restored from %s", archive.Name)
|
||||
if restoreType == "restore-single" {
|
||||
result = fmt.Sprintf("Successfully restored '%s' from %s", targetDB, archive.Name)
|
||||
} else if restoreType == "restore-cluster-single" {
|
||||
result = fmt.Sprintf("Successfully restored '%s' from cluster %s", targetDB, archive.Name)
|
||||
} else if restoreType == "restore-cluster" && cleanClusterFirst {
|
||||
result = fmt.Sprintf("Successfully restored cluster from %s (cleaned %d existing database(s) first)", archive.Name, len(existingDBs))
|
||||
}
|
||||
@ -431,7 +470,8 @@ func (m RestoreExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
m.elapsed = time.Since(m.startTime)
|
||||
|
||||
// Poll shared progress state for real-time updates
|
||||
bytesTotal, bytesDone, description, hasUpdate, dbTotal, dbDone, speed, dbPhaseElapsed, dbAvgPerDB, currentDB, overallPhase, extractionDone, dbBytesTotal, dbBytesDone := getCurrentRestoreProgress()
|
||||
// Note: dbPhaseElapsed is now calculated in realtime inside getCurrentRestoreProgress()
|
||||
bytesTotal, bytesDone, description, hasUpdate, dbTotal, dbDone, speed, dbPhaseElapsed, dbAvgPerDB, currentDB, overallPhase, extractionDone, dbBytesTotal, dbBytesDone, _ := getCurrentRestoreProgress()
|
||||
if hasUpdate && bytesTotal > 0 && !extractionDone {
|
||||
// Phase 1: Extraction
|
||||
m.bytesTotal = bytesTotal
|
||||
@ -623,13 +663,15 @@ func (m RestoreExecutionModel) View() string {
|
||||
title := "[EXEC] Restoring Database"
|
||||
if m.restoreType == "restore-cluster" {
|
||||
title = "[EXEC] Restoring Cluster"
|
||||
} else if m.restoreType == "restore-cluster-single" {
|
||||
title = "[EXEC] Restoring Single Database from Cluster"
|
||||
}
|
||||
s.WriteString(titleStyle.Render(title))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
// Archive info
|
||||
s.WriteString(fmt.Sprintf("Archive: %s\n", m.archive.Name))
|
||||
if m.restoreType == "restore-single" {
|
||||
if m.restoreType == "restore-single" || m.restoreType == "restore-cluster-single" {
|
||||
s.WriteString(fmt.Sprintf("Target: %s\n", m.targetDB))
|
||||
}
|
||||
s.WriteString("\n")
|
||||
@ -643,7 +685,13 @@ func (m RestoreExecutionModel) View() string {
|
||||
s.WriteString("\n")
|
||||
s.WriteString(errorStyle.Render("╚══════════════════════════════════════════════════════════════╝"))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(errorStyle.Render(fmt.Sprintf(" Error: %v", m.err)))
|
||||
|
||||
// Parse and display error in a clean, structured format
|
||||
errStr := m.err.Error()
|
||||
|
||||
// Extract key parts from the error message
|
||||
errDisplay := formatRestoreError(errStr)
|
||||
s.WriteString(errDisplay)
|
||||
s.WriteString("\n")
|
||||
} else {
|
||||
s.WriteString(successStyle.Render("╔══════════════════════════════════════════════════════════════╗"))
|
||||
@ -989,3 +1037,188 @@ func dropDatabaseCLI(ctx context.Context, cfg *config.Config, dbName string) err
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// formatRestoreError formats a restore error message for clean TUI display
|
||||
func formatRestoreError(errStr string) string {
|
||||
var s strings.Builder
|
||||
maxLineWidth := 60
|
||||
|
||||
// Common patterns to extract
|
||||
patterns := []struct {
|
||||
key string
|
||||
pattern string
|
||||
}{
|
||||
{"Error Type", "ERROR:"},
|
||||
{"Hint", "HINT:"},
|
||||
{"Last Error", "last error:"},
|
||||
{"Total Errors", "total errors:"},
|
||||
}
|
||||
|
||||
// First, try to extract a clean error summary
|
||||
errLines := strings.Split(errStr, "\n")
|
||||
|
||||
// Find the main error message (first line or first ERROR:)
|
||||
mainError := ""
|
||||
hint := ""
|
||||
totalErrors := ""
|
||||
dbsFailed := []string{}
|
||||
|
||||
for _, line := range errLines {
|
||||
line = strings.TrimSpace(line)
|
||||
if line == "" {
|
||||
continue
|
||||
}
|
||||
|
||||
// Extract ERROR messages
|
||||
if strings.Contains(line, "ERROR:") {
|
||||
if mainError == "" {
|
||||
// Get just the ERROR part
|
||||
idx := strings.Index(line, "ERROR:")
|
||||
if idx >= 0 {
|
||||
mainError = strings.TrimSpace(line[idx:])
|
||||
// Truncate if too long
|
||||
if len(mainError) > maxLineWidth {
|
||||
mainError = mainError[:maxLineWidth-3] + "..."
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Extract HINT
|
||||
if strings.Contains(line, "HINT:") {
|
||||
idx := strings.Index(line, "HINT:")
|
||||
if idx >= 0 {
|
||||
hint = strings.TrimSpace(line[idx+5:])
|
||||
if len(hint) > maxLineWidth {
|
||||
hint = hint[:maxLineWidth-3] + "..."
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Extract total errors count
|
||||
if strings.Contains(line, "total errors:") {
|
||||
idx := strings.Index(line, "total errors:")
|
||||
if idx >= 0 {
|
||||
totalErrors = strings.TrimSpace(line[idx+13:])
|
||||
// Just extract the number
|
||||
parts := strings.Fields(totalErrors)
|
||||
if len(parts) > 0 {
|
||||
totalErrors = parts[0]
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Extract failed database names (for cluster restore)
|
||||
if strings.Contains(line, ": restore failed:") {
|
||||
parts := strings.SplitN(line, ":", 2)
|
||||
if len(parts) > 0 {
|
||||
dbName := strings.TrimSpace(parts[0])
|
||||
if dbName != "" && !strings.HasPrefix(dbName, "Error") {
|
||||
dbsFailed = append(dbsFailed, dbName)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// If no structured error found, use the first line
|
||||
if mainError == "" {
|
||||
firstLine := errStr
|
||||
if idx := strings.Index(errStr, "\n"); idx > 0 {
|
||||
firstLine = errStr[:idx]
|
||||
}
|
||||
if len(firstLine) > maxLineWidth*2 {
|
||||
firstLine = firstLine[:maxLineWidth*2-3] + "..."
|
||||
}
|
||||
mainError = firstLine
|
||||
}
|
||||
|
||||
// Build structured error display
|
||||
s.WriteString(infoStyle.Render(" ─── Error Details ─────────────────────────────────────────"))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
// Error type detection
|
||||
errorType := "critical"
|
||||
if strings.Contains(errStr, "out of shared memory") || strings.Contains(errStr, "max_locks_per_transaction") {
|
||||
errorType = "critical"
|
||||
} else if strings.Contains(errStr, "connection") {
|
||||
errorType = "connection"
|
||||
} else if strings.Contains(errStr, "permission") || strings.Contains(errStr, "access") {
|
||||
errorType = "permission"
|
||||
}
|
||||
|
||||
s.WriteString(fmt.Sprintf(" Type: %s\n", errorType))
|
||||
s.WriteString(fmt.Sprintf(" Message: %s\n", mainError))
|
||||
|
||||
if hint != "" {
|
||||
s.WriteString(fmt.Sprintf(" Hint: %s\n", hint))
|
||||
}
|
||||
|
||||
if totalErrors != "" {
|
||||
s.WriteString(fmt.Sprintf(" Total Errors: %s\n", totalErrors))
|
||||
}
|
||||
|
||||
// Show failed databases (max 5)
|
||||
if len(dbsFailed) > 0 {
|
||||
s.WriteString("\n")
|
||||
s.WriteString(" Failed Databases:\n")
|
||||
for i, db := range dbsFailed {
|
||||
if i >= 5 {
|
||||
s.WriteString(fmt.Sprintf(" ... and %d more\n", len(dbsFailed)-5))
|
||||
break
|
||||
}
|
||||
s.WriteString(fmt.Sprintf(" • %s\n", db))
|
||||
}
|
||||
}
|
||||
|
||||
s.WriteString("\n")
|
||||
s.WriteString(infoStyle.Render(" ─── Diagnosis ─────────────────────────────────────────────"))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
// Provide specific recommendations based on error
|
||||
if strings.Contains(errStr, "out of shared memory") || strings.Contains(errStr, "max_locks_per_transaction") {
|
||||
s.WriteString(errorStyle.Render(" • PostgreSQL lock table exhausted\n"))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(infoStyle.Render(" ─── [HINT] Recommendations ────────────────────────────────"))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(" Lock capacity = max_locks_per_transaction\n")
|
||||
s.WriteString(" × (max_connections + max_prepared_transactions)\n\n")
|
||||
s.WriteString(" If you reduced VM size or max_connections, you need higher\n")
|
||||
s.WriteString(" max_locks_per_transaction to compensate.\n\n")
|
||||
s.WriteString(successStyle.Render(" FIX OPTIONS:\n"))
|
||||
s.WriteString(" 1. Enable 'Large DB Mode' in Settings\n")
|
||||
s.WriteString(" (press 'l' to toggle, reduces parallelism, increases locks)\n\n")
|
||||
s.WriteString(" 2. Increase PostgreSQL locks:\n")
|
||||
s.WriteString(" ALTER SYSTEM SET max_locks_per_transaction = 4096;\n")
|
||||
s.WriteString(" Then RESTART PostgreSQL.\n\n")
|
||||
s.WriteString(" 3. Reduce parallel jobs:\n")
|
||||
s.WriteString(" Set Cluster Parallelism = 1 in Settings\n")
|
||||
} else if strings.Contains(errStr, "connection") || strings.Contains(errStr, "refused") {
|
||||
s.WriteString(" • Database connection failed\n\n")
|
||||
s.WriteString(infoStyle.Render(" ─── [HINT] Recommendations ────────────────────────────────"))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(" 1. Check database is running\n")
|
||||
s.WriteString(" 2. Verify host, port, and credentials in Settings\n")
|
||||
s.WriteString(" 3. Check firewall/network connectivity\n")
|
||||
} else if strings.Contains(errStr, "permission") || strings.Contains(errStr, "denied") {
|
||||
s.WriteString(" • Permission denied\n\n")
|
||||
s.WriteString(infoStyle.Render(" ─── [HINT] Recommendations ────────────────────────────────"))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(" 1. Verify database user has sufficient privileges\n")
|
||||
s.WriteString(" 2. Grant CREATE/DROP DATABASE permissions if restoring cluster\n")
|
||||
s.WriteString(" 3. Check file system permissions on backup directory\n")
|
||||
} else {
|
||||
s.WriteString(" See error message above for details.\n\n")
|
||||
s.WriteString(infoStyle.Render(" ─── [HINT] General Recommendations ────────────────────────"))
|
||||
s.WriteString("\n\n")
|
||||
s.WriteString(" 1. Check the full error log for details\n")
|
||||
s.WriteString(" 2. Try restoring with 'conservative' profile (press 'c')\n")
|
||||
s.WriteString(" 3. For complex databases, enable 'Large DB Mode' (press 'l')\n")
|
||||
}
|
||||
|
||||
s.WriteString("\n")
|
||||
|
||||
// patterns is kept above as documentation of the markers we scan for;
// blank-assign it so the compiler's unused-variable check passes
|
||||
_ = patterns
|
||||
|
||||
return s.String()
|
||||
}
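The lock-capacity hint rendered above can be made concrete with a small worked example. This is only a sketch: the numbers are PostgreSQL defaults used for illustration, and the psql connection flags are assumptions about your environment.

```bash
# Lock capacity = max_locks_per_transaction × (max_connections + max_prepared_transactions)
# With defaults (64, 100, 0): 64 × (100 + 0) = 6,400 lock slots.
# Raising max_locks_per_transaction to 4096 gives 4096 × 100 = 409,600.
psql -U postgres -c "SHOW max_locks_per_transaction;"
psql -U postgres -c "ALTER SYSTEM SET max_locks_per_transaction = 4096;"
# max_locks_per_transaction only takes effect after a PostgreSQL restart.
```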
|
||||
|
||||
@ -55,6 +55,7 @@ type RestorePreviewModel struct {
|
||||
cleanClusterFirst bool // For cluster restore: drop all user databases first
|
||||
existingDBCount int // Number of existing user databases
|
||||
existingDBs []string // List of existing user databases
|
||||
existingDBError string // Error message if database listing failed
|
||||
safetyChecks []SafetyCheck
|
||||
checking bool
|
||||
canProceed bool
|
||||
@ -102,6 +103,7 @@ type safetyCheckCompleteMsg struct {
|
||||
canProceed bool
|
||||
existingDBCount int
|
||||
existingDBs []string
|
||||
existingDBError string
|
||||
}
|
||||
|
||||
func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string) tea.Cmd {
|
||||
@ -221,10 +223,12 @@ func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo,
|
||||
check = SafetyCheck{Name: "Existing databases", Status: "checking", Critical: false}
|
||||
|
||||
// Get list of existing user databases (exclude templates and system DBs)
|
||||
var existingDBError string
|
||||
dbList, err := safety.ListUserDatabases(ctx)
|
||||
if err != nil {
|
||||
check.Status = "warning"
|
||||
check.Message = fmt.Sprintf("Cannot list databases: %v", err)
|
||||
existingDBError = err.Error()
|
||||
} else {
|
||||
existingDBCount = len(dbList)
|
||||
existingDBs = dbList
|
||||
@ -238,6 +242,14 @@ func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo,
|
||||
}
|
||||
}
|
||||
checks = append(checks, check)
|
||||
|
||||
return safetyCheckCompleteMsg{
|
||||
checks: checks,
|
||||
canProceed: canProceed,
|
||||
existingDBCount: existingDBCount,
|
||||
existingDBs: existingDBs,
|
||||
existingDBError: existingDBError,
|
||||
}
|
||||
}
|
||||
|
||||
return safetyCheckCompleteMsg{
|
||||
@ -257,6 +269,7 @@ func (m RestorePreviewModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
m.canProceed = msg.canProceed
|
||||
m.existingDBCount = msg.existingDBCount
|
||||
m.existingDBs = msg.existingDBs
|
||||
m.existingDBError = msg.existingDBError
|
||||
// Auto-forward in auto-confirm mode
|
||||
if m.config.TUIAutoConfirm {
|
||||
return m.parent, tea.Quit
|
||||
@ -275,10 +288,17 @@ func (m RestorePreviewModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
|
||||
case "c":
|
||||
if m.mode == "restore-cluster" {
|
||||
// Toggle cluster cleanup
|
||||
// Toggle cluster cleanup - databases will be re-detected at execution time
|
||||
m.cleanClusterFirst = !m.cleanClusterFirst
|
||||
if m.cleanClusterFirst {
|
||||
m.message = checkWarningStyle.Render(fmt.Sprintf("[WARN] Will drop %d existing database(s) before restore", m.existingDBCount))
|
||||
if m.existingDBError != "" {
|
||||
// Detection failed in preview - will re-detect at execution
|
||||
m.message = checkWarningStyle.Render("[WARN] Will clean existing databases before restore (detection pending)")
|
||||
} else if m.existingDBCount > 0 {
|
||||
m.message = checkWarningStyle.Render(fmt.Sprintf("[WARN] Will drop %d existing database(s) before restore", m.existingDBCount))
|
||||
} else {
|
||||
m.message = infoStyle.Render("[INFO] Cleanup enabled (no databases currently detected)")
|
||||
}
|
||||
} else {
|
||||
m.message = fmt.Sprintf("Clean cluster first: disabled")
|
||||
}
|
||||
@ -382,7 +402,27 @@ func (m RestorePreviewModel) View() string {
|
||||
s.WriteString("\n")
|
||||
s.WriteString(fmt.Sprintf(" Host: %s:%d\n", m.config.Host, m.config.Port))
|
||||
|
||||
if m.existingDBCount > 0 {
|
||||
// Show Resource Profile and CPU Workload settings
|
||||
profile := m.config.GetCurrentProfile()
|
||||
if profile != nil {
|
||||
s.WriteString(fmt.Sprintf(" Resource Profile: %s (Parallel:%d, Jobs:%d)\n",
|
||||
profile.Name, profile.ClusterParallelism, profile.Jobs))
|
||||
} else {
|
||||
s.WriteString(fmt.Sprintf(" Resource Profile: %s\n", m.config.ResourceProfile))
|
||||
}
|
||||
// Show Large DB Mode status
|
||||
if m.config.LargeDBMode {
|
||||
s.WriteString(" Large DB Mode: ON (reduced parallelism, high locks)\n")
|
||||
}
|
||||
s.WriteString(fmt.Sprintf(" CPU Workload: %s\n", m.config.CPUWorkloadType))
|
||||
s.WriteString(fmt.Sprintf(" Cluster Parallelism: %d databases\n", m.config.ClusterParallelism))
|
||||
|
||||
if m.existingDBError != "" {
|
||||
// Show warning when database listing failed - but still allow cleanup toggle
|
||||
s.WriteString(checkWarningStyle.Render(" Existing Databases: Detection failed\n"))
|
||||
s.WriteString(infoStyle.Render(fmt.Sprintf(" (%s)\n", m.existingDBError)))
|
||||
s.WriteString(infoStyle.Render(" (Will re-detect at restore time)\n"))
|
||||
} else if m.existingDBCount > 0 {
|
||||
s.WriteString(fmt.Sprintf(" Existing Databases: %d found\n", m.existingDBCount))
|
||||
|
||||
// Show first few database names
|
||||
@ -395,17 +435,20 @@ func (m RestorePreviewModel) View() string {
|
||||
}
|
||||
s.WriteString(fmt.Sprintf(" - %s\n", db))
|
||||
}
|
||||
|
||||
cleanIcon := "[N]"
|
||||
cleanStyle := infoStyle
|
||||
if m.cleanClusterFirst {
|
||||
cleanIcon = "[Y]"
|
||||
cleanStyle = checkWarningStyle
|
||||
}
|
||||
s.WriteString(cleanStyle.Render(fmt.Sprintf(" Clean All First: %s %v (press 'c' to toggle)\n", cleanIcon, m.cleanClusterFirst)))
|
||||
} else {
|
||||
s.WriteString(" Existing Databases: None (clean slate)\n")
|
||||
}
|
||||
|
||||
// Always show cleanup toggle for cluster restore
|
||||
cleanIcon := "[N]"
|
||||
cleanStyle := infoStyle
|
||||
if m.cleanClusterFirst {
|
||||
cleanIcon = "[Y]"
|
||||
cleanStyle = checkWarningStyle
|
||||
s.WriteString(cleanStyle.Render(fmt.Sprintf(" Clean All First: %s enabled (press 'c' to toggle)\n", cleanIcon)))
|
||||
} else {
|
||||
s.WriteString(cleanStyle.Render(fmt.Sprintf(" Clean All First: %s disabled (press 'c' to toggle)\n", cleanIcon)))
|
||||
}
|
||||
s.WriteString("\n")
|
||||
}
|
||||
|
||||
@ -453,10 +496,18 @@ func (m RestorePreviewModel) View() string {
|
||||
s.WriteString(infoStyle.Render(" All existing data in target database will be dropped!"))
|
||||
s.WriteString("\n\n")
|
||||
}
|
||||
if m.cleanClusterFirst && m.existingDBCount > 0 {
|
||||
if m.cleanClusterFirst {
|
||||
s.WriteString(checkWarningStyle.Render("[DANGER] WARNING: Cluster cleanup enabled"))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(checkWarningStyle.Render(fmt.Sprintf(" %d existing database(s) will be DROPPED before restore!", m.existingDBCount)))
|
||||
if m.existingDBError != "" {
|
||||
s.WriteString(checkWarningStyle.Render(" Existing databases will be DROPPED before restore!"))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(infoStyle.Render(" (Database count will be detected at restore time)"))
|
||||
} else if m.existingDBCount > 0 {
|
||||
s.WriteString(checkWarningStyle.Render(fmt.Sprintf(" %d existing database(s) will be DROPPED before restore!", m.existingDBCount)))
|
||||
} else {
|
||||
s.WriteString(infoStyle.Render(" No databases currently detected - cleanup will verify at restore time"))
|
||||
}
|
||||
s.WriteString("\n")
|
||||
s.WriteString(infoStyle.Render(" This ensures a clean disaster recovery scenario"))
|
||||
s.WriteString("\n\n")
|
||||
|
||||
@ -10,6 +10,7 @@ import (
|
||||
"github.com/charmbracelet/lipgloss"
|
||||
|
||||
"dbbackup/internal/config"
|
||||
"dbbackup/internal/cpu"
|
||||
"dbbackup/internal/logger"
|
||||
)
|
||||
|
||||
@ -101,6 +102,65 @@ func NewSettingsModel(cfg *config.Config, log logger.Logger, parent tea.Model) S
|
||||
Type: "selector",
|
||||
Description: "CPU workload profile (press Enter to cycle: Balanced → CPU-Intensive → I/O-Intensive)",
|
||||
},
|
||||
{
|
||||
Key: "resource_profile",
|
||||
DisplayName: "Resource Profile",
|
||||
Value: func(c *config.Config) string {
|
||||
profile := c.GetCurrentProfile()
|
||||
if profile != nil {
|
||||
return fmt.Sprintf("%s (P:%d J:%d)", profile.Name, profile.ClusterParallelism, profile.Jobs)
|
||||
}
|
||||
return c.ResourceProfile
|
||||
},
|
||||
Update: func(c *config.Config, v string) error {
|
||||
profiles := []string{"conservative", "balanced", "performance", "max-performance"}
|
||||
currentIdx := 0
|
||||
for i, p := range profiles {
|
||||
if c.ResourceProfile == p {
|
||||
currentIdx = i
|
||||
break
|
||||
}
|
||||
}
|
||||
nextIdx := (currentIdx + 1) % len(profiles)
|
||||
return c.ApplyResourceProfile(profiles[nextIdx])
|
||||
},
|
||||
Type: "selector",
|
||||
Description: "Resource profile for VM capacity. Toggle 'l' for Large DB Mode on any profile.",
|
||||
},
|
||||
{
|
||||
Key: "large_db_mode",
|
||||
DisplayName: "Large DB Mode",
|
||||
Value: func(c *config.Config) string {
|
||||
if c.LargeDBMode {
|
||||
return "ON (↓parallelism, ↑locks)"
|
||||
}
|
||||
return "OFF"
|
||||
},
|
||||
Update: func(c *config.Config, v string) error {
|
||||
c.LargeDBMode = !c.LargeDBMode
|
||||
return nil
|
||||
},
|
||||
Type: "selector",
|
||||
Description: "Enable for databases with many tables/LOBs. Reduces parallelism, increases max_locks_per_transaction.",
|
||||
},
|
||||
{
|
||||
Key: "cluster_parallelism",
|
||||
DisplayName: "Cluster Parallelism",
|
||||
Value: func(c *config.Config) string { return fmt.Sprintf("%d", c.ClusterParallelism) },
|
||||
Update: func(c *config.Config, v string) error {
|
||||
val, err := strconv.Atoi(v)
|
||||
if err != nil {
|
||||
return fmt.Errorf("cluster parallelism must be a number")
|
||||
}
|
||||
if val < 1 {
|
||||
return fmt.Errorf("cluster parallelism must be at least 1")
|
||||
}
|
||||
c.ClusterParallelism = val
|
||||
return nil
|
||||
},
|
||||
Type: "int",
|
||||
Description: "Concurrent databases during cluster backup/restore (1=sequential, safer for large DBs)",
|
||||
},
|
||||
{
|
||||
Key: "backup_dir",
|
||||
DisplayName: "Backup Directory",
|
||||
@ -528,12 +588,70 @@ func (m SettingsModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
|
||||
case "s":
|
||||
return m.saveSettings()
|
||||
|
||||
case "l":
|
||||
// Quick shortcut: Toggle Large DB Mode
|
||||
return m.toggleLargeDBMode()
|
||||
|
||||
case "c":
|
||||
// Quick shortcut: Apply "conservative" profile for constrained VMs
|
||||
return m.applyConservativeProfile()
|
||||
|
||||
case "p":
|
||||
// Show profile recommendation
|
||||
return m.showProfileRecommendation()
|
||||
}
|
||||
}
|
||||
|
||||
return m, nil
|
||||
}
|
||||
|
||||
// toggleLargeDBMode toggles the Large DB Mode flag
|
||||
func (m SettingsModel) toggleLargeDBMode() (tea.Model, tea.Cmd) {
|
||||
m.config.LargeDBMode = !m.config.LargeDBMode
|
||||
if m.config.LargeDBMode {
|
||||
profile := m.config.GetCurrentProfile()
|
||||
m.message = successStyle.Render(fmt.Sprintf(
|
||||
"[ON] Large DB Mode enabled: %s → Parallel=%d, Jobs=%d, MaxLocks=%d",
|
||||
profile.Name, profile.ClusterParallelism, profile.Jobs, profile.MaxLocksPerTxn))
|
||||
} else {
|
||||
profile := m.config.GetCurrentProfile()
|
||||
m.message = successStyle.Render(fmt.Sprintf(
|
||||
"[OFF] Large DB Mode disabled: %s → Parallel=%d, Jobs=%d",
|
||||
profile.Name, profile.ClusterParallelism, profile.Jobs))
|
||||
}
|
||||
return m, nil
|
||||
}
|
||||
|
||||
// applyConservativeProfile applies the conservative profile for constrained VMs
|
||||
func (m SettingsModel) applyConservativeProfile() (tea.Model, tea.Cmd) {
|
||||
if err := m.config.ApplyResourceProfile("conservative"); err != nil {
|
||||
m.message = errorStyle.Render(fmt.Sprintf("[FAIL] %s", err.Error()))
|
||||
return m, nil
|
||||
}
|
||||
m.message = successStyle.Render("[OK] Applied 'conservative' profile: Cluster=1, Jobs=1. Safe for small VMs with limited memory.")
|
||||
return m, nil
|
||||
}
|
||||
|
||||
// showProfileRecommendation displays the recommended profile based on system resources
|
||||
func (m SettingsModel) showProfileRecommendation() (tea.Model, tea.Cmd) {
|
||||
profileName, reason := m.config.GetResourceProfileRecommendation(false)
|
||||
|
||||
var largeDBHint string
|
||||
if m.config.LargeDBMode {
|
||||
largeDBHint = "Large DB Mode: ON"
|
||||
} else {
|
||||
largeDBHint = "Large DB Mode: OFF (press 'l' to enable)"
|
||||
}
|
||||
|
||||
m.message = infoStyle.Render(fmt.Sprintf(
|
||||
"[RECOMMEND] Profile: %s | %s\n"+
|
||||
" → %s\n"+
|
||||
" Press 'l' to toggle Large DB Mode, 'c' for conservative",
|
||||
profileName, largeDBHint, reason))
|
||||
return m, nil
|
||||
}
|
||||
|
||||
// handleEditingInput handles input when editing a setting
|
||||
func (m SettingsModel) handleEditingInput(msg tea.KeyMsg) (tea.Model, tea.Cmd) {
|
||||
switch msg.String() {
|
||||
@ -747,7 +865,32 @@ func (m SettingsModel) View() string {
|
||||
// Current configuration summary
|
||||
if !m.editing {
|
||||
b.WriteString("\n")
|
||||
b.WriteString(infoStyle.Render("[INFO] Current Configuration"))
|
||||
b.WriteString(infoStyle.Render("[INFO] System Resources & Configuration"))
|
||||
b.WriteString("\n")
|
||||
|
||||
// System resources
|
||||
var sysInfo []string
|
||||
if m.config.CPUInfo != nil {
|
||||
sysInfo = append(sysInfo, fmt.Sprintf("CPU: %d cores (physical), %d logical",
|
||||
m.config.CPUInfo.PhysicalCores, m.config.CPUInfo.LogicalCores))
|
||||
}
|
||||
if m.config.MemoryInfo != nil {
|
||||
sysInfo = append(sysInfo, fmt.Sprintf("Memory: %dGB total, %dGB available",
|
||||
m.config.MemoryInfo.TotalGB, m.config.MemoryInfo.AvailableGB))
|
||||
}
|
||||
|
||||
// Recommended profile
|
||||
recommendedProfile, reason := m.config.GetResourceProfileRecommendation(false)
|
||||
sysInfo = append(sysInfo, fmt.Sprintf("Recommended Profile: %s", recommendedProfile))
|
||||
sysInfo = append(sysInfo, fmt.Sprintf(" → %s", reason))
|
||||
|
||||
for _, line := range sysInfo {
|
||||
b.WriteString(detailStyle.Render(fmt.Sprintf(" %s", line)))
|
||||
b.WriteString("\n")
|
||||
}
|
||||
|
||||
b.WriteString("\n")
|
||||
b.WriteString(infoStyle.Render("[CONFIG] Current Settings"))
|
||||
b.WriteString("\n")
|
||||
|
||||
summary := []string{
|
||||
@ -755,7 +898,17 @@ func (m SettingsModel) View() string {
|
||||
fmt.Sprintf("Database: %s@%s:%d", m.config.User, m.config.Host, m.config.Port),
|
||||
fmt.Sprintf("Backup Dir: %s", m.config.BackupDir),
|
||||
fmt.Sprintf("Compression: Level %d", m.config.CompressionLevel),
|
||||
fmt.Sprintf("Jobs: %d parallel, %d dump", m.config.Jobs, m.config.DumpJobs),
|
||||
fmt.Sprintf("Profile: %s | Cluster: %d parallel | Jobs: %d",
|
||||
m.config.ResourceProfile, m.config.ClusterParallelism, m.config.Jobs),
|
||||
}
|
||||
|
||||
// Show profile warnings if applicable
|
||||
profile := m.config.GetCurrentProfile()
|
||||
if profile != nil {
|
||||
isValid, warnings := cpu.ValidateProfileForSystem(profile, m.config.CPUInfo, m.config.MemoryInfo)
|
||||
if !isValid && len(warnings) > 0 {
|
||||
summary = append(summary, fmt.Sprintf("⚠️ Warning: %s", warnings[0]))
|
||||
}
|
||||
}
|
||||
|
||||
if m.config.CloudEnabled {
|
||||
@ -782,9 +935,9 @@ func (m SettingsModel) View() string {
|
||||
} else {
|
||||
// Show different help based on current selection
|
||||
if m.cursor >= 0 && m.cursor < len(m.settings) && m.settings[m.cursor].Type == "path" {
|
||||
footer = infoStyle.Render("\n[KEYS] Up/Down navigate | Enter edit | Tab browse directories | 's' save | 'r' reset | 'q' menu")
|
||||
footer = infoStyle.Render("\n[KEYS] ↑↓ navigate | Enter edit | Tab dirs | 'l' toggle LargeDB | 'c' conservative | 'p' recommend | 's' save | 'q' menu")
|
||||
} else {
|
||||
footer = infoStyle.Render("\n[KEYS] Up/Down navigate | Enter edit | 's' save | 'r' reset | 'q' menu | Tab=dirs on path fields only")
|
||||
footer = infoStyle.Render("\n[KEYS] ↑↓ navigate | Enter edit | 'l' toggle LargeDB mode | 'c' conservative | 'p' recommend | 's' save | 'r' reset | 'q' menu")
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
77
release-notes-v3.42.77.md
Normal file
77
release-notes-v3.42.77.md
Normal file
@ -0,0 +1,77 @@
|
||||
# dbbackup v3.42.77
|
||||
|
||||
## 🎯 New Feature: Single Database Extraction from Cluster Backups
|
||||
|
||||
Extract and restore individual databases from cluster backups without full cluster restoration!
|
||||
|
||||
### 🆕 New Flags
|
||||
|
||||
- **`--list-databases`**: List all databases in cluster backup with sizes
|
||||
- **`--database <name>`**: Extract/restore a single database from cluster
|
||||
- **`--databases "db1,db2,db3"`**: Extract multiple databases (comma-separated)
|
||||
- **`--output-dir <path>`**: Extract to directory without restoring
|
||||
- **`--target <name>`**: Rename database during restore
|
||||
|
||||
### 📖 Examples
|
||||
|
||||
```bash
|
||||
# List databases in cluster backup
|
||||
dbbackup restore cluster backup.tar.gz --list-databases
|
||||
|
||||
# Extract single database (no restore)
|
||||
dbbackup restore cluster backup.tar.gz --database myapp --output-dir /tmp/extract
|
||||
|
||||
# Restore single database from cluster
|
||||
dbbackup restore cluster backup.tar.gz --database myapp --confirm
|
||||
|
||||
# Restore with different name (testing)
|
||||
dbbackup restore cluster backup.tar.gz --database myapp --target myapp_test --confirm
|
||||
|
||||
# Extract multiple databases
|
||||
dbbackup restore cluster backup.tar.gz --databases "app1,app2,app3" --output-dir /tmp/extract
|
||||
```
|
||||
|
||||
### 💡 Use Cases
|
||||
|
||||
✅ **Selective disaster recovery** - restore only affected databases
|
||||
✅ **Database migration** - copy databases between clusters
|
||||
✅ **Testing workflows** - restore with different names
|
||||
✅ **Faster restores** - extract only what you need
|
||||
✅ **Less disk space** - no need to extract entire cluster
|
||||
|
||||
### ⚙️ Technical Details
|
||||
|
||||
- Stream-based extraction with progress feedback
|
||||
- Fast cluster archive scanning (no full extraction needed)
|
||||
- Works with all cluster backup formats (.tar.gz)
|
||||
- Compatible with existing cluster restore workflow
|
||||
- Automatic format detection for extracted dumps
|
||||
|
||||
### 🖥️ TUI Support (Interactive Mode)
|
||||
|
||||
**New in this release**: Press **`s`** key when viewing a cluster backup to select individual databases!
|
||||
|
||||
- Navigate cluster backups in TUI and press `s` for database selection
|
||||
- Interactive database picker with size information
|
||||
- Visual selection confirmation before restore
|
||||
- Seamless integration with existing TUI workflows
|
||||
|
||||
**TUI Workflow:**
|
||||
1. Launch TUI: `dbbackup` (no arguments)
|
||||
2. Navigate to "Restore" → "Single Database"
|
||||
3. Select cluster backup archive
|
||||
4. Press `s` to show database list
|
||||
5. Select database and confirm restore
|
||||
|
||||
## 📦 Installation
|
||||
|
||||
Download the binary for your platform below and make it executable:
|
||||
|
||||
```bash
|
||||
chmod +x dbbackup_*
|
||||
./dbbackup_* --version
|
||||
```
|
||||
|
||||
## 🔍 Checksums
|
||||
|
||||
SHA256 checksums in `checksums.txt`.
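To verify a downloaded binary against the published list (standard GNU coreutils `sha256sum` usage; `--ignore-missing` skips entries for binaries you did not download):

```bash
# Verify downloaded artifacts against the published checksum list
sha256sum --ignore-missing -c checksums.txt
```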
|
||||