v3.42.9: Fix all timeout bugs and deadlocks

CRITICAL FIXES:
- Encryption detection false positive (IsBackupEncrypted returned true for ALL files)
- 12 cmd.Wait() deadlocks fixed with channel-based context handling
- TUI timeout bugs: 60s->10min for safety checks, 15s->60s for DB listing
- diagnose.go timeouts: 60s->5min for tar/pg_restore operations
- Panic recovery added to parallel backup/restore goroutines
- Variable shadowing fix in restore/engine.go

These bugs caused pg_dump backups run through the TUI to fail for months.
2026-01-08 05:56:31 +01:00
parent 627061cdbb
commit bb49922682
22 changed files with 1099 additions and 304 deletions


@@ -5,6 +5,158 @@ All notable changes to dbbackup will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [3.42.9] - 2026-01-08 "Diagnose Timeout Fix"
### Fixed - diagnose.go Timeout Bugs
**More short timeouts that caused large archive failures:**
- `diagnoseClusterArchive()`: tar listing 60s → **5 minutes**
- `verifyWithPgRestore()`: pg_restore --list 60s → **5 minutes**
- `DiagnoseClusterDumps()`: archive listing 120s → **10 minutes**
**Impact:** These timeouts caused "context deadline exceeded" errors when
diagnosing multi-GB backup archives, preventing TUI restore from even starting.
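All three fixes follow the same shape: run the external diagnostic command through `exec.CommandContext` with a deadline generous enough for multi-GB archives, and report a timeout explicitly instead of a bare "context deadline exceeded". A minimal sketch of that pattern (the helper name `listArchive` is illustrative, not the actual diagnose.go function):
```go
package example

import (
    "context"
    "fmt"
    "os/exec"
    "time"
)

// listArchive lists a cluster archive with a 5-minute deadline so large
// files have time to finish; on timeout it returns a descriptive error.
func listArchive(path string) ([]byte, error) {
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
    defer cancel()

    out, err := exec.CommandContext(ctx, "tar", "-tzf", path).Output()
    if ctx.Err() == context.DeadlineExceeded {
        return nil, fmt.Errorf("archive listing timed out after 5m: %s", path)
    }
    return out, err
}
```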
## [3.42.8] - 2026-01-08 "TUI Timeout Fix"
### Fixed - TUI Timeout Bugs Causing Backup/Restore Failures
**ROOT CAUSE of the TUI backup/restore failures of the past 2-3 months identified and fixed:**
#### Critical Timeout Fixes:
- **restore_preview.go**: Safety check timeout increased from 60s → **10 minutes**
- Large archives (>1GB) take 2+ minutes to diagnose
- Users saw "context deadline exceeded" before backup even started
- **dbselector.go**: Database listing timeout increased from 15s → **60 seconds**
- Busy PostgreSQL servers need more time to respond
- **status.go**: Status check timeout increased from 10s → **30 seconds**
- SSL negotiation and slow networks caused failures
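The dbselector.go change boils down to giving the listing query its own deadline; a minimal sketch assuming a standard `database/sql` connection (not the actual TUI code):
```go
package example

import (
    "context"
    "database/sql"
    "time"
)

// listDatabases queries pg_database with a 60-second deadline, mirroring
// the dbselector.go timeout above. Sketch only.
func listDatabases(db *sql.DB) ([]string, error) {
    ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
    defer cancel()

    rows, err := db.QueryContext(ctx,
        "SELECT datname FROM pg_database WHERE NOT datistemplate ORDER BY datname")
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    var names []string
    for rows.Next() {
        var name string
        if err := rows.Scan(&name); err != nil {
            return nil, err
        }
        names = append(names, name)
    }
    return names, rows.Err()
}
```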
#### Stability Improvements:
- **Panic recovery** added to parallel goroutines in:
- `backup/engine.go:BackupCluster()` - cluster backup workers
- `restore/engine.go:RestoreCluster()` - cluster restore workers
- Prevents single database panic from crashing entire operation
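The recovery pattern is a deferred `recover()` inside each worker plus an atomic failure counter; a minimal sketch under assumed names (`backupOne`, `backupAll`), not the engine's actual code:
```go
package example

import (
    "log"
    "sync"
    "sync/atomic"
)

// backupAll runs one worker per database; a panic in any worker is logged
// and counted as a failure instead of crashing the whole cluster backup.
func backupAll(databases []string, backupOne func(string) error) int32 {
    var failCount int32
    var wg sync.WaitGroup

    for _, name := range databases {
        wg.Add(1)
        go func(name string) {
            defer wg.Done()
            defer func() {
                if r := recover(); r != nil {
                    log.Printf("panic while backing up %s: %v", name, r)
                    atomic.AddInt32(&failCount, 1)
                }
            }()
            if err := backupOne(name); err != nil {
                atomic.AddInt32(&failCount, 1)
            }
        }(name)
    }
    wg.Wait()
    return failCount
}
```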
#### Bug Fix:
- **restore/engine.go**: Fixed variable shadowing (`err` → `cmdErr`) in exit code detection
## [3.42.7] - 2026-01-08 "Context Killer Complete"
### Fixed - Additional Deadlock Bugs in Restore & Engine
**All remaining cmd.Wait() deadlock bugs fixed across the codebase:**
#### internal/restore/engine.go:
- `executeRestoreWithDecompression()` - gunzip/pigz pipeline restore
- `extractArchive()` - tar extraction for cluster restore
- `restoreGlobals()` - pg_dumpall globals restore
#### internal/backup/engine.go:
- `createArchive()` - tar/pigz archive creation pipeline
#### internal/engine/mysqldump.go:
- `Backup()` - mysqldump backup operation
- `BackupToWriter()` - streaming mysqldump to writer
**All 6 functions now use proper channel-based context handling with Process.Kill().**
## [3.42.6] - 2026-01-08 "Deadlock Killer"
### Fixed - Backup Command Context Handling
**Critical Bug: pg_dump/mysqldump could hang forever on context cancellation**
The `executeCommand`, `executeCommandWithProgress`, `executeMySQLWithProgressAndCompression`,
and `executeMySQLWithCompression` functions had a race condition where:
1. A goroutine was spawned to read stderr
2. `cmd.Wait()` was called directly
3. If context was cancelled, the process was NOT killed
4. The goroutine could hang forever waiting for stderr
**Fix**: All backup execution functions now use proper channel-based context handling:
```go
// Wait for command with context handling
cmdDone := make(chan error, 1)
go func() {
    cmdDone <- cmd.Wait()
}()

select {
case cmdErr = <-cmdDone:
    // Command completed
case <-ctx.Done():
    // Context cancelled - kill process
    cmd.Process.Kill()
    <-cmdDone
    cmdErr = ctx.Err()
}
```
**Affected Functions:**
- `executeCommand()` - pg_dump for cluster backup
- `executeCommandWithProgress()` - pg_dump for single backup with progress
- `executeMySQLWithProgressAndCompression()` - mysqldump pipeline
- `executeMySQLWithCompression()` - mysqldump pipeline
**This fixes:** Backup operations hanging indefinitely when cancelled or timing out.
## [3.42.5] - 2026-01-08 "False Positive Fix"
### Fixed - Encryption Detection Bug
**IsBackupEncrypted False Positive:**
- **BUG FIX**: `IsBackupEncrypted()` returned `true` for ALL files, blocking normal restores
- Root cause: Fallback logic checked whether the first 12 bytes (nonce size) could be read, which is true for any file of at least 12 bytes
- Fix: Now properly detects known unencrypted formats by magic bytes:
- Gzip: `1f 8b`
- PostgreSQL custom: `PGDMP`
- Plain SQL: starts with `--`, `SET`, `CREATE`
- Returns `false` if no metadata present and format is recognized as unencrypted
- Affected file: `internal/backup/encryption.go`
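A minimal sketch of the magic-byte sniffing the fix describes (the helper `looksUnencrypted` is illustrative; the real logic lives in `internal/backup/encryption.go` and returns the inverse from `IsBackupEncrypted`):
```go
package example

import (
    "bytes"
    "os"
)

// looksUnencrypted reports whether a backup file starts with a known
// unencrypted signature: gzip, PostgreSQL custom format, or plain SQL text.
func looksUnencrypted(path string) (bool, error) {
    f, err := os.Open(path)
    if err != nil {
        return false, err
    }
    defer f.Close()

    header := make([]byte, 6)
    n, _ := f.Read(header)
    if n < 2 {
        return false, nil
    }
    header = header[:n]

    switch {
    case header[0] == 0x1f && header[1] == 0x8b: // gzip magic
        return true, nil
    case bytes.HasPrefix(header, []byte("PGDMP")): // pg_dump custom format
        return true, nil
    case header[0] == '-' || header[0] == 'S' || header[0] == 'C': // --, SET, CREATE
        return true, nil
    }
    return false, nil // unknown format: caller decides, no false positive
}
```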
## [3.42.4] - 2026-01-08 "The Long Haul"
### Fixed - Critical Restore Timeout Bug
**Removed Arbitrary Timeouts from Backup/Restore Operations:**
- **CRITICAL FIX**: Removed 4-hour timeout that was killing large database restores
- PostgreSQL cluster restores of 69GB+ databases no longer fail with "context deadline exceeded"
- All backup/restore operations now use `context.WithCancel` instead of `context.WithTimeout`
- Operations run until completion or manual cancellation (Ctrl+C)
**Affected Files:**
- `internal/tui/restore_exec.go`: Changed from 4-hour timeout to context.WithCancel
- `internal/tui/backup_exec.go`: Changed from 4-hour timeout to context.WithCancel
- `internal/backup/engine.go`: Removed per-database timeout in cluster backup
- `cmd/restore.go`: CLI restore commands use context.WithCancel
**exec.Command Context Audit:**
- Fixed `exec.Command` without Context in `internal/restore/engine.go:730`
- Added proper context handling to all external command calls
- Added timeouts only for quick diagnostic/version checks (not restore path):
- `restore/version_check.go`: 30s timeout for pg_restore --version check only
- `restore/error_report.go`: 10s timeout for tool version detection
- `restore/diagnose.go`: 60s timeout for diagnostic functions
- `pitr/binlog.go`: 10s timeout for mysqlbinlog --version check
- `cleanup/processes.go`: 5s timeout for process listing
- `auth/helper.go`: 30s timeout for auth helper commands
**Verification:**
- 54 total `exec.CommandContext` calls verified in backup/restore/pitr path
- 0 `exec.Command` without Context in critical restore path
- All 14 PostgreSQL exec calls use CommandContext (pg_dump, pg_restore, psql)
- All 15 MySQL/MariaDB exec calls use CommandContext (mysqldump, mysql, mysqlbinlog)
- All 14 test packages pass
### Technical Details
- Large Object (BLOB/BYTEA) restores are particularly affected by timeouts
- 69GB database with large objects can take 5+ hours to restore
- Previous 4-hour hard timeout was causing consistent failures
- Now: No timeout - runs until complete or user cancels
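The shape of that change, as a minimal sketch (a hypothetical wrapper; the real changes are in the files listed above): the operation context is cancelled by Ctrl+C, never by a deadline.
```go
package example

import (
    "context"
    "os"
    "os/signal"
)

// runUntilDoneOrCancelled replaces the old 4-hour WithTimeout wrapper:
// the context is only cancelled by an interrupt signal (Ctrl+C).
func runUntilDoneOrCancelled(restore func(context.Context) error) error {
    ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
    defer stop()

    // No context.WithTimeout here - a 69GB restore may legitimately run 5+ hours.
    return restore(ctx)
}
```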
## [3.42.1] - 2026-01-07 "Resistance is Futile"
### Added - Content-Defined Chunking Deduplication

EMOTICON_REMOVAL_PLAN.md (new file, 295 lines)

@@ -0,0 +1,295 @@
# Emoticon Removal Plan for Python Code
## ⚠️ CRITICAL: Code Must Remain Functional After Removal
This document outlines a **safe, systematic approach** to removing emoticons from Python code without breaking functionality.
---
## 1. Identification Phase
### 1.1 Where Emoticons CAN Safely Exist (Safe to Remove)
| Location | Risk Level | Action |
|----------|------------|--------|
| Comments (`# 🎉 Success!`) | ✅ SAFE | Remove or replace with text |
| Docstrings (`"""📌 Note:..."""`) | ✅ SAFE | Remove or replace with text |
| Print statements for decoration (`print("✅ Done!")`) | ⚠️ LOW | Replace with ASCII or text |
| Logging messages (`logger.info("🔥 Starting...")`) | ⚠️ LOW | Replace with text equivalent |
### 1.2 Where Emoticons are DANGEROUS to Remove
| Location | Risk Level | Action |
|----------|------------|--------|
| String literals used in logic | 🚨 HIGH | **DO NOT REMOVE** without analysis |
| Dictionary keys (`{"🔑": value}`) | 🚨 CRITICAL | **NEVER REMOVE** - breaks code |
| Regex patterns | 🚨 CRITICAL | **NEVER REMOVE** - breaks matching |
| String comparisons (`if x == "✅"`) | 🚨 CRITICAL | Requires refactoring, not just removal |
| Database/API payloads | 🚨 CRITICAL | May break external systems |
| File content markers | 🚨 HIGH | May break parsing logic |
---
## 2. Pre-Removal Checklist
### 2.1 Before ANY Changes
- [ ] **Full backup** of the codebase
- [ ] **Run all tests** and record baseline results
- [ ] **Document all emoticon locations** with grep/search
- [ ] **Identify emoticon usage patterns** (decorative vs. functional)
### 2.2 Discovery Commands
```bash
# Find all files with emoticons (Unicode range for common emojis)
grep -rn --include="*.py" -P '[\x{1F300}-\x{1F9FF}]' .
# Find emoticons in strings
grep -rn --include="*.py" -E '["'"'"'][^"'"'"']*[\x{1F300}-\x{1F9FF}]' .
# List unique emoticons used
grep -oP '[\x{1F300}-\x{1F9FF}]' *.py | sort -u
```
---
## 3. Replacement Strategy
### 3.1 Semantic Replacement Table
| Emoticon | Text Replacement | Context |
|----------|------------------|---------|
| ✅ | `[OK]` or `[SUCCESS]` | Status indicators |
| ❌ | `[FAIL]` or `[ERROR]` | Error indicators |
| ⚠️ | `[WARNING]` | Warning messages |
| 🔥 | `[HOT]` or `` (remove) | Decorative |
| 🎉 | `[DONE]` or `` (remove) | Celebration/completion |
| 📌 | `[NOTE]` | Notes/pinned items |
| 🚀 | `[START]` or `` (remove) | Launch/start indicators |
| 💾 | `[SAVE]` | Save operations |
| 🔑 | `[KEY]` | Key/authentication |
| 📁 | `[FILE]` | File operations |
| 🔍 | `[SEARCH]` | Search operations |
| ⏳ | `[WAIT]` or `[LOADING]` | Progress indicators |
| 🛑 | `[STOP]` | Stop/halt indicators |
| ℹ️ | `[INFO]` | Information |
| 🐛 | `[BUG]` or `[DEBUG]` | Debug messages |
### 3.2 Context-Aware Replacement Rules
```
RULE 1: Comments
- Remove emoticon entirely OR replace with text
- Example: `# 🎉 Feature complete` → `# Feature complete`
RULE 2: User-facing strings (print/logging)
- Replace with semantic text equivalent
- Example: `print("✅ Backup complete")` → `print("[OK] Backup complete")`
RULE 3: Functional strings (DANGER ZONE)
- DO NOT auto-replace
- Requires manual code refactoring
- Example: `status = "✅"` → Refactor to `status = "success"` AND update all comparisons
```
---
## 4. Safe Removal Process
### Step 1: Audit
```python
# Python script to audit emoticon usage
import re
import ast

EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001F9FF"  # Symbols & Pictographs
    "\U00002600-\U000026FF"  # Misc symbols
    "\U00002700-\U000027BF"  # Dingbats
    "\U0001F600-\U0001F64F"  # Emoticons
    "]+"
)


def audit_file(filepath):
    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()

    # Parse AST to understand context (also fails fast on syntactically broken files)
    tree = ast.parse(content)

    findings = []
    for lineno, line in enumerate(content.split('\n'), 1):
        matches = EMOJI_PATTERN.findall(line)
        if matches:
            # Determine context (comment, string, etc.)
            context = classify_context(line, matches)
            findings.append({
                'line': lineno,
                'content': line.strip(),
                'emojis': matches,
                'context': context,
                'risk': assess_risk(context)
            })
    return findings


def classify_context(line, matches):
    stripped = line.strip()
    if stripped.startswith('#'):
        return 'COMMENT'
    if 'print(' in line or 'logging.' in line or 'logger.' in line:
        return 'OUTPUT'
    if '==' in line or '!=' in line:
        return 'COMPARISON'
    if re.search(r'["\'][^"\']*$', line.split('#')[0]):
        return 'STRING_LITERAL'
    return 'UNKNOWN'


def assess_risk(context):
    risk_map = {
        'COMMENT': 'LOW',
        'OUTPUT': 'LOW',
        'COMPARISON': 'CRITICAL',
        'STRING_LITERAL': 'HIGH',
        'UNKNOWN': 'HIGH'
    }
    return risk_map.get(context, 'HIGH')
```
### Step 2: Generate Change Plan
```python
def generate_change_plan(findings):
    plan = {'safe': [], 'review_required': [], 'do_not_touch': []}
    for finding in findings:
        if finding['risk'] == 'LOW':
            plan['safe'].append(finding)
        elif finding['risk'] == 'HIGH':
            plan['review_required'].append(finding)
        else:  # CRITICAL
            plan['do_not_touch'].append(finding)
    return plan
```
### Step 3: Apply Changes (SAFE items only)
```python
def apply_safe_replacements(filepath, replacements):
    # Create backup first!
    import shutil
    shutil.copy(filepath, filepath + '.backup')

    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()

    for old, new in replacements:
        content = content.replace(old, new)

    with open(filepath, 'w', encoding='utf-8') as f:
        f.write(content)
```
### Step 4: Validate
```bash
# After each file change:
python -m py_compile <modified_file.py> # Syntax check
pytest <related_tests> # Run tests
```
---
## 5. Validation Checklist
### After EACH File Modification
- [ ] File compiles without syntax errors (`python -m py_compile file.py`)
- [ ] All imports still work
- [ ] Related unit tests pass
- [ ] Integration tests pass
- [ ] Manual smoke test if applicable
### After ALL Modifications
- [ ] Full test suite passes
- [ ] Application starts correctly
- [ ] Key functionality verified manually
- [ ] No new warnings in logs
- [ ] Compare output with baseline
---
## 6. Rollback Plan
### If Something Breaks
1. **Immediate**: Restore from `.backup` files
2. **Git**: `git checkout -- <file>` or `git stash pop`
3. **Full rollback**: Restore from pre-change backup
### Keep Until Verified
```bash
# Backup storage structure
backups/
├── pre_emoticon_removal/
│ ├── timestamp.tar.gz
│ └── git_commit_hash.txt
└── individual_files/
├── file1.py.backup
└── file2.py.backup
```
---
## 7. Implementation Order
1. **Phase 1**: Comments only (LOWEST risk)
2. **Phase 2**: Docstrings (LOW risk)
3. **Phase 3**: Print/logging statements (LOW-MEDIUM risk)
4. **Phase 4**: Manual review items (HIGH risk) - one by one
5. **Phase 5**: NEVER touch CRITICAL items without full refactoring
---
## 8. Example Workflow
```bash
# 1. Create full backup
git stash && git checkout -b emoticon-removal
# 2. Run audit script
python emoticon_audit.py > audit_report.json
# 3. Review audit report
cat audit_report.json | jq '.do_not_touch' # Check critical items
# 4. Apply safe changes only
python apply_safe_changes.py --dry-run # Preview first!
python apply_safe_changes.py # Apply
# 5. Validate after each change
python -m pytest tests/
# 6. Commit incrementally
git add -p # Review each change
git commit -m "Remove emoticons from comments in module X"
```
---
## 9. DO NOT DO
- **Never** use global find-replace on emoticons
- **Never** remove emoticons from string comparisons without refactoring
- **Never** change multiple files without testing between changes
- **Never** assume an emoticon is decorative - verify context
- **Never** proceed if tests fail after a change
---
## 10. Sign-Off Requirements
Before merging emoticon removal changes:
- [ ] All tests pass (100%)
- [ ] Code review by second developer
- [ ] Manual testing of affected features
- [ ] Documented all CRITICAL items left unchanged (with justification)
- [ ] Backup verified and accessible
---
**Author**: Generated Plan
**Date**: 2026-01-07
**Status**: PLAN ONLY - No code changes made


@@ -143,7 +143,7 @@ Backup Execution
Backup created: cluster_20251128_092928.tar.gz
Size: 22.5 GB (compressed)
Location: /u01/dba/dumps/ → /var/backups/postgres/
Databases: 7
Checksum: SHA-256 verified
```


@@ -4,8 +4,8 @@ This directory contains pre-compiled binaries for the DB Backup Tool across mult
## Build Information
- **Version**: 3.42.1
- **Build Time**: 2026-01-07_19:40:21_UTC → 2026-01-08_04:54:46_UTC
- **Git Commit**: 3653ced → 627061c
## Recent Updates (v1.1.0)
- ✅ Fixed TUI progress display with line-by-line output


@@ -2,12 +2,14 @@ package auth
import ( import (
"bufio" "bufio"
"context"
"fmt" "fmt"
"os" "os"
"os/exec" "os/exec"
"path/filepath" "path/filepath"
"strconv" "strconv"
"strings" "strings"
"time"
"dbbackup/internal/config" "dbbackup/internal/config"
) )
@@ -69,7 +71,10 @@ func checkPgHbaConf(user string) AuthMethod {
// findHbaFileViaPostgres asks PostgreSQL for the hba_file location // findHbaFileViaPostgres asks PostgreSQL for the hba_file location
func findHbaFileViaPostgres() string { func findHbaFileViaPostgres() string {
cmd := exec.Command("psql", "-U", "postgres", "-t", "-c", "SHOW hba_file;") ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
cmd := exec.CommandContext(ctx, "psql", "-U", "postgres", "-t", "-c", "SHOW hba_file;")
output, err := cmd.Output() output, err := cmd.Output()
if err != nil { if err != nil {
return "" return ""
@@ -82,8 +87,11 @@ func parsePgHbaConf(path string, user string) AuthMethod {
// Try with sudo if we can't read directly // Try with sudo if we can't read directly
file, err := os.Open(path) file, err := os.Open(path)
if err != nil { if err != nil {
// Try with sudo // Try with sudo (with timeout)
cmd := exec.Command("sudo", "cat", path) ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
cmd := exec.CommandContext(ctx, "sudo", "cat", path)
output, err := cmd.Output() output, err := cmd.Output()
if err != nil { if err != nil {
return AuthUnknown return AuthUnknown


@@ -87,20 +87,46 @@ func IsBackupEncrypted(backupPath string) bool {
return meta.Encrypted return meta.Encrypted
} }
// Fallback: check if file starts with encryption nonce // No metadata found - check file format to determine if encrypted
// Known unencrypted formats have specific magic bytes:
// - Gzip: 1f 8b
// - PGDMP (PostgreSQL custom): 50 47 44 4d 50 (PGDMP)
// - Plain SQL: starts with text (-- or SET or CREATE)
// - Tar: 75 73 74 61 72 (ustar) at offset 257
//
// If file doesn't match any known format, it MIGHT be encrypted,
// but we return false to avoid false positives. User must provide
// metadata file or use --encrypt flag explicitly.
file, err := os.Open(backupPath) file, err := os.Open(backupPath)
if err != nil { if err != nil {
return false return false
} }
defer file.Close() defer file.Close()
// Try to read nonce - if it succeeds, likely encrypted header := make([]byte, 6)
nonce := make([]byte, crypto.NonceSize) if n, err := file.Read(header); err != nil || n < 2 {
if n, err := file.Read(nonce); err != nil || n != crypto.NonceSize {
return false return false
} }
return true // Check for known unencrypted formats
// Gzip magic: 1f 8b
if header[0] == 0x1f && header[1] == 0x8b {
return false // Gzip compressed - not encrypted
}
// PGDMP magic (PostgreSQL custom format)
if len(header) >= 5 && string(header[:5]) == "PGDMP" {
return false // PostgreSQL custom dump - not encrypted
}
// Plain text SQL (starts with --, SET, CREATE, etc.)
if header[0] == '-' || header[0] == 'S' || header[0] == 'C' || header[0] == '/' {
return false // Plain text SQL - not encrypted
}
// Without metadata, we cannot reliably determine encryption status
// Return false to avoid blocking restores with false positives
return false
} }
// DecryptBackupFile decrypts an encrypted backup file // DecryptBackupFile decrypts an encrypted backup file


@@ -443,6 +443,14 @@ func (e *Engine) BackupCluster(ctx context.Context) error {
defer wg.Done() defer wg.Done()
defer func() { <-semaphore }() // Release defer func() { <-semaphore }() // Release
// Panic recovery - prevent one database failure from crashing entire cluster backup
defer func() {
if r := recover(); r != nil {
e.log.Error("Panic in database backup goroutine", "database", name, "panic", r)
atomic.AddInt32(&failCount, 1)
}
}()
// Check for cancellation at start of goroutine // Check for cancellation at start of goroutine
select { select {
case <-ctx.Done(): case <-ctx.Done():
@@ -502,26 +510,10 @@ func (e *Engine) BackupCluster(ctx context.Context) error {
cmd := e.db.BuildBackupCommand(name, dumpFile, options) cmd := e.db.BuildBackupCommand(name, dumpFile, options)
// Calculate timeout based on database size: // NO TIMEOUT for individual database backups
// - Minimum 2 hours for small databases // Large databases with large objects can take many hours
// - Add 1 hour per 20GB for large databases // The parent context handles cancellation if needed
// - This allows ~69GB database to take up to 5+ hours err := e.executeCommand(ctx, cmd, dumpFile)
timeout := 2 * time.Hour
if size, err := e.db.GetDatabaseSize(ctx, name); err == nil {
sizeGB := size / (1024 * 1024 * 1024)
if sizeGB > 20 {
extraHours := (sizeGB / 20) + 1
timeout = time.Duration(2+extraHours) * time.Hour
mu.Lock()
e.printf(" Extended timeout: %v (for %dGB database)\n", timeout, sizeGB)
mu.Unlock()
}
}
dbCtx, cancel := context.WithTimeout(ctx, timeout)
defer cancel()
err := e.executeCommand(dbCtx, cmd, dumpFile)
cancel()
if err != nil { if err != nil {
e.log.Warn("Failed to backup database", "database", name, "error", err) e.log.Warn("Failed to backup database", "database", name, "error", err)
@@ -614,12 +606,36 @@ func (e *Engine) executeCommandWithProgress(ctx context.Context, cmdArgs []strin
return fmt.Errorf("failed to start command: %w", err) return fmt.Errorf("failed to start command: %w", err)
} }
// Monitor progress via stderr // Monitor progress via stderr in goroutine
go e.monitorCommandProgress(stderr, tracker) stderrDone := make(chan struct{})
go func() {
defer close(stderrDone)
e.monitorCommandProgress(stderr, tracker)
}()
// Wait for command to complete // Wait for command to complete with proper context handling
if err := cmd.Wait(); err != nil { cmdDone := make(chan error, 1)
return fmt.Errorf("backup command failed: %w", err) go func() {
cmdDone <- cmd.Wait()
}()
var cmdErr error
select {
case cmdErr = <-cmdDone:
// Command completed (success or failure)
case <-ctx.Done():
// Context cancelled - kill process to unblock
e.log.Warn("Backup cancelled - killing process")
cmd.Process.Kill()
<-cmdDone // Wait for goroutine to finish
cmdErr = ctx.Err()
}
// Wait for stderr reader to finish
<-stderrDone
if cmdErr != nil {
return fmt.Errorf("backup command failed: %w", cmdErr)
} }
return nil return nil
@@ -696,8 +712,12 @@ func (e *Engine) executeMySQLWithProgressAndCompression(ctx context.Context, cmd
return fmt.Errorf("failed to get stderr pipe: %w", err) return fmt.Errorf("failed to get stderr pipe: %w", err)
} }
// Start monitoring progress // Start monitoring progress in goroutine
go e.monitorCommandProgress(stderr, tracker) stderrDone := make(chan struct{})
go func() {
defer close(stderrDone)
e.monitorCommandProgress(stderr, tracker)
}()
// Start both commands // Start both commands
if err := gzipCmd.Start(); err != nil { if err := gzipCmd.Start(); err != nil {
@@ -705,20 +725,41 @@ func (e *Engine) executeMySQLWithProgressAndCompression(ctx context.Context, cmd
} }
if err := dumpCmd.Start(); err != nil { if err := dumpCmd.Start(); err != nil {
gzipCmd.Process.Kill()
return fmt.Errorf("failed to start mysqldump: %w", err) return fmt.Errorf("failed to start mysqldump: %w", err)
} }
// Wait for mysqldump to complete // Wait for mysqldump with context handling
if err := dumpCmd.Wait(); err != nil { dumpDone := make(chan error, 1)
return fmt.Errorf("mysqldump failed: %w", err) go func() {
dumpDone <- dumpCmd.Wait()
}()
var dumpErr error
select {
case dumpErr = <-dumpDone:
// mysqldump completed
case <-ctx.Done():
e.log.Warn("Backup cancelled - killing mysqldump")
dumpCmd.Process.Kill()
gzipCmd.Process.Kill()
<-dumpDone
return ctx.Err()
} }
// Wait for stderr reader
<-stderrDone
// Close pipe and wait for gzip // Close pipe and wait for gzip
pipe.Close() pipe.Close()
if err := gzipCmd.Wait(); err != nil { if err := gzipCmd.Wait(); err != nil {
return fmt.Errorf("gzip failed: %w", err) return fmt.Errorf("gzip failed: %w", err)
} }
if dumpErr != nil {
return fmt.Errorf("mysqldump failed: %w", dumpErr)
}
return nil return nil
} }
@@ -749,19 +790,45 @@ func (e *Engine) executeMySQLWithCompression(ctx context.Context, cmdArgs []stri
gzipCmd.Stdin = stdin gzipCmd.Stdin = stdin
gzipCmd.Stdout = outFile gzipCmd.Stdout = outFile
// Start both commands // Start gzip first
if err := gzipCmd.Start(); err != nil { if err := gzipCmd.Start(); err != nil {
return fmt.Errorf("failed to start gzip: %w", err) return fmt.Errorf("failed to start gzip: %w", err)
} }
if err := dumpCmd.Run(); err != nil { // Start mysqldump
return fmt.Errorf("mysqldump failed: %w", err) if err := dumpCmd.Start(); err != nil {
gzipCmd.Process.Kill()
return fmt.Errorf("failed to start mysqldump: %w", err)
} }
// Wait for mysqldump with context handling
dumpDone := make(chan error, 1)
go func() {
dumpDone <- dumpCmd.Wait()
}()
var dumpErr error
select {
case dumpErr = <-dumpDone:
// mysqldump completed
case <-ctx.Done():
e.log.Warn("Backup cancelled - killing mysqldump")
dumpCmd.Process.Kill()
gzipCmd.Process.Kill()
<-dumpDone
return ctx.Err()
}
// Close pipe and wait for gzip
stdin.Close()
if err := gzipCmd.Wait(); err != nil { if err := gzipCmd.Wait(); err != nil {
return fmt.Errorf("gzip failed: %w", err) return fmt.Errorf("gzip failed: %w", err)
} }
if dumpErr != nil {
return fmt.Errorf("mysqldump failed: %w", dumpErr)
}
return nil return nil
} }
@@ -898,15 +965,46 @@ func (e *Engine) createArchive(ctx context.Context, sourceDir, outputFile string
goto regularTar goto regularTar
} }
// Wait for tar to finish // Wait for tar with proper context handling
if err := cmd.Wait(); err != nil { tarDone := make(chan error, 1)
go func() {
tarDone <- cmd.Wait()
}()
var tarErr error
select {
case tarErr = <-tarDone:
// tar completed
case <-ctx.Done():
e.log.Warn("Archive creation cancelled - killing processes")
cmd.Process.Kill()
pigzCmd.Process.Kill() pigzCmd.Process.Kill()
return fmt.Errorf("tar failed: %w", err) <-tarDone
return ctx.Err()
} }
// Wait for pigz to finish if tarErr != nil {
if err := pigzCmd.Wait(); err != nil { pigzCmd.Process.Kill()
return fmt.Errorf("pigz compression failed: %w", err) return fmt.Errorf("tar failed: %w", tarErr)
}
// Wait for pigz with proper context handling
pigzDone := make(chan error, 1)
go func() {
pigzDone <- pigzCmd.Wait()
}()
var pigzErr error
select {
case pigzErr = <-pigzDone:
case <-ctx.Done():
pigzCmd.Process.Kill()
<-pigzDone
return ctx.Err()
}
if pigzErr != nil {
return fmt.Errorf("pigz compression failed: %w", pigzErr)
} }
return nil return nil
} }
@@ -1251,8 +1349,10 @@ func (e *Engine) executeCommand(ctx context.Context, cmdArgs []string, outputFil
return fmt.Errorf("failed to start backup command: %w", err) return fmt.Errorf("failed to start backup command: %w", err)
} }
// Stream stderr output (don't buffer it all in memory) // Stream stderr output in goroutine (don't buffer it all in memory)
stderrDone := make(chan struct{})
go func() { go func() {
defer close(stderrDone)
scanner := bufio.NewScanner(stderr) scanner := bufio.NewScanner(stderr)
scanner.Buffer(make([]byte, 64*1024), 1024*1024) // 1MB max line size scanner.Buffer(make([]byte, 64*1024), 1024*1024) // 1MB max line size
for scanner.Scan() { for scanner.Scan() {
@@ -1263,10 +1363,30 @@ func (e *Engine) executeCommand(ctx context.Context, cmdArgs []string, outputFil
} }
}() }()
// Wait for command to complete // Wait for command to complete with proper context handling
if err := cmd.Wait(); err != nil { cmdDone := make(chan error, 1)
e.log.Error("Backup command failed", "error", err, "database", filepath.Base(outputFile)) go func() {
return fmt.Errorf("backup command failed: %w", err) cmdDone <- cmd.Wait()
}()
var cmdErr error
select {
case cmdErr = <-cmdDone:
// Command completed (success or failure)
case <-ctx.Done():
// Context cancelled - kill process to unblock
e.log.Warn("Backup cancelled - killing pg_dump process")
cmd.Process.Kill()
<-cmdDone // Wait for goroutine to finish
cmdErr = ctx.Err()
}
// Wait for stderr reader to finish
<-stderrDone
if cmdErr != nil {
e.log.Error("Backup command failed", "error", cmdErr, "database", filepath.Base(outputFile))
return fmt.Errorf("backup command failed: %w", cmdErr)
} }
return nil return nil


@@ -12,6 +12,7 @@ import (
"strings" "strings"
"sync" "sync"
"syscall" "syscall"
"time"
"dbbackup/internal/logger" "dbbackup/internal/logger"
) )
@@ -116,8 +117,11 @@ func KillOrphanedProcesses(log logger.Logger) error {
// findProcessesByName returns PIDs of processes matching the given name // findProcessesByName returns PIDs of processes matching the given name
func findProcessesByName(name string, excludePID int) ([]int, error) { func findProcessesByName(name string, excludePID int) ([]int, error) {
// Use pgrep for efficient process searching // Use pgrep for efficient process searching with timeout
cmd := exec.Command("pgrep", "-x", name) ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
cmd := exec.CommandContext(ctx, "pgrep", "-x", name)
output, err := cmd.Output() output, err := cmd.Output()
if err != nil { if err != nil {
// Exit code 1 means no processes found (not an error) // Exit code 1 means no processes found (not an error)


@@ -217,8 +217,8 @@ func New() *Config {
SingleDBName: getEnvString("SINGLE_DB_NAME", ""), SingleDBName: getEnvString("SINGLE_DB_NAME", ""),
RestoreDBName: getEnvString("RESTORE_DB_NAME", ""), RestoreDBName: getEnvString("RESTORE_DB_NAME", ""),
// Timeouts // Timeouts - default 24 hours (1440 min) to handle very large databases with large objects
ClusterTimeoutMinutes: getEnvInt("CLUSTER_TIMEOUT_MIN", 240), ClusterTimeoutMinutes: getEnvInt("CLUSTER_TIMEOUT_MIN", 1440),
// Cluster parallelism (default: 2 concurrent operations for faster cluster backup/restore) // Cluster parallelism (default: 2 concurrent operations for faster cluster backup/restore)
ClusterParallelism: getEnvInt("CLUSTER_PARALLELISM", 2), ClusterParallelism: getEnvInt("CLUSTER_PARALLELISM", 2),
@@ -227,7 +227,7 @@ func New() *Config {
WorkDir: getEnvString("WORK_DIR", ""), WorkDir: getEnvString("WORK_DIR", ""),
// Swap file management // Swap file management
SwapFilePath: "", // Will be set after WorkDir is initialized SwapFilePath: "", // Will be set after WorkDir is initialized
SwapFileSizeGB: getEnvInt("SWAP_FILE_SIZE_GB", 0), // 0 = disabled by default SwapFileSizeGB: getEnvInt("SWAP_FILE_SIZE_GB", 0), // 0 = disabled by default
AutoSwap: getEnvBool("AUTO_SWAP", false), AutoSwap: getEnvBool("AUTO_SWAP", false),


@@ -28,8 +28,9 @@ type LocalConfig struct {
DumpJobs int DumpJobs int
// Performance settings // Performance settings
CPUWorkload string CPUWorkload string
MaxCores int MaxCores int
ClusterTimeout int // Cluster operation timeout in minutes (default: 1440 = 24 hours)
// Security settings // Security settings
RetentionDays int RetentionDays int
@@ -121,6 +122,10 @@ func LoadLocalConfig() (*LocalConfig, error) {
if mc, err := strconv.Atoi(value); err == nil { if mc, err := strconv.Atoi(value); err == nil {
cfg.MaxCores = mc cfg.MaxCores = mc
} }
case "cluster_timeout":
if ct, err := strconv.Atoi(value); err == nil {
cfg.ClusterTimeout = ct
}
} }
case "security": case "security":
switch key { switch key {
@@ -199,6 +204,9 @@ func SaveLocalConfig(cfg *LocalConfig) error {
if cfg.MaxCores != 0 { if cfg.MaxCores != 0 {
sb.WriteString(fmt.Sprintf("max_cores = %d\n", cfg.MaxCores)) sb.WriteString(fmt.Sprintf("max_cores = %d\n", cfg.MaxCores))
} }
if cfg.ClusterTimeout != 0 {
sb.WriteString(fmt.Sprintf("cluster_timeout = %d\n", cfg.ClusterTimeout))
}
sb.WriteString("\n") sb.WriteString("\n")
// Security section // Security section
@@ -268,6 +276,10 @@ func ApplyLocalConfig(cfg *Config, local *LocalConfig) {
if local.MaxCores != 0 { if local.MaxCores != 0 {
cfg.MaxCores = local.MaxCores cfg.MaxCores = local.MaxCores
} }
// Apply cluster timeout from config file (overrides default)
if local.ClusterTimeout != 0 {
cfg.ClusterTimeoutMinutes = local.ClusterTimeout
}
if cfg.RetentionDays == 30 && local.RetentionDays != 0 { if cfg.RetentionDays == 30 && local.RetentionDays != 0 {
cfg.RetentionDays = local.RetentionDays cfg.RetentionDays = local.RetentionDays
} }
@@ -282,21 +294,22 @@ func ApplyLocalConfig(cfg *Config, local *LocalConfig) {
// ConfigFromConfig creates a LocalConfig from a Config // ConfigFromConfig creates a LocalConfig from a Config
func ConfigFromConfig(cfg *Config) *LocalConfig { func ConfigFromConfig(cfg *Config) *LocalConfig {
return &LocalConfig{ return &LocalConfig{
DBType: cfg.DatabaseType, DBType: cfg.DatabaseType,
Host: cfg.Host, Host: cfg.Host,
Port: cfg.Port, Port: cfg.Port,
User: cfg.User, User: cfg.User,
Database: cfg.Database, Database: cfg.Database,
SSLMode: cfg.SSLMode, SSLMode: cfg.SSLMode,
BackupDir: cfg.BackupDir, BackupDir: cfg.BackupDir,
WorkDir: cfg.WorkDir, WorkDir: cfg.WorkDir,
Compression: cfg.CompressionLevel, Compression: cfg.CompressionLevel,
Jobs: cfg.Jobs, Jobs: cfg.Jobs,
DumpJobs: cfg.DumpJobs, DumpJobs: cfg.DumpJobs,
CPUWorkload: cfg.CPUWorkloadType, CPUWorkload: cfg.CPUWorkloadType,
MaxCores: cfg.MaxCores, MaxCores: cfg.MaxCores,
RetentionDays: cfg.RetentionDays, ClusterTimeout: cfg.ClusterTimeoutMinutes,
MinBackups: cfg.MinBackups, RetentionDays: cfg.RetentionDays,
MaxRetries: cfg.MaxRetries, MinBackups: cfg.MinBackups,
MaxRetries: cfg.MaxRetries,
} }
} }


@@ -234,10 +234,26 @@ func (e *MySQLDumpEngine) Backup(ctx context.Context, opts *BackupOptions) (*Bac
gzWriter.Close() gzWriter.Close()
} }
// Wait for command // Wait for command with proper context handling
if err := cmd.Wait(); err != nil { cmdDone := make(chan error, 1)
go func() {
cmdDone <- cmd.Wait()
}()
var cmdErr error
select {
case cmdErr = <-cmdDone:
// Command completed
case <-ctx.Done():
e.log.Warn("MySQL backup cancelled - killing process")
cmd.Process.Kill()
<-cmdDone
cmdErr = ctx.Err()
}
if cmdErr != nil {
stderr := stderrBuf.String() stderr := stderrBuf.String()
return nil, fmt.Errorf("mysqldump failed: %w\n%s", err, stderr) return nil, fmt.Errorf("mysqldump failed: %w\n%s", cmdErr, stderr)
} }
// Get file info // Get file info
@@ -442,8 +458,25 @@ func (e *MySQLDumpEngine) BackupToWriter(ctx context.Context, w io.Writer, opts
gzWriter.Close() gzWriter.Close()
} }
if err := cmd.Wait(); err != nil { // Wait for command with proper context handling
return nil, fmt.Errorf("mysqldump failed: %w\n%s", err, stderrBuf.String()) cmdDone := make(chan error, 1)
go func() {
cmdDone <- cmd.Wait()
}()
var cmdErr error
select {
case cmdErr = <-cmdDone:
// Command completed
case <-ctx.Done():
e.log.Warn("MySQL streaming backup cancelled - killing process")
cmd.Process.Kill()
<-cmdDone
cmdErr = ctx.Err()
}
if cmdErr != nil {
return nil, fmt.Errorf("mysqldump failed: %w\n%s", cmdErr, stderrBuf.String())
} }
return &BackupResult{ return &BackupResult{


@@ -212,7 +212,11 @@ func (m *BinlogManager) detectTools() error {
// detectServerType determines if we're working with MySQL or MariaDB // detectServerType determines if we're working with MySQL or MariaDB
func (m *BinlogManager) detectServerType() DatabaseType { func (m *BinlogManager) detectServerType() DatabaseType {
cmd := exec.Command(m.mysqlbinlogPath, "--version") // Use timeout to prevent blocking if command hangs
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
cmd := exec.CommandContext(ctx, m.mysqlbinlogPath, "--version")
output, err := cmd.Output() output, err := cmd.Output()
if err != nil { if err != nil {
return DatabaseMySQL // Default to MySQL return DatabaseMySQL // Default to MySQL


@@ -4,6 +4,7 @@ import (
"bufio" "bufio"
"bytes" "bytes"
"compress/gzip" "compress/gzip"
"context"
"encoding/json" "encoding/json"
"fmt" "fmt"
"io" "io"
@@ -12,6 +13,7 @@ import (
"path/filepath" "path/filepath"
"regexp" "regexp"
"strings" "strings"
"time"
"dbbackup/internal/logger" "dbbackup/internal/logger"
) )
@@ -60,9 +62,9 @@ type DiagnoseDetails struct {
TableList []string `json:"table_list,omitempty"` TableList []string `json:"table_list,omitempty"`
// Compression analysis // Compression analysis
GzipValid bool `json:"gzip_valid,omitempty"` GzipValid bool `json:"gzip_valid,omitempty"`
GzipError string `json:"gzip_error,omitempty"` GzipError string `json:"gzip_error,omitempty"`
ExpandedSize int64 `json:"expanded_size,omitempty"` ExpandedSize int64 `json:"expanded_size,omitempty"`
CompressionRatio float64 `json:"compression_ratio,omitempty"` CompressionRatio float64 `json:"compression_ratio,omitempty"`
} }
@@ -157,7 +159,7 @@ func (d *Diagnoser) diagnosePgDump(filePath string, result *DiagnoseResult) {
result.IsCorrupted = true result.IsCorrupted = true
result.Details.HasPGDMPSignature = false result.Details.HasPGDMPSignature = false
result.Details.FirstBytes = fmt.Sprintf("%q", header[:minInt(n, 20)]) result.Details.FirstBytes = fmt.Sprintf("%q", header[:minInt(n, 20)])
result.Errors = append(result.Errors, result.Errors = append(result.Errors,
"Missing PGDMP signature - file is NOT PostgreSQL custom format", "Missing PGDMP signature - file is NOT PostgreSQL custom format",
"This file may be SQL format incorrectly named as .dump", "This file may be SQL format incorrectly named as .dump",
"Try: file "+filePath+" to check actual file type") "Try: file "+filePath+" to check actual file type")
@@ -185,7 +187,7 @@ func (d *Diagnoser) diagnosePgDumpGz(filePath string, result *DiagnoseResult) {
result.IsCorrupted = true result.IsCorrupted = true
result.Details.GzipValid = false result.Details.GzipValid = false
result.Details.GzipError = err.Error() result.Details.GzipError = err.Error()
result.Errors = append(result.Errors, result.Errors = append(result.Errors,
fmt.Sprintf("Invalid gzip format: %v", err), fmt.Sprintf("Invalid gzip format: %v", err),
"The file may be truncated or corrupted during transfer") "The file may be truncated or corrupted during transfer")
return return
@@ -210,7 +212,7 @@ func (d *Diagnoser) diagnosePgDumpGz(filePath string, result *DiagnoseResult) {
} else { } else {
result.Details.HasPGDMPSignature = false result.Details.HasPGDMPSignature = false
result.Details.FirstBytes = fmt.Sprintf("%q", header[:minInt(n, 20)]) result.Details.FirstBytes = fmt.Sprintf("%q", header[:minInt(n, 20)])
// Check if it's actually SQL content // Check if it's actually SQL content
content := string(header[:n]) content := string(header[:n])
if strings.Contains(content, "PostgreSQL") || strings.Contains(content, "pg_dump") || if strings.Contains(content, "PostgreSQL") || strings.Contains(content, "pg_dump") ||
@@ -233,7 +235,7 @@ func (d *Diagnoser) diagnosePgDumpGz(filePath string, result *DiagnoseResult) {
// Verify full gzip stream integrity by reading to end // Verify full gzip stream integrity by reading to end
file.Seek(0, 0) file.Seek(0, 0)
gz, _ = gzip.NewReader(file) gz, _ = gzip.NewReader(file)
var totalRead int64 var totalRead int64
buf := make([]byte, 32*1024) buf := make([]byte, 32*1024)
for { for {
@@ -255,7 +257,7 @@ func (d *Diagnoser) diagnosePgDumpGz(filePath string, result *DiagnoseResult) {
} }
} }
gz.Close() gz.Close()
result.Details.ExpandedSize = totalRead result.Details.ExpandedSize = totalRead
if result.FileSize > 0 { if result.FileSize > 0 {
result.Details.CompressionRatio = float64(totalRead) / float64(result.FileSize) result.Details.CompressionRatio = float64(totalRead) / float64(result.FileSize)
@@ -392,7 +394,7 @@ func (d *Diagnoser) diagnoseSQLScript(filePath string, compressed bool, result *
lastCopyTable, copyStartLine), lastCopyTable, copyStartLine),
"The backup was truncated during data export", "The backup was truncated during data export",
"This explains the 'syntax error' during restore - COPY data is being interpreted as SQL") "This explains the 'syntax error' during restore - COPY data is being interpreted as SQL")
if len(copyDataSamples) > 0 { if len(copyDataSamples) > 0 {
result.Errors = append(result.Errors, result.Errors = append(result.Errors,
fmt.Sprintf("Sample orphaned data: %s", copyDataSamples[0])) fmt.Sprintf("Sample orphaned data: %s", copyDataSamples[0]))
@@ -412,8 +414,12 @@ func (d *Diagnoser) diagnoseSQLScript(filePath string, compressed bool, result *
// diagnoseClusterArchive analyzes a cluster tar.gz archive // diagnoseClusterArchive analyzes a cluster tar.gz archive
func (d *Diagnoser) diagnoseClusterArchive(filePath string, result *DiagnoseResult) { func (d *Diagnoser) diagnoseClusterArchive(filePath string, result *DiagnoseResult) {
// First verify tar.gz integrity // First verify tar.gz integrity with timeout
cmd := exec.Command("tar", "-tzf", filePath) // 5 minutes for large archives (multi-GB archives need more time)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
defer cancel()
cmd := exec.CommandContext(ctx, "tar", "-tzf", filePath)
output, err := cmd.Output() output, err := cmd.Output()
if err != nil { if err != nil {
result.IsValid = false result.IsValid = false
@@ -491,13 +497,18 @@ func (d *Diagnoser) diagnoseUnknown(filePath string, result *DiagnoseResult) {
// verifyWithPgRestore uses pg_restore --list to verify dump integrity // verifyWithPgRestore uses pg_restore --list to verify dump integrity
func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult) { func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult) {
cmd := exec.Command("pg_restore", "--list", filePath) // Use timeout to prevent blocking on very large dump files
// 5 minutes for large dumps (multi-GB dumps with many tables)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
defer cancel()
cmd := exec.CommandContext(ctx, "pg_restore", "--list", filePath)
output, err := cmd.CombinedOutput() output, err := cmd.CombinedOutput()
if err != nil { if err != nil {
result.Details.PgRestoreListable = false result.Details.PgRestoreListable = false
result.Details.PgRestoreError = string(output) result.Details.PgRestoreError = string(output)
// Check for specific errors // Check for specific errors
errStr := string(output) errStr := string(output)
if strings.Contains(errStr, "unexpected end of file") || if strings.Contains(errStr, "unexpected end of file") ||
@@ -544,7 +555,11 @@ func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult)
// DiagnoseClusterDumps extracts and diagnoses all dumps in a cluster archive // DiagnoseClusterDumps extracts and diagnoses all dumps in a cluster archive
func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*DiagnoseResult, error) { func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*DiagnoseResult, error) {
// First, try to list archive contents without extracting (fast check) // First, try to list archive contents without extracting (fast check)
listCmd := exec.Command("tar", "-tzf", archivePath) // 10 minutes for very large archives
listCtx, listCancel := context.WithTimeout(context.Background(), 10*time.Minute)
defer listCancel()
listCmd := exec.CommandContext(listCtx, "tar", "-tzf", archivePath)
listOutput, listErr := listCmd.CombinedOutput() listOutput, listErr := listCmd.CombinedOutput()
if listErr != nil { if listErr != nil {
// Archive listing failed - likely corrupted // Archive listing failed - likely corrupted
@@ -557,9 +572,9 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
IsCorrupted: true, IsCorrupted: true,
Details: &DiagnoseDetails{}, Details: &DiagnoseDetails{},
} }
errOutput := string(listOutput) errOutput := string(listOutput)
if strings.Contains(errOutput, "unexpected end of file") || if strings.Contains(errOutput, "unexpected end of file") ||
strings.Contains(errOutput, "Unexpected EOF") || strings.Contains(errOutput, "Unexpected EOF") ||
strings.Contains(errOutput, "truncated") { strings.Contains(errOutput, "truncated") {
errResult.IsTruncated = true errResult.IsTruncated = true
@@ -574,28 +589,34 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
fmt.Sprintf("tar error: %s", truncateString(errOutput, 300)), fmt.Sprintf("tar error: %s", truncateString(errOutput, 300)),
"Run manually: tar -tzf "+archivePath+" 2>&1 | tail -50") "Run manually: tar -tzf "+archivePath+" 2>&1 | tail -50")
} }
return []*DiagnoseResult{errResult}, nil return []*DiagnoseResult{errResult}, nil
} }
// Archive is listable - now check disk space before extraction // Archive is listable - now check disk space before extraction
files := strings.Split(strings.TrimSpace(string(listOutput)), "\n") files := strings.Split(strings.TrimSpace(string(listOutput)), "\n")
// Check if we have enough disk space (estimate 4x archive size needed) // Check if we have enough disk space (estimate 4x archive size needed)
archiveInfo, _ := os.Stat(archivePath) archiveInfo, _ := os.Stat(archivePath)
requiredSpace := archiveInfo.Size() * 4 requiredSpace := archiveInfo.Size() * 4
// Check temp directory space - try to extract metadata first // Check temp directory space - try to extract metadata first
if stat, err := os.Stat(tempDir); err == nil && stat.IsDir() { if stat, err := os.Stat(tempDir); err == nil && stat.IsDir() {
// Try extraction of a small test file first // Try extraction of a small test file first with timeout
testCmd := exec.Command("tar", "-xzf", archivePath, "-C", tempDir, "--wildcards", "*.json", "--wildcards", "globals.sql") testCtx, testCancel := context.WithTimeout(context.Background(), 30*time.Second)
testCmd := exec.CommandContext(testCtx, "tar", "-xzf", archivePath, "-C", tempDir, "--wildcards", "*.json", "--wildcards", "globals.sql")
testCmd.Run() // Ignore error - just try to extract metadata testCmd.Run() // Ignore error - just try to extract metadata
testCancel()
} }
d.log.Info("Archive listing successful", "files", len(files)) d.log.Info("Archive listing successful", "files", len(files))
// Try full extraction // Try full extraction - NO TIMEOUT here as large archives can take a long time
cmd := exec.Command("tar", "-xzf", archivePath, "-C", tempDir) // Use a generous timeout (30 minutes) for very large archives
extractCtx, extractCancel := context.WithTimeout(context.Background(), 30*time.Minute)
defer extractCancel()
cmd := exec.CommandContext(extractCtx, "tar", "-xzf", archivePath, "-C", tempDir)
var stderr bytes.Buffer var stderr bytes.Buffer
cmd.Stderr = &stderr cmd.Stderr = &stderr
if err := cmd.Run(); err != nil { if err := cmd.Run(); err != nil {
@@ -608,14 +629,14 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
IsValid: false, IsValid: false,
Details: &DiagnoseDetails{}, Details: &DiagnoseDetails{},
} }
errOutput := stderr.String() errOutput := stderr.String()
if strings.Contains(errOutput, "No space left") || if strings.Contains(errOutput, "No space left") ||
strings.Contains(errOutput, "cannot write") || strings.Contains(errOutput, "cannot write") ||
strings.Contains(errOutput, "Disk quota exceeded") { strings.Contains(errOutput, "Disk quota exceeded") {
errResult.Errors = append(errResult.Errors, errResult.Errors = append(errResult.Errors,
"INSUFFICIENT DISK SPACE to extract archive for diagnosis", "INSUFFICIENT DISK SPACE to extract archive for diagnosis",
fmt.Sprintf("Archive size: %s (needs ~%s for extraction)", fmt.Sprintf("Archive size: %s (needs ~%s for extraction)",
formatBytes(archiveInfo.Size()), formatBytes(requiredSpace)), formatBytes(archiveInfo.Size()), formatBytes(requiredSpace)),
"Use CLI diagnosis instead: dbbackup restore diagnose "+archivePath, "Use CLI diagnosis instead: dbbackup restore diagnose "+archivePath,
"Or use --workdir flag to specify a location with more space") "Or use --workdir flag to specify a location with more space")
@@ -634,7 +655,7 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
fmt.Sprintf("Extraction failed: %v", err), fmt.Sprintf("Extraction failed: %v", err),
fmt.Sprintf("tar error: %s", truncateString(errOutput, 300))) fmt.Sprintf("tar error: %s", truncateString(errOutput, 300)))
} }
// Still report what files we found in the listing // Still report what files we found in the listing
var dumpFiles []string var dumpFiles []string
for _, f := range files { for _, f := range files {
@@ -648,7 +669,7 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
errResult.Warnings = append(errResult.Warnings, errResult.Warnings = append(errResult.Warnings,
fmt.Sprintf("Archive contains %d database dumps (listing only)", len(dumpFiles))) fmt.Sprintf("Archive contains %d database dumps (listing only)", len(dumpFiles)))
} }
return []*DiagnoseResult{errResult}, nil return []*DiagnoseResult{errResult}, nil
} }


@@ -27,7 +27,7 @@ type Engine struct {
progress progress.Indicator progress progress.Indicator
detailedReporter *progress.DetailedReporter detailedReporter *progress.DetailedReporter
dryRun bool dryRun bool
debugLogPath string // Path to save debug log on error debugLogPath string // Path to save debug log on error
errorCollector *ErrorCollector // Collects detailed error info errorCollector *ErrorCollector // Collects detailed error info
} }
@@ -357,43 +357,68 @@ func (e *Engine) executeRestoreCommandWithContext(ctx context.Context, cmdArgs [
return fmt.Errorf("failed to start restore command: %w", err) return fmt.Errorf("failed to start restore command: %w", err)
} }
// Read stderr in chunks to log errors without loading all into memory // Read stderr in goroutine to avoid blocking
buf := make([]byte, 4096)
var lastError string var lastError string
var errorCount int var errorCount int
const maxErrors = 10 // Limit captured errors to prevent OOM stderrDone := make(chan struct{})
for { go func() {
n, err := stderr.Read(buf) defer close(stderrDone)
if n > 0 { buf := make([]byte, 4096)
chunk := string(buf[:n]) const maxErrors = 10 // Limit captured errors to prevent OOM
for {
// Feed to error collector if enabled n, err := stderr.Read(buf)
if collector != nil { if n > 0 {
collector.CaptureStderr(chunk) chunk := string(buf[:n])
}
// Feed to error collector if enabled
// Only capture REAL errors, not verbose output if collector != nil {
if strings.Contains(chunk, "ERROR:") || strings.Contains(chunk, "FATAL:") || strings.Contains(chunk, "error:") { collector.CaptureStderr(chunk)
lastError = strings.TrimSpace(chunk)
errorCount++
if errorCount <= maxErrors {
e.log.Warn("Restore stderr", "output", chunk)
} }
// Only capture REAL errors, not verbose output
if strings.Contains(chunk, "ERROR:") || strings.Contains(chunk, "FATAL:") || strings.Contains(chunk, "error:") {
lastError = strings.TrimSpace(chunk)
errorCount++
if errorCount <= maxErrors {
e.log.Warn("Restore stderr", "output", chunk)
}
}
// Note: --verbose output is discarded to prevent OOM
}
if err != nil {
break
} }
// Note: --verbose output is discarded to prevent OOM
}
if err != nil {
break
} }
}()
// Wait for command with proper context handling
cmdDone := make(chan error, 1)
go func() {
cmdDone <- cmd.Wait()
}()
var cmdErr error
select {
case cmdErr = <-cmdDone:
// Command completed (success or failure)
case <-ctx.Done():
// Context cancelled - kill process
e.log.Warn("Restore cancelled - killing process")
cmd.Process.Kill()
<-cmdDone
cmdErr = ctx.Err()
} }
if err := cmd.Wait(); err != nil { // Wait for stderr reader to finish
<-stderrDone
if cmdErr != nil {
// Get exit code // Get exit code
exitCode := 1 exitCode := 1
if exitErr, ok := err.(*exec.ExitError); ok { if exitErr, ok := cmdErr.(*exec.ExitError); ok {
exitCode = exitErr.ExitCode() exitCode = exitErr.ExitCode()
} }
// PostgreSQL pg_restore returns exit code 1 even for ignorable errors // PostgreSQL pg_restore returns exit code 1 even for ignorable errors
// Check if errors are ignorable (already exists, duplicate, etc.) // Check if errors are ignorable (already exists, duplicate, etc.)
if lastError != "" && e.isIgnorableError(lastError) { if lastError != "" && e.isIgnorableError(lastError) {
@@ -427,10 +452,10 @@ func (e *Engine) executeRestoreCommandWithContext(ctx context.Context, cmdArgs [
errType, errType,
errHint, errHint,
) )
// Print report to console // Print report to console
collector.PrintReport(report) collector.PrintReport(report)
// Save to file // Save to file
if e.debugLogPath != "" { if e.debugLogPath != "" {
if saveErr := collector.SaveReport(report, e.debugLogPath); saveErr != nil { if saveErr := collector.SaveReport(report, e.debugLogPath); saveErr != nil {
@@ -481,31 +506,56 @@ func (e *Engine) executeRestoreWithDecompression(ctx context.Context, archivePat
return fmt.Errorf("failed to start restore command: %w", err) return fmt.Errorf("failed to start restore command: %w", err)
} }
// Read stderr in chunks to log errors without loading all into memory // Read stderr in goroutine to avoid blocking
buf := make([]byte, 4096)
var lastError string var lastError string
var errorCount int var errorCount int
const maxErrors = 10 // Limit captured errors to prevent OOM stderrDone := make(chan struct{})
for { go func() {
n, err := stderr.Read(buf) defer close(stderrDone)
if n > 0 { buf := make([]byte, 4096)
chunk := string(buf[:n]) const maxErrors = 10 // Limit captured errors to prevent OOM
// Only capture REAL errors, not verbose output for {
if strings.Contains(chunk, "ERROR:") || strings.Contains(chunk, "FATAL:") || strings.Contains(chunk, "error:") { n, err := stderr.Read(buf)
lastError = strings.TrimSpace(chunk) if n > 0 {
errorCount++ chunk := string(buf[:n])
if errorCount <= maxErrors { // Only capture REAL errors, not verbose output
e.log.Warn("Restore stderr", "output", chunk) if strings.Contains(chunk, "ERROR:") || strings.Contains(chunk, "FATAL:") || strings.Contains(chunk, "error:") {
lastError = strings.TrimSpace(chunk)
errorCount++
if errorCount <= maxErrors {
e.log.Warn("Restore stderr", "output", chunk)
}
} }
// Note: --verbose output is discarded to prevent OOM
}
if err != nil {
break
} }
// Note: --verbose output is discarded to prevent OOM
}
if err != nil {
break
} }
}()
// Wait for command with proper context handling
cmdDone := make(chan error, 1)
go func() {
cmdDone <- cmd.Wait()
}()
var cmdErr error
select {
case cmdErr = <-cmdDone:
// Command completed (success or failure)
case <-ctx.Done():
// Context cancelled - kill process
e.log.Warn("Restore with decompression cancelled - killing process")
cmd.Process.Kill()
<-cmdDone
cmdErr = ctx.Err()
} }
if err := cmd.Wait(); err != nil { // Wait for stderr reader to finish
<-stderrDone
if cmdErr != nil {
// PostgreSQL pg_restore returns exit code 1 even for ignorable errors // PostgreSQL pg_restore returns exit code 1 even for ignorable errors
// Check if errors are ignorable (already exists, duplicate, etc.) // Check if errors are ignorable (already exists, duplicate, etc.)
if lastError != "" && e.isIgnorableError(lastError) { if lastError != "" && e.isIgnorableError(lastError) {
@@ -517,18 +567,18 @@ func (e *Engine) executeRestoreWithDecompression(ctx context.Context, archivePat
if lastError != "" { if lastError != "" {
classification := checks.ClassifyError(lastError) classification := checks.ClassifyError(lastError)
e.log.Error("Restore with decompression failed", e.log.Error("Restore with decompression failed",
"error", err, "error", cmdErr,
"last_stderr", lastError, "last_stderr", lastError,
"error_count", errorCount, "error_count", errorCount,
"error_type", classification.Type, "error_type", classification.Type,
"hint", classification.Hint, "hint", classification.Hint,
"action", classification.Action) "action", classification.Action)
return fmt.Errorf("restore failed: %w (last error: %s, total errors: %d) - %s", return fmt.Errorf("restore failed: %w (last error: %s, total errors: %d) - %s",
err, lastError, errorCount, classification.Hint) cmdErr, lastError, errorCount, classification.Hint)
} }
e.log.Error("Restore with decompression failed", "error", err, "last_stderr", lastError, "error_count", errorCount) e.log.Error("Restore with decompression failed", "error", cmdErr, "last_stderr", lastError, "error_count", errorCount)
return fmt.Errorf("restore failed: %w", err) return fmt.Errorf("restore failed: %w", cmdErr)
} }
return nil return nil
@@ -727,7 +777,7 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
             }
         } else if strings.HasSuffix(dumpFile, ".dump") {
             // Validate custom format dumps using pg_restore --list
-            cmd := exec.Command("pg_restore", "--list", dumpFile)
+            cmd := exec.CommandContext(ctx, "pg_restore", "--list", dumpFile)
             output, err := cmd.CombinedOutput()
             if err != nil {
                 dbName := strings.TrimSuffix(entry.Name(), ".dump")
@@ -812,6 +862,14 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
             defer wg.Done()
             defer func() { <-semaphore }() // Release
+
+            // Panic recovery - prevent one database failure from crashing entire cluster restore
+            defer func() {
+                if r := recover(); r != nil {
+                    e.log.Error("Panic in database restore goroutine", "file", filename, "panic", r)
+                    atomic.AddInt32(&failCount, 1)
+                }
+            }()

             // Update estimator progress (thread-safe)
             mu.Lock()
             estimator.UpdateProgress(idx)
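The added defer/recover keeps a panic in one restore worker from unwinding the whole process: the failure is counted and the remaining goroutines keep running. A minimal sketch of the same worker-level recovery (the worker names and simulated panic are made up):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	var (
		wg        sync.WaitGroup
		failCount int32
	)

	jobs := []string{"db1", "db2", "db3"}
	for _, name := range jobs {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			// recover() turns a panic in this worker into a counted failure
			// instead of crashing the entire process.
			defer func() {
				if r := recover(); r != nil {
					fmt.Println("worker panicked:", name, r)
					atomic.AddInt32(&failCount, 1)
				}
			}()

			if name == "db2" {
				panic("simulated failure") // stands in for an unexpected bug
			}
			fmt.Println("restored:", name)
		}(name)
	}

	wg.Wait()
	fmt.Println("failures:", atomic.LoadInt32(&failCount))
}
```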
@@ -939,16 +997,39 @@ func (e *Engine) extractArchive(ctx context.Context, archivePath, destDir string
     }

     // Discard stderr output in chunks to prevent memory buildup
-    buf := make([]byte, 4096)
-    for {
-        _, err := stderr.Read(buf)
-        if err != nil {
-            break
-        }
-    }
+    stderrDone := make(chan struct{})
+    go func() {
+        defer close(stderrDone)
+        buf := make([]byte, 4096)
+        for {
+            _, err := stderr.Read(buf)
+            if err != nil {
+                break
+            }
+        }
+    }()
+
+    // Wait for command with proper context handling
+    cmdDone := make(chan error, 1)
+    go func() {
+        cmdDone <- cmd.Wait()
+    }()
+
+    var cmdErr error
+    select {
+    case cmdErr = <-cmdDone:
+        // Command completed
+    case <-ctx.Done():
+        e.log.Warn("Archive extraction cancelled - killing process")
+        cmd.Process.Kill()
+        <-cmdDone
+        cmdErr = ctx.Err()
+    }
+
+    <-stderrDone

-    if err := cmd.Wait(); err != nil {
-        return fmt.Errorf("tar extraction failed: %w", err)
+    if cmdErr != nil {
+        return fmt.Errorf("tar extraction failed: %w", cmdErr)
     }

     return nil
 }
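The extraction hunk combines two building blocks: a goroutine that drains stderr and signals completion through a done channel, plus the channel-based wait shown earlier. A simplified sketch of just the stderr-draining half, without the context handling (the tar invocation and archive path are placeholders):

```go
package main

import (
	"fmt"
	"io"
	"os/exec"
)

func main() {
	cmd := exec.Command("tar", "-tzf", "/tmp/example.tar.gz") // hypothetical archive

	stderr, err := cmd.StderrPipe()
	if err != nil {
		fmt.Println("pipe error:", err)
		return
	}
	if err := cmd.Start(); err != nil {
		fmt.Println("start error:", err)
		return
	}

	stderrDone := make(chan struct{})
	go func() {
		defer close(stderrDone)
		// Draining keeps the child from blocking on a full stderr pipe.
		_, _ = io.Copy(io.Discard, stderr)
	}()

	<-stderrDone             // reader hits EOF once the child exits or closes stderr
	waitErr := cmd.Wait()    // then reap the process

	fmt.Println("tar exited with:", waitErr)
}
```

In the real code the drain runs in a goroutine because the main path also has to select on ctx.Done(); this sketch omits that part.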
@@ -981,25 +1062,48 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
         return fmt.Errorf("failed to start psql: %w", err)
     }

-    // Read stderr in chunks
-    buf := make([]byte, 4096)
+    // Read stderr in chunks in goroutine
     var lastError string
-    for {
-        n, err := stderr.Read(buf)
-        if n > 0 {
-            chunk := string(buf[:n])
-            if strings.Contains(chunk, "ERROR") || strings.Contains(chunk, "FATAL") {
-                lastError = chunk
-                e.log.Warn("Globals restore stderr", "output", chunk)
-            }
-        }
-        if err != nil {
-            break
-        }
-    }
+    stderrDone := make(chan struct{})
+    go func() {
+        defer close(stderrDone)
+        buf := make([]byte, 4096)
+        for {
+            n, err := stderr.Read(buf)
+            if n > 0 {
+                chunk := string(buf[:n])
+                if strings.Contains(chunk, "ERROR") || strings.Contains(chunk, "FATAL") {
+                    lastError = chunk
+                    e.log.Warn("Globals restore stderr", "output", chunk)
+                }
+            }
+            if err != nil {
+                break
+            }
+        }
+    }()
+
+    // Wait for command with proper context handling
+    cmdDone := make(chan error, 1)
+    go func() {
+        cmdDone <- cmd.Wait()
+    }()
+
+    var cmdErr error
+    select {
+    case cmdErr = <-cmdDone:
+        // Command completed
+    case <-ctx.Done():
+        e.log.Warn("Globals restore cancelled - killing process")
+        cmd.Process.Kill()
+        <-cmdDone
+        cmdErr = ctx.Err()
+    }
+
+    <-stderrDone

-    if err := cmd.Wait(); err != nil {
-        return fmt.Errorf("failed to restore globals: %w (last error: %s)", err, lastError)
+    if cmdErr != nil {
+        return fmt.Errorf("failed to restore globals: %w (last error: %s)", cmdErr, lastError)
     }

     return nil
@@ -1263,7 +1367,8 @@ func (e *Engine) detectLargeObjectsInDumps(dumpsDir string, entries []os.DirEntr
     }

     // Use pg_restore -l to list contents (fast, doesn't restore data)
-    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+    // 2 minutes for large dumps with many objects
+    ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
     defer cancel()
     cmd := exec.CommandContext(ctx, "pg_restore", "-l", dumpFile)
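Several of the shorter fixes are the same change: wrap the external command in context.WithTimeout and exec.CommandContext so a hung or slow pg_restore eventually returns an error instead of blocking forever. A minimal sketch, assuming the 2-minute budget used above (the path in main is hypothetical):

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

func listDumpContents(path string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
	defer cancel() // releases the timer even on the success path

	// CommandContext kills the child automatically once ctx expires.
	out, err := exec.CommandContext(ctx, "pg_restore", "-l", path).CombinedOutput()
	if err != nil {
		return "", fmt.Errorf("pg_restore -l failed: %w (output: %s)", err, out)
	}
	return string(out), nil
}

func main() {
	toc, err := listDumpContents("/tmp/example.dump") // hypothetical dump file
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(toc)
}
```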


@@ -3,6 +3,7 @@ package restore
 import (
     "bufio"
     "compress/gzip"
+    "context"
     "encoding/json"
     "fmt"
     "io"
@@ -20,43 +21,43 @@ import (
 // RestoreErrorReport contains comprehensive information about a restore failure
 type RestoreErrorReport struct {
     // Metadata
     Timestamp time.Time `json:"timestamp"`
     Version string `json:"version"`
     GoVersion string `json:"go_version"`
     OS string `json:"os"`
     Arch string `json:"arch"`

     // Archive info
     ArchivePath string `json:"archive_path"`
     ArchiveSize int64 `json:"archive_size"`
     ArchiveFormat string `json:"archive_format"`

     // Database info
     TargetDB string `json:"target_db"`
     DatabaseType string `json:"database_type"`

     // Error details
     ExitCode int `json:"exit_code"`
     ErrorMessage string `json:"error_message"`
     ErrorType string `json:"error_type"`
     ErrorHint string `json:"error_hint"`
     TotalErrors int `json:"total_errors"`

     // Captured output
     LastStderr []string `json:"last_stderr"`
     FirstErrors []string `json:"first_errors"`

     // Context around failure
     FailureContext *FailureContext `json:"failure_context,omitempty"`

     // Diagnosis results
     DiagnosisResult *DiagnoseResult `json:"diagnosis_result,omitempty"`

     // Environment (sanitized)
     PostgresVersion string `json:"postgres_version,omitempty"`
     PgRestoreVersion string `json:"pg_restore_version,omitempty"`
     PsqlVersion string `json:"psql_version,omitempty"`

     // Recommendations
     Recommendations []string `json:"recommendations"`
 }
@@ -67,40 +68,40 @@ type FailureContext struct {
     FailedLine int `json:"failed_line,omitempty"`
     FailedStatement string `json:"failed_statement,omitempty"`
     SurroundingLines []string `json:"surrounding_lines,omitempty"`

     // For COPY block errors
     InCopyBlock bool `json:"in_copy_block,omitempty"`
     CopyTableName string `json:"copy_table_name,omitempty"`
     CopyStartLine int `json:"copy_start_line,omitempty"`
     SampleCopyData []string `json:"sample_copy_data,omitempty"`

     // File position info
     BytePosition int64 `json:"byte_position,omitempty"`
     PercentComplete float64 `json:"percent_complete,omitempty"`
 }

 // ErrorCollector captures detailed error information during restore
 type ErrorCollector struct {
     log logger.Logger
     cfg *config.Config

     archivePath string
     targetDB string
     format ArchiveFormat

     // Captured data
     stderrLines []string
     firstErrors []string
     lastErrors []string
     totalErrors int
     exitCode int

     // Limits
     maxStderrLines int
     maxErrorCapture int

     // State
     startTime time.Time
     enabled bool
 }

 // NewErrorCollector creates a new error collector
@@ -126,30 +127,30 @@ func (ec *ErrorCollector) CaptureStderr(chunk string) {
     if !ec.enabled {
         return
     }

     lines := strings.Split(chunk, "\n")
     for _, line := range lines {
         line = strings.TrimSpace(line)
         if line == "" {
             continue
         }

         // Store last N lines of stderr
         if len(ec.stderrLines) >= ec.maxStderrLines {
             // Shift array, drop oldest
             ec.stderrLines = ec.stderrLines[1:]
         }
         ec.stderrLines = append(ec.stderrLines, line)

         // Check if this is an error line
         if isErrorLine(line) {
             ec.totalErrors++

             // Capture first N errors
             if len(ec.firstErrors) < ec.maxErrorCapture {
                 ec.firstErrors = append(ec.firstErrors, line)
             }

             // Keep last N errors (ring buffer style)
             if len(ec.lastErrors) >= ec.maxErrorCapture {
                 ec.lastErrors = ec.lastErrors[1:]
@@ -184,36 +185,36 @@ func (ec *ErrorCollector) GenerateReport(errMessage string, errType string, errH
         LastStderr: ec.stderrLines,
         FirstErrors: ec.firstErrors,
     }

     // Get archive size
     if stat, err := os.Stat(ec.archivePath); err == nil {
         report.ArchiveSize = stat.Size()
     }

     // Get tool versions
     report.PostgresVersion = getCommandVersion("postgres", "--version")
     report.PgRestoreVersion = getCommandVersion("pg_restore", "--version")
     report.PsqlVersion = getCommandVersion("psql", "--version")

     // Analyze failure context
     report.FailureContext = ec.analyzeFailureContext()

     // Run diagnosis if not already done
     diagnoser := NewDiagnoser(ec.log, false)
     if diagResult, err := diagnoser.DiagnoseFile(ec.archivePath); err == nil {
         report.DiagnosisResult = diagResult
     }

     // Generate recommendations
     report.Recommendations = ec.generateRecommendations(report)

     return report
 }

 // analyzeFailureContext extracts context around the failure
 func (ec *ErrorCollector) analyzeFailureContext() *FailureContext {
     ctx := &FailureContext{}

     // Look for line number in errors
     for _, errLine := range ec.lastErrors {
         if lineNum := extractLineNumber(errLine); lineNum > 0 {
@@ -221,7 +222,7 @@ func (ec *ErrorCollector) analyzeFailureContext() *FailureContext {
             break
         }
     }

     // Look for COPY-related errors
     for _, errLine := range ec.lastErrors {
         if strings.Contains(errLine, "COPY") || strings.Contains(errLine, "syntax error") {
@@ -233,12 +234,12 @@ func (ec *ErrorCollector) analyzeFailureContext() *FailureContext {
             break
         }
     }

     // If we have a line number, try to get surrounding context from the dump
     if ctx.FailedLine > 0 && ec.archivePath != "" {
         ctx.SurroundingLines = ec.getSurroundingLines(ctx.FailedLine, 5)
     }

     return ctx
 }
@@ -246,13 +247,13 @@ func (ec *ErrorCollector) analyzeFailureContext() *FailureContext {
 func (ec *ErrorCollector) getSurroundingLines(lineNum int, context int) []string {
     var reader io.Reader
     var lines []string

     file, err := os.Open(ec.archivePath)
     if err != nil {
         return nil
     }
     defer file.Close()

     // Handle compressed files
     if strings.HasSuffix(ec.archivePath, ".gz") {
         gz, err := gzip.NewReader(file)
@@ -264,19 +265,19 @@ func (ec *ErrorCollector) getSurroundingLines(lineNum int, context int) []string
     } else {
         reader = file
     }

     scanner := bufio.NewScanner(reader)
     buf := make([]byte, 0, 1024*1024)
     scanner.Buffer(buf, 10*1024*1024)

     currentLine := 0
     startLine := lineNum - context
     endLine := lineNum + context

     if startLine < 1 {
         startLine = 1
     }

     for scanner.Scan() {
         currentLine++
         if currentLine >= startLine && currentLine <= endLine {
@@ -290,18 +291,18 @@ func (ec *ErrorCollector) getSurroundingLines(lineNum int, context int) []string
             break
         }
     }

     return lines
 }
 // generateRecommendations provides actionable recommendations based on the error
 func (ec *ErrorCollector) generateRecommendations(report *RestoreErrorReport) []string {
     var recs []string

     // Check diagnosis results
     if report.DiagnosisResult != nil {
         if report.DiagnosisResult.IsTruncated {
             recs = append(recs,
                 "CRITICAL: Backup file is truncated/incomplete",
                 "Action: Re-run the backup for the affected database",
                 "Check: Verify disk space was available during backup",
@@ -317,14 +318,14 @@ func (ec *ErrorCollector) generateRecommendations(report *RestoreErrorReport) []
         }

         if report.DiagnosisResult.Details != nil && report.DiagnosisResult.Details.UnterminatedCopy {
             recs = append(recs,
                 fmt.Sprintf("ISSUE: COPY block for table '%s' was not terminated",
                     report.DiagnosisResult.Details.LastCopyTable),
                 "Cause: Backup was interrupted during data export",
                 "Action: Re-run backup ensuring it completes fully",
             )
         }
     }

     // Check error patterns
     if report.TotalErrors > 1000000 {
         recs = append(recs,
@@ -333,7 +334,7 @@ func (ec *ErrorCollector) generateRecommendations(report *RestoreErrorReport) []
             "Check: Verify dump format matches restore command",
         )
     }

     // Check for common error types
     errLower := strings.ToLower(report.ErrorMessage)

     if strings.Contains(errLower, "syntax error") {
@@ -343,7 +344,7 @@ func (ec *ErrorCollector) generateRecommendations(report *RestoreErrorReport) []
             "Check: Run 'dbbackup restore diagnose <archive>' for detailed analysis",
         )
     }

     if strings.Contains(errLower, "permission denied") {
         recs = append(recs,
             "ISSUE: Permission denied",
@@ -351,7 +352,7 @@ func (ec *ErrorCollector) generateRecommendations(report *RestoreErrorReport) []
             "Action: For ownership preservation, use a superuser account",
         )
     }

     if strings.Contains(errLower, "does not exist") {
         recs = append(recs,
             "ISSUE: Missing object reference",
@@ -359,7 +360,7 @@ func (ec *ErrorCollector) generateRecommendations(report *RestoreErrorReport) []
             "Action: Check if target database was created",
         )
     }

     if len(recs) == 0 {
         recs = append(recs,
             "Run 'dbbackup restore diagnose <archive>' for detailed analysis",
@@ -367,7 +368,7 @@ func (ec *ErrorCollector) generateRecommendations(report *RestoreErrorReport) []
             "Review the PostgreSQL/MySQL logs on the target server",
         )
     }

     return recs
 }
@@ -378,18 +379,18 @@ func (ec *ErrorCollector) SaveReport(report *RestoreErrorReport, outputPath stri
     if err := os.MkdirAll(dir, 0755); err != nil {
         return fmt.Errorf("failed to create directory: %w", err)
     }

     // Marshal to JSON with indentation
     data, err := json.MarshalIndent(report, "", " ")
     if err != nil {
         return fmt.Errorf("failed to marshal report: %w", err)
     }

     // Write file
     if err := os.WriteFile(outputPath, data, 0644); err != nil {
         return fmt.Errorf("failed to write report: %w", err)
     }

     return nil
 }
@@ -399,35 +400,35 @@ func (ec *ErrorCollector) PrintReport(report *RestoreErrorReport) {
     fmt.Println(strings.Repeat("═", 70))
     fmt.Println(" 🔴 RESTORE ERROR REPORT")
     fmt.Println(strings.Repeat("═", 70))

     fmt.Printf("\n📅 Timestamp: %s\n", report.Timestamp.Format("2006-01-02 15:04:05"))
     fmt.Printf("📦 Archive: %s\n", filepath.Base(report.ArchivePath))
     fmt.Printf("📊 Format: %s\n", report.ArchiveFormat)
     fmt.Printf("🎯 Target DB: %s\n", report.TargetDB)
     fmt.Printf("⚠️ Exit Code: %d\n", report.ExitCode)
     fmt.Printf("❌ Total Errors: %d\n", report.TotalErrors)

     fmt.Println("\n" + strings.Repeat("─", 70))
     fmt.Println("ERROR DETAILS:")
     fmt.Println(strings.Repeat("─", 70))
     fmt.Printf("\nType: %s\n", report.ErrorType)
     fmt.Printf("Message: %s\n", report.ErrorMessage)
     if report.ErrorHint != "" {
         fmt.Printf("Hint: %s\n", report.ErrorHint)
     }

     // Show failure context
     if report.FailureContext != nil && report.FailureContext.FailedLine > 0 {
         fmt.Println("\n" + strings.Repeat("─", 70))
         fmt.Println("FAILURE CONTEXT:")
         fmt.Println(strings.Repeat("─", 70))
         fmt.Printf("\nFailed at line: %d\n", report.FailureContext.FailedLine)
         if report.FailureContext.InCopyBlock {
             fmt.Printf("Inside COPY block for table: %s\n", report.FailureContext.CopyTableName)
         }
         if len(report.FailureContext.SurroundingLines) > 0 {
             fmt.Println("\nSurrounding lines:")
             for _, line := range report.FailureContext.SurroundingLines {
@@ -435,13 +436,13 @@ func (ec *ErrorCollector) PrintReport(report *RestoreErrorReport) {
             }
         }
     }

     // Show first few errors
     if len(report.FirstErrors) > 0 {
         fmt.Println("\n" + strings.Repeat("─", 70))
         fmt.Println("FIRST ERRORS:")
         fmt.Println(strings.Repeat("─", 70))
         for i, err := range report.FirstErrors {
             if i >= 5 {
                 fmt.Printf("... and %d more\n", len(report.FirstErrors)-5)
@@ -450,13 +451,13 @@ func (ec *ErrorCollector) PrintReport(report *RestoreErrorReport) {
             fmt.Printf(" %d. %s\n", i+1, truncateString(err, 100))
         }
     }

     // Show diagnosis summary
     if report.DiagnosisResult != nil && !report.DiagnosisResult.IsValid {
         fmt.Println("\n" + strings.Repeat("─", 70))
         fmt.Println("DIAGNOSIS:")
         fmt.Println(strings.Repeat("─", 70))
         if report.DiagnosisResult.IsTruncated {
             fmt.Println(" ❌ File is TRUNCATED")
         }
@@ -470,21 +471,21 @@ func (ec *ErrorCollector) PrintReport(report *RestoreErrorReport) {
             fmt.Printf(" • %s\n", err)
         }
     }

     // Show recommendations
     fmt.Println("\n" + strings.Repeat("─", 70))
     fmt.Println("💡 RECOMMENDATIONS:")
     fmt.Println(strings.Repeat("─", 70))
     for _, rec := range report.Recommendations {
         fmt.Printf(" • %s\n", rec)
     }

     // Show tool versions
     fmt.Println("\n" + strings.Repeat("─", 70))
     fmt.Println("ENVIRONMENT:")
     fmt.Println(strings.Repeat("─", 70))
     fmt.Printf(" OS: %s/%s\n", report.OS, report.Arch)
     fmt.Printf(" Go: %s\n", report.GoVersion)
     if report.PgRestoreVersion != "" {
@@ -493,15 +494,15 @@ func (ec *ErrorCollector) PrintReport(report *RestoreErrorReport) {
     if report.PsqlVersion != "" {
         fmt.Printf(" psql: %s\n", report.PsqlVersion)
     }

     fmt.Println(strings.Repeat("═", 70))
 }

 // Helper functions

 func isErrorLine(line string) bool {
     return strings.Contains(line, "ERROR:") ||
         strings.Contains(line, "FATAL:") ||
         strings.Contains(line, "error:") ||
         strings.Contains(line, "PANIC:")
 }
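The collector above bounds memory by keeping only the first N and last N error lines while still counting everything it sees. A tiny sketch of that first/last capture, with made-up limits and input:

```go
package main

import (
	"fmt"
	"strings"
)

type capped struct {
	max   int
	first []string
	last  []string
	total int
}

func (c *capped) add(line string) {
	if !strings.Contains(line, "ERROR:") {
		return
	}
	c.total++
	if len(c.first) < c.max {
		c.first = append(c.first, line)
	}
	if len(c.last) >= c.max {
		c.last = c.last[1:] // drop oldest, ring-buffer style
	}
	c.last = append(c.last, line)
}

func main() {
	c := &capped{max: 2}
	for i := 1; i <= 5; i++ {
		c.add(fmt.Sprintf("ERROR: problem %d", i))
	}
	// total=5, first keeps problems 1-2, last keeps problems 4-5
	fmt.Println(c.total, c.first, c.last)
}
```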
@@ -556,7 +557,11 @@ func getDatabaseType(format ArchiveFormat) string {
 }

 func getCommandVersion(cmd string, arg string) string {
-    output, err := exec.Command(cmd, arg).CombinedOutput()
+    // Use timeout to prevent blocking if command hangs
+    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+    defer cancel()
+
+    output, err := exec.CommandContext(ctx, cmd, arg).CombinedOutput()
     if err != nil {
         return ""
     }


@@ -6,6 +6,7 @@ import (
     "os/exec"
     "regexp"
     "strconv"
+    "time"

     "dbbackup/internal/database"
 )
@@ -47,8 +48,13 @@ func ParsePostgreSQLVersion(versionStr string) (*VersionInfo, error) {
 // GetDumpFileVersion extracts the PostgreSQL version from a dump file
 // Uses pg_restore -l to read the dump metadata
+// Uses a 30-second timeout to avoid blocking on large files
 func GetDumpFileVersion(dumpPath string) (*VersionInfo, error) {
-    cmd := exec.Command("pg_restore", "-l", dumpPath)
+    // Use a timeout context to prevent blocking on very large dump files
+    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+    defer cancel()
+
+    cmd := exec.CommandContext(ctx, "pg_restore", "-l", dumpPath)
     output, err := cmd.CombinedOutput()
     if err != nil {
         return nil, fmt.Errorf("failed to read dump file metadata: %w (output: %s)", err, string(output))


@@ -83,10 +83,10 @@ type backupCompleteMsg struct {
 func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config, log logger.Logger, backupType, dbName string, ratio int) tea.Cmd {
     return func() tea.Msg {
-        // Use configurable cluster timeout (minutes) from config; default set in config.New()
-        // Use parent context to inherit cancellation from TUI
-        clusterTimeout := time.Duration(cfg.ClusterTimeoutMinutes) * time.Minute
-        ctx, cancel := context.WithTimeout(parentCtx, clusterTimeout)
+        // NO TIMEOUT for backup operations - a backup takes as long as it takes
+        // Large databases can take many hours
+        // Only manual cancellation (Ctrl+C) should stop the backup
+        ctx, cancel := context.WithCancel(parentCtx)
         defer cancel()

         start := time.Now()
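For the TUI-driven backup itself the change goes the other way: the fixed deadline is dropped and the operation runs under context.WithCancel, so only explicit cancellation (e.g. Ctrl+C) stops it. A small sketch contrasting the two behaviours (durations and the fake work loop are invented):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

func work(ctx context.Context, label string) {
	select {
	case <-time.After(3 * time.Second): // stand-in for a long backup
		fmt.Println(label, "finished")
	case <-ctx.Done():
		fmt.Println(label, "aborted:", ctx.Err())
	}
}

func main() {
	// Old behaviour: a fixed deadline can kill a healthy long-running backup.
	timed, cancelTimed := context.WithTimeout(context.Background(), 1*time.Second)
	defer cancelTimed()
	work(timed, "with timeout")

	// New behaviour: runs until done unless the parent explicitly cancels.
	open, cancelOpen := context.WithCancel(context.Background())
	defer cancelOpen()
	work(open, "with cancel")
}
```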


@@ -53,7 +53,8 @@ type databaseListMsg struct {
 func fetchDatabases(cfg *config.Config, log logger.Logger) tea.Cmd {
     return func() tea.Msg {
-        ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
+        // 60 seconds for database listing - busy servers may be slow
+        ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
         defer cancel()

         dbClient, err := database.New(cfg, log)


@@ -111,10 +111,10 @@ type restoreCompleteMsg struct {
 func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string, cleanFirst, createIfMissing bool, restoreType string, cleanClusterFirst bool, existingDBs []string, saveDebugLog bool) tea.Cmd {
     return func() tea.Msg {
-        // Use configurable cluster timeout (minutes) from config; default set in config.New()
-        // Use parent context to inherit cancellation from TUI
-        restoreTimeout := time.Duration(cfg.ClusterTimeoutMinutes) * time.Minute
-        ctx, cancel := context.WithTimeout(parentCtx, restoreTimeout)
+        // NO TIMEOUT for restore operations - a restore takes as long as it takes
+        // Large databases with large objects can take many hours
+        // Only manual cancellation (Ctrl+C) should stop the restore
+        ctx, cancel := context.WithCancel(parentCtx)
         defer cancel()

         start := time.Now()
@@ -138,8 +138,8 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
         // This matches how cluster restore works - uses CLI tools, not database connections
         droppedCount := 0
         for _, dbName := range existingDBs {
-            // Create timeout context for each database drop (30 seconds per DB)
-            dropCtx, dropCancel := context.WithTimeout(ctx, 30*time.Second)
+            // Create timeout context for each database drop (5 minutes per DB - large DBs take time)
+            dropCtx, dropCancel := context.WithTimeout(ctx, 5*time.Minute)
             if err := dropDatabaseCLI(dropCtx, cfg, dbName); err != nil {
                 log.Warn("Failed to drop database", "name", dbName, "error", err)
                 // Continue with other databases


@@ -106,7 +106,8 @@ type safetyCheckCompleteMsg struct {
 func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string) tea.Cmd {
     return func() tea.Msg {
-        ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
+        // 10 minutes for safety checks - large archives can take a long time to diagnose
+        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
         defer cancel()

         safety := restore.NewSafety(cfg, log)
@@ -444,7 +445,7 @@ func (m RestorePreviewModel) View() string {
     // Advanced Options
     s.WriteString(archiveHeaderStyle.Render("⚙️ Advanced Options"))
     s.WriteString("\n")

     // Work directory option
     workDirIcon := "✗"
     workDirStyle := infoStyle
@@ -460,7 +461,7 @@ func (m RestorePreviewModel) View() string {
         s.WriteString(infoStyle.Render(" ⚠️ Large archives need more space than /tmp may have"))
         s.WriteString("\n")
     }

     // Debug log option
     debugIcon := "✗"
     debugStyle := infoStyle


@@ -70,7 +70,8 @@ type statusMsg struct {
 func fetchStatus(cfg *config.Config, log logger.Logger) tea.Cmd {
     return func() tea.Msg {
-        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+        // 30 seconds for status check - slow networks or SSL negotiation
+        ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
         defer cancel()

         dbClient, err := database.New(cfg, log)


@@ -16,7 +16,7 @@ import (
 // Build information (set by ldflags)
 var (
-    version   = "3.42.1"
+    version   = "3.42.9"
     buildTime = "unknown"
     gitCommit = "unknown"
 )