v3.42.9: Fix all timeout bugs and deadlocks

CRITICAL FIXES:
- Encryption detection false positive (IsBackupEncrypted returned true for ALL files)
- 12 cmd.Wait() deadlocks fixed with channel-based context handling
- TUI timeout bugs: 60s->10min for safety checks, 15s->60s for DB listing
- diagnose.go timeouts: 60s->5min for tar/pg_restore operations
- Panic recovery added to parallel backup/restore goroutines
- Variable shadowing fix in restore/engine.go

These bugs caused pg_dump backups to fail through TUI for months.
fix: restore automatic builds on tag push
2026-01-08 05:56:31 +01:00 · 2026-01-07 20:53:20 +01:00 · 2026-01-07 20:48:01 +01:00 · 2026-01-07 20:41:53 +01:00
29 changed files with 1154 additions and 329 deletions
--- a/.gitea/workflows/ci.yml
+++ b/.gitea/workflows/ci.yml
@@ -63,7 +63,7 @@ jobs:
    name: Build & Release
    runs-on: ubuntu-latest
    needs: [test, lint]
-    if: startsWith(github.ref, 'refs/tags/')
+    if: startsWith(github.ref, 'refs/tags/v')
    container:
      image: golang:1.24-bookworm
    steps:
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,158 @@ All notable changes to dbbackup will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ## [3.42.9] - 2026-01-08 "Diagnose Timeout Fix"
 ### Fixed - diagnose.go Timeout Bugs
 **More short timeouts that caused large archive failures:**
 - `diagnoseClusterArchive()`: tar listing 60s → **5 minutes**
 - `verifyWithPgRestore()`: pg_restore --list 60s → **5 minutes**
 - `DiagnoseClusterDumps()`: archive listing 120s → **10 minutes**
 **Impact:** These timeouts caused "context deadline exceeded" errors when
 diagnosing multi-GB backup archives, preventing TUI restore from even starting.
 ## [3.42.8] - 2026-01-08 "TUI Timeout Fix"
 ### Fixed - TUI Timeout Bugs Causing Backup/Restore Failures
 **ROOT CAUSE of 2-3 month TUI backup/restore failures identified and fixed:**
 #### Critical Timeout Fixes:
 - **restore_preview.go**: Safety check timeout increased from 60s → **10 minutes**
  - Large archives (>1GB) take 2+ minutes to diagnose
  - Users saw "context deadline exceeded" before backup even started
 - **dbselector.go**: Database listing timeout increased from 15s → **60 seconds**
  - Busy PostgreSQL servers need more time to respond
 - **status.go**: Status check timeout increased from 10s → **30 seconds**
  - SSL negotiation and slow networks caused failures
 #### Stability Improvements:
 - **Panic recovery** added to parallel goroutines in:
  - `backup/engine.go:BackupCluster()` - cluster backup workers
  - `restore/engine.go:RestoreCluster()` - cluster restore workers
  - Prevents single database panic from crashing entire operation
 #### Bug Fix:
 - **restore/engine.go**: Fixed variable shadowing `err` → `cmdErr` for exit code detection
 ## [3.42.7] - 2026-01-08 "Context Killer Complete"
 ### Fixed - Additional Deadlock Bugs in Restore & Engine
 **All remaining cmd.Wait() deadlock bugs fixed across the codebase:**
 #### internal/restore/engine.go:
 - `executeRestoreWithDecompression()` - gunzip/pigz pipeline restore
 - `extractArchive()` - tar extraction for cluster restore
 - `restoreGlobals()` - pg_dumpall globals restore
 #### internal/backup/engine.go:
 - `createArchive()` - tar/pigz archive creation pipeline
 #### internal/engine/mysqldump.go:
 - `Backup()` - mysqldump backup operation
 - `BackupToWriter()` - streaming mysqldump to writer
 **All 6 functions now use proper channel-based context handling with Process.Kill().**
 ## [3.42.6] - 2026-01-08 "Deadlock Killer"
 ### Fixed - Backup Command Context Handling
 **Critical Bug: pg_dump/mysqldump could hang forever on context cancellation**
 The `executeCommand`, `executeCommandWithProgress`, `executeMySQLWithProgressAndCompression`, 
 and `executeMySQLWithCompression` functions had a race condition where:
 1. A goroutine was spawned to read stderr
 2. `cmd.Wait()` was called directly
 3. If context was cancelled, the process was NOT killed
 4. The goroutine could hang forever waiting for stderr
 **Fix**: All backup execution functions now use proper channel-based context handling:
 ```go
 // Wait for command with context handling
 cmdDone := make(chan error, 1)
 go func() {
    cmdDone <- cmd.Wait()
 }()
 select {
 case cmdErr = <-cmdDone:
    // Command completed
 case <-ctx.Done():
    // Context cancelled - kill process
    cmd.Process.Kill()
    <-cmdDone
    cmdErr = ctx.Err()
 }
 ```
 **Affected Functions:**
 - `executeCommand()` - pg_dump for cluster backup
 - `executeCommandWithProgress()` - pg_dump for single backup with progress
 - `executeMySQLWithProgressAndCompression()` - mysqldump pipeline
 - `executeMySQLWithCompression()` - mysqldump pipeline
 **This fixes:** Backup operations hanging indefinitely when cancelled or timing out.
 ## [3.42.5] - 2026-01-08 "False Positive Fix"
 ### Fixed - Encryption Detection Bug
 **IsBackupEncrypted False Positive:**
 - **BUG FIX**: `IsBackupEncrypted()` returned `true` for ALL files, blocking normal restores
 - Root cause: Fallback logic checked if first 12 bytes (nonce size) could be read - always true
 - Fix: Now properly detects known unencrypted formats by magic bytes:
  - Gzip: `1f 8b`
  - PostgreSQL custom: `PGDMP`
  - Plain SQL: starts with `--`, `SET`, `CREATE`
 - Returns `false` if no metadata present and format is recognized as unencrypted
 - Affected file: `internal/backup/encryption.go`
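The magic-byte checks listed above can be sketched like this. It is a hedged illustration, not the implementation in `internal/backup/encryption.go`: `detectFormat` and its return labels are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// detectFormat classifies a file header using the prefixes named in the
// changelog entry. Anything unrecognized is reported as "unknown".
func detectFormat(header []byte) string {
	switch {
	case bytes.HasPrefix(header, []byte{0x1f, 0x8b}):
		return "gzip"
	case bytes.HasPrefix(header, []byte("PGDMP")):
		return "pg_custom"
	case bytes.HasPrefix(header, []byte("--")),
		bytes.HasPrefix(header, []byte("SET")),
		bytes.HasPrefix(header, []byte("CREATE")):
		return "plain_sql"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(detectFormat([]byte{0x1f, 0x8b, 0x08, 0x00}))
	fmt.Println(detectFormat([]byte("PGDMP\x01")))
	fmt.Println(detectFormat([]byte("-- PostgreSQL database dump")))
	fmt.Println(detectFormat([]byte{0x9a, 0x03, 0x7f, 0x11}))
}
```

An "unknown" result does not prove a file is encrypted. It only means none of the known plaintext signatures matched.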
 ## [3.42.4] - 2026-01-08 "The Long Haul"
 ### Fixed - Critical Restore Timeout Bug
 **Removed Arbitrary Timeouts from Backup/Restore Operations:**
 - **CRITICAL FIX**: Removed 4-hour timeout that was killing large database restores
 - PostgreSQL cluster restores of 69GB+ databases no longer fail with "context deadline exceeded"
 - All backup/restore operations now use `context.WithCancel` instead of `context.WithTimeout`
 - Operations run until completion or manual cancellation (Ctrl+C)
 **Affected Files:**
 - `internal/tui/restore_exec.go`: Changed from 4-hour timeout to context.WithCancel
 - `internal/tui/backup_exec.go`: Changed from 4-hour timeout to context.WithCancel  
 - `internal/backup/engine.go`: Removed per-database timeout in cluster backup
 - `cmd/restore.go`: CLI restore commands use context.WithCancel
 **exec.Command Context Audit:**
 - Fixed `exec.Command` without Context in `internal/restore/engine.go:730`
 - Added proper context handling to all external command calls
 - Added timeouts only for quick diagnostic/version checks (not restore path):
  - `restore/version_check.go`: 30s timeout for pg_restore --version check only
  - `restore/error_report.go`: 10s timeout for tool version detection
  - `restore/diagnose.go`: 60s timeout for diagnostic functions
  - `pitr/binlog.go`: 10s timeout for mysqlbinlog --version check
  - `cleanup/processes.go`: 5s timeout for process listing
  - `auth/helper.go`: 30s timeout for auth helper commands
 **Verification:**
 - 54 total `exec.CommandContext` calls verified in backup/restore/pitr path
 - 0 `exec.Command` without Context in critical restore path
 - All 14 PostgreSQL exec calls use CommandContext (pg_dump, pg_restore, psql)
 - All 15 MySQL/MariaDB exec calls use CommandContext (mysqldump, mysql, mysqlbinlog)
 - All 14 test packages pass
 ### Technical Details
 - Large Object (BLOB/BYTEA) restores are particularly affected by timeouts
 - 69GB database with large objects can take 5+ hours to restore
 - Previous 4-hour hard timeout was causing consistent failures
 - Now: No timeout - runs until complete or user cancels
 ## [3.42.1] - 2026-01-07 "Resistance is Futile"
 ### Added - Content-Defined Chunking Deduplication
--- a/EMOTICON_REMOVAL_PLAN.md
+++ b/EMOTICON_REMOVAL_PLAN.md
@@ -0,0 +1,295 @@
 # Emoticon Removal Plan for Python Code
 ## ⚠️ CRITICAL: Code Must Remain Functional After Removal
 This document outlines a **safe, systematic approach** to removing emoticons from Python code without breaking functionality.
 ---
 ## 1. Identification Phase
 ### 1.1 Where Emoticons CAN Safely Exist (Safe to Remove)
 | Location | Risk Level | Action |
 |----------|------------|--------|
 | Comments (`# 🎉 Success!`) | ✅ SAFE | Remove or replace with text |
 | Docstrings (`"""📌 Note:..."""`) | ✅ SAFE | Remove or replace with text |
 | Print statements for decoration (`print("✅ Done!")`) | ⚠️ LOW | Replace with ASCII or text |
 | Logging messages (`logger.info("🔥 Starting...")`) | ⚠️ LOW | Replace with text equivalent |
 ### 1.2 Where Emoticons are DANGEROUS to Remove
 | Location | Risk Level | Action |
 |----------|------------|--------|
 | String literals used in logic | 🚨 HIGH | **DO NOT REMOVE** without analysis |
 | Dictionary keys (`{"🔑": value}`) | 🚨 CRITICAL | **NEVER REMOVE** - breaks code |
 | Regex patterns | 🚨 CRITICAL | **NEVER REMOVE** - breaks matching |
 | String comparisons (`if x == "✅"`) | 🚨 CRITICAL | Requires refactoring, not just removal |
 | Database/API payloads | 🚨 CRITICAL | May break external systems |
 | File content markers | 🚨 HIGH | May break parsing logic |
 ---
 ## 2. Pre-Removal Checklist
 ### 2.1 Before ANY Changes
 - [ ] **Full backup** of the codebase
 - [ ] **Run all tests** and record baseline results
 - [ ] **Document all emoticon locations** with grep/search
 - [ ] **Identify emoticon usage patterns** (decorative vs. functional)
 ### 2.2 Discovery Commands
 ```bash
 # Find all files with emoticons (Unicode range for common emojis)
 grep -rn --include="*.py" -P '[\x{1F300}-\x{1F9FF}]' .
 # Find emoticons in strings
 grep -rn --include="*.py" -P '["'"'"'][^"'"'"']*[\x{1F300}-\x{1F9FF}]' .
 # List unique emoticons used
 grep -oP '[\x{1F300}-\x{1F9FF}]' *.py | sort -u
 ```
 ---
 ## 3. Replacement Strategy
 ### 3.1 Semantic Replacement Table
 | Emoticon | Text Replacement | Context |
 |----------|------------------|---------|
 | ✅ | `[OK]` or `[SUCCESS]` | Status indicators |
 | ❌ | `[FAIL]` or `[ERROR]` | Error indicators |
 | ⚠️ | `[WARNING]` | Warning messages |
 | 🔥 | `[HOT]` or `` (remove) | Decorative |
 | 🎉 | `[DONE]` or `` (remove) | Celebration/completion |
 | 📌 | `[NOTE]` | Notes/pinned items |
 | 🚀 | `[START]` or `` (remove) | Launch/start indicators |
 | 💾 | `[SAVE]` | Save operations |
 | 🔑 | `[KEY]` | Key/authentication |
 | 📁 | `[FILE]` | File operations |
 | 🔍 | `[SEARCH]` | Search operations |
 | ⏳ | `[WAIT]` or `[LOADING]` | Progress indicators |
 | 🛑 | `[STOP]` | Stop/halt indicators |
 | ℹ️ | `[INFO]` | Information |
 | 🐛 | `[BUG]` or `[DEBUG]` | Debug messages |
 ### 3.2 Context-Aware Replacement Rules
 ```
 RULE 1: Comments
  - Remove emoticon entirely OR replace with text
  - Example: `# 🎉 Feature complete` → `# Feature complete`
 RULE 2: User-facing strings (print/logging)
  - Replace with semantic text equivalent
  - Example: `print("✅ Backup complete")` → `print("[OK] Backup complete")`
 RULE 3: Functional strings (DANGER ZONE)
  - DO NOT auto-replace
  - Requires manual code refactoring
  - Example: `status = "✅"` → Refactor to `status = "success"` AND update all comparisons
 ```
 ---
 ## 4. Safe Removal Process
 ### Step 1: Audit
 ```python
 # Python script to audit emoticon usage
 import re
 import ast
 EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001F9FF"  # Symbols & Pictographs
    "\U00002600-\U000026FF"  # Misc symbols
    "\U00002700-\U000027BF"  # Dingbats
    "\U0001F600-\U0001F64F"  # Emoticons
    "]+"
 )
 def audit_file(filepath):
    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()
    # Parse AST to understand context
    tree = ast.parse(content)
    findings = []
    for lineno, line in enumerate(content.split('\n'), 1):
        matches = EMOJI_PATTERN.findall(line)
        if matches:
            # Determine context (comment, string, etc.)
            context = classify_context(line, matches)
            findings.append({
                'line': lineno,
                'content': line.strip(),
                'emojis': matches,
                'context': context,
                'risk': assess_risk(context)
            })
    return findings
 def classify_context(line, matches):
    stripped = line.strip()
    if stripped.startswith('#'):
        return 'COMMENT'
    if 'print(' in line or 'logging.' in line or 'logger.' in line:
        return 'OUTPUT'
    if '==' in line or '!=' in line:
        return 'COMPARISON'
    if re.search(r'["\'][^"\']*$', line.split('#')[0]):
        return 'STRING_LITERAL'
    return 'UNKNOWN'
 def assess_risk(context):
    risk_map = {
        'COMMENT': 'LOW',
        'OUTPUT': 'LOW',
        'COMPARISON': 'CRITICAL',
        'STRING_LITERAL': 'HIGH',
        'UNKNOWN': 'HIGH'
    }
    return risk_map.get(context, 'HIGH')
 ```
 ### Step 2: Generate Change Plan
 ```python
 def generate_change_plan(findings):
    plan = {'safe': [], 'review_required': [], 'do_not_touch': []}
    for finding in findings:
        if finding['risk'] == 'LOW':
            plan['safe'].append(finding)
        elif finding['risk'] == 'HIGH':
            plan['review_required'].append(finding)
        else:  # CRITICAL
            plan['do_not_touch'].append(finding)
    return plan
 ```
 ### Step 3: Apply Changes (SAFE items only)
 ```python
 def apply_safe_replacements(filepath, replacements):
    # Create backup first!
    import shutil
    shutil.copy(filepath, filepath + '.backup')
    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()
    for old, new in replacements:
        content = content.replace(old, new)
    with open(filepath, 'w', encoding='utf-8') as f:
        f.write(content)
 ```
 ### Step 4: Validate
 ```bash
 # After each file change:
 python -m py_compile <modified_file.py>  # Syntax check
 pytest <related_tests>                     # Run tests
 ```
 ---
 ## 5. Validation Checklist
 ### After EACH File Modification
 - [ ] File compiles without syntax errors (`python -m py_compile file.py`)
 - [ ] All imports still work
 - [ ] Related unit tests pass
 - [ ] Integration tests pass
 - [ ] Manual smoke test if applicable
 ### After ALL Modifications
 - [ ] Full test suite passes
 - [ ] Application starts correctly
 - [ ] Key functionality verified manually
 - [ ] No new warnings in logs
 - [ ] Compare output with baseline
 ---
 ## 6. Rollback Plan
 ### If Something Breaks
 1. **Immediate**: Restore from `.backup` files
 2. **Git**: `git checkout -- <file>` or `git stash pop`
 3. **Full rollback**: Restore from pre-change backup
 ### Keep Until Verified
 ```bash
 # Backup storage structure
 backups/
 ├── pre_emoticon_removal/
 │   ├── timestamp.tar.gz
 │   └── git_commit_hash.txt
 └── individual_files/
    ├── file1.py.backup
    └── file2.py.backup
 ```
 ---
 ## 7. Implementation Order
 1. **Phase 1**: Comments only (LOWEST risk)
 2. **Phase 2**: Docstrings (LOW risk)
 3. **Phase 3**: Print/logging statements (LOW-MEDIUM risk)
 4. **Phase 4**: Manual review items (HIGH risk) - one by one
 5. **Phase 5**: NEVER touch CRITICAL items without full refactoring
 ---
 ## 8. Example Workflow
 ```bash
 # 1. Create full backup
 git stash && git checkout -b emoticon-removal
 # 2. Run audit script
 python emoticon_audit.py > audit_report.json
 # 3. Review audit report
 cat audit_report.json | jq '.do_not_touch'  # Check critical items
 # 4. Apply safe changes only
 python apply_safe_changes.py --dry-run  # Preview first!
 python apply_safe_changes.py            # Apply
 # 5. Validate after each change
 python -m pytest tests/
 # 6. Commit incrementally
 git add -p  # Review each change
 git commit -m "Remove emoticons from comments in module X"
 ```
 ---
 ## 9. DO NOT DO
 ❌ **Never** use global find-replace on emoticons  
 ❌ **Never** remove emoticons from string comparisons without refactoring  
 ❌ **Never** change multiple files without testing between changes  
 ❌ **Never** assume an emoticon is decorative - verify context  
 ❌ **Never** proceed if tests fail after a change  
 ---
 ## 10. Sign-Off Requirements
 Before merging emoticon removal changes:
 - [ ] All tests pass (100%)
 - [ ] Code review by second developer
 - [ ] Manual testing of affected features
 - [ ] Documented all CRITICAL items left unchanged (with justification)
 - [ ] Backup verified and accessible
 ---
 **Author**: Generated Plan  
 **Date**: 2026-01-07  
 **Status**: PLAN ONLY - No code changes made
--- a/README.md
+++ b/README.md
@@ -143,7 +143,7 @@ Backup Execution
  Backup created: cluster_20251128_092928.tar.gz
  Size: 22.5 GB (compressed)
-  Location: /u01/dba/dumps/
+  Location: /var/backups/postgres/
  Databases: 7
  Checksum: SHA-256 verified
 ```
--- a/bin/README.md
+++ b/bin/README.md
@@ -4,8 +4,8 @@ This directory contains pre-compiled binaries for the DB Backup Tool across mult
 ## Build Information
 - **Version**: 3.42.1
- **Build Time**: 2026-01-07_14:38:01_UTC
+- **Build Time**: 2026-01-08_04:54:46_UTC
- **Git Commit**: 9743d57
+- **Git Commit**: 627061c
 ## Recent Updates (v1.1.0)
 - ✅ Fixed TUI progress display with line-by-line output
--- a/cmd/migrate.go
+++ b/cmd/migrate.go
@@ -203,9 +203,17 @@ func runMigrateCluster(cmd *cobra.Command, args []string) error {
 		migrateTargetUser = migrateSourceUser
 	}
+	// Create source config first to get WorkDir
+	sourceCfg := config.New()
+	sourceCfg.Host = migrateSourceHost
+	sourceCfg.Port = migrateSourcePort
+	sourceCfg.User = migrateSourceUser
+	sourceCfg.Password = migrateSourcePassword
 	workdir := migrateWorkdir
 	if workdir == "" {
-		workdir = filepath.Join(os.TempDir(), "dbbackup-migrate")
+		// Use WorkDir from config if available
+		workdir = filepath.Join(sourceCfg.GetEffectiveWorkDir(), "dbbackup-migrate")
 	}
 	// Create working directory
@@ -213,12 +221,7 @@ func runMigrateCluster(cmd *cobra.Command, args []string) error {
 		return fmt.Errorf("failed to create working directory: %w", err)
 	}
-	// Create source config
-	sourceCfg := config.New()
-	sourceCfg.Host = migrateSourceHost
-	sourceCfg.Port = migrateSourcePort
-	sourceCfg.User = migrateSourceUser
-	sourceCfg.Password = migrateSourcePassword
+	// Update source config with remaining settings
 	sourceCfg.SSLMode = migrateSourceSSLMode
 	sourceCfg.Database = "postgres" // Default connection database
 	sourceCfg.DatabaseType = cfg.DatabaseType
@@ -342,7 +345,8 @@ func runMigrateSingle(cmd *cobra.Command, args []string) error {
 	workdir := migrateWorkdir
 	if workdir == "" {
-		workdir = filepath.Join(os.TempDir(), "dbbackup-migrate")
+		tempCfg := config.New()
+		workdir = filepath.Join(tempCfg.GetEffectiveWorkDir(), "dbbackup-migrate")
 	}
 	// Create working directory
--- a/cmd/restore.go
+++ b/cmd/restore.go
@@ -350,10 +350,11 @@ func runRestoreDiagnose(cmd *cobra.Command, args []string) error {
 	format := restore.DetectArchiveFormat(archivePath)
 	if format.IsClusterBackup() && diagnoseDeep {
-		// Create temp directory for extraction
-		tempDir, err := os.MkdirTemp("", "dbbackup-diagnose-*")
+		// Create temp directory for extraction in configured WorkDir
+		workDir := cfg.GetEffectiveWorkDir()
+		tempDir, err := os.MkdirTemp(workDir, "dbbackup-diagnose-*")
 		if err != nil {
-			return fmt.Errorf("failed to create temp directory: %w", err)
+			return fmt.Errorf("failed to create temp directory in %s: %w", workDir, err)
 		}
 		if !diagnoseKeepTemp {
@@ -830,10 +831,11 @@ func runRestoreCluster(cmd *cobra.Command, args []string) error {
 	if restoreDiagnose {
 		log.Info("🔍 Running pre-restore diagnosis...")
-		// Create temp directory for extraction
-		diagTempDir, err := os.MkdirTemp("", "dbbackup-diagnose-*")
+		// Create temp directory for extraction in configured WorkDir
+		workDir := cfg.GetEffectiveWorkDir()
+		diagTempDir, err := os.MkdirTemp(workDir, "dbbackup-diagnose-*")
 		if err != nil {
-			return fmt.Errorf("failed to create temp directory for diagnosis: %w", err)
+			return fmt.Errorf("failed to create temp directory for diagnosis in %s: %w", workDir, err)
 		}
 		defer os.RemoveAll(diagTempDir)
--- a/internal/auth/helper.go
+++ b/internal/auth/helper.go
@@ -2,12 +2,14 @@ package auth
 import (
 	"bufio"
+	"context"
 	"fmt"
 	"os"
 	"os/exec"
 	"path/filepath"
 	"strconv"
 	"strings"
+	"time"
 	"dbbackup/internal/config"
 )
@@ -69,7 +71,10 @@ func checkPgHbaConf(user string) AuthMethod {
 // findHbaFileViaPostgres asks PostgreSQL for the hba_file location
 func findHbaFileViaPostgres() string {
-	cmd := exec.Command("psql", "-U", "postgres", "-t", "-c", "SHOW hba_file;")
+	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+	defer cancel()
+	cmd := exec.CommandContext(ctx, "psql", "-U", "postgres", "-t", "-c", "SHOW hba_file;")
 	output, err := cmd.Output()
 	if err != nil {
 		return ""
@@ -82,8 +87,11 @@ func parsePgHbaConf(path string, user string) AuthMethod {
 	// Try with sudo if we can't read directly
 	file, err := os.Open(path)
 	if err != nil {
-		// Try with sudo
-		cmd := exec.Command("sudo", "cat", path)
+		// Try with sudo (with timeout)
+		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+		defer cancel()
+		cmd := exec.CommandContext(ctx, "sudo", "cat", path)
 		output, err := cmd.Output()
 		if err != nil {
 			return AuthUnknown
--- a/internal/backup/encryption.go
+++ b/internal/backup/encryption.go
@@ -87,20 +87,46 @@ func IsBackupEncrypted(backupPath string) bool {
 		return meta.Encrypted
 	}
-	// Fallback: check if file starts with encryption nonce
+	// No metadata found - check file format to determine if encrypted
+	// Known unencrypted formats have specific magic bytes:
+	// - Gzip: 1f 8b
+	// - PGDMP (PostgreSQL custom): 50 47 44 4d 50 (PGDMP)
+	// - Plain SQL: starts with text (-- or SET or CREATE)
+	// - Tar: 75 73 74 61 72 (ustar) at offset 257
+	//
+	// If file doesn't match any known format, it MIGHT be encrypted,
+	// but we return false to avoid false positives. User must provide
+	// metadata file or use --encrypt flag explicitly.
 	file, err := os.Open(backupPath)
 	if err != nil {
 		return false
 	}
 	defer file.Close()
-	// Try to read nonce - if it succeeds, likely encrypted
-	nonce := make([]byte, crypto.NonceSize)
-	if n, err := file.Read(nonce); err != nil || n != crypto.NonceSize {
+	header := make([]byte, 6)
+	if n, err := file.Read(header); err != nil || n < 2 {
 		return false
 	}
-	return true
+	// Check for known unencrypted formats
+	// Gzip magic: 1f 8b
+	if header[0] == 0x1f && header[1] == 0x8b {
+		return false // Gzip compressed - not encrypted
+	}
+	// PGDMP magic (PostgreSQL custom format)
+	if len(header) >= 5 && string(header[:5]) == "PGDMP" {
+		return false // PostgreSQL custom dump - not encrypted
+	}
+	// Plain text SQL (starts with --, SET, CREATE, etc.)
+	if header[0] == '-' || header[0] == 'S' || header[0] == 'C' || header[0] == '/' {
+		return false // Plain text SQL - not encrypted
+	}
+	// Without metadata, we cannot reliably determine encryption status
+	// Return false to avoid blocking restores with false positives
+	return false
 }
 // DecryptBackupFile decrypts an encrypted backup file
--- a/internal/backup/engine.go
+++ b/internal/backup/engine.go
@@ -443,6 +443,14 @@ func (e *Engine) BackupCluster(ctx context.Context) error {
 			defer wg.Done()
 			defer func() { <-semaphore }() // Release
+			// Panic recovery - prevent one database failure from crashing entire cluster backup
+			defer func() {
+				if r := recover(); r != nil {
+					e.log.Error("Panic in database backup goroutine", "database", name, "panic", r)
+					atomic.AddInt32(&failCount, 1)
+				}
+			}()
 			// Check for cancellation at start of goroutine
 			select {
 			case <-ctx.Done():
@@ -502,26 +510,10 @@ func (e *Engine) BackupCluster(ctx context.Context) error {
 			cmd := e.db.BuildBackupCommand(name, dumpFile, options)
-			// Calculate timeout based on database size:
-			// - Minimum 2 hours for small databases
-			// - Add 1 hour per 20GB for large databases
-			// - This allows ~69GB database to take up to 5+ hours
-			timeout := 2 * time.Hour
-			if size, err := e.db.GetDatabaseSize(ctx, name); err == nil {
-				sizeGB := size / (1024 * 1024 * 1024)
-				if sizeGB > 20 {
-					extraHours := (sizeGB / 20) + 1
-					timeout = time.Duration(2+extraHours) * time.Hour
-					mu.Lock()
-					e.printf("       Extended timeout: %v (for %dGB database)\n", timeout, sizeGB)
-					mu.Unlock()
-				}
-			}
-			dbCtx, cancel := context.WithTimeout(ctx, timeout)
-			defer cancel()
-			err := e.executeCommand(dbCtx, cmd, dumpFile)
-			cancel()
+			// NO TIMEOUT for individual database backups
+			// Large databases with large objects can take many hours
+			// The parent context handles cancellation if needed
+			err := e.executeCommand(ctx, cmd, dumpFile)
 			if err != nil {
 				e.log.Warn("Failed to backup database", "database", name, "error", err)
@@ -614,12 +606,36 @@ func (e *Engine) executeCommandWithProgress(ctx context.Context, cmdArgs []strin
 		return fmt.Errorf("failed to start command: %w", err)
 	}
-	// Monitor progress via stderr
-	go e.monitorCommandProgress(stderr, tracker)
+	// Monitor progress via stderr in goroutine
+	stderrDone := make(chan struct{})
+	go func() {
+		defer close(stderrDone)
+		e.monitorCommandProgress(stderr, tracker)
+	}()
-	// Wait for command to complete
-	if err := cmd.Wait(); err != nil {
-		return fmt.Errorf("backup command failed: %w", err)
+	// Wait for command to complete with proper context handling
+	cmdDone := make(chan error, 1)
+	go func() {
+		cmdDone <- cmd.Wait()
+	}()
+	var cmdErr error
+	select {
+	case cmdErr = <-cmdDone:
+		// Command completed (success or failure)
+	case <-ctx.Done():
+		// Context cancelled - kill process to unblock
+		e.log.Warn("Backup cancelled - killing process")
+		cmd.Process.Kill()
+		<-cmdDone // Wait for goroutine to finish
+		cmdErr = ctx.Err()
+	}
+	// Wait for stderr reader to finish
+	<-stderrDone
+	if cmdErr != nil {
+		return fmt.Errorf("backup command failed: %w", cmdErr)
 	}
 	return nil
@@ -696,8 +712,12 @@ func (e *Engine) executeMySQLWithProgressAndCompression(ctx context.Context, cmd
 		return fmt.Errorf("failed to get stderr pipe: %w", err)
 	}
-	// Start monitoring progress
-	go e.monitorCommandProgress(stderr, tracker)
+	// Start monitoring progress in goroutine
+	stderrDone := make(chan struct{})
+	go func() {
+		defer close(stderrDone)
+		e.monitorCommandProgress(stderr, tracker)
+	}()
 	// Start both commands
 	if err := gzipCmd.Start(); err != nil {
@@ -705,20 +725,41 @@ func (e *Engine) executeMySQLWithProgressAndCompression(ctx context.Context, cmd
 	}
 	if err := dumpCmd.Start(); err != nil {
 		gzipCmd.Process.Kill()
 		return fmt.Errorf("failed to start mysqldump: %w", err)
 	}
-	// Wait for mysqldump to complete
-	if err := dumpCmd.Wait(); err != nil {
-		return fmt.Errorf("mysqldump failed: %w", err)
+	// Wait for mysqldump with context handling
+	dumpDone := make(chan error, 1)
+	go func() {
+		dumpDone <- dumpCmd.Wait()
+	}()
+	var dumpErr error
+	select {
+	case dumpErr = <-dumpDone:
+		// mysqldump completed
+	case <-ctx.Done():
+		e.log.Warn("Backup cancelled - killing mysqldump")
+		dumpCmd.Process.Kill()
+		gzipCmd.Process.Kill()
+		<-dumpDone
+		return ctx.Err()
 	}
+	// Wait for stderr reader
+	<-stderrDone
 	// Close pipe and wait for gzip
 	pipe.Close()
 	if err := gzipCmd.Wait(); err != nil {
 		return fmt.Errorf("gzip failed: %w", err)
 	}
+	if dumpErr != nil {
+		return fmt.Errorf("mysqldump failed: %w", dumpErr)
+	}
 	return nil
 }
@@ -749,19 +790,45 @@ func (e *Engine) executeMySQLWithCompression(ctx context.Context, cmdArgs []stri
 	gzipCmd.Stdin = stdin
 	gzipCmd.Stdout = outFile
-	// Start both commands
+	// Start gzip first
 	if err := gzipCmd.Start(); err != nil {
 		return fmt.Errorf("failed to start gzip: %w", err)
 	}
-	if err := dumpCmd.Run(); err != nil {
-		return fmt.Errorf("mysqldump failed: %w", err)
+	// Start mysqldump
+	if err := dumpCmd.Start(); err != nil {
+		gzipCmd.Process.Kill()
+		return fmt.Errorf("failed to start mysqldump: %w", err)
 	}
+	// Wait for mysqldump with context handling
+	dumpDone := make(chan error, 1)
+	go func() {
+		dumpDone <- dumpCmd.Wait()
+	}()
+	var dumpErr error
+	select {
+	case dumpErr = <-dumpDone:
+		// mysqldump completed
+	case <-ctx.Done():
+		e.log.Warn("Backup cancelled - killing mysqldump")
+		dumpCmd.Process.Kill()
+		gzipCmd.Process.Kill()
+		<-dumpDone
+		return ctx.Err()
+	}
 	// Close pipe and wait for gzip
 	stdin.Close()
 	if err := gzipCmd.Wait(); err != nil {
 		return fmt.Errorf("gzip failed: %w", err)
 	}
+	if dumpErr != nil {
+		return fmt.Errorf("mysqldump failed: %w", dumpErr)
+	}
 	return nil
 }
@@ -898,15 +965,46 @@ func (e *Engine) createArchive(ctx context.Context, sourceDir, outputFile string
 			goto regularTar
 		}
-		// Wait for tar to finish
-		if err := cmd.Wait(); err != nil {
+		// Wait for tar with proper context handling
+		tarDone := make(chan error, 1)
+		go func() {
+			tarDone <- cmd.Wait()
+		}()
+		var tarErr error
+		select {
+		case tarErr = <-tarDone:
+			// tar completed
+		case <-ctx.Done():
+			e.log.Warn("Archive creation cancelled - killing processes")
+			cmd.Process.Kill()
 			pigzCmd.Process.Kill()
-			return fmt.Errorf("tar failed: %w", err)
+			<-tarDone
+			return ctx.Err()
 		}
-		// Wait for pigz to finish
-		if err := pigzCmd.Wait(); err != nil {
-			return fmt.Errorf("pigz compression failed: %w", err)
+		if tarErr != nil {
+			pigzCmd.Process.Kill()
+			return fmt.Errorf("tar failed: %w", tarErr)
 		}
+		// Wait for pigz with proper context handling
+		pigzDone := make(chan error, 1)
+		go func() {
+			pigzDone <- pigzCmd.Wait()
+		}()
+		var pigzErr error
+		select {
+		case pigzErr = <-pigzDone:
+		case <-ctx.Done():
+			pigzCmd.Process.Kill()
+			<-pigzDone
+			return ctx.Err()
+		}
+		if pigzErr != nil {
+			return fmt.Errorf("pigz compression failed: %w", pigzErr)
+		}
 		return nil
 	}
@@ -1251,8 +1349,10 @@ func (e *Engine) executeCommand(ctx context.Context, cmdArgs []string, outputFil
 		return fmt.Errorf("failed to start backup command: %w", err)
 	}
-	// Stream stderr output (don't buffer it all in memory)
+	// Stream stderr output in goroutine (don't buffer it all in memory)
+	stderrDone := make(chan struct{})
 	go func() {
+		defer close(stderrDone)
 		scanner := bufio.NewScanner(stderr)
 		scanner.Buffer(make([]byte, 64*1024), 1024*1024) // 1MB max line size
 		for scanner.Scan() {
@@ -1263,10 +1363,30 @@ func (e *Engine) executeCommand(ctx context.Context, cmdArgs []string, outputFil
 		}
 	}()
-	// Wait for command to complete
-	if err := cmd.Wait(); err != nil {
-		e.log.Error("Backup command failed", "error", err, "database", filepath.Base(outputFile))
-		return fmt.Errorf("backup command failed: %w", err)
+	// Wait for command to complete with proper context handling
+	cmdDone := make(chan error, 1)
+	go func() {
+		cmdDone <- cmd.Wait()
+	}()
+	var cmdErr error
+	select {
+	case cmdErr = <-cmdDone:
+		// Command completed (success or failure)
+	case <-ctx.Done():
+		// Context cancelled - kill process to unblock
+		e.log.Warn("Backup cancelled - killing pg_dump process")
+		cmd.Process.Kill()
+		<-cmdDone // Wait for goroutine to finish
+		cmdErr = ctx.Err()
+	}
+	// Wait for stderr reader to finish
+	<-stderrDone
+	if cmdErr != nil {
+		e.log.Error("Backup command failed", "error", cmdErr, "database", filepath.Base(outputFile))
+		return fmt.Errorf("backup command failed: %w", cmdErr)
 	}
 	return nil
--- a/internal/cleanup/processes.go
+++ b/internal/cleanup/processes.go
@@ -12,6 +12,7 @@ import (
 	"strings"
 	"sync"
 	"syscall"
+	"time"
 	"dbbackup/internal/logger"
 )
@@ -116,8 +117,11 @@ func KillOrphanedProcesses(log logger.Logger) error {
 // findProcessesByName returns PIDs of processes matching the given name
 func findProcessesByName(name string, excludePID int) ([]int, error) {
-	// Use pgrep for efficient process searching
-	cmd := exec.Command("pgrep", "-x", name)
+	// Use pgrep for efficient process searching with timeout
+	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+	defer cancel()
+	cmd := exec.CommandContext(ctx, "pgrep", "-x", name)
 	output, err := cmd.Output()
 	if err != nil {
 		// Exit code 1 means no processes found (not an error)
--- a/internal/config/config.go
+++ b/internal/config/config.go
@@ -217,14 +217,17 @@ func New() *Config {
 		SingleDBName:  getEnvString("SINGLE_DB_NAME", ""),
 		RestoreDBName: getEnvString("RESTORE_DB_NAME", ""),
-		// Timeouts
+		// Timeouts - default 24 hours (1440 min) to handle very large databases with large objects
-		ClusterTimeoutMinutes: getEnvInt("CLUSTER_TIMEOUT_MIN", 240),
+		ClusterTimeoutMinutes: getEnvInt("CLUSTER_TIMEOUT_MIN", 1440),
 		// Cluster parallelism (default: 2 concurrent operations for faster cluster backup/restore)
 		ClusterParallelism: getEnvInt("CLUSTER_PARALLELISM", 2),
 		// Working directory for large operations (default: system temp)
 		WorkDir: getEnvString("WORK_DIR", ""),
 		// Swap file management
-		SwapFilePath:   getEnvString("SWAP_FILE_PATH", "/tmp/dbbackup_swap"),
+		SwapFilePath:   "",                                // Will be set after WorkDir is initialized
 		SwapFileSizeGB: getEnvInt("SWAP_FILE_SIZE_GB", 0), // 0 = disabled by default
 		AutoSwap:       getEnvBool("AUTO_SWAP", false),
@@ -264,6 +267,13 @@ func New() *Config {
 		cfg.SSLMode = "prefer"
 	}
 	// Set SwapFilePath using WorkDir if not explicitly set via env var
 	if envSwap := os.Getenv("SWAP_FILE_PATH"); envSwap != "" {
 		cfg.SwapFilePath = envSwap
 	} else {
 		cfg.SwapFilePath = filepath.Join(cfg.GetEffectiveWorkDir(), "dbbackup_swap")
 	}
 	return cfg
 }
@@ -499,6 +509,14 @@ func GetCurrentOSUser() string {
 	return getCurrentUser()
 }
 // GetEffectiveWorkDir returns the configured WorkDir or system temp as fallback
 func (c *Config) GetEffectiveWorkDir() string {
 	if c.WorkDir != "" {
 		return c.WorkDir
 	}
 	return os.TempDir()
 }
 func getDefaultBackupDir() string {
 	// Try to create a sensible default backup directory
 	homeDir, _ := os.UserHomeDir()
@@ -516,7 +534,7 @@ func getDefaultBackupDir() string {
 		return "/var/lib/pgsql/pg_backups"
 	}
-	return "/tmp/db_backups"
+	return filepath.Join(os.TempDir(), "db_backups")
 }
 // CPU-related helper functions
--- a/internal/config/persist.go
+++ b/internal/config/persist.go
@@ -28,8 +28,9 @@ type LocalConfig struct {
 	DumpJobs    int
 	// Performance settings
-	CPUWorkload string
+	CPUWorkload    string
-	MaxCores    int
+	MaxCores       int
 	ClusterTimeout int // Cluster operation timeout in minutes (default: 1440 = 24 hours)
 	// Security settings
 	RetentionDays int
@@ -121,6 +122,10 @@ func LoadLocalConfig() (*LocalConfig, error) {
 				if mc, err := strconv.Atoi(value); err == nil {
 					cfg.MaxCores = mc
 				}
 			case "cluster_timeout":
 				if ct, err := strconv.Atoi(value); err == nil {
 					cfg.ClusterTimeout = ct
 				}
 			}
 		case "security":
 			switch key {
@@ -199,6 +204,9 @@ func SaveLocalConfig(cfg *LocalConfig) error {
 	if cfg.MaxCores != 0 {
 		sb.WriteString(fmt.Sprintf("max_cores = %d\n", cfg.MaxCores))
 	}
 	if cfg.ClusterTimeout != 0 {
 		sb.WriteString(fmt.Sprintf("cluster_timeout = %d\n", cfg.ClusterTimeout))
 	}
 	sb.WriteString("\n")
 	// Security section
@@ -268,6 +276,10 @@ func ApplyLocalConfig(cfg *Config, local *LocalConfig) {
 	if local.MaxCores != 0 {
 		cfg.MaxCores = local.MaxCores
 	}
 	// Apply cluster timeout from config file (overrides default)
 	if local.ClusterTimeout != 0 {
 		cfg.ClusterTimeoutMinutes = local.ClusterTimeout
 	}
 	if cfg.RetentionDays == 30 && local.RetentionDays != 0 {
 		cfg.RetentionDays = local.RetentionDays
 	}
@@ -282,21 +294,22 @@ func ApplyLocalConfig(cfg *Config, local *LocalConfig) {
 // ConfigFromConfig creates a LocalConfig from a Config
 func ConfigFromConfig(cfg *Config) *LocalConfig {
 	return &LocalConfig{
-		DBType:        cfg.DatabaseType,
+		DBType:         cfg.DatabaseType,
-		Host:          cfg.Host,
+		Host:           cfg.Host,
-		Port:          cfg.Port,
+		Port:           cfg.Port,
-		User:          cfg.User,
+		User:           cfg.User,
-		Database:      cfg.Database,
+		Database:       cfg.Database,
-		SSLMode:       cfg.SSLMode,
+		SSLMode:        cfg.SSLMode,
-		BackupDir:     cfg.BackupDir,
+		BackupDir:      cfg.BackupDir,
-		WorkDir:       cfg.WorkDir,
+		WorkDir:        cfg.WorkDir,
-		Compression:   cfg.CompressionLevel,
+		Compression:    cfg.CompressionLevel,
-		Jobs:          cfg.Jobs,
+		Jobs:           cfg.Jobs,
-		DumpJobs:      cfg.DumpJobs,
+		DumpJobs:       cfg.DumpJobs,
-		CPUWorkload:   cfg.CPUWorkloadType,
+		CPUWorkload:    cfg.CPUWorkloadType,
-		MaxCores:      cfg.MaxCores,
+		MaxCores:       cfg.MaxCores,
-		RetentionDays: cfg.RetentionDays,
+		ClusterTimeout: cfg.ClusterTimeoutMinutes,
-		MinBackups:    cfg.MinBackups,
+		RetentionDays:  cfg.RetentionDays,
-		MaxRetries:    cfg.MaxRetries,
+		MinBackups:     cfg.MinBackups,
 		MaxRetries:     cfg.MaxRetries,
 	}
 }
--- a/internal/engine/mysqldump.go
+++ b/internal/engine/mysqldump.go
@@ -234,10 +234,26 @@ func (e *MySQLDumpEngine) Backup(ctx context.Context, opts *BackupOptions) (*Bac
 		gzWriter.Close()
 	}
-	// Wait for command
+	// Wait for command with proper context handling
-	if err := cmd.Wait(); err != nil {
+	cmdDone := make(chan error, 1)
 	go func() {
 		cmdDone <- cmd.Wait()
 	}()
 	var cmdErr error
 	select {
 	case cmdErr = <-cmdDone:
 		// Command completed
 	case <-ctx.Done():
 		e.log.Warn("MySQL backup cancelled - killing process")
 		cmd.Process.Kill()
 		<-cmdDone
 		cmdErr = ctx.Err()
 	}
 	if cmdErr != nil {
 		stderr := stderrBuf.String()
-		return nil, fmt.Errorf("mysqldump failed: %w\n%s", err, stderr)
+		return nil, fmt.Errorf("mysqldump failed: %w\n%s", cmdErr, stderr)
 	}
 	// Get file info
@@ -442,8 +458,25 @@ func (e *MySQLDumpEngine) BackupToWriter(ctx context.Context, w io.Writer, opts
 		gzWriter.Close()
 	}
-	if err := cmd.Wait(); err != nil {
+	// Wait for command with proper context handling
-		return nil, fmt.Errorf("mysqldump failed: %w\n%s", err, stderrBuf.String())
+	cmdDone := make(chan error, 1)
 	go func() {
 		cmdDone <- cmd.Wait()
 	}()
 	var cmdErr error
 	select {
 	case cmdErr = <-cmdDone:
 		// Command completed
 	case <-ctx.Done():
 		e.log.Warn("MySQL streaming backup cancelled - killing process")
 		cmd.Process.Kill()
 		<-cmdDone
 		cmdErr = ctx.Err()
 	}
 	if cmdErr != nil {
 		return nil, fmt.Errorf("mysqldump failed: %w\n%s", cmdErr, stderrBuf.String())
 	}
 	return &BackupResult{
--- a/internal/engine/snapshot_engine.go
+++ b/internal/engine/snapshot_engine.go
@@ -188,6 +188,8 @@ func (e *SnapshotEngine) Backup(ctx context.Context, opts *BackupOptions) (*Back
 	// Step 4: Mount snapshot
 	mountPoint := e.config.MountPoint
 	if mountPoint == "" {
 		// Note: snapshot engine uses snapshot.Config, which doesn't have GetEffectiveWorkDir()
 		// TODO: Refactor to use main config.Config for WorkDir support
 		mountPoint = filepath.Join(os.TempDir(), fmt.Sprintf("dbbackup_snap_%s", timestamp))
 	}
--- a/internal/migrate/engine.go
+++ b/internal/migrate/engine.go
@@ -117,7 +117,7 @@ func NewEngine(sourceCfg, targetCfg *config.Config, log logger.Logger) (*Engine,
 		targetDB:    targetDB,
 		log:         log,
 		progress:    progress.NewSpinner(),
-		workDir:     os.TempDir(),
+		workDir:     sourceCfg.GetEffectiveWorkDir(),
 		keepBackup:  false,
 		jobs:        4,
 		dryRun:      false,
--- a/internal/pitr/binlog.go
+++ b/internal/pitr/binlog.go
@@ -212,7 +212,11 @@ func (m *BinlogManager) detectTools() error {
 // detectServerType determines if we're working with MySQL or MariaDB
 func (m *BinlogManager) detectServerType() DatabaseType {
-	cmd := exec.Command(m.mysqlbinlogPath, "--version")
+	// Use timeout to prevent blocking if command hangs
 	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
 	defer cancel()
 	cmd := exec.CommandContext(ctx, m.mysqlbinlogPath, "--version")
 	output, err := cmd.Output()
 	if err != nil {
 		return DatabaseMySQL // Default to MySQL
--- a/internal/restore/cloud_download.go
+++ b/internal/restore/cloud_download.go
@@ -47,9 +47,10 @@ type DownloadResult struct {
 // Download downloads a backup from cloud storage
 func (d *CloudDownloader) Download(ctx context.Context, remotePath string, opts DownloadOptions) (*DownloadResult, error) {
-	// Determine temp directory
+	// Determine temp directory: use opts.TempDir if set, otherwise fall back to system temp
 	tempDir := opts.TempDir
 	if tempDir == "" {
 		// Callers that want the config's WorkDir pass it via opts.TempDir
 		tempDir = os.TempDir()
 	}
--- a/internal/restore/diagnose.go
+++ b/internal/restore/diagnose.go
@@ -4,6 +4,7 @@ import (
 	"bufio"
 	"bytes"
 	"compress/gzip"
 	"context"
 	"encoding/json"
 	"fmt"
 	"io"
@@ -12,6 +13,7 @@ import (
 	"path/filepath"
 	"regexp"
 	"strings"
 	"time"
 	"dbbackup/internal/logger"
 )
@@ -60,9 +62,9 @@ type DiagnoseDetails struct {
 	TableList         []string `json:"table_list,omitempty"`
 	// Compression analysis
-	GzipValid     bool   `json:"gzip_valid,omitempty"`
+	GzipValid        bool    `json:"gzip_valid,omitempty"`
-	GzipError     string `json:"gzip_error,omitempty"`
+	GzipError        string  `json:"gzip_error,omitempty"`
-	ExpandedSize  int64  `json:"expanded_size,omitempty"`
+	ExpandedSize     int64   `json:"expanded_size,omitempty"`
 	CompressionRatio float64 `json:"compression_ratio,omitempty"`
 }
@@ -412,8 +414,12 @@ func (d *Diagnoser) diagnoseSQLScript(filePath string, compressed bool, result *
 // diagnoseClusterArchive analyzes a cluster tar.gz archive
 func (d *Diagnoser) diagnoseClusterArchive(filePath string, result *DiagnoseResult) {
-	// First verify tar.gz integrity
+	// First verify tar.gz integrity with timeout
-	cmd := exec.Command("tar", "-tzf", filePath)
+	// 5 minutes for large archives (multi-GB archives need more time)
 	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
 	defer cancel()
 	cmd := exec.CommandContext(ctx, "tar", "-tzf", filePath)
 	output, err := cmd.Output()
 	if err != nil {
 		result.IsValid = false
@@ -491,7 +497,12 @@ func (d *Diagnoser) diagnoseUnknown(filePath string, result *DiagnoseResult) {
 // verifyWithPgRestore uses pg_restore --list to verify dump integrity
 func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult) {
-	cmd := exec.Command("pg_restore", "--list", filePath)
+	// Use timeout to prevent blocking on very large dump files
 	// 5 minutes for large dumps (multi-GB dumps with many tables)
 	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
 	defer cancel()
 	cmd := exec.CommandContext(ctx, "pg_restore", "--list", filePath)
 	output, err := cmd.CombinedOutput()
 	if err != nil {
@@ -544,7 +555,11 @@ func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult)
 // DiagnoseClusterDumps extracts and diagnoses all dumps in a cluster archive
 func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*DiagnoseResult, error) {
 	// First, try to list archive contents without extracting (fast check)
-	listCmd := exec.Command("tar", "-tzf", archivePath)
+	// 10 minutes for very large archives
 	listCtx, listCancel := context.WithTimeout(context.Background(), 10*time.Minute)
 	defer listCancel()
 	listCmd := exec.CommandContext(listCtx, "tar", "-tzf", archivePath)
 	listOutput, listErr := listCmd.CombinedOutput()
 	if listErr != nil {
 		// Archive listing failed - likely corrupted
@@ -587,15 +602,21 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
 	// Check temp directory space - try to extract metadata first
 	if stat, err := os.Stat(tempDir); err == nil && stat.IsDir() {
-		// Try extraction of a small test file first
+		// Try extraction of a small test file first with timeout
-		testCmd := exec.Command("tar", "-xzf", archivePath, "-C", tempDir, "--wildcards", "*.json", "--wildcards", "globals.sql")
+		testCtx, testCancel := context.WithTimeout(context.Background(), 30*time.Second)
 		testCmd := exec.CommandContext(testCtx, "tar", "-xzf", archivePath, "-C", tempDir, "--wildcards", "*.json", "--wildcards", "globals.sql")
 		testCmd.Run() // Ignore error - just try to extract metadata
 		testCancel()
 	}
 	d.log.Info("Archive listing successful", "files", len(files))
-	// Try full extraction
+	// Try full extraction - large archives can take a long time
-	cmd := exec.Command("tar", "-xzf", archivePath, "-C", tempDir)
+	// Use a generous timeout (30 minutes) for very large archives
 	extractCtx, extractCancel := context.WithTimeout(context.Background(), 30*time.Minute)
 	defer extractCancel()
 	cmd := exec.CommandContext(extractCtx, "tar", "-xzf", archivePath, "-C", tempDir)
 	var stderr bytes.Buffer
 	cmd.Stderr = &stderr
 	if err := cmd.Run(); err != nil {
--- a/internal/restore/engine.go
+++ b/internal/restore/engine.go
@@ -27,7 +27,7 @@ type Engine struct {
 	progress         progress.Indicator
 	detailedReporter *progress.DetailedReporter
 	dryRun           bool
-	debugLogPath     string         // Path to save debug log on error
+	debugLogPath     string          // Path to save debug log on error
 	errorCollector   *ErrorCollector // Collects detailed error info
 }
@@ -357,40 +357,65 @@ func (e *Engine) executeRestoreCommandWithContext(ctx context.Context, cmdArgs [
 		return fmt.Errorf("failed to start restore command: %w", err)
 	}
-	// Read stderr in chunks to log errors without loading all into memory
+	// Read stderr in goroutine to avoid blocking
 	buf := make([]byte, 4096)
 	var lastError string
 	var errorCount int
-	const maxErrors = 10 // Limit captured errors to prevent OOM
+	stderrDone := make(chan struct{})
-	for {
+	go func() {
-		n, err := stderr.Read(buf)
+		defer close(stderrDone)
-		if n > 0 {
+		buf := make([]byte, 4096)
-			chunk := string(buf[:n])
+		const maxErrors = 10 // Limit captured errors to prevent OOM
 		for {
 			n, err := stderr.Read(buf)
 			if n > 0 {
 				chunk := string(buf[:n])
-			// Feed to error collector if enabled
+				// Feed to error collector if enabled
-			if collector != nil {
+				if collector != nil {
-				collector.CaptureStderr(chunk)
+					collector.CaptureStderr(chunk)
 			}
 			// Only capture REAL errors, not verbose output
 			if strings.Contains(chunk, "ERROR:") || strings.Contains(chunk, "FATAL:") || strings.Contains(chunk, "error:") {
 				lastError = strings.TrimSpace(chunk)
 				errorCount++
 				if errorCount <= maxErrors {
 					e.log.Warn("Restore stderr", "output", chunk)
 				}
 				// Only capture REAL errors, not verbose output
 				if strings.Contains(chunk, "ERROR:") || strings.Contains(chunk, "FATAL:") || strings.Contains(chunk, "error:") {
 					lastError = strings.TrimSpace(chunk)
 					errorCount++
 					if errorCount <= maxErrors {
 						e.log.Warn("Restore stderr", "output", chunk)
 					}
 				}
 				// Note: --verbose output is discarded to prevent OOM
 			}
 			if err != nil {
 				break
 			}
 			// Note: --verbose output is discarded to prevent OOM
 		}
 		if err != nil {
 			break
 		}
 	}()
 	// Wait for command with proper context handling
 	cmdDone := make(chan error, 1)
 	go func() {
 		cmdDone <- cmd.Wait()
 	}()
 	var cmdErr error
 	select {
 	case cmdErr = <-cmdDone:
 		// Command completed (success or failure)
 	case <-ctx.Done():
 		// Context cancelled - kill process
 		e.log.Warn("Restore cancelled - killing process")
 		cmd.Process.Kill()
 		<-cmdDone
 		cmdErr = ctx.Err()
 	}
-	if err := cmd.Wait(); err != nil {
+	// Wait for stderr reader to finish
 	<-stderrDone
 	if cmdErr != nil {
 		// Get exit code
 		exitCode := 1
-		if exitErr, ok := err.(*exec.ExitError); ok {
+		if exitErr, ok := cmdErr.(*exec.ExitError); ok {
 			exitCode = exitErr.ExitCode()
 		}
@@ -481,31 +506,56 @@ func (e *Engine) executeRestoreWithDecompression(ctx context.Context, archivePat
 		return fmt.Errorf("failed to start restore command: %w", err)
 	}
-	// Read stderr in chunks to log errors without loading all into memory
+	// Read stderr in goroutine to avoid blocking
 	buf := make([]byte, 4096)
 	var lastError string
 	var errorCount int
-	const maxErrors = 10 // Limit captured errors to prevent OOM
+	stderrDone := make(chan struct{})
-	for {
+	go func() {
-		n, err := stderr.Read(buf)
+		defer close(stderrDone)
-		if n > 0 {
+		buf := make([]byte, 4096)
-			chunk := string(buf[:n])
+		const maxErrors = 10 // Limit captured errors to prevent OOM
-			// Only capture REAL errors, not verbose output
+		for {
-			if strings.Contains(chunk, "ERROR:") || strings.Contains(chunk, "FATAL:") || strings.Contains(chunk, "error:") {
+			n, err := stderr.Read(buf)
-				lastError = strings.TrimSpace(chunk)
+			if n > 0 {
-				errorCount++
+				chunk := string(buf[:n])
-				if errorCount <= maxErrors {
+				// Only capture REAL errors, not verbose output
-					e.log.Warn("Restore stderr", "output", chunk)
+				if strings.Contains(chunk, "ERROR:") || strings.Contains(chunk, "FATAL:") || strings.Contains(chunk, "error:") {
 					lastError = strings.TrimSpace(chunk)
 					errorCount++
 					if errorCount <= maxErrors {
 						e.log.Warn("Restore stderr", "output", chunk)
 					}
 				}
 				// Note: --verbose output is discarded to prevent OOM
 			}
 			if err != nil {
 				break
 			}
 			// Note: --verbose output is discarded to prevent OOM
 		}
 		if err != nil {
 			break
 		}
 	}()
 	// Wait for command with proper context handling
 	cmdDone := make(chan error, 1)
 	go func() {
 		cmdDone <- cmd.Wait()
 	}()
 	var cmdErr error
 	select {
 	case cmdErr = <-cmdDone:
 		// Command completed (success or failure)
 	case <-ctx.Done():
 		// Context cancelled - kill process
 		e.log.Warn("Restore with decompression cancelled - killing process")
 		cmd.Process.Kill()
 		<-cmdDone
 		cmdErr = ctx.Err()
 	}
-	if err := cmd.Wait(); err != nil {
+	// Wait for stderr reader to finish
 	<-stderrDone
 	if cmdErr != nil {
 		// PostgreSQL pg_restore returns exit code 1 even for ignorable errors
 		// Check if errors are ignorable (already exists, duplicate, etc.)
 		if lastError != "" && e.isIgnorableError(lastError) {
@@ -517,18 +567,18 @@ func (e *Engine) executeRestoreWithDecompression(ctx context.Context, archivePat
 		if lastError != "" {
 			classification := checks.ClassifyError(lastError)
 			e.log.Error("Restore with decompression failed",
-				"error", err,
+				"error", cmdErr,
 				"last_stderr", lastError,
 				"error_count", errorCount,
 				"error_type", classification.Type,
 				"hint", classification.Hint,
 				"action", classification.Action)
 			return fmt.Errorf("restore failed: %w (last error: %s, total errors: %d) - %s",
-				err, lastError, errorCount, classification.Hint)
+				cmdErr, lastError, errorCount, classification.Hint)
 		}
-		e.log.Error("Restore with decompression failed", "error", err, "last_stderr", lastError, "error_count", errorCount)
+		e.log.Error("Restore with decompression failed", "error", cmdErr, "last_stderr", lastError, "error_count", errorCount)
-		return fmt.Errorf("restore failed: %w", err)
+		return fmt.Errorf("restore failed: %w", cmdErr)
 	}
 	return nil
@@ -628,11 +678,12 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
 	e.progress.Start(fmt.Sprintf("Restoring cluster from %s", filepath.Base(archivePath)))
-	// Create temporary extraction directory
+	// Create temporary extraction directory in configured WorkDir
-	tempDir := filepath.Join(e.cfg.BackupDir, fmt.Sprintf(".restore_%d", time.Now().Unix()))
+	workDir := e.cfg.GetEffectiveWorkDir()
 	tempDir := filepath.Join(workDir, fmt.Sprintf(".restore_%d", time.Now().Unix()))
 	if err := os.MkdirAll(tempDir, 0755); err != nil {
 		operation.Fail("Failed to create temporary directory")
-		return fmt.Errorf("failed to create temp directory: %w", err)
+		return fmt.Errorf("failed to create temp directory in %s: %w", workDir, err)
 	}
 	defer os.RemoveAll(tempDir)
@@ -726,7 +777,7 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
 			}
 		} else if strings.HasSuffix(dumpFile, ".dump") {
 			// Validate custom format dumps using pg_restore --list
-			cmd := exec.Command("pg_restore", "--list", dumpFile)
+			cmd := exec.CommandContext(ctx, "pg_restore", "--list", dumpFile)
 			output, err := cmd.CombinedOutput()
 			if err != nil {
 				dbName := strings.TrimSuffix(entry.Name(), ".dump")
@@ -811,6 +862,14 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string) error {
 			defer wg.Done()
 			defer func() { <-semaphore }() // Release
 			// Panic recovery - prevent one database failure from crashing entire cluster restore
 			defer func() {
 				if r := recover(); r != nil {
 					e.log.Error("Panic in database restore goroutine", "file", filename, "panic", r)
 					atomic.AddInt32(&failCount, 1)
 				}
 			}()
 			// Update estimator progress (thread-safe)
 			mu.Lock()
 			estimator.UpdateProgress(idx)
@@ -938,16 +997,39 @@ func (e *Engine) extractArchive(ctx context.Context, archivePath, destDir string
 	}
 	// Discard stderr output in chunks to prevent memory buildup
-	buf := make([]byte, 4096)
+	stderrDone := make(chan struct{})
-	for {
+	go func() {
-		_, err := stderr.Read(buf)
+		defer close(stderrDone)
-		if err != nil {
+		buf := make([]byte, 4096)
-			break
+		for {
 			_, err := stderr.Read(buf)
 			if err != nil {
 				break
 			}
 		}
 	}()
 	// Wait for command with proper context handling
 	cmdDone := make(chan error, 1)
 	go func() {
 		cmdDone <- cmd.Wait()
 	}()
 	var cmdErr error
 	select {
 	case cmdErr = <-cmdDone:
 		// Command completed
 	case <-ctx.Done():
 		e.log.Warn("Archive extraction cancelled - killing process")
 		cmd.Process.Kill()
 		<-cmdDone
 		cmdErr = ctx.Err()
 	}
-	if err := cmd.Wait(); err != nil {
+	<-stderrDone
-		return fmt.Errorf("tar extraction failed: %w", err)
+
 	if cmdErr != nil {
 		return fmt.Errorf("tar extraction failed: %w", cmdErr)
 	}
 	return nil
 }
@@ -980,25 +1062,48 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
 		return fmt.Errorf("failed to start psql: %w", err)
 	}
-	// Read stderr in chunks
+	// Read stderr in chunks in a goroutine
 	buf := make([]byte, 4096)
 	var lastError string
-	for {
+	stderrDone := make(chan struct{})
-		n, err := stderr.Read(buf)
+	go func() {
-		if n > 0 {
+		defer close(stderrDone)
-			chunk := string(buf[:n])
+		buf := make([]byte, 4096)
-			if strings.Contains(chunk, "ERROR") || strings.Contains(chunk, "FATAL") {
+		for {
-				lastError = chunk
+			n, err := stderr.Read(buf)
-				e.log.Warn("Globals restore stderr", "output", chunk)
+			if n > 0 {
 				chunk := string(buf[:n])
 				if strings.Contains(chunk, "ERROR") || strings.Contains(chunk, "FATAL") {
 					lastError = chunk
 					e.log.Warn("Globals restore stderr", "output", chunk)
 				}
 			}
 			if err != nil {
 				break
 			}
 		}
-		if err != nil {
+	}()
-			break
+
-		}
+	// Wait for command with proper context handling
 	cmdDone := make(chan error, 1)
 	go func() {
 		cmdDone <- cmd.Wait()
 	}()
 	var cmdErr error
 	select {
 	case cmdErr = <-cmdDone:
 		// Command completed
 	case <-ctx.Done():
 		e.log.Warn("Globals restore cancelled - killing process")
 		cmd.Process.Kill()
 		<-cmdDone
 		cmdErr = ctx.Err()
 	}
-	if err := cmd.Wait(); err != nil {
+	<-stderrDone
-		return fmt.Errorf("failed to restore globals: %w (last error: %s)", err, lastError)
+
 	if cmdErr != nil {
 		return fmt.Errorf("failed to restore globals: %w (last error: %s)", cmdErr, lastError)
 	}
 	return nil
@@ -1262,7 +1367,8 @@ func (e *Engine) detectLargeObjectsInDumps(dumpsDir string, entries []os.DirEntr
 		}
 		// Use pg_restore -l to list contents (fast, doesn't restore data)
-		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+		// 2 minutes for large dumps with many objects
 		ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
 		defer cancel()
 		cmd := exec.CommandContext(ctx, "pg_restore", "-l", dumpFile)
--- a/internal/restore/error_report.go
+++ b/internal/restore/error_report.go
@@ -3,6 +3,7 @@ package restore
 import (
 	"bufio"
 	"compress/gzip"
 	"context"
 	"encoding/json"
 	"fmt"
 	"io"
@@ -20,11 +21,11 @@ import (
 // RestoreErrorReport contains comprehensive information about a restore failure
 type RestoreErrorReport struct {
 	// Metadata
-	Timestamp     time.Time `json:"timestamp"`
+	Timestamp time.Time `json:"timestamp"`
-	Version       string    `json:"version"`
+	Version   string    `json:"version"`
-	GoVersion     string    `json:"go_version"`
+	GoVersion string    `json:"go_version"`
-	OS            string    `json:"os"`
+	OS        string    `json:"os"`
-	Arch          string    `json:"arch"`
+	Arch      string    `json:"arch"`
 	// Archive info
 	ArchivePath   string `json:"archive_path"`
@@ -32,19 +33,19 @@ type RestoreErrorReport struct {
 	ArchiveFormat string `json:"archive_format"`
 	// Database info
-	TargetDB      string `json:"target_db"`
+	TargetDB     string `json:"target_db"`
-	DatabaseType  string `json:"database_type"`
+	DatabaseType string `json:"database_type"`
 	// Error details
-	ExitCode      int      `json:"exit_code"`
+	ExitCode     int    `json:"exit_code"`
-	ErrorMessage  string   `json:"error_message"`
+	ErrorMessage string `json:"error_message"`
-	ErrorType     string   `json:"error_type"`
+	ErrorType    string `json:"error_type"`
-	ErrorHint     string   `json:"error_hint"`
+	ErrorHint    string `json:"error_hint"`
-	TotalErrors   int      `json:"total_errors"`
+	TotalErrors  int    `json:"total_errors"`
 	// Captured output
-	LastStderr    []string `json:"last_stderr"`
+	LastStderr  []string `json:"last_stderr"`
-	FirstErrors   []string `json:"first_errors"`
+	FirstErrors []string `json:"first_errors"`
 	// Context around failure
 	FailureContext *FailureContext `json:"failure_context,omitempty"`
@@ -53,9 +54,9 @@ type RestoreErrorReport struct {
 	DiagnosisResult *DiagnoseResult `json:"diagnosis_result,omitempty"`
 	// Environment (sanitized)
-	PostgresVersion string `json:"postgres_version,omitempty"`
+	PostgresVersion  string `json:"postgres_version,omitempty"`
 	PgRestoreVersion string `json:"pg_restore_version,omitempty"`
-	PsqlVersion     string `json:"psql_version,omitempty"`
+	PsqlVersion      string `json:"psql_version,omitempty"`
 	// Recommendations
 	Recommendations []string `json:"recommendations"`
@@ -69,38 +70,38 @@ type FailureContext struct {
 	SurroundingLines []string `json:"surrounding_lines,omitempty"`
 	// For COPY block errors
-	InCopyBlock      bool   `json:"in_copy_block,omitempty"`
+	InCopyBlock    bool     `json:"in_copy_block,omitempty"`
-	CopyTableName    string `json:"copy_table_name,omitempty"`
+	CopyTableName  string   `json:"copy_table_name,omitempty"`
-	CopyStartLine    int    `json:"copy_start_line,omitempty"`
+	CopyStartLine  int      `json:"copy_start_line,omitempty"`
-	SampleCopyData   []string `json:"sample_copy_data,omitempty"`
+	SampleCopyData []string `json:"sample_copy_data,omitempty"`
 	// File position info
-	BytePosition     int64 `json:"byte_position,omitempty"`
+	BytePosition    int64   `json:"byte_position,omitempty"`
-	PercentComplete  float64 `json:"percent_complete,omitempty"`
+	PercentComplete float64 `json:"percent_complete,omitempty"`
 }
 // ErrorCollector captures detailed error information during restore
 type ErrorCollector struct {
-	log            logger.Logger
+	log         logger.Logger
-	cfg            *config.Config
+	cfg         *config.Config
-	archivePath    string
+	archivePath string
-	targetDB       string
+	targetDB    string
-	format         ArchiveFormat
+	format      ArchiveFormat
 	// Captured data
-	stderrLines    []string
+	stderrLines []string
-	firstErrors    []string
+	firstErrors []string
-	lastErrors     []string
+	lastErrors  []string
-	totalErrors    int
+	totalErrors int
-	exitCode       int
+	exitCode    int
 	// Limits
 	maxStderrLines  int
 	maxErrorCapture int
 	// State
-	startTime      time.Time
+	startTime time.Time
-	enabled        bool
+	enabled   bool
 }
 // NewErrorCollector creates a new error collector
@@ -556,7 +557,11 @@ func getDatabaseType(format ArchiveFormat) string {
 }
 func getCommandVersion(cmd string, arg string) string {
-	output, err := exec.Command(cmd, arg).CombinedOutput()
+	// Use timeout to prevent blocking if command hangs
 	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
 	defer cancel()
 	output, err := exec.CommandContext(ctx, cmd, arg).CombinedOutput()
 	if err != nil {
 		return ""
 	}
--- a/internal/restore/version_check.go
+++ b/internal/restore/version_check.go
@@ -6,6 +6,7 @@ import (
 	"os/exec"
 	"regexp"
 	"strconv"
 	"time"
 	"dbbackup/internal/database"
 )
@@ -47,8 +48,13 @@ func ParsePostgreSQLVersion(versionStr string) (*VersionInfo, error) {
 // GetDumpFileVersion extracts the PostgreSQL version from a dump file
 // Uses pg_restore -l to read the dump metadata
 // Uses a 30-second timeout to avoid blocking on large files
 func GetDumpFileVersion(dumpPath string) (*VersionInfo, error) {
-	cmd := exec.Command("pg_restore", "-l", dumpPath)
+	// Use a timeout context to prevent blocking on very large dump files
 	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
 	defer cancel()
 	cmd := exec.CommandContext(ctx, "pg_restore", "-l", dumpPath)
 	output, err := cmd.CombinedOutput()
 	if err != nil {
 		return nil, fmt.Errorf("failed to read dump file metadata: %w (output: %s)", err, string(output))
--- a/internal/tui/backup_exec.go
+++ b/internal/tui/backup_exec.go
@@ -83,10 +83,10 @@ type backupCompleteMsg struct {
 func executeBackupWithTUIProgress(parentCtx context.Context, cfg *config.Config, log logger.Logger, backupType, dbName string, ratio int) tea.Cmd {
 	return func() tea.Msg {
-		// Use configurable cluster timeout (minutes) from config; default set in config.New()
+		// NO TIMEOUT for backup operations - a backup takes as long as it takes
-		// Use parent context to inherit cancellation from TUI
+		// Large databases can take many hours
-		clusterTimeout := time.Duration(cfg.ClusterTimeoutMinutes) * time.Minute
+		// Only manual cancellation (Ctrl+C) should stop the backup
-		ctx, cancel := context.WithTimeout(parentCtx, clusterTimeout)
+		ctx, cancel := context.WithCancel(parentCtx)
 		defer cancel()
 		start := time.Now()
--- a/internal/tui/dbselector.go
+++ b/internal/tui/dbselector.go
@@ -53,7 +53,8 @@ type databaseListMsg struct {
 func fetchDatabases(cfg *config.Config, log logger.Logger) tea.Cmd {
 	return func() tea.Msg {
-		ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
+		// 60 seconds for database listing - busy servers may be slow
 		ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
 		defer cancel()
 		dbClient, err := database.New(cfg, log)
--- a/internal/tui/restore_exec.go
+++ b/internal/tui/restore_exec.go
@@ -4,6 +4,7 @@ import (
 	"context"
 	"fmt"
 	"os/exec"
 	"path/filepath"
 	"strings"
 	"time"
@@ -110,10 +111,10 @@ type restoreCompleteMsg struct {
 func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string, cleanFirst, createIfMissing bool, restoreType string, cleanClusterFirst bool, existingDBs []string, saveDebugLog bool) tea.Cmd {
 	return func() tea.Msg {
-		// Use configurable cluster timeout (minutes) from config; default set in config.New()
-		// Use parent context to inherit cancellation from TUI
-		restoreTimeout := time.Duration(cfg.ClusterTimeoutMinutes) * time.Minute
-		ctx, cancel := context.WithTimeout(parentCtx, restoreTimeout)
+		// NO TIMEOUT for restore operations - a restore takes as long as it takes
+		// Large databases with large objects can take many hours
+		// Only manual cancellation (Ctrl+C) should stop the restore
+		ctx, cancel := context.WithCancel(parentCtx)
 		defer cancel()
 		start := time.Now()
@@ -137,8 +138,8 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
 			// This matches how cluster restore works - uses CLI tools, not database connections
 			droppedCount := 0
 			for _, dbName := range existingDBs {
-				// Create timeout context for each database drop (30 seconds per DB)
-				dropCtx, dropCancel := context.WithTimeout(ctx, 30*time.Second)
+				// Create timeout context for each database drop (5 minutes per DB - large DBs take time)
+				dropCtx, dropCancel := context.WithTimeout(ctx, 5*time.Minute)
 				if err := dropDatabaseCLI(dropCtx, cfg, dbName); err != nil {
 					log.Warn("Failed to drop database", "name", dbName, "error", err)
 					// Continue with other databases
@@ -157,8 +158,9 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
 		// Enable debug logging if requested
 		if saveDebugLog {
-			// Generate debug log path based on archive name and timestamp
-			debugLogPath := fmt.Sprintf("/tmp/dbbackup-restore-debug-%s.json", time.Now().Format("20060102-150405"))
+			// Generate debug log path using configured WorkDir
+			workDir := cfg.GetEffectiveWorkDir()
+			debugLogPath := filepath.Join(workDir, fmt.Sprintf("dbbackup-restore-debug-%s.json", time.Now().Format("20060102-150405")))
 			engine.SetDebugLogPath(debugLogPath)
 			log.Info("Debug logging enabled", "path", debugLogPath)
 		}
--- a/internal/tui/restore_preview.go
+++ b/internal/tui/restore_preview.go
@@ -106,7 +106,8 @@ type safetyCheckCompleteMsg struct {
 func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string) tea.Cmd {
 	return func() tea.Msg {
-		ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
+		// 10 minutes for safety checks - large archives can take a long time to diagnose
+		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
 		defer cancel()
 		safety := restore.NewSafety(cfg, log)
@@ -471,7 +472,7 @@ func (m RestorePreviewModel) View() string {
 	s.WriteString(debugStyle.Render(fmt.Sprintf("  %s Debug Log: %v (press 'd' to toggle)", debugIcon, m.saveDebugLog)))
 	s.WriteString("\n")
 	if m.saveDebugLog {
-		s.WriteString(infoStyle.Render("    Saves detailed error report to /tmp on failure"))
+		s.WriteString(infoStyle.Render(fmt.Sprintf("    Saves detailed error report to %s on failure", m.config.GetEffectiveWorkDir())))
 		s.WriteString("\n")
 	}
 	s.WriteString("\n")
--- a/internal/tui/settings.go
+++ b/internal/tui/settings.go
@@ -802,7 +802,7 @@ func (m SettingsModel) openDirectoryBrowser() (tea.Model, tea.Cmd) {
 	setting := m.settings[m.cursor]
 	currentValue := setting.Value(m.config)
 	if currentValue == "" {
-		currentValue = "/tmp"
+		currentValue = m.config.GetEffectiveWorkDir()
 	}
 	if m.dirBrowser == nil {
--- a/internal/tui/status.go
+++ b/internal/tui/status.go
@@ -70,7 +70,8 @@ type statusMsg struct {
 func fetchStatus(cfg *config.Config, log logger.Logger) tea.Cmd {
 	return func() tea.Msg {
-		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
+		// 30 seconds for status check - slow networks or SSL negotiation
+		ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
 		defer cancel()
 		dbClient, err := database.New(cfg, log)
--- a/main.go
+++ b/main.go
@@ -16,7 +16,7 @@ import (
 // Build information (set by ldflags)
 var (
-	version   = "3.42.1"
+	version   = "3.42.9"
 	buildTime = "unknown"
 	gitCommit = "unknown"
 )
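
The hunks above combine two context patterns: the long-running backup or restore gets a cancel-only context, while each short sub-step (such as one database drop) gets its own deadline derived from it. The following is a minimal standalone sketch of that combination, not code from this repository; `runStep`, `runAll`, and `demo` are hypothetical names, and the durations are shortened for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runStep is a hypothetical stand-in for one bounded operation, such as
// dropping a single database. It returns early if its context ends first.
func runStep(ctx context.Context, work time.Duration) error {
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runAll gives every step its own deadline while the surrounding operation
// has no deadline at all. It returns how many steps finished in time.
func runAll(parent context.Context, perStep time.Duration, steps []time.Duration) (int, error) {
	// No deadline here: only cancellation of the parent stops the whole run.
	ctx, cancel := context.WithCancel(parent)
	defer cancel()

	finished := 0
	for _, work := range steps {
		stepCtx, stepCancel := context.WithTimeout(ctx, perStep)
		err := runStep(stepCtx, work)
		stepCancel() // release the timer now rather than at function exit

		if errors.Is(err, context.DeadlineExceeded) {
			continue // a slow step is skipped; the remaining steps still run
		}
		if err != nil {
			return finished, err // the parent context was cancelled
		}
		finished++
	}
	return finished, nil
}

// demo runs three steps with a 50ms per-step limit; the middle one is too slow.
func demo() (int, error) {
	steps := []time.Duration{time.Millisecond, 500 * time.Millisecond, time.Millisecond}
	return runAll(context.Background(), 50*time.Millisecond, steps)
}

func main() {
	n, err := demo()
	fmt.Printf("finished %d of 3 steps, err=%v\n", n, err)
}
```

Calling the per-step cancel function inside the loop body, instead of deferring it, keeps timers from accumulating when the loop covers many databases.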
Author	SHA1	Message	Date
Alexander Renz	9c65821250	v3.42.9: Fix all timeout bugs and deadlocks. CRITICAL FIXES: - Encryption detection false positive (IsBackupEncrypted returned true for ALL files) - 12 cmd.Wait() deadlocks fixed with channel-based context handling - TUI timeout bugs: 60s->10min for safety checks, 15s->60s for DB listing - diagnose.go timeouts: 60s->5min for tar/pg_restore operations - Panic recovery added to parallel backup/restore goroutines - Variable shadowing fix in restore/engine.go These bugs caused pg_dump backups to fail through TUI for months.	2026-01-08 05:56:31 +01:00
Alexander Renz	627061cdbb	fix: restore automatic builds on tag push	2026-01-07 20:53:20 +01:00
Alexander Renz	e1a7c57e0f	fix: CI runs only once - on release publish, not on tag push. Removed duplicate CI triggers: - Before: Ran on push to branches AND on tag push (doubled) - After: Runs on push to branches OR when release is published. This prevents wasted CI resources and confusion.	2026-01-07 20:48:01 +01:00
Alexander Renz	22915102d4	CRITICAL FIX: Eliminate all hardcoded /tmp paths - respect WorkDir configuration. This is a critical bugfix release addressing multiple hardcoded temporary directory paths that prevented proper use of the WorkDir configuration option. PROBLEM: Users configuring WorkDir (e.g., /u01/dba/tmp) for systems with small root filesystems still experienced failures because critical operations hardcoded /tmp instead of respecting the configured WorkDir. This made the WorkDir option essentially non-functional. FIXED LOCATIONS: 1. internal/restore/engine.go:632 - CRITICAL: Used BackupDir instead of WorkDir for extraction 2. cmd/restore.go:354,834 - CLI restore/diagnose commands ignored WorkDir 3. cmd/migrate.go:208,347 - Migration commands hardcoded /tmp 4. internal/migrate/engine.go:120 - Migration engine ignored WorkDir 5. internal/config/config.go:224 - SwapFilePath hardcoded /tmp 6. internal/config/config.go:519 - Backup directory fallback hardcoded /tmp 7. internal/tui/restore_exec.go:161 - Debug logs hardcoded /tmp 8. internal/tui/settings.go:805 - Directory browser default hardcoded /tmp 9. internal/tui/restore_preview.go:474 - Display message hardcoded /tmp NEW FEATURES: - Added Config.GetEffectiveWorkDir() helper method - WorkDir now respects WORK_DIR environment variable - All temp operations now consistently use configured WorkDir with /tmp fallback IMPACT: - Restores on systems with small root disks now work properly with WorkDir configured - Admins can control disk space usage for all temporary operations - Debug logs, extraction dirs, swap files all respect WorkDir setting. Version: 3.42.1 (Critical Fix Release)	2026-01-07 20:41:53 +01:00
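
The last commit message describes a `Config.GetEffectiveWorkDir()` helper that honours the configured WorkDir, the `WORK_DIR` environment variable, and a `/tmp` fallback, but its body is not part of this diff. The sketch below shows one way such a lookup could behave. The function name `effectiveWorkDir` is hypothetical, and the precedence order (configured value first, then environment, then `/tmp`) is an assumption rather than something the source confirms.

```go
package main

import (
	"fmt"
	"os"
)

// effectiveWorkDir picks the directory for temporary files.
// Assumed precedence (not confirmed by the diff): explicit configuration,
// then the WORK_DIR environment value, then the /tmp fallback.
func effectiveWorkDir(configured, envValue string) string {
	if configured != "" {
		return configured
	}
	if envValue != "" {
		return envValue
	}
	return "/tmp"
}

func main() {
	// The environment is read by the caller so the helper stays easy to test.
	fmt.Println(effectiveWorkDir("", os.Getenv("WORK_DIR")))
	fmt.Println(effectiveWorkDir("/u01/dba/tmp", os.Getenv("WORK_DIR")))
}
```

Passing the environment value in as an argument, rather than reading it inside the helper, is a design choice of this sketch only; the real method may read it elsewhere, for example when the configuration is loaded.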