Compare commits

3 Commits

| Author | SHA1 | Date |
|---|---|---|
| | 222bdbef58 | |
| | f7e9fa64f0 | |
| | f153e61dbf | |

PITR.md (+94)
@@ -584,6 +584,100 @@ Document your recovery procedure:
9. Create new base backup

```
```
## Large Database Support (600+ GB)

For databases larger than 600 GB, PITR is the **recommended approach** over full dump/restore.

### Why PITR Works Better for Large DBs

| Approach | 600 GB Database | Recovery Time (RTO) |
|----------|-----------------|---------------------|
| Full pg_dump/restore | Hours to dump, hours to restore | 4-12+ hours |
| PITR (base + WAL) | Incremental WAL only | 30 min - 2 hours |
### Setup for Large Databases

**1. Enable WAL archiving with compression:**

```bash
dbbackup pitr enable --archive-dir /backups/wal_archive --compress
```

**2. Take ONE base backup weekly/monthly (use pg_basebackup):**

```bash
# For 600+ GB, use fast checkpoint to minimize impact.
# Note: -D takes a target directory; with -Ft -z the tarball
# (base.tar.gz) is written inside it.
pg_basebackup -D /backups/base_$(date +%Y%m%d) \
  -Ft -z -P --checkpoint=fast --wal-method=none

# Duration: 2-6 hours for 600 GB, but only needed weekly/monthly
```

**3. WAL files archive continuously** (~1-5 GB/hour typical), capturing every change.
**4. Recover to any point in time:**

```bash
dbbackup restore pitr \
  --base-backup /backups/base_20260101.tar.gz \
  --wal-archive /backups/wal_archive \
  --target-time "2026-01-13 14:30:00" \
  --target-dir /var/lib/postgresql/16/restored
```
### PostgreSQL Optimizations for 600+ GB

| Setting | Where | Purpose |
|---------|-------|---------|
| `wal_compression = on` | postgresql.conf | 70-80% smaller WAL files |
| `max_wal_size = 4GB` | postgresql.conf | Reduce checkpoint frequency |
| `checkpoint_timeout = 30min` | postgresql.conf | Less frequent checkpoints |
| `archive_timeout = 300` | postgresql.conf | Force archive every 5 min |
### Recovery Optimizations

| Optimization | How | Benefit |
|--------------|-----|---------|
| Parallel recovery | PostgreSQL 15+ automatic | 2-4x faster WAL replay |
| NVMe/SSD for WAL | Hardware | 3-10x faster recovery |
| Separate WAL disk | Dedicated mount | Avoid I/O contention |
| `recovery_prefetch = on` | PostgreSQL 15+ | Faster page reads |
### Storage Planning

| Component | Size Estimate | Retention |
|-----------|---------------|-----------|
| Base backup | ~200-400 GB compressed | 1-2 copies |
| WAL per day | 5-50 GB (depends on writes) | 7-14 days |
| Total archive | 100-400 GB WAL + base | - |
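These estimates follow from simple multiplication; a quick sketch using mid-range assumptions (the input sizes are illustrative, not measurements from any particular system):

```go
package main

import "fmt"

func main() {
	// Illustrative inputs for a 600 GB database (assumptions)
	baseBackupGB := 300.0 // compressed base backup, ~50% of 600 GB
	walPerDayGB := 20.0   // mid-range of the 5-50 GB/day estimate
	retentionDays := 14.0 // WAL retention window
	baseCopies := 2.0     // keep 1-2 base backup copies

	walTotalGB := walPerDayGB * retentionDays
	totalGB := baseBackupGB*baseCopies + walTotalGB

	fmt.Printf("WAL archive: %.0f GB\n", walTotalGB)          // within the 100-400 GB range above
	fmt.Printf("Total archive storage: %.0f GB\n", totalGB)
}
```

Plugging in your own measured WAL rate (from `du` on the archive directory over a day) gives a much tighter bound than the table's generic range.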
### RTO Estimates for Large Databases

| Database Size | Base Extraction | WAL Replay (1 week) | Total RTO |
|---------------|-----------------|---------------------|-----------|
| 200 GB | 15-30 min | 15-30 min | 30-60 min |
| 600 GB | 45-90 min | 30-60 min | 1-2.5 hours |
| 1 TB | 60-120 min | 45-90 min | 2-3.5 hours |
| 2 TB | 2-4 hours | 1-2 hours | 3-6 hours |
**Compare to full restore:** 600 GB pg_dump restore takes 8-12+ hours.
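The Total RTO column is just extraction plus replay; a sketch of that arithmetic using the table's own mid-range figures (treated here as assumptions):

```go
package main

import "fmt"

func main() {
	// Mid-range component times for 600 GB, in minutes (assumptions from the table)
	baseExtractionMin := (45 + 90) / 2 // base backup extraction
	walReplayMin := (30 + 60) / 2      // one week of WAL replay
	totalMin := baseExtractionMin + walReplayMin

	fmt.Printf("Estimated mid-range RTO for 600 GB: %d min (%.1f h)\n",
		totalMin, float64(totalMin)/60)
}
```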
### Best Practices for 600+ GB

1. **Weekly base backups** - Monthly if storage is tight
2. **Test recovery monthly** - Verify WAL chain integrity
3. **Monitor WAL lag** - Alert if archive falls behind
4. **Use streaming replication** - For HA, combine with PITR for DR
5. **Separate archive storage** - Don't fill up the DB disk
```bash
# Quick health check for large DB PITR setup
dbbackup pitr status --verbose

# Expected output:
# Base Backup: 2026-01-06 (7 days old) - OK
# WAL Archive: 847 files, 52 GB
# Recovery Window: 2026-01-06 to 2026-01-13 (7 days)
# Estimated RTO: ~90 minutes
```
## Performance Considerations

### WAL Archive Size
@@ -4,8 +4,8 @@ This directory contains pre-compiled binaries for the DB Backup Tool across mult
## Build Information
- **Version**: 3.42.10
- **Build Time**: 2026-01-12_08:50:35_UTC → 2026-01-13_07:23:20_UTC
- **Git Commit**: b1f8c6d → f153e61

## Recent Updates (v1.1.0)
- ✅ Fixed TUI progress display with line-by-line output
@@ -414,24 +414,121 @@ func (d *Diagnoser) diagnoseSQLScript(filePath string, compressed bool, result *
 // diagnoseClusterArchive analyzes a cluster tar.gz archive
 func (d *Diagnoser) diagnoseClusterArchive(filePath string, result *DiagnoseResult) {
-    // First verify tar.gz integrity with timeout
-    // 5 minutes for large archives (multi-GB archives need more time)
-    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
+    // Calculate dynamic timeout based on file size
+    // Large archives (100GB+) can take significant time to list
+    // Minimum 5 minutes, scales with file size, max 180 minutes for very large archives
+    timeoutMinutes := 5
+    if result.FileSize > 0 {
+        // 1 minute per 2 GB, minimum 5 minutes, max 180 minutes
+        sizeGB := result.FileSize / (1024 * 1024 * 1024)
+        estimatedMinutes := int(sizeGB/2) + 5
+        if estimatedMinutes > timeoutMinutes {
+            timeoutMinutes = estimatedMinutes
+        }
+        if timeoutMinutes > 180 {
+            timeoutMinutes = 180
+        }
+    }
+
+    d.log.Info("Verifying cluster archive integrity",
+        "size", fmt.Sprintf("%.1f GB", float64(result.FileSize)/(1024*1024*1024)),
+        "timeout", fmt.Sprintf("%d min", timeoutMinutes))
+
+    ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
     defer cancel()
+
+    // Use streaming approach with pipes to avoid memory issues with large archives
     cmd := exec.CommandContext(ctx, "tar", "-tzf", filePath)
-    output, err := cmd.Output()
-    if err != nil {
+    stdout, pipeErr := cmd.StdoutPipe()
+    if pipeErr != nil {
+        // Pipe creation failed - not a corruption issue
+        result.Warnings = append(result.Warnings,
+            fmt.Sprintf("Cannot create pipe for verification: %v", pipeErr),
+            "Archive integrity cannot be verified but may still be valid")
+        return
+    }
+
+    var stderrBuf bytes.Buffer
+    cmd.Stderr = &stderrBuf
+
+    if startErr := cmd.Start(); startErr != nil {
+        result.Warnings = append(result.Warnings,
+            fmt.Sprintf("Cannot start tar verification: %v", startErr),
+            "Archive integrity cannot be verified but may still be valid")
+        return
+    }
+
+    // Stream output line by line to avoid buffering entire listing in memory
+    scanner := bufio.NewScanner(stdout)
+    scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024) // Allow long paths
+
+    var files []string
+    fileCount := 0
+    for scanner.Scan() {
+        fileCount++
+        line := scanner.Text()
+        // Only store dump/metadata files, not every file
+        if strings.HasSuffix(line, ".dump") || strings.HasSuffix(line, ".sql.gz") ||
+            strings.HasSuffix(line, ".sql") || strings.HasSuffix(line, ".json") ||
+            strings.Contains(line, "globals") || strings.Contains(line, "manifest") ||
+            strings.Contains(line, "metadata") {
+            files = append(files, line)
+        }
+    }
+
+    scanErr := scanner.Err()
+    waitErr := cmd.Wait()
+    stderrOutput := stderrBuf.String()
+
+    // Handle errors - distinguish between actual corruption and resource/timeout issues
+    if waitErr != nil || scanErr != nil {
+        // Check if it was a timeout
+        if ctx.Err() == context.DeadlineExceeded {
+            result.Warnings = append(result.Warnings,
+                fmt.Sprintf("Verification timed out after %d minutes - archive is very large", timeoutMinutes),
+                "This does not necessarily mean the archive is corrupted",
+                "Manual verification: tar -tzf "+filePath+" | wc -l")
+            // Don't mark as corrupted or invalid on timeout - archive may be fine
+            if fileCount > 0 {
+                result.Details.TableCount = len(files)
+                result.Details.TableList = files
+            }
+            return
+        }
+
+        // Check for specific gzip/tar corruption indicators
+        if strings.Contains(stderrOutput, "unexpected end of file") ||
+            strings.Contains(stderrOutput, "Unexpected EOF") ||
+            strings.Contains(stderrOutput, "gzip: stdin: unexpected end of file") ||
+            strings.Contains(stderrOutput, "not in gzip format") ||
+            strings.Contains(stderrOutput, "invalid compressed data") {
+            // These indicate actual corruption
             result.IsValid = false
             result.IsCorrupted = true
             result.Errors = append(result.Errors,
-                fmt.Sprintf("Tar archive is invalid or corrupted: %v", err),
+                "Tar archive appears truncated or corrupted",
+                fmt.Sprintf("Error: %s", truncateString(stderrOutput, 200)),
                 "Run: tar -tzf "+filePath+" 2>&1 | tail -20")
             return
         }

-        // Parse tar listing
-        files := strings.Split(strings.TrimSpace(string(output)), "\n")
+        // Other errors (signal killed, memory, etc.) - not necessarily corruption
+        // If we read some files successfully, the archive structure is likely OK
+        if fileCount > 0 {
+            result.Warnings = append(result.Warnings,
+                fmt.Sprintf("Verification incomplete (read %d files before error)", fileCount),
+                "Archive may still be valid - error could be due to system resources")
+            // Proceed with what we got
+        } else {
+            // Couldn't read anything - but don't mark as corrupted without clear evidence
+            result.Warnings = append(result.Warnings,
+                fmt.Sprintf("Cannot verify archive: %v", waitErr),
+                "Archive integrity is uncertain - proceed with caution or verify manually")
+            return
+        }
+    }
+
+    // Parse the collected file list
     var dumpFiles []string
     hasGlobals := false
     hasMetadata := false
@@ -497,9 +594,22 @@ func (d *Diagnoser) diagnoseUnknown(filePath string, result *DiagnoseResult) {
 // verifyWithPgRestore uses pg_restore --list to verify dump integrity
 func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult) {
-    // Use timeout to prevent blocking on very large dump files
-    // 5 minutes for large dumps (multi-GB dumps with many tables)
-    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
+    // Calculate dynamic timeout based on file size
+    // pg_restore --list is usually faster than tar -tzf for same size
+    timeoutMinutes := 5
+    if result.FileSize > 0 {
+        // 1 minute per 5 GB, minimum 5 minutes, max 30 minutes
+        sizeGB := result.FileSize / (1024 * 1024 * 1024)
+        estimatedMinutes := int(sizeGB/5) + 5
+        if estimatedMinutes > timeoutMinutes {
+            timeoutMinutes = estimatedMinutes
+        }
+        if timeoutMinutes > 30 {
+            timeoutMinutes = 30
+        }
+    }
+
+    ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
     defer cancel()

     cmd := exec.CommandContext(ctx, "pg_restore", "--list", filePath)
@@ -554,14 +664,72 @@ func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult)
 // DiagnoseClusterDumps extracts and diagnoses all dumps in a cluster archive
 func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*DiagnoseResult, error) {
-    // First, try to list archive contents without extracting (fast check)
-    // 10 minutes for very large archives
-    listCtx, listCancel := context.WithTimeout(context.Background(), 10*time.Minute)
+    // Get archive size for dynamic timeout calculation
+    archiveInfo, err := os.Stat(archivePath)
+    if err != nil {
+        return nil, fmt.Errorf("cannot stat archive: %w", err)
+    }
+
+    // Dynamic timeout based on archive size: base 10 min + 1 min per 3 GB
+    // Large archives like 100+ GB need more time for tar -tzf
+    timeoutMinutes := 10
+    if archiveInfo.Size() > 0 {
+        sizeGB := archiveInfo.Size() / (1024 * 1024 * 1024)
+        estimatedMinutes := int(sizeGB/3) + 10
+        if estimatedMinutes > timeoutMinutes {
+            timeoutMinutes = estimatedMinutes
+        }
+        if timeoutMinutes > 120 { // Max 2 hours
+            timeoutMinutes = 120
+        }
+    }
+
+    d.log.Info("Listing cluster archive contents",
+        "size", fmt.Sprintf("%.1f GB", float64(archiveInfo.Size())/(1024*1024*1024)),
+        "timeout", fmt.Sprintf("%d min", timeoutMinutes))
+
+    listCtx, listCancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
     defer listCancel()

     listCmd := exec.CommandContext(listCtx, "tar", "-tzf", archivePath)
-    listOutput, listErr := listCmd.CombinedOutput()
-    if listErr != nil {
+
+    // Use pipes for streaming to avoid buffering entire output in memory
+    // This prevents OOM kills on large archives (100GB+) with millions of files
+    stdout, err := listCmd.StdoutPipe()
+    if err != nil {
+        return nil, fmt.Errorf("failed to create stdout pipe: %w", err)
+    }
+
+    var stderrBuf bytes.Buffer
+    listCmd.Stderr = &stderrBuf
+
+    if err := listCmd.Start(); err != nil {
+        return nil, fmt.Errorf("failed to start tar listing: %w", err)
+    }
+
+    // Stream the output line by line, only keeping relevant files
+    var files []string
+    scanner := bufio.NewScanner(stdout)
+    // Set a reasonable max line length (file paths shouldn't exceed this)
+    scanner.Buffer(make([]byte, 0, 4096), 1024*1024)
+
+    fileCount := 0
+    for scanner.Scan() {
+        fileCount++
+        line := scanner.Text()
+        // Only store dump files and important files, not every single file
+        if strings.HasSuffix(line, ".dump") || strings.HasSuffix(line, ".sql") ||
+            strings.HasSuffix(line, ".sql.gz") || strings.HasSuffix(line, ".json") ||
+            strings.Contains(line, "globals") || strings.Contains(line, "manifest") ||
+            strings.Contains(line, "metadata") || strings.HasSuffix(line, "/") {
+            files = append(files, line)
+        }
+    }
+
+    scanErr := scanner.Err()
+    listErr := listCmd.Wait()
+
+    if listErr != nil || scanErr != nil {
         // Archive listing failed - likely corrupted
         errResult := &DiagnoseResult{
             FilePath: archivePath,
@@ -573,7 +741,12 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
             Details: &DiagnoseDetails{},
         }

-        errOutput := string(listOutput)
+        errOutput := stderrBuf.String()
+        actualErr := listErr
+        if scanErr != nil {
+            actualErr = scanErr
+        }
+
         if strings.Contains(errOutput, "unexpected end of file") ||
             strings.Contains(errOutput, "Unexpected EOF") ||
             strings.Contains(errOutput, "truncated") {
@@ -585,7 +758,7 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
                 "Solution: Re-create the backup from source database")
         } else {
             errResult.Errors = append(errResult.Errors,
-                fmt.Sprintf("Cannot list archive contents: %v", listErr),
+                fmt.Sprintf("Cannot list archive contents: %v", actualErr),
                 fmt.Sprintf("tar error: %s", truncateString(errOutput, 300)),
                 "Run manually: tar -tzf "+archivePath+" 2>&1 | tail -50")
         }
@@ -593,11 +766,10 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
         return []*DiagnoseResult{errResult}, nil
     }

-    // Archive is listable - now check disk space before extraction
-    files := strings.Split(strings.TrimSpace(string(listOutput)), "\n")
+    d.log.Debug("Archive listing streamed successfully", "total_files", fileCount, "relevant_files", len(files))

     // Check if we have enough disk space (estimate 4x archive size needed)
-    archiveInfo, _ := os.Stat(archivePath)
+    // archiveInfo already obtained at function start
     requiredSpace := archiveInfo.Size() * 4

     // Check temp directory space - try to extract metadata first
@@ -229,8 +229,14 @@ func containsSQLKeywords(content string) bool {
 }

 // CheckDiskSpace verifies sufficient disk space for restore
+// Uses the effective work directory (WorkDir if set, otherwise BackupDir) since
+// that's where extraction actually happens for large databases
 func (s *Safety) CheckDiskSpace(archivePath string, multiplier float64) error {
-    return s.CheckDiskSpaceAt(archivePath, s.cfg.BackupDir, multiplier)
+    checkDir := s.cfg.GetEffectiveWorkDir()
+    if checkDir == "" {
+        checkDir = s.cfg.BackupDir
+    }
+    return s.CheckDiskSpaceAt(archivePath, checkDir, multiplier)
 }

 // CheckDiskSpaceAt verifies sufficient disk space at a specific directory
@@ -106,9 +106,23 @@ type safetyCheckCompleteMsg struct {
 func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string) tea.Cmd {
     return func() tea.Msg {
-        // 10 minutes for safety checks - large archives can take a long time to diagnose
-        ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
+        // Dynamic timeout based on archive size for large database support
+        // Base: 10 minutes + 1 minute per 5 GB, max 120 minutes
+        timeoutMinutes := 10
+        if archive.Size > 0 {
+            sizeGB := archive.Size / (1024 * 1024 * 1024)
+            estimatedMinutes := int(sizeGB/5) + 10
+            if estimatedMinutes > timeoutMinutes {
+                timeoutMinutes = estimatedMinutes
+            }
+            if timeoutMinutes > 120 {
+                timeoutMinutes = 120
+            }
+        }
+
+        ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
         defer cancel()
+        _ = ctx // Used by database checks below

         safety := restore.NewSafety(cfg, log)
         checks := []SafetyCheck{}