Compare commits

5 commits

| Author | SHA1 | Date |
|---|---|---|
| | 88c141467b | |
| | 3d229f4c5e | |
| | da89e18a25 | |
| | 2e7aa9fcdf | |
| | 59812400a4 | |

CHANGELOG.md (+81 lines)
@ -5,6 +5,87 @@ All notable changes to dbbackup will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [5.5.0] - 2026-02-02

### Added

- **🚀 Native Engine Support for Cluster Backup/Restore**
  - NEW: `--native` flag for cluster backup creates SQL format (.sql.gz) using pure Go
  - NEW: `--native` flag for cluster restore uses pure Go engine for .sql.gz files
  - Zero external tool dependencies when using native mode
  - Single-binary deployment now possible without pg_dump/pg_restore installed

- **Native Cluster Backup** (`dbbackup backup cluster --native`)
  - Creates .sql.gz files instead of .dump files
  - Uses pgx wire protocol for data export
  - Parallel gzip compression with pgzip
  - Automatic fallback to pg_dump if `--fallback-tools` is set

- **Native Cluster Restore** (`dbbackup restore cluster --native --confirm`)
  - Restores .sql.gz files using pure Go (pgx CopyFrom)
  - No psql or pg_restore required
  - Automatic detection: uses native for .sql.gz, pg_restore for .dump
  - Fallback support with `--fallback-tools`

### Updated

- **NATIVE_ENGINE_SUMMARY.md** - Complete rewrite with accurate documentation
  - Native engine matrix now shows full cluster support with `--native` flag

### Technical Details

- `internal/backup/engine.go`: Added native engine path in BackupCluster()
- `internal/restore/engine.go`: Added `restoreWithNativeEngine()` function
- `cmd/backup.go`: Added `--native` and `--fallback-tools` flags to cluster command
- `cmd/restore.go`: Added `--native` and `--fallback-tools` flags with PreRunE handlers
- Version bumped to 5.5.0 (new feature release)

## [5.4.6] - 2026-02-02

### Fixed

- **CRITICAL: Progress Tracking for Large Database Restores**
  - Fixed "no progress" issue where the TUI showed 0% for hours during a large single-DB restore
  - Root cause: progress was only updated after a database *completed*, not during the restore
  - Heartbeat now reports estimated progress every 5 seconds (was 15s, text-only)
  - Time-based progress estimation: ~10MB/s throughput assumption
  - Progress capped at 95% until actual completion (prevents jumping to 100% too early)

- **Improved TUI Feedback During Long Restores**
  - Shows spinner + elapsed time when byte-level progress is not available
  - Displays "pg_restore in progress (progress updates every 5s)" message
  - Better visual feedback that the restore is actively running

### Technical Details

- `reportDatabaseProgressByBytes()` now called during restore, not just after completion
- Heartbeat interval reduced from 15s to 5s for more responsive feedback
- TUI gracefully handles `CurrentDBTotal=0` case with activity indicator
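For illustration, a minimal sketch of the time-based heartbeat estimate described above. The throughput constant and function name are assumptions for this example, not the actual dbbackup API.

```go
package progress

import "time"

// assumedThroughput is the rough restore throughput behind the heartbeat
// estimate (assumption taken from the changelog entry above).
const assumedThroughput = 10 * 1024 * 1024 // bytes per second

// EstimateRestoreProgress returns an estimated completion percentage based on
// elapsed time and the expected size of the database being restored.
// It never reports more than 95% so the bar cannot hit 100% early.
func EstimateRestoreProgress(start time.Time, expectedBytes int64) float64 {
	if expectedBytes <= 0 {
		return 0 // no size information: the TUI shows a spinner instead
	}
	estimatedBytes := time.Since(start).Seconds() * assumedThroughput
	pct := estimatedBytes / float64(expectedBytes) * 100
	if pct > 95 {
		pct = 95
	}
	return pct
}
```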
## [5.4.5] - 2026-02-02

### Fixed

- **Accurate Disk Space Estimation for Cluster Archives**
  - Fixed WARNING showing 836GB for a 119GB archive - it was using the wrong compression multiplier
  - Cluster archives (.tar.gz) contain pre-compressed .dump files, so a 1.2x multiplier is now used
  - Single SQL files (.sql.gz) still use a 5x multiplier (was 7x, slightly optimized)
  - New `CheckSystemMemoryWithType(size, isClusterArchive)` method for accurate estimates
  - A 119GB cluster archive now correctly estimates ~143GB instead of ~833GB
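The multiplier logic above can be sketched as follows; the 1.2x and 5x factors come from this entry, while the function name and placement are illustrative.

```go
// estimateUncompressedSize sketches the size-expansion estimate described in
// the 5.4.5 entry. Not the actual dbbackup function.
func estimateUncompressedSize(archiveBytes int64, isClusterArchive bool) int64 {
	if isClusterArchive {
		// a .tar.gz of already-compressed .dump files barely expands on extract
		return int64(float64(archiveBytes) * 1.2)
	}
	// plain SQL text compresses well, so budget ~5x the archive size for restore
	return archiveBytes * 5
}
```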
## [5.4.4] - 2026-02-02

### Fixed

- **TUI Header Separator Fix** - Capped separator length at 40 chars to prevent line overflow on wide terminals

## [5.4.3] - 2026-02-02

### Fixed

- **Bulletproof SIGINT Handling** - Zero zombie processes guaranteed
  - All external commands now use `cleanup.SafeCommand()` with process group isolation
  - `KillCommandGroup()` sends signals to the entire process group (-pgid)
  - No more orphaned pg_restore/pg_dump/psql/pigz processes on Ctrl+C
  - 16 files updated with proper signal handling

- **Eliminated External gzip Process** - The `zgrep` command was spawning `gzip -cdfq`
  - Replaced with in-process pgzip decompression in `preflight.go`
  - `estimateBlobsInSQL()` now uses pure Go pgzip.NewReader
  - Zero external gzip processes during restore
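A sketch of the in-process scan that replaces the `zgrep` pipeline, assuming the klauspost/pgzip package the project already uses; the function name and the matched pattern are illustrative stand-ins for `estimateBlobsInSQL()`.

```go
package preflight

import (
	"bufio"
	"os"
	"strings"

	"github.com/klauspost/pgzip"
)

// countBlobReferences scans a .sql.gz dump entirely in-process, so no
// external gzip is ever spawned. Illustrative sketch only.
func countBlobReferences(path string) (int, error) {
	f, err := os.Open(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	gz, err := pgzip.NewReader(f) // pure Go, parallel decompression
	if err != nil {
		return 0, err
	}
	defer gz.Close()

	count := 0
	sc := bufio.NewScanner(gz)
	sc.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // dumps can have long lines
	for sc.Scan() {
		if strings.Contains(sc.Text(), "lo_create") {
			count++
		}
	}
	return count, sc.Err()
}
```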
## [5.1.22] - 2026-02-01

### Added
@ -1,10 +1,49 @@

# Native Database Engine Implementation Summary

## Mission Accomplished: Zero External Tool Dependencies
## Current Status: Full Native Engine Support (v5.5.0+)

**User Goal:** "FULL - no dependency to the other tools"
**Goal:** Zero dependency on external tools (pg_dump, pg_restore, mysqldump, mysql)

**Result:** **COMPLETE SUCCESS** - dbbackup now operates with **zero external tool dependencies**
**Reality:** Native engine is **NOW AVAILABLE FOR ALL OPERATIONS** when using the `--native` flag!

## Engine Support Matrix

| Operation | Default Mode | With `--native` Flag |
|-----------|-------------|---------------------|
| **Single DB Backup** | ✅ Native Go | ✅ Native Go |
| **Single DB Restore** | ✅ Native Go | ✅ Native Go |
| **Cluster Backup** | pg_dump (custom format) | ✅ **Native Go** (SQL format) |
| **Cluster Restore** | pg_restore | ✅ **Native Go** (for .sql.gz files) |

### NEW: Native Cluster Operations (v5.5.0)

```bash
# Native cluster backup - creates SQL format dumps, no pg_dump needed!
./dbbackup backup cluster --native

# Native cluster restore - restores .sql.gz files with pure Go, no pg_restore!
./dbbackup restore cluster backup.tar.gz --native --confirm
```

### Format Selection

| Format | Created By | Restored By | Size | Speed |
|--------|------------|-------------|------|-------|
| **SQL** (.sql.gz) | Native Go or pg_dump | Native Go or psql | Larger | Medium |
| **Custom** (.dump) | pg_dump -Fc | pg_restore only | Smaller | Fast (parallel) |

### When to Use Native Mode

**Use `--native` when:**
- External tools (pg_dump/pg_restore) are not installed
- Running in minimal containers without the PostgreSQL client
- Building a single statically-linked binary deployment
- Simplifying disaster recovery procedures

**Use default mode when:**
- Maximum backup/restore performance is critical
- You need parallel restore with the `-j` option
- Backup size is a primary concern

## Architecture Overview
@ -27,133 +66,201 @@
|
||||
- Configuration-based engine initialization
|
||||
- Unified backup orchestration across engines
|
||||
|
||||
4. **Advanced Engine Framework** (`internal/engine/native/advanced.go`)
|
||||
- Extensible options for advanced backup features
|
||||
- Support for multiple output formats (SQL, Custom, Directory)
|
||||
- Compression support (Gzip, Zstd, LZ4)
|
||||
- Performance optimization settings
|
||||
|
||||
5. **Restore Engine Framework** (`internal/engine/native/restore.go`)
|
||||
- Basic restore architecture (implementation ready)
|
||||
- Options for transaction control and error handling
|
||||
4. **Restore Engine Framework** (`internal/engine/native/restore.go`)
|
||||
- Parses SQL statements from backup
|
||||
- Uses `CopyFrom` for COPY data
|
||||
- Progress tracking and status reporting
|
||||
|
||||
## Configuration

```bash
# SINGLE DATABASE (native is default for SQL format)
./dbbackup backup single mydb                                 # Uses native engine
./dbbackup restore backup.sql.gz --native                     # Uses native engine

# CLUSTER BACKUP
./dbbackup backup cluster                                     # Default: pg_dump custom format
./dbbackup backup cluster --native                            # NEW: Native Go, SQL format

# CLUSTER RESTORE
./dbbackup restore cluster backup.tar.gz --confirm            # Default: pg_restore
./dbbackup restore cluster backup.tar.gz --native --confirm   # NEW: Native Go for .sql.gz files

# FALLBACK MODE
./dbbackup backup cluster --native --fallback-tools           # Try native, fall back if it fails
```

### Config Defaults

```go
// internal/config/config.go
UseNativeEngine: true,  // Native is default for single DB
FallbackToTools: true,  // Fall back to tools if native fails
```
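A condensed sketch of how these two settings interact. The real logic is threaded through `BackupCluster()` and `restorePostgreSQLSQL()`; the function shape and the `config` import path here are simplifications.

```go
// runWithEngines: try the pure Go path first when UseNativeEngine is set,
// and only shell out to pg_dump/pg_restore/psql if FallbackToTools allows it.
func runWithEngines(cfg *config.Config, nativePath, toolPath func() error) error {
	if cfg.UseNativeEngine {
		err := nativePath()
		if err == nil {
			return nil // pure Go path succeeded, no external tools touched
		}
		if !cfg.FallbackToTools {
			return err // native-only mode: surface the failure
		}
		// otherwise fall through to the external-tools path below
	}
	return toolPath()
}
```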
## When Native Engine is Used
|
||||
|
||||
### ✅ Native Engine for Single DB (Default)
|
||||
|
||||
```bash
|
||||
# Single DB backup to SQL format
|
||||
./dbbackup backup single mydb
|
||||
# → Uses native.PostgreSQLNativeEngine.Backup()
|
||||
# → Pure Go: pgx COPY TO STDOUT
|
||||
|
||||
# Single DB restore from SQL format
|
||||
./dbbackup restore mydb_backup.sql.gz --database=mydb
|
||||
# → Uses native.PostgreSQLRestoreEngine.Restore()
|
||||
# → Pure Go: pgx CopyFrom()
|
||||
```
|
||||
|
||||
### ✅ Native Engine for Cluster (With --native Flag)
|
||||
|
||||
```bash
|
||||
# Cluster backup with native engine
|
||||
./dbbackup backup cluster --native
|
||||
# → For each database: native.PostgreSQLNativeEngine.Backup()
|
||||
# → Creates .sql.gz files (not .dump)
|
||||
# → Pure Go: no pg_dump required!
|
||||
|
||||
# Cluster restore with native engine
|
||||
./dbbackup restore cluster backup.tar.gz --native --confirm
|
||||
# → For each .sql.gz: native.PostgreSQLRestoreEngine.Restore()
|
||||
# → Pure Go: no pg_restore required!
|
||||
```
|
||||
|
||||
### External Tools (Default for Cluster, or Custom Format)
|
||||
|
||||
```bash
|
||||
# Cluster backup (default - uses custom format for efficiency)
|
||||
./dbbackup backup cluster
|
||||
# → Uses pg_dump -Fc for each database
|
||||
# → Reason: Custom format enables parallel restore
|
||||
|
||||
# Cluster restore (default)
|
||||
./dbbackup restore cluster backup.tar.gz --confirm
|
||||
# → Uses pg_restore for .dump files
|
||||
# → Uses native engine for .sql.gz files automatically!
|
||||
|
||||
# Single DB restore from .dump file
|
||||
./dbbackup restore mydb_backup.dump --database=mydb
|
||||
# → Uses pg_restore
|
||||
# → Reason: Custom format binary file
|
||||
```
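The automatic per-file selection during cluster restore can be summarized roughly as below. The helper name and string return type are illustrative, and the exact default for .sql.gz files follows the description above (native when requested, psql otherwise); it assumes `strings` from the standard library.

```go
// chooseRestoreEngine picks the restore path by file extension:
// .sql.gz can go through the native Go engine, .dump always needs pg_restore.
func chooseRestoreEngine(dumpFile string, nativeRequested bool) string {
	switch {
	case strings.HasSuffix(dumpFile, ".sql.gz") && nativeRequested:
		return "native" // pgx CopyFrom, no external tools
	case strings.HasSuffix(dumpFile, ".sql.gz"):
		return "psql" // default SQL restore path
	case strings.HasSuffix(dumpFile, ".dump"):
		return "pg_restore" // custom format requires pg_restore
	default:
		return "unknown"
	}
}
```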
|
||||
|
||||
## Performance Comparison

| Method | Format | Backup Speed | Restore Speed | File Size | External Tools |
|--------|--------|-------------|---------------|-----------|----------------|
| Native Go | SQL.gz | Medium | Medium | Larger | ❌ None |
| pg_dump/restore | Custom | Fast | Fast (parallel) | Smaller | ✅ Required |

### Recommendation

| Scenario | Recommended Mode |
|----------|------------------|
| No PostgreSQL tools installed | `--native` |
| Minimal container deployment | `--native` |
| Maximum performance needed | Default (pg_dump) |
| Large databases (>10GB) | Default with `-j8` |
| Disaster recovery simplicity | `--native` |
## Implementation Details
|
||||
|
||||
### Data Type Handling
|
||||
- **PostgreSQL**: Proper handling of arrays, JSON, timestamps, binary data
|
||||
- **MySQL**: Advanced binary data encoding, proper string escaping, type-specific formatting
|
||||
- **Both**: NULL value handling, numeric precision, date/time formatting
|
||||
### Native Backup Flow
|
||||
|
||||
### Performance Features
|
||||
- Configurable batch processing (1000-10000 rows per batch)
|
||||
- I/O streaming with buffered writers
|
||||
- Memory-efficient row processing
|
||||
- Connection pooling support
|
||||
```
|
||||
User → backupCmd → cfg.UseNativeEngine=true → runNativeBackup()
|
||||
↓
|
||||
native.EngineManager.BackupWithNativeEngine()
|
||||
↓
|
||||
native.PostgreSQLNativeEngine.Backup()
|
||||
↓
|
||||
pgx: COPY table TO STDOUT → SQL file
|
||||
```
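A minimal, self-contained sketch of the export step in that flow, assuming the jackc/pgx/v5 and klauspost/pgzip packages the diff imports. The DSN, table name, and surrounding DDL emission are placeholders; the real engine wraps this per table.

```go
package main

import (
	"context"
	"fmt"
	"os"

	"github.com/jackc/pgx/v5"
	"github.com/klauspost/pgzip"
)

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://postgres@localhost:5432/mydb")
	if err != nil {
		panic(err)
	}
	defer conn.Close(ctx)

	out, err := os.Create("mytable.sql.gz")
	if err != nil {
		panic(err)
	}
	defer out.Close()
	gz := pgzip.NewWriter(out) // parallel gzip, as the cluster backup uses
	defer gz.Close()

	// Stream rows over the wire protocol straight into the compressed file.
	fmt.Fprintln(gz, "COPY public.mytable FROM stdin;")
	if _, err := conn.PgConn().CopyTo(ctx, gz, "COPY public.mytable TO STDOUT"); err != nil {
		panic(err)
	}
	fmt.Fprintln(gz, "\\.") // COPY data terminator in plain SQL dumps
}
```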
|
||||
|
||||
### Output Formats
|
||||
- **SQL Format**: Standard SQL DDL and DML statements
|
||||
- **Custom Format**: (Framework ready for PostgreSQL custom format)
|
||||
- **Directory Format**: (Framework ready for multi-file output)
|
||||
### Native Restore Flow
|
||||
|
||||
### Configuration Integration
|
||||
- Seamless integration with existing dbbackup configuration system
|
||||
- New CLI flags: `--native`, `--fallback-tools`, `--native-debug`
|
||||
- Backward compatibility with all existing options
|
||||
```
|
||||
User → restoreCmd → cfg.UseNativeEngine=true → runNativeRestore()
|
||||
↓
|
||||
native.EngineManager.RestoreWithNativeEngine()
|
||||
↓
|
||||
native.PostgreSQLRestoreEngine.Restore()
|
||||
↓
|
||||
Parse SQL → pgx CopyFrom / Exec → Database
|
||||
```
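The restore side of that flow, reduced to one COPY block. This assumes the dump has already been split into the COPY statement and its data rows (the real engine parses the stream incrementally); it needs `context`, `io`, `strings`, and jackc/pgx/v5.

```go
// restoreCopyBlock replays one COPY block through the wire protocol.
// data must contain the tab-separated rows only, without the trailing "\."
// terminator, since pgconn finishes the copy-in itself.
func restoreCopyBlock(ctx context.Context, conn *pgx.Conn, copyStmt string, data io.Reader) error {
	// copyStmt looks like: COPY public.mytable (id, name) FROM stdin;
	stmt := strings.TrimSuffix(strings.TrimSpace(copyStmt), ";")
	_, err := conn.PgConn().CopyFrom(ctx, data, stmt)
	return err
}
```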
|
||||
|
||||
## Verification Results
|
||||
### Native Cluster Flow (NEW in v5.5.0)
|
||||
|
||||
```
|
||||
User → backup cluster --native
|
||||
↓
|
||||
For each database:
|
||||
native.PostgreSQLNativeEngine.Backup()
|
||||
↓
|
||||
Create .sql.gz file (not .dump)
|
||||
↓
|
||||
Package all .sql.gz into tar.gz archive
|
||||
|
||||
User → restore cluster --native --confirm
|
||||
↓
|
||||
Extract tar.gz → .sql.gz files
|
||||
↓
|
||||
For each .sql.gz:
|
||||
native.PostgreSQLRestoreEngine.Restore()
|
||||
↓
|
||||
Parse SQL → pgx CopyFrom → Database
|
||||
```
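The "package all .sql.gz into tar.gz archive" step looks roughly like this with the standard library. The `dumps/` prefix matches the directory the cluster backup writes into; the exact archive layout and error handling in dbbackup may differ.

```go
package backup

import (
	"archive/tar"
	"compress/gzip"
	"io"
	"os"
	"path/filepath"
)

// packClusterArchive bundles the per-database .sql.gz dumps (and globals.sql)
// into a single cluster archive. Illustrative sketch only.
func packClusterArchive(archivePath string, files []string) error {
	out, err := os.Create(archivePath)
	if err != nil {
		return err
	}
	defer out.Close()

	gz := gzip.NewWriter(out)
	defer gz.Close()
	tw := tar.NewWriter(gz)
	defer tw.Close()

	for _, f := range files {
		info, err := os.Stat(f)
		if err != nil {
			return err
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		hdr.Name = filepath.Join("dumps", filepath.Base(f))
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		src, err := os.Open(f)
		if err != nil {
			return err
		}
		_, err = io.Copy(tw, src)
		src.Close()
		if err != nil {
			return err
		}
	}
	return nil
}
```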
|
||||
|
||||
### External Tools Flow (Default Cluster)
|
||||
|
||||
```
|
||||
User → restoreClusterCmd → engine.RestoreCluster()
|
||||
↓
|
||||
Extract tar.gz → .dump files
|
||||
↓
|
||||
For each .dump:
|
||||
cleanup.SafeCommand("pg_restore", args...)
|
||||
↓
|
||||
PostgreSQL restores data
|
||||
```
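For the external-tools path, each `pg_restore` invocation goes through the process-group-safe helpers added in this change set. A usage sketch, with illustrative pg_restore flags and password handling:

```go
package restore

import (
	"context"
	"os"

	"dbbackup/internal/cleanup"
	"dbbackup/internal/logger"
)

// runPgRestore runs pg_restore via cleanup.SafeCommand so it lives in its own
// process group, and WaitWithContext kills that group if ctx is cancelled.
func runPgRestore(ctx context.Context, log logger.Logger, dumpFile, targetDB string) error {
	cmd := cleanup.SafeCommand(ctx, "pg_restore", "--no-owner", "-d", targetDB, dumpFile)
	cmd.Env = append(os.Environ(), "PGPASSWORD="+os.Getenv("PGPASSWORD"))

	if err := cmd.Start(); err != nil {
		return err
	}
	return cleanup.WaitWithContext(ctx, cmd, log)
}
```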
|
||||
|
||||
## CLI Flags
|
||||
|
||||
### Build Status
|
||||
```bash
|
||||
$ go build -o dbbackup-complete .
|
||||
# Builds successfully with zero warnings
|
||||
--native # Use native engine for backup/restore (works for cluster too!)
|
||||
--fallback-tools # Fall back to external if native fails
|
||||
--native-debug # Enable native engine debug logging
|
||||
```
|
||||
|
||||
### Tool Dependencies
|
||||
```bash
|
||||
$ ./dbbackup-complete version
|
||||
# Database Tools: (none detected)
|
||||
# Confirms zero external tool dependencies
|
||||
```
|
||||
## Future Improvements
|
||||
|
||||
### CLI Integration
|
||||
```bash
|
||||
$ ./dbbackup-complete backup --help | grep native
|
||||
--fallback-tools Fallback to external tools if native engine fails
|
||||
--native Use pure Go native engines (no external tools)
|
||||
--native-debug Enable detailed native engine debugging
|
||||
# All native engine flags available
|
||||
```
|
||||
1. ~~Add SQL format option for cluster backup~~ ✅ **DONE in v5.5.0**
|
||||
|
||||
## Key Achievements
|
||||
2. **Implement custom format parser in Go**
|
||||
- Very complex (PostgreSQL proprietary format)
|
||||
- Would enable native restore of .dump files
|
||||
|
||||
### External Tool Elimination
|
||||
- **Before**: Required `pg_dump`, `mysqldump`, `pg_restore`, `mysql`, etc.
|
||||
- **After**: Zero external dependencies - pure Go implementation
|
||||
3. **Add parallel native restore**
|
||||
- Parse SQL file into table chunks
|
||||
- Restore multiple tables concurrently
|
||||
|
||||
### Protocol-Level Implementation
|
||||
- **PostgreSQL**: Direct pgx connection with PostgreSQL wire protocol
|
||||
- **MySQL**: Direct go-sql-driver with MySQL protocol
|
||||
- **Both**: Native SQL generation without shelling out to external tools
|
||||
## Summary
|
||||
|
||||
### Advanced Features
|
||||
- Proper data type handling for complex types (binary, JSON, arrays)
|
||||
- Configurable batch processing for performance
|
||||
- Support for multiple output formats and compression
|
||||
- Extensible architecture for future enhancements
|
||||
| Feature | Default | With `--native` |
|
||||
|---------|---------|-----------------|
|
||||
| Single DB backup (SQL) | ✅ Native Go | ✅ Native Go |
|
||||
| Single DB restore (SQL) | ✅ Native Go | ✅ Native Go |
|
||||
| Single DB restore (.dump) | pg_restore | pg_restore |
|
||||
| Cluster backup | pg_dump (.dump) | ✅ **Native Go (.sql.gz)** |
|
||||
| Cluster restore (.dump) | pg_restore | pg_restore |
|
||||
| Cluster restore (.sql.gz) | psql | ✅ **Native Go** |
|
||||
| MySQL backup | ✅ Native Go | ✅ Native Go |
|
||||
| MySQL restore | ✅ Native Go | ✅ Native Go |
|
||||
|
||||
### Production Ready Features
|
||||
- Connection management and error handling
|
||||
- Progress tracking and status reporting
|
||||
- Configuration integration
|
||||
- Backward compatibility
|
||||
**Bottom Line:** With `--native` flag, dbbackup can now perform **ALL operations** without external tools, as long as you create native-format backups. This enables single-binary deployment with zero PostgreSQL client dependencies.
|
||||
|
||||
### Code Quality
|
||||
- Clean, maintainable Go code with proper interfaces
|
||||
- Comprehensive error handling
|
||||
- Modular architecture for extensibility
|
||||
- Integration examples and documentation
## Usage Examples
|
||||
|
||||
### Basic Native Backup
|
||||
```bash
|
||||
# PostgreSQL backup with native engine
|
||||
./dbbackup backup --native --host localhost --port 5432 --database mydb
|
||||
|
||||
# MySQL backup with native engine
|
||||
./dbbackup backup --native --host localhost --port 3306 --database myapp
|
||||
```
|
||||
|
||||
### Advanced Configuration
|
||||
```go
|
||||
// PostgreSQL with advanced options
|
||||
psqlEngine, _ := native.NewPostgreSQLAdvancedEngine(config, log)
|
||||
result, _ := psqlEngine.AdvancedBackup(ctx, output, &native.AdvancedBackupOptions{
|
||||
Format: native.FormatSQL,
|
||||
Compression: native.CompressionGzip,
|
||||
BatchSize: 10000,
|
||||
ConsistentSnapshot: true,
|
||||
})
|
||||
```
|
||||
|
||||
## Final Status
|
||||
|
||||
**Mission Status:** **COMPLETE SUCCESS**
|
||||
|
||||
The user's goal of "FULL - no dependency to the other tools" has been **100% achieved**.
|
||||
|
||||
dbbackup now features:
|
||||
- **Zero external tool dependencies**
|
||||
- **Native Go implementations** for both PostgreSQL and MySQL
|
||||
- **Production-ready** data type handling and performance features
|
||||
- **Extensible architecture** for future database engines
|
||||
- **Full CLI integration** with existing dbbackup workflows
|
||||
|
||||
The implementation provides a solid foundation that can be enhanced with additional features like:
|
||||
- Parallel processing implementation
|
||||
- Custom format support completion
|
||||
- Full restore functionality implementation
|
||||
- Additional database engine support
|
||||
|
||||
**Result:** A completely self-contained, dependency-free database backup solution written in pure Go.
|
||||
**Bottom Line:** Native engine works for SQL format operations. Cluster operations use external tools because PostgreSQL's custom format provides better performance and features.
|
||||
@ -34,8 +34,16 @@ Examples:
|
||||
var clusterCmd = &cobra.Command{
|
||||
Use: "cluster",
|
||||
Short: "Create full cluster backup (PostgreSQL only)",
|
||||
Long: `Create a complete backup of the entire PostgreSQL cluster including all databases and global objects (roles, tablespaces, etc.)`,
|
||||
Args: cobra.NoArgs,
|
||||
Long: `Create a complete backup of the entire PostgreSQL cluster including all databases and global objects (roles, tablespaces, etc.).
|
||||
|
||||
Native Engine:
|
||||
--native - Use pure Go native engine (SQL format, no pg_dump required)
|
||||
--fallback-tools - Fall back to external tools if native engine fails
|
||||
|
||||
By default, cluster backup uses PostgreSQL custom format (.dump) for efficiency.
|
||||
With --native, all databases are backed up in SQL format (.sql.gz) using the
|
||||
native Go engine, eliminating the need for pg_dump.`,
|
||||
Args: cobra.NoArgs,
|
||||
RunE: func(cmd *cobra.Command, args []string) error {
|
||||
return runClusterBackup(cmd.Context())
|
||||
},
|
||||
@ -113,6 +121,24 @@ func init() {
|
||||
backupCmd.AddCommand(singleCmd)
|
||||
backupCmd.AddCommand(sampleCmd)
|
||||
|
||||
// Native engine flags for cluster backup
|
||||
clusterCmd.Flags().Bool("native", false, "Use pure Go native engine (SQL format, no external tools)")
|
||||
clusterCmd.Flags().Bool("fallback-tools", false, "Fall back to external tools if native engine fails")
|
||||
clusterCmd.PreRunE = func(cmd *cobra.Command, args []string) error {
|
||||
if cmd.Flags().Changed("native") {
|
||||
native, _ := cmd.Flags().GetBool("native")
|
||||
cfg.UseNativeEngine = native
|
||||
if native {
|
||||
log.Info("Native engine mode enabled for cluster backup - using SQL format")
|
||||
}
|
||||
}
|
||||
if cmd.Flags().Changed("fallback-tools") {
|
||||
fallback, _ := cmd.Flags().GetBool("fallback-tools")
|
||||
cfg.FallbackToTools = fallback
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// Incremental backup flags (single backup only) - using global vars to avoid initialization cycle
|
||||
singleCmd.Flags().StringVar(&backupTypeFlag, "backup-type", "full", "Backup type: full or incremental")
|
||||
singleCmd.Flags().StringVar(&baseBackupFlag, "base-backup", "", "Path to base backup (required for incremental)")
|
||||
|
||||
@ -336,6 +336,8 @@ func init() {
|
||||
restoreSingleCmd.Flags().BoolVar(&restoreDiagnose, "diagnose", false, "Run deep diagnosis before restore to detect corruption/truncation")
|
||||
restoreSingleCmd.Flags().StringVar(&restoreSaveDebugLog, "save-debug-log", "", "Save detailed error report to file on failure (e.g., /tmp/restore-debug.json)")
|
||||
restoreSingleCmd.Flags().BoolVar(&restoreDebugLocks, "debug-locks", false, "Enable detailed lock debugging (captures PostgreSQL config, Guard decisions, boost attempts)")
|
||||
restoreSingleCmd.Flags().Bool("native", false, "Use pure Go native engine (no psql/pg_restore required)")
|
||||
restoreSingleCmd.Flags().Bool("fallback-tools", false, "Fall back to external tools if native engine fails")
|
||||
|
||||
// Cluster restore flags
|
||||
restoreClusterCmd.Flags().BoolVar(&restoreListDBs, "list-databases", false, "List databases in cluster backup and exit")
|
||||
@ -363,6 +365,32 @@ func init() {
|
||||
restoreClusterCmd.Flags().BoolVar(&restoreCreate, "create", false, "Create target database if it doesn't exist (for single DB restore)")
|
||||
restoreClusterCmd.Flags().BoolVar(&restoreOOMProtection, "oom-protection", false, "Enable OOM protection: disable swap, tune PostgreSQL memory, protect from OOM killer")
|
||||
restoreClusterCmd.Flags().BoolVar(&restoreLowMemory, "low-memory", false, "Force low-memory mode: single-threaded restore with minimal memory (use for <8GB RAM or very large backups)")
|
||||
restoreClusterCmd.Flags().Bool("native", false, "Use pure Go native engine for .sql.gz files (no psql/pg_restore required)")
|
||||
restoreClusterCmd.Flags().Bool("fallback-tools", false, "Fall back to external tools if native engine fails")
|
||||
|
||||
// Handle native engine flags for restore commands
|
||||
for _, cmd := range []*cobra.Command{restoreSingleCmd, restoreClusterCmd} {
|
||||
originalPreRun := cmd.PreRunE
|
||||
cmd.PreRunE = func(c *cobra.Command, args []string) error {
|
||||
if originalPreRun != nil {
|
||||
if err := originalPreRun(c, args); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
if c.Flags().Changed("native") {
|
||||
native, _ := c.Flags().GetBool("native")
|
||||
cfg.UseNativeEngine = native
|
||||
if native {
|
||||
log.Info("Native engine mode enabled for restore")
|
||||
}
|
||||
}
|
||||
if c.Flags().Changed("fallback-tools") {
|
||||
fallback, _ := c.Flags().GetBool("fallback-tools")
|
||||
cfg.FallbackToTools = fallback
|
||||
}
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
// PITR restore flags
|
||||
restorePITRCmd.Flags().StringVar(&pitrBaseBackup, "base-backup", "", "Path to base backup file (.tar.gz) (required)")
|
||||
|
||||
@ -9,7 +9,6 @@ import (
|
||||
"fmt"
|
||||
"io"
|
||||
"os"
|
||||
"os/exec"
|
||||
"path/filepath"
|
||||
"runtime"
|
||||
"strconv"
|
||||
@ -19,9 +18,11 @@ import (
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/checks"
|
||||
"dbbackup/internal/cleanup"
|
||||
"dbbackup/internal/cloud"
|
||||
"dbbackup/internal/config"
|
||||
"dbbackup/internal/database"
|
||||
"dbbackup/internal/engine/native"
|
||||
"dbbackup/internal/fs"
|
||||
"dbbackup/internal/logger"
|
||||
"dbbackup/internal/metadata"
|
||||
@ -542,6 +543,109 @@ func (e *Engine) BackupCluster(ctx context.Context) error {
|
||||
format := "custom"
|
||||
parallel := e.cfg.DumpJobs
|
||||
|
||||
// USE NATIVE ENGINE if configured
|
||||
// This creates .sql.gz files using pure Go (no pg_dump)
|
||||
if e.cfg.UseNativeEngine {
|
||||
sqlFile := filepath.Join(tempDir, "dumps", name+".sql.gz")
|
||||
mu.Lock()
|
||||
e.printf(" Using native Go engine (pure Go, no pg_dump)\n")
|
||||
mu.Unlock()
|
||||
|
||||
// Create native engine for this database
|
||||
nativeCfg := &native.PostgreSQLNativeConfig{
|
||||
Host: e.cfg.Host,
|
||||
Port: e.cfg.Port,
|
||||
User: e.cfg.User,
|
||||
Password: e.cfg.Password,
|
||||
Database: name,
|
||||
SSLMode: e.cfg.SSLMode,
|
||||
Format: "sql",
|
||||
Compression: compressionLevel,
|
||||
Parallel: e.cfg.Jobs,
|
||||
Blobs: true,
|
||||
Verbose: e.cfg.Debug,
|
||||
}
|
||||
|
||||
nativeEngine, nativeErr := native.NewPostgreSQLNativeEngine(nativeCfg, e.log)
|
||||
if nativeErr != nil {
|
||||
if e.cfg.FallbackToTools {
|
||||
mu.Lock()
|
||||
e.log.Warn("Native engine failed, falling back to pg_dump", "database", name, "error", nativeErr)
|
||||
e.printf(" [WARN] Native engine failed, using pg_dump fallback\n")
|
||||
mu.Unlock()
|
||||
// Fall through to use pg_dump below
|
||||
} else {
|
||||
e.log.Error("Failed to create native engine", "database", name, "error", nativeErr)
|
||||
mu.Lock()
|
||||
e.printf(" [FAIL] Failed to create native engine for %s: %v\n", name, nativeErr)
|
||||
mu.Unlock()
|
||||
atomic.AddInt32(&failCount, 1)
|
||||
return
|
||||
}
|
||||
} else {
|
||||
// Connect and backup with native engine
|
||||
if connErr := nativeEngine.Connect(ctx); connErr != nil {
|
||||
if e.cfg.FallbackToTools {
|
||||
mu.Lock()
|
||||
e.log.Warn("Native engine connection failed, falling back to pg_dump", "database", name, "error", connErr)
|
||||
mu.Unlock()
|
||||
} else {
|
||||
e.log.Error("Native engine connection failed", "database", name, "error", connErr)
|
||||
atomic.AddInt32(&failCount, 1)
|
||||
nativeEngine.Close()
|
||||
return
|
||||
}
|
||||
} else {
|
||||
// Create output file with compression
|
||||
outFile, fileErr := os.Create(sqlFile)
|
||||
if fileErr != nil {
|
||||
e.log.Error("Failed to create output file", "file", sqlFile, "error", fileErr)
|
||||
atomic.AddInt32(&failCount, 1)
|
||||
nativeEngine.Close()
|
||||
return
|
||||
}
|
||||
|
||||
// Use pgzip for parallel compression
|
||||
gzWriter, _ := pgzip.NewWriterLevel(outFile, compressionLevel)
|
||||
|
||||
result, backupErr := nativeEngine.Backup(ctx, gzWriter)
|
||||
gzWriter.Close()
|
||||
outFile.Close()
|
||||
nativeEngine.Close()
|
||||
|
||||
if backupErr != nil {
|
||||
os.Remove(sqlFile) // Clean up partial file
|
||||
if e.cfg.FallbackToTools {
|
||||
mu.Lock()
|
||||
e.log.Warn("Native backup failed, falling back to pg_dump", "database", name, "error", backupErr)
|
||||
e.printf(" [WARN] Native backup failed, using pg_dump fallback\n")
|
||||
mu.Unlock()
|
||||
// Fall through to use pg_dump below
|
||||
} else {
|
||||
e.log.Error("Native backup failed", "database", name, "error", backupErr)
|
||||
atomic.AddInt32(&failCount, 1)
|
||||
return
|
||||
}
|
||||
} else {
|
||||
// Native backup succeeded!
|
||||
if info, statErr := os.Stat(sqlFile); statErr == nil {
|
||||
mu.Lock()
|
||||
e.printf(" [OK] Completed %s (%s) [native]\n", name, formatBytes(info.Size()))
|
||||
mu.Unlock()
|
||||
e.log.Info("Native backup completed",
|
||||
"database", name,
|
||||
"size", info.Size(),
|
||||
"duration", result.Duration,
|
||||
"engine", result.EngineUsed)
|
||||
}
|
||||
atomic.AddInt32(&successCount, 1)
|
||||
return // Skip pg_dump path
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Standard pg_dump path (for non-native mode or fallback)
|
||||
if size, err := e.db.GetDatabaseSize(ctx, name); err == nil {
|
||||
if size > 5*1024*1024*1024 {
|
||||
format = "plain"
|
||||
@ -650,7 +754,7 @@ func (e *Engine) executeCommandWithProgress(ctx context.Context, cmdArgs []strin
|
||||
|
||||
e.log.Debug("Executing backup command with progress", "cmd", cmdArgs[0], "args", cmdArgs[1:])
|
||||
|
||||
cmd := exec.CommandContext(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
cmd := cleanup.SafeCommand(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
|
||||
// Set environment variables for database tools
|
||||
cmd.Env = os.Environ()
|
||||
@ -696,9 +800,9 @@ func (e *Engine) executeCommandWithProgress(ctx context.Context, cmdArgs []strin
|
||||
case cmdErr = <-cmdDone:
|
||||
// Command completed (success or failure)
|
||||
case <-ctx.Done():
|
||||
// Context cancelled - kill process to unblock
|
||||
e.log.Warn("Backup cancelled - killing process")
|
||||
cmd.Process.Kill()
|
||||
// Context cancelled - kill entire process group
|
||||
e.log.Warn("Backup cancelled - killing process group")
|
||||
cleanup.KillCommandGroup(cmd)
|
||||
<-cmdDone // Wait for goroutine to finish
|
||||
cmdErr = ctx.Err()
|
||||
}
|
||||
@ -754,7 +858,7 @@ func (e *Engine) monitorCommandProgress(stderr io.ReadCloser, tracker *progress.
|
||||
// Uses in-process pgzip for parallel compression (2-4x faster on multi-core systems)
|
||||
func (e *Engine) executeMySQLWithProgressAndCompression(ctx context.Context, cmdArgs []string, outputFile string, tracker *progress.OperationTracker) error {
|
||||
// Create mysqldump command
|
||||
dumpCmd := exec.CommandContext(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
dumpCmd := cleanup.SafeCommand(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
dumpCmd.Env = os.Environ()
|
||||
if e.cfg.Password != "" {
|
||||
dumpCmd.Env = append(dumpCmd.Env, "MYSQL_PWD="+e.cfg.Password)
|
||||
@ -816,8 +920,8 @@ func (e *Engine) executeMySQLWithProgressAndCompression(ctx context.Context, cmd
|
||||
case dumpErr = <-dumpDone:
|
||||
// mysqldump completed
|
||||
case <-ctx.Done():
|
||||
e.log.Warn("Backup cancelled - killing mysqldump")
|
||||
dumpCmd.Process.Kill()
|
||||
e.log.Warn("Backup cancelled - killing mysqldump process group")
|
||||
cleanup.KillCommandGroup(dumpCmd)
|
||||
<-dumpDone
|
||||
return ctx.Err()
|
||||
}
|
||||
@ -846,7 +950,7 @@ func (e *Engine) executeMySQLWithProgressAndCompression(ctx context.Context, cmd
|
||||
// Uses in-process pgzip for parallel compression (2-4x faster on multi-core systems)
|
||||
func (e *Engine) executeMySQLWithCompression(ctx context.Context, cmdArgs []string, outputFile string) error {
|
||||
// Create mysqldump command
|
||||
dumpCmd := exec.CommandContext(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
dumpCmd := cleanup.SafeCommand(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
dumpCmd.Env = os.Environ()
|
||||
if e.cfg.Password != "" {
|
||||
dumpCmd.Env = append(dumpCmd.Env, "MYSQL_PWD="+e.cfg.Password)
|
||||
@ -895,8 +999,8 @@ func (e *Engine) executeMySQLWithCompression(ctx context.Context, cmdArgs []stri
|
||||
case dumpErr = <-dumpDone:
|
||||
// mysqldump completed
|
||||
case <-ctx.Done():
|
||||
e.log.Warn("Backup cancelled - killing mysqldump")
|
||||
dumpCmd.Process.Kill()
|
||||
e.log.Warn("Backup cancelled - killing mysqldump process group")
|
||||
cleanup.KillCommandGroup(dumpCmd)
|
||||
<-dumpDone
|
||||
return ctx.Err()
|
||||
}
|
||||
@ -951,7 +1055,7 @@ func (e *Engine) createSampleBackup(ctx context.Context, databaseName, outputFil
|
||||
Format: "plain",
|
||||
})
|
||||
|
||||
cmd := exec.CommandContext(ctx, schemaCmd[0], schemaCmd[1:]...)
|
||||
cmd := cleanup.SafeCommand(ctx, schemaCmd[0], schemaCmd[1:]...)
|
||||
cmd.Env = os.Environ()
|
||||
if e.cfg.Password != "" {
|
||||
cmd.Env = append(cmd.Env, "PGPASSWORD="+e.cfg.Password)
|
||||
@ -990,7 +1094,7 @@ func (e *Engine) backupGlobals(ctx context.Context, tempDir string) error {
|
||||
globalsFile := filepath.Join(tempDir, "globals.sql")
|
||||
|
||||
// CRITICAL: Always pass port even for localhost - user may have non-standard port
|
||||
cmd := exec.CommandContext(ctx, "pg_dumpall", "--globals-only",
|
||||
cmd := cleanup.SafeCommand(ctx, "pg_dumpall", "--globals-only",
|
||||
"-p", fmt.Sprintf("%d", e.cfg.Port),
|
||||
"-U", e.cfg.User)
|
||||
|
||||
@ -1034,8 +1138,8 @@ func (e *Engine) backupGlobals(ctx context.Context, tempDir string) error {
|
||||
case cmdErr = <-cmdDone:
|
||||
// Command completed normally
|
||||
case <-ctx.Done():
|
||||
e.log.Warn("Globals backup cancelled - killing pg_dumpall")
|
||||
cmd.Process.Kill()
|
||||
e.log.Warn("Globals backup cancelled - killing pg_dumpall process group")
|
||||
cleanup.KillCommandGroup(cmd)
|
||||
<-cmdDone
|
||||
return ctx.Err()
|
||||
}
|
||||
@ -1430,7 +1534,7 @@ func (e *Engine) executeCommand(ctx context.Context, cmdArgs []string, outputFil
|
||||
|
||||
// For custom format, pg_dump handles everything (writes directly to file)
|
||||
// NO GO BUFFERING - pg_dump writes directly to disk
|
||||
cmd := exec.CommandContext(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
cmd := cleanup.SafeCommand(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
|
||||
// Start heartbeat ticker for backup progress
|
||||
backupStart := time.Now()
|
||||
@ -1499,9 +1603,9 @@ func (e *Engine) executeCommand(ctx context.Context, cmdArgs []string, outputFil
|
||||
case cmdErr = <-cmdDone:
|
||||
// Command completed (success or failure)
|
||||
case <-ctx.Done():
|
||||
// Context cancelled - kill process to unblock
|
||||
e.log.Warn("Backup cancelled - killing pg_dump process")
|
||||
cmd.Process.Kill()
|
||||
// Context cancelled - kill entire process group
|
||||
e.log.Warn("Backup cancelled - killing pg_dump process group")
|
||||
cleanup.KillCommandGroup(cmd)
|
||||
<-cmdDone // Wait for goroutine to finish
|
||||
cmdErr = ctx.Err()
|
||||
}
|
||||
@ -1536,7 +1640,7 @@ func (e *Engine) executeWithStreamingCompression(ctx context.Context, cmdArgs []
|
||||
}
|
||||
|
||||
// Create pg_dump command
|
||||
dumpCmd := exec.CommandContext(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
dumpCmd := cleanup.SafeCommand(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
dumpCmd.Env = os.Environ()
|
||||
if e.cfg.Password != "" && e.cfg.IsPostgreSQL() {
|
||||
dumpCmd.Env = append(dumpCmd.Env, "PGPASSWORD="+e.cfg.Password)
|
||||
@ -1612,9 +1716,9 @@ func (e *Engine) executeWithStreamingCompression(ctx context.Context, cmdArgs []
|
||||
case dumpErr = <-dumpDone:
|
||||
// pg_dump completed (success or failure)
|
||||
case <-ctx.Done():
|
||||
// Context cancelled/timeout - kill pg_dump to unblock
|
||||
e.log.Warn("Backup timeout - killing pg_dump process")
|
||||
dumpCmd.Process.Kill()
|
||||
// Context cancelled/timeout - kill pg_dump process group
|
||||
e.log.Warn("Backup timeout - killing pg_dump process group")
|
||||
cleanup.KillCommandGroup(dumpCmd)
|
||||
<-dumpDone // Wait for goroutine to finish
|
||||
dumpErr = ctx.Err()
|
||||
}
|
||||
|
||||
internal/cleanup/command.go (new file, 154 lines)
@ -0,0 +1,154 @@
|
||||
//go:build !windows
|
||||
// +build !windows
|
||||
|
||||
package cleanup
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"os/exec"
|
||||
"syscall"
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/logger"
|
||||
)
|
||||
|
||||
// SafeCommand creates an exec.Cmd with proper process group setup for clean termination.
|
||||
// This ensures that child processes (e.g., from pipelines) are killed when the parent is killed.
|
||||
func SafeCommand(ctx context.Context, name string, args ...string) *exec.Cmd {
|
||||
cmd := exec.CommandContext(ctx, name, args...)
|
||||
|
||||
// Set up process group for clean termination
|
||||
// This allows killing the entire process tree when cancelled
|
||||
cmd.SysProcAttr = &syscall.SysProcAttr{
|
||||
Setpgid: true, // Create new process group
|
||||
Pgid: 0, // Use the new process's PID as the PGID
|
||||
}
|
||||
|
||||
return cmd
|
||||
}
|
||||
|
||||
// TrackedCommand creates a command that is tracked for cleanup on shutdown.
|
||||
// When the handler shuts down, this command will be killed if still running.
|
||||
type TrackedCommand struct {
|
||||
*exec.Cmd
|
||||
log logger.Logger
|
||||
name string
|
||||
}
|
||||
|
||||
// NewTrackedCommand creates a tracked command
|
||||
func NewTrackedCommand(ctx context.Context, log logger.Logger, name string, args ...string) *TrackedCommand {
|
||||
tc := &TrackedCommand{
|
||||
Cmd: SafeCommand(ctx, name, args...),
|
||||
log: log,
|
||||
name: name,
|
||||
}
|
||||
return tc
|
||||
}
|
||||
|
||||
// StartWithCleanup starts the command and registers cleanup with the handler
|
||||
func (tc *TrackedCommand) StartWithCleanup(h *Handler) error {
|
||||
if err := tc.Cmd.Start(); err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
// Register cleanup function
|
||||
pid := tc.Cmd.Process.Pid
|
||||
h.RegisterCleanup(fmt.Sprintf("kill-%s-%d", tc.name, pid), func(ctx context.Context) error {
|
||||
return tc.Kill()
|
||||
})
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// Kill terminates the command and its process group
|
||||
func (tc *TrackedCommand) Kill() error {
|
||||
if tc.Cmd.Process == nil {
|
||||
return nil // Not started or already cleaned up
|
||||
}
|
||||
|
||||
pid := tc.Cmd.Process.Pid
|
||||
|
||||
// Get the process group ID
|
||||
pgid, err := syscall.Getpgid(pid)
|
||||
if err != nil {
|
||||
// Process might already be gone
|
||||
return nil
|
||||
}
|
||||
|
||||
tc.log.Debug("Terminating process", "name", tc.name, "pid", pid, "pgid", pgid)
|
||||
|
||||
// Try graceful shutdown first (SIGTERM to process group)
|
||||
if err := syscall.Kill(-pgid, syscall.SIGTERM); err != nil {
|
||||
tc.log.Debug("SIGTERM failed, trying SIGKILL", "error", err)
|
||||
}
|
||||
|
||||
// Wait briefly for graceful shutdown
|
||||
done := make(chan error, 1)
|
||||
go func() {
|
||||
_, err := tc.Cmd.Process.Wait()
|
||||
done <- err
|
||||
}()
|
||||
|
||||
select {
|
||||
case <-time.After(3 * time.Second):
|
||||
// Force kill after timeout
|
||||
tc.log.Debug("Process didn't stop gracefully, sending SIGKILL", "name", tc.name, "pid", pid)
|
||||
if err := syscall.Kill(-pgid, syscall.SIGKILL); err != nil {
|
||||
tc.log.Debug("SIGKILL failed", "error", err)
|
||||
}
|
||||
<-done // Wait for Wait() to finish
|
||||
|
||||
case <-done:
|
||||
// Process exited
|
||||
}
|
||||
|
||||
tc.log.Debug("Process terminated", "name", tc.name, "pid", pid)
|
||||
return nil
|
||||
}
|
||||
|
||||
// WaitWithContext waits for the command to complete, handling context cancellation properly.
|
||||
// This is the recommended way to wait for commands, as it ensures proper cleanup on cancellation.
|
||||
func WaitWithContext(ctx context.Context, cmd *exec.Cmd, log logger.Logger) error {
|
||||
if cmd.Process == nil {
|
||||
return fmt.Errorf("process not started")
|
||||
}
|
||||
|
||||
// Wait for command in a goroutine
|
||||
cmdDone := make(chan error, 1)
|
||||
go func() {
|
||||
cmdDone <- cmd.Wait()
|
||||
}()
|
||||
|
||||
select {
|
||||
case err := <-cmdDone:
|
||||
return err
|
||||
|
||||
case <-ctx.Done():
|
||||
// Context cancelled - kill process group
|
||||
log.Debug("Context cancelled, terminating process", "pid", cmd.Process.Pid)
|
||||
|
||||
// Get process group and kill entire group
|
||||
pgid, err := syscall.Getpgid(cmd.Process.Pid)
|
||||
if err == nil {
|
||||
// Kill process group
|
||||
syscall.Kill(-pgid, syscall.SIGTERM)
|
||||
|
||||
// Wait briefly for graceful shutdown
|
||||
select {
|
||||
case <-cmdDone:
|
||||
// Process exited
|
||||
case <-time.After(2 * time.Second):
|
||||
// Force kill
|
||||
syscall.Kill(-pgid, syscall.SIGKILL)
|
||||
<-cmdDone
|
||||
}
|
||||
} else {
|
||||
// Fallback to killing just the process
|
||||
cmd.Process.Kill()
|
||||
<-cmdDone
|
||||
}
|
||||
|
||||
return ctx.Err()
|
||||
}
|
||||
}
|
||||
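A short usage sketch tying the helpers above together (assumed wiring, not taken verbatim from the diff): a tracked pg_dump registered with the shutdown handler, so SIGINT kills the whole process group even if the caller never reaches its own cleanup.

```go
package example

import (
	"context"

	"dbbackup/internal/cleanup"
	"dbbackup/internal/logger"
)

// trackedDump runs pg_dump as a tracked command. Arguments are illustrative.
func trackedDump(ctx context.Context, h *cleanup.Handler, log logger.Logger) error {
	tc := cleanup.NewTrackedCommand(ctx, log, "pg_dump", "-Fc", "-f", "/tmp/mydb.dump", "mydb")
	if err := tc.StartWithCleanup(h); err != nil {
		return err
	}
	// WaitWithContext terminates the process group if ctx is cancelled.
	return cleanup.WaitWithContext(ctx, tc.Cmd, log)
}
```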
internal/cleanup/command_windows.go (new file, 99 lines)
@ -0,0 +1,99 @@
|
||||
//go:build windows
|
||||
// +build windows
|
||||
|
||||
package cleanup
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"os/exec"
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/logger"
|
||||
)
|
||||
|
||||
// SafeCommand creates an exec.Cmd with proper setup for clean termination on Windows.
|
||||
func SafeCommand(ctx context.Context, name string, args ...string) *exec.Cmd {
|
||||
cmd := exec.CommandContext(ctx, name, args...)
|
||||
// Windows doesn't use process groups the same way as Unix
|
||||
// exec.CommandContext will handle termination via the context
|
||||
return cmd
|
||||
}
|
||||
|
||||
// TrackedCommand creates a command that is tracked for cleanup on shutdown.
|
||||
type TrackedCommand struct {
|
||||
*exec.Cmd
|
||||
log logger.Logger
|
||||
name string
|
||||
}
|
||||
|
||||
// NewTrackedCommand creates a tracked command
|
||||
func NewTrackedCommand(ctx context.Context, log logger.Logger, name string, args ...string) *TrackedCommand {
|
||||
tc := &TrackedCommand{
|
||||
Cmd: SafeCommand(ctx, name, args...),
|
||||
log: log,
|
||||
name: name,
|
||||
}
|
||||
return tc
|
||||
}
|
||||
|
||||
// StartWithCleanup starts the command and registers cleanup with the handler
|
||||
func (tc *TrackedCommand) StartWithCleanup(h *Handler) error {
|
||||
if err := tc.Cmd.Start(); err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
// Register cleanup function
|
||||
pid := tc.Cmd.Process.Pid
|
||||
h.RegisterCleanup(fmt.Sprintf("kill-%s-%d", tc.name, pid), func(ctx context.Context) error {
|
||||
return tc.Kill()
|
||||
})
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// Kill terminates the command on Windows
|
||||
func (tc *TrackedCommand) Kill() error {
|
||||
if tc.Cmd.Process == nil {
|
||||
return nil
|
||||
}
|
||||
|
||||
tc.log.Debug("Terminating process", "name", tc.name, "pid", tc.Cmd.Process.Pid)
|
||||
|
||||
if err := tc.Cmd.Process.Kill(); err != nil {
|
||||
tc.log.Debug("Kill failed", "error", err)
|
||||
return err
|
||||
}
|
||||
|
||||
tc.log.Debug("Process terminated", "name", tc.name, "pid", tc.Cmd.Process.Pid)
|
||||
return nil
|
||||
}
|
||||
|
||||
// WaitWithContext waits for the command to complete, handling context cancellation properly.
|
||||
func WaitWithContext(ctx context.Context, cmd *exec.Cmd, log logger.Logger) error {
|
||||
if cmd.Process == nil {
|
||||
return fmt.Errorf("process not started")
|
||||
}
|
||||
|
||||
cmdDone := make(chan error, 1)
|
||||
go func() {
|
||||
cmdDone <- cmd.Wait()
|
||||
}()
|
||||
|
||||
select {
|
||||
case err := <-cmdDone:
|
||||
return err
|
||||
|
||||
case <-ctx.Done():
|
||||
log.Debug("Context cancelled, terminating process", "pid", cmd.Process.Pid)
|
||||
cmd.Process.Kill()
|
||||
|
||||
select {
|
||||
case <-cmdDone:
|
||||
case <-time.After(5 * time.Second):
|
||||
// Already killed, just wait for it
|
||||
}
|
||||
|
||||
return ctx.Err()
|
||||
}
|
||||
}
|
||||
internal/cleanup/handler.go (new file, 242 lines)
@ -0,0 +1,242 @@
|
||||
// Package cleanup provides graceful shutdown and resource cleanup functionality
|
||||
package cleanup
|
||||
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"os"
|
||||
"os/signal"
|
||||
"sync"
|
||||
"syscall"
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/logger"
|
||||
)
|
||||
|
||||
// CleanupFunc is a function that performs cleanup with a timeout context
|
||||
type CleanupFunc func(ctx context.Context) error
|
||||
|
||||
// Handler manages graceful shutdown and resource cleanup
|
||||
type Handler struct {
|
||||
ctx context.Context
|
||||
cancel context.CancelFunc
|
||||
|
||||
cleanupFns []cleanupEntry
|
||||
mu sync.Mutex
|
||||
|
||||
shutdownTimeout time.Duration
|
||||
log logger.Logger
|
||||
|
||||
// Track if shutdown has been initiated
|
||||
shutdownOnce sync.Once
|
||||
shutdownDone chan struct{}
|
||||
}
|
||||
|
||||
type cleanupEntry struct {
|
||||
name string
|
||||
fn CleanupFunc
|
||||
}
|
||||
|
||||
// NewHandler creates a shutdown handler
|
||||
func NewHandler(log logger.Logger) *Handler {
|
||||
ctx, cancel := context.WithCancel(context.Background())
|
||||
|
||||
h := &Handler{
|
||||
ctx: ctx,
|
||||
cancel: cancel,
|
||||
cleanupFns: make([]cleanupEntry, 0),
|
||||
shutdownTimeout: 30 * time.Second,
|
||||
log: log,
|
||||
shutdownDone: make(chan struct{}),
|
||||
}
|
||||
|
||||
return h
|
||||
}
|
||||
|
||||
// Context returns the shutdown context
|
||||
func (h *Handler) Context() context.Context {
|
||||
return h.ctx
|
||||
}
|
||||
|
||||
// RegisterCleanup adds a named cleanup function
|
||||
func (h *Handler) RegisterCleanup(name string, fn CleanupFunc) {
|
||||
h.mu.Lock()
|
||||
defer h.mu.Unlock()
|
||||
h.cleanupFns = append(h.cleanupFns, cleanupEntry{name: name, fn: fn})
|
||||
}
|
||||
|
||||
// SetShutdownTimeout sets the maximum time to wait for cleanup
|
||||
func (h *Handler) SetShutdownTimeout(d time.Duration) {
|
||||
h.shutdownTimeout = d
|
||||
}
|
||||
|
||||
// Shutdown triggers graceful shutdown
|
||||
func (h *Handler) Shutdown() {
|
||||
h.shutdownOnce.Do(func() {
|
||||
h.log.Info("Initiating graceful shutdown...")
|
||||
|
||||
// Cancel context first (stops all ongoing operations)
|
||||
h.cancel()
|
||||
|
||||
// Run cleanup functions
|
||||
h.runCleanup()
|
||||
|
||||
close(h.shutdownDone)
|
||||
})
|
||||
}
|
||||
|
||||
// ShutdownWithSignal triggers shutdown due to an OS signal
|
||||
func (h *Handler) ShutdownWithSignal(sig os.Signal) {
|
||||
h.log.Info("Received signal, initiating graceful shutdown", "signal", sig.String())
|
||||
h.Shutdown()
|
||||
}
|
||||
|
||||
// Wait blocks until shutdown is complete
|
||||
func (h *Handler) Wait() {
|
||||
<-h.shutdownDone
|
||||
}
|
||||
|
||||
// runCleanup executes all cleanup functions in LIFO order
|
||||
func (h *Handler) runCleanup() {
|
||||
h.mu.Lock()
|
||||
fns := make([]cleanupEntry, len(h.cleanupFns))
|
||||
copy(fns, h.cleanupFns)
|
||||
h.mu.Unlock()
|
||||
|
||||
if len(fns) == 0 {
|
||||
h.log.Info("No cleanup functions registered")
|
||||
return
|
||||
}
|
||||
|
||||
h.log.Info("Running cleanup functions", "count", len(fns))
|
||||
|
||||
// Create timeout context for cleanup
|
||||
ctx, cancel := context.WithTimeout(context.Background(), h.shutdownTimeout)
|
||||
defer cancel()
|
||||
|
||||
// Run all cleanups in LIFO order (most recently registered first)
|
||||
var failed int
|
||||
for i := len(fns) - 1; i >= 0; i-- {
|
||||
entry := fns[i]
|
||||
|
||||
h.log.Debug("Running cleanup", "name", entry.name)
|
||||
|
||||
if err := entry.fn(ctx); err != nil {
|
||||
h.log.Warn("Cleanup function failed", "name", entry.name, "error", err)
|
||||
failed++
|
||||
} else {
|
||||
h.log.Debug("Cleanup completed", "name", entry.name)
|
||||
}
|
||||
}
|
||||
|
||||
if failed > 0 {
|
||||
h.log.Warn("Some cleanup functions failed", "failed", failed, "total", len(fns))
|
||||
} else {
|
||||
h.log.Info("All cleanup functions completed successfully")
|
||||
}
|
||||
}
|
||||
|
||||
// RegisterSignalHandler sets up signal handling for graceful shutdown
|
||||
func (h *Handler) RegisterSignalHandler() {
|
||||
sigChan := make(chan os.Signal, 2)
|
||||
signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM, syscall.SIGINT)
|
||||
|
||||
go func() {
|
||||
// First signal: graceful shutdown
|
||||
sig := <-sigChan
|
||||
h.ShutdownWithSignal(sig)
|
||||
|
||||
// Second signal: force exit
|
||||
sig = <-sigChan
|
||||
h.log.Warn("Received second signal, forcing exit", "signal", sig.String())
|
||||
os.Exit(1)
|
||||
}()
|
||||
}
|
||||
|
||||
// ChildProcessCleanup creates a cleanup function for killing child processes
|
||||
func (h *Handler) ChildProcessCleanup() CleanupFunc {
|
||||
return func(ctx context.Context) error {
|
||||
h.log.Info("Cleaning up orphaned child processes...")
|
||||
|
||||
if err := KillOrphanedProcesses(h.log); err != nil {
|
||||
h.log.Warn("Failed to kill some orphaned processes", "error", err)
|
||||
return err
|
||||
}
|
||||
|
||||
h.log.Info("Child process cleanup complete")
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
// DatabasePoolCleanup creates a cleanup function for database connection pools
|
||||
// poolCloser should be a function that closes the pool
|
||||
func DatabasePoolCleanup(log logger.Logger, name string, poolCloser func()) CleanupFunc {
|
||||
return func(ctx context.Context) error {
|
||||
log.Debug("Closing database connection pool", "name", name)
|
||||
poolCloser()
|
||||
log.Debug("Database connection pool closed", "name", name)
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
// FileCleanup creates a cleanup function for file handles
|
||||
func FileCleanup(log logger.Logger, path string, file *os.File) CleanupFunc {
|
||||
return func(ctx context.Context) error {
|
||||
if file == nil {
|
||||
return nil
|
||||
}
|
||||
|
||||
log.Debug("Closing file", "path", path)
|
||||
if err := file.Close(); err != nil {
|
||||
return fmt.Errorf("failed to close file %s: %w", path, err)
|
||||
}
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
// TempFileCleanup creates a cleanup function that closes and removes a temp file
|
||||
func TempFileCleanup(log logger.Logger, file *os.File) CleanupFunc {
|
||||
return func(ctx context.Context) error {
|
||||
if file == nil {
|
||||
return nil
|
||||
}
|
||||
|
||||
path := file.Name()
|
||||
log.Debug("Removing temporary file", "path", path)
|
||||
|
||||
// Close file first
|
||||
if err := file.Close(); err != nil {
|
||||
log.Warn("Failed to close temp file", "path", path, "error", err)
|
||||
}
|
||||
|
||||
// Remove file
|
||||
if err := os.Remove(path); err != nil {
|
||||
if !os.IsNotExist(err) {
|
||||
return fmt.Errorf("failed to remove temp file %s: %w", path, err)
|
||||
}
|
||||
}
|
||||
|
||||
log.Debug("Temporary file removed", "path", path)
|
||||
return nil
|
||||
}
|
||||
}
|
||||
|
||||
// TempDirCleanup creates a cleanup function that removes a temp directory
|
||||
func TempDirCleanup(log logger.Logger, path string) CleanupFunc {
|
||||
return func(ctx context.Context) error {
|
||||
if path == "" {
|
||||
return nil
|
||||
}
|
||||
|
||||
log.Debug("Removing temporary directory", "path", path)
|
||||
|
||||
if err := os.RemoveAll(path); err != nil {
|
||||
if !os.IsNotExist(err) {
|
||||
return fmt.Errorf("failed to remove temp dir %s: %w", path, err)
|
||||
}
|
||||
}
|
||||
|
||||
log.Debug("Temporary directory removed", "path", path)
|
||||
return nil
|
||||
}
|
||||
}
|
||||
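How the handler is meant to be used from `main()` (assumed wiring; the temp-dir path is a placeholder): install the signal handler early, register cleanups as resources appear, and run the operation off `h.Context()` so cancellation propagates.

```go
package example

import (
	"time"

	"dbbackup/internal/cleanup"
	"dbbackup/internal/logger"
)

// wireShutdown sketches typical use of the cleanup.Handler.
func wireShutdown(log logger.Logger) {
	h := cleanup.NewHandler(log)
	h.RegisterSignalHandler() // first Ctrl+C: graceful shutdown; second: force exit
	h.SetShutdownTimeout(30 * time.Second)

	// Kill any orphaned pg_dump/pg_restore children during shutdown.
	h.RegisterCleanup("child-processes", h.ChildProcessCleanup())

	// Remove a temporary extraction directory on shutdown.
	h.RegisterCleanup("temp-dir", cleanup.TempDirCleanup(log, "/tmp/dbbackup-extract"))

	// ... run backup/restore using h.Context() so cancellation propagates ...

	h.Shutdown() // runs all registered cleanups in LIFO order (also triggered by SIGINT/SIGTERM)
}
```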
@ -8,12 +8,12 @@ import (
|
||||
"fmt"
|
||||
"io"
|
||||
"os"
|
||||
"os/exec"
|
||||
"path/filepath"
|
||||
"regexp"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/cleanup"
|
||||
"dbbackup/internal/fs"
|
||||
"dbbackup/internal/logger"
|
||||
|
||||
@ -568,7 +568,7 @@ func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult)
|
||||
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
|
||||
defer cancel()
|
||||
|
||||
cmd := exec.CommandContext(ctx, "pg_restore", "--list", filePath)
|
||||
cmd := cleanup.SafeCommand(ctx, "pg_restore", "--list", filePath)
|
||||
output, err := cmd.CombinedOutput()
|
||||
|
||||
if err != nil {
|
||||
|
||||
@ -17,8 +17,10 @@ import (
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/checks"
|
||||
"dbbackup/internal/cleanup"
|
||||
"dbbackup/internal/config"
|
||||
"dbbackup/internal/database"
|
||||
"dbbackup/internal/engine/native"
|
||||
"dbbackup/internal/fs"
|
||||
"dbbackup/internal/logger"
|
||||
"dbbackup/internal/progress"
|
||||
@ -499,7 +501,7 @@ func (e *Engine) checkDumpHasLargeObjects(archivePath string) bool {
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
|
||||
defer cancel()
|
||||
|
||||
cmd := exec.CommandContext(ctx, "pg_restore", "-l", archivePath)
|
||||
cmd := cleanup.SafeCommand(ctx, "pg_restore", "-l", archivePath)
|
||||
output, err := cmd.Output()
|
||||
|
||||
if err != nil {
|
||||
@ -532,7 +534,23 @@ func (e *Engine) restorePostgreSQLSQL(ctx context.Context, archivePath, targetDB
|
||||
return fmt.Errorf("dump validation failed: %w - the backup file may be truncated or corrupted", err)
|
||||
}
|
||||
|
||||
// Use psql for SQL scripts
|
||||
// USE NATIVE ENGINE if configured
|
||||
// This uses pure Go (pgx) instead of psql
|
||||
if e.cfg.UseNativeEngine {
|
||||
e.log.Info("Using native Go engine for restore", "database", targetDB, "file", archivePath)
|
||||
nativeErr := e.restoreWithNativeEngine(ctx, archivePath, targetDB, compressed)
|
||||
if nativeErr != nil {
|
||||
if e.cfg.FallbackToTools {
|
||||
e.log.Warn("Native restore failed, falling back to psql", "database", targetDB, "error", nativeErr)
|
||||
} else {
|
||||
return fmt.Errorf("native restore failed: %w", nativeErr)
|
||||
}
|
||||
} else {
|
||||
return nil // Native restore succeeded!
|
||||
}
|
||||
}
|
||||
|
||||
// Use psql for SQL scripts (fallback or non-native mode)
|
||||
var cmd []string
|
||||
|
||||
// For localhost, omit -h to use Unix socket (avoids Ident auth issues)
|
||||
@ -569,6 +587,69 @@ func (e *Engine) restorePostgreSQLSQL(ctx context.Context, archivePath, targetDB
|
||||
return e.executeRestoreCommand(ctx, cmd)
|
||||
}
|
||||
|
||||
// restoreWithNativeEngine restores a SQL file using the pure Go native engine
|
||||
func (e *Engine) restoreWithNativeEngine(ctx context.Context, archivePath, targetDB string, compressed bool) error {
|
||||
// Create native engine config
|
||||
nativeCfg := &native.PostgreSQLNativeConfig{
|
||||
Host: e.cfg.Host,
|
||||
Port: e.cfg.Port,
|
||||
User: e.cfg.User,
|
||||
Password: e.cfg.Password,
|
||||
Database: targetDB, // Connect to target database
|
||||
SSLMode: e.cfg.SSLMode,
|
||||
}
|
||||
|
||||
// Create restore engine
|
||||
restoreEngine, err := native.NewPostgreSQLRestoreEngine(nativeCfg, e.log)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to create native restore engine: %w", err)
|
||||
}
|
||||
defer restoreEngine.Close()
|
||||
|
||||
// Open input file
|
||||
file, err := os.Open(archivePath)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to open backup file: %w", err)
|
||||
}
|
||||
defer file.Close()
|
||||
|
||||
var reader io.Reader = file
|
||||
|
||||
// Handle compression
|
||||
if compressed {
|
||||
gzReader, err := pgzip.NewReader(file)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to create gzip reader: %w", err)
|
||||
}
|
||||
defer gzReader.Close()
|
||||
reader = gzReader
|
||||
}
|
||||
|
||||
// Restore with progress tracking
|
||||
options := &native.RestoreOptions{
|
||||
Database: targetDB,
|
||||
ContinueOnError: true, // Be resilient like pg_restore
|
||||
ProgressCallback: func(progress *native.RestoreProgress) {
|
||||
e.log.Debug("Native restore progress",
|
||||
"operation", progress.Operation,
|
||||
"objects", progress.ObjectsCompleted,
|
||||
"rows", progress.RowsProcessed)
|
||||
},
|
||||
}
|
||||
|
||||
result, err := restoreEngine.Restore(ctx, reader, options)
|
||||
if err != nil {
|
||||
return fmt.Errorf("native restore failed: %w", err)
|
||||
}
|
||||
|
||||
e.log.Info("Native restore completed",
|
||||
"database", targetDB,
|
||||
"objects", result.ObjectsProcessed,
|
||||
"duration", result.Duration)
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// restoreMySQLSQL restores from MySQL SQL script
|
||||
func (e *Engine) restoreMySQLSQL(ctx context.Context, archivePath, targetDB string, compressed bool) error {
|
||||
options := database.RestoreOptions{}
|
||||
@ -592,7 +673,7 @@ func (e *Engine) executeRestoreCommand(ctx context.Context, cmdArgs []string) er
|
||||
func (e *Engine) executeRestoreCommandWithContext(ctx context.Context, cmdArgs []string, archivePath, targetDB string, format ArchiveFormat) error {
|
||||
e.log.Info("Executing restore command", "command", strings.Join(cmdArgs, " "))
|
||||
|
||||
cmd := exec.CommandContext(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
cmd := cleanup.SafeCommand(ctx, cmdArgs[0], cmdArgs[1:]...)
|
||||
|
||||
// Set environment variables
|
||||
cmd.Env = append(os.Environ(),
|
||||
@ -662,9 +743,9 @@ func (e *Engine) executeRestoreCommandWithContext(ctx context.Context, cmdArgs [
|
||||
case cmdErr = <-cmdDone:
|
||||
// Command completed (success or failure)
|
||||
case <-ctx.Done():
|
||||
// Context cancelled - kill process
|
||||
e.log.Warn("Restore cancelled - killing process")
|
||||
cmd.Process.Kill()
|
||||
// Context cancelled - kill entire process group
|
||||
e.log.Warn("Restore cancelled - killing process group")
|
||||
cleanup.KillCommandGroup(cmd)
|
||||
<-cmdDone
|
||||
cmdErr = ctx.Err()
|
||||
}
|
||||
@ -772,7 +853,7 @@ func (e *Engine) executeRestoreWithDecompression(ctx context.Context, archivePat
|
||||
defer gz.Close()
|
||||
|
||||
// Start restore command
|
||||
cmd := exec.CommandContext(ctx, restoreCmd[0], restoreCmd[1:]...)
|
||||
cmd := cleanup.SafeCommand(ctx, restoreCmd[0], restoreCmd[1:]...)
|
||||
cmd.Env = append(os.Environ(),
|
||||
fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password),
|
||||
fmt.Sprintf("MYSQL_PWD=%s", e.cfg.Password),
|
||||
@ -876,7 +957,7 @@ func (e *Engine) executeRestoreWithPgzipStream(ctx context.Context, archivePath,
|
||||
if e.cfg.Host != "localhost" && e.cfg.Host != "" {
|
||||
args = append([]string{"-h", e.cfg.Host}, args...)
|
||||
}
|
||||
cmd = exec.CommandContext(ctx, "psql", args...)
|
||||
cmd = cleanup.SafeCommand(ctx, "psql", args...)
|
||||
cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
|
||||
} else {
|
||||
// MySQL - use MYSQL_PWD env var to avoid password in process list
|
||||
@ -885,7 +966,7 @@ func (e *Engine) executeRestoreWithPgzipStream(ctx context.Context, archivePath,
|
||||
args = append(args, "-h", e.cfg.Host)
|
||||
}
|
||||
args = append(args, "-P", fmt.Sprintf("%d", e.cfg.Port), targetDB)
|
||||
cmd = exec.CommandContext(ctx, "mysql", args...)
|
||||
cmd = cleanup.SafeCommand(ctx, "mysql", args...)
|
||||
// Pass password via environment variable to avoid process list exposure
|
||||
cmd.Env = os.Environ()
|
||||
if e.cfg.Password != "" {
|
||||
@ -1322,7 +1403,7 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string, preExtr
|
||||
}
|
||||
} else if strings.HasSuffix(dumpFile, ".dump") {
|
||||
// Validate custom format dumps using pg_restore --list
|
||||
cmd := exec.CommandContext(ctx, "pg_restore", "--list", dumpFile)
|
||||
cmd := cleanup.SafeCommand(ctx, "pg_restore", "--list", dumpFile)
|
||||
output, err := cmd.CombinedOutput()
|
||||
if err != nil {
|
||||
dbName := strings.TrimSuffix(entry.Name(), ".dump")
|
||||
@ -1370,7 +1451,7 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string, preExtr
|
||||
if statErr == nil && archiveStats != nil {
|
||||
backupSizeBytes = archiveStats.Size()
|
||||
}
|
||||
memCheck := guard.CheckSystemMemory(backupSizeBytes)
|
||||
memCheck := guard.CheckSystemMemoryWithType(backupSizeBytes, true) // true = cluster archive with pre-compressed dumps
|
||||
if memCheck != nil {
|
||||
if memCheck.Critical {
|
||||
e.log.Error("🚨 CRITICAL MEMORY WARNING", "error", memCheck.Recommendation)
|
||||
@ -1688,19 +1769,54 @@ func (e *Engine) RestoreCluster(ctx context.Context, archivePath string, preExtr
|
||||
preserveOwnership := isSuperuser
|
||||
isCompressedSQL := strings.HasSuffix(dumpFile, ".sql.gz")
|
||||
|
||||
// Get expected size for this database for progress estimation
|
||||
expectedDBSize := dbSizes[dbName]
|
||||
|
||||
// Start heartbeat ticker to show progress during long-running restore
|
||||
// Use 15s interval to reduce mutex contention during parallel restores
|
||||
// CRITICAL FIX: Report progress to TUI callbacks so large DB restores show updates
|
||||
heartbeatCtx, cancelHeartbeat := context.WithCancel(ctx)
|
||||
heartbeatTicker := time.NewTicker(15 * time.Second)
|
||||
heartbeatTicker := time.NewTicker(5 * time.Second) // More frequent updates (was 15s)
|
||||
heartbeatCount := int64(0)
|
||||
go func() {
|
||||
for {
|
||||
select {
|
||||
case <-heartbeatTicker.C:
|
||||
heartbeatCount++
|
||||
elapsed := time.Since(dbRestoreStart)
|
||||
mu.Lock()
|
||||
statusMsg := fmt.Sprintf("Restoring %s (%d/%d) - elapsed: %s",
|
||||
dbName, idx+1, totalDBs, formatDuration(elapsed))
|
||||
e.progress.Update(statusMsg)
|
||||
|
||||
// CRITICAL: Report activity to TUI callbacks during long-running restore
|
||||
// Use time-based progress estimation: assume ~10MB/s average throughput
|
||||
// This gives visual feedback even when pg_restore hasn't completed
|
||||
estimatedBytesPerSec := int64(10 * 1024 * 1024) // 10 MB/s conservative estimate
|
||||
estimatedBytesDone := elapsed.Milliseconds() / 1000 * estimatedBytesPerSec
|
||||
if expectedDBSize > 0 && estimatedBytesDone > expectedDBSize {
|
||||
estimatedBytesDone = expectedDBSize * 95 / 100 // Cap at 95%
|
||||
}
|
||||
|
||||
// Calculate current progress including in-flight database
|
||||
currentBytesEstimate := bytesCompleted + estimatedBytesDone
|
||||
|
||||
// Report to TUI with estimated progress
|
||||
e.reportDatabaseProgressByBytes(currentBytesEstimate, totalBytes, dbName, int(atomic.LoadInt32(&successCount)), totalDBs)
|
||||
|
||||
// Also report timing info
|
||||
phaseElapsed := time.Since(restorePhaseStart)
|
||||
var avgPerDB time.Duration
|
||||
completedDBTimesMu.Lock()
|
||||
if len(completedDBTimes) > 0 {
|
||||
var total time.Duration
|
||||
for _, d := range completedDBTimes {
|
||||
total += d
|
||||
}
|
||||
avgPerDB = total / time.Duration(len(completedDBTimes))
|
||||
}
|
||||
completedDBTimesMu.Unlock()
|
||||
e.reportDatabaseProgressWithTiming(idx, totalDBs, dbName, phaseElapsed, avgPerDB)
|
||||
|
||||
mu.Unlock()
|
||||
case <-heartbeatCtx.Done():
|
||||
return
|
||||
@ -2121,7 +2237,7 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
|
||||
args = append([]string{"-h", e.cfg.Host}, args...)
|
||||
}
|
||||
|
||||
cmd := exec.CommandContext(ctx, "psql", args...)
|
||||
cmd := cleanup.SafeCommand(ctx, "psql", args...)
|
||||
|
||||
cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
|
||||
|
||||
@ -2183,8 +2299,8 @@ func (e *Engine) restoreGlobals(ctx context.Context, globalsFile string) error {
|
||||
case cmdErr = <-cmdDone:
|
||||
// Command completed
|
||||
case <-ctx.Done():
|
||||
e.log.Warn("Globals restore cancelled - killing process")
|
||||
cmd.Process.Kill()
|
||||
e.log.Warn("Globals restore cancelled - killing process group")
|
||||
cleanup.KillCommandGroup(cmd)
|
||||
<-cmdDone
|
||||
cmdErr = ctx.Err()
|
||||
}
|
||||
@ -2225,7 +2341,7 @@ func (e *Engine) checkSuperuser(ctx context.Context) (bool, error) {
|
||||
args = append([]string{"-h", e.cfg.Host}, args...)
|
||||
}
|
||||
|
||||
cmd := exec.CommandContext(ctx, "psql", args...)
|
||||
cmd := cleanup.SafeCommand(ctx, "psql", args...)
|
||||
|
||||
// Always set PGPASSWORD (empty string is fine for peer/ident auth)
|
||||
cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
|
||||
@ -2260,7 +2376,7 @@ func (e *Engine) terminateConnections(ctx context.Context, dbName string) error
|
||||
args = append([]string{"-h", e.cfg.Host}, args...)
|
||||
}
|
||||
|
||||
cmd := exec.CommandContext(ctx, "psql", args...)
|
||||
cmd := cleanup.SafeCommand(ctx, "psql", args...)
|
||||
|
||||
// Always set PGPASSWORD (empty string is fine for peer/ident auth)
|
||||
cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
|
||||
@ -2296,7 +2412,7 @@ func (e *Engine) dropDatabaseIfExists(ctx context.Context, dbName string) error
|
||||
if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
|
||||
revokeArgs = append([]string{"-h", e.cfg.Host}, revokeArgs...)
|
||||
}
|
||||
revokeCmd := exec.CommandContext(ctx, "psql", revokeArgs...)
|
||||
revokeCmd := cleanup.SafeCommand(ctx, "psql", revokeArgs...)
|
||||
revokeCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
|
||||
revokeCmd.Run() // Ignore errors - database might not exist
|
||||
|
||||
@ -2315,7 +2431,7 @@ func (e *Engine) dropDatabaseIfExists(ctx context.Context, dbName string) error
|
||||
if e.cfg.Host != "localhost" && e.cfg.Host != "127.0.0.1" && e.cfg.Host != "" {
|
||||
forceArgs = append([]string{"-h", e.cfg.Host}, forceArgs...)
|
||||
}
|
||||
forceCmd := exec.CommandContext(ctx, "psql", forceArgs...)
|
||||
forceCmd := cleanup.SafeCommand(ctx, "psql", forceArgs...)
|
||||
forceCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
|
||||
|
||||
output, err := forceCmd.CombinedOutput()
|
||||
@ -2338,7 +2454,7 @@ func (e *Engine) dropDatabaseIfExists(ctx context.Context, dbName string) error
|
||||
args = append([]string{"-h", e.cfg.Host}, args...)
|
||||
}
|
||||
|
||||
cmd := exec.CommandContext(ctx, "psql", args...)
|
||||
cmd := cleanup.SafeCommand(ctx, "psql", args...)
|
||||
cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
|
||||
|
||||
output, err = cmd.CombinedOutput()
|
||||
@ -2372,7 +2488,7 @@ func (e *Engine) ensureMySQLDatabaseExists(ctx context.Context, dbName string) e
|
||||
"-e", fmt.Sprintf("CREATE DATABASE IF NOT EXISTS `%s`", dbName),
|
||||
}
|
||||
|
||||
cmd := exec.CommandContext(ctx, "mysql", args...)
|
||||
cmd := cleanup.SafeCommand(ctx, "mysql", args...)
|
||||
cmd.Env = os.Environ()
|
||||
if e.cfg.Password != "" {
|
||||
cmd.Env = append(cmd.Env, "MYSQL_PWD="+e.cfg.Password)
|
||||
@ -2410,7 +2526,7 @@ func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string
|
||||
args = append([]string{"-h", e.cfg.Host}, args...)
|
||||
}
|
||||
|
||||
cmd := exec.CommandContext(ctx, "psql", args...)
|
||||
cmd := cleanup.SafeCommand(ctx, "psql", args...)
|
||||
|
||||
// Always set PGPASSWORD (empty string is fine for peer/ident auth)
|
||||
cmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
|
||||
@ -2467,7 +2583,7 @@ func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string
|
||||
createArgs = append([]string{"-h", e.cfg.Host}, createArgs...)
|
||||
}
|
||||
|
||||
createCmd := exec.CommandContext(ctx, "psql", createArgs...)
|
||||
createCmd := cleanup.SafeCommand(ctx, "psql", createArgs...)
|
||||
|
||||
// Always set PGPASSWORD (empty string is fine for peer/ident auth)
|
||||
createCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
|
||||
@ -2487,7 +2603,7 @@ func (e *Engine) ensurePostgresDatabaseExists(ctx context.Context, dbName string
|
||||
simpleArgs = append([]string{"-h", e.cfg.Host}, simpleArgs...)
|
||||
}
|
||||
|
||||
simpleCmd := exec.CommandContext(ctx, "psql", simpleArgs...)
|
||||
simpleCmd := cleanup.SafeCommand(ctx, "psql", simpleArgs...)
|
||||
simpleCmd.Env = append(os.Environ(), fmt.Sprintf("PGPASSWORD=%s", e.cfg.Password))
|
||||
|
||||
output, err = simpleCmd.CombinedOutput()
|
||||
@ -2552,7 +2668,7 @@ func (e *Engine) detectLargeObjectsInDumps(dumpsDir string, entries []os.DirEntr
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
|
||||
defer cancel()
|
||||
|
||||
cmd := exec.CommandContext(ctx, "pg_restore", "-l", dumpFile)
|
||||
cmd := cleanup.SafeCommand(ctx, "pg_restore", "-l", dumpFile)
|
||||
output, err := cmd.Output()
|
||||
|
||||
if err != nil {
|
||||
@ -2876,7 +2992,7 @@ func (e *Engine) canRestartPostgreSQL() bool {
|
||||
// Try a quick sudo check - if this fails, we can't restart
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
|
||||
defer cancel()
|
||||
cmd := exec.CommandContext(ctx, "sudo", "-n", "true")
|
||||
cmd := cleanup.SafeCommand(ctx, "sudo", "-n", "true")
|
||||
cmd.Stdin = nil
|
||||
if err := cmd.Run(); err != nil {
|
||||
e.log.Info("Running as postgres user without sudo access - cannot restart PostgreSQL",
|
||||
@ -2906,7 +3022,7 @@ func (e *Engine) tryRestartPostgreSQL(ctx context.Context) bool {
|
||||
runWithTimeout := func(args ...string) bool {
|
||||
cmdCtx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
|
||||
defer cancel()
|
||||
cmd := exec.CommandContext(cmdCtx, args[0], args[1:]...)
|
||||
cmd := cleanup.SafeCommand(cmdCtx, args[0], args[1:]...)
|
||||
// Set stdin to /dev/null to prevent sudo from waiting for password
|
||||
cmd.Stdin = nil
|
||||
return cmd.Run() == nil
|
||||
|
||||
@ -7,12 +7,12 @@ import (
|
||||
"fmt"
|
||||
"io"
|
||||
"os"
|
||||
"os/exec"
|
||||
"path/filepath"
|
||||
"runtime"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/cleanup"
|
||||
"dbbackup/internal/config"
|
||||
"dbbackup/internal/logger"
|
||||
|
||||
@ -568,7 +568,7 @@ func getCommandVersion(cmd string, arg string) string {
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
|
||||
defer cancel()
|
||||
|
||||
output, err := exec.CommandContext(ctx, cmd, arg).CombinedOutput()
|
||||
output, err := cleanup.SafeCommand(ctx, cmd, arg).CombinedOutput()
|
||||
if err != nil {
|
||||
return ""
|
||||
}
|
||||
|
||||
@ -5,11 +5,11 @@ package restore
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"os/exec"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/cleanup"
|
||||
"dbbackup/internal/config"
|
||||
"dbbackup/internal/logger"
|
||||
)
|
||||
@ -124,7 +124,7 @@ func ApplySessionOptimizations(ctx context.Context, cfg *config.Config, log logg
|
||||
|
||||
for _, sql := range safeOptimizations {
|
||||
cmdArgs := append(args, "-c", sql)
|
||||
cmd := exec.CommandContext(ctx, "psql", cmdArgs...)
|
||||
cmd := cleanup.SafeCommand(ctx, "psql", cmdArgs...)
|
||||
cmd.Env = append(cmd.Environ(), fmt.Sprintf("PGPASSWORD=%s", cfg.Password))
|
||||
|
||||
if err := cmd.Run(); err != nil {
|
||||
|
||||
@ -6,11 +6,11 @@ import (
|
||||
"database/sql"
|
||||
"fmt"
|
||||
"os"
|
||||
"os/exec"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"syscall"
|
||||
|
||||
"dbbackup/internal/cleanup"
|
||||
"dbbackup/internal/config"
|
||||
"dbbackup/internal/logger"
|
||||
)
|
||||
@ -358,6 +358,14 @@ func (g *LargeDBGuard) WarnUser(strategy *RestoreStrategy, silentMode bool) {
|
||||
|
||||
// CheckSystemMemory validates system has enough memory for restore
|
||||
func (g *LargeDBGuard) CheckSystemMemory(backupSizeBytes int64) *MemoryCheck {
|
||||
return g.CheckSystemMemoryWithType(backupSizeBytes, false)
|
||||
}
|
||||
|
||||
// CheckSystemMemoryWithType validates system memory with archive type awareness
|
||||
// isClusterArchive: true for .tar.gz cluster backups (contain pre-compressed .dump files)
|
||||
//
|
||||
// false for single .sql.gz files (compressed SQL that expands significantly)
|
||||
func (g *LargeDBGuard) CheckSystemMemoryWithType(backupSizeBytes int64, isClusterArchive bool) *MemoryCheck {
|
||||
check := &MemoryCheck{
|
||||
BackupSizeGB: float64(backupSizeBytes) / (1024 * 1024 * 1024),
|
||||
}
|
||||
@ -374,8 +382,18 @@ func (g *LargeDBGuard) CheckSystemMemory(backupSizeBytes int64) *MemoryCheck {
|
||||
check.SwapTotalGB = float64(memInfo.SwapTotal) / (1024 * 1024 * 1024)
|
||||
check.SwapFreeGB = float64(memInfo.SwapFree) / (1024 * 1024 * 1024)
|
||||
|
||||
// Estimate uncompressed size (typical compression ratio 5:1 to 10:1)
|
||||
estimatedUncompressedGB := check.BackupSizeGB * 7 // Conservative estimate
|
||||
// Estimate uncompressed size based on archive type:
|
||||
// - Cluster archives (.tar.gz): contain pre-compressed .dump files, ratio ~1.2x
|
||||
// - Single SQL files (.sql.gz): compressed SQL expands significantly, ratio ~5-7x
|
||||
var compressionMultiplier float64
|
||||
if isClusterArchive {
|
||||
compressionMultiplier = 1.2 // tar.gz with already-compressed .dump files
|
||||
g.log.Debug("Using cluster archive compression ratio", "multiplier", compressionMultiplier)
|
||||
} else {
|
||||
compressionMultiplier = 5.0 // Conservative for gzipped SQL (was 7, reduced to 5)
|
||||
g.log.Debug("Using single file compression ratio", "multiplier", compressionMultiplier)
|
||||
}
|
||||
estimatedUncompressedGB := check.BackupSizeGB * compressionMultiplier
|
||||
|
||||
// Memory requirements
|
||||
// - PostgreSQL needs ~2-4GB for shared_buffers
|
||||
@ -572,7 +590,7 @@ func (g *LargeDBGuard) RevertMySQLSettings() []string {
|
||||
// Uses pg_restore -l which outputs a line-by-line listing, then streams through it
|
||||
func (g *LargeDBGuard) StreamCountBLOBs(ctx context.Context, dumpFile string) (int, error) {
|
||||
// pg_restore -l outputs text listing, one line per object
|
||||
cmd := exec.CommandContext(ctx, "pg_restore", "-l", dumpFile)
|
||||
cmd := cleanup.SafeCommand(ctx, "pg_restore", "-l", dumpFile)
|
||||
|
||||
stdout, err := cmd.StdoutPipe()
|
||||
if err != nil {
|
||||
@ -609,7 +627,7 @@ func (g *LargeDBGuard) StreamCountBLOBs(ctx context.Context, dumpFile string) (i
|
||||
// StreamAnalyzeDump analyzes a dump file using streaming to avoid memory issues
|
||||
// Returns: blobCount, estimatedObjects, error
|
||||
func (g *LargeDBGuard) StreamAnalyzeDump(ctx context.Context, dumpFile string) (blobCount, totalObjects int, err error) {
|
||||
cmd := exec.CommandContext(ctx, "pg_restore", "-l", dumpFile)
|
||||
cmd := cleanup.SafeCommand(ctx, "pg_restore", "-l", dumpFile)
|
||||
|
||||
stdout, err := cmd.StdoutPipe()
|
||||
if err != nil {
|
||||
|
||||
@ -1,18 +1,22 @@
|
||||
package restore
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"context"
|
||||
"database/sql"
|
||||
"fmt"
|
||||
"os"
|
||||
"os/exec"
|
||||
"path/filepath"
|
||||
"regexp"
|
||||
"runtime"
|
||||
"strconv"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/cleanup"
|
||||
|
||||
"github.com/dustin/go-humanize"
|
||||
"github.com/klauspost/pgzip"
|
||||
"github.com/shirou/gopsutil/v3/mem"
|
||||
)
|
||||
|
||||
@ -381,7 +385,7 @@ func (e *Engine) countBlobsInDump(ctx context.Context, dumpFile string) int {
|
||||
ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
|
||||
defer cancel()
|
||||
|
||||
cmd := exec.CommandContext(ctx, "pg_restore", "-l", dumpFile)
|
||||
cmd := cleanup.SafeCommand(ctx, "pg_restore", "-l", dumpFile)
|
||||
output, err := cmd.Output()
|
||||
if err != nil {
|
||||
return 0
|
||||
@ -398,24 +402,51 @@ func (e *Engine) countBlobsInDump(ctx context.Context, dumpFile string) int {
|
||||
}
|
||||
|
||||
// estimateBlobsInSQL samples compressed SQL for lo_create patterns
|
||||
// Uses in-process pgzip decompression (NO external gzip process)
|
||||
func (e *Engine) estimateBlobsInSQL(sqlFile string) int {
|
||||
// Use zgrep for efficient searching in gzipped files
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
|
||||
defer cancel()
|
||||
|
||||
// Count lo_create calls (each = one large object)
|
||||
cmd := exec.CommandContext(ctx, "zgrep", "-c", "lo_create", sqlFile)
|
||||
output, err := cmd.Output()
|
||||
// Open the gzipped file
|
||||
f, err := os.Open(sqlFile)
|
||||
if err != nil {
|
||||
// Also try SELECT lo_create pattern
|
||||
cmd2 := exec.CommandContext(ctx, "zgrep", "-c", "SELECT.*lo_create", sqlFile)
|
||||
output, err = cmd2.Output()
|
||||
if err != nil {
|
||||
return 0
|
||||
}
|
||||
e.log.Debug("Cannot open SQL file for BLOB estimation", "file", sqlFile, "error", err)
|
||||
return 0
|
||||
}
|
||||
defer f.Close()
|
||||
|
||||
// Create pgzip reader for parallel decompression
|
||||
gzReader, err := pgzip.NewReader(f)
|
||||
if err != nil {
|
||||
e.log.Debug("Cannot create pgzip reader", "file", sqlFile, "error", err)
|
||||
return 0
|
||||
}
|
||||
defer gzReader.Close()
|
||||
|
||||
// Scan for lo_create patterns
|
||||
// We use a regex to match both "lo_create" and "SELECT lo_create" patterns
|
||||
loCreatePattern := regexp.MustCompile(`lo_create`)
|
||||
|
||||
scanner := bufio.NewScanner(gzReader)
|
||||
// Use larger buffer for potentially long lines
|
||||
buf := make([]byte, 0, 256*1024)
|
||||
scanner.Buffer(buf, 10*1024*1024)
|
||||
|
||||
count := 0
|
||||
linesScanned := 0
|
||||
maxLines := 1000000 // Limit scanning for very large files
|
||||
|
||||
for scanner.Scan() && linesScanned < maxLines {
|
||||
line := scanner.Text()
|
||||
linesScanned++
|
||||
|
||||
// Count all lo_create occurrences in the line
|
||||
matches := loCreatePattern.FindAllString(line, -1)
|
||||
count += len(matches)
|
||||
}
|
||||
|
||||
count, _ := strconv.Atoi(strings.TrimSpace(string(output)))
|
||||
if err := scanner.Err(); err != nil {
|
||||
e.log.Debug("Error scanning SQL file", "file", sqlFile, "error", err, "lines_scanned", linesScanned)
|
||||
}
|
||||
|
||||
e.log.Debug("BLOB estimation from SQL file", "file", sqlFile, "lo_create_count", count, "lines_scanned", linesScanned)
|
||||
return count
|
||||
}
|
||||
|
||||
|
||||
@ -8,6 +8,7 @@ import (
|
||||
"os/exec"
|
||||
"strings"
|
||||
|
||||
"dbbackup/internal/cleanup"
|
||||
"dbbackup/internal/config"
|
||||
"dbbackup/internal/fs"
|
||||
"dbbackup/internal/logger"
|
||||
@ -419,7 +420,7 @@ func (s *Safety) checkPostgresDatabaseExists(ctx context.Context, dbName string)
|
||||
}
|
||||
args = append([]string{"-h", host}, args...)
|
||||
|
||||
cmd := exec.CommandContext(ctx, "psql", args...)
|
||||
cmd := cleanup.SafeCommand(ctx, "psql", args...)
|
||||
|
||||
// Set password if provided
|
||||
if s.cfg.Password != "" {
|
||||
@ -447,7 +448,7 @@ func (s *Safety) checkMySQLDatabaseExists(ctx context.Context, dbName string) (b
|
||||
args = append([]string{"-h", s.cfg.Host}, args...)
|
||||
}
|
||||
|
||||
cmd := exec.CommandContext(ctx, "mysql", args...)
|
||||
cmd := cleanup.SafeCommand(ctx, "mysql", args...)
|
||||
|
||||
if s.cfg.Password != "" {
|
||||
cmd.Env = append(os.Environ(), fmt.Sprintf("MYSQL_PWD=%s", s.cfg.Password))
|
||||
@ -493,7 +494,7 @@ func (s *Safety) listPostgresUserDatabases(ctx context.Context) ([]string, error
|
||||
}
|
||||
args = append([]string{"-h", host}, args...)
|
||||
|
||||
cmd := exec.CommandContext(ctx, "psql", args...)
|
||||
cmd := cleanup.SafeCommand(ctx, "psql", args...)
|
||||
|
||||
// Set password - check config first, then environment
|
||||
env := os.Environ()
|
||||
@ -542,7 +543,7 @@ func (s *Safety) listMySQLUserDatabases(ctx context.Context) ([]string, error) {
|
||||
args = append([]string{"-h", s.cfg.Host}, args...)
|
||||
}
|
||||
|
||||
cmd := exec.CommandContext(ctx, "mysql", args...)
|
||||
cmd := cleanup.SafeCommand(ctx, "mysql", args...)
|
||||
|
||||
if s.cfg.Password != "" {
|
||||
cmd.Env = append(os.Environ(), fmt.Sprintf("MYSQL_PWD=%s", s.cfg.Password))
|
||||
|
||||
@ -3,11 +3,11 @@ package restore
|
||||
import (
|
||||
"context"
|
||||
"fmt"
|
||||
"os/exec"
|
||||
"regexp"
|
||||
"strconv"
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/cleanup"
|
||||
"dbbackup/internal/database"
|
||||
)
|
||||
|
||||
@ -54,7 +54,7 @@ func GetDumpFileVersion(dumpPath string) (*VersionInfo, error) {
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
|
||||
defer cancel()
|
||||
|
||||
cmd := exec.CommandContext(ctx, "pg_restore", "-l", dumpPath)
|
||||
cmd := cleanup.SafeCommand(ctx, "pg_restore", "-l", dumpPath)
|
||||
output, err := cmd.CombinedOutput()
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to read dump file metadata: %w (output: %s)", err, string(output))
|
||||
|
||||
@ -16,6 +16,7 @@ import (
|
||||
"dbbackup/internal/config"
|
||||
"dbbackup/internal/database"
|
||||
"dbbackup/internal/logger"
|
||||
"dbbackup/internal/progress"
|
||||
"dbbackup/internal/restore"
|
||||
)
|
||||
|
||||
@ -75,6 +76,13 @@ type RestoreExecutionModel struct {
|
||||
overallPhase int // 1=Extracting, 2=Globals, 3=Databases
|
||||
extractionDone bool
|
||||
|
||||
// Rich progress view for cluster restores
|
||||
richProgressView *RichClusterProgressView
|
||||
unifiedProgress *progress.UnifiedClusterProgress
|
||||
useRichProgress bool // Whether to use the rich progress view
|
||||
termWidth int // Terminal width for rich progress
|
||||
termHeight int // Terminal height for rich progress
|
||||
|
||||
// Results
|
||||
done bool
|
||||
cancelling bool // True when user has requested cancellation
|
||||
@ -108,6 +116,11 @@ func NewRestoreExecution(cfg *config.Config, log logger.Logger, parent tea.Model
|
||||
details: []string{},
|
||||
spinnerFrames: spinnerFrames, // Use package-level constant
|
||||
spinnerFrame: 0,
|
||||
// Initialize rich progress view for cluster restores
|
||||
richProgressView: NewRichClusterProgressView(),
|
||||
useRichProgress: restoreType == "restore-cluster",
|
||||
termWidth: 80,
|
||||
termHeight: 24,
|
||||
}
|
||||
}
|
||||
|
||||
@ -176,6 +189,9 @@ type sharedProgressState struct {
|
||||
// Throttling to prevent excessive updates (memory optimization)
|
||||
lastSpeedSampleTime time.Time // Last time we added a speed sample
|
||||
minSampleInterval time.Duration // Minimum interval between samples (100ms)
|
||||
|
||||
// Unified progress tracker for rich display
|
||||
unifiedProgress *progress.UnifiedClusterProgress
|
||||
}
|
||||
|
||||
type restoreSpeedSample struct {
|
||||
@ -231,6 +247,18 @@ func getCurrentRestoreProgress() (bytesTotal, bytesDone int64, description strin
|
||||
currentRestoreProgressState.phase3StartTime
|
||||
}
|
||||
|
||||
// getUnifiedProgress returns the unified progress tracker if available
|
||||
func getUnifiedProgress() *progress.UnifiedClusterProgress {
|
||||
currentRestoreProgressMu.Lock()
|
||||
defer currentRestoreProgressMu.Unlock()
|
||||
|
||||
if currentRestoreProgressState == nil {
|
||||
return nil
|
||||
}
|
||||
|
||||
return currentRestoreProgressState.unifiedProgress
|
||||
}
|
||||
|
||||
// calculateRollingSpeed calculates speed from recent samples (last 5 seconds)
|
||||
func calculateRollingSpeed(samples []restoreSpeedSample) float64 {
|
||||
if len(samples) < 2 {
|
||||
@ -332,6 +360,11 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
progressState := &sharedProgressState{
|
||||
speedSamples: make([]restoreSpeedSample, 0, 100),
|
||||
}
|
||||
|
||||
// Initialize unified progress tracker for cluster restores
|
||||
if restoreType == "restore-cluster" {
|
||||
progressState.unifiedProgress = progress.NewUnifiedClusterProgress("restore", archive.Path)
|
||||
}
|
||||
engine.SetProgressCallback(func(current, total int64, description string) {
|
||||
progressState.mu.Lock()
|
||||
defer progressState.mu.Unlock()
|
||||
@ -342,10 +375,19 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
progressState.overallPhase = 1
|
||||
progressState.extractionDone = false
|
||||
|
||||
// Update unified progress tracker
|
||||
if progressState.unifiedProgress != nil {
|
||||
progressState.unifiedProgress.SetPhase(progress.PhaseExtracting)
|
||||
progressState.unifiedProgress.SetExtractProgress(current, total)
|
||||
}
|
||||
|
||||
// Check if extraction is complete
|
||||
if current >= total && total > 0 {
|
||||
progressState.extractionDone = true
|
||||
progressState.overallPhase = 2
|
||||
if progressState.unifiedProgress != nil {
|
||||
progressState.unifiedProgress.SetPhase(progress.PhaseGlobals)
|
||||
}
|
||||
}
|
||||
|
||||
// Throttle speed samples to prevent memory bloat (max 10 samples/sec)
|
||||
@ -384,6 +426,13 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
// Clear byte progress when switching to db progress
|
||||
progressState.bytesTotal = 0
|
||||
progressState.bytesDone = 0
|
||||
|
||||
// Update unified progress tracker
|
||||
if progressState.unifiedProgress != nil {
|
||||
progressState.unifiedProgress.SetPhase(progress.PhaseDatabases)
|
||||
progressState.unifiedProgress.SetDatabasesTotal(total, nil)
|
||||
progressState.unifiedProgress.StartDatabase(dbName, 0)
|
||||
}
|
||||
})
|
||||
|
||||
// Set up timing-aware database progress callback for cluster restore ETA
|
||||
@ -406,6 +455,13 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
// Clear byte progress when switching to db progress
|
||||
progressState.bytesTotal = 0
|
||||
progressState.bytesDone = 0
|
||||
|
||||
// Update unified progress tracker
|
||||
if progressState.unifiedProgress != nil {
|
||||
progressState.unifiedProgress.SetPhase(progress.PhaseDatabases)
|
||||
progressState.unifiedProgress.SetDatabasesTotal(total, nil)
|
||||
progressState.unifiedProgress.StartDatabase(dbName, 0)
|
||||
}
|
||||
})
|
||||
|
||||
// Set up weighted (bytes-based) progress callback for accurate cluster restore progress
|
||||
@ -424,6 +480,14 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
if progressState.phase3StartTime.IsZero() {
|
||||
progressState.phase3StartTime = time.Now()
|
||||
}
|
||||
|
||||
// Update unified progress tracker
|
||||
if progressState.unifiedProgress != nil {
|
||||
progressState.unifiedProgress.SetPhase(progress.PhaseDatabases)
|
||||
progressState.unifiedProgress.SetDatabasesTotal(dbTotal, nil)
|
||||
progressState.unifiedProgress.StartDatabase(dbName, bytesTotal)
|
||||
progressState.unifiedProgress.UpdateDatabaseProgress(bytesDone)
|
||||
}
|
||||
})
|
||||
|
||||
// Store progress state in a package-level variable for the ticker to access
|
||||
@ -489,11 +553,30 @@ func executeRestoreWithTUIProgress(parentCtx context.Context, cfg *config.Config
|
||||
|
||||
func (m RestoreExecutionModel) Update(msg tea.Msg) (tea.Model, tea.Cmd) {
|
||||
switch msg := msg.(type) {
|
||||
case tea.WindowSizeMsg:
|
||||
// Update terminal dimensions for rich progress view
|
||||
m.termWidth = msg.Width
|
||||
m.termHeight = msg.Height
|
||||
if m.richProgressView != nil {
|
||||
m.richProgressView.SetSize(msg.Width, msg.Height)
|
||||
}
|
||||
return m, nil
|
||||
|
||||
case restoreTickMsg:
|
||||
if !m.done {
|
||||
m.spinnerFrame = (m.spinnerFrame + 1) % len(m.spinnerFrames)
|
||||
m.elapsed = time.Since(m.startTime)
|
||||
|
||||
// Advance spinner for rich progress view
|
||||
if m.richProgressView != nil {
|
||||
m.richProgressView.AdvanceSpinner()
|
||||
}
|
||||
|
||||
// Update unified progress reference
|
||||
if m.useRichProgress && m.unifiedProgress == nil {
|
||||
m.unifiedProgress = getUnifiedProgress()
|
||||
}
|
||||
|
||||
// Poll shared progress state for real-time updates
|
||||
// Note: dbPhaseElapsed is now calculated in realtime inside getCurrentRestoreProgress()
|
||||
bytesTotal, bytesDone, description, hasUpdate, dbTotal, dbDone, speed, dbPhaseElapsed, dbAvgPerDB, currentDB, overallPhase, extractionDone, dbBytesTotal, dbBytesDone, _ := getCurrentRestoreProgress()
|
||||
@ -782,7 +865,16 @@ func (m RestoreExecutionModel) View() string {
|
||||
} else {
|
||||
// Show unified progress for cluster restore
|
||||
if m.restoreType == "restore-cluster" {
|
||||
// Calculate overall progress across all phases
|
||||
// Use rich progress view when we have unified progress data
|
||||
if m.useRichProgress && m.unifiedProgress != nil {
|
||||
// Render using the rich cluster progress view
|
||||
s.WriteString(m.richProgressView.RenderUnified(m.unifiedProgress))
|
||||
s.WriteString("\n")
|
||||
s.WriteString(infoStyle.Render("[KEYS] Press Ctrl+C to cancel"))
|
||||
return s.String()
|
||||
}
|
||||
|
||||
// Fallback: Calculate overall progress across all phases
|
||||
// Phase 1: Extraction (0-60%)
|
||||
// Phase 2: Globals (60-65%)
|
||||
// Phase 3: Databases (65-100%)
|
||||
|
||||
354
internal/tui/rich_cluster_progress.go
Normal file
354
internal/tui/rich_cluster_progress.go
Normal file
@ -0,0 +1,354 @@
|
||||
package tui
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"dbbackup/internal/progress"
|
||||
)
|
||||
|
||||
// RichClusterProgressView renders detailed cluster restore progress
|
||||
type RichClusterProgressView struct {
|
||||
width int
|
||||
height int
|
||||
spinnerFrames []string
|
||||
spinnerFrame int
|
||||
}
|
||||
|
||||
// NewRichClusterProgressView creates a new rich progress view
|
||||
func NewRichClusterProgressView() *RichClusterProgressView {
|
||||
return &RichClusterProgressView{
|
||||
width: 80,
|
||||
height: 24,
|
||||
spinnerFrames: []string{
|
||||
"⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏",
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
// SetSize updates the terminal size
|
||||
func (v *RichClusterProgressView) SetSize(width, height int) {
|
||||
v.width = width
|
||||
v.height = height
|
||||
}
|
||||
|
||||
// AdvanceSpinner moves to the next spinner frame
|
||||
func (v *RichClusterProgressView) AdvanceSpinner() {
|
||||
v.spinnerFrame = (v.spinnerFrame + 1) % len(v.spinnerFrames)
|
||||
}
|
||||
|
||||
// RenderUnified renders progress from UnifiedClusterProgress
|
||||
func (v *RichClusterProgressView) RenderUnified(p *progress.UnifiedClusterProgress) string {
|
||||
if p == nil {
|
||||
return ""
|
||||
}
|
||||
|
||||
snapshot := p.GetSnapshot()
|
||||
return v.RenderSnapshot(&snapshot)
|
||||
}
|
||||
|
||||
// RenderSnapshot renders progress from a ProgressSnapshot
|
||||
func (v *RichClusterProgressView) RenderSnapshot(snapshot *progress.ProgressSnapshot) string {
|
||||
if snapshot == nil {
|
||||
return ""
|
||||
}
|
||||
|
||||
var b strings.Builder
|
||||
b.Grow(2048)
|
||||
|
||||
// Header with overall progress
|
||||
b.WriteString(v.renderHeader(snapshot))
|
||||
b.WriteString("\n\n")
|
||||
|
||||
// Overall progress bar
|
||||
b.WriteString(v.renderOverallProgress(snapshot))
|
||||
b.WriteString("\n\n")
|
||||
|
||||
// Phase-specific details
|
||||
b.WriteString(v.renderPhaseDetails(snapshot))
|
||||
|
||||
// Performance metrics
|
||||
if v.height > 15 {
|
||||
b.WriteString("\n")
|
||||
b.WriteString(v.renderMetricsFromSnapshot(snapshot))
|
||||
}
|
||||
|
||||
return b.String()
|
||||
}
|
||||
|
||||
func (v *RichClusterProgressView) renderHeader(snapshot *progress.ProgressSnapshot) string {
|
||||
elapsed := time.Since(snapshot.StartTime)
|
||||
|
||||
// Calculate ETA based on progress
|
||||
overall := v.calculateOverallPercent(snapshot)
|
||||
var etaStr string
|
||||
if overall > 0 && overall < 100 {
|
||||
eta := time.Duration(float64(elapsed) / float64(overall) * float64(100-overall))
|
||||
etaStr = fmt.Sprintf("ETA: %s", formatDuration(eta))
|
||||
} else if overall >= 100 {
|
||||
etaStr = "Complete!"
|
||||
} else {
|
||||
etaStr = "ETA: calculating..."
|
||||
}
|
||||
|
||||
title := "Cluster Restore Progress"
|
||||
// Cap separator at 40 chars to avoid long lines on wide terminals
|
||||
sepLen := maxInt(0, v.width-len(title)-4)
|
||||
if sepLen > 40 {
|
||||
sepLen = 40
|
||||
}
|
||||
separator := strings.Repeat("━", sepLen)
|
||||
|
||||
return fmt.Sprintf("%s %s\n Elapsed: %s | %s",
|
||||
title, separator,
|
||||
formatDuration(elapsed), etaStr)
|
||||
}
|
||||
|
||||
func (v *RichClusterProgressView) renderOverallProgress(snapshot *progress.ProgressSnapshot) string {
|
||||
overall := v.calculateOverallPercent(snapshot)
|
||||
|
||||
// Phase indicator
|
||||
phaseLabel := v.getPhaseLabel(snapshot)
|
||||
|
||||
// Progress bar
|
||||
barWidth := v.width - 20
|
||||
if barWidth < 20 {
|
||||
barWidth = 20
|
||||
}
|
||||
bar := v.renderProgressBarWidth(overall, barWidth)
|
||||
|
||||
return fmt.Sprintf(" Overall: %s %3d%%\n Phase: %s", bar, overall, phaseLabel)
|
||||
}
|
||||
|
||||
func (v *RichClusterProgressView) getPhaseLabel(snapshot *progress.ProgressSnapshot) string {
|
||||
switch snapshot.Phase {
|
||||
case progress.PhaseExtracting:
|
||||
return fmt.Sprintf("📦 Extracting archive (%s / %s)",
|
||||
FormatBytes(snapshot.ExtractBytes), FormatBytes(snapshot.ExtractTotal))
|
||||
case progress.PhaseGlobals:
|
||||
return "🔧 Restoring globals (roles, tablespaces)"
|
||||
case progress.PhaseDatabases:
|
||||
return fmt.Sprintf("🗄️ Databases (%d/%d) %s",
|
||||
snapshot.DatabasesDone, snapshot.DatabasesTotal, snapshot.CurrentDB)
|
||||
case progress.PhaseVerifying:
|
||||
return fmt.Sprintf("✅ Verifying (%d/%d)", snapshot.VerifyDone, snapshot.VerifyTotal)
|
||||
case progress.PhaseComplete:
|
||||
return "🎉 Complete!"
|
||||
case progress.PhaseFailed:
|
||||
return "❌ Failed"
|
||||
default:
|
||||
return string(snapshot.Phase)
|
||||
}
|
||||
}
|
||||
|
||||
func (v *RichClusterProgressView) calculateOverallPercent(snapshot *progress.ProgressSnapshot) int {
|
||||
// Use the same logic as UnifiedClusterProgress
|
||||
phaseWeights := map[progress.Phase]int{
|
||||
progress.PhaseExtracting: 20,
|
||||
progress.PhaseGlobals: 5,
|
||||
progress.PhaseDatabases: 70,
|
||||
progress.PhaseVerifying: 5,
|
||||
}
|
||||
|
||||
switch snapshot.Phase {
|
||||
case progress.PhaseIdle:
|
||||
return 0
|
||||
case progress.PhaseExtracting:
|
||||
if snapshot.ExtractTotal > 0 {
|
||||
return int(float64(snapshot.ExtractBytes) / float64(snapshot.ExtractTotal) * float64(phaseWeights[progress.PhaseExtracting]))
|
||||
}
|
||||
return 0
|
||||
case progress.PhaseGlobals:
|
||||
return phaseWeights[progress.PhaseExtracting] + phaseWeights[progress.PhaseGlobals]
|
||||
case progress.PhaseDatabases:
|
||||
basePercent := phaseWeights[progress.PhaseExtracting] + phaseWeights[progress.PhaseGlobals]
|
||||
if snapshot.DatabasesTotal == 0 {
|
||||
return basePercent
|
||||
}
|
||||
dbProgress := float64(snapshot.DatabasesDone) / float64(snapshot.DatabasesTotal)
|
||||
if snapshot.CurrentDBTotal > 0 {
|
||||
currentProgress := float64(snapshot.CurrentDBBytes) / float64(snapshot.CurrentDBTotal)
|
||||
dbProgress += currentProgress / float64(snapshot.DatabasesTotal)
|
||||
}
|
||||
return basePercent + int(dbProgress*float64(phaseWeights[progress.PhaseDatabases]))
|
||||
case progress.PhaseVerifying:
|
||||
basePercent := phaseWeights[progress.PhaseExtracting] + phaseWeights[progress.PhaseGlobals] + phaseWeights[progress.PhaseDatabases]
|
||||
if snapshot.VerifyTotal > 0 {
|
||||
verifyProgress := float64(snapshot.VerifyDone) / float64(snapshot.VerifyTotal)
|
||||
return basePercent + int(verifyProgress*float64(phaseWeights[progress.PhaseVerifying]))
|
||||
}
|
||||
return basePercent
|
||||
case progress.PhaseComplete:
|
||||
return 100
|
||||
default:
|
||||
return 0
|
||||
}
|
||||
}
|
||||
|
||||
func (v *RichClusterProgressView) renderPhaseDetails(snapshot *progress.ProgressSnapshot) string {
|
||||
var b strings.Builder
|
||||
|
||||
switch snapshot.Phase {
|
||||
case progress.PhaseExtracting:
|
||||
pct := 0
|
||||
if snapshot.ExtractTotal > 0 {
|
||||
pct = int(float64(snapshot.ExtractBytes) / float64(snapshot.ExtractTotal) * 100)
|
||||
}
|
||||
bar := v.renderMiniProgressBar(pct)
|
||||
b.WriteString(fmt.Sprintf(" 📦 Extraction: %s %d%%\n", bar, pct))
|
||||
b.WriteString(fmt.Sprintf(" %s / %s\n",
|
||||
FormatBytes(snapshot.ExtractBytes), FormatBytes(snapshot.ExtractTotal)))
|
||||
|
||||
case progress.PhaseDatabases:
|
||||
b.WriteString(" 📊 Databases:\n\n")
|
||||
|
||||
// Show completed databases if any
|
||||
if snapshot.DatabasesDone > 0 {
|
||||
avgTime := time.Duration(0)
|
||||
if len(snapshot.DatabaseTimes) > 0 {
|
||||
var total time.Duration
|
||||
for _, t := range snapshot.DatabaseTimes {
|
||||
total += t
|
||||
}
|
||||
avgTime = total / time.Duration(len(snapshot.DatabaseTimes))
|
||||
}
|
||||
b.WriteString(fmt.Sprintf(" ✓ %d completed (avg: %s)\n",
|
||||
snapshot.DatabasesDone, formatDuration(avgTime)))
|
||||
}
|
||||
|
||||
// Show current database
|
||||
if snapshot.CurrentDB != "" {
|
||||
spinner := v.spinnerFrames[v.spinnerFrame]
|
||||
pct := 0
|
||||
if snapshot.CurrentDBTotal > 0 {
|
||||
pct = int(float64(snapshot.CurrentDBBytes) / float64(snapshot.CurrentDBTotal) * 100)
|
||||
}
|
||||
bar := v.renderMiniProgressBar(pct)
|
||||
|
||||
phaseElapsed := time.Since(snapshot.PhaseStartTime)
|
||||
|
||||
// Better display when we have progress info vs when we're waiting
|
||||
if snapshot.CurrentDBTotal > 0 {
|
||||
b.WriteString(fmt.Sprintf(" %s %-20s %s %3d%%\n",
|
||||
spinner, truncateString(snapshot.CurrentDB, 20), bar, pct))
|
||||
b.WriteString(fmt.Sprintf(" └─ %s / %s (running %s)\n",
|
||||
FormatBytes(snapshot.CurrentDBBytes), FormatBytes(snapshot.CurrentDBTotal),
|
||||
formatDuration(phaseElapsed)))
|
||||
} else {
|
||||
// No byte-level progress available - show activity indicator with elapsed time
|
||||
b.WriteString(fmt.Sprintf(" %s %-20s [restoring...] running %s\n",
|
||||
spinner, truncateString(snapshot.CurrentDB, 20),
|
||||
formatDuration(phaseElapsed)))
|
||||
b.WriteString(fmt.Sprintf(" └─ pg_restore in progress (progress updates every 5s)\n"))
|
||||
}
|
||||
}
|
||||
|
||||
// Show remaining count
|
||||
remaining := snapshot.DatabasesTotal - snapshot.DatabasesDone
|
||||
if snapshot.CurrentDB != "" {
|
||||
remaining--
|
||||
}
|
||||
if remaining > 0 {
|
||||
b.WriteString(fmt.Sprintf(" ⏳ %d remaining\n", remaining))
|
||||
}
|
||||
|
||||
case progress.PhaseVerifying:
|
||||
pct := 0
|
||||
if snapshot.VerifyTotal > 0 {
|
||||
pct = snapshot.VerifyDone * 100 / snapshot.VerifyTotal
|
||||
}
|
||||
bar := v.renderMiniProgressBar(pct)
|
||||
b.WriteString(fmt.Sprintf(" ✅ Verification: %s %d%%\n", bar, pct))
|
||||
b.WriteString(fmt.Sprintf(" %d / %d databases verified\n",
|
||||
snapshot.VerifyDone, snapshot.VerifyTotal))
|
||||
|
||||
case progress.PhaseComplete:
|
||||
elapsed := time.Since(snapshot.StartTime)
|
||||
b.WriteString(fmt.Sprintf(" 🎉 Restore complete!\n"))
|
||||
b.WriteString(fmt.Sprintf(" %d databases restored in %s\n",
|
||||
snapshot.DatabasesDone, formatDuration(elapsed)))
|
||||
|
||||
case progress.PhaseFailed:
|
||||
b.WriteString(" ❌ Restore failed:\n")
|
||||
for _, err := range snapshot.Errors {
|
||||
b.WriteString(fmt.Sprintf(" • %s\n", truncateString(err, v.width-10)))
|
||||
}
|
||||
}
|
||||
|
||||
return b.String()
|
||||
}
|
||||
|
||||
func (v *RichClusterProgressView) renderMetricsFromSnapshot(snapshot *progress.ProgressSnapshot) string {
|
||||
var b strings.Builder
|
||||
b.WriteString(" 📈 Performance:\n")
|
||||
|
||||
elapsed := time.Since(snapshot.StartTime)
|
||||
if elapsed > 0 {
|
||||
// Calculate throughput from extraction phase if we have data
|
||||
if snapshot.ExtractBytes > 0 && elapsed.Seconds() > 0 {
|
||||
throughput := float64(snapshot.ExtractBytes) / elapsed.Seconds()
|
||||
b.WriteString(fmt.Sprintf(" Throughput: %s/s\n", FormatBytes(int64(throughput))))
|
||||
}
|
||||
|
||||
// Database timing info
|
||||
if len(snapshot.DatabaseTimes) > 0 {
|
||||
var total time.Duration
|
||||
for _, t := range snapshot.DatabaseTimes {
|
||||
total += t
|
||||
}
|
||||
avg := total / time.Duration(len(snapshot.DatabaseTimes))
|
||||
b.WriteString(fmt.Sprintf(" Avg DB time: %s\n", formatDuration(avg)))
|
||||
}
|
||||
}
|
||||
|
||||
return b.String()
|
||||
}
|
||||
|
||||
// Helper functions
|
||||
|
||||
func (v *RichClusterProgressView) renderProgressBarWidth(pct, width int) string {
|
||||
if width < 10 {
|
||||
width = 10
|
||||
}
|
||||
filled := (pct * width) / 100
|
||||
empty := width - filled
|
||||
|
||||
bar := strings.Repeat("█", filled) + strings.Repeat("░", empty)
|
||||
return "[" + bar + "]"
|
||||
}
|
||||
|
||||
func (v *RichClusterProgressView) renderMiniProgressBar(pct int) string {
|
||||
width := 20
|
||||
filled := (pct * width) / 100
|
||||
empty := width - filled
|
||||
return strings.Repeat("█", filled) + strings.Repeat("░", empty)
|
||||
}
|
||||
|
||||
func truncateString(s string, maxLen int) string {
|
||||
if len(s) <= maxLen {
|
||||
return s
|
||||
}
|
||||
if maxLen < 4 {
|
||||
return s[:maxLen]
|
||||
}
|
||||
return s[:maxLen-3] + "..."
|
||||
}
|
||||
|
||||
func maxInt(a, b int) int {
|
||||
if a > b {
|
||||
return a
|
||||
}
|
||||
return b
|
||||
}
|
||||
|
||||
func formatNumShort(n int64) string {
|
||||
if n >= 1e9 {
|
||||
return fmt.Sprintf("%.1fB", float64(n)/1e9)
|
||||
} else if n >= 1e6 {
|
||||
return fmt.Sprintf("%.1fM", float64(n)/1e6)
|
||||
} else if n >= 1e3 {
|
||||
return fmt.Sprintf("%.1fK", float64(n)/1e3)
|
||||
}
|
||||
return fmt.Sprintf("%d", n)
|
||||
}
|
||||
2
main.go
2
main.go
@ -16,7 +16,7 @@ import (
|
||||
|
||||
// Build information (set by ldflags)
|
||||
var (
|
||||
version = "5.4.2"
|
||||
version = "5.5.0"
|
||||
buildTime = "unknown"
|
||||
gitCommit = "unknown"
|
||||
)
|
||||
|
||||
192
scripts/test-sigint-cleanup.sh
Executable file
192
scripts/test-sigint-cleanup.sh
Executable file
@ -0,0 +1,192 @@
|
||||
#!/bin/bash
|
||||
# scripts/test-sigint-cleanup.sh
|
||||
# Test script to verify clean shutdown on SIGINT (Ctrl+C)
|
||||
|
||||
set -e
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
PROJECT_DIR="$(dirname "$SCRIPT_DIR")"
|
||||
BINARY="$PROJECT_DIR/dbbackup"
|
||||
|
||||
# Colors for output
|
||||
RED='\033[0;31m'
|
||||
GREEN='\033[0;32m'
|
||||
YELLOW='\033[1;33m'
|
||||
NC='\033[0m' # No Color
|
||||
|
||||
echo "=== SIGINT Cleanup Test ==="
|
||||
echo ""
|
||||
echo "Project: $PROJECT_DIR"
|
||||
echo "Binary: $BINARY"
|
||||
echo ""
|
||||
|
||||
# Check if binary exists
|
||||
if [ ! -f "$BINARY" ]; then
|
||||
echo -e "${YELLOW}Binary not found, building...${NC}"
|
||||
cd "$PROJECT_DIR"
|
||||
go build -o dbbackup .
|
||||
fi
|
||||
|
||||
# Create a test backup file if it doesn't exist
|
||||
TEST_BACKUP="/tmp/test-sigint-backup.sql.gz"
|
||||
if [ ! -f "$TEST_BACKUP" ]; then
|
||||
echo -e "${YELLOW}Creating test backup file...${NC}"
|
||||
echo "-- Test SQL file for SIGINT testing" | gzip > "$TEST_BACKUP"
|
||||
fi
|
||||
|
||||
echo "=== Phase 1: Pre-test Cleanup ==="
|
||||
echo "Killing any existing dbbackup processes..."
|
||||
pkill -f "dbbackup" 2>/dev/null || true
|
||||
sleep 1
|
||||
|
||||
echo ""
|
||||
echo "=== Phase 2: Check Initial State ==="
|
||||
|
||||
echo "Checking for orphaned processes..."
|
||||
INITIAL_PROCS=$(pgrep -f "pg_dump|pg_restore|dbbackup" 2>/dev/null | wc -l)
|
||||
echo "Initial related processes: $INITIAL_PROCS"
|
||||
|
||||
echo ""
|
||||
echo "Checking for temp files..."
|
||||
INITIAL_TEMPS=$(ls /tmp/dbbackup-* 2>/dev/null | wc -l || echo "0")
|
||||
echo "Initial temp files: $INITIAL_TEMPS"
|
||||
|
||||
echo ""
|
||||
echo "=== Phase 3: Start Test Operation ==="
|
||||
|
||||
# Start a TUI operation that will hang (version is fast, but menu would wait)
|
||||
echo "Starting dbbackup TUI (will be interrupted)..."
|
||||
|
||||
# Run in background with PTY simulation (needed for TUI)
|
||||
cd "$PROJECT_DIR"
|
||||
timeout 30 script -q -c "$BINARY" /dev/null &
|
||||
PID=$!
|
||||
|
||||
echo "Process started: PID=$PID"
|
||||
sleep 2
|
||||
|
||||
# Check if process is running
|
||||
if ! kill -0 $PID 2>/dev/null; then
|
||||
echo -e "${YELLOW}Process exited quickly (expected for non-interactive test)${NC}"
|
||||
echo "This is normal - the TUI requires a real TTY"
|
||||
PID=""
|
||||
else
|
||||
echo "Process is running"
|
||||
|
||||
echo ""
|
||||
echo "=== Phase 4: Check Running State ==="
|
||||
|
||||
echo "Child processes of $PID:"
|
||||
pgrep -P $PID 2>/dev/null | while read child; do
|
||||
ps -p $child -o pid,ppid,cmd 2>/dev/null || true
|
||||
done
|
||||
|
||||
echo ""
|
||||
echo "=== Phase 5: Send SIGINT ==="
|
||||
echo "Sending SIGINT to process $PID..."
|
||||
kill -SIGINT $PID 2>/dev/null || true
|
||||
|
||||
echo "Waiting for cleanup (max 10 seconds)..."
|
||||
for i in {1..10}; do
|
||||
if ! kill -0 $PID 2>/dev/null; then
|
||||
echo ""
|
||||
echo -e "${GREEN}Process exited after ${i} seconds${NC}"
|
||||
break
|
||||
fi
|
||||
sleep 1
|
||||
echo -n "."
|
||||
done
|
||||
echo ""
|
||||
|
||||
# Check if still running
|
||||
if kill -0 $PID 2>/dev/null; then
|
||||
echo -e "${RED}Process still running after 10 seconds!${NC}"
|
||||
echo "Force killing..."
|
||||
kill -9 $PID 2>/dev/null || true
|
||||
fi
|
||||
fi
|
||||
|
||||
sleep 2 # Give OS time to clean up
|
||||
|
||||
echo ""
|
||||
echo "=== Phase 6: Post-Shutdown Verification ==="
|
||||
|
||||
# Check for zombie processes
|
||||
ZOMBIES=$(ps aux 2>/dev/null | grep -E "dbbackup|pg_dump|pg_restore" | grep -v grep | grep defunct | wc -l)
|
||||
echo "Zombie processes: $ZOMBIES"
|
||||
|
||||
# Check for orphaned children
|
||||
if [ -n "$PID" ]; then
|
||||
ORPHANS=$(pgrep -P $PID 2>/dev/null | wc -l || echo "0")
|
||||
echo "Orphaned children of original process: $ORPHANS"
|
||||
else
|
||||
ORPHANS=0
|
||||
fi
|
||||
|
||||
# Check for leftover related processes
|
||||
LEFTOVER_PROCS=$(pgrep -f "pg_dump|pg_restore" 2>/dev/null | wc -l || echo "0")
|
||||
echo "Leftover pg_dump/pg_restore processes: $LEFTOVER_PROCS"
|
||||
|
||||
# Check for temp files
|
||||
TEMP_FILES=$(ls /tmp/dbbackup-* 2>/dev/null | wc -l || echo "0")
|
||||
echo "Temporary files: $TEMP_FILES"
|
||||
|
||||
# Database connections check (if psql available and configured)
|
||||
if command -v psql &> /dev/null; then
|
||||
echo ""
|
||||
echo "Checking database connections..."
|
||||
DB_CONNS=$(psql -t -c "SELECT count(*) FROM pg_stat_activity WHERE application_name LIKE '%dbbackup%';" 2>/dev/null | tr -d ' ' || echo "N/A")
|
||||
echo "Database connections with 'dbbackup' in name: $DB_CONNS"
|
||||
else
|
||||
echo "psql not available - skipping database connection check"
|
||||
DB_CONNS="N/A"
|
||||
fi
|
||||
|
||||
echo ""
|
||||
echo "=== Test Results ==="
|
||||
|
||||
PASSED=true
|
||||
|
||||
if [ "$ZOMBIES" -gt 0 ]; then
|
||||
echo -e "${RED}❌ FAIL: $ZOMBIES zombie process(es) found${NC}"
|
||||
PASSED=false
|
||||
else
|
||||
echo -e "${GREEN}✓ No zombie processes${NC}"
|
||||
fi
|
||||
|
||||
if [ "$ORPHANS" -gt 0 ]; then
|
||||
echo -e "${RED}❌ FAIL: $ORPHANS orphaned child process(es) found${NC}"
|
||||
PASSED=false
|
||||
else
|
||||
echo -e "${GREEN}✓ No orphaned children${NC}"
|
||||
fi
|
||||
|
||||
if [ "$LEFTOVER_PROCS" -gt 0 ]; then
|
||||
echo -e "${YELLOW}⚠ WARNING: $LEFTOVER_PROCS leftover pg_dump/pg_restore process(es)${NC}"
|
||||
echo " These may be from other operations"
|
||||
fi
|
||||
|
||||
if [ "$TEMP_FILES" -gt "$INITIAL_TEMPS" ]; then
|
||||
NEW_TEMPS=$((TEMP_FILES - INITIAL_TEMPS))
|
||||
echo -e "${RED}❌ FAIL: $NEW_TEMPS new temporary file(s) left behind${NC}"
|
||||
ls -la /tmp/dbbackup-* 2>/dev/null || true
|
||||
PASSED=false
|
||||
else
|
||||
echo -e "${GREEN}✓ No new temporary files left behind${NC}"
|
||||
fi
|
||||
|
||||
if [ "$DB_CONNS" != "N/A" ] && [ "$DB_CONNS" -gt 0 ]; then
|
||||
echo -e "${RED}❌ FAIL: $DB_CONNS database connection(s) still active${NC}"
|
||||
PASSED=false
|
||||
elif [ "$DB_CONNS" != "N/A" ]; then
|
||||
echo -e "${GREEN}✓ No lingering database connections${NC}"
|
||||
fi
|
||||
|
||||
echo ""
|
||||
if [ "$PASSED" = true ]; then
|
||||
echo -e "${GREEN}=== ✓ ALL TESTS PASSED ===${NC}"
|
||||
exit 0
|
||||
else
|
||||
echo -e "${RED}=== ✗ SOME TESTS FAILED ===${NC}"
|
||||
exit 1
|
||||
fi
|
||||
Reference in New Issue
Block a user