Remove obsolete development documentation and test scripts

Removed files (features now implemented in production code):

- CLUSTER_RESTORE_COMPLIANCE.md - cluster restore best practices implemented
- LARGE_OBJECT_RESTORE_FIX.md - large object fixes applied (--single-transaction removed)
- PHASE2_COMPLETION.md - Phase 2 TUI improvements completed
- TUI_IMPROVEMENTS.md - all TUI enhancements implemented
- create_d7030_test.sh - test database no longer needed
- fix_max_locks.sh - fix applied to codebase
- test_backup_restore.sh - superseded by production features
- test_build - build artifact
- verify_backup_blobs.sh - verification built into restore process

All features documented in these files are now part of the main codebase and documented in README.md.
CLUSTER_RESTORE_COMPLIANCE.md
@@ -1,168 +0,0 @@
# PostgreSQL Cluster Restore - Best Practices Compliance Check

## ✅ Current Implementation Status

### Our Cluster Restore Process (internal/restore/engine.go)

Based on PostgreSQL official documentation and best practices, our implementation follows the correct approach:

## 1. ✅ Global Objects Restoration (FIRST)
```go
// Lines 505-528: Restore globals BEFORE databases
globalsFile := filepath.Join(tempDir, "globals.sql")
if _, err := os.Stat(globalsFile); err == nil {
    e.restoreGlobals(ctx, globalsFile) // Restores roles, tablespaces FIRST
}
```

**Why:** Roles and tablespaces must exist before restoring databases that reference them.

## 2. ✅ Proper Database Cleanup (DROP IF EXISTS)
```go
// Lines 600-605: Drop existing database completely
e.dropDatabaseIfExists(ctx, dbName)
```

### dropDatabaseIfExists implementation (lines 835-870):
```go
// Step 1: Terminate all active connections
terminateConnections(ctx, dbName)

// Step 2: Wait for termination to take effect
time.Sleep(500 * time.Millisecond)

// Step 3: Drop database with IF EXISTS
// Executes: DROP DATABASE IF EXISTS "dbName"
```
**PostgreSQL Docs**: "The `--clean` option can be useful even when your intention is to restore the dump script into a fresh cluster. Use of `--clean` authorizes the script to drop and re-create the built-in postgres and template1 databases."

## 3. ✅ Template0 for Database Creation
```sql
-- Line 915: Use template0 to avoid duplicate definitions
CREATE DATABASE "dbName" WITH TEMPLATE template0
```

**Why:** `template0` is truly empty, whereas `template1` may have local additions that cause "duplicate definition" errors.

**PostgreSQL Docs (pg_restore)**: "To make an empty database without any local additions, copy from template0 not template1, for example: CREATE DATABASE foo WITH TEMPLATE template0;"

## 4. ✅ Connection Termination Before Drop
```sql
-- Lines 800-833: terminateConnections function
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'dbname'
  AND pid <> pg_backend_pid()
```

**Why:** Cannot drop a database with active connections. Must terminate them first.

## 5. ✅ Parallel Restore with Worker Pool
```go
// Lines 555-571: Parallel restore implementation
parallelism := e.cfg.ClusterParallelism
semaphore := make(chan struct{}, parallelism)
// Restores multiple databases concurrently
```

**Best Practice:** Significantly speeds up cluster restore (3-5x faster).
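For readers who want the shape of that worker pool, here is a minimal, self-contained sketch of the bounded-concurrency pattern the snippet above hints at (the `restoreOne` callback and database names are illustrative stand-ins, not the engine's actual API):

```go
package main

import (
	"fmt"
	"sync"
)

// restoreAll runs restoreOne for each database with at most
// `parallelism` restores in flight, and collects the failures.
func restoreAll(databases []string, parallelism int, restoreOne func(db string) error) []string {
	var (
		mu        sync.Mutex
		failed    []string
		wg        sync.WaitGroup
		semaphore = make(chan struct{}, parallelism)
	)
	for _, db := range databases {
		wg.Add(1)
		go func(db string) {
			defer wg.Done()
			semaphore <- struct{}{}        // acquire a worker slot
			defer func() { <-semaphore }() // release it when done
			if err := restoreOne(db); err != nil {
				mu.Lock()
				failed = append(failed, db)
				mu.Unlock()
			}
		}(db)
	}
	wg.Wait()
	return failed
}

func main() {
	failed := restoreAll([]string{"app", "reports", "archive"}, 2, func(db string) error {
		fmt.Println("restoring", db)
		return nil
	})
	fmt.Println("failed databases:", failed)
}
```

The buffered channel doubles as the semaphore, so the concurrency bound stays explicit and no separate worker-queue abstraction is needed.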
## 6. ✅ Error Handling and Reporting
```go
// Lines 628-645: Comprehensive error tracking
var failedDBs []string
var successCount, failCount int32

// Report failures at end
if len(failedDBs) > 0 {
    return fmt.Errorf("cluster restore completed with %d failures: %s",
        len(failedDBs), strings.Join(failedDBs, ", "))
}
```

## 7. ✅ Superuser Privilege Detection
```go
// Lines 488-503: Check for superuser
isSuperuser, err := e.checkSuperuser(ctx)
if !isSuperuser {
    e.log.Warn("Current user is not a superuser - database ownership may not be fully restored")
}
```

**Why:** Ownership restoration requires superuser privileges. Warn user if not available.
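`checkSuperuser` itself is not shown in this document; a plausible minimal version asks `pg_roles` about the current role. The sketch below (using `database/sql`) is an assumption about how such a check can be written, not the engine's actual code:

```go
package restore

import (
	"context"
	"database/sql"
)

// checkSuperuser reports whether the connected role has superuser rights.
// Sketch only: assumes an open *sql.DB handle to the target cluster.
func checkSuperuser(ctx context.Context, db *sql.DB) (bool, error) {
	var isSuper bool
	err := db.QueryRowContext(ctx,
		`SELECT rolsuper FROM pg_roles WHERE rolname = current_user`,
	).Scan(&isSuper)
	return isSuper, err
}
```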
## 8. ✅ System Database Skip Logic
```go
// Lines 877-881: Skip system databases
if dbName == "postgres" || dbName == "template0" || dbName == "template1" {
    e.log.Info("Skipping create for system database (assume exists)")
    return nil
}
```

**Why:** System databases always exist and should not be dropped/created.

---

## PostgreSQL Documentation References

### From pg_dumpall docs:
> "`-c, --clean`: Emit SQL commands to DROP all the dumped databases, roles, and tablespaces before recreating them. This option is useful when the restore is to overwrite an existing cluster."

### From managing-databases docs:
> "To destroy a database: DROP DATABASE name;"
> "You cannot drop a database while clients are connected to it. You can use pg_terminate_backend to disconnect them."

### From pg_restore docs:
> "To make an empty database without any local additions, copy from template0 not template1"

---

## Comparison with PostgreSQL Best Practices

| Practice | PostgreSQL Docs | Our Implementation | Status |
|----------|----------------|-------------------|--------|
| Restore globals first | ✅ Required | ✅ Implemented | ✅ CORRECT |
| DROP before CREATE | ✅ Recommended | ✅ Implemented | ✅ CORRECT |
| Terminate connections | ✅ Required | ✅ Implemented | ✅ CORRECT |
| Use template0 | ✅ Recommended | ✅ Implemented | ✅ CORRECT |
| Handle IF EXISTS errors | ✅ Recommended | ✅ Implemented | ✅ CORRECT |
| Superuser warnings | ✅ Recommended | ✅ Implemented | ✅ CORRECT |
| Parallel restore | ⚪ Optional | ✅ Implemented | ✅ ENHANCED |

---
## Additional Safety Features (Beyond Docs)

1. **Version Compatibility Checking** (NEW)
   - Warns about PG 13 → PG 17 upgrades
   - Blocks unsupported downgrades
   - Provides recommendations

2. **Atomic Failure Tracking**
   - Thread-safe counters for parallel operations
   - Detailed error collection per database

3. **Progress Indicators**
   - Real-time ETA estimation
   - Per-database progress tracking

4. **Disk Space Validation**
   - Pre-checks available space (4x multiplier for cluster)
   - Prevents out-of-space failures mid-restore

---

## Conclusion

✅ **Our cluster restore implementation is 100% compliant with PostgreSQL best practices.**

The cleanup process (`dropDatabaseIfExists`) correctly:
1. Terminates all connections
2. Waits for cleanup
3. Drops the database completely
4. Uses `template0` for fresh creation
5. Handles system databases appropriately

**No changes needed** - implementation follows official documentation exactly.
LARGE_OBJECT_RESTORE_FIX.md
@@ -1,165 +0,0 @@
# Large Object Restore Fix

## Problem Analysis

### Error 1: "type backup_state already exists" (postgres database)
**Root Cause**: `--single-transaction` combined with `--exit-on-error` causes the entire restore to fail when objects already exist in the target database.

**Why it fails**:
- `--single-transaction` wraps the restore in BEGIN/COMMIT
- `--exit-on-error` aborts on ANY error (including ignorable ones)
- "already exists" errors are IGNORABLE - PostgreSQL should continue

### Error 2: "could not open large object 9646664" + 2.5M errors (resydb database)
**Root Cause**: `--single-transaction` takes locks on ALL restored objects simultaneously, exhausting the lock table.

**Why it fails**:
- A single transaction locks ALL large objects at once
- With 35,000+ large objects, this exceeds max_locks_per_transaction
- Lock exhaustion → "could not open large object" errors
- Cascading failures → millions of errors

## PostgreSQL Documentation (Verified)

### From pg_restore docs:
> **"pg_restore cannot restore large objects selectively"** - All large objects restored together

> **"-j / --jobs: Only custom and directory formats supported"**

> **"multiple jobs cannot be used together with --single-transaction"**

### From Section 19.5 (Resource Consumption):
> **"max_locks_per_transaction × max_connections = total locks"**
- The lock table is SHARED across all sessions
- A single transaction consuming all locks blocks everything
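To put numbers on that formula: with the stock defaults of `max_locks_per_transaction = 64` and `max_connections = 100`, the shared lock table holds roughly 64 × 100 = 6,400 object locks, so a single transaction touching 35,000+ large objects overruns it more than five times over.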
## Changes Made

### 1. Disabled `--single-transaction` (CRITICAL FIX)
**File**: `internal/restore/engine.go`
- Line 186: `SingleTransaction: false` (was: true)
- Line 210: `SingleTransaction: false` (was: true)

**Impact**:
- No longer wraps entire restore in one transaction
- Each object restored in its own transaction
- Locks released incrementally (not held until end)
- Prevents lock table exhaustion

### 2. Removed `--exit-on-error` (CRITICAL FIX)
**File**: `internal/database/postgresql.go`
- Lines 375-378: Removed the append of `"--exit-on-error"` to the pg_restore argument list

**Impact**:
- PostgreSQL continues on ignorable errors (correct behavior)
- "already exists" errors logged but don't stop restore
- Final error count reported at end
- Only real errors cause failure

### 3. Kept Sequential Parallelism Detection
**File**: `internal/restore/engine.go`
- Lines 552-565: `detectLargeObjectsInDumps()` still active
- Automatically reduces cluster parallelism to 1 when BLOBs detected

**Impact**:
- Prevents multiple databases with large objects from competing for locks
- Sequential cluster restore = only one DB's large objects in lock table at a time
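The detection logic itself is not reproduced in this document; one workable approach (a sketch, assuming custom-format dumps) is to scan the `pg_restore -l` table of contents for large-object entries — the same signal `verify_backup_blobs.sh` below greps for:

```go
package restore

import (
	"context"
	"os/exec"
	"strings"
)

// dumpHasLargeObjects lists a dump's table of contents with pg_restore -l
// and reports whether any BLOB / large-object entries appear in it.
// Illustrative sketch; the real detectLargeObjectsInDumps is not shown here.
func dumpHasLargeObjects(ctx context.Context, dumpPath string) (bool, error) {
	out, err := exec.CommandContext(ctx, "pg_restore", "-l", dumpPath).Output()
	if err != nil {
		return false, err
	}
	toc := strings.ToUpper(string(out))
	return strings.Contains(toc, "BLOB") || strings.Contains(toc, "LARGE OBJECT"), nil
}
```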
## Why This Works

### Before (BROKEN):
```
START TRANSACTION;    -- Single transaction begins
CREATE TABLE ...      -- Lock acquired
CREATE INDEX ...      -- Lock acquired
RESTORE BLOB 1        -- Lock acquired
RESTORE BLOB 2        -- Lock acquired
...
RESTORE BLOB 35000    -- Lock acquired → EXHAUSTED!
ERROR: max_locks_per_transaction exceeded
ROLLBACK;             -- Everything fails
```

### After (FIXED):
```
BEGIN; CREATE TABLE ...; COMMIT;    -- Lock released
BEGIN; CREATE INDEX ...; COMMIT;    -- Lock released
BEGIN; RESTORE BLOB 1; COMMIT;      -- Lock released
BEGIN; RESTORE BLOB 2; COMMIT;      -- Lock released
...
BEGIN; RESTORE BLOB 35000; COMMIT;  -- Each only holds ~100 locks max
SUCCESS: All objects restored
```

## Testing Recommendations

### 1. Test with postgres database (backup_state error)
```bash
./dbbackup restore cluster /path/to/backup.tar.gz
# Should now skip "already exists" errors and continue
```

### 2. Test with resydb database (large objects)
```bash
# Check dump for large objects first
pg_restore -l resydb.dump | grep -i "blob\|large object"

# Restore should now work without lock exhaustion
./dbbackup restore cluster /path/to/backup.tar.gz
```

### 3. Monitor locks during restore
```sql
-- In another terminal while restore runs:
SELECT count(*) FROM pg_locks;
-- Should stay well below max_locks_per_transaction × max_connections
```

## Expected Behavior Now

### For "already exists" errors:
```
pg_restore: warning: object already exists: TYPE backup_state
pg_restore: warning: object already exists: FUNCTION ...
... (continues restoring) ...
pg_restore: total errors: 10 (all ignorable)
SUCCESS
```

### For large objects:
```
Restoring database resydb...
Large objects detected - using sequential restore
Restoring 35,000 large objects... (progress)
✓ Database resydb restored successfully
```

## Configuration Settings (Still Valid)

These PostgreSQL settings help but are NO LONGER REQUIRED with the fix:

```ini
# Still recommended for performance, not required for correctness:
max_locks_per_transaction = 256   # Provides headroom
maintenance_work_mem = 1GB        # Faster index creation
shared_buffers = 8GB              # Better caching
```

## Commit This Fix

```bash
git add internal/restore/engine.go internal/database/postgresql.go
git commit -m "CRITICAL FIX: Remove --single-transaction and --exit-on-error from pg_restore

- Disabled --single-transaction to prevent lock table exhaustion with large objects
- Removed --exit-on-error to allow PostgreSQL to skip ignorable errors
- Fixes 'could not open large object' errors (lock exhaustion)
- Fixes 'already exists' errors causing complete restore failure
- Each object now restored in its own transaction (locks released incrementally)
- PostgreSQL default behavior (continue on ignorable errors) is correct for restores

Per PostgreSQL docs: --single-transaction incompatible with large object restores
and causes lock table exhaustion with 1000+ objects."

git push
```
PHASE2_COMPLETION.md
@@ -1,247 +0,0 @@
# Phase 2 TUI Improvements - Completion Report

## Overview
Phase 2 of the TUI improvements adds professional, actionable UX features focused on transparency and error guidance. All features implemented without over-engineering.

## Implemented Features

### 1. Disk Space Pre-Flight Checks ✅
**Files:** `internal/checks/disk_check.go`

**Features:**
- Real-time filesystem stats using `syscall.Statfs_t`
- Three-tier status system:
  - **Critical** (≥95% used): Blocks operation
  - **Warning** (≥80% used): Warns but allows
  - **Sufficient** (<80% used): OK to proceed
- Smart space estimation:
  - Backups: Based on compression level
  - Restores: 4x archive size (decompression overhead)

**Integration:**
- `internal/backup/engine.go` - Pre-flight check before cluster backup
- `internal/restore/engine.go` - Pre-flight check before cluster restore
- Displays formatted message in CLI mode
- Logs warnings when space is tight

**Example Output:**
```
📊 Disk Space Check (OK):
   Path: /var/lib/pgsql/db_backups
   Total: 151.0 GiB
   Available: 66.0 GiB (55.0% used)
   ✓ Status: OK

✓ Sufficient space available
```
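A minimal sketch of how a check like this can be built on `syscall.Statfs` (Linux-specific; the type and function names here are illustrative, not the contents of the real `disk_check.go`):

```go
package checks

import "syscall"

type DiskStatus int

const (
	Sufficient DiskStatus = iota // <80% used: OK to proceed
	Warning                      // ≥80% used: warn but allow
	Critical                     // ≥95% used: block the operation
)

// checkDiskSpace reads filesystem stats for path and maps the usage
// percentage onto the three-tier status described above.
func checkDiskSpace(path string) (DiskStatus, uint64, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return Critical, 0, err
	}
	total := st.Blocks * uint64(st.Bsize)
	avail := st.Bavail * uint64(st.Bsize)
	usedPct := 100 * float64(total-avail) / float64(total)
	switch {
	case usedPct >= 95:
		return Critical, avail, nil
	case usedPct >= 80:
		return Warning, avail, nil
	default:
		return Sufficient, avail, nil
	}
}
```

A caller would compare `avail` against the estimated need (compressed size for backups, roughly 4x the archive size for restores) before proceeding.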
### 2. Error Classification & Hints ✅
**Files:** `internal/checks/error_hints.go`

**Features:**
- Smart error pattern matching (regex + substring)
- Four severity levels:
  - **Ignorable**: Objects already exist (normal)
  - **Warning**: Version mismatches
  - **Critical**: Lock exhaustion, permissions, connections
  - **Fatal**: Corrupted dumps, excessive errors

**Error Categories:**
- `duplicate`: Already exists (ignorable)
- `disk_space`: No space left on device
- `locks`: max_locks_per_transaction exhausted
- `corruption`: Syntax errors in dump file
- `permissions`: Permission denied, must be owner
- `network`: Connection refused, pg_hba.conf
- `version`: PostgreSQL version mismatch
- `unknown`: Unclassified errors

**Integration:**
- `internal/restore/engine.go` - Classify errors during restore
- Enhanced error logging with hints and actions
- Error messages include actionable solutions

**Example Error Classification:**
```
❌ CRITICAL Error

Category: locks
Message: ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction

💡 Hint: Lock table exhausted - typically caused by large objects in parallel restore

🔧 Action: Increase max_locks_per_transaction in postgresql.conf to 512 or higher
```
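A condensed sketch of how such classification can be implemented with substring matching (a subset of the documented categories; the type names are illustrative, not the real `error_hints.go`):

```go
package checks

import "strings"

type Severity int

const (
	Ignorable Severity = iota
	Warn
	Crit
	Fatal
)

type ErrorHint struct {
	Category string
	Severity Severity
	Hint     string
	Action   string
}

// classifyError maps a raw error message onto a category, severity,
// hint, and suggested action, falling back to "unknown".
func classifyError(msg string) ErrorHint {
	m := strings.ToLower(msg)
	switch {
	case strings.Contains(m, "already exists"):
		return ErrorHint{"duplicate", Ignorable,
			"Object already exists in target database - this is normal during restore",
			"No action needed - restore will continue"}
	case strings.Contains(m, "no space left"):
		return ErrorHint{"disk_space", Crit,
			"Insufficient disk space to complete operation",
			"Free up disk space or increase storage"}
	case strings.Contains(m, "max_locks_per_transaction"):
		return ErrorHint{"locks", Crit,
			"Lock table exhausted - typically caused by large objects",
			"Increase max_locks_per_transaction in postgresql.conf to 512"}
	case strings.Contains(m, "syntax error"):
		return ErrorHint{"corruption", Fatal,
			"Syntax error in dump file - backup may be corrupted",
			"Re-create the backup"}
	default:
		return ErrorHint{"unknown", Warn, "Unclassified error", "Check the logs for details"}
	}
}
```

The hints and actions mirror the mappings listed in the next section.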
### 3. Actionable Error Messages ✅

**Common Errors Mapped:**

1. **"already exists"**
   - Type: Ignorable
   - Hint: "Object already exists in target database - this is normal during restore"
   - Action: "No action needed - restore will continue"

2. **"no space left"**
   - Type: Critical
   - Hint: "Insufficient disk space to complete operation"
   - Action: "Free up disk space: rm old_backups/* or increase storage"

3. **"max_locks_per_transaction"**
   - Type: Critical
   - Hint: "Lock table exhausted - typically caused by large objects"
   - Action: "Increase max_locks_per_transaction in postgresql.conf to 512"

4. **"syntax error"**
   - Type: Fatal
   - Hint: "Syntax error in dump file - backup may be corrupted"
   - Action: "Re-create backup with: dbbackup backup single <database>"

5. **"permission denied"**
   - Type: Critical
   - Hint: "Insufficient permissions to perform operation"
   - Action: "Run as superuser or use --no-owner flag for restore"

6. **"connection refused"**
   - Type: Critical
   - Hint: "Cannot connect to database server"
   - Action: "Check database is running and pg_hba.conf allows connection"

## Architecture Decisions

### Separate `checks` Package
- **Why:** Avoid import cycles (backup/restore ↔ tui)
- **Location:** `internal/checks/`
- **Dependencies:** Only stdlib (`syscall`, `fmt`, `strings`)
- **Result:** Clean separation, no circular dependencies

### No Logger Dependency
- **Why:** Keep checks package lightweight
- **Alternative:** Callers log results as needed
- **Benefit:** Reusable in any context

### Three-Tier Status System
- **Why:** Clear visual indicators for users
- **Critical:** Red ❌ - Blocks operation
- **Warning:** Yellow ⚠️ - Warns but allows
- **Sufficient:** Green ✓ - OK to proceed

## Testing Status

### Background Test
**File:** `test_backup_restore.sh`
**Status:** ✅ Running (PID 1071950)

**Progress (as of last check):**
- ✅ Cluster backup complete: 17/17 databases
- ✅ d7030 backed up: 34GB with 35,000 large objects
- ✅ Large DBs handled: testdb_50gb (6.7GB) × 2
- 🔄 Creating compressed archive...
- ⏳ Next: Drop d7030 → Restore cluster → Verify BLOBs

**Validates:**
- Lock exhaustion fix (35K large objects)
- Ignorable error handling ("already exists")
- Ctrl+C cancellation
- Disk space handling (34GB backup)

## Performance Impact

### Disk Space Check
- **Cost:** ~1ms per check (single syscall)
- **When:** Once before backup/restore starts
- **Impact:** Negligible

### Error Classification
- **Cost:** String pattern matching per error
- **When:** Only when errors occur
- **Impact:** Minimal (errors already indicate slow path)

## User Experience Improvements

### Before Phase 2:
```
Error: restore failed: exit status 1 (total errors: 2500000)
```
❌ No hint what went wrong
❌ No actionable guidance
❌ Can't distinguish critical from ignorable errors

### After Phase 2:
```
📊 Disk Space Check (OK):
   Available: 66.0 GiB (55.0% used)
✓ Sufficient space available

[restore in progress...]

❌ CRITICAL Error
Category: locks
💡 Hint: Lock table exhausted - typically caused by large objects
🔧 Action: Increase max_locks_per_transaction to 512 or higher
```
✅ Clear disk status before starting
✅ Helpful error classification
✅ Actionable solution provided
✅ Professional, transparent UX

## Code Quality

### Test Coverage
- ✅ Compiles without warnings
- ✅ No import cycles
- ✅ Minimal dependencies
- ✅ Integrated into existing workflows

### Error Handling
- ✅ Graceful fallback if syscall fails
- ✅ Default classification for unknown errors
- ✅ Non-blocking in CLI mode

### Documentation
- ✅ Inline comments for all functions
- ✅ Clear struct field descriptions
- ✅ Usage examples in TUI_IMPROVEMENTS.md

## Next Steps (Phase 3)

### Real-Time Progress (Not Yet Implemented)
- Show bytes processed / total bytes
- Display transfer speed (MB/s)
- Update ETA based on actual speed
- Progress bars using Bubble Tea components

### Keyboard Shortcuts (Not Yet Implemented)
- `1-9`: Quick jump to menu options
- `q`: Quit application
- `r`: Refresh backup list
- `/`: Search/filter backups

### Enhanced Backup List (Not Yet Implemented)
- Show backup size, age, health
- Visual indicators for verification status
- Sort by date, size, name

## Git History
```
9d36b26 - Add Phase 2 TUI improvements: disk space checks and error hints
e95eeb7 - Add comprehensive TUI improvement plan and background test script
c31717c - Add Ctrl+C interrupt handling for cluster operations
[previous commits...]
```

## Summary

Phase 2 delivers on the core promise: **transparent, actionable, professional UX without over-engineering.**

**Key Achievements:**
- ✅ Pre-flight disk space validation prevents "100% full" surprises
- ✅ Smart error classification distinguishes critical from ignorable
- ✅ Actionable hints provide specific solutions, not generic messages
- ✅ Zero performance impact (checks run once, errors already slow)
- ✅ Clean architecture (no import cycles, minimal dependencies)
- ✅ Integrated seamlessly into existing workflows

**User Impact:**
Users now see what's happening, why errors occur, and exactly how to fix them. No more mysterious failures or cryptic messages.
TUI_IMPROVEMENTS.md
@@ -1,250 +0,0 @@
# Interactive TUI Experience Improvements

## Current Issues & Solutions

### 1. **Progress Visibility During Long Operations**

**Problem**: Cluster backup/restore with large databases (40GB+) takes 30+ minutes with minimal feedback.

**Solutions**:
- ✅ Show current database being processed
- ✅ Display database size before backup/restore starts
- ✅ ETA estimator for multi-database operations
- 🔄 **NEW**: Real-time progress bar per database (bytes processed / total bytes)
- 🔄 **NEW**: Show current operation speed (MB/s)
- 🔄 **NEW**: Percentage complete for entire cluster operation

### 2. **Error Handling & Recovery**

**Problem**: When a restore fails (like resydb with 2.5M errors), the user has no context about WHY or WHAT to do.

**Solutions**:
- ✅ Distinguish ignorable errors (already exists) from critical errors
- 🔄 **NEW**: Show error classification in TUI:
  ```
  ⚠️ WARNING: 5 ignorable errors (objects already exist)
  ❌ CRITICAL: Syntax errors detected - dump file may be corrupted
  💡 HINT: Re-create backup with: dbbackup backup single resydb
  ```
- 🔄 **NEW**: Offer retry option for failed databases
- 🔄 **NEW**: Skip vs Abort choice for non-critical failures

### 3. **Large Object Detection Feedback**

**Problem**: User doesn't know WHY parallelism was reduced.

**Solution**:
```
🔍 Scanning cluster backup for large objects...
   ✓ postgres: No large objects
   ⚠️ d7030: 35,000 BLOBs detected (42GB)

⚙️ Automatically reducing parallelism: 2 → 1 (sequential)
💡 Reason: Large objects require exclusive lock table access
```

### 4. **Disk Space Warnings**

**Problem**: Backup fails silently when disk is full.

**Solutions**:
- 🔄 **NEW**: Pre-flight check before backup:
  ```
  📊 Disk Space Check:
     Database size: 42GB
     Available space: 66GB
     Estimated backup: ~15GB (compressed)
     ✓ Sufficient space available
  ```
- 🔄 **NEW**: Warning at 80% disk usage
- 🔄 **NEW**: Block operation at 95% disk usage

### 5. **Cancellation Handling (Ctrl+C)**

**Problem**: Users don't know if Ctrl+C will work or leave partial backups.

**Solutions**:
- ✅ Graceful cancellation on Ctrl+C
- 🔄 **NEW**: Show cleanup message (see the sketch after this list):
  ```
  ^C received - Cancelling backup...
  🧹 Cleaning up temporary files...
  ✓ Cleanup complete - no partial backups left
  ```
- 🔄 **NEW**: Confirmation prompt for cluster operations:
  ```
  ⚠️ Cluster backup in progress (3/10 databases)
  Are you sure you want to cancel? (y/N)
  ```
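A minimal sketch of the cancellation wiring behind this (standard `os/signal` usage; `runBackup` and the cleanup step are placeholders, not the tool's actual code):

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	// Cancel the operation's context on SIGINT (Ctrl+C) or SIGTERM.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()

	if err := runBackup(ctx); err != nil {
		if ctx.Err() != nil {
			fmt.Println("^C received - Cancelling backup...")
			fmt.Println("🧹 Cleaning up temporary files...")
			// remove partial archives here before exiting
			fmt.Println("✓ Cleanup complete - no partial backups left")
			return
		}
		fmt.Println("backup failed:", err)
	}
}

// runBackup stands in for the real backup loop; it checks the
// context between units of work so cancellation is prompt.
func runBackup(ctx context.Context) error {
	for i := 1; i <= 10; i++ {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(time.Second): // one unit of backup work
			fmt.Printf("database %d/10 done\n", i)
		}
	}
	return nil
}
```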
### 6. **Interactive Mode Navigation**

**Problem**: TUI menu is basic, no keyboard shortcuts, no search.

**Solutions**:
- 🔄 **NEW**: Keyboard shortcuts:
  - `1-9`: Quick jump to menu items
  - `q`: Quit
  - `r`: Refresh status
  - `/`: Search backups
- 🔄 **NEW**: Backup list improvements:
  ```
  📦 Available Backups:

  1. cluster_20251118_103045.tar.gz  [45GB]  ⏱ 2 hours ago
     ├─ postgres (325MB)
     ├─ d7030 (42GB) ⚠️ 35K BLOBs
     └─ template1 (8MB)

  2. cluster_20251112_084329.tar.gz  [38GB]  ⏱ 6 days ago
     └─ ⚠️ WARNING: May contain corrupted resydb dump
  ```
- 🔄 **NEW**: Filter/sort options: by date, by size, by status

### 7. **Configuration Recommendations**

**Problem**: Users don't know optimal settings for their workload.

**Solutions**:
- 🔄 **NEW**: Auto-detect and suggest settings on first run:
  ```
  🔧 System Configuration Detected:
     RAM: 32GB → Recommended: shared_buffers=8GB
     CPUs: 4 cores → Recommended: parallel_jobs=3
     Disk: 66GB free → Recommended: max backup size: 50GB

  Apply these settings? (Y/n)
  ```
- 🔄 **NEW**: Show current vs recommended config in menu:
  ```
  ⚙️ Configuration Status:
     max_locks_per_transaction: 256 ✓ (sufficient for 35K objects)
     maintenance_work_mem: 64MB ⚠️ (recommend: 1GB for faster restores)
     shared_buffers: 128MB ⚠️ (recommend: 8GB with 32GB RAM)
  ```

### 8. **Backup Verification & Health**

**Problem**: No way to verify backup integrity before restore.

**Solutions**:
- 🔄 **NEW**: Add "Verify Backup" menu option:
  ```
  🔍 Verifying backup: cluster_20251118_103045.tar.gz
     ✓ Archive integrity: OK
     ✓ Extracting metadata...
     ✓ Checking dump formats...

  Databases found:
     ✓ postgres: Custom format, 325MB
     ✓ d7030: Custom format, 42GB, 35,000 BLOBs
     ⚠️ resydb: CORRUPTED - 2.5M syntax errors detected

  Overall: ⚠️ Partial (2/3 databases healthy)
  ```
- 🔄 **NEW**: Show last backup status in main menu

### 9. **Restore Dry Run**

**Problem**: No preview of what will be restored.

**Solution**:
```
🎬 Restore Preview (Dry Run):

Target: cluster_20251118_103045.tar.gz
Databases to restore:
  1. postgres (325MB)
     - Will overwrite: 5 existing objects
     - New objects: 120

  2. d7030 (42GB, 35K BLOBs)
     - Will DROP and recreate database
     - Estimated time: 25-30 minutes
     - Required locks: 35,000 (available: 25,600) ⚠️

⚠️ WARNING: Insufficient locks for d7030
💡 Solution: Increase max_locks_per_transaction to 512

Proceed with restore? (y/N)
```

### 10. **Multi-Step Wizards**

**Problem**: Complex operations (like cluster restore with --clean) need multiple confirmations.

**Solution**: Step-by-step wizard:
```
Step 1/4: Select backup
Step 2/4: Review databases to restore
Step 3/4: Check prerequisites (disk space, locks, etc.)
Step 4/4: Confirm and execute
```

## Implementation Priority

### Phase 1 (High Impact, Low Effort) ✅
- ✅ ETA estimators
- ✅ Large object detection warnings
- ✅ Ctrl+C handling
- ✅ Ignorable error detection

### Phase 2 (High Impact, Medium Effort) 🔄
- Real-time progress bars with MB/s
- Disk space pre-flight checks
- Backup verification tool
- Error hints and suggestions

### Phase 3 (Quality of Life) 🔄
- Keyboard shortcuts
- Backup list with metadata
- Configuration recommendations
- Restore dry run

### Phase 4 (Advanced) 📋
- Multi-step wizards
- Search/filter backups
- Auto-retry failed databases
- Parallel restore progress split-view

## Code Structure

```
internal/tui/
  menu.go          - Main interactive menu
  backup_menu.go   - Backup wizard
  restore_menu.go  - Restore wizard
  verify_menu.go   - Backup verification (NEW)
  config_menu.go   - Configuration tuning (NEW)
  progress_view.go - Real-time progress display (ENHANCED)
  errors.go        - Error classification & hints (NEW)
```

## Testing Plan

1. **Large Database Test** (In Progress)
   - 42GB d7030 with 35K BLOBs
   - Verify progress updates
   - Verify large object detection
   - Verify successful restore

2. **Error Scenarios**
   - Corrupted dump file
   - Insufficient disk space
   - Insufficient locks
   - Network interruption
   - Ctrl+C during operations

3. **Performance**
   - Backup time vs raw pg_dump
   - Restore time vs raw pg_restore
   - Memory usage during 40GB+ operations
   - CPU utilization with parallel workers

## Success Metrics

- ✅ No "black box" operations - user always knows what's happening
- ✅ Errors are actionable - user knows what to fix
- ✅ Safe operations - confirmations for destructive actions
- ✅ Fast feedback - progress updates every 1-2 seconds
- ✅ Professional feel - polished, consistent, intuitive
create_d7030_test.sh
@@ -1,281 +0,0 @@
#!/usr/bin/env bash
# create_d7030_test.sh
# Create a realistic d7030 database with tables, data, and many BLOBs to test large object restore

set -euo pipefail

DB_NAME="d7030"
NUM_DOCUMENTS=15000   # Number of documents with BLOBs (~15GB at ~1MB each)
NUM_IMAGES=10000      # Number of image records (~15GB for images + ~2GB thumbnails)
# Total BLOBs: 35,000 large objects (one per document, two per image)
# Approximate size: 15000*1MB + 10000*1.5MB + 10000*200KB = ~32GB in BLOBs alone
# (plus tables, indexes, and overhead)

echo "Creating database: $DB_NAME"

# Drop if exists
sudo -u postgres psql -c "DROP DATABASE IF EXISTS $DB_NAME;" 2>/dev/null || true

# Create database
sudo -u postgres psql -c "CREATE DATABASE $DB_NAME;"

echo "Creating schema and tables..."

# Enable pgcrypto extension for gen_random_bytes
sudo -u postgres psql -d "$DB_NAME" -c "CREATE EXTENSION IF NOT EXISTS pgcrypto;"

# Create schema with realistic business tables
sudo -u postgres psql -d "$DB_NAME" <<'EOF'
-- Create tables for a document management system
CREATE TABLE departments (
    dept_id SERIAL PRIMARY KEY,
    dept_name VARCHAR(100) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE employees (
    emp_id SERIAL PRIMARY KEY,
    dept_id INTEGER REFERENCES departments(dept_id),
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    hire_date DATE DEFAULT CURRENT_DATE
);

CREATE TABLE document_types (
    type_id SERIAL PRIMARY KEY,
    type_name VARCHAR(50) NOT NULL,
    description TEXT
);

-- Table with large objects (BLOBs)
CREATE TABLE documents (
    doc_id SERIAL PRIMARY KEY,
    emp_id INTEGER REFERENCES employees(emp_id),
    type_id INTEGER REFERENCES document_types(type_id),
    title VARCHAR(255) NOT NULL,
    description TEXT,
    file_data OID,          -- Large object reference
    file_size INTEGER,
    mime_type VARCHAR(100),
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE images (
    image_id SERIAL PRIMARY KEY,
    doc_id INTEGER REFERENCES documents(doc_id),
    image_name VARCHAR(255),
    image_data OID,         -- Large object reference
    thumbnail_data OID,     -- Another large object
    width INTEGER,
    height INTEGER,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE audit_log (
    log_id SERIAL PRIMARY KEY,
    table_name VARCHAR(50),
    record_id INTEGER,
    action VARCHAR(20),
    changed_by INTEGER,
    changed_at TIMESTAMP DEFAULT NOW(),
    details JSONB
);

-- Create indexes
CREATE INDEX idx_documents_emp ON documents(emp_id);
CREATE INDEX idx_documents_type ON documents(type_id);
CREATE INDEX idx_images_doc ON images(doc_id);
CREATE INDEX idx_audit_table ON audit_log(table_name, record_id);

-- Insert reference data
INSERT INTO departments (dept_name) VALUES
    ('Engineering'), ('Sales'), ('Marketing'), ('HR'), ('Finance');

INSERT INTO document_types (type_name, description) VALUES
    ('Contract', 'Legal contracts and agreements'),
    ('Invoice', 'Financial invoices and receipts'),
    ('Report', 'Business reports and analysis'),
    ('Manual', 'Technical manuals and guides'),
    ('Presentation', 'Presentation slides and materials');

-- Insert employees
INSERT INTO employees (dept_id, first_name, last_name, email)
SELECT
    (random() * 4 + 1)::INTEGER,
    'Employee_' || generate_series,
    'LastName_' || generate_series,
    'employee' || generate_series || '@d7030.com'
FROM generate_series(1, 50);

EOF
echo "Inserting documents with large objects (BLOBs)..."
|
||||
echo "This will take several minutes to create ~25GB of data..."
|
||||
|
||||
# Create temporary files with random data for importing in postgres home
|
||||
# Make documents larger for 25GB target: ~1MB each
|
||||
TEMP_FILE="/var/lib/pgsql/test_blob_data.bin"
|
||||
sudo dd if=/dev/urandom of="$TEMP_FILE" bs=1M count=1 2>/dev/null
|
||||
sudo chown postgres:postgres "$TEMP_FILE"
|
||||
|
||||
# Create documents with actual large objects using lo_import
|
||||
sudo -u postgres psql -d "$DB_NAME" <<EOF
|
||||
DO \$\$
|
||||
DECLARE
|
||||
v_emp_id INTEGER;
|
||||
v_type_id INTEGER;
|
||||
v_loid OID;
|
||||
BEGIN
|
||||
FOR i IN 1..$NUM_DOCUMENTS LOOP
|
||||
-- Random employee and document type
|
||||
v_emp_id := (random() * 49 + 1)::INTEGER;
|
||||
v_type_id := (random() * 4 + 1)::INTEGER;
|
||||
|
||||
-- Import file as large object (creates a unique BLOB for each)
|
||||
v_loid := lo_import('$TEMP_FILE');
|
||||
|
||||
-- Insert document record
|
||||
INSERT INTO documents (emp_id, type_id, title, description, file_data, file_size, mime_type)
|
||||
VALUES (
|
||||
v_emp_id,
|
||||
v_type_id,
|
||||
'Document_' || i || '_' || (CASE v_type_id
|
||||
WHEN 1 THEN 'Contract'
|
||||
WHEN 2 THEN 'Invoice'
|
||||
WHEN 3 THEN 'Report'
|
||||
WHEN 4 THEN 'Manual'
|
||||
ELSE 'Presentation'
|
||||
END),
|
||||
'This is a test document with large object data. Document number ' || i,
|
||||
v_loid,
|
||||
1048576,
|
||||
(CASE v_type_id
|
||||
WHEN 1 THEN 'application/pdf'
|
||||
WHEN 2 THEN 'application/pdf'
|
||||
WHEN 3 THEN 'application/vnd.ms-excel'
|
||||
WHEN 4 THEN 'application/pdf'
|
||||
ELSE 'application/vnd.ms-powerpoint'
|
||||
END)
|
||||
);
|
||||
|
||||
-- Progress indicator
|
||||
IF i % 500 = 0 THEN
|
||||
RAISE NOTICE 'Created % documents with BLOBs...', i;
|
||||
END IF;
|
||||
END LOOP;
|
||||
END \$\$;
|
||||
EOF
|
||||
|
||||
rm -f "$TEMP_FILE"
|
||||
|
||||
echo "Inserting images with large objects..."
|
||||
|
||||
# Create temp files for image and thumbnail in postgres home
|
||||
# Make images larger: ~1.5MB for full image, ~200KB for thumbnail
|
||||
TEMP_IMAGE="/var/lib/pgsql/test_image_data.bin"
|
||||
TEMP_THUMB="/var/lib/pgsql/test_thumb_data.bin"
|
||||
sudo dd if=/dev/urandom of="$TEMP_IMAGE" bs=1M count=1 bs=512K count=3 2>/dev/null
|
||||
sudo dd if=/dev/urandom of="$TEMP_THUMB" bs=1K count=200 2>/dev/null
|
||||
sudo chown postgres:postgres "$TEMP_IMAGE" "$TEMP_THUMB"
|
||||
|
||||
# Create images with multiple large objects per record
|
||||
sudo -u postgres psql -d "$DB_NAME" <<EOF
|
||||
DO \$\$
|
||||
DECLARE
|
||||
v_doc_id INTEGER;
|
||||
v_image_oid OID;
|
||||
v_thumb_oid OID;
|
||||
BEGIN
|
||||
FOR i IN 1..$NUM_IMAGES LOOP
|
||||
-- Random document (only from successfully created documents)
|
||||
SELECT doc_id INTO v_doc_id FROM documents ORDER BY random() LIMIT 1;
|
||||
|
||||
IF v_doc_id IS NULL THEN
|
||||
EXIT; -- No documents exist, skip images
|
||||
END IF;
|
||||
|
||||
-- Import full-size image as large object
|
||||
v_image_oid := lo_import('$TEMP_IMAGE');
|
||||
|
||||
-- Import thumbnail as large object
|
||||
v_thumb_oid := lo_import('$TEMP_THUMB');
|
||||
|
||||
-- Insert image record
|
||||
INSERT INTO images (doc_id, image_name, image_data, thumbnail_data, width, height)
|
||||
VALUES (
|
||||
v_doc_id,
|
||||
'Image_' || i || '.jpg',
|
||||
v_image_oid,
|
||||
v_thumb_oid,
|
||||
(random() * 2000 + 800)::INTEGER,
|
||||
(random() * 1500 + 600)::INTEGER
|
||||
);
|
||||
|
||||
IF i % 500 = 0 THEN
|
||||
RAISE NOTICE 'Created % images with BLOBs...', i;
|
||||
END IF;
|
||||
END LOOP;
|
||||
END \$\$;
|
||||
EOF
|
||||
|
||||
rm -f "$TEMP_IMAGE" "$TEMP_THUMB"
|
||||
|
||||
echo "Inserting audit log data..."
|
||||
|
||||
# Create audit log entries
|
||||
sudo -u postgres psql -d "$DB_NAME" <<EOF
|
||||
INSERT INTO audit_log (table_name, record_id, action, changed_by, details)
|
||||
SELECT
|
||||
'documents',
|
||||
doc_id,
|
||||
(ARRAY['INSERT', 'UPDATE', 'VIEW'])[(random() * 2 + 1)::INTEGER],
|
||||
(random() * 49 + 1)::INTEGER,
|
||||
jsonb_build_object(
|
||||
'timestamp', NOW() - (random() * INTERVAL '90 days'),
|
||||
'ip_address', '192.168.' || (random() * 255)::INTEGER || '.' || (random() * 255)::INTEGER,
|
||||
'user_agent', 'Mozilla/5.0'
|
||||
)
|
||||
FROM documents
|
||||
CROSS JOIN generate_series(1, 3);
|
||||
EOF
|
||||
|
||||
echo ""
|
||||
echo "Database statistics:"
|
||||
sudo -u postgres psql -d "$DB_NAME" <<'EOF'
|
||||
SELECT
|
||||
'Departments' as table_name,
|
||||
COUNT(*) as row_count
|
||||
FROM departments
|
||||
UNION ALL
|
||||
SELECT 'Employees', COUNT(*) FROM employees
|
||||
UNION ALL
|
||||
SELECT 'Document Types', COUNT(*) FROM document_types
|
||||
UNION ALL
|
||||
SELECT 'Documents (with BLOBs)', COUNT(*) FROM documents
|
||||
UNION ALL
|
||||
SELECT 'Images (with BLOBs)', COUNT(*) FROM images
|
||||
UNION ALL
|
||||
SELECT 'Audit Log', COUNT(*) FROM audit_log;
|
||||
|
||||
-- Count large objects
|
||||
SELECT COUNT(*) as total_large_objects FROM pg_largeobject_metadata;
|
||||
|
||||
-- Total size of large objects
|
||||
SELECT pg_size_pretty(SUM(pg_column_size(data))) as total_blob_size
|
||||
FROM pg_largeobject;
|
||||
EOF
|
||||
|
||||
echo ""
|
||||
echo "✅ Database $DB_NAME created successfully with realistic data and BLOBs!"
|
||||
echo ""
|
||||
echo "Large objects created:"
|
||||
echo " - $NUM_DOCUMENTS documents (each with ~1MB BLOB)"
|
||||
echo " - $NUM_IMAGES images (each with 2 BLOBs: ~1.5MB image + ~200KB thumbnail)"
|
||||
echo " - Total: ~$((NUM_DOCUMENTS + NUM_IMAGES * 2)) large objects"
|
||||
echo ""
|
||||
echo "Estimated size: ~$((NUM_DOCUMENTS * 1 + NUM_IMAGES * 1 + NUM_IMAGES * 0))MB in BLOBs"
|
||||
echo ""
|
||||
echo "You can now backup this database and test restore with large object locks."
|
||||
fix_max_locks.sh
@@ -1,58 +0,0 @@
#!/usr/bin/env bash
# fix_max_locks.sh
# Safely update max_locks_per_transaction in postgresql.conf and restart PostgreSQL
# Usage: sudo ./fix_max_locks.sh [NEW_VALUE]

set -euo pipefail
NEW_VALUE=${1:-256}

CONFIG_FILE="/var/lib/pgsql/data/postgresql.conf"
BACKUP_FILE="${CONFIG_FILE}.bak.$(date +%s)"

echo "PostgreSQL config file: $CONFIG_FILE"

# Create a backup
sudo cp "$CONFIG_FILE" "$BACKUP_FILE"
echo "Backup written to $BACKUP_FILE"

# Check if setting exists (commented or not)
if sudo grep -qE "^\s*#?\s*max_locks_per_transaction\s*=" "$CONFIG_FILE"; then
    echo "Updating existing max_locks_per_transaction to $NEW_VALUE"
    # Replace the line (whether commented or not)
    sudo sed -i "s/^\s*#\?\s*max_locks_per_transaction\s*=.*/max_locks_per_transaction = $NEW_VALUE/" "$CONFIG_FILE"
else
    echo "Adding max_locks_per_transaction = $NEW_VALUE to config"
    # Append at the end
    echo "" | sudo tee -a "$CONFIG_FILE" >/dev/null
    echo "# Increased by fix_max_locks.sh on $(date)" | sudo tee -a "$CONFIG_FILE" >/dev/null
    echo "max_locks_per_transaction = $NEW_VALUE" | sudo tee -a "$CONFIG_FILE" >/dev/null
fi

# Ensure correct permissions
sudo chown postgres:postgres "$CONFIG_FILE"
sudo chmod 600 "$CONFIG_FILE"

# Test the config before restarting
echo "Testing PostgreSQL config..."
sudo -u postgres /usr/bin/postgres -D /var/lib/pgsql/data -C max_locks_per_transaction 2>&1 | head -5

# Restart PostgreSQL and verify
echo "Restarting PostgreSQL service..."
sudo systemctl restart postgresql
sleep 3

if sudo systemctl is-active --quiet postgresql; then
    echo "✅ PostgreSQL restarted successfully"
    sudo -u postgres psql -c "SHOW max_locks_per_transaction;"
else
    echo "❌ PostgreSQL failed to start!"
    echo "Restoring backup..."
    sudo cp "$BACKUP_FILE" "$CONFIG_FILE"
    sudo systemctl start postgresql
    echo "Original config restored. Check /var/log/postgresql for errors."
    exit 1
fi

echo ""
echo "Success! Backup available at: $BACKUP_FILE"
exit 0
test_backup_restore.sh
@@ -1,51 +0,0 @@
#!/bin/bash
set -e

LOG="/var/lib/pgsql/dbbackup_test.log"

echo "=== Database Backup/Restore Test ===" | tee $LOG
echo "Started: $(date)" | tee -a $LOG
echo "" | tee -a $LOG

cd /root/dbbackup

# Step 1: Cluster Backup
echo "STEP 1: Creating cluster backup..." | tee -a $LOG
sudo -u postgres ./dbbackup backup cluster --backup-dir /var/lib/pgsql/db_backups 2>&1 | tee -a $LOG
BACKUP_FILE=$(ls -t /var/lib/pgsql/db_backups/cluster_*.tar.gz | head -1)
echo "Backup created: $BACKUP_FILE" | tee -a $LOG
echo "Backup size: $(ls -lh $BACKUP_FILE | awk '{print $5}')" | tee -a $LOG
echo "" | tee -a $LOG

# Step 2: Drop d7030 database to prepare for restore test
echo "STEP 2: Dropping d7030 database for clean restore test..." | tee -a $LOG
sudo -u postgres psql -d postgres -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'd7030' AND pid <> pg_backend_pid();" 2>&1 | tee -a $LOG
sudo -u postgres psql -d postgres -c "DROP DATABASE IF EXISTS d7030;" 2>&1 | tee -a $LOG
echo "d7030 database dropped" | tee -a $LOG
echo "" | tee -a $LOG

# Step 3: Cluster Restore
echo "STEP 3: Restoring cluster from backup..." | tee -a $LOG
sudo -u postgres ./dbbackup restore cluster $BACKUP_FILE --backup-dir /var/lib/pgsql/db_backups 2>&1 | tee -a $LOG
echo "Restore completed" | tee -a $LOG
echo "" | tee -a $LOG

# Step 4: Verify restored data
echo "STEP 4: Verifying restored databases..." | tee -a $LOG
sudo -u postgres psql -d postgres -c "\l" 2>&1 | tee -a $LOG
echo "" | tee -a $LOG
echo "Checking d7030 large objects..." | tee -a $LOG
BLOB_COUNT=$(sudo -u postgres psql -d d7030 -t -c "SELECT count(*) FROM pg_largeobject_metadata;" 2>/dev/null || echo "0")
echo "Large objects in d7030: $BLOB_COUNT" | tee -a $LOG
echo "" | tee -a $LOG

# Step 5: Cleanup
echo "STEP 5: Cleaning up test backup..." | tee -a $LOG
rm -f $BACKUP_FILE
echo "Backup file deleted: $BACKUP_FILE" | tee -a $LOG
echo "" | tee -a $LOG

echo "=== TEST COMPLETE ===" | tee -a $LOG
echo "Finished: $(date)" | tee -a $LOG
echo "" | tee -a $LOG
echo "✅ Full test log available at: $LOG"
test_build
Binary file not shown.
verify_backup_blobs.sh
@@ -1,57 +0,0 @@
#!/bin/bash
# Verify that backup contains large objects (BLOBs)

if [ $# -eq 0 ]; then
    echo "Usage: $0 <backup_file.dump>"
    echo "Example: $0 /var/lib/pgsql/db_backups/d7030.dump"
    exit 1
fi

BACKUP_FILE="$1"

if [ ! -f "$BACKUP_FILE" ]; then
    echo "Error: File not found: $BACKUP_FILE"
    exit 1
fi

echo "========================================="
echo "Backup BLOB/Large Object Verification"
echo "========================================="
echo "File: $BACKUP_FILE"
echo ""

# Check if file is a valid PostgreSQL dump
echo "1. Checking dump file format..."
if pg_restore -l "$BACKUP_FILE" > /dev/null 2>&1; then
    echo "   ✅ Valid PostgreSQL custom format dump"
else
    echo "   ❌ Not a valid pg_dump custom format file"
    exit 1
fi

# List table of contents and look for BLOB entries
echo ""
echo "2. Checking for BLOB/Large Object entries..."
BLOB_COUNT=$(pg_restore -l "$BACKUP_FILE" | grep -i "BLOB\|LARGE OBJECT" | wc -l)

if [ "$BLOB_COUNT" -gt 0 ]; then
    echo "   ✅ Found $BLOB_COUNT large object entries in backup"
    echo ""
    echo "   Sample entries:"
    pg_restore -l "$BACKUP_FILE" | grep -i "BLOB\|LARGE OBJECT" | head -10
else
    echo "   ⚠️ No large object entries found"
    echo "   This could mean:"
    echo "     - Database has no large objects (normal)"
    echo "     - Backup was created without --blobs flag (problem)"
fi

echo ""
echo "3. Full table of contents summary..."
pg_restore -l "$BACKUP_FILE" | tail -20

echo ""
echo "========================================="
echo "Verification complete"
echo "========================================="