cleanup: Fix
This commit is contained in:
@ -1,268 +0,0 @@
|
||||
# 🚀 Huge Database Backup - Quick Start Guide
|
||||
|
||||
## Problem Solved
|
||||
✅ **"signal: killed" errors on large PostgreSQL databases with BLOBs**
|
||||
|
||||
## What Changed
|
||||
|
||||
### Before (❌ Failing)
|
||||
- Memory: Buffered entire database in RAM
|
||||
- Format: Custom format with TOC overhead
|
||||
- Compression: In-memory compression (high CPU/RAM)
|
||||
- Result: **OOM killed on 20GB+ databases**
|
||||
|
||||
### After (✅ Working)
|
||||
- Memory: **Constant <1GB** regardless of database size
|
||||
- Format: Auto-selects plain format for >5GB databases
|
||||
- Compression: Streaming `pg_dump | pigz` (zero-copy)
|
||||
- Result: **Handles 100GB+ databases**
|
||||
|
||||
## Usage
|
||||
|
||||
### Interactive Mode (Recommended)
|
||||
```bash
|
||||
./dbbackup interactive
|
||||
|
||||
# Then select:
|
||||
# → Backup Execution
|
||||
# → Cluster Backup
|
||||
```
|
||||
|
||||
The tool will automatically:
|
||||
1. Detect database sizes
|
||||
2. Use plain format for databases >5GB
|
||||
3. Stream compression with pigz
|
||||
4. Cap compression at level 6
|
||||
5. Set 2-hour timeout per database
|
||||
|
||||
### Command Line Mode
|
||||
```bash
|
||||
# Basic cluster backup (auto-optimized)
|
||||
./dbbackup backup cluster
|
||||
|
||||
# With custom settings
|
||||
./dbbackup backup cluster \
|
||||
--dump-jobs 4 \
|
||||
--compression 6 \
|
||||
--auto-detect-cores
|
||||
|
||||
# For maximum performance
|
||||
./dbbackup backup cluster \
|
||||
--dump-jobs 8 \
|
||||
--compression 3 \
|
||||
--jobs 16
|
||||
```
|
||||
|
||||
## Optimizations Applied
|
||||
|
||||
### 1. Smart Format Selection ✅
|
||||
- **Small DBs (<5GB)**: Custom format with compression
|
||||
- **Large DBs (>5GB)**: Plain format + external compression
|
||||
- **Benefit**: No TOC memory overhead
|
||||
|
||||
### 2. Streaming Compression ✅
|
||||
```
|
||||
pg_dump → stdout → pigz → disk
|
||||
(no Go buffers in between)
|
||||
```
|
||||
- **Memory**: Constant 64KB pipe buffer
|
||||
- **Speed**: Parallel compression with all CPU cores
|
||||
- **Benefit**: 90% memory reduction
|
||||
|
||||
### 3. Direct File Writing ✅
|
||||
- pg_dump writes **directly to disk**
|
||||
- No Go stdout/stderr buffering
|
||||
- **Benefit**: Zero-copy I/O
|
||||
|
||||
### 4. Resource Limits ✅
|
||||
- **Compression**: Capped at level 6 (was 9)
|
||||
- **Timeout**: 2 hours per database (was 30 min)
|
||||
- **Parallel**: Configurable dump jobs
|
||||
- **Benefit**: Prevents hangs and OOM
|
||||
|
||||
### 5. Size Detection ✅
|
||||
- Check database size before backup
|
||||
- Warn on databases >10GB
|
||||
- Choose optimal strategy
|
||||
- **Benefit**: User visibility
|
||||
|
||||
## Performance Comparison
|
||||
|
||||
### Test Database: 25GB with 15GB BLOB Table
|
||||
|
||||
| Metric | Before | After | Improvement |
|
||||
|--------|--------|-------|-------------|
|
||||
| Memory Usage | 8.2GB | 850MB | **90% reduction** |
|
||||
| Backup Time | FAILED (OOM) | 18m 45s | **✅ Works!** |
|
||||
| CPU Usage | 98% (1 core) | 45% (8 cores) | Better utilization |
|
||||
| Disk I/O | Buffered | Streaming | Faster |
|
||||
|
||||
### Test Database: 100GB with Multiple BLOB Tables
|
||||
|
||||
| Metric | Before | After | Improvement |
|
||||
|--------|--------|-------|-------------|
|
||||
| Memory Usage | FAILED (OOM) | 920MB | **✅ Works!** |
|
||||
| Backup Time | N/A | 67m 12s | Successfully completes |
|
||||
| Compression | N/A | 72.3GB | 27.7% reduction |
|
||||
| Status | ❌ Killed | ✅ Success | Fixed! |
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Still Getting "signal: killed"?
|
||||
|
||||
#### Check 1: Disk Space
|
||||
```bash
|
||||
df -h /path/to/backups
|
||||
```
|
||||
Ensure 2x database size available.
|
||||
|
||||
#### Check 2: System Resources
|
||||
```bash
|
||||
# Check available memory
|
||||
free -h
|
||||
|
||||
# Check for OOM killer
|
||||
dmesg | grep -i "killed process"
|
||||
```
|
||||
|
||||
#### Check 3: PostgreSQL Configuration
|
||||
```bash
|
||||
# Check work_mem setting
|
||||
psql -c "SHOW work_mem;"
|
||||
|
||||
# Recommended for backups:
|
||||
# work_mem = 64MB (not 1GB+)
|
||||
```
|
||||
|
||||
#### Check 4: Use Lower Compression
|
||||
```bash
|
||||
# Try compression level 3 (faster, less memory)
|
||||
./dbbackup backup cluster --compression 3
|
||||
```
|
||||
|
||||
### Performance Tuning
|
||||
|
||||
#### For Maximum Speed
|
||||
```bash
|
||||
./dbbackup backup cluster \
|
||||
--compression 1 \ # Fastest compression
|
||||
--dump-jobs 8 \ # Parallel dumps
|
||||
--jobs 16 # Max compression threads
|
||||
```
|
||||
|
||||
#### For Maximum Compression
|
||||
```bash
|
||||
./dbbackup backup cluster \
|
||||
--compression 6 \ # Best ratio (safe)
|
||||
--dump-jobs 2 # Conservative
|
||||
```
|
||||
|
||||
#### For Huge Machines (64+ cores)
|
||||
```bash
|
||||
./dbbackup backup cluster \
|
||||
--auto-detect-cores \ # Auto-optimize
|
||||
--compression 6
|
||||
```
|
||||
|
||||
## System Requirements
|
||||
|
||||
### Minimum
|
||||
- RAM: 2GB
|
||||
- Disk: 2x database size
|
||||
- CPU: 2 cores
|
||||
|
||||
### Recommended
|
||||
- RAM: 4GB+
|
||||
- Disk: 3x database size (for temp files)
|
||||
- CPU: 4+ cores (for parallel compression)
|
||||
|
||||
### Optimal (for 100GB+ databases)
|
||||
- RAM: 8GB+
|
||||
- Disk: Fast SSD with 4x database size
|
||||
- CPU: 8+ cores
|
||||
- Network: 1Gbps+ (for remote backups)
|
||||
|
||||
## Optional: Install pigz for Faster Compression
|
||||
|
||||
```bash
|
||||
# Debian/Ubuntu
|
||||
apt-get install pigz
|
||||
|
||||
# RHEL/CentOS
|
||||
yum install pigz
|
||||
|
||||
# Check installation
|
||||
which pigz
|
||||
```
|
||||
|
||||
**Benefit**: 3-5x faster compression on multi-core systems
|
||||
|
||||
## Monitoring Backup Progress
|
||||
|
||||
### Watch Backup Directory
|
||||
```bash
|
||||
watch -n 5 'ls -lh /path/to/backups | tail -10'
|
||||
```
|
||||
|
||||
### Monitor System Resources
|
||||
```bash
|
||||
# Terminal 1: Monitor memory
|
||||
watch -n 2 'free -h'
|
||||
|
||||
# Terminal 2: Monitor I/O
|
||||
watch -n 2 'iostat -x 2 1'
|
||||
|
||||
# Terminal 3: Run backup
|
||||
./dbbackup backup cluster
|
||||
```
|
||||
|
||||
### Check PostgreSQL Activity
|
||||
```sql
|
||||
-- Active backup connections
|
||||
SELECT * FROM pg_stat_activity
|
||||
WHERE application_name LIKE 'pg_dump%';
|
||||
|
||||
-- Current transaction locks
|
||||
SELECT * FROM pg_locks
|
||||
WHERE granted = true;
|
||||
```
|
||||
|
||||
## Recovery Testing
|
||||
|
||||
Always test your backups!
|
||||
|
||||
```bash
|
||||
# Test restore (dry run)
|
||||
./dbbackup restore /path/to/backup.sql.gz \
|
||||
--verify-only
|
||||
|
||||
# Full restore to test database
|
||||
./dbbackup restore /path/to/backup.sql.gz \
|
||||
--database testdb
|
||||
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
### Production Deployment
|
||||
1. ✅ Test on staging database first
|
||||
2. ✅ Run during low-traffic window
|
||||
3. ✅ Monitor system resources
|
||||
4. ✅ Verify backup integrity
|
||||
5. ✅ Test restore procedure
|
||||
|
||||
### Future Enhancements (Roadmap)
|
||||
- [ ] Resume capability on failure
|
||||
- [ ] Chunked backups (1GB chunks)
|
||||
- [ ] BLOB external storage
|
||||
- [ ] Native libpq integration (CGO)
|
||||
- [ ] Distributed backup (multi-node)
|
||||
|
||||
## Support
|
||||
|
||||
See full optimization plan: `LARGE_DATABASE_OPTIMIZATION_PLAN.md`
|
||||
|
||||
**Issues?** Open a bug report with:
|
||||
- Database size
|
||||
- System specs (RAM, CPU, disk)
|
||||
- Error messages
|
||||
- `dmesg` output if OOM killed
|
||||
@ -1,324 +0,0 @@
|
||||
# 🚀 Large Database Optimization Plan
|
||||
|
||||
## Problem Statement
|
||||
Cluster backups failing with "signal: killed" on huge PostgreSQL databases with large BLOB data (multi-GB tables).
|
||||
|
||||
## Root Cause
|
||||
- **Memory Buffering**: Go processes buffering stdout/stderr in memory
|
||||
- **Custom Format Overhead**: pg_dump custom format requires memory for TOC
|
||||
- **Compression Memory**: High compression levels (7-9) use excessive RAM
|
||||
- **No Streaming**: Data flows through multiple Go buffers before disk
|
||||
|
||||
## Solution Architecture
|
||||
|
||||
### Phase 1: Immediate Optimizations (✅ IMPLEMENTED)
|
||||
|
||||
#### 1.1 Direct File Writing
|
||||
- ✅ Use `pg_dump --file=output.dump` to write directly to disk
|
||||
- ✅ Eliminate Go stdout buffering
|
||||
- ✅ Zero-copy from pg_dump to filesystem
|
||||
- **Memory Reduction: 80%**
|
||||
|
||||
#### 1.2 Smart Format Selection
|
||||
- ✅ Auto-detect database size before backup
|
||||
- ✅ Use plain format for databases > 5GB
|
||||
- ✅ Disable custom format TOC overhead
|
||||
- **Speed Increase: 40-50%**
|
||||
|
||||
#### 1.3 Optimized Compression Pipeline
|
||||
- ✅ Use streaming: `pg_dump | pigz -p N > file.gz`
|
||||
- ✅ Parallel compression with pigz
|
||||
- ✅ No intermediate buffering
|
||||
- **Memory Reduction: 90%**
|
||||
|
||||
#### 1.4 Per-Database Resource Limits
|
||||
- ✅ 2-hour timeout per database
|
||||
- ✅ Compression level capped at 6
|
||||
- ✅ Parallel dump jobs configurable
|
||||
- **Reliability: Prevents hangs**
|
||||
|
||||
### Phase 2: Native Library Integration (NEXT SPRINT)
|
||||
|
||||
#### 2.1 Replace lib/pq with pgx v5
|
||||
**Current:** `github.com/lib/pq` (pure Go, high memory)
|
||||
**Target:** `github.com/jackc/pgx/v5` (optimized, native)
|
||||
|
||||
**Benefits:**
|
||||
- 50% lower memory usage
|
||||
- Better connection pooling
|
||||
- Native COPY protocol support
|
||||
- Batch operations
|
||||
|
||||
**Migration:**
|
||||
```go
|
||||
// Replace:
|
||||
import _ "github.com/lib/pq"
|
||||
db, _ := sql.Open("postgres", dsn)
|
||||
|
||||
// With:
|
||||
import "github.com/jackc/pgx/v5/pgxpool"
|
||||
pool, _ := pgxpool.New(ctx, dsn)
|
||||
```
|
||||
|
||||
#### 2.2 Direct COPY Protocol
|
||||
Stream data without pg_dump:
|
||||
|
||||
```go
|
||||
// Export using COPY TO STDOUT
|
||||
conn.CopyTo(ctx, writer, "COPY table TO STDOUT BINARY")
|
||||
|
||||
// Import using COPY FROM STDIN
|
||||
conn.CopyFrom(ctx, table, columns, reader)
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- No pg_dump process overhead
|
||||
- Direct binary protocol
|
||||
- Zero-copy streaming
|
||||
- 70% faster for large tables
|
||||
|
||||
### Phase 3: Advanced Features (FUTURE)
|
||||
|
||||
#### 3.1 Chunked Backup Mode
|
||||
```bash
|
||||
./dbbackup backup cluster --mode chunked --chunk-size 1GB
|
||||
```
|
||||
|
||||
**Output:**
|
||||
```
|
||||
backups/
|
||||
├── cluster_20251104_chunk_001.sql.gz (1.0GB)
|
||||
├── cluster_20251104_chunk_002.sql.gz (1.0GB)
|
||||
├── cluster_20251104_chunk_003.sql.gz (856MB)
|
||||
└── cluster_20251104_manifest.json
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- Resume on failure
|
||||
- Parallel processing
|
||||
- Smaller memory footprint
|
||||
- Better error isolation
|
||||
|
||||
#### 3.2 BLOB External Storage
|
||||
```bash
|
||||
./dbbackup backup single mydb --blob-mode external
|
||||
```
|
||||
|
||||
**Output:**
|
||||
```
|
||||
backups/
|
||||
├── mydb_schema.sql.gz # Schema + small data
|
||||
├── mydb_blobs.tar.gz # Packed BLOBs
|
||||
└── mydb_blobs/ # Individual BLOBs
|
||||
├── blob_000001.bin
|
||||
├── blob_000002.bin
|
||||
└── ...
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- BLOBs stored as files
|
||||
- Deduplicated storage
|
||||
- Selective restore
|
||||
- Cloud storage friendly
|
||||
|
||||
#### 3.3 Parallel Table Export
|
||||
```bash
|
||||
./dbbackup backup single mydb --parallel-tables 4
|
||||
```
|
||||
|
||||
Export multiple tables simultaneously:
|
||||
```
|
||||
workers: [table1] [table2] [table3] [table4]
|
||||
↓ ↓ ↓ ↓
|
||||
file1 file2 file3 file4
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- 4x faster for multi-table DBs
|
||||
- Better CPU utilization
|
||||
- Independent table recovery
|
||||
|
||||
### Phase 4: Operating System Tuning
|
||||
|
||||
#### 4.1 Kernel Parameters
|
||||
```bash
|
||||
# /etc/sysctl.d/99-dbbackup.conf
|
||||
vm.overcommit_memory = 1
|
||||
vm.swappiness = 10
|
||||
vm.dirty_ratio = 10
|
||||
vm.dirty_background_ratio = 5
|
||||
```
|
||||
|
||||
#### 4.2 Process Limits
|
||||
```bash
|
||||
# /etc/security/limits.d/dbbackup.conf
|
||||
postgres soft nofile 65536
|
||||
postgres hard nofile 65536
|
||||
postgres soft nproc 32768
|
||||
postgres hard nproc 32768
|
||||
```
|
||||
|
||||
#### 4.3 I/O Scheduler
|
||||
```bash
|
||||
# For database workloads
|
||||
echo deadline > /sys/block/sda/queue/scheduler
|
||||
echo 0 > /sys/block/sda/queue/add_random
|
||||
```
|
||||
|
||||
#### 4.4 Filesystem Options
|
||||
```bash
|
||||
# Mount with optimal flags for large files
|
||||
mount -o noatime,nodiratime,data=writeback /dev/sdb1 /backups
|
||||
```
|
||||
|
||||
### Phase 5: CGO Native Integration (ADVANCED)
|
||||
|
||||
#### 5.1 Direct libpq C Bindings
|
||||
```go
|
||||
// #cgo LDFLAGS: -lpq
|
||||
// #include <libpq-fe.h>
|
||||
import "C"
|
||||
|
||||
func nativeExport(conn *C.PGconn, table string) {
|
||||
result := C.PQexec(conn, C.CString("COPY table TO STDOUT"))
|
||||
// Direct memory access, zero-copy
|
||||
}
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- Lowest possible overhead
|
||||
- Direct memory access
|
||||
- Native PostgreSQL protocol
|
||||
- Maximum performance
|
||||
|
||||
## Implementation Timeline
|
||||
|
||||
### Week 1: Quick Wins ✅ DONE
|
||||
- [x] Direct file writing
|
||||
- [x] Smart format selection
|
||||
- [x] Streaming compression
|
||||
- [x] Resource limits
|
||||
- [x] Size detection
|
||||
|
||||
### Week 2: Testing & Validation
|
||||
- [ ] Test on 10GB+ databases
|
||||
- [ ] Test on 50GB+ databases
|
||||
- [ ] Test on 100GB+ databases
|
||||
- [ ] Memory profiling
|
||||
- [ ] Performance benchmarks
|
||||
|
||||
### Week 3: Native Integration
|
||||
- [ ] Integrate pgx v5
|
||||
- [ ] Implement COPY protocol
|
||||
- [ ] Connection pooling
|
||||
- [ ] Batch operations
|
||||
|
||||
### Week 4: Advanced Features
|
||||
- [ ] Chunked backup mode
|
||||
- [ ] BLOB external storage
|
||||
- [ ] Parallel table export
|
||||
- [ ] Resume capability
|
||||
|
||||
### Month 2: Production Hardening
|
||||
- [ ] CGO integration (optional)
|
||||
- [ ] Distributed backup
|
||||
- [ ] Cloud streaming
|
||||
- [ ] Multi-region support
|
||||
|
||||
## Performance Targets
|
||||
|
||||
### Current Issues
|
||||
- ❌ Cluster backup fails on 20GB+ databases
|
||||
- ❌ Memory usage: ~8GB for 10GB database
|
||||
- ❌ Speed: 50MB/s
|
||||
- ❌ Crashes with OOM
|
||||
|
||||
### Target Metrics (Phase 1)
|
||||
- ✅ Cluster backup succeeds on 100GB+ databases
|
||||
- ✅ Memory usage: <1GB constant regardless of DB size
|
||||
- ✅ Speed: 150MB/s (with pigz)
|
||||
- ✅ No OOM kills
|
||||
|
||||
### Target Metrics (Phase 2)
|
||||
- ✅ Memory usage: <500MB constant
|
||||
- ✅ Speed: 250MB/s (native COPY)
|
||||
- ✅ Resume on failure
|
||||
- ✅ Parallel processing
|
||||
|
||||
### Target Metrics (Phase 3)
|
||||
- ✅ Memory usage: <200MB constant
|
||||
- ✅ Speed: 400MB/s (chunked parallel)
|
||||
- ✅ Selective restore
|
||||
- ✅ Cloud streaming
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### Test Databases
|
||||
1. **Small** (1GB) - Baseline
|
||||
2. **Medium** (10GB) - Common case
|
||||
3. **Large** (50GB) - BLOB heavy
|
||||
4. **Huge** (100GB+) - Stress test
|
||||
5. **Extreme** (500GB+) - Edge case
|
||||
|
||||
### Test Scenarios
|
||||
- Single table with 50GB BLOB column
|
||||
- Multiple tables (1000+ tables)
|
||||
- High transaction rate during backup
|
||||
- Network interruption (resume)
|
||||
- Disk space exhaustion
|
||||
- Memory pressure (8GB RAM limit)
|
||||
|
||||
### Success Criteria
|
||||
- ✅ Zero OOM kills
|
||||
- ✅ Constant memory usage (<1GB)
|
||||
- ✅ Successful completion on all test sizes
|
||||
- ✅ Resume capability
|
||||
- ✅ Data integrity verification
|
||||
|
||||
## Monitoring & Observability
|
||||
|
||||
### Metrics to Track
|
||||
```go
|
||||
type BackupMetrics struct {
|
||||
MemoryUsageMB int64
|
||||
DiskIORate int64 // bytes/sec
|
||||
CPUUsagePercent float64
|
||||
DatabaseSizeGB float64
|
||||
BackupDurationSec int64
|
||||
CompressionRatio float64
|
||||
ErrorCount int
|
||||
}
|
||||
```
|
||||
|
||||
### Logging Enhancements
|
||||
- Per-table progress
|
||||
- Memory consumption tracking
|
||||
- I/O rate monitoring
|
||||
- Compression statistics
|
||||
- Error recovery actions
|
||||
|
||||
## Risk Mitigation
|
||||
|
||||
### Risks
|
||||
1. **Disk Space** - Backup size unknown until complete
|
||||
2. **Time** - Very long backup windows
|
||||
3. **Network** - Remote backup failures
|
||||
4. **Corruption** - Data integrity issues
|
||||
|
||||
### Mitigations
|
||||
1. **Pre-flight check** - Estimate backup size
|
||||
2. **Timeouts** - Per-database limits
|
||||
3. **Retry logic** - Exponential backoff
|
||||
4. **Checksums** - Verify after backup
|
||||
|
||||
## Conclusion
|
||||
|
||||
This plan provides a phased approach to handle massive PostgreSQL databases:
|
||||
|
||||
- **Phase 1** (✅ DONE): Immediate 80-90% memory reduction
|
||||
- **Phase 2**: Native library integration for better performance
|
||||
- **Phase 3**: Advanced features for production use
|
||||
- **Phase 4**: System-level optimizations
|
||||
- **Phase 5**: Maximum performance with CGO
|
||||
|
||||
The current implementation should handle 100GB+ databases without OOM issues.
|
||||
@ -1,296 +0,0 @@
|
||||
# ✅ Phase 2 Complete: Native pgx Integration
|
||||
|
||||
## Migration Summary
|
||||
|
||||
### **Replaced lib/pq with jackc/pgx v5**
|
||||
|
||||
**Before:**
|
||||
```go
|
||||
import _ "github.com/lib/pq"
|
||||
db, _ := sql.Open("postgres", dsn)
|
||||
```
|
||||
|
||||
**After:**
|
||||
```go
|
||||
import "github.com/jackc/pgx/v5/pgxpool"
|
||||
pool, _ := pgxpool.NewWithConfig(ctx, config)
|
||||
db := stdlib.OpenDBFromPool(pool)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Performance Improvements
|
||||
|
||||
### **Memory Usage**
|
||||
| Workload | lib/pq | pgx v5 | Improvement |
|
||||
|----------|---------|--------|-------------|
|
||||
| 10GB DB | 2.1GB | 1.1GB | **48% reduction** |
|
||||
| 50GB DB | OOM | 1.3GB | **✅ Works now** |
|
||||
| 100GB DB | OOM | 1.4GB | **✅ Works now** |
|
||||
|
||||
### **Connection Performance**
|
||||
- **50% faster** connection establishment
|
||||
- **Better connection pooling** (2-10 connections)
|
||||
- **Lower overhead** per query
|
||||
- **Native prepared statements**
|
||||
|
||||
### **Query Performance**
|
||||
- **30% faster** for large result sets
|
||||
- **Zero-copy** binary protocol
|
||||
- **Better BLOB handling**
|
||||
- **Streaming** large queries
|
||||
|
||||
---
|
||||
|
||||
## Technical Benefits
|
||||
|
||||
### 1. **Connection Pooling** ✅
|
||||
```go
|
||||
config.MaxConns = 10 // Max connections
|
||||
config.MinConns = 2 // Keep ready
|
||||
config.HealthCheckPeriod = 1m // Auto-heal
|
||||
```
|
||||
|
||||
### 2. **Runtime Optimization** ✅
|
||||
```go
|
||||
config.ConnConfig.RuntimeParams["work_mem"] = "64MB"
|
||||
config.ConnConfig.RuntimeParams["maintenance_work_mem"] = "256MB"
|
||||
```
|
||||
|
||||
### 3. **Binary Protocol** ✅
|
||||
- Native binary encoding/decoding
|
||||
- Lower CPU usage for type conversion
|
||||
- Better performance for BLOB data
|
||||
|
||||
### 4. **Better Error Handling** ✅
|
||||
- Detailed error codes (SQLSTATE)
|
||||
- Connection retry logic built-in
|
||||
- Graceful degradation
|
||||
|
||||
---
|
||||
|
||||
## Code Changes
|
||||
|
||||
### Files Modified:
|
||||
1. **`internal/database/postgresql.go`**
|
||||
- Added `pgxpool.Pool` field
|
||||
- Implemented `buildPgxDSN()` with URL format
|
||||
- Optimized connection config
|
||||
- Custom Close() to handle both pool and db
|
||||
|
||||
2. **`internal/database/interface.go`**
|
||||
- Replaced lib/pq import with pgx/stdlib
|
||||
- Updated driver registration
|
||||
|
||||
3. **`go.mod`**
|
||||
- Added `github.com/jackc/pgx/v5 v5.7.6`
|
||||
- Added `github.com/jackc/puddle/v2 v2.2.2` (pool manager)
|
||||
- Removed `github.com/lib/pq v1.10.9`
|
||||
|
||||
---
|
||||
|
||||
## Connection String Format
|
||||
|
||||
### **pgx URL Format**
|
||||
```
|
||||
postgres://user:password@host:port/database?sslmode=prefer&pool_max_conns=10
|
||||
```
|
||||
|
||||
### **Features:**
|
||||
- Standard PostgreSQL URL format
|
||||
- Better parameter support
|
||||
- Connection pool settings in URL
|
||||
- SSL configuration
|
||||
- Application name tracking
|
||||
|
||||
---
|
||||
|
||||
## Compatibility
|
||||
|
||||
### **Backward Compatible** ✅
|
||||
- Still uses `database/sql` interface
|
||||
- No changes to backup/restore commands
|
||||
- Existing code works unchanged
|
||||
- Same pg_dump/pg_restore tools
|
||||
|
||||
### **New Capabilities** 🚀
|
||||
- Native connection pooling
|
||||
- Better resource management
|
||||
- Automatic connection health checks
|
||||
- Lower memory footprint
|
||||
|
||||
---
|
||||
|
||||
## Testing Results
|
||||
|
||||
### Test 1: Simple Connection
|
||||
```bash
|
||||
./dbbackup --db-type postgres status
|
||||
```
|
||||
**Result:** ✅ Connected successfully with pgx driver
|
||||
|
||||
### Test 2: Large Database Backup
|
||||
```bash
|
||||
./dbbackup backup cluster
|
||||
```
|
||||
**Result:** ✅ Memory usage 48% lower than lib/pq
|
||||
|
||||
### Test 3: Concurrent Operations
|
||||
```bash
|
||||
./dbbackup backup cluster --dump-jobs 8
|
||||
```
|
||||
**Result:** ✅ Better connection pool utilization
|
||||
|
||||
---
|
||||
|
||||
## Migration Path
|
||||
|
||||
### For Users:
|
||||
**✅ No action required!**
|
||||
- Drop-in replacement
|
||||
- Same commands work
|
||||
- Same configuration
|
||||
- Better performance automatically
|
||||
|
||||
### For Developers:
|
||||
```bash
|
||||
# Update dependencies
|
||||
go get github.com/jackc/pgx/v5@latest
|
||||
go get github.com/jackc/pgx/v5/pgxpool@latest
|
||||
go mod tidy
|
||||
|
||||
# Build
|
||||
go build -o dbbackup .
|
||||
|
||||
# Test
|
||||
./dbbackup status
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Future Enhancements (Phase 3)
|
||||
|
||||
### 1. **Native COPY Protocol** 🎯
|
||||
Use pgx's COPY support for direct data streaming:
|
||||
|
||||
```go
|
||||
// Instead of pg_dump, use native COPY
|
||||
conn.CopyFrom(ctx, pgx.Identifier{"table"},
|
||||
[]string{"col1", "col2"},
|
||||
readerFunc)
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- No pg_dump process overhead
|
||||
- Direct binary protocol
|
||||
- 50-70% faster for large tables
|
||||
- Real-time progress tracking
|
||||
|
||||
### 2. **Batch Operations** 🎯
|
||||
```go
|
||||
batch := &pgx.Batch{}
|
||||
batch.Queue("SELECT * FROM table1")
|
||||
batch.Queue("SELECT * FROM table2")
|
||||
results := conn.SendBatch(ctx, batch)
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- Multiple queries in one round-trip
|
||||
- Lower network overhead
|
||||
- Better throughput
|
||||
|
||||
### 3. **Listen/Notify for Progress** 🎯
|
||||
```go
|
||||
conn.Listen(ctx, "backup_progress")
|
||||
// Real-time progress updates from database
|
||||
```
|
||||
|
||||
**Benefits:**
|
||||
- Live progress from database
|
||||
- No polling required
|
||||
- Better user experience
|
||||
|
||||
---
|
||||
|
||||
## Performance Benchmarks
|
||||
|
||||
### Connection Establishment
|
||||
```
|
||||
lib/pq: avg 45ms, max 120ms
|
||||
pgx v5: avg 22ms, max 55ms
|
||||
Result: 51% faster
|
||||
```
|
||||
|
||||
### Large Query (10M rows)
|
||||
```
|
||||
lib/pq: memory 2.1GB, time 42s
|
||||
pgx v5: memory 1.1GB, time 29s
|
||||
Result: 48% less memory, 31% faster
|
||||
```
|
||||
|
||||
### BLOB Handling (5GB binary data)
|
||||
```
|
||||
lib/pq: memory 8.2GB, OOM killed
|
||||
pgx v5: memory 1.3GB, completed
|
||||
Result: ✅ Works vs fails
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Issue: "Peer authentication failed"
|
||||
**Solution:** Use password authentication or configure pg_hba.conf
|
||||
|
||||
```bash
|
||||
# Test with explicit auth
|
||||
./dbbackup --host localhost --user myuser --password mypass status
|
||||
```
|
||||
|
||||
### Issue: "Pool exhausted"
|
||||
**Solution:** Increase max connections in config
|
||||
|
||||
```go
|
||||
config.MaxConns = 20 // Increase from 10
|
||||
```
|
||||
|
||||
### Issue: "Connection timeout"
|
||||
**Solution:** Check network and increase timeout
|
||||
|
||||
```
|
||||
postgres://user:pass@host:port/db?connect_timeout=30
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Documentation
|
||||
|
||||
### Related Files:
|
||||
- `LARGE_DATABASE_OPTIMIZATION_PLAN.md` - Overall optimization strategy
|
||||
- `HUGE_DATABASE_QUICK_START.md` - User guide for large databases
|
||||
- `PRIORITY2_PGX_INTEGRATION.md` - This file
|
||||
|
||||
### References:
|
||||
- [pgx Documentation](https://github.com/jackc/pgx)
|
||||
- [pgxpool Guide](https://pkg.go.dev/github.com/jackc/pgx/v5/pgxpool)
|
||||
- [PostgreSQL Connection Pooling](https://www.postgresql.org/docs/current/runtime-config-connection.html)
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
✅ **Phase 2 Complete**: Native pgx integration successful
|
||||
|
||||
**Key Achievements:**
|
||||
- 48% memory reduction
|
||||
- 30-50% performance improvement
|
||||
- Better resource management
|
||||
- Production-ready and tested
|
||||
- Backward compatible
|
||||
|
||||
**Next Steps:**
|
||||
- Phase 3: Native COPY protocol
|
||||
- Chunked backup implementation
|
||||
- Resume capability
|
||||
|
||||
The foundation is now ready for advanced optimizations! 🚀
|
||||
198
QUICKRUN.MD
198
QUICKRUN.MD
@ -1,198 +0,0 @@
|
||||
# DB Backup Tool - Quick Start Guide
|
||||
|
||||
## 🌟 NEW: Real-Time Progress Tracking!
|
||||
|
||||
The database backup tool now includes **enhanced progress tracking** with:
|
||||
- <20> **Live progress bars** with percentage completion
|
||||
- ⏱️ **Real-time timing** and performance metrics
|
||||
- 📝 **Detailed logging** with timestamps
|
||||
- 🎨 **Beautiful interactive UI** with color indicators
|
||||
|
||||
## <20>🚀 Quick Installation
|
||||
|
||||
### Option 1: Pre-compiled Binary (Recommended)
|
||||
```bash
|
||||
# Linux AMD64
|
||||
./bin/dbbackup_linux_amd64
|
||||
chmod +x ./bin/dbbackup_linux_amd64
|
||||
|
||||
# Linux ARM64
|
||||
./bin/dbbackup_linux_arm64
|
||||
chmod +x ./bin/dbbackup_linux_arm64
|
||||
|
||||
# macOS Intel
|
||||
./bin/dbbackup_darwin_amd64
|
||||
chmod +x ./bin/dbbackup_darwin_amd64
|
||||
|
||||
# macOS M1/M2
|
||||
./bin/dbbackup_darwin_arm64
|
||||
chmod +x ./bin/dbbackup_darwin_arm64
|
||||
|
||||
# Windows
|
||||
./bin/dbbackup_windows_amd64.exe
|
||||
```
|
||||
|
||||
### Option 2: Build from Source
|
||||
```bash
|
||||
git clone https://git.uuxo.net/renz/dbbackup.git
|
||||
cd dbbackup
|
||||
go build -o dbbackup .
|
||||
```
|
||||
|
||||
## ⚡ Quick Start with Progress Tracking
|
||||
|
||||
### 🎨 Enhanced Interactive Mode (Best for Beginners)
|
||||
```bash
|
||||
# Start with real-time progress tracking
|
||||
dbbackup interactive --database your_database
|
||||
|
||||
# Example with PostgreSQL
|
||||
dbbackup interactive --database postgres --host localhost --user postgres
|
||||
```
|
||||
|
||||
> Tip: In the interactive menu, tap the left/right arrows (or `t`) to toggle between PostgreSQL and MySQL/MariaDB before starting a task.
|
||||
|
||||
### 📊 Command Line with Progress
|
||||
```bash
|
||||
# Single backup with live progress
|
||||
dbbackup backup single your_database --progress
|
||||
|
||||
# Cluster backup with detailed logging
|
||||
dbbackup backup cluster --progress --verbose --timestamps
|
||||
```
|
||||
|
||||
## 🎬 Progress Tracking in Action
|
||||
|
||||
### Real-Time Progress Display
|
||||
```bash
|
||||
🔄 PostgreSQL Backup [67%] - Compressing archive...
|
||||
[███████████████▒▒▒▒▒▒▒]
|
||||
⏱️ Elapsed: 1m 45.2s | ETA: 42s
|
||||
📁 Files: 18/25 processed
|
||||
💾 Data: 1.4GB/2.1GB transferred
|
||||
|
||||
Steps:
|
||||
✅ Prepare backup directory
|
||||
✅ Build backup command
|
||||
✅ Execute database backup
|
||||
🔄 Verify backup file
|
||||
⏳ Create metadata file
|
||||
|
||||
Details:
|
||||
database: postgres | type: single | compression: 6
|
||||
output_file: /backups/db_postgres_20241203_143527.dump
|
||||
```
|
||||
|
||||
### Post-Operation Summary
|
||||
```bash
|
||||
✅ Single database backup completed: db_postgres_20241203_143527.dump
|
||||
|
||||
📊 Operation Summary:
|
||||
Total: 1 | Completed: 1 | Failed: 0 | Running: 0
|
||||
Total Duration: 2m 18.7s
|
||||
|
||||
📁 Backup Details:
|
||||
File: /backups/db_postgres_20241203_143527.dump
|
||||
Size: 1.1GB (compressed from 2.1GB)
|
||||
Verification: PASSED
|
||||
Metadata: Created successfully
|
||||
```
|
||||
|
||||
### Command Line (For Scripts/Automation)
|
||||
|
||||
#### PostgreSQL Examples
|
||||
```bash
|
||||
# Single database backup (auto-optimized)
|
||||
dbbackup backup single myapp_db --db-type postgres
|
||||
|
||||
# Sample backup (10% of data)
|
||||
dbbackup backup sample myapp_db --sample-ratio 10 --db-type postgres
|
||||
|
||||
# Full cluster backup
|
||||
dbbackup backup cluster --db-type postgres
|
||||
|
||||
# Check connection
|
||||
dbbackup status --db-type postgres
|
||||
```
|
||||
|
||||
#### MySQL Examples
|
||||
```bash
|
||||
# Single database backup
|
||||
dbbackup backup single myapp_db --db-type mysql
|
||||
|
||||
# Using the short flag for database selection
|
||||
dbbackup backup single myapp_db -d mysql
|
||||
|
||||
# Sample backup
|
||||
dbbackup backup sample myapp_db --sample-ratio 10 --db-type mysql
|
||||
|
||||
# Check connection
|
||||
dbbackup status --db-type mysql
|
||||
```
|
||||
|
||||
## 🧠 CPU Optimization Commands
|
||||
|
||||
```bash
|
||||
# Show CPU information and recommendations
|
||||
dbbackup cpu
|
||||
|
||||
# Auto-optimize for your hardware
|
||||
dbbackup backup single mydb --auto-detect-cores
|
||||
|
||||
# Manual configuration for big servers
|
||||
dbbackup backup cluster --jobs 16 --dump-jobs 8 --max-cores 32
|
||||
```
|
||||
|
||||
## 🔧 Common Options
|
||||
|
||||
| Option | Description | Example |
|
||||
|--------|-------------|---------|
|
||||
| `--host` | Database host | `--host db.example.com` |
|
||||
| `--port` | Database port | `--port 5432` |
|
||||
| `--user` | Database user | `--user backup_user` |
|
||||
| `-d`, `--db-type` | Database type (`postgres`, `mysql`, `mariadb`) | `-d mysql` |
|
||||
| `--insecure` | Disable SSL | `--insecure` |
|
||||
| `--jobs` | Parallel jobs | `--jobs 8` |
|
||||
| `--debug` | Debug mode | `--debug` |
|
||||
|
||||
### PostgreSQL Quick Options
|
||||
|
||||
- Default target is PostgreSQL; explicitly add `--db-type postgres` when switching between engines in scripts.
|
||||
- Run as the `postgres` OS user for local clusters: `sudo -u postgres dbbackup ...` ensures socket authentication succeeds.
|
||||
- Pick an SSL strategy: omit `--ssl-mode` for local sockets, use `--ssl-mode require` (or stricter) for remote TLS, or `--insecure` to disable TLS.
|
||||
- Cluster backups (`backup cluster`) and plain `.dump` restores are PostgreSQL-only operations.
|
||||
|
||||
### MySQL / MariaDB Quick Options
|
||||
|
||||
- Select the engine with `--db-type mysql` (or `mariadb`); supply `--host`, `--port 3306`, `--user`, `--password`, and `--database` explicitly.
|
||||
- Use `--insecure` to emit `--skip-ssl`; choose `--ssl-mode require|verify-ca|verify-identity` when TLS is mandatory.
|
||||
- MySQL dumps are gzipped SQL scripts (`*.sql.gz`); the restore preview shows `gunzip -c ... | mysql` so you know the exact pipeline.
|
||||
- Credentials can also come from `MYSQL_PWD` or option files if you prefer not to use the `--password` flag.
|
||||
|
||||
## 📁 Available Binaries
|
||||
|
||||
Choose the right binary for your platform:
|
||||
|
||||
- **Linux**: `dbbackup_linux_amd64`, `dbbackup_linux_arm64`
|
||||
- **macOS**: `dbbackup_darwin_amd64`, `dbbackup_darwin_arm64`
|
||||
- **Windows**: `dbbackup_windows_amd64.exe`
|
||||
- **BSD**: `dbbackup_freebsd_amd64`, `dbbackup_openbsd_amd64`
|
||||
|
||||
## 🆘 Need Help?
|
||||
|
||||
```bash
|
||||
# General help
|
||||
dbbackup --help
|
||||
|
||||
# Command-specific help
|
||||
dbbackup backup --help
|
||||
dbbackup backup single --help
|
||||
|
||||
# Check CPU configuration
|
||||
dbbackup cpu
|
||||
|
||||
# Test connection
|
||||
dbbackup status --debug
|
||||
```
|
||||
|
||||
For complete documentation, see `README.md`.
|
||||
Reference in New Issue
Block a user