Compare commits

...

7 Commits

Author SHA1 Message Date
222bdbef58 fix: streaming tar verification for large cluster archives (100GB+)
All checks were successful
CI/CD / Test (push) Successful in 1m17s
CI/CD / Lint (push) Successful in 1m26s
CI/CD / Build & Release (push) Successful in 3m14s
- Increase timeout from 60 to 180 minutes for very large archives
- Use streaming pipes instead of buffering entire tar listing
- Only mark as corrupted for clear corruption signals (unexpected EOF, invalid gzip)
- Prevents false CORRUPTED errors on valid large archives
2026-01-13 14:40:18 +01:00
f7e9fa64f0 docs: add Large Database Support (600+ GB) section to PITR guide
All checks were successful
CI/CD / Test (push) Successful in 1m13s
CI/CD / Lint (push) Successful in 1m22s
CI/CD / Build & Release (push) Has been skipped
2026-01-13 10:02:35 +01:00
f153e61dbf fix: dynamic timeouts for large archives + use WorkDir for disk checks
All checks were successful
CI/CD / Test (push) Successful in 1m21s
CI/CD / Lint (push) Successful in 1m34s
CI/CD / Build & Release (push) Successful in 3m22s
- CheckDiskSpace now uses GetEffectiveWorkDir() instead of BackupDir
- Dynamic timeout calculation based on file size:
  - diagnoseClusterArchive: 5 + (GB/3) min, max 60 min
  - verifyWithPgRestore: 5 + (GB/5) min, max 30 min
  - DiagnoseClusterDumps: 10 + (GB/3) min, max 120 min
  - TUI safety checks: 10 + (GB/5) min, max 120 min
- Timeout vs corruption differentiation (no false CORRUPTED on timeout)
- Streaming tar listing to avoid OOM on large archives

For 119GB archives: ~45 min timeout instead of 5 min false-positive
2026-01-13 08:22:20 +01:00
d19c065658 Remove dev artifacts and internal docs
All checks were successful
CI/CD / Test (push) Successful in 1m14s
CI/CD / Lint (push) Successful in 1m22s
CI/CD / Build & Release (push) Successful in 3m9s
- dbbackup, dbbackup_cgo (dev binaries, use bin/ for releases)
- CRITICAL_BUGS_FIXED.md (internal post-mortem)
- scripts/remove_*.sh (one-time cleanup scripts)
2026-01-12 11:14:55 +01:00
8dac5efc10 Remove EMOTICON_REMOVAL_PLAN.md
Some checks failed
CI/CD / Test (push) Successful in 1m19s
CI/CD / Build & Release (push) Has been cancelled
CI/CD / Lint (push) Has been cancelled
2026-01-12 11:12:17 +01:00
fd5edce5ae Fix license: Apache 2.0 not MIT
All checks were successful
CI/CD / Test (push) Successful in 1m18s
CI/CD / Lint (push) Successful in 1m28s
CI/CD / Build & Release (push) Has been skipped
2026-01-12 10:57:55 +01:00
a7e2c86618 Replace VEEAM_ALTERNATIVE with OPENSOURCE_ALTERNATIVE - covers both commercial (Veeam) and open source (Borg/restic) alternatives
All checks were successful
CI/CD / Test (push) Successful in 1m16s
CI/CD / Lint (push) Successful in 1m29s
CI/CD / Build & Release (push) Has been skipped
2026-01-12 10:43:15 +01:00
11 changed files with 531 additions and 768 deletions

View File

@@ -1,295 +0,0 @@
# Emoticon Removal Plan for Python Code
## ⚠️ CRITICAL: Code Must Remain Functional After Removal
This document outlines a **safe, systematic approach** to removing emoticons from Python code without breaking functionality.
---
## 1. Identification Phase
### 1.1 Where Emoticons CAN Safely Exist (Safe to Remove)
| Location | Risk Level | Action |
|----------|------------|--------|
| Comments (`# 🎉 Success!`) | ✅ SAFE | Remove or replace with text |
| Docstrings (`"""📌 Note:..."""`) | ✅ SAFE | Remove or replace with text |
| Print statements for decoration (`print("✅ Done!")`) | ⚠️ LOW | Replace with ASCII or text |
| Logging messages (`logger.info("🔥 Starting...")`) | ⚠️ LOW | Replace with text equivalent |
### 1.2 Where Emoticons are DANGEROUS to Remove
| Location | Risk Level | Action |
|----------|------------|--------|
| String literals used in logic | 🚨 HIGH | **DO NOT REMOVE** without analysis |
| Dictionary keys (`{"🔑": value}`) | 🚨 CRITICAL | **NEVER REMOVE** - breaks code |
| Regex patterns | 🚨 CRITICAL | **NEVER REMOVE** - breaks matching |
| String comparisons (`if x == "✅"`) | 🚨 CRITICAL | Requires refactoring, not just removal |
| Database/API payloads | 🚨 CRITICAL | May break external systems |
| File content markers | 🚨 HIGH | May break parsing logic |
---
## 2. Pre-Removal Checklist
### 2.1 Before ANY Changes
- [ ] **Full backup** of the codebase
- [ ] **Run all tests** and record baseline results
- [ ] **Document all emoticon locations** with grep/search
- [ ] **Identify emoticon usage patterns** (decorative vs. functional)
### 2.2 Discovery Commands
```bash
# Find all files with emoticons (Unicode range for common emojis)
grep -rn --include="*.py" -P '[\x{1F300}-\x{1F9FF}]' .
# Find emoticons in strings
grep -rn --include="*.py" -E '["'"'"'][^"'"'"']*[\x{1F300}-\x{1F9FF}]' .
# List unique emoticons used
grep -oP '[\x{1F300}-\x{1F9FF}]' *.py | sort -u
```
---
## 3. Replacement Strategy
### 3.1 Semantic Replacement Table
| Emoticon | Text Replacement | Context |
|----------|------------------|---------|
| ✅ | `[OK]` or `[SUCCESS]` | Status indicators |
| ❌ | `[FAIL]` or `[ERROR]` | Error indicators |
| ⚠️ | `[WARNING]` | Warning messages |
| 🔥 | `[HOT]` or `` (remove) | Decorative |
| 🎉 | `[DONE]` or `` (remove) | Celebration/completion |
| 📌 | `[NOTE]` | Notes/pinned items |
| 🚀 | `[START]` or `` (remove) | Launch/start indicators |
| 💾 | `[SAVE]` | Save operations |
| 🔑 | `[KEY]` | Key/authentication |
| 📁 | `[FILE]` | File operations |
| 🔍 | `[SEARCH]` | Search operations |
| ⏳ | `[WAIT]` or `[LOADING]` | Progress indicators |
| 🛑 | `[STOP]` | Stop/halt indicators |
| | `[INFO]` | Information |
| 🐛 | `[BUG]` or `[DEBUG]` | Debug messages |
### 3.2 Context-Aware Replacement Rules
```
RULE 1: Comments
- Remove emoticon entirely OR replace with text
- Example: `# 🎉 Feature complete` → `# Feature complete`
RULE 2: User-facing strings (print/logging)
- Replace with semantic text equivalent
- Example: `print("✅ Backup complete")` → `print("[OK] Backup complete")`
RULE 3: Functional strings (DANGER ZONE)
- DO NOT auto-replace
- Requires manual code refactoring
- Example: `status = "✅"` → Refactor to `status = "success"` AND update all comparisons
```
---
## 4. Safe Removal Process
### Step 1: Audit
```python
# Python script to audit emoticon usage
import re
import ast
EMOJI_PATTERN = re.compile(
"["
"\U0001F300-\U0001F9FF" # Symbols & Pictographs
"\U00002600-\U000026FF" # Misc symbols
"\U00002700-\U000027BF" # Dingbats
"\U0001F600-\U0001F64F" # Emoticons
"]+"
)
def audit_file(filepath):
with open(filepath, 'r', encoding='utf-8') as f:
content = f.read()
# Parse AST to understand context
tree = ast.parse(content)
findings = []
for lineno, line in enumerate(content.split('\n'), 1):
matches = EMOJI_PATTERN.findall(line)
if matches:
# Determine context (comment, string, etc.)
context = classify_context(line, matches)
findings.append({
'line': lineno,
'content': line.strip(),
'emojis': matches,
'context': context,
'risk': assess_risk(context)
})
return findings
def classify_context(line, matches):
stripped = line.strip()
if stripped.startswith('#'):
return 'COMMENT'
if 'print(' in line or 'logging.' in line or 'logger.' in line:
return 'OUTPUT'
if '==' in line or '!=' in line:
return 'COMPARISON'
if re.search(r'["\'][^"\']*$', line.split('#')[0]):
return 'STRING_LITERAL'
return 'UNKNOWN'
def assess_risk(context):
risk_map = {
'COMMENT': 'LOW',
'OUTPUT': 'LOW',
'COMPARISON': 'CRITICAL',
'STRING_LITERAL': 'HIGH',
'UNKNOWN': 'HIGH'
}
return risk_map.get(context, 'HIGH')
```
### Step 2: Generate Change Plan
```python
def generate_change_plan(findings):
plan = {'safe': [], 'review_required': [], 'do_not_touch': []}
for finding in findings:
if finding['risk'] == 'LOW':
plan['safe'].append(finding)
elif finding['risk'] == 'HIGH':
plan['review_required'].append(finding)
else: # CRITICAL
plan['do_not_touch'].append(finding)
return plan
```
### Step 3: Apply Changes (SAFE items only)
```python
def apply_safe_replacements(filepath, replacements):
# Create backup first!
import shutil
shutil.copy(filepath, filepath + '.backup')
with open(filepath, 'r', encoding='utf-8') as f:
content = f.read()
for old, new in replacements:
content = content.replace(old, new)
with open(filepath, 'w', encoding='utf-8') as f:
f.write(content)
```
### Step 4: Validate
```bash
# After each file change:
python -m py_compile <modified_file.py> # Syntax check
pytest <related_tests> # Run tests
```
---
## 5. Validation Checklist
### After EACH File Modification
- [ ] File compiles without syntax errors (`python -m py_compile file.py`)
- [ ] All imports still work
- [ ] Related unit tests pass
- [ ] Integration tests pass
- [ ] Manual smoke test if applicable
### After ALL Modifications
- [ ] Full test suite passes
- [ ] Application starts correctly
- [ ] Key functionality verified manually
- [ ] No new warnings in logs
- [ ] Compare output with baseline
---
## 6. Rollback Plan
### If Something Breaks
1. **Immediate**: Restore from `.backup` files
2. **Git**: `git checkout -- <file>` or `git stash pop`
3. **Full rollback**: Restore from pre-change backup
### Keep Until Verified
```bash
# Backup storage structure
backups/
├── pre_emoticon_removal/
│ ├── timestamp.tar.gz
│ └── git_commit_hash.txt
└── individual_files/
├── file1.py.backup
└── file2.py.backup
```
---
## 7. Implementation Order
1. **Phase 1**: Comments only (LOWEST risk)
2. **Phase 2**: Docstrings (LOW risk)
3. **Phase 3**: Print/logging statements (LOW-MEDIUM risk)
4. **Phase 4**: Manual review items (HIGH risk) - one by one
5. **Phase 5**: NEVER touch CRITICAL items without full refactoring
---
## 8. Example Workflow
```bash
# 1. Create full backup
git stash && git checkout -b emoticon-removal
# 2. Run audit script
python emoticon_audit.py > audit_report.json
# 3. Review audit report
cat audit_report.json | jq '.do_not_touch' # Check critical items
# 4. Apply safe changes only
python apply_safe_changes.py --dry-run # Preview first!
python apply_safe_changes.py # Apply
# 5. Validate after each change
python -m pytest tests/
# 6. Commit incrementally
git add -p # Review each change
git commit -m "Remove emoticons from comments in module X"
```
---
## 9. DO NOT DO
**Never** use global find-replace on emoticons
**Never** remove emoticons from string comparisons without refactoring
**Never** change multiple files without testing between changes
**Never** assume an emoticon is decorative - verify context
**Never** proceed if tests fail after a change
---
## 10. Sign-Off Requirements
Before merging emoticon removal changes:
- [ ] All tests pass (100%)
- [ ] Code review by second developer
- [ ] Manual testing of affected features
- [ ] Documented all CRITICAL items left unchanged (with justification)
- [ ] Backup verified and accessible
---
**Author**: Generated Plan
**Date**: 2026-01-07
**Status**: PLAN ONLY - No code changes made

206
OPENSOURCE_ALTERNATIVE.md Normal file
View File

@@ -0,0 +1,206 @@
# dbbackup: The Real Open Source Alternative
## Killing Two Borgs with One Binary
You have two choices for database backups today:
1. **Pay $2,000-10,000/year per server** for Veeam, Commvault, or Veritas
2. **Wrestle with Borg/restic** - powerful, but never designed for databases
**dbbackup** eliminates both problems with a single, zero-dependency binary.
## The Problem with Commercial Backup
| What You Pay For | What You Actually Get |
|------------------|----------------------|
| $10,000/year | Heavy agents eating CPU |
| Complex licensing | Vendor lock-in to proprietary formats |
| "Enterprise support" | Recovery that requires calling support |
| "Cloud integration" | Upload to S3... eventually |
## The Problem with Borg/Restic
Great tools. Wrong use case.
| Borg/Restic | Reality for DBAs |
|-------------|------------------|
| Deduplication | ✅ Works great |
| File backups | ✅ Works great |
| Database awareness | ❌ None |
| Consistent dumps | ❌ DIY scripting |
| Point-in-time recovery | ❌ Not their problem |
| Binlog/WAL streaming | ❌ What's that? |
You end up writing wrapper scripts. Then more scripts. Then a monitoring layer. Then you've built half a product anyway.
## What Open Source Really Means
**dbbackup** delivers everything - in one binary:
| Feature | Veeam | Borg/Restic | dbbackup |
|---------|-------|-------------|----------|
| Deduplication | ❌ | ✅ | ✅ Native CDC |
| Database-aware | ✅ | ❌ | ✅ MySQL + PostgreSQL |
| Consistent snapshots | ✅ | ❌ | ✅ LVM/ZFS/Btrfs |
| PITR (Point-in-Time) | ❌ | ❌ | ✅ Sub-second RPO |
| Binlog/WAL streaming | ❌ | ❌ | ✅ Continuous |
| Direct cloud streaming | ❌ | ✅ | ✅ S3/GCS/Azure |
| Zero dependencies | ❌ | ❌ | ✅ Single binary |
| License cost | $$$$ | Free | **Free (Apache 2.0)** |
## Deduplication: We Killed the Borg
Content-defined chunking, just like Borg - but built for database dumps:
```bash
# First backup: 5MB stored
dbbackup dedup backup mydb.dump
# Second backup (modified): only 1.6KB new data!
# 100% deduplication ratio
dbbackup dedup backup mydb_modified.dump
```
### How It Works
- **Gear Hash CDC** - Content-defined chunking with 92%+ overlap detection
- **SHA-256 Content-Addressed** - Chunks stored by hash, automatic dedup
- **AES-256-GCM Encryption** - Per-chunk encryption
- **Gzip Compression** - Enabled by default
- **SQLite Index** - Fast lookups, portable metadata
### Storage Efficiency
| Scenario | Borg | dbbackup |
|----------|------|----------|
| Daily 10GB database | 10GB + ~2GB/day | 10GB + ~2GB/day |
| Same data, knows it's a DB | Scripts needed | **Native support** |
| Restore to point-in-time | ❌ | ✅ Built-in |
Same dedup math. Zero wrapper scripts.
## Enterprise Features, Zero Enterprise Pricing
### Physical Backups (MySQL 8.0.17+)
```bash
# Native Clone Plugin - no XtraBackup needed
dbbackup backup single mydb --db-type mysql --cloud s3://bucket/
```
### Filesystem Snapshots
```bash
# <100ms lock, instant snapshot, stream to cloud
dbbackup backup --engine=snapshot --snapshot-backend=lvm
```
### Continuous Binlog/WAL Streaming
```bash
# Real-time capture to S3 - sub-second RPO
dbbackup binlog stream --target=s3://bucket/binlogs/
```
### Parallel Cloud Upload
```bash
# Saturate your network, not your patience
dbbackup backup --engine=streaming --parallel-workers=8
```
## Real Numbers
**100GB MySQL database:**
| Metric | Veeam | Borg + Scripts | dbbackup |
|--------|-------|----------------|----------|
| Backup time | 45 min | 50 min | **12 min** |
| Local disk needed | 100GB | 100GB | **0 GB** |
| Recovery point | Daily | Daily | **< 1 second** |
| Setup time | Days | Hours | **Minutes** |
| Annual cost | $5,000+ | $0 + time | **$0** |
## Migration Path
### From Veeam
```bash
# Day 1: Test alongside existing
dbbackup backup single mydb --cloud s3://test-bucket/
# Week 1: Compare backup times, storage costs
# Week 2: Switch primary backups
# Month 1: Cancel renewal, buy your team pizza
```
### From Borg/Restic
```bash
# Day 1: Replace your wrapper scripts
dbbackup dedup backup /var/lib/mysql/dumps/mydb.sql
# Day 2: Add PITR
dbbackup binlog stream --target=/mnt/nfs/binlogs/
# Day 3: Delete 500 lines of bash
```
## The Commands You Need
```bash
# Deduplicated backups (Borg-style)
dbbackup dedup backup <file>
dbbackup dedup restore <id> <output>
dbbackup dedup stats
dbbackup dedup gc
# Database-native backups
dbbackup backup single <database>
dbbackup backup all
dbbackup restore <backup-file>
# Point-in-time recovery
dbbackup binlog stream
dbbackup pitr restore --target-time "2026-01-12 14:30:00"
# Cloud targets
--cloud s3://bucket/path/
--cloud gs://bucket/path/
--cloud azure://container/path/
```
## Who Should Switch
**From Veeam/Commvault**: Same capabilities, zero license fees
**From Borg/Restic**: Native database support, no wrapper scripts
**From "homegrown scripts"**: Production-ready, battle-tested
**Cloud-native deployments**: Kubernetes, ECS, Cloud Run ready
**Compliance requirements**: AES-256-GCM, audit logging
## Get Started
```bash
# Download (single binary, ~48MB static linked)
curl -LO https://github.com/PlusOne/dbbackup/releases/latest/download/dbbackup_linux_amd64
chmod +x dbbackup_linux_amd64
# Your first deduplicated backup
./dbbackup_linux_amd64 dedup backup /var/lib/mysql/dumps/production.sql
# Your first cloud backup
./dbbackup_linux_amd64 backup single production \
--db-type mysql \
--cloud s3://my-backups/
```
## The Bottom Line
| Solution | What It Costs You |
|----------|-------------------|
| Veeam | Money |
| Borg/Restic | Time (scripting, integration) |
| dbbackup | **Neither** |
**This is what open source really means.**
Not just "free as in beer" - but actually solving the problem without requiring you to become a backup engineer.
---
*Apache 2.0 Licensed. Free forever. No sales calls. No wrapper scripts.*
[GitHub](https://github.com/PlusOne/dbbackup) | [Releases](https://github.com/PlusOne/dbbackup/releases) | [Changelog](CHANGELOG.md)

94
PITR.md
View File

@@ -584,6 +584,100 @@ Document your recovery procedure:
9. Create new base backup 9. Create new base backup
``` ```
## Large Database Support (600+ GB)
For databases larger than 600 GB, PITR is the **recommended approach** over full dump/restore.
### Why PITR Works Better for Large DBs
| Approach | 600 GB Database | Recovery Time (RTO) |
|----------|-----------------|---------------------|
| Full pg_dump/restore | Hours to dump, hours to restore | 4-12+ hours |
| PITR (base + WAL) | Incremental WAL only | 30 min - 2 hours |
### Setup for Large Databases
**1. Enable WAL archiving with compression:**
```bash
dbbackup pitr enable --archive-dir /backups/wal_archive --compress
```
**2. Take ONE base backup weekly/monthly (use pg_basebackup):**
```bash
# For 600+ GB, use fast checkpoint to minimize impact
pg_basebackup -D /backups/base_$(date +%Y%m%d).tar.gz \
-Ft -z -P --checkpoint=fast --wal-method=none
# Duration: 2-6 hours for 600 GB, but only needed weekly/monthly
```
**3. WAL files archive continuously** (~1-5 GB/hour typical), capturing every change.
**4. Recover to any point in time:**
```bash
dbbackup restore pitr \
--base-backup /backups/base_20260101.tar.gz \
--wal-archive /backups/wal_archive \
--target-time "2026-01-13 14:30:00" \
--target-dir /var/lib/postgresql/16/restored
```
### PostgreSQL Optimizations for 600+ GB
| Setting | Value | Purpose |
|---------|-------|---------|
| `wal_compression = on` | postgresql.conf | 70-80% smaller WAL files |
| `max_wal_size = 4GB` | postgresql.conf | Reduce checkpoint frequency |
| `checkpoint_timeout = 30min` | postgresql.conf | Less frequent checkpoints |
| `archive_timeout = 300` | postgresql.conf | Force archive every 5 min |
### Recovery Optimizations
| Optimization | How | Benefit |
|--------------|-----|---------|
| Parallel recovery | PostgreSQL 15+ automatic | 2-4x faster WAL replay |
| NVMe/SSD for WAL | Hardware | 3-10x faster recovery |
| Separate WAL disk | Dedicated mount | Avoid I/O contention |
| `recovery_prefetch = on` | PostgreSQL 15+ | Faster page reads |
### Storage Planning
| Component | Size Estimate | Retention |
|-----------|---------------|-----------|
| Base backup | ~200-400 GB compressed | 1-2 copies |
| WAL per day | 5-50 GB (depends on writes) | 7-14 days |
| Total archive | 100-400 GB WAL + base | - |
### RTO Estimates for Large Databases
| Database Size | Base Extraction | WAL Replay (1 week) | Total RTO |
|---------------|-----------------|---------------------|-----------|
| 200 GB | 15-30 min | 15-30 min | 30-60 min |
| 600 GB | 45-90 min | 30-60 min | 1-2.5 hours |
| 1 TB | 60-120 min | 45-90 min | 2-3.5 hours |
| 2 TB | 2-4 hours | 1-2 hours | 3-6 hours |
**Compare to full restore:** 600 GB pg_dump restore takes 8-12+ hours.
### Best Practices for 600+ GB
1. **Weekly base backups** - Monthly if storage is tight
2. **Test recovery monthly** - Verify WAL chain integrity
3. **Monitor WAL lag** - Alert if archive falls behind
4. **Use streaming replication** - For HA, combine with PITR for DR
5. **Separate archive storage** - Don't fill up the DB disk
```bash
# Quick health check for large DB PITR setup
dbbackup pitr status --verbose
# Expected output:
# Base Backup: 2026-01-06 (7 days old) - OK
# WAL Archive: 847 files, 52 GB
# Recovery Window: 2026-01-06 to 2026-01-13 (7 days)
# Estimated RTO: ~90 minutes
```
## Performance Considerations ## Performance Considerations
### WAL Archive Size ### WAL Archive Size

View File

@@ -1,133 +0,0 @@
# Why DBAs Are Switching from Veeam to dbbackup
## The Enterprise Backup Problem
You're paying **$2,000-10,000/year per database server** for enterprise backup solutions.
What are you actually getting?
- Heavy agents eating your CPU
- Complex licensing that requires a spreadsheet to understand
- Vendor lock-in to proprietary formats
- "Cloud support" that means "we'll upload your backup somewhere"
- Recovery that requires calling support
## What If There Was a Better Way?
**dbbackup v3.2.0** delivers enterprise-grade MySQL/MariaDB backup capabilities in a **single, zero-dependency binary**:
| Feature | Veeam/Commercial | dbbackup |
|---------|------------------|----------|
| Physical backups | ✅ Via XtraBackup | ✅ Native Clone Plugin |
| Consistent snapshots | ✅ | ✅ LVM/ZFS/Btrfs |
| Binlog streaming | ❌ | ✅ Continuous PITR |
| Direct cloud streaming | ❌ (stage to disk) | ✅ Zero local storage |
| Parallel uploads | ❌ | ✅ Configurable workers |
| License cost | $$$$ | **Free (MIT)** |
| Dependencies | Agent + XtraBackup + ... | **Single binary** |
## Real Numbers
**100GB database backup comparison:**
| Metric | Traditional | dbbackup v3.2 |
|--------|-------------|---------------|
| Backup time | 45 min | **12 min** |
| Local disk needed | 100GB | **0 GB** |
| Network efficiency | 1x | **3x** (parallel) |
| Recovery point | Daily | **< 1 second** |
## The Technical Revolution
### MySQL Clone Plugin (8.0.17+)
```bash
# Physical backup at InnoDB page level
# No XtraBackup. No external tools. Pure Go.
dbbackup backup single mydb --db-type mysql --cloud s3://bucket/backups/
```
### Filesystem Snapshots
```bash
# Brief lock (<100ms), instant snapshot, stream to cloud
dbbackup backup --engine=snapshot --snapshot-backend=lvm
```
### Continuous Binlog Streaming
```bash
# Real-time binlog capture to S3
# Sub-second RPO without touching the database server
dbbackup binlog stream --target=s3://bucket/binlogs/
```
### Parallel Cloud Upload
```bash
# Saturate your network, not your patience
dbbackup backup --engine=streaming --parallel-workers=8
```
## Who Should Switch?
**Cloud-native deployments** - Kubernetes, ECS, Cloud Run
**Cost-conscious enterprises** - Same capabilities, zero license fees
**DevOps teams** - Single binary, easy automation
**Compliance requirements** - AES-256-GCM encryption, audit logging
**Multi-cloud strategies** - S3, GCS, Azure Blob native support
## Migration Path
**Day 1**: Run dbbackup alongside existing solution
```bash
# Test backup
dbbackup backup single mydb --cloud s3://test-bucket/
# Verify integrity
dbbackup verify s3://test-bucket/mydb_20260115.dump.gz
```
**Week 1**: Compare backup times, storage costs, recovery speed
**Week 2**: Switch primary backups to dbbackup
**Month 1**: Cancel Veeam renewal, buy your team pizza with savings 🍕
## FAQ
**Q: Is this production-ready?**
A: Used in production by organizations managing petabytes of MySQL data.
**Q: What about support?**
A: Community support via GitHub. Enterprise support available.
**Q: Can it replace XtraBackup?**
A: For MySQL 8.0.17+, yes. We use native Clone Plugin instead.
**Q: What about PostgreSQL?**
A: Full PostgreSQL support including WAL archiving and PITR.
## Get Started
```bash
# Download (single binary, ~15MB)
curl -LO https://github.com/UUXO/dbbackup/releases/latest/download/dbbackup_linux_amd64
chmod +x dbbackup_linux_amd64
# Your first backup
./dbbackup_linux_amd64 backup single production \
--db-type mysql \
--cloud s3://my-backups/
```
## The Bottom Line
Every dollar you spend on backup licensing is a dollar not spent on:
- Better hardware
- Your team
- Actually useful tools
**dbbackup**: Enterprise capabilities. Zero enterprise pricing.
---
*Apache 2.0 Licensed. Free forever. No sales calls required.*
[GitHub](https://github.com/UUXO/dbbackup) | [Documentation](https://github.com/UUXO/dbbackup#readme) | [Changelog](CHANGELOG.md)

View File

@@ -4,8 +4,8 @@ This directory contains pre-compiled binaries for the DB Backup Tool across mult
## Build Information ## Build Information
- **Version**: 3.42.10 - **Version**: 3.42.10
- **Build Time**: 2026-01-12_08:50:35_UTC - **Build Time**: 2026-01-13_07:23:20_UTC
- **Git Commit**: b1f8c6d - **Git Commit**: f153e61
## Recent Updates (v1.1.0) ## Recent Updates (v1.1.0)
- ✅ Fixed TUI progress display with line-by-line output - ✅ Fixed TUI progress display with line-by-line output

View File

@@ -185,15 +185,15 @@ Examples:
// Flags // Flags
var ( var (
dedupDir string dedupDir string
dedupIndexDB string // Separate path for SQLite index (for NFS/CIFS support) dedupIndexDB string // Separate path for SQLite index (for NFS/CIFS support)
dedupCompress bool dedupCompress bool
dedupEncrypt bool dedupEncrypt bool
dedupKey string dedupKey string
dedupName string dedupName string
dedupDBType string dedupDBType string
dedupDBName string dedupDBName string
dedupDBHost string dedupDBHost string
dedupDecompress bool // Auto-decompress gzip input dedupDecompress bool // Auto-decompress gzip input
) )

View File

@@ -414,24 +414,121 @@ func (d *Diagnoser) diagnoseSQLScript(filePath string, compressed bool, result *
// diagnoseClusterArchive analyzes a cluster tar.gz archive // diagnoseClusterArchive analyzes a cluster tar.gz archive
func (d *Diagnoser) diagnoseClusterArchive(filePath string, result *DiagnoseResult) { func (d *Diagnoser) diagnoseClusterArchive(filePath string, result *DiagnoseResult) {
// First verify tar.gz integrity with timeout // Calculate dynamic timeout based on file size
// 5 minutes for large archives (multi-GB archives need more time) // Large archives (100GB+) can take significant time to list
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute) // Minimum 5 minutes, scales with file size, max 180 minutes for very large archives
timeoutMinutes := 5
if result.FileSize > 0 {
// 1 minute per 2 GB, minimum 5 minutes, max 180 minutes
sizeGB := result.FileSize / (1024 * 1024 * 1024)
estimatedMinutes := int(sizeGB/2) + 5
if estimatedMinutes > timeoutMinutes {
timeoutMinutes = estimatedMinutes
}
if timeoutMinutes > 180 {
timeoutMinutes = 180
}
}
d.log.Info("Verifying cluster archive integrity",
"size", fmt.Sprintf("%.1f GB", float64(result.FileSize)/(1024*1024*1024)),
"timeout", fmt.Sprintf("%d min", timeoutMinutes))
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
defer cancel() defer cancel()
// Use streaming approach with pipes to avoid memory issues with large archives
cmd := exec.CommandContext(ctx, "tar", "-tzf", filePath) cmd := exec.CommandContext(ctx, "tar", "-tzf", filePath)
output, err := cmd.Output() stdout, pipeErr := cmd.StdoutPipe()
if err != nil { if pipeErr != nil {
result.IsValid = false // Pipe creation failed - not a corruption issue
result.IsCorrupted = true result.Warnings = append(result.Warnings,
result.Errors = append(result.Errors, fmt.Sprintf("Cannot create pipe for verification: %v", pipeErr),
fmt.Sprintf("Tar archive is invalid or corrupted: %v", err), "Archive integrity cannot be verified but may still be valid")
"Run: tar -tzf "+filePath+" 2>&1 | tail -20")
return return
} }
// Parse tar listing var stderrBuf bytes.Buffer
files := strings.Split(strings.TrimSpace(string(output)), "\n") cmd.Stderr = &stderrBuf
if startErr := cmd.Start(); startErr != nil {
result.Warnings = append(result.Warnings,
fmt.Sprintf("Cannot start tar verification: %v", startErr),
"Archive integrity cannot be verified but may still be valid")
return
}
// Stream output line by line to avoid buffering entire listing in memory
scanner := bufio.NewScanner(stdout)
scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024) // Allow long paths
var files []string
fileCount := 0
for scanner.Scan() {
fileCount++
line := scanner.Text()
// Only store dump/metadata files, not every file
if strings.HasSuffix(line, ".dump") || strings.HasSuffix(line, ".sql.gz") ||
strings.HasSuffix(line, ".sql") || strings.HasSuffix(line, ".json") ||
strings.Contains(line, "globals") || strings.Contains(line, "manifest") ||
strings.Contains(line, "metadata") {
files = append(files, line)
}
}
scanErr := scanner.Err()
waitErr := cmd.Wait()
stderrOutput := stderrBuf.String()
// Handle errors - distinguish between actual corruption and resource/timeout issues
if waitErr != nil || scanErr != nil {
// Check if it was a timeout
if ctx.Err() == context.DeadlineExceeded {
result.Warnings = append(result.Warnings,
fmt.Sprintf("Verification timed out after %d minutes - archive is very large", timeoutMinutes),
"This does not necessarily mean the archive is corrupted",
"Manual verification: tar -tzf "+filePath+" | wc -l")
// Don't mark as corrupted or invalid on timeout - archive may be fine
if fileCount > 0 {
result.Details.TableCount = len(files)
result.Details.TableList = files
}
return
}
// Check for specific gzip/tar corruption indicators
if strings.Contains(stderrOutput, "unexpected end of file") ||
strings.Contains(stderrOutput, "Unexpected EOF") ||
strings.Contains(stderrOutput, "gzip: stdin: unexpected end of file") ||
strings.Contains(stderrOutput, "not in gzip format") ||
strings.Contains(stderrOutput, "invalid compressed data") {
// These indicate actual corruption
result.IsValid = false
result.IsCorrupted = true
result.Errors = append(result.Errors,
"Tar archive appears truncated or corrupted",
fmt.Sprintf("Error: %s", truncateString(stderrOutput, 200)),
"Run: tar -tzf "+filePath+" 2>&1 | tail -20")
return
}
// Other errors (signal killed, memory, etc.) - not necessarily corruption
// If we read some files successfully, the archive structure is likely OK
if fileCount > 0 {
result.Warnings = append(result.Warnings,
fmt.Sprintf("Verification incomplete (read %d files before error)", fileCount),
"Archive may still be valid - error could be due to system resources")
// Proceed with what we got
} else {
// Couldn't read anything - but don't mark as corrupted without clear evidence
result.Warnings = append(result.Warnings,
fmt.Sprintf("Cannot verify archive: %v", waitErr),
"Archive integrity is uncertain - proceed with caution or verify manually")
return
}
}
// Parse the collected file list
var dumpFiles []string var dumpFiles []string
hasGlobals := false hasGlobals := false
hasMetadata := false hasMetadata := false
@@ -497,9 +594,22 @@ func (d *Diagnoser) diagnoseUnknown(filePath string, result *DiagnoseResult) {
// verifyWithPgRestore uses pg_restore --list to verify dump integrity // verifyWithPgRestore uses pg_restore --list to verify dump integrity
func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult) { func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult) {
// Use timeout to prevent blocking on very large dump files // Calculate dynamic timeout based on file size
// 5 minutes for large dumps (multi-GB dumps with many tables) // pg_restore --list is usually faster than tar -tzf for same size
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute) timeoutMinutes := 5
if result.FileSize > 0 {
// 1 minute per 5 GB, minimum 5 minutes, max 30 minutes
sizeGB := result.FileSize / (1024 * 1024 * 1024)
estimatedMinutes := int(sizeGB/5) + 5
if estimatedMinutes > timeoutMinutes {
timeoutMinutes = estimatedMinutes
}
if timeoutMinutes > 30 {
timeoutMinutes = 30
}
}
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
defer cancel() defer cancel()
cmd := exec.CommandContext(ctx, "pg_restore", "--list", filePath) cmd := exec.CommandContext(ctx, "pg_restore", "--list", filePath)
@@ -554,14 +664,72 @@ func (d *Diagnoser) verifyWithPgRestore(filePath string, result *DiagnoseResult)
// DiagnoseClusterDumps extracts and diagnoses all dumps in a cluster archive // DiagnoseClusterDumps extracts and diagnoses all dumps in a cluster archive
func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*DiagnoseResult, error) { func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*DiagnoseResult, error) {
// First, try to list archive contents without extracting (fast check) // Get archive size for dynamic timeout calculation
// 10 minutes for very large archives archiveInfo, err := os.Stat(archivePath)
listCtx, listCancel := context.WithTimeout(context.Background(), 10*time.Minute) if err != nil {
return nil, fmt.Errorf("cannot stat archive: %w", err)
}
// Dynamic timeout based on archive size: base 10 min + 1 min per 3 GB
// Large archives like 100+ GB need more time for tar -tzf
timeoutMinutes := 10
if archiveInfo.Size() > 0 {
sizeGB := archiveInfo.Size() / (1024 * 1024 * 1024)
estimatedMinutes := int(sizeGB/3) + 10
if estimatedMinutes > timeoutMinutes {
timeoutMinutes = estimatedMinutes
}
if timeoutMinutes > 120 { // Max 2 hours
timeoutMinutes = 120
}
}
d.log.Info("Listing cluster archive contents",
"size", fmt.Sprintf("%.1f GB", float64(archiveInfo.Size())/(1024*1024*1024)),
"timeout", fmt.Sprintf("%d min", timeoutMinutes))
listCtx, listCancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
defer listCancel() defer listCancel()
listCmd := exec.CommandContext(listCtx, "tar", "-tzf", archivePath) listCmd := exec.CommandContext(listCtx, "tar", "-tzf", archivePath)
listOutput, listErr := listCmd.CombinedOutput()
if listErr != nil { // Use pipes for streaming to avoid buffering entire output in memory
// This prevents OOM kills on large archives (100GB+) with millions of files
stdout, err := listCmd.StdoutPipe()
if err != nil {
return nil, fmt.Errorf("failed to create stdout pipe: %w", err)
}
var stderrBuf bytes.Buffer
listCmd.Stderr = &stderrBuf
if err := listCmd.Start(); err != nil {
return nil, fmt.Errorf("failed to start tar listing: %w", err)
}
// Stream the output line by line, only keeping relevant files
var files []string
scanner := bufio.NewScanner(stdout)
// Set a reasonable max line length (file paths shouldn't exceed this)
scanner.Buffer(make([]byte, 0, 4096), 1024*1024)
fileCount := 0
for scanner.Scan() {
fileCount++
line := scanner.Text()
// Only store dump files and important files, not every single file
if strings.HasSuffix(line, ".dump") || strings.HasSuffix(line, ".sql") ||
strings.HasSuffix(line, ".sql.gz") || strings.HasSuffix(line, ".json") ||
strings.Contains(line, "globals") || strings.Contains(line, "manifest") ||
strings.Contains(line, "metadata") || strings.HasSuffix(line, "/") {
files = append(files, line)
}
}
scanErr := scanner.Err()
listErr := listCmd.Wait()
if listErr != nil || scanErr != nil {
// Archive listing failed - likely corrupted // Archive listing failed - likely corrupted
errResult := &DiagnoseResult{ errResult := &DiagnoseResult{
FilePath: archivePath, FilePath: archivePath,
@@ -573,7 +741,12 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
Details: &DiagnoseDetails{}, Details: &DiagnoseDetails{},
} }
errOutput := string(listOutput) errOutput := stderrBuf.String()
actualErr := listErr
if scanErr != nil {
actualErr = scanErr
}
if strings.Contains(errOutput, "unexpected end of file") || if strings.Contains(errOutput, "unexpected end of file") ||
strings.Contains(errOutput, "Unexpected EOF") || strings.Contains(errOutput, "Unexpected EOF") ||
strings.Contains(errOutput, "truncated") { strings.Contains(errOutput, "truncated") {
@@ -585,7 +758,7 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
"Solution: Re-create the backup from source database") "Solution: Re-create the backup from source database")
} else { } else {
errResult.Errors = append(errResult.Errors, errResult.Errors = append(errResult.Errors,
fmt.Sprintf("Cannot list archive contents: %v", listErr), fmt.Sprintf("Cannot list archive contents: %v", actualErr),
fmt.Sprintf("tar error: %s", truncateString(errOutput, 300)), fmt.Sprintf("tar error: %s", truncateString(errOutput, 300)),
"Run manually: tar -tzf "+archivePath+" 2>&1 | tail -50") "Run manually: tar -tzf "+archivePath+" 2>&1 | tail -50")
} }
@@ -593,11 +766,10 @@ func (d *Diagnoser) DiagnoseClusterDumps(archivePath, tempDir string) ([]*Diagno
return []*DiagnoseResult{errResult}, nil return []*DiagnoseResult{errResult}, nil
} }
// Archive is listable - now check disk space before extraction d.log.Debug("Archive listing streamed successfully", "total_files", fileCount, "relevant_files", len(files))
files := strings.Split(strings.TrimSpace(string(listOutput)), "\n")
// Check if we have enough disk space (estimate 4x archive size needed) // Check if we have enough disk space (estimate 4x archive size needed)
archiveInfo, _ := os.Stat(archivePath) // archiveInfo already obtained at function start
requiredSpace := archiveInfo.Size() * 4 requiredSpace := archiveInfo.Size() * 4
// Check temp directory space - try to extract metadata first // Check temp directory space - try to extract metadata first

View File

@@ -229,8 +229,14 @@ func containsSQLKeywords(content string) bool {
} }
// CheckDiskSpace verifies sufficient disk space for restore // CheckDiskSpace verifies sufficient disk space for restore
// Uses the effective work directory (WorkDir if set, otherwise BackupDir) since
// that's where extraction actually happens for large databases
func (s *Safety) CheckDiskSpace(archivePath string, multiplier float64) error { func (s *Safety) CheckDiskSpace(archivePath string, multiplier float64) error {
return s.CheckDiskSpaceAt(archivePath, s.cfg.BackupDir, multiplier) checkDir := s.cfg.GetEffectiveWorkDir()
if checkDir == "" {
checkDir = s.cfg.BackupDir
}
return s.CheckDiskSpaceAt(archivePath, checkDir, multiplier)
} }
// CheckDiskSpaceAt verifies sufficient disk space at a specific directory // CheckDiskSpaceAt verifies sufficient disk space at a specific directory

View File

@@ -106,9 +106,23 @@ type safetyCheckCompleteMsg struct {
func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string) tea.Cmd { func runSafetyChecks(cfg *config.Config, log logger.Logger, archive ArchiveInfo, targetDB string) tea.Cmd {
return func() tea.Msg { return func() tea.Msg {
// 10 minutes for safety checks - large archives can take a long time to diagnose // Dynamic timeout based on archive size for large database support
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute) // Base: 10 minutes + 1 minute per 5 GB, max 120 minutes
timeoutMinutes := 10
if archive.Size > 0 {
sizeGB := archive.Size / (1024 * 1024 * 1024)
estimatedMinutes := int(sizeGB/5) + 10
if estimatedMinutes > timeoutMinutes {
timeoutMinutes = estimatedMinutes
}
if timeoutMinutes > 120 {
timeoutMinutes = 120
}
}
ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMinutes)*time.Minute)
defer cancel() defer cancel()
_ = ctx // Used by database checks below
safety := restore.NewSafety(cfg, log) safety := restore.NewSafety(cfg, log)
checks := []SafetyCheck{} checks := []SafetyCheck{}

View File

@@ -1,171 +0,0 @@
#!/bin/bash
# COMPLETE emoji/Unicode removal - Replace ALL non-ASCII with ASCII equivalents
# Date: January 8, 2026
set -euo pipefail
echo "[INFO] Starting COMPLETE Unicode->ASCII replacement..."
echo ""
# Create backup
BACKUP_DIR="backup_unicode_removal_$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"
echo "[INFO] Creating backup in $BACKUP_DIR..."
find . -name "*.go" -type f -not -path "*/vendor/*" -not -path "*/.git/*" -exec bash -c 'mkdir -p "$1/$(dirname "$2")" && cp "$2" "$1/$2"' -- "$BACKUP_DIR" {} \;
echo "[OK] Backup created"
echo ""
# Find all affected files
echo "[SEARCH] Finding files with Unicode..."
FILES=$(find . -name "*.go" -type f -not -path "*/vendor/*" -not -path "*/.git/*")
PROCESSED=0
TOTAL=$(echo "$FILES" | wc -l)
for file in $FILES; do
PROCESSED=$((PROCESSED + 1))
if ! grep -qP '[\x{80}-\x{FFFF}]' "$file" 2>/dev/null; then
continue
fi
echo "[$PROCESSED/$TOTAL] Processing: $file"
# Create temp file for atomic replacements
TMPFILE="${file}.tmp"
cp "$file" "$TMPFILE"
# Box drawing / decorative (used in TUI borders)
sed -i 's/─/-/g' "$TMPFILE"
sed -i 's/━/-/g' "$TMPFILE"
sed -i 's/│/|/g' "$TMPFILE"
sed -i 's/║/|/g' "$TMPFILE"
sed -i 's/├/+/g' "$TMPFILE"
sed -i 's/└/+/g' "$TMPFILE"
sed -i 's/╔/+/g' "$TMPFILE"
sed -i 's/╗/+/g' "$TMPFILE"
sed -i 's/╚/+/g' "$TMPFILE"
sed -i 's/╝/+/g' "$TMPFILE"
sed -i 's/╠/+/g' "$TMPFILE"
sed -i 's/╣/+/g' "$TMPFILE"
sed -i 's/═/=/g' "$TMPFILE"
# Status symbols
sed -i 's/✅/[OK]/g' "$TMPFILE"
sed -i 's/❌/[FAIL]/g' "$TMPFILE"
sed -i 's/✓/[+]/g' "$TMPFILE"
sed -i 's/✗/[-]/g' "$TMPFILE"
sed -i 's/⚠️/[WARN]/g' "$TMPFILE"
sed -i 's/⚠/[!]/g' "$TMPFILE"
sed -i 's/❓/[?]/g' "$TMPFILE"
# Arrows
sed -i 's/←/</g' "$TMPFILE"
sed -i 's/→/>/g' "$TMPFILE"
sed -i 's/↑/^/g' "$TMPFILE"
sed -i 's/↓/v/g' "$TMPFILE"
sed -i 's/▲/^/g' "$TMPFILE"
sed -i 's/▼/v/g' "$TMPFILE"
sed -i 's/▶/>/g' "$TMPFILE"
# Shapes
sed -i 's/●/*\*/g' "$TMPFILE"
sed -i 's/○/o/g' "$TMPFILE"
sed -i 's/⚪/o/g' "$TMPFILE"
sed -i 's/•/-/g' "$TMPFILE"
sed -i 's/█/#/g' "$TMPFILE"
sed -i 's/▎/|/g' "$TMPFILE"
sed -i 's/░/./g' "$TMPFILE"
sed -i 's//-/g' "$TMPFILE"
# Emojis - Info/Data
sed -i 's/📊/[INFO]/g' "$TMPFILE"
sed -i 's/📋/[LIST]/g' "$TMPFILE"
sed -i 's/📁/[DIR]/g' "$TMPFILE"
sed -i 's/📦/[PKG]/g' "$TMPFILE"
sed -i 's/📜/[LOG]/g' "$TMPFILE"
sed -i 's/📭/[EMPTY]/g' "$TMPFILE"
sed -i 's/📝/[NOTE]/g' "$TMPFILE"
sed -i 's/💡/[TIP]/g' "$TMPFILE"
# Emojis - Actions/Objects
sed -i 's/🎯/[TARGET]/g' "$TMPFILE"
sed -i 's/🛡️/[SECURE]/g' "$TMPFILE"
sed -i 's/🔒/[LOCK]/g' "$TMPFILE"
sed -i 's/🔓/[UNLOCK]/g' "$TMPFILE"
sed -i 's/🔍/[SEARCH]/g' "$TMPFILE"
sed -i 's/🔀/[SWITCH]/g' "$TMPFILE"
sed -i 's/🔥/[FIRE]/g' "$TMPFILE"
sed -i 's/💾/[SAVE]/g' "$TMPFILE"
sed -i 's/🗄️/[DB]/g' "$TMPFILE"
sed -i 's/🗄/[DB]/g' "$TMPFILE"
# Emojis - Time/Status
sed -i 's/⏱️/[TIME]/g' "$TMPFILE"
sed -i 's/⏱/[TIME]/g' "$TMPFILE"
sed -i 's/⏳/[WAIT]/g' "$TMPFILE"
sed -i 's/⏪/[REW]/g' "$TMPFILE"
sed -i 's/⏹️/[STOP]/g' "$TMPFILE"
sed -i 's/⏹/[STOP]/g' "$TMPFILE"
sed -i 's/⟳/[SYNC]/g' "$TMPFILE"
# Emojis - Cloud
sed -i 's/☁️/[CLOUD]/g' "$TMPFILE"
sed -i 's/☁/[CLOUD]/g' "$TMPFILE"
sed -i 's/📤/[UPLOAD]/g' "$TMPFILE"
sed -i 's/📥/[DOWNLOAD]/g' "$TMPFILE"
sed -i 's/🗑️/[DELETE]/g' "$TMPFILE"
# Emojis - Misc
sed -i 's/📈/[UP]/g' "$TMPFILE"
sed -i 's/📉/[DOWN]/g' "$TMPFILE"
sed -i 's/⌨️/[KEY]/g' "$TMPFILE"
sed -i 's/⌨/[KEY]/g' "$TMPFILE"
sed -i 's/⚙️/[CONFIG]/g' "$TMPFILE"
sed -i 's/⚙/[CONFIG]/g' "$TMPFILE"
sed -i 's/✏️/[EDIT]/g' "$TMPFILE"
sed -i 's/✏/[EDIT]/g' "$TMPFILE"
sed -i 's/⚡/[FAST]/g' "$TMPFILE"
# Spinner characters (braille patterns for loading animations)
sed -i 's/⠋/|/g' "$TMPFILE"
sed -i 's/⠙/\//g' "$TMPFILE"
sed -i 's/⠹/-/g' "$TMPFILE"
sed -i 's/⠸/\\/g' "$TMPFILE"
sed -i 's/⠼/|/g' "$TMPFILE"
sed -i 's/⠴/\//g' "$TMPFILE"
sed -i 's/⠦/-/g' "$TMPFILE"
sed -i 's/⠧/\\/g' "$TMPFILE"
sed -i 's/⠇/|/g' "$TMPFILE"
sed -i 's/⠏/\//g' "$TMPFILE"
# Move temp file over original
mv "$TMPFILE" "$file"
done
echo ""
echo "[OK] Replacement complete!"
echo ""
# Verify
REMAINING=$(grep -roP '[\x{80}-\x{FFFF}]' --include="*.go" . 2>/dev/null | wc -l || echo "0")
echo "[INFO] Unicode characters remaining: $REMAINING"
if [ "$REMAINING" -gt 0 ]; then
echo "[WARN] Some Unicode still exists (might be in comments or safe locations)"
echo "[INFO] Unique remaining characters:"
grep -roP '[\x{80}-\x{FFFF}]' --include="*.go" . 2>/dev/null | grep -oP '[\x{80}-\x{FFFF}]' | sort -u | head -20
else
echo "[OK] All Unicode characters replaced with ASCII!"
fi
echo ""
echo "[INFO] Backup: $BACKUP_DIR"
echo "[INFO] To restore: cp -r $BACKUP_DIR/* ."
echo ""
echo "[INFO] Next steps:"
echo " 1. go build"
echo " 2. go test ./..."
echo " 3. Test TUI: ./dbbackup"
echo " 4. Commit: git add . && git commit -m 'v3.42.11: Replace all Unicode with ASCII'"
echo ""

View File

@@ -1,130 +0,0 @@
#!/bin/bash
# Remove ALL emojis/unicode symbols from Go code and replace with ASCII
# Date: January 8, 2026
# Issue: 638 lines contain Unicode emojis causing display issues
set -euo pipefail
echo "[INFO] Starting emoji removal process..."
echo ""
# Find all Go files with emojis (expanded emoji list)
echo "[SEARCH] Finding affected files..."
FILES=$(find . -name "*.go" -type f -not -path "*/vendor/*" -not -path "*/.git/*" | xargs grep -l -P '[\x{1F000}-\x{1FFFF}]|[\x{2300}-\x{27BF}]|[\x{2600}-\x{26FF}]' 2>/dev/null || true)
if [ -z "$FILES" ]; then
echo "[WARN] No files with emojis found!"
exit 0
fi
FILECOUNT=$(echo "$FILES" | wc -l)
echo "[INFO] Found $FILECOUNT files containing emojis"
echo ""
# Count total emojis before
BEFORE=$(find . -name "*.go" -type f -not -path "*/vendor/*" | xargs grep -oP '[\x{1F000}-\x{1FFFF}]|[\x{2300}-\x{27BF}]|[\x{2600}-\x{26FF}]' 2>/dev/null | wc -l || echo "0")
echo "[INFO] Total emojis found: $BEFORE"
echo ""
# Create backup
BACKUP_DIR="backup_before_emoji_removal_$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"
echo "[INFO] Creating backup in $BACKUP_DIR..."
for file in $FILES; do
mkdir -p "$BACKUP_DIR/$(dirname "$file")"
cp "$file" "$BACKUP_DIR/$file"
done
echo "[OK] Backup created"
echo ""
# Process each file
echo "[INFO] Replacing emojis with ASCII equivalents..."
PROCESSED=0
for file in $FILES; do
PROCESSED=$((PROCESSED + 1))
echo "[$PROCESSED/$FILECOUNT] Processing: $file"
# Create temp file
TMPFILE="${file}.tmp"
# Status indicators
sed 's/✅/[OK]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/❌/[FAIL]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/✓/[+]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/✗/[-]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
# Warning symbols (⚠️ has variant selector, handle both)
sed 's/⚠️/[WARN]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/⚠/[!]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
# Info/Data symbols
sed 's/📊/[INFO]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/📋/[LIST]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/📁/[DIR]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/📦/[PKG]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
# Target/Security
sed 's/🎯/[TARGET]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/🛡️/[SECURE]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/🔒/[LOCK]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/🔓/[UNLOCK]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
# Actions
sed 's/🔍/[SEARCH]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/⏱️/[TIME]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
# Cloud operations (☁️ has variant selector, handle both)
sed 's/☁️/[CLOUD]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/☁/[CLOUD]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/📤/[UPLOAD]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/📥/[DOWNLOAD]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/🗑️/[DELETE]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
# Other
sed 's/📈/[UP]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/📉/[DOWN]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
# Additional emojis found
sed 's/⌨️/[KEY]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/⌨/[KEY]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/🗄️/[DB]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/🗄/[DB]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/⚙️/[CONFIG]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/⚙/[CONFIG]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/✏️/[EDIT]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
sed 's/✏/[EDIT]/g' "$file" > "$TMPFILE" && mv "$TMPFILE" "$file"
done
echo ""
echo "[OK] Replacement complete!"
echo ""
# Count remaining emojis
AFTER=$(find . -name "*.go" -type f -not -path "*/vendor/*" | xargs grep -oP '[\x{1F000}-\x{1FFFF}]|[\x{2300}-\x{27BF}]|[\x{2600}-\x{26FF}]' 2>/dev/null | wc -l || echo "0")
echo "[INFO] Emojis before: $BEFORE"
echo "[INFO] Emojis after: $AFTER"
echo "[INFO] Emojis removed: $((BEFORE - AFTER))"
echo ""
if [ "$AFTER" -gt 0 ]; then
echo "[WARN] $AFTER emojis still remaining!"
echo "[INFO] Listing remaining emojis:"
find . -name "*.go" -type f -not -path "*/vendor/*" | xargs grep -nP '[\x{1F000}-\x{1FFFF}]|[\x{2300}-\x{27BF}]|[\x{2600}-\x{26FF}]' 2>/dev/null | head -20
else
echo "[OK] All emojis successfully removed!"
fi
echo ""
echo "[INFO] Backup location: $BACKUP_DIR"
echo "[INFO] To restore: cp -r $BACKUP_DIR/* ."
echo ""
echo "[INFO] Next steps:"
echo " 1. Build: go build"
echo " 2. Test: go test ./..."
echo " 3. Manual testing: ./dbbackup status"
echo " 4. If OK, commit: git add . && git commit -m 'Replace emojis with ASCII'"
echo " 5. If broken, restore: cp -r $BACKUP_DIR/* ."
echo ""
echo "[OK] Emoji removal script completed!"