Renz 64f1458e9a feat: Sprint 4 - Azure Blob Storage and Google Cloud Storage support
Implemented full native support for Azure Blob Storage and Google Cloud Storage:

**Azure Blob Storage (internal/cloud/azure.go):**
- Native Azure SDK integration (github.com/Azure/azure-sdk-for-go)
- Block blob upload for large files (>256MB with 100MB blocks)
- Azurite emulator support for local testing
- Production Azure authentication (account name + key)
- SHA-256 integrity verification with metadata
- Streaming uploads with progress tracking

**Google Cloud Storage (internal/cloud/gcs.go):**
- Native GCS SDK integration (cloud.google.com/go/storage)
- Chunked upload for large files (16MB chunks)
- fake-gcs-server emulator support for local testing
- Application Default Credentials support
- Service account JSON key file support
- SHA-256 integrity verification with metadata
- Streaming uploads with progress tracking

**Backend Integration:**
- Updated NewBackend() factory to support azure/azblob and gs/gcs providers
- Added Name() methods to both backends
- Fixed ProgressReader usage across all backends
- Updated Config comments to document Azure/GCS support

**Testing Infrastructure:**
- docker-compose.azurite.yml: Azurite + PostgreSQL + MySQL test environment
- docker-compose.gcs.yml: fake-gcs-server + PostgreSQL + MySQL test environment
- scripts/test_azure_storage.sh: 8 comprehensive Azure integration tests
- scripts/test_gcs_storage.sh: 8 comprehensive GCS integration tests
- Both test scripts validate upload/download/verify/cleanup/restore operations

**Documentation:**
- AZURE.md: Complete guide (600+ lines) covering setup, authentication, usage
- GCS.md: Complete guide (600+ lines) covering setup, authentication, usage
- Updated CLOUD.md with Azure and GCS sections
- Updated internal/config/config.go with Azure/GCS field documentation

**Test Coverage:**
- Large file uploads (300MB for Azure, 200MB for GCS)
- Block/chunked upload verification
- Backup verification with SHA-256 checksums
- Restore from cloud URIs
- Cleanup and retention policies
- Emulator support for both providers

**Dependencies Added:**
- Azure: github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.6.3
- GCS: cloud.google.com/go/storage v1.57.2
- Plus transitive dependencies (~50+ packages)

**Build:**
- Compiles successfully: 68MB binary
- All imports resolved
- No compilation errors

Sprint 4 closes the multi-cloud gap identified in Sprint 3 evaluation.
Users can now use Azure and GCS URIs that were previously parsed but unsupported.
2025-11-25 21:31:21 +00:00


Cloud Storage Guide for dbbackup

Overview

dbbackup v2.0 includes comprehensive cloud storage integration, allowing you to back up directly to S3-compatible storage providers, Azure Blob Storage, and Google Cloud Storage, and to restore from cloud URIs.

Supported Providers:

  • AWS S3
  • MinIO (self-hosted S3-compatible)
  • Backblaze B2
  • Azure Blob Storage (native support)
  • Google Cloud Storage (native support)
  • Any S3-compatible storage

Key Features:

  • Direct backup to cloud with --cloud URI flag
  • Restore from cloud URIs
  • Verify cloud backup integrity
  • Apply retention policies to cloud storage
  • Multipart upload for large files (>100MB)
  • Progress tracking for uploads/downloads
  • Automatic metadata synchronization
  • Streaming transfers (memory efficient)

Quick Start

1. Set Up Credentials

# For AWS S3
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-1"

# For MinIO
export AWS_ACCESS_KEY_ID="minioadmin"
export AWS_SECRET_ACCESS_KEY="minioadmin123"
export AWS_ENDPOINT_URL="http://localhost:9000"

# For Backblaze B2
export AWS_ACCESS_KEY_ID="your-b2-key-id"
export AWS_SECRET_ACCESS_KEY="your-b2-application-key"
export AWS_ENDPOINT_URL="https://s3.us-west-002.backblazeb2.com"

2. Backup with Cloud URI

# Backup to S3
dbbackup backup single mydb --cloud s3://my-bucket/backups/

# Backup to MinIO
dbbackup backup single mydb --cloud minio://my-bucket/backups/

# Backup to Backblaze B2
dbbackup backup single mydb --cloud b2://my-bucket/backups/

3. Restore from Cloud

# Restore from cloud URI
dbbackup restore single s3://my-bucket/backups/mydb_20260115_120000.dump --confirm

# Restore to different database
dbbackup restore single s3://my-bucket/backups/mydb.dump \
    --target mydb_restored \
    --confirm

URI Syntax

Cloud URIs follow this format:

<provider>://<bucket>/<path>/<filename>

Supported Providers:

  • s3:// - AWS S3 or S3-compatible storage
  • minio:// - MinIO (auto-enables path-style addressing)
  • b2:// - Backblaze B2
  • gs:// or gcs:// - Google Cloud Storage (native support)
  • azure:// or azblob:// - Azure Blob Storage (native support)

Examples:

s3://production-backups/databases/postgres/
minio://local-backups/dev/mydb/
b2://offsite-backups/daily/
gs://gcp-backups/prod/
azure://azure-backups/prod/

Configuration Methods

Method 1: Cloud URI Flag

dbbackup backup single mydb --cloud s3://my-bucket/backups/

Method 2: Individual Flags

dbbackup backup single mydb \
    --cloud-auto-upload \
    --cloud-provider s3 \
    --cloud-bucket my-bucket \
    --cloud-prefix backups/

Method 3: Environment Variables

export CLOUD_ENABLED=true
export CLOUD_AUTO_UPLOAD=true
export CLOUD_PROVIDER=s3
export CLOUD_BUCKET=my-bucket
export CLOUD_PREFIX=backups/
export CLOUD_REGION=us-east-1

dbbackup backup single mydb

Method 4: Config File

# ~/.dbbackup.conf
[cloud]
enabled = true
auto_upload = true
provider = "s3"
bucket = "my-bucket"
prefix = "backups/"
region = "us-east-1"

Commands

Cloud Upload

Upload existing backup files to cloud storage:

# Upload single file
dbbackup cloud upload /backups/mydb.dump \
    --cloud-provider s3 \
    --cloud-bucket my-bucket

# Upload with cloud URI flags
dbbackup cloud upload /backups/mydb.dump \
    --cloud-provider minio \
    --cloud-bucket local-backups \
    --cloud-endpoint http://localhost:9000

# Upload multiple files
dbbackup cloud upload /backups/*.dump \
    --cloud-provider s3 \
    --cloud-bucket my-bucket \
    --verbose

Cloud Download

Download backups from cloud storage:

# Download to current directory
dbbackup cloud download mydb.dump . \
    --cloud-provider s3 \
    --cloud-bucket my-bucket

# Download to specific directory
dbbackup cloud download backups/mydb.dump /restore/ \
    --cloud-provider s3 \
    --cloud-bucket my-bucket \
    --verbose

Cloud List

List backups in cloud storage:

# List all backups
dbbackup cloud list \
    --cloud-provider s3 \
    --cloud-bucket my-bucket

# List with prefix filter
dbbackup cloud list \
    --cloud-provider s3 \
    --cloud-bucket my-bucket \
    --cloud-prefix postgres/

# Verbose output with details
dbbackup cloud list \
    --cloud-provider s3 \
    --cloud-bucket my-bucket \
    --verbose

Cloud Delete

Delete backups from cloud storage:

# Delete specific backup (with confirmation prompt)
dbbackup cloud delete mydb_old.dump \
    --cloud-provider s3 \
    --cloud-bucket my-bucket

# Delete without confirmation
dbbackup cloud delete mydb_old.dump \
    --cloud-provider s3 \
    --cloud-bucket my-bucket \
    --confirm

Backup with Auto-Upload

# Backup and automatically upload
dbbackup backup single mydb --cloud s3://my-bucket/backups/

# With individual flags
dbbackup backup single mydb \
    --cloud-auto-upload \
    --cloud-provider s3 \
    --cloud-bucket my-bucket \
    --cloud-prefix backups/

Restore from Cloud

# Restore from cloud URI (auto-download)
dbbackup restore single s3://my-bucket/backups/mydb.dump --confirm

# Restore to different database
dbbackup restore single s3://my-bucket/backups/mydb.dump \
    --target mydb_restored \
    --confirm

# Restore with database creation
dbbackup restore single s3://my-bucket/backups/mydb.dump \
    --create \
    --confirm

Verify Cloud Backups

# Verify single cloud backup
dbbackup verify-backup s3://my-bucket/backups/mydb.dump

# Quick verification (size check only)
dbbackup verify-backup s3://my-bucket/backups/mydb.dump --quick

# Verbose output
dbbackup verify-backup s3://my-bucket/backups/mydb.dump --verbose

Cloud Cleanup

Apply retention policies to cloud storage:

# Cleanup old backups (dry-run)
dbbackup cleanup s3://my-bucket/backups/ \
    --retention-days 30 \
    --min-backups 5 \
    --dry-run

# Actual cleanup
dbbackup cleanup s3://my-bucket/backups/ \
    --retention-days 30 \
    --min-backups 5

# Pattern-based cleanup
dbbackup cleanup s3://my-bucket/backups/ \
    --retention-days 7 \
    --min-backups 3 \
    --pattern "mydb_*.dump"

Provider-Specific Setup

AWS S3

Prerequisites:

  • AWS account
  • S3 bucket created
  • IAM user with S3 permissions

IAM Policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket/*",
        "arn:aws:s3:::my-bucket"
      ]
    }
  ]
}

Configuration:

export AWS_ACCESS_KEY_ID="AKIAIOSFODNN7EXAMPLE"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
export AWS_REGION="us-east-1"

dbbackup backup single mydb --cloud s3://my-bucket/backups/

MinIO (Self-Hosted)

Setup with Docker:

docker run -d \
  -p 9000:9000 \
  -p 9001:9001 \
  -e "MINIO_ROOT_USER=minioadmin" \
  -e "MINIO_ROOT_PASSWORD=minioadmin123" \
  --name minio \
  minio/minio server /data --console-address ":9001"

# Create bucket
docker exec minio mc alias set local http://localhost:9000 minioadmin minioadmin123
docker exec minio mc mb local/backups

Configuration:

export AWS_ACCESS_KEY_ID="minioadmin"
export AWS_SECRET_ACCESS_KEY="minioadmin123"
export AWS_ENDPOINT_URL="http://localhost:9000"

dbbackup backup single mydb --cloud minio://backups/db/

Or use docker-compose:

docker-compose -f docker-compose.minio.yml up -d

Backblaze B2

Prerequisites:

  • Backblaze account
  • B2 bucket created
  • Application key generated

Configuration:

export AWS_ACCESS_KEY_ID="<your-b2-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-b2-application-key>"
export AWS_ENDPOINT_URL="https://s3.us-west-002.backblazeb2.com"
export AWS_REGION="us-west-002"

dbbackup backup single mydb --cloud b2://my-bucket/backups/

Azure Blob Storage

Native Azure Blob Storage support with comprehensive features.

See AZURE.md for complete documentation.

Quick Start:

# Using account name and key
dbbackup backup postgres \
  --host localhost \
  --database mydb \
  --cloud "azure://container/backups/db.sql?account=myaccount&key=ACCOUNT_KEY"

# With Azurite emulator for testing
dbbackup backup postgres \
  --host localhost \
  --database mydb \
  --cloud "azure://test-backups/db.sql?endpoint=http://localhost:10000"

Features:

  • Native Azure SDK integration
  • Block blob upload for large files (>256MB)
  • Azurite emulator support for local testing
  • SHA-256 integrity verification
  • Comprehensive test suite

Google Cloud Storage

Native Google Cloud Storage support with full features.

See GCS.md for complete documentation.

Quick Start:

# Using Application Default Credentials
dbbackup backup postgres \
  --host localhost \
  --database mydb \
  --cloud "gs://mybucket/backups/db.sql"

# With service account
dbbackup backup postgres \
  --host localhost \
  --database mydb \
  --cloud "gs://mybucket/backups/db.sql?credentials=/path/to/key.json"

# With fake-gcs-server emulator for testing
dbbackup backup postgres \
  --host localhost \
  --database mydb \
  --cloud "gs://test-backups/db.sql?endpoint=http://localhost:4443/storage/v1"

Features:

  • Native GCS SDK integration
  • Chunked upload for large files (16MB chunks)
  • fake-gcs-server emulator support
  • Application Default Credentials support
  • Workload Identity for GKE

Features

Multipart Upload

Files larger than 100MB automatically use multipart upload for:

  • Faster transfers with parallel parts
  • Resume capability on failure
  • Better reliability for large files

Configuration:

  • Part size: 10MB
  • Concurrency: 10 parallel parts
  • Automatic based on file size

Progress Tracking

Real-time progress for uploads and downloads:

Uploading backup to cloud...
Progress: 10%
Progress: 20%
Progress: 30%
...
Upload completed: /backups/mydb.dump (1.2 GB)

Metadata Synchronization

Automatically uploads a .meta.json file with each backup (see the sketch after this list) containing:

  • SHA-256 checksum
  • Database name and type
  • Backup timestamp
  • File size
  • Compression info
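
A quick way to inspect that metadata is to pull the sidecar object down next to the backup. This is a sketch only: the exact object key (backup name plus a .meta.json suffix) and the use of the AWS CLI and jq are assumptions, not documented dbbackup behavior.

# Hypothetical key layout: <backup name>.meta.json stored beside the backup
aws s3 cp s3://my-bucket/backups/mydb_20260115_120000.dump.meta.json - | jq .
# Expect fields such as the SHA-256 checksum, database name/type, timestamp, size, and compression info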

Automatic Verification

Downloads from cloud include automatic checksum verification:

Downloading backup from cloud...
Download completed
Verifying checksum...
Checksum verified successfully: sha256=abc123...

Testing

Local Testing with MinIO

1. Start MinIO:

docker-compose -f docker-compose.minio.yml up -d

2. Run Integration Tests:

./scripts/test_cloud_storage.sh

3. Manual Testing:

# Set credentials
export AWS_ACCESS_KEY_ID=minioadmin
export AWS_SECRET_ACCESS_KEY=minioadmin123
export AWS_ENDPOINT_URL=http://localhost:9000

# Test backup
dbbackup backup single mydb --cloud minio://test-backups/test/

# Test restore
dbbackup restore single minio://test-backups/test/mydb.dump --confirm

# Test verify
dbbackup verify-backup minio://test-backups/test/mydb.dump

# Test cleanup
dbbackup cleanup minio://test-backups/test/ --retention-days 7 --dry-run

4. Access MinIO Console:

Open http://localhost:9001 in a browser and log in with minioadmin / minioadmin123.

Best Practices

Security

  1. Never commit credentials (a credentials-file sketch follows this list):

    # Use environment variables or config files
    export AWS_ACCESS_KEY_ID="..."
    
  2. Use IAM roles when possible:

    # On EC2/ECS, credentials are automatic
    dbbackup backup single mydb --cloud s3://bucket/
    
  3. Restrict bucket permissions:

    • Minimum required: GetObject, PutObject, DeleteObject, ListBucket
    • Use bucket policies to limit access
  4. Enable encryption:

    • S3: Server-side encryption enabled by default
    • MinIO: Configure encryption at rest
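
As item 1 notes, one common pattern is to keep the credentials in a root-owned file outside the repository and source it only in the shell that runs the backup. The path below is hypothetical; the variables are the same ones used throughout this guide.

# /etc/dbbackup/s3.env (hypothetical path; chmod 600, excluded from version control)
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"

# Load the credentials and run the backup
source /etc/dbbackup/s3.env
dbbackup backup single mydb --cloud s3://my-bucket/backups/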

Performance

  1. Use multipart for large backups:

    • Automatic for files >100MB
    • Configure concurrency based on bandwidth
  2. Choose nearby regions:

    --cloud-region us-west-2  # Closest to your servers
    
  3. Use compression:

    --compression gzip  # Reduces upload size
    

Reliability

  1. Test restores regularly:

    # Monthly restore test
    dbbackup restore single s3://bucket/latest.dump --target test_restore
    
  2. Verify backups:

    # Daily verification
    dbbackup verify-backup s3://bucket/backups/*.dump
    
  3. Monitor retention (see the cron sketch below):

    # Weekly cleanup check
    dbbackup cleanup s3://bucket/ --retention-days 30 --dry-run
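
These checks can be scheduled with cron. The schedule, user, and file path below are illustrative assumptions; the commands themselves are the ones shown above.

# /etc/cron.d/dbbackup-checks (hypothetical): daily verification, weekly retention dry-run
30 2 * * *  backup  dbbackup verify-backup s3://bucket/backups/*.dump
0  3 * * 0  backup  dbbackup cleanup s3://bucket/ --retention-days 30 --dry-run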
    

Cost Optimization

  1. Use lifecycle policies:

    • S3: Transition old backups to Glacier
    • Configure in the AWS Console or with a bucket lifecycle configuration (see the sketch after this list)
  2. Cleanup old backups:

    dbbackup cleanup s3://bucket/ --retention-days 30 --min-backups 10
    
  3. Choose appropriate storage class:

    • Standard: Frequent access
    • Infrequent Access: Monthly restores
    • Glacier: Long-term archive
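
For item 1, the sketch below shows one way to add a Glacier transition with the AWS CLI. The rule ID, prefix, and 30-day threshold are illustrative assumptions; adjust them to your bucket layout.

# Write a lifecycle rule that moves objects under backups/ to Glacier after 30 days
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "archive-old-backups",
      "Filter": { "Prefix": "backups/" },
      "Status": "Enabled",
      "Transitions": [ { "Days": 30, "StorageClass": "GLACIER" } ]
    }
  ]
}
EOF

# Apply it to the bucket
aws s3api put-bucket-lifecycle-configuration \
    --bucket my-bucket \
    --lifecycle-configuration file://lifecycle.json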

Troubleshooting

Connection Issues

Problem: Cannot connect to S3/MinIO

Error: failed to create cloud backend: failed to load AWS config

Solution:

  1. Check credentials:

    echo $AWS_ACCESS_KEY_ID
    echo $AWS_SECRET_ACCESS_KEY
    
  2. Test connectivity:

    curl $AWS_ENDPOINT_URL
    
  3. Verify endpoint URL for MinIO/B2

Permission Errors

Problem: Access denied

Error: failed to upload to S3: AccessDenied

Solution:

  1. Check IAM policy includes required permissions
  2. Verify bucket name is correct
  3. Check that the bucket policy allows your IAM user (see the checks below)
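
If the AWS CLI is available, the checks below help confirm which identity your credentials resolve to and whether it can list the bucket. These are standard AWS CLI commands, not dbbackup features; for MinIO or B2, add --endpoint-url $AWS_ENDPOINT_URL.

# Which identity do the current credentials belong to?
aws sts get-caller-identity

# Can that identity list the bucket?
aws s3 ls s3://my-bucket/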

Upload Failures

Problem: Large file upload fails

Error: multipart upload failed: connection timeout

Solution:

  1. Check network stability
  2. Retry - multipart uploads resume automatically
  3. Increase timeout in config
  4. Check firewall allows outbound HTTPS

Verification Failures

Problem: Checksum mismatch

Error: checksum mismatch: expected abc123, got def456

Solution:

  1. Re-download the backup and re-check its checksum (see the sketch below)
  2. Check if file was corrupted during upload
  3. Verify original backup integrity locally
  4. Re-upload if necessary
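
A manual re-check looks roughly like this. It uses only commands documented above; comparing against the .meta.json checksum assumes you know which field holds the SHA-256 value.

# Re-download the backup and compute its checksum locally
dbbackup cloud download backups/mydb.dump /tmp/verify/ \
    --cloud-provider s3 \
    --cloud-bucket my-bucket
sha256sum /tmp/verify/mydb.dump

# Compare the result against the expected checksum from the error message or the backup's .meta.json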

Examples

Full Backup Workflow

#!/bin/bash
# Daily backup to S3 with retention

# Backup all databases
for db in db1 db2 db3; do
    dbbackup backup single "$db" \
        --cloud "s3://production-backups/daily/$db/" \
        --compression gzip
done

# Cleanup old backups (keep 30 days, min 10 backups)
dbbackup cleanup s3://production-backups/daily/ \
    --retention-days 30 \
    --min-backups 10

# Verify today's backups
dbbackup verify-backup s3://production-backups/daily/*/$(date +%Y%m%d)*.dump

Disaster Recovery

#!/bin/bash
# Restore from cloud backup

# List available backups
dbbackup cloud list \
    --cloud-provider s3 \
    --cloud-bucket disaster-recovery \
    --verbose

# Restore latest backup
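# (assumes cloud list prints one object name per line, oldest first; adjust if the output format differs)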
LATEST=$(dbbackup cloud list \
    --cloud-provider s3 \
    --cloud-bucket disaster-recovery | tail -1)

dbbackup restore single "s3://disaster-recovery/$LATEST" \
    --target restored_db \
    --create \
    --confirm

Multi-Cloud Strategy

#!/bin/bash
# Backup to both AWS S3 and Backblaze B2

# Backup to S3
dbbackup backup single production_db \
    --cloud s3://aws-backups/prod/ \
    --output-dir /tmp/backups

# Also upload to B2
BACKUP_FILE=$(ls -t /tmp/backups/*.dump | head -1)
dbbackup cloud upload "$BACKUP_FILE" \
    --cloud-provider b2 \
    --cloud-bucket b2-offsite-backups \
    --cloud-endpoint https://s3.us-west-002.backblazeb2.com

# Verify both locations
dbbackup verify-backup "s3://aws-backups/prod/$(basename "$BACKUP_FILE")"
dbbackup verify-backup "b2://b2-offsite-backups/$(basename "$BACKUP_FILE")"

FAQ

Q: Can I use dbbackup with my existing S3 buckets?
A: Yes! Just specify your bucket name and credentials.

Q: Do I need to keep local backups?
A: No, use the --cloud flag to upload directly without keeping local copies.

Q: What happens if upload fails?
A: Backup succeeds locally. Upload failure is logged but doesn't fail the backup.

Q: Can I restore without downloading?
A: No, backups are downloaded to a temporary directory, restored, and then cleaned up.

Q: How much does cloud storage cost?
A: Varies by provider:

  • AWS S3: ~$0.023/GB/month + transfer
  • Azure Blob Storage: ~$0.018/GB/month (Hot tier)
  • Google Cloud Storage: ~$0.020/GB/month (Standard)
  • Backblaze B2: ~$0.005/GB/month + transfer
  • MinIO: Self-hosted, hardware costs only

Q: Can I use multiple cloud providers?
A: Yes! Use different URIs or upload to multiple destinations.

Q: Is multipart upload automatic?
A: Yes, automatically used for files >100MB.

Q: Can I use S3 Glacier?
A: Yes, but restore requires thawing. Use lifecycle policies for automatic archival.



Support

For issues or questions:

  • GitHub Issues: Create an issue
  • Documentation: Check README.md and inline help
  • Examples: See scripts/test_cloud_storage.sh