Google Cloud Storage Integration
This guide covers using Google Cloud Storage (GCS) with dbbackup for secure, scalable cloud backup storage.
Table of Contents
- Quick Start
- URI Syntax
- Authentication
- Configuration
- Usage Examples
- Advanced Features
- Testing with fake-gcs-server
- Best Practices
- Troubleshooting
Quick Start
1. GCP Setup
- Create a GCS bucket in Google Cloud Console
- Set up authentication (choose one):
- Service Account: Create and download JSON key file
- Application Default Credentials: Use gcloud CLI
- Workload Identity: For GKE clusters
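If you are starting from scratch, the steps above can be scripted with gcloud/gsutil; the bucket name, project ID, region, and service account name below are placeholders:
# Create a bucket and a dedicated service account for dbbackup
gsutil mb -p PROJECT_ID -l us-central1 gs://mybucket/
gcloud iam service-accounts create dbbackup --display-name="dbbackup"
gcloud iam service-accounts keys create key.json \
  --iam-account=dbbackup@PROJECT_ID.iam.gserviceaccount.com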
2. Basic Backup
# Backup PostgreSQL to GCS (using ADC)
dbbackup backup single mydb \
--cloud "gs://mybucket/backups/"
3. Restore from GCS
# Download backup from GCS and restore
dbbackup cloud download "gs://mybucket/backups/mydb.dump.gz" ./mydb.dump.gz
dbbackup restore single ./mydb.dump.gz --target mydb_restored --confirm
URI Syntax
Basic Format
gs://bucket/path/to/backup.sql
gcs://bucket/path/to/backup.sql
Both gs:// and gcs:// prefixes are supported.
URI Components
| Component | Required | Description | Example |
|---|---|---|---|
| bucket | Yes | GCS bucket name | mybucket |
| path | Yes | Object path within bucket | backups/db.sql |
| credentials | No | Path to service account JSON | /path/to/key.json |
| project | No | GCP project ID | my-project-id |
| endpoint | No | Custom endpoint (emulator) | http://localhost:4443 |
URI Examples
Production GCS (Application Default Credentials):
gs://prod-backups/postgres/db.sql
With Service Account:
gs://prod-backups/postgres/db.sql?credentials=/path/to/service-account.json
With Project ID:
gs://prod-backups/postgres/db.sql?project=my-project-id
fake-gcs-server Emulator:
gs://test-backups/postgres/db.sql?endpoint=http://localhost:4443/storage/v1
With Path Prefix:
gs://backups/production/postgres/2024/db.sql
Authentication
Method 1: Application Default Credentials (Recommended)
Use gcloud CLI to set up ADC:
# Login with your Google account
gcloud auth application-default login
# Or use service account for server environments
gcloud auth activate-service-account --key-file=/path/to/key.json
# Use simplified URI (credentials from environment)
dbbackup backup single mydb --cloud "gs://mybucket/backups/"
Method 2: Service Account JSON
Download service account key from GCP Console:
- Go to IAM & Admin → Service Accounts
- Create or select a service account
- Click Keys → Add Key → Create new key → JSON
- Download the JSON file
Use in URI:
dbbackup backup single mydb \
--cloud "gs://mybucket/?credentials=/path/to/service-account.json"
Or via environment:
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
dbbackup backup single mydb --cloud "gs://mybucket/"
Method 3: Workload Identity (GKE)
For Kubernetes workloads:
apiVersion: v1
kind: ServiceAccount
metadata:
name: dbbackup-sa
annotations:
iam.gke.io/gcp-service-account: dbbackup@project.iam.gserviceaccount.com
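Workload Identity also needs a binding that lets the Kubernetes service account impersonate the GCP service account; the project ID and namespace below are placeholders:
# Bind the K8s service account to the GCP service account
gcloud iam service-accounts add-iam-policy-binding \
  dbbackup@PROJECT_ID.iam.gserviceaccount.com \
  --role="roles/iam.workloadIdentityUser" \
  --member="serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/dbbackup-sa]"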
Then use ADC in your pod:
dbbackup backup single mydb --cloud "gs://mybucket/"
Required IAM Permissions
Service account needs these roles:
- Storage Object Creator: Upload backups
- Storage Object Viewer: List and download backups
- Storage Object Admin: Delete backups (for cleanup)
Or use predefined role: Storage Admin
# Grant permissions
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="serviceAccount:dbbackup@PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/storage.objectAdmin"
Configuration
Bucket Setup
Create a bucket before first use:
# gcloud CLI
gsutil mb -p PROJECT_ID -c STANDARD -l us-central1 gs://mybucket/
# Or let dbbackup create it (requires permissions)
dbbackup cloud upload file.sql "gs://mybucket/file.sql?create=true&project=PROJECT_ID"
Storage Classes
GCS offers multiple storage classes:
- Standard: Frequent access (default)
- Nearline: Data accessed less than once a month (lower cost)
- Coldline: Data accessed less than once a quarter (very low cost)
- Archive: Long-term retention (lowest cost)
Set the class when creating bucket:
gsutil mb -c NEARLINE gs://mybucket/
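For a bucket that already exists, the default class can be changed instead of recreating it (this only affects newly written objects):
# Change the default storage class of an existing bucket
gsutil defstorageclass set NEARLINE gs://mybucket/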
Lifecycle Management
Configure automatic transitions and deletion:
{
"lifecycle": {
"rule": [
{
"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
"condition": {"age": 30, "matchesPrefix": ["backups/"]}
},
{
"action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
"condition": {"age": 90, "matchesPrefix": ["backups/"]}
},
{
"action": {"type": "Delete"},
"condition": {"age": 365, "matchesPrefix": ["backups/"]}
}
]
}
}
Apply lifecycle configuration:
gsutil lifecycle set lifecycle.json gs://mybucket/
Regional Configuration
Choose bucket location for better performance:
# US regions
gsutil mb -l us-central1 gs://mybucket/
gsutil mb -l us-east1 gs://mybucket/
# EU regions
gsutil mb -l europe-west1 gs://mybucket/
# Multi-region
gsutil mb -l us gs://mybucket/
gsutil mb -l eu gs://mybucket/
Usage Examples
Backup with Auto-Upload
# PostgreSQL backup with automatic GCS upload
dbbackup backup single production_db \
--cloud "gs://prod-backups/postgres/" \
--compression 6
Backup All Databases
# Backup entire PostgreSQL cluster to GCS
dbbackup backup cluster \
--cloud "gs://prod-backups/postgres/cluster/"
Verify Backup
# Verify backup integrity
dbbackup verify "gs://prod-backups/postgres/backup.sql"
List Backups
# List all backups in bucket
dbbackup cloud list "gs://prod-backups/postgres/"
# List with pattern
dbbackup cloud list "gs://prod-backups/postgres/2024/"
# Or use gsutil
gsutil ls gs://prod-backups/postgres/
Download Backup
# Download from GCS to local
dbbackup cloud download \
"gs://prod-backups/postgres/backup.sql" \
/local/path/backup.sql
Delete Old Backups
# Manual delete
dbbackup cloud delete "gs://prod-backups/postgres/old_backup.sql"
# Automatic cleanup (keep last 7 backups)
dbbackup cleanup "gs://prod-backups/postgres/" --keep 7
Scheduled Backups
#!/bin/bash
# GCS backup script (run via cron)
GCS_URI="gs://prod-backups/postgres/"
dbbackup backup single production_db \
--cloud "${GCS_URI}" \
--compression 9
# Cleanup old backups
dbbackup cleanup "gs://prod-backups/postgres/" --keep 30
Crontab:
# Daily at 2 AM
0 2 * * * /usr/local/bin/gcs-backup.sh >> /var/log/gcs-backup.log 2>&1
Systemd Timer:
# /etc/systemd/system/gcs-backup.timer
[Unit]
Description=Daily GCS Database Backup
[Timer]
OnCalendar=daily
Persistent=true
[Install]
WantedBy=timers.target
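The timer needs a matching service unit; a minimal sketch, assuming the script path used in the crontab example above:
# /etc/systemd/system/gcs-backup.service
[Unit]
Description=Daily GCS Database Backup
[Service]
Type=oneshot
ExecStart=/usr/local/bin/gcs-backup.sh
Enable both units with systemctl enable --now gcs-backup.timer.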
Advanced Features
Chunked Upload
For large files, dbbackup automatically uses GCS chunked upload:
- Chunk Size: 16MB per chunk
- Streaming: Direct streaming from source
- Checksum: SHA-256 integrity verification
# Large database backup (automatically uses chunked upload)
dbbackup backup single huge_db \
--cloud "gs://backups/"
Progress Tracking
# Backup with progress display
dbbackup backup single mydb \
--cloud "gs://backups/"
Concurrent Operations
# Backup cluster with parallel jobs
dbbackup backup cluster \
--cloud "gs://backups/cluster/" \
--jobs 4
Custom Metadata
Backups include SHA-256 checksums as object metadata:
# View metadata using gsutil
gsutil stat gs://backups/backup.sql
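To cross-check a downloaded file against the recorded checksum, compare the object metadata with a local hash; the exact metadata key name depends on the dbbackup version, so the grep pattern below is only an assumption:
# Show checksum metadata, then hash the local copy
gsutil stat gs://backups/backup.sql | grep -i sha256
sha256sum /local/backup.sql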
Object Versioning
Enable versioning to protect against accidental deletion:
# Enable versioning
gsutil versioning set on gs://mybucket/
# List all versions
gsutil ls -a gs://mybucket/backup.sql
# Restore previous version
gsutil cp gs://mybucket/backup.sql#VERSION /local/backup.sql
Customer-Managed Encryption Keys (CMEK)
Use your own encryption keys:
# Create encryption key in Cloud KMS
gcloud kms keyrings create backup-keyring --location=us-central1
gcloud kms keys create backup-key --location=us-central1 --keyring=backup-keyring --purpose=encryption
# Set default CMEK for bucket
gsutil kms encryption gs://mybucket/ projects/PROJECT/locations/us-central1/keyRings/backup-keyring/cryptoKeys/backup-key
Testing with fake-gcs-server
Set Up the fake-gcs-server Emulator
Docker Compose:
services:
gcs-emulator:
image: fsouza/fake-gcs-server:latest
ports:
- "4443:4443"
command: -scheme http -public-host localhost:4443
Start:
docker-compose -f docker-compose.gcs.yml up -d
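Without Compose, a roughly equivalent container can be started directly with the same image and flags:
docker run -d --name gcs-emulator -p 4443:4443 \
  fsouza/fake-gcs-server -scheme http -public-host localhost:4443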
Create Test Bucket
# Using curl
curl -X POST "http://localhost:4443/storage/v1/b?project=test-project" \
-H "Content-Type: application/json" \
-d '{"name": "test-backups"}'
Test Backup
# Backup to fake-gcs-server
dbbackup backup single testdb \
--cloud "gs://test-backups/?endpoint=http://localhost:4443/storage/v1"
Run Integration Tests
# Run comprehensive test suite
./scripts/test_gcs_storage.sh
Tests include:
- PostgreSQL and MySQL backups
- Upload/download operations
- Large file handling (200MB+)
- Verification and cleanup
- Restore operations
Best Practices
1. Security
- Never commit credentials to version control
- Use Application Default Credentials when possible
- Rotate service account keys regularly
- Use Workload Identity for GKE
- Enable VPC Service Controls for enterprise security
- Use Customer-Managed Encryption Keys (CMEK) for sensitive data
2. Performance
- Use compression for faster uploads: --compression 6
- Enable parallelism for cluster backups: --parallelism 4
- Choose appropriate GCS region (close to source)
- Use multi-region buckets for high availability
3. Cost Optimization
- Use Nearline for backups older than 30 days
- Use Archive for long-term retention (>90 days)
- Enable lifecycle management for automatic transitions
- Monitor storage costs in GCP Billing Console
- Use Coldline for quarterly access patterns
4. Reliability
- Test restore procedures regularly
- Use retention policies: --keep 30
- Enable object versioning (30-day recovery)
- Use multi-region buckets for disaster recovery
- Monitor backup success with Cloud Monitoring
5. Organization
- Use consistent naming: {database}/{date}/{backup}.sql
- Use bucket prefixes: prod-backups, dev-backups
- Tag backups with labels (environment, version)
- Document restore procedures
- Use separate buckets per environment
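As an illustration of the naming convention above (the database name and path layout are only placeholders, not something dbbackup enforces):
# Date-based prefixing handled by the calling script
dbbackup backup single orders_db \
  --cloud "gs://prod-backups/postgres/orders_db/$(date +%Y/%m/%d)/"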
Troubleshooting
Connection Issues
Problem: failed to create GCS client
Solutions:
- Check the GOOGLE_APPLICATION_CREDENTIALS environment variable
- Verify the service account JSON file exists and is valid
- Ensure the gcloud CLI is authenticated: gcloud auth list
- For the emulator, confirm http://localhost:4443 is running
Authentication Errors
Problem: authentication failed or permission denied
Solutions:
- Verify service account has required IAM roles
- Check if Application Default Credentials are set up
- Run gcloud auth application-default login
- Verify the service account JSON is not corrupted
- Check GCP project ID is correct
Upload Failures
Problem: failed to upload object
Solutions:
- Check that the bucket exists (or use &create=true)
- Verify the service account has the storage.objects.create permission
- Check network connectivity to GCS
- Try smaller files first (test connection)
- Check GCP quota limits
Large File Issues
Problem: Upload timeout for large files
Solutions:
- dbbackup automatically uses chunked upload
- Increase compression: --compression 9
- Check network bandwidth
- Use Transfer Appliance for TB+ data
List/Download Issues
Problem: object not found
Solutions:
- Verify object name (check GCS Console)
- Check bucket name is correct
- Ensure object hasn't been moved/deleted
- Check if the object is in the Archive class (still readable, but retrieval costs apply)
Performance Issues
Problem: Slow upload/download
Solutions:
- Use compression: --compression 6
- Choose a closer GCS region
- Check network bandwidth
- Use multi-region bucket for better availability
- Enable parallelism for multiple files
Debugging
Enable debug mode:
dbbackup backup single mydb \
--cloud "gs://bucket/" \
--debug
Check GCP logs:
# Cloud Logging
gcloud logging read "resource.type=gcs_bucket AND resource.labels.bucket_name=mybucket" \
--limit 50 \
--format json
View bucket details:
gsutil ls -L -b gs://mybucket/
Monitoring and Alerting
Cloud Monitoring
Create metrics and alerts:
# Monitor backup success rate
gcloud monitoring policies create \
--notification-channels=CHANNEL_ID \
--display-name="Backup Failure Alert" \
--condition-display-name="No backups in 24h" \
--condition-threshold-value=0 \
--condition-threshold-duration=86400s
Logging
Export logs to BigQuery for analysis:
gcloud logging sinks create backup-logs \
bigquery.googleapis.com/projects/PROJECT_ID/datasets/backup_logs \
--log-filter='resource.type="gcs_bucket" AND resource.labels.bucket_name="prod-backups"'
Additional Resources
- Google Cloud Storage Documentation
- fake-gcs-server
- gsutil Tool
- GCS Client Libraries
- dbbackup Cloud Storage Guide
Support
For issues specific to GCS integration:
- Check Troubleshooting section
- Run integration tests: ./scripts/test_gcs_storage.sh
- Enable debug mode: --debug
- Check GCP Service Status
- Open an issue on GitHub with debug logs