hmac-file-server/NETWORK_SWITCHING_IMPROVEMENTS.md
Alexander Renz 6a90fa6e30 Implement network switching improvements for HMAC file server
- Added support for chunked and resumable uploads to enhance resilience against network interruptions.
- Introduced a new upload session management system to track progress and handle retries.
- Enhanced connection management with improved timeout settings for mobile networks.
- Implemented network change detection and handling to pause and resume uploads seamlessly.
- Developed client-side retry logic for uploads to improve reliability.
- Updated configuration options to enable new features and set recommended defaults for timeouts and chunk sizes.
- Created integration layer to add new features without modifying existing core functionality.
- Established a network resilience manager to monitor network changes and manage active uploads.
2025-07-17 18:00:14 +02:00


Network Switching Improvements for HMAC File Server

Issues Identified

1. No Resumable Upload Support

  • Uploads currently fail outright when the network is interrupted
  • Chunked uploads are not implemented, even though a configuration option for them exists
  • The partial file is deleted on any upload error, so all progress is lost

2. Aggressive Connection Timeouts

  • ReadTimeout/WriteTimeout too short for large uploads over mobile networks
  • IdleConnTimeout too aggressive for network switching scenarios
  • No retry mechanisms for temporary network failures

3. No Connection State Management

  • No detection of network changes
  • No graceful handling of connection switches
  • No upload session persistence

Recommended Improvements

1. Implement Chunked/Resumable Uploads

// Add to upload configuration
type ChunkedUploadSession struct {
    ID            string
    Filename      string
    TotalSize     int64
    ChunkSize     int64
    UploadedBytes int64
    Chunks        map[int]bool // Track completed chunks
    LastActivity  time.Time
    ClientIP      string
}

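// New upload handler for chunked uploads (needs the strconv import)
func handleChunkedUpload(w http.ResponseWriter, r *http.Request) {
    // Identify the session and chunk from the request headers
    sessionID := r.Header.Get("X-Upload-Session-ID")
    chunkNumber, err := strconv.Atoi(r.Header.Get("X-Chunk-Number"))
    if sessionID == "" || err != nil || chunkNumber < 0 {
        http.Error(w, "missing or invalid upload session headers", http.StatusBadRequest)
        return
    }

    // Resume logic here: look up the session for sessionID, skip the chunk
    // if it is already marked complete, otherwise write it at its offset
}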
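
The resume step left open in the handler above could look roughly like the following. This is a sketch under assumptions, not the server's actual implementation: `totalChunks`, `chunkOffset`, and `missingChunks` are hypothetical helpers, and the session struct is trimmed to the fields they need.

```go
package main

import "fmt"

// ChunkedUploadSession is trimmed to the fields this sketch needs.
type ChunkedUploadSession struct {
    TotalSize int64
    ChunkSize int64
    Chunks    map[int]bool // completed chunks, keyed by chunk number
}

// totalChunks returns how many chunks the upload consists of (hypothetical helper).
func totalChunks(s *ChunkedUploadSession) int {
    if s.ChunkSize <= 0 {
        return 0
    }
    return int((s.TotalSize + s.ChunkSize - 1) / s.ChunkSize)
}

// chunkOffset returns the byte offset at which chunk n starts in the target file.
func chunkOffset(s *ChunkedUploadSession, n int) int64 {
    return int64(n) * s.ChunkSize
}

// missingChunks lists, in ascending order, the chunk numbers that are not yet
// marked complete, so a resumed upload can skip everything already received.
func missingChunks(s *ChunkedUploadSession) []int {
    var missing []int
    for n := 0; n < totalChunks(s); n++ {
        if !s.Chunks[n] {
            missing = append(missing, n)
        }
    }
    return missing
}

func main() {
    s := &ChunkedUploadSession{
        TotalSize: 12 * 1024 * 1024,
        ChunkSize: 5 * 1024 * 1024,
        Chunks:    map[int]bool{0: true},
    }
    fmt.Println(missingChunks(s), chunkOffset(s, 2))
}
```

Returning the missing chunk numbers to the client is one possible design; it lets the client decide what to resend after a reconnect without the server tracking client state beyond the session itself.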

2. Enhanced Connection Management

// Improved HTTP client configuration
dualStackClient = &http.Client{
    Transport: &http.Transport{
        DialContext:           dialer.DialContext,
        IdleConnTimeout:       300 * time.Second, // 5 minutes for mobile
        MaxIdleConns:          50,
        MaxIdleConnsPerHost:   20,                // More connections per host
        TLSHandshakeTimeout:   30 * time.Second,  // Longer for mobile networks
        ResponseHeaderTimeout: 60 * time.Second,  // Account for network switches
        DisableKeepAlives:     false,             // Enable keep-alives
        MaxConnsPerHost:       30,                // Allow more concurrent connections
    },
    Timeout: 0, // No overall timeout - let individual operations timeout
}

// Enhanced server timeouts
server := &http.Server{
    ReadTimeout:  5 * time.Minute,  // Covers reading the entire request, body included
    WriteTimeout: 5 * time.Minute,  // Allow for slow responses
    IdleTimeout:  10 * time.Minute, // Keep idle keep-alive connections open longer
}

3. Network Change Detection and Handling

// Enhanced network monitoring
func monitorNetworkChanges(ctx context.Context) {
    ticker := time.NewTicker(5 * time.Second) // More frequent checking
    defer ticker.Stop()
    
    // Take a baseline first so the initial tick is not reported as a change
    lastInterfaces, _ := net.Interfaces()
    
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            currentInterfaces, err := net.Interfaces()
            if err != nil {
                continue
            }
            
            // Detect interface changes
            if hasNetworkChanges(lastInterfaces, currentInterfaces) {
                log.Info("Network change detected - pausing active uploads")
                pauseActiveUploads()
                
                // Wait for network stabilization, but stop promptly on shutdown
                select {
                case <-ctx.Done():
                    return
                case <-time.After(2 * time.Second):
                }
                
                log.Info("Resuming uploads after network change")
                resumeActiveUploads()
            }
            
            lastInterfaces = currentInterfaces
        }
    }
}

4. Upload Session Persistence

// Store upload sessions in Redis or local cache
type UploadSessionStore struct {
    sessions map[string]*ChunkedUploadSession
    mutex    sync.RWMutex
}

func (s *UploadSessionStore) SaveSession(ctx context.Context, session *ChunkedUploadSession) error {
    // Store in Redis if available, otherwise in-memory
    if redisClient != nil {
        data, err := json.Marshal(session)
        if err != nil {
            return err
        }
        return redisClient.Set(ctx, "upload:"+session.ID, data, 24*time.Hour).Err()
    }

    // Only the in-memory map needs the lock
    s.mutex.Lock()
    defer s.mutex.Unlock()
    s.sessions[session.ID] = session
    return nil
}

5. Client-Side Retry Logic (for mobile apps/browsers)

// Client-side upload with retry logic
class ResilientUploader {
    constructor(file, endpoint, options = {}) {
        this.file = file;
        this.endpoint = endpoint;
        this.chunkSize = options.chunkSize || 5 * 1024 * 1024; // 5MB chunks
        this.maxRetries = options.maxRetries || 5;
        this.retryDelay = options.retryDelay || 2000;
    }
    
    async upload() {
        const totalChunks = Math.ceil(this.file.size / this.chunkSize);
        const sessionId = this.generateSessionId();
        
        for (let i = 0; i < totalChunks; i++) {
            await this.uploadChunk(i, sessionId);
        }
    }
    
    async uploadChunk(chunkIndex, sessionId, retryCount = 0) {
        try {
            const start = chunkIndex * this.chunkSize;
            const end = Math.min(start + this.chunkSize, this.file.size);
            const chunk = this.file.slice(start, end);
            
            const response = await fetch(this.endpoint, {
                method: 'PUT',
                headers: {
                    'X-Upload-Session-ID': sessionId,
                    'X-Chunk-Number': chunkIndex,
                    'X-Total-Chunks': Math.ceil(this.file.size / this.chunkSize),
                    'Content-Range': `bytes ${start}-${end-1}/${this.file.size}`
                },
                body: chunk
            });
            
            if (!response.ok) throw new Error(`HTTP ${response.status}`);
            
        } catch (error) {
            if (retryCount < this.maxRetries) {
                // Exponential backoff with jitter
                const delay = this.retryDelay * Math.pow(2, retryCount) + Math.random() * 1000;
                await new Promise(resolve => setTimeout(resolve, delay));
                return this.uploadChunk(chunkIndex, sessionId, retryCount + 1);
            }
            throw error;
        }
    }
}

Implementation Priority

  1. High Priority: Implement chunked uploads with session persistence
  2. High Priority: Adjust connection timeouts for mobile scenarios
  3. Medium Priority: Add network change detection and upload pausing
  4. Medium Priority: Implement retry logic in upload handlers
  5. Low Priority: Add client-side SDK with built-in resilience

Configuration Changes Needed

[uploads]
resumableuploadsenabled = true    # Enable the feature
chunkeduploadsenabled = true      # Already exists but not implemented
chunksize = "5MB"                 # Smaller chunks for mobile
sessiontimeout = "24h"            # How long to keep upload sessions
maxretries = 5                    # Server-side retry attempts

[timeouts]
readtimeout = "300s"              # 5 minutes for mobile uploads
writetimeout = "300s"             # 5 minutes for responses  
idletimeout = "600s"              # 10 minutes idle timeout
uploadtimeout = "3600s"           # 1 hour total upload timeout

[network]
networkchangedetection = true     # Enable network monitoring
uploadpauseonchange = true        # Pause uploads during network changes
reconnectdelay = "2s"             # Wait time after network change
keepaliveinterval = "30s"         # TCP keep-alive interval

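Together, these changes let an interrupted upload resume from its last completed chunk instead of restarting, which makes uploads far more resilient to the network switches common on mobile devices with multiple network interfaces (for example, Wi-Fi to cellular).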