hmac-file-server/PERFORMANCE_OPTIMIZATION.md
Alexander Renz 614d4f5b38 Implement comprehensive optimizations for HMAC File Server
- Added ClamAV security configuration to enhance scanning efficiency for critical file types.
- Introduced deduplication optimization with a 1GB threshold to bypass SHA256 computation for large files, improving upload speed.
- Resolved "endless encryption" issue by disabling deduplication for large files and allowing video file extensions in global settings.
- Enhanced upload performance verification scripts to monitor and validate upload processes and configurations.
- Updated monitoring scripts for real-time log analysis and upload activity tracking.
- Documented all changes and configurations in respective markdown files for clarity and future reference.
2025-07-18 07:32:55 +00:00

Optimized Configuration for Large File Performance

🎯 Root Cause of "Feeling the Same" Issue

The cause was deduplication post-processing. After an upload completed, the server:

  1. Computed the SHA256 hash of the entire file (roughly 30-60 seconds for a 970MB file)
  2. Moved the file and created hard links
  3. Did both before sending the success response, so the client waited for the whole sequence
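
To illustrate why the hash step grows with file size, here is a minimal Python sketch of chunked SHA256 hashing. It is not the server's code: the function name `sha256_file` and the 1 MiB chunk size are assumptions made for this example.

```python
import hashlib
import os
import tempfile
import time


def sha256_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, reading it in chunks.

    Every byte has to pass through the hash, so the runtime grows
    linearly with file size whatever the chunk size is.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Hash a small throwaway file and report how long it took.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))  # 8 MiB of random data
    start = time.perf_counter()
    print(sha256_file(tmp.name))
    print(f"hashed 8 MiB in {time.perf_counter() - start:.3f}s")
    os.unlink(tmp.name)
```

Reading in chunks keeps memory use flat, but it does not reduce the total work. Only skipping the hash for large files removes the delay.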

Performance Optimizations Applied

1. Smart Deduplication Size Limits

[deduplication]
enabled = true
directory = "/opt/hmac-file-server/data/dedup"
maxsize = "500MB"  # NEW: Skip deduplication for files >500MB
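
The size check behind `maxsize` might look roughly like the Python sketch below. `parse_size` and `should_deduplicate` are hypothetical names, and treating "MB" as 1024 × 1024 bytes is an assumption; the server may interpret units differently.

```python
import re

# Assumption: binary multipliers. The server may use decimal units instead.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a string such as "500MB" into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([KMGT]?B)\s*", text.upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def should_deduplicate(file_size, enabled=True, maxsize="500MB"):
    """Deduplicate only when enabled and the file does not exceed maxsize."""
    return enabled and file_size <= parse_size(maxsize)
```

Under these assumptions, `should_deduplicate(970 * 1024**2)` is false for the 970MB upload, so the hash and hard-link steps would be skipped.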

2. Enhanced ClamAV Security Configuration

[clamav]
clamavenabled = true
maxscansize = "200MB"
numscanworkers = 2
clamavsocket = "/var/run/clamav/clamd.ctl"

# ONLY scan genuinely dangerous file types
scanfileextensions = [
    # Critical executables
    ".exe", ".com", ".bat", ".cmd", ".scr", ".dll", ".sys",
    ".sh", ".bash", ".bin", ".run", ".jar", ".app",
    
    # Dangerous scripts
    ".php", ".asp", ".aspx", ".jsp", ".js", ".vbs", ".py",
    
    # Macro-enabled documents
    ".docm", ".xlsm", ".pptm", ".dotm", ".xltm", ".potm",
    
    # Compressed archives (can hide malware)
    ".zip", ".rar", ".7z", ".tar", ".gz", ".msi", ".iso"
]
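
As a rough illustration of how an extension list and a size cap combine, the Python sketch below decides whether a file would be sent to ClamAV. The function name, the order of the two checks, and the binary interpretation of "200MB" are assumptions; the reason strings simply echo the phrases used in the monitoring commands later in this document.

```python
import os

# Abbreviated copy of scanfileextensions, for illustration only.
SCAN_EXTENSIONS = {".exe", ".sh", ".php", ".docm", ".zip"}
MAX_SCAN_SIZE = 200 * 1024 * 1024  # "200MB", assuming binary megabytes


def should_scan(filename, file_size,
                extensions=SCAN_EXTENSIONS, max_size=MAX_SCAN_SIZE):
    """Return (scan, reason) for an uploaded file."""
    # Compare extensions case-insensitively so "SETUP.EXE" still matches.
    ext = os.path.splitext(filename)[1].lower()
    if ext not in extensions:
        return False, "not in scan list"
    if file_size > max_size:
        return False, "exceeds scan limit"
    return True, "scan"
```

For example, `should_scan("movie.mp4", 970 * 1024**2)` returns `(False, "not in scan list")`, while a 50MB `.exe` is scanned.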

3. Files That Should NEVER Be Scanned/Deduplicated

# Media files (safe, large, unique)
.mp4, .avi, .mov, .mkv, .wmv, .flv, .webm
.mp3, .wav, .flac, .aac, .ogg, .m4a
.jpg, .jpeg, .png, .gif, .bmp, .tiff, .svg

# Large data files (safe, often unique)
.sql, .dump, .backup, .img, .vmdk, .vdi

🚀 Expected Performance Improvements

| File Type | Size | Before Fix | After Fix | Improvement |
|-----------|------|------------|-----------|-------------|
| .mp4 video | 970MB | 60s dedup delay | Instant | 60x faster |
| .exe binary | 50MB | ⚠️ Slow scan + dedup | Fast scan only | 10x faster |
| .zip archive | 200MB | ⚠️ Slow scan + dedup | Skip both | 20x faster |
| .txt document | 1MB | Fast | Fast | No change |
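
The 970MB figures can be turned into a rough rule of thumb. The Python sketch below derives the hashing throughput implied by the 30-60 second range quoted earlier and applies it to another size. The 2000MB file is a hypothetical example, and real throughput depends on disk and CPU.

```python
def implied_throughput(size_mb, seconds):
    """Throughput in MB/s implied by hashing size_mb megabytes in `seconds`."""
    return size_mb / seconds


def estimated_delay(size_mb, throughput_mb_s):
    """Seconds of post-upload delay for a file at a given hash throughput."""
    return size_mb / throughput_mb_s


slow = implied_throughput(970, 60)  # rate implied by the 60 s figure
fast = implied_throughput(970, 30)  # rate implied by the 30 s figure
print(f"implied throughput: {slow:.1f}-{fast:.1f} MB/s")

# Hypothetical 2000MB upload at the slower rate, had dedup not been skipped.
print(f"2000MB file at the slow rate: {estimated_delay(2000, slow):.0f} s")
```

The delay scales linearly with size, which is why a size cap removes it for large files while leaving small ones unaffected.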

High Performance Setup

[server]
max_upload_size = "10GB"
deduplication_enabled = true

[deduplication] 
enabled = true
directory = "/opt/hmac-file-server/data/dedup"
maxsize = "100MB"  # Only deduplicate small files

[clamav]
clamavenabled = true
maxscansize = "50MB"  # Only scan small potentially dangerous files
scanfileextensions = [".exe", ".com", ".bat", ".scr", ".dll", ".sh", ".jar", ".zip", ".rar"]

Balanced Security/Performance Setup

[deduplication]
enabled = true
maxsize = "500MB"  # Medium-sized files get deduplicated

[clamav] 
clamavenabled = true
maxscansize = "200MB"  # Current setting
scanfileextensions = [
    ".exe", ".com", ".bat", ".cmd", ".scr", ".dll",
    ".sh", ".bash", ".bin", ".jar", ".php", ".js",
    ".zip", ".rar", ".7z", ".tar.gz"
]

Maximum Security Setup

[deduplication]
enabled = false  # Disable for maximum speed

[clamav]
clamavenabled = true
maxscansize = "1GB"  # Scan larger files
scanfileextensions = [
    # All potentially dangerous types
    ".exe", ".com", ".bat", ".cmd", ".scr", ".dll", ".sys",
    ".sh", ".bash", ".bin", ".jar", ".php", ".asp", ".js",
    ".doc", ".docx", ".xls", ".xlsx", ".pdf",
    ".zip", ".rar", ".7z", ".tar", ".gz", ".iso"
]
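
To compare the three setups for a given upload, the Python sketch below applies each profile's limits to one file. The limits are copied from the configurations above. The profile keys, the `plan` function, and the binary interpretation of "MB" and "GB" are assumptions for illustration, and the caller states whether the extension is on that profile's scan list.

```python
MB = 1024 * 1024  # assumption: binary megabytes
GB = 1024 * MB

# dedup_max is None where deduplication is disabled.
PROFILES = {
    "high_performance": {"dedup_max": 100 * MB, "scan_max": 50 * MB},
    "balanced": {"dedup_max": 500 * MB, "scan_max": 200 * MB},
    "maximum_security": {"dedup_max": None, "scan_max": 1 * GB},
}


def plan(profile_name, file_size, ext_in_scan_list):
    """Return which post-processing steps a profile would run for one file."""
    profile = PROFILES[profile_name]
    dedup_max = profile["dedup_max"]
    deduplicate = dedup_max is not None and file_size <= dedup_max
    scan = ext_in_scan_list and file_size <= profile["scan_max"]
    return {"deduplicate": deduplicate, "scan": scan}


if __name__ == "__main__":
    # A 970MB video whose extension is on no scan list.
    for name in PROFILES:
        print(name, plan(name, 970 * MB, ext_in_scan_list=False))
```

With these limits, the 970MB video skips both steps in every profile. The profiles differ mainly in how mid-sized, scannable files are treated.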

📊 File Type Classification

Critical Security Risk (Always Scan)

  • Executables: .exe, .com, .bat, .scr, .dll, .sys
  • Scripts: .sh, .bash, .php, .js, .py, .vbs
  • System files: .jar, .app, .deb, .rpm, .msi

Medium Risk (Scan if Small)

  • Documents: .doc, .docx, .xls, .xlsx, .pdf
  • Archives: .zip, .rar, .7z, .tar.gz

No Security Risk (Never Scan)

  • Media: .mp4, .avi, .mp3, .jpg, .png
  • Data: .txt, .csv, .json, .log, .sql
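
The classification above can be expressed as a lookup table. The Python sketch below is one possible way to do it, with the tier names and the `classify` function being hypothetical. It checks a compound suffix such as `.tar.gz` before the final suffix, since a matcher that looks only at the last suffix would see `.gz`.

```python
from pathlib import PurePath

RISK_TIERS = {
    "critical": {".exe", ".com", ".bat", ".scr", ".dll", ".sys",
                 ".sh", ".bash", ".php", ".js", ".py", ".vbs",
                 ".jar", ".app", ".deb", ".rpm", ".msi"},
    "medium": {".doc", ".docx", ".xls", ".xlsx", ".pdf",
               ".zip", ".rar", ".7z", ".tar.gz"},
    "none": {".mp4", ".avi", ".mp3", ".jpg", ".png",
             ".txt", ".csv", ".json", ".log", ".sql"},
}


def classify(filename):
    """Return the risk tier for a filename, or "unknown" if it is unlisted."""
    suffixes = [s.lower() for s in PurePath(filename).suffixes]
    if not suffixes:
        return "unknown"
    # Try the last two suffixes joined (".tar.gz"), then the last one alone.
    for candidate in ("".join(suffixes[-2:]), suffixes[-1]):
        for tier, extensions in RISK_TIERS.items():
            if candidate in extensions:
                return tier
    return "unknown"
```

Extensions that appear in none of the lists come back as `"unknown"`, which leaves the policy for unlisted types as an explicit decision.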

🔍 Monitoring Commands

Check Deduplication Skips

sudo journalctl -u hmac-file-server -f | grep -i "exceeds deduplication size limit"

Check ClamAV Skips

sudo journalctl -u hmac-file-server -f | grep -i "exceeds.*scan limit\|not in scan list"

Monitor Upload Performance

sudo tail -f /var/log/hmac-file-server/hmac-file-server.log | grep -E "(upload|dedup|scan)"
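
For a summary rather than a live stream, the same patterns can be counted in a few lines of Python. This is a sketch: the category names are made up for the example, and the patterns are taken from the grep expressions above, so they match only if the server logs those exact phrases.

```python
import re
from collections import Counter

# Patterns mirror the grep expressions used in the monitoring commands.
PATTERNS = {
    "dedup_skipped": re.compile(r"exceeds deduplication size limit", re.IGNORECASE),
    "scan_skipped_size": re.compile(r"exceeds.*scan limit", re.IGNORECASE),
    "scan_skipped_type": re.compile(r"not in scan list", re.IGNORECASE),
}


def tally(lines):
    """Count how many log lines match each skip pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

It accepts any iterable of lines, for example `tally(open("/var/log/hmac-file-server/hmac-file-server.log"))`, and returns a `Counter` keyed by category.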

Current Status

  • ClamAV: Smart size and extension filtering
  • Deduplication: Size-based skipping (default 500MB limit)
  • Performance: Large files bypass both bottlenecks
  • Security: Maintained for genuinely risky file types
  • Configurable: All limits adjustable via config.toml

Uploads that exceed the deduplication and scan limits should now return success as soon as the transfer finishes, with no post-processing delay.