# VMAF Optimisation Pipeline - Agent Documentation

## Overview

This project automates converting a video library to AV1 using VMAF (Video Multimethod Assessment Fusion) quality targets. It searches for the encoding parameters that meet the highest quality target and steps down to lower targets when needed to reach the required file size savings.

## Architecture

```
run_optimisation.sh           # Master runner script
    ↓
optimise_media_v2.py          # Main encoding engine
    ↓
ab-av1 (crf-search, encode)   # AV1 encoding tool
ffprobe/ffmpeg                # Media analysis/encoding
```

## How It Works

### Phase 1: Video Analysis

1. Scans directory for video files (.mkv, .mp4)
2. Uses `ffprobe` to get:
   - Codec (h264, hevc, etc.)
   - Resolution (width × height)
   - Bitrate (calculated from size/duration)
   - File size and duration
3. Skips if already AV1 encoded

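
The analysis step can be sketched roughly as follows. This is an illustration, not the script's actual code: the helper names `probe` and `summarise` are hypothetical, and it assumes `ffprobe` is on `PATH` and asked for JSON output.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe and return its JSON report (assumes ffprobe is on PATH)."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", str(path)],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)


def summarise(report):
    """Extract the Phase 1 fields from an ffprobe JSON report."""
    video = next(s for s in report["streams"] if s["codec_type"] == "video")
    size = int(report["format"]["size"])            # bytes
    duration = float(report["format"]["duration"])  # seconds
    return {
        "codec": video["codec_name"],
        "width": video["width"],
        "height": video["height"],
        "size": size,
        "duration": duration,
        # Derived from size/duration instead of ffprobe's own bitrate field.
        "bitrate_kbps": round(size * 8 / duration / 1000),
        "skip": video["codec_name"] == "av1",
    }
```

Keeping the parsing separate from the `ffprobe` call makes the logic easy to check against a saved report.
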
### Phase 2: VMAF Target Search (Intelligent Fallback)

The script tries VMAF targets in **descending order** (highest quality first):

```
Try VMAF 94 (Premium)
   ↓
Can achieve?
   ↓ Yes               ↓ No
Calculate savings      Try VMAF 93
   ↓
Savings ≥ 12%?
   ↓ Yes               ↓ No
Encode at VMAF 94      Calculate savings
                          ↓
                       Savings ≥ 12%?
                          ↓ Yes               ↓ No
                       Encode at VMAF 93      Find 15% (test 92, 90)
```

**Fallback Logic:**
- If VMAF 94 gives ≥12% savings → **Encode at VMAF 94**
- If VMAF 94 <12% but VMAF 93 ≥12% → **Encode at VMAF 93**
- If both <12% → Find what VMAF gives 15%+ savings:
  - Tests VMAF 93, 92, 90
  - Reports "FOUND 15%+ SAVINGS" with exact parameters
  - Logs for manual review (no encoding)
  - User can decide to adjust settings

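
The fallback rules can be written as a small decision function. This is a sketch under assumptions: `decide` and its return values are hypothetical names, and the input is taken to be the predicted savings (in percent) that a CRF search produced for each target, with unreachable targets simply absent.

```python
MIN_SAVINGS_PERCENT = 12.0
TARGET_SAVINGS_FOR_ESTIMATE = 15.0


def decide(savings_by_vmaf):
    """Pick an action from predicted savings (%) keyed by VMAF target.

    A target missing from the dict (or mapped to None) means the CRF
    search could not reach that VMAF.
    """
    # Primary targets: encode at the highest one that saves enough.
    for vmaf in (94.0, 93.0):
        savings = savings_by_vmaf.get(vmaf)
        if savings is not None and savings >= MIN_SAVINGS_PERCENT:
            return ("encode", vmaf)
    # Otherwise report the highest target reaching 15%, without encoding.
    for vmaf in (93.0, 92.0, 90.0):
        savings = savings_by_vmaf.get(vmaf)
        if savings is not None and savings >= TARGET_SAVINGS_FOR_ESTIMATE:
            return ("review", vmaf)
    return ("review", None)
```

For example, `decide({94.0: 5.0, 93.0: 9.0, 92.0: 16.0})` returns `("review", 92.0)`: nothing is encoded, and VMAF 92 is logged as the first target that clears 15%.
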
### Phase 3: CRF Search

Uses `ab-av1 crf-search` with the `--thorough` flag:
- Takes multiple samples (20-30s segments) from the video
- Runs an interpolated binary search for the optimal CRF
- Outputs: Best CRF, Mean VMAF, Predicted size
- Uses `--temp-dir` for temporary file storage

**Why `--thorough`?**
- More search iterations and a tighter VMAF tolerance = more accurate CRF estimation
- Takes longer but prevents quality/savings miscalculation
- Recommended for library encoding (one-time cost)

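
A sketch of how the search command might be assembled. The function name and default values are hypothetical; `--thorough` and `--temp-dir` come from this document, while `-i`, `--min-vmaf` and `--preset` are assumed to match the installed ab-av1 version (check `ab-av1 crf-search --help`).

```python
def crf_search_command(input_path, min_vmaf, preset=6,
                       temp_dir="/opt/Optmiser/tmp"):
    """Build the argument list for one ab-av1 crf-search run."""
    return [
        "ab-av1", "crf-search",
        "-i", str(input_path),
        "--min-vmaf", str(min_vmaf),   # quality target for this attempt
        "--preset", str(preset),       # SVT-AV1 preset
        "--thorough",
        "--temp-dir", str(temp_dir),
    ]
```

Passing the result to `subprocess.run` as a list avoids shell quoting problems with file names that contain spaces.
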
### Phase 4: Full Encoding (with Real-time Output)

If savings threshold met:
1. Runs `ab-av1 encode` with found CRF
2. **Streams all output in real-time** (you see progress live)
3. Shows ETA, encoding speed, frame count
4. Uses `--acodec copy` to preserve audio/subtitles

**Real-time output example:**
```
→ Running encoding (CRF 34)
Encoded 4320/125400 frames (3.4%)
Encoded 8640/125400 frames (6.9%)
Encoded 12960/125400 frames (10.3%)
...
Encoded 125400/125400 frames (100.0%)
Speed: 15.2 fps, ETA: 2s
```

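
Real-time streaming of a child process can be done along these lines. This is a generic sketch, not the script's implementation; `run_streaming` is a hypothetical helper.

```python
import subprocess
import sys


def run_streaming(cmd):
    """Run cmd, echoing each output line as it arrives; return the exit code."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so progress lines are not lost
        text=True,
        bufsize=1,                 # line-buffered on our side
    )
    for line in proc.stdout:
        print(line, end="", flush=True)
    return proc.wait()
```

Tools that redraw a progress bar with carriage returns may not emit a newline until the end. In that case, reading fixed-size chunks instead of whole lines is one workaround.
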
### Phase 5: Verification & Replacement

1. Probes encoded file for actual stats
2. Calculates actual savings
3. Only replaces original if new file is smaller
4. Converts .mp4 to .mkv if needed
5. Logs detailed results to JSONL files

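
The size check and replacement could look roughly like this. It is a sketch with hypothetical helper names, and it assumes the encode was written as a Matroska file next to the original.

```python
import os
from pathlib import Path


def savings_percent(original_size, encoded_size):
    """Percentage by which the encode is smaller than the original."""
    return (1 - encoded_size / original_size) * 100


def replace_if_smaller(original, encoded):
    """Swap in the encode only when it is smaller; return savings % or None."""
    original, encoded = Path(original), Path(encoded)
    before, after = original.stat().st_size, encoded.stat().st_size
    if after >= before:
        encoded.unlink()                    # keep the original, discard the encode
        return None
    target = original.with_suffix(".mkv")   # .mp4 sources end up as .mkv
    os.replace(encoded, target)             # atomic when on the same filesystem
    if target != original:
        original.unlink()                   # remove the old .mp4
    return savings_percent(before, after)
```

Using `os.replace` means a crash mid-operation leaves either the old file or the new one in place, never a half-written file.
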
## Configuration

### Key Settings (edit in `optimise_media_v2.py`)

```python
TARGETS = [94.0, 93.0, 92.0, 90.0]   # VMAF targets to try
MIN_SAVINGS_PERCENT = 12.0           # Encode if savings ≥12%
TARGET_SAVINGS_FOR_ESTIMATE = 15.0   # Estimate for this level
PRESET = 6                           # SVT-AV1 preset (lower = slower/better, e.g. 4; higher = faster, e.g. 8)
EXTENSIONS = {'.mkv', '.mp4'}        # File extensions to process
```

### What is CRF?

**Constant Rate Factor (CRF):** Quality/bitrate trade-off
- **Lower CRF** = Higher quality, larger files (e.g., CRF 20)
- **Higher CRF** = Lower quality, smaller files (e.g., CRF 40)
- AV1 CRF range: 0-63 (a VMAF 94 target typically lands around CRF 34-36)

### What is VMAF?

**Video Multimethod Assessment Fusion:** A perceptual video quality metric developed by Netflix, scored from 0 to 100 (higher is closer to the source). As rough guidance:
- **VMAF 95:** Often treated as "visually lossless" - hard to distinguish from source
- **VMAF 94:** Premium quality - minor artifacts
- **VMAF 93:** Good quality - acceptable for most content
- **VMAF 90:** Standard quality - may have noticeable artifacts
- **VMAF 85:** Acceptable quality for mobile/low bandwidth

## Logging System

### Log Files (all in `/opt/Optmiser/logs/`)

| File | Purpose | Format |
|------|---------|--------|
| `tv_movies.jsonl` | Successful TV & Movie encodes | JSONL (one line per file) |
| `content.jsonl` | Successful Content folder encodes | JSONL |
| `low_savings_skips.jsonl` | Files with <12% savings + 15% estimates | JSONL |
| `failed_searches.jsonl` | Files that couldn't hit any VMAF target | JSONL |
| `failed_encodes.jsonl` | Encoding errors | JSONL |

### Log Entry Format

**Successful encode:**
```json
{
  "file": "/path/to/file.mkv",
  "status": "success",
  "vmaf": 94.0,
  "crf": 34.0,
  "before": {
    "codec": "h264",
    "bitrate": 8500,
    "size": 2684354560,
    "duration": 1379.44
  },
  "after": {
    "codec": "av1",
    "bitrate": 6400,
    "size": 2013265920,
    "duration": 1379.44
  },
  "duration": 145.2,
  "savings": 25.0,
  "timestamp": "2025-12-31T12:00:00.000Z"
}
```

**Low savings with 15% estimate:**
```json
{
  "file": "/path/to/file.mkv",
  "vmaf_94": 94.0,
  "savings_94": 7.0,
  "vmaf_93": 93.0,
  "savings_93": 18.0,
  "target_for_15_percent": {
    "target_vmaf": 93,
    "crf": 37,
    "savings": 18.0,
    "quality_drop": 1,
    "found": true
  },
  "recommendations": "logged_for_review",
  "timestamp": "2025-12-31T12:00:00.000Z"
}
```

### Viewing Logs

```bash
# Watch logs in real-time
tail -f /opt/Optmiser/logs/tv_movies.jsonl | jq '.'

# Check files logged for review (both 94 and 93 <12%)
# JSONL has one object per line, so filter each object directly (no `.[]`)
jq 'select(.recommendations == "logged_for_review")' /opt/Optmiser/logs/low_savings_skips.jsonl

# Statistics
jq -r '.status' /opt/Optmiser/logs/tv_movies.jsonl | sort | uniq -c

# Find what CRF/VMAF combinations are being used most
jq -r '[.vmaf, .crf] | @tsv' /opt/Optmiser/logs/tv_movies.jsonl | sort | uniq -c | sort -rn
```

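
The same statistics can be gathered in Python when `jq` is not available. This is a small sketch; `summarise_log` is a hypothetical helper that relies only on the `status`, `vmaf`, `crf` and `savings` fields shown in the log format above.

```python
import json
from collections import Counter


def summarise_log(lines):
    """Count successful encodes per (vmaf, crf) and compute mean savings."""
    combos, savings = Counter(), []
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        rec = json.loads(line)
        if rec.get("status") != "success":
            continue
        combos[(rec["vmaf"], rec["crf"])] += 1
        savings.append(rec["savings"])
    mean = sum(savings) / len(savings) if savings else 0.0
    return combos, mean
```

Typical use is `summarise_log(open("/opt/Optmiser/logs/tv_movies.jsonl"))`, since a file object iterates line by line.
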
## Running on Multiple Machines

### Lock File Mechanism

The script uses **file-level locks** to prevent duplicate processing:

```
/opt/Optmiser/.lock/{filename}
```

When processing a file:
1. Checks if lock exists → Skip (another machine is encoding it)
2. Creates lock → Process
3. Removes lock when done

**Safe to run on multiple machines**, provided they all see the same lock directory (for example, over a shared network mount). Each machine will then pick different files to encode.

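
A separate "check, then create" leaves a short window in which two machines can both see no lock. One way to close it is to create the lock atomically, sketched below. The helper names are hypothetical, and this is not necessarily how the script does it.

```python
import os
from pathlib import Path


def try_lock(lock_dir, filename):
    """Atomically create a lock file; return True if this process now owns it."""
    lock_dir = Path(lock_dir)
    lock_dir.mkdir(parents=True, exist_ok=True)
    try:
        # O_EXCL makes creation fail if the file already exists, so the
        # check and the create happen as a single step.
        fd = os.open(lock_dir / filename, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def unlock(lock_dir, filename):
    """Remove the lock file; do nothing if it is already gone."""
    (Path(lock_dir) / filename).unlink(missing_ok=True)
```

Wrapping the encode in `try: ... finally: unlock(...)` keeps a failed encode from leaving a stale lock behind. How reliable exclusive creation is on a network filesystem depends on the mount, so it is worth testing on the actual share.
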
### Example Setup

**Machine 1 (Intel i9-12900H - Remote Server):**
```bash
# Runs on /mnt/Media/tv and /mnt/Media/movies
sudo /opt/Optmiser/run_optimisation.sh
```

**Machine 2 (AMD RX 7900 XT - Local PC):**
```bash
# Runs on your local media library
python3 /path/to/optimise_media_v2.py /path/to/media tv_movies
```

Both will process different files automatically due to lock checking.

## Hardware Encoding

### Supported Hardware

**Server (Intel i9-12900H):**
- 24 threads (configurable via `--workers` flag)
- No GPU acceleration (software AV1)
- Use `--cpu-limit` to leave CPU headroom for other tasks

**Local PC (AMD RX 7900 XT):**
- Hardware AV1 encoding via GPU
- Much faster than CPU
- Use when available (detected automatically)

**Server (50% CPU Mode):**
- When `--cpu-limit 50` is set
- Limits to 12 threads on a 24-thread system
- Leaves CPU for other tasks while encoding

### Hardware Detection

The script automatically detects:
1. **GPU available:** Checks for AMD/NVIDIA GPU encoding support
2. **System type:** Linux (server) vs Windows (local PC)
3. **Thread count:** Automatically detected
4. **Encoding mode:** Selects best available option

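
One plausible way to check for GPU encoding support is to look for hardware AV1 encoders in the output of `ffmpeg -encoders`. The sketch below assumes that approach; the function names are hypothetical and the script may detect hardware differently.

```python
import subprocess

# Hardware AV1 encoder names as exposed by FFmpeg builds that include them.
HW_AV1_ENCODERS = ("av1_amf", "av1_vaapi", "av1_nvenc", "av1_qsv")


def find_hw_av1(encoders_text):
    """Return hardware AV1 encoders listed in `ffmpeg -encoders` output."""
    listed = set()
    for line in encoders_text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            listed.add(parts[1])   # second column is the encoder name
    return [name for name in HW_AV1_ENCODERS if name in listed]


def detect_hw_av1():
    """Query the local ffmpeg binary (assumes ffmpeg is on PATH)."""
    out = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=True,
    ).stdout
    return find_hw_av1(out)
```

A listed encoder only shows that FFmpeg was built with it. Confirming that the GPU and driver actually work still takes a short test encode.
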
### Encoding Modes

#### 1. Software Encoding (SVT-AV1 CPU)
- **Best for:** Servers, background processing
- **Speed:** Slower, but highest quality
- **CPU Usage:** High (unless limited)
- **Command:** `ab-av1 encode --encoder libsvtav1`

**When to use:**
- No GPU available
- Want to leave GPU free for other tasks
- Server environments (multi-user)

#### 2. Hardware Encoding (AMD GPU - AV1 via Vulkan/Mesa)
- **Best for:** Local PC, faster encoding
- **Speed:** 3-10x faster than CPU
- **CPU Usage:** Low
- **Trade-off:** Slightly lower quality at same CRF (GPU limitations)

**Detection:**
```python
# Checks if AV1 GPU encoding is available
has_gpu_av1 = check_for_amd_av1_gpu()
```

**When to use:**
- AMD RX 7900 XT detected
- Want faster encoding speeds
- Single-user PC

#### 3. Software Encoding with CPU Limit (50% mode)
- **Best for:** Server with other tasks running
- **CPU Usage:** 50% (leaves headroom)
- **Threads:** Half of available threads

**When to use:**
- Server needs CPU for other services
- Encode while Plex/Jellyfin active

### Flags for Hardware Control

```bash
# Use hardware encoding if available (automatic)
python3 optimise_media_v2.py /media --use-hardware

# Force software encoding
python3 optimise_media_v2.py /media --use-cpu

# Limit CPU to 50% (12 threads on a 24-thread system)
python3 optimise_media_v2.py /media --cpu-limit 50

# Set specific worker count
python3 optimise_media_v2.py /media --workers 8
```

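
A command-line interface with these flags could be declared roughly as below. This is a sketch: the name `log_name` for the second positional argument, the default values, and the `thread_budget` helper are assumptions, not taken from the script.

```python
import argparse
import os


def build_parser():
    """Declare the flags described above (defaults are assumptions)."""
    p = argparse.ArgumentParser(description="VMAF-targeted AV1 optimiser (sketch)")
    p.add_argument("path", help="file or directory to process")
    p.add_argument("log_name", nargs="?", default="tv_movies",
                   help="log to write results to (hypothetical name)")
    mode = p.add_mutually_exclusive_group()
    mode.add_argument("--use-hardware", action="store_true")
    mode.add_argument("--use-cpu", action="store_true")
    p.add_argument("--cpu-limit", type=int, default=100, metavar="PERCENT")
    p.add_argument("--workers", type=int, default=1)
    p.add_argument("--dry-run", action="store_true")
    return p


def thread_budget(cpu_limit, total=None):
    """Threads allowed under a percentage limit, never fewer than one."""
    total = total or os.cpu_count() or 1
    return max(1, total * cpu_limit // 100)
```

With `--cpu-limit 50` on a 24-thread machine, `thread_budget(50, total=24)` gives 12, matching the 50% mode described above. Putting `--use-hardware` and `--use-cpu` in a mutually exclusive group rejects contradictory invocations early.
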
### Windows/WSL Support

#### On Native Windows

**Prerequisites:**
1. Install FFmpeg and ab-av1
2. Copy `/opt/Optmiser` folder structure to Windows
3. Update `AB_AV1_PATH` in script or use `--ab-av1-path`

**Setup:**
```powershell
# Install ab-av1 via cargo
cargo install ab-av1

# Run on Windows media library
python3 C:\Optmiser\optimise_media_v2.py D:\Media tv_movies
```

#### On WSL (Windows Subsystem for Linux)

**Best option:** Run in WSL for native Linux support
```bash
# Install in WSL Ubuntu/Debian
sudo apt update
sudo apt install -y ffmpeg python3
cargo install ab-av1   # requires a Rust toolchain (cargo)

# Copy scripts from the Windows drive into WSL
sudo cp -r /mnt/c/Optmiser /opt/Optmiser

# Run in WSL (accesses Windows C: drive at /mnt/c/)
python3 /opt/Optmiser/optimise_media_v2.py /mnt/c/Media tv_movies
```

**WSL Path Mapping:**
```
Windows C:\        → /mnt/c/
Windows D:\        → /mnt/d/
\\Server\media\    → Network mount (if configured)
```

#### Running Across Multiple Machines

All three can run simultaneously with proper locking:

```
Server (Linux):      /mnt/Media/tv    → Lock files, encode to AV1
Local PC (Windows):  D:\Media\tv      → Lock files, encode to AV1
Local PC (WSL):      /mnt/c/Media/tv  → Lock files, encode to AV1
```

Each machine processes different files automatically!

## Performance Characteristics

### Encoding Speed Estimates

| Hardware | Speed (1080p) | Speed (4K) |
|----------|---------------|------------|
| Intel i9 (24 threads) | ~15 fps | ~3-5 fps |
| AMD RX 7900 XT (GPU) | ~150 fps | ~30-50 fps |
| AMD RX 7900 XT (CPU, 12t) | ~8 fps | ~1-2 fps |
| Intel i9 (12 threads, 50%) | ~8 fps | ~1-2 fps |

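
An encoding speed in fps converts to wall-clock time as frames divided by speed. The helper below is a hypothetical illustration, and the 24 fps source frame rate in the example is an assumption.

```python
def encode_minutes(duration_s, source_fps, encode_fps):
    """Wall-clock minutes to encode a video at a steady encode_fps."""
    frames = duration_s * source_fps   # total frames in the source
    return frames / encode_fps / 60
```

For a one-hour 24 fps source, `encode_minutes(3600, 24, 15)` gives 96.0 minutes at ~15 fps, and about 9.6 minutes at ~150 fps.
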
### Time Estimates

For a 1-hour 1080p video (h264 → AV1). Treat these as rough estimates: they are far shorter than the fps figures above imply (at ~15 fps, an hour of 24 fps footage is 86,400 frames, or about 96 minutes), so benchmark a sample on your own hardware first.

| Hardware | VMAF 94 | VMAF 93 | VMAF 90 |
|----------|---------|---------|---------|
| Intel i9 (CPU, 24t) | 4-5 min | 3-4 min | 2-3 min |
| AMD RX 7900 XT (GPU) | 30-60 sec | 20-40 sec | 15-30 sec |
| AMD RX 7900 XT (CPU) | 7-10 min | 6-9 min | 5-8 min |
| Intel i9 (CPU, 12t, 50%) | 7-10 min | 6-9 min | 5-8 min |

## Troubleshooting

### Issue: "0k bitrate" display

**Cause:** ffprobe does not report a usable bitrate for some files (the field is 0 or missing), which is common with VBR content.

**Solution (implemented):** Calculate from `(size × 8) / duration` (size in bytes, duration in seconds, result in bits per second)

### Issue: ETA showing "eta 0s" early in encode

**Cause:** ab-av1 prints an initial ETA before it has enough data to estimate one.

**Solution:** Real-time streaming now shows progress updates properly.

### Issue: Multiple machines encoding same file

**Cause:** No coordination between machines.

**Solution:** Lock files in `/opt/Optmiser/.lock/{filename}`

### Issue: Encode fails with "unexpected argument"

**Cause:** Using wrong flags for ab-av1 commands.

**Solution:**
- `crf-search` supports `--temp-dir`
- `encode` does NOT support `--temp-dir`
- Use `--acodec copy` not `-c copy`

## File Structure

```
/opt/Optmiser/
├── optimise_media_v2.py       # Main encoding script
├── run_optimisation.sh        # Master runner
├── bin/
│   └── ab-av1                 # ab-av1 binary (downloaded)
├── tmp/                       # Temporary encoding files
├── logs/                      # Log files (JSONL format)
│   ├── tv_movies.jsonl
│   ├── content.jsonl
│   ├── low_savings_skips.jsonl
│   ├── failed_searches.jsonl
│   └── failed_encodes.jsonl
├── .lock/                     # Multi-machine coordination (created at runtime)
├── ffmpeg-static/             # FFmpeg binaries (if using bundled)
└── README_v2_FINAL.md         # This documentation
```

## Best Practices

### For Server (Intel i9-12900H)

1. **Use 50% CPU mode** if running other services (Plex, Docker, etc.)
   ```bash
   python3 optimise_media_v2.py /media --cpu-limit 50
   ```

2. **Run during off-peak hours** to minimize impact on users

3. **Monitor CPU temperature** during encoding:
   ```bash
   watch -n 2 'sensors | grep "Package id"'
   ```

4. **Use Preset 6-8** for faster encodes (relative to preset 6, preset 4 is roughly 2x slower and preset 8 roughly 2x faster)

### For Local PC (AMD RX 7900 XT)

1. **Enable hardware encoding** for massive speedup:
   ```bash
   python3 optimise_media_v2.py /media --use-hardware
   ```

2. **Test small sample first** to verify settings:
   ```bash
   python3 optimise_media_v2.py /media/sample --use-hardware --dry-run
   ```

3. **Monitor GPU usage**:
   ```bash
   # radeontop refreshes on its own, so `watch` is not needed
   # (its -d option takes a dump file name, not an interval)
   radeontop
   ```

4. **Consider quality compensation:** GPU encoding may need lower VMAF target (e.g., VMAF 92) to match CPU quality.

### For WSL Setup

1. **Access Windows drives via /mnt/c/**
   ```bash
   ls /mnt/c/Media/tv   # Lists C:\Media\tv
   ```

2. **Use WSL2 if available** (better performance):
   ```bash
   wsl.exe --list --verbose
   # The VERSION column shows 1 or 2 for each distribution
   ```

3. **Increase memory limits** if encoding 4K content:
   ```bash
   # On Windows: edit %UserProfile%\.wslconfig, then run `wsl --shutdown`
   [wsl2]
   # or 24GB, 32GB
   memory=16GB
   ```

## Git Repository

**Repository:** https://gitea.theflagroup.com/bnair/VMAFOptimiser.git

### Initial Setup (on any machine)

```bash
# Clone repository
git clone https://gitea.theflagroup.com/bnair/VMAFOptimiser.git /opt/Optmiser

# Or if already exists:
cd /opt/Optmiser
git init
git remote add origin https://gitea.theflagroup.com/bnair/VMAFOptimiser.git
git branch -M main
git add .
git commit -m "Initial commit"
git push -u origin main
```

### Updating from Git

```bash
cd /opt/Optmiser
git pull origin main

# Run latest version
python3 optimise_media_v2.py ...
```

### Committing Changes

```bash
cd /opt/Optmiser
git status   # See what changed
git add optimise_media_v2.py
git commit -m "feat: add hardware encoding support"
git push
```

## Advanced Usage

### Dry Run (Test Without Encoding)

```bash
python3 optimise_media_v2.py /media test --dry-run
```

Tests everything except actual file replacement and encoding.

### Process Specific Files

```bash
# Single file
python3 optimise_media_v2.py /media/Movies/SpecificMovie.mkv tv_movies

# All movies in directory
python3 optimise_media_v2.py /media/Movies tv_movies
```

### Resume Processing

The script automatically skips:
- Already encoded files (AV1 codec)
- Files with existing `.lock` files
- Files already logged as successful

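
The skip rules can be expressed as a lookup against the success log plus two cheap checks. This is a sketch with hypothetical helper names; it assumes successful entries carry `"status": "success"` and a `"file"` path, as in the log format shown earlier.

```python
import json
from pathlib import Path


def already_done(log_path):
    """Return the set of file paths recorded as successful in a JSONL log."""
    done = set()
    log_path = Path(log_path)
    if not log_path.exists():
        return done
    with log_path.open() as fh:
        for line in fh:
            try:
                rec = json.loads(line)
            except json.JSONDecodeError:
                continue   # tolerate blank or partially written lines
            if rec.get("status") == "success":
                done.add(rec["file"])
    return done


def should_skip(path, codec, done, lock_dir):
    """True if the file is AV1 already, logged as done, or locked."""
    return (
        codec == "av1"
        or str(path) in done
        or (Path(lock_dir) / Path(path).name).exists()
    )
```

Loading the log into a set once per run keeps the per-file check cheap even for large libraries.
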
### Custom VMAF Targets

Edit `TARGETS` in script to change behavior:

```python
# More aggressive compression (lower quality, smaller files)
TARGETS = [92.0, 90.0, 88.0, 86.0]

# More conservative (higher quality, larger files)
TARGETS = [95.0, 94.0, 93.0]
```

## Performance Optimization Tips

### Server (i9-12900H)

1. **Use higher preset for speed:**
   ```python
   PRESET = 8  # Fast but slightly larger files
   ```

2. **Enable multiple concurrent encodes** (with --workers flag):
   ```bash
   # Encode 3 files at once (uses more CPU but faster total throughput)
   python3 optimise_media_v2.py /media --workers 3
   ```

3. **CPU affinity** (pin threads to specific cores):
   ```bash
   # Restrict the script and the encoders it launches to logical CPUs 0-11
   taskset -c 0-11 python3 optimise_media_v2.py /media
   ```

### AMD RX 7900 XT

1. **Test software vs hardware:**
   ```bash
   # Time both to see actual speedup. Use a separate copy of the sample
   # for each run, since a successful encode replaces the original.
   time python3 optimise_media_v2.py /media/sample_hw.mkv --use-hardware
   time python3 optimise_media_v2.py /media/sample_cpu.mkv --use-cpu
   ```

2. **Adjust GPU memory limit** (if OOM errors):
   ```bash
   # Not currently in script. Note that --svt passes options to the software
   # SVT-AV1 encoder only, and `mbr` there sets a maximum bitrate, not a memory limit.
   ```

3. **Use lower preset on GPU:**
   ```bash
   # GPU may need lower preset to match quality
   PRESET_GPU = 4  # Slower but better quality
   ```

## Support Matrix

| Platform | Status | Notes |
|----------|--------|-------|
| Linux (Intel CPU) | ✅ Supported | Native |
| Linux (AMD GPU) | 🚧 Planned | AMD AV1 via Vulkan/Mesa (future) |
| Windows (Native) | ✅ Supported | Needs FFmpeg/ab-av1 installed |
| Windows (WSL) | ✅ Supported | Best option for Windows users |
| Multi-machine | ✅ Supported | Via lock files |

---

**Last Updated:** December 31, 2025
**Version:** 2.0 with Hardware Encoding Support (Planned)