Merged unrelated histories and resolved conflicts

bnair
2025-12-31 16:35:58 +01:00
5 changed files with 1292 additions and 891 deletions

838
AGENTS.md

@@ -1,630 +1,360 @@
# VMAF Optimisation Pipeline - Agent Documentation # VMAF Optimiser - Agent Guidelines
## Overview ## Quick Reference
This project automates video library optimization to AV1 using VMAF (Video Multimethod Assessment Fusion) quality targets. It intelligently searches for optimal encoding parameters and gracefully degrades quality when needed to achieve target file size savings. **Purpose:** Video library optimization pipeline using VMAF quality targets with AV1 encoding.
## Architecture **Core Files:**
- `optimize_library.py` - Main Python script (342 lines)
- `run_optimisation.sh` - Linux/macOS wrapper
- `run_optimisation.ps1` - Windows wrapper
``` ---
run_optimisation.sh # Master runner script
## Build/Lint/Test Commands
optimise_media_v2.py # Main encoding engine
### Development Setup
ab-av1 (crf-search, encode) # AV1 encoding tool
ffprobe/ffmpeg # Media analysis/encoding ```bash
# Install dependencies (if not already)
cargo install ab-av1 # v0.10.3+
brew install ffmpeg # macOS
# OR: apt install ffmpeg # Linux/WSL
# OR: winget install ffmpeg # Windows
``` ```
## How It Works ### Linting
### Phase 1: Video Analysis ```bash
1. Scans directory for video files (.mkv, .mp4) # Ruff is the linter (indicated by .ruff_cache/)
2. Uses `ffprobe` to get: ruff check optimize_library.py
- Codec (h264, hevc, etc.)
- Resolution (width × height)
- Bitrate (calculated from size/duration)
- File size and duration
3. Skips if already AV1 encoded
### Phase 2: VMAF Target Search (Intelligent Fallback) # Format with ruff
ruff format optimize_library.py
The script tries VMAF targets in **descending order** (highest quality first): # Check specific issues
ruff check optimize_library.py --select E,F,W
```
Try VMAF 94 (Premium)
Can achieve?
↓ Yes ↓ No
Calculate savings Try VMAF 93
Savings ≥ 12%?
↓ Yes ↓ No
Encode at VMAF 94 Calculate savings
Savings ≥ 12%?
↓ Yes ↓ No
Encode at VMAF 93 Find 15% (test 92, 90)
``` ```
**Fallback Logic:** ### Running the Application
- If VMAF 94 gives ≥12% savings → **Encode at VMAF 94**
- If VMAF 94 <12% but VMAF 93 ≥12% → **Encode at VMAF 93**
- If both <12% → Find what VMAF gives 15%+ savings:
- Tests VMAF 93, 92, 90
- Reports "FOUND 15%+ SAVINGS" with exact parameters
- Logs for manual review (no encoding)
- User can decide to adjust settings
### Phase 3: CRF Search ```bash
# Linux/macOS
./run_optimisation.sh --directory /media --vmaf 95 --workers 1
Uses `ab-av1 crf-search` with `--thorough` flag: # Windows
- Takes multiple samples (20-30s segments) from video .\run_optimisation.ps1 -directory "D:\Movies" -vmaf 95 -workers 1
- Interpolates binary search for optimal CRF
- Outputs: Best CRF, Mean VMAF, Predicted size
- Uses `--temp-dir` for temporary file storage
**Why `--thorough`?** # Direct Python execution
- More samples = more accurate CRF estimation python3 optimize_library.py /media --vmaf 95 --preset 6 --workers 1
- Takes longer but prevents quality/savings miscalculation
- Recommended for library encoding (one-time cost)
### Phase 4: Full Encoding (with Real-time Output)
If savings threshold met:
1. Runs `ab-av1 encode` with found CRF
2. **Streams all output in real-time** (you see progress live)
3. Shows ETA, encoding speed, frame count
4. Uses `--acodec copy` to preserve audio/subtitles
**Real-time output example:**
```
→ Running encoding (CRF 34)
Encoded 4320/125400 frames (3.4%)
Encoded 8640/125400 frames (6.9%)
Encoded 12960/125400 frames (10.3%)
...
Encoded 125400/125400 frames (100.0%)
Speed: 15.2 fps, ETA: 2s
``` ```
### Phase 5: Verification & Replacement ### Testing
1. Probes encoded file for actual stats **No formal test suite exists currently.** Test manually by:
2. Calculates actual savings
3. Only replaces original if new file is smaller
4. Converts .mp4 to .mkv if needed
5. Logs detailed results to JSONL files
## Configuration ```bash
# Test with single video file
python3 optimize_library.py /media/sample.mkv --vmaf 95 --workers 1
### Key Settings (edit in `optimise_media_v2.py`) # Dry run (validate logic without encoding)
python3 optimize_library.py /media --vmaf 95 --thorough
# Check dependencies
python3 optimize_library.py 2>&1 | grep -E "(ffmpeg|ab-av1)"
```
---
## Code Style Guidelines
### Python Style (PEP 8 Compliant)
**Imports:**
```python
# Standard library first, grouped logically
import os
import sys
import subprocess
import json
import shutil
import platform
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
```
**Naming Conventions:**
```python
# Constants: UPPER_SNAKE_CASE
DEFAULT_VMAF = 95.0
DEFAULT_PRESET = 6
EXTENSIONS = {".mkv", ".mp4", ".mov", ".avi", ".ts"}
# Functions: snake_case
def get_video_info(filepath): ...
def build_ab_av1_command(input_path, output_path, args): ...
# Variables: snake_case
input_path = Path(filepath)
output_path = input_path.with_stem(input_path.stem + "_av1")  # Path.with_stem needs Python 3.9+
# Module-level cache: _PREFIX (private)
_AB_AV1_HELP_CACHE = {}
```
**Formatting:**
- 4-space indentation
- Line length: ~88-100 characters (ruff default: 88)
- No trailing whitespace
- One blank line between functions
- Two blank lines before class definitions (if any)
**Function Structure:**
```python
def function_name(param1, param2, optional_param=None):
    """Brief description if needed."""
    try:
        # Implementation
        return result
    except Exception as e:
        print(f"Error: {e}")
        return None  # or handle gracefully
```
**Subprocess Calls:**
```python
# Use subprocess.run for all external commands
cmd = ["ffmpeg", "-i", input_file, output_file]
try:
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
except subprocess.CalledProcessError as e:
    # check=True raises on a non-zero exit code, so handle the failure here
    print(f"Command failed: {e.stderr}")
```
### Error Handling
```python ```python
TARGETS = [94.0, 93.0, 92.0, 90.0] # VMAF targets to try # Always wrap external tool calls in try-except
MIN_SAVINGS_PERCENT = 12.0 # Encode if savings ≥12% try:
TARGET_SAVINGS_FOR_ESTIMATE = 15.0 # Estimate for this level info = get_video_info(filepath)
PRESET = 6 # SVT-AV1 preset (4=best, 8=fast) if not info:
EXTENSIONS = {'.mkv', '.mp4'} # File extensions to process return # Early return on None
except subprocess.CalledProcessError as e:
print(f"FFmpeg failed: {e}")
return
# Use specific exception types when possible
except FileNotFoundError:
print("File not found")
except json.JSONDecodeError:
print("Invalid JSON")
``` ```
### What is CRF? ### Platform Detection
**Constant Rate Factor (CRF):** Quality/bitrate trade-off ```python
- **Lower CRF** = Higher quality, larger files (e.g., CRF 20) # Use platform module for OS detection
- **Higher CRF** = Lower quality, smaller files (e.g., CRF 40) def is_wsl():
- AV1 CRF range: 0-63 (default for VMAF 94 is ~34-36) if os.environ.get("WSL_DISTRO_NAME"):
return True
try:
with open("/proc/sys/kernel/osrelease", "r") as f:
return "microsoft" in f.read().lower()
except FileNotFoundError:
return False
### What is VMAF? def platform_label():
system = platform.system()
if system == "Linux" and is_wsl():
return "Linux (WSL)"
return system
```
**Video Multimethod Assessment Fusion:** Netflix's quality metric ### Argument Parsing
- **VMAF 95:** "Visually lossless" - indistinguishable from source
- **VMAF 94:** Premium quality - minor artifacts
- **VMAF 93:** Good quality - acceptable for most content
- **VMAF 90:** Standard quality - may have noticeable artifacts
- **VMAF 85:** Acceptable quality for mobile/low bandwidth
## Logging System ```python
def main():
parser = argparse.ArgumentParser(description="Description")
parser.add_argument("directory", help="Root directory")
parser.add_argument("--vmaf", type=float, default=95.0, help="Target VMAF")
args = parser.parse_args()
```
### Log Files (all in `/opt/Optmiser/logs/`) ---
| File | Purpose | Format | ## Shell Script Guidelines (run_optimisation.sh)
|------|---------|--------|
| `tv_movies.jsonl` | Successful TV & Movie encodes | JSONL (one line per file) |
| `content.jsonl` | Successful Content folder encodes | JSONL |
| `low_savings_skips.jsonl` | Files with <12% savings + 15% estimates | JSONL |
| `failed_searches.jsonl` | Files that couldn't hit any VMAF target | JSONL |
| `failed_encodes.jsonl` | Encoding errors | JSONL |
### Log Entry Format **Shebang & Error Handling:**
```bash
#!/bin/bash
set -e # Exit on error
```
**Successful encode:** **Color Output:**
```json ```bash
{ COLOR_RED='\033[0;31m'
"file": "/path/to/file.mkv", COLOR_GREEN='\033[0;32m'
"status": "success", COLOR_CYAN='\033[0;36m'
"vmaf": 94.0, COLOR_RESET='\033[0m'
"crf": 34.0,
"before": { log_info() {
"codec": "h264", echo -e "${COLOR_CYAN}$*${COLOR_RESET}"
"bitrate": 8500, }
"size": 2684354560,
"duration": 1379.44 log_error() {
}, echo -e "${COLOR_RED}ERROR: $*${COLOR_RESET}" >&2
"after": {
"codec": "av1",
"bitrate": 6400,
"size": 2013265920,
"duration": 1379.44
},
"duration": 145.2,
"savings": 25.0,
"timestamp": "2025-12-31T12:00:00.000Z"
} }
``` ```
**Low savings with 15% estimate:** **Argument Parsing:**
```json
{
"file": "/path/to/file.mkv",
"vmaf_94": 94.0,
"savings_94": 7.0,
"vmaf_93": 93.0,
"savings_93": 18.0,
"target_for_15_percent": {
"target_vmaf": 93,
"crf": 37,
"savings": 18.0,
"quality_drop": 1,
"found": true
},
"recommendations": "logged_for_review",
"timestamp": "2025-12-31T12:00:00.000Z"
}
```
### Viewing Logs
```bash ```bash
# Watch logs in real-time while [[ $# -gt 0 ]]; do
tail -f /opt/Optmiser/logs/tv_movies.jsonl | jq '.' case "$1" in
--vmaf)
# Check files logged for review (both 94 and 93 <12%) VMAF="$2"
cat /opt/Optmiser/logs/low_savings_skips.jsonl | jq '.[] | select(.recommendations=="logged_for_review")' shift 2
;;
# Statistics *)
jq -r '.status' /opt/Optmiser/logs/tv_movies.jsonl | sort | uniq -c DIRECTORY="$1"
shift
# Find what CRF/VMAF combinations are being used most ;;
jq -r '[.vmaf, .crf] | @tsv' /opt/Optmiser/logs/tv_movies.jsonl | sort | uniq -c esac
done
``` ```
## Running on Multiple Machines ---
### Lock File Mechanism ## PowerShell Guidelines (run_optimisation.ps1)
The script uses **file-level locks** to prevent duplicate processing: **Parameter Declaration:**
```
/opt/Optmiser/.lock/{filename}
```
When processing a file:
1. Checks if lock exists → Skip (another machine is encoding it)
2. Creates lock → Process
3. Removes lock when done
**Safe to run on multiple machines!** Each will pick different files to encode.
### Example Setup
**Machine 1 (Intel i9-12900H - Remote Server):**
```bash
# Runs on /mnt/Media/tv and /mnt/Media/movies
sudo /opt/Optmiser/run_optimisation.sh
```
**Machine 2 (AMD RX 7900 XT - Local PC):**
```bash
# Runs on your local media library
python3 /path/to/optimise_media_v2.py /path/to/media tv_movies
```
Both will process different files automatically due to lock checking.
## Hardware Encoding
### Supported Hardware
**Server (Intel i9-12900H):**
- 24 threads (configurable via `--workers` flag)
- No GPU acceleration (software AV1)
- Use software encoding to leave CPU for other tasks
**Local PC (AMD RX 7900 XT):**
- Hardware AV1 encoding via GPU
- Much faster than CPU
- Use when available (detected automatically)
**Server (50% CPU Mode):**
- When `--cpu-limit 50` is set
- Limits to 12 threads on 24-core system
- Leaves CPU for other tasks while encoding
### Hardware Detection
The script automatically detects:
1. **GPU available:** Checks for AMD/NVIDIA GPU encoding support
2. **System type:** Linux (server) vs Windows (local PC)
3. **Thread count:** Automatically detected
4. **Encoding mode:** Selects best available option
### Encoding Modes
#### 1. Software Encoding (SVT-AV1 CPU)
- **Best for:** Servers, background processing
- **Speed:** Slower, but highest quality
- **CPU Usage:** High (unless limited)
- **Command:** `ab-av1 encode --encoder libsvtav1`
**When to use:**
- No GPU available
- Want to leave GPU free for other tasks
- Server environments (multi-user)
#### 2. Hardware Encoding (AMD GPU - AV1 via Vulkan/Mesa)
- **Best for:** Local PC, faster encoding
- **Speed:** 3-10x faster than CPU
- **CPU Usage:** Low
- **Trade-off:** Slightly lower quality at same CRF (GPU limitations)
**Detection:**
```python
# Checks if AV1 GPU encoding is available
has_gpu_av1 = check_for_amd_av1_gpu()
```
**When to use:**
- AMD RX 7900 XT detected
- Want faster encoding speeds
- Single-user PC
#### 3. Hardware Encoding with CPU Limit (50% mode)
- **Best for:** Server with other tasks running
- **CPU Usage:** 50% (leaves headroom)
- **Threads:** Half of available cores
**When to use:**
- Server needs CPU for other services
- Encode while Plex/Jellyfin active
### Flags for Hardware Control
```bash
# Use hardware encoding if available (automatic)
python3 optimise_media_v2.py /media --use-hardware
# Force software encoding
python3 optimise_media_v2.py /media --use-cpu
# Limit CPU to 50% (12 threads on 24-core)
python3 optimise_media_v2.py /media --cpu-limit 50
# Set specific worker count
python3 optimise_media_v2.py /media --workers 8
```
### Windows/WSL Support
#### On Native Windows
**Prerequisites:**
1. Install FFmpeg and ab-av1
2. Copy `/opt/Optmiser` folder structure to Windows
3. Update `AB_AV1_PATH` in script or use `--ab-av1-path`
**Setup:**
```powershell ```powershell
# Install ab-av1 via cargo param(
cargo install ab-av1 [Parameter(Mandatory=$false)]
[string]$Directory = ".",
# Run on Windows media library [float]$Vmaf = 95.0,
python3 C:\Optmiser\optimise_media_v2.py D:\Media tv_movies [switch]$Thorough
)
``` ```
#### On WSL (Windows Subsystem for Linux) **Error Handling:**
```powershell
$ErrorActionPreference = "Stop"
**Best option:** Run in WSL for native Linux support function Write-ColorOutput {
```bash param([string]$Message, [string]$Color = "White")
# Install in WSL Ubuntu/Debian Write-Host $Message -ForegroundColor $Color
sudo apt update }
sudo apt install -y ffmpeg python3
cargo install ab-av1
# Copy scripts to WSL
cp -r /mnt/c/Optmiser /mnt/c/path/to/optmiser
# Run in WSL (accesses Windows C: drive at /mnt/c/)
python3 /opt/Optmiser/optimise_media_v2.py /mnt/c/Media tv_movies
``` ```
**WSL Path Mapping:** **Process Management:**
``` ```powershell
Windows C:\ → /mnt/c/ $process = Start-Process -FilePath $pythonCmd.Path -ArgumentList $arguments `
Windows D:\ → /mnt/d/ -NoNewWindow -PassThru
\\Server\media\ → Network mount (if configured) $process.WaitForExit()
$exitCode = $process.ExitCode
``` ```
#### Running Across Multiple Machines ---
All three can run simultaneously with proper locking: ## Key Constraints & Best Practices
``` ### When Modifying `optimize_library.py`
Server (Linux): /mnt/Media/tv → Lock files, encode to AV1
Local PC (Windows): D:\Media\tv → Lock files, encode to AV1
Local PC (WSL): /mnt/c/Media/tv → Lock files, encode to AV1
```
Each machine processes different files automatically! 1. **Maintain platform compatibility:** Always test on Linux, Windows, and macOS
2. **Preserve subprocess patterns:** Use `subprocess.run` with `check=True`
3. **Handle missing dependencies:** Check `shutil.which()` before running tools
4. **Thread safety:** The script uses `ThreadPoolExecutor` - avoid global state
5. **Path handling:** Always use `Path` objects from `pathlib`
## Performance Characteristics ### When Modifying Wrapper Scripts
### Encoding Speed Estimates 1. **Keep interfaces consistent:** Both scripts should accept the same parameters
2. **Preserve color output:** Users expect colored status messages
3. **Validate Python path:** Handle `python3` vs `python` vs `py`
4. **Check script existence:** Verify `optimize_library.py` exists before running
| Hardware | Resolution | Speed (1080p) | Speed (4K) | ### File Organization
|-----------|------------|------------------|-------------|
| Intel i9 (24 threads) | ~15 fps | ~3-5 fps |
| AMD RX 7900 XT (GPU) | ~150 fps | ~30-50 fps |
| AMD RX 7900 XT (CPU, 12t) | ~8 fps | ~1-2 fps |
| Intel i9 (12 threads, 50%) | ~8 fps | ~1-2 fps |
### Time Estimates - Keep functions under 50 lines
- Use descriptive names (no abbreviations like `proc_file`, use `process_file`)
- Cache external command help text (see `_AB_AV1_HELP_CACHE`)
- Use constants for magic numbers and strings
For 1-hour 1080p video (h264 → AV1): ### Hardware Acceleration
| Hardware | VMAF 94 | VMAF 93 | VMAF 90 | - Auto-detect via `normalize_hwaccel()` function
|-----------|----------|----------|----------| - Respect `--hwaccel` flag
| Intel i9 (CPU, 24t) | 4-5 min | 3-4 min | 2-3 min | - Check ab-av1 support with `ab_av1_supports()` before using flags
| AMD RX 7900 XT (GPU) | 30-60 sec | 20-40 sec | 15-30 sec | - Default: `auto` (d3d11va on Windows, videotoolbox on macOS, vaapi on Linux)
| AMD RX 7900 XT (CPU) | 7-10 min | 6-9 min | 5-8 min |
| Intel i9 (CPU, 12t, 50%) | 7-10 min | 6-9 min | 5-8 min |
## Troubleshooting ---
### Issue: "0k bitrate" display ## Common Patterns
**Cause:** VBR (Variable Bitrate) files show 0 in ffprobe's format bitrate field.
**Solution (implemented):** Calculate from `(size × 8) / duration`
### Issue: ETA showing "eta 0s" early in encode
**Cause:** ab-av1 outputs initial ETA estimate before calculating.
**Solution:** Real-time streaming now shows progress updates properly.
### Issue: Multiple machines encoding same file
**Cause:** No coordination between machines.
**Solution:** Lock files in `/opt/Optmiser/.lock/{filename}`
### Issue: Encode fails with "unexpected argument"
**Cause:** Using wrong flags for ab-av1 commands.
**Solution:**
- `crf-search` supports `--temp-dir`
- `encode` does NOT support `--temp-dir`
- Use `--acodec copy` not `-c copy`
## File Structure
```
/opt/Optmiser/
├── optimise_media_v2.py # Main encoding script
├── run_optimisation.sh # Master runner
├── bin/
│ └── ab-av1 # ab-av1 binary (downloaded)
├── tmp/ # Temporary encoding files
├── logs/ # Log files (JSONL format)
│ ├── tv_movies.jsonl
│ ├── content.jsonl
│ ├── low_savings_skips.jsonl
│ ├── failed_searches.jsonl
│ └── failed_encodes.jsonl
├── .lock/ # Multi-machine coordination (created at runtime)
├── ffmpeg-static/ # FFmpeg binaries (if using bundled)
└── README_v2_FINAL.md # This documentation
```
## Best Practices
### For Server (Intel i9-12900H)
1. **Use 50% CPU mode** if running other services (Plex, Docker, etc.)
```bash
python3 optimise_media_v2.py /media --cpu-limit 50
```
2. **Run during off-peak hours** to minimize impact on users
3. **Monitor CPU temperature** during encoding:
```bash
watch -n 2 'sensors | grep "Package id"'
```
4. **Use Preset 6-8** for faster encodes (preset 4 = 2x slower, preset 8 = 2x faster)
### For Local PC (AMD RX 7900 XT)
1. **Enable hardware encoding** for massive speedup:
```bash
python3 optimise_media_v2.py /media --use-hardware
```
2. **Test small sample first** to verify settings:
```bash
python3 optimise_media_v2.py /media/sample --use-hardware --dry-run
```
3. **Monitor GPU usage**:
```bash
watch -n 2 'radeontop -d 1'
```
4. **Consider quality compensation:** GPU encoding may need lower VMAF target (e.g., VMAF 92) to match CPU quality.
### For WSL Setup
1. **Access Windows drives via /mnt/c/**
```bash
ls /mnt/c/Media/tv # Lists C:\Media\tv
```
2. **Use WSL2 if available** (better performance):
```bash
wsl.exe --list
# Look for version with WSL2 in distribution name
```
3. **Increase memory limits** if encoding 4K content:
```bash
# In WSL: Edit ~/.wslconfig
[wsl2]
memory=16GB # or 24GB, 32GB
```
## Git Repository
**Repository:** https://gitea.theflagroup.com/bnair/VMAFOptimiser.git
### Initial Setup (on any machine)
```bash
# Clone repository
git clone https://gitea.theflagroup.com/bnair/VMAFOptimiser.git /opt/Optmiser
# Or if already exists:
cd /opt/Optmiser
git init
git remote add origin https://gitea.theflagroup.com/bnair/VMAFOptimiser.git
git branch -M main
git add .
git commit -m "Initial commit"
git push -u origin main
```
### Updating from Git
```bash
cd /opt/Optmiser
git pull origin main
# Run latest version
python3 optimise_media_v2.py ...
```
### Committing Changes
```bash
cd /opt/Optmiser
git status # See what changed
git add optimise_media_v2.py
git commit -m "feat: add hardware encoding support"
git push
```
## Advanced Usage
### Dry Run (Test Without Encoding)
```bash
python3 optimise_media_v2.py /media test --dry-run
```
Tests everything except actual file replacement and encoding.
### Process Specific Files
```bash
# Single file
python3 optimise_media_v2.py /media/Movies/SpecificMovie.mkv tv_movies
# All movies in directory
python3 optimise_media_v2.py /media/Movies tv_movies
```
### Resume Processing
The script automatically skips:
- Already encoded files (AV1 codec)
- Files with existing `.lock` files
- Files already logged as successful
### Custom VMAF Targets
Edit `TARGETS` in script to change behavior:
### Checking Tool Availability
```python ```python
# More aggressive compression (lower quality, smaller files) def check_dependencies():
TARGETS = [92.0, 90.0, 88.0, 86.0] missing = []
for tool in ["ffmpeg", "ffprobe", "ab-av1"]:
# More conservative (higher quality, larger files) if not shutil.which(tool):
TARGETS = [95.0, 94.0, 93.0] missing.append(tool)
if missing:
print(f"Error: Missing tools: {', '.join(missing)}")
sys.exit(1)
``` ```
## Performance Optimization Tips ### Building Commands Conditionally
```python
cmd = ["ab-av1", "auto-encode", "-i", input_path]
### Server (i9-12900H) if args.encoder:
if ab_av1_supports("auto-encode", "--encoder"):
cmd.extend(["--encoder", args.encoder])
else:
print("Warning: Encoder not supported")
```
1. **Use higher preset for speed:** ### File Path Operations
```python ```python
PRESET = 8 # Fast but slightly larger files # Use pathlib for cross-platform paths
``` input_path = Path(filepath)
output_path = input_path.with_stem(input_path.stem + "_av1")
2. **Enable multiple concurrent encodes** (with --workers flag): # Safe existence check
```bash if output_path.exists():
# Encode 3 files at once (uses more CPU but faster total throughput) print(f"Skipping: {input_path.name}")
python3 optimise_media_v2.py /media --workers 3 return
``` ```
3. **CPU affinity** (pin threads to specific cores): ---
```bash
# Edit ab-av1 command to add: --svt 'rc=logical:0-11'
```
### AMD RX 7900 XT ## Version Control
1. **Test software vs hardware:** ```bash
```bash # Check for changes
# Time both to see actual speedup git status
time python3 optimise_media_v2.py /media --use-hardware sample.mkv
time python3 optimise_media_v2.py /media --use-cpu sample.mkv
```
2. **Adjust GPU memory limit** (if OOM errors): # Format before committing
```bash ruff format optimize_library.py
# Not currently in script, but can add via --svt flag ruff check optimize_library.py
# Example: --svt 'mbr=5000000' (limit memory)
```
3. **Use lower preset on GPU:** # Commit with conventional commits
```bash git commit -m "feat: add hardware acceleration support"
# GPU may need lower preset to match quality git commit -m "fix: handle missing ffprobe gracefully"
PRESET_GPU = 4 # Slower but better quality git commit -m "docs: update setup instructions"
``` ```
## Support Matrix ---
| Platform | Status | Notes | ## Important Notes
|----------|--------|-------|
| Linux (Intel CPU) | ✅ Supported | Native | 1. **No type hints:** Current codebase doesn't use Python typing
| Linux (AMD GPU) | ✅ Planned | AMD AV1 via Vulkan/Mesa (future) | 2. **No formal tests:** Test manually with sample videos
| Windows (Native) | ✅ Supported | Needs FFmpeg/ab-av1 installed | 3. **No package.json:** This is a standalone script, not a Python package
| Windows (WSL) | ✅ Supported | Best option for Windows users | 4. **Lock files:** `.lock/` directory created at runtime for multi-machine coordination
| Multi-machine | ✅ Supported | Via lock files | 5. **Logs:** JSONL format in `logs/` directory for structured data
--- ---
**Last Updated:** December 31, 2025 **Last Updated:** December 31, 2025
**Version:** 2.0 with Hardware Encoding Support (Planned)

572
README.md

@@ -1,43 +1,557 @@
# VMAF Optimisation Pipeline # VMAF Optimiser
Automated video library optimization to AV1 using VMAF quality targeting. Automated video library optimization to AV1 using VMAF (Video Multimethod Assessment Fusion) quality targets. Intelligently searches for optimal encoding parameters and gracefully degrades quality when needed to achieve target file size savings.
## Features
-**Intelligent VMAF Fallback:** 94 → 93 → 92 → 90
-**15% Savings Estimation:** Finds exact VMAF needed for target savings
-**Real-time Output:** Live progress with ETA display
-**Multi-Machine Support:** Lock files prevent duplicate processing
-**Skip AV1 Files:** Won't re-encode already compressed content
-**Separate Logging:** TV/Movies and Content tracked separately
-**Thorough CRF Search:** More accurate VMAF/CRF determination
-**Windows/WSL Compatible:** Run on Windows or WSL with proper path mapping
## Quick Start ## Quick Start
```bash ### Requirements
# Clone repository
git clone https://gitea.theflagroup.com/bnair/VMAFOptimiser.git /opt/Optmiser
# Process media **Prerequisites:**
python3 /opt/Optmiser/optimise_media_v2.py /path/to/media tv_movies - Python 3.8+
- FFmpeg with VMAF support (`ffmpeg -filters 2>&1 | grep libvmaf`)
- ab-av1 binary (v0.10.3+)
**Installation:**
```bash
# Install ab-av1 via cargo
cargo install ab-av1
# Or download pre-built binary
wget -O ab-av1 https://github.com/alexheretic/ab-av1/releases/download/v0.10.3/ab-av1-x86_64-unknown-linux-musl
chmod +x ab-av1
``` ```
## Documentation ## Running the Optimiser
- **AGENTS.md** - Complete technical documentation for AI agents/humans ### Windows / macOS
- **SETUP.md** - Installation, configuration, and usage guide
## Requirements Use the PowerShell wrapper script:
- Python 3.8+ ```powershell
- FFmpeg with VMAF support # Interactive mode (shows prompts)
- ab-av1 v0.10.3+ .\run_optimisation.ps1
## License # Direct execution with parameters
.\run_optimisation.ps1 --directory "D:\Movies" --vmaf 95 --preset 6 --workers 1
```
MIT License - See LICENSE file for details. **Available flags:**
- `--directory <path>` - Root directory to scan (default: current directory)
- `--vmaf <score>` - Target VMAF score (default: 95.0)
- `--preset <value>` - SVT-AV1 Preset (default: 6)
- `--workers <count>` - Concurrent files to process (default: 1)
- `--samples <count>` - Samples for CRF search (default: 4)
- `--thorough` - Use thorough mode (slower, more accurate)
- `--encoder <name>` - ab-av1 encoder (default: svt-av1)
- `--hwaccel <value>` - Hardware acceleration (default: none)
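
The flags above map naturally onto an `argparse` parser. The following is a minimal sketch, not the actual parser in `optimize_library.py`: the function name `build_parser` is hypothetical, and the defaults are simply copied from the list above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented wrapper flags."""
    parser = argparse.ArgumentParser(description="VMAF-targeted AV1 optimiser (sketch)")
    parser.add_argument("--directory", default=".", help="Root directory to scan")
    parser.add_argument("--vmaf", type=float, default=95.0, help="Target VMAF score")
    parser.add_argument("--preset", type=int, default=6, help="SVT-AV1 preset")
    parser.add_argument("--workers", type=int, default=1, help="Concurrent files")
    parser.add_argument("--samples", type=int, default=4, help="Samples for CRF search")
    parser.add_argument("--thorough", action="store_true", help="Slower, more accurate search")
    parser.add_argument("--encoder", default="svt-av1", help="ab-av1 encoder")
    parser.add_argument("--hwaccel", default="none", help="Hardware acceleration")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--vmaf", "93", "--thorough"])
    print(args)
```

Keeping the defaults in one parser makes it easier for both wrapper scripts to stay consistent with the Python entry point.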
## Contributing ### Linux / WSL
Contributions welcome! Please read AGENTS.md for architecture before contributing. Use the bash wrapper script:
```bash
# Interactive mode
./run_optimisation.sh
# Direct execution with parameters
./run_optimisation.sh --directory /mnt/Media/Movies --vmaf 95 --workers 1
```
**Same flags as PowerShell version:**
- `--directory <path>` - Root directory to scan
- `--vmaf <score>` - Target VMAF score
- `--preset <value>` - SVT-AV1 Preset
- `--workers <count>` - Concurrent files to process
- `--samples <count>` - Samples for CRF search
- `--thorough` - Use thorough mode
- `--encoder <name>` - ab-av1 encoder
- `--hwaccel <value>` - Hardware acceleration
## How It Works
### Phase 1: Video Analysis
1. Scans directory for video files (.mkv, .mp4, .mov, .avi, .ts; see `EXTENSIONS`)
2. Uses `ffprobe` to get:
   - Codec (h264, hevc, etc.)
   - Resolution (width × height)
   - Bitrate (calculated from size/duration)
   - HDR status (color transfer detection)
3. Skips if already AV1 encoded
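
The analysis phase can be sketched as follows. This is an illustration under assumptions, not the script's real code: `probe` and `summarise` are hypothetical names, and treating `smpte2084` and `arib-std-b67` as the HDR transfer values is an assumption about how "color transfer detection" is done.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe and return its JSON report as a dict."""
    cmd = ["ffprobe", "-v", "error", "-print_format", "json",
           "-show_format", "-show_streams", str(path)]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def summarise(report):
    """Extract the Phase 1 fields from an ffprobe report; None if no video stream."""
    video = next((s for s in report.get("streams", [])
                  if s.get("codec_type") == "video"), None)
    if video is None:
        return None
    fmt = report.get("format", {})
    return {
        "codec": video.get("codec_name"),
        "width": video.get("width"),
        "height": video.get("height"),
        # ffprobe reports duration and size as strings
        "duration": float(fmt.get("duration", 0.0)),
        "size": int(fmt.get("size", 0)),
        # Assumed HDR hint: PQ or HLG transfer characteristics
        "hdr": video.get("color_transfer") in {"smpte2084", "arib-std-b67"},
        # Already AV1: nothing to do
        "skip": video.get("codec_name") == "av1",
    }
```

Splitting the subprocess call from the parsing keeps the parsing testable without FFmpeg installed.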
### Phase 2: VMAF Target Search (Intelligent Fallback)
The script tries VMAF targets in **descending order** (highest quality first):
```
Try VMAF 94 (Premium): can achieve?
├─ Yes → Calculate savings: ≥ 12%?
│        ├─ Yes → Encode at VMAF 94
│        └─ No  → Try VMAF 93
└─ No  → Try VMAF 93
Try VMAF 93: calculate savings: ≥ 12%?
├─ Yes → Encode at VMAF 93
└─ No  → Find 15% (test 92, 90)
```
**Fallback Logic:**
- If VMAF 94 gives ≥12% savings → **Encode at VMAF 94**
- If VMAF 94 <12% but VMAF 93 ≥12% → **Encode at VMAF 93**
- If both <12% → Find what VMAF gives 15%+ savings:
  - Tests VMAF 93, 92, 90
  - Reports "FOUND 15%+ SAVINGS" with exact parameters
  - Logs for manual review (no encoding)
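
The fallback rules can be written as one small pure function. This is a sketch of the rules as listed here, not the script's actual implementation: `choose_action` and its return values are hypothetical, and the thresholds reuse the `MIN_SAVINGS_PERCENT` and `TARGET_SAVINGS_FOR_ESTIMATE` values from the configuration.

```python
MIN_SAVINGS_PERCENT = 12.0
TARGET_SAVINGS_FOR_ESTIMATE = 15.0


def choose_action(savings_by_vmaf):
    """Decide what to do given predicted savings (%) per VMAF target.

    `savings_by_vmaf` maps a VMAF target to predicted savings, or to None
    when that target could not be reached.
    """
    # Encode at the highest of 94/93 that clears the minimum savings.
    for vmaf in (94.0, 93.0):
        savings = savings_by_vmaf.get(vmaf)
        if savings is not None and savings >= MIN_SAVINGS_PERCENT:
            return ("encode", vmaf)
    # Otherwise report the first target that would give 15%+ savings.
    for vmaf in (93.0, 92.0, 90.0):
        savings = savings_by_vmaf.get(vmaf)
        if savings is not None and savings >= TARGET_SAVINGS_FOR_ESTIMATE:
            return ("review", vmaf)
    return ("review", None)
```

A `("review", vmaf)` result corresponds to the "logged for manual review" branch: nothing is encoded.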
### Phase 3: CRF Search
Uses `ab-av1 crf-search` with `--thorough` flag:
- Takes multiple samples (20-30s segments) from video
- Runs an interpolated binary search for the optimal CRF
- Outputs: Best CRF, Mean VMAF, Predicted size
- Uses `--samples` to control accuracy (default: 4 samples)
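
To make "interpolated binary search" concrete, here is a toy version of the idea. It is not ab-av1's code: `interpolated_crf_search` is a hypothetical helper, it assumes integer CRF values and a VMAF that only falls as CRF rises, and `measure_vmaf` stands in for an expensive sample encode.

```python
def interpolated_crf_search(measure_vmaf, target, lo=10, hi=55):
    """Return the highest integer CRF in [lo, hi] whose VMAF is >= target.

    `measure_vmaf(crf)` is assumed to decrease as CRF rises. Returns None
    when even the lowest CRF misses the target.
    """
    lo_v, hi_v = measure_vmaf(lo), measure_vmaf(hi)
    if lo_v < target:
        return None
    if hi_v >= target:
        return hi
    # Invariant from here on: vmaf(lo) >= target > vmaf(hi)
    while hi - lo > 1:
        # Guess where a straight line between the endpoints hits the target.
        frac = (lo_v - target) / (lo_v - hi_v)
        guess = lo + round(frac * (hi - lo))
        guess = min(max(guess, lo + 1), hi - 1)  # stay strictly inside
        v = measure_vmaf(guess)
        if v >= target:
            lo, lo_v = guess, v
        else:
            hi, hi_v = guess, v
    return lo
```

Compared with plain bisection, the interpolated guess usually lands closer to the answer, which matters when every probe is a sample encode.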
### Phase 4: Full Encoding (with Real-time Output)
If savings threshold met:
1. Runs `ab-av1 encode` with found CRF
2. **Streams all output in real-time** (you see progress live)
3. Shows ETA, encoding speed, frame count
4. Uses `--acodec copy` to preserve audio/subtitles
**Real-time output example:**
```
→ Running encoding (CRF 34)
Encoded 4320/125400 frames (3.4%)
Encoded 8640/125400 frames (6.9%)
Encoded 12960/125400 frames (10.3%)
...
Encoded 125400/125400 frames (100.0%)
Speed: 15.2 fps, ETA: 2s
```
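
Real-time streaming of a child process can be sketched with `subprocess.Popen`. This is a minimal illustration, assuming line-oriented output; `stream_command` is a hypothetical name, and tools that redraw progress with carriage returns would need extra handling.

```python
import subprocess
import sys


def stream_command(cmd):
    """Run cmd, echoing each output line as it arrives; return (exit code, lines)."""
    lines = []
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                          text=True, bufsize=1) as proc:
        for line in proc.stdout:
            line = line.rstrip("\n")
            print(line, flush=True)
            lines.append(line)
    # Leaving the with-block waits for the process, so returncode is set.
    return proc.returncode, lines


if __name__ == "__main__":
    stream_command([sys.executable, "-c", "print('encoding...')"])
```

Merging stderr into stdout keeps progress and error messages in the order they were written.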
### Phase 5: Verification & Replacement
1. Probes encoded file for actual stats
2. Calculates actual savings
3. Only replaces original if new file is smaller
4. Converts .mp4 to .mkv if needed
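
The replacement rule can be sketched as below. The helper `replace_if_smaller` is hypothetical, and it assumes the encoder already wrote a Matroska file, so "converting" an `.mp4` original only means the final file takes the `.mkv` name.

```python
import os
from pathlib import Path


def replace_if_smaller(original, encoded):
    """Swap in `encoded` when it is smaller; return savings in percent, else None."""
    original, encoded = Path(original), Path(encoded)
    old_size = original.stat().st_size
    new_size = encoded.stat().st_size
    if new_size >= old_size:
        encoded.unlink()  # no gain: keep the original untouched
        return None
    target = original.with_suffix(".mkv")  # final file always ends in .mkv
    os.replace(encoded, target)  # atomic when both paths share a filesystem
    if target != original:
        original.unlink()  # e.g. remove the old .mp4
    return 100.0 * (old_size - new_size) / old_size
```

Doing the size check before any rename means a failed or oversized encode never touches the source file.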
## Configuration
Key settings (edit in `optimize_library.py`):
```python
TARGETS = [94.0, 93.0, 92.0, 90.0] # VMAF targets to try
MIN_SAVINGS_PERCENT = 12.0 # Encode if savings ≥12%
TARGET_SAVINGS_FOR_ESTIMATE = 15.0 # Estimate for this level
PRESET = 6 # SVT-AV1 preset (4=best, 8=fast)
EXTENSIONS = {".mkv", ".mp4", ".mov", ".avi", ".ts"}
```
### What is CRF?
**Constant Rate Factor (CRF):** Quality/bitrate trade-off
- **Lower CRF** = Higher quality, larger files (e.g., CRF 20)
- **Higher CRF** = Lower quality, smaller files (e.g., CRF 40)
- **AV1 CRF range:** 0-63 (the CRF search for VMAF 94 typically lands around 34-36)
### What is VMAF?
**Video Multimethod Assessment Fusion:** Netflix's quality metric
- **VMAF 95:** "Visually lossless" - indistinguishable from source
- **VMAF 94:** Premium quality - minor artifacts
- **VMAF 93:** Good quality - acceptable for most content
- **VMAF 90:** Standard quality - may have noticeable artifacts
- **VMAF 85:** Acceptable quality for mobile/low bandwidth
## Hardware Acceleration
**Automatic hwaccel detection:**
When `--hwaccel auto` is specified, the script selects appropriate hardware acceleration:
| Platform | Auto Selection | Notes |
|-----------|----------------|--------|
| Windows | d3d11va | Direct3D 11 Video Acceleration |
| macOS | videotoolbox | VideoToolbox framework |
| Linux/WSL | vaapi | Video Acceleration API (VA-API) |
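
The table translates into a small lookup. This sketch is illustrative only: `resolve_hwaccel` and `AUTO_HWACCEL` are hypothetical names and may differ from the script's own helper.

```python
import platform

# Mapping taken from the table above; keys are platform.system() values.
AUTO_HWACCEL = {"Windows": "d3d11va", "Darwin": "videotoolbox", "Linux": "vaapi"}


def resolve_hwaccel(value, system=None):
    """Return the hwaccel name to use, or None when acceleration is disabled."""
    value = (value or "none").strip().lower()
    if value == "none":
        return None
    if value == "auto":
        # WSL reports "Linux", so it picks vaapi like native Linux.
        return AUTO_HWACCEL.get(system or platform.system())
    return value  # explicit choice passes through unchanged
```

On a platform that is not in the table, `auto` resolves to `None` here, which is one possible fallback rather than documented behaviour.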
**Discrete GPU vs iGPU priority:**
- **Discrete GPU (e.g., AMD RX 7900 XT) takes priority over iGPU**
- FFmpeg/ab-av1 will prefer the more capable encoder
- For AV1 encoding, discrete GPU is selected if present
**To disable hardware acceleration:**
```powershell
.\run_optimisation.ps1 --hwaccel none
```
```bash
./run_optimisation.sh --hwaccel none
```
## Running on Multiple Machines
### Lock File Mechanism
Each video file has a corresponding lock file:
```
/opt/Optmiser/.lock/{video_filename}
```
**Process:**
1. Machine A checks for lock → None found, creates lock
2. Machine A starts encoding
3. Machine B checks for lock → Found, skips file
4. Machine A finishes, removes lock
5. Machine B can pick the file up on a later scan, unless it was replaced by an AV1 encode (Phase 1 then skips it)
**Result:** Different machines automatically process different files!
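
For the check-and-create step to be safe, it needs to be a single atomic operation; a separate "exists?" test followed by a write can race. A minimal sketch, assuming the lock directory sits on storage where exclusive creation is honoured (some network filesystems are weaker here); `try_acquire_lock` and `release_lock` are hypothetical names:

```python
import os
from pathlib import Path


def try_acquire_lock(lock_dir, video_name):
    """Create the lock file atomically; return its path, or None if already held."""
    lock_dir = Path(lock_dir)
    lock_dir.mkdir(parents=True, exist_ok=True)
    lock_path = lock_dir / video_name  # video_name is a bare filename
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    os.close(fd)
    return lock_path


def release_lock(lock_path):
    """Remove the lock; tolerate it having been cleaned up already."""
    Path(lock_path).unlink(missing_ok=True)
```

Calling `release_lock` in a `finally` block keeps a crashed encode from leaving a stale lock behind.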
### Multi-Machine Setup
**Machine 1 (Linux Server - Intel i9-12900H):**
```bash
cd /opt/Optmiser
git pull origin main
./run_optimisation.sh /mnt/Media/movies --vmaf 95
```
**Machine 2 (Windows PC - AMD RX 7900 XT):**
```powershell
cd C:\Optmiser
git pull origin main
.\run_optimisation.ps1 D:\Media\movies --vmaf 95 --hwaccel auto
```
**Machine 3 (Another Linux PC):**
```bash
cd /opt/Optmiser
git pull origin main
./run_optimisation.sh /home/user/Media/tv --vmaf 95
```
All three can run simultaneously - lock files prevent duplicates!
## Logging System
All logs stored in `/opt/Optmiser/logs/` directory:
| File | Purpose |
|------|---------|
| `tv_movies.jsonl` | Successful TV & Movie encodes |
| `content.jsonl` | Successful Content folder encodes |
| `low_savings_skips.jsonl` | Files with <12% savings + 15% estimates |
| `failed_searches.jsonl` | Files that couldn't hit any VMAF target |
| `failed_encodes.jsonl` | Encoding errors |
### Log Entry Format
**Successful encode:**
```json
{
"file": "/path/to/file.mkv",
"status": "success",
"vmaf": 94.0,
"crf": 34.0,
"before": {
"codec": "h264",
"bitrate": 8500,
"size": 2684354560,
"duration": 1379.44
},
"after": {
"codec": "av1",
"bitrate": 6400,
"size": 2013265920,
"duration": 1379.44
},
"duration": 145.2,
"savings": 25.0,
"timestamp": "2025-12-31T12:00:00.000Z"
}
```
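The `savings` field in the entry above is consistent with the before and after sizes. A small sketch of how such an entry could be computed and appended, assuming one JSON object per line; `savings_percent` and `append_jsonl` are hypothetical helper names, not the script's own:

```python
import json
from datetime import datetime, timezone


def savings_percent(size_before, size_after):
    """Percentage of the original size saved by the encode."""
    return round((1 - size_after / size_before) * 100, 1)


def append_jsonl(log_path, entry):
    """Append one JSON object per line, as the .jsonl suffix implies."""
    entry = dict(entry, timestamp=datetime.now(timezone.utc).isoformat())
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
```

With the sizes from the sample entry, `savings_percent(2684354560, 2013265920)` gives `25.0`.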
**Low savings with 15% estimate:**
```json
{
"file": "/path/to/file.mkv",
"vmaf_94": 94.0,
"savings_94": 7.0,
"vmaf_93": 93.0,
"savings_93": 18.0,
"target_for_15_percent": {
"target_vmaf": 93,
"crf": 37,
"savings": 18.0,
"quality_drop": 1,
"found": true
},
"recommendations": "logged_for_review",
"timestamp": "2025-12-31T12:00:00.000Z"
}
```
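The `target_for_15_percent` block can be read as "the highest VMAF target whose estimated savings reach 15%". A sketch of that selection, assuming each searched target yields a dict with `vmaf`, `crf` and `savings` keys; that input shape and the function name are assumptions made for illustration:

```python
def find_target_for_savings(results, wanted_savings=15.0):
    """Pick the highest VMAF target whose estimated savings reach the goal.

    `results` is a list of dicts with "vmaf", "crf" and "savings" keys,
    one per VMAF target that was searched (a hypothetical shape).
    """
    ordered = sorted(results, key=lambda r: r["vmaf"], reverse=True)
    best_vmaf = ordered[0]["vmaf"] if ordered else None
    for result in ordered:
        if result["savings"] >= wanted_savings:
            return {
                "target_vmaf": result["vmaf"],
                "crf": result["crf"],
                "savings": result["savings"],
                "quality_drop": best_vmaf - result["vmaf"],
                "found": True,
            }
    return {"found": False}
```

Sorting in descending VMAF order means quality is only given up when the higher target cannot deliver the wanted savings.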
### Viewing Logs
```bash
# Watch logs in real-time
tail -f /opt/Optmiser/logs/tv_movies.jsonl | jq '.'
# Check files logged for review (both 94 and 93 <12%)
jq 'select(.recommendations=="logged_for_review")' /opt/Optmiser/logs/low_savings_skips.jsonl
# Statistics
jq -r '.status' /opt/Optmiser/logs/tv_movies.jsonl | sort | uniq -c
# Find what CRF/VMAF combinations are being used most
jq -r '[.vmaf, .crf] | @tsv' /opt/Optmiser/logs/tv_movies.jsonl | sort | uniq -c
```
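Where `jq` is not available, the same statistics can be gathered with a few lines of Python. This is a sketch that assumes the field names shown in the sample entries (`status`, `vmaf`, `crf`, `savings`); the function names are hypothetical:

```python
import json
from collections import Counter


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSONL log."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def summarise(path):
    """Count (vmaf, crf) pairs and average the savings of successful encodes."""
    combos = Counter()
    savings = []
    for entry in read_jsonl(path):
        if entry.get("status") != "success":
            continue
        combos[(entry["vmaf"], entry["crf"])] += 1
        savings.append(entry["savings"])
    mean = sum(savings) / len(savings) if savings else 0.0
    return combos, mean
```

`combos.most_common(5)` then lists the most frequently used VMAF/CRF pairs.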
## Troubleshooting
### Issue: "0k bitrate" display
**Cause:** VBR (Variable Bitrate) files show 0 in ffprobe's format bitrate field.
**Solution:** Calculate from `(size × 8) / duration`
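The formula above is easy to apply to ffprobe's own output. A sketch, assuming `ffprobe` is on the PATH; the function names are hypothetical, and the result is expressed in kbit/s on the assumption that this matches the unit used in the logs:

```python
import json
import subprocess


def bitrate_kbps(size_bytes, duration_seconds):
    """Average bitrate in kbit/s: (size x 8) / duration, scaled to kilobits."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_seconds / 1000


def probe_size_and_duration(path):
    """Read container size (bytes) and duration (seconds) via ffprobe."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=size,duration",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    fmt = json.loads(out)["format"]
    return int(fmt["size"]), float(fmt["duration"])
```

A typical call would be `bitrate_kbps(*probe_size_and_duration("/path/to/file.mkv"))`.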
### Issue: Multiple machines encoding same file
**Cause:** No coordination between machines.
**Solution:** Lock files in `/opt/Optmiser/.lock/{video_filename}`
### Issue: Encode fails with "unexpected argument"
**Cause:** Using wrong flags for ab-av1 commands.
**Solution:** Script now validates ab-av1 support at runtime and warns gracefully.
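One way to perform this kind of runtime check is to read the subcommand's `--help` text and test for each flag before using it. The sketch below illustrates that idea only; it is not necessarily how the script does it, and both function names are hypothetical:

```python
import subprocess


def supported_flags(help_text, wanted):
    """Return the subset of `wanted` flags that appear in the given help text."""
    tokens = set(help_text.replace(",", " ").split())
    return [flag for flag in wanted if flag in tokens]


def ab_av1_help(subcommand, binary="ab-av1"):
    """Capture `ab-av1 <subcommand> --help` so flags can be checked before use."""
    result = subprocess.run(
        [binary, subcommand, "--help"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout + result.stderr
```

For example, `supported_flags(ab_av1_help("crf-search"), ["--preset"])` returns an empty list when the installed version does not accept the flag, and the caller can then warn and omit it.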
### Issue: Out of Memory
**Solution:** Reduce workers or increase swap:
```bash
# Increase swap (if needed)
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
# Use more conservative settings
./run_optimisation.sh --workers 1 --vmaf 93
```
## Best Practices
### For Servers (Intel i9-12900H)
1. **Use 50% CPU mode** if running other services (Plex, Jellyfin):
```bash
./run_optimisation.sh --workers 1 --cpu-limit 50
```
2. **Run during off-peak hours** to minimize impact on users
3. **Monitor CPU temperature**:
```bash
watch -n 2 'sensors | grep "Package id"'
```
4. **Use higher preset for faster encodes** (preset 7-8):
```bash
./run_optimisation.sh --preset 8 --vmaf 93
```
### For Windows PC (AMD RX 7900 XT)
1. **Enable hardware acceleration** for massive speedup:
```powershell
.\run_optimisation.ps1 -Hwaccel auto
```
2. **Test small sample first** to verify settings:
```powershell
.\run_optimisation.ps1 -Directory "D:\Media\sample" -Thorough -Vmaf 95
```
3. **Monitor GPU usage**:
```powershell
# Task Manager or radeontop (if available)
```
4. **Consider quality compensation:** GPU encoding may need slightly lower VMAF target (e.g., VMAF 92) to match CPU quality.
### For WSL
1. **Access Windows drives via /mnt/c/:**
```bash
ls /mnt/c/Media/movies
```
2. **Increase memory limits** if encoding 4K content:
```bash
# Edit %UserProfile%\.wslconfig on the Windows host, then run `wsl --shutdown` to apply
[wsl2]
memory=16GB
```
## Customization
### Changing VMAF Targets
Edit `optimize_library.py`:
```python
# More aggressive (smaller files, lower quality)
TARGETS = [92.0, 90.0, 88.0]
# Conservative (larger files, higher quality)
TARGETS = [95.0, 94.0, 93.0]
```
### Changing Savings Threshold
```python
# More aggressive (encode more)
MIN_SAVINGS_PERCENT = 8.0
# Less aggressive (encode fewer)
MIN_SAVINGS_PERCENT = 15.0
```
### Changing Encoder Preset
```python
# Faster encodes (larger files, lower quality)
PRESET = 8
# Better quality (slower encodes, smaller files)
PRESET = 4
```
### Changing Estimate Target
```python
# Target higher savings for estimates
TARGET_SAVINGS_FOR_ESTIMATE = 20.0
```
## File Structure
```
/opt/Optmiser/
├── optimize_library.py # Main encoding engine
├── run_optimisation.sh # Linux/Server wrapper
├── run_optimisation.ps1 # Windows wrapper
├── bin/
│ └── ab-av1 # ab-av1 binary
├── tmp/ # Temporary encoding files
├── logs/ # Log files (JSONL format)
│ ├── tv_movies.jsonl
│ ├── content.jsonl
│ ├── low_savings_skips.jsonl
│ ├── failed_searches.jsonl
│ └── failed_encodes.jsonl
├── .lock/ # Multi-machine coordination (created at runtime)
├── README.md # Project overview
├── SETUP.md # Setup instructions
└── AGENTS.md # Technical documentation
```
## Platform Support Matrix
| Platform | Status | Notes |
|-----------|--------|-------|
| Linux (Intel CPU) | ✅ Supported | Software encoding, multi-worker capable |
| Windows (AMD GPU) | ✅ Supported | Hardware acceleration via d3d11va (auto-detects) |
| Windows (Intel CPU) | ✅ Supported | Software encoding |
| macOS (Apple Silicon) | ✅ Supported | Hardware via videotoolbox (auto-detects) |
| WSL (Ubuntu/Debian) | ✅ Supported | Linux compatibility layer |
| WSL (Windows drives) | ✅ Supported | Access via /mnt/c/ |
## Git Workflow
### Initial Setup
```bash
cd /opt/Optmiser
git init
git remote add origin https://gitea.theflagroup.com/bnair/VMAFOptimiser.git
git branch -M main
git add .
git commit -m "Initial commit: VMAF optimisation pipeline"
git push -u origin main
```
### Daily Updates
```bash
cd /opt/Optmiser
git pull origin main
# Run optimisation
./run_optimisation.sh /media
# Review changes
git diff
```
### Committing Changes
```bash
cd /opt/Optmiser
git status
# Add changed files
git add optimize_library.py run_optimisation.sh run_optimisation.ps1
# Commit with message
git commit -m "feat: add Windows and Linux wrapper scripts"
# Push
git push
```
### View History
```bash
cd /opt/Optmiser
git log --oneline
git log --graph --all
```
## FAQ
**Q: Can I run this on multiple machines at once?**
A: Yes! Each machine processes different files thanks to the lock file mechanism.
**Q: Should I use Windows or WSL?**
A: WSL is recommended for Linux compatibility. Use native Windows if you need direct GPU access or the best performance.
**Q: Will hardware encoding work better than CPU?**
A: For AMD RX 7900 XT, hardware AV1 encoding is ~3-10x faster than CPU. However, GPU encoding may need slightly lower VMAF targets to match quality.
**Q: What VMAF target should I use?**
A: Start with VMAF 94 or 95. Drop to 92-90 if you need more savings.
**Q: How do I know which files are being processed?**
A: Check `.lock/` directory: `ls -la /opt/Optmiser/.lock/`
**Q: Can I pause/resume?**
A: Pause by stopping the script (Ctrl+C). Resume by running again - it skips processed files.
**Q: What happens if encoding fails?**
A: Error is logged to `failed_encodes.jsonl`. Original file is NOT modified.
**Q: How much CPU does encoding use?**
A: All available CPU threads by default. `--workers 1` processes one file at a time, and `--cpu-limit 50` caps usage at 50% (12 of 24 threads).
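The arithmetic behind the 50% example is a simple proportion. A sketch, assuming the limit is translated into a thread count; how `--cpu-limit` is actually applied is not shown in this document, and `threads_for_limit` is a hypothetical name:

```python
import os


def threads_for_limit(percent, total_threads=None):
    """Translate a CPU percentage into a thread count, never below one."""
    total = total_threads or os.cpu_count() or 1
    return max(1, total * percent // 100)
```

`threads_for_limit(50, 24)` gives `12`, matching the figure quoted in the answer.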
---
**Last Updated:** December 31, 2025
**Version:** 2.0 with Windows and Linux Wrapper Scripts

535
SETUP.md

@@ -1,260 +1,169 @@
# VMAF Optimisation Pipeline - Setup Guide # VMAF Optimisation Pipeline - Setup Guide
## Quick Start (Server) ## Quick Start
### Prerequisites ### Prerequisites
- FFmpeg with VMAF support (`ffmpeg -filters 2>&1 | grep libvmaf`)
- Python 3.8+ - Python 3.8+
- FFmpeg with VMAF support (`ffmpeg -filters 2>&1 | grep libvmaf`)
- ab-av1 binary (v0.10.3+) - ab-av1 binary (v0.10.3+)
### Installation ### Installation
**On Linux (Server/WSL):**
```bash ```bash
# Clone repository # Clone repository
git clone https://gitea.theflagroup.com/bnair/VMAFOptimiser.git /opt/Optmiser git clone https://gitea.theflagroup.com/bnair/VMAFOptimiser.git /opt/Optmiser
# Download ab-av1 (if not present) # Download ab-av1
cd /opt/Optmiser/bin cd /opt/Optmiser/bin
wget https://github.com/alexheretic/ab-av1/releases/download/v0.10.3/ab-av1-x86_64-unknown-linux-musl wget https://github.com/alexheretic/ab-av1/releases/download/v0.10.3/ab-av1-x86_64-unknown-linux-musl
chmod +x ab-av1 chmod +x ab-av1
# Set up temporary directories # Set up directories
mkdir -p /opt/Optmiser/tmp /opt/Optmiser/logs /opt/Optmiser/.lock mkdir -p /opt/Optmiser/tmp /opt/Optmiser/logs /opt/Optmiser/.lock
``` ```
### Basic Usage **On Windows:**
```bash
# Process TV shows
python3 /opt/Optmiser/optimise_media_v2.py /mnt/Media/tv tv_movies
# Process movies
python3 /opt/Optmiser/optimise_media_v2.py /mnt/Media/movies tv_movies
# Process with 50% CPU limit (leaves CPU for other tasks)
python3 /opt/Optmiser/optimise_media_v2.py /mnt/Media/tv tv_movies --cpu-limit 50
```
---
## Quick Start (Windows/WSL)
### WSL Installation (Recommended)
```bash
# Update WSL
wsl --update
# Install dependencies in WSL Ubuntu/Debian
sudo apt update
sudo apt install -y ffmpeg python3 git
# Clone repository into WSL
git clone https://gitea.theflagroup.com/bnair/VMAFOptimiser.git /opt/Optmiser
# Install ab-av1
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
cargo install ab-av1
# Access Windows media from WSL
# Windows C:\ is at /mnt/c/
python3 /opt/Optmiser/optimise_media_v2.py /mnt/c/Media/tv tv_movies
```
### Native Windows Installation
```powershell ```powershell
# Install Python (if not present)
# Download from python.org
# Install FFmpeg
winget install ffmpeg
# Install Rust
# Download from rust-lang.org
# Install ab-av1
cargo install ab-av1
# Clone repository # Clone repository
git clone https://gitea.theflagroup.com/bnair/VMAFOptimiser.git C:\Optmiser git clone https://gitea.theflagroup.com/bnair/VMAFOptimiser.git C:\Optmiser
# Run # Install dependencies
python C:\Optmiser\optimise_media_v2.py D:\Media\tv tv_movies winget install ffmpeg
# Install Rust and ab-av1
# Download from https://rust-lang.org/
cargo install ab-av1
# Set up directories
mkdir C:\Optmiser\tmp, C:\Optmiser\logs
``` ```
--- **On macOS:**
```bash
# Clone repository
git clone https://gitea.theflagroup.com/bnair/VMAFOptimiser.git /opt/Optmiser
## Running on Multiple Machines # Install dependencies
brew install ffmpeg
### How Lock Files Work # Install Rust and ab-av1
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
cargo install ab-av1
Each video file has a corresponding lock file: # Set up directories
mkdir -p /opt/Optmiser/tmp /opt/Optmiser/logs /opt/Optmiser/.lock
``` ```
/opt/Optmiser/.lock/{video_filename}.lock
## Running the Optimiser
### Choose Your Script
**Linux / WSL / macOS:**
```bash
# Interactive mode (shows prompts)
./run_optimisation.sh
# Direct execution with parameters
./run_optimisation.sh --directory /mnt/Media/Movies --vmaf 95 --preset 6 --workers 1
# For 50% CPU mode on server
./run_optimisation.sh --directory /mnt/Media/Movies --vmaf 95 --workers 1 --cpu-limit 50
``` ```
**Windows:**
```powershell
# Interactive mode (shows prompts)
.\run_optimisation.ps1
# Direct execution with parameters
.\run_optimisation.ps1 -Directory "D:\Movies" -Vmaf 95 -Preset 6 -Workers 1
# For hardware acceleration (AMD GPU)
.\run_optimisation.ps1 -Directory "D:\Movies" -Vmaf 95 -Hwaccel auto
```
## Script Parameters
All wrapper scripts (`run_optimisation.sh` on Linux, `run_optimisation.ps1` on Windows) accept these parameters:
| Parameter | Description | Default |
|------------|-------------|---------|
| `--directory <path>` | Root directory to scan | Current directory |
| `--vmaf <score>` | Target VMAF score | 95.0 |
| `--preset <value>` | SVT-AV1 Preset (4=best, 6=balanced, 8=fast) | 6 |
| `--workers <count>` | Concurrent files to process | 1 |
| `--samples <count>` | Samples for CRF search | 4 |
| `--thorough` | Use thorough mode (slower, more accurate) | false |
| `--encoder <name>` | ab-av1 encoder | svt-av1 |
| `--hwaccel <value>` | Hardware acceleration | none (auto: auto-detect) |
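On the Python side, the parameters in the table map naturally onto `argparse`. The sketch below mirrors the table's names and defaults, with the directory as an optional positional argument as the wrappers pass it; it illustrates the interface and is not copied from `optimize_library.py`:

```python
import argparse


def build_parser():
    """Argument parser mirroring the wrapper parameters listed in the table."""
    parser = argparse.ArgumentParser(description="VMAF library optimiser (sketch)")
    parser.add_argument("directory", nargs="?", default=".", help="Root directory to scan")
    parser.add_argument("--vmaf", type=float, default=95.0, help="Target VMAF score")
    parser.add_argument("--preset", type=int, default=6, help="SVT-AV1 preset")
    parser.add_argument("--workers", type=int, default=1, help="Concurrent files")
    parser.add_argument("--samples", type=int, default=4, help="Samples for CRF search")
    parser.add_argument("--thorough", action="store_true", help="Slower, more accurate search")
    parser.add_argument("--encoder", default="svt-av1", help="ab-av1 encoder")
    parser.add_argument("--hwaccel", default=None, help="Hardware acceleration backend")
    return parser
```

Calling `build_parser().parse_args(["/media", "--vmaf", "93"])` leaves every other option at the default shown in the table.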
## Multi-Machine Setup
### How Lock Files Prevent Conflicts
Each video file has a lock file: `/opt/Optmiser/.lock/{video_filename}`
**Process:** **Process:**
1. Machine A checks for lock → None found, creates lock 1. Machine A: Checks for lock → Not found, creates lock
2. Machine A starts encoding 2. Machine A: Starts encoding
3. Machine B checks for lock → Found, skips file 3. Machine B: Checks for lock → Found, skips file
4. Machine A finishes, removes lock 4. Machine A: Finishes, removes lock
5. Machine B can now process that file 5. Machine B: Can now process that file
**Result:** Different machines automatically process different files! **Result:** Different machines automatically process different files simultaneously!
### Setup for Multiple Machines ### Setup for Multiple Machines
**Machine 1 - Remote Server (Intel i9-12900H):** **Machine 1 - Linux Server (Intel i9-12900H):**
```bash ```bash
cd /opt/Optmiser cd /opt/Optmiser
git pull origin main git pull origin main
./run_optimisation.sh /mnt/Media/movies --vmaf 95
# Run on /mnt/Media (Linux filesystem)
python3 optimise_media_v2.py /mnt/Media/tv tv_movies
python3 optimise_media_v2.py /mnt/Media/movies tv_movies
# With 50% CPU limit (recommended)
python3 optimise_media_v2.py /mnt/Media/tv tv_movies --cpu-limit 50
``` ```
**Machine 2 - Local PC (AMD RX 7900 XT, Windows):** **Machine 2 - Windows PC (AMD RX 7900 XT):**
Option A - Native Windows:
```powershell ```powershell
# Map network drive if needed
# \\Server\media\ or use local storage
cd C:\Optmiser cd C:\Optmiser
git pull origin main git pull origin main
.\run_optimisation.ps1 -Directory D:\Media\movies -Vmaf 95 -Hwaccel auto
# Run on local media
python optimise_media_v2.py D:\Media\tv tv_movies
``` ```
Option B - WSL (Recommended): **Machine 3 - Another Linux PC:**
```bash
# Windows C: drive accessible at /mnt/c/
cd /opt/Optmiser
python optimise_media_v2.py /mnt/c/Media/tv tv_movies
```
**Machine 3 - Another PC (AMD 9800X3D, Linux):**
```bash ```bash
cd /opt/Optmiser cd /opt/Optmiser
git pull origin main git pull origin main
./run_optimisation.sh /home/user/Media/tv --vmaf 95
# Run on local media directory
python optimise_media_v2.py /home/user/Media/tv tv_movies
``` ```
All three can run simultaneously - lock files prevent duplicates! All three can run simultaneously!
--- ## Hardware Acceleration
## Hardware Encoding Guide ### Automatic Detection
### Detecting Hardware When `--hwaccel auto` is specified, the wrapper scripts automatically select the best available hardware acceleration:
The script automatically checks for: | Platform | Auto Selection | Notes |
1. **AMD GPU encoding support** (future feature) |-----------|----------------|--------|
2. **System thread count** | Windows | d3d11va | Direct3D Video Acceleration |
3. **Operating system** (Linux vs Windows) | macOS | videotoolbox | VideoToolbox framework |
| Linux/WSL | vaapi | Video Acceleration via VA-API |
### CPU Encoding (Software) ### GPU vs iGPU Priority
**Best for:** Servers, multi-tasking - **Discrete GPU takes priority:** If a discrete GPU (like AMD RX 7900 XT) is present, it's selected over integrated GPU
- **For AMD RX 7900 XT:** Hardware encoding provides ~3-10x speedup over CPU
```bash - **Note:** GPU encoding may need slightly lower VMAF targets to match CPU quality
# Force CPU encoding
python3 optimise_media_v2.py /media --use-cpu
# Limit to 50% of CPU (12 threads on 24-core)
python3 optimise_media_v2.py /media --cpu-limit 50
# Set specific worker count
python3 optimise_media_v2.py /media --workers 8
```
**When to use:**
- Running Plex/Jellyfin on same machine
- Server environment with multiple users
- Want to leave GPU free for other tasks
**Expected Speed:** 8-15 fps @ 1080p
### GPU Encoding (AMD RX 7900 XT)
**Status:** Planned for future versions
**Expected Speedup:** 3-10x vs CPU encoding
```bash
# Enable hardware encoding (when implemented)
python3 optimise_media_v2.py /media --use-hardware
```
**When to use:**
- Local PC with AMD GPU
- Want faster encoding speeds
- Can compensate quality with slightly lower VMAF target
**Expected Speed:** 150+ fps @ 1080p
### Quality Compensation for GPU Encoding
GPU encoding may need quality adjustments:
| CPU VMAF | Equivalent GPU VMAF | Why |
|------------|----------------------|-----|
| 94 | 92-93 | GPU has quality limitations |
| 93 | 91-92 | Hardware encoding trade-offs |
| 90 | 88-89 | Significant compression |
**Recommendation:** If using GPU encoding, set `TARGETS = [92.0, 90.0, 88.0]` for similar quality.
---
## Windows/WSL Path Mapping
### Understanding /mnt/c/
In WSL, Windows drives are mapped:
| Windows | WSL Path | Example |
|---------|------------|---------|
| C:\ | /mnt/c/ | /mnt/c/Users/bnair/Downloads |
| D:\ | /mnt/d/ | /mnt/d/Movies/ |
| E:\ | /mnt/e/ | /mnt/e/TV\ |
**To access Windows media from WSL:**
```bash
# List C:\Media\tv
ls /mnt/c/Media/tv
# Process from WSL
python3 /opt/Optmiser/optimise_media_v2.py /mnt/c/Media/tv tv_movies
```
### Network Drives on WSL
```bash
# Map network drive (one-time)
sudo mkdir -p /mnt/server
echo "//192.168.1.100/media | /mnt/server cifs credentials=/path/to/credfile,uid=1000,gid=1000 0 0" | sudo tee -a /etc/fstab
# Access network media
python3 /opt/Optmiser/optimise_media_v2.py /mnt/server/Media/tv tv_movies
```
---
## Customization ## Customization
### Changing VMAF Targets ### Changing VMAF Targets
Edit `optimise_media_v2.py`: Edit `optimize_library.py`:
```python ```python
# Line 15
TARGETS = [94.0, 93.0, 92.0, 90.0]
# More aggressive (smaller files, lower quality) # More aggressive (smaller files, lower quality)
TARGETS = [92.0, 90.0, 88.0] TARGETS = [92.0, 90.0, 88.0]
@@ -265,18 +174,16 @@ TARGETS = [95.0, 94.0, 93.0]
### Changing Savings Threshold ### Changing Savings Threshold
```python ```python
# Line 17 # More aggressive (encode more)
MIN_SAVINGS_PERCENT = 12.0 # Current threshold MIN_SAVINGS_PERCENT = 8.0
MIN_SAVINGS_PERCENT = 8.0 # More aggressive (encode more)
MIN_SAVINGS_PERCENT = 15.0 # Less aggressive (encode fewer) # Less aggressive (encode fewer)
MIN_SAVINGS_PERCENT = 15.0
``` ```
### Changing Encoder Preset ### Changing Encoder Preset
```python ```python
# Line 19
PRESET = 6
# Faster encodes (larger files, lower quality) # Faster encodes (larger files, lower quality)
PRESET = 8 PRESET = 8
@@ -284,116 +191,112 @@ PRESET = 8
PRESET = 4 PRESET = 4
``` ```
### Changing Estimate Target ## Platform-Specific Tips
```python ### For Linux Servers (Intel i9-12900H)
# Line 18
TARGET_SAVINGS_FOR_ESTIMATE = 15.0
# Target higher savings 1. **Use 50% CPU mode** if running other services:
TARGET_SAVINGS_FOR_ESTIMATE = 20.0 ```bash
``` ./run_optimisation.sh --directory /media --vmaf 95 --workers 1 --cpu-limit 50
```
--- 2. **Run during off-peak hours** to minimize user impact
## Monitoring 3. **Monitor CPU temperature:**
```bash
watch -n 2 'sensors | grep "Package id"'
```
### Watch Progress in Real-Time 4. **Use higher preset for faster encodes:**
```bash
./run_optimisation.sh --vmaf 93 --preset 8
```
### For Windows PCs (AMD RX 7900 XT)
1. **Enable hardware acceleration** for massive speedup:
```powershell
.\run_optimisation.ps1 -Directory "D:\Movies" -Hwaccel auto
```
2. **Test small sample first** to verify settings:
```powershell
.\run_optimisation.ps1 -Directory "D:\Media\sample" -Thorough -Vmaf 95
```
3. **Monitor GPU usage** (Task Manager or third-party tools)
4. **Consider quality compensation** - GPU encoding may need slightly lower VMAF targets to match CPU quality
### For WSL (Ubuntu/Debian)
1. **Access Windows drives** via `/mnt/c/`:
```bash
ls /mnt/c/Media/movies
```
2. **Increase memory limits** if encoding 4K content:
```bash
# Edit %UserProfile%\.wslconfig on the Windows host, then run `wsl --shutdown` to apply
[wsl2]
memory=16GB
```
## Running in Docker (Optional)
```bash ```bash
# Tail log file with JSON formatting # Build image
tail -f /opt/Optmiser/logs/tv_movies.jsonl | jq '.' docker build -t vmaf-optimiser .
# Monitor encoding speed # Run with mount
watch -n 5 'jq -r "[.savings, .duration]" /opt/Optmiser/logs/tv_movies.jsonl | tail -1' docker run -v /path/to/media:/media vmaf-optimiser /media
# Check lock files (what's being processed)
ls -la /opt/Optmiser/.lock/
``` ```
### Performance Dashboard
```bash
# Create a simple dashboard
watch -n 10 '
echo "=== VMAF Optimiser Status ==="
echo ""
echo "Recent Encodes:"
tail -3 /opt/Optmiser/logs/tv_movies.jsonl | jq -r "[.file, .savings, .duration] | @tsv"
echo ""
echo "CPU Usage:"
top -bn1 | head -5
'
```
---
## Troubleshooting ## Troubleshooting
### Issue: Scripts not found ### Issue: Scripts not found
**Solution:** Ensure you're in the correct directory with the scripts installed.
### Issue: "ab-av1: command not found"
**Solution:** Install ab-av1 via cargo:
```bash ```bash
# Check path cargo install ab-av1
ls /opt/Optmiser/optimise_media_v2.py
# Check Python version
python3 --version # Should be 3.8+
# Check ab-av1
/opt/Optmiser/bin/ab-av1 --version # Should be 0.10.3+
```
### Issue: Permission denied
```bash
# Make scripts executable
chmod +x /opt/Optmiser/*.py
chmod +x /opt/Optmiser/*.sh
# Fix lock directory permissions
chmod 777 /opt/Optmiser/.lock
```
### Issue: ab-av1 not found
```bash
# Check if in PATH
which ab-av1
# Use full path if not in PATH
/opt/Optmiser/bin/ab-av1 --version
# Add to PATH
export PATH="$PATH:/opt/Optmiser/bin"
``` ```
### Issue: FFmpeg VMAF not available ### Issue: FFmpeg VMAF not available
```bash **Solution:** Recompile FFmpeg with VMAF support or download a pre-built version that includes libvmaf.
# Check libvmaf
ffmpeg -filters 2>&1 | grep libvmaf
# If not found, recompile FFmpeg with VMAF
# Or download compiled version from johnvansickle.com/ffmpeg
```
### Issue: Out of Memory ### Issue: Out of Memory
**Solution:** Reduce workers or increase swap:
```bash ```bash
# Check available memory # Increase swap
free -h
# Reduce workers
python3 optimise_media_v2.py /media --workers 4
# Increase swap (if needed)
sudo fallocate -l 4G /swapfile sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile sudo chmod 600 /swapfile
sudo mkswap /swapfile sudo mkswap /swapfile
sudo swapon /swapfile sudo swapon /swapfile
# Use more conservative settings
./run_optimisation.sh --workers 1 --vmaf 93
``` ```
--- ### Issue: Multiple machines encoding same file
**Solution:** This is prevented by lock files. If you see duplicates, check `/opt/Optmiser/.lock/` for stale locks.
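A lock left behind by a crashed or interrupted run can be spotted by its age. A sketch that lists lock files older than a cutoff, assuming modification time is a reasonable proxy for when encoding started; the 24-hour default and the function name are illustrative choices:

```python
import os
import time


def stale_locks(lock_dir, max_age_hours=24.0, now=None):
    """List lock files whose modification time is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    stale = []
    for name in sorted(os.listdir(lock_dir)):
        path = os.path.join(lock_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            stale.append(path)
    return stale
```

The function only reports candidates. Deleting a lock is best left as a manual step after confirming that no machine is still encoding that file.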
### Issue: Encoding fails
**Solution:** Check logs:
```bash
cat /opt/Optmiser/logs/failed_encodes.jsonl | jq '.'
```
### Issue: "unexpected argument" error
**Solution:** Use correct flags for your ab-av1 version. The wrapper scripts now validate support at runtime.
## Git Workflow ## Git Workflow
@@ -414,74 +317,90 @@ cd /opt/Optmiser
git pull origin main git pull origin main
# Run optimisation # Run optimisation
python3 optimise_media_v2.py /media tv_movies ./run_optimisation.sh /media tv_movies
# Review changes # Review changes
git diff git diff
``` ```
### Commit Changes ### Committing Changes
```bash ```bash
cd /opt/Optmiser cd /opt/Optmiser
git status git status
# Add changed files # Add changed files
git add optimise_media_v2.py run_optimisation.sh git add optimize_library.py run_optimisation.sh run_optimisation.ps1
# Commit with message # Commit with message
git commit -m "feat: add X" git commit -m "feat: add Windows and Linux wrapper scripts"
# Push # Push
git push git push
``` ```
### Viewing Logs
```bash
# Watch logs in real-time
tail -f /opt/Optmiser/logs/tv_movies.jsonl | jq '.'
# Check files logged for review
jq 'select(.recommendations=="logged_for_review")' /opt/Optmiser/logs/low_savings_skips.jsonl
# Statistics
jq -r '.status' /opt/Optmiser/logs/tv_movies.jsonl | sort | uniq -c
# Find what CRF/VMAF combinations are being used most
jq -r '[.vmaf, .crf] | @tsv' /opt/Optmiser/logs/tv_movies.jsonl | sort | uniq -c
```
### View History ### View History
```bash ```bash
cd /opt/Optmiser cd /opt/Optmiser
git log --oneline git log --oneline
git log --graph --all git log --graph --all
``` ```
---
## FAQ ## FAQ
**Q: Can I run this on multiple machines at once?** **Q: Can I run this on multiple machines at once?**
A: Yes! Each machine will process different files due to lock file mechanism. A: Yes! Each machine will process different files due to lock file mechanism.
**Q: Will hardware encoding be available?** **Q: Should I use Windows or WSL?**
A: Planned for future versions. Currently uses CPU encoding (software AV1). A: WSL is recommended for Linux compatibility. Use Windows native if you need direct hardware access or performance.
**Q: How do I know if a file is being encoded elsewhere?** **Q: Will hardware encoding work better than CPU?**
A: Check `/opt/Optmiser/.lock/` - if a lock exists, another machine is processing it. A: For AMD RX 7900 XT, hardware AV1 encoding is ~3-10x faster than CPU. However, GPU encoding may need slightly lower VMAF targets to match CPU quality.
**Q: Can I change the VMAF target?** **Q: What VMAF target should I use?**
A: Yes, edit `TARGETS = [94.0, 93.0, 92.0, 90.0]` in `optimise_media_v2.py`. A: Start with VMAF 94 or 95. Drop to 92-90 if you need more savings.
**Q: What if savings are always below 12%?** **Q: How do I know which files are being processed?**
A: The script will log 15% estimates to `low_savings_skips.jsonl`. Review these logs to decide if encoding is worth it. A: Check `.lock/` directory: `ls -la /opt/Optmiser/.lock/`
**Q: Does this work on Windows/WSL?**
A: Yes! See the Windows/WSL section for setup instructions.
**Q: How much CPU does encoding use?**
A: Full CPU (24 threads) by default. Use `--cpu-limit 50` for 50% mode.
**Q: Can I pause/resume?** **Q: Can I pause/resume?**
A: Pause by stopping the script (Ctrl+C). Resume by running again - it skips processed files. A: Pause by stopping the script (Ctrl+C). Resume by running again - it skips processed files.
**Q: What happens if encoding fails?** **Q: What happens if encoding fails?**
A: Error is logged to `failed_encodes.jsonl` with error code. Original file is NOT modified. A: Error is logged to `failed_encodes.jsonl`. Original file is NOT modified.
--- **Q: How much CPU does encoding use?**
A: Full CPU by default. Use `--workers 1` for single-threaded, or limit with `--cpu-limit 50` for 50% (12 threads on 24-core).
## Support **Q: What are the log files?**
A:
- `tv_movies.jsonl` - Successful TV & Movie encodes
- `content.jsonl` - Successful Content folder encodes
- `low_savings_skips.jsonl` - Files with <12% savings + 15% estimates
- `failed_searches.jsonl` - Files that couldn't hit any VMAF target
- `failed_encodes.jsonl` - Encoding errors
For issues or questions: **Q: How do I see real-time progress?**
- Check AGENTS.md for detailed technical documentation A: The script streams all ab-av1 output in real-time, showing ETA, encoding speed, and frame count.
- Review logs in `/opt/Optmiser/logs/`
- Test with `--dry-run` flag first
--- ---
**Last Updated:** December 31, 2025 **Last Updated:** December 31, 2025
**Version:** 2.0 with Windows and Linux Wrapper Scripts

87
run_optimisation.ps1 Normal file

@@ -0,0 +1,87 @@
param(
[Parameter(Mandatory=$false)]
[string]$Directory = ".",
[float]$Vmaf = 95.0,
[int]$Preset = 6,
[int]$Workers = 1,
[int]$Samples = 4,
[switch]$Thorough,
[string]$Encoder = "svt-av1",
[string]$Hwaccel
)
$ErrorActionPreference = "Stop"
function Write-ColorOutput {
param([string]$Message, [string]$Color = "White")
Write-Host $Message -ForegroundColor $Color
}
function Invoke-OptimizeLibrary {
$scriptPath = Join-Path $PSScriptRoot "optimize_library.py"
if (-not (Test-Path $scriptPath)) {
Write-ColorOutput -Message "ERROR: optimize_library.py not found in $PSScriptRoot" -Color "Red"
exit 1
}
# Take the first interpreter found; missing candidates are ignored rather than raising
$pythonCmd = Get-Command python3, python, py -ErrorAction SilentlyContinue | Select-Object -First 1
if (-not $pythonCmd) {
Write-ColorOutput -Message "ERROR: Python 3 not found. Please install Python 3." -Color "Red"
exit 1
}
$arguments = @(
$scriptPath,
$Directory,
"--vmaf", $Vmaf.ToString("F1"),
"--preset", $Preset.ToString(),
"--workers", $Workers.ToString(),
"--samples", $Samples.ToString(),
"--encoder", $Encoder
)
if ($Thorough) {
$arguments += "--thorough"
}
if ($Hwaccel) {
$arguments += "--hwaccel", $Hwaccel
}
Write-ColorOutput -Message "Running optimize_library.py..." -Color "Cyan"
Write-ColorOutput -Message " Directory: $Directory" -Color "White"
Write-ColorOutput -Message " Target VMAF: $Vmaf" -Color "White"
Write-ColorOutput -Message " Preset: $Preset" -Color "White"
Write-ColorOutput -Message " Workers: $Workers" -Color "White"
Write-ColorOutput -Message " Samples: $Samples" -Color "White"
Write-ColorOutput -Message " Encoder: $Encoder" -Color "White"
if ($Thorough) {
Write-ColorOutput -Message " Thorough: Yes" -Color "White"
}
if ($Hwaccel) {
Write-ColorOutput -Message " HW Accel: $Hwaccel" -Color "White"
}
Write-Host ""
# The call operator passes each array element as its own argument, so paths with spaces survive
& $pythonCmd.Path @arguments
$exitCode = $LASTEXITCODE
if ($exitCode -eq 0) {
Write-ColorOutput -Message "SUCCESS: Library optimization completed" -Color "Green"
} else {
Write-ColorOutput -Message "ERROR: optimize_library.py exited with code $exitCode" -Color "Red"
}
exit $exitCode
}
Write-ColorOutput -Message "========================================" -Color "Cyan"
Write-ColorOutput -Message "VMAF Library Optimiser (Windows)" -Color "Yellow"
Write-ColorOutput -Message "========================================" -Color "Cyan"
Write-Host ""
Invoke-OptimizeLibrary

151
run_optimisation.sh Normal file

@@ -0,0 +1,151 @@
#!/bin/bash
# VMAF Library Optimiser (Linux/Server runner)
# This script wraps optimize_library.py with the same interface as the Windows PowerShell version
set -e
COLOR_RED='\033[0;31m'
COLOR_GREEN='\033[0;32m'
COLOR_CYAN='\033[0;36m'
COLOR_YELLOW='\033[1;33m'
COLOR_WHITE='\033[0;37m'
COLOR_RESET='\033[0m'
log_info() {
echo -e "${COLOR_CYAN}$*${COLOR_RESET}"
}
log_error() {
echo -e "${COLOR_RED}ERROR: $*${COLOR_RESET}" >&2
}
log_success() {
echo -e "${COLOR_GREEN}$*${COLOR_RESET}"
}
# Default values matching optimize_library.py defaults
DIRECTORY="."
VMAF="95.0"
PRESET="6"
WORKERS="1"
SAMPLES="4"
THOROUGH=""
ENCODER="svt-av1"
HWACCEL=""
# Parse command line arguments
while [[ $# -gt 0 ]]; do
case "$1" in
--directory)
DIRECTORY="$2"
shift 2
;;
--vmaf)
VMAF="$2"
shift 2
;;
--preset)
PRESET="$2"
shift 2
;;
--workers)
WORKERS="$2"
shift 2
;;
--samples)
SAMPLES="$2"
shift 2
;;
--thorough)
THOROUGH="--thorough"
shift
;;
--encoder)
ENCODER="$2"
shift 2
;;
--hwaccel)
HWACCEL="$2"
shift 2
;;
-*)
log_error "Unknown option: $1"
exit 1
;;
*)
DIRECTORY="$1"
shift
;;
esac
done
# Check if python3 is available
if ! command -v python3 &> /dev/null; then
if ! command -v python &> /dev/null; then
log_error "Python 3 not found. Please install Python 3."
exit 1
else
PYTHON_CMD="python"
fi
else
PYTHON_CMD="python3"
fi
# Check if optimize_library.py exists
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SCRIPT_PATH="$SCRIPT_DIR/optimize_library.py"
if [[ ! -f "$SCRIPT_PATH" ]]; then
log_error "optimize_library.py not found in: $SCRIPT_DIR"
exit 1
fi
# Build command arguments
ARGS=(
"$PYTHON_CMD" "$SCRIPT_PATH"
"$DIRECTORY"
--vmaf "$VMAF"
--preset "$PRESET"
--workers "$WORKERS"
--samples "$SAMPLES"
--encoder "$ENCODER"
)
if [[ -n "$THOROUGH" ]]; then
ARGS+=(--thorough)
fi
if [[ -n "$HWACCEL" ]]; then
ARGS+=(--hwaccel "$HWACCEL")
fi
# Print configuration
log_info "========================================"
log_info "VMAF Library Optimiser (Linux/Server)"
log_info "========================================"
echo ""
log_info "Directory: $DIRECTORY"
log_info "Target VMAF: $VMAF"
log_info "Preset: $PRESET"
log_info "Workers: $WORKERS"
log_info "Samples: $SAMPLES"
log_info "Encoder: $ENCODER"
if [[ -n "$THOROUGH" ]]; then
log_info "Thorough: Yes"
fi
if [[ -n "$HWACCEL" ]]; then
log_info "HW Accel: $HWACCEL"
fi
echo ""
log_info "Running optimize_library.py..."
echo ""
# Run the optimisation; `||` keeps `set -e` from aborting before the exit code is captured
EXIT_CODE=0
"${ARGS[@]}" || EXIT_CODE=$?
# Handle exit code
if [ $EXIT_CODE -eq 0 ]; then
log_success "SUCCESS: Library optimisation completed"
else
log_error "optimize_library.py exited with code $EXIT_CODE"
fi
exit $EXIT_CODE