VMAFOptimiser/AGENTS.md

# VMAF Optimiser - Agent Guidelines

## Quick Reference

**Purpose:** Video library optimization pipeline using VMAF quality targets with AV1 encoding.

**Core Files:**
- `optimize_library.py` - Main Python script (342 lines)
- `run_optimisation.sh` - Linux/macOS wrapper
- `run_optimisation.ps1` - Windows wrapper

---

## Build/Lint/Test Commands

### Development Setup

```bash
# Install dependencies (if not already)
cargo install ab-av1  # v0.10.3+
brew install ffmpeg   # macOS
# OR: apt install ffmpeg  # Linux/WSL
# OR: winget install ffmpeg  # Windows
```

### Linting

```bash
# Ruff is the linter (indicated by .ruff_cache/)
ruff check optimize_library.py

# Format with ruff
ruff format optimize_library.py

# Check specific issues
ruff check optimize_library.py --select E,F,W
```

### Running the Application

```bash
# Linux/macOS
./run_optimisation.sh --directory /media --vmaf 95 --workers 1

# Windows
.\run_optimisation.ps1 -directory "D:\Movies" -vmaf 95 -workers 1

# Direct Python execution
python3 optimize_library.py /media --vmaf 95 --preset 6 --workers 1
```

### Testing

**No formal test suite exists currently.** Test manually by:

```bash
# Test with single video file
python3 optimize_library.py /media/sample.mkv --vmaf 95 --workers 1

# Dry run (validate logic without encoding)
python3 optimize_library.py /media --vmaf 95 --thorough

# Check dependencies
python3 optimize_library.py 2>&1 | grep -E "(ffmpeg|ab-av1)"
```

---

## Code Style Guidelines

### Python Style (PEP 8 Compliant)

**Imports:**
```python
# Standard library first, grouped logically
import os
import sys
import subprocess
import json
import shutil
import platform
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
```

**Naming Conventions:**
```python
# Constants: UPPER_SNAKE_CASE
DEFAULT_VMAF = 95.0
DEFAULT_PRESET = 6
EXTENSIONS = {".mkv", ".mp4", ".mov", ".avi", ".ts"}

# Functions: snake_case
def get_video_info(filepath):
    def build_ab_av1_command(input_path, output_path, args):

# Variables: snake_case
input_path = Path(filepath)
output_path = input_path.with_stem(input_path.stem + "_av1")

# Module-level cache: _PREFIX (private)
_AB_AV1_HELP_CACHE = {}
```

**Formatting:**
- 4-space indentation
- Line length: ~88-100 characters (ruff default: 88)
- No trailing whitespace
- One blank line between functions
- Two blank lines before class definitions (if any)

**Function Structure:**
```python
def function_name(param1, param2, optional_param=None):
    """Brief description if needed."""
    try:
        # Implementation
        return result
    except Exception as e:
        print(f"Error: {e}")
        return None  # or handle gracefully
```

**Subprocess Calls:**
```python
# Use subprocess.run for all external commands
cmd = ["ffmpeg", "-i", input_file, output_file]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)

# Check return codes explicitly
if result.returncode != 0:
    print(f"Command failed: {result.stderr}")
```

### Error Handling

```python
# Always wrap external tool calls in try-except
try:
    info = get_video_info(filepath)
    if not info:
        return  # Early return on None
except subprocess.CalledProcessError as e:
    print(f"FFmpeg failed: {e}")
    return

# Use specific exception types when possible
except FileNotFoundError:
    print("File not found")
except json.JSONDecodeError:
    print("Invalid JSON")
```

### Platform Detection

```python
# Use platform module for OS detection
def is_wsl():
    if os.environ.get("WSL_DISTRO_NAME"):
        return True
    try:
        with open("/proc/sys/kernel/osrelease", "r") as f:
            return "microsoft" in f.read().lower()
    except FileNotFoundError:
        return False

def platform_label():
    system = platform.system()
    if system == "Linux" and is_wsl():
        return "Linux (WSL)"
    return system
```

### Argument Parsing

```python
def main():
    parser = argparse.ArgumentParser(description="Description")
    parser.add_argument("directory", help="Root directory")
    parser.add_argument("--vmaf", type=float, default=95.0, help="Target VMAF")
    args = parser.parse_args()
```

---

## Shell Script Guidelines (run_optimisation.sh)

**Shebang & Error Handling:**
```bash
#!/bin/bash
set -e  # Exit on error
```

**Color Output:**
```bash
COLOR_RED='\033[0;31m'
COLOR_GREEN='\033[0;32m'
COLOR_CYAN='\033[0;36m'
COLOR_RESET='\033[0m'

log_info() {
    echo -e "${COLOR_CYAN}$*${COLOR_RESET}"
}

log_error() {
    echo -e "${COLOR_RED}ERROR: $*${COLOR_RESET}" >&2
}
```

**Argument Parsing:**
```bash
while [[ $# -gt 0 ]]; do
    case "$1" in
        --vmaf)
            VMAF="$2"
            shift 2
            ;;
        *)
            DIRECTORY="$1"
            shift
            ;;
    esac
done
```

---

## PowerShell Guidelines (run_optimisation.ps1)

**Parameter Declaration:**
```powershell
param(
    [Parameter(Mandatory=$false)]
    [string]$Directory = ".",
    [float]$Vmaf = 95.0,
    [switch]$Thorough
)
```

**Error Handling:**
```powershell
$ErrorActionPreference = "Stop"

function Write-ColorOutput {
    param([string]$Message, [string]$Color = "White")
    Write-Host $Message -ForegroundColor $Color
}
```

**Process Management:**
```powershell
$process = Start-Process -FilePath $pythonCmd.Path -ArgumentList $arguments `
    -NoNewWindow -PassThru
$process.WaitForExit()
$exitCode = $process.ExitCode
```

---

## Key Constraints & Best Practices

### When Modifying `optimize_library.py`

1. **Maintain platform compatibility:** Always test on Linux, Windows, and macOS
2. **Preserve subprocess patterns:** Use `subprocess.run` with `check=True`
3. **Handle missing dependencies:** Check `shutil.which()` before running tools
4. **Thread safety:** The script uses `ThreadPoolExecutor` - avoid global state
5. **Path handling:** Always use `Path` objects from `pathlib`

### When Modifying Wrapper Scripts

1. **Keep interfaces consistent:** Both scripts should accept the same parameters
2. **Preserve color output:** Users expect colored status messages
3. **Validate Python path:** Handle `python3` vs `python` vs `py`
4. **Check script existence:** Verify `optimize_library.py` exists before running

### File Organization

- Keep functions under 50 lines
- Use descriptive names (no abbreviations like `proc_file`, use `process_file`)
- Cache external command help text (see `_AB_AV1_HELP_CACHE`)
- Use constants for magic numbers and strings

### Hardware Acceleration

- Auto-detect via `normalize_hwaccel()` function
- Respect `--hwaccel` flag
- Check ab-av1 support with `ab_av1_supports()` before using flags
- Default: `auto` (d3d11va on Windows, videotoolbox on macOS, vaapi on Linux)

---

## Common Patterns

### Checking Tool Availability
```python
def check_dependencies():
    missing = []
    for tool in ["ffmpeg", "ffprobe", "ab-av1"]:
        if not shutil.which(tool):
            missing.append(tool)
    if missing:
        print(f"Error: Missing tools: {', '.join(missing)}")
        sys.exit(1)
```

### Building Commands Conditionally
```python
cmd = ["ab-av1", "auto-encode", "-i", input_path]

if args.encoder:
    if ab_av1_supports("auto-encode", "--encoder"):
        cmd.extend(["--encoder", args.encoder])
    else:
        print("Warning: Encoder not supported")
```

### File Path Operations
```python
# Use pathlib for cross-platform paths
input_path = Path(filepath)
output_path = input_path.with_stem(input_path.stem + "_av1")

# Safe existence check
if output_path.exists():
    print(f"Skipping: {input_path.name}")
    return
```

---

## Version Control

```bash
# Check for changes
git status

# Format before committing
ruff format optimize_library.py
ruff check optimize_library.py

# Commit with conventional commits
git commit -m "feat: add hardware acceleration support"
git commit -m "fix: handle missing ffprobe gracefully"
git commit -m "docs: update setup instructions"
```

---

## Important Notes

1. **No type hints:** Current codebase doesn't use Python typing
2. **No formal tests:** Test manually with sample videos
3. **No package.json:** This is a standalone script, not a Python package
4. **Lock files:** `.lock/` directory created at runtime for multi-machine coordination
5. **Logs:** JSONL format in `logs/` directory for structured data

---

**Last Updated:** December 31, 2025