# optimize_library.py - Complete Feature Restore
## Overview
Restores all functionality that had degraded since the original `optimise_media_v2.py` and adds new features:
- Intelligent VMAF targeting (94 → 93 → estimated target for 15% savings, with a 12% minimum-savings threshold)
- Comprehensive logging system (separate logs for tv/movies vs content)
- Before/after metadata tracking
- Hardware encoding with 1 HW worker + CPU workers
- Plex refresh on completion
- Resume capability and graceful shutdown
- Lock file coordination for multi-machine setups
---
## Key Features
### 1. Intelligent VMAF Target Search
**Flow:**
```
Try VMAF 94
    ↓
Success? → Check savings ≥ 12%
    ↓ Yes                ↓ No
Encode at 94         Try VMAF 93
                         ↓
                     Savings ≥ 12%?
                         ↓ Yes            ↓ No
                     Encode at 93     Find 15% (test 92, 90)
```
**Benefits:**
- Targets high quality first (VMAF 94)
- Falls back to VMAF 93 if needed
- Estimates VMAF for 15%+ savings if both fail
- Logs recommendations for manual review
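
The fallback order above can be sketched as a small function. This is a minimal illustration, not the script's actual code: `estimate_savings` is a hypothetical callable that returns the predicted savings percentage for a VMAF target (or `None` if the search fails), and the constant and action names are assumptions.

```python
from typing import Callable, Optional, Tuple

MIN_SAVINGS = 12.0       # minimum savings (%) to accept VMAF 94 or 93
FALLBACK_SAVINGS = 15.0  # savings (%) a lower-quality estimate must reach


def choose_target(
    estimate_savings: Callable[[float], Optional[float]],
) -> Tuple[str, Optional[float]]:
    """Return (action, vmaf) following the 94 -> 93 -> estimate flow."""
    for vmaf in (94.0, 93.0):
        savings = estimate_savings(vmaf)
        if savings is not None and savings >= MIN_SAVINGS:
            return ("encode", vmaf)
    # Neither target saved enough: probe lower targets and only recommend,
    # leaving the final decision to manual review.
    for vmaf in (92.0, 90.0):
        savings = estimate_savings(vmaf)
        if savings is not None and savings >= FALLBACK_SAVINGS:
            return ("recommend", vmaf)
    return ("skip", None)
```

Returning a `"recommend"` action instead of encoding mirrors the "logs recommendations for manual review" behaviour: the lower targets are only reported, never applied automatically.
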
### 2. Comprehensive Logging System
**Log Files (in log_dir):**
- `tv_movies.jsonl` - Successful encodes from /tv and /movies
- `content.jsonl` - Successful encodes from /content
- `failed_encodes.jsonl` - Encoding errors
- `failed_searches.jsonl` - Files that couldn't hit any VMAF target
- `low_savings_skips.jsonl` - Files with <12% savings + 15% estimates
**Log Entry Structure:**
```json
{
  "file": "/path/to/file.mkv",
  "status": "success",
  "vmaf": 94.0,
  "crf": 37.0,
  "before": {
    "codec": "h264",
    "width": 1280,
    "height": 720,
    "bitrate": 1010,
    "size": 158176376,
    "duration": 1252.298
  },
  "after": {
    "codec": "av1",
    "width": 1280,
    "height": 720,
    "bitrate": 775,
    "size": 121418115,
    "duration": 1252.296
  },
  "duration": 1299.28,
  "savings": 23.24,
  "timestamp": "2025-12-31T13:56:55.894288"
}
```
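
Appending an entry like the one above to a `.jsonl` file could look roughly like this. It is a sketch under assumptions: `append_log` is a hypothetical helper name, and the script may build its entries differently.

```python
import json
from datetime import datetime
from pathlib import Path


def append_log(log_dir, name, entry):
    """Append one JSON object as a single line to log_dir/name."""
    path = Path(log_dir) / name
    path.parent.mkdir(parents=True, exist_ok=True)
    # Add an ISO 8601 timestamp, matching the format in the sample entry.
    record = dict(entry, timestamp=datetime.now().isoformat())
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return path
```

Writing one object per line keeps the files usable with `tail -f` and `jq`, since each line parses independently of the rest.
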
### 3. Before/After Metadata
**Tracked metrics:**
- Codec (h264, hevc, av1, etc.)
- Resolution (width × height)
- Bitrate (kbps, calculated as size × 8 / duration / 1000; more reliable than the value ffprobe reports)
- File size (bytes)
- Duration (seconds)
- Savings percentage
**Why calculate bitrate from file size?**
- The bitrate field reported by ffprobe is often 0 for VBR files
- File size and duration are always available, so dividing one by the other gives an accurate average bitrate
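
The calculation can be sketched as follows. The kbps unit and the truncation to an integer are inferred from the sample log entry (158176376 bytes over 1252.298 s is logged as 1010), not confirmed from the script, and the function names are illustrative.

```python
def bitrate_kbps(size_bytes: int, duration_s: float) -> int:
    """Average bitrate in kbps: bytes -> bits, per second, then per 1000."""
    if duration_s <= 0:
        return 0  # avoid dividing by zero when the probe returned no duration
    return int(size_bytes * 8 / duration_s / 1000)


def savings_percent(before_bytes: int, after_bytes: int) -> float:
    """Size reduction as a percentage of the original size, to 2 decimals."""
    return round((1 - after_bytes / before_bytes) * 100, 2)
```

With the sample entry's values, `bitrate_kbps(158176376, 1252.298)` gives 1010, `bitrate_kbps(121418115, 1252.296)` gives 775, and `savings_percent(158176376, 121418115)` gives 23.24, which matches the logged fields.
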
### 4. Hardware Encoding with 1 HW Worker
**Configuration:**
```bash
# Enable hardware encoding with 1 HW worker + rest CPU
python3 optimize_library.py /media --hwaccel auto --use-hardware-worker --workers 4
```
**Behavior:**
- First file processed: Uses hardware encoding (faster, GPU-accelerated)
- Remaining files: Use CPU encoding (slower, higher quality at the same CRF)
- Hardware methods auto-detected:
  - Windows: d3d11va
  - macOS: videotoolbox
  - Linux/WSL: vaapi
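
A minimal sketch of how the auto-detection could map platforms to methods, assuming it keys on the operating system name. `detect_hwaccel` is a hypothetical name; the mapping itself is taken from the list above.

```python
import platform
from typing import Optional

# Platform name (as returned by platform.system()) -> hwaccel method.
HWACCEL_BY_SYSTEM = {
    "Windows": "d3d11va",
    "Darwin": "videotoolbox",  # macOS
    "Linux": "vaapi",          # WSL also reports "Linux"
}


def detect_hwaccel(system: Optional[str] = None) -> Optional[str]:
    """Return the hwaccel method for this platform, or None if unknown."""
    return HWACCEL_BY_SYSTEM.get(system or platform.system())
```

A platform match does not prove the local ffmpeg build supports the method, so checking `ffmpeg -hwaccels` is still worthwhile.
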
**Why 1 HW worker?**
- GPU memory is limited, so multiple simultaneous encodes may run out of memory (OOM)
- CPU encoding yields higher quality at the same CRF
- Best of both worlds: 1 fast GPU encode, the rest high-quality CPU encodes
**To disable hardware:**
```bash
python3 optimize_library.py /media --hwaccel none
# or just omit --hwaccel flag
```
### 5. Separate Logging by Directory
**Automatic detection:**
- Scanning `/mnt/Media/tv` or `/mnt/Media/movies` → Logs to `tv_movies.jsonl`
- Scanning `/mnt/Media/content` → Logs to `content.jsonl`
**Exclusion:**
- When scanning `/tv` or `/movies`, the `/content` subdirectory is automatically excluded
**Example:**
```bash
# TV/Movies - logged together
python3 optimize_library.py /mnt/Media/movies
# Creates: tv_movies.jsonl
# Content - logged separately
python3 optimize_library.py /mnt/Media/content
# Creates: content.jsonl
```
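
The routing and exclusion rules could be expressed roughly as below. This sketch assumes detection is based on a path component named `content`; the function names are hypothetical and the script's real matching logic may differ.

```python
from pathlib import PurePosixPath


def log_name_for(scan_root: str) -> str:
    """Pick the success log for a scan root: content vs tv/movies."""
    parts = PurePosixPath(scan_root).parts
    return "content.jsonl" if "content" in parts else "tv_movies.jsonl"


def is_excluded(scan_root: str, path: str) -> bool:
    """When scanning tv/movies, skip anything under a 'content' subdirectory."""
    if log_name_for(scan_root) == "content.jsonl":
        return False  # a content scan excludes nothing
    relative = PurePosixPath(path).relative_to(scan_root)
    return "content" in relative.parts
```
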
### 6. Plex Refresh on Completion
**Configuration:**
```bash
python3 optimize_library.py /media \
--plex-url http://localhost:32400 \
--plex-token YOUR_TOKEN_HERE
```
**Behavior:**
- After all files processed (or shutdown), triggers Plex library refresh
- Only refreshes if at least 1 file was successfully encoded
- Uses Plex API: `GET /library/sections/1/refresh`
**To get a Plex token:**
1. Sign in to Plex Web
2. Open any library item, click the "..." menu and choose Get Info
3. Click View XML
4. Copy the value of the `X-Plex-Token` parameter from the URL of the page that opens (long alphanumeric string)
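
The refresh call described under Behavior can be sketched with the standard library alone. The function names and the `section_id` and `timeout` parameters are assumptions for illustration; the endpoint and the `X-Plex-Token` header come from this document.

```python
import urllib.request


def build_refresh_request(plex_url, token, section_id=1):
    """Build the GET request for /library/sections/<id>/refresh."""
    url = f"{plex_url.rstrip('/')}/library/sections/{section_id}/refresh"
    return urllib.request.Request(
        url, headers={"X-Plex-Token": token}, method="GET"
    )


def refresh_plex(plex_url, token, encoded_count, section_id=1, timeout=10):
    """Trigger a refresh only if at least one file was encoded."""
    if encoded_count < 1:
        return False  # nothing changed, so skip the network call
    request = build_refresh_request(plex_url, token, section_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```

The section ID is 1 in the documented endpoint; libraries on other servers may use different IDs, which `GET /library/sections` lists.
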
### 7. Resume Capability
**Automatic skip:**
- Files already processed in current run are skipped
- Uses lock files for multi-machine coordination
- Press Ctrl+C for graceful shutdown
**Lock file mechanism:**
```
/log_dir/.lock/{video_filename}
```
- Before processing: Check if lock exists → Skip if yes
- Start processing: Create lock
- Finish processing: Remove lock
**Multi-machine safe:**
- Machine A: No lock → Create lock → Encode → Remove lock
- Machine B: Lock exists → Skip file
- Result: Different machines process different files automatically
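
A separate "check, then create" sequence leaves a small window in which two machines can both see no lock and both start encoding. One way to close that window is to create the lock atomically, as in the sketch below. This is an illustration rather than the script's implementation: `file_lock` is a hypothetical helper, and it assumes the shared filesystem honours exclusive creation (`O_EXCL`).

```python
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def file_lock(lock_dir, video_path):
    """Yield True if this process took the lock, False if another holds it."""
    lock_dir = Path(lock_dir)
    lock_dir.mkdir(parents=True, exist_ok=True)
    lock_path = lock_dir / Path(video_path).name
    try:
        # O_CREAT | O_EXCL fails if the file exists, so checking for the
        # lock and creating it happen in a single step.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        os.close(fd)
        yield True
    finally:
        lock_path.unlink(missing_ok=True)  # always release our own lock
```

Because the lock is named after the video filename only, two files with the same name in different directories would share a lock; that follows the naming shown above.
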
### 8. Graceful Shutdown
**Behavior:**
- Press Ctrl+C → Current tasks finish, new tasks stop
- Output: "⚠️ Shutdown requested. Finishing current tasks..."
- No partial encodes left hanging
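
A common way to get this behaviour is a signal handler that only sets a flag, which the scheduling loop checks before starting each new file. The sketch below assumes that pattern; the names are hypothetical and the loop is sequential for brevity.

```python
import signal
import threading

shutdown = threading.Event()


def request_shutdown(signum=None, frame=None):
    """Signal handler: record the request and let running work finish."""
    print("Shutdown requested. Finishing current tasks...")
    shutdown.set()


def install_handlers():
    """Route Ctrl+C (SIGINT) to the handler; call from the main thread."""
    signal.signal(signal.SIGINT, request_shutdown)


def run_all(files, process):
    """Process files in order, starting no new file after a shutdown request."""
    done = []
    for path in files:
        if shutdown.is_set():
            break
        done.append(process(path))  # the current file always completes
    return done
```

Since the handler never interrupts `process`, a file that is already being encoded runs to completion before the loop exits.
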
---
## Usage Examples
### Basic Usage (CPU only)
```bash
python3 optimize_library.py /mnt/Media/movies
```
### Hardware Encoding (1 HW + 3 CPU workers)
```bash
python3 optimize_library.py /mnt/Media/movies \
--hwaccel auto \
--use-hardware-worker \
--workers 4
```
### With Plex Refresh
```bash
python3 optimize_library.py /mnt/Media/tv \
--plex-url http://localhost:32400 \
--plex-token YOUR_TOKEN \
--workers 2
```
### Custom Settings
```bash
python3 optimize_library.py /mnt/Media/movies \
--vmaf 95 \
--preset 7 \
--workers 3 \
--log-dir /custom/logs \
--hwaccel vaapi
```
---
## New Command-Line Arguments
| Argument | Description | Default |
|----------|-------------|----------|
| `directory` | Root directory to scan | (required) |
| `--vmaf` | Target VMAF score | 95.0 |
| `--preset` | SVT-AV1 preset (lower = slower and better quality, e.g. 4 = best, 8 = fast) | 6 |
| `--workers` | Concurrent files to process | 1 |
| `--samples` | Samples for CRF search | 4 |
| `--hwaccel` | Hardware acceleration (auto, vaapi, d3d11va, videotoolbox, none) | None |
| `--use-hardware-worker` | Use 1 hardware worker + rest CPU (requires --hwaccel) | False |
| `--plex-url` | Plex server URL | None |
| `--plex-token` | Plex authentication token | None |
| `--log-dir` | Log directory | /opt/Optmiser/logs |
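
For reference, the table above maps onto an `argparse` definition roughly like the following. This is a sketch that mirrors the documented names and defaults; the help strings and the `build_parser` name are assumptions, and the script's actual parser may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented arguments and defaults."""
    p = argparse.ArgumentParser(prog="optimize_library.py")
    p.add_argument("directory", help="Root directory to scan")
    p.add_argument("--vmaf", type=float, default=95.0)
    p.add_argument("--preset", type=int, default=6)
    p.add_argument("--workers", type=int, default=1)
    p.add_argument("--samples", type=int, default=4)
    p.add_argument(
        "--hwaccel",
        choices=["auto", "vaapi", "d3d11va", "videotoolbox", "none"],
        default=None,
    )
    p.add_argument("--use-hardware-worker", action="store_true")
    p.add_argument("--plex-url", default=None)
    p.add_argument("--plex-token", default=None)
    p.add_argument("--log-dir", default="/opt/Optmiser/logs")
    return p
```
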
---
## Monitoring and Diagnostics
### Check logs in real-time
```bash
# Watch successful encodes
tail -f /opt/Optmiser/logs/tv_movies.jsonl | jq '.'
# Check files logged for review (low savings)
tail -f /opt/Optmiser/logs/low_savings_skips.jsonl | jq '.'
```
### View statistics
```bash
# Count successful encodes
jq -r '.status' /opt/Optmiser/logs/tv_movies.jsonl | sort | uniq -c
# Average savings
jq -r '.savings' /opt/Optmiser/logs/tv_movies.jsonl | jq -s 'add/length'
# Files by VMAF target
jq -r '.vmaf' /opt/Optmiser/logs/tv_movies.jsonl | sort | uniq -c
```
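
The same statistics can be computed in Python when `jq` is not available. This sketch assumes each line is a JSON object with the `savings` and `vmaf` fields shown in the log entry structure; `summarize` is a hypothetical helper name.

```python
import json
from collections import Counter


def summarize(path):
    """Return entry count, average savings, and a count per VMAF target."""
    entries = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():  # tolerate blank lines
                entries.append(json.loads(line))
    savings = [e["savings"] for e in entries if "savings" in e]
    return {
        "count": len(entries),
        "avg_savings": sum(savings) / len(savings) if savings else 0.0,
        "by_vmaf": Counter(e.get("vmaf") for e in entries),
    }
```
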
### Check lock files (multi-machine coordination)
```bash
ls -la /opt/Optmiser/.lock/
```
---
## Troubleshooting
### Hardware encoding not working
```bash
# Check if hwaccel is detected
python3 optimize_library.py /media --hwaccel auto --help
# Verify ffmpeg supports hardware acceleration
ffmpeg -hwaccels
# Try specific hardware method
python3 optimize_library.py /media --hwaccel vaapi
```
### Plex refresh not working
```bash
# Test curl manually
curl -X GET http://localhost:32400/library/sections/1/refresh \
-H "X-Plex-Token: YOUR_TOKEN"
# Check Plex token is valid
curl -X GET http://localhost:32400/library/sections \
-H "X-Plex-Token: YOUR_TOKEN"
```
### Files being skipped (locked)
```bash
# Check for stale lock files
ls -la /opt/Optmiser/.lock/
# Remove stale locks (if no process is running)
rm /opt/Optmiser/.lock/*.lock
```
---
## Differences from Original optimise_media_v2.py
### Preserved (restored):
- ✅ Intelligent VMAF target search (94 → 93 → 15% estimate)
- ✅ Comprehensive logging system (5 log files)
- ✅ Before/after metadata (codec, bitrate, size, duration)
- ✅ Real-time output streaming
- ✅ Lock file mechanism for multi-machine
- ✅ Recommendations system
### Added new features:
- ✅ Hardware encoding with 1 HW worker + CPU workers
- ✅ Separate logging for /tv+/movies vs /content
- ✅ Plex refresh on completion
- ✅ Graceful shutdown (Ctrl+C handling)
- ✅ Resume capability (track processed files)
- ✅ Configurable log directory
### Changed from original:
- `AB_AV1_PATH` → Use system `ab-av1` (more portable)
- Fixed `--enc-input` usage (only for crf-search, not encode)
- Added proper exception handling for probe failures
- Improved error messages and progress output
---
## Performance Characteristics
### Single file encoding time (1-hour 1080p h264)
| Method | VMAF 94 | VMAF 93 | VMAF 90 |
|--------|------------|------------|------------|
| CPU (24 threads) | 4-5 min | 3-4 min | 2-3 min |
| GPU (hardware) | 30-60 sec | 20-40 sec | 15-30 sec |
### Multi-worker throughput
| Workers | HW worker | Throughput (1-hour files) |
|---------|-------------|---------------------------|
| 1 | No | ~1 file per 5 min (CPU) |
| 1 | Yes | ~1 file per 1 min (GPU) |
| 4 | No | ~4 files per 5 min (CPU) |
| 4 | Yes | ~1 GPU file + 3 CPU files (~4 total) |
---
**Last Updated:** December 31, 2025
**Version:** 3.0 - Complete Feature Restore