feat(stream): cache extracted subtitles to a hidden .unarr sidecar

On-demand WebVTT extraction re-ran ffmpeg on every /sub request. For 50GB+
remuxes it couldn't finish a full text track within the 60s HTTP timeout, so
the web player got a 500 and no subtitles.

Extract each text subtitle ONCE — during the library scan (no HTTP deadline,
generous per-file timeout) and write-through on the first on-demand request —
into a hidden ".unarr/<name>.s<index>.vtt" sidecar next to the media file.
The /sub handler serves a fresh sidecar straight from disk (mtime-invalidated
when the media is replaced), so playback subtitles load without re-running
ffmpeg and huge files work.

- mediainfo.sidecar: cache paths, mtime freshness, atomic write, ExtractSubtitleVTT,
  IsTextSubtitleCodec (shared classifier, mirrors engine + web whitelists).
- library.PrewarmSidecars: bounded, idempotent, ctx-cancellable background pass
  run after every scan (manual + daemon auto-scan).
- subtitleHandler: cache-read → hit; miss → extract → write-through.
- config: library.cache_subtitles (default true), wired via SetCacheSubtitles.

Local-only by design: nothing extracted is uploaded — the sidecar is the user's
own content, private to their disk.
Deivid Soto 2026-06-02 09:10:36 +02:00
parent 7417fad45f
commit 178c16f458
6 changed files with 353 additions and 33 deletions


@@ -189,6 +189,14 @@ type LibraryConfig struct {
AutoScan bool `toml:"auto_scan"` // enable daily auto-scan in daemon (default true)
ScanInterval string `toml:"scan_interval"` // e.g. "24h", "12h", "6h" (default "24h")
AllowDelete bool `toml:"allow_delete"` // allow web UI to request file deletion from disk
// Sidecar caching: extract text subtitles (WebVTT) and thumbnail frames once
// during the library scan and store them in a hidden ".unarr" dir next to the
// media file, so the stream handlers serve them instantly instead of running
// ffmpeg per request (and so huge remuxes don't hit the on-demand HTTP
// timeout). Both default true; disable to save the disk/CPU of pre-extraction.
CacheSubtitles bool `toml:"cache_subtitles"` // default true
CacheThumbnails bool `toml:"cache_thumbnails"` // default true
}
// Default returns a Config with sensible defaults. Used both for fresh
@@ -255,9 +263,11 @@ func Default() Config {
Locale: "en",
},
Library: LibraryConfig{
- AutoScan:     true,
- ScanInterval: "24h",
- Workers:      8,
+ AutoScan:        true,
+ ScanInterval:    "24h",
+ Workers:         8,
+ CacheSubtitles:  true,
+ CacheThumbnails: true,
},
}
}
@@ -321,6 +331,16 @@ func applyDefaults(cfg *Config, meta toml.MetaData) {
cfg.General.Country = "US"
}
// Sidecar caching defaults ON for existing configs that predate these keys —
// it only adds small hidden files next to media and makes subs/thumbnails
// instant. Power users can set them false explicitly to opt out.
if !meta.IsDefined("library", "cache_subtitles") {
cfg.Library.CacheSubtitles = true
}
if !meta.IsDefined("library", "cache_thumbnails") {
cfg.Library.CacheThumbnails = true
}
if !meta.IsDefined("downloads", "transcode", "enabled") {
cfg.Download.Transcode.Enabled = true
}