feat(subs): resilient subtitle extraction — sidecars, charset, torrent/debrid

Close the recurring "video has subtitles but the web player shows none" gap
with a source-agnostic pipeline:

- Discover EXTERNAL sidecar subs in the scan (Video.es.ass siblings + a Subs/
  bundle), parse lang/forced/SDH from the filename, skip VobSub (.sub+.idx).
  ffprobe-only scanning ignored these (ToonsHub/anime "MSubs" releases).
- Transcode sidecar charset -> UTF-8 before WebVTT (BOM/UTF-16/code-page by
  language). Chinese SCRIPT matters: chs/sc -> GBK, cht/tc/big5 -> Big5
  (decoding one as the other is mojibake).
- /sub now serves a standalone sidecar file (i=-1, p=file, &l=lang hint) and a
  remote debrid URL (ffmpeg reads http, no local stat) — not just embedded
  streams of a local file.
- probe.json emits a tokened vttUrl per TEXT track so torrent/debrid HLS streams
  (never library-scanned) get subtitles too. Embedded index is counted among
  embedded streams only, so -map 0:s:N stays aligned when sidecars are appended.

Tested against a real 347-file gallery: 26/26 sidecars and embedded ass/srt/
mov_text all extract to valid WebVTT; bitmap (pgs/dvd_subtitle) correctly stays
burn-in. Manual harness gated behind GALLERY_DIR.
This commit is contained in:
Deivid Soto 2026-06-08 13:04:09 +02:00
parent 22081cf106
commit d708ea2360
13 changed files with 957 additions and 39 deletions

4
go.mod
View file

@ -14,8 +14,10 @@ require (
github.com/huin/goupnp v1.3.0
github.com/olekukonko/tablewriter v1.1.4
github.com/spf13/cobra v1.10.2
github.com/spf13/pflag v1.0.10
github.com/torrentclaw/go-client v0.2.0
golang.org/x/term v0.43.0
golang.org/x/text v0.37.0
golang.org/x/time v0.15.0
golang.zx2c4.com/wireguard v0.0.0-20250521234502-f333402bd9cb
)
@ -113,7 +115,6 @@ require (
github.com/rivo/uniseg v0.4.7 // indirect
github.com/rs/dnscache v0.0.0-20230804202142-fc85eb664529 // indirect
github.com/spaolacci/murmur3 v1.1.0 // indirect
github.com/spf13/pflag v1.0.10 // indirect
github.com/tidwall/btree v1.8.1 // indirect
github.com/wlynxg/anet v0.0.5 // indirect
github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e // indirect
@ -127,7 +128,6 @@ require (
golang.org/x/net v0.54.0 // indirect
golang.org/x/sync v0.20.0 // indirect
golang.org/x/sys v0.44.0 // indirect
golang.org/x/text v0.37.0 // indirect
golang.zx2c4.com/wintun v0.0.0-20230126152724-0fa3db229ce2 // indirect
gvisor.dev/gvisor v0.0.0-20250503011706-39ed1f5ac29c // indirect
lukechampine.com/blake3 v1.4.1 // indirect