feat(subs): resilient subtitle extraction — sidecars, charset, torrent/debrid

Close the recurring "video has subtitles but the web player shows none" gap with a source-agnostic pipeline:

- Discover EXTERNAL sidecar subs in the scan (Video.es.ass siblings + a Subs/ bundle), parse lang/forced/SDH from the filename, skip VobSub (.sub+.idx). ffprobe-only scanning ignored these (ToonsHub/anime "MSubs" releases).
- Transcode sidecar charset -> UTF-8 before WebVTT (BOM/UTF-16/code-page by language). Chinese SCRIPT matters: chs/sc -> GBK, cht/tc/big5 -> Big5 (decoding one as the other is mojibake).
- /sub now serves a standalone sidecar file (i=-1, p=file, &l=lang hint) and a remote debrid URL (ffmpeg reads http, no local stat), not just embedded streams of a local file.
- probe.json emits a tokened vttUrl per TEXT track so torrent/debrid HLS streams (never library-scanned) get subtitles too.

Embedded index is counted among embedded streams only, so -map 0:s:N stays aligned when sidecars are appended.

Tested against a real 347-file gallery: 26/26 sidecars and embedded ass/srt/mov_text all extract to valid WebVTT; bitmap (pgs/dvd_subtitle) correctly stays burn-in. Manual harness gated behind GALLERY_DIR.
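
The index-alignment point in the message can be illustrated with a minimal sketch. It is not the code from this commit: the `Track` type and the `EmbeddedIndexes` function are hypothetical names chosen for illustration, and the only details taken from the message are that sidecars use `i=-1` and that the embedded index must match ffmpeg's `-map 0:s:N` numbering.

```go
package main

import "fmt"

// Track is a hypothetical, simplified subtitle track record (assumption,
// not the type used in the commit).
type Track struct {
	Codec   string // e.g. "ass", "subrip", "mov_text"
	Sidecar bool   // true when the track comes from an external file
	Path    string // sidecar file path; empty for embedded tracks
}

// EmbeddedIndexes returns, for each track, the N to use in "-map 0:s:N".
// Sidecars get -1 and do not advance the counter, so the numbering of
// embedded streams is unaffected by where sidecars sit in the list.
func EmbeddedIndexes(tracks []Track) []int {
	out := make([]int, len(tracks))
	next := 0
	for i, t := range tracks {
		if t.Sidecar {
			out[i] = -1
			continue
		}
		out[i] = next
		next++
	}
	return out
}

func main() {
	tracks := []Track{
		{Codec: "ass"},
		{Codec: "subrip", Sidecar: true, Path: "Video.es.srt"},
		{Codec: "mov_text"},
	}
	fmt.Println(EmbeddedIndexes(tracks)) // prints [0 -1 1]
}
```

Counting by list position instead would give the third track index 2, and `-map 0:s:2` would then point past the file's two embedded subtitle streams.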
parent 22081cf106
commit d708ea2360
13 changed files with 957 additions and 39 deletions
@@ -64,7 +64,13 @@ var langNormalize = map[string]string{
 	"mlt": "mt", "mt": "mt",
 	"swa": "sw", "sw": "sw",
 	"afr": "af", "af": "af",
-	"lat": "la", "la": "la",
+	"kan": "kn", "kn": "kn",
+	"mal": "ml", "ml": "ml",
+	"mar": "mr", "mr": "mr",
+	"pan": "pa", "pa": "pa",
+	"guj": "gu", "gu": "gu",
+	"kann": "kn",
+	"lat": "la", "la": "la",
 
 	// Full English names (ffprobe sometimes returns these instead of codes)
 	"english": "en", "spanish": "es", "french": "fr", "german": "de",