- honest resume log (copy mode produces from 0, not from StartSec)
- EXT-X-START injection anchored to #EXTM3U, with a warning if it fails
- ServeSegment without the segmentCount cap in copy mode (ffmpeg runs ahead
of the index)
- types.go comment: gated by HLS_COPY_MIN_VERSION web-side
The stereo downmix on the re-encode path (f89396c) left a symmetric hole: a
source whose audio is ALREADY AAC 5.1 was copied as-is, and WebKit rejects
multichannel AAC on the first segment exactly as it did the re-encoded one.
Audio is now copied only when the track is AAC with ≤2 channels; anything
else (non-AAC, AAC 5.1+, or an unknown channel count in the probe, as a
fail-safe) is re-encoded to 48k stereo AAC. The original multichannel track
stays intact for an external player. New smoke test: AAC 5.1 source →
re-encode.
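A minimal sketch of the copy-or-re-encode decision described above. The function names are hypothetical, and treating a channel count of 0 as "unknown" is an assumption about the probe; the flags mirror the ones named in these commits (aac 192k, 48 kHz, -ac 2).

```go
package main

import (
	"fmt"
	"strings"
)

// audioArgs returns the ffmpeg audio flags for a copy-mode session.
// codec and channels come from the probe; channels <= 0 means "unknown"
// (an assumption made for this sketch).
func audioArgs(codec string, channels int) []string {
	if codec == "aac" && channels >= 1 && channels <= 2 {
		return []string{"-c:a", "copy"}
	}
	// Non-AAC, AAC 5.1+, or unknown channel count: fail safe to stereo AAC.
	return []string{"-c:a", "aac", "-b:a", "192k", "-ar", "48000", "-ac", "2"}
}

// audioArgsString joins the flags so they are easy to log or compare.
func audioArgsString(codec string, channels int) string {
	return strings.Join(audioArgs(codec, channels), " ")
}

func main() {
	fmt.Println(audioArgsString("aac", 2))  // stereo AAC is copied
	fmt.Println(audioArgsString("aac", 6))  // AAC 5.1 is re-encoded
	fmt.Println(audioArgsString("eac3", 6)) // non-AAC is re-encoded
	fmt.Println(audioArgsString("aac", 0))  // unknown channels: re-encoded
}
```

Failing safe on an unknown channel count costs one audio re-encode in the worst case, which is cheap next to a session that never plays.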
Without -ac 2 a 5.1 source (AC3/EAC3) produced 6-channel AAC from ffmpeg's
native encoder, which WebKit/Apple HLS rejects when sniffing the first
segment: Safari's access log shows master → index → init → seg-0 twice and
then silence. This was the exact discriminator for the field pattern:
episodes with stereo AAC (audio copy) played on iPhone; every 5.1 movie
failed. Verified with Safari/macOS via the WebDriver-less access log: with
-ac 2 segment progression advances normally.
Mirrors the encode path's flags (aac 192k 48kHz stereo). Smoke test
extended: the re-encode must carry -ac 2.
Until ENDLIST arrives, the copy session is a growing EVENT playlist, and some
native players (iOS) treat an unfinished playlist as LIVE: they latch onto
the live edge instead of position 0. EXT-X-START:TIME-OFFSET=0
(RFC 8216 §4.3.5.2) pins the start explicitly; it is harmless once the
playlist is final. This matches the observed pattern: short episodes
(ENDLIST within seconds) played on iPhone, movies (EVENT for minutes) did
not.
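The injection can be sketched as a small string rewrite. This is an illustration under assumptions, not the daemon's code: injectStart is a hypothetical name, and returning ok=false stands in for the "warning if it fails" path, where the caller would log and serve the playlist untouched.

```go
package main

import (
	"fmt"
	"strings"
)

const startTag = "#EXT-X-START:TIME-OFFSET=0"

// injectStart inserts the EXT-X-START tag right after the #EXTM3U header
// line. ok is false when the playlist does not begin with #EXTM3U; the
// input is then returned unchanged.
func injectStart(playlist string) (out string, ok bool) {
	if strings.Contains(playlist, "#EXT-X-START:") {
		return playlist, true // already present, nothing to do
	}
	header, rest, _ := strings.Cut(playlist, "\n")
	if strings.TrimRight(header, "\r") != "#EXTM3U" {
		return playlist, false
	}
	return header + "\n" + startTag + "\n" + rest, true
}

func main() {
	in := "#EXTM3U\n#EXT-X-VERSION:7\n#EXT-X-PLAYLIST-TYPE:EVENT\n"
	out, ok := injectStart(in)
	fmt.Printf("ok=%v\n%s", ok, out)
}
```

Anchoring on the header line rather than appending keeps the tag ahead of every segment entry, so it is already in place however far ffmpeg has written the playlist.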
An EVENT playlist whose entries start at 0 while the fragments carry a
shifted tfdt (-ss + -output_ts_offset) is exactly the shape the native iOS
HLS parser chokes on: resume at 368s → player error and a session
re-bootstrap loop on iPhone (observed 2026-06-10).
Copy now always produces from 0 with real absolute PTS: it outruns playback
at I/O speed, so the resume point shows up in the growing timeline within
seconds and the player's startPosition seek lands normally. Resume test
updated: the playlist must cover the full timeline.
New VideoCopy mode in the HLS engine: ffmpeg -c:v copy (the video is never
re-encoded: pure I/O, works on a NAS without a GPU), audio copy if it is
already AAC or AAC 192k otherwise, muxed to fMP4 segments with ffmpeg writing
ITS OWN playlist (EVENT while running, ENDLIST when done, exact EXTINF values
at the source's keyframes). Replaces the growing-fMP4 remux served over
hand-rolled HTTP Range, whose structural fragility caused three incidents in
one day (malformed init/delay_moov, re-seek loop from an invented total, iOS
rejecting an unknown total).
Deliberate differences from encode mode:
- ffmpeg's playlist served from disk (cuts land on source keyframes →
durations impossible to pre-render; measured: probing keyframes first costs
8-24s, unworkable for TTFF)
- no seek-restart or auto-restart (the copy runs at disk speed and outruns
any viewer; the uniform-segment -ss would corrupt the variable-cut
timeline)
- no HLS cache (regenerating costs no encode; caching only burns disk)
- resume via -ss (snaps to a keyframe) + -output_ts_offset
- master playlist without CODECS (a wrong hardcoded string makes iOS reject
the variant; omitting it is legal and universal)
Validation: seg-0 TTFB of 510ms on the real MKV from the incident (HEVC
Main10 + EAC3, 6.7GB). Integration suite with real ffmpeg (smoke tag):
h264+aac (full copy), h264+ac3 (audio re-encode with dts priming, the
delay_moov class), hevc10+eac3 (the exact shape of the incident, hvc1 tag),
resume with StartSec, and playlist serving; codec asserts via ffprobe on the
served playlist, EXTINF sum ≈ duration, complete segments on disk
(+temp_file = atomic rename).
The web wiring (remux→hls+videoCopy plan with a version gate ≥1.0.10) lives
in the web repo. Plan: docs/plans/hls-copy-remux-replacement.md (web).
Keep an NVENC downscale of an SDR source entirely on the GPU
(decode -> scale_cuda -> h264_nvenc) instead of copying every frame to the
CPU for `scale=` and back. That GPU->CPU->GPU round-trip is the wall on
modest GPUs; even a strong box gains ~37% (scale_cuda 14.9x vs CPU 10.9x
on a 4K SDR HEVC -> 1080p encode).
Strictly gated so every case that needs CPU frames is unchanged:
- HDR (libplacebo Vulkan / zscale CPU tonemap can't consume a CUDA surface),
- burn-in (the scale2ref+overlay composite runs on CPU frames),
- non-NVENC encoders, and no-op when not actually downscaling.
- hwscale.go: FFmpegSupportsScaleCuda — a functional 1-frame probe mirroring
the libplacebo probe (presence in -filters lies; needs a real CUDA device).
Probes the worst-case real input (10-bit p010 -> 8-bit yuv420p) so a host
whose scale_cuda can't do the 10->8-bit conversion fails closed to CPU.
- hls.go: useCudaScale gate + `-hwaccel_output_format cuda` + a
`scale_cuda=-2:H:format=yuv420p` filter branch. Output is 8-bit
(format=yuv420p + `-profile:v main`), browser-safe.
- transcode_quality.go / player_session_registry.go / daemon.go: HasScaleCuda
flag, populated + warmed at startup like the other ffmpeg capability probes.
Fail-closed: probe absent/fails -> keep the CPU scale path, no regression.
Verified live (real 4K SDR HEVC Main10 session emitted scale_cuda, 5.54x
realtime, nvenc at 100%) + 8 arg-builder unit tests for the gate.
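The gate above can be read as one boolean over the session's properties. The sketch below is illustrative only: the encodeJob struct and its field names are hypothetical, and the CPU fallback filter string is an assumption; the scale_cuda filter and the -hwaccel_output_format flag are the ones named in this commit.

```go
package main

import "fmt"

// encodeJob is a hypothetical summary of what the gate needs to know.
type encodeJob struct {
	Encoder      string // e.g. "h264_nvenc", "libx264"
	HDR          bool   // tonemap paths need CPU or Vulkan frames
	BurnIn       bool   // the overlay composite runs on CPU frames
	SrcHeight    int
	OutHeight    int
	HasScaleCuda bool // result of the functional 1-frame probe
}

// useCudaScale is true only when every condition for staying on the GPU holds.
func useCudaScale(j encodeJob) bool {
	return j.HasScaleCuda &&
		j.Encoder == "h264_nvenc" &&
		!j.HDR &&
		!j.BurnIn &&
		j.OutHeight < j.SrcHeight // no-op when not actually downscaling
}

// scaleFilter returns the scale filter for the chosen path.
func scaleFilter(j encodeJob) string {
	if useCudaScale(j) {
		return fmt.Sprintf("scale_cuda=-2:%d:format=yuv420p", j.OutHeight)
	}
	return fmt.Sprintf("scale=-2:%d", j.OutHeight) // CPU path (assumed shape)
}

// inputFlags returns the extra decode flags the GPU path needs.
func inputFlags(j encodeJob) []string {
	if useCudaScale(j) {
		return []string{"-hwaccel_output_format", "cuda"}
	}
	return nil
}

func main() {
	j := encodeJob{Encoder: "h264_nvenc", SrcHeight: 2160, OutHeight: 1080, HasScaleCuda: true}
	fmt.Println(inputFlags(j), scaleFilter(j))
	j.HDR = true
	fmt.Println(inputFlags(j), scaleFilter(j))
}
```

Because every term is ANDed, a failed or missing probe leaves HasScaleCuda false and the job falls through to the CPU path unchanged.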
NVENC (ffmpeg 6.1 + current drivers) emits the keyframes forced by
-force_key_frames as NON-IDR I-frames; the HLS muxer only cuts on IDR, so
every segment silently stretched to the default GOP (250 frames ≈ 10.4 s
@24fps) while the server-side playlist kept promising 2 s per segment. With
the real PTS ~5× off the playlist's map, seeks landed wherever they could and
subtitles drifted out of sync as soon as segments from different runs
(seek-restart) were mixed in the same dir.
Measured: 3 segments per 30 s of encode instead of 15; with -forced-idr 1
exactly 15, and post-fix seg-150/151/158 start at exactly 300.0/302.0/316.0.
Affects ALL historical NVENC HLS (it did not come from the new rate control:
the fixed-bitrate config produced the same). QSV gets its own spelling,
-forced_idr. Old cache entries never got sealed (the segment count did not
add up), so there is no migration: only live sessions were affected.
- StreamSession.Prewarm → HLSSessionConfig.Prewarm: the daemon defers a
prewarm's encode until no encode is live (10s poll, 30min cap) and
registers it via RegisterKeep (side-by-side, no eviction). Previously
everything went through Register(), which closes the other sessions: a
next-episode prewarm claimed mid-playback killed the user's stream
("closed (cache discarded)" → master 404, verified 2026-06-10). A new REAL
session first reaps in-flight prewarms (CloseWhere(IsPrewarm)) to release
the cache's writer lock (a SEALED prewarm survives as a cache HIT) and then
evicts normally via Register.
- Trickplay: -skip_frame nokey + fps=...:eof_action=pass decodes only
keyframes (measured 12x less CPU: 233s→19s on a 24min 1080p episode; it
matters because it runs alongside live streaming). Ticks stay uniform (fps
repeats the last keyframe), so the manifest and cached clients are
unchanged. eof_action=pass covers clips with a single keyframe (the fps
filter emits nothing from a 1-frame stream with the default eof action).
- HLSSessionConfig.StartSec (synced from StreamSession.startSec): the first
ffmpeg starts already seeked to the resume point (-ss + -output_ts_offset +
-start_number, the same machinery as the seek-restart) instead of encoding
from seg-0 only to die on the player's immediate seek-restart (double
spawn, slow resume). readyMax is pre-seeded to the start index; the
ready-watcher compares ReadyCount() > WriterStartIdx() so it does not flag
"ready" before the first real segment. startSec >= duration → start from 0
(stale resume from a replaced file).
- Rate control: capped constant-quality where the encoder does it well
(libx264 -crf 23, NVENC -cq 23 -b:v 0) with the same -maxrate as before and
-bufsize 2x (the previous 1x choked peaks). Easy scenes emit far fewer bits
(fewer stalls over funnel/LTE); the worst case is unchanged.
QSV/VideoToolbox/VAAPI keep the proven fixed-bitrate triple (their quality
knobs have vendor gotchas).
- Cleanup: removed the dead buildHLSFFmpegArgs wrapper and startIdx<0 guard.
Close the recurring "video has subtitles but the web player shows none" gap
with a source-agnostic pipeline:
- Discover EXTERNAL sidecar subs in the scan (Video.es.ass siblings + a Subs/
bundle), parse lang/forced/SDH from the filename, skip VobSub (.sub+.idx).
ffprobe-only scanning ignored these (ToonsHub/anime "MSubs" releases).
- Transcode sidecar charset -> UTF-8 before WebVTT (BOM/UTF-16/code-page by
language). Chinese SCRIPT matters: chs/sc -> GBK, cht/tc/big5 -> Big5
(decoding one as the other is mojibake).
- /sub now serves a standalone sidecar file (i=-1, p=file, &l=lang hint) and a
remote debrid URL (ffmpeg reads http, no local stat) — not just embedded
streams of a local file.
- probe.json emits a tokened vttUrl per TEXT track so torrent/debrid HLS streams
(never library-scanned) get subtitles too. Embedded index is counted among
embedded streams only, so -map 0:s:N stays aligned when sidecars are appended.
Tested against a real 347-file gallery: 26/26 sidecars and embedded ass/srt/
mov_text all extract to valid WebVTT; bitmap (pgs/dvd_subtitle) correctly stays
burn-in. Manual harness gated behind GALLERY_DIR.
Parse ffmpeg's -stats progress line (speed=Yx, fps=) from the HLS encoder's
stderr into a per-session EWMA, and report a health snapshot to the web side a
few seconds after seg-0. Lets the player name a too-slow transcode from a
direct measurement (~5-7s) instead of inferring it from stall shape (~15-30s).
- hls.go: add -stats; rewrite hlsStderrCapture.Write to frame on \r and \n,
parse speed=/fps= (telemetry only, never logged), flag input-bound on source
read errors. EWMA on HLSSession + GetTranscodeStats(); warmup-skip the first
cold-start frames so a healthy encoder isn't reported as struggling.
- client.go: MarkSessionReady takes an optional *SessionHealth.
- daemon.go: watcher reports one health snapshot once >=4 post-warmup samples
settle; classifyAgentHealth maps the speed ratio to ok/marginal/struggling.
Additive: old web replicas ignore the extra field; cache-hit/direct-play
sessions and short encodes report nil (the web keeps its stall heuristic).
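A rough sketch of the parsing and smoothing described above, not the daemon's implementation. The type and function names, the EWMA alpha, and the classification thresholds are all assumptions chosen for illustration; only the speed= field, the \r/\n framing, and the ok/marginal/struggling labels come from the commit text.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// speedRe matches the "speed=1.95x" field; ffmpeg may pad it with spaces.
var speedRe = regexp.MustCompile(`speed=\s*([0-9]+(?:\.[0-9]+)?)x`)

// parseSpeed extracts the speed multiplier from one progress line.
func parseSpeed(line string) (float64, bool) {
	m := speedRe.FindStringSubmatch(line)
	if m == nil {
		return 0, false
	}
	v, err := strconv.ParseFloat(m[1], 64)
	return v, err == nil
}

// splitFrames frames a stderr chunk on both \r and \n, because ffmpeg
// rewrites its progress line in place with carriage returns.
func splitFrames(chunk string) []string {
	return strings.FieldsFunc(chunk, func(r rune) bool { return r == '\r' || r == '\n' })
}

// ewma is an exponentially weighted moving average seeded by the first sample.
type ewma struct {
	alpha float64
	value float64
	n     int
}

func (e *ewma) add(x float64) {
	if e.n == 0 {
		e.value = x
	} else {
		e.value = e.alpha*x + (1-e.alpha)*e.value
	}
	e.n++
}

// feed parses every framed line of a chunk and folds the speeds into e.
func feed(e *ewma, chunk string) {
	for _, line := range splitFrames(chunk) {
		if v, ok := parseSpeed(line); ok {
			e.add(v)
		}
	}
}

// classify maps a speed ratio to a health label (thresholds are hypothetical).
func classify(speed float64) string {
	switch {
	case speed >= 1.2:
		return "ok"
	case speed >= 1.0:
		return "marginal"
	default:
		return "struggling"
	}
}

func main() {
	e := &ewma{alpha: 0.5}
	feed(e, "frame=  48 fps= 48 speed=2.0x\rframe=  96 fps= 24 speed=1.0x\r")
	fmt.Println(e.n, e.value, classify(e.value))
}
```

Lines such as speed=N/A simply fail to match and are skipped, so the average only ever sees real measurements.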
The web persists the chosen audioIndex globally, so a value from a
multi-track file can arrive for a file with fewer tracks. buildHLSFFmpegArgsAt
mapped `-map 0:a:N?` verbatim; the optional `?` then matched nothing and the
HLS output had NO audio stream (video-only — 2026-06-03, Wistoria S02E08 had
one audio track but the session carried audioIndex=2). Clamp an out-of-range
index to the first track so audio is never silently dropped.
Regression test: TestBuildHLSFFmpegArgsAudioClamp.
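The clamp itself is small. A sketch, with hypothetical function names, of how an out-of-range index could be folded back to the first track before the -map argument is built:

```go
package main

import "fmt"

// clampAudioIndex returns idx when it addresses an existing audio track,
// and 0 (the first track) otherwise.
func clampAudioIndex(idx, trackCount int) int {
	if idx < 0 || idx >= trackCount {
		return 0
	}
	return idx
}

// audioMap builds the stream specifier passed to -map. The trailing "?"
// keeps the mapping optional for sources with no audio at all.
func audioMap(idx, trackCount int) string {
	return fmt.Sprintf("0:a:%d?", clampAudioIndex(idx, trackCount))
}

func main() {
	fmt.Println(audioMap(2, 1)) // stale index from a multi-track file
	fmt.Println(audioMap(1, 3)) // valid index is kept
}
```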
Make libplacebo actually reachable in the shipped agent image, and refuse it
where it would be a regression.
Dockerfile (so a Vulkan-capable host can use the GPU tonemap path):
- install libvulkan1 (the Vulkan loader libplacebo links at runtime; ~150 KB)
- add 'graphics' to NVIDIA_DRIVER_CAPABILITIES so the nvidia container runtime
mounts the Vulkan ICD (nvidia_icd.json + GLX libs) under --gpus all
Both are inert without a working Vulkan GPU — the functional probe gates use.
hls.go: gate libplacebo on a real HW encoder (HWAccel != none). A software-only
host with mesa would expose lavapipe (CPU Vulkan); the functional probe accepts
it but its tonemap is SLOWER than the zscale CPU chain, so libplacebo there is a
regression. No HW encoder -> stay on zscale.
Verified on the GPU dev box: nvenc session still picks libplacebo (-c:v
h264_nvenc -vf ...,libplacebo=...:tonemapping=bt.2390); new unit test locks the
software-encoder path onto zscale.
Prefer the single-pass Vulkan libplacebo filter over the CPU zscale chain
for HDR->SDR tonemapping when the agent ffmpeg has it. One GPU pass does
tonemap + BT.709 primaries/transfer/matrix + 8-bit yuv420p, replacing the
four-stage zscale chain and its trailing format=/setparams. Higher quality,
far cheaper than the CPU path, and present on builds that lack zscale.
- FFmpegSupportsLibplacebo probe (cached, mirrors FFmpegSupportsZscale)
- HasLibplacebo on TranscodeRuntime, wired from buildTranscodeRuntime
- hls.go: videoTail picks libplacebo when present (not h264_vaapi), else
keeps the zscale tonemap + format chain
- test: libplacebo replaces the zscale chain, never runs alongside it
Slightly-VFR / B-frame MKV sources made ffmpeg's fMP4 muxer emit a continuous
"Packet duration is out of range" flood and produce uneven segment lengths the
web player stuttered on. Add, on the two main encoders + globally:
- libx264: -bf 0 -sc_threshold 0
- h264_nvenc: -bf 0 -no-scenecut 1
- -fps_mode cfr (force constant frame rate)
Keyframe cadence stays driven by -force_key_frames, so every segment is exactly
hls_time long. Verified: the warning flood drops from dozens/sec to ~1 per 80s
of transcoded content (cosmetic), segments stay valid fMP4.
Add GET /sub?p=&i=&t= that extracts an embedded text subtitle stream to
WebVTT via ffmpeg (-map 0:s:N -c:s webvtt), token-gated with a per-track
sub:<sha256(path)>:<index> scope. The web player attaches these as
external <track>s for both direct-play and HLS, native and hls.js.
Removes the old per-session extraction path (extractSubtitles,
ServeSubtitle, manifest SUBTITLES renditions, subs/ mkdir, Close() wait):
native HLS playback never surfaced manifest subs, so that work was wasted.
The on-demand /sub endpoint is now the single subtitle source.
Bitmap subs can't be served as WebVTT, so the user picks one and the daemon
re-encodes with it overlaid. HLSSessionConfig.BurnSubtitleIndex (*int, nil=no
burn) flows into the cache key + a -filter_complex graph:
[0:v:0]<vchain>[base];[0:s:N][base]scale2ref[sub][base2];[base2][sub]overlay[vout]
Overlay after the tonemap (SDR subs keep brightness); scale2ref fits the PGS
canvas to the output. Invalid/text/out-of-range index -> clean-encode fallback.
IsTextSubtitle now includes "text" (parity with the web classifier).
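A sketch of how the graph could be assembled from the video chain and the subtitle index. videoFilterArgs is a hypothetical name, and emitting `-map [vout]` alongside the graph is an assumption about how the labelled output is selected; the graph string follows the shape given above.

```go
package main

import "fmt"

// videoFilterArgs returns a plain -vf chain for a clean encode (burn == nil)
// or a -filter_complex graph that overlays subtitle stream *burn.
func videoFilterArgs(vchain string, burn *int) []string {
	if burn == nil {
		return []string{"-vf", vchain}
	}
	graph := fmt.Sprintf(
		"[0:v:0]%s[base];[0:s:%d][base]scale2ref[sub][base2];[base2][sub]overlay[vout]",
		vchain, *burn,
	)
	return []string{"-filter_complex", graph, "-map", "[vout]"}
}

func main() {
	idx := 1
	fmt.Println(videoFilterArgs("scale=-2:1080,format=yuv420p", nil))
	fmt.Println(videoFilterArgs("scale=-2:1080,format=yuv420p", &idx))
}
```

Using a pointer for the index keeps "no burn-in" distinct from "burn subtitle 0", which a plain int with a zero default could not express.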
Anamorphic 2.39:1 scaled to 1080 height = ~2586x1080 = 11016 MBs, busting
level 4.1's 8192-MB MaxFS -> nvenc "InitializeEncoder failed: Invalid Level"
(libx264: "frame MB size > level limit") -> 0 segments, session stalls. Most
4K rips are 2.39:1, so HLS playback was silently broken for them.
H264LevelForFrame(w,h) derives the level from the real macroblock count
(max of MB-tier and height-tier). hls.go computes output width and uses it.
16:9 unchanged; anamorphic bumps to 5.0 when needed. Discovered + verified
during the trickplay smoke.
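The macroblock arithmetic can be checked with a short sketch. It covers only the frame-size (MaxFS) tier; the height tier mentioned above and the other level constraints are left out. The function names are hypothetical, and the MaxFS values are the H.264 Annex A limits for the four levels this daemon picks from.

```go
package main

import "fmt"

// levelLimit pairs an H.264 level with its MaxFS (max frame size in macroblocks).
type levelLimit struct {
	name  string
	maxFS int
}

// Candidate levels in ascending order.
var levels = []levelLimit{
	{"4.0", 8192},
	{"5.0", 22080},
	{"5.1", 36864},
	{"6.0", 139264},
}

// macroblocks counts 16x16 macroblocks, rounding each dimension up.
func macroblocks(w, h int) int {
	return ((w + 15) / 16) * ((h + 15) / 16)
}

// levelForFrame returns the lowest candidate level whose MaxFS fits the frame.
func levelForFrame(w, h int) string {
	mbs := macroblocks(w, h)
	for _, l := range levels {
		if mbs <= l.maxFS {
			return l.name
		}
	}
	return levels[len(levels)-1].name
}

func main() {
	fmt.Println(macroblocks(1920, 1080), levelForFrame(1920, 1080))
	fmt.Println(macroblocks(2586, 1080), levelForFrame(2586, 1080))
	fmt.Println(macroblocks(3840, 2160), levelForFrame(3840, 2160))
}
```

A 16:9 1080p frame is 120 x 68 = 8160 macroblocks and just fits under 8192, which is why only the wider anamorphic output trips the limit.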
HDR (HDR10/HLG/Dolby Vision) transcoded to SDR came out washed-out and
desaturated because the filter chain never tonemapped. buildHLSFFmpegArgsAt now
inserts a zscale linearise -> hable tonemap -> BT.709 chain after the scale and
before format=, but only when the source is HDR and the ffmpeg build has zscale
(FFmpegSupportsZscale, cached). Builds without zimg keep the old behaviour
(plays, just desaturated) instead of erroring.
It's a CPU filter, valid for every encoder here: the decode hwaccel deliberately
leaves frames in system memory (no -hwaccel_output_format), so zscale runs ahead
of format=/hwupload exactly like the existing scale filter. Verified on a real
4K HDR10 file — vivid colour and deep blacks vs the washed-out baseline.
Debrid direct links are time-limited; a long playback can outlive the link
the session was created with. When a debrid source dies mid-stream the daemon
now re-resolves a fresh link for the same content and resumes — no torrent
fallback, no playback restart.
- debridFileProvider holds the URL behind a mutex; on an expired-link status
(401/403/404/410) the ranged reader re-resolves via a refresh callback and
retries (bounded: 1 initial + 1 post-refresh attempt). A browser opens
several range connections, so the refresh is coalesced singleflight-style —
N readers hitting the dead link share ONE re-resolution, not N.
- HLS-from-URL: the auto-restart supervisor re-resolves the link before
relaunching ffmpeg (else it just retries the dead URL and burns the retry
budget). The mutable URL lives in s.liveURL under s.mu — restartFromSegment
reads it from the HTTP handler goroutine too (seek-restart), so cfg stays
immutable and the write races nothing.
- agentClient.RefreshStreamURL → POST /api/internal/agent/stream-url.
Cross-source torrent<->debrid swap (the rare "debrid genuinely gone" case) is
intentionally deferred. Reader refresh + coalescing covered by unit tests
(incl. -race); the web endpoint re-resolves against a real AllDebrid account.
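One way to get the coalescing behaviour with only the standard library is sketched below. It is not the daemon's debridFileProvider: linkHolder, Refresh, and demoCoalesce are hypothetical names, and comparing against the caller's stale URL is a design choice of this sketch so that late arrivals reuse the fresh link instead of resolving again.

```go
package main

import (
	"fmt"
	"sync"
)

// linkHolder keeps the current URL behind a mutex and lets many readers
// share a single in-flight re-resolution.
type linkHolder struct {
	mu       sync.Mutex
	url      string
	inflight chan struct{} // non-nil while a refresh is running
	resolve  func() (string, error)
}

// Refresh is called by a reader whose request against stale just failed.
func (h *linkHolder) Refresh(stale string) (string, error) {
	h.mu.Lock()
	for {
		if h.url != stale { // someone already refreshed: reuse their result
			u := h.url
			h.mu.Unlock()
			return u, nil
		}
		if h.inflight == nil {
			break // we become the one refresher (lock still held)
		}
		ch := h.inflight
		h.mu.Unlock()
		<-ch // wait for the refresh in flight, then re-check
		h.mu.Lock()
	}
	ch := make(chan struct{})
	h.inflight = ch
	h.mu.Unlock()

	fresh, err := h.resolve() // slow network call, done outside the lock

	h.mu.Lock()
	if err == nil {
		h.url = fresh
	}
	h.inflight = nil
	h.mu.Unlock()
	close(ch)
	return fresh, err
}

// demoCoalesce runs n concurrent readers against one dead link and reports
// how many resolutions happened and whether every reader got the new URL.
func demoCoalesce(n int) (calls int, allFresh bool) {
	const oldURL, newURL = "https://example.invalid/old", "https://example.invalid/new"
	var callsMu sync.Mutex
	h := &linkHolder{url: oldURL}
	h.resolve = func() (string, error) {
		callsMu.Lock()
		calls++
		callsMu.Unlock()
		return newURL, nil
	}
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i], _ = h.Refresh(oldURL)
		}(i)
	}
	wg.Wait()
	allFresh = true
	for _, r := range results {
		if r != newURL {
			allFresh = false
		}
	}
	return calls, allFresh
}

func main() {
	calls, ok := demoCoalesce(8)
	fmt.Println("resolutions:", calls, "all readers fresh:", ok)
}
```

If the resolution fails, waiters in this sketch retry one at a time rather than sharing the error; a real implementation would pair that with the bounded retry count described above.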
Non-browser-native debrid content (mkv/HEVC/…) can now stream: ffmpeg reads
the debrid HTTPS link directly (-i <url>) and transcodes to HLS, instead of
2a's raw direct-play which only works for mp4/m4v.
- HLSSessionConfig gains SourceURL + CacheID; sourceRef() feeds ffprobe,
ffmpeg -i, and subtitle extraction from one place. HTTP-resilience flags
(-reconnect*, -rw_timeout) are added only for a URL source; a seek-restart
re-opens the URL with a Range request (-ss before -i = input seek).
- Segment cache keys by CacheID (the torrent info_hash) for URL sessions so
re-plays hit cache despite the debrid URL changing each resolution
(KeyForID, no filepath.Abs).
- OnStreamSession: the 2a direct-play branch is now gated on PlayMethod != "hls";
a new branch handles DirectURL + PlayMethod=="hls" → HLS-from-URL. The
local-file and both debrid HLS paths share a startHLSPlayback helper.
- ExtractMediaInfo no longer masks a URL probe failure as "file not found"
(surfaces ffprobe's real stderr, e.g. "Protocol not found" on a TLS-less
ffmpeg build).
- Bump 0.11.0 -> 0.12.0 as the HLS-from-URL floor the web gates on.
Validated e2e against real AllDebrid: a cached HEVC x265 mkv transcodes
(h264_nvenc) from the debrid URL and plays 1080p in Chrome via hls.js,
subtitles extracted from the remote mkv.
With `-tune ll` NVENC emits long IDR-less GOPs that ignore
`-force_key_frames`, so ffmpeg's HLS muxer keeps writing into seg-0.m4s
forever instead of closing it at the 2 s boundary. Result:
* seg-0.m4s balloons to the full encoded size (1.2 GB on a 48-min movie)
* seg-1.m4s never appears
* daemon's pollSegments needs seg-N+1 to confirm seg-N is closed → never
advances → `mark-ready: timeout` after 60 s
* web player sits on "preparando sesión" until the user gives up
Verified on ffmpeg 6.1.1 + driver 580 / Ryzen 7 7700X + RTX-class GPU:
without `-tune ll`, the same `-preset p3 -rc vbr` cmd produces 39
discrete segments in 15 s at ~27x real-time (was 1 segment / 9 min of
material with `-tune ll` — encoder kept going on a single output).
Introduced by `3b8d77b feat(hls): faster first-start — probe cache +
tighter encoder presets (0.9.9)`. Dropping `-tune ll` costs ~0.5 dB
PSNR at the same bitrate but restores playback. NVENC first-segment
latency remains under 2 s — well within the player's startup budget.
Closes QW2. Validated against the dev box's AMD Raphael iGPU
(/dev/dri/renderD128, radeonsi/mesa 25.2.8). The "proper" full-GPU
path via scale_vaapi triggers a known mesa 25 + Raphael bug
("Cannot allocate memory" per session start, encode still succeeds
but logs are spammy) — hybrid CPU scale → format=nv12 → hwupload
→ h264_vaapi encode delivers GPU surfaces to the encoder without
poking the broken scaler.
Three concrete changes in buildHLSFFmpegArgsAt:
1. New `case "h264_vaapi"` adds `-vaapi_device /dev/dri/renderD128`.
Multi-GPU hosts (this dev box has NVIDIA on renderD129 + AMD on
renderD128) need it so the encoder doesn't bind to a non-VAAPI
render node — without it the encoder fell back to NULL device
in manual smoke testing.
2. Filter chain branches on codec: VAAPI uses
`scale=…,format=nv12,hwupload` while libx264 / NVENC / QSV
keep the existing `scale=…,format=yuv420p,setparams=…` shape.
The setparams color metadata block is dropped on VAAPI because
VAAPI surfaces don't expose VUI fields and the encoder writes
its own.
3. Two new unit tests lock the argv shape so a future refactor
doesn't accidentally merge the paths back together:
TestBuildHLSFFmpegArgsVAAPI asserts the new flags + the
ABSENCE of scale_vaapi; TestBuildHLSFFmpegArgsLibx264NoRegression
verifies the software path keeps yuv420p + setparams + has
none of the VAAPI extras.
Manual ffmpeg validation on the dev box:
hybrid encode of 5 s 4K → 720p: 0.66 s wall, 472 % CPU, 268 KB
output — no errors logged. scale_vaapi variant in comparison
spammed "Cannot allocate memory" while emitting valid output.
Closes the deferred low-priority item from the Phase 3.3b /critico review.
Without this the watcher kept polling a torn-down HLSSession for up
to 60 s — fine in current code paths (Close always pairs with ctx
cancel which makes the select{} branch fire), but the function's
correctness then leaned on a caller invariant rather than its own
state check. Adding IsClosed() as a public wrapper around the
existing isClosed() lets the watcher detect any future
session-shutdown path (registry replace, idle sweep, internal kill)
without touching the unexported helper.
Closes Phase 3.3b. Daemon now tells the server the moment a session's
first HLS segment + init.mp4 land on disk; the web side flips
streaming_session.ready_at = NOW(), which its SSE endpoint pushes to
subscribed players so the loading UI flips from "Preparando…" to
"Stream listo" without polling HEAD on the segment URL.
Surface:
- New Client.MarkSessionReady(ctx, sessionId) HTTP method →
POST /api/internal/agent/session-ready.
- New engine.HLSSession.ReadyCount() + FromCache() accessors so the
watcher goroutine doesn't reach into private state.
- New cmd.watchSessionReady(ctx, client, hsess, sessionId) goroutine
polls ReadyCount every 200 ms with a 60 s deadline + short-circuits
for cache-HIT sessions (ready the moment StartHLSSession returns).
- Daemon callback spawns it right after streamSrv.HLS().Register so
the watcher's lifecycle matches the session's.
Best-effort: a transient network failure on the webhook is logged + the
goroutine exits — the player's existing HEAD-probe retry path still
discovers ready state independently. The webhook is an acceleration,
not a hard dependency.
First-frame latency drops by another 1-2 s on cold-cache plays:
1. HLS segment duration halved from 4 s to 2 s. seg-0 lands in ~half
the wait time — the player paints the first frame as soon as it
arrives. Software encodes on 4K go from ~3 s wait to ~1.5 s; HW
encoders shave ~0.5 s. Trade-off: 2× segment count per source
(~3600 segments for a 2 h movie instead of ~1800), but each is
half the size on disk. Within HLS spec — Apple recommends 6 s, but
2 s is valid; LL-HLS uses 1-2 s.
2. Cache from 0.9.9 self-heals: cached entries used 4 s segments;
VerifyComplete now expects a different highest segment index and
invalidates them, triggering a re-encode on next play. No manual
cleanup needed.
3. OnStreamSession daemon callback now runs StartHLSSession in a
goroutine. Sync HTTP responses return immediately (~50 ms instead
of waiting for the ~0.3-1 s ffprobe). Other pending actions in
the same sync cycle (new tasks, deletes) no longer wait for the
transcoder warmup. Browser HEAD probes already have a 30 s retry
budget that covers the brief gap between playerSessionRegistry.add
and streamSrv.HLS().Register.
Helpers added (engine.segmentDurationFor / segmentStartSec /
segmentCountForDuration) so a future short-first-segment variant or
non-uniform layout can slot in without touching every call site.
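For the uniform 2 s layout the three helpers reduce to simple arithmetic. The sketch below borrows the helper names from this commit, but the signatures and bodies are assumptions; the real ones may differ.

```go
package main

import (
	"fmt"
	"math"
)

// segmentDurationFor returns the target duration of segment idx in seconds.
// Uniform today; a short-first-segment variant would branch on idx here.
func segmentDurationFor(idx int) float64 {
	return 2.0
}

// segmentStartSec returns the timeline position where segment idx begins.
func segmentStartSec(idx int) float64 {
	start := 0.0
	for i := 0; i < idx; i++ {
		start += segmentDurationFor(i)
	}
	return start
}

// segmentCountForDuration returns how many segments cover durSec seconds.
func segmentCountForDuration(durSec float64) int {
	if durSec <= 0 {
		return 0
	}
	return int(math.Ceil(durSec / segmentDurationFor(0)))
}

func main() {
	fmt.Println(segmentCountForDuration(7200)) // a 2 h movie
	fmt.Println(segmentStartSec(150))
	fmt.Println(segmentCountForDuration(5)) // the last segment is partial
}
```

Summing per-segment durations in segmentStartSec is slower than a multiplication, but it stays correct if segmentDurationFor ever stops being constant; segmentCountForDuration would need the same treatment in that case.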
Internal: -hls_init_time was investigated but discarded — ffmpeg's
implementation treats it as a min duration, not a target, so it
couldn't deliver a uniformly 2 s first segment on top of a 4 s
steady state. Uniform 2 s is simpler and gets the same first-frame
win.
Addresses items raised by the multi-agent code review of the 0.9.9
HW accel + first-start work:
- EncoderProfile now carries DecodeHwAccel so the demuxer `-hwaccel`
flag and the encoder argv derive from a single resolved profile.
Adding a new backend can no longer leave the two switches out of
sync.
- VAAPI no longer passes `-hwaccel_output_format vaapi`. That option
pinned decoded frames to GPU memory, but the filter chain (scale,
format, setparams) runs on CPU and would fail with "impossible to
convert between formats". Frames now decode HW + flow on CPU; the
encoder uploads back to GPU. Pre-existing bug, never reported because
no one had VAAPI auto-detected in practice.
- readyMax field comment + name: documented that it's a COUNT
(segments ready), not an index. The semantics were correct but the
comment read "highest index" which made `idx < readyMax` look like
an off-by-one to reviewers.
- probe_cache background janitor: 5-minute sweeper that drops expired
entries even when no lookup retouches the key. Lookup-only eviction
was fine for small libraries but unbounded for users who browse and
abandon thousands of files within a TTL window. Lazy + sync.Once.
- probe_cache TTL eviction now re-checks under the write lock so a
concurrent re-insert isn't accidentally evicted.
- probe_cache size-change test now Chtimes the file back to its
original mtime so only `size` differs between store and lookup
keys — properly exercises the size-check path.
- New TestProbeCache_SweepDropsExpired covers the janitor sweep.
- CHANGELOG: backfilled missing compare links 0.6.4 → 0.9.9.
- Stale "line ~1119" reference in VideoToolbox comment dropped; the
bitrate block moved a few lines and the comment was already wrong.
Two issues with the 0.9.9 preset retune:
1. applyDefaults was filling Preset="veryfast" before
ResolveEncoderProfile got to pick the latency-biased default, so the
"superfast" change never reached users with a freshly-generated
config.toml — only those who left the field empty saw it.
2. The configured preset was being passed through to every encoder.
That's only valid for libx264 (ultrafast…veryslow); NVENC uses p1-p7
and rejects anything else, QSV uses its own subset. A user with NVENC
+ preset="veryfast" would have ffmpeg reject the argv.
Now:
- TranscodeConfig.Preset documented as libx264-only with the full
range + advice on quality vs first-start latency.
- Default in applyDefaults is empty (was "veryfast") so the engine
fills in "superfast" on libx264.
- ResolveEncoderProfile ignores configuredPreset for vendor encoders
(NVENC sticks to p3, QSV to veryfast, VideoToolbox has no preset
knob). Test cases updated to lock in this behaviour.
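The resolution rule in the list above fits in one switch. A sketch with a hypothetical function name; the encoder identifiers are ffmpeg's, and the preset values are the ones stated in this commit:

```go
package main

import "fmt"

// resolvePreset picks the -preset value for an encoder. Only libx264
// honours the configured preset; vendor encoders use fixed values because
// they reject libx264 preset names.
func resolvePreset(encoder, configured string) string {
	switch encoder {
	case "libx264":
		if configured != "" {
			return configured
		}
		return "superfast" // latency-biased default
	case "h264_nvenc":
		return "p3"
	case "h264_qsv":
		return "veryfast"
	default:
		return "" // e.g. VideoToolbox: no preset knob, emit no flag
	}
}

func main() {
	fmt.Println(resolvePreset("libx264", ""))
	fmt.Println(resolvePreset("libx264", "medium"))
	fmt.Println(resolvePreset("h264_nvenc", "veryfast"))
	fmt.Printf("%q\n", resolvePreset("h264_videotoolbox", "fast"))
}
```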
Users who want better quality at slower first-play should set
download.transcode.preset = "veryfast" (previous default) / "faster" /
"fast" / "medium" in their config.toml.
Reduces first-segment latency on cache MISS so the player doesn't sit on
"preparando sesión". Three independent levers:
1. ProbeFile memoised by (path, mtime, size) for 30 min — second play of
the same source skips ffprobe (1-3 s on 50+ GB MKVs).
2. HLS encoder presets biased for latency over quality:
- libx264 default veryfast → superfast (~15-20% faster, marginal
quality loss at 5-25 Mbps target bitrates).
- NVENC: -preset p4 -tune hq → -preset p3 -tune ll. First-segment
~0.8 s on RTX-class GPUs (was ~1.5 s).
- QSV: -preset medium → -preset veryfast (keeps look_ahead=0).
- VideoToolbox: adds -realtime 1 (was unset). Bitrate args still
drive rate control; -q:v dropped to avoid the silent conflict
where ffmpeg ignored it under -b:v.
3. Per-session log surfaces encoder + accel + preset so "first-start
was slow" complaints can be triaged from the journal alone.
Diagnostic helpers (DetectHWAccelDiagnostic + HWAccelDiagnostic) added
for future wiring into daemon startup / agent register; users today can
already inspect via `unarr probe-hwaccel`.
Web: AgentsTab profile page now shows the agent's chosen encoder
(amber if software libx264, green if HW) plus the transcode-resolution
cap. Hidden for pre-0.9.9 agents that haven't reported hwAccel.
Drops the custom WebRTC DataChannel pipeline + pion deps + WSS signaling
client + wire framing. Every in-browser playback now uses HLS over HTTP
from the daemon (Tailscale/LAN/UPnP). Browser P2P never re-enabled.
Wire renames (incompatible with web < 2026-05-26): agent.WebRTCSession
=> agent.StreamSession, SyncResponse.WebRTCSessions (JSON: webrtcSessions)
=> StreamSessions (JSON: streamSessions). MIN_AGENT_VERSION is bumped
to 0.9.4 on the web side so older agents see an upgrade card.
Also fixes the libx264 'VBV bitrate > level limit' abort by clamping
the encoder bitrate to the effective output height instead of the
requested label (carried over from the prior 0.9.3 unreleased work).
The seed_file vertical (mode=seed_file handler + engine.SeedFile) was
retired with the in-browser P2P player. [downloads.webrtc] config block
deleted; existing TOML files with the section still parse fine.
Asking for 2160p quality on a 720p source kept the daemon's qcap.VideoBitrate
at 25 Mbps even after outputHeight was clamped to the source. The level
H264LevelForHeight picks for the 720p output is 3.1 / 4.0, which rejects any
VBV >20 Mbps — libx264 then exited with "VBV bitrate (25000) > level limit"
on every restart, ffmpeg auto-restarted 3 times, master.m3u8 never appeared,
and the player got stuck at "Preparando sesión".
Re-derive the (height, bitrate) cap from the EFFECTIVE outputHeight via the
new capForHeight helper. Result: 720p source asked for 2160p → outputs 720p
with the 3500 kbps bitrate the level actually accepts. ffmpeg runs cleanly,
master.m3u8 appears, playback starts.
The web also clamps effectiveQuality to source resolution before the session
row is written, so the daemon mostly receives sane labels. This change keeps
the daemon defensive against (a) older web clients that still ask for
upscaled qualities, and (b) future quality="original" requests where qcap
is empty and Transcode.VideoBitrate could overshoot the level too.
Phase 1 security audit follow-up:
- Reject HLS session IDs that aren't safe filesystem components
(regex allowlist) to defend against path traversal via a buggy or
compromised server. Applied at StartHLSSession and at the /hls URL
handler; invalid IDs share the 404 of unknown sessions so the
accepted format isn't enumerable.
- /health no longer leaks the active filename, taskID prefix or client
IP to non-loopback callers. Uses net.IP.IsLoopback so IPv4-mapped
IPv6 (::ffff:127.0.0.1) is recognised and the empty-string parse
failure stops bypassing the boundary.
- unrar/7z passwords now travel through stdin instead of -p<password>
in argv, removing /proc/<pid>/cmdline disclosure. Control characters
in the password are rejected up front so a hostile NZB cannot feed
extra prompt answers. Both invocations are bounded by a 30-minute
context to stop indefinite hangs if the tool ever decides to prompt.
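The first two checks can be sketched with the standard library. The allowlist pattern and both function names are hypothetical (the commit does not state the accepted format); net.IP.IsLoopback is the call the commit names.

```go
package main

import (
	"fmt"
	"net"
	"regexp"
)

// sessionIDRe is a hypothetical allowlist: a single path component made of
// letters, digits, '-' and '_', so it can never contain '/' or "..".
var sessionIDRe = regexp.MustCompile(`^[A-Za-z0-9_-]{1,64}$`)

// validSessionID reports whether id is safe to use as a directory name.
func validSessionID(id string) bool {
	return sessionIDRe.MatchString(id)
}

// isLoopbackRemote reports whether an http.Request.RemoteAddr-style value
// is a loopback address. Unparseable input is treated as non-loopback.
func isLoopbackRemote(remoteAddr string) bool {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // no port present; try the raw value
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	fmt.Println(validSessionID("abc-123_X"), validSessionID("../etc/passwd"))
	fmt.Println(isLoopbackRemote("127.0.0.1:5000"))
	fmt.Println(isLoopbackRemote("[::ffff:127.0.0.1]:8080"))
	fmt.Println(isLoopbackRemote(""))
}
```

Checking `ip != nil` before IsLoopback makes the parse-failure case an explicit denial instead of relying on how a nil IP happens to compare.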
Three related fixes around 4K-source transcoding that left the web
player stuck on "preparing session" with no useful diagnostics:
1. Dynamic -level:v derived from output height (hls.go, transcoder.go).
The previous fixed "4.0" silently rejected anything taller than 1080p
inside libx264 — "frame MB size > level limit", "DPB size > level
limit" — and emitted unplayable segments. Helper H264LevelForHeight()
now picks 4.0 / 5.0 / 5.1 / 6.0 from the actual encode height.
2. New `unarr probe-hwaccel` diagnostic command. Lists the HW encoders
compiled into ffmpeg, the device files / drivers present, and the
backend the daemon would actually pick today. Surfaces the canonical
gotcha: a host with an RTX 3090 + nvidia-smi but a Homebrew ffmpeg
built without --enable-nvenc still falls back to software libx264.
3. Register payload now includes hwAccel + maxTranscodeHeight so the web
side can suggest a smaller alternate quality before the user even
tries to play a 4K source on a software-only host. Software-only =
1080p cap, any HW backend = 2160p cap.
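A rough sketch of items 1 and 3. The four level values and the two caps
come from the text above; the height thresholds, the lower-case function
names and the "empty string means software" convention are assumptions
made for illustration.

```go
package main

import "fmt"

// h264LevelForHeight picks an H.264 level from the encode height.
// The thresholds are assumptions; the commit only names the four levels.
func h264LevelForHeight(height int) string {
	switch {
	case height <= 1080:
		return "4.0"
	case height <= 1440:
		return "5.0"
	case height <= 2160:
		return "5.1"
	default:
		return "6.0"
	}
}

// maxTranscodeHeight returns the cap advertised in the register payload:
// 1080 for software-only hosts, 2160 for any hardware backend.
// An empty hwAccel is assumed to mean "software only".
func maxTranscodeHeight(hwAccel string) int {
	if hwAccel == "" {
		return 1080
	}
	return 2160
}

func main() {
	for _, h := range []int{720, 1080, 1440, 2160, 4320} {
		fmt.Printf("height=%d level=%s\n", h, h264LevelForHeight(h))
	}
	fmt.Println("software cap:", maxTranscodeHeight(""))
	fmt.Println("nvenc cap:", maxTranscodeHeight("nvenc"))
}
```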
Follow-ups from /critico review on commits eb2548f + 40e7977. No
functional change.
- engine/hls.go restartFromSegment now reads `s.exited` under
`readyMu`. The field is documented as readyMu-protected (see field
declaration) and writers in waitFFmpeg / pollSegments hold the lock
consistently; the previous direct read produced a `go test -race`
warning under concurrent restart paths.
- engine/hls.go renderMasterPlaylist drops the `defaultIdx := -1`
branch that was unreachable (no rendition was ever flagged DEFAULT
or AUTOSELECT). Output is unchanged; the source is just shorter.
- engine/hls.go subtitle "(forzados)" suffix → "(forced)". Daemon
convention is English; the web client localises if needed.
- engine/hls.go hlsStderrCapture now also caps single-write payloads
larger than maxStderrBuf (was only capping the cumulative buffer).
- engine/hls.go waitFFmpeg restart-window reset drops the redundant
`!IsZero` guard: a zero time is far enough in the past that the
`> restartWindow` branch covers it.
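For the hlsStderrCapture item, a capped writer could look roughly like
this. The type name cappedBuffer and the keep-the-tail policy are
assumptions; only the maxStderrBuf name and the 64 KiB figure appear in
this log.

```go
package main

import (
	"fmt"
	"sync"
)

const maxStderrBuf = 64 * 1024 // 64 KiB

// cappedBuffer keeps at most the last maxStderrBuf bytes written to it.
type cappedBuffer struct {
	mu  sync.Mutex
	buf []byte
}

// Write implements io.Writer. It always reports the full input length
// so the process writing to the pipe never sees a short write.
func (c *cappedBuffer) Write(p []byte) (int, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	n := len(p)
	// Cap a single oversized payload before it touches the buffer.
	if len(p) > maxStderrBuf {
		p = p[len(p)-maxStderrBuf:]
	}
	c.buf = append(c.buf, p...)
	// Cap the cumulative buffer, dropping the oldest bytes.
	if over := len(c.buf) - maxStderrBuf; over > 0 {
		copy(c.buf, c.buf[over:])
		c.buf = c.buf[:maxStderrBuf]
	}
	return n, nil
}

func main() {
	var c cappedBuffer
	c.Write(make([]byte, 100*1024)) // one write larger than the cap
	c.Write([]byte("tail"))
	fmt.Println(len(c.buf), string(c.buf[len(c.buf)-4:]))
}
```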
Reliability hardening pass for the HLS daemon. None of these change the
public API, all reduce the chances of an end-user seeing a broken
session in production.
- engine/hls.go waitFFmpeg now supervises ffmpeg: on a non-graceful
exit while the session is still in use, restart from the last good
segment up to 3 times within a 60 s window. Beyond that we give up
and log the file as broken, which is better than a perpetually black
player with no error.
- engine/hls.go CleanupHLSOrphanDirs() removes tmpdirs older than 1 h
at startup; cmd/daemon.go calls it before streamSrv.Listen so a
daemon crash + restart doesn't leak gigabytes of segment files.
- engine/hls.go StartHLSSession wraps ffprobe in a 15 s timeout. A
hung probe on a slow remote fs would otherwise block the goroutine
forever and the player would stay on "preparing session".
- engine/hls.go hlsStderrCapture buffer is capped at 64 KiB; a
misbehaving ffmpeg that emits megabytes without newlines used to
grow daemon memory unbounded.
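The restart policy in the first item (3 restarts per 60 s window) can
be sketched as a small budget type. restartBudget, maxRestarts and the
decisions helper are hypothetical names, and the real supervisor may
track the window differently.

```go
package main

import (
	"fmt"
	"time"
)

const (
	maxRestarts   = 3
	restartWindow = 60 * time.Second
)

// restartBudget counts restarts inside the current window.
type restartBudget struct {
	windowStart time.Time
	count       int
}

// allow reports whether one more restart may be attempted at time now.
// A zero windowStart is far in the past, so the first call always
// opens a fresh window.
func (b *restartBudget) allow(now time.Time) bool {
	if now.Sub(b.windowStart) > restartWindow {
		b.windowStart = now
		b.count = 0
	}
	if b.count >= maxRestarts {
		return false
	}
	b.count++
	return true
}

// decisions replays allow() at the given second offsets from a fixed
// base time and returns each verdict.
func decisions(offsetsSec ...int) []bool {
	base := time.Unix(1700000000, 0)
	var b restartBudget
	out := make([]bool, 0, len(offsetsSec))
	for _, s := range offsetsSec {
		out = append(out, b.allow(base.Add(time.Duration(s)*time.Second)))
	}
	return out
}

func main() {
	// Crashes at 0 s, 10 s, 20 s, 30 s and 61 s after the first one.
	fmt.Println(decisions(0, 10, 20, 30, 61))
}
```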
Follow-ups on the daemon HLS pipeline (0fc0e1c):
- engine/hls.go HLSSession.Register now closes every other active
session in the registry. Modeled as "one viewer == one transcode" so
repeated quality switches or page reloads don't leave orphan ffmpegs
saturating the CPU until the idle sweeper reaps them 30 min later.
- engine/hls.go restartFromSegment kills + respawns ffmpeg with
-ss / -output_ts_offset / -start_number when the browser asks for a
segment far ahead of the writer head. Segments already on disk stay
cached. Without this, a user dragging the scrubber to minute 30 of a
fresh stream blocks until the encoder reaches minute 30 in real time.
- engine/hls.go subtitle disambiguation: never set DEFAULT=YES on any
rendition (anime forced "signs only" tracks were autoselected and
rendered nothing during opening dialogue, looking broken). Names get
numeric suffixes when language is duplicated; FORCED tracks get a
"(forzados)" suffix.
- engine/hls.go ProbeInfo() exposes codec / audio / subtitle metadata
to the new GET /hls/<id>/probe.json endpoint for the player's info
badge + bandwidth logic.
- engine/hls.go scale chain fix: chains a trunc(iw/2)*2 scale after
the height cap so libx264 stops rejecting odd widths (853x480 etc.).
- engine/hls.go HW encoder tuning: NVENC -preset p4 -rc vbr -tune hq,
QSV -preset medium.
- engine/stream_server.go routes /hls/<id>/probe.json to the session.
- cmd/daemon.go runs an idle sweeper goroutine every 5 min, reaping
sessions whose last segment fetch was >30 min ago.
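To make the restartFromSegment item concrete, here is one way the seek
arguments could be derived from a segment index. It assumes the 4 s
segment length used by the HLS path; seekArgs is a hypothetical helper,
and where each flag sits in the real ffmpeg command line is not shown.

```go
package main

import (
	"fmt"
	"strconv"
)

const segmentSeconds = 4 // segment length, assumed from the HLS path

// seekArgs returns the ffmpeg arguments needed to resume encoding at
// the given segment index.
func seekArgs(segment int) []string {
	start := strconv.Itoa(segment * segmentSeconds)
	return []string{
		"-ss", start, // seek the input to the segment start
		"-output_ts_offset", start, // keep output timestamps on the source timeline
		"-start_number", strconv.Itoa(segment), // continue the segment numbering
	}
}

func main() {
	// Minute 30 of the stream is second 1800, i.e. segment 450.
	fmt.Println(seekArgs(450))
}
```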
Introduces an HLS-over-HTTP path as Plan B for in-browser streaming. The
WebRTC + MSE pipeline keeps working untouched; the new path is selected
when the backend sets transport="hls" on a streaming session.
Daemon scope:
- engine/hls.go: HLSSession + HLSSessionRegistry. Spawns ffmpeg with
-f hls -hls_segment_type fmp4 and -force_key_frames aligned with 4 s
segments. Pre-renders master + media playlists from the probe duration
so the browser knows the total timeline before any segment exists,
fixing seek/duration/pause/multi-track issues seen with the live fMP4
pipe.
- engine/probe.go: enumerate every audio + subtitle track instead of
collapsing to a single default audio track.
- engine/stream_server.go: route /hls/<id>/{master.m3u8,video/...,
subs/...} to the matching session. Emit a synthesised single-VTT
subtitle playlist per text track; bitmap subs (PGS/DVB) are skipped
silently.
- cmd/daemon.go: branch on WebRTCSession.Transport == "hls" to register
an HLS session instead of running the legacy DataChannel pump.
- agent/types.go: WebRTCSession.Transport + AudioIndex fields.
Backend + web sides land in a follow-up commit.
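As an illustration of pre-rendering the media playlist from the probe
duration, a minimal sketch follows. The file names (init.mp4,
seg-N.m4s), the VOD playlist type and the function names are
assumptions, not the daemon's actual output.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

const segmentSeconds = 4.0

// segmentCount returns how many 4 s segments cover durationSec.
func segmentCount(durationSec float64) int {
	return int(math.Ceil(durationSec / segmentSeconds))
}

// renderMediaPlaylist builds a complete playlist up front, so a player
// knows the full timeline before any segment has been encoded.
func renderMediaPlaylist(durationSec float64) string {
	var b strings.Builder
	b.WriteString("#EXTM3U\n")
	b.WriteString("#EXT-X-VERSION:7\n")
	fmt.Fprintf(&b, "#EXT-X-TARGETDURATION:%d\n", int(segmentSeconds))
	b.WriteString("#EXT-X-PLAYLIST-TYPE:VOD\n")
	b.WriteString("#EXT-X-MEDIA-SEQUENCE:0\n")
	b.WriteString("#EXT-X-MAP:URI=\"init.mp4\"\n") // fMP4 init segment (hypothetical name)
	for i := 0; i < segmentCount(durationSec); i++ {
		// The last segment may be shorter than the target duration.
		d := math.Min(segmentSeconds, durationSec-float64(i)*segmentSeconds)
		fmt.Fprintf(&b, "#EXTINF:%.3f,\nseg-%d.m4s\n", d, i)
	}
	b.WriteString("#EXT-X-ENDLIST\n")
	return b.String()
}

func main() {
	fmt.Print(renderMediaPlaylist(10))
}
```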