fix(stream): don't cache transient libplacebo probe timeouts

Second critico pass on the functional probe. - The probe does real Vulkan device init, which can transiently fail when the box is busy (notably the startup warm racing the encode benchmark). Caching that timeout as a permanent 'no' would pin HDR to the zscale CPU chain until daemon restart. Now a deadline is NOT cached — only a clean non-zero exit (filter absent / no ICD), which is a stable result. zscale stays cached as before (cheap deterministic grep, can't flake). - Surface the exec error when ffmpeg never produced stderr (timeout / ENOENT): the fallback log now shows err.Error() instead of a blank tail, so 'no Vulkan' is distinguishable from 'ffmpeg never ran'. - Dockerfile comment: clarify the Vulkan ICD (not GLX) is the load-bearing mount that 'graphics' adds; 'compute' alone doesn't mount it. Probe still returns true on a Vulkan host (verified); engine tests green.
2026-06-03 10:48:30 +02:00 · 2026-06-03 10:48:30 +02:00 · e298ff6c05
commit e298ff6c05
parent 5e5a719f27
2 changed files with 28 additions and 12 deletions
--- a/7
+++ b/7
@ -95,9 +95,10 @@ ENV XDG_DATA_HOME=/data

 # NVIDIA passthrough defaults. `--gpus all` alone only grants the "utility" +
 # "compute" capabilities; nvenc needs "video", and "graphics" makes the runtime
-# mount the NVIDIA Vulkan ICD (nvidia_icd.json + GLX libs) so ffmpeg's libplacebo
-# filter (GPU HDR tonemap, paired with libvulkan1 above) can create a Vulkan
-# device. Baking these here means a plain `docker run --gpus all` (or the compose
+# mount the NVIDIA Vulkan ICD (nvidia_icd.json — the load-bearing piece — plus
+# GLX/EGL libs) so ffmpeg's libplacebo filter (GPU HDR tonemap, paired with
+# libvulkan1 above) can create a Vulkan device. "compute" alone does NOT mount
+# the ICD. Baking these here means a plain `docker run --gpus all` (or the compose
 # device reservation) lights up HW transcode + GPU tonemap with zero extra flags.
 # Harmless when no GPU is attached.
 ENV NVIDIA_VISIBLE_DEVICES=all