fix(transcoder): force aac stereo 48khz + frag_duration for mse compat
Two transcoder fixes for browser MediaSource Extensions parsing: 1. -ar 48000 -ac 2 on the audio output. Source 5.1 / 7.1 streams produced a moov atom Chrome CHUNK_DEMUXER refuses to parse, even when the video metadata is fine and a non-MSE video element accepts the same file. Forcing AAC-LC stereo 48 kHz makes the moov shape MSE-compatible. 2. -frag_duration 1000000 (1 second) so each moof+mdat fragment caps at ~1s of media. Without it, ffmpeg only splits at keyframes and high- bitrate 1080p produces 8 MiB+ mdat boxes — MSE waits for the whole mdat before parsing the first fragment, so playback never starts. 3. -movflags +negative_cts_offsets so b-frames carry the right pts/dts offsets and the playhead doesn't reset every fragment. 4. New range_req debug log to make sizing bugs greppable.
This commit is contained in:
parent
27fe84f2a0
commit
81abc4acca
1 changed files with 24 additions and 8 deletions
|
|
@ -259,18 +259,34 @@ func buildFFmpegArgs(filePath string, opts TranscodeOpts) []string {
|
|||
filterChain = "format=yuv420p,setparams=colorspace=bt709:color_trc=bt709:color_primaries=bt709:range=tv"
|
||||
}
|
||||
args = append(args, "-vf", filterChain)
|
||||
args = append(args, "-c:a", "aac", "-b:a", coalesce(opts.AudioBitrate, "192k"))
|
||||
// Force AAC-LC stereo 48 kHz so MSE's CHUNK_DEMUXER accepts the moov.
|
||||
// 5.1 / 7.1 source streams produce a moov shape that MSE refuses to
|
||||
// parse (the <video src=blob:> demuxer is more forgiving), so we
|
||||
// always downmix to stereo and resample to 48 kHz here. Source
|
||||
// material that's already stereo passes through losslessly aside
|
||||
// from the re-encode.
|
||||
args = append(args,
|
||||
"-c:a", "aac",
|
||||
"-b:a", coalesce(opts.AudioBitrate, "192k"),
|
||||
"-ar", "48000",
|
||||
"-ac", "2",
|
||||
)
|
||||
}
|
||||
|
||||
// Common output flags — fragmented MP4 to a single pipe.
|
||||
// NO faststart: that flag rewrites the moov atom to the front of the
|
||||
// file as a SECOND pass after encoding finishes, which means the
|
||||
// browser never sees a moov until ffmpeg exits. For live transcoding
|
||||
// we need empty_moov (write a placeholder up front) so MSE can start
|
||||
// decoding the very first fragment. faststart is only safe for
|
||||
// already-finished files.
|
||||
//
|
||||
// * empty_moov + default_base_moof: write a header-only init segment
|
||||
// up front so MSE can start decoding before the file is finished.
|
||||
// * frag_duration=1s: cap each moof+mdat at ~1 second of media. Without
|
||||
// this, ffmpeg only splits at keyframes, which on a high-bitrate
|
||||
// 1080p stream produces 8 MiB+ mdat boxes — MSE refuses to parse
|
||||
// the first fragment until the whole mdat lands, so playback never
|
||||
// starts.
|
||||
// * negative_cts_offsets: lets b-frames carry the right pts/dts so
|
||||
// decoders don't reset the playhead to 0 every fragment.
|
||||
args = append(args,
|
||||
"-movflags", "frag_keyframe+empty_moov+default_base_moof",
|
||||
"-movflags", "+frag_keyframe+empty_moov+default_base_moof+negative_cts_offsets",
|
||||
"-frag_duration", "1000000",
|
||||
"-f", "mp4",
|
||||
"pipe:1",
|
||||
)
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue