audio: change format negotiation, remove channel remix fudging

The audio format negotiation code was pretty complicated, although the idea was simple: when the format changes (or on the first audio frame), filter only the new frame through the entire filter chain, discard the resulting frame, but use its format to initialize the AO.

This was useful for "fudging" the channel remix behavior (upmix or downmix), and moving it before other filters. Apparently this was useful for things like DRC filters, which might work better in stereo, and which can only achieve the desired volume levels if they run before a downmix, since the downmix modifies the volume. This mechanism was introduced in commit 60048b7eb957b (whose commit message also describes it as an "idiotic heuristic"). Knowing the output format is inherently necessary for this, because otherwise we can't know what the hell the user defined filters will do.

There were problems with robustness. Some filters needed more than one frame. Resampling in particular would discard initial audio at high resampling ratios. Some filters might drop audio intentionally (like clipping data on timestamp ranges). There were also allegations that some decoders output 0 length frames (although that is invalid in libavcodec). The state machine was also excessively complex and hard to understand.

There are 3 things that could have been done:

1. Fix the robustness problems by adding more heuristics, like repeating audio frames or simply decoding several frames. Since filters can behave differently, this would have added lots of complexity.

2. Make use of libavfilter's format negotiation, and add the same to the mpv builtin filters. This is sort of annoying, because the format negotiation in libavfilter changes the state of the filters. It also reports only some parameters (almost all for audio, but with a lot of holes for video). It would remove some of the state machine, but not all.

3. Drop the channel remix fudging, and do the same as the video chain. This does not require format negotiation. Instead, you can just filter the audio frames, and look at what comes out. If nothing comes out, simply never create an AO.

This commit selects option 3. It removes the remix fudging, which means the loss of a feature. Users can instead add "--af=format=channels=2" before their DRC filter, or something. I'm also considering changing the default for --audio-channels back to stereo, and downmixing in the decoder or at the start of the filter chain, which would give the same results, except requiring more configuration.

Implementation-wise, this is still a bit different from the video path. The VO always remains the same instance, while the AO might have to be recreated on configuration changes. This still requires explicit format change handling plus draining of old data, but by putting it into f_autoconvert, not much new code is needed.
author: wm4 <wm4@nowhere> 2018-04-07 14:38:40 +0200
committer: Jan Ekström <jeebjp@gmail.com> 2018-04-15 23:11:33 +0300
commit: 3c123281a7eb3181a9706a52d0148f70aa65cff5 (patch)
tree: 86fe71febdffb4bead600b47371f28c7838ff3d2 /filters/f_autoconvert.c
parent: 4b48966d87d7759346f989c44e8ab3cb405d0037 (diff)
1 files changed, 36 insertions, 6 deletions
diff --git a/filters/f_autoconvert.c b/filters/f_autoconvert.c
index ee72474f09..78b716623e 100644
--- a/filters/f_autoconvert.c
+++ b/filters/f_autoconvert.c
@@ -45,6 +45,9 @@ struct priv {
     double audio_speed;
     bool resampling_forced;
 
+    bool format_change_blocked;
+    bool format_change_cont;
+
     struct mp_autoconvert public;
 };
 
@@ -283,17 +286,30 @@ static void handle_audio_frame(struct mp_filter *f)
     struct mp_chmap chmap = {0};
     mp_aframe_get_chmap(aframe, &chmap);
 
-    if (afmt == p->in_afmt && srate == p->in_srate &&
-        mp_chmap_equals(&chmap, &p->in_chmap) &&
-        (!p->resampling_forced || p->sub.filter) &&
-        !p->force_update)
-    {
+    bool format_change = afmt != p->in_afmt ||
+                         srate != p->in_srate ||
+                         !mp_chmap_equals(&chmap, &p->in_chmap) ||
+                         p->force_update;
+
+    if (!format_change && (!p->resampling_forced || p->sub.filter))
         goto cont;
-    }
 
     if (!mp_subfilter_drain_destroy(&p->sub))
         return;
 
+    if (format_change && p->public.on_audio_format_change) {
+        if (p->format_change_blocked)
+            return;
+
+        if (!p->format_change_cont) {
+            p->format_change_blocked = true;
+            p->public.
+                on_audio_format_change(p->public.on_audio_format_change_opaque);
+            return;
+        }
+        p->format_change_cont = false;
+    }
+
     p->in_afmt = afmt;
     p->in_srate = srate;
     p->in_chmap = chmap;
@@ -373,6 +389,17 @@ static void process(struct mp_filter *f)
     mp_subfilter_continue(&p->sub);
 }
 
+void mp_autoconvert_format_change_continue(struct mp_autoconvert *c)
+{
+    struct priv *p = c->f->priv;
+
+    if (p->format_change_blocked) {
+        p->format_change_cont = true;
+        p->format_change_blocked = false;
+        mp_filter_wakeup(c->f);
+    }
+}
+
 static bool command(struct mp_filter *f, struct mp_filter_command *cmd)
 {
     struct priv *p = f->priv;
@@ -394,6 +421,9 @@ static void reset(struct mp_filter *f)
     struct priv *p = f->priv;
 
     mp_subfilter_reset(&p->sub);
+
+    p->format_change_cont = false;
+    p->format_change_blocked = false;
 }
 
 static void destroy(struct mp_filter *f)
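
The blocked/continue handshake added to handle_audio_frame() can be hard to follow from the hunks alone. The following is a minimal standalone sketch of the same two-flag idea, not the actual mpv code: struct fmt_gate and every fmt_gate_* name are hypothetical, the format is reduced to a sample rate, and the subfilter draining and mp_filter_wakeup() call are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the filter's private state. */
struct fmt_gate {
    int in_srate;                /* last accepted sample rate (0 = none yet) */
    bool format_change_blocked;  /* owner was notified and has not answered */
    bool format_change_cont;     /* owner allowed the pending change */
    void (*on_format_change)(void *opaque);
    void *opaque;
};

/* Returns true if a frame with this sample rate may be processed now,
 * false if it has to stay queued until the owner calls fmt_gate_continue(). */
static bool fmt_gate_handle_frame(struct fmt_gate *g, int srate)
{
    bool format_change = srate != g->in_srate;

    if (format_change && g->on_format_change) {
        if (g->format_change_blocked)
            return false;                   /* still waiting for the owner */
        if (!g->format_change_cont) {
            g->format_change_blocked = true;
            g->on_format_change(g->opaque); /* notify the owner exactly once */
            return false;
        }
        g->format_change_cont = false;      /* consume the permission */
    }

    g->in_srate = srate;
    return true;
}

/* Called by the owner once it is ready for the new format
 * (for example after draining and recreating its output). */
static void fmt_gate_continue(struct fmt_gate *g)
{
    if (g->format_change_blocked) {
        g->format_change_cont = true;
        g->format_change_blocked = false;
    }
}

/* Test callback: counts how often the owner was notified. */
static void count_notification(void *opaque)
{
    ++*(int *)opaque;
}

/* Walks through one full handshake and returns the notification count. */
static int fmt_gate_demo(void)
{
    int calls = 0;
    struct fmt_gate g = { .on_format_change = count_notification,
                          .opaque = &calls };

    assert(!fmt_gate_handle_frame(&g, 48000)); /* first frame: blocked */
    assert(!fmt_gate_handle_frame(&g, 48000)); /* retry: still blocked */
    fmt_gate_continue(&g);                     /* owner is ready */
    assert(fmt_gate_handle_frame(&g, 48000));  /* now accepted */
    assert(fmt_gate_handle_frame(&g, 48000));  /* same format: no gating */
    assert(!fmt_gate_handle_frame(&g, 44100)); /* new format: blocked again */
    return calls;
}
```

In this model the frame that triggers a format change is never dropped. It is held back until the owner calls fmt_gate_continue(), which mirrors how the patch returns from handle_audio_frame() without consuming the frame and resumes after mp_autoconvert_format_change_continue().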