mpv - a free, open source, and cross-platform media player

	Commit message (Collapse)	Author	Age	Files	Lines
*	vo: fix missed option updates under rare circumstances	wm4	2019-09-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Dear diary, today I fixed a shitty bug that was all my fault because I made a horrible mess. (Except it was a horrible mess before I even touched this shit, but let's not blame others.) Sometimes, updates to VO option that control video sizing (like panscan) didn't update the screen correctly. They were delayed until the next option change or so. It turns out that if the option update happens at the "same" time as a VOCTRL, update_opts() doesn't actually notify the vo_driver of the change. This in turn happened because run_control() called m_config_cache_update(). The latter function returns true if the options changed since the last call, and update_opts() also calls it (on the same config cache) for the same purpose. The update_opts() call, which is triggered by a third mechanism, comes later, but the cache update call will return false (as it should). Basically, given the config API, you can't act differently on multiple update calls and expect it to work. The skipped handling in update_opts() meant that the notification required to apply the changed option wasn't run. Fix this by simply calling update_opts() directly instead. Now there's only 1 m_config_cache_update() call on this specific instance. Fix the call in run_reconfig() too, so the previous sentence isn't a lie (but it probably doesn't make a difference in practice due to certain details). I'm not sure how I even ran into this sort-of race condition. The VOCTRL that messed up the option update was VOCTRL_UPDATE_PLAYBACK_STATE, which happens semi-regularly. Why this config cache shit and all the other shit? Rediscovering this crap wasn't pleasant. It's a bunch of hacks that became necessary when the ancient MPlayer architecture made it hard to move the VO to a separate thread. All the VO code typically accesses vo->opts (whose fields all used to be global variables in MPlayer). The frontend changes these on user input. Putting locking around all the options would be a nightmare, and keeping a copy of the options in the thread was much simpler. You need a way to propagate option changes, notify the thread, and update the local copy too. And the result of these thoughts was the config cache mechanism. In this specific case, the relevant cache update call in update_opts() triggers a VOCTRL_SET_PANSCAN to the VO driver, which isn't related to its former function anymore. Instead, it causes the VO driver to update the video sizing/placing options, which the generic VO code can't do. (Mostly because the VO driver includes the windowing stuff and is responsible for resizing etc. itself.) VOCTRLs sent by the frontend are even worse. MPlayer had no real runtime option change mechanism. Some options were vaguely duplicated by properties, so you could effectively change those options at runtime. Each of these options had its own VOCTRL, which still exist today, e.g. VOCTRL_FULLSCREEN, or VOCTRL_ONTOP. I tried to make all options runtime changeable, and to unify properties with options. But I couldn't be bothered with updating all VO drivers to listen to option changes directly, because that would be pretty tedious. So the property code is still all there and sends the old VOCTRLs. But of course you need to sync up the options, which is why the run_control() code did that. (Unrelated: VO_EVENT_FULLSCREEN_STATE is the worst shithack of them all. Currently, only the frontend can actually write to options (for awful reasons), so if the fullscreen state changes due to outside interaction, the VO driver can't update the corresponding option fields. So the VO notifies the frontend with said VO_EVENT_, and the frontend then sends VOCTRL_GET_FULLSCREEN, and updates the global copy of the option with the value returned by that. I still like to think the situation is not that bad considering the monstrous effort of converting single-threaded code that had hundreds of options in global variables to multi-threaded code with no global variables at all.)
*	win_state: silence a valgrind warning	wm4	2019-09-19	1	-1/+1
\| \| \| \| \| \| \|	m_geometry_apply() will read and modify the dummy variable. It's not actually used for anything, but valgrind will still warn against uninitialized data. I'm not sure whether this was UB, but in any case it's annoying when running valgrind.
*	vo_gpu: hwdec_vaegl: silence confusing message during probing	wm4	2019-09-19	1	-2/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During probing on a system with AMD GPU, mpv used to output the following messages if hardware decoding was enabled: [ffmpeg] AVHWFramesContext: Failed to create surface: 2 (resource allocation failed). [ffmpeg] AVHWFramesContext: Unable to allocate a surface from internal buffer pool. This commit removed the message, with hopefully no other side effects. Long explanations follow, better don't read them, it's just tedious drivel about the details. People should learn to write concise commit messages, not drone on and on endlessly all while they have no fucking point. The code probes supported hardware pixel format, and checks whether they can be mapped as textures. av_hwdevice_get_hwframe_constraints() returns a list of hardware pixel formats in the valid_sw_formats field (the "sw" means software, but they're still hardware pixel formats, makes sense). This contained the format yuv420p, even though this is not a valid hardware format. Trying to create a surface of this type results in VA surface creation failure, upon which FFmpeg prints the error messages above. We'd be fine with this, except FFmpeg has a global log callback, and there's no way to suppress these messages without creating other issues. It turns out that FFmpeg's vaapi implementation returns all formats from vaQueryImageFormats() if no "hwconfig" is provided. This list includes yuv420p, which is probably supported for surface upload/download, but not as native format. Following FFmpeg's logic, it should not appear in the valid_sw_formats list, because formats for transfers are returned by another roundabout API. Idiotically, there doesn't seem to be any vaapi call that determines whether a format is a valid surface format. All mechanisms to do this are bound to a VAConfigID (= video codec or video processor), all while the actual surface creation API strangely does not take a VAConfigID (a big WTF). Also, calling the vaCreateSurfaces() API ourselves for probing is out of the question, because that functions is utterly and idiotically complex. Look at the FFmpeg code and how much effort it requires to setup a complete set of attributes - we can't duplicate this. So the only way left to do this is the most idiotic and tedious way: enumerating all VAProfile (and VAEntrypoints) to create all possible VAConfigIDs. Each of the VAConfigIDs is associated with a list of formats, which FFmpeg can return (by passing the ID along with the "hwconfig"), and which is probed separately. Note that VAConfigID actually refers to a dynamic instance of something, and creating a VAConfigID takes not only the VAProfile and the VAEntrypoint, but also an arbitrary attribute array. In theory, this means our attempt to get to know all possible configurations cannot work, but in practice this attribute array seems to be pointless for decoding and video processing, and FFmpeg doesn't use it (though the encoding path does use it). This probably just makes it _barely_ OK to do it this way. Could we discard all this probing shit, and somehow do it another way? Probably not. The EGL API for mapping surfaces doesn't even seem to provide a way to enumerate supported formats, we may not even know whether DRM/dmabuf interop is actually supported (AFAIR the EGL extensions are present even if they don't work), nor do we know whether the VAAPI driver supports this interop (not sure). So actually trying is the only way. Further, mpv initializes the decoder on a another thread, where you can't just access OpenGL state. This suckage is mostly to be blamed on OpenGL itself and its crazy thread boundedness. In theory, this could be done anyway (see how software decoding "direct rendering" tries to get around this). But to make it worse, the decoder never cares about the list of supported formats determined by this code; instead, f_autoconvert.c tries to deal with it and insert a video processor (well, good luck with this crap, I bet it doesn't even work). So this whole endeavor might be pointless, other than the fact that failed probing can disable use of vaapi (which is correct and necessary). But if you have a shovel, you don't use it to smash the flat end on the heap of shit that's piled up before you, or do you? While this method probably works, it's still orgasmically tedious. It was tedious before: we had to create a real surface, create a GL texture, map the surface with it, then destroy everything again. But the added code is tedious on its own. Highlights include the need to malloc a FFmpeg struct just to pass a single damn integer, the need to enumerate "entrypoints" for each VA profile, even though all profiles have exactly 1 entrypoint, and the kind of obnoxious way how vaapi requires you to preallocate arrays for returned things, even they could for example reasonably be returned as immutable arrays or have some other simpler API. The main grand fuckup is of course that vaapi requires a VAConfigID to query surface properties, but not for creating surfaces. This awkwardness even affected the FFmpeg API design, which has a "hwconfig" concept that is only used by vaapi (vaapi is only 1 out of 10 hardware decoding APIs supported by the FFmpeg hwcontext stuff). Maybe I'm just missing something. It's as if vaapi required setting radioactive shit on fire. Look how clean the native D3D11 code is instead. (Even the ANGLE code manages to avoid being this fucked up. Or the VDPAU code, despite supporting multiple mapping methods.) Another only barely related change is that the valid_sw_formats field can be NULL, and the API explicitly documents this. Technically, the mpv code was buggy for not checking this, although until now the FFmpeg implementation so far could not return it when we still passed NULL for the hwconfig parameter.
*	vo_gpu: hwdec_vaegl: refactor format probing	wm4	2019-09-19	1	-40/+64
\| \| \| \| \| \| \| \| \| \| \|	No functional changes, just preparation for the next commit. Split the probing into multiple functions. Prepare for the yet unused possibility to pass AVVAAPIHWConfig to probing. try_format_pixfmt() now assumes it can be called multiple times with the same format, so it filters the format. The format probing is now something like O(n^2) for n formats, but n will most likely remain something under 50 or so.
*	vo_gpu: remove vdpau/GLX backend	wm4	2019-09-19	2	-421/+0
\| \| \| \| \| \| \|	Useless garbage. This was once added to test whether vdpau presentation feedback could be used. Results were always unsatisfactory, and now vdpau is dead.
*	vo_gpu: remove mali-fbdev	wm4	2019-09-19	2	-162/+0
\| \| \| \| \|	Useless at this point, I don't even know if it still works, or how to test it.
*	aspect: add video margin options	wm4	2019-09-19	1	-5/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Semantics a bit questionable. This is done for the OSC (next commit), and a comment added the manpage explicitly states this. Meaning this is probably garbage and needs to revisit when the OSC changes and/or someone wants to use this margin feature for something else. Not sure about the subtitle thing. It's imaginable that someone uses these options to create empty borders for subtitles on the bottom, so subtitles should be located there. On the other hand, this gives a rather unpolished user experience when using the (later added) OSC feature to not overlap with the video. There's not much of a point if the OSC still overlaps the video. However, I'm too lazy to think about this, so it stays like it is.
*	aspect: fix some UB problems in corner cases	wm4	2019-09-19	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \|	--video-margin-ratio-left=0.2 --video-margin-ratio-right=0.9 (added in the the next commit) will set f_w to inf, resulting in some garbage being propagated. Later, the OSD margins are computed from values before various sanity clamping is applied, which makes libass suffer from bullshit values. I'm very sure it's OK and more correct to compute the OSD margins using the later values, but I'm not sure about that.
*	wayland: fix wl_proxy leak	dudemanguy	2019-09-19	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This one is probably not terribly obvious from just the valgrind log, but a wayland dev explained it to me just a second ago. Whenever mpv sends events to the screen with wl_display_dispatch, wayland internally allocates memory to a struct wl_proxy object if a new id is found. Quite a few more things happen to that proxy object, but eventually mpv stores the data on the client-side in a wrapper type of struct (struct wl_data_offer). mpv's data_device_listener keeps track of those proxies and frees the memory when appropriate. Of course, mpv is constantly sending events to the screen and does so until the user quits the player. What happens here is that one final wl_display_dispatch is called right before the user quits the player and before mpv's data_device_listener can handle that object. So the result is that you always have one extra dangling proxy that doesn't get properly freed. The solution is to just simply call wl_data_offer_destroy before closing the wl_display to free that final dangling wl_proxy.
*	drm: fix libmpv ABI breakage introduced in ↵	Anton Kindestam	2019-09-18	4	-13/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	351c083487050c88adb0e3d60f2174850f869018 Extending the client-allocated mpv_opengl_drm_params struct constituted a break of ABI that could cause UB. Create a clean break by deprecating "drm_params" and related structs and enum values, and replacing it with "drm_params_v2". Also fix some comments and code that wrongly assumed that open could return any other negative number than -1 for failure. This commit updates the libmpv version to 1.104
*	vo_gpu: hwdec_vaapi: Refactor Vulkan and OpenGL interops for VAAPI	Philip Langdale	2019-09-15	5	-326/+471
\| \| \| \| \| \|	Like hwdec_cuda, you get a big #ifdef mess if you try and keep the OpenGL and Vulkan interops in the same file. So, I've refactored them into separate files in a similar way.
*	vo_gpu: hwdec_cuda: Improve interop selection mechanism	Philip Langdale	2019-09-15	4	-15/+20
\| \| \| \| \| \|	This change updates the interop selection to match what I did for VAAPI, by iterating through an array of init functions until one of them works.
*	vo_gpu: x11: remove special vdpau probing, use EGL by default	wm4	2019-09-15	2	-29/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally, vo_gpu/vo_opengl considered the case of Nvidia proprietary drivers, which required vdpau/GLX, and Intel open source drivers, which require vaapi/EGL. Since window creation and GPU context creation are inseparable in mpv's internal API, it had to pick the correct API very early, or hardware decoding wouldn't work. "x11probe" was introduced for this reason. It created a GLX context (without showing the window yet), and checked whether vdpau was available. If yes, it used GLX, if not, it continued probing x11/EGL. (Obviously it couldn't always fail on GLX without vdpau, which is why it was a separate "probe" backend.) Years passed, and now the situation is different. Vdpau is dead. Nvidia drivers and libavcodec now provide CUDA interop, which requires EGL, and fixes some of the vdpau problems. AMD drivers now provide vaapi, which generally works better than vdpau. Intel didn't change. In particular, vaapi provides working HEVC Main10 support. In theory, it should work on vdpau too, with quality reduction (no 10 bit surfaces), but I couldn't get it to work. So always prefer EGL. And suddenly hardware decoding works. This is actually rather important, because HEVC is unfortunately on the rise, despite shitty encoders and unoptimized decoders. The latter may mean that hardware decoding works better than libavcodec. This should have been done a long, long time ago.
*	vo_gpu: correctly normalize src.sig_peak	Niklas Haas	2019-09-15	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	In some cases, src.sig_peak remains undefined as 0, which was definitely the case when using the OSD, since it never got passed through the usual color space normalization process. Most robust work-around is to simply force the normalization at the site where it's needed. This ensures this value is always valid and defined, to make the peak-dependent logic in these two functions always work. Fixes 4b25ec3a9d Fixes #6917 Fixes #6918
*	vo: add warning message to vo_vaapi and vo_vdpau	sfan5	2019-09-14	2	-0/+10
\| \| \| \| \|	These are a common source of bug reports, due to misconceptions that they are required to make use of hardware decoding.
*	vo_d3d11/context: fix crash due to ctx->ra is null pointer access	Hui Jin	2019-09-14	1	-2/+4
\| \| \| \|	'ctx->ra' is null pointer when d3d11 init failed before call 'ra_d3d11_create' in 'd3d11_init'.
*	vo_d3d11/hwdec_dxva2dxgi: fix memory leak that 'ctx11' be not release	Hui Jin	2019-09-14	1	-0/+6
\| \| \| \|	'ctx11' be not release when d3d11 hwdec be uninit with 'mapper_uninit' method.
*	vo_gpu: x11egl: support Mesa OML sync extension	wm4	2019-09-08	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mesa supports the EGL_CHROMIUM_sync_control extension, and it's available out of the box with AMD drivers. In practice, this is exactly the same as GLX_OML_sync_control, but for EGL. The extension specification is separate from the GLX one though, and buried somewhere in the Chromium code. This appears to work, although I don't know if it really works. In theory, this could be useful for other EGL targets. Support code for it could have been added to egl_helpers.c to avoid some minor duplicated glue code if another EGL target were to provide this extension. I didn't bother with that. ANGLE on Windows can't support it, because the extension spec. explicitly requires POSIX timers. ANGLE on Linux/OSX is actively harmful for mpv and hopefully won't ever use it. Wayland uses EGL, but has its own fancy presentation feedback stuff (and besides, I don't think basic video player functionality works on Wayland at all). context_drm_egl maybe? But I think DRM has its own stuff.
*	vo_gpu: glx: move OML sync code to an independent file	wm4	2019-09-08	3	-96/+145
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	So the next commit can make EGL use it. EGL has a quite similar function, that practically works the same. Although it's relatively trivial, it's still tricky, and probably shouldn't end up as duplicated code. There are no functional changes, except initialization, and how failure of the glXGetSyncValues call is handled. Also, some comments mention the EGL extension. Note that there's no intention for this code to handle anything else than the very specific OML sync extension (and its EGL equivalent). This is just too weirdly specific to the weird idiosyncrasies of the extension, and it makes no sense to extend it to handle anything else. (Such as Wayland or DXGI presentation feedback.)
*	vo/gpu: fix check on src/dst peak mismatch	Niklas Haas	2019-09-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	In the past, src peak was always equal to or higher than dst peak. But since `--target-peak` got introduced, this could no longer be the case. This leads to an incorrect result (scaling for peak mismatch in gamma light) unless some other option (CMS, --linear-scaling, etc.) forces the linearization. Fixes #6533
*	cocoa-cb: remove an unused variable	der richter	2019-09-02	1	-1/+0
\|
*	vo/gpu: vulkan: Pass the device name option through to libplacebo	Philip Langdale	2019-08-24	1	-0/+1
\| \| \| \| \| \| \| \| \|	We collect a 'vulkan-device' option today but then don't actually pass it on, so it's useless. Once that's fixed, it can be used to select a specific vulkan device by name. Tested with the new nvidia offload feature to select between the nvidia and intel GPUs.
*	vo_gpu: d3d11: fix storage lifetime of compound literals	James Ross-Gowan	2019-08-20	1	-8/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Somehow I got the idea that compound literals had function-scoped lifetime. Instead, like all other objects with automatic storage duration, compound literals are block-scoped, so they become invalid after exiting the block they were declared in. It seems like a recent change to GCC actually reuses the memory that the compound literals used to occupy, which was causing a few bugs. The pattern of conditionally assigning a pointer to a compound literal was used in a few places in ra_d3d11 where the Direct3D API expects either a pointer to an initialised struct or NULL. Change these to ensure the lifetime of the struct includes the API call. Should fix #6775.
*	vo_gpu: fix taking screenshots of rotated videos	wnoun	2019-08-14	1	-2/+6
\|
*	vo_gpu: hwdec_vaapi: Synchronise after exporting VA surface	Philip Langdale	2019-08-07	1	-0/+3
\| \| \| \| \| \| \|	This is documented as required (although we did not do it in the old GL codepath, with no visible problems) and I have seen transient artifacts after seeking which _appear_ to have gone away after introducing this.
*	cocoa-cb: migrate to swift 5 with swift 4 fallback	der richter	2019-07-21	4	-69/+60
\| \| \| \| \| \| \| \| \| \| \| \| \|	this migrates our current swift code to version 5 and 4. building is support from 10.12.6 and xcode 9.1 onwards. dynamic linking is the new default, since Apple removed static libs from their new toolchains and it's the recommended way. additionally the found macOS SDK version is printed since it's an important information for finding possible errors now. Fixes #6470
*	cocoa-cb: fix optional cases on macOS 10.12	der richter	2019-07-21	1	-4/+4
\|
*	cocoa-cb: conditional compilation for Dark Mode and Material features	der richter	2019-07-21	1	-1/+4
\| \| \| \|	Fixes #6621
*	vo_gpu: hwdec_vaapi: Count planes rather than layers in Vulkan interop	Philip Langdale	2019-07-08	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We saw a segfault when trying to use the intel-media-driver (iHD) rather than the normal intel va driver. This happened because the iHD driver reports P010 (and maybe other formats) with multiple layers to represent the interleaved UV plane. The normal va driver reports one UV layer to match the plane. This threw off my logic which assumed that the number of layers could not exceed the number of planes. There's a way one could fix this in a fully generalised form, but I'm just going to do what the EGL path does and assume that: * Layer 'n' is on Plane 'n' for n < total number of planes * These layers always start at offset 0 on the plane You can imagine ways that these assumptions are violated, but at least the failure will look the same for both EGL and Vulkan paths.
*	vo_gpu: hwdec_vaapi: Suppress format errors when probing	Philip Langdale	2019-07-08	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	Today, we normally see a format error when probing because yuyv422 cannot be used, but it's in the normal set of probed formats. This error is distracting and confusing, so only log probing errors at the VERBOSE level. Fixes #6411
*	vo_gpu: hwdec_vaapi: Add Vulkan interop	Philip Langdale	2019-07-08	3	-157/+298
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change introduces a vulkan interop path for the vaapi hwdec. The basic principles are mostly the same as for EGL, with the exported dma_buf being imported by Vukan. The biggest difference is that we cannot reuse the texture as we do with OpenGL - there's no way to rebind a VkImage to a different piece of memory, as far as I can see. So, a new texture is created on each map call. I did not bother implementing a code path for the old libva API as I think it's safe to assume any system with a working vulkan driver will have access to a newer libva. Note that we are using separate layers for the vaapi surface, just as is done for EGL. This is because libplacebo doesn't support multiplane images. This change does not include format negotiation because no driver implements the vk_ext_image_drm_format_modifier extension that would be required to do that. In practice, the two formats we care about (nv12, p010) work correctly, so we are not blocked. A separate change had to be made in libplacebo to filter out non-fatal validation errors related to surface sizes due to the lack of format negotiation.
*	vo_gpu: hwdec_vaegl: Rename and move to hwdec_vaapi	Philip Langdale	2019-07-08	1	-0/+0
\| \| \| \| \|	In preparation for adding Vulkan interop support, let's rename to remove the egl reference and move to an api neutral location.
*	vo/gpu: hwdec_vdpau: Support direct mode for 4:4:4 content	Philip Langdale	2019-07-08	1	-4/+15
\| \| \| \| \| \| \|	New releases of VDPAU support decoding 4:4:4 content, and that comes back as NV24 when using 'direct mode' in OpenGL Interop. That means we need to be a little bit smarter about how we set up the OpenGL textures.
*	opengl/context_wayland: Fix crash on configure before initial reconfig	Michael Forney	2019-07-08	1	-1/+3
\| \| \| \| \| \| \| \| \|	If the compositor sends a configure event before the surface is initially mapped, resize gets called before the egl_window gets created, resulting in a crash in wl_egl_window_resize. This was fixed back in 618361c697, but was reintroduced when the wayland code was rewritten in 68f9ee7e0b.
*	video/out/gpu: Add a `storable` flag to ra_format	Philip Langdale	2019-07-08	5	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While `ra` supports the concept of a texture as a storage destination, it does not support the concept of a texture format being usable for a storage texture. This can lead to us attempting to create a texture from an incompatible format, with undefined results. So, let's introduce an explicit format flag for storage and use it. In `ra_pl` we can simply reflect the `storable` flag. For GL and D3D, we'll need to write some new code to do the compatibility checks. I'm not going to do it here because it's not a regression; we were already implicitly assuming all formats were storable. Fixes #6657
*	vo_gpu: process three component together in error diffusion	Bin Jin	2019-06-16	1	-42/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This started as a desperate attempt to lower the memory requirement of error diffusion, but later it turns out that this change also improved the rendering performance a lot (by 40% as I tested). Errors was stored in three uint before this change, each with 24bit precision. This change encoded them into a single uint, each with 8bit precision. This reduced the shared memory usage, as well as number of atomic operations, all by three times. Before this change, with the minimum required 32kb shared memory, only the `simple` kernel can be used to render 1080p video, which is mostly useless compare to `--dither=fruit`. After this change, 32kb can handle `burkes` kernel for 1080p, or `sierra-lite` for 4K resolution.
*	vo_gpu: fix use of existing textures in error diffusion	Bin Jin	2019-06-16	1	-6/+8
\| \| \| \| \| \| \|	error diffusion requires two texture rendering pass. The existing code reuses `screen_tex` and creates another for such purpose. This works generally well for opengl, but could potentially be problematic for vulkan, due to its async natural.
*	vo_gpu: implement error diffusion for dithering	Bin Jin	2019-06-16	4	-0/+423
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|