summaryrefslogtreecommitdiffstats
path: root/video/out/gpu
Commit message (Collapse)AuthorAgeFilesLines
* vo_gpu: fix crash if dither texture fails to allocatewm42020-01-081-0/+3
| | | | | | | Theoretically possible (and quite unlikely due to the small texture size). The code was originally written with the assumption that texture allocations can't fail, and it was never updated out of laziness. Untested.
* Revert "vo_gpu: move wayland below X11 in autoprobe order"Dudemanguy2020-01-011-6/+6
| | | | This reverts commit a6d8e9b7ff0166ee6db69c41c0cfc319e03f4c80.
* vo_gpu: move wayland below X11 in autoprobe orderwm42019-12-301-6/+6
| | | | | | | I'm sick of mpv being accused of not doing it right. You can't do it right on Wayland? Long live X11. Fixes: #7307
* video: cuda: add explicit context creation for copy hwaccelsPhilip Langdale2019-12-291-1/+1
| | | | | | | | | | | | | | | | | | | | | In the distant past, the cuviddec backed copy hwaccel could be configured directly using lavc options. However, since that time, we gained support for automatic hw ctx creation which ended up bypassing the lavc options. Rather than trying to find a way to pass those options again, a better idea is to make the 'cuda-decode-device' option, used by the interop hwaccels, work for the copy hwaccels too. And that's pretty simple: we have to add a create function that checks the option and passes it on to ffmpeg. Note that this does require a slight re-jig to the configuration flags, as we now have a scenario where we want to build with support for the cuda copy hwaccels but not the interop ones. So we need a distinct configuration flag for that combination. Fixes #7295.
* vo_gpu: hwdec_vdpau: remove direct_modePhilip Langdale2019-12-282-46/+0
| | | | | | | | | | | | | | | | | | | | | | | | As we are less and less interested in vpdpau, with nvdec and vaapi being better choices in general on nvidia and AMD respectively, we might consider removing direct_mode, where we bypass the vdpau mixer and work directly with yuv textures. Normally, working with yuv textures would be great, but vdpau built in an assumption that all frames are delivered as separate fields, causing us to have to re-interleave them. nvidia then introduces a new OpenGL extension that can return the yuv frames as frames, but we can't just unconditionally switch to that as we'd want to keep supporting older hardware where the drivers are no longer getting new features. The end result is that we wouldn't be able to get rid of the old code paths. Removing direct_mode means we always use the mixer, and work with rgba frame textures. There are some theoretical limitations to this, but in practice they probably don't matter much - unsupported colourspaces don't matter because without 10bit decoding support, we can't use them anyway, and apparently we're not doing separate chroma scaling these days, so scaling the rbga doesn't really lose anything (and the vdpau hq scaling option remains available).
* vf_gpu: render subtitleswm42019-11-302-8/+13
| | | | | | | Pretty annoying affair. The vo_gpu code could of course not trigger rendering from filters yet, so it needed to be extended. Also, this uses some icky stuff made for vf_sub (and this was the reason I marked vf_sub as deprecated), so everything is terrible.
* vo_gpu: fix infinite scaler reinit spamNiklas Haas2019-11-231-8/+9
| | | | | | | | | Handling the window with this function makes no sense, since windows and kernels are not the same thing and don't share the same option list. The only reason it's done is to make sure the char* points at the static string rather than the dynamically allocated one, which we can do manually in this function. Rewrite a bit for clarity/quality.
* video/out/gpu: Remove stray top-level ';'Michael Forney2019-11-182-2/+2
|
* vo_gpu: sync duplicated condition on peak computationwm42019-11-161-0/+2
| | | | | | | | | | | | | | | | | | | | pass_color_map() (in video_shaders.c) and pass_colormanage() (video.c) both duplicate the condition on whether to do peak computation. Peak computation requires a compute shader, so if the duplicated conditions don't match, video_shaders.c will generate a compute shader, but video.c will try to run it as fragment shader. This leads to a "blue screen". This can be reproduced by playing a HDTV video with --target-peak=99. It's not clear how to fix this. Should pass_tone_map() be only invoked if mp_trc_is_hdr() == true (what pass_colormanage() uses to decide whether to enable peak computation), or should pass_colormanage() just tell pass_color_map() to skip peak computation? Decide for the latter, as it's more robust. Even if not correct, at least it gets rid of the blue shit. Fixes: #7149
* vo_gpu: yuv alpha is always full rangewm42019-11-091-8/+6
| | | | | | | | | | | | | | Probably. It's not like these pixel formats are formally specified - FFmpeg added them because _some_ file format or decoder supports it, and while that format/codec may define it precisely, the pixel format is sort of disconnected and just a FFmpeg thing. In any case, the yuva sample I had at hand uses the full range the component data type can provide. The old code used the same "shifted" range as for Y/U/V components, which must have been wrong. This will not work correctly for packed YUVA formats, but fortunately they matter even less.
* vo_gpu: unconditionally clear framebuffer on start of framewm42019-11-061-5/+3
| | | | | | | | | | | | | | | | | | | | | | For some reason, the first frame displayed on X11 with amdgpu and OpenGL will be garbled. This is especially visible if the player starts, displays a frame, but then still takes a while to properly start playback. With --interpolation, the behavior somehow changes (usually gets worse). I'm not sure what exactly is going on, and the code in video.c is way too abstruse. Maybe there is some slight possibility that a frame with uncleared contents gets displayed, which somehow also corrupts another frame that is displayed immediately after that. If clear is unconditionally run, this somehow doesn't happen, and you see a video frame. By any logic this shouldn't happen: a video frame should always overwrite the background. So I can't exclude that this isn't some sort of driver bug, or at least very obscure interaction. Clearing should be practically free anyway, so always do it. Fixes: #7105
* Replace uses of FFMIN/MAX with MPMIN/MAXwm42019-10-311-4/+2
| | | | And remove libavutil includes where possible.
* vo_gpu/d3d11: add support for configuring swap chain color spaceJan Ekström2019-10-302-5/+171
| | | | | | | | | | | | | | | | By default utilizes the color space of the desktop on which the swap chain is located. If a specific value is defined, it will be instead be utilized. Enables configuration of the PQ color space (BT.2020 primaries, PQ transfer function) for HDR. Additionally, signals the swap chain color space to the renderer, so that the render looks correct without having to specify target-trc or target-prim manually. Due to all of the APIs being Win10+ only, will only work starting with Windows 10.
* vo_gpu/d3d11: add helpers for getting names for DXGI formats & CSPsJan Ekström2019-10-302-4/+178
| | | | | Additionally, define the few enum values that are currently missing in mingw-w64 headers.
* vo_gpu: add and utilize color space information from ra_fboJan Ekström2019-10-302-8/+40
| | | | | | | | This lets us set primaries, transfer function and the target peak based on what the presenting layer would want us to have. Now that this mechanism is available, warn if the user has overridden values such as primaries or transfer function.
* vo_gpu: log ra_format.storable with the other flagsJames Ross-Gowan2019-10-271-3/+5
| | | | | | This seems to have been missed when the storable flag was added, since all the other flags were logged here. It can be useful to know if an RA format is storable, so log it as well.
* vo_gpu: attempt to fix 0bgr formatwm42019-10-261-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | Using e.g. --vf=format=0bgr showed obviously wrong colors with --vo=gpu. The reason is that leading padding wasn't handled correctly. Try to hack fix it. While the code in copy_image() is somewhat reasonable, I can't tell what the fuck is going on with that HOOKED shit. For some reason this HOOKED shit doesn't use copy_image() (???), or uses it incorrectly. It affects debanding. --deband=no works correctly. If it's enabled, the crap in hook_prelude() is needed. I bet there are many more bugs with this. For example, the deband shader will try to deband the alpha channel if the format abgr is used (because the correct component order is only established later). This can be tested by inserting a "color.x = 0;" at the end of the deband shader, and using --vf=format=rgba vs. abgr. I cannot comprehend why it doesn't just store explicitly which components a texture contains, and why it doesn't just read the components always in an uniform way. There's a big chance this fix works only by coincidence. This shouldn't have been so hard either. Time for a complete rewrite?
* vo_gpu, options: don't return NaN through APIwm42019-10-251-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | Internally, vo_gpu uses NaN for some options to indicate a default value that is different depending on the context (e.g. different scalers). There are 2 problems with this: 1. you couldn't reset the options to their defaults 2. NaN is a damn mess and shouldn't be part of the API The option parser already rejected NaN explicitly, which is why 1. didn't work. Regarding 2., JSON might be a good example, and actually caused a bug report. Fix this by mapping NaN to the special value "default". I think I'd prefer other mechanisms (maybe just having every scaler expose separate options?), but for now this will do. See you in a future commit, which painfully deprecates this and replaces it with something else. I refrained from using "no" (my favorite magic value for "unset" etc.) because then I'd have e.g. make --no-scale-param1 work, which in addition to a lot of effort looks dumb and nobody will use it. Here's also an apology for the shitty added test script. Fixes: #6691
* vo_gpu: hwdec_d3d11eglrgb: remove thiswm42019-10-161-2/+0
| | | | | Finally. Since with the previous commit we can (probably) handle P010 directly, this hack isn't needed anymore.
* vo_gpu/d3d11: fix memleak of the adapter description stringJan Ekström2019-10-151-1/+5
|
* vo_gpu/d3d11: remove unnecessary nullptr checkJan Ekström2019-10-151-2/+2
| | | | mp_to_utf8 will abort in case of either invalid input or OOM.
* vo_gpu/d3d11: switch adapter selection to case-insensitive startswithJan Ekström2019-10-151-3/+4
| | | | | | This lets users set values such as "intel" or "nvidia" as the adapter vendor is generally noted in the beginning of the description string.
* vo_gpu/d3d11: fixup adapter selection by switching it all to bstrJan Ekström2019-10-152-11/+7
| | | | | I did ponder if I should have done this right away, and it seems like not doing it at first was a mistake.
* vo_gpu/d3d11: add support for configuring swap chain formatJan Ekström2019-10-132-1/+120
| | | | | | | Query information on the system output most linked to the swap chain, and either utilize a user-configured format, or either 8bit RGBA or 10bit RGB with 2bit alpha depending on the system output's bit depth.
* vo_gpu/d3d11: add adapter name validation and listing with "help"Jan Ekström2019-09-292-5/+37
| | | | Not the prettiest way to get it done, but seems to work.
* vo_gpu/d3d11: refactor pthread_once d3d11 loading to functionJan Ekström2019-09-291-6/+15
| | | | Lets us reuse this in the future.
* vo_gpu/d3d11: utilize the passed adapter nameJan Ekström2019-09-291-5/+77
| | | | | | Normalize nullptr and an empty string both to nullptr to simplify handling. API users cannot set a value back to nullptr, so both an empty string as well as nullptr should behave the same.
* vo_gpu/d3d11: add an option for the adapter nameJan Ekström2019-09-291-0/+5
| | | | Set it from the adapter name in the d3d11 options.
* vo_gpu/d3d11_helpers: also load up CreateDXGIFactory1Jan Ekström2019-09-291-4/+13
| | | | Just a factory, without a device, is required for listing of devices.
* vo: make swapchain-depth option generic for all VOsAnton Kindestam2019-09-281-1/+0
| | | | In preparation for making vo_drm able to use swapchain-depth
* video: add pure gamma TRC curves for 2.0, 2.4 and 2.6.Wessel Dankers2019-09-272-0/+21
|
* options: add M_OPT_FILE to some more options that take filesPhilip Sequeira2019-09-271-2/+2
|
* vo_gpu: vulkan: add Android contextsfan52019-09-271-0/+4
|
* rpi: Update for modern systemsCameron Cawley2019-09-201-1/+1
|
* vo_gpu: remove vdpau/GLX backendwm42019-09-191-4/+0
| | | | | | | Useless garbage. This was once added to test whether vdpau presentation feedback could be used. Results were always unsatisfactory, and now vdpau is dead.
* vo_gpu: remove mali-fbdevwm42019-09-191-4/+0
| | | | | Useless at this point, I don't even know if it still works, or how to test it.
* drm: fix libmpv ABI breakage introduced in ↵Anton Kindestam2019-09-181-4/+4
| | | | | | | | | | | | | | | 351c083487050c88adb0e3d60f2174850f869018 Extending the client-allocated mpv_opengl_drm_params struct constituted a break of ABI that could cause UB. Create a clean break by deprecating "drm_params" and related structs and enum values, and replacing it with "drm_params_v2". Also fix some comments and code that wrongly assumed that open could return any other negative number than -1 for failure. This commit updates the libmpv version to 1.104
* vo_gpu: hwdec_vaapi: Refactor Vulkan and OpenGL interops for VAAPIPhilip Langdale2019-09-151-1/+1
| | | | | | Like hwdec_cuda, you get a big #ifdef mess if you try and keep the OpenGL and Vulkan interops in the same file. So, I've refactored them into separate files in a similar way.
* vo_gpu: x11: remove special vdpau probing, use EGL by defaultwm42019-09-151-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally, vo_gpu/vo_opengl considered the case of Nvidia proprietary drivers, which required vdpau/GLX, and Intel open source drivers, which require vaapi/EGL. Since window creation and GPU context creation are inseparable in mpv's internal API, it had to pick the correct API very early, or hardware decoding wouldn't work. "x11probe" was introduced for this reason. It created a GLX context (without showing the window yet), and checked whether vdpau was available. If yes, it used GLX, if not, it continued probing x11/EGL. (Obviously it couldn't always fail on GLX without vdpau, which is why it was a separate "probe" backend.) Years passed, and now the situation is different. Vdpau is dead. Nvidia drivers and libavcodec now provide CUDA interop, which requires EGL, and fixes some of the vdpau problems. AMD drivers now provide vaapi, which generally works better than vdpau. Intel didn't change. In particular, vaapi provides working HEVC Main10 support. In theory, it should work on vdpau too, with quality reduction (no 10 bit surfaces), but I couldn't get it to work. So always prefer EGL. And suddenly hardware decoding works. This is actually rather important, because HEVC is unfortunately on the rise, despite shitty encoders and unoptimized decoders. The latter may mean that hardware decoding works better than libavcodec. This should have been done a long, long time ago.
* vo_gpu: correctly normalize src.sig_peakNiklas Haas2019-09-151-1/+4
| | | | | | | | | | | | | In some cases, src.sig_peak remains undefined as 0, which was definitely the case when using the OSD, since it never got passed through the usual color space normalization process. Most robust work-around is to simply force the normalization at the site where it's needed. This ensures this value is always valid and defined, to make the peak-dependent logic in these two functions always work. Fixes 4b25ec3a9d Fixes #6917 Fixes #6918
* vo/gpu: fix check on src/dst peak mismatchNiklas Haas2019-09-051-1/+1
| | | | | | | | | | In the past, src peak was always equal to or higher than dst peak. But since `--target-peak` got introduced, this could no longer be the case. This leads to an incorrect result (scaling for peak mismatch in gamma light) unless some other option (CMS, --linear-scaling, etc.) forces the linearization. Fixes #6533
* vo_gpu: fix taking screenshots of rotated videoswnoun2019-08-141-2/+6
|
* video/out/gpu: Add a `storable` flag to ra_formatPhilip Langdale2019-07-082-1/+2
| | | | | | | | | | | | | | | | While `ra` supports the concept of a texture as a storage destination, it does not support the concept of a texture format being usable for a storage texture. This can lead to us attempting to create a texture from an incompatible format, with undefined results. So, let's introduce an explicit format flag for storage and use it. In `ra_pl` we can simply reflect the `storable` flag. For GL and D3D, we'll need to write some new code to do the compatibility checks. I'm not going to do it here because it's not a regression; we were already implicitly assuming all formats were storable. Fixes #6657
* vo_gpu: process three component together in error diffusionBin Jin2019-06-161-42/+70
| | | | | | | | | | | | | | | | This started as a desperate attempt to lower the memory requirement of error diffusion, but later it turns out that this change also improved the rendering performance a lot (by 40% as I tested). Errors was stored in three uint before this change, each with 24bit precision. This change encoded them into a single uint, each with 8bit precision. This reduced the shared memory usage, as well as number of atomic operations, all by three times. Before this change, with the minimum required 32kb shared memory, only the `simple` kernel can be used to render 1080p video, which is mostly useless compare to `--dither=fruit`. After this change, 32kb can handle `burkes` kernel for 1080p, or `sierra-lite` for 4K resolution.
* vo_gpu: fix use of existing textures in error diffusionBin Jin2019-06-161-6/+8
| | | | | | | error diffusion requires two texture rendering pass. The existing code reuses `screen_tex` and creates another for such purpose. This works generally well for opengl, but could potentially be problematic for vulkan, due to its async natural.
* vo_gpu: implement error diffusion for ditheringBin Jin2019-06-164-0/+423
| | | | | | | | | | | | | | | | | This is a straightforward parallel implementation of error diffusion algorithms in compute shader. Basically we use single work group with maximal possible size to process the whole image. After a shift mapping we are able to process all pixels column by column. A large ring buffer are allocated in shared memory to speed things up. However the size of required shared memory depends linearly on the height of video window (or screen height in fullscreen mode). In case there is no enough shared memory, it will fallback to `--dither=fruit`. The maximal allowed work group size is hardcoded as 1024. Ideally we could query `GL_MAX_COMPUTE_WORK_GROUP_INVOCATIONS`. But for whatever reason, it seems most high end card from nvidia and amd support only the minimal required value, so I guess we can stick to it for now.
* vo_gpu: fix --scaler-resizes-only for fractional ratio scalingBin Jin2019-06-061-3/+6
| | | | | | | | The calculation of scale factor involves 32-bit float, and a strict equality test will effectively ignore `--scaler-resizes-only` option for some non-integer scale factor. Fix this by using non-strict equality check.
* vo_gpu: expose texture_off to user shaderBin Jin2019-06-061-0/+1
| | | | | It will provide low level access to coordinate mapping other than texmap().
* vo_gpu: allow user shader to fix texture offsetBin Jin2019-06-063-9/+45
| | | | | This commit essentially makes user shader able to fix offset (produced by other prescaler, for example) like builtin `--scale`.
* vo_gpu: vulkan: Add back context_win for libplaceboPhilip Langdale2019-04-211-3/+1
| | | | | Feature parity with the original ra_vk obviously requires win32 support, so let's put it back in.
* vo_gpu: vulkan: use libplacebo insteadNiklas Haas2019-04-211-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | This commit rips out the entire mpv vulkan implementation in favor of exposing lightweight wrappers on top of libplacebo instead, which provides much of the same except in a more up-to-date and polished form. This (finally) unifies the code base between mpv and libplacebo, which is something I've been hoping to do for a long time. Note: The ra_pl wrappers are abstract enough from the actual libplacebo device type that we can in theory re-use them for other devices like d3d11 or even opengl in the future, so I moved them to a separate directory for the time being. However, the rest of the code is still vulkan-specific, so I've kept the "vulkan" naming and file paths, rather than introducing a new `--gpu-api` type. (Which would have been ended up with significantly more code duplicaiton) Plus, the code and functionality is similar enough that for most users this should just be a straight-up drop-in replacement. Note: This commit excludes some changes; specifically, the updates to context_win and hwdec_cuda are deferred to separate commits for authorship reasons.
* vo_gpu: fix segfault when OSD tex creation failsNiklas Haas2019-04-211-1/+1
| | | | If !osd->texture, then mpgl_osd_draw_prepare fails.
* vo_gpu: index desc namespaces by raNiklas Haas2019-04-212-2/+2
| | | | | No reason to require them be constant. This allows them to depend on runtime characteristics of the `ra`.
* vo_gpu: increase user shader size limitBin Jin2019-03-131-1/+1
| | | | | | | | | | | | | The old size limit was chosen before LUT texture was supported in user shader. At that time, the whole user shader will be compiled and run on GPU, which makes large user shader impractical to be used. With the introduction of LUT texture, the ol