summaryrefslogtreecommitdiffstats
path: root/video/out/gpu/ra.h
Commit message (Collapse)AuthorAgeFilesLines
* various: make mentions of macOS consistentder richter2024-02-211-2/+2
| | | | | change all mentions and variations of OSX, OS X, MacOSX, MacOS X, etc consistent. use the official naming macOS.
* various: fix various typos in the code baseAlexander Seiler2023-03-281-1/+1
| | | | Signed-off-by: Alexander Seiler <seileralex@gmail.com>
* various: fix typosHarri Nieminen2023-03-281-1/+1
| | | | Found by codespell
* vo_gpu: implement VO_DR_FLAG_HOST_CACHEDsfan52023-01-231-0/+1
| | | | | | | | For OpenGL, this is based on simply comparing GL_VENDOR strings against a list of allowed vendors. Co-authored-by: Nicolas F. <ovdev@fratti.ch> Co-authored-by: Niklas Haas <git@haasn.dev>
* vo_gpu: stop hard-coding max compute group threadsPhilip Langdale2021-12-191-0/+4
| | | | | | | | | | We've been assuming that maximum number of compute group threads is never less than the 1024 defined by the desktop GL spec. Given that we haven't had working compute shaders for GLES and I guess the Vulkan spec defines at least as high a value, we've gotten away with it so far. But we should really look the value up and respect it.
* {player,video}: remove references to obsolete opengl-cb APIsfan52021-12-151-2/+2
|
* vo_gpu: fix green shit with float yuv inputwm42020-05-091-0/+2
| | | | | | | | | | | | This was incorrect at least because the colorspace matrix attempted to center chroma at (conceptually) 0.5, instead of 0. Also, it tried to apply the fixed point shift logic for component sizes > 8 bit. There is no float yuv format in mpv/ffmpeg yet, but see next commit, which enables zimg to output it. I'm assuming zimg defines this format such that luma is in range [0,1] and chroma in range [-0.5,0.5], with the levels flag being ignored. This is consistent with H264/5 Annex E (I think...), and it sort of seems to look right, so that's it.
* video/out/gpu: Add a `storable` flag to ra_formatPhilip Langdale2019-07-081-0/+1
| | | | | | | | | | | | | | | | While `ra` supports the concept of a texture as a storage destination, it does not support the concept of a texture format being usable for a storage texture. This can lead to us attempting to create a texture from an incompatible format, with undefined results. So, let's introduce an explicit format flag for storage and use it. In `ra_pl` we can simply reflect the `storable` flag. For GL and D3D, we'll need to write some new code to do the compatibility checks. I'm not going to do it here because it's not a regression; we were already implicitly assuming all formats were storable. Fixes #6657
* vo_gpu: index desc namespaces by raNiklas Haas2019-04-211-1/+1
| | | | | No reason to require them be constant. This allows them to depend on runtime characteristics of the `ra`.
* vo_gpu: vulkan: Add support for exporting buffer memoryPhilip Langdale2018-10-221-0/+1
| | | | | | | | | | | | | | | | | | | | The CUDA/Vulkan interop works on the basis of memory being exported from Vulkan and then imported by CUDA. To enable this, we add a way to declare a buffer as being intended for export, and then add a function to do the export. For now, we support the fd and Handle based exports on Linux and Windows respectively. There are others, which we can support when a need arises. Also note that this is just for exporting buffers, rather than textures (VkImages). Image import on the CUDA side is supposed to work, but it is currently buggy and waiting for a new driver release. Finally, at least with my nvidia hardware and drivers, everything seems to work even if we don't initialise the buffer with the right exportability options. Nevertheless I'm enforcing it so that we're following the spec.
* client API: add a new way to pass X11 Display etc. to render APIwm42018-03-261-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hardware decoding things often need access to additional handles from the windowing system, such as the X11 or Wayland display when using vaapi. The opengl-cb had nothing dedicated for this, and used the weird GL_MP_MPGetNativeDisplay GL extension (which was mpv specific and not officially registered with OpenGL). This was awkward, and a pain due to having to emulate GL context behavior (like needing a TLS variable to store context for the pseudo GL extension function). In addition (and not inherently due to this), we could pass only one resource from mpv builtin context backends to hwdecs. It was also all GL specific. Replace this with a newer mechanism. It works for all RA backends, not just GL. the API user can explicitly pass the objects at init time via mpv_render_context_create(). Multiple resources are naturally possible. The API uses MPV_RENDER_PARAM_* defines, but internally we use strings. This is done for 2 reasons: 1. trying to leave libmpv and internal mechanisms decoupled, 2. not having to add public API for some of the internal resource types (especially D3D/GL interop stuff). To remain sane, drop support for obscure half-working opengl-cb things, like the DRM interop (was missing necessary things), the RPI window thing (nobody used it), and obscure D3D interop things (not needed with ANGLE, others were undocumented). In order not to break ABI and the C API, we don't remove the associated structs from opengl_cb.h. The parts which are still needed (in particular DRM interop) needs to be ported to the render API.
* vo_gpu: make screenshots use the GL rendererwm42018-02-111-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | Using the GL renderer for color conversion will make sure screenshots will use the same conversion as normal video rendering. It can do this for all types of screenshots. The logic when to write 16 bit PNGs changes. To approximate the old behavior, we decide by looking whether the source video format has more than 8 bits per component. We apply this logic even for window screenshots. Also, 16 bit PNGs now always include an unused alpha channel. The reason is that FFmpeg has RGB48 and RGBA64 formats, but no RGB064. RGB48 is 3 bytes and usually not supported by GPUs for rendering, so we have to use RGBA64, which forces an alpha channel. Will break for users who use --target-trc and similar options. I considered creating a new gl_video context, but it could double GPU memory use, so I didn't. This uses FBOs instead of glGetTexImage(), because that increases the chance it could work on GLES (e.g. ANGLE). Untested. No support for the Vulkan and D3D11 backends yet. Fixes #5498. Also fixes #5240, because the code for reading back is not used with the new code path.
* vo_gpu: add RA_CAP for gl_NumWorkGroupsNiklas Haas2018-02-051-0/+1
| | | | | SPIRV-Cross doesn't support this for the time being. It's possible this could go away again at a later date.
* vo_gpu: attempt re-using the FBO format for p->output_texNiklas Haas2017-12-251-0/+2
| | | | | | | | | This allows RAs with support for non-opaque FBO formats to use a more appropriate FBO format for the output tex, possibly enabling a more efficient blit operation. This requires distinguishing between real formats (which can be used to create textures) and fake formats (e.g. ra_gl's FBO hack).
* vo_gpu: aggressively prefer async computeNiklas Haas2017-12-251-0/+1
| | | | | | | | | | On AMD devices, we only get one graphics pipe but several compute pipes which can (in theory) run independently. As such, we should prefer compute shaders over fragment shaders in scenarios where we expect them to be better for parallelism. This is amusingly trivial to do, and actually improves performance even in a single-queue scenario.
* vo_gpu: allow invalidating FBO in renderpass_runNiklas Haas2017-12-251-0/+3
| | | | | | | | | This is especially interesting for vulkan since it allows completely skipping the layout transition as part of the renderpass. Unfortunately, that also means it needs to be put into renderpass_params, as opposed to renderpass_run_params (unlike #4777). Closes #4777.
* vo_gpu: d3d11: add RA caps for ra_d3d11James Ross-Gowan2017-11-071-0/+2
| | | | | | | | | | | | | | | | ra_d3d11 uses the SPIR-V compiler to translate GLSL to SPIR-V, which is then translated to HLSL. This means it always exposes the same GLSL version that the SPIR-V compiler supports (4.50 for shaderc/glslang.) Despite claiming to support GLSL 4.50, some features that are tied to the GLSL version in OpenGL are not supported by ra_d3d11 when targeting legacy Direct3D feature levels. This includes two features that mpv relies on: - Reading from gl_FragCoord in the fragment shader (requires FL 10_0) - textureGather from any texture component (requires FL 11_0) These features have been exposed as new RA caps.
* vo_gpu: export the GLSL format qualifier for ra_formatJames Ross-Gowan2017-11-071-0/+6
| | | | | | | | | | | | | | | Backported from @haasn's change to libplacebo, except in the current RA, there's nothing to indicate an ra_format can be bound as a storage image, so there's no way to force all of these formats to have a glsl_format. Instead, the layout qualifier will be removed if glsl_format is NULL. This is needed for the upcoming ra_d3d11 backend. In Direct3D 11, while loading float values from unorm images often works as expected, it's technically undefined behaviour, and in Windows 10, it will cause the debug layer to spam the log with error messages. Also, apparently in GLSL, the format name must match the image's format exactly (but in Direct3D, it just has to have the same component type.)
* vo_gpu: add namespace query mechanismJames Ross-Gowan2017-11-071-4/+8
| | | | | | Backported from @haasn's change to libplacebo. More flexible than the previous "shared || non-shared" distinction. The extra flexibility is needed for Direct3D 11, but it also doesn't hurt code-wise.
* vo_gpu: vulkan: add support for push constantsNiklas Haas2017-09-261-0/+9
| | | | Can in theory avoid updating the uniform buffer every frame
* vo_gpu: vulkan: initial implementationNiklas Haas2017-09-261-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This time based on ra/vo_gpu. 2017 is the year of the vulkan desktop! Current problems / limitations / improvement opportunities: 1. The swapchain/flipping code violates the vulkan spec, by assuming that the presentation queue will be bounded (in cases where rendering is significantly faster than vsync). But apparently, there's simply no better way to do this right now, to the point where even the stupid cube.c examples from LunarG etc. do it wrong. (cf. https://github.com/KhronosGroup/Vulkan-Docs/issues/370) 2. The memory allocator could be improved. (This is a universal constant) 3. Could explore using push descriptors instead of descriptor sets, especially since we expect to switch descriptors semi-often for some passes (like interpolation). Probably won't make a difference, but the synchronization overhead might be a factor. Who knows. 4. Parallelism across frames / async transfer is not well-defined, we either need to use a better semaphore / command buffer strategy or a resource pooling layer to safely handle cross-frame parallelism. (That said, I gave resource pooling a try and was not happy with the result at all - so I'm still exploring the semaphore strategy) 5. We aggressively use pipeline barriers where events would offer a much more fine-grained synchronization mechanism. As a result of this, we might be suffering from GPU bubbles due to too-short dependencies on objects. (That said, I'm also exploring the use of semaphores as a an ordering tactic which would allow cross-frame time slicing in theory) Some minor changes to the vo_gpu and infrastructure, but nothing consequential. NOTE: For safety, all use of asynchronous commands / multiple command pools is currently disabled completely. There are some left-over relics of this in the code (e.g. the distinction between dev_poll and pool_poll), but that is kept in place mostly because this will be re-extended in the future (vulkan rev 2). The queue count is also currently capped to 1, because of the lack of cross-frame semaphores means we need the implicit synchronization from the same-queue semantics to guarantee a correct result.
* vo_gpu: fix comment on ra_buf_typeNiklas Haas2017-09-211-2/+2
| | | | This hasn't been true for several iterations of this API.
* vo_opengl: refactor into vo_gpuNiklas Haas2017-09-211-0/+488
This is done in several steps: 1. refactor MPGLContext -> struct ra_ctx 2. move GL-specific stuff in vo_opengl into opengl/context.c 3. generalize context creation to support other APIs, and add --gpu-api 4. rename all of the --opengl- options that are no longer opengl-specific 5. move all of the stuff from opengl/* that isn't GL-specific into gpu/ (note: opengl/gl_utils.h became opengl/utils.h) 6. rename vo_opengl to vo_gpu 7. to handle window screenshots, the short-term approach was to just add it to ra_swchain_fns. Long term (and for vulkan) this has to be moved to ra itself (and vo_gpu altered to compensate), but this was a stop-gap measure to prevent this commit from getting too big 8. move ra->fns->flush to ra_gl_ctx instead 9. some other minor changes that I've probably already forgotten Note: This is one half of a major refactor, the other half of which is provided by rossy's following commit. This commit enables support for all linux platforms, while his version enables support for all non-linux platforms. Note 2: vo_opengl_cb.c also re-uses ra_gl_ctx so it benefits from the --opengl- options like --opengl-early-flush, --opengl-finish etc. Should be a strict superset of the old functionality. Disclaimer: Since I have no way of compiling mpv on all platforms, some of these ports were done blindly. Specifically, the blind ports included context_mali_fbdev.c and context_rpi.c. Since they're both based on egl_helpers, the port should have gone smoothly without any major changes required. But if somebody complains about a compile error on those platforms (assuming anybody actually uses them), you know where to complain.