summaryrefslogtreecommitdiffstats
path: root/video/out/d3d11
Commit message (Collapse)AuthorAgeFilesLines
* video/out/gpu: Add a `storable` flag to ra_formatPhilip Langdale2019-07-081-0/+3
| | | | | | | | | | | | | | | | While `ra` supports the concept of a texture as a storage destination, it does not support the concept of a texture format being usable for a storage texture. This can lead to us attempting to create a texture from an incompatible format, with undefined results. So, let's introduce an explicit format flag for storage and use it. In `ra_pl` we can simply reflect the `storable` flag. For GL and D3D, we'll need to write some new code to do the compatibility checks. I'm not going to do it here because it's not a regression; we were already implicitly assuming all formats were storable. Fixes #6657
* vo_gpu: d3d11: use the SPIRV-Cross C API directlyJames Ross-Gowan2019-06-121-19/+67
| | | | | | | | | | When the D3D11 backend was first written, SPIRV-Cross only had a C++ API and no guarantee of API or ABI stability, so instead of using SPIRV-Cross directly, mpv used an unofficial C wrapper called crossc. Now that KhronosGroup/SPIRV-Cross#611 is resolved, SPIRV-Cross has an official C API that can be used instead, so remove crossc and use SPIRV-Cross directly.
* vo_gpu: index desc namespaces by raNiklas Haas2019-04-211-1/+1
| | | | | No reason to require them be constant. This allows them to depend on runtime characteristics of the `ra`.
* vo_gpu: d3d11: implement tex_download()James Ross-Gowan2018-02-132-7/+57
| | | | | | | This allows the new GPU screenshot functionality introduced in 9f595f3a80ee to work with the D3D11 backend. It replaces the old window screenshot functionality, which was shared between D3D11 and ANGLE. The old code can be removed, since it's not needed by ANGLE anymore either.
* video: rewrite filtering glue codewm42018-01-301-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of the old vf.c code. Replace it with a generic filtering framework, which can potentially handle more than just --vf. At least reimplementing --af with this code is planned. This changes some --vf semantics (including runtime behavior and the "vf" command). The most important ones are listed in interface-changes. vf_convert.c is renamed to f_swscale.c. It is now an internal filter that can not be inserted by the user manually. f_lavfi.c is a refactor of player/lavfi.c. The latter will be removed once --lavfi-complex is reimplemented on top of f_lavfi.c. (which is conceptually easy, but a big mess due to the data flow changes). The existing filters are all changed heavily. The data flow of the new filter framework is different. Especially EOF handling changes - EOF is now a "frame" rather than a state, and must be passed through exactly once. Another major thing is that all filters must support dynamic format changes. The filter reconfig() function goes away. (This sounds complex, but since all filters need to handle EOF draining anyway, they can use the same code, and it removes the mess with reconfig() having to predict the output format, which completely breaks with libavfilter anyway.) In addition, there is no automatic format negotiation or conversion. libavfilter's primitive and insufficient API simply doesn't allow us to do this in a reasonable way. Instead, filters can use f_autoconvert as sub-filter, and tell it which formats they support. This filter will in turn add actual conversion filters, such as f_swscale, to perform necessary format changes. vf_vapoursynth.c uses the same basic principle of operation as before, but with worryingly different details in data flow. Still appears to work. The hardware deint filters (vf_vavpp.c, vf_d3d11vpp.c, vf_vdpaupp.c) are heavily changed. Fortunately, they all used refqueue.c, which is for sharing the data flow logic (especially for managing future/past surfaces and such). It turns out it can be used to factor out most of the data flow. Some of these filters accepted software input. Instead of having ad-hoc upload code in each filter, surface upload is now delegated to f_autoconvert, which can use f_hwupload to perform this. Exporting VO capabilities is still a big mess (mp_stream_info stuff). The D3D11 code drops the redundant image formats, and all code uses the hw_subfmt (sw_format in FFmpeg) instead. Although that too seems to be a big mess for now. f_async_queue is unused.
* vo_gpu: hwdec_dxva2dxgi: initial implementationJames Ross-Gowan2018-01-062-0/+466
| | | | | | | | | | | | | This enables DXVA2 hardware decoding with ra_d3d11. It should be useful for Windows 7, where D3D11VA is not available. Images are transfered from D3D9 to D3D11 using D3D9Ex surface sharing[1]. Following Microsoft's recommendations, it uses a queue of shared surfaces, similar to Microsoft's ISurfaceQueue. This will hopefully prevent surface sharing from impacting parallelism and allow multiple D3D11 frames to be in-flight at once. [1]: https://msdn.microsoft.com/en-us/library/windows/desktop/ee913554.aspx
* vo_gpu: d3d11: check for NULL backbuffer in start_frameJames Ross-Gowan2018-01-041-2/+6
| | | | | | | | | | | | | In a lost device scenario, resize() will fail and p->backbuffer will be NULL. We can't recover from lost devices yet, but we should still check for a NULL backbuffer in start_frame() rather than crashing. Also remove a NULL check for p->swapchain. This was a red herring, since p->swapchain never becomes NULL in an error condition, but p->backbuffer actually does. This should fix the crash in #5320, but it doesn't fix the underlying reason for the lost device (which is probably a driver bug.)
* vo_gpu: d3d11: avoid copying staging buffers to cbuffersJames Ross-Gowan2018-01-011-48/+15
| | | | | | | | | | | | | | | | Apparently some Intel drivers have a bug where copying from staging buffers to constant buffers does not work. We used to keep a copy of the buffer data in a staging buffer to enable partial constant buffer updates. To work around this bug, keep the copy in talloc-allocated system memory instead. There doesn't seem to be any noticable performance difference from keeping the copy in system memory. Our cbuffers are probably too small for it to matter anyway. See also: https://crbug.com/593024 Fixes #5293
* vo_gpu: d3d11: check for timestamp query supportJames Ross-Gowan2017-12-091-0/+9
| | | | | | | | | | Apparently timestamp queries are optional for 10level9 devices. Check for support when creating the device rather than spamming error messages during rendering. CreateQuery can be used to check for support by passing NULL as the final parameter. See: https://msdn.microsoft.com/en-us/library/windows/desktop/ff476150.aspx#ID3D11Device_CreateQuery
* video: remove some more hwdec legacy stuffwm42017-12-021-4/+1
| | | | | | | | | Finally get rid of all the HWDEC_* things, and instead rely on the libavutil equivalents. vdpau still uses a shitty hack, but fuck the vdpau code. Remove all the now unneeded remains. The vdpau preemption thing was not unused anymore; if someone cares this could probably be restored.
* video: move d3d.c out of decode sub directorywm42017-12-011-1/+1
| | | | | | It makes more sense to have it in the general video directory (along with vdpau.c and vaapi.c), since the decoder source files don't even access it anymore.
* vo_gpu: hwdec: remove redundant fieldswm42017-12-011-1/+0
| | | | | | | | | | | | | The testing_only field is not referenced anymore with vaglx removed and the previous commit dropping all uses. The ra_hwdec_driver.api field became unused with the previous commit, but all hwdec interop drivers still initialized it. Since this touches highly OS-specific code, build regressions are possible (plus the previous commit might break hw decoding at runtime). At least hwdec_cuda.c still used the .api field, other than initializing it.
* vo_gpu: d3d11: don't use runtime version for UAV slot countJames Ross-Gowan2017-11-191-1/+1
| | | | | | FL 11_1 is only supported with the Direct3D 11.1 runtime anyway, so there is no need to check both the runtime version and the feature level.
* vo_gpu: d3d11: mark the bgra8 format as unorderedJames Ross-Gowan2017-11-191-1/+1
| | | | Whoops. I was confused by the double-negative here.
* vo_gpu: d3d11: remove flipped texture upload hackJames Ross-Gowan2017-11-121-8/+0
| | | | Made unnecessary by 4a6b04bdb930.
* vo_gpu: hwdec_d3d11va: allow zero-copy video decodingJames Ross-Gowan2017-11-073-62/+159
| | | | | | | | | | | | | | Like the manual says, this is technically undefined behaviour. See: https://msdn.microsoft.com/en-us/library/windows/desktop/ff476085.aspx In particular, MSDN says texture arrays created with the BIND_DECODER flag cannot be used with CreateShaderResourceView, which means they can't be sampled through SRVs like normal Direct3D textures. However, some programs (Google Chrome included) do this anyway for performance and power-usage reasons, and it appears to work with most drivers. Older AMD drivers had a "bug" with zero-copy decoding, but this appears to have been fixed. See #3255, #3464 and http://crbug.com/623029.
* vo_gpu: d3d11: enhance cache invalidationJames Ross-Gowan2017-11-071-4/+70
| | | | | | | | The shader cache in ra_d3d11 caches the result of shaderc, crossc and the D3DCompiler DLL, so it should be invalidated when any of those components are updated. This should make the cache more reliable, which makes it safer to enable gpu-shader-cache-dir. Shader compilation is slow with D3D11, so gpu-shader-cache-dir is highly necessary
* vo_gpu: d3d11: log shader compilation timesJames Ross-Gowan2017-11-071-0/+26
| | | | | | | Some shaders take a _long_ time to compile with the Direct3D compiler. The ANGLE backend had this problem too, to a certain extent. Logging should help identify which shaders cause long stalls and could also help with benchmarking ways of reducing compile times.
* vo_gpu: move d3d11_screenshot to shared codeJames Ross-Gowan2017-11-071-0/+9
| | | | This can be used by the ANGLE backend and ra_d3d11.
* vo_gpu: d3d11: add RA caps for ra_d3d11James Ross-Gowan2017-11-071-0/+5
| | | | | | | | | | | | | | | | ra_d3d11 uses the SPIR-V compiler to translate GLSL to SPIR-V, which is then translated to HLSL. This means it always exposes the same GLSL version that the SPIR-V compiler supports (4.50 for shaderc/glslang.) Despite claiming to support GLSL 4.50, some features that are tied to the GLSL version in OpenGL are not supported by ra_d3d11 when targeting legacy Direct3D feature levels. This includes two features that mpv relies on: - Reading from gl_FragCoord in the fragment shader (requires FL 10_0) - textureGather from any texture component (requires FL 11_0) These features have been exposed as new RA caps.
* vo_gpu: d3d11: initial implementationJames Ross-Gowan2017-11-074-0/+2699
This is a new RA/vo_gpu backend that uses Direct3D 11. The GLSL generated by vo_gpu is cross-compiled to HLSL with SPIRV-Cross. What works: - All of mpv's internal shaders should work, including compute shaders. - Some external shaders have been tested and work, including RAVU and adaptive-sharpen. - Non-dumb mode works, even on very old hardware. Most features work at feature level 9_3 and all features work at feature level 10_0. Some features also work at feature level 9_1 and 9_2, but without high-bit- depth FBOs, it's not very useful. (Hardware this old is probably not fast enough for advanced features anyway.) Note: This is more compatible than ANGLE, which requires 9_3 to work at all (GLES 2.0,) and 10_1 for non-dumb-mode (GLES 3.0.) - Hardware decoding with D3D11VA, including decoding of 10-bit formats without truncation to 8-bit. What doesn't work / can be improved: - PBO upload and direct rendering does not work yet. Direct rendering requires persistent-mapped PBOs because the decoder needs to be able to read data from images that have already been decoded and uploaded. Unfortunately, it seems like persistent-mapped PBOs are fundamentally incompatible with D3D11, which requires all resources to use driver- managed memory and requires memory to be unmapped (and hence pointers to be invalidated) when a resource is used in a draw or copy operation. However it might be possible to use D3D11's limited multithreading capabilities to emulate some features of PBOs, like asynchronous texture uploading. - The blit() and clear() operations don't have equivalents in the D3D11 API that handle all cases, so in most cases, they have to be emulated with a shader. This is currently done inside ra_d3d11, but ideally it would be done in generic code, so it can take advantage of mpv's shader generation utilities. - SPIRV-Cross is used through a NIH C-compatible wrapper library, since it does not expose a C interface itself. The library is available here: https://github.com/rossy/crossc - The D3D11 context could be made to support more modern DXGI features in future. For example, it should be possible to add support for high-bit-depth and HDR output with DXGI 1.5/1.6.