summaryrefslogtreecommitdiffstats
path: root/video/out/d3d11
Commit message (Collapse)AuthorAgeFilesLines
* vo_gpu: d3d11: only use presentation feedback with flip modelJames Ross-Gowan2020-05-071-4/+12
| | | | | | | | | | | The current implementation of presentation feedback was designed to be used with flip model presentation. With the bitblt model, GetFrameStatistics returns totally different values and it's not clear if we can use them at all. Previously, this wasn't a problem because with the bitblt model, GetFrameStatistics only worked in exclusive fullscreen. Now that mpv supports exclusive fullscreen, we should explicitly check for a flip model swapchain before using presentation feedback.
* vo_gpu: d3d11: also utilize config cache for d3d11-specific optionsJan Ekström2020-04-121-1/+9
| | | | | Update the cache whenever we utilize p->opts, as recommended by wm4.
* vo_gpu: d3d11: add support for exclusive fullscreenJames Ross-Gowan2020-04-121-1/+86
| | | | | Lets the application fully control the rendering onto the screen instead of the compositor.
* options: change option macros and all option declarationswm42020-03-182-31/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change all OPT_* macros such that they don't define the entire m_option initializer, and instead expand only to a part of it, which sets certain fields. This requires changing almost every option declaration, because they all use these macros. A declaration now always starts with {"name", ... followed by designated initializers only (possibly wrapped in macros). The OPT_* macros now initialize the .offset and .type fields only, sometimes also .priv and others. I think this change makes the option macros less tricky. The old code had to stuff everything into macro arguments (and attempted to allow setting arbitrary fields by letting the user pass designated initializers in the vararg parts). Some of this was made messy due to C99 and C11 not allowing 0-sized varargs with ',' removal. It's also possible that this change is pointless, other than cosmetic preferences. Not too happy about some things. For example, the OPT_CHOICE() indentation I applied looks a bit ugly. Much of this change was done with regex search&replace, but some places required manual editing. In particular, code in "obscure" areas (which I didn't include in compilation) might be broken now. In wayland_common.c the author of some option declarations confused the flags parameter with the default value (though the default value was also properly set below). I fixed this with this change.
* vo_gpu/d3d11: add support for configuring swap chain color spaceJan Ekström2019-10-301-0/+12
| | | | | | | | | | | | | | | | By default utilizes the color space of the desktop on which the swap chain is located. If a specific value is defined, it will be instead be utilized. Enables configuration of the PQ color space (BT.2020 primaries, PQ transfer function) for HDR. Additionally, signals the swap chain color space to the renderer, so that the render looks correct without having to specify target-trc or target-prim manually. Due to all of the APIs being Win10+ only, will only work starting with Windows 10.
* vo_gpu: d3d11: set the ra_format.storable flagJames Ross-Gowan2019-10-271-3/+12
| | | | | This flag was added in e2976e662d92, but it was only set for Vulkan. In D3D11 it can be set from info in D3D11_FEATURE_FORMAT_SUPPORT2.
* vo_gpu: d3d11: prevent wraparound in queued frames calcJames Ross-Gowan2019-10-261-1/+2
| | | | | | | If expected_sync_pc is greater than submit_count, the unsigned subtraction will wraparound, which breaks playback. This bug was found while experimenting with bit-blt model present, but it might be possible to trigger it with the flip model as well, if there was a dropped frame.
* vo_gpu/d3d11: switch adapter selection to case-insensitive startswithJan Ekström2019-10-151-1/+1
| | | | | | This lets users set values such as "intel" or "nvidia" as the adapter vendor is generally noted in the beginning of the description string.
* vo_gpu/d3d11: fixup adapter selection by switching it all to bstrJan Ekström2019-10-151-1/+1
| | | | | I did ponder if I should have done this right away, and it seems like not doing it at first was a mistake.
* vo_gpu/d3d11: add support for configuring swap chain formatJan Ekström2019-10-131-0/+9
| | | | | | | Query information on the system output most linked to the swap chain, and either utilize a user-configured format, or either 8bit RGBA or 10bit RGB with 2bit alpha depending on the system output's bit depth.
* vo_gpu/d3d11: utilize actual backbuffer values for bit depthJames Ross-Gowan2019-10-131-1/+6
| | | | | And if backbuffer is not around, return an error value utilized elsewhere already.
* vo_gpu: d3d11: use linear filtering for wrapped texturesJames Ross-Gowan2019-10-101-1/+3
| | | | | | | | | | This affects hwdec_dxva2dxgi, which uses ra_d3d11_wrap_tex to wrap RGB video frames that are shared with a D3D9 device. Without it, mpv uses nearest instead of bilinear scaling with --scale=bilinear (the default) and --hwdec=dxva2. It's kind of hard to believe this bug has gone unnoticed for almost two years, but that seems to have been the case. Fixes: #7042
* vo_gpu/d3d11: add adapter name validation and listing with "help"Jan Ekström2019-09-291-1/+36
| | | | Not the prettiest way to get it done, but seems to work.
* vo_gpu/d3d11: add an option for the adapter nameJan Ekström2019-09-291-0/+1
| | | | Set it from the adapter name in the d3d11 options.
* video/d3d11: add adapter selection by name into d3d11 optionsJan Ekström2019-09-291-0/+3
| | | | | This lets the user define an adapter name that can then be passed further into the internals.
* vo: make swapchain-depth option generic for all VOsAnton Kindestam2019-09-281-2/+2
| | | | In preparation for making vo_drm able to use swapchain-depth
* vo_gpu: d3d11: don't reset frame stats after pauseJames Ross-Gowan2019-09-261-9/+0
| | | | | | | | | | | | | | | | | | | | | | I think I was wrong about having to reset the stats when mpv stops producing frames, eg. when it's paused. As long as the swapchain doesn't underflow, last_queue_display_time will still be accurate, because the next frame produced should still be presented one vsync after the last one in the swapchain. If the swapchain underflows (which is the common case for when mpv is paused for more than 150ms,) the next predicted frame time should be in the past. It should be fine to leave last_queue_display_time unset in this case, since vo.c will use the current time instead, which is a decent guess (though it doesn't take vsync phase into account.) last_sync_refresh_count and last_sync_qpc_time should be kept on swapchain underflow as well. Assuming the display refresh rate doesn't change while mpv is paused, they'll only provide a more accurate guess of the vsync duration when mpv starts playing again. Also, vsync_duration_qpc never needs to get reset. It will get overwritten immediately in most cases, and when it doesn't, it's still a better guess of the vsync duration than nothing.
* vo_gpu: d3d11: add support for presentation feedbackJames Ross-Gowan2019-09-221-0/+124
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds vsync reporting to the D3D11 backend using the presentation feedback provided by DXGI, which is pretty similar to what's provided by GLX_OML_sync_control in the GLX backend. In DirectX, PresentCount is the SBC, PresentRefreshCount and SyncRefreshCount are kind of like the MSC and SyncQPCTime is the UST. Unlike GLX, the DXGI API makes it possible for PresentCount and SyncQPCTime to refer to different physical vsyncs, in which case PresentRefreshCount and SyncRefreshCount will be different. The code supports this possibility, even though it's not clear whether it can happen when using flip-model presentation. The docs say for flip-model apps, PresentRefreshCount is equal to SyncRefreshCount "when the app presents on every vsync," but on my hardware, they're always equal, even when mpv misses a vsync. They can definitely be different in exclusive fullscreen bitblt mode, though, which mpv doesn't support now, but might support in future. Another difference to GLX is that, at least on my hardware, PresentRefreshCount and SyncRefreshCount always refer to the last physical vsync on which mpv presented a frame, but glxGetSyncValues can apparently return a MSC and UST from the most recent physical vsync, even if mpv didn't present a new frame on it. This might result in different behaviour between the two backends after dropped frames or brief pauses. Also note, the docs for the DXGI presentation feedback APIs are pretty awful, even by Microsoft standards. In particular the docs for DXGI_FRAME_STATISTICS are misleading (PresentCount really is the number of times Present() has been called for that frame, not "the running total count of times that an image was presented to the monitor since the computer booted.") For good documentation, try these: https://docs.microsoft.com/en-us/windows/win32/direct3ddxgi/dxgi-flip-model https://docs.microsoft.com/en-us/windows/win32/direct3d9/d3dpresentstats https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/content/d3dkmthk/ns-d3dkmthk-_d3dkmt_present_stats (Yeah, the docs for the D3D9Ex and even the kernel-mode version of this structure are better than the DXGI ones. It seems possible that they're all rewordings of the same internal Microsoft docs, but whoever wrote the DXGI one didn't really understand it.)
* vo_d3d11/context: fix crash due to ctx->ra is null pointer accessHui Jin2019-09-141-2/+4
| | | | 'ctx->ra' is null pointer when d3d11 init failed before call 'ra_d3d11_create' in 'd3d11_init'.
* vo_d3d11/hwdec_dxva2dxgi: fix memory leak that 'ctx11' be not releaseHui Jin2019-09-141-0/+6
| | | | 'ctx11' be not release when d3d11 hwdec be uninit with 'mapper_uninit' method.
* vo_gpu: d3d11: fix storage lifetime of compound literalsJames Ross-Gowan2019-08-201-8/+15
| | | | | | | | | | | | | | | | Somehow I got the idea that compound literals had function-scoped lifetime. Instead, like all other objects with automatic storage duration, compound literals are block-scoped, so they become invalid after exiting the block they were declared in. It seems like a recent change to GCC actually reuses the memory that the compound literals used to occupy, which was causing a few bugs. The pattern of conditionally assigning a pointer to a compound literal was used in a few places in ra_d3d11 where the Direct3D API expects either a pointer to an initialised struct or NULL. Change these to ensure the lifetime of the struct includes the API call. Should fix #6775.
* video/out/gpu: Add a `storable` flag to ra_formatPhilip Langdale2019-07-081-0/+3
| | | | | | | | | | | | | | | | While `ra` supports the concept of a texture as a storage destination, it does not support the concept of a texture format being usable for a storage texture. This can lead to us attempting to create a texture from an incompatible format, with undefined results. So, let's introduce an explicit format flag for storage and use it. In `ra_pl` we can simply reflect the `storable` flag. For GL and D3D, we'll need to write some new code to do the compatibility checks. I'm not going to do it here because it's not a regression; we were already implicitly assuming all formats were storable. Fixes #6657
* vo_gpu: d3d11: use the SPIRV-Cross C API directlyJames Ross-Gowan2019-06-121-19/+67
| | | | | | | | | | When the D3D11 backend was first written, SPIRV-Cross only had a C++ API and no guarantee of API or ABI stability, so instead of using SPIRV-Cross directly, mpv used an unofficial C wrapper called crossc. Now that KhronosGroup/SPIRV-Cross#611 is resolved, SPIRV-Cross has an official C API that can be used instead, so remove crossc and use SPIRV-Cross directly.
* vo_gpu: index desc namespaces by raNiklas Haas2019-04-211-1/+1
| | | | | No reason to require them be constant. This allows them to depend on runtime characteristics of the `ra`.
* vo_gpu: d3d11: implement tex_download()James Ross-Gowan2018-02-132-7/+57
| | | | | | | This allows the new GPU screenshot functionality introduced in 9f595f3a80ee to work with the D3D11 backend. It replaces the old window screenshot functionality, which was shared between D3D11 and ANGLE. The old code can be removed, since it's not needed by ANGLE anymore either.
* video: rewrite filtering glue codewm42018-01-301-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of the old vf.c code. Replace it with a generic filtering framework, which can potentially handle more than just --vf. At least reimplementing --af with this code is planned. This changes some --vf semantics (including runtime behavior and the "vf" command). The most important ones are listed in interface-changes. vf_convert.c is renamed to f_swscale.c. It is now an internal filter that can not be inserted by the user manually. f_lavfi.c is a refactor of player/lavfi.c. The latter will be removed once --lavfi-complex is reimplemented on top of f_lavfi.c. (which is conceptually easy, but a big mess due to the data flow changes). The existing filters are all changed heavily. The data flow of the new filter framework is different. Especially EOF handling changes - EOF is now a "frame" rather than a state, and must be passed through exactly once. Another major thing is that all filters must support dynamic format changes. The filter reconfig() function goes away. (This sounds complex, but since all filters need to handle EOF draining anyway, they can use the same code, and it removes the mess with reconfig() having to predict the output format, which completely breaks with libavfilter anyway.) In addition, there is no automatic format negotiation or conversion. libavfilter's primitive and insufficient API simply doesn't allow us to do this in a reasonable way. Instead, filters can use f_autoconvert as sub-filter, and tell it which formats they support. This filter will in turn add actual conversion filters, such as f_swscale, to perform necessary format changes. vf_vapoursynth.c uses the same basic principle of operation as before, but with worryingly different details in data flow. Still appears to work. The hardware deint filters (vf_vavpp.c, vf_d3d11vpp.c, vf_vdpaupp.c) are heavily changed. Fortunately, they all used refqueue.c, which is for sharing the data flow logic (especially for managing future/past surfaces and such). It turns out it can be used to factor out most of the data flow. Some of these filters accepted software input. Instead of having ad-hoc upload code in each filter, surface upload is now delegated to f_autoconvert, which can use f_hwupload to perform this. Exporting VO capabilities is still a big mess (mp_stream_info stuff). The D3D11 code drops the redundant image formats, and all code uses the hw_subfmt (sw_format in FFmpeg) instead. Although that too seems to be a big mess for now. f_async_queue is unused.
* vo_gpu: hwdec_dxva2dxgi: initial implementationJames Ross-Gowan2018-01-062-0/+466
| | | | | | | | | | | | | This enables DXVA2 hardware decoding with ra_d3d11. It should be useful for Windows 7, where D3D11VA is not available. Images are transfered from D3D9 to D3D11 using D3D9Ex surface sharing[1]. Following Microsoft's recommendations, it uses a queue of shared surfaces, similar to Microsoft's ISurfaceQueue. This will hopefully prevent surface sharing from impacting parallelism and allow multiple D3D11 frames to be in-flight at once. [1]: https://msdn.microsoft.com/en-us/library/windows/desktop/ee913554.aspx
* vo_gpu: d3d11: check for NULL backbuffer in start_frameJames Ross-Gowan2018-01-041-2/+6
| | | | | | | | | | | | | In a lost device scenario, resize() will fail and p->backbuffer will be NULL. We can't recover from lost devices yet, but we should still check for a NULL backbuffer in start_frame() rather than crashing. Also remove a NULL check for p->swapchain. This was a red herring, since p->swapchain never becomes NULL in an error condition, but p->backbuffer actually does. This should fix the crash in #5320, but it doesn't fix the underlying reason for the lost device (which is probably a driver bug.)
* vo_gpu: d3d11: avoid copying staging buffers to cbuffersJames Ross-Gowan2018-01-011-48/+15
| | | | | | | | | | | | | | | | Apparently some Intel drivers have a bug where copying from staging buffers to constant buffers does not work. We used to keep a copy of the buffer data in a staging buffer to enable partial constant buffer updates. To work around this bug, keep the copy in talloc-allocated system memory instead. There doesn't seem to be any noticable performance difference from keeping the copy in system memory. Our cbuffers are probably too small for it to matter anyway. See also: https://crbug.com/593024 Fixes #5293
* vo_gpu: d3d11: check for timestamp query supportJames Ross-Gowan2017-12-091-0/+9
| | | | | | | | | | Apparently timestamp queries are optional for 10level9 devices. Check for support when creating the device rather than spamming error messages during rendering. CreateQuery can be used to check for support by passing NULL as the final parameter. See: https://msdn.microsoft.com/en-us/library/windows/desktop/ff476150.aspx#ID3D11Device_CreateQuery
* video: remove some more hwdec legacy stuffwm42017-12-021-4/+1
| | | | | | | | | Finally get rid of all the HWDEC_* things, and instead rely on the libavutil equivalents. vdpau still uses a shitty hack, but fuck the vdpau code. Remove all the now unneeded remains. The vdpau preemption thing was not unused anymore; if someone cares this could probably be restored.
* video: move d3d.c out of decode sub directorywm42017-12-011-1/+1
| | | | | | It makes more sense to have it in the general video directory (along with vdpau.c and vaapi.c), since the decoder source files don't even access it anymore.
* vo_gpu: hwdec: remove redundant fieldswm42017-12-011-1/+0
| | | | | | | | | | | | | The testing_only field is not referenced anymore with vaglx removed and the previous commit dropping all uses. The ra_hwdec_driver.api field became unused with the previous commit, but all hwdec interop drivers still initialized it. Since this touches highly OS-specific code, build regressions are possible (plus the previous commit might break hw decoding at runtime). At least hwdec_cuda.c still used the .api field, other than initializing it.
* vo_gpu: d3d11: don't use runtime version for UAV slot countJames Ross-Gowan2017-11-191-1/+1
| | | | | | FL 11_1 is only supported with the Direct3D 11.1 runtime anyway, so there is no need to check both the runtime version and the feature level.
* vo_gpu: d3d11: mark the bgra8 format as unorderedJames Ross-Gowan2017-11-191-1/+1
| | | | Whoops. I was confused by the double-negative here.
* vo_gpu: d3d11: remove flipped texture upload hackJames Ross-Gowan2017-11-121-8/+0
| | | | Made unnecessary by 4a6b04bdb930.
* vo_gpu: hwdec_d3d11va: allow zero-copy video decodingJames Ross-Gowan2017-11-073-62/+159
| | | | | | | | | | | | | | Like the manual says, this is technically undefined behaviour. See: https://msdn.microsoft.com/en-us/library/windows/desktop/ff476085.aspx In particular, MSDN says texture arrays created with the BIND_DECODER flag cannot be used with CreateShaderResourceView, which means they can't be sampled through SRVs like normal Direct3D textures. However, some programs (Google Chrome included) do this anyway for performance and power-usage reasons, and it appears to work with most drivers. Older AMD drivers had a "bug" with zero-copy decoding, but this appears to have been fixed. See #3255, #3464 and http://crbug.com/623029.
* vo_gpu: d3d11: enhance cache invalidationJames Ross-Gowan2017-11-071-4/+70
| | | | | | | | The shader cache in ra_d3d11 caches the result of shaderc, crossc and the D3DCompiler DLL, so it should be invalidated when any of those components are updated. This should make the cache more reliable, which makes it safer to enable gpu-shader-cache-dir. Shader compilation is slow with D3D11, so gpu-shader-cache-dir is highly necessary
* vo_gpu: d3d11: log shader compilation timesJames Ross-Gowan2017-11-071-0/+26
| | | | | | | Some shaders take a _long_ time to compile with the Direct3D compiler. The ANGLE backend had this problem too, to a certain extent. Logging should help identify which shaders cause long stalls and could also help with benchmarking ways of reducing compile times.
* vo_gpu: move d3d11_screenshot to shared codeJames Ross-Gowan2017-11-071-0/+9
| | | | This can be used by the ANGLE backend and ra_d3d11.
* vo_gpu: d3d11: add RA caps for ra_d3d11James Ross-Gowan2017-11-071-0/+5
| | | | | | | | | | | | | | | | ra_d3d11 uses the SPIR-V compiler to translate GLSL to SPIR-V, which is then translated to HLSL. This means it always exposes the same GLSL version that the SPIR-V compiler supports (4.50 for shaderc/glslang.) Despite claiming to support GLSL 4.50, some features that are tied to the GLSL version in OpenGL are not supported by ra_d3d11 when targeting legacy Direct3D feature levels. This includes two features that mpv relies on: - Reading from gl_FragCoord in the fragment shader (requires FL 10_0) - textureGather from any texture component (requires FL 11_0) These features have been exposed as new RA caps.
* vo_gpu: d3d11: initial implementationJames Ross-Gowan2017-11-074-0/+2699
This is a new RA/vo_gpu backend that uses Direct3D 11. The GLSL generated by vo_gpu is cross-compiled to HLSL with SPIRV-Cross. What works: - All of mpv's internal shaders should work, including compute shaders. - Some external shaders have been tested and work, including RAVU and adaptive-sharpen. - Non-dumb mode works, even on very old hardware. Most features work at feature level 9_3 and all features work at feature level 10_0. Some features also work at feature level 9_1 and 9_2, but without high-bit- depth FBOs, it's not very useful. (Hardware this old is probably not fast enough for advanced features anyway.) Note: This is more compatible than ANGLE, which requires 9_3 to work at all (GLES 2.0,) and 10_1 for non-dumb-mode (GLES 3.0.) - Hardware decoding with D3D11VA, including decoding of 10-bit formats without truncation to 8-bit. What doesn't work / can be improved: - PBO upload and direct rendering does not work yet. Direct rendering requires persistent-mapped PBOs because the decoder needs to be able to read data from images that have already been decoded and uploaded. Unfortunately, it seems like persistent-mapped PBOs are fundamentally incompatible with D3D11, which requires all resources to use driver- managed memory and requires memory to be unmapped (and hence pointers to be invalidated) when a resource is used in a draw or copy operation. However it might be possible to use D3D11's limited multithreading capabilities to emulate some features of PBOs, like asynchronous texture uploading. - The blit() and clear() operations don't have equivalents in the D3D11 API that handle all cases, so in most cases, they have to be emulated with a shader. This is currently done inside ra_d3d11, but ideally it would be done in generic code, so it can take advantage of mpv's shader generation utilities. - SPIRV-Cross is used through a NIH C-compatible wrapper library, since it does not expose a C interface itself. The library is available here: https://github.com/rossy/crossc - The D3D11 context could be made to support more modern DXGI features in future. For example, it should be possible to add support for high-bit-depth and HDR output with DXGI 1.5/1.6.