| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
This change is mostly motivated by missing
VK_KHR_portability_enumeration instance extension when enumerating the
devices. Which causes issues with MoltenVK which does not advertise full
Vulkan conformance.
To avoid duplicating code use pl_vk_inst_create() which correctly query
availability and enables the mentioned extension.
While at it fix the VkInstance leaking in vk_validate_dev().
|
|
|
|
|
|
|
| |
As no drivers were ever released with the unstable extension,
it's not needed anymore.
Not bumping the required headers version yet.
|
|
|
|
| |
Needed for the next commit
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The opt validator functions are casted to generic validator, which has
erased type for value. Calling function by pointer of different
definition is an UB.
Avoid that by generating wrapper function that does proper argument type
conversion and calls validator function. Type erased functions have
mangled type in the name.
Fixes UBSAN failures on Clang 17, which enabled fsanitize=function by
default.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently only allow specifying the Vulkan device to use by name. We
did this to avoid confusion around devices being enumerated in an
unpredictable order. However, there is a valid edge case where a system
may contain multiple devices of the same type - which means they will
have the same name, and so you can't control which one is used.
This change implements picking devices by UUID so that if names don't
work, you have some option available. As Vulkan 1.1 is a hard
requirement for libplacebo, we can just use UUIDs without conditional
checks.
Fixes #10898
|
|
|
|
|
|
|
|
|
|
|
| |
The dmabuf-wayland vo has a stub ra implementation that doesn't
have a swapchain. That means that it's currently not safe to call
ra_vk_ctx_get on that ra_ctx, but it must be safe to call on all ra
implementations as this is how we discover if it is a vulkan ra.
This hasn't been an issue before because no Vulkan code paths would be
triggered when using dmabuf-wayland, but with the new vulkan hwdec, it
becomes possible to trigger when hwdecs are probed.
|
|
|
|
|
|
|
|
|
|
|
| |
AV1 support in Vulkan is extremely bleeding edge - to the point that
the extension is not present in official Khronos releases, but it has
a reserved identifier and we can look it up with a string literal for
now.
This will be skipped and ignored if the driver doesn't support it, so
it's safe if/when the name changes later (it'll just never be activated
in that case).
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I originally wrote this trying to avoid doing an explicit version check
on the headers, but it just makes things more confusing, and the
requirements harder to understand.
So, Vulkan interop now takes a dependency on the header release where
they finalised the video decode headers. VK_EXT_descriptor_buffer was
added in 1.3.235, so that's covered as well.
Along the way I fixed a bug in the waf build where it was depending
on libplacebo-next instead of libplacebo-decode.
|
|
|
|
| |
It is not needed at all.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vulkan hwdec interop with the ffmpeg 6.1 vulkan code will require
additional features beyond those activated by libplacebo by default.
Enabling these features requires both requesting the features'
extensions and then explicitly turning on the features. libplacebo
handles detecting unsupported features and dropping them, to avoid
failing to create the vulkan device.
We then leave it to ffmpeg to decide if any missing features are
required for functionality, and error out if necessary.
As ffmpeg requires at least one bleeding edge extension (descriptor
buffers), all of this logic is gated on the presence of sufficiently
new Vulkan headers.
|
|
|
|
|
|
| |
c78482045444c488bb7948305d583a55d17cd236 introduced a bool option type
as a replacement for the flag type, but didn't actually transition and
remove the flag type because it would have been too much mundane work.
|
|
|
|
|
|
| |
This has had no effect since libplacebo v4.192.0, and was deprecated
upstream a year ago. No deprecation period in mpv is justified by this
being a debug / work-around option.
|
|
|
|
|
|
| |
As found out by @philipl, failing to pass this from the VkInstance to
the VkDevice is bad style. We might want to override the get_proc_addr
pointer in the future.
|
|
|
|
|
| |
Use the pl_log APIs introduced in libplacebo v4, replacing the
deprecated pl_context concept.
|
|
|
|
|
|
|
|
|
| |
In practice, this is for wayland. vo_gpu_next doesn't check the
check_visible parameter since it didn't descend into the
vulkan/context.c file when starting a frame. To make this happen, just
call the start_frame function pointer but pass NULL as the ra_fbo. In
there, we can do the visibility checks and after that bail out of the
start_frame function if ra_fbo is NULL.
|
|
|
|
|
|
|
|
|
|
| |
This vulkan-specific parameter was poorly named and probably causes
confusion. Just rename it to check_visible instead to make clear what is
going on here. Only wayland uses it for now but in theory anyone else
can. As an aside, wayland egl accomplishes this by using an external
swapchain instead (an opengl-only concept in the mpv code). This may or
may not need to be changed in the future when gpu-next gets opengl
support.
|
|
|
|
| |
So I can reuse it in vo_gpu_next.
|
|
|
|
| |
No longer needed with the bump to v3.104.
|
|
|
|
|
|
|
|
|
|
|
| |
The VkDisplayKHR context type requires making calls against the
physical device before the libplacebo context is initialised.
That means we can't simply use the physical device object that
libplacebo would create - instead we have to create a separate one,
but make sure it's referring to the same physical device.
To that end, we need the device name that the user may have requested
so we can pass it on.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Today, validation is only possible for string type options. But there's
no particular reason why it needs to be restricted in this way, and
there are potential uses, to allow other options to be validated
without forcing the option to have to reimplement parsing from
scratch.
The first part, simply making the validation function an explicit
field instead of overloading priv is simple enough. But if we only do
that, then the validation function still needs to deal with the raw
pre-parsed string. Instead, we want to allow the value to be parsed
before it is validated. That in turn leads to us having validator
functions that should be type aware. Unfortunately, that means we need
to keep the explicit macro like OPT_STRING_VALIDATE() as a way to
enforce the correct typing of the function. Otherwise, we'd have to
have the validator take a void * and hope the implementation can cast
it correctly.
For help, we don't have this problem, as help doesn't look at the
value.
Then, we turn validators that are really help generators into explicit
help functions and where a validator is help + validation, we split
them into two parts.
I have, however, left functions that need to query information for both
help and validation as single functions to avoid code duplication.
In this change, I have not added an other OPT_FOO_VALIDATE() macros as
they are not needed, but I will add some in a separate change to
illustrate the pattern.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Back in the olden days, mpv's wayland backend was driven by the frame
callback. This had several issues and was removed in favor of the
current approach which allowed some advanced features (like
display-resample and presentation time) to actually work properly.
However as a consequence, it meant that mpv always rendered, even if the
surface was hidden. Wayland people consider this "wasteful" (and well
they aren't wrong). This commit aims to avoid wasteful rendering by
doing some additional checks in the swapchain. There's three main parts
to this.
1. Wayland EGL now uses an external swapchain (like the drm context).
Before we start a new frame, we check to see if we are waiting on a
callback from the compositor. If there is no wait, then go ahead and
proceed to render the frame, swap buffers, and then initiate
vo_wayland_wait_frame to poll (with a timeout) for the next potential
callback. If we are still waiting on callback from the compositor when
starting a new frame, then we simple skip rendering it entirely until
the surface comes back into view.
2. Wayland on vulkan has essentially the same approach although the
details are a little different. The ra_vk_ctx does not have support for
an external swapchain and although such a mechanism could theoretically
be added, it doesn't make much sense with libplacebo. Instead,
start_frame was added as a param and used to check for callback.
3. For wlshm, it's simply a matter of adding frame callback to it,
leveraging vo_wayland_wait_frame, and using the frame callback value to
whether or not to draw the image.
|
|
|
|
|
|
|
|
|
|
|
| |
libplacebo exposes this feature already, because this particular type of
bug is unusually common in practice. Simply make use of it, by exposing
it as an option.
Could probably also bump the libplacebo minimum version to get rid of
the #if, but that would break debian oldoldstable or something.
Fixes #7867.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change all OPT_* macros such that they don't define the entire m_option
initializer, and instead expand only to a part of it, which sets certain
fields. This requires changing almost every option declaration, because
they all use these macros. A declaration now always starts with
{"name", ...
followed by designated initializers only (possibly wrapped in macros).
The OPT_* macros now initialize the .offset and .type fields only,
sometimes also .priv and others.
I think this change makes the option macros less tricky. The old code
had to stuff everything into macro arguments (and attempted to allow
setting arbitrary fields by letting the user pass designated
initializers in the vararg parts). Some of this was made messy due to
C99 and C11 not allowing 0-sized varargs with ',' removal. It's also
possible that this change is pointless, other than cosmetic preferences.
Not too happy about some things. For example, the OPT_CHOICE()
indentation I applied looks a bit ugly.
Much of this change was done with regex search&replace, but some places
required manual editing. In particular, code in "obscure" areas (which I
didn't include in compilation) might be broken now.
In wayland_common.c the author of some option declarations confused the
flags parameter with the default value (though the default value was
also properly set below). I fixed this with this change.
|
|
|
|
|
| |
This was added in libplacebo v1.29.0 and allows making resizes slightly
smoother for clients that already handle resize events (such as mpv).
|
|
|
|
|
|
|
|
|
|
| |
There's 2 stupid things here that need to be fixed. First of all,
vulkan wasn't actually using presentation time because somehow the
get_vsync function in context.c disappeared. Secondly, if the mpv window
was hidden it was updating the ust time based on the refresh_usec but
really it should simply just not feed any information to the vsync info
structure. So this adds some logic to assume whether or not a window is
hidden.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old way of using wayland in mpv relied on an external renderloop for
semi-accurate timings. This had multiple issues though. Display sync
would break whenever the window was hidden (since the frame callback
stopped being executed) which was really annoying. Also the entire
external renderloop logic was kind of fragile and didn't play well with
mpv's internal structure (i.e. using presentation time in that old
paradigm breaks stats.lua).
Basically the problem is that swap buffers blocks on wayland which is
crap whenever you hide the mpv window since it looks up the entire
player. So you have to make swap buffers not block, but this has a
different problem. Timings will be terrible if you use the unblocked
swap buffers call.
Based on some discussion in #wayland, the trick here is relatively
simple and works well enough for our purposes. Instead we basically
build a way to block with a timeout in the wayland buffer swap
functions.
A bool is set in the frame callback function that indicates whether or
not mpv is waiting for a frame to be displayed. In the actual buffer
swap function, we enter into a while loop waiting for this flag to be
set. At the same time, the wl_display is polled to block the thread and
wakeup if it receives any events from the compositor. This loop only
breaks if enough time has passed or if the frame callback bool is
received.
In the near future, it is better to set whether or not frame a frame has
been displayed in the presentation feedback. However as a first pass,
doing it in the frame callback is more than good enough.
The "downside" is that we render frames that aren't actually shown on
screen when the player is hidden (it seems like wayland people don't
like that). But who cares. Accurate timings are way more important. It's
probably not too hard to add that behavior back in the player though.
|
|
|
|
| |
In preparation for making vo_drm able to use swapchain-depth
|
|
|
|
|
|
|
|
|
| |
We collect a 'vulkan-device' option today but then don't actually
pass it on, so it's useless. Once that's fixed, it can be used
to select a specific vulkan device by name.
Tested with the new nvidia offload feature to select between the
nvidia and intel GPUs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit rips out the entire mpv vulkan implementation in favor of
exposing lightweight wrappers on top of libplacebo instead, which
provides much of the same except in a more up-to-date and polished form.
This (finally) unifies the code base between mpv and libplacebo, which
is something I've been hoping to do for a long time.
Note: The ra_pl wrappers are abstract enough from the actual libplacebo
device type that we can in theory re-use them for other devices like
d3d11 or even opengl in the future, so I moved them to a separate
directory for the time being. However, the rest of the code is still
vulkan-specific, so I've kept the "vulkan" naming and file paths, rather
than introducing a new `--gpu-api` type. (Which would have been ended up
with significantly more code duplicaiton)
Plus, the code and functionality is similar enough that for most users
this should just be a straight-up drop-in replacement.
Note: This commit excludes some changes; specifically, the updates to
context_win and hwdec_cuda are deferred to separate commits for
authorship reasons.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By design, some vulkan implementations block until vsync during
vkAcquireNextImageKHR. Since mpv only considers the time that
`swap_buffers` spent blocking as constituting part of the vsync, we can
help it out a bit by pre-emptively calling this function here in order
to improve the accuracy of vsync jitter measurements on vulkan.
(If it fails, we just ignore the error and have the user call it a
second time later - maybe it will work then)
On my system this drops vsync-jitter from ~0.030 to ~0.007, an accuracy
of +/- 100μs. (Which *might* have something to do with the fact that
this is the polling interval for command polling)
|
|
|
|
|
|
|
|
| |
Makes performance slightly better when using multiple queues by avoiding
unnecessary semaphores due to bad queue selection.
Also remove an aeons-old workaround for an nvidia bug that only ever
existed in the earliest beta vulkan drivers anyway.
|
|
|
|
|
|
| |
Since the code just broke out of the loop on a match rather than jumping
straight to the end of the function body, it ended up hitting the code
path for when the end of the list was reached.
|
|
|
|
|
|
|
|
|
|
|
| |
There is now a better way. Reading the font framebuffer was always a
hack. The new code via VOCTRL_SCREENSHOT renders it into a FBO, which
does not come with the disadvantages of reading the front buffer (like
not being supported by GLES, possibly black regions due to overlapping
windows on some systems).
For now keep VOCTRL_SCREENSHOT_WIN on the VO level, because there are
still some lesser VOs and backends that use it.
|
|
|
|
|
|
|
|
|
| |
Async compute in particular seems to cause problems on some drivers, and
even when supprted the benefits are not that massive from the tests I
have seen, so it's probably safe to keep off by default.
Async transfer on the other hand seems to work better and offers a more
substantial improvement, so it's kept on.
|
|
|
|
|
| |
This is now associated with the ra_tex directly and used in the correct
way, rather than hackily done from submit_frame.
|
|
|
|
|
| |
Now handles both VK_ERROR_OUT_OF_DATE_KHR and VK_SUBOPTIMAL_KHR for both
vkAcquireNextImageKHR and vkQueuePresentKHR in the correct way.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of using a single primary queue, we generate multiple
vk_cmdpools and pick the right one dynamically based on the intent.
This has a number of immediate benefits:
1. We can use async texture uploads
2. We can use the DMA engine for buffer updates
3. We can benefit from async compute on AMD GPUs
Unfortunately, the major downside is that due to the lack of QF
ownership tracking, we need to use CONCURRENT sharing for all resources
(buffers *and* images!). In theory, we could try figuring out a way to
get rid of the concurrent sharing for buffers (which is only needed for
compute shader UBOs), but even so, the concurrent sharing mode doesn't
really seem to have a significant impact over here (nvidia). It's
possible that other platforms may disagree.
Our deadlock-avoidance strategy is stupidly simple: Just flush the
command every time we need to switch queues, and make sure all
submission and callbacks happen in FIFO order. This required lifting the
cmds_pending and cmds_queued out from vk_cmdpool to mpvk_ctx, and some
functions died/got moved as a result, but that's a relatively minor
change.
On my hardware this is a fairly significant performance boost, mainly
due to async transfers. (Nvidia doesn't expose separate compute queues
anyway). On AMD, this should be a performance boost as well due to async
compute.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This uses the new vk_signal mechanism to order all access to textures.
This has several advantageS:
1. It allows real synchronization of image access across multiple frames
when using multiple queues for parallelism.
2. It allows using events instead of pipeline barriers, which is a
finer-grained synchronization primitive that allows for more
efficient layout transitions over longer durations.
This commit also restructures some of the implicit transition code for
renderpasses to be more flexible and correct. (Note: this technically
drops the ability to transition the image out of undefined layout when
not blending, but that was a bug anyway and needs to be done properly)
vo_gpu: vulkan: remove no-longer-true optimization
The change to the output_tex format makes this no longer true, and it
actually seems to hurt performance now as well. So just don't do it
anymore. I also realized it hurts performance when drawing an OSD, so
it's probably not a good idea anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of being submitted immediately, commands are appended into an
internal submission queue, and the actual submission is done once per
frame (at the same time as queue cycling). Again, the benefits are not
immediately obvious because nothing benefits from this yet, but it will
make more sense for an upcoming vk_signal mechanism.
This also cleans up the way the ra_vk submission interacts with the
synchronization/callbacks from the ra_vk_ctx. Although currently, the
way the dependency is signalled is a bit hacky: normally it would be
associated with the ra_tex itself and waited on in the appropriate stage
implicitly. But that code is just temporary, so I'm keeping it in there
for a better commit order.
|
|
|
|
|
|
|
|
| |
Instead of associating a single VkSemaphore with every command buffer
and allowing the user to ad-hoc wait on it during submission, make the
raw semaphores-to-signal array work like the raw semaphores-to-wait-on
array. Doesn't really provide a clear benefit yet, but it's required for
upcoming modifications.
|
|
|
|
|
|
| |
1. No more static arrays (deps / callbacks / queues / cmds)
2. Allows safely recording multiple commands at the same time
3. Uses resources optimally by never over-allocating commands
|
|
|
|
|
| |
This is fixed upstream (and we now know it's a driver bug) so reword the
comment.
|
|
|
|
|
|
|
|
|
|
|
|
| |
FlagBits is just the name of the enum. The actual data type representing
a combination of these flags follows the *Flags convention. (The
relevant difference is that the latter is defined to be uint32_t instead
of left implicit)
For consistency, use *Flags everywhere instead of randomly switching
between *Flags and *FlagBits.
Also fix a wrong type name on `stageFlags`, pointed out by @atomnuker
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In addition to the built-in nvidia compiler, we now also support a
backend based on libshaderc. shaderc is sort of like glslang except it
has a C API and is available as a dynamic library.
The generated SPIR-V is now cached alongside the VkPipeline in the
cached_program. We use a special cache header to ensure validity of this
cache before passing it blindly to the vulkan implementation, since
passing invalid SPIR-V can cause all sorts of nasty things. It's also
designed to self-invalidate i |