| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
That new API was introduced and allows to have several native resources.
Thisuses that mechanisma for drm resources rather than the deprecated
opengl-cb structs.
This patch therefore add two structs that can be used with the drm atomic interop.
- mpv_opengl_drm_params : which will hold all the drm handles
- mpv_opengl_drm_osd_size : which will hold osd layer size
This commit adds a drm-osd-size=WxH parameter to commandline which
allows to define the OSD plane dimension. OSD can be upscaled to
screen resolution when having OSD at video resolution is too heavy.
This is especially useful for UHD modes on embedded devices where
the GPU cannot handle UHD modes at a decent framerate.
|
|
|
|
|
| |
This is actually more generic and better than just lazily plastering
peak calculation together with dumb mode.
|
|
|
|
|
| |
I don't know if we can just return from this function, so for now
just adding this piece of logging.
|
|
|
|
|
|
|
|
|
|
| |
Define a hard-coded value for gl_NumWorkGroups if it is not available.
This adds an additional requirement of needing a shader recompile for
all window size changes.
This was considered a worthwhile compromise as currently f.ex. d3d11
completely lacked any peak computation - this is a major quality of
life upgrade.
|
|
|
|
|
| |
Now that the feature depends on multiple features, log all of
their states in the message.
|
|
|
|
|
|
| |
Also rename stereo3d to stereo_in. The only real change is that the
vo_gpu OSD code now uses the actual stereo 3D mode, instead of the
--video-steroe-mode value. (Why does this vo_gpu code even exist?)
|
|
|
|
|
| |
Like DR, this needed a lot of preparation, and here's the boring glue
code that finally implements it.
|
|
|
|
|
| |
With all the preparation work done, this only has to do the annoying
dance of passing it through all the damn layers.
|
|
|
|
|
| |
This also happens to fix some UB on the error path (target being
declared after the first "goto done;").
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally, MPV_RENDER_PARAM* arguments are copied, unless documented
otherwise. Of course we can't copy X11 Display or Wayland wl_display
types, but for arguments that are "summarized" in a struct (like
MPV_RENDER_PARAM_OPENGL_FBO), a copy is expected.
Also add some unused infrastructure to make this explicit, and to make
it easier to add parameter types that require a copy.
Untested.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Hardware decoding things often need access to additional handles from
the windowing system, such as the X11 or Wayland display when using
vaapi. The opengl-cb had nothing dedicated for this, and used the weird
GL_MP_MPGetNativeDisplay GL extension (which was mpv specific and not
officially registered with OpenGL).
This was awkward, and a pain due to having to emulate GL context
behavior (like needing a TLS variable to store context for the pseudo GL
extension function). In addition (and not inherently due to this), we
could pass only one resource from mpv builtin context backends to
hwdecs. It was also all GL specific.
Replace this with a newer mechanism. It works for all RA backends, not
just GL. the API user can explicitly pass the objects at init time via
mpv_render_context_create(). Multiple resources are naturally possible.
The API uses MPV_RENDER_PARAM_* defines, but internally we use strings.
This is done for 2 reasons: 1. trying to leave libmpv and internal
mechanisms decoupled, 2. not having to add public API for some of the
internal resource types (especially D3D/GL interop stuff).
To remain sane, drop support for obscure half-working opengl-cb things,
like the DRM interop (was missing necessary things), the RPI window
thing (nobody used it), and obscure D3D interop things (not needed with
ANGLE, others were undocumented). In order not to break ABI and the C
API, we don't remove the associated structs from opengl_cb.h.
The parts which are still needed (in particular DRM interop) needs to be
ported to the render API.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This passed the display size as source size to the renderer, which is of
course nonsense. I don't know what I was doing in 569383bc54.
Yet another fix for those damn anamorphic videos.
As a somewhat redundant/cosmetic change, use image_params instead of
real_image_params in the code above. They should have the same, dimensions
(but possibly different formats when doing hw decdoing), and mixing them
is confusing. p->image_params wins because it's shorter.
Actually fixes #5619.
|
|
|
|
|
|
|
|
|
|
|
| |
We took the storage size instead of the display size for "unscaled"
screenshots. Even if it's called "unscaled", it's still supposed to
scale to compensate for aspect ratio.
(How many commits fixing anamorphic screenshots in various situations
are there?)
Fixes #5619.
|
| |
|
|
|
|
| |
Good old 90° rotation logic messing everything up.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The purpose of the new API is to make it useable with other APIs than
OpenGL, especially D3D11 and vulkan. In theory it's now possible to
support other vo_gpu backends, as well as backends that don't use the
vo_gpu code at all.
This also aims to get rid of the dumb mpv_get_sub_api() function. The
life cycle of the new mpv_render_context is a bit different from
mpv_opengl_cb_context, and you explicitly create/destroy the new
context, instead of calling init/uninit on an object returned by
mpv_get_sub_api().
In other to make the render API generic, it's annoyingly EGL style, and
requires you to pass in API-specific objects to generic functions. This
is to avoid explicit objects like the internal ra API has, because that
sounds more complicated and annoying for an API that's supposed to never
change.
The opengl_cb API will continue to exist for a bit longer, but
internally there are already a few tradeoffs, like reduced
thread-safety.
Mostly untested. Seems to work fine with mpc-qt.
|
| |
|
|
|
|
|
|
|
| |
Mobius isn't well-defined for sig_peak <= 1.0. We can solve this by just
soft-clamping sig_peak to 1.0. Although, in this case, we can just skip
tone mapping altogether since the limit of mobius as sig_peak -> 1.0 is
just a linear function.
|
|
|
|
|
|
|
|
|
| |
Based on testing with real-world non-HDR BT.2020 clips, clipping the
color space looks better than attempting to gamut map using a tone
mapping shader that's (by now) optimized for HDR content.
If anything, we'd have to develop a separate gamut mapping shader that
works in LCh space.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This solves a number of problems simultaneously:
1. When outputting HLG, this allows tuning the OOTF based on the display
characteristics.
2. When outputting PQ or other HDR curves, this allows soft-limiting the
output brightness using the tone mapping algorithm.
3. When outputting SDR, this allows HDR-in-SDR style output, by
controlling the output brightness directly.
Closes #5521
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The HLG OOTF is defined as a one-parameter family of OOTFs depending on
the display's peak luminance. With the preceding change to OOTF scale
and handling, we no longer have any issues with outputting values in
whatever signal range we need.
So as a result, it's easy for us to support a tunable OOTF which may
(drastically) alter the display brightness. In fact, this is also the
only correct way to do it, because the HLG appearance depends strongly
on the OOTF configuration. For the OOTF, we consult the mastering
display's tagging (via src.sig_peak). For the inverse OOTF, we consult
the output display's target peak.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The primary need for this change is the fact that the OOTF was
incorrectly scaled, due to the fact that the application of the OOTF can
itself change the required normalization peak. (Plus, an oversight in
pass_inverse_ootf meant we forgot to normalize at the end of it)
The linearize/delinearize functions still normalize the scale since it's
used in a number of places throughout gpu/video.c, but the color
management function now converts to absolute scale right away, instead
of in an awkward way inside the tone mapping branch. The OOTF functions
now work in absolute scale only.
In addition, minor changes have been made to the way normalization is
handled for tone mapping - we now divide out the dst_peak *after* peak
detection, in order to make the scale of the peak detection buffer
consistent even if the dst_peak were to (hypothetically) change
mid-stream. In theory, we could also do this for desaturation, but doing
the desaturation before tone mapping has the advantage of preserving
much more brightness than the other way around - and even mid-stream
changes are not that drastic here.
Finally, some preparation work has been done for allowing the user to
customize the `dst.sig_peak` in the future.
|
|
|
|
|
|
|
|
|
|
|
| |
There is now a better way. Reading the font framebuffer was always a
hack. The new code via VOCTRL_SCREENSHOT renders it into a FBO, which
does not come with the disadvantages of reading the front buffer (like
not being supported by GLES, possibly black regions due to overlapping
windows on some systems).
For now keep VOCTRL_SCREENSHOT_WIN on the VO level, because there are
still some lesser VOs and backends that use it.
|
|
|
|
|
|
|
| |
This allows the new GPU screenshot functionality introduced in
9f595f3a80ee to work with the D3D11 backend. It replaces the old window
screenshot functionality, which was shared between D3D11 and ANGLE. The
old code can be removed, since it's not needed by ANGLE anymore either.
|
|
|
|
|
| |
This is just a cosmetic change. Now the RA_CAP_FRAGCOORD check looks
like all the others.
|
|
|
|
|
|
|
|
| |
Similar spirit to edb4970ca8b9. check_gl_features() has a confusing
early-return. This also adds compute_hdr_peak to the list of options
that is copied to the dumb-mode options struct, since it seems to make a
difference. Otherwise it would be impossible to disable HDR peak
detection in dumb mode.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using the GL renderer for color conversion will make sure screenshots
will use the same conversion as normal video rendering. It can do this
for all types of screenshots.
The logic when to write 16 bit PNGs changes. To approximate the old
behavior, we decide by looking whether the source video format has more
than 8 bits per component. We apply this logic even for window
screenshots. Also, 16 bit PNGs now always include an unused alpha
channel. The reason is that FFmpeg has RGB48 and RGBA64 formats, but no
RGB064. RGB48 is 3 bytes and usually not supported by GPUs for
rendering, so we have to use RGBA64, which forces an alpha channel.
Will break for users who use --target-trc and similar options.
I considered creating a new gl_video context, but it could double GPU
memory use, so I didn't.
This uses FBOs instead of glGetTexImage(), because that increases the
chance it could work on GLES (e.g. ANGLE). Untested. No support for the
Vulkan and D3D11 backends yet.
Fixes #5498. Also fixes #5240, because the code for reading back is not
used with the new code path.
|
|
|
|
| |
Needed for the following commit.
|
|
|
|
|
| |
Even if RA_CAP_BLIT is set, this might just not be enabled for the
target ra_tex.
|
|
|
|
|
| |
This can cause the peak detection state to be inconsistent in rare
cases, which might explain the issues when taking screenshots in #5499.
|
|
|
|
|
|
| |
The re-ordering of commits e3d93fd and 0870859 ended up swallowing the
change which made the HDR tone mapping algorithm actually check for
RA_CAP_NUM_GROUPS support.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The major changes are as follows:
1. Use `uint32_t` instead of `unsigned int` for the SSBO size
calculation. This doesn't really matter, since a too-big buffer will
still work just fine, but since `uint` is a 32-bit integer by
definition this is the correct way to do it.
2. Pre-divide the frame_sum by the num_wg immediately at the end of a
frame. This change was made to prevent overflow. At 4K screen size,
this code is currently already very at risk of overflow, especially
once I started playing with longer averaging sizes. Pre-dividing this
out makes it just about fit into 32-bit even for worst-case PQ
content. (It's technically also faster and easier this way, so I
should have done it to begin with). Rename `frame_sum` to `frame_avg`
to clearly signal the change in semantics.
3. Implement a scene transition detection algorithm. This basically
compares the current frame's average brightness against the
(averaged) value of the past frames. If it exceeds a threshold, which
I experimentally configured, we reset the peak detection SSBO's state
immediately - so that it just contains the current frame. This
prevents annoying "eye adaptation"-like effects on scene transitions.
4. As a result of the previous change, we can now use a much larger
buffer size by default, which results in a more stable and less
flickery result. I experimented with values between 20 and 256 and
settled on the new value of 64. (I also switched to a power-of-2
array size, because I like powers of two)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current peak detection algorithm was very bugged (which contributed
to the excessive cross-frame flicker without long normalization) and
also didn't take into account the frame average brightness level.
The new algorithm both takes into account frame average brightness (in
addition to peak brightness), and also computes the values in a more
stable/correct way. (The old path was basically undefined behavior)
In addition to improving the algorithm, we also switch to hable tone
mapping by default, and try to enable peak computation automatically
whever possible (compute shaders + SSBOs supported). We also make the
desaturation milder, after extensive testing during libplacebo
development.
I also had to compensate a bit for the representational differences
between mpv and libplacebo (libplacebo treats 1.0 as the reference peak,
but mpv treats it as the nominal peak), but it shouldn't have caused any
problems.
This is still not quite the same as libplacebo, since libplacebo also
allows tagging the desired scene average brightness on the output, and
it also supports reading the scene average brightness from static
metadata (MaxFALL) where available. But those changes are a bit more
involved. It's possible we could also read this from metadata in the
future, but we have problems communicating with AVFrames as it is and I
don't want to touch the mpv colorimetry structs for the time being.
|
|
|
|
|
| |
SPIRV-Cross doesn't support this for the time being. It's possible this
could go away again at a later date.
|
|
|
|
|
|
|
| |
The RA_CAP_FRAGCOORD checks apply to dumb mode as well, but they were
after the check for dumb mode, which returns early, so they never ran.
Fixes #5436
|
|
|
|
|
|
|
|
|
|
| |
Using vdpau will allocate additional textures for the reinterleaving
step, which uninit_rendering() will free. This is a problem because the
hwdec image remains mapped when reinitializing, so the reinterleaving
textures are turned into dangling pointers. Fix this by freeing the
reinterleave textures on full uninit instead.
Fixes #5447.
|
|
|
|
| |
Fix https://github.com/mpv-player/mpv/issues/5420
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DR (direct rendering) works by having the decoder decode into the GPU
staging buffers, instead of copying the video data on texture upload. We
did this even for formats unsupported by the GPU or the renderer. This
"worked" because the staging memory is untyped, and the video frame was
converted by libswscale to a supported format, and then uploaded with a
copy using the normal non-DR texture upload path.
Even though it "works", we don't gain anything from using the staging
buffers for decoding, since we can't use them for upload anyway. Also,
staging memory might be potentially limited (what really happens is up
to the driver). It's easy to avoid, so just skip it in these cases.
|
|
|
|
|
|
|
|
|
|
|
| |
The check_gl_features(p) call here checks whether dumb mode can be used.
It uses the field use_integer_conversion, which is set _after_ the call
in the same function. Move check_gl_features() to the end of the
function, when use_integer_conversion is finally set.
Fixes that it tried to use bilinear filtering with integer textures. The
bug disabled the code that is supposed to convert it to non-integer
textures.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables DXVA2 hardware decoding with ra_d3d11. It should be useful
for Windows 7, where D3D11VA is not available. Images are transfered
from D3D9 to D3D11 using D3D9Ex surface sharing[1].
Following Microsoft's recommendations, it uses a queue of shared
surfaces, similar to Microsoft's ISurfaceQueue. This will hopefully
prevent surface sharing from impacting parallelism and allow multiple
D3D11 frames to be in-flight at once.
[1]: https://msdn.microsoft.com/en-us/library/windows/desktop/ee913554.aspx
|
|
|
|
|
|
|
|
|
|
| |
Previously, mpv would attempt to use a BGRA swapchain in the hope that
it would give better performance, since the Windows desktop is also
composited in BGRA. In practice, it seems like there is no noticable
performance difference between RGBA and BGRA swapchains and BGRA
swapchains cause trouble with a42b8b1142fd, which attempts to use the
swapchain format for intermediate FBOs, even though D3D11 does not
guarantee BGRA surfaces will work with UAV typed stores.
|
| |
|
|
|
|
|
|
|
|
|
| |
This allows RAs with support for non-opaque FBO formats to use a more
appropriate FBO format for the output tex, possibly enabling a more
efficient blit operation.
This requires distinguishing between real formats (which can be used to
create textures) and fake formats (e.g. ra_gl's FBO hack).
|
|
|
|
|
|
|
|
|
|
| |
On AMD devices, we only get one graphics pipe but several compute pipes
which can (in theory) run independently. As such, we should prefer
compute shaders over fragment shaders in scenarios where we expect them
to be better for parallelism.
This is amusingly trivial to do, and actually improves performance even
in a single-queue scenario.
|
|
|
|
|
| |
Don't discard the OSD or pass_draw_to_screen passes though. Could be
faster on some hardware.
|
|
|
|
|
|
|
|
|
| |
This is especially interesting for vulkan since it allows completely
skipping the layout transition as part of the renderpass. Unfortunately,
that also means it needs to be put into renderpass_params, as opposed to
renderpass_run_params (unlike #4777).
Closes #4777.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've decided that MP_TRACE means “noisy spam per frame”, whereas
MP_DBG just means “more verbose debugging messages than MSGL_V”.
Basically, MSGL_DBG shouldn't create spam per frame like it currently
does, and MSGL_V should make sense to the end-user and provide mostly
additional informational output.
MP_DBG is basically what I want to make the new default for --log-file,
so the cut-off point for MP_DBG is if we probably want to know if for
debugging purposes but the user most likely doesn't care about on the
terminal.
Also, the debug callbacks for libass and ffmpeg got bumped in their
verbosity levels slightly, because being external components they're a
bit less relevant to mpv debugging, and a bit too over-eager in what
they consider to be relevant information.
I exclusively used the "try it on my machine and remove messages from
MSGL_* until it does what I want it to" approach of refactoring, so
YMMV.
|
|
|
|
| |
Add the "all" value to the --gpu-hwdec-interop help output.
|
|
|
|
|
| |
Make gl_video_load_hwdecs() call gl_video_load_hwdecs_all() when
all HW decoders should be loaded.
|
|
|