| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
This gets rid of the hard-coded limits on the number of hooks, textures
and hook points.
|
|
|
|
| |
Consistency
|
|
|
|
|
|
|
|
|
|
|
|
| |
FlagBits is just the name of the enum. The actual data type representing
a combination of these flags follows the *Flags convention. (The
relevant difference is that the latter is defined to be uint32_t instead
of left implicit)
For consistency, use *Flags everywhere instead of randomly switching
between *Flags and *FlagBits.
Also fix a wrong type name on `stageFlags`, pointed out by @atomnuker
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using renderpass layout transitions is more optimal and doesn't require
a redundant pipeline barrier.
Since our render passes are static and don't change throughout the
lifetime of a ra_renderpass, we unfortunately don't have much
flexibility here - so just hard-code SHADER_READ_ONLY_OPTIMAL as the
output format as this will be the most common case.
We also can't short-circuit the transition when we need to preserve the
framebuffer contents, since that depends on the current layout; so we
still use an explicit tex_barrier in this case. (Most optimal for this
scenario would be an input attachment anyway)
|
|
|
|
|
|
| |
This restores cuda/cuvid under Windows.
Cuvid is relatively useless under Windows, but this was requested.
|
|
|
|
|
| |
Like as in previous commits, you need a very recent FFmpeg (probably git
master).
|
|
|
|
|
|
|
|
|
| |
Now you need FFmpeg git, or something.
This also gets rid of the last real use of gpu_memcpy(). libavutil does
that itself. (vaapi.c still used it, but it was essentially unused,
because the code path isn't really in use anymore. It wasn't even
included due to the d3d-hwaccel dependency in wscript.)
|
|
|
|
| |
Just use FFmpeg 3.3 (or whatever it was) to get Cuvid support.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is apparently required to get storage images working on
windows/vulkan, and probably good practice either way. Not entirely sure
if it's the best idea to be always storing the value as 32-bit float,
but it should hardly matter in practice (since we're only writing one
sample per thread).
(Leaving them implicit requires the shaderStorageImageWriteWithoutFormat
feature to be enabled, which the windows nvidia vulkan driver doesn't
support, at least not for a GTX 670)
|
|
|
|
|
|
|
|
|
| |
This makes the radeon driver shut up about frequently updating
STATIC_DRAW UBOs (--opengl-debug), and also reduces the amount of
synchronization necessary for vulkan uniform buffers.
Also add some extra debugging/tracing code paths. I went with a
flags-based approach in case we ever want to extend this.
|
|
|
|
| |
Can in theory avoid updating the uniform buffer every frame
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In addition to the built-in nvidia compiler, we now also support a
backend based on libshaderc. shaderc is sort of like glslang except it
has a C API and is available as a dynamic library.
The generated SPIR-V is now cached alongside the VkPipeline in the
cached_program. We use a special cache header to ensure validity of this
cache before passing it blindly to the vulkan implementation, since
passing invalid SPIR-V can cause all sorts of nasty things. It's also
designed to self-invalidate if the compiler gets better, by offering a
catch-all `int compiler_version` that implementations can use as a cache
invalidation marker.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This time based on ra/vo_gpu. 2017 is the year of the vulkan desktop!
Current problems / limitations / improvement opportunities:
1. The swapchain/flipping code violates the vulkan spec, by assuming
that the presentation queue will be bounded (in cases where rendering
is significantly faster than vsync). But apparently, there's simply
no better way to do this right now, to the point where even the
stupid cube.c examples from LunarG etc. do it wrong.
(cf. https://github.com/KhronosGroup/Vulkan-Docs/issues/370)
2. The memory allocator could be improved. (This is a universal
constant)
3. Could explore using push descriptors instead of descriptor sets,
especially since we expect to switch descriptors semi-often for some
passes (like interpolation). Probably won't make a difference, but
the synchronization overhead might be a factor. Who knows.
4. Parallelism across frames / async transfer is not well-defined, we
either need to use a better semaphore / command buffer strategy or a
resource pooling layer to safely handle cross-frame parallelism.
(That said, I gave resource pooling a try and was not happy with the
result at all - so I'm still exploring the semaphore strategy)
5. We aggressively use pipeline barriers where events would offer a much
more fine-grained synchronization mechanism. As a result of this, we
might be suffering from GPU bubbles due to too-short dependencies on
objects. (That said, I'm also exploring the use of semaphores as a an
ordering tactic which would allow cross-frame time slicing in theory)
Some minor changes to the vo_gpu and infrastructure, but nothing
consequential.
NOTE: For safety, all use of asynchronous commands / multiple command
pools is currently disabled completely. There are some left-over relics
of this in the code (e.g. the distinction between dev_poll and
pool_poll), but that is kept in place mostly because this will be
re-extended in the future (vulkan rev 2).
The queue count is also currently capped to 1, because of the lack of
cross-frame semaphores means we need the implicit synchronization from
the same-queue semantics to guarantee a correct result.
|
|
|
|
| |
opengl-debug was renamed to gpu-debug
|
|
|
|
|
| |
Iterations after the first time will fail to realize that the pass was
never created. This function's logic and control flow is so annoying...
|
|
|
|
| |
This should have been renamed when it stopped being empty.
|
|
|
|
|
|
|
|
|
|
| |
Tested by making the ra_tex_resize function always fail (apart from the
initial FBO check). This required a few changes:
1. reset shaders on failed dispatch
2. reset cleanup binds on failed dispatch
3. fall back to initializing the struct image to 1x1 on failure
4. handle output_fbo_valid gracefully
|
|
|
|
|
|
|
|
| |
This was sort of grating by default and made it really hard to actually
read e.g. text on top of a transparent background. I decided to approach
the problem from both directions, making the whites darker and the grays
lighter. This brings it closer to the dynamic range of e.g. the
wikipedia transparent svg preview.
|
|
|
|
| |
Since the removal of FBOTEX_FUZZY, this can be made slightly simpler.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Due to the plethora of historical baggage from different eras getting
confusing, I decided to simplify and unify the struct organization and
naming scheme.
Structs that got renamed:
1. fbodst -> ra_fbo (and moved to gpu/context.h)
2. fbotex -> removed (redundant after 2af2fa7a)
3. fbosurface -> surface
4. img_tex -> image
In addition to these structs being renamed, all of the names have been
made consistent. The new scheme is as follows:
struct image img;
struct ra_tex *tex;
struct ra_fbo fbo;
This also affects derived names, e.g. indirect_fbo -> indirect_tex.
Notably also, finish_pass_fbo -> finish_pass_tex and finish_pass_direct
-> finish_pass_fbo.
The new equivalent of fbotex_change() is called ra_tex_resize().
This commit (should) contain no logic changes, just renaming a bunch of
crap.
|
|
|
|
|
|
|
|
|
|
| |
I've observed the garbage pixels in more scenarios. They also were never
really needed to begin with, originally being a discovered work-around
for bug that we fixed since then anyway. Doesn't really seem to even
help resizing, since the OpenGL drivers are all smart enough to pool
resources internally anyway.
Fixes #1814
|
|
|
|
|
|
|
|
| |
Enabling double buffering fixed some graphical glitches when entering
fullscreen, but it also caused a fullscreen performance regression. We
decided that the glitches were preferable to the performance regression.
This reverts commit cee764849e4fe22b00fb3f31838a63906e2e0d54.
|
|
|
|
|
| |
ANGLE can take advantage of some of these when using the external
swapchain-backed surface.
|
|
|
|
|
|
| |
gl_read_fbo_contents can fail
Fixes #4905
|
|
|
|
|
|
|
|
|
| |
The code used ra_ctx_destroy even though ra_ctx_create was never called
(since it's just a dummy ctx), which led to a conflict of assumptions.
The proper fix is to only use ra_gl_ctx_uninit (mirroring the
ra_gl_ctx_init) and free the dummy ctx manually.
Fixes https://github.com/cmdrkotori/mpc-qt/issues/129
|
|
|
|
|
|
|
|
| |
We want e.g. --opengl-shaders-append=foo to resolve to the new option,
all while printing an option name. --opengl-shader is a similar case.
These options are special, because they apply "actions" on actual
options by specifying a suffix. So the alias/deprecation handling has to
be part of resolving the actual option from prefix and suffix.
|
| |
|
|
|
|
|
| |
Also readd the the error message for when no GL backends are found (why
was this removed?).
|
|
|
|
|
|
|
| |
Commit bfa9b628589068 accidentally reverted these (because my editor did
not reload the file correctly).
Fixes #4904.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[179/188] Compiling video/out/vo_lavc.c
../../video/out/opengl/hwdec_ios.m:135:9: warning: unused variable 'gl' [-Wunused-variable]
GL *gl = ra_gl_get(mapper->ra);
^
../../video/out/opengl/hwdec_ios.m:247:48: warning: incompatible pointer to integer conversion passing 'CVOpenGLESTextureRef' (aka 'struct __CVBuffer *') to parameter of type 'GLuint' (aka 'unsigned int') [-Wint-conversion]
p->gl_planes[i]);
^~~~~~~~~~~~~~~
../../video/out/opengl/ra_gl.h:9:45: note: passing argument to parameter 'gl_texture' here
GLuint gl_texture);
^
2 warnings generated.
|
|
|
|
|
|
|
| |
Turns out the option code apparently tries to directly talloc_free() the
allocated strings, instead of going through a tactx wrapper or
something. So we can't directly overwrite it. Do something else
instead..
|
|
|
|
|
|
| |
The ctx->ra was never freed propely, nor was p->wrapped_fb.
(TIL: MPV_LEAK_REPORT exists)
|
|
|
|
|
|
|
|
|
|
| |
Almost as fast as the old code, but more general. Notably, glslang
doesn't support nested arrays.
(cf. https://github.com/KhronosGroup/glslang/issues/1057)
Also much cleaner code-wise, so I think I'll keep it even if glslang
implements array_of_arrays.
|
| |
|
|
|
|
|
| |
If shader compilation fails in an unexpected way, it can end up calling
renderpass_run on an invalid pass, since current_shader is never cleared.
|
|
|
|
|
|
| |
This never really made sense since the BT.1886 changes. It should get
*brighter* for bright rooms, not darker for dark rooms. Picked some new
values that seemed reasonable-ish.
|
|
|
|
| |
This hasn't been true for several iterations of this API.
|
|
|
|
| |
This can get left unknown if something hooks NATIVE
|
|
|
|
|
|
|
|
|
| |
This causes a performance regression on 10.11 and newer, but the single
buffered method was broken and could cause partially rendered frames to
be presented to the screen.
This reverts 9f30cd8292b4b7bfe5d7db29fe31a07cc76dec2c and
e543853a7ff0ab4dcd4ccaf06c448013fd41c03a.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is done in several steps:
1. refactor MPGLContext -> struct ra_ctx
2. move GL-specific stuff in vo_opengl into opengl/context.c
3. generalize context creation to support other APIs, and add --gpu-api
4. rename all of the --opengl- options that are no longer opengl-specific
5. move all of the stuff from opengl/* that isn't GL-specific into gpu/
(note: opengl/gl_utils.h became opengl/utils.h)
6. rename vo_opengl to vo_gpu
7. to handle window screenshots, the short-term approach was to just add
it to ra_swchain_fns. Long term (and for vulkan) this has to be moved to
ra itself (and vo_gpu altered to compensate), but this was a stop-gap
measure to prevent this commit from getting too big
8. move ra->fns->flush to ra_gl_ctx instead
9. some other minor changes that I've probably already forgotten
Note: This is one half of a major refactor, the other half of which is
provided by rossy's following commit. This commit enables support for
all linux platforms, while his version enables support for all non-linux
platforms.
Note 2: vo_opengl_cb.c also re-uses ra_gl_ctx so it benefits from the
--opengl- options like --opengl-early-flush, --opengl-finish etc. Should
be a strict superset of the old functionality.
Disclaimer: Since I have no way of compiling mpv on all platforms, some
of these ports were done blindly. Specifically, the blind ports included
context_mali_fbdev.c and context_rpi.c. Since they're both based on
egl_helpers, the port should have gone smoothly without any major
changes required. But if somebody complains about a compile error on
those platforms (assuming anybody actually uses them), you know where to
complain.
|
|
|
|
|
|
|
| |
See "Copyright" file for caveats.
This changes the remaining "almost LGPL" files to LGPL, because we think
that the conditions the author set for these was finally fulfilled.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is "wrong", because you might want mp_image_copy_attributes() to
preserve the information that the colorspace parameters are unknown.
This is important for hwdec -copy modes, which call this function before
fix_image_params() and mp_colorspace_merge() are called.
Instead, just wipe the colorspace attributes if the pixel format changes
in an apparently incompatible way. Use mp_image_params_guess_csp() logic
for this and factor that into its own function.
mp_image_set_attributes() attempts to do something similar, so change
that in the same way. Also, mp_image_params_guess_csp() just returned if
the imgfmt was invalid or unset - just remove that part, because it
annoyingly doesn't fit into the new code, and had little reason to exist
to begin with. (Probably.)
|
|
|
|
|
|
| |
I see no reason not to do this. I think the check comes from the time
when mp_image stored the image aspect ratio, instead of the pixel aspect
ratio, where the logic might have made more sense.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was noticed that -copy hwdec modes typically dropped the
chroma_location field. This happened because the attributes on hw
download are copied with mp_image_copy_attributes(), which tries to copy
these parameters only if src and dst were both YUV (in an attempt to
copy parameters only if it makes sense).
But hardware formats did not have the YUV flag set (anymore?), and code
shouldn't attempt to check the flag in this way anyway. Drop the check,
and always copy the whole color metadata struct. There is a call to
mp_image_params_guess_csp() below, which tries to unset nonsense
metadata if it was copied from a YUV format to RGB. This function would
also do the right thing for hw formats (although for the cited bug only
the software case matters).
Fixes #4804.
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 96462040ec79b353457b64949f96fad30bd6e988.
I guess the autoprobing is still too primitive to handle this well. What
it really should be trying is initializing the wrapper decoder, and if
that doesn't work, try another method. This is complicated by hwaccels
initializing in a delayed way, so there is no easy solution yet.
Probably fixes #4865.
|
|
|
|
| |
The random space kept screwing me over
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This overrides the use of GLX_SGI_swap_control, because apparently
GLX_SGI_swap_control doesn't support SwapInterval(0), but the
GLX_MESA_swap_interval does.
Of course, everybody except mesa just accepts SwapInterval(0) even for
GLX_SGI_swap_control, but mesa needs to be the special snowflake here
and reject it, forcing us to load their stupid named extension instead.
Meanwhile khronos has done nothing except spit out GLX_EXT_swap_control
(not to be confused with GL_EXT_swap_control, which is exported by
WGL_EXT_swap_control), that doesn't fix the problem because mesa doesn't
implement it anyway.
What a fucking mess.
|
|
|
|
|
|
| |
Even if the contents are entirely zero. In the current code, these
entries were left uninitialized. (Which always worked for nvidia - but
randomly blew up for AMD)
|
|
|
|
|
|
| |
This is simultaneously generalized into two directions:
1. Support more sc_uniform types (needed for SC_UNIFORM_TYPE_PUSHC)
2. Support more flexible packing (needed for both PUSHC and ra_d3d11)
|
|
|
|
|
|
| |
This is around 512 kB, which is just way too much. Heap-allocate it
instead. Also cut down the max pass count to 64, since 128 was
unrealistically high even for vo_opengl.
|
|
|
|
|
|
|
| |
Instead of relying on power-of-two buffer sizes and unsigned overflow,
make this code more robust (and also cleaner).
Why can't C get a real modulo operator?
|
|
|
|
|
|
| |
This was there even before the refactor, but the refactor exposed the
bug. I hate C's useless fucking modulo operator so much. I've gotten hit
by this exact bug way too many times.
|
|
|
|
|