path: root/video/out/gpu
Commit message (Author, Date, Files, Lines changed)
* vo_gpu: increase user shader size limit (Bin Jin, 2019-03-13, 1 file, -1/+1)

    The old size limit was chosen before LUT textures were supported in user
    shaders. At that time, the whole user shader was compiled and run on the
    GPU, which made large user shaders impractical. With the introduction of
    LUT textures, the old size limit no longer makes sense: for example, a
    1024x1024 rgba16f LUT alone costs 32MB of shader size. Fix this by
    increasing the size limit to a value that is unlikely to be reached.
* Merge branch 'master' into pr6360 (Jan Ekström, 2019-03-11, 7 files, -178/+203)

    Manual changes done:
    * Merged the interface-changes under the already master'd changes.
    * Moved the hwdec-related option changes to video/decode/vd_lavc.c.
| * vo_gpu: add two useful operators to user shader (Bin Jin, 2019-03-09, 2 files, -0/+7)

    The modulo operator can be used to check whether a size is a multiple of
    a certain number. The equal operator can be used to verify that the
    sizes of different textures align.
| * vo_gpu: make texture offset available to CHROMA hooks (Bin Jin, 2019-03-09, 1 file, -16/+25)

    Before this commit, the texture offset was set after all source textures
    were finalized, which meant CHROMA hooks were unable to align with the
    luma planes. This could be problematic for chroma prescalers that
    utilize information from the luma plane. Fix this by finding the
    reference texture early and setting the global texture offset early.
| * lcms: allow infinite contrast (zc62, 2019-03-09, 1 file, -1/+1)

    Fixes #5980.
| * vo_gpu: fix initial seeding of the peak detect ssbo (Niklas Haas, 2019-02-18, 2 files, -5/+7)

    This solves some edge cases when using files with very weird metadata
    (e.g. MaxCLL 10k and so forth). Instead of just blindly seeding it with
    the tagged metadata, forcibly set the initial state from the detected
    values.
| * vo_gpu: use dB units for scene change detection (Niklas Haas, 2019-02-18, 3 files, -11/+12)

    Rather than the linear cd/m^2 units, these (relative) logarithmic units
    lend themselves much better to actually detecting scene changes,
    especially since the scene averaging was changed to also work
    logarithmically.
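    For illustration, a minimal sketch of the kind of conversion involved,
    assuming luminance is expressed relative to SDR reference white; the
    helper names and exact scaling are assumptions, not mpv's shader code:

        #include <math.h>

        // Relative linear luminance (1.0 == SDR reference white) <-> decibels.
        static double lum_to_db(double lum_rel)
        {
            return 10.0 * log10(lum_rel);
        }

        static double db_to_lum(double db)
        {
            return pow(10.0, db / 10.0);
        }

    In these units a difference of about 3 dB corresponds to a doubling of
    linear brightness, which tracks perceived change far better than a fixed
    cd/m^2 delta.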
| * vo_gpu: clamp sigmoid function (Niklas Haas, 2019-02-18, 1 file, -0/+2)

    Can explode on some clips otherwise.
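    A sketch of why the clamp is needed, using an illustrative
    de-sigmoidization helper (constants and names are not mpv's actual
    code):

        #include <math.h>

        // Without the clamp, inputs at or outside the [0,1] endpoints drive
        // the logarithm to +/-inf or NaN, which then poisons the whole frame.
        static float inv_sigmoid(float x, float center, float slope)
        {
            x = fminf(fmaxf(x, 1e-4f), 1.0f - 1e-4f); // keep log() finite
            return center - logf(1.0f / x - 1.0f) / slope;
        }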
| * vo_gpu: tone map before gamut mapping (Niklas Haas, 2019-02-18, 1 file, -5/+4)

    Gamut mapping can take very bright out-of-gamut colors into the
    negatives, which completely destroys the color balance (which tone
    mapping tries its best to preserve).
| * vo_gpu: make --gamut-warning warn on negative colors (Niklas Haas, 2019-02-18, 1 file, -1/+2)

    As is the case for actually out-of-gamut colors (rather than just too
    bright colors).
| * vo_gpu: improve numerical accuracy of PQ OETF constant (Niklas Haas, 2019-02-18, 1 file, -1/+1)

    Not a huge deal, but we can do the division in C, which makes the float
    constant larger.
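    The PQ (SMPTE ST 2084) exponent constants are defined as exact rational
    numbers, so letting the C compiler evaluate the division in double
    precision keeps more significant digits than a hand-rounded literal in
    the shader source. A sketch of the idea (not the actual mpv code):

        #include <stdio.h>

        int main(void)
        {
            const double m1 = 2610.0 / 16384.0;        // = 0.1593017578125 exactly
            const double m2 = 2523.0 / 4096.0 * 128.0; // = 78.84375 exactly
            printf("const float PQ_M1 = %.17g;\n", m1);
            printf("const float PQ_M2 = %.17g;\n", m2);
            return 0;
        }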
| * vo_gpu: allow color management in dumb mode (Niklas Haas, 2019-02-18, 1 file, -5/+6)

    There's no point in disallowing target-trc/prim in dumb mode, since they
    still work fine.
| * vo_gpu: improve accuracy of HDR brightness estimation (Niklas Haas, 2019-02-18, 2 files, -10/+14)

    This change switches to a logarithmic mean to estimate the average
    signal brightness. This handles dark scenes with isolated highlights
    much more faithfully than the linear mean did, since the log of the
    signal roughly corresponds to the perceptual brightness.
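    A minimal CPU-side sketch of such a log-average; the real computation
    happens in a compute shader, and the epsilon handling here is an
    assumption:

        #include <math.h>
        #include <stddef.h>

        // Averaging log2(luma) instead of luma keeps isolated highlights
        // from dominating the estimate.
        static float log_average(const float *luma, size_t n)
        {
            const float eps = 1e-6f;   // avoid log(0) on pure black pixels
            double sum = 0.0;
            for (size_t i = 0; i < n; i++)
                sum += log2(luma[i] + eps);
            return exp2f((float)(sum / n));
        }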
| * vo_gpu: allow boosting dark scenes when tone mapping (Niklas Haas, 2019-02-18, 3 files, -1/+4)

    In theory our "eye adaptation" algorithm works in both ways, both
    darkening bright scenes and brightening dark scenes. But I've always
    just prevented the latter with a hard clamp, since I wanted to avoid
    blowing up dark scenes into looking funny (and full of noise).

    But allowing a tiny bit of over-exposure might be a good thing. I won't
    change the default just yet (better let users test), but a moderate
    value of 1.2 might be better than the current 1.0 limit. Needs testing
    especially on dark scenes.
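    The limit being discussed is, in effect, a cap on the exposure gain. A
    hypothetical sketch (names and the exact formula are assumptions, not
    mpv's code):

        // target / average is the naive exposure correction; max_boost is
        // the knob: 1.0 never brightens, 1.2 allows slight over-exposure.
        static float exposure_gain(float sig_avg, float sig_target,
                                   float max_boost)
        {
            float gain = sig_target / sig_avg;
            return gain > max_boost ? max_boost : gain;
        }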
| * vo_gpu: redesign peak detection algorithm (Niklas Haas, 2019-02-18, 3 files, -77/+61)

    The previous approach of using an FIR with a tunable hard threshold for
    scene changes had several problems:

    - the FIR involved annoying hard-coded buffer sizes and high VRAM usage,
      and the FIR sum was prone to numerical overflow, which limited the
      number of frames we could average over.
    - the hard scene change detection was prone to both false positives and
      false negatives, each with their own (annoying) issues.

    We also totally redesign the scene change detection. Scrap this entirely
    and switch to a dual approach of using a simple single-pole IIR low pass
    filter to smooth out noise, while using a softer scene change curve
    (with tunable low and high thresholds), based on `smoothstep`.

    The IIR filter is extremely simple in its implementation and has an
    arbitrarily user-tunable cutoff frequency, while the smoothstep-based
    scene change curve provides a good, tunable tradeoff between adaptation
    speed and stability - without exhibiting either of the traditional
    issues associated with the hard cutoff.

    Another way to think about the new options is that the "low threshold"
    provides a margin of error within which we don't care about small
    fluctuations in the scene (which will therefore be smoothed out by the
    IIR filter).
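    A sketch of the two building blocks named above; parameter names and the
    way mpv combines them are assumptions for illustration:

        #include <math.h>

        // Single-pole IIR low pass: smooths the measured scene brightness.
        // 'coeff' in (0,1] would be derived from the tunable cutoff frequency.
        static float iir_step(float state, float input, float coeff)
        {
            return state + coeff * (input - state);
        }

        // Soft scene-change weight: 0 below 'lo', 1 above 'hi', smoothstep
        // in between (thresholds in dB of brightness change).
        static float scene_change_weight(float delta_db, float lo, float hi)
        {
            float t = (delta_db - lo) / (hi - lo);
            t = fminf(fmaxf(t, 0.0f), 1.0f);
            return t * t * (3.0f - 2.0f * t);
        }

    Small fluctuations below the low threshold are simply absorbed by the
    IIR filter; as the weight approaches 1, the filtered state can be
    snapped towards the new measurement instead.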
| * vo_gpu: improve tone mapping desaturation (Niklas Haas, 2019-02-18, 4 files, -76/+87)

    Instead of desaturating towards luma, we desaturate towards the
    per-channel tone mapped version. This essentially provides a smooth
    roll-off towards the "hollywood"-style (non-chromatic) tone mapping
    algorithm, which works better for bright content, while continuing to
    use the "linear" style (chromatic) tone mapping algorithm for primarily
    in-gamut content.

    We also split up the desaturation algorithm into strength and exponent,
    which allows users to use less aggressive desaturation settings without
    affecting the overall curve.
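    A very rough sketch of how a strength/exponent split can factor into the
    blend weight towards the per-channel tone-mapped color; the actual curve
    used by mpv's shader is not reproduced here:

        #include <math.h>

        static float desat_weight(float sig, float sig_peak,
                                  float strength, float exponent)
        {
            // how far the signal exceeds the representable range
            float over = fmaxf(sig - 1.0f, 0.0f) / fmaxf(sig_peak - 1.0f, 1e-6f);
            return strength * powf(over, exponent);
        }
        // used roughly as: out = mix(chromatic_mapped, per_channel_mapped, w)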
| * vo_gpu: allow resetting target-peak to the trc default (Kotori Itsuka, 2019-01-23, 1 file, -1/+2)

    Add "auto" to the possible values of target-peak. The default value for
    target-peak is to calculate the target using mp_trc_nom_peak.
    Unfortunately, this default was outside the acceptable range of 10-10000
    nits, which prevented its later reassignment. So add an "auto" choice to
    target-peak which lets clients and scripts go back to using the trc
    default after assigning a value.
* | vo: use a struct for vsync feedback stuff (wm4, 2018-12-06, 1 file, -8/+2)

    So new useless stuff can be easily added.
* | vo_gpu: glx: use GLX_OML_sync_control for better vsync reporting (wm4, 2018-12-06, 1 file, -0/+9)

    Use the extension to compute the (hopefully correct) video delay and
    vsync phase.

    This is very fuzzy, because the latency will suddenly be applied after
    some frames have already been shown. This means there _will_ be "jumps"
    in the time accounting, which can lead to strange effects at start of
    playback (such as making initial "dropped" etc. frames worse). The only
    reasonable way to fix this would be running a few dummy frame swaps at
    start of playback until the latency is known. The same happens when
    unpausing.

    This only affects display-sync mode.

    Correct function was not confirmed. It only "looks right". I don't have
    the equipment to make scientifically correct measurements.

    A potentially bad thing is that we trust the timestamps we're receiving.
    Out of bounds timestamps could wreak havoc. On the other hand, this will
    probably cause the higher level code to panic and just disable DS.

    As a further caveat, this makes a bunch of assumptions about UST
    timestamps. If there are delayed frames (i.e. we skipped one or more
    vsyncs), the latency logic is mostly reset. There is no attempt to make
    the vo.c skipped-vsync logic use this. Also, the latency computation
    determines a vsync duration, and there's no effort to reconcile or share
    the vo.c logic for determining vsync duration.
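    For reference, a minimal sketch of querying the extension's counters;
    mpv's actual logic (deriving vsync duration and latency from successive
    swaps) is considerably more involved:

        #include <stdint.h>
        #include <GL/glx.h>
        #include <GL/glxext.h>   // GLX_OML_sync_control prototypes

        static void query_sync_values(Display *dpy, GLXDrawable win)
        {
            PFNGLXGETSYNCVALUESOMLPROC GetSyncValues =
                (PFNGLXGETSYNCVALUESOMLPROC)
                    glXGetProcAddress((const GLubyte *)"glXGetSyncValuesOML");
            if (!GetSyncValues)
                return;          // extension not available

            int64_t ust = 0, msc = 0, sbc = 0;
            if (GetSyncValues(dpy, win, &ust, &msc, &sbc)) {
                // ust: system time of the last vblank (microseconds)
                // msc: media stream counter, increments once per vblank
                // sbc: swap buffer count, increments per completed swap
            }
        }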
* | Merge commit '559a400ac36e75a8d73ba263fd7fa6736df1c2da' into wm4-commits--merge-edition (Anton Kindestam, 2018-12-05, 1 file, -1/+3)

    This bumps libmpv version to 1.103.
| * player: get rid of mpv_global.opts (wm4, 2018-05-24, 1 file, -1/+3)

    This was always a legacy thing. Remove it by applying an orgy of
    mp_get_config_group() calls, and sometimes m_config_cache_alloc() or
    mp_read_option_raw().

    win32 changes untested.
* | spirv: remove --spirv-compiler=nvidia (Niklas Haas, 2018-12-01, 1 file, -8/+0)

    This option has been deprecated upstream for a long time, probably
    doesn't even work anymore, and won't work moving forwards as we replace
    the vulkan code by libplacebo wrappers.

    I haven't removed the option completely yet since in theory we could
    still add support for e.g. a native glslang wrapper in the future. But
    most likely the future of this code is deletion.

    As an aside, fix an issue where the man page didn't mention d3d11.
* | drm: rename plane options to better, invariant, names (Anton Kindestam, 2018-12-01, 1 file, -3/+3)

    This commit bumps the libmpv version to 1.102.

    drm-osd-plane   -> drm-draw-plane
    drm-video-plane -> drm-drmprime-video-plane
    drm-osd-size    -> drm-draw-surface-size

    "draw plane", as in the plane that OpenGL draws to, whether it be video
    + OSD or just OSD. "drmprime video plane", as in the plane used for
    hwdec video imported via drmprime. "draw surface size", as in the size
    of the surface used for the draw plane.

    The new names are invariant whether or not hwdec_drmprime_drm is being
    used or not. The original naming was very confusing, as when doing
    regular rendering (swdec or vaapi) the video would be displayed on the
    "OSD plane", and the "Video plane" would remain unused.
* | gpu: prefer wayland context on autodetect (dudemanguy, 2018-11-19, 1 file, -3/+3)
* | vo_libmpv: support render performance data (Akemi, 2018-11-13, 1 file, -0/+9)
* | vo_gpu: vulkan: Add support for exporting buffer memory (Philip Langdale, 2018-10-22, 1 file, -0/+1)

    The CUDA/Vulkan interop works on the basis of memory being exported from
    Vulkan and then imported by CUDA. To enable this, we add a way to
    declare a buffer as being intended for export, and then add a function
    to do the export.

    For now, we support the fd and Handle based exports on Linux and Windows
    respectively. There are others, which we can support when a need arises.

    Also note that this is just for exporting buffers, rather than textures
    (VkImages). Image import on the CUDA side is supposed to work, but it is
    currently buggy and waiting for a new driver release.

    Finally, at least with my nvidia hardware and drivers, everything seems
    to work even if we don't initialise the buffer with the right
    exportability options. Nevertheless I'm enforcing it so that we're
    following the spec.
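    A condensed sketch of the Linux (opaque fd) export path in plain Vulkan;
    error handling, the buffer itself, and loading vkGetMemoryFdKHR via
    vkGetDeviceProcAddr are omitted, and this is not mpv's actual code:

        #include <vulkan/vulkan.h>

        static int export_memory_fd(VkDevice dev, VkDeviceSize size,
                                    uint32_t mem_type_index,
                                    PFN_vkGetMemoryFdKHR GetMemoryFdKHR,
                                    VkDeviceMemory *out_mem, int *out_fd)
        {
            VkExportMemoryAllocateInfo export_info = {
                .sType = VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO,
                .handleTypes = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
            };
            VkMemoryAllocateInfo alloc_info = {
                .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
                .pNext = &export_info,   // mark the allocation as exportable
                .allocationSize = size,
                .memoryTypeIndex = mem_type_index,
            };
            if (vkAllocateMemory(dev, &alloc_info, NULL, out_mem) != VK_SUCCESS)
                return -1;

            VkMemoryGetFdInfoKHR fd_info = {
                .sType = VK_STRUCTURE_TYPE_MEMORY_GET_FD_INFO_KHR,
                .memory = *out_mem,
                .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_OPAQUE_FD_BIT,
            };
            return GetMemoryFdKHR(dev, &fd_info, out_fd) == VK_SUCCESS ? 0 : -1;
        }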
* | vo_gpu: vulkan: fix strncpy truncation in spirv_compiler_init (BtbN, 2018-10-21, 1 file, -1/+1)

    Fixes GCC8 warning:

    ../video/out/gpu/spirv.c: In function 'spirv_compiler_init':
    ../video/out/gpu/spirv.c:68:9: warning: 'strncpy' specified bound 32
    equals destination size [-Wstringop-truncation]
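    The warning fires because strncpy(dst, src, sizeof(dst)) may leave dst
    without a terminating NUL. A common fix pattern (illustrative, not
    necessarily the exact change made in spirv.c):

        #include <stdio.h>
        #include <string.h>

        static char name[32];

        static void copy_name(const char *src)
        {
            strncpy(name, src, sizeof(name) - 1);
            name[sizeof(name) - 1] = '\0';   // guarantee termination
            // or equivalently: snprintf(name, sizeof(name), "%s", src);
        }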
* | vo_gpu: split --linear-scaling into two separate options (Niklas Haas, 2018-10-19, 2 files, -13/+25)

    Since linear downscaling makes sense to handle independently from
    linear/sigmoid upscaling, we split this option up. Now,
    linear-downscaling is its own option that only controls linearization
    when downscaling and nothing more. Likewise, linear-upscaling /
    sigmoid-upscaling are two mutually exclusive options (the latter
    overriding the former) that apply only to upscaling and no longer
    implicitly enable linear light downscaling as well.

    The old behavior was very confusing, as evidenced by issues such as
    #6213. The current behavior should make much more sense, and only
    minimally breaks backwards compatibility (since using linear-scaling
    directly was very uncommon - most users got this for free as part of
    gpu-hq and relied only on that).

    Closes #6213.
* | vo_gpu: fix vec3 packing in UBOs/push_constants (Niklas Haas, 2018-09-29, 1 file, -11/+13)

    For vec3, the alignment and size differ. The current code will pack a
    struct like { vec3; float; vec2 } into 8 machine words, whereas the spec
    would only use 6.

    This actually fixes a real bug: the only place in the code I could find
    where it was conceivably possible that a vec3 is followed by a float was
    when using --gpu-dumb-mode in combination with --gamma-factor, and only
    when --gpu-api=vulkan. So it's no surprise nobody ran into it yet.
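    For illustration, the std140/std430-style offsets for such a struct: a
    vec3 is 16-byte aligned but only 12 bytes in size, so the following
    float packs into the same 16-byte slot.

        // Mirrors the GPU-side layout of { vec3; float; vec2 }:
        struct ubo_layout {
            float v3[3];   // offset  0, size 12 (vec3: alignment 16, size 12)
            float f;       // offset 12, size  4 (reuses the vec3's slot)
            float v2[2];   // offset 16, size  8 (vec2: alignment 8)
        };
        // Total: 24 bytes = 6 machine words, not the 8 the buggy packing used.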
* | vo_gpu: use explicit offsets for push constants (Niklas Haas, 2018-09-29, 1 file, -2/+1)

    These used to be unsupported long ago, but it seems glslang added
    support in the meantime. (I don't know which version, but I'm guessing
    it was long enough ago that we don't have to add a feature check.)

    Should hopefully help make push constant layouts more robust against
    possible bugs either in our code or in the driver.
* | vo_gpu: adjust PRNG variant used by GL shaders (sfan5, 2018-09-26, 1 file, -1/+5)

    Certain low-end Mali GPUs have a rather low precision and overflow
    during the PRNG calculations, thereby breaking e.g. deband-grain.
    Modify the permute() to avoid this; this does not impact the quality of
    the PRNG output (noticeably).

    This problem was observed on:
    GL_VENDOR='ARM', GL_RENDERER='Mali-T720'
    GL_VERSION='OpenGL ES 3.1 v1.r15p0-00rel0.bdd9e62cdc8c88e0610a16b5901161e9'
* | vo_gpu: switch to optimization level performance (Niklas Haas, 2018-09-01, 1 file, -1/+1)

    Upstream has this now. Didn't really make any difference for me (except
    making the polar compute shader 2%-3% faster), but maybe it does for
    somebody else.
* | vo_gpu: avoid overwriting compute shader block sizes (Niklas Haas, 2018-08-26, 1 file, -4/+10)

    When using multiple compute shaders as part of the same pass, there can
    be a conflict in the block sizes. In the problematic case, the HDR
    detection shader can collide with the polar sampling shader. In this
    case, the solution is clear - the passes that can handle any size should
    "give in" and not overwrite the block sizes.

    Fixes #6083.
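    A sketch of the "give in" rule in isolation; the struct and function
    names are hypothetical, not mpv's internals:

        struct pass_state {
            int block_w, block_h;   // 0 means "not set yet"
        };

        // A pass that works with any block size only claims the dimensions
        // if nothing else has; a size-sensitive pass (flexible == 0, e.g.
        // polar sampling) always gets to set them.
        static void request_block_size(struct pass_state *p, int w, int h,
                                       int flexible)
        {
            if (flexible && (p->block_w || p->block_h))
                return;             // keep the existing, stricter requirement
            p->block_w = w;
            p->block_h = h;
        }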
* | wscript: split egl-android from android (Tom Yan, 2018-08-20, 1 file, -1/+1)
* | gpu: prefer 16bit floating point FBO formats to 16bit integer ones (Jan Ekström, 2018-07-08, 1 file, -1/+1)

    According to earlier discussions, this can improve visual quality. This
    only changes the preferred order of the formats, not the formats
    themselves.
* | vo_gpu: desaturate after peak detection (Niklas Haas, 2018-05-31, 1 file, -12/+12)

    This sacrifices some dynamic range for well-behaved sources, but
    prevents catastrophic desaturation on badly mastered / too bright
    sources. I think that's the better trade-off. This makes the
    desaturation algorithm much "safer" to deploy by default, as well.

    One could even argue going up to strength 1.0, which works better for
    some sources but worse for others. But I think the current strength is
    the best trade-off even after this change.
* vo_gpu: allow higher icc-contrast and improve logging (Niklas Haas, 2018-05-17, 1 file, -2/+3)

    With the advent of actual HDR devices, my real measured ICC profile has
    an "infinite" contrast, since the display is completely off on pure
    black inputs. 100k:1 might not be enough, so let's just bump it up to
    1m:1 to be safe.

    Also, improve the logging in the case that the detected contrast is too
    high by default.
* drm/atomic: refactor hwdec_drmprime_drm with native resources (LongChair, 2018-05-01, 1 file, -0/+9)

    That new API was introduced and allows having several native resources.
    This uses that mechanism for drm resources rather than the deprecated
    opengl-cb structs.

    This patch therefore adds two structs that can be used with the drm
    atomic interop:
    - mpv_opengl_drm_params: which will hold all the drm handles
    - mpv_opengl_drm_osd_size: which will hold the osd layer size

    This commit adds a drm-osd-size=WxH parameter to the command line which
    allows defining the OSD plane dimension. OSD can be upscaled to screen
    resolution when having OSD at video resolution is too heavy. This is
    especially useful for UHD modes on embedded devices where the GPU cannot
    handle UHD modes at a decent framerate.
* vo_gpu/video: disable compute shaders if an FBO format was not available (Jan Ekström, 2018-05-01, 1 file, -0/+5)

    This is actually more generic and better than just lazily plastering
    peak calculation together with dumb mode.
* vo_gpu/video: add improved logging when a user-specified FBO fails (Jan Ekström, 2018-05-01, 1 file, -2/+13)

    I don't know if we can just return from this function, so for now just
    adding this piece of logging.
* gpu/video: make HDR peak computing work without work group count (Niklas Haas, 2018-04-29, 1 file, -4/+5)

    Define a hard-coded value for gl_NumWorkGroups if it is not available.
    This adds an additional requirement of needing a shader recompile for
    all window size changes.

    This was considered a worthwhile compromise as currently f.ex. d3d11
    completely lacked any peak computation - this is a major quality of life
    upgrade.
* gpu/video: improve HDR peak computation feature check logging (Jan Ekström, 2018-04-29, 1 file, -1/+4)

    Now that the feature depends on multiple features, log all of their
    states in the message.
* video: remove internal stereo_out flag (wm4, 2018-04-29, 1 file, -2/+2)

    Also rename stereo3d to stereo_in. The only real change is that the
    vo_gpu OSD code now uses the actual stereo 3D mode, instead of the
    --video-stereo-mode value. (Why does this vo_gpu code even exist?)
* vo_libmpv: support GPU rendered screenshots (wm4, 2018-04-29, 1 file, -0/+9)

    Like DR, this needed a lot of preparation, and here's the boring glue
    code that finally implements it.
* vo_libmpv: add support for DR (wm4, 2018-04-29, 1 file, -0/+9)

    With all the preparation work done, this only has to do the annoying
    dance of passing it through all the damn layers.
* vo_gpu: move some extra code for screenshot to video.c (wm4, 2018-04-20, 1 file, -5/+14)