summaryrefslogtreecommitdiffstats
path: root/video/out/opengl
Commit message (Collapse)AuthorAgeFilesLines
* vo_gpu: hwdec_vaegl: Rename and move to hwdec_vaapiPhilip Langdale2019-07-081-558/+0
| | | | | In preparation for adding Vulkan interop support, let's rename to remove the egl reference and move to an api neutral location.
* vo/gpu: hwdec_vdpau: Support direct mode for 4:4:4 contentPhilip Langdale2019-07-081-4/+15
| | | | | | | New releases of VDPAU support decoding 4:4:4 content, and that comes back as NV24 when using 'direct mode' in OpenGL Interop. That means we need to be a little bit smarter about how we set up the OpenGL textures.
* opengl/context_wayland: Fix crash on configure before initial reconfigMichael Forney2019-07-081-1/+3
| | | | | | | | | If the compositor sends a configure event before the surface is initially mapped, resize gets called before the egl_window gets created, resulting in a crash in wl_egl_window_resize. This was fixed back in 618361c697, but was reintroduced when the wayland code was rewritten in 68f9ee7e0b.
* video/out/gpu: Add a `storable` flag to ra_formatPhilip Langdale2019-07-081-0/+3
| | | | | | | | | | | | | | | | While `ra` supports the concept of a texture as a storage destination, it does not support the concept of a texture format being usable for a storage texture. This can lead to us attempting to create a texture from an incompatible format, with undefined results. So, let's introduce an explicit format flag for storage and use it. In `ra_pl` we can simply reflect the `storable` flag. For GL and D3D, we'll need to write some new code to do the compatibility checks. I'm not going to do it here because it's not a regression; we were already implicitly assuming all formats were storable. Fixes #6657
* drm_common: Add proper help option to drm-modeAnton Kindestam2019-05-041-1/+1
| | | | | | | This was implemented by using OPT_STRING_VALIDATE for drm-mode, instead of OPT_INT. Using a string here also prepares for future additions to drm-mode that aim to allow specifying a mode by its resolution.
* drm_common: Add option to toggle use of atomic modesettingAnton Kindestam2019-05-041-1/+2
| | | | | | It is useful when debugging to be able to force atomic off, or as a workaround if atomic breaks for some user. Legacy modesetting is less likely to break by virtue of being a less complex API.
* vo/gpu: hwdec_cuda: Refactor gpu api specific code into separate filesPhilip Langdale2019-05-031-749/+0
| | | | | | | | The amount of code now present that's specific to Vulkan or OpenGL has reached the point where we really want to split it out to avoid a mess of #ifdefs. At the same time, I'm moving the code to an api neutral location.
* context_drm_egl: Add support for presentation feedbackAnton Kindestam2019-05-031-15/+118
| | | | | This implements presentation feedback for context_drm_egl using the values that get fed to the page flip handler.
* vo_gpu/hwdec_cuda: fixup compilation with vulkan disabledJan Ekström2019-04-221-0/+2
| | | | | | | The actual code utilizing this enum was seemingly properly if'd, but not the enum in the struct itself. Fixes compilation.
* vo/gpu: hwdec_cuda: Reorganise backend-specific codePhilip Langdale2019-04-211-151/+223
| | | | | This tries to tidy up the GL vs Vulkan code to be a bit cleaner and easier to read.
* vo_gpu: hwdec_cuda: Implement interop for placeboPhilip Langdale2019-04-211-145/+224
| | | | | | | | | | | | | | | | | | | | This change updates the vulkan interop code to work with the libplacebo based ra_vk, but also introduces direct VkImage sharing to avoid the use of the intermediate buffer. It is also necessary and desirable to introduce explicit semaphore bsed synchronisation for operations on the shared images. Synchronisation means we can safely reuse the same VkImage for every mapped frame, by ensuring the frame is copied to the VkImage before mapping the next frame. This functionality requires a 417.xx or newer nvidia driver, due to bugs in the VkImage interop in the earlier 411 and 415 drivers. It's definitely worth the effort, as the raw throughput is about twice that of implementation using an intermediate buffer.
* vo_gpu: vulkan: use libplacebo insteadNiklas Haas2019-04-211-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | This commit rips out the entire mpv vulkan implementation in favor of exposing lightweight wrappers on top of libplacebo instead, which provides much of the same except in a more up-to-date and polished form. This (finally) unifies the code base between mpv and libplacebo, which is something I've been hoping to do for a long time. Note: The ra_pl wrappers are abstract enough from the actual libplacebo device type that we can in theory re-use them for other devices like d3d11 or even opengl in the future, so I moved them to a separate directory for the time being. However, the rest of the code is still vulkan-specific, so I've kept the "vulkan" naming and file paths, rather than introducing a new `--gpu-api` type. (Which would have been ended up with significantly more code duplicaiton) Plus, the code and functionality is similar enough that for most users this should just be a straight-up drop-in replacement. Note: This commit excludes some changes; specifically, the updates to context_win and hwdec_cuda are deferred to separate commits for authorship reasons.
* vo_gpu: index desc namespaces by raNiklas Haas2019-04-211-1/+1
| | | | | No reason to require them be constant. This allows them to depend on runtime characteristics of the `ra`.
* Merge branch 'master' into pr6360Jan Ekström2019-03-112-59/+151
|\ | | | | | | | | | | Manual changes done: * Merged the interface-changes under the already master'd changes. * Moved the hwdec-related option changes to video/decode/vd_lavc.c.
| * context_drm_egl: implement n-bufferingAnton Kindestam2019-02-251-59/+150
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows context_drm_egl to use as many buffers as libgbm or the swapchain_depth setting allows (whichever is smaller). On pause and on still images (cover art etc.) to make sure that output does not lag behind user input, the swapchain is drained and reverts to working in a dual buffered (equivalent to swapchain-depth=1) manner. When possible (swapchain-depth>=2), the wait on the page flip event is now not done immediately after queueing, but is deferred to the next invocation of swap_buffers. Which should give us more CPU time between invocations. Although, since gbm_surface_has_free_buffers() can only tell us a boolean value and not how many buffers we have left, we are forced to do this contortionist dance where we first overshoot until gbm_surface_has_free_buffers() reports 0, followed by immediately waiting so we can free a buffer, to be able to get the deferred wait on page flip rolling. With this commit we do not rely on the default vsync fences/latency emulation of video/out/opengl/context.c, but supply our own, since the places we create and wait for the fences needs to be somewhat different for best performance. Minor fixes: * According to GBM documentation all BO:s gotten with gbm_surface_lock_front_buffer must be released before gbm_surface_destroy is called on the surface. * We let the page flip handler function handle the waiting_for_flip flag.
| * opengl: Support GL_ARB_sync style fences on OpenGL ES 3.0Anton Kindestam2019-02-251-0/+1
| | | | | | | | | | OpenGL ES 3.0 and up has suppport for for GL_ARB_sync style fences. Make sure that mpv can use them.
* | vo, vo_gpu, glx: correct GLX_OML_sync_control usagewm42018-12-061-67/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I misunderstood how this extension works. If I understand it correctly now, it's worse than I thought. They key thing is that the (ust, msc, sbc) tripple is not for a single swap event. Instead, (ust, msc) run independently from sbc. Assuming a CFR display/compositor, this means you can at best know the vsync phase and frequency, but not the exact time a sbc changed value. There is GLX_INTEL_swap_event, which might work as expected, but it has no EGL equivalent (while GLX_OML_sync_control does, in theory). Redo the context_glx sync code. Now it's either more correct or less correct. I wanted to add proper skip detection (if a vsync gets skipped due to rendering taking too long and other problems), but it turned out to be too complex, so only some unused fields in vo.h are left of it. The "generic" skip detection has to do. The vsync_duration field is also unused by vo.c. Actually this seems to be an improvement. In cases where the flip call timing is off, but the real driver-level timing apparently still works, this will not report vsync skips or higher vsync jitter anymore. I could observe this with screenshots and fullscreen switching. On the other hand, maybe it just introduces an A/V offset or so. Why the fuck can't there be a proper API for retrieving these statistics? I'm not even asking for much.
* | vo: use a struct for vsync feedback stuffwm42018-12-063-8/+10
| | | | | | | | So new useless stuff can be easily added.
* | vo_gpu: glx: use GLX_OML_sync_control for better vsync reportingwm42018-12-063-0/+111
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the extension to compute the (hopefully correct) video delay and vsync phase. This is very fuzzy, because the latency will suddenly be applied after some frames have already been shown. This means there _will_ be "jumps" in the time accounting, which can lead to strange effects at start of playback (such as making initial "dropped" etc. frames worse). The only reasonable way to fix this would be running a few dummy frame swaps at start of playback until the latency is known. The same happens when unpausing. This only affects display-sync mode. Correct function was not confirmed. It only "looks right". I don't have the equipment to make scientifically correct measurements. A potentially bad thing is that we trust the timestamps we're receiving. Out of bounds timestamps could wreak havoc. On the other hand, this will probably cause the higher level code to panic and just disable DS. As a further caveat, this makes a bunch of assumptions about UST timestamps. If there are delayed frames (i.e. we skipped one or more vsyncs), the latency logic is mostly reset. There is no attempt to make the vo.c skipped vsync logic to use this. Also, the latency computation determines a vsync duration, and there's no effort to reconcile or share the vo.c logic for determining vsync duration.
* drm: rename plane options to better, invariant, namesAnton Kindestam2018-12-012-70/+70
| | | | | | | | | | | | | | | | | | | | | This commit bumps the libmpv version to 1.102 drm-osd-plane -> drm-draw-plane drm-video-plane -> drm-drmprime-video-plane drm-osd-size -> drm-draw-surface-size "draw plane", as in the plane that OpenGL draws to, whether it be video + OSD or just OSD. "drmprime video plane", as in the plane used for hwdec video imported via drmprime. "draw surface size", as in the size of the surface used for the draw plane The new names are invariant whether or not hwdec_drmprime_drm is being used or not. The original naming was very confusing, as when doing regular rendering (swdec or vaapi) the video would be displayed on the "OSD plane", and the "Video plane" would remain unused.
* vo_gpu: hwdec_cuda: Guard GL and Vulkan headers properlyPhilip Langdale2018-11-181-0/+5
| | | | | | | | We are currently unnecessarily including vulkan headers even when not building with vulkan support. I also guarded the GL header inclusion even though this doesn't appear to break anything today. Fixes #6330.
* vo_gpu: opengl: disable compute shaders for old GLSLNiklas Haas2018-11-171-0/+6
| | | | Fixes #6272.
* vo_gpu: hwdec_cuda: Clean up init() error handlingPhilip Langdale2018-10-311-10/+15
| | | | | | | | | | | | Currently, the error paths in init() are a bit confusing, and we can end up trying to pop the current context when there is no context, which leads to distracting error messages. I also added an explicit path to return early if the GPU backend is not OpenGL or Vulkan. It's pointless to do any other cuda init after that point. (Of course, someone could write more interops.) Fixes #6256
* hwdec_drmprime_drm: Missing NULL-check on drm_atomic_context video_planeAnton Kindestam2018-10-251-0/+3
| | | | | | | | | Since 810acf32d6cf0dfbad66c602689ef1218fc0a6e3 video_plane can be NULL under some circumstances. While there is a check in init, init treats this as an error condition and would call uninit, which in turn calls disable_video_plane, which would then segfault. Fix this by including a NULL check inside disable_video_plane, so that it doesn't try to disable what isnt' there.
* vo_gpu: vulkan: hwdec_cuda: Add support for Vulkan interopPhilip Langdale2018-10-221-58/+285
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Despite their place in the tree, hwdecs can be loaded and used just fine by the vulkan GPU backend. In this change we add Vulkan interop support to the cuda/nvdec hwdec. The overall process is mostly straight forward, so the main observation here is that I had to implement it using an intermediate Vulkan buffer because the direct VkImage usage is blocked by a bug in the nvidia driver. When that gets fixed, I will revist this. Nevertheless, the intermediate buffer copy is very cheap as it's all device memory from start to finish. Overall CPU utilisiation is pretty much the same as with the OpenGL GPU backend. Note that we cannot use a single intermediate buffer - rather there is a pool of them. This is done because the cuda memcpys are not explicitly synchronised with the texture uploads. In the basic case, this doesn't matter because the hwdec is not asked to map and copy the next frame until after the previous one is rendered. In the interpolation case, we need extra future frames available immediately, so we'll be asked to map/copy those frames and vulkan will be asked to render them. So far, harmless right? No. All the vulkan rendering, including the upload steps, are batched together and end up running very asynchronously from the CUDA copies. The end result is that all the copies happen one after another, and only then do the uploads happen, which means all textures are uploaded the same, final, frame data. Whoops. Unsurprisingly this results in the jerky motion because every 3/4 frames are identical. The buffer pool ensures that we do not overwrite a buffer that is still waiting to be uploaded. The ra_buf_pool implementation automatically checks if existing buffers are available for use and only creates a new one if it really has to. It's hard to say for sure what the maximum number of buffers might be but we believe it won't be so large as to make this strategy unusable. The highest I've seen is 12 when using interpolation with tscale=bicubic. A future optimisation here is to synchronise the CUDA copies with respect to the vulkan uploads. This can be done with shared semaphores that would ensure the copy of the second frames only happens after the upload of the first frame, and so on. This isn't trivial to implement as I'd have to first adjust the hwdec code to use asynchronous cuda; without that, there's no way to use the semaphore for synchronisation. This should result in fewer intermediate buffers being required.
* vo_gpu: opengl: fix segfault when gl->DeleteSync is unavailableNiklas Haas2018-10-161-1/+3
| | | | | | | This deinit code was never checked, so this line would always crash on implementations without support for sync objects. Fixes #6197.
* cocoa-cb: add Apple Software Renderer supportAkemi2018-09-301-1/+2
| | | | | | by default the pixel format creation falls back to software renderer when everything fails. this is mostly needed for VMs. additionally one can directly request an sw renderer or exclude it entirely.
* drm_atomic: Allow to create atomic context w/o drmprime video planeAnton Kindestam2018-09-301-0/+4
| | | | | | | | | | | | | | | This is to improve the experience when running with default settings on a driver that doesn't have any overlay planes (or indeed only one plane), but still supports DRM atomic. Since the drmprime video plane is set to pick an overlay plane by default it would fail on these drivers due to not being able to create any atomic context. Users with such cards had to specify --drm-video-plane-id manually to some bogus value (it's not used after all). The "video" plane is only ever used by the drmprime-drm hwdec interop, which is not used at all in the typical usecase where everything is actually rendered on to the "OSD" plane using EGL, so having an atomic context without the "video" plane should be fine most of the time.
* hwdec_vaegl: Fix VAAPI EGL interop used with gpu-context=drmAnton Kindestam2018-07-092-5/+22
| | | | | | | | Add another parameter to mpv_opengl_drm_params to hold the FD to the render node, so that the fd can be passed to hwdec_vaegl. The render node is opened in context_drm_egl and inferred from the primary device fd using drmGetRenderDeviceNameFromFd.
* context_drm_egl: Fix CRTC setup and release code when using atomicAnton Kindestam2018-07-091-33/+23
| | | | | | | | | | The previous code did not save enough information about the old state, and could end up changing what plane the fbcon:s FB got attached to, or in worse case causing a blank screen (observed in some multi-screen setups on Sandy Bridge). In addition refactor the handling of drmModeModeInfo property blobs to not leak, as well as enable reuse of already created blobs.
* context_drm_egl: Fix some memory leaks on error exitAnton Kindestam2018-07-091-63/+66
| | | | | Fix some memory leaks on error exit in crtc_setup_atomic and crtc_release_atomic.
* hwdec_drmprime_drm: Do not show error message during probingAnton Kindestam2018-06-081-1/+1
| | | | | Change the log-level of an error message that would sometimes show up during hwdec probing, and could be misleading.
* context_drm_egl: fix some comments and log messages that had not been ↵Anton Kindestam2018-05-011-6/+5
| | | | updated since the plane rename commit
* drm/atomic: Fix crtc_setup_atomic and crtc_release_atomicAnton Kindestam2018-05-011-25/+33
| | | | | | | | | | | Add some properties which where forgotten in crtc_setup_atomic. In both change to not use DRM_MODE_PAGE_FLIP_EVENT | DRM_MODE_ATOMIC_NONBLOCK flags. This should make it more similar to the drmSetCrtc which it aims to replace (take effect directly, and blocking call). This also saves us the trouble of having to set up a poll to wait for pageflip, which would've been neccesary with DRM_MODE_PAGE_FLIP_EVENT, in both crtc_setup_atomic and crtc_release_atomic.
* drm/atomic: disable video plane when unused.LongChair2018-05-011-0/+28
| | | | | | | | | | This patch will make sure that the video plane is hidden when unused. When using high resolution modes, typically UHD, and embedding mpv, having the video plane sitting in the back when you don't play any video is eating a lot of memory bandwidth for compositing. That patch makes sure that the video layer is just disabled before and after playback.
* drm/atomic: add atomic modesetting.LongChair2018-05-011-11/+104
| | | | | | | This commit allows to add atomic modesetting when using the atomic renderer. This is actually needed when using and osd with a smaller size than screen resolution. It will also make the drm atomic path more consistent
* drm/atomic: refactor planes namesLongChair2018-05-012-22/+23
| | | | | | | | We are currently using primary / overlay planes drm objects, assuming that primary plane is osd and overlay plane is video. This commit is doing two things : - replace the primary / overlay planes members with osd and video planes member without the assumption - Add two more options to determine which one of the primary / overlay is associated to osd / video. - It will default osd to overlay and video to primary if unspecified
* drm/atomic: add connector to atomic contextLongChair2018-05-012-1/+2
| | | | | | | | | | This patch adds - DRM connector object to atomic context. - fd property to the drm atomic object as well as a method to read blob type properties. This allows to ensure that the proper connector is picked up, especially when specifying it from the commandline, and also allows to make sure we're using the right one when embedding with interop into an application.
* drm/atomic: refactor hwdec_drmprime_drm with native resourcesLongChair2018-05-012-30/+52
| | | | | | | | | | | | | | | | | That new API was introduced and allows to have several native resources. Thisuses that mechanisma for drm resources rather than the deprecated opengl-cb structs. This patch therefore add two structs that can be used with the drm atomic interop. - mpv_opengl_drm_params : which will hold all the drm handles - mpv_opengl_drm_osd_size : which will hold osd layer size This commit adds a drm-osd-size=WxH parameter to commandline which allows to define the OSD plane dimension. OSD can be upscaled to screen resolution when having OSD at video resolution is too heavy. This is especially useful for UHD modes on embedded devices where the GPU cannot handle UHD modes at a decent framerate.
* cocoa: change deprecation warning from opengl-cb to libmpvAkemi2018-04-291-1/+1
|
* egl_helpers: change minimum framebuffer size to 8 bit per componentwm42018-04-291-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | This is for working around bugs in certain Android devices. At least one device fails to sort EGLConfigs by size, so eglChooseConfig() ends up choosing a config with 5/6/5 bits per r/g/b component. The other attributes in the affected EGLConfigs did not look like they should affect the sorting process as specified by the EGL 1.4 standard. The device was reported as: Sony Xperia Z3 Tablet Compact Firmware 6.0.1 build number 23.5.A.1.291 GL_VERSION='OpenGL ES 3.0 V@140.0 AU@ (GIT@I741a3d36ca)' GL_VENDOR='Qualcomm' GL_RENDERER='Adreno (TM) 330' Other Qualcom/Adreno devices have been reported as unaffected by this (including some with same GL_RENDERER string). "Fix" this by always requiring at least 8 bit. This means it would fail on devices which cannot provide this. We're fine with this. mpv-android/mpv-android#112
* egl_helpers: log certain EGL attributeswm42018-04-291-0/+38
| | | | Might be helpful with broken EGL implementations.
* hwdec_ios: fix crash after mapper_init failureAman Gupta2018-04-171-2/+4
|
* vo_gpu: hwdec: Use ffnvcodec to load CUDA symbolsPhilip Langdale2018-04-153-244/+45
| | | | | | The CUDA dynamic loader was broken out of ffmpeg into its own repo and package. This gives us an opportunity to re-use it in mpv and remove our custom loader logic.
* opengl: include details in EGL context errorsAman Gupta2018-04-121-3/+3
|
* client API: add a new way to pass X11 Display etc. to render APIwm42018-03-2614-86/+41
| | | | | | | | | | | | | | | | | | | | | |