summaryrefslogtreecommitdiffstats
path: root/video
Commit message (Collapse)AuthorAgeFilesLines
* vo_gpu: vulkan: hwdec_cuda: Add support for Vulkan interopPhilip Langdale2018-10-221-58/+285
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Despite their place in the tree, hwdecs can be loaded and used just fine by the vulkan GPU backend. In this change we add Vulkan interop support to the cuda/nvdec hwdec. The overall process is mostly straight forward, so the main observation here is that I had to implement it using an intermediate Vulkan buffer because the direct VkImage usage is blocked by a bug in the nvidia driver. When that gets fixed, I will revist this. Nevertheless, the intermediate buffer copy is very cheap as it's all device memory from start to finish. Overall CPU utilisiation is pretty much the same as with the OpenGL GPU backend. Note that we cannot use a single intermediate buffer - rather there is a pool of them. This is done because the cuda memcpys are not explicitly synchronised with the texture uploads. In the basic case, this doesn't matter because the hwdec is not asked to map and copy the next frame until after the previous one is rendered. In the interpolation case, we need extra future frames available immediately, so we'll be asked to map/copy those frames and vulkan will be asked to render them. So far, harmless right? No. All the vulkan rendering, including the upload steps, are batched together and end up running very asynchronously from the CUDA copies. The end result is that all the copies happen one after another, and only then do the uploads happen, which means all textures are uploaded the same, final, frame data. Whoops. Unsurprisingly this results in the jerky motion because every 3/4 frames are identical. The buffer pool ensures that we do not overwrite a buffer that is still waiting to be uploaded. The ra_buf_pool implementation automatically checks if existing buffers are available for use and only creates a new one if it really has to. It's hard to say for sure what the maximum number of buffers might be but we believe it won't be so large as to make this strategy unusable. The highest I've seen is 12 when using interpolation with tscale=bicubic. A future optimisation here is to synchronise the CUDA copies with respect to the vulkan uploads. This can be done with shared semaphores that would ensure the copy of the second frames only happens after the upload of the first frame, and so on. This isn't trivial to implement as I'd have to first adjust the hwdec code to use asynchronous cuda; without that, there's no way to use the semaphore for synchronisation. This should result in fewer intermediate buffers being required.
* vo_gpu: vulkan: Add a function to get the device UUIDPhilip Langdale2018-10-222-0/+25
| | | | We need this to do device matching for the cuda interop.
* vo_gpu: vulkan: Add arbitrary user data for an ra_vk_bufPhilip Langdale2018-10-222-0/+18
| | | | | | | | | This is arguably a little contrived, but in the case of CUDA interop, we have to track additional state on the cuda side for each exported buffer. If we want to be able to manage buffers with an ra_buf_pool, we need some way to keep that CUDA state associated with each created buffer. The easiest way to do that is to attach it directly to the buffers.
* vo_gpu: vulkan: Add support for exporting buffer memoryPhilip Langdale2018-10-228-6/+179
| | | | | | | | | | | | | | | | | | | | The CUDA/Vulkan interop works on the basis of memory being exported from Vulkan and then imported by CUDA. To enable this, we add a way to declare a buffer as being intended for export, and then add a function to do the export. For now, we support the fd and Handle based exports on Linux and Windows respectively. There are others, which we can support when a need arises. Also note that this is just for exporting buffers, rather than textures (VkImages). Image import on the CUDA side is supposed to work, but it is currently buggy and waiting for a new driver release. Finally, at least with my nvidia hardware and drivers, everything seems to work even if we don't initialise the buffer with the right exportability options. Nevertheless I'm enforcing it so that we're following the spec.
* vo_gpu: vulkan: suppress bogus error message on --vulkan-deviceNiklas Haas2018-10-211-5/+5
| | | | | | Since the code just broke out of the loop on a match rather than jumping straight to the end of the function body, it ended up hitting the code path for when the end of the list was reached.
* cocoa-cb: fix double clicking the title barAkemi2018-10-211-1/+29
| | | | | | | | | since we draw our own title bar we lose the standard functionality of the system provided title bar. because of that we have to reimplement the functionality of double clicking the title bar. depending on the system preferences we want to minimize, zoom or do nothing. Fixes #6223
* vo_gpu: vulkan: fix strncpy truncation in spirv_compiler_initBtbN2018-10-211-1/+1
| | | | | | Fixes GCC8 warning ../video/out/gpu/spirv.c: In function 'spirv_compiler_init': ../video/out/gpu/spirv.c:68:9: warning: 'strncpy' specified bound 32 equals destination size [-Wstringop-truncation]
* x11: fix icc profile when the window goes near off screenslatchurie2018-10-211-1/+1
| | | | | | | | On a multi monitor setup, when the center of the window was going off screen, the icc profile would always switch to the profile of the first screen. This fixes the issue by defaulting the value to the current screen.
* vo_gpu: vulkan: fix the buffer size on partial uploadNiklas Haas2018-10-191-0/+1
| | | | | | | | | | | This was pased on the texture height, which was a mistake. In some cases it could exceed the actual size of the buffer, leading to a vulkan API error. This didn't seem to cause any problems in practice, since a too-large synchronization is just bad for performance and shouldn't do any harm internally, but either way, it was still undefined behavior to submit a barrier outside of the buffer size. Fix the calculation, thus fixing this issue.
* vo_gpu: split --linear-scaling into two separate optionsNiklas Haas2018-10-192-13/+25
| | | | | | | | | | | | | | | | | | Since linear downscaling makes sense to handle independently from linear/sigmoid upscaling, we split this option up. Now, linear-downscaling is its own option that only controls linearization when downscaling and nothing more. Likewise, linear-upscaling / sigmoid-upscaling are two mutually exclusive options (the latter overriding the former) that apply only to upscaling and no longer implicitly enable linear light downscaling as well. The old behavior was very confusing, as evidenced by issues such as #6213. The current behavior should make much more sense, and only minimally breaks backwards compatibility (since using linear-scaling directly was very uncommon - most users got this for free as part of gpu-hq and relied only on that). Closes #6213.
* x11_common: replace atoi with strtoulNicolas F2018-10-191-1/+2
| | | | | | Using strtol and strtoul is allegedly better practice, and I'm going for strtoul here because I'm pretty sure X11 displays cannot be in the negative.
* vo_gpu: opengl: fix segfault when gl->DeleteSync is unavailableNiklas Haas2018-10-161-1/+3
| | | | | | | This deinit code was never checked, so this line would always crash on implementations without support for sync objects. Fixes #6197.
* cocoa-cb: fix side by side Split ViewAkemi2018-10-021-1/+1
| | | | | | | | | | when entering a Split View a windowDidEnterFullScreen event happens without a previous toggleFullScreen call. in that case it tries to stop an animation that was never initiated by us and basically breaks the system initiated fullscreen, or in this case the Split View. immediately after entering the fullscreen it tries top stop the animation and resizes the window, which causes the window to exit fullscreen. only try to stop an animation that was initiated by us and is safe to stop.
* {mac,cocoa}: trim trailing null out of macosx_icon when loading itRodger Combs2018-10-021-1/+3
| | | | | | This prevents crashes when loading the application icon image. Suggested-by: Akemi <der.richter@gmx.de>
* cocoa-cb: add Apple Software Renderer supportAkemi2018-09-302-3/+29
| | | | | | by default the pixel format creation falls back to software renderer when everything fails. this is mostly needed for VMs. additionally one can directly request an sw renderer or exclude it entirely.
* cocoa-cb: move macOS option retrieval to the earliest point possibleAkemi2018-09-301-6/+0
| | | | | | | moved the retrieval of the macOS specific options from the backend initialisation to the initialisation of the CocoaCB class, the earliest point possible. this way macOS specific options can be used for the opengl context creation for example.
* drm_atomic: Allow to create atomic context w/o drmprime video planeAnton Kindestam2018-09-302-6/+11
| | | | | | | | | | | | | | | This is to improve the experience when running with default settings on a driver that doesn't have any overlay planes (or indeed only one plane), but still supports DRM atomic. Since the drmprime video plane is set to pick an overlay plane by default it would fail on these drivers due to not being able to create any atomic context. Users with such cards had to specify --drm-video-plane-id manually to some bogus value (it's not used after all). The "video" plane is only ever used by the drmprime-drm hwdec interop, which is not used at all in the typical usecase where everything is actually rendered on to the "OSD" plane using EGL, so having an atomic context without the "video" plane should be fine most of the time.
* vo_gpu: fix vec3 packing in UBOs/push_constantsNiklas Haas2018-09-291-11/+13
| | | | | | | | | | | For vec3, the alignment and size differ. The current code will pack a struct like { vec3; float; vec2 } into 8 machine words, whereas the spec would only use 6. This actually fixes a real bug: The only place in the code I could find where it was conceivably possible that a vec3 is followed by a float was when using --gpu-dumb-mode in combination with --gamma-factor, and only when --gpu-api=vulkan. So it's no surprised nobody ran into it yet.
* vo_gpu: use explicit offsets for push constantsNiklas Haas2018-09-291-2/+1
| | | | | | | | | These used to be unsupported long ago, but it seems glslang added support in the meantime. (I don't know which version, but I'm guessing it was long enough ago that we don't have to add a feature check) Should hopefully help make push constant layouts more robust against possible bugs either in our code or in the driver.
* vo_gpu: adjust PRNG variant used by GL shaderssfan52018-09-261-1/+5
| | | | | | | | | | | Certain low-end Mali GPUs have a rather low precision and overflow during the PRNG calculations, thereby breaking e.g. deband-grain. Modify the permute() to avoid this, this does not impact the quality of PRNG output (noticeably). This problem was observed on: GL_VENDOR='ARM', GL_RENDERER='Mali-T720' GL_VERSION='OpenGL ES 3.1 v1.r15p0-00rel0.bdd9e62cdc8c88e0610a16b5901161e9'
* mp_image: strip all HDR peak information from SDR clipsNiklas Haas2018-09-051-0/+6
| | | | | | | By overriding it with 1.0 (aka SDR). This prevents blowing up on mistagged clips. Fixes #6111
* vo_gpu: switch to optimization level performanceNiklas Haas2018-09-011-1/+1
| | | | | | Upstream has this now. Didn't really make any different for me (except making the polar compute shader 2%-3% faster), but maybe it does for somebody else.
* vo_gpu: avoid overwriting compute shader block sizesNiklas Haas2018-08-261-4/+10
| | | | | | | | | | When using multiple compute shaders as part of the same pass, there can be a conflict in the block sizes. In the problematic case, the HDR detection shader can collide with the polar sampling shader. In this case, the solution is clear - the passes that can handle any size should "give in" and not overwrite the block sizes. Fixes #6083.
* wscript: split egl-android from androidTom Yan2018-08-201-1/+1
|
* cocoa-cb: fix crash on macOS 10.10Akemi2018-08-111-1/+3
| | | | | | the colorspace of the layer is only available on 10.11 and upwards. Fixes #6041
* cocoa-cb: fix crash when no screen is availableAkemi2018-08-111-1/+1
| | | | | | | | instead of force unwrapping and chaining the optional vars in our containsMouseLocation function, safely unwrap and guard the resulting var. Fixes #6062
* hwdec_vaegl: Fix VAAPI EGL interop used with gpu-context=drmAnton Kindestam2018-07-092-5/+22
| | | | | | | | Add another parameter to mpv_opengl_drm_params to hold the FD to the render node, so that the fd can be passed to hwdec_vaegl. The render node is opened in context_drm_egl and inferred from the primary device fd using drmGetRenderDeviceNameFromFd.
* context_drm_egl: Fix CRTC setup and release code when using atomicAnton Kindestam2018-07-096-41/+217
| | | | | | | | | | The previous code did not save enough information about the old state, and could end up changing what plane the fbcon:s FB got attached to, or in worse case causing a blank screen (observed in some multi-screen setups on Sandy Bridge). In addition refactor the handling of drmModeModeInfo property blobs to not leak, as well as enable reuse of already created blobs.
* context_drm_egl: Fix some memory leaks on error exitAnton Kindestam2018-07-091-63/+66
| | | | | Fix some memory leaks on error exit in crtc_setup_atomic and crtc_release_atomic.
* gpu: prefer 16bit floating point FBO formats to 16bit integer onesJan Ekström2018-07-081-1/+1
| | | | | | According to earlier discussions, this can improve visual quality. This only changes the preferred order of the formats, not the formats themselves.
* cocoa-cb: fix building with Swift 4.2coverity_scanAkemi2018-06-122-7/+7
| | | | | | | | | init is a reserved keyword and Swift 4.2 got a bit stricter about using it. this could be fixed by adding apostrophes around init but makes the code uglier. hence i just renamed init to initialized and for consistency uninit to uninitialized. Fixes #5899
* cocoa-cb: remove pre-allocation of window, view and layerAkemi2018-06-124-99/+114
| | | | | | | | | | | the pre-allocation was needed because the layer allocated a opengl context async itself and we couldn't influence that. so we had to start the core after the context was actually allocated. furthermore a window, view and layer hierarchy had to be created so the layer would create a context. now, instead of relying on the layer to create a context we do this manually and re-use that context later when the layer wants to create one async itself.
* vo_libmpv: pass vo struct to the control callbackAkemi2018-06-123-11/+13
|
* hwdec_drmprime_drm: Do not show error message during probingAnton Kindestam2018-06-081-1/+1
| | | | | Change the log-level of an error message that would sometimes show up during hwdec probing, and could be misleading.
* vo_sdl: add support for screensaver VOCTRL'ssfan52018-06-021-3/+24
| | | | | Previously vo_sdl would unconditonally disable the screensaver, ignoring the `stop-screensaver` option.
* vo_gpu: desaturate after peak detectionNiklas Haas2018-05-311-12/+12
| | | | | | | | | | This sacrifices some dynamic range for well-behaved sources, but prevents catastrophic desaturation on badly mastered / too bright sources. I think that's the better trade-off. This makes the desaturation algorithm much "safer" to deploy by default, as well. One could even argue going up to strength 1.0, which works better for some sources but worse for others. But I think the current strength is the best trade-off even after this change.
* input: add a define for the number of mouse buttons and use itwm42018-05-251-0/+4
| | | | (Why the fuck are there up to 20 mouse buttons?)
* vd_lavc: minor simplification for get_format fallbackwm42018-05-251-7/+1
| | | | | | | | | | | The default get_format does exactly do this, so we don't need to duplicate it. The only potential problem with this is that the logic doesn't entirely prevent that the avcodec_default_get_format hw_device_ctx path is triggered, which would probably work, but has unknown consequences and interactions. But the way the logic currently works it can't happen, provided the hwaccel metadata libavcodec provides is correct.
* x11: support Shift+TABNiklas Haas2018-05-241-1/+1
| | | | | | | | | | | | | | For some reason, the X default modifier map binds shift+tab to ISO_Left_Tab instead of the regular Tab. So to get Shift+TAB recognized by mpv, we also need to accept ISO_Left_Tab. This patch matches what other programs like e.g. Qt do, which treat Tab and ISO_Left_Tab as the same thing. God only knows why the distinction exists, and why X decides to mix up its bindings like that. Fixes #5849
* wayland_common: require wl_compositor of version 3Rostislav Pehlivanov2018-05-201-3/+2
| | | | | We already did require it, in order to call set_buffer_scale. This just makes it error out more gracefully.
* wayland_common: fix maximized stateRostislav Pehlivanov2018-05-202-13/+22
| | | | Window size should not change if the window has been maximized or tiled.
* vo_gpu: allow higher icc-contrast and improve loggingNiklas Haas2018-05-171-2/+3
| | | | | | | | | | With the advent of actual HDR devices, my real measured ICC profile has an "infinite" contrast, since the display is completely off on pure black inputs. 100k:1 might not be enough, so let's just bump it up to 1m:1 to be safe. Also, improve the logging in the case that the detected contrast is too high by default.
* drm_atomic: Fix memory leaks in drm_atomic_createAnton Kindestam2018-05-081-34/+33
| | | | | | | | | First fix a memory leak when skipping cursor planes by inverting the check and putting everything, but the free, in the body. Then fix a missed drmModeFreePlane by simply copying the fields of the drmModePlane we are interested in and freeing the drmModePlane struct early.
* build: make encoding mode non-optionalwm42018-05-031-2/+0
| | | | Makes it easier to not break the build by confusing the ifdeffery.
* encode: get rid of the output packet queuewm42018-05-033-3/+22
| | | | | | | | | | | | Until recently, ao_lavc and vo_lavc started encoding whenever the core happened to send them data. Since audio and video are not initialized at the same time, and the muxer was not necessarily opened when the first encoder started to produce data, the resulting packets were put into a queue. As soon as the muxer was opened, the queue was flushed. Change this to make the core wait with sending data until all encoders are initialized. This has the advantage that we don't need to queue up the packets.
* vo_lavc: explicitly skip redraw and repeated frameswm42018-05-032-8/+11
| | | | | | | | | | | The user won't want to have those in the video (I think). The core can sporadically issue redraws, which is what you want for actual playback, but not in encode mode. vo_lavc can explicitly detect those and skip them. It only requires switching to a more advanced internal VO API. The comments in vo.h are because vo_lavc draws to one of the images in order to render OSD. This is OK, but might come as a surprise to whoever calls draw_frame, so document it. (Current callers are OK with it.)
* encode: remove old timestamp handlingwm42018-05-031-193/+35
| | | | | This effectively makes --ocopyts the default. The --ocopyts option itself is also removed, because it's redundant.
* drm_atomic: Disallow selecting cursor planes using the optionsAnton Kindestam2018-05-011-0/+3
|
* drm_common: Be smarter when deciding on which CRTC and Encoder to useAnton Kindestam2018-05-011-1/+27
| | | | | | | | | | | | | | | | | Inspired by kmscube, first try to pick the Encoder and CRTC already associated with the selected Connector, if any. Otherwise try to find the first matching encoder & CRTC like before. The previous behavior had problems when using atomic modesetting (crtc_setup_atomic) when we picked an Encoder & CRTC that was currently being used by the fbcon together with another Encoder. drmModeSetCrtc was able to "steal" the CRTC in this case, but using atomic modesetting we do not seem to get this behavior automatically. This should also improve behavior somewhat when run on a multi screen setup with regards to deinit and VT switching (still sometimes you end up with a blank screen where you previously had a cloned display of your fbcon)
* context_drm_egl: fix some comments and log messages that had not been ↵Anton Kindestam2018-05-011-6/+5
| | | | updated since the plane rename commit
* drm/atomic: Fix crtc_setup_atomic and crtc_release_atomicAnton Kindestam2018-05-011-25/+33
| | | | | | | | | | | Add some properties which where forgotten in crtc_setup_atomic. In both change to not use DRM_MODE_PAGE_FLIP_EVENT | DRM_MODE_ATOMIC_NONBLOCK flags. This should make it more similar to the drmSetCrtc which it aims to replace (take effect directly, and blocking call). This also saves us the trouble of having to set up a poll to wait for pageflip, which would've been neccesary with DRM_MODE_PAGE_FLIP_EVENT, in both crtc_setup_atomic and crtc_release_atomic.
* drm/atomic: disable video plane when unused.LongChair2018-05-011-0/+28
| | | | | | | | | | This patch will make sure that the video plane is hidden when unused. When using high resolution modes, typically UHD, and embedding mpv, having the video plane sitting in the back when you don't play any video is eating a lot of memory bandwidth for compositing. That patch makes sure that the video layer is just disabled before and after playback.
* drm/atomic: add atomic modesetting.LongChair2018-05-011-11/+104
| | | | | | | This commit allows to add atomic modesetting when using the atomic renderer. This is actually needed when using and osd with a smaller size than screen resolution. It will also make the drm atomic path more consistent
* drm/atomic: refactor planes namesLongChair2018-05-017-55/+89
| | | | | | | | We are currently using primary / overlay planes drm objects, assuming that primary plane is osd and overlay plane is video. This commit is doing two things : - replace the primary / overlay planes members with osd and video planes member without the assumption - Add two more options to determine which one of the primary / overlay is associated to osd / video. - It will default osd to overlay and video to primary if unspecified
* drm/atomic: add connector to atomic contextLongChair2018-05-015-4/+35
| | | | | | | | | | This patch adds - DRM connector object to atomic context. - fd property to the drm atomic object as well as a method to read blob type properties. This allows to ensure that the proper connector is picked up, especially when specifying it from the commandline, and also allows to make sure we're using the right one when embedding with interop into an application.
* drm/atomic: refactor hwdec_drmprime_drm with native re