path: root/video/out
Commit message | Author | Age | Files | Lines
* osx: always deactivate the early opengl flush on macOS | Akemi | 2018-02-12 | 1 | -2/+6
| Early flushing only caused problems on macOS, including:
| - performance problems and a huge number of dropped frames
| - problems playing back video files with an fps close to the display refresh rate
| - rendering at twice the rate of the video fps
| - the display refresh rate not being detected properly
|
| We always deactivate any early flush on macOS to fix these problems.
* vo_drm: support --monitorpixelaspect | Marco Migliori | 2018-02-11 | 1 | -0/+2
| | | | | | | | | | This commit allows for video to be shown with the right aspect even when pixels are not square in the selected drm mode. For example, if drm mode 5 is "640x400", the right aspect on a 4:3 monitor is obtained by mpv --vo=drm --drm-mode=5 --monitorpixelaspect=5:6 ... Other vo's seem to make this parameter change the size of the window, but in the drm vo this is fixed, being as large as the screen.
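The 5:6 value in the example above follows from dividing the monitor's display aspect by the mode's width-to-height ratio. A minimal sketch of that arithmetic, assuming the pixel aspect is expressed as pixel width to pixel height; `pixel_aspect` and `gcd` are hypothetical helpers for illustration, not mpv functions:

```c
#include <assert.h>

/* Greatest common divisor (Euclid), used to reduce the fraction. */
static int gcd(int a, int b)
{
    while (b) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Hypothetical helper, not part of mpv: compute the pixel aspect ratio
 * (pixel width : pixel height) as num:den for a mode of mode_w x mode_h
 * pixels stretched over a monitor with display aspect mon_w:mon_h. */
static void pixel_aspect(int mon_w, int mon_h, int mode_w, int mode_h,
                         int *num, int *den)
{
    assert(mon_w > 0 && mon_h > 0 && mode_w > 0 && mode_h > 0);
    int n = mon_w * mode_h;   /* (mon_w / mon_h) / (mode_w / mode_h) */
    int d = mon_h * mode_w;
    int g = gcd(n, d);
    *num = n / g;
    *den = d / g;
}
```

For a 640x400 mode on a 4:3 monitor this gives 1600/1920, which reduces to 5:6.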
* vo_drm: reset last input image on reconfig | Marco Migliori | 2018-02-11 | 1 | -0/+3
| | | | | | | | | | | | | | | | | | The last image is stored in vo->priv->last_input to be used when redrawing a frame is necessary (control: VOCTRL_REDRAW_FRAME). At the beginning it is NULL, so a redraw request has no effect since draw_image ignores calls with image=NULL. When using --force-window the size of the image may change without the vo structure being re-created. Before this commit, the size of vo->priv->last_input could become inconsistent with the cropping rectangle vo->priv->src_rc, which could trigger an assert in mp_image_crop_rc(). Even if it did not, the last image of a video remained on the screen when the next file in the playlist had no video (e.g., it was an mp3 without an embedded cover). This commit deallocates and resets to NULL the image vo->priv->last_input when reconfiguring video.
* vo_drm: make the osd as large as the screen | Marco Migliori | 2018-02-11 | 1 | -18/+18
| | | | | | | | | | | | | | | | | | | | Before this commit, the drm vo drew the osd over the scaled image, and then copied the result onto the framebuffer, shifted. This made the frame centered, but forced the osd to be only as large as the image. This was inconsistent with other vo's, covered the image with the progress indicator even when a black band was at the top of the screen, made the progress indicator wrap on narrow videos, etc. The change is to always use an image as large as the screen. The frame is copied scaled and shifted to it, and the osd drawn over it. The result is finally copied to the framebuffer without any shift, since it is already as large as the framebuffer. Technically, cur_frame is an image as large as the screen and cur_frame_cropped is a dummy reference to it, cropped to the size of the scaled video. This way, copying the scaled image to cur_frame_cropped positions the image in the right place in cur_frame, which can then have the osd added to it and copied to the framebuffer.
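The cropped-reference idea described above can be sketched as follows. This is a simplified illustration with a single 8-bit plane; `struct image`, `image_crop` and `image_copy` are stand-ins invented for this sketch, not mpv's `mp_image` API:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for an image: one 8-bit plane. Not mpv's mp_image. */
struct image {
    uint8_t *data;   /* first pixel of this (possibly cropped) view */
    int w, h;
    int stride;      /* bytes between rows of the underlying buffer */
};

/* Return a view of img cropped to the w x h rectangle at (x, y).
 * The view shares memory with img; nothing is copied. */
static struct image image_crop(struct image img, int x, int y, int w, int h)
{
    assert(x >= 0 && y >= 0 && x + w <= img.w && y + h <= img.h);
    struct image view = img;
    view.data = img.data + (size_t)y * img.stride + x;
    view.w = w;
    view.h = h;
    return view;
}

/* Copy src into dst row by row; both must have the same size. */
static void image_copy(struct image dst, struct image src)
{
    assert(dst.w == src.w && dst.h == src.h);
    for (int row = 0; row < src.h; row++)
        memcpy(dst.data + (size_t)row * dst.stride,
               src.data + (size_t)row * src.stride, (size_t)src.w);
}
```

Because the cropped view keeps the stride of the screen-sized buffer, writing into it places the video at the right offset, and the rest of the buffer stays available for the osd.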
* vo_gpu: make screenshots use the GL renderer | wm4 | 2018-02-11 | 10 | -25/+180
| | | | | | | | | | | | | | | | | | | | | | | | | | Using the GL renderer for color conversion will make sure screenshots will use the same conversion as normal video rendering. It can do this for all types of screenshots. The logic when to write 16 bit PNGs changes. To approximate the old behavior, we decide by looking whether the source video format has more than 8 bits per component. We apply this logic even for window screenshots. Also, 16 bit PNGs now always include an unused alpha channel. The reason is that FFmpeg has RGB48 and RGBA64 formats, but no RGB064. RGB48 is 3 bytes and usually not supported by GPUs for rendering, so we have to use RGBA64, which forces an alpha channel. Will break for users who use --target-trc and similar options. I considered creating a new gl_video context, but it could double GPU memory use, so I didn't. This uses FBOs instead of glGetTexImage(), because that increases the chance it could work on GLES (e.g. ANGLE). Untested. No support for the Vulkan and D3D11 backends yet. Fixes #5498. Also fixes #5240, because the code for reading back is not used with the new code path.
* vo_gpu: add internal ability to skip osd/subs for rendering | wm4 | 2018-02-11 | 5 | -18/+40
| | | | Needed for the following commit.
* vo_gpu: use blit() only if target ra_tex supports it | wm4 | 2018-02-11 | 1 | -2/+3
| | | | | Even if RA_CAP_BLIT is set, this might just not be enabled for the target ra_tex.
* vo_gpu: add memory barrier on the HDR peak detection | Niklas Haas | 2018-02-11 | 1 | -0/+1
| | | | | The missing barrier can cause the peak detection state to be inconsistent in rare cases, which might explain the issues when taking screenshots in #5499.
* vo_gpu: correctly infer HDR peak detection support | Niklas Haas | 2018-02-11 | 1 | -1/+4
| | | | | | The re-ordering of commits e3d93fd and 0870859 ended up swallowing the change which made the HDR tone mapping algorithm actually check for RA_CAP_NUM_GROUPS support.
* vo_gpu: refactor HDR peak detection algorithm | Niklas Haas | 2018-02-11 | 3 | -16/+41
| The major changes are as follows:
|
| 1. Use `uint32_t` instead of `unsigned int` for the SSBO size calculation. This doesn't really matter, since a too-big buffer will still work just fine, but since `uint` is a 32-bit integer by definition this is the correct way to do it.
|
| 2. Pre-divide the frame_sum by the num_wg immediately at the end of a frame. This change was made to prevent overflow. At 4K screen size, this code is currently already at high risk of overflow, especially once I started playing with longer averaging sizes. Pre-dividing this out makes it just about fit into 32-bit even for worst-case PQ content. (It's technically also faster and easier this way, so I should have done it to begin with). Rename `frame_sum` to `frame_avg` to clearly signal the change in semantics.
|
| 3. Implement a scene transition detection algorithm. This basically compares the current frame's average brightness against the (averaged) value of the past frames. If it exceeds a threshold, which I experimentally configured, we reset the peak detection SSBO's state immediately - so that it just contains the current frame. This prevents annoying "eye adaptation"-like effects on scene transitions.
|
| 4. As a result of the previous change, we can now use a much larger buffer size by default, which results in a more stable and less flickery result. I experimented with values between 20 and 256 and settled on the new value of 64. (I also switched to a power-of-2 array size, because I like powers of two)
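Points 2 and 3 above can be illustrated with a CPU-side sketch. The real logic runs in a compute shader on an SSBO; everything here (`struct peak_state`, `peak_add_frame`, the `threshold` parameter and the use of an absolute difference) is an assumption made for illustration, not the code from the commit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PEAK_FRAMES 64   /* buffer size named in the commit message */

struct peak_state {
    uint32_t frame_avg[PEAK_FRAMES]; /* ring buffer of per-frame averages */
    uint32_t count;                  /* number of valid entries */
    uint32_t next;                   /* next write position */
};

/* Add one frame. frame_sum is the sum over num_wg work groups; dividing
 * right away keeps the stored values small. threshold is a hypothetical
 * tuning parameter. Returns true if a scene change was detected. */
static bool peak_add_frame(struct peak_state *s, uint32_t frame_sum,
                           uint32_t num_wg, uint32_t threshold)
{
    assert(num_wg > 0);
    uint32_t cur = frame_sum / num_wg;

    bool scene_change = false;
    if (s->count > 0) {
        uint64_t total = 0;
        for (uint32_t i = 0; i < s->count; i++)
            total += s->frame_avg[i];
        uint32_t past = (uint32_t)(total / s->count);
        uint32_t diff = cur > past ? cur - past : past - cur;
        scene_change = diff > threshold;
    }

    if (scene_change) {
        /* Reset so that the state contains only the current frame. */
        s->count = 0;
        s->next = 0;
    }

    s->frame_avg[s->next] = cur;
    s->next = (s->next + 1) % PEAK_FRAMES;
    if (s->count < PEAK_FRAMES)
        s->count++;
    return scene_change;
}
```

Storing the pre-divided average rather than the raw sum is what lets a longer history fit into 32-bit arithmetic, and the reset on a scene change is what makes that longer history safe to use.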
* wayland_common: fix idle_inhibitor protocol segfault | Rostislav Pehlivanov | 2018-02-09 | 1 | -0/+1
| | | | The pointer is used as a state and wasn't zeroed after seeks.
* drmprime interop : Add frames triple buffering | LongChair | 2018-02-07 | 1 | -3/+8
| | | | | | | | | | | | | | | | | Currently using the drmprime interop with external mpv integration can lead to rendering issues because the current frame is being released too early. Typically using this with Qt results in one frame shift because Qt will do waitforvsync and swap, rather than swap and waitforvsync. This leads to tearing as the framebuffer is released while being displayed on screen. In order to avoid releasing the framebuffer that is displayed, we keep the framebuffer alive for one more frame with triple buffering to make sure that whatever rendering process is used, the framebuffer will not be released when it's still on screen. This was tested on RockChip Rock64
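The "keep it alive for one more frame" approach can be sketched as a chain of three held framebuffers. The types and names below (`struct fb`, `struct fb_chain`, `fb_chain_push`) are hypothetical and only model the release order, not the actual interop code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical framebuffer handle; released counts calls to fb_release(). */
struct fb {
    int id;
    int released;
};

static void fb_release(struct fb *fb)
{
    if (fb)
        fb->released++;
}

struct fb_chain {
    struct fb *current;  /* frame handed to the renderer most recently */
    struct fb *last;     /* previous frame, possibly still on screen */
    struct fb *old;      /* frame before that, kept one extra frame */
};

/* Present a new framebuffer. Only the oldest of the three held frames is
 * released, so a frame that might still be scanned out stays alive. */
static void fb_chain_push(struct fb_chain *c, struct fb *next)
{
    assert(c != NULL);
    fb_release(c->old);
    c->old = c->last;
    c->last = c->current;
    c->current = next;
}
```

With this order, a framebuffer is released only when the fourth frame after it arrives, regardless of whether the embedding toolkit swaps before or after waiting for vsync.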
* vo_gpu: port HDR tone mapping algorithm from libplacebo | Niklas Haas | 2018-02-05 | 3 | -70/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current peak detection algorithm was very bugged (which contributed to the excessive cross-frame flicker without long normalization) and also didn't take into account the frame average brightness level. The new algorithm both takes into account frame average brightness (in addition to peak brightness), and also computes the values in a more stable/correct way. (The old path was basically undefined behavior) In addition to improving the algorithm, we also switch to hable tone mapping by default, and try to enable peak computation automatically whenever possible (compute shaders + SSBOs supported). We also make the desaturation milder, after extensive testing during libplacebo development. I also had to compensate a bit for the representational differences between mpv and libplacebo (libplacebo treats 1.0 as the reference peak, but mpv treats it as the nominal peak), but it shouldn't have caused any problems. This is still not quite the same as libplacebo, since libplacebo also allows tagging the desired scene average brightness on the output, and it also supports reading the scene average brightness from static metadata (MaxFALL) where available. But those changes are a bit more involved. It's possible we could also read this from metadata in the future, but we have problems communicating with AVFrames as it is and I don't want to touch the mpv colorimetry structs for the time being.
* vo_gpu: add RA_CAP for gl_NumWorkGroups | Niklas Haas | 2018-02-05 | 3 | -1/+3
| | | | | SPIRV-Cross doesn't support this for the time being. It's possible this could go away again at a later date.
* vo_gpu: vulkan: correctly enable textureGatherOffset | Niklas Haas | 2018-02-05 | 2 | -2/+3
| | | | This also requires a vulkan feature / SPIR-V capability to function
* vo_gpu: vulkan: don't issue queries for unused timers | Niklas Haas | 2018-02-05 | 1 | -5/+13
| | | | | | | The vulkan validation layers warn you if you try requesting a query result from a timer that hasn't even been started yet, so we have to do some extra bit of work to keep track of which indices we've seen so far, and avoid the queries on them.
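One way to keep track of which timer indices have been seen is a bitmask that is consulted before any result is requested. This is a minimal sketch under that assumption; `struct timer_pool` and its functions are invented for illustration and do not call the Vulkan API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TIMER_SLOTS 32

struct timer_pool {
    uint32_t started;               /* bit i set: slot i was started */
    uint64_t result[TIMER_SLOTS];   /* stand-in for query results */
};

static void timer_start(struct timer_pool *p, int idx)
{
    assert(idx >= 0 && idx < TIMER_SLOTS);
    p->started |= UINT32_C(1) << idx;
}

/* Fetch a result only for slots that were started; otherwise report that
 * nothing is available instead of issuing the query. */
static bool timer_get_result(const struct timer_pool *p, int idx,
                             uint64_t *out)
{
    assert(idx >= 0 && idx < TIMER_SLOTS);
    if (!(p->started & (UINT32_C(1) << idx)))
        return false;
    *out = p->result[idx];
    return true;
}
```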
* vo_gpu: vulkan: try enabling required features | Niklas Haas | 2018-02-05 | 2 | -0/+10
| | | | | | | Instead of enabling every feature under the sun, make an effort to just whitelist the ones we actually might use. Turns out the extended storage format support is needed for some of the storage formats we use, in particular rgba16.
* vo_gpu: vulkan: add missing buffer barrier fields | Niklas Haas | 2018-02-05 | 1 | -0/+2
| | | | These were accidentally omitted.
* video: rewrite filtering glue code | wm4 | 2018-01-30 | 3 | -3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of the old vf.c code. Replace it with a generic filtering framework, which can potentially handle more than just --vf. At least reimplementing --af with this code is planned. This changes some --vf semantics (including runtime behavior and the "vf" command). The most important ones are listed in interface-changes. vf_convert.c is renamed to f_swscale.c. It is now an internal filter that can not be inserted by the user manually. f_lavfi.c is a refactor of player/lavfi.c. The latter will be removed once --lavfi-complex is reimplemented on top of f_lavfi.c. (which is conceptually easy, but a big mess due to the data flow changes). The existing filters are all changed heavily. The data flow of the new filter framework is different. Especially EOF handling changes - EOF is now a "frame" rather than a state, and must be passed through exactly once. Another major thing is that all filters must support dynamic format changes. The filter reconfig() function goes away. (This sounds complex, but since all filters need to handle EOF draining anyway, they can use the same code, and it removes the mess with reconfig() having to predict the output format, which completely breaks with libavfilter anyway.) In addition, there is no automatic format negotiation or conversion. libavfilter's primitive and insufficient API simply doesn't allow us to do this in a reasonable way. Instead, filters can use f_autoconvert as sub-filter, and tell it which formats they support. This filter will in turn add actual conversion filters, such as f_swscale, to perform necessary format changes. vf_vapoursynth.c uses the same basic principle of operation as before, but with worryingly different details in data flow. Still appears to work. The hardware deint filters (vf_vavpp.c, vf_d3d11vpp.c, vf_vdpaupp.c) are heavily changed. 
Fortunately, they all used refqueue.c, which is for sharing the data flow logic (especially for managing future/past surfaces and such). It turns out it can be used to factor out most of the data flow. Some of these filters accepted software input. Instead of having ad-hoc upload code in each filter, surface upload is now delegated to f_autoconvert, which can use f_hwupload to perform this. Exporting VO capabilities is still a big mess (mp_stream_info stuff). The D3D11 code drops the redundant image formats, and all code uses the hw_subfmt (sw_format in FFmpeg) instead. Although that too seems to be a big mess for now. f_async_queue is unused.
* vo_gpu: check for RA_CAP_FRAGCOORD in dumb mode too | James Ross-Gowan | 2018-01-30 | 1 | -13/+14
| | | | | | | The RA_CAP_FRAGCOORD checks apply to dumb mode as well, but they were after the check for dumb mode, which returns early, so they never ran. Fixes #5436
* video: fix crash with vdpau when reinitializing rendering | wm4 | 2018-01-27 | 1 | -3/+3
| | | | | | | | | | Using vdpau will allocate additional textures for the reinterleaving step, which uninit_rendering() will free. This is a problem because the hwdec image remains mapped when reinitializing, so the reinterleaving textures are turned into dangling pointers. Fix this by freeing the reinterleave textures on full uninit instead. Fixes #5447.
* hwdec: detach d3d and d3d9 hwaccel from angle | myfreeer | 2018-01-25 | 1 | -1/+3
| | | | Fix https://github.com/mpv-player/mpv/issues/5420
* video: change some remaining vo_opengl mentions to vo_gpu | Akemi | 2018-01-20 | 5 | -6/+6
|
* osx: code cleanups and cosmetic fixes | Akemi | 2018-01-20 | 2 | -2/+2
|
* ta: introduce talloc_dup() and use it in some places | wm4 | 2018-01-18 | 1 | -2/+2
| | | | | | | It was actually already implemented as ta_dup_ptrtype(), but that seems like a clunky name. Also we still use the talloc_ names throughout the source, and I'd rather use an old name instead of mixing inconsistent naming conventions.
* sws_utils: don't force callers to provide option struct | wm4 | 2018-01-18 | 3 | -3/+3
| | | | | | | mp_sws_set_from_cmdline() exists only to respect the --sws- command line options. Instead of forcing callers to get the option struct containing these, let callers pass mpv_global, and get it from the option core code directly. This avoids minor annoyances later on.
* vo: log reconfig calls | wm4 | 2018-01-18 | 1 | -0/+2
| | | | Helpful for debugging, sometimes.
* video: avoid some unnecessary vf.h includes | wm4 | 2018-01-18 | 1 | -1/+0
|
* vo_gpu: skip DR for unsupported image formats | wm4 | 2018-01-18 | 1 | -0/+3
| | | | | | | | | | | | | | DR (direct rendering) works by having the decoder decode into the GPU staging buffers, instead of copying the video data on texture upload. We did this even for formats unsupported by the GPU or the renderer. This "worked" because the staging memory is untyped, and the video frame was converted by libswscale to a supported format, and then uploaded with a copy using the normal non-DR texture upload path. Even though it "works", we don't gain anything from using the staging buffers for decoding, since we can't use them for upload anyway. Also, staging memory might be potentially limited (what really happens is up to the driver). It's easy to avoid, so just skip it in these cases.
* vo_gpu: fix broken 10 bit via integer textures playback | wm4 | 2018-01-17 | 1 | -3/+3
| | | | | | | | | | | The check_gl_features(p) call here checks whether dumb mode can be used. It uses the field use_integer_conversion, which is set _after_ the call in the same function. Move check_gl_features() to the end of the function, when use_integer_conversion is finally set. Fixes that it tried to use bilinear filtering with integer textures. The bug disabled the code that is supposed to convert it to non-integer textures.
* vo_gpu: rpi: defer gl_ctx_resize until after gl_ctx_init | Niklas Haas | 2018-01-15 | 1 | -1/+3
| | | | | | | | This segfaults otherwise. The conditional is needed to break a circular dependency (gl_init depends on mpgl_load_functions which depends on recreate_dispmanx which calls gl_ctx_resize). Fixes #5398
* video: change some mp_image_pool semantics | wm4 | 2018-01-13 | 1 | -1/+1
| | | | | | | | | | Remove the max_count creation parameter, because it's pointless and rarely ever did anything. Add a talloc parent parameter instead (which is something completely different, but convenient, and all callers need to be changed anyway). Instead of clearing the pool when the now removed maximum is reached, clear it on image parameter changes instead.
* vo_gpu: hwdec_dxva2dxgi: initial implementation | James Ross-Gowan | 2018-01-06 | 3 | -0/+470
| | | | | | | | | | | | | This enables DXVA2 hardware decoding with ra_d3d11. It should be useful for Windows 7, where D3D11VA is not available. Images are transferred from D3D9 to D3D11 using D3D9Ex surface sharing[1]. Following Microsoft's recommendations, it uses a queue of shared surfaces, similar to Microsoft's ISurfaceQueue. This will hopefully prevent surface sharing from impacting parallelism and allow multiple D3D11 frames to be in-flight at once. [1]: https://msdn.microsoft.com/en-us/library/windows/desktop/ee913554.aspx
* vo_gpu: d3d11: check for NULL backbuffer in start_frame | James Ross-Gowan | 2018-01-04 | 1 | -2/+6
| | | | | | | | | | | | | In a lost device scenario, resize() will fail and p->backbuffer will be NULL. We can't recover from lost devices yet, but we should still check for a NULL backbuffer in start_frame() rather than crashing. Also remove a NULL check for p->swapchain. This was a red herring, since p->swapchain never becomes NULL in an error condition, but p->backbuffer actually does. This should fix the crash in #5320, but it doesn't fix the underlying reason for the lost device (which is probably a driver bug.)
* vo_gpu: d3d11: don't use a bgra8 swapchain | James Ross-Gowan | 2018-01-04 | 1 | -19/+8
| | | | | | | | | | Previously, mpv would attempt to use a BGRA swapchain in the hope that it would give better performance, since the Windows desktop is also composited in BGRA. In practice, it seems like there is no noticeable performance difference between RGBA and BGRA swapchains and BGRA swapchains cause trouble with a42b8b1142fd, which attempts to use the swapchain format for intermediate FBOs, even though D3D11 does not guarantee BGRA surfaces will work with UAV typed stores.
* vo_gpu/context_android: replace both options with android-surface-size | sfan5 | 2018-01-02 | 1 | -4/+3
| | | | This allows us to automatically trigger a VOCTRL_RESIZE (also contained).
* vo_gpu/android: fallback to EGL_WIDTH/HEIGHT | Aman Gupta | 2018-01-01 | 1 | -3/+15
| | | | | | | | | | Uses the EGL width/height by default when the user fails to set the android-surface-width/android-surface-height options. This means the vo-resize command is optional, and does not need to be implemented on android devices which do not support rotation. Signed-off-by: Aman Gupta <aman@tmm1.net>
* vo_gpu: d3d11: avoid copying staging buffers to cbuffers | James Ross-Gowan | 2018-01-01 | 1 | -48/+15
| | | | | | | | | | | | | | | | Apparently some Intel drivers have a bug where copying from staging buffers to constant buffers does not work. We used to keep a copy of the buffer data in a staging buffer to enable partial constant buffer updates. To work around this bug, keep the copy in talloc-allocated system memory instead. There doesn't seem to be any noticeable performance difference from keeping the copy in system memory. Our cbuffers are probably too small for it to matter anyway. See also: https://crbug.com/593024 Fixes #5293
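The workaround amounts to keeping a shadow copy of the constant buffer in system memory, patching that copy on partial updates, and then uploading it as a whole. A minimal sketch with the GPU side simulated by a plain array; `struct cbuf` and its functions are hypothetical and no D3D11 calls are shown:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CBUF_SIZE 64

/* Stand-in for a constant buffer with a CPU-side shadow copy. */
struct cbuf {
    unsigned char shadow[CBUF_SIZE]; /* full contents in system memory */
    unsigned char gpu[CBUF_SIZE];    /* simulated GPU-side buffer */
    int uploads;                     /* number of full uploads performed */
};

/* Upload the entire shadow copy, replacing the GPU-side contents. */
static void cbuf_upload_full(struct cbuf *b)
{
    memcpy(b->gpu, b->shadow, CBUF_SIZE);
    b->uploads++;
}

/* Partial update: patch the shadow copy, then upload the whole buffer.
 * No GPU-to-GPU copy from a staging buffer is involved. */
static void cbuf_update(struct cbuf *b, size_t offset,
                        const void *data, size_t len)
{
    assert(offset <= CBUF_SIZE && len <= CBUF_SIZE - offset);
    memcpy(b->shadow + offset, data, len);
    cbuf_upload_full(b);
}
```

Earlier partial updates survive later ones because the shadow copy always holds the complete current contents.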
* player: add internal `vo-resize` command | sfan5 | 2017-12-27 | 2 | -0/+7
| | | | Intended to be used with the properties from previous commit.
* vo_gpu/context: Let embedding application handle surface resizes | sfan5 | 2017-12-27 | 1 | -10/+20
| | | | | The callbacks for this are Java-only and EGL does not reliably return the correct values.
* vo_gpu: EGL: provide SwapInterval to generic code | wm4 | 2017-12-27 | 1 | -0/+10
| | | | | | | This means that we now explicitly set an interval of 1. Although that should be the EGL default, some drivers could possibly ignore this (unconfirmed). In any case, this commit also allows disabling vsync, for users who want it.
* vo_gpu: vulkan: fix segfault due to index mismatch | Niklas Haas | 2017-12-25 | 1 | -5/+8
| | | | | | | | The queue family index and the queue info index are not necessarily the same, so we're forced to do a check based on the queue family index itself. Fixes #5049
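The distinction between the two indices can be sketched like this: the position of an entry in the queue info array says nothing about which family it describes, so the entry has to be found by comparing the stored family index. `struct queue_info` and `find_queue_info` are simplified stand-ins, not the Vulkan structures:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a per-family queue creation entry. */
struct queue_info {
    uint32_t family_index;  /* queue family this entry refers to */
    uint32_t queue_count;
};

/* Return the position in infos of the entry for the given family,
 * or -1 if that family has no entry. */
static int find_queue_info(const struct queue_info *infos, int n,
                           uint32_t family)
{
    assert(infos || n == 0);
    for (int i = 0; i < n; i++) {
        if (infos[i].family_index == family)
            return i;
    }
    return -1;
}
```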
* vo_gpu: vulkan: fix some image barrier oddities | Niklas Haas | 2017-12-25 | 1 | -10/+5
| | | | | | | | | | | A vulkan validation layer update pointed out that this was wrong; we still need to use the access type corresponding to the stage mask, even if it means our code won't be able to skip the pipeline barrier (which would be wrong anyway). In addition to this, we're also not allowed to specify any source access mask when transitioning from top_of_pipe, which doesn't make any sense anyway.
* vo_gpu: vulkan: omit needless #define | Niklas Haas | 2017-12-25 | 1 | -5/+0
|
* vo_gpu: vulkan: fix sharing mode on malloc'd buffers | Niklas Haas | 2017-12-25 | 1 | -1/+0
| | | | Might explain some of the issues in multi-queue scenarios?
* vo_gpu: vulkan: fix dummyPass creation | Niklas Haas | 2017-12-25 | 1 | -1/+1
| | | | This violates vulkan spec
* vo_gpu: vulkan: fix the rgb565a1 names -> rgb5a1 | Niklas Haas | 2017-12-25 | 1 | -2/+2
| | | | This is 5 bits per channel, not 565
* vo_gpu: vulkan: allow disabling async tf/comp | Niklas Haas | 2017-12-25 | 3 | -4/+21
| | | | | | | Async compute in particular seems to cause problems on some drivers, and even when supported the benefits are not that massive from the tests I have seen, so it's probably safe to keep off by default. Async transfer on the other hand seems to work better and offers a more substantial improvement, so it's kept on.
* vo_gpu: vulkan: refine queue family selection algorithm | Niklas Haas | 2017-12-25 | 1 | -2/+7
| | | | | | This gets confused by e.g. SPARSE_BIT on the TRANSFER_BIT, leading to situations where "more specialized" is ambiguous and the logic breaks down. So to fix it, only compare the subset we care about.
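A sketch of "only compare the subset we care about": when ranking families by how specialized they are, unrelated capability bits are masked off first. The selection rule here (fewest relevant bits wins) and all names are assumptions for illustration; the flag values mirror the graphics, compute, transfer and sparse binding queue bits but are defined locally:

```c
#include <assert.h>
#include <stdint.h>

/* Locally defined capability bits for the sketch. */
enum {
    Q_GRAPHICS = 1,
    Q_COMPUTE  = 2,
    Q_TRANSFER = 4,
    Q_SPARSE   = 8,
};

/* Count the set bits in v. */
static int popcount32(uint32_t v)
{
    int n = 0;
    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/* Pick the most specialized family that supports all wanted bits.
 * Specialization is judged only on the bits in mask, so extra bits
 * such as Q_SPARSE do not make a family look more general.
 * Returns the family index, or -1 if none qualifies. */
static int find_family(const uint32_t *flags, int n, uint32_t wanted)
{
    const uint32_t mask = Q_GRAPHICS | Q_COMPUTE | Q_TRANSFER;
    int best = -1, best_bits = 0;
    assert(flags || n == 0);
    for (int i = 0; i < n; i++) {
        if ((flags[i] & wanted) != wanted)
            continue;
        int bits = popcount32(flags[i] & mask);
        if (best < 0 || bits < best_bits) {
            best = i;
            best_bits = bits;
        }
    }
    return best;
}
```

Without the mask, a transfer-only family that also advertises sparse binding would count as many bits as a compute+transfer family, which is the kind of ambiguity the commit message describes.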
* vo_gpu: vulkan: prefer vkCmdCopyImage over vkCmdBlitImage | Niklas Haas | 2017-12-25 | 1 | -8/+31
| | | | | | blit() implies scaling, copy() is the equivalent command to use when the formats are compatible (same pixel size) and the rects have the same dimensions.
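The condition stated above can be written as a small predicate. This is a sketch only: `struct rect` and `can_use_copy` are hypothetical names, and "compatible formats" is reduced to equal pixel size as in the commit message:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct rect {
    int x0, y0, x1, y1;
};

/* A plain copy is possible when no scaling is needed (the rectangles
 * have the same dimensions) and the pixel sizes match. Otherwise the
 * caller would have to fall back to a blit. */
static bool can_use_copy(size_t src_pixel_size, size_t dst_pixel_size,
                         struct rect src, struct rect dst)
{
    assert(src_pixel_size > 0 && dst_pixel_size > 0);
    return src_pixel_size == dst_pixel_size &&
           src.x1 - src.x0 == dst.x1 - dst.x0 &&
           src.y1 - src.y0 == dst.y1 - dst.y0;
}
```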
* vo_gpu: attempt re-using the FBO format for p->output_tex | Niklas Haas | 2017-12-25 | 4 | -1/+13
| | | | | | | | | This allows RAs with support for non-opaque FBO formats to use a more appropriate FBO format for the output tex, possibly enabling a more efficient blit operation. This requires distinguishing between real formats (which can be used to create textures) and fake formats (e.g. ra_gl's FBO hack).
* vo_gpu: vulkan: properly depend on the swapchain acquire semaphore | Niklas Haas | 2017-12-25 | 3 | -15/+25
| | | | | This is now associated with the ra_tex directly and used in the correct way, rather than hackily done from submit_frame.
* vo_gpu: vulkan: use correct access flag for present | Niklas Haas | 2017-12-25 | 1 | -2/+3
| | | | This needs VK_ACCESS_MEMORY_READ_BIT (spec)