summaryrefslogtreecommitdiffstats
path: root/video
Commit message (Collapse)AuthorAgeFilesLines
* video: rewrite filtering glue codewm42018-01-3021-2616/+1078
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of the old vf.c code. Replace it with a generic filtering framework, which can potentially handle more than just --vf. At least reimplementing --af with this code is planned. This changes some --vf semantics (including runtime behavior and the "vf" command). The most important ones are listed in interface-changes. vf_convert.c is renamed to f_swscale.c. It is now an internal filter that can not be inserted by the user manually. f_lavfi.c is a refactor of player/lavfi.c. The latter will be removed once --lavfi-complex is reimplemented on top of f_lavfi.c. (which is conceptually easy, but a big mess due to the data flow changes). The existing filters are all changed heavily. The data flow of the new filter framework is different. Especially EOF handling changes - EOF is now a "frame" rather than a state, and must be passed through exactly once. Another major thing is that all filters must support dynamic format changes. The filter reconfig() function goes away. (This sounds complex, but since all filters need to handle EOF draining anyway, they can use the same code, and it removes the mess with reconfig() having to predict the output format, which completely breaks with libavfilter anyway.) In addition, there is no automatic format negotiation or conversion. libavfilter's primitive and insufficient API simply doesn't allow us to do this in a reasonable way. Instead, filters can use f_autoconvert as sub-filter, and tell it which formats they support. This filter will in turn add actual conversion filters, such as f_swscale, to perform necessary format changes. vf_vapoursynth.c uses the same basic principle of operation as before, but with worryingly different details in data flow. Still appears to work. The hardware deint filters (vf_vavpp.c, vf_d3d11vpp.c, vf_vdpaupp.c) are heavily changed. Fortunately, they all used refqueue.c, which is for sharing the data flow logic (especially for managing future/past surfaces and such). It turns out it can be used to factor out most of the data flow. Some of these filters accepted software input. Instead of having ad-hoc upload code in each filter, surface upload is now delegated to f_autoconvert, which can use f_hwupload to perform this. Exporting VO capabilities is still a big mess (mp_stream_info stuff). The D3D11 code drops the redundant image formats, and all code uses the hw_subfmt (sw_format in FFmpeg) instead. Although that too seems to be a big mess for now. f_async_queue is unused.
* vo_gpu: check for RA_CAP_FRAGCOORD in dumb mode tooJames Ross-Gowan2018-01-301-13/+14
| | | | | | | The RA_CAP_FRAGCOORD checks apply to dumb mode as well, but they were after the check for dumb mode, which returns early, so they never ran. Fixes #5436
* video: fix crash with vdpau when reinitializing renderingwm42018-01-271-3/+3
| | | | | | | | | | Using vdpau will allocate additional textures for the reinterleaving step, which uninit_rendering() will free. This is a problem because the hwdec image remains mapped when reinitializing, so the reinterleaving textures are turned into dangling pointers. Fix this by freeing the reinterleave textures on full uninit instead. Fixes #5447.
* hwdec: detach d3d and d3d9 hwaccel from anglemyfreeer2018-01-251-1/+3
| | | | Fix https://github.com/mpv-player/mpv/issues/5420
* video: minor simplificationwm42018-01-251-1/+1
| | | | | The check is redundant - if removed, it will write the same value, so it's a NOP.
* video: warn user against FFmpeg's lieswm42018-01-221-9/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I found that at least for mjpeg streams, FFmpeg will set packet pts/dts anyway. The mjpeg raw video demuxer (along with some other raw formats) has a "framerate" demuxer option which defaults to 25, so all mjpeg streams will be played at 25 FPS by default. mpv doesn't like this much. If AVFMT_NOTIMESTAMPS is set, it prints a warning, that might print a bogus FPS value for the assumed framerate. The code was originally written with the assumption that FFmpeg would not set pts/dts for such formats, but since it does, the printed estimated framerate will never be used. --fps will also not be used by default in this situation. To make this hopefully less confusing, explicitly state the situation when the AVFMT_NOTIMESTAMPS flag is set, and give instructions how to work it around. Also, remove the warning in dec_video.c. We don't know what FPS it's going to assume anyway. If there are really no timestamps in the stream, it will trigger our normal missing pts workaround. Add the assumed FPS there. In theory, we could just clear packet timestamps if AVFMT_NOTIMESTAMPS is set, and make up our own timestamps. That is non-trivial for advanced video codecs like h264, so I'm not going there. For seeking and buffering estimation the situation thus remains half-broken. This is a mitigation for #5419.
* video: change some remaining vo_opengl mentions to vo_gpuAkemi2018-01-207-8/+8
|
* osx: code cleanups and cosmetic fixesAkemi2018-01-202-2/+2
|
* ta: introduce talloc_dup() and use it in some placeswm42018-01-181-2/+2
| | | | | | | It was actually already implemented as ta_dup_ptrtype(), but that seems like a clunky name. Also we still use the talloc_ names throughout the source, and I'd rather use an old name instead of a mixing inconsistent naming conventions.
* sws_utils: don't force callers to provide option structwm42018-01-186-7/+12
| | | | | | | mp_sws_set_from_cmdline() has the only purpose to respect the --sws- command line options. Instead of forcing callers to get the option struct containing these, let callers pass mpv_global, and get it from the option core code directly. This avoids minor annoyances later on.
* vo: log reconfig callswm42018-01-181-0/+2
| | | | Helpful for debugging, sometimes.
* mp_image_pool: add helper functions for FFmpeg hw frames poolswm42018-01-182-0/+81
| | | | | | | FFmpeg has its own rather "special" image pools (AVHWFramesContext) specifically for hardware decoding. So it's not really practical to use our own pool implementation. Add these helpers, which make it easier to use FFmpeg's code in mpv.
* mp_image: fix some metadata loss with conversion from/to AVFramewm42018-01-181-2/+14
| | | | | | | | | | | | | This fixes that AVFrames passing through libavfilter (such as with --lavfi-complex) implicitly stripped some fields. I'm not actually sure what to do with the mp_image_params.color.light field here (what happens if the colorspace changed?) - there is no equivalent in AVFrame or FFmpeg at all. It did not affect the old --vf code, because it doesn't allow libavfilter to change the metadata. Also log the .light field in verbose mode.
* video: make IMGFMT_IS_HWACCEL() return 0 or 1wm42018-01-181-1/+1
| | | | Sometimes helps avoiding usage mistakes.
* video: add utility function to pick conversion image format from a listwm42018-01-182-0/+10
|
* video: avoid some unnecessary vf.h includeswm42018-01-184-5/+0
|
* vo_gpu: skip DR for unsupported image formatswm42018-01-181-0/+3
| | | | | | | | | | | | | | DR (direct rendering) works by having the decoder decode into the GPU staging buffers, instead of copying the video data on texture upload. We did this even for formats unsupported by the GPU or the renderer. This "worked" because the staging memory is untyped, and the video frame was converted by libswscale to a supported format, and then uploaded with a copy using the normal non-DR texture upload path. Even though it "works", we don't gain anything from using the staging buffers for decoding, since we can't use them for upload anyway. Also, staging memory might be potentially limited (what really happens is up to the driver). It's easy to avoid, so just skip it in these cases.
* vo_gpu: fix broken 10 bit via integer textures playbackwm42018-01-171-3/+3
| | | | | | | | | | | The check_gl_features(p) call here checks whether dumb mode can be used. It uses the field use_integer_conversion, which is set _after_ the call in the same function. Move check_gl_features() to the end of the function, when use_integer_conversion is finally set. Fixes that it tried to use bilinear filtering with integer textures. The bug disabled the code that is supposed to convert it to non-integer textures.
* vo_gpu: rpi: defer gl_ctx_resize until after gl_ctx_initNiklas Haas2018-01-151-1/+3
| | | | | | | | This segfaults otherwise. The conditional is needed to break a circular dependency (gl_init depends on mpgl_load_functions which depends on recreate_dispmanx which calls gl_ctx_resize). Fixes #5398
* video: change some mp_image_pool semanticswm42018-01-136-14/+16
| | | | | | | | | | Remove the max_count creation parameter, because it's pointless and rarely ever did anything. Add a talloc parent parameter instead (which is something completely different, but convenient, and all callers needs to be changed anyway). Instead of clearing the pool when the now removed maximum is reached, clear it on image parameter changes instead.
* video, audio: don't actively wait for demuxer inputwm42018-01-091-0/+2
| | | | | | | | | | | | If feed_packet() ended with DATA_WAIT, the player should have gone to sleep, until the demuxer wakes it up again when there is new data. But the call to read_frame() unconditionally overwrote this status code, so it never waited. The consequence was that the core burned CPU by effectively polling the demuxer status, which was noticeable especially when seeking in network streams (since seeking is async, decoders will start out with having to wait for network). Regression since commit 33e5755c.
* vo_gpu: hwdec_dxva2dxgi: initial implementationJames Ross-Gowan2018-01-063-0/+470
| | | | | | | | | | | | | This enables DXVA2 hardware decoding with ra_d3d11. It should be useful for Windows 7, where D3D11VA is not available. Images are transfered from D3D9 to D3D11 using D3D9Ex surface sharing[1]. Following Microsoft's recommendations, it uses a queue of shared surfaces, similar to Microsoft's ISurfaceQueue. This will hopefully prevent surface sharing from impacting parallelism and allow multiple D3D11 frames to be in-flight at once. [1]: https://msdn.microsoft.com/en-us/library/windows/desktop/ee913554.aspx
* vo_gpu: d3d11: check for NULL backbuffer in start_frameJames Ross-Gowan2018-01-041-2/+6
| | | | | | | | | | | | | In a lost device scenario, resize() will fail and p->backbuffer will be NULL. We can't recover from lost devices yet, but we should still check for a NULL backbuffer in start_frame() rather than crashing. Also remove a NULL check for p->swapchain. This was a red herring, since p->swapchain never becomes NULL in an error condition, but p->backbuffer actually does. This should fix the crash in #5320, but it doesn't fix the underlying reason for the lost device (which is probably a driver bug.)
* vo_gpu: d3d11: don't use a bgra8 swapchainJames Ross-Gowan2018-01-041-19/+8
| | | | | | | | | | Previously, mpv would attempt to use a BGRA swapchain in the hope that it would give better performance, since the Windows desktop is also composited in BGRA. In practice, it seems like there is no noticable performance difference between RGBA and BGRA swapchains and BGRA swapchains cause trouble with a42b8b1142fd, which attempts to use the swapchain format for intermediate FBOs, even though D3D11 does not guarantee BGRA surfaces will work with UAV typed stores.
* vo_gpu/context_android: replace both options with android-surface-sizesfan52018-01-021-4/+3
| | | | This allows us to automatically trigger a VOCTRL_RESIZE (also contained).
* video, audio: always read all frames before getting next packetwm42018-01-012-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The old code tried to make sure at all times to try to read a new packet. Only once that was read, it tried to retrieve new video or audio frames the decoder might already have decoded. Change this to strictly read frames from the decoder until it signals that it wants a new packet, and only then read and feed a new packet. This is in theory nicer, follows the libavcodec recommended data flow, and and reduces the minimum latency by 1 frame. This merely requires switching the order in which those calls are done. Normally, the decoder will return only 1 frame until a new packet is required. If we would just feed it 1 packet, return DATA_AGAIN, and wait until the next frame is decoded, we would run the playloop 1 time too often for no reason (which is fine but might have some overhead). To avoid this, try to read a frame again after possibly feeding a packet. For this reason, move the feed/read code to its own functions each, instead of merely moving the code. The audio and video code for this particular thing is basically duplicated. The idea is to unify them one day, so make the change to both. (Doing this for video is the real motivation for this change, see below.) The video code change is slightly more complicated, because we have to care about the framedrop counting (which is just a heuristic, but for now considered better than nothing, and possibly considered required to warn the user of framedrops happening - maybe). Apparently this change helps with stalling streams on Android with the mediacodec wrapper and mpeg2 decoder implementations which deinterlace on decoding (and return 2 frames per packet). Based on an idea and observations by tmm1.
* vo_gpu/android: fallback to EGL_WIDTH/HEIGHTAman Gupta2018-01-011-3/+15
| | | | | | | | | | Uses the EGL width/height by default when the user fails to set the android-surface-width/android-surface-height options. This means the vo-resize command is optional, and does not need to be implemented on android devices which do not support rotation. Signed-off-by: Aman Gupta <aman@tmm1.net>
* vo_gpu: d3d11: avoid copying staging buffers to cbuffersJames Ross-Gowan2018-01-011-48/+15
| | | | | | | | | | | | | | | | Apparently some Intel drivers have a bug where copying from staging buffers to constant buffers does not work. We used to keep a copy of the buffer data in a staging buffer to enable partial constant buffer updates. To work around this bug, keep the copy in talloc-allocated system memory instead. There doesn't seem to be any noticable performance difference from keeping the copy in system memory. Our cbuffers are probably too small for it to matter anyway. See also: https://crbug.com/593024 Fixes #5293
* demux_mkv: add hack to pass along x264 version to decoderwm42017-12-281-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | This fixes when resuming certain broken h264 files encoded by x264. See FFmpeg commit 840b41b2a643fc8f0617c0370125a19c02c6b586 about the x264 bug itself. Normally, the unregistered user data SEI (that contains the x264 version string) is informational only. But libavcodec uses it to workaround a x264 bug, which was recently fixed in both libavcodec and x264. The fact that both encoder and decoder were buggy is the reason that it was not found earlier, and there are apparently a lot of files around created by the broken decoder. If libavcodec sees the SEI, this bug can be worked around by using the old behavior. If you resume a file with mpv (i.e. seeking when the file loads), libavcodec never sees the first video packet. Consequently it has to assume the file is not broken, and never applies the workaround, resulting in garbage being played. Fix this by always feeding the first video packet to the decoder on init, and then flushing the codec (to avoid that an unwanted image is output). Flushing the codec does not remove info such as the x264 version. We also abuse the fact that the first avcodec_send_packet() always pushes the frame into the decoder (so we don't have to trigger the decoder by requsting an output frame).
* vd_lavc: add an option to explicitly workaround x264 4:4:4 bugwm42017-12-281-0/+5
| | | | | | Technically, the user could just use --vd-lavc-o with the same result. But I find it better to make this an explicit option, so we can document the ups and downs, and also avoid setting it for non-h264.
* vd_lavc: fix crash with RPI hwdecwm42017-12-281-1/+2
| | | | | | If you use vo_rpi, this could crash, because hwdec_devs is NULL. Untested. Fixes #5301.
* player: add internal `vo-resize` commandsfan52017-12-272-0/+7
| | | | Intended to be used with the properties from previous commit.
* vo_gpu/context: Let embedding application handle surface resizessfan52017-12-271-10/+20
| | | | | The callbacks for this are Java-only and EGL does not reliably return the correct values.
* vo_gpu: EGL: provide SwapInterval to generic codewm42017-12-271-0/+10
| | | | | | | This means that we now explicitly set an interval of 1. Although that should be the EGL default, some drivers could possibly ignore this (unconfirmed). In any case, this commit also allows disabling vsync, for users who want it.
* vf_vdpaupp: fix error handling and software input modewm42017-12-271-5/+9
| | | | | | | | Crashed when no vdpau device was loaded. Also there was a mistake of not setting p->ctx, which broke software surface input mode. This was not found before, because p->ctx is not needed for anything else. Fixes #5294.
* options: drop some previously deprecated optionswm42017-12-251-4/+0
| | | | | | | | A release has been made, so drop options deprecated for that release. Also drop some options which have been deprecated a much longer time before. Also fix a typo in client-api-changes.rst.
* vo_gpu: vulkan: fix segfault due to index mismatchNiklas Haas2017-12-251-5/+8
| | | | | | | | The queue family index and the queue info index are not necessarily the same, so we're forced to do a check based on the queue family index itself. Fixes #5049
* vo_gpu: vulkan: fix some image barrier odditiesNiklas Haas2017-12-251-10/+5
| | | | | | | | | | | A vulkan validation layer update pointed out that this was wrong; we still need to use the access type corresponding to the stage mask, even if it means our code won't be able to skip the pipeline barrier (which would be wrong anyway). In additiona to this, we're also not allowed to specify any source access mask when transitioning from top_of_pipe, which doesn't make any sense anyway.
* vo_gpu: vulkan: omit needless #defineNiklas Haas2017-12-251-5/+0
|
* vo_gpu: vulkan: fix sharing mode on malloc'd buffersNiklas Haas2017-12-251-1/+0
| | | | Might explain some of the issues in multi-queue scenarios?
* vo_gpu: vulkan: fix dummyPass creationNiklas Haas2017-12-251-1/+1
| | | | This violates vulkan spec
* vo_gpu: vulkan: fix the rgb565a1 names -> rgb5a1Niklas Haas2017-12-251-2/+2
| | | | This is 5 bits per channel, not 565
* vo_gpu: vulkan: allow disabling async tf/compNiklas Haas2017-12-253-4/+21
| | | | | | | | | Async compute in particular seems to cause problems on some drivers, and even when supprted the benefits are not that massive from the tests I have seen, so it's probably safe to keep off by default. Async transfer on the other hand seems to work better and offers a more substantial improvement, so it's kept on.
* vo_gpu: vulkan: refine queue family selection algorithmNiklas Haas2017-12-251-2/+7
| | | | | | This gets confused by e.g. SPARSE_BIT on the TRANSFER_BIT, leading to situations where "more specialized" is ambiguous and the logic breaks down. So to fix it, only compare the subset we care about.
* vo_gpu: vulkan: prefer vkCmdCopyImage over vkCmdBlitImageNiklas Haas2017-12-251-8/+31
| | | | | | blit() implies scaling, copy() is the equivalent command to use when the formats are compatible (same pixel size) and the rects have the same dimensions.
* vo_gpu: attempt re-using the FBO format for p->output_texNiklas Haas2017-12-254-1/+13
| | | | | | | | | This allows RAs with support for non-opaque FBO formats to use a more appropriate FBO format for the output tex, possibly enabling a more efficient blit operation. This requires distinguishing between real formats (which can be used to create textures) and fake formats (e.g. ra_gl's FBO hack).
* vo_gpu: vulkan: properly depend on the swapchain acquire semaphoreNiklas Haas2017-12-253-15/+25
| | | | | This is now associated with the ra_tex directly and used in the correct way, rather than hackily done from submit_frame.
* vo_gpu: vulkan: use correct access flag for presentNiklas Haas2017-12-251-2/+3
| | | | This needs VK_ACCESS_MEMORY_READ_BIT (spec)
* vo_gpu: vulkan: make the swapchain more robustNiklas Haas2017-12-251-23/+50
| | | | | Now handles both VK_ERROR_OUT_OF_DATE_KHR and VK_SUBOPTIMAL_KHR for both vkAcquireNextImageKHR and vkQueuePresentKHR in the correct way.
* vo_gpu: aggressively prefer async computeNiklas Haas2017-12-253-1/+12
| | | | | | | | | | On AMD devices, we only get one graphics pipe but several compute pipes which can (in theory) run independently. As such, we should prefer compute shaders over fragment shaders in scenarios where we expect them to be better for parallelism. This is amusingly trivial to do, and actually improves performance even in a single-queue scenario.
* vo_gpu: vulkan: support split command poolsNiklas Haas2017-12-256-163/+281
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a single primary queue, we generate multiple vk_cmdpools and pick the right one dynamically based on the intent. This has a number of immediate benefits: 1. We can use async texture uploads 2. We can use the DMA engine for buffer updates 3. We can benefit from async compute on AMD GPUs Unfortunately, the major downside is that due to the lack of QF ownership tracking, we need to use CONCURRENT sharing for all resources (buffers *and* images!). In theory, we could try figuring out a way to get rid of the concurrent sharing for buffers (which is only needed for compute shader UBOs), but even so, the concurrent sharing mode doesn't really seem to have a significant impact over here (nvidia). It's possible that other platforms may disagree. Our deadlock-avoidance strategy is stupidly simple: Just flush the command every time we need to switch queues, and make sure all submission and callbacks happen in FIFO order. This required lifting the cmds_pending and cmds_queued out from vk_cmdpool to mpvk_ctx, and some functions died/got moved as a result, but that's a relatively minor change. On my hardware this is a fairly significant performance boost, mainly due to async transfers. (Nvidia doesn't expose separate compute queues anyway). On AMD, this should be a performance boost as well due to async compute.
* vo_gpu: invalidate fbotex before drawingNiklas Haas2017-12-254-10/+11
| | | | | Don't discard the OSD or pass_draw_to_screen passes though. Could be faster on some hardware.
* vo_gpu: allow invalidating FBO in renderpass_runNiklas Haas2017-12-253-5/+22
| | | | | | | | | This is especially interesting for vulkan since it allows completely skipping the layout transition as part of the renderpass. Unfortunately, that also means it needs to be put into renderpass_params, as opposed to renderpass_run_params (unlike #4777). Closes #4777.
* vo_gpu: vulk