path: root/video/decode/vaapi.c
Commit message (author, date, files changed, lines -/+)
* build: prefix hwaccel decoder wrapper filenames with hw_ (wm4, 2017-01-17, 1 file, -226/+0)
  Should have done this a long time ago. d3d.c remains as it is, because it's just a bunch of helper functions.
* vo_opengl, vaapi: properly probe 10 bit rendering support (wm4, 2017-01-13, 1 file, -9/+20)
  There are going to be users who have a Mesa installation which does not support 10 bit, but a GPU which can decode to 10 bit. So it's probably better not to hardcode whether it is supported.

  Introduce a more general way to signal supported formats from renderer to decoder. Obviously this is imperfect, because it still isn't part of proper format negotiation (for example, what if there's a vavpp filter, which accepts anything). Still slightly better than before.

  I don't know any way to probe for vaapi dmabuf/EGL dmabuf support properly (in particular testing specific formats, not just general availability). So we stay with the current approach and try to create and map dummy surfaces on init to probe for support. Overdo it and check all formats that AVHWFramesConstraints reports, instead of only NV12 and P010 surfaces.

  Since we can support unknown formats now, add explicit checks to the EGL/dmabuf mapper code to reject unsupported formats. I also noticed that libavutil signals support for RGB0/BGR0, but couldn't get it to work. Remove the DRM formats that are unused/didn't work the way I tried to use them.

  With this, 10 bit decoding + rendering should work, provided you have a capable GPU and a patched Mesa. The required Mesa patch adds support for the R16 and GR32 formats. It was sent by a Kodi developer to the Mesa developer mailing list and was not accepted yet.
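
  A minimal sketch of the probing approach described above, written against the real libavutil API (av_hwdevice_get_hwframe_constraints() and friends); the probe_format() callback is hypothetical:

      #include <stdbool.h>
      #include <libavutil/hwcontext.h>

      static void probe_formats(AVBufferRef *device_ref,
                                bool (*probe_format)(enum AVPixelFormat fmt))
      {
          AVHWFramesConstraints *cs =
              av_hwdevice_get_hwframe_constraints(device_ref, NULL);
          if (!cs)
              return;
          for (int i = 0; cs->valid_sw_formats &&
                          cs->valid_sw_formats[i] != AV_PIX_FMT_NONE; i++) {
              // probe_format() would create a small AVHWFramesContext with
              // this sw_format and try to map a dummy surface as an
              // EGL/dmabuf image, recording the format on success.
              probe_format(cs->valid_sw_formats[i]);
          }
          av_hwframe_constraints_free(&cs);
      }
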
* vaapi: always create AVHWDeviceContext on init (wm4, 2017-01-13, 1 file, -13/+2)
  For convenience. Since we still have code that works even if creating an AVHWDeviceContext fails, failure is ignored. (Although currently, creation even succeeds with the stale/abandoned vdpau wrapper driver.)
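
  A sketch of what "always create on init, tolerate failure" looks like with the real libavutil calls, assuming a VADisplay is already open (the wrapper function itself is illustrative):

      #include <libavutil/hwcontext.h>
      #include <libavutil/hwcontext_vaapi.h>

      static AVBufferRef *create_hw_device_ctx(VADisplay display)
      {
          AVBufferRef *ref = av_hwdevice_ctx_alloc(AV_HWDEVICE_TYPE_VAAPI);
          if (!ref)
              return NULL;
          AVHWDeviceContext *hwctx = (void *)ref->data;
          AVVAAPIDeviceContext *vactx = hwctx->hwctx;
          vactx->display = display;          // wrap the existing VADisplay
          if (av_hwdevice_ctx_init(ref) < 0)
              av_buffer_unref(&ref);         // failure tolerated: caller gets NULL
          return ref;
      }
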
* vaapi: fix typo (wm4, 2017-01-12, 1 file, -1/+1)
* vaapi: explicitly reject 10 bit surfaces outside of copy mode (wm4, 2017-01-12, 1 file, -0/+7)
  Rendering support in Mesa probably doesn't exist yet. In theory it might be possible to use VPP to convert the surfaces to 8 bit (like we do it with dxva2/d3d11va, as ANGLE doesn't support rendering 10 bit surfaces either), but that too would require explicit mechanisms. This can't be implemented either until I have a GPU with actual support.
* vaapi: handle image copying for vaapi-copy in common code (wm4, 2017-01-12, 1 file, -13/+0)
  Other hwdecs will also be able to use this (as soon as they are switched to use AVHWFramesContext).

  As an additional feature, failing to copy back the frame counts as hardware decoding failure and can trigger fallback. This can be done easily now, because no extra mechanism is needed to communicate this from the hwaccel glue code to the common code.

  The old code is still required for the old decode API, until we either drop or rewrite it. vo_vaapi.c's OSD code (fuck...) also uses these surface functions to a higher degree.
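
  The copy-back step itself boils down to the libavutil transfer call; a sketch, where a failed copy is what triggers the software fallback mentioned above (names other than the libavutil functions are illustrative):

      #include <libavutil/frame.h>
      #include <libavutil/hwcontext.h>

      static AVFrame *copy_back(AVFrame *hw_frame)
      {
          AVFrame *sw_frame = av_frame_alloc();
          if (!sw_frame)
              return NULL;
          if (av_hwframe_transfer_data(sw_frame, hw_frame, 0) < 0) {
              av_frame_free(&sw_frame);
              return NULL;  // counts as hw decoding failure -> fallback
          }
          av_frame_copy_props(sw_frame, hw_frame);  // keep pts and metadata
          return sw_frame;
      }
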
* vaapi: properly set hw_subfmt field with new decode API (wm4, 2017-01-12, 1 file, -7/+0)
  This fixes direct rendering with hwdec_vaegl.c. The code duplication between update_image_params() and mp_image_copy_fields_from_av_frame() is quite annoying, but will have to be resolved in another commit.
* vaapi: set our own context in AVHWFramesContext not AVHWDeviceContext (wm4, 2017-01-12, 1 file, -3/+3)
  AVHWDeviceContext.user_opaque is reserved to libavutil under certain circumstances, while AVHWFramesContext.user_opaque is truly free for use by us. It's slightly simpler too.
* vaapi: support new libavcodec vaapi API (wm4, 2017-01-11, 1 file, -0/+239)
  The old API is deprecated, and libavcodec prints a warning at runtime. The new API is a bit nicer and does many things for you, such as managing the underlying hwaccel decoder. libavutil also provides code for managing surfaces (we use their surface pool).

  The new code does not contain any code from the original MPlayer VAAPI patch (that was used as base for some of the vaapi code in mpv). Thus the new code is LGPL.

  The new API actually does not add any visible symbols, so the only way to detect it is a version check. Of course, the versions overlap between FFmpeg and Libav, which requires additional care. The new API did not get merged into FFmpeg yet, so there's no check for FFmpeg.
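
  The version-overlap problem in a nutshell, as a hedged sketch: FFmpeg can be told apart from Libav by its micro version (>= 100 by FFmpeg convention); the Libav version number below is illustrative, not taken from this commit:

      #include <libavcodec/version.h>

      #if LIBAVCODEC_VERSION_MICRO >= 100
      // FFmpeg: the new vaapi hwaccel API is not merged yet, don't use it.
      #define HAVE_VAAPI_HWACCEL_NEW 0
      #else
      // Libav: enable the new API from the version that introduced it.
      #define HAVE_VAAPI_HWACCEL_NEW \
          (LIBAVCODEC_VERSION_INT >= AV_VERSION_INT(57, 26, 0))
      #endif
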
* vaapi: rename vaapi.c to vaapi_old.c (wm4, 2017-01-11, 1 file, -570/+0)
  vaapi.c will be reintroduced with the new code using the new libavcodec vaapi API.
* vaapi: support drm devices when running in vaapi-copy mode (Bernhard Frauendienst, 2016-10-02, 1 file, -0/+53)
  When the vaapi decoder is used in copy mode, it creates a dummy display to render to. In theory, this should support hardware decoding on a separate GPU that is not actually connected to any output (like an iGPU which supports more formats than the external GPU to which the monitor is connected).

  However, before this change, only X11 displays were supported as dummy displays. This caused some graphics drivers (namely intel-driver) to core dump when they were not actually used as X11 module.

  This change introduces support for drm libva displays, which allows vaapi-copy to run on such cards which are not actually rendering the X11 output.
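
  A sketch of the DRM dummy-display path: open a render node and hand the fd to libva's DRM backend instead of requiring an X11 connection (vaGetDisplayDRM() is real libva API; the device path is an assumption, usually the first render node):

      #include <fcntl.h>
      #include <stddef.h>
      #include <unistd.h>
      #include <va/va_drm.h>

      static VADisplay open_drm_display(int *out_fd)
      {
          int fd = open("/dev/dri/renderD128", O_RDWR);
          if (fd < 0)
              return NULL;
          VADisplay dpy = vaGetDisplayDRM(fd);
          if (!dpy) {
              close(fd);
              return NULL;
          }
          *out_fd = fd;  // caller closes it after vaTerminate()
          return dpy;
      }
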
* vaapi: avoid forward declaration of variable (wm4, 2016-05-15, 1 file, -9/+7)
  Why is everything so horrible.
* video: add --hwdec=auto-copy mode (wm4, 2016-05-11, 1 file, -0/+1)
  This uses the normal autoprobing rules like "auto", but rejects anything that isn't flagged as copying data back to system memory. The chunk in command.c was dead code, so remove it instead of updating it.
* video: refactor how VO exports hwdec device handles (wm4, 2016-05-09, 1 file, -17/+16)
  The main change is with video/hwdec.h. mp_hwdec_info is made opaque (and renamed to mp_hwdec_devices). Its accessors are mainly thread-safe (or documented where not), which makes the whole thing saner and cleaner. In particular, thread-safety rules become less subtle and more obvious.

  The new internal API makes it easier to support multiple OpenGL interop backends. (Although this is not done yet, and it's not clear whether it ever will.)

  This also removes all the API-specific fields from mp_hwdec_ctx and replaces them with a "ctx" field. For d3d in particular, we drop the mp_d3d_ctx struct completely, and pass the interfaces directly.

  Remove the emulation checks from vaapi.c and vdpau.c; they are pointless, and the checks that matter are done on the VO layer.

  The d3d hardware decoders might slightly change behavior: dxva2-copy will not use the VO device anymore if the VO supports proper interop. This pretty much assumes that in such cases the VO will not use any form of exclusive mode, which makes using the VO device in copy mode unnecessary.

  This is a big refactor. Some things may be untested and could be broken.
* vaapi: determine surface format in decoder, not in renderer (wm4, 2016-04-11, 1 file, -0/+7)
  Until now, we have made the assumption that a driver will use only 1 hardware surface format. The format is dictated by the driver (you don't create surfaces with a specific format - you just pass a rt_format and get a surface that will be in a specific driver-chosen format). In particular, the renderer created a dummy surface to probe the format, and hoped the decoder would produce the same format. Due to a driver bug this required a workaround to actually get the same format as the driver did.

  Change this so that the format is determined in the decoder. The format is then passed down as hw_subfmt, which allows the renderer to configure itself with the correct format. If the hardware surface changes its format midstream, the renderer can be reconfigured using the normal mechanisms.

  This calls va_surface_init_subformat() each time after the decoder returns a surface. Since libavcodec/AVFrame has no concept of sub-formats, this is unavoidable. It creates and destroys a derived VAImage, but this shouldn't have any bad performance effects (at least I didn't notice any measurable effects).

  Note that vaDeriveImage() failures are silently ignored as some drivers (the vdpau wrapper) support neither vaDeriveImage, nor EGL interop. In addition, we still probe whether we can map an image in the EGL interop code. This is important as it's the only way to determine whether EGL interop is supported at all. With respect to the driver bug mentioned above, it doesn't matter which format the test surface has.

  In vf_vavpp, also remove the rt_format guessing business. I think the existing logic was a bit meaningless anyway. It's not even a given that vavpp produces the same rt_format for output.
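
  A sketch of the per-surface format probe with the real libva calls, failure silently ignored as described (the helper name is illustrative):

      #include <stdint.h>
      #include <va/va.h>

      static uint32_t get_surface_fourcc(VADisplay dpy, VASurfaceID surface)
      {
          VAImage image;
          if (vaDeriveImage(dpy, surface, &image) != VA_STATUS_SUCCESS)
              return 0;  // e.g. the vdpau wrapper driver lands here
          uint32_t fourcc = image.format.fourcc;
          vaDestroyImage(dpy, image.image_id);
          return fourcc;
      }
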
* vd_lavc: let hardware decoder request delaying frames explicitly (wm4, 2016-04-07, 1 file, -0/+1)
  Until now, the presence of the process_image() callback was used to set a delay queue with a hardcoded size. Change this to a vd_lavc_hwdec field instead, so the decoder can explicitly set this if it's really needed. Do this so process_image() can be used in the VideoToolbox glue code for something entirely unrelated.
* vd_lavc: fix codec vs. decoder confusion (wm4, 2016-04-07, 1 file, -4/+4)
  Some functions which expected a codec name (i.e. the name of the video format itself) were passed a decoder name. Most "native" libavcodec decoders have the same name as the codec, so this was never an issue.

  This should mean that e.g. using "--vd=lavc:h264_mmal --hwdec=mmal" should now actually enable native surface mode (instead of doing copy-back).
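
  The distinction, illustrated with real libavcodec calls (the wrapper function is just for show):

      #include <stdio.h>
      #include <libavcodec/avcodec.h>

      static void show_name_difference(void)
      {
          const AVCodec *dec = avcodec_find_decoder_by_name("h264_mmal");
          if (dec)  // decoder name vs. name of the codec it implements
              printf("decoder: %s, codec: %s\n",
                     dec->name, avcodec_get_name(dec->id));
          // -> decoder: h264_mmal, codec: h264
      }
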
* vaapi: lower number of allocated surfaces again (wm4, 2016-01-26, 1 file, -1/+1)
  Commit b53cb8de increased this by the number of additionally delayed surfaces. But since this is only enabled in copy-back mode (which is what process_image is about), the other additional surfaces accounted for the direct rendering case can be ignored.
* vd_lavc: delay images before reading them back (wm4, 2016-01-25, 1 file, -1/+1)
  Facilitates hardware pipelining in particular with nvidia/dxva.
* vaapi: fix compilation on older FFmpeg/Libav (wm4, 2016-01-20, 1 file, -1/+1)
  They don't define FF_PROFILE_VP9_0. Fixes #2737.
* vaapi: add VP9 profile entries (BtbN, 2015-12-20, 1 file, -0/+7)
* vaapi: remove dependency on X11 (wm4, 2015-09-27, 1 file, -13/+58)
  There are at least 2 ways of using VAAPI without X11 (Wayland, DRM). Remove the X11 requirement from the decoder part and the EGL interop. This will be used by a following commit, which adds Wayland support.

  The worst about this is the decoder part, which includes a bad hack for using the decoder without any VO interop (also known as "vaapi-copy" mode). Separate the X11 parts so that they're self-contained.

  For the EGL interop code we do something similar (it's kept slightly simpler, because it essentially only has to translate between our silly MPGetNativeDisplay abstraction and the vaGetDisplay...() call).
* vaapi: add HEVC profile entries (wm4, 2015-08-24, 1 file, -0/+10)
  libavcodec does not support HEVC via VAAPI yet, so this won't work. However, there is ongoing work to add HEVC support to VAAPI, and this change might help with testing. (Or maybe not - but there is no harm in this change.)
* vd_lavc: remove unneeded hwdec parameters (wm4, 2015-08-19, 1 file, -3/+2)
  All hwdec backends now use a single pixel format, and the format is always checked. Also, the init_decoder callback is now mandatory.
* vaapi: allow allocating additional surfaces during decoding (wm4, 2015-07-15, 1 file, -3/+2)
  Fixes problems with --vo=opengl:interpolation. The issue here is that vo_opengl retains more surfaces than what was preallocated for the decoder. Until now, we just explicitly failed to decode frames for which no additional surfaces are available. Since modern drivers usually are fine with not "registering" surfaces before the decoder is created, just allow allocating additional surfaces if needed.

  (We also could probably recreate the HW decoder, since the HW decoder should be stateless. But let's try to avoid raising the overall complexity of the code.)
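
  Allocating one extra surface on demand boils down to a plain vaCreateSurfaces() call (real libva API; rt_format, width, and height would come from the existing decoder configuration):

      #include <va/va.h>

      static VASurfaceID alloc_extra_surface(VADisplay dpy, unsigned rt_format,
                                             unsigned width, unsigned height)
      {
          VASurfaceID surface = VA_INVALID_ID;
          if (vaCreateSurfaces(dpy, rt_format, width, height,
                               &surface, 1, NULL, 0) != VA_STATUS_SUCCESS)
              return VA_INVALID_ID;
          // Relies on modern drivers accepting surfaces that were not
          // registered with the decoder at creation time.
          return surface;
      }
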
* vaapi: increase number of additional surfaces (wm4, 2015-07-08, 1 file, -6/+2)
  Sometime recently, hardware decoding started to fail if h264 with full reference frames was decoded, and --vo=vaapi was used. VAAPI requires registering all surfaces that the decoder will ever use in advance, so if the playback chain uses more surfaces than originally allocated, we fail and drop back to software decoding.

  I'm not really sure why or when this started happening. Commit 7b9d7265 for one is not the cause - it can be reproduced with earlier commits. It also seems to be timing dependent. Possibly it has to do with the way vo.c retains previous surfaces, and the way they can be queued/unqueued asynchronously.

  Increasing the number of reserved additional surfaces by 1 fixes it. (Though I have no idea where exactly all these surfaces are being used. Or rather, _when_.)
* video: reduce error message when loading hwdec backend fails (wm4, 2015-06-20, 1 file, -1/+1)
  When using --hwdec=auto, about half of all systems will print: "[vdpau] Error when calling vdp_device_create_x11: 1". This happens because usually mpv will be linked against both vdpau and vaapi libs, but the drivers are not necessarily available. Then trying to load a driver will fail. This is a normal part of probing, but the error messages were printed anyway. Silence them by explicitly distinguishing probing.

  This pretty much goes through all the layers. We actually consider loading hw backends for vo_opengl always "auto probed", even if a hw backend is explicitly requested. In this case vd_lavc will print a warning message anyway (adjust this message a bit).
* vaapi: remove direct mapping non-sense (wm4, 2015-05-29, 1 file, -42/+6)
  This must have been some non-sense in the original vaapi mplayer patch. While I still have no good idea what this "direct mapping" business is about, it appears to be pretty much pointless. Nothing can hold additional "real" surface references (due to how the API and mpv/lavc refcounting work), so removing the additional surfaces won't break anything.

  It still could be that this was for achieving additional buffering (not reusing surfaces as soon), but we buffer some additional data anyway. Plus, the original intention of the vaapi mplayer code was probably increasing surface count just by 1 or 2, not actually doubling it, and/or it was a "trick" to get to the maximum count of 21 when h264 is in use.

  gstreamer-vaapi uses "ref_frames + SCRATCH_SURFACES_COUNT" here, with SCRATCH_SURFACES_COUNT defined to 4. It doesn't appear to check the overlay attributes at all in the decoder.

  In any case, remove this non-sense.
* video: un-discourage "vaapi-copy" hwdec mode (wm4, 2015-02-20, 1 file, -5/+0)
  Maybe I don't know what I'm doing. I'm fairly certain though that Intel does not know what they're doing.
* video: have a generic context struct for hwdec backends (wm4, 2015-01-22, 1 file, -5/+3)
  Before this commit, each hw backend had their own specific struct types for context, and some, like VDA, had none at all. Add a context struct (mp_hwdec_ctx) that provides a somewhat generic way to pass the hwdec context around. Some things get slightly better, some slightly more verbose.

  mp_hwdec_info is still around; it's still needed, but is reduced to its role of handling delayed loading of the hwdec backend.
* vaapi: try dealing with Intel's braindamaged shit drivers (wm4, 2014-08-21, 1 file, -0/+21)
  So talking to a certain Intel dev, it sounded like modern VA-API drivers are reasonably thread-safe. But apparently that is not the case. Not at all. So add approximate locking around all vaapi API calls.

  The problem appeared once we moved decoding and display to different threads. That means the "vaapi-copy" mode was unaffected, but decoding with vo_vaapi or vo_opengl led to random crashes.

  Untested on real Intel hardware. With the vdpau emulation, it seems to work fine - but actually it worked fine even before this commit, because vdpau was written and designed not by morons, but competent people (vdpau is guaranteed to be fully thread-safe).

  There is some probability that this commit doesn't fix things entirely. One problem is that locking might not be complete. For one, libavcodec _also_ accesses vaapi, so we have to rely on our own guesses how and when lavc uses vaapi (since we disable multithreading when doing hw decoding, our guess should be relatively good, but it's still a lavc implementation detail). One other reason that this commit might not help is Intel's amazing potential to fuck up anything that is good and holy.
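
  The approximate locking, sketched: one global mutex wrapped around every libva call site (the wrapper name is illustrative; vaSyncSurface() stands in for any libva call):

      #include <pthread.h>
      #include <va/va.h>

      static pthread_mutex_t va_lock = PTHREAD_MUTEX_INITIALIZER;

      static VAStatus locked_vaSyncSurface(VADisplay dpy, VASurfaceID surface)
      {
          pthread_mutex_lock(&va_lock);
          VAStatus status = vaSyncSurface(dpy, surface);
          pthread_mutex_unlock(&va_lock);
          return status;
      }
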
* vaapi: we need more surfaces (wm4, 2014-08-18, 1 file, -1/+2)
  Playing with high framedrop could make it run out of surfaces. In theory, we wouldn't need an additional surface, if we could just clear the vo_vaapi internal surface - but doing so would probably be a pain, so I don't care.
* video: move display and timing to a separate thread (wm4, 2014-08-12, 1 file, -2/+2)
  The VO is run inside its own thread. It also does most of video timing. The playloop hands the image data and a realtime timestamp to the VO, and the VO does the rest. In particular, this allows the playloop to do other things, instead of blocking for video redraw. But if anything accesses the VO during video timing, it will block.

  This also fixes vo_sdl.c event handling; but that is only a side-effect, since reimplementing the broken way would require more effort.

  Also drop --softsleep. In theory, this option helps if the kernel's sleeping mechanism is too inaccurate for video timing. In practice, I haven't ever encountered a situation where it helps, and it just burns CPU cycles. On the other hand it's probably actively harmful, because it prevents the libavcodec decoder threads from doing real work.

  Side note: Originally, I intended that multiple frames can be queued to the VO. But this is not done, due to problems with OSD and other certain features. OSD in particular is simply designed in a way that it can be neither timed nor copied, so you do have to render it into the video frame before you can draw the next frame. (Subtitles have no such restriction. sd_lavc was even updated to fix this.) It seems the right solution to queuing multiple VO frames is rendering on VO-backed framebuffers, like vo_vdpau.c does. This requires VO driver support, and is out of scope of this commit.

  As a consequence, the VO has a queue size of 1. The existing video queue is just needed to compute frame duration, and will be moved out in the next commit.
* vaapi: fix uninitialized value read (wm4, 2014-08-11, 1 file, -1/+1)
  Found with valgrind. This is somewhat terrifying, because the VA-API API function is supposed to fill these values, and we access them only if the API functions return success. So this shouldn't have happened.
* vaapi: fix destruction with --hwdec=vaapi-copy (wm4, 2014-05-28, 1 file, -2/+6)
  This is incomplete; the video chain will still hold some vaapi objects after destroying the decoder and thus the vaapi context. This is very bad. Fixing it would require something like refcounting the vaapi context, but I don't really want to.
* video: warn if an emulated hwdec API is used (wm4, 2014-05-28, 1 file, -0/+5)
  mpv supports two hardware decoding APIs on Linux: vdpau and vaapi. Each of these has emulation wrappers. The wrappers are usually slower and have fewer features than their native opposites. In particular the libva vdpau driver is practically unmaintained.

  Check the vendor string and print a warning if emulation is detected. Checking vendor strings is a very stupid thing to do, but I find the thought of people using an emulated API for no reason worse.

  Also, make --hwdec=auto never use an API that is detected as emulated. This doesn't work quite right yet, because once one API is loaded, vo_opengl doesn't unload it, so no hardware decoding will be used if the first probed API (usually vdpau) is rejected. But good enough.
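
  A sketch of the vendor-string heuristic with the real libva query call; the substring matched here is an assumption, not taken from this commit:

      #include <stdbool.h>
      #include <string.h>
      #include <va/va.h>

      static bool is_emulated(VADisplay dpy)
      {
          const char *vendor = vaQueryVendorString(dpy);
          return vendor && strstr(vendor, "VDPAU backend");
      }
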
* vaapi: make struct va_surface private (wm4, 2014-03-17, 1 file, -11/+8)
  It doesn't really need to be public. Other code can just use mp_image. The only disadvantage is that the other code needs to call an accessor to get the VASurfaceID.
* vaapi: replace image pool implementation with mp_image_pool (wm4, 2014-03-17, 1 file, -30/+29)
  Although I at first thought it would be better to have a separate implementation for hwaccels because the differences to software images are too large, it turns out you can actually save some code with it.

  Note that the old implementation had a small memory management bug. This got painted over in commit 269c1e1, but is hereby solved properly.

  Also note that I couldn't test vf_vavpp.c (due to lack of hardware), and I hope I didn't accidentally break it.
* vd_lavc: remove compatibility crap (wm4, 2014-03-16, 1 file, -2/+2)
  All this code was needed for compatibility with very old libavcodec versions only (such as Libav 9). Includes some now-possible simplifications too.
* video: initialize hw decoder in get_format (wm4, 2014-03-10, 1 file, -24/+10)
  Apparently the "right" place to initialize the hardware decoder is in the libavcodec get_format callback. This doesn't change vda.c and vdpau_old.c, because I don't have OSX, and vdpau_old.c is probably going to be removed soon (if Libav ever manages to release Libav 10). So for now the init_decoder callback added with this commit is optional.

  This also means vdpau.c and vaapi.c don't have to manage and check the image parameters anymore.

  This change is probably needed for when libavcodec VDA support gets a new iteration of its API.
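
  The shape of the hook in question, as a sketch: libavcodec offers a list of pixel formats, and picking the hwaccel format inside this callback is where the hardware decoder gets initialized (the callback body is illustrative):

      #include <libavcodec/avcodec.h>

      static enum AVPixelFormat get_format_cb(AVCodecContext *avctx,
                                              const enum AVPixelFormat *fmt)
      {
          for (int i = 0; fmt[i] != AV_PIX_FMT_NONE; i++) {
              if (fmt[i] == AV_PIX_FMT_VAAPI) {
                  // the init_decoder callback would run here; on failure,
                  // fall through and let a software format be chosen
                  return fmt[i];
              }
          }
          return fmt[0];  // first (software) format offered
      }

      // Installed with: avctx->get_format = get_format_cb;
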
* vaapi: mp_msg conversions (wm4, 2013-12-21, 1 file, -26/+24)
  This ended up a little bit messy. In order to get a mp_log everywhere, mostly make use of the fact that va_surface already references global state anyway.
* Split mpvcore/ into common/, misc/, bstr/ (wm4, 2013-12-17, 1 file, -2/+2)
* vaapi: remove unused hw image formats, simplify (wm4, 2013-11-29, 1 file, -2/+2)
  PIX_FMT_VDA_VLD and PIX_FMT_VAAPI_VLD were never used anywhere. I'm not sure why they were even added, and they sound like they are just for compatibility with XvMC-style decoding, which sucks anyway.

  Now that there's only a single vaapi format, remove the IMGFMT_IS_VAAPI() macro. Also get rid of IMGFMT_IS_VDA(), which was unused.
* video: move struct mp_hwdec_info into its own header file (wm4, 2013-11-23, 1 file, -1/+1)
  This means most code accessing this struct must now include hwdec.h instead of dec_video.h. I just put it into dec_video.h at first because I thought a separate file would be a waste, but it's more proper to do it this way, as there are too many files which include dec_video.h only to get the mp_hwdec_info definition.
* vo_opengl: add support for VA-API OpenGL interop (wm4, 2013-11-04, 1 file, -0/+1)
  VA-API's OpenGL/GLX interop is pretty bad and perhaps slow (renders an X11 pixmap into a FBO, and has to go over X11, probably involves one or more copies), and this code serves more as an example, rather than for serious use. On the other hand, this might work much better than vo_vaapi, even if slightly slower.
* video: check profiles with hardware decoding (wm4, 2013-11-01, 1 file, -46/+31)
  We had some code for checking profiles earlier, which was removed in commits 2508f38 and adfb71b. These commits mentioned that (working) hw decoding was sometimes prevented due to profile checking, but I can't find the samples anymore that showed this behavior. Also, I changed my opinion, and I think checking the profiles is something that should be done for better fallback to software decoding behavior.

  The checks roughly follow VLC's vdpau profile checks, although we do not check codec levels. (VLC's profile checks aren't necessarily completely correct, but they're a welcome help anyway.)

  Add a --vd-lavc-check-hw-profile option, which skips the profile check.
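
  The gist of such a check, sketched: map the profile reported by libavcodec to a VAProfile and reject anything the table doesn't cover, so playback falls back to software decoding (a partial illustration, not the commit's actual table):

      #include <libavcodec/avcodec.h>
      #include <va/va.h>

      static VAProfile map_h264_profile(int ff_profile)
      {
          switch (ff_profile & ~FF_PROFILE_H264_INTRA) {
          case FF_PROFILE_H264_BASELINE:
          case FF_PROFILE_H264_CONSTRAINED_BASELINE:
              return VAProfileH264ConstrainedBaseline;
          case FF_PROFILE_H264_MAIN:
              return VAProfileH264Main;
          case FF_PROFILE_H264_HIGH:
              return VAProfileH264High;
          default:
              return VAProfileNone;  // -> reject, fall back to software
          }
      }
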
* vaapi: remove non-VLD entrypoints (wm4, 2013-09-29)