path: root/video/decode/vaapi.c
Commit message (author, date, files, lines)
* vaapi: avoid forward declaration of variable (wm4, 2016-05-15, 1 file, -9/+7)

  Why is everything so horrible.
* video: add --hwdec=auto-copy mode (wm4, 2016-05-11, 1 file, -0/+1)

  This uses the normal autoprobing rules like "auto", but rejects anything
  that isn't flagged as copying data back to system memory.

  The chunk in command.c was dead code, so remove it instead of updating it.
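  A minimal sketch of the selection rule this describes, assuming a list
  walked in autoprobe order and a "copying" flag; the struct and field
  names are illustrative, not mpv's actual ones:

      #include <stdbool.h>
      #include <stddef.h>

      struct hwdec_entry {
          const char *name;   /* e.g. "vaapi-copy" */
          bool copying;       /* copies frames back to system memory */
      };

      /* "auto-copy": same probe order as "auto", but skip direct modes. */
      static const struct hwdec_entry *probe_auto_copy(
          const struct hwdec_entry *list, size_t n)
      {
          for (size_t i = 0; i < n; i++) {
              if (list[i].copying)
                  return &list[i];
          }
          return NULL;
      }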
* video: refactor how VO exports hwdec device handles (wm4, 2016-05-09, 1 file, -17/+16)

  The main change is with video/hwdec.h. mp_hwdec_info is made opaque (and
  renamed to mp_hwdec_devices). Its accessors are mainly thread-safe (or
  documented where not), which makes the whole thing saner and cleaner. In
  particular, thread-safety rules become less subtle and more obvious.

  The new internal API makes it easier to support multiple OpenGL interop
  backends. (Although this is not done yet, and it's not clear whether it
  ever will.)

  This also removes all the API-specific fields from mp_hwdec_ctx and
  replaces them with a "ctx" field. For d3d in particular, we drop the
  mp_d3d_ctx struct completely, and pass the interfaces directly.

  Remove the emulation checks from vaapi.c and vdpau.c; they are pointless,
  and the checks that matter are done on the VO layer.

  The d3d hardware decoders might slightly change behavior: dxva2-copy will
  not use the VO device anymore if the VO supports proper interop. This
  pretty much assumes that in such cases the VO will not use any form of
  exclusive mode, which makes using the VO device in copy mode unnecessary.

  This is a big refactor. Some things may be untested and could be broken.
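  A minimal sketch of the opaque-handle pattern described here, assuming a
  single device and pthreads; the names approximate mp_hwdec_devices, but
  the real accessors differ:

      #include <pthread.h>
      #include <stddef.h>

      struct mp_hwdec_ctx {
          const char *driver_name;  /* e.g. "vaapi" */
          void *ctx;                /* API-specific handle (VADisplay, ...) */
      };

      struct mp_hwdec_devices {
          pthread_mutex_t lock;
          struct mp_hwdec_ctx *dev;
      };

      /* Thread-safe accessor: locking inside the accessor makes the rules
       * obvious instead of subtle. */
      static struct mp_hwdec_ctx *hwdec_devices_get(
          struct mp_hwdec_devices *devs)
      {
          pthread_mutex_lock(&devs->lock);
          struct mp_hwdec_ctx *res = devs->dev;
          pthread_mutex_unlock(&devs->lock);
          return res;
      }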
* vaapi: determine surface format in decoder, not in renderer (wm4, 2016-04-11, 1 file, -0/+7)

  Until now, we have made the assumption that a driver will use only one
  hardware surface format. The format is dictated by the driver (you don't
  create surfaces with a specific format - you just pass a rt_format and
  get a surface that will be in a specific driver-chosen format).

  In particular, the renderer created a dummy surface to probe the format,
  and hoped the decoder would produce the same format. Due to a driver bug,
  this required a workaround to actually get the same format as the driver
  did.

  Change this so that the format is determined in the decoder. The format
  is then passed down as hw_subfmt, which allows the renderer to configure
  itself with the correct format. If the hardware surface changes its
  format midstream, the renderer can be reconfigured using the normal
  mechanisms.

  This calls va_surface_init_subformat() each time after the decoder
  returns a surface. Since libavcodec/AVFrame has no concept of
  sub-formats, this is unavoidable. It creates and destroys a derived
  VAImage, but this shouldn't have any bad performance effects (at least I
  didn't notice any measurable effects).

  Note that vaDeriveImage() failures are silently ignored, as some drivers
  (the vdpau wrapper) support neither vaDeriveImage nor EGL interop. In
  addition, we still probe whether we can map an image in the EGL interop
  code. This is important as it's the only way to determine whether EGL
  interop is supported at all. With respect to the driver bug mentioned
  above, it doesn't matter which format the test surface has.

  In vf_vavpp, also remove the rt_format guessing business. I think the
  existing logic was a bit meaningless anyway. It's not even a given that
  vavpp produces the same rt_format for output.
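  A sketch of how such a probe can look with libva, along the lines of what
  va_surface_init_subformat() is described to do (the helper name below is
  made up; failures are silently ignored, as the message says):

      #include <stdint.h>
      #include <va/va.h>

      static uint32_t probe_surface_fourcc(VADisplay dpy, VASurfaceID surface)
      {
          VAImage img;
          if (vaDeriveImage(dpy, surface, &img) != VA_STATUS_SUCCESS)
              return 0;  /* e.g. the vdpau wrapper: no vaDeriveImage */
          uint32_t fourcc = img.format.fourcc;  /* e.g. VA_FOURCC_NV12 */
          vaDestroyImage(dpy, img.image_id);
          return fourcc;
      }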
* vd_lavc: let hardware decoder request delaying frames explicitly (wm4, 2016-04-07, 1 file, -0/+1)

  Until now, the presence of the process_image() callback was used to set a
  delay queue with a hardcoded size. Change this to a vd_lavc_hwdec field
  instead, so the decoder can explicitly set this if it's really needed.

  Do this so process_image() can be used in the VideoToolbox glue code for
  something entirely unrelated.
* vd_lavc: fix codec vs. decoder confusion (wm4, 2016-04-07, 1 file, -4/+4)

  Some functions which expected a codec name (i.e. the name of the video
  format itself) were passed a decoder name. Most "native" libavcodec
  decoders have the same name as the codec, so this was never an issue.

  This should mean that e.g. using "--vd=lavc:h264_mmal --hwdec=mmal"
  should now actually enable native surface mode (instead of doing
  copy-back).
* vaapi: lower number of allocated surfaces again (wm4, 2016-01-26, 1 file, -1/+1)

  Commit b53cb8de increased this by the number of additionally delayed
  surfaces. But since this is only enabled in copy-back mode (which is what
  process_image is about), the additional surfaces accounted for in the
  direct-rendering case can be ignored.
* vd_lavc: delay images before reading them back (wm4, 2016-01-25, 1 file, -1/+1)

  Facilitates hardware pipelining, in particular with nvidia/dxva.
* vaapi: fix compilation on older FFmpeg/Libav (wm4, 2016-01-20, 1 file, -1/+1)

  They don't define FF_PROFILE_VP9_0.

  Fixes #2737.
* vaapi: add VP9 profile entries (BtbN, 2015-12-20, 1 file, -0/+7)
* vaapi: remove dependency on X11 (wm4, 2015-09-27, 1 file, -13/+58)

  There are at least 2 ways of using VAAPI without X11 (Wayland, DRM).
  Remove the X11 requirement from the decoder part and the EGL interop.
  This will be used by a following commit, which adds Wayland support.

  The worst part of this is the decoder part, which includes a bad hack for
  using the decoder without any VO interop (also known as "vaapi-copy"
  mode). Separate the X11 parts so that they're self-contained.

  For the EGL interop code we do something similar (it's kept slightly
  simpler, because it essentially only has to translate between our silly
  MPGetNativeDisplay abstraction and the vaGetDisplay...() call).
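  A sketch of what the display acquisition now has to abstract over, using
  libva's per-platform entry points (mpv routes this through its
  MPGetNativeDisplay abstraction rather than a single function like this):

      #include <va/va_x11.h>      /* vaGetDisplay() */
      #include <va/va_wayland.h>  /* vaGetDisplayWl() */
      #include <va/va_drm.h>      /* vaGetDisplayDRM() */

      static VADisplay open_va_display(Display *x11, struct wl_display *wl,
                                       int drm_fd)
      {
          if (x11)
              return vaGetDisplay(x11);
          if (wl)
              return vaGetDisplayWl(wl);
          if (drm_fd >= 0)
              return vaGetDisplayDRM(drm_fd);
          return NULL;  /* no native display available */
      }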
* vaapi: add HEVC profile entries (wm4, 2015-08-24, 1 file, -0/+10)

  libavcodec does not support HEVC via VAAPI yet, so this won't work.
  However, there is ongoing work to add HEVC support to VAAPI, and this
  change might help with testing. (Or maybe not - but there is no harm in
  this change.)
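  The general shape of such profile entries, pairing libavcodec's
  FF_PROFILE_* constants with libva's VAProfile values (an illustrative
  table; the actual table/macro in vaapi.c differs):

      #include <libavcodec/avcodec.h>
      #include <va/va.h>

      struct profile_entry {
          int ff_profile;        /* what libavcodec reports */
          VAProfile va_profile;  /* what we ask the driver for */
      };

      static const struct profile_entry hevc_profiles[] = {
          {FF_PROFILE_HEVC_MAIN,    VAProfileHEVCMain},
          {FF_PROFILE_HEVC_MAIN_10, VAProfileHEVCMain10},
      };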
* vd_lavc: remove unneeded hwdec parameters (wm4, 2015-08-19, 1 file, -3/+2)

  All hwdec backends now use a single pixel format, and the format is
  always checked. Also, the init_decoder callback is now mandatory.
* vaapi: allow allocating additional surfaces during decoding (wm4, 2015-07-15, 1 file, -3/+2)

  Fixes problems with --vo=opengl:interpolation. The issue here is that
  vo_opengl retains more surfaces than what was preallocated for the
  decoder. Until now, we just explicitly failed to decode frames for which
  no additional surfaces are available. Since modern drivers usually are
  fine with not "registering" surfaces before the decoder is created, just
  allow allocating additional surfaces if needed.

  (We also could probably recreate the HW decoder, since the HW decoder
  should be stateless. But let's try to avoid raising the overall
  complexity of the code.)
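  A minimal sketch of allocating one extra surface on demand once the
  preallocated pool is exhausted, relying on the driver behavior described
  above (4:2:0 rt_format assumed; attributes and error reporting omitted):

      #include <va/va.h>

      static VASurfaceID alloc_extra_surface(VADisplay dpy,
                                             unsigned w, unsigned h)
      {
          VASurfaceID id = VA_INVALID_ID;
          VAStatus st = vaCreateSurfaces(dpy, VA_RT_FORMAT_YUV420,
                                         w, h, &id, 1, NULL, 0);
          return st == VA_STATUS_SUCCESS ? id : VA_INVALID_ID;
      }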
* vaapi: increase number of additional surfaces (wm4, 2015-07-08, 1 file, -6/+2)

  Sometime recently, hardware decoding started to fail if h264 with full
  reference frames was decoded, and --vo=vaapi was used. VAAPI requires
  registering all surfaces that the decoder will ever use in advance, so if
  the playback chain uses more surfaces than originally allocated, we fail
  and drop back to software decoding.

  I'm not really sure why or when this started happening. Commit 7b9d7265
  for one is not the cause - it can be reproduced with earlier commits. It
  also seems to be timing dependent. Possibly it has to do with the way
  vo.c retains previous surfaces, and the way they can be queued/unqueued
  asynchronously.

  Increasing the number of reserved additional surfaces by 1 fixes it.
  (Though I have no idea where exactly all these surfaces are being used.
  Or rather, _when_.)
* video: reduce error message when loading hwdec backend fails (wm4, 2015-06-20, 1 file, -1/+1)

  When using --hwdec=auto, about half of all systems will print:

      [vdpau] Error when calling vdp_device_create_x11: 1

  This happens because usually mpv will be linked against both vdpau and
  vaapi libs, but the drivers are not necessarily available. Then trying to
  load a driver will fail. This is a normal part of probing, but the error
  messages were printed anyway. Silence them by explicitly distinguishing
  probing.

  This pretty much goes through all the layers. We actually consider
  loading hw backends for vo_opengl always "auto probed", even if a hw
  backend is explicitly requested. In this case vd_lavc will print a
  warning message anyway (adjust this message a bit).
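  A sketch of the "probing demotes errors" idea: the same failure logs at a
  lower severity while autoprobing (the function and level names here are
  made up, not mpv's mp_msg API):

      #include <stdio.h>

      static void log_load_failure(int probing, const char *api, int err)
      {
          fprintf(stderr, "[%s] %s: device creation failed: %d\n",
                  api, probing ? "v" : "error", err);
      }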
* vaapi: remove direct mapping non-sense (wm4, 2015-05-29, 1 file, -42/+6)

  This must have been some non-sense in the original vaapi mplayer patch.
  While I still have no good idea what this "direct mapping" business is
  about, it appears to be pretty much pointless. Nothing can hold
  additional "real" surface references (due to how the API and mpv/lavc
  refcounting work), so removing the additional surfaces won't break
  anything. It still could be that this was for achieving additional
  buffering (not reusing surfaces as soon), but we buffer some additional
  data anyway. Plus, the original intention of the vaapi mplayer code was
  probably increasing the surface count just by 1 or 2, not actually
  doubling it, and/or it was a "trick" to get to the maximum count of 21
  when h264 is in use.

  gstreamer-vaapi uses "ref_frames + SCRATCH_SURFACES_COUNT" here, with
  SCRATCH_SURFACES_COUNT defined to 4. It doesn't appear to check the
  overlay attributes at all in the decoder.

  In any case, remove this non-sense.
* video: un-discourage "vaapi-copy" hwdec mode (wm4, 2015-02-20, 1 file, -5/+0)

  Maybe I don't know what I'm doing. I'm fairly certain though that Intel
  does not know what they're doing.
* video: have a generic context struct for hwdec backends (wm4, 2015-01-22, 1 file, -5/+3)

  Before this commit, each hw backend had their own specific struct types
  for the context, and some, like VDA, had none at all. Add a context
  struct (mp_hwdec_ctx) that provides a somewhat generic way to pass the
  hwdec context around. Some things get slightly better, some slightly more
  verbose.

  mp_hwdec_info is still around; it's still needed, but is reduced to its
  role of handling delayed loading of the hwdec backend.
* vaapi: try dealing with Intel's braindamaged shit drivers (wm4, 2014-08-21, 1 file, -0/+21)

  So talking to a certain Intel dev, it sounded like modern VA-API drivers
  are reasonably thread-safe. But apparently that is not the case. Not at
  all. So add approximate locking around all vaapi API calls.

  The problem appeared once we moved decoding and display to different
  threads. That means the "vaapi-copy" mode was unaffected, but decoding
  with vo_vaapi or vo_opengl lead to random crashes.

  Untested on real Intel hardware. With the vdpau emulation, it seems to
  work fine - but actually it worked fine even before this commit, because
  vdpau was written and designed not by morons, but competent people (vdpau
  is guaranteed to be fully thread-safe).

  There is some probability that this commit doesn't fix things entirely.
  One problem is that locking might not be complete. For one, libavcodec
  _also_ accesses vaapi, so we have to rely on our own guesses how and when
  lavc uses vaapi (since we disable multithreading when doing hw decoding,
  our guess should be relatively good, but it's still a lavc implementation
  detail). One other reason that this commit might not help is Intel's
  amazing potential to fuckup anything that is good and holy.
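  A sketch of such coarse locking, assuming a single global mutex that
  serializes every libva call we make ourselves (mpv's actual wrapper is
  structured differently):

      #include <pthread.h>
      #include <va/va.h>

      static pthread_mutex_t va_lock = PTHREAD_MUTEX_INITIALIZER;

      /* Every vaapi entry point we call gets wrapped like this one. */
      static VAStatus locked_vaSyncSurface(VADisplay dpy, VASurfaceID s)
      {
          pthread_mutex_lock(&va_lock);
          VAStatus st = vaSyncSurface(dpy, s);
          pthread_mutex_unlock(&va_lock);
          return st;
      }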
* vaapi: we need more surfaces (wm4, 2014-08-18, 1 file, -1/+2)

  Playing with high framedrop could make it run out of surfaces. In theory,
  we wouldn't need an additional surface if we could just clear the
  vo_vaapi internal surface - but doing so would probably be a pain, so I
  don't care.
* video: move display and timing to a separate thread (wm4, 2014-08-12, 1 file, -2/+2)

  The VO is run inside its own thread. It also does most of video timing.
  The playloop hands the image data and a realtime timestamp to the VO, and
  the VO does the rest.

  In particular, this allows the playloop to do other things, instead of
  blocking for video redraw. But if anything accesses the VO during video
  timing, it will block.

  This also fixes vo_sdl.c event handling; but that is only a side-effect,
  since reimplementing the broken way would require more effort.

  Also drop --softsleep. In theory, this option helps if the kernel's
  sleeping mechanism is too inaccurate for video timing. In practice, I
  haven't ever encountered a situation where it helps, and it just burns
  CPU cycles. On the other hand it's probably actively harmful, because it
  prevents the libavcodec decoder threads from doing real work.

  Side note: Originally, I intended that multiple frames can be queued to
  the VO. But this is not done, due to problems with OSD and certain other
  features. OSD in particular is simply designed in a way that it can be
  neither timed nor copied, so you do have to render it into the video
  frame before you can draw the next frame. (Subtitles have no such
  restriction. sd_lavc was even updated to fix this.) It seems the right
  solution to queuing multiple VO frames is rendering on VO-backed
  framebuffers, like vo_vdpau.c does. This requires VO driver support, and
  is out of scope of this commit.

  As a consequence, the VO has a queue size of 1. The existing video queue
  is just needed to compute frame duration, and will be moved out in the
  next commit.
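  An illustrative single-slot handoff between the playloop and the VO
  thread (a sketch only; vo.c's real queue and timing logic are more
  involved):

      #include <pthread.h>

      struct frame_slot {
          pthread_mutex_t lock;
          pthread_cond_t wakeup;
          void *frame;  /* image data plus realtime display timestamp */
      };

      /* Playloop side: hand over one frame; blocks while the VO still
       * holds the previous one (queue size of 1). */
      static void slot_push(struct frame_slot *s, void *frame)
      {
          pthread_mutex_lock(&s->lock);
          while (s->frame)
              pthread_cond_wait(&s->wakeup, &s->lock);
          s->frame = frame;
          pthread_cond_signal(&s->wakeup);
          pthread_mutex_unlock(&s->lock);
      }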
* vaapi: fix uninitialized value read (wm4, 2014-08-11, 1 file, -1/+1)

  Found with valgrind. This is somewhat terrifying, because the VA-API API
  function is supposed to fill these values, and we access them only if the
  API functions return success. So this shouldn't have happened.
* vaapi: fix destruction with --hwdec=vaapi-copy (wm4, 2014-05-28, 1 file, -2/+6)

  This is incomplete; the video chain will still hold some vaapi objects
  after destroying the decoder and thus the vaapi context. This is very
  bad. Fixing it would require something like refcounting the vaapi
  context, but I don't really want to.
* video: warn if an emulated hwdec API is used (wm4, 2014-05-28, 1 file, -0/+5)

  mpv supports two hardware decoding APIs on Linux: vdpau and vaapi. Each
  of these has emulation wrappers. The wrappers are usually slower and have
  fewer features than their native counterparts. In particular the libva
  vdpau driver is practically unmaintained.

  Check the vendor string and print a warning if emulation is detected.
  Checking vendor strings is a very stupid thing to do, but I find the
  thought of people using an emulated API for no reason worse.

  Also, make --hwdec=auto never use an API that is detected as emulated.
  This doesn't work quite right yet, because once one API is loaded,
  vo_opengl doesn't unload it, so no hardware decoding will be used if the
  first probed API (usually vdpau) is rejected. But good enough.
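  A sketch of such a vendor-string check; the substring is a guess at how
  the libva vdpau wrapper identifies itself, in the same spirit as the
  check described above:

      #include <string.h>
      #include <va/va.h>

      static int is_emulated_vaapi(VADisplay dpy)
      {
          const char *vendor = vaQueryVendorString(dpy);
          /* very stupid, as the message says - but it works */
          return vendor && strstr(vendor, "VDPAU") != NULL;
      }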
* vaapi: make struct va_surface private (wm4, 2014-03-17, 1 file, -11/+8)

  It's not really needed to be public. Other code can just use mp_image.
  The only disadvantage is that the other code needs to call an accessor to
  get the VASurfaceID.
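  Hwaccel images typically carry the surface handle in a data pointer
  rather than in pixel memory; libavcodec, for example, stores the
  VASurfaceID in AVFrame.data[3]. A plausible shape for such an accessor
  (the real mpv function differs):

      #include <stdint.h>
      #include <va/va.h>

      static VASurfaceID surface_id_from_plane(uint8_t *data3)
      {
          return data3 ? (VASurfaceID)(uintptr_t)data3 : VA_INVALID_ID;
      }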
* vaapi: replace image pool implementation with mp_image_pool (wm4, 2014-03-17, 1 file, -30/+29)

  Although I at first thought it would be better to have a separate
  implementation for hwaccels because the differences to software images
  are too large, it turns out you can actually save some code with it.

  Note that the old implementation had a small memory management bug. This
  got painted over in commit 269c1e1, but is hereby solved properly.

  Also note that I couldn't test vf_vavpp.c (due to lack of hardware), and
  I hope I didn't accidentally break it.
* vd_lavc: remove compatibility crap (wm4, 2014-03-16, 1 file, -2/+2)

  All this code was needed for compatibility with very old libavcodec
  versions only (such as Libav 9). Includes some now-possible
  simplifications too.
* video: initialize hw decoder in get_format (wm4, 2014-03-10, 1 file, -24/+10)

  Apparently the "right" place to initialize the hardware decoder is in the
  libavcodec get_format callback.

  This doesn't change vda.c and vdpau_old.c, because I don't have OSX, and
  vdpau_old.c is probably going to be removed soon (if Libav ever manages
  to release Libav 10). So for now the init_decoder callback added with
  this commit is optional.

  This also means vdpau.c and vaapi.c don't have to manage and check the
  image parameters anymore.

  This change is probably needed for when libavcodec VDA support gets a new
  iteration of its API.
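  A reduced sketch of hooking decoder init into libavcodec's get_format
  callback (the init_decoder() call is stubbed out in a comment; the actual
  callback handles more formats and failure modes):

      #include <libavcodec/avcodec.h>

      static enum AVPixelFormat get_format_cb(AVCodecContext *avctx,
                                              const enum AVPixelFormat *fmts)
      {
          for (const enum AVPixelFormat *p = fmts; *p != AV_PIX_FMT_NONE; p++) {
              if (*p != AV_PIX_FMT_VAAPI)
                  continue;
              /* init_decoder(avctx, avctx->coded_width,
               *              avctx->coded_height) would create the VA
               * config/context here; on failure, fall back to software. */
              return *p;
          }
          return avcodec_default_get_format(avctx, fmts);
      }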
* vaapi: mp_msg conversions (wm4, 2013-12-21, 1 file, -26/+24)

  This ended up a little bit messy. In order to get a mp_log everywhere,
  mostly make use of the fact that va_surface already references global
  state anyway.
* Split mpvcore/ into common/, misc/, bstr/ (wm4, 2013-12-17, 1 file, -2/+2)
* vaapi: remove unused hw image formats, simplify (wm4, 2013-11-29, 1 file, -2/+2)

  PIX_FMT_VDA_VLD and PIX_FMT_VAAPI_VLD were never used anywhere. I'm not
  sure why they were even added, and they sound like they are just for
  compatibility with XvMC-style decoding, which sucks anyway.

  Now that there's only a single vaapi format, remove the IMGFMT_IS_VAAPI()
  macro. Also get rid of IMGFMT_IS_VDA(), which was unused.
* video: move struct mp_hwdec_info into its own header file (wm4, 2013-11-23, 1 file, -1/+1)

  This means most code accessing this struct must now include hwdec.h
  instead of dec_video.h. I just put it into dec_video.h at first because I
  thought a separate file would be a waste, but it's more proper to do it
  this way, as there are too many files which include dec_video.h only to
  get the mp_hwdec_info definition.
* vo_opengl: add support for VA-API OpenGL interop (wm4, 2013-11-04, 1 file, -0/+1)

  VA-API's OpenGL/GLX interop is pretty bad and perhaps slow (renders an
  X11 pixmap into a FBO, and has to go over X11, probably involves one or
  more copies), and this code serves more as an example, rather than for
  serious use. On the other hand, this might work much better than
  vo_vaapi, even if slightly slower.
* video: check profiles with hardware decoding (wm4, 2013-11-01, 1 file, -46/+31)

  We had some code for checking profiles earlier, which was removed in
  commits 2508f38 and adfb71b. These commits mentioned that (working) hw
  decoding was sometimes prevented due to profile checking, but I can't
  find the samples anymore that showed this behavior. Also, I changed my
  opinion, and I think checking the profiles is something that should be
  done for better fallback to software decoding behavior.

  The checks roughly follow VLC's vdpau profile checks, although we do not
  check codec levels. (VLC's profile checks aren't necessarily completely
  correct, but they're a welcome help anyway.)

  Add a --vd-lavc-check-hw-profile option, which skips the profile check.
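  A sketch of the fallback logic: accept hardware decoding only for
  profiles we know how to map, unless the user disables the check (the h264
  cases are illustrative; the real table covers more codecs):

      #include <stdbool.h>
      #include <libavcodec/avcodec.h>

      static bool hw_profile_ok(int ff_profile, bool check_hw_profile)
      {
          if (!check_hw_profile)  /* --vd-lavc-check-hw-profile=no */
              return true;
          switch (ff_profile) {
          case FF_PROFILE_H264_CONSTRAINED_BASELINE:
          case FF_PROFILE_H264_MAIN:
          case FF_PROFILE_H264_HIGH:
              return true;        /* a VAProfile mapping exists */
          default:
              return false;       /* fall back to software decoding */
          }
      }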
* vaapi: remove non-VLD entrypoints (wm4, 2013-09-29, 1 file, -6/+2)

  These probably don't work. libavcodec doesn't seem to support them, and
  neither did the original mplayer-vaapi patch.
* vaapi: fix non-sense condition (wm4, 2013-09-29, 1 file, -1/+1)

  Attempting signed comparison on unsigned value.
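  The bug class, illustrated (not the actual condition from vaapi.c):

      /* An unsigned value can never be negative, so this check is dead
       * code - compilers warn about it with -Wtype-limits. */
      static int is_error(unsigned int status)
      {
          return status < 0;  /* always 0 */
      }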
* vaapi: potentially make reading surfaces back to system RAM faster (wm4, 2013-09-27, 1 file, -1/+4)

  Don't allocate a VAImage and a mp_image every time. VAImages are cached
  in the surfaces themselves, and for mp_image an explicit pool is created.
  The retry loop runs only once for each surface now.

  This also makes use of vaDeriveImage() if possible.
* vaapi: allow GPU read-back with --hwdec=vaapi-copy (wm4, 2013-09-25, 1 file, -3/+105)

  This code is actually quite inefficient: it reuses the (slow, simple)
  screenshot code. It uses an inefficient method to read the image
  (vaGetImage() instead of vaDeriveImage()), allocates new memory for each
  frame that is read, and it tries all image formats again each time.

  Also, in my tests it always picked NV12 as image format, which is not
  ideal if you actually want to filter the video, and vo_xv can't handle
  this format without conversion either.

  However, a user confirmed that it worked for him, so everything is fine.
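  A sketch of the slow read-back path this commit uses, assuming the caller
  created *img with vaCreateImage() in a format the driver supports (the
  commit above this one switches to vaDeriveImage() where possible):

      #include <va/va.h>

      static int copy_surface(VADisplay dpy, VASurfaceID s,
                              unsigned w, unsigned h,
                              VAImage *img, void **out_data)
      {
          if (vaSyncSurface(dpy, s) != VA_STATUS_SUCCESS)
              return -1;
          if (vaGetImage(dpy, s, 0, 0, w, h, img->image_id)
                  != VA_STATUS_SUCCESS)
              return -1;
          return vaMapBuffer(dpy, img->buf, out_data) == VA_STATUS_SUCCESS
                 ? 0 : -1;
      }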
* vaapi: add vf_vavpp and use it for deinterlacing (xylosper, 2013-09-25, 1 file, -36/+29)

  Merged from pull request #246 by xylosper. Minor cosmetic changes, some
  adjustments (compatibility with older libva versions), and manpage
  additions by wm4.

  Signed-off-by: wm4 <wm4@nowhere>
* vaapi: use highest available profile, instead of mapping it exactly (wm4, 2013-08-19, 1 file, -41/+38)

  Now the code does the same as the original MPlayer VAAPI patch, instead
  of trying to map the profiles exactly. See previous commit for
  justification and discussion.
* video/decode: pass parameters directly to hwdec allocate_image callback (wm4, 2013-08-15, 1 file, -6/+2)

  Instead of passing AVFrame. This also moves the mysterious logic about
  the size of the allocated image to common code, instead of duplicating it
  everywhere.
* video: add vaapi decode and output support (wm4, 2013-08-12, 1 file, -0/+414)

  This is based on the MPlayer VA API patches. To be exact it's based on a
  very stripped down version of commit f1ad459a263f8537f6c from
  git://gitorious.org/vaapi/mplayer.git.

  This doesn't contain useless things like benchmarking hacks and the demo
  code for GLX interop. Also, unlike in the original patch, decoding and
  video output are split into separate source files (the separation between
  decoding and display also makes pixel format hacks unnecessary). On the
  other hand, some features not present in the original patch were added,
  like screenshot support.

  VA API is rather bad for actual video output. Dealing with older libva
  versions or the completely broken vdpau backend doesn't help. OSD is low
  quality and should be rather slow. In some cases, only either OSD or
  subtitles can be shown at the same time (because OSD is drawn first, OSD
  is preferred).

  Also, libva can't decide whether it accepts straight or premultiplied
  alpha for OSD sub-pictures: the vdpau backend seems to assume
  premultiplied, while a native vaapi driver uses straight. So I picked
  straight alpha. It doesn't matter much, because the blending code for
  straight alpha I added to img_convert.c is probably buggy, and ASS
  subtitles might be blended incorrectly.

  Really good video output with VA API would probably use OpenGL and the GL
  interop features, but at this point you might just use vo_opengl.
  (Patches for making HW decoding work with vo_opengl have a chance of
  being accepted.)

  Despite these issues, decoding seems to work ok. I still got tearing on
  the Intel system I tested (Intel(R) Core(TM) i3-2350M). It was also
  tested with the vdpau vaapi wrapper on a nvidia system; however this was
  rather broken. (Fortunately, there is no reason to use mpv's VAAPI
  support over native VDPAU.)
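  For reference, the two alpha conventions at issue in the OSD discussion
  above, per channel with alpha in [0,1] (a worked illustration, not the
  code added to img_convert.c):

      /* Straight alpha: the source channel is stored at full intensity. */
      static float blend_straight(float src, float dst, float a)
      {
          return src * a + dst * (1.0f - a);
      }

      /* Premultiplied alpha: the source channel already contains src*a. */
      static float blend_premultiplied(float src_premul, float dst, float a)
      {
          return src_premul + dst * (1.0f - a);
      }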