summaryrefslogtreecommitdiffstats
path: root/video/hwdec.h
Commit message (Collapse)AuthorAgeFilesLines
* hwdec: fix 2 commentswm42017-05-241-2/+1
| | | | The first is outdated, the second was always wrong.
* vdpau: crappy hack to allow initializing hw decoding after preemptionwm42017-05-191-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If vo_opengl is used, and vo_opengl already created the vdpau interop (for whatever reasons), and then preemption happens, and then you try to enable hw decoding, it failed. The reason was that preemption recovery is not run at any point before libavcodec accesses the vdpau device. The actual impact was that with libmpv + opengl-cb use, hardware decoding was permanently broken after display mode switching (something that caused the display to get preempted at least with older drivers). With mpv CLI, you can for example enable hw decoding during playback, then disable it, VT switch to console, switch back to X, and try to enable hw decoding again. This is mostly because libav* does not deal with preemption, and NVIDIA driver preemption behavior being horrible garbage. In addition to being misdesigned API, the preemption callback is not called before you try to access vdpau API, and then only with _some_ accesses. In summary, the preemption callback was never called, neither before nor after libavcodec tried to init the decoder. So we have to get mp_vdpau_handle_preemption() called before libavcodec accesses it. This in turn will do a dummy API access which usually triggers the preemption callback immediately (with NVIDIA's drivers). In addition, we have to update the AVHWDeviceContext's device. In theory it could change (in practice it usually seems to use handle "0"). Creating a new device would cause chaos, as we don't have a concept of switching the device context on the fly. So we simply update it directly. I'm fairly sure this violates the libav* API, but it's the best we can do.
* video: fix a typo in a commentwm42017-03-231-1/+1
|
* vd_lavc, vaapi: move hw device creation to generic codewm42017-02-201-0/+6
| | | | | | | | hw_vaapi.c didn't do much interesting anymore. Other than the function to create a device for decoding with vaapi-copy, everything can be done by generic code. Other libavcodec hwaccels are planned to provide the same API as vaapi. It will be possible to drop the other hw_ files in the future. They will use this generic code instead.
* videotoolbox: remove weird format-negotiation between VO and decoderwm42017-02-171-6/+1
| | | | | | | | | | | | | | | | Originally, there was probably some sort of intention to restrict it to formats supported by the interop, or something. But in the end it was overcomplicated nonsense. In the future, we could use mp_hwdec_ctx.supported_formats or other mechanisms to handle this in a better way. mp_hwdec_ctx.ctx is not set to a dummy pointer - hwdec_devices_load() is only used to detect whether to vo_opengl interop is present, and the common hwdec code expects that the .ctx field is not NULL. This also changes videotoolbox-copy to use --videotoolbox-format, instead of the FFmpeg-set default.
* hwdec: add a AVBufferRef(AVHWDeviceContext) fieldwm42017-01-161-0/+3
| | | | This makes "generic" interaction with libav* components easier.
* vo_opengl, vaapi: properly probe 10 bit rendering supportwm42017-01-131-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are going to be users who have a Mesa installation which do not support 10 bit, but a GPU which can decode to 10 bit. So it's probably better not to hardcode whether it is supported. Introduce a more general way to signal supported formats from renderer to decoder. Obviously this is imperfect, because it still isn't part of proper format negotation (for example, what if there's a vavpp filter, which accepts anything). Still slightly better than before. I don't know any way to probe for vaapi dmabuf/EGL dmabuf support properly (in particular testing specific formats, not just general availability). So we stay with the current approach and try to create and map dummy surfaces on init to probe for support. Overdo it and check all formats that AVHWFramesConstraints reports, instead of only NV12 and P010 surfaces. Since we can support unknown formats now, add explicitly checks to the EGL/dmabuf mapper code to reject unsupported formats. I also noticed that libavutil signals support for RGB0/BGR0, but couldn't get it to work. Remove the DRM formats that are unused/didn't work the way I tried to use them. With this, 10 bit decoding + rendering should work, provided you have a capable CPU and a patched Mesa. The required Mesa patch adds support for the R16 and GR32 formats. It was sent by a Kodi developer to the Mesa developer mailing list and was not accepted yet.
* vo_opengl: hwdec_cuda: Use dynamic loading for cuda functionsPhilip Langdale2016-11-231-0/+1
| | | | | This change applies the pattern used in ffmpeg to dynamically load cuda, to avoid requiring the CUDA SDK at build time.
* video: add --hwdec=vdpau-copy modewm42016-10-201-0/+1
| | | | | | | At this point, all other hwaccels provide -copy modes, and vdpau is the exception with not having one. Although there is vf_vdpaurb, it's less convenient in certain situations, and exposes some issues with the filter chain code as well.
* vd_lavc: Add hwdec wrapper for crystalhdPhilip Langdale2016-10-151-0/+1
| | | | This hardware decodes to system memory so it only requires a wrapper.
* rpi: add --hwdec=rpi-copywm42016-09-301-0/+1
| | | | | | This means it can be used with normal video filters. Might help out with #3604.
* hwdec_cuda: Add trivial cuda-copy wrapperPhilip Langdale2016-09-111-0/+1
| | | | | | | | | The cuvid decoder already knows how to copy back to system memory if NV12 frames are requested, and this will happen if the decoder is used without the hwdec. For convenience, let's add a wrapper hwdec so people don't have to explicitly pick the cuvid decoder if they want this behaviour.
* hwdec/opengl: Add support for CUDA and cuvid/NvDecodePhilip Langdale2016-09-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nvidia's "NvDecode" API (up until recently called "cuvid" is a cross platform, but nvidia proprietary API that exposes their hardware video decoding capabilities. It is analogous to their DXVA or VDPAU support on Windows or Linux but without using platform specific API calls. As a rule, you'd rather use DXVA or VDPAU as these are more mature and well supported APIs, but on Linux, VDPAU is falling behind the hardware capabilities, and there's no sign that nvidia are making the investments to update it. Most concretely, this means that there is no VP8/9 or HEVC Main10 support in VDPAU. On the other hand, NvDecode does export vp8/9 and partial support for HEVC Main10 (more on that below). ffmpeg already has support in the form of the "cuvid" family of decoders. Due to the design of the API, it is best exposed as a full decoder rather than an hwaccel. As such, there are decoders like h264_cuvid, hevc_cuvid, etc. These decoders support two output paths today - in both cases, NV12 frames are returned, either in CUDA device memory or regular system memory. In the case of the system memory path, the decoders can be used as-is in mpv today with a command line like: mpv --vd=lavc:h264_cuvid foobar.mp4 Doing this will take advantage of hardware decoding, but the cost of the memcpy to system memory adds up, especially for high resolution video (4K etc). To avoid that, we need an hwdec that takes advantage of CUDA's OpenGL interop to copy from device memory into OpenGL textures. That is what this change implements. The process is relatively simple as only basic device context aquisition needs to be done by us - the CUDA buffer pool is managed by the decoder - thankfully. The hwdec looks a bit like the vdpau interop one - the hwdec maintains a single set of plane textures and each output frame is repeatedly mapped into these textures to pass on. The frames are always in NV12 format, at least until 10bit output supports emerges. The only slightly interesting part of the copying process is that CUDA works by associating PBOs, so we need to define these for each of the textures. TODO Items: * I need to add a download_image function for screenshots. This would do the same copy to system memory that the decoder's system memory output does. * There are items to investigate on the ffmpeg side. There appears to be a problem with timestamps for some content. Final note: I mentioned HEVC Main10. While there is no 10bit output support, NvDecode can return dithered 8bit NV12 so you can take advantage of the hardware acceleration. This particular mode requires compiling ffmpeg with a modified header (or possibly the CUDA 8 RC) and is not upstream in ffmpeg yet. Usage: You will need to specify vo=opengl and hwdec=cuda. Note that hwdec=auto will probably not work as it will try to use vdpau first. mpv --hwdec=cuda --vo=opengl foobar.mp4 If you want to use filters that require frames in system memory, just use the decoder directly without the hwdec, as documented above.
* videotoolbox: add --hwdec=videotoolbox-copy for h/w accelerated decoding ↵Aman Gupta2016-07-151-0/+1
| | | | with video filters
* video: add --hwdec=auto-copy modewm42016-05-111-1/+4
| | | | | | | | This uses the normal autoprobing rules like "auto", but rejects anything that isn't flagged as copying data back to system memory. The chunk in command.c was dead code, so remove it instead of updating it.
* video: refactor how VO exports hwdec device handleswm42016-05-091-25/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | The main change is with video/hwdec.h. mp_hwdec_info is made opaque (and renamed to mp_hwdec_devices). Its accessors are mainly thread-safe (or documented where not), which makes the whole thing saner and cleaner. In particular, thread-safety rules become less subtle and more obvious. The new internal API makes it easier to support multiple OpenGL interop backends. (Although this is not done yet, and it's not clear whether it ever will.) This also removes all the API-specific fields from mp_hwdec_ctx and replaces them with a "ctx" field. For d3d in particular, we drop the mp_d3d_ctx struct completely, and pass the interfaces directly. Remove the emulation checks from vaapi.c and vdpau.c; they are pointless, and the checks that matter are done on the VO layer. The d3d hardware decoders might slightly change behavior: dxva2-copy will not use the VO device anymore if the VO supports proper interop. This pretty much assumes that any in such cases the VO will not use any form of exclusive mode, which makes using the VO device in copy mode unnecessary. This is a big refactor. Some things may be untested and could be broken.
* command: change some hwdec propertieswm42016-05-041-0/+1
| | | | | | | | Introduce hwdec-current and hwdec-interop properties. Deprecate hwdec-detected, which never made a lot of sense, and which is replaced by the new properties. hwdec-active also becomes useless, as hwdec-current is a superset, so it's deprecated too (for now).
* vo_opengl: D3D11VA + ANGLE interopwm42016-04-271-0/+1
| | | | | | | | | | | | | | | | | | | This uses ID3D11VideoProcessor to convert the video to a RGBA surface, which is then bound to ANGLE. Currently ANGLE does not provide any way to bind nv12 surfaces directly, so this will have to do. ID3D11VideoContext1 would give us slightly more control about the colorspace conversion, though it's still not good, and not available in MinGW headers yet. The video processor is created lazily, because we need to have the coded frame size, of which AVFrame and mp_image have no concept of. Doing the creation lazily is less of a pain than somehow hacking the coded frame size into mp_image. I'm not really sure how ID3D11VideoProcessorInputView is supposed to work. We recreate it on every frame, which is simple and hopefully doesn't affect performance.
* hwdec: remove numbers from enumwm42016-04-271-10/+10
| | | | | | | They don't actually mean anything. Just HWDEC_NONE should remain 0, because it's the default init value for structs etc.
* videotoolbox: change how videotoolbox format is managedwm42016-04-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | The underlying intention of this code is to make changing --videotoolbox-format at runtime work. For this reason, the format can't just be statically setup, but must be read from the option at runtime. This means the format is not fixed anymore, and we have to make sure the renderer is property reinitialized if the format changes. There is currently no way to trigger reinit on this level, which is why the mp_image_params.hw_subfmt field was introduced. One sketchy thing remains: normally, the renderer is supposed to be involved with VO format negotiation, which would ensure that the VO can take the format at all. Since the hw_subfmt is not part of this format negotiation, it's implied the get_vt_fmt() callback only returns formats supported by the renderer. This is not necessarily clear because vo_opengl checks this with converted_imgfmt separately. None of this matters in practice though, because we know all formats are always supported. (This still requires somehow triggering decoder reinit to make the change effective.)
* vd_lavc: add d3d11va hwdecKevin Mitchell2016-03-301-2/+3
| | | | | | This commit adds the d3d11va-copy hwdec mode using the ffmpeg d3d11va api. Functions in common with dxva2 are handled in a separate decode/d3d.c file. A future commit will rewrite decode/dxva2.c to share this code.
* Add a mediacodec decoder hwdec wrapperJan Ekström2016-03-251-0/+1
| | | | | Does the same thing as the rpi one - makes fallback possible by pretending that h264_mediacodec is a hwdec.
* dxva2: add interop (non-copyback) hwdec_typeKevin Mitchell2016-02-171-2/+3
| | | | | This always falls back to software decoding right now. VO support will be added in future commits.
* video: remove VDA supportwm42015-09-281-1/+0
| | | | | | | | | VideoToolbox is preferred. Now that FFmpeg released 2.8, there's no reason to support VDA anymore. In fact, we had a bug that made VDA not useable with older FFmpeg versions in some newer mpv releases. VideoToolbox is supported even on slightly older OSX versions, and if not, you still can run mpv without hw decoding.
* hwdec: add VideoToolbox supportSebastien Zwickert2015-08-051-0/+1
| | | | | | | | VDA is being deprecated in OS X 10.11 so this is needed to keep hwdec working. The code needs libavcodec support which was added recently (to FFmpeg git, libav doesn't support it). Signed-off-by: Stefano Pigozzi <stefano.pigozzi@gmail.com>
* options: cleanup hwdec name mappingswm42015-07-071-1/+6
| | | | | | | | | Now there's a "canonical" table for mapping the names, that other code can use, without having to rely too much on option code magic. Also, use the central HWDEC constants, instead of magic values. (There used to be semi-ok reasons to do this, but now it makes no sense anymore.)
* vo_direct3d, dxva2: use the same D3D devicewm42015-07-031-0/+1
| | | | | Since we still read-back (and don't have hard plans on changing this), this doesn't have much of an advantage.
* RPI supportwm42015-03-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This requires FFmpeg git master for accelerated hardware decoding. Keep in mind that FFmpeg must be compiled with --enable-mmal. Libav will also work. Most things work. Screenshots don't work with accelerated/opaque decoding (except using full window screenshot mode). Subtitles are very slow - even simple but huge overlays can cause frame drops. This always uses fullscreen mode. It uses dispmanx and mmal directly, and there are no window managers or anything on this level. vo_opengl also kind of works, but is pretty useless and slow. It can't use opaque hardware decoding (copy back can be used by forcing the option --vd=lavc:h264_mmal). Keep in mind that the dispmanx backend is preferred over the X11 ones in case you're trying on X11; but X11 is even more useless on RPI. This doesn't correctly reject extended h264 profiles and thus doesn't fallback to software decoding. The hw supports only up to the high profile, and will e.g. return garbage for Hi10P video. This sets a precedent of enabling hw decoding by default, but only if RPI support is compiled (which most hopefully it will be disabled on desktop Linux platforms). While it's more or less required to use hw decoding on the weak RPI, it causes more problems than it solves on real platforms (Linux has the Intel GPU problem, OSX still has some cases with broken decoding.) So I can live with this compromise of having different defaults depending on the platform. Raspberry Pi 2 is required. This wasn't tested on the original RPI, though at least decoding itself seems to work (but full playback was not tested).
* command: add property returning detected hwdec APIwm42015-02-021-0/+13
| | | | | | | | | This is somewhat imperfect, because detection of hw decoding APIs is mostly done on demand, and often avoided if not necessary. (For example, we know very well that there are no hw decoders for certain codecs.) This also requires every hwdec backend to identify itself (see hwdec.h changes).
* video: handle hwdec screenshots differentlywm42015-01-221-0/+9
| | | | | | | | Instead of converting the hw surface to an image in the VO, provide a generic way to convet hw surfaces, and use this in the screenshot code. It's all relatively straightforward, except vdpau is being terrible. It needs a huge chunk of new code, because copying back is not simple.
* video: have a generic context struct for hwdec backendswm42015-01-221-3/+15
| | | | | | | | | | | Before this commit, each hw backend had their own specific struct types for context, and some, like VDA, had none at all. Add a context struct (mp_hwdec_ctx) that provides a somewhat generic way to pass the hwdec context around. Some things get slightly better, some slightly more verbose. mp_hwdec_info is still around; it's still needed, but is reduced to its role of handling delayed loading of the hwdec backend.
* video: move display and timing to a separate threadwm42014-08-121-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The VO is run inside its own thread. It also does most of video timing. The playloop hands the image data and a realtime timestamp to the VO, and the VO does the rest. In particular, this allows the playloop to do other things, instead of blocking for video redraw. But if anything accesses the VO during video timing, it will block. This also fixes vo_sdl.c event handling; but that is only a side-effect, since reimplementing the broken way would require more effort. Also drop --softsleep. In theory, this option helps if the kernel's sleeping mechanism is too inaccurate for video timing. In practice, I haven't ever encountered a situation where it helps, and it just burns CPU cycles. On the other hand it's probably actively harmful, because it prevents the libavcodec decoder threads from doing real work. Side note: Originally, I intended that multiple frames can be queued to the VO. But this is not done, due to problems with OSD and other certain features. OSD in particular is simply designed in a way that it can be neither timed nor copied, so you do have to render it into the video frame before you can draw the next frame. (Subtitles have no such restriction. sd_lavc was even updated to fix this.) It seems the right solution to queuing multiple VO frames is rendering on VO-backed framebuffers, like vo_vdpau.c does. This requires VO driver support, and is out of scope of this commit. As consequence, the VO has a queue size of 1. The existing video queue is just needed to compute frame duration, and will be moved out in the next commit.
* video: move struct mp_hwdec_info into its own header filewm42013-11-231-0/+20
This means most code accessing this struct must now include hwdec.h instead of dec_video.h. I just put it into dec_video.h at first because I thought a separate file would be a waste, but it's more proper to do it this way, as there are too many files which include dec_video.h only to get the mp_hwdec_info definition.