path: root/video/out/gl_video_shaders.glsl
Commit message (author, date, files changed, lines -removed/+added)
* vo_opengl: add smoothmotion frame blending (Stefano Pigozzi, 2015-01-23, 1 file, -1/+13)
    SmoothMotion is a way to time and blend frames made popular by MadVR. Its intended behaviour is to remove stuttering caused by mismatches between the display refresh rate and the video fps, while preserving the video's original artistic qualities (no soap opera effect). It's supposed to make 24fps video playback on 60hz monitors as close as possible to a 24hz monitor.

    Instead of drawing a frame once its pts has passed the vsync time, we redraw at the display refresh rate, and if we detect that the vsync falls between two frames we interpolate them (depending on their position relative to the vsync). We interpolate as few frames as possible to keep blur to a minimum. For example, if we were to play back a 1fps video on a 60hz monitor, we would blend on at most 1 vsync for each frame (while the other 59 vsyncs would be rendered as is).

    Frame interpolation is always done before scaling and in linear light when possible (an ICC profile is used, or :srgb is used).
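The blending rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `blend_weight`, its parameters, and the linear weighting by the share of the vsync interval covered by the next frame are hypothetical and are not taken from the commit itself.

```python
def blend_weight(vsync_start, vsync_duration, next_pts):
    """Weight of the *next* frame for one vsync interval (hypothetical helper).

    0.0 means "draw only the previous frame", 1.0 means "draw only the
    next frame". A value in between is produced only when the frame change
    (next_pts) falls inside the interval, so at most one vsync per frame
    change is blended.
    """
    vsync_end = vsync_start + vsync_duration
    if next_pts >= vsync_end:
        # The next frame is not due during this vsync: no blending.
        return 0.0
    if next_pts <= vsync_start:
        # The next frame is already fully due: no blending.
        return 1.0
    # Assumed rule: weight equals the fraction of the interval that the
    # next frame covers.
    return (vsync_end - next_pts) / vsync_duration
```

With a 1fps video on a 60hz display, only the single vsync interval that contains the frame change yields a weight strictly between 0 and 1; all other intervals return 0.0 or 1.0 and are drawn as is.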
* vo_opengl: rename all scale options to make more sense (Niklas Haas, 2015-01-22, 1 file, -4/+4)
    This emphasizes the fact that scale is used for *all* image upscaling, with cscale only serving a minor role for subsampled material.
* vo_opengl: switch to nearest neighbour for trivial resampling (Niklas Haas, 2015-01-22, 1 file, -1/+1)
    This is significantly faster for FBOs on most modern GPUs, although it did not result in a huge difference for the video source texture on the sizes I tested. It might be more significant for 1080p or 4K content, so it's worth revisiting this in the future.

    It also renames SAMPLE_BILINEAR to SAMPLE_TRIVIAL to match the semantics.
* vo_opengl: implement naive anti-ringing (Niklas Haas, 2015-01-22, 1 file, -6/+17)
    This is not quite the same thing as madVR's antiringing algorithm, but it essentially does something similar.

    Porting madVR's approach to elliptic coordinates will take some amount of thought.
* vo_opengl: unroll ewa_lanczos to avoid looping and unnecessary samples (Niklas Haas, 2015-01-22, 1 file, -8/+7)
    This improves performance by something like 10%, since it omits unnecessary checks. This will also make adding anti-ringing easier.
* vo_opengl: clean up ewa_lanczos code (Niklas Haas, 2015-01-22, 1 file, -4/+7)
    This fixes compatibility with GLES 2.0 and makes the code a bit neater in general. It also properly forces indirect scaling for subsampled video regardless of the lscale setting.
* vo_opengl: handle grayscale input better, add YA16 support (wm4, 2015-01-21, 1 file, -5/+0)
    Simply clamp off the U/V components in the colormatrix, instead of doing something special in the shader.

    Also, since YA8/YA16 gave a plane_bits value of 16/32, and a colormatrix calculation overflowed with 32, add a component_bits field to the image format descriptor, which for YA8/YA16 returns 8/16 (the wrong value had no bad consequences otherwise).
* vo_opengl: remove 1D texture usage (wm4, 2015-01-18, 1 file, -3/+1)
    Broke operation with GLSL. Since 1D texture usage was apparently (and mysteriously) good for speed, it might be added back, but it's unknown how to do so in a clean way.
* vo_opengl: get rid of approx-gamma and make it the default as per BT.1886 (Niklas Haas, 2015-01-16, 1 file, -29/+25)
    After finding out more about how video mastering is done in the real world it dawned upon me why the "hack" we figured out in #534 looks so much better.

    Since mastering studios have historically been using only CRTs, the practice adopted for backwards compatibility was to simulate CRT responses even on modern digital monitors, a practice so ubiquitous that the ITU-R formalized it in Rec. BT.1886 to be precisely gamma 2.40.

    As such, we finally have enough proof to get rid of the option altogether and just always do that.

    The value 1.961 is a rounded version of my experimentally obtained approximation of the BT.709 curve, which resulted in a value of around 1.9610336. This is the closest average match to the source brightness while preserving the nonlinear response of the BT.1886 ideal monitor.

    For playback in dark environments, it's expected that the gamma shift should be reproduced by a user controlled setting, up to a maximum of 1.224 (2.4/1.961) for a pitch black environment.

    More information: https://developer.apple.com/library/mac/technotes/tn2257/_index.html
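The arithmetic behind the quoted maximum gamma shift can be checked with a short sketch. The constant and function names below are made up for illustration; only the two numeric values come from the commit message.

```python
# Values quoted in the commit message above.
BT1886_GAMMA = 2.4          # display gamma formalized by BT.1886
APPROX_BT709_GAMMA = 1.961  # rounded pure-power approximation of BT.709


def max_gamma_shift():
    """Ratio between the display gamma and the source approximation.

    This is the extra gamma a user setting would have to supply in a
    pitch black environment, according to the commit message.
    """
    return BT1886_GAMMA / APPROX_BT709_GAMMA


print(round(max_gamma_shift(), 3))  # prints 1.224
```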
* vo_opengl: add ewa_lanczos upscaler (aka jinc) (Niklas Haas, 2015-01-15, 1 file, -0/+21)
    This is the polar (elliptic weighted average) version of lanczos. This introduces a general new form of polar filters.
* video: Add sigmoidal upscaling to avoid ringing artifacts (Niklas Haas, 2015-01-09, 1 file, -0/+11)
    This avoids issues when upscaling directly in linear light, and is the recommended way to upscale images according to imagemagick.

    The default slope of 6.5 offers a reasonable compromise between ringing artifacts eliminated and ringing artifacts introduced by sigmoid-upscaling. Same goes for the default center of 0.75.
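As a rough illustration of the idea, the sketch below builds a normalized logistic curve and its inverse using the defaults named above (center 0.75, slope 6.5). The formulation is one common way to write a sigmoidal transfer curve and is an assumption here; the names `make_sigmoid`, `sigmoidize` and `desigmoidize` are hypothetical, and the commit's actual shader code is not shown in this log.

```python
import math


def make_sigmoid(center=0.75, slope=6.5):
    """Return (sigmoidize, desigmoidize) for values in [0, 1].

    The logistic curve is rescaled so that 0 maps to 0 and 1 maps to 1.
    """
    # Value of the raw logistic curve at 0, and its total range over [0, 1].
    offset = 1.0 / (1.0 + math.exp(slope * center))
    scale = 1.0 / (1.0 + math.exp(slope * (center - 1.0))) - offset

    def sigmoidize(x):
        # Linear light -> sigmoidal space (inverse of the logistic curve).
        return center - math.log(1.0 / (x * scale + offset) - 1.0) / slope

    def desigmoidize(y):
        # Sigmoidal space -> linear light (the logistic curve itself).
        return (1.0 / (1.0 + math.exp(slope * (center - y))) - offset) / scale

    return sigmoidize, desigmoidize
```

In this scheme an upscaler would apply `sigmoidize` to linear-light samples, run the scaling filter, and apply `desigmoidize` to the result, so that filter overshoot is compressed near black and white.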
* video: Remove some stale CMS code, minor cosmetics (Niklas Haas, 2015-01-07, 1 file, -8/+4)
    This removes an old code path that was disabled in 016bb14.
* vo_opengl: remove obsolete comment in shader (wm4, 2015-01-04, 1 file, -1/+1)
* vo_opengl: improve fallback handling with GLES (wm4, 2014-12-21, 1 file, -6/+0)
    Whether we have texture_rg doesn't matter much anymore; the scaler should be fine with this.

    But on ES 2.0, 1st class arrays are missing, so even if filterable float textures should be available, it won't work.

    Dithering (at least the "fruit" variant) will not work either, because it uses floats.
* vo_opengl: GLES does not support GL_BGRA (wm4, 2014-12-20, 1 file, -1/+1)
    Apparently GLES 2 and 3 do not support this. (The implementations I tested with were derived from desktop OpenGL and were not overly strict with this.)

    This is no problem; just use GL_RGBA and mangle the channels in the shader.

    Also disable direct support for image formats like IMGFMT_RGB555 with GLES; at least some of them are not supported in this form, and the formats aren't important anyway.
* vo_opengl: add GLES 2 support (wm4, 2014-12-19, 1 file, -4/+15)
    Rather basic support. Almost nothing works, and even if it does, it's bound to be inefficient (due to texture upload).

    This was tested with the nVidia desktop binary drivers, which provide GLES 2 support only. However, nVidia is not known to be very strict about OpenGL, and the driver is very new too, so the vo_opengl code will have bugs too.
* vo_opengl: do not use 4x3 matrix (wm4, 2014-12-18, 1 file, -2/+3)
    This was a nice trick to get the mpv colormatrix directly into OpenGL, because the memory representation happened to match.

    Unfortunately, OpenGL ES 2 doesn't have glUniformMatrix4x3fv(). Even more unfortunately, the memory representation is now incompatible. It would be nice to change it, but that would mean getting into a big mess.
* vo_opengl: simplify the case without texture_rg (wm4, 2014-12-18, 1 file, -9/+7)
    If GL_RED was not available, we used GL_ALPHA. But this is an unnecessary complication, and it's easier to use GL_LUMINANCE instead. With the latter, a texture will return the .r component set, and as long as the shader doesn't look at the other components, the shader doesn't need any changes.

    Some of the changes added in 0e8fbdbd are now unneeded.

    Also, realign the entire gl_byte_formats_legacy table.
* vo_opengl: GLES 3 support (wm4, 2014-12-17, 1 file, -0/+4)
    Tested with MESA on software emulation. Seems to work well, although the default FBO format in opengl-hq disables most interesting features. I have no idea how well it will work on real hardware (or if it does at all).

    Unfortunately, some features, including playback of 10 bit video, are not supported. Not sure what to do about this.

    GLES 2 or 1 do not work.
* vo_opengl: glsl: stricter typing (wm4, 2014-12-17, 1 file, -20/+20)
    Older GLSL dialects as well as GLES3 do not support the following things in expressions:
    - implicit conversions of integer constants to float
    - arithmetic of float*vecN
* vo_opengl: remove requirement for RG textures (wm4, 2014-12-16, 1 file, -13/+21)
    Features not supported are disabled (although with a misleading error message).
* vo_opengl: use all filter sizes possible with the shaders (wm4, 2014-12-08, 1 file, -5/+0)
    Not all filter sizes the shaders could handle were in the filter_sizes list. The shader can handle any multiple of 4 (the sizes 2 and 6 are special-cased to keep it simple). Add all possible filter sizes, up to 64. 64 is ridiculously high anyway.

    Most of the larger filter sizes are completely useless for upscaling, but help with the fancy-downscaling option. (Although it would still be more efficient to use cascaded scalers to handle downscaling better.)

    I considered doing something less stupid than the hardcoded array, but it seems this is still the simplest solution.
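The rule stated above (every multiple of 4 up to 64, plus the special-cased sizes 2 and 6) can be written out as a small generator. This is only a sketch of the rule; the commit itself keeps a hardcoded array, and the function name `filter_sizes` is made up for illustration.

```python
def filter_sizes(max_size=64):
    """List the filter sizes the shaders can handle, per the rule above.

    Multiples of 4 up to max_size, plus the two special-cased sizes.
    """
    special_cased = {2, 6}
    multiples_of_4 = set(range(4, max_size + 1, 4))
    return sorted(special_cased | multiples_of_4)


print(filter_sizes())
```

With the default limit this yields 18 sizes, starting at 2, 4, 6, 8, 12 and ending at 64.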
* vo_opengl: refactor: instantiate scaler functions at runtime (wm4, 2014-12-08, 1 file, -34/+17)
    Before this commit, the convolution scaler shader functions were pre-instantiated in the shader file. For every filter size, a corresponding function (with the filter size as suffix) had to be present.

    Change this, and make the C code emit the necessary bits. This means the shader code is much reduced. (Although hopefully it doesn't make shader compilation faster - it would require a really dumb compiler if it spends its time on dead code.) It also makes it more flexible, which is the main goal.

    The DEF_SCALER0 stuff is needed because the C code writes the header of the shader, at a point where scaler macros are not defined yet.
* vo_opengl: never use 1D textures for lookup textures (wm4, 2014-12-08, 1 file, -29/+27)
    This was a microoptimization for small filters which need 4 or less weights per sample point. When I originally wrote this code, using a 1D texture seemed to give a slight speed gain, but now I couldn't measure any difference.

    Remove this to simplify the code.
* vo_opengl: refactor: merge convolution function and sampler entrypoint (wm4, 2014-12-08, 1 file, -67/+36)
    There's not much of a reason to have the actual convolution code in a separate function. Merging them actually simplifies the code a bit, and gets rid of the repetitious macro invocations to define the functions for each filter size.

    There should be no changes in behavior or output.
* vo_opengl: extend filter size to 64 (wm4, 2014-12-06, 1 file, -0/+5)
    For better downscaling.

    Maybe the list of filter sizes shouldn't be static...
* vo_opengl: extend filter size to 32 (wm4, 2014-12-06, 1 file, -22/+21)
    Also replace the weights calculations for 8/12/16 with the generic weight function definition macro. (The weights 2/4/6 follow slightly different rules.)
* vo_opengl: Linearize non-RGB sRGB files correctly (eg. JPEG) (Niklas Haas, 2014-11-26, 1 file, -0/+16)
    Signed-off-by: wm4 <wm4@nowhere>
* vo_opengl: Reword comment in shader (Niklas Haas, 2014-11-26, 1 file, -2/+3)
    I didn't quite understand this comment after looking at the code again months later, so I reworded it for better clarity.
* vo_opengl: draw OSD twice in 3D mode case (wm4, 2014-10-29, 1 file, -1/+2)
    Apparently this is needed for correct 3D mode subtitles. In general, it seems you need to duplicate the whole "GUI", so it's done for all OSD elements.

    This doesn't handle the "duplication" of the mouse pointer. Instead, the mouse can be used for the top/left field only. Also, it's possible that we should "compress" the OSD in the direction it's duplicated, but I don't know about that.

    Fixes #1124, at least partially.
* vo_opengl: remove macro operator from shader (Bin Jin, 2014-08-29, 1 file, -16/+2)
    Removes '##' operator from OpenGL shader code.
* vo_opengl: fix shader (wm4, 2014-08-28, 1 file, -7/+9)
    Regression since commit f14722a4. For some reason, this worked on nvidia, but rightfully failed on mesa.

    At least in C, the ## operator indeed needs two macro arguments, and you can't just concatenate with non-arguments. This change will most likely fix it.

    CC: @bjin
* vo_opengl: add cparam1 and cparam2 options (Bin Jin, 2014-08-26, 1 file, -7/+22)
    Although cscale is rarely used, it's possible that params of cscale are accidentally set to lparam1 and lparam2, which might cause unexpected results.
* vo_opengl: Make approx-gamma affect OSD/sub (Niklas Haas, 2014-06-22, 1 file, -3/+4)
    Close #837

    Signed-off-by: wm4 <wm4@nowhere>
* video: Generate an accurate CMS matrix instead of hard-coding (Niklas Haas, 2014-06-22, 1 file, -18/+13)
    This also avoids an extra matrix multiplication when using :srgb, making that path both more efficient and also eliminating more hard-coded values.

    In addition, the previously hard-coded XYZ to RGB matrix will be dynamically generated.
* video: Support BT.2020 constant luminance system (Niklas Haas, 2014-06-22, 1 file, -5/+43)
    Signed-off-by: wm4 <wm4@nowhere>
* video: Add support for non-BT.709 primaries (Niklas Haas, 2014-06-22, 1 file, -4/+30)
    This adds support for reading primary information from lavc, categorized into BT.601-525, BT.601-625, BT.709 and BT.2020, and passes it on to the vo.

    In vo_opengl, we always generate the 3dlut against the wider BT.2020 and transform our source into this colorspace in the shader.
* video: Add BT.2020-NCL colorspace and transfer function (Niklas Haas, 2014-06-22, 1 file, -9/+14)
    Source: http://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.2020-0-201208-I!!PDF-E.pdf
* vo_opengl: Simplify and clarify color correction code (Niklas Haas, 2014-03-10, 1 file, -14/+27)
    This commit:
    - Changes some of the #define and variable names for clarification and adds comments where appropriate.
    - Unifies :srgb and :icc-profile, making them fit into the same step of the decoding process and removing the weird interactions between both of them.
    - Makes :icc-profile take precedence over :srgb (to significantly reduce the number of confusing and useless special cases).
    - Moves BT709 decompanding (approximate or actual) to the shader in all cases, making it happen before upscaling (instead of the old 0.45 gamma function). This is the simpler and more proper way to do it.
    - Enables the approx gamma function to work with :srgb as well due to this (since they now share the gamma expansion code).
    - Renames :icc-approx-gamma to :approx-gamma since it is no longer tied to the ICC options or LittleCMS.
    - Uses gamma 2.4 as input space for the actual 3DLUT. This is now a pretty arbitrary factor, but I picked 2.4 mainly because a higher pure power value here seems to produce visually better results with wide gamut profiles, rather than the previous 1.95 or BT.709.
    - Adds the input gamma space to the 3dlut cache header in case we change it more in the future, or even make it user customizable (though I don't see why the latter would really be necessary).
    - Fixes the OSD's gamma when using :srgb, which was previously still using the old (0.45) approximation in all cases.
    - Updates documentation on :srgb, which was still mentioning the old behavior from circa a year ago.

    This commit should serve to both open up and make the CMS/shader code much more accessible and less confusing/error-prone and simultaneously also improve the performance of 3DLUTs with wide gamut color spaces.

    I would have liked to make it more modular, but almost all of these changes are interdependent, save for the documentation updates.

    Note: Right now, the "3DLUT takes precedence over SRGB" logic is just coded into gl_lcms.c's compile_shaders function. Ideally, this should be done earlier, when parsing the options (by overriding the actual opts.srgb flag) and output a warning to the user.

    Note: I'm not sure how well this works together with real-world subtitles that may need to be color corrected as well. I'm not sure whether :approx-gamma needs to apply to subtitles as well. I'll need to test this on proper files later.

    Note: As of now, linear light scaling is still intrinsically tied to either :srgb or :icc-profile. It would be conceivable to have this as an extra option, :linear-scaling or similar, that could be used with or without the two color management options.
* vo_opengl: Use bt709_expand on OSD for :srgb (Niklas Haas, 2014-03-10, 1 file, -1/+1)
    This affects the OSD only when :srgb is enabled; that path previously still used the old gamma approximation of 2.22.
* vo_opengl: make :srgb decompand the BT.709 values correctly (nand, 2014-02-12, 1 file, -2/+15)
    This is the same issue as addressed by 257d9f1, except this time for the :srgb option as well. (257d9f1 only addressed :icc-profile)

    The conditions of the srgb_compand mix() call are also flipped to prevent an off-by-one error.
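For context, BT.709 decompanding means inverting the piecewise BT.709 transfer curve rather than applying a single power function. The sketch below uses the standard BT.709 constants; the Python function names are chosen for illustration only, and the shader code changed by this commit is not shown in this log.

```python
def bt709_compand(linear):
    """BT.709 transfer curve: linear light in [0, 1] -> encoded value."""
    if linear < 0.018:
        return 4.5 * linear                    # linear segment near black
    return 1.099 * linear ** 0.45 - 0.099      # power segment


def bt709_expand(encoded):
    """Inverse of bt709_compand: encoded value in [0, 1] -> linear light."""
    if encoded < 0.081:                        # 0.081 = 4.5 * 0.018
        return encoded / 4.5
    return ((encoded + 0.099) / 1.099) ** (1.0 / 0.45)
```

A shader version would typically select between the two segments per channel, which is where a branch-free selection such as mix() with a threshold test comes in.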
* vo_opengl: add support for rectangle textures (wm4, 2013-12-01, 1 file, -13/+21)
    This allows vo_opengl to use GL_TEXTURE_RECTANGLE textures, either by enabling it with the 'rectangle-textures' sub-option, or by having a hwdec backend force it. By default it's off.

    The _only_ reason we're adding this is because VDA can export rectangle textures only.
* vo_opengl: blend alpha components by default (wm4, 2013-09-19, 1 file, -0/+3)
    Improves display of images and video with alpha channel, especially if the transparent regions contain (supposed to be invisible) garbage color values.
* gl_video: add support for more rgb formats (wm4, 2013-07-18, 1 file, -16/+12)
    Until now, only formats directly supported by OpenGL were supported. This excludes various permutations of 8-bit RGB[A|0]. But we can simply permute the color channels in the shader, so do that. This also adds support for all these weird RGB0 formats.

    Note that we could use libavutil's pixfmt list instead of the mp_packed_formats array, but trying to decrypt the pixfmt info would probably end in pain, so this array with duplicated information is actually better and shorter.

    Note: I didn't actually test whether the alpha components are reproduced correctly with alpha formats.
* Merge remote-tracking branch 'origin/low_quality_intel_crap' (Martin Herkt, 2013-07-08, 1 file, -13/+16)
|\
    Conflicts:
        video/out/gl_video_shaders.glsl
| * Merge branch 'master' into low_quality_intel_crap (wm4, 2013-04-30, 1 file, -13/+16)
    Conflicts:
        video/out/gl_video_shaders.glsl
        video/out/vo_opengl.c
* | vo_opengl: handle chroma location (wm4, 2013-06-28, 1 file, -3/+5)
    Use the video decoder chroma location flags and render chroma locations other than centered. Until now, we've always used the intuitive and obvious centered chroma location, but H.264 uses something else.

    FFmpeg provides a small overview in libavcodec/avcodec.h:

    -----------
    /**
     *  X   X      3 4 X      X are luma samples,
     *             1 2        1-6 are possible chroma positions
     *  X   X      5 6 X      0 is undefined/unknown position
     */
    enum AVChromaLocation{
        AVCHROMA_LOC_UNSPECIFIED = 0,
        AVCHROMA_LOC_LEFT        = 1, ///< mpeg2/4, h264 default
        AVCHROMA_LOC_CENTER      = 2, ///< mpeg1, jpeg, h263
        AVCHROMA_LOC_TOPLEFT     = 3, ///< DV
        AVCHROMA_LOC_TOP         = 4,
        AVCHROMA_LOC_BOTTOMLEFT  = 5,
        AVCHROMA_LOC_BOTTOM      = 6,
        AVCHROMA_LOC_NB             , ///< Not part of ABI
    };
    -----------

    The visual difference is literally minimal, but since videophiles apparently consider this detail as quality mark of a video renderer, support it anyway. We don't bother with chroma locations other than centered and left, though.

    Not sure about correctness, but it's probably ok.
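To make the difference between the two handled locations concrete, the sketch below returns how far the chroma sample sits from the centered position, in luma pixels, for 4:2:0 material. It follows the diagram quoted above (position 1 is horizontally aligned with the left luma column, position 2 is centered). The function name, the string keys and the sign convention are assumptions for illustration, not the commit's code.

```python
def chroma_offset(location):
    """Offset (x, y) of the chroma sample relative to the centered
    position, in luma pixel units, assuming 4:2:0 subsampling.

    Only "left" and "center" are distinguished, matching the commit's
    statement that other locations are not handled; anything else is
    treated as centered.
    """
    if location == "left":
        # Left siting: chroma aligned with the left luma column, half a
        # luma pixel to the left of the centered position; vertically
        # it is still centered between the two luma rows.
        return (-0.5, 0.0)
    return (0.0, 0.0)
```

A renderer would shift the chroma texture coordinates by this amount (converted to texture units) before sampling.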
* | gl_video: explicitly clamp colormatrix output (wm4, 2013-06-03, 1 file, -0/+1)
    This could lead to quite visible artifacts when using an appropriate ICC and float FBOs. The float FBOs allow storing out of range values, and my guess is that the rest of the processing chain elevated these out of range values, resulting in artifacts.
* | gl_video: change a GLSL statement (wm4, 2013-05-30, 1 file, -1/+1)
    This might be better with dumb shader compilers, which won't vectorize this to a single vector-division, assuming the hardware does have such an instruction. Affects "bicubic_fast" scale mode only.
* | gl_video: fix some dithering bugs (wm4, 2013-05-30, 1 file, -2/+3)
    The internal texture format GL_RED is typically 8 bit, which is clearly not good enough for the new dither matrix. The idea was to use a float texture format, but this was somehow "forgotten". Use GL_R16, since 16 bit textures are more robust, and provide more precision for the same memory usage.

    Change how the offset for centering the dither matrix is applied. This is needed for making it possible to round up values to the target depth. Before this commit, this changed the output even if the input was exact and input and output depth were the same, which i