| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SmoothMotion is a way to time and blend frames made popular by MadVR. It's
intended behaviour is to remove stuttering caused by mismatches between the
display refresh rate and the video fps, while preserving the video's original
artistic qualities (no soap opera effect). It's supposed to make 24fps video
playback on 60hz monitors as close as possible to a 24hz monitor.
Instead of drawing a frame once once it's pts has passed the vsync time, we
redraw at the display refresh rate, and if we detect the vsync is between two
frames we interpolated them (depending on their position relative to the vsync).
We actually interpolate as few frames as possible to avoid a blur effect as
much as possible. For example, if we were to play back a 1fps video on a 60hz
monitor, we would blend at most on 1 vsync for each frame (while the other 59
vsyncs would be rendered as is).
Frame interpolation is always done before scaling and in linear light when
possible (an ICC profile is used, or :srgb is used).
|
|
|
|
|
| |
This emphasizes the fact that scale is used for *all* image upscaling,
with cscale only serving a minor role for subsampled material.
|
|
|
|
|
|
|
|
|
|
| |
This is significantly faster for FBOs on most modern GPUs, although it
did not result in a huge difference for the video source texture on the
sizes I tested. It might be more significant for 1080p or 4K content, so
it's worth revisiting this in the future.
It also renames SAMPLE_BILINEAR to SAMPLE_TRIVIAL to match the
semantics.
|
|
|
|
|
|
|
|
| |
This is not quite the same thing as madVR's antiringing algorithm, but
it essentially does something similar.
Porting madVR's approach to elliptic coordinates will take some amount
of thought.
|
|
|
|
|
|
|
| |
This speeds up performance by a factor of something like 10%,
since it omits unnecessary checks.
This will also make adding anti-ringing easier.
|
|
|
|
|
|
| |
This fixes compatibility with GLES 2.0 and makes the code a bit neater
in general. It also properly forces indirect scaling for subsampled
video regardless of the lscale setting.
|
|
|
|
|
|
|
|
|
|
| |
Simply clamp off the U/V components in the colormatrix, instead of doing
something special in the shader.
Also, since YA8/YA16 gave a plane_bits value of 16/32, and a colormatrix
calculation overflowed with 32, add a component_bits field to the image
format descriptor, which for YA8/YA16 returns 8/16 (the wrong value had
no bad consequences otherwise).
|
|
|
|
|
|
|
| |
Broke operation with GLSL.
Since 1D texture usage was apparently (and mysteriously) good for speed,
it might be added back, but it's unknown how to do so in a clean way.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After finding out more about how video mastering is done in the real
world it dawned upon me why the "hack" we figured out in #534 looks so
much better.
Since mastering studios have historically been using only CRTs, the
practice adopted for backwards compatibility was to simulate CRT
responses even on modern digital monitors, a practice so ubiquitous that
the ITU-R formalized it in R-Rec BT.1886 to be precisely gamma 2.40.
As such, we finally have enough proof to get rid of the option
altogether and just always do that.
The value 1.961 is a rounded version of my experimentally obtained
approximation of the BT.709 curve, which resulted in a value of around
1.9610336. This is the closest average match to the source brightness
while preserving the nonlinear response of the BT.1886 ideal monitor.
For playback in dark environments, it's expected that the gamma shift
should be reproduced by a user controlled setting, up to a maximum of
1.224 (2.4/1.961) for a pitch black environment.
More information:
https://developer.apple.com/library/mac/technotes/tn2257/_index.html
|
|
|
|
|
| |
This is the polar (elliptic weighted average) version of lanczos.
This introduces a general new form of polar filters.
|
|
|
|
|
|
|
|
|
| |
This avoids issues when upscaling directly in linear light, and is the
recommended way to upscale images according to imagemagick.
The default slope of 6.5 offers a reasonable compromise between
ringing artifacts eliminated and ringing artifacts introduced by
sigmoid-upscaling. Same goes for the default center of 0.75.
|
|
|
|
| |
This removes an old code path that was disabled in 016bb14.
|
| |
|
|
|
|
|
|
|
|
|
| |
Whether we have texture_rg doesn't matter much anymore; the scaler
should be fine with this. But on ES 2.0, 1st class arrays are missing,
so even if filterable float textures should be available, it won't work.
Dithering (at least the "fruit" variant) will not work either, because
it uses floats.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently GLES 2 and 3 do not support this. (The implementations I
tested with were derived from desktop OpenGL and were not overly strict
with this.)
This is no problem; just use GL_RGBA and mangle the channels in the
shader.
Also disable direct support for image formats like IMGFMT_RGB555 with
GLES; at least some of them are not supported in this form, and the
formats aren't important anyway.
|
|
|
|
|
|
|
|
| |
Rather basic support. Almost nothing works, and even if it does, it's
bound to be inefficient (due to texture upload). This was tested with
the nVidia desktop binary drivers, which provide GLES 2 support only.
However, nVidia is not known to be very strict about OpenGL, and the
driver is very new too, so the vo_opengl code will have bugs too.
|
|
|
|
|
|
|
|
|
|
|
| |
This was a nice trick to get the mpv colormatrix directly into OpenGL,
because the memory representation happened to match.
Unfortunately, OpenGL ES 2 doesn't have glUniformMatrix4x3fv().
Even more unfortunately, the memory representation is now incompatible.
It would be nice to change it, but that would mean getting into a big
mess.
|
|
|
|
|
|
|
|
|
|
|
|
| |
If GL_RED was not available, we used GL_ALPHA. But this is an
unnecessary complication, and it's easier to use GL_LUMINANCE instead.
With the latter, a texture will return the .r component set, and as long
as the shader doesn't look at the other components, the shader doesn't
need any changes.
Some of the changes added in 0e8fbdbd are now unneeeded.
Also, realign the entire gl_byte_formats_legacy table.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tested with MESA on software emulation. Seems to work well, although the
default FBO format in opengl-hq disables most interesting features. I
have no idea how well it will work on real hardware (or if it does at
all).
Unfortunately, some features, including playback of 10 bit video, are
not supported. Not sure what to do about this.
GLES 2 or 1 do not work.
|
|
|
|
|
|
|
|
| |
Older GLSL dialects as well as GLES3 do not support the following things
in expressions:
- implicit conversions of integer constants to float
- arithmetic of float*vecN
|
|
|
|
|
| |
Features not supported are disabled (although with a misleading error
message).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Not all filter sizes the shaders could handle were in the filter_sizes
list. The shader can handle any multiple of 4 (the sizes 2 and 6 are
special-cased to keep it simple).
Add all possible filter sizes, up to 64. 64 is ridiculously high anyway.
Most of the larger filter sizes are completely useless for upscaling,
but help with the fancy-downscaling option. (Although it would still be
more efficient to use cascaded scalers to handle downscaling better.)
I considered doing something less stupid than the hardcoded array, but
it seems this is still the simplest solution.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this commit, the convolution scaler shader functions were pre-
instantiated in the shader file. For every filter size, a corresponding
function (with the filter size as suffix) had to be present.
Change this, and make the C code emit the necessary bits.
This means the shader code is much reduced. (Although hopefully it
doesn't make shader compilation faster - it would require a really dumb
compiler if it spends its time on dead code.)
It also makes it more flexible, which is the main goal.
The DEF_SCALER0 stuff is needed because the C code writes the header of
the shader, at a point where scaler macros are not defined yet.
|
|
|
|
|
|
|
|
|
| |
This was a microoptimization for small filters which need 4 or less
weights per sample point. When I originally wrote this code, using a 1D
texture seemed to give a slight speed gain, but now I couldn't measure
any difference.
Remove this to simplify the code.
|
|
|
|
|
|
|
|
|
| |
There's not much of a reason to have the actual convolution code in a
separate function. Merging them actually simplifies the code a bit, and
gets rid of the repetitious macro invocations to define the functions
for each filter size.
There should be no changes in behavior or output.
|
|
|
|
|
|
| |
For better downscaling.
Maybe the list of filter sizes shouldn't be static...
|
|
|
|
|
|
| |
Also replace the weights calculations for 8/12/16 with the generic
weight function definition macro. (The weights 2/4/6 follow slightly
different rules.)
|
|
|
|
| |
Signed-off-by: wm4 <wm4@nowhere>
|
|
|
|
|
| |
I didn't quite understand this comment after looking at the code again
months later, so I reworded it for better clarity.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently this is needed for correct 3D mode subtitles. In general,
it seems you need to duplicate the whole "GUI", so it's done for all
OSD elements.
This doesn't handle the "duplication" of the mouse pointer. Instead,
the mouse can be used for the top/left field only. Also, it's possible
that we should "compress" the OSD in the direction it's duplicated, but
I don't know about that.
Fixes #1124, at least partially.
|
|
|
|
| |
Removes '##' operator from OpenGL shader code.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Regression since commit f14722a4. For some reason, this worked on
nvidia, but rightfully failed on mesa.
At least in C, the ## operator indeed needs two macro arguments, and
you can't just concatenate with non-arguments.
This change will most likely fix it.
CC: @bjin
|
|
|
|
|
|
| |
Although cscale is rarely used, it's possible that params of cscale
are accidentally set to lparam1 and lparam2, which might cause
unexpected results.
|
|
|
|
|
|
| |
Close #837
Signed-off-by: wm4 <wm4@nowhere>
|
|
|
|
|
|
|
|
|
| |
This also avoids an extra matrix multiplication when using :srgb, making
that path both more efficient and also eliminating more hard-coded
values.
In addition, the previously hard-coded XYZ to RGB matrix will be
dynamically generated.
|
|
|
|
| |
Signed-off-by: wm4 <wm4@nowhere>
|
|
|
|
|
|
|
| |
This add support for reading primary information from lavc, categorized
into BT.601-525, BT.601-625, BT.709 and BT.2020; and passes it on to the
vo. In vo_opengl, we always generate the 3dlut against the wider BT.2020
and transform our source into this colorspace in the shader.
|
|
|
|
| |
Source: http://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.2020-0-201208-I!!PDF-E.pdf
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit:
- Changes some of the #define and variable names for clarification and
adds comments where appropriate.
- Unifies :srgb and :icc-profile, making them fit into the same step of
the decoding process and removing the weird interactions between both
of them.
- Makes :icc-profile take precedence over :srgb (to significantly reduce
the number of confusing and useless special cases)
- Moves BT709 decompanding (approximate or actual) to the shader in all
cases, making it happen before upscaling (instead of the old 0.45
gamma function). This is the simpler and more proper way to do it.
- Enables the approx gamma function to work with :srgb as well due to
this (since they now share the gamma expansion code).
- Renames :icc-approx-gamma to :approx-gamma since it is no longer tied
to the ICC options or LittleCMS.
- Uses gamma 2.4 as input space for the actual 3DLUT, this is now a
pretty arbitrary factor but I picked 2.4 mainly because a higher pure
power value here seems to produce visually better results with wide
gamut profiles, rather then the previous 1.95 or BT.709.
- Adds the input gamma space to the 3dlut cache header in case we change
it more in the future, or even make it user customizable (though I
don't see why the latter would really be necessary).
- Fixes the OSD's gamma when using :srgb, which was previously still
using the old (0.45) approximation in all cases.
- Updates documentation on :srgb, it was still mentioning the old
behavior from circa a year ago.
This commit should serve to both open up and make the CMS/shader code much
more accessible and less confusing/error-prone and simultaneously also
improve the performance of 3DLUTs with wide gamut color spaces.
I would liked to have made it more modular but almost all of these
changes are interdependent, save for the documentation updates.
Note: Right now, the "3DLUT takes precedence over SRGB" logic is just
coded into gl_lcms.c's compile_shaders function. Ideally, this should be
done earlier, when parsing the options (by overriding the actual
opts.srgb flag) and output a warning to the user.
Note: I'm not sure how well this works together with real-world
subtitles that may need to be color corrected as well. I'm not sure
whether :approx-gamma needs to apply to subtitles as well. I'll need to
test this on proper files later.
Note: As of now, linear light scaling is still intrinsically tied to
either :srgb or :icc-profile. It would be thinkable to have this as an
extra option, :linear-scaling or similar, that could be used with or
without the two color management options.
|
|
|
|
|
| |
This affects the OSD only when :srgb is enabled, this still used the old
gamma approximation of 2.22 previously.
|
|
|
|
|
|
|
|
| |
This is the same issue as addressed by 257d9f1, except this time for
the :srgb option as well. (257d9f1 only addressed :icc-profile)
The conditions of the srgb_compand mix() call are also flipped to
prevent an off-by-one error.
|
|
|
|
|
|
|
|
|
| |
This allows vo_opengl to use GL_TEXTURE_RECTANGLE textures, either by
enabling it with the 'rectangle-textures' sub-option, or by having a
hwdec backend force it. By default it's off.
The _only_ reason we're adding this is because VDA can export rectangle
textures only.
|
|
|
|
|
|
| |
Improves display of images and video with alpha channel, especially if
the transparent regions contain (supposed to be invisible) garbage
color values.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now, only formats directly supported by OpenGL were supported.
This excludes various permutations of 8-bit RGB[A|0]. But we can simply
permutate the color channels in the shader, so do that. This also adds
support for all these weird RGB0 formats.
Note that we could use libavutil's pixfmt list instead of the
mp_packed_formats array, but trying to decrypt the pixfmt info would
probably end in pain, so this array with duplicated information is
actually better and shorter.
Note: I didn't actually test whether the alpha components are reproduced
correctly with alpha formats.
|
|\
| |
| |
| |
| | |
Conflicts:
video/out/gl_video_shaders.glsl
|
| |
| |
| |
| |
| |
| | |
Conflicts:
video/out/gl_video_shaders.glsl
video/out/vo_opengl.c
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Use the video decoder chroma location flags and render chroma locations
other than centered. Until now, we've always used the intuitive and
obvious centered chroma location, but H.264 uses something else.
FFmpeg provides a small overview in libavcodec/avcodec.h:
-----------
/**
* X X 3 4 X X are luma samples,
* 1 2 1-6 are possible chroma positions
* X X 5 6 X 0 is undefined/unknown position
*/
enum AVChromaLocation{
AVCHROMA_LOC_UNSPECIFIED = 0,
AVCHROMA_LOC_LEFT = 1, ///< mpeg2/4, h264 default
AVCHROMA_LOC_CENTER = 2, ///< mpeg1, jpeg, h263
AVCHROMA_LOC_TOPLEFT = 3, ///< DV
AVCHROMA_LOC_TOP = 4,
AVCHROMA_LOC_BOTTOMLEFT = 5,
AVCHROMA_LOC_BOTTOM = 6,
AVCHROMA_LOC_NB , ///< Not part of ABI
};
-----------
The visual difference is literally minimal, but since videophiles
apparently consider this detail as quality mark of a video renderer,
support it anyway. We don't bother with chroma locations other than
centered and left, though.
Not sure about correctness, but it's probably ok.
|
| |
| |
| |
| |
| |
| |
| | |
This could lead to quite visible artifacts when using an appropriate ICC
and float FBOs. The float FBOs allow storing out of range values, and my
guess is that the rest of the precessing chain elevated these out of
range values, resulting in artifacts.
|
| |
| |
| |
| |
| |
| | |
This might be better with dumb shader compilers, which won't vectorize
this to a single vector-division, assuming the hardware does have such
an instruction. Affects "bicubic_fast" scale mode only.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The internal texture format GL_RED is typically 8 bit, which is clearly
not good enough for the new dither matrix. The idea was to use a float
texture format, but this was somehow "forgotten". Use GL_R16, since
16 bit textures are more robust, and provide more precision for the
same memory usage.
Change how the offset for centering the dither matrix is applied. This
is needed for making it possible to round up values to the target depth.
Before this commit, this changed the output even if the input was exact
and input and output depth were the same, which i |