From f24c2e0f56fdcef9b14c4a2ed15c4e9e801bbeab Mon Sep 17 00:00:00 2001
From: Niklas Haas
Date: Tue, 20 Jan 2015 20:56:40 +0100
Subject: vo_opengl: always prefer indirect scaling

This is better even for non-separable. The only exception is when using
bilinear for both lscale and cscale.

I've fixed the documentation/comments to make more sense.
---
 DOCS/man/vo.rst      | 23 ++++++++++++++---------
 video/out/gl_video.c | 19 +++++--------------
 2 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/DOCS/man/vo.rst b/DOCS/man/vo.rst
index 5e03361bf9..f1e69c2b99 100644
--- a/DOCS/man/vo.rst
+++ b/DOCS/man/vo.rst
@@ -420,17 +420,22 @@ Available video output drivers are:
 
     ``no-scale-sep``
         When using a separable scale filter for luma, usually two filter
-        passes are done. This is often faster. However, it forces
-        conversion to RGB in an extra pass, so it can actually be slower
-        if used with fast filters on small screen resolutions. Using
-        this options will make rendering a single operation.
-        Note that chroma scalers are always done as 1-pass filters.
+        passes are done, and when using ``cscale`` chroma information is also
+        scaled separately from luma. This is often faster and better for
+        most image scalers. However, the extra passes and preprocessing logic
+        can actually make it slower if used with fast filters on small screen
+        resolutions. Using this option will make rendering a single operation
+        if possible, often at the cost of performance or image quality.
+
+        It's safe to enable this if using ``bilinear`` for both ``lscale``
+        and ``cscale``.
 
     ``cscale=``
-        As ``lscale``, but for chroma (2x slower with little visible effect).
-        Note that with some scaling filters, upscaling is always done in
-        RGB. If chroma is not subsampled, this option is ignored, and the
-        luma scaler is used instead. Setting this option is often useless.
+        As ``lscale``, but for interpolating chroma information. If the image
+        is not subsampled, this option is ignored entirely. Note that the
+        implementation is currently always done as a single pass, so using
+        it with separable filters will result in slow performance for very
+        little visible benefit.
 
     ``lscale-down=``
         Like ``lscale``, but apply these filters on downscaling
diff --git a/video/out/gl_video.c b/video/out/gl_video.c
index ddccd3a3e5..4ab41d8076 100644
--- a/video/out/gl_video.c
+++ b/video/out/gl_video.c
@@ -1194,11 +1194,7 @@ static void compile_shaders(struct gl_video *p)
         shader_setup_scaler(&header_final, &p->scalers[0], -1);
     }
 
-    // We want to do scaling in linear light. Scaling is closely connected to
-    // texture sampling due to how the shader is structured (or if GL bilinear
-    // scaling is used). The purpose of the "indirect" pass is to convert the
-    // input video to linear RGB.
-    // Another purpose is reducing input to a single texture for scaling.
+    // The indirect pass is used to preprocess the image before scaling.
     bool use_indirect = p->opts.indirect;
 
     // Don't sample from input video textures before converting the input to
@@ -1206,15 +1202,10 @@ static void compile_shaders(struct gl_video *p)
     if (use_input_gamma || use_conv_gamma || use_linear_light || use_const_luma)
         use_indirect = true;
 
-    // It doesn't make sense to scale the chroma with cscale in the 1. scale
-    // step and with lscale in the 2. step. If the chroma is subsampled, a
-    // convolution filter wouldn't even work entirely correctly, because the
-    // luma scaler would sample two texels instead of one per tap for chroma.
-    // Also, even with 4:4:4 YUV or planar RGB, the indirection might be faster,
-    // because the shader can't use one scaler for sampling from 3 textures. It
-    // has to fetch the coefficients for each texture separately, even though
-    // they're the same (this is not an inherent restriction, but would require
-    // to restructure the shader).
+    // If the video is subsampled, chroma information needs to be pulled up to
+    // the input size before scaling can be done. Even for 4:4:4 or planar RGB
+    // this is also faster because it means the scalers can operate on all
+    // channels simultaneously. Disabling scale_sep overrides this behavior.
     if (p->opts.scale_sep && p->plane_count > 1)
         use_indirect = true;
 
-- 
cgit v1.2.3
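
Reviewer note: the decision this patch leaves in `compile_shaders()` can be read as a small pure function. The sketch below is only an illustration of that reading. `struct indirect_inputs` and `want_indirect()` are hypothetical names that do not exist in gl_video.c; the fields simply mirror the variables visible in the hunks above.

```c
#include <stdbool.h>

// Hypothetical container for the values the patched code consults.
// None of these names are part of gl_video.c; they mirror the patch.
struct indirect_inputs {
    bool opt_indirect;      // stands in for p->opts.indirect
    bool opt_scale_sep;     // stands in for p->opts.scale_sep
    int  plane_count;       // stands in for p->plane_count
    bool use_input_gamma;
    bool use_conv_gamma;
    bool use_linear_light;
    bool use_const_luma;
};

// Returns true if the indirect (preprocessing) pass would be enabled,
// following the three steps shown in the diff.
static bool want_indirect(const struct indirect_inputs *in)
{
    // Start from the user's explicit setting.
    bool use_indirect = in->opt_indirect;

    // Any gamma or linear-light processing forces the indirect pass.
    if (in->use_input_gamma || in->use_conv_gamma ||
        in->use_linear_light || in->use_const_luma)
        use_indirect = true;

    // Multi-plane input prefers the indirect pass unless scale_sep is off.
    if (in->opt_scale_sep && in->plane_count > 1)
        use_indirect = true;

    return use_indirect;
}
```

With `opt_scale_sep` false (the `no-scale-sep` option) and none of the gamma or linear-light conditions active, the function returns the user's `indirect` setting unchanged, which is how I read "Disabling scale_sep overrides this behavior" in the new comment.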