summaryrefslogtreecommitdiffstats
path: root/video/out/opengl/utils.c
Commit message (Collapse)AuthorAgeFilesLines
* vo_opengl: enable color management on GLESJames Ross-Gowan2016-05-271-1/+5
| | | | | | This requires the GL_EXT_texture_norm16 extension and works in ANGLE. A default precision had to be set for sampler3Ds, otherwise the shaders would fail to compile.
* vo_opengl: fix other minor namespace issueswm42016-05-231-4/+4
| | | | See previous commit.
* vo_opengl: rename glUploadTex, drop unused parameterwm42016-05-231-9/+6
| | | | | | | | Rename it to get out of OpenGL's namespace. The gl_ prefix is used by other mpv functions, but no OpenGL ones. The "slice" parameter was never actually used, and all callers passed 0 for it.
* vo_opengl: unify PBO and normal OSD texture upload pathwm42016-05-231-25/+0
| | | | | | | | | | | | | The main change is actually that e first copy to a "staging" memory frame, and then upload this at once. The old non-PBO code called glTexsubImage2D for each OSD sub-bitmap. The new non-PBO code path is a bit faster now if there are many small sub-bitmaps (on Linux/nVidia). It's also a bit simpler, so this is a win. (Although I don't particularly appreciate the mixed normal/PBO texture code.)
* vo_opengl: support framebuffer invalidationwm42016-05-231-0/+15
| | | | | | | | | | Not sure how much can be gained with this, as we can't use it properly yet. For now, this is used only before rendering, which probably does overwhelmingly nothing. In the future, this should be used after temporary passes, which could possibly reduce memory usage and even memory bandwidth usage, depending on the drivers.
* vo_opengl: make gl_sc_enable_extension() permanent/idempotentwm42016-05-191-2/+12
| | | | | | | | No reason not to, and makes the following commit slightly simpler. In fact, this makes the shaders more correct too. Normally, "#extension" must come before any normal shader text, including the "precision" directive. Not sure why this worked before. (Probably didn't.)
* vo_opengl: move UT_buffer to switch handlingwm42016-05-171-5/+5
| | | | No reason to make it a special case.
* vo_opengl: make number of cached shaders/uniform dynamicwm42016-05-171-12/+24
| | | | | | | | | | Use dynamic memory allocation, as the static allocation is starting to get annoying. Currently, SC_MAX_ENTRIES is essentially still a static upper limit on the number of shaders. But in future we could try a more clever cache replacement strategy, which does not keep stale entries forever if the maximum happens not to be reached.
* vo_opengl: move cached uniforms to a separate structwm42016-05-171-10/+15
|
* vo_opengl: increase shader limitsNiklas Haas2016-05-171-2/+2
| | | | | | | | | | | | | The new uniforms introduced by 362015c have exceeded the uniform limit when using high-radius tscale. In addition, the SC limit of 32 entries might be pushing it with user shaders. Just make these value a bigger to delay the onset of this same failure mode. Maybe in the future it should be reworked to grow dynamically? Either way, we *can* always predict a static upper bound on the number of uniforms and shader cache entries, it's just that we forgot to do so. Fixes #3151
* vo_opengl: make the screen blue on shader errorsNiklas Haas2016-05-151-0/+18
| | | | | This helps visually signify that somthing went wrong, and prevents confusing shader compilation errors with other types of bugs.
* vo_opengl: support external user hooksNiklas Haas2016-05-151-0/+5
| | | | | | | | | | | This allows users to add their own near-arbitrary hooks to the vo_opengl processing pipeline, greatly enhancing the flexibility of user shaders. This enables, among other things, user shaders such as CrossBilateral, SuperRes, LumaSharpen and many more. To make parsing the user shaders easier, shaders are now loaded as bstrs, and the hooks are set up during video reconfig instead of on every single frame.
* vo_opengl: remove some pointless compatibilitywm42016-05-141-1/+0
| | | | | | Remove non-texture_rg compatibility from LUT sampling. OpenGL without texture_rg support will always trigger dumb-mode, and dumb-mode does not use LUTs. It used not to, and that was when this made sense.
* vo_opengl: reorganize texture format handlingwm42016-05-121-96/+17
| | | | | | | | | | | | | This merges all knowledge about texture format into a central table. Most of the work done here is actually identifying which formats exactly are supported by OpenGL(ES) under which circumstances, and keeping this information in the format table in a somewhat declarative way. (Although only to the extend needed by mpv.) In particular, ES and float formats are a horrible mess. Again this is a big refactor that might cause regression on "obscure" configurations.
* vo_opengl: angle: dump translated shaderswm42016-05-121-0/+10
| | | | Helpful for debugging and such.
* vo_opengl: d3d11egl: native NV12 sampling supportwm42016-05-101-0/+1
| | | | | | | | | | | | | | | | | This uses EGL_ANGLE_stream_producer_d3d_texture_nv12 and related extensions to map the D3D textures coming from the hardware decoder directly in GL. In theory this would be trivial to achieve, but unfortunately ANGLE does not have a mechanism to "import" D3D textures as GL textures. Instead, an awkward mechanism via EGL_KHR_stream was implemented, which involves at least 5 extensions and a lot of glue code. (Even worse than VAAPI EGL interop, and very far from the simplicity you get on OSX.) The ANGLE mechanism so far supports only the NV12 texture format, which means 10 bit won't work. It also does not work in ES3 mode yet. For these reasons, the "old" ID3D11VideoProcessor code is kept and used as a fallback.
* vo_opengl: slightly compress gl_set_debug_logger()wm42016-03-281-7/+2
| | | | No functional changes.
* vo_opengl: reduce temporary variables in gl_transform_trans()wm42016-03-281-5/+5
| | | | | Using a single gl_transform variable instead of many float ones makes it easier to see what it's doing. No functional change.
* vo_opengl: fix row-major vs. column-major confusionwm42016-03-281-2/+2
| | | | | | | | | gl_transform_vec() assumed column-major, while everything else seemed to assumed row-major memory organization for gl_transform.m. Also, gl_transform_trans() seems to contain additional confusion. This didn't matter until now, as everything has been orthogonal, this the swapped matrix entries were always 0.
* vo_opengl: minor coding style adjustmentwm42016-03-241-3/+4
|
* vo_opengl: utils: some more minor shader string building optimizationwm42016-03-241-23/+35
| | | | | | | | | | | | Instead of reallocating almost all of the shader string several times per pass, build it into a fixed buffer that will be reallocated as needed. While this still uses a linear search and full comparison of the shader text, this will compare the shader's string length first before doing a full comparison as a nice side effect. (That's also why the fragment shader is compared first - it's more likely to be different for different cache entries than the vertex shader stub.)
* vo_opengl: utils: slightly optimize shader string buildingwm42016-03-231-22/+21
| | | | | Use bstr as appending buffer, which should avoid frequent reallocation and copying. The previous commit should help with this a little.
* vo_opengl: use the same type for cached and current uniform valueswm42016-03-101-12/+11
| | | | Slightly improvement over the previous commit.
* vo_opengl: cache the values of the uniform variablesigv2016-03-101-20/+31
|
* vo_opengl: cache the locations of the uniform variablesigv2016-03-091-6/+13
|
* vo_opengl: refactor pass_read_video and texture bindingNiklas Haas2016-03-051-11/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a pretty major rewrite of the internal texture binding mechanic, which makes it more flexible. In general, the difference between the old and current approaches is that now, all texture description is held in a struct img_tex and only explicitly bound with pass_bind. (Once bound, a texture unit is assumed to be set in stone and no longer tied to the img_tex) This approach makes the code inside pass_read_video significantly more flexible and cuts down on the number of weird special cases and spaghetti logic. It also has some improvements, e.g. cutting down greatly on the number of unnecessary conversion passes inside pass_read_video (which was previously mostly done to cope with the fact that the alternative would have resulted in a combinatorial explosion of code complexity). Some other notable changes (and potential improvements): - texture expansion is now *always* handled in pass_read_video, and the colormatrix never does this anymore. (Which means the code could probably be removed from the colormatrix generation logic, modulo some other VOs) - struct fbo_tex now stores both its "physical" and "logical" (configured) size, which cuts down on the amount of width/height baggage on some function calls - vo_opengl can now technically support textures with different bit depths (e.g. 10 bit luma, 8 bit chroma) - but the APIs it queries inside img_format.c doesn't export this (nor does ffmpeg support it, really) so the status quo of using the same tex_mul for all planes is kept. - dumb_mode is now only needed because of the indirect_fbo being in the main rendering pipeline. If we reintroduce p->use_indirect and thread a transform through the entire program this could be skipped where unnecessary, allowing for the removal of dumb_mode. But I'm not sure how to do this in a clean way. (Which is part of why it got introduced to begin with) - It would be trivial to resurrect source-shader now (it would just be one extra 'if' inside pass_read_video).
* vo_opengl: declare vec4 color inside fragment shader stubNiklas Haas2016-02-231-1/+2
| | | | | | Why was this done so stupidly, with so many complicated special cases, before? Declare it once so the shader bits don't have to figure out where and when to do so themselves.
* vo_opengl: add precision qualifier to usampler2D on ANGLEwm42016-01-271-1/+1
| | | | | | | | GLES requires this. Some more common sampler types have default precisions, but not usampler2D. Newer ANGLE builds verify this more strictly than older builds, so this wasn't caught before. Fixes #2761.
* vo_opengl: support 10 bit support with ANGLEwm42016-01-261-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | GLES does not support high bit depth fixed point textures for unknown reasons, so direct 10 bit input is not possible. But we can still use integer textures, which are supported by GLES 3.0. These store integer data just like the standard fixed point textures, except they are not normalized on sampling. They also don't support bilinear filtering, and require a special sampler ("usampler2D"). While these texture formats enable us to shuffle the data to the GPU, they're rather impractical with the requirements mentioned above and our current architecture. One problem is that most code assumes it can always use bilinear scaling (even if bilinear is never used when using appropriate scale/cscale options). Another is that we don't have any concept of running a function on a texture in an uniform way. So for now, run a simple conversion step through a FBO. The FBO will use the rgba16f format normally, which gives enough bits for 10 bit, and will at least gracefully degrade with higher depth input. This is bound to be much slower than a more "direct" method, but at least it works and is simple to implement. The odd change of function call order in init_video() is to properly disable "dumb mode" (no FBO use) if these texture formats are in use.
* Change GPL/LGPL dual-licensed files to LGPLwm42016-01-191-12/+7
| | | | | | | | | | | Do this to make the license situation less confusing. This change should be of no consequence, since LGPL is compatible with GPL anyway, and making it LGPL-only does not restrict the use with GPL code. Additionally, the wording implies that this is allowed, and that we can just remove the GPL part.
* vo_opengl: fix operation on GLES 2.0wm42016-01-041-2/+2
| | | | | | | | | GLSL in GLES 2.0 did not have line continuation in its preprocessor. This broke shader compilation. It also broke subtitle rendering in vo_rpi, which reuses some of the OpenGL code. Line continuation was finally added in GLES 3.0, which is perhaps the reason why ANGLE accepted it.
* vo_opengl: fix shader compilation regressionwm42015-12-081-0/+5
| | | | | | | | | | The recent LUT adjustment changes broke interpolation. The concatenation of the shader stages is a bit messy, and it seems like sampler_prelude is not a good place to add this macro. Always add the macro to every shader instead. (While this doesn't seem too elegant, this isn't too inelegant either, and goes these problems out of the way.)
* vo_opengl: create FBOs in a more GLES conformant waywm42015-11-191-2/+40
| | | | | | | | | | | | | | | While desktop GL's glTexImage2D() essentially accepts anything, GLES is much stricter. The combination of allowed formats/types/internal formats is exactly specified. The GLES 3.0.4 specification lists them in table 3.2. (The ANGLE API validation code references this table.) The table could probably be extended into a general declarative table about GL formats covering other uses, but this would be a big non-trivial project, so don't bother and accept a minor degree of duplication with other tables. Note that the format and type do (or should) not matter here, because no image data is transferred to the GPU.
* vo_opengl: handle GL_ARB_uniform_buffer_object with low GLSL versionswm42015-11-091-3/+13
| | | | Why is this stupid crap being so much a pain for no reason.
* vo_opengl: implement NNEDI3 prescalerBin Jin2015-11-051-2/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement NNEDI3, a neural network based deinterlacer. The shader is reimplemented in GLSL and supports both 8x4 and 8x6 sampling window now. This allows the shader to be licensed under LGPL2.1 so that it can be used in mpv. The current implementation supports uploading the NN weights (up to 51kb with placebo setting) in two different way, via uniform buffer object or hard coding into shader source. UBO requires OpenGL 3.1, which only guarantee 16kb per block. But I find that 64kb seems to be a default setting for recent card/driver (which nnedi3 is targeting), so I think we're fine here (with default nnedi3 setting the size of weights is 9kb). Hard-coding into shader requires OpenGL 3.3, for the "intBitsToFloat()" built-in function. This is necessary to precisely represent these weights in GLSL. I tried several human readable floating point number format (with really high precision as for single precision float), but for some reason they are not working nicely, bad pixels (with NaN value) could be produced with some weights set. We could also add support to upload these weights with texture, just for compatibility reason (etc. upscaling a still image with a low end graphics card). But as I tested, it's rather slow even with 1D texture (we probably had to use 2D texture due to dimension size limitation). Since there is always better choice to do NNEDI3 upscaling for still image (vapoursynth plugin), it's not implemented in this commit. If this turns out to be a popular demand from the user, it should be easy to add it later. For those who wants to optimize the performance a bit further, the bottleneck seems to be: 1. overhead to upload and access these weights, (in particular, the shader code will be regenerated for each frame, it's on CPU though). 2. "dot()" performance in the main loop. 3. "exp()" performance in the main loop, there are various fast implementation with some bit tricks (probably with the help of the intBitsToFloat function). The code is tested with nvidia card and driver (355.11), on Linux. Closes #2230
* vo_opengl: add Super-xBR filter for upscalingBin Jin2015-11-051-1/+13
| | | | | | | | | | | Add the Super-xBR filter for image doubling, and the prescaling framework to support it. The shader code was ported from MPDN extensions project, with modification to process luma only. This commit is largely inspired by code from #2266, with `gl_transform_trans()` authored by @haasn taken directly.
* vo_opengl: move shader file caching to video.cwm42015-09-231-43/+1
| | | | | | It's just about loading and cachign small files, not does not necessarily have anything to do with shaders. Move it to video.c where it's used.
* vo_opengl: move sampler type mapping to a functionwm42015-09-101-7/+12
|
* vo_opengl: implement debanding (and remove source-shader)Niklas Haas2015-09-091-0/+8
| | | | | | | | | | The removal of source-shader is a side effect, since this effectively replaces it - and the video-reading code has been significantly restructured to make more sense and be more readable. This means users no longer have to constantly download and maintain a separate deband.glsl installation alongside mpv, which was the only real use case for source-shader that we found either way.
* vo_opengl: remove gl_ prefixes from files in video/out/openglNiklas Haas2015-09-091-0/+951
This is a bit redundant with the name of the directory itself, and not in line with existing naming conventions.