libass - a portable subtitle renderer for the ASS/SSA (Advanced Substation Alpha/Substation Alpha) subtitle format

	Commit message	Author	Age	Files	Lines
*	Make assembly position-independent	Dr.Smile	2021-04-21	4	-29/+188
*	rasterizer: improve assembly	Dr.Smile	2021-04-21	1	-187/+149
*	rasterizer: make C and assembly functions bitwise identical	Dr.Smile	2021-04-21	1	-8/+9
	Fixes https://github.com/libass/libass/issues/475
*	blur: slightly improve assembly	Dr.Smile	2021-04-21	1	-34/+28
*	Make argument order uniform between bitmap functions	Dr.Smile	2021-04-21	2	-37/+36
*	Rewrite be_blur() assembly	Dr.Smile	2021-04-21	1	-222/+202
	Change list:
	- Fixed differences from C version introduced in f23b9ed64bd4ccf249c686616dd3f51a69d285dc.
	- Common macro for SSE2 and AVX2 versions.
	- Reduced register usage and efficient 32-bit version.
	- Full width memory operations instead of half-register.
	- Vectorized handling of width tails instead of byte/word loops.
	- Vectorized initial population of temporary buffer and final line fill.
	- Interleaved layout of temporary buffer.
	- Great speedup overall.
*	Rewrite mul_bitmaps() assembly	Dr.Smile	2021-04-21	1	-120/+64
	Change list:
	- No special handling of unaligned case.
	- Common macro for SSE2 and AVX2 versions; the AVX2 version is now significantly faster.
	- Reduced register usage and efficient 32-bit version.
	- Full width memory operations instead of half-register.
	- Vectorized handling of width tails instead of byte loops.
*	Rewrite add/sub_bitmaps() assembly	Dr.Smile	2021-04-21	1	-136/+51
	Change list:
	- No special handling of unaligned case (removed in previous commit).
	- Common macro for both add_bitmaps() and sub_bitmaps().
	- Reduced register usage and efficient 32-bit version.
	- add_bitmaps() no longer relies on zero padding.
	- Vectorized handling of width tails (instead of byte loop in sub_bitmaps(), great speedup for non-empty tails).
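The "common macro for both add_bitmaps() and sub_bitmaps()" idea can be illustrated in plain C. The following is only a sketch under stated assumptions: the macro name `DEFINE_BITMAP_OP`, the `sketch_` function names, and the parameter order are hypothetical and are not taken from libass. It assumes the two operations are saturating 8-bit addition and subtraction applied row by row, and it processes exactly `width` bytes per row, so it does not depend on any padding past the row end.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro: generates one row-wise bitmap function per
 * operation, so the add and sub variants share a single loop body. */
#define DEFINE_BITMAP_OP(name, expr)                                   \
    void name(uint8_t *dst, ptrdiff_t dst_stride,                      \
              const uint8_t *src, ptrdiff_t src_stride,                \
              size_t width, size_t height)                             \
    {                                                                  \
        for (size_t y = 0; y < height; y++) {                          \
            /* Touch exactly `width` bytes; no reliance on padding. */ \
            for (size_t x = 0; x < width; x++) {                       \
                int d = dst[x], s = src[x];                            \
                dst[x] = (uint8_t) (expr);                             \
            }                                                          \
            dst += dst_stride;                                         \
            src += src_stride;                                         \
        }                                                              \
    }

/* Saturating add: results above 255 clamp to 255. */
DEFINE_BITMAP_OP(sketch_add_bitmaps, d + s > 255 ? 255 : d + s)

/* Saturating subtract: results below 0 clamp to 0. */
DEFINE_BITMAP_OP(sketch_sub_bitmaps, d - s < 0 ? 0 : d - s)
```

In SIMD code the same per-byte clamping is typically done by saturating packed instructions; the scalar loop here only shows the intended arithmetic.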
*	x86: update x86inc	Ryan Lucia	2021-02-23	2	-263/+538
	This should fix the warnings introduced with nasm 2.15
*	Simplify blur algorithm	Dr.Smile	2020-10-09	2	-644/+271
	This commit removes prefilters altogether at the cost of an enlarged main filter kernel.
*	Update names in copyright headers	rcombs	2020-05-29	4	-4/+4
*	x86/cpuid: fix missing include	rcombs	2020-05-26	1	-0/+2
*	x86: update x86inc.asm	Rodger Combs	2017-09-05	1	-497/+599
*	x86: asm adjustments for nasm compatibility	Rodger Combs	2017-09-05	7	-63/+62
*	Fix crash when the OS doesn't support AVX2	Rodger Combs	2015-07-27	2	-3/+20
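The commit subject does not say how the crash was fixed. As general background, a CPU can report AVX2 while the operating system has not enabled saving of the YMM register state, in which case executing AVX2 code faults. A minimal sketch of the usual check follows; the function name `sketch_avx2_usable` and its parameters are hypothetical, and the caller is assumed to supply the raw values of CPUID leaf 1 ECX, CPUID leaf 7 (subleaf 0) EBX, and XCR0 (read with XGETBV only when OSXSAVE is set, otherwise passed as 0).

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_ECX_OSXSAVE (UINT32_C(1) << 27) /* OS uses XSAVE/XGETBV  */
#define CPUID1_ECX_AVX     (UINT32_C(1) << 28) /* CPU supports AVX      */
#define CPUID7_EBX_AVX2    (UINT32_C(1) << 5)  /* CPU supports AVX2     */
#define XCR0_SSE_AVX_STATE UINT64_C(0x6)       /* XMM and YMM state bits */

/* Hypothetical helper: returns true only when both the CPU and the OS
 * allow AVX2 code to run. Inputs are raw register values gathered by
 * the caller. */
bool sketch_avx2_usable(uint32_t cpuid1_ecx, uint32_t cpuid7_ebx,
                        uint64_t xcr0)
{
    /* Without OSXSAVE, XCR0 is not readable and AVX must not be used. */
    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE))
        return false;
    if (!(cpuid1_ecx & CPUID1_ECX_AVX))
        return false;
    /* The OS must have enabled both XMM and YMM state saving. */
    if ((xcr0 & XCR0_SSE_AVX_STATE) != XCR0_SSE_AVX_STATE)
        return false;
    return (cpuid7_ebx & CPUID7_EBX_AVX2) != 0;
}
```

Keeping the decision in a pure function separates it from the instructions that read the registers, which makes the logic easy to test on any platform.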
*	Implement cascade gaussian blur	Dr.Smile	2015-07-04	3	-71/+1512
	This is the complete version with SSE2/AVX2 assembly. It should be much faster than the old algorithm even in pure C. The algorithm description can be found in this article (PDF): https://github.com/MrSmile/CascadeBlur/releases
	Close #9
*	Switch to virtual function table	Dr.Smile	2015-06-26	3	-142/+0
	Use one pointer to a table of functions instead of a scattered bunch of function pointers. Different versions of these tables can be constructed at compile time. Also, bitmap memory alignment now depends only on SSE2/AVX2 support and is constant for every width. That simplifies the code without a noticeable performance penalty.
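The function-table approach described in this commit message can be sketched in C as follows. All names here (`SketchEngine`, `sketch_select_engine`, the `alignment` field) are hypothetical and do not reflect the actual libass structures. The SSE2 and AVX2 tables reuse the C functions as stand-ins for assembly routines, and the alignment values (1 for plain C, 16 and 32 bytes to match XMM and YMM register widths) are illustrative assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*SketchBitmapFunc)(uint8_t *dst, const uint8_t *src, size_t n);

/* One table bundles the alignment requirement with every routine. */
typedef struct {
    size_t alignment;             /* bitmap alignment in bytes */
    SketchBitmapFunc add_bitmaps;
    SketchBitmapFunc sub_bitmaps;
} SketchEngine;

static void add_bitmaps_c(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned) dst[i] + src[i];
        dst[i] = sum > 255 ? 255 : (uint8_t) sum;  /* saturating add */
    }
}

static void sub_bitmaps_c(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = dst[i] > src[i] ? (uint8_t) (dst[i] - src[i]) : 0;
}

/* Tables are constant data built at compile time. The SSE2/AVX2 entries
 * would point at assembly routines; C stand-ins are used in this sketch. */
static const SketchEngine sketch_engine_c    = { 1,  add_bitmaps_c, sub_bitmaps_c };
static const SketchEngine sketch_engine_sse2 = { 16, add_bitmaps_c, sub_bitmaps_c };
static const SketchEngine sketch_engine_avx2 = { 32, add_bitmaps_c, sub_bitmaps_c };

/* Callers keep a single pointer to the chosen table. */
const SketchEngine *sketch_select_engine(bool has_sse2, bool has_avx2)
{
    if (has_avx2)
        return &sketch_engine_avx2;
    if (has_sse2)
        return &sketch_engine_sse2;
    return &sketch_engine_c;
}
```

With this layout a renderer stores one `const SketchEngine *` and calls through it, for example `engine->add_bitmaps(dst, src, n)`, instead of tracking each function pointer separately.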
*	Improve rasterizer comments	Dr.Smile	2015-06-26	1	-20/+21
*	Skip memset() when using internal rasterizer	Dr.Smile	2015-02-09	2	-7/+16
*	Flip coordinate system in rasterizer	Dr.Smile	2014-11-23	1	-34/+34
*	Implement fast quad-tree rasterizer in C and x86/SSE2/AVX2	Dr.Smile	2014-04-29	2	-0/+972
	Signed-off-by: Rodger Combs <rodger.combs@gmail.com>
*	Remove dirty pixels from ASM be_blur output	Oleg Oshmyan	2014-03-13	1	-6/+8
	A loop initializer was missing, so output started one row too early. A loop condition check was missing, so output sometimes stopped one column too late. Also remove a couple of dead assignments.
*	Remove incorrect declaration of HAVE_ALIGNED_STACK	11rcombs	2014-03-09	1	-1/+0
*	Remove unnecessary instruction	11rcombs	2014-02-16	1	-3/+0
*	Added XMM register count in be_blur; should help #48	11rcombs	2014-02-16	1	-2/+2
*	Use lower mm registers in be_blur.asm	11rcombs	2014-02-16	1	-8/+8
*	Added license headers in ASM files	11rcombs	2014-02-16	3	-0/+48
*	Added x86 ASM functions	11rcombs	2014-01-25	7	-0/+2121