author | Dr.Smile <vabnick@gmail.com> | 2021-03-09 04:35:20 +0300 |
---|---|---|
committer | Dr.Smile <vabnick@gmail.com> | 2021-04-21 21:46:00 +0300 |
commit | 10160c4eddd3c1a4e340a193dde8f188c13d3a04 (patch) | |
tree | 25686237b46c8a10733575fbf079bf7c8b885265 /libass/ass_func_template.h | |
parent | ccc646de63f1e9e5594c991e04618240e81902bd (diff) | |
Rewrite be_blur() assembly
Change list:
- Fixed differences from C version introduced
in f23b9ed64bd4ccf249c686616dd3f51a69d285dc.
- Common macro for SSE2 and AVX2 versions.
- Reduced register usage and efficient 32-bit version.
- Full width memory operations instead of half-register.
- Vectorized handling of width tails instead of byte/word loops.
- Vectorized initial population of temporary buffer and final line fill.
- Interleaved layout of temporary buffer.
- Great speedup overall.
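
For orientation, the filter that be_blur() applies can be sketched in plain C. This is a minimal sketch and not the libass implementation: it assumes a 3x3 kernel built from [1 2 1] in each direction, pixels outside the bitmap treated as zero, and a truncating shift by 4. The name demo_be_blur_ref and its signature are hypothetical, and the temporary buffer here is a simple full-size planar array rather than the interleaved layout the commit mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference: in-place 3x3 blur with weights
 * [1 2 1] x [1 2 1] / 16, zero padding at the borders (assumption).
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int demo_be_blur_ref(uint8_t *buf, size_t stride, size_t width, size_t height)
{
    assert(buf && width > 0 && height > 0 && stride >= width);

    uint16_t *tmp = malloc(width * height * sizeof(*tmp));
    if (!tmp)
        return -1;

    /* Horizontal pass: tmp = left + 2 * centre + right (at most 1020). */
    for (size_t y = 0; y < height; y++) {
        const uint8_t *row = buf + y * stride;
        for (size_t x = 0; x < width; x++) {
            unsigned left  = x > 0 ? row[x - 1] : 0;
            unsigned right = x + 1 < width ? row[x + 1] : 0;
            tmp[y * width + x] = (uint16_t)(left + 2u * row[x] + right);
        }
    }

    /* Vertical pass on the 16-bit sums, then divide by 16. */
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            unsigned up   = y > 0 ? tmp[(y - 1) * width + x] : 0;
            unsigned down = y + 1 < height ? tmp[(y + 1) * width + x] : 0;
            unsigned sum  = up + 2u * tmp[y * width + x] + down;
            buf[y * stride + x] = (uint8_t)(sum >> 4);
        }
    }

    free(tmp);
    return 0;
}
```

A scalar version like this is mainly useful as a comparison baseline: running it and an optimized routine on the same input and checking for byte-identical output is one way to catch the kind of differences from the C version that the first item in the change list refers to.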
Diffstat (limited to 'libass/ass_func_template.h')
-rw-r--r-- | libass/ass_func_template.h | 4 |
1 files changed, 0 insertions, 4 deletions
diff --git a/libass/ass_func_template.h b/libass/ass_func_template.h
index b6905ad..4c28777 100644
--- a/libass/ass_func_template.h
+++ b/libass/ass_func_template.h
@@ -108,11 +108,7 @@ const BitmapEngine DECORATE(bitmap_engine) = {
     .sub_bitmaps = DECORATE(sub_bitmaps),
     .mul_bitmaps = DECORATE(mul_bitmaps),
 
-#ifdef __x86_64__
     .be_blur = DECORATE(be_blur),
-#else
-    .be_blur = ass_be_blur_c,
-#endif
 
     .stripe_unpack = DECORATE(stripe_unpack),
     .stripe_pack = DECORATE(stripe_pack),
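
The effect of this hunk on the function table can be illustrated with a small stand-alone sketch. Everything named demo_* or DEMO_* below is hypothetical and only mimics the shape of the code in the diff: a struct of function pointers filled in through a name-decorating macro. The stand-in kernels just tag the buffer so that it is visible which one was selected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified function-pointer table (not the real BitmapEngine). */
typedef void (*DemoBlurFunc)(uint8_t *buf, size_t len);

typedef struct {
    DemoBlurFunc be_blur;
} DemoEngine;

/* Stand-ins for real kernels: each writes a tag byte instead of blurring. */
void demo_be_blur_c(uint8_t *buf, size_t len)    { assert(len > 0); buf[0] = 'c'; }
void demo_be_blur_sse2(uint8_t *buf, size_t len) { assert(len > 0); buf[0] = 's'; }

/* Assumed shape of a decorating macro: appends an engine suffix to a name. */
#define DEMO_DECORATE(func) demo_##func##_sse2

/* Shape before the change: non-x86_64 builds fall back to the C routine. */
const DemoEngine demo_engine_guarded = {
#ifdef __x86_64__
    .be_blur = DEMO_DECORATE(be_blur),
#else
    .be_blur = demo_be_blur_c,
#endif
};

/* Shape after the change: the decorated routine is used on every target. */
const DemoEngine demo_engine_unconditional = {
    .be_blur = DEMO_DECORATE(be_blur),
};
```

Calling demo_engine_unconditional.be_blur(buf, 1) always reaches the decorated routine, whereas the guarded table depends on the target architecture. Dropping the guard only makes sense when the decorated routine is usable on 32-bit targets too, which matches the "efficient 32-bit version" item in the change list.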