path: root/vp8/common/x86/subpixel_sse2.asm
Commit message | Author | Age | Files | Lines
* simplify x86_abi_support.asm symbol declaration | Johann | 2020-04-13 | 1 | -8/+8
    Define LIBVPX_{ELF,MACHO} to simplify blocks.
    Create new globalsym macro and include logic for PRIVATE.

    BUG=webm:1679
    Change-Id: I303ba1492a2813f685de51155ccef7e4831e1881
* x86_abi_support: use correct hidden syntax | Johann | 2020-04-01 | 1 | -8/+8
    Chromium needs :function hidden and the space between the symbol and the colon removed, at least for nasm. This matches x86inc.asm.

    BUG=webm:1679
    Change-Id: Ie47bb75d44d3130791639cbf4e2ebe019e2d686e
* vp8 bilinear: rewrite 16x16 | Johann | 2018-10-25 | 1 | -269/+0
    Marginally faster. Most importantly it drops a dependency on an external symbol (vp8_bilinear_filters_x86_8).

    Change-Id: Iff022e718720f1f0eeced6201a1ad69a9c9c4f45
* vp8 bilinear: rewrite in intrinsics | Johann | 2018-10-24 | 1 | -145/+0
    8x8 is 15% faster than the assembly.
    8x4 is 200% faster than MMX. Remove MMX version.

    Change-Id: I55642ebd276db265911f2c79616177a3a9a7e04f
* explicitly label .text sections | Johann | 2017-12-01 | 1 | -0/+1
    nasm should infer .text but does not for windows:
    https://bugzilla.nasm.us/show_bug.cgi?id=3392451

    Change-Id: Ib195465e5f33405f5ff61c4cf88aa2a72640cacb
* Keep vp8 sixtap read within bounds | Johann | 2016-09-21 | 1 | -2/+6
    When filtering it needs 6 pixels: 2 prior to the source, the source, and 3 after the source. When filtering 16 wide, that means 21.

    To accomplish this the SSE2 reads [-2] to [5], [6] to [13], and [14] to [21], a total of 24 bytes (reading in groups of 8 is easy).

    The filter then shifts this last set to the top half of the register and uses 'or' to combine it with the previous set.

    Valgrind detected an issue reading pixels [19], [20] and [21]:
      Address 0x7f581c2 is 434 bytes inside a block of size 441 alloc'd

    Note: we only need pixels [16], [17], and [18] as context for [15].

    To fix this, it now reads 8 bytes starting at [11], which re-loads [11] through [13], but stops at [18] and does not over-read any values. This is shifted by 5 and 'or'd with xmm1. Although the lower bits are not cleared, they overlap directly with [11] through [13], so 'or' produces the correct results.

    Change-Id: I0c89c03afa660fc9b0108ac055d7bd403e493320
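The overlapping-load trick described in the commit above can be illustrated with a minimal sketch in plain C. This is not the assembly from subpixel_sse2.asm. The type `reg128` and the helpers `load_low8`, `shift_left_bytes`, `or128`, and `load_6_to_18` are hypothetical names introduced only for illustration, and the sketch assumes that the "shift by 5" is a shift by 5 bytes and that xmm1 holds pixels [6] through [13] in its low 8 bytes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 16-byte XMM register (byte 0 = lowest byte). */
typedef struct { uint8_t b[16]; } reg128;

/* Emulates an 8-byte load into the low half; the high half is zeroed. */
static reg128 load_low8(const uint8_t *p) {
  reg128 r;
  memset(r.b, 0, sizeof(r.b));
  memcpy(r.b, p, 8);
  return r;
}

/* Emulates a whole-register byte shift toward the high end by n bytes. */
static reg128 shift_left_bytes(reg128 r, int n) {
  reg128 out;
  memset(out.b, 0, sizeof(out.b));
  for (int i = n; i < 16; ++i) out.b[i] = r.b[i - n];
  return out;
}

/* Emulates a bitwise 'or' of two registers. */
static reg128 or128(reg128 a, reg128 b) {
  for (int i = 0; i < 16; ++i) a.b[i] |= b.b[i];
  return a;
}

/* Gathers pixels [6]..[18] into bytes 0..12 without reading past src[18]. */
static reg128 load_6_to_18(const uint8_t *src) {
  reg128 lo = load_low8(src + 6);   /* [6]..[13] in bytes 0..7 */
  reg128 hi = load_low8(src + 11);  /* [11]..[18] in bytes 0..7 */
  hi = shift_left_bytes(hi, 5);     /* [11]..[18] now in bytes 5..12 */
  /* Bytes 5..7 hold [11]..[13] in both registers, so 'or' leaves them unchanged. */
  return or128(lo, hi);
}
```

Because the second load re-reads [11] through [13] at exactly the byte positions they already occupy, the 'or' needs no masking, and the highest address touched is src[18].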
* Make libvpx Chromium build friendly | Alpha Lam | 2012-05-23 | 1 | -10/+10
    Add PRIVATE macro for adding private_extern directive for yasm to hide global symbols. This is only enabled if -DCHROMIUM is used with YASM.

    Also fixed a small problem with rtcd_defs.sh to guard TEMPORAL_DENOISING.

    Change-Id: I9027fce3ebddcf20078293e4b86b396f21da7857
* Move shared data to shared location | Johann | 2011-11-18 | 1 | -8/+8
    Storing vp8_bilinear_filters_mmx in an mmx file and using it in an sse2 file is bad

    Moving towards allowing --disable-mmx

    Change-Id: I20493b35bdedcdcfc0915e6f05fdbe6c81a4a742
* Use local labels for jumps/loops in x86 assembly. | Fritz Koenig | 2011-08-23 | 1 | -31/+31
    Prepend . to local labels in assembly code. This allows non unique labels within a file. Also makes profiling information more informative by keeping the function name with the loop name.

    Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f
* modify SAVE_XMM for potential 64bit use | Johann | 2011-04-19 | 1 | -11/+9
    the win64 abi requires saving and restoring xmm6:xmm15. currently SAVE_XMM and RESTORE_XMM only allow for saving xmm6:xmm7. allow specifying the highest register used and if the stack is unaligned.

    Change-Id: Ica5699622ffe3346d3a486f48eef0206c51cf867
* nasm: address labels 'rel label' vice 'wrt rip' | Jan Kratochvil | 2010-10-04 | 1 | -24/+24
    nasm does not support `label wrt rip', it requires `rel label'. It is still fully compatible with yasm.

    Provide nasm compatibility. No binary change by this patch with yasm on {x86_64,i686}-fedora13-linux-gnu. A few longer opcodes with nasm on {x86_64,i686}-fedora13-linux-gnu have been checked as safe.

    Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50
* Use WebM in copyright notice for consistency | John Koleszar | 2010-09-09 | 1 | -1/+1
    Changes 'The VP8 project' to 'The WebM project', for consistency with other webmproject.org repositories.

    Fixes issue #97.

    Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba
* cosmetics: trim trailing whitespace | John Koleszar | 2010-06-18 | 1 | -2/+2
    When the license headers were updated, they accidentally contained trailing whitespace, so unfortunately we have to touch all the files again.

    Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
* More on "some XMM registers are non-volatile on windows x64 ABI" | Yunqing Wang | 2010-06-15 | 1 | -2/+10
    Add same fix in subpixel_sse2.asm.

    Change-Id: Icfda6103cbf74ec43308e96961dd738aa823c14d
* some XMM registers are non-volatile on windows x64 ABI | Makoto Kato | 2010-06-11 | 1 | -0/+12
    XMM6 to XMM15 are non-volatile on Windows x64 ABI. We have to save these registers.

    Change-Id: I4676309f1350af25c8a35f0c81b1f0499ab99076
* Improve vp8_sixtap_predict functions | Yunqing Wang | 2010-06-10 | 1 | -78/+399
    Restructure vp8_sixtap_predict functions to eliminate extra 5-line calculation while doing first-pass only. Also, combine functions to eliminate usage of intermediate buffer. This gives decoder a 3% performance gain on my test clips.

    Change-Id: I13de49638884d1a57d0855c63aea719316d08c1b
* LICENSE: update with latest text | John Koleszar | 2010-06-04 | 1 | -4/+5
    Change-Id: Ieebea089095d9073b3a94932791099f614ce120c
* Initial WebM release [v0.9.0] | John Koleszar | 2010-05-18 | 1 | -0/+1032