Commit message
Define LIBVPX_{ELF,MACHO} to simplify blocks.
Create new globalsym macro and include logic for PRIVATE.
BUG=webm:1679
Change-Id: I303ba1492a2813f685de51155ccef7e4831e1881
Chromium needs :function hidden and the space between the symbol and the
colon removed, at least for nasm. This matches x86inc.asm.
BUG=webm:1679
Change-Id: Ie47bb75d44d3130791639cbf4e2ebe019e2d686e
Marginally faster. Most importantly it drops a dependency on an
external symbol (vp8_bilinear_filters_x86_8).
Change-Id: Iff022e718720f1f0eeced6201a1ad69a9c9c4f45
8x8 is 15% faster than the assembly. 8x4 is 200% faster than MMX.
Remove MMX version.
Change-Id: I55642ebd276db265911f2c79616177a3a9a7e04f
nasm should infer .text but does not on Windows:
https://bugzilla.nasm.us/show_bug.cgi?id=3392451
Change-Id: Ib195465e5f33405f5ff61c4cf88aa2a72640cacb
When filtering, each output pixel needs 6 source pixels: 2 prior to the
source, the source itself, and 3 after it.
When filtering 16 wide, that means 21 pixels, [-2] through [18]. To
accomplish this, the SSE2 code reads [-2] to [5], [6] to [13], and [14] to
[21], a total of 24 bytes (reading in groups of 8 is easy).
The filter then shifts this last set to the top half of the register and
uses 'or' to combine it with the previous set.
Valgrind detected an over-read of pixels [19], [20], and [21]:
Address 0x7f581c2 is 434 bytes inside a block of size 441 alloc'd
Note: we only need pixels [16], [17], and [18] as context for [15].
To fix this, the code now reads 8 bytes starting at [11], which re-loads
[11] through [13] but stops at [18] and does not over-read any values.
This is shifted left by 5 bytes and 'or'd with xmm1. Although the lower
bytes are not cleared, they overlap directly with [11] through [13], so
'or' produces the correct results.
Change-Id: I0c89c03afa660fc9b0108ac055d7bd403e493320
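
The load/shift/'or' sequence described in the message above can be sketched in portable C. This is only an illustrative model, not the assembly from the commit: the `reg128` type and all function names are hypothetical, and a 16-byte array stands in for an XMM register (byte 0 is the lowest byte). It assumes `src` points at pixel [0] and that pixels up to [18] are readable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit register: 16 bytes, byte 0 is the lowest. */
typedef struct { uint8_t b[16]; } reg128;

/* movq-style load: 8 bytes into the low half, upper half zeroed. */
static reg128 load_low8(const uint8_t *p) {
    reg128 r;
    memset(r.b, 0, sizeof r.b);
    memcpy(r.b, p, 8);
    return r;
}

/* pslldq-style shift: move every byte up by n positions, zero-filling. */
static reg128 shift_left_bytes(reg128 r, int n) {
    reg128 out;
    assert(n >= 0 && n <= 16);
    memset(out.b, 0, sizeof out.b);
    for (int i = n; i < 16; i++) out.b[i] = r.b[i - n];
    return out;
}

/* Bytewise 'or' of two registers. */
static reg128 or128(reg128 x, reg128 y) {
    reg128 out;
    for (int i = 0; i < 16; i++) out.b[i] = (uint8_t)(x.b[i] | y.b[i]);
    return out;
}

/* Gather pixels [6]..[18] into one register. The highest address read is
 * src + 18, so pixels [19]..[21] are never touched. */
static reg128 load_pixels_6_to_18(const uint8_t *src) {
    reg128 lo = load_low8(src + 6);   /* bytes 0..7  = [6]..[13]  */
    reg128 hi = load_low8(src + 11);  /* bytes 0..7  = [11]..[18] */
    hi = shift_left_bytes(hi, 5);     /* bytes 5..12 = [11]..[18] */
    /* Bytes 5..7 hold [11]..[13] in both operands, so 'or' leaves them intact. */
    return or128(lo, hi);
}
```

Because the overlapping bytes are identical in both operands, no masking step is needed before the 'or', which is the point the message makes about the lower bytes.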
Add a PRIVATE macro that emits the private_extern directive for yasm
to hide global symbols. It is only enabled when -DCHROMIUM is used
with yasm.
Also fix a small problem in rtcd_defs.sh by guarding TEMPORAL_DENOISING.
Change-Id: I9027fce3ebddcf20078293e4b86b396f21da7857
Storing vp8_bilinear_filters_mmx in an mmx file and using it in an sse2
file is bad.
This moves towards allowing --disable-mmx.
Change-Id: I20493b35bdedcdcfc0915e6f05fdbe6c81a4a742
Prepend . to local labels in assembly code. This
allows non-unique labels within a file. It also
makes profiling information more informative
by keeping the function name with the loop name.
Change-Id: I7a983cb3a5ba2413d5dafd0a37936b268fb9e37f
The win64 ABI requires saving and restoring xmm6:xmm15. Currently
SAVE_XMM and RESTORE_XMM only allow for saving xmm6:xmm7. Allow
specifying the highest register used and whether the stack is unaligned.
Change-Id: Ica5699622ffe3346d3a486f48eef0206c51cf867
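
As a rough illustration of what "specifying the highest register used" implies for stack usage, the sketch below computes how many bytes a prologue would need to spill xmm6 through a given register at 16 bytes each. The function names and the 8-byte padding rule for an unaligned stack are assumptions made for this example; the actual SAVE_XMM macro may handle alignment differently.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to save xmm6..xmm<highest_xmm>, 16 bytes per register.
 * Hypothetical helper, not part of libvpx. */
static size_t xmm_save_area_bytes(int highest_xmm) {
    assert(highest_xmm >= 6 && highest_xmm <= 15);
    return (size_t)(highest_xmm - 6 + 1) * 16;
}

/* Total stack reservation. Assumption: when the stack pointer is 8 bytes off
 * a 16-byte boundary, 8 bytes of padding restore alignment for the saves. */
static size_t xmm_stack_reservation(int highest_xmm, int stack_off_by_8) {
    return xmm_save_area_bytes(highest_xmm) + (stack_off_by_8 ? 8 : 0);
}
```

Under these assumptions, saving only xmm6:xmm7 takes 32 bytes, while the full xmm6:xmm15 range takes 160 bytes, which is why letting each function name its highest register avoids reserving the worst case everywhere.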
nasm does not support `label wrt rip'; it requires `rel label', which is
still fully compatible with yasm.
Provide nasm compatibility. This patch causes no binary change with yasm on
{x86_64,i686}-fedora13-linux-gnu. A few longer opcodes generated by nasm on
{x86_64,i686}-fedora13-linux-gnu have been checked as safe.
Change-Id: I488773a4e930a56e43b0cc72d867ee5291215f50
Changes 'The VP8 project' to 'The WebM project', for consistency
with other webmproject.org repositories.
Fixes issue #97.
Change-Id: I37c13ed5fbdb9d334ceef71c6350e9febed9bbba
When the license headers were updated, they accidentally contained
trailing whitespace, so unfortunately we have to touch all the files
again.
Change-Id: I236c05fade06589e417179c0444cb39b09e4200d
Add same fix in subpixel_sse2.asm.
Change-Id: Icfda6103cbf74ec43308e96961dd738aa823c14d
XMM6 to XMM15 are non-volatile in the Windows x64 ABI. We have to save
these registers.
Change-Id: I4676309f1350af25c8a35f0c81b1f0499ab99076
Restructure vp8_sixtap_predict functions to eliminate the extra 5-line
calculation when doing the first pass only. Also, combine functions
to eliminate use of the intermediate buffer. This gives the decoder a 3%
performance gain on my test clips.
Change-Id: I13de49638884d1a57d0855c63aea719316d08c1b
Change-Id: Ieebea089095d9073b3a94932791099f614ce120c