| Commit message (Collapse) | Author | Age | Files | Lines |
|\
* changes:
Add 2D-specific Neon horizontal convolution functions
Refactor standard bitdepth Neon convolution functions
|
2D 8-tap convolution filtering is performed in two passes -
horizontal and vertical. The horizontal pass must produce enough
input data for the subsequent vertical pass - 3 rows above and 4 rows
below, in addition to the actual block height.
At present, all Neon horizontal convolution algorithms process 4 rows
at a time, but this means we end up doing at least 1 row too much
work in the 2D first pass case where we need h + 7, not h + 8 rows of
output.
This patch adds additional dot-product (SDOT and USDOT) Neon paths
that process h + 7 rows of data exactly, saving the work of the
unnecessary extra row. It is impractical to take a similar approach
for the Armv8.0 MLA paths since we have to transpose the data block
both before and after calling the convolution helper functions.
vpx_convolve_neon performance impact: we observe a speedup of ~9% for
smaller (and wider) blocks, and a speedup of 0-3% for larger blocks.
This is to be expected since the proportion of redundant work
decreases as the block height increases.
Change-Id: Ie77ad1848707d2d48bb8851345a469aae9d097e1
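The row arithmetic described above can be sketched as follows. This is a minimal illustration, not code from the patch: the function names and SKETCH_TAPS are hypothetical, and it assumes the horizontal loop always emits whole groups of 4 rows.

```c
#include <assert.h>

#define SKETCH_TAPS 8 /* 8-tap filter: 3 rows above + 4 rows below */

/* Rows of horizontal output the vertical pass needs for a block of height h. */
static int intermediate_rows_needed(int h) {
  assert(h > 0);
  return h + SKETCH_TAPS - 1; /* h + 7 */
}

/* Rows produced when the horizontal loop emits 4 rows per iteration. */
static int rows_produced_in_groups_of_4(int h) {
  const int needed = intermediate_rows_needed(h);
  return (needed + 3) & ~3; /* round up to a multiple of 4 */
}

/* Rows computed but never read by the vertical pass. */
static int redundant_rows(int h) {
  return rows_produced_in_groups_of_4(h) - intermediate_rows_needed(h);
}
```

For block heights that are multiples of 4, redundant_rows() is always 1, so the wasted share is 1 row in 12 for h = 4 but only 1 row in 72 for h = 64.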
|
1) Use #define constant instead of magic numbers for right shifts.
2) Move saturating narrow into helper functions that return 4-element
result vectors.
3) Use mem_neon.h helpers for load/store sequences in Armv8.0 paths.
4) Tidy up: assert conditions and some longer variable names.
5) Prefer != 0 to > 0 where possible for loop termination conditions.
Change-Id: Idfcac43ca38faf729dca07b8cc8f7f45ad264d24
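Points 1, 2 and 5 can be illustrated with a portable scalar analogue. This is only a sketch: the helper names are hypothetical, the real code uses Neon intrinsics and 4-element vectors, and FILTER_BITS is assumed here to be 7 (kernel taps summing to 128).

```c
#include <assert.h>
#include <stdint.h>

#define FILTER_BITS 7 /* assumed: kernel taps sum to 1 << FILTER_BITS */

/* Hypothetical helper: rounding right shift, then saturate to 8 bits. */
static uint8_t round_shift_saturate_u8(int32_t sum) {
  const int32_t biased = sum + (1 << (FILTER_BITS - 1));
  int32_t rounded;
  if (biased < 0) return 0;
  rounded = biased >> FILTER_BITS;
  return (uint8_t)(rounded > 255 ? 255 : rounded);
}

/* Filters w output pixels; src must hold w + 7 readable samples. */
static void convolve8_row(const uint8_t *src, const int16_t *filter,
                          uint8_t *dst, int w) {
  assert(w > 0);
  do {
    int32_t sum = 0;
    int k;
    for (k = 0; k < 8; ++k) sum += src[k] * filter[k];
    *dst++ = round_shift_saturate_u8(sum);
    ++src;
  } while (--w != 0); /* != 0 rather than > 0 for loop termination */
}
```

Keeping the shift amount in one named constant and the narrowing in one helper means the rounding and saturation behaviour is defined in a single place.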
|
|\ \ |
|
This avoids link errors related to the sanitizers:
https://clang.llvm.org/docs/AddressSanitizer.html#usage
"When linking shared libraries, the AddressSanitizer run-time is not
linked, so -Wl,-z,defs may cause link errors ..."
See also:
https://crbug.com/aomedia/3438
Bug: webm:1801
Fixed: webm:1801
Change-Id: Ie212318005a5f7222e5486775175534025306367
|
|/ /
libraries under third_party/ are out of scope for this change.
Bug: webm:1793
Change-Id: I562065a3c0ea9fdfc9615d1a6b1ae47da79b8ce0
|
|\ \ |
|
Bug: webm:1793
Change-Id: Ia940b06bd23a915a050432e03bb630567e891d8d
|
|\ \ \
* changes:
gen_msvs_vcxproj: add ARM64EC w/VS >= 2022
configure: add clang-cl vs1[67] arm64 targets
|
rather than define new targets, add a platform to the arm64 list as they
share the same configuration.
Bug: webm:1788
Change-Id: Iac020280b1103fb12b559f21439aeff26568fba4
|
x86 and armv7 are skipped for now as the intrinsics will need different
flags than cl.exe (/arch:... -> -m...).
Bug: webm:1788
Change-Id: I8ca8660a8644cdd84c51cb1f75005e371ba8207d
|
|\ \ \ \ |
|
| | |/ /
| |/| |
The module level scaling w.r.t C function (timer based) for
existing (SSE2) and new AVX2 intrinsics:

If ref_padding = 0
Block       Scaling
size      SSE2    AVX2
8x4       3.24x   3.24x
8x8       4.22x   4.90x
8x16      5.91x   5.93x
16x8      1.63x   3.52x
16x16     1.53x   4.19x
16x32     1.38x   4.82x
32x16     1.28x   3.08x
32x32     1.45x   3.13x
32x64     1.38x   3.04x
64x32     1.39x   2.12x
64x64     1.46x   2.24x

If ref_padding = 8
Block       Scaling
size      SSE2    AVX2
8x4       3.20x   3.21x
8x8       4.61x   4.83x
8x16      5.50x   6.45x
16x8      1.56x   3.35x
16x16     1.53x   4.19x
16x32     1.37x   4.83x
32x16     1.28x   3.07x
32x32     1.46x   3.29x
32x64     1.38x   3.22x
64x32     1.38x   2.14x
64x64     1.38x   2.12x

This is a bit-exact change.
Change-Id: I72c5d155f64d0c630bc8c3aef21dc8bbd045d9e6
|
|\ \ \ \
| | |/ /
| |/| | |
|
| | |/
| |/|
| | |
| | | |
Change-Id: If2d5811a55f6bb60eeba7d28b69c78157a17e87f
|
|\ \ \
* changes:
vp8_[cd]x_iface: clear setjmp flag on function exit
vp9_decodeframe,tile_worker_hook: relocate setjmp=1
vp9,encoder_set_config: set setjmp flag after setjmp()
|
in vp8e_encode, also move the setjmp() call closer to the point where
the flag is set.
Change-Id: Ie165d4100b84776f9c34eddcf64657bd78cce4f5
|
after the call to setjmp(); this is more correct and consistent with
other code.
Change-Id: I6d9bb8daad6a959bfe4f25484f9d6664b99da19e
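The ordering these setjmp changes aim for can be sketched as below. This is a minimal illustration under stated assumptions: error_info, report_error and run_guarded are hypothetical stand-ins, not the library's types or functions.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for an error-info struct with a setjmp flag. */
typedef struct {
  int error_code;
  int setjmp_armed; /* nonzero only while jmp holds a valid context */
  jmp_buf jmp;
} error_info;

/* Records the error and jumps back only if a valid context exists. */
static void report_error(error_info *info, int code) {
  assert(info != NULL);
  info->error_code = code;
  if (info->setjmp_armed) longjmp(info->jmp, 1);
}

static int run_guarded(error_info *info, int should_fail) {
  info->error_code = 0;
  if (setjmp(info->jmp)) {
    info->setjmp_armed = 0; /* clear the flag on the error path */
    return info->error_code;
  }
  info->setjmp_armed = 1; /* set only after setjmp() has filled jmp */
  if (should_fail) report_error(info, 7);
  info->setjmp_armed = 0; /* clear the flag on normal exit */
  return 0;
}
```

Setting the flag before setjmp() would briefly advertise a jump buffer that has not been initialized, and leaving it set on exit would advertise one whose stack frame is gone.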
|
Change-Id: I6858e574d24aaff64f725404706f58e04e43717d
|
|\ \ \ \ |
|
Contains the size of GOP - also the size of the list of TPL stats for
each frame in this GOP.
VpxTplGopStats will be the unit for VP9E_GET_TPL_STATS control to return
TPL stats from the encoder.
Bug: b/273736974
Change-Id: I1682242fc6db4aafcd6314af023aa0d704976585
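A GOP-level container with an explicit size might look like the sketch below. All type and field names here are hypothetical illustrations of the "size plus list" layout described above, not the declarations added by this change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-block record; real TPL stats carry more fields. */
typedef struct {
  int row;
  int col;
} SketchTplBlockStats;

/* Per-frame stats: an explicit count travels with the block list. */
typedef struct {
  int num_blocks;
  SketchTplBlockStats *block_stats_list;
} SketchTplFrameStats;

/* Per-GOP stats: size is both the GOP length and the list length. */
typedef struct {
  int size;
  SketchTplFrameStats *frame_stats_list;
} SketchTplGopStats;

/* Walks the GOP without having to derive any list length. */
static int sketch_count_blocks(const SketchTplGopStats *gop) {
  int total = 0;
  int i;
  assert(gop != NULL);
  for (i = 0; i < gop->size; ++i) {
    total += gop->frame_stats_list[i].num_blocks;
  }
  return total;
}
```

Carrying the counts alongside the pointers lets a caller iterate the stats without recomputing sizes from frame dimensions.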
|
|\ \ \ \ \
| |/ / / / |
|
There were multiple implementations of CHECK_MEM_ERROR across the
library that took different arguments and were used in different places.
This CL unifies them into a single implementation that takes
vpx_internal_error_info.
Change-Id: I2c568639473815bc00b1fc2b72be56e5ccba1a35
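The shape of a single allocation-check macro that takes an error-info pointer can be sketched as follows. The names are hypothetical and the error handling is simplified to recording a message; it is not the library's macro.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an internal error-info struct. */
typedef struct {
  int has_error;
  char detail[80];
} sketch_error_info;

static void sketch_set_error(sketch_error_info *info, const char *detail) {
  assert(info != NULL && detail != NULL);
  info->has_error = 1;
  strncpy(info->detail, detail, sizeof(info->detail) - 1);
  info->detail[sizeof(info->detail) - 1] = '\0';
}

/* One macro for every call site: assign, then report through `error`. */
#define SKETCH_CHECK_MEM_ERROR(error, lval, expr)                         \
  do {                                                                    \
    (lval) = (expr);                                                      \
    if (!(lval)) sketch_set_error((error), "Failed to allocate " #lval);  \
  } while (0)
```

Because every call site passes the error-info pointer explicitly, the macro no longer depends on which component's context struct happens to be in scope.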
|
|\ \ \ \ \
| |/ / / /
| | | | /
| |_|_|/
|/| | | |
|
| | |/
| |/|
Also change the return type of vp9_set_roi_map to vpx_codec_err_t
Change-Id: I60d9ff45f2d3dfc44cd6e2aab2cb1ba389ff15f3
|
|\ \ \
| |/ /
|/| | |
|
| |/
with --enable-experimental --enable-rate-ctrl
Bug: webm:1793
Change-Id: I9ca664538bcf0c2aca8aea73283bbb0232eb86e9
|
|\ \ |
|
Change-Id: Ic5ec8dc7d9637091d4137a47d793cf29e76fdc45
|
Bug: webm:1793
Change-Id: I5f9c09f31b06fecc123c6a9d01f5fbed39142356
|
|\ \ \ |
|
Bug: webm:1793
Change-Id: Ib2e3bd3c52632cdd4410cb2c54d69750e64e5201
|
|\ \ \ \
| |_|/ /
|/| | |
* changes:
Overwrite cm->error->detail before freeing
Have vpx_codec_error take const vpx_codec_ctx_t *
Add comments about vpx_codec_enc_init_ver failure
|
Help detect use after free of the return value of
vpx_codec_error_detail(). If vpx_codec_error_detail() is called after
vpx_codec_encode() fails, the return value may be equal to
cm->error->detail, which is freed when vpx_codec_destroy() is called.
Document the lifetime of the string returned by
vpx_codec_error_detail().
Change-Id: I8089e90a4499b4f3cc5b9cfdbb25d72368faa319
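The lifetime issue and the overwrite-before-free idea can be sketched with a stand-in context. Everything here is hypothetical: the sketch_* names are not library functions, and the fill character used for the overwrite is an arbitrary choice for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a codec context that owns its error detail. */
typedef struct {
  char *detail; /* heap-allocated; freed by sketch_destroy() */
} sketch_codec;

/* Records a failure message owned by the codec. */
static void sketch_fail(sketch_codec *codec, const char *message) {
  assert(codec != NULL && message != NULL);
  free(codec->detail);
  codec->detail = (char *)malloc(strlen(message) + 1);
  if (codec->detail != NULL) strcpy(codec->detail, message);
}

/* Getter: valid only until sketch_destroy() is called. */
static const char *sketch_error_detail(const sketch_codec *codec) {
  return codec->detail;
}

/* Overwrites the string before freeing so stale readers see obvious junk. */
static void sketch_destroy(sketch_codec *codec) {
  if (codec->detail != NULL) {
    memset(codec->detail, '?', strlen(codec->detail));
    free(codec->detail);
    codec->detail = NULL;
  }
}

/* Caller-side copy taken while the codec is still alive. */
static void sketch_save_detail(const sketch_codec *codec, char *out,
                               size_t out_size) {
  const char *detail = sketch_error_detail(codec);
  assert(out != NULL && out_size > 0);
  if (detail == NULL) detail = "";
  strncpy(out, detail, out_size - 1);
  out[out_size - 1] = '\0';
}
```

A caller that needs the message after teardown copies it first, for example by calling sketch_save_detail() before sketch_destroy().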
|
Also have vpx_codec_error_detail take const vpx_codec_ctx_t *. Both
functions are getters that don't modify the codec context.
Change-Id: I4689022425efbf7b1da5034255ac052fce5e5b4f
|
| |/ /
Address the questions:
1. If vpx_codec_enc_init_ver() fails, should I still call
vpx_codec_destroy() on the encoder context?
2. Is it safe to call vpx_codec_error_detail() when
vpx_codec_enc_init_ver() failed?
Change-Id: I1b0e090d11dd9f853fe203f4cbb6080c3c7b0506
|
|\ \ \
| |_|/
|/| | |
|
| |/
| |
| |
| | |
Change-Id: I8badedc2ad07d60896e45de28b707ad9f6c4d499
|
|\ \
| |/
|/| |
|
Added AVX2 intrinsic optimization for the following functions
1. vpx_idct16x16_256_add
2. vpx_idct32x32_1024_add
3. vpx_idct32x32_135_add
The module level scaling w.r.t C function (timer based) for
existing (SSE2) and new AVX2 intrinsics:
                              Scaling
Function Name              SSE2    AVX2
vpx_idct32x32_1024_add     3.62x   7.49x
vpx_idct32x32_135_add      4.85x   9.41x
vpx_idct16x16_256_add      4.82x   7.70x
This is a bit-exact change.
Change-Id: Id9dda933aa1f5093bb6b35ac3b8a41846afca9d2
|
|\ \ |
|
I realized the calculation of the size of the list of VpxTplBlockStats
is non-trivial. So it's better to add the field for the size.
Bug: b/273736974
Change-Id: Ic1b50597c1f89a8f866b5669ca676407be6dc9d8
|
|\ \ \
| |/ / |
|
This is to avoid symbol redefinition when integrating with other
libraries.
Bug: b/273736974
Change-Id: I891af78b1907504d5bb9f735164aea18c2aba944
|
|\ \ \
| |/ /
|/| |
* changes:
Fix mismatched param names in vpx_dsp/x86/sad4d_avx2.c
Fix mismatched param names in vpx_dsp/arm/highbd_sad4d_neon.c
Fix mismatched param names in vpx_dsp/arm/sad4d_neon.c
Fix mismatched param names in vpx_dsp/arm/highbd_avg_neon.c
Fix clang warning on const-qualification of parameters
|
Change-Id: I226215a2ff8798b72abe0c2caf3d18875595caa5
|
Change-Id: Ia4918eb0bac3b28b27e1ef205b9171680b2eb9a4
|
Change-Id: If621944684cf9bb9f353db5961ed8b4b4ae38f24
|
Change-Id: Ibf00a6e1029284e637b10ef01ac9b31ffadc74ca
|
Change-Id: I900a0a48dde5fcb262157b191ac536e18269feb3
|