Commit message (author, date, files changed, lines -removed/+added)
* Merge changes Ie77ad184,Idfcac43c into main [HEAD, main] (James Zern, 2023-05-16, 4 files, -476/+650)
|\
| |   * changes:
| |     Add 2D-specific Neon horizontal convolution functions
| |     Refactor standard bitdepth Neon convolution functions
| * Add 2D-specific Neon horizontal convolution functions (Jonathan Wright, 2023-05-13, 4 files, -4/+301)
| |   2D 8-tap convolution filtering is performed in two passes - horizontal
| |   and vertical. The horizontal pass must produce enough input data for
| |   the subsequent vertical pass - 3 rows above and 4 rows below, in
| |   addition to the actual block height.
| |
| |   At present, all Neon horizontal convolution algorithms process 4 rows
| |   at a time, but this means we end up doing at least 1 row too much work
| |   in the 2D first pass case where we need h + 7, not h + 8 rows of
| |   output.
| |
| |   This patch adds additional dot-product (SDOT and USDOT) Neon paths
| |   that process h + 7 rows of data exactly, saving the work of the
| |   unnecessary extra row. It is impractical to take a similar approach
| |   for the Armv8.0 MLA paths since we have to transpose the data block
| |   both before and after calling the convolution helper functions.
| |
| |   vpx_convolve_neon performance impact: we observe a speedup of ~9% for
| |   smaller (and wider) blocks, and a speedup of 0-3% for larger blocks.
| |   This is to be expected since the proportion of redundant work
| |   decreases as the block height increases.
| |
| |   Change-Id: Ie77ad1848707d2d48bb8851345a469aae9d097e1
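The row arithmetic in the message above can be sketched in a few lines of C. This is only an illustration of the counting argument, not code from the patch: the function names and the `kTaps` constant are hypothetical, and the rounding assumes a kernel that can only emit rows in groups of 4.

```c
#include <assert.h>

/* Assumption: 8-tap filters, as described in the commit message. */
enum { kTaps = 8 };

/* Rows the horizontal pass must output so that the vertical 8-tap pass can
 * produce h rows: 3 above + h + 4 below = h + 7. */
int rows_needed(int h) {
  assert(h > 0);
  return h + kTaps - 1;
}

/* Rows actually computed by a kernel that only works in groups of 4 rows
 * (rows_needed rounded up to a multiple of 4). */
int rows_in_groups_of_4(int h) {
  return (rows_needed(h) + 3) & ~3;
}

/* Redundant rows computed by the 4-rows-at-a-time approach. */
int wasted_rows(int h) {
  return rows_in_groups_of_4(h) - rows_needed(h);
}
```

For block heights that are multiples of 4, one row is wasted: 1 row in 12 for h = 4, but only 1 row in 72 for h = 64, so the share of redundant work shrinks as the block height grows.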
| * Refactor standard bitdepth Neon convolution functions (Jonathan Wright, 2023-05-12, 2 files, -472/+349)
| |   1) Use #define constant instead of magic numbers for right shifts.
| |   2) Move saturating narrow into helper functions that return 4-element
| |      result vectors.
| |   3) Use mem_neon.h helpers for load/store sequences in Armv8.0 paths.
| |   4) Tidy up: assert conditions and some longer variable names.
| |   5) Prefer != 0 to > 0 where possible for loop termination conditions.
| |
| |   Change-Id: Idfcac43ca38faf729dca07b8cc8f7f45ad264d24
* | Merge "Don't use -Wl,-z,defs with Clang's sanitizers" into main (James Zern, 2023-05-12, 1 file, -1/+15)
|\ \
| * | Don't use -Wl,-z,defs with Clang's sanitizers (James Zern, 2023-05-12, 1 file, -1/+15)
| | |   This avoids link errors related to the sanitizers:
| | |   https://clang.llvm.org/docs/AddressSanitizer.html#usage
| | |   "When linking shared libraries, the AddressSanitizer run-time is not
| | |   linked, so -Wl,-z,defs may cause link errors ..."
| | |
| | |   See also:
| | |   https://crbug.com/aomedia/3438
| | |
| | |   Bug: webm:1801
| | |   Fixed: webm:1801
| | |   Change-Id: Ie212318005a5f7222e5486775175534025306367
* | | configure: add -Wshadow (James Zern, 2023-05-09, 4 files, -1/+8)
|/ /
| |   libraries under third_party/ are out of scope for this change.
| |
| |   Bug: webm:1793
| |   Change-Id: I562065a3c0ea9fdfc9615d1a6b1ae47da79b8ce0
* | Merge "vp8_macros_msa.h: clear -Wshadow warnings" into main (James Zern, 2023-05-09, 1 file, -129/+129)
|\ \
| * | vp8_macros_msa.h: clear -Wshadow warnings (James Zern, 2023-05-08, 1 file, -129/+129)
| | |   Bug: webm:1793
| | |   Change-Id: Ia940b06bd23a915a050432e03bb630567e891d8d
* | | Merge changes Iac020280,I8ca8660a into main (James Zern, 2023-05-09, 3 files, -12/+36)
|\ \ \
| | | |   * changes:
| | | |     gen_msvs_vcxproj: add ARM64EC w/VS >= 2022
| | | |     configure: add clang-cl vs1[67] arm64 targets
| * | | gen_msvs_vcxproj: add ARM64EC w/VS >= 2022 (James Zern, 2023-05-08, 1 file, -0/+4)
| | | |   rather than define new targets, add a platform to the arm64 list as
| | | |   they share the same configuration.
| | | |
| | | |   Bug: webm:1788
| | | |   Change-Id: Iac020280b1103fb12b559f21439aeff26568fba4
| * | | configure: add clang-cl vs1[67] arm64 targets (James Zern, 2023-05-08, 3 files, -12/+32)
| | | |   x86 and armv7 are skipped for now as the intrinsics will need
| | | |   different flags than cl.exe (/arch:... -> -m...).
| | | |
| | | |   Bug: webm:1788
| | | |   Change-Id: I8ca8660a8644cdd84c51cb1f75005e371ba8207d
* | | | Merge "Add AVX2 intrinsic for vpx_comp_avg_pred() function" into main (Yunqing Wang, 2023-05-09, 8 files, -17/+134)
|\ \ \ \
| * | | | Add AVX2 intrinsic for vpx_comp_avg_pred() function (Anupam Pandey, 2023-05-09, 8 files, -17/+134)
| | |/ /
| |/| |
| | | |   The module level scaling w.r.t C function (timer based) for
| | | |   existing (SSE2) and new AVX2 intrinsics:
| | | |
| | | |   If ref_padding = 0
| | | |   Block       Scaling
| | | |   size      SSE2    AVX2
| | | |   8x4       3.24x   3.24x
| | | |   8x8       4.22x   4.90x
| | | |   8x16      5.91x   5.93x
| | | |   16x8      1.63x   3.52x
| | | |   16x16     1.53x   4.19x
| | | |   16x32     1.38x   4.82x
| | | |   32x16     1.28x   3.08x
| | | |   32x32     1.45x   3.13x
| | | |   32x64     1.38x   3.04x
| | | |   64x32     1.39x   2.12x
| | | |   64x64     1.46x   2.24x
| | | |
| | | |   If ref_padding = 8
| | | |   Block       Scaling
| | | |   size      SSE2    AVX2
| | | |   8x4       3.20x   3.21x
| | | |   8x8       4.61x   4.83x
| | | |   8x16      5.50x   6.45x
| | | |   16x8      1.56x   3.35x
| | | |   16x16     1.53x   4.19x
| | | |   16x32     1.37x   4.83x
| | | |   32x16     1.28x   3.07x
| | | |   32x32     1.46x   3.29x
| | | |   32x64     1.38x   3.22x
| | | |   64x32     1.38x   2.14x
| | | |   64x64     1.38x   2.12x
| | | |
| | | |   This is a bit-exact change.
| | | |
| | | |   Change-Id: I72c5d155f64d0c630bc8c3aef21dc8bbd045d9e6
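For readers unfamiliar with the operation being benchmarked, a scalar version of a compound-average predictor might look like the sketch below. The function name `comp_avg_pred_scalar` is hypothetical and this is not the library's code. It assumes the operation is a rounded average of a contiguous prediction block and a strided reference block, and that `ref_padding` in the tables corresponds to extra reference stride beyond the block width; neither assumption is stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar reference: each output pixel is the rounded average of
 * one prediction pixel and one reference pixel. comp_pred and pred are
 * assumed contiguous (stride == width); ref has its own stride. */
void comp_avg_pred_scalar(uint8_t *comp_pred, const uint8_t *pred, int width,
                          int height, const uint8_t *ref, int ref_stride) {
  assert(ref_stride >= width);
  for (int i = 0; i < height; ++i) {
    for (int j = 0; j < width; ++j) {
      /* Add 1 before the shift so that x.5 rounds up. */
      comp_pred[j] = (uint8_t)((pred[j] + ref[j] + 1) >> 1);
    }
    comp_pred += width;
    pred += width;
    ref += ref_stride;
  }
}
```

A SIMD version of such a loop is "bit-exact" when it produces the same bytes as the scalar loop for every input, which is what the commit message claims for the AVX2 path.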
* | | | Merge "README: update target list" into main (James Zern, 2023-05-08, 1 file, -0/+6)
|\ \ \ \
| | |/ /
| |/| |
| * | | README: update target list (James Zern, 2023-05-05, 1 file, -0/+6)
| | |/
| |/|
| | |   Change-Id: If2d5811a55f6bb60eeba7d28b69c78157a17e87f
* | | Merge changes Ie165d410,I6d9bb8da,I6858e574 into main (James Zern, 2023-05-08, 4 files, -10/+12)
|\ \ \
| | | |   * changes:
| | | |     vp8_[cd]x_iface: clear setjmp flag on function exit
| | | |     vp9_decodeframe,tile_worker_hook: relocate setjmp=1
| | | |     vp9,encoder_set_config: set setjmp flag after setjmp()
| * | | vp8_[cd]x_iface: clear setjmp flag on function exit (James Zern, 2023-05-05, 2 files, -8/+10)
| | | |   in vp8e_encode, also move the setjmp() call closer to setting the
| | | |   flag.
| | | |
| | | |   Change-Id: Ie165d4100b84776f9c34eddcf64657bd78cce4f5
| * | | vp9_decodeframe,tile_worker_hook: relocate setjmp=1 (James Zern, 2023-05-05, 1 file, -2/+1)
| | | |   after the call to setjmp(); this is more correct and consistent
| | | |   with other code.
| | | |
| | | |   Change-Id: I6d9bb8daad6a959bfe4f25484f9d6664b99da19e
| * | | vp9,encoder_set_config: set setjmp flag after setjmp() (James Zern, 2023-05-05, 1 file, -0/+1)
| | | |   Change-Id: I6858e574d24aaff64f725404706f58e04e43717d
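The ordering these three changes describe (call setjmp() first, set the flag afterwards, clear it on every exit) can be sketched as follows. The `ErrorInfo` struct, its field names, and both functions are hypothetical stand-ins written for this illustration; they are not the library's types.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical error context: the flag records whether jmp currently holds a
 * valid jump target. */
typedef struct {
  int setjmp_armed;
  jmp_buf jmp;
  int error_code;
} ErrorInfo;

/* Record an error; jump back only if a jump target has been armed. */
void report_error(ErrorInfo *info, int code) {
  info->error_code = code;
  if (info->setjmp_armed) longjmp(info->jmp, 1);
}

int do_work(ErrorInfo *info, int should_fail) {
  if (setjmp(info->jmp)) {
    /* Reached via longjmp(): clear the flag before leaving. */
    info->setjmp_armed = 0;
    return info->error_code;
  }
  /* Set the flag only after setjmp() has filled in jmp. */
  info->setjmp_armed = 1;
  if (should_fail) report_error(info, 7);
  /* Clear the flag on the normal exit path too, so a later error cannot
   * jump into a stack frame that no longer exists. */
  info->setjmp_armed = 0;
  return 0;
}
```

Setting the flag before setjmp() would briefly advertise a jump target that has not been initialized, and leaving it set on return would advertise one that is stale.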
* | | | Merge "Add VpxTplGopStats" into main (Jerome Jiang, 2023-05-08, 7 files, -36/+68)
|\ \ \ \
| * | | | Add VpxTplGopStats (Jerome Jiang, 2023-05-08, 7 files, -36/+68)
| | | | |   Contains the size of GOP - also the size of the list of TPL stats
| | | | |   for each frame in this GOP.
| | | | |
| | | | |   VpxTplGopStats will be the unit for VP9E_GET_TPL_STATS control to
| | | | |   return TPL stats from the encoder.
| | | | |
| | | | |   Bug: b/273736974
| | | | |   Change-Id: I1682242fc6db4aafcd6314af023aa0d704976585
* | | | | Merge "Unify implementation of CHECK_MEM_ERROR" into main (Jerome Jiang, 2023-05-08, 24 files, -197/+171)
|\ \ \ \ \
| |/ / / /
| * | | | Unify implementation of CHECK_MEM_ERROR (Jerome Jiang, 2023-05-08, 24 files, -197/+171)
| | | | |   There were multiple implementations of CHECK_MEM_ERROR across the
| | | | |   library that took different arguments and were used in different
| | | | |   places.
| | | | |
| | | | |   This CL will unify them and have only one implementation that
| | | | |   takes vpx_internal_error_info.
| | | | |
| | | | |   Change-Id: I2c568639473815bc00b1fc2b72be56e5ccba1a35
* | | | | Merge "CHECK_MEM_ERROR to return in vp9_set_roi_map" into main (Jerome Jiang, 2023-05-08, 3 files, -18/+17)
|\ \ \ \ \
| |/ / / /
| | | | /
| |_|_|/
|/| | |
| * | | CHECK_MEM_ERROR to return in vp9_set_roi_map (Jerome Jiang, 2023-05-08, 3 files, -18/+17)
| | |/
| |/|
| | |   Also change the return type of vp9_set_roi_map to vpx_codec_err_t
| | |
| | |   Change-Id: I60d9ff45f2d3dfc44cd6e2aab2cb1ba389ff15f3
* | | Merge "vp9_encoder: clear -Wshadow warning" into main (James Zern, 2023-05-06, 1 file, -4/+7)
|\ \ \
| |/ /
|/| |
| * | vp9_encoder: clear -Wshadow warning (James Zern, 2023-05-05, 1 file, -4/+7)
| |/
| |   with --enable-experimental --enable-rate-ctrl
| |
| |   Bug: webm:1793
| |   Change-Id: I9ca664538bcf0c2aca8aea73283bbb0232eb86e9
* | Merge "Set setjmp flag in VP9 RTC rate control library" into main (Jerome Jiang, 2023-05-05, 1 file, -0/+11)
|\ \
| * | Set setjmp flag in VP9 RTC rate control library (Jerome Jiang, 2023-05-05, 1 file, -0/+11)
| | |   Change-Id: Ic5ec8dc7d9637091d4137a47d793cf29e76fdc45
* | | sixtap_filter_msa.c: clear -Wshadow warnings (James Zern, 2023-05-05, 1 file, -74/+107)
| | |   Bug: webm:1793
| | |   Change-Id: I5f9c09f31b06fecc123c6a9d01f5fbed39142356
* | | Merge "macros_msa.h: clear -Wshadow warnings" into main (James Zern, 2023-05-05, 1 file, -28/+28)
|\ \ \
| * | | macros_msa.h: clear -Wshadow warnings (James Zern, 2023-05-05, 1 file, -28/+28)
| | | |   Bug: webm:1793
| | | |   Change-Id: Ib2e3bd3c52632cdd4410cb2c54d69750e64e5201
* | | | Merge changes I8089e90a,I46890224,I1b0e090d into main (James Zern, 2023-05-05, 6 files, -7/+23)
|\ \ \ \
| |_|/ /
|/| | |
| | | |   * changes:
| | | |     Overwrite cm->error->detail before freeing
| | | |     Have vpx_codec_error take const vpx_codec_ctx_t *
| | | |     Add comments about vpx_codec_enc_init_ver failure
| * | | Overwrite cm->error->detail before freeing (Wan-Teh Chang, 2023-05-04, 2 files, -1/+8)
| | | |   Help detect use after free of the return value of
| | | |   vpx_codec_error_detail().
| | | |
| | | |   If vpx_codec_error_detail() is called after vpx_codec_encode()
| | | |   fails, the return value may be equal to cm->error->detail, which
| | | |   is freed when vpx_codec_destroy() is called.
| | | |
| | | |   Document the lifetime of the string returned by
| | | |   vpx_codec_error_detail().
| | | |
| | | |   Change-Id: I8089e90a4499b4f3cc5b9cfdbb25d72368faa319
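The lifetime issue described above suggests that a caller who wants to keep the error detail should copy it before destroying the context. The sketch below shows that pattern with a self-contained mock: `MockCtx`, `mock_error_detail`, `mock_destroy`, and `save_error_detail` are all hypothetical names invented for this example and do not exist in the library, and zeroing the string with memset() is only one assumed way to "overwrite before freeing".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context that owns its error detail string. */
typedef struct {
  char *detail;
} MockCtx;

/* Getter: the returned pointer is only valid until mock_destroy(). */
const char *mock_error_detail(const MockCtx *ctx) { return ctx->detail; }

/* Overwrite the string before freeing it, so a stale pointer is more likely
 * to be noticed than to silently appear to work. */
void mock_destroy(MockCtx *ctx) {
  if (ctx->detail != NULL) {
    memset(ctx->detail, 0, strlen(ctx->detail));
    free(ctx->detail);
    ctx->detail = NULL;
  }
}

/* Copy the detail into a caller-owned buffer while it is still valid. */
void save_error_detail(const MockCtx *ctx, char *buf, size_t buf_size) {
  const char *detail = mock_error_detail(ctx);
  if (buf_size == 0) return;
  if (detail == NULL) detail = "";
  strncpy(buf, detail, buf_size - 1);
  buf[buf_size - 1] = '\0';
}
```

With the real API the same idea would apply: read the detail string and copy it before the context is destroyed, rather than holding on to the returned pointer.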
| * | | Have vpx_codec_error take const vpx_codec_ctx_t * (Wan-Teh Chang, 2023-05-04, 2 files, -5/+5)
| | | |   Also have vpx_codec_error_detail take vpx_codec_ctx_t *. Both
| | | |   functions are getter functions that don't modify the codec
| | | |   context.
| | | |
| | | |   Change-Id: I4689022425efbf7b1da5034255ac052fce5e5b4f
| * | | Add comments about vpx_codec_enc_init_ver failure (Wan-Teh Chang, 2023-05-04, 3 files, -1/+10)
| |/ /
| | |   Address the questions:
| | |   1. If vpx_codec_enc_init_ver() fails, should I still call
| | |      vpx_codec_destroy() on the encoder context?
| | |   2. Is it safe to call vpx_codec_error_detail() when
| | |      vpx_codec_enc_init_ver() failed?
| | |
| | |   Change-Id: I1b0e090d11dd9f853fe203f4cbb6080c3c7b0506
* | | Merge "vpx_subpixel_8t_intrin_avx2,cosmetics: shorten long comment" into main (James Zern, 2023-05-05, 1 file, -2/+2)
|\ \ \
| |_|/
|/| |
| * | vpx_subpixel_8t_intrin_avx2,cosmetics: shorten long comment (James Zern, 2023-05-04, 1 file, -2/+2)
| |/
| |   Change-Id: I8badedc2ad07d60896e45de28b707ad9f6c4d499
* | Merge "Add AVX2 intrinsic for idct16x16 and idct32x32 functions" into main (Yunqing Wang, 2023-05-05, 8 files, -8/+926)
|\ \
| |/
|/|
| * Add AVX2 intrinsic for idct16x16 and idct32x32 functions (Anupam Pandey, 2023-05-05, 8 files, -8/+926)
| |   Added AVX2 intrinsic optimization for the following functions
| |   1. vpx_idct16x16_256_add
| |   2. vpx_idct32x32_1024_add
| |   3. vpx_idct32x32_135_add
| |
| |   The module level scaling w.r.t C function (timer based) for
| |   existing (SSE2) and new AVX2 intrinsics:
| |
| |                                 Scaling
| |   Function Name               SSE2    AVX2
| |   vpx_idct32x32_1024_add      3.62x   7.49x
| |   vpx_idct32x32_135_add       4.85x   9.41x
| |   vpx_idct16x16_256_add       4.82x   7.70x
| |
| |   This is a bit-exact change.
| |
| |   Change-Id: Id9dda933aa1f5093bb6b35ac3b8a41846afca9d2
* | Merge "Add num_blocks to VpxTplFrameStats" into main (Jerome Jiang, 2023-05-04, 4 files, -1/+4)
|\ \
| * | Add num_blocks to VpxTplFrameStats (Jerome Jiang, 2023-05-04, 4 files, -1/+4)
| | |   I realized the calculation of the size of the list of
| | |   VpxTplBlockStats is non-trivial. So it's better to add the field
| | |   for the size.
| | |
| | |   Bug: b/273736974
| | |   Change-Id: Ic1b50597c1f89a8f866b5669ca676407be6dc9d8
* | | Merge "Add Vpx* prefix to Tpl{Block,Frame}Stats" into main (Jerome Jiang, 2023-05-04, 7 files, -17/+18)
|\ \ \
| |/ /
| * | Add Vpx* prefix to Tpl{Block,Frame}Stats (Jerome Jiang, 2023-05-04, 7 files, -17/+18)
| | |   This is to avoid symbol redefinition when integrating with other
| | |   libraries.
| | |
| | |   Bug: b/273736974
| | |   Change-Id: I891af78b1907504d5bb9f735164aea18c2aba944
* | | Merge changes I226215a2,Ia4918eb0,If6219446,Ibf00a6e1,I900a0a48 into main (Chi Yo Tsai, 2023-05-04, 11 files, -103/+105)
|\ \ \
| |/ /
|/| |
| | |   * changes:
| | |     Fix mismatched param names in vpx_dsp/x86/sad4d_avx2.c
| | |     Fix mismatched param names in vpx_dsp/arm/highbd_sad4d_neon.c
| | |     Fix mismatched param names in vpx_dsp/arm/sad4d_neon.c
| | |     Fix mismatched param names in vpx_dsp/arm/highbd_avg_neon.c
| | |     Fix clang warning on const-qualification of parameters
| * | Fix mismatched param names in vpx_dsp/x86/sad4d_avx2.c (chiyotsai, 2023-05-03, 1 file, -30/+30)
| | |   Change-Id: I226215a2ff8798b72abe0c2caf3d18875595caa5
| * | Fix mismatched param names in vpx_dsp/arm/highbd_sad4d_neon.c (chiyotsai, 2023-05-03, 1 file, -14/+15)
| | |   Change-Id: Ia4918eb0bac3b28b27e1ef205b9171680b2eb9a4
| * | Fix mismatched param names in vpx_dsp/arm/sad4d_neon.c (chiyotsai, 2023-05-03, 1 file, -15/+17)
| | |   Change-Id: If621944684cf9bb9f353db5961ed8b4b4ae38f24
| * | Fix mismatched param names in vpx_dsp/arm/highbd_avg_neon.c (chiyotsai, 2023-05-03, 1 file, -30/+29)
| | |   Change-Id: Ibf00a6e1029284e637b10ef01ac9b31ffadc74ca
| * | Fix clang warning on const-qualification of parameters (chiyotsai, 2023-05-03, 7 files, -14/+14)
| | |   Change-Id: I900a0a48dde5fcb262157b191ac536e18269feb3