author | Dale Curtis <dalecurtis@chromium.org> | 2022-12-16 22:37:46 +0000 |
---|---|---|
committer | Michael Brüning <michael.bruning@qt.io> | 2023-03-27 08:12:03 +0000 |
commit | c885ec409f9b6ffa25e03851729b1bc2ad2005b3 (patch) | |
tree | 0c9f205efc231ede87d2704b2780d1569caf5111 /chromium/third_party/dav1d/libdav1d/src/x86/itx_avx2.asm | |
parent | 0d63fc949d16f3e37ed7ab43d335b9d81cc6fdf7 (diff) | |
download | qtwebengine-chromium-c885ec409f9b6ffa25e03851729b1bc2ad2005b3.tar.gz |
[Backport] Security bug 1401571 (102-based)
Manual update of libdav1d to match the version introduced by patch
https://chromium-review.googlesource.com/c/chromium/src/+/4114163:
Roll src/third_party/dav1d/libdav1d/ 87f9a81cd..ed63a7459 (104 commits)
This roll required a few changes to get working:
- "properties" => "built in options" crossfile configuration change due to Meson deprecation.
- generic config creation never worked, so fixed.
- PPC64 configs were never checked in, so switched to generic.
- copyright header changes for generate_sources.
- Updated readme.chromium with potential issues that can arise.
https://chromium.googlesource.com/external/github.com/videolan/dav1d.git/+log/87f9a81cd770..ed63a7459376
$ git log 87f9a81cd..ed63a7459 --date=short --no-merges --format='%ad %ae %s'
2022-12-09 jamrial dav1d: add an option to skip decoding some frame types
2022-12-08 jamrial picture: support creating and freeing refs without tile data
2022-12-07 gramner x86: Add 10bpc 8x32/32x8 itx AVX-512 (Ice Lake) asm
2022-12-07 gramner x86: Add minor DC-only IDCT optimizations
2022-12-13 gramner getbits: Fix assertion failure
2022-12-13 gramner checkasm: Fix integer overflow in refmvs test
2022-01-26 gramner dav1dplay: Update to new libplacebo API
2022-12-09 gramner Add minor getbits improvements
2022-12-09 gramner Add a separate getbits function for getting a single bit
2022-12-09 gramner Remove redundant zeroing in sequence header parsing
2022-12-09 gramner Set the correct default value of initial_display_delay
2022-12-09 jamrial tools: remove the null last entry in inloop_filters_tbl
2022-12-04 lu_zero Do not assume the picture allocation starts as the left edge
2022-11-21 lu_zero ppc: Allocate the correct temp buffer size
2022-11-21 lu_zero ppc: Do not use static const with vec_splats
2022-11-02 charlie.c.hayden Add info to dav1d_send_data docs
2022-10-30 jbeich build: drop -D_DARWIN_C_SOURCE on macOS/iOS after 6b611d36acab
2022-10-30 jbeich build: drop -D_POSIX_C_SOURCE on non-Linux after 6b611d36acab
2022-06-28 victorien threading: Add a pending list for async task insertion
2022-10-26 martin Implement atomic_compare_exchange_strong in the atomic compat headers
2022-10-06 victorien threading: Fix a race around frame completion (frame-mt)
2022-10-07 sebastian Handle host_machine.system() 'ios' and 'tvos' the same way as 'darwin'
2022-09-23 gramner x86: Add 10-bit 8x8/8x16/16x8/16x16 itx AVX-512 (Ice Lake) asm
2022-09-30 gramner Specify hidden visibility for global data symbol declarations
2022-09-28 gramner build: strip() the result of cc.get_define()
2022-09-26 gramner checkasm: Move printf format string to .rodata on x86
2022-09-26 gramner checkasm: Improve 32-bit parameter clobbering on x86-64
2022-09-26 gramner x86: Fix incorrect 32-bit parameter usage in high bit-depth AVX-512 mc
2022-09-09 martin arm: itx: Add clipping to row_clip_min/max in the 10 bpc codepaths
2022-09-15 gramner x86: Fix overflows in 12bpc AVX2 IDCT/IADST
2022-09-15 gramner x86: Fix overflows in 12bpc AVX2 DC-only IDCT
2022-09-15 gramner x86: Fix clipping in high bit-depth AVX2 4x16 IDCT
2022-03-21 martin Don't use gas-preprocessor with clang-cl for arm targets
2022-06-07 david_conrad Fix checking the reference dimesions for the projection process
2022-06-07 david_conrad Fix calculation of OBMC lap dimensions
2022-06-07 david_conrad Support film grain application whose only effect is clipping to video range
2022-06-07 david_conrad Ignore T.35 metadata if the OBU contains no payload
2022-06-07 david_conrad Fix chroma deblock filter size calculation for lossless
2022-06-07 david_conrad Fix rounding in the calculation of initialSubpelX
2022-06-07 david_conrad Fix overflow when saturating dequantized coefficients clipped to 0
2022-06-08 david_conrad Fix overflow in 8-bit NEON ADST
2022-09-14 martin tools: Allocate the priv structs with proper alignment
2022-09-08 gramner x86: Fix clipping in 10bpc SSE4.1 IDCT asm
2022-09-08 gramner build: Improve Windows linking options
2022-09-08 gramner tools: Improve demuxer probing
2022-08-30 code CI: Disable trimming on some tests
2022-08-30 code CI: Remove git 'safe.directory' config
2022-08-30 code gcovr: Ignore parsing errors
2022-08-30 code crossfiles: Update Android toolchains
2022-08-30 code CI: Update images
(...)
2022-09-01 victorien checkasm: Add short options
2022-09-01 victorien checkasm: Add pattern matching to --test
2022-09-01 victorien checkasm: Remove pattern matching from --bench
2022-08-29 victorien checkasm: Add a --function option
2022-08-30 victorien threading: Fix copy_lpf_progress initialization
2022-08-19 jamrial data: don't overwrite the Dav1dDataProps size value
2022-07-18 gramner Adjust inlining attributes on some functions
2022-07-19 gramner x86: Remove leftover instruction in loopfilter AVX2 asm
2022-06-07 david_conrad Enable pointer authentication in assembly when building arm64e
2022-06-07 david_conrad Don't trash the return stack buffer in the NEON loop filter
2022-07-03 thresh CI: Removed snap package generation
2022-07-06 gramner Eliminate unused C DSP functions at compile time
2022-07-06 gramner cpu: Inline dav1d_get_cpu_flags()
2022-06-22 gramner x86: Add minor loopfilter asm improvements
2022-06-15 gramner checkasm: Speed up signal handling
2022-06-15 gramner checkasm: Improve seed generation on Windows
2022-06-20 gramner ci: Don't specify a specific MacOS version
2022-06-14 gramner x86: Add high bit-depth loopfilter AVX-512 (Ice Lake) asm
2022-06-13 victorien checkasm/lpf: Use operating dimensions
2022-06-03 gramner checkasm: Print the cpu model and cpuid signature on x86
2022-06-03 gramner checkasm: Add a vzeroupper check on x86
2022-06-02 gramner x86: Add a workaround for quirky AVX-512 hardware behavior
2022-05-31 victorien checkasm: Fix uninitialized variable
2022-05-14 code CI: Update coverage collecting
2022-05-05 code CI: Add a build with the minimum requirements
2022-05-05 code CI: Deactivate git 'safe.directory'
2022-03-24 code CI: Update images
2022-05-25 victorien Fix typo
2022-05-19 gramner x86: Add high bit-depth cdef_filter AVX-512 (Ice Lake) asm
2022-05-20 gramner checkasm: Print --help message to stderr instead of stdout
2022-05-20 gramner checkasm: Split cdef test into separate pri/sec/pri+sec parts
2022-05-20 gramner checkasm: Improve benchmarking of functions that modify their input
2022-05-18 b x86/itx_avx2: fix typo
2022-04-22 code CI: Add gcc12 and clang14 builds with mold linker
2022-04-26 code CI: Trigger documentation rebuild if configuration changes
2022-04-24 code meson/doc: Fix doxygen config
2022-04-28 gramner Use a relaxed memory ordering in dav1d_ref_inc()
2022-04-28 gramner Remove redundant code in dav1d_cdf_thread_unref()
2022-04-28 gramner Inline dav1d_ref_inc()
2022-04-24 code x86/itx: Add 32x8 12bpc AVX2 transforms
2022-04-24 code x86/itx: Add 8x32 12bpc AVX2 transforms
2022-04-24 code x86/itx: Deduplicate dconly code
2022-04-23 code lib: Fix typo in documentation
2022-04-07 jamrial obu: don't output invisible but showable key frames more than once
2022-04-07 jamrial obu: check that the frame referenced by existing_frame_idx is showable
2022-04-07 jamrial obu: check refresh_frame_flags is not equal to allFrames on Intra Only frames
2022-03-29 robux4 remove multipass wait from dav1d_decode_frame
2022-04-07 jamrial picture: ensure the new seq header and op param info flags are attached to the next visible picture in display order
2022-03-31 jamrial lib: add a function to query the decoder frame delay
2022-03-31 jamrial lib: split calculating thread count to its own function
Created with:
roll-dep src/third_party/dav1d/libdav1d
Fixed: 1401571
Change-Id: Ic3cef540a87a2cf411abe6071fd4c9963ea61f75
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4114163
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1084574}
Reviewed-on: https://codereview.qt-project.org/c/qt/qtwebengine-chromium/+/468619
Reviewed-by: Michal Klocek <michal.klocek@qt.io>
Diffstat (limited to 'chromium/third_party/dav1d/libdav1d/src/x86/itx_avx2.asm')
-rw-r--r-- | chromium/third_party/dav1d/libdav1d/src/x86/itx_avx2.asm | 84 |
1 files changed, 35 insertions, 49 deletions
diff --git a/chromium/third_party/dav1d/libdav1d/src/x86/itx_avx2.asm b/chromium/third_party/dav1d/libdav1d/src/x86/itx_avx2.asm
index 092c842786d..a67f053a61b 100644
--- a/chromium/third_party/dav1d/libdav1d/src/x86/itx_avx2.asm
+++ b/chromium/third_party/dav1d/libdav1d/src/x86/itx_avx2.asm
@@ -126,7 +126,7 @@ pw_m2751_3035x8: dw -2751*8, 3035*8
 
 SECTION .text
 
-; Code size reduction trickery: Intead of using rip-relative loads with
+; Code size reduction trickery: Instead of using rip-relative loads with
 ; mandatory 4-byte offsets everywhere, we can set up a base pointer with a
 ; single rip-relative lea and then address things relative from that with
 ; 1-byte offsets as long as data is within +-128 bytes of the base pointer.
@@ -1194,13 +1194,9 @@ cglobal iidentity_4x16_internal_8bpc, 0, 5, 11, dst, stride, c, eob, tx2
 %ifidn %1_%2, dct_dct
     movd xm1, [o(pw_2896x8)]
     pmulhrsw xm0, xm1, [cq]
+    mov [cq], eobd
     pmulhrsw xm0, xm1
-    movd xm2, [o(pw_2048)]
-    pmulhrsw xm0, xm1
-    pmulhrsw xm0, xm2
-    vpbroadcastw m0, xm0
-    mova m1, m0
-    jmp m(iadst_8x4_internal_8bpc).end3
+    jmp m(inv_txfm_add_dct_dct_8x8_8bpc).dconly2
 %endif
 %endmacro
 
@@ -1340,20 +1336,20 @@ cglobal iidentity_8x4_internal_8bpc, 0, 5, 7, dst, stride, c, eob, tx2
     pmulhrsw xm0, xm1, [cq]
     movd xm2, [o(pw_16384)]
     mov [cq], eobd
+    or r3d, 8
+.dconly:
     pmulhrsw xm0, xm2
-    psrlw xm2, 3 ; pw_2048
+.dconly2:
+    movd xm2, [pw_2048]
     pmulhrsw xm0, xm1
+    lea r2, [strideq*3]
     pmulhrsw xm0, xm2
     vpbroadcastw m0, xm0
-.end:
-    mov r2d, 2
-.end2:
-    lea r3, [strideq*3]
-.loop:
-    WRITE_8X4 0, 0, 1, 2
+.dconly_loop:
+    WRITE_8X4 0, 0, 1, 2, strideq*1, strideq*2, r2
     lea dstq, [dstq+strideq*4]
-    dec r2d
-    jg .loop
+    sub r3d, 4
+    jg .dconly_loop
     RET
 %endif
 %endmacro
@@ -1543,13 +1539,8 @@ cglobal iidentity_8x8_internal_8bpc, 0, 5, 7, dst, stride, c, eob, tx2
     movd xm2, [o(pw_16384)]
     mov [cq], eobd
     pmulhrsw xm0, xm1
-    pmulhrsw xm0, xm2
-    psrlw xm2, 3 ; pw_2048
-    pmulhrsw xm0, xm1
-    pmulhrsw xm0, xm2
-    vpbroadcastw m0, xm0
-    mov r2d, 4
-    jmp m(inv_txfm_add_dct_dct_8x8_8bpc).end2
+    or r3d, 16
+    jmp m(inv_txfm_add_dct_dct_8x8_8bpc).dconly
 %endif
 %endmacro
 
@@ -1902,7 +1893,7 @@ cglobal iidentity_8x16_internal_8bpc, 0, 5, 13, dst, stride, c, eob, tx2
     pmulhrsw xm0, xm1, [cq]
     movd xm2, [o(pw_16384)]
     mov [cq], eobd
-    mov r2d, 2
+    or r3d, 4
 .dconly:
     pmulhrsw xm0, xm2
     movd xm2, [pw_2048] ; intentionally rip-relative
@@ -1911,17 +1902,17 @@ cglobal iidentity_8x16_internal_8bpc, 0, 5, 13, dst, stride, c, eob, tx2
     vpbroadcastw m0, xm0
     pxor m3, m3
 .dconly_loop:
-    mova xm1, [dstq]
-    vinserti128 m1, [dstq+strideq], 1
+    mova xm1, [dstq+strideq*0]
+    vinserti128 m1, [dstq+strideq*1], 1
     punpckhbw m2, m1, m3
     punpcklbw m1, m3
     paddw m2, m0
     paddw m1, m0
     packuswb m1, m2
-    mova [dstq], xm1
-    vextracti128 [dstq+strideq], m1, 1
+    mova [dstq+strideq*0], xm1
+    vextracti128 [dstq+strideq*1], m1, 1
     lea dstq, [dstq+strideq*2]
-    dec r2d
+    sub r3d, 2
     jg .dconly_loop
     RET
 %endif
@@ -2162,7 +2153,7 @@ cglobal iidentity_16x4_internal_8bpc, 0, 5, 11, dst, stride, c, eob, tx2
     movd xm2, [o(pw_16384)]
     mov [cq], eobd
     pmulhrsw xm0, xm1
-    mov r2d, 4
+    or r3d, 8
     jmp m(inv_txfm_add_dct_dct_16x4_8bpc).dconly
 %endif
 %endmacro
@@ -2473,7 +2464,7 @@ cglobal iidentity_16x8_internal_8bpc, 0, 5, 13, dst, stride, c, eob, tx2
     pmulhrsw xm0, xm1, [cq]
     movd xm2, [o(pw_8192)]
     mov [cq], eobd
-    mov r2d, 8
+    or r3d, 16
     jmp m(inv_txfm_add_dct_dct_16x4_8bpc).dconly
 %endif
 %endmacro
@@ -3120,13 +3111,8 @@ cglobal inv_txfm_add_dct_dct_8x32_8bpc, 4, 4, 0, dst, stride, c, eob
     pmulhrsw xm0, xm1, [cq]
     movd xm2, [o(pw_8192)]
     mov [cq], eobd
-    pmulhrsw xm0, xm2
-    psrlw xm2, 2 ; pw_2048
-    pmulhrsw xm0, xm1
-    pmulhrsw xm0, xm2
-    vpbroadcastw m0, xm0
-    mov r2d, 8
-    jmp m(inv_txfm_add_dct_dct_8x8_8bpc).end2
+    or r3d, 32
+    jmp m(inv_txfm_add_dct_dct_8x8_8bpc).dconly
 .full:
     REPX {pmulhrsw x, m9}, m12, m13, m14, m15
     pmulhrsw m6, m9, [rsp+32*2]
@@ -3290,7 +3276,7 @@ cglobal inv_txfm_add_dct_dct_32x8_8bpc, 4, 4, 0, dst, stride, c, eob
     pmulhrsw xm0, xm1, [cq]
     movd xm2, [o(pw_8192)]
     mov [cq], eobd
-    mov r2d, 8
+    or r3d, 8
 .dconly:
     pmulhrsw xm0, xm2
     movd xm2, [pw_2048] ; intentionally rip-relative
@@ -3307,7 +3293,7 @@ cglobal inv_txfm_add_dct_dct_32x8_8bpc, 4, 4, 0, dst, stride, c, eob
     packuswb m1, m2
     mova [dstq], m1
     add dstq, strideq
-    dec r2d
+    dec r3d
     jg .dconly_loop
     RET
 .normal:
@@ -3672,7 +3658,7 @@ cglobal inv_txfm_add_dct_dct_16x32_8bpc, 4, 4, 0, dst, stride, c, eob
     movd xm2, [o(pw_16384)]
     mov [cq], eobd
     pmulhrsw xm0, xm1
-    mov r2d, 16
+    or r3d, 32
     jmp m(inv_txfm_add_dct_dct_16x4_8bpc).dconly
 .full:
     mova [tmp1q-32*4], m1
@@ -3991,7 +3977,7 @@ cglobal inv_txfm_add_dct_dct_32x16_8bpc, 4, 4, 0, dst, stride, c, eob
     movd xm2, [o(pw_16384)]
     mov [cq], eobd
     pmulhrsw xm0, xm1
-    mov r2d, 16
+    or r3d, 16
     jmp m(inv_txfm_add_dct_dct_32x8_8bpc).dconly
 .normal:
     PROLOGUE 0, 6, 16, 32*19, dst, stride, c, eob, tmp1, tmp2
@@ -4222,7 +4208,7 @@ cglobal inv_txfm_add_dct_dct_32x32_8bpc, 4, 4, 0, dst, stride, c, eob
     pmulhrsw xm0, xm1, [cq]
     movd xm2, [o(pw_8192)]
     mov [cq], eobd
-    mov r2d, 32
+    or r3d, 32
     jmp m(inv_txfm_add_dct_dct_32x8_8bpc).dconly
 .normal:
     PROLOGUE 0, 9, 16, 32*67, dst, stride, c, eob, tmp1, tmp2, \
@@ -4486,7 +4472,7 @@ cglobal inv_txfm_add_dct_dct_16x64_8bpc, 4, 4, 0, dst, stride, c, eob
     pmulhrsw xm0, xm1, [cq]
     movd xm2, [o(pw_8192)]
     mov [cq], eobd
-    mov r2d, 32
+    or r3d, 64
     jmp m(inv_txfm_add_dct_dct_16x4_8bpc).dconly
 .normal:
     PROLOGUE 0, 10, 16, 32*67, dst, stride, c, eob, tmp1, tmp2
@@ -4832,7 +4818,7 @@ cglobal inv_txfm_add_dct_dct_64x16_8bpc, 4, 4, 0, dst, stride, c, eob
     pmulhrsw xm0, xm1, [cq]
     movd xm2, [o(pw_8192)]
     mov [cq], eobd
-    mov r2d, 16
+    or r3d, 16
 .dconly:
     pmulhrsw xm0, xm2
     movd xm2, [o(pw_2048)]
@@ -4856,7 +4842,7 @@ cglobal inv_txfm_add_dct_dct_64x16_8bpc, 4, 4, 0, dst, stride, c, eob
     mova [dstq+32*0], m2
     mova [dstq+32*1], m3
     add dstq, strideq
-    dec r2d
+    dec r3d
     jg .dconly_loop
     RET
 .normal:
@@ -4997,7 +4983,7 @@ cglobal inv_txfm_add_dct_dct_32x64_8bpc, 4, 4, 0, dst, stride, c, eob
     movd xm2, [o(pw_16384)]
     mov [cq], eobd
     pmulhrsw xm0, xm1
-    mov r2d, 64
+    or r3d, 64
     jmp m(inv_txfm_add_dct_dct_32x8_8bpc).dconly
 .normal:
     PROLOGUE 0, 11, 16, 32*99, dst, stride, c, eob, tmp1, tmp2
@@ -5200,7 +5186,7 @@ cglobal inv_txfm_add_dct_dct_64x32_8bpc, 4, 4, 0, dst, stride, c, eob
     movd xm2, [o(pw_16384)]
     mov [cq], eobd
     pmulhrsw xm0, xm1
-    mov r2d, 32
+    or r3d, 32
     jmp m(inv_txfm_add_dct_dct_64x16_8bpc).dconly
 .normal:
     PROLOGUE 0, 9, 16, 32*131, dst, stride, c, eob, tmp1, tmp2, \
@@ -5381,7 +5367,7 @@ cglobal inv_txfm_add_dct_dct_64x64_8bpc, 4, 4, 0, dst, stride, c, eob
     pmulhrsw xm0, xm1, [cq]
     movd xm2, [o(pw_8192)]
     mov [cq], eobd
-    mov r2d, 64
+    or r3d, 64
     jmp m(inv_txfm_add_dct_dct_64x16_8bpc).dconly
 .normal:
     PROLOGUE 0, 11, 16, 32*199, dst, stride, c, eob, tmp1, tmp2
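For readers less familiar with the assembly, the DC-only path touched by this diff can be approximated by a scalar C model. This is only a sketch under stated assumptions, not the dav1d implementation: the helper names (pmulhrsw_lane, dc_only_scale_8x8, dc_only_add) are hypothetical, the scaling chain is read off the 8x8 hunk, and xm1 is assumed to hold pw_2896x8 on entry because that load falls outside the hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one 16-bit lane of x86 PMULHRSW: signed multiply, then a
 * rounded shift right by 15. Assumes >> on a negative int is an arithmetic
 * shift, which holds on mainstream compilers. */
static int16_t pmulhrsw_lane(int16_t a, int16_t b) {
    return (int16_t)(((int32_t)a * b + 0x4000) >> 15);
}

/* Scaling chain as read from the 8x8 hunk: 2896*8, 16384, 2896*8, 2048.
 * That the first and third factors are both pw_2896x8 is an assumption. */
static int16_t dc_only_scale_8x8(int16_t coef) {
    const int16_t pw_2896x8 = 2896 * 8;
    int16_t dc = pmulhrsw_lane(coef, pw_2896x8);
    dc = pmulhrsw_lane(dc, 16384);
    dc = pmulhrsw_lane(dc, pw_2896x8);
    return pmulhrsw_lane(dc, 2048);
}

/* Hypothetical helper: add the broadcast DC value to a w x h block of 8-bit
 * pixels with unsigned saturation, counting rows down in the same way the
 * new .dconly_loop counts r3d down. */
static void dc_only_add(uint8_t *dst, ptrdiff_t stride, int w, int h,
                        int16_t dc) {
    assert(w > 0 && h > 0);
    for (int rows = h; rows > 0; rows--, dst += stride) {
        for (int x = 0; x < w; x++) {
            int v = dst[x] + dc;
            dst[x] = (uint8_t)(v < 0 ? 0 : v > 255 ? 255 : v);
        }
    }
}
```

In this model the row counter starts at the block height, which matches the new code loading the height into r3d and subtracting the number of rows written per iteration. The use of "or r3d, N" rather than "mov" presumably relies on eob being zero on this path (the same register is stored to [cq] to clear the coefficient), which would give a shorter instruction encoding; the diff itself does not state this.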