path: root/libavutil/arm
Commit message   Author   Age   Files   Lines
* arm: relax byte-swap assembler constraints   Rémi Denis-Courmont   2022-09-03   1   -7/+10
| | | | | | | | | | | There is no particular reason to force the compiler to use the same register as output and input operand. Doing so forces an extra MOV instruction if the input value needs to be reused after the swap. In most cases, this makes no difference, as the compiler will select the same register for both operands either way. Signed-off-by: Martin Storsjö <martin@martin.st>
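The constraint change described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `bswap32_sketch` is a hypothetical name, the ARM branch assumes GCC-style inline assembly on ARMv6 or later, and other targets fall back to portable C so the sketch still builds.

```c
#include <stdint.h>

/* Hypothetical helper for illustration; not the libavutil function. */
static inline uint32_t bswap32_sketch(uint32_t x)
{
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 6
    uint32_t out;
    /* Separate output ("=r") and input ("r") operands: the compiler may
     * still pick the same register for both, but is not forced to.
     * A tied constraint ("+r" or "0") would require the input and the
     * result to share a register, which costs a MOV whenever the input
     * is still needed after the swap. */
    __asm__ ("rev %0, %1" : "=r"(out) : "r"(x));
    return out;
#else
    /* Portable fallback (an assumption of this sketch). */
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
}
```

Calling `bswap32_sketch(v)` and then using `v` again is the case where the relaxed constraints can save an instruction.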
* arm: Check the build time constants in av_clip_*intp2   Martin Storsjö   2022-09-02   1   -6/+18
| | | | | | This fixes building for arm targets with optimizations disabled. Signed-off-by: Martin Storsjö <martin@martin.st>
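One way such a build-time check can look is sketched below. This is an assumption-laden illustration rather than the commit's code: the names `clip_intp2_generic` and `clip_intp2_sketch` are hypothetical, the ARM branch assumes GCC-style inline assembly in ARM or Thumb-2 state on ARMv6 or later, and the fallback mirrors the usual generic C clipping formula.

```c
/* Always-inline helper macro; the attribute is a GCC/clang extension. */
#if defined(__GNUC__)
#define SKETCH_INLINE static inline __attribute__((always_inline))
#else
#define SKETCH_INLINE static inline
#endif

/* Portable clip of a to the signed range [-(1 << p), (1 << p) - 1].
 * Relies on arithmetic right shift of negative ints, as GCC provides. */
SKETCH_INLINE int clip_intp2_generic(int a, int p)
{
    if (((unsigned)a + (1u << p)) & ~((2u << p) - 1))
        return (a >> 31) ^ ((1 << p) - 1);
    return a;
}

SKETCH_INLINE int clip_intp2_sketch(int a, int p)
{
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 6
    /* The "i" constraint needs an immediate. Without optimizations p is
     * not known to be constant, so only take the asm path when the
     * compiler can prove it. */
    if (__builtin_constant_p(p)) {
        int x;
        __asm__ ("ssat %0, %2, %1" : "=r"(x) : "r"(a), "i"(p + 1));
        return x;
    }
#endif
    return clip_intp2_generic(a, p);
}
```

With optimizations disabled, the check folds to false and the generic path is used, which avoids the impossible-constraint error.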
* avutil: use getauxval(3) for CPU capabilities on linux/android ARM   Aman Karmani   2022-02-07   1   -3/+21
| | | | | | | | | | getauxval is marginally faster, and works even when procfs is not mounted. Support on Linux was added in glibc 2.16; support on Android was added in 4.4 (API 20). Fixes #6578. Signed-off-by: Aman Karmani <aman@tmm1.net>
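A minimal sketch of querying capabilities through getauxval(3) follows. The function names are hypothetical and the `SKETCH_HWCAP_*` constants are defined locally for illustration; they follow the 32-bit ARM Linux HWCAP bit layout and are only meaningful on that architecture.

```c
#if defined(__linux__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#endif

/* Bit positions as used for 32-bit ARM on Linux (meaningless elsewhere). */
#define SKETCH_HWCAP_VFP   (1UL << 6)
#define SKETCH_HWCAP_NEON  (1UL << 12)
#define SKETCH_HWCAP_VFPV3 (1UL << 13)

/* Read the hardware capability word from the auxiliary vector.
 * No procfs access is involved. Returns 0 when unavailable. */
static unsigned long read_hwcap_sketch(void)
{
#if defined(__linux__)
    return getauxval(AT_HWCAP);
#else
    return 0;
#endif
}

/* Test a single capability bit in a previously read HWCAP word. */
static int hwcap_has_sketch(unsigned long hwcap, unsigned long bit)
{
    return (hwcap & bit) != 0;
}
```

A caller would read the word once with `read_hwcap_sketch()` and then test it, for example `hwcap_has_sketch(hwcap, SKETCH_HWCAP_NEON)`.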
* arm/aarch64: Use mach_absolute_time as timer on apple platforms   Martin Storsjö   2021-02-21   1   -1/+7
| | | | | | | | | This is much less precise than the cycle counter register, but the cycle counter register is not available on apple platforms (and on linux, it requires a kernel module for allowing user mode access). Signed-off-by: Martin Storsjö <martin@martin.st>
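The timer selection can be sketched as below. `read_time_sketch` is a hypothetical name, and the non-Apple branch uses the standard C `clock()` purely as a stand-in so the sketch builds everywhere; it is not what the project uses on other platforms.

```c
#include <stdint.h>
#if defined(__APPLE__)
#include <mach/mach_time.h>   /* mach_absolute_time() */
#else
#include <time.h>             /* clock(), stand-in for this sketch only */
#endif

/* Return a tick count suitable for measuring short code sections. */
static inline uint64_t read_time_sketch(void)
{
#if defined(__APPLE__)
    /* Coarser than a cycle counter, but usable from user mode. */
    return mach_absolute_time();
#else
    return (uint64_t)clock();
#endif
}
```

Taking the difference of two readings around a code section gives its cost in timer ticks, not in CPU cycles.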
* Merge commit '41cf3e3b1ca375962951fde1b90a03b16197d205'   James Almer   2019-02-20   1   -0/+2
|\ | | | | | | | | | | | | * commit '41cf3e3b1ca375962951fde1b90a03b16197d205': arm: Create proper .rdata sections for COFF Merged-by: James Almer <jamrial@gmail.com>
| * arm: Create proper .rdata sections for COFF   Martin Storsjö   2019-01-25   1   -0/+2
| | | | | | | | | | | | | | | | | | | | | | | | As .rodata isn't one of the default created sections for COFF, it was created as a read-write data section. By using the default .rdata section name for COFF, it automatically becomes a read-only data section. The existing ".section .rodata" works as intended for ELF though. This is based on an original patch and diagnosis by Tom Tan <Tom.Tan@microsoft.com>. Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit '5584abf69d83169a010aca404cd1cf95c23ad9ef'   James Almer   2019-02-20   1   -0/+8
|\ \ | |/ | | | | | | | | | | * commit '5584abf69d83169a010aca404cd1cf95c23ad9ef': arm: Emit .thumb_func directives Merged-by: James Almer <jamrial@gmail.com>
| * arm: Emit .thumb_func directives   Martin Storsjö   2018-10-12   1   -0/+8
| | | | | | | | | | | | | | | | | | | | | | Prior to Xcode 9.3, the clang built-in assembler didn't support altmacro, and gas-preprocessor was used for assembling for arm/darwin. For thumb functions, gas-preprocessor took care of adding the .thumb_func directives, but now that we are able to assemble without gas-preprocessor, we need to add these directives ourselves. Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit '3a7b4ae62c798edbd82bcd8fef863c74ed2acd4a'   James Almer   2018-03-30   1   -1/+7
|\ \ | |/ | | | | | | | | | | * commit '3a7b4ae62c798edbd82bcd8fef863c74ed2acd4a': arm: Produce .const_data instead of .section .rodata for Mach-O Merged-by: James Almer <jamrial@gmail.com>
| * arm: Produce .const_data instead of .section .rodata for Mach-O   Martin Storsjö   2018-03-30   1   -1/+7
| | | | | | | | | | | | | | | | | | | | This is the same combination of .section directives as used in aarch64/asm.S. Since Xcode 9.3, the bundled clang supports altmacro and doesn't require using gas-preprocessor any longer. Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit '4cf84e254ae75b524e1cacae499a97d7cc9e5906'   James Almer   2018-02-11   1   -1/+0
|\ \ | |/ | | | | | | | | | | * commit '4cf84e254ae75b524e1cacae499a97d7cc9e5906': Drop some unnecessary config.h #includes Merged-by: James Almer <jamrial@gmail.com>
| * Drop some unnecessary config.h #includes   Diego Biurrun   2018-02-06   1   -1/+0
| |
| * arm: Check for have_vfp_vm instead of !have_vfpv3 for float_dsp_vfp   Martin Storsjö   2017-10-24   1   -2/+2
| | | | | | | | | | | | | | This was missed in e2710e790c0 since those functions weren't exercised by checkasm. Signed-off-by: Martin Storsjö <martin@martin.st>
| * cpu: split flag checks per arch in av_cpu_max_align()   James Almer   2017-10-09   1   -0/+9
| | | | | | | | | | Signed-off-by: James Almer <jamrial@gmail.com> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* | libavutil: Add saturating subtraction functions   Andrew D'Addesio   2017-12-04   1   -0/+16
| | | | | | | | | | | | | | | | | | Add av_sat_sub32 and av_sat_dsub32 as the subtraction analogues to av_sat_add32/av_sat_dadd32. Also clarify the formulas for dadd32/dsub32. Signed-off-by: Andrew D'Addesio <modchipv12@gmail.com>
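The semantics of the saturating subtraction helpers can be sketched in portable C as below. The `*_sketch` names are hypothetical stand-ins for av_sat_add32, av_sat_sub32 and av_sat_dsub32, and the formulas reflect the usual definitions (subtract, then clip to the 32-bit range; the "d" variant saturates the doubled second operand first). An ARM-specific version would presumably map these onto saturating instructions such as QSUB and QDSUB, which is not shown here.

```c
#include <stdint.h>

/* Clip a 64-bit value into the int32_t range. */
static inline int32_t clip_int32_sketch(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Saturating a + b. */
static inline int32_t sat_add32_sketch(int32_t a, int32_t b)
{
    return clip_int32_sketch((int64_t)a + b);
}

/* Saturating a - b. */
static inline int32_t sat_sub32_sketch(int32_t a, int32_t b)
{
    return clip_int32_sketch((int64_t)a - b);
}

/* Saturating a - sat(2 * b): the doubling saturates before the subtraction. */
static inline int32_t sat_dsub32_sketch(int32_t a, int32_t b)
{
    return sat_sub32_sketch(a, sat_add32_sketch(b, b));
}
```

For example, `sat_sub32_sketch(INT32_MIN, 1)` stays at `INT32_MIN` instead of wrapping around.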
* | Merge commit '59cee42d7d22530e66a155305389e29679b11f78'   James Almer   2017-10-30   1   -0/+2
|\ \ | |/ | | | | | | | | | | * commit '59cee42d7d22530e66a155305389e29679b11f78': arm: Check for the .arch directive in configure Merged-by: James Almer <jamrial@gmail.com>
| * arm: Check for the .arch directive in configure   Martin Storsjö   2017-05-08   1   -0/+2
| | | | | | | | | | | | | | | | | | | | | | | | When targeting windows, the .arch directive isn't available. So far, when building for windows, we've always used gas-preprocessor, both when using msvc's armasm and when using clang. Lately, clang/llvm has implemented the last missing piece (altmacro support) for building our assembly without gas-preprocessor. This means that we now build for arm/windows with clang without any extra compatibility layer. Signed-off-by: Martin Storsjö <martin@martin.st>
* | lavu/arm: Check for have_vfp_vm instead of !have_vfpv3 for float_dsp_vfp   Martin Storsjö   2017-10-23   1   -2/+2
| | | | | | | | | | | | | | This was missed in e754c8e8 / e2710e790c0 since those functions weren't exercised by checkasm. Fixes ticket #6766.
* | avutil/cpu: split flag checks per arch in av_cpu_max_align()   James Almer   2017-09-27   1   -0/+10
| | | | | | | | Signed-off-by: James Almer <jamrial@gmail.com>
* | Merge commit '6a1ea4ec932f4fc9fdc00ec51ee070b298ddb35f'   James Almer   2017-04-04   1   -0/+9
|\ \ | |/ | | | | | | | | | | * commit '6a1ea4ec932f4fc9fdc00ec51ee070b298ddb35f': arm: warn/error on movrelx usage problematic with PIC on ELF Merged-by: James Almer <jamrial@gmail.com>
| * arm: warn/error on movrelx usage problematic with PIC on ELF   Janne Grunau   2016-11-24   1   -0/+9
| | | | | | | | | | | | The warning has false positives but our asm does not trigger it. For new code false positives can only be avoided by changing the register allocation.
| * arm: Clear the gp register alias at the end of functions   Martin Storsjö   2016-11-10   1   -0/+3
| | | | | | | | | | | | | | | | | | | | | | We reset .Lpic_gp to zero at the start of each function, which means that the logic within movrelx for clearing gp when necessary will be missed. This fixes using movrelx in different functions with a different helper register. Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit '6f9e34baea4f6f484392e4e67f606a0835d07b73'   Clément Bœsch   2017-02-02   1   -2/+8
|\ \ | |/ | | | | | | | | | | * commit '6f9e34baea4f6f484392e4e67f606a0835d07b73': arm: Check for support for the .fpu directive Merged-by: Clément Bœsch <cboesch@gopro.com>
| * arm: Check for support for the .fpu directive   Martin Storsjö   2016-07-21   1   -2/+8
| | | | | | | | | | | | | | When targeting COFF (windows), clang doesn't support this directive (while binutils supports it for all targets). Signed-off-by: Martin Storsjö <martin@martin.st>
* | arm: Clear the gp register alias at the end of functions   Martin Storsjö   2016-11-15   1   -0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We reset .Lpic_gp to zero at the start of each function, which means that the logic within movrelx for clearing gp when necessary will be missed. This fixes using movrelx in different functions with a different helper register. This is cherry-picked from libav commit 824e8c284054f323f854892d1b4739239ed1fdc7. Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* | all: Add missing header guards   Timothy Gu   2016-01-28   1   -0/+5
| |
* | Merge commit '73c8c0341cce9e1a6c4169721f5123f97fc4be2f'   Hendrik Leppkes   2016-01-19   1   -1/+1
|\ \ | |/ | | | | | | | | | | * commit '73c8c0341cce9e1a6c4169721f5123f97fc4be2f': arm: Fix vfp dead code elimination with have_vfp_vm Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
| * arm: Fix vfp dead code elimination with have_vfp_vm   Martin Storsjö   2016-01-08   1   -1/+1
| | | | | | | | | | | | | | | | | | This fixes builds with --disable-vfp. Checking for the armv6 cpu flag is incorrect, since vfpv2 isn't armv6 specific. Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit 'e2710e790c09e49e86baa58c6063af0097cc8cb0'   Hendrik Leppkes   2016-01-02   2   -0/+9
|\ \ | |/ | | | | | | | | | | * commit 'e2710e790c09e49e86baa58c6063af0097cc8cb0': arm: add a cpu flag for the VFPv2 vector mode Merged-by: Hendrik Leppkes <h.leppkes@gmail.com>
| * arm: add a cpu flag for the VFPv2 vector mode   Janne Grunau   2015-12-14   2   -0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The vector mode was deprecated in ARMv7-A/VFPv3 and various cpu implementations do not support it in hardware. Depending on the OS, vector mode code will either be emulated in software or result in an illegal instruction on cpus which do not support it. This was not really a problem in practice since NEON implementations of the same functions are preferred. It will however become a problem for checkasm, which tests every cpu flag separately. Since this is a cpu feature that newer cpus no longer support, the behaviour of this flag differs from the other flags. It can only be activated by runtime cpu feature selection.
* | avutil/attributes: add AV_GCC_VERSION_AT_MOST   James Almer   2015-09-18   2   -4/+4
| | | | | | | | | | Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: James Almer <jamrial@gmail.com>
* | avutil/arm/intmath: return int for uint8 / uint16 clip   Michael Niedermayer   2015-07-20   1   -4/+4
| | | | | | | | | | | | | | | | | | | | The C functions return uint8/16_t but that is effectively int, not unsigned int. Fixes fate-filter-tblend. We do not return uint8/16_t as that would require the compiler to truncate the values, slowing it down. Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
* | arm: only enable setend on ARMv6   Andreas Cadhalpun   2015-06-05   1   -1/+1
| | | | | | | | | | | | | | Without this check it causes SIGILL crashes on ARMv5. Reviewed-by: Michael Niedermayer <michaelni@gmx.at> Signed-off-by: Andreas Cadhalpun <Andreas.Cadhalpun@googlemail.com>
* | Merge commit 'dcae2e32f7d8a1ca5fb8c1e4aa81313be854dd73'   Michael Niedermayer   2015-03-07   1   -0/+6
|\ \ | |/ | | | | | | | | | | * commit 'dcae2e32f7d8a1ca5fb8c1e4aa81313be854dd73': arm: Suppress tags about used cpu arch and extensions Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: Suppress tags about used cpu arch and extensions   Martin Storsjö   2015-03-07   1   -0/+6
| | | | | | | | | | | | | | | | | | | | When all the codepaths using manually set .arch/.fpu code are behind runtime detection, the ELF attributes should be suppressed. This allows tools to know that the final built binary doesn't strictly require these extensions. Signed-off-by: Martin Storsjö <martin@martin.st>
* | Merge commit '76ce9bd8e26dcb3652240a1072840ff4011d7cdc'   Michael Niedermayer   2015-02-21   1   -0/+8
|\ \ | |/ | | | | | | | | | | * commit '76ce9bd8e26dcb3652240a1072840ff4011d7cdc': libavutil: Add ARM av_clip_intp2_arm Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * libavutil: Add ARM av_clip_intp2_arm   Peter Meerwald   2015-02-21   1   -0/+8
| | | | | | | | | | | | | | | | | | | | Add ARM code for implementing av_clip_intp2 using the ssat instruction. On Cortex-A8, av_clip_intp2_arm() is faster than av_clip_intp2_c() and the generic av_clip(), about -19%. Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net> Signed-off-by: Luca Barbato <lu_zero@gentoo.org>
* | Merge commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46'   Michael Niedermayer   2014-12-09   1   -1/+5
|\ \ | |/ | | | | | | | | | | | | | | | | * commit 'f963f80399deb1a2b44c1bac3af7123e8a0c9e46': arm: Use .data.rel.ro for const data with relocations Conflicts: configure Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: Use .data.rel.ro for const data with relocations   Martin Storsjö   2014-12-09   1   -1/+5
| | | | | | | | Signed-off-by: Martin Storsjö <martin@martin.st>
* | avutil/arm/float_dsp_init_vfp: replace restrict by av_restrict   jessejiang   2014-11-20   1   -1/+1
| | | | | | | | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | avutil: turn arm setend into a cpuflag   Michael Niedermayer   2014-08-13   2   -7/+9
| | | | | | | | | | | | | | | | | | | | | | This allows disabling and enabling it. It also prevents crashes if vfpv3 and neon are disabled, which previously would have enabled the flag. And last but not least, one can enable setend on cpus like cortex-a8 where it's fast but disabled by default. Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge commit '6869612f5c7d4d2f20f69a5658328a761deadb1c'   Michael Niedermayer   2014-07-22   1   -0/+6
|\ \ | |/ | | | | | | | | | | | | | | | | * commit '6869612f5c7d4d2f20f69a5658328a761deadb1c': arm: Macroize the test for 'setend' CPU instruction support Conflicts: libavcodec/arm/h264dsp_init_arm.c Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: Macroize the test for 'setend' CPU instruction support   Ben Avison   2014-07-21   1   -0/+6
| | | | | | | | Signed-off-by: Diego Biurrun <diego@biurrun.de>
| * armv6: Accelerate butterflies_float   Ben Avison   2014-07-18   2   -0/+120
| | | | | | | | | | | | | | | | | | | | | | | | | | | | I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in butterflies_float_c() / ff_butterflies_float_vfp() for the same sample AAC stream:
| |                     Before           After
| |                     Mean     StdDev  Mean     StdDev  Confidence  Change
| | Audio decode        1542.8   43.7    1470.5   41.5    100.0%      +4.9%
| | butterflies_float   130.0    11.9    70.2     12.1    100.0%      +85.2%
| | Signed-off-by: Martin Storsjö <martin@martin.st>
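For orientation, the operation being accelerated can be written in plain C roughly as follows. This is a reference-style sketch with a hypothetical name, not the VFP assembly from the commit: each pair of elements is replaced by its sum and difference.

```c
/* In-place butterfly: v1[i] becomes the sum, v2[i] the difference. */
static void butterflies_float_sketch(float *v1, float *v2, int len)
{
    int i;
    for (i = 0; i < len; i++) {
        float t = v1[i] - v2[i];  /* difference, computed before v1 changes */
        v1[i] += v2[i];
        v2[i]  = t;
    }
}
```

The loop has no dependency between iterations, which is what makes it a natural candidate for a vectorized or hand-scheduled implementation.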
| * armv6: Accelerate vector_fmul_window   Ben Avison   2014-07-18   2   -1/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | | | I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in vector_fmul_window_c() / ff_vector_fmul_window_vfp() for the same sample AAC stream:
| |                     Before           After
| |                     Mean     StdDev  Mean     StdDev  Confidence  Change
| | Audio decode        1598.2   47.4    1529.2   25.4    100.0%      +4.5%
| | vector_fmul_window  244.0    22.1    188.9    22.3    100.0%      +29.2%
| | Signed-off-by: Martin Storsjö <martin@martin.st>
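A plain C sketch of the windowed overlap operation is given below for reference. The name is hypothetical and the layout is an assumption of this sketch: `src0` and `src1` hold `len` samples each, while `win` and `dst` hold `2 * len`, with the two halves walked from opposite ends.

```c
/* Windowed overlap of two half-blocks into a 2*len output block. */
static void vector_fmul_window_sketch(float *dst, const float *src0,
                                      const float *src1, const float *win,
                                      int len)
{
    int i, j;

    /* Re-base the pointers so i runs over the first half (negative
     * indices) while j runs backwards over the second half. */
    dst  += len;
    win  += len;
    src0 += len;

    for (i = -len, j = len - 1; i < 0; i++, j--) {
        float s0 = src0[i];
        float s1 = src1[j];
        float wi = win[i];
        float wj = win[j];
        dst[i] = s0 * wj - s1 * wi;
        dst[j] = s0 * wi + s1 * wj;
    }
}
```

Each iteration writes one output sample in each half, so the routine touches the whole `dst` block in `len` steps.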
* | armv6: Accelerate butterflies_float   Ben Avison   2014-07-16   2   -0/+120
| | | | | | | | | | | | | | | | | | | | | | | | | | | | I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in butterflies_float_c() / ff_butterflies_float_vfp() for the same sample AAC stream:
| |                     Before           After
| |                     Mean     StdDev  Mean     StdDev  Confidence  Change
| | Audio decode        1542.8   43.7    1470.5   41.5    100.0%      +4.9%
| | butterflies_float   130.0    11.9    70.2     12.1    100.0%      +85.2%
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
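The Change column in these benchmark tables appears to be the ratio of the Before mean to the After mean, minus one, expressed as a percentage. That reading is inferred from the reported numbers, not stated in the commit message. A small sketch with a hypothetical helper name:

```c
/* Relative speedup in percent: how much larger the old sample count is
 * compared with the new one. Inferred formula, see the note above. */
static double change_percent_sketch(double before_mean, double after_mean)
{
    return (before_mean / after_mean - 1.0) * 100.0;
}
```

For the butterflies_float row, 130.0 / 70.2 is about 1.852, which gives roughly +85.2%; for the Audio decode row, 1542.8 / 1470.5 is about 1.049, which gives roughly +4.9%.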
* | armv6: Accelerate vector_fmul_window   Ben Avison   2014-07-16   2   -1/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | | | I benchmarked the result by measuring the number of gperftools samples that hit anywhere in the AAC decoder (starting from aac_decode_frame()) or specifically in vector_fmul_window_c() / ff_vector_fmul_window_vfp() for the same sample AAC stream:
| |                     Before           After
| |                     Mean     StdDev  Mean     StdDev  Confidence  Change
| | Audio decode        1598.2   47.4    1529.2   25.4    100.0%      +4.5%
| | vector_fmul_window  244.0    22.1    188.9    22.3    100.0%      +29.2%
| | Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge commit '7b0c7c9163fe3dd0081696befde28617119d2590'   Michael Niedermayer   2014-06-28   1   -1/+3
|\ \ | |/ | | | | | | | | | | * commit '7b0c7c9163fe3dd0081696befde28617119d2590': arm: Detect 32 bit cpu features on ARMv8 when running on a 64 bit kernel Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * arm: Detect 32 bit cpu features on ARMv8 when running on a 64 bit kernel   Martin Storsjö   2014-06-28   1   -1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running on a 64 bit kernel, /proc/cpuinfo lists different optional features than on 32 bit kernels (because some of them are mandatory in the 64 bit implementations). The kernel does list the old features properly if they are queried via /proc/self/auxv though - however this file is not always readable (e.g. on most android systems). The getauxval function could also provide the same info as /proc/self/auxv even if this file isn't readable, but this function is not always available (and thus would need to be loaded with dlsym for compatibility with older android versions). The android cpufeatures library does this slightly differently, by assuming that these are available if the "CPU architecture" line is >= 8, see [1] for details. It has been suggested to include the old, non-optional features in /proc/cpuinfo as well, but that suggested patch never was merged. See [2] for the discussion around this suggestion. [1] https://android-review.googlesource.com/91380 [2] http://marc.info/?l=linux-arm-kernel&m=139087240101974 Signed-off-by: Martin Storsjö <martin@martin.st>
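The "CPU architecture" heuristic mentioned above could be sketched as a small line parser. The function name is hypothetical, the exact line format ("CPU architecture: 8") is an assumption about /proc/cpuinfo output on 32-bit ARM userspace, and this is not the check the commit itself implements.

```c
#include <stdlib.h>
#include <string.h>

/* Return 1 if `line` is a "CPU architecture" line reporting version 8 or
 * higher, 0 otherwise. Non-numeric values parse as 0 and are rejected. */
static int cpuinfo_line_is_armv8_sketch(const char *line)
{
    static const char key[] = "CPU architecture";
    const char *colon;

    if (strncmp(line, key, sizeof(key) - 1) != 0)
        return 0;
    colon = strchr(line, ':');
    if (!colon)
        return 0;
    /* strtol skips leading whitespace after the colon. */
    return strtol(colon + 1, NULL, 10) >= 8;
}
```

A caller that sees this function return 1 for any line could then treat the features that are mandatory on 64 bit implementations as present, even if the Features line does not list them.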
* | Merge commit 'd5a55981986ac5d1a31aef3a8d16eaff8534a412'   Michael Niedermayer   2014-06-04   1   -3/+9
|\ \ | |/ | | | | | | | | | | * commit 'd5a55981986ac5d1a31aef3a8d16eaff8534a412': build: check if AS supports the '.func' directive Merged-by: Michael Niedermayer <michaelni@gmx.at>