| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
* Align CONTRIBUTING.md with the google/new-project template.
* Explain the support story for the CMake config.
PiperOrigin-RevId: 421311695
|
|
|
|
|
|
| |
to avoid UB of passing uninitialized argument by value.
PiperOrigin-RevId: 406052814
|
|
|
|
| |
PiperOrigin-RevId: 394247182
|
|\
| |
| |
| | |
PiperOrigin-RevId: 394061345
|
|/
|
|
|
|
|
|
|
|
|
|
|
| |
The final ip advance value doesn't have to wait for
the offset result that is used to load *tag. It can be
computed along with the offset, so the codegen will use
one csinc in parallel with ldrb. This improves
throughput.
With this change, a ~4.2% uplift is observed in UFlat/10
and ~3.7% in UFlatMedley.
Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: I20ab211235bbf578c6c978f2bbd9160a49e920da
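The idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `AdvanceToNextTagSketch` is a hypothetical helper, it handles only short literals and copy tags, and whether a given compiler actually emits csinc next to ldrb depends on the compiler and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, for illustration only. `ip` points just past the
// current tag byte. Loads the next tag into *tag and returns the pointer
// just past that next tag.
inline const uint8_t* AdvanceToNextTagSketch(const uint8_t* ip, uint8_t* tag) {
  const size_t tag_type = *tag & 3;
  const bool is_literal = (tag_type == 0);
  // Short literal: the upper 6 bits of the tag hold (length - 1).
  const size_t literal_len = (static_cast<size_t>(*tag) >> 2) + 1;

  // Where the next tag lives, relative to ip.
  const size_t next_tag_index = is_literal ? literal_len : tag_type;
  // The advance is selected from the same inputs as next_tag_index, so it
  // does not have to wait for next_tag_index to be available.
  const size_t advance = is_literal ? literal_len + 1 : tag_type + 1;

  *tag = ip[next_tag_index];
  return ip + advance;
}
```

For a 3-byte literal followed by a 1-byte-offset copy, the first call lands on the copy tag and the second call skips the copy's offset byte.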
|
|\
| |
| |
| | |
PiperOrigin-RevId: 393681630
|
| |
| |
| |
| |
| | |
Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: I3fade568ff92b4303387705f843d0051d5e88349
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
After the SHUFFLE code blocks were refactored, the "tmmintrin.h"
include went missing, and the bmi2 code path fails to build
because of type conflicts.
Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: I7800cd7e050f4d349e5a227206b14b9c566e547f
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The #if predicate evaluates to false if the macro is undefined, or
defined to 0. #ifdef (and its synonym #if defined) evaluates to false
only if the macro is undefined.
The new setup allows differentiating between setting a macro to 0 (to
express that the capability definitely does not exist / should not be
used) and leaving a macro undefined (to express not knowing whether a
capability exists / not caring if a capability is used).
PiperOrigin-RevId: 391094241
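A small sketch of the distinction described above. The `SKETCH_HAVE_*` macro names are hypothetical and exist only for this illustration; they are not the macros touched by the change.

```cpp
#include <cassert>

// Hypothetical capability macros, named for illustration only.
#define SKETCH_HAVE_FEATURE_A 0  // Defined to 0: "definitely do not use".
// SKETCH_HAVE_FEATURE_B is deliberately left undefined: "don't know".

// #if treats "defined to 0" and "undefined" alike: both evaluate to false.
#if SKETCH_HAVE_FEATURE_A
constexpr bool kIfFeatureA = true;
#else
constexpr bool kIfFeatureA = false;
#endif

// #ifdef only asks whether the macro exists, so "defined to 0" is true.
#ifdef SKETCH_HAVE_FEATURE_A
constexpr bool kIfdefFeatureA = true;
#else
constexpr bool kIfdefFeatureA = false;
#endif

// Supply a default only when the macro was left undefined, then test its
// value with #if. An explicit 0 from the build system would be respected.
#if !defined(SKETCH_HAVE_FEATURE_B)
#define SKETCH_HAVE_FEATURE_B 1  // Stand-in for real autodetection.
#endif

#if SKETCH_HAVE_FEATURE_B
constexpr bool kUseFeatureB = true;
#else
constexpr bool kUseFeatureB = false;
#endif
```

With this layout, a build that passes `-DSKETCH_HAVE_FEATURE_B=0` turns the feature off, while a build that says nothing gets the detected default.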
|
| |
| |
| |
| | |
PiperOrigin-RevId: 391082698
|
|\ \
| | |
| | |
| | | |
PiperOrigin-RevId: 390767998
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Clang doesn't realize that the load comes with free zero-extension,
and emits an extra 'and xn, xm, 0xff' to calculate the offset.
With this change, the extra op is removed, and a consistent
1.7% performance uplift is observed.
Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: Ica4617852c4b93eadc6c5c551dc3961ffbadb8f0
|
|\ \
| |/
|/|
| | |
PiperOrigin-RevId: 390715690
|
|/
|
|
|
|
|
|
|
|
|
| |
Inspired by kExtractMasksCombined, this patch uses shifts
to replace a table lookup. On Arm the codegen is 2 shift ops
(lsl+lsr). Compared to the previous ldr, which has 4 cycles of
latency, the lsl+lsr pair needs only 2 cycles.
A slight (~0.3%) uplift is observed on N1, and ~3% on A72.
Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: I5b53632d22d9e5cf1a49d0c5cdd16265a15de23b
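The general technique (computing a mask with shifts instead of loading it from a table) can be sketched as below. Both function names are hypothetical, and this is not necessarily the function the patch touches; the exact instructions emitted depend on the compiler and target.

```cpp
#include <cassert>
#include <cstdint>

// Table-based variant: keeps the low n bytes of v (0 <= n <= 4) by loading
// a mask from memory.
inline uint32_t ExtractLowBytesTableSketch(uint32_t v, int n) {
  static const uint32_t kMasks[5] = {0, 0xFF, 0xFFFF, 0xFFFFFF, 0xFFFFFFFF};
  return v & kMasks[n];
}

// Shift-based variant: builds the mask arithmetically, so no load is needed.
inline uint32_t ExtractLowBytesShiftSketch(uint32_t v, int n) {
  // The mask is 64 bits wide so that shifting by 32 (n == 4) is well defined.
  const uint64_t mask = 0xFFFFFFFFull;
  return static_cast<uint32_t>(v & ~(mask << (8 * n)));
}
```

The two variants agree for every n in [0, 4]; the shift form trades a memory access for a couple of ALU operations.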
|
|
|
|
|
|
| |
improvement for ARM. Probably because ARM has more relaxed address computation than x86 (https://www.godbolt.org/z/bfM1ezx41). I don't think this is a compiler bug, or that the compiler can do anything about it.
PiperOrigin-RevId: 387569896
|
|
|
|
| |
PiperOrigin-RevId: 387356237
|
|
|
|
|
|
| |
code. clang for ARM and gcc for x86 https://gcc.godbolt.org/z/oxeGG7aEx
PiperOrigin-RevId: 383467656
|
|
|
|
|
|
| |
generation (csinc). For codegen see https://gcc.godbolt.org/z/a8z9j95Pv
PiperOrigin-RevId: 382688740
|
|
|
|
|
|
| |
The SSSE3 intrinsics we use have their direct analogues in NEON, so making this optimization portable requires a very thin translation layer.
PiperOrigin-RevId: 381280165
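What such a thin translation layer might look like, as a sketch. All `*Sketch` names are hypothetical and not the library's actual wrappers. `_mm_shuffle_epi8` (SSSE3) and `vqtbl1q_u8` (AArch64 NEON) behave the same for indices 0..15, which is the only range used here; a scalar fallback is included so the example builds on any target.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__SSSE3__)
#include <tmmintrin.h>
using V128Sketch = __m128i;
inline V128Sketch V128LoadSketch(const uint8_t* p) {
  return _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
}
inline void V128StoreSketch(uint8_t* p, V128Sketch v) {
  _mm_storeu_si128(reinterpret_cast<__m128i*>(p), v);
}
inline V128Sketch V128ShuffleSketch(V128Sketch bytes, V128Sketch idx) {
  return _mm_shuffle_epi8(bytes, idx);
}
#elif defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
using V128Sketch = uint8x16_t;
inline V128Sketch V128LoadSketch(const uint8_t* p) { return vld1q_u8(p); }
inline void V128StoreSketch(uint8_t* p, V128Sketch v) { vst1q_u8(p, v); }
inline V128Sketch V128ShuffleSketch(V128Sketch bytes, V128Sketch idx) {
  return vqtbl1q_u8(bytes, idx);
}
#else
// Scalar fallback that mimics the byte shuffle for indices 0..15.
struct V128Sketch { uint8_t b[16]; };
inline V128Sketch V128LoadSketch(const uint8_t* p) {
  V128Sketch v;
  std::memcpy(v.b, p, 16);
  return v;
}
inline void V128StoreSketch(uint8_t* p, V128Sketch v) {
  std::memcpy(p, v.b, 16);
}
inline V128Sketch V128ShuffleSketch(V128Sketch bytes, V128Sketch idx) {
  V128Sketch out;
  for (int i = 0; i < 16; ++i) out.b[i] = bytes.b[idx.b[i] & 15];
  return out;
}
#endif

// Portable caller: dst[i] = src[idx[i]] for 16 bytes, indices in 0..15.
inline void ShuffleBytesSketch(const uint8_t* src, const uint8_t* idx,
                               uint8_t* dst) {
  V128StoreSketch(dst,
                  V128ShuffleSketch(V128LoadSketch(src), V128LoadSketch(idx)));
}
```

Code written against the three wrappers stays identical across targets; only the wrapper bodies differ.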
|
|
|
|
|
|
|
| |
Xcode (drives macOS image) : 12.2 => 12.5
Clang : 10 => 12
GCC : 10 => 11
PiperOrigin-RevId: 375610083
|
|
|
|
|
|
| |
context, because the other 5 bits in the byte are used for len-4 and the tag.
PiperOrigin-RevId: 374926553
|
|
|
|
| |
PiperOrigin-RevId: 372007801
|
|
|
|
|
|
| |
While we're here, take care of a couple of lint warnings by converting CHECK(a != b) to CHECK_NE(a, b).
PiperOrigin-RevId: 369132446
|
|
|
|
| |
PiperOrigin-RevId: 362386747
|
|
|
|
|
|
|
| |
This CL also removes support for using the gflags library to modify the
flags.
PiperOrigin-RevId: 361583626
|
|
|
|
| |
PiperOrigin-RevId: 361582956
|
|
|
|
| |
PiperOrigin-RevId: 357807059
|
|
|
|
| |
PiperOrigin-RevId: 347861379
|
|
|
|
| |
PiperOrigin-RevId: 347861229
|
|
|
|
|
|
|
|
| |
This lets us remove main() from snappy_bench.cc and snappy_unittest.cc,
which simplifies integrating these tests and benchmarks with other
suites.
PiperOrigin-RevId: 347857427
|
|\
| |
| |
| | |
PiperOrigin-RevId: 347736844
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
LibFuzzer does not ship with the Mac OSX Command Line Tools.
```
ld: file not found: /Applications/Xcode-12.2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/12.0.0/lib/darwin/libclang_rt.fuzzer_osx.a
clang: error: linker command failed with exit code 1 (use -v to see invocation)
```
|
|/
|
|
| |
PiperOrigin-RevId: 347736380
|
| |
|
|\
| |
| |
| | |
PiperOrigin-RevId: 347660305
|
| | |
|
| | |
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gcc was unable to inline a function call, which caused a build
failure due to `-Wall -Werror`.
The build error was:
```
../snappy.cc:292:76: error: ignoring attributes on template argument ‘__m128i’ [-Werror=ignored-attributes]
292 | static inline std::pair<__m128i /* pattern */, __m128i /* reshuffle_mask */>
| ^
../snappy.cc:292:76: error: ignoring attributes on template argument ‘__m128i’ [-Werror=ignored-attributes]
cc1plus: all warnings being treated as errors
```
|
|
|
|
|
|
| |
CheckSuccess was removed in e1e91ee464373e0bba4aadfbd3d88a6d84dc5b95.
PiperOrigin-RevId: 347625874
|
| |
|
|
|
|
| |
PiperOrigin-RevId: 347541488
|
|
|
|
| |
PiperOrigin-RevId: 347541028
|
| |
|
|
|
|
|
|
| |
This will not change the compilation output.
PiperOrigin-RevId: 347525836
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Snappy includes a testing framework, which implements a subset of the
Google Test API, and can be used when Google Test is not available.
Snappy also includes a micro-benchmark framework, which implements an
old version of the Google Benchmark API.
This CL replaces the custom test and micro-benchmark frameworks with
google/googletest and google/benchmark. The code is vendored in
third_party/ via git submodules. The setup is similar to google/crc32c
and google/leveldb.
This CL also updates the benchmarking code to the modern Google
Benchmark API.
Benchmark results are expected to be more precise, as the old framework
ran each benchmark with a fixed number of iterations, whereas Google
Benchmark keeps iterating until the noise is low.
PiperOrigin-RevId: 347456142
|
|
|
|
| |
PiperOrigin-RevId: 347402877
|
|
|
|
| |
PiperOrigin-RevId: 347397797
|
|
|
|
| |
PiperOrigin-RevId: 347341130
|
|
|
|
|
|
| |
This feature requires C++17. Fortunately, inline is useful for header declarations, which may be included in multiple compilation units. The declarations modified by this CL occur in a single compilation unit.
PiperOrigin-RevId: 347338760
|
|
|
|
|
|
|
|
|
| |
necessary data. We now store len - offset in a signed int16. This happens to remove the masking of offset from the calculations, and the calculations that need to be done give precisely the flags that we need for testing correctness.
2) Replace offset extraction with a lookup mask. This takes fewer uops, and is needed because we must special-case type 3 to always return 0 so as to properly trigger the fallback.
3) Unroll the loop twice. This removes some loop-condition checks and improves the generated assembly. The loop variables tend to end up in a different register, which requires movs; having two consecutive copies allows the movs to be elided.
PiperOrigin-RevId: 346663328
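Point 2 can be sketched as below. The mask values are an assumption based on the description (1-byte offset for type 1, 2-byte offset for type 2, and 0 for types 0 and 3); `ExtractOffsetSketch` is a hypothetical name, not the function from the change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// `val` holds the bytes following the tag, loaded little-endian.
// `tag_type` is the low 2 bits of the tag.
inline uint32_t ExtractOffsetSketch(uint32_t val, size_t tag_type) {
  // Assumed mask layout: type 0 (literal) and type 3 map to 0, so type 3
  // always yields offset 0 and can be routed to the fallback path.
  static const uint32_t kExtractMasks[4] = {0, 0xFF, 0xFFFF, 0};
  return val & kExtractMasks[tag_type];
}
```

A single masked load replaces per-type branching, and the zero mask gives type 3 a value that a later check can recognize.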
|