path: root/include
Commit message (Author, Age, Files, Lines)
* Resolve cppcheck Signed integer overflow errors (Brad Hubbard, 2017-04-10, 1 file, -1/+1)
|   The type of expression '1<<31' is signed int and this causes cppcheck to
|   issue the following warning.
|
|   [src/gf_w32.c:681]: (error) Signed integer overflow for expression '1<<31'.
|
|   Signed-off-by: Brad Hubbard <bhubbard@redhat.com>
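The diff itself is not shown in this log, so the following is only a minimal sketch of the usual remedy for this class of warning: make the shifted operand unsigned. The helper name `top_bit_mask` is hypothetical and does not come from gf_w32.c.

```c
#include <stdint.h>

/* Hypothetical helper (not from gf_w32.c): a mask with only bit 31 set. */
uint32_t top_bit_mask(void)
{
    /* With a 32-bit int, the literal 1 is signed, so 1 << 31 shifts into the
       sign bit and overflows. Converting the operand to uint32_t first keeps
       the shift well defined. */
    return (uint32_t)1 << 31;
}
```

Writing the constant as `1U << 31` has the same effect on platforms where `unsigned int` is 32 bits wide.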
* Support for runtime SIMD detection (Bassam Tabbara, 2016-09-13, 1 file, -0/+20)
|   This commits adds support for runtime detection of SIMD instructions. The
|   idea is that you would build once with all supported SIMD functions and
|   the same binaries could run on different machines with varying support
|   for SIMD. At runtime gf-complete will select the right functions based on
|   the processor.
|
|   gf_cpu.c has the logic to detect SIMD instructions. On Intel processors
|   this is done through cpuid. For ARM on linux we use getauxv.
|
|   The logic in gf_w*.c has been changed to check for runtime SIMD support
|   and fallback to generic code.
|
|   Also a new test has been added. It compares the functions selected by
|   gf_init when we enable/disable SIMD support through build flags, with
|   runtime enabling/disabling. The test checks if the results are identical.
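The pattern described above (build every variant, pick one at run time, fall back to generic code) can be sketched as follows. This is not the code in gf_cpu.c: it swaps the raw cpuid query for the GCC/Clang builtin `__builtin_cpu_supports`, and every function name here (`region_xor_generic`, `region_xor_sse2`, `cpu_has_sse2`, `select_region_xor`) is hypothetical. On ARM Linux the equivalent probe would typically read `getauxval(AT_HWCAP)` from `<sys/auxv.h>`; this sketch simply reports no SIMD there.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

typedef void (*region_xor_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Generic fallback: XOR src into dst one byte at a time. */
void region_xor_generic(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] ^= src[i];
}

#if defined(__SSE2__)
/* SSE2 variant: 16 bytes per iteration, byte loop for the tail. */
void region_xor_sse2(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(dst + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_xor_si128(a, b));
    }
    for (; i < n; i++)
        dst[i] ^= src[i];
}
#endif

/* Run-time probe. Assumption: GCC or Clang on x86; elsewhere report 0. */
int cpu_has_sse2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
#else
    return 0;
#endif
}

/* Pick the fastest variant the running CPU supports. */
region_xor_fn select_region_xor(void)
{
#if defined(__SSE2__)
    if (cpu_has_sse2())
        return region_xor_sse2;
#endif
    return region_xor_generic;
}
```

A caller would fetch the pointer once during initialisation and reuse it, so the detection cost is not paid on every region operation.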
* Add support for printing functions selected in gf_init (Bassam Tabbara, 2016-09-13, 1 file, -0/+16)
|   There is currently no way to figure out which functions were selected
|   during gf_init and as a result of SIMD options. This is not even possible
|   in gdb since most functions are static.
|
|   This commit adds a new macro SET_FUNCTION that records the name of the
|   function selected during init inside the gf_internal structure. This macro
|   only works when DEBUG_FUNCTIONS is defined during compile. Otherwise the
|   code works exactly as it did before this change. The names of selected
|   functions will be used during testing of SIMD runtime detection.
|
|   All calls such as:
|
|       gf->multiply.w32 = gf_w16_shift_multiply;
|
|   need to be replaced with the following:
|
|       SET_FUNCTION(gf,multiply,w32,gf_w16_shift_multiply)
|
|   Also added a new flag to tools/gf_methods that will print the names of
|   functions selected during gf_init.
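One way such a macro could be written is sketched below. Only the macro name, its argument order, and the DEBUG_FUNCTIONS switch come from the commit message; the struct layout (`demo_gf_t`, with a `multiply_name` field standing in for the gf_internal structure) and the placeholder function are assumptions made for illustration.

```c
#include <stdint.h>

/* Normally supplied by the build (e.g. -DDEBUG_FUNCTIONS); defined here so
   the sketch exercises the recording branch. */
#define DEBUG_FUNCTIONS 1

typedef uint32_t (*mul32_fn)(uint32_t a, uint32_t b);

/* Hypothetical stand-in for the real gf_t / gf_internal pair. */
typedef struct {
    struct { mul32_fn w32; } multiply;
    const char *multiply_name;   /* name of the selected multiply function */
} demo_gf_t;

#ifdef DEBUG_FUNCTIONS
/* Assign the pointer and record the function's name via stringification. */
#define SET_FUNCTION(gf, method, size, func)   \
    do {                                       \
        (gf)->method.size = (func);            \
        (gf)->method##_name = #func;           \
    } while (0)
#else
/* Without DEBUG_FUNCTIONS the macro is a plain assignment. */
#define SET_FUNCTION(gf, method, size, func)   \
    do { (gf)->method.size = (func); } while (0)
#endif

/* Placeholder body only; a real implementation would multiply in GF(2^w). */
uint32_t demo_shift_multiply(uint32_t a, uint32_t b)
{
    return a ^ b;
}
```

With this layout, `SET_FUNCTION(&gf, multiply, w32, demo_shift_multiply)` sets `gf.multiply.w32` and stores the string "demo_shift_multiply" in `gf.multiply_name`, which is the kind of information a test or a tool flag could print.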
* arm: NEON optimisations for gf_w64 (Janne Grunau, 2014-10-24, 1 file, -0/+50)
|   Optimisations for 4,64 split table region multiplications. Only used on
|   ARMv8-A since it is not faster on ARMv7-A.
* arm: NEON optimisations for gf_w32 (Janne Grunau, 2014-10-24, 1 file, -0/+71)
|   Optimisations for 4,32 split table multiplications.
|
|   Selected time_tool.sh results on a 1.7 GHz cortex-a9:
|   Region Best (MB/s): 346.67 W-Method: 32 -m SPLIT 32 4 -r SIMD -
|   Region Best (MB/s): 92.89 W-Method: 32 -m SPLIT 32 4 -r NOSIMD -
|   Region Best (MB/s): 258.17 W-Method: 32 -m SPLIT 32 4 -r SIMD -r ALTMAP -
|   Region Best (MB/s): 162.00 W-Method: 32 -m SPLIT 32 8 -
|   Region Best (MB/s): 160.53 W-Method: 32 -m SPLIT 8 8 -
|   Region Best (MB/s): 32.74 W-Method: 32 -m COMPOSITE 2 - -
|   Region Best (MB/s): 199.79 W-Method: 32 -m COMPOSITE 2 - -r ALTMAP -
* arm: NEON optimisations for gf_w16 (Janne Grunau, 2014-10-24, 1 file, -0/+66)
|   Optimisations for the 4,16 split table region multiplications.
|
|   Selected time_tool.sh 16 -A -B results for a 1.7 GHz cortex-a9:
|   Region Best (MB/s): 532.14 W-Method: 16 -m SPLIT 16 4 -r SIMD -
|   Region Best (MB/s): 212.34 W-Method: 16 -m SPLIT 16 4 -r NOSIMD -
|   Region Best (MB/s): 801.36 W-Method: 16 -m SPLIT 16 4 -r SIMD -r ALTMAP -
|   Region Best (MB/s): 93.20 W-Method: 16 -m SPLIT 16 4 -r NOSIMD -r ALTMAP -
|   Region Best (MB/s): 273.99 W-Method: 16 -m SPLIT 16 8 -
|   Region Best (MB/s): 270.81 W-Method: 16 -m SPLIT 8 8 -
|   Region Best (MB/s): 70.42 W-Method: 16 -m COMPOSITE 2 - -
|   Region Best (MB/s): 393.54 W-Method: 16 -m COMPOSITE 2 - -r ALTMAP -
* arm: NEON optimisations for gf_w8 (Janne Grunau, 2014-10-24, 1 file, -0/+99)
|   Optimisations for the 4,4 split table region multiplication and carry
|   less multiplication using NEON's polynomial long multiplication.
|
|   arm: w8: NEON carry less multiplication
|
|   Selected time_tool.sh results for a 1.7GHz cortex-a9:
|   Region Best (MB/s): 375.86 W-Method: 8 -m CARRY_FREE -
|   Region Best (MB/s): 142.94 W-Method: 8 -m TABLE -
|   Region Best (MB/s): 225.01 W-Method: 8 -m TABLE -r DOUBLE -
|   Region Best (MB/s): 211.23 W-Method: 8 -m TABLE -r DOUBLE -r LAZY -
|   Region Best (MB/s): 160.09 W-Method: 8 -m LOG -
|   Region Best (MB/s): 123.61 W-Method: 8 -m LOG_ZERO -
|   Region Best (MB/s): 123.85 W-Method: 8 -m LOG_ZERO_EXT -
|   Region Best (MB/s): 1183.79 W-Method: 8 -m SPLIT 8 4 -r SIMD -
|   Region Best (MB/s): 177.68 W-Method: 8 -m SPLIT 8 4 -r NOSIMD -
|   Region Best (MB/s): 87.85 W-Method: 8 -m COMPOSITE 2 - -
|   Region Best (MB/s): 428.59 W-Method: 8 -m COMPOSITE 2 - -r ALTMAP -
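For readers unfamiliar with carry-less multiplication, here is a portable scalar sketch of what the CARRY_FREE method computes for w=8: a polynomial product over GF(2), followed by reduction modulo the field polynomial. It is not the NEON code from this commit (which would use polynomial multiply instructions such as those behind the `vmull_p8` intrinsic). The function names are hypothetical, and the polynomial 0x11d is an assumption about the w=8 default.

```c
#include <stdint.h>

/* Carry-less (polynomial) product of two bytes: XOR replaces addition,
   so no carries propagate between bit positions. Result has up to 15 bits. */
uint16_t clmul8(uint8_t a, uint8_t b)
{
    uint16_t acc = 0;
    for (int i = 0; i < 8; i++)
        if (b & (1u << i))
            acc ^= (uint16_t)((uint16_t)a << i);
    return acc;
}

/* Reduce a 15-bit product modulo a degree-8 polynomial (bit 8 must be set). */
uint8_t gf8_reduce(uint16_t p, uint16_t poly)
{
    for (int bit = 14; bit >= 8; bit--)
        if (p & (1u << bit))
            p ^= (uint16_t)(poly << (bit - 8));   /* cancels the top bit */
    return (uint8_t)p;
}

/* GF(2^8) multiply. Assumption: primitive polynomial 0x11d. */
uint8_t gf8_multiply(uint8_t a, uint8_t b)
{
    return gf8_reduce(clmul8(a, b), 0x11d);
}
```

A vector implementation performs the same two steps on many bytes at once, which is where the region throughput in the table above comes from.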
* arm: NEON optimisations for gf_w4 (Janne Grunau, 2014-10-24, 1 file, -0/+63)
|   Optimisations for the single table region multiplication and carry less
|   multiplication using NEON's polynomial multiplication of 8-bit values.
|
|   The single polynomial multiplication is not that useful but vector
|   version is for region multiplication.
|
|   Selected time_tool.sh results for a 1.7GHz cortex-a9:
|   Region Best (MB/s): 672.72 W-Method: 4 -m CARRY_FREE -
|   Region Best (MB/s): 265.84 W-Method: 4 -m BYTWO_p -
|   Region Best (MB/s): 329.41 W-Method: 4 -m TABLE -r DOUBLE -
|   Region Best (MB/s): 278.63 W-Method: 4 -m TABLE -r QUAD -
|   Region Best (MB/s): 329.81 W-Method: 4 -m TABLE -r QUAD -r LAZY -
|   Region Best (MB/s): 1318.03 W-Method: 4 -m TABLE -r SIMD -
|   Region Best (MB/s): 165.15 W-Method: 4 -m TABLE -r NOSIMD -
|   Region Best (MB/s): 99.73 W-Method: 4 -m LOG -
* configure: add ARM/AArch64 NEON support (Janne Grunau, 2014-10-09, 1 file, -0/+4)
|   Checks for arm_neon.h header.
* simd: rename the region flags from SSE to SIMD (Janne Grunau, 2014-10-09, 2 files, -5/+7)
|   SSE is not the only supported SIMD instruction set. Keep the old names
|   for backward compatibility.
* On CPU that doesn't support SSE4.2 instructions set, this will fail (Leo Laksmana, 2014-08-23, 1 file, -1/+6)
|   because incorrect header is included.
|
|   smmintrin.h => SSE4.1
|   nmmintrin.h => SSE4.2
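The header mapping above suggests a guard along these lines. This is a sketch only: it keys on the compiler-defined `__SSE4_2__` and `__SSE4_1__` macros, which may differ from the macros gf-complete's own build uses, and `sse4_level` is a hypothetical helper added just to make the selection observable.

```c
/* Include only the header that matches what the compiler was told to target. */
#if defined(__SSE4_2__)
#include <nmmintrin.h>   /* SSE4.2 intrinsics */
#elif defined(__SSE4_1__)
#include <smmintrin.h>   /* SSE4.1 intrinsics */
#endif

/* Hypothetical helper: reports which branch was compiled in
   (42 for SSE4.2, 41 for SSE4.1, 0 for neither). */
int sse4_level(void)
{
#if defined(__SSE4_2__)
    return 42;
#elif defined(__SSE4_1__)
    return 41;
#else
    return 0;
#endif
}
```

Built with `-msse4.1` only, this pulls in smmintrin.h and never references the SSE4.2 header, which is the failure mode the commit describes.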
* Removed comments marking CARRY_FREE_GK additions. (Adam Disney, 2014-06-16, 1 file, -1/+1)
|
* Merge remote-tracking branch 'jayrde/wip-autoconf-cleanup' (Adam Disney, 2014-06-16, 2 files, -184/+0)
|\
| |   Conflicts:
| |       .gitignore
| |       INSTALL
| |       Makefile.in
| |       aclocal.m4
| |       config.guess
| |       config.sub
| |       configure
| |       examples/Makefile.in
| |       include/config.h.in
| |       include/config.h.in~
| |       install-sh
| |       ltmain.sh
| |       m4/libtool.m4
| |       m4/ltversion.m4
| |       missing
| |       src/Makefile.in
| |       test/Makefile.in
| |       tools/Makefile.in
| * remove autogenerated files from repository (Jens Rosenboom, 2014-03-18, 2 files, -178/+0)
| |
* | autoreconf'd to reflect addition of --disable-sse (Kevin Greenan, 2014-06-09, 1 file, -0/+3)
| |
* | Implemented CARRY_FREE_GK. Sections added are tagged with a comment //ADAM (Adam Disney, 2014-06-06, 1 file, -7/+8)
| |   for easy navigation.
* | fix comment/message on GF_E_SP128_A/GF_E_SP128_S (Danny Al-Gaaf, 2014-04-22, 1 file, -2/+2)
| |   Swap comments/messages on GF_E_SP128_A/GF_E_SP128_S.
| |
| |   Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
* | Ran autogen to pick-up the changes needed to run 'make check' (Kevin Greenan, 2014-04-02, 1 file, -0/+3)
|/
* Added more header files to the distribution, which will allow (Kevin Greenan, 2014-01-02, 1 file, -3/+0)
|   clients of the lib to take advantage of even more stuff.
* Removed GROUP/128/SSE. It wasn't compiling, and it needed an overhaul. (Jim Plank, 2014-01-01, 1 file, -1/+0)
|   I'll do it someday when I'm bored.
* Fixed the problem with PCLMUL and gf_complete.h. Removed (Jim Plank, 2013-12-31, 1 file, -7/+1)
|   ARCH_64 from everything but 128/GROUP/SSE. Fortunately, no one ever uses
|   that.
* Third.1 time's a charm (autoconf non-sense for PCLMUL). (Kevin Greenan, 2013-12-30, 1 file, -3/+0)
|
* Added entry to configure.ac to avoid running autotools during normal build. (Kevin Greenan, 2013-12-30, 1 file, -0/+3)
|
* Added PCLMUL to the autoconf macro... (Kevin Greenan, 2013-12-30, 1 file, -0/+3)
|
* Build failed... It was because the some headers were in the wrong place. (Kevin Greenan, 2013-12-04, 3 files, -0/+284)
|   It was working for me because the headers were installed in
|   /usr/local/include on my Linux box.
* Setting up autoconf/automake for GF-Complete (Kevin Greenan, 2013-12-04, 4 files, -0/+396)
    Also re-organized the directory structure.

    Signed-off-by: Kevin Greenan <kmgreen2@gmail.com>