| Commit message (Collapse) | Author | Age | Files | Lines |
|
Optimisations for the 4,16 split table region multiplications.
Selected time_tool.sh 16 -A -B results for a 1.7 GHz cortex-a9:
Region Best (MB/s): 532.14 W-Method: 16 -m SPLIT 16 4 -r SIMD -
Region Best (MB/s): 212.34 W-Method: 16 -m SPLIT 16 4 -r NOSIMD -
Region Best (MB/s): 801.36 W-Method: 16 -m SPLIT 16 4 -r SIMD -r ALTMAP -
Region Best (MB/s): 93.20 W-Method: 16 -m SPLIT 16 4 -r NOSIMD -r ALTMAP -
Region Best (MB/s): 273.99 W-Method: 16 -m SPLIT 16 8 -
Region Best (MB/s): 270.81 W-Method: 16 -m SPLIT 8 8 -
Region Best (MB/s): 70.42 W-Method: 16 -m COMPOSITE 2 - -
Region Best (MB/s): 393.54 W-Method: 16 -m COMPOSITE 2 - -r ALTMAP -
|
Optimisations for the 4,4 split table region multiplication and carry-less
multiplication using NEON's polynomial long multiplication.
arm: w8: NEON carry less multiplication
Selected time_tool.sh results for a 1.7GHz cortex-a9:
Region Best (MB/s): 375.86 W-Method: 8 -m CARRY_FREE -
Region Best (MB/s): 142.94 W-Method: 8 -m TABLE -
Region Best (MB/s): 225.01 W-Method: 8 -m TABLE -r DOUBLE -
Region Best (MB/s): 211.23 W-Method: 8 -m TABLE -r DOUBLE -r LAZY -
Region Best (MB/s): 160.09 W-Method: 8 -m LOG -
Region Best (MB/s): 123.61 W-Method: 8 -m LOG_ZERO -
Region Best (MB/s): 123.85 W-Method: 8 -m LOG_ZERO_EXT -
Region Best (MB/s): 1183.79 W-Method: 8 -m SPLIT 8 4 -r SIMD -
Region Best (MB/s): 177.68 W-Method: 8 -m SPLIT 8 4 -r NOSIMD -
Region Best (MB/s): 87.85 W-Method: 8 -m COMPOSITE 2 - -
Region Best (MB/s): 428.59 W-Method: 8 -m COMPOSITE 2 - -r ALTMAP -
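The idea behind the SPLIT 8 4 region multiplication can be sketched in portable C. This is a minimal scalar illustration, not the NEON code from the commit: the helper names (gf8_mul, split_8_4_region, split_8_4_selfcheck) are hypothetical, and the field polynomial 0x11d is an assumption. Multiplying a region by a constant c needs only two 16-entry tables, one per nibble, because multiplication distributes over XOR.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference multiply in GF(2^8); polynomial 0x11d is assumed here. */
static uint8_t gf8_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    while (b) {
        if (b & 1)
            r ^= a;
        b >>= 1;
        /* Multiply a by x and reduce if the top bit was set. */
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
    }
    return r;
}

/* Multiply n bytes of src by the constant c using two 4-bit split tables. */
static void split_8_4_region(uint8_t c, const uint8_t *src, uint8_t *dst, size_t n)
{
    uint8_t lo[16], hi[16];
    for (int i = 0; i < 16; i++) {
        lo[i] = gf8_mul(c, (uint8_t)i);        /* c * low nibble  */
        hi[i] = gf8_mul(c, (uint8_t)(i << 4)); /* c * high nibble */
    }
    for (size_t k = 0; k < n; k++)
        dst[k] = lo[src[k] & 0x0f] ^ hi[src[k] >> 4];
}

/* Returns 1 if the split-table result matches the reference for all bytes. */
static int split_8_4_selfcheck(uint8_t c)
{
    uint8_t src[256], dst[256];
    for (int i = 0; i < 256; i++)
        src[i] = (uint8_t)i;
    split_8_4_region(c, src, dst, 256);
    for (int i = 0; i < 256; i++)
        if (dst[i] != gf8_mul(c, src[i]))
            return 0;
    return 1;
}
```

A SIMD version would replace the two table lookups in the inner loop with byte-shuffle instructions operating on 16 bytes at a time, which is presumably where the large gap between the SIMD and NOSIMD figures comes from.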
|
Optimisations for the single table region multiplication and carry-less
multiplication using NEON's polynomial multiplication of 8-bit values.
A single polynomial multiplication is not that useful on its own, but the
vector version is useful for region multiplication.
Selected time_tool.sh results for a 1.7GHz cortex-a9:
Region Best (MB/s): 672.72 W-Method: 4 -m CARRY_FREE -
Region Best (MB/s): 265.84 W-Method: 4 -m BYTWO_p -
Region Best (MB/s): 329.41 W-Method: 4 -m TABLE -r DOUBLE -
Region Best (MB/s): 278.63 W-Method: 4 -m TABLE -r QUAD -
Region Best (MB/s): 329.81 W-Method: 4 -m TABLE -r QUAD -r LAZY -
Region Best (MB/s): 1318.03 W-Method: 4 -m TABLE -r SIMD -
Region Best (MB/s): 165.15 W-Method: 4 -m TABLE -r NOSIMD -
Region Best (MB/s): 99.73 W-Method: 4 -m LOG -
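For readers unfamiliar with carry-less multiplication, the following scalar sketch shows what the CARRY_FREE method computes for w=4. It is only an illustration under stated assumptions: the function names are hypothetical, the reduction polynomial x^4 + x + 1 (0x13) is assumed, and the commit itself uses NEON polynomial multiply instructions rather than this loop.

```c
#include <stdint.h>

/* Carry-less (polynomial) product of two 4-bit values; fits in 7 bits. */
static uint8_t clmul4(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    for (int i = 0; i < 4; i++)
        if (b & (1u << i))
            r ^= (uint8_t)(a << i); /* XOR instead of add: no carries */
    return r;
}

/* Reduce a 7-bit product modulo x^4 + x + 1 (0x13, an assumption). */
static uint8_t gf4_reduce(uint8_t p)
{
    for (int bit = 6; bit >= 4; bit--)
        if (p & (1u << bit))
            p ^= (uint8_t)(0x13u << (bit - 4));
    return (uint8_t)(p & 0x0f);
}

/* Multiply two elements of GF(2^4). */
static uint8_t gf4_mul(uint8_t a, uint8_t b)
{
    return gf4_reduce(clmul4(a & 0x0f, b & 0x0f));
}
```

A vector polynomial multiply performs the clmul4 step for many lanes at once, which is why it pays off for region multiplication even though a single product gains little.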
|
| | |
|
| |
| |
| |
| | |
Properly emulate aligned allocation if posix_memalign is not available.
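One common way to emulate aligned allocation on top of plain malloc is sketched below. The names aligned_malloc_emul and aligned_free_emul are hypothetical and this is not necessarily how the commit does it: the sketch over-allocates, rounds the address up to the requested power-of-two alignment, and stores the original pointer just before the aligned block so it can be freed later.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate size bytes aligned to align (must be a power of two). */
static void *aligned_malloc_emul(size_t size, size_t align)
{
    if (align == 0 || (align & (align - 1)) != 0)
        return NULL;                         /* not a power of two */
    if (size > SIZE_MAX - align - sizeof(void *))
        return NULL;                         /* size computation would overflow */

    void *raw = malloc(size + align - 1 + sizeof(void *));
    if (raw == NULL)
        return NULL;

    /* Leave room for the saved pointer, then round up to the alignment. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + align - 1) & ~(uintptr_t)(align - 1);

    ((void **)aligned)[-1] = raw;            /* remember what to free */
    return (void *)aligned;
}

/* Release a block obtained from aligned_malloc_emul. */
static void aligned_free_emul(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```

Blocks returned by this sketch must be released with aligned_free_emul, never with free directly, since the returned address is generally not the one malloc handed out.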
|
| |
| |
| |
| | |
Checks for arm_neon.h header.
|
| |
| |
| |
| |
| | |
SSE is not the only supported SIMD instruction set. Keep the old names
for backward compatibility.
|
| | |
|
| | |
|
|/
|
|
|
| |
There is no need to force the non-default CFLAGS on users trying to set
them via an environment variable or on the configure command line.
|
|\
| |
| | |
static code analysis fixes
|
| |
| |
| |
| |
| |
| |
| |
| | |
Since there can only be one -m, base cannot be set by -m COMPOSITE and
then deallocated on the second -m if it is bogus. The second -m will
exit on error at _gf_errno = GF_E_TWOMULT;.
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
|
|/
|
|
|
|
| |
Because shifting a 64-bit value right by 64 bits (>> 64) has undefined
behavior in C.
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
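A shift count equal to or larger than the operand width is undefined in C, so code that may shift by the full word size needs an explicit guard. The helpers below are a hypothetical sketch of such a guard, not the code from the commit.

```c
#include <stdint.h>

/* Right shift that is well defined for any count, including 64 and above. */
static uint64_t shr64_safe(uint64_t v, unsigned n)
{
    return (n >= 64) ? 0 : (v >> n);
}

/* Mask with the low w bits set; w == 64 must not be computed as 1 << 64. */
static uint64_t low_mask64(unsigned w)
{
    return (w >= 64) ? ~(uint64_t)0 : (((uint64_t)1 << w) - 1);
}
```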
|
|\
| |
| | |
On CPUs that don't support the SSE4.2 instruction set, this will fail
|
|/
|
|
|
|
|
| |
because the incorrect header is included.
smmintrin.h => SSE4.1
nmmintrin.h => SSE4.2
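The header mapping above can be expressed as a guarded include. This sketch assumes the compiler-predefined macros __SSE4_2__ and __SSE4_1__ (set by GCC and Clang when the matching -msse4.x option is active); the project's own configuration macros may differ, and SKETCH_SSE_LEVEL is a hypothetical name used only for illustration.

```c
/* Include only the intrinsics header that matches the enabled feature. */
#if defined(__SSE4_2__)
#include <nmmintrin.h>          /* SSE4.2 intrinsics */
#define SKETCH_SSE_LEVEL 42
#elif defined(__SSE4_1__)
#include <smmintrin.h>          /* SSE4.1 intrinsics */
#define SKETCH_SSE_LEVEL 41
#else
#define SKETCH_SSE_LEVEL 0      /* neither feature enabled at compile time */
#endif
```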
|
| |
|
|\ |
|
| |
| |
| |
| | |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| |
| |
| |
| | |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| |
| |
| |
| | |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| |
| |
| |
| | |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| |
| |
| |
| | |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| |
| |
| |
| |
| |
| | |
Fix dead assignment in case of INTEL_SSSE3 defined.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| |
| |
| |
| |
| |
| |
| | |
The 'm2' variable in gf_w64_clm_multiply_region_from_single_2() isn't
used except for calculations on 'm2' which are not used later in the code.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| |
| |
| |
| |
| |
| |
| | |
These assignments are never used and are directly overwritten later
in the function.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
According to the malloc man page, the behaviour for an allocation size of
0 bytes is implementation-defined: "If size was equal to 0, either NULL or a
pointer suitable to be passed to free() is returned"
Fix for clang scan-build report:
Unix API Undefined allocation of 0 bytes (CERT MEM04-C; CWE-131)
210 poly = (gf_general_t *) malloc(sizeof(gf_general_t)*(n+1));
9 Call to 'malloc' has an allocation size of 0 bytes
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
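A small wrapper can make the zero-size case explicit so the caller never depends on what malloc(0) returns. The function name alloc_array is hypothetical and the commit may guard the call differently; this is only a sketch of the general pattern.

```c
#include <stdlib.h>

/* Allocate count elements of elem_size bytes; refuse empty or overflowing requests. */
static void *alloc_array(size_t count, size_t elem_size)
{
    if (count == 0 || elem_size == 0)
        return NULL;                      /* never call malloc(0) */
    if (count > (size_t)-1 / elem_size)
        return NULL;                      /* count * elem_size would overflow */
    return malloc(count * elem_size);
}
```

With such a wrapper a NULL return always means "nothing was allocated", whether because the request was empty, too large, or memory ran out.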
|
| |
| |
| |
| |
| |
| | |
Check for array boundaries of 't' in while loop header.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
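The general shape of such a fix is to test the index against the array length before dereferencing, relying on the short-circuit evaluation of &&. The function below is a hypothetical illustration and does not reproduce the loop from the commit.

```c
#include <stddef.h>

/* Count leading non-zero entries of t without reading past len elements. */
static size_t count_until_zero(const unsigned char *t, size_t len)
{
    size_t i = 0;
    /* The bounds check comes first, so t[i] is only read when i < len. */
    while (i < len && t[i] != 0)
        i++;
    return i;
}
```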
|
| |
| |
| |
| | |
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Free all malloc-allocated memory before exit. Change the
if checks against 'w' into an if-else chain so that no further checks
run after one has already matched.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
Conflicts:
.gitignore
INSTALL
Makefile.in
aclocal.m4
config.guess
config.sub
configure
examples/Makefile.in
include/config.h.in
include/config.h.in~
install-sh
ltmain.sh
m4/libtool.m4
m4/ltversion.m4
missing
src/Makefile.in
test/Makefile.in
tools/Makefile.in
|
https://bitbucket.org/jayrde/gf-complete into wip-autoconf-cleanup
|
get the manual.
|
| |/
|/|
| |
| | |
for easy navigation.
|
|\ \
| | |
| | | |
Fixes for some issues found via Coverity in the Ceph project.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fix for coverity issue from Ceph project:
CID 1193093 (#1 of 1): Structurally dead code (UNREACHABLE)
unreachable: This code cannot be reached: "return gf_w4_double_table_i...".
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
Remove the identical expression and reorganize the code in gf_error_check()
so that all checks are handled identically. Removed the (raltmap && arg1 != 4)
check - this is dead code (arg1 is always 4 in this code path).
Fix for coverity issue from Ceph project:
CID 1193071 (#1 of 1): Same on both sides (CONSTANT_EXPRESSION_RESULT)
pointless_expression: The expression (arg1 == 4 && arg2 == 32) ||
(arg1 == 4 && arg2 == 32) does not accomplish anything because it
evaluates to either of its identical operands, arg1 == 4 && arg2 == 32.
Did you intend the operands to be different?
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Swap comments/messages on GF_E_SP128_A/GF_E_SP128_S.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
Since there is no comment indicating an intentional fallthrough, added a
break to switch cases 5 and 6.
Fix for coverity issue from Ceph project:
CID 1193084 (#1 of 1): Missing break in switch (MISSING_BREAK)
unterminated_case: This case (value 5) is not terminated by a 'break'
statement.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
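The pattern the checker asks for looks roughly like the sketch below: every case either ends in break or carries a comment marking the fallthrough as deliberate. The function and its values are hypothetical and unrelated to the actual switch in the source.

```c
/* Map a selector to a value; each case is explicitly terminated. */
static int select_value(int v)
{
    int r;
    switch (v) {
    case 4:
        /* fall through: 4 is intentionally handled like 5 */
    case 5:
        r = 50;
        break;
    case 6:
        r = 60;
        break;
    default:
        r = -1;
        break;
    }
    return r;
}
```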
|