delta/ffmpeg.git - git.ffmpeg.org: ffmpeg.git

	Commit message	Author	Age	Files	Lines
*	sbrdsp: move #if to disable all educational code	Janne Grunau	2014-03-18	1	-4/+8
	Avoids a warning of the unused function 'autocorrelate'.
*	sbrdsp: Unroll and use integer operations	Christophe Gisquet	2013-05-03	1	-12/+27
	This patch can be controversial, by assuming floats are IEEE-754 and particular behaviour of the FPU will get in the way.
	Timing on Arrandale and Win32 (thus, x87 FPU is used in the reference).
	sbr_qmf_pre_shuffle_c: 115 to 76
	sbr_neg_odd_64_c: 84 to 55
	sbr_qmf_post_shuffle_c: 112 to 83
	Signed-off-by: Diego Biurrun <diego@biurrun.de>
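	To illustrate what "use integer operations" can mean for a routine such as sbr_neg_odd_64_c, here is a minimal sketch, not the committed code. It assumes `float` is a 32-bit IEEE-754 value, so the sign lives in bit 31 and can be flipped with an integer XOR. The name `neg_odd_64_int` is hypothetical, and `memcpy` is used here only to keep the bit access well defined in standard C.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: negate the odd-indexed elements of a 64-float
 * array by toggling the sign bit with integer operations.
 * Assumption: float is IEEE-754 binary32 (sign in bit 31). */
static void neg_odd_64_int(float *x)
{
    for (int i = 1; i < 64; i += 2) {
        uint32_t bits;
        memcpy(&bits, &x[i], sizeof bits);  /* read the float's bit pattern */
        bits ^= UINT32_C(1) << 31;          /* flip the sign bit only */
        memcpy(&x[i], &bits, sizeof bits);  /* write it back */
    }
}
```

	The point of the integer form is that the loop never touches the FPU, which is plausibly why the commit message singles out the x87 reference timings.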
*	sbrdsp: Unroll sbr_autocorrelate_c	Christophe Gisquet	2013-05-03	1	-0/+25
	1410 cycles to 1148 on Arrandale/Win64
	Signed-off-by: Diego Biurrun <diego@biurrun.de>
*	x86: call most of the x86 dsp init functions under if (ARCH_X86)	Janne Grunau	2012-10-08	1	-1/+1
	Rename the called dsp init functions to *_init_x86.
*	SBR DSP: unroll sum_square	Christophe GISQUET	2012-03-07	1	-4/+9
	The length is even, so some unrolling can be performed.
	Timings are for x86:
	- 32bits: 102c -> 82c
	- 64bits: 82c -> 69c
	Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
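	The unrolling described above can be sketched as follows. This is an illustration under stated assumptions rather than the committed code: the input is taken to be an array of complex values stored as `float[2]` pairs, `n` is assumed to be even (as the commit message states), and the name `sum_square_unrolled` is hypothetical.

```c
/* Hypothetical sketch: sum of squared real and imaginary parts,
 * processing two complex samples per iteration.
 * Assumption: n is even, so no tail loop is needed. */
static float sum_square_unrolled(float (*x)[2], int n)
{
    float sum0 = 0.0f, sum1 = 0.0f;

    for (int i = 0; i < n; i += 2) {
        sum0 += x[i + 0][0] * x[i + 0][0];  /* real part, sample i     */
        sum1 += x[i + 0][1] * x[i + 0][1];  /* imag part, sample i     */
        sum0 += x[i + 1][0] * x[i + 1][0];  /* real part, sample i + 1 */
        sum1 += x[i + 1][1] * x[i + 1][1];  /* imag part, sample i + 1 */
    }
    return sum0 + sum1;
}
```

	Keeping two independent accumulators shortens the dependency chain between additions. Note that this changes the order of the floating-point additions, so results may differ in the last bits from a single-accumulator loop.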
*	SBR DSP x86: implement SSE sbr_sum_square_sse	Christophe GISQUET	2012-02-23	1	-0/+2
	The 32bits targets have been compiled with -mfpmath=sse for proper reference.
	sbr_sum_square
	C /32bits: 82c (unrolled)/102c
	C /64bits: 69c (unrolled)/82c
	SSE/32bits: 42c
	SSE/64bits: 31c
	Use of SSE4.1 dpps to perform the final sum is slower.
	Not unrolling to perform 8 operations in a loop yields 10 more cycles.
	Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
*	SBR DSP: use intptr_t for the ixh parameter.	Christophe GISQUET	2012-02-23	1	-1/+1
	Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
*	aacsbr: ARM NEON optimised sbrdsp functions	Mans Rullgard	2012-01-28	1	-0/+4
	Overall speedup of HE-AAC decoding 2.3x on Cortex-A8, 1.2x on A9.
	Signed-off-by: Mans Rullgard <mans@mansr.com>
*	aacsbr: move some simdable loops to function pointers	Mans Rullgard	2012-01-28	1	-0/+237
	This prepares for assembly optimisations by moving the most time-consuming loops to functions called through pointers in a new context.
	Signed-off-by: Mans Rullgard <mans@mansr.com>
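	The pattern of calling hot loops through pointers held in a context can be sketched roughly like this. All names here (`ExampleDSPContext`, `example_sum_square_c`, `example_dsp_init`) are hypothetical and chosen for illustration; the actual structure, fields, and init function in the tree may differ.

```c
/* Hypothetical context holding pointers to the hot loops. */
typedef struct ExampleDSPContext {
    float (*sum_square)(float (*x)[2], int n);
} ExampleDSPContext;

/* Plain C reference implementation used as the default. */
static float example_sum_square_c(float (*x)[2], int n)
{
    float sum = 0.0f;
    for (int i = 0; i < n; i++)
        sum += x[i][0] * x[i][0] + x[i][1] * x[i][1];
    return sum;
}

/* Fill the context with the C versions. An architecture-specific
 * init routine could later overwrite individual pointers with
 * optimised versions (assumption: selected at init time). */
static void example_dsp_init(ExampleDSPContext *s)
{
    s->sum_square = example_sum_square_c;
}
```

	Callers then invoke `s->sum_square(x, n)` without knowing which implementation is installed, which is what lets later commits in this log add NEON and SSE versions without touching the call sites.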