path: root/libavcodec/x86/dsputil_yasm.asm
Commit message (Author, Age, Files, Lines)
* Add forgotten %ifdef HAVE_AVX. (Reimar Döffinger, 2011-12-03, 1 file, -0/+2)
|   Fixes compilation with older YASM/NASM versions.
|   Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
* Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-11-23, 1 file, -48/+49)
|\
| |   * qatar/master: (22 commits) aacdec: Fix PS in ADTS. avconv: Consistently use PIX_FMT_NONE. dsputil: use cpuflags in x86 emu_edge_core dsputil: use movups instead of movdqu in ff_emu_edge_core_sse() wma: initialize prev_block_len_bits, next_block_len_bits, and block_len_bits. mov: Remove some redundant and obsolete comments. Add libavutil/mathematics.h #includes for INFINITY doxy: structure libavformat groups doxy: introduce an empty structure in libavcodec doxy: provide a start page and document libavutil doxy: cleanup pixfmt.h regtest: split video encode/decode tests into individual targets ARM: add explicit .arch and .fpu directives to asm.S pthread: do not touch has_b_frames avconv: cleanup the transcoding loop in output_packet(). avconv: split subtitle transcoding out of output_packet(). avconv: split video transcoding out of output_packet(). avconv: split audio transcoding out of output_packet(). avconv: reindent. avconv: move streamcopy-only code out of decoding loop. ...
| |   Conflicts: avconv.c libavcodec/aaccoder.c libavcodec/pthread.c libavcodec/version.h libavutil/audioconvert.h libavutil/avutil.h libavutil/mem.h tests/ref/vsynth1/dv tests/ref/vsynth1/mpeg2thread tests/ref/vsynth2/dv tests/ref/vsynth2/mpeg2thread
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * dsputil: use cpuflags in x86 emu_edge_core (Justin Ruggles, 2011-11-22, 1 file, -45/+46)
| |   avoids passing around the extra argument among all the macros it uses
| * dsputil: use movups instead of movdqu in ff_emu_edge_core_sse() (Justin Ruggles, 2011-11-22, 1 file, -3/+3)
| |   This allows emulated_edge_mc_sse() and gmc_sse() to be used under AV_CPU_FLAG_SSE.
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-11-12, 1 file, -0/+48)
|\ \
| |/
| |   * qatar/master: vble: remove vble_error_close VBLE Decoder tta: use an integer instead of a pointer to iterate output samples shorten: do not modify samples pointer when interleaving mpc7: only support stereo input. dpcm: do not try to decode empty packets dpcm: remove unneeded buf_size==0 check. twinvq: add SSE/AVX optimized sum/difference stereo interleaving vqf/twinvq: pass vqf COMM chunk info in extradata vqf: do not set bits_per_coded_sample for TwinVQ. twinvq: check for allocation failure in init_mdct_win() swscale: add padding to conversion buffer. rtpdec: Simplify finalize_packet http: Handle proxy authentication http: Print an error message for Authorization Required, too AVOptions: don't return an invalid option when option list is empty AIFF: add 'twos' FourCC for the mux/demuxer (big endian PCM audio)
| |   Conflicts: libavcodec/avcodec.h libavcodec/tta.c libavcodec/vble.c libavcodec/version.h libavutil/opt.c libswscale/utils.c
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * twinvq: add SSE/AVX optimized sum/difference stereo interleaving (Justin Ruggles, 2011-11-11, 1 file, -0/+48)
| |
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-11-08, 1 file, -17/+23)
|\ \
| |/
| |   * qatar/master: avformat: Avoid a warning about mixed declarations and code BMV demuxer and decoder matroskaenc: Make sure the seekhead struct is freed even on seek failure mpeg12enc: Remove write-only variables. mpeg12enc: Don't set up run-level info for level 0. msmpeg4: Don't set up run-level info for level 0. avformat: Warn about using network functions without calling avformat_network_init avformat: Revise wording rdt: Set AVFMT_NOFILE on ff_rdt_demuxer rdt: Check the return value of avformat_open rtsp: Discard the dynamic handler, if it has an alloc function which failed dsputil: use cpuflags in x86 versions of vector_clip_int32()
| |   Conflicts: libavcodec/avcodec.h libavcodec/version.h libavformat/Makefile libavformat/allformats.c libavformat/version.h
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * dsputil: use cpuflags in x86 versions of vector_clip_int32() (Justin Ruggles, 2011-11-06, 1 file, -17/+23)
| |
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-10-22, 1 file, -8/+0)
|\ \
| |/
| |   * qatar/master: (35 commits) flvdec: Do not call parse_keyframes_index with a NULL stream libspeexdec: include system headers before local headers libspeexdec: return meaningful error codes libspeexdec: cosmetics: reindent libspeexdec: decode one frame at a time. swscale: fix signed shift overflows in ff_yuv2rgb_c_init_tables() Move timefilter code from lavf to lavd. mov: add support for hdvd and pgapmetadata atoms mov: rename function _stik, some indentation cosmetics mov: rename function _int8 to remove ambiguity, some indentation cosmetics mov: parse the gnre atom mp3on4: check for allocation failures in decode_init_mp3on4() mp3on4: create a separate flush function for MP3onMP4. mp3on4: ensure that the frame channel count does not exceed the codec channel count. mp3on4: set channel layout mp3on4: fix the output channel order mp3on4: allocate temp buffer with av_malloc() instead of on the stack. mp3on4: copy MPADSPContext from first context to all contexts. fmtconvert: port float_to_int16_interleave() 2-channel x86 inline asm to yasm fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm ...
| |   Conflicts: libavcodec/arm/h264dsp_init_arm.c libavcodec/h264.c libavcodec/h264.h libavcodec/h264_cabac.c libavcodec/h264_cavlc.c libavcodec/h264_ps.c libavcodec/h264dsp_template.c libavcodec/h264idct_template.c libavcodec/h264pred.c libavcodec/h264pred_template.c libavcodec/x86/h264dsp_mmx.c libavdevice/Makefile libavdevice/jack_audio.c libavformat/Makefile libavformat/flvdec.c libavformat/flvenc.c libavutil/pixfmt.h libswscale/utils.c
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * fmtconvert: port int32_to_float_fmul_scalar() x86 inline asm to yasm (Justin Ruggles, 2011-10-21, 1 file, -8/+0)
| |
* | Move x264asm to libavutil. (Kieran Kunhya, 2011-10-19, 1 file, -1/+1)
| |   Signed-off-by: Michael Niedermayer <michaelni@gmx.at>
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-08-18, 1 file, -33/+1)
|\ \
| |/
| |   * qatar/master: (23 commits) h264: hide reference frame errors unless requested swscale: split hScale() function pointer into h[cy]Scale(). Move clipd macros to x86util.asm. avconv: reindent. avconv: rescue poor abused start_time global. avconv: rescue poor abused recording_time global. avconv: merge two loops in output_packet(). avconv: fix broken indentation. avconv: get rid of the arbitrary MAX_FILES limit. avconv: get rid of the output_streams_for_file vs. ost_table schizophrenia avconv: add a wrapper for output AVFormatContexts and merge output_opts into it avconv: make itsscale syntax consistent with other options. avconv: factor out adding input streams. avconv: Factorize combining auto vsync with format. avconv: Factorize video resampling. avconv: Don't unnecessarily convert ipts to a double. ffmpeg: remove unsed variable nopts RV3/4 parser: remove unused variable 'off' add XMV demuxer rmdec: parse FPS in RealMedia properly ...
| |   Conflicts: avconv.c libavformat/version.h libswscale/swscale.c tests/ref/fate/lmlm4-demux
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Move clipd macros to x86util.asm. (Ronald S. Bultje, 2011-08-17, 1 file, -33/+1)
| |   This allows sharing them between multiple .asm files.
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-08-15, 1 file, -1/+1)
|\ \
| |/
| |   * qatar/master: Fix NASM include directive dsputil_mmx: Honor HAVE_AMD3DNOW lavf,lavd: remove all usage of AVFormatParameters from demuxers. jack: add 'channels' private option. VC-1: fix reading of custom PAR. Remove redundant and dubious video codec detection by its extradata mpeg12: remove repeat-field code disabled since May 2002 patch checklist: suggest fate instead of regression tests Turn on resampling on sudden size change instead of bailing out during recode. avtools: reinitialise filter chain when input video stream changes dimensions
| |   Conflicts: Makefile avconv.c doc/developer.texi ffplay.c libavcodec/x86/dsputil_mmx.c libavdevice/libdc1394.c
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Fix NASM include directive (Dave Yeo, 2011-08-15, 1 file, -1/+1)
| |   Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
* | Merge commit 'b2c087871dafc7d030b2d48457ddff597dfd4925' (Michael Niedermayer, 2011-08-13, 1 file, -1/+1)
|\ \
| |/
| |   * commit 'b2c087871dafc7d030b2d48457ddff597dfd4925': Move x86util.asm from libavcodec/ to libavutil/. Move x86inc.asm to libavutil/. APIchanges: note error_recognition in lavf lavf: add support for error_recognition, use it in avidec, and bump minor API version avconv: change semantics of -map avconv: get rid of new* options. cmdutils: allow precisely specifying a stream for AVOptions. configure: add missing CFLAGS to fix building on the HURD libx264: Include hint for possible values for configuring libx264 cmdutils: allow ':'-separated modifiers in option names. avconv: make -map_metadata work consistently with the other options avconv: remove deprecated options. avconv: make -map_chapters accept only the input file index. Make a copy of ffmpeg under a new name -- avconv. ffmpeg: add a warning stating that the program is deprecated. Add weighted motion compensation for RV40 B-frames RV3/4: calculate B-frame motion weights once per frame Move RV3/4-specific DSP functions into their own context mjpeg: propagate decode errors from ff_mjpeg_decode_sos and ff_mjpeg_decode_dqt h264: notice memory allocation failure
| |   Conflicts: .gitignore Makefile cmdutils.c configure doc/ffplay.texi doc/ffprobe.texi doc/ffserver.texi libavcodec/libx264.c libavformat/avformat.h libavformat/avidec.c libavformat/version.h tests/lavf-regression.sh tests/lavfi-regression.sh
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Move x86inc.asm to libavutil/. (Ronald S. Bultje, 2011-08-12, 1 file, -1/+1)
| |   This allows using it in libswscale/ also.
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-07-02, 1 file, -0/+115)
|\ \
| |/
| |   * qatar/master: get_bits: remove x86 inline asm in A32 bitstream reader doc: Remove outdated information about our issue tracker avidec: Factor out the sync fucntionality. fate-aac: Expand coverage. ac3dsp: add x86-optimized versions of ac3dsp.extract_exponents(). ac3dsp: simplify extract_exponents() now that it does not need to do clipping. ac3enc: clip coefficients after MDCT. ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions. swscale: for >8bit scaling, read in native bit-depth. matroskadec: matroska_read_seek after after EBML_STOP leads to failure. doxygen: fix usage of @file directive in libavutil/{dict,file}.h doxygen: Help doxygen parser to understand the DECLARE_ALIGNED and offsetof macros
| |   Conflicts: doc/issue_tracker.txt libavformat/avidec.c libavutil/dict.h libswscale/swscale.c libswscale/utils.c tests/ref/lavfi/pixfmts_scale
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * ac3enc: add int32_t array clipping function to DSPUtil, including x86 versions. (Justin Ruggles, 2011-07-01, 1 file, -0/+115)
| |
* | Merge remote-tracking branch 'qatar/master' (Michael Niedermayer, 2011-05-21, 1 file, -1/+1)
|\ \
| |/
| |   * qatar/master: configure: make executable again LATM/AAC: Free previously initialized context on reinit. configure: Do not unconditionally add -Wall to host CFLAGS. configure: Set OS/2 objformat to a.out. Add support for a.out object format to assembler macros. fate: disable threading for encoding fate: add comment field fate: allow overriding default build and install dirs mpegtsenc: Add an AVClass pointer to the private data mpegaudio: clean up #includes mpegaudio: move all header parsing to mpegaudiodecheader.[ch]
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Add support for a.out object format to assembler macros. (Dave Yeo, 2011-05-20, 1 file, -1/+1)
| |   This format is still used by e.g. OS/2.
| |   Signed-off-by: Diego Biurrun <diego@biurrun.de>
* | Merge remote branch 'qatar/master' (Michael Niedermayer, 2011-05-15, 1 file, -1/+1)
|\ \
| |/
| |   * qatar/master: Fix FSF address copy paste error in some license headers. Add an aac sample which uses LTP to fate-aac. DUPLICATE [PATCH] Update pixdesc_be fate refs after adding 9/10bit YUV420P formats. arm: properly mark external symbol call
| |   Conflicts: libavcodec/x86/ac3dsp.asm libavcodec/x86/deinterlace.asm libavcodec/x86/dsputil_yasm.asm libavcodec/x86/dsputilenc_yasm.asm libavcodec/x86/fft_mmx.asm libavcodec/x86/fmtconvert.asm libavcodec/x86/h264_chromamc.asm libavcodec/x86/h264_deblock.asm libavcodec/x86/h264_idct.asm libavcodec/x86/h264_intrapred.asm libavcodec/x86/h264_weight.asm libavcodec/x86/vc1dsp_yasm.asm libavcodec/x86/vp3dsp.asm libavcodec/x86/vp56dsp.asm libavcodec/x86/vp8dsp.asm libavcodec/x86/x86util.asm libswscale/ppc/swscale_template.c
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Fix FSF address copy paste error in some license headers. (Diego Biurrun, 2011-05-14, 1 file, -1/+1)
| |
* | Merge remote-tracking branch 'newdev/master' (Michael Niedermayer, 2011-03-24, 1 file, -0/+126)
|\ \
| |/
| |   * newdev/master: avio: make udp_set_remote_url/get_local_port internal. asfdec: also subtract preroll when reading simple index object matroskaenc: remove a variable that's unused after bc17bd9. avio: cosmetics - nicer vertical alignment. Remove unnecessary icc version checks Disable 'attribute "foo" ignored' warnings from icc rtsp: Don't use a locale dependent format string Add xd55 codec tag for XDCAM HD422 720p25 CBR files. configure: get libavcodec version from new version.h header lavc: move the version macros to a new installed header. matroskaenc: simplify get_aac_sample_rates by using ff_mpeg4audio_get_config Do not use format string "%0.3f" for RTSP Range field. Add apply_window_int16() to DSPContext with x86-optimized versions and use it in the ac3_fixed encoder. Document usage of import libraries created by dlltool configure: Set the correct lib target for arm/wince dlltool fate: simplify regression-funcs.sh fate: add support for multithread testing
| |   Conflicts: libavformat/rtspdec.c libavutil/attributes.h libavutil/internal.h libavutil/mem.h
| |   Merged-by: Michael Niedermayer <michaelni@gmx.at>
| * Add apply_window_int16() to DSPContext with x86-optimized versions and use it in the ac3_fixed encoder. (Justin Ruggles, 2011-03-22, 1 file, -0/+126)
| |
| * Replace FFmpeg with Libav in licence headers (Mans Rullgard, 2011-03-19, 1 file, -4/+4)
| |   Signed-off-by: Mans Rullgard <mans@mansr.com>
| * Fix ff_emu_edge_core_sse() on Win64. (Ronald S. Bultje, 2011-02-08, 1 file, -5/+15)
| |   Fix emu_edge_v_extend_15 to be <128 bytes on Win64, by being more strict on the size of registers and which registers are being used for operations where multiple are available. This fixes segfaults in emulated_edge() function calls on Win64.
| * Separate format conversion DSP functions from DSPContext. (Justin Ruggles, 2011-02-02, 1 file, -69/+0)
| |   This will be beneficial for use with the audio conversion API without requiring it to depend on all of dsputil.
| |   Signed-off-by: Mans Rullgard <mans@mansr.com>
| * Implement a SIMD version of emulated_edge_mc() for x86. (Ronald S. Bultje, 2011-01-31, 1 file, -0/+560)
| |   From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32) and 196 (SSE2/x86-32) cycles.
* | Fix ff_emu_edge_core_sse() on Win64. (Ronald S. Bultje, 2011-02-09, 1 file, -5/+15)
| |   Fix emu_edge_v_extend_15 to be <128 bytes on Win64, by being more strict on the size of registers and which registers are being used for operations where multiple are available. This fixes segfaults in emulated_edge() function calls on Win64.
| |   (cherry picked from commit 17cf7c68ed26a4cb3c7adf7488a38c2e19118918)
* | Separate format conversion DSP functions from DSPContext. (Justin Ruggles, 2011-02-04, 1 file, -69/+0)
| |   This will be beneficial for use with the audio conversion API without requiring it to depend on all of dsputil.
| |   Signed-off-by: Mans Rullgard <mans@mansr.com>
| |   (cherry picked from commit c73d99e672329c8f2df290736ffc474c360ac4ae)
* | Implement a SIMD version of emulated_edge_mc() for x86. (Ronald S. Bultje, 2011-02-02, 1 file, -0/+560)
|/
|   From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32) and 196 (SSE2/x86-32) cycles.
|   (cherry picked from commit 81f2a3f4ffcc6935b8b8ada4954700b3f333ae4f)
* Update x264asm header files to latest versions. (Jason Garrett-Glaser, 2010-06-23, 1 file, -11/+11)
|   Modify the asm accordingly. GLOBAL is now no longer necessary for PIC-compliant loads.
|   Originally committed as revision 23739 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Implement an sse version of scalarproduct_float(). (Alex Converse, 2010-01-22, 1 file, -0/+24)
|   Originally committed as revision 21386 to svn://svn.ffmpeg.org/ffmpeg/trunk
* fix a crash in ape decoding on x86_32 sse2 (Loren Merritt, 2009-12-08, 1 file, -1/+1)
|   Originally committed as revision 20777 to svn://svn.ffmpeg.org/ffmpeg/trunk
* slightly faster scalarproduct_and_madd_int16_ssse3 on penryn, no change on conroe (Loren Merritt, 2009-12-05, 1 file, -5/+13)
|   Originally committed as revision 20743 to svn://svn.ffmpeg.org/ffmpeg/trunk
* refactor and optimize scalarproduct (Loren Merritt, 2009-12-05, 1 file, -37/+127)
|   29-105% faster apply_filter, 6-90% faster ape decoding on core2 (Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)
|   9-123% faster ape decoding on G4.
|   Originally committed as revision 20739 to svn://svn.ffmpeg.org/ffmpeg/trunk
* port ape dsp functions from sse2 to mmx (Loren Merritt, 2009-12-03, 1 file, -0/+75)
|   now requires yasm
|   Originally committed as revision 20722 to svn://svn.ffmpeg.org/ffmpeg/trunk
* fix linking on systems with a function name prefix (10l in r20287) (Loren Merritt, 2009-10-18, 1 file, -1/+1)
|   Originally committed as revision 20294 to svn://svn.ffmpeg.org/ffmpeg/trunk
* huffyuv: add some const qualifiers (Loren Merritt, 2009-10-18, 1 file, -2/+2)
|   Originally committed as revision 20290 to svn://svn.ffmpeg.org/ffmpeg/trunk
* simd add_hfyu_left_prediction (Loren Merritt, 2009-10-18, 1 file, -0/+74)
|   2.2x faster than C on conroe, 3.6x on penryn. 4-6% faster huffyuv decoding if using left or plane mode and yuv
|   Originally committed as revision 20287 to svn://svn.ffmpeg.org/ffmpeg/trunk
* ff_add_hfyu_median_prediction_mmx2 (Loren Merritt, 2009-02-08, 1 file, -0/+60)
|   overall ffvhuff decoding speedup: 28% on core2, 25% on k8.
|   Originally committed as revision 17059 to svn://svn.ffmpeg.org/ffmpeg/trunk
* Rename libavcodec/i386/ --> libavcodec/x86/. (Diego Biurrun, 2008-12-22, 1 file, -0/+92)
    It contains optimizations that are not specific to i386 and libavutil uses this naming scheme already.
    Originally committed as revision 16270 to svn://svn.ffmpeg.org/ffmpeg/trunk