author hjl <hjl@138bc75d-0d04-0410-961f-82ee72b054a4> 2010-10-27 12:36:15 +0000
committer hjl <hjl@138bc75d-0d04-0410-961f-82ee72b054a4> 2010-10-27 12:36:15 +0000
commit 3970ad845c9e8831fed616742bf3e269df28f3b3
tree e6bb0c094c327d8c2c95394432623ac76fa04746 /gcc/doc
parent 31ccad94ad5151ede14d26b78431d4798c99af04
Add -mvzeroupper to x86.

gcc/

2010-10-27  H.J. Lu  <hongjiu.lu@intel.com>

        * config/i386/i386-protos.h (init_cumulative_args): Add an int.

        * config/i386/i386.c (block_info): New.
        (BLOCK_INFO): Likewise.
        (call_avx256_state): Likewise.
        (check_avx256_stores): Likewise.
        (move_or_delete_vzeroupper_2): Likewise.
        (move_or_delete_vzeroupper_1): Likewise.
        (move_or_delete_vzeroupper): Likewise.
        (use_avx256_p): Likewise.
        (function_pass_avx256_p): Likewise.
        (flag_opts): Add -mvzeroupper.
        (ix86_option_override_internal): Turn on MASK_VZEROUPPER by
        default for TARGET_AVX.  Turn off MASK_VZEROUPPER if TARGET_AVX
        is disabled.
        (ix86_function_ok_for_sibcall): Disable sibcall if we need to
        generate vzeroupper.
        (init_cumulative_args): Add an int to indicate caller.  Set
        use_avx256_p, callee_return_avx256_p and caller_use_avx256_p
        based on return type.
        (ix86_function_arg): Set use_avx256_p, callee_pass_avx256_p and
        caller_pass_avx256_p based on argument type.
        (ix86_expand_epilogue): Emit vzeroupper if 256bit AVX register
        is used, but not returned by caller.
        (ix86_expand_call): Emit vzeroupper if 256bit AVX register is
        used.
        (ix86_local_alignment): Set use_avx256_p if 256bit AVX register
        is used.
        (ix86_minimum_alignment): Likewise.
        (ix86_expand_special_args_builtin): Set target to GEN_INT
        (vzeroupper_intrinsic) for CODE_FOR_avx_vzeroupper.
        (ix86_reorg): Run the vzeroupper optimization if needed.

        * config/i386/i386.h (ix86_args): Add caller.
        (INIT_CUMULATIVE_ARGS): Updated.
        (machine_function): Add use_vzeroupper_p, use_avx256_p,
        caller_pass_avx256_p, caller_return_avx256_p,
        callee_pass_avx256_p and callee_return_avx256_p.

        * config/i386/i386.opt (-mvzeroupper): New.

        * config/i386/predicates.md (vzeroupper_operation): Removed.

        * config/i386/sse.md (avx_vzeroupper): Removed.
        (*avx_vzeroupper): Removed.
        (avx_vzeroupper): New.

        * doc/invoke.texi: Document -mvzeroupper.

gcc/testsuite/

2010-10-27  H.J. Lu  <hongjiu.lu@intel.com>

        * gcc.target/i386/avx-vzeroupper-1.c: Add -mtune=generic.
        * gcc.target/i386/avx-vzeroupper-2.c: Likewise.

        * gcc.target/i386/avx-vzeroupper-3.c: New.
        * gcc.target/i386/avx-vzeroupper-4.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-5.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-6.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-7.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-8.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-9.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-10.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-11.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-12.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-13.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-14.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@166000 138bc75d-0d04-0410-961f-82ee72b054a4

Diffstat (limited to 'gcc/doc')
-rw-r--r-- gcc/doc/invoke.texi | 9
1 files changed, 8 insertions, 1 deletions
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7ea042f6775..365b8c3af43 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -594,7 +594,7 @@ Objective-C and Objective-C++ Dialects}.
 -mno-wide-multiply -mrtd -malign-double @gol
 -mpreferred-stack-boundary=@var{num}
 -mincoming-stack-boundary=@var{num} @gol
--mcld -mcx16 -msahf -mmovbe -mcrc32 -mrecip @gol
+-mcld -mcx16 -msahf -mmovbe -mcrc32 -mrecip -mvzeroupper @gol
 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 -mavx @gol
 -maes -mpclmul -mfsgsbase -mrdrnd -mf16c -mfused-madd @gol
 -msse4a -m3dnow -mpopcnt -mabm -mfma4 -mxop -mlwp @gol
@@ -12466,6 +12466,13 @@ GCC with the @option{--enable-cld} configure option. Generation of @code{cld}
 instructions can be suppressed with the @option{-mno-cld} compiler option
 in this case.
+@item -mvzeroupper
+@opindex mvzeroupper
+This option instructs GCC to emit a @code{vzeroupper} instruction
+before a transfer of control flow out of the function to minimize
+the AVX to SSE transition penalty as well as remove unnecessary
+@code{zeroupper} intrinsics.
+
 @item -mcx16
 @opindex mcx16
 This option will enable GCC to use CMPXCHG16B instruction in generated code.
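
As a rough illustration of the @option{-mvzeroupper} text added in the second hunk, the sketch below is a function that works on 256-bit vectors and then returns to its caller, which is a "transfer of control flow out of the function" in the sense of the new documentation. The function name `add_arrays` and the `v8sf` typedef are hypothetical and not taken from the commit or its testsuite. The code uses the GCC vector extension rather than AVX intrinsics so that it also builds without `-mavx`. Whether the compiler actually keeps the data in 256-bit registers and emits `vzeroupper` depends on the target flags and compiler version.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: eight floats in one 256-bit vector (GCC vector
   extension).  With -mavx this can map to a single YMM register.  */
typedef float v8sf __attribute__ ((vector_size (32)));

/* Hypothetical helper: a[i] += b[i] for i in [0, n), eight elements per
   step.  N is assumed to be a multiple of 8.  */
void
add_arrays (float *a, const float *b, size_t n)
{
  for (size_t i = 0; i < n; i += 8)
    {
      v8sf va, vb;
      /* memcpy avoids alignment and aliasing assumptions on A and B.  */
      memcpy (&va, a + i, sizeof va);
      memcpy (&vb, b + i, sizeof vb);
      va += vb;
      memcpy (a + i, &va, sizeof va);
    }
  /* The function returns here without passing or returning a 256-bit
     value.  Per the option description, this is the kind of exit before
     which GCC may place a vzeroupper when built with -mavx -mvzeroupper.  */
}
```

One way to check what a given compiler does is to build the file (here assumed to be named `add.c`) with `gcc -O2 -mavx -mvzeroupper -S add.c` and then with `-mno-vzeroupper`, and compare the two assembly outputs for a `vzeroupper` before the `ret`.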