author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> 2019-11-07 10:39:39 +0000
committer: Kyrylo Tkachov <ktkachov@gcc.gnu.org> 2019-11-07 10:39:39 +0000
commit: cf16f980e5278c146f04587ea2a378fab950d7b3 (patch)
tree: 7d0163547be22d04c3d43c5bc0d6c02b9f1c5773 /gcc/config/arm/arm.md
parent: e9d01715bd7e033eacfbadbfab5f1f206221305c (diff)
[arm][1/X] Add initial support for saturation intrinsics

This patch adds the plumbing for and an implementation of the saturation
intrinsics from ACLE, in particular the __ssat, __usat intrinsics.
These intrinsics set the Q sticky bit in APSR if an overflow occurred.
ACLE allows the user to read that bit (within the same function, it's not
defined across function boundaries) using the __saturation_occurred
intrinsic and reset it using __set_saturation_occurred.
Thus, if the user cares about the Q bit they would be using a flow such as:

  __set_saturation_occurred (0); // reset the Q bit
  ...
  __ssat (...) // Do some calculations involving __ssat
  ...
  if (__saturation_occurred ()) // if Q bit set handle overflow
    ...

For the implementation this has a few implications:

* We must track the Q-setting side-effects of these instructions to make
  sure saturation reading/writing intrinsics are ordered properly.
  This is done by introducing a new "apsrq" register (and associated
  APSRQ_REGNUM) in a similar way to the "fake" cc register.

* The RTL patterns coming out of these intrinsics can have two forms:
  one where they set the APSRQ_REGNUM and one where they don't.
  Which one is used depends on whether the function cares about reading the
  Q flag. This is detected using the TARGET_CHECK_BUILTIN_CALL hook on the
  __saturation_occurred, __set_saturation_occurred occurrences.
  If no Q-flag read is present in the function we'll use the simpler
  non-Q-setting form to allow for more aggressive scheduling and such.
  If a Q-bit read is present then the Q-setting form is emitted.
  To avoid adding two patterns for each intrinsic to the MD file we make
  use of define_subst to auto-generate the Q-setting forms.

* Some existing patterns already produce instructions that may clobber the
  Q bit, but they don't model it (as we didn't care about that bit up till
  now). Since these patterns can be generated from straight-line C code
  they can affect the Q-bit reads from intrinsics. Therefore they have to be
  disabled when a Q-bit read is present. These are mostly patterns in
  arm-fixed.md that are not very common anyway, but there are also a couple
  of widening multiply-accumulate patterns in arm.md that can set the Q-bit
  during accumulation.

There are more Q-setting intrinsics in ACLE, but these will be implemented in
a more mechanical fashion once the infrastructure in this patch goes in.

	* config/arm/aout.h (REGISTER_NAMES): Add apsrq.
	* config/arm/arm.md (APSRQ_REGNUM): Define.
	(add_setq): New define_subst.
	(add_clobber_q_name): New define_subst_attr.
	(add_clobber_q_pred): Likewise.
	(maddhisi4): Change to define_expand. Split into mult and add if
	ARM_Q_BIT_READ.
	(arm_maddhisi4): New define_insn.
	(*maddhisi4tb): Disable for ARM_Q_BIT_READ.
	(*maddhisi4tt): Likewise.
	(arm_ssat): New define_expand.
	(arm_usat): Likewise.
	(arm_get_apsr): New define_insn.
	(arm_set_apsr): Likewise.
	(arm_saturation_occurred): New define_expand.
	(arm_set_saturation): Likewise.
	(*satsi_<SAT:code>): Rename to...
	(satsi_<SAT:code><add_clobber_q_name>): ... This.
	(*satsi_<SAT:code>_shift): Disable for ARM_Q_BIT_READ.
	* config/arm/arm.h (FIXED_REGISTERS): Mark apsrq as fixed.
	(CALL_USED_REGISTERS): Mark apsrq.
	(FIRST_PSEUDO_REGISTER): Update value.
	(REG_ALLOC_ORDER): Add APSRQ_REGNUM.
	(machine_function): Add q_bit_access.
	(ARM_Q_BIT_READ): Define.
	* config/arm/arm.c (TARGET_CHECK_BUILTIN_CALL): Define.
	(arm_conditional_register_usage): Clear APSRQ_REGNUM from
	operand_reg_set.
	(arm_q_bit_access): Define.
	* config/arm/arm-builtins.c: Include stringpool.h.
	(arm_sat_binop_imm_qualifiers,
	arm_unsigned_sat_binop_unsigned_imm_qualifiers,
	arm_sat_occurred_qualifiers, arm_set_sat_qualifiers): Define.
	(SAT_BINOP_UNSIGNED_IMM_QUALIFIERS,
	UNSIGNED_SAT_BINOP_UNSIGNED_IMM_QUALIFIERS, SAT_OCCURRED_QUALIFIERS,
	SET_SAT_QUALIFIERS): Likewise.
	(arm_builtins): Define ARM_BUILTIN_SAT_IMM_CHECK.
	(arm_init_acle_builtins): Initialize __builtin_sat_imm_check.
	Handle 0 argument expander.
	(arm_expand_acle_builtin): Handle ARM_BUILTIN_SAT_IMM_CHECK.
	(arm_check_builtin_call): Define.
	* config/arm/arm.md (ssmulsa3, usmulusa3, usmuluha3,
	arm_ssatsihi_shift, arm_usatsihi): Disable when ARM_Q_BIT_READ.
	* config/arm/arm-protos.h (arm_check_builtin_call): Declare prototype.
	(arm_q_bit_access): Likewise.
	* config/arm/arm_acle.h (__ssat, __usat, __ignore_saturation,
	__saturation_occurred, __set_saturation_occurred): Define.
	* config/arm/arm_acle_builtins.def: Define builtins for ssat, usat,
	saturation_occurred, set_saturation_occurred.
	* config/arm/unspecs.md (UNSPEC_Q_SET): Define.
	(UNSPEC_APSR_READ): Likewise.
	(VUNSPEC_APSR_WRITE): Likewise.
	* config/arm/arm-fixed.md (ssadd<mode>3): Convert to define_expand.
	(*arm_ssadd<mode>3): New define_insn.
	(sssub<mode>3): Convert to define_expand.
	(*arm_sssub<mode>3): New define_insn.
	(ssmulsa3): Convert to define_expand.
	(*arm_ssmulsa3): New define_insn.
	(usmulusa3): Convert to define_expand.
	(*arm_usmulusa3): New define_insn.
	(ssmulha3): FAIL if ARM_Q_BIT_READ.
	(arm_ssatsihi_shift, arm_usatsihi): Disable for ARM_Q_BIT_READ.
	* config/arm/iterators.md (qaddsub_clob_q): New mode attribute.

	* gcc.target/arm/acle/saturation.c: New test.
	* gcc.target/arm/acle/sat_no_smlatb.c: Likewise.
	* lib/target-supports.exp (check_effective_target_arm_qbit_ok_nocache):
	Define.
	(check_effective_target_arm_qbit_ok): Likewise.
	(add_options_for_arm_qbit): Likewise.

From-SVN: r277914
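
The reset / saturate / test flow from the commit message can be sketched as a
portable software model in C. This is only an illustration and is not part of
the patch: every model_* name and MODEL_Q_BIT are hypothetical, the "APSR" is
an ordinary variable, and real code would include arm_acle.h and call __ssat,
__usat, __saturation_occurred and __set_saturation_occurred on an Arm target.
The bounds follow the arithmetic in the arm_ssat and arm_usat expanders, and
the Q flag is kept at bit 27 as in arm_saturation_occurred.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model; none of these names exist in GCC or ACLE.  */
#define MODEL_Q_BIT 27u        /* Q is bit 27 of the APSR.  */

static uint32_t model_apsr;    /* Stand-in for the real APSR.  */

/* Model of __set_saturation_occurred: insert (value != 0) at bit 27.  */
void model_set_saturation_occurred(int value)
{
    uint32_t bit = (value != 0) ? 1u : 0u;
    model_apsr = (model_apsr & ~(1u << MODEL_Q_BIT)) | (bit << MODEL_Q_BIT);
}

/* Model of __saturation_occurred: extract bit 27.  */
int model_saturation_occurred(void)
{
    return (int)((model_apsr >> MODEL_Q_BIT) & 1u);
}

/* Model of __ssat (x, width): clamp to [-2^(width-1), 2^(width-1) - 1]
   and set the sticky Q bit if clamping changed the value.  */
int32_t model_ssat(int32_t x, unsigned width)
{
    assert(width >= 1 && width <= 32);   /* Same range the expander asserts.  */
    int64_t upper = ((int64_t)1 << (width - 1)) - 1;
    int64_t lower = -upper - 1;
    if (x > upper) { model_apsr |= 1u << MODEL_Q_BIT; return (int32_t)upper; }
    if (x < lower) { model_apsr |= 1u << MODEL_Q_BIT; return (int32_t)lower; }
    return x;
}

/* Model of __usat (x, width): clamp to [0, 2^width - 1]
   and set the sticky Q bit if clamping changed the value.  */
uint32_t model_usat(int32_t x, unsigned width)
{
    assert(width <= 31);                 /* Same range the expander asserts.  */
    int64_t upper = ((int64_t)1 << width) - 1;
    if (x > upper) { model_apsr |= 1u << MODEL_Q_BIT; return (uint32_t)upper; }
    if (x < 0)     { model_apsr |= 1u << MODEL_Q_BIT; return 0u; }
    return (uint32_t)x;
}

/* The flow from the commit message: reset Q, saturate, then test Q.  */
int model_demo_flow(void)
{
    model_set_saturation_occurred(0);    /* Reset the Q bit.  */
    int32_t a = model_ssat(100, 8);      /* Fits in 8 bits: Q stays clear.  */
    int32_t b = model_ssat(1000, 8);     /* Clamps to 127 and sets Q.  */
    (void)a;
    (void)b;
    return model_saturation_occurred();  /* Nonzero: overflow happened.  */
}
```

Because the Q bit is sticky, the first in-range __ssat in model_demo_flow does
not clear anything and the second one leaves Q set until the next explicit
reset. That ordering dependency is what the patch models with the apsrq
register.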
Diffstat (limited to 'gcc/config/arm/arm.md')
-rw-r--r-- gcc/config/arm/arm.md 152
1 files changed, 145 insertions, 7 deletions
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 4f035cbfddd..992d7b60bbc 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -39,6 +39,7 @@
(LAST_ARM_REGNUM 15) ;
(CC_REGNUM 100) ; Condition code pseudo register
(VFPCC_REGNUM 101) ; VFP Condition code pseudo register
+ (APSRQ_REGNUM 104) ; Q bit pseudo register
]
)
;; 3rd operand to select_dominance_cc_mode
@@ -423,6 +424,20 @@
(include "marvell-pj4.md")
(include "xgene1.md")
+;; define_subst and associated attributes
+
+(define_subst "add_setq"
+ [(set (match_operand:SI 0 "" "")
+ (match_operand:SI 1 "" ""))]
+ ""
+ [(set (match_dup 0)
+ (match_dup 1))
+ (set (reg:CC APSRQ_REGNUM)
+ (unspec:CC [(reg:CC APSRQ_REGNUM)] UNSPEC_Q_SET))])
+
+(define_subst_attr "add_clobber_q_name" "add_setq" "" "_setq")
+(define_subst_attr "add_clobber_q_pred" "add_setq" "!ARM_Q_BIT_READ"
+ "ARM_Q_BIT_READ")
;;---------------------------------------------------------------------------
;; Insn patterns
@@ -2515,14 +2530,36 @@
(set_attr "predicable" "yes")]
)
-(define_insn "maddhisi4"
+(define_expand "maddhisi4"
+ [(set (match_operand:SI 0 "s_register_operand")
+ (plus:SI (mult:SI (sign_extend:SI
+ (match_operand:HI 1 "s_register_operand"))
+ (sign_extend:SI
+ (match_operand:HI 2 "s_register_operand")))
+ (match_operand:SI 3 "s_register_operand")))]
+ "TARGET_DSP_MULTIPLY"
+ {
+ /* If this function reads the Q bit from ACLE intrinsics break up the
+ multiplication and accumulation as an overflow during accumulation will
+ clobber the Q flag. */
+ if (ARM_Q_BIT_READ)
+ {
+ rtx tmp = gen_reg_rtx (SImode);
+ emit_insn (gen_mulhisi3 (tmp, operands[1], operands[2]));
+ emit_insn (gen_addsi3 (operands[0], tmp, operands[3]));
+ DONE;
+ }
+ }
+)
+
+(define_insn "*arm_maddhisi4"
[(set (match_operand:SI 0 "s_register_operand" "=r")
(plus:SI (mult:SI (sign_extend:SI
(match_operand:HI 1 "s_register_operand" "r"))
(sign_extend:SI
(match_operand:HI 2 "s_register_operand" "r")))
(match_operand:SI 3 "s_register_operand" "r")))]
- "TARGET_DSP_MULTIPLY"
+ "TARGET_DSP_MULTIPLY && !ARM_Q_BIT_READ"
"smlabb%?\\t%0, %1, %2, %3"
[(set_attr "type" "smlaxy")
(set_attr "predicable" "yes")]
@@ -2537,7 +2574,7 @@
(sign_extend:SI
(match_operand:HI 2 "s_register_operand" "r")))
(match_operand:SI 3 "s_register_operand" "r")))]
- "TARGET_DSP_MULTIPLY"
+ "TARGET_DSP_MULTIPLY && !ARM_Q_BIT_READ"
"smlatb%?\\t%0, %1, %2, %3"
[(set_attr "type" "smlaxy")
(set_attr "predicable" "yes")]
@@ -2552,7 +2589,7 @@
(match_operand:SI 2 "s_register_operand" "r")
(const_int 16)))
(match_operand:SI 3 "s_register_operand" "r")))]
- "TARGET_DSP_MULTIPLY"
+ "TARGET_DSP_MULTIPLY && !ARM_Q_BIT_READ"
"smlatt%?\\t%0, %1, %2, %3"
[(set_attr "type" "smlaxy")
(set_attr "predicable" "yes")]
@@ -4044,12 +4081,113 @@
(define_code_attr SATlo [(smin "1") (smax "2")])
(define_code_attr SAThi [(smin "2") (smax "1")])
-(define_insn "*satsi_<SAT:code>"
+(define_expand "arm_ssat"
+ [(match_operand:SI 0 "s_register_operand")
+ (match_operand:SI 1 "s_register_operand")
+ (match_operand:SI 2 "const_int_operand")]
+ "TARGET_32BIT && arm_arch6"
+ {
+ HOST_WIDE_INT val = INTVAL (operands[2]);
+ /* The builtin checking code should have ensured the right
+ range for the immediate. */
+ gcc_assert (IN_RANGE (val, 1, 32));
+ HOST_WIDE_INT upper_bound = (HOST_WIDE_INT_1 << (val - 1)) - 1;
+ HOST_WIDE_INT lower_bound = -upper_bound - 1;
+ rtx up_rtx = gen_int_mode (upper_bound, SImode);
+ rtx lo_rtx = gen_int_mode (lower_bound, SImode);
+ if (ARM_Q_BIT_READ)
+ emit_insn (gen_satsi_smin_setq (operands[0], lo_rtx,
+ up_rtx, operands[1]));
+ else
+ emit_insn (gen_satsi_smin (operands[0], lo_rtx, up_rtx, operands[1]));
+ DONE;
+ }
+)
+
+(define_expand "arm_usat"
+ [(match_operand:SI 0 "s_register_operand")
+ (match_operand:SI 1 "s_register_operand")
+ (match_operand:SI 2 "const_int_operand")]
+ "TARGET_32BIT && arm_arch6"
+ {
+ HOST_WIDE_INT val = INTVAL (operands[2]);
+ /* The builtin checking code should have ensured the right
+ range for the immediate. */
+ gcc_assert (IN_RANGE (val, 0, 31));
+ HOST_WIDE_INT upper_bound = (HOST_WIDE_INT_1 << val) - 1;
+ rtx up_rtx = gen_int_mode (upper_bound, SImode);
+ rtx lo_rtx = CONST0_RTX (SImode);
+ if (ARM_Q_BIT_READ)
+ emit_insn (gen_satsi_smin_setq (operands[0], lo_rtx, up_rtx,
+ operands[1]));
+ else
+ emit_insn (gen_satsi_smin (operands[0], lo_rtx, up_rtx, operands[1]));
+ DONE;
+ }
+)
+
+(define_insn "arm_get_apsr"
+ [(set (match_operand:SI 0 "s_register_operand" "=r")
+ (unspec:SI [(reg:CC APSRQ_REGNUM)] UNSPEC_APSR_READ))]
+ "TARGET_ARM_QBIT"
+ "mrs%?\t%0, APSR"
+ [(set_attr "predicable" "yes")
+ (set_attr "conds" "use")]
+)
+
+(define_insn "arm_set_apsr"
+ [(set (reg:CC APSRQ_REGNUM)
+ (unspec_volatile:CC
+ [(match_operand:SI 0 "s_register_operand" "r")] VUNSPEC_APSR_WRITE))]
+ "TARGET_ARM_QBIT"
+ "msr%?\tAPSR_nzcvq, %0"
+ [(set_attr "predicable" "yes")
+ (set_attr "conds" "set")]
+)
+
+;; Read the APSR and extract the Q bit (bit 27)
+(define_expand "arm_saturation_occurred"
+ [(match_operand:SI 0 "s_register_operand")]
+ "TARGET_ARM_QBIT"
+ {
+ rtx apsr = gen_reg_rtx (SImode);
+ emit_insn (gen_arm_get_apsr (apsr));
+ emit_insn (gen_extzv (operands[0], apsr, CONST1_RTX (SImode),
+ gen_int_mode (27, SImode)));
+ DONE;
+ }
+)
+
+;; Read the APSR and set the Q bit (bit position 27) according to operand 0
+(define_expand "arm_set_saturation"
+ [(match_operand:SI 0 "reg_or_int_operand")]
+ "TARGET_ARM_QBIT"
+ {
+ rtx apsr = gen_reg_rtx (SImode);
+ emit_insn (gen_arm_get_apsr (apsr));
+ rtx to_insert = gen_reg_rtx (SImode);
+ if (CONST_INT_P (operands[0]))
+ emit_move_insn (to_insert, operands[0] == CONST0_RTX (SImode)
+ ? CONST0_RTX (SImode) : CONST1_RTX (SImode));
+ else
+ {
+ rtx cmp = gen_rtx_NE (SImode, operands[0], CONST0_RTX (SImode));
+ emit_insn (gen_cstoresi4 (to_insert, cmp, operands[0],
+ CONST0_RTX (SImode)));
+ }
+ emit_insn (gen_insv (apsr, CONST1_RTX (SImode),
+ gen_int_mode (27, SImode), to_insert));
+ emit_insn (gen_arm_set_apsr (apsr));
+ DONE;
+ }
+)
+
+(define_insn "satsi_<SAT:code><add_clobber_q_name>"
[(set (match_operand:SI 0 "s_register_operand" "=r")
(SAT:SI (<SATrev>:SI (match_operand:SI 3 "s_register_operand" "r")
(match_operand:SI 1 "const_int_operand" "i"))
(match_operand:SI 2 "const_int_operand" "i")))]
- "TARGET_32BIT && arm_arch6
+ "TARGET_32BIT && arm_arch6 && <add_clobber_q_pred>
&& arm_sat_operator_match (operands[<SAT:SATlo>], operands[<SAT:SAThi>], NULL, NULL)"
{
int mask;
@@ -4075,7 +4213,7 @@
(match_operand:SI 5 "const_int_operand" "i")])
(match_operand:SI 1 "const_int_operand" "i"))
(match_operand:SI 2 "const_int_operand" "i")))]
- "TARGET_32BIT && arm_arch6
+ "TARGET_32BIT && arm_arch6 && !ARM_Q_BIT_READ
&& arm_sat_operator_match (operands[<SAT:SATlo>], operands[<SAT:SAThi>], NULL, NULL)"
{
int mask;