diff options
author | jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> | 2013-12-10 11:46:01 +0000 |
---|---|---|
committer | jakub <jakub@138bc75d-0d04-0410-961f-82ee72b054a4> | 2013-12-10 11:46:01 +0000 |
commit | c71d3c24670970ab5084c5eec16c44473bfa2d28 (patch) | |
tree | 638f17e2a1b480505958dcbd6cca009ca6ade040 /gcc/tree-if-conv.c | |
parent | 4a5cda1358eb806a30d3750dcfee3e9253b5998d (diff) | |
download | gcc-c71d3c24670970ab5084c5eec16c44473bfa2d28.tar.gz |
* tree-vectorizer.h (struct _loop_vec_info): Add scalar_loop field.
(LOOP_VINFO_SCALAR_LOOP): Define.
(slpeel_tree_duplicate_loop_to_edge_cfg): Add scalar_loop argument.
* config/i386/sse.md (maskload<mode>, maskstore<mode>): New expanders.
* tree-data-ref.c (get_references_in_stmt): Handle MASK_LOAD and
MASK_STORE.
* internal-fn.def (LOOP_VECTORIZED, MASK_LOAD, MASK_STORE): New
internal fns.
* tree-if-conv.c: Include expr.h, optabs.h, tree-ssa-loop-ivopts.h and
tree-ssa-address.h.
(release_bb_predicate): New function.
(free_bb_predicate): Use it.
(reset_bb_predicate): Likewise. Don't unallocate bb->aux
just to immediately allocate it again.
(add_to_predicate_list): Add loop argument. If basic blocks that
dominate loop->latch don't insert any predicate.
(add_to_dst_predicate_list): Adjust caller.
(if_convertible_phi_p): Add any_mask_load_store argument, if true,
handle it like flag_tree_loop_if_convert_stores.
(insert_gimplified_predicates): Likewise.
(ifcvt_can_use_mask_load_store): New function.
(if_convertible_gimple_assign_stmt_p): Add any_mask_load_store
argument, check if some conditional loads or stores can't be
converted into MASK_LOAD or MASK_STORE.
(if_convertible_stmt_p): Add any_mask_load_store argument,
pass it down to if_convertible_gimple_assign_stmt_p.
(predicate_bbs): Don't return bool, only check if the last stmt
of a basic block is GIMPLE_COND and handle that. Adjust
add_to_predicate_list caller.
(if_convertible_loop_p_1): Only call predicate_bbs if
flag_tree_loop_if_convert_stores and free_bb_predicate in that case
afterwards, check gimple_code of stmts here. Replace is_predicated
check with dominance check. Add any_mask_load_store argument,
pass it down to if_convertible_stmt_p and if_convertible_phi_p,
call if_convertible_phi_p only after all if_convertible_stmt_p
calls.
(if_convertible_loop_p): Add any_mask_load_store argument,
pass it down to if_convertible_loop_p_1.
(predicate_mem_writes): Emit MASK_LOAD and/or MASK_STORE calls.
(combine_blocks): Add any_mask_load_store argument, pass
it down to insert_gimplified_predicates and call predicate_mem_writes
if it is set. Call predicate_bbs.
(version_loop_for_if_conversion): New function.
(tree_if_conversion): Adjust if_convertible_loop_p and combine_blocks
calls. Return todo flags instead of bool, call
version_loop_for_if_conversion if if-conversion should be just
for the vectorized loops and nothing else.
(main_tree_if_conversion): Adjust caller. Don't call
tree_if_conversion for dont_vectorize loops if if-conversion
isn't explicitly enabled.
* tree-vect-data-refs.c (vect_check_gather): Handle
MASK_LOAD/MASK_STORE.
(vect_analyze_data_refs, vect_supportable_dr_alignment): Likewise.
* gimple.h (gimple_expr_type): Handle MASK_STORE.
* internal-fn.c (expand_LOOP_VECTORIZED, expand_MASK_LOAD,
expand_MASK_STORE): New functions.
* tree-vectorizer.c: Include tree-cfg.h and gimple-fold.h.
(vect_loop_vectorized_call, fold_loop_vectorized_call): New functions.
(vectorize_loops): Don't try to vectorize loops with
loop->dont_vectorize set. Set LOOP_VINFO_SCALAR_LOOP for if-converted
loops, fold LOOP_VECTORIZED internal call depending on if loop
has been vectorized or not.
* tree-vect-loop-manip.c (slpeel_duplicate_current_defs_from_edges):
New function.
(slpeel_tree_duplicate_loop_to_edge_cfg): Add scalar_loop argument.
If non-NULL, copy basic blocks from scalar_loop instead of loop, but
still to loop's entry or exit edge.
(slpeel_tree_peel_loop_to_edge): Add scalar_loop argument, pass it
down to slpeel_tree_duplicate_loop_to_edge_cfg.
(vect_do_peeling_for_loop_bound, vect_do_peeling_for_loop_alignment):
Adjust callers.
(vect_loop_versioning): If LOOP_VINFO_SCALAR_LOOP, perform loop
versioning from that loop instead of LOOP_VINFO_LOOP, move it to the
right place in the CFG afterwards.
* tree-vect-loop.c (vect_determine_vectorization_factor): Handle
MASK_STORE.
* cfgloop.h (struct loop): Add dont_vectorize field.
* tree-loop-distribution.c (copy_loop_before): Adjust
slpeel_tree_duplicate_loop_to_edge_cfg caller.
* optabs.def (maskload_optab, maskstore_optab): New optabs.
* passes.def: Add a note that pass_vectorize must immediately follow
pass_if_conversion.
* tree-predcom.c (split_data_refs_to_components): Give up if
DR_STMT is a call.
* tree-vect-stmts.c (vect_mark_relevant): Don't crash if lhs
is NULL.
(exist_non_indexing_operands_for_use_p): Handle MASK_LOAD
and MASK_STORE.
(vectorizable_mask_load_store): New function.
(vectorizable_call): Call it for MASK_LOAD or MASK_STORE.
(vect_transform_stmt): Handle MASK_STORE.
* tree-ssa-phiopt.c (cond_if_else_store_replacement): Ignore
DR_STMT where lhs is NULL.
* optabs.h (can_vec_perm_p): Fix up comment typo.
(can_vec_mask_load_store_p): New prototype.
* optabs.c (can_vec_mask_load_store_p): New function.
* gcc.dg/vect/vect-cond-11.c: New test.
* gcc.target/i386/vect-cond-1.c: New test.
* gcc.target/i386/avx2-gather-5.c: New test.
* gcc.target/i386/avx2-gather-6.c: New test.
* gcc.dg/vect/vect-mask-loadstore-1.c: New test.
* gcc.dg/vect/vect-mask-load-1.c: New test.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@205856 138bc75d-0d04-0410-961f-82ee72b054a4
Diffstat (limited to 'gcc/tree-if-conv.c')
-rw-r--r-- | gcc/tree-if-conv.c | 416 |
1 files changed, 303 insertions, 113 deletions
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c index 71a25f1f1c0..59404ec14e2 100644 --- a/gcc/tree-if-conv.c +++ b/gcc/tree-if-conv.c @@ -110,8 +110,12 @@ along with GCC; see the file COPYING3. If not see #include "tree-chrec.h" #include "tree-data-ref.h" #include "tree-scalar-evolution.h" +#include "tree-ssa-loop-ivopts.h" +#include "tree-ssa-address.h" #include "tree-pass.h" #include "dbgcnt.h" +#include "expr.h" +#include "optabs.h" /* List of basic blocks in if-conversion-suitable order. */ static basic_block *ifc_bbs; @@ -194,39 +198,48 @@ init_bb_predicate (basic_block bb) set_bb_predicate (bb, boolean_true_node); } -/* Free the predicate of basic block BB. */ +/* Release the SSA_NAMEs associated with the predicate of basic block BB, + but don't actually free it. */ static inline void -free_bb_predicate (basic_block bb) +release_bb_predicate (basic_block bb) { - gimple_seq stmts; - - if (!bb_has_predicate (bb)) - return; - - /* Release the SSA_NAMEs created for the gimplification of the - predicate. */ - stmts = bb_predicate_gimplified_stmts (bb); + gimple_seq stmts = bb_predicate_gimplified_stmts (bb); if (stmts) { gimple_stmt_iterator i; for (i = gsi_start (stmts); !gsi_end_p (i); gsi_next (&i)) free_stmt_operands (cfun, gsi_stmt (i)); + set_bb_predicate_gimplified_stmts (bb, NULL); } +} + +/* Free the predicate of basic block BB. */ +static inline void +free_bb_predicate (basic_block bb) +{ + if (!bb_has_predicate (bb)) + return; + + release_bb_predicate (bb); free (bb->aux); bb->aux = NULL; } -/* Free the predicate of BB and reinitialize it with the true - predicate. */ +/* Reinitialize predicate of BB with the true predicate. */ static inline void reset_bb_predicate (basic_block bb) { - free_bb_predicate (bb); - init_bb_predicate (bb); + if (!bb_has_predicate (bb)) + init_bb_predicate (bb); + else + { + release_bb_predicate (bb); + set_bb_predicate (bb, boolean_true_node); + } } /* Returns a new SSA_NAME of type TYPE that is assigned the value of @@ -382,10 +395,11 @@ fold_build_cond_expr (tree type, tree cond, tree rhs, tree lhs) return build3 (COND_EXPR, type, cond, rhs, lhs); } -/* Add condition NC to the predicate list of basic block BB. */ +/* Add condition NC to the predicate list of basic block BB. LOOP is + the loop to be if-converted. */ static inline void -add_to_predicate_list (basic_block bb, tree nc) +add_to_predicate_list (struct loop *loop, basic_block bb, tree nc) { tree bc, *tp; @@ -393,7 +407,14 @@ add_to_predicate_list (basic_block bb, tree nc) return; if (!is_predicated (bb)) - bc = nc; + { + /* If dominance tells us this basic block is always executed, don't + record any predicates for it. */ + if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) + return; + + bc = nc; + } else { bc = bb_predicate (bb); @@ -434,7 +455,7 @@ add_to_dst_predicate_list (struct loop *loop, edge e, cond = fold_build2 (TRUTH_AND_EXPR, boolean_type_node, prev_cond, cond); - add_to_predicate_list (e->dest, cond); + add_to_predicate_list (loop, e->dest, cond); } /* Return true if one of the successor edges of BB exits LOOP. */ @@ -464,7 +485,8 @@ bb_with_exit_edge_p (struct loop *loop, basic_block bb) - there is a virtual PHI in a BB other than the loop->header. */ static bool -if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi) +if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi, + bool any_mask_load_store) { if (dump_file && (dump_flags & TDF_DETAILS)) { @@ -479,7 +501,7 @@ if_convertible_phi_p (struct loop *loop, basic_block bb, gimple phi) return false; } - if (flag_tree_loop_if_convert_stores) + if (flag_tree_loop_if_convert_stores || any_mask_load_store) return true; /* When the flag_tree_loop_if_convert_stores is not set, check @@ -695,6 +717,56 @@ ifcvt_could_trap_p (gimple stmt, vec<data_reference_p> refs) return gimple_could_trap_p (stmt); } +/* Return true if STMT could be converted into a masked load or store + (conditional load or store based on a mask computed from bb predicate). */ + +static bool +ifcvt_can_use_mask_load_store (gimple stmt) +{ + tree lhs, ref; + enum machine_mode mode; + basic_block bb = gimple_bb (stmt); + bool is_load; + + if (!(flag_tree_loop_vectorize || bb->loop_father->force_vect) + || bb->loop_father->dont_vectorize + || !gimple_assign_single_p (stmt) + || gimple_has_volatile_ops (stmt)) + return false; + + /* Check whether this is a load or store. */ + lhs = gimple_assign_lhs (stmt); + if (gimple_store_p (stmt)) + { + if (!is_gimple_val (gimple_assign_rhs1 (stmt))) + return false; + is_load = false; + ref = lhs; + } + else if (gimple_assign_load_p (stmt)) + { + is_load = true; + ref = gimple_assign_rhs1 (stmt); + } + else + return false; + + if (may_be_nonaddressable_p (ref)) + return false; + + /* Mask should be integer mode of the same size as the load/store + mode. */ + mode = TYPE_MODE (TREE_TYPE (lhs)); + if (int_mode_for_mode (mode) == BLKmode + || VECTOR_MODE_P (mode)) + return false; + + if (can_vec_mask_load_store_p (mode, is_load)) + return true; + + return false; +} + /* Return true when STMT is if-convertible. GIMPLE_ASSIGN statement is not if-convertible if, @@ -704,7 +776,8 @@ ifcvt_could_trap_p (gimple stmt, vec<data_reference_p> refs) static bool if_convertible_gimple_assign_stmt_p (gimple stmt, - vec<data_reference_p> refs) + vec<data_reference_p> refs, + bool *any_mask_load_store) { tree lhs = gimple_assign_lhs (stmt); basic_block bb; @@ -730,10 +803,21 @@ if_convertible_gimple_assign_stmt_p (gimple stmt, return false; } + /* tree-into-ssa.c uses GF_PLF_1, so avoid it, because + in between if_convertible_loop_p and combine_blocks + we can perform loop versioning. */ + gimple_set_plf (stmt, GF_PLF_2, false); + if (flag_tree_loop_if_convert_stores) { if (ifcvt_could_trap_p (stmt, refs)) { + if (ifcvt_can_use_mask_load_store (stmt)) + { + gimple_set_plf (stmt, GF_PLF_2, true); + *any_mask_load_store = true; + return true; + } if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "tree could trap...\n"); return false; @@ -743,6 +827,12 @@ if_convertible_gimple_assign_stmt_p (gimple stmt, if (gimple_assign_rhs_could_trap_p (stmt)) { + if (ifcvt_can_use_mask_load_store (stmt)) + { + gimple_set_plf (stmt, GF_PLF_2, true); + *any_mask_load_store = true; + return true; + } if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "tree could trap...\n"); return false; @@ -754,6 +844,12 @@ if_convertible_gimple_assign_stmt_p (gimple stmt, && bb != bb->loop_father->header && !bb_with_exit_edge_p (bb->loop_father, bb)) { + if (ifcvt_can_use_mask_load_store (stmt)) + { + gimple_set_plf (stmt, GF_PLF_2, true); + *any_mask_load_store = true; + return true; + } if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, "LHS is not var\n"); @@ -772,7 +868,8 @@ if_convertible_gimple_assign_stmt_p (gimple stmt, - it is a GIMPLE_LABEL or a GIMPLE_COND. */ static bool -if_convertible_stmt_p (gimple stmt, vec<data_reference_p> refs) +if_convertible_stmt_p (gimple stmt, vec<data_reference_p> refs, + bool *any_mask_load_store) { switch (gimple_code (stmt)) { @@ -782,7 +879,8 @@ if_convertible_stmt_p (gimple stmt, vec<data_reference_p> refs) return true; case GIMPLE_ASSIGN: - return if_convertible_gimple_assign_stmt_p (stmt, refs); + return if_convertible_gimple_assign_stmt_p (stmt, refs, + any_mask_load_store); case GIMPLE_CALL: { @@ -984,7 +1082,7 @@ get_loop_body_in_if_conv_order (const struct loop *loop) S1 will be predicated with "x", and S2 will be predicated with "!x". */ -static bool +static void predicate_bbs (loop_p loop) { unsigned int i; @@ -996,7 +1094,7 @@ predicate_bbs (loop_p loop) { basic_block bb = ifc_bbs[i]; tree cond; - gimple_stmt_iterator itr; + gimple stmt; /* The loop latch is always executed and has no extra conditions to be processed: skip it. */ @@ -1007,52 +1105,32 @@ predicate_bbs (loop_p loop) } cond = bb_predicate (bb); - - for (itr = gsi_start_bb (bb); !gsi_end_p (itr); gsi_next (&itr)) + stmt = last_stmt (bb); + if (stmt && gimple_code (stmt) == GIMPLE_COND) { - gimple stmt = gsi_stmt (itr); - - switch (gimple_code (stmt)) - { - case GIMPLE_LABEL: - case GIMPLE_ASSIGN: - case GIMPLE_CALL: - case GIMPLE_DEBUG: - break; - - case GIMPLE_COND: - { - tree c2; - edge true_edge, false_edge; - location_t loc = gimple_location (stmt); - tree c = fold_build2_loc (loc, gimple_cond_code (stmt), - boolean_type_node, - gimple_cond_lhs (stmt), - gimple_cond_rhs (stmt)); - - /* Add new condition into destination's predicate list. */ - extract_true_false_edges_from_block (gimple_bb (stmt), - &true_edge, &false_edge); - - /* If C is true, then TRUE_EDGE is taken. */ - add_to_dst_predicate_list (loop, true_edge, - unshare_expr (cond), - unshare_expr (c)); - - /* If C is false, then FALSE_EDGE is taken. */ - c2 = build1_loc (loc, TRUTH_NOT_EXPR, - boolean_type_node, unshare_expr (c)); - add_to_dst_predicate_list (loop, false_edge, - unshare_expr (cond), c2); - - cond = NULL_TREE; - break; - } - - default: - /* Not handled yet in if-conversion. */ - return false; - } + tree c2; + edge true_edge, false_edge; + location_t loc = gimple_location (stmt); + tree c = fold_build2_loc (loc, gimple_cond_code (stmt), + boolean_type_node, + gimple_cond_lhs (stmt), + gimple_cond_rhs (stmt)); + + /* Add new condition into destination's predicate list. */ + extract_true_false_edges_from_block (gimple_bb (stmt), + &true_edge, &false_edge); + + /* If C is true, then TRUE_EDGE is taken. */ + add_to_dst_predicate_list (loop, true_edge, unshare_expr (cond), + unshare_expr (c)); + + /* If C is false, then FALSE_EDGE is taken. */ + c2 = build1_loc (loc, TRUTH_NOT_EXPR, boolean_type_node, + unshare_expr (c)); + add_to_dst_predicate_list (loop, false_edge, + unshare_expr (cond), c2); + + cond = NULL_TREE; } /* If current bb has only one successor, then consider it as an @@ -1067,7 +1145,7 @@ predicate_bbs (loop_p loop) if (cond == NULL_TREE) cond = boolean_true_node; - add_to_predicate_list (bb_n, cond); + add_to_predicate_list (loop, bb_n, cond); } } @@ -1075,8 +1153,6 @@ predicate_bbs (loop_p loop) reset_bb_predicate (loop->header); gcc_assert (bb_predicate_gimplified_stmts (loop->header) == NULL && bb_predicate_gimplified_stmts (loop->latch) == NULL); - - return true; } /* Return true when LOOP is if-convertible. This is a helper function @@ -1087,7 +1163,7 @@ static bool if_convertible_loop_p_1 (struct loop *loop, vec<loop_p> *loop_nest, vec<data_reference_p> *refs, - vec<ddr_p> *ddrs) + vec<ddr_p> *ddrs, bool *any_mask_load_store) { bool res; unsigned int i; @@ -1121,9 +1197,24 @@ if_convertible_loop_p_1 (struct loop *loop, exit_bb = bb; } - res = predicate_bbs (loop); - if (!res) - return false; + for (i = 0; i < loop->num_nodes; i++) + { + basic_block bb = ifc_bbs[i]; + gimple_stmt_iterator gsi; + + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + switch (gimple_code (gsi_stmt (gsi))) + { + case GIMPLE_LABEL: + case GIMPLE_ASSIGN: + case GIMPLE_CALL: + case GIMPLE_DEBUG: + case GIMPLE_COND: + break; + default: + return false; + } + } if (flag_tree_loop_if_convert_stores) { @@ -1135,6 +1226,7 @@ if_convertible_loop_p_1 (struct loop *loop, DR_WRITTEN_AT_LEAST_ONCE (dr) = -1; DR_RW_UNCONDITIONALLY (dr) = -1; } + predicate_bbs (loop); } for (i = 0; i < loop->num_nodes; i++) @@ -1142,17 +1234,31 @@ if_convertible_loop_p_1 (struct loop *loop, basic_block bb = ifc_bbs[i]; gimple_stmt_iterator itr; - for (itr = gsi_start_phis (bb); !gsi_end_p (itr); gsi_next (&itr)) - if (!if_convertible_phi_p (loop, bb, gsi_stmt (itr))) - return false; - /* Check the if-convertibility of statements in predicated BBs. */ - if (is_predicated (bb)) + if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) for (itr = gsi_start_bb (bb); !gsi_end_p (itr); gsi_next (&itr)) - if (!if_convertible_stmt_p (gsi_stmt (itr), *refs)) + if (!if_convertible_stmt_p (gsi_stmt (itr), *refs, + any_mask_load_store)) return false; } + if (flag_tree_loop_if_convert_stores) + for (i = 0; i < loop->num_nodes; i++) + free_bb_predicate (ifc_bbs[i]); + + /* Checking PHIs needs to be done after stmts, as the fact whether there + are any masked loads or stores affects the tests. */ + for (i = 0; i < loop->num_nodes; i++) + { + basic_block bb = ifc_bbs[i]; + gimple_stmt_iterator itr; + + for (itr = gsi_start_phis (bb); !gsi_end_p (itr); gsi_next (&itr)) + if (!if_convertible_phi_p (loop, bb, gsi_stmt (itr), + *any_mask_load_store)) + return false; + } + if (dump_file) fprintf (dump_file, "Applying if-conversion\n"); @@ -1168,7 +1274,7 @@ if_convertible_loop_p_1 (struct loop *loop, - if its basic blocks and phi nodes are if convertible. */ static bool -if_convertible_loop_p (struct loop *loop) +if_convertible_loop_p (struct loop *loop, bool *any_mask_load_store) { edge e; edge_iterator ei; @@ -1209,7 +1315,8 @@ if_convertible_loop_p (struct loop *loop) refs.create (5); ddrs.create (25); stack_vec<loop_p, 3> loop_nest; - res = if_convertible_loop_p_1 (loop, &loop_nest, &refs, &ddrs); + res = if_convertible_loop_p_1 (loop, &loop_nest, &refs, &ddrs, + any_mask_load_store); if (flag_tree_loop_if_convert_stores) { @@ -1395,7 +1502,7 @@ predicate_all_scalar_phis (struct loop *loop) gimplification of the predicates. */ static void -insert_gimplified_predicates (loop_p loop) +insert_gimplified_predicates (loop_p loop, bool any_mask_load_store) { unsigned int i; @@ -1416,7 +1523,8 @@ insert_gimplified_predicates (loop_p loop) stmts = bb_predicate_gimplified_stmts (bb); if (stmts) { - if (flag_tree_loop_if_convert_stores) + if (flag_tree_loop_if_convert_stores + || any_mask_load_store) { /* Insert the predicate of the BB just after the label, as the if-conversion of memory writes will use this @@ -1575,9 +1683,49 @@ predicate_mem_writes (loop_p loop) } for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) - if ((stmt = gsi_stmt (gsi)) - && gimple_assign_single_p (stmt) - && gimple_vdef (stmt)) + if (!gimple_assign_single_p (stmt = gsi_stmt (gsi))) + continue; + else if (gimple_plf (stmt, GF_PLF_2)) + { + tree lhs = gimple_assign_lhs (stmt); + tree rhs = gimple_assign_rhs1 (stmt); + tree ref, addr, ptr, masktype, mask_op0, mask_op1, mask; + gimple new_stmt; + int bitsize = GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (lhs))); + + masktype = build_nonstandard_integer_type (bitsize, 1); + mask_op0 = build_int_cst (masktype, swap ? 0 : -1); + mask_op1 = build_int_cst (masktype, swap ? -1 : 0); + ref = TREE_CODE (lhs) == SSA_NAME ? rhs : lhs; + mark_addressable (ref); + addr = force_gimple_operand_gsi (&gsi, build_fold_addr_expr (ref), + true, NULL_TREE, true, + GSI_SAME_STMT); + cond = force_gimple_operand_gsi_1 (&gsi, unshare_expr (cond), + is_gimple_condexpr, NULL_TREE, + true, GSI_SAME_STMT); + mask = fold_build_cond_expr (masktype, unshare_expr (cond), + mask_op0, mask_op1); + mask = ifc_temp_var (masktype, mask, &gsi); + ptr = build_int_cst (reference_alias_ptr_type (ref), 0); + /* Copy points-to info if possible. */ + if (TREE_CODE (addr) == SSA_NAME && !SSA_NAME_PTR_INFO (addr)) + copy_ref_info (build2 (MEM_REF, TREE_TYPE (ref), addr, ptr), + ref); + if (TREE_CODE (lhs) == SSA_NAME) + { + new_stmt + = gimple_build_call_internal (IFN_MASK_LOAD, 3, addr, + ptr, mask); + gimple_call_set_lhs (new_stmt, lhs); + } + else + new_stmt + = gimple_build_call_internal (IFN_MASK_STORE, 4, addr, ptr, + mask, rhs); + gsi_replace (&gsi, new_stmt, false); + } + else if (gimple_vdef (stmt)) { tree lhs = gimple_assign_lhs (stmt); tree rhs = gimple_assign_rhs1 (stmt); @@ -1647,7 +1795,7 @@ remove_conditions_and_labels (loop_p loop) blocks. Replace PHI nodes with conditional modify expressions. */ static void -combine_blocks (struct loop *loop) +combine_blocks (struct loop *loop, bool any_mask_load_store) { basic_block bb, exit_bb, merge_target_bb; unsigned int orig_loop_num_nodes = loop->num_nodes; @@ -1655,11 +1803,12 @@ combine_blocks (struct loop *loop) edge e; edge_iterator ei; + predicate_bbs (loop); remove_conditions_and_labels (loop); - insert_gimplified_predicates (loop); + insert_gimplified_predicates (loop, any_mask_load_store); predicate_all_scalar_phis (loop); - if (flag_tree_loop_if_convert_stores) + if (flag_tree_loop_if_convert_stores || any_mask_load_store) predicate_mem_writes (loop); /* Merge basic blocks: first remove all the edges in the loop, @@ -1749,28 +1898,76 @@ combine_blocks (struct loop *loop) ifc_bbs = NULL; } -/* If-convert LOOP when it is legal. For the moment this pass has no - profitability analysis. Returns true when something changed. */ +/* Version LOOP before if-converting it, the original loop + will be then if-converted, the new copy of the loop will not, + and the LOOP_VECTORIZED internal call will be guarding which + loop to execute. The vectorizer pass will fold this + internal call into either true or false. */ static bool +version_loop_for_if_conversion (struct loop *loop) +{ + basic_block cond_bb; + tree cond = make_ssa_name (boolean_type_node, NULL); + struct loop *new_loop; + gimple g; + gimple_stmt_iterator gsi; + + g = gimple_build_call_internal (IFN_LOOP_VECTORIZED, 2, + build_int_cst (integer_type_node, loop->num), + integer_zero_node); + gimple_call_set_lhs (g, cond); + + initialize_original_copy_tables (); + new_loop = loop_version (loop, cond, &cond_bb, + REG_BR_PROB_BASE, REG_BR_PROB_BASE, + REG_BR_PROB_BASE, true); + free_original_copy_tables (); + if (new_loop == NULL) + return false; + new_loop->dont_vectorize = true; + new_loop->force_vect = false; + gsi = gsi_last_bb (cond_bb); + gimple_call_set_arg (g, 1, build_int_cst (integer_type_node, new_loop->num)); + gsi_insert_before (&gsi, g, GSI_SAME_STMT); + update_ssa (TODO_update_ssa); + return true; +} + +/* If-convert LOOP when it is legal. For the moment this pass has no + profitability analysis. Returns non-zero todo flags when something + changed. */ + +static unsigned int tree_if_conversion (struct loop *loop) { - bool changed = false; + unsigned int todo = 0; ifc_bbs = NULL; + bool any_mask_load_store = false; - if (!if_convertible_loop_p (loop) + if (!if_convertible_loop_p (loop, &any_mask_load_store) || !dbg_cnt (if_conversion_tree)) goto cleanup; + if (any_mask_load_store + && ((!flag_tree_loop_vectorize && !loop->force_vect) + || loop->dont_vectorize)) + goto cleanup; + + if (any_mask_load_store && !version_loop_for_if_conversion (loop)) + goto cleanup; + /* Now all statements are if-convertible. Combine all the basic blocks into one huge basic block doing the if-conversion on-the-fly. */ - combine_blocks (loop); - - if (flag_tree_loop_if_convert_stores) - mark_virtual_operands_for_renaming (cfun); + combine_blocks (loop, any_mask_load_store); - changed = true; + todo |= TODO_cleanup_cfg; + if (flag_tree_loop_if_convert_stores || any_mask_load_store) + { + mark_virtual_operands_for_renaming (cfun); + todo |= TODO_update_ssa_only_virtuals; + } cleanup: if (ifc_bbs) @@ -1784,7 +1981,7 @@ tree_if_conversion (struct loop *loop) ifc_bbs = NULL; } - return changed; + return todo; } /* Tree if-conversion pass management. */ @@ -1793,7 +1990,6 @@ static unsigned int main_tree_if_conversion (void) { struct loop *loop; - bool changed = false; unsigned todo = 0; if (number_of_loops (cfun) <= 1) @@ -1802,15 +1998,9 @@ main_tree_if_conversion (void) FOR_EACH_LOOP (loop, 0) if (flag_tree_loop_if_convert == 1 || flag_tree_loop_if_convert_stores == 1 - || flag_tree_loop_vectorize - || loop->force_vect) - changed |= tree_if_conversion (loop); - - if (changed) - todo |= TODO_cleanup_cfg; - - if (changed && flag_tree_loop_if_convert_stores) - todo |= TODO_update_ssa_only_virtuals; + || ((flag_tree_loop_vectorize || loop->force_vect) + && !loop->dont_vectorize)) + todo |= tree_if_conversion (loop); #ifdef ENABLE_CHECKING { |