author | vmakarov <vmakarov@138bc75d-0d04-0410-961f-82ee72b054a4> | 2012-10-23 15:51:41 +0000 |
---|---|---|
committer | vmakarov <vmakarov@138bc75d-0d04-0410-961f-82ee72b054a4> | 2012-10-23 15:51:41 +0000 |
commit | c6a6cdaaea571860c94f9a9fe0f98c597fef7c81 (patch) | |
tree | 915ce489d01a05653371ff4f7770258ffacab1b4 | |
parent | d9459f6b9e27edcf999b5c06b87e21f8f24fd26f (diff) | |
download | gcc-c6a6cdaaea571860c94f9a9fe0f98c597fef7c81.tar.gz |
2012-10-23 Vladimir Makarov <vmakarov@redhat.com>
* dbxout.c (dbxout_symbol_location): Pass new argument to
alter_subreg.
* dwarf2out.c: Include ira.h and lra.h.
(based_loc_descr, compute_frame_pointer_to_fb_displacement): Use
lra_eliminate_regs for LRA instead of eliminate_regs.
* expr.c (emit_move_insn_1): Pass an additional argument to
emit_move_via_integer. Use emit_move_via_integer for LRA only if
the insn is recognized.
* emit-rtl.c (gen_rtx_REG): Add lra_in_progress.
(validate_subreg): Don't check offset for LRA and floating point
modes.
* final.c (final_scan_insn, cleanup_subreg_operands): Pass new
argument to alter_subreg.
(walk_alter_subreg, output_operand): Ditto.
(alter_subreg): Add new argument.
* gcse.c (calculate_bb_reg_pressure): Add parameter to
ira_setup_eliminable_regset call.
* ira.c: Include lra.h.
(ira_init_once, ira_init, ira_finish_once): Call lra_start_once,
lra_init, lra_finish_once in any case.
(ira_setup_eliminable_regset): Add parameter. Remove need_fp.
Call lra_init_elimination and mark HARD_FRAME_POINTER_REGNUM as
living forever if frame_pointer_needed.
(setup_reg_class_relations): Set up ira_reg_class_subset.
(ira_reg_equiv_invariant_p, ira_reg_equiv_const): Remove.
(find_reg_equiv_invariant_const): Ditto.
(setup_reg_renumber): Use ira_equiv_no_lvalue_p instead of
ira_reg_equiv_invariant_p. Skip caps for LRA.
(setup_reg_equiv_init, ira_update_equiv_info_by_shuffle_insn): New
functions.
(ira_reg_equiv_len, ira_reg_equiv): New externals.
(ira_reg_equiv): New.
(ira_expand_reg_equiv, init_reg_equiv, finish_reg_equiv): New
functions.
(no_equiv, update_equiv_regs): Use ira_reg_equiv instead of
reg_equiv_init.
(setup_reg_equiv): New function.
(ira_use_lra_p): New global.
(ira): Set up lra_simple_p and ira_conflicts_p. Set up and
restore flag_caller_saves and flag_ira_region. Move
initialization of ira_obstack and ira_bitmap_obstack earlier. Call
init_reg_equiv, setup_reg_equiv, and setup_reg_equiv_init instead
of initialization of ira_reg_equiv_len, ira_reg_equiv_invariant_p,
and ira_reg_equiv_const. Call ira_setup_eliminable_regset with a
new argument. Don't flatten IRA for LRA. Don't reassign
conflict allocnos for LRA. Call finish_reg_equiv.
(do_reload): Prepare code for LRA call. Call LRA.
* ira.h (ira_use_lra_p): New external.
(struct target_ira): Add members x_ira_class_subset_p,
x_ira_reg_class_subset, and x_ira_reg_classes_intersect_p.
(ira_class_subset_p, ira_reg_class_subset): New macros.
(ira_reg_classes_intersect_p): New macro.
(struct ira_reg_equiv): New.
(ira_setup_eliminable_regset): Add an argument.
(ira_expand_reg_equiv, ira_update_equiv_info_by_shuffle_insn): New
prototypes.
* ira-color.c (color_pass, move_spill_restore, coalesce_allocnos):
Use ira_equiv_no_lvalue_p.
(coalesce_spill_slots, ira_sort_regnos_for_alter_reg): Ditto.
* ira-emit.c (ira_create_new_reg): Call ira_expand_reg_equiv.
(generate_edge_moves, change_loop): Use ira_equiv_no_lvalue_p.
(emit_move_list): Simplify code. Call
ira_update_equiv_info_by_shuffle_insn. Use ira_reg_equiv instead
of ira_reg_equiv_invariant_p and ira_reg_equiv_const. Change
assert.
* ira-int.h (struct target_ira_int): Remove x_ira_class_subset_p
and x_ira_reg_classes_intersect_p.
(ira_class_subset_p, ira_reg_classes_intersect_p): Remove.
(ira_reg_equiv_len, ira_reg_equiv_invariant_p): Ditto.
(ira_reg_equiv_const): Ditto.
(ira_equiv_no_lvalue_p): New function.
* jump.c (true_regnum): Always use hard_regno for subreg_get_info
when lra is in progress.
* haifa-sched.c (sched_init): Pass new argument to
ira_setup_eliminable_regset.
* loop-invariant.c (calculate_loop_reg_pressure): Pass new
argument to ira_setup_eliminable_regset.
* lra.h: New.
* lra-int.h: Ditto.
* lra.c: Ditto.
* lra-assigns.c: Ditto.
* lra-constraints.c: Ditto.
* lra-coalesce.c: Ditto.
* lra-eliminations.c: Ditto.
* lra-lives.c: Ditto.
* lra-spills.c: Ditto.
* Makefile.in (LRA_INT_H): New.
(OBJS): Add lra.o, lra-assigns.o, lra-coalesce.o,
lra-constraints.o, lra-eliminations.o, lra-lives.o, and
lra-spills.o.
(dwarf2out.o): Add dependence on ira.h and lra.h.
(ira.o): Add dependence on lra.h.
(lra.o, lra-assigns.o, lra-coalesce.o, lra-constraints.o): New
entries.
(lra-eliminations.o, lra-lives.o, lra-spills.o): Ditto.
* output.h (alter_subreg): Add new argument.
* rtlanal.c (simplify_subreg_regno): Permit mode changes for LRA.
Permit ARG_POINTER_REGNUM and STACK_POINTER_REGNUM for LRA.
* recog.c (general_operand, register_operand): Accept paradoxical
FLOAT_MODE subregs for LRA.
(scratch_operand): Accept pseudos for LRA.
* rtl.h (lra_in_progress): New external.
(debug_bb_n_slim, debug_bb_slim, print_value_slim): New
prototypes.
(debug_rtl_slim, debug_insn_slim): Ditto.
* sdbout.c (sdbout_symbol): Pass new argument to alter_subreg.
* sched-vis.c (print_value_slim): New.
* target.def (lra_p): New hook.
(register_priority): Ditto.
(different_addr_displacement_p): Ditto.
(spill_class): Ditto.
* target-globals.h (this_target_lra_int): New external.
(target_globals): New member lra_int.
(restore_target_globals): Restore this_target_lra_int.
* target-globals.c: Include lra-int.h.
(default_target_globals): Add &default_target_lra_int.
* targhooks.c (default_lra_p): New function.
(default_register_priority): Ditto.
(default_different_addr_displacement_p): Ditto.
* targhooks.h (default_lra_p): Declare.
(default_register_priority): Ditto.
(default_different_addr_displacement_p): Ditto.
* timevar.def (TV_LRA, TV_LRA_ELIMINATE, TV_LRA_INHERITANCE): New.
(TV_LRA_CREATE_LIVE_RANGES, TV_LRA_ASSIGN, TV_LRA_COALESCE): New.
* config/arm/arm.c (load_multiple_sequence): Pass new argument to
alter_subreg.
(store_multiple_sequence): Ditto.
* config/i386/i386.h (enum ix86_tune_indices): Add
X86_TUNE_GENERAL_REGS_SSE_SPILL.
(TARGET_GENERAL_REGS_SSE_SPILL): New macro.
* config/i386/i386.c (initial_ix86_tune_features): Set up
X86_TUNE_GENERAL_REGS_SSE_SPILL for m_COREI7 and m_CORE2I7.
(ix86_lra_p, ix86_register_priority): New functions.
(ix86_secondary_reload): Add NON_Q_REGS, SIREG, DIREG.
(inline_secondary_memory_needed): Change assert.
(ix86_spill_class): New function.
(TARGET_LRA_P, TARGET_REGISTER_PRIORITY, TARGET_SPILL_CLASS): New
macros.
* config/m68k/m68k.c (emit_move_sequence): Pass new argument to
alter_subreg.
* config/m32r/m32r.c (gen_split_move_double): Ditto.
* config/pa/pa.c (pa_emit_move_sequence): Ditto.
* config/sh/sh.md: Ditto.
* config/v850/v850.c (v850_reorg): Ditto.
* config/xtensa/xtensa.c (fixup_subreg_mem): Ditto.
* doc/md.texi: Add new interpretation of hint * for LRA.
* doc/passes.texi: Describe LRA pass.
* doc/tm.texi.in: Add TARGET_LRA_P, TARGET_REGISTER_PRIORITY,
TARGET_DIFFERENT_ADDR_DISPLACEMENT_P, and TARGET_SPILL_CLASS.
* doc/tm.texi: Update.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@192719 138bc75d-0d04-0410-961f-82ee72b054a4
50 files changed, 13722 insertions, 282 deletions
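The two x86 hooks this commit introduces, ix86_register_priority and ix86_spill_class, are small decision functions, and a stand-alone sketch may make their logic easier to follow. The code below is a simplified model, not GCC code: every toy_* name, the register numbering, and the toy_target struct are hypothetical stand-ins for the real i386 definitions. The class check is reduced to an equality test where the real hook uses hard_reg_set_subset_p, and toy_pick_register only illustrates the documented rule that a bigger priority number is more preferable when all other conditions are the same; it is not LRA's assignment algorithm.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register numbering for this sketch only; the real
   numbers come from the i386 machine description.  */
enum toy_reg
{
  TOY_AX, TOY_DX, TOY_CX, TOY_BX, TOY_SI, TOY_DI, TOY_BP, TOY_SP,
  TOY_R8, TOY_R9, TOY_R10, TOY_R11, TOY_R12, TOY_R13, TOY_R14, TOY_R15
};

/* Same ranking as ix86_register_priority in the diff: larger is better.  */
static int
toy_register_priority (int regno)
{
  /* r12 and r13 need an extra index or displacement when used as a base.  */
  if (regno == TOY_R12 || regno == TOY_R13)
    return 0;
  /* bp as a base always needs a displacement.  */
  if (regno == TOY_BP)
    return 1;
  /* The remaining REX registers cost a prefix byte.  */
  if (TOY_R8 <= regno && regno <= TOY_R15)
    return 2;
  /* ax often has shorter encodings.  */
  if (regno == TOY_AX)
    return 4;
  return 3;
}

/* Illustrative only: return the candidate with the highest priority,
   the earliest one winning ties, or -1 when there are no candidates.  */
static int
toy_pick_register (const int *candidates, size_t n)
{
  int best = -1;
  int best_prio = -1;
  for (size_t i = 0; i < n; i++)
    {
      int prio = toy_register_priority (candidates[i]);
      if (prio > best_prio)
        {
          best = candidates[i];
          best_prio = prio;
        }
    }
  return best;
}

/* Hypothetical stand-ins for register classes, modes and target flags.  */
enum toy_class { TOY_NO_REGS, TOY_GENERAL_REGS, TOY_SSE_REGS };
enum toy_mode { TOY_QImode, TOY_SImode, TOY_DImode };

struct toy_target
{
  bool sse;                     /* models TARGET_SSE */
  bool general_regs_sse_spill;  /* models TARGET_GENERAL_REGS_SSE_SPILL */
  bool mmx;                     /* models TARGET_MMX */
  bool is_64bit;                /* models TARGET_64BIT */
};

/* Simplified model of ix86_spill_class: offer SSE registers as a spill
   area for general-register pseudos, otherwise fall back to memory
   (signalled by TOY_NO_REGS).  */
static enum toy_class
toy_spill_class (const struct toy_target *t, enum toy_class rclass,
                 enum toy_mode mode)
{
  if (t->sse && t->general_regs_sse_spill && !t->mmx
      && rclass == TOY_GENERAL_REGS
      && (mode == TOY_SImode || (t->is_64bit && mode == TOY_DImode)))
    return TOY_SSE_REGS;
  return TOY_NO_REGS;
}
```

In this model, given the candidates r12, r9 and bx, toy_pick_register chooses bx, and a 32-bit target with SSE spilling enabled gets TOY_SSE_REGS for an SImode pseudo but TOY_NO_REGS for a DImode one.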
diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 41e004b44cb..049321a9189 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,156 @@ +2012-10-23 Vladimir Makarov <vmakarov@redhat.com> + + * dbxout.c (dbxout_symbol_location): Pass new argument to + alter_subreg. + * dwarf2out.c: Include ira.h and lra.h. + (based_loc_descr, compute_frame_pointer_to_fb_displacement): Use + lra_eliminate_regs for LRA instead of eliminate_regs. + * expr.c (emit_move_insn_1): Pass an additional argument to + emit_move_via_integer. Use emit_move_via_integer for LRA only if + the insn is recognized. + * emit-rtl.c (gen_rtx_REG): Add lra_in_progress. + (validate_subreg): Don't check offset for LRA and floating point + modes. + * final.c (final_scan_insn, cleanup_subreg_operands): Pass new + argument to alter_subreg. + (walk_alter_subreg, output_operand): Ditto. + (alter_subreg): Add new argument. + * gcse.c (calculate_bb_reg_pressure): Add parameter to + ira_setup_eliminable_regset call. + * ira.c: Include lra.h. + (ira_init_once, ira_init, ira_finish_once): Call lra_start_once, + lra_init, lra_finish_once in anyway. + (ira_setup_eliminable_regset): Add parameter. Remove need_fp. + Call lra_init_elimination and mark HARD_FRAME_POINTER_REGNUM as + living forever if frame_pointer_needed. + (setup_reg_class_relations): Set up ira_reg_class_subset. + (ira_reg_equiv_invariant_p, ira_reg_equiv_const): Remove. + (find_reg_equiv_invariant_const): Ditto. + (setup_reg_renumber): Use ira_equiv_no_lvalue_p instead of + ira_reg_equiv_invariant_p. Skip caps for LRA. + (setup_reg_equiv_init, ira_update_equiv_info_by_shuffle_insn): New + functions. + (ira_reg_equiv_len, ira_reg_equiv): New externals. + (ira_reg_equiv): New. + (ira_expand_reg_equiv, init_reg_equiv, finish_reg_equiv): New + functions. + (no_equiv, update_equiv_regs): Use ira_reg_equiv instead of + reg_equiv_init. + (setup_reg_equiv): New function. + (ira_use_lra_p): New global. + (ira): Set up lra_simple_p and ira_conflicts_p. 
Set up and + restore flag_caller_saves and flag_ira_region. Move + initialization of ira_obstack and ira_bitmap_obstack upper. Call + init_reg_equiv, setup_reg_equiv, and setup_reg_equiv_init instead + of initialization of ira_reg_equiv_len, ira_reg_equiv_invariant_p, + and ira_reg_equiv_const. Call ira_setup_eliminable_regset with a + new argument. Don't flatten IRA IRA for LRA. Don't reassign + conflict allocnos for LRA. Call finish_reg_equiv. + (do_reload): Prepare code for LRA call. Call LRA. + * ira.h (ira_use_lra_p): New external. + (struct target_ira): Add members x_ira_class_subset_p + x_ira_reg_class_subset, and x_ira_reg_classes_intersect_p. + (ira_class_subset_p, ira_reg_class_subset): New macros. + (ira_reg_classes_intersect_p): New macro. + (struct ira_reg_equiv): New. + (ira_setup_eliminable_regset): Add an argument. + (ira_expand_reg_equiv, ira_update_equiv_info_by_shuffle_insn): New + prototypes. + * ira-color.c (color_pass, move_spill_restore, coalesce_allocnos): + Use ira_equiv_no_lvalue_p. + (coalesce_spill_slots, ira_sort_regnos_for_alter_reg): Ditto. + * ira-emit.c (ira_create_new_reg): Call ira_expand_reg_equiv. + (generate_edge_moves, change_loop) Use ira_equiv_no_lvalue_p. + (emit_move_list): Simplify code. Call + ira_update_equiv_info_by_shuffle_insn. Use ira_reg_equiv instead + of ira_reg_equiv_invariant_p and ira_reg_equiv_const. Change + assert. + * ira-int.h (struct target_ira_int): Remove x_ira_class_subset_p + and x_ira_reg_classes_intersect_p. + (ira_class_subset_p, ira_reg_classes_intersect_p): Remove. + (ira_reg_equiv_len, ira_reg_equiv_invariant_p): Ditto. + (ira_reg_equiv_const): Ditto. + (ira_equiv_no_lvalue_p): New function. + * jump.c (true_regnum): Always use hard_regno for subreg_get_info + when lra is in progress. + * haifa-sched.c (sched_init): Pass new argument to + ira_setup_eliminable_regset. + * loop-invariant.c (calculate_loop_reg_pressure): Pass new + argument to ira_setup_eliminable_regset. + * lra.h: New. 
+ * lra-int.h: Ditto. + * lra.c: Ditto. + * lra-assigns.c: Ditto. + * lra-constraints.c: Ditto. + * lra-coalesce.c: Ditto. + * lra-eliminations.c: Ditto. + * lra-lives.c: Ditto. + * lra-spills.c: Ditto. + * Makefile.in (LRA_INT_H): New. + (OBJS): Add lra.o, lra-assigns.o, lra-coalesce.o, + lra-constraints.o, lra-eliminations.o, lra-lives.o, and + lra-spills.o. + (dwarf2out.o): Add dependence on ira.h and lra.h. + (ira.o): Add dependence on lra.h. + (lra.o, lra-assigns.o, lra-coalesce.o, lra-constraints.o): New + entries. + (lra-eliminations.o, lra-lives.o, lra-spills.o): Ditto. + * output.h (alter_subreg): Add new argument. + * rtlanal.c (simplify_subreg_regno): Permit mode changes for LRA. + Permit ARG_POINTER_REGNUM and STACK_POINTER_REGNUM for LRA. + * recog.c (general_operand, register_operand): Accept paradoxical + FLOAT_MODE subregs for LRA. + (scratch_operand): Accept pseudos for LRA. + * rtl.h (lra_in_progress): New external. + (debug_bb_n_slim, debug_bb_slim, print_value_slim): New + prototypes. + (debug_rtl_slim, debug_insn_slim): Ditto. + * sdbout.c (sdbout_symbol): Pass new argument to alter_subreg. + * sched-vis.c (print_value_slim): New. + * target.def (lra_p): New hook. + (register_priority): Ditto. + (different_addr_displacement_p): Ditto. + (spill_class): Ditto. + * target-globals.h (this_target_lra_int): New external. + (target_globals): New member lra_int. + (restore_target_globals): Restore this_target_lra_int. + * target-globals.c: Include lra-int.h. + (default_target_globals): Add &default_target_lra_int. + * targhooks.c (default_lra_p): New function. + (default_register_priority): Ditto. + (default_different_addr_displacement_p): Ditto. + * targhooks.h (default_lra_p): Declare. + (default_register_priority): Ditto. + (default_different_addr_displacement_p): Ditto. + * timevar.def (TV_LRA, TV_LRA_ELIMINATE, TV_LRA_INHERITANCE): New. + (TV_LRA_CREATE_LIVE_RANGES, TV_LRA_ASSIGN, TV_LRA_COALESCE): New. 
+ * config/arm/arm.c (load_multiple_sequence): Pass new argument toOB + alter_subreg. + (store_multiple_sequence): Ditto. + * config/i386/i386.h (enum ix86_tune_indices): Add + X86_TUNE_GENERAL_REGS_SSE_SPILL. + (TARGET_GENERAL_REGS_SSE_SPILL): New macro. + * config/i386/i386.c (initial_ix86_tune_features): Set up + X86_TUNE_GENERAL_REGS_SSE_SPILL for m_COREI7 and m_CORE2I7. + (ix86_lra_p, ix86_register_priority): New functions. + (ix86_secondary_reload): Add NON_Q_REGS, SIREG, DIREG. + (inline_secondary_memory_needed): Change assert. + (ix86_spill_class): New function. + (TARGET_LRA_P, TARGET_REGISTER_BANK, TARGET_SPILL_CLASS): New + macros. + * config/m68k/m68k.c (emit_move_sequence): Pass new argument to + alter_subreg. + * config/m32r/m32r.c (gen_split_move_double): Ditto. + * config/pa/pa.c (pa_emit_move_sequence): Ditto. + * config/sh/sh.md: Ditto. + * config/v850/v850.c (v850_reorg): Ditto. + * config/xtensa/xtensa.c (fixup_subreg_mem): Ditto. + * doc/md.texi: Add new interpretation of hint * for LRA. + * doc/passes.texi: Describe LRA pass. + * doc/tm.texi.in: Add TARGET_LRA_P, TARGET_REGISTER_PRIORITY, + TARGET_DIFFERENT_ADDR_DISPLACEMENT_P, and TARGET_SPILL_CLASS. + * doc/tm.texi: Update. 
+ 2012-10-23 Jan Hubicka <jh@suse.cz> * loop-unroll.c (decide_peel_simple): Simple peeling makes sense even diff --git a/gcc/Makefile.in b/gcc/Makefile.in index e18dc8f735c..c729ee6f7d8 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -940,6 +940,7 @@ TREE_DATA_REF_H = tree-data-ref.h $(OMEGA_H) graphds.h $(SCEV_H) TREE_INLINE_H = tree-inline.h vecir.h REAL_H = real.h $(MACHMODE_H) IRA_INT_H = ira.h ira-int.h $(CFGLOOP_H) alloc-pool.h +LRA_INT_H = lra.h $(BITMAP_H) $(RECOG_H) $(INSN_ATTR_H) insn-codes.h lra-int.h DBGCNT_H = dbgcnt.h dbgcnt.def EBITMAP_H = ebitmap.h sbitmap.h LTO_STREAMER_H = lto-streamer.h $(LINKER_PLUGIN_API_H) $(TARGET_H) \ @@ -1272,6 +1273,13 @@ OBJS = \ loop-unroll.o \ loop-unswitch.o \ lower-subreg.o \ + lra.o \ + lra-assigns.o \ + lra-coalesce.o \ + lra-constraints.o \ + lra-eliminations.o \ + lra-lives.o \ + lra-spills.o \ lto-cgraph.o \ lto-streamer.o \ lto-streamer-in.o \ @@ -2783,7 +2791,7 @@ dwarf2out.o : dwarf2out.c $(CONFIG_H) $(SYSTEM_H) coretypes.h dumpfile.h \ toplev.h $(DIAGNOSTIC_CORE_H) $(DWARF2OUT_H) reload.h \ $(GGC_H) $(EXCEPT_H) dwarf2asm.h $(TM_P_H) langhooks.h $(HASHTAB_H) \ gt-dwarf2out.h $(TARGET_H) $(CGRAPH_H) $(MD5_H) $(INPUT_H) $(FUNCTION_H) \ - $(GIMPLE_H) $(TREE_FLOW_H) \ + $(GIMPLE_H) ira.h lra.h $(TREE_FLOW_H) \ $(TREE_PRETTY_PRINT_H) $(COMMON_TARGET_H) $(OPTS_H) dwarf2cfi.o : dwarf2cfi.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ version.h $(RTL_H) $(EXPR_H) $(REGS_H) $(FUNCTION_H) output.h \ @@ -3217,7 +3225,43 @@ ira.o: ira.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(REGS_H) $(RTL_H) $(TM_P_H) $(TARGET_H) $(FLAGS_H) $(OBSTACK_H) \ $(BITMAP_H) hard-reg-set.h $(BASIC_BLOCK_H) $(DBGCNT_H) $(FUNCTION_H) \ $(EXPR_H) $(RECOG_H) $(PARAMS_H) $(TREE_PASS_H) output.h \ - $(EXCEPT_H) reload.h toplev.h $(DIAGNOSTIC_CORE_H) $(DF_H) $(GGC_H) $(IRA_INT_H) + $(EXCEPT_H) reload.h toplev.h $(DIAGNOSTIC_CORE_H) \ + $(DF_H) $(GGC_H) $(IRA_INT_H) lra.h +lra.o : lra.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ + 
$(RTL_H) $(REGS_H) insn-config.h insn-codes.h $(TIMEVAR_H) $(TREE_PASS_H) \ + $(DF_H) $(RECOG_H) output.h addresses.h $(REGS_H) hard-reg-set.h \ + $(FLAGS_H) $(FUNCTION_H) $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) \ + $(EXCEPT_H) ira.h $(LRA_INT_H) +lra-assigns.o : lra-assigns.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ + $(TM_H) $(RTL_H) $(REGS_H) insn-config.h $(DF_H) \ + $(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \ + $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) ira.h \ + rtl-error.h sparseset.h $(LRA_INT_H) +lra-coalesce.o : lra-coalesce.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ + $(TM_H) $(RTL_H) $(REGS_H) insn-config.h $(DF_H) \ + $(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \ + $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) ira.h \ + rtl-error.h ira.h $(LRA_INT_H) +lra-constraints.o : lra-constraints.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ + $(TM_H) $(RTL_H) $(REGS_H) insn-config.h insn-codes.h $(DF_H) \ + $(RECOG_H) output.h addresses.h $(REGS_H) hard-reg-set.h $(FLAGS_H) \ + $(FUNCTION_H) $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) \ + ira.h rtl-error.h $(LRA_INT_H) +lra-eliminations.o : lra-eliminations.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ + $(TM_H) $(RTL_H) $(REGS_H) insn-config.h $(DF_H) \ + $(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \ + $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) ira.h \ + rtl-error.h $(LRA_INT_H) +lra-lives.o : lra-lives.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ + $(RTL_H) $(REGS_H) insn-config.h $(DF_H) \ + $(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \ + $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) \ + $(LRA_INT_H) +lra-spills.o : lra-spills.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \ + $(RTL_H) $(REGS_H) insn-config.h $(DF_H) \ + $(RECOG_H) output.h $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \ + $(EXPR_H) $(BASIC_BLOCK_H) $(TM_P_H) $(EXCEPT_H) \ + ira.h $(LRA_INT_H) regmove.o : regmove.c $(CONFIG_H) 
$(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ insn-config.h $(TREE_PASS_H) $(DF_H) \ $(RECOG_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) $(FUNCTION_H) \ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index e9b94631cf9..b7bec6e0cc4 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -10328,7 +10328,7 @@ load_multiple_sequence (rtx *operands, int nops, int *regs, int *saved_order, /* Convert a subreg of a mem into the mem itself. */ if (GET_CODE (operands[nops + i]) == SUBREG) - operands[nops + i] = alter_subreg (operands + (nops + i)); + operands[nops + i] = alter_subreg (operands + (nops + i), true); gcc_assert (MEM_P (operands[nops + i])); @@ -10480,7 +10480,7 @@ store_multiple_sequence (rtx *operands, int nops, int nops_total, /* Convert a subreg of a mem into the mem itself. */ if (GET_CODE (operands[nops + i]) == SUBREG) - operands[nops + i] = alter_subreg (operands + (nops + i)); + operands[nops + i] = alter_subreg (operands + (nops + i), true); gcc_assert (MEM_P (operands[nops + i])); diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index ae48d1a91d5..c98c6b7a52f 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -2267,7 +2267,11 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = { /* X86_TUNE_REASSOC_FP_TO_PARALLEL: Try to produce parallel computations during reassociation of fp computation. */ - m_ATOM + m_ATOM, + + /* X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE + regs instead of memory. */ + m_COREI7 | m_CORE2I7 }; /* Feature tests against the various architecture variations. */ @@ -32046,6 +32050,38 @@ ix86_free_from_memory (enum machine_mode mode) } } +/* Return true if we use LRA instead of reload pass. */ +static bool +ix86_lra_p (void) +{ + return true; +} + +/* Return a register priority for hard reg REGNO. */ +static int +ix86_register_priority (int hard_regno) +{ + /* ebp and r13 as the base always wants a displacement, r12 as the + base always wants an index. 
So discourage their usage in an + address. */ + if (hard_regno == R12_REG || hard_regno == R13_REG) + return 0; + if (hard_regno == BP_REG) + return 1; + /* New x86-64 int registers result in bigger code size. Discourage + them. */ + if (FIRST_REX_INT_REG <= hard_regno && hard_regno <= LAST_REX_INT_REG) + return 2; + /* New x86-64 SSE registers result in bigger code size. Discourage + them. */ + if (FIRST_REX_SSE_REG <= hard_regno && hard_regno <= LAST_REX_SSE_REG) + return 2; + /* Usage of AX register results in smaller code. Prefer it. */ + if (hard_regno == 0) + return 4; + return 3; +} + /* Implement TARGET_PREFERRED_RELOAD_CLASS. Put float CONST_DOUBLE in the constant pool instead of fp regs. @@ -32179,6 +32215,9 @@ ix86_secondary_reload (bool in_p, rtx x, reg_class_t rclass, && !in_p && mode == QImode && (rclass == GENERAL_REGS || rclass == LEGACY_REGS + || rclass == NON_Q_REGS + || rclass == SIREG + || rclass == DIREG || rclass == INDEX_REGS)) { int regno; @@ -32288,7 +32327,7 @@ inline_secondary_memory_needed (enum reg_class class1, enum reg_class class2, || MAYBE_MMX_CLASS_P (class1) != MMX_CLASS_P (class1) || MAYBE_MMX_CLASS_P (class2) != MMX_CLASS_P (class2)) { - gcc_assert (!strict); + gcc_assert (!strict || lra_in_progress); return true; } @@ -40839,6 +40878,22 @@ ix86_autovectorize_vector_sizes (void) return (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0; } + + +/* Return class of registers which could be used for pseudo of MODE + and of class RCLASS for spilling instead of memory. Return NO_REGS + if it is not possible or non-profitable. */ +static reg_class_t +ix86_spill_class (reg_class_t rclass, enum machine_mode mode) +{ + if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX + && hard_reg_set_subset_p (reg_class_contents[rclass], + reg_class_contents[GENERAL_REGS]) + && (mode == SImode || (TARGET_64BIT && mode == DImode))) + return SSE_REGS; + return NO_REGS; +} + /* Implement targetm.vectorize.init_cost. 
*/ static void * @@ -41241,6 +41296,12 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val) #undef TARGET_LEGITIMATE_ADDRESS_P #define TARGET_LEGITIMATE_ADDRESS_P ix86_legitimate_address_p +#undef TARGET_LRA_P +#define TARGET_LRA_P ix86_lra_p + +#undef TARGET_REGISTER_PRIORITY +#define TARGET_REGISTER_PRIORITY ix86_register_priority + #undef TARGET_LEGITIMATE_CONSTANT_P #define TARGET_LEGITIMATE_CONSTANT_P ix86_legitimate_constant_p @@ -41264,6 +41325,9 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val) #define TARGET_INIT_LIBFUNCS darwin_rename_builtins #endif +#undef TARGET_SPILL_CLASS +#define TARGET_SPILL_CLASS ix86_spill_class + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-i386.h" diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 16db3ca1056..f923a973b64 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -327,6 +327,7 @@ enum ix86_tune_indices { X86_TUNE_AVX128_OPTIMAL, X86_TUNE_REASSOC_INT_TO_PARALLEL, X86_TUNE_REASSOC_FP_TO_PARALLEL, + X86_TUNE_GENERAL_REGS_SSE_SPILL, X86_TUNE_LAST }; @@ -431,6 +432,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST]; ix86_tune_features[X86_TUNE_REASSOC_INT_TO_PARALLEL] #define TARGET_REASSOC_FP_TO_PARALLEL \ ix86_tune_features[X86_TUNE_REASSOC_FP_TO_PARALLEL] +#define TARGET_GENERAL_REGS_SSE_SPILL \ + ix86_tune_features[X86_TUNE_GENERAL_REGS_SSE_SPILL] /* Feature tests against the various architecture variations. */ enum ix86_arch_indices { diff --git a/gcc/config/m32r/m32r.c b/gcc/config/m32r/m32r.c index 03360b6a5b0..18b6d8a8a3e 100644 --- a/gcc/config/m32r/m32r.c +++ b/gcc/config/m32r/m32r.c @@ -1030,9 +1030,9 @@ gen_split_move_double (rtx operands[]) subregs to make this code simpler. It is safe to call alter_subreg any time after reload. 
*/ if (GET_CODE (dest) == SUBREG) - alter_subreg (&dest); + alter_subreg (&dest, true); if (GET_CODE (src) == SUBREG) - alter_subreg (&src); + alter_subreg (&src, true); start_sequence (); if (REG_P (dest)) diff --git a/gcc/config/m68k/m68k.c b/gcc/config/m68k/m68k.c index b2e2e6c2564..e86960efc7b 100644 --- a/gcc/config/m68k/m68k.c +++ b/gcc/config/m68k/m68k.c @@ -3658,7 +3658,7 @@ emit_move_sequence (rtx *operands, enum machine_mode mode, rtx scratch_reg) rtx temp = gen_rtx_SUBREG (GET_MODE (operand0), reg_equiv_mem (REGNO (SUBREG_REG (operand0))), SUBREG_BYTE (operand0)); - operand0 = alter_subreg (&temp); + operand0 = alter_subreg (&temp, true); } if (scratch_reg @@ -3675,7 +3675,7 @@ emit_move_sequence (rtx *operands, enum machine_mode mode, rtx scratch_reg) rtx temp = gen_rtx_SUBREG (GET_MODE (operand1), reg_equiv_mem (REGNO (SUBREG_REG (operand1))), SUBREG_BYTE (operand1)); - operand1 = alter_subreg (&temp); + operand1 = alter_subreg (&temp, true); } if (scratch_reg && reload_in_progress && GET_CODE (operand0) == MEM diff --git a/gcc/config/pa/pa.c b/gcc/config/pa/pa.c index 6c8f8278d58..6476daa0fc3 100644 --- a/gcc/config/pa/pa.c +++ b/gcc/config/pa/pa.c @@ -1616,7 +1616,7 @@ pa_emit_move_sequence (rtx *operands, enum machine_mode mode, rtx scratch_reg) rtx temp = gen_rtx_SUBREG (GET_MODE (operand0), reg_equiv_mem (REGNO (SUBREG_REG (operand0))), SUBREG_BYTE (operand0)); - operand0 = alter_subreg (&temp); + operand0 = alter_subreg (&temp, true); } if (scratch_reg @@ -1633,7 +1633,7 @@ pa_emit_move_sequence (rtx *operands, enum machine_mode mode, rtx scratch_reg) rtx temp = gen_rtx_SUBREG (GET_MODE (operand1), reg_equiv_mem (REGNO (SUBREG_REG (operand1))), SUBREG_BYTE (operand1)); - operand1 = alter_subreg (&temp); + operand1 = alter_subreg (&temp, true); } if (scratch_reg && reload_in_progress && GET_CODE (operand0) == MEM diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md index 2ef4a1a4ee0..d875a63961a 100644 --- a/gcc/config/sh/sh.md +++ 
b/gcc/config/sh/sh.md @@ -7366,7 +7366,7 @@ label: rtx regop = operands[store_p], word0 ,word1; if (GET_CODE (regop) == SUBREG) - alter_subreg (&regop); + alter_subreg (&regop, true); if (REGNO (XEXP (addr, 0)) == REGNO (XEXP (addr, 1))) offset = 2; else @@ -7374,9 +7374,9 @@ label: mem = copy_rtx (mem); PUT_MODE (mem, SImode); word0 = gen_rtx_SUBREG (SImode, regop, 0); - alter_subreg (&word0); + alter_subreg (&word0, true); word1 = gen_rtx_SUBREG (SImode, regop, 4); - alter_subreg (&word1); + alter_subreg (&word1, true); if (store_p || ! refers_to_regno_p (REGNO (word0), REGNO (word0) + 1, addr, 0)) { @@ -7834,7 +7834,7 @@ label: else { x = gen_rtx_SUBREG (V2SFmode, operands[0], i * 8); - alter_subreg (&x); + alter_subreg (&x, true); } if (MEM_P (operands[1])) @@ -7843,7 +7843,7 @@ label: else { y = gen_rtx_SUBREG (V2SFmode, operands[1], i * 8); - alter_subreg (&y); + alter_subreg (&y, true); } emit_insn (gen_movv2sf_i (x, y)); diff --git a/gcc/config/v850/v850.c b/gcc/config/v850/v850.c index fc06675c6f5..5d297cf3b23 100644 --- a/gcc/config/v850/v850.c +++ b/gcc/config/v850/v850.c @@ -1301,11 +1301,11 @@ v850_reorg (void) if (GET_CODE (dest) == SUBREG && (GET_CODE (SUBREG_REG (dest)) == MEM || GET_CODE (SUBREG_REG (dest)) == REG)) - alter_subreg (&dest); + alter_subreg (&dest, true); if (GET_CODE (src) == SUBREG && (GET_CODE (SUBREG_REG (src)) == MEM || GET_CODE (SUBREG_REG (src)) == REG)) - alter_subreg (&src); + alter_subreg (&src, true); if (GET_CODE (dest) == MEM && GET_CODE (src) == MEM) mem = NULL_RTX; diff --git a/gcc/config/xtensa/xtensa.c b/gcc/config/xtensa/xtensa.c index dba8c41345e..38dc7d5b933 100644 --- a/gcc/config/xtensa/xtensa.c +++ b/gcc/config/xtensa/xtensa.c @@ -1087,7 +1087,7 @@ fixup_subreg_mem (rtx x) gen_rtx_SUBREG (GET_MODE (x), reg_equiv_mem (REGNO (SUBREG_REG (x))), SUBREG_BYTE (x)); - x = alter_subreg (&temp); + x = alter_subreg (&temp, true); } return x; } diff --git a/gcc/dbxout.c b/gcc/dbxout.c index e8e73bbd173..5492c7011ba 100644 --- 
a/gcc/dbxout.c +++ b/gcc/dbxout.c @@ -2994,7 +2994,7 @@ dbxout_symbol_location (tree decl, tree type, const char *suffix, rtx home) if (REGNO (value) >= FIRST_PSEUDO_REGISTER) return 0; } - home = alter_subreg (&home); + home = alter_subreg (&home, true); } if (REG_P (home)) { diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 32866d5c156..696ca94c40a 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -1,5 +1,5 @@ @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1996, 1998, 1999, 2000, 2001, -@c 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 +@c 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 @c Free Software Foundation, Inc. @c This is part of the GCC manual. @c For copying conditions, see the file gcc.texi. @@ -1622,7 +1622,9 @@ register preferences. @item * Says that the following character should be ignored when choosing register preferences. @samp{*} has no effect on the meaning of the -constraint as a constraint, and no effect on reloading. +constraint as a constraint, and no effect on reloading. For LRA +@samp{*} additionally disparages slightly the alternative if the +following character matches the operand. @ifset INTERNALS Here is an example: the 68000 has an instruction to sign-extend a diff --git a/gcc/doc/passes.texi b/gcc/doc/passes.texi index 8329dddd987..693ad31dd81 100644 --- a/gcc/doc/passes.texi +++ b/gcc/doc/passes.texi @@ -771,7 +771,7 @@ branch instructions. The source file for this pass is @file{gcse.c}. This pass attempts to replace conditional branches and surrounding assignments with arithmetic, boolean value producing comparison instructions, and conditional move instructions. In the very last -invocation after reload, it will generate predicated instructions +invocation after reload/LRA, it will generate predicated instructions when supported by the target. The code is located in @file{ifcvt.c}. @item Web construction @@ -842,9 +842,9 @@ source file is @file{regmove.c}. 
The integrated register allocator (@acronym{IRA}). It is called integrated because coalescing, register live range splitting, and hard register preferencing are done on-the-fly during coloring. It also -has better integration with the reload pass. Pseudo-registers spilled -by the allocator or the reload have still a chance to get -hard-registers if the reload evicts some pseudo-registers from +has better integration with the reload/LRA pass. Pseudo-registers spilled +by the allocator or the reload/LRA have still a chance to get +hard-registers if the reload/LRA evicts some pseudo-registers from hard-registers. The allocator helps to choose better pseudos for spilling based on their live ranges and to coalesce stack slots allocated for the spilled pseudo-registers. IRA is a regional @@ -875,6 +875,23 @@ instructions to save and restore call-clobbered registers around calls. Source files are @file{reload.c} and @file{reload1.c}, plus the header @file{reload.h} used for communication between them. + +@cindex Local Register Allocator (LRA) +@item +This pass is a modern replacement of the reload pass. Source files +are @file{lra.c}, @file{lra-assign.c}, @file{lra-coalesce.c}, +@file{lra-constraints.c}, @file{lra-eliminations.c}, +@file{lra-equivs.c}, @file{lra-lives.c}, @file{lra-saves.c}, +@file{lra-spills.c}, the header @file{lra-int.h} used for +communication between them, and the header @file{lra.h} used for +communication between LRA and the rest of compiler. + +Unlike the reload pass, intermediate LRA decisions are reflected in +RTL as much as possible. This reduces the number of target-dependent +macros and hooks, leaving instruction constraints as the primary +source of control. + +LRA is run on targets for which TARGET_LRA_P returns true. 
@end itemize @item Basic block reordering diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 72ea14cd28e..68713f7bb2f 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -2893,6 +2893,22 @@ as below: @end smallexample @end defmac +@deftypefn {Target Hook} bool TARGET_LRA_P (void) +A target hook which returns true if we use LRA instead of reload pass. It means that LRA was ported to the target. The default version of this target hook returns always false. +@end deftypefn + +@deftypefn {Target Hook} int TARGET_REGISTER_PRIORITY (int) +A target hook which returns the register priority number to which the register @var{hard_regno} belongs to. The bigger the number, the more preferable the hard register usage (when all other conditions are the same). This hook can be used to prefer some hard register over others in LRA. For example, some x86-64 register usage needs additional prefix which makes instructions longer. The hook can return lower priority number for such registers make them less favorable and as result making the generated code smaller. The default version of this target hook returns always zero. +@end deftypefn + +@deftypefn {Target Hook} bool TARGET_DIFFERENT_ADDR_DISPLACEMENT_P (void) +A target hook which returns true if an address with the same structure can have different maximal legitimate displacement. For example, the displacement can depend on memory mode or on operand combinations in the insn. The default version of this target hook returns always false. +@end deftypefn + +@deftypefn {Target Hook} reg_class_t TARGET_SPILL_CLASS (reg_class_t, enum @var{machine_mode}) +This hook defines a class of registers which could be used for spilling pseudos of the given mode and class, or @code{NO_REGS} if only memory should be used. Not defining this hook is equivalent to returning @code{NO_REGS} for all inputs. 
+@end deftypefn + @node Old Constraints @section Obsolete Macros for Defining Constraints @cindex defining constraints, obsolete method diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index ce31aae4b72..c325cd4ae6e 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -2869,6 +2869,14 @@ as below: @end smallexample @end defmac +@hook TARGET_LRA_P + +@hook TARGET_REGISTER_PRIORITY + +@hook TARGET_DIFFERENT_ADDR_DISPLACEMENT_P + +@hook TARGET_SPILL_CLASS + @node Old Constraints @section Obsolete Macros for Defining Constraints @cindex defining constraints, obsolete method diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c index fcdb1b11951..bc5868b6a25 100644 --- a/gcc/dwarf2out.c +++ b/gcc/dwarf2out.c @@ -90,6 +90,8 @@ along with GCC; see the file COPYING3. If not see #include "cgraph.h" #include "input.h" #include "gimple.h" +#include "ira.h" +#include "lra.h" #include "dumpfile.h" #include "opts.h" @@ -10162,7 +10164,9 @@ based_loc_descr (rtx reg, HOST_WIDE_INT offset, argument pointer and soft frame pointer rtx's. */ if (reg == arg_pointer_rtx || reg == frame_pointer_rtx) { - rtx elim = eliminate_regs (reg, VOIDmode, NULL_RTX); + rtx elim = (ira_use_lra_p + ? lra_eliminate_regs (reg, VOIDmode, NULL_RTX) + : eliminate_regs (reg, VOIDmode, NULL_RTX)); if (elim != reg) { @@ -15020,7 +15024,9 @@ compute_frame_pointer_to_fb_displacement (HOST_WIDE_INT offset) offset += ARG_POINTER_CFA_OFFSET (current_function_decl); #endif - elim = eliminate_regs (reg, VOIDmode, NULL_RTX); + elim = (ira_use_lra_p + ? 
lra_eliminate_regs (reg, VOIDmode, NULL_RTX) + : eliminate_regs (reg, VOIDmode, NULL_RTX)); if (GET_CODE (elim) == PLUS) { offset += INTVAL (XEXP (elim, 1)); diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index 7d7b1dfb0a1..cb23d5a4e3f 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -578,7 +578,7 @@ gen_rtx_REG (enum machine_mode mode, unsigned int regno) Also don't do this when we are making new REGs in reload, since we don't want to get confused with the real pointers. */ - if (mode == Pmode && !reload_in_progress) + if (mode == Pmode && !reload_in_progress && !lra_in_progress) { if (regno == FRAME_POINTER_REGNUM && (!reload_completed || frame_pointer_needed)) @@ -720,7 +720,14 @@ validate_subreg (enum machine_mode omode, enum machine_mode imode, (subreg:SI (reg:DF) 0) isn't. */ else if (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode)) { - if (isize != osize) + if (! (isize == osize + /* LRA can use subreg to store a floating point value in + an integer mode. Although the floating point and the + integer modes need the same number of hard registers, + the size of floating point mode can be less than the + integer mode. LRA also uses subregs for a register + should be used in different mode in on insn. */ + || lra_in_progress)) return false; } @@ -753,7 +760,8 @@ validate_subreg (enum machine_mode omode, enum machine_mode imode, of a subword. A subreg does *not* perform arbitrary bit extraction. Given that we've already checked mode/offset alignment, we only have to check subword subregs here. */ - if (osize < UNITS_PER_WORD) + if (osize < UNITS_PER_WORD + && ! (lra_in_progress && (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode)))) { enum machine_mode wmode = isize > UNITS_PER_WORD ? word_mode : imode; unsigned int low_off = subreg_lowpart_offset (omode, wmode); diff --git a/gcc/expr.c b/gcc/expr.c index f00ae6a9eac..448596c3396 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -3448,9 +3448,13 @@ emit_move_insn_1 (rtx x, rtx y) fits within a HOST_WIDE_INT. 
*/ if (!CONSTANT_P (y) || GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT) { - rtx ret = emit_move_via_integer (mode, x, y, false); + rtx ret = emit_move_via_integer (mode, x, y, lra_in_progress); + if (ret) - return ret; + { + if (! lra_in_progress || recog (PATTERN (ret), ret, 0) >= 0) + return ret; + } } return emit_move_multi_word (mode, x, y); diff --git a/gcc/final.c b/gcc/final.c index bffc1a9c460..ceb688e5e31 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -2560,7 +2560,7 @@ final_scan_insn (rtx insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED, { rtx src1, src2; if (GET_CODE (SET_SRC (set)) == SUBREG) - SET_SRC (set) = alter_subreg (&SET_SRC (set)); + SET_SRC (set) = alter_subreg (&SET_SRC (set), true); src1 = SET_SRC (set); src2 = NULL_RTX; @@ -2568,10 +2568,10 @@ final_scan_insn (rtx insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED, { if (GET_CODE (XEXP (SET_SRC (set), 0)) == SUBREG) XEXP (SET_SRC (set), 0) - = alter_subreg (&XEXP (SET_SRC (set), 0)); + = alter_subreg (&XEXP (SET_SRC (set), 0), true); if (GET_CODE (XEXP (SET_SRC (set), 1)) == SUBREG) XEXP (SET_SRC (set), 1) - = alter_subreg (&XEXP (SET_SRC (set), 1)); + = alter_subreg (&XEXP (SET_SRC (set), 1), true); if (XEXP (SET_SRC (set), 1) == CONST0_RTX (GET_MODE (XEXP (SET_SRC (set), 0)))) src2 = XEXP (SET_SRC (set), 0); @@ -2974,7 +2974,7 @@ cleanup_subreg_operands (rtx insn) expression directly. 
*/ if (GET_CODE (*recog_data.operand_loc[i]) == SUBREG) { - recog_data.operand[i] = alter_subreg (recog_data.operand_loc[i]); + recog_data.operand[i] = alter_subreg (recog_data.operand_loc[i], true); changed = true; } else if (GET_CODE (recog_data.operand[i]) == PLUS @@ -2987,7 +2987,7 @@ cleanup_subreg_operands (rtx insn) { if (GET_CODE (*recog_data.dup_loc[i]) == SUBREG) { - *recog_data.dup_loc[i] = alter_subreg (recog_data.dup_loc[i]); + *recog_data.dup_loc[i] = alter_subreg (recog_data.dup_loc[i], true); changed = true; } else if (GET_CODE (*recog_data.dup_loc[i]) == PLUS @@ -2999,11 +2999,11 @@ cleanup_subreg_operands (rtx insn) df_insn_rescan (insn); } -/* If X is a SUBREG, replace it with a REG or a MEM, - based on the thing it is a subreg of. */ +/* If X is a SUBREG, try to replace it with a REG or a MEM, based on + the thing it is a subreg of. Do it anyway if FINAL_P. */ rtx -alter_subreg (rtx *xp) +alter_subreg (rtx *xp, bool final_p) { rtx x = *xp; rtx y = SUBREG_REG (x); @@ -3027,16 +3027,19 @@ alter_subreg (rtx *xp) offset += difference % UNITS_PER_WORD; } - *xp = adjust_address (y, GET_MODE (x), offset); + if (final_p) + *xp = adjust_address (y, GET_MODE (x), offset); + else + *xp = adjust_address_nv (y, GET_MODE (x), offset); } else { rtx new_rtx = simplify_subreg (GET_MODE (x), y, GET_MODE (y), - SUBREG_BYTE (x)); + SUBREG_BYTE (x)); if (new_rtx != 0) *xp = new_rtx; - else if (REG_P (y)) + else if (final_p && REG_P (y)) { /* Simplify_subreg can't handle some REG cases, but we have to. */ unsigned int regno; @@ -3076,7 +3079,7 @@ walk_alter_subreg (rtx *xp, bool *changed) case SUBREG: *changed = true; - return alter_subreg (xp); + return alter_subreg (xp, true); default: break; @@ -3682,7 +3685,7 @@ void output_operand (rtx x, int code ATTRIBUTE_UNUSED) { if (x && GET_CODE (x) == SUBREG) - x = alter_subreg (&x); + x = alter_subreg (&x, true); /* X must not be a pseudo reg. 
*/ gcc_assert (!x || !REG_P (x) || REGNO (x) < FIRST_PSEUDO_REGISTER); diff --git a/gcc/gcse.c b/gcc/gcse.c index 99e7685cdba..90b551bbc4d 100644 --- a/gcc/gcse.c +++ b/gcc/gcse.c @@ -3377,7 +3377,7 @@ calculate_bb_reg_pressure (void) bitmap_iterator bi; - ira_setup_eliminable_regset (); + ira_setup_eliminable_regset (false); curr_regs_live = BITMAP_ALLOC (&reg_obstack); FOR_EACH_BB (bb) { diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c index 838d3a5161d..fa80f246b42 100644 --- a/gcc/haifa-sched.c +++ b/gcc/haifa-sched.c @@ -6548,7 +6548,7 @@ sched_init (void) sched_pressure = SCHED_PRESSURE_NONE; if (sched_pressure != SCHED_PRESSURE_NONE) - ira_setup_eliminable_regset (); + ira_setup_eliminable_regset (false); /* Initialize SPEC_INFO. */ if (targetm.sched.set_sched_flags) diff --git a/gcc/ira-color.c b/gcc/ira-color.c index bcf03216af8..dd4c73b9482 100644 --- a/gcc/ira-color.c +++ b/gcc/ira-color.c @@ -2834,8 +2834,7 @@ color_pass (ira_loop_tree_node_t loop_tree_node) exit_freq = ira_loop_edge_freq (subloop_node, regno, true); enter_freq = ira_loop_edge_freq (subloop_node, regno, false); ira_assert (regno < ira_reg_equiv_len); - if (ira_reg_equiv_invariant_p[regno] - || ira_reg_equiv_const[regno] != NULL_RTX) + if (ira_equiv_no_lvalue_p (regno)) { if (! ALLOCNO_ASSIGNED_P (subloop_allocno)) { @@ -2940,9 +2939,7 @@ move_spill_restore (void) copies and the reload pass can spill the allocno set by copy although the allocno will not get memory slot. */ - || (regno < ira_reg_equiv_len - && (ira_reg_equiv_invariant_p[regno] - || ira_reg_equiv_const[regno] != NULL_RTX)) + || ira_equiv_no_lvalue_p (regno) || !bitmap_bit_p (loop_node->border_allocnos, ALLOCNO_NUM (a))) continue; mode = ALLOCNO_MODE (a); @@ -3366,9 +3363,7 @@ coalesce_allocnos (void) a = ira_allocnos[j]; regno = ALLOCNO_REGNO (a); if (! 
ALLOCNO_ASSIGNED_P (a) || ALLOCNO_HARD_REGNO (a) >= 0 - || (regno < ira_reg_equiv_len - && (ira_reg_equiv_const[regno] != NULL_RTX - || ira_reg_equiv_invariant_p[regno]))) + || ira_equiv_no_lvalue_p (regno)) continue; for (cp = ALLOCNO_COPIES (a); cp != NULL; cp = next_cp) { @@ -3383,9 +3378,7 @@ coalesce_allocnos (void) if ((cp->insn != NULL || cp->constraint_p) && ALLOCNO_ASSIGNED_P (cp->second) && ALLOCNO_HARD_REGNO (cp->second) < 0 - && (regno >= ira_reg_equiv_len - || (! ira_reg_equiv_invariant_p[regno] - && ira_reg_equiv_const[regno] == NULL_RTX))) + && ! ira_equiv_no_lvalue_p (regno)) sorted_copies[cp_num++] = cp; } else if (cp->second == a) @@ -3651,9 +3644,7 @@ coalesce_spill_slots (ira_allocno_t *spilled_coalesced_allocnos, int num) allocno = spilled_coalesced_allocnos[i]; if (ALLOCNO_COALESCE_DATA (allocno)->first != allocno || bitmap_bit_p (set_jump_crosses, ALLOCNO_REGNO (allocno)) - || (ALLOCNO_REGNO (allocno) < ira_reg_equiv_len - && (ira_reg_equiv_const[ALLOCNO_REGNO (allocno)] != NULL_RTX - || ira_reg_equiv_invariant_p[ALLOCNO_REGNO (allocno)]))) + || ira_equiv_no_lvalue_p (ALLOCNO_REGNO (allocno))) continue; for (j = 0; j < i; j++) { @@ -3661,9 +3652,7 @@ coalesce_spill_slots (ira_allocno_t *spilled_coalesced_allocnos, int num) n = ALLOCNO_COALESCE_DATA (a)->temp; if (ALLOCNO_COALESCE_DATA (a)->first == a && ! bitmap_bit_p (set_jump_crosses, ALLOCNO_REGNO (a)) - && (ALLOCNO_REGNO (a) >= ira_reg_equiv_len - || (! ira_reg_equiv_invariant_p[ALLOCNO_REGNO (a)] - && ira_reg_equiv_const[ALLOCNO_REGNO (a)] == NULL_RTX)) + && ! ira_equiv_no_lvalue_p (ALLOCNO_REGNO (a)) && ! 
slot_coalesced_allocno_live_ranges_intersect_p (allocno, n)) break; } @@ -3771,9 +3760,7 @@ ira_sort_regnos_for_alter_reg (int *pseudo_regnos, int n, allocno = spilled_coalesced_allocnos[i]; if (ALLOCNO_COALESCE_DATA (allocno)->first != allocno || ALLOCNO_HARD_REGNO (allocno) >= 0 - || (ALLOCNO_REGNO (allocno) < ira_reg_equiv_len - && (ira_reg_equiv_const[ALLOCNO_REGNO (allocno)] != NULL_RTX - || ira_reg_equiv_invariant_p[ALLOCNO_REGNO (allocno)]))) + || ira_equiv_no_lvalue_p (ALLOCNO_REGNO (allocno))) continue; if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) fprintf (ira_dump_file, " Slot %d (freq,size):", slot_num); diff --git a/gcc/ira-emit.c b/gcc/ira-emit.c index b0d9a825124..683d47eba80 100644 --- a/gcc/ira-emit.c +++ b/gcc/ira-emit.c @@ -340,6 +340,7 @@ ira_create_new_reg (rtx original_reg) if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) fprintf (ira_dump_file, " Creating newreg=%i from oldreg=%i\n", REGNO (new_reg), REGNO (original_reg)); + ira_expand_reg_equiv (); return new_reg; } @@ -518,8 +519,7 @@ generate_edge_moves (edge e) /* Remove unnecessary stores at the region exit. We should do this for readonly memory for sure and this is guaranteed by that we never generate moves on region borders (see - checking ira_reg_equiv_invariant_p in function - change_loop). */ + checking in function change_loop). */ if (ALLOCNO_HARD_REGNO (dest_allocno) < 0 && ALLOCNO_HARD_REGNO (src_allocno) >= 0 && store_can_be_removed_p (src_allocno, dest_allocno)) @@ -613,8 +613,7 @@ change_loop (ira_loop_tree_node_t node) /* don't create copies because reload can spill an allocno set by copy although the allocno will not get memory slot. 
*/ - || ira_reg_equiv_invariant_p[regno] - || ira_reg_equiv_const[regno] != NULL_RTX)) + || ira_equiv_no_lvalue_p (regno))) continue; original_reg = allocno_emit_reg (allocno); if (parent_allocno == NULL @@ -902,17 +901,22 @@ modify_move_list (move_t list) static rtx emit_move_list (move_t list, int freq) { - int cost, regno; - rtx result, insn, set, to; + rtx to, from, dest; + int to_regno, from_regno, cost, regno; + rtx result, insn, set; enum machine_mode mode; enum reg_class aclass; + grow_reg_equivs (); start_sequence (); for (; list != NULL; list = list->next) { start_sequence (); - emit_move_insn (allocno_emit_reg (list->to), - allocno_emit_reg (list->from)); + to = allocno_emit_reg (list->to); + to_regno = REGNO (to); + from = allocno_emit_reg (list->from); + from_regno = REGNO (from); + emit_move_insn (to, from); list->insn = get_insns (); end_sequence (); for (insn = list->insn; insn != NULL_RTX; insn = NEXT_INSN (insn)) @@ -928,21 +932,22 @@ emit_move_list (move_t list, int freq) to use the equivalence. */ if ((set = single_set (insn)) != NULL_RTX) { - to = SET_DEST (set); - if (GET_CODE (to) == SUBREG) - to = SUBREG_REG (to); - ira_assert (REG_P (to)); - regno = REGNO (to); + dest = SET_DEST (set); + if (GET_CODE (dest) == SUBREG) + dest = SUBREG_REG (dest); + ira_assert (REG_P (dest)); + regno = REGNO (dest); if (regno >= ira_reg_equiv_len - || (! ira_reg_equiv_invariant_p[regno] - && ira_reg_equiv_const[regno] == NULL_RTX)) + || (ira_reg_equiv[regno].invariant == NULL_RTX + && ira_reg_equiv[regno].constant == NULL_RTX)) continue; /* regno has no equivalence. 
*/ ira_assert ((int) VEC_length (reg_equivs_t, reg_equivs) - >= ira_reg_equiv_len); + > regno); reg_equiv_init (regno) = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init (regno)); } } + ira_update_equiv_info_by_shuffle_insn (to_regno, from_regno, list->insn); emit_insn (list->insn); mode = ALLOCNO_MODE (list->to); aclass = ALLOCNO_CLASS (list->to); diff --git a/gcc/ira-int.h b/gcc/ira-int.h index bde69861e78..a64e3a14afe 100644 --- a/gcc/ira-int.h +++ b/gcc/ira-int.h @@ -795,11 +795,6 @@ struct target_ira_int { /* Map class->true if class is a pressure class, false otherwise. */ bool x_ira_reg_pressure_class_p[N_REG_CLASSES]; - /* Register class subset relation: TRUE if the first class is a subset - of the second one considering only hard registers available for the - allocation. */ - int x_ira_class_subset_p[N_REG_CLASSES][N_REG_CLASSES]; - /* Array of the number of hard registers of given class which are available for allocation. The order is defined by the hard register numbers. */ @@ -852,13 +847,8 @@ struct target_ira_int { taking all hard-registers including fixed ones into account. */ enum reg_class x_ira_reg_class_intersect[N_REG_CLASSES][N_REG_CLASSES]; - /* True if the two classes (that is calculated taking only hard - registers available for allocation into account; are - intersected. */ - bool x_ira_reg_classes_intersect_p[N_REG_CLASSES][N_REG_CLASSES]; - /* Classes with end marker LIM_REG_CLASSES which are intersected with - given class (the first index;. That includes given class itself. + given class (the first index). That includes given class itself. This is calculated taking only hard registers available for allocation into account. */ enum reg_class x_ira_reg_class_super_classes[N_REG_CLASSES][N_REG_CLASSES]; @@ -875,7 +865,7 @@ struct target_ira_int { /* For each reg class, table listing all the classes contained in it (excluding the class itself. Non-allocatable registers are - excluded from the consideration;. 
*/ + excluded from the consideration). */ enum reg_class x_alloc_reg_class_subclasses[N_REG_CLASSES][N_REG_CLASSES]; /* Array whose values are hard regset of hard registers for which @@ -908,8 +898,6 @@ extern struct target_ira_int *this_target_ira_int; (this_target_ira_int->x_ira_reg_allocno_class_p) #define ira_reg_pressure_class_p \ (this_target_ira_int->x_ira_reg_pressure_class_p) -#define ira_class_subset_p \ - (this_target_ira_int->x_ira_class_subset_p) #define ira_non_ordered_class_hard_regs \ (this_target_ira_int->x_ira_non_ordered_class_hard_regs) #define ira_class_hard_reg_index \ @@ -928,8 +916,6 @@ extern struct target_ira_int *this_target_ira_int; (this_target_ira_int->x_ira_uniform_class_p) #define ira_reg_class_intersect \ (this_target_ira_int->x_ira_reg_class_intersect) -#define ira_reg_classes_intersect_p \ - (this_target_ira_int->x_ira_reg_classes_intersect_p) #define ira_reg_class_super_classes \ (this_target_ira_int->x_ira_reg_class_super_classes) #define ira_reg_class_subunion \ @@ -950,17 +936,6 @@ extern void ira_debug_disposition (void); extern void ira_debug_allocno_classes (void); extern void ira_init_register_move_cost (enum machine_mode); -/* The length of the two following arrays. */ -extern int ira_reg_equiv_len; - -/* The element value is TRUE if the corresponding regno value is - invariant. */ -extern bool *ira_reg_equiv_invariant_p; - -/* The element value is equiv constant of given pseudo-register or - NULL_RTX. */ -extern rtx *ira_reg_equiv_const; - /* ira-build.c */ /* The current loop tree node and its regno allocno map. */ @@ -1044,6 +1019,20 @@ extern void ira_emit (bool); +/* Return true if equivalence of pseudo REGNO is not a lvalue. 
*/ +static inline bool +ira_equiv_no_lvalue_p (int regno) +{ + if (regno >= ira_reg_equiv_len) + return false; + return (ira_reg_equiv[regno].constant != NULL_RTX + || ira_reg_equiv[regno].invariant != NULL_RTX + || (ira_reg_equiv[regno].memory != NULL_RTX + && MEM_READONLY_P (ira_reg_equiv[regno].memory))); +} + + + /* Initialize register costs for MODE if necessary. */ static inline void ira_init_register_move_cost_if_necessary (enum machine_mode mode) diff --git a/gcc/ira.c b/gcc/ira.c index 78b3f92db00..e91d37ddaa5 100644 --- a/gcc/ira.c +++ b/gcc/ira.c @@ -382,6 +382,7 @@ along with GCC; see the file COPYING3. If not see #include "function.h" #include "ggc.h" #include "ira-int.h" +#include "lra.h" #include "dce.h" #include "dbgcnt.h" @@ -1201,6 +1202,7 @@ setup_reg_class_relations (void) { ira_reg_classes_intersect_p[cl1][cl2] = false; ira_reg_class_intersect[cl1][cl2] = NO_REGS; + ira_reg_class_subset[cl1][cl2] = NO_REGS; COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl1]); AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); COPY_HARD_REG_SET (temp_set2, reg_class_contents[cl2]); @@ -1248,9 +1250,8 @@ setup_reg_class_relations (void) COPY_HARD_REG_SET (union_set, reg_class_contents[cl1]); IOR_HARD_REG_SET (union_set, reg_class_contents[cl2]); AND_COMPL_HARD_REG_SET (union_set, no_unit_alloc_regs); - for (i = 0; i < ira_important_classes_num; i++) + for (cl3 = 0; cl3 < N_REG_CLASSES; cl3++) { - cl3 = ira_important_classes[i]; COPY_HARD_REG_SET (temp_hard_regset, reg_class_contents[cl3]); AND_COMPL_HARD_REG_SET (temp_hard_regset, no_unit_alloc_regs); if (hard_reg_set_subset_p (temp_hard_regset, intersection_set)) @@ -1258,25 +1259,45 @@ setup_reg_class_relations (void) /* CL3 allocatable hard register set is inside of intersection of allocatable hard register sets of CL1 and CL2. 
*/ + if (important_class_p[cl3]) + { + COPY_HARD_REG_SET + (temp_set2, + reg_class_contents + [(int) ira_reg_class_intersect[cl1][cl2]]); + AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); + if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2) + /* If the allocatable hard register sets are + the same, prefer GENERAL_REGS or the + smallest class for debugging + purposes. */ + || (hard_reg_set_equal_p (temp_hard_regset, temp_set2) + && (cl3 == GENERAL_REGS + || ((ira_reg_class_intersect[cl1][cl2] + != GENERAL_REGS) + && hard_reg_set_subset_p + (reg_class_contents[cl3], + reg_class_contents + [(int) + ira_reg_class_intersect[cl1][cl2]]))))) + ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3; + } COPY_HARD_REG_SET (temp_set2, - reg_class_contents[(int) - ira_reg_class_intersect[cl1][cl2]]); + reg_class_contents[(int) ira_reg_class_subset[cl1][cl2]]); AND_COMPL_HARD_REG_SET (temp_set2, no_unit_alloc_regs); - if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2) - /* If the allocatable hard register sets are the - same, prefer GENERAL_REGS or the smallest - class for debugging purposes. */ + if (! hard_reg_set_subset_p (temp_hard_regset, temp_set2) + /* Ignore unavailable hard registers and prefer + smallest class for debugging purposes. 
*/ || (hard_reg_set_equal_p (temp_hard_regset, temp_set2) - && (cl3 == GENERAL_REGS - || (ira_reg_class_intersect[cl1][cl2] != GENERAL_REGS - && hard_reg_set_subset_p - (reg_class_contents[cl3], - reg_class_contents - [(int) ira_reg_class_intersect[cl1][cl2]]))))) - ira_reg_class_intersect[cl1][cl2] = (enum reg_class) cl3; + && hard_reg_set_subset_p + (reg_class_contents[cl3], + reg_class_contents + [(int) ira_reg_class_subset[cl1][cl2]]))) + ira_reg_class_subset[cl1][cl2] = (enum reg_class) cl3; } - if (hard_reg_set_subset_p (temp_hard_regset, union_set)) + if (important_class_p[cl3] + && hard_reg_set_subset_p (temp_hard_regset, union_set)) { /* CL3 allocatbale hard register set is inside of union of allocatable hard register sets of CL1 @@ -1632,6 +1653,7 @@ void ira_init_once (void) { ira_init_costs_once (); + lra_init_once (); } /* Free ira_max_register_move_cost, ira_may_move_in_cost and @@ -1679,6 +1701,7 @@ ira_init (void) clarify_prohibited_class_mode_regs (); setup_hard_regno_aclass (); ira_init_costs (); + lra_init (); } /* Function called once at the end of compiler work. */ @@ -1687,6 +1710,7 @@ ira_finish_once (void) { ira_finish_costs_once (); free_register_move_costs (); + lra_finish_once (); } @@ -1823,9 +1847,11 @@ compute_regs_asm_clobbered (void) } -/* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and REGS_EVER_LIVE. */ +/* Set up ELIMINABLE_REGSET, IRA_NO_ALLOC_REGS, and REGS_EVER_LIVE. + If the function is called from IRA (not from the insn scheduler or + RTL loop invariant motion), FROM_IRA_P is true. */ void -ira_setup_eliminable_regset (void) +ira_setup_eliminable_regset (bool from_ira_p) { #ifdef ELIMINABLE_REGS int i; @@ -1835,7 +1861,7 @@ ira_setup_eliminable_regset (void) sp for alloca. So we can't eliminate the frame pointer in that case. At some point, we should improve this by emitting the sp-adjusting insns for this case. */ - int need_fp + frame_pointer_needed = (! 
flag_omit_frame_pointer || (cfun->calls_alloca && EXIT_IGNORE_STACK) /* We need the frame pointer to catch stack overflow exceptions @@ -1845,8 +1871,14 @@ ira_setup_eliminable_regset (void) || crtl->stack_realign_needed || targetm.frame_pointer_required ()); - frame_pointer_needed = need_fp; + if (from_ira_p && ira_use_lra_p) + /* It can change FRAME_POINTER_NEEDED. We call it only from IRA + because it is expensive. */ + lra_init_elimination (); + if (frame_pointer_needed) + df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM, true); + COPY_HARD_REG_SET (ira_no_alloc_regs, no_unit_alloc_regs); CLEAR_HARD_REG_SET (eliminable_regset); @@ -1859,7 +1891,7 @@ ira_setup_eliminable_regset (void) { bool cannot_elim = (! targetm.can_eliminate (eliminables[i].from, eliminables[i].to) - || (eliminables[i].to == STACK_POINTER_REGNUM && need_fp)); + || (eliminables[i].to == STACK_POINTER_REGNUM && frame_pointer_needed)); if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, eliminables[i].from)) { @@ -1878,10 +1910,10 @@ ira_setup_eliminable_regset (void) if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, HARD_FRAME_POINTER_REGNUM)) { SET_HARD_REG_BIT (eliminable_regset, HARD_FRAME_POINTER_REGNUM); - if (need_fp) + if (frame_pointer_needed) SET_HARD_REG_BIT (ira_no_alloc_regs, HARD_FRAME_POINTER_REGNUM); } - else if (need_fp) + else if (frame_pointer_needed) error ("%s cannot be used in asm here", reg_names[HARD_FRAME_POINTER_REGNUM]); else @@ -1892,10 +1924,10 @@ ira_setup_eliminable_regset (void) if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, HARD_FRAME_POINTER_REGNUM)) { SET_HARD_REG_BIT (eliminable_regset, FRAME_POINTER_REGNUM); - if (need_fp) + if (frame_pointer_needed) SET_HARD_REG_BIT (ira_no_alloc_regs, FRAME_POINTER_REGNUM); } - else if (need_fp) + else if (frame_pointer_needed) error ("%s cannot be used in asm here", reg_names[FRAME_POINTER_REGNUM]); else df_set_regs_ever_live (FRAME_POINTER_REGNUM, true); @@ -1904,66 +1936,6 @@ ira_setup_eliminable_regset (void) -/* The length of the 
following two arrays. */ -int ira_reg_equiv_len; - -/* The element value is TRUE if the corresponding regno value is - invariant. */ -bool *ira_reg_equiv_invariant_p; - -/* The element value is equiv constant of given pseudo-register or - NULL_RTX. */ -rtx *ira_reg_equiv_const; - -/* Set up the two arrays declared above. */ -static void -find_reg_equiv_invariant_const (void) -{ - unsigned int i; - bool invariant_p; - rtx list, insn, note, constant, x; - - for (i = FIRST_PSEUDO_REGISTER; i < VEC_length (reg_equivs_t, reg_equivs); i++) - { - constant = NULL_RTX; - invariant_p = false; - for (list = reg_equiv_init (i); list != NULL_RTX; list = XEXP (list, 1)) - { - insn = XEXP (list, 0); - note = find_reg_note (insn, REG_EQUIV, NULL_RTX); - - if (note == NULL_RTX) - continue; - - x = XEXP (note, 0); - - if (! CONSTANT_P (x) - || ! flag_pic || LEGITIMATE_PIC_OPERAND_P (x)) - { - /* It can happen that a REG_EQUIV note contains a MEM - that is not a legitimate memory operand. As later - stages of the reload assume that all addresses found - in the reg_equiv_* arrays were originally legitimate, - we ignore such REG_EQUIV notes. */ - if (memory_operand (x, VOIDmode)) - invariant_p = MEM_READONLY_P (x); - else if (function_invariant_p (x)) - { - if (GET_CODE (x) == PLUS - || x == frame_pointer_rtx || x == arg_pointer_rtx) - invariant_p = true; - else - constant = x; - } - } - } - ira_reg_equiv_invariant_p[i] = invariant_p; - ira_reg_equiv_const[i] = constant; - } -} - - - /* Vector of substitutions of register numbers, used to map pseudo regs into hardware regs. This is set up as a result of register allocation. @@ -1984,6 +1956,8 @@ setup_reg_renumber (void) caller_save_needed = 0; FOR_EACH_ALLOCNO (a, ai) { + if (ira_use_lra_p && ALLOCNO_CAP_MEMBER (a) != NULL) + continue; /* There are no caps at this point. */ ira_assert (ALLOCNO_CAP_MEMBER (a) == NULL); if (! 
ALLOCNO_ASSIGNED_P (a)) @@ -2015,9 +1989,7 @@ setup_reg_renumber (void) ira_assert (!optimize || flag_caller_saves || (ALLOCNO_CALLS_CROSSED_NUM (a) == ALLOCNO_CHEAP_CALLS_CROSSED_NUM (a)) - || regno >= ira_reg_equiv_len - || ira_reg_equiv_const[regno] - || ira_reg_equiv_invariant_p[regno]); + || ira_equiv_no_lvalue_p (regno)); caller_save_needed = 1; } } @@ -2184,6 +2156,109 @@ check_allocation (void) } #endif +/* Allocate REG_EQUIV_INIT. Set up it from IRA_REG_EQUIV which should + be already calculated. */ +static void +setup_reg_equiv_init (void) +{ + int i; + int max_regno = max_reg_num (); + + for (i = 0; i < max_regno; i++) + reg_equiv_init (i) = ira_reg_equiv[i].init_insns; +} + +/* Update equiv regno from movement of FROM_REGNO to TO_REGNO. INSNS + are insns which were generated for such movement. It is assumed + that FROM_REGNO and TO_REGNO always have the same value at the + point of any move containing such registers. This function is used + to update equiv info for register shuffles on the region borders + and for caller save/restore insns. */ +void +ira_update_equiv_info_by_shuffle_insn (int to_regno, int from_regno, rtx insns) +{ + rtx insn, x, note; + + if (! ira_reg_equiv[from_regno].defined_p + && (! ira_reg_equiv[to_regno].defined_p + || ((x = ira_reg_equiv[to_regno].memory) != NULL_RTX + && ! MEM_READONLY_P (x)))) + return; + insn = insns; + if (NEXT_INSN (insn) != NULL_RTX) + { + if (! 
ira_reg_equiv[to_regno].defined_p) + { + ira_assert (ira_reg_equiv[to_regno].init_insns == NULL_RTX); + return; + } + ira_reg_equiv[to_regno].defined_p = false; + ira_reg_equiv[to_regno].memory + = ira_reg_equiv[to_regno].constant + = ira_reg_equiv[to_regno].invariant + = ira_reg_equiv[to_regno].init_insns = NULL_RTX; + if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) + fprintf (ira_dump_file, + " Invalidating equiv info for reg %d\n", to_regno); + return; + } + /* It is possible that FROM_REGNO still has no equivalence because + in shuffles to_regno<-from_regno and from_regno<-to_regno the 2nd + insn was not processed yet. */ + if (ira_reg_equiv[from_regno].defined_p) + { + ira_reg_equiv[to_regno].defined_p = true; + if ((x = ira_reg_equiv[from_regno].memory) != NULL_RTX) + { + ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX + && ira_reg_equiv[from_regno].constant == NULL_RTX); + ira_assert (ira_reg_equiv[to_regno].memory == NULL_RTX + || rtx_equal_p (ira_reg_equiv[to_regno].memory, x)); + ira_reg_equiv[to_regno].memory = x; + if (! MEM_READONLY_P (x)) + /* We don't add the insn to insn init list because memory + equivalence is just to say what memory is better to use + when the pseudo is spilled. 
*/ + return; + } + else if ((x = ira_reg_equiv[from_regno].constant) != NULL_RTX) + { + ira_assert (ira_reg_equiv[from_regno].invariant == NULL_RTX); + ira_assert (ira_reg_equiv[to_regno].constant == NULL_RTX + || rtx_equal_p (ira_reg_equiv[to_regno].constant, x)); + ira_reg_equiv[to_regno].constant = x; + } + else + { + x = ira_reg_equiv[from_regno].invariant; + ira_assert (x != NULL_RTX); + ira_assert (ira_reg_equiv[to_regno].invariant == NULL_RTX + || rtx_equal_p (ira_reg_equiv[to_regno].invariant, x)); + ira_reg_equiv[to_regno].invariant = x; + } + if (find_reg_note (insn, REG_EQUIV, x) == NULL_RTX) + { + note = set_unique_reg_note (insn, REG_EQUIV, x); + gcc_assert (note != NULL_RTX); + if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) + { + fprintf (ira_dump_file, + " Adding equiv note to insn %u for reg %d ", + INSN_UID (insn), to_regno); + print_value_slim (ira_dump_file, x, 1); + fprintf (ira_dump_file, "\n"); + } + } + } + ira_reg_equiv[to_regno].init_insns + = gen_rtx_INSN_LIST (VOIDmode, insn, + ira_reg_equiv[to_regno].init_insns); + if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) + fprintf (ira_dump_file, + " Adding equiv init move insn %u to reg %d\n", + INSN_UID (insn), to_regno); +} + /* Fix values of array REG_EQUIV_INIT after live range splitting done by IRA. */ static void @@ -2221,6 +2296,7 @@ fix_reg_equiv_init (void) prev = x; else { + /* Remove the wrong list element. */ if (prev == NULL_RTX) reg_equiv_init (i) = next; else @@ -2360,6 +2436,46 @@ mark_elimination (int from, int to) +/* The length of the following array. */ +int ira_reg_equiv_len; + +/* Info about equiv. info for each register. */ +struct ira_reg_equiv *ira_reg_equiv; + +/* Expand ira_reg_equiv if necessary. 
*/ +void +ira_expand_reg_equiv (void) +{ + int old = ira_reg_equiv_len; + + if (ira_reg_equiv_len > max_reg_num ()) + return; + ira_reg_equiv_len = max_reg_num () * 3 / 2 + 1; + ira_reg_equiv + = (struct ira_reg_equiv *) xrealloc (ira_reg_equiv, + ira_reg_equiv_len + * sizeof (struct ira_reg_equiv)); + gcc_assert (old < ira_reg_equiv_len); + memset (ira_reg_equiv + old, 0, + sizeof (struct ira_reg_equiv) * (ira_reg_equiv_len - old)); +} + +static void +init_reg_equiv (void) +{ + ira_reg_equiv_len = 0; + ira_reg_equiv = NULL; + ira_expand_reg_equiv (); +} + +static void +finish_reg_equiv (void) +{ + free (ira_reg_equiv); +} + + + struct equivalence { /* Set when a REG_EQUIV note is found or created. Use to @@ -2733,7 +2849,8 @@ no_equiv (rtx reg, const_rtx store ATTRIBUTE_UNUSED, should keep their initialization insns. */ if (reg_equiv[regno].is_arg_equivalence) return; - reg_equiv_init (regno) = NULL_RTX; + ira_reg_equiv[regno].defined_p = false; + ira_reg_equiv[regno].init_insns = NULL_RTX; for (; list; list = XEXP (list, 1)) { rtx insn = XEXP (list, 0); @@ -2769,7 +2886,7 @@ static int recorded_label_ref; value into the using insn. If it succeeds, we can eliminate the register completely. - Initialize the REG_EQUIV_INIT array of initializing insns. + Initialize init_insns in ira_reg_equiv array. Return non-zero if jump label rebuilding should be done. */ static int @@ -2844,14 +2961,16 @@ update_equiv_regs (void) gcc_assert (REG_P (dest)); regno = REGNO (dest); - /* Note that we don't want to clear reg_equiv_init even if there - are multiple sets of this register. */ + /* Note that we don't want to clear init_insns in + ira_reg_equiv even if there are multiple sets of this + register. */ reg_equiv[regno].is_arg_equivalence = 1; /* Record for reload that this is an equivalencing insn. 
*/ if (rtx_equal_p (src, XEXP (note, 0))) - reg_equiv_init (regno) - = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init (regno)); + ira_reg_equiv[regno].init_insns + = gen_rtx_INSN_LIST (VOIDmode, insn, + ira_reg_equiv[regno].init_insns); /* Continue normally in case this is a candidate for replacements. */ @@ -2951,8 +3070,9 @@ update_equiv_regs (void) /* If we haven't done so, record for reload that this is an equivalencing insn. */ if (!reg_equiv[regno].is_arg_equivalence) - reg_equiv_init (regno) - = gen_rtx_INSN_LIST (VOIDmode, insn, reg_equiv_init (regno)); + ira_reg_equiv[regno].init_insns + = gen_rtx_INSN_LIST (VOIDmode, insn, + ira_reg_equiv[regno].init_insns); /* Record whether or not we created a REG_EQUIV note for a LABEL_REF. We might end up substituting the LABEL_REF for uses of the @@ -3052,7 +3172,7 @@ update_equiv_regs (void) { /* This insn makes the equivalence, not the one initializing the register. */ - reg_equiv_init (regno) + ira_reg_equiv[regno].init_insns = gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX); df_notes_rescan (init_insn); } @@ -3106,9 +3226,10 @@ update_equiv_regs (void) /* reg_equiv[REGNO].replace gets set only when REG_N_REFS[REGNO] is 2, i.e. the register is set - once and used once. (If it were only set, but not used, - flow would have deleted the setting insns.) Hence - there can only be one insn in reg_equiv[REGNO].init_insns. */ + once and used once. (If it were only set, but + not used, flow would have deleted the setting + insns.) Hence there can only be one insn in + reg_equiv[REGNO].init_insns. 
*/ gcc_assert (reg_equiv[regno].init_insns && !XEXP (reg_equiv[regno].init_insns, 1)); equiv_insn = XEXP (reg_equiv[regno].init_insns, 0); @@ -3155,7 +3276,7 @@ update_equiv_regs (void) reg_equiv[regno].init_insns = XEXP (reg_equiv[regno].init_insns, 1); - reg_equiv_init (regno) = NULL_RTX; + ira_reg_equiv[regno].init_insns = NULL_RTX; bitmap_set_bit (cleared_regs, regno); } /* Move the initialization of the register to just before @@ -3188,7 +3309,7 @@ update_equiv_regs (void) if (insn == BB_HEAD (bb)) BB_HEAD (bb) = PREV_INSN (insn); - reg_equiv_init (regno) + ira_reg_equiv[regno].init_insns = gen_rtx_INSN_LIST (VOIDmode, new_insn, NULL_RTX); bitmap_set_bit (cleared_regs, regno); } @@ -3236,6 +3357,88 @@ update_equiv_regs (void) +/* Set up fields memory, constant, and invariant from init_insns in + the structures of array ira_reg_equiv. */ +static void +setup_reg_equiv (void) +{ + int i; + rtx elem, insn, set, x; + + for (i = FIRST_PSEUDO_REGISTER; i < ira_reg_equiv_len; i++) + for (elem = ira_reg_equiv[i].init_insns; elem; elem = XEXP (elem, 1)) + { + insn = XEXP (elem, 0); + set = single_set (insn); + + /* Init insns can set up equivalence when the reg is a destination or + a source (in this case the destination is memory). */ + if (set != 0 && (REG_P (SET_DEST (set)) || REG_P (SET_SRC (set)))) + { + if ((x = find_reg_note (insn, REG_EQUIV, NULL_RTX)) != NULL) + x = XEXP (x, 0); + else if (REG_P (SET_DEST (set)) + && REGNO (SET_DEST (set)) == (unsigned int) i) + x = SET_SRC (set); + else + { + gcc_assert (REG_P (SET_SRC (set)) + && REGNO (SET_SRC (set)) == (unsigned int) i); + x = SET_DEST (set); + } + if (! function_invariant_p (x) + || ! flag_pic + /* A function invariant is often CONSTANT_P but may + include a register. We promise to only pass + CONSTANT_P objects to LEGITIMATE_PIC_OPERAND_P. */ + || (CONSTANT_P (x) && LEGITIMATE_PIC_OPERAND_P (x))) + { + /* It can happen that a REG_EQUIV note contains a MEM + that is not a legitimate memory operand. 
As later + stages of reload assume that all addresses found in + the lra_regno_equiv_* arrays were originally + legitimate, we ignore such REG_EQUIV notes. */ + if (memory_operand (x, VOIDmode)) + { + ira_reg_equiv[i].defined_p = true; + ira_reg_equiv[i].memory = x; + continue; + } + else if (function_invariant_p (x)) + { + enum machine_mode mode; + + mode = GET_MODE (SET_DEST (set)); + if (GET_CODE (x) == PLUS + || x == frame_pointer_rtx || x == arg_pointer_rtx) + /* This is PLUS of frame pointer and a constant, + or fp, or argp. */ + ira_reg_equiv[i].invariant = x; + else if (targetm.legitimate_constant_p (mode, x)) + ira_reg_equiv[i].constant = x; + else + { + ira_reg_equiv[i].memory = force_const_mem (mode, x); + if (ira_reg_equiv[i].memory == NULL_RTX) + { + ira_reg_equiv[i].defined_p = false; + ira_reg_equiv[i].init_insns = NULL_RTX; + break; + } + } + ira_reg_equiv[i].defined_p = true; + continue; + } + } + } + ira_reg_equiv[i].defined_p = false; + ira_reg_equiv[i].init_insns = NULL_RTX; + break; + } +} + + + /* Print chain C to FILE. */ static void print_insn_chain (FILE *file, struct insn_chain *c) @@ -4130,6 +4333,11 @@ allocate_initial_values (void) } } + +/* True when we use LRA instead of reload pass for the current + function. */ +bool ira_use_lra_p; + /* All natural loops. */ struct loops ira_loops; @@ -4147,6 +4355,31 @@ ira (FILE *f) bool loops_p; int max_regno_before_ira, ira_max_point_before_emit; int rebuild_p; + bool saved_flag_caller_saves = flag_caller_saves; + enum ira_region saved_flag_ira_region = flag_ira_region; + + ira_conflicts_p = optimize > 0; + + ira_use_lra_p = targetm.lra_p (); + /* If there are too many pseudos and/or basic blocks (e.g. 10K + pseudos and 10K blocks or 100K pseudos and 1K blocks), we will + use simplified and faster algorithms in LRA. */ + lra_simple_p + = (ira_use_lra_p && max_reg_num () >= (1 << 26) / last_basic_block); + if (lra_simple_p) + { + /* It permits to skip live range splitting in LRA. 
*/ + flag_caller_saves = false; + /* There is no sense to do regional allocation when we use + simplified LRA. */ + flag_ira_region = IRA_REGION_ONE; + ira_conflicts_p = false; + } + +#ifndef IRA_NO_OBSTACK + gcc_obstack_init (&ira_obstack); +#endif + bitmap_obstack_initialize (&ira_bitmap_obstack); if (flag_caller_saves) init_caller_save (); @@ -4162,7 +4395,6 @@ ira (FILE *f) ira_dump_file = stderr; } - ira_conflicts_p = optimize > 0; setup_prohibited_mode_move_regs (); df_note_add_problem (); @@ -4188,30 +4420,18 @@ ira (FILE *f) if (resize_reg_info () && flag_ira_loop_pressure) ira_set_pseudo_classes (true, ira_dump_file); + init_reg_equiv (); rebuild_p = update_equiv_regs (); + setup_reg_equiv (); + setup_reg_equiv_init (); -#ifndef IRA_NO_OBSTACK - gcc_obstack_init (&ira_obstack); -#endif - bitmap_obstack_initialize (&ira_bitmap_obstack); - if (optimize) + if (optimize && rebuild_p) { - max_regno = max_reg_num (); - ira_reg_equiv_len = max_regno; - ira_reg_equiv_invariant_p - = (bool *) ira_allocate (max_regno * sizeof (bool)); - memset (ira_reg_equiv_invariant_p, 0, max_regno * sizeof (bool)); - ira_reg_equiv_const = (rtx *) ira_allocate (max_regno * sizeof (rtx)); - memset (ira_reg_equiv_const, 0, max_regno * sizeof (rtx)); - find_reg_equiv_invariant_const (); - if (rebuild_p) - { - timevar_push (TV_JUMP); - rebuild_jump_labels (get_insns ()); - if (purge_all_dead_edges ()) - delete_unreachable_blocks (); - timevar_pop (TV_JUMP); - } + timevar_push (TV_JUMP); + rebuild_jump_labels (get_insns ()); + if (purge_all_dead_edges ()) + delete_unreachable_blocks (); + timevar_pop (TV_JUMP); } allocated_reg_info_size = max_reg_num (); @@ -4226,7 +4446,7 @@ ira (FILE *f) find_moveable_pseudos (); max_regno_before_ira = max_reg_num (); - ira_setup_eliminable_regset (); + ira_setup_eliminable_regset (true); ira_overall_cost = ira_reg_cost = ira_mem_cost = 0; ira_load_cost = ira_store_cost = ira_shuffle_cost = 0; @@ -4263,19 +4483,32 @@ ira (FILE *f) ira_emit (loops_p); 
+ max_regno = max_reg_num (); if (ira_conflicts_p) { - max_regno = max_reg_num (); - if (! loops_p) - ira_initiate_assign (); + { + if (! ira_use_lra_p) + ira_initiate_assign (); + } else { expand_reg_info (); - if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) - fprintf (ira_dump_file, "Flattening IR\n"); - ira_flattening (max_regno_before_ira, ira_max_point_before_emit); + if (ira_use_lra_p) + { + ira_allocno_t a; + ira_allocno_iterator ai; + + FOR_EACH_ALLOCNO (a, ai) + ALLOCNO_REGNO (a) = REGNO (ALLOCNO_EMIT_DATA (a)->reg); + } + else + { + if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) + fprintf (ira_dump_file, "Flattening IR\n"); + ira_flattening (max_regno_before_ira, ira_max_point_before_emit); + } /* New insns were generated: add notes and recalculate live info. */ df_analyze (); @@ -4289,9 +4522,12 @@ ira (FILE *f) current_loops = &ira_loops; record_loop_exits (); - setup_allocno_assignment_flags (); - ira_initiate_assign (); - ira_reassign_conflict_allocnos (max_regno); + if (! ira_use_lra_p) + { + setup_allocno_assignment_flags (); + ira_initiate_assign (); + ira_reassign_conflict_allocnos (max_regno); + } } } @@ -4338,6 +4574,13 @@ ira (FILE *f) /* See comment for find_moveable_pseudos call. */ if (ira_conflicts_p) move_unallocated_pseudos (); + + /* Restore original values. 
*/ + if (lra_simple_p) + { + flag_caller_saves = saved_flag_caller_saves; + flag_ira_region = saved_flag_ira_region; + } } static void @@ -4349,46 +4592,77 @@ do_reload (void) if (flag_ira_verbose < 10) ira_dump_file = dump_file; - df_set_flags (DF_NO_INSN_RESCAN); - build_insn_chain (); + timevar_push (TV_RELOAD); + if (ira_use_lra_p) + { + if (current_loops != NULL) + { + release_recorded_exits (); + flow_loops_free (&ira_loops); + free_dominance_info (CDI_DOMINATORS); + } + FOR_ALL_BB (bb) + bb->loop_father = NULL; + current_loops = NULL; + + if (ira_conflicts_p) + ira_free (ira_spilled_reg_stack_slots); + + ira_destroy (); - need_dce = reload (get_insns (), ira_conflicts_p); + lra (ira_dump_file); + /* ???!!! Move it before lra () when we use ira_reg_equiv in + LRA. */ + VEC_free (reg_equivs_t, gc, reg_equivs); + reg_equivs = NULL; + need_dce = false; + } + else + { + df_set_flags (DF_NO_INSN_RESCAN); + build_insn_chain (); + + need_dce = reload (get_insns (), ira_conflicts_p); + + } + + timevar_pop (TV_RELOAD); timevar_push (TV_IRA); - if (ira_conflicts_p) + if (ira_conflicts_p && ! ira_use_lra_p) { ira_free (ira_spilled_reg_stack_slots); - ira_finish_assign (); } + if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL && overall_cost_before != ira_overall_cost) fprintf (ira_dump_file, "+++Overall after reload %d\n", ira_overall_cost); - ira_destroy (); flag_ira_share_spill_slots = saved_flag_ira_share_spill_slots; - if (current_loops != NULL) + if (! 
ira_use_lra_p) { - release_recorded_exits (); - flow_loops_free (&ira_loops); - free_dominance_info (CDI_DOMINATORS); + ira_destroy (); + if (current_loops != NULL) + { + release_recorded_exits (); + flow_loops_free (&ira_loops); + free_dominance_info (CDI_DOMINATORS); + } + FOR_ALL_BB (bb) + bb->loop_father = NULL; + current_loops = NULL; + + regstat_free_ri (); + regstat_free_n_sets_and_refs (); } - FOR_ALL_BB (bb) - bb->loop_father = NULL; - current_loops = NULL; - - regstat_free_ri (); - regstat_free_n_sets_and_refs (); if (optimize) - { - cleanup_cfg (CLEANUP_EXPENSIVE); + cleanup_cfg (CLEANUP_EXPENSIVE); - ira_free (ira_reg_equiv_invariant_p); - ira_free (ira_reg_equiv_const); - } + finish_reg_equiv (); bitmap_obstack_release (&ira_bitmap_obstack); #ifndef IRA_NO_OBSTACK diff --git a/gcc/ira.h b/gcc/ira.h index 0cafdf4a94c..19852ee934a 100644 --- a/gcc/ira.h +++ b/gcc/ira.h @@ -20,11 +20,16 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see <http://www.gnu.org/licenses/>. */ +/* True when we use LRA instead of reload pass for the current + function. */ +extern bool ira_use_lra_p; + /* True if we have allocno conflicts. It is false for non-optimized mode or when the conflict table is too big. */ extern bool ira_conflicts_p; -struct target_ira { +struct target_ira +{ /* Map: hard register number -> allocno class it belongs to. If the corresponding class is NO_REGS, the hard register is not available for allocation. */ @@ -79,6 +84,23 @@ struct target_ira { class. */ int x_ira_class_hard_regs_num[N_REG_CLASSES]; + /* Register class subset relation: TRUE if the first class is a subset + of the second one considering only hard registers available for the + allocation. */ + int x_ira_class_subset_p[N_REG_CLASSES][N_REG_CLASSES]; + + /* The biggest class inside of intersection of the two classes (that + is calculated taking only hard registers available for allocation + into account. 
If the both classes contain no hard registers + available for allocation, the value is calculated with taking all + hard-registers including fixed ones into account. */ + enum reg_class x_ira_reg_class_subset[N_REG_CLASSES][N_REG_CLASSES]; + + /* True if the two classes (that is calculated taking only hard + registers available for allocation into account; are + intersected. */ + bool x_ira_reg_classes_intersect_p[N_REG_CLASSES][N_REG_CLASSES]; + /* If class CL has a single allocatable register of mode M, index [CL][M] gives the number of that register, otherwise it is -1. */ short x_ira_class_singleton[N_REG_CLASSES][MAX_MACHINE_MODE]; @@ -121,18 +143,48 @@ extern struct target_ira *this_target_ira; (this_target_ira->x_ira_class_hard_regs) #define ira_class_hard_regs_num \ (this_target_ira->x_ira_class_hard_regs_num) +#define ira_class_subset_p \ + (this_target_ira->x_ira_class_subset_p) +#define ira_reg_class_subset \ + (this_target_ira->x_ira_reg_class_subset) +#define ira_reg_classes_intersect_p \ + (this_target_ira->x_ira_reg_classes_intersect_p) #define ira_class_singleton \ (this_target_ira->x_ira_class_singleton) #define ira_no_alloc_regs \ (this_target_ira->x_ira_no_alloc_regs) +/* Major structure describing equivalence info for a pseudo. */ +struct ira_reg_equiv +{ + /* True if we can use this equivalence. */ + bool defined_p; + /* True if the usage of the equivalence is profitable. */ + bool profitable_p; + /* Equiv. memory, constant, invariant, and initializing insns of + given pseudo-register or NULL_RTX. */ + rtx memory; + rtx constant; + rtx invariant; + /* Always NULL_RTX if defined_p is false. */ + rtx init_insns; +}; + +/* The length of the following array. */ +extern int ira_reg_equiv_len; + +/* Info about equiv. info for each register. 
*/ +extern struct ira_reg_equiv *ira_reg_equiv; + extern void ira_init_once (void); extern void ira_init (void); extern void ira_finish_once (void); -extern void ira_setup_eliminable_regset (void); +extern void ira_setup_eliminable_regset (bool); extern rtx ira_eliminate_regs (rtx, enum machine_mode); extern void ira_set_pseudo_classes (bool, FILE *); extern void ira_implicitly_set_insn_hard_regs (HARD_REG_SET *); +extern void ira_expand_reg_equiv (void); +extern void ira_update_equiv_info_by_shuffle_insn (int, int, rtx); extern void ira_sort_regnos_for_alter_reg (int *, int, unsigned int *); extern void ira_mark_allocation_change (int); diff --git a/gcc/jump.c b/gcc/jump.c index 73582435749..acc96341d65 100644 --- a/gcc/jump.c +++ b/gcc/jump.c @@ -1868,7 +1868,8 @@ true_regnum (const_rtx x) { if (REG_P (x)) { - if (REGNO (x) >= FIRST_PSEUDO_REGISTER && reg_renumber[REGNO (x)] >= 0) + if (REGNO (x) >= FIRST_PSEUDO_REGISTER + && (lra_in_progress || reg_renumber[REGNO (x)] >= 0)) return reg_renumber[REGNO (x)]; return REGNO (x); } @@ -1880,7 +1881,8 @@ true_regnum (const_rtx x) { struct subreg_info info; - subreg_get_info (REGNO (SUBREG_REG (x)), + subreg_get_info (lra_in_progress + ? (unsigned) base : REGNO (SUBREG_REG (x)), GET_MODE (SUBREG_REG (x)), SUBREG_BYTE (x), GET_MODE (x), &info); diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index 996e6e3a645..ba31541f80f 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1824,7 +1824,7 @@ calculate_loop_reg_pressure (void) bitmap_initialize (&LOOP_DATA (loop)->regs_ref, &reg_obstack); bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack); } - ira_setup_eliminable_regset (); + ira_setup_eliminable_regset (false); bitmap_initialize (&curr_regs_live, &reg_obstack); FOR_EACH_BB (bb) { diff --git a/gcc/lra-assigns.c b/gcc/lra-assigns.c new file mode 100644 index 00000000000..b957563716f --- /dev/null +++ b/gcc/lra-assigns.c @@ -0,0 +1,1398 @@ +/* Assign reload pseudos. 
+ Copyright (C) 2010, 2011, 2012 + Free Software Foundation, Inc. + Contributed by Vladimir Makarov <vmakarov@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + + +/* This file's main objective is to assign hard registers to reload + pseudos. It also tries to allocate hard registers to other + pseudos, but at a lower priority than the reload pseudos. The pass + does not transform the RTL. + + We must allocate a hard register to every reload pseudo. We try to + increase the chances of finding a viable allocation by assigning + the pseudos in order of fewest available hard registers first. If + we still fail to find a hard register, we spill other (non-reload) + pseudos in order to make room. + + find_hard_regno_for finds hard registers for allocation without + spilling. spill_for does the same with spilling. Both functions + use a cost model to determine the most profitable choice of hard + and spill registers. + + Once we have finished allocating reload pseudos, we also try to + assign registers to other (non-reload) pseudos. This is useful if + hard registers were freed up by the spilling just described. + + We try to assign hard registers by collecting pseudos into threads. + These threads contain reload and inheritance pseudos that are + connected by copies (move insns). 
Doing this improves the chances + of pseudos in the thread getting the same hard register and, as a + result, of allowing some move insns to be deleted. + + When we assign a hard register to a pseudo, we decrease the cost of + using the same hard register for pseudos that are connected by + copies. + + If two hard registers have the same frequency-derived cost, we + prefer hard registers with higher priorities. The mapping of + registers to priorities is controlled by the register_priority + target hook. For example, x86-64 has a few register priorities: + hard registers with and without REX prefixes have different + priorities. This permits us to generate smaller code as insns + without REX prefixes are shorter. + + If a few hard registers are still equally good for the assignment, + we choose the least used hard register. It is called leveling and + may be profitable for some targets. + + Only insns with changed allocation pseudos are processed on the + next constraint pass. + + The pseudo live-ranges are used to find conflicting pseudos. + + For understanding the code, it is important to keep in mind that + inheritance, split, and reload pseudos created since last + constraint pass have regno >= lra_constraint_new_regno_start. + Inheritance and split pseudos created on any pass are in the + corresponding bitmaps. Inheritance and split pseudos since the + last constraint pass have also the corresponding non-negative + restore_regno. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "hard-reg-set.h" +#include "rtl.h" +#include "tm_p.h" +#include "target.h" +#include "insn-config.h" +#include "recog.h" +#include "output.h" +#include "regs.h" +#include "function.h" +#include "expr.h" +#include "basic-block.h" +#include "except.h" +#include "df.h" +#include "ira.h" +#include "sparseset.h" +#include "lra-int.h" + +/* Array containing corresponding values of function + lra_get_allocno_class. 
It is used to speed up the code. */ +static enum reg_class *regno_allocno_class_array; + +/* Information about the thread to which a pseudo belongs. Threads are + a set of connected reload and inheritance pseudos with the same set of + available hard registers. Lone registers belong to their own threads. */ +struct regno_assign_info +{ + /* First/next pseudo of the same thread. */ + int first, next; + /* Frequency of the thread (execution frequency of only reload + pseudos in the thread when the thread contains a reload pseudo). + Defined only for the first thread pseudo. */ + int freq; +}; + +/* Map regno to the corresponding regno assignment info. */ +static struct regno_assign_info *regno_assign_info; + +/* Process a pseudo copy with execution frequency COPY_FREQ connecting + REGNO1 and REGNO2 to form threads. */ +static void +process_copy_to_form_thread (int regno1, int regno2, int copy_freq) +{ + int last, regno1_first, regno2_first; + + lra_assert (regno1 >= lra_constraint_new_regno_start + && regno2 >= lra_constraint_new_regno_start); + regno1_first = regno_assign_info[regno1].first; + regno2_first = regno_assign_info[regno2].first; + if (regno1_first != regno2_first) + { + for (last = regno2_first; + regno_assign_info[last].next >= 0; + last = regno_assign_info[last].next) + regno_assign_info[last].first = regno1_first; + regno_assign_info[last].first = regno1_first; + regno_assign_info[last].next = regno_assign_info[regno1_first].next; + regno_assign_info[regno1_first].next = regno2_first; + regno_assign_info[regno1_first].freq + += regno_assign_info[regno2_first].freq; + } + regno_assign_info[regno1_first].freq -= 2 * copy_freq; + lra_assert (regno_assign_info[regno1_first].freq >= 0); +} + +/* Initialize REGNO_ASSIGN_INFO and form threads. 
*/ +static void +init_regno_assign_info (void) +{ + int i, regno1, regno2, max_regno = max_reg_num (); + lra_copy_t cp; + + regno_assign_info = XNEWVEC (struct regno_assign_info, max_regno); + for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) + { + regno_assign_info[i].first = i; + regno_assign_info[i].next = -1; + regno_assign_info[i].freq = lra_reg_info[i].freq; + } + /* Form the threads. */ + for (i = 0; (cp = lra_get_copy (i)) != NULL; i++) + if ((regno1 = cp->regno1) >= lra_constraint_new_regno_start + && (regno2 = cp->regno2) >= lra_constraint_new_regno_start + && reg_renumber[regno1] < 0 && lra_reg_info[regno1].nrefs != 0 + && reg_renumber[regno2] < 0 && lra_reg_info[regno2].nrefs != 0 + && (ira_class_hard_regs_num[regno_allocno_class_array[regno1]] + == ira_class_hard_regs_num[regno_allocno_class_array[regno2]])) + process_copy_to_form_thread (regno1, regno2, cp->freq); +} + +/* Free REGNO_ASSIGN_INFO. */ +static void +finish_regno_assign_info (void) +{ + free (regno_assign_info); +} + +/* The function is used to sort *reload* and *inheritance* pseudos to + try to assign them hard registers. We put pseudos from the same + thread always nearby. */ +static int +reload_pseudo_compare_func (const void *v1p, const void *v2p) +{ + int r1 = *(const int *) v1p, r2 = *(const int *) v2p; + enum reg_class cl1 = regno_allocno_class_array[r1]; + enum reg_class cl2 = regno_allocno_class_array[r2]; + int diff; + + lra_assert (r1 >= lra_constraint_new_regno_start + && r2 >= lra_constraint_new_regno_start); + + /* Prefer to assign reload registers with smaller classes first to + guarantee assignment to all reload registers. */ + if ((diff = (ira_class_hard_regs_num[cl1] + - ira_class_hard_regs_num[cl2])) != 0) + return diff; + if ((diff = (regno_assign_info[regno_assign_info[r2].first].freq + - regno_assign_info[regno_assign_info[r1].first].freq)) != 0) + return diff; + /* Put pseudos from the thread nearby. 
*/ + if ((diff = regno_assign_info[r1].first - regno_assign_info[r2].first) != 0) + return diff; + /* If regs are equally good, sort by their numbers, so that the + results of qsort leave nothing to chance. */ + return r1 - r2; +} + +/* The function is used to sort *non-reload* pseudos to try to assign + them hard registers. The order calculation is simpler than in the + previous function and based on the pseudo frequency usage. */ +static int +pseudo_compare_func (const void *v1p, const void *v2p) +{ + int r1 = *(const int *) v1p, r2 = *(const int *) v2p; + int diff; + + /* Prefer to assign more frequently used registers first. */ + if ((diff = lra_reg_info[r2].freq - lra_reg_info[r1].freq) != 0) + return diff; + + /* If regs are equally good, sort by their numbers, so that the + results of qsort leave nothing to chance. */ + return r1 - r2; +} + +/* Arrays of size LRA_LIVE_MAX_POINT mapping a program point to the + pseudo live ranges with given start point. We insert only live + ranges of pseudos interesting for assignment purposes. They are + reload pseudos and pseudos assigned to hard registers. */ +static lra_live_range_t *start_point_ranges; + +/* Used as a flag that a live range is not inserted in the start point + chain. */ +static struct lra_live_range not_in_chain_mark; + +/* Create and set up START_POINT_RANGES. 
*/ +static void +create_live_range_start_chains (void) +{ + int i, max_regno; + lra_live_range_t r; + + start_point_ranges = XCNEWVEC (lra_live_range_t, lra_live_max_point); + max_regno = max_reg_num (); + for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) + if (i >= lra_constraint_new_regno_start || reg_renumber[i] >= 0) + { + for (r = lra_reg_info[i].live_ranges; r != NULL; r = r->next) + { + r->start_next = start_point_ranges[r->start]; + start_point_ranges[r->start] = r; + } + } + else + { + for (r = lra_reg_info[i].live_ranges; r != NULL; r = r->next) + r->start_next = &not_in_chain_mark; + } +} + +/* Insert live ranges of pseudo REGNO into start chains if they are + not there yet. */ +static void +insert_in_live_range_start_chain (int regno) +{ + lra_live_range_t r = lra_reg_info[regno].live_ranges; + + if (r->start_next != &not_in_chain_mark) + return; + for (; r != NULL; r = r->next) + { + r->start_next = start_point_ranges[r->start]; + start_point_ranges[r->start] = r; + } +} + +/* Free START_POINT_RANGES. */ +static void +finish_live_range_start_chains (void) +{ + gcc_assert (start_point_ranges != NULL); + free (start_point_ranges); + start_point_ranges = NULL; +} + +/* Map: program point -> bitmap of all pseudos living at the point and + assigned to hard registers. */ +static bitmap_head *live_hard_reg_pseudos; +static bitmap_obstack live_hard_reg_pseudos_bitmap_obstack; + +/* reg_renumber corresponding to pseudos marked in + live_hard_reg_pseudos. reg_renumber might be not matched to + live_hard_reg_pseudos but live_pseudos_reg_renumber always reflects + live_hard_reg_pseudos. */ +static int *live_pseudos_reg_renumber; + +/* Sparseset used to calculate living hard reg pseudos for some program + point range. */ +static sparseset live_range_hard_reg_pseudos; + +/* Sparseset used to calculate living reload/inheritance pseudos for + some program point range. 
*/ +static sparseset live_range_reload_inheritance_pseudos; + +/* Allocate and initialize the data about living pseudos at program + points. */ +static void +init_lives (void) +{ + int i, max_regno = max_reg_num (); + + live_range_hard_reg_pseudos = sparseset_alloc (max_regno); + live_range_reload_inheritance_pseudos = sparseset_alloc (max_regno); + live_hard_reg_pseudos = XNEWVEC (bitmap_head, lra_live_max_point); + bitmap_obstack_initialize (&live_hard_reg_pseudos_bitmap_obstack); + for (i = 0; i < lra_live_max_point; i++) + bitmap_initialize (&live_hard_reg_pseudos[i], + &live_hard_reg_pseudos_bitmap_obstack); + live_pseudos_reg_renumber = XNEWVEC (int, max_regno); + for (i = 0; i < max_regno; i++) + live_pseudos_reg_renumber[i] = -1; +} + +/* Free the data about living pseudos at program points. */ +static void +finish_lives (void) +{ + sparseset_free (live_range_hard_reg_pseudos); + sparseset_free (live_range_reload_inheritance_pseudos); + free (live_hard_reg_pseudos); + bitmap_obstack_release (&live_hard_reg_pseudos_bitmap_obstack); + free (live_pseudos_reg_renumber); +} + +/* Update the LIVE_HARD_REG_PSEUDOS and LIVE_PSEUDOS_REG_RENUMBER + entries for pseudo REGNO. Assume that the register has been + spilled if FREE_P, otherwise assume that it has been assigned + reg_renumber[REGNO] (if >= 0). We also insert the pseudo live + ranges in the start chains when it is assumed to be assigned to a + hard register because we use the chains of pseudos assigned to hard + registers during allocation. */ +static void +update_lives (int regno, bool free_p) +{ + int p; + lra_live_range_t r; + + if (reg_renumber[regno] < 0) + return; + live_pseudos_reg_renumber[regno] = free_p ? 
-1 : reg_renumber[regno]; + for (r = lra_reg_info[regno].live_ranges; r != NULL; r = r->next) + { + for (p = r->start; p <= r->finish; p++) + if (free_p) + bitmap_clear_bit (&live_hard_reg_pseudos[p], regno); + else + { + bitmap_set_bit (&live_hard_reg_pseudos[p], regno); + insert_in_live_range_start_chain (regno); + } + } +} + +/* Sparseset used to calculate reload pseudos conflicting with a given + pseudo when we are trying to find a hard register for the given + pseudo. */ +static sparseset conflict_reload_and_inheritance_pseudos; + +/* Map: program point -> bitmap of all reload and inheritance pseudos + living at the point. */ +static bitmap_head *live_reload_and_inheritance_pseudos; +static bitmap_obstack live_reload_and_inheritance_pseudos_bitmap_obstack; + +/* Allocate and initialize data about living reload pseudos at any + given program point. */ +static void +init_live_reload_and_inheritance_pseudos (void) +{ + int i, p, max_regno = max_reg_num (); + lra_live_range_t r; + + conflict_reload_and_inheritance_pseudos = sparseset_alloc (max_regno); + live_reload_and_inheritance_pseudos = XNEWVEC (bitmap_head, lra_live_max_point); + bitmap_obstack_initialize (&live_reload_and_inheritance_pseudos_bitmap_obstack); + for (p = 0; p < lra_live_max_point; p++) + bitmap_initialize (&live_reload_and_inheritance_pseudos[p], + &live_reload_and_inheritance_pseudos_bitmap_obstack); + for (i = lra_constraint_new_regno_start; i < max_regno; i++) + { + for (r = lra_reg_info[i].live_ranges; r != NULL; r = r->next) + for (p = r->start; p <= r->finish; p++) + bitmap_set_bit (&live_reload_and_inheritance_pseudos[p], i); + } +} + +/* Finalize data about living reload pseudos at any given program + point. 
*/ +static void +finish_live_reload_and_inheritance_pseudos (void) +{ + sparseset_free (conflict_reload_and_inheritance_pseudos); + free (live_reload_and_inheritance_pseudos); + bitmap_obstack_release (&live_reload_and_inheritance_pseudos_bitmap_obstack); +} + +/* The value used to check that cost of given hard reg is really + defined currently. */ +static int curr_hard_regno_costs_check = 0; +/* Array used to check that cost of the corresponding hard reg (the + array element index) is really defined currently. */ +static int hard_regno_costs_check[FIRST_PSEUDO_REGISTER]; +/* The current costs of allocation of hard regs. Defined only if the + value of the corresponding element of the previous array is equal to + CURR_HARD_REGNO_COSTS_CHECK. */ +static int hard_regno_costs[FIRST_PSEUDO_REGISTER]; + +/* Adjust cost of HARD_REGNO by INCR. Reset the cost first if it is + not defined yet. */ +static inline void +adjust_hard_regno_cost (int hard_regno, int incr) +{ + if (hard_regno_costs_check[hard_regno] != curr_hard_regno_costs_check) + hard_regno_costs[hard_regno] = 0; + hard_regno_costs_check[hard_regno] = curr_hard_regno_costs_check; + hard_regno_costs[hard_regno] += incr; +} + +/* Try to find a free hard register for pseudo REGNO. Return the + hard register on success and set *COST to the cost of using + that register. (If several registers have equal cost, the one with + the highest priority wins.) Return -1 on failure. + + If TRY_ONLY_HARD_REGNO >= 0, consider only that hard register, + otherwise consider all hard registers in REGNO's class. 
*/ +static int +find_hard_regno_for (int regno, int *cost, int try_only_hard_regno) +{ + HARD_REG_SET conflict_set; + int best_cost = INT_MAX, best_priority = INT_MIN, best_usage = INT_MAX; + lra_live_range_t r; + int p, i, j, rclass_size, best_hard_regno, priority, hard_regno; + int hr, conflict_hr, nregs; + enum machine_mode biggest_mode; + unsigned int k, conflict_regno; + int val, biggest_nregs, nregs_diff; + enum reg_class rclass; + bitmap_iterator bi; + bool *rclass_intersect_p; + HARD_REG_SET impossible_start_hard_regs; + + COPY_HARD_REG_SET (conflict_set, lra_no_alloc_regs); + rclass = regno_allocno_class_array[regno]; + rclass_intersect_p = ira_reg_classes_intersect_p[rclass]; + curr_hard_regno_costs_check++; + sparseset_clear (conflict_reload_and_inheritance_pseudos); + sparseset_clear (live_range_hard_reg_pseudos); + IOR_HARD_REG_SET (conflict_set, lra_reg_info[regno].conflict_hard_regs); + biggest_mode = lra_reg_info[regno].biggest_mode; + for (r = lra_reg_info[regno].live_ranges; r != NULL; r = r->next) + { + EXECUTE_IF_SET_IN_BITMAP (&live_hard_reg_pseudos[r->start], 0, k, bi) + if (rclass_intersect_p[regno_allocno_class_array[k]]) + sparseset_set_bit (live_range_hard_reg_pseudos, k); + EXECUTE_IF_SET_IN_BITMAP (&live_reload_and_inheritance_pseudos[r->start], + 0, k, bi) + if (lra_reg_info[k].preferred_hard_regno1 >= 0 + && live_pseudos_reg_renumber[k] < 0 + && rclass_intersect_p[regno_allocno_class_array[k]]) + sparseset_set_bit (conflict_reload_and_inheritance_pseudos, k); + for (p = r->start + 1; p <= r->finish; p++) + { + lra_live_range_t r2; + + for (r2 = start_point_ranges[p]; + r2 != NULL; + r2 = r2->start_next) + { + if (r2->regno >= lra_constraint_new_regno_start + && lra_reg_info[r2->regno].preferred_hard_regno1 >= 0 + && live_pseudos_reg_renumber[r2->regno] < 0 + && rclass_intersect_p[regno_allocno_class_array[r2->regno]]) + sparseset_set_bit (conflict_reload_and_inheritance_pseudos, + r2->regno); + if (live_pseudos_reg_renumber[r2->regno] 
>= 0 + && rclass_intersect_p[regno_allocno_class_array[r2->regno]]) + sparseset_set_bit (live_range_hard_reg_pseudos, r2->regno); + } + } + } + if ((hard_regno = lra_reg_info[regno].preferred_hard_regno1) >= 0) + { + adjust_hard_regno_cost + (hard_regno, -lra_reg_info[regno].preferred_hard_regno_profit1); + if ((hard_regno = lra_reg_info[regno].preferred_hard_regno2) >= 0) + adjust_hard_regno_cost + (hard_regno, -lra_reg_info[regno].preferred_hard_regno_profit2); + } +#ifdef STACK_REGS + if (lra_reg_info[regno].no_stack_p) + for (i = FIRST_STACK_REG; i <= LAST_STACK_REG; i++) + SET_HARD_REG_BIT (conflict_set, i); +#endif + sparseset_clear_bit (conflict_reload_and_inheritance_pseudos, regno); + val = lra_reg_info[regno].val; + CLEAR_HARD_REG_SET (impossible_start_hard_regs); + EXECUTE_IF_SET_IN_SPARSESET (live_range_hard_reg_pseudos, conflict_regno) + if (val == lra_reg_info[conflict_regno].val) + { + conflict_hr = live_pseudos_reg_renumber[conflict_regno]; + nregs = (hard_regno_nregs[conflict_hr] + [lra_reg_info[conflict_regno].biggest_mode]); + /* Remember about multi-register pseudos. For example, 2 hard + register pseudos can start on the same hard register but can + not start on HR and HR+1/HR-1. 
*/ + for (hr = conflict_hr + 1; + hr < FIRST_PSEUDO_REGISTER && hr < conflict_hr + nregs; + hr++) + SET_HARD_REG_BIT (impossible_start_hard_regs, hr); + for (hr = conflict_hr - 1; + hr >= 0 && hr + hard_regno_nregs[hr][biggest_mode] > conflict_hr; + hr--) + SET_HARD_REG_BIT (impossible_start_hard_regs, hr); + } + else + { + add_to_hard_reg_set (&conflict_set, + lra_reg_info[conflict_regno].biggest_mode, + live_pseudos_reg_renumber[conflict_regno]); + if (hard_reg_set_subset_p (reg_class_contents[rclass], + conflict_set)) + return -1; + } + EXECUTE_IF_SET_IN_SPARSESET (conflict_reload_and_inheritance_pseudos, + conflict_regno) + if (val != lra_reg_info[conflict_regno].val) + { + lra_assert (live_pseudos_reg_renumber[conflict_regno] < 0); + if ((hard_regno + = lra_reg_info[conflict_regno].preferred_hard_regno1) >= 0) + { + adjust_hard_regno_cost + (hard_regno, + lra_reg_info[conflict_regno].preferred_hard_regno_profit1); + if ((hard_regno + = lra_reg_info[conflict_regno].preferred_hard_regno2) >= 0) + adjust_hard_regno_cost + (hard_regno, + lra_reg_info[conflict_regno].preferred_hard_regno_profit2); + } + } + /* Make sure that all registers in a multi-word pseudo belong to the + required class. */ + IOR_COMPL_HARD_REG_SET (conflict_set, reg_class_contents[rclass]); + lra_assert (rclass != NO_REGS); + rclass_size = ira_class_hard_regs_num[rclass]; + best_hard_regno = -1; + hard_regno = ira_class_hard_regs[rclass][0]; + biggest_nregs = hard_regno_nregs[hard_regno][biggest_mode]; + nregs_diff = (biggest_nregs + - hard_regno_nregs[hard_regno][PSEUDO_REGNO_MODE (regno)]); + for (i = 0; i < rclass_size; i++) + { + if (try_only_hard_regno >= 0) + hard_regno = try_only_hard_regno; + else + hard_regno = ira_class_hard_regs[rclass][i]; + if (! overlaps_hard_reg_set_p (conflict_set, + PSEUDO_REGNO_MODE (regno), hard_regno) + /* We can not use prohibited_class_mode_regs because it is + not defined for all classes. 
*/ + && HARD_REGNO_MODE_OK (hard_regno, PSEUDO_REGNO_MODE (regno)) + && ! TEST_HARD_REG_BIT (impossible_start_hard_regs, hard_regno) + && (nregs_diff == 0 +#ifdef WORDS_BIG_ENDIAN + || (hard_regno - nregs_diff >= 0 + && TEST_HARD_REG_BIT (reg_class_contents[rclass], + hard_regno - nregs_diff)) +#else + || TEST_HARD_REG_BIT (reg_class_contents[rclass], + hard_regno + nregs_diff) +#endif + )) + { + if (hard_regno_costs_check[hard_regno] + != curr_hard_regno_costs_check) + { + hard_regno_costs_check[hard_regno] = curr_hard_regno_costs_check; + hard_regno_costs[hard_regno] = 0; + } + for (j = 0; + j < hard_regno_nregs[hard_regno][PSEUDO_REGNO_MODE (regno)]; + j++) + if (! TEST_HARD_REG_BIT (call_used_reg_set, hard_regno + j) + && ! df_regs_ever_live_p (hard_regno + j)) + /* It needs save restore. */ + hard_regno_costs[hard_regno] + += 2 * ENTRY_BLOCK_PTR->next_bb->frequency; + priority = targetm.register_priority (hard_regno); + if (best_hard_regno < 0 || hard_regno_costs[hard_regno] < best_cost + || (hard_regno_costs[hard_regno] == best_cost + && (priority > best_priority + /* Hard register usage leveling actually results + in bigger code for targets with conditional + execution like ARM because it reduces chance + of if-conversion after LRA. */ + || (! targetm.have_conditional_execution () + && priority == best_priority + && best_usage > lra_hard_reg_usage[hard_regno])))) + { + best_hard_regno = hard_regno; + best_cost = hard_regno_costs[hard_regno]; + best_priority = priority; + best_usage = lra_hard_reg_usage[hard_regno]; + } + } + if (try_only_hard_regno >= 0) + break; + } + if (best_hard_regno >= 0) + *cost = best_cost - lra_reg_info[regno].freq; + return best_hard_regno; +} + +/* Current value used for checking elements in + update_hard_regno_preference_check. */ +static int curr_update_hard_regno_preference_check; +/* If an element value is equal to the above variable value, then the + corresponding regno has been processed for preference + propagation. 
*/ +static int *update_hard_regno_preference_check; + +/* Update the preference for using HARD_REGNO for pseudos that are + connected directly or indirectly with REGNO. Apply divisor DIV + to any preference adjustments. + + The more indirectly a pseudo is connected, the smaller its effect + should be. We therefore increase DIV on each "hop". */ +static void +update_hard_regno_preference (int regno, int hard_regno, int div) +{ + int another_regno, cost; + lra_copy_t cp, next_cp; + + /* Search depth 5 seems to be enough. */ + if (div > (1 << 5)) + return; + for (cp = lra_reg_info[regno].copies; cp != NULL; cp = next_cp) + { + if (cp->regno1 == regno) + { + next_cp = cp->regno1_next; + another_regno = cp->regno2; + } + else if (cp->regno2 == regno) + { + next_cp = cp->regno2_next; + another_regno = cp->regno1; + } + else + gcc_unreachable (); + if (reg_renumber[another_regno] < 0 + && (update_hard_regno_preference_check[another_regno] + != curr_update_hard_regno_preference_check)) + { + update_hard_regno_preference_check[another_regno] + = curr_update_hard_regno_preference_check; + cost = cp->freq < div ? 1 : cp->freq / div; + lra_setup_reload_pseudo_preferenced_hard_reg + (another_regno, hard_regno, cost); + update_hard_regno_preference (another_regno, hard_regno, div * 2); + } + } +} + +/* Update REG_RENUMBER and other pseudo preferences by assignment of + HARD_REGNO to pseudo REGNO and print about it if PRINT_P. */ +void +lra_setup_reg_renumber (int regno, int hard_regno, bool print_p) +{ + int i, hr; + + /* We can not just reassign hard register. 
*/ + lra_assert (hard_regno < 0 || reg_renumber[regno] < 0); + if ((hr = hard_regno) < 0) + hr = reg_renumber[regno]; + reg_renumber[regno] = hard_regno; + lra_assert (hr >= 0); + for (i = 0; i < hard_regno_nregs[hr][PSEUDO_REGNO_MODE (regno)]; i++) + if (hard_regno < 0) + lra_hard_reg_usage[hr + i] -= lra_reg_info[regno].freq; + else + lra_hard_reg_usage[hr + i] += lra_reg_info[regno].freq; + if (print_p && lra_dump_file != NULL) + fprintf (lra_dump_file, " Assign %d to %sr%d (freq=%d)\n", + reg_renumber[regno], + regno < lra_constraint_new_regno_start + ? "" + : bitmap_bit_p (&lra_inheritance_pseudos, regno) ? "inheritance " + : bitmap_bit_p (&lra_split_regs, regno) ? "split " + : bitmap_bit_p (&lra_optional_reload_pseudos, regno) + ? "optional reload ": "reload ", + regno, lra_reg_info[regno].freq); + if (hard_regno >= 0) + { + curr_update_hard_regno_preference_check++; + update_hard_regno_preference (regno, hard_regno, 1); + } +} + +/* Pseudos which occur in insns containing a particular pseudo. */ +static bitmap_head insn_conflict_pseudos; + +/* Bitmaps used to contain spill pseudos for given pseudo hard regno + and best spill pseudos for given pseudo (and best hard regno). */ +static bitmap_head spill_pseudos_bitmap, best_spill_pseudos_bitmap; + +/* Current pseudo check for validity of elements in + TRY_HARD_REG_PSEUDOS. */ +static int curr_pseudo_check; +/* Array used for validity of elements in TRY_HARD_REG_PSEUDOS. */ +static int try_hard_reg_pseudos_check[FIRST_PSEUDO_REGISTER]; +/* Pseudos who hold given hard register at the considered points. */ +static bitmap_head try_hard_reg_pseudos[FIRST_PSEUDO_REGISTER]; + +/* Set up try_hard_reg_pseudos for given program point P and class + RCLASS. Those are pseudos living at P and assigned to a hard + register of RCLASS. In other words, those are pseudos which can be + spilled to assign a hard register of RCLASS to a pseudo living at + P. 
*/ +static void +setup_try_hard_regno_pseudos (int p, enum reg_class rclass) +{ + int i, hard_regno; + enum machine_mode mode; + unsigned int spill_regno; + bitmap_iterator bi; + + /* Find what pseudos could be spilled. */ + EXECUTE_IF_SET_IN_BITMAP (&live_hard_reg_pseudos[p], 0, spill_regno, bi) + { + mode = PSEUDO_REGNO_MODE (spill_regno); + hard_regno = live_pseudos_reg_renumber[spill_regno]; + if (overlaps_hard_reg_set_p (reg_class_contents[rclass], + mode, hard_regno)) + { + for (i = hard_regno_nregs[hard_regno][mode] - 1; i >= 0; i--) + { + if (try_hard_reg_pseudos_check[hard_regno + i] + != curr_pseudo_check) + { + try_hard_reg_pseudos_check[hard_regno + i] + = curr_pseudo_check; + bitmap_clear (&try_hard_reg_pseudos[hard_regno + i]); + } + bitmap_set_bit (&try_hard_reg_pseudos[hard_regno + i], + spill_regno); + } + } + } +} + +/* Assign temporarily HARD_REGNO to pseudo REGNO. Temporary + assignment means that we might undo the data change. */ +static void +assign_temporarily (int regno, int hard_regno) +{ + int p; + lra_live_range_t r; + + for (r = lra_reg_info[regno].live_ranges; r != NULL; r = r->next) + { + for (p = r->start; p <= r->finish; p++) + if (hard_regno < 0) + bitmap_clear_bit (&live_hard_reg_pseudos[p], regno); + else + { + bitmap_set_bit (&live_hard_reg_pseudos[p], regno); + insert_in_live_range_start_chain (regno); + } + } + live_pseudos_reg_renumber[regno] = hard_regno; +} + +/* Array used for sorting reload pseudos for subsequent allocation + after spilling some pseudo. */ +static int *sorted_reload_pseudos; + +/* Spill some pseudos for a reload pseudo REGNO and return hard + register which should be used for pseudo after spilling. The + function adds spilled pseudos to SPILLED_PSEUDO_BITMAP. 
When we + choose hard register (and pseudos occupying the hard registers and + to be spilled), we take into account not only how REGNO will + benefit from the spills but also how other reload pseudos not yet + assigned to hard registers benefit from the spills too. In very + rare cases, the function can fail and return -1. */ +static int +spill_for (int regno, bitmap spilled_pseudo_bitmap) +{ + int i, j, n, p, hard_regno, best_hard_regno, cost, best_cost, rclass_size; + int reload_hard_regno, reload_cost; + enum machine_mode mode, mode2; + enum reg_class rclass; + HARD_REG_SET spilled_hard_regs; + unsigned int spill_regno, reload_regno, uid; + int insn_pseudos_num, best_insn_pseudos_num; + lra_live_range_t r; + bitmap_iterator bi; + + rclass = regno_allocno_class_array[regno]; + lra_assert (reg_renumber[regno] < 0 && rclass != NO_REGS); + bitmap_clear (&insn_conflict_pseudos); + bitmap_clear (&best_spill_pseudos_bitmap); + EXECUTE_IF_SET_IN_BITMAP (&lra_reg_info[regno].insn_bitmap, 0, uid, bi) + { + struct lra_insn_reg *ir; + + for (ir = lra_get_insn_regs (uid); ir != NULL; ir = ir->next) + if (ir->regno >= FIRST_PSEUDO_REGISTER) + bitmap_set_bit (&insn_conflict_pseudos, ir->regno); + } + best_hard_regno = -1; + best_cost = INT_MAX; + best_insn_pseudos_num = INT_MAX; + rclass_size = ira_class_hard_regs_num[rclass]; + mode = PSEUDO_REGNO_MODE (regno); + /* Invalidate try_hard_reg_pseudos elements. 
*/ + curr_pseudo_check++; + for (r = lra_reg_info[regno].live_ranges; r != NULL; r = r->next) + for (p = r->start; p <= r->finish; p++) + setup_try_hard_regno_pseudos (p, rclass); + for (i = 0; i < rclass_size; i++) + { + hard_regno = ira_class_hard_regs[rclass][i]; + bitmap_clear (&spill_pseudos_bitmap); + for (j = hard_regno_nregs[hard_regno][mode] - 1; j >= 0; j--) + { + if (try_hard_reg_pseudos_check[hard_regno + j] != curr_pseudo_check) + continue; + lra_assert (!bitmap_empty_p (&try_hard_reg_pseudos[hard_regno + j])); + bitmap_ior_into (&spill_pseudos_bitmap, + &try_hard_reg_pseudos[hard_regno + j]); + } + /* Spill pseudos. */ + CLEAR_HARD_REG_SET (spilled_hard_regs); + EXECUTE_IF_SET_IN_BITMAP (&spill_pseudos_bitmap, 0, spill_regno, bi) + if ((int) spill_regno >= lra_constraint_new_regno_start + && ! bitmap_bit_p (&lra_inheritance_pseudos, spill_regno) + && ! bitmap_bit_p (&lra_split_regs, spill_regno) + && ! bitmap_bit_p (&lra_optional_reload_pseudos, spill_regno)) + goto fail; + insn_pseudos_num = 0; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Trying %d:", hard_regno); + sparseset_clear (live_range_reload_inheritance_pseudos); + EXECUTE_IF_SET_IN_BITMAP (&spill_pseudos_bitmap, 0, spill_regno, bi) + { + if (bitmap_bit_p (&insn_conflict_pseudos, spill_regno)) + insn_pseudos_num++; + mode2 = PSEUDO_REGNO_MODE (spill_regno); + update_lives (spill_regno, true); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " spill %d(freq=%d)", + spill_regno, lra_reg_info[spill_regno].freq); + add_to_hard_reg_set (&spilled_hard_regs, + mode2, reg_renumber[spill_regno]); + for (r = lra_reg_info[spill_regno].live_ranges; + r != NULL; + r = r->next) + { + for (p = r->start; p <= r->finish; p++) + { + lra_live_range_t r2; + + for (r2 = start_point_ranges[p]; + r2 != NULL; + r2 = r2->start_next) + if (r2->regno >= lra_constraint_new_regno_start) + sparseset_set_bit (live_range_reload_inheritance_pseudos, + r2->regno); + } + } + } + hard_regno = 
find_hard_regno_for (regno, &cost, -1); + if (hard_regno >= 0) + { + assign_temporarily (regno, hard_regno); + n = 0; + EXECUTE_IF_SET_IN_SPARSESET (live_range_reload_inheritance_pseudos, + reload_regno) + if (live_pseudos_reg_renumber[reload_regno] < 0 + && (hard_reg_set_intersect_p + (reg_class_contents + [regno_allocno_class_array[reload_regno]], + spilled_hard_regs))) + sorted_reload_pseudos[n++] = reload_regno; + qsort (sorted_reload_pseudos, n, sizeof (int), + reload_pseudo_compare_func); + for (j = 0; j < n; j++) + { + reload_regno = sorted_reload_pseudos[j]; + lra_assert (live_pseudos_reg_renumber[reload_regno] < 0); + if ((reload_hard_regno + = find_hard_regno_for (reload_regno, + &reload_cost, -1)) >= 0 + && (overlaps_hard_reg_set_p + (spilled_hard_regs, + PSEUDO_REGNO_MODE (reload_regno), reload_hard_regno))) + { + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " assign %d(cost=%d)", + reload_regno, reload_cost); + assign_temporarily (reload_regno, reload_hard_regno); + cost += reload_cost; + } + } + EXECUTE_IF_SET_IN_BITMAP (&spill_pseudos_bitmap, 0, spill_regno, bi) + { + rtx x; + + cost += lra_reg_info[spill_regno].freq; + if (ira_reg_equiv[spill_regno].memory != NULL + || ira_reg_equiv[spill_regno].constant != NULL) + for (x = ira_reg_equiv[spill_regno].init_insns; + x != NULL; + x = XEXP (x, 1)) + cost -= REG_FREQ_FROM_BB (BLOCK_FOR_INSN (XEXP (x, 0))); + } + if (best_insn_pseudos_num > insn_pseudos_num + || (best_insn_pseudos_num == insn_pseudos_num + && best_cost > cost)) + { + best_insn_pseudos_num = insn_pseudos_num; + best_cost = cost; + best_hard_regno = hard_regno; + bitmap_copy (&best_spill_pseudos_bitmap, &spill_pseudos_bitmap); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Now best %d(cost=%d)\n", + hard_regno, cost); + } + assign_temporarily (regno, -1); + for (j = 0; j < n; j++) + { + reload_regno = sorted_reload_pseudos[j]; + if (live_pseudos_reg_renumber[reload_regno] >= 0) + assign_temporarily (reload_regno, -1); + } 
+ } + if (lra_dump_file != NULL) + fprintf (lra_dump_file, "\n"); + /* Restore the live hard reg pseudo info for spilled pseudos. */ + EXECUTE_IF_SET_IN_BITMAP (&spill_pseudos_bitmap, 0, spill_regno, bi) + update_lives (spill_regno, false); + fail: + ; + } + /* Spill: */ + EXECUTE_IF_SET_IN_BITMAP (&best_spill_pseudos_bitmap, 0, spill_regno, bi) + { + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Spill %sr%d(hr=%d, freq=%d) for r%d\n", + ((int) spill_regno < lra_constraint_new_regno_start + ? "" + : bitmap_bit_p (&lra_inheritance_pseudos, spill_regno) + ? "inheritance " + : bitmap_bit_p (&lra_split_regs, spill_regno) + ? "split " + : bitmap_bit_p (&lra_optional_reload_pseudos, spill_regno) + ? "optional reload " : "reload "), + spill_regno, reg_renumber[spill_regno], + lra_reg_info[spill_regno].freq, regno); + update_lives (spill_regno, true); + lra_setup_reg_renumber (spill_regno, -1, false); + } + bitmap_ior_into (spilled_pseudo_bitmap, &best_spill_pseudos_bitmap); + return best_hard_regno; +} + +/* Assign HARD_REGNO to REGNO. */ +static void +assign_hard_regno (int hard_regno, int regno) +{ + int i; + + lra_assert (hard_regno >= 0); + lra_setup_reg_renumber (regno, hard_regno, true); + update_lives (regno, false); + for (i = 0; + i < hard_regno_nregs[hard_regno][lra_reg_info[regno].biggest_mode]; + i++) + df_set_regs_ever_live (hard_regno + i, true); +} + +/* Array used for sorting different pseudos. */ +static int *sorted_pseudos; + +/* The constraints pass is allowed to create equivalences between + pseudos that make the current allocation "incorrect" (in the sense + that pseudos are assigned to hard registers from their own conflict + sets). The global variable lra_risky_transformations_p says + whether this might have happened. + + Process pseudos assigned to hard registers (less frequently used + first), spill if a conflict is found, and mark the spilled pseudos + in SPILLED_PSEUDO_BITMAP. 
Set up LIVE_HARD_REG_PSEUDOS from + pseudos, assigned to hard registers. */ +static void +setup_live_pseudos_and_spill_after_risky_transforms (bitmap + spilled_pseudo_bitmap) +{ + int p, i, j, n, regno, hard_regno; + unsigned int k, conflict_regno; + int val; + HARD_REG_SET conflict_set; + enum machine_mode mode; + lra_live_range_t r; + bitmap_iterator bi; + int max_regno = max_reg_num (); + + if (! lra_risky_transformations_p) + { + for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) + if (reg_renumber[i] >= 0 && lra_reg_info[i].nrefs > 0) + update_lives (i, false); + return; + } + for (n = 0, i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) + if (reg_renumber[i] >= 0 && lra_reg_info[i].nrefs > 0) + sorted_pseudos[n++] = i; + qsort (sorted_pseudos, n, sizeof (int), pseudo_compare_func); + for (i = n - 1; i >= 0; i--) + { + regno = sorted_pseudos[i]; + hard_regno = reg_renumber[regno]; + lra_assert (hard_regno >= 0); + mode = lra_reg_info[regno].biggest_mode; + sparseset_clear (live_range_hard_reg_pseudos); + for (r = lra_reg_info[regno].live_ranges; r != NULL; r = r->next) + { + EXECUTE_IF_SET_IN_BITMAP (&live_hard_reg_pseudos[r->start], 0, k, bi) + sparseset_set_bit (live_range_hard_reg_pseudos, k); + for (p = r->start + 1; p <= r->finish; p++) + { + lra_live_range_t r2; + + for (r2 = start_point_ranges[p]; + r2 != NULL; + r2 = r2->start_next) + if (live_pseudos_reg_renumber[r2->regno] >= 0) + sparseset_set_bit (live_range_hard_reg_pseudos, r2->regno); + } + } + COPY_HARD_REG_SET (conflict_set, lra_no_alloc_regs); + IOR_HARD_REG_SET (conflict_set, lra_reg_info[regno].conflict_hard_regs); + val = lra_reg_info[regno].val; + EXECUTE_IF_SET_IN_SPARSESET (live_range_hard_reg_pseudos, conflict_regno) + if (val != lra_reg_info[conflict_regno].val + /* If it is multi-register pseudos they should start on + the same hard register. 
*/ + || hard_regno != reg_renumber[conflict_regno]) + add_to_hard_reg_set (&conflict_set, + lra_reg_info[conflict_regno].biggest_mode, + reg_renumber[conflict_regno]); + if (! overlaps_hard_reg_set_p (conflict_set, mode, hard_regno)) + { + update_lives (regno, false); + continue; + } + bitmap_set_bit (spilled_pseudo_bitmap, regno); + for (j = 0; + j < hard_regno_nregs[hard_regno][PSEUDO_REGNO_MODE (regno)]; + j++) + lra_hard_reg_usage[hard_regno + j] -= lra_reg_info[regno].freq; + reg_renumber[regno] = -1; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Spill r%d after risky transformations\n", + regno); + } +} + +/* Improve allocation by assigning the same hard regno of inheritance + pseudos to the connected pseudos. We need this because inheritance + pseudos are allocated after reload pseudos in the thread and when + we assign a hard register to a reload pseudo we don't know yet that + the connected inheritance pseudos can get the same hard register. + Add pseudos with changed allocation to bitmap CHANGED_PSEUDOS. */ +static void +improve_inheritance (bitmap changed_pseudos) +{ + unsigned int k; + int regno, another_regno, hard_regno, another_hard_regno, cost, i, n; + lra_copy_t cp, next_cp; + bitmap_iterator bi; + + n = 0; + EXECUTE_IF_SET_IN_BITMAP (&lra_inheritance_pseudos, 0, k, bi) + if (reg_renumber[k] >= 0 && lra_reg_info[k].nrefs != 0) + sorted_pseudos[n++] = k; + qsort (sorted_pseudos, n, sizeof (int), pseudo_compare_func); + for (i = 0; i < n; i++) + { + regno = sorted_pseudos[i]; + hard_regno = reg_renumber[regno]; + lra_assert (hard_regno >= 0); + for (cp = lra_reg_info[regno].copies; cp != NULL; cp = next_cp) + { + if (cp->regno1 == regno) + { + next_cp = cp->regno1_next; + another_regno = cp->regno2; + } + else if (cp->regno2 == regno) + { + next_cp = cp->regno2_next; + another_regno = cp->regno1; + } + else + gcc_unreachable (); + /* Don't change reload pseudo allocation. 
It might have + this allocation for a purpose and changing it can result + in LRA cycling. */ + if ((another_regno < lra_constraint_new_regno_start + || bitmap_bit_p (&lra_inheritance_pseudos, another_regno)) + && (another_hard_regno = reg_renumber[another_regno]) >= 0 + && another_hard_regno != hard_regno) + { + if (lra_dump_file != NULL) + fprintf + (lra_dump_file, + " Improving inheritance for %d(%d) and %d(%d)...\n", + regno, hard_regno, another_regno, another_hard_regno); + update_lives (another_regno, true); + lra_setup_reg_renumber (another_regno, -1, false); + if (hard_regno + == find_hard_regno_for (another_regno, &cost, hard_regno)) + assign_hard_regno (hard_regno, another_regno); + else + assign_hard_regno (another_hard_regno, another_regno); + bitmap_set_bit (changed_pseudos, another_regno); + } + } + } +} + + +/* Bitmap finally containing all pseudos spilled on this assignment + pass. */ +static bitmap_head all_spilled_pseudos; +/* All pseudos whose allocation was changed. */ +static bitmap_head changed_pseudo_bitmap; + +/* Assign hard registers to reload pseudos and other pseudos. 
*/ +static void +assign_by_spills (void) +{ + int i, n, nfails, iter, regno, hard_regno, cost, restore_regno; + rtx insn; + basic_block bb; + bitmap_head changed_insns, do_not_assign_nonreload_pseudos; + bitmap_head non_reload_pseudos; + unsigned int u; + bitmap_iterator bi; + int max_regno = max_reg_num (); + + for (n = 0, i = lra_constraint_new_regno_start; i < max_regno; i++) + if (reg_renumber[i] < 0 && lra_reg_info[i].nrefs != 0 + && regno_allocno_class_array[i] != NO_REGS) + sorted_pseudos[n++] = i; + bitmap_initialize (&insn_conflict_pseudos, &reg_obstack); + bitmap_initialize (&spill_pseudos_bitmap, &reg_obstack); + bitmap_initialize (&best_spill_pseudos_bitmap, &reg_obstack); + update_hard_regno_preference_check = XCNEWVEC (int, max_regno); + curr_update_hard_regno_preference_check = 0; + memset (try_hard_reg_pseudos_check, 0, sizeof (try_hard_reg_pseudos_check)); + for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) + bitmap_initialize (&try_hard_reg_pseudos[i], &reg_obstack); + curr_pseudo_check = 0; + bitmap_initialize (&changed_insns, &reg_obstack); + bitmap_initialize (&non_reload_pseudos, &reg_obstack); + bitmap_ior (&non_reload_pseudos, &lra_inheritance_pseudos, &lra_split_regs); + bitmap_ior_into (&non_reload_pseudos, &lra_optional_reload_pseudos); + for (iter = 0; iter <= 1; iter++) + { + qsort (sorted_pseudos, n, sizeof (int), reload_pseudo_compare_func); + nfails = 0; + for (i = 0; i < n; i++) + { + regno = sorted_pseudos[i]; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Assigning to %d " + "(cl=%s, orig=%d, freq=%d, tfirst=%d, tfreq=%d)...\n", + regno, reg_class_names[regno_allocno_class_array[regno]], + ORIGINAL_REGNO (regno_reg_rtx[regno]), + lra_reg_info[regno].freq, regno_assign_info[regno].first, + regno_assign_info[regno_assign_info[regno].first].freq); + hard_regno = find_hard_regno_for (regno, &cost, -1); + if (hard_regno < 0 + && ! 
bitmap_bit_p (&non_reload_pseudos, regno)) + hard_regno = spill_for (regno, &all_spilled_pseudos); + if (hard_regno < 0) + { + if (! bitmap_bit_p (&non_reload_pseudos, regno)) + sorted_pseudos[nfails++] = regno; + } + else + { + /* This register might have been spilled by the previous + pass. Indicate that it is no longer spilled. */ + bitmap_clear_bit (&all_spilled_pseudos, regno); + assign_hard_regno (hard_regno, regno); + } + } + if (nfails == 0) + break; + lra_assert (iter == 0); + /* This is a very rare event. We can not assign a hard + register to reload pseudo because the hard register was + assigned to another reload pseudo on a previous + assignment pass. For x86 example, on the 1st pass we + assigned CX (although another hard register could be used + for this) to reload pseudo in an insn, on the 2nd pass we + need CX (and only this) hard register for a new reload + pseudo in the same insn. */ + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " 2nd iter for reload pseudo assignments:\n"); + for (i = 0; i < nfails; i++) + { + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Reload r%d assignment failure\n", + sorted_pseudos[i]); + bitmap_ior_into (&changed_insns, + &lra_reg_info[sorted_pseudos[i]].insn_bitmap); + } + FOR_EACH_BB (bb) + FOR_BB_INSNS (bb, insn) + if (bitmap_bit_p (&changed_insns, INSN_UID (insn))) + { + lra_insn_recog_data_t data; + struct lra_insn_reg *r; + + data = lra_get_insn_recog_data (insn); + for (r = data->regs; r != NULL; r = r->next) + { + regno = r->regno; + /* A reload pseudo did not get a hard register on the + first iteration because of the conflict with + another reload pseudos in the same insn. So we + consider only reload pseudos assigned to hard + registers. We shall exclude inheritance pseudos as + they can occur in original insns (not reload ones). + We can omit the check for split pseudos because + they occur only in move insns containing non-reload + pseudos. 
*/ + if (regno < lra_constraint_new_regno_start + || bitmap_bit_p (&lra_inheritance_pseudos, regno) + || reg_renumber[regno] < 0) + continue; + sorted_pseudos[nfails++] = regno; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + " Spill reload r%d(hr=%d, freq=%d)\n", + regno, reg_renumber[regno], + lra_reg_info[regno].freq); + update_lives (regno, true); + lra_setup_reg_renumber (regno, -1, false); + } + } + n = nfails; + } + improve_inheritance (&changed_pseudo_bitmap); + bitmap_clear (&non_reload_pseudos); + bitmap_clear (&changed_insns); + if (! lra_simple_p) + { + /* We should not assign to original pseudos of inheritance + pseudos or split pseudos if any its inheritance pseudo did + not get hard register or any its split pseudo was not split + because undo inheritance/split pass will extend live range of + such inheritance or split pseudos. */ + bitmap_initialize (&do_not_assign_nonreload_pseudos, &reg_obstack); + EXECUTE_IF_SET_IN_BITMAP (&lra_inheritance_pseudos, 0, u, bi) + if ((restore_regno = lra_reg_info[u].restore_regno) >= 0 + && reg_renumber[u] < 0 + && bitmap_bit_p (&lra_inheritance_pseudos, u)) + bitmap_set_bit (&do_not_assign_nonreload_pseudos, restore_regno); + EXECUTE_IF_SET_IN_BITMAP (&lra_split_regs, 0, u, bi) + if ((restore_regno = lra_reg_info[u].restore_regno) >= 0 + && reg_renumber[u] >= 0) + bitmap_set_bit (&do_not_assign_nonreload_pseudos, restore_regno); + for (n = 0, i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) + if (((i < lra_constraint_new_regno_start + && ! 
bitmap_bit_p (&do_not_assign_nonreload_pseudos, i)) + || (bitmap_bit_p (&lra_inheritance_pseudos, i) + && lra_reg_info[i].restore_regno >= 0) + || (bitmap_bit_p (&lra_split_regs, i) + && lra_reg_info[i].restore_regno >= 0) + || bitmap_bit_p (&lra_optional_reload_pseudos, i)) + && reg_renumber[i] < 0 && lra_reg_info[i].nrefs != 0 + && regno_allocno_class_array[i] != NO_REGS) + sorted_pseudos[n++] = i; + bitmap_clear (&do_not_assign_nonreload_pseudos); + if (n != 0 && lra_dump_file != NULL) + fprintf (lra_dump_file, " Reassigning non-reload pseudos\n"); + qsort (sorted_pseudos, n, sizeof (int), pseudo_compare_func); + for (i = 0; i < n; i++) + { + regno = sorted_pseudos[i]; + hard_regno = find_hard_regno_for (regno, &cost, -1); + if (hard_regno >= 0) + { + assign_hard_regno (hard_regno, regno); + /* We change allocation for non-reload pseudo on this + iteration -- mark the pseudo for invalidation of used + alternatives of insns containing the pseudo. */ + bitmap_set_bit (&changed_pseudo_bitmap, regno); + } + } + } + free (update_hard_regno_preference_check); + bitmap_clear (&best_spill_pseudos_bitmap); + bitmap_clear (&spill_pseudos_bitmap); + bitmap_clear (&insn_conflict_pseudos); +} + + +/* Entry function to assign hard registers to new reload pseudos + starting with LRA_CONSTRAINT_NEW_REGNO_START (by possible spilling + of old pseudos) and possibly to the old pseudos. The function adds + what insns to process for the next constraint pass. Those are all + insns who contains non-reload and non-inheritance pseudos with + changed allocation. + + Return true if we did not spill any non-reload and non-inheritance + pseudos. 
*/ +bool +lra_assign (void) +{ + int i; + unsigned int u; + bitmap_iterator bi; + bitmap_head insns_to_process; + bool no_spills_p; + int max_regno = max_reg_num (); + + timevar_push (TV_LRA_ASSIGN); + init_lives (); + sorted_pseudos = XNEWVEC (int, max_regno); + sorted_reload_pseudos = XNEWVEC (int, max_regno); + regno_allocno_class_array = XNEWVEC (enum reg_class, max_regno); + for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) + regno_allocno_class_array[i] = lra_get_allocno_class (i); + init_regno_assign_info (); + bitmap_initialize (&all_spilled_pseudos, &reg_obstack); + create_live_range_start_chains (); + setup_live_pseudos_and_spill_after_risky_transforms (&all_spilled_pseudos); +#ifdef ENABLE_CHECKING + for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) + if (lra_reg_info[i].nrefs != 0 && reg_renumber[i] >= 0 + && lra_reg_info[i].call_p + && overlaps_hard_reg_set_p (call_used_reg_set, + PSEUDO_REGNO_MODE (i), reg_renumber[i])) + gcc_unreachable (); +#endif + /* Setup insns to process on the next constraint pass. */ + bitmap_initialize (&changed_pseudo_bitmap, &reg_obstack); + init_live_reload_and_inheritance_pseudos (); + assign_by_spills (); + finish_live_reload_and_inheritance_pseudos (); + bitmap_ior_into (&changed_pseudo_bitmap, &all_spilled_pseudos); + no_spills_p = true; + EXECUTE_IF_SET_IN_BITMAP (&all_spilled_pseudos, 0, u, bi) + /* We ignore spilled pseudos created on last inheritance pass + because they will be removed. */ + if (lra_reg_info[u].restore_regno < 0) + { + no_spills_p = false; + break; + } + finish_live_range_start_chains (); + bitmap_clear (&all_spilled_pseudos); + bitmap_initialize (&insns_to_process, &reg_obstack); + EXECUTE_IF_SET_IN_BITMAP (&changed_pseudo_bitmap, 0, u, bi) + bitmap_ior_into (&insns_to_process, &lra_reg_info[u].insn_bitmap); + bitmap_clear (&changed_pseudo_bitmap); + EXECUTE_IF_SET_IN_BITMAP (&insns_to_process, 0, u, bi) + { + lra_push_insn_by_uid (u); + /* Invalidate alternatives for insn should be processed. 
*/ + lra_set_used_insn_alternative_by_uid (u, -1); + } + bitmap_clear (&insns_to_process); + finish_regno_assign_info (); + free (regno_allocno_class_array); + free (sorted_pseudos); + free (sorted_reload_pseudos); + finish_lives (); + timevar_pop (TV_LRA_ASSIGN); + return no_spills_p; +} diff --git a/gcc/lra-coalesce.c b/gcc/lra-coalesce.c new file mode 100644 index 00000000000..57c3111b922 --- /dev/null +++ b/gcc/lra-coalesce.c @@ -0,0 +1,351 @@ +/* Coalesce spilled pseudos. + Copyright (C) 2010, 2011, 2012 + Free Software Foundation, Inc. + Contributed by Vladimir Makarov <vmakarov@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + + +/* This file contains a pass making some simple RTL code + transformations by coalescing pseudos to remove some move insns. + + Spilling pseudos in LRA can create memory-memory moves. We should + remove potential memory-memory moves before the next constraint + pass because the constraint pass will generate additional insns for + such moves and all these insns will be hard to remove afterwards. + + Here we coalesce only spilled pseudos. Coalescing non-spilled + pseudos (with different hard regs) might result in spilling + additional pseudos because of possible conflicts with other + non-spilled pseudos and, as a consequence, in more constraint + passes and even LRA infinite cycling. 
Trivial the same hard + register moves will be removed by subsequent compiler passes. + + We don't coalesce special reload pseudos. It complicates LRA code + a lot without visible generated code improvement. + + The pseudo live-ranges are used to find conflicting pseudos during + coalescing. + + Most frequently executed moves is tried to be coalesced first. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "rtl.h" +#include "tm_p.h" +#include "insn-config.h" +#include "recog.h" +#include "output.h" +#include "regs.h" +#include "hard-reg-set.h" +#include "flags.h" +#include "function.h" +#include "expr.h" +#include "basic-block.h" +#include "except.h" +#include "timevar.h" +#include "ira.h" +#include "lra-int.h" +#include "df.h" + +/* Arrays whose elements represent the first and the next pseudo + (regno) in the coalesced pseudos group to which given pseudo (its + regno is the index) belongs. The next of the last pseudo in the + group refers to the first pseudo in the group, in other words the + group is represented by a cyclic list. */ +static int *first_coalesced_pseudo, *next_coalesced_pseudo; + +/* The function is used to sort moves according to their execution + frequencies. */ +static int +move_freq_compare_func (const void *v1p, const void *v2p) +{ + rtx mv1 = *(const rtx *) v1p; + rtx mv2 = *(const rtx *) v2p; + int pri1, pri2; + + pri1 = BLOCK_FOR_INSN (mv1)->frequency; + pri2 = BLOCK_FOR_INSN (mv2)->frequency; + if (pri2 - pri1) + return pri2 - pri1; + + /* If frequencies are equal, sort by moves, so that the results of + qsort leave nothing to chance. */ + return (int) INSN_UID (mv1) - (int) INSN_UID (mv2); +} + +/* Pseudos which go away after coalescing. */ +static bitmap_head coalesced_pseudos_bitmap; + +/* Merge two sets of coalesced pseudos given correspondingly by + pseudos REGNO1 and REGNO2 (more accurately merging REGNO2 group + into REGNO1 group). Set up COALESCED_PSEUDOS_BITMAP. 
*/ +static void +merge_pseudos (int regno1, int regno2) +{ + int regno, first, first2, last, next; + + first = first_coalesced_pseudo[regno1]; + if ((first2 = first_coalesced_pseudo[regno2]) == first) + return; + for (last = regno2, regno = next_coalesced_pseudo[regno2];; + regno = next_coalesced_pseudo[regno]) + { + first_coalesced_pseudo[regno] = first; + bitmap_set_bit (&coalesced_pseudos_bitmap, regno); + if (regno == regno2) + break; + last = regno; + } + next = next_coalesced_pseudo[first]; + next_coalesced_pseudo[first] = regno2; + next_coalesced_pseudo[last] = next; + lra_reg_info[first].live_ranges + = (lra_merge_live_ranges + (lra_reg_info[first].live_ranges, + lra_copy_live_range_list (lra_reg_info[first2].live_ranges))); + if (GET_MODE_SIZE (lra_reg_info[first].biggest_mode) + < GET_MODE_SIZE (lra_reg_info[first2].biggest_mode)) + lra_reg_info[first].biggest_mode = lra_reg_info[first2].biggest_mode; +} + +/* Change pseudos in *LOC on their coalescing group + representatives. */ +static bool +substitute (rtx *loc) +{ + int i, regno; + const char *fmt; + enum rtx_code code; + bool res; + + if (*loc == NULL_RTX) + return false; + code = GET_CODE (*loc); + if (code == REG) + { + regno = REGNO (*loc); + if (regno < FIRST_PSEUDO_REGISTER + || first_coalesced_pseudo[regno] == regno) + return false; + *loc = regno_reg_rtx[first_coalesced_pseudo[regno]]; + return true; + } + + res = false; + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + { + if (substitute (&XEXP (*loc, i))) + res = true; + } + else if (fmt[i] == 'E') + { + int j; + + for (j = XVECLEN (*loc, i) - 1; j >= 0; j--) + if (substitute (&XVECEXP (*loc, i, j))) + res = true; + } + } + return res; +} + +/* The current iteration (1, 2, ...) of the coalescing pass. */ +int lra_coalesce_iter; + +/* Return true if the move involving REGNO1 and REGNO2 is a potential + memory-memory move. 
*/ +static bool +mem_move_p (int regno1, int regno2) +{ + return reg_renumber[regno1] < 0 && reg_renumber[regno2] < 0; +} + +/* Pseudos used instead of the coalesced pseudos. */ +static bitmap_head used_pseudos_bitmap; + +/* Set up USED_PSEUDOS_BITMAP, and update LR_BITMAP (a BB live info + bitmap). */ +static void +update_live_info (bitmap lr_bitmap) +{ + unsigned int j; + bitmap_iterator bi; + + bitmap_clear (&used_pseudos_bitmap); + EXECUTE_IF_AND_IN_BITMAP (&coalesced_pseudos_bitmap, lr_bitmap, + FIRST_PSEUDO_REGISTER, j, bi) + bitmap_set_bit (&used_pseudos_bitmap, first_coalesced_pseudo[j]); + if (! bitmap_empty_p (&used_pseudos_bitmap)) + { + bitmap_and_compl_into (lr_bitmap, &coalesced_pseudos_bitmap); + bitmap_ior_into (lr_bitmap, &used_pseudos_bitmap); + } +} + +/* Return true if pseudo REGNO can be potentially coalesced. Use + SPLIT_PSEUDO_BITMAP to find pseudos whose live ranges were + split. */ +static bool +coalescable_pseudo_p (int regno, bitmap split_origin_bitmap) +{ + lra_assert (regno >= FIRST_PSEUDO_REGISTER); + /* Don't coalesce inheritance pseudos because spilled inheritance + pseudos will be removed in subsequent 'undo inheritance' + pass. */ + return (lra_reg_info[regno].restore_regno < 0 + /* We undo splits for spilled pseudos whose live ranges were + split. So don't coalesce them, it is not necessary and + the undo transformations would be wrong. */ + && ! bitmap_bit_p (split_origin_bitmap, regno) + /* We don't want to coalesce regnos with equivalences, at + least without updating this info. */ + && ira_reg_equiv[regno].constant == NULL_RTX + && ira_reg_equiv[regno].memory == NULL_RTX + && ira_reg_equiv[regno].invariant == NULL_RTX); +} + +/* The major function for aggressive pseudo coalescing of moves only + if the both pseudos were spilled and not special reload pseudos. 
*/ +bool +lra_coalesce (void) +{ + basic_block bb; + rtx mv, set, insn, next, *sorted_moves; + int i, mv_num, sregno, dregno, restore_regno; + unsigned int regno; + int coalesced_moves; + int max_regno = max_reg_num (); + bitmap_head involved_insns_bitmap, split_origin_bitmap; + bitmap_iterator bi; + + timevar_push (TV_LRA_COALESCE); + + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + "\n********** Pseudos coalescing #%d: **********\n\n", + ++lra_coalesce_iter); + first_coalesced_pseudo = XNEWVEC (int, max_regno); + next_coalesced_pseudo = XNEWVEC (int, max_regno); + for (i = 0; i < max_regno; i++) + first_coalesced_pseudo[i] = next_coalesced_pseudo[i] = i; + sorted_moves = XNEWVEC (rtx, get_max_uid ()); + mv_num = 0; + /* Collect pseudos whose live ranges were split. */ + bitmap_initialize (&split_origin_bitmap, &reg_obstack); + EXECUTE_IF_SET_IN_BITMAP (&lra_split_regs, 0, regno, bi) + if ((restore_regno = lra_reg_info[regno].restore_regno) >= 0) + bitmap_set_bit (&split_origin_bitmap, restore_regno); + /* Collect moves. */ + coalesced_moves = 0; + FOR_EACH_BB (bb) + { + FOR_BB_INSNS_SAFE (bb, insn, next) + if (INSN_P (insn) + && (set = single_set (insn)) != NULL_RTX + && REG_P (SET_DEST (set)) && REG_P (SET_SRC (set)) + && (sregno = REGNO (SET_SRC (set))) >= FIRST_PSEUDO_REGISTER + && (dregno = REGNO (SET_DEST (set))) >= FIRST_PSEUDO_REGISTER + && mem_move_p (sregno, dregno) + && coalescable_pseudo_p (sregno, &split_origin_bitmap) + && coalescable_pseudo_p (dregno, &split_origin_bitmap) + && ! side_effects_p (set) + && !(lra_intersected_live_ranges_p + (lra_reg_info[sregno].live_ranges, + lra_reg_info[dregno].live_ranges))) + sorted_moves[mv_num++] = insn; + } + bitmap_clear (&split_origin_bitmap); + qsort (sorted_moves, mv_num, sizeof (rtx), move_freq_compare_func); + /* Coalesced copies, most frequently executed first. 
*/ + bitmap_initialize (&coalesced_pseudos_bitmap, &reg_obstack); + bitmap_initialize (&involved_insns_bitmap, &reg_obstack); + for (i = 0; i < mv_num; i++) + { + mv = sorted_moves[i]; + set = single_set (mv); + lra_assert (set != NULL && REG_P (SET_SRC (set)) + && REG_P (SET_DEST (set))); + sregno = REGNO (SET_SRC (set)); + dregno = REGNO (SET_DEST (set)); + if (first_coalesced_pseudo[sregno] == first_coalesced_pseudo[dregno]) + { + coalesced_moves++; + if (lra_dump_file != NULL) + fprintf + (lra_dump_file, " Coalescing move %i:r%d-r%d (freq=%d)\n", + INSN_UID (mv), sregno, dregno, + BLOCK_FOR_INSN (mv)->frequency); + /* We updated involved_insns_bitmap when doing the merge. */ + } + else if (!(lra_intersected_live_ranges_p + (lra_reg_info[first_coalesced_pseudo[sregno]].live_ranges, + lra_reg_info[first_coalesced_pseudo[dregno]].live_ranges))) + { + coalesced_moves++; + if (lra_dump_file != NULL) + fprintf + (lra_dump_file, + " Coalescing move %i:r%d(%d)-r%d(%d) (freq=%d)\n", + INSN_UID (mv), sregno, ORIGINAL_REGNO (SET_SRC (set)), + dregno, ORIGINAL_REGNO (SET_DEST (set)), + BLOCK_FOR_INSN (mv)->frequency); + bitmap_ior_into (&involved_insns_bitmap, + &lra_reg_info[sregno].insn_bitmap); + bitmap_ior_into (&involved_insns_bitmap, + &lra_reg_info[dregno].insn_bitmap); + merge_pseudos (sregno, dregno); + } + } + bitmap_initialize (&used_pseudos_bitmap, &reg_obstack); + FOR_EACH_BB (bb) + { + update_live_info (df_get_live_in (bb)); + update_live_info (df_get_live_out (bb)); + FOR_BB_INSNS_SAFE (bb, insn, next) + if (INSN_P (insn) + && bitmap_bit_p (&involved_insns_bitmap, INSN_UID (insn))) + { + if (! substitute (&insn)) + continue; + lra_update_insn_regno_info (insn); + if ((set = single_set (insn)) != NULL_RTX && set_noop_p (set)) + { + /* Coalesced move. 
*/ + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Removing move %i (freq=%d)\n", + INSN_UID (insn), BLOCK_FOR_INSN (insn)->frequency); + lra_set_insn_deleted (insn); + } + } + } + bitmap_clear (&used_pseudos_bitmap); + bitmap_clear (&involved_insns_bitmap); + bitmap_clear (&coalesced_pseudos_bitmap); + if (lra_dump_file != NULL && coalesced_moves != 0) + fprintf (lra_dump_file, "Coalesced Moves = %d\n", coalesced_moves); + free (sorted_moves); + free (next_coalesced_pseudo); + free (first_coalesced_pseudo); + timevar_pop (TV_LRA_COALESCE); + return coalesced_moves != 0; +} diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c new file mode 100644 index 00000000000..ec48e9ea02c --- /dev/null +++ b/gcc/lra-constraints.c @@ -0,0 +1,5130 @@ +/* Code for RTL transformations to satisfy insn constraints. + Copyright (C) 2010, 2011, 2012 + Free Software Foundation, Inc. + Contributed by Vladimir Makarov <vmakarov@redhat.com>. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it under + the terms of the GNU General Public License as published by the Free + Software Foundation; either version 3, or (at your option) any later + version. + + GCC is distributed in the hope that it will be useful, but WITHOUT ANY + WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>. */ + + +/* This file contains code for 3 passes: constraint pass, + inheritance/split pass, and pass for undoing failed inheritance and + split. 
+ + The major goal of constraint pass is to transform RTL to satisfy + insn and address constraints by: + o choosing insn alternatives; + o generating *reload insns* (or reloads in brief) and *reload + pseudos* which will get necessary hard registers later; + o substituting pseudos with equivalent values and removing the + instructions that initialized those pseudos. + + The constraint pass has biggest and most complicated code in LRA. + There are a lot of important details like: + o reuse of input reload pseudos to simplify reload pseudo + allocations; + o some heuristics to choose insn alternative to improve the + inheritance; + o early clobbers etc. + + The pass is mimicking former reload pass in alternative choosing + because the reload pass is oriented to current machine description + model. It might be changed if the machine description model is + changed. + + There is special code for preventing all LRA and this pass cycling + in case of bugs. + + On the first iteration of the pass we process every instruction and + choose an alternative for each one. On subsequent iterations we try + to avoid reprocessing instructions if we can be sure that the old + choice is still valid. + + The inheritance/split pass is to transform code to achieve + inheritance and live range splitting. It is done on backward + traversal of EBBs. + + The inheritance optimization goal is to reuse values in hard + registers. There is analogous optimization in old reload pass. The + inheritance is achieved by following transformation: + + reload_p1 <- p reload_p1 <- p + ... new_p <- reload_p1 + ... => ... + reload_p2 <- p reload_p2 <- new_p + + where p is spilled and not changed between the insns. Reload_p1 is + also called *original pseudo* and new_p is called *inheritance + pseudo*. + + The subsequent assignment pass will try to assign the same (or + another if it is not possible) hard register to new_p as to + reload_p1 or reload_p2. 
+ + If the assignment pass fails to assign a hard register to new_p, + this file will undo the inheritance and restore the original code. + This is because implementing the above sequence with a spilled + new_p would make the code much worse. The inheritance is done in + EBB scope. The above is just a simplified example to get an idea + of the inheritance as the inheritance is also done for non-reload + insns. + + Splitting (transformation) is also done in EBB scope on the same + pass as the inheritance: + + r <- ... or ... <- r r <- ... or ... <- r + ... s <- r (new insn -- save) + ... => + ... r <- s (new insn -- restore) + ... <- r ... <- r + + The *split pseudo* s is assigned to the hard register of the + original pseudo or hard register r. + + Splitting is done: + o In EBBs with high register pressure for global pseudos (living + in at least 2 BBs) and assigned to hard registers when there + are more one reloads needing the hard registers; + o for pseudos needing save/restore code around calls. + + If the split pseudo still has the same hard register as the + original pseudo after the subsequent assignment pass or the + original pseudo was split, the opposite transformation is done on + the same pass for undoing inheritance. */ + +#undef REG_OK_STRICT + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "hard-reg-set.h" +#include "rtl.h" +#include "tm_p.h" +#include "regs.h" +#include "insn-config.h" +#include "insn-codes.h" +#include "recog.h" +#include "output.h" +#include "addresses.h" +#include "target.h" +#include "function.h" +#include "expr.h" +#include "basic-block.h" +#include "except.h" +#include "optabs.h" +#include "df.h" +#include "ira.h" +#include "rtl-error.h" +#include "lra-int.h" + +/* Value of LRA_CURR_RELOAD_NUM at the beginning of BB of the current + insn. Remember that LRA_CURR_RELOAD_NUM is the number of emitted + reload insns. 
*/ +static int bb_reload_num; + +/* The current insn being processed and corresponding its data (basic + block, the insn data, the insn static data, and the mode of each + operand). */ +static rtx curr_insn; +static basic_block curr_bb; +static lra_insn_recog_data_t curr_id; +static struct lra_static_insn_data *curr_static_id; +static enum machine_mode curr_operand_mode[MAX_RECOG_OPERANDS]; + + + +/* Start numbers for new registers and insns at the current constraints + pass start. */ +static int new_regno_start; +static int new_insn_uid_start; + +/* Return hard regno of REGNO or if it is was not assigned to a hard + register, use a hard register from its allocno class. */ +static int +get_try_hard_regno (int regno) +{ + int hard_regno; + enum reg_class rclass; + + if ((hard_regno = regno) >= FIRST_PSEUDO_REGISTER) + hard_regno = lra_get_regno_hard_regno (regno); + if (hard_regno >= 0) + return hard_regno; + rclass = lra_get_allocno_class (regno); + if (rclass == NO_REGS) + return -1; + return ira_class_hard_regs[rclass][0]; +} + +/* Return final hard regno (plus offset) which will be after + elimination. We do this for matching constraints because the final + hard regno could have a different class. */ +static int +get_final_hard_regno (int hard_regno, int offset) +{ + if (hard_regno < 0) + return hard_regno; + hard_regno = lra_get_elimination_hard_regno (hard_regno); + return hard_regno + offset; +} + +/* Return hard regno of X after removing subreg and making + elimination. If X is not a register or subreg of register, return + -1. For pseudo use its assignment. */ +static int +get_hard_regno (rtx x) +{ + rtx reg; + int offset, hard_regno; + + reg = x; + if (GET_CODE (x) == SUBREG) + reg = SUBREG_REG (x); + if (! 
REG_P (reg)) + return -1; + if ((hard_regno = REGNO (reg)) >= FIRST_PSEUDO_REGISTER) + hard_regno = lra_get_regno_hard_regno (hard_regno); + if (hard_regno < 0) + return -1; + offset = 0; + if (GET_CODE (x) == SUBREG) + offset += subreg_regno_offset (hard_regno, GET_MODE (reg), + SUBREG_BYTE (x), GET_MODE (x)); + return get_final_hard_regno (hard_regno, offset); +} + +/* If REGNO is a hard register or has been allocated a hard register, + return the class of that register. If REGNO is a reload pseudo + created by the current constraints pass, return its allocno class. + Return NO_REGS otherwise. */ +static enum reg_class +get_reg_class (int regno) +{ + int hard_regno; + + if ((hard_regno = regno) >= FIRST_PSEUDO_REGISTER) + hard_regno = lra_get_regno_hard_regno (regno); + if (hard_regno >= 0) + { + hard_regno = get_final_hard_regno (hard_regno, 0); + return REGNO_REG_CLASS (hard_regno); + } + if (regno >= new_regno_start) + return lra_get_allocno_class (regno); + return NO_REGS; +} + +/* Return true if REG satisfies (or will satisfy) reg class constraint + CL. Use elimination first if REG is a hard register. If REG is a + reload pseudo created by this constraints pass, assume that it will + be allocated a hard register from its allocno class, but allow that + class to be narrowed to CL if it is currently a superset of CL. + + If NEW_CLASS is nonnull, set *NEW_CLASS to the new allocno class of + REGNO (reg), or NO_REGS if no change in its class was needed. 
*/ +static bool +in_class_p (rtx reg, enum reg_class cl, enum reg_class *new_class) +{ + enum reg_class rclass, common_class; + enum machine_mode reg_mode; + int class_size, hard_regno, nregs, i, j; + int regno = REGNO (reg); + + if (new_class != NULL) + *new_class = NO_REGS; + if (regno < FIRST_PSEUDO_REGISTER) + { + rtx final_reg = reg; + rtx *final_loc = &final_reg; + + lra_eliminate_reg_if_possible (final_loc); + return TEST_HARD_REG_BIT (reg_class_contents[cl], REGNO (*final_loc)); + } + reg_mode = GET_MODE (reg); + rclass = get_reg_class (regno); + if (regno < new_regno_start + /* Do not allow the constraints for reload instructions to + influence the classes of new pseudos. These reloads are + typically moves that have many alternatives, and restricting + reload pseudos for one alternative may lead to situations + where other reload pseudos are no longer allocatable. */ + || INSN_UID (curr_insn) >= new_insn_uid_start) + /* When we don't know what class will be used finally for reload + pseudos, we use ALL_REGS. */ + return ((regno >= new_regno_start && rclass == ALL_REGS) + || (rclass != NO_REGS && ira_class_subset_p[rclass][cl] + && ! hard_reg_set_subset_p (reg_class_contents[cl], + lra_no_alloc_regs))); + else + { + common_class = ira_reg_class_subset[rclass][cl]; + if (new_class != NULL) + *new_class = common_class; + if (hard_reg_set_subset_p (reg_class_contents[common_class], + lra_no_alloc_regs)) + return false; + /* Check that there are enough allocatable regs. */ + class_size = ira_class_hard_regs_num[common_class]; + for (i = 0; i < class_size; i++) + { + hard_regno = ira_class_hard_regs[common_class][i]; + nregs = hard_regno_nregs[hard_regno][reg_mode]; + if (nregs == 1) + return true; + for (j = 0; j < nregs; j++) + if (TEST_HARD_REG_BIT (lra_no_alloc_regs, hard_regno + j)) + break; + if (j >= nregs) + return true; + } + return false; + } +} + +/* Return true if REGNO satisfies a memory constraint. 
*/ +static bool +in_mem_p (int regno) +{ + return get_reg_class (regno) == NO_REGS; +} + +/* If we have decided to substitute X with another value, return that + value, otherwise return X. */ +static rtx +get_equiv_substitution (rtx x) +{ + int regno; + rtx res; + + if (! REG_P (x) || (regno = REGNO (x)) < FIRST_PSEUDO_REGISTER + || ! ira_reg_equiv[regno].defined_p + || ! ira_reg_equiv[regno].profitable_p + || lra_get_regno_hard_regno (regno) >= 0) + return x; + if ((res = ira_reg_equiv[regno].memory) != NULL_RTX) + return res; + if ((res = ira_reg_equiv[regno].constant) != NULL_RTX) + return res; + if ((res = ira_reg_equiv[regno].invariant) != NULL_RTX) + return res; + gcc_unreachable (); +} + +/* Set up curr_operand_mode. */ +static void +init_curr_operand_mode (void) +{ + int nop = curr_static_id->n_operands; + for (int i = 0; i < nop; i++) + { + enum machine_mode mode = GET_MODE (*curr_id->operand_loc[i]); + if (mode == VOIDmode) + { + /* The .md mode for address operands is the mode of the + addressed value rather than the mode of the address itself. */ + if (curr_id->icode >= 0 && curr_static_id->operand[i].is_address) + mode = Pmode; + else + mode = curr_static_id->operand[i].mode; + } + curr_operand_mode[i] = mode; + } +} + + + +/* The page contains code to reuse input reloads. */ + +/* Structure describes input reload of the current insns. */ +struct input_reload +{ + /* Reloaded value. */ + rtx input; + /* Reload pseudo used. */ + rtx reg; +}; + +/* The number of elements in the following array. */ +static int curr_insn_input_reloads_num; +/* Array containing info about input reloads. It is used to find the + same input reload and reuse the reload pseudo in this case. */ +static struct input_reload curr_insn_input_reloads[LRA_MAX_INSN_RELOADS]; + +/* Initiate data concerning reuse of input reloads for the current + insn. 
*/ +static void +init_curr_insn_input_reloads (void) +{ + curr_insn_input_reloads_num = 0; +} + +/* Change class of pseudo REGNO to NEW_CLASS. Print info about it + using TITLE. Output a new line if NL_P. */ +static void +change_class (int regno, enum reg_class new_class, + const char *title, bool nl_p) +{ + lra_assert (regno >= FIRST_PSEUDO_REGISTER); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, "%s to class %s for r%d", + title, reg_class_names[new_class], regno); + setup_reg_classes (regno, new_class, NO_REGS, new_class); + if (lra_dump_file != NULL && nl_p) + fprintf (lra_dump_file, "\n"); +} + +/* Create a new pseudo using MODE, RCLASS, ORIGINAL or reuse already + created input reload pseudo (only if TYPE is not OP_OUT). The + result pseudo is returned through RESULT_REG. Return TRUE if we + created a new pseudo, FALSE if we reused the already created input + reload pseudo. Use TITLE to describe new registers for debug + purposes. */ +static bool +get_reload_reg (enum op_type type, enum machine_mode mode, rtx original, + enum reg_class rclass, const char *title, rtx *result_reg) +{ + int i, regno; + enum reg_class new_class; + + if (type == OP_OUT) + { + *result_reg + = lra_create_new_reg_with_unique_value (mode, original, rclass, title); + return true; + } + for (i = 0; i < curr_insn_input_reloads_num; i++) + if (rtx_equal_p (curr_insn_input_reloads[i].input, original) + && in_class_p (curr_insn_input_reloads[i].reg, rclass, &new_class)) + { + lra_assert (! 
side_effects_p (original)); + *result_reg = curr_insn_input_reloads[i].reg; + regno = REGNO (*result_reg); + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, " Reuse r%d for reload ", regno); + print_value_slim (lra_dump_file, original, 1); + } + if (rclass != new_class) + change_class (regno, new_class, ", change", false); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, "\n"); + return false; + } + *result_reg = lra_create_new_reg (mode, original, rclass, title); + lra_assert (curr_insn_input_reloads_num < LRA_MAX_INSN_RELOADS); + curr_insn_input_reloads[curr_insn_input_reloads_num].input = original; + curr_insn_input_reloads[curr_insn_input_reloads_num++].reg = *result_reg; + return true; +} + + + +/* The page contains code to extract memory address parts. */ + +/* Info about base and index regs of an address. In some rare cases, + base/index register can be actually memory. In this case we will + reload it. */ +struct address +{ + /* NULL if there is no a base register. */ + rtx *base_reg_loc; + /* Second location of {post/pre}_modify, NULL otherwise. */ + rtx *base_reg_loc2; + /* NULL if there is no an index register. */ + rtx *index_reg_loc; + /* Location of index reg * scale or index_reg_loc otherwise. */ + rtx *index_loc; + /* NULL if there is no a displacement. */ + rtx *disp_loc; + /* Defined if base_reg_loc is not NULL. */ + enum rtx_code base_outer_code, index_code; + /* True if the base register is modified in the address, for + example, in PRE_INC. */ + bool base_modify_p; +}; + +/* Wrapper around REGNO_OK_FOR_INDEX_P, to allow pseudos. */ +static inline bool +ok_for_index_p_nonstrict (rtx reg) +{ + unsigned regno = REGNO (reg); + + return regno >= FIRST_PSEUDO_REGISTER || REGNO_OK_FOR_INDEX_P (regno); +} + +/* A version of regno_ok_for_base_p for use here, when all pseudos + should count as OK. Arguments as for regno_ok_for_base_p. 
*/ +static inline bool +ok_for_base_p_nonstrict (rtx reg, enum machine_mode mode, addr_space_t as, + enum rtx_code outer_code, enum rtx_code index_code) +{ + unsigned regno = REGNO (reg); + + if (regno >= FIRST_PSEUDO_REGISTER) + return true; + return ok_for_base_p_1 (regno, mode, as, outer_code, index_code); +} + +/* Process address part in space AS (or all address if TOP_P) with + location *LOC to extract address characteristics. + + If CONTEXT_P is false, we are looking at the base part of an + address, otherwise we are looking at the index part. + + MODE is the mode of the memory reference; OUTER_CODE and INDEX_CODE + give the context that the rtx appears in; MODIFY_P if *LOC is + modified. */ +static void +extract_loc_address_regs (bool top_p, enum machine_mode mode, addr_space_t as, + rtx *loc, bool context_p, enum rtx_code outer_code, + enum rtx_code index_code, + bool modify_p, struct address *ad) +{ + rtx x = *loc; + enum rtx_code code = GET_CODE (x); + bool base_ok_p; + + switch (code) + { + case CONST_INT: + case CONST: + case SYMBOL_REF: + case LABEL_REF: + if (! context_p) + { + lra_assert (top_p); + ad->disp_loc = loc; + } + return; + + case CC0: + case PC: + return; + + case PLUS: + case LO_SUM: + /* When we have an address that is a sum, we must determine + whether registers are "base" or "index" regs. If there is a + sum of two registers, we must choose one to be the + "base". */ + { + rtx *arg0_loc = &XEXP (x, 0); + rtx *arg1_loc = &XEXP (x, 1); + rtx *tloc; + rtx arg0 = *arg0_loc; + rtx arg1 = *arg1_loc; + enum rtx_code code0 = GET_CODE (arg0); + enum rtx_code code1 = GET_CODE (arg1); + + /* Look inside subregs. 
*/ + if (code0 == SUBREG) + { + arg0_loc = &SUBREG_REG (arg0); + arg0 = *arg0_loc; + code0 = GET_CODE (arg0); + } + if (code1 == SUBREG) + { + arg1_loc = &SUBREG_REG (arg1); + arg1 = *arg1_loc; + code1 = GET_CODE (arg1); + } + + if (CONSTANT_P (arg0) + || code1 == PLUS || code1 == MULT || code1 == ASHIFT) + { + tloc = arg1_loc; + arg1_loc = arg0_loc; + arg0_loc = tloc; + arg0 = *arg0_loc; + code0 = GET_CODE (arg0); + arg1 = *arg1_loc; + code1 = GET_CODE (arg1); + } + /* If this machine only allows one register per address, it + must be in the first operand. */ + if (MAX_REGS_PER_ADDRESS == 1 || code == LO_SUM) + { + lra_assert (ad->disp_loc == NULL); + ad->disp_loc = arg1_loc; + extract_loc_address_regs (false, mode, as, arg0_loc, false, code, + code1, modify_p, ad); + } + /* Base + disp addressing */ + else if (code0 != PLUS && code0 != MULT && code0 != ASHIFT + && CONSTANT_P (arg1)) + { + lra_assert (ad->disp_loc == NULL); + ad->disp_loc = arg1_loc; + extract_loc_address_regs (false, mode, as, arg0_loc, false, PLUS, + code1, modify_p, ad); + } + /* If index and base registers are the same on this machine, + just record registers in any non-constant operands. We + assume here, as well as in the tests below, that all + addresses are in canonical form. */ + else if (INDEX_REG_CLASS + == base_reg_class (VOIDmode, as, PLUS, SCRATCH) + && code0 != PLUS && code0 != MULT && code0 != ASHIFT) + { + extract_loc_address_regs (false, mode, as, arg0_loc, false, PLUS, + code1, modify_p, ad); + lra_assert (! CONSTANT_P (arg1)); + extract_loc_address_regs (false, mode, as, arg1_loc, true, PLUS, + code0, modify_p, ad); + } + /* It might be [base + ]index * scale + disp. 
*/ + else if (CONSTANT_P (arg1)) + { + lra_assert (ad->disp_loc == NULL); + ad->disp_loc = arg1_loc; + extract_loc_address_regs (false, mode, as, arg0_loc, context_p, + PLUS, code0, modify_p, ad); + } + /* If both operands are registers but one is already a hard + register of index or reg-base class, give the other the + class that the hard register is not. */ + else if (code0 == REG && code1 == REG + && REGNO (arg0) < FIRST_PSEUDO_REGISTER + && ((base_ok_p + = ok_for_base_p_nonstrict (arg0, mode, as, PLUS, REG)) + || ok_for_index_p_nonstrict (arg0))) + { + extract_loc_address_regs (false, mode, as, arg0_loc, ! base_ok_p, + PLUS, REG, modify_p, ad); + extract_loc_address_regs (false, mode, as, arg1_loc, base_ok_p, + PLUS, REG, modify_p, ad); + } + else if (code0 == REG && code1 == REG + && REGNO (arg1) < FIRST_PSEUDO_REGISTER + && ((base_ok_p + = ok_for_base_p_nonstrict (arg1, mode, as, PLUS, REG)) + || ok_for_index_p_nonstrict (arg1))) + { + extract_loc_address_regs (false, mode, as, arg0_loc, base_ok_p, + PLUS, REG, modify_p, ad); + extract_loc_address_regs (false, mode, as, arg1_loc, ! base_ok_p, + PLUS, REG, modify_p, ad); + } + /* Otherwise, count equal chances that each might be a base or + index register. This case should be rare. 
*/ + else + { + extract_loc_address_regs (false, mode, as, arg0_loc, false, PLUS, + code1, modify_p, ad); + extract_loc_address_regs (false, mode, as, arg1_loc, + ad->base_reg_loc != NULL, PLUS, + code0, modify_p, ad); + } + } + break; + + case MULT: + case ASHIFT: + { + rtx *arg0_loc = &XEXP (x, 0); + enum rtx_code code0 = GET_CODE (*arg0_loc); + + if (code0 == CONST_INT) + arg0_loc = &XEXP (x, 1); + extract_loc_address_regs (false, mode, as, arg0_loc, true, + outer_code, code, modify_p, ad); + lra_assert (ad->index_loc == NULL); + ad->index_loc = loc; + break; + } + + case POST_MODIFY: + case PRE_MODIFY: + extract_loc_address_regs (false, mode, as, &XEXP (x, 0), false, + code, GET_CODE (XEXP (XEXP (x, 1), 1)), + true, ad); + lra_assert (rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0))); + ad->base_reg_loc2 = &XEXP (XEXP (x, 1), 0); + if (REG_P (XEXP (XEXP (x, 1), 1))) + extract_loc_address_regs (false, mode, as, &XEXP (XEXP (x, 1), 1), + true, code, REG, modify_p, ad); + break; + + case POST_INC: + case PRE_INC: + case POST_DEC: + case PRE_DEC: + extract_loc_address_regs (false, mode, as, &XEXP (x, 0), false, code, + SCRATCH, true, ad); + break; + + /* We process memory as a register. That means we flatten + addresses. In other words, the final code will never + contains memory in an address even if the target supports + such addresses (it is too rare these days). Memory also can + occur in address as a result some previous transformations + like equivalence substitution. 
*/ + case MEM: + case REG: + if (context_p) + { + lra_assert (ad->index_reg_loc == NULL); + ad->index_reg_loc = loc; + } + else + { + lra_assert (ad->base_reg_loc == NULL); + ad->base_reg_loc = loc; + ad->base_outer_code = outer_code; + ad->index_code = index_code; + ad->base_modify_p = modify_p; + } + break; + default: + { + const char *fmt = GET_RTX_FORMAT (code); + int i; + + if (GET_RTX_LENGTH (code) != 1 + || fmt[0] != 'e' || GET_CODE (XEXP (x, 0)) != UNSPEC) + { + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + if (fmt[i] == 'e') + extract_loc_address_regs (false, mode, as, &XEXP (x, i), + context_p, code, SCRATCH, + modify_p, ad); + break; + } + /* fall through for case UNARY_OP (UNSPEC ...) */ + } + + case UNSPEC: + if (ad->disp_loc == NULL) + ad->disp_loc = loc; + else if (ad->base_reg_loc == NULL) + { + ad->base_reg_loc = loc; + ad->base_outer_code = outer_code; + ad->index_code = index_code; + ad->base_modify_p = modify_p; + } + else + { + lra_assert (ad->index_reg_loc == NULL); + ad->index_reg_loc = loc; + } + break; + + } +} + + +/* Describe address *LOC in AD. There are two cases: + - *LOC is the address in a (mem ...). In this case OUTER_CODE is MEM + and AS is the mem's address space. + - *LOC is matched to an address constraint such as 'p'. In this case + OUTER_CODE is ADDRESS and AS is ADDR_SPACE_GENERIC. */ +static void +extract_address_regs (enum machine_mode mem_mode, addr_space_t as, + rtx *loc, enum rtx_code outer_code, struct address *ad) +{ + ad->base_reg_loc = ad->base_reg_loc2 + = ad->index_reg_loc = ad->index_loc = ad->disp_loc = NULL; + ad->base_outer_code = SCRATCH; + ad->index_code = SCRATCH; + ad->base_modify_p = false; + extract_loc_address_regs (true, mem_mode, as, loc, false, outer_code, + SCRATCH, false, ad); + if (ad->index_loc == NULL) + /* SUBREG ??? */ + ad->index_loc = ad->index_reg_loc; +} + + + +/* The page contains major code to choose the current insn alternative + and generate reloads for it. 
*/ + +/* Return the offset from REGNO of the least significant register + in (reg:MODE REGNO). + + This function is used to tell whether two registers satisfy + a matching constraint. (reg:MODE1 REGNO1) matches (reg:MODE2 REGNO2) if: + + REGNO1 + lra_constraint_offset (REGNO1, MODE1) + == REGNO2 + lra_constraint_offset (REGNO2, MODE2) */ +int +lra_constraint_offset (int regno, enum machine_mode mode) +{ + lra_assert (regno < FIRST_PSEUDO_REGISTER); + if (WORDS_BIG_ENDIAN && GET_MODE_SIZE (mode) > UNITS_PER_WORD + && SCALAR_INT_MODE_P (mode)) + return hard_regno_nregs[regno][mode] - 1; + return 0; +} + +/* Like rtx_equal_p except that it allows a REG and a SUBREG to match + if they are the same hard reg, and has special hacks for + auto-increment and auto-decrement. This is specifically intended for + process_alt_operands to use in determining whether two operands + match. X is the operand whose number is the lower of the two. + + It is supposed that X is the output operand and Y is the input + operand. Y_HARD_REGNO is the final hard regno of register Y or + register in subreg Y as we know it now. Otherwise, it is a + negative value. */ +static bool +operands_match_p (rtx x, rtx y, int y_hard_regno) +{ + int i; + RTX_CODE code = GET_CODE (x); + const char *fmt; + + if (x == y) + return true; + if ((code == REG || (code == SUBREG && REG_P (SUBREG_REG (x)))) + && (REG_P (y) || (GET_CODE (y) == SUBREG && REG_P (SUBREG_REG (y))))) + { + int j; + + i = get_hard_regno (x); + if (i < 0) + goto slow; + + if ((j = y_hard_regno) < 0) + goto slow; + + i += lra_constraint_offset (i, GET_MODE (x)); + j += lra_constraint_offset (j, GET_MODE (y)); + + return i == j; + } + + /* If two operands must match, because they are really a single + operand of an assembler insn, then two post-increments are invalid + because the assembler insn would increment only once. On the + other hand, a post-increment matches ordinary indexing if the + post-increment is the output operand. 
*/ + if (code == POST_DEC || code == POST_INC || code == POST_MODIFY) + return operands_match_p (XEXP (x, 0), y, y_hard_regno); + + /* Two pre-increments are invalid because the assembler insn would + increment only once. On the other hand, a pre-increment matches + ordinary indexing if the pre-increment is the input operand. */ + if (GET_CODE (y) == PRE_DEC || GET_CODE (y) == PRE_INC + || GET_CODE (y) == PRE_MODIFY) + return operands_match_p (x, XEXP (y, 0), -1); + + slow: + + if (code == REG && GET_CODE (y) == SUBREG && REG_P (SUBREG_REG (y)) + && x == SUBREG_REG (y)) + return true; + if (GET_CODE (y) == REG && code == SUBREG && REG_P (SUBREG_REG (x)) + && SUBREG_REG (x) == y) + return true; + + /* Now we have disposed of all the cases in which different rtx + codes can match. */ + if (code != GET_CODE (y)) + return false; + + /* (MULT:SI x y) and (MULT:HI x y) are NOT equivalent. */ + if (GET_MODE (x) != GET_MODE (y)) + return false; + + switch (code) + { + CASE_CONST_UNIQUE: + return false; + + case LABEL_REF: + return XEXP (x, 0) == XEXP (y, 0); + case SYMBOL_REF: + return XSTR (x, 0) == XSTR (y, 0); + + default: + break; + } + + /* Compare the elements. If any pair of corresponding elements fail + to match, return false for the whole things. 
*/ + + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + int val, j; + switch (fmt[i]) + { + case 'w': + if (XWINT (x, i) != XWINT (y, i)) + return false; + break; + + case 'i': + if (XINT (x, i) != XINT (y, i)) + return false; + break; + + case 'e': + val = operands_match_p (XEXP (x, i), XEXP (y, i), -1); + if (val == 0) + return false; + break; + + case '0': + break; + + case 'E': + if (XVECLEN (x, i) != XVECLEN (y, i)) + return false; + for (j = XVECLEN (x, i) - 1; j >= 0; --j) + { + val = operands_match_p (XVECEXP (x, i, j), XVECEXP (y, i, j), -1); + if (val == 0) + return false; + } + break; + + /* It is believed that rtx's at this level will never + contain anything but integers and other rtx's, except for + within LABEL_REFs and SYMBOL_REFs. */ + default: + gcc_unreachable (); + } + } + return true; +} + +/* True if X is a constant that can be forced into the constant pool. + MODE is the mode of the operand, or VOIDmode if not known. */ +#define CONST_POOL_OK_P(MODE, X) \ + ((MODE) != VOIDmode \ + && CONSTANT_P (X) \ + && GET_CODE (X) != HIGH \ + && !targetm.cannot_force_const_mem (MODE, X)) + +/* True if C is a non-empty register class that has too few registers + to be safely used as a reload target class. */ +#define SMALL_REGISTER_CLASS_P(C) \ + (reg_class_size [(C)] == 1 \ + || (reg_class_size [(C)] >= 1 && targetm.class_likely_spilled_p (C))) + +/* If REG is a reload pseudo, try to make its class satisfy CL. */ +static void +narrow_reload_pseudo_class (rtx reg, enum reg_class cl) +{ + enum reg_class rclass; + + /* Do not make the class more accurate for generated reload insns. + They are mostly moves with a lot of constraints. Making the class + more accurate may result in a very narrow class and make it + impossible to find registers for several reloads of one insn. */ + if (INSN_UID (curr_insn) >= new_insn_uid_start) + return; + if (GET_CODE (reg) == SUBREG) + reg = SUBREG_REG (reg); + if (! 
REG_P (reg) || (int) REGNO (reg) < new_regno_start) + return; + if (in_class_p (reg, cl, &rclass) && rclass != cl) + change_class (REGNO (reg), rclass, " Change", true); +} + +/* Generate reloads for matching OUT and INS (array of input operand + numbers with end marker -1) with reg class GOAL_CLASS. Add input + and output reloads correspondingly to the lists *BEFORE and + *AFTER. */ +static void +match_reload (signed char out, signed char *ins, enum reg_class goal_class, + rtx *before, rtx *after) +{ + int i, in; + rtx new_in_reg, new_out_reg, reg; + enum machine_mode inmode, outmode; + rtx in_rtx = *curr_id->operand_loc[ins[0]]; + rtx out_rtx = *curr_id->operand_loc[out]; + + outmode = curr_operand_mode[out]; + inmode = curr_operand_mode[ins[0]]; + push_to_sequence (*before); + if (inmode != outmode) + { + if (GET_MODE_SIZE (inmode) > GET_MODE_SIZE (outmode)) + { + reg = new_in_reg + = lra_create_new_reg_with_unique_value (inmode, in_rtx, + goal_class, ""); + if (SCALAR_INT_MODE_P (inmode)) + new_out_reg = gen_lowpart_SUBREG (outmode, reg); + else + new_out_reg = gen_rtx_SUBREG (outmode, reg, 0); + } + else + { + reg = new_out_reg + = lra_create_new_reg_with_unique_value (outmode, out_rtx, + goal_class, ""); + if (SCALAR_INT_MODE_P (outmode)) + new_in_reg = gen_lowpart_SUBREG (inmode, reg); + else + new_in_reg = gen_rtx_SUBREG (inmode, reg, 0); + /* NEW_IN_REG is non-paradoxical subreg. We don't want + NEW_OUT_REG living above. We add clobber clause for + this. */ + emit_clobber (new_out_reg); + } + } + else + { + /* Pseudos have values -- see comments for lra_reg_info. + Different pseudos with the same value do not conflict even if + they live in the same place. When we create a pseudo we + assign value of original pseudo (if any) from which we + created the new pseudo. 
If we create the pseudo from the + input pseudo, the new pseudo will not conflict with the input + pseudo, which is wrong when the input pseudo lives after the + insn because the new pseudo's value is changed by the insn + output. Therefore we create the new pseudo from the output. + + We cannot reuse the current output register because we might + have a situation like "a <- a op b", where the constraints + force the second input operand ("b") to match the output + operand ("a"). "b" must then be copied into a new register + so that it doesn't clobber the current value of "a". */ + + new_in_reg = new_out_reg + = lra_create_new_reg_with_unique_value (outmode, out_rtx, + goal_class, ""); + } + /* The in and out operands can come from transformations done before + processing insn constraints. One example of such transformations + is subreg reloading (see function simplify_operand_subreg). The + new pseudos created by the transformations might have an inaccurate + class (ALL_REGS) and we should make their classes more + accurate. */ + narrow_reload_pseudo_class (in_rtx, goal_class); + narrow_reload_pseudo_class (out_rtx, goal_class); + lra_emit_move (copy_rtx (new_in_reg), in_rtx); + *before = get_insns (); + end_sequence (); + for (i = 0; (in = ins[i]) >= 0; i++) + { + lra_assert + (GET_MODE (*curr_id->operand_loc[in]) == VOIDmode + || GET_MODE (new_in_reg) == GET_MODE (*curr_id->operand_loc[in])); + *curr_id->operand_loc[in] = new_in_reg; + } + lra_update_dups (curr_id, ins); + if (find_reg_note (curr_insn, REG_UNUSED, out_rtx) == NULL_RTX) + { + start_sequence (); + lra_emit_move (out_rtx, copy_rtx (new_out_reg)); + emit_insn (*after); + *after = get_insns (); + end_sequence (); + } + *curr_id->operand_loc[out] = new_out_reg; + lra_update_dup (curr_id, out); +} + +/* Return the register class which is the union of all reg classes in + the insn constraint alternative string starting with P. 
*/ +static enum reg_class +reg_class_from_constraints (const char *p) +{ + int c, len; + enum reg_class op_class = NO_REGS; + + do + switch ((c = *p, len = CONSTRAINT_LEN (c, p)), c) + { + case '#': + case ',': + return op_class; + + case 'p': + op_class = (reg_class_subunion + [op_class][base_reg_class (VOIDmode, ADDR_SPACE_GENERIC, + ADDRESS, SCRATCH)]); + break; + + case 'g': + case 'r': + op_class = reg_class_subunion[op_class][GENERAL_REGS]; + break; + + default: + if (REG_CLASS_FROM_CONSTRAINT (c, p) == NO_REGS) + { +#ifdef EXTRA_CONSTRAINT_STR + if (EXTRA_ADDRESS_CONSTRAINT (c, p)) + op_class + = (reg_class_subunion + [op_class][base_reg_class (VOIDmode, ADDR_SPACE_GENERIC, + ADDRESS, SCRATCH)]); +#endif + break; + } + + op_class + = reg_class_subunion[op_class][REG_CLASS_FROM_CONSTRAINT (c, p)]; + break; + } + while ((p += len), c); + return op_class; +} + +/* If OP is a register, return the class of the register as per + get_reg_class, otherwise return NO_REGS. */ +static inline enum reg_class +get_op_class (rtx op) +{ + return REG_P (op) ? get_reg_class (REGNO (op)) : NO_REGS; +} + +/* Return generated insn mem_pseudo:=val if TO_P or val:=mem_pseudo + otherwise. If modes of MEM_PSEUDO and VAL are different, use + SUBREG for VAL to make them equal. */ +static rtx +emit_spill_move (bool to_p, rtx mem_pseudo, rtx val) +{ + if (GET_MODE (mem_pseudo) != GET_MODE (val)) + val = gen_rtx_SUBREG (GET_MODE (mem_pseudo), + GET_CODE (val) == SUBREG ? SUBREG_REG (val) : val, + 0); + return (to_p + ? gen_move_insn (mem_pseudo, val) + : gen_move_insn (val, mem_pseudo)); +} + +/* Process a special case insn (register move), return true if we + don't need to process it anymore. Return that RTL was changed + through CHANGE_P and macro SECONDARY_MEMORY_NEEDED says to use + secondary memory through SEC_MEM_P. 
*/ +static bool +check_and_process_move (bool *change_p, bool *sec_mem_p) +{ + int sregno, dregno; + rtx set, dest, src, dreg, sreg, old_sreg, new_reg, before, scratch_reg; + enum reg_class dclass, sclass, secondary_class; + enum machine_mode sreg_mode; + secondary_reload_info sri; + + *sec_mem_p = *change_p = false; + if ((set = single_set (curr_insn)) == NULL) + return false; + dreg = dest = SET_DEST (set); + sreg = src = SET_SRC (set); + /* Quick check for a move insn which does not need + reloads. */ + if ((dclass = get_op_class (dest)) != NO_REGS + && (sclass = get_op_class (src)) != NO_REGS + /* The backend guarantees that register moves of cost 2 never + need reloads. */ + && targetm.register_move_cost (GET_MODE (src), dclass, sclass) == 2) + return true; + if (GET_CODE (dest) == SUBREG) + dreg = SUBREG_REG (dest); + if (GET_CODE (src) == SUBREG) + sreg = SUBREG_REG (src); + if (! REG_P (dreg) || ! REG_P (sreg)) + return false; + sclass = dclass = NO_REGS; + dreg = get_equiv_substitution (dreg); + if (REG_P (dreg)) + dclass = get_reg_class (REGNO (dreg)); + if (dclass == ALL_REGS) + /* ALL_REGS is used for new pseudos created by transformations + like reload of SUBREG_REG (see function + simplify_operand_subreg). We don't know their class yet. We + should figure out the class from processing the insn + constraints, not in this fast path function. Even if ALL_REGS + were the right class for the pseudo, secondary_... hooks usually + are not defined for ALL_REGS. */ + return false; + sreg_mode = GET_MODE (sreg); + old_sreg = sreg; + sreg = get_equiv_substitution (sreg); + if (REG_P (sreg)) + sclass = get_reg_class (REGNO (sreg)); + if (sclass == ALL_REGS) + /* See comments above. 
*/ + return false; +#ifdef SECONDARY_MEMORY_NEEDED + if (dclass != NO_REGS && sclass != NO_REGS + && SECONDARY_MEMORY_NEEDED (sclass, dclass, GET_MODE (src))) + { + *sec_mem_p = true; + return false; + } +#endif + sri.prev_sri = NULL; + sri.icode = CODE_FOR_nothing; + sri.extra_cost = 0; + secondary_class = NO_REGS; + /* Set up hard register for a reload pseudo for hook + secondary_reload because some targets just ignore unassigned + pseudos in the hook. */ + if (dclass != NO_REGS && lra_get_regno_hard_regno (REGNO (dreg)) < 0) + { + dregno = REGNO (dreg); + reg_renumber[dregno] = ira_class_hard_regs[dclass][0]; + } + else + dregno = -1; + if (sclass != NO_REGS && lra_get_regno_hard_regno (REGNO (sreg)) < 0) + { + sregno = REGNO (sreg); + reg_renumber[sregno] = ira_class_hard_regs[sclass][0]; + } + else + sregno = -1; + if (sclass != NO_REGS) + secondary_class + = (enum reg_class) targetm.secondary_reload (false, dest, + (reg_class_t) sclass, + GET_MODE (src), &sri); + if (sclass == NO_REGS + || ((secondary_class != NO_REGS || sri.icode != CODE_FOR_nothing) + && dclass != NO_REGS)) + { +#if ENABLE_ASSERT_CHECKING + enum reg_class old_sclass = secondary_class; + secondary_reload_info old_sri = sri; +#endif + + sri.prev_sri = NULL; + sri.icode = CODE_FOR_nothing; + sri.extra_cost = 0; + secondary_class + = (enum reg_class) targetm.secondary_reload (true, sreg, + (reg_class_t) dclass, + sreg_mode, &sri); + /* Check the target hook consistency. 
*/ + lra_assert + ((secondary_class == NO_REGS && sri.icode == CODE_FOR_nothing) + || (old_sclass == NO_REGS && old_sri.icode == CODE_FOR_nothing) + || (secondary_class == old_sclass && sri.icode == old_sri.icode)); + } + if (sregno >= 0) + reg_renumber [sregno] = -1; + if (dregno >= 0) + reg_renumber [dregno] = -1; + if (secondary_class == NO_REGS && sri.icode == CODE_FOR_nothing) + return false; + *change_p = true; + new_reg = NULL_RTX; + if (secondary_class != NO_REGS) + new_reg = lra_create_new_reg_with_unique_value (sreg_mode, NULL_RTX, + secondary_class, + "secondary"); + start_sequence (); + if (old_sreg != sreg) + sreg = copy_rtx (sreg); + if (sri.icode == CODE_FOR_nothing) + lra_emit_move (new_reg, sreg); + else + { + enum reg_class scratch_class; + + scratch_class = (reg_class_from_constraints + (insn_data[sri.icode].operand[2].constraint)); + scratch_reg = (lra_create_new_reg_with_unique_value + (insn_data[sri.icode].operand[2].mode, NULL_RTX, + scratch_class, "scratch")); + emit_insn (GEN_FCN (sri.icode) (new_reg != NULL_RTX ? new_reg : dest, + sreg, scratch_reg)); + } + before = get_insns (); + end_sequence (); + lra_process_new_insns (curr_insn, before, NULL_RTX, "Inserting the move"); + if (new_reg != NULL_RTX) + { + if (GET_CODE (src) == SUBREG) + SUBREG_REG (src) = new_reg; + else + SET_SRC (set) = new_reg; + } + else + { + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, "Deleting move %u\n", INSN_UID (curr_insn)); + debug_rtl_slim (lra_dump_file, curr_insn, curr_insn, -1, 0); + } + lra_set_insn_deleted (curr_insn); + return true; + } + return false; +} + +/* The following data describe the result of process_alt_operands. + The data are used in curr_insn_transform to generate reloads. */ + +/* The chosen reg classes which should be used for the corresponding + operands. */ +static enum reg_class goal_alt[MAX_RECOG_OPERANDS]; +/* True if the operand should be the same as another operand and that + other operand does not need a reload. 
*/ +static bool goal_alt_match_win[MAX_RECOG_OPERANDS]; +/* True if the operand does not need a reload. */ +static bool goal_alt_win[MAX_RECOG_OPERANDS]; +/* True if the operand can be offsettable memory. */ +static bool goal_alt_offmemok[MAX_RECOG_OPERANDS]; +/* The number of the operand to which the given operand can be matched. */ +static int goal_alt_matches[MAX_RECOG_OPERANDS]; +/* The number of elements in the following array. */ +static int goal_alt_dont_inherit_ops_num; +/* Numbers of operands whose reload pseudos should not be inherited. */ +static int goal_alt_dont_inherit_ops[MAX_RECOG_OPERANDS]; +/* True if the insn commutative operands should be swapped. */ +static bool goal_alt_swapped; +/* The chosen insn alternative. */ +static int goal_alt_number; + +/* The following five variables are used to choose the best insn + alternative. They reflect final characteristics of the best + alternative. */ + +/* Number of necessary reloads and overall cost reflecting the + previous value and other unpleasantness of the best alternative. */ +static int best_losers, best_overall; +/* Number of small register classes used for operands of the best + alternative. */ +static int best_small_class_operands_num; +/* Overall number of hard registers used for reloads. For example, on + some targets we need 2 general registers to reload DFmode and only + one floating point register. */ +static int best_reload_nregs; +/* Overall number reflecting distances of previous reloads of the same + value. The distances are counted from the current BB start. It is + used to improve inheritance chances. */ +static int best_reload_sum; + +/* True if the current insn should have no input or + output reloads, respectively. */ +static bool no_input_reloads_p, no_output_reloads_p; + +/* True if we swapped the commutative operands in the current + insn. */ +static int curr_swapped; + +/* Arrange for address element *LOC to be a register of class CL. + Add any input reloads to list BEFORE. 
AFTER is nonnull if *LOC is an + automodified value; handle that case by adding the required output + reloads to list AFTER. Return true if the RTL was changed. */ +static bool +process_addr_reg (rtx *loc, rtx *before, rtx *after, enum reg_class cl) +{ + int regno; + enum reg_class rclass, new_class; + rtx reg = *loc; + rtx new_reg; + enum machine_mode mode; + bool before_p = false; + + mode = GET_MODE (reg); + if (! REG_P (reg)) + { + /* Always reload memory in an address even if the target supports + such addresses. */ + new_reg = lra_create_new_reg_with_unique_value (mode, reg, cl, "address"); + before_p = true; + } + else + { + regno = REGNO (reg); + rclass = get_reg_class (regno); + if ((*loc = get_equiv_substitution (reg)) != reg) + { + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, + "Changing pseudo %d in address of insn %u on equiv ", + REGNO (reg), INSN_UID (curr_insn)); + print_value_slim (lra_dump_file, *loc, 1); + fprintf (lra_dump_file, "\n"); + } + *loc = copy_rtx (*loc); + } + if (*loc != reg || ! in_class_p (reg, cl, &new_class)) + { + reg = *loc; + if (get_reload_reg (after == NULL ? OP_IN : OP_INOUT, + mode, reg, cl, "address", &new_reg)) + before_p = true; + } + else if (new_class != NO_REGS && rclass != new_class) + { + change_class (regno, new_class, " Change", true); + return false; + } + else + return false; + } + if (before_p) + { + push_to_sequence (*before); + lra_emit_move (new_reg, reg); + *before = get_insns (); + end_sequence (); + } + *loc = new_reg; + if (after != NULL) + { + start_sequence (); + lra_emit_move (reg, new_reg); + emit_insn (*after); + *after = get_insns (); + end_sequence (); + } + return true; +} + +#ifndef SLOW_UNALIGNED_ACCESS +#define SLOW_UNALIGNED_ACCESS(mode, align) 0 +#endif + +/* Make reloads for subreg in operand NOP with internal subreg mode + REG_MODE, add new reloads for further processing. Return true if + any reload was generated. 
*/ +static bool +simplify_operand_subreg (int nop, enum machine_mode reg_mode) +{ + int hard_regno; + rtx before, after; + enum machine_mode mode; + rtx reg, new_reg; + rtx operand = *curr_id->operand_loc[nop]; + + before = after = NULL_RTX; + + if (GET_CODE (operand) != SUBREG) + return false; + + mode = GET_MODE (operand); + reg = SUBREG_REG (operand); + /* If we change address for paradoxical subreg of memory, the + address might violate the necessary alignment or the access might + be slow. So take this into consideration. */ + if ((MEM_P (reg) + && ((! STRICT_ALIGNMENT + && ! SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (reg))) + || MEM_ALIGN (reg) >= GET_MODE_ALIGNMENT (mode))) + || (REG_P (reg) && REGNO (reg) < FIRST_PSEUDO_REGISTER)) + { + alter_subreg (curr_id->operand_loc[nop], false); + return true; + } + /* Put constant into memory when we have mixed modes. It generates + a better code in most cases as it does not need a secondary + reload memory. It also prevents LRA looping when LRA is using + secondary reload memory again and again. */ + if (CONSTANT_P (reg) && CONST_POOL_OK_P (reg_mode, reg) + && SCALAR_INT_MODE_P (reg_mode) != SCALAR_INT_MODE_P (mode)) + { + SUBREG_REG (operand) = force_const_mem (reg_mode, reg); + alter_subreg (curr_id->operand_loc[nop], false); + return true; + } + /* Force a reload of the SUBREG_REG if this is a constant or PLUS or + if there may be a problem accessing OPERAND in the outer + mode. */ + if ((REG_P (reg) + && REGNO (reg) >= FIRST_PSEUDO_REGISTER + && (hard_regno = lra_get_regno_hard_regno (REGNO (reg))) >= 0 + /* Don't reload paradoxical subregs because we could be looping + having repeatedly final regno out of hard regs range. 
*/ + && (hard_regno_nregs[hard_regno][GET_MODE (reg)] + >= hard_regno_nregs[hard_regno][mode]) + && simplify_subreg_regno (hard_regno, GET_MODE (reg), + SUBREG_BYTE (operand), mode) < 0) + || CONSTANT_P (reg) || GET_CODE (reg) == PLUS || MEM_P (reg)) + { + enum op_type type = curr_static_id->operand[nop].type; + /* The class will be defined later in curr_insn_transform. */ + enum reg_class rclass + = (enum reg_class) targetm.preferred_reload_class (reg, ALL_REGS); + + new_reg = lra_create_new_reg_with_unique_value (reg_mode, reg, rclass, + "subreg reg"); + bitmap_set_bit (&lra_optional_reload_pseudos, REGNO (new_reg)); + if (type != OP_OUT + || GET_MODE_SIZE (GET_MODE (reg)) > GET_MODE_SIZE (mode)) + { + push_to_sequence (before); + lra_emit_move (new_reg, reg); + before = get_insns (); + end_sequence (); + } + if (type != OP_IN) + { + start_sequence (); + lra_emit_move (reg, new_reg); + emit_insn (after); + after = get_insns (); + end_sequence (); + } + SUBREG_REG (operand) = new_reg; + lra_process_new_insns (curr_insn, before, after, + "Inserting subreg reload"); + return true; + } + return false; +} + +/* Return TRUE if X refers to a hard register from SET. 
*/ +static bool +uses_hard_regs_p (rtx x, HARD_REG_SET set) +{ + int i, j, x_hard_regno; + enum machine_mode mode; + const char *fmt; + enum rtx_code code; + + if (x == NULL_RTX) + return false; + code = GET_CODE (x); + mode = GET_MODE (x); + if (code == SUBREG) + { + x = SUBREG_REG (x); + code = GET_CODE (x); + if (GET_MODE_SIZE (GET_MODE (x)) > GET_MODE_SIZE (mode)) + mode = GET_MODE (x); + } + + if (REG_P (x)) + { + x_hard_regno = get_hard_regno (x); + return (x_hard_regno >= 0 + && overlaps_hard_reg_set_p (set, mode, x_hard_regno)); + } + if (MEM_P (x)) + { + struct address ad; + enum machine_mode mode = GET_MODE (x); + rtx *addr_loc = &XEXP (x, 0); + + extract_address_regs (mode, MEM_ADDR_SPACE (x), addr_loc, MEM, &ad); + if (ad.base_reg_loc != NULL) + { + if (uses_hard_regs_p (*ad.base_reg_loc, set)) + return true; + } + if (ad.index_reg_loc != NULL) + { + if (uses_hard_regs_p (*ad.index_reg_loc, set)) + return true; + } + } + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + { + if (uses_hard_regs_p (XEXP (x, i), set)) + return true; + } + else if (fmt[i] == 'E') + { + for (j = XVECLEN (x, i) - 1; j >= 0; j--) + if (uses_hard_regs_p (XVECEXP (x, i, j), set)) + return true; + } + } + return false; +} + +/* Return true if OP is a spilled pseudo. */ +static inline bool +spilled_pseudo_p (rtx op) +{ + return (REG_P (op) + && REGNO (op) >= FIRST_PSEUDO_REGISTER && in_mem_p (REGNO (op))); +} + +/* Return true if X is a general constant. */ +static inline bool +general_constant_p (rtx x) +{ + return CONSTANT_P (x) && (! flag_pic || LEGITIMATE_PIC_OPERAND_P (x)); +} + +/* Cost factor for each additional reload and maximal cost bound for + insn reloads. One might ask about such strange numbers. Their + values occurred historically from former reload pass. 
*/ +#define LOSER_COST_FACTOR 6 +#define MAX_OVERALL_COST_BOUND 600 + +/* Major function to choose the current insn alternative and what + operands should be reloaded and how. If ONLY_ALTERNATIVE is not + negative we should consider only this alternative. Return false if + we can not choose the alternative or find how to reload the + operands. */ +static bool +process_alt_operands (int only_alternative) +{ + bool ok_p = false; + int nop, small_class_operands_num, overall, nalt; + int n_alternatives = curr_static_id->n_alternatives; + int n_operands = curr_static_id->n_operands; + /* LOSERS counts the operands that don't fit this alternative and + would require loading. */ + int losers; + /* REJECT is a count of how undesirable this alternative says it is + if any reloading is required. If the alternative matches exactly + then REJECT is ignored, but otherwise it gets this much counted + against it in addition to the reloading needed. */ + int reject; + /* The number of elements in the following array. */ + int early_clobbered_regs_num; + /* Numbers of operands which are early clobber registers. */ + int early_clobbered_nops[MAX_RECOG_OPERANDS]; + enum reg_class curr_alt[MAX_RECOG_OPERANDS]; + HARD_REG_SET curr_alt_set[MAX_RECOG_OPERANDS]; + bool curr_alt_match_win[MAX_RECOG_OPERANDS]; + bool curr_alt_win[MAX_RECOG_OPERANDS]; + bool curr_alt_offmemok[MAX_RECOG_OPERANDS]; + int curr_alt_matches[MAX_RECOG_OPERANDS]; + /* The number of elements in the following array. */ + int curr_alt_dont_inherit_ops_num; + /* Numbers of operands whose reload pseudos should not be inherited. */ + int curr_alt_dont_inherit_ops[MAX_RECOG_OPERANDS]; + rtx op; + /* The register when the operand is a subreg of register, otherwise the + operand itself. */ + rtx no_subreg_reg_operand[MAX_RECOG_OPERANDS]; + /* The register if the operand is a register or subreg of register, + otherwise NULL. 
*/ + rtx operand_reg[MAX_RECOG_OPERANDS]; + int hard_regno[MAX_RECOG_OPERANDS]; + enum machine_mode biggest_mode[MAX_RECOG_OPERANDS]; + int reload_nregs, reload_sum; + bool costly_p; + enum reg_class cl; + + /* Calculate some data common for all alternatives to speed up the + function. */ + for (nop = 0; nop < n_operands; nop++) + { + op = no_subreg_reg_operand[nop] = *curr_id->operand_loc[nop]; + /* The real hard regno of the operand after the allocation. */ + hard_regno[nop] = get_hard_regno (op); + + operand_reg[nop] = op; + biggest_mode[nop] = GET_MODE (operand_reg[nop]); + if (GET_CODE (operand_reg[nop]) == SUBREG) + { + operand_reg[nop] = SUBREG_REG (operand_reg[nop]); + if (GET_MODE_SIZE (biggest_mode[nop]) + < GET_MODE_SIZE (GET_MODE (operand_reg[nop]))) + biggest_mode[nop] = GET_MODE (operand_reg[nop]); + } + if (REG_P (operand_reg[nop])) + no_subreg_reg_operand[nop] = operand_reg[nop]; + else + operand_reg[nop] = NULL_RTX; + } + + /* The constraints are made of several alternatives. Each operand's + constraint looks like foo,bar,... with commas separating the + alternatives. The first alternatives for all operands go + together, the second alternatives go together, etc. + + First loop over alternatives. */ + for (nalt = 0; nalt < n_alternatives; nalt++) + { + /* Loop over operands for one constraint alternative. */ +#ifdef HAVE_ATTR_enabled + if (curr_id->alternative_enabled_p != NULL + && ! 
curr_id->alternative_enabled_p[nalt]) + continue; +#endif + + if (only_alternative >= 0 && nalt != only_alternative) + continue; + + overall = losers = reject = reload_nregs = reload_sum = 0; + for (nop = 0; nop < n_operands; nop++) + reject += (curr_static_id + ->operand_alternative[nalt * n_operands + nop].reject); + early_clobbered_regs_num = 0; + + for (nop = 0; nop < n_operands; nop++) + { + const char *p; + char *end; + int len, c, m, i, opalt_num, this_alternative_matches; + bool win, did_match, offmemok, early_clobber_p; + /* false => this operand can be reloaded somehow for this + alternative. */ + bool badop; + /* true => this operand can be reloaded if the alternative + allows regs. */ + bool winreg; + /* True if a constant forced into memory would be OK for + this operand. */ + bool constmemok; + enum reg_class this_alternative, this_costly_alternative; + HARD_REG_SET this_alternative_set, this_costly_alternative_set; + bool this_alternative_match_win, this_alternative_win; + bool this_alternative_offmemok; + enum machine_mode mode; + + opalt_num = nalt * n_operands + nop; + if (curr_static_id->operand_alternative[opalt_num].anything_ok) + { + /* Fast track for no constraints at all. */ + curr_alt[nop] = NO_REGS; + CLEAR_HARD_REG_SET (curr_alt_set[nop]); + curr_alt_win[nop] = true; + curr_alt_match_win[nop] = false; + curr_alt_offmemok[nop] = false; + curr_alt_matches[nop] = -1; + continue; + } + + op = no_subreg_reg_operand[nop]; + mode = curr_operand_mode[nop]; + + win = did_match = winreg = offmemok = constmemok = false; + badop = true; + + early_clobber_p = false; + p = curr_static_id->operand_alternative[opalt_num].constraint; + + this_costly_alternative = this_alternative = NO_REGS; + /* We update set of possible hard regs besides its class + because reg class might be inaccurate. 
For example, + union of LO_REGS (l), HI_REGS(h), and STACK_REG(k) in ARM + is translated in HI_REGS because classes are merged by + pairs and there is no accurate intermediate class. */ + CLEAR_HARD_REG_SET (this_alternative_set); + CLEAR_HARD_REG_SET (this_costly_alternative_set); + this_alternative_win = false; + this_alternative_match_win = false; + this_alternative_offmemok = false; + this_alternative_matches = -1; + + /* An empty constraint should be excluded by the fast + track. */ + lra_assert (*p != 0 && *p != ','); + + /* Scan this alternative's specs for this operand; set WIN + if the operand fits any letter in this alternative. + Otherwise, clear BADOP if this operand could fit some + letter after reloads, or set WINREG if this operand could + fit after reloads provided the constraint allows some + registers. */ + costly_p = false; + do + { + switch ((c = *p, len = CONSTRAINT_LEN (c, p)), c) + { + case '\0': + len = 0; + break; + case ',': + c = '\0'; + break; + + case '=': case '+': case '?': case '*': case '!': + case ' ': case '\t': + break; + + case '%': + /* We only support one commutative marker, the first + one. We already set commutative above. */ + break; + + case '&': + early_clobber_p = true; + break; + + case '#': + /* Ignore rest of this alternative. */ + c = '\0'; + break; + + case '0': case '1': case '2': case '3': case '4': + case '5': case '6': case '7': case '8': case '9': + { + int m_hregno; + bool match_p; + + m = strtoul (p, &end, 10); + p = end; + len = 0; + lra_assert (nop > m); + + this_alternative_matches = m; + m_hregno = get_hard_regno (*curr_id->operand_loc[m]); + /* We are supposed to match a previous operand. + If we do, we win if that one did. If we do + not, count both of the operands as losers. + (This is too conservative, since most of the + time only a single reload insn will be needed + to make the two operands win. As a result, + this alternative may be rejected when it is + actually desirable.) 
*/ + match_p = false; + if (operands_match_p (*curr_id->operand_loc[nop], + *curr_id->operand_loc[m], m_hregno)) + { + /* We should reject matching of an early + clobber operand if the matching operand is + not dying in the insn. */ + if (! curr_static_id->operand[m].early_clobber + || operand_reg[nop] == NULL_RTX + || (find_regno_note (curr_insn, REG_DEAD, + REGNO (operand_reg[nop])) + != NULL_RTX)) + match_p = true; + } + if (match_p) + { + /* If we are matching a non-offsettable + address where an offsettable address was + expected, then we must reject this + combination, because we can't reload + it. */ + if (curr_alt_offmemok[m] + && MEM_P (*curr_id->operand_loc[m]) + && curr_alt[m] == NO_REGS && ! curr_alt_win[m]) + continue; + + } + else + { + /* Operands don't match. Both operands must + allow a reload register, otherwise we + cannot make them match. */ + if (curr_alt[m] == NO_REGS) + break; + /* Retroactively mark the operand we had to + match as a loser, if it wasn't already and + it wasn't matched to a register constraint + (e.g it might be matched by memory). */ + if (curr_alt_win[m] + && (operand_reg[m] == NULL_RTX + || hard_regno[m] < 0)) + { + losers++; + reload_nregs + += (ira_reg_class_max_nregs[curr_alt[m]] + [GET_MODE (*curr_id->operand_loc[m])]); + } + + /* We prefer no matching alternatives because + it gives more freedom in RA. */ + if (operand_reg[nop] == NULL_RTX + || (find_regno_note (curr_insn, REG_DEAD, + REGNO (operand_reg[nop])) + == NULL_RTX)) + reject += 2; + } + /* If we have to reload this operand and some + previous operand also had to match the same + thing as this operand, we don't know how to do + that. */ + if (!match_p || !curr_alt_win[m]) + { + for (i = 0; i < nop; i++) + if (curr_alt_matches[i] == m) + break; + if (i < nop) + break; + } + else + did_match = true; + + /* This can be fixed with reloads if the operand + we are supposed to match can be fixed with + reloads. 
*/ + badop = false; + this_alternative = curr_alt[m]; + COPY_HARD_REG_SET (this_alternative_set, curr_alt_set[m]); + break; + } + + case 'p': + cl = base_reg_class (VOIDmode, ADDR_SPACE_GENERIC, + ADDRESS, SCRATCH); + this_alternative = reg_class_subunion[this_alternative][cl]; + IOR_HARD_REG_SET (this_alternative_set, + reg_class_contents[cl]); + if (costly_p) + { + this_costly_alternative + = reg_class_subunion[this_costly_alternative][cl]; + IOR_HARD_REG_SET (this_costly_alternative_set, + reg_class_contents[cl]); + } + win = true; + badop = false; + break; + + case TARGET_MEM_CONSTRAINT: + if (MEM_P (op) || spilled_pseudo_p (op)) + win = true; + if (CONST_POOL_OK_P (mode, op)) + badop = false; + constmemok = true; + break; + + case '<': + if (MEM_P (op) + && (GET_CODE (XEXP (op, 0)) == PRE_DEC + || GET_CODE (XEXP (op, 0)) == POST_DEC)) + win = true; + break; + + case '>': + if (MEM_P (op) + && (GET_CODE (XEXP (op, 0)) == PRE_INC + || GET_CODE (XEXP (op, 0)) == POST_INC)) + win = true; + break; + + /* Memory op whose address is not offsettable. */ + case 'V': + if (MEM_P (op) + && ! offsettable_nonstrict_memref_p (op)) + win = true; + break; + + /* Memory operand whose address is offsettable. 
*/ + case 'o': + if ((MEM_P (op) + && offsettable_nonstrict_memref_p (op)) + || spilled_pseudo_p (op)) + win = true; + if (CONST_POOL_OK_P (mode, op) || MEM_P (op)) + badop = false; + constmemok = true; + offmemok = true; + break; + + case 'E': + case 'F': + if (GET_CODE (op) == CONST_DOUBLE + || (GET_CODE (op) == CONST_VECTOR + && (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT))) + win = true; + break; + + case 'G': + case 'H': + if (GET_CODE (op) == CONST_DOUBLE + && CONST_DOUBLE_OK_FOR_CONSTRAINT_P (op, c, p)) + win = true; + break; + + case 's': + if (CONST_INT_P (op) + || (GET_CODE (op) == CONST_DOUBLE && mode == VOIDmode)) + break; + case 'i': + if (general_constant_p (op)) + win = true; + break; + + case 'n': + if (CONST_INT_P (op) + || (GET_CODE (op) == CONST_DOUBLE && mode == VOIDmode)) + win = true; + break; + + case 'I': + case 'J': + case 'K': + case 'L': + case 'M': + case 'N': + case 'O': + case 'P': + if (CONST_INT_P (op) + && CONST_OK_FOR_CONSTRAINT_P (INTVAL (op), c, p)) + win = true; + break; + + case 'X': + /* This constraint should be excluded by the fast + track. */ + gcc_unreachable (); + break; + + case 'g': + if (MEM_P (op) + || general_constant_p (op) + || spilled_pseudo_p (op)) + win = true; + /* Drop through into 'r' case. 
*/ + + case 'r': + this_alternative + = reg_class_subunion[this_alternative][GENERAL_REGS]; + IOR_HARD_REG_SET (this_alternative_set, + reg_class_contents[GENERAL_REGS]); + if (costly_p) + { + this_costly_alternative + = (reg_class_subunion + [this_costly_alternative][GENERAL_REGS]); + IOR_HARD_REG_SET (this_costly_alternative_set, + reg_class_contents[GENERAL_REGS]); + } + goto reg; + + default: + if (REG_CLASS_FROM_CONSTRAINT (c, p) == NO_REGS) + { +#ifdef EXTRA_CONSTRAINT_STR + if (EXTRA_MEMORY_CONSTRAINT (c, p)) + { + if (EXTRA_CONSTRAINT_STR (op, c, p)) + win = true; + else if (spilled_pseudo_p (op)) + win = true; + + /* If we didn't already win, we can reload + constants via force_const_mem, and other + MEMs by reloading the address like for + 'o'. */ + if (CONST_POOL_OK_P (mode, op) || MEM_P (op)) + badop = false; + constmemok = true; + offmemok = true; + break; + } + if (EXTRA_ADDRESS_CONSTRAINT (c, p)) + { + if (EXTRA_CONSTRAINT_STR (op, c, p)) + win = true; + + /* If we didn't already win, we can reload + the address into a base register. 
*/ + cl = base_reg_class (VOIDmode, ADDR_SPACE_GENERIC, + ADDRESS, SCRATCH); + this_alternative + = reg_class_subunion[this_alternative][cl]; + IOR_HARD_REG_SET (this_alternative_set, + reg_class_contents[cl]); + if (costly_p) + { + this_costly_alternative + = (reg_class_subunion + [this_costly_alternative][cl]); + IOR_HARD_REG_SET (this_costly_alternative_set, + reg_class_contents[cl]); + } + badop = false; + break; + } + + if (EXTRA_CONSTRAINT_STR (op, c, p)) + win = true; +#endif + break; + } + + cl = REG_CLASS_FROM_CONSTRAINT (c, p); + this_alternative = reg_class_subunion[this_alternative][cl]; + IOR_HARD_REG_SET (this_alternative_set, + reg_class_contents[cl]); + if (costly_p) + { + this_costly_alternative + = reg_class_subunion[this_costly_alternative][cl]; + IOR_HARD_REG_SET (this_costly_alternative_set, + reg_class_contents[cl]); + } + reg: + if (mode == BLKmode) + break; + winreg = true; + if (REG_P (op)) + { + if (hard_regno[nop] >= 0 + && in_hard_reg_set_p (this_alternative_set, + mode, hard_regno[nop])) + win = true; + else if (hard_regno[nop] < 0 + && in_class_p (op, this_alternative, NULL)) + win = true; + } + break; + } + if (c != ' ' && c != '\t') + costly_p = c == '*'; + } + while ((p += len), c); + + /* Record which operands fit this alternative. */ + if (win) + { + this_alternative_win = true; + if (operand_reg[nop] != NULL_RTX) + { + if (hard_regno[nop] >= 0) + { + if (in_hard_reg_set_p (this_costly_alternative_set, + mode, hard_regno[nop])) + reject++; + } + else + { + /* Prefer won reg to spilled pseudo under other equal + conditions. */ + reject++; + if (in_class_p (operand_reg[nop], + this_costly_alternative, NULL)) + reject++; + } + /* We simulate the behaviour of old reload here. + Although scratches need hard registers and it + might result in spilling other pseudos, no reload + insns are generated for the scratches. So it + might cost something but probably less than old + reload pass believes. 
*/ + if (lra_former_scratch_p (REGNO (operand_reg[nop]))) + reject += LOSER_COST_FACTOR; + } + } + else if (did_match) + this_alternative_match_win = true; + else + { + int const_to_mem = 0; + bool no_regs_p; + + no_regs_p + = (this_alternative == NO_REGS + || (hard_reg_set_subset_p + (reg_class_contents[this_alternative], + lra_no_alloc_regs))); + /* If this operand accepts a register, and if the + register class has at least one allocatable register, + then this operand can be reloaded. */ + if (winreg && !no_regs_p) + badop = false; + + if (badop) + goto fail; + + this_alternative_offmemok = offmemok; + if (this_costly_alternative != NO_REGS) + reject++; + /* If the operand is dying, has a matching constraint, + and satisfies constraints of the matched operand + which failed to satisfy the own constraints, we do + not need to generate a reload insn for this + operand. */ + if (!(this_alternative_matches >= 0 + && !curr_alt_win[this_alternative_matches] + && REG_P (op) + && find_regno_note (curr_insn, REG_DEAD, REGNO (op)) + && (hard_regno[nop] >= 0 + ? in_hard_reg_set_p (this_alternative_set, + mode, hard_regno[nop]) + : in_class_p (op, this_alternative, NULL)))) + losers++; + if (operand_reg[nop] != NULL_RTX + /* Output operands and matched input operands are + not inherited. The following conditions do not + exactly describe the previous statement but they + are pretty close. */ + && curr_static_id->operand[nop].type != OP_OUT + && (this_alternative_matches < 0 + || curr_static_id->operand[nop].type != OP_IN)) + { + int last_reload = (lra_reg_info[ORIGINAL_REGNO + (operand_reg[nop])] + .last_reload); + + if (last_reload > bb_reload_num) + reload_sum += last_reload - bb_reload_num; + } + /* If this is a constant that is reloaded into the + desired class by copying it to memory first, count + that as another reload. This is consistent with + other code and is required to avoid choosing another + alternative when the constant is moved into memory. 
+ Note that the test here is precisely the same as in + the code below that calls force_const_mem. */ + if (CONST_POOL_OK_P (mode, op) + && ((targetm.preferred_reload_class + (op, this_alternative) == NO_REGS) + || no_input_reloads_p)) + { + const_to_mem = 1; + if (! no_regs_p) + losers++; + } + + /* Alternative loses if it requires a type of reload not + permitted for this insn. We can always reload + objects with a REG_UNUSED note. */ + if ((curr_static_id->operand[nop].type != OP_IN + && no_output_reloads_p + && ! find_reg_note (curr_insn, REG_UNUSED, op)) + || (curr_static_id->operand[nop].type != OP_OUT + && no_input_reloads_p && ! const_to_mem)) + goto fail; + + /* If we can't reload this value at all, reject this + alternative. Note that we could also lose due to + LIMIT_RELOAD_CLASS, but we don't check that here. */ + if (! CONSTANT_P (op) && ! no_regs_p) + { + if (targetm.preferred_reload_class + (op, this_alternative) == NO_REGS) + reject = MAX_OVERALL_COST_BOUND; + + if (curr_static_id->operand[nop].type == OP_OUT + && (targetm.preferred_output_reload_class + (op, this_alternative) == NO_REGS)) + reject = MAX_OVERALL_COST_BOUND; + } + + if (! ((const_to_mem && constmemok) + || (MEM_P (op) && offmemok))) + { + /* We prefer to reload pseudos over reloading other + things, since such reloads may be able to be + eliminated later. So bump REJECT in other cases. + Don't do this in the case where we are forcing a + constant into memory and it will then win since + we don't want to have a different alternative + match then. */ + if (! (REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER)) + reject += 2; + + if (! no_regs_p) + reload_nregs + += ira_reg_class_max_nregs[this_alternative][mode]; + } + + /* Input reloads can be inherited more often than output + reloads can be removed, so penalize output + reloads. */ + if (!REG_P (op) || curr_static_id->operand[nop].type != OP_IN) + reject++; + } + + if (early_clobber_p) + reject++; + /* ??? 
We check early clobbers after processing all operands + (see loop below) and there we update the costs more. + Should we update the cost (may be approximately) here + because of early clobber register reloads or it is a rare + or non-important thing to be worth to do it. */ + overall = losers * LOSER_COST_FACTOR + reject; + if ((best_losers == 0 || losers != 0) && best_overall < overall) + goto fail; + + curr_alt[nop] = this_alternative; + COPY_HARD_REG_SET (curr_alt_set[nop], this_alternative_set); + curr_alt_win[nop] = this_alternative_win; + curr_alt_match_win[nop] = this_alternative_match_win; + curr_alt_offmemok[nop] = this_alternative_offmemok; + curr_alt_matches[nop] = this_alternative_matches; + + if (this_alternative_matches >= 0 + && !did_match && !this_alternative_win) + curr_alt_win[this_alternative_matches] = false; + + if (early_clobber_p && operand_reg[nop] != NULL_RTX) + early_clobbered_nops[early_clobbered_regs_num++] = nop; + } + ok_p = true; + curr_alt_dont_inherit_ops_num = 0; + for (nop = 0; nop < early_clobbered_regs_num; nop++) + { + int i, j, clobbered_hard_regno; + HARD_REG_SET temp_set; + + i = early_clobbered_nops[nop]; + if ((! curr_alt_win[i] && ! curr_alt_match_win[i]) + || hard_regno[i] < 0) + continue; + clobbered_hard_regno = hard_regno[i]; + CLEAR_HARD_REG_SET (temp_set); + add_to_hard_reg_set (&temp_set, biggest_mode[i], clobbered_hard_regno); + for (j = 0; j < n_operands; j++) + if (j == i + /* We don't want process insides of match_operator and + match_parallel because otherwise we would process + their operands once again generating a wrong + code. */ + || curr_static_id->operand[j].is_operator) + continue; + else if ((curr_alt_matches[j] == i && curr_alt_match_win[j]) + || (curr_alt_matches[i] == j && curr_alt_match_win[i])) + continue; + else if (uses_hard_regs_p (*curr_id->operand_loc[j], temp_set)) + break; + if (j >= n_operands) + continue; + /* We need to reload early clobbered register. 
*/ + for (j = 0; j < n_operands; j++) + if (curr_alt_matches[j] == i) + { + curr_alt_match_win[j] = false; + losers++; + overall += LOSER_COST_FACTOR; + } + if (! curr_alt_match_win[i]) + curr_alt_dont_inherit_ops[curr_alt_dont_inherit_ops_num++] = i; + else + { + /* Remember pseudos used for match reloads are never + inherited. */ + lra_assert (curr_alt_matches[i] >= 0); + curr_alt_win[curr_alt_matches[i]] = false; + } + curr_alt_win[i] = curr_alt_match_win[i] = false; + losers++; + overall += LOSER_COST_FACTOR; + } + small_class_operands_num = 0; + for (nop = 0; nop < n_operands; nop++) + small_class_operands_num + += SMALL_REGISTER_CLASS_P (curr_alt[nop]) ? 1 : 0; + + /* If this alternative can be made to work by reloading, and it + needs less reloading than the others checked so far, record + it as the chosen goal for reloading. */ + if ((best_losers != 0 && losers == 0) + || (((best_losers == 0 && losers == 0) + || (best_losers != 0 && losers != 0)) + && (best_overall > overall + || (best_overall == overall + /* If the cost of the reloads is the same, + prefer alternative which requires minimal + number of small register classes for the + operands. This improves chances of reloads + for insn requiring small register + classes. 
*/ + && (small_class_operands_num + < best_small_class_operands_num + || (small_class_operands_num + == best_small_class_operands_num + && (reload_nregs < best_reload_nregs + || (reload_nregs == best_reload_nregs + && best_reload_sum < reload_sum)))))))) + { + for (nop = 0; nop < n_operands; nop++) + { + goal_alt_win[nop] = curr_alt_win[nop]; + goal_alt_match_win[nop] = curr_alt_match_win[nop]; + goal_alt_matches[nop] = curr_alt_matches[nop]; + goal_alt[nop] = curr_alt[nop]; + goal_alt_offmemok[nop] = curr_alt_offmemok[nop]; + } + goal_alt_dont_inherit_ops_num = curr_alt_dont_inherit_ops_num; + for (nop = 0; nop < curr_alt_dont_inherit_ops_num; nop++) + goal_alt_dont_inherit_ops[nop] = curr_alt_dont_inherit_ops[nop]; + goal_alt_swapped = curr_swapped; + best_overall = overall; + best_losers = losers; + best_small_class_operands_num = small_class_operands_num; + best_reload_nregs = reload_nregs; + best_reload_sum = reload_sum; + goal_alt_number = nalt; + } + if (losers == 0) + /* Everything is satisfied. Do not process alternatives + anymore. */ + break; + fail: + ; + } + return ok_p; +} + +/* Return 1 if ADDR is a valid memory address for mode MODE in address + space AS, and check that each pseudo has the proper kind of hard + reg. */ +static int +valid_address_p (enum machine_mode mode ATTRIBUTE_UNUSED, + rtx addr, addr_space_t as) +{ +#ifdef GO_IF_LEGITIMATE_ADDRESS + lra_assert (ADDR_SPACE_GENERIC_P (as)); + GO_IF_LEGITIMATE_ADDRESS (mode, addr, win); + return 0; + + win: + return 1; +#else + return targetm.addr_space.legitimate_address_p (mode, addr, 0, as); +#endif +} + +/* Make reload base reg + disp from address AD in space AS of memory + with MODE into a new pseudo. Return the new pseudo. 
*/ +static rtx +base_plus_disp_to_reg (enum machine_mode mode, addr_space_t as, + struct address *ad) +{ + enum reg_class cl; + rtx new_reg; + + lra_assert (ad->base_reg_loc != NULL && ad->disp_loc != NULL); + cl = base_reg_class (mode, as, ad->base_outer_code, ad->index_code); + new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, "base + disp"); + lra_emit_add (new_reg, *ad->base_reg_loc, *ad->disp_loc); + return new_reg; +} + +/* Make substitution in address AD in space AS with location ADDR_LOC. + Update AD and ADDR_LOC if it is necessary. Return true if a + substitution was made. */ +static bool +equiv_address_substitution (struct address *ad, rtx *addr_loc, + enum machine_mode mode, addr_space_t as, + enum rtx_code code) +{ + rtx base_reg, new_base_reg, index_reg, new_index_reg; + HOST_WIDE_INT disp, scale; + bool change_p; + + if (ad->base_reg_loc == NULL) + base_reg = new_base_reg = NULL_RTX; + else + { + base_reg = *ad->base_reg_loc; + new_base_reg = get_equiv_substitution (base_reg); + } + if (ad->index_reg_loc == NULL) + index_reg = new_index_reg = NULL_RTX; + else + { + index_reg = *ad->index_reg_loc; + new_index_reg = get_equiv_substitution (index_reg); + } + if (base_reg == new_base_reg && index_reg == new_index_reg) + return false; + disp = 0; + change_p = false; + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, "Changing address in insn %d ", + INSN_UID (curr_insn)); + print_value_slim (lra_dump_file, *addr_loc, 1); + } + if (base_reg != new_base_reg) + { + if (REG_P (new_base_reg)) + { + *ad->base_reg_loc = new_base_reg; + change_p = true; + } + else if (GET_CODE (new_base_reg) == PLUS + && REG_P (XEXP (new_base_reg, 0)) + && CONST_INT_P (XEXP (new_base_reg, 1))) + { + disp += INTVAL (XEXP (new_base_reg, 1)); + *ad->base_reg_loc = XEXP (new_base_reg, 0); + change_p = true; + } + if (ad->base_reg_loc2 != NULL) + *ad->base_reg_loc2 = *ad->base_reg_loc; + } + scale = 1; + if (ad->index_loc != NULL && GET_CODE (*ad->index_loc) == MULT) + { + 
lra_assert (CONST_INT_P (XEXP (*ad->index_loc, 1))); + scale = INTVAL (XEXP (*ad->index_loc, 1)); + } + if (index_reg != new_index_reg) + { + if (REG_P (new_index_reg)) + { + *ad->index_reg_loc = new_index_reg; + change_p = true; + } + else if (GET_CODE (new_index_reg) == PLUS + && REG_P (XEXP (new_index_reg, 0)) + && CONST_INT_P (XEXP (new_index_reg, 1))) + { + disp += INTVAL (XEXP (new_index_reg, 1)) * scale; + *ad->index_reg_loc = XEXP (new_index_reg, 0); + change_p = true; + } + } + if (disp != 0) + { + if (ad->disp_loc != NULL) + *ad->disp_loc = plus_constant (Pmode, *ad->disp_loc, disp); + else + { + *addr_loc = gen_rtx_PLUS (Pmode, *addr_loc, GEN_INT (disp)); + extract_address_regs (mode, as, addr_loc, code, ad); + } + change_p = true; + } + if (lra_dump_file != NULL) + { + if (! change_p) + fprintf (lra_dump_file, " -- no change\n"); + else + { + fprintf (lra_dump_file, " on equiv "); + print_value_slim (lra_dump_file, *addr_loc, 1); + fprintf (lra_dump_file, "\n"); + } + } + return change_p; +} + +/* Major function to make reloads for address in operand NOP. Add to + reloads to the list *BEFORE and *AFTER. We might need to add + reloads to *AFTER because of inc/dec, {pre, post} modify in the + address. Return true for any RTL change. 
*/ +static bool +process_address (int nop, rtx *before, rtx *after) +{ + struct address ad; + enum machine_mode mode; + rtx new_reg, *addr_loc, saved_index_reg, saved_base_reg; + bool ok_p; + addr_space_t as; + rtx op = *curr_id->operand_loc[nop]; + const char *constraint = curr_static_id->operand[nop].constraint; + bool change_p; + enum rtx_code code; + + if (constraint[0] == 'p' + || EXTRA_ADDRESS_CONSTRAINT (constraint[0], constraint)) + { + mode = VOIDmode; + addr_loc = curr_id->operand_loc[nop]; + as = ADDR_SPACE_GENERIC; + code = ADDRESS; + } + else if (MEM_P (op)) + { + mode = GET_MODE (op); + addr_loc = &XEXP (op, 0); + as = MEM_ADDR_SPACE (op); + code = MEM; + } + else if (GET_CODE (op) == SUBREG + && MEM_P (SUBREG_REG (op))) + { + mode = GET_MODE (SUBREG_REG (op)); + addr_loc = &XEXP (SUBREG_REG (op), 0); + as = MEM_ADDR_SPACE (SUBREG_REG (op)); + code = MEM; + } + else + return false; + if (GET_CODE (*addr_loc) == AND) + addr_loc = &XEXP (*addr_loc, 0); + extract_address_regs (mode, as, addr_loc, code, &ad); + change_p = equiv_address_substitution (&ad, addr_loc, mode, as, code); + if (ad.base_reg_loc != NULL + && (process_addr_reg + (ad.base_reg_loc, before, + (ad.base_modify_p && REG_P (*ad.base_reg_loc) + && find_regno_note (curr_insn, REG_DEAD, + REGNO (*ad.base_reg_loc)) == NULL_RTX + ? after : NULL), + base_reg_class (mode, as, ad.base_outer_code, ad.index_code)))) + { + change_p = true; + if (ad.base_reg_loc2 != NULL) + *ad.base_reg_loc2 = *ad.base_reg_loc; + } + if (ad.index_reg_loc != NULL + && process_addr_reg (ad.index_reg_loc, before, NULL, INDEX_REG_CLASS)) + change_p = true; + + /* The address was valid before LRA. We only change its form if the + address has a displacement, so if it has no displacement it must + still be valid. */ + if (ad.disp_loc == NULL) + return change_p; + + /* See whether the address is still valid. 
Some ports do not check + displacements for eliminable registers, so we replace them + temporarily with the elimination target. */ + saved_base_reg = saved_index_reg = NULL_RTX; + if (ad.base_reg_loc != NULL) + { + saved_base_reg = *ad.base_reg_loc; + lra_eliminate_reg_if_possible (ad.base_reg_loc); + if (ad.base_reg_loc2 != NULL) + *ad.base_reg_loc2 = *ad.base_reg_loc; + } + if (ad.index_reg_loc != NULL) + { + saved_index_reg = *ad.index_reg_loc; + lra_eliminate_reg_if_possible (ad.index_reg_loc); + } + /* Some ports do not check displacements for virtual registers -- so + we substitute them temporarily by real registers. */ + ok_p = valid_address_p (mode, *addr_loc, as); + if (saved_base_reg != NULL_RTX) + { + *ad.base_reg_loc = saved_base_reg; + if (ad.base_reg_loc2 != NULL) + *ad.base_reg_loc2 = saved_base_reg; + } + if (saved_index_reg != NULL_RTX) + *ad.index_reg_loc = saved_index_reg; + + if (ok_p) + return change_p; + + /* Addresses were legitimate before LRA. So if the address has + two registers than it can have two of them. We should also + not worry about scale for the same reason. */ + push_to_sequence (*before); + if (ad.base_reg_loc == NULL) + { + if (ad.index_reg_loc == NULL) + { + int code = -1; + enum reg_class cl = base_reg_class (mode, as, SCRATCH, SCRATCH); + + new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, "disp"); +#ifdef HAVE_lo_sum + { + rtx insn; + rtx last = get_last_insn (); + + /* disp => lo_sum (new_base, disp) */ + insn = emit_insn (gen_rtx_SET + (VOIDmode, new_reg, + gen_rtx_HIGH (Pmode, copy_rtx (*ad.disp_loc)))); + code = recog_memoized (insn); + if (code >= 0) + { + rtx save = *ad.disp_loc; + + *ad.disp_loc = gen_rtx_LO_SUM (Pmode, new_reg, *ad.disp_loc); + if (! 
valid_address_p (mode, *ad.disp_loc, as)) + { + *ad.disp_loc = save; + code = -1; + } + } + if (code < 0) + delete_insns_since (last); + } +#endif + if (code < 0) + { + /* disp => new_base */ + lra_emit_move (new_reg, *ad.disp_loc); + *ad.disp_loc = new_reg; + } + } + else + { + /* index * scale + disp => new base + index * scale */ + enum reg_class cl = base_reg_class (mode, as, SCRATCH, SCRATCH); + + lra_assert (INDEX_REG_CLASS != NO_REGS); + new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, "disp"); + lra_assert (GET_CODE (*addr_loc) == PLUS); + lra_emit_move (new_reg, *ad.disp_loc); + if (CONSTANT_P (XEXP (*addr_loc, 1))) + XEXP (*addr_loc, 1) = XEXP (*addr_loc, 0); + XEXP (*addr_loc, 0) = new_reg; + } + } + else if (ad.index_reg_loc == NULL) + { + /* base + disp => new base */ + /* Another option would be to reload the displacement into an + index register. However, postreload has code to optimize + address reloads that have the same base and different + displacements, so reloading into an index register would + not necessarily be a win. */ + new_reg = base_plus_disp_to_reg (mode, as, &ad); + *addr_loc = new_reg; + } + else + { + /* base + scale * index + disp => new base + scale * index */ + new_reg = base_plus_disp_to_reg (mode, as, &ad); + *addr_loc = gen_rtx_PLUS (Pmode, new_reg, *ad.index_loc); + } + *before = get_insns (); + end_sequence (); + return true; +} + +/* Emit insns to reload VALUE into a new register. VALUE is an + auto-increment or auto-decrement RTX whose operand is a register or + memory location; so reloading involves incrementing that location. + IN is either identical to VALUE, or some cheaper place to reload + value being incremented/decremented from. + + INC_AMOUNT is the number to increment or decrement by (always + positive and ignored for POST_MODIFY/PRE_MODIFY). + + Return pseudo containing the result. 
*/ +static rtx +emit_inc (enum reg_class new_rclass, rtx in, rtx value, int inc_amount) +{ + /* REG or MEM to be copied and incremented. */ + rtx incloc = XEXP (value, 0); + /* Nonzero if increment after copying. */ + int post = (GET_CODE (value) == POST_DEC || GET_CODE (value) == POST_INC + || GET_CODE (value) == POST_MODIFY); + rtx last; + rtx inc; + rtx add_insn; + int code; + rtx real_in = in == value ? incloc : in; + rtx result; + bool plus_p = true; + + if (GET_CODE (value) == PRE_MODIFY || GET_CODE (value) == POST_MODIFY) + { + lra_assert (GET_CODE (XEXP (value, 1)) == PLUS + || GET_CODE (XEXP (value, 1)) == MINUS); + lra_assert (rtx_equal_p (XEXP (XEXP (value, 1), 0), XEXP (value, 0))); + plus_p = GET_CODE (XEXP (value, 1)) == PLUS; + inc = XEXP (XEXP (value, 1), 1); + } + else + { + if (GET_CODE (value) == PRE_DEC || GET_CODE (value) == POST_DEC) + inc_amount = -inc_amount; + + inc = GEN_INT (inc_amount); + } + + if (! post && REG_P (incloc)) + result = incloc; + else + result = lra_create_new_reg (GET_MODE (value), value, new_rclass, + "INC/DEC result"); + + if (real_in != result) + { + /* First copy the location to the result register. */ + lra_assert (REG_P (result)); + emit_insn (gen_move_insn (result, real_in)); + } + + /* We suppose that there are insns to add/sub with the constant + increment permitted in {PRE/POST)_{DEC/INC/MODIFY}. At least the + old reload worked with this assumption. If the assumption + becomes wrong, we should use approach in function + base_plus_disp_to_reg. */ + if (in == value) + { + /* See if we can directly increment INCLOC. */ + last = get_last_insn (); + add_insn = emit_insn (plus_p + ? gen_add2_insn (incloc, inc) + : gen_sub2_insn (incloc, inc)); + + code = recog_memoized (add_insn); + if (code >= 0) + { + if (! post && result != incloc) + emit_insn (gen_move_insn (result, incloc)); + return result; + } + delete_insns_since (last); + } + + /* If couldn't do the increment directly, must increment in RESULT. 
+ The way we do this depends on whether this is pre- or + post-increment. For pre-increment, copy INCLOC to the reload + register, increment it there, then save back. */ + if (! post) + { + if (real_in != result) + emit_insn (gen_move_insn (result, real_in)); + if (plus_p) + emit_insn (gen_add2_insn (result, inc)); + else + emit_insn (gen_sub2_insn (result, inc)); + if (result != incloc) + emit_insn (gen_move_insn (incloc, result)); + } + else + { + /* Post-increment. + + Because this might be a jump insn or a compare, and because + RESULT may not be available after the insn in an input + reload, we must do the incrementing before the insn being + reloaded for. + + We have already copied IN to RESULT. Increment the copy in + RESULT, save that back, then decrement RESULT so it has + the original value. */ + if (plus_p) + emit_insn (gen_add2_insn (result, inc)); + else + emit_insn (gen_sub2_insn (result, inc)); + emit_insn (gen_move_insn (incloc, result)); + /* Restore non-modified value for the result. We prefer this + way because it does not require an additional hard + register. */ + if (plus_p) + { + if (CONST_INT_P (inc)) + emit_insn (gen_add2_insn (result, GEN_INT (-INTVAL (inc)))); + else + emit_insn (gen_sub2_insn (result, inc)); + } + else + emit_insn (gen_add2_insn (result, inc)); + } + return result; +} + +/* Swap operands NOP and NOP + 1. */ +static inline void +swap_operands (int nop) +{ + enum machine_mode mode = curr_operand_mode[nop]; + curr_operand_mode[nop] = curr_operand_mode[nop + 1]; + curr_operand_mode[nop + 1] = mode; + rtx x = *curr_id->operand_loc[nop]; + *curr_id->operand_loc[nop] = *curr_id->operand_loc[nop + 1]; + *curr_id->operand_loc[nop + 1] = x; + /* Swap the duplicates too. */ + lra_update_dup (curr_id, nop); + lra_update_dup (curr_id, nop + 1); +} + +/* Main entry point of the constraint code: search the body of the + current insn to choose the best alternative. 
It is mimicking insn + alternative cost calculation model of former reload pass. That is + because machine descriptions were written to use this model. This + model can be changed in future. Make commutative operand exchange + if it is chosen. + + Return true if some RTL changes happened during function call. */ +static bool +curr_insn_transform (void) +{ + int i, j, k; + int n_operands; + int n_alternatives; + int commutative; + signed char goal_alt_matched[MAX_RECOG_OPERANDS][MAX_RECOG_OPERANDS]; + rtx before, after; + bool alt_p = false; + /* Flag that the insn has been changed through a transformation. */ + bool change_p; + bool sec_mem_p; +#ifdef SECONDARY_MEMORY_NEEDED + bool use_sec_mem_p; +#endif + int max_regno_before; + int reused_alternative_num; + + no_input_reloads_p = no_output_reloads_p = false; + goal_alt_number = -1; + + if (check_and_process_move (&change_p, &sec_mem_p)) + return change_p; + + /* JUMP_INSNs and CALL_INSNs are not allowed to have any output + reloads; neither are insns that SET cc0. Insns that use CC0 are + not allowed to have any input reloads. */ + if (JUMP_P (curr_insn) || CALL_P (curr_insn)) + no_output_reloads_p = true; + +#ifdef HAVE_cc0 + if (reg_referenced_p (cc0_rtx, PATTERN (curr_insn))) + no_input_reloads_p = true; + if (reg_set_p (cc0_rtx, PATTERN (curr_insn))) + no_output_reloads_p = true; +#endif + + n_operands = curr_static_id->n_operands; + n_alternatives = curr_static_id->n_alternatives; + + /* Just return "no reloads" if insn has no operands with + constraints. */ + if (n_operands == 0 || n_alternatives == 0) + return false; + + max_regno_before = max_reg_num (); + + for (i = 0; i < n_operands; i++) + { + goal_alt_matched[i][0] = -1; + goal_alt_matches[i] = -1; + } + + commutative = curr_static_id->commutative; + + /* Now see what we need for pseudos that didn't get hard regs or got + the wrong kind of hard reg. For this, we must consider all the + operands together against the register constraints. 
*/ + + best_losers = best_overall = MAX_RECOG_OPERANDS * 2 + MAX_OVERALL_COST_BOUND; + best_small_class_operands_num = best_reload_sum = 0; + + curr_swapped = false; + goal_alt_swapped = false; + + /* Make equivalence substitution and memory subreg elimination + before address processing because an address legitimacy can + depend on memory mode. */ + for (i = 0; i < n_operands; i++) + { + rtx op = *curr_id->operand_loc[i]; + rtx subst, old = op; + bool op_change_p = false; + + if (GET_CODE (old) == SUBREG) + old = SUBREG_REG (old); + subst = get_equiv_substitution (old); + if (subst != old) + { + subst = copy_rtx (subst); + lra_assert (REG_P (old)); + if (GET_CODE (op) == SUBREG) + SUBREG_REG (op) = subst; + else + *curr_id->operand_loc[i] = subst; + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, + "Changing pseudo %d in operand %i of insn %u on equiv ", + REGNO (old), i, INSN_UID (curr_insn)); + print_value_slim (lra_dump_file, subst, 1); + fprintf (lra_dump_file, "\n"); + } + op_change_p = change_p = true; + } + if (simplify_operand_subreg (i, GET_MODE (old)) || op_change_p) + { + change_p = true; + lra_update_dup (curr_id, i); + } + } + + /* Reload address registers and displacements. We do it before + finding an alternative because of memory constraints. */ + before = after = NULL_RTX; + for (i = 0; i < n_operands; i++) + if (! curr_static_id->operand[i].is_operator + && process_address (i, &before, &after)) + { + change_p = true; + lra_update_dup (curr_id, i); + } + + if (change_p) + /* If we've changed the instruction then any alternative that + we chose previously may no longer be valid. 
*/ + lra_set_used_insn_alternative (curr_insn, -1); + + try_swapped: + + reused_alternative_num = curr_id->used_insn_alternative; + if (lra_dump_file != NULL && reused_alternative_num >= 0) + fprintf (lra_dump_file, "Reusing alternative %d for insn #%u\n", + reused_alternative_num, INSN_UID (curr_insn)); + + if (process_alt_operands (reused_alternative_num)) + alt_p = true; + + /* If insn is commutative (it's safe to exchange a certain pair of + operands) then we need to try each alternative twice, the second + time matching those two operands as if we had exchanged them. To + do this, really exchange them in operands. + + If we have just tried the alternatives the second time, return + operands to normal and drop through. */ + + if (reused_alternative_num < 0 && commutative >= 0) + { + curr_swapped = !curr_swapped; + if (curr_swapped) + { + swap_operands (commutative); + goto try_swapped; + } + else + swap_operands (commutative); + } + + /* The operands don't meet the constraints. goal_alt describes the + alternative that we could reach by reloading the fewest operands. + Reload so as to fit it. */ + + if (! alt_p && ! sec_mem_p) + { + /* No alternative works with reloads?? */ + if (INSN_CODE (curr_insn) >= 0) + fatal_insn ("unable to generate reloads for:", curr_insn); + error_for_asm (curr_insn, + "inconsistent operand constraints in an %<asm%>"); + /* Avoid further trouble with this insn. */ + PATTERN (curr_insn) = gen_rtx_USE (VOIDmode, const0_rtx); + lra_invalidate_insn_data (curr_insn); + return true; + } + + /* If the best alternative is with operands 1 and 2 swapped, swap + them. Update the operand numbers of any reloads already + pushed. */ + + if (goal_alt_swapped) + { + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Commutative operand exchange in insn %u\n", + INSN_UID (curr_insn)); + + /* Swap the duplicates too. 
*/ + swap_operands (commutative); + change_p = true; + } + +#ifdef SECONDARY_MEMORY_NEEDED + /* Some target macros SECONDARY_MEMORY_NEEDED (e.g. x86) are defined + too conservatively. So we use the secondary memory only if there + is no alternative without reloads. */ + use_sec_mem_p = false; + if (! alt_p) + use_sec_mem_p = true; + else if (sec_mem_p) + { + for (i = 0; i < n_operands; i++) + if (! goal_alt_win[i] && ! goal_alt_match_win[i]) + break; + use_sec_mem_p = i < n_operands; + } + + if (use_sec_mem_p) + { + rtx new_reg, set, src, dest; + enum machine_mode sec_mode; + + lra_assert (sec_mem_p); + set = single_set (curr_insn); + lra_assert (set != NULL_RTX && ! side_effects_p (set)); + dest = SET_DEST (set); + src = SET_SRC (set); +#ifdef SECONDARY_MEMORY_NEEDED_MODE + sec_mode = SECONDARY_MEMORY_NEEDED_MODE (GET_MODE (src)); +#else + sec_mode = GET_MODE (src); +#endif + new_reg = lra_create_new_reg (sec_mode, NULL_RTX, + NO_REGS, "secondary"); + /* If the mode is changed, it should be wider. */ + lra_assert (GET_MODE_SIZE (GET_MODE (new_reg)) + >= GET_MODE_SIZE (GET_MODE (src))); + after = emit_spill_move (false, new_reg, dest); + lra_process_new_insns (curr_insn, NULL_RTX, after, + "Inserting the sec. 
move"); + before = emit_spill_move (true, new_reg, src); + lra_process_new_insns (curr_insn, before, NULL_RTX, "Changing on"); + lra_set_insn_deleted (curr_insn); + return true; + } +#endif + + lra_assert (goal_alt_number >= 0); + lra_set_used_insn_alternative (curr_insn, goal_alt_number); + + if (lra_dump_file != NULL) + { + const char *p; + + fprintf (lra_dump_file, " Choosing alt %d in insn %u:", + goal_alt_number, INSN_UID (curr_insn)); + for (i = 0; i < n_operands; i++) + { + p = (curr_static_id->operand_alternative + [goal_alt_number * n_operands + i].constraint); + if (*p == '\0') + continue; + fprintf (lra_dump_file, " (%d) ", i); + for (; *p != '\0' && *p != ',' && *p != '#'; p++) + fputc (*p, lra_dump_file); + } + fprintf (lra_dump_file, "\n"); + } + + /* Right now, for any pair of operands I and J that are required to + match, with J < I, goal_alt_matches[I] is J. Add I to + goal_alt_matched[J]. */ + + for (i = 0; i < n_operands; i++) + if ((j = goal_alt_matches[i]) >= 0) + { + for (k = 0; goal_alt_matched[j][k] >= 0; k++) + ; + /* We allow matching one output operand and several input + operands. */ + lra_assert (k == 0 + || (curr_static_id->operand[j].type == OP_OUT + && curr_static_id->operand[i].type == OP_IN + && (curr_static_id->operand + [goal_alt_matched[j][0]].type == OP_IN))); + goal_alt_matched[j][k] = i; + goal_alt_matched[j][k + 1] = -1; + } + + for (i = 0; i < n_operands; i++) + goal_alt_win[i] |= goal_alt_match_win[i]; + + /* Any constants that aren't allowed and can't be reloaded into + registers are here changed into memory references. 
*/ + for (i = 0; i < n_operands; i++) + if (goal_alt_win[i]) + { + int regno; + enum reg_class new_class; + rtx reg = *curr_id->operand_loc[i]; + + if (GET_CODE (reg) == SUBREG) + reg = SUBREG_REG (reg); + + if (REG_P (reg) && (regno = REGNO (reg)) >= FIRST_PSEUDO_REGISTER) + { + bool ok_p = in_class_p (reg, goal_alt[i], &new_class); + + if (new_class != NO_REGS && get_reg_class (regno) != new_class) + { + lra_assert (ok_p); + change_class (regno, new_class, " Change", true); + } + } + } + else + { + const char *constraint; + char c; + rtx op = *curr_id->operand_loc[i]; + rtx subreg = NULL_RTX; + enum machine_mode mode = curr_operand_mode[i]; + + if (GET_CODE (op) == SUBREG) + { + subreg = op; + op = SUBREG_REG (op); + mode = GET_MODE (op); + } + + if (CONST_POOL_OK_P (mode, op) + && ((targetm.preferred_reload_class + (op, (enum reg_class) goal_alt[i]) == NO_REGS) + || no_input_reloads_p)) + { + rtx tem = force_const_mem (mode, op); + + change_p = true; + if (subreg != NULL_RTX) + tem = gen_rtx_SUBREG (mode, tem, SUBREG_BYTE (subreg)); + + *curr_id->operand_loc[i] = tem; + lra_update_dup (curr_id, i); + process_address (i, &before, &after); + + /* If the alternative accepts constant pool refs directly + there will be no reload needed at all. */ + if (subreg != NULL_RTX) + continue; + /* Skip alternatives before the one requested. 
*/ + constraint = (curr_static_id->operand_alternative + [goal_alt_number * n_operands + i].constraint); + for (; + (c = *constraint) && c != ',' && c != '#'; + constraint += CONSTRAINT_LEN (c, constraint)) + { + if (c == TARGET_MEM_CONSTRAINT || c == 'o') + break; +#ifdef EXTRA_CONSTRAINT_STR + if (EXTRA_MEMORY_CONSTRAINT (c, constraint) + && EXTRA_CONSTRAINT_STR (tem, c, constraint)) + break; +#endif + } + if (c == '\0' || c == ',' || c == '#') + continue; + + goal_alt_win[i] = true; + } + } + + for (i = 0; i < n_operands; i++) + { + rtx old, new_reg; + rtx op = *curr_id->operand_loc[i]; + + if (goal_alt_win[i]) + { + if (goal_alt[i] == NO_REGS + && REG_P (op) + /* When we assign NO_REGS it means that we will not + assign a hard register to the scratch pseudo in the + assignment pass and the scratch pseudo will be + spilled. Spilled scratch pseudos are transformed + back to scratches at the end of LRA. */ + && lra_former_scratch_operand_p (curr_insn, i)) + change_class (REGNO (op), NO_REGS, " Change", true); + continue; + } + + /* Operands that match previous ones have already been handled. */ + if (goal_alt_matches[i] >= 0) + continue; + + /* We should not have an operand with a non-offsettable address + appearing where an offsettable address will do. It also may + be a case when the address should be special, in other words, + not a general one (e.g. it needs no index reg). */ + if (goal_alt_matched[i][0] == -1 && goal_alt_offmemok[i] && MEM_P (op)) + { + enum reg_class rclass; + rtx *loc = &XEXP (op, 0); + enum rtx_code code = GET_CODE (*loc); + + push_to_sequence (before); + rclass = base_reg_class (GET_MODE (op), MEM_ADDR_SPACE (op), + MEM, SCRATCH); + if (GET_RTX_CLASS (code) == RTX_AUTOINC) + new_reg = emit_inc (rclass, *loc, *loc, + /* This value does not matter for MODIFY. 
*/ + GET_MODE_SIZE (GET_MODE (op))); + else if (get_reload_reg (OP_IN, Pmode, *loc, rclass, + "offsetable address", &new_reg)) + lra_emit_move (new_reg, *loc); + before = get_insns (); + end_sequence (); + *loc = new_reg; + lra_update_dup (curr_id, i); + } + else if (goal_alt_matched[i][0] == -1) + { + enum machine_mode mode; + rtx reg, *loc; + int hard_regno, byte; + enum op_type type = curr_static_id->operand[i].type; + + loc = curr_id->operand_loc[i]; + mode = curr_operand_mode[i]; + if (GET_CODE (*loc) == SUBREG) + { + reg = SUBREG_REG (*loc); + byte = SUBREG_BYTE (*loc); + if (REG_P (reg) + /* Strict_low_part requires reload the register not + the sub-register. */ + && (curr_static_id->operand[i].strict_low + || (GET_MODE_SIZE (mode) + <= GET_MODE_SIZE (GET_MODE (reg)) + && (hard_regno + = get_try_hard_regno (REGNO (reg))) >= 0 + && (simplify_subreg_regno + (hard_regno, + GET_MODE (reg), byte, mode) < 0) + && (goal_alt[i] == NO_REGS + || (simplify_subreg_regno + (ira_class_hard_regs[goal_alt[i]][0], + GET_MODE (reg), byte, mode) >= 0))))) + { + loc = &SUBREG_REG (*loc); + mode = GET_MODE (*loc); + } + } + old = *loc; + if (get_reload_reg (type, mode, old, goal_alt[i], "", &new_reg) + && type != OP_OUT) + { + push_to_sequence (before); + lra_emit_move (new_reg, old); + before = get_insns (); + end_sequence (); + } + *loc = new_reg; + if (type != OP_IN + && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX) + { + start_sequence (); + lra_emit_move (type == OP_INOUT ? 
copy_rtx (old) : old, new_reg); + emit_insn (after); + after = get_insns (); + end_sequence (); + *loc = new_reg; + } + for (j = 0; j < goal_alt_dont_inherit_ops_num; j++) + if (goal_alt_dont_inherit_ops[j] == i) + { + lra_set_regno_unique_value (REGNO (new_reg)); + break; + } + lra_update_dup (curr_id, i); + } + else if (curr_static_id->operand[i].type == OP_IN + && (curr_static_id->operand[goal_alt_matched[i][0]].type + == OP_OUT)) + { + signed char arr[2]; + + arr[0] = i; + arr[1] = -1; + match_reload (goal_alt_matched[i][0], arr, + goal_alt[i], &before, &after); + } + else if (curr_static_id->operand[i].type == OP_OUT + && (curr_static_id->operand[goal_alt_matched[i][0]].type + == OP_IN)) + match_reload (i, goal_alt_matched[i], goal_alt[i], &before, &after); + else + /* We must generate code in any case when function + process_alt_operands decides that it is possible. */ + gcc_unreachable (); + } + if (before != NULL_RTX || after != NULL_RTX + || max_regno_before != max_reg_num ()) + change_p = true; + if (change_p) + { + lra_update_operator_dups (curr_id); + /* Something changes -- process the insn. */ + lra_update_insn_regno_info (curr_insn); + } + lra_process_new_insns (curr_insn, before, after, "Inserting insn reload"); + return change_p; +} + +/* Return true if X is in LIST. */ +static bool +in_list_p (rtx x, rtx list) +{ + for (; list != NULL_RTX; list = XEXP (list, 1)) + if (XEXP (list, 0) == x) + return true; + return false; +} + +/* Return true if X contains an allocatable hard register (if + HARD_REG_P) or a (spilled if SPILLED_P) pseudo. 
*/ +static bool +contains_reg_p (rtx x, bool hard_reg_p, bool spilled_p) +{ + int i, j; + const char *fmt; + enum rtx_code code; + + code = GET_CODE (x); + if (REG_P (x)) + { + int regno = REGNO (x); + HARD_REG_SET alloc_regs; + + if (hard_reg_p) + { + if (regno >= FIRST_PSEUDO_REGISTER) + regno = lra_get_regno_hard_regno (regno); + if (regno < 0) + return false; + COMPL_HARD_REG_SET (alloc_regs, lra_no_alloc_regs); + return overlaps_hard_reg_set_p (alloc_regs, GET_MODE (x), regno); + } + else + { + if (regno < FIRST_PSEUDO_REGISTER) + return false; + if (! spilled_p) + return true; + return lra_get_regno_hard_regno (regno) < 0; + } + } + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + { + if (contains_reg_p (XEXP (x, i), hard_reg_p, spilled_p)) + return true; + } + else if (fmt[i] == 'E') + { + for (j = XVECLEN (x, i) - 1; j >= 0; j--) + if (contains_reg_p (XVECEXP (x, i, j), hard_reg_p, spilled_p)) + return true; + } + } + return false; +} + +/* Process all regs in debug location *LOC and change them on + equivalent substitution. Return true if any change was done. */ +static bool +debug_loc_equivalence_change_p (rtx *loc) +{ + rtx subst, reg, x = *loc; + bool result = false; + enum rtx_code code = GET_CODE (x); + const char *fmt; + int i, j; + + if (code == SUBREG) + { + reg = SUBREG_REG (x); + if ((subst = get_equiv_substitution (reg)) != reg + && GET_MODE (subst) == VOIDmode) + { + /* We cannot reload debug location. Simplify subreg here + while we know the inner mode. */ + *loc = simplify_gen_subreg (GET_MODE (x), subst, + GET_MODE (reg), SUBREG_BYTE (x)); + return true; + } + } + if (code == REG && (subst = get_equiv_substitution (x)) != x) + { + *loc = subst; + return true; + } + + /* Scan all the operand sub-expressions. 
*/ + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + result = debug_loc_equivalence_change_p (&XEXP (x, i)) || result; + else if (fmt[i] == 'E') + for (j = XVECLEN (x, i) - 1; j >= 0; j--) + result + = debug_loc_equivalence_change_p (&XVECEXP (x, i, j)) || result; + } + return result; +} + +/* Maximum allowed number of constraint pass iterations after the last + spill pass. It is for preventing LRA cycling in a bug case. */ +#define MAX_CONSTRAINT_ITERATION_NUMBER 15 + +/* Maximum number of generated reload insns per an insn. It is for + preventing this pass cycling in a bug case. */ +#define MAX_RELOAD_INSNS_NUMBER LRA_MAX_INSN_RELOADS + +/* The current iteration number of this LRA pass. */ +int lra_constraint_iter; + +/* The current iteration number of this LRA pass after the last spill + pass. */ +int lra_constraint_iter_after_spill; + +/* True if we substituted equiv which needs checking register + allocation correctness because the equivalent value contains + allocatable hard registers or when we restore multi-register + pseudo. */ +bool lra_risky_transformations_p; + +/* Return true if REGNO is referenced in more than one block. */ +static bool +multi_block_pseudo_p (int regno) +{ + basic_block bb = NULL; + unsigned int uid; + bitmap_iterator bi; + + if (regno < FIRST_PSEUDO_REGISTER) + return false; + + EXECUTE_IF_SET_IN_BITMAP (&lra_reg_info[regno].insn_bitmap, 0, uid, bi) + if (bb == NULL) + bb = BLOCK_FOR_INSN (lra_insn_recog_data[uid]->insn); + else if (BLOCK_FOR_INSN (lra_insn_recog_data[uid]->insn) != bb) + return true; + return false; +} + +/* Return true if X contains a pseudo dying in INSN. 
*/ +static bool +dead_pseudo_p (rtx x, rtx insn) +{ + int i, j; + const char *fmt; + enum rtx_code code; + + if (REG_P (x)) + return (insn != NULL_RTX + && find_regno_note (insn, REG_DEAD, REGNO (x)) != NULL_RTX); + code = GET_CODE (x); + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + { + if (dead_pseudo_p (XEXP (x, i), insn)) + return true; + } + else if (fmt[i] == 'E') + { + for (j = XVECLEN (x, i) - 1; j >= 0; j--) + if (dead_pseudo_p (XVECEXP (x, i, j), insn)) + return true; + } + } + return false; +} + +/* Return true if INSN contains a dying pseudo in INSN right hand + side. */ +static bool +insn_rhs_dead_pseudo_p (rtx insn) +{ + rtx set = single_set (insn); + + gcc_assert (set != NULL); + return dead_pseudo_p (SET_SRC (set), insn); +} + +/* Return true if any init insn of REGNO contains a dying pseudo in + insn right hand side. */ +static bool +init_insn_rhs_dead_pseudo_p (int regno) +{ + rtx insns = ira_reg_equiv[regno].init_insns; + + if (insns == NULL) + return false; + if (INSN_P (insns)) + return insn_rhs_dead_pseudo_p (insns); + for (; insns != NULL_RTX; insns = XEXP (insns, 1)) + if (insn_rhs_dead_pseudo_p (XEXP (insns, 0))) + return true; + return false; +} + +/* Entry function of LRA constraint pass. Return true if the + constraint pass did change the code. 
*/ +bool +lra_constraints (bool first_p) +{ + bool changed_p; + int i, hard_regno, new_insns_num; + unsigned int min_len, new_min_len; + rtx set, x, dest_reg; + basic_block last_bb; + + lra_constraint_iter++; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, "\n********** Local #%d: **********\n\n", + lra_constraint_iter); + lra_constraint_iter_after_spill++; + if (lra_constraint_iter_after_spill > MAX_CONSTRAINT_ITERATION_NUMBER) + internal_error + ("Maximum number of LRA constraint passes is achieved (%d)\n", + MAX_CONSTRAINT_ITERATION_NUMBER); + changed_p = false; + lra_risky_transformations_p = false; + new_insn_uid_start = get_max_uid (); + new_regno_start = first_p ? lra_constraint_new_regno_start : max_reg_num (); + for (i = FIRST_PSEUDO_REGISTER; i < new_regno_start; i++) + if (lra_reg_info[i].nrefs != 0) + { + ira_reg_equiv[i].profitable_p = true; + if ((hard_regno = lra_get_regno_hard_regno (i)) >= 0) + { + int j, nregs = hard_regno_nregs[hard_regno][PSEUDO_REGNO_MODE (i)]; + + for (j = 0; j < nregs; j++) + df_set_regs_ever_live (hard_regno + j, true); + } + else if ((x = get_equiv_substitution (regno_reg_rtx[i])) != NULL_RTX) + { + bool pseudo_p = contains_reg_p (x, false, false); + rtx set, insn; + + /* We don't use DF for compilation speed sake. So it is + problematic to update live info when we use an + equivalence containing pseudos in more than one BB. */ + if ((pseudo_p && multi_block_pseudo_p (i)) + /* If it is not a reverse equivalence, we check that a + pseudo in rhs of the init insn is not dying in the + insn. Otherwise, the live info at the beginning of + the corresponding BB might be wrong after we + removed the insn. When the equiv can be a + constant, the right hand side of the init insn can + be a pseudo. */ + || (! 
((insn = ira_reg_equiv[i].init_insns) != NULL_RTX + && INSN_P (insn) + && (set = single_set (insn)) != NULL_RTX + && REG_P (SET_DEST (set)) + && (int) REGNO (SET_DEST (set)) == i) + && init_insn_rhs_dead_pseudo_p (i))) + ira_reg_equiv[i].defined_p = false; + else if (! first_p && pseudo_p) + /* After RTL transformation, we can not guarantee that + pseudo in the substitution was not reloaded which + might make equivalence invalid. For example, in + reverse equiv of p0 + + p0 <- ... + ... + equiv_mem <- p0 + + the memory address register was reloaded before the + 2nd insn. */ + ira_reg_equiv[i].defined_p = false; + if (contains_reg_p (x, false, true)) + ira_reg_equiv[i].profitable_p = false; + } + } + lra_eliminate (false); + min_len = lra_insn_stack_length (); + new_insns_num = 0; + last_bb = NULL; + changed_p = false; + while ((new_min_len = lra_insn_stack_length ()) != 0) + { + curr_insn = lra_pop_insn (); + --new_min_len; + curr_bb = BLOCK_FOR_INSN (curr_insn); + if (curr_bb != last_bb) + { + last_bb = curr_bb; + bb_reload_num = lra_curr_reload_num; + } + if (min_len > new_min_len) + { + min_len = new_min_len; + new_insns_num = 0; + } + if (new_insns_num > MAX_RELOAD_INSNS_NUMBER) + internal_error + ("Max. number of generated reload insns per insn is achieved (%d)\n", + MAX_RELOAD_INSNS_NUMBER); + new_insns_num++; + if (DEBUG_INSN_P (curr_insn)) + { + /* We need to check equivalence in debug insn and change + pseudo to the equivalent value if necessary. */ + curr_id = lra_get_insn_recog_data (curr_insn); + if (debug_loc_equivalence_change_p (curr_id->operand_loc[0])) + changed_p = true; + } + else if (INSN_P (curr_insn)) + { + if ((set = single_set (curr_insn)) != NULL_RTX) + { + dest_reg = SET_DEST (set); + /* The equivalence pseudo could be set up as SUBREG in a + case when it is a call restore insn in a mode + different from the pseudo mode. 
*/ + if (GET_CODE (dest_reg) == SUBREG) + dest_reg = SUBREG_REG (dest_reg); + if ((REG_P (dest_reg) + && (x = get_equiv_substitution (dest_reg)) != dest_reg + /* Remove insns which set up a pseudo whose value + can not be changed. Such insns might not be in + init_insns because we don't update equiv data + during insn transformations. + + As an example, let us suppose that a pseudo got a + hard register and on the 1st pass was not + changed to its equivalent constant. We generate an + additional insn setting up the pseudo because of + secondary memory movement. Then the pseudo is + spilled and we use the equiv constant. In this + case we should remove the additional insn, and + this insn is not in the init_insns list. */ + && (! MEM_P (x) || MEM_READONLY_P (x) + || in_list_p (curr_insn, + ira_reg_equiv + [REGNO (dest_reg)].init_insns))) + || (((x = get_equiv_substitution (SET_SRC (set))) + != SET_SRC (set)) + && in_list_p (curr_insn, + ira_reg_equiv + [REGNO (SET_SRC (set))].init_insns))) + { + /* This is equiv init insn of pseudo which did not get a + hard register -- remove the insn. */ + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, + " Removing equiv init insn %i (freq=%d)\n", + INSN_UID (curr_insn), + BLOCK_FOR_INSN (curr_insn)->frequency); + debug_rtl_slim (lra_dump_file, + curr_insn, curr_insn, -1, 0); + } + if (contains_reg_p (x, true, false)) + lra_risky_transformations_p = true; + lra_set_insn_deleted (curr_insn); + continue; + } + } + curr_id = lra_get_insn_recog_data (curr_insn); + curr_static_id = curr_id->insn_static_data; + init_curr_insn_input_reloads (); + init_curr_operand_mode (); + if (curr_insn_transform ()) + changed_p = true; + } + } + /* If we used a new hard regno, changed_p should be true because the + hard reg is assigned to a new pseudo. */ +#ifdef ENABLE_CHECKING + if (! 
changed_p) + { + for (i = FIRST_PSEUDO_REGISTER; i < new_regno_start; i++) + if (lra_reg_info[i].nrefs != 0 + && (hard_regno = lra_get_regno_hard_regno (i)) >= 0) + { + int j, nregs = hard_regno_nregs[hard_regno][PSEUDO_REGNO_MODE (i)]; + + for (j = 0; j < nregs; j++) + lra_assert (df_regs_ever_live_p (hard_regno + j)); + } + } +#endif + return changed_p; +} + +/* Initiate the LRA constraint pass. It is done once per + function. */ +void +lra_constraints_init (void) +{ +} + +/* Finalize the LRA constraint pass. It is done once per + function. */ +void +lra_constraints_finish (void) +{ +} + + + +/* This page contains code to do inheritance/split + transformations. */ + +/* Number of reloads passed so far in current EBB. */ +static int reloads_num; + +/* Number of calls passed so far in current EBB. */ +static int calls_num; + +/* Current reload pseudo check for validity of elements in + USAGE_INSNS. */ +static int curr_usage_insns_check; + +/* Info about last usage of registers in EBB to do inheritance/split + transformation. Inheritance transformation is done from a spilled + pseudo and split transformations from a hard register or a pseudo + assigned to a hard register. */ +struct usage_insns +{ + /* If the value is equal to CURR_USAGE_INSNS_CHECK, then the member + value INSNS is valid. INSNS is a chain of optional debug insns + and a finishing non-debug insn using the corresponding reg. */ + int check; + /* Value of global reloads_num at the last insn in INSNS. */ + int reloads_num; + /* Value of global calls_num at the last insn in INSNS. */ + int calls_num; + /* It can be true only for splitting. And it means that the restore + insn should be put after insn given by the following member. */ + bool after_p; + /* Next insns in the current EBB which use the original reg and the + original reg value is not changed between the current insn and + the next insns. In other words, e.g. 
for inheritance, if we need + to use the original reg value again in the next insns we can try + to use the value in a hard register from a reload insn of the + current insn. */ + rtx insns; +}; + +/* Map: regno -> corresponding pseudo usage insns. */ +static struct usage_insns *usage_insns; + +static void +setup_next_usage_insn (int regno, rtx insn, int reloads_num, bool after_p) +{ + usage_insns[regno].check = curr_usage_insns_check; + usage_insns[regno].insns = insn; + usage_insns[regno].reloads_num = reloads_num; + usage_insns[regno].calls_num = calls_num; + usage_insns[regno].after_p = after_p; +} + +/* The function is used to form list REGNO usages which consists of + optional debug insns finished by a non-debug insn using REGNO. + RELOADS_NUM is current number of reload insns processed so far. */ +static void +add_next_usage_insn (int regno, rtx insn, int reloads_num) +{ + rtx next_usage_insns; + + if (usage_insns[regno].check == curr_usage_insns_check + && (next_usage_insns = usage_insns[regno].insns) != NULL_RTX + && DEBUG_INSN_P (insn)) + { + /* Check that we did not add the debug insn yet. */ + if (next_usage_insns != insn + && (GET_CODE (next_usage_insns) != INSN_LIST + || XEXP (next_usage_insns, 0) != insn)) + usage_insns[regno].insns = gen_rtx_INSN_LIST (VOIDmode, insn, + next_usage_insns); + } + else if (NONDEBUG_INSN_P (insn)) + setup_next_usage_insn (regno, insn, reloads_num, false); + else + usage_insns[regno].check = 0; +} + +/* Replace all references to register OLD_REGNO in *LOC with pseudo + register NEW_REG. Return true if any change was made. 
*/ +static bool +substitute_pseudo (rtx *loc, int old_regno, rtx new_reg) +{ + rtx x = *loc; + bool result = false; + enum rtx_code code; + const char *fmt; + int i, j; + + if (x == NULL_RTX) + return false; + + code = GET_CODE (x); + if (code == REG && (int) REGNO (x) == old_regno) + { + enum machine_mode mode = GET_MODE (*loc); + enum machine_mode inner_mode = GET_MODE (new_reg); + + if (mode != inner_mode) + { + if (GET_MODE_SIZE (mode) >= GET_MODE_SIZE (inner_mode) + || ! SCALAR_INT_MODE_P (inner_mode)) + new_reg = gen_rtx_SUBREG (mode, new_reg, 0); + else + new_reg = gen_lowpart_SUBREG (mode, new_reg); + } + *loc = new_reg; + return true; + } + + /* Scan all the operand sub-expressions. */ + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + { + if (substitute_pseudo (&XEXP (x, i), old_regno, new_reg)) + result = true; + } + else if (fmt[i] == 'E') + { + for (j = XVECLEN (x, i) - 1; j >= 0; j--) + if (substitute_pseudo (&XVECEXP (x, i, j), old_regno, new_reg)) + result = true; + } + } + return result; +} + +/* Registers involved in inheritance/split in the current EBB + (inheritance/split pseudos and original registers). */ +static bitmap_head check_only_regs; + +/* Do inheritance transformations for insn INSN, which defines (if + DEF_P) or uses ORIGINAL_REGNO. NEXT_USAGE_INSNS specifies which + instruction in the EBB next uses ORIGINAL_REGNO; it has the same + form as the "insns" field of usage_insns. Return true if we + succeed in such transformation. + + The transformations look like: + + p <- ... i <- ... + ... p <- i (new insn) + ... => + <- ... p ... <- ... i ... + or + ... i <- p (new insn) + <- ... p ... <- ... i ... + ... => + <- ... p ... <- ... i ... + where p is a spilled original pseudo and i is a new inheritance pseudo. + + + The inheritance pseudo has the smallest class of two classes CL and + class of ORIGINAL REGNO. 
*/ +static bool +inherit_reload_reg (bool def_p, int original_regno, + enum reg_class cl, rtx insn, rtx next_usage_insns) +{ + enum reg_class rclass = lra_get_allocno_class (original_regno); + rtx original_reg = regno_reg_rtx[original_regno]; + rtx new_reg, new_insns, usage_insn; + + lra_assert (! usage_insns[original_regno].after_p); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + " <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<\n"); + if (! ira_reg_classes_intersect_p[cl][rclass]) + { + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, + " Rejecting inheritance for %d " + "because of disjoint classes %s and %s\n", + original_regno, reg_class_names[cl], + reg_class_names[rclass]); + fprintf (lra_dump_file, + " >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\n"); + } + return false; + } + if ((ira_class_subset_p[cl][rclass] && cl != rclass) + /* We don't use a subset of two classes because it can be + NO_REGS. This transformation is still profitable in most + cases even if the classes are not intersected as register + move is probably cheaper than a memory load. 
*/ + || ira_class_hard_regs_num[cl] < ira_class_hard_regs_num[rclass]) + { + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Use smallest class of %s and %s\n", + reg_class_names[cl], reg_class_names[rclass]); + + rclass = cl; + } + new_reg = lra_create_new_reg (GET_MODE (original_reg), original_reg, + rclass, "inheritance"); + start_sequence (); + if (def_p) + emit_move_insn (original_reg, new_reg); + else + emit_move_insn (new_reg, original_reg); + new_insns = get_insns (); + end_sequence (); + if (NEXT_INSN (new_insns) != NULL_RTX) + { + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, + " Rejecting inheritance %d->%d " + "as it results in 2 or more insns:\n", + original_regno, REGNO (new_reg)); + debug_rtl_slim (lra_dump_file, new_insns, NULL_RTX, -1, 0); + fprintf (lra_dump_file, + " >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\n"); + } + return false; + } + substitute_pseudo (&insn, original_regno, new_reg); + lra_update_insn_regno_info (insn); + if (! def_p) + /* We now have a new usage insn for original regno. 
*/ + setup_next_usage_insn (original_regno, new_insns, reloads_num, false); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Original reg change %d->%d (bb%d):\n", + original_regno, REGNO (new_reg), BLOCK_FOR_INSN (insn)->index); + lra_reg_info[REGNO (new_reg)].restore_regno = original_regno; + bitmap_set_bit (&check_only_regs, REGNO (new_reg)); + bitmap_set_bit (&check_only_regs, original_regno); + bitmap_set_bit (&lra_inheritance_pseudos, REGNO (new_reg)); + if (def_p) + lra_process_new_insns (insn, NULL_RTX, new_insns, + "Add original<-inheritance"); + else + lra_process_new_insns (insn, new_insns, NULL_RTX, + "Add inheritance<-original"); + while (next_usage_insns != NULL_RTX) + { + if (GET_CODE (next_usage_insns) != INSN_LIST) + { + usage_insn = next_usage_insns; + lra_assert (NONDEBUG_INSN_P (usage_insn)); + next_usage_insns = NULL; + } + else + { + usage_insn = XEXP (next_usage_insns, 0); + lra_assert (DEBUG_INSN_P (usage_insn)); + next_usage_insns = XEXP (next_usage_insns, 1); + } + substitute_pseudo (&usage_insn, original_regno, new_reg); + lra_update_insn_regno_info (usage_insn); + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, + " Inheritance reuse change %d->%d (bb%d):\n", + original_regno, REGNO (new_reg), + BLOCK_FOR_INSN (usage_insn)->index); + debug_rtl_slim (lra_dump_file, usage_insn, usage_insn, + -1, 0); + } + } + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + " >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\n"); + return true; +} + +/* Return true if we need a caller save/restore for pseudo REGNO which + was assigned to a hard register. */ +static inline bool +need_for_call_save_p (int regno) +{ + lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0); + return (usage_insns[regno].calls_num < calls_num + && (overlaps_hard_reg_set_p + (call_used_reg_set, + PSEUDO_REGNO_MODE (regno), reg_renumber[regno]))); +} + +/* Global registers occurring in the current EBB. 
*/ +static bitmap_head ebb_global_regs; + +/* Return true if we need a split for hard register REGNO or pseudo + REGNO which was assigned to a hard register. + POTENTIAL_RELOAD_HARD_REGS contains hard registers which might be + used for reloads since the EBB end. It is an approximation of the + used hard registers in the split range. The exact value would + require expensive calculations. If we were aggressive with + splitting because of the approximation, the split pseudo will save + the same hard register assignment and will be removed in the undo + pass. We still need the approximation because too aggressive + splitting would result in too inaccurate cost calculation in the + assignment pass because of too many generated moves which will be + probably removed in the undo pass. */ +static inline bool +need_for_split_p (HARD_REG_SET potential_reload_hard_regs, int regno) +{ + int hard_regno = regno < FIRST_PSEUDO_REGISTER ? regno : reg_renumber[regno]; + + lra_assert (hard_regno >= 0); + return ((TEST_HARD_REG_BIT (potential_reload_hard_regs, hard_regno) + /* Don't split eliminable hard registers, otherwise we can + split hard registers like hard frame pointer, which + lives on BB start/end according to DF-infrastructure, + when there is a pseudo assigned to the register and + living in the same BB. */ + && (regno >= FIRST_PSEUDO_REGISTER + || ! TEST_HARD_REG_BIT (eliminable_regset, hard_regno)) + && ! TEST_HARD_REG_BIT (lra_no_alloc_regs, hard_regno) + /* We need at least 2 reloads to make pseudo splitting + profitable. We should provide hard regno splitting in + any case to solve 1st insn scheduling problem when + moving hard register definition up might result in + impossibility to find hard register for reload pseudo of + small register class. */ + && (usage_insns[regno].reloads_num + + (regno < FIRST_PSEUDO_REGISTER ? 
0 : 2) < reloads_num) + && (regno < FIRST_PSEUDO_REGISTER + /* For short living pseudos, spilling + inheritance can + be considered a substitution for splitting. + Therefore we do not do splitting for local pseudos. It + also decreases the aggressiveness of splitting. The + minimal number of references is chosen taking into + account that for 2 references splitting makes no sense + as we can just spill the pseudo. */ + || (regno >= FIRST_PSEUDO_REGISTER + && lra_reg_info[regno].nrefs > 3 + && bitmap_bit_p (&ebb_global_regs, regno)))) + || (regno >= FIRST_PSEUDO_REGISTER && need_for_call_save_p (regno))); +} + +/* Return class for the split pseudo created from original pseudo with + ALLOCNO_CLASS and MODE which got a hard register HARD_REGNO. We + choose subclass of ALLOCNO_CLASS which contains HARD_REGNO and + results in no secondary memory movements. */ +static enum reg_class +choose_split_class (enum reg_class allocno_class, + int hard_regno ATTRIBUTE_UNUSED, + enum machine_mode mode ATTRIBUTE_UNUSED) +{ +#ifndef SECONDARY_MEMORY_NEEDED + return allocno_class; +#else + int i; + enum reg_class cl, best_cl = NO_REGS; + enum reg_class hard_reg_class = REGNO_REG_CLASS (hard_regno); + + if (! SECONDARY_MEMORY_NEEDED (allocno_class, allocno_class, mode) + && TEST_HARD_REG_BIT (reg_class_contents[allocno_class], hard_regno)) + return allocno_class; + for (i = 0; + (cl = reg_class_subclasses[allocno_class][i]) != LIM_REG_CLASSES; + i++) + if (! SECONDARY_MEMORY_NEEDED (cl, hard_reg_class, mode) + && ! SECONDARY_MEMORY_NEEDED (hard_reg_class, cl, mode) + && TEST_HARD_REG_BIT (reg_class_contents[cl], hard_regno) + && (best_cl == NO_REGS + || ira_class_hard_regs_num[best_cl] < ira_class_hard_regs_num[cl])) + best_cl = cl; + return best_cl; +#endif +} + +/* Do split transformations for insn INSN, which defines or uses + ORIGINAL_REGNO. NEXT_USAGE_INSNS specifies which instruction in + the EBB next uses ORIGINAL_REGNO; it has the same form as the + "insns" field of usage_insns. 
+ + The transformations look like: + + p <- ... p <- ... + ... s <- p (new insn -- save) + ... => + ... p <- s (new insn -- restore) + <- ... p ... <- ... p ... + or + <- ... p ... <- ... p ... + ... s <- p (new insn -- save) + ... => + ... p <- s (new insn -- restore) + <- ... p ... <- ... p ... + + where p is an original pseudo got a hard register or a hard + register and s is a new split pseudo. The save is put before INSN + if BEFORE_P is true. Return true if we succeed in such + transformation. */ +static bool +split_reg (bool before_p, int original_regno, rtx insn, rtx next_usage_insns) +{ + enum reg_class rclass; + rtx original_reg; + int hard_regno; + rtx new_reg, save, restore, usage_insn; + bool after_p; + bool call_save_p; + + if (original_regno < FIRST_PSEUDO_REGISTER) + { + rclass = ira_allocno_class_translate[REGNO_REG_CLASS (original_regno)]; + hard_regno = original_regno; + call_save_p = false; + } + else + { + hard_regno = reg_renumber[original_regno]; + rclass = lra_get_allocno_class (original_regno); + original_reg = regno_reg_rtx[original_regno]; + call_save_p = need_for_call_save_p (original_regno); + } + original_reg = regno_reg_rtx[original_regno]; + lra_assert (hard_regno >= 0); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + " ((((((((((((((((((((((((((((((((((((((((((((((((\n"); + if (call_save_p) + { + enum machine_mode sec_mode; + +#ifdef SECONDARY_MEMORY_NEEDED_MODE + sec_mode = SECONDARY_MEMORY_NEEDED_MODE (GET_MODE (original_reg)); +#else + sec_mode = GET_MODE (original_reg); +#endif + new_reg = lra_create_new_reg (sec_mode, NULL_RTX, + NO_REGS, "save"); + } + else + { + rclass = choose_split_class (rclass, hard_regno, + GET_MODE (original_reg)); + if (rclass == NO_REGS) + { + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, + " Rejecting split of %d(%s): " + "no good reg class for %d(%s)\n", + original_regno, + reg_class_names[lra_get_allocno_class (original_regno)], + hard_regno, + reg_class_names[REGNO_REG_CLASS 
(hard_regno)]); + fprintf + (lra_dump_file, + " ))))))))))))))))))))))))))))))))))))))))))))))))\n"); + } + return false; + } + new_reg = lra_create_new_reg (GET_MODE (original_reg), original_reg, + rclass, "split"); + reg_renumber[REGNO (new_reg)] = hard_regno; + } + save = emit_spill_move (true, new_reg, original_reg); + if (NEXT_INSN (save) != NULL_RTX) + { + lra_assert (! call_save_p); + if (lra_dump_file != NULL) + { + fprintf + (lra_dump_file, + " Rejecting split %d->%d resulting in > 2 %s save insns:\n", + original_regno, REGNO (new_reg), call_save_p ? "call" : ""); + debug_rtl_slim (lra_dump_file, save, NULL_RTX, -1, 0); + fprintf (lra_dump_file, + " ))))))))))))))))))))))))))))))))))))))))))))))))\n"); + } + return false; + } + restore = emit_spill_move (false, new_reg, original_reg); + if (NEXT_INSN (restore) != NULL_RTX) + { + lra_assert (! call_save_p); + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, + " Rejecting split %d->%d " + "resulting in > 2 %s restore insns:\n", + original_regno, REGNO (new_reg), call_save_p ? 
"call" : ""); + debug_rtl_slim (lra_dump_file, restore, NULL_RTX, -1, 0); + fprintf (lra_dump_file, + " ))))))))))))))))))))))))))))))))))))))))))))))))\n"); + } + return false; + } + after_p = usage_insns[original_regno].after_p; + lra_reg_info[REGNO (new_reg)].restore_regno = original_regno; + bitmap_set_bit (&check_only_regs, REGNO (new_reg)); + bitmap_set_bit (&check_only_regs, original_regno); + bitmap_set_bit (&lra_split_regs, REGNO (new_reg)); + for (;;) + { + if (GET_CODE (next_usage_insns) != INSN_LIST) + { + usage_insn = next_usage_insns; + break; + } + usage_insn = XEXP (next_usage_insns, 0); + lra_assert (DEBUG_INSN_P (usage_insn)); + next_usage_insns = XEXP (next_usage_insns, 1); + substitute_pseudo (&usage_insn, original_regno, new_reg); + lra_update_insn_regno_info (usage_insn); + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, " Split reuse change %d->%d:\n", + original_regno, REGNO (new_reg)); + debug_rtl_slim (lra_dump_file, usage_insn, usage_insn, + -1, 0); + } + } + lra_assert (NOTE_P (usage_insn) || NONDEBUG_INSN_P (usage_insn)); + lra_assert (usage_insn != insn || (after_p && before_p)); + lra_process_new_insns (usage_insn, after_p ? NULL_RTX : restore, + after_p ? restore : NULL_RTX, + call_save_p + ? "Add reg<-save" : "Add reg<-split"); + lra_process_new_insns (insn, before_p ? save : NULL_RTX, + before_p ? NULL_RTX : save, + call_save_p + ? "Add save<-reg" : "Add split<-reg"); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + " ))))))))))))))))))))))))))))))))))))))))))))))))\n"); + return true; +} + +/* Recognize that we need a split transformation for insn INSN, which + defines or uses REGNO in its insn biggest MODE (we use it only if + REGNO is a hard register). POTENTIAL_RELOAD_HARD_REGS contains + hard registers which might be used for reloads since the EBB end. + Put the save before INSN if BEFORE_P is true. MAX_UID is maximla + uid before starting INSN processing. Return true if we succeed in + such transformation. 
*/ +static bool +split_if_necessary (int regno, enum machine_mode mode, + HARD_REG_SET potential_reload_hard_regs, + bool before_p, rtx insn, int max_uid) +{ + bool res = false; + int i, nregs = 1; + rtx next_usage_insns; + + if (regno < FIRST_PSEUDO_REGISTER) + nregs = hard_regno_nregs[regno][mode]; + for (i = 0; i < nregs; i++) + if (usage_insns[regno + i].check == curr_usage_insns_check + && (next_usage_insns = usage_insns[regno + i].insns) != NULL_RTX + /* To avoid processing the register twice or more. */ + && ((GET_CODE (next_usage_insns) != INSN_LIST + && INSN_UID (next_usage_insns) < max_uid) + || (GET_CODE (next_usage_insns) == INSN_LIST + && (INSN_UID (XEXP (next_usage_insns, 0)) < max_uid))) + && need_for_split_p (potential_reload_hard_regs, regno + i) + && split_reg (before_p, regno + i, insn, next_usage_insns)) + res = true; + return res; +} + +/* Check only registers living at the current program point in the + current EBB. */ +static bitmap_head live_regs; + +/* Update live info in EBB given by its HEAD and TAIL insns after + inheritance/split transformation. The function removes dead moves + too. */ +static void +update_ebb_live_info (rtx head, rtx tail) +{ + unsigned int j; + int regno; + bool live_p; + rtx prev_insn, set; + bool remove_p; + basic_block last_bb, prev_bb, curr_bb; + bitmap_iterator bi; + struct lra_insn_reg *reg; + edge e; + edge_iterator ei; + + last_bb = BLOCK_FOR_INSN (tail); + prev_bb = NULL; + for (curr_insn = tail; + curr_insn != PREV_INSN (head); + curr_insn = prev_insn) + { + prev_insn = PREV_INSN (curr_insn); + if (! 
INSN_P (curr_insn)) + continue; + curr_bb = BLOCK_FOR_INSN (curr_insn); + if (curr_bb != prev_bb) + { + if (prev_bb != NULL) + { + /* Update df_get_live_in (prev_bb): */ + EXECUTE_IF_SET_IN_BITMAP (&check_only_regs, 0, j, bi) + if (bitmap_bit_p (&live_regs, j)) + bitmap_set_bit (df_get_live_in (prev_bb), j); + else + bitmap_clear_bit (df_get_live_in (prev_bb), j); + } + if (curr_bb != last_bb) + { + /* Update df_get_live_out (curr_bb): */ + EXECUTE_IF_SET_IN_BITMAP (&check_only_regs, 0, j, bi) + { + live_p = bitmap_bit_p (&live_regs, j); + if (! live_p) + FOR_EACH_EDGE (e, ei, curr_bb->succs) + if (bitmap_bit_p (df_get_live_in (e->dest), j)) + { + live_p = true; + break; + } + if (live_p) + bitmap_set_bit (df_get_live_out (curr_bb), j); + else + bitmap_clear_bit (df_get_live_out (curr_bb), j); + } + } + prev_bb = curr_bb; + bitmap_and (&live_regs, &check_only_regs, df_get_live_out (curr_bb)); + } + if (DEBUG_INSN_P (curr_insn)) + continue; + curr_id = lra_get_insn_recog_data (curr_insn); + remove_p = false; + if ((set = single_set (curr_insn)) != NULL_RTX && REG_P (SET_DEST (set)) + && (regno = REGNO (SET_DEST (set))) >= FIRST_PSEUDO_REGISTER + && bitmap_bit_p (&check_only_regs, regno) + && ! bitmap_bit_p (&live_regs, regno)) + remove_p = true; + /* See which defined values die here. */ + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + if (reg->type == OP_OUT && ! reg->subreg_p) + bitmap_clear_bit (&live_regs, reg->regno); + /* Mark each used value as live. */ + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + if (reg->type == OP_IN + && bitmap_bit_p (&check_only_regs, reg->regno)) + bitmap_set_bit (&live_regs, reg->regno); + /* It is quite important to remove dead move insns because it + means removing dead store. We don't need to process them for + constraints. 
*/ + if (remove_p) + { + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, " Removing dead insn:\n "); + debug_rtl_slim (lra_dump_file, curr_insn, curr_insn, -1, 0); + } + lra_set_insn_deleted (curr_insn); + } + } +} + +/* The structure describes info to do an inheritance for the current + insn. We need to collect such info first before doing the + transformations because the transformations change the insn + internal representation. */ +struct to_inherit +{ + /* Original regno. */ + int regno; + /* Subsequent insns which can inherit original reg value. */ + rtx insns; +}; + +/* Array containing all info for doing inheritance from the current + insn. */ +static struct to_inherit to_inherit[LRA_MAX_INSN_RELOADS]; + +/* Number elements in the previous array. */ +static int to_inherit_num; + +/* Add inheritance info REGNO and INSNS. Their meaning is described in + structure to_inherit. */ +static void +add_to_inherit (int regno, rtx insns) +{ + int i; + + for (i = 0; i < to_inherit_num; i++) + if (to_inherit[i].regno == regno) + return; + lra_assert (to_inherit_num < LRA_MAX_INSN_RELOADS); + to_inherit[to_inherit_num].regno = regno; + to_inherit[to_inherit_num++].insns = insns; +} + +/* Return the last non-debug insn in basic block BB, or the block begin + note if none. */ +static rtx +get_last_insertion_point (basic_block bb) +{ + rtx insn; + + FOR_BB_INSNS_REVERSE (bb, insn) + if (NONDEBUG_INSN_P (insn) || NOTE_INSN_BASIC_BLOCK_P (insn)) + return insn; + gcc_unreachable (); +} + +/* Set up RES by registers living on edges FROM except the edge (FROM, + TO) or by registers set up in a jump insn in BB FROM. 
*/ +static void +get_live_on_other_edges (basic_block from, basic_block to, bitmap res) +{ + rtx last; + struct lra_insn_reg *reg; + edge e; + edge_iterator ei; + + lra_assert (to != NULL); + bitmap_clear (res); + FOR_EACH_EDGE (e, ei, from->succs) + if (e->dest != to) + bitmap_ior_into (res, df_get_live_in (e->dest)); + last = get_last_insertion_point (from); + if (! JUMP_P (last)) + return; + curr_id = lra_get_insn_recog_data (last); + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + if (reg->type != OP_IN) + bitmap_set_bit (res, reg->regno); +} + +/* Used for temporary results of some bitmap calculations. */ +static bitmap_head temp_bitmap; + +/* Do inheritance/split transformations in EBB starting with HEAD and + finishing on TAIL. We process EBB insns in the reverse order. + Return true if we did any inheritance/split transformation in the + EBB. + + We should avoid excessive splitting which results in worse code + because of inaccurate cost calculations for spilling new split + pseudos in such a case. To achieve this we do splitting only if + register pressure is high in the given basic block and there are reload + pseudos requiring hard registers. We could do more register + pressure calculations at any given program point to avoid unnecessary + splitting even more but it is too expensive and the current approach + works well enough.
*/ +static bool +inherit_in_ebb (rtx head, rtx tail) +{ + int i, src_regno, dst_regno, nregs; + bool change_p, succ_p; + rtx prev_insn, next_usage_insns, set, last_insn; + enum reg_class cl; + struct lra_insn_reg *reg; + basic_block last_processed_bb, curr_bb = NULL; + HARD_REG_SET potential_reload_hard_regs, live_hard_regs; + bitmap to_process; + unsigned int j; + bitmap_iterator bi; + bool head_p, after_p; + + change_p = false; + curr_usage_insns_check++; + reloads_num = calls_num = 0; + bitmap_clear (&check_only_regs); + last_processed_bb = NULL; + CLEAR_HARD_REG_SET (potential_reload_hard_regs); + CLEAR_HARD_REG_SET (live_hard_regs); + /* We don't process new insns generated in the loop. */ + for (curr_insn = tail; curr_insn != PREV_INSN (head); curr_insn = prev_insn) + { + prev_insn = PREV_INSN (curr_insn); + if (BLOCK_FOR_INSN (curr_insn) != NULL) + curr_bb = BLOCK_FOR_INSN (curr_insn); + if (last_processed_bb != curr_bb) + { + /* We are at the end of BB. Add qualified living + pseudos for potential splitting. */ + to_process = df_get_live_out (curr_bb); + if (last_processed_bb != NULL) + { + /* We are somewhere in the middle of EBB. */ + get_live_on_other_edges (curr_bb, last_processed_bb, + &temp_bitmap); + to_process = &temp_bitmap; + } + last_processed_bb = curr_bb; + last_insn = get_last_insertion_point (curr_bb); + after_p = (! JUMP_P (last_insn) + && (! CALL_P (last_insn) + || (find_reg_note (last_insn, + REG_NORETURN, NULL_RTX) == NULL_RTX + && ! 
SIBLING_CALL_P (last_insn)))); + REG_SET_TO_HARD_REG_SET (live_hard_regs, df_get_live_out (curr_bb)); + IOR_HARD_REG_SET (live_hard_regs, eliminable_regset); + IOR_HARD_REG_SET (live_hard_regs, lra_no_alloc_regs); + CLEAR_HARD_REG_SET (potential_reload_hard_regs); + EXECUTE_IF_SET_IN_BITMAP (to_process, 0, j, bi) + { + if ((int) j >= lra_constraint_new_regno_start) + break; + if (j < FIRST_PSEUDO_REGISTER || reg_renumber[j] >= 0) + { + if (j < FIRST_PSEUDO_REGISTER) + SET_HARD_REG_BIT (live_hard_regs, j); + else + add_to_hard_reg_set (&live_hard_regs, + PSEUDO_REGNO_MODE (j), + reg_renumber[j]); + setup_next_usage_insn (j, last_insn, reloads_num, after_p); + } + } + } + src_regno = dst_regno = -1; + if (NONDEBUG_INSN_P (curr_insn) + && (set = single_set (curr_insn)) != NULL_RTX + && REG_P (SET_DEST (set)) && REG_P (SET_SRC (set))) + { + src_regno = REGNO (SET_SRC (set)); + dst_regno = REGNO (SET_DEST (set)); + } + if (src_regno < lra_constraint_new_regno_start + && src_regno >= FIRST_PSEUDO_REGISTER + && reg_renumber[src_regno] < 0 + && dst_regno >= lra_constraint_new_regno_start + && (cl = lra_get_allocno_class (dst_regno)) != NO_REGS) + { + /* 'reload_pseudo <- original_pseudo'. 
*/ + reloads_num++; + succ_p = false; + if (usage_insns[src_regno].check == curr_usage_insns_check + && (next_usage_insns = usage_insns[src_regno].insns) != NULL_RTX) + succ_p = inherit_reload_reg (false, src_regno, cl, + curr_insn, next_usage_insns); + if (succ_p) + change_p = true; + else + setup_next_usage_insn (src_regno, curr_insn, reloads_num, false); + if (hard_reg_set_subset_p (reg_class_contents[cl], live_hard_regs)) + IOR_HARD_REG_SET (potential_reload_hard_regs, + reg_class_contents[cl]); + } + else if (src_regno >= lra_constraint_new_regno_start + && dst_regno < lra_constraint_new_regno_start + && dst_regno >= FIRST_PSEUDO_REGISTER + && reg_renumber[dst_regno] < 0 + && (cl = lra_get_allocno_class (src_regno)) != NO_REGS + && usage_insns[dst_regno].check == curr_usage_insns_check + && (next_usage_insns + = usage_insns[dst_regno].insns) != NULL_RTX) + { + reloads_num++; + /* 'original_pseudo <- reload_pseudo'. */ + if (! JUMP_P (curr_insn) + && inherit_reload_reg (true, dst_regno, cl, + curr_insn, next_usage_insns)) + change_p = true; + /* Invalidate. */ + usage_insns[dst_regno].check = 0; + if (hard_reg_set_subset_p (reg_class_contents[cl], live_hard_regs)) + IOR_HARD_REG_SET (potential_reload_hard_regs, + reg_class_contents[cl]); + } + else if (INSN_P (curr_insn)) + { + int max_uid = get_max_uid (); + + curr_id = lra_get_insn_recog_data (curr_insn); + to_inherit_num = 0; + /* Process insn definitions. */ + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + if (reg->type != OP_IN + && (dst_regno = reg->regno) < lra_constraint_new_regno_start) + { + if (dst_regno >= FIRST_PSEUDO_REGISTER && reg->type == OP_OUT + && reg_renumber[dst_regno] < 0 && ! 
reg->subreg_p + && usage_insns[dst_regno].check == curr_usage_insns_check + && (next_usage_insns + = usage_insns[dst_regno].insns) != NULL_RTX) + { + struct lra_insn_reg *r; + + for (r = curr_id->regs; r != NULL; r = r->next) + if (r->type != OP_OUT && r->regno == dst_regno) + break; + /* Don't do inheritance if the pseudo is also + used in the insn. */ + if (r == NULL) + /* We can not do inheritance right now + because the current insn reg info (chain + regs) can change after that. */ + add_to_inherit (dst_regno, next_usage_insns); + } + /* We can not process one reg twice here because of + usage_insns invalidation. */ + if ((dst_regno < FIRST_PSEUDO_REGISTER + || reg_renumber[dst_regno] >= 0) + && ! reg->subreg_p && reg->type == OP_OUT) + { + HARD_REG_SET s; + + if (split_if_necessary (dst_regno, reg->biggest_mode, + potential_reload_hard_regs, + false, curr_insn, max_uid)) + change_p = true; + CLEAR_HARD_REG_SET (s); + if (dst_regno < FIRST_PSEUDO_REGISTER) + add_to_hard_reg_set (&s, reg->biggest_mode, dst_regno); + else + add_to_hard_reg_set (&s, PSEUDO_REGNO_MODE (dst_regno), + reg_renumber[dst_regno]); + AND_COMPL_HARD_REG_SET (live_hard_regs, s); + } + /* We should invalidate potential inheritance or + splitting for the current insn usages to the next + usage insns (see code below) as the output pseudo + prevents this. */ + if ((dst_regno >= FIRST_PSEUDO_REGISTER + && reg_renumber[dst_regno] < 0) + || (reg->type == OP_OUT && ! reg->subreg_p + && (dst_regno < FIRST_PSEUDO_REGISTER + || reg_renumber[dst_regno] >= 0))) + { + /* Invalidate. */ + if (dst_regno >= FIRST_PSEUDO_REGISTER) + usage_insns[dst_regno].check = 0; + else + { + nregs = hard_regno_nregs[dst_regno][reg->biggest_mode]; + for (i = 0; i < nregs; i++) + usage_insns[dst_regno + i].check = 0; + } + } + } + if (! 
JUMP_P (curr_insn)) + for (i = 0; i < to_inherit_num; i++) + if (inherit_reload_reg (true, to_inherit[i].regno, + ALL_REGS, curr_insn, + to_inherit[i].insns)) + change_p = true; + if (CALL_P (curr_insn)) + { + rtx cheap, pat, dest, restore; + int regno, hard_regno; + + calls_num++; + if ((cheap = find_reg_note (curr_insn, + REG_RETURNED, NULL_RTX)) != NULL_RTX + && ((cheap = XEXP (cheap, 0)), true) + && (regno = REGNO (cheap)) >= FIRST_PSEUDO_REGISTER + && (hard_regno = reg_renumber[regno]) >= 0 + /* If there are pending saves/restores, the + optimization is not worth. */ + && usage_insns[regno].calls_num == calls_num - 1 + && TEST_HARD_REG_BIT (call_used_reg_set, hard_regno)) + { + /* Restore the pseudo from the call result as + REG_RETURNED note says that the pseudo value is + in the call result and the pseudo is an argument + of the call. */ + pat = PATTERN (curr_insn); + if (GET_CODE (pat) == PARALLEL) + pat = XVECEXP (pat, 0, 0); + dest = SET_DEST (pat); + start_sequence (); + emit_move_insn (cheap, copy_rtx (dest)); + restore = get_insns (); + end_sequence (); + lra_process_new_insns (curr_insn, NULL, restore, + "Inserting call parameter restore"); + /* We don't need to save/restore of the pseudo from + this call. */ + usage_insns[regno].calls_num = calls_num; + bitmap_set_bit (&check_only_regs, regno); + } + } + to_inherit_num = 0; + /* Process insn usages. */ + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + if ((reg->type != OP_OUT + || (reg->type == OP_OUT && reg->subreg_p)) + && (src_regno = reg->regno) < lra_constraint_new_regno_start) + { + if (src_regno >= FIRST_PSEUDO_REGISTER + && reg_renumber[src_regno] < 0 && reg->type == OP_IN) + { + if (usage_insns[src_regno].check == curr_usage_insns_check + && (next_usage_insns + = usage_insns[src_regno].insns) != NULL_RTX + && NONDEBUG_INSN_P (curr_insn)) + add_to_inherit (src_regno, next_usage_insns); + else + /* Add usages. 
*/ + add_next_usage_insn (src_regno, curr_insn, reloads_num); + } + else if (src_regno < FIRST_PSEUDO_REGISTER + || reg_renumber[src_regno] >= 0) + { + bool before_p; + rtx use_insn = curr_insn; + + before_p = (JUMP_P (curr_insn) + || (CALL_P (curr_insn) && reg->type == OP_IN)); + if (NONDEBUG_INSN_P (curr_insn) + && split_if_necessary (src_regno, reg->biggest_mode, + potential_reload_hard_regs, + before_p, curr_insn, max_uid)) + { + if (reg->subreg_p) + lra_risky_transformations_p = true; + change_p = true; + /* Invalidate. */ + usage_insns[src_regno].check = 0; + if (before_p) + use_insn = PREV_INSN (curr_insn); + } + if (NONDEBUG_INSN_P (curr_insn)) + { + if (src_regno < FIRST_PSEUDO_REGISTER) + add_to_hard_reg_set (&live_hard_regs, + reg->biggest_mode, src_regno); + else + add_to_hard_reg_set (&live_hard_regs, + PSEUDO_REGNO_MODE (src_regno), + reg_renumber[src_regno]); + } + add_next_usage_insn (src_regno, use_insn, reloads_num); + } + } + for (i = 0; i < to_inherit_num; i++) + { + src_regno = to_inherit[i].regno; + if (inherit_reload_reg (false, src_regno, ALL_REGS, + curr_insn, to_inherit[i].insns)) + change_p = true; + else + setup_next_usage_insn (src_regno, curr_insn, reloads_num, false); + } + } + /* We reached the start of the current basic block. */ + if (prev_insn == NULL_RTX || prev_insn == PREV_INSN (head) + || BLOCK_FOR_INSN (prev_insn) != curr_bb) + { + /* We reached the beginning of the current block -- do + rest of spliting in the current BB. */ + to_process = df_get_live_in (curr_bb); + if (BLOCK_FOR_INSN (head) != curr_bb) + { + /* We are somewhere in the middle of EBB. 
*/ + get_live_on_other_edges (EDGE_PRED (curr_bb, 0)->src, + curr_bb, &temp_bitmap); + to_process = &temp_bitmap; + } + head_p = true; + EXECUTE_IF_SET_IN_BITMAP (to_process, 0, j, bi) + { + if ((int) j >= lra_constraint_new_regno_start) + break; + if (((int) j < FIRST_PSEUDO_REGISTER || reg_renumber[j] >= 0) + && usage_insns[j].check == curr_usage_insns_check + && (next_usage_insns = usage_insns[j].insns) != NULL_RTX) + { + if (need_for_split_p (potential_reload_hard_regs, j)) + { + if (lra_dump_file != NULL && head_p) + { + fprintf (lra_dump_file, + " ----------------------------------\n"); + head_p = false; + } + if (split_reg (false, j, bb_note (curr_bb), + next_usage_insns)) + change_p = true; + } + usage_insns[j].check = 0; + } + } + } + } + return change_p; +} + +/* This value affects EBB forming. If probability of edge from EBB to + a BB is not greater than the following value, we don't add the BB + to EBB. */ +#define EBB_PROBABILITY_CUTOFF (REG_BR_PROB_BASE / 2) + +/* Current number of inheritance/split iteration. */ +int lra_inheritance_iter; + +/* Entry function for inheritance/split pass. */ +void +lra_inheritance (void) +{ + int i; + basic_block bb, start_bb; + edge e; + + timevar_push (TV_LRA_INHERITANCE); + lra_inheritance_iter++; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, "\n********** Inheritance #%d: **********\n\n", + lra_inheritance_iter); + curr_usage_insns_check = 0; + usage_insns = XNEWVEC (struct usage_insns, lra_constraint_new_regno_start); + for (i = 0; i < lra_constraint_new_regno_start; i++) + usage_insns[i].check = 0; + bitmap_initialize (&check_only_regs, &reg_obstack); + bitmap_initialize (&live_regs, &reg_obstack); + bitmap_initialize (&temp_bitmap, &reg_obstack); + bitmap_initialize (&ebb_global_regs, &reg_obstack); + FOR_EACH_BB (bb) + { + start_bb = bb; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, "EBB"); + /* Form an EBB starting with BB.
*/ + bitmap_clear (&ebb_global_regs); + bitmap_ior_into (&ebb_global_regs, df_get_live_in (bb)); + for (;;) + { + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " %d", bb->index); + if (bb->next_bb == EXIT_BLOCK_PTR || LABEL_P (BB_HEAD (bb->next_bb))) + break; + e = find_fallthru_edge (bb->succs); + if (! e) + break; + if (e->probability <= EBB_PROBABILITY_CUTOFF) + break; + bb = bb->next_bb; + } + bitmap_ior_into (&ebb_global_regs, df_get_live_out (bb)); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, "\n"); + if (inherit_in_ebb (BB_HEAD (start_bb), BB_END (bb))) + /* Remember that the EBB head and tail can change in + inherit_in_ebb. */ + update_ebb_live_info (BB_HEAD (start_bb), BB_END (bb)); + } + bitmap_clear (&ebb_global_regs); + bitmap_clear (&temp_bitmap); + bitmap_clear (&live_regs); + bitmap_clear (&check_only_regs); + free (usage_insns); + + timevar_pop (TV_LRA_INHERITANCE); +} + + + +/* This page contains code to undo failed inheritance/split + transformations. */ + +/* Current number of iteration undoing inheritance/split. */ +int lra_undo_inheritance_iter; + +/* Fix BB live info LIVE after removing pseudos created on pass doing + inheritance/split which are REMOVED_PSEUDOS. */ +static void +fix_bb_live_info (bitmap live, bitmap removed_pseudos) +{ + unsigned int regno; + bitmap_iterator bi; + + EXECUTE_IF_SET_IN_BITMAP (removed_pseudos, 0, regno, bi) + if (bitmap_clear_bit (live, regno)) + bitmap_set_bit (live, lra_reg_info[regno].restore_regno); +} + +/* Return regno of the (subreg of) REG. Otherwise, return a negative + number. */ +static int +get_regno (rtx reg) +{ + if (GET_CODE (reg) == SUBREG) + reg = SUBREG_REG (reg); + if (REG_P (reg)) + return REGNO (reg); + return -1; +} + +/* Remove inheritance/split pseudos which are in REMOVE_PSEUDOS and + return true if we did any change. 
The undo transformations for + inheritance look like + i <- i2 + p <- i => p <- i2 + or removing + p <- i, i <- p, and i <- i3 + where p is the original pseudo from which inheritance pseudo i was + created, i and i3 are removed inheritance pseudos, i2 is another + not removed inheritance pseudo. All split pseudos or other + occurrences of removed inheritance pseudos are changed to the + corresponding original pseudos. + + The function also schedules insns changed and created during + inheritance/split pass for processing by the subsequent constraint + pass. */ +static bool +remove_inheritance_pseudos (bitmap remove_pseudos) +{ + basic_block bb; + int regno, sregno, prev_sregno, dregno, restore_regno; + rtx set, prev_set, prev_insn; + bool change_p, done_p; + + change_p = ! bitmap_empty_p (remove_pseudos); + /* We can not finish the function right away if CHANGE_P is true + because we need to mark insns affected by the previous + inheritance/split pass for processing by the subsequent + constraint pass. */ + FOR_EACH_BB (bb) + { + fix_bb_live_info (df_get_live_in (bb), remove_pseudos); + fix_bb_live_info (df_get_live_out (bb), remove_pseudos); + FOR_BB_INSNS_REVERSE (bb, curr_insn) + { + if (!
INSN_P (curr_insn)) + continue; + done_p = false; + sregno = dregno = -1; + if (change_p && NONDEBUG_INSN_P (curr_insn) + && (set = single_set (curr_insn)) != NULL_RTX) + { + dregno = get_regno (SET_DEST (set)); + sregno = get_regno (SET_SRC (set)); + } + + if (sregno >= 0 && dregno >= 0) + { + if ((bitmap_bit_p (remove_pseudos, sregno) + && (lra_reg_info[sregno].restore_regno == dregno + || (bitmap_bit_p (remove_pseudos, dregno) + && (lra_reg_info[sregno].restore_regno + == lra_reg_info[dregno].restore_regno)))) + || (bitmap_bit_p (remove_pseudos, dregno) + && lra_reg_info[dregno].restore_regno == sregno)) + /* One of the following cases: + original <- removed inheritance pseudo + removed inherit pseudo <- another removed inherit pseudo + removed inherit pseudo <- original pseudo + Or + removed_split_pseudo <- original_reg + original_reg <- removed_split_pseudo */ + { + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, " Removing %s:\n", + bitmap_bit_p (&lra_split_regs, sregno) + || bitmap_bit_p (&lra_split_regs, dregno) + ? "split" : "inheritance"); + debug_rtl_slim (lra_dump_file, + curr_insn, curr_insn, -1, 0); + } + lra_set_insn_deleted (curr_insn); + done_p = true; + } + else if (bitmap_bit_p (remove_pseudos, sregno) + && bitmap_bit_p (&lra_inheritance_pseudos, sregno)) + { + /* Search the following pattern: + inherit_or_split_pseudo1 <- inherit_or_split_pseudo2 + original_pseudo <- inherit_or_split_pseudo1 + where the 2nd insn is the current insn and + inherit_or_split_pseudo2 is not removed. If it is found, + change the current insn onto: + original_pseudo <- inherit_or_split_pseudo2. */ + for (prev_insn = PREV_INSN (curr_insn); + prev_insn != NULL_RTX && ! 
NONDEBUG_INSN_P (prev_insn); + prev_insn = PREV_INSN (prev_insn)) + ; + if (prev_insn != NULL_RTX && BLOCK_FOR_INSN (prev_insn) == bb + && (prev_set = single_set (prev_insn)) != NULL_RTX + /* There should be no subregs in insn we are + searching because only the original reg might + be in subreg when we changed the mode of + load/store for splitting. */ + && REG_P (SET_DEST (prev_set)) + && REG_P (SET_SRC (prev_set)) + && (int) REGNO (SET_DEST (prev_set)) == sregno + && ((prev_sregno = REGNO (SET_SRC (prev_set))) + >= FIRST_PSEUDO_REGISTER) + /* As we consider chain of inheritance or + splitting described in above comment we should + check that sregno and prev_sregno were + inheritance/split pseudos created from the + same original regno. */ + && (lra_reg_info[sregno].restore_regno + == lra_reg_info[prev_sregno].restore_regno) + && ! bitmap_bit_p (remove_pseudos, prev_sregno)) + { + lra_assert (GET_MODE (SET_SRC (prev_set)) + == GET_MODE (regno_reg_rtx[sregno])); + if (GET_CODE (SET_SRC (set)) == SUBREG) + SUBREG_REG (SET_SRC (set)) = SET_SRC (prev_set); + else + SET_SRC (set) = SET_SRC (prev_set); + lra_push_insn_and_update_insn_regno_info (curr_insn); + lra_set_used_insn_alternative_by_uid + (INSN_UID (curr_insn), -1); + done_p = true; + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, " Change reload insn:\n"); + debug_rtl_slim (lra_dump_file, + curr_insn, curr_insn, -1, 0); + } + } + } + } + if (! 
done_p) + { + struct lra_insn_reg *reg; + bool restored_regs_p = false; + bool kept_regs_p = false; + + curr_id = lra_get_insn_recog_data (curr_insn); + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + { + regno = reg->regno; + restore_regno = lra_reg_info[regno].restore_regno; + if (restore_regno >= 0) + { + if (change_p && bitmap_bit_p (remove_pseudos, regno)) + { + substitute_pseudo (&curr_insn, regno, + regno_reg_rtx[restore_regno]); + restored_regs_p = true; + } + else + kept_regs_p = true; + } + } + if (NONDEBUG_INSN_P (curr_insn) && kept_regs_p) + { + /* The instruction has changed since the previous + constraints pass. */ + lra_push_insn_and_update_insn_regno_info (curr_insn); + lra_set_used_insn_alternative_by_uid + (INSN_UID (curr_insn), -1); + } + else if (restored_regs_p) + /* The instruction has been restored to the form that + it had during the previous constraints pass. */ + lra_update_insn_regno_info (curr_insn); + if (restored_regs_p && lra_dump_file != NULL) + { + fprintf (lra_dump_file, " Insn after restoring regs:\n"); + debug_rtl_slim (lra_dump_file, curr_insn, curr_insn, -1, 0); + } + } + } + } + return change_p; +} + +/* Entry function for undoing inheritance/split transformation. Return true + if we did any RTL change in this pass. 
*/ +bool +lra_undo_inheritance (void) +{ + unsigned int regno; + int restore_regno, hard_regno; + int n_all_inherit, n_inherit, n_all_split, n_split; + bitmap_head remove_pseudos; + bitmap_iterator bi; + bool change_p; + + lra_undo_inheritance_iter++; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + "\n********** Undoing inheritance #%d: **********\n\n", + lra_undo_inheritance_iter); + bitmap_initialize (&remove_pseudos, &reg_obstack); + n_inherit = n_all_inherit = 0; + EXECUTE_IF_SET_IN_BITMAP (&lra_inheritance_pseudos, 0, regno, bi) + if (lra_reg_info[regno].restore_regno >= 0) + { + n_all_inherit++; + if (reg_renumber[regno] < 0) + bitmap_set_bit (&remove_pseudos, regno); + else + n_inherit++; + } + if (lra_dump_file != NULL && n_all_inherit != 0) + fprintf (lra_dump_file, "Inherit %d out of %d (%.2f%%)\n", + n_inherit, n_all_inherit, + (double) n_inherit / n_all_inherit * 100); + n_split = n_all_split = 0; + EXECUTE_IF_SET_IN_BITMAP (&lra_split_regs, 0, regno, bi) + if ((restore_regno = lra_reg_info[regno].restore_regno) >= 0) + { + n_all_split++; + hard_regno = (restore_regno >= FIRST_PSEUDO_REGISTER + ? reg_renumber[restore_regno] : restore_regno); + if (hard_regno < 0 || reg_renumber[regno] == hard_regno) + bitmap_set_bit (&remove_pseudos, regno); + else + { + n_split++; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Keep split r%d (orig=r%d)\n", + regno, restore_regno); + } + } + if (lra_dump_file != NULL && n_all_split != 0) + fprintf (lra_dump_file, "Split %d out of %d (%.2f%%)\n", + n_split, n_all_split, + (double) n_split / n_all_split * 100); + change_p = remove_inheritance_pseudos (&remove_pseudos); + bitmap_clear (&remove_pseudos); + /* Clear restore_regnos.
*/ + EXECUTE_IF_SET_IN_BITMAP (&lra_inheritance_pseudos, 0, regno, bi) + lra_reg_info[regno].restore_regno = -1; + EXECUTE_IF_SET_IN_BITMAP (&lra_split_regs, 0, regno, bi) + lra_reg_info[regno].restore_regno = -1; + return change_p; +} diff --git a/gcc/lra-eliminations.c b/gcc/lra-eliminations.c new file mode 100644 index 00000000000..2222d805f1b --- /dev/null +++ b/gcc/lra-eliminations.c @@ -0,0 +1,1301 @@ +/* Code for RTL register eliminations. + Copyright (C) 2010, 2011, 2012 + Free Software Foundation, Inc. + Contributed by Vladimir Makarov <vmakarov@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +/* Eliminable registers (like a soft argument or frame pointer) are + widely used in RTL. These eliminable registers should be replaced + by real hard registers (like the stack pointer or hard frame + pointer) plus some offset. The offsets usually change whenever the + stack is expanded. We know the final offsets only at the very end + of LRA. + + Within LRA, we usually keep the RTL in such a state that the + eliminable registers can be replaced by just the corresponding hard + register (without any offset). To achieve this we should add the + initial elimination offset at the beginning of LRA and update the + offsets whenever the stack is expanded. 
We need to do this before + every constraint pass because the choice of offset often affects + whether a particular address or memory constraint is satisfied. + + We keep RTL code at most time in such state that the virtual + registers can be changed by just the corresponding hard registers + (with zero offsets) and we have the right RTL code. To achieve this + we should add initial offset at the beginning of LRA work and update + offsets after each stack expanding. But actually we update virtual + registers to the same virtual registers + corresponding offsets + before every constraint pass because it affects constraint + satisfaction (e.g. an address displacement became too big for some + target). + + The final change of eliminable registers to the corresponding hard + registers are done at the very end of LRA when there were no change + in offsets anymore: + + fp + 42 => sp + 42 + +*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "hard-reg-set.h" +#include "rtl.h" +#include "tm_p.h" +#include "regs.h" +#include "insn-config.h" +#include "insn-codes.h" +#include "recog.h" +#include "output.h" +#include "addresses.h" +#include "target.h" +#include "function.h" +#include "expr.h" +#include "basic-block.h" +#include "except.h" +#include "optabs.h" +#include "df.h" +#include "ira.h" +#include "rtl-error.h" +#include "lra-int.h" + +/* This structure is used to record information about hard register + eliminations. */ +struct elim_table +{ + /* Hard register number to be eliminated. */ + int from; + /* Hard register number used as replacement. */ + int to; + /* Difference between values of the two hard registers above on + previous iteration. */ + HOST_WIDE_INT previous_offset; + /* Difference between the values on the current iteration. */ + HOST_WIDE_INT offset; + /* Nonzero if this elimination can be done. */ + bool can_eliminate; + /* CAN_ELIMINATE since the last check. 
*/ + bool prev_can_eliminate; + /* REG rtx for the register to be eliminated. We cannot simply + compare the number since we might then spuriously replace a hard + register corresponding to a pseudo assigned to the reg to be + eliminated. */ + rtx from_rtx; + /* REG rtx for the replacement. */ + rtx to_rtx; +}; + +/* The elimination table. Each array entry describes one possible way + of eliminating a register in favor of another. If there is more + than one way of eliminating a particular register, the most + preferred should be specified first. */ +static struct elim_table *reg_eliminate = 0; + +/* This is an intermediate structure to initialize the table. It has + exactly the members provided by ELIMINABLE_REGS. */ +static const struct elim_table_1 +{ + const int from; + const int to; +} reg_eliminate_1[] = + +/* If a set of eliminable hard registers was specified, define the + table from it. Otherwise, default to the normal case of the frame + pointer being replaced by the stack pointer. */ + +#ifdef ELIMINABLE_REGS + ELIMINABLE_REGS; +#else + {{ FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM}}; +#endif + +#define NUM_ELIMINABLE_REGS ARRAY_SIZE (reg_eliminate_1) + +/* Print info about elimination table to file F. */ +static void +print_elim_table (FILE *f) +{ + struct elim_table *ep; + + for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++) + fprintf (f, "%s eliminate %d to %d (offset=" HOST_WIDE_INT_PRINT_DEC + ", prev_offset=" HOST_WIDE_INT_PRINT_DEC ")\n", + ep->can_eliminate ? "Can" : "Can't", + ep->from, ep->to, ep->offset, ep->previous_offset); +} + +/* Print info about elimination table to stderr. */ +void +lra_debug_elim_table (void) +{ + print_elim_table (stderr); +} + +/* Setup possibility of elimination in elimination table element EP to + VALUE. Setup FRAME_POINTER_NEEDED if elimination from frame + pointer to stack pointer is not possible anymore. 
*/ +static void +setup_can_eliminate (struct elim_table *ep, bool value) +{ + ep->can_eliminate = ep->prev_can_eliminate = value; + if (! value + && ep->from == FRAME_POINTER_REGNUM && ep->to == STACK_POINTER_REGNUM) + frame_pointer_needed = 1; +} + +/* Map: eliminable "from" register -> its current elimination, + or NULL if none. The elimination table may contain more than + one elimination for the same hard register, but this map specifies + the one that we are currently using. */ +static struct elim_table *elimination_map[FIRST_PSEUDO_REGISTER]; + +/* When an eliminable hard register becomes not eliminable, we use the + following special structure to restore original offsets for the + register. */ +static struct elim_table self_elim_table; + +/* Offsets should be used to restore original offsets for eliminable + hard register which just became not eliminable. Zero, + otherwise. */ +static HOST_WIDE_INT self_elim_offsets[FIRST_PSEUDO_REGISTER]; + +/* Map: hard regno -> RTL presentation. RTL presentations of all + potentially eliminable hard registers are stored in the map. */ +static rtx eliminable_reg_rtx[FIRST_PSEUDO_REGISTER]; + +/* Set up ELIMINATION_MAP of the currently used eliminations. */ +static void +setup_elimination_map (void) +{ + int i; + struct elim_table *ep; + + for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) + elimination_map[i] = NULL; + for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++) + if (ep->can_eliminate && elimination_map[ep->from] == NULL) + elimination_map[ep->from] = ep; +} + + + +/* Compute the sum of X and Y, making canonicalizations assumed in an + address, namely: sum constant integers, surround the sum of two + constants with a CONST, put the constant as the second operand, and + group the constant on the outermost sum. + + This routine assumes both inputs are already in canonical form. 
*/ +static rtx +form_sum (rtx x, rtx y) +{ + rtx tem; + enum machine_mode mode = GET_MODE (x); + + if (mode == VOIDmode) + mode = GET_MODE (y); + + if (mode == VOIDmode) + mode = Pmode; + + if (CONST_INT_P (x)) + return plus_constant (mode, y, INTVAL (x)); + else if (CONST_INT_P (y)) + return plus_constant (mode, x, INTVAL (y)); + else if (CONSTANT_P (x)) + tem = x, x = y, y = tem; + + if (GET_CODE (x) == PLUS && CONSTANT_P (XEXP (x, 1))) + return form_sum (XEXP (x, 0), form_sum (XEXP (x, 1), y)); + + /* Note that if the operands of Y are specified in the opposite + order in the recursive calls below, infinite recursion will + occur. */ + if (GET_CODE (y) == PLUS && CONSTANT_P (XEXP (y, 1))) + return form_sum (form_sum (x, XEXP (y, 0)), XEXP (y, 1)); + + /* If both constant, encapsulate sum. Otherwise, just form sum. A + constant will have been placed second. */ + if (CONSTANT_P (x) && CONSTANT_P (y)) + { + if (GET_CODE (x) == CONST) + x = XEXP (x, 0); + if (GET_CODE (y) == CONST) + y = XEXP (y, 0); + + return gen_rtx_CONST (VOIDmode, gen_rtx_PLUS (mode, x, y)); + } + + return gen_rtx_PLUS (mode, x, y); +} + +/* Return the current substitution hard register of the elimination of + HARD_REGNO. If HARD_REGNO is not eliminable, return itself. */ +int +lra_get_elimination_hard_regno (int hard_regno) +{ + struct elim_table *ep; + + if (hard_regno < 0 || hard_regno >= FIRST_PSEUDO_REGISTER) + return hard_regno; + if ((ep = elimination_map[hard_regno]) == NULL) + return hard_regno; + return ep->to; +} + +/* Return elimination which will be used for hard reg REG, NULL + otherwise. */ +static struct elim_table * +get_elimination (rtx reg) +{ + int hard_regno; + struct elim_table *ep; + HOST_WIDE_INT offset; + + lra_assert (REG_P (reg)); + if ((hard_regno = REGNO (reg)) < 0 || hard_regno >= FIRST_PSEUDO_REGISTER) + return NULL; + if ((ep = elimination_map[hard_regno]) != NULL) + return ep->from_rtx != reg ? 
NULL : ep; + if ((offset = self_elim_offsets[hard_regno]) == 0) + return NULL; + /* This is an iteration to restore offsets just after HARD_REGNO + stopped to be eliminable. */ + self_elim_table.from = self_elim_table.to = hard_regno; + self_elim_table.from_rtx + = self_elim_table.to_rtx + = eliminable_reg_rtx[hard_regno]; + lra_assert (self_elim_table.from_rtx != NULL); + self_elim_table.offset = offset; + return &self_elim_table; +} + +/* Scan X and replace any eliminable registers (such as fp) with a + replacement (such as sp) if SUBST_P, plus an offset. The offset is + a change in the offset between the eliminable register and its + substitution if UPDATE_P, or the full offset if FULL_P, or + otherwise zero. + + MEM_MODE is the mode of an enclosing MEM. We need this to know how + much to adjust a register for, e.g., PRE_DEC. Also, if we are + inside a MEM, we are allowed to replace a sum of a hard register + and the constant zero with the hard register, which we cannot do + outside a MEM. In addition, we need to record the fact that a + hard register is referenced outside a MEM. + + Alternatively, INSN may be a note (an EXPR_LIST or INSN_LIST). + That's used when we eliminate in expressions stored in notes. */ +rtx +lra_eliminate_regs_1 (rtx x, enum machine_mode mem_mode, + bool subst_p, bool update_p, bool full_p) +{ + enum rtx_code code = GET_CODE (x); + struct elim_table *ep; + rtx new_rtx; + int i, j; + const char *fmt; + int copied = 0; + + if (! current_function_decl) + return x; + + switch (code) + { + CASE_CONST_ANY: + case CONST: + case SYMBOL_REF: + case CODE_LABEL: + case PC: + case CC0: + case ASM_INPUT: + case ADDR_VEC: + case ADDR_DIFF_VEC: + case RETURN: + return x; + + case REG: + /* First handle the case where we encounter a bare hard register + that is eliminable. Replace it with a PLUS. */ + if ((ep = get_elimination (x)) != NULL) + { + rtx to = subst_p ? 
ep->to_rtx : ep->from_rtx; + + if (update_p) + return plus_constant (Pmode, to, ep->offset - ep->previous_offset); + else if (full_p) + return plus_constant (Pmode, to, ep->offset); + else + return to; + } + return x; + + case PLUS: + /* If this is the sum of an eliminable register and a constant, rework + the sum. */ + if (REG_P (XEXP (x, 0)) && CONSTANT_P (XEXP (x, 1))) + { + if ((ep = get_elimination (XEXP (x, 0))) != NULL) + { + HOST_WIDE_INT offset; + rtx to = subst_p ? ep->to_rtx : ep->from_rtx; + + if (! update_p && ! full_p) + return gen_rtx_PLUS (Pmode, to, XEXP (x, 1)); + + offset = (update_p + ? ep->offset - ep->previous_offset : ep->offset); + if (CONST_INT_P (XEXP (x, 1)) + && INTVAL (XEXP (x, 1)) == -offset) + return to; + else + return gen_rtx_PLUS (Pmode, to, + plus_constant (Pmode, + XEXP (x, 1), offset)); + } + + /* If the hard register is not eliminable, we are done since + the other operand is a constant. */ + return x; + } + + /* If this is part of an address, we want to bring any constant + to the outermost PLUS. We will do this by doing hard + register replacement in our operands and seeing if a constant + shows up in one of them. + + Note that there is no risk of modifying the structure of the + insn, since we only get called for its operands, thus we are + either modifying the address inside a MEM, or something like + an address operand of a load-address insn. */ + + { + rtx new0 = lra_eliminate_regs_1 (XEXP (x, 0), mem_mode, + subst_p, update_p, full_p); + rtx new1 = lra_eliminate_regs_1 (XEXP (x, 1), mem_mode, + subst_p, update_p, full_p); + + if (new0 != XEXP (x, 0) || new1 != XEXP (x, 1)) + return form_sum (new0, new1); + } + return x; + + case MULT: + /* If this is the product of an eliminable hard register and a + constant, apply the distribute law and move the constant out + so that we have (plus (mult ..) ..). This is needed in order + to keep load-address insns valid. This case is pathological. 
+ We ignore the possibility of overflow here. */ + if (REG_P (XEXP (x, 0)) && CONST_INT_P (XEXP (x, 1)) + && (ep = get_elimination (XEXP (x, 0))) != NULL) + { + rtx to = subst_p ? ep->to_rtx : ep->from_rtx; + + if (update_p) + return + plus_constant (Pmode, + gen_rtx_MULT (Pmode, to, XEXP (x, 1)), + (ep->offset - ep->previous_offset) + * INTVAL (XEXP (x, 1))); + else if (full_p) + return + plus_constant (Pmode, + gen_rtx_MULT (Pmode, to, XEXP (x, 1)), + ep->offset * INTVAL (XEXP (x, 1))); + else + return gen_rtx_MULT (Pmode, to, XEXP (x, 1)); + } + + /* ... fall through ... */ + + case CALL: + case COMPARE: + /* See comments before PLUS about handling MINUS. */ + case MINUS: + case DIV: case UDIV: + case MOD: case UMOD: + case AND: case IOR: case XOR: + case ROTATERT: case ROTATE: + case ASHIFTRT: case LSHIFTRT: case ASHIFT: + case NE: case EQ: + case GE: case GT: case GEU: case GTU: + case LE: case LT: case LEU: case LTU: + { + rtx new0 = lra_eliminate_regs_1 (XEXP (x, 0), mem_mode, + subst_p, update_p, full_p); + rtx new1 = XEXP (x, 1) + ? lra_eliminate_regs_1 (XEXP (x, 1), mem_mode, + subst_p, update_p, full_p) : 0; + + if (new0 != XEXP (x, 0) || new1 != XEXP (x, 1)) + return gen_rtx_fmt_ee (code, GET_MODE (x), new0, new1); + } + return x; + + case EXPR_LIST: + /* If we have something in XEXP (x, 0), the usual case, + eliminate it. */ + if (XEXP (x, 0)) + { + new_rtx = lra_eliminate_regs_1 (XEXP (x, 0), mem_mode, + subst_p, update_p, full_p); + if (new_rtx != XEXP (x, 0)) + { + /* If this is a REG_DEAD note, it is not valid anymore. + Using the eliminated version could result in creating a + REG_DEAD note for the stack or frame pointer. */ + if (REG_NOTE_KIND (x) == REG_DEAD) + return (XEXP (x, 1) + ? lra_eliminate_regs_1 (XEXP (x, 1), mem_mode, + subst_p, update_p, full_p) + : NULL_RTX); + + x = alloc_reg_note (REG_NOTE_KIND (x), new_rtx, XEXP (x, 1)); + } + } + + /* ... fall through ... */ + + case INSN_LIST: + /* Now do eliminations in the rest of the chain. 
If this was + an EXPR_LIST, this might result in allocating more memory than is + strictly needed, but it simplifies the code. */ + if (XEXP (x, 1)) + { + new_rtx = lra_eliminate_regs_1 (XEXP (x, 1), mem_mode, + subst_p, update_p, full_p); + if (new_rtx != XEXP (x, 1)) + return + gen_rtx_fmt_ee (GET_CODE (x), GET_MODE (x), + XEXP (x, 0), new_rtx); + } + return x; + + case PRE_INC: + case POST_INC: + case PRE_DEC: + case POST_DEC: + /* We do not support elimination of a register that is modified. + elimination_effects has already make sure that this does not + happen. */ + return x; + + case PRE_MODIFY: + case POST_MODIFY: + /* We do not support elimination of a hard register that is + modified. LRA has already make sure that this does not + happen. The only remaining case we need to consider here is + that the increment value may be an eliminable register. */ + if (GET_CODE (XEXP (x, 1)) == PLUS + && XEXP (XEXP (x, 1), 0) == XEXP (x, 0)) + { + rtx new_rtx = lra_eliminate_regs_1 (XEXP (XEXP (x, 1), 1), mem_mode, + subst_p, update_p, full_p); + + if (new_rtx != XEXP (XEXP (x, 1), 1)) + return gen_rtx_fmt_ee (code, GET_MODE (x), XEXP (x, 0), + gen_rtx_PLUS (GET_MODE (x), + XEXP (x, 0), new_rtx)); + } + return x; + + case STRICT_LOW_PART: + case NEG: case NOT: + case SIGN_EXTEND: case ZERO_EXTEND: + case TRUNCATE: case FLOAT_EXTEND: case FLOAT_TRUNCATE: + case FLOAT: case FIX: + case UNSIGNED_FIX: case UNSIGNED_FLOAT: + case ABS: + case SQRT: + case FFS: + case CLZ: + case CTZ: + case POPCOUNT: + case PARITY: + case BSWAP: + new_rtx = lra_eliminate_regs_1 (XEXP (x, 0), mem_mode, + subst_p, update_p, full_p); + if (new_rtx != XEXP (x, 0)) + return gen_rtx_fmt_e (code, GET_MODE (x), new_rtx); + return x; + + case SUBREG: + new_rtx = lra_eliminate_regs_1 (SUBREG_REG (x), mem_mode, + subst_p, update_p, full_p); + + if (new_rtx != SUBREG_REG (x)) + { + int x_size = GET_MODE_SIZE (GET_MODE (x)); + int new_size = GET_MODE_SIZE (GET_MODE (new_rtx)); + + if (MEM_P (new_rtx) && 
x_size <= new_size) + { + SUBREG_REG (x) = new_rtx; + alter_subreg (&x, false); + return x; + } + else + return gen_rtx_SUBREG (GET_MODE (x), new_rtx, SUBREG_BYTE (x)); + } + + return x; + + case MEM: + /* Our only special processing is to pass the mode of the MEM to our + recursive call and copy the flags. While we are here, handle this + case more efficiently. */ + return + replace_equiv_address_nv + (x, + lra_eliminate_regs_1 (XEXP (x, 0), GET_MODE (x), + subst_p, update_p, full_p)); + + case USE: + /* Handle insn_list USE that a call to a pure function may generate. */ + new_rtx = lra_eliminate_regs_1 (XEXP (x, 0), VOIDmode, + subst_p, update_p, full_p); + if (new_rtx != XEXP (x, 0)) + return gen_rtx_USE (GET_MODE (x), new_rtx); + return x; + + case CLOBBER: + case SET: + gcc_unreachable (); + + default: + break; + } + + /* Process each of our operands recursively. If any have changed, make a + copy of the rtx. */ + fmt = GET_RTX_FORMAT (code); + for (i = 0; i < GET_RTX_LENGTH (code); i++, fmt++) + { + if (*fmt == 'e') + { + new_rtx = lra_eliminate_regs_1 (XEXP (x, i), mem_mode, + subst_p, update_p, full_p); + if (new_rtx != XEXP (x, i) && ! copied) + { + x = shallow_copy_rtx (x); + copied = 1; + } + XEXP (x, i) = new_rtx; + } + else if (*fmt == 'E') + { + int copied_vec = 0; + for (j = 0; j < XVECLEN (x, i); j++) + { + new_rtx = lra_eliminate_regs_1 (XVECEXP (x, i, j), mem_mode, + subst_p, update_p, full_p); + if (new_rtx != XVECEXP (x, i, j) && ! copied_vec) + { + rtvec new_v = gen_rtvec_v (XVECLEN (x, i), + XVEC (x, i)->elem); + if (! copied) + { + x = shallow_copy_rtx (x); + copied = 1; + } + XVEC (x, i) = new_v; + copied_vec = 1; + } + XVECEXP (x, i, j) = new_rtx; + } + } + } + + return x; +} + +/* This function is used externally in subsequent passes of GCC. It + always does a full elimination of X. 
*/ +rtx +lra_eliminate_regs (rtx x, enum machine_mode mem_mode, + rtx insn ATTRIBUTE_UNUSED) +{ + return lra_eliminate_regs_1 (x, mem_mode, true, false, true); +} + +/* Scan rtx X for references to elimination source or target registers + in contexts that would prevent the elimination from happening. + Update the table of eliminables to reflect the changed state. + MEM_MODE is the mode of an enclosing MEM rtx, or VOIDmode if not + within a MEM. */ +static void +mark_not_eliminable (rtx x) +{ + enum rtx_code code = GET_CODE (x); + struct elim_table *ep; + int i, j; + const char *fmt; + + switch (code) + { + case PRE_INC: + case POST_INC: + case PRE_DEC: + case POST_DEC: + case POST_MODIFY: + case PRE_MODIFY: + if (REG_P (XEXP (x, 0)) && REGNO (XEXP (x, 0)) < FIRST_PSEUDO_REGISTER) + /* If we modify the source of an elimination rule, disable + it. Do the same if it is the source and not the hard frame + register. */ + for (ep = reg_eliminate; + ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; + ep++) + if (ep->from_rtx == XEXP (x, 0) + || (ep->to_rtx == XEXP (x, 0) + && ep->to_rtx != hard_frame_pointer_rtx)) + setup_can_eliminate (ep, false); + return; + + case USE: + if (REG_P (XEXP (x, 0)) && REGNO (XEXP (x, 0)) < FIRST_PSEUDO_REGISTER) + /* If using a hard register that is the source of an eliminate + we still think can be performed, note it cannot be + performed since we don't know how this hard register is + used. */ + for (ep = reg_eliminate; + ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; + ep++) + if (ep->from_rtx == XEXP (x, 0) + && ep->to_rtx != hard_frame_pointer_rtx) + setup_can_eliminate (ep, false); + return; + + case CLOBBER: + if (REG_P (XEXP (x, 0)) && REGNO (XEXP (x, 0)) < FIRST_PSEUDO_REGISTER) + /* If clobbering a hard register that is the replacement + register for an elimination we still think can be + performed, note that it cannot be performed. Otherwise, we + need not be concerned about it. 
*/ + for (ep = reg_eliminate; + ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; + ep++) + if (ep->to_rtx == XEXP (x, 0) + && ep->to_rtx != hard_frame_pointer_rtx) + setup_can_eliminate (ep, false); + return; + + case SET: + /* Check for setting a hard register that we know about. */ + if (REG_P (SET_DEST (x)) && REGNO (SET_DEST (x)) < FIRST_PSEUDO_REGISTER) + { + /* See if this is setting the replacement hard register for + an elimination. + + If DEST is the hard frame pointer, we do nothing because + we assume that all assignments to the frame pointer are + for non-local gotos and are being done at a time when + they are valid and do not disturb anything else. Some + machines want to eliminate a fake argument pointer (or + even a fake frame pointer) with either the real frame + pointer or the stack pointer. Assignments to the hard + frame pointer must not prevent this elimination. */ + + for (ep = reg_eliminate; + ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; + ep++) + if (ep->to_rtx == SET_DEST (x) + && SET_DEST (x) != hard_frame_pointer_rtx) + setup_can_eliminate (ep, false); + } + + mark_not_eliminable (SET_DEST (x)); + mark_not_eliminable (SET_SRC (x)); + return; + + default: + break; + } + + fmt = GET_RTX_FORMAT (code); + for (i = 0; i < GET_RTX_LENGTH (code); i++, fmt++) + { + if (*fmt == 'e') + mark_not_eliminable (XEXP (x, i)); + else if (*fmt == 'E') + for (j = 0; j < XVECLEN (x, i); j++) + mark_not_eliminable (XVECEXP (x, i, j)); + } +} + + + +/* Scan INSN and eliminate all eliminable hard registers in it. + + If REPLACE_P is true, do the replacement destructively. Also + delete the insn as dead it if it is setting an eliminable register. + + If REPLACE_P is false, just update the offsets while keeping the + base register the same. 
*/ + +static void +eliminate_regs_in_insn (rtx insn, bool replace_p) +{ + int icode = recog_memoized (insn); + rtx old_set = single_set (insn); + bool validate_p; + int i; + rtx substed_operand[MAX_RECOG_OPERANDS]; + rtx orig_operand[MAX_RECOG_OPERANDS]; + struct elim_table *ep; + rtx plus_src, plus_cst_src; + lra_insn_recog_data_t id; + struct lra_static_insn_data *static_id; + + if (icode < 0 && asm_noperands (PATTERN (insn)) < 0 && ! DEBUG_INSN_P (insn)) + { + lra_assert (GET_CODE (PATTERN (insn)) == USE + || GET_CODE (PATTERN (insn)) == CLOBBER + || GET_CODE (PATTERN (insn)) == ADDR_VEC + || GET_CODE (PATTERN (insn)) == ADDR_DIFF_VEC + || GET_CODE (PATTERN (insn)) == ASM_INPUT); + return; + } + + /* Check for setting an eliminable register. */ + if (old_set != 0 && REG_P (SET_DEST (old_set)) + && (ep = get_elimination (SET_DEST (old_set))) != NULL) + { + bool delete_p = replace_p; + +#ifdef HARD_FRAME_POINTER_REGNUM + /* If this is setting the frame pointer register to the hardware + frame pointer register and this is an elimination that will + be done (tested above), this insn is really adjusting the + frame pointer downward to compensate for the adjustment done + before a nonlocal goto. 
*/ + if (ep->from == FRAME_POINTER_REGNUM + && ep->to == HARD_FRAME_POINTER_REGNUM) + { + if (replace_p) + { + SET_DEST (old_set) = ep->to_rtx; + lra_update_insn_recog_data (insn); + return; + } + else + { + rtx base = SET_SRC (old_set); + HOST_WIDE_INT offset = 0; + rtx base_insn = insn; + + while (base != ep->to_rtx) + { + rtx prev_insn, prev_set; + + if (GET_CODE (base) == PLUS && CONST_INT_P (XEXP (base, 1))) + { + offset += INTVAL (XEXP (base, 1)); + base = XEXP (base, 0); + } + else if ((prev_insn = prev_nonnote_insn (base_insn)) != 0 + && (prev_set = single_set (prev_insn)) != 0 + && rtx_equal_p (SET_DEST (prev_set), base)) + { + base = SET_SRC (prev_set); + base_insn = prev_insn; + } + else + break; + } + + if (base == ep->to_rtx) + { + rtx src; + + offset -= (ep->offset - ep->previous_offset); + src = plus_constant (Pmode, ep->to_rtx, offset); + + /* First see if this insn remains valid when we make + the change. If not, keep the INSN_CODE the same + and let the constraint pass fit it up. */ + validate_change (insn, &SET_SRC (old_set), src, 1); + validate_change (insn, &SET_DEST (old_set), + ep->from_rtx, 1); + if (! apply_change_group ()) + { + SET_SRC (old_set) = src; + SET_DEST (old_set) = ep->from_rtx; + } + lra_update_insn_recog_data (insn); + return; + } + } + + + /* We can't delete this insn, but needn't process it + since it won't be used unless something changes. */ + delete_p = false; + } +#endif + + /* This insn isn't serving a useful purpose. We delete it + when REPLACE is set. */ + if (delete_p) + lra_delete_dead_insn (insn); + return; + } + + /* We allow one special case which happens to work on all machines we + currently support: a single set with the source or a REG_EQUAL + note being a PLUS of an eliminable register and a constant. 
*/ + plus_src = plus_cst_src = 0; + if (old_set && REG_P (SET_DEST (old_set))) + { + if (GET_CODE (SET_SRC (old_set)) == PLUS) + plus_src = SET_SRC (old_set); + /* First see if the source is of the form (plus (...) CST). */ + if (plus_src + && CONST_INT_P (XEXP (plus_src, 1))) + plus_cst_src = plus_src; + /* Check that the first operand of the PLUS is a hard reg or + the lowpart subreg of one. */ + if (plus_cst_src) + { + rtx reg = XEXP (plus_cst_src, 0); + + if (GET_CODE (reg) == SUBREG && subreg_lowpart_p (reg)) + reg = SUBREG_REG (reg); + + if (!REG_P (reg) || REGNO (reg) >= FIRST_PSEUDO_REGISTER) + plus_cst_src = 0; + } + } + if (plus_cst_src) + { + rtx reg = XEXP (plus_cst_src, 0); + HOST_WIDE_INT offset = INTVAL (XEXP (plus_cst_src, 1)); + + if (GET_CODE (reg) == SUBREG) + reg = SUBREG_REG (reg); + + if (REG_P (reg) && (ep = get_elimination (reg)) != NULL) + { + rtx to_rtx = replace_p ? ep->to_rtx : ep->from_rtx; + + if (! replace_p) + { + offset += (ep->offset - ep->previous_offset); + offset = trunc_int_for_mode (offset, GET_MODE (plus_cst_src)); + } + + if (GET_CODE (XEXP (plus_cst_src, 0)) == SUBREG) + to_rtx = gen_lowpart (GET_MODE (XEXP (plus_cst_src, 0)), to_rtx); + /* If we have a nonzero offset, and the source is already a + simple REG, the following transformation would increase + the cost of the insn by replacing a simple REG with (plus + (reg sp) CST). So try only when we already had a PLUS + before. */ + if (offset == 0 || plus_src) + { + rtx new_src = plus_constant (GET_MODE (to_rtx), to_rtx, offset); + + old_set = single_set (insn); + + /* First see if this insn remains valid when we make the + change. If not, try to replace the whole pattern + with a simple set (this may help if the original insn + was a PARALLEL that was only recognized as single_set + due to REG_UNUSED notes). If this isn't valid + either, keep the INSN_CODE the same and let the + constraint pass fix it up. */ + if (! 
validate_change (insn, &SET_SRC (old_set), new_src, 0)) + { + rtx new_pat = gen_rtx_SET (VOIDmode, + SET_DEST (old_set), new_src); + + if (! validate_change (insn, &PATTERN (insn), new_pat, 0)) + SET_SRC (old_set) = new_src; + } + lra_update_insn_recog_data (insn); + /* This can't have an effect on elimination offsets, so skip + right to the end. */ + return; + } + } + } + + /* Eliminate all eliminable registers occurring in operands that + can be handled by the constraint pass. */ + id = lra_get_insn_recog_data (insn); + static_id = id->insn_static_data; + validate_p = false; + for (i = 0; i < static_id->n_operands; i++) + { + orig_operand[i] = *id->operand_loc[i]; + substed_operand[i] = *id->operand_loc[i]; + + /* For an asm statement, every operand is eliminable. */ + if (icode < 0 || insn_data[icode].operand[i].eliminable) + { + /* Check for setting a hard register that we know about. */ + if (static_id->operand[i].type != OP_IN + && REG_P (orig_operand[i])) + { + /* If we are assigning to a hard register that can be + eliminated, it must be as part of a PARALLEL, since + the code above handles single SETs. This reg can not + be longer eliminated -- it is forced by + mark_not_eliminable. */ + for (ep = reg_eliminate; + ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; + ep++) + lra_assert (ep->from_rtx != orig_operand[i] + || ! ep->can_eliminate); + } + + /* Companion to the above plus substitution, we can allow + invariants as the source of a plain move. */ + substed_operand[i] + = lra_eliminate_regs_1 (*id->operand_loc[i], VOIDmode, + replace_p, ! replace_p, false); + if (substed_operand[i] != orig_operand[i]) + validate_p = true; + } + } + + /* Substitute the operands; the new values are in the substed_operand + array. 
*/ + for (i = 0; i < static_id->n_operands; i++) + *id->operand_loc[i] = substed_operand[i]; + for (i = 0; i < static_id->n_dups; i++) + *id->dup_loc[i] = substed_operand[(int) static_id->dup_num[i]]; + + if (validate_p) + { + /* If we had a move insn but now we don't, re-recognize it. + This will cause spurious re-recognition if the old move had a + PARALLEL since the new one still will, but we can't call + single_set without having put new body into the insn and the + re-recognition won't hurt in this rare case. */ + id = lra_update_insn_recog_data (insn); + static_id = id->insn_static_data; + } +} + +/* Spill pseudos which are assigned to hard registers in SET. Add + affected insns for processing in the subsequent constraint + pass. */ +static void +spill_pseudos (HARD_REG_SET set) +{ + int i; + bitmap_head to_process; + rtx insn; + + if (hard_reg_set_empty_p (set)) + return; + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, " Spilling non-eliminable hard regs:"); + for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) + if (TEST_HARD_REG_BIT (set, i)) + fprintf (lra_dump_file, " %d", i); + fprintf (lra_dump_file, "\n"); + } + bitmap_initialize (&to_process, &reg_obstack); + for (i = FIRST_PSEUDO_REGISTER; i < max_reg_num (); i++) + if (lra_reg_info[i].nrefs != 0 && reg_renumber[i] >= 0 + && overlaps_hard_reg_set_p (set, + PSEUDO_REGNO_MODE (i), reg_renumber[i])) + { + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Spilling r%d(%d)\n", + i, reg_renumber[i]); + reg_renumber[i] = -1; + bitmap_ior_into (&to_process, &lra_reg_info[i].insn_bitmap); + } + IOR_HARD_REG_SET (lra_no_alloc_regs, set); + for (insn = get_insns (); insn != NULL_RTX; insn = NEXT_INSN (insn)) + if (bitmap_bit_p (&to_process, INSN_UID (insn))) + { + lra_push_insn (insn); + lra_set_used_insn_alternative (insn, -1); + } + bitmap_clear (&to_process); +} + +/* Update all offsets and possibility for elimination on eliminable + registers. 
Spill pseudos assigned to registers which became + uneliminable, update LRA_NO_ALLOC_REGS and ELIMINABLE_REG_SET. Add + insns to INSNS_WITH_CHANGED_OFFSETS containing eliminable hard + registers whose offsets should be changed. */ +static void +update_reg_eliminate (bitmap insns_with_changed_offsets) +{ + bool prev; + struct elim_table *ep, *ep1; + HARD_REG_SET temp_hard_reg_set; + + /* Clear self elimination offsets. */ + for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++) + self_elim_offsets[ep->from] = 0; + CLEAR_HARD_REG_SET (temp_hard_reg_set); + for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++) + { + /* If it is a currently used elimination: update the previous + offset. */ + if (elimination_map[ep->from] == ep) + ep->previous_offset = ep->offset; + + prev = ep->prev_can_eliminate; + setup_can_eliminate (ep, targetm.can_eliminate (ep->from, ep->to)); + if (ep->can_eliminate && ! prev) + { + /* It is possible that not eliminable register becomes + eliminable because we took other reasons into account to + set up eliminable regs in the initial set up. Just + ignore new eliminable registers. */ + setup_can_eliminate (ep, false); + continue; + } + if (ep->can_eliminate != prev && elimination_map[ep->from] == ep) + { + /* We cannot use this elimination anymore -- find another + one. */ + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + " Elimination %d to %d is not possible anymore\n", + ep->from, ep->to); + /* Mark that is not eliminable anymore. */ + elimination_map[ep->from] = NULL; + for (ep1 = ep + 1; ep1 < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep1++) + if (ep1->can_eliminate && ep1->from == ep->from) + break; + if (ep1 < &reg_eliminate[NUM_ELIMINABLE_REGS]) + { + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Using elimination %d to %d now\n", + ep1->from, ep1->to); + /* Prevent the hard register into which we eliminate now + from the usage for pseudos. 
*/ + SET_HARD_REG_BIT (temp_hard_reg_set, ep1->to); + lra_assert (ep1->previous_offset == 0); + ep1->previous_offset = ep->offset; + } + else + { + /* There is no elimination anymore just use the hard + register `from' itself. Setup self elimination + offset to restore the original offset values. */ + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " %d is not eliminable at all\n", + ep->from); + self_elim_offsets[ep->from] = -ep->offset; + SET_HARD_REG_BIT (temp_hard_reg_set, ep->from); + if (ep->offset != 0) + bitmap_ior_into (insns_with_changed_offsets, + &lra_reg_info[ep->from].insn_bitmap); + } + } + +#ifdef ELIMINABLE_REGS + INITIAL_ELIMINATION_OFFSET (ep->from, ep->to, ep->offset); +#else + INITIAL_FRAME_POINTER_OFFSET (ep->offset); +#endif + } + IOR_HARD_REG_SET (lra_no_alloc_regs, temp_hard_reg_set); + AND_COMPL_HARD_REG_SET (eliminable_regset, temp_hard_reg_set); + spill_pseudos (temp_hard_reg_set); + setup_elimination_map (); + for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++) + if (elimination_map[ep->from] == ep && ep->previous_offset != ep->offset) + bitmap_ior_into (insns_with_changed_offsets, + &lra_reg_info[ep->from].insn_bitmap); +} + +/* Initialize the table of hard registers to eliminate. + Pre-condition: global flag frame_pointer_needed has been set before + calling this function. */ +static void +init_elim_table (void) +{ + bool value_p; + struct elim_table *ep; +#ifdef ELIMINABLE_REGS + const struct elim_table_1 *ep1; +#endif + + if (!reg_eliminate) + reg_eliminate = XCNEWVEC (struct elim_table, NUM_ELIMINABLE_REGS); + + memset (self_elim_offsets, 0, sizeof (self_elim_offsets)); + /* Initiate member values which will be never changed. 
*/ + self_elim_table.can_eliminate = self_elim_table.prev_can_eliminate = true; + self_elim_table.previous_offset = 0; +#ifdef ELIMINABLE_REGS + for (ep = reg_eliminate, ep1 = reg_eliminate_1; + ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++, ep1++) + { + ep->offset = ep->previous_offset = 0; + ep->from = ep1->from; + ep->to = ep1->to; + value_p = (targetm.can_eliminate (ep->from, ep->to) + && ! (ep->to == STACK_POINTER_REGNUM + && frame_pointer_needed + && (! SUPPORTS_STACK_ALIGNMENT + || ! stack_realign_fp))); + setup_can_eliminate (ep, value_p); + } +#else + reg_eliminate[0].offset = reg_eliminate[0].previous_offset = 0; + reg_eliminate[0].from = reg_eliminate_1[0].from; + reg_eliminate[0].to = reg_eliminate_1[0].to; + setup_can_eliminate (&reg_eliminate[0], ! frame_pointer_needed); +#endif + + /* Count the number of eliminable registers and build the FROM and TO + REG rtx's. Note that code in gen_rtx_REG will cause, e.g., + gen_rtx_REG (Pmode, STACK_POINTER_REGNUM) to equal stack_pointer_rtx. + We depend on this. */ + for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++) + { + ep->from_rtx = gen_rtx_REG (Pmode, ep->from); + ep->to_rtx = gen_rtx_REG (Pmode, ep->to); + eliminable_reg_rtx[ep->from] = ep->from_rtx; + } +} + +/* Entry function for initialization of elimination once per + function. */ +void +lra_init_elimination (void) +{ + basic_block bb; + rtx insn; + + init_elim_table (); + FOR_EACH_BB (bb) + FOR_BB_INSNS (bb, insn) + if (NONDEBUG_INSN_P (insn)) + mark_not_eliminable (PATTERN (insn)); + setup_elimination_map (); +} + +/* Eliminate hard reg given by its location LOC. */ +void +lra_eliminate_reg_if_possible (rtx *loc) +{ + int regno; + struct elim_table *ep; + + lra_assert (REG_P (*loc)); + if ((regno = REGNO (*loc)) >= FIRST_PSEUDO_REGISTER + || ! TEST_HARD_REG_BIT (lra_no_alloc_regs, regno)) + return; + if ((ep = get_elimination (*loc)) != NULL) + *loc = ep->to_rtx; +} + +/* Do (final if FINAL_P) elimination in INSN. 
Add the insn for + subsequent processing in the constraint pass, update the insn info. */ +static void +process_insn_for_elimination (rtx insn, bool final_p) +{ + eliminate_regs_in_insn (insn, final_p); + if (! final_p) + { + /* Check that insn changed its code. This is a case when a move + insn becomes an add insn and we do not want to process the + insn as a move anymore. */ + int icode = recog (PATTERN (insn), insn, 0); + + if (icode >= 0 && icode != INSN_CODE (insn)) + { + INSN_CODE (insn) = icode; + lra_update_insn_recog_data (insn); + } + lra_update_insn_regno_info (insn); + lra_push_insn (insn); + lra_set_used_insn_alternative (insn, -1); + } +} + +/* Entry function to do final elimination if FINAL_P or to update + elimination register offsets. */ +void +lra_eliminate (bool final_p) +{ + int i; + unsigned int uid; + rtx mem_loc, invariant; + bitmap_head insns_with_changed_offsets; + bitmap_iterator bi; + struct elim_table *ep; + int regs_num = max_reg_num (); + + timevar_push (TV_LRA_ELIMINATE); + + bitmap_initialize (&insns_with_changed_offsets, &reg_obstack); + if (final_p) + { +#ifdef ENABLE_CHECKING + update_reg_eliminate (&insns_with_changed_offsets); + if (! bitmap_empty_p (&insns_with_changed_offsets)) + gcc_unreachable (); +#endif + /* We change eliminable hard registers in insns so we should do + this for all insns containing any eliminable hard + register. 
*/ + for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS]; ep++) + if (elimination_map[ep->from] != NULL) + bitmap_ior_into (&insns_with_changed_offsets, + &lra_reg_info[ep->from].insn_bitmap); + } + else + { + update_reg_eliminate (&insns_with_changed_offsets); + if (bitmap_empty_p (&insns_with_changed_offsets)) + goto lra_eliminate_done; + } + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, "New elimination table:\n"); + print_elim_table (lra_dump_file); + } + for (i = FIRST_PSEUDO_REGISTER; i < regs_num; i++) + if (lra_reg_info[i].nrefs != 0) + { + mem_loc = ira_reg_equiv[i].memory; + if (mem_loc != NULL_RTX) + mem_loc = lra_eliminate_regs_1 (mem_loc, VOIDmode, + final_p, ! final_p, false); + ira_reg_equiv[i].memory = mem_loc; + invariant = ira_reg_equiv[i].invariant; + if (invariant != NULL_RTX) + invariant = lra_eliminate_regs_1 (invariant, VOIDmode, + final_p, ! final_p, false); + ira_reg_equiv[i].invariant = invariant; + if (lra_dump_file != NULL + && (mem_loc != NULL_RTX || invariant != NULL)) + fprintf (lra_dump_file, + "Updating elimination of equiv for reg %d\n", i); + } + EXECUTE_IF_SET_IN_BITMAP (&insns_with_changed_offsets, 0, uid, bi) + process_insn_for_elimination (lra_insn_recog_data[uid]->insn, final_p); + bitmap_clear (&insns_with_changed_offsets); + +lra_eliminate_done: + timevar_pop (TV_LRA_ELIMINATE); +} diff --git a/gcc/lra-int.h b/gcc/lra-int.h new file mode 100644 index 00000000000..d00cc12feff --- /dev/null +++ b/gcc/lra-int.h @@ -0,0 +1,438 @@ +/* Local Register Allocator (LRA) intercommunication header file. + Copyright (C) 2010, 2011, 2012 + Free Software Foundation, Inc. + Contributed by Vladimir Makarov <vmakarov@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. 
+ +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#include "lra.h" +#include "bitmap.h" +#include "recog.h" +#include "insn-attr.h" +#include "insn-codes.h" + +#ifdef ENABLE_CHECKING +#define lra_assert(c) gcc_assert (c) +#else +/* Always define and include C, so that warnings for empty body in an + ‘if’ statement and unused variable do not occur. */ +#define lra_assert(c) ((void)(0 && (c))) +#endif + +/* The parameter used to prevent infinite reloading for an insn. Each + insn operands might require a reload and, if it is a memory, its + base and index registers might require a reload too. */ +#define LRA_MAX_INSN_RELOADS (MAX_RECOG_OPERANDS * 3) + +/* Return the hard register which given pseudo REGNO assigned to. + Negative value means that the register got memory or we don't know + allocation yet. */ +static inline int +lra_get_regno_hard_regno (int regno) +{ + resize_reg_info (); + return reg_renumber[regno]; +} + +typedef struct lra_live_range *lra_live_range_t; + +/* The structure describes program points where a given pseudo lives. + The live ranges can be used to find conflicts with other pseudos. + If the live ranges of two pseudos are intersected, the pseudos are + in conflict. */ +struct lra_live_range +{ + /* Pseudo regno whose live range is described by given + structure. */ + int regno; + /* Program point range. */ + int start, finish; + /* Next structure describing program points where the pseudo + lives. */ + lra_live_range_t next; + /* Pointer to structures with the same start. 
*/ + lra_live_range_t start_next; +}; + +typedef struct lra_copy *lra_copy_t; + +/* Copy between pseudos which affects assigning hard registers. */ +struct lra_copy +{ + /* True if regno1 is the destination of the copy. */ + bool regno1_dest_p; + /* Execution frequency of the copy. */ + int freq; + /* Pseudos connected by the copy. REGNO1 < REGNO2. */ + int regno1, regno2; + /* Next copy with correspondingly REGNO1 and REGNO2. */ + lra_copy_t regno1_next, regno2_next; +}; + +/* Common info about a register (pseudo or hard register). */ +struct lra_reg +{ + /* Bitmap of UIDs of insns (including debug insns) referring the + reg. */ + bitmap_head insn_bitmap; + /* The following fields are defined only for pseudos. */ + /* Hard registers with which the pseudo conflicts. */ + HARD_REG_SET conflict_hard_regs; + /* We assign hard registers to reload pseudos which can occur in few + places. So two hard register preferences are enough for them. + The following fields define the preferred hard registers. If + there are no such hard registers the first field value is + negative. If there is only one preferred hard register, the 2nd + field is negative. */ + int preferred_hard_regno1, preferred_hard_regno2; + /* Profits to use the corresponding preferred hard registers. If + the both hard registers defined, the first hard register has not + less profit than the second one. */ + int preferred_hard_regno_profit1, preferred_hard_regno_profit2; +#ifdef STACK_REGS + /* True if the pseudo should not be assigned to a stack register. */ + bool no_stack_p; +#endif +#ifdef ENABLE_CHECKING + /* True if the pseudo crosses a call. It is setup in lra-lives.c + and used to check that the pseudo crossing a call did not get a + call used hard register. */ + bool call_p; +#endif + /* Number of references and execution frequencies of the register in + *non-debug* insns. */ + int nrefs, freq; + int last_reload; + /* Regno used to undo the inheritance. 
It can be non-zero only + between couple of inheritance and undo inheritance passes. */ + int restore_regno; + /* Value holding by register. If the pseudos have the same value + they do not conflict. */ + int val; + /* These members are set up in lra-lives.c and updated in + lra-coalesce.c. */ + /* The biggest size mode in which each pseudo reg is referred in + whole function (possibly via subreg). */ + enum machine_mode biggest_mode; + /* Live ranges of the pseudo. */ + lra_live_range_t live_ranges; + /* This member is set up in lra-lives.c for subsequent + assignments. */ + lra_copy_t copies; +}; + +/* References to the common info about each register. */ +extern struct lra_reg *lra_reg_info; + +/* Static info about each insn operand (common for all insns with the + same ICODE). Warning: if the structure definition is changed, the + initializer for debug_operand_data in lra.c should be changed + too. */ +struct lra_operand_data +{ + /* The machine description constraint string of the operand. */ + const char *constraint; + /* It is taken only from machine description (which is different + from recog_data.operand_mode) and can be of VOIDmode. */ + ENUM_BITFIELD(machine_mode) mode : 16; + /* The type of the operand (in/out/inout). */ + ENUM_BITFIELD (op_type) type : 8; + /* Through if accessed through STRICT_LOW. */ + unsigned int strict_low : 1; + /* True if the operand is an operator. */ + unsigned int is_operator : 1; + /* True if there is an early clobber alternative for this operand. + This field is set up every time when corresponding + operand_alternative in lra_static_insn_data is set up. */ + unsigned int early_clobber : 1; + /* True if the operand is an address. */ + unsigned int is_address : 1; +}; + +/* Info about register occurrence in an insn. */ +struct lra_insn_reg +{ + /* The biggest mode through which the insn refers to the register + occurrence (remember the register can be accessed through a + subreg in the insn). 
*/ + ENUM_BITFIELD(machine_mode) biggest_mode : 16; + /* The type of the corresponding operand which is the register. */ + ENUM_BITFIELD (op_type) type : 8; + /* True if the reg is accessed through a subreg and the subreg is + just a part of the register. */ + unsigned int subreg_p : 1; + /* True if there is an early clobber alternative for this + operand. */ + unsigned int early_clobber : 1; + /* The corresponding regno of the register. */ + int regno; + /* Next reg info of the same insn. */ + struct lra_insn_reg *next; +}; + +/* Static part (common info for insns with the same ICODE) of LRA + internal insn info. It exists in at most one exemplar for each + non-negative ICODE. There is only one exception. Each asm insn has + own structure. Warning: if the structure definition is changed, + the initializer for debug_insn_static_data in lra.c should be + changed too. */ +struct lra_static_insn_data +{ + /* Static info about each insn operand. */ + struct lra_operand_data *operand; + /* Each duplication refers to the number of the corresponding + operand which is duplicated. */ + int *dup_num; + /* The number of an operand marked as commutative, -1 otherwise. */ + int commutative; + /* Number of operands, duplications, and alternatives of the + insn. */ + char n_operands; + char n_dups; + char n_alternatives; + /* Insns in machine description (or clobbers in asm) may contain + explicit hard regs which are not operands. The following list + describes such hard registers. */ + struct lra_insn_reg *hard_regs; + /* Array [n_alternatives][n_operand] of static constraint info for + given operand in given alternative. This info can be changed if + the target reg info is changed. */ + struct operand_alternative *operand_alternative; +}; + +/* LRA internal info about an insn (LRA internal insn + representation). */ +struct lra_insn_recog_data +{ + /* The insn code. */ + int icode; + /* The insn itself. */ + rtx insn; + /* Common data for insns with the same ICODE. 
Asm insns (their + ICODE is negative) do not share such structures. */ + struct lra_static_insn_data *insn_static_data; + /* Two arrays of size correspondingly equal to the operand and the + duplication numbers: */ + rtx **operand_loc; /* The operand locations, NULL if no operands. */ + rtx **dup_loc; /* The dup locations, NULL if no dups. */ + /* Number of hard registers implicitly used in given call insn. The + value can be NULL or points to array of the hard register numbers + ending with a negative value. */ + int *arg_hard_regs; +#ifdef HAVE_ATTR_enabled + /* Alternative enabled for the insn. NULL for debug insns. */ + bool *alternative_enabled_p; +#endif + /* The alternative should be used for the insn, -1 if invalid, or we + should try to use any alternative, or the insn is a debug + insn. */ + int used_insn_alternative; + /* The following member value is always NULL for a debug insn. */ + struct lra_insn_reg *regs; +}; + +typedef struct lra_insn_recog_data *lra_insn_recog_data_t; + +/* lra.c: */ + +extern FILE *lra_dump_file; + +extern bool lra_reg_spill_p; + +extern HARD_REG_SET lra_no_alloc_regs; + +extern int lra_insn_recog_data_len; +extern lra_insn_recog_data_t *lra_insn_recog_data; + +extern int lra_curr_reload_num; + +extern void lra_push_insn (rtx); +extern void lra_push_insn_by_uid (unsigned int); +extern void lra_push_insn_and_update_insn_regno_info (rtx); +extern rtx lra_pop_insn (void); +extern unsigned int lra_insn_stack_length (void); + +extern rtx lra_create_new_reg_with_unique_value (enum machine_mode, rtx, + enum reg_class, const char *); +extern void lra_set_regno_unique_value (int); +extern void lra_invalidate_insn_data (rtx); +extern void lra_set_insn_deleted (rtx); +extern void lra_delete_dead_insn (rtx); +extern void lra_emit_add (rtx, rtx, rtx); +extern void lra_emit_move (rtx, rtx); +extern void lra_update_dups (lra_insn_recog_data_t, signed char *); + +extern void lra_process_new_insns (rtx, rtx, rtx, const char *); + +extern 
lra_insn_recog_data_t lra_set_insn_recog_data (rtx); +extern lra_insn_recog_data_t lra_update_insn_recog_data (rtx); +extern void lra_set_used_insn_alternative (rtx, int); +extern void lra_set_used_insn_alternative_by_uid (int, int); + +extern void lra_invalidate_insn_regno_info (rtx); +extern void lra_update_insn_regno_info (rtx); +extern struct lra_insn_reg *lra_get_insn_regs (int); + +extern void lra_free_copies (void); +extern void lra_create_copy (int, int, int); +extern lra_copy_t lra_get_copy (int); +extern bool lra_former_scratch_p (int); +extern bool lra_former_scratch_operand_p (rtx, int); + +extern int lra_constraint_new_regno_start; +extern bitmap_head lra_inheritance_pseudos; +extern bitmap_head lra_split_regs; +extern bitmap_head lra_optional_reload_pseudos; +extern int lra_constraint_new_insn_uid_start; + +/* lra-constraints.c: */ + +extern int lra_constraint_offset (int, enum machine_mode); + +extern int lra_constraint_iter; +extern int lra_constraint_iter_after_spill; +extern bool lra_risky_transformations_p; +extern int lra_inheritance_iter; +extern int lra_undo_inheritance_iter; +extern bool lra_constraints (bool); +extern void lra_constraints_init (void); +extern void lra_constraints_finish (void); +extern void lra_inheritance (void); +extern bool lra_undo_inheritance (void); + +/* lra-lives.c: */ + +extern int lra_live_max_point; +extern int *lra_point_freq; + +extern int lra_hard_reg_usage[FIRST_PSEUDO_REGISTER]; + +extern int lra_live_range_iter; +extern void lra_create_live_ranges (bool); +extern lra_live_range_t lra_copy_live_range_list (lra_live_range_t); +extern lra_live_range_t lra_merge_live_ranges (lra_live_range_t, + lra_live_range_t); +extern bool lra_intersected_live_ranges_p (lra_live_range_t, + lra_live_range_t); +extern void lra_print_live_range_list (FILE *, lra_live_range_t); +extern void lra_debug_live_range_list (lra_live_range_t); +extern void lra_debug_pseudo_live_ranges (int); +extern void lra_debug_live_ranges (void); 
+extern void lra_clear_live_ranges (void); +extern void lra_live_ranges_init (void); +extern void lra_live_ranges_finish (void); +extern void lra_setup_reload_pseudo_preferenced_hard_reg (int, int, int); + +/* lra-assigns.c: */ + +extern void lra_setup_reg_renumber (int, int, bool); +extern bool lra_assign (void); + + +/* lra-coalesce.c: */ + +extern int lra_coalesce_iter; +extern bool lra_coalesce (void); + +/* lra-spills.c: */ + +extern bool lra_need_for_spills_p (void); +extern void lra_spill (void); +extern void lra_hard_reg_substitution (void); + + +/* lra-elimination.c: */ + +extern void lra_debug_elim_table (void); +extern int lra_get_elimination_hard_regno (int); +extern rtx lra_eliminate_regs_1 (rtx, enum machine_mode, bool, bool, bool); +extern void lra_eliminate (bool); + +extern void lra_eliminate_reg_if_possible (rtx *); + + + +/* Update insn operands which are duplication of NOP operand. The + insn is represented by its LRA internal representation ID. */ +static inline void +lra_update_dup (lra_insn_recog_data_t id, int nop) +{ + int i; + struct lra_static_insn_data *static_id = id->insn_static_data; + + for (i = 0; i < static_id->n_dups; i++) + if (static_id->dup_num[i] == nop) + *id->dup_loc[i] = *id->operand_loc[nop]; +} + +/* Process operator duplications in insn with ID. We do it after the + operands processing. Generally speaking, we could do this probably + simultaneously with operands processing because a common practice + is to enumerate the operators after their operands. */ +static inline void +lra_update_operator_dups (lra_insn_recog_data_t id) +{ + int i; + struct lra_static_insn_data *static_id = id->insn_static_data; + + for (i = 0; i < static_id->n_dups; i++) + { + int ndup = static_id->dup_num[i]; + + if (static_id->operand[ndup].is_operator) + *id->dup_loc[i] = *id->operand_loc[ndup]; + } +} + +/* Return info about INSN. Set up the info if it is not done yet. 
*/ +static inline lra_insn_recog_data_t +lra_get_insn_recog_data (rtx insn) +{ + lra_insn_recog_data_t data; + unsigned int uid = INSN_UID (insn); + + if (lra_insn_recog_data_len > (int) uid + && (data = lra_insn_recog_data[uid]) != NULL) + { + /* Check that we did not change insn without updating the insn + info. */ + lra_assert (data->insn == insn + && (INSN_CODE (insn) < 0 + || data->icode == INSN_CODE (insn))); + return data; + } + return lra_set_insn_recog_data (insn); +} + + + +struct target_lra_int +{ + /* Map INSN_UID -> the operand alternative data (NULL if unknown). + We assume that this data is valid until register info is changed + because classes in the data can be changed. */ + struct operand_alternative *x_op_alt_data[LAST_INSN_CODE]; +}; + +extern struct target_lra_int default_target_lra_int; +#if SWITCHABLE_TARGET +extern struct target_lra_int *this_target_lra_int; +#else +#define this_target_lra_int (&default_target_lra_int) +#endif + +#define op_alt_data (this_target_lra_int->x_op_alt_data) diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c new file mode 100644 index 00000000000..6e00250c7ed --- /dev/null +++ b/gcc/lra-lives.c @@ -0,0 +1,1010 @@ +/* Build live ranges for pseudos. + Copyright (C) 2010, 2011, 2012 + Free Software Foundation, Inc. + Contributed by Vladimir Makarov <vmakarov@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. 
*/ + + +/* This file contains code to build pseudo live-ranges (analogous + structures used in IRA, so read comments about the live-ranges + there) and other info necessary for other passes to assign + hard-registers to pseudos, coalesce the spilled pseudos, and assign + stack memory slots to spilled pseudos. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "hard-reg-set.h" +#include "rtl.h" +#include "tm_p.h" +#include "insn-config.h" +#include "recog.h" +#include "output.h" +#include "regs.h" +#include "function.h" +#include "expr.h" +#include "basic-block.h" +#include "except.h" +#include "df.h" +#include "ira.h" +#include "sparseset.h" +#include "lra-int.h" + +/* Program points are enumerated by numbers from range + 0..LRA_LIVE_MAX_POINT-1. There are approximately two times more + program points than insns. Program points are places in the + program where liveness info can be changed. In most general case + (there are more complicated cases too) some program points + correspond to places where input operand dies and other ones + correspond to places where output operands are born. */ +int lra_live_max_point; + +/* Accumulated execution frequency of all references for each hard + register. */ +int lra_hard_reg_usage[FIRST_PSEUDO_REGISTER]; + +/* A global flag whose true value says to build live ranges for all + pseudos, otherwise the live ranges only for pseudos got memory is + build. True value means also building copies and setting up hard + register preferences. The complete info is necessary only for the + assignment pass. The complete info is not needed for the + coalescing and spill passes. */ +static bool complete_info_p; + +/* Pseudos live at current point in the RTL scan. */ +static sparseset pseudos_live; + +/* Pseudos probably living through calls and setjumps. 
As setjump is + a call too, if a bit in PSEUDOS_LIVE_THROUGH_SETJUMPS is set up + then the corresponding bit in PSEUDOS_LIVE_THROUGH_CALLS is set up + too. These data are necessary for cases when only one subreg of a + multi-reg pseudo is set up after a call. So we decide it is + probably live when traversing bb backward. We are sure about + living when we see its usage or definition of the pseudo. */ +static sparseset pseudos_live_through_calls; +static sparseset pseudos_live_through_setjumps; + +/* Set of hard regs (except eliminable ones) currently live. */ +static HARD_REG_SET hard_regs_live; + +/* Set of pseudos and hard registers start living/dying in the current + insn. These sets are used to update REG_DEAD and REG_UNUSED notes + in the insn. */ +static sparseset start_living, start_dying; + +/* Set of pseudos and hard regs dead and unused in the current + insn. */ +static sparseset unused_set, dead_set; + +/* Pool for pseudo live ranges. */ +static alloc_pool live_range_pool; + +/* Free live range LR. */ +static void +free_live_range (lra_live_range_t lr) +{ + pool_free (live_range_pool, lr); +} + +/* Free live range list LR. */ +static void +free_live_range_list (lra_live_range_t lr) +{ + lra_live_range_t next; + + while (lr != NULL) + { + next = lr->next; + free_live_range (lr); + lr = next; + } +} + +/* Create and return pseudo live range with given attributes. */ +static lra_live_range_t +create_live_range (int regno, int start, int finish, lra_live_range_t next) +{ + lra_live_range_t p; + + p = (lra_live_range_t) pool_alloc (live_range_pool); + p->regno = regno; + p->start = start; + p->finish = finish; + p->next = next; + return p; +} + +/* Copy live range R and return the result. */ +static lra_live_range_t +copy_live_range (lra_live_range_t r) +{ + lra_live_range_t p; + + p = (lra_live_range_t) pool_alloc (live_range_pool); + *p = *r; + return p; +} + +/* Copy live range list given by its head R and return the result. 
*/ +lra_live_range_t +lra_copy_live_range_list (lra_live_range_t r) +{ + lra_live_range_t p, first, *chain; + + first = NULL; + for (chain = &first; r != NULL; r = r->next) + { + p = copy_live_range (r); + *chain = p; + chain = &p->next; + } + return first; +} + +/* Merge *non-intersected* ranges R1 and R2 and returns the result. + The function maintains the order of ranges and tries to minimize + size of the result range list. Ranges R1 and R2 may not be used + after the call. */ +lra_live_range_t +lra_merge_live_ranges (lra_live_range_t r1, lra_live_range_t r2) +{ + lra_live_range_t first, last, temp; + + if (r1 == NULL) + return r2; + if (r2 == NULL) + return r1; + for (first = last = NULL; r1 != NULL && r2 != NULL;) + { + if (r1->start < r2->start) + { + temp = r1; + r1 = r2; + r2 = temp; + } + if (r1->start == r2->finish + 1) + { + /* Joint ranges: merge r1 and r2 into r1. */ + r1->start = r2->start; + temp = r2; + r2 = r2->next; + pool_free (live_range_pool, temp); + } + else + { + gcc_assert (r2->finish + 1 < r1->start); + /* Add r1 to the result. */ + if (first == NULL) + first = last = r1; + else + { + last->next = r1; + last = r1; + } + r1 = r1->next; + } + } + if (r1 != NULL) + { + if (first == NULL) + first = r1; + else + last->next = r1; + } + else + { + lra_assert (r2 != NULL); + if (first == NULL) + first = r2; + else + last->next = r2; + } + return first; +} + +/* Return TRUE if live ranges R1 and R2 intersect. */ +bool +lra_intersected_live_ranges_p (lra_live_range_t r1, lra_live_range_t r2) +{ + /* Remember the live ranges are always kept ordered. */ + while (r1 != NULL && r2 != NULL) + { + if (r1->start > r2->finish) + r1 = r1->next; + else if (r2->start > r1->finish) + r2 = r2->next; + else + return true; + } + return false; +} + +/* The function processing birth of hard register REGNO. It updates + living hard regs, conflict hard regs for living pseudos, and + START_LIVING. 
*/ +static void +make_hard_regno_born (int regno) +{ + unsigned int i; + + lra_assert (regno < FIRST_PSEUDO_REGISTER); + if (TEST_HARD_REG_BIT (lra_no_alloc_regs, regno) + || TEST_HARD_REG_BIT (hard_regs_live, regno)) + return; + SET_HARD_REG_BIT (hard_regs_live, regno); + sparseset_set_bit (start_living, regno); + EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i) + SET_HARD_REG_BIT (lra_reg_info[i].conflict_hard_regs, regno); +} + +/* Process the death of hard register REGNO. This updates + hard_regs_live and START_DYING. */ +static void +make_hard_regno_dead (int regno) +{ + lra_assert (regno < FIRST_PSEUDO_REGISTER); + if (TEST_HARD_REG_BIT (lra_no_alloc_regs, regno) + || ! TEST_HARD_REG_BIT (hard_regs_live, regno)) + return; + sparseset_set_bit (start_dying, regno); + CLEAR_HARD_REG_BIT (hard_regs_live, regno); +} + +/* Mark pseudo REGNO as living at program point POINT, update conflicting + hard registers of the pseudo and START_LIVING, and start a new live + range for the pseudo corresponding to REGNO if it is necessary. */ +static void +mark_pseudo_live (int regno, int point) +{ + lra_live_range_t p; + + lra_assert (regno >= FIRST_PSEUDO_REGISTER); + lra_assert (! sparseset_bit_p (pseudos_live, regno)); + sparseset_set_bit (pseudos_live, regno); + IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs, hard_regs_live); + + if ((complete_info_p || lra_get_regno_hard_regno (regno) < 0) + && ((p = lra_reg_info[regno].live_ranges) == NULL + || (p->finish != point && p->finish + 1 != point))) + lra_reg_info[regno].live_ranges + = create_live_range (regno, point, -1, p); + sparseset_set_bit (start_living, regno); +} + +/* Mark pseudo REGNO as not living at program point POINT and update + START_DYING. + This finishes the current live range for the pseudo corresponding + to REGNO. 
*/ +static void +mark_pseudo_dead (int regno, int point) +{ + lra_live_range_t p; + + lra_assert (regno >= FIRST_PSEUDO_REGISTER); + lra_assert (sparseset_bit_p (pseudos_live, regno)); + sparseset_clear_bit (pseudos_live, regno); + sparseset_set_bit (start_dying, regno); + if (complete_info_p || lra_get_regno_hard_regno (regno) < 0) + { + p = lra_reg_info[regno].live_ranges; + lra_assert (p != NULL); + p->finish = point; + } +} + +/* Mark register REGNO (pseudo or hard register) in MODE as live + at program point POINT. + Return TRUE if the liveness tracking sets were modified, + or FALSE if nothing changed. */ +static bool +mark_regno_live (int regno, enum machine_mode mode, int point) +{ + int last; + bool changed = false; + + if (regno < FIRST_PSEUDO_REGISTER) + { + for (last = regno + hard_regno_nregs[regno][mode]; + regno < last; + regno++) + make_hard_regno_born (regno); + } + else if (! sparseset_bit_p (pseudos_live, regno)) + { + mark_pseudo_live (regno, point); + changed = true; + } + return changed; +} + + +/* Mark register REGNO in MODE as dead at program point POINT. + Return TRUE if the liveness tracking sets were modified, + or FALSE if nothing changed. */ +static bool +mark_regno_dead (int regno, enum machine_mode mode, int point) +{ + int last; + bool changed = false; + + if (regno < FIRST_PSEUDO_REGISTER) + { + for (last = regno + hard_regno_nregs[regno][mode]; + regno < last; + regno++) + make_hard_regno_dead (regno); + } + else if (sparseset_bit_p (pseudos_live, regno)) + { + mark_pseudo_dead (regno, point); + changed = true; + } + return changed; +} + +/* Insn currently scanned. */ +static rtx curr_insn; +/* The insn data. */ +static lra_insn_recog_data_t curr_id; +/* The insn static data. */ +static struct lra_static_insn_data *curr_static_id; + +/* Return true when one of the predecessor edges of BB is marked with + EDGE_ABNORMAL_CALL or EDGE_EH. 
*/ +static bool +bb_has_abnormal_call_pred (basic_block bb) +{ + edge e; + edge_iterator ei; + + FOR_EACH_EDGE (e, ei, bb->preds) + { + if (e->flags & (EDGE_ABNORMAL_CALL | EDGE_EH)) + return true; + } + return false; +} + +/* Vec containing execution frequencies of program points. */ +static VEC(int,heap) *point_freq_vec; + +/* The start of the above vector elements. */ +int *lra_point_freq; + +/* Increment the current program point POINT to the next point which has + execution frequency FREQ. */ +static void +next_program_point (int &point, int freq) +{ + VEC_safe_push (int, heap, point_freq_vec, freq); + lra_point_freq = VEC_address (int, point_freq_vec); + point++; +} + +/* Update the preference of HARD_REGNO for pseudo REGNO by PROFIT. */ +void +lra_setup_reload_pseudo_preferenced_hard_reg (int regno, + int hard_regno, int profit) +{ + lra_assert (regno >= lra_constraint_new_regno_start); + if (lra_reg_info[regno].preferred_hard_regno1 == hard_regno) + lra_reg_info[regno].preferred_hard_regno_profit1 += profit; + else if (lra_reg_info[regno].preferred_hard_regno2 == hard_regno) + lra_reg_info[regno].preferred_hard_regno_profit2 += profit; + else if (lra_reg_info[regno].preferred_hard_regno1 < 0) + { + lra_reg_info[regno].preferred_hard_regno1 = hard_regno; + lra_reg_info[regno].preferred_hard_regno_profit1 = profit; + } + else if (lra_reg_info[regno].preferred_hard_regno2 < 0 + || profit > lra_reg_info[regno].preferred_hard_regno_profit2) + { + lra_reg_info[regno].preferred_hard_regno2 = hard_regno; + lra_reg_info[regno].preferred_hard_regno_profit2 = profit; + } + else + return; + /* Keep the 1st hard regno as more profitable. 
*/ + if (lra_reg_info[regno].preferred_hard_regno1 >= 0 + && lra_reg_info[regno].preferred_hard_regno2 >= 0 + && (lra_reg_info[regno].preferred_hard_regno_profit2 + > lra_reg_info[regno].preferred_hard_regno_profit1)) + { + int temp; + + temp = lra_reg_info[regno].preferred_hard_regno1; + lra_reg_info[regno].preferred_hard_regno1 + = lra_reg_info[regno].preferred_hard_regno2; + lra_reg_info[regno].preferred_hard_regno2 = temp; + temp = lra_reg_info[regno].preferred_hard_regno_profit1; + lra_reg_info[regno].preferred_hard_regno_profit1 + = lra_reg_info[regno].preferred_hard_regno_profit2; + lra_reg_info[regno].preferred_hard_regno_profit2 = temp; + } + if (lra_dump_file != NULL) + { + if ((hard_regno = lra_reg_info[regno].preferred_hard_regno1) >= 0) + fprintf (lra_dump_file, + " Hard reg %d is preferable by r%d with profit %d\n", + hard_regno, regno, + lra_reg_info[regno].preferred_hard_regno_profit1); + if ((hard_regno = lra_reg_info[regno].preferred_hard_regno2) >= 0) + fprintf (lra_dump_file, + " Hard reg %d is preferable by r%d with profit %d\n", + hard_regno, regno, + lra_reg_info[regno].preferred_hard_regno_profit2); + } +} + +/* Check that REGNO living through calls and setjumps, set up conflict + regs, and clear corresponding bits in PSEUDOS_LIVE_THROUGH_CALLS and + PSEUDOS_LIVE_THROUGH_SETJUMPS. */ +static inline void +check_pseudos_live_through_calls (int regno) +{ + if (! sparseset_bit_p (pseudos_live_through_calls, regno)) + return; + sparseset_clear_bit (pseudos_live_through_calls, regno); + IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs, + call_used_reg_set); +#ifdef ENABLE_CHECKING + lra_reg_info[regno].call_p = true; +#endif + if (! sparseset_bit_p (pseudos_live_through_setjumps, regno)) + return; + sparseset_clear_bit (pseudos_live_through_setjumps, regno); + /* Don't allocate pseudos that cross setjmps or any call, if this + function receives a nonlocal goto. 
*/ + SET_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs); +} + +/* Process insns of the basic block BB to update pseudo live ranges, + pseudo hard register conflicts, and insn notes. We do it on + backward scan of BB insns. CURR_POINT is the program point where + BB ends. The function updates this counter and returns in + CURR_POINT the program point where BB starts. */ +static void +process_bb_lives (basic_block bb, int &curr_point) +{ + int i, regno, freq; + unsigned int j; + bitmap_iterator bi; + bitmap reg_live_out; + unsigned int px; + rtx link, *link_loc; + bool need_curr_point_incr; + + reg_live_out = df_get_live_out (bb); + sparseset_clear (pseudos_live); + sparseset_clear (pseudos_live_through_calls); + sparseset_clear (pseudos_live_through_setjumps); + REG_SET_TO_HARD_REG_SET (hard_regs_live, reg_live_out); + AND_COMPL_HARD_REG_SET (hard_regs_live, eliminable_regset); + AND_COMPL_HARD_REG_SET (hard_regs_live, lra_no_alloc_regs); + EXECUTE_IF_SET_IN_BITMAP (reg_live_out, FIRST_PSEUDO_REGISTER, j, bi) + mark_pseudo_live (j, curr_point); + + freq = REG_FREQ_FROM_BB (bb); + + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " BB %d\n", bb->index); + + /* Scan the code of this basic block, noting which pseudos and hard + regs are born or die. + + Note that this loop treats uninitialized values as live until the + beginning of the block. For example, if an instruction uses + (reg:DI foo), and only (subreg:SI (reg:DI foo) 0) is ever set, + FOO will remain live until the beginning of the block. Likewise + if FOO is not set at all. This is unnecessarily pessimistic, but + it probably doesn't matter much in practice. 
*/ + FOR_BB_INSNS_REVERSE (bb, curr_insn) + { + bool call_p; + int dst_regno, src_regno; + rtx set; + struct lra_insn_reg *reg; + + if (!NONDEBUG_INSN_P (curr_insn)) + continue; + + curr_id = lra_get_insn_recog_data (curr_insn); + curr_static_id = curr_id->insn_static_data; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Insn %u: point = %d\n", + INSN_UID (curr_insn), curr_point); + + /* Update max ref width and hard reg usage. */ + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + if (reg->regno >= FIRST_PSEUDO_REGISTER + && (GET_MODE_SIZE (reg->biggest_mode) + > GET_MODE_SIZE (lra_reg_info[reg->regno].biggest_mode))) + lra_reg_info[reg->regno].biggest_mode = reg->biggest_mode; + else if (reg->regno < FIRST_PSEUDO_REGISTER) + lra_hard_reg_usage[reg->regno] += freq; + + call_p = CALL_P (curr_insn); + if (complete_info_p + && (set = single_set (curr_insn)) != NULL_RTX + && REG_P (SET_DEST (set)) && REG_P (SET_SRC (set)) + /* Check that source regno does not conflict with + destination regno to exclude most impossible + preferences. */ + && ((((src_regno = REGNO (SET_SRC (set))) >= FIRST_PSEUDO_REGISTER + && ! sparseset_bit_p (pseudos_live, src_regno)) + || (src_regno < FIRST_PSEUDO_REGISTER + && ! TEST_HARD_REG_BIT (hard_regs_live, src_regno))) + /* It might be 'inheritance pseudo <- reload pseudo'. 
*/ + || (src_regno >= lra_constraint_new_regno_start + && ((int) REGNO (SET_DEST (set)) + >= lra_constraint_new_regno_start)))) + { + int hard_regno = -1, regno = -1; + + dst_regno = REGNO (SET_DEST (set)); + if (dst_regno >= lra_constraint_new_regno_start + && src_regno >= lra_constraint_new_regno_start) + lra_create_copy (dst_regno, src_regno, freq); + else if (dst_regno >= lra_constraint_new_regno_start) + { + if ((hard_regno = src_regno) >= FIRST_PSEUDO_REGISTER) + hard_regno = reg_renumber[src_regno]; + regno = dst_regno; + } + else if (src_regno >= lra_constraint_new_regno_start) + { + if ((hard_regno = dst_regno) >= FIRST_PSEUDO_REGISTER) + hard_regno = reg_renumber[dst_regno]; + regno = src_regno; + } + if (regno >= 0 && hard_regno >= 0) + lra_setup_reload_pseudo_preferenced_hard_reg + (regno, hard_regno, freq); + } + + sparseset_clear (start_living); + + /* Try to avoid unnecessary program point increments, this saves + a lot of time in remove_some_program_points_and_update_live_ranges. + We only need an increment if something becomes live or dies at this + program point. */ + need_curr_point_incr = false; + + /* Mark each defined value as live. We need to do this for + unused values because they still conflict with quantities + that are live at the time of the definition. */ + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + if (reg->type != OP_IN) + { + need_curr_point_incr |= mark_regno_live (reg->regno, + reg->biggest_mode, + curr_point); + check_pseudos_live_through_calls (reg->regno); + } + + for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next) + if (reg->type != OP_IN) + make_hard_regno_born (reg->regno); + + sparseset_copy (unused_set, start_living); + + sparseset_clear (start_dying); + + /* See which defined values die here. */ + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + if (reg->type == OP_OUT && ! reg->early_clobber && ! 
reg->subreg_p) + need_curr_point_incr |= mark_regno_dead (reg->regno, + reg->biggest_mode, + curr_point); + + for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next) + if (reg->type == OP_OUT && ! reg->early_clobber && ! reg->subreg_p) + make_hard_regno_dead (reg->regno); + + if (call_p) + { + sparseset_ior (pseudos_live_through_calls, + pseudos_live_through_calls, pseudos_live); + if (cfun->has_nonlocal_label + || find_reg_note (curr_insn, REG_SETJMP, + NULL_RTX) != NULL_RTX) + sparseset_ior (pseudos_live_through_setjumps, + pseudos_live_through_setjumps, pseudos_live); + } + + /* Increment the current program point if we must. */ + if (need_curr_point_incr) + next_program_point (curr_point, freq); + + sparseset_clear (start_living); + + need_curr_point_incr = false; + + /* Mark each used value as live. */ + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + if (reg->type == OP_IN) + { + need_curr_point_incr |= mark_regno_live (reg->regno, + reg->biggest_mode, + curr_point); + check_pseudos_live_through_calls (reg->regno); + } + + for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next) + if (reg->type == OP_IN) + make_hard_regno_born (reg->regno); + + if (curr_id->arg_hard_regs != NULL) + /* Make argument hard registers live. */ + for (i = 0; (regno = curr_id->arg_hard_regs[i]) >= 0; i++) + make_hard_regno_born (regno); + + sparseset_and_compl (dead_set, start_living, start_dying); + + /* Mark early clobber outputs dead. */ + for (reg = curr_id->regs; reg != NULL; reg = reg->next) + if (reg->type == OP_OUT && reg->early_clobber && ! reg->subreg_p) + need_curr_point_incr = mark_regno_dead (reg->regno, + reg->biggest_mode, + curr_point); + + for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next) + if (reg->type == OP_OUT && reg->early_clobber && ! reg->subreg_p) + make_hard_regno_dead (reg->regno); + + if (need_curr_point_incr) + next_program_point (curr_point, freq); + + /* Update notes. 
*/ + for (link_loc = &REG_NOTES (curr_insn); (link = *link_loc) != NULL_RTX;) + { + if (REG_NOTE_KIND (link) != REG_DEAD + && REG_NOTE_KIND (link) != REG_UNUSED) + ; + else if (REG_P (XEXP (link, 0))) + { + regno = REGNO (XEXP (link, 0)); + if ((REG_NOTE_KIND (link) == REG_DEAD + && ! sparseset_bit_p (dead_set, regno)) + || (REG_NOTE_KIND (link) == REG_UNUSED + && ! sparseset_bit_p (unused_set, regno))) + { + *link_loc = XEXP (link, 1); + continue; + } + if (REG_NOTE_KIND (link) == REG_DEAD) + sparseset_clear_bit (dead_set, regno); + else if (REG_NOTE_KIND (link) == REG_UNUSED) + sparseset_clear_bit (unused_set, regno); + } + link_loc = &XEXP (link, 1); + } + EXECUTE_IF_SET_IN_SPARSESET (dead_set, j) + add_reg_note (curr_insn, REG_DEAD, regno_reg_rtx[j]); + EXECUTE_IF_SET_IN_SPARSESET (unused_set, j) + add_reg_note (curr_insn, REG_UNUSED, regno_reg_rtx[j]); + } + +#ifdef EH_RETURN_DATA_REGNO + if (bb_has_eh_pred (bb)) + for (j = 0; ; ++j) + { + unsigned int regno = EH_RETURN_DATA_REGNO (j); + + if (regno == INVALID_REGNUM) + break; + make_hard_regno_born (regno); + } +#endif + + /* Pseudos can't go in stack regs at the start of a basic block that + is reached by an abnormal edge. Likewise for call clobbered regs, + because caller-save, fixup_abnormal_edges and possibly the table + driven EH machinery are not quite ready to handle such pseudos + live across such edges. */ + if (bb_has_abnormal_pred (bb)) + { +#ifdef STACK_REGS + EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, px) + lra_reg_info[px].no_stack_p = true; + for (px = FIRST_STACK_REG; px <= LAST_STACK_REG; px++) + make_hard_regno_born (px); +#endif + /* No need to record conflicts for call clobbered regs if we + have nonlocal labels around, as we don't ever try to + allocate such regs in this case. 
*/ + if (!cfun->has_nonlocal_label && bb_has_abnormal_call_pred (bb)) + for (px = 0; px < FIRST_PSEUDO_REGISTER; px++) + if (call_used_regs[px]) + make_hard_regno_born (px); + } + + /* See if we'll need an increment at the end of this basic block. + An increment is needed if the PSEUDOS_LIVE set is not empty, + to make sure the finish points are set up correctly. */ + need_curr_point_incr = (sparseset_cardinality (pseudos_live) > 0); + + EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i) + mark_pseudo_dead (i, curr_point); + + EXECUTE_IF_SET_IN_BITMAP (df_get_live_in (bb), FIRST_PSEUDO_REGISTER, j, bi) + { + if (sparseset_cardinality (pseudos_live_through_calls) == 0) + break; + if (sparseset_bit_p (pseudos_live_through_calls, j)) + check_pseudos_live_through_calls (j); + } + + if (need_curr_point_incr) + next_program_point (curr_point, freq); +} + +/* Compress pseudo live ranges by removing program points where + nothing happens. Complexity of many algorithms in LRA is linear + function of program points number. To speed up the code we try to + minimize the number of the program points here. 
*/ +static void +remove_some_program_points_and_update_live_ranges (void) +{ + unsigned i; + int n, max_regno; + int *map; + lra_live_range_t r, prev_r, next_r; + sbitmap born_or_dead, born, dead; + sbitmap_iterator sbi; + bool born_p, dead_p, prev_born_p, prev_dead_p; + + born = sbitmap_alloc (lra_live_max_point); + dead = sbitmap_alloc (lra_live_max_point); + sbitmap_zero (born); + sbitmap_zero (dead); + max_regno = max_reg_num (); + for (i = FIRST_PSEUDO_REGISTER; i < (unsigned) max_regno; i++) + { + for (r = lra_reg_info[i].live_ranges; r != NULL; r = r->next) + { + lra_assert (r->start <= r->finish); + SET_BIT (born, r->start); + SET_BIT (dead, r->finish); + } + } + born_or_dead = sbitmap_alloc (lra_live_max_point); + sbitmap_a_or_b (born_or_dead, born, dead); + map = XCNEWVEC (int, lra_live_max_point); + n = -1; + prev_born_p = prev_dead_p = false; + EXECUTE_IF_SET_IN_SBITMAP (born_or_dead, 0, i, sbi) + { + born_p = TEST_BIT (born, i); + dead_p = TEST_BIT (dead, i); + if ((prev_born_p && ! prev_dead_p && born_p && ! dead_p) + || (prev_dead_p && ! prev_born_p && dead_p && ! 
born_p)) + { + map[i] = n; + lra_point_freq[n] = MAX (lra_point_freq[n], lra_point_freq[i]); + } + else + { + map[i] = ++n; + lra_point_freq[n] = lra_point_freq[i]; + } + prev_born_p = born_p; + prev_dead_p = dead_p; + } + sbitmap_free (born_or_dead); + sbitmap_free (born); + sbitmap_free (dead); + n++; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, "Compressing live ranges: from %d to %d - %d%%\n", + lra_live_max_point, n, 100 * n / lra_live_max_point); + if (n < lra_live_max_point) + { + lra_live_max_point = n; + for (i = FIRST_PSEUDO_REGISTER; i < (unsigned) max_regno; i++) + { + for (prev_r = NULL, r = lra_reg_info[i].live_ranges; + r != NULL; + r = next_r) + { + next_r = r->next; + r->start = map[r->start]; + r->finish = map[r->finish]; + if (prev_r == NULL || prev_r->start > r->finish + 1) + { + prev_r = r; + continue; + } + prev_r->start = r->start; + prev_r->next = next_r; + free_live_range (r); + } + } + } + free (map); +} + +/* Print live ranges R to file F. */ +void +lra_print_live_range_list (FILE *f, lra_live_range_t r) +{ + for (; r != NULL; r = r->next) + fprintf (f, " [%d..%d]", r->start, r->finish); + fprintf (f, "\n"); +} + +/* Print live ranges R to stderr. */ +void +lra_debug_live_range_list (lra_live_range_t r) +{ + lra_print_live_range_list (stderr, r); +} + +/* Print live ranges of pseudo REGNO to file F. */ +static void +print_pseudo_live_ranges (FILE *f, int regno) +{ + if (lra_reg_info[regno].live_ranges == NULL) + return; + fprintf (f, " r%d:", regno); + lra_print_live_range_list (f, lra_reg_info[regno].live_ranges); +} + +/* Print live ranges of pseudo REGNO to stderr. */ +void +lra_debug_pseudo_live_ranges (int regno) +{ + print_pseudo_live_ranges (stderr, regno); +} + +/* Print live ranges of all pseudos to file F. 
*/ +static void +print_live_ranges (FILE *f) +{ + int i, max_regno; + + max_regno = max_reg_num (); + for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) + print_pseudo_live_ranges (f, i); +} + +/* Print live ranges of all pseudos to stderr. */ +void +lra_debug_live_ranges (void) +{ + print_live_ranges (stderr); +} + +/* Compress pseudo live ranges. */ +static void +compress_live_ranges (void) +{ + remove_some_program_points_and_update_live_ranges (); + if (lra_dump_file != NULL) + { + fprintf (lra_dump_file, "Ranges after the compression:\n"); + print_live_ranges (lra_dump_file); + } +} + +/* The number of the current live range pass. */ +int lra_live_range_iter; + +/* The main entry function creates live ranges only for memory pseudos + (or for all ones if ALL_P), set up CONFLICT_HARD_REGS for + the pseudos. */ +void +lra_create_live_ranges (bool all_p) +{ + basic_block bb; + int i, hard_regno, max_regno = max_reg_num (); + int curr_point; + + timevar_push (TV_LRA_CREATE_LIVE_RANGES); + + complete_info_p = all_p; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + "\n********** Pseudo live ranges #%d: **********\n\n", + ++lra_live_range_iter); + memset (lra_hard_reg_usage, 0, sizeof (lra_hard_reg_usage)); + for (i = 0; i < max_regno; i++) + { + lra_reg_info[i].live_ranges = NULL; + CLEAR_HARD_REG_SET (lra_reg_info[i].conflict_hard_regs); + lra_reg_info[i].preferred_hard_regno1 = -1; + lra_reg_info[i].preferred_hard_regno2 = -1; + lra_reg_info[i].preferred_hard_regno_profit1 = 0; + lra_reg_info[i].preferred_hard_regno_profit2 = 0; +#ifdef STACK_REGS + lra_reg_info[i].no_stack_p = false; +#endif + if (regno_reg_rtx[i] != NULL_RTX) + lra_reg_info[i].biggest_mode = GET_MODE (regno_reg_rtx[i]); + else + lra_reg_info[i].biggest_mode = VOIDmode; +#ifdef ENABLE_CHECKING + lra_reg_info[i].call_p = false; +#endif + if (i >= FIRST_PSEUDO_REGISTER + && lra_reg_info[i].nrefs != 0 && (hard_regno = reg_renumber[i]) >= 0) + lra_hard_reg_usage[hard_regno] += 
lra_reg_info[i].freq; + } + lra_free_copies (); + pseudos_live = sparseset_alloc (max_regno); + pseudos_live_through_calls = sparseset_alloc (max_regno); + pseudos_live_through_setjumps = sparseset_alloc (max_regno); + start_living = sparseset_alloc (max_regno); + start_dying = sparseset_alloc (max_regno); + dead_set = sparseset_alloc (max_regno); + unused_set = sparseset_alloc (max_regno); + curr_point = 0; + point_freq_vec = VEC_alloc (int, heap, get_max_uid () * 2); + lra_point_freq = VEC_address (int, point_freq_vec); + int *post_order_rev_cfg = XNEWVEC (int, last_basic_block); + int n_blocks_inverted = inverted_post_order_compute (post_order_rev_cfg); + lra_assert (n_blocks_inverted == n_basic_blocks); + for (i = n_blocks_inverted - 1; i >= 0; --i) + { + bb = BASIC_BLOCK (post_order_rev_cfg[i]); + if (bb == EXIT_BLOCK_PTR || bb == ENTRY_BLOCK_PTR) + continue; + process_bb_lives (bb, curr_point); + } + free (post_order_rev_cfg); + lra_live_max_point = curr_point; + if (lra_dump_file != NULL) + print_live_ranges (lra_dump_file); + /* Clean up. */ + sparseset_free (unused_set); + sparseset_free (dead_set); + sparseset_free (start_dying); + sparseset_free (start_living); + sparseset_free (pseudos_live_through_calls); + sparseset_free (pseudos_live_through_setjumps); + sparseset_free (pseudos_live); + compress_live_ranges (); + timevar_pop (TV_LRA_CREATE_LIVE_RANGES); +} + +/* Finish all live ranges. */ +void +lra_clear_live_ranges (void) +{ + int i; + + for (i = 0; i < max_reg_num (); i++) + free_live_range_list (lra_reg_info[i].live_ranges); + VEC_free (int, heap, point_freq_vec); +} + +/* Initialize live ranges data once per function. */ +void +lra_live_ranges_init (void) +{ + live_range_pool = create_alloc_pool ("live ranges", + sizeof (struct lra_live_range), 100); +} + +/* Finish live ranges data once per function. 
*/ +void +lra_live_ranges_finish (void) +{ + free_alloc_pool (live_range_pool); +} diff --git a/gcc/lra-spills.c b/gcc/lra-spills.c new file mode 100644 index 00000000000..ecc1de4a4d9 --- /dev/null +++ b/gcc/lra-spills.c @@ -0,0 +1,611 @@ +/* Change pseudos by memory. + Copyright (C) 2010, 2011, 2012 + Free Software Foundation, Inc. + Contributed by Vladimir Makarov <vmakarov@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + + +/* This file contains code for a pass to change spilled pseudos into + memory. + + The pass creates necessary stack slots and assigns spilled pseudos + to the stack slots in following way: + + for all spilled pseudos P most frequently used first do + for all stack slots S do + if P doesn't conflict with pseudos assigned to S then + assign S to P and goto to the next pseudo process + end + end + create new stack slot S and assign P to S + end + + The actual algorithm is bit more complicated because of different + pseudo sizes. + + After that the code changes spilled pseudos (except ones created + from scratches) by corresponding stack slot memory in RTL. + + If at least one stack slot was created, we need to run more passes + because we have new addresses which should be checked and because + the old address displacements might change and address constraints + (or insn memory constraints) might not be satisfied any more. 
+ + For some targets, the pass can spill some pseudos into hard + registers of different class (usually into vector registers) + instead of spilling them into memory if it is possible and + profitable. Spilling GENERAL_REGS pseudo into SSE registers for + Intel Corei7 is an example of such optimization. And this is + actually recommended by Intel optimization guide. + + The file also contains code for final change of pseudos on hard + regs correspondingly assigned to them. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "rtl.h" +#include "tm_p.h" +#include "insn-config.h" +#include "recog.h" +#include "output.h" +#include "regs.h" +#include "hard-reg-set.h" +#include "flags.h" +#include "function.h" +#include "expr.h" +#include "basic-block.h" +#include "except.h" +#include "timevar.h" +#include "target.h" +#include "lra-int.h" +#include "ira.h" +#include "df.h" + + +/* Max regno at the start of the pass. */ +static int regs_num; + +/* Map spilled regno -> hard regno used instead of memory for + spilling. */ +static rtx *spill_hard_reg; + +/* The structure describes stack slot of a spilled pseudo. */ +struct pseudo_slot +{ + /* Number (0, 1, ...) of the stack slot to which given pseudo + belongs. */ + int slot_num; + /* First or next slot with the same slot number. */ + struct pseudo_slot *next, *first; + /* Memory representing the spilled pseudo. */ + rtx mem; +}; + +/* The stack slots for each spilled pseudo. Indexed by regnos. */ +static struct pseudo_slot *pseudo_slots; + +/* The structure describes a register or a stack slot which can be + used for several spilled pseudos. */ +struct slot +{ + /* First pseudo with given stack slot. */ + int regno; + /* Hard reg into which the slot pseudos are spilled. The value is + negative for pseudos spilled into memory. */ + int hard_regno; + /* Memory representing the all stack slot. 
It can be different from + memory representing a pseudo belonging to give stack slot because + pseudo can be placed in a part of the corresponding stack slot. + The value is NULL for pseudos spilled into a hard reg. */ + rtx mem; + /* Combined live ranges of all pseudos belonging to given slot. It + is used to figure out that a new spilled pseudo can use given + stack slot. */ + lra_live_range_t live_ranges; +}; + +/* Array containing info about the stack slots. The array element is + indexed by the stack slot number in the range [0..slots_num). */ +static struct slot *slots; +/* The number of the stack slots currently existing. */ +static int slots_num; + +/* Set up memory of the spilled pseudo I. The function can allocate + the corresponding stack slot if it is not done yet. */ +static void +assign_mem_slot (int i) +{ + rtx x = NULL_RTX; + enum machine_mode mode = GET_MODE (regno_reg_rtx[i]); + unsigned int inherent_size = PSEUDO_REGNO_BYTES (i); + unsigned int inherent_align = GET_MODE_ALIGNMENT (mode); + unsigned int max_ref_width = GET_MODE_SIZE (lra_reg_info[i].biggest_mode); + unsigned int total_size = MAX (inherent_size, max_ref_width); + unsigned int min_align = max_ref_width * BITS_PER_UNIT; + int adjust = 0; + + lra_assert (regno_reg_rtx[i] != NULL_RTX && REG_P (regno_reg_rtx[i]) + && lra_reg_info[i].nrefs != 0 && reg_renumber[i] < 0); + + x = slots[pseudo_slots[i].slot_num].mem; + + /* We can use a slot already allocated because it is guaranteed the + slot provides both enough inherent space and enough total + space. */ + if (x) + ; + /* Each pseudo has an inherent size which comes from its own mode, + and a total size which provides room for paradoxical subregs + which refer to the pseudo reg in wider modes. We allocate a new + slot, making sure that it has enough inherent space and total + space. */ + else + { + rtx stack_slot; + + /* No known place to spill from => no slot to reuse. 
*/ + x = assign_stack_local (mode, total_size, + min_align > inherent_align + || total_size > inherent_size ? -1 : 0); + x = lra_eliminate_regs_1 (x, GET_MODE (x), false, false, true); + stack_slot = x; + /* Cancel the big-endian correction done in assign_stack_local. + Get the address of the beginning of the slot. This is so we + can do a big-endian correction unconditionally below. */ + if (BYTES_BIG_ENDIAN) + { + adjust = inherent_size - total_size; + if (adjust) + stack_slot + = adjust_address_nv (x, + mode_for_size (total_size * BITS_PER_UNIT, + MODE_INT, 1), + adjust); + } + slots[pseudo_slots[i].slot_num].mem = stack_slot; + } + + /* On a big endian machine, the "address" of the slot is the address + of the low part that fits its inherent mode. */ + if (BYTES_BIG_ENDIAN && inherent_size < total_size) + adjust += (total_size - inherent_size); + + x = adjust_address_nv (x, GET_MODE (regno_reg_rtx[i]), adjust); + + /* Set all of the memory attributes as appropriate for a spill. */ + set_mem_attrs_for_spill (x); + pseudo_slots[i].mem = x; +} + +/* Sort pseudos according their usage frequencies. */ +static int +regno_freq_compare (const void *v1p, const void *v2p) +{ + const int regno1 = *(const int *) v1p; + const int regno2 = *(const int *) v2p; + int diff; + + if ((diff = lra_reg_info[regno2].freq - lra_reg_info[regno1].freq) != 0) + return diff; + return regno1 - regno2; +} + +/* Redefine STACK_GROWS_DOWNWARD in terms of 0 or 1. */ +#ifdef STACK_GROWS_DOWNWARD +# undef STACK_GROWS_DOWNWARD +# define STACK_GROWS_DOWNWARD 1 +#else +# define STACK_GROWS_DOWNWARD 0 +#endif + +/* Sort pseudos according to their slots, putting the slots in the order + that they should be allocated. Slots with lower numbers have the highest + priority and should get the smallest displacement from the stack or + frame pointer (whichever is being used). + + The first allocated slot is always closest to the frame pointer, + so prefer lower slot numbers when frame_pointer_needed. 
If the stack + and frame grow in the same direction, then the first allocated slot is + always closest to the initial stack pointer and furthest away from the + final stack pointer, so allocate higher numbers first when using the + stack pointer in that case. The reverse is true if the stack and + frame grow in opposite directions. */ +static int +pseudo_reg_slot_compare (const void *v1p, const void *v2p) +{ + const int regno1 = *(const int *) v1p; + const int regno2 = *(const int *) v2p; + int diff, slot_num1, slot_num2; + int total_size1, total_size2; + + slot_num1 = pseudo_slots[regno1].slot_num; + slot_num2 = pseudo_slots[regno2].slot_num; + if ((diff = slot_num1 - slot_num2) != 0) + return (frame_pointer_needed + || !FRAME_GROWS_DOWNWARD == STACK_GROWS_DOWNWARD ? diff : -diff); + total_size1 = GET_MODE_SIZE (lra_reg_info[regno1].biggest_mode); + total_size2 = GET_MODE_SIZE (lra_reg_info[regno2].biggest_mode); + if ((diff = total_size2 - total_size1) != 0) + return diff; + return regno1 - regno2; +} + +/* Assign spill hard registers to N pseudos in PSEUDO_REGNOS which is + sorted in order of highest frequency first. Put the pseudos which + did not get a spill hard register at the beginning of array + PSEUDO_REGNOS. Return the number of such pseudos. */ +static int +assign_spill_hard_regs (int *pseudo_regnos, int n) +{ + int i, k, p, regno, res, spill_class_size, hard_regno, nr; + enum reg_class rclass, spill_class; + enum machine_mode mode; + lra_live_range_t r; + rtx insn, set; + basic_block bb; + HARD_REG_SET conflict_hard_regs; + bitmap_head ok_insn_bitmap; + bitmap setjump_crosses = regstat_get_setjmp_crosses (); + /* Hard registers which can not be used for any purpose at given + program point because they are unallocatable or already allocated + for other pseudos. */ + HARD_REG_SET *reserved_hard_regs; + + if (! lra_reg_spill_p) + return n; + /* Set up reserved hard regs for every program point. 
*/ + reserved_hard_regs = XNEWVEC (HARD_REG_SET, lra_live_max_point); + for (p = 0; p < lra_live_max_point; p++) + COPY_HARD_REG_SET (reserved_hard_regs[p], lra_no_alloc_regs); + for (i = FIRST_PSEUDO_REGISTER; i < regs_num; i++) + if (lra_reg_info[i].nrefs != 0 + && (hard_regno = lra_get_regno_hard_regno (i)) >= 0) + for (r = lra_reg_info[i].live_ranges; r != NULL; r = r->next) + for (p = r->start; p <= r->finish; p++) + add_to_hard_reg_set (&reserved_hard_regs[p], + lra_reg_info[i].biggest_mode, hard_regno); + bitmap_initialize (&ok_insn_bitmap, &reg_obstack); + FOR_EACH_BB (bb) + FOR_BB_INSNS (bb, insn) + if (DEBUG_INSN_P (insn) + || ((set = single_set (insn)) != NULL_RTX + && REG_P (SET_SRC (set)) && REG_P (SET_DEST (set)))) + bitmap_set_bit (&ok_insn_bitmap, INSN_UID (insn)); + for (res = i = 0; i < n; i++) + { + regno = pseudo_regnos[i]; + rclass = lra_get_allocno_class (regno); + if (bitmap_bit_p (setjump_crosses, regno) + || (spill_class + = ((enum reg_class) + targetm.spill_class ((reg_class_t) rclass, + PSEUDO_REGNO_MODE (regno)))) == NO_REGS + || bitmap_intersect_compl_p (&lra_reg_info[regno].insn_bitmap, + &ok_insn_bitmap)) + { + pseudo_regnos[res++] = regno; + continue; + } + lra_assert (spill_class != NO_REGS); + COPY_HARD_REG_SET (conflict_hard_regs, + lra_reg_info[regno].conflict_hard_regs); + for (r = lra_reg_info[regno].live_ranges; r != NULL; r = r->next) + for (p = r->start; p <= r->finish; p++) + IOR_HARD_REG_SET (conflict_hard_regs, reserved_hard_regs[p]); + spill_class_size = ira_class_hard_regs_num[spill_class]; + mode = lra_reg_info[regno].biggest_mode; + for (k = 0; k < spill_class_size; k++) + { + hard_regno = ira_class_hard_regs[spill_class][k]; + if (! overlaps_hard_reg_set_p (conflict_hard_regs, mode, hard_regno)) + break; + } + if (k >= spill_class_size) + { + /* There is no available regs -- assign memory later. 
*/ + pseudo_regnos[res++] = regno; + continue; + } + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Spill r%d into hr%d\n", regno, hard_regno); + /* Update reserved_hard_regs. */ + for (r = lra_reg_info[regno].live_ranges; r != NULL; r = r->next) + for (p = r->start; p <= r->finish; p++) + add_to_hard_reg_set (&reserved_hard_regs[p], + lra_reg_info[regno].biggest_mode, hard_regno); + spill_hard_reg[regno] + = gen_raw_REG (PSEUDO_REGNO_MODE (regno), hard_regno); + for (nr = 0; + nr < hard_regno_nregs[hard_regno][lra_reg_info[regno].biggest_mode]; + nr++) + /* Just loop. */; + df_set_regs_ever_live (hard_regno + nr, true); + } + bitmap_clear (&ok_insn_bitmap); + free (reserved_hard_regs); + return res; +} + +/* Add pseudo REGNO to slot SLOT_NUM. */ +static void +add_pseudo_to_slot (int regno, int slot_num) +{ + struct pseudo_slot *first; + + if (slots[slot_num].regno < 0) + { + /* It is the first pseudo in the slot. */ + slots[slot_num].regno = regno; + pseudo_slots[regno].first = &pseudo_slots[regno]; + pseudo_slots[regno].next = NULL; + } + else + { + first = pseudo_slots[regno].first = &pseudo_slots[slots[slot_num].regno]; + pseudo_slots[regno].next = first->next; + first->next = &pseudo_slots[regno]; + } + pseudo_slots[regno].mem = NULL_RTX; + pseudo_slots[regno].slot_num = slot_num; + slots[slot_num].live_ranges + = lra_merge_live_ranges (slots[slot_num].live_ranges, + lra_copy_live_range_list + (lra_reg_info[regno].live_ranges)); +} + +/* Assign stack slot numbers to pseudos in array PSEUDO_REGNOS of + length N. Sort pseudos in PSEUDO_REGNOS for subsequent assigning + memory stack slots. */ +static void +assign_stack_slot_num_and_sort_pseudos (int *pseudo_regnos, int n) +{ + int i, j, regno; + + slots_num = 0; + /* Assign stack slot numbers to spilled pseudos, use smaller numbers + for most frequently used pseudos. */ + for (i = 0; i < n; i++) + { + regno = pseudo_regnos[i]; + if (! 
flag_ira_share_spill_slots) + j = slots_num; + else + { + for (j = 0; j < slots_num; j++) + if (slots[j].hard_regno < 0 + && ! (lra_intersected_live_ranges_p + (slots[j].live_ranges, + lra_reg_info[regno].live_ranges))) + break; + } + if (j >= slots_num) + { + /* New slot. */ + slots[j].live_ranges = NULL; + slots[j].regno = slots[j].hard_regno = -1; + slots[j].mem = NULL_RTX; + slots_num++; + } + add_pseudo_to_slot (regno, j); + } + /* Sort regnos according to their slot numbers. */ + qsort (pseudo_regnos, n, sizeof (int), pseudo_reg_slot_compare); +} + +/* Recursively process LOC in INSN and change spilled pseudos to the + corresponding memory or spilled hard reg. Ignore spilled pseudos + created from the scratches. */ +static void +remove_pseudos (rtx *loc, rtx insn) +{ + int i; + rtx hard_reg; + const char *fmt; + enum rtx_code code; + + if (*loc == NULL_RTX) + return; + code = GET_CODE (*loc); + if (code == REG && (i = REGNO (*loc)) >= FIRST_PSEUDO_REGISTER + && lra_get_regno_hard_regno (i) < 0 + /* We do not want to assign memory for former scratches because + it might result in an address reload for some targets. In + any case we transform such pseudos not getting hard registers + into scratches back. */ + && ! lra_former_scratch_p (i)) + { + hard_reg = spill_hard_reg[i]; + *loc = copy_rtx (hard_reg != NULL_RTX ? hard_reg : pseudo_slots[i].mem); + return; + } + + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + remove_pseudos (&XEXP (*loc, i), insn); + else if (fmt[i] == 'E') + { + int j; + + for (j = XVECLEN (*loc, i) - 1; j >= 0; j--) + remove_pseudos (&XVECEXP (*loc, i, j), insn); + } + } +} + +/* Convert spilled pseudos into their stack slots or spill hard regs, + put insns to process on the constraint stack (that is all insns in + which pseudos were changed to memory or spill hard regs). 
*/ +static void +spill_pseudos (void) +{ + basic_block bb; + rtx insn; + int i; + bitmap_head spilled_pseudos, changed_insns; + + bitmap_initialize (&spilled_pseudos, &reg_obstack); + bitmap_initialize (&changed_insns, &reg_obstack); + for (i = FIRST_PSEUDO_REGISTER; i < regs_num; i++) + { + if (lra_reg_info[i].nrefs != 0 && lra_get_regno_hard_regno (i) < 0 + && ! lra_former_scratch_p (i)) + { + bitmap_set_bit (&spilled_pseudos, i); + bitmap_ior_into (&changed_insns, &lra_reg_info[i].insn_bitmap); + } + } + FOR_EACH_BB (bb) + { + FOR_BB_INSNS (bb, insn) + if (bitmap_bit_p (&changed_insns, INSN_UID (insn))) + { + remove_pseudos (&PATTERN (insn), insn); + if (CALL_P (insn)) + remove_pseudos (&CALL_INSN_FUNCTION_USAGE (insn), insn); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + "Changing spilled pseudos to memory in insn #%u\n", + INSN_UID (insn)); + lra_push_insn (insn); + if (lra_reg_spill_p || targetm.different_addr_displacement_p ()) + lra_set_used_insn_alternative (insn, -1); + } + else if (CALL_P (insn)) + /* Presence of any pseudo in CALL_INSN_FUNCTION_USAGE does + not affect value of insn_bitmap of the corresponding + lra_reg_info. That is because we don't need to reload + pseudos in CALL_INSN_FUNCTION_USAGEs. So if we process + only insns in the insn_bitmap of given pseudo here, we + can miss the pseudo in some + CALL_INSN_FUNCTION_USAGEs. */ + remove_pseudos (&CALL_INSN_FUNCTION_USAGE (insn), insn); + bitmap_and_compl_into (df_get_live_in (bb), &spilled_pseudos); + bitmap_and_compl_into (df_get_live_out (bb), &spilled_pseudos); + } + bitmap_clear (&spilled_pseudos); + bitmap_clear (&changed_insns); +} + +/* Return true if we need to change some pseudos into memory. */ +bool +lra_need_for_spills_p (void) +{ + int i; max_regno = max_reg_num (); + + for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) + if (lra_reg_info[i].nrefs != 0 && lra_get_regno_hard_regno (i) < 0 + && ! 
lra_former_scratch_p (i)) + return true; + return false; +} + +/* Change spilled pseudos into memory or spill hard regs. Put changed + insns on the constraint stack (these insns will be considered on + the next constraint pass). The changed insns are all insns in + which pseudos were changed. */ +void +lra_spill (void) +{ + int i, n, curr_regno; + int *pseudo_regnos; + + regs_num = max_reg_num (); + spill_hard_reg = XNEWVEC (rtx, regs_num); + pseudo_regnos = XNEWVEC (int, regs_num); + for (n = 0, i = FIRST_PSEUDO_REGISTER; i < regs_num; i++) + if (lra_reg_info[i].nrefs != 0 && lra_get_regno_hard_regno (i) < 0 + /* We do not want to assign memory for former scratches. */ + && ! lra_former_scratch_p (i)) + { + spill_hard_reg[i] = NULL_RTX; + pseudo_regnos[n++] = i; + } + lra_assert (n > 0); + pseudo_slots = XNEWVEC (struct pseudo_slot, regs_num); + slots = XNEWVEC (struct slot, regs_num); + /* Sort regnos according their usage frequencies. */ + qsort (pseudo_regnos, n, sizeof (int), regno_freq_compare); + n = assign_spill_hard_regs (pseudo_regnos, n); + assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n); + for (i = 0; i < n; i++) + if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX) + assign_mem_slot (pseudo_regnos[i]); + if (lra_dump_file != NULL) + { + for (i = 0; i < slots_num; i++) + { + fprintf (lra_dump_file, " Slot %d regnos (width = %d):", i, + GET_MODE_SIZE (GET_MODE (slots[i].mem))); + for (curr_regno = slots[i].regno;; + curr_regno = pseudo_slots[curr_regno].next - pseudo_slots) + { + fprintf (lra_dump_file, " %d", curr_regno); + if (pseudo_slots[curr_regno].next == NULL) + break; + } + fprintf (lra_dump_file, "\n"); + } + } + spill_pseudos (); + free (slots); + free (pseudo_slots); + free (pseudo_regnos); +} + +/* Final change of pseudos got hard registers into the corresponding + hard registers. 
*/ +void +lra_hard_reg_substitution (void) +{ + int i, hard_regno; + basic_block bb; + rtx insn; + int max_regno = max_reg_num (); + + for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) + if (lra_reg_info[i].nrefs != 0 + && (hard_regno = lra_get_regno_hard_regno (i)) >= 0) + SET_REGNO (regno_reg_rtx[i], hard_regno); + FOR_EACH_BB (bb) + FOR_BB_INSNS (bb, insn) + if (INSN_P (insn)) + { + lra_insn_recog_data_t id; + bool insn_change_p = false; + + id = lra_get_insn_recog_data (insn); + for (i = id->insn_static_data->n_operands - 1; i >= 0; i--) + { + rtx op = *id->operand_loc[i]; + + if (GET_CODE (op) == SUBREG && REG_P (SUBREG_REG (op))) + { + lra_assert (REGNO (SUBREG_REG (op)) < FIRST_PSEUDO_REGISTER); + alter_subreg (id->operand_loc[i], ! DEBUG_INSN_P (insn)); + lra_update_dup (id, i); + insn_change_p = true; + } + } + if (insn_change_p) + lra_update_operator_dups (id); + } +} diff --git a/gcc/lra.c b/gcc/lra.c new file mode 100644 index 00000000000..1897e8593cf --- /dev/null +++ b/gcc/lra.c @@ -0,0 +1,2398 @@ +/* LRA (local register allocator) driver and LRA utilities. + Copyright (C) 2010, 2011, 2012 + Free Software Foundation, Inc. + Contributed by Vladimir Makarov <vmakarov@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + + +/* The Local Register Allocator (LRA) is a replacement of former + reload pass. 
It is focused to simplify code solving the reload + pass tasks, to make the code maintenance easier, and to implement new + perspective optimizations. + + The major LRA design solutions are: + o division small manageable, separated sub-tasks + o reflection of all transformations and decisions in RTL as more + as possible + o insn constraints as a primary source of the info (minimizing + number of target-depended macros/hooks) + + In brief LRA works by iterative insn process with the final goal is + to satisfy all insn and address constraints: + o New reload insns (in brief reloads) and reload pseudos might be + generated; + o Some pseudos might be spilled to assign hard registers to + new reload pseudos; + o Changing spilled pseudos to stack memory or their equivalences; + o Allocation stack memory changes the address displacement and + new iteration is needed. + + Here is block diagram of LRA passes: + + --------------------- + | Undo inheritance | --------------- --------------- + | for spilled pseudos)| | Memory-memory | | New (and old) | + | and splits (for |<----| move coalesce |<-----| pseudos | + | pseudos got the | --------------- | assignment | + Start | same hard regs) | --------------- + | --------------------- ^ + V | ---------------- | + ----------- V | Update virtual | | +| Remove |----> ------------>| register | | +| scratches | ^ | displacements | | + ----------- | ---------------- | + | | | + | V New | + ---------------- No ------------ pseudos ------------------- + | Spilled pseudo | change |Constraints:| or insns | Inheritance/split | + | to memory |<-------| RTL |--------->| transformations | + | substitution | | transfor- | | in EBB scope | + ---------------- | mations | ------------------- + | ------------ + V + ------------------------- + | Hard regs substitution, | + | devirtalization, and |------> Finish + | restoring scratches got | + | memory | + ------------------------- + + To speed up the process: + o We process only insns affected by 
changes on previous + iterations; + o We don't use DFA-infrastructure because it results in much slower + compiler speed than a special IR described below does; + o We use a special insn representation for quick access to insn + info which is always *synchronized* with the current RTL; + o Insn IR is minimized by memory. It is divided on three parts: + o one specific for each insn in RTL (only operand locations); + o one common for all insns in RTL with the same insn code + (different operand attributes from machine descriptions); + o one oriented for maintenance of live info (list of pseudos). + o Pseudo data: + o all insns where the pseudo is referenced; + o live info (conflicting hard regs, live ranges, # of + references etc); + o data used for assigning (preferred hard regs, costs etc). + + This file contains LRA driver, LRA utility functions and data, and + code for dealing with scratches. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "rtl.h" +#include "tm_p.h" +#include "regs.h" +#include "insn-config.h" +#include "insn-codes.h" +#include "recog.h" +#include "output.h" +#include "addresses.h" +#include "hard-reg-set.h" +#include "flags.h" +#include "function.h" +#include "expr.h" +#include "basic-block.h" +#include "except.h" +#include "tree-pass.h" +#include "timevar.h" +#include "target.h" +#include "vec.h" +#include "ira.h" +#include "lra-int.h" +#include "df.h" + +/* Hard registers currently not available for allocation. It can + changed after some hard registers become not eliminable. */ +HARD_REG_SET lra_no_alloc_regs; + +static int get_new_reg_value (void); +static void expand_reg_info (void); +static void invalidate_insn_recog_data (int); +static int get_insn_freq (rtx); +static void invalidate_insn_data_regno_info (lra_insn_recog_data_t, rtx, int); + +/* Expand all regno related info needed for LRA. 
*/ +static void +expand_reg_data (void) +{ + resize_reg_info (); + expand_reg_info (); + ira_expand_reg_equiv (); +} + +/* Create and return a new reg of ORIGINAL mode. If ORIGINAL is NULL + or of VOIDmode, use MD_MODE for the new reg. Initialize its + register class to RCLASS. Print message about assigning class + RCLASS containing new register name TITLE unless it is NULL. Use + attributes of ORIGINAL if it is a register. The created register + will have unique held value. */ +rtx +lra_create_new_reg_with_unique_value (enum machine_mode md_mode, rtx original, + enum reg_class rclass, const char *title) +{ + enum machine_mode mode; + rtx new_reg; + + if (original == NULL_RTX || (mode = GET_MODE (original)) == VOIDmode) + mode = md_mode; + lra_assert (mode != VOIDmode); + new_reg = gen_reg_rtx (mode); + if (original == NULL_RTX || ! REG_P (original)) + { + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Creating newreg=%i", REGNO (new_reg)); + } + else + { + if (ORIGINAL_REGNO (original) >= FIRST_PSEUDO_REGISTER) + ORIGINAL_REGNO (new_reg) = ORIGINAL_REGNO (original); + REG_USERVAR_P (new_reg) = REG_USERVAR_P (original); + REG_POINTER (new_reg) = REG_POINTER (original); + REG_ATTRS (new_reg) = REG_ATTRS (original); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Creating newreg=%i from oldreg=%i", + REGNO (new_reg), REGNO (original)); + } + if (lra_dump_file != NULL) + { + if (title != NULL) + fprintf (lra_dump_file, ", assigning class %s to%s%s r%d", + reg_class_names[rclass], *title == '\0' ? "" : " ", + title, REGNO (new_reg)); + fprintf (lra_dump_file, "\n"); + } + expand_reg_data (); + setup_reg_classes (REGNO (new_reg), rclass, NO_REGS, rclass); + return new_reg; +} + +/* Analogous to the previous function but also inherits value of + ORIGINAL. 
*/ +rtx +lra_create_new_reg (enum machine_mode md_mode, rtx original, + enum reg_class rclass, const char *title) +{ + rtx new_reg; + + new_reg + = lra_create_new_reg_with_unique_value (md_mode, original, rclass, title); + if (original != NULL_RTX && REG_P (original)) + lra_reg_info[REGNO (new_reg)].val = lra_reg_info[REGNO (original)].val; + return new_reg; +} + +/* Set up for REGNO unique hold value. */ +void +lra_set_regno_unique_value (int regno) +{ + lra_reg_info[regno].val = get_new_reg_value (); +} + +/* Invalidate INSN related info used by LRA. */ +void +lra_invalidate_insn_data (rtx insn) +{ + lra_invalidate_insn_regno_info (insn); + invalidate_insn_recog_data (INSN_UID (insn)); +} + +/* Mark INSN deleted and invalidate the insn related info used by + LRA. */ +void +lra_set_insn_deleted (rtx insn) +{ + lra_invalidate_insn_data (insn); + SET_INSN_DELETED (insn); +} + +/* Delete an unneeded INSN and any previous insns who sole purpose is + loading data that is dead in INSN. */ +void +lra_delete_dead_insn (rtx insn) +{ + rtx prev = prev_real_insn (insn); + rtx prev_dest; + + /* If the previous insn sets a register that dies in our insn, + delete it too. */ + if (prev && GET_CODE (PATTERN (prev)) == SET + && (prev_dest = SET_DEST (PATTERN (prev)), REG_P (prev_dest)) + && reg_mentioned_p (prev_dest, PATTERN (insn)) + && find_regno_note (insn, REG_DEAD, REGNO (prev_dest)) + && ! side_effects_p (SET_SRC (PATTERN (prev)))) + lra_delete_dead_insn (prev); + + lra_set_insn_deleted (insn); +} + +/* Target checks operands through operand predicates to recognize an + insn. We should have a special precaution to generate add insns + which are frequent results of elimination. + + Emit insns for x = y + z. X can be used to store intermediate + values and should be not in Y and Z when we use X to store an + intermediate value. Y + Z should form [base] [+ index[ * scale]] [ + + disp] where base and index are registers, disp and scale are + constants. 
Y should contain base if it is present, Z should + contain disp if any. index[*scale] can be part of Y or Z. */ +void +lra_emit_add (rtx x, rtx y, rtx z) +{ + int old; + rtx insn, last; + rtx a1, a2, base, index, disp, scale, index_scale; + bool ok_p; + + insn = gen_add3_insn (x, y, z); + old = max_reg_num (); + if (insn != NULL_RTX) + emit_insn (insn); + else + { + disp = a2 = NULL_RTX; + if (GET_CODE (y) == PLUS) + { + a1 = XEXP (y, 0); + a2 = XEXP (y, 1); + disp = z; + } + else + { + a1 = y; + if (CONSTANT_P (z)) + disp = z; + else + a2 = z; + } + index_scale = scale = NULL_RTX; + if (GET_CODE (a1) == MULT) + { + index_scale = a1; + index = XEXP (a1, 0); + scale = XEXP (a1, 1); + base = a2; + } + else if (a2 != NULL_RTX && GET_CODE (a2) == MULT) + { + index_scale = a2; + index = XEXP (a2, 0); + scale = XEXP (a2, 1); + base = a1; + } + else + { + base = a1; + index = a2; + } + if (! REG_P (base) + || (index != NULL_RTX && ! REG_P (index)) + || (disp != NULL_RTX && ! CONSTANT_P (disp)) + || (scale != NULL_RTX && ! CONSTANT_P (scale))) + { + /* Its is not an address generation. Probably we have no 3 op + add. Last chance is to use 2-op add insn. */ + lra_assert (x != y && x != z); + emit_move_insn (x, z); + insn = gen_add2_insn (x, y); + emit_insn (insn); + } + else + { + if (index_scale == NULL_RTX) + index_scale = index; + if (disp == NULL_RTX) + { + /* Generate x = index_scale; x = x + base. */ + lra_assert (index_scale != NULL_RTX && base != NULL_RTX); + emit_move_insn (x, index_scale); + insn = gen_add2_insn (x, base); + emit_insn (insn); + } + else if (scale == NULL_RTX) + { + /* Try x = base + disp. */ + lra_assert (base != NULL_RTX); + last = get_last_insn (); + insn = emit_move_insn (x, gen_rtx_PLUS (GET_MODE (base), + base, disp)); + if (recog_memoized (insn) < 0) + { + delete_insns_since (last); + /* Generate x = disp; x = x + base. */ + emit_move_insn (x, disp); + insn = gen_add2_insn (x, base); + emit_insn (insn); + } + /* Generate x = x + index. 
*/ + if (index != NULL_RTX) + { + insn = gen_add2_insn (x, index); + emit_insn (insn); + } + } + else + { + /* Try x = index_scale; x = x + disp; x = x + base. */ + last = get_last_insn (); + insn = emit_move_insn (x, index_scale); + ok_p = false; + if (recog_memoized (insn) >= 0) + { + insn = gen_add2_insn (x, disp); + if (insn != NULL_RTX) + { + emit_insn (insn); + insn = gen_add2_insn (x, base); + if (insn != NULL_RTX) + { + emit_insn (insn); + ok_p = true; + } + } + } + if (! ok_p) + { + delete_insns_since (last); + /* Generate x = disp; x = x + base; x = x + index_scale. */ + emit_move_insn (x, disp); + insn = gen_add2_insn (x, base); + emit_insn (insn); + insn = gen_add2_insn (x, index_scale); + emit_insn (insn); + } + } + } + } + /* Functions emit_... can create pseudos -- so expand the pseudo + data. */ + if (old != max_reg_num ()) + expand_reg_data (); +} + +/* The number of emitted reload insns so far. */ +int lra_curr_reload_num; + +/* Emit x := y, processing special case when y = u + v or y = u + v * + scale + w through emit_add (Y can be an address which is base + + index reg * scale + displacement in general case). X may be used + as intermediate result therefore it should be not in Y. */ +void +lra_emit_move (rtx x, rtx y) +{ + int old; + + if (GET_CODE (y) != PLUS) + { + if (rtx_equal_p (x, y)) + return; + old = max_reg_num (); + emit_move_insn (x, y); + if (REG_P (x)) + lra_reg_info[ORIGINAL_REGNO (x)].last_reload = ++lra_curr_reload_num; + /* Function emit_move can create pseudos -- so expand the pseudo + data. */ + if (old != max_reg_num ()) + expand_reg_data (); + return; + } + lra_emit_add (x, XEXP (y, 0), XEXP (y, 1)); +} + +/* Update insn operands which are duplication of operands whose + numbers are in array of NOPS (with end marker -1). The insn is + represented by its LRA internal representation ID. 
*/ +void +lra_update_dups (lra_insn_recog_data_t id, signed char *nops) +{ + int i, j, nop; + struct lra_static_insn_data *static_id = id->insn_static_data; + + for (i = 0; i < static_id->n_dups; i++) + for (j = 0; (nop = nops[j]) >= 0; j++) + if (static_id->dup_num[i] == nop) + *id->dup_loc[i] = *id->operand_loc[nop]; +} + + + +/* This page contains code dealing with info about registers in the + insns. */ + +/* Pools for insn reg info. */ +static alloc_pool insn_reg_pool; + +/* Initiate pool for insn reg info. */ +static void +init_insn_regs (void) +{ + insn_reg_pool + = create_alloc_pool ("insn regs", sizeof (struct lra_insn_reg), 100); +} + +/* Create LRA insn related info about referenced REGNO with TYPE + (in/out/inout), biggest reference mode MODE, flag that it is + reference through subreg (SUBREG_P), flag that is early clobbered + in the insn (EARLY_CLOBBER), and reference to the next insn reg + info (NEXT). */ +static struct lra_insn_reg * +new_insn_reg (int regno, enum op_type type, enum machine_mode mode, + bool subreg_p, bool early_clobber, struct lra_insn_reg *next) +{ + struct lra_insn_reg *ir; + + ir = (struct lra_insn_reg *) pool_alloc (insn_reg_pool); + ir->type = type; + ir->biggest_mode = mode; + ir->subreg_p = subreg_p; + ir->early_clobber = early_clobber; + ir->regno = regno; + ir->next = next; + return ir; +} + +/* Free insn reg info IR. */ +static void +free_insn_reg (struct lra_insn_reg *ir) +{ + pool_free (insn_reg_pool, ir); +} + +/* Free insn reg info list IR. */ +static void +free_insn_regs (struct lra_insn_reg *ir) +{ + struct lra_insn_reg *next_ir; + + for (; ir != NULL; ir = next_ir) + { + next_ir = ir->next; + free_insn_reg (ir); + } +} + +/* Finish pool for insn reg info. */ +static void +finish_insn_regs (void) +{ + free_alloc_pool (insn_reg_pool); +} + + + +/* This page contains code dealing LRA insn info (or in other words + LRA internal insn representation). 
*/ + +struct target_lra_int default_target_lra_int; +#if SWITCHABLE_TARGET +struct target_lra_int *this_target_lra_int = &default_target_lra_int; +#endif + +/* Map INSN_CODE -> the static insn data. This info is valid during + all translation unit. */ +struct lra_static_insn_data *insn_code_data[LAST_INSN_CODE]; + +/* Debug insns are represented as a special insn with one input + operand which is RTL expression in var_location. */ + +/* The following data are used as static insn operand data for all + debug insns. If structure lra_operand_data is changed, the + initializer should be changed too. */ +static struct lra_operand_data debug_operand_data = + { + NULL, /* alternative */ + VOIDmode, /* We are not interesting in the operand mode. */ + OP_IN, + 0, 0, 0, 0 + }; + +/* The following data are used as static insn data for all debug + insns. If structure lra_static_insn_data is changed, the + initializer should be changed too. */ +static struct lra_static_insn_data debug_insn_static_data = + { + &debug_operand_data, + 0, /* Duplication operands #. */ + -1, /* Commutative operand #. */ + 1, /* Operands #. There is only one operand which is debug RTL + expression. */ + 0, /* Duplications #. */ + 0, /* Alternatives #. We are not interesting in alternatives + because we does not proceed debug_insns for reloads. */ + NULL, /* Hard registers referenced in machine description. */ + NULL /* Descriptions of operands in alternatives. */ + }; + +/* Called once per compiler work to initialize some LRA data related + to insns. */ +static void +init_insn_code_data_once (void) +{ + memset (insn_code_data, 0, sizeof (insn_code_data)); + memset (op_alt_data, 0, sizeof (op_alt_data)); +} + +/* Called once per compiler work to finalize some LRA data related to + insns. 
*/ +static void +finish_insn_code_data_once (void) +{ + int i; + + for (i = 0; i < LAST_INSN_CODE; i++) + { + if (insn_code_data[i] != NULL) + free (insn_code_data[i]); + if (op_alt_data[i] != NULL) + free (op_alt_data[i]); + } +} + +/* Initialize LRA info about operands in insn alternatives. */ +static void +init_op_alt_data (void) +{ + int i; + + for (i = 0; i < LAST_INSN_CODE; i++) + if (op_alt_data[i] != NULL) + { + free (op_alt_data[i]); + op_alt_data[i] = NULL; + } +} + +/* Return static insn data, allocate and setup if necessary. Although + dup_num is static data (it depends only on icode), to set it up we + need to extract insn first. So recog_data should be valid for + normal insn (ICODE >= 0) before the call. */ +static struct lra_static_insn_data * +get_static_insn_data (int icode, int nop, int ndup, int nalt) +{ + struct lra_static_insn_data *data; + size_t n_bytes; + + lra_assert (icode < LAST_INSN_CODE); + if (icode >= 0 && (data = insn_code_data[icode]) != NULL) + return data; + lra_assert (nop >= 0 && ndup >= 0 && nalt >= 0); + n_bytes = sizeof (struct lra_static_insn_data) + + sizeof (struct lra_operand_data) * nop + + sizeof (int) * ndup; + data = XNEWVAR (struct lra_static_insn_data, n_bytes); + data->n_operands = nop; + data->n_dups = ndup; + data->n_alternatives = nalt; + data->operand = ((struct lra_operand_data *) + ((char *) data + sizeof (struct lra_static_insn_data))); + data->dup_num = ((int *) ((char *) data->operand + + sizeof (struct lra_operand_data) * nop)); + if (icode >= 0) + { + int i; + + insn_code_data[icode] = data; + for (i = 0; i < nop; i++) + { + data->operand[i].constraint + = insn_data[icode].operand[i].constraint; + data->operand[i].mode = insn_data[icode].operand[i].mode; + data->operand[i].strict_low = insn_data[icode].operand[i].strict_low; + data->operand[i].is_operator + = insn_data[icode].operand[i].is_operator; + data->operand[i].type + = (data->operand[i].constraint[0] == '=' ? 
OP_OUT + : data->operand[i].constraint[0] == '+' ? OP_INOUT + : OP_IN); + data->operand[i].is_address = false; + } + for (i = 0; i < ndup; i++) + data->dup_num[i] = recog_data.dup_num[i]; + } + return data; +} + +/* The current length of the following array. */ +int lra_insn_recog_data_len; + +/* Map INSN_UID -> the insn recog data (NULL if unknown). */ +lra_insn_recog_data_t *lra_insn_recog_data; + +/* Initialize LRA data about insns. */ +static void +init_insn_recog_data (void) +{ + lra_insn_recog_data_len = 0; + lra_insn_recog_data = NULL; + init_insn_regs (); +} + +/* Expand, if necessary, LRA data about insns. */ +static void +check_and_expand_insn_recog_data (int index) +{ + int i, old; + + if (lra_insn_recog_data_len > index) + return; + old = lra_insn_recog_data_len; + lra_insn_recog_data_len = index * 3 / 2 + 1; + lra_insn_recog_data = XRESIZEVEC (lra_insn_recog_data_t, + lra_insn_recog_data, + lra_insn_recog_data_len); + for (i = old; i < lra_insn_recog_data_len; i++) + lra_insn_recog_data[i] = NULL; +} + +/* Finish LRA DATA about insn. */ +static void +free_insn_recog_data (lra_insn_recog_data_t data) +{ + if (data->operand_loc != NULL) + free (data->operand_loc); + if (data->dup_loc != NULL) + free (data->dup_loc); + if (data->arg_hard_regs != NULL) + free (data->arg_hard_regs); +#ifdef HAVE_ATTR_enabled + if (data->alternative_enabled_p != NULL) + free (data->alternative_enabled_p); +#endif + if (data->icode < 0 && NONDEBUG_INSN_P (data->insn)) + { + if (data->insn_static_data->operand_alternative != NULL) + free (data->insn_static_data->operand_alternative); + free_insn_regs (data->insn_static_data->hard_regs); + free (data->insn_static_data); + } + free_insn_regs (data->regs); + data->regs = NULL; + free (data); +} + +/* Finish LRA data about all insns. 
*/ +static void +finish_insn_recog_data (void) +{ + int i; + lra_insn_recog_data_t data; + + for (i = 0; i < lra_insn_recog_data_len; i++) + if ((data = lra_insn_recog_data[i]) != NULL) + free_insn_recog_data (data); + finish_insn_regs (); + free (lra_insn_recog_data); +} + +/* Setup info about operands in alternatives of LRA DATA of insn. */ +static void +setup_operand_alternative (lra_insn_recog_data_t data) +{ + int i, nop, nalt; + int icode = data->icode; + struct lra_static_insn_data *static_data = data->insn_static_data; + + if (icode >= 0 + && (static_data->operand_alternative = op_alt_data[icode]) != NULL) + return; + static_data->commutative = -1; + nop = static_data->n_operands; + if (nop == 0) + { + static_data->operand_alternative = NULL; + return; + } + nalt = static_data->n_alternatives; + static_data->operand_alternative = XNEWVEC (struct operand_alternative, + nalt * nop); + memset (static_data->operand_alternative, 0, + nalt * nop * sizeof (struct operand_alternative)); + if (icode >= 0) + op_alt_data[icode] = static_data->operand_alternative; + for (i = 0; i < nop; i++) + { + int j; + struct operand_alternative *op_alt_start, *op_alt; + const char *p = static_data->operand[i].constraint; + + static_data->operand[i].early_clobber = 0; + op_alt_start = &static_data->operand_alternative[i]; + + for (j = 0; j < nalt; j++) + { + op_alt = op_alt_start + j * nop; + op_alt->cl = NO_REGS; + op_alt->constraint = p; + op_alt->matches = -1; + op_alt->matched = -1; + + if (*p == '\0' || *p == ',') + { + op_alt->anything_ok = 1; + continue; + } + + for (;;) + { + char c = *p; + if (c == '#') + do + c = *++p; + while (c != ',' && c != '\0'); + if (c == ',' || c == '\0') + { + p++; + break; + } + + switch (c) + { + case '=': case '+': case '*': + case 'E': case 'F': case 'G': case 'H': + case 's': case 'i': case 'n': + case 'I': case 'J': case 'K': case 'L': + case 'M': case 'N': case 'O': case 'P': + /* These don't say anything we care about. 
*/ + break; + + case '%': + /* We currently only support one commutative pair of + operands. */ + if (static_data->commutative < 0) + static_data->commutative = i; + else + lra_assert (data->icode < 0); /* Asm */ + + /* The last operand should not be marked + commutative. */ + lra_assert (i != nop - 1); + break; + + case '?': + op_alt->reject += 6; + break; + case '!': + op_alt->reject += 600; + break; + case '&': + op_alt->earlyclobber = 1; + static_data->operand[i].early_clobber = 1; + break; + + case '0': case '1': case '2': case '3': case '4': + case '5': case '6': case '7': case '8': case '9': + { + char *end; + op_alt->matches = strtoul (p, &end, 10); + static_data->operand_alternative + [j * nop + op_alt->matches].matched = i; + p = end; + } + continue; + + case TARGET_MEM_CONSTRAINT: + op_alt->memory_ok = 1; + break; + case '<': + op_alt->decmem_ok = 1; + break; + case '>': + op_alt->incmem_ok = 1; + break; + case 'V': + op_alt->nonoffmem_ok = 1; + break; + case 'o': + op_alt->offmem_ok = 1; + break; + case 'X': + op_alt->anything_ok = 1; + break; + + case 'p': + static_data->operand[i].is_address = true; + op_alt->is_address = 1; + op_alt->cl = (reg_class_subunion[(int) op_alt->cl] + [(int) base_reg_class (VOIDmode, + ADDR_SPACE_GENERIC, + ADDRESS, SCRATCH)]); + break; + + case 'g': + case 'r': + op_alt->cl = + reg_class_subunion[(int) op_alt->cl][(int) GENERAL_REGS]; + break; + + default: + if (EXTRA_MEMORY_CONSTRAINT (c, p)) + { + op_alt->memory_ok = 1; + break; + } + if (EXTRA_ADDRESS_CONSTRAINT (c, p)) + { + static_data->operand[i].is_address = true; + op_alt->is_address = 1; + op_alt->cl + = (reg_class_subunion + [(int) op_alt->cl] + [(int) base_reg_class (VOIDmode, ADDR_SPACE_GENERIC, + ADDRESS, SCRATCH)]); + break; + } + + op_alt->cl + = (reg_class_subunion + [(int) op_alt->cl] + [(int) + REG_CLASS_FROM_CONSTRAINT ((unsigned char) c, p)]); + break; + } + p += CONSTRAINT_LEN (c, p); + } + } + } +} + +/* Recursively process X and collect info about 
registers, which are + not the insn operands, in X with TYPE (in/out/inout) and flag that + it is early clobbered in the insn (EARLY_CLOBBER) and add the info + to LIST. X is a part of insn given by DATA. Return the result + list. */ +static struct lra_insn_reg * +collect_non_operand_hard_regs (rtx *x, lra_insn_recog_data_t data, + struct lra_insn_reg *list, + enum op_type type, bool early_clobber) +{ + int i, j, regno, last; + bool subreg_p; + enum machine_mode mode; + struct lra_insn_reg *curr; + rtx op = *x; + enum rtx_code code = GET_CODE (op); + const char *fmt = GET_RTX_FORMAT (code); + + for (i = 0; i < data->insn_static_data->n_operands; i++) + if (x == data->operand_loc[i]) + /* It is an operand loc. Stop here. */ + return list; + for (i = 0; i < data->insn_static_data->n_dups; i++) + if (x == data->dup_loc[i]) + /* It is a dup loc. Stop here. */ + return list; + mode = GET_MODE (op); + subreg_p = false; + if (code == SUBREG) + { + op = SUBREG_REG (op); + code = GET_CODE (op); + if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (op))) + { + mode = GET_MODE (op); + if (GET_MODE_SIZE (mode) > REGMODE_NATURAL_SIZE (mode)) + subreg_p = true; + } + } + if (REG_P (op)) + { + if ((regno = REGNO (op)) >= FIRST_PSEUDO_REGISTER) + return list; + for (last = regno + hard_regno_nregs[regno][mode]; + regno < last; + regno++) + if (! TEST_HARD_REG_BIT (lra_no_alloc_regs, regno)) + { + for (curr = list; curr != NULL; curr = curr->next) + if (curr->regno == regno && curr->subreg_p == subreg_p + && curr->biggest_mode == mode) + { + if (curr->type != type) + curr->type = OP_INOUT; + if (curr->early_clobber != early_clobber) + curr->early_clobber = true; + break; + } + if (curr == NULL) + { + /* This is a new hard regno or the info can not be + integrated into the found structure. */ +#ifdef STACK_REGS + early_clobber + = (early_clobber + /* This clobber is to inform popping floating + point stack only. */ + && ! 
(FIRST_STACK_REG <= regno + && regno <= LAST_STACK_REG)); +#endif + list = new_insn_reg (regno, type, mode, subreg_p, + early_clobber, list); + } + } + return list; + } + switch (code) + { + case SET: + list = collect_non_operand_hard_regs (&SET_DEST (op), data, + list, OP_OUT, false); + list = collect_non_operand_hard_regs (&SET_SRC (op), data, + list, OP_IN, false); + break; + case CLOBBER: + /* We treat clobber of non-operand hard registers as early + clobber (the behavior is expected from asm). */ + list = collect_non_operand_hard_regs (&XEXP (op, 0), data, + list, OP_OUT, true); + break; + case PRE_INC: case PRE_DEC: case POST_INC: case POST_DEC: + list = collect_non_operand_hard_regs (&XEXP (op, 0), data, + list, OP_INOUT, false); + break; + case PRE_MODIFY: case POST_MODIFY: + list = collect_non_operand_hard_regs (&XEXP (op, 0), data, + list, OP_INOUT, false); + list = collect_non_operand_hard_regs (&XEXP (op, 1), data, + list, OP_IN, false); + break; + default: + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + list = collect_non_operand_hard_regs (&XEXP (op, i), data, + list, OP_IN, false); + else if (fmt[i] == 'E') + for (j = XVECLEN (op, i) - 1; j >= 0; j--) + list = collect_non_operand_hard_regs (&XVECEXP (op, i, j), data, + list, OP_IN, false); + } + } + return list; +} + +/* Set up and return info about INSN. Set up the info if it is not set up + yet. */ +lra_insn_recog_data_t +lra_set_insn_recog_data (rtx insn) +{ + lra_insn_recog_data_t data; + int i, n, icode; + rtx **locs; + unsigned int uid = INSN_UID (insn); + struct lra_static_insn_data *insn_static_data; + + check_and_expand_insn_recog_data (uid); + if (DEBUG_INSN_P (insn)) + icode = -1; + else + { + icode = INSN_CODE (insn); + if (icode < 0) + /* It might be a new simple insn which is not recognized yet. 
*/ + INSN_CODE (insn) = icode = recog_memoized (insn); + } + data = XNEW (struct lra_insn_recog_data); + lra_insn_recog_data[uid] = data; + data->insn = insn; + data->used_insn_alternative = -1; + data->icode = icode; + data->regs = NULL; + if (DEBUG_INSN_P (insn)) + { + data->insn_static_data = &debug_insn_static_data; + data->dup_loc = NULL; + data->arg_hard_regs = NULL; +#ifdef HAVE_ATTR_enabled + data->alternative_enabled_p = NULL; +#endif + data->operand_loc = XNEWVEC (rtx *, 1); + data->operand_loc[0] = &INSN_VAR_LOCATION_LOC (insn); + return data; + } + if (icode < 0) + { + int nop; + enum machine_mode operand_mode[MAX_RECOG_OPERANDS]; + const char *constraints[MAX_RECOG_OPERANDS]; + + nop = asm_noperands (PATTERN (insn)); + data->operand_loc = data->dup_loc = NULL; + if (nop < 0) + /* Its is a special insn like USE or CLOBBER. */ + data->insn_static_data = insn_static_data + = get_static_insn_data (-1, 0, 0, 1); + else + { + /* expand_asm_operands makes sure there aren't too many + operands. */ + lra_assert (nop <= MAX_RECOG_OPERANDS); + if (nop != 0) + data->operand_loc = XNEWVEC (rtx *, nop); + /* Now get the operand values and constraints out of the + insn. */ + decode_asm_operands (PATTERN (insn), NULL, + data->operand_loc, + constraints, operand_mode, NULL); + n = 1; + if (nop > 0) + { + const char *p = recog_data.constraints[0]; + + for (p = constraints[0]; *p; p++) + n += *p == ','; + } + data->insn_static_data = insn_static_data + = get_static_insn_data (-1, nop, 0, n); + for (i = 0; i < nop; i++) + { + insn_static_data->operand[i].mode = operand_mode[i]; + insn_static_data->operand[i].constraint = constraints[i]; + insn_static_data->operand[i].strict_low = false; + insn_static_data->operand[i].is_operator = false; + insn_static_data->operand[i].is_address = false; + } + } + for (i = 0; i < insn_static_data->n_operands; i++) + insn_static_data->operand[i].type + = (insn_static_data->operand[i].constraint[0] == '=' ? 
OP_OUT + : insn_static_data->operand[i].constraint[0] == '+' ? OP_INOUT + : OP_IN); +#ifdef HAVE_ATTR_enabled + data->alternative_enabled_p = NULL; +#endif + } + else + { + insn_extract (insn); + data->insn_static_data = insn_static_data + = get_static_insn_data (icode, insn_data[icode].n_operands, + insn_data[icode].n_dups, + insn_data[icode].n_alternatives); + n = insn_static_data->n_operands; + if (n == 0) + locs = NULL; + else + { + locs = XNEWVEC (rtx *, n); + memcpy (locs, recog_data.operand_loc, n * sizeof (rtx *)); + } + data->operand_loc = locs; + n = insn_static_data->n_dups; + if (n == 0) + locs = NULL; + else + { + locs = XNEWVEC (rtx *, n); + memcpy (locs, recog_data.dup_loc, n * sizeof (rtx *)); + } + data->dup_loc = locs; +#ifdef HAVE_ATTR_enabled + { + bool *bp; + + n = insn_static_data->n_alternatives; + lra_assert (n >= 0); + data->alternative_enabled_p = bp = XNEWVEC (bool, n); + /* Cache the insn because we don't want to call extract_insn + from get_attr_enabled as extract_insn modifies + which_alternative. The attribute enabled should not depend + on insn operands, operand modes, operand types, and operand + constraints. It should depend on the architecture. If it + is not true, we should rewrite this file code to use + extract_insn instead of less expensive insn_extract. */ + recog_data.insn = insn; + for (i = 0; i < n; i++) + { + which_alternative = i; + bp[i] = get_attr_enabled (insn); + } + } +#endif + } + if (GET_CODE (PATTERN (insn)) == CLOBBER || GET_CODE (PATTERN (insn)) == USE) + insn_static_data->hard_regs = NULL; + else + insn_static_data->hard_regs + = collect_non_operand_hard_regs (&PATTERN (insn), data, + NULL, OP_IN, false); + setup_operand_alternative (data); + data->arg_hard_regs = NULL; + if (CALL_P (insn)) + { + rtx link; + int n_hard_regs, regno, arg_hard_regs[FIRST_PSEUDO_REGISTER]; + + n_hard_regs = 0; + /* Finding implicit hard register usage. We believe it will be + not changed whatever transformations are used. 
Call insns + are such example. */ + for (link = CALL_INSN_FUNCTION_USAGE (insn); + link != NULL_RTX; + link = XEXP (link, 1)) + if (GET_CODE (XEXP (link, 0)) == USE + && REG_P (XEXP (XEXP (link, 0), 0))) + { + regno = REGNO (XEXP (XEXP (link, 0), 0)); + lra_assert (regno < FIRST_PSEUDO_REGISTER); + /* It is an argument register. */ + for (i = (hard_regno_nregs + [regno][GET_MODE (XEXP (XEXP (link, 0), 0))]) - 1; + i >= 0; + i--) + arg_hard_regs[n_hard_regs++] = regno + i; + } + if (n_hard_regs != 0) + { + arg_hard_regs[n_hard_regs++] = -1; + data->arg_hard_regs = XNEWVEC (int, n_hard_regs); + memcpy (data->arg_hard_regs, arg_hard_regs, + sizeof (int) * n_hard_regs); + } + } + /* Some output operand can be recognized only from the context not + from the constraints which are empty in this case. Call insn may + contain a hard register in set destination with empty constraint + and extract_insn treats them as an input. */ + for (i = 0; i < insn_static_data->n_operands; i++) + { + int j; + rtx pat, set; + struct lra_operand_data *operand = &insn_static_data->operand[i]; + + /* ??? Should we treat 'X' the same way. It looks to me that + 'X' means anything and empty constraint means we do not + care. */ + if (operand->type != OP_IN || *operand->constraint != '\0' + || operand->is_operator) + continue; + pat = PATTERN (insn); + if (GET_CODE (pat) == SET) + { + if (data->operand_loc[i] != &SET_DEST (pat)) + continue; + } + else if (GET_CODE (pat) == PARALLEL) + { + for (j = XVECLEN (pat, 0) - 1; j >= 0; j--) + { + set = XVECEXP (PATTERN (insn), 0, j); + if (GET_CODE (set) == SET + && &SET_DEST (set) == data->operand_loc[i]) + break; + } + if (j < 0) + continue; + } + else + continue; + operand->type = OP_OUT; + } + return data; +} + +/* Return info about insn give by UID. The info should be already set + up. 
*/ +static lra_insn_recog_data_t +get_insn_recog_data_by_uid (int uid) +{ + lra_insn_recog_data_t data; + + data = lra_insn_recog_data[uid]; + lra_assert (data != NULL); + return data; +} + +/* Invalidate all info about insn given by its UID. */ +static void +invalidate_insn_recog_data (int uid) +{ + lra_insn_recog_data_t data; + + data = lra_insn_recog_data[uid]; + lra_assert (data != NULL); + free_insn_recog_data (data); + lra_insn_recog_data[uid] = NULL; +} + +/* Update all the insn info about INSN. It is usually called when + something in the insn was changed. Return the updated info. */ +lra_insn_recog_data_t +lra_update_insn_recog_data (rtx insn) +{ + lra_insn_recog_data_t data; + int n; + unsigned int uid = INSN_UID (insn); + struct lra_static_insn_data *insn_static_data; + + check_and_expand_insn_recog_data (uid); + if ((data = lra_insn_recog_data[uid]) != NULL + && data->icode != INSN_CODE (insn)) + { + invalidate_insn_data_regno_info (data, insn, get_insn_freq (insn)); + invalidate_insn_recog_data (uid); + data = NULL; + } + if (data == NULL) + return lra_get_insn_recog_data (insn); + insn_static_data = data->insn_static_data; + data->used_insn_alternative = -1; + if (DEBUG_INSN_P (insn)) + return data; + if (data->icode < 0) + { + int nop; + enum machine_mode operand_mode[MAX_RECOG_OPERANDS]; + const char *constraints[MAX_RECOG_OPERANDS]; + + nop = asm_noperands (PATTERN (insn)); + if (nop >= 0) + { + lra_assert (nop == data->insn_static_data->n_operands); + /* Now get the operand values and constraints out of the + insn. */ + decode_asm_operands (PATTERN (insn), NULL, + data->operand_loc, + constraints, operand_mode, NULL); +#ifdef ENABLE_CHECKING + { + int i; + + for (i = 0; i < nop; i++) + lra_assert + (insn_static_data->operand[i].mode == operand_mode[i] + && insn_static_data->operand[i].constraint == constraints[i] + && ! 
insn_static_data->operand[i].is_operator); + } +#endif + } +#ifdef ENABLE_CHECKING + { + int i; + + for (i = 0; i < insn_static_data->n_operands; i++) + lra_assert + (insn_static_data->operand[i].type + == (insn_static_data->operand[i].constraint[0] == '=' ? OP_OUT + : insn_static_data->operand[i].constraint[0] == '+' ? OP_INOUT + : OP_IN)); + } +#endif + } + else + { + insn_extract (insn); + n = insn_static_data->n_operands; + if (n != 0) + memcpy (data->operand_loc, recog_data.operand_loc, n * sizeof (rtx *)); + n = insn_static_data->n_dups; + if (n != 0) + memcpy (data->dup_loc, recog_data.dup_loc, n * sizeof (rtx *)); +#ifdef HAVE_ATTR_enabled +#ifdef ENABLE_CHECKING + { + int i; + bool *bp; + + n = insn_static_data->n_alternatives; + bp = data->alternative_enabled_p; + lra_assert (n >= 0 && bp != NULL); + /* Cache the insn to prevent extract_insn call from + get_attr_enabled. */ + recog_data.insn = insn; + for (i = 0; i < n; i++) + { + which_alternative = i; + lra_assert (bp[i] == get_attr_enabled (insn)); + } + } +#endif +#endif + } + return data; +} + +/* Set up that INSN is using alternative ALT now. */ +void +lra_set_used_insn_alternative (rtx insn, int alt) +{ + lra_insn_recog_data_t data; + + data = lra_get_insn_recog_data (insn); + data->used_insn_alternative = alt; +} + +/* Set up that insn with UID is using alternative ALT now. The insn + info should be already set up. */ +void +lra_set_used_insn_alternative_by_uid (int uid, int alt) +{ + lra_insn_recog_data_t data; + + check_and_expand_insn_recog_data (uid); + data = lra_insn_recog_data[uid]; + lra_assert (data != NULL); + data->used_insn_alternative = alt; +} + + + +/* This page contains code dealing with common register info and + pseudo copies. */ + +/* The size of the following array. */ +static int reg_info_size; +/* Common info about each register. */ +struct lra_reg *lra_reg_info; + +/* Last register value. */ +static int last_reg_value; + +/* Return new register value. 
*/ +static int +get_new_reg_value (void) +{ + return ++last_reg_value; +} + +/* Pools for copies. */ +static alloc_pool copy_pool; + +DEF_VEC_P(lra_copy_t); +DEF_VEC_ALLOC_P(lra_copy_t, heap); + +/* Vec referring to pseudo copies. */ +static VEC(lra_copy_t,heap) *copy_vec; + +/* Initialize I-th element of lra_reg_info. */ +static inline void +initialize_lra_reg_info_element (int i) +{ + bitmap_initialize (&lra_reg_info[i].insn_bitmap, &reg_obstack); +#ifdef STACK_REGS + lra_reg_info[i].no_stack_p = false; +#endif + CLEAR_HARD_REG_SET (lra_reg_info[i].conflict_hard_regs); + lra_reg_info[i].preferred_hard_regno1 = -1; + lra_reg_info[i].preferred_hard_regno2 = -1; + lra_reg_info[i].preferred_hard_regno_profit1 = 0; + lra_reg_info[i].preferred_hard_regno_profit2 = 0; + lra_reg_info[i].live_ranges = NULL; + lra_reg_info[i].nrefs = lra_reg_info[i].freq = 0; + lra_reg_info[i].last_reload = 0; + lra_reg_info[i].restore_regno = -1; + lra_reg_info[i].val = get_new_reg_value (); + lra_reg_info[i].copies = NULL; +} + +/* Initialize common reg info and copies. */ +static void +init_reg_info (void) +{ + int i; + + last_reg_value = 0; + reg_info_size = max_reg_num () * 3 / 2 + 1; + lra_reg_info = XNEWVEC (struct lra_reg, reg_info_size); + for (i = 0; i < reg_info_size; i++) + initialize_lra_reg_info_element (i); + copy_pool + = create_alloc_pool ("lra copies", sizeof (struct lra_copy), 100); + copy_vec = VEC_alloc (lra_copy_t, heap, 100); +} + + +/* Finish common reg info and copies. */ +static void +finish_reg_info (void) +{ + int i; + + for (i = 0; i < reg_info_size; i++) + bitmap_clear (&lra_reg_info[i].insn_bitmap); + free (lra_reg_info); + reg_info_size = 0; + free_alloc_pool (copy_pool); + VEC_free (lra_copy_t, heap, copy_vec); +} + +/* Expand common reg info if it is necessary. 
*/ +static void +expand_reg_info (void) +{ + int i, old = reg_info_size; + + if (reg_info_size > max_reg_num ()) + return; + reg_info_size = max_reg_num () * 3 / 2 + 1; + lra_reg_info = XRESIZEVEC (struct lra_reg, lra_reg_info, reg_info_size); + for (i = old; i < reg_info_size; i++) + initialize_lra_reg_info_element (i); +} + +/* Free all copies. */ +void +lra_free_copies (void) +{ + lra_copy_t cp; + + while (VEC_length (lra_copy_t, copy_vec) != 0) + { + cp = VEC_pop (lra_copy_t, copy_vec); + lra_reg_info[cp->regno1].copies = lra_reg_info[cp->regno2].copies = NULL; + pool_free (copy_pool, cp); + } +} + +/* Create copy of two pseudos REGNO1 and REGNO2. The copy execution + frequency is FREQ. */ +void +lra_create_copy (int regno1, int regno2, int freq) +{ + bool regno1_dest_p; + lra_copy_t cp; + + lra_assert (regno1 != regno2); + regno1_dest_p = true; + if (regno1 > regno2) + { + int temp = regno2; + + regno1_dest_p = false; + regno2 = regno1; + regno1 = temp; + } + cp = (lra_copy_t) pool_alloc (copy_pool); + VEC_safe_push (lra_copy_t, heap, copy_vec, cp); + cp->regno1_dest_p = regno1_dest_p; + cp->freq = freq; + cp->regno1 = regno1; + cp->regno2 = regno2; + cp->regno1_next = lra_reg_info[regno1].copies; + lra_reg_info[regno1].copies = cp; + cp->regno2_next = lra_reg_info[regno2].copies; + lra_reg_info[regno2].copies = cp; + if (lra_dump_file != NULL) + fprintf (lra_dump_file, " Creating copy r%d%sr%d@%d\n", + regno1, regno1_dest_p ? "<-" : "->", regno2, freq); +} + +/* Return N-th (0, 1, ...) copy. If there is no copy, return + NULL. */ +lra_copy_t +lra_get_copy (int n) +{ + if (n >= (int) VEC_length (lra_copy_t, copy_vec)) + return NULL; + return VEC_index (lra_copy_t, copy_vec, n); +} + + + +/* This page contains code dealing with info about registers in + insns. */ + +/* Process X of insn UID recursively and add info (operand type is + given by TYPE, flag of that it is early clobber is EARLY_CLOBBER) + about registers in X to the insn DATA. 
*/ +static void +add_regs_to_insn_regno_info (lra_insn_recog_data_t data, rtx x, int uid, + enum op_type type, bool early_clobber) +{ + int i, j, regno; + bool subreg_p; + enum machine_mode mode; + const char *fmt; + enum rtx_code code; + struct lra_insn_reg *curr; + + code = GET_CODE (x); + mode = GET_MODE (x); + subreg_p = false; + if (GET_CODE (x) == SUBREG) + { + x = SUBREG_REG (x); + code = GET_CODE (x); + if (GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (x))) + { + mode = GET_MODE (x); + if (GET_MODE_SIZE (mode) > REGMODE_NATURAL_SIZE (mode)) + subreg_p = true; + } + } + if (REG_P (x)) + { + regno = REGNO (x); + expand_reg_info (); + if (bitmap_set_bit (&lra_reg_info[regno].insn_bitmap, uid)) + { + data->regs = new_insn_reg (regno, type, mode, subreg_p, + early_clobber, data->regs); + return; + } + else + { + for (curr = data->regs; curr != NULL; curr = curr->next) + if (curr->regno == regno) + { + if (curr->subreg_p != subreg_p || curr->biggest_mode != mode) + /* The info can not be integrated into the found + structure. */ + data->regs = new_insn_reg (regno, type, mode, subreg_p, + early_clobber, data->regs); + else + { + if (curr->type != type) + curr->type = OP_INOUT; + if (curr->early_clobber != early_clobber) + curr->early_clobber = true; + } + return; + } + gcc_unreachable (); + } + } + + switch (code) + { + case SET: + add_regs_to_insn_regno_info (data, SET_DEST (x), uid, OP_OUT, false); + add_regs_to_insn_regno_info (data, SET_SRC (x), uid, OP_IN, false); + break; + case CLOBBER: + /* We treat clobber of non-operand hard registers as early + clobber (the behavior is expected from asm). 
*/ + add_regs_to_insn_regno_info (data, XEXP (x, 0), uid, OP_OUT, true); + break; + case PRE_INC: case PRE_DEC: case POST_INC: case POST_DEC: + add_regs_to_insn_regno_info (data, XEXP (x, 0), uid, OP_INOUT, false); + break; + case PRE_MODIFY: case POST_MODIFY: + add_regs_to_insn_regno_info (data, XEXP (x, 0), uid, OP_INOUT, false); + add_regs_to_insn_regno_info (data, XEXP (x, 1), uid, OP_IN, false); + break; + default: + if ((code != PARALLEL && code != EXPR_LIST) || type != OP_OUT) + /* Some targets place small structures in registers for return + values of functions, and those registers are wrapped in + PARALLEL that we may see as the destination of a SET. Here + is an example: + + (call_insn 13 12 14 2 (set (parallel:BLK [ + (expr_list:REG_DEP_TRUE (reg:DI 0 ax) + (const_int 0 [0])) + (expr_list:REG_DEP_TRUE (reg:DI 1 dx) + (const_int 8 [0x8])) + ]) + (call (mem:QI (symbol_ref:DI (... */ + type = OP_IN; + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + add_regs_to_insn_regno_info (data, XEXP (x, i), uid, type, false); + else if (fmt[i] == 'E') + { + for (j = XVECLEN (x, i) - 1; j >= 0; j--) + add_regs_to_insn_regno_info (data, XVECEXP (x, i, j), uid, + type, false); + } + } + } +} + +/* Return execution frequency of INSN. */ +static int +get_insn_freq (rtx insn) +{ + basic_block bb; + + if ((bb = BLOCK_FOR_INSN (insn)) != NULL) + return REG_FREQ_FROM_BB (bb); + else + { + lra_assert (lra_insn_recog_data[INSN_UID (insn)] + ->insn_static_data->n_operands == 0); + /* We don't care about such insn, e.g. it might be jump with + addr_vec. */ + return 1; + } +} + +/* Invalidate all reg info of INSN with DATA and execution frequency + FREQ. Update common info about the invalidated registers. 
*/ +static void +invalidate_insn_data_regno_info (lra_insn_recog_data_t data, rtx insn, + int freq) +{ + int uid; + bool debug_p; + unsigned int i; + struct lra_insn_reg *ir, *next_ir; + + uid = INSN_UID (insn); + debug_p = DEBUG_INSN_P (insn); + for (ir = data->regs; ir != NULL; ir = next_ir) + { + i = ir->regno; + next_ir = ir->next; + free_insn_reg (ir); + bitmap_clear_bit (&lra_reg_info[i].insn_bitmap, uid); + if (i >= FIRST_PSEUDO_REGISTER && ! debug_p) + { + lra_reg_info[i].nrefs--; + lra_reg_info[i].freq -= freq; + lra_assert (lra_reg_info[i].nrefs >= 0 && lra_reg_info[i].freq >= 0); + } + } + data->regs = NULL; +} + +/* Invalidate all reg info of INSN. Update common info about the + invalidated registers. */ +void +lra_invalidate_insn_regno_info (rtx insn) +{ + invalidate_insn_data_regno_info (lra_get_insn_recog_data (insn), insn, + get_insn_freq (insn)); +} + +/* Update common reg info from reg info of insn given by its DATA and + execution frequency FREQ. */ +static void +setup_insn_reg_info (lra_insn_recog_data_t data, int freq) +{ + unsigned int i; + struct lra_insn_reg *ir; + + for (ir = data->regs; ir != NULL; ir = ir->next) + if ((i = ir->regno) >= FIRST_PSEUDO_REGISTER) + { + lra_reg_info[i].nrefs++; + lra_reg_info[i].freq += freq; + } +} + +/* Set up insn reg info of INSN. Update common reg info from reg info + of INSN. */ +void +lra_update_insn_regno_info (rtx insn) +{ + int i, uid, freq; + lra_insn_recog_data_t data; + struct lra_static_insn_data *static_data; + enum rtx_code code; + + if (! 
INSN_P (insn)) + return; + data = lra_get_insn_recog_data (insn); + static_data = data->insn_static_data; + freq = get_insn_freq (insn); + invalidate_insn_data_regno_info (data, insn, freq); + uid = INSN_UID (insn); + for (i = static_data->n_operands - 1; i >= 0; i--) + add_regs_to_insn_regno_info (data, *data->operand_loc[i], uid, + static_data->operand[i].type, + static_data->operand[i].early_clobber); + if ((code = GET_CODE (PATTERN (insn))) == CLOBBER || code == USE) + add_regs_to_insn_regno_info (data, XEXP (PATTERN (insn), 0), uid, + code == USE ? OP_IN : OP_OUT, false); + if (NONDEBUG_INSN_P (insn)) + setup_insn_reg_info (data, freq); +} + +/* Return reg info of insn given by it UID. */ +struct lra_insn_reg * +lra_get_insn_regs (int uid) +{ + lra_insn_recog_data_t data; + + data = get_insn_recog_data_by_uid (uid); + return data->regs; +} + + + +/* This page contains code dealing with stack of the insns which + should be processed by the next constraint pass. */ + +/* Bitmap used to put an insn on the stack only in one exemplar. */ +static sbitmap lra_constraint_insn_stack_bitmap; + +/* The stack itself. */ +VEC (rtx, heap) *lra_constraint_insn_stack; + +/* Put INSN on the stack. If ALWAYS_UPDATE is true, always update the reg + info for INSN, otherwise only update it if INSN is not already on the + stack. */ +static inline void +lra_push_insn_1 (rtx insn, bool always_update) +{ + unsigned int uid = INSN_UID (insn); + if (always_update) + lra_update_insn_regno_info (insn); + if (uid >= SBITMAP_SIZE (lra_constraint_insn_stack_bitmap)) + lra_constraint_insn_stack_bitmap = + sbitmap_resize (lra_constraint_insn_stack_bitmap, 3 * uid / 2, 0); + if (TEST_BIT (lra_constraint_insn_stack_bitmap, uid)) + return; + SET_BIT (lra_constraint_insn_stack_bitmap, uid); + if (! always_update) + lra_update_insn_regno_info (insn); + VEC_safe_push (rtx, heap, lra_constraint_insn_stack, insn); +} + +/* Put INSN on the stack. 
*/ +void +lra_push_insn (rtx insn) +{ + lra_push_insn_1 (insn, false); +} + +/* Put INSN on the stack and update its reg info. */ +void +lra_push_insn_and_update_insn_regno_info (rtx insn) +{ + lra_push_insn_1 (insn, true); +} + +/* Put insn with UID on the stack. */ +void +lra_push_insn_by_uid (unsigned int uid) +{ + lra_push_insn (lra_insn_recog_data[uid]->insn); +} + +/* Take the last-inserted insns off the stack and return it. */ +rtx +lra_pop_insn (void) +{ + rtx insn = VEC_pop (rtx, lra_constraint_insn_stack); + RESET_BIT (lra_constraint_insn_stack_bitmap, INSN_UID (insn)); + return insn; +} + +/* Return the current size of the insn stack. */ +unsigned int +lra_insn_stack_length (void) +{ + return VEC_length (rtx, lra_constraint_insn_stack); +} + +/* Push insns FROM to TO (excluding it) going in reverse order. */ +static void +push_insns (rtx from, rtx to) +{ + rtx insn; + + if (from == NULL_RTX) + return; + for (insn = from; insn != to; insn = PREV_INSN (insn)) + if (INSN_P (insn)) + lra_push_insn (insn); +} + +/* Emit insns BEFORE before INSN and insns AFTER after INSN. Put the + insns onto the stack. Print about emitting the insns with + TITLE. 
*/ +void +lra_process_new_insns (rtx insn, rtx before, rtx after, const char *title) +{ + rtx last; + + if (lra_dump_file != NULL && (before != NULL_RTX || after != NULL_RTX)) + { + debug_rtl_slim (lra_dump_file, insn, insn, -1, 0); + if (before != NULL_RTX) + { + fprintf (lra_dump_file," %s before:\n", title); + debug_rtl_slim (lra_dump_file, before, NULL_RTX, -1, 0); + } + if (after != NULL_RTX) + { + fprintf (lra_dump_file, " %s after:\n", title); + debug_rtl_slim (lra_dump_file, after, NULL_RTX, -1, 0); + } + fprintf (lra_dump_file, "\n"); + } + if (before != NULL_RTX) + { + emit_insn_before (before, insn); + push_insns (PREV_INSN (insn), PREV_INSN (before)); + } + if (after != NULL_RTX) + { + for (last = after; NEXT_INSN (last) != NULL_RTX; last = NEXT_INSN (last)) + ; + emit_insn_after (after, insn); + push_insns (last, insn); + } +} + + + +/* This page contains code dealing with scratches (changing them onto + pseudos and restoring them from the pseudos). + + We change scratches into pseudos at the beginning of LRA to + simplify dealing with them (conflicts, hard register assignments). + + If the pseudo denoting scratch was spilled it means that we do need + a hard register for it. Such pseudos are transformed back to + scratches at the end of LRA. */ + +/* Description of location of a former scratch operand. */ +struct loc +{ + rtx insn; /* Insn where the scratch was. */ + int nop; /* Number of the operand which was a scratch. */ +}; + +typedef struct loc *loc_t; + +DEF_VEC_P(loc_t); +DEF_VEC_ALLOC_P(loc_t, heap); + +/* Locations of the former scratches. */ +static VEC (loc_t, heap) *scratches; + +/* Bitmap of scratch regnos. */ +static bitmap_head scratch_bitmap; + +/* Bitmap of scratch operands. */ +static bitmap_head scratch_operand_bitmap; + +/* Return true if pseudo REGNO is made of SCRATCH. */ +bool +lra_former_scratch_p (int regno) +{ + return bitmap_bit_p (&scratch_bitmap, regno); +} + +/* Return true if the operand NOP of INSN is a former scratch. 
*/ +bool +lra_former_scratch_operand_p (rtx insn, int nop) +{ + return bitmap_bit_p (&scratch_operand_bitmap, + INSN_UID (insn) * MAX_RECOG_OPERANDS + nop) != 0; +} + +/* Change scratches onto pseudos and save their location. */ +static void +remove_scratches (void) +{ + int i; + bool insn_changed_p; + basic_block bb; + rtx insn, reg; + loc_t loc; + lra_insn_recog_data_t id; + struct lra_static_insn_data *static_id; + + scratches = VEC_alloc (loc_t, heap, get_max_uid ()); + bitmap_initialize (&scratch_bitmap, &reg_obstack); + bitmap_initialize (&scratch_operand_bitmap, &reg_obstack); + FOR_EACH_BB (bb) + FOR_BB_INSNS (bb, insn) + if (INSN_P (insn)) + { + id = lra_get_insn_recog_data (insn); + static_id = id->insn_static_data; + insn_changed_p = false; + for (i = 0; i < static_id->n_operands; i++) + if (GET_CODE (*id->operand_loc[i]) == SCRATCH + && GET_MODE (*id->operand_loc[i]) != VOIDmode) + { + insn_changed_p = true; + *id->operand_loc[i] = reg + = lra_create_new_reg (static_id->operand[i].mode, + *id->operand_loc[i], ALL_REGS, NULL); + add_reg_note (insn, REG_UNUSED, reg); + lra_update_dup (id, i); + loc = XNEW (struct loc); + loc->insn = insn; + loc->nop = i; + VEC_safe_push (loc_t, heap, scratches, loc); + bitmap_set_bit (&scratch_bitmap, REGNO (*id->operand_loc[i])); + bitmap_set_bit (&scratch_operand_bitmap, + INSN_UID (insn) * MAX_RECOG_OPERANDS + i); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + "Removing SCRATCH in insn #%u (nop %d)\n", + INSN_UID (insn), i); + } + if (insn_changed_p) + /* Because we might use DF right after caller-saves sub-pass + we need to keep DF info up to date. */ + df_insn_rescan (insn); + } +} + +/* Changes pseudos created by function remove_scratches onto scratches. 
*/ +static void +restore_scratches (void) +{ + int i, regno; + loc_t loc; + rtx last = NULL_RTX; + lra_insn_recog_data_t id = NULL; + + for (i = 0; VEC_iterate (loc_t, scratches, i, loc); i++) + { + if (last != loc->insn) + { + last = loc->insn; + id = lra_get_insn_recog_data (last); + } + if (REG_P (*id->operand_loc[loc->nop]) + && ((regno = REGNO (*id->operand_loc[loc->nop])) + >= FIRST_PSEUDO_REGISTER) + && lra_get_regno_hard_regno (regno) < 0) + { + /* It should be only case when scratch register with chosen + constraint 'X' did not get memory or hard register. */ + lra_assert (lra_former_scratch_p (regno)); + *id->operand_loc[loc->nop] + = gen_rtx_SCRATCH (GET_MODE (*id->operand_loc[loc->nop])); + lra_update_dup (id, loc->nop); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, "Restoring SCRATCH in insn #%u(nop %d)\n", + INSN_UID (loc->insn), loc->nop); + } + } + for (i = 0; VEC_iterate (loc_t, scratches, i, loc); i++) + free (loc); + VEC_free (loc_t, heap, scratches); + bitmap_clear (&scratch_bitmap); + bitmap_clear (&scratch_operand_bitmap); +} + + + +#ifdef ENABLE_CHECKING + +/* Function checks RTL for correctness. If FINAL_P is true, it is + done at the end of LRA and the check is more rigorous. */ +static void +check_rtl (bool final_p) +{ + int i; + basic_block bb; + rtx insn; + lra_insn_recog_data_t id; + + lra_assert (! 
final_p || reload_completed); + FOR_EACH_BB (bb) + FOR_BB_INSNS (bb, insn) + if (NONDEBUG_INSN_P (insn) + && GET_CODE (PATTERN (insn)) != USE + && GET_CODE (PATTERN (insn)) != CLOBBER + && GET_CODE (PATTERN (insn)) != ADDR_VEC + && GET_CODE (PATTERN (insn)) != ADDR_DIFF_VEC + && GET_CODE (PATTERN (insn)) != ASM_INPUT) + { + if (final_p) + { + extract_insn (insn); + lra_assert (constrain_operands (1)); + continue; + } + if (insn_invalid_p (insn, false)) + fatal_insn_not_found (insn); + if (asm_noperands (PATTERN (insn)) >= 0) + continue; + id = lra_get_insn_recog_data (insn); + /* The code is based on assumption that all addresses in + regular instruction are legitimate before LRA. The code in + lra-constraints.c is based on assumption that there is no + subreg of memory as an insn operand. */ + for (i = 0; i < id->insn_static_data->n_operands; i++) + { + rtx op = *id->operand_loc[i]; + + if (MEM_P (op) + && (GET_MODE (op) != BLKmode + || GET_CODE (XEXP (op, 0)) != SCRATCH) + && ! memory_address_p (GET_MODE (op), XEXP (op, 0)) + /* Some ports don't recognize the following addresses + as legitimate. Although they are legitimate if + they satisfies the constraints and will be checked + by insn constraints which we ignore here. */ + && GET_CODE (XEXP (op, 0)) != UNSPEC + && GET_RTX_CLASS (GET_CODE (XEXP (op, 0))) != RTX_AUTOINC) + fatal_insn_not_found (insn); + } + } +} +#endif /* #ifdef ENABLE_CHECKING */ + +/* Determine if the current function has an exception receiver block + that reaches the exit block via non-exceptional edges */ +static bool +has_nonexceptional_receiver (void) +{ + edge e; + edge_iterator ei; + basic_block *tos, *worklist, bb; + + /* If we're not optimizing, then just err on the safe side. */ + if (!optimize) + return true; + + /* First determine which blocks can reach exit via normal paths. 
*/ + tos = worklist = XNEWVEC (basic_block, n_basic_blocks + 1); + + FOR_EACH_BB (bb) + bb->flags &= ~BB_REACHABLE; + + /* Place the exit block on our worklist. */ + EXIT_BLOCK_PTR->flags |= BB_REACHABLE; + *tos++ = EXIT_BLOCK_PTR; + + /* Iterate: find everything reachable from what we've already seen. */ + while (tos != worklist) + { + bb = *--tos; + + FOR_EACH_EDGE (e, ei, bb->preds) + if (e->flags & EDGE_ABNORMAL) + { + free (worklist); + return true; + } + else + { + basic_block src = e->src; + + if (!(src->flags & BB_REACHABLE)) + { + src->flags |= BB_REACHABLE; + *tos++ = src; + } + } + } + free (worklist); + /* No exceptional block reached exit unexceptionally. */ + return false; +} + +#ifdef AUTO_INC_DEC + +/* Process recursively X of INSN and add REG_INC notes if necessary. */ +static void +add_auto_inc_notes (rtx insn, rtx x) +{ + enum rtx_code code = GET_CODE (x); + const char *fmt; + int i, j; + + if (code == MEM && auto_inc_p (XEXP (x, 0))) + { + add_reg_note (insn, REG_INC, XEXP (XEXP (x, 0), 0)); + return; + } + + /* Scan all X sub-expressions. */ + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + { + if (fmt[i] == 'e') + add_auto_inc_notes (insn, XEXP (x, i)); + else if (fmt[i] == 'E') + for (j = XVECLEN (x, i) - 1; j >= 0; j--) + add_auto_inc_notes (insn, XVECEXP (x, i, j)); + } +} + +#endif + +/* Remove all REG_DEAD and REG_UNUSED notes and regenerate REG_INC. + We change pseudos by hard registers without notification of DF and + that can make the notes obsolete. DF-infrastructure does not deal + with REG_INC notes -- so we should regenerate them here. 
*/ +static void +update_inc_notes (void) +{ + rtx *pnote; + basic_block bb; + rtx insn; + + FOR_EACH_BB (bb) + FOR_BB_INSNS (bb, insn) + if (NONDEBUG_INSN_P (insn)) + { + pnote = &REG_NOTES (insn); + while (*pnote != 0) + { + if (REG_NOTE_KIND (*pnote) == REG_INC) + *pnote = XEXP (*pnote, 1); + else + pnote = &XEXP (*pnote, 1); + } +#ifdef AUTO_INC_DEC + add_auto_inc_notes (insn, PATTERN (insn)); +#endif + } +} + +/* Set to 1 while in lra. */ +int lra_in_progress; + +/* Start of reload pseudo regnos before the new spill pass. */ +int lra_constraint_new_regno_start; + +/* Inheritance pseudo regnos before the new spill pass. */ +bitmap_head lra_inheritance_pseudos; + +/* Split regnos before the new spill pass. */ +bitmap_head lra_split_regs; + +/* Reload pseudo regnos before the new assign pass which still can be + spilled after the assignment pass. */ +bitmap_head lra_optional_reload_pseudos; + +/* First UID of insns generated before a new spill pass. */ +int lra_constraint_new_insn_uid_start; + +/* File used for output of LRA debug information. */ +FILE *lra_dump_file; + +/* True if we should try spill into registers of different classes + instead of memory. */ +bool lra_reg_spill_p; + +/* Set up value LRA_REG_SPILL_P. */ +static void +setup_reg_spill_flag (void) +{ + int cl, mode; + + if (targetm.spill_class != NULL) + for (cl = 0; cl < (int) LIM_REG_CLASSES; cl++) + for (mode = 0; mode < MAX_MACHINE_MODE; mode++) + if (targetm.spill_class ((enum reg_class) cl, + (enum machine_mode) mode) != NO_REGS) + { + lra_reg_spill_p = true; + return; + } + lra_reg_spill_p = false; +} + +/* True if the current function is too big to use regular algorithms + in LRA. In other words, we should use simpler and faster algorithms + in LRA. It also means we should not worry about generation code + for caller saves. The value is set up in IRA. */ +bool lra_simple_p; + +/* Major LRA entry function. F is a file should be used to dump LRA + debug info. 
*/ +void +lra (FILE *f) +{ + int i; + bool live_p, scratch_p, inserted_p; + + lra_dump_file = f; + + timevar_push (TV_LRA); + + init_insn_recog_data (); + +#ifdef ENABLE_CHECKING + check_rtl (false); +#endif + + COPY_HARD_REG_SET (lra_no_alloc_regs, ira_no_alloc_regs); + + lra_live_range_iter = lra_coalesce_iter = 0; + lra_constraint_iter = lra_constraint_iter_after_spill = 0; + lra_inheritance_iter = lra_undo_inheritance_iter = 0; + + setup_reg_spill_flag (); + + /* We can not set up reload_in_progress because it prevents new + pseudo creation. */ + lra_in_progress = 1; + + init_reg_info (); + expand_reg_info (); + + /* Function remove_scratches can creates new pseudos for clobbers -- + so set up lra_constraint_new_regno_start before its call to + permit changing reg classes for pseudos created by this + simplification. */ + lra_constraint_new_regno_start = max_reg_num (); + remove_scratches (); + scratch_p = lra_constraint_new_regno_start != max_reg_num (); + + /* A function that has a non-local label that can reach the exit + block via non-exceptional paths must save all call-saved + registers. */ + if (cfun->has_nonlocal_label && has_nonexceptional_receiver ()) + crtl->saves_all_registers = 1; + + if (crtl->saves_all_registers) + for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) + if (! call_used_regs[i] && ! fixed_regs[i] && ! LOCAL_REGNO (i)) + df_set_regs_ever_live (i, true); + + /* We don't DF from now and avoid its using because it is to + expensive when a lot of RTL changes are made. */ + df_set_flags (DF_NO_INSN_RESCAN); + lra_constraint_insn_stack = VEC_alloc (rtx, heap, get_max_uid ()); + lra_constraint_insn_stack_bitmap = sbitmap_alloc (get_max_uid ()); + sbitmap_zero (lra_constraint_insn_stack_bitmap); + lra_live_ranges_init (); + lra_constraints_init (); + lra_curr_reload_num = 0; + push_insns (get_last_insn (), NULL_RTX); + /* It is needed for the 1st coalescing. 
*/ + lra_constraint_new_insn_uid_start = get_max_uid (); + bitmap_initialize (&lra_inheritance_pseudos, &reg_obstack); + bitmap_initialize (&lra_split_regs, &reg_obstack); + bitmap_initialize (&lra_optional_reload_pseudos, &reg_obstack); + live_p = false; + for (;;) + { + for (;;) + { + bitmap_clear (&lra_optional_reload_pseudos); + /* We should try to assign hard registers to scratches even + if there were no RTL transformations in + lra_constraints. */ + if (! lra_constraints (lra_constraint_iter == 0) + && (lra_constraint_iter > 1 + || (! scratch_p && ! caller_save_needed))) + break; + /* Constraint transformations may result in that eliminable + hard regs become uneliminable and pseudos which use them + should be spilled. It is better to do it before pseudo + assignments. + + For example, rs6000 can make + RS6000_PIC_OFFSET_TABLE_REGNUM uneliminable if we started + to use a constant pool. */ + lra_eliminate (false); + /* Do inheritance only for regular algorithms. */ + if (! lra_simple_p) + lra_inheritance (); + /* We need live ranges for lra_assign -- so build them. */ + lra_create_live_ranges (true); + live_p = true; + /* If we don't spill non-reload and non-inheritance pseudos, + there is no sense to run memory-memory move coalescing. + If inheritance pseudos were spilled, the memory-memory + moves involving them will be removed by pass undoing + inheritance. */ + if (lra_simple_p) + lra_assign (); + else + { + /* Do coalescing only for regular algorithms. */ + if (! lra_assign () && lra_coalesce ()) + live_p = false; + if (lra_undo_inheritance ()) + live_p = false; + } + } + bitmap_clear (&lra_inheritance_pseudos); + bitmap_clear (&lra_split_regs); + if (! lra_need_for_spills_p ()) + break; + if (! live_p) + { + /* We need full live info for spilling pseudos into + registers instead of memory. */ + lra_create_live_ranges (lra_reg_spill_p); + live_p = true; + } + lra_spill (); + /* Assignment of stack slots changes elimination offsets for + some eliminations. 
So update the offsets here. */ + lra_eliminate (false); + lra_constraint_new_regno_start = max_reg_num (); + lra_constraint_new_insn_uid_start = get_max_uid (); + lra_constraint_iter_after_spill = 0; + } + restore_scratches (); + lra_eliminate (true); + lra_hard_reg_substitution (); + lra_in_progress = 0; + lra_clear_live_ranges (); + lra_live_ranges_finish (); + lra_constraints_finish (); + finish_reg_info (); + sbitmap_free (lra_constraint_insn_stack_bitmap); + VEC_free (rtx, heap, lra_constraint_insn_stack); + finish_insn_recog_data (); + regstat_free_n_sets_and_refs (); + regstat_free_ri (); + reload_completed = 1; + update_inc_notes (); + + inserted_p = fixup_abnormal_edges (); + + /* We've possibly turned single trapping insn into multiple ones. */ + if (cfun->can_throw_non_call_exceptions) + { + sbitmap blocks; + blocks = sbitmap_alloc (last_basic_block); + sbitmap_ones (blocks); + find_many_sub_basic_blocks (blocks); + sbitmap_free (blocks); + } + + if (inserted_p) + commit_edge_insertions (); + + /* Replacing pseudos with their memory equivalents might have + created shared rtx. Subsequent passes would get confused + by this, so unshare everything here. */ + unshare_all_rtl_again (get_insns ()); + +#ifdef ENABLE_CHECKING + check_rtl (true); +#endif + + timevar_pop (TV_LRA); +} + +/* Called once per compiler to initialize LRA data once. */ +void +lra_init_once (void) +{ + init_insn_code_data_once (); +} + +/* Initialize LRA whenever register-related information is changed. */ +void +lra_init (void) +{ + init_op_alt_data (); +} + +/* Called once per compiler to finish LRA data which are initialize + once. */ +void +lra_finish_once (void) +{ + finish_insn_code_data_once (); +} diff --git a/gcc/lra.h b/gcc/lra.h new file mode 100644 index 00000000000..ea275f72a44 --- /dev/null +++ b/gcc/lra.h @@ -0,0 +1,42 @@ +/* Communication between the Local Register Allocator (LRA) and + the rest of the compiler. 
+ Copyright (C) 2010, 2011, 2012 + Free Software Foundation, Inc. + Contributed by Vladimir Makarov <vmakarov@redhat.com>. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +extern bool lra_simple_p; + +/* Return the allocno reg class of REGNO. If it is a reload pseudo, + the pseudo should finally get hard register of the allocno + class. */ +static inline enum reg_class +lra_get_allocno_class (int regno) +{ + resize_reg_info (); + return reg_allocno_class (regno); +} + +extern rtx lra_create_new_reg (enum machine_mode, rtx, enum reg_class, + const char *); +extern void lra_init_elimination (void); +extern rtx lra_eliminate_regs (rtx, enum machine_mode, rtx); +extern void lra (FILE *); +extern void lra_init_once (void); +extern void lra_init (void); +extern void lra_finish_once (void); diff --git a/gcc/output.h b/gcc/output.h index 705c02e3dc8..3fb743a17e9 100644 --- a/gcc/output.h +++ b/gcc/output.h @@ -76,7 +76,7 @@ extern rtx final_scan_insn (rtx, FILE *, int, int, int *); /* Replace a SUBREG with a REG or a MEM, based on the thing it is a subreg of. */ -extern rtx alter_subreg (rtx *); +extern rtx alter_subreg (rtx *, bool); /* Print an operand using machine-dependent assembler syntax. 
*/ extern void output_operand (rtx, int); diff --git a/gcc/recog.c b/gcc/recog.c index f28b0219d95..466ac6c3e2a 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -993,6 +993,12 @@ general_operand (rtx op, enum machine_mode mode) /* FLOAT_MODE subregs can't be paradoxical. Combine will occasionally create such rtl, and we must reject it. */ if (SCALAR_FLOAT_MODE_P (GET_MODE (op)) + /* LRA can use subreg to store a floating point value in an + integer mode. Although the floating point and the + integer modes need the same number of hard registers, the + size of floating point mode can be less than the integer + mode. */ + && ! lra_in_progress && GET_MODE_SIZE (GET_MODE (op)) > GET_MODE_SIZE (GET_MODE (sub))) return 0; @@ -1068,6 +1074,12 @@ register_operand (rtx op, enum machine_mode mode) /* FLOAT_MODE subregs can't be paradoxical. Combine will occasionally create such rtl, and we must reject it. */ if (SCALAR_FLOAT_MODE_P (GET_MODE (op)) + /* LRA can use subreg to store a floating point value in an + integer mode. Although the floating point and the + integer modes need the same number of hard registers, the + size of floating point mode can be less than the integer + mode. */ + && ! lra_in_progress && GET_MODE_SIZE (GET_MODE (op)) > GET_MODE_SIZE (GET_MODE (sub))) return 0; @@ -1099,7 +1111,7 @@ scratch_operand (rtx op, enum machine_mode mode) return (GET_CODE (op) == SCRATCH || (REG_P (op) - && REGNO (op) < FIRST_PSEUDO_REGISTER)); + && (lra_in_progress || REGNO (op) < FIRST_PSEUDO_REGISTER))); } /* Return 1 if OP is a valid immediate operand for mode MODE. diff --git a/gcc/rtl.h b/gcc/rtl.h index 09f1e773899..361669ac191 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -2372,6 +2372,9 @@ extern int epilogue_completed; extern int reload_in_progress; +/* Set to 1 while in lra. */ +extern int lra_in_progress; + /* This macro indicates whether you may create a new pseudo-register. 
*/ @@ -2490,7 +2493,12 @@ extern rtx make_compound_operation (rtx, enum rtx_code); extern void delete_dead_jumptables (void); /* In sched-vis.c. */ -extern void dump_insn_slim (FILE *, const_rtx x); +extern void debug_bb_n_slim (int); +extern void debug_bb_slim (struct basic_block_def *); +extern void print_value_slim (FILE *, const_rtx, int); +extern void debug_rtl_slim (FILE *, const_rtx, const_rtx, int, int); +extern void dump_insn_slim (FILE *f, const_rtx x); +extern void debug_insn_slim (const_rtx x); /* In sched-rgn.c. */ extern void schedule_insns (void); diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c index a19bdfdc0d1..a101a292039 100644 --- a/gcc/rtlanal.c +++ b/gcc/rtlanal.c @@ -3481,7 +3481,9 @@ simplify_subreg_regno (unsigned int xregno, enum machine_mode xmode, /* Give the backend a chance to disallow the mode change. */ if (GET_MODE_CLASS (xmode) != MODE_COMPLEX_INT && GET_MODE_CLASS (xmode) != MODE_COMPLEX_FLOAT - && REG_CANNOT_CHANGE_MODE_P (xregno, xmode, ymode)) + && REG_CANNOT_CHANGE_MODE_P (xregno, xmode, ymode) + /* We can use mode change in LRA for some transformations. */ + && ! lra_in_progress) return -1; #endif @@ -3491,10 +3493,16 @@ simplify_subreg_regno (unsigned int xregno, enum machine_mode xmode, return -1; if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM - && xregno == ARG_POINTER_REGNUM) + /* We should convert arg register in LRA after the elimination + if it is possible. */ + && xregno == ARG_POINTER_REGNUM + && ! lra_in_progress) return -1; - if (xregno == STACK_POINTER_REGNUM) + if (xregno == STACK_POINTER_REGNUM + /* We should convert hard stack register in LRA if it is + possible. */ + && ! lra_in_progress) return -1; /* Try to get the register offset. */ diff --git a/gcc/sched-vis.c b/gcc/sched-vis.c index 24403a647ee..280b33ad2f5 100644 --- a/gcc/sched-vis.c +++ b/gcc/sched-vis.c @@ -546,6 +546,21 @@ print_value (char *buf, const_rtx x, int verbose) } } /* print_value */ +/* Print X, an RTL value node, to file F in slim format. 
Include + additional information if VERBOSE is nonzero. + + Value nodes are constants, registers, labels, symbols and + memory. */ + +void +print_value_slim (FILE *f, const_rtx x, int verbose) +{ + char buf[BUF_LEN]; + + print_value (buf, x, verbose); + fprintf (f, "%s", buf); +} + /* The next step in insn detalization, its pattern recognition. */ void diff --git a/gcc/sdbout.c b/gcc/sdbout.c index 59892ba4a20..5413c6cd153 100644 --- a/gcc/sdbout.c +++ b/gcc/sdbout.c @@ -767,7 +767,7 @@ sdbout_symbol (tree decl, int local) if (REGNO (value) >= FIRST_PSEUDO_REGISTER) return; } - regno = REGNO (alter_subreg (&value)); + regno = REGNO (alter_subreg (&value, true)); SET_DECL_RTL (decl, value); } /* Don't output anything if an auto variable diff --git a/gcc/target-globals.c b/gcc/target-globals.c index e679f21614e..b3d02a1bc73 100644 --- a/gcc/target-globals.c +++ b/gcc/target-globals.c @@ -37,6 +37,7 @@ along with GCC; see the file COPYING3. If not see #include "libfuncs.h" #include "cfgloop.h" #include "ira-int.h" +#include "lra-int.h" #include "builtins.h" #include "gcse.h" #include "bb-reorder.h" @@ -55,6 +56,7 @@ struct target_globals default_target_globals = { &default_target_cfgloop, &default_target_ira, &default_target_ira_int, + &default_target_lra_int, &default_target_builtins, &default_target_gcse, &default_target_bb_reorder, diff --git a/gcc/target-globals.h b/gcc/target-globals.h index fb0f260c0c9..42f19d4f64f 100644 --- a/gcc/target-globals.h +++ b/gcc/target-globals.h @@ -32,6 +32,7 @@ extern struct target_libfuncs *this_target_libfuncs; extern struct target_cfgloop *this_target_cfgloop; extern struct target_ira *this_target_ira; extern struct target_ira_int *this_target_ira_int; +extern struct target_lra_int *this_target_lra_int; extern struct target_builtins *this_target_builtins; extern struct target_gcse *this_target_gcse; extern struct target_bb_reorder *this_target_bb_reorder; @@ -49,6 +50,7 @@ struct GTY(()) target_globals { struct target_cfgloop 
*GTY((skip)) cfgloop; struct target_ira *GTY((skip)) ira; struct target_ira_int *GTY((skip)) ira_int; + struct target_lra_int *GTY((skip)) lra_int; struct target_builtins *GTY((skip)) builtins; struct target_gcse *GTY((skip)) gcse; struct target_bb_reorder *GTY((skip)) bb_reorder; @@ -73,6 +75,7 @@ restore_target_globals (struct target_globals *g) this_target_cfgloop = g->cfgloop; this_target_ira = g->ira; this_target_ira_int = g->ira_int; + this_target_lra_int = g->lra_int; this_target_builtins = g->builtins; this_target_gcse = g->gcse; this_target_bb_reorder = g->bb_reorder; diff --git a/gcc/target.def b/gcc/target.def index 2d79290b311..586522435a2 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -2352,6 +2352,55 @@ DEFHOOK tree, (tree type, tree expr), hook_tree_tree_tree_null) +/* Return true if we use LRA instead of reload. */ +DEFHOOK +(lra_p, + "A target hook which returns true if we use LRA instead of reload pass.\ + It means that LRA was ported to the target.\ + \ + The default version of this target hook returns always false.", + bool, (void), + default_lra_p) + +/* Return register priority of given hard regno for the current target. */ +DEFHOOK +(register_priority, + "A target hook which returns the register priority number to which the\ + register @var{hard_regno} belongs to. The bigger the number, the\ + more preferable the hard register usage (when all other conditions are\ + the same). This hook can be used to prefer some hard register over\ + others in LRA. For example, some x86-64 register usage needs\ + additional prefix which makes instructions longer. The hook can\ + return lower priority number for such registers make them less favorable\ + and as result making the generated code smaller.\ + \ + The default version of this target hook returns always zero.", + int, (int), + default_register_priority) + +/* Return true if maximal address displacement can be different. 
*/ +DEFHOOK +(different_addr_displacement_p, + "A target hook which returns true if an address with the same structure\ + can have different maximal legitimate displacement. For example, the\ + displacement can depend on memory mode or on operand combinations in\ + the insn.\ + \ + The default version of this target hook returns always false.", + bool, (void), + default_different_addr_displacement_p) + +/* Determine class for spilling pseudos of given mode into registers + instead of memory. */ +DEFHOOK +(spill_class, + "This hook defines a class of registers which could be used for spilling\ + pseudos of the given mode and class, or @code{NO_REGS} if only memory\ + should be used. Not defining this hook is equivalent to returning\ + @code{NO_REGS} for all inputs.", + reg_class_t, (reg_class_t, enum machine_mode), + NULL) + /* True if a structure, union or array with MODE containing FIELD should be accessed using BLKmode. */ DEFHOOK diff --git a/gcc/targhooks.c b/gcc/targhooks.c index 265fc9840d1..be008fdcd5d 100644 --- a/gcc/targhooks.c +++ b/gcc/targhooks.c @@ -840,6 +840,24 @@ default_branch_target_register_class (void) return NO_REGS; } +extern bool +default_lra_p (void) +{ + return false; +} + +int +default_register_priority (int hard_regno ATTRIBUTE_UNUSED) +{ + return 0; +} + +extern bool +default_different_addr_displacement_p (void) +{ + return false; +} + reg_class_t default_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x ATTRIBUTE_UNUSED, reg_class_t reload_class_i ATTRIBUTE_UNUSED, diff --git a/gcc/targhooks.h b/gcc/targhooks.h index e89f096bcfb..d4196024708 100644 --- a/gcc/targhooks.h +++ b/gcc/targhooks.h @@ -132,6 +132,9 @@ extern rtx default_static_chain (const_tree, bool); extern void default_trampoline_init (rtx, tree, rtx); extern int default_return_pops_args (tree, tree, int); extern reg_class_t default_branch_target_register_class (void); +extern bool default_lra_p (void); +extern int default_register_priority (int); +extern bool 
default_different_addr_displacement_p (void); extern reg_class_t default_secondary_reload (bool, rtx, reg_class_t, enum machine_mode, secondary_reload_info *); diff --git a/gcc/timevar.def b/gcc/timevar.def index 8f99b509558..3ad8ba2d7b5 100644 --- a/gcc/timevar.def +++ b/gcc/timevar.def @@ -223,10 +223,16 @@ DEFTIMEVAR (TV_REGMOVE , "regmove") DEFTIMEVAR (TV_MODE_SWITCH , "mode switching") DEFTIMEVAR (TV_SMS , "sms modulo scheduling") DEFTIMEVAR (TV_SCHED , "scheduling") -DEFTIMEVAR (TV_IRA , "integrated RA") -DEFTIMEVAR (TV_RELOAD , "reload") +DEFTIMEVAR (TV_IRA , "integrated RA") +DEFTIMEVAR (TV_LRA , "LRA non-specific") +DEFTIMEVAR (TV_LRA_ELIMINATE , "LRA virtuals elimination") +DEFTIMEVAR (TV_LRA_INHERITANCE , "LRA reload inheritance") +DEFTIMEVAR (TV_LRA_CREATE_LIVE_RANGES, "LRA create live ranges") +DEFTIMEVAR (TV_LRA_ASSIGN , "LRA hard reg assignment") +DEFTIMEVAR (TV_LRA_COALESCE , "LRA coalesce pseudo regs") +DEFTIMEVAR (TV_RELOAD , "reload") DEFTIMEVAR (TV_RELOAD_CSE_REGS , "reload CSE regs") -DEFTIMEVAR (TV_GCSE_AFTER_RELOAD , "load CSE after reload") +DEFTIMEVAR (TV_GCSE_AFTER_RELOAD , "load CSE after reload") DEFTIMEVAR (TV_REE , "ree") DEFTIMEVAR (TV_THREAD_PROLOGUE_AND_EPILOGUE, "thread pro- & epilogue") DEFTIMEVAR (TV_IFCVT2 , "if-conversion 2") |
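
Reviewer's sketch (not part of the patch): the backward walk that feeds the `has_nonexceptional_receiver ()` check in `lra ()` may be easier to follow in isolation. The stand-alone C below mirrors the loop as it appears in this diff: start at the exit block, follow predecessor edges, mark each block once, and stop as soon as an abnormal edge is seen. The types `edge_sketch` and `block_sketch` and the function `reaches_exit_via_abnormal_edge` are hypothetical stand-ins invented for illustration; they are not GCC's `edge`, `basic_block`, or any real GCC API.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a CFG edge: only the predecessor index and
   an "abnormal" flag (the role EDGE_ABNORMAL plays in the patch).  */
struct edge_sketch
{
  int src;
  bool abnormal;
};

/* Hypothetical stand-in for a basic block: its incoming edges.  */
struct block_sketch
{
  const struct edge_sketch *preds;
  int n_preds;
};

/* Walk predecessor edges backward from block EXIT_INDEX and return
   true as soon as an abnormal edge is met.  Assumes N_BLOCKS >= 1.
   Each block is pushed at most once, so N_BLOCKS stack slots are
   enough.  */
bool
reaches_exit_via_abnormal_edge (const struct block_sketch *blocks,
                                int n_blocks, int exit_index)
{
  bool *reachable = calloc ((size_t) n_blocks, sizeof *reachable);
  int *worklist = malloc ((size_t) n_blocks * sizeof *worklist);
  int tos = 0;
  bool found = false;

  if (reachable == NULL || worklist == NULL)
    abort ();

  /* Place the exit block on the worklist.  */
  reachable[exit_index] = true;
  worklist[tos++] = exit_index;

  /* Iterate: find everything reachable from what was already seen.  */
  while (tos != 0 && !found)
    {
      const struct block_sketch *bb = &blocks[worklist[--tos]];

      for (int i = 0; i < bb->n_preds; i++)
        {
          const struct edge_sketch *e = &bb->preds[i];

          if (e->abnormal)
            {
              found = true;
              break;
            }
          if (!reachable[e->src])
            {
              reachable[e->src] = true;
              worklist[tos++] = e->src;
            }
        }
    }

  free (worklist);
  free (reachable);
  return found;
}
```

The sketch keeps the visited marks in a separate array instead of a per-block flag such as `BB_REACHABLE`, which avoids the clearing pass the patch runs over all blocks first; otherwise the control flow is meant to match the loop in the diff.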