author | amker <amker@138bc75d-0d04-0410-961f-82ee72b054a4> | 2014-11-14 02:32:38 +0000 |
---|---|---|
committer | amker <amker@138bc75d-0d04-0410-961f-82ee72b054a4> | 2014-11-14 02:32:38 +0000 |
commit | 012ad66c69df497ca32eda18561aa64c101c769a (patch) | |
tree | 07a10a70a37299c1e153e8716e560040483d0d31 /gcc/doc/tm.texi | |
parent | fb773ec95b6e64ab3ab05e0a286183ef4e48bee7 (diff) | |
download | gcc-012ad66c69df497ca32eda18561aa64c101c769a.tar.gz |
* timevar.def (TV_SCHED_FUSION): New time var.
* passes.def (pass_sched_fusion): New pass.
* config/arm/arm.c (TARGET_SCHED_FUSION_PRIORITY): New.
(extract_base_offset_in_addr, fusion_load_store): New.
(arm_sched_fusion_priority): New.
(arm_option_override): Disable scheduling fusion by default
on non-armv7 processors or ldrd/strd isn't preferred.
* sched-int.h (struct _haifa_insn_data): New field.
(INSN_FUSION_PRIORITY, FUSION_MAX_PRIORITY, sched_fusion): New.
* sched-rgn.c (rest_of_handle_sched_fusion): New.
(pass_data_sched_fusion, pass_sched_fusion): New.
(make_pass_sched_fusion): New.
* haifa-sched.c (sched_fusion): New.
(insn_cost): Handle sched_fusion.
(priority): Handle sched_fusion by calling target hook.
(enum rfs_decision): New enum value.
(rfs_str): New element for RFS_FUSION.
(rank_for_schedule): Support sched_fusion.
(schedule_insn, max_issue, prune_ready_list): Handle sched_fusion.
(schedule_block, fix_tick_ready): Handle sched_fusion.
* common.opt (flag_schedule_fusion): New.
* tree-pass.h (make_pass_sched_fusion): New.
* target.def (fusion_priority): New.
* doc/tm.texi.in (TARGET_SCHED_FUSION_PRIORITY): New.
* doc/tm.texi: Regenerated.
* doc/invoke.texi (-fschedule-fusion): New.
testsuite:
* gcc.target/arm/ldrd-strd-pair-1.c: New test.
* gcc.target/arm/vfp-1.c: Improve scanning string.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@217533 138bc75d-0d04-0410-961f-82ee72b054a4
Diffstat (limited to 'gcc/doc/tm.texi')
-rw-r--r-- | gcc/doc/tm.texi | 70 |
1 files changed, 70 insertions, 0 deletions
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 6e2825f2e01..8d137f5cf1e 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6771,6 +6771,76 @@ This hook is called by tree reassociator to determine a level of
 parallelism required in output calculations chain.
 @end deftypefn
 
+@deftypefn {Target Hook} void TARGET_SCHED_FUSION_PRIORITY (rtx_insn *@var{insn}, int @var{max_pri}, int *@var{fusion_pri}, int *@var{pri})
+This hook is called by scheduling fusion pass. It calculates fusion
+priorities for each instruction passed in by parameter. The priorities
+are returned via pointer parameters.
+
+@var{insn} is the instruction whose priorities need to be calculated.
+@var{max_pri} is the maximum priority can be returned in any cases.
+@var{fusion_pri} is the pointer parameter through which @var{insn}'s
+fusion priority should be calculated and returned.
+@var{pri} is the pointer parameter through which @var{insn}'s priority
+should be calculated and returned.
+
+Same @var{fusion_pri} should be returned for instructions which should
+be scheduled together. Different @var{pri} should be returned for
+instructions with same @var{fusion_pri}. @var{fusion_pri} is the major
+sort key, @var{pri} is the minor sort key. All instructions will be
+scheduled according to the two priorities. All priorities calculated
+should be between 0 (exclusive) and @var{max_pri} (inclusive). To avoid
+false dependencies, @var{fusion_pri} of instructions which need to be
+scheduled together should be smaller than @var{fusion_pri} of irrelevant
+instructions.
+
+Given below example:
+
+    ldr r10, [r1, 4]
+    add r4, r4, r10
+    ldr r15, [r2, 8]
+    sub r5, r5, r15
+    ldr r11, [r1, 0]
+    add r4, r4, r11
+    ldr r16, [r2, 12]
+    sub r5, r5, r16
+
+On targets like ARM/AArch64, the two pairs of consecutive loads should be
+merged. Since peephole2 pass can't help in this case unless consecutive
+loads are actually next to each other in instruction flow. That's where
+this scheduling fusion pass works. This hook calculates priority for each
+instruction based on its fustion type, like:
+
+    ldr r10, [r1, 4]   ; fusion_pri=99, pri=96
+    add r4, r4, r10    ; fusion_pri=100, pri=100
+    ldr r15, [r2, 8]   ; fusion_pri=98, pri=92
+    sub r5, r5, r15    ; fusion_pri=100, pri=100
+    ldr r11, [r1, 0]   ; fusion_pri=99, pri=100
+    add r4, r4, r11    ; fusion_pri=100, pri=100
+    ldr r16, [r2, 12]  ; fusion_pri=98, pri=88
+    sub r5, r5, r16    ; fusion_pri=100, pri=100
+
+Scheduling fusion pass then sorts all ready to issue instructions according
+to the priorities. As a result, instructions of same fusion type will be
+pushed together in instruction flow, like:
+
+    ldr r11, [r1, 0]
+    ldr r10, [r1, 4]
+    ldr r15, [r2, 8]
+    ldr r16, [r2, 12]
+    add r4, r4, r10
+    sub r5, r5, r15
+    add r4, r4, r11
+    sub r5, r5, r16
+
+Now peephole2 pass can simply merge the two pairs of loads.
+
+Since scheduling fusion pass relies on peephole2 to do real fusion
+work, it is only enabled by default when peephole2 is in effect.
+
+This is firstly introduced on ARM/AArch64 targets, please refer to
+the hook implementation for how different fusion types are supported.
+@end deftypefn
+
 @node Sections
 @section Dividing the Output into Sections (Texts, Data, @dots{})
 @c the above section title is WAY too long. maybe cut the part between
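
The two-key ordering that the hook documentation in this patch describes can be illustrated with a small standalone C sketch. It is not the ARM implementation added by this commit and it does not use GCC's internal types: `struct toy_insn`, `toy_fusion_priority`, `toy_compare` and `toy_sort_by_priority` are hypothetical names invented for illustration. The formulas (`max_pri - base_reg` and `max_pri - offset`) are an assumption chosen only because they reproduce the numbers in the documentation's example with `max_pri` equal to 100. The sketch sorts in descending order of `fusion_pri`, then `pri`, which matches the order of the four loads in that example; it ignores dependencies and says nothing about where the arithmetic instructions end up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an instruction; GCC's real type is rtx_insn.  */
struct toy_insn
{
  const char *text;  /* Assembly text, for display only.  */
  int is_load;       /* Nonzero for a load that is a fusion candidate.  */
  int base_reg;      /* Base register number of the address (assumed >= 1).  */
  int offset;        /* Constant offset of the address (assumed >= 0).  */
  int fusion_pri;    /* Major sort key, filled in by the toy hook.  */
  int pri;           /* Minor sort key, filled in by the toy hook.  */
};

/* Toy priority function with the same parameter shape as the hook.
   The formulas are illustrative assumptions, not GCC's.  */
void
toy_fusion_priority (const struct toy_insn *insn, int max_pri,
                     int *fusion_pri, int *pri)
{
  if (!insn->is_load)
    {
      /* Irrelevant instructions get the largest fusion priority.  */
      *fusion_pri = max_pri;
      *pri = max_pri;
      return;
    }

  /* Loads that share a base register share a fusion priority, and it is
     smaller than the one given to irrelevant instructions.  */
  *fusion_pri = max_pri - insn->base_reg;

  /* Within one group, a smaller offset gets a larger priority.  */
  *pri = max_pri - insn->offset;

  /* Both values must lie in the range (0, max_pri].  */
  assert (*fusion_pri > 0 && *fusion_pri <= max_pri);
  assert (*pri > 0 && *pri <= max_pri);
}

/* qsort comparator: larger fusion_pri first, then larger pri first.  */
int
toy_compare (const void *a, const void *b)
{
  const struct toy_insn *x = (const struct toy_insn *) a;
  const struct toy_insn *y = (const struct toy_insn *) b;

  if (x->fusion_pri != y->fusion_pri)
    return y->fusion_pri - x->fusion_pri;
  return y->pri - x->pri;
}

/* Compute both priorities for every instruction, then sort by them.  */
void
toy_sort_by_priority (struct toy_insn *insns, size_t n, int max_pri)
{
  for (size_t i = 0; i < n; i++)
    toy_fusion_priority (&insns[i], max_pri,
                         &insns[i].fusion_pri, &insns[i].pri);
  qsort (insns, n, sizeof insns[0], toy_compare);
}
```

Calling `toy_sort_by_priority` on the four loads from the example with `max_pri` set to 100 places the two `r1` loads (offsets 0 and 4) ahead of the two `r2` loads (offsets 8 and 12), so each pair that could be merged ends up adjacent.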