author | danglin <danglin@138bc75d-0d04-0410-961f-82ee72b054a4> | 2002-10-31 03:13:44 +0000 |
---|---|---|
committer | danglin <danglin@138bc75d-0d04-0410-961f-82ee72b054a4> | 2002-10-31 03:13:44 +0000 |
commit | ece8882131d54884edc90a37d43a8588fab2c7a3 (patch) | |
tree | f47286e288d980615f8a9fbf886fb3a2f0dffa4e | |
parent | 981a251cec2130deed33ce9eac028501a47343c7 (diff) | |
* pa-linux.h (ASM_OUTPUT_EXTERNAL_LIBCALL): Define.
* pa-protos.h (attr_length_millicode_call, attr_length_call,
pa_init_machine_status): Declare new global functions.
* pa.c (void copy_fp_args, length_fp_args, get_plabel): Declare and
implement new functions.
(attr_length_millicode_call, attr_length_call): Implement.
(total_code_bytes): Change type to long.
(pa_output_function_prologue): Compute total_code_bytes on TARGET_64BIT.
Reset counter if flag_function_sections.
(output_deferred_plabels): Set output alignment to 3 for TARGET_64BIT.
(output_cbranch): Move call to gen_label_rtx.
(output_millicode_call): Rewrite adding long TARGET_64BIT call, expose
delay slot in all variants, shorten pc-relative calls.
(output_call): Rewrite adding long TARGET_64BIT call, improved delay
slot usage and exposure, various new call variants, and shortened
sequences for some variants on TARGET_PA_20.
Miscellaneous format changes.
* pa.h (total_code_bytes): Change type to long.
(MASK_LONG_CALLS, TARGET_LONG_CALLS, TARGET_LONG_ABS_CALL,
TARGET_LONG_PIC_SDIFF_CALL, TARGET_LONG_PIC_PCREL_CALL): Define.
(TARGET_SWITCHES): Add "-mlong-calls" and "-mno-long-calls" options.
(EXTRA_CONSTRAINT, GO_IF_LEGITIMATE_ADDRESS,
LEGITIMIZE_RELOAD_ADDRESS): Don't use long floating point loads and
stores on TARGET_ELF32.
* pa.md (define_delay): Allow insns in delay on TARGET_PORTABLE_RUNTIME.
(unnamed patterns for mulsi3, divsi3, udivsi3, modsi3, umodsi3 and
canonicalize_funcptr_for_compare expanders): Calculate attribute length
using attr_length_millicode_call().
(call_internal_symref, call_value_internal_symref): Clobber register 1.
Calculate attribute length using attr_length_call().
(call_internal_reg_64bit, call_value_internal_reg_64bit): Move gp load
to delay slot.
(sibcall, sibcall_value): Rewrite.
(sibcall_internal_symref, sibcall_value_internal_symref): Clobber
register 1. Use attr_length_call().
(sibcall_internal_symref_64bit, sibcall_value_internal_symref_64bit):
New patterns.
(unnamed pattern for canonicalize_funcptr_for_compare): Rewrite.
* som.h (MEMBER_TYPE_FORCES_BLK): Define.
* t-pa64 (TARGET_LIBGCC2_CFLAGS): Add "-mlong-calls".
* doc/invoke.texi (mlong-calls): Document.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@58665 138bc75d-0d04-0410-961f-82ee72b054a4
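
The length estimation that attr_length_millicode_call performs can be sketched outside of GCC as follows. This is a minimal standalone illustration, not the committed code: `struct call_opts`, `call_distance` and `millicode_call_length` are hypothetical names standing in for the TARGET_* macros, flag_pic and the rtx-based interface. The thresholds and byte counts are the ones that appear in the pa.c hunk of this commit. The point of interest is the saturation step: if adding the instruction address to total_code_bytes wraps around, the distance is forced to the maximum value so that the long sequence is always chosen.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GCC target flags used by the real code
   (TARGET_64BIT, TARGET_PORTABLE_RUNTIME, TARGET_LONG_CALLS,
   TARGET_LONG_ABS_CALL, flag_pic).  */
struct call_opts
{
  bool target_64bit;
  bool portable_runtime;
  bool long_calls;
  bool long_abs_call;
  bool pic;
};

/* Conservative distance from the start of the code subspace.  If the
   unsigned addition wraps, saturate so the call is treated as out of
   reach.  */
unsigned long
call_distance (unsigned long total_code_bytes, unsigned long insn_address)
{
  unsigned long distance = total_code_bytes + insn_address;

  if (distance < total_code_bytes)
    distance = ULONG_MAX;

  return distance;
}

/* Estimated length in bytes of a millicode call, given the base LENGTH
   of the insn.  Mirrors the decision order in the diff: 64-bit first,
   then portable runtime, then the 32-bit short/absolute/PIC cases.  */
int
millicode_call_length (const struct call_opts *o, unsigned long distance,
                       int length)
{
  if (o->target_64bit)
    {
      if (!o->long_calls && distance < 7600000)
        return length + 8;

      return length + 20;
    }
  else if (o->portable_runtime)
    return length + 24;
  else
    {
      if (!o->long_calls && distance < 240000)
        return length + 8;

      if (o->long_abs_call && !o->pic)
        return length + 12;

      return length + 24;
    }
}
```

Over-estimating is the safe direction here: a length that is too large only costs a few bytes of slack in branch shortening, while one that is too small could select a branch form that cannot reach its target.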
-rw-r--r-- | gcc/ChangeLog | 44 |
-rw-r--r-- | gcc/config/pa/pa-linux.h | 13 |
-rw-r--r-- | gcc/config/pa/pa-protos.h | 2 |
-rw-r--r-- | gcc/config/pa/pa.c | 920 |
-rw-r--r-- | gcc/config/pa/pa.h | 49 |
-rw-r--r-- | gcc/config/pa/pa.md | 372 |
-rw-r--r-- | gcc/config/pa/som.h | 4 |
-rw-r--r-- | gcc/config/pa/t-pa64 | 2 |
-rw-r--r-- | gcc/doc/invoke.texi | 29 |
9 files changed, 816 insertions, 619 deletions
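
The get_plabel helper that this commit factors out of output_call is a find-or-append lookup over a small growable array. The sketch below shows that pattern in plain C under stated assumptions: it uses malloc/realloc instead of GCC's ggc_alloc/ggc_realloc, an integer counter instead of gen_label_rtx, and the names `struct plabel` and `find_or_add_plabel` are hypothetical. It omits the TREE_SYMBOL_REFERENCED marking that the real function performs.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical simplified entry: the real struct deferred_plabel holds
   an rtx internal label rather than an int.  */
struct plabel
{
  char *name;
  int label;
};

static struct plabel *plabels;  /* growable array, NULL when empty */
static size_t n_plabels;        /* number of entries in use */
static int next_label;          /* stand-in for gen_label_rtx () */

/* Return the entry for FNAME, creating it if it is not on the list.
   The list is expected to stay small, so a linear search is adequate.
   Returns NULL on allocation failure.  The returned pointer may be
   invalidated by a later call that grows the array.  */
struct plabel *
find_or_add_plabel (const char *fname)
{
  size_t i;

  for (i = 0; i < n_plabels; i++)
    if (strcmp (fname, plabels[i].name) == 0)
      return &plabels[i];

  /* Not found: grow the array by one slot.  */
  struct plabel *grown = realloc (plabels, (n_plabels + 1) * sizeof *plabels);
  if (grown == NULL)
    return NULL;
  plabels = grown;

  /* Keep a private copy of the name, as ggc_strdup does in the diff.  */
  size_t len = strlen (fname) + 1;
  char *copy = malloc (len);
  if (copy == NULL)
    return NULL;
  memcpy (copy, fname, len);

  plabels[n_plabels].name = copy;
  plabels[n_plabels].label = next_label++;
  return &plabels[n_plabels++];
}
```

Calling the function twice with the same name yields the same label, so each external function gets at most one deferred plabel no matter how many long calls reference it.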
diff --git a/gcc/ChangeLog b/gcc/ChangeLog index c0798dfbff2..4031d7bf863 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,47 @@ +2002-10-30 John David Anglin <dave@hiauly.hia.nrc.ca> + + * pa-linux.h (ASM_OUTPUT_EXTERNAL_LIBCALL): Define. + * pa-protos.h (attr_length_millicode_call, attr_length_call, + pa_init_machine_status): Declare new global functions. + * pa.c (void copy_fp_args, length_fp_args, get_plabel): Declare and + implement new functions. + (attr_length_millicode_call, attr_length_call): Implement. + (total_code_bytes): Change type to long. + (pa_output_function_prologue): Compute total_code_bytes on TARGET_64BIT. + Reset counter if flag_function_sections. + (output_deferred_plabels): Set output alignment to 3 for TARGET_64BIT. + (output_cbranch): Move call to gen_label_rtx. + (output_millicode_call): Rewrite adding long TARGET_64BIT call, expose + delay slot in all variants, shorten pc-relative calls. + (output_call): Rewrite adding long TARGET_64BIT call, improved delay + slot usage and exposure, various new call variants, and shortened + sequences for some variants on TARGET_PA_20. + Miscellaneous format changes. + * pa.h (total_code_bytes): Change type to long. + (MASK_LONG_CALLS, TARGET_LONG_CALLS, TARGET_LONG_ABS_CALL, + TARGET_LONG_PIC_SDIFF_CALL, TARGET_LONG_PIC_PCREL_CALL): Define. + (TARGET_SWITCHES): Add "-mlong-calls" and "-mno-long-calls" options. + (EXTRA_CONSTRAINT, GO_IF_LEGITIMATE_ADDRESS, + LEGITIMIZE_RELOAD_ADDRESS): Don't use long floating point loads and + stores on TARGET_ELF32. + *pa.md (define_delay): Allow insns in delay on TARGET_PORTABLE_RUNTIME. + (unnamed patterns for mulsi3, divsi3, udivsi3, modsi3, umodsi3 and + canonicalize_funcptr_for_compare expanders): Calculate attribute length + attr_length_millicode_call(). + (call_internal_symref, call_value_internal_symref): Clobber register 1. + Calculate attribute length using attr_length_call(). 
+ (call_internal_reg_64bit, call_value_internal_reg_64bit): Move gp load + to delay slot. + (sibcall, sibcall_value): Rewrite. + (sibcall_internal_symref, sibcall_value_internal_symref): Clobber + register 1. Use attr_length_call(). + (sibcall_internal_symref_64bit, sibcall_value_internal_symref_64bit): + New patterns. + (unamed pattern for canonicalize_funcptr_for_compare): Rewrite. + * som.h (MEMBER_TYPE_FORCES_BLK): Define. + * t-pa64 (TARGET_LIBGCC2_CFLAGS): Add "-mlong-calls". + * doc/invoke.texi (mlong-calls): Document. + 2002-10-30 Roger Sayle <roger@eyesopen.com> * fold-const.c (fold_binary_op_with_conditional_arg): Improve diff --git a/gcc/config/pa/pa-linux.h b/gcc/config/pa/pa-linux.h index cdc201d5f59..7f372faa8d3 100644 --- a/gcc/config/pa/pa-linux.h +++ b/gcc/config/pa/pa-linux.h @@ -196,6 +196,19 @@ Boston, MA 02111-1307, USA. */ } \ while (0) +/* As well as globalizing the label, we need to encode the label + to ensure a plabel is generated in an indirect call. */ + +#undef ASM_OUTPUT_EXTERNAL_LIBCALL +#define ASM_OUTPUT_EXTERNAL_LIBCALL(FILE, FUN) \ + do \ + { \ + if (!FUNCTION_NAME_P (XSTR (FUN, 0))) \ + hppa_encode_label (FUN); \ + (*targetm.asm_out.globalize_label) (FILE, XSTR (FUN, 0)); \ + } \ + while (0) + /* Linux always uses gas. */ #undef TARGET_GAS #define TARGET_GAS 1 diff --git a/gcc/config/pa/pa-protos.h b/gcc/config/pa/pa-protos.h index ca115fb55fc..5d1ab111d22 100644 --- a/gcc/config/pa/pa-protos.h +++ b/gcc/config/pa/pa-protos.h @@ -105,6 +105,8 @@ extern int jump_in_call_delay PARAMS ((rtx)); extern enum reg_class secondary_reload_class PARAMS ((enum reg_class, enum machine_mode, rtx)); extern int hppa_fpstore_bypass_p PARAMS ((rtx, rtx)); +extern int attr_length_millicode_call PARAMS ((rtx, int)); +extern int attr_length_call PARAMS ((rtx, int)); /* Declare functions defined in pa.c and used in templates. 
*/ diff --git a/gcc/config/pa/pa.c b/gcc/config/pa/pa.c index 58ba157fc24..b51e946852c 100644 --- a/gcc/config/pa/pa.c +++ b/gcc/config/pa/pa.c @@ -121,11 +121,13 @@ static void pa_globalize_label PARAMS ((FILE *, const char *)) ATTRIBUTE_UNUSED; static void pa_asm_output_mi_thunk PARAMS ((FILE *, tree, HOST_WIDE_INT, HOST_WIDE_INT, tree)); - +static void copy_fp_args PARAMS ((rtx)) ATTRIBUTE_UNUSED; +static int length_fp_args PARAMS ((rtx)) ATTRIBUTE_UNUSED; +static struct deferred_plabel *get_plabel PARAMS ((const char *)) + ATTRIBUTE_UNUSED; /* Save the operands last given to a compare for use when we generate a scc or bcc insn. */ - rtx hppa_compare_op0, hppa_compare_op1; enum cmp_type hppa_branch_type; @@ -149,12 +151,10 @@ static rtx find_addr_reg PARAMS ((rtx)); /* Keep track of the number of bytes we have output in the CODE subspaces during this compilation so we'll know when to emit inline long-calls. */ - -unsigned int total_code_bytes; +unsigned long total_code_bytes; /* Variables to handle plabels that we discover are necessary at assembly output time. They are output after the current function. */ - struct deferred_plabel GTY(()) { rtx internal_label; @@ -3197,14 +3197,14 @@ pa_output_function_prologue (file, size) fputs ("\n\t.ENTRY\n", file); /* If we're using GAS and SOM, and not using the portable runtime model, - then we don't need to accumulate the total number of code bytes. */ + or function sections, then we don't need to accumulate the total number + of code bytes. */ if ((TARGET_GAS && TARGET_SOM && ! TARGET_PORTABLE_RUNTIME) - /* FIXME: we can't handle long calls for TARGET_64BIT. 
*/ - || TARGET_64BIT) + || flag_function_sections) total_code_bytes = 0; else if (INSN_ADDRESSES_SET_P ()) { - unsigned int old_total = total_code_bytes; + unsigned long old_total = total_code_bytes; total_code_bytes += INSN_ADDRESSES (INSN_UID (get_last_nonnote_insn ())); total_code_bytes += FUNCTION_BOUNDARY / BITS_PER_UNIT; @@ -4726,6 +4726,47 @@ output_global_address (file, x, round_constant) output_addr_const (file, x); } +static struct deferred_plabel * +get_plabel (fname) + const char *fname; +{ + size_t i; + + /* See if we have already put this function on the list of deferred + plabels. This list is generally small, so a liner search is not + too ugly. If it proves too slow replace it with something faster. */ + for (i = 0; i < n_deferred_plabels; i++) + if (strcmp (fname, deferred_plabels[i].name) == 0) + break; + + /* If the deferred plabel list is empty, or this entry was not found + on the list, create a new entry on the list. */ + if (deferred_plabels == NULL || i == n_deferred_plabels) + { + const char *real_name; + + if (deferred_plabels == 0) + deferred_plabels = (struct deferred_plabel *) + ggc_alloc (sizeof (struct deferred_plabel)); + else + deferred_plabels = (struct deferred_plabel *) + ggc_realloc (deferred_plabels, + ((n_deferred_plabels + 1) + * sizeof (struct deferred_plabel))); + + i = n_deferred_plabels++; + deferred_plabels[i].internal_label = gen_label_rtx (); + deferred_plabels[i].name = ggc_strdup (fname); + + /* Gross. We have just implicitly taken the address of this function, + mark it as such. */ + real_name = (*targetm.strip_name_encoding) (fname); + TREE_SYMBOL_REFERENCED (get_identifier (real_name)) = 1; + } + + return &deferred_plabels[i]; +} + void output_deferred_plabels (file) FILE *file; @@ -4737,7 +4778,7 @@ output_deferred_plabels (file) if (n_deferred_plabels) { data_section (); - ASM_OUTPUT_ALIGN (file, 2); + ASM_OUTPUT_ALIGN (file, TARGET_64BIT ? 3 : 2); } /* Now output the deferred plabels. 
*/ @@ -5323,9 +5364,9 @@ hppa_va_arg (valist, type) const char * output_cbranch (operands, nullify, length, negated, insn) - rtx *operands; - int nullify, length, negated; - rtx insn; + rtx *operands; + int nullify, length, negated; + rtx insn; { static char buf[100]; int useskip = 0; @@ -5499,12 +5540,11 @@ output_cbranch (operands, nullify, length, negated, insn) xoperands[1] = operands[1]; xoperands[2] = operands[2]; xoperands[3] = operands[3]; - if (TARGET_SOM || ! TARGET_GAS) - xoperands[4] = gen_label_rtx (); output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); - if (TARGET_SOM || ! TARGET_GAS) + if (TARGET_SOM || !TARGET_GAS) { + xoperands[4] = gen_label_rtx (); output_asm_insn ("addil L'%l0-%l4,%%r1", xoperands); ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", CODE_LABEL_NUMBER (xoperands[4])); @@ -5536,10 +5576,10 @@ output_cbranch (operands, nullify, length, negated, insn) const char * output_bb (operands, nullify, length, negated, insn, which) - rtx *operands ATTRIBUTE_UNUSED; - int nullify, length, negated; - rtx insn; - int which; + rtx *operands ATTRIBUTE_UNUSED; + int nullify, length, negated; + rtx insn; + int which; { static char buf[100]; int useskip = 0; @@ -5684,10 +5724,10 @@ output_bb (operands, nullify, length, negated, insn, which) const char * output_bvb (operands, nullify, length, negated, insn, which) - rtx *operands ATTRIBUTE_UNUSED; - int nullify, length, negated; - rtx insn; - int which; + rtx *operands ATTRIBUTE_UNUSED; + int nullify, length, negated; + rtx insn; + int which; { static char buf[100]; int useskip = 0; @@ -6043,442 +6083,594 @@ output_movb (operands, insn, which_alternative, reverse_comparison) } } +/* Copy any FP arguments in INSN into integer registers. */ +static void +copy_fp_args (insn) + rtx insn; +{ + rtx link; + rtx xoperands[2]; -/* INSN is a millicode call. It may have an unconditional jump in its delay - slot. - - CALL_DEST is the routine we are calling. 
*/ + for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1)) + { + int arg_mode, regno; + rtx use = XEXP (link, 0); -const char * -output_millicode_call (insn, call_dest) - rtx insn; - rtx call_dest; -{ - int attr_length = get_attr_length (insn); - int seq_length = dbr_sequence_length (); - int distance; - rtx xoperands[4]; - rtx seq_insn; + if (! (GET_CODE (use) == USE + && GET_CODE (XEXP (use, 0)) == REG + && FUNCTION_ARG_REGNO_P (REGNO (XEXP (use, 0))))) + continue; - xoperands[3] = gen_rtx_REG (Pmode, TARGET_64BIT ? 2 : 31); + arg_mode = GET_MODE (XEXP (use, 0)); + regno = REGNO (XEXP (use, 0)); - /* Handle common case -- empty delay slot or no jump in the delay slot, - and we're sure that the branch will reach the beginning of the $CODE$ - subspace. The within reach form of the $$sh_func_adrs call has - a length of 28 and attribute type of multi. This length is the - same as the maximum length of an out of reach PIC call to $$div. */ - if ((seq_length == 0 - && (attr_length == 8 - || (attr_length == 28 && get_attr_type (insn) == TYPE_MULTI))) - || (seq_length != 0 - && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN - && attr_length == 4)) - { - xoperands[0] = call_dest; - output_asm_insn ("{bl|b,l} %0,%3%#", xoperands); - return ""; + /* Is it a floating point register? */ + if (regno >= 32 && regno <= 39) + { + /* Copy the FP register into an integer register via memory. 
*/ + if (arg_mode == SFmode) + { + xoperands[0] = XEXP (use, 0); + xoperands[1] = gen_rtx_REG (SImode, 26 - (regno - 32) / 2); + output_asm_insn ("{fstws|fstw} %0,-16(%%sr0,%%r30)", xoperands); + output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands); + } + else + { + xoperands[0] = XEXP (use, 0); + xoperands[1] = gen_rtx_REG (DImode, 25 - (regno - 34) / 2); + output_asm_insn ("{fstds|fstd} %0,-16(%%sr0,%%r30)", xoperands); + output_asm_insn ("ldw -12(%%sr0,%%r30),%R1", xoperands); + output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands); + } + } } +} + +/* Compute length of the FP argument copy sequence for INSN. */ +static int +length_fp_args (insn) + rtx insn; +{ + int length = 0; + rtx link; - /* This call may not reach the beginning of the $CODE$ subspace. */ - if (attr_length > 8) + for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1)) { - int delay_insn_deleted = 0; + int arg_mode, regno; + rtx use = XEXP (link, 0); - /* We need to emit an inline long-call branch. */ - if (seq_length != 0 - && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN) + if (! (GET_CODE (use) == USE + && GET_CODE (XEXP (use, 0)) == REG + && FUNCTION_ARG_REGNO_P (REGNO (XEXP (use, 0))))) + continue; + + arg_mode = GET_MODE (XEXP (use, 0)); + regno = REGNO (XEXP (use, 0)); + + /* Is it a floating point register? */ + if (regno >= 32 && regno <= 39) { - /* A non-jump insn in the delay slot. By definition we can - emit this insn before the call. */ - final_scan_insn (NEXT_INSN (insn), asm_out_file, optimize, 0, 0); - - /* Now delete the delay insn. */ - PUT_CODE (NEXT_INSN (insn), NOTE); - NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; - NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; - delay_insn_deleted = 1; + if (arg_mode == SFmode) + length += 8; + else + length += 12; } + } - /* PIC long millicode call sequence. */ - if (flag_pic) - { - xoperands[0] = call_dest; - if (TARGET_SOM || ! 
TARGET_GAS) - xoperands[1] = gen_label_rtx (); + return length; +} - /* Get our address + 8 into %r1. */ - output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); +/* We include the delay slot in the returned length as it is better to + over estimate the length than to under estimate it. */ - if (TARGET_SOM || ! TARGET_GAS) - { - /* Add %r1 to the offset of our target from the next insn. */ - output_asm_insn ("addil L%%%0-%1,%%r1", xoperands); - ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", - CODE_LABEL_NUMBER (xoperands[1])); - output_asm_insn ("ldo R%%%0-%1(%%r1),%%r1", xoperands); - } - else - { - output_asm_insn ("addil L%%%0-$PIC_pcrel$0+4,%%r1", xoperands); - output_asm_insn ("ldo R%%%0-$PIC_pcrel$0+8(%%r1),%%r1", - xoperands); - } +int +attr_length_millicode_call (insn, length) + rtx insn; + int length; +{ + unsigned long distance = total_code_bytes + INSN_ADDRESSES (INSN_UID (insn)); + + if (distance < total_code_bytes) + distance = -1; + + if (TARGET_64BIT) + { + if (!TARGET_LONG_CALLS && distance < 7600000) + return length + 8; + + return length + 20; + } + else if (TARGET_PORTABLE_RUNTIME) + return length + 24; + else + { + if (!TARGET_LONG_CALLS && distance < 240000) + return length + 8; + + if (TARGET_LONG_ABS_CALL && !flag_pic) + return length + 12; + + return length + 24; + } +} + +/* INSN is a function call. It may have an unconditional jump + in its delay slot. - /* Get the return address into %r31. */ - output_asm_insn ("blr 0,%3", xoperands); + CALL_DEST is the routine we are calling. */ - /* Branch to our target which is in %r1. */ - output_asm_insn ("bv,n %%r0(%%r1)", xoperands); +const char * +output_millicode_call (insn, call_dest) + rtx insn; + rtx call_dest; +{ + int attr_length = get_attr_length (insn); + int seq_length = dbr_sequence_length (); + int distance; + rtx seq_insn; + rtx xoperands[3]; - /* Empty delay slot. Note this insn gets fetched twice and - executed once. To be safe we use a nop. 
*/ - output_asm_insn ("nop", xoperands); + xoperands[0] = call_dest; + xoperands[2] = gen_rtx_REG (Pmode, TARGET_64BIT ? 2 : 31); + + /* Handle the common case where we are sure that the branch will + reach the beginning of the $CODE$ subspace. The within reach + form of the $$sh_func_adrs call has a length of 28. Because + it has an attribute type of multi, it never has a non-zero + sequence length. The length of the $$sh_func_adrs is the same + as certain out of reach PIC calls to other routines. */ + if (!TARGET_LONG_CALLS + && ((seq_length == 0 + && (attr_length == 12 + || (attr_length == 28 && get_attr_type (insn) == TYPE_MULTI))) + || (seq_length != 0 && attr_length == 8))) + { + output_asm_insn ("{bl|b,l} %0,%2", xoperands); + } + else + { + if (TARGET_64BIT) + { + /* It might seem that one insn could be saved by accessing + the millicode function using the linkage table. However, + this doesn't work in shared libraries and other dynamically + loaded objects. Using a pc-relative sequence also avoids + problems related to the implicit use of the gp register. */ + output_asm_insn ("b,l .+8,%%r1", xoperands); + output_asm_insn ("addil L'%0-$PIC_pcrel$0+4,%%r1", xoperands); + output_asm_insn ("ldo R'%0-$PIC_pcrel$0+8(%%r1),%%r1", xoperands); + output_asm_insn ("bve,l (%%r1),%%r2", xoperands); } - /* Pure portable runtime doesn't allow be/ble; we also don't have - PIC support in the assembler/linker, so this sequence is needed. */ else if (TARGET_PORTABLE_RUNTIME) { - xoperands[0] = call_dest; - /* Get the address of our target into %r29. */ - output_asm_insn ("ldil L%%%0,%%r29", xoperands); - output_asm_insn ("ldo R%%%0(%%r29),%%r29", xoperands); + /* Pure portable runtime doesn't allow be/ble; we also don't + have PIC support in the assembler/linker, so this sequence + is needed. */ - /* Get our return address into %r31. */ - output_asm_insn ("blr %%r0,%3", xoperands); + /* Get the address of our target into %r1. 
*/ + output_asm_insn ("ldil L'%0,%%r1", xoperands); + output_asm_insn ("ldo R'%0(%%r1),%%r1", xoperands); - /* Jump to our target address in %r29. */ - output_asm_insn ("bv,n %%r0(%%r29)", xoperands); + /* Get our return address into %r31. */ + output_asm_insn ("{bl|b,l} .+8,%%r31", xoperands); + output_asm_insn ("addi 8,%%r31,%%r31", xoperands); - /* Empty delay slot. Note this insn gets fetched twice and - executed once. To be safe we use a nop. */ - output_asm_insn ("nop", xoperands); + /* Jump to our target address in %r1. */ + output_asm_insn ("bv %%r0(%%r1)", xoperands); } - /* If we're allowed to use be/ble instructions, then this is the - best sequence to use for a long millicode call. */ - else + else if (!flag_pic) { - xoperands[0] = call_dest; - output_asm_insn ("ldil L%%%0,%3", xoperands); + output_asm_insn ("ldil L'%0,%%r1", xoperands); if (TARGET_PA_20) - output_asm_insn ("be,l R%%%0(%%sr4,%3),%%sr0,%%r31", xoperands); + output_asm_insn ("be,l R'%0(%%sr4,%%r1),%%sr0,%%r31", xoperands); else - output_asm_insn ("ble R%%%0(%%sr4,%3)", xoperands); - output_asm_insn ("nop", xoperands); + output_asm_insn ("ble R'%0(%%sr4,%%r1)", xoperands); } - - /* If we had a jump in the call's delay slot, output it now. */ - if (seq_length != 0 && !delay_insn_deleted) + else { - xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - output_asm_insn ("b,n %0", xoperands); + if (TARGET_SOM || !TARGET_GAS) + { + /* The HP assembler can generate relocations for the + difference of two symbols. GAS can do this for a + millicode symbol but not an arbitrary external + symbol when generating SOM output. 
*/ + xoperands[1] = gen_label_rtx (); + output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); + output_asm_insn ("addi 16,%%r1,%%r31", xoperands); + ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", + CODE_LABEL_NUMBER (xoperands[1])); + output_asm_insn ("addil L'%0-%l1,%%r1", xoperands); + output_asm_insn ("ldo R'%0-%l1(%%r1),%%r1", xoperands); + } + else + { + output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); + output_asm_insn ("addi 16,%%r1,%%r31", xoperands); + output_asm_insn ("addil L'%0-$PIC_pcrel$0+8,%%r1", xoperands); + output_asm_insn ("ldo R'%0-$PIC_pcrel$0+12(%%r1),%%r1", + xoperands); + } - /* Now delete the delay insn. */ - PUT_CODE (NEXT_INSN (insn), NOTE); - NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; - NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + /* Jump to our target address in %r1. */ + output_asm_insn ("bv %%r0(%%r1)", xoperands); } - return ""; } - /* This call has an unconditional jump in its delay slot and the - call is known to reach its target or the beginning of the current - subspace. */ + if (seq_length == 0) + output_asm_insn ("nop", xoperands); - /* Use the containing sequence insn's address. */ - seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0))); - - distance = INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn)))) - - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8; + /* We are done if there isn't a jump in the delay slot. */ + if (seq_length == 0 || GET_CODE (NEXT_INSN (insn)) != JUMP_INSN) + return ""; - /* If the branch was too far away, emit a normal call followed - by a nop, followed by the unconditional branch. + /* This call has an unconditional jump in its delay slot. */ + xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - If the branch is close, then adjust %r2 from within the - call's delay slot. */ + /* See if the return address can be adjusted. Use the containing + sequence insn's address. 
*/ + seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0))); + distance = (INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn)))) + - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8); - xoperands[0] = call_dest; - xoperands[1] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - if (! VAL_14_BITS_P (distance)) - output_asm_insn ("{bl|b,l} %0,%3\n\tnop\n\tb,n %1", xoperands); - else + if (VAL_14_BITS_P (distance)) { - xoperands[2] = gen_label_rtx (); - output_asm_insn ("\n\t{bl|b,l} %0,%3\n\tldo %1-%2(%3),%3", - xoperands); + xoperands[1] = gen_label_rtx (); + output_asm_insn ("ldo %0-%1(%2),%2", xoperands); ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", - CODE_LABEL_NUMBER (xoperands[2])); + CODE_LABEL_NUMBER (xoperands[3])); } + else + /* ??? This branch may not reach its target. */ + output_asm_insn ("nop\n\tb,n %0", xoperands); /* Delete the jump. */ PUT_CODE (NEXT_INSN (insn), NOTE); NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + return ""; } -/* INSN is either a function call. It may have an unconditional jump +/* We include the delay slot in the returned length as it is better to + over estimate the length than to under estimate it. */ + +int +attr_length_call (insn, sibcall) + rtx insn; + int sibcall; +{ + unsigned long distance = total_code_bytes + INSN_ADDRESSES (INSN_UID (insn)); + + if (distance < total_code_bytes) + distance = -1; + + if (TARGET_64BIT) + { + if (!TARGET_LONG_CALLS + && ((!sibcall && distance < 7600000) || distance < 240000)) + return 8; + + return (sibcall ? 
28 : 24); + } + else + { + if (!TARGET_LONG_CALLS + && ((TARGET_PA_20 && !sibcall && distance < 7600000) + || distance < 240000)) + return 8; + + if (TARGET_LONG_ABS_CALL && !flag_pic) + return 12; + + if ((TARGET_SOM && TARGET_LONG_PIC_SDIFF_CALL) + || (TARGET_GAS && TARGET_LONG_PIC_PCREL_CALL)) + { + if (TARGET_PA_20) + return 20; + + return 28; + } + else + { + int length = 0; + + if (TARGET_SOM) + length += length_fp_args (insn); + + if (flag_pic) + length += 4; + + if (TARGET_PA_20) + return (length + 32); + + if (!sibcall) + length += 8; + + return (length + 40); + } + } +} + +/* INSN is a function call. It may have an unconditional jump in its delay slot. CALL_DEST is the routine we are calling. */ const char * output_call (insn, call_dest, sibcall) - rtx insn; - rtx call_dest; - int sibcall; + rtx insn; + rtx call_dest; + int sibcall; { + int delay_insn_deleted = 0; + int delay_slot_filled = 0; int attr_length = get_attr_length (insn); int seq_length = dbr_sequence_length (); - int distance; - rtx xoperands[4]; - rtx seq_insn; + rtx xoperands[2]; + + xoperands[0] = call_dest; - /* Handle common case -- empty delay slot or no jump in the delay slot, - and we're sure that the branch will reach the beginning of the $CODE$ - subspace. */ - if ((seq_length == 0 && attr_length == 12) - || (seq_length != 0 - && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN - && attr_length == 8)) + /* Handle the common case where we're sure that the branch will reach + the beginning of the $CODE$ subspace. */ + if (!TARGET_LONG_CALLS + && ((seq_length == 0 && attr_length == 12) + || (seq_length != 0 && attr_length == 8))) { - xoperands[0] = call_dest; xoperands[1] = gen_rtx_REG (word_mode, sibcall ? 0 : 2); - output_asm_insn ("{bl|b,l} %0,%1%#", xoperands); - return ""; + output_asm_insn ("{bl|b,l} %0,%1", xoperands); } - - /* This call may not reach the beginning of the $CODE$ subspace. 
*/ - if (attr_length > 12) + else { - int delay_insn_deleted = 0; - rtx xoperands[2]; - rtx link; - - /* We need to emit an inline long-call branch. Furthermore, - because we're changing a named function call into an indirect - function call well after the parameters have been set up, we - need to make sure any FP args appear in both the integer - and FP registers. Also, we need move any delay slot insn - out of the delay slot. And finally, we can't rely on the linker - being able to fix the call to $$dyncall! -- Yuk!. */ - if (seq_length != 0 - && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN) + if (TARGET_64BIT) { - /* A non-jump insn in the delay slot. By definition we can - emit this insn before the call (and in fact before argument - relocating. */ - final_scan_insn (NEXT_INSN (insn), asm_out_file, optimize, 0, 0); - - /* Now delete the delay insn. */ - PUT_CODE (NEXT_INSN (insn), NOTE); - NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; - NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; - delay_insn_deleted = 1; - } + /* ??? As far as I can tell, the HP linker doesn't support the + long pc-relative sequence described in the 64-bit runtime + architecture. So, we use a slightly longer indirect call. */ + struct deferred_plabel *p = get_plabel (XSTR (call_dest, 0)); + + xoperands[0] = p->internal_label; + xoperands[1] = gen_label_rtx (); + + /* If this isn't a sibcall, we put the load of %r27 into the + delay slot. We can't do this in a sibcall as we don't + have a second call-clobbered scratch register available. */ + if (seq_length != 0 + && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN + && !sibcall) + { + final_scan_insn (NEXT_INSN (insn), asm_out_file, + optimize, 0, 0); + + /* Now delete the delay insn. */ + PUT_CODE (NEXT_INSN (insn), NOTE); + NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; + NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + delay_insn_deleted = 1; + } - /* Now copy any FP arguments into integer registers. 
*/ - for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1)) - { - int arg_mode, regno; - rtx use = XEXP (link, 0); - if (! (GET_CODE (use) == USE - && GET_CODE (XEXP (use, 0)) == REG - && FUNCTION_ARG_REGNO_P (REGNO (XEXP (use, 0))))) - continue; + output_asm_insn ("addil LT'%0,%%r27", xoperands); + output_asm_insn ("ldd RT'%0(%%r1),%%r1", xoperands); + output_asm_insn ("ldd 0(%%r1),%%r1", xoperands); - arg_mode = GET_MODE (XEXP (use, 0)); - regno = REGNO (XEXP (use, 0)); - /* Is it a floating point register? */ - if (regno >= 32 && regno <= 39) + if (sibcall) { - /* Copy from the FP register into an integer register - (via memory). */ - if (arg_mode == SFmode) - { - xoperands[0] = XEXP (use, 0); - xoperands[1] = gen_rtx_REG (SImode, 26 - (regno - 32) / 2); - output_asm_insn ("{fstws|fstw} %0,-16(%%sr0,%%r30)", - xoperands); - output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands); - } - else - { - xoperands[0] = XEXP (use, 0); - xoperands[1] = gen_rtx_REG (DImode, 25 - (regno - 34) / 2); - output_asm_insn ("{fstds|fstd} %0,-16(%%sr0,%%r30)", - xoperands); - output_asm_insn ("ldw -12(%%sr0,%%r30),%R1", xoperands); - output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands); - } + output_asm_insn ("ldd 24(%%r1),%%r27", xoperands); + output_asm_insn ("ldd 16(%%r1),%%r1", xoperands); + output_asm_insn ("bve (%%r1)", xoperands); + } + else + { + output_asm_insn ("ldd 16(%%r1),%%r2", xoperands); + output_asm_insn ("bve,l (%%r2),%%r2", xoperands); + output_asm_insn ("ldd 24(%%r1),%%r27", xoperands); + delay_slot_filled = 1; } } - - /* Don't have to worry about TARGET_PORTABLE_RUNTIME here since - we don't have any direct calls in that case. */ + else { - size_t i; - const char *name = XSTR (call_dest, 0); - - /* See if we have already put this function on the list - of deferred plabels. This list is generally small, - so a liner search is not too ugly. If it proves too - slow replace it with something faster. 
*/ - for (i = 0; i < n_deferred_plabels; i++) - if (strcmp (name, deferred_plabels[i].name) == 0) - break; - - /* If the deferred plabel list is empty, or this entry was - not found on the list, create a new entry on the list. */ - if (deferred_plabels == NULL || i == n_deferred_plabels) + int indirect_call = 0; + + /* Emit a long call. There are several different sequences + of increasing length and complexity. In most cases, + they don't allow an instruction in the delay slot. */ + if (!(TARGET_LONG_ABS_CALL && !flag_pic) + && !(TARGET_SOM && TARGET_LONG_PIC_SDIFF_CALL) + && !(TARGET_GAS && TARGET_LONG_PIC_PCREL_CALL)) + indirect_call = 1; + + if (seq_length != 0 + && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN + && !sibcall + && (!TARGET_PA_20 || indirect_call)) { - const char *real_name; - - if (deferred_plabels == 0) - deferred_plabels = (struct deferred_plabel *) - ggc_alloc (sizeof (struct deferred_plabel)); - else - deferred_plabels = (struct deferred_plabel *) - ggc_realloc (deferred_plabels, - ((n_deferred_plabels + 1) - * sizeof (struct deferred_plabel))); - - i = n_deferred_plabels++; - deferred_plabels[i].internal_label = gen_label_rtx (); - deferred_plabels[i].name = ggc_strdup (name); - - /* Gross. We have just implicitly taken the address of this - function, mark it as such. */ - real_name = (*targetm.strip_name_encoding) (name); - TREE_SYMBOL_REFERENCED (get_identifier (real_name)) = 1; + /* A non-jump insn in the delay slot. By definition we can + emit this insn before the call (and in fact before argument + relocating. */ + final_scan_insn (NEXT_INSN (insn), asm_out_file, optimize, 0, 0); + + /* Now delete the delay insn. */ + PUT_CODE (NEXT_INSN (insn), NOTE); + NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; + NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + delay_insn_deleted = 1; } - /* We have to load the address of the function using a procedure - label (plabel). 
Inline plabels can lose for PIC and other - cases, so avoid them by creating a 32bit plabel in the data - segment. */ - if (flag_pic) + if (TARGET_LONG_ABS_CALL && !flag_pic) { - xoperands[0] = deferred_plabels[i].internal_label; - if (TARGET_SOM || ! TARGET_GAS) - xoperands[1] = gen_label_rtx (); - - output_asm_insn ("addil LT%%%0,%%r19", xoperands); - output_asm_insn ("ldw RT%%%0(%%r1),%%r22", xoperands); - output_asm_insn ("ldw 0(%%r22),%%r22", xoperands); - - /* Get our address + 8 into %r1. */ - output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); + /* This is the best sequence for making long calls in + non-pic code. Unfortunately, GNU ld doesn't provide + the stub needed for external calls, and GAS's support + for this with the SOM linker is buggy. */ + output_asm_insn ("ldil L'%0,%%r1", xoperands); + if (sibcall) + output_asm_insn ("be R'%0(%%sr4,%%r1)", xoperands); + else + { + if (TARGET_PA_20) + output_asm_insn ("be,l R'%0(%%sr4,%%r1),%%sr0,%%r31", + xoperands); + else + output_asm_insn ("ble R'%0(%%sr4,%%r1)", xoperands); - if (TARGET_SOM || ! TARGET_GAS) + output_asm_insn ("copy %%r31,%%r2", xoperands); + delay_slot_filled = 1; + } + } + else + { + if (TARGET_SOM && TARGET_LONG_PIC_SDIFF_CALL) { - /* Add %r1 to the offset of dyncall from the next insn. */ - output_asm_insn ("addil L%%$$dyncall-%1,%%r1", xoperands); + /* The HP assembler and linker can handle relocations + for the difference of two symbols. GAS and the HP + linker can't do this when one of the symbols is + external. 
*/ + xoperands[1] = gen_label_rtx (); + output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); + output_asm_insn ("addil L'%0-%l1,%%r1", xoperands); ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", CODE_LABEL_NUMBER (xoperands[1])); - output_asm_insn ("ldo R%%$$dyncall-%1(%%r1),%%r1", xoperands); - } - else + output_asm_insn ("ldo R'%0-%l1(%%r1),%%r1", xoperands); + } + else if (TARGET_GAS && TARGET_LONG_PIC_PCREL_CALL) { - output_asm_insn ("addil L%%$$dyncall-$PIC_pcrel$0+4,%%r1", + /* GAS currently can't generate the relocations that + are needed for the SOM linker under HP-UX using this + sequence. The GNU linker doesn't generate the stubs + that are needed for external calls on TARGET_ELF32 + with this sequence. For now, we have to use a + longer plabel sequence when using GAS. */ + output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands); + output_asm_insn ("addil L'%0-$PIC_pcrel$0+4,%%r1", xoperands); - output_asm_insn ("ldo R%%$$dyncall-$PIC_pcrel$0+8(%%r1),%%r1", + output_asm_insn ("ldo R'%0-$PIC_pcrel$0+8(%%r1),%%r1", xoperands); } - - /* Get the return address into %r31. */ - output_asm_insn ("blr %%r0,%%r31", xoperands); - - /* Branch to our target which is in %r1. */ - output_asm_insn ("bv %%r0(%%r1)", xoperands); - - if (sibcall) - { - /* This call never returns, so we do not need to fix the - return pointer. */ - output_asm_insn ("nop", xoperands); - } else { - /* Copy the return address into %r2 also. */ - output_asm_insn ("copy %%r31,%%r2", xoperands); - } - } - else - { - xoperands[0] = deferred_plabels[i].internal_label; + /* Emit a long plabel-based call sequence. This is + essentially an inline implementation of $$dyncall. + We don't actually try to call $$dyncall as this is + as difficult as calling the function itself. 
*/ + struct deferred_plabel *p = get_plabel (XSTR (call_dest, 0)); + + xoperands[0] = p->internal_label; + xoperands[1] = gen_label_rtx (); + + /* Since the call is indirect, FP arguments in registers + need to be copied to the general registers. Then, the + argument relocation stub will copy them back. */ + if (TARGET_SOM) + copy_fp_args (insn); + + if (flag_pic) + { + output_asm_insn ("addil LT'%0,%%r19", xoperands); + output_asm_insn ("ldw RT'%0(%%r1),%%r1", xoperands); + output_asm_insn ("ldw 0(%%r1),%%r1", xoperands); + } + else + { + output_asm_insn ("addil LR'%0-$global$,%%r27", + xoperands); + output_asm_insn ("ldw RR'%0-$global$(%%r1),%%r1", + xoperands); + } - /* Get the address of our target into %r22. */ - output_asm_insn ("addil LR%%%0-$global$,%%r27", xoperands); - output_asm_insn ("ldw RR%%%0-$global$(%%r1),%%r22", xoperands); + output_asm_insn ("bb,>=,n %%r1,30,.+16", xoperands); + output_asm_insn ("depi 0,31,2,%%r1", xoperands); + output_asm_insn ("ldw 4(%%sr0,%%r1),%%r19", xoperands); + output_asm_insn ("ldw 0(%%sr0,%%r1),%%r1", xoperands); - /* Get the high part of the address of $dyncall into %r2, then - add in the low part in the branch instruction. */ - output_asm_insn ("ldil L%%$$dyncall,%%r2", xoperands); - if (TARGET_PA_20) - output_asm_insn ("be,l R%%$$dyncall(%%sr4,%%r2),%%sr0,%%r31", - xoperands); - else - output_asm_insn ("ble R%%$$dyncall(%%sr4,%%r2)", xoperands); + if (!sibcall && !TARGET_PA_20) + { + output_asm_insn ("{bl|b,l} .+8,%%r2", xoperands); + output_asm_insn ("addi 16,%%r2,%%r2", xoperands); + } + } - if (sibcall) + if (TARGET_PA_20) { - /* This call never returns, so we do not need to fix the - return pointer. 
*/ - output_asm_insn ("nop", xoperands); + if (sibcall) + output_asm_insn ("bve (%%r1)", xoperands); + else + { + if (indirect_call) + { + output_asm_insn ("bve,l (%%r1),%%r2", xoperands); + output_asm_insn ("stw %%r2,-24(%%sp)", xoperands); + delay_slot_filled = 1; + } + else + output_asm_insn ("bve,l (%%r1),%%r2", xoperands); + } } else { - /* Copy the return address into %r2 also. */ - output_asm_insn ("copy %%r31,%%r2", xoperands); - } - } - } + output_asm_insn ("ldsid (%%r1),%%r31\n\tmtsp %%r31,%%sr0", + xoperands); - /* If we had a jump in the call's delay slot, output it now. */ - if (seq_length != 0 && !delay_insn_deleted) - { - xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - output_asm_insn ("b,n %0", xoperands); + if (sibcall) + output_asm_insn ("be 0(%%sr0,%%r1)", xoperands); + else + { + output_asm_insn ("ble 0(%%sr0,%%r1)", xoperands); - /* Now delete the delay insn. */ - PUT_CODE (NEXT_INSN (insn), NOTE); - NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; - NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + if (indirect_call) + output_asm_insn ("stw %%r31,-24(%%sp)", xoperands); + else + output_asm_insn ("copy %%r31,%%r2", xoperands); + delay_slot_filled = 1; + } + } + } } - return ""; } - /* This call has an unconditional jump in its delay slot and the - call is known to reach its target or the beginning of the current - subspace. */ + if (seq_length == 0 || (delay_insn_deleted && !delay_slot_filled)) + output_asm_insn ("nop", xoperands); - /* Use the containing sequence insn's address. */ - seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0))); + /* We are done if there isn't a jump in the delay slot. */ + if (seq_length == 0 + || delay_insn_deleted + || GET_CODE (NEXT_INSN (insn)) != JUMP_INSN) + return ""; - distance = INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn)))) - - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8; + /* A sibcall should never have a branch in the delay slot. 
*/ + if (sibcall) + abort (); - /* If the branch is too far away, emit a normal call followed - by a nop, followed by the unconditional branch. If the branch - is close, then adjust %r2 in the call's delay slot. */ + /* This call has an unconditional jump in its delay slot. */ + xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - xoperands[0] = call_dest; - xoperands[1] = XEXP (PATTERN (NEXT_INSN (insn)), 1); - if (! VAL_14_BITS_P (distance)) - output_asm_insn ("{bl|b,l} %0,%%r2\n\tnop\n\tb,n %1", xoperands); - else + if (!delay_slot_filled) { - xoperands[3] = gen_label_rtx (); - output_asm_insn ("\n\t{bl|b,l} %0,%%r2\n\tldo %1-%3(%%r2),%%r2", - xoperands); - ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", - CODE_LABEL_NUMBER (xoperands[3])); + /* See if the return address can be adjusted. Use the containing + sequence insn's address. */ + rtx seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0))); + int distance = (INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn)))) + - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8); + + if (VAL_14_BITS_P (distance)) + { + xoperands[1] = gen_label_rtx (); + output_asm_insn ("ldo %0-%1(%%r2),%%r2", xoperands); + ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L", + CODE_LABEL_NUMBER (xoperands[3])); + } + else + /* ??? This branch may not reach its target. */ + output_asm_insn ("nop\n\tb,n %0", xoperands); } + else + /* ??? This branch may not reach its target. */ + output_asm_insn ("b,n %0", xoperands); /* Delete the jump. */ PUT_CODE (NEXT_INSN (insn), NOTE); NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED; NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0; + return ""; } @@ -6580,8 +6772,8 @@ pa_asm_output_mi_thunk (file, thunk_fndecl, delta, vcall_offset, function) { if (! TARGET_64BIT && ! 
TARGET_PORTABLE_RUNTIME && flag_pic) { - fprintf (file, "\taddil LT%%%s,%%r19\n", lab); - fprintf (file, "\tldw RT%%%s(%%r1),%%r22\n", lab); + fprintf (file, "\taddil LT'%s,%%r19\n", lab); + fprintf (file, "\tldw RT'%s(%%r1),%%r22\n", lab); fprintf (file, "\tldw 0(%%sr0,%%r22),%%r22\n"); fprintf (file, "\tbb,>=,n %%r22,30,.+16\n"); fprintf (file, "\tdepi 0,31,2,%%r22\n"); @@ -6603,13 +6795,13 @@ pa_asm_output_mi_thunk (file, thunk_fndecl, delta, vcall_offset, function) { if (! TARGET_64BIT && ! TARGET_PORTABLE_RUNTIME && flag_pic) { - fprintf (file, "\taddil L%%"); + fprintf (file, "\taddil L'"); fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta); - fprintf (file, ",%%r26\n\tldo R%%"); + fprintf (file, ",%%r26\n\tldo R'"); fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta); fprintf (file, "(%%r1),%%r26\n"); - fprintf (file, "\taddil LT%%%s,%%r19\n", lab); - fprintf (file, "\tldw RT%%%s(%%r1),%%r22\n", lab); + fprintf (file, "\taddil LT'%s,%%r19\n", lab); + fprintf (file, "\tldw RT'%s(%%r1),%%r22\n", lab); fprintf (file, "\tldw 0(%%sr0,%%r22),%%r22\n"); fprintf (file, "\tbb,>=,n %%r22,30,.+16\n"); fprintf (file, "\tdepi 0,31,2,%%r22\n"); @@ -6620,9 +6812,9 @@ pa_asm_output_mi_thunk (file, thunk_fndecl, delta, vcall_offset, function) } else { - fprintf (file, "\taddil L%%"); + fprintf (file, "\taddil L'"); fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta); - fprintf (file, ",%%r26\n\tb %s\n\tldo R%%", target_name); + fprintf (file, ",%%r26\n\tb %s\n\tldo R'", target_name); fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta); fprintf (file, "(%%r1),%%r26\n"); } @@ -6634,7 +6826,7 @@ pa_asm_output_mi_thunk (file, thunk_fndecl, delta, vcall_offset, function) data_section (); fprintf (file, "\t.align 4\n"); ASM_OUTPUT_INTERNAL_LABEL (file, "LTHN", current_thunk_number); - fprintf (file, "\t.word P%%%s\n", target_name); + fprintf (file, "\t.word P'%s\n", target_name); function_section (thunk_fndecl); } current_thunk_number++; diff --git a/gcc/config/pa/pa.h b/gcc/config/pa/pa.h index 
924571a10f7..a3f24b6d6e3 100644 --- a/gcc/config/pa/pa.h +++ b/gcc/config/pa/pa.h @@ -31,7 +31,7 @@ enum cmp_type /* comparison type */ }; /* For long call handling. */ -extern unsigned int total_code_bytes; +extern unsigned long total_code_bytes; /* Which processor to schedule for. */ @@ -152,6 +152,12 @@ extern int target_flags; #define TARGET_GNU_LD (target_flags & MASK_GNU_LD) #endif +/* Force generation of long calls. */ +#define MASK_LONG_CALLS 32768 +#ifndef TARGET_LONG_CALLS +#define TARGET_LONG_CALLS (target_flags & MASK_LONG_CALLS) +#endif + #ifndef TARGET_PA_10 #define TARGET_PA_10 (target_flags & (MASK_PA_11 | MASK_PA_20) == 0) #endif @@ -179,6 +185,27 @@ extern int target_flags; #define TARGET_SOM 0 #endif +/* The following three defines are potential target switches. The current + defines are optimal given the current capabilities of GAS and GNU ld. */ + +/* Define to a C expression evaluating to true to use long absolute calls. + Currently, only the HP assembler and SOM linker support long absolute + calls. They are used only in non-pic code. */ +#define TARGET_LONG_ABS_CALL (TARGET_SOM && !TARGET_GAS) + +/* Define to a C expression evaluating to true to use long pic symbol + difference calls. This is a call variant similar to the long pic + pc-relative call. Long pic symbol difference calls are only used with + the HP SOM linker. Currently, only the HP assembler supports these + calls. GAS doesn't allow an arbritrary difference of two symbols. */ +#define TARGET_LONG_PIC_SDIFF_CALL (!TARGET_GAS) + +/* Define to a C expression evaluating to true to use long pic + pc-relative calls. Long pic pc-relative calls are only used with + GAS. Currently, they are usable for calls within a module but + not for external calls. */ +#define TARGET_LONG_PIC_PCREL_CALL 0 + /* Macro to define tables used to set the flags. This is a list in braces of target switches with each switch being { "NAME", VALUE, "HELP_STRING" }. 
VALUE is the bits to set, @@ -237,6 +264,10 @@ extern int target_flags; N_("Generate code for huge switch statements") }, \ { "no-big-switch", -MASK_BIG_SWITCH, \ N_("Do not generate code for huge switch statements") }, \ + { "long-calls", MASK_LONG_CALLS, \ + N_("Always generate long calls") }, \ + { "no-long-calls", -MASK_LONG_CALLS, \ + N_("Generate long calls only when needed") }, \ { "linker-opt", 0, \ N_("Enable linker optimizations") }, \ SUBTARGET_SWITCHES \ @@ -1193,8 +1224,14 @@ extern int may_call_alloca; /* Using DFmode forces only short displacements \ to be recognized as valid in reg+d addresses. \ However, this is not necessary for PA2.0 since\ - it has long FP loads/stores. */ \ + it has long FP loads/stores. \ + \ + FIXME: the ELF32 linker clobbers the LSB of \ + the FP register number in {fldw,fstw} insns. \ + Thus, we only allow long FP loads/stores on \ + TARGET_64BIT. */ \ && memory_address_p ((TARGET_PA_20 \ + && !TARGET_ELF32 \ ? GET_MODE (OP) \ : DFmode), \ XEXP (OP, 0)) \ @@ -1300,7 +1337,7 @@ extern int may_call_alloca; if (GET_CODE (index) == CONST_INT \ && ((INT_14_BITS (index) \ && (TARGET_SOFT_FLOAT \ - || (TARGET_PA_20 \ + || (TARGET_PA_20 \ && ((MODE == SFmode \ && (INTVAL (index) % 4) == 0)\ || (MODE == DFmode \ @@ -1327,6 +1364,7 @@ extern int may_call_alloca; /* We can allow symbolic LO_SUM addresses\ for PA2.0. */ \ || (TARGET_PA_20 \ + && !TARGET_ELF32 \ && GET_CODE (XEXP (X, 1)) != CONST_INT)\ || ((MODE) != SFmode \ && (MODE) != DFmode))) \ @@ -1340,6 +1378,7 @@ extern int may_call_alloca; /* We can allow symbolic LO_SUM addresses\ for PA2.0. 
*/ \ || (TARGET_PA_20 \ + && !TARGET_ELF32 \ && GET_CODE (XEXP (X, 1)) != CONST_INT)\ || ((MODE) != SFmode \ && (MODE) != DFmode))) \ @@ -1354,7 +1393,7 @@ extern int may_call_alloca; && REG_OK_FOR_BASE_P (XEXP (X, 0)) \ && GET_CODE (XEXP (X, 1)) == UNSPEC \ && (TARGET_SOFT_FLOAT \ - || TARGET_PA_20 \ + || (TARGET_PA_20 && !TARGET_ELF32) \ || ((MODE) != SFmode \ && (MODE) != DFmode))) \ goto ADDR; \ @@ -1386,7 +1425,7 @@ do { \ rtx new, temp = NULL_RTX; \ \ mask = (GET_MODE_CLASS (MODE) == MODE_FLOAT \ - ? (TARGET_PA_20 ? 0x3fff : 0x1f) : 0x3fff); \ + ? (TARGET_PA_20 && !TARGET_ELF32 ? 0x3fff : 0x1f) : 0x3fff); \ \ if (optimize \ && GET_CODE (AD) == PLUS) \ diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md index bbd44fa3d04..ac6bdc9aff4 100644 --- a/gcc/config/pa/pa.md +++ b/gcc/config/pa/pa.md @@ -105,12 +105,9 @@ (define_delay (eq_attr "type" "call") [(eq_attr "in_call_delay" "true") (nil) (nil)]) -;; millicode call delay slot description. Note it disallows delay slot -;; when TARGET_PORTABLE_RUNTIME is true. +;; Millicode call delay slot description. (define_delay (eq_attr "type" "milli") - [(and (eq_attr "in_call_delay" "true") - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") (const_int 0))) - (nil) (nil)]) + [(eq_attr "in_call_delay" "true") (nil) (nil)]) ;; Return and other similar instructions. 
(define_delay (eq_attr "type" "branch,parallel_branch") @@ -4089,27 +4086,7 @@ "!TARGET_64BIT" "* return output_mul_insn (0, insn);" [(set_attr "type" "milli") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 4) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 24) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 20)] - -;; Out of reach, can use ble - (const_int 12)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_insn "" [(set (reg:SI 29) (mult:SI (reg:SI 26) (reg:SI 25))) @@ -4120,7 +4097,7 @@ "TARGET_64BIT" "* return output_mul_insn (0, insn);" [(set_attr "type" "milli") - (set (attr "length") (const_int 4))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_expand "muldi3" [(set (match_operand:DI 0 "register_operand" "") @@ -4211,27 +4188,7 @@ "* return output_div_insn (operands, 0, insn);" [(set_attr "type" "milli") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 4) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 24) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 20)] - -;; Out of reach, can use ble - (const_int 12)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_insn "" [(set (reg:SI 29) @@ -4245,7 +4202,7 @@ "* return output_div_insn (operands, 0, insn);" [(set_attr "type" "milli") - (set (attr "length") (const_int 4))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_expand 
"udivsi3" [(set (reg:SI 26) (match_operand:SI 1 "move_operand" "")) @@ -4261,6 +4218,7 @@ " { operands[3] = gen_reg_rtx (SImode); + if (TARGET_64BIT) { operands[5] = gen_rtx_REG (SImode, 2); @@ -4287,27 +4245,7 @@ "* return output_div_insn (operands, 1, insn);" [(set_attr "type" "milli") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 4) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 24) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 20)] - -;; Out of reach, can use ble - (const_int 12)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_insn "" [(set (reg:SI 29) @@ -4321,7 +4259,7 @@ "* return output_div_insn (operands, 1, insn);" [(set_attr "type" "milli") - (set (attr "length") (const_int 4))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_expand "modsi3" [(set (reg:SI 26) (match_operand:SI 1 "move_operand" "")) @@ -4360,27 +4298,7 @@ "* return output_mod_insn (0, insn);" [(set_attr "type" "milli") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 4) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 24) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 20)] - -;; Out of reach, can use ble - (const_int 12)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_insn "" [(set (reg:SI 29) (mod:SI (reg:SI 26) (reg:SI 25))) @@ -4393,7 +4311,7 @@ "* return output_mod_insn (0, insn);" [(set_attr "type" "milli") - (set (attr "length") 
(const_int 4))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_expand "umodsi3" [(set (reg:SI 26) (match_operand:SI 1 "move_operand" "")) @@ -4432,27 +4350,7 @@ "* return output_mod_insn (1, insn);" [(set_attr "type" "milli") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 4) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 24) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 20)] - -;; Out of reach, can use ble - (const_int 12)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) (define_insn "" [(set (reg:SI 29) (umod:SI (reg:SI 26) (reg:SI 25))) @@ -4465,7 +4363,7 @@ "* return output_mod_insn (1, insn);" [(set_attr "type" "milli") - (set (attr "length") (const_int 4))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))]) ;;- and instructions ;; We define DImode `and` so with DImode `not` we can get @@ -6036,11 +5934,12 @@ call_insn = emit_call_insn (gen_call_internal_reg (operands[1])); } + if (TARGET_64BIT) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); + if (flag_pic) { use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), pic_offset_table_rtx); - if (TARGET_64BIT) - use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); /* After each call we must restore the PIC register, even if it doesn't appear to be used. */ @@ -6052,6 +5951,7 @@ (define_insn "call_internal_symref" [(call (mem:SI (match_operand 0 "call_operand_address" "")) (match_operand 1 "" "i")) + (clobber (reg:SI 1)) (clobber (reg:SI 2)) (use (const_int 0))] "! 
TARGET_PORTABLE_RUNTIME" @@ -6061,21 +5961,7 @@ return output_call (insn, operands[0], 0); }" [(set_attr "type" "call") - (set (attr "length") -;; If we're sure that we can either reach the target or that the -;; linker can use a long-branch stub, then the length is at most -;; 8 bytes. -;; -;; For long-calls the length will be at most 68 bytes (non-pic) -;; or 84 bytes (pic). */ -;; Else we have to use a long-call; - (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (const_int 8) - (if_then_else (eq (symbol_ref "flag_pic") - (const_int 0)) - (const_int 68) - (const_int 84))))]) + (set (attr "length") (symbol_ref "attr_length_call (insn, 0)"))]) (define_insn "call_internal_reg_64bit" [(call (mem:SI (match_operand:DI 0 "register_operand" "r")) @@ -6086,15 +5972,16 @@ "* { /* ??? Needs more work. Length computation, split into multiple insns, - do not use %r22 directly, expose delay slot. */ - return \"ldd 16(%0),%%r2\;ldd 24(%0),%%r27\;bve,l (%%r2),%%r2\;nop\"; + expose delay slot. */ + return \"ldd 16(%0),%%r2\;bve,l (%%r2),%%r2\;ldd 24(%0),%%r27\"; }" [(set_attr "type" "dyncall") - (set (attr "length") (const_int 16))]) + (set (attr "length") (const_int 12))]) (define_insn "call_internal_reg" [(call (mem:SI (reg:SI 22)) (match_operand 0 "" "i")) + (clobber (reg:SI 1)) (clobber (reg:SI 2)) (use (const_int 1))] "" @@ -6218,11 +6105,13 @@ call_insn = emit_call_insn (gen_call_value_internal_reg (operands[0], operands[2])); } + + if (TARGET_64BIT) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); + if (flag_pic) { use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), pic_offset_table_rtx); - if (TARGET_64BIT) - use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); /* After each call we must restore the PIC register, even if it doesn't appear to be used. 
*/ @@ -6235,6 +6124,7 @@ [(set (match_operand 0 "" "=rf") (call (mem:SI (match_operand 1 "call_operand_address" "")) (match_operand 2 "" "i"))) + (clobber (reg:SI 1)) (clobber (reg:SI 2)) (use (const_int 0))] ;;- Don't use operand 1 for most machines. @@ -6245,21 +6135,7 @@ return output_call (insn, operands[1], 0); }" [(set_attr "type" "call") - (set (attr "length") -;; If we're sure that we can either reach the target or that the -;; linker can use a long-branch stub, then the length is at most -;; 8 bytes. -;; -;; For long-calls the length will be at most 68 bytes (non-pic) -;; or 84 bytes (pic). */ -;; Else we have to use a long-call; - (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (const_int 8) - (if_then_else (eq (symbol_ref "flag_pic") - (const_int 0)) - (const_int 68) - (const_int 84))))]) + (set (attr "length") (symbol_ref "attr_length_call (insn, 0)"))]) (define_insn "call_value_internal_reg_64bit" [(set (match_operand 0 "" "=rf") @@ -6271,16 +6147,17 @@ "* { /* ??? Needs more work. Length computation, split into multiple insns, - do not use %r22 directly, expose delay slot. */ - return \"ldd 16(%1),%%r2\;ldd 24(%1),%%r27\;bve,l (%%r2),%%r2\;nop\"; + expose delay slot. */ + return \"ldd 16(%1),%%r2\;bve,l (%%r2),%%r2\;ldd 24(%1),%%r27\"; }" [(set_attr "type" "dyncall") - (set (attr "length") (const_int 16))]) + (set (attr "length") (const_int 12))]) (define_insn "call_value_internal_reg" [(set (match_operand 0 "" "=rf") (call (mem:SI (reg:SI 22)) (match_operand 1 "" "i"))) + (clobber (reg:SI 1)) (clobber (reg:SI 2)) (use (const_int 1))] "" @@ -6389,10 +6266,9 @@ }") (define_expand "sibcall" - [(parallel [(call (match_operand:SI 0 "" "") - (match_operand 1 "" "")) - (clobber (reg:SI 0))])] - "! TARGET_PORTABLE_RUNTIME" + [(call (match_operand:SI 0 "" "") + (match_operand 1 "" ""))] + "!TARGET_PORTABLE_RUNTIME" " { rtx op; @@ -6400,8 +6276,21 @@ op = XEXP (operands[0], 0); - /* We do not allow indirect sibling calls. 
*/ - call_insn = emit_call_insn (gen_sibcall_internal_symref (op, operands[1])); + if (TARGET_64BIT) + emit_move_insn (arg_pointer_rtx, + gen_rtx_PLUS (word_mode, virtual_outgoing_args_rtx, + GEN_INT (64))); + + /* Indirect sibling calls are not allowed. */ + if (TARGET_64BIT) + call_insn = gen_sibcall_internal_symref_64bit (op, operands[1]); + else + call_insn = gen_sibcall_internal_symref (op, operands[1]); + + call_insn = emit_call_insn (call_insn); + + if (TARGET_64BIT) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); if (flag_pic) { @@ -6417,38 +6306,39 @@ (define_insn "sibcall_internal_symref" [(call (mem:SI (match_operand 0 "call_operand_address" "")) (match_operand 1 "" "i")) - (clobber (reg:SI 0)) + (clobber (reg:SI 1)) (use (reg:SI 2)) (use (const_int 0))] - "! TARGET_PORTABLE_RUNTIME" + "!TARGET_PORTABLE_RUNTIME && !TARGET_64BIT" "* { output_arg_descriptor (insn); return output_call (insn, operands[0], 1); }" [(set_attr "type" "call") - (set (attr "length") -;; If we're sure that we can either reach the target or that the -;; linker can use a long-branch stub, then the length is at most -;; 8 bytes. -;; -;; For long-calls the length will be at most 68 bytes (non-pic) -;; or 84 bytes (pic). 
*/ -;; Else we have to use a long-call; - (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (const_int 8) - (if_then_else (eq (symbol_ref "flag_pic") - (const_int 0)) - (const_int 68) - (const_int 84))))]) + (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))]) + +(define_insn "sibcall_internal_symref_64bit" + [(call (mem:SI (match_operand 0 "call_operand_address" "")) + (match_operand 1 "" "i")) + (clobber (reg:SI 1)) + (clobber (reg:SI 27)) + (use (reg:SI 2)) + (use (const_int 0))] + "TARGET_64BIT" + "* +{ + output_arg_descriptor (insn); + return output_call (insn, operands[0], 1); +}" + [(set_attr "type" "call") + (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))]) (define_expand "sibcall_value" - [(parallel [(set (match_operand 0 "" "") + [(set (match_operand 0 "" "") (call (match_operand:SI 1 "" "") - (match_operand 2 "" ""))) - (clobber (reg:SI 0))])] - "! TARGET_PORTABLE_RUNTIME" + (match_operand 2 "" "")))] + "!TARGET_PORTABLE_RUNTIME" " { rtx op; @@ -6456,10 +6346,24 @@ op = XEXP (operands[1], 0); - /* We do not allow indirect sibling calls. */ - call_insn = emit_call_insn (gen_sibcall_value_internal_symref (operands[0], - op, - operands[2])); + if (TARGET_64BIT) + emit_move_insn (arg_pointer_rtx, + gen_rtx_PLUS (word_mode, virtual_outgoing_args_rtx, + GEN_INT (64))); + + /* Indirect sibling calls are not allowed. 
*/ + if (TARGET_64BIT) + call_insn + = gen_sibcall_value_internal_symref_64bit (operands[0], op, operands[2]); + else + call_insn + = gen_sibcall_value_internal_symref (operands[0], op, operands[2]); + + call_insn = emit_call_insn (call_insn); + + if (TARGET_64BIT) + use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx); + if (flag_pic) { use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), pic_offset_table_rtx); @@ -6475,32 +6379,34 @@ [(set (match_operand 0 "" "=rf") (call (mem:SI (match_operand 1 "call_operand_address" "")) (match_operand 2 "" "i"))) - (clobber (reg:SI 0)) + (clobber (reg:SI 1)) (use (reg:SI 2)) (use (const_int 0))] - ;;- Don't use operand 1 for most machines. - "! TARGET_PORTABLE_RUNTIME" + "!TARGET_PORTABLE_RUNTIME && !TARGET_64BIT" "* { output_arg_descriptor (insn); return output_call (insn, operands[1], 1); }" [(set_attr "type" "call") - (set (attr "length") -;; If we're sure that we can either reach the target or that the -;; linker can use a long-branch stub, then the length is at most -;; 8 bytes. -;; -;; For long-calls the length will be at most 68 bytes (non-pic) -;; or 84 bytes (pic). 
*/ -;; Else we have to use a long-call; - (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (const_int 8) - (if_then_else (eq (symbol_ref "flag_pic") - (const_int 0)) - (const_int 68) - (const_int 84))))]) + (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))]) + +(define_insn "sibcall_value_internal_symref_64bit" + [(set (match_operand 0 "" "=rf") + (call (mem:SI (match_operand 1 "call_operand_address" "")) + (match_operand 2 "" "i"))) + (clobber (reg:SI 1)) + (clobber (reg:SI 27)) + (use (reg:SI 2)) + (use (const_int 0))] + "TARGET_64BIT" + "* +{ + output_arg_descriptor (insn); + return output_call (insn, operands[1], 1); +}" + [(set_attr "type" "call") + (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))]) (define_insn "nop" [(const_int 0)] @@ -7392,6 +7298,12 @@ "!TARGET_64BIT" "* { + int length = get_attr_length (insn); + rtx xoperands[2]; + + xoperands[0] = GEN_INT (length - 8); + xoperands[1] = GEN_INT (length - 16); + /* Must import the magic millicode routine. */ output_asm_insn (\".IMPORT $$sh_func_adrs,MILLICODE\", NULL); @@ -7400,60 +7312,24 @@ First, copy our input parameter into %r29 just in case we don't need to call $$sh_func_adrs. */ output_asm_insn (\"copy %%r26,%%r29\", NULL); + output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\", NULL); /* Next, examine the low two bits in %r26, if they aren't 0x2, then we use %r26 unchanged. 
*/ - if (get_attr_length (insn) == 32) - output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+24\", NULL); - else if (get_attr_length (insn) == 40) - output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+32\", NULL); - else if (get_attr_length (insn) == 44) - output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+36\", NULL); - else - output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+20\", NULL); + output_asm_insn (\"{comib|cmpib},<>,n 2,%%r31,.+%0\", xoperands); + output_asm_insn (\"ldi 4096,%%r31\", NULL); /* Next, compare %r26 with 4096, if %r26 is less than or equal to - 4096, then we use %r26 unchanged. */ - if (get_attr_length (insn) == 32) - output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+16\", - NULL); - else if (get_attr_length (insn) == 40) - output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+24\", - NULL); - else if (get_attr_length (insn) == 44) - output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+28\", - NULL); - else - output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+12\", - NULL); + 4096, then again we use %r26 unchanged. */ + output_asm_insn (\"{comb|cmpb},<<,n %%r26,%%r31,.+%1\", xoperands); - /* Else call $$sh_func_adrs to extract the function's real add24. */ + /* Finally, call $$sh_func_adrs to extract the function's real add24. 
*/ return output_millicode_call (insn, gen_rtx_SYMBOL_REF (SImode, - \"$$sh_func_adrs\")); + \"$$sh_func_adrs\")); }" [(set_attr "type" "multi") - (set (attr "length") - (cond [ -;; Target (or stub) within reach - (and (lt (plus (symbol_ref "total_code_bytes") (pc)) - (const_int 240000)) - (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0))) - (const_int 28) - -;; Out of reach PIC - (ne (symbol_ref "flag_pic") - (const_int 0)) - (const_int 44) - -;; Out of reach PORTABLE_RUNTIME - (ne (symbol_ref "TARGET_PORTABLE_RUNTIME") - (const_int 0)) - (const_int 40)] - -;; Out of reach, can use ble - (const_int 32)))]) + (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 20)"))]) ;; On the PA, the PIC register is call clobbered, so it must ;; be saved & restored around calls by the caller. If the call diff --git a/gcc/config/pa/som.h b/gcc/config/pa/som.h index e72b7fed64a..98c66fbfe37 100644 --- a/gcc/config/pa/som.h +++ b/gcc/config/pa/som.h @@ -371,3 +371,7 @@ do { \ on the location of the GCC tool directory. The downside is GCC cannot be moved after installation using a symlink. */ #define ALWAYS_STRIP_DOTDOT 1 + +/* Aggregates with a single float or double field should be passed and + returned in the general registers. */ +#define MEMBER_TYPE_FORCES_BLK(FIELD, MODE) (MODE==SFmode || MODE==DFmode) diff --git a/gcc/config/pa/t-pa64 b/gcc/config/pa/t-pa64 index 9323a250ed2..d1b2b264931 100644 --- a/gcc/config/pa/t-pa64 +++ b/gcc/config/pa/t-pa64 @@ -1,4 +1,4 @@ -TARGET_LIBGCC2_CFLAGS = -fPIC -Dpa64=1 -DELF=1 +TARGET_LIBGCC2_CFLAGS = -fPIC -Dpa64=1 -DELF=1 -mlong-calls LIB2FUNCS_EXTRA=quadlib.c diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 43acaace119..7d994b19c5b 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -508,7 +508,7 @@ in the following sections. 
-march=@var{architecture-type} @gol -mbig-switch -mdisable-fpregs -mdisable-indexing @gol -mfast-indirect-calls -mgas -mgnu-ld -mhp-ld @gol --mjump-in-delay -mlinker-opt @gol +-mjump-in-delay -mlinker-opt -mlong-calls @gol -mlong-load-store -mno-big-switch -mno-disable-fpregs @gol -mno-disable-indexing -mno-fast-indirect-calls -mno-gas @gol -mno-jump-in-delay -mno-long-load-store @gol @@ -8094,6 +8094,33 @@ configure option, gcc's program search path, and finally by the user's @env{PATH}. The linker used by GCC can be printed using @samp{which `gcc -print-prog-name=ld`}. +@item -mlong-calls +@opindex mno-long-calls +Generate code that uses long call sequences. This ensures that a call +is always able to reach linker generated stubs. The default is to generate +long calls only when the distance from the call site to the beginning +of the function or translation unit, as the case may be, exceeds a +predefined limit set by the branch type being used. The limits for +normal calls are 7,600,000 and 240,000 bytes, respectively for the +PA 2.0 and PA 1.X architectures. Sibcalls are always limited at +240,000 bytes. + +Distances are measured from the beginning of functions when using the +@option{-ffunction-sections} option, or when using the @option{-mgas} +and @option{-mno-portable-runtime} options together under HP-UX with +the SOM linker. + +It is normally not desirable to use this option as it will degrade +performance. However, it may be useful in large applications, +particularly when partial linking is used to build the application. + +The types of long calls used depends on the capabilities of the +assembler and linker, and the type of code being generated. The +impact on systems that support long absolute calls, and long pic +symbol-difference or pc-relative calls should be relatively small. +However, an indirect call is used on 32-bit ELF systems in pic code +and it is quite long. + @end table @node Intel 960 Options |
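The reach rule described in the new `-mlong-calls` text can be sketched in C. This is only an illustration under stated assumptions, not the patch's `attr_length_call` implementation: the function name `wants_long_call`, its parameters, and the two enum constants are hypothetical, and only the 240,000 and 7,600,000 byte limits come from the documentation above. Treating a distance equal to the limit as out of reach follows the removed pa.md condition, which counted only distances less than 240000 as within reach.

```c
#include <stdbool.h>

/* Limits quoted in the -mlong-calls documentation (bytes).
   The constant names are hypothetical.  */
enum
{
  PA1X_CALL_LIMIT = 240000,   /* PA 1.X normal calls, and all sibcalls.  */
  PA20_CALL_LIMIT = 7600000   /* PA 2.0 normal calls.  */
};

/* Return true when a long call sequence would be chosen.

   DISTANCE is the number of code bytes from the reference point (start
   of the function or of the translation unit) to the call site.
   PA20 is true for the PA 2.0 architecture, SIBCALL for a sibling call,
   and MLONG_CALLS when -mlong-calls was given.  All parameter names are
   assumptions made for this sketch.  */
bool
wants_long_call (unsigned long distance, bool pa20, bool sibcall,
                 bool mlong_calls)
{
  unsigned long limit;

  /* -mlong-calls forces a long sequence regardless of distance.  */
  if (mlong_calls)
    return true;

  /* Sibcalls always use the shorter limit, even on PA 2.0.  */
  limit = (pa20 && !sibcall) ? PA20_CALL_LIMIT : PA1X_CALL_LIMIT;

  /* A distance equal to the limit is treated as out of reach.  */
  return distance >= limit;
}
```

For example, a call 300,000 bytes into a translation unit would take the short form as a normal call on PA 2.0, but the long form as a sibcall or on PA 1.X.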