author: danglin <danglin@138bc75d-0d04-0410-961f-82ee72b054a4> 2002-10-31 03:13:44 +0000
committer: danglin <danglin@138bc75d-0d04-0410-961f-82ee72b054a4> 2002-10-31 03:13:44 +0000
commit: ece8882131d54884edc90a37d43a8588fab2c7a3
tree: f47286e288d980615f8a9fbf886fb3a2f0dffa4e
parent: 981a251cec2130deed33ce9eac028501a47343c7
* pa-linux.h (ASM_OUTPUT_EXTERNAL_LIBCALL): Define.
* pa-protos.h (attr_length_millicode_call, attr_length_call,
pa_init_machine_status): Declare new global functions.
* pa.c (void copy_fp_args, length_fp_args, get_plabel): Declare and
implement new functions.
(attr_length_millicode_call, attr_length_call): Implement.
(total_code_bytes): Change type to long.
(pa_output_function_prologue): Compute total_code_bytes on TARGET_64BIT.
Reset counter if flag_function_sections.
(output_deferred_plabels): Set output alignment to 3 for TARGET_64BIT.
(output_cbranch): Move call to gen_label_rtx.
(output_millicode_call): Rewrite adding long TARGET_64BIT call, expose
delay slot in all variants, shorten pc-relative calls.
(output_call): Rewrite adding long TARGET_64BIT call, improved delay
slot usage and exposure, various new call variants, and shortened
sequences for some variants on TARGET_PA_20.
Miscellaneous format changes.
* pa.h (total_code_bytes): Change type to long.
(MASK_LONG_CALLS, TARGET_LONG_CALLS, TARGET_LONG_ABS_CALL,
TARGET_LONG_PIC_SDIFF_CALL, TARGET_LONG_PIC_PCREL_CALL): Define.
(TARGET_SWITCHES): Add "-mlong-calls" and "-mno-long-calls" options.
(EXTRA_CONSTRAINT, GO_IF_LEGITIMATE_ADDRESS,
LEGITIMIZE_RELOAD_ADDRESS): Don't use long floating point loads and
stores on TARGET_ELF32.
* pa.md (define_delay): Allow insns in delay on TARGET_PORTABLE_RUNTIME.
(unnamed patterns for mulsi3, divsi3, udivsi3, modsi3, umodsi3 and
canonicalize_funcptr_for_compare expanders): Calculate attribute length
using attr_length_millicode_call().
(call_internal_symref, call_value_internal_symref): Clobber register 1.
Calculate attribute length using attr_length_call().
(call_internal_reg_64bit, call_value_internal_reg_64bit): Move gp load
to delay slot.
(sibcall, sibcall_value): Rewrite.
(sibcall_internal_symref, sibcall_value_internal_symref): Clobber
register 1. Use attr_length_call().
(sibcall_internal_symref_64bit, sibcall_value_internal_symref_64bit):
New patterns.
(unnamed pattern for canonicalize_funcptr_for_compare): Rewrite.
* som.h (MEMBER_TYPE_FORCES_BLK): Define.
* t-pa64 (TARGET_LIBGCC2_CFLAGS): Add "-mlong-calls".
* doc/invoke.texi (mlong-calls): Document.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@58665 138bc75d-0d04-0410-961f-82ee72b054a4
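
To make the attr_length_millicode_call entry in the log more concrete, here is a minimal standalone sketch of the length estimate that function performs in the pa.c hunk of this commit. It is not the compiler code: `struct call_env`, `code_distance` and `millicode_call_length` are hypothetical names introduced for illustration, and the struct fields stand in for TARGET_64BIT, TARGET_PORTABLE_RUNTIME, TARGET_LONG_CALLS, TARGET_LONG_ABS_CALL and flag_pic. The thresholds (7600000 and 240000) and the byte increments are copied from the diff; the original writes the saturation as `distance = -1` on an unsigned long, which is assumed here to be equivalent to `ULONG_MAX`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the target flags that pa.c consults.  */
struct call_env
{
  int target_64bit;      /* models TARGET_64BIT */
  int portable_runtime;  /* models TARGET_PORTABLE_RUNTIME */
  int long_calls;        /* models TARGET_LONG_CALLS (-mlong-calls) */
  int long_abs_call;     /* models TARGET_LONG_ABS_CALL */
  int pic;               /* models flag_pic */
};

/* Bytes of code emitted so far plus the insn's address within the
   current function.  If the unsigned sum wraps around, saturate to
   the maximum so that every "within reach" test fails.  */
unsigned long
code_distance (unsigned long total_code_bytes, unsigned long insn_address)
{
  unsigned long distance = total_code_bytes + insn_address;

  if (distance < total_code_bytes)
    distance = ULONG_MAX;

  return distance;
}

/* Conservative length in bytes of a millicode call, added to the
   caller-supplied base LENGTH.  The delay slot is always counted.  */
int
millicode_call_length (const struct call_env *env,
                       unsigned long distance, int length)
{
  assert (env != NULL);

  if (env->target_64bit)
    {
      if (!env->long_calls && distance < 7600000)
        return length + 8;   /* short branch plus delay slot */

      return length + 20;    /* pc-relative long call */
    }
  else if (env->portable_runtime)
    return length + 24;
  else
    {
      if (!env->long_calls && distance < 240000)
        return length + 8;   /* short branch plus delay slot */

      if (env->long_abs_call && !env->pic)
        return length + 12;  /* ldil/ble absolute long call */

      return length + 24;    /* long PIC call */
    }
}
```

With all flags clear, a distance of 1000 bytes gives the short 8-byte form, while a distance of exactly 240000 no longer passes the reach test and falls through to the 24-byte sequence. Saturating on overflow errs toward the longer estimate, which matches the commit's stated preference for overestimating lengths.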
-rw-r--r--  gcc/ChangeLog              44
-rw-r--r--  gcc/config/pa/pa-linux.h   13
-rw-r--r--  gcc/config/pa/pa-protos.h   2
-rw-r--r--  gcc/config/pa/pa.c        920
-rw-r--r--  gcc/config/pa/pa.h         49
-rw-r--r--  gcc/config/pa/pa.md       372
-rw-r--r--  gcc/config/pa/som.h         4
-rw-r--r--  gcc/config/pa/t-pa64        2
-rw-r--r--  gcc/doc/invoke.texi        29
9 files changed, 816 insertions, 619 deletions
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c0798dfbff2..4031d7bf863 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,47 @@
+2002-10-30 John David Anglin <dave@hiauly.hia.nrc.ca>
+
+ * pa-linux.h (ASM_OUTPUT_EXTERNAL_LIBCALL): Define.
+ * pa-protos.h (attr_length_millicode_call, attr_length_call,
+ pa_init_machine_status): Declare new global functions.
+ * pa.c (void copy_fp_args, length_fp_args, get_plabel): Declare and
+ implement new functions.
+ (attr_length_millicode_call, attr_length_call): Implement.
+ (total_code_bytes): Change type to long.
+ (pa_output_function_prologue): Compute total_code_bytes on TARGET_64BIT.
+ Reset counter if flag_function_sections.
+ (output_deferred_plabels): Set output alignment to 3 for TARGET_64BIT.
+ (output_cbranch): Move call to gen_label_rtx.
+ (output_millicode_call): Rewrite adding long TARGET_64BIT call, expose
+ delay slot in all variants, shorten pc-relative calls.
+ (output_call): Rewrite adding long TARGET_64BIT call, improved delay
+ slot usage and exposure, various new call variants, and shortened
+ sequences for some variants on TARGET_PA_20.
+ Miscellaneous format changes.
+ * pa.h (total_code_bytes): Change type to long.
+ (MASK_LONG_CALLS, TARGET_LONG_CALLS, TARGET_LONG_ABS_CALL,
+ TARGET_LONG_PIC_SDIFF_CALL, TARGET_LONG_PIC_PCREL_CALL): Define.
+ (TARGET_SWITCHES): Add "-mlong-calls" and "-mno-long-calls" options.
+ (EXTRA_CONSTRAINT, GO_IF_LEGITIMATE_ADDRESS,
+ LEGITIMIZE_RELOAD_ADDRESS): Don't use long floating point loads and
+ stores on TARGET_ELF32.
+ * pa.md (define_delay): Allow insns in delay on TARGET_PORTABLE_RUNTIME.
+ (unnamed patterns for mulsi3, divsi3, udivsi3, modsi3, umodsi3 and
+ canonicalize_funcptr_for_compare expanders): Calculate attribute length
+ using attr_length_millicode_call().
+ (call_internal_symref, call_value_internal_symref): Clobber register 1.
+ Calculate attribute length using attr_length_call().
+ (call_internal_reg_64bit, call_value_internal_reg_64bit): Move gp load
+ to delay slot.
+ (sibcall, sibcall_value): Rewrite.
+ (sibcall_internal_symref, sibcall_value_internal_symref): Clobber
+ register 1. Use attr_length_call().
+ (sibcall_internal_symref_64bit, sibcall_value_internal_symref_64bit):
+ New patterns.
+ (unnamed pattern for canonicalize_funcptr_for_compare): Rewrite.
+ * som.h (MEMBER_TYPE_FORCES_BLK): Define.
+ * t-pa64 (TARGET_LIBGCC2_CFLAGS): Add "-mlong-calls".
+ * doc/invoke.texi (mlong-calls): Document.
+
2002-10-30 Roger Sayle <roger@eyesopen.com>
* fold-const.c (fold_binary_op_with_conditional_arg): Improve
diff --git a/gcc/config/pa/pa-linux.h b/gcc/config/pa/pa-linux.h
index cdc201d5f59..7f372faa8d3 100644
--- a/gcc/config/pa/pa-linux.h
+++ b/gcc/config/pa/pa-linux.h
@@ -196,6 +196,19 @@ Boston, MA 02111-1307, USA. */
} \
while (0)
+/* As well as globalizing the label, we need to encode the label
+ to ensure a plabel is generated in an indirect call. */
+
+#undef ASM_OUTPUT_EXTERNAL_LIBCALL
+#define ASM_OUTPUT_EXTERNAL_LIBCALL(FILE, FUN) \
+ do \
+ { \
+ if (!FUNCTION_NAME_P (XSTR (FUN, 0))) \
+ hppa_encode_label (FUN); \
+ (*targetm.asm_out.globalize_label) (FILE, XSTR (FUN, 0)); \
+ } \
+ while (0)
+
/* Linux always uses gas. */
#undef TARGET_GAS
#define TARGET_GAS 1
diff --git a/gcc/config/pa/pa-protos.h b/gcc/config/pa/pa-protos.h
index ca115fb55fc..5d1ab111d22 100644
--- a/gcc/config/pa/pa-protos.h
+++ b/gcc/config/pa/pa-protos.h
@@ -105,6 +105,8 @@ extern int jump_in_call_delay PARAMS ((rtx));
extern enum reg_class secondary_reload_class PARAMS ((enum reg_class,
enum machine_mode, rtx));
extern int hppa_fpstore_bypass_p PARAMS ((rtx, rtx));
+extern int attr_length_millicode_call PARAMS ((rtx, int));
+extern int attr_length_call PARAMS ((rtx, int));
/* Declare functions defined in pa.c and used in templates. */
diff --git a/gcc/config/pa/pa.c b/gcc/config/pa/pa.c
index 58ba157fc24..b51e946852c 100644
--- a/gcc/config/pa/pa.c
+++ b/gcc/config/pa/pa.c
@@ -121,11 +121,13 @@ static void pa_globalize_label PARAMS ((FILE *, const char *))
ATTRIBUTE_UNUSED;
static void pa_asm_output_mi_thunk PARAMS ((FILE *, tree, HOST_WIDE_INT,
HOST_WIDE_INT, tree));
-
+static void copy_fp_args PARAMS ((rtx)) ATTRIBUTE_UNUSED;
+static int length_fp_args PARAMS ((rtx)) ATTRIBUTE_UNUSED;
+static struct deferred_plabel *get_plabel PARAMS ((const char *))
+ ATTRIBUTE_UNUSED;
/* Save the operands last given to a compare for use when we
generate a scc or bcc insn. */
-
rtx hppa_compare_op0, hppa_compare_op1;
enum cmp_type hppa_branch_type;
@@ -149,12 +151,10 @@ static rtx find_addr_reg PARAMS ((rtx));
/* Keep track of the number of bytes we have output in the CODE subspaces
during this compilation so we'll know when to emit inline long-calls. */
-
-unsigned int total_code_bytes;
+unsigned long total_code_bytes;
/* Variables to handle plabels that we discover are necessary at assembly
output time. They are output after the current function. */
-
struct deferred_plabel GTY(())
{
rtx internal_label;
@@ -3197,14 +3197,14 @@ pa_output_function_prologue (file, size)
fputs ("\n\t.ENTRY\n", file);
/* If we're using GAS and SOM, and not using the portable runtime model,
- then we don't need to accumulate the total number of code bytes. */
+ or function sections, then we don't need to accumulate the total number
+ of code bytes. */
if ((TARGET_GAS && TARGET_SOM && ! TARGET_PORTABLE_RUNTIME)
- /* FIXME: we can't handle long calls for TARGET_64BIT. */
- || TARGET_64BIT)
+ || flag_function_sections)
total_code_bytes = 0;
else if (INSN_ADDRESSES_SET_P ())
{
- unsigned int old_total = total_code_bytes;
+ unsigned long old_total = total_code_bytes;
total_code_bytes += INSN_ADDRESSES (INSN_UID (get_last_nonnote_insn ()));
total_code_bytes += FUNCTION_BOUNDARY / BITS_PER_UNIT;
@@ -4726,6 +4726,47 @@ output_global_address (file, x, round_constant)
output_addr_const (file, x);
}
+static struct deferred_plabel *
+get_plabel (fname)
+ const char *fname;
+{
+ size_t i;
+
+ /* See if we have already put this function on the list of deferred
+ plabels. This list is generally small, so a linear search is not
+ too ugly. If it proves too slow, replace it with something faster. */
+ for (i = 0; i < n_deferred_plabels; i++)
+ if (strcmp (fname, deferred_plabels[i].name) == 0)
+ break;
+
+ /* If the deferred plabel list is empty, or this entry was not found
+ on the list, create a new entry on the list. */
+ if (deferred_plabels == NULL || i == n_deferred_plabels)
+ {
+ const char *real_name;
+
+ if (deferred_plabels == 0)
+ deferred_plabels = (struct deferred_plabel *)
+ ggc_alloc (sizeof (struct deferred_plabel));
+ else
+ deferred_plabels = (struct deferred_plabel *)
+ ggc_realloc (deferred_plabels,
+ ((n_deferred_plabels + 1)
+ * sizeof (struct deferred_plabel)));
+
+ i = n_deferred_plabels++;
+ deferred_plabels[i].internal_label = gen_label_rtx ();
+ deferred_plabels[i].name = ggc_strdup (fname);
+
+ /* Gross. We have just implicitly taken the address of this function,
+ mark it as such. */
+ real_name = (*targetm.strip_name_encoding) (fname);
+ TREE_SYMBOL_REFERENCED (get_identifier (real_name)) = 1;
+ }
+
+ return &deferred_plabels[i];
+}
+
void
output_deferred_plabels (file)
FILE *file;
@@ -4737,7 +4778,7 @@ output_deferred_plabels (file)
if (n_deferred_plabels)
{
data_section ();
- ASM_OUTPUT_ALIGN (file, 2);
+ ASM_OUTPUT_ALIGN (file, TARGET_64BIT ? 3 : 2);
}
/* Now output the deferred plabels. */
@@ -5323,9 +5364,9 @@ hppa_va_arg (valist, type)
const char *
output_cbranch (operands, nullify, length, negated, insn)
- rtx *operands;
- int nullify, length, negated;
- rtx insn;
+ rtx *operands;
+ int nullify, length, negated;
+ rtx insn;
{
static char buf[100];
int useskip = 0;
@@ -5499,12 +5540,11 @@ output_cbranch (operands, nullify, length, negated, insn)
xoperands[1] = operands[1];
xoperands[2] = operands[2];
xoperands[3] = operands[3];
- if (TARGET_SOM || ! TARGET_GAS)
- xoperands[4] = gen_label_rtx ();
output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands);
- if (TARGET_SOM || ! TARGET_GAS)
+ if (TARGET_SOM || !TARGET_GAS)
{
+ xoperands[4] = gen_label_rtx ();
output_asm_insn ("addil L'%l0-%l4,%%r1", xoperands);
ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L",
CODE_LABEL_NUMBER (xoperands[4]));
@@ -5536,10 +5576,10 @@ output_cbranch (operands, nullify, length, negated, insn)
const char *
output_bb (operands, nullify, length, negated, insn, which)
- rtx *operands ATTRIBUTE_UNUSED;
- int nullify, length, negated;
- rtx insn;
- int which;
+ rtx *operands ATTRIBUTE_UNUSED;
+ int nullify, length, negated;
+ rtx insn;
+ int which;
{
static char buf[100];
int useskip = 0;
@@ -5684,10 +5724,10 @@ output_bb (operands, nullify, length, negated, insn, which)
const char *
output_bvb (operands, nullify, length, negated, insn, which)
- rtx *operands ATTRIBUTE_UNUSED;
- int nullify, length, negated;
- rtx insn;
- int which;
+ rtx *operands ATTRIBUTE_UNUSED;
+ int nullify, length, negated;
+ rtx insn;
+ int which;
{
static char buf[100];
int useskip = 0;
@@ -6043,442 +6083,594 @@ output_movb (operands, insn, which_alternative, reverse_comparison)
}
}
+/* Copy any FP arguments in INSN into integer registers. */
+static void
+copy_fp_args (insn)
+ rtx insn;
+{
+ rtx link;
+ rtx xoperands[2];
-/* INSN is a millicode call. It may have an unconditional jump in its delay
- slot.
-
- CALL_DEST is the routine we are calling. */
+ for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1))
+ {
+ int arg_mode, regno;
+ rtx use = XEXP (link, 0);
-const char *
-output_millicode_call (insn, call_dest)
- rtx insn;
- rtx call_dest;
-{
- int attr_length = get_attr_length (insn);
- int seq_length = dbr_sequence_length ();
- int distance;
- rtx xoperands[4];
- rtx seq_insn;
+ if (! (GET_CODE (use) == USE
+ && GET_CODE (XEXP (use, 0)) == REG
+ && FUNCTION_ARG_REGNO_P (REGNO (XEXP (use, 0)))))
+ continue;
- xoperands[3] = gen_rtx_REG (Pmode, TARGET_64BIT ? 2 : 31);
+ arg_mode = GET_MODE (XEXP (use, 0));
+ regno = REGNO (XEXP (use, 0));
- /* Handle common case -- empty delay slot or no jump in the delay slot,
- and we're sure that the branch will reach the beginning of the $CODE$
- subspace. The within reach form of the $$sh_func_adrs call has
- a length of 28 and attribute type of multi. This length is the
- same as the maximum length of an out of reach PIC call to $$div. */
- if ((seq_length == 0
- && (attr_length == 8
- || (attr_length == 28 && get_attr_type (insn) == TYPE_MULTI)))
- || (seq_length != 0
- && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN
- && attr_length == 4))
- {
- xoperands[0] = call_dest;
- output_asm_insn ("{bl|b,l} %0,%3%#", xoperands);
- return "";
+ /* Is it a floating point register? */
+ if (regno >= 32 && regno <= 39)
+ {
+ /* Copy the FP register into an integer register via memory. */
+ if (arg_mode == SFmode)
+ {
+ xoperands[0] = XEXP (use, 0);
+ xoperands[1] = gen_rtx_REG (SImode, 26 - (regno - 32) / 2);
+ output_asm_insn ("{fstws|fstw} %0,-16(%%sr0,%%r30)", xoperands);
+ output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands);
+ }
+ else
+ {
+ xoperands[0] = XEXP (use, 0);
+ xoperands[1] = gen_rtx_REG (DImode, 25 - (regno - 34) / 2);
+ output_asm_insn ("{fstds|fstd} %0,-16(%%sr0,%%r30)", xoperands);
+ output_asm_insn ("ldw -12(%%sr0,%%r30),%R1", xoperands);
+ output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands);
+ }
+ }
}
+}
+
+/* Compute length of the FP argument copy sequence for INSN. */
+static int
+length_fp_args (insn)
+ rtx insn;
+{
+ int length = 0;
+ rtx link;
- /* This call may not reach the beginning of the $CODE$ subspace. */
- if (attr_length > 8)
+ for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1))
{
- int delay_insn_deleted = 0;
+ int arg_mode, regno;
+ rtx use = XEXP (link, 0);
- /* We need to emit an inline long-call branch. */
- if (seq_length != 0
- && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN)
+ if (! (GET_CODE (use) == USE
+ && GET_CODE (XEXP (use, 0)) == REG
+ && FUNCTION_ARG_REGNO_P (REGNO (XEXP (use, 0)))))
+ continue;
+
+ arg_mode = GET_MODE (XEXP (use, 0));
+ regno = REGNO (XEXP (use, 0));
+
+ /* Is it a floating point register? */
+ if (regno >= 32 && regno <= 39)
{
- /* A non-jump insn in the delay slot. By definition we can
- emit this insn before the call. */
- final_scan_insn (NEXT_INSN (insn), asm_out_file, optimize, 0, 0);
-
- /* Now delete the delay insn. */
- PUT_CODE (NEXT_INSN (insn), NOTE);
- NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED;
- NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0;
- delay_insn_deleted = 1;
+ if (arg_mode == SFmode)
+ length += 8;
+ else
+ length += 12;
}
+ }
- /* PIC long millicode call sequence. */
- if (flag_pic)
- {
- xoperands[0] = call_dest;
- if (TARGET_SOM || ! TARGET_GAS)
- xoperands[1] = gen_label_rtx ();
+ return length;
+}
- /* Get our address + 8 into %r1. */
- output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands);
+/* We include the delay slot in the returned length as it is better to
+ overestimate the length than to underestimate it. */
- if (TARGET_SOM || ! TARGET_GAS)
- {
- /* Add %r1 to the offset of our target from the next insn. */
- output_asm_insn ("addil L%%%0-%1,%%r1", xoperands);
- ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L",
- CODE_LABEL_NUMBER (xoperands[1]));
- output_asm_insn ("ldo R%%%0-%1(%%r1),%%r1", xoperands);
- }
- else
- {
- output_asm_insn ("addil L%%%0-$PIC_pcrel$0+4,%%r1", xoperands);
- output_asm_insn ("ldo R%%%0-$PIC_pcrel$0+8(%%r1),%%r1",
- xoperands);
- }
+int
+attr_length_millicode_call (insn, length)
+ rtx insn;
+ int length;
+{
+ unsigned long distance = total_code_bytes + INSN_ADDRESSES (INSN_UID (insn));
+
+ if (distance < total_code_bytes)
+ distance = -1;
+
+ if (TARGET_64BIT)
+ {
+ if (!TARGET_LONG_CALLS && distance < 7600000)
+ return length + 8;
+
+ return length + 20;
+ }
+ else if (TARGET_PORTABLE_RUNTIME)
+ return length + 24;
+ else
+ {
+ if (!TARGET_LONG_CALLS && distance < 240000)
+ return length + 8;
+
+ if (TARGET_LONG_ABS_CALL && !flag_pic)
+ return length + 12;
+
+ return length + 24;
+ }
+}
+
+/* INSN is a millicode call. It may have an unconditional jump
+ in its delay slot.
- /* Get the return address into %r31. */
- output_asm_insn ("blr 0,%3", xoperands);
+ CALL_DEST is the routine we are calling. */
- /* Branch to our target which is in %r1. */
- output_asm_insn ("bv,n %%r0(%%r1)", xoperands);
+const char *
+output_millicode_call (insn, call_dest)
+ rtx insn;
+ rtx call_dest;
+{
+ int attr_length = get_attr_length (insn);
+ int seq_length = dbr_sequence_length ();
+ int distance;
+ rtx seq_insn;
+ rtx xoperands[3];
- /* Empty delay slot. Note this insn gets fetched twice and
- executed once. To be safe we use a nop. */
- output_asm_insn ("nop", xoperands);
+ xoperands[0] = call_dest;
+ xoperands[2] = gen_rtx_REG (Pmode, TARGET_64BIT ? 2 : 31);
+
+ /* Handle the common case where we are sure that the branch will
+ reach the beginning of the $CODE$ subspace. The within reach
+ form of the $$sh_func_adrs call has a length of 28. Because
+ it has an attribute type of multi, it never has a non-zero
+ sequence length. The length of the $$sh_func_adrs is the same
+ as certain out of reach PIC calls to other routines. */
+ if (!TARGET_LONG_CALLS
+ && ((seq_length == 0
+ && (attr_length == 12
+ || (attr_length == 28 && get_attr_type (insn) == TYPE_MULTI)))
+ || (seq_length != 0 && attr_length == 8)))
+ {
+ output_asm_insn ("{bl|b,l} %0,%2", xoperands);
+ }
+ else
+ {
+ if (TARGET_64BIT)
+ {
+ /* It might seem that one insn could be saved by accessing
+ the millicode function using the linkage table. However,
+ this doesn't work in shared libraries and other dynamically
+ loaded objects. Using a pc-relative sequence also avoids
+ problems related to the implicit use of the gp register. */
+ output_asm_insn ("b,l .+8,%%r1", xoperands);
+ output_asm_insn ("addil L'%0-$PIC_pcrel$0+4,%%r1", xoperands);
+ output_asm_insn ("ldo R'%0-$PIC_pcrel$0+8(%%r1),%%r1", xoperands);
+ output_asm_insn ("bve,l (%%r1),%%r2", xoperands);
}
- /* Pure portable runtime doesn't allow be/ble; we also don't have
- PIC support in the assembler/linker, so this sequence is needed. */
else if (TARGET_PORTABLE_RUNTIME)
{
- xoperands[0] = call_dest;
- /* Get the address of our target into %r29. */
- output_asm_insn ("ldil L%%%0,%%r29", xoperands);
- output_asm_insn ("ldo R%%%0(%%r29),%%r29", xoperands);
+ /* Pure portable runtime doesn't allow be/ble; we also don't
+ have PIC support in the assembler/linker, so this sequence
+ is needed. */
- /* Get our return address into %r31. */
- output_asm_insn ("blr %%r0,%3", xoperands);
+ /* Get the address of our target into %r1. */
+ output_asm_insn ("ldil L'%0,%%r1", xoperands);
+ output_asm_insn ("ldo R'%0(%%r1),%%r1", xoperands);
- /* Jump to our target address in %r29. */
- output_asm_insn ("bv,n %%r0(%%r29)", xoperands);
+ /* Get our return address into %r31. */
+ output_asm_insn ("{bl|b,l} .+8,%%r31", xoperands);
+ output_asm_insn ("addi 8,%%r31,%%r31", xoperands);
- /* Empty delay slot. Note this insn gets fetched twice and
- executed once. To be safe we use a nop. */
- output_asm_insn ("nop", xoperands);
+ /* Jump to our target address in %r1. */
+ output_asm_insn ("bv %%r0(%%r1)", xoperands);
}
- /* If we're allowed to use be/ble instructions, then this is the
- best sequence to use for a long millicode call. */
- else
+ else if (!flag_pic)
{
- xoperands[0] = call_dest;
- output_asm_insn ("ldil L%%%0,%3", xoperands);
+ output_asm_insn ("ldil L'%0,%%r1", xoperands);
if (TARGET_PA_20)
- output_asm_insn ("be,l R%%%0(%%sr4,%3),%%sr0,%%r31", xoperands);
+ output_asm_insn ("be,l R'%0(%%sr4,%%r1),%%sr0,%%r31", xoperands);
else
- output_asm_insn ("ble R%%%0(%%sr4,%3)", xoperands);
- output_asm_insn ("nop", xoperands);
+ output_asm_insn ("ble R'%0(%%sr4,%%r1)", xoperands);
}
-
- /* If we had a jump in the call's delay slot, output it now. */
- if (seq_length != 0 && !delay_insn_deleted)
+ else
{
- xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1);
- output_asm_insn ("b,n %0", xoperands);
+ if (TARGET_SOM || !TARGET_GAS)
+ {
+ /* The HP assembler can generate relocations for the
+ difference of two symbols. GAS can do this for a
+ millicode symbol but not an arbitrary external
+ symbol when generating SOM output. */
+ xoperands[1] = gen_label_rtx ();
+ output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands);
+ output_asm_insn ("addi 16,%%r1,%%r31", xoperands);
+ ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L",
+ CODE_LABEL_NUMBER (xoperands[1]));
+ output_asm_insn ("addil L'%0-%l1,%%r1", xoperands);
+ output_asm_insn ("ldo R'%0-%l1(%%r1),%%r1", xoperands);
+ }
+ else
+ {
+ output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands);
+ output_asm_insn ("addi 16,%%r1,%%r31", xoperands);
+ output_asm_insn ("addil L'%0-$PIC_pcrel$0+8,%%r1", xoperands);
+ output_asm_insn ("ldo R'%0-$PIC_pcrel$0+12(%%r1),%%r1",
+ xoperands);
+ }
- /* Now delete the delay insn. */
- PUT_CODE (NEXT_INSN (insn), NOTE);
- NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED;
- NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0;
+ /* Jump to our target address in %r1. */
+ output_asm_insn ("bv %%r0(%%r1)", xoperands);
}
- return "";
}
- /* This call has an unconditional jump in its delay slot and the
- call is known to reach its target or the beginning of the current
- subspace. */
+ if (seq_length == 0)
+ output_asm_insn ("nop", xoperands);
- /* Use the containing sequence insn's address. */
- seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0)));
-
- distance = INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn))))
- - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8;
+ /* We are done if there isn't a jump in the delay slot. */
+ if (seq_length == 0 || GET_CODE (NEXT_INSN (insn)) != JUMP_INSN)
+ return "";
- /* If the branch was too far away, emit a normal call followed
- by a nop, followed by the unconditional branch.
+ /* This call has an unconditional jump in its delay slot. */
+ xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1);
- If the branch is close, then adjust %r2 from within the
- call's delay slot. */
+ /* See if the return address can be adjusted. Use the containing
+ sequence insn's address. */
+ seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0)));
+ distance = (INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn))))
+ - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8);
- xoperands[0] = call_dest;
- xoperands[1] = XEXP (PATTERN (NEXT_INSN (insn)), 1);
- if (! VAL_14_BITS_P (distance))
- output_asm_insn ("{bl|b,l} %0,%3\n\tnop\n\tb,n %1", xoperands);
- else
+ if (VAL_14_BITS_P (distance))
{
- xoperands[2] = gen_label_rtx ();
- output_asm_insn ("\n\t{bl|b,l} %0,%3\n\tldo %1-%2(%3),%3",
- xoperands);
+ xoperands[1] = gen_label_rtx ();
+ output_asm_insn ("ldo %0-%1(%2),%2", xoperands);
ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L",
- CODE_LABEL_NUMBER (xoperands[2]));
+ CODE_LABEL_NUMBER (xoperands[1]));
}
+ else
+ /* ??? This branch may not reach its target. */
+ output_asm_insn ("nop\n\tb,n %0", xoperands);
/* Delete the jump. */
PUT_CODE (NEXT_INSN (insn), NOTE);
NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED;
NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0;
+
return "";
}
-/* INSN is either a function call. It may have an unconditional jump
+/* We include the delay slot in the returned length as it is better to
+ overestimate the length than to underestimate it. */
+
+int
+attr_length_call (insn, sibcall)
+ rtx insn;
+ int sibcall;
+{
+ unsigned long distance = total_code_bytes + INSN_ADDRESSES (INSN_UID (insn));
+
+ if (distance < total_code_bytes)
+ distance = -1;
+
+ if (TARGET_64BIT)
+ {
+ if (!TARGET_LONG_CALLS
+ && ((!sibcall && distance < 7600000) || distance < 240000))
+ return 8;
+
+ return (sibcall ? 28 : 24);
+ }
+ else
+ {
+ if (!TARGET_LONG_CALLS
+ && ((TARGET_PA_20 && !sibcall && distance < 7600000)
+ || distance < 240000))
+ return 8;
+
+ if (TARGET_LONG_ABS_CALL && !flag_pic)
+ return 12;
+
+ if ((TARGET_SOM && TARGET_LONG_PIC_SDIFF_CALL)
+ || (TARGET_GAS && TARGET_LONG_PIC_PCREL_CALL))
+ {
+ if (TARGET_PA_20)
+ return 20;
+
+ return 28;
+ }
+ else
+ {
+ int length = 0;
+
+ if (TARGET_SOM)
+ length += length_fp_args (insn);
+
+ if (flag_pic)
+ length += 4;
+
+ if (TARGET_PA_20)
+ return (length + 32);
+
+ if (!sibcall)
+ length += 8;
+
+ return (length + 40);
+ }
+ }
+}
+
+/* INSN is a function call. It may have an unconditional jump
in its delay slot.
CALL_DEST is the routine we are calling. */
const char *
output_call (insn, call_dest, sibcall)
- rtx insn;
- rtx call_dest;
- int sibcall;
+ rtx insn;
+ rtx call_dest;
+ int sibcall;
{
+ int delay_insn_deleted = 0;
+ int delay_slot_filled = 0;
int attr_length = get_attr_length (insn);
int seq_length = dbr_sequence_length ();
- int distance;
- rtx xoperands[4];
- rtx seq_insn;
+ rtx xoperands[2];
+
+ xoperands[0] = call_dest;
- /* Handle common case -- empty delay slot or no jump in the delay slot,
- and we're sure that the branch will reach the beginning of the $CODE$
- subspace. */
- if ((seq_length == 0 && attr_length == 12)
- || (seq_length != 0
- && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN
- && attr_length == 8))
+ /* Handle the common case where we're sure that the branch will reach
+ the beginning of the $CODE$ subspace. */
+ if (!TARGET_LONG_CALLS
+ && ((seq_length == 0 && attr_length == 12)
+ || (seq_length != 0 && attr_length == 8)))
{
- xoperands[0] = call_dest;
xoperands[1] = gen_rtx_REG (word_mode, sibcall ? 0 : 2);
- output_asm_insn ("{bl|b,l} %0,%1%#", xoperands);
- return "";
+ output_asm_insn ("{bl|b,l} %0,%1", xoperands);
}
-
- /* This call may not reach the beginning of the $CODE$ subspace. */
- if (attr_length > 12)
+ else
{
- int delay_insn_deleted = 0;
- rtx xoperands[2];
- rtx link;
-
- /* We need to emit an inline long-call branch. Furthermore,
- because we're changing a named function call into an indirect
- function call well after the parameters have been set up, we
- need to make sure any FP args appear in both the integer
- and FP registers. Also, we need move any delay slot insn
- out of the delay slot. And finally, we can't rely on the linker
- being able to fix the call to $$dyncall! -- Yuk!. */
- if (seq_length != 0
- && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN)
+ if (TARGET_64BIT)
{
- /* A non-jump insn in the delay slot. By definition we can
- emit this insn before the call (and in fact before argument
- relocating. */
- final_scan_insn (NEXT_INSN (insn), asm_out_file, optimize, 0, 0);
-
- /* Now delete the delay insn. */
- PUT_CODE (NEXT_INSN (insn), NOTE);
- NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED;
- NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0;
- delay_insn_deleted = 1;
- }
+ /* ??? As far as I can tell, the HP linker doesn't support the
+ long pc-relative sequence described in the 64-bit runtime
+ architecture. So, we use a slightly longer indirect call. */
+ struct deferred_plabel *p = get_plabel (XSTR (call_dest, 0));
+
+ xoperands[0] = p->internal_label;
+ xoperands[1] = gen_label_rtx ();
+
+ /* If this isn't a sibcall, we put the load of %r27 into the
+ delay slot. We can't do this in a sibcall as we don't
+ have a second call-clobbered scratch register available. */
+ if (seq_length != 0
+ && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN
+ && !sibcall)
+ {
+ final_scan_insn (NEXT_INSN (insn), asm_out_file,
+ optimize, 0, 0);
+
+ /* Now delete the delay insn. */
+ PUT_CODE (NEXT_INSN (insn), NOTE);
+ NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED;
+ NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0;
+ delay_insn_deleted = 1;
+ }
- /* Now copy any FP arguments into integer registers. */
- for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1))
- {
- int arg_mode, regno;
- rtx use = XEXP (link, 0);
- if (! (GET_CODE (use) == USE
- && GET_CODE (XEXP (use, 0)) == REG
- && FUNCTION_ARG_REGNO_P (REGNO (XEXP (use, 0)))))
- continue;
+ output_asm_insn ("addil LT'%0,%%r27", xoperands);
+ output_asm_insn ("ldd RT'%0(%%r1),%%r1", xoperands);
+ output_asm_insn ("ldd 0(%%r1),%%r1", xoperands);
- arg_mode = GET_MODE (XEXP (use, 0));
- regno = REGNO (XEXP (use, 0));
- /* Is it a floating point register? */
- if (regno >= 32 && regno <= 39)
+ if (sibcall)
{
- /* Copy from the FP register into an integer register
- (via memory). */
- if (arg_mode == SFmode)
- {
- xoperands[0] = XEXP (use, 0);
- xoperands[1] = gen_rtx_REG (SImode, 26 - (regno - 32) / 2);
- output_asm_insn ("{fstws|fstw} %0,-16(%%sr0,%%r30)",
- xoperands);
- output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands);
- }
- else
- {
- xoperands[0] = XEXP (use, 0);
- xoperands[1] = gen_rtx_REG (DImode, 25 - (regno - 34) / 2);
- output_asm_insn ("{fstds|fstd} %0,-16(%%sr0,%%r30)",
- xoperands);
- output_asm_insn ("ldw -12(%%sr0,%%r30),%R1", xoperands);
- output_asm_insn ("ldw -16(%%sr0,%%r30),%1", xoperands);
- }
+ output_asm_insn ("ldd 24(%%r1),%%r27", xoperands);
+ output_asm_insn ("ldd 16(%%r1),%%r1", xoperands);
+ output_asm_insn ("bve (%%r1)", xoperands);
+ }
+ else
+ {
+ output_asm_insn ("ldd 16(%%r1),%%r2", xoperands);
+ output_asm_insn ("bve,l (%%r2),%%r2", xoperands);
+ output_asm_insn ("ldd 24(%%r1),%%r27", xoperands);
+ delay_slot_filled = 1;
}
}
-
- /* Don't have to worry about TARGET_PORTABLE_RUNTIME here since
- we don't have any direct calls in that case. */
+ else
{
- size_t i;
- const char *name = XSTR (call_dest, 0);
-
- /* See if we have already put this function on the list
- of deferred plabels. This list is generally small,
- so a liner search is not too ugly. If it proves too
- slow replace it with something faster. */
- for (i = 0; i < n_deferred_plabels; i++)
- if (strcmp (name, deferred_plabels[i].name) == 0)
- break;
-
- /* If the deferred plabel list is empty, or this entry was
- not found on the list, create a new entry on the list. */
- if (deferred_plabels == NULL || i == n_deferred_plabels)
+ int indirect_call = 0;
+
+ /* Emit a long call. There are several different sequences
+ of increasing length and complexity. In most cases,
+ they don't allow an instruction in the delay slot. */
+ if (!(TARGET_LONG_ABS_CALL && !flag_pic)
+ && !(TARGET_SOM && TARGET_LONG_PIC_SDIFF_CALL)
+ && !(TARGET_GAS && TARGET_LONG_PIC_PCREL_CALL))
+ indirect_call = 1;
+
+ if (seq_length != 0
+ && GET_CODE (NEXT_INSN (insn)) != JUMP_INSN
+ && !sibcall
+ && (!TARGET_PA_20 || indirect_call))
{
- const char *real_name;
-
- if (deferred_plabels == 0)
- deferred_plabels = (struct deferred_plabel *)
- ggc_alloc (sizeof (struct deferred_plabel));
- else
- deferred_plabels = (struct deferred_plabel *)
- ggc_realloc (deferred_plabels,
- ((n_deferred_plabels + 1)
- * sizeof (struct deferred_plabel)));
-
- i = n_deferred_plabels++;
- deferred_plabels[i].internal_label = gen_label_rtx ();
- deferred_plabels[i].name = ggc_strdup (name);
-
- /* Gross. We have just implicitly taken the address of this
- function, mark it as such. */
- real_name = (*targetm.strip_name_encoding) (name);
- TREE_SYMBOL_REFERENCED (get_identifier (real_name)) = 1;
+ /* A non-jump insn in the delay slot. By definition we can
+ emit this insn before the call (and in fact before argument
+ relocation). */
+ final_scan_insn (NEXT_INSN (insn), asm_out_file, optimize, 0, 0);
+
+ /* Now delete the delay insn. */
+ PUT_CODE (NEXT_INSN (insn), NOTE);
+ NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED;
+ NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0;
+ delay_insn_deleted = 1;
}
- /* We have to load the address of the function using a procedure
- label (plabel). Inline plabels can lose for PIC and other
- cases, so avoid them by creating a 32bit plabel in the data
- segment. */
- if (flag_pic)
+ if (TARGET_LONG_ABS_CALL && !flag_pic)
{
- xoperands[0] = deferred_plabels[i].internal_label;
- if (TARGET_SOM || ! TARGET_GAS)
- xoperands[1] = gen_label_rtx ();
-
- output_asm_insn ("addil LT%%%0,%%r19", xoperands);
- output_asm_insn ("ldw RT%%%0(%%r1),%%r22", xoperands);
- output_asm_insn ("ldw 0(%%r22),%%r22", xoperands);
-
- /* Get our address + 8 into %r1. */
- output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands);
+ /* This is the best sequence for making long calls in
+ non-pic code. Unfortunately, GNU ld doesn't provide
+ the stub needed for external calls, and GAS's support
+ for this with the SOM linker is buggy. */
+ output_asm_insn ("ldil L'%0,%%r1", xoperands);
+ if (sibcall)
+ output_asm_insn ("be R'%0(%%sr4,%%r1)", xoperands);
+ else
+ {
+ if (TARGET_PA_20)
+ output_asm_insn ("be,l R'%0(%%sr4,%%r1),%%sr0,%%r31",
+ xoperands);
+ else
+ output_asm_insn ("ble R'%0(%%sr4,%%r1)", xoperands);
- if (TARGET_SOM || ! TARGET_GAS)
+ output_asm_insn ("copy %%r31,%%r2", xoperands);
+ delay_slot_filled = 1;
+ }
+ }
+ else
+ {
+ if (TARGET_SOM && TARGET_LONG_PIC_SDIFF_CALL)
{
- /* Add %r1 to the offset of dyncall from the next insn. */
- output_asm_insn ("addil L%%$$dyncall-%1,%%r1", xoperands);
+ /* The HP assembler and linker can handle relocations
+ for the difference of two symbols. GAS and the HP
+ linker can't do this when one of the symbols is
+ external. */
+ xoperands[1] = gen_label_rtx ();
+ output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands);
+ output_asm_insn ("addil L'%0-%l1,%%r1", xoperands);
ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L",
CODE_LABEL_NUMBER (xoperands[1]));
- output_asm_insn ("ldo R%%$$dyncall-%1(%%r1),%%r1", xoperands);
- }
- else
+ output_asm_insn ("ldo R'%0-%l1(%%r1),%%r1", xoperands);
+ }
+ else if (TARGET_GAS && TARGET_LONG_PIC_PCREL_CALL)
{
- output_asm_insn ("addil L%%$$dyncall-$PIC_pcrel$0+4,%%r1",
+ /* GAS currently can't generate the relocations that
+ are needed for the SOM linker under HP-UX using this
+ sequence. The GNU linker doesn't generate the stubs
+ that are needed for external calls on TARGET_ELF32
+ with this sequence. For now, we have to use a
+ longer plabel sequence when using GAS. */
+ output_asm_insn ("{bl|b,l} .+8,%%r1", xoperands);
+ output_asm_insn ("addil L'%0-$PIC_pcrel$0+4,%%r1",
xoperands);
- output_asm_insn ("ldo R%%$$dyncall-$PIC_pcrel$0+8(%%r1),%%r1",
+ output_asm_insn ("ldo R'%0-$PIC_pcrel$0+8(%%r1),%%r1",
xoperands);
}
-
- /* Get the return address into %r31. */
- output_asm_insn ("blr %%r0,%%r31", xoperands);
-
- /* Branch to our target which is in %r1. */
- output_asm_insn ("bv %%r0(%%r1)", xoperands);
-
- if (sibcall)
- {
- /* This call never returns, so we do not need to fix the
- return pointer. */
- output_asm_insn ("nop", xoperands);
- }
else
{
- /* Copy the return address into %r2 also. */
- output_asm_insn ("copy %%r31,%%r2", xoperands);
- }
- }
- else
- {
- xoperands[0] = deferred_plabels[i].internal_label;
+ /* Emit a long plabel-based call sequence. This is
+ essentially an inline implementation of $$dyncall.
+ We don't actually try to call $$dyncall as this is
+ as difficult as calling the function itself. */
+ struct deferred_plabel *p = get_plabel (XSTR (call_dest, 0));
+
+ xoperands[0] = p->internal_label;
+ xoperands[1] = gen_label_rtx ();
+
+ /* Since the call is indirect, FP arguments in registers
+ need to be copied to the general registers. Then, the
+ argument relocation stub will copy them back. */
+ if (TARGET_SOM)
+ copy_fp_args (insn);
+
+ if (flag_pic)
+ {
+ output_asm_insn ("addil LT'%0,%%r19", xoperands);
+ output_asm_insn ("ldw RT'%0(%%r1),%%r1", xoperands);
+ output_asm_insn ("ldw 0(%%r1),%%r1", xoperands);
+ }
+ else
+ {
+ output_asm_insn ("addil LR'%0-$global$,%%r27",
+ xoperands);
+ output_asm_insn ("ldw RR'%0-$global$(%%r1),%%r1",
+ xoperands);
+ }
- /* Get the address of our target into %r22. */
- output_asm_insn ("addil LR%%%0-$global$,%%r27", xoperands);
- output_asm_insn ("ldw RR%%%0-$global$(%%r1),%%r22", xoperands);
+ output_asm_insn ("bb,>=,n %%r1,30,.+16", xoperands);
+ output_asm_insn ("depi 0,31,2,%%r1", xoperands);
+ output_asm_insn ("ldw 4(%%sr0,%%r1),%%r19", xoperands);
+ output_asm_insn ("ldw 0(%%sr0,%%r1),%%r1", xoperands);
- /* Get the high part of the address of $dyncall into %r2, then
- add in the low part in the branch instruction. */
- output_asm_insn ("ldil L%%$$dyncall,%%r2", xoperands);
- if (TARGET_PA_20)
- output_asm_insn ("be,l R%%$$dyncall(%%sr4,%%r2),%%sr0,%%r31",
- xoperands);
- else
- output_asm_insn ("ble R%%$$dyncall(%%sr4,%%r2)", xoperands);
+ if (!sibcall && !TARGET_PA_20)
+ {
+ output_asm_insn ("{bl|b,l} .+8,%%r2", xoperands);
+ output_asm_insn ("addi 16,%%r2,%%r2", xoperands);
+ }
+ }
- if (sibcall)
+ if (TARGET_PA_20)
{
- /* This call never returns, so we do not need to fix the
- return pointer. */
- output_asm_insn ("nop", xoperands);
+ if (sibcall)
+ output_asm_insn ("bve (%%r1)", xoperands);
+ else
+ {
+ if (indirect_call)
+ {
+ output_asm_insn ("bve,l (%%r1),%%r2", xoperands);
+ output_asm_insn ("stw %%r2,-24(%%sp)", xoperands);
+ delay_slot_filled = 1;
+ }
+ else
+ output_asm_insn ("bve,l (%%r1),%%r2", xoperands);
+ }
}
else
{
- /* Copy the return address into %r2 also. */
- output_asm_insn ("copy %%r31,%%r2", xoperands);
- }
- }
- }
+ output_asm_insn ("ldsid (%%r1),%%r31\n\tmtsp %%r31,%%sr0",
+ xoperands);
- /* If we had a jump in the call's delay slot, output it now. */
- if (seq_length != 0 && !delay_insn_deleted)
- {
- xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1);
- output_asm_insn ("b,n %0", xoperands);
+ if (sibcall)
+ output_asm_insn ("be 0(%%sr0,%%r1)", xoperands);
+ else
+ {
+ output_asm_insn ("ble 0(%%sr0,%%r1)", xoperands);
- /* Now delete the delay insn. */
- PUT_CODE (NEXT_INSN (insn), NOTE);
- NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED;
- NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0;
+ if (indirect_call)
+ output_asm_insn ("stw %%r31,-24(%%sp)", xoperands);
+ else
+ output_asm_insn ("copy %%r31,%%r2", xoperands);
+ delay_slot_filled = 1;
+ }
+ }
+ }
}
- return "";
}
- /* This call has an unconditional jump in its delay slot and the
- call is known to reach its target or the beginning of the current
- subspace. */
+ if (seq_length == 0 || (delay_insn_deleted && !delay_slot_filled))
+ output_asm_insn ("nop", xoperands);
- /* Use the containing sequence insn's address. */
- seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0)));
+ /* We are done if there isn't a jump in the delay slot. */
+ if (seq_length == 0
+ || delay_insn_deleted
+ || GET_CODE (NEXT_INSN (insn)) != JUMP_INSN)
+ return "";
- distance = INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn))))
- - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8;
+ /* A sibcall should never have a branch in the delay slot. */
+ if (sibcall)
+ abort ();
- /* If the branch is too far away, emit a normal call followed
- by a nop, followed by the unconditional branch. If the branch
- is close, then adjust %r2 in the call's delay slot. */
+ /* This call has an unconditional jump in its delay slot. */
+ xoperands[0] = XEXP (PATTERN (NEXT_INSN (insn)), 1);
- xoperands[0] = call_dest;
- xoperands[1] = XEXP (PATTERN (NEXT_INSN (insn)), 1);
- if (! VAL_14_BITS_P (distance))
- output_asm_insn ("{bl|b,l} %0,%%r2\n\tnop\n\tb,n %1", xoperands);
- else
+ if (!delay_slot_filled)
{
- xoperands[3] = gen_label_rtx ();
- output_asm_insn ("\n\t{bl|b,l} %0,%%r2\n\tldo %1-%3(%%r2),%%r2",
- xoperands);
- ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L",
- CODE_LABEL_NUMBER (xoperands[3]));
+ /* See if the return address can be adjusted. Use the containing
+ sequence insn's address. */
+ rtx seq_insn = NEXT_INSN (PREV_INSN (XVECEXP (final_sequence, 0, 0)));
+ int distance = (INSN_ADDRESSES (INSN_UID (JUMP_LABEL (NEXT_INSN (insn))))
+ - INSN_ADDRESSES (INSN_UID (seq_insn)) - 8);
+
+ if (VAL_14_BITS_P (distance))
+ {
+ xoperands[1] = gen_label_rtx ();
+ output_asm_insn ("ldo %0-%1(%%r2),%%r2", xoperands);
+ ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, "L",
+ CODE_LABEL_NUMBER (xoperands[1]));
+ }
+ else
+ /* ??? This branch may not reach its target. */
+ output_asm_insn ("nop\n\tb,n %0", xoperands);
}
+ else
+ /* ??? This branch may not reach its target. */
+ output_asm_insn ("b,n %0", xoperands);
/* Delete the jump. */
PUT_CODE (NEXT_INSN (insn), NOTE);
NOTE_LINE_NUMBER (NEXT_INSN (insn)) = NOTE_INSN_DELETED;
NOTE_SOURCE_FILE (NEXT_INSN (insn)) = 0;
+
return "";
}
@@ -6580,8 +6772,8 @@ pa_asm_output_mi_thunk (file, thunk_fndecl, delta, vcall_offset, function)
{
if (! TARGET_64BIT && ! TARGET_PORTABLE_RUNTIME && flag_pic)
{
- fprintf (file, "\taddil LT%%%s,%%r19\n", lab);
- fprintf (file, "\tldw RT%%%s(%%r1),%%r22\n", lab);
+ fprintf (file, "\taddil LT'%s,%%r19\n", lab);
+ fprintf (file, "\tldw RT'%s(%%r1),%%r22\n", lab);
fprintf (file, "\tldw 0(%%sr0,%%r22),%%r22\n");
fprintf (file, "\tbb,>=,n %%r22,30,.+16\n");
fprintf (file, "\tdepi 0,31,2,%%r22\n");
@@ -6603,13 +6795,13 @@ pa_asm_output_mi_thunk (file, thunk_fndecl, delta, vcall_offset, function)
{
if (! TARGET_64BIT && ! TARGET_PORTABLE_RUNTIME && flag_pic)
{
- fprintf (file, "\taddil L%%");
+ fprintf (file, "\taddil L'");
fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta);
- fprintf (file, ",%%r26\n\tldo R%%");
+ fprintf (file, ",%%r26\n\tldo R'");
fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta);
fprintf (file, "(%%r1),%%r26\n");
- fprintf (file, "\taddil LT%%%s,%%r19\n", lab);
- fprintf (file, "\tldw RT%%%s(%%r1),%%r22\n", lab);
+ fprintf (file, "\taddil LT'%s,%%r19\n", lab);
+ fprintf (file, "\tldw RT'%s(%%r1),%%r22\n", lab);
fprintf (file, "\tldw 0(%%sr0,%%r22),%%r22\n");
fprintf (file, "\tbb,>=,n %%r22,30,.+16\n");
fprintf (file, "\tdepi 0,31,2,%%r22\n");
@@ -6620,9 +6812,9 @@ pa_asm_output_mi_thunk (file, thunk_fndecl, delta, vcall_offset, function)
}
else
{
- fprintf (file, "\taddil L%%");
+ fprintf (file, "\taddil L'");
fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta);
- fprintf (file, ",%%r26\n\tb %s\n\tldo R%%", target_name);
+ fprintf (file, ",%%r26\n\tb %s\n\tldo R'", target_name);
fprintf (file, HOST_WIDE_INT_PRINT_DEC, delta);
fprintf (file, "(%%r1),%%r26\n");
}
@@ -6634,7 +6826,7 @@ pa_asm_output_mi_thunk (file, thunk_fndecl, delta, vcall_offset, function)
data_section ();
fprintf (file, "\t.align 4\n");
ASM_OUTPUT_INTERNAL_LABEL (file, "LTHN", current_thunk_number);
- fprintf (file, "\t.word P%%%s\n", target_name);
+ fprintf (file, "\t.word P'%s\n", target_name);
function_section (thunk_fndecl);
}
current_thunk_number++;
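The long-call portion of the pa.c change above picks one of four call sequences and then decides whether a trailing nop is still needed. The following is a minimal standalone sketch of that decision logic, not the code in pa.c: the struct, enum, and function names are hypothetical, and the fields merely stand in for the TARGET_* macros and flag_pic used in the hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the target macros and flag_pic. */
struct target_cfg {
  bool som;             /* TARGET_SOM */
  bool gas;             /* TARGET_GAS */
  bool pic;             /* flag_pic */
  bool long_abs;        /* TARGET_LONG_ABS_CALL */
  bool long_pic_sdiff;  /* TARGET_LONG_PIC_SDIFF_CALL */
  bool long_pic_pcrel;  /* TARGET_LONG_PIC_PCREL_CALL */
};

enum long_call_kind {
  LONG_ABS,        /* ldil / be,l (or ble) absolute sequence */
  LONG_PIC_SDIFF,  /* symbol-difference sequence (HP assembler) */
  LONG_PIC_PCREL,  /* $PIC_pcrel$0 sequence (GAS) */
  LONG_PLABEL      /* inline $$dyncall via a deferred plabel */
};

/* Same test order as the if/else chain in the hunk. */
enum long_call_kind select_long_call(const struct target_cfg *t)
{
  if (t->long_abs && !t->pic)
    return LONG_ABS;
  if (t->som && t->long_pic_sdiff)
    return LONG_PIC_SDIFF;
  if (t->gas && t->long_pic_pcrel)
    return LONG_PIC_PCREL;
  return LONG_PLABEL;
}

/* The hunk sets indirect_call exactly when no direct variant applies. */
bool is_indirect_call(const struct target_cfg *t)
{
  return select_long_call(t) == LONG_PLABEL;
}

/* Mirrors: seq_length == 0 || (delay_insn_deleted && !delay_slot_filled). */
bool need_trailing_nop(int seq_length, bool delay_insn_deleted,
                       bool delay_slot_filled)
{
  return seq_length == 0 || (delay_insn_deleted && !delay_slot_filled);
}

/* Usage example: HP assembler with the SOM linker, non-pic code. */
void example_hp_som_nonpic(void)
{
  struct target_cfg hp = { true, false, false, true, true, false };
  assert(select_long_call(&hp) == LONG_ABS);
  assert(!is_indirect_call(&hp));
}
```

With the macro values defined in pa.h by this commit, a GAS configuration never selects the first three variants, so it always falls through to the plabel sequence.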
diff --git a/gcc/config/pa/pa.h b/gcc/config/pa/pa.h
index 924571a10f7..a3f24b6d6e3 100644
--- a/gcc/config/pa/pa.h
+++ b/gcc/config/pa/pa.h
@@ -31,7 +31,7 @@ enum cmp_type /* comparison type */
};
/* For long call handling. */
-extern unsigned int total_code_bytes;
+extern unsigned long total_code_bytes;
/* Which processor to schedule for. */
@@ -152,6 +152,12 @@ extern int target_flags;
#define TARGET_GNU_LD (target_flags & MASK_GNU_LD)
#endif
+/* Force generation of long calls. */
+#define MASK_LONG_CALLS 32768
+#ifndef TARGET_LONG_CALLS
+#define TARGET_LONG_CALLS (target_flags & MASK_LONG_CALLS)
+#endif
+
#ifndef TARGET_PA_10
#define TARGET_PA_10 (target_flags & (MASK_PA_11 | MASK_PA_20) == 0)
#endif
@@ -179,6 +185,27 @@ extern int target_flags;
#define TARGET_SOM 0
#endif
+/* The following three defines are potential target switches. The current
+ defines are optimal given the current capabilities of GAS and GNU ld. */
+
+/* Define to a C expression evaluating to true to use long absolute calls.
+ Currently, only the HP assembler and SOM linker support long absolute
+ calls. They are used only in non-pic code. */
+#define TARGET_LONG_ABS_CALL (TARGET_SOM && !TARGET_GAS)
+
+/* Define to a C expression evaluating to true to use long pic symbol
+ difference calls. This is a call variant similar to the long pic
+ pc-relative call. Long pic symbol difference calls are only used with
+ the HP SOM linker. Currently, only the HP assembler supports these
+ calls. GAS doesn't allow an arbitrary difference of two symbols. */
+#define TARGET_LONG_PIC_SDIFF_CALL (!TARGET_GAS)
+
+/* Define to a C expression evaluating to true to use long pic
+ pc-relative calls. Long pic pc-relative calls are only used with
+ GAS. Currently, they are usable for calls within a module but
+ not for external calls. */
+#define TARGET_LONG_PIC_PCREL_CALL 0
+
/* Macro to define tables used to set the flags. This is a
list in braces of target switches with each switch being
{ "NAME", VALUE, "HELP_STRING" }. VALUE is the bits to set,
@@ -237,6 +264,10 @@ extern int target_flags;
N_("Generate code for huge switch statements") }, \
{ "no-big-switch", -MASK_BIG_SWITCH, \
N_("Do not generate code for huge switch statements") }, \
+ { "long-calls", MASK_LONG_CALLS, \
+ N_("Always generate long calls") }, \
+ { "no-long-calls", -MASK_LONG_CALLS, \
+ N_("Generate long calls only when needed") }, \
{ "linker-opt", 0, \
N_("Enable linker optimizations") }, \
SUBTARGET_SWITCHES \
@@ -1193,8 +1224,14 @@ extern int may_call_alloca;
/* Using DFmode forces only short displacements \
to be recognized as valid in reg+d addresses. \
However, this is not necessary for PA2.0 since\
- it has long FP loads/stores. */ \
+ it has long FP loads/stores. \
+ \
+ FIXME: the ELF32 linker clobbers the LSB of \
+ the FP register number in {fldw,fstw} insns. \
+ Thus, we only allow long FP loads/stores on \
+ TARGET_64BIT. */ \
&& memory_address_p ((TARGET_PA_20 \
+ && !TARGET_ELF32 \
? GET_MODE (OP) \
: DFmode), \
XEXP (OP, 0)) \
@@ -1300,7 +1337,7 @@ extern int may_call_alloca;
if (GET_CODE (index) == CONST_INT \
&& ((INT_14_BITS (index) \
&& (TARGET_SOFT_FLOAT \
- || (TARGET_PA_20 \
+ || (TARGET_PA_20 \
&& ((MODE == SFmode \
&& (INTVAL (index) % 4) == 0)\
|| (MODE == DFmode \
@@ -1327,6 +1364,7 @@ extern int may_call_alloca;
/* We can allow symbolic LO_SUM addresses\
for PA2.0. */ \
|| (TARGET_PA_20 \
+ && !TARGET_ELF32 \
&& GET_CODE (XEXP (X, 1)) != CONST_INT)\
|| ((MODE) != SFmode \
&& (MODE) != DFmode))) \
@@ -1340,6 +1378,7 @@ extern int may_call_alloca;
/* We can allow symbolic LO_SUM addresses\
for PA2.0. */ \
|| (TARGET_PA_20 \
+ && !TARGET_ELF32 \
&& GET_CODE (XEXP (X, 1)) != CONST_INT)\
|| ((MODE) != SFmode \
&& (MODE) != DFmode))) \
@@ -1354,7 +1393,7 @@ extern int may_call_alloca;
&& REG_OK_FOR_BASE_P (XEXP (X, 0)) \
&& GET_CODE (XEXP (X, 1)) == UNSPEC \
&& (TARGET_SOFT_FLOAT \
- || TARGET_PA_20 \
+ || (TARGET_PA_20 && !TARGET_ELF32) \
|| ((MODE) != SFmode \
&& (MODE) != DFmode))) \
goto ADDR; \
@@ -1386,7 +1425,7 @@ do { \
rtx new, temp = NULL_RTX; \
\
mask = (GET_MODE_CLASS (MODE) == MODE_FLOAT \
- ? (TARGET_PA_20 ? 0x3fff : 0x1f) : 0x3fff); \
+ ? (TARGET_PA_20 && !TARGET_ELF32 ? 0x3fff : 0x1f) : 0x3fff); \
\
if (optimize \
&& GET_CODE (AD) == PLUS) \
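The pa.h changes above add a flag bit with a pair of set/clear switches and narrow the floating-point displacement mask on ELF32. A minimal sketch of both follows. It assumes the usual switch-table convention that a positive VALUE sets bits and a negative VALUE clears the bits of its magnitude; the helper names apply_switch and fp_disp_mask are hypothetical and do not exist in GCC.

```c
#include <assert.h>

#define MASK_LONG_CALLS 32768  /* value taken from the hunk above */

/* Hypothetical helper: apply one switch-table VALUE to a flags word.
   A positive VALUE sets bits; a negative VALUE clears the bits of -VALUE. */
int apply_switch(int flags, int value)
{
  if (value >= 0)
    return flags | value;
  return flags & ~(-value);
}

/* Hypothetical mirror of the mask expression in LEGITIMIZE_RELOAD_ADDRESS:
   floating-point modes get the 14-bit mask only on PA 2.0 without ELF32,
   otherwise the 5-bit mask; all other modes always get the 14-bit mask. */
int fp_disp_mask(int is_float_mode, int pa_20, int elf32)
{
  if (!is_float_mode)
    return 0x3fff;
  return (pa_20 && !elf32) ? 0x3fff : 0x1f;
}

/* Usage example: -mlong-calls followed by -mno-long-calls is a no-op. */
void example_toggle_long_calls(void)
{
  int flags = apply_switch(0, MASK_LONG_CALLS);
  assert(flags & MASK_LONG_CALLS);
  flags = apply_switch(flags, -MASK_LONG_CALLS);
  assert(flags == 0);
}
```

The second helper shows the practical effect of the added !TARGET_ELF32 test: a PA 2.0 ELF32 target falls back to the short displacement range for SFmode and DFmode accesses.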
diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index bbd44fa3d04..ac6bdc9aff4 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -105,12 +105,9 @@
(define_delay (eq_attr "type" "call")
[(eq_attr "in_call_delay" "true") (nil) (nil)])
-;; millicode call delay slot description. Note it disallows delay slot
-;; when TARGET_PORTABLE_RUNTIME is true.
+;; Millicode call delay slot description.
(define_delay (eq_attr "type" "milli")
- [(and (eq_attr "in_call_delay" "true")
- (eq (symbol_ref "TARGET_PORTABLE_RUNTIME") (const_int 0)))
- (nil) (nil)])
+ [(eq_attr "in_call_delay" "true") (nil) (nil)])
;; Return and other similar instructions.
(define_delay (eq_attr "type" "branch,parallel_branch")
@@ -4089,27 +4086,7 @@
"!TARGET_64BIT"
"* return output_mul_insn (0, insn);"
[(set_attr "type" "milli")
- (set (attr "length")
- (cond [
-;; Target (or stub) within reach
- (and (lt (plus (symbol_ref "total_code_bytes") (pc))
- (const_int 240000))
- (eq (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0)))
- (const_int 4)
-
-;; Out of reach PIC
- (ne (symbol_ref "flag_pic")
- (const_int 0))
- (const_int 24)
-
-;; Out of reach PORTABLE_RUNTIME
- (ne (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0))
- (const_int 20)]
-
-;; Out of reach, can use ble
- (const_int 12)))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))])
(define_insn ""
[(set (reg:SI 29) (mult:SI (reg:SI 26) (reg:SI 25)))
@@ -4120,7 +4097,7 @@
"TARGET_64BIT"
"* return output_mul_insn (0, insn);"
[(set_attr "type" "milli")
- (set (attr "length") (const_int 4))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))])
(define_expand "muldi3"
[(set (match_operand:DI 0 "register_operand" "")
@@ -4211,27 +4188,7 @@
"*
return output_div_insn (operands, 0, insn);"
[(set_attr "type" "milli")
- (set (attr "length")
- (cond [
-;; Target (or stub) within reach
- (and (lt (plus (symbol_ref "total_code_bytes") (pc))
- (const_int 240000))
- (eq (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0)))
- (const_int 4)
-
-;; Out of reach PIC
- (ne (symbol_ref "flag_pic")
- (const_int 0))
- (const_int 24)
-
-;; Out of reach PORTABLE_RUNTIME
- (ne (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0))
- (const_int 20)]
-
-;; Out of reach, can use ble
- (const_int 12)))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))])
(define_insn ""
[(set (reg:SI 29)
@@ -4245,7 +4202,7 @@
"*
return output_div_insn (operands, 0, insn);"
[(set_attr "type" "milli")
- (set (attr "length") (const_int 4))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))])
(define_expand "udivsi3"
[(set (reg:SI 26) (match_operand:SI 1 "move_operand" ""))
@@ -4261,6 +4218,7 @@
"
{
operands[3] = gen_reg_rtx (SImode);
+
if (TARGET_64BIT)
{
operands[5] = gen_rtx_REG (SImode, 2);
@@ -4287,27 +4245,7 @@
"*
return output_div_insn (operands, 1, insn);"
[(set_attr "type" "milli")
- (set (attr "length")
- (cond [
-;; Target (or stub) within reach
- (and (lt (plus (symbol_ref "total_code_bytes") (pc))
- (const_int 240000))
- (eq (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0)))
- (const_int 4)
-
-;; Out of reach PIC
- (ne (symbol_ref "flag_pic")
- (const_int 0))
- (const_int 24)
-
-;; Out of reach PORTABLE_RUNTIME
- (ne (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0))
- (const_int 20)]
-
-;; Out of reach, can use ble
- (const_int 12)))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))])
(define_insn ""
[(set (reg:SI 29)
@@ -4321,7 +4259,7 @@
"*
return output_div_insn (operands, 1, insn);"
[(set_attr "type" "milli")
- (set (attr "length") (const_int 4))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))])
(define_expand "modsi3"
[(set (reg:SI 26) (match_operand:SI 1 "move_operand" ""))
@@ -4360,27 +4298,7 @@
"*
return output_mod_insn (0, insn);"
[(set_attr "type" "milli")
- (set (attr "length")
- (cond [
-;; Target (or stub) within reach
- (and (lt (plus (symbol_ref "total_code_bytes") (pc))
- (const_int 240000))
- (eq (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0)))
- (const_int 4)
-
-;; Out of reach PIC
- (ne (symbol_ref "flag_pic")
- (const_int 0))
- (const_int 24)
-
-;; Out of reach PORTABLE_RUNTIME
- (ne (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0))
- (const_int 20)]
-
-;; Out of reach, can use ble
- (const_int 12)))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))])
(define_insn ""
[(set (reg:SI 29) (mod:SI (reg:SI 26) (reg:SI 25)))
@@ -4393,7 +4311,7 @@
"*
return output_mod_insn (0, insn);"
[(set_attr "type" "milli")
- (set (attr "length") (const_int 4))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))])
(define_expand "umodsi3"
[(set (reg:SI 26) (match_operand:SI 1 "move_operand" ""))
@@ -4432,27 +4350,7 @@
"*
return output_mod_insn (1, insn);"
[(set_attr "type" "milli")
- (set (attr "length")
- (cond [
-;; Target (or stub) within reach
- (and (lt (plus (symbol_ref "total_code_bytes") (pc))
- (const_int 240000))
- (eq (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0)))
- (const_int 4)
-
-;; Out of reach PIC
- (ne (symbol_ref "flag_pic")
- (const_int 0))
- (const_int 24)
-
-;; Out of reach PORTABLE_RUNTIME
- (ne (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0))
- (const_int 20)]
-
-;; Out of reach, can use ble
- (const_int 12)))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))])
(define_insn ""
[(set (reg:SI 29) (umod:SI (reg:SI 26) (reg:SI 25)))
@@ -4465,7 +4363,7 @@
"*
return output_mod_insn (1, insn);"
[(set_attr "type" "milli")
- (set (attr "length") (const_int 4))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 0)"))])
;;- and instructions
;; We define DImode `and` so with DImode `not` we can get
@@ -6036,11 +5934,12 @@
call_insn = emit_call_insn (gen_call_internal_reg (operands[1]));
}
+ if (TARGET_64BIT)
+ use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx);
+
if (flag_pic)
{
use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), pic_offset_table_rtx);
- if (TARGET_64BIT)
- use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx);
/* After each call we must restore the PIC register, even if it
doesn't appear to be used. */
@@ -6052,6 +5951,7 @@
(define_insn "call_internal_symref"
[(call (mem:SI (match_operand 0 "call_operand_address" ""))
(match_operand 1 "" "i"))
+ (clobber (reg:SI 1))
(clobber (reg:SI 2))
(use (const_int 0))]
"! TARGET_PORTABLE_RUNTIME"
@@ -6061,21 +5961,7 @@
return output_call (insn, operands[0], 0);
}"
[(set_attr "type" "call")
- (set (attr "length")
-;; If we're sure that we can either reach the target or that the
-;; linker can use a long-branch stub, then the length is at most
-;; 8 bytes.
-;;
-;; For long-calls the length will be at most 68 bytes (non-pic)
-;; or 84 bytes (pic). */
-;; Else we have to use a long-call;
- (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc))
- (const_int 240000))
- (const_int 8)
- (if_then_else (eq (symbol_ref "flag_pic")
- (const_int 0))
- (const_int 68)
- (const_int 84))))])
+ (set (attr "length") (symbol_ref "attr_length_call (insn, 0)"))])
(define_insn "call_internal_reg_64bit"
[(call (mem:SI (match_operand:DI 0 "register_operand" "r"))
@@ -6086,15 +5972,16 @@
"*
{
/* ??? Needs more work. Length computation, split into multiple insns,
- do not use %r22 directly, expose delay slot. */
- return \"ldd 16(%0),%%r2\;ldd 24(%0),%%r27\;bve,l (%%r2),%%r2\;nop\";
+ expose delay slot. */
+ return \"ldd 16(%0),%%r2\;bve,l (%%r2),%%r2\;ldd 24(%0),%%r27\";
}"
[(set_attr "type" "dyncall")
- (set (attr "length") (const_int 16))])
+ (set (attr "length") (const_int 12))])
(define_insn "call_internal_reg"
[(call (mem:SI (reg:SI 22))
(match_operand 0 "" "i"))
+ (clobber (reg:SI 1))
(clobber (reg:SI 2))
(use (const_int 1))]
""
@@ -6218,11 +6105,13 @@
call_insn = emit_call_insn (gen_call_value_internal_reg (operands[0],
operands[2]));
}
+
+ if (TARGET_64BIT)
+ use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx);
+
if (flag_pic)
{
use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), pic_offset_table_rtx);
- if (TARGET_64BIT)
- use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx);
/* After each call we must restore the PIC register, even if it
doesn't appear to be used. */
@@ -6235,6 +6124,7 @@
[(set (match_operand 0 "" "=rf")
(call (mem:SI (match_operand 1 "call_operand_address" ""))
(match_operand 2 "" "i")))
+ (clobber (reg:SI 1))
(clobber (reg:SI 2))
(use (const_int 0))]
;;- Don't use operand 1 for most machines.
@@ -6245,21 +6135,7 @@
return output_call (insn, operands[1], 0);
}"
[(set_attr "type" "call")
- (set (attr "length")
-;; If we're sure that we can either reach the target or that the
-;; linker can use a long-branch stub, then the length is at most
-;; 8 bytes.
-;;
-;; For long-calls the length will be at most 68 bytes (non-pic)
-;; or 84 bytes (pic). */
-;; Else we have to use a long-call;
- (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc))
- (const_int 240000))
- (const_int 8)
- (if_then_else (eq (symbol_ref "flag_pic")
- (const_int 0))
- (const_int 68)
- (const_int 84))))])
+ (set (attr "length") (symbol_ref "attr_length_call (insn, 0)"))])
(define_insn "call_value_internal_reg_64bit"
[(set (match_operand 0 "" "=rf")
@@ -6271,16 +6147,17 @@
"*
{
/* ??? Needs more work. Length computation, split into multiple insns,
- do not use %r22 directly, expose delay slot. */
- return \"ldd 16(%1),%%r2\;ldd 24(%1),%%r27\;bve,l (%%r2),%%r2\;nop\";
+ expose delay slot. */
+ return \"ldd 16(%1),%%r2\;bve,l (%%r2),%%r2\;ldd 24(%1),%%r27\";
}"
[(set_attr "type" "dyncall")
- (set (attr "length") (const_int 16))])
+ (set (attr "length") (const_int 12))])
(define_insn "call_value_internal_reg"
[(set (match_operand 0 "" "=rf")
(call (mem:SI (reg:SI 22))
(match_operand 1 "" "i")))
+ (clobber (reg:SI 1))
(clobber (reg:SI 2))
(use (const_int 1))]
""
@@ -6389,10 +6266,9 @@
}")
(define_expand "sibcall"
- [(parallel [(call (match_operand:SI 0 "" "")
- (match_operand 1 "" ""))
- (clobber (reg:SI 0))])]
- "! TARGET_PORTABLE_RUNTIME"
+ [(call (match_operand:SI 0 "" "")
+ (match_operand 1 "" ""))]
+ "!TARGET_PORTABLE_RUNTIME"
"
{
rtx op;
@@ -6400,8 +6276,21 @@
op = XEXP (operands[0], 0);
- /* We do not allow indirect sibling calls. */
- call_insn = emit_call_insn (gen_sibcall_internal_symref (op, operands[1]));
+ if (TARGET_64BIT)
+ emit_move_insn (arg_pointer_rtx,
+ gen_rtx_PLUS (word_mode, virtual_outgoing_args_rtx,
+ GEN_INT (64)));
+
+ /* Indirect sibling calls are not allowed. */
+ if (TARGET_64BIT)
+ call_insn = gen_sibcall_internal_symref_64bit (op, operands[1]);
+ else
+ call_insn = gen_sibcall_internal_symref (op, operands[1]);
+
+ call_insn = emit_call_insn (call_insn);
+
+ if (TARGET_64BIT)
+ use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx);
if (flag_pic)
{
@@ -6417,38 +6306,39 @@
(define_insn "sibcall_internal_symref"
[(call (mem:SI (match_operand 0 "call_operand_address" ""))
(match_operand 1 "" "i"))
- (clobber (reg:SI 0))
+ (clobber (reg:SI 1))
(use (reg:SI 2))
(use (const_int 0))]
- "! TARGET_PORTABLE_RUNTIME"
+ "!TARGET_PORTABLE_RUNTIME && !TARGET_64BIT"
"*
{
output_arg_descriptor (insn);
return output_call (insn, operands[0], 1);
}"
[(set_attr "type" "call")
- (set (attr "length")
-;; If we're sure that we can either reach the target or that the
-;; linker can use a long-branch stub, then the length is at most
-;; 8 bytes.
-;;
-;; For long-calls the length will be at most 68 bytes (non-pic)
-;; or 84 bytes (pic). */
-;; Else we have to use a long-call;
- (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc))
- (const_int 240000))
- (const_int 8)
- (if_then_else (eq (symbol_ref "flag_pic")
- (const_int 0))
- (const_int 68)
- (const_int 84))))])
+ (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))])
+
+(define_insn "sibcall_internal_symref_64bit"
+ [(call (mem:SI (match_operand 0 "call_operand_address" ""))
+ (match_operand 1 "" "i"))
+ (clobber (reg:SI 1))
+ (clobber (reg:SI 27))
+ (use (reg:SI 2))
+ (use (const_int 0))]
+ "TARGET_64BIT"
+ "*
+{
+ output_arg_descriptor (insn);
+ return output_call (insn, operands[0], 1);
+}"
+ [(set_attr "type" "call")
+ (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))])
(define_expand "sibcall_value"
- [(parallel [(set (match_operand 0 "" "")
+ [(set (match_operand 0 "" "")
(call (match_operand:SI 1 "" "")
- (match_operand 2 "" "")))
- (clobber (reg:SI 0))])]
- "! TARGET_PORTABLE_RUNTIME"
+ (match_operand 2 "" "")))]
+ "!TARGET_PORTABLE_RUNTIME"
"
{
rtx op;
@@ -6456,10 +6346,24 @@
op = XEXP (operands[1], 0);
- /* We do not allow indirect sibling calls. */
- call_insn = emit_call_insn (gen_sibcall_value_internal_symref (operands[0],
- op,
- operands[2]));
+ if (TARGET_64BIT)
+ emit_move_insn (arg_pointer_rtx,
+ gen_rtx_PLUS (word_mode, virtual_outgoing_args_rtx,
+ GEN_INT (64)));
+
+ /* Indirect sibling calls are not allowed. */
+ if (TARGET_64BIT)
+ call_insn
+ = gen_sibcall_value_internal_symref_64bit (operands[0], op, operands[2]);
+ else
+ call_insn
+ = gen_sibcall_value_internal_symref (operands[0], op, operands[2]);
+
+ call_insn = emit_call_insn (call_insn);
+
+ if (TARGET_64BIT)
+ use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), arg_pointer_rtx);
+
if (flag_pic)
{
use_reg (&CALL_INSN_FUNCTION_USAGE (call_insn), pic_offset_table_rtx);
@@ -6475,32 +6379,34 @@
[(set (match_operand 0 "" "=rf")
(call (mem:SI (match_operand 1 "call_operand_address" ""))
(match_operand 2 "" "i")))
- (clobber (reg:SI 0))
+ (clobber (reg:SI 1))
(use (reg:SI 2))
(use (const_int 0))]
- ;;- Don't use operand 1 for most machines.
- "! TARGET_PORTABLE_RUNTIME"
+ "!TARGET_PORTABLE_RUNTIME && !TARGET_64BIT"
"*
{
output_arg_descriptor (insn);
return output_call (insn, operands[1], 1);
}"
[(set_attr "type" "call")
- (set (attr "length")
-;; If we're sure that we can either reach the target or that the
-;; linker can use a long-branch stub, then the length is at most
-;; 8 bytes.
-;;
-;; For long-calls the length will be at most 68 bytes (non-pic)
-;; or 84 bytes (pic). */
-;; Else we have to use a long-call;
- (if_then_else (lt (plus (symbol_ref "total_code_bytes") (pc))
- (const_int 240000))
- (const_int 8)
- (if_then_else (eq (symbol_ref "flag_pic")
- (const_int 0))
- (const_int 68)
- (const_int 84))))])
+ (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))])
+
+(define_insn "sibcall_value_internal_symref_64bit"
+ [(set (match_operand 0 "" "=rf")
+ (call (mem:SI (match_operand 1 "call_operand_address" ""))
+ (match_operand 2 "" "i")))
+ (clobber (reg:SI 1))
+ (clobber (reg:SI 27))
+ (use (reg:SI 2))
+ (use (const_int 0))]
+ "TARGET_64BIT"
+ "*
+{
+ output_arg_descriptor (insn);
+ return output_call (insn, operands[1], 1);
+}"
+ [(set_attr "type" "call")
+ (set (attr "length") (symbol_ref "attr_length_call (insn, 1)"))])
(define_insn "nop"
[(const_int 0)]
@@ -7392,6 +7298,12 @@
"!TARGET_64BIT"
"*
{
+ int length = get_attr_length (insn);
+ rtx xoperands[2];
+
+ xoperands[0] = GEN_INT (length - 8);
+ xoperands[1] = GEN_INT (length - 16);
+
/* Must import the magic millicode routine. */
output_asm_insn (\".IMPORT $$sh_func_adrs,MILLICODE\", NULL);
@@ -7400,60 +7312,24 @@
First, copy our input parameter into %r29 just in case we don't
need to call $$sh_func_adrs. */
output_asm_insn (\"copy %%r26,%%r29\", NULL);
+ output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\", NULL);
/* Next, examine the low two bits in %r26, if they aren't 0x2, then
we use %r26 unchanged. */
- if (get_attr_length (insn) == 32)
- output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+24\", NULL);
- else if (get_attr_length (insn) == 40)
- output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+32\", NULL);
- else if (get_attr_length (insn) == 44)
- output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+36\", NULL);
- else
- output_asm_insn (\"{extru|extrw,u} %%r26,31,2,%%r31\;{comib|cmpib},<>,n 2,%%r31,.+20\", NULL);
+ output_asm_insn (\"{comib|cmpib},<>,n 2,%%r31,.+%0\", xoperands);
+ output_asm_insn (\"ldi 4096,%%r31\", NULL);
/* Next, compare %r26 with 4096, if %r26 is less than or equal to
- 4096, then we use %r26 unchanged. */
- if (get_attr_length (insn) == 32)
- output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+16\",
- NULL);
- else if (get_attr_length (insn) == 40)
- output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+24\",
- NULL);
- else if (get_attr_length (insn) == 44)
- output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+28\",
- NULL);
- else
- output_asm_insn (\"ldi 4096,%%r31\;{comb|cmpb},<<,n %%r26,%%r31,.+12\",
- NULL);
+ 4096, then again we use %r26 unchanged. */
+ output_asm_insn (\"{comb|cmpb},<<,n %%r26,%%r31,.+%1\", xoperands);
- /* Else call $$sh_func_adrs to extract the function's real add24. */
+ /* Finally, call $$sh_func_adrs to extract the function's real add24. */
return output_millicode_call (insn,
gen_rtx_SYMBOL_REF (SImode,
- \"$$sh_func_adrs\"));
+ \"$$sh_func_adrs\"));
}"
[(set_attr "type" "multi")
- (set (attr "length")
- (cond [
-;; Target (or stub) within reach
- (and (lt (plus (symbol_ref "total_code_bytes") (pc))
- (const_int 240000))
- (eq (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0)))
- (const_int 28)
-
-;; Out of reach PIC
- (ne (symbol_ref "flag_pic")
- (const_int 0))
- (const_int 44)
-
-;; Out of reach PORTABLE_RUNTIME
- (ne (symbol_ref "TARGET_PORTABLE_RUNTIME")
- (const_int 0))
- (const_int 40)]
-
-;; Out of reach, can use ble
- (const_int 32)))])
+ (set (attr "length") (symbol_ref "attr_length_millicode_call (insn, 20)"))])
;; On the PA, the PIC register is call clobbered, so it must
;; be saved & restored around calls by the caller. If the call
diff --git a/gcc/config/pa/som.h b/gcc/config/pa/som.h
index e72b7fed64a..98c66fbfe37 100644
--- a/gcc/config/pa/som.h
+++ b/gcc/config/pa/som.h
@@ -371,3 +371,7 @@ do { \
on the location of the GCC tool directory. The downside is GCC
cannot be moved after installation using a symlink. */
#define ALWAYS_STRIP_DOTDOT 1
+
+/* Aggregates with a single float or double field should be passed and
+ returned in the general registers. */
+#define MEMBER_TYPE_FORCES_BLK(FIELD, MODE) (MODE==SFmode || MODE==DFmode)
diff --git a/gcc/config/pa/t-pa64 b/gcc/config/pa/t-pa64
index 9323a250ed2..d1b2b264931 100644
--- a/gcc/config/pa/t-pa64
+++ b/gcc/config/pa/t-pa64
@@ -1,4 +1,4 @@
-TARGET_LIBGCC2_CFLAGS = -fPIC -Dpa64=1 -DELF=1
+TARGET_LIBGCC2_CFLAGS = -fPIC -Dpa64=1 -DELF=1 -mlong-calls
LIB2FUNCS_EXTRA=quadlib.c
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 43acaace119..7d994b19c5b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -508,7 +508,7 @@ in the following sections.
-march=@var{architecture-type} @gol
-mbig-switch -mdisable-fpregs -mdisable-indexing @gol
-mfast-indirect-calls -mgas -mgnu-ld -mhp-ld @gol
--mjump-in-delay -mlinker-opt @gol
+-mjump-in-delay -mlinker-opt -mlong-calls @gol
-mlong-load-store -mno-big-switch -mno-disable-fpregs @gol
-mno-disable-indexing -mno-fast-indirect-calls -mno-gas @gol
-mno-jump-in-delay -mno-long-load-store @gol
@@ -8094,6 +8094,33 @@ configure option, gcc's program search path, and finally by the user's
@env{PATH}. The linker used by GCC can be printed using @samp{which
`gcc -print-prog-name=ld`}.
+@item -mlong-calls
+@opindex mlong-calls
+Generate code that uses long call sequences. This ensures that a call
+is always able to reach linker-generated stubs. The default is to generate
+long calls only when the distance from the call site to the beginning
+of the function or translation unit, as the case may be, exceeds a
+predefined limit set by the branch type being used. The limits for
+normal calls are 7,600,000 and 240,000 bytes for the PA 2.0 and
+PA 1.X architectures, respectively. Sibcalls are always limited to
+240,000 bytes.
+
+Distances are measured from the beginning of functions when using the
+@option{-ffunction-sections} option, or when using the @option{-mgas}
+and @option{-mno-portable-runtime} options together under HP-UX with
+the SOM linker.
+
+It is normally not desirable to use this option as it will degrade
+performance. However, it may be useful in large applications,
+particularly when partial linking is used to build the application.
+
+The types of long calls used depend on the capabilities of the
+assembler and linker, and the type of code being generated. The
+impact on systems that support long absolute calls, and long PIC
+symbol-difference or pc-relative calls, should be relatively small.
+However, an indirect call is used on 32-bit ELF systems in PIC code
+and it is quite long.
+
@end table
@node Intel 960 Options