This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: RFC: SPARC tail calls using the new infrastructure


On Tue, Mar 21, 2000 at 08:11:08AM -0700, Jeffrey A Law wrote:
> 
>   In message <20000321103500.L525@mff.cuni.cz>you write:
>   > Hi!
>   > 
>   > On SPARC (and I assume on all machines with register windows) when doing a
>   > tail call we have to put arguments into the INCOMING registers as opposed to
>   > normal calls where they are in OUTGOING registers, because the tail call
>   > must unroll the register window first.
>   > I wonder what is the best place to do this.
> optimize_sibling_and_tail_recursive_calls is certainly the wrong place to
> do this.  In fact, part of the point of moving all the argument handling for
> tail call optimization to the tree level was to avoid this kind of problem.

OK, understood; that's why it was an RFC.

> 
> The existing code should already be putting arguments into the incoming
> registers on the sparc.  If it isn't, you should find out why.

Where is that code? I cannot find any trace of it in any of Richard's tail
call commits.
emit_call_1 unconditionally uses FUNCTION_ARG (and not
FUNCTION_INCOMING_ARG) or hard_function_value with 0 as the last argument.
Nothing related to the function argument/retval registers
checks ECF_SIBCALL or pass == 0 (depending on the function), and
initialize_argument_information does not get that information at all.
Is the solution you want to see something like the following, say in
initialize_argument_information, provided it gets an ecf_flags argument:

#ifdef FUNCTION_INCOMING_ARG
      if (ecf_flags & ECF_SIBCALL)
	args[i].reg = FUNCTION_INCOMING_ARG (*args_so_far, mode, type,
					     argpos < n_named_args);
      else
#endif
	args[i].reg = FUNCTION_ARG (*args_so_far, mode, type,
				    argpos < n_named_args);

or something completely different?

Just to make sure there is no confusion:
A normal call on SPARC looks like
	mov 0, %o0
	mov 1, %o1
	call foo
	 mov 2, %o2

while a tail call (unless it is in a leaf function) has to look like
	mov 0, %i0
	mov 1, %i1
	call foo
	 restore %g0, 2, %o2

because the register window is switched before the call.
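
To make the register correspondence concrete, here is a minimal sketch of
the mapping that INCOMING_REGNO and OUTGOING_REGNO express. It is not the
actual sparc.h code: the function names are made up for illustration, it
assumes the usual SPARC hard register numbering (%o0-%o7 = 8-15,
%i0-%i7 = 24-31), and it ignores -mflat, where no window switch happens.

```c
/* Illustrative sketch only, not the sparc.h macros.  Assumed numbering:
   %g0-%g7 = 0-7, %o0-%o7 = 8-15, %l0-%l7 = 16-23, %i0-%i7 = 24-31.
   Both function names are hypothetical.  */

/* Register number under which the called function sees a value that
   the calling function placed in OUT.  */
int
sketch_incoming_regno (int out)
{
  return (out >= 8 && out <= 15) ? out + 16 : out;
}

/* The inverse mapping: the caller's view of the callee's register IN.  */
int
sketch_outgoing_regno (int in)
{
  return (in >= 24 && in <= 31) ? in - 16 : in;
}
```

So sketch_incoming_regno (8) gives 24 (%o0 corresponds to %i0), while
globals and locals map to themselves.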

Attached is a new version of the patch. It still uses the remap in sibcall.c,
because I'd like to know first how you want the incoming arguments handled.
It did not trigger any testsuite regressions and already supports leaf
functions with tail calls.
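
For anyone who does not want to read the whole diff, the remap in sibcall.c
works in two passes, roughly as in the toy model below. This is only a
sketch under simplifying assumptions: it uses a plain array instead of RTL,
every name in it is hypothetical, and it returns -1 where the real function
calls abort.

```c
/* Toy model, not GCC's RTL: each "insn" before the call sets at most
   one hard register (dest < 0 means it sets none).  All names are
   hypothetical.  */
struct toy_insn { int dest; };

/* Assumed SPARC mapping without -mflat: %o0-%o7 (8-15) become
   %i0-%i7 (24-31); everything else is unchanged.  */
int
toy_incoming_regno (int out)
{
  return (out >= 8 && out <= 15) ? out + 16 : out;
}

/* USES holds the N_USES hard registers the call reads; INSNS are the
   N_INSNS insns preceding the call in its basic block.  Returns the
   number of setters rewritten, or -1 if some setter was not found.  */
int
toy_remap_arguments (int *uses, int n_uses,
                     struct toy_insn *insns, int n_insns)
{
  unsigned long pending = 0;  /* bit N set: a setter of reg N needs remapping */
  int i, remapped = 0;

  /* Pass 1: rewrite the call's use list and note what changed.  */
  for (i = 0; i < n_uses; i++)
    {
      int in = toy_incoming_regno (uses[i]);
      if (in != uses[i])
        {
          pending |= 1UL << uses[i];
          uses[i] = in;
        }
    }

  /* Pass 2: walk backwards from the call and redirect the nearest
     setter of each noted register.  */
  for (i = n_insns - 1; i >= 0 && pending; i--)
    {
      int d = insns[i].dest;
      if (d >= 0 && d < 32 && (pending & (1UL << d)))
        {
          insns[i].dest = toy_incoming_regno (d);
          pending &= ~(1UL << d);
          remapped++;
        }
    }

  return pending ? -1 : remapped;
}
```

The backwards walk stops as soon as every noted register has found its
setter, which mirrors the remap_nregs countdown in the patch.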

2000-03-21  Jakub Jelinek  <jakub@redhat.com>

	* sibcall.c (skip_copy_to_return_value): Use OUTGOING_REGNO for
	comparison if regno's are equal.
	(sibcall_remap_arguments): New function.
	(optimize_sibling_and_tail_recursive_calls): Call it.
	* jump.c (jump_optimize_1): Avoid calling delete_unreferenced_labels
	for minimal jump, because those labels might be referenced from
	within CALL_PLACEHOLDERs.

	* final.c (permitted_reg_in_leaf_functions, only_leaf_regs_used):
	Change LEAF_REGISTERS from an array initializer to actual array
	identifier. Move static global variable into the function.
	(leaf_function_p): Allow SIBLING_CALL_P calls even outside of
	sequences for leaf functions.
	* global.c (global_alloc): Likewise.
	* tm.texi (LEAF_REGISTERS): Update documentation.

	* config/sparc/sparc.h (CONDITIONAL_REGISTER_USAGE): Remove the ugly
	TARGET_FLAT leaf disabling hack.
	(LEAF_REGISTERS): Changed from an array initializer to actual array
	identifier to avoid duplication and remove the above hack.
	* config/sparc/sparc.md (sibcall): New attr type. Use it almost
	always like call attribute.
	(eligible_for_sibcall_delay): New attribute.
	(sibcall): New delay type.
	(sibcall, sibcall_value, sibcall_epilogue): New expands.
	(sibcall_address_sp32, sibcall_symbolic_sp32, sibcall_address_sp64,
	sibcall_symbolic_sp64, sibcall_value_address_sp32,
	sibcall_value_symbolic_sp32, sibcall_value_address_sp64,
	sibcall_value_symbolic_sp64): New insns.
	* config/sparc/sparc.c (sparc_leaf_regs): New array.
	(eligible_for_sibcall_delay, output_restore_regs, output_sibcall):
	New functions.
	(output_function_epilogue): Move part of the code into
	output_restore_regs.
	(ultra_code_from_mask, ultrasparc_sched_reorder): Handle
	TYPE_SIBCALL.
	* sparc-protos.h (output_sibcall, eligible_for_sibcall_delay): New
	prototypes.

--- gcc/config/sparc/sparc.h.jj	Mon Mar 13 18:08:11 2000
+++ gcc/config/sparc/sparc.h	Tue Mar 21 12:27:53 2000
@@ -1066,8 +1066,8 @@ do								\
 	   %fp, but output it as %i7.  */			\
 	fixed_regs[31] = 1;					\
 	reg_names[FRAME_POINTER_REGNUM] = "%i7";		\
-	/* ??? This is a hack to disable leaf functions.  */	\
-	global_regs[7] = 1;					\
+	/* Disable leaf functions */				\
+	bzero (sparc_leaf_regs, FIRST_PSEUDO_REGISTER);		\
       }								\
     if (profile_block_flag)					\
       {								\
@@ -1373,26 +1373,8 @@ extern enum reg_class sparc_regno_reg_cl
   
 #define ORDER_REGS_FOR_LOCAL_ALLOC order_regs_for_local_alloc ()
 
-/* ??? %g7 is not a leaf register to effectively #undef LEAF_REGISTERS when
-   -mflat is used.  Function only_leaf_regs_used will return 0 if a global
-   register is used and is not permitted in a leaf function.  We make %g7
-   a global reg if -mflat and voila.  Since %g7 is a system register and is
-   fixed it won't be used by gcc anyway.  */
-
-#define LEAF_REGISTERS \
-{ 1, 1, 1, 1, 1, 1, 1, 0,	\
-  0, 0, 0, 0, 0, 0, 1, 0,	\
-  0, 0, 0, 0, 0, 0, 0, 0,	\
-  1, 1, 1, 1, 1, 1, 0, 1,	\
-  1, 1, 1, 1, 1, 1, 1, 1,	\
-  1, 1, 1, 1, 1, 1, 1, 1,	\
-  1, 1, 1, 1, 1, 1, 1, 1,	\
-  1, 1, 1, 1, 1, 1, 1, 1,	\
-  1, 1, 1, 1, 1, 1, 1, 1,	\
-  1, 1, 1, 1, 1, 1, 1, 1,	\
-  1, 1, 1, 1, 1, 1, 1, 1,	\
-  1, 1, 1, 1, 1, 1, 1, 1,	\
-  1, 1, 1, 1, 1}
+extern char sparc_leaf_regs[];
+#define LEAF_REGISTERS sparc_leaf_regs
 
 extern char leaf_reg_remap[];
 #define LEAF_REG_REMAP(REGNO) (leaf_reg_remap[REGNO])
--- gcc/config/sparc/sparc.md.jj	Mon Mar 13 18:05:46 2000
+++ gcc/config/sparc/sparc.md	Tue Mar 21 13:31:02 2000
@@ -88,7 +88,7 @@
 ;; type "call_no_delay_slot" is a call followed by an unimp instruction.
 
 (define_attr "type"
-  "move,unary,binary,compare,load,sload,store,ialu,shift,uncond_branch,branch,call,call_no_delay_slot,return,address,imul,fpload,fpstore,fp,fpmove,fpcmove,fpcmp,fpmul,fpdivs,fpdivd,fpsqrts,fpsqrtd,cmove,multi,misc"
+  "move,unary,binary,compare,load,sload,store,ialu,shift,uncond_branch,branch,call,sibcall,call_no_delay_slot,return,address,imul,fpload,fpstore,fp,fpmove,fpcmove,fpcmp,fpmul,fpdivs,fpdivd,fpsqrts,fpsqrtd,cmove,multi,misc"
   (const_string "binary"))
 
 ;; Set true if insn uses call-clobbered intermediate register.
@@ -131,7 +131,7 @@
 ;; Attributes for instruction and branch scheduling
 
 (define_attr "in_call_delay" "false,true"
-  (cond [(eq_attr "type" "uncond_branch,branch,call,call_no_delay_slot,return,multi")
+  (cond [(eq_attr "type" "uncond_branch,branch,call,sibcall,call_no_delay_slot,return,multi")
 	 	(const_string "false")
 	 (eq_attr "type" "load,fpload,store,fpstore")
 	 	(if_then_else (eq_attr "length" "1")
@@ -148,6 +148,12 @@
 (define_delay (eq_attr "type" "call")
   [(eq_attr "in_call_delay" "true") (nil) (nil)])
 
+(define_attr "eligible_for_sibcall_delay" "false,true"
+  (symbol_ref "eligible_for_sibcall_delay(insn)"))
+
+(define_delay (eq_attr "type" "sibcall")
+  [(eq_attr "eligible_for_sibcall_delay" "true") (nil) (nil)])
+
 (define_attr "leaf_function" "false,true"
   (const (symbol_ref "current_function_uses_only_leaf_regs")))
 
@@ -179,19 +185,19 @@
 ;; because it prevents us from moving back the final store of inner loops.
 
 (define_attr "in_branch_delay" "false,true"
-  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,call_no_delay_slot,multi")
+  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,sibcall,call_no_delay_slot,multi")
 		     (eq_attr "length" "1"))
 		(const_string "true")
 		(const_string "false")))
 
 (define_attr "in_uncond_branch_delay" "false,true"
-  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,call_no_delay_slot,multi")
+  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,sibcall,call_no_delay_slot,multi")
 		     (eq_attr "length" "1"))
 		(const_string "true")
 		(const_string "false")))
 
 (define_attr "in_annul_branch_delay" "false,true"
-  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,call_no_delay_slot,multi")
+  (if_then_else (and (eq_attr "type" "!uncond_branch,branch,call,sibcall,call_no_delay_slot,multi")
 		     (eq_attr "length" "1"))
 		(const_string "true")
 		(const_string "false")))
@@ -453,7 +459,7 @@
 
 (define_function_unit "ieuN" 2 0
   (and (eq_attr "cpu" "ultrasparc")
-    (eq_attr "type" "ialu,binary,move,unary,shift,compare,call,call_no_delay_slot,uncond_branch"))
+    (eq_attr "type" "ialu,binary,move,unary,shift,compare,call,sibcall,call_no_delay_slot,uncond_branch"))
   1 1)
 
 (define_function_unit "ieu0" 1 0
@@ -468,7 +474,7 @@
 
 (define_function_unit "ieu1" 1 0
   (and (eq_attr "cpu" "ultrasparc")
-    (eq_attr "type" "compare,call,call_no_delay_slot,uncond_branch"))
+    (eq_attr "type" "compare,call,sibcall,call_no_delay_slot,uncond_branch"))
   1 1)
 
 (define_function_unit "cti" 1 0
@@ -8569,6 +8575,93 @@
 
   DONE;
 }")
+
+;;- tail calls
+(define_expand "sibcall"
+  [(parallel [(call (match_operand 0 "call_operand" "") (const_int 0))
+	      (return)])]
+  ""
+  "")
+
+(define_insn "*sibcall_address_sp32"
+  [(call (mem:SI (match_operand:SI 0 "address_operand" "p"))
+	 (match_operand 1 "" ""))
+   (return)]
+  "! TARGET_PTR64"
+  "* return output_sibcall(insn, operands[0]);"
+  [(set_attr "type" "sibcall")])
+
+(define_insn "*sibcall_symbolic_sp32"
+  [(call (mem:SI (match_operand:SI 0 "symbolic_operand" "s"))
+	 (match_operand 1 "" ""))
+   (return)]
+  "! TARGET_PTR64"
+  "* return output_sibcall(insn, operands[0]);"
+  [(set_attr "type" "sibcall")])
+
+(define_insn "*sibcall_address_sp64"
+  [(call (mem:SI (match_operand:DI 0 "address_operand" "p"))
+	 (match_operand 1 "" ""))
+   (return)]
+  "TARGET_PTR64"
+  "* return output_sibcall(insn, operands[0]);"
+  [(set_attr "type" "sibcall")])
+
+(define_insn "*sibcall_symbolic_sp64"
+  [(call (mem:SI (match_operand:DI 0 "symbolic_operand" "s"))
+	 (match_operand 1 "" ""))
+   (return)]
+  "TARGET_PTR64"
+  "* return output_sibcall(insn, operands[0]);"
+  [(set_attr "type" "sibcall")])
+
+(define_expand "sibcall_value"
+  [(parallel [(set (match_operand 0 "register_operand" "=rf")
+		(call (match_operand:SI 1 "" "") (const_int 0)))
+	      (return)])]
+  ""
+  "")
+
+(define_insn "*sibcall_value_address_sp32"
+  [(set (match_operand 0 "" "=rf")
+	(call (mem:SI (match_operand:SI 1 "address_operand" "p"))
+	      (match_operand 2 "" "")))
+   (return)]
+  "! TARGET_PTR64"
+  "* return output_sibcall(insn, operands[1]);"
+  [(set_attr "type" "sibcall")])
+
+(define_insn "*sibcall_value_symbolic_sp32"
+  [(set (match_operand 0 "" "=rf")
+	(call (mem:SI (match_operand:SI 1 "symbolic_operand" "s"))
+	      (match_operand 2 "" "")))
+   (return)]
+  "! TARGET_PTR64"
+  "* return output_sibcall(insn, operands[1]);"
+  [(set_attr "type" "sibcall")])
+
+(define_insn "*sibcall_value_address_sp64"
+  [(set (match_operand 0 "" "")
+	(call (mem:SI (match_operand:DI 1 "address_operand" "p"))
+	      (match_operand 2 "" "")))
+   (return)]
+  "TARGET_PTR64"
+  "* return output_sibcall(insn, operands[1]);"
+  [(set_attr "type" "sibcall")])
+
+(define_insn "*sibcall_value_symbolic_sp64"
+  [(set (match_operand 0 "" "")
+	(call (mem:SI (match_operand:DI 1 "symbolic_operand" "s"))
+	      (match_operand 2 "" "")))
+   (return)]
+  "TARGET_PTR64"
+  "* return output_sibcall(insn, operands[1]);"
+  [(set_attr "type" "sibcall")])
+
+(define_expand "sibcall_epilogue"
+  [(const_int 0)]
+  "! TARGET_FLAT"
+  "DONE;")
 
 ;; UNSPEC_VOLATILE is considered to use and clobber all hard registers and
 ;; all of memory.  This blocks insns from being moved across this point.
--- gcc/config/sparc/sparc.c.jj	Mon Mar 13 18:05:46 2000
+++ gcc/config/sparc/sparc.c	Tue Mar 21 15:20:23 2000
@@ -99,6 +99,24 @@ char leaf_reg_remap[] =
   88, 89, 90, 91, 92, 93, 94, 95,
   96, 97, 98, 99, 100};
 
+/* Vector, indexed by hard register number, which contains 1
+   for a register that is allowable in a candidate for leaf
+   function treatment.  */
+char sparc_leaf_regs[] =
+{ 1, 1, 1, 1, 1, 1, 1, 1,
+  0, 0, 0, 0, 0, 0, 1, 0,
+  0, 0, 0, 0, 0, 0, 0, 0,
+  1, 1, 1, 1, 1, 1, 0, 1,
+  1, 1, 1, 1, 1, 1, 1, 1,
+  1, 1, 1, 1, 1, 1, 1, 1,
+  1, 1, 1, 1, 1, 1, 1, 1,
+  1, 1, 1, 1, 1, 1, 1, 1,
+  1, 1, 1, 1, 1, 1, 1, 1,
+  1, 1, 1, 1, 1, 1, 1, 1,
+  1, 1, 1, 1, 1, 1, 1, 1,
+  1, 1, 1, 1, 1, 1, 1, 1,
+  1, 1, 1, 1, 1};
+
 #endif
 
 /* Name of where we pretend to think the frame pointer points.
@@ -2458,6 +2476,98 @@ eligible_for_epilogue_delay (trial, slot
   return 0;
 }
 
+/* Return nonzero if TRIAL can go into the sibling call
+   delay slot.  */
+
+int
+eligible_for_sibcall_delay (trial)
+     rtx trial;
+{
+  rtx pat, src;
+
+  if (GET_CODE (trial) != INSN || GET_CODE (PATTERN (trial)) != SET)
+    return 0;
+
+  if (get_attr_length (trial) != 1 || profile_block_flag == 2)
+    return 0;
+
+  pat = PATTERN (trial);
+
+  if (current_function_uses_only_leaf_regs)
+    {
+      /* If the tail call is done using the call instruction,
+	 we have to restore %o7 in the delay slot.  */
+      if (TARGET_ARCH64 && ! TARGET_CM_MEDLOW)
+	return 0;
+
+      /* %g1 is used to build the function address */
+      if (reg_mentioned_p (gen_rtx_REG (Pmode, 1), pat))
+	return 0;
+
+      return 1;
+    }
+
+  /* Otherwise, only operations which can be done in tandem with
+     a `restore' insn can go into the delay slot.  */
+  if (GET_CODE (SET_DEST (pat)) != REG
+      || REGNO (SET_DEST (pat)) < 24
+      || REGNO (SET_DEST (pat)) >= 32)
+    return 0;
+
+  /* If it mentions %o7, it can't go in, because sibcall will clobber it
+     in most cases.  */
+  if (reg_mentioned_p (gen_rtx_REG (Pmode, 15), pat))
+    return 0;
+
+  src = SET_SRC (pat);
+
+  if (arith_operand (src, GET_MODE (src)))
+    {
+      if (TARGET_ARCH64)
+        return GET_MODE_SIZE (GET_MODE (src)) <= GET_MODE_SIZE (DImode);
+      else
+        return GET_MODE_SIZE (GET_MODE (src)) <= GET_MODE_SIZE (SImode);
+    }
+
+  else if (arith_double_operand (src, GET_MODE (src)))
+    return GET_MODE_SIZE (GET_MODE (src)) <= GET_MODE_SIZE (DImode);
+
+  else if (! TARGET_FPU && restore_operand (SET_DEST (pat), SFmode)
+	   && register_operand (src, SFmode))
+    return 1;
+
+  else if (GET_CODE (src) == PLUS
+	   && arith_operand (XEXP (src, 0), SImode)
+	   && arith_operand (XEXP (src, 1), SImode)
+	   && (register_operand (XEXP (src, 0), SImode)
+	       || register_operand (XEXP (src, 1), SImode)))
+    return 1;
+
+  else if (GET_CODE (src) == PLUS
+	   && arith_double_operand (XEXP (src, 0), DImode)
+	   && arith_double_operand (XEXP (src, 1), DImode)
+	   && (register_operand (XEXP (src, 0), DImode)
+	       || register_operand (XEXP (src, 1), DImode)))
+    return 1;
+
+  else if (GET_CODE (src) == LO_SUM
+	   && ! TARGET_CM_MEDMID
+	   && ((register_operand (XEXP (src, 0), SImode)
+	        && immediate_operand (XEXP (src, 1), SImode))
+	       || (TARGET_ARCH64
+		   && register_operand (XEXP (src, 0), DImode)
+		   && immediate_operand (XEXP (src, 1), DImode))))
+    return 1;
+
+  else if (GET_CODE (src) == ASHIFT
+	   && (register_operand (XEXP (src, 0), SImode)
+	       || register_operand (XEXP (src, 0), DImode))
+	   && XEXP (src, 1) == const1_rtx)
+    return 1;
+
+  return 0;
+}
+
 static int
 check_return_regs (x)
      rtx x;
@@ -3421,6 +3531,40 @@ output_function_prologue (file, size, le
     }
 }
 
+/* Output code to restore any call saved registers.  */
+
+static void
+output_restore_regs (file, leaf_function)
+     FILE *file;
+     int leaf_function;
+{
+  int offset, n_regs;
+  const char *base;
+
+  offset = -apparent_fsize + frame_base_offset;
+  if (offset < -4096 || offset + num_gfregs * 4 > 4096 - 8 /*double*/)
+    {
+      build_big_number (file, offset, "%g1");
+      fprintf (file, "\tadd\t%s, %%g1, %%g1\n", frame_base_name);
+      base = "%g1";
+      offset = 0;
+    }
+  else
+    {
+      base = frame_base_name;
+    }
+
+  n_regs = 0;
+  if (TARGET_EPILOGUE && ! leaf_function)
+    /* ??? Originally saved regs 0-15 here.  */
+    n_regs = restore_regs (file, 0, 8, base, offset, 0);
+  else if (leaf_function)
+    /* ??? Originally saved regs 0-31 here.  */
+    n_regs = restore_regs (file, 0, 8, base, offset, 0);
+  if (TARGET_EPILOGUE)
+    restore_regs (file, 32, TARGET_V9 ? 96 : 64, base, offset, n_regs);
+}
+
 /* Output code for the function epilogue.  */
 
 void
@@ -3455,35 +3599,8 @@ output_function_epilogue (file, size, le
       goto output_vectors;                                                    
     }
 
-  /* Restore any call saved registers.  */
   if (num_gfregs)
-    {
-      int offset, n_regs;
-      const char *base;
-
-      offset = -apparent_fsize + frame_base_offset;
-      if (offset < -4096 || offset + num_gfregs * 4 > 4096 - 8 /*double*/)
-	{
-	  build_big_number (file, offset, "%g1");
-	  fprintf (file, "\tadd\t%s, %%g1, %%g1\n", frame_base_name);
-	  base = "%g1";
-	  offset = 0;
-	}
-      else
-	{
-	  base = frame_base_name;
-	}
-
-      n_regs = 0;
-      if (TARGET_EPILOGUE && ! leaf_function)
-	/* ??? Originally saved regs 0-15 here.  */
-	n_regs = restore_regs (file, 0, 8, base, offset, 0);
-      else if (leaf_function)
-	/* ??? Originally saved regs 0-31 here.  */
-	n_regs = restore_regs (file, 0, 8, base, offset, 0);
-      if (TARGET_EPILOGUE)
-	restore_regs (file, 32, TARGET_V9 ? 96 : 64, base, offset, n_regs);
-    }
+    output_restore_regs (file, leaf_function);
 
   /* Work out how to skip the caller's unimp instruction if required.  */
   if (leaf_function)
@@ -3573,6 +3690,105 @@ output_function_epilogue (file, size, le
  output_vectors:
   sparc_output_deferred_case_vectors ();
 }
+
+/* Output a sibling call.  */
+
+const char *
+output_sibcall (insn, call_operand)
+     rtx insn, call_operand;
+{
+  int leaf_regs = current_function_uses_only_leaf_regs;
+  rtx operands[3];
+
+  if (num_gfregs)
+    {
+      /* Call to restore global regs might clobber
+	 the delay slot. Instead of checking for this
+	 output the delay slot now.  */
+      if (dbr_sequence_length () > 0)
+	{
+	  rtx delay = NEXT_INSN (insn);
+
+	  if (! delay)
+	    abort ();
+
+	  final_scan_insn (delay, asm_out_file, 1, 0, 1);
+	  PATTERN (delay) = gen_blockage ();
+	  INSN_CODE (delay) = -1;
+	}
+      output_restore_regs (asm_out_file, leaf_regs);
+    }
+
+  operands[0] = call_operand;
+
+  if (leaf_regs)
+    {
+      if (symbolic_operand (operands[0], Pmode))
+	{
+	  if (TARGET_ARCH32 || TARGET_CM_MEDLOW)
+	    {
+	      output_asm_insn ("sethi\t%%hi(%a0), %%g1", operands);
+	      output_asm_insn ("jmpl\t%%g1 + %%lo(%a0), %%g0", operands);
+	    }
+	  else
+	    {
+	      output_asm_insn ("mov\t%%o7, %%g1", operands);
+	      output_asm_insn ("call\t%a0, 0", operands);
+	      output_asm_insn (" mov\t%%g1, %%o7", operands);
+	      if (dbr_sequence_length () > 0)
+		abort ();
+	      return "";
+	    }
+	}
+      else
+	output_asm_insn ("jmpl\t%a0, %%g0", operands);
+      if (num_gfregs || dbr_sequence_length () == 0)
+	output_asm_insn (" nop", operands);
+      return "";
+    }
+
+  output_asm_insn ("call\t%a0, 0", operands);
+  if (!num_gfregs && dbr_sequence_length () > 0)
+    {
+      rtx delay = NEXT_INSN (insn), pat;
+
+      if (! delay)
+	abort ();
+
+      pat = PATTERN (delay);
+      if (GET_CODE (pat) != SET)
+	abort ();
+
+      operands[0] = SET_DEST (pat);
+      pat = SET_SRC (pat);
+      switch (GET_CODE (pat))
+	{
+	case PLUS:
+	  operands[1] = XEXP (pat, 0);
+	  operands[2] = XEXP (pat, 1);
+	  output_asm_insn (" restore %r1, %2, %Y0", operands);
+	  break;
+	case LO_SUM:
+	  operands[1] = XEXP (pat, 0);
+	  operands[2] = XEXP (pat, 1);
+	  output_asm_insn (" restore %r1, %%lo(%a2), %Y0", operands);
+	  break;
+	case ASHIFT:
+	  operands[1] = XEXP (pat, 0);
+	  output_asm_insn (" restore %r1, %r1, %Y0", operands);
+	  break;
+	default:
+	  operands[1] = pat;
+	  output_asm_insn (" restore %%g0, %1, %Y0", operands);
+	  break;
+	}
+      PATTERN (delay) = gen_blockage ();
+      INSN_CODE (delay) = -1;
+    }
+  else
+    output_asm_insn (" restore", operands);
+  return "";
+}
 
 /* Functions for handling argument passing.
 
@@ -7012,6 +7228,7 @@ ultra_code_from_mask (type_mask)
     return IEU0;
   else if (type_mask & (TMASK (TYPE_COMPARE) |
 			TMASK (TYPE_CALL) |
+			TMASK (TYPE_SIBCALL) |
 			TMASK (TYPE_UNCOND_BRANCH)))
     return IEU1;
   else if (type_mask & (TMASK (TYPE_IALU) | TMASK (TYPE_BINARY) |
@@ -7484,6 +7701,7 @@ ultrasparc_sched_reorder (dump, sched_ve
 	/* If we are not in the process of emptying out the pipe, try to
 	   obtain an instruction which must be the first in it's group.  */
 	ip = ultra_find_type ((TMASK (TYPE_CALL) |
+			       TMASK (TYPE_SIBCALL) |
 			       TMASK (TYPE_CALL_NO_DELAY_SLOT) |
 			       TMASK (TYPE_UNCOND_BRANCH)),
 			      ready, this_insn);
--- gcc/config/sparc/sparc-protos.h.jj	Thu Feb 17 16:31:05 2000
+++ gcc/config/sparc/sparc-protos.h	Tue Mar 21 12:27:54 2000
@@ -96,6 +96,7 @@ extern int sparc_splitdi_legitimate PARA
 extern int sparc_absnegfloat_split_legitimate PARAMS ((rtx, rtx));
 extern char *output_cbranch PARAMS ((rtx, int, int, int, int, rtx));
 extern const char *output_return PARAMS ((rtx *));
+extern const char *output_sibcall PARAMS ((rtx, rtx));
 extern char *output_v9branch PARAMS ((rtx, int, int, int, int, int, rtx));
 extern void emit_v9_brxx_insn PARAMS ((enum rtx_code, rtx, rtx));
 extern void output_double_int PARAMS ((FILE *, rtx));
@@ -121,6 +122,7 @@ extern int cc_arithopn PARAMS ((rtx, enu
 extern int data_segment_operand PARAMS ((rtx, enum machine_mode));
 extern int eligible_for_epilogue_delay PARAMS ((rtx, int));
 extern int eligible_for_return_delay PARAMS ((rtx));
+extern int eligible_for_sibcall_delay PARAMS ((rtx));
 extern int emit_move_sequence PARAMS ((rtx, enum machine_mode));
 extern int extend_op PARAMS ((rtx, enum machine_mode));
 extern int fcc_reg_operand PARAMS ((rtx, enum machine_mode));
--- gcc/tm.texi.jj	Sun Mar 19 20:31:07 2000
+++ gcc/tm.texi	Tue Mar 21 10:25:41 2000
@@ -1652,7 +1652,7 @@ accomplish this.
 @table @code
 @findex LEAF_REGISTERS
 @item LEAF_REGISTERS
-A C initializer for a vector, indexed by hard register number, which
+Name of a char vector, indexed by hard register number, which
 contains 1 for a register that is allowable in a candidate for leaf
 function treatment.
 
--- gcc/sibcall.c.jj	Sun Mar 19 06:26:47 2000
+++ gcc/sibcall.c	Tue Mar 21 13:17:41 2000
@@ -140,9 +140,13 @@ skip_copy_to_return_value (orig_insn, ha
      called function's return value was copied.  Otherwise we're returning
      some other value.  */
 
+#ifndef OUTGOING_REGNO
+#define OUTGOING_REGNO(N) (N)
+#endif
+
   if (SET_DEST (set) == current_function_return_rtx
       && REG_P (SET_DEST (set))
-      && REGNO (SET_DEST (set)) == REGNO (hardret)
+      && OUTGOING_REGNO (REGNO (SET_DEST (set))) == REGNO (hardret)
       && SET_SRC (set) == softret)
     return insn;
 
@@ -353,6 +357,117 @@ replace_call_placeholder (insn, use)
   NOTE_LINE_NUMBER (insn) = NOTE_INSN_DELETED;
 }
 
+#ifdef INCOMING_REGNO
+
+/* If the machine has register windows, remap sibling call arguments from
+   normally used "output" registers to "input" registers where they wind
+   up when register window is changed before jumping into the sibling
+   function.  */
+void
+sibcall_remap_arguments (insn)
+     rtx insn;
+{
+  rtx link, x, * hardretp = NULL, hardret;
+  int regno, remap_regno, remap_nregs = 0;
+  HARD_REG_SET remap_reg_set;
+  basic_block call_block;
+  
+  call_block = BLOCK_FOR_INSN (insn);
+  CLEAR_HARD_REG_SET (remap_reg_set);
+
+  /* First find the CALL_INSN */
+  for (;
+       insn && (GET_CODE (insn) != CALL_INSN || ! SIBLING_CALL_P (insn));
+       insn = PREV_INSN (insn))
+    if (insn == call_block->head)
+      break;
+
+  if (! insn || GET_CODE (insn) != CALL_INSN)
+    abort ();
+
+  /* Walk function usage list and note which arguments need to be remapped.
+     Remap them in the usage list.  */
+  for (link = CALL_INSN_FUNCTION_USAGE (insn); link; link = XEXP (link, 1))
+    if (GET_CODE (XEXP (link, 0)) == USE)
+      {
+	x = XEXP (XEXP (link, 0), 0);
+	if (GET_CODE (x) != REG)
+	  continue;
+	if (REGNO (x) >= FIRST_PSEUDO_REGISTER)
+	  continue;
+	regno = REGNO (x);
+	remap_regno = INCOMING_REGNO (regno);
+	if (remap_regno != regno)
+	  {
+	    ++remap_nregs;
+	    SET_HARD_REG_BIT (remap_reg_set, regno);
+	    XEXP (XEXP (link, 0), 0) = gen_rtx_REG (GET_MODE (x), remap_regno);
+	  }
+      }
+
+  /* Find the setters of the hard register arguments and remap them.  */
+  for (link = PREV_INSN (insn); link && remap_nregs; link = PREV_INSN (link))
+    {
+      x = single_set (link);
+      if (x && GET_CODE (SET_DEST (x)) == REG
+	  && TEST_HARD_REG_BIT (remap_reg_set, REGNO (SET_DEST (x))))
+	{
+	  regno = REGNO (SET_DEST (x));
+	  remap_regno = INCOMING_REGNO (regno);
+	  if (remap_regno == regno)
+	    abort ();
+
+	  SET_DEST (x) =
+	    gen_rtx_REG (GET_MODE (SET_DEST (x)), remap_regno);
+	  CLEAR_HARD_REG_BIT (remap_reg_set, regno);
+	  --remap_nregs;
+	}
+      if (link == call_block->head)
+	break;
+    }
+
+  /* We did not find all of them, something's wrong.  */
+  if (remap_nregs)
+    abort ();
+
+  /* Now look if we have to remap the return value register as well.  */
+  if (GET_CODE (PATTERN (insn)) == SET
+      && GET_CODE (SET_SRC (PATTERN (insn))) == CALL)
+    hardretp = &SET_DEST (PATTERN (insn));
+  else if (GET_CODE (PATTERN (insn)) == PARALLEL
+	   && GET_CODE (XVECEXP (PATTERN (insn), 0, 0)) == SET
+	   && GET_CODE (SET_SRC (XVECEXP (PATTERN (insn), 0, 0))) == CALL)
+    hardretp = &SET_DEST (XVECEXP (PATTERN (insn), 0, 0));
+
+  if (! hardretp)
+    return;
+
+  hardret = *hardretp;
+
+  if (GET_CODE (hardret) != REG || REGNO (hardret) >= FIRST_PSEUDO_REGISTER)
+    abort ();
+
+  regno = REGNO (hardret);
+  remap_regno = INCOMING_REGNO (regno);
+  if (remap_regno == regno)
+    /* We don't have to remap anything.  */
+    return;
+
+  *hardretp = gen_rtx_REG (GET_MODE (hardret), remap_regno);
+
+  link = NEXT_INSN (insn);
+  if (! link || GET_CODE (link) == BARRIER)
+    return;
+
+  x = single_set (link);
+  if (! x || SET_SRC (x) != hardret)
+    return;
+
+  SET_SRC (x) = *hardretp;
+}
+
+#endif
+
 
 /* Given a (possibly empty) set of potential sibling or tail recursion call
    sites, determine if optimization is possible.
@@ -551,6 +666,14 @@ success:
 				      : sibcall != 0
 					 ? sibcall_use_sibcall
 					 : sibcall_use_normal);
+#ifdef INCOMING_REGNO
+	  /* If the machine has register windows, the sibling call
+	     takes the input registers (as opposed to output registers
+	     for normal calls) as arguments.  */
+	  if (sibcall && ! tailrecursion)
+	    sibcall_remap_arguments (insn);
+#endif
+
 	}
     }
 
--- gcc/final.c.jj	Sun Mar 19 20:31:03 2000
+++ gcc/final.c	Tue Mar 21 10:50:59 2000
@@ -4015,7 +4015,8 @@ leaf_function_p ()
 
   for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
     {
-      if (GET_CODE (insn) == CALL_INSN)
+      if (GET_CODE (insn) == CALL_INSN
+	  && ! SIBLING_CALL_P (insn))
 	return 0;
       if (GET_CODE (insn) == INSN
 	  && GET_CODE (PATTERN (insn)) == SEQUENCE
@@ -4025,7 +4026,8 @@ leaf_function_p ()
     }
   for (insn = current_function_epilogue_delay_list; insn; insn = XEXP (insn, 1))
     {
-      if (GET_CODE (XEXP (insn, 0)) == CALL_INSN)
+      if (GET_CODE (XEXP (insn, 0)) == CALL_INSN
+	  && ! SIBLING_CALL_P (XEXP (insn, 0)))
 	return 0;
       if (GET_CODE (XEXP (insn, 0)) == INSN
 	  && GET_CODE (PATTERN (XEXP (insn, 0))) == SEQUENCE
@@ -4048,8 +4050,6 @@ leaf_function_p ()
 
 #ifdef LEAF_REGISTERS
 
-static char permitted_reg_in_leaf_functions[] = LEAF_REGISTERS;
-
 /* Return 1 if this function uses only the registers that can be
    safely renumbered.  */
 
@@ -4057,6 +4057,7 @@ int
 only_leaf_regs_used ()
 {
   int i;
+  char *permitted_reg_in_leaf_functions = LEAF_REGISTERS;
 
   for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
     if ((regs_ever_live[i] || global_regs[i])
--- gcc/global.c.jj	Mon Mar  6 18:37:42 2000
+++ gcc/global.c	Tue Mar 21 10:25:42 2000
@@ -374,7 +374,7 @@ global_alloc (file)
      a leaf function.  */
   {
     char *cheap_regs;
-    static char leaf_regs[] = LEAF_REGISTERS;
+    char *leaf_regs = LEAF_REGISTERS;
 
     if (only_leaf_regs_used () && leaf_function_p ())
       cheap_regs = leaf_regs;
--- gcc/jump.c.jj	Sun Mar 19 20:31:04 2000
+++ gcc/jump.c	Tue Mar 21 10:25:42 2000
@@ -211,7 +211,7 @@ jump_optimize_1 (f, cross_jump, noop_mov
   int old_max_reg;
   int first = 1;
   int max_uid = 0;
-  rtx last_insn;
+  rtx last_insn = NULL_RTX;
 
   cross_jump_death_matters = (cross_jump == 2);
   max_uid = init_label_info (f) + 1;
@@ -252,9 +252,14 @@ jump_optimize_1 (f, cross_jump, noop_mov
     goto end;
 
   if (! minimal)
-    exception_optimize ();
+    {
+      exception_optimize ();
 
-  last_insn = delete_unreferenced_labels (f);
+      /* We cannot delete unreferenced labels for minimal, because
+	 they might be referenced within CALL_PLACEHOLDER tail call
+	 sites.  */
+      last_insn = delete_unreferenced_labels (f);
+    }
 
   if (noop_moves)
     delete_noop_moves (f);


Cheers,
    Jakub
___________________________________________________________________
Jakub Jelinek | jakub@redhat.com | http://sunsite.mff.cuni.cz/~jj
Linux version 2.3.99-pre2 on a sparc64 machine (1343.49 BogoMips)
___________________________________________________________________
