author | David M Peixotto <dmp@rice.edu> | 2011-10-19 15:49:06 -0500 |
---|---|---|
committer | David Terei <davidterei@gmail.com> | 2011-11-01 03:18:40 -0700 |
commit | a9ce36118f0de3aeb427792f8f2c5ae097c94d3f (patch) | |
tree | d03c1697a04df842b21bafa214f22140473a2e0d | |
parent | f0ae3f31277ebfe2384fca3f89867f340ae9b492 (diff) | |
download | haskell-a9ce36118f0de3aeb427792f8f2c5ae097c94d3f.tar.gz |
Change stack alignment to 16+8 bytes in STG code
This patch changes the STG code so that %rsp is aligned
to a 16-byte boundary + 8. This is the alignment required by
the x86_64 ABI on entry to a function. Previously we kept
%rsp aligned to a 16-byte boundary, but this was causing
problems for the LLVM backend (see #4211).
We no longer need to invoke the LLVM stack mangler on
x86_64 targets. Since the stack is now 16+8 byte aligned in
STG land on x86_64, we don't need to mangle the stack
manipulations with the LLVM mangler.
This patch only modifies the alignment for x86_64 backends.
Signed-off-by: David Terei <davidterei@gmail.com>
-rw-r--r-- | compiler/llvmGen/LlvmMangler.hs | 6 |
-rw-r--r-- | compiler/nativeGen/X86/CodeGen.hs | 16 |
-rw-r--r-- | rts/StgCRun.c | 46 |
3 files changed, 39 insertions, 29 deletions
diff --git a/compiler/llvmGen/LlvmMangler.hs b/compiler/llvmGen/LlvmMangler.hs
index 68e92cf651..981bbf2858 100644
--- a/compiler/llvmGen/LlvmMangler.hs
+++ b/compiler/llvmGen/LlvmMangler.hs
@@ -143,11 +143,13 @@ fixTables ss = fixed
     have been pushed, so sub 4). GHC though since it always uses jumps keeps
     the stack 16 byte aligned on both function calls and function entry.

-    We correct the alignment here.
+    We correct the alignment here for Mac OS X i386. The x86_64 target already
+    has the correct alignment since we keep the stack 16+8 aligned throughout
+    STG land for 64-bit targets.
 -}
 fixupStack :: B.ByteString -> B.ByteString -> B.ByteString

-#if !darwin_TARGET_OS
+#if !darwin_TARGET_OS || x86_64_TARGET_ARCH
 fixupStack = const

 #else
diff --git a/compiler/nativeGen/X86/CodeGen.hs b/compiler/nativeGen/X86/CodeGen.hs
index 1efa327002..458f379380 100644
--- a/compiler/nativeGen/X86/CodeGen.hs
+++ b/compiler/nativeGen/X86/CodeGen.hs
@@ -1842,15 +1842,17 @@ genCCall64 target dest_regs args =
         tot_arg_size = arg_size * length stack_args

         -- On entry to the called function, %rsp should be aligned
-        -- on a 16-byte boundary +8 (i.e. the first stack arg after
-        -- the return address is 16-byte aligned). In STG land
-        -- %rsp is kept 16-byte aligned (see StgCRun.c), so we just
-        -- need to make sure we push a multiple of 16-bytes of args,
-        -- plus the return address, to get the correct alignment.
+        -- on a 16-byte boundary +8 (i.e. the first stack arg
+        -- above the return address is 16-byte aligned). In STG
+        -- land %rsp is kept 8-byte aligned (see StgCRun.c), so we
+        -- just need to make sure we pad by eight bytes after
+        -- pushing a multiple of 16-bytes of args to get the
+        -- correct alignment. If we push an odd number of eight byte
+        -- arguments then no padding is needed.
         -- Urg, this is hard. We need to feed the delta back into
         -- the arg pushing code.
         (real_size, adjust_rsp) <-
-            if tot_arg_size `rem` 16 == 0
+            if (tot_arg_size + 8) `rem` 16 == 0
                 then return (tot_arg_size, nilOL)
                 else do -- we need to adjust...
                     delta <- getDeltaNat
@@ -1865,7 +1867,7 @@ genCCall64 target dest_regs args =
         delta <- getDeltaNat

         -- deal with static vs dynamic call targets
-        (callinsns,cconv) <-
+        (callinsns,_cconv) <-
           case target of
             CmmCallee (CmmLit (CmmLabel lbl)) conv
                -> -- ToDo: stdcall arg sizes
diff --git a/rts/StgCRun.c b/rts/StgCRun.c
index 7251e64253..11e0543475 100644
--- a/rts/StgCRun.c
+++ b/rts/StgCRun.c
@@ -267,29 +267,36 @@ StgRunIsImplementedInAssembler(void)
         "addq %0, %%rsp\n\t"
         "retq"

-        : : "i"(RESERVED_C_STACK_BYTES+48+8 /*stack frame size*/));
+        : : "i"(RESERVED_C_STACK_BYTES+48 /*stack frame size*/));
     /*
-       HACK alert!
-
-       The x86_64 ABI specifies that on a procedure call, %rsp is
+       The x86_64 ABI specifies that on entry to a procedure, %rsp is
        aligned on a 16-byte boundary + 8. That is, the first
        argument on the stack after the return address will be
-       16-byte aligned.
-
-       Which should be fine: RESERVED_C_STACK_BYTES+48 is a multiple
-       of 16 bytes.
+       16-byte aligned.
+
+       We maintain the 16+8 stack alignment throughout the STG code.
+
+       When we call STG_RUN the stack will be aligned to 16+8. We used
+       to subtract an extra 8 bytes so that %rsp would be 16 byte
+       aligned at all times in STG land. This worked fine for the
+       native code generator which knew that the stack was already
+       aligned on 16 bytes when it generated calls to C functions.
+
+       This arrangemnt caused problems for the LLVM backend. The LLVM
+       code generator would assume that on entry to each function the
+       stack is aligned to 16+8 as required by the ABI. However, since
+       we only enter STG functions by jumping to them with tail calls,
+       the stack was actually aligned to a 16-byte boundary. The LLVM
+       backend had its own mangler that would post-process the
+       assembly code to fixup the stack manipulation code to mainain
+       the correct alignment (see #4211).
+
+       Therefore, we now now keep the stack aligned to 16+8 while in
+       STG land so that LLVM generates correct code without any
+       mangling. The native code generator can handle this alignment
+       just fine by making sure the stack is aligned to a 16-byte
+       boundary before it makes a C-call.

-       BUT... when we do a C-call from STG land, gcc likes to put the
-       stack alignment adjustment in the prolog. eg. if we're calling
-       a function with arguments in regs, gcc will insert 'subq $8,%rsp'
-       in the prolog, to keep %rsp aligned (the return address is 8
-       bytes, remember). The mangler throws away the prolog, so we
-       lose the stack alignment.
-
-       The hack is to add this extra 8 bytes to our %rsp adjustment
-       here, so that throughout STG code, %rsp is 16-byte aligned,
-       ready for a C-call.
-
        A quick way to see if this is wrong is to compile this code:

           main = System.Exit.exitWith ExitSuccess
@@ -300,7 +307,6 @@ StgRunIsImplementedInAssembler(void)
        stack isn't aligned, and calling exitWith from Haskell invokes
        shutdownHaskellAndExit using a C call.

-       Future gcc releases will almost certainly break this hack...
     */
 }
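
Note on the padding rule in genCCall64: the new test `(tot_arg_size + 8) `rem` 16 == 0` can be checked with a small model of %rsp. The sketch below is illustrative only and is not part of the patch. It assumes that every stack argument takes one 8-byte slot and that the adjustment in the `else` branch is a single 8-byte pad; the helper names and the starting address are made up for the example.

```c
#include <assert.h>

/* Assumption: every stack-passed argument occupies one 8-byte slot. */
enum { ARG_SIZE = 8 };

/* Hypothetical helper mirroring the patched test
   (tot_arg_size + 8) `rem` 16 == 0: returns the padding, in bytes,
   to subtract from %rsp before pushing the stack arguments. */
static unsigned call_padding(unsigned n_stack_args)
{
    unsigned tot_arg_size = ARG_SIZE * n_stack_args;
    return ((tot_arg_size + 8) % 16 == 0) ? 0u : 8u;
}

/* Hypothetical helper: %rsp modulo 16 at the call instruction, after
   padding and argument pushes but before the return address is pushed. */
static unsigned rsp_mod16_at_call(unsigned n_stack_args)
{
    unsigned long rsp = 0x7fff0008UL;               /* made-up address of the form 16n + 8 */
    rsp -= call_padding(n_stack_args);              /* optional 8-byte pad */
    rsp -= (unsigned long)ARG_SIZE * n_stack_args;  /* pushed arguments */
    return (unsigned)(rsp % 16);
}

/* Checks that %rsp is 16-byte aligned at the call for 0..max_stack_args. */
static void check_alignment(unsigned max_stack_args)
{
    for (unsigned n = 0; n <= max_stack_args; n++)
        assert(rsp_mod16_at_call(n) == 0);
}
```

With %rsp at 16n + 8 in STG land, an odd number of 8-byte arguments already brings %rsp to a 16-byte boundary at the call, so no pad is needed; an even number (including zero) needs the 8-byte pad. The call instruction then pushes the return address, which gives the callee the 16+8 alignment the ABI expects on entry.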