| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
| |
spillReg, loadReg (x86): always spill fp registers using double-sized
loads/stores, even if they nominally contain only a float value --
otherwise the spill loses the extra guard bits.
|
|
|
|
|
|
| |
spillReg, loadReg (x86): spill above %esp, not below it. Duh. If you
spill below %esp, ccalls, which use stack below %esp, can trash the
spill area.
|
|
|
|
| |
Teach magicIdRegMaybe about R9 and R10.
|
|
|
|
| |
Print a useful panic msg if getRegister(x86) can't reduce a tree.
|
|
|
|
| |
pprInstr: implement GABS, GNEG, GSQRT.
|
|
|
|
| |
Handle float args correctly for x86 ccalls.
|
|
|
|
|
| |
Disable a dubious looking clause for trivialCode (x86), which was
generating bad code for some subtracts.
|
|
|
|
|
| |
Implement the HP_CHK_GEN macro. As a result, teach mkNativeHdr et al
about R9 and R10.
|
|
|
|
| |
wibble
|
|
|
|
|
|
| |
amodeToStix, GET_TAG: implement correctly for little-endian-32 and
supply implementation for big-endian-32. Definitely won't work on
64-bit platforms.
|
|
|
|
| |
genCodeInfoTable: put tag value into srt_len field for constr info tables.
|
|
|
|
|
| |
x86: free up all FP regs before doing a ccall. This appears to be a
part of the x86 calling convention(s).
|
|
|
|
| |
Add missing final paragraph of explanation about x86 FP trickery.
|
|
|
|
| |
Minor improvements to x86 FP fake-to-real insn translation.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
genCCall for x86, as supplied, used PUSH et al to move args onto the C
stack ready for the call. Reasonable as this seems, it causes a
problem with spill code, since the spiller spills relative to %esp and
assumes that %esp doesn't move. If the args of a ccall involved any
spilled values, the resulting code would be wrong.
The One True Way is to do it like a RISC: move args to the stack
without adjusting %esp for each argument, then adjust it all at once
immediately prior to the call insn and un-adjust it immediately
afterwards. genCCall now does this. In general, push/pop and other
C-stack-affecting operations should not be generated, for the same
reason.
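The "like a RISC" scheme above can be sketched as follows. This is only an illustration and not the actual genCCall code: the names `ArgSize`, `argBytes` and `argOffsets` are hypothetical, and it assumes the usual C layout in which the first argument ends up at the lowest address. Every argument is stored at a fixed offset from the unadjusted %esp, and %esp is moved once by the total size just before the call.

```haskell
-- Hypothetical argument kinds: 4-byte words and 8-byte doubles.
data ArgSize = WordArg | DoubleArg
  deriving (Eq, Show)

argBytes :: ArgSize -> Int
argBytes WordArg   = 4
argBytes DoubleArg = 8

-- For a list of arguments (first argument first), return the total
-- number of bytes to subtract from %esp just before the call, and the
-- offset of each argument relative to the *unadjusted* %esp.
-- No per-argument push is needed, so %esp never moves while the
-- arguments are being computed and stored.
argOffsets :: [ArgSize] -> (Int, [Int])
argOffsets args = (total, [ start - total | start <- starts ])
  where
    sizes  = map argBytes args
    total  = sum sizes
    starts = init (scanl (+) 0 sizes)   -- running start offset of each arg
```

For example, `argOffsets [WordArg, DoubleArg, WordArg]` gives a total of 16 bytes with the arguments stored at offsets -16, -12 and -4; the call is then bracketed by a single subtract of 16 from %esp and a single add of 16 afterwards.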
|
|
|
|
| |
Add -optCrts-M80m for older compilers. Sigh.
|
|
|
|
| |
Start a NOTES file, recording known but un-fixed nativeGen bugs.
|
|
|
|
| |
Fix syntax errors in #ifdef'd Alpha/Sparc bits.
|
|
|
|
| |
Insert large commit message re x86 FP rehash as a comment.
|
|
|
|
|
|
|
| |
ARR_HDR_SIZE --> ARR_WORDS_HDR_SIZE, and derived quantities in
Constants.h, Constants.lhs et al are similarly renamed.
Add new constant ARR_PTRS_HDR_SIZE, with corresponding derivatives.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Major reworking of the x86 floating point code generation.
Intel, in their infinite wisdom, selected a stack model for floating
point registers on x86. That might have made sense back in 1979 --
nowadays we can see it for the nonsense it really is. A stack model
fits poorly with the existing nativeGen infrastructure, which assumes
flat integer and FP register sets. Prior to this commit, nativeGen
could not generate correct x86 FP code -- to do so would have meant
somehow working the register-stack paradigm into the register
allocator and spiller, which sounds very difficult.
We have decided to cheat, and go for a simple fix which requires no
infrastructure modifications, at the expense of generating ropey but
correct FP code. All notions of the x86 FP stack and its insns have
been removed. Instead, we pretend (to the instruction selector and
register allocator) that x86 has six floating point registers, %fake0
.. %fake5, which can be used in the usual flat manner. We further
claim that x86 has floating point instructions very similar to SPARC
and Alpha, that is, a simple 3-operand register-register arrangement.
Code generation and register allocation proceed on this basis.
When we come to print out the final assembly, our convenient fiction
is converted to dismal reality. Each fake instruction is
independently converted to a series of real x86 instructions.
%fake0 .. %fake5 are mapped to %st(0) .. %st(5). To do reg-reg
arithmetic operations, the two operands are pushed onto the top of the
FP stack, the operation done, and the result copied back into the
relevant register. There are only six %fake registers because 2 are
needed for the translation, and x86 has 8 in total.
The translation is inefficient but is simple and it works. A cleverer
translation would handle a sequence of insns, simulating the FP stack
contents, would not impose a fixed mapping from %fake to %st regs, and
hopefully could avoid most of the redundant reg-reg moves of the
current translation.
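A minimal sketch of the fake-to-real translation described above, using an abstract model of the x87 stack rather than real assembly. All names here (`FakeInsn`, `RealInsn`, `translate`, `step` and so on) are hypothetical and are not taken from the nativeGen sources; the three abstract stack instructions only stand in for the real load, operate-and-pop and store-and-pop instructions. The point is the index bookkeeping: each push moves every %fake register one slot further down the stack.

```haskell
data FpOp = Add | Sub | Mul | Div
  deriving (Eq, Show)

-- Fake flat 3-operand instruction: dst := src1 `op` src2.
data FakeInsn = FakeBin FpOp Int Int Int   -- op, src1, src2, dst
  deriving (Eq, Show)

-- Abstract stand-ins for x87 stack instructions.
data RealInsn
  = Fld Int      -- push a copy of st(i)
  | FopP FpOp    -- st(1) := st(1) `op` st(0), then pop
  | Fstp Int     -- st(i) := st(0), then pop
  deriving (Eq, Show)

-- %fake i lives in st(i) when nothing extra is pushed; after one push
-- it is at st(i+1). Two temporaries are needed, hence 8 - 2 = 6 fakes.
translate :: FakeInsn -> [RealInsn]
translate (FakeBin op s1 s2 d) =
  [ Fld s1         -- push copy of %fake s1
  , Fld (s2 + 1)   -- push copy of %fake s2 (shifted down by one push)
  , FopP op        -- result in st(0); fakes are now at st(i+1)
  , Fstp (d + 1)   -- copy result into %fake d and pop
  ]

applyOp :: FpOp -> Double -> Double -> Double
applyOp Add = (+)
applyOp Sub = (-)
applyOp Mul = (*)
applyOp Div = (/)

-- Stack model: head of the list is st(0).
step :: [Double] -> RealInsn -> [Double]
step st         (Fld i)   = (st !! i) : st
step (x:y:rest) (FopP op) = applyOp op y x : rest
step (_:rest)   (Fstp 0)  = rest
step (x:rest)   (Fstp i)  = take (i - 1) rest ++ [x] ++ drop i rest
step st         _         = st   -- stack underflow: left unchanged here

runReal :: [Double] -> [RealInsn] -> [Double]
runReal = foldl step

-- Reference semantics on the flat fake register file.
runFake :: [Double] -> FakeInsn -> [Double]
runFake regs (FakeBin op s1 s2 d) =
  take d regs ++ [applyOp op (regs !! s1) (regs !! s2)] ++ drop (d + 1) regs
```

Running `runReal regs (translate insn)` on a six-element register file should agree with `runFake regs insn`, which is a cheap way to check that the shifted stack indices are right, including for non-commutative operations such as subtract.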
|
|
|
|
| |
Increase the heap size for Parser.hs to 80M (for 4.04).
|
|
|
|
|
| |
trivialCode (x86), when the first arg is an immediate, assumed you could
reverse the order of operands, but that is not true for e.g. subtract. Fixed.
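The operand-order issue can be illustrated with a small sketch. This is not the real trivialCode; `Op`, `Operand`, `Insn` and `select` are hypothetical names for a toy two-address instruction selector, and it assumes the destination register is distinct from the source registers. Swapping an immediate first operand into second position is only valid when the operator is commutative.

```haskell
data Op = Add | Mul | Sub
  deriving (Eq, Show)

data Operand = Imm Int | Reg String
  deriving (Eq, Show)

-- Toy two-address instructions (destination is the last field).
data Insn
  = MovI Int String         -- dst := imm
  | MovR String String      -- dst := src
  | OpI Op Int String       -- dst := dst `op` imm
  | OpR Op String String    -- dst := dst `op` src
  deriving (Eq, Show)

commutative :: Op -> Bool
commutative Sub = False
commutative _   = True

-- Compute (x `op` y) into dst.
select :: Op -> Operand -> Operand -> String -> [Insn]
select op (Imm i) (Reg r) dst
  | commutative op = [MovR r dst, OpI op i dst]   -- swapping operands is safe
  | otherwise      = [MovI i dst, OpR op r dst]   -- must keep the original order
select op (Reg r) (Imm i) dst = [MovR r dst, OpI op i dst]
select op (Reg a) (Reg b) dst = [MovR a dst, OpR op b dst]
select op (Imm a) (Imm b) dst = [MovI a dst, OpI op b dst]
```

With this sketch, `select Sub (Imm 5) (Reg "eax") "ecx"` loads 5 into the destination first and then subtracts %eax from it, whereas `select Add (Imm 5) (Reg "eax") "ecx"` is free to move %eax first and add the immediate.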
|
|
|
|
| |
amodeToStix: correctly compute offset for CHARLIKE_closure-s.
|
|
|
|
| |
Add comment about code generation for debug tracing.
|
|
|
|
|
| |
MachCode.stmt2Instrs, StFunBegin, x86 case only: for debugging,
generate trace code to print the name of each labelled code block.
|
|
|
|
|
| |
genCCall for x86 assumed that all args were 4 bytes long :-(.
Now works with doubles too.
|
|
|
|
|
|
| |
Don't spew floating/double literals into assembly output, since this
causes difficulties with FP numbers near the edges of the allowed
ranges. Instead, convert them to a sequence of bytes and emit those.
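As a rough sketch of the byte-sequence approach, the snippet below converts a double to its eight IEEE-754 bytes in little-endian order (as on x86) and renders them as a `.byte` directive. The function names are hypothetical, the directive format is an assumption, and `castDoubleToWord64` comes from a modern `GHC.Float`, which is not what the nativeGen of the time used.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.List (intercalate)
import Data.Word (Word8)
import GHC.Float (castDoubleToWord64)

-- The eight bytes of a Double's bit pattern, least significant first.
doubleBytesLE :: Double -> [Word8]
doubleBytesLE d =
  [ fromIntegral ((w `shiftR` (8 * i)) .&. 0xff) | i <- [0 .. 7] ]
  where
    w = castDoubleToWord64 d

-- Render the literal as an assembler directive instead of a decimal
-- float, so no precision is lost to printing and re-parsing.
byteDirective :: Double -> String
byteDirective d =
  "\t.byte " ++ intercalate "," (map show (doubleBytesLE d))
```

For instance, `doubleBytesLE 1.0` is `[0,0,0,0,0,0,240,63]`, the little-endian form of 0x3FF0000000000000.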
|
|
|
|
|
| |
Remove StLitLit, and clean up somewhat the handling of
stdout/stderr/stdin in CLitLits (in StixPrim.amodeToStix).
|
|
|
|
| |
Zap a couple of trace-s.
|
|
|
|
|
| |
Change alignment directives for x86 segment changes to something
more plausible-looking for Linux.
|
|
|
|
|
| |
genCodeInfoTable, genBitmapInfoTable: construct type_info to reflect
non-presence of flags in type info field.
|
|
|
|
|
| |
StixPrim.amodeToStix case CMacroExpr: handle UPD_FRAME_UPDATEE
StixMacro.macroCode: handle UPDATE_SU_FROM_UPD_FRAME
|
|
|
|
|
| |
Print a couple of blank lines in final assembly output in between basic
blocks, to make it easier to match up with the output of -ddump-stix.
|
|
|
|
| |
macroCode: implement PUSH_SEQ_FRAME
|
|
|
|
| |
checkCode: handle HP_CHK_UT_ALT.
|
|
|
|
| |
primCode: implement DataToTagOp.
|
|
|
|
| |
Bugfix (raiseError in non-enterable closures); added GranSim code to Schedule.c
|
|
|
|
| |
stmt2Instrs: correctly handle StData with zero data words
|
|
|
|
|
| |
genCodeInfoTable: don't do getSRTInfo on the closure info if we already
know via infoTblNeedsSRT that an SRT isn't needed.
|
|
|
|
| |
gentopcode: handle CClosureTbl.
|
|
|
|
|
| |
Merged GUM-4-04 branch into the main trunk. In particular merged GUM and
SMP code. Most of the GranSim code in GUM-4-04 still has to be carried over.
|
|
|
|
|
| |
Rearrange top-level nativeGen plumbing so that -ddump-stix is visible
even if subsequent nativeGen passes crash.
|
|
|
|
| |
Added a rudimentary implementation of -ddump-stix.
|
|
|
|
|
| |
Fix a bug in inlining that gave unresolved references
whenever you compile without -O. Silly me.
|
|
|
|
|
| |
Emit a reasonable error message instead of crashing when there's an
unterminated literal-literal in the source file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit arranges that literal strings will fuse
nicely, by expressing them as an application of build.
* NoRepStr is now completely redundant, though I haven't removed it yet.
* The unpackStr stuff moves from PrelPack to PrelBase.
* There's a new form of Rule, a BuiltinRule, for rules that
can't be expressed in Haskell. The string-fusion rule is one
such. It's defined in prelude/PrelRules.lhs.
* PrelRules.lhs also contains a great deal of code that
implements constant folding. In due course this will replace
ConFold.lhs, but for the moment it simply duplicates it.
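To illustrate what "expressing a literal string as an application of build" means, here is a minimal sketch using an ordinary `String` as a stand-in for the compiler's unpacked C string. `litString` and `countChars` are hypothetical names; `build` is the real function exported from `GHC.Exts`. The actual rule in PrelRules.lhs works on the compiler's internal representation and is not reproduced here.

```haskell
import GHC.Exts (build)

-- A literal written as a build: the list is described by how it
-- would be consumed with a cons function and a nil value.
litString :: String -> String
litString s = build (\cons nil -> foldr cons nil s)

-- A foldr consumer. Because the producer is a build, the standard
-- foldr/build rule can remove the intermediate list when optimising.
countChars :: String -> Int
countChars = foldr (\_ n -> n + 1) 0
```

Semantically `litString s` is just `s`; the benefit is only that `countChars (litString "hello")` gives the optimiser a foldr applied to a build, which can fuse.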
|
|
|
|
|
|
|
|
| |
Fix a renamer bug that rejected
import M hiding( C )
where C is a constructor.
|
|
|
|
| |
Remove more vestiges of IntAbsOp, and now unused absIntCode.
|
|
|
|
| |
Remove *uses* of unused IntAbsOp (see recent log message in prelude/PrimOp).
|