author Johan Tibell <johan.tibell@gmail.com> 2014-03-13 09:35:21 +0100
committer Johan Tibell <johan.tibell@gmail.com> 2014-03-22 10:32:02 +0100
commit 1eece45692fb5d1a5f4ec60c1537f8068237e9c1
tree b5d99d52c5a6ab762f9b92dfd0504105122ed62b
parent 99ef27913dbe55fa57891bbf97d131e0933733e3
codeGen: inline allocation optimization for clone array primops
The inline allocation version is 69% faster than the out-of-line version when cloning an array of 16 unit elements on a 64-bit machine.

Comparing the new and the old primop implementations isn't straightforward. The old version had a missing heap check that I discovered during the development of the new version. Comparing the old and the new version would require fixing the old version, which in turn means reimplementing the equivalent of MAYBE_CG in StgCmmPrim.

The inline allocation threshold is configurable via -fmax-inline-alloc-size, which gives the maximum array size, in bytes, to allocate inline. The size does not include the closure header size.

Allowing the same primop to be either inline or out-of-line has some implications for how we lay out heap checks. We always place a heap check around out-of-line primops, as they may allocate outside of our knowledge. However, for the inline primops we only allow allocation via the standard means (i.e. virtHp). Since the clone primops might be either inline or out-of-line, the heap check layout code now consults shouldInlinePrimOp to know whether a primop will be inlined.
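
The decision described above can be illustrated with a small standalone sketch. This is not the code in StgCmmPrim: the types Config, LenArg and Strategy and the functions chooseStrategy and needsSurroundingHeapCheck are hypothetical names invented for the illustration. It also assumes that the payload size is the element count times the word size, that the element count must be a compile-time literal for the inline path to apply, and that 128 bytes is just an example threshold value.

```haskell
module Main where

-- How a clone-style primop call would be compiled (hypothetical type).
data Strategy = Inline | OutOfLine
  deriving (Eq, Show)

-- Element-count argument as seen at compile time (hypothetical type).
data LenArg = Literal Int | Unknown
  deriving (Eq, Show)

-- Hypothetical stand-in for the compiler settings that are consulted.
data Config = Config
  { wordSizeBytes      :: Int  -- 8 on a 64-bit machine
  , maxInlineAllocSize :: Int  -- threshold in bytes, closure header excluded
  }

-- Inline only when the size is known (assumption) and the payload,
-- without the closure header, fits under the threshold.
chooseStrategy :: Config -> LenArg -> Strategy
chooseStrategy _ Unknown = OutOfLine
chooseStrategy cfg (Literal n)
  | n >= 0 && n * wordSizeBytes cfg <= maxInlineAllocSize cfg = Inline
  | otherwise = OutOfLine

-- Out-of-line primops may allocate outside the code generator's knowledge,
-- so they get their own heap check; inline ones allocate via the usual
-- mechanism and are covered by the enclosing check.
needsSurroundingHeapCheck :: Strategy -> Bool
needsSurroundingHeapCheck OutOfLine = True
needsSurroundingHeapCheck Inline    = False

main :: IO ()
main = do
  let cfg = Config { wordSizeBytes = 8, maxInlineAllocSize = 128 }
  print (chooseStrategy cfg (Literal 16))  -- Inline (16 * 8 = 128 bytes)
  print (chooseStrategy cfg (Literal 17))  -- OutOfLine (136 bytes)
  print (chooseStrategy cfg Unknown)       -- OutOfLine
  print (needsSurroundingHeapCheck (chooseStrategy cfg Unknown))  -- True
```

The point of the sketch is that the heap check layout depends on the same predicate as code generation, which is why the layout code has to ask whether a given call will be inlined rather than look only at the primop's out_of_line attribute.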
Diffstat (limited to 'compiler/prelude')
-rw-r--r-- compiler/prelude/primops.txt.pp | 4
1 files changed, 4 insertions, 0 deletions
diff --git a/compiler/prelude/primops.txt.pp b/compiler/prelude/primops.txt.pp
index 49fef3523a..e1a9824a16 100644
--- a/compiler/prelude/primops.txt.pp
+++ b/compiler/prelude/primops.txt.pp
@@ -794,6 +794,7 @@ primop CloneArrayOp "cloneArray#" GenPrimOp
source array. The provided array must fully contain the specified
range, but this is not checked.}
with
+ out_of_line = True
has_side_effects = True
code_size = { primOpCodeSizeForeignCall + 4 }
@@ -804,6 +805,7 @@ primop CloneMutableArrayOp "cloneMutableArray#" GenPrimOp
source array. The provided array must fully contain the specified
range, but this is not checked.}
with
+ out_of_line = True
has_side_effects = True
code_size = { primOpCodeSizeForeignCall + 4 }
@@ -814,6 +816,7 @@ primop FreezeArrayOp "freezeArray#" GenPrimOp
source array. The provided array must fully contain the specified
range, but this is not checked.}
with
+ out_of_line = True
has_side_effects = True
code_size = { primOpCodeSizeForeignCall + 4 }
@@ -824,6 +827,7 @@ primop ThawArrayOp "thawArray#" GenPrimOp
source array. The provided array must fully contain the specified
range, but this is not checked.}
with
+ out_of_line = True
has_side_effects = True
code_size = { primOpCodeSizeForeignCall + 4 }