| Commit message | Author | Age | Files | Lines |
| |
Summary: In initLinker, this patch adds the handle of the module corresponding to the program binary to the list of DLL handles that lookupSymbol uses to search for symbols.
Test Plan: validate
Reviewers: simonmar, austin
Reviewed By: simonmar, austin
Subscribers: phaskell, simonmar, relrod, ezyang, carter
Differential Revision: https://phabricator.haskell.org/D103
GHC Trac Issues: #9382
| |
this is z-encoding (as hvr tells me)
This reverts commit 425d5178af55620efa00e6e16426f491c63ad533.
| |
The two new primops with the type-signatures
resizeMutableByteArray# :: MutableByteArray# s -> Int#
-> State# s -> (# State# s, MutableByteArray# s #)
shrinkMutableByteArray# :: MutableByteArray# s -> Int#
-> State# s -> State# s
allow MutableByteArray#s to be resized in place (when possible), and are useful
for algorithms where memory is temporarily over-allocated. The motivating
use case is implementing integer backends, where the final target size of
the result is either N or N+1 and is only known after the operation has been
performed.
A future commit will implement a stateful variant of the
`sizeofMutableByteArray#` operation (see #9447 for details), since the
size of a `MutableByteArray#` may now change over its lifetime (i.e. before
it gets frozen or GCed).
Test Plan: ./validate --slow
Reviewers: ezyang, austin, simonmar
Reviewed By: austin, simonmar
Differential Revision: https://phabricator.haskell.org/D133
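As an illustration only (not part of this patch), the new primops might be wrapped in ST roughly as follows; the boxed MBA type and the helper names newMBA/resizeMBA/shrinkMBA are invented here:

    {-# LANGUAGE MagicHash, UnboxedTuples #-}
    import GHC.Exts (Int#, MutableByteArray#, newByteArray#,
                     resizeMutableByteArray#, shrinkMutableByteArray#)
    import GHC.ST (ST (ST))

    -- Boxed wrapper so the unlifted array can be returned from ST.
    data MBA s = MBA (MutableByteArray# s)

    -- Over-allocate n bytes up front.
    newMBA :: Int# -> ST s (MBA s)
    newMBA n = ST (\s -> case newByteArray# n s of
      (# s', mba #) -> (# s', MBA mba #))

    -- Grow or shrink to n bytes; a fresh array comes back only when the
    -- resize could not be done in place.
    resizeMBA :: MBA s -> Int# -> ST s (MBA s)
    resizeMBA (MBA mba) n = ST (\s -> case resizeMutableByteArray# mba n s of
      (# s', mba' #) -> (# s', MBA mba' #))

    -- Shrink to n bytes; always in place, so no array is returned.
    shrinkMBA :: MBA s -> Int# -> ST s ()
    shrinkMBA (MBA mba) n = ST (\s -> case shrinkMutableByteArray# mba n s of
      s' -> (# s', () #))

Note how the signatures above already encode the difference: shrinking never copies, so shrinkMutableByteArray# returns only a new state token.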
| |
This will hopefully help ensure some basic consistency going forward by
overriding buffer variables. In particular, it sets the wrap length, sets
the offset to 4, and turns off tabs.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
This is the second attempt to add this functionality. The first
attempt was reverted in 950fcae46a82569e7cd1fba1637a23b419e00ecd, due
to register allocator failure on x86. Given how the register
allocator currently works, we don't have enough registers on x86 to
support cmpxchg using complicated addressing modes. Instead we fall
back to a simpler addressing mode on x86.
Adds the following primops:
* atomicReadIntArray#
* atomicWriteIntArray#
* fetchSubIntArray#
* fetchOrIntArray#
* fetchXorIntArray#
* fetchAndIntArray#
Makes these pre-existing out-of-line primops inline:
* fetchAddIntArray#
* casIntArray#
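As a hypothetical usage sketch (not part of this patch), fetchAddIntArray# can back a simple shared counter in IO; the Counter type and helper names are invented, 8# assumes a 64-bit word, and the exact return-value convention of fetchAddIntArray# should be checked against the GHC.Prim documentation for the compiler in use:

    {-# LANGUAGE MagicHash, UnboxedTuples #-}
    import GHC.Exts (Int (I#), MutableByteArray#, RealWorld,
                     fetchAddIntArray#, newByteArray#, writeIntArray#)
    import GHC.IO (IO (IO))

    data Counter = Counter (MutableByteArray# RealWorld)

    -- One machine word (8# assumes a 64-bit platform), initialised to 0.
    newCounter :: IO Counter
    newCounter = IO (\s0 -> case newByteArray# 8# s0 of
      (# s1, mba #) -> case writeIntArray# mba 0# 0# s1 of
        s2 -> (# s2, Counter mba #))

    -- Atomically add n to slot 0; recent GHCs document the returned Int#
    -- as the value held before the addition.
    addCounter :: Counter -> Int -> IO Int
    addCounter (Counter mba) (I# n) = IO (\s0 ->
      case fetchAddIntArray# mba 0# n s0 of
        (# s1, old #) -> (# s1, I# old #))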
| |
This reverts commit 2f8b4c9330b455d4cb31c186c747a7db12a69251.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
This commit caused the register allocator to fail on i386.
This reverts commit d8abf85f8ca176854e9d5d0b12371c4bc402aac3 and
04dd7cb3423f1940242fdfe2ea2e3b8abd68a177 (the second being a fix to
the first).
| |
Summary:
Add more primops for atomic ops on byte arrays
Adds the following primops:
* atomicReadIntArray#
* atomicWriteIntArray#
* fetchSubIntArray#
* fetchOrIntArray#
* fetchXorIntArray#
* fetchAndIntArray#
Makes these pre-existing out-of-line primops inline:
* fetchAddIntArray#
* casIntArray#
| |
See Note [RTLD_LOCAL] for a summary of the problem and solution, and
| |
Fixes #9080
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Problems were found on 32-bit platforms; I'll commit again when I have a fix.
This reverts the following commits:
54b31f744848da872c7c6366dea840748e01b5cf
b0534f78a73f972e279eed4447a5687bd6a8308e
| |
This tracks the amount of memory allocation by each thread in a
counter stored in the TSO. Optionally, when the counter drops below
zero (it counts down), the thread can be sent an asynchronous
exception: AllocationLimitExceeded. When this happens, the thread is
given a small additional limit so that it can handle the exception. See
documentation in GHC.Conc for more details.
Allocation limits are similar to timeouts, but
- timeouts use real time, not CPU time. Allocation limits do not
count anything while the thread is blocked or in foreign code.
- timeouts don't re-trigger if the thread catches the exception;
allocation limits do.
- timeouts can catch non-allocating loops, if you use
-fno-omit-yields. This doesn't work for allocation limits.
I couldn't measure any impact on benchmarks with these changes, even
for nofib/smp.
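A minimal usage sketch (not part of this patch), assuming the GHC.Conc interface described above together with the AllocationLimitExceeded exception as re-exported from Control.Exception; the 100 MB figure is arbitrary:

    import Control.Exception (AllocationLimitExceeded (..), handle)
    import GHC.Conc (enableAllocationLimit, setAllocationCounter)

    main :: IO ()
    main = do
      -- The counter counts allocation down from this value, in bytes.
      setAllocationCounter (100 * 1000 * 1000)
      enableAllocationLimit
      handle (\AllocationLimitExceeded -> putStrLn "allocation limit exceeded")
             (print (length [1 :: Integer ..]))  -- allocates without bound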
| |
Issue discovered by Coverity Scan, CID 43168.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Issue discovered by Coverity Scan, CID 43171.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
CID 43178
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
| |
This is a patch from FB's internal build of GHC that I'm pushing
upstream.
Author: Andrew Gallagher <agallagher@fb.com>
This diff adds simple thin archive support to GHC's linker code, which
basically just entails finding the member data on disk rather than
inside the archive (except for the case of the symbol index and the
GNU filename index, where the member data is still inline).
| |
The copy array family of primops were moved out-of-line.
| |
These array types are smaller than Array# and MutableArray# and are
faster when the array size is small, as they don't have the overhead
of a card table. Having no card table reduces the closure size by 2
words in the typical small-array case and leads to less work when
updating or garbage-collecting the array.
Reduces both the runtime and memory allocation by 8.8% on my insert
benchmark for the HashMap type in the unordered-containers package,
which makes use of lots of small arrays. With tuned GC settings
(i.e. `+RTS -A6M`) the runtime reduction is 15%.
Fixes #8923.
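For illustration only (not part of this patch), the new primops can be wrapped much like the existing boxed-array ones; the SmallArr type and helper names are invented here:

    {-# LANGUAGE MagicHash, UnboxedTuples #-}
    import GHC.Exts (Int (I#), SmallArray#, indexSmallArray#,
                     newSmallArray#, unsafeFreezeSmallArray#)
    import GHC.ST (ST (ST), runST)

    -- Boxed wrapper around the new unlifted array type.
    data SmallArr a = SmallArr (SmallArray# a)

    -- Build an immutable small array holding n copies of x.
    replicateSmallArr :: Int -> a -> SmallArr a
    replicateSmallArr (I# n) x = runST (ST (\s0 ->
      case newSmallArray# n x s0 of
        (# s1, ma #) -> case unsafeFreezeSmallArray# ma s1 of
          (# s2, a #) -> (# s2, SmallArr a #)))

    -- Read without forcing the element; the unary unboxed tuple avoids a thunk.
    indexSmallArr :: SmallArr a -> Int -> a
    indexSmallArr (SmallArr a) (I# i) = case indexSmallArray# a i of
      (# x #) -> x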
| |
The inline allocation version is 69% faster than the out-of-line
version, when cloning an array of 16 unit elements on a 64-bit
machine.
Comparing the new and the old primop implementations isn't
straightforward. The old version had a missing heap check that I
discovered during the development of the new version. Comparing the
old and the new version would require fixing the old version, which
in turn means reimplementing the equivalent of MAYBE_GC in StgCmmPrim.
The inline allocation threshold is configurable via
-fmax-inline-alloc-size, which gives the maximum array size, in bytes,
to allocate inline. The size does not include the closure header size.
Allowing the same primop to be either inline or out-of-line has some
implications for how we lay out heap checks. We always place a heap
check around out-of-line primops, as they may allocate outside of our
knowledge. However, for the inline primops we only allow allocation
via the standard means (i.e. virtHp). Since the clone primops might be
either inline or out-of-line, the heap check layout code now consults
shouldInlinePrimOp to know whether a primop will be inlined.
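For illustration only (not part of this patch): a module that raises the inline-allocation threshold while cloning a slice of an Array#; the Arr wrapper, the helper name, and the 256-byte value are arbitrary choices made for the example:

    {-# LANGUAGE MagicHash #-}
    {-# OPTIONS_GHC -fmax-inline-alloc-size=256 #-}
    import GHC.Exts (Array#, Int (I#), cloneArray#)

    -- Boxed wrapper so the unlifted Array# can appear in ordinary signatures.
    data Arr a = Arr (Array# a)

    -- Copy len elements starting at off; clones small enough to fit under
    -- the threshold above can have their allocation generated inline.
    cloneSlice :: Arr a -> Int -> Int -> Arr a
    cloneSlice (Arr a) (I# off) (I# len) = Arr (cloneArray# a off len)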
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Our old function for searching for sections could only deal
with section names that were eight bytes or shorter; this
patch adds support for long section names.
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
This fixes #7134
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
This adds code for jumping to given addresses on ARM, written by Ben
Gamari.
However, when allocating new infotables for bytecode (which is where
this jump code occurs), we need to be sure to flush the cache on the
executable pointer returned from allocateExec(): on systems like ARM the
processor won't reliably read back code or flush the instruction cache
automatically, whereas x86 will.
So we add a new flushExec primitive that calls out to GCC's
__builtin___clear_cache builtin, which generates the correct code
(nothing on x86, and a call to libgcc's __clear_cache on ARM), and we
make sure to use it after writing the code out.
Authored-by: Ben Gamari <bgamari.foss@gmail.com>
Authored-by: Austin Seipp <austin@well-typed.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
This creates a new C API:
initLinker_ (int retain_cafs)
The old initLinker() was left as-is for backwards compatibility. See
documentation in Linker.h.
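A hypothetical FFI sketch (not part of this patch) of how the new entry point could be called from Haskell; the Haskell-side name c_initLinker_ is invented here, and the void result type is an assumption based on the description above:

    {-# LANGUAGE ForeignFunctionInterface #-}
    import Foreign.C.Types (CInt)

    -- Assumed C prototype: void initLinker_ (int retain_cafs);
    foreign import ccall unsafe "initLinker_"
      c_initLinker_ :: CInt -> IO ()

    -- Initialise the RTS linker; pass 0 to avoid retaining CAFs, nonzero
    -- to retain them (per the description above).
    main :: IO ()
    main = c_initLinker_ 0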
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
It needs freeStablePtr, which tripped my validate build due to an
implicit declaration warning. I'm quite surprised this somehow did not
trip the build before.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
Spotted by Bertram Felgenhauer.
Signed-off-by: Austin Seipp <austin@well-typed.com>
| |
See also #5435.
Now we have to remember the StablePtrs that get created by the
module initializer so that we can free them again in unloadObj().
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>