| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
fixes #19756, updates haddock submodule
|
|
|
|
| |
asynchronous.
|
|
|
|
|
|
|
|
| |
In (<|>) for ZipList, avoid processing the first argument twice (both as first
argument of (++) and for its length in drop count of the second argument).
Previously, the entire first argument was forced into memory, now (<|>) can run
in constant space even with long inputs.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* ensure that division wrappers are INLINE
* make div/mod/divMod call quot/rem/quotRem (same code)
* this ensures that the quotRemWordN# primitive is used to implement
divMod (it wasn't the case for sized Words)
* make first argument strict for Natural and Integer (similarly to other
numeric types)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The sequencing of monadic effects in foldlM and foldrM was described as
respectively right-associative and left-associative, but this could be
confusing, as in essence we're just composing Kleisli arrows, whose
composition is simply associative.
What matters therefore is the order of sequencing of effects, which
can be described more clearly without dragging in associativity as
such.
This avoids describing these folds as being both left-to-right and
right-to-left depending on whether we're tracking effects or operator
application. The new text should be easier to understand.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using the following high-quality benchmark (with -O2):
main :: IO ()
main = do
let
go 0 = ""
go n@(W# n#) = showWord n# (go (n -1))
print $ length (go 10000000)
I get the following performance results:
- remWord+quotRem: 0,76s user 0,00s system 99% cpu 0,762 total
- quotRemWord: 0,45s user 0,01s system 99% cpu 0,456 total
Note that showSignedInt already uses quotRemInt.
|
|
|
|
| |
Also make sure to be able to build with non-apple-clang, while using apple's SDK on macOS
|
|
|
|
|
|
| |
Thanks to Mathnerd3141 for the fixed example.
Fixes #19880
|
| |
|
| |
|
|
|
|
| |
Progress towards #19026
|
|
|
|
| |
(cherry picked from commit d22e087f7bf74341c4468f11b4eb0273033ca931)
|
|
|
|
|
|
|
|
|
|
|
| |
We think the compiler is ready, so we can do this for all over the 8-,
16-, and 32-bit boxed types.
We are holding off on doing all the primops at once so things are easier
to investigate.
Metric Decrease:
T12545
|
| |
|
|
|
|
|
|
|
| |
With a quick flavour I get:
before T12545(normal) ghc/alloc 8628109152
after T12545(normal) ghc/alloc 8559741088
|
|
|
|
|
| |
Like !5572, this is switching over a portion of the primops which seems
safe to use.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This gives a more precise type signature to `magicDict` as proposed in #16646.
In addition, this replaces the constant-folding rule for `magicDict` in
`GHC.Core.Opt.ConstantFold` with a special case in the desugarer in
`GHC.HsToCore.Expr.dsHsWrapped`. I have also renamed `magicDict` to `withDict`
in light of the discussion in
https://mail.haskell.org/pipermail/ghc-devs/2021-April/019833.html.
All of this has the following benefits:
* `withDict` is now more type safe than before. Moreover, if a user applies
`withDict` at an incorrect type, the special-casing in `dsHsWrapped` will
now throw an error message indicating what the user did incorrectly.
* `withDict` can now work with classes that have multiple type arguments, such
as `Typeable @k a`. This means that `Data.Typeable.Internal.withTypeable` can
now be implemented in terms of `withDict`.
* Since the special-casing for `withDict` no longer needs to match on the
structure of the expression passed as an argument to `withDict`, it no
longer cares about the presence or absence of `Tick`s. In effect, this
obsoletes the fix for #19667.
The new `T16646` test case demonstrates the new version of `withDict` in
action, both in terms of `base` functions defined in terms of `withDict`
as well as in terms of functions from the `reflection` and `singletons`
libraries. The `T16646Fail` test case demonstrates the error message that GHC
throws when `withDict` is applied incorrectly.
This fixes #16646. By adding more tests for `withDict`, this also
fixes #19673 as a side effect.
|
| |
|
| |
|
|
|
|
|
|
| |
sortWith has the same type definition as `Data.List.sortOn` (eg: `Ord b => (a -> b) -> [a] -> [a]`).
Nonetheless, they behave differently, sortOn being more efficient.
This merge request add documentation to reflect on this differences
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main idea here is to avoid treating
* case e of {}
* case unsafeEqualityProof of UnsafeRefl co -> blah
specially in CoreToStg. Instead, nail them in CorePrep,
by converting
case e of {}
==> e |> unsafe-co
case unsafeEqualityProof of UnsafeRefl cv -> blah
==> blah[unsafe-co/cv]
in GHC.Core.Prep. Now expressions that we want to treat as trivial
really are trivial. We can get rid of cpExprIsTrivial.
And we fix #19700.
A downside is that, at least under unsafeEqualityProof, we substitute
in types and coercions, which is more work. But a big advantage is
that it's all very simple and principled: CorePrep really gets rid of
the unsafeCoerce stuff, as it does empty case, runRW#, lazyId etc.
I've updated the overview in GHC.Core.Prep, and added
Note [Unsafe coercions] in GHC.Core.Prep
Note [Implementing unsafeCoerce] in base:Unsafe.Coerce
We get 3% fewer bytes allocated when compiling perf/compiler/T5631,
which uses a lot of unsafeCoerces. (It's a happy-generated parser.)
Metric Decrease:
T5631
|
| |
|
|
|
|
| |
Fixes #19719.
|
|
|
|
| |
follow-up from !4675
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- This allows specialized mconcat implementations an opportunity to combine
elements efficiently in a single pass.
- Inline the default implementation of `mconcat`, this
may result in list fusion.
- In Monoids with strict `mappend`, implement `mconcat` as a strict left fold:
* And (FiniteBits)
* Ior (FiniteBits)
* Xor (FiniteBits)
* Iff (FiniteBits)
* Max (Ord)
* Min (Ord)
* Sum (Num)
* Product (Num)
* (a -> m) (Monoid m)
- Delegate mconcat for WrappedMonoid to the underlying monoid.
Resolves: #17123
Per the discussion in !4890, we expect some stat changes:
* T17123(normal) run/alloc 403143160.0 4954736.0 -98.8% GOOD
This is the expected improvement in `fold` for a long list of
`Text` elements.
* T13056(optasm) ghc/alloc 381013328.0 447700520.0 +17.5% BAD
Here there's an extra simplifier run as a result of the new methods
of the Foldable instance for List. It looks benign. The test is
a micro benchmark that compiles just the derived foldable instances
for a pair of structures, a cost of this magnitude is not expected
to extend to more realistic programs.
* T9198(normal) ghc/alloc 504661992.0 541334168.0 +7.3% BAD
This test regressed from 8.10 and 9.0 back to exponential blowup.
This metric also fluctuates, for reasons not yet clear. The issue
here is the exponetial blowup, not this MR.
Metric Decrease:
T17123
Metric Increase:
T9198
T13056
|
|
|
|
|
| |
And though partially applied foldl' is now again inlined, #4301 has not
resurfaced, and appears to be resolved.
|
|
|
|
|
|
| |
It is incorrectly displayed in hackage as:
`m1 <*> m2 = m1 >>= (x1 -> m2 >>= (x2 -> return (x1 x2)))`
which isn't correct Haskell
|
|
|
|
| |
Also added nested foldr example for `concat`.
|
| |
|
|
|
|
|
|
|
| |
- Remove GHC.OldList
- Remove Data.OldList
- compat-unqualified-imports is no-op
- update haddock submodule
|
|
|
|
| |
Fixes #19575
|
|
|
|
|
|
|
|
|
| |
This drops allocateExec for darwin, and replaces it with
a alloc, write, mark executable strategy instead. This prevents
us from trying to allocate an executable range and then write to
it, which X^W will prohibit on darwin.
This will *only* work if we can use mmap.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The RULES that use hand-written specialised code for overloaded class
methods like floor, ceiling, truncate etc were fragile to certain
transformations. This patch makes them robust. See #19582.
It's all described in Note [Rules for overloaded class methods].
No test case because currently we don't do the transformation
(floating out over-saturated applications) that makes this patch
have an effect. But we may so so in future, and this patch makes
the RULES much more robust.
|
|
|
|
|
| |
This commit adds the `lint:compiler` Hadrian target to the CI runner.
It does also fixes hints in the compiler/ and libraries/base/ codebases.
|
| |
|
|
|
|
|
|
|
|
|
| |
It seems like I imported "GHC.Types ()" thinking that it would
transitively import GHC.Num.Integer when I wrote that module; but it
doesn't.
This led to build failures.
See https://mail.haskell.org/pipermail/ghc-devs/2021-March/019641.html
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's quite backend-dependent whether we will actually handle that case
right, so let's just always do this as a precaution.
In particular, once we replace the native primops used here with the new
sized primops, the 16-bit ones on x86 will begin to use 16-bit sized
instructions where they didn't before.
Though I'm not sure of any arch which has 8-bit scalar instructions, I
also did those for consistency. Plus, there are *vector* 8-bit ops in
the wild, so if we ever got into autovectorization or something maybe
it's prudent to put this here as a reminder not to forget about
catching overflows.
Progress towards #19026
|
|
|
|
|
| |
Co-authored-by: Daniel Rogozin <daniel.rogozin@serokell.io>
Co-authored-by: Rinat Stryungis <rinat.stryungis@serokell.io>
|
|
|
|
|
|
|
|
| |
integerToFloat# and integerToDouble# were moved from ghc-bignum to base.
GHC.Integer.floatFromInteger and doubleFromInteger were removed.
Fixes #15926, #17231, #17782
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Related to #19381 #19359 #14702
After a spike in memory usage we have been conservative about returning
allocated blocks to the OS in case we are still allocating a lot and would
end up just reallocating them. The result of this was that up to 4 * live_bytes
of blocks would be retained once they were allocated even if memory usage ended up
a lot lower.
For a heap of size ~1.5G, this would result in OS memory reporting 6G which is
both misleading and worrying for users.
In long-lived server applications this results in consistent high memory
usage when the live data size is much more reasonable (for example ghcide)
Therefore we have a new (2021) strategy which starts by retaining up to 4 * live_bytes
of blocks before gradually returning uneeded memory back to the OS on subsequent
major GCs which are NOT caused by a heap overflow.
Each major GC which is NOT caused by heap overflow increases the consec_idle_gcs
counter and the amount of memory which is retained is inversely proportional to this number.
By default the excess memory retained is
oldGenFactor (controlled by -F) / 2 ^ (consec_idle_gcs * returnDecayFactor)
On a major GC caused by a heap overflow, the `consec_idle_gcs` variable is reset to 0
(as we could continue to allocate more, so retaining all the memory might make sense).
Therefore setting bigger values for `-Fd` makes the rate at which memory is returned slower.
Smaller values make it get returned faster. Setting `-Fd0` disables the
memory return completely, which is the behaviour of older GHC versions.
The default is `-Fd4` which results in the following scaling:
> mapM print [(x, 1/ (2**(x / 4))) | x <- [1 :: Double ..20]]
(1.0,0.8408964152537146)
(2.0,0.7071067811865475)
(3.0,0.5946035575013605)
(4.0,0.5)
(5.0,0.4204482076268573)
(6.0,0.35355339059327373)
(7.0,0.29730177875068026)
(8.0,0.25)
(9.0,0.21022410381342865)
(10.0,0.17677669529663687)
(11.0,0.14865088937534013)
(12.0,0.125)
(13.0,0.10511205190671433)
(14.0,8.838834764831843e-2)
(15.0,7.432544468767006e-2)
(16.0,6.25e-2)
(17.0,5.255602595335716e-2)
(18.0,4.4194173824159216e-2)
(19.0,3.716272234383503e-2)
(20.0,3.125e-2)
So after 13 consecutive GCs only 0.1 of the maximum memory used will be retained.
Further to this decay factor, the amount of memory we attempt to retain is
also influenced by the GC strategy for the oldest generation. If we are using
a copying strategy then we will need at least 2 * live_bytes for copying to take
place, so we always keep that much. If using compacting or nonmoving then we need a lower number,
so we just retain at least `1.2 * live_bytes` for some protection.
In future we might want to make this behaviour more aggressive, some
relevant literature is
> Ulan Degenbaev, Jochen Eisinger, Manfred Ernst, Ross McIlroy, and Hannes Payer. 2016. Idle time garbage collection scheduling. SIGPLAN Not. 51, 6 (June 2016), 570–583. DOI:https://doi.org/10.1145/2980983.2908106
which describes the "memory reducer" in the V8 javascript engine which
on an idle collection immediately returns as much memory as possible.
|
|
|
|
|
|
|
| |
Now that GHC 9.0.1 is released, it is time to drop support for bootstrapping
with GHC 8.8, as we only support building with the previous two major GHC
releases. As an added bonus, this allows us to remove several bits of CPP that
are either always true or no longer reachable.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements the BoxedRep proposal, refactoring the `RuntimeRep`
hierarchy from:
```haskell
data RuntimeRep = LiftedPtrRep | UnliftedPtrRep | ...
```
to
```haskell
data RuntimeRep = BoxedRep Levity | ...
data Levity = Lifted | Unlifted
```
Updates binary, haddock submodules.
Closes #17526.
Metric Increase:
T12545
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The `whereFrom` function provides a Haskell interface for using the
information created by `-finfo-table-map`. Given a Haskell value, the
info table address will be passed to the `lookupIPE` function in order
to attempt to find the source location information for that particular closure.
At the moment it's not possible to distinguish the absense of the map
and a failed lookup.
|
|
|
|
|
|
|
|
| |
This profiling mode creates bands by the address of the info table for
each closure. This provides a much more fine-grained profiling output
than any of the other profiling modes.
The `-hi` profiling mode does not require a profiling build.
|
|
|
|
| |
($) is INLINE so there is no reason ($!) shouldn't.
|
|
|
|
|
|
|
|
|
|
| |
This patch exposes three new functions in `GHC.Profiling` which allow
heap profiling to be enabled and disabled dynamically.
1. startHeapProfTimer - Starts heap profiling with the given RTS options
2. stopHeapProfTimer - Stops heap profiling
3. requestHeapCensus - Perform a heap census on the next context
switch, regardless of whether the timer is enabled or not.
|