summaryrefslogtreecommitdiff
path: root/includes
Commit message (Collapse)AuthorAgeFilesLines
* Move `/includes` to `/rts/include`, sort per package betterJohn Ericson2021-08-0973-13416/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | In order to make the packages in this repo "reinstallable", we need to associate source code with a specific packages. Having a top level `/includes` dir that mixes concerns (which packages' includes?) gets in the way of this. To start, I have moved everything to `rts/`, which is mostly correct. There are a few things however that really don't belong in the rts (like the generated constants haskell type, `CodeGen.Platform.h`). Those needed to be manually adjusted. Things of note: - No symlinking for sake of windows, so we hard-link at configure time. - `CodeGen.Platform.h` no longer as `.hs` extension (in addition to being moved to `compiler/`) so as not to confuse anyone, since it is next to Haskell files. - Blanket `-Iincludes` is gone in both build systems, include paths now more strictly respect per-package dependencies. - `deriveConstants` has been taught to not require a `--target-os` flag when generating the platform-agnostic Haskell type. Make takes advantage of this, but Hadrian has yet to.
* Make `PosixSource.h` installed and under `rts/`John Ericson2021-08-091-0/+38
| | | | | | is used outside of the rts so we do this rather than just fish it out of the repo in ad-hoc way, in order to make packages in this repo more self-contained.
* Clean up whitespace in /includesJohn Ericson2021-08-099-104/+104
| | | | I need to do this now or when I move these files the linter will be mad.
* rts: Fix use of sized array in Heap.hBen Gamari2021-08-091-2/+6
| | | | | | Sized arrays cannot be used in headers that might be imported from C++. Fixes #20199.
* Fix ASSERTS_ENABLED CPPSylvain Henry2021-08-032-2/+2
|
* rts/OSThreads: Fix reference clock of timedWaitConditionBen Gamari2021-08-021-1/+11
| | | | | | | | | | | | | | | Previously `timedWaitCondition` assumed that timeouts were referenced against `CLOCK_MONOTONIC`. This is wrong; by default `pthread_cond_timedwait` references against `CLOCK_REALTIME`, although this can be overridden using `pthread_condattr_setclock`. Fix this and add support for using `CLOCK_MONOTONIC` whenever possible as it is more robust against system time changes and is likely cheaper to query. Unfortunately, this is complicated by the fact that older versions of Darwin did not provide `clock_gettime`, which means we also need to introduce a fallback path using `gettimeofday`. Fixes #20144.
* PrimOps: Add CAS op for all int sizesPeter Trommler2021-08-021-0/+4
| | | | | | | | | | | PPC NCG: Implement CAS inline for 32 and 64 bit testsuite: Add tests for smaller atomic CAS X86 NCG: Catch calls to CAS C fallback Primops: Add atomicCasWord[8|16|32|64]Addr# Add tests for atomicCasWord[8|16|32|64]Addr# Add changelog entry for new primops X86 NCG: Fix MO-Cmpxchg W64 on 32-bit arch ghc-prim: 64-bit CAS C fallback on all archs
* UNREG: implement 64-bit mach ops for 32-bit targetsSergei Trofimovich2021-07-291-1/+34
| | | | | | | | | | | | | | | Noticed build failures like ``` ghc-stage1: panic! (the 'impossible' happened) GHC version 9.3.20210721: pprCallishMachOp_for_C: MO_x64_Ne not supported! ``` on `--tagget=hppa2.0-unknown-linux-gnu`. The change does not fix all 32-bit unreg target problems, but at least allows linking final ghc binaries. Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
* Add configure flag to enable ASSERTs in all waysDaniel Gröber2021-07-292-7/+25
| | | | | | | | Running the test suite with asserts enabled is somewhat tricky at the moment as running it with a GHC compiled the DEBUG way has some hundred failures from the start. These seem to be unrelated to assertions though. So this provides a toggle to make it easier to debug failing assertions using the test suite.
* rts: Fix inconsistent signatures for collect_pointersBen Gamari2021-07-271-1/+1
| | | | Fixes #20160.
* rts/OSThreads: Improve error handling consistencyBen Gamari2021-07-271-3/+4
| | | | | | | | | | | | Previously we relied on the caller to check the return value from broadcastCondition and friends, most of whom neglected to do so. Given that these functions should not fail anyways, I've opted to drop the return value entirely and rather move the result check into the OSThreads functions. This slightly changes the semantics of timedWaitCondition which now returns false only in the case of timeout, rather than any error as previously done.
* rts/OSThreads: Ensure that we catch failures from pthread_mutex_lockBen Gamari2021-07-271-4/+5
| | | | Previously we would only catch EDEADLK errors.
* rts: Drop allocateExec and friendsBen Gamari2021-07-271-7/+0
| | | | All uses of these now use ExecPage.
* rts: Introduce and use ExecPage abstractionBen Gamari2021-07-272-0/+19
| | | | | Here we introduce a very thin abstraction for allocating, filling, and freezing executable pages to replace allocateExec.
* rts: Break up adjustor logicBen Gamari2021-07-272-4/+1
|
* Adds AArch64 Native Code GeneratorMoritz Angermann2021-06-053-3/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In which we add a new code generator to the Glasgow Haskell Compiler. This codegen supports ELF and Mach-O targets, thus covering Linux, macOS, and BSDs in principle. It was tested only on macOS and Linux. The NCG follows a similar structure as the other native code generators we already have, and should therfore be realtively easy to follow. It supports most of the features required for a proper native code generator, but does not claim to be perfect or fully optimised. There are still opportunities for optimisations. Metric Decrease: ManyAlternatives ManyConstructors MultiLayerModules PmSeriesG PmSeriesS PmSeriesT PmSeriesV T10421 T10421a T10858 T11195 T11276 T11303b T11374 T11822 T12227 T12545 T12707 T13035 T13253 T13253-spj T13379 T13701 T13719 T14683 T14697 T15164 T15630 T16577 T17096 T17516 T17836 T17836b T17977 T17977b T18140 T18282 T18304 T18478 T18698a T18698b T18923 T1969 T3064 T5030 T5321FD T5321Fun T5631 T5642 T5837 T783 T9198 T9233 T9630 T9872d T9961 WWRec Metric Increase: T4801
* Put Unique related global variables in the RTS (#19940)Sylvain Henry2021-06-051-0/+2
|
* Tighten scope of non-POSIX visibility macrosViktor Dukhovni2021-04-301-6/+0
| | | | | | | The __BSD_VISIBLE and _DARWIN_C_SOURCE macros expose non-POSIX prototypes in system header files. We should scope these to just the ".c" modules that actually need them, and avoid defining them in header files used in other C modules.
* Constants: add a note and fix minor doc glitchesSylvain Henry2021-04-102-3/+3
|
* Don't produce platformConstants fileSylvain Henry2021-04-101-8/+0
| | | | It isn't used for anything anymore
* Remove dynamic-by-default (#16782)Sylvain Henry2021-04-071-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dynamic-by-default was a mechanism to automatically select the -dynamic way for some targets. It was implemented in a convoluted way: it was defined as a flavour option, hence it couldn't be passed as a global settings (which are produced by `configure` before considering flavours), so a build system rule was used to pass -DDYNAMIC_BY_DEFAULT to the C compiler so that deriveConstants could infer it. * Make build system has it disabled for 8 years (951e28c0625ece7e0db6ac9d4a1e61e2737b10de) * It has never been implemented in Hadrian * Last time someone tried to enable it 1 year ago it didn't work (!2436) * Having this as a global constant impedes making GHC multi-target (see !5427) This commit fully removes support for dynamic-by-default. If someone wants to reimplement something like this, it would probably need to move the logic in the compiler. (Doing this would probably need some refactoring of the way the compiler handles DynFlags: DynFlags are used to store and to pass enabled ways to many parts of the compiler. It can be set by command-line flags, GHC API, global settings. In multi-target GHC, we will use DynFlags to load the target platform and its constants: but at this point with the current DynFlags implementation we can't easily update the existing DynFlags with target-specific options such as dynamic-by-default without overriding ways previously set by the user.)
* [armv7] PIC by default + [aarch64-linux] T11276 metric increaseMoritz Angermann2021-03-291-1/+1
| | | | | Metric Increase: T11276
* Allocate Adjustors and mark them readable in two stepsMoritz Angermann2021-03-292-1/+13
| | | | | | | | | This drops allocateExec for darwin, and replaces it with a alloc, write, mark executable strategy instead. This prevents us from trying to allocate an executable range and then write to it, which X^W will prohibit on darwin. This will *only* work if we can use mmap.
* Generate GHCi bytecode from STG instead of Core and support unboxedLuite Stegeman2021-03-202-0/+75
| | | | | | tuples and sums. fixes #1257
* rts: Gradually return retained memory to the OSMatthew Pickering2021-03-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Related to #19381 #19359 #14702 After a spike in memory usage we have been conservative about returning allocated blocks to the OS in case we are still allocating a lot and would end up just reallocating them. The result of this was that up to 4 * live_bytes of blocks would be retained once they were allocated even if memory usage ended up a lot lower. For a heap of size ~1.5G, this would result in OS memory reporting 6G which is both misleading and worrying for users. In long-lived server applications this results in consistent high memory usage when the live data size is much more reasonable (for example ghcide) Therefore we have a new (2021) strategy which starts by retaining up to 4 * live_bytes of blocks before gradually returning uneeded memory back to the OS on subsequent major GCs which are NOT caused by a heap overflow. Each major GC which is NOT caused by heap overflow increases the consec_idle_gcs counter and the amount of memory which is retained is inversely proportional to this number. By default the excess memory retained is oldGenFactor (controlled by -F) / 2 ^ (consec_idle_gcs * returnDecayFactor) On a major GC caused by a heap overflow, the `consec_idle_gcs` variable is reset to 0 (as we could continue to allocate more, so retaining all the memory might make sense). Therefore setting bigger values for `-Fd` makes the rate at which memory is returned slower. Smaller values make it get returned faster. Setting `-Fd0` disables the memory return completely, which is the behaviour of older GHC versions. The default is `-Fd4` which results in the following scaling: > mapM print [(x, 1/ (2**(x / 4))) | x <- [1 :: Double ..20]] (1.0,0.8408964152537146) (2.0,0.7071067811865475) (3.0,0.5946035575013605) (4.0,0.5) (5.0,0.4204482076268573) (6.0,0.35355339059327373) (7.0,0.29730177875068026) (8.0,0.25) (9.0,0.21022410381342865) (10.0,0.17677669529663687) (11.0,0.14865088937534013) (12.0,0.125) (13.0,0.10511205190671433) (14.0,8.838834764831843e-2) (15.0,7.432544468767006e-2) (16.0,6.25e-2) (17.0,5.255602595335716e-2) (18.0,4.4194173824159216e-2) (19.0,3.716272234383503e-2) (20.0,3.125e-2) So after 13 consecutive GCs only 0.1 of the maximum memory used will be retained. Further to this decay factor, the amount of memory we attempt to retain is also influenced by the GC strategy for the oldest generation. If we are using a copying strategy then we will need at least 2 * live_bytes for copying to take place, so we always keep that much. If using compacting or nonmoving then we need a lower number, so we just retain at least `1.2 * live_bytes` for some protection. In future we might want to make this behaviour more aggressive, some relevant literature is > Ulan Degenbaev, Jochen Eisinger, Manfred Ernst, Ross McIlroy, and Hannes Payer. 2016. Idle time garbage collection scheduling. SIGPLAN Not. 51, 6 (June 2016), 570–583. DOI:https://doi.org/10.1145/2980983.2908106 which describes the "memory reducer" in the V8 javascript engine which on an idle collection immediately returns as much memory as possible.
* eventlog: Add BLOCKS_SIZE eventMatthew Pickering2021-03-081-0/+1
| | | | | | | | | | | | | The BLOCKS_SIZE event reports the size of the currently allocated blocks in bytes. It is like the HEAP_SIZE event, but reports about the blocks rather than megablocks. You can work out the current heap fragmentation by looking at the difference between HEAP_SIZE and BLOCKS_SIZE. Fixes #19357
* eventlog: Add MEM_RETURN event to give information about fragmentationMatthew Pickering2021-03-081-0/+2
| | | | | | | | | | | | | | See #19357 The event reports the * Current number of megablocks allocated * The number that the RTS thinks it needs * The number is managed to return to the OS When current > need then the difference is returned to the OS, the successful number of returned mblocks is reported by 'returned'. In a fragmented heap current > need but returned < current - need.
* Implement riscv64 LLVM backendAndreas Schwab2021-03-054-1/+141
| | | | This enables a registerised build for the riscv64 architecture.
* Add whereFrom and whereFrom# primopMatthew Pickering2021-03-031-0/+1
| | | | | | | | | | The `whereFrom` function provides a Haskell interface for using the information created by `-finfo-table-map`. Given a Haskell value, the info table address will be passed to the `lookupIPE` function in order to attempt to find the source location information for that particular closure. At the moment it's not possible to distinguish the absense of the map and a failed lookup.
* Add -finfo-table-map which maps info tables to source positionsMatthew Pickering2021-03-033-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new flag embeds a lookup table from the address of an info table to information about that info table. The main interface for consulting the map is the `lookupIPE` C function > InfoProvEnt * lookupIPE(StgInfoTable *info) The `InfoProvEnt` has the following structure: > typedef struct InfoProv_{ > char * table_name; > char * closure_desc; > char * ty_desc; > char * label; > char * module; > char * srcloc; > } InfoProv; > > typedef struct InfoProvEnt_ { > StgInfoTable * info; > InfoProv prov; > struct InfoProvEnt_ *link; > } InfoProvEnt; The source positions are approximated in a similar way to the source positions for DWARF debugging information. They are only approximate but in our experience provide a good enough hint about where the problem might be. It is therefore recommended to use this flag in conjunction with `-g<n>` for more accurate locations. The lookup table is also emitted into the eventlog when it is available as it is intended to be used with the `-hi` profiling mode. Using this flag will significantly increase the size of the resulting object file but only by a factor of 2-3x in our experience.
* Profiling by info table mode (-hi)Matthew Pickering2021-03-032-6/+8
| | | | | | | | This profiling mode creates bands by the address of the info table for each closure. This provides a much more fine-grained profiling output than any of the other profiling modes. The `-hi` profiling mode does not require a profiling build.
* Profiling: Allow heap profiling to be controlled dynamically.Matthew Pickering2021-03-033-0/+26
| | | | | | | | | | This patch exposes three new functions in `GHC.Profiling` which allow heap profiling to be enabled and disabled dynamically. 1. startHeapProfTimer - Starts heap profiling with the given RTS options 2. stopHeapProfTimer - Stops heap profiling 3. requestHeapCensus - Perform a heap census on the next context switch, regardless of whether the timer is enabled or not.
* Support auto-detection of MAX_REAL_FLOAT_REG and MAX_REAL_DOUBLE_REG up to 6ARATA Mizuki2021-03-021-2/+18
| | | | Fixes #17953
* Define TRY_ACQUIRE_LOCK correctly when non-threadedMatthew Pickering2021-03-021-3/+5
|
* Remove the -xt heap profiling optionMatthew Pickering2021-02-271-1/+0
| | | | | | | It should be left to tooling to perform the filtering to remove these specific closure types from the profile if desired. Fixes #16795
* rts: Introduce --eventlog-flush-interval flagBen Gamari2021-02-271-0/+2
| | | | | | | This introduces a flag, --eventlog-flush-interval, which can be used to set an upper bound on the amount of time for which an eventlog event will remain enqueued. This can be useful in real-time monitoring settings.
* Move absentError into ghc-prim.Andreas Klebinger2021-02-261-0/+1
| | | | | | | | | | | | When using -fdicts-strict we generate references to absentError while compiling ghc-prim. However we always load ghc-prim before base so this caused linker errors. We simply solve this by moving absentError into ghc-prim. This does mean it's now a panic instead of an exception which can no longer be caught. But given that it should only be thrown if there is a compiler error that seems acceptable, and in fact we already do this for absentSumFieldError which has similar constraints.
* rts: Add generic block traversal function, listAllBlocksMatthew Pickering2021-02-181-0/+3
| | | | | | | | | This function is exposed in the RtsAPI.h so that external users have a blessed way to traverse all the different `bdescr`s which are known by the RTS. The main motivation is to use this function in ghc-debug but avoid having to expose the internal structure of a Capability in the API.
* rts: TraverseHeap: Simplify profiling headerDaniel Gröber2021-02-171-14/+5
| | | | | Having a union in the closure profiling header really just complicates things so get back to basics, we just have a single StgWord there for now.
* Fix typosBrian Wignall2021-02-061-1/+1
|
* Move ioManager{Start,Wakeup,Die} to internal IOManager.hDuncan Coutts2021-01-251-12/+0
| | | | | | | | Move them from the external IOInterface.h to the internal IOManager.h. The functions are all in fact internal. They are not used from the base library at all. Remove ioManagerWakeup as an exported symbol. It is not used elsewhere.
* Rename includes/rts/IOManager.h to IOInterface.hDuncan Coutts2021-01-252-1/+1
| | | | | | | | | | | | | | | | | | | | | Naming is hard. Where we want to get to is to have a clear internal and external API for the IO manager within the RTS. What we have right now is just the external API (used in base for the Haskell side of the threaded IO manager impls) living in includes/rts/IOManager.h. We want to add a clear RTS internal API, which really ought to live in rts/IOManager.h. Several people think it's too confusing to have both: * includes/rts/IOManager.h for the external API * rts/IOManager.h for the internal API So the plan is to add rts/IOManager.{h,c} as the internal parts, and rename the external part to be includes/rts/IOInterface.h. It is admittidly not great to have .h files in includes/rts/ called "interface" since by definition, every .h fle under includes/ is an interface! Alternative naming scheme suggestions welcome!
* Force inlining of deRefStablePtr to silence warningsAndreas Klebinger2021-01-221-2/+2
|
* rts: Initialize card table in newArray#Ben Gamari2021-01-171-13/+20
| | | | | | | | | | Previously we would leave the card table of new arrays uninitialized. This wasn't a soundness issue: at worst we would end up doing unnecessary scavenging during GC, after which the card table would be reset. That being said, it seems worth initializing this properly to avoid both unnecessary work and non-determinism. Fixes #19143.
* rts: gc: use mutex+condvar instead of spinlooks in gc entry/exitDouglas Wilson2021-01-171-14/+0
| | | | | | used timed wait on condition variable in waitForGcThreads fix dodgy timespec calculation
* rts: add timedWaitConditionDouglas Wilson2021-01-171-0/+1
|
* rts: add max_n_todo_overflow internal counterDouglas Wilson2021-01-171-0/+2
| | | | | | | | I've never observed this counter taking a non-zero value, however I do think it's existence is justified by the comment in grab_local_todo_block. I've not added it to RTSStats in GHC.Stats, as it doesn't seem worth the api churn.
* rts: remove no_work counterDouglas Wilson2021-01-171-3/+0
| | | | We are no longer busyish waiting, so this is no longer meaningful
* rts/eventlog: Introduce event to demarcate new ticky sampleBen Gamari2021-01-171-1/+2
|
* rts: Zero shrunk array slop in vanilla RTSBen Gamari2021-01-071-1/+5
| | | | | | But only when profiling or DEBUG are enabled. Fixes #17572.