author | Ivan Maidanski <ivmai@mail.ru> | 2023-01-26 07:32:47 +0300 |
---|---|---|
committer | Ivan Maidanski <ivmai@mail.ru> | 2023-01-26 07:53:41 +0300 |
commit | d389c2d7f08ab48e7b6ec1a22c2afe9d201f5b12 (patch) | |
tree | de1e09811741f697ac3bd557af06c402390be641 /docs/porting.md | |
parent | a8d383792100cb9958b430498e5a49ea4c4c2b9f (diff) | |
download | bdwgc-d389c2d7f08ab48e7b6ec1a22c2afe9d201f5b12.tar.gz |
Rename doc folder to docs
* CMakeLists.txt [enable_docs] (CMAKE_INSTALL_DOCDIR): Rename doc
folder to docs.
* Makefile.am [ENABLE_DOCS] (docdocdir, dist_docdocs_DATA,
dist_docdocsplatforms_DATA): Likewise.
* Makefile.direct (CXXFLAGS): Likewise.
* README.md: Likewise.
* docs/simple_example.md (Other platforms): Likewise.
* Makefile.am [ENABLE_DOCS] (docdocdir): Rename to docdocsdir.
* Makefile.am [ENABLE_DOCS] (dist_docdoc_DATA): Rename to
dist_docdocs_DATA.
* Makefile.am [ENABLE_DOCS] (docdocplatformsdir): Rename to
docdocsplatformsdir.
* Makefile.am [ENABLE_DOCS] (dist_docdocplatforms_DATA): Rename to
dist_docdocsplatforms_DATA.
* doc/README.autoconf: Move to docs folder.
* doc/README.cmake: Likewise.
* doc/README.cords: Likewise.
* doc/README.environment: Likewise.
* doc/README.macros: Likewise.
* doc/debugging.md: Likewise.
* doc/faq.md: Likewise.
* doc/finalization.md: Likewise.
* doc/gcdescr.md: Likewise.
* doc/gcinterface.md: Likewise.
* doc/leak.md: Likewise.
* doc/overview.md: Likewise.
* doc/porting.md: Likewise.
* doc/scale.md: Likewise.
* doc/simple_example.md: Likewise.
* doc/tree.md: Likewise.
* doc/platforms/README.aix: Move to docs/platforms folder.
* doc/platforms/README.amiga: Likewise.
* doc/platforms/README.arm_cross: Likewise.
* doc/platforms/README.darwin: Likewise.
* doc/platforms/README.dgux386: Likewise.
* doc/platforms/README.emscripten: Likewise.
* doc/platforms/README.ews4800: Likewise.
* doc/platforms/README.hp: Likewise.
* doc/platforms/README.linux: Likewise.
* doc/platforms/README.mac: Likewise.
* doc/platforms/README.os2: Likewise.
* doc/platforms/README.sgi: Likewise.
* doc/platforms/README.solaris2: Likewise.
* doc/platforms/README.symbian: Likewise.
* doc/platforms/README.uts: Likewise.
* doc/platforms/README.win32: Likewise.
* doc/platforms/README.win64: Likewise.
Diffstat (limited to 'docs/porting.md')
-rw-r--r-- | docs/porting.md | 263 |
1 files changed, 263 insertions, 0 deletions
diff --git a/docs/porting.md b/docs/porting.md
new file mode 100644
index 00000000..6025e53f
--- /dev/null
+++ b/docs/porting.md
@@ -0,0 +1,263 @@
+# Conservative Garbage Collector Porting Directions
+
+The collector is designed to be relatively easy to port, but is not portable
+code per se. The collector inherently has to perform operations, such
+as scanning the stack(s), that are not possible in portable C code.
+
+All of the following assumes that the collector is being ported to
+a byte-addressable 32- or 64-bit machine. Currently all successful ports
+to 64-bit machines involve LP64 and LLP64 targets (notably Win64). You
+are hereby discouraged from attempting a port to non-byte-addressable,
+or 8-bit, or 16-bit machines.
+
+The difficulty of porting the collector varies greatly depending on the needed
+functionality. In the simplest case, only some small additions are needed for
+the `include/private/gcconfig.h` file. This is described in the following
+section. Later sections discuss some of the optional features, which typically
+involve more porting effort.
+
+Note that the collector makes heavy use of `ifdef`s. Unlike some other
+software projects, we have concluded repeatedly that this is preferable
+to system-dependent files, with code duplicated between the files. However,
+to keep this manageable, we do strongly believe in indenting `ifdef`s
+correctly (for historical reasons usually without the leading sharp sign).
+(Separate source files are of course fine if they do not result in code
+duplication.)
+
+## Adding Platforms to gcconfig.h
+
+If neither thread support, nor tracing of dynamic library data is required,
+these are often the only changes you will need to make.
+
+The `gcconfig.h` file consists of three sections:
+
+  1. A section that defines GC-internal macros that identify the architecture
+  (e.g. `IA64` or `I386`) and operating system (e.g. `LINUX` or `MSWIN32`).
+  This is usually done by testing predefined macros. By defining our own
+  macros instead of using the predefined ones directly, we can impose a bit
+  more consistency, and somewhat isolate ourselves from compiler differences.
+  It is relatively straightforward to add a new entry here. But please try
+  to be consistent with the existing code. In particular, 64-bit variants
+  of 32-bit architectures generally are _not_ treated as a new architecture.
+  Instead we explicitly test for 64-bit-ness in the few places in which
+  it matters. (The notable exception here is `I386` and `X86_64`. This
+  is partially historical, and partially justified by the fact that there are
+  arguably more substantial architecture and ABI differences here than for
+  RISC variants.) On GNU-based systems, `cpp -dM empty_source_file.c` seems
+  to generate a set of predefined macros. On some other systems, the "verbose"
+  compiler option may do so, or the manual page may list them.
+
+  2. A section that defines a small number of platform-specific macros, which
+  are then used directly by the collector. For simple ports, this is where
+  most of the effort is required. We describe the macros below. This section
+  contains a subsection for each architecture (enclosed in a suitable `ifdef`).
+  Each subsection usually contains some architecture-dependent defines,
+  followed by several sets of OS-dependent defines, again enclosed in
+  `ifdef`s.
+
+  3. A section that fills in defaults for some macros left undefined in the
+  preceding section, and defines some other macros that rarely need adjustment
+  for new platforms. You will typically not have to touch these. If you are
+  porting to an OS that was previously completely unsupported, it is likely
+  that you will need to add another clause to the definition of `GET_MEM`.
+
+The following macros must be defined correctly for each architecture and
+operating system:
+
+  * `MACH_TYPE` - Defined to a string that represents the machine
+  architecture. Usually just the macro name used to identify the architecture,
+  but enclosed in quotes.
+  * `OS_TYPE` - Defined to a string that represents the operating system name.
+  Usually just the macro name used to identify the operating system, but
+  enclosed in quotes.
+  * `CPP_WORDSZ` - The word size in bits as a constant suitable for
+  preprocessor tests, i.e. without casts or `sizeof` expressions. Currently
+  always defined as either 64 or 32. For platforms supporting both 32- and
+  64-bit ABIs, this should be conditionally defined depending on the current
+  ABI. There is a default of 32.
+  * `ALIGNMENT` - Defined to be the largest _N_ such that all pointers
+  are guaranteed to be aligned on _N_-byte boundaries. Defining it to be _1_
+  will always work, but perform poorly. For all modern 32-bit platforms, this
+  is 4. For all modern 64-bit platforms, this is 8. Whether or not x86
+  qualifies as a modern architecture here is compiler- and OS-dependent.
+  * `DATASTART` - The beginning of the main data segment. The collector will
+  trace all memory between `DATASTART` and `DATAEND` for root pointers.
+  On some platforms, this can be defined to a constant address, though
+  experience has shown that to be risky. Ideally the linker will define
+  a symbol (e.g. `_data`) whose address is the beginning of the data segment.
+  Sometimes the value can be computed using the `GC_SysVGetDataStart`
+  function. Not used if either the next macro is defined, or if dynamic
+  loading is supported, and the dynamic loading support defines a function
+  `GC_register_main_static_data` which returns false.
+  * `SEARCH_FOR_DATA_START` - If this is defined, `DATASTART` will be defined
+  to a dynamically computed value which is obtained by starting with the
+  address of `_end` and walking backwards until non-addressable memory
+  is found. This often works on POSIX-like platforms. It makes it harder
+  to debug client programs, since startup involves generating and catching
+  a segmentation fault, which tends to confuse users.
+  * `DATAEND` - Set to the end of the main data segment. Defaults to `_end`,
+  where that is declared as an array. This works in some cases, since the
+  linker introduces a suitable symbol.
+  * `DATASTART2`, `DATAEND2` - Some platforms have two discontiguous main data
+  segments, e.g. for initialized and uninitialized data. If so, these two
+  macros should be defined to the limits of the second main data segment.
+  * `STACK_GROWS_UP` - Should be defined if the stack (or thread stacks) grow
+  towards higher addresses. (This appears to be true only on PA-RISC. If your
+  architecture has more than one stack per thread, and is not supported yet,
+  you will need to do more work. Grep for "IA64" in the source for an
+  example.)
+  * `STACKBOTTOM` - Defined to be the cold end of the stack, which is usually
+  (i.e. when the stacks grow down) the highest address in the stack. It must
+  bound the region of the stack that contains pointers into the GC heap. With
+  thread support, this must be the cold end of the main stack, which typically
+  cannot be found in the same way as the other thread stacks. If this is not
+  defined and none of the following three macros is defined, client code must
+  explicitly set `GC_stackbottom` to an appropriate value before calling
+  `GC_INIT` or any other `GC_` routine.
+  * `LINUX_STACKBOTTOM` - May be defined instead of `STACKBOTTOM`. If defined,
+  then the cold end of the stack will be determined by the collector, usually
+  by reading it from `/proc`.
+  * `HEURISTIC1` - May be defined instead of `STACKBOTTOM`. `STACK_GRAN`
+  should generally also be redefined. The cold end of the stack is determined
+  by taking an address inside `GC_init`'s frame, and rounding it up to the next
+  multiple of `STACK_GRAN`. This works well if the stack bottom is always
+  aligned to a large power of two. (`STACK_GRAN` is predefined to 0x1000000,
+  which is rarely optimal.)
+  * `HEURISTIC2` - May be defined instead of `STACKBOTTOM`. The cold end
+  of the stack is determined by taking an address inside `GC_init`'s frame,
+  incrementing it repeatedly in small steps (decrement if `STACK_GROWS_UP`),
+  and reading the value at each location. We remember the value when the first
+  Segmentation violation or Bus error is signaled, round that to the nearest
+  plausible page boundary, and use that as the stack bottom.
+  * `DYNAMIC_LOADING` - Should be defined if `dyn_load.c` has been updated for
+  this platform and tracing of dynamic library roots is supported.
+  * `GWW_VDB`, `MPROTECT_VDB`, `PROC_VDB`, `SOFT_VDB` - May be defined if the
+  corresponding _virtual dirty bit_ implementation in `os_dep.c` is usable on
+  this platform. This allows incremental/generational garbage collection.
+  (`GWW_VDB` uses the Win32 `GetWriteWatch` function to read dirty bits;
+  `MPROTECT_VDB` identifies modified pages by write-protecting the heap and
+  catching faults. `PROC_VDB` and `SOFT_VDB` use the /proc pseudo-files to
+  read dirty bits.)
+  * `PREFETCH`, `GC_PREFETCH_FOR_WRITE` - The collector uses `PREFETCH(x)`
+  to preload the cache with the data at _x_ address. This defaults to a no-op.
+  * `CLEAR_DOUBLE` - If `CLEAR_DOUBLE` is defined, then `CLEAR_DOUBLE(x)`
+  is used as a fast way to clear the two words at `GC_malloc`-aligned address
+  _x_. By default, word stores of 0 are used instead.
+  * `HEAP_START` - May be defined as the initial address hint for mmap-based
+  allocation.
+
+## Additional requirements for a basic port
+
+In some cases, you may have to add additional platform-specific code to other
+files. A likely candidate is the implementation
+of `GC_with_callee_saves_pushed` in `mach_dep.c`. This ensures that register
+contents that the collector must trace from are copied to the stack. Typically
+this can be done portably, but on some platforms it may require assembly code,
+or just tweaking of conditional compilation tests.
+
+If your platform supports `getcontext` then defining the macro
+`UNIX_LIKE` for your OS in `gcconfig.h` (if it is not defined there yet)
+is likely to solve the problem. Otherwise, if you are using gcc,
+`__builtin_unwind_init` will be used, and should work fine. If that is not
+applicable either, the implementation will try to use `setjmp`. This will work
+if your `setjmp` implementation saves all possibly pointer-valued registers
+into the buffer, as opposed to trying to unwind the stack at `longjmp` time.
+The `setjmp_test` test tries to determine this, but often does not get it
+right. Handling register tracing with assembly code is generally to be
+avoided.
+
+Most commonly `os_dep.c` will not require attention, but see below.
+
+## Thread support
+
+Supporting threads requires that the collector be able to find and suspend all
+threads potentially accessing the garbage-collected heap, and locate any state
+associated with each thread that must be traced.
+
+The functionality needed for thread support is generally implemented in one or
+more files specific to the particular thread interface. For example, somewhat
+portable pthread support is implemented in `pthread_support.c` and
+`pthread_stop_world.c`. The essential functionality consists of:
+
+  * `GC_stop_world` - Stops all threads which may access the garbage collected
+  heap, other than the caller;
+  * `GC_start_world` - Restart other threads;
+  * `GC_push_all_stacks` - Push the contents of all thread stacks (or,
+  at least, of pointer-containing regions in the thread stacks) onto the mark
+  stack.
+
+These very often require that the garbage collector maintain its own data
+structures to track active threads.
+
+In addition, `LOCK` and `UNLOCK` must be implemented in `gc_locks.h`.
+
+The easiest case is probably a new pthreads platform on which threads can be
+stopped with signals. In this case, the changes involve:
+
+  1. Introducing a suitable `GC_xxx_THREADS` macro, which should
+  be automatically defined by `gc_config_macros.h` in the right cases.
+  It should also result in a definition of `GC_PTHREADS`, as for the existing
+  cases.
+  2. Ensuring that the `atomic_ops` package at least minimally
+  supports the platform. If incremental GC is needed, or if pthread locks
+  do not perform adequately as the allocation lock, you will probably need
+  to ensure that a sufficient `atomic_ops` port exists for the platform
+  to provide an atomic test-and-set operation. The latest GC code can use
+  GCC atomic intrinsics instead of the `atomic_ops` package (see
+  `include/private/gc_atomic_ops.h`).
+  3. Making any needed adjustments to `pthread_stop_world.c` and
+  `pthread_support.c`. Ideally none should be needed. In fact, not all of this
+  is as well standardized as one would like, and outright bugs requiring
+  workarounds are common. Non-preemptive threads packages will probably
+  require further work. Similarly thread-local allocation and parallel marking
+  require further work in `pthread_support.c`, and may require better
+  `atomic_ops` support for the target platform.
+
+## Dynamic library support
+
+So long as `DATASTART` and `DATAEND` are defined correctly, the collector will
+trace memory reachable from file scope or `static` variables defined as part
+of the main executable. This is sufficient if either the program is statically
+linked, or if pointers to the garbage-collected heap are never stored
+in non-stack variables defined in dynamic libraries.
+
+If dynamic library data sections must also be traced, then:
+
+  * `DYNAMIC_LOADING` must be defined in the appropriate section of
+  `gcconfig.h`.
+  * An appropriate version of the function `GC_register_dynamic_libraries`
+  should be defined in `dyn_load.c`. This function should invoke
+  `GC_cond_add_roots(region_start, region_end, TRUE)` on each dynamic
+  library data section.
+
+Implementations that scan for writable data segments are error prone,
+particularly in the presence of threads. They frequently result in race
+conditions when threads exit and stacks disappear. They may also accidentally
+trace large regions of graphics memory, or mapped files. On at least one
+occasion they have been known to try to trace device memory that could not
+safely be read in the manner the GC wanted to read it.
+
+It is usually safer to walk the dynamic linker data structure, especially
+if the linker exports an interface to do so. But beware of poorly documented
+locking behavior in this case.
+
+## Incremental GC support
+
+For incremental and generational collection to work, `os_dep.c` must contain
+a suitable _virtual dirty bit_ implementation, which allows the collector
+to track which heap pages (assumed to be a multiple of the collector's block
+size) have been written during a certain time interval. The collector provides
+several implementations, which might be adapted. The default (`DEFAULT_VDB`)
+is a placeholder which treats all pages as having been written. This ensures
+correctness, but renders incremental and generational collection essentially
+useless.
+
+## Stack traces for debug support
+
+If stack traces in objects are needed for debug support, `GC_save_callers` and
+`GC_print_callers` must be implemented.
+
+## Disclaimer
+
+This is an initial pass at porting guidelines. Some things have no doubt been
+overlooked.