author | DJ Delorie <dj@delorie.com> | 2017-07-06 13:37:30 -0400 |
---|---|---|
committer | DJ Delorie <dj@delorie.com> | 2017-07-06 13:37:30 -0400 |
commit | d5c3fafc4307c9b7a4c7d5cb381fcdbfad340bcc (patch) | |
tree | 380cfbc329860434d6b29825bd02ba5f0c7d4b30 /manual | |
parent | 3cefdd7310a5d1fad45648d9346e47df9c185fdc (diff) | |
Add per-thread cache to malloc
* config.make.in: Enable experimental malloc option.
* configure.ac: Likewise.
* configure: Regenerate.
* manual/install.texi: Document it.
* INSTALL: Regenerate.
* malloc/Makefile: Likewise.
* malloc/malloc.c: Add per-thread cache (tcache).
(tcache_put): New.
(tcache_get): New.
(tcache_thread_freeres): New.
(tcache_init): New.
(__libc_malloc): Use cached chunks if available.
(__libc_free): Initialize tcache if needed.
(__libc_realloc): Likewise.
(__libc_calloc): Likewise.
(_int_malloc): Prefill tcache when appropriate.
(_int_free): Likewise.
(do_set_tcache_max): New.
(do_set_tcache_count): New.
(do_set_tcache_unsorted_limit): New.
* manual/probes.texi: Document new probes.
* malloc/arena.c: Add new tcache tunables.
* elf/dl-tunables.list: Likewise.
* manual/tunables.texi: Document them.
* NEWS: Mention the per-thread cache.
Diffstat (limited to 'manual')
-rw-r--r-- | manual/install.texi | 6 |
-rw-r--r-- | manual/probes.texi | 19 |
-rw-r--r-- | manual/tunables.texi | 32 |
3 files changed, 57 insertions, 0 deletions
diff --git a/manual/install.texi b/manual/install.texi
index 03eb2dd93b..b8deb9ceba 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -232,6 +232,12 @@ libnss_nisplus are not built at all.
 Use this option to enable libnsl with all depending NSS modules and
 header files.
 
+@item --disable-experimental-malloc
+By default, a per-thread cache is enabled in @code{malloc}. While
+this cache can be disabled on a per-application basis using tunables
+(set glibc.malloc.tcache_count to zero), this option can be used to
+remove it from the build completely.
+
 @item --build=@var{build-system}
 @itemx --host=@var{host-system}
 These options are for cross-compiling. If you specify both options and
diff --git a/manual/probes.texi b/manual/probes.texi
index eb91c62703..96acaed206 100644
--- a/manual/probes.texi
+++ b/manual/probes.texi
@@ -231,6 +231,25 @@ dynamic brk/mmap thresholds. Argument @var{$arg1} and @var{$arg2} are
 the adjusted mmap and trim thresholds, respectively.
 @end deftp
 
+@deftp Probe memory_tunable_tcache_max_bytes (int @var{$arg1}, int @var{$arg2})
+This probe is triggered when the @code{glibc.malloc.tcache_max}
+tunable is set. Argument @var{$arg1} is the requested value, and
+@var{$arg2} is the previous value of this tunable.
+@end deftp
+
+@deftp Probe memory_tunable_tcache_count (int @var{$arg1}, int @var{$arg2})
+This probe is triggered when the @code{glibc.malloc.tcache_count}
+tunable is set. Argument @var{$arg1} is the requested value, and
+@var{$arg2} is the previous value of this tunable.
+@end deftp
+
+@deftp Probe memory_tunable_tcache_unsorted_limit (int @var{$arg1}, int @var{$arg2})
+This probe is triggered when the
+@code{glibc.malloc.tcache_unsorted_limit} tunable is set. Argument
+@var{$arg1} is the requested value, and @var{$arg2} is the previous
+value of this tunable.
+@end deftp
+
 @node Mathematical Function Probes
 @section Mathematical Function Probes
 
diff --git a/manual/tunables.texi b/manual/tunables.texi
index 9331b03702..b16d591b90 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -193,6 +193,38 @@ systems the limit is twice the number of cores online and on 64-bit
 systems, it is 8 times the number of cores online.
 @end deftp
 
+@deftp Tunable glibc.malloc.tcache_max
+The maximum size of a request (in bytes) which may be met via the
+per-thread cache. The default (and maximum) value is 1032 bytes on
+64-bit systems and 516 bytes on 32-bit systems.
+@end deftp
+
+@deftp Tunable glibc.malloc.tcache_count
+The maximum number of chunks of each size to cache. The default is 7.
+There is no upper limit, other than available system memory. If set
+to zero, the per-thread cache is effectively disabled.
+
+The approximate maximum overhead of the per-thread cache is thus equal
+to the number of bins times the chunk count in each bin times the size
+of each chunk. With defaults, the approximate maximum overhead of the
+per-thread cache is approximately 236 KB on 64-bit systems and 118 KB
+on 32-bit systems.
+@end deftp
+
+@deftp Tunable glibc.malloc.tcache_unsorted_limit
+When the user requests memory and the request cannot be met via the
+per-thread cache, the arenas are used to meet the request. At this
+time, additional chunks will be moved from existing arena lists to
+pre-fill the corresponding cache. While copies from the fastbins,
+smallbins, and regular bins are bounded and predictable due to the bin
+sizes, copies from the unsorted bin are not bounded, and incur
+additional time penalties as they need to be sorted as they're
+scanned. To make scanning the unsorted list more predictable and
+bounded, the user may set this tunable to limit the number of chunks
+that are scanned from the unsorted list while searching for chunks to
+pre-fill the per-thread cache with. The default, or when set to zero,
+is no limit.
+
 @node Hardware Capability Tunables
 @section Hardware Capability Tunables
 @cindex hardware capability tunables
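
As a rough check of the overhead figures quoted for glibc.malloc.tcache_count in the manual/tunables.texi hunk, the sketch below applies the stated rule (number of bins times chunk count per bin times chunk size). The manual text does not give the bin count or the spacing between bin sizes, so the values used here (64 bins, sizes 16 bytes apart on 64-bit and 8 bytes apart on 32-bit, ending at the documented tcache_max defaults) are assumptions, and the function name is hypothetical.

```python
# Assumption (not stated in the manual text): each thread has 64 cache bins.
ASSUMED_BINS = 64


def tcache_max_overhead(max_request, step, count=7, bins=ASSUMED_BINS):
    """Approximate worst-case bytes held in one thread's cache.

    max_request: largest cacheable request (the tcache_max default).
    step: assumed spacing in bytes between consecutive bin sizes.
    count: chunks cached per bin (the tcache_count default is 7).
    """
    # Largest request served by each bin, counting down from max_request.
    sizes = [max_request - i * step for i in range(bins)]
    return count * sum(sizes)


# Documented defaults: 1032 bytes on 64-bit systems, 516 bytes on 32-bit.
overhead_64 = tcache_max_overhead(max_request=1032, step=16)
overhead_32 = tcache_max_overhead(max_request=516, step=8)
print(overhead_64, overhead_32)
```

Under those assumptions the totals are 236544 and 118272 bytes, which is consistent with the "approximately 236 KB" and "118 KB" quoted in the manual when KB is read as 1000 bytes. Passing count=0 gives zero, matching the statement that a zero tcache_count effectively disables the cache.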