Diffstat (limited to 'deps/jemalloc/doc')
-rw-r--r-- | deps/jemalloc/doc/jemalloc.3 | 881
-rw-r--r-- | deps/jemalloc/doc/jemalloc.html | 897
-rw-r--r-- | deps/jemalloc/doc/jemalloc.xml.in | 1207
3 files changed, 2025 insertions, 960 deletions
diff --git a/deps/jemalloc/doc/jemalloc.3 b/deps/jemalloc/doc/jemalloc.3 index d04fbb498..2e6b2c0e8 100644 --- a/deps/jemalloc/doc/jemalloc.3 +++ b/deps/jemalloc/doc/jemalloc.3 @@ -2,12 +2,12 @@ .\" Title: JEMALLOC .\" Author: Jason Evans .\" Generator: DocBook XSL Stylesheets v1.78.1 <http://docbook.sf.net/> -.\" Date: 03/31/2014 +.\" Date: 09/24/2015 .\" Manual: User Manual -.\" Source: jemalloc 3.6.0-0-g46c0af68bd248b04df75e4f92d5fb804c3d75340 +.\" Source: jemalloc 4.0.3-0-ge9192eacf8935e29fc62fddc2701f7942b1cc02c .\" Language: English .\" -.TH "JEMALLOC" "3" "03/31/2014" "jemalloc 3.6.0-0-g46c0af68bd24" "User Manual" +.TH "JEMALLOC" "3" "09/24/2015" "jemalloc 4.0.3-0-ge9192eacf893" "User Manual" .\" ----------------------------------------------------------------- .\" * Define some portability stuff .\" ----------------------------------------------------------------- @@ -31,13 +31,12 @@ jemalloc \- general purpose memory allocation functions .SH "LIBRARY" .PP -This manual describes jemalloc 3\&.6\&.0\-0\-g46c0af68bd248b04df75e4f92d5fb804c3d75340\&. More information can be found at the +This manual describes jemalloc 4\&.0\&.3\-0\-ge9192eacf8935e29fc62fddc2701f7942b1cc02c\&. More information can be found at the \m[blue]\fBjemalloc website\fR\m[]\&\s-2\u[1]\d\s+2\&. 
.SH "SYNOPSIS" .sp .ft B .nf -#include <stdlib\&.h> #include <jemalloc/jemalloc\&.h> .fi .ft @@ -65,6 +64,8 @@ This manual describes jemalloc 3\&.6\&.0\-0\-g46c0af68bd248b04df75e4f92d5fb804c3 .BI "size_t sallocx(void\ *" "ptr" ", int\ " "flags" ");" .HP \w'void\ dallocx('u .BI "void dallocx(void\ *" "ptr" ", int\ " "flags" ");" +.HP \w'void\ sdallocx('u +.BI "void sdallocx(void\ *" "ptr" ", size_t\ " "size" ", int\ " "flags" ");" .HP \w'size_t\ nallocx('u .BI "size_t nallocx(size_t\ " "size" ", int\ " "flags" ");" .HP \w'int\ mallctl('u @@ -81,17 +82,6 @@ This manual describes jemalloc 3\&.6\&.0\-0\-g46c0af68bd248b04df75e4f92d5fb804c3 .BI "void (*malloc_message)(void\ *" "cbopaque" ", const\ char\ *" "s" ");" .PP const char *\fImalloc_conf\fR; -.SS "Experimental API" -.HP \w'int\ allocm('u -.BI "int allocm(void\ **" "ptr" ", size_t\ *" "rsize" ", size_t\ " "size" ", int\ " "flags" ");" -.HP \w'int\ rallocm('u -.BI "int rallocm(void\ **" "ptr" ", size_t\ *" "rsize" ", size_t\ " "size" ", size_t\ " "extra" ", int\ " "flags" ");" -.HP \w'int\ sallocm('u -.BI "int sallocm(const\ void\ *" "ptr" ", size_t\ *" "rsize" ", int\ " "flags" ");" -.HP \w'int\ dallocm('u -.BI "int dallocm(void\ *" "ptr" ", int\ " "flags" ");" -.HP \w'int\ nallocm('u -.BI "int nallocm(size_t\ *" "rsize" ", size_t\ " "size" ", int\ " "flags" ");" .SH "DESCRIPTION" .SS "Standard API" .PP @@ -118,7 +108,7 @@ The \fBposix_memalign\fR\fB\fR function allocates \fIsize\fR -bytes of memory such that the allocation\*(Aqs base address is an even multiple of +bytes of memory such that the allocation\*(Aqs base address is a multiple of \fIalignment\fR, and returns the allocation in the value pointed to by \fIptr\fR\&. The requested \fIalignment\fR @@ -129,7 +119,7 @@ The \fBaligned_alloc\fR\fB\fR function allocates \fIsize\fR -bytes of memory such that the allocation\*(Aqs base address is an even multiple of +bytes of memory such that the allocation\*(Aqs base address is a multiple of \fIalignment\fR\&. 
The requested \fIalignment\fR must be a power of 2\&. Behavior is undefined if @@ -172,7 +162,8 @@ The \fBrallocx\fR\fB\fR, \fBxallocx\fR\fB\fR, \fBsallocx\fR\fB\fR, -\fBdallocx\fR\fB\fR, and +\fBdallocx\fR\fB\fR, +\fBsdallocx\fR\fB\fR, and \fBnallocx\fR\fB\fR functions all have a \fIflags\fR @@ -201,11 +192,32 @@ is a power of 2\&. Initialize newly allocated memory to contain zero bytes\&. In the growing reallocation case, the real size prior to reallocation defines the boundary between untouched bytes and those that are initialized to contain zero bytes\&. If this macro is absent, newly allocated memory is uninitialized\&. .RE .PP +\fBMALLOCX_TCACHE(\fR\fB\fItc\fR\fR\fB) \fR +.RS 4 +Use the thread\-specific cache (tcache) specified by the identifier +\fItc\fR, which must have been acquired via the +"tcache\&.create" +mallctl\&. This macro does not validate that +\fItc\fR +specifies a valid identifier\&. +.RE +.PP +\fBMALLOCX_TCACHE_NONE\fR +.RS 4 +Do not use a thread\-specific cache (tcache)\&. Unless +\fBMALLOCX_TCACHE(\fR\fB\fItc\fR\fR\fB)\fR +or +\fBMALLOCX_TCACHE_NONE\fR +is specified, an automatically managed tcache will be used under many circumstances\&. This macro cannot be used in the same +\fIflags\fR +argument as +\fBMALLOCX_TCACHE(\fR\fB\fItc\fR\fR\fB)\fR\&. +.RE +.PP \fBMALLOCX_ARENA(\fR\fB\fIa\fR\fR\fB) \fR .RS 4 Use the arena specified by the index -\fIa\fR -(and by necessity bypass the thread cache)\&. This macro has no effect for huge regions, nor for regions that were allocated via an arena other than the one specified\&. This macro does not validate that +\fIa\fR\&. This macro has no effect for regions that were allocated via an arena other than the one specified\&. This macro does not validate that \fIa\fR specifies an arena index in the valid range\&. .RE @@ -258,6 +270,17 @@ function causes the memory referenced by to be made available for future allocations\&. 
.PP The +\fBsdallocx\fR\fB\fR +function is an extension of +\fBdallocx\fR\fB\fR +with a +\fIsize\fR +parameter to allow the caller to pass in the allocation size as an optimization\&. The minimum valid input size is the original requested size of the allocation, and the maximum valid input size is the corresponding value returned by +\fBnallocx\fR\fB\fR +or +\fBsallocx\fR\fB\fR\&. +.PP +The \fBnallocx\fR\fB\fR function allocates no memory, but it performs the same size computation as the \fBmallocx\fR\fB\fR @@ -351,7 +374,7 @@ uses the \fBmallctl*\fR\fB\fR functions internally, so inconsistent statistics can be reported if multiple threads use these functions simultaneously\&. If \fB\-\-enable\-stats\fR -is specified during configuration, \(lqm\(rq and \(lqa\(rq can be specified to omit merged arena and per arena statistics, respectively; \(lqb\(rq and \(lql\(rq can be specified to omit per size class statistics for bins and large objects, respectively\&. Unrecognized characters are silently ignored\&. Note that thread caching may prevent some statistics from being completely up to date, since extra locking would be required to merge counters that track thread cache operations\&. +is specified during configuration, \(lqm\(rq and \(lqa\(rq can be specified to omit merged arena and per arena statistics, respectively; \(lqb\(rq, \(lql\(rq, and \(lqh\(rq can be specified to omit per size class statistics for bins, large objects, and huge objects, respectively\&. Unrecognized characters are silently ignored\&. Note that thread caching may prevent some statistics from being completely up to date, since extra locking would be required to merge counters that track thread cache operations\&. .PP The \fBmalloc_usable_size\fR\fB\fR @@ -362,126 +385,6 @@ function is not a mechanism for in\-place \fBrealloc\fR\fB\fR; rather it is provided solely as a tool for introspection purposes\&. 
Any discrepancy between the requested allocation size and the size reported by \fBmalloc_usable_size\fR\fB\fR should not be depended on, since such behavior is entirely implementation\-dependent\&. -.SS "Experimental API" -.PP -The experimental API is subject to change or removal without regard for backward compatibility\&. If -\fB\-\-disable\-experimental\fR -is specified during configuration, the experimental API is omitted\&. -.PP -The -\fBallocm\fR\fB\fR, -\fBrallocm\fR\fB\fR, -\fBsallocm\fR\fB\fR, -\fBdallocm\fR\fB\fR, and -\fBnallocm\fR\fB\fR -functions all have a -\fIflags\fR -argument that can be used to specify options\&. The functions only check the options that are contextually relevant\&. Use bitwise or (|) operations to specify one or more of the following: -.PP -\fBALLOCM_LG_ALIGN(\fR\fB\fIla\fR\fR\fB) \fR -.RS 4 -Align the memory allocation to start at an address that is a multiple of -(1 << \fIla\fR)\&. This macro does not validate that -\fIla\fR -is within the valid range\&. -.RE -.PP -\fBALLOCM_ALIGN(\fR\fB\fIa\fR\fR\fB) \fR -.RS 4 -Align the memory allocation to start at an address that is a multiple of -\fIa\fR, where -\fIa\fR -is a power of two\&. This macro does not validate that -\fIa\fR -is a power of 2\&. -.RE -.PP -\fBALLOCM_ZERO\fR -.RS 4 -Initialize newly allocated memory to contain zero bytes\&. In the growing reallocation case, the real size prior to reallocation defines the boundary between untouched bytes and those that are initialized to contain zero bytes\&. If this macro is absent, newly allocated memory is uninitialized\&. -.RE -.PP -\fBALLOCM_NO_MOVE\fR -.RS 4 -For reallocation, fail rather than moving the object\&. This constraint can apply to both growth and shrinkage\&. -.RE -.PP -\fBALLOCM_ARENA(\fR\fB\fIa\fR\fR\fB) \fR -.RS 4 -Use the arena specified by the index -\fIa\fR -(and by necessity bypass the thread cache)\&. 
This macro has no effect for huge regions, nor for regions that were allocated via an arena other than the one specified\&. This macro does not validate that -\fIa\fR -specifies an arena index in the valid range\&. -.RE -.PP -The -\fBallocm\fR\fB\fR -function allocates at least -\fIsize\fR -bytes of memory, sets -\fI*ptr\fR -to the base address of the allocation, and sets -\fI*rsize\fR -to the real size of the allocation if -\fIrsize\fR -is not -\fBNULL\fR\&. Behavior is undefined if -\fIsize\fR -is -\fB0\fR, or if request size overflows due to size class and/or alignment constraints\&. -.PP -The -\fBrallocm\fR\fB\fR -function resizes the allocation at -\fI*ptr\fR -to be at least -\fIsize\fR -bytes, sets -\fI*ptr\fR -to the base address of the allocation if it moved, and sets -\fI*rsize\fR -to the real size of the allocation if -\fIrsize\fR -is not -\fBNULL\fR\&. If -\fIextra\fR -is non\-zero, an attempt is made to resize the allocation to be at least -(\fIsize\fR + \fIextra\fR) -bytes, though inability to allocate the extra byte(s) will not by itself result in failure\&. Behavior is undefined if -\fIsize\fR -is -\fB0\fR, if request size overflows due to size class and/or alignment constraints, or if -(\fIsize\fR + \fIextra\fR > \fBSIZE_T_MAX\fR)\&. -.PP -The -\fBsallocm\fR\fB\fR -function sets -\fI*rsize\fR -to the real size of the allocation\&. -.PP -The -\fBdallocm\fR\fB\fR -function causes the memory referenced by -\fIptr\fR -to be made available for future allocations\&. -.PP -The -\fBnallocm\fR\fB\fR -function allocates no memory, but it performs the same size computation as the -\fBallocm\fR\fB\fR -function, and if -\fIrsize\fR -is not -\fBNULL\fR -it sets -\fI*rsize\fR -to the real size of the allocation that would result from the equivalent -\fBallocm\fR\fB\fR -function call\&. Behavior is undefined if -\fIsize\fR -is -\fB0\fR, or if request size overflows due to size class and/or alignment constraints\&. 
.SH "TUNING" .PP Once, when the first call is made to one of the memory allocation routines, the allocator initializes its internals based in part on various options that can be specified at compile\- or run\-time\&. @@ -519,8 +422,8 @@ options\&. Some options have boolean values (true/false), others have integer va Traditionally, allocators have used \fBsbrk\fR(2) to obtain memory, which is suboptimal for several reasons, including race conditions, increased fragmentation, and artificial limitations on maximum usable memory\&. If -\fB\-\-enable\-dss\fR -is specified during configuration, this allocator uses both +\fBsbrk\fR(2) +is supported by the operating system, this allocator uses both \fBmmap\fR(2) and \fBsbrk\fR(2), in that order of preference; otherwise only @@ -535,18 +438,29 @@ is specified during configuration, this allocator supports thread\-specific cach .PP Memory is conceptually broken into equal\-sized chunks, where the chunk size is a power of two that is greater than the page size\&. Chunks are always aligned to multiples of the chunk size\&. This alignment makes it possible to find metadata for user objects very quickly\&. .PP -User objects are broken into three categories according to size: small, large, and huge\&. Small objects are smaller than one page\&. Large objects are smaller than the chunk size\&. Huge objects are a multiple of the chunk size\&. Small and large objects are managed by arenas; huge objects are managed separately in a single data structure that is shared by all threads\&. Huge objects are used by applications infrequently enough that this single data structure is not a scalability issue\&. +User objects are broken into three categories according to size: small, large, and huge\&. Small and large objects are managed entirely by arenas; huge objects are additionally aggregated in a single data structure that is shared by all threads\&. 
Huge objects are typically used by applications infrequently enough that this single data structure is not a scalability issue\&. .PP Each chunk that is managed by an arena tracks its contents as runs of contiguous pages (unused, backing a set of small objects, or backing one large object)\&. The combination of chunk alignment and chunk page maps makes it possible to determine all metadata regarding small and large allocations in constant time\&. .PP -Small objects are managed in groups by page runs\&. Each run maintains a frontier and free list to track which regions are in use\&. Allocation requests that are no more than half the quantum (8 or 16, depending on architecture) are rounded up to the nearest power of two that is at least -sizeof(\fBdouble\fR)\&. All other small object size classes are multiples of the quantum, spaced such that internal fragmentation is limited to approximately 25% for all but the smallest size classes\&. Allocation requests that are larger than the maximum small size class, but small enough to fit in an arena\-managed chunk (see the +Small objects are managed in groups by page runs\&. Each run maintains a bitmap to track which regions are in use\&. Allocation requests that are no more than half the quantum (8 or 16, depending on architecture) are rounded up to the nearest power of two that is at least +sizeof(\fBdouble\fR)\&. All other object size classes are multiples of the quantum, spaced such that there are four size classes for each doubling in size, which limits internal fragmentation to approximately 20% for all but the smallest size classes\&. Small size classes are smaller than four times the page size, large size classes are smaller than the chunk size (see the "opt\&.lg_chunk" -option), are rounded up to the nearest run size\&. Allocation requests that are too large to fit in an arena\-managed chunk are rounded up to the nearest multiple of the chunk size\&. 
+option), and huge size classes extend from the chunk size up to one size class less than the full address space size\&. .PP Allocations are packed tightly together, which can be an issue for multi\-threaded applications\&. If you need to assure that allocations do not suffer from cacheline sharing, round your allocation requests up to the nearest multiple of the cacheline size, or specify cacheline alignment when allocating\&. .PP -Assuming 4 MiB chunks, 4 KiB pages, and a 16\-byte quantum on a 64\-bit system, the size classes in each category are as shown in +The +\fBrealloc\fR\fB\fR, +\fBrallocx\fR\fB\fR, and +\fBxallocx\fR\fB\fR +functions may resize allocations without moving them under limited circumstances\&. Unlike the +\fB*allocx\fR\fB\fR +API, the standard API does not officially round up the usable size of an allocation to the nearest size class, so technically it is necessary to call +\fBrealloc\fR\fB\fR +to grow e\&.g\&. a 9\-byte allocation to 16 bytes, or shrink a 16\-byte allocation to 9 bytes\&. Growth and shrinkage trivially succeeds in place as long as the pre\-size and post\-size both round up to the same size class\&. No other API guarantees are made regarding in\-place resizing, but the current implementation also tries to resize large and huge allocations in place, as long as the pre\-size and post\-size are both large or both huge\&. In such cases shrinkage always succeeds for large size classes, but for huge size classes the chunk allocator must support splitting (see +"arena\&.<i>\&.chunk_hooks")\&. Growth only succeeds if the trailing memory is currently available, and additionally for huge size classes the chunk allocator must support merging\&. +.PP +Assuming 2 MiB chunks, 4 KiB pages, and a 16\-byte quantum on a 64\-bit system, the size classes in each category are as shown in Table 1\&. .sp .it 1 an-trap @@ -572,8 +486,23 @@ l r l ^ r l ^ r l ^ r l +^ r l +^ r l l r l -l r l. 
+^ r l +^ r l +^ r l +^ r l +^ r l +^ r l +^ r l +l r l +^ r l +^ r l +^ r l +^ r l +^ r l +^ r l. T{ Small T}:T{ @@ -584,7 +513,7 @@ T} :T{ 16 T}:T{ -[16, 32, 48, \&.\&.\&., 128] +[16, 32, 48, 64, 80, 96, 112, 128] T} :T{ 32 @@ -609,21 +538,96 @@ T} :T{ 512 T}:T{ -[2560, 3072, 3584] +[2560, 3072, 3584, 4096] +T} +:T{ +1 KiB +T}:T{ +[5 KiB, 6 KiB, 7 KiB, 8 KiB] +T} +:T{ +2 KiB +T}:T{ +[10 KiB, 12 KiB, 14 KiB] T} T{ Large T}:T{ +2 KiB +T}:T{ +[16 KiB] +T} +:T{ 4 KiB T}:T{ -[4 KiB, 8 KiB, 12 KiB, \&.\&.\&., 4072 KiB] +[20 KiB, 24 KiB, 28 KiB, 32 KiB] +T} +:T{ +8 KiB +T}:T{ +[40 KiB, 48 KiB, 54 KiB, 64 KiB] +T} +:T{ +16 KiB +T}:T{ +[80 KiB, 96 KiB, 112 KiB, 128 KiB] +T} +:T{ +32 KiB +T}:T{ +[160 KiB, 192 KiB, 224 KiB, 256 KiB] +T} +:T{ +64 KiB +T}:T{ +[320 KiB, 384 KiB, 448 KiB, 512 KiB] +T} +:T{ +128 KiB +T}:T{ +[640 KiB, 768 KiB, 896 KiB, 1 MiB] +T} +:T{ +256 KiB +T}:T{ +[1280 KiB, 1536 KiB, 1792 KiB] T} T{ Huge T}:T{ +256 KiB +T}:T{ +[2 MiB] +T} +:T{ +512 KiB +T}:T{ +[2560 KiB, 3 MiB, 3584 KiB, 4 MiB] +T} +:T{ +1 MiB +T}:T{ +[5 MiB, 6 MiB, 7 MiB, 8 MiB] +T} +:T{ +2 MiB +T}:T{ +[10 MiB, 12 MiB, 14 MiB, 16 MiB] +T} +:T{ 4 MiB T}:T{ -[4 MiB, 8 MiB, 12 MiB, \&.\&.\&.] +[20 MiB, 24 MiB, 28 MiB, 32 MiB] +T} +:T{ +8 MiB +T}:T{ +[40 MiB, 48 MiB, 56 MiB, 64 MiB] +T} +:T{ +\&.\&.\&. +T}:T{ +\&.\&.\&. T} .TE .sp 1 @@ -660,15 +664,15 @@ If a value is passed in, refresh the data from which the functions report values, and increment the epoch\&. Return the current epoch\&. This is useful for detecting whether another thread caused a refresh\&. .RE .PP -"config\&.debug" (\fBbool\fR) r\- +"config\&.cache_oblivious" (\fBbool\fR) r\- .RS 4 -\fB\-\-enable\-debug\fR +\fB\-\-enable\-cache\-oblivious\fR was specified during build configuration\&. .RE .PP -"config\&.dss" (\fBbool\fR) r\- +"config\&.debug" (\fBbool\fR) r\- .RS 4 -\fB\-\-enable\-dss\fR +\fB\-\-enable\-debug\fR was specified during build configuration\&. 
.RE .PP @@ -684,12 +688,6 @@ was specified during build configuration\&. was specified during build configuration\&. .RE .PP -"config\&.mremap" (\fBbool\fR) r\- -.RS 4 -\fB\-\-enable\-mremap\fR -was specified during build configuration\&. -.RE -.PP "config\&.munmap" (\fBbool\fR) r\- .RS 4 \fB\-\-enable\-munmap\fR @@ -763,14 +761,16 @@ is specified during configuration, in which case it is enabled by default\&. .RS 4 dss (\fBsbrk\fR(2)) allocation precedence as related to \fBmmap\fR(2) -allocation\&. The following settings are supported: \(lqdisabled\(rq, \(lqprimary\(rq, and \(lqsecondary\(rq\&. The default is \(lqsecondary\(rq if -"config\&.dss" -is true, \(lqdisabled\(rq otherwise\&. +allocation\&. The following settings are supported if +\fBsbrk\fR(2) +is supported by the operating system: \(lqdisabled\(rq, \(lqprimary\(rq, and \(lqsecondary\(rq; otherwise only \(lqdisabled\(rq is supported\&. The default is \(lqsecondary\(rq if +\fBsbrk\fR(2) +is supported by the operating system; \(lqdisabled\(rq otherwise\&. .RE .PP "opt\&.lg_chunk" (\fBsize_t\fR) r\- .RS 4 -Virtual memory chunk size (log base 2)\&. If a chunk size outside the supported size range is specified, the size is silently clipped to the minimum/maximum supported size\&. The default chunk size is 4 MiB (2^22)\&. +Virtual memory chunk size (log base 2)\&. If a chunk size outside the supported size range is specified, the size is silently clipped to the minimum/maximum supported size\&. The default chunk size is 2 MiB (2^21)\&. .RE .PP "opt\&.narenas" (\fBsize_t\fR) r\- @@ -782,7 +782,11 @@ Maximum number of arenas to use for automatic multiplexing of threads and arenas .RS 4 Per\-arena minimum ratio (log base 2) of active to dirty pages\&. Some dirty unused pages may be allowed to accumulate, within the limit set by the ratio (or one chunk worth of dirty pages, whichever is greater), before informing the kernel about some of those pages via \fBmadvise\fR(2) -or a similar system call\&. 
This provides the kernel with sufficient information to recycle dirty pages if physical memory becomes scarce and the pages remain unused\&. The default minimum ratio is 8:1 (2^3:1); an option value of \-1 will disable dirty page purging\&. +or a similar system call\&. This provides the kernel with sufficient information to recycle dirty pages if physical memory becomes scarce and the pages remain unused\&. The default minimum ratio is 8:1 (2^3:1); an option value of \-1 will disable dirty page purging\&. See +"arenas\&.lg_dirty_mult" +and +"arena\&.<i>\&.lg_dirty_mult" +for related dynamic control options\&. .RE .PP "opt\&.stats_print" (\fBbool\fR) r\- @@ -793,16 +797,21 @@ function is called at program exit via an \fBatexit\fR(3) function\&. If \fB\-\-enable\-stats\fR -is specified during configuration, this has the potential to cause deadlock for a multi\-threaded process that exits while one or more threads are executing in the memory allocation functions\&. Therefore, this option should only be used with care; it is primarily intended as a performance tuning aid during application development\&. This option is disabled by default\&. +is specified during configuration, this has the potential to cause deadlock for a multi\-threaded process that exits while one or more threads are executing in the memory allocation functions\&. Furthermore, +\fBatexit\fR\fB\fR +may allocate memory during application initialization and then deadlock internally when jemalloc in turn calls +\fBatexit\fR\fB\fR, so this option is not univerally usable (though the application can register its own +\fBatexit\fR\fB\fR +function with equivalent functionality)\&. Therefore, this option should only be used with care; it is primarily intended as a performance tuning aid during application development\&. This option is disabled by default\&. 
.RE .PP -"opt\&.junk" (\fBbool\fR) r\- [\fB\-\-enable\-fill\fR] +"opt\&.junk" (\fBconst char *\fR) r\- [\fB\-\-enable\-fill\fR] .RS 4 -Junk filling enabled/disabled\&. If enabled, each byte of uninitialized allocated memory will be initialized to -0xa5\&. All deallocated memory will be initialized to -0x5a\&. This is intended for debugging and will impact performance negatively\&. This option is disabled by default unless +Junk filling\&. If set to "alloc", each byte of uninitialized allocated memory will be initialized to +0xa5\&. If set to "free", all deallocated memory will be initialized to +0x5a\&. If set to "true", both allocated and deallocated memory will be initialized, and if set to "false", junk filling be disabled entirely\&. This is intended for debugging and will impact performance negatively\&. This option is "false" by default unless \fB\-\-enable\-debug\fR -is specified during configuration, in which case it is enabled by default unless running inside +is specified during configuration, in which case it is "true" by default unless running inside \m[blue]\fBValgrind\fR\m[]\&\s-2\u[2]\d\s+2\&. .RE .PP @@ -825,10 +834,9 @@ option is enabled, the redzones are checked for corruption during deallocation\& "opt\&.zero" (\fBbool\fR) r\- [\fB\-\-enable\-fill\fR] .RS 4 Zero filling enabled/disabled\&. If enabled, each byte of uninitialized allocated memory will be initialized to 0\&. Note that this initialization only happens once for each byte, so -\fBrealloc\fR\fB\fR, -\fBrallocx\fR\fB\fR +\fBrealloc\fR\fB\fR and -\fBrallocm\fR\fB\fR +\fBrallocx\fR\fB\fR calls do not zero memory that was previously allocated\&. This is intended for debugging and will impact performance negatively\&. This option is disabled by default\&. .RE .PP @@ -839,12 +847,6 @@ Allocation tracing based on enabled/disabled\&. This option is disabled by default\&. 
.RE .PP -"opt\&.valgrind" (\fBbool\fR) r\- [\fB\-\-enable\-valgrind\fR] -.RS 4 -\m[blue]\fBValgrind\fR\m[]\&\s-2\u[2]\d\s+2 -support enabled/disabled\&. This option is vestigal because jemalloc auto\-detects whether it is running inside Valgrind\&. This option is disabled by default, unless running inside Valgrind\&. -.RE -.PP "opt\&.xmalloc" (\fBbool\fR) r\- [\fB\-\-enable\-xmalloc\fR] .RS 4 Abort\-on\-out\-of\-memory enabled/disabled\&. If enabled, rather than returning failure for any allocation function, display a diagnostic message on @@ -867,15 +869,15 @@ This option is disabled by default\&. .PP "opt\&.tcache" (\fBbool\fR) r\- [\fB\-\-enable\-tcache\fR] .RS 4 -Thread\-specific caching enabled/disabled\&. When there are multiple threads, each thread uses a thread\-specific cache for objects up to a certain size\&. Thread\-specific caching allows many allocations to be satisfied without performing any thread synchronization, at the cost of increased memory use\&. See the +Thread\-specific caching (tcache) enabled/disabled\&. When there are multiple threads, each thread uses a tcache for objects up to a certain size\&. Thread\-specific caching allows many allocations to be satisfied without performing any thread synchronization, at the cost of increased memory use\&. See the "opt\&.lg_tcache_max" option for related tuning information\&. This option is enabled by default unless running inside -\m[blue]\fBValgrind\fR\m[]\&\s-2\u[2]\d\s+2\&. +\m[blue]\fBValgrind\fR\m[]\&\s-2\u[2]\d\s+2, in which case it is forcefully disabled\&. .RE .PP "opt\&.lg_tcache_max" (\fBsize_t\fR) r\- [\fB\-\-enable\-tcache\fR] .RS 4 -Maximum size class (log base 2) to cache in the thread\-specific cache\&. At a minimum, all small size classes are cached, and at a maximum all large size classes are cached\&. The default maximum is 32 KiB (2^15)\&. +Maximum size class (log base 2) to cache in the thread\-specific cache (tcache)\&. 
At a minimum, all small size classes are cached, and at a maximum all large size classes are cached\&. The default maximum is 32 KiB (2^15)\&. .RE .PP "opt\&.prof" (\fBbool\fR) r\- [\fB\-\-enable\-prof\fR] @@ -892,9 +894,11 @@ option for information on interval\-triggered profile dumping, the "opt\&.prof_gdump" option for information on high\-water\-triggered profile dumping, and the "opt\&.prof_final" -option for final profile dumping\&. Profile output is compatible with the included +option for final profile dumping\&. Profile output is compatible with the +\fBjeprof\fR +command, which is based on the \fBpprof\fR -Perl script, which originates from the +that is developed as part of the \m[blue]\fBgperftools package\fR\m[]\&\s-2\u[3]\d\s+2\&. .RE .PP @@ -904,7 +908,7 @@ Filename prefix for profile dumps\&. If the prefix is set to the empty string, n jeprof\&. .RE .PP -"opt\&.prof_active" (\fBbool\fR) rw [\fB\-\-enable\-prof\fR] +"opt\&.prof_active" (\fBbool\fR) r\- [\fB\-\-enable\-prof\fR] .RS 4 Profiling activated/deactivated\&. This is a secondary control mechanism that makes it possible to start the application with profiling enabled (see the "opt\&.prof" @@ -913,7 +917,16 @@ option) but inactive, then toggle profiling at any time during program execution mallctl\&. This option is enabled by default\&. .RE .PP -"opt\&.lg_prof_sample" (\fBssize_t\fR) r\- [\fB\-\-enable\-prof\fR] +"opt\&.prof_thread_active_init" (\fBbool\fR) r\- [\fB\-\-enable\-prof\fR] +.RS 4 +Initial setting for +"thread\&.prof\&.active" +in newly created threads\&. The initial setting for newly created threads can also be changed during execution via the +"prof\&.thread_active_init" +mallctl\&. This option is enabled by default\&. +.RE +.PP +"opt\&.lg_prof_sample" (\fBsize_t\fR) r\- [\fB\-\-enable\-prof\fR] .RS 4 Average interval (log base 2) between allocation samples, as measured in bytes of allocation activity\&. 
Increasing the sampling interval decreases profile fidelity, but also decreases the computational overhead\&. The default sample interval is 512 KiB (2^19 B)\&. .RE @@ -935,12 +948,8 @@ option\&. By default, interval\-triggered profile dumping is disabled (encoded a .PP "opt\&.prof_gdump" (\fBbool\fR) r\- [\fB\-\-enable\-prof\fR] .RS 4 -Trigger a memory profile dump every time the total virtual memory exceeds the previous maximum\&. Profiles are dumped to files named according to the pattern -<prefix>\&.<pid>\&.<seq>\&.u<useq>\&.heap, where -<prefix> -is controlled by the -"opt\&.prof_prefix" -option\&. This option is disabled by default\&. +Set the initial state of +"prof\&.gdump", which when enabled triggers a memory profile dump every time the total virtual memory exceeds the previous maximum\&. This option is disabled by default\&. .RE .PP "opt\&.prof_final" (\fBbool\fR) r\- [\fB\-\-enable\-prof\fR] @@ -952,7 +961,12 @@ function to dump final memory usage to a file named according to the pattern <prefix> is controlled by the "opt\&.prof_prefix" -option\&. This option is enabled by default\&. +option\&. Note that +\fBatexit\fR\fB\fR +may allocate memory during application initialization and then deadlock internally when jemalloc in turn calls +\fBatexit\fR\fB\fR, so this option is not univerally usable (though the application can register its own +\fBatexit\fR\fB\fR +function with equivalent functionality)\&. This option is disabled by default\&. .RE .PP "opt\&.prof_leak" (\fBbool\fR) r\- [\fB\-\-enable\-prof\fR] @@ -1007,10 +1021,42 @@ Enable/disable calling thread\*(Aqs tcache\&. The tcache is implicitly flushed a .PP "thread\&.tcache\&.flush" (\fBvoid\fR) \-\- [\fB\-\-enable\-tcache\fR] .RS 4 -Flush calling thread\*(Aqs tcache\&. This interface releases all cached objects and internal data structures associated with the calling thread\*(Aqs thread\-specific cache\&. 
Ordinarily, this interface need not be called, since automatic periodic incremental garbage collection occurs, and the thread cache is automatically discarded when a thread exits\&. However, garbage collection is triggered by allocation activity, so it is possible for a thread that stops allocating/deallocating to retain its cache indefinitely, in which case the developer may find manual flushing useful\&. +Flush calling thread\*(Aqs thread\-specific cache (tcache)\&. This interface releases all cached objects and internal data structures associated with the calling thread\*(Aqs tcache\&. Ordinarily, this interface need not be called, since automatic periodic incremental garbage collection occurs, and the thread cache is automatically discarded when a thread exits\&. However, garbage collection is triggered by allocation activity, so it is possible for a thread that stops allocating/deallocating to retain its cache indefinitely, in which case the developer may find manual flushing useful\&. +.RE +.PP +"thread\&.prof\&.name" (\fBconst char *\fR) r\- or \-w [\fB\-\-enable\-prof\fR] +.RS 4 +Get/set the descriptive name associated with the calling thread in memory profile dumps\&. An internal copy of the name string is created, so the input string need not be maintained after this interface completes execution\&. The output string of this interface should be copied for non\-ephemeral uses, because multiple implementation details can cause asynchronous string deallocation\&. Furthermore, each invocation of this interface can only read or write; simultaneous read/write is not supported due to string lifetime limitations\&. The name string must nil\-terminated and comprised only of characters in the sets recognized by +\fBisgraph\fR(3) +and +\fBisblank\fR(3)\&. +.RE +.PP +"thread\&.prof\&.active" (\fBbool\fR) rw [\fB\-\-enable\-prof\fR] +.RS 4 +Control whether sampling is currently active for the calling thread\&. 
This is an activation mechanism in addition to +"prof\&.active"; both must be active for the calling thread to sample\&. This flag is enabled by default\&. +.RE +.PP +"tcache\&.create" (\fBunsigned\fR) r\- [\fB\-\-enable\-tcache\fR] +.RS 4 +Create an explicit thread\-specific cache (tcache) and return an identifier that can be passed to the +\fBMALLOCX_TCACHE(\fR\fB\fItc\fR\fR\fB)\fR +macro to explicitly use the specified cache rather than the automatically managed one that is used by default\&. Each explicit cache can be used by only one thread at a time; the application must assure that this constraint holds\&. +.RE +.PP +"tcache\&.flush" (\fBunsigned\fR) \-w [\fB\-\-enable\-tcache\fR] +.RS 4 +Flush the specified thread\-specific cache (tcache)\&. The same considerations apply to this interface as to +"thread\&.tcache\&.flush", except that the tcache will never be automatically discarded\&. .RE .PP -"arena\&.<i>\&.purge" (\fBunsigned\fR) \-\- +"tcache\&.destroy" (\fBunsigned\fR) \-w [\fB\-\-enable\-tcache\fR] +.RS 4 +Flush the specified thread\-specific cache (tcache) and make the identifier available for use during a future tcache creation\&. +.RE +.PP +"arena\&.<i>\&.purge" (\fBvoid\fR) \-\- .RS 4 Purge unused dirty pages for arena <i>, or for all arenas if <i> equals "arenas\&.narenas"\&. @@ -1019,11 +1065,237 @@ Purge unused dirty pages for arena <i>, or for all arenas if <i> equals "arena\&.<i>\&.dss" (\fBconst char *\fR) rw .RS 4 Set the precedence of dss allocation as related to mmap allocation for arena <i>, or for all arenas if <i> equals -"arenas\&.narenas"\&. Note that even during huge allocation this setting is read from the arena that would be chosen for small or large allocation so that applications can depend on consistent dss versus mmap allocation regardless of allocation size\&. See +"arenas\&.narenas"\&. See "opt\&.dss" for supported settings\&. 
.RE .PP +"arena\&.<i>\&.lg_dirty_mult" (\fBssize_t\fR) rw +.RS 4 +Current per\-arena minimum ratio (log base 2) of active to dirty pages for arena <i>\&. Each time this interface is set and the ratio is increased, pages are synchronously purged as necessary to impose the new ratio\&. See +"opt\&.lg_dirty_mult" +for additional information\&. +.RE +.PP +"arena\&.<i>\&.chunk_hooks" (\fBchunk_hooks_t\fR) rw +.RS 4 +Get or set the chunk management hook functions for arena <i>\&. The functions must be capable of operating on all extant chunks associated with arena <i>, usually by passing unknown chunks to the replaced functions\&. In practice, it is feasible to control allocation for arenas created via +"arenas\&.extend" +such that all chunks originate from an application\-supplied chunk allocator (by setting custom chunk hook functions just after arena creation), but the automatically created arenas may have already created chunks prior to the application having an opportunity to take over chunk allocation\&. +.sp +.if n \{\ +.RS 4 +.\} +.nf +typedef struct { + chunk_alloc_t *alloc; + chunk_dalloc_t *dalloc; + chunk_commit_t *commit; + chunk_decommit_t *decommit; + chunk_purge_t *purge; + chunk_split_t *split; + chunk_merge_t *merge; +} chunk_hooks_t; +.fi +.if n \{\ +.RE +.\} +.sp +The +\fBchunk_hooks_t\fR +structure comprises function pointers which are described individually below\&. jemalloc uses these functions to manage chunk lifetime, which starts off with allocation of mapped committed memory, in the simplest case followed by deallocation\&. However, there are performance and platform reasons to retain chunks for later reuse\&. Cleanup attempts cascade from deallocation to decommit to purging, which gives the chunk management functions opportunities to reject the most permanent cleanup operations in favor of less permanent (and often less costly) operations\&. 
The chunk splitting and merging operations can also be opted out of, but this is mainly intended to support platforms on which virtual memory mappings provided by the operating system kernel do not automatically coalesce and split, e\&.g\&. Windows\&. +.HP \w'typedef\ void\ *(chunk_alloc_t)('u +.BI "typedef void *(chunk_alloc_t)(void\ *" "chunk" ", size_t\ " "size" ", size_t\ " "alignment" ", bool\ *" "zero" ", bool\ *" "commit" ", unsigned\ " "arena_ind" ");" +.sp +.if n \{\ +.RS 4 +.\} +.nf +.fi +.if n \{\ +.RE +.\} +.sp +A chunk allocation function conforms to the +\fBchunk_alloc_t\fR +type and upon success returns a pointer to +\fIsize\fR +bytes of mapped memory on behalf of arena +\fIarena_ind\fR +such that the chunk\*(Aqs base address is a multiple of +\fIalignment\fR, as well as setting +\fI*zero\fR +to indicate whether the chunk is zeroed and +\fI*commit\fR +to indicate whether the chunk is committed\&. Upon error the function returns +\fBNULL\fR +and leaves +\fI*zero\fR +and +\fI*commit\fR +unmodified\&. The +\fIsize\fR +parameter is always a multiple of the chunk size\&. The +\fIalignment\fR +parameter is always a power of two at least as large as the chunk size\&. Zeroing is mandatory if +\fI*zero\fR +is true upon function entry\&. Committing is mandatory if +\fI*commit\fR +is true upon function entry\&. If +\fIchunk\fR +is not +\fBNULL\fR, the returned pointer must be +\fIchunk\fR +on success or +\fBNULL\fR +on error\&. Committed memory may be committed in absolute terms as on a system that does not overcommit, or in implicit terms as on a system that overcommits and satisfies physical memory needs on demand via soft page faults\&. Note that replacing the default chunk allocation function makes the arena\*(Aqs +"arena\&.<i>\&.dss" +setting irrelevant\&. 
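The chunk allocation contract described above can be sketched as follows. This is a minimal illustration, not jemalloc's own implementation: the `chunk_alloc_t` typedef is copied locally from the manual text so that the sketch builds without jemalloc headers, `example_chunk_alloc` is a hypothetical name, and the use of anonymous `mmap` with over-allocation and trimming is an assumption about one reasonable way to meet the alignment requirement on a POSIX system.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Local copy of the hook type from the manual (assumption: no jemalloc headers). */
typedef void *(chunk_alloc_t)(void *chunk, size_t size, size_t alignment,
    bool *zero, bool *commit, unsigned arena_ind);

/* Declaring through the typedef makes the compiler check conformance. */
static chunk_alloc_t example_chunk_alloc;

static void *
example_chunk_alloc(void *chunk, size_t size, size_t alignment, bool *zero,
    bool *commit, unsigned arena_ind)
{
	(void)arena_ind;
	/* The manual guarantees a power-of-two alignment. */
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

	/* This sketch cannot honor a specific address, so it reports an error,
	 * leaving *zero and *commit unmodified as the contract requires. */
	if (chunk != NULL)
		return NULL;

	/* Over-allocate so an aligned range of size bytes must exist inside. */
	size_t span = size + alignment;
	if (span < size)
		return NULL; /* size_t overflow */
	char *raw = mmap(NULL, span, PROT_READ | PROT_WRITE,
	    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;

	/* Round up to the alignment, then unmap the unused head and tail. */
	uintptr_t base = ((uintptr_t)raw + alignment - 1) &
	    ~((uintptr_t)alignment - 1);
	size_t lead = base - (uintptr_t)raw;
	size_t trail = span - lead - size;
	if (lead != 0)
		munmap(raw, lead);
	if (trail != 0)
		munmap((char *)base + size, trail);

	/* Fresh anonymous mappings read as zero and are usable immediately. */
	*zero = true;
	*commit = true;
	return (void *)base;
}
```

Because the mapping is always fresh anonymous memory, the sketch can unconditionally report the chunk as zeroed and committed, which satisfies the mandatory cases where `*zero` or `*commit` is true on entry. Installing such a function through "arena.<i>.chunk_hooks" is not shown here.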
+.HP \w'typedef\ bool\ (chunk_dalloc_t)('u +.BI "typedef bool (chunk_dalloc_t)(void\ *" "chunk" ", size_t\ " "size" ", bool\ " "committed" ", unsigned\ " "arena_ind" ");" +.sp +.if n \{\ +.RS 4 +.\} +.nf +.fi +.if n \{\ +.RE +.\} +.sp +A chunk deallocation function conforms to the +\fBchunk_dalloc_t\fR +type and deallocates a +\fIchunk\fR +of given +\fIsize\fR +with +\fIcommitted\fR/decommitted memory as indicated, on behalf of arena +\fIarena_ind\fR, returning false upon success\&. If the function returns true, this indicates opt\-out from deallocation; the virtual memory mapping associated with the chunk remains mapped, in the same commit state, and available for future use, in which case it will be automatically retained for later reuse\&. +.HP \w'typedef\ bool\ (chunk_commit_t)('u +.BI "typedef bool (chunk_commit_t)(void\ *" "chunk" ", size_t\ " "size" ", size_t\ " "offset" ", size_t\ " "length" ", unsigned\ " "arena_ind" ");" +.sp +.if n \{\ +.RS 4 +.\} +.nf +.fi +.if n \{\ +.RE +.\} +.sp +A chunk commit function conforms to the +\fBchunk_commit_t\fR +type and commits zeroed physical memory to back pages within a +\fIchunk\fR +of given +\fIsize\fR +at +\fIoffset\fR +bytes, extending for +\fIlength\fR +on behalf of arena +\fIarena_ind\fR, returning false upon success\&. Committed memory may be committed in absolute terms as on a system that does not overcommit, or in implicit terms as on a system that overcommits and satisfies physical memory needs on demand via soft page faults\&. If the function returns true, this indicates insufficient physical memory to satisfy the request\&. 
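The return-value convention for the deallocation and commit hooks (false means the operation was performed, true means failure or opt-out) can be illustrated with the sketch below. The typedefs are copied locally from the manual text, the `example_*` names are hypothetical, and the commit hook assumes a system that overcommits, so that anonymous private mappings are backed on demand and there is nothing explicit to do.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Local copies of the hook types from the manual (assumption: no jemalloc headers). */
typedef bool (chunk_dalloc_t)(void *chunk, size_t size, bool committed,
    unsigned arena_ind);
typedef bool (chunk_commit_t)(void *chunk, size_t size, size_t offset,
    size_t length, unsigned arena_ind);

static chunk_dalloc_t example_chunk_dalloc;
static chunk_commit_t example_chunk_commit;

static bool
example_chunk_dalloc(void *chunk, size_t size, bool committed,
    unsigned arena_ind)
{
	(void)committed;
	(void)arena_ind;
	/* false: the mapping was released.
	 * true: munmap failed, so the chunk is reported as still mapped. */
	return munmap(chunk, size) != 0;
}

static bool
example_chunk_commit(void *chunk, size_t size, size_t offset, size_t length,
    unsigned arena_ind)
{
	(void)chunk;
	(void)arena_ind;
	assert(offset + length <= size);
	/* Assumption: the system overcommits, so pages are committed
	 * implicitly on first touch and no explicit work is needed. */
	return false;
}
```

On a platform that does not overcommit, the commit hook would need to perform real work and return true when physical memory cannot be reserved; that case is outside this sketch.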
+.HP \w'typedef\ bool\ (chunk_decommit_t)('u +.BI "typedef bool (chunk_decommit_t)(void\ *" "chunk" ", size_t\ " "size" ", size_t\ " "offset" ", size_t\ " "length" ", unsigned\ " "arena_ind" ");" +.sp +.if n \{\ +.RS 4 +.\} +.nf +.fi +.if n \{\ +.RE +.\} +.sp +A chunk decommit function conforms to the +\fBchunk_decommit_t\fR +type and decommits any physical memory that is backing pages within a +\fIchunk\fR +of given +\fIsize\fR +at +\fIoffset\fR +bytes, extending for +\fIlength\fR +on behalf of arena +\fIarena_ind\fR, returning false upon success, in which case the pages will be committed via the chunk commit function before being reused\&. If the function returns true, this indicates opt\-out from decommit; the memory remains committed and available for future use, in which case it will be automatically retained for later reuse\&. +.HP \w'typedef\ bool\ (chunk_purge_t)('u +.BI "typedef bool (chunk_purge_t)(void\ *" "chunk" ", size_t" "size" ", size_t\ " "offset" ", size_t\ " "length" ", unsigned\ " "arena_ind" ");" +.sp +.if n \{\ +.RS 4 +.\} +.nf +.fi +.if n \{\ +.RE +.\} +.sp +A chunk purge function conforms to the +\fBchunk_purge_t\fR +type and optionally discards physical pages within the virtual memory mapping associated with +\fIchunk\fR +of given +\fIsize\fR +at +\fIoffset\fR +bytes, extending for +\fIlength\fR +on behalf of arena +\fIarena_ind\fR, returning false if pages within the purged virtual memory range will be zero\-filled the next time they are accessed\&. 
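The decommit and purge hooks described above differ in what their return values promise. A minimal sketch follows, with the typedefs copied locally from the manual text and hypothetical `example_*` names. The purge hook relies on a Linux-specific assumption: `madvise` with `MADV_DONTNEED` on a private anonymous mapping causes the affected pages to read as zero on the next access. Other platforms may not give that guarantee, in which case returning false would be wrong.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Local copies of the hook types from the manual (assumption: no jemalloc headers). */
typedef bool (chunk_decommit_t)(void *chunk, size_t size, size_t offset,
    size_t length, unsigned arena_ind);
typedef bool (chunk_purge_t)(void *chunk, size_t size, size_t offset,
    size_t length, unsigned arena_ind);

static chunk_decommit_t example_chunk_decommit;
static chunk_purge_t example_chunk_purge;

static bool
example_chunk_decommit(void *chunk, size_t size, size_t offset, size_t length,
    unsigned arena_ind)
{
	(void)chunk;
	(void)arena_ind;
	assert(offset + length <= size);
	/* Opt out: the memory stays committed and is retained for reuse. */
	return true;
}

static bool
example_chunk_purge(void *chunk, size_t size, size_t offset, size_t length,
    unsigned arena_ind)
{
	(void)arena_ind;
	assert(offset + length <= size);
	/* Linux assumption: after a successful MADV_DONTNEED, private
	 * anonymous pages are zero-filled on next access, so return false.
	 * If madvise fails, no such promise can be made, so return true. */
	return madvise((char *)chunk + offset, length, MADV_DONTNEED) != 0;
}
```

Opting out of decommit while still purging leaves the cascade described earlier to fall through from deallocation to decommit to purging, ending at the cheapest operation this sketch supports.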
+.HP \w'typedef\ bool\ (chunk_split_t)('u +.BI "typedef bool (chunk_split_t)(void\ *" "chunk" ", size_t\ " "size" ", size_t\ " "size_a" ", size_t\ " "size_b" ", bool\ " "committed" ", unsigned\ " "arena_ind" ");" +.sp +.if n \{\ +.RS 4 +.\} +.nf +.fi +.if n \{\ +.RE +.\} +.sp +A chunk split function conforms to the +\fBchunk_split_t\fR +type and optionally splits +\fIchunk\fR +of given +\fIsize\fR +into two adjacent chunks, the first of +\fIsize_a\fR +bytes, and the second of +\fIsize_b\fR +bytes, operating on +\fIcommitted\fR/decommitted memory as indicated, on behalf of arena +\fIarena_ind\fR, returning false upon success\&. If the function returns true, this indicates that the chunk remains unsplit and therefore should continue to be operated on as a whole\&. +.HP \w'typedef\ bool\ (chunk_merge_t)('u +.BI "typedef bool (chunk_merge_t)(void\ *" "chunk_a" ", size_t\ " "size_a" ", void\ *" "chunk_b" ", size_t\ " "size_b" ", bool\ " "committed" ", unsigned\ " "arena_ind" ");" +.sp +.if n \{\ +.RS 4 +.\} +.nf +.fi +.if n \{\ +.RE +.\} +.sp +A chunk merge function conforms to the +\fBchunk_merge_t\fR +type and optionally merges adjacent chunks, +\fIchunk_a\fR +of given +\fIsize_a\fR +and +\fIchunk_b\fR +of given +\fIsize_b\fR +into one contiguous chunk, operating on +\fIcommitted\fR/decommitted memory as indicated, on behalf of arena +\fIarena_ind\fR, returning false upon success\&. If the function returns true, this indicates that the chunks remain distinct mappings and therefore should continue to be operated on independently\&. +.RE +.PP "arenas\&.narenas" (\fBunsigned\fR) r\- .RS 4 Current limit on number of arenas\&. @@ -1036,6 +1308,15 @@ An array of booleans\&. Each boolean indicates whether the corresponding arena is initialized\&. .RE .PP +"arenas\&.lg_dirty_mult" (\fBssize_t\fR) rw +.RS 4 +Current default per\-arena minimum ratio (log base 2) of active to dirty pages, used to initialize +"arena\&.<i>\&.lg_dirty_mult" +during arena creation\&. 
See +"opt\&.lg_dirty_mult" +for additional information\&. +.RE +.PP "arenas\&.quantum" (\fBsize_t\fR) r\- .RS 4 Quantum size\&. @@ -1076,7 +1357,7 @@ Number of regions per page run\&. Number of bytes per page run\&. .RE .PP -"arenas\&.nlruns" (\fBsize_t\fR) r\- +"arenas\&.nlruns" (\fBunsigned\fR) r\- .RS 4 Total number of large size classes\&. .RE @@ -1086,9 +1367,14 @@ Total number of large size classes\&. Maximum size supported by this large size class\&. .RE .PP -"arenas\&.purge" (\fBunsigned\fR) \-w +"arenas\&.nhchunks" (\fBunsigned\fR) r\- +.RS 4 +Total number of huge size classes\&. +.RE +.PP +"arenas\&.hchunk\&.<i>\&.size" (\fBsize_t\fR) r\- .RS 4 -Purge unused dirty pages for the specified arena, or for all arenas if none is specified\&. +Maximum size supported by this huge size class\&. .RE .PP "arenas\&.extend" (\fBunsigned\fR) r\- @@ -1096,11 +1382,22 @@ Purge unused dirty pages for the specified arena, or for all arenas if none is s Extend the array of arenas by appending a new arena, and returning the new arena index\&. .RE .PP +"prof\&.thread_active_init" (\fBbool\fR) rw [\fB\-\-enable\-prof\fR] +.RS 4 +Control the initial setting for +"thread\&.prof\&.active" +in newly created threads\&. See the +"opt\&.prof_thread_active_init" +option for additional information\&. +.RE +.PP "prof\&.active" (\fBbool\fR) rw [\fB\-\-enable\-prof\fR] .RS 4 Control whether sampling is currently active\&. See the "opt\&.prof_active" -option for additional information\&. +option for additional information, as well as the interrelated +"thread\&.prof\&.active" +mallctl\&. .RE .PP "prof\&.dump" (\fBconst char *\fR) \-w [\fB\-\-enable\-prof\fR] @@ -1113,6 +1410,30 @@ is controlled by the option\&. .RE .PP +"prof\&.gdump" (\fBbool\fR) rw [\fB\-\-enable\-prof\fR] +.RS 4 +When enabled, trigger a memory profile dump every time the total virtual memory exceeds the previous maximum\&. 
Profiles are dumped to files named according to the pattern +<prefix>\&.<pid>\&.<seq>\&.u<useq>\&.heap, where +<prefix> +is controlled by the +"opt\&.prof_prefix" +option\&. +.RE +.PP +"prof\&.reset" (\fBsize_t\fR) \-w [\fB\-\-enable\-prof\fR] +.RS 4 +Reset all memory profile statistics, and optionally update the sample rate (see +"opt\&.lg_prof_sample" +and +"prof\&.lg_sample")\&. +.RE +.PP +"prof\&.lg_sample" (\fBsize_t\fR) r\- [\fB\-\-enable\-prof\fR] +.RS 4 +Get the current sample rate (see +"opt\&.lg_prof_sample")\&. +.RE +.PP "prof\&.interval" (\fBuint64_t\fR) r\- [\fB\-\-enable\-prof\fR] .RS 4 Average number of bytes allocated between inverval\-based profile dumps\&. See the @@ -1122,7 +1443,7 @@ option for additional information\&. .PP "stats\&.cactive" (\fBsize_t *\fR) r\- [\fB\-\-enable\-stats\fR] .RS 4 -Pointer to a counter that contains an approximate count of the current number of bytes in active pages\&. The estimate may be high, but never low, because each arena rounds up to the nearest multiple of the chunk size when computing its contribution to the counter\&. Note that the +Pointer to a counter that contains an approximate count of the current number of bytes in active pages\&. The estimate may be high, but never low, because each arena rounds up when computing its contribution to the counter\&. Note that the "epoch" mallctl has no bearing on this counter\&. Furthermore, counter consistency is maintained via atomic operations, so it is necessary to use an atomic operation in order to guarantee a consistent read when dereferencing the pointer\&. .RE @@ -1136,44 +1457,27 @@ Total number of bytes allocated by the application\&. .RS 4 Total number of bytes in active pages allocated by the application\&. This is a multiple of the page size, and greater than or equal to "stats\&.allocated"\&. This does not include -"stats\&.arenas\&.<i>\&.pdirty" -and pages entirely devoted to allocator metadata\&. 
-.RE -.PP -"stats\&.mapped" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] -.RS 4 -Total number of bytes in chunks mapped on behalf of the application\&. This is a multiple of the chunk size, and is at least as large as -"stats\&.active"\&. This does not include inactive chunks\&. -.RE -.PP -"stats\&.chunks\&.current" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] -.RS 4 -Total number of chunks actively mapped on behalf of the application\&. This does not include inactive chunks\&. -.RE -.PP -"stats\&.chunks\&.total" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] -.RS 4 -Cumulative number of chunks allocated\&. -.RE -.PP -"stats\&.chunks\&.high" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] -.RS 4 -Maximum number of active chunks at any time thus far\&. +"stats\&.arenas\&.<i>\&.pdirty", nor pages entirely devoted to allocator metadata\&. .RE .PP -"stats\&.huge\&.allocated" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] +"stats\&.metadata" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] .RS 4 -Number of bytes currently allocated by huge objects\&. +Total number of bytes dedicated to metadata, which comprise base allocations used for bootstrap\-sensitive internal allocator data structures, arena chunk headers (see +"stats\&.arenas\&.<i>\&.metadata\&.mapped"), and internal allocations (see +"stats\&.arenas\&.<i>\&.metadata\&.allocated")\&. .RE .PP -"stats\&.huge\&.nmalloc" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] +"stats\&.resident" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] .RS 4 -Cumulative number of huge allocation requests\&. +Maximum number of bytes in physically resident data pages mapped by the allocator, comprising all pages dedicated to allocator metadata, pages backing active allocations, and unused dirty pages\&. This is a maximum rather than precise because pages may not actually be physically resident if they correspond to demand\-zeroed virtual memory that has not yet been touched\&. 
This is a multiple of the page size, and is larger than +"stats\&.active"\&. .RE .PP -"stats\&.huge\&.ndalloc" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] +"stats\&.mapped" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] .RS 4 -Cumulative number of huge deallocation requests\&. +Total number of bytes in active chunks mapped by the allocator\&. This is a multiple of the chunk size, and is larger than +"stats\&.active"\&. This does not include inactive chunks, even those that contain unused dirty pages, which means that there is no strict ordering between this and +"stats\&.resident"\&. .RE .PP "stats\&.arenas\&.<i>\&.dss" (\fBconst char *\fR) r\- @@ -1185,6 +1489,13 @@ allocation\&. See for details\&. .RE .PP +"stats\&.arenas\&.<i>\&.lg_dirty_mult" (\fBssize_t\fR) r\- +.RS 4 +Minimum ratio (log base 2) of active to dirty pages\&. See +"opt\&.lg_dirty_mult" +for details\&. +.RE +.PP "stats\&.arenas\&.<i>\&.nthreads" (\fBunsigned\fR) r\- .RS 4 Number of threads currently assigned to arena\&. @@ -1207,6 +1518,24 @@ or similar has not been called\&. Number of mapped bytes\&. .RE .PP +"stats\&.arenas\&.<i>\&.metadata\&.mapped" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] +.RS 4 +Number of mapped bytes in arena chunk headers, which track the states of the non\-metadata pages\&. +.RE +.PP +"stats\&.arenas\&.<i>\&.metadata\&.allocated" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] +.RS 4 +Number of bytes dedicated to internal allocations\&. Internal allocations differ from application\-originated allocations in that they are for internal use, and that they are omitted from heap profiles\&. This statistic is reported separately from +"stats\&.metadata" +and +"stats\&.arenas\&.<i>\&.metadata\&.mapped" +because it overlaps with e\&.g\&. the +"stats\&.allocated" +and +"stats\&.active" +statistics, whereas the other metadata statistics do not\&. 
+.RE +.PP "stats\&.arenas\&.<i>\&.npurge" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] .RS 4 Number of dirty page purge sweeps performed\&. @@ -1264,9 +1593,24 @@ Cumulative number of large deallocation requests served directly by the arena\&. Cumulative number of large allocation requests\&. .RE .PP -"stats\&.arenas\&.<i>\&.bins\&.<j>\&.allocated" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] +"stats\&.arenas\&.<i>\&.huge\&.allocated" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] +.RS 4 +Number of bytes currently allocated by huge objects\&. +.RE +.PP +"stats\&.arenas\&.<i>\&.huge\&.nmalloc" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] +.RS 4 +Cumulative number of huge allocation requests served directly by the arena\&. +.RE +.PP +"stats\&.arenas\&.<i>\&.huge\&.ndalloc" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] .RS 4 -Current number of bytes allocated by bin\&. +Cumulative number of huge deallocation requests served directly by the arena\&. +.RE +.PP +"stats\&.arenas\&.<i>\&.huge\&.nrequests" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] +.RS 4 +Cumulative number of huge allocation requests\&. .RE .PP "stats\&.arenas\&.<i>\&.bins\&.<j>\&.nmalloc" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] @@ -1284,6 +1628,11 @@ Cumulative number of allocations returned to bin\&. Cumulative number of allocation requests\&. .RE .PP +"stats\&.arenas\&.<i>\&.bins\&.<j>\&.curregs" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] +.RS 4 +Current number of regions for this size class\&. +.RE +.PP "stats\&.arenas\&.<i>\&.bins\&.<j>\&.nfills" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR \fB\-\-enable\-tcache\fR] .RS 4 Cumulative number of tcache fills\&. @@ -1328,6 +1677,26 @@ Cumulative number of allocation requests for this size class\&. .RS 4 Current number of runs for this size class\&. 
.RE +.PP +"stats\&.arenas\&.<i>\&.hchunks\&.<j>\&.nmalloc" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] +.RS 4 +Cumulative number of allocation requests for this size class served directly by the arena\&. +.RE +.PP +"stats\&.arenas\&.<i>\&.hchunks\&.<j>\&.ndalloc" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] +.RS 4 +Cumulative number of deallocation requests for this size class served directly by the arena\&. +.RE +.PP +"stats\&.arenas\&.<i>\&.hchunks\&.<j>\&.nrequests" (\fBuint64_t\fR) r\- [\fB\-\-enable\-stats\fR] +.RS 4 +Cumulative number of allocation requests for this size class\&. +.RE +.PP +"stats\&.arenas\&.<i>\&.hchunks\&.<j>\&.curhchunks" (\fBsize_t\fR) r\- [\fB\-\-enable\-stats\fR] +.RS 4 +Current number of huge allocations for this size class\&. +.RE .SH "DEBUGGING MALLOC PROBLEMS" .PP When debugging, it is a good idea to configure/build jemalloc with the @@ -1513,44 +1882,6 @@ The \fBmalloc_usable_size\fR\fB\fR function returns the usable size of the allocation pointed to by \fIptr\fR\&. -.SS "Experimental API" -.PP -The -\fBallocm\fR\fB\fR, -\fBrallocm\fR\fB\fR, -\fBsallocm\fR\fB\fR, -\fBdallocm\fR\fB\fR, and -\fBnallocm\fR\fB\fR -functions return -\fBALLOCM_SUCCESS\fR -on success; otherwise they return an error value\&. The -\fBallocm\fR\fB\fR, -\fBrallocm\fR\fB\fR, and -\fBnallocm\fR\fB\fR -functions will fail if: -.PP -ALLOCM_ERR_OOM -.RS 4 -Out of memory\&. Insufficient contiguous memory was available to service the allocation request\&. The -\fBallocm\fR\fB\fR -function additionally sets -\fI*ptr\fR -to -\fBNULL\fR, whereas the -\fBrallocm\fR\fB\fR -function leaves -\fB*ptr\fR -unmodified\&. -.RE -The -\fBrallocm\fR\fB\fR -function will also fail if: -.PP -ALLOCM_ERR_NOT_MOVED -.RS 4 -\fBALLOCM_NO_MOVE\fR -was specified, but the reallocation request could not be serviced without moving the object\&. 
-.RE .SH "ENVIRONMENT" .PP The following environment variable affects the execution of the allocation functions: diff --git a/deps/jemalloc/doc/jemalloc.html b/deps/jemalloc/doc/jemalloc.html index 5a9fc7789..7b8e2be8c 100644 --- a/deps/jemalloc/doc/jemalloc.html +++ b/deps/jemalloc/doc/jemalloc.html @@ -1,8 +1,7 @@ -<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>JEMALLOC</title><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="refentry"><a name="idm316394519664"></a><div class="titlepage"></div><div class="refnamediv"><h2>Name</h2><p>jemalloc — general purpose memory allocation functions</p></div><div class="refsect1"><a name="library"></a><h2>LIBRARY</h2><p>This manual describes jemalloc 3.6.0-0-g46c0af68bd248b04df75e4f92d5fb804c3d75340. More information - can be found at the <a class="ulink" href="http://www.canonware.com/jemalloc/" target="_top">jemalloc website</a>.</p></div><div class="refsynopsisdiv"><h2>SYNOPSIS</h2><div class="funcsynopsis"><pre class="funcsynopsisinfo">#include <<code class="filename">stdlib.h</code>> -#include <<code class="filename">jemalloc/jemalloc.h</code>></pre><div class="refsect2"><a name="idm316394002288"></a><h3>Standard API</h3><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">malloc</b>(</code></td><td>size_t <var class="pdparam">size</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">calloc</b>(</code></td><td>size_t <var class="pdparam">number</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var><code>)</code>;</td></tr></table><div 
class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">posix_memalign</b>(</code></td><td>void **<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">alignment</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">aligned_alloc</b>(</code></td><td>size_t <var class="pdparam">alignment</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">realloc</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void <b class="fsfunc">free</b>(</code></td><td>void *<var class="pdparam">ptr</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div></div><div class="refsect2"><a name="idm316393986160"></a><h3>Non-standard API</h3><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">mallocx</b>(</code></td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div 
class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">rallocx</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">size_t <b class="fsfunc">xallocx</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">extra</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">size_t <b class="fsfunc">sallocx</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void <b class="fsfunc">dallocx</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">size_t <b class="fsfunc">nallocx</b>(</code></td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>int 
<var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">mallctl</b>(</code></td><td>const char *<var class="pdparam">name</var>, </td></tr><tr><td> </td><td>void *<var class="pdparam">oldp</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">oldlenp</var>, </td></tr><tr><td> </td><td>void *<var class="pdparam">newp</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">newlen</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">mallctlnametomib</b>(</code></td><td>const char *<var class="pdparam">name</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">mibp</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">miblenp</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">mallctlbymib</b>(</code></td><td>const size_t *<var class="pdparam">mib</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">miblen</var>, </td></tr><tr><td> </td><td>void *<var class="pdparam">oldp</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">oldlenp</var>, </td></tr><tr><td> </td><td>void *<var class="pdparam">newp</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">newlen</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void <b 
class="fsfunc">malloc_stats_print</b>(</code></td><td>void <var class="pdparam">(*write_cb)</var> +<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>JEMALLOC</title><meta name="generator" content="DocBook XSL Stylesheets V1.78.1"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="refentry"><a name="idp45223136"></a><div class="titlepage"></div><div class="refnamediv"><h2>Name</h2><p>jemalloc — general purpose memory allocation functions</p></div><div class="refsect1"><a name="library"></a><h2>LIBRARY</h2><p>This manual describes jemalloc 4.0.3-0-ge9192eacf8935e29fc62fddc2701f7942b1cc02c. More information + can be found at the <a class="ulink" href="http://www.canonware.com/jemalloc/" target="_top">jemalloc website</a>.</p></div><div class="refsynopsisdiv"><h2>SYNOPSIS</h2><div class="funcsynopsis"><pre class="funcsynopsisinfo">#include <<code class="filename">jemalloc/jemalloc.h</code>></pre><div class="refsect2"><a name="idp44244480"></a><h3>Standard API</h3><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">malloc</b>(</code></td><td>size_t <var class="pdparam">size</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">calloc</b>(</code></td><td>size_t <var class="pdparam">number</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">posix_memalign</b>(</code></td><td>void **<var class="pdparam">ptr</var>, 
</td></tr><tr><td> </td><td>size_t <var class="pdparam">alignment</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">aligned_alloc</b>(</code></td><td>size_t <var class="pdparam">alignment</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">realloc</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void <b class="fsfunc">free</b>(</code></td><td>void *<var class="pdparam">ptr</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div></div><div class="refsect2"><a name="idp46062768"></a><h3>Non-standard API</h3><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">mallocx</b>(</code></td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void *<b class="fsfunc">rallocx</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t 
<var class="pdparam">size</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">size_t <b class="fsfunc">xallocx</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">extra</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">size_t <b class="fsfunc">sallocx</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void <b class="fsfunc">dallocx</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void <b class="fsfunc">sdallocx</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code 
class="funcdef">size_t <b class="fsfunc">nallocx</b>(</code></td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">mallctl</b>(</code></td><td>const char *<var class="pdparam">name</var>, </td></tr><tr><td> </td><td>void *<var class="pdparam">oldp</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">oldlenp</var>, </td></tr><tr><td> </td><td>void *<var class="pdparam">newp</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">newlen</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">mallctlnametomib</b>(</code></td><td>const char *<var class="pdparam">name</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">mibp</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">miblenp</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">mallctlbymib</b>(</code></td><td>const size_t *<var class="pdparam">mib</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">miblen</var>, </td></tr><tr><td> </td><td>void *<var class="pdparam">oldp</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">oldlenp</var>, </td></tr><tr><td> </td><td>void *<var class="pdparam">newp</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">newlen</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" 
class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void <b class="fsfunc">malloc_stats_print</b>(</code></td><td>void <var class="pdparam">(*write_cb)</var> <code>(</code>void *, const char *<code>)</code> - , </td></tr><tr><td> </td><td>void *<var class="pdparam">cbopaque</var>, </td></tr><tr><td> </td><td>const char *<var class="pdparam">opts</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">size_t <b class="fsfunc">malloc_usable_size</b>(</code></td><td>const void *<var class="pdparam">ptr</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void <b class="fsfunc">(*malloc_message)</b>(</code></td><td>void *<var class="pdparam">cbopaque</var>, </td></tr><tr><td> </td><td>const char *<var class="pdparam">s</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><p><span class="type">const char *</span><code class="varname">malloc_conf</code>;</p></div><div class="refsect2"><a name="idm316388684112"></a><h3>Experimental API</h3><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">allocm</b>(</code></td><td>void **<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">rsize</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; 
cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">rallocm</b>(</code></td><td>void **<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">rsize</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">extra</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">sallocm</b>(</code></td><td>const void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>size_t *<var class="pdparam">rsize</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">dallocm</b>(</code></td><td>void *<var class="pdparam">ptr</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">int <b class="fsfunc">nallocm</b>(</code></td><td>size_t *<var class="pdparam">rsize</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>int <var class="pdparam">flags</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div></div></div></div><div class="refsect1"><a name="description"></a><h2>DESCRIPTION</h2><div class="refsect2"><a name="idm316388663504"></a><h3>Standard API</h3><p>The <code class="function">malloc</code>(<em class="parameter"><code></code></em>) function 
allocates + , </td></tr><tr><td> </td><td>void *<var class="pdparam">cbopaque</var>, </td></tr><tr><td> </td><td>const char *<var class="pdparam">opts</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">size_t <b class="fsfunc">malloc_usable_size</b>(</code></td><td>const void *<var class="pdparam">ptr</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">void <b class="fsfunc">(*malloc_message)</b>(</code></td><td>void *<var class="pdparam">cbopaque</var>, </td></tr><tr><td> </td><td>const char *<var class="pdparam">s</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div><p><span class="type">const char *</span><code class="varname">malloc_conf</code>;</p></div></div></div><div class="refsect1"><a name="description"></a><h2>DESCRIPTION</h2><div class="refsect2"><a name="idp46115952"></a><h3>Standard API</h3><p>The <code class="function">malloc</code>(<em class="parameter"><code></code></em>) function allocates <em class="parameter"><code>size</code></em> bytes of uninitialized memory. 
The allocated space is suitably aligned (after possible pointer coercion) for storage of any type of object.</p><p>The <code class="function">calloc</code>(<em class="parameter"><code></code></em>) function allocates @@ -13,13 +12,13 @@ exception that the allocated memory is explicitly initialized to zero bytes.</p><p>The <code class="function">posix_memalign</code>(<em class="parameter"><code></code></em>) function allocates <em class="parameter"><code>size</code></em> bytes of memory such that the - allocation's base address is an even multiple of + allocation's base address is a multiple of <em class="parameter"><code>alignment</code></em>, and returns the allocation in the value pointed to by <em class="parameter"><code>ptr</code></em>. The requested - <em class="parameter"><code>alignment</code></em> must be a power of 2 at least as large - as <code class="code">sizeof(<span class="type">void *</span>)</code>.</p><p>The <code class="function">aligned_alloc</code>(<em class="parameter"><code></code></em>) function + <em class="parameter"><code>alignment</code></em> must be a power of 2 at least as large as + <code class="code">sizeof(<span class="type">void *</span>)</code>.</p><p>The <code class="function">aligned_alloc</code>(<em class="parameter"><code></code></em>) function allocates <em class="parameter"><code>size</code></em> bytes of memory such that the - allocation's base address is an even multiple of + allocation's base address is a multiple of <em class="parameter"><code>alignment</code></em>. The requested <em class="parameter"><code>alignment</code></em> must be a power of 2. 
Behavior is undefined if <em class="parameter"><code>size</code></em> is not an integral multiple of @@ -38,37 +37,51 @@ <code class="function">malloc</code>(<em class="parameter"><code></code></em>) for the specified size.</p><p>The <code class="function">free</code>(<em class="parameter"><code></code></em>) function causes the allocated memory referenced by <em class="parameter"><code>ptr</code></em> to be made available for future allocations. If <em class="parameter"><code>ptr</code></em> is - <code class="constant">NULL</code>, no action occurs.</p></div><div class="refsect2"><a name="idm316388639904"></a><h3>Non-standard API</h3><p>The <code class="function">mallocx</code>(<em class="parameter"><code></code></em>), + <code class="constant">NULL</code>, no action occurs.</p></div><div class="refsect2"><a name="idp46144704"></a><h3>Non-standard API</h3><p>The <code class="function">mallocx</code>(<em class="parameter"><code></code></em>), <code class="function">rallocx</code>(<em class="parameter"><code></code></em>), <code class="function">xallocx</code>(<em class="parameter"><code></code></em>), <code class="function">sallocx</code>(<em class="parameter"><code></code></em>), - <code class="function">dallocx</code>(<em class="parameter"><code></code></em>), and + <code class="function">dallocx</code>(<em class="parameter"><code></code></em>), + <code class="function">sdallocx</code>(<em class="parameter"><code></code></em>), and <code class="function">nallocx</code>(<em class="parameter"><code></code></em>) functions all have a <em class="parameter"><code>flags</code></em> argument that can be used to specify options. The functions only check the options that are contextually relevant. 
Use bitwise or (<code class="code">|</code>) operations to specify one or more of the following: - </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="constant">MALLOCX_LG_ALIGN(<em class="parameter"><code>la</code></em>) + </p><div class="variablelist"><dl class="variablelist"><dt><a name="MALLOCX_LG_ALIGN"></a><span class="term"><code class="constant">MALLOCX_LG_ALIGN(<em class="parameter"><code>la</code></em>) </code></span></dt><dd><p>Align the memory allocation to start at an address that is a multiple of <code class="code">(1 << <em class="parameter"><code>la</code></em>)</code>. This macro does not validate that <em class="parameter"><code>la</code></em> is within the valid - range.</p></dd><dt><span class="term"><code class="constant">MALLOCX_ALIGN(<em class="parameter"><code>a</code></em>) + range.</p></dd><dt><a name="MALLOCX_ALIGN"></a><span class="term"><code class="constant">MALLOCX_ALIGN(<em class="parameter"><code>a</code></em>) </code></span></dt><dd><p>Align the memory allocation to start at an address that is a multiple of <em class="parameter"><code>a</code></em>, where <em class="parameter"><code>a</code></em> is a power of two. This macro does not validate that <em class="parameter"><code>a</code></em> is a power of 2. - </p></dd><dt><span class="term"><code class="constant">MALLOCX_ZERO</code></span></dt><dd><p>Initialize newly allocated memory to contain zero + </p></dd><dt><a name="MALLOCX_ZERO"></a><span class="term"><code class="constant">MALLOCX_ZERO</code></span></dt><dd><p>Initialize newly allocated memory to contain zero bytes. In the growing reallocation case, the real size prior to reallocation defines the boundary between untouched bytes and those that are initialized to contain zero bytes. 
If this macro is - absent, newly allocated memory is uninitialized.</p></dd><dt><span class="term"><code class="constant">MALLOCX_ARENA(<em class="parameter"><code>a</code></em>) + absent, newly allocated memory is uninitialized.</p></dd><dt><a name="MALLOCX_TCACHE"></a><span class="term"><code class="constant">MALLOCX_TCACHE(<em class="parameter"><code>tc</code></em>) + </code></span></dt><dd><p>Use the thread-specific cache (tcache) specified by + the identifier <em class="parameter"><code>tc</code></em>, which must have been + acquired via the <a class="link" href="#tcache.create"> + "<code class="mallctl">tcache.create</code>" + </a> + mallctl. This macro does not validate that + <em class="parameter"><code>tc</code></em> specifies a valid + identifier.</p></dd><dt><a name="MALLOC_TCACHE_NONE"></a><span class="term"><code class="constant">MALLOCX_TCACHE_NONE</code></span></dt><dd><p>Do not use a thread-specific cache (tcache). Unless + <code class="constant">MALLOCX_TCACHE(<em class="parameter"><code>tc</code></em>)</code> or + <code class="constant">MALLOCX_TCACHE_NONE</code> is specified, an + automatically managed tcache will be used under many circumstances. + This macro cannot be used in the same <em class="parameter"><code>flags</code></em> + argument as + <code class="constant">MALLOCX_TCACHE(<em class="parameter"><code>tc</code></em>)</code>.</p></dd><dt><a name="MALLOCX_ARENA"></a><span class="term"><code class="constant">MALLOCX_ARENA(<em class="parameter"><code>a</code></em>) </code></span></dt><dd><p>Use the arena specified by the index - <em class="parameter"><code>a</code></em> (and by necessity bypass the thread - cache). This macro has no effect for huge regions, nor for regions - that were allocated via an arena other than the one specified. - This macro does not validate that <em class="parameter"><code>a</code></em> - specifies an arena index in the valid range.</p></dd></dl></div><p> + <em class="parameter"><code>a</code></em>. 
This macro has no effect for regions that + were allocated via an arena other than the one specified. This + macro does not validate that <em class="parameter"><code>a</code></em> specifies an + arena index in the valid range.</p></dd></dl></div><p> </p><p>The <code class="function">mallocx</code>(<em class="parameter"><code></code></em>) function allocates at least <em class="parameter"><code>size</code></em> bytes of memory, and returns a pointer to the base address of the allocation. Behavior is undefined if @@ -91,7 +104,14 @@ > <code class="constant">SIZE_T_MAX</code>)</code>.</p><p>The <code class="function">sallocx</code>(<em class="parameter"><code></code></em>) function returns the real size of the allocation at <em class="parameter"><code>ptr</code></em>.</p><p>The <code class="function">dallocx</code>(<em class="parameter"><code></code></em>) function causes the memory referenced by <em class="parameter"><code>ptr</code></em> to be made available for - future allocations.</p><p>The <code class="function">nallocx</code>(<em class="parameter"><code></code></em>) function allocates no + future allocations.</p><p>The <code class="function">sdallocx</code>(<em class="parameter"><code></code></em>) function is an + extension of <code class="function">dallocx</code>(<em class="parameter"><code></code></em>) with a + <em class="parameter"><code>size</code></em> parameter to allow the caller to pass in the + allocation size as an optimization. 
The minimum valid input size is the + original requested size of the allocation, and the maximum valid input + size is the corresponding value returned by + <code class="function">nallocx</code>(<em class="parameter"><code></code></em>) or + <code class="function">sallocx</code>(<em class="parameter"><code></code></em>).</p><p>The <code class="function">nallocx</code>(<em class="parameter"><code></code></em>) function allocates no memory, but it performs the same size computation as the <code class="function">mallocx</code>(<em class="parameter"><code></code></em>) function, and returns the real size of the allocation that would result from the equivalent @@ -162,11 +182,12 @@ for (i = 0; i < nbins; i++) { functions simultaneously. If <code class="option">--enable-stats</code> is specified during configuration, “m” and “a” can be specified to omit merged arena and per arena statistics, respectively; - “b” and “l” can be specified to omit per size - class statistics for bins and large objects, respectively. Unrecognized - characters are silently ignored. Note that thread caching may prevent - some statistics from being completely up to date, since extra locking - would be required to merge counters that track thread cache operations. + “b”, “l”, and “h” can be specified to + omit per size class statistics for bins, large objects, and huge objects, + respectively. Unrecognized characters are silently ignored. Note that + thread caching may prevent some statistics from being completely up to + date, since extra locking would be required to merge counters that track + thread cache operations. </p><p>The <code class="function">malloc_usable_size</code>(<em class="parameter"><code></code></em>) function returns the usable size of the allocation pointed to by <em class="parameter"><code>ptr</code></em>. 
The return value may be larger than the size @@ -177,74 +198,7 @@ for (i = 0; i < nbins; i++) { discrepancy between the requested allocation size and the size reported by <code class="function">malloc_usable_size</code>(<em class="parameter"><code></code></em>) should not be depended on, since such behavior is entirely implementation-dependent. - </p></div><div class="refsect2"><a name="idm316388574208"></a><h3>Experimental API</h3><p>The experimental API is subject to change or removal without regard - for backward compatibility. If <code class="option">--disable-experimental</code> - is specified during configuration, the experimental API is - omitted.</p><p>The <code class="function">allocm</code>(<em class="parameter"><code></code></em>), - <code class="function">rallocm</code>(<em class="parameter"><code></code></em>), - <code class="function">sallocm</code>(<em class="parameter"><code></code></em>), - <code class="function">dallocm</code>(<em class="parameter"><code></code></em>), and - <code class="function">nallocm</code>(<em class="parameter"><code></code></em>) functions all have a - <em class="parameter"><code>flags</code></em> argument that can be used to specify - options. The functions only check the options that are contextually - relevant. Use bitwise or (<code class="code">|</code>) operations to - specify one or more of the following: - </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="constant">ALLOCM_LG_ALIGN(<em class="parameter"><code>la</code></em>) - </code></span></dt><dd><p>Align the memory allocation to start at an address - that is a multiple of <code class="code">(1 << - <em class="parameter"><code>la</code></em>)</code>. 
This macro does not validate - that <em class="parameter"><code>la</code></em> is within the valid - range.</p></dd><dt><span class="term"><code class="constant">ALLOCM_ALIGN(<em class="parameter"><code>a</code></em>) - </code></span></dt><dd><p>Align the memory allocation to start at an address - that is a multiple of <em class="parameter"><code>a</code></em>, where - <em class="parameter"><code>a</code></em> is a power of two. This macro does not - validate that <em class="parameter"><code>a</code></em> is a power of 2. - </p></dd><dt><span class="term"><code class="constant">ALLOCM_ZERO</code></span></dt><dd><p>Initialize newly allocated memory to contain zero - bytes. In the growing reallocation case, the real size prior to - reallocation defines the boundary between untouched bytes and those - that are initialized to contain zero bytes. If this macro is - absent, newly allocated memory is uninitialized.</p></dd><dt><span class="term"><code class="constant">ALLOCM_NO_MOVE</code></span></dt><dd><p>For reallocation, fail rather than moving the - object. This constraint can apply to both growth and - shrinkage.</p></dd><dt><span class="term"><code class="constant">ALLOCM_ARENA(<em class="parameter"><code>a</code></em>) - </code></span></dt><dd><p>Use the arena specified by the index - <em class="parameter"><code>a</code></em> (and by necessity bypass the thread - cache). This macro has no effect for huge regions, nor for regions - that were allocated via an arena other than the one specified. 
- This macro does not validate that <em class="parameter"><code>a</code></em> - specifies an arena index in the valid range.</p></dd></dl></div><p> - </p><p>The <code class="function">allocm</code>(<em class="parameter"><code></code></em>) function allocates at - least <em class="parameter"><code>size</code></em> bytes of memory, sets - <em class="parameter"><code>*ptr</code></em> to the base address of the allocation, and - sets <em class="parameter"><code>*rsize</code></em> to the real size of the allocation if - <em class="parameter"><code>rsize</code></em> is not <code class="constant">NULL</code>. Behavior - is undefined if <em class="parameter"><code>size</code></em> is <code class="constant">0</code>, or - if request size overflows due to size class and/or alignment - constraints.</p><p>The <code class="function">rallocm</code>(<em class="parameter"><code></code></em>) function resizes the - allocation at <em class="parameter"><code>*ptr</code></em> to be at least - <em class="parameter"><code>size</code></em> bytes, sets <em class="parameter"><code>*ptr</code></em> to - the base address of the allocation if it moved, and sets - <em class="parameter"><code>*rsize</code></em> to the real size of the allocation if - <em class="parameter"><code>rsize</code></em> is not <code class="constant">NULL</code>. If - <em class="parameter"><code>extra</code></em> is non-zero, an attempt is made to resize - the allocation to be at least <code class="code">(<em class="parameter"><code>size</code></em> + - <em class="parameter"><code>extra</code></em>)</code> bytes, though inability to allocate - the extra byte(s) will not by itself result in failure. 
Behavior is - undefined if <em class="parameter"><code>size</code></em> is <code class="constant">0</code>, if - request size overflows due to size class and/or alignment constraints, or - if <code class="code">(<em class="parameter"><code>size</code></em> + - <em class="parameter"><code>extra</code></em> > - <code class="constant">SIZE_T_MAX</code>)</code>.</p><p>The <code class="function">sallocm</code>(<em class="parameter"><code></code></em>) function sets - <em class="parameter"><code>*rsize</code></em> to the real size of the allocation.</p><p>The <code class="function">dallocm</code>(<em class="parameter"><code></code></em>) function causes the - memory referenced by <em class="parameter"><code>ptr</code></em> to be made available for - future allocations.</p><p>The <code class="function">nallocm</code>(<em class="parameter"><code></code></em>) function allocates no - memory, but it performs the same size computation as the - <code class="function">allocm</code>(<em class="parameter"><code></code></em>) function, and if - <em class="parameter"><code>rsize</code></em> is not <code class="constant">NULL</code> it sets - <em class="parameter"><code>*rsize</code></em> to the real size of the allocation that - would result from the equivalent <code class="function">allocm</code>(<em class="parameter"><code></code></em>) - function call. 
Behavior is undefined if <em class="parameter"><code>size</code></em> is - <code class="constant">0</code>, or if request size overflows due to size class - and/or alignment constraints.</p></div></div><div class="refsect1"><a name="tuning"></a><h2>TUNING</h2><p>Once, when the first call is made to one of the memory allocation + </p></div></div><div class="refsect1"><a name="tuning"></a><h2>TUNING</h2><p>Once, when the first call is made to one of the memory allocation routines, the allocator initializes its internals based in part on various options that can be specified at compile- or run-time.</p><p>The string pointed to by the global variable <code class="varname">malloc_conf</code>, the “name” of the file @@ -272,8 +226,9 @@ for (i = 0; i < nbins; i++) { <span class="citerefentry"><span class="refentrytitle">sbrk</span>(2)</span> to obtain memory, which is suboptimal for several reasons, including race conditions, increased fragmentation, and artificial limitations on maximum usable memory. If - <code class="option">--enable-dss</code> is specified during configuration, this - allocator uses both <span class="citerefentry"><span class="refentrytitle">mmap</span>(2)</span> and + <span class="citerefentry"><span class="refentrytitle">sbrk</span>(2)</span> is supported by the operating + system, this allocator uses both + <span class="citerefentry"><span class="refentrytitle">mmap</span>(2)</span> and <span class="citerefentry"><span class="refentrytitle">sbrk</span>(2)</span>, in that order of preference; otherwise only <span class="citerefentry"><span class="refentrytitle">mmap</span>(2)</span> is used.</p><p>This allocator uses multiple arenas in order to reduce lock contention for threaded programs on multi-processor systems. This works @@ -295,34 +250,52 @@ for (i = 0; i < nbins; i++) { chunk size is a power of two that is greater than the page size. Chunks are always aligned to multiples of the chunk size. 
This alignment makes it possible to find metadata for user objects very quickly.</p><p>User objects are broken into three categories according to size: - small, large, and huge. Small objects are smaller than one page. Large - objects are smaller than the chunk size. Huge objects are a multiple of - the chunk size. Small and large objects are managed by arenas; huge - objects are managed separately in a single data structure that is shared by - all threads. Huge objects are used by applications infrequently enough - that this single data structure is not a scalability issue.</p><p>Each chunk that is managed by an arena tracks its contents as runs of + small, large, and huge. Small and large objects are managed entirely by + arenas; huge objects are additionally aggregated in a single data structure + that is shared by all threads. Huge objects are typically used by + applications infrequently enough that this single data structure is not a + scalability issue.</p><p>Each chunk that is managed by an arena tracks its contents as runs of contiguous pages (unused, backing a set of small objects, or backing one large object). The combination of chunk alignment and chunk page maps makes it possible to determine all metadata regarding small and large allocations in constant time.</p><p>Small objects are managed in groups by page runs. Each run maintains - a frontier and free list to track which regions are in use. Allocation - requests that are no more than half the quantum (8 or 16, depending on - architecture) are rounded up to the nearest power of two that is at least - <code class="code">sizeof(<span class="type">double</span>)</code>. All other small - object size classes are multiples of the quantum, spaced such that internal - fragmentation is limited to approximately 25% for all but the smallest size - classes. 
Allocation requests that are larger than the maximum small size - class, but small enough to fit in an arena-managed chunk (see the <a class="link" href="#opt.lg_chunk"> + a bitmap to track which regions are in use. Allocation requests that are no + more than half the quantum (8 or 16, depending on architecture) are rounded + up to the nearest power of two that is at least <code class="code">sizeof(<span class="type">double</span>)</code>. All other object size + classes are multiples of the quantum, spaced such that there are four size + classes for each doubling in size, which limits internal fragmentation to + approximately 20% for all but the smallest size classes. Small size classes + are smaller than four times the page size, large size classes are smaller + than the chunk size (see the <a class="link" href="#opt.lg_chunk"> "<code class="mallctl">opt.lg_chunk</code>" - </a> option), are - rounded up to the nearest run size. Allocation requests that are too large - to fit in an arena-managed chunk are rounded up to the nearest multiple of - the chunk size.</p><p>Allocations are packed tightly together, which can be an issue for + </a> option), and + huge size classes extend from the chunk size up to one size class less than + the full address space size.</p><p>Allocations are packed tightly together, which can be an issue for multi-threaded applications. If you need to assure that allocations do not suffer from cacheline sharing, round your allocation requests up to the nearest multiple of the cacheline size, or specify cacheline alignment when - allocating.</p><p>Assuming 4 MiB chunks, 4 KiB pages, and a 16-byte quantum on a 64-bit - system, the size classes in each category are as shown in <a class="xref" href="#size_classes" title="Table 1. Size classes">Table 1</a>.</p><div class="table"><a name="size_classes"></a><p class="title"><b>Table 1. 
Size classes</b></p><div class="table-contents"><table summary="Size classes" border="1"><colgroup><col align="left" class="c1"><col align="right" class="c2"><col align="left" class="c3"></colgroup><thead><tr><th align="left">Category</th><th align="right">Spacing</th><th align="left">Size</th></tr></thead><tbody><tr><td rowspan="7" align="left">Small</td><td align="right">lg</td><td align="left">[8]</td></tr><tr><td align="right">16</td><td align="left">[16, 32, 48, ..., 128]</td></tr><tr><td align="right">32</td><td align="left">[160, 192, 224, 256]</td></tr><tr><td align="right">64</td><td align="left">[320, 384, 448, 512]</td></tr><tr><td align="right">128</td><td align="left">[640, 768, 896, 1024]</td></tr><tr><td align="right">256</td><td align="left">[1280, 1536, 1792, 2048]</td></tr><tr><td align="right">512</td><td align="left">[2560, 3072, 3584]</td></tr><tr><td align="left">Large</td><td align="right">4 KiB</td><td align="left">[4 KiB, 8 KiB, 12 KiB, ..., 4072 KiB]</td></tr><tr><td align="left">Huge</td><td align="right">4 MiB</td><td align="left">[4 MiB, 8 MiB, 12 MiB, ...]</td></tr></tbody></table></div></div><br class="table-break"></div><div class="refsect1"><a name="mallctl_namespace"></a><h2>MALLCTL NAMESPACE</h2><p>The following names are defined in the namespace accessible via the + allocating.</p><p>The <code class="function">realloc</code>(<em class="parameter"><code></code></em>), + <code class="function">rallocx</code>(<em class="parameter"><code></code></em>), and + <code class="function">xallocx</code>(<em class="parameter"><code></code></em>) functions may resize allocations + without moving them under limited circumstances. 
Unlike the + <code class="function">*allocx</code>(<em class="parameter"><code></code></em>) API, the standard API does not + officially round up the usable size of an allocation to the nearest size + class, so technically it is necessary to call + <code class="function">realloc</code>(<em class="parameter"><code></code></em>) to grow e.g. a 9-byte allocation to + 16 bytes, or shrink a 16-byte allocation to 9 bytes. Growth and shrinkage + trivially succeeds in place as long as the pre-size and post-size both round + up to the same size class. No other API guarantees are made regarding + in-place resizing, but the current implementation also tries to resize large + and huge allocations in place, as long as the pre-size and post-size are + both large or both huge. In such cases shrinkage always succeeds for large + size classes, but for huge size classes the chunk allocator must support + splitting (see <a class="link" href="#arena.i.chunk_hooks"> + "<code class="mallctl">arena.<i>.chunk_hooks</code>" + </a>). + Growth only succeeds if the trailing memory is currently available, and + additionally for huge size classes the chunk allocator must support + merging.</p><p>Assuming 2 MiB chunks, 4 KiB pages, and a 16-byte quantum on a + 64-bit system, the size classes in each category are as shown in <a class="xref" href="#size_classes" title="Table 1. Size classes">Table 1</a>.</p><div class="table"><a name="size_classes"></a><p class="title"><b>Table 1. 
Size classes</b></p><div class="table-contents"><table summary="Size classes" border="1"><colgroup><col align="left" class="c1"><col align="right" class="c2"><col align="left" class="c3"></colgroup><thead><tr><th align="left">Category</th><th align="right">Spacing</th><th align="left">Size</th></tr></thead><tbody><tr><td rowspan="9" align="left">Small</td><td align="right">lg</td><td align="left">[8]</td></tr><tr><td align="right">16</td><td align="left">[16, 32, 48, 64, 80, 96, 112, 128]</td></tr><tr><td align="right">32</td><td align="left">[160, 192, 224, 256]</td></tr><tr><td align="right">64</td><td align="left">[320, 384, 448, 512]</td></tr><tr><td align="right">128</td><td align="left">[640, 768, 896, 1024]</td></tr><tr><td align="right">256</td><td align="left">[1280, 1536, 1792, 2048]</td></tr><tr><td align="right">512</td><td align="left">[2560, 3072, 3584, 4096]</td></tr><tr><td align="right">1 KiB</td><td align="left">[5 KiB, 6 KiB, 7 KiB, 8 KiB]</td></tr><tr><td align="right">2 KiB</td><td align="left">[10 KiB, 12 KiB, 14 KiB]</td></tr><tr><td rowspan="8" align="left">Large</td><td align="right">2 KiB</td><td align="left">[16 KiB]</td></tr><tr><td align="right">4 KiB</td><td align="left">[20 KiB, 24 KiB, 28 KiB, 32 KiB]</td></tr><tr><td align="right">8 KiB</td><td align="left">[40 KiB, 48 KiB, 56 KiB, 64 KiB]</td></tr><tr><td align="right">16 KiB</td><td align="left">[80 KiB, 96 KiB, 112 KiB, 128 KiB]</td></tr><tr><td align="right">32 KiB</td><td align="left">[160 KiB, 192 KiB, 224 KiB, 256 KiB]</td></tr><tr><td align="right">64 KiB</td><td align="left">[320 KiB, 384 KiB, 448 KiB, 512 KiB]</td></tr><tr><td align="right">128 KiB</td><td align="left">[640 KiB, 768 KiB, 896 KiB, 1 MiB]</td></tr><tr><td align="right">256 KiB</td><td align="left">[1280 KiB, 1536 KiB, 1792 KiB]</td></tr><tr><td rowspan="7" align="left">Huge</td><td align="right">256 KiB</td><td align="left">[2 MiB]</td></tr><tr><td align="right">512 KiB</td><td align="left">[2560 KiB, 3 MiB, 
3584 KiB, 4 MiB]</td></tr><tr><td align="right">1 MiB</td><td align="left">[5 MiB, 6 MiB, 7 MiB, 8 MiB]</td></tr><tr><td align="right">2 MiB</td><td align="left">[10 MiB, 12 MiB, 14 MiB, 16 MiB]</td></tr><tr><td align="right">4 MiB</td><td align="left">[20 MiB, 24 MiB, 28 MiB, 32 MiB]</td></tr><tr><td align="right">8 MiB</td><td align="left">[40 MiB, 48 MiB, 56 MiB, 64 MiB]</td></tr><tr><td align="right">...</td><td align="left">...</td></tr></tbody></table></div></div><br class="table-break"></div><div class="refsect1"><a name="mallctl_namespace"></a><h2>MALLCTL NAMESPACE</h2><p>The following names are defined in the namespace accessible via the <code class="function">mallctl*</code>(<em class="parameter"><code></code></em>) functions. Value types are specified in parentheses, their readable/writable statuses are encoded as <code class="literal">rw</code>, <code class="literal">r-</code>, <code class="literal">-w</code>, or @@ -355,20 +328,20 @@ for (i = 0; i < nbins; i++) { </span></dt><dd><p>If a value is passed in, refresh the data from which the <code class="function">mallctl*</code>(<em class="parameter"><code></code></em>) functions report values, and increment the epoch. Return the current epoch. 
This is useful for - detecting whether another thread caused a refresh.</p></dd><dt><a name="config.debug"></a><span class="term"> + detecting whether another thread caused a refresh.</p></dd><dt><a name="config.cache_oblivious"></a><span class="term"> - "<code class="mallctl">config.debug</code>" + "<code class="mallctl">config.cache_oblivious</code>" (<span class="type">bool</span>) <code class="literal">r-</code> - </span></dt><dd><p><code class="option">--enable-debug</code> was specified during - build configuration.</p></dd><dt><a name="config.dss"></a><span class="term"> + </span></dt><dd><p><code class="option">--enable-cache-oblivious</code> was specified + during build configuration.</p></dd><dt><a name="config.debug"></a><span class="term"> - "<code class="mallctl">config.dss</code>" + "<code class="mallctl">config.debug</code>" (<span class="type">bool</span>) <code class="literal">r-</code> - </span></dt><dd><p><code class="option">--enable-dss</code> was specified during + </span></dt><dd><p><code class="option">--enable-debug</code> was specified during build configuration.</p></dd><dt><a name="config.fill"></a><span class="term"> "<code class="mallctl">config.fill</code>" @@ -383,14 +356,7 @@ for (i = 0; i < nbins; i++) { (<span class="type">bool</span>) <code class="literal">r-</code> </span></dt><dd><p><code class="option">--enable-lazy-lock</code> was specified - during build configuration.</p></dd><dt><a name="config.mremap"></a><span class="term"> - - "<code class="mallctl">config.mremap</code>" - - (<span class="type">bool</span>) - <code class="literal">r-</code> - </span></dt><dd><p><code class="option">--enable-mremap</code> was specified during - build configuration.</p></dd><dt><a name="config.munmap"></a><span class="term"> + during build configuration.</p></dd><dt><a name="config.munmap"></a><span class="term"> "<code class="mallctl">config.munmap</code>" @@ -479,12 +445,13 @@ for (i = 0; i < nbins; i++) { <code 
class="literal">r-</code> </span></dt><dd><p>dss (<span class="citerefentry"><span class="refentrytitle">sbrk</span>(2)</span>) allocation precedence as related to <span class="citerefentry"><span class="refentrytitle">mmap</span>(2)</span> allocation. The following - settings are supported: “disabled”, “primary”, - and “secondary”. The default is “secondary” if - <a class="link" href="#config.dss"> - "<code class="mallctl">config.dss</code>" - </a> is - true, “disabled” otherwise. + settings are supported if + <span class="citerefentry"><span class="refentrytitle">sbrk</span>(2)</span> is supported by the operating + system: “disabled”, “primary”, and + “secondary”; otherwise only “disabled” is + supported. The default is “secondary” if + <span class="citerefentry"><span class="refentrytitle">sbrk</span>(2)</span> is supported by the operating + system; “disabled” otherwise. </p></dd><dt><a name="opt.lg_chunk"></a><span class="term"> "<code class="mallctl">opt.lg_chunk</code>" @@ -494,7 +461,7 @@ for (i = 0; i < nbins; i++) { </span></dt><dd><p>Virtual memory chunk size (log base 2). If a chunk size outside the supported size range is specified, the size is silently clipped to the minimum/maximum supported size. The default - chunk size is 4 MiB (2^22). + chunk size is 2 MiB (2^21). </p></dd><dt><a name="opt.narenas"></a><span class="term"> "<code class="mallctl">opt.narenas</code>" @@ -517,7 +484,13 @@ for (i = 0; i < nbins; i++) { provides the kernel with sufficient information to recycle dirty pages if physical memory becomes scarce and the pages remain unused. The default minimum ratio is 8:1 (2^3:1); an option value of -1 will - disable dirty page purging.</p></dd><dt><a name="opt.stats_print"></a><span class="term"> + disable dirty page purging. 
See <a class="link" href="#arenas.lg_dirty_mult"> + "<code class="mallctl">arenas.lg_dirty_mult</code>" + </a> + and <a class="link" href="#arena.i.lg_dirty_mult"> + "<code class="mallctl">arena.<i>.lg_dirty_mult</code>" + </a> + for related dynamic control options.</p></dd><dt><a name="opt.stats_print"></a><span class="term"> "<code class="mallctl">opt.stats_print</code>" @@ -530,23 +503,31 @@ for (i = 0; i < nbins; i++) { <code class="option">--enable-stats</code> is specified during configuration, this has the potential to cause deadlock for a multi-threaded process that exits while one or more threads are executing in the memory allocation - functions. Therefore, this option should only be used with care; it is - primarily intended as a performance tuning aid during application + functions. Furthermore, <code class="function">atexit</code>(<em class="parameter"><code></code></em>) may + allocate memory during application initialization and then deadlock + internally when jemalloc in turn calls + <code class="function">atexit</code>(<em class="parameter"><code></code></em>), so this option is not + universally usable (though the application can register its own + <code class="function">atexit</code>(<em class="parameter"><code></code></em>) function with equivalent + functionality). Therefore, this option should only be used with care; + it is primarily intended as a performance tuning aid during application development. This option is disabled by default.</p></dd><dt><a name="opt.junk"></a><span class="term"> "<code class="mallctl">opt.junk</code>" - (<span class="type">bool</span>) + (<span class="type">const char *</span>) <code class="literal">r-</code> [<code class="option">--enable-fill</code>] - </span></dt><dd><p>Junk filling enabled/disabled. If enabled, each byte - of uninitialized allocated memory will be initialized to - <code class="literal">0xa5</code>. All deallocated memory will be initialized to - <code class="literal">0x5a</code>. 
This is intended for debugging and will - impact performance negatively. This option is disabled by default - unless <code class="option">--enable-debug</code> is specified during - configuration, in which case it is enabled by default unless running - inside <a class="ulink" href="http://valgrind.org/" target="_top">Valgrind</a>.</p></dd><dt><a name="opt.quarantine"></a><span class="term"> + </span></dt><dd><p>Junk filling. If set to "alloc", each byte of + uninitialized allocated memory will be initialized to + <code class="literal">0xa5</code>. If set to "free", all deallocated memory will + be initialized to <code class="literal">0x5a</code>. If set to "true", both + allocated and deallocated memory will be initialized, and if set to + "false", junk filling will be disabled entirely. This is intended for + debugging and will impact performance negatively. This option is + "false" by default unless <code class="option">--enable-debug</code> is specified + during configuration, in which case it is "true" by default unless + running inside <a class="ulink" href="http://valgrind.org/" target="_top">Valgrind</a>.</p></dd><dt><a name="opt.quarantine"></a><span class="term"> "<code class="mallctl">opt.quarantine</code>" @@ -592,9 +573,8 @@ for (i = 0; i < nbins; i++) { </span></dt><dd><p>Zero filling enabled/disabled. If enabled, each byte of uninitialized allocated memory will be initialized to 0. Note that this initialization only happens once for each byte, so - <code class="function">realloc</code>(<em class="parameter"><code></code></em>), - <code class="function">rallocx</code>(<em class="parameter"><code></code></em>) and - <code class="function">rallocm</code>(<em class="parameter"><code></code></em>) calls do not zero memory that + <code class="function">realloc</code>(<em class="parameter"><code></code></em>) and + <code class="function">rallocx</code>(<em class="parameter"><code></code></em>) calls do not zero memory that was previously allocated. 
This is intended for debugging and will impact performance negatively. This option is disabled by default. </p></dd><dt><a name="opt.utrace"></a><span class="term"> @@ -606,17 +586,7 @@ for (i = 0; i < nbins; i++) { [<code class="option">--enable-utrace</code>] </span></dt><dd><p>Allocation tracing based on <span class="citerefentry"><span class="refentrytitle">utrace</span>(2)</span> enabled/disabled. This option - is disabled by default.</p></dd><dt><a name="opt.valgrind"></a><span class="term"> - - "<code class="mallctl">opt.valgrind</code>" - - (<span class="type">bool</span>) - <code class="literal">r-</code> - [<code class="option">--enable-valgrind</code>] - </span></dt><dd><p><a class="ulink" href="http://valgrind.org/" target="_top">Valgrind</a> - support enabled/disabled. This option is vestigal because jemalloc - auto-detects whether it is running inside Valgrind. This option is - disabled by default, unless running inside Valgrind.</p></dd><dt><a name="opt.xmalloc"></a><span class="term"> + is disabled by default.</p></dd><dt><a name="opt.xmalloc"></a><span class="term"> "<code class="mallctl">opt.xmalloc</code>" @@ -639,16 +609,16 @@ malloc_conf = "xmalloc:true";</pre><p> (<span class="type">bool</span>) <code class="literal">r-</code> [<code class="option">--enable-tcache</code>] - </span></dt><dd><p>Thread-specific caching enabled/disabled. When there - are multiple threads, each thread uses a thread-specific cache for - objects up to a certain size. Thread-specific caching allows many - allocations to be satisfied without performing any thread - synchronization, at the cost of increased memory use. See the - <a class="link" href="#opt.lg_tcache_max"> + </span></dt><dd><p>Thread-specific caching (tcache) enabled/disabled. When + there are multiple threads, each thread uses a tcache for objects up to + a certain size. 
Thread-specific caching allows many allocations to be + satisfied without performing any thread synchronization, at the cost of + increased memory use. See the <a class="link" href="#opt.lg_tcache_max"> "<code class="mallctl">opt.lg_tcache_max</code>" </a> option for related tuning information. This option is enabled by - default unless running inside <a class="ulink" href="http://valgrind.org/" target="_top">Valgrind</a>.</p></dd><dt><a name="opt.lg_tcache_max"></a><span class="term"> + default unless running inside <a class="ulink" href="http://valgrind.org/" target="_top">Valgrind</a>, in which case it is + forcefully disabled.</p></dd><dt><a name="opt.lg_tcache_max"></a><span class="term"> "<code class="mallctl">opt.lg_tcache_max</code>" @@ -656,8 +626,8 @@ malloc_conf = "xmalloc:true";</pre><p> <code class="literal">r-</code> [<code class="option">--enable-tcache</code>] </span></dt><dd><p>Maximum size class (log base 2) to cache in the - thread-specific cache. At a minimum, all small size classes are - cached, and at a maximum all large size classes are cached. The + thread-specific cache (tcache). At a minimum, all small size classes + are cached, and at a maximum all large size classes are cached. The default maximum is 32 KiB (2^15).</p></dd><dt><a name="opt.prof"></a><span class="term"> "<code class="mallctl">opt.prof</code>" @@ -686,8 +656,8 @@ malloc_conf = "xmalloc:true";</pre><p> "<code class="mallctl">opt.prof_final</code>" </a> option for final profile dumping. 
Profile output is compatible with - the included <span class="command"><strong>pprof</strong></span> Perl script, which originates - from the <a class="ulink" href="http://code.google.com/p/gperftools/" target="_top">gperftools + the <span class="command"><strong>jeprof</strong></span> command, which is based on the + <span class="command"><strong>pprof</strong></span> that is developed as part of the <a class="ulink" href="http://code.google.com/p/gperftools/" target="_top">gperftools package</a>.</p></dd><dt><a name="opt.prof_prefix"></a><span class="term"> "<code class="mallctl">opt.prof_prefix</code>" @@ -704,7 +674,7 @@ malloc_conf = "xmalloc:true";</pre><p> "<code class="mallctl">opt.prof_active</code>" (<span class="type">bool</span>) - <code class="literal">rw</code> + <code class="literal">r-</code> [<code class="option">--enable-prof</code>] </span></dt><dd><p>Profiling activated/deactivated. This is a secondary control mechanism that makes it possible to start the application with @@ -715,11 +685,25 @@ malloc_conf = "xmalloc:true";</pre><p> with the <a class="link" href="#prof.active"> "<code class="mallctl">prof.active</code>" </a> mallctl. - This option is enabled by default.</p></dd><dt><a name="opt.lg_prof_sample"></a><span class="term"> + This option is enabled by default.</p></dd><dt><a name="opt.prof_thread_active_init"></a><span class="term"> + + "<code class="mallctl">opt.prof_thread_active_init</code>" + + (<span class="type">bool</span>) + <code class="literal">r-</code> + [<code class="option">--enable-prof</code>] + </span></dt><dd><p>Initial setting for <a class="link" href="#thread.prof.active"> + "<code class="mallctl">thread.prof.active</code>" + </a> + in newly created threads. The initial setting for newly created threads + can also be changed during execution via the <a class="link" href="#prof.thread_active_init"> + "<code class="mallctl">prof.thread_active_init</code>" + </a> + mallctl. 
This option is enabled by default.</p></dd><dt><a name="opt.lg_prof_sample"></a><span class="term"> "<code class="mallctl">opt.lg_prof_sample</code>" - (<span class="type">ssize_t</span>) + (<span class="type">size_t</span>) <code class="literal">r-</code> [<code class="option">--enable-prof</code>] </span></dt><dd><p>Average interval (log base 2) between allocation @@ -764,14 +748,12 @@ malloc_conf = "xmalloc:true";</pre><p> (<span class="type">bool</span>) <code class="literal">r-</code> [<code class="option">--enable-prof</code>] - </span></dt><dd><p>Trigger a memory profile dump every time the total - virtual memory exceeds the previous maximum. Profiles are dumped to - files named according to the pattern - <code class="filename"><prefix>.<pid>.<seq>.u<useq>.heap</code>, - where <code class="literal"><prefix></code> is controlled by the <a class="link" href="#opt.prof_prefix"> - "<code class="mallctl">opt.prof_prefix</code>" - </a> - option. This option is disabled by default.</p></dd><dt><a name="opt.prof_final"></a><span class="term"> + </span></dt><dd><p>Set the initial state of <a class="link" href="#prof.gdump"> + "<code class="mallctl">prof.gdump</code>" + </a>, which when + enabled triggers a memory profile dump every time the total virtual + memory exceeds the previous maximum. This option is disabled by + default.</p></dd><dt><a name="opt.prof_final"></a><span class="term"> "<code class="mallctl">opt.prof_final</code>" @@ -785,7 +767,13 @@ malloc_conf = "xmalloc:true";</pre><p> where <code class="literal"><prefix></code> is controlled by the <a class="link" href="#opt.prof_prefix"> "<code class="mallctl">opt.prof_prefix</code>" </a> - option. This option is enabled by default.</p></dd><dt><a name="opt.prof_leak"></a><span class="term"> + option. 
Note that <code class="function">atexit</code>(<em class="parameter"><code></code></em>) may allocate + memory during application initialization and then deadlock internally + when jemalloc in turn calls <code class="function">atexit</code>(<em class="parameter"><code></code></em>), so + this option is not universally usable (though the application can + register its own <code class="function">atexit</code>(<em class="parameter"><code></code></em>) function with + equivalent functionality). This option is disabled by + default.</p></dd><dt><a name="opt.prof_leak"></a><span class="term"> "<code class="mallctl">opt.prof_leak</code>" @@ -864,9 +852,9 @@ malloc_conf = "xmalloc:true";</pre><p> [<code class="option">--enable-tcache</code>] </span></dt><dd><p>Enable/disable calling thread's tcache. The tcache is implicitly flushed as a side effect of becoming - disabled (see + disabled (see <a class="link" href="#thread.tcache.flush"> "<code class="mallctl">thread.tcache.flush</code>" - ). + </a>). </p></dd><dt><a name="thread.tcache.flush"></a><span class="term"> "<code class="mallctl">thread.tcache.flush</code>" @@ -874,19 +862,84 @@ malloc_conf = "xmalloc:true";</pre><p> (<span class="type">void</span>) <code class="literal">--</code> [<code class="option">--enable-tcache</code>] - </span></dt><dd><p>Flush calling thread's tcache. This interface releases - all cached objects and internal data structures associated with the - calling thread's thread-specific cache. Ordinarily, this interface + </span></dt><dd><p>Flush calling thread's thread-specific cache (tcache). + This interface releases all cached objects and internal data structures + associated with the calling thread's tcache. Ordinarily, this interface need not be called, since automatic periodic incremental garbage collection occurs, and the thread cache is automatically discarded when a thread exits. 
However, garbage collection is triggered by allocation activity, so it is possible for a thread that stops allocating/deallocating to retain its cache indefinitely, in which case - the developer may find manual flushing useful.</p></dd><dt><a name="arena.i.purge"></a><span class="term"> + the developer may find manual flushing useful.</p></dd><dt><a name="thread.prof.name"></a><span class="term"> - "<code class="mallctl">arena.<i>.purge</code>" + "<code class="mallctl">thread.prof.name</code>" + + (<span class="type">const char *</span>) + <code class="literal">r-</code> or + <code class="literal">-w</code> + [<code class="option">--enable-prof</code>] + </span></dt><dd><p>Get/set the descriptive name associated with the calling + thread in memory profile dumps. An internal copy of the name string is + created, so the input string need not be maintained after this interface + completes execution. The output string of this interface should be + copied for non-ephemeral uses, because multiple implementation details + can cause asynchronous string deallocation. Furthermore, each + invocation of this interface can only read or write; simultaneous + read/write is not supported due to string lifetime limitations. The + name string must be nil-terminated and comprised only of characters in the + sets recognized + by <span class="citerefentry"><span class="refentrytitle">isgraph</span>(3)</span> and + <span class="citerefentry"><span class="refentrytitle">isblank</span>(3)</span>.</p></dd><dt><a name="thread.prof.active"></a><span class="term"> + + "<code class="mallctl">thread.prof.active</code>" + + (<span class="type">bool</span>) + <code class="literal">rw</code> + [<code class="option">--enable-prof</code>] + </span></dt><dd><p>Control whether sampling is currently active for the + calling thread. 
This is an activation mechanism in addition to <a class="link" href="#prof.active"> + "<code class="mallctl">prof.active</code>" + </a>; both must + be active for the calling thread to sample. This flag is enabled by + default.</p></dd><dt><a name="tcache.create"></a><span class="term"> + + "<code class="mallctl">tcache.create</code>" + + (<span class="type">unsigned</span>) + <code class="literal">r-</code> + [<code class="option">--enable-tcache</code>] + </span></dt><dd><p>Create an explicit thread-specific cache (tcache) and + return an identifier that can be passed to the <a class="link" href="#MALLOCX_TCACHE"><code class="constant">MALLOCX_TCACHE(<em class="parameter"><code>tc</code></em>)</code></a> + macro to explicitly use the specified cache rather than the + automatically managed one that is used by default. Each explicit cache + can be used by only one thread at a time; the application must assure + that this constraint holds. + </p></dd><dt><a name="tcache.flush"></a><span class="term"> + + "<code class="mallctl">tcache.flush</code>" + + (<span class="type">unsigned</span>) + <code class="literal">-w</code> + [<code class="option">--enable-tcache</code>] + </span></dt><dd><p>Flush the specified thread-specific cache (tcache). The + same considerations apply to this interface as to <a class="link" href="#thread.tcache.flush"> + "<code class="mallctl">thread.tcache.flush</code>" + </a>, + except that the tcache will never be automatically discarded. + </p></dd><dt><a name="tcache.destroy"></a><span class="term"> + + "<code class="mallctl">tcache.destroy</code>" (<span class="type">unsigned</span>) + <code class="literal">-w</code> + [<code class="option">--enable-tcache</code>] + </span></dt><dd><p>Flush the specified thread-specific cache (tcache) and + make the identifier available for use during a future tcache creation. 
+ </p></dd><dt><a name="arena.i.purge"></a><span class="term"> + + "<code class="mallctl">arena.<i>.purge</code>" + + (<span class="type">void</span>) <code class="literal">--</code> </span></dt><dd><p>Purge unused dirty pages for arena <i>, or for all arenas if <i> equals <a class="link" href="#arenas.narenas"> @@ -902,15 +955,138 @@ malloc_conf = "xmalloc:true";</pre><p> allocation for arena <i>, or for all arenas if <i> equals <a class="link" href="#arenas.narenas"> "<code class="mallctl">arenas.narenas</code>" - </a>. Note - that even during huge allocation this setting is read from the arena - that would be chosen for small or large allocation so that applications - can depend on consistent dss versus mmap allocation regardless of - allocation size. See <a class="link" href="#opt.dss"> + </a>. See + <a class="link" href="#opt.dss"> "<code class="mallctl">opt.dss</code>" </a> for supported - settings. - </p></dd><dt><a name="arenas.narenas"></a><span class="term"> + settings.</p></dd><dt><a name="arena.i.lg_dirty_mult"></a><span class="term"> + + "<code class="mallctl">arena.<i>.lg_dirty_mult</code>" + + (<span class="type">ssize_t</span>) + <code class="literal">rw</code> + </span></dt><dd><p>Current per-arena minimum ratio (log base 2) of active + to dirty pages for arena <i>. Each time this interface is set and + the ratio is increased, pages are synchronously purged as necessary to + impose the new ratio. See <a class="link" href="#opt.lg_dirty_mult"> + "<code class="mallctl">opt.lg_dirty_mult</code>" + </a> + for additional information.</p></dd><dt><a name="arena.i.chunk_hooks"></a><span class="term"> + + "<code class="mallctl">arena.<i>.chunk_hooks</code>" + + (<span class="type">chunk_hooks_t</span>) + <code class="literal">rw</code> + </span></dt><dd><p>Get or set the chunk management hook functions for arena + <i>. 
The functions must be capable of operating on all extant + chunks associated with arena <i>, usually by passing unknown + chunks to the replaced functions. In practice, it is feasible to + control allocation for arenas created via <a class="link" href="#arenas.extend"> + "<code class="mallctl">arenas.extend</code>" + </a> such + that all chunks originate from an application-supplied chunk allocator + (by setting custom chunk hook functions just after arena creation), but + the automatically created arenas may have already created chunks prior + to the application having an opportunity to take over chunk + allocation.</p><pre class="programlisting"> +typedef struct { + chunk_alloc_t *alloc; + chunk_dalloc_t *dalloc; + chunk_commit_t *commit; + chunk_decommit_t *decommit; + chunk_purge_t *purge; + chunk_split_t *split; + chunk_merge_t *merge; +} chunk_hooks_t;</pre><p>The <span class="type">chunk_hooks_t</span> structure comprises function + pointers which are described individually below. jemalloc uses these + functions to manage chunk lifetime, which starts off with allocation of + mapped committed memory, in the simplest case followed by deallocation. + However, there are performance and platform reasons to retain chunks for + later reuse. Cleanup attempts cascade from deallocation to decommit to + purging, which gives the chunk management functions opportunities to + reject the most permanent cleanup operations in favor of less permanent + (and often less costly) operations. The chunk splitting and merging + operations can also be opted out of, but this is mainly intended to + support platforms on which virtual memory mappings provided by the + operating system kernel do not automatically coalesce and split, e.g. 
+ Windows.</p><div class="funcsynopsis"><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">typedef void *<b class="fsfunc">(chunk_alloc_t)</b>(</code></td><td>void *<var class="pdparam">chunk</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">alignment</var>, </td></tr><tr><td> </td><td>bool *<var class="pdparam">zero</var>, </td></tr><tr><td> </td><td>bool *<var class="pdparam">commit</var>, </td></tr><tr><td> </td><td>unsigned <var class="pdparam">arena_ind</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div></div><div class="literallayout"><p></p></div><p>A chunk allocation function conforms to the + <span class="type">chunk_alloc_t</span> type and upon success returns a pointer to + <em class="parameter"><code>size</code></em> bytes of mapped memory on behalf of arena + <em class="parameter"><code>arena_ind</code></em> such that the chunk's base address is a + multiple of <em class="parameter"><code>alignment</code></em>, as well as setting + <em class="parameter"><code>*zero</code></em> to indicate whether the chunk is zeroed and + <em class="parameter"><code>*commit</code></em> to indicate whether the chunk is + committed. Upon error the function returns <code class="constant">NULL</code> + and leaves <em class="parameter"><code>*zero</code></em> and + <em class="parameter"><code>*commit</code></em> unmodified. The + <em class="parameter"><code>size</code></em> parameter is always a multiple of the chunk + size. The <em class="parameter"><code>alignment</code></em> parameter is always a power + of two at least as large as the chunk size. Zeroing is mandatory if + <em class="parameter"><code>*zero</code></em> is true upon function entry. Committing is + mandatory if <em class="parameter"><code>*commit</code></em> is true upon function entry. 
If <em class="parameter"><code>chunk</code></em> is not <code class="constant">NULL</code>, the + returned pointer must be <em class="parameter"><code>chunk</code></em> on success or + <code class="constant">NULL</code> on error. Committed memory may be committed + in absolute terms as on a system that does not overcommit, or in + implicit terms as on a system that overcommits and satisfies physical + memory needs on demand via soft page faults. Note that replacing the + default chunk allocation function makes the arena's <a class="link" href="#arena.i.dss"> + "<code class="mallctl">arena.<i>.dss</code>" + </a> + setting irrelevant.</p><div class="funcsynopsis"><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">typedef bool <b class="fsfunc">(chunk_dalloc_t)</b>(</code></td><td>void *<var class="pdparam">chunk</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>bool <var class="pdparam">committed</var>, </td></tr><tr><td> </td><td>unsigned <var class="pdparam">arena_ind</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div></div><div class="literallayout"><p></p></div><p> + A chunk deallocation function conforms to the + <span class="type">chunk_dalloc_t</span> type and deallocates a + <em class="parameter"><code>chunk</code></em> of given <em class="parameter"><code>size</code></em> with + <em class="parameter"><code>committed</code></em>/decommitted memory as indicated, on + behalf of arena <em class="parameter"><code>arena_ind</code></em>, returning false upon + success. 
If the function returns true, this indicates opt-out from + deallocation; the virtual memory mapping associated with the chunk + remains mapped, in the same commit state, and available for future use, + in which case it will be automatically retained for later reuse.</p><div class="funcsynopsis"><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">typedef bool <b class="fsfunc">(chunk_commit_t)</b>(</code></td><td>void *<var class="pdparam">chunk</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">offset</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">length</var>, </td></tr><tr><td> </td><td>unsigned <var class="pdparam">arena_ind</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div></div><div class="literallayout"><p></p></div><p>A chunk commit function conforms to the + <span class="type">chunk_commit_t</span> type and commits zeroed physical memory to + back pages within a <em class="parameter"><code>chunk</code></em> of given + <em class="parameter"><code>size</code></em> at <em class="parameter"><code>offset</code></em> bytes, + extending for <em class="parameter"><code>length</code></em> on behalf of arena + <em class="parameter"><code>arena_ind</code></em>, returning false upon success. + Committed memory may be committed in absolute terms as on a system that + does not overcommit, or in implicit terms as on a system that + overcommits and satisfies physical memory needs on demand via soft page + faults. 
If the function returns true, this indicates insufficient + physical memory to satisfy the request.</p><div class="funcsynopsis"><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">typedef bool <b class="fsfunc">(chunk_decommit_t)</b>(</code></td><td>void *<var class="pdparam">chunk</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">offset</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">length</var>, </td></tr><tr><td> </td><td>unsigned <var class="pdparam">arena_ind</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div></div><div class="literallayout"><p></p></div><p>A chunk decommit function conforms to the + <span class="type">chunk_decommit_t</span> type and decommits any physical memory + that is backing pages within a <em class="parameter"><code>chunk</code></em> of given + <em class="parameter"><code>size</code></em> at <em class="parameter"><code>offset</code></em> bytes, + extending for <em class="parameter"><code>length</code></em> on behalf of arena + <em class="parameter"><code>arena_ind</code></em>, returning false upon success, in which + case the pages will be committed via the chunk commit function before + being reused. 
If the function returns true, this indicates opt-out from + decommit; the memory remains committed and available for future use, in + which case it will be automatically retained for later reuse.</p><div class="funcsynopsis"><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">typedef bool <b class="fsfunc">(chunk_purge_t)</b>(</code></td><td>void *<var class="pdparam">chunk</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">offset</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">length</var>, </td></tr><tr><td> </td><td>unsigned <var class="pdparam">arena_ind</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div></div><div class="literallayout"><p></p></div><p>A chunk purge function conforms to the <span class="type">chunk_purge_t</span> + type and optionally discards physical pages within the virtual memory + mapping associated with <em class="parameter"><code>chunk</code></em> of given + <em class="parameter"><code>size</code></em> at <em class="parameter"><code>offset</code></em> bytes, + extending for <em class="parameter"><code>length</code></em> on behalf of arena + <em class="parameter"><code>arena_ind</code></em>, returning false if pages within the + purged virtual memory range will be zero-filled the next time they are + accessed.</p><div class="funcsynopsis"><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">typedef bool <b class="fsfunc">(chunk_split_t)</b>(</code></td><td>void *<var class="pdparam">chunk</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size_a</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size_b</var>, </td></tr><tr><td> </td><td>bool <var 
class="pdparam">committed</var>, </td></tr><tr><td> </td><td>unsigned <var class="pdparam">arena_ind</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div></div><div class="literallayout"><p></p></div><p>A chunk split function conforms to the <span class="type">chunk_split_t</span> + type and optionally splits <em class="parameter"><code>chunk</code></em> of given + <em class="parameter"><code>size</code></em> into two adjacent chunks, the first of + <em class="parameter"><code>size_a</code></em> bytes, and the second of + <em class="parameter"><code>size_b</code></em> bytes, operating on + <em class="parameter"><code>committed</code></em>/decommitted memory as indicated, on + behalf of arena <em class="parameter"><code>arena_ind</code></em>, returning false upon + success. If the function returns true, this indicates that the chunk + remains unsplit and therefore should continue to be operated on as a + whole.</p><div class="funcsynopsis"><table border="0" class="funcprototype-table" summary="Function synopsis" style="cellspacing: 0; cellpadding: 0;"><tr><td><code class="funcdef">typedef bool <b class="fsfunc">(chunk_merge_t)</b>(</code></td><td>void *<var class="pdparam">chunk_a</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size_a</var>, </td></tr><tr><td> </td><td>void *<var class="pdparam">chunk_b</var>, </td></tr><tr><td> </td><td>size_t <var class="pdparam">size_b</var>, </td></tr><tr><td> </td><td>bool <var class="pdparam">committed</var>, </td></tr><tr><td> </td><td>unsigned <var class="pdparam">arena_ind</var><code>)</code>;</td></tr></table><div class="funcprototype-spacer"> </div></div><div class="literallayout"><p></p></div><p>A chunk merge function conforms to the <span class="type">chunk_merge_t</span> + type and optionally merges adjacent chunks, + <em class="parameter"><code>chunk_a</code></em> of given <em class="parameter"><code>size_a</code></em> + and <em class="parameter"><code>chunk_b</code></em> of 
given + <em class="parameter"><code>size_b</code></em> into one contiguous chunk, operating on + <em class="parameter"><code>committed</code></em>/decommitted memory as indicated, on + behalf of arena <em class="parameter"><code>arena_ind</code></em>, returning false upon + success. If the function returns true, this indicates that the chunks + remain distinct mappings and therefore should continue to be operated on + independently.</p></dd><dt><a name="arenas.narenas"></a><span class="term"> "<code class="mallctl">arenas.narenas</code>" @@ -926,7 +1102,20 @@ malloc_conf = "xmalloc:true";</pre><p> "<code class="mallctl">arenas.narenas</code>" </a> booleans. Each boolean indicates whether the corresponding arena is - initialized.</p></dd><dt><a name="arenas.quantum"></a><span class="term"> + initialized.</p></dd><dt><a name="arenas.lg_dirty_mult"></a><span class="term"> + + "<code class="mallctl">arenas.lg_dirty_mult</code>" + + (<span class="type">ssize_t</span>) + <code class="literal">rw</code> + </span></dt><dd><p>Current default per-arena minimum ratio (log base 2) of + active to dirty pages, used to initialize <a class="link" href="#arena.i.lg_dirty_mult"> + "<code class="mallctl">arena.<i>.lg_dirty_mult</code>" + </a> + during arena creation. 
See <a class="link" href="#opt.lg_dirty_mult"> + "<code class="mallctl">opt.lg_dirty_mult</code>" + </a> + for additional information.</p></dd><dt><a name="arenas.quantum"></a><span class="term"> "<code class="mallctl">arenas.quantum</code>" @@ -981,7 +1170,7 @@ malloc_conf = "xmalloc:true";</pre><p> "<code class="mallctl">arenas.nlruns</code>" - (<span class="type">size_t</span>) + (<span class="type">unsigned</span>) <code class="literal">r-</code> </span></dt><dd><p>Total number of large size classes.</p></dd><dt><a name="arenas.lrun.i.size"></a><span class="term"> @@ -990,21 +1179,40 @@ malloc_conf = "xmalloc:true";</pre><p> (<span class="type">size_t</span>) <code class="literal">r-</code> </span></dt><dd><p>Maximum size supported by this large size - class.</p></dd><dt><a name="arenas.purge"></a><span class="term"> + class.</p></dd><dt><a name="arenas.nhchunks"></a><span class="term"> - "<code class="mallctl">arenas.purge</code>" + "<code class="mallctl">arenas.nhchunks</code>" (<span class="type">unsigned</span>) - <code class="literal">-w</code> - </span></dt><dd><p>Purge unused dirty pages for the specified arena, or - for all arenas if none is specified.</p></dd><dt><a name="arenas.extend"></a><span class="term"> + <code class="literal">r-</code> + </span></dt><dd><p>Total number of huge size classes.</p></dd><dt><a name="arenas.hchunk.i.size"></a><span class="term"> + + "<code class="mallctl">arenas.hchunk.<i>.size</code>" + + (<span class="type">size_t</span>) + <code class="literal">r-</code> + </span></dt><dd><p>Maximum size supported by this huge size + class.</p></dd><dt><a name="arenas.extend"></a><span class="term"> "<code class="mallctl">arenas.extend</code>" (<span class="type">unsigned</span>) <code class="literal">r-</code> </span></dt><dd><p>Extend the array of arenas by appending a new arena, - and returning the new arena index.</p></dd><dt><a name="prof.active"></a><span class="term"> + and returning the new arena index.</p></dd><dt><a 
name="prof.thread_active_init"></a><span class="term"> + + "<code class="mallctl">prof.thread_active_init</code>" + + (<span class="type">bool</span>) + <code class="literal">rw</code> + [<code class="option">--enable-prof</code>] + </span></dt><dd><p>Control the initial setting for <a class="link" href="#thread.prof.active"> + "<code class="mallctl">thread.prof.active</code>" + </a> + in newly created threads. See the <a class="link" href="#opt.prof_thread_active_init"> + "<code class="mallctl">opt.prof_thread_active_init</code>" + </a> + option for additional information.</p></dd><dt><a name="prof.active"></a><span class="term"> "<code class="mallctl">prof.active</code>" @@ -1015,8 +1223,10 @@ malloc_conf = "xmalloc:true";</pre><p> <a class="link" href="#opt.prof_active"> "<code class="mallctl">opt.prof_active</code>" </a> - option for additional information. - </p></dd><dt><a name="prof.dump"></a><span class="term"> + option for additional information, as well as the interrelated <a class="link" href="#thread.prof.active"> + "<code class="mallctl">thread.prof.active</code>" + </a> + mallctl.</p></dd><dt><a name="prof.dump"></a><span class="term"> "<code class="mallctl">prof.dump</code>" @@ -1030,7 +1240,45 @@ malloc_conf = "xmalloc:true";</pre><p> <a class="link" href="#opt.prof_prefix"> "<code class="mallctl">opt.prof_prefix</code>" </a> - option.</p></dd><dt><a name="prof.interval"></a><span class="term"> + option.</p></dd><dt><a name="prof.gdump"></a><span class="term"> + + "<code class="mallctl">prof.gdump</code>" + + (<span class="type">bool</span>) + <code class="literal">rw</code> + [<code class="option">--enable-prof</code>] + </span></dt><dd><p>When enabled, trigger a memory profile dump every time + the total virtual memory exceeds the previous maximum. 
Profiles are + dumped to files named according to the pattern + <code class="filename"><prefix>.<pid>.<seq>.u<useq>.heap</code>, + where <code class="literal"><prefix></code> is controlled by the <a class="link" href="#opt.prof_prefix"> + "<code class="mallctl">opt.prof_prefix</code>" + </a> + option.</p></dd><dt><a name="prof.reset"></a><span class="term"> + + "<code class="mallctl">prof.reset</code>" + + (<span class="type">size_t</span>) + <code class="literal">-w</code> + [<code class="option">--enable-prof</code>] + </span></dt><dd><p>Reset all memory profile statistics, and optionally + update the sample rate (see <a class="link" href="#opt.lg_prof_sample"> + "<code class="mallctl">opt.lg_prof_sample</code>" + </a> + and <a class="link" href="#prof.lg_sample"> + "<code class="mallctl">prof.lg_sample</code>" + </a>). + </p></dd><dt><a name="prof.lg_sample"></a><span class="term"> + + "<code class="mallctl">prof.lg_sample</code>" + + (<span class="type">size_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-prof</code>] + </span></dt><dd><p>Get the current sample rate (see <a class="link" href="#opt.lg_prof_sample"> + "<code class="mallctl">opt.lg_prof_sample</code>" + </a>). + </p></dd><dt><a name="prof.interval"></a><span class="term"> "<code class="mallctl">prof.interval</code>" @@ -1051,9 +1299,8 @@ malloc_conf = "xmalloc:true";</pre><p> [<code class="option">--enable-stats</code>] </span></dt><dd><p>Pointer to a counter that contains an approximate count of the current number of bytes in active pages. The estimate may be - high, but never low, because each arena rounds up to the nearest - multiple of the chunk size when computing its contribution to the - counter. Note that the <a class="link" href="#epoch"> + high, but never low, because each arena rounds up when computing its + contribution to the counter. 
Note that the <a class="link" href="#epoch"> "<code class="mallctl">epoch</code>" </a> mallctl has no bearing on this counter. Furthermore, counter consistency is maintained via @@ -1082,68 +1329,53 @@ malloc_conf = "xmalloc:true";</pre><p> This does not include <a class="link" href="#stats.arenas.i.pdirty"> "<code class="mallctl">stats.arenas.<i>.pdirty</code>" - </a> and pages - entirely devoted to allocator metadata.</p></dd><dt><a name="stats.mapped"></a><span class="term"> - - "<code class="mallctl">stats.mapped</code>" - - (<span class="type">size_t</span>) - <code class="literal">r-</code> - [<code class="option">--enable-stats</code>] - </span></dt><dd><p>Total number of bytes in chunks mapped on behalf of the - application. This is a multiple of the chunk size, and is at least as - large as <a class="link" href="#stats.active"> - "<code class="mallctl">stats.active</code>" - </a>. This - does not include inactive chunks.</p></dd><dt><a name="stats.chunks.current"></a><span class="term"> + </a>, nor pages + entirely devoted to allocator metadata.</p></dd><dt><a name="stats.metadata"></a><span class="term"> - "<code class="mallctl">stats.chunks.current</code>" + "<code class="mallctl">stats.metadata</code>" (<span class="type">size_t</span>) <code class="literal">r-</code> [<code class="option">--enable-stats</code>] - </span></dt><dd><p>Total number of chunks actively mapped on behalf of the - application. This does not include inactive chunks. 
- </p></dd><dt><a name="stats.chunks.total"></a><span class="term"> - - "<code class="mallctl">stats.chunks.total</code>" - - (<span class="type">uint64_t</span>) - <code class="literal">r-</code> - [<code class="option">--enable-stats</code>] - </span></dt><dd><p>Cumulative number of chunks allocated.</p></dd><dt><a name="stats.chunks.high"></a><span class="term"> + </span></dt><dd><p>Total number of bytes dedicated to metadata, which + comprise base allocations used for bootstrap-sensitive internal + allocator data structures, arena chunk headers (see <a class="link" href="#stats.arenas.i.metadata.mapped"> + "<code class="mallctl">stats.arenas.<i>.metadata.mapped</code>" + </a>), + and internal allocations (see <a class="link" href="#stats.arenas.i.metadata.allocated"> + "<code class="mallctl">stats.arenas.<i>.metadata.allocated</code>" + </a>).</p></dd><dt><a name="stats.resident"></a><span class="term"> - "<code class="mallctl">stats.chunks.high</code>" + "<code class="mallctl">stats.resident</code>" (<span class="type">size_t</span>) <code class="literal">r-</code> [<code class="option">--enable-stats</code>] - </span></dt><dd><p>Maximum number of active chunks at any time thus far. - </p></dd><dt><a name="stats.huge.allocated"></a><span class="term"> + </span></dt><dd><p>Maximum number of bytes in physically resident data + pages mapped by the allocator, comprising all pages dedicated to + allocator metadata, pages backing active allocations, and unused dirty + pages. This is a maximum rather than precise because pages may not + actually be physically resident if they correspond to demand-zeroed + virtual memory that has not yet been touched. 
This is a multiple of the + page size, and is larger than <a class="link" href="#stats.active"> + "<code class="mallctl">stats.active</code>" + </a>.</p></dd><dt><a name="stats.mapped"></a><span class="term"> - "<code class="mallctl">stats.huge.allocated</code>" + "<code class="mallctl">stats.mapped</code>" (<span class="type">size_t</span>) <code class="literal">r-</code> [<code class="option">--enable-stats</code>] - </span></dt><dd><p>Number of bytes currently allocated by huge objects. - </p></dd><dt><a name="stats.huge.nmalloc"></a><span class="term"> - - "<code class="mallctl">stats.huge.nmalloc</code>" - - (<span class="type">uint64_t</span>) - <code class="literal">r-</code> - [<code class="option">--enable-stats</code>] - </span></dt><dd><p>Cumulative number of huge allocation requests. - </p></dd><dt><a name="stats.huge.ndalloc"></a><span class="term"> - - "<code class="mallctl">stats.huge.ndalloc</code>" - - (<span class="type">uint64_t</span>) - <code class="literal">r-</code> - [<code class="option">--enable-stats</code>] - </span></dt><dd><p>Cumulative number of huge deallocation requests. - </p></dd><dt><a name="stats.arenas.i.dss"></a><span class="term"> + </span></dt><dd><p>Total number of bytes in active chunks mapped by the + allocator. This is a multiple of the chunk size, and is larger than + <a class="link" href="#stats.active"> + "<code class="mallctl">stats.active</code>" + </a>. + This does not include inactive chunks, even those that contain unused + dirty pages, which means that there is no strict ordering between this + and <a class="link" href="#stats.resident"> + "<code class="mallctl">stats.resident</code>" + </a>.</p></dd><dt><a name="stats.arenas.i.dss"></a><span class="term"> "<code class="mallctl">stats.arenas.<i>.dss</code>" @@ -1153,7 +1385,17 @@ malloc_conf = "xmalloc:true";</pre><p> related to <span class="citerefentry"><span class="refentrytitle">mmap</span>(2)</span> allocation. 
See <a class="link" href="#opt.dss"> "<code class="mallctl">opt.dss</code>" </a> for details. - </p></dd><dt><a name="stats.arenas.i.nthreads"></a><span class="term"> + </p></dd><dt><a name="stats.arenas.i.lg_dirty_mult"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.lg_dirty_mult</code>" + + (<span class="type">ssize_t</span>) + <code class="literal">r-</code> + </span></dt><dd><p>Minimum ratio (log base 2) of active to dirty pages. + See <a class="link" href="#opt.lg_dirty_mult"> + "<code class="mallctl">opt.lg_dirty_mult</code>" + </a> + for details.</p></dd><dt><a name="stats.arenas.i.nthreads"></a><span class="term"> "<code class="mallctl">stats.arenas.<i>.nthreads</code>" @@ -1182,7 +1424,38 @@ malloc_conf = "xmalloc:true";</pre><p> (<span class="type">size_t</span>) <code class="literal">r-</code> [<code class="option">--enable-stats</code>] - </span></dt><dd><p>Number of mapped bytes.</p></dd><dt><a name="stats.arenas.i.npurge"></a><span class="term"> + </span></dt><dd><p>Number of mapped bytes.</p></dd><dt><a name="stats.arenas.i.metadata.mapped"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.metadata.mapped</code>" + + (<span class="type">size_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-stats</code>] + </span></dt><dd><p>Number of mapped bytes in arena chunk headers, which + track the states of the non-metadata pages.</p></dd><dt><a name="stats.arenas.i.metadata.allocated"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.metadata.allocated</code>" + + (<span class="type">size_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-stats</code>] + </span></dt><dd><p>Number of bytes dedicated to internal allocations. + Internal allocations differ from application-originated allocations in + that they are for internal use, and that they are omitted from heap + profiles. 
This statistic is reported separately from <a class="link" href="#stats.metadata"> + "<code class="mallctl">stats.metadata</code>" + </a> and + <a class="link" href="#stats.arenas.i.metadata.mapped"> + "<code class="mallctl">stats.arenas.<i>.metadata.mapped</code>" + </a> + because it overlaps with e.g. the <a class="link" href="#stats.allocated"> + "<code class="mallctl">stats.allocated</code>" + </a> and + <a class="link" href="#stats.active"> + "<code class="mallctl">stats.active</code>" + </a> + statistics, whereas the other metadata statistics do + not.</p></dd><dt><a name="stats.arenas.i.npurge"></a><span class="term"> "<code class="mallctl">stats.arenas.<i>.npurge</code>" @@ -1270,15 +1543,39 @@ malloc_conf = "xmalloc:true";</pre><p> <code class="literal">r-</code> [<code class="option">--enable-stats</code>] </span></dt><dd><p>Cumulative number of large allocation requests. - </p></dd><dt><a name="stats.arenas.i.bins.j.allocated"></a><span class="term"> + </p></dd><dt><a name="stats.arenas.i.huge.allocated"></a><span class="term"> - "<code class="mallctl">stats.arenas.<i>.bins.<j>.allocated</code>" + "<code class="mallctl">stats.arenas.<i>.huge.allocated</code>" (<span class="type">size_t</span>) <code class="literal">r-</code> [<code class="option">--enable-stats</code>] - </span></dt><dd><p>Current number of bytes allocated by - bin.</p></dd><dt><a name="stats.arenas.i.bins.j.nmalloc"></a><span class="term"> + </span></dt><dd><p>Number of bytes currently allocated by huge objects. 
+ </p></dd><dt><a name="stats.arenas.i.huge.nmalloc"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.huge.nmalloc</code>" + + (<span class="type">uint64_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-stats</code>] + </span></dt><dd><p>Cumulative number of huge allocation requests served + directly by the arena.</p></dd><dt><a name="stats.arenas.i.huge.ndalloc"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.huge.ndalloc</code>" + + (<span class="type">uint64_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-stats</code>] + </span></dt><dd><p>Cumulative number of huge deallocation requests served + directly by the arena.</p></dd><dt><a name="stats.arenas.i.huge.nrequests"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.huge.nrequests</code>" + + (<span class="type">uint64_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-stats</code>] + </span></dt><dd><p>Cumulative number of huge allocation requests. 
+ </p></dd><dt><a name="stats.arenas.i.bins.j.nmalloc"></a><span class="term"> "<code class="mallctl">stats.arenas.<i>.bins.<j>.nmalloc</code>" @@ -1302,7 +1599,15 @@ malloc_conf = "xmalloc:true";</pre><p> <code class="literal">r-</code> [<code class="option">--enable-stats</code>] </span></dt><dd><p>Cumulative number of allocation - requests.</p></dd><dt><a name="stats.arenas.i.bins.j.nfills"></a><span class="term"> + requests.</p></dd><dt><a name="stats.arenas.i.bins.j.curregs"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.bins.<j>.curregs</code>" + + (<span class="type">size_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-stats</code>] + </span></dt><dd><p>Current number of regions for this size + class.</p></dd><dt><a name="stats.arenas.i.bins.j.nfills"></a><span class="term"> "<code class="mallctl">stats.arenas.<i>.bins.<j>.nfills</code>" @@ -1370,6 +1675,38 @@ malloc_conf = "xmalloc:true";</pre><p> <code class="literal">r-</code> [<code class="option">--enable-stats</code>] </span></dt><dd><p>Current number of runs for this size class. 
+ </p></dd><dt><a name="stats.arenas.i.hchunks.j.nmalloc"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.hchunks.<j>.nmalloc</code>" + + (<span class="type">uint64_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-stats</code>] + </span></dt><dd><p>Cumulative number of allocation requests for this size + class served directly by the arena.</p></dd><dt><a name="stats.arenas.i.hchunks.j.ndalloc"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.hchunks.<j>.ndalloc</code>" + + (<span class="type">uint64_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-stats</code>] + </span></dt><dd><p>Cumulative number of deallocation requests for this + size class served directly by the arena.</p></dd><dt><a name="stats.arenas.i.hchunks.j.nrequests"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.hchunks.<j>.nrequests</code>" + + (<span class="type">uint64_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-stats</code>] + </span></dt><dd><p>Cumulative number of allocation requests for this size + class.</p></dd><dt><a name="stats.arenas.i.hchunks.j.curhchunks"></a><span class="term"> + + "<code class="mallctl">stats.arenas.<i>.hchunks.<j>.curhchunks</code>" + + (<span class="type">size_t</span>) + <code class="literal">r-</code> + [<code class="option">--enable-stats</code>] + </span></dt><dd><p>Current number of huge allocations for this size class. 
</p></dd></dl></div></div><div class="refsect1"><a name="debugging_malloc_problems"></a><h2>DEBUGGING MALLOC PROBLEMS</h2><p>When debugging, it is a good idea to configure/build jemalloc with the <code class="option">--enable-debug</code> and <code class="option">--enable-fill</code> options, and recompile the program with suitable options and symbols for @@ -1406,7 +1743,7 @@ malloc_conf = "xmalloc:true";</pre><p> <code class="function">malloc_stats_print</code>(<em class="parameter"><code></code></em>), followed by a string pointer. Please note that doing anything which tries to allocate memory in this function is likely to result in a crash or deadlock.</p><p>All messages are prefixed by - “<code class="computeroutput"><jemalloc>: </code>”.</p></div><div class="refsect1"><a name="return_values"></a><h2>RETURN VALUES</h2><div class="refsect2"><a name="idm316388028784"></a><h3>Standard API</h3><p>The <code class="function">malloc</code>(<em class="parameter"><code></code></em>) and + “<code class="computeroutput"><jemalloc>: </code>”.</p></div><div class="refsect1"><a name="return_values"></a><h2>RETURN VALUES</h2><div class="refsect2"><a name="idp46949776"></a><h3>Standard API</h3><p>The <code class="function">malloc</code>(<em class="parameter"><code></code></em>) and <code class="function">calloc</code>(<em class="parameter"><code></code></em>) functions return a pointer to the allocated memory if successful; otherwise a <code class="constant">NULL</code> pointer is returned and <code class="varname">errno</code> is set to @@ -1434,7 +1771,7 @@ malloc_conf = "xmalloc:true";</pre><p> allocation failure. The <code class="function">realloc</code>(<em class="parameter"><code></code></em>) function always leaves the original buffer intact when an error occurs. 
</p><p>The <code class="function">free</code>(<em class="parameter"><code></code></em>) function returns no - value.</p></div><div class="refsect2"><a name="idm316388003104"></a><h3>Non-standard API</h3><p>The <code class="function">mallocx</code>(<em class="parameter"><code></code></em>) and + value.</p></div><div class="refsect2"><a name="idp46974576"></a><h3>Non-standard API</h3><p>The <code class="function">mallocx</code>(<em class="parameter"><code></code></em>) and <code class="function">rallocx</code>(<em class="parameter"><code></code></em>) functions return a pointer to the allocated memory if successful; otherwise a <code class="constant">NULL</code> pointer is returned to indicate insufficient contiguous memory was @@ -1465,27 +1802,7 @@ malloc_conf = "xmalloc:true";</pre><p> read/write processing.</p></dd></dl></div><p> </p><p>The <code class="function">malloc_usable_size</code>(<em class="parameter"><code></code></em>) function returns the usable size of the allocation pointed to by - <em class="parameter"><code>ptr</code></em>. </p></div><div class="refsect2"><a name="idm316387973360"></a><h3>Experimental API</h3><p>The <code class="function">allocm</code>(<em class="parameter"><code></code></em>), - <code class="function">rallocm</code>(<em class="parameter"><code></code></em>), - <code class="function">sallocm</code>(<em class="parameter"><code></code></em>), - <code class="function">dallocm</code>(<em class="parameter"><code></code></em>), and - <code class="function">nallocm</code>(<em class="parameter"><code></code></em>) functions return - <code class="constant">ALLOCM_SUCCESS</code> on success; otherwise they return an - error value. 
The <code class="function">allocm</code>(<em class="parameter"><code></code></em>), - <code class="function">rallocm</code>(<em class="parameter"><code></code></em>), and - <code class="function">nallocm</code>(<em class="parameter"><code></code></em>) functions will fail if: - </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><span class="errorname">ALLOCM_ERR_OOM</span></span></dt><dd><p>Out of memory. Insufficient contiguous memory was - available to service the allocation request. The - <code class="function">allocm</code>(<em class="parameter"><code></code></em>) function additionally sets - <em class="parameter"><code>*ptr</code></em> to <code class="constant">NULL</code>, whereas - the <code class="function">rallocm</code>(<em class="parameter"><code></code></em>) function leaves - <code class="constant">*ptr</code> unmodified.</p></dd></dl></div><p> - The <code class="function">rallocm</code>(<em class="parameter"><code></code></em>) function will also - fail if: - </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><span class="errorname">ALLOCM_ERR_NOT_MOVED</span></span></dt><dd><p><code class="constant">ALLOCM_NO_MOVE</code> was specified, - but the reallocation request could not be serviced without moving - the object.</p></dd></dl></div><p> - </p></div></div><div class="refsect1"><a name="environment"></a><h2>ENVIRONMENT</h2><p>The following environment variable affects the execution of the + <em class="parameter"><code>ptr</code></em>. 
</p></div></div><div class="refsect1"><a name="environment"></a><h2>ENVIRONMENT</h2><p>The following environment variable affects the execution of the allocation functions: </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><code class="envar">MALLOC_CONF</code></span></dt><dd><p>If the environment variable <code class="envar">MALLOC_CONF</code> is set, the characters it contains diff --git a/deps/jemalloc/doc/jemalloc.xml.in b/deps/jemalloc/doc/jemalloc.xml.in index d8e2e711f..8fc774b18 100644 --- a/deps/jemalloc/doc/jemalloc.xml.in +++ b/deps/jemalloc/doc/jemalloc.xml.in @@ -38,17 +38,13 @@ <refname>xallocx</refname> <refname>sallocx</refname> <refname>dallocx</refname> + <refname>sdallocx</refname> <refname>nallocx</refname> <refname>mallctl</refname> <refname>mallctlnametomib</refname> <refname>mallctlbymib</refname> <refname>malloc_stats_print</refname> <refname>malloc_usable_size</refname> - <refname>allocm</refname> - <refname>rallocm</refname> - <refname>sallocm</refname> - <refname>dallocm</refname> - <refname>nallocm</refname> --> <refpurpose>general purpose memory allocation functions</refpurpose> </refnamediv> @@ -61,8 +57,7 @@ <refsynopsisdiv> <title>SYNOPSIS</title> <funcsynopsis> - <funcsynopsisinfo>#include <<filename class="headerfile">stdlib.h</filename>> -#include <<filename class="headerfile">jemalloc/jemalloc.h</filename>></funcsynopsisinfo> + <funcsynopsisinfo>#include <<filename class="headerfile">jemalloc/jemalloc.h</filename>></funcsynopsisinfo> <refsect2> <title>Standard API</title> <funcprototype> @@ -126,6 +121,12 @@ <paramdef>int <parameter>flags</parameter></paramdef> </funcprototype> <funcprototype> + <funcdef>void <function>sdallocx</function></funcdef> + <paramdef>void *<parameter>ptr</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>int <parameter>flags</parameter></paramdef> + </funcprototype> + <funcprototype> <funcdef>size_t 
<function>nallocx</function></funcdef> <paramdef>size_t <parameter>size</parameter></paramdef> <paramdef>int <parameter>flags</parameter></paramdef> @@ -172,41 +173,6 @@ </funcprototype> <para><type>const char *</type><varname>malloc_conf</varname>;</para> </refsect2> - <refsect2> - <title>Experimental API</title> - <funcprototype> - <funcdef>int <function>allocm</function></funcdef> - <paramdef>void **<parameter>ptr</parameter></paramdef> - <paramdef>size_t *<parameter>rsize</parameter></paramdef> - <paramdef>size_t <parameter>size</parameter></paramdef> - <paramdef>int <parameter>flags</parameter></paramdef> - </funcprototype> - <funcprototype> - <funcdef>int <function>rallocm</function></funcdef> - <paramdef>void **<parameter>ptr</parameter></paramdef> - <paramdef>size_t *<parameter>rsize</parameter></paramdef> - <paramdef>size_t <parameter>size</parameter></paramdef> - <paramdef>size_t <parameter>extra</parameter></paramdef> - <paramdef>int <parameter>flags</parameter></paramdef> - </funcprototype> - <funcprototype> - <funcdef>int <function>sallocm</function></funcdef> - <paramdef>const void *<parameter>ptr</parameter></paramdef> - <paramdef>size_t *<parameter>rsize</parameter></paramdef> - <paramdef>int <parameter>flags</parameter></paramdef> - </funcprototype> - <funcprototype> - <funcdef>int <function>dallocm</function></funcdef> - <paramdef>void *<parameter>ptr</parameter></paramdef> - <paramdef>int <parameter>flags</parameter></paramdef> - </funcprototype> - <funcprototype> - <funcdef>int <function>nallocm</function></funcdef> - <paramdef>size_t *<parameter>rsize</parameter></paramdef> - <paramdef>size_t <parameter>size</parameter></paramdef> - <paramdef>int <parameter>flags</parameter></paramdef> - </funcprototype> - </refsect2> </funcsynopsis> </refsynopsisdiv> <refsect1 id="description"> @@ -229,15 +195,15 @@ <para>The <function>posix_memalign<parameter/></function> function allocates <parameter>size</parameter> bytes of memory such that the - 
allocation's base address is an even multiple of + allocation's base address is a multiple of <parameter>alignment</parameter>, and returns the allocation in the value pointed to by <parameter>ptr</parameter>. The requested - <parameter>alignment</parameter> must be a power of 2 at least as large - as <code language="C">sizeof(<type>void *</type>)</code>.</para> + <parameter>alignment</parameter> must be a power of 2 at least as large as + <code language="C">sizeof(<type>void *</type>)</code>.</para> <para>The <function>aligned_alloc<parameter/></function> function allocates <parameter>size</parameter> bytes of memory such that the - allocation's base address is an even multiple of + allocation's base address is a multiple of <parameter>alignment</parameter>. The requested <parameter>alignment</parameter> must be a power of 2. Behavior is undefined if <parameter>size</parameter> is not an integral multiple of @@ -268,14 +234,15 @@ <function>rallocx<parameter/></function>, <function>xallocx<parameter/></function>, <function>sallocx<parameter/></function>, - <function>dallocx<parameter/></function>, and + <function>dallocx<parameter/></function>, + <function>sdallocx<parameter/></function>, and <function>nallocx<parameter/></function> functions all have a <parameter>flags</parameter> argument that can be used to specify options. The functions only check the options that are contextually relevant. Use bitwise or (<code language="C">|</code>) operations to specify one or more of the following: <variablelist> - <varlistentry> + <varlistentry id="MALLOCX_LG_ALIGN"> <term><constant>MALLOCX_LG_ALIGN(<parameter>la</parameter>) </constant></term> @@ -285,7 +252,7 @@ that <parameter>la</parameter> is within the valid range.</para></listitem> </varlistentry> - <varlistentry> + <varlistentry id="MALLOCX_ALIGN"> <term><constant>MALLOCX_ALIGN(<parameter>a</parameter>) </constant></term> @@ -295,7 +262,7 @@ validate that <parameter>a</parameter> is a power of 2. 
</para></listitem> </varlistentry> - <varlistentry> + <varlistentry id="MALLOCX_ZERO"> <term><constant>MALLOCX_ZERO</constant></term> <listitem><para>Initialize newly allocated memory to contain zero @@ -304,16 +271,38 @@ that are initialized to contain zero bytes. If this macro is absent, newly allocated memory is uninitialized.</para></listitem> </varlistentry> - <varlistentry> + <varlistentry id="MALLOCX_TCACHE"> + <term><constant>MALLOCX_TCACHE(<parameter>tc</parameter>) + </constant></term> + + <listitem><para>Use the thread-specific cache (tcache) specified by + the identifier <parameter>tc</parameter>, which must have been + acquired via the <link + linkend="tcache.create"><mallctl>tcache.create</mallctl></link> + mallctl. This macro does not validate that + <parameter>tc</parameter> specifies a valid + identifier.</para></listitem> + </varlistentry> + <varlistentry id="MALLOC_TCACHE_NONE"> + <term><constant>MALLOCX_TCACHE_NONE</constant></term> + + <listitem><para>Do not use a thread-specific cache (tcache). Unless + <constant>MALLOCX_TCACHE(<parameter>tc</parameter>)</constant> or + <constant>MALLOCX_TCACHE_NONE</constant> is specified, an + automatically managed tcache will be used under many circumstances. + This macro cannot be used in the same <parameter>flags</parameter> + argument as + <constant>MALLOCX_TCACHE(<parameter>tc</parameter>)</constant>.</para></listitem> + </varlistentry> + <varlistentry id="MALLOCX_ARENA"> <term><constant>MALLOCX_ARENA(<parameter>a</parameter>) </constant></term> <listitem><para>Use the arena specified by the index - <parameter>a</parameter> (and by necessity bypass the thread - cache). This macro has no effect for huge regions, nor for regions - that were allocated via an arena other than the one specified. - This macro does not validate that <parameter>a</parameter> - specifies an arena index in the valid range.</para></listitem> + <parameter>a</parameter>. 
This macro has no effect for regions that + were allocated via an arena other than the one specified. This + macro does not validate that <parameter>a</parameter> specifies an + arena index in the valid range.</para></listitem> </varlistentry> </variablelist> </para> @@ -352,6 +341,15 @@ memory referenced by <parameter>ptr</parameter> to be made available for future allocations.</para> + <para>The <function>sdallocx<parameter/></function> function is an + extension of <function>dallocx<parameter/></function> with a + <parameter>size</parameter> parameter to allow the caller to pass in the + allocation size as an optimization. The minimum valid input size is the + original requested size of the allocation, and the maximum valid input + size is the corresponding value returned by + <function>nallocx<parameter/></function> or + <function>sallocx<parameter/></function>.</para> + <para>The <function>nallocx<parameter/></function> function allocates no memory, but it performs the same size computation as the <function>mallocx<parameter/></function> function, and returns the real @@ -430,11 +428,12 @@ for (i = 0; i < nbins; i++) { functions simultaneously. If <option>--enable-stats</option> is specified during configuration, “m” and “a” can be specified to omit merged arena and per arena statistics, respectively; - “b” and “l” can be specified to omit per size - class statistics for bins and large objects, respectively. Unrecognized - characters are silently ignored. Note that thread caching may prevent - some statistics from being completely up to date, since extra locking - would be required to merge counters that track thread cache operations. + “b”, “l”, and “h” can be specified to + omit per size class statistics for bins, large objects, and huge objects, + respectively. Unrecognized characters are silently ignored. 
Note that + thread caching may prevent some statistics from being completely up to + date, since extra locking would be required to merge counters that track + thread cache operations. </para> <para>The <function>malloc_usable_size<parameter/></function> function @@ -449,116 +448,6 @@ for (i = 0; i < nbins; i++) { depended on, since such behavior is entirely implementation-dependent. </para> </refsect2> - <refsect2> - <title>Experimental API</title> - <para>The experimental API is subject to change or removal without regard - for backward compatibility. If <option>--disable-experimental</option> - is specified during configuration, the experimental API is - omitted.</para> - - <para>The <function>allocm<parameter/></function>, - <function>rallocm<parameter/></function>, - <function>sallocm<parameter/></function>, - <function>dallocm<parameter/></function>, and - <function>nallocm<parameter/></function> functions all have a - <parameter>flags</parameter> argument that can be used to specify - options. The functions only check the options that are contextually - relevant. Use bitwise or (<code language="C">|</code>) operations to - specify one or more of the following: - <variablelist> - <varlistentry> - <term><constant>ALLOCM_LG_ALIGN(<parameter>la</parameter>) - </constant></term> - - <listitem><para>Align the memory allocation to start at an address - that is a multiple of <code language="C">(1 << - <parameter>la</parameter>)</code>. This macro does not validate - that <parameter>la</parameter> is within the valid - range.</para></listitem> - </varlistentry> - <varlistentry> - <term><constant>ALLOCM_ALIGN(<parameter>a</parameter>) - </constant></term> - - <listitem><para>Align the memory allocation to start at an address - that is a multiple of <parameter>a</parameter>, where - <parameter>a</parameter> is a power of two. This macro does not - validate that <parameter>a</parameter> is a power of 2. 
- </para></listitem> - </varlistentry> - <varlistentry> - <term><constant>ALLOCM_ZERO</constant></term> - - <listitem><para>Initialize newly allocated memory to contain zero - bytes. In the growing reallocation case, the real size prior to - reallocation defines the boundary between untouched bytes and those - that are initialized to contain zero bytes. If this macro is - absent, newly allocated memory is uninitialized.</para></listitem> - </varlistentry> - <varlistentry> - <term><constant>ALLOCM_NO_MOVE</constant></term> - - <listitem><para>For reallocation, fail rather than moving the - object. This constraint can apply to both growth and - shrinkage.</para></listitem> - </varlistentry> - <varlistentry> - <term><constant>ALLOCM_ARENA(<parameter>a</parameter>) - </constant></term> - - <listitem><para>Use the arena specified by the index - <parameter>a</parameter> (and by necessity bypass the thread - cache). This macro has no effect for huge regions, nor for regions - that were allocated via an arena other than the one specified. - This macro does not validate that <parameter>a</parameter> - specifies an arena index in the valid range.</para></listitem> - </varlistentry> - </variablelist> - </para> - - <para>The <function>allocm<parameter/></function> function allocates at - least <parameter>size</parameter> bytes of memory, sets - <parameter>*ptr</parameter> to the base address of the allocation, and - sets <parameter>*rsize</parameter> to the real size of the allocation if - <parameter>rsize</parameter> is not <constant>NULL</constant>. 
Behavior - is undefined if <parameter>size</parameter> is <constant>0</constant>, or - if request size overflows due to size class and/or alignment - constraints.</para> - - <para>The <function>rallocm<parameter/></function> function resizes the - allocation at <parameter>*ptr</parameter> to be at least - <parameter>size</parameter> bytes, sets <parameter>*ptr</parameter> to - the base address of the allocation if it moved, and sets - <parameter>*rsize</parameter> to the real size of the allocation if - <parameter>rsize</parameter> is not <constant>NULL</constant>. If - <parameter>extra</parameter> is non-zero, an attempt is made to resize - the allocation to be at least <code - language="C">(<parameter>size</parameter> + - <parameter>extra</parameter>)</code> bytes, though inability to allocate - the extra byte(s) will not by itself result in failure. Behavior is - undefined if <parameter>size</parameter> is <constant>0</constant>, if - request size overflows due to size class and/or alignment constraints, or - if <code language="C">(<parameter>size</parameter> + - <parameter>extra</parameter> > - <constant>SIZE_T_MAX</constant>)</code>.</para> - - <para>The <function>sallocm<parameter/></function> function sets - <parameter>*rsize</parameter> to the real size of the allocation.</para> - - <para>The <function>dallocm<parameter/></function> function causes the - memory referenced by <parameter>ptr</parameter> to be made available for - future allocations.</para> - - <para>The <function>nallocm<parameter/></function> function allocates no - memory, but it performs the same size computation as the - <function>allocm<parameter/></function> function, and if - <parameter>rsize</parameter> is not <constant>NULL</constant> it sets - <parameter>*rsize</parameter> to the real size of the allocation that - would result from the equivalent <function>allocm<parameter/></function> - function call. 
Behavior is undefined if <parameter>size</parameter> is - <constant>0</constant>, or if request size overflows due to size class - and/or alignment constraints.</para> - </refsect2> </refsect1> <refsect1 id="tuning"> <title>TUNING</title> @@ -598,8 +487,10 @@ for (i = 0; i < nbins; i++) { <manvolnum>2</manvolnum></citerefentry> to obtain memory, which is suboptimal for several reasons, including race conditions, increased fragmentation, and artificial limitations on maximum usable memory. If - <option>--enable-dss</option> is specified during configuration, this - allocator uses both <citerefentry><refentrytitle>mmap</refentrytitle> + <citerefentry><refentrytitle>sbrk</refentrytitle> + <manvolnum>2</manvolnum></citerefentry> is supported by the operating + system, this allocator uses both + <citerefentry><refentrytitle>mmap</refentrytitle> <manvolnum>2</manvolnum></citerefentry> and <citerefentry><refentrytitle>sbrk</refentrytitle> <manvolnum>2</manvolnum></citerefentry>, in that order of preference; @@ -632,12 +523,11 @@ for (i = 0; i < nbins; i++) { possible to find metadata for user objects very quickly.</para> <para>User objects are broken into three categories according to size: - small, large, and huge. Small objects are smaller than one page. Large - objects are smaller than the chunk size. Huge objects are a multiple of - the chunk size. Small and large objects are managed by arenas; huge - objects are managed separately in a single data structure that is shared by - all threads. Huge objects are used by applications infrequently enough - that this single data structure is not a scalability issue.</para> + small, large, and huge. Small and large objects are managed entirely by + arenas; huge objects are additionally aggregated in a single data structure + that is shared by all threads. 
Huge objects are typically used by + applications infrequently enough that this single data structure is not a + scalability issue.</para> <para>Each chunk that is managed by an arena tracks its contents as runs of contiguous pages (unused, backing a set of small objects, or backing one @@ -646,18 +536,18 @@ for (i = 0; i < nbins; i++) { allocations in constant time.</para> <para>Small objects are managed in groups by page runs. Each run maintains - a frontier and free list to track which regions are in use. Allocation - requests that are no more than half the quantum (8 or 16, depending on - architecture) are rounded up to the nearest power of two that is at least - <code language="C">sizeof(<type>double</type>)</code>. All other small - object size classes are multiples of the quantum, spaced such that internal - fragmentation is limited to approximately 25% for all but the smallest size - classes. Allocation requests that are larger than the maximum small size - class, but small enough to fit in an arena-managed chunk (see the <link - linkend="opt.lg_chunk"><mallctl>opt.lg_chunk</mallctl></link> option), are - rounded up to the nearest run size. Allocation requests that are too large - to fit in an arena-managed chunk are rounded up to the nearest multiple of - the chunk size.</para> + a bitmap to track which regions are in use. Allocation requests that are no + more than half the quantum (8 or 16, depending on architecture) are rounded + up to the nearest power of two that is at least <code + language="C">sizeof(<type>double</type>)</code>. All other object size + classes are multiples of the quantum, spaced such that there are four size + classes for each doubling in size, which limits internal fragmentation to + approximately 20% for all but the smallest size classes. 
Small size classes + are smaller than four times the page size, large size classes are smaller + than the chunk size (see the <link + linkend="opt.lg_chunk"><mallctl>opt.lg_chunk</mallctl></link> option), and + huge size classes extend from the chunk size up to one size class less than + the full address space size.</para> <para>Allocations are packed tightly together, which can be an issue for multi-threaded applications. If you need to assure that allocations do not @@ -665,8 +555,29 @@ for (i = 0; i < nbins; i++) { nearest multiple of the cacheline size, or specify cacheline alignment when allocating.</para> - <para>Assuming 4 MiB chunks, 4 KiB pages, and a 16-byte quantum on a 64-bit - system, the size classes in each category are as shown in <xref + <para>The <function>realloc<parameter/></function>, + <function>rallocx<parameter/></function>, and + <function>xallocx<parameter/></function> functions may resize allocations + without moving them under limited circumstances. Unlike the + <function>*allocx<parameter/></function> API, the standard API does not + officially round up the usable size of an allocation to the nearest size + class, so technically it is necessary to call + <function>realloc<parameter/></function> to grow e.g. a 9-byte allocation to + 16 bytes, or shrink a 16-byte allocation to 9 bytes. Growth and shrinkage + trivially succeed in place as long as the pre-size and post-size both round + up to the same size class. No other API guarantees are made regarding + in-place resizing, but the current implementation also tries to resize large + and huge allocations in place, as long as the pre-size and post-size are + both large or both huge. In such cases shrinkage always succeeds for large + size classes, but for huge size classes the chunk allocator must support + splitting (see <link + linkend="arena.i.chunk_hooks"><mallctl>arena.<i>.chunk_hooks</mallctl></link>).
+ Growth only succeeds if the trailing memory is currently available, and + additionally for huge size classes the chunk allocator must support + merging.</para> + + <para>Assuming 2 MiB chunks, 4 KiB pages, and a 16-byte quantum on a + 64-bit system, the size classes in each category are as shown in <xref linkend="size_classes" xrefstyle="template:Table %n"/>.</para> <table xml:id="size_classes" frame="all"> @@ -684,13 +595,13 @@ for (i = 0; i < nbins; i++) { </thead> <tbody> <row> - <entry morerows="6">Small</entry> + <entry morerows="8">Small</entry> <entry>lg</entry> <entry>[8]</entry> </row> <row> <entry>16</entry> - <entry>[16, 32, 48, ..., 128]</entry> + <entry>[16, 32, 48, 64, 80, 96, 112, 128]</entry> </row> <row> <entry>32</entry> @@ -710,17 +621,77 @@ for (i = 0; i < nbins; i++) { </row> <row> <entry>512</entry> - <entry>[2560, 3072, 3584]</entry> + <entry>[2560, 3072, 3584, 4096]</entry> + </row> + <row> + <entry>1 KiB</entry> + <entry>[5 KiB, 6 KiB, 7 KiB, 8 KiB]</entry> + </row> + <row> + <entry>2 KiB</entry> + <entry>[10 KiB, 12 KiB, 14 KiB]</entry> + </row> + <row> + <entry morerows="7">Large</entry> + <entry>2 KiB</entry> + <entry>[16 KiB]</entry> </row> <row> - <entry>Large</entry> <entry>4 KiB</entry> - <entry>[4 KiB, 8 KiB, 12 KiB, ..., 4072 KiB]</entry> + <entry>[20 KiB, 24 KiB, 28 KiB, 32 KiB]</entry> + </row> + <row> + <entry>8 KiB</entry> + <entry>[40 KiB, 48 KiB, 56 KiB, 64 KiB]</entry> + </row> + <row> + <entry>16 KiB</entry> + <entry>[80 KiB, 96 KiB, 112 KiB, 128 KiB]</entry> + </row> + <row> + <entry>32 KiB</entry> + <entry>[160 KiB, 192 KiB, 224 KiB, 256 KiB]</entry> + </row> + <row> + <entry>64 KiB</entry> + <entry>[320 KiB, 384 KiB, 448 KiB, 512 KiB]</entry> + </row> + <row> + <entry>128 KiB</entry> + <entry>[640 KiB, 768 KiB, 896 KiB, 1 MiB]</entry> + </row> + <row> + <entry>256 KiB</entry> + <entry>[1280 KiB, 1536 KiB, 1792 KiB]</entry> + </row> + <row> + <entry morerows="6">Huge</entry> + <entry>256 KiB</entry> + <entry>[2
MiB]</entry> + </row> + <row> + <entry>512 KiB</entry> + <entry>[2560 KiB, 3 MiB, 3584 KiB, 4 MiB]</entry> + </row> + <row> + <entry>1 MiB</entry> + <entry>[5 MiB, 6 MiB, 7 MiB, 8 MiB]</entry> + </row> + <row> + <entry>2 MiB</entry> + <entry>[10 MiB, 12 MiB, 14 MiB, 16 MiB]</entry> </row> <row> - <entry>Huge</entry> <entry>4 MiB</entry> - <entry>[4 MiB, 8 MiB, 12 MiB, ...]</entry> + <entry>[20 MiB, 24 MiB, 28 MiB, 32 MiB]</entry> + </row> + <row> + <entry>8 MiB</entry> + <entry>[40 MiB, 48 MiB, 56 MiB, 64 MiB]</entry> + </row> + <row> + <entry>...</entry> + <entry>...</entry> </row> </tbody> </tgroup> @@ -765,23 +736,23 @@ for (i = 0; i < nbins; i++) { detecting whether another thread caused a refresh.</para></listitem> </varlistentry> - <varlistentry id="config.debug"> + <varlistentry id="config.cache_oblivious"> <term> - <mallctl>config.debug</mallctl> + <mallctl>config.cache_oblivious</mallctl> (<type>bool</type>) <literal>r-</literal> </term> - <listitem><para><option>--enable-debug</option> was specified during - build configuration.</para></listitem> + <listitem><para><option>--enable-cache-oblivious</option> was specified + during build configuration.</para></listitem> </varlistentry> - <varlistentry id="config.dss"> + <varlistentry id="config.debug"> <term> - <mallctl>config.dss</mallctl> + <mallctl>config.debug</mallctl> (<type>bool</type>) <literal>r-</literal> </term> - <listitem><para><option>--enable-dss</option> was specified during + <listitem><para><option>--enable-debug</option> was specified during build configuration.</para></listitem> </varlistentry> @@ -805,16 +776,6 @@ for (i = 0; i < nbins; i++) { during build configuration.</para></listitem> </varlistentry> - <varlistentry id="config.mremap"> - <term> - <mallctl>config.mremap</mallctl> - (<type>bool</type>) - <literal>r-</literal> - </term> - <listitem><para><option>--enable-mremap</option> was specified during - build configuration.</para></listitem> - </varlistentry> - <varlistentry 
id="config.munmap"> <term> <mallctl>config.munmap</mallctl> @@ -940,10 +901,15 @@ for (i = 0; i < nbins; i++) { <manvolnum>2</manvolnum></citerefentry>) allocation precedence as related to <citerefentry><refentrytitle>mmap</refentrytitle> <manvolnum>2</manvolnum></citerefentry> allocation. The following - settings are supported: “disabled”, “primary”, - and “secondary”. The default is “secondary” if - <link linkend="config.dss"><mallctl>config.dss</mallctl></link> is - true, “disabled” otherwise. + settings are supported if + <citerefentry><refentrytitle>sbrk</refentrytitle> + <manvolnum>2</manvolnum></citerefentry> is supported by the operating + system: “disabled”, “primary”, and + “secondary”; otherwise only “disabled” is + supported. The default is “secondary” if + <citerefentry><refentrytitle>sbrk</refentrytitle> + <manvolnum>2</manvolnum></citerefentry> is supported by the operating + system; “disabled” otherwise. </para></listitem> </varlistentry> @@ -956,7 +922,7 @@ for (i = 0; i < nbins; i++) { <listitem><para>Virtual memory chunk size (log base 2). If a chunk size outside the supported size range is specified, the size is silently clipped to the minimum/maximum supported size. The default - chunk size is 4 MiB (2^22). + chunk size is 2 MiB (2^21). </para></listitem> </varlistentry> @@ -986,7 +952,11 @@ for (i = 0; i < nbins; i++) { provides the kernel with sufficient information to recycle dirty pages if physical memory becomes scarce and the pages remain unused. The default minimum ratio is 8:1 (2^3:1); an option value of -1 will - disable dirty page purging.</para></listitem> + disable dirty page purging. 
See <link + linkend="arenas.lg_dirty_mult"><mallctl>arenas.lg_dirty_mult</mallctl></link> + and <link + linkend="arena.i.lg_dirty_mult"><mallctl>arena.<i>.lg_dirty_mult</mallctl></link> + for related dynamic control options.</para></listitem> </varlistentry> <varlistentry id="opt.stats_print"> @@ -1003,26 +973,34 @@ for (i = 0; i < nbins; i++) { <option>--enable-stats</option> is specified during configuration, this has the potential to cause deadlock for a multi-threaded process that exits while one or more threads are executing in the memory allocation - functions. Therefore, this option should only be used with care; it is - primarily intended as a performance tuning aid during application + functions. Furthermore, <function>atexit<parameter/></function> may + allocate memory during application initialization and then deadlock + internally when jemalloc in turn calls + <function>atexit<parameter/></function>, so this option is not + universally usable (though the application can register its own + <function>atexit<parameter/></function> function with equivalent + functionality). Therefore, this option should only be used with care; + it is primarily intended as a performance tuning aid during application development. This option is disabled by default.</para></listitem> </varlistentry> <varlistentry id="opt.junk"> <term> <mallctl>opt.junk</mallctl> - (<type>bool</type>) + (<type>const char *</type>) <literal>r-</literal> [<option>--enable-fill</option>] </term> - <listitem><para>Junk filling enabled/disabled. If enabled, each byte - of uninitialized allocated memory will be initialized to - <literal>0xa5</literal>. All deallocated memory will be initialized to - <literal>0x5a</literal>. This is intended for debugging and will - impact performance negatively.
This option is disabled by default - unless <option>--enable-debug</option> is specified during - configuration, in which case it is enabled by default unless running - inside <ulink + <listitem><para>Junk filling. If set to "alloc", each byte of + uninitialized allocated memory will be initialized to + <literal>0xa5</literal>. If set to "free", all deallocated memory will + be initialized to <literal>0x5a</literal>. If set to "true", both + allocated and deallocated memory will be initialized, and if set to + "false", junk filling is disabled entirely. This is intended for + debugging and will impact performance negatively. This option is + "false" by default unless <option>--enable-debug</option> is specified + during configuration, in which case it is "true" by default unless + running inside <ulink url="http://valgrind.org/">Valgrind</ulink>.</para></listitem> </varlistentry> @@ -1076,9 +1054,8 @@ for (i = 0; i < nbins; i++) { <listitem><para>Zero filling enabled/disabled. If enabled, each byte of uninitialized allocated memory will be initialized to 0. Note that this initialization only happens once for each byte, so - <function>realloc<parameter/></function>, - <function>rallocx<parameter/></function> and - <function>rallocm<parameter/></function> calls do not zero memory that + <function>realloc<parameter/></function> and + <function>rallocx<parameter/></function> calls do not zero memory that was previously allocated. This is intended for debugging and will impact performance negatively. This option is disabled by default. </para></listitem> @@ -1097,19 +1074,6 @@ for (i = 0; i < nbins; i++) { is disabled by default.</para></listitem> </varlistentry> - <varlistentry id="opt.valgrind"> - <term> - <mallctl>opt.valgrind</mallctl> - (<type>bool</type>) - <literal>r-</literal> - [<option>--enable-valgrind</option>] - </term> - <listitem><para><ulink url="http://valgrind.org/">Valgrind</ulink> - support enabled/disabled.
This option is vestigal because jemalloc - auto-detects whether it is running inside Valgrind. This option is - disabled by default, unless running inside Valgrind.</para></listitem> - </varlistentry> - <varlistentry id="opt.xmalloc"> <term> <mallctl>opt.xmalloc</mallctl> @@ -1137,16 +1101,16 @@ malloc_conf = "xmalloc:true";]]></programlisting> <literal>r-</literal> [<option>--enable-tcache</option>] </term> - <listitem><para>Thread-specific caching enabled/disabled. When there - are multiple threads, each thread uses a thread-specific cache for - objects up to a certain size. Thread-specific caching allows many - allocations to be satisfied without performing any thread - synchronization, at the cost of increased memory use. See the - <link + <listitem><para>Thread-specific caching (tcache) enabled/disabled. When + there are multiple threads, each thread uses a tcache for objects up to + a certain size. Thread-specific caching allows many allocations to be + satisfied without performing any thread synchronization, at the cost of + increased memory use. See the <link linkend="opt.lg_tcache_max"><mallctl>opt.lg_tcache_max</mallctl></link> option for related tuning information. This option is enabled by default unless running inside <ulink - url="http://valgrind.org/">Valgrind</ulink>.</para></listitem> + url="http://valgrind.org/">Valgrind</ulink>, in which case it is + forcefully disabled.</para></listitem> </varlistentry> <varlistentry id="opt.lg_tcache_max"> @@ -1157,8 +1121,8 @@ malloc_conf = "xmalloc:true";]]></programlisting> [<option>--enable-tcache</option>] </term> <listitem><para>Maximum size class (log base 2) to cache in the - thread-specific cache. At a minimum, all small size classes are - cached, and at a maximum all large size classes are cached. The + thread-specific cache (tcache). At a minimum, all small size classes + are cached, and at a maximum all large size classes are cached. 
The default maximum is 32 KiB (2^15).</para></listitem> </varlistentry> @@ -1183,8 +1147,9 @@ malloc_conf = "xmalloc:true";]]></programlisting> option for information on high-water-triggered profile dumping, and the <link linkend="opt.prof_final"><mallctl>opt.prof_final</mallctl></link> option for final profile dumping. Profile output is compatible with - the included <command>pprof</command> Perl script, which originates - from the <ulink url="http://code.google.com/p/gperftools/">gperftools + the <command>jeprof</command> command, which is based on the + <command>pprof</command> that is developed as part of the <ulink + url="http://code.google.com/p/gperftools/">gperftools package</ulink>.</para></listitem> </varlistentry> @@ -1206,7 +1171,7 @@ malloc_conf = "xmalloc:true";]]></programlisting> <term> <mallctl>opt.prof_active</mallctl> (<type>bool</type>) - <literal>rw</literal> + <literal>r-</literal> [<option>--enable-prof</option>] </term> <listitem><para>Profiling activated/deactivated. This is a secondary @@ -1219,10 +1184,25 @@ malloc_conf = "xmalloc:true";]]></programlisting> This option is enabled by default.</para></listitem> </varlistentry> + <varlistentry id="opt.prof_thread_active_init"> + <term> + <mallctl>opt.prof_thread_active_init</mallctl> + (<type>bool</type>) + <literal>r-</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Initial setting for <link + linkend="thread.prof.active"><mallctl>thread.prof.active</mallctl></link> + in newly created threads. The initial setting for newly created threads + can also be changed during execution via the <link + linkend="prof.thread_active_init"><mallctl>prof.thread_active_init</mallctl></link> + mallctl. 
This option is enabled by default.</para></listitem> + </varlistentry> + <varlistentry id="opt.lg_prof_sample"> <term> <mallctl>opt.lg_prof_sample</mallctl> - (<type>ssize_t</type>) + (<type>size_t</type>) <literal>r-</literal> [<option>--enable-prof</option>] </term> @@ -1276,13 +1256,11 @@ malloc_conf = "xmalloc:true";]]></programlisting> <literal>r-</literal> [<option>--enable-prof</option>] </term> - <listitem><para>Trigger a memory profile dump every time the total - virtual memory exceeds the previous maximum. Profiles are dumped to - files named according to the pattern - <filename><prefix>.<pid>.<seq>.u<useq>.heap</filename>, - where <literal><prefix></literal> is controlled by the <link - linkend="opt.prof_prefix"><mallctl>opt.prof_prefix</mallctl></link> - option. This option is disabled by default.</para></listitem> + <listitem><para>Set the initial state of <link + linkend="prof.gdump"><mallctl>prof.gdump</mallctl></link>, which when + enabled triggers a memory profile dump every time the total virtual + memory exceeds the previous maximum. This option is disabled by + default.</para></listitem> </varlistentry> <varlistentry id="opt.prof_final"> @@ -1299,7 +1277,13 @@ malloc_conf = "xmalloc:true";]]></programlisting> <filename><prefix>.<pid>.<seq>.f.heap</filename>, where <literal><prefix></literal> is controlled by the <link linkend="opt.prof_prefix"><mallctl>opt.prof_prefix</mallctl></link> - option. This option is enabled by default.</para></listitem> + option. Note that <function>atexit<parameter/></function> may allocate + memory during application initialization and then deadlock internally + when jemalloc in turn calls <function>atexit<parameter/></function>, so + this option is not universally usable (though the application can + register its own <function>atexit<parameter/></function> function with + equivalent functionality).
This option is disabled by + default.</para></listitem> </varlistentry> <varlistentry id="opt.prof_leak"> @@ -1396,7 +1380,7 @@ malloc_conf = "xmalloc:true";]]></programlisting> <listitem><para>Enable/disable calling thread's tcache. The tcache is implicitly flushed as a side effect of becoming disabled (see <link - lenkend="thread.tcache.flush"><mallctl>thread.tcache.flush</mallctl></link>). + linkend="thread.tcache.flush"><mallctl>thread.tcache.flush</mallctl></link>). </para></listitem> </varlistentry> @@ -1407,9 +1391,9 @@ malloc_conf = "xmalloc:true";]]></programlisting> <literal>--</literal> [<option>--enable-tcache</option>] </term> - <listitem><para>Flush calling thread's tcache. This interface releases - all cached objects and internal data structures associated with the - calling thread's thread-specific cache. Ordinarily, this interface + <listitem><para>Flush calling thread's thread-specific cache (tcache). + This interface releases all cached objects and internal data structures + associated with the calling thread's tcache. Ordinarily, this interface need not be called, since automatic periodic incremental garbage collection occurs, and the thread cache is automatically discarded when a thread exits. However, garbage collection is triggered by allocation @@ -1418,10 +1402,91 @@ malloc_conf = "xmalloc:true";]]></programlisting> the developer may find manual flushing useful.</para></listitem> </varlistentry> + <varlistentry id="thread.prof.name"> + <term> + <mallctl>thread.prof.name</mallctl> + (<type>const char *</type>) + <literal>r-</literal> or + <literal>-w</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Get/set the descriptive name associated with the calling + thread in memory profile dumps. An internal copy of the name string is + created, so the input string need not be maintained after this interface + completes execution. 
The output string of this interface should be + copied for non-ephemeral uses, because multiple implementation details + can cause asynchronous string deallocation. Furthermore, each + invocation of this interface can only read or write; simultaneous + read/write is not supported due to string lifetime limitations. The + name string must be nil-terminated and comprised only of characters in the + sets recognized + by <citerefentry><refentrytitle>isgraph</refentrytitle> + <manvolnum>3</manvolnum></citerefentry> and + <citerefentry><refentrytitle>isblank</refentrytitle> + <manvolnum>3</manvolnum></citerefentry>.</para></listitem> + </varlistentry> + + <varlistentry id="thread.prof.active"> + <term> + <mallctl>thread.prof.active</mallctl> + (<type>bool</type>) + <literal>rw</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Control whether sampling is currently active for the + calling thread. This is an activation mechanism in addition to <link + linkend="prof.active"><mallctl>prof.active</mallctl></link>; both must + be active for the calling thread to sample. This flag is enabled by + default.</para></listitem> + </varlistentry> + + <varlistentry id="tcache.create"> + <term> + <mallctl>tcache.create</mallctl> + (<type>unsigned</type>) + <literal>r-</literal> + [<option>--enable-tcache</option>] + </term> + <listitem><para>Create an explicit thread-specific cache (tcache) and + return an identifier that can be passed to the <link + linkend="MALLOCX_TCACHE"><constant>MALLOCX_TCACHE(<parameter>tc</parameter>)</constant></link> + macro to explicitly use the specified cache rather than the + automatically managed one that is used by default. Each explicit cache + can be used by only one thread at a time; the application must assure + that this constraint holds.
+ </para></listitem> + </varlistentry> + + <varlistentry id="tcache.flush"> + <term> + <mallctl>tcache.flush</mallctl> + (<type>unsigned</type>) + <literal>-w</literal> + [<option>--enable-tcache</option>] + </term> + <listitem><para>Flush the specified thread-specific cache (tcache). The + same considerations apply to this interface as to <link + linkend="thread.tcache.flush"><mallctl>thread.tcache.flush</mallctl></link>, + except that the tcache will never be automatically discarded. + </para></listitem> + </varlistentry> + + <varlistentry id="tcache.destroy"> + <term> + <mallctl>tcache.destroy</mallctl> + (<type>unsigned</type>) + <literal>-w</literal> + [<option>--enable-tcache</option>] + </term> + <listitem><para>Flush the specified thread-specific cache (tcache) and + make the identifier available for use during a future tcache creation. + </para></listitem> + </varlistentry> + <varlistentry id="arena.i.purge"> <term> <mallctl>arena.<i>.purge</mallctl> - (<type>unsigned</type>) + (<type>void</type>) <literal>--</literal> </term> <listitem><para>Purge unused dirty pages for arena <i>, or for @@ -1439,14 +1504,222 @@ malloc_conf = "xmalloc:true";]]></programlisting> <listitem><para>Set the precedence of dss allocation as related to mmap allocation for arena <i>, or for all arenas if <i> equals <link - linkend="arenas.narenas"><mallctl>arenas.narenas</mallctl></link>. Note - that even during huge allocation this setting is read from the arena - that would be chosen for small or large allocation so that applications - can depend on consistent dss versus mmap allocation regardless of - allocation size. See <link - linkend="opt.dss"><mallctl>opt.dss</mallctl></link> for supported - settings. - </para></listitem> + linkend="arenas.narenas"><mallctl>arenas.narenas</mallctl></link>.
See + <link linkend="opt.dss"><mallctl>opt.dss</mallctl></link> for supported + settings.</para></listitem> + </varlistentry> + + <varlistentry id="arena.i.lg_dirty_mult"> + <term> + <mallctl>arena.<i>.lg_dirty_mult</mallctl> + (<type>ssize_t</type>) + <literal>rw</literal> + </term> + <listitem><para>Current per-arena minimum ratio (log base 2) of active + to dirty pages for arena <i>. Each time this interface is set and + the ratio is increased, pages are synchronously purged as necessary to + impose the new ratio. See <link + linkend="opt.lg_dirty_mult"><mallctl>opt.lg_dirty_mult</mallctl></link> + for additional information.</para></listitem> + </varlistentry> + + <varlistentry id="arena.i.chunk_hooks"> + <term> + <mallctl>arena.<i>.chunk_hooks</mallctl> + (<type>chunk_hooks_t</type>) + <literal>rw</literal> + </term> + <listitem><para>Get or set the chunk management hook functions for arena + <i>. The functions must be capable of operating on all extant + chunks associated with arena <i>, usually by passing unknown + chunks to the replaced functions. In practice, it is feasible to + control allocation for arenas created via <link + linkend="arenas.extend"><mallctl>arenas.extend</mallctl></link> such + that all chunks originate from an application-supplied chunk allocator + (by setting custom chunk hook functions just after arena creation), but + the automatically created arenas may have already created chunks prior + to the application having an opportunity to take over chunk + allocation.</para> + + <programlisting language="C"><![CDATA[ +typedef struct { + chunk_alloc_t *alloc; + chunk_dalloc_t *dalloc; + chunk_commit_t *commit; + chunk_decommit_t *decommit; + chunk_purge_t *purge; + chunk_split_t *split; + chunk_merge_t *merge; +} chunk_hooks_t;]]></programlisting> + <para>The <type>chunk_hooks_t</type> structure comprises function + pointers which are described individually below. 
jemalloc uses these + functions to manage chunk lifetime, which starts off with allocation of + mapped committed memory, in the simplest case followed by deallocation. + However, there are performance and platform reasons to retain chunks for + later reuse. Cleanup attempts cascade from deallocation to decommit to + purging, which gives the chunk management functions opportunities to + reject the most permanent cleanup operations in favor of less permanent + (and often less costly) operations. The chunk splitting and merging + operations can also be opted out of, but this is mainly intended to + support platforms on which virtual memory mappings provided by the + operating system kernel do not automatically coalesce and split, e.g. + Windows.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef void *<function>(chunk_alloc_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>size_t <parameter>alignment</parameter></paramdef> + <paramdef>bool *<parameter>zero</parameter></paramdef> + <paramdef>bool *<parameter>commit</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk allocation function conforms to the + <type>chunk_alloc_t</type> type and upon success returns a pointer to + <parameter>size</parameter> bytes of mapped memory on behalf of arena + <parameter>arena_ind</parameter> such that the chunk's base address is a + multiple of <parameter>alignment</parameter>, as well as setting + <parameter>*zero</parameter> to indicate whether the chunk is zeroed and + <parameter>*commit</parameter> to indicate whether the chunk is + committed. Upon error the function returns <constant>NULL</constant> + and leaves <parameter>*zero</parameter> and + <parameter>*commit</parameter> unmodified. 
The + <parameter>size</parameter> parameter is always a multiple of the chunk + size. The <parameter>alignment</parameter> parameter is always a power + of two at least as large as the chunk size. Zeroing is mandatory if + <parameter>*zero</parameter> is true upon function entry. Committing is + mandatory if <parameter>*commit</parameter> is true upon function entry. + If <parameter>chunk</parameter> is not <constant>NULL</constant>, the + returned pointer must be <parameter>chunk</parameter> on success or + <constant>NULL</constant> on error. Committed memory may be committed + in absolute terms as on a system that does not overcommit, or in + implicit terms as on a system that overcommits and satisfies physical + memory needs on demand via soft page faults. Note that replacing the + default chunk allocation function makes the arena's <link + linkend="arena.i.dss"><mallctl>arena.<i>.dss</mallctl></link> + setting irrelevant.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_dalloc_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>bool <parameter>committed</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para> + A chunk deallocation function conforms to the + <type>chunk_dalloc_t</type> type and deallocates a + <parameter>chunk</parameter> of given <parameter>size</parameter> with + <parameter>committed</parameter>/decommitted memory as indicated, on + behalf of arena <parameter>arena_ind</parameter>, returning false upon + success.
If the function returns true, this indicates opt-out from + deallocation; the virtual memory mapping associated with the chunk + remains mapped, in the same commit state, and available for future use, + in which case it will be automatically retained for later reuse.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_commit_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>size_t <parameter>offset</parameter></paramdef> + <paramdef>size_t <parameter>length</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk commit function conforms to the + <type>chunk_commit_t</type> type and commits zeroed physical memory to + back pages within a <parameter>chunk</parameter> of given + <parameter>size</parameter> at <parameter>offset</parameter> bytes, + extending for <parameter>length</parameter> on behalf of arena + <parameter>arena_ind</parameter>, returning false upon success. + Committed memory may be committed in absolute terms as on a system that + does not overcommit, or in implicit terms as on a system that + overcommits and satisfies physical memory needs on demand via soft page + faults. 
If the function returns true, this indicates insufficient + physical memory to satisfy the request.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_decommit_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>size_t <parameter>offset</parameter></paramdef> + <paramdef>size_t <parameter>length</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk decommit function conforms to the + <type>chunk_decommit_t</type> type and decommits any physical memory + that is backing pages within a <parameter>chunk</parameter> of given + <parameter>size</parameter> at <parameter>offset</parameter> bytes, + extending for <parameter>length</parameter> on behalf of arena + <parameter>arena_ind</parameter>, returning false upon success, in which + case the pages will be committed via the chunk commit function before + being reused. 
If the function returns true, this indicates opt-out from + decommit; the memory remains committed and available for future use, in + which case it will be automatically retained for later reuse.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_purge_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>size_t <parameter>offset</parameter></paramdef> + <paramdef>size_t <parameter>length</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk purge function conforms to the <type>chunk_purge_t</type> + type and optionally discards physical pages within the virtual memory + mapping associated with <parameter>chunk</parameter> of given + <parameter>size</parameter> at <parameter>offset</parameter> bytes, + extending for <parameter>length</parameter> on behalf of arena + <parameter>arena_ind</parameter>, returning false if pages within the + purged virtual memory range will be zero-filled the next time they are + accessed.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_split_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>size_t <parameter>size_a</parameter></paramdef> + <paramdef>size_t <parameter>size_b</parameter></paramdef> + <paramdef>bool <parameter>committed</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk split function conforms to the <type>chunk_split_t</type> + type and optionally splits <parameter>chunk</parameter> of given + <parameter>size</parameter> into two adjacent chunks, the first of
<parameter>size_b</parameter> bytes, operating on + <parameter>committed</parameter>/decommitted memory as indicated, on + behalf of arena <parameter>arena_ind</parameter>, returning false upon + success. If the function returns true, this indicates that the chunk + remains unsplit and therefore should continue to be operated on as a + whole.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_merge_t)</function></funcdef> + <paramdef>void *<parameter>chunk_a</parameter></paramdef> + <paramdef>size_t <parameter>size_a</parameter></paramdef> + <paramdef>void *<parameter>chunk_b</parameter></paramdef> + <paramdef>size_t <parameter>size_b</parameter></paramdef> + <paramdef>bool <parameter>committed</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk merge function conforms to the <type>chunk_merge_t</type> + type and optionally merges adjacent chunks, + <parameter>chunk_a</parameter> of given <parameter>size_a</parameter> + and <parameter>chunk_b</parameter> of given + <parameter>size_b</parameter> into one contiguous chunk, operating on + <parameter>committed</parameter>/decommitted memory as indicated, on + behalf of arena <parameter>arena_ind</parameter>, returning false upon + success. 
If the function returns true, this indicates that the chunks + remain distinct mappings and therefore should continue to be operated on + independently.</para> + </listitem> </varlistentry> <varlistentry id="arenas.narenas"> @@ -1470,6 +1743,20 @@ malloc_conf = "xmalloc:true";]]></programlisting> initialized.</para></listitem> </varlistentry> + <varlistentry id="arenas.lg_dirty_mult"> + <term> + <mallctl>arenas.lg_dirty_mult</mallctl> + (<type>ssize_t</type>) + <literal>rw</literal> + </term> + <listitem><para>Current default per-arena minimum ratio (log base 2) of + active to dirty pages, used to initialize <link + linkend="arena.i.lg_dirty_mult"><mallctl>arena.<i>.lg_dirty_mult</mallctl></link> + during arena creation. See <link + linkend="opt.lg_dirty_mult"><mallctl>opt.lg_dirty_mult</mallctl></link> + for additional information.</para></listitem> + </varlistentry> + <varlistentry id="arenas.quantum"> <term> <mallctl>arenas.quantum</mallctl> @@ -1548,7 +1835,7 @@ malloc_conf = "xmalloc:true";]]></programlisting> <varlistentry id="arenas.nlruns"> <term> <mallctl>arenas.nlruns</mallctl> - (<type>size_t</type>) + (<type>unsigned</type>) <literal>r-</literal> </term> <listitem><para>Total number of large size classes.</para></listitem> @@ -1564,14 +1851,23 @@ malloc_conf = "xmalloc:true";]]></programlisting> class.</para></listitem> </varlistentry> - <varlistentry id="arenas.purge"> + <varlistentry id="arenas.nhchunks"> <term> - <mallctl>arenas.purge</mallctl> + <mallctl>arenas.nhchunks</mallctl> (<type>unsigned</type>) - <literal>-w</literal> + <literal>r-</literal> </term> - <listitem><para>Purge unused dirty pages for the specified arena, or - for all arenas if none is specified.</para></listitem> + <listitem><para>Total number of huge size classes.</para></listitem> + </varlistentry> + + <varlistentry id="arenas.hchunk.i.size"> + <term> + <mallctl>arenas.hchunk.<i>.size</mallctl> + (<type>size_t</type>) + <literal>r-</literal> + </term> + <listitem><para>Maximum 
size supported by this huge size + class.</para></listitem> </varlistentry> <varlistentry id="arenas.extend"> @@ -1584,6 +1880,20 @@ malloc_conf = "xmalloc:true";]]></programlisting> and returning the new arena index.</para></listitem> </varlistentry> + <varlistentry id="prof.thread_active_init"> + <term> + <mallctl>prof.thread_active_init</mallctl> + (<type>bool</type>) + <literal>rw</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Control the initial setting for <link + linkend="thread.prof.active"><mallctl>thread.prof.active</mallctl></link> + in newly created threads. See the <link + linkend="opt.prof_thread_active_init"><mallctl>opt.prof_thread_active_init</mallctl></link> + option for additional information.</para></listitem> + </varlistentry> + <varlistentry id="prof.active"> <term> <mallctl>prof.active</mallctl> @@ -1594,8 +1904,9 @@ malloc_conf = "xmalloc:true";]]></programlisting> <listitem><para>Control whether sampling is currently active. See the <link linkend="opt.prof_active"><mallctl>opt.prof_active</mallctl></link> - option for additional information. - </para></listitem> + option for additional information, as well as the interrelated <link + linkend="thread.prof.active"><mallctl>thread.prof.active</mallctl></link> + mallctl.</para></listitem> </varlistentry> <varlistentry id="prof.dump"> @@ -1614,6 +1925,49 @@ malloc_conf = "xmalloc:true";]]></programlisting> option.</para></listitem> </varlistentry> + <varlistentry id="prof.gdump"> + <term> + <mallctl>prof.gdump</mallctl> + (<type>bool</type>) + <literal>rw</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>When enabled, trigger a memory profile dump every time + the total virtual memory exceeds the previous maximum. 
Profiles are + dumped to files named according to the pattern + <filename><prefix>.<pid>.<seq>.u<useq>.heap</filename>, + where <literal><prefix></literal> is controlled by the <link + linkend="opt.prof_prefix"><mallctl>opt.prof_prefix</mallctl></link> + option.</para></listitem> + </varlistentry> + + <varlistentry id="prof.reset"> + <term> + <mallctl>prof.reset</mallctl> + (<type>size_t</type>) + <literal>-w</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Reset all memory profile statistics, and optionally + update the sample rate (see <link + linkend="opt.lg_prof_sample"><mallctl>opt.lg_prof_sample</mallctl></link> + and <link + linkend="prof.lg_sample"><mallctl>prof.lg_sample</mallctl></link>). + </para></listitem> + </varlistentry> + + <varlistentry id="prof.lg_sample"> + <term> + <mallctl>prof.lg_sample</mallctl> + (<type>size_t</type>) + <literal>r-</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Get the current sample rate (see <link + linkend="opt.lg_prof_sample"><mallctl>opt.lg_prof_sample</mallctl></link>). + </para></listitem> + </varlistentry> + <varlistentry id="prof.interval"> <term> <mallctl>prof.interval</mallctl> @@ -1637,9 +1991,8 @@ malloc_conf = "xmalloc:true";]]></programlisting> </term> <listitem><para>Pointer to a counter that contains an approximate count of the current number of bytes in active pages. The estimate may be - high, but never low, because each arena rounds up to the nearest - multiple of the chunk size when computing its contribution to the - counter. Note that the <link + high, but never low, because each arena rounds up when computing its + contribution to the counter. Note that the <link linkend="epoch"><mallctl>epoch</mallctl></link> mallctl has no bearing on this counter. 
Furthermore, counter consistency is maintained via atomic operations, so it is necessary to use an atomic operation in @@ -1670,88 +2023,56 @@ malloc_conf = "xmalloc:true";]]></programlisting> equal to <link linkend="stats.allocated"><mallctl>stats.allocated</mallctl></link>. This does not include <link linkend="stats.arenas.i.pdirty"> - <mallctl>stats.arenas.<i>.pdirty</mallctl></link> and pages + <mallctl>stats.arenas.<i>.pdirty</mallctl></link>, nor pages entirely devoted to allocator metadata.</para></listitem> </varlistentry> - <varlistentry id="stats.mapped"> + <varlistentry id="stats.metadata"> <term> - <mallctl>stats.mapped</mallctl> + <mallctl>stats.metadata</mallctl> (<type>size_t</type>) <literal>r-</literal> [<option>--enable-stats</option>] </term> - <listitem><para>Total number of bytes in chunks mapped on behalf of the - application. This is a multiple of the chunk size, and is at least as - large as <link - linkend="stats.active"><mallctl>stats.active</mallctl></link>. This - does not include inactive chunks.</para></listitem> - </varlistentry> - - <varlistentry id="stats.chunks.current"> - <term> - <mallctl>stats.chunks.current</mallctl> - (<type>size_t</type>) - <literal>r-</literal> - [<option>--enable-stats</option>] - </term> - <listitem><para>Total number of chunks actively mapped on behalf of the - application. This does not include inactive chunks. 
- </para></listitem> - </varlistentry> - - <varlistentry id="stats.chunks.total"> - <term> - <mallctl>stats.chunks.total</mallctl> - (<type>uint64_t</type>) - <literal>r-</literal> - [<option>--enable-stats</option>] - </term> - <listitem><para>Cumulative number of chunks allocated.</para></listitem> + <listitem><para>Total number of bytes dedicated to metadata, which + comprise base allocations used for bootstrap-sensitive internal + allocator data structures, arena chunk headers (see <link + linkend="stats.arenas.i.metadata.mapped"><mallctl>stats.arenas.<i>.metadata.mapped</mallctl></link>), + and internal allocations (see <link + linkend="stats.arenas.i.metadata.allocated"><mallctl>stats.arenas.<i>.metadata.allocated</mallctl></link>).</para></listitem> </varlistentry> - <varlistentry id="stats.chunks.high"> + <varlistentry id="stats.resident"> <term> - <mallctl>stats.chunks.high</mallctl> + <mallctl>stats.resident</mallctl> (<type>size_t</type>) <literal>r-</literal> [<option>--enable-stats</option>] </term> - <listitem><para>Maximum number of active chunks at any time thus far. - </para></listitem> + <listitem><para>Maximum number of bytes in physically resident data + pages mapped by the allocator, comprising all pages dedicated to + allocator metadata, pages backing active allocations, and unused dirty + pages. This is a maximum rather than precise because pages may not + actually be physically resident if they correspond to demand-zeroed + virtual memory that has not yet been touched. This is a multiple of the + page size, and is larger than <link + linkend="stats.active"><mallctl>stats.active</mallctl></link>.</para></listitem> </varlistentry> - <varlistentry id="stats.huge.allocated"> + <varlistentry id="stats.mapped"> <term> - <mallctl>stats.huge.allocated</mallctl> + <mallctl>stats.mapped</mallctl> (<type>size_t</type>) <literal>r-</literal> [<option>--enable-stats</option>] </term> - <listitem><para>Number of bytes currently allocated by huge objects. 
- </para></listitem> - </varlistentry> - - <varlistentry id="stats.huge.nmalloc"> - <term> - <mallctl>stats.huge.nmalloc</mallctl> - (<type>uint64_t</type>) - <literal>r-</literal> - [<option>--enable-stats</option>] - </term> - <listitem><para>Cumulative number of huge allocation requests. - </para></listitem> - </varlistentry> - - <varlistentry id="stats.huge.ndalloc"> - <term> - <mallctl>stats.huge.ndalloc</mallctl> - (<type>uint64_t</type>) - <literal>r-</literal> - [<option>--enable-stats</option>] - </term> - <listitem><para>Cumulative number of huge deallocation requests. - </para></listitem> + <listitem><para>Total number of bytes in active chunks mapped by the + allocator. This is a multiple of the chunk size, and is larger than + <link linkend="stats.active"><mallctl>stats.active</mallctl></link>. + This does not include inactive chunks, even those that contain unused + dirty pages, which means that there is no strict ordering between this + and <link + linkend="stats.resident"><mallctl>stats.resident</mallctl></link>.</para></listitem> </varlistentry> <varlistentry id="stats.arenas.i.dss"> @@ -1768,6 +2089,18 @@ malloc_conf = "xmalloc:true";]]></programlisting> </para></listitem> </varlistentry> + <varlistentry id="stats.arenas.i.lg_dirty_mult"> + <term> + <mallctl>stats.arenas.<i>.lg_dirty_mult</mallctl> + (<type>ssize_t</type>) + <literal>r-</literal> + </term> + <listitem><para>Minimum ratio (log base 2) of active to dirty pages. 
+ See <link + linkend="opt.lg_dirty_mult"><mallctl>opt.lg_dirty_mult</mallctl></link> + for details.</para></listitem> + </varlistentry> + <varlistentry id="stats.arenas.i.nthreads"> <term> <mallctl>stats.arenas.<i>.nthreads</mallctl> @@ -1809,6 +2142,38 @@ malloc_conf = "xmalloc:true";]]></programlisting> <listitem><para>Number of mapped bytes.</para></listitem> </varlistentry> + <varlistentry id="stats.arenas.i.metadata.mapped"> + <term> + <mallctl>stats.arenas.<i>.metadata.mapped</mallctl> + (<type>size_t</type>) + <literal>r-</literal> + [<option>--enable-stats</option>] + </term> + <listitem><para>Number of mapped bytes in arena chunk headers, which + track the states of the non-metadata pages.</para></listitem> + </varlistentry> + + <varlistentry id="stats.arenas.i.metadata.allocated"> + <term> + <mallctl>stats.arenas.<i>.metadata.allocated</mallctl> + (<type>size_t</type>) + <literal>r-</literal> + [<option>--enable-stats</option>] + </term> + <listitem><para>Number of bytes dedicated to internal allocations. + Internal allocations differ from application-originated allocations in + that they are for internal use, and that they are omitted from heap + profiles. This statistic is reported separately from <link + linkend="stats.metadata"><mallctl>stats.metadata</mallctl></link> and + <link + linkend="stats.arenas.i.metadata.mapped"><mallctl>stats.arenas.<i>.metadata.mapped</mallctl></link> + because it overlaps with e.g. 
the <link + linkend="stats.allocated"><mallctl>stats.allocated</mallctl></link> and + <link linkend="stats.active"><mallctl>stats.active</mallctl></link> + statistics, whereas the other metadata statistics do + not.</para></listitem> + </varlistentry> + <varlistentry id="stats.arenas.i.npurge"> <term> <mallctl>stats.arenas.<i>.npurge</mallctl> @@ -1930,15 +2295,48 @@ malloc_conf = "xmalloc:true";]]></programlisting> </para></listitem> </varlistentry> - <varlistentry id="stats.arenas.i.bins.j.allocated"> + <varlistentry id="stats.arenas.i.huge.allocated"> <term> - <mallctl>stats.arenas.<i>.bins.<j>.allocated</mallctl> + <mallctl>stats.arenas.<i>.huge.allocated</mallctl> (<type>size_t</type>) <literal>r-</literal> [<option>--enable-stats</option>] </term> - <listitem><para>Current number of bytes allocated by - bin.</para></listitem> + <listitem><para>Number of bytes currently allocated by huge objects. + </para></listitem> + </varlistentry> + + <varlistentry id="stats.arenas.i.huge.nmalloc"> + <term> + <mallctl>stats.arenas.<i>.huge.nmalloc</mallctl> + (<type>uint64_t</type>) + <literal>r-</literal> + [<option>--enable-stats</option>] + </term> + <listitem><para>Cumulative number of huge allocation requests served + directly by the arena.</para></listitem> + </varlistentry> + + <varlistentry id="stats.arenas.i.huge.ndalloc"> + <term> + <mallctl>stats.arenas.<i>.huge.ndalloc</mallctl> + (<type>uint64_t</type>) + <literal>r-</literal> + [<option>--enable-stats</option>] + </term> + <listitem><para>Cumulative number of huge deallocation requests served + directly by the arena.</para></listitem> + </varlistentry> + + <varlistentry id="stats.arenas.i.huge.nrequests"> + <term> + <mallctl>stats.arenas.<i>.huge.nrequests</mallctl> + (<type>uint64_t</type>) + <literal>r-</literal> + [<option>--enable-stats</option>] + </term> + <listitem><para>Cumulative number of huge allocation requests. 
+ </para></listitem> </varlistentry> <varlistentry id="stats.arenas.i.bins.j.nmalloc"> @@ -1974,6 +2372,17 @@ malloc_conf = "xmalloc:true";]]></programlisting> requests.</para></listitem> </varlistentry> + <varlistentry id="stats.arenas.i.bins.j.curregs"> + <term> + <mallctl>stats.arenas.<i>.bins.<j>.curregs</mallctl> + (<type>size_t</type>) + <literal>r-</literal> + [<option>--enable-stats</option>] + </term> + <listitem><para>Current number of regions for this size + class.</para></listitem> + </varlistentry> + <varlistentry id="stats.arenas.i.bins.j.nfills"> <term> <mallctl>stats.arenas.<i>.bins.<j>.nfills</mallctl> @@ -2068,6 +2477,50 @@ malloc_conf = "xmalloc:true";]]></programlisting> <listitem><para>Current number of runs for this size class. </para></listitem> </varlistentry> + + <varlistentry id="stats.arenas.i.hchunks.j.nmalloc"> + <term> + <mallctl>stats.arenas.<i>.hchunks.<j>.nmalloc</mallctl> + (<type>uint64_t</type>) + <literal>r-</literal> + [<option>--enable-stats</option>] + </term> + <listitem><para>Cumulative number of allocation requests for this size + class served directly by the arena.</para></listitem> + </varlistentry> + + <varlistentry id="stats.arenas.i.hchunks.j.ndalloc"> + <term> + <mallctl>stats.arenas.<i>.hchunks.<j>.ndalloc</mallctl> + (<type>uint64_t</type>) + <literal>r-</literal> + [<option>--enable-stats</option>] + </term> + <listitem><para>Cumulative number of deallocation requests for this + size class served directly by the arena.</para></listitem> + </varlistentry> + + <varlistentry id="stats.arenas.i.hchunks.j.nrequests"> + <term> + <mallctl>stats.arenas.<i>.hchunks.<j>.nrequests</mallctl> + (<type>uint64_t</type>) + <literal>r-</literal> + [<option>--enable-stats</option>] + </term> + <listitem><para>Cumulative number of allocation requests for this size + class.</para></listitem> + </varlistentry> + + <varlistentry id="stats.arenas.i.hchunks.j.curhchunks"> + <term> + 
<mallctl>stats.arenas.<i>.hchunks.<j>.curhchunks</mallctl> + (<type>size_t</type>) + <literal>r-</literal> + [<option>--enable-stats</option>] + </term> + <listitem><para>Current number of huge allocations for this size class. + </para></listitem> + </varlistentry> </variablelist> </refsect1> <refsect1 id="debugging_malloc_problems"> @@ -2253,42 +2706,6 @@ malloc_conf = "xmalloc:true";]]></programlisting> returns the usable size of the allocation pointed to by <parameter>ptr</parameter>. </para> </refsect2> - <refsect2> - <title>Experimental API</title> - <para>The <function>allocm<parameter/></function>, - <function>rallocm<parameter/></function>, - <function>sallocm<parameter/></function>, - <function>dallocm<parameter/></function>, and - <function>nallocm<parameter/></function> functions return - <constant>ALLOCM_SUCCESS</constant> on success; otherwise they return an - error value. The <function>allocm<parameter/></function>, - <function>rallocm<parameter/></function>, and - <function>nallocm<parameter/></function> functions will fail if: - <variablelist> - <varlistentry> - <term><errorname>ALLOCM_ERR_OOM</errorname></term> - - <listitem><para>Out of memory. Insufficient contiguous memory was - available to service the allocation request. 
The - <function>allocm<parameter/></function> function additionally sets - <parameter>*ptr</parameter> to <constant>NULL</constant>, whereas - the <function>rallocm<parameter/></function> function leaves - <constant>*ptr</constant> unmodified.</para></listitem> - </varlistentry> - </variablelist> - The <function>rallocm<parameter/></function> function will also - fail if: - <variablelist> - <varlistentry> - <term><errorname>ALLOCM_ERR_NOT_MOVED</errorname></term> - - <listitem><para><constant>ALLOCM_NO_MOVE</constant> was specified, - but the reallocation request could not be serviced without moving - the object.</para></listitem> - </varlistentry> - </variablelist> - </para> - </refsect2> </refsect1> <refsect1 id="environment"> <title>ENVIRONMENT</title>
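The hunks above add per-arena statistics for huge allocations, for example stats.arenas.<i>.huge.allocated (size_t) and stats.arenas.<i>.huge.nmalloc (uint64_t), both read-only and available only with --enable-stats. The following is a minimal sketch of how a program might read them through mallctl(), not code from this commit. It assumes a jemalloc 4.x build with statistics enabled and unprefixed public symbols (a build configured with a symbol prefix would expose a prefixed name such as je_mallctl instead). The helper names stat_name and read_huge_stats and the USE_JEMALLOC macro are hypothetical, introduced only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#ifdef USE_JEMALLOC /* hypothetical macro: define it when building against jemalloc */
#include <jemalloc/jemalloc.h>
#endif

/*
 * Hypothetical helper: format "stats.arenas.<arena>.huge.<leaf>" into buf.
 * Returns 0 on success, -1 if the name does not fit in len bytes.
 */
static int stat_name(char *buf, size_t len, unsigned arena, const char *leaf)
{
    int n = snprintf(buf, len, "stats.arenas.%u.huge.%s", arena, leaf);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

#ifdef USE_JEMALLOC
/*
 * Hypothetical helper: read two of the huge-allocation statistics for one
 * arena. Returns 0 on success, -1 if any mallctl() call fails (for example
 * when the library was built without --enable-stats).
 */
static int read_huge_stats(unsigned arena, size_t *allocated, uint64_t *nmalloc)
{
    char name[64];
    uint64_t epoch = 1;
    size_t sz = sizeof(epoch);

    /* Statistics are cached; writing to "epoch" refreshes the snapshot. */
    if (mallctl("epoch", &epoch, &sz, &epoch, sz) != 0)
        return -1;

    /* stats.arenas.<i>.huge.allocated is documented as size_t. */
    sz = sizeof(*allocated);
    if (stat_name(name, sizeof(name), arena, "allocated") != 0 ||
        mallctl(name, allocated, &sz, NULL, 0) != 0)
        return -1;

    /* stats.arenas.<i>.huge.nmalloc is documented as uint64_t. */
    sz = sizeof(*nmalloc);
    if (stat_name(name, sizeof(name), arena, "nmalloc") != 0 ||
        mallctl(name, nmalloc, &sz, NULL, 0) != 0)
        return -1;

    return 0;
}
#endif
```

A caller would pass an arena index and two output variables, for instance read_huge_stats(0, &allocated, &nmalloc), and treat a nonzero return as "statistics unavailable". Matching the output variable's type to the type shown in each statistic's entry matters, because mallctl() reports an error when the supplied length does not match.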