Diffstat (limited to 'deps/jemalloc/doc/jemalloc.xml.in')
-rw-r--r-- | deps/jemalloc/doc/jemalloc.xml.in | 1207 |
1 file changed, 812 insertions, 395 deletions
diff --git a/deps/jemalloc/doc/jemalloc.xml.in b/deps/jemalloc/doc/jemalloc.xml.in index d8e2e711f..8fc774b18 100644 --- a/deps/jemalloc/doc/jemalloc.xml.in +++ b/deps/jemalloc/doc/jemalloc.xml.in @@ -38,17 +38,13 @@ <refname>xallocx</refname> <refname>sallocx</refname> <refname>dallocx</refname> + <refname>sdallocx</refname> <refname>nallocx</refname> <refname>mallctl</refname> <refname>mallctlnametomib</refname> <refname>mallctlbymib</refname> <refname>malloc_stats_print</refname> <refname>malloc_usable_size</refname> - <refname>allocm</refname> - <refname>rallocm</refname> - <refname>sallocm</refname> - <refname>dallocm</refname> - <refname>nallocm</refname> --> <refpurpose>general purpose memory allocation functions</refpurpose> </refnamediv> @@ -61,8 +57,7 @@ <refsynopsisdiv> <title>SYNOPSIS</title> <funcsynopsis> - <funcsynopsisinfo>#include <<filename class="headerfile">stdlib.h</filename>> -#include <<filename class="headerfile">jemalloc/jemalloc.h</filename>></funcsynopsisinfo> + <funcsynopsisinfo>#include <<filename class="headerfile">jemalloc/jemalloc.h</filename>></funcsynopsisinfo> <refsect2> <title>Standard API</title> <funcprototype> @@ -126,6 +121,12 @@ <paramdef>int <parameter>flags</parameter></paramdef> </funcprototype> <funcprototype> + <funcdef>void <function>sdallocx</function></funcdef> + <paramdef>void *<parameter>ptr</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>int <parameter>flags</parameter></paramdef> + </funcprototype> + <funcprototype> <funcdef>size_t <function>nallocx</function></funcdef> <paramdef>size_t <parameter>size</parameter></paramdef> <paramdef>int <parameter>flags</parameter></paramdef> @@ -172,41 +173,6 @@ </funcprototype> <para><type>const char *</type><varname>malloc_conf</varname>;</para> </refsect2> - <refsect2> - <title>Experimental API</title> - <funcprototype> - <funcdef>int <function>allocm</function></funcdef> - <paramdef>void **<parameter>ptr</parameter></paramdef> - 
<paramdef>size_t *<parameter>rsize</parameter></paramdef> - <paramdef>size_t <parameter>size</parameter></paramdef> - <paramdef>int <parameter>flags</parameter></paramdef> - </funcprototype> - <funcprototype> - <funcdef>int <function>rallocm</function></funcdef> - <paramdef>void **<parameter>ptr</parameter></paramdef> - <paramdef>size_t *<parameter>rsize</parameter></paramdef> - <paramdef>size_t <parameter>size</parameter></paramdef> - <paramdef>size_t <parameter>extra</parameter></paramdef> - <paramdef>int <parameter>flags</parameter></paramdef> - </funcprototype> - <funcprototype> - <funcdef>int <function>sallocm</function></funcdef> - <paramdef>const void *<parameter>ptr</parameter></paramdef> - <paramdef>size_t *<parameter>rsize</parameter></paramdef> - <paramdef>int <parameter>flags</parameter></paramdef> - </funcprototype> - <funcprototype> - <funcdef>int <function>dallocm</function></funcdef> - <paramdef>void *<parameter>ptr</parameter></paramdef> - <paramdef>int <parameter>flags</parameter></paramdef> - </funcprototype> - <funcprototype> - <funcdef>int <function>nallocm</function></funcdef> - <paramdef>size_t *<parameter>rsize</parameter></paramdef> - <paramdef>size_t <parameter>size</parameter></paramdef> - <paramdef>int <parameter>flags</parameter></paramdef> - </funcprototype> - </refsect2> </funcsynopsis> </refsynopsisdiv> <refsect1 id="description"> @@ -229,15 +195,15 @@ <para>The <function>posix_memalign<parameter/></function> function allocates <parameter>size</parameter> bytes of memory such that the - allocation's base address is an even multiple of + allocation's base address is a multiple of <parameter>alignment</parameter>, and returns the allocation in the value pointed to by <parameter>ptr</parameter>. 
The requested - <parameter>alignment</parameter> must be a power of 2 at least as large - as <code language="C">sizeof(<type>void *</type>)</code>.</para> + <parameter>alignment</parameter> must be a power of 2 at least as large as + <code language="C">sizeof(<type>void *</type>)</code>.</para> <para>The <function>aligned_alloc<parameter/></function> function allocates <parameter>size</parameter> bytes of memory such that the - allocation's base address is an even multiple of + allocation's base address is a multiple of <parameter>alignment</parameter>. The requested <parameter>alignment</parameter> must be a power of 2. Behavior is undefined if <parameter>size</parameter> is not an integral multiple of @@ -268,14 +234,15 @@ <function>rallocx<parameter/></function>, <function>xallocx<parameter/></function>, <function>sallocx<parameter/></function>, - <function>dallocx<parameter/></function>, and + <function>dallocx<parameter/></function>, + <function>sdallocx<parameter/></function>, and <function>nallocx<parameter/></function> functions all have a <parameter>flags</parameter> argument that can be used to specify options. The functions only check the options that are contextually relevant. Use bitwise or (<code language="C">|</code>) operations to specify one or more of the following: <variablelist> - <varlistentry> + <varlistentry id="MALLOCX_LG_ALIGN"> <term><constant>MALLOCX_LG_ALIGN(<parameter>la</parameter>) </constant></term> @@ -285,7 +252,7 @@ that <parameter>la</parameter> is within the valid range.</para></listitem> </varlistentry> - <varlistentry> + <varlistentry id="MALLOCX_ALIGN"> <term><constant>MALLOCX_ALIGN(<parameter>a</parameter>) </constant></term> @@ -295,7 +262,7 @@ validate that <parameter>a</parameter> is a power of 2. 
</para></listitem> </varlistentry> - <varlistentry> + <varlistentry id="MALLOCX_ZERO"> <term><constant>MALLOCX_ZERO</constant></term> <listitem><para>Initialize newly allocated memory to contain zero @@ -304,16 +271,38 @@ that are initialized to contain zero bytes. If this macro is absent, newly allocated memory is uninitialized.</para></listitem> </varlistentry> - <varlistentry> + <varlistentry id="MALLOCX_TCACHE"> + <term><constant>MALLOCX_TCACHE(<parameter>tc</parameter>) + </constant></term> + + <listitem><para>Use the thread-specific cache (tcache) specified by + the identifier <parameter>tc</parameter>, which must have been + acquired via the <link + linkend="tcache.create"><mallctl>tcache.create</mallctl></link> + mallctl. This macro does not validate that + <parameter>tc</parameter> specifies a valid + identifier.</para></listitem> + </varlistentry> + <varlistentry id="MALLOCX_TCACHE_NONE"> + <term><constant>MALLOCX_TCACHE_NONE</constant></term> + + <listitem><para>Do not use a thread-specific cache (tcache). Unless + <constant>MALLOCX_TCACHE(<parameter>tc</parameter>)</constant> or + <constant>MALLOCX_TCACHE_NONE</constant> is specified, an + automatically managed tcache will be used under many circumstances. + This macro cannot be used in the same <parameter>flags</parameter> + argument as + <constant>MALLOCX_TCACHE(<parameter>tc</parameter>)</constant>.</para></listitem> + </varlistentry> + <varlistentry id="MALLOCX_ARENA"> <term><constant>MALLOCX_ARENA(<parameter>a</parameter>) </constant></term> <listitem><para>Use the arena specified by the index - <parameter>a</parameter> (and by necessity bypass the thread - cache). This macro has no effect for huge regions, nor for regions - that were allocated via an arena other than the one specified. - This macro does not validate that <parameter>a</parameter> - specifies an arena index in the valid range.</para></listitem> + <parameter>a</parameter>.
This macro has no effect for regions that + were allocated via an arena other than the one specified. This + macro does not validate that <parameter>a</parameter> specifies an + arena index in the valid range.</para></listitem> </varlistentry> </variablelist> </para> @@ -352,6 +341,15 @@ memory referenced by <parameter>ptr</parameter> to be made available for future allocations.</para> + <para>The <function>sdallocx<parameter/></function> function is an + extension of <function>dallocx<parameter/></function> with a + <parameter>size</parameter> parameter to allow the caller to pass in the + allocation size as an optimization. The minimum valid input size is the + original requested size of the allocation, and the maximum valid input + size is the corresponding value returned by + <function>nallocx<parameter/></function> or + <function>sallocx<parameter/></function>.</para> + <para>The <function>nallocx<parameter/></function> function allocates no memory, but it performs the same size computation as the <function>mallocx<parameter/></function> function, and returns the real @@ -430,11 +428,12 @@ for (i = 0; i < nbins; i++) { functions simultaneously. If <option>--enable-stats</option> is specified during configuration, “m” and “a” can be specified to omit merged arena and per arena statistics, respectively; - “b” and “l” can be specified to omit per size - class statistics for bins and large objects, respectively. Unrecognized - characters are silently ignored. Note that thread caching may prevent - some statistics from being completely up to date, since extra locking - would be required to merge counters that track thread cache operations. + “b”, “l”, and “h” can be specified to + omit per size class statistics for bins, large objects, and huge objects, + respectively. Unrecognized characters are silently ignored. 
Note that + thread caching may prevent some statistics from being completely up to + date, since extra locking would be required to merge counters that track + thread cache operations. </para> <para>The <function>malloc_usable_size<parameter/></function> function @@ -449,116 +448,6 @@ for (i = 0; i < nbins; i++) { depended on, since such behavior is entirely implementation-dependent. </para> </refsect2> - <refsect2> - <title>Experimental API</title> - <para>The experimental API is subject to change or removal without regard - for backward compatibility. If <option>--disable-experimental</option> - is specified during configuration, the experimental API is - omitted.</para> - - <para>The <function>allocm<parameter/></function>, - <function>rallocm<parameter/></function>, - <function>sallocm<parameter/></function>, - <function>dallocm<parameter/></function>, and - <function>nallocm<parameter/></function> functions all have a - <parameter>flags</parameter> argument that can be used to specify - options. The functions only check the options that are contextually - relevant. Use bitwise or (<code language="C">|</code>) operations to - specify one or more of the following: - <variablelist> - <varlistentry> - <term><constant>ALLOCM_LG_ALIGN(<parameter>la</parameter>) - </constant></term> - - <listitem><para>Align the memory allocation to start at an address - that is a multiple of <code language="C">(1 << - <parameter>la</parameter>)</code>. This macro does not validate - that <parameter>la</parameter> is within the valid - range.</para></listitem> - </varlistentry> - <varlistentry> - <term><constant>ALLOCM_ALIGN(<parameter>a</parameter>) - </constant></term> - - <listitem><para>Align the memory allocation to start at an address - that is a multiple of <parameter>a</parameter>, where - <parameter>a</parameter> is a power of two. This macro does not - validate that <parameter>a</parameter> is a power of 2. 
- </para></listitem> - </varlistentry> - <varlistentry> - <term><constant>ALLOCM_ZERO</constant></term> - - <listitem><para>Initialize newly allocated memory to contain zero - bytes. In the growing reallocation case, the real size prior to - reallocation defines the boundary between untouched bytes and those - that are initialized to contain zero bytes. If this macro is - absent, newly allocated memory is uninitialized.</para></listitem> - </varlistentry> - <varlistentry> - <term><constant>ALLOCM_NO_MOVE</constant></term> - - <listitem><para>For reallocation, fail rather than moving the - object. This constraint can apply to both growth and - shrinkage.</para></listitem> - </varlistentry> - <varlistentry> - <term><constant>ALLOCM_ARENA(<parameter>a</parameter>) - </constant></term> - - <listitem><para>Use the arena specified by the index - <parameter>a</parameter> (and by necessity bypass the thread - cache). This macro has no effect for huge regions, nor for regions - that were allocated via an arena other than the one specified. - This macro does not validate that <parameter>a</parameter> - specifies an arena index in the valid range.</para></listitem> - </varlistentry> - </variablelist> - </para> - - <para>The <function>allocm<parameter/></function> function allocates at - least <parameter>size</parameter> bytes of memory, sets - <parameter>*ptr</parameter> to the base address of the allocation, and - sets <parameter>*rsize</parameter> to the real size of the allocation if - <parameter>rsize</parameter> is not <constant>NULL</constant>. 
Behavior - is undefined if <parameter>size</parameter> is <constant>0</constant>, or - if request size overflows due to size class and/or alignment - constraints.</para> - - <para>The <function>rallocm<parameter/></function> function resizes the - allocation at <parameter>*ptr</parameter> to be at least - <parameter>size</parameter> bytes, sets <parameter>*ptr</parameter> to - the base address of the allocation if it moved, and sets - <parameter>*rsize</parameter> to the real size of the allocation if - <parameter>rsize</parameter> is not <constant>NULL</constant>. If - <parameter>extra</parameter> is non-zero, an attempt is made to resize - the allocation to be at least <code - language="C">(<parameter>size</parameter> + - <parameter>extra</parameter>)</code> bytes, though inability to allocate - the extra byte(s) will not by itself result in failure. Behavior is - undefined if <parameter>size</parameter> is <constant>0</constant>, if - request size overflows due to size class and/or alignment constraints, or - if <code language="C">(<parameter>size</parameter> + - <parameter>extra</parameter> > - <constant>SIZE_T_MAX</constant>)</code>.</para> - - <para>The <function>sallocm<parameter/></function> function sets - <parameter>*rsize</parameter> to the real size of the allocation.</para> - - <para>The <function>dallocm<parameter/></function> function causes the - memory referenced by <parameter>ptr</parameter> to be made available for - future allocations.</para> - - <para>The <function>nallocm<parameter/></function> function allocates no - memory, but it performs the same size computation as the - <function>allocm<parameter/></function> function, and if - <parameter>rsize</parameter> is not <constant>NULL</constant> it sets - <parameter>*rsize</parameter> to the real size of the allocation that - would result from the equivalent <function>allocm<parameter/></function> - function call. 
Behavior is undefined if <parameter>size</parameter> is - <constant>0</constant>, or if request size overflows due to size class - and/or alignment constraints.</para> - </refsect2> </refsect1> <refsect1 id="tuning"> <title>TUNING</title> @@ -598,8 +487,10 @@ for (i = 0; i < nbins; i++) { <manvolnum>2</manvolnum></citerefentry> to obtain memory, which is suboptimal for several reasons, including race conditions, increased fragmentation, and artificial limitations on maximum usable memory. If - <option>--enable-dss</option> is specified during configuration, this - allocator uses both <citerefentry><refentrytitle>mmap</refentrytitle> + <citerefentry><refentrytitle>sbrk</refentrytitle> + <manvolnum>2</manvolnum></citerefentry> is supported by the operating + system, this allocator uses both + <citerefentry><refentrytitle>mmap</refentrytitle> <manvolnum>2</manvolnum></citerefentry> and <citerefentry><refentrytitle>sbrk</refentrytitle> <manvolnum>2</manvolnum></citerefentry>, in that order of preference; @@ -632,12 +523,11 @@ for (i = 0; i < nbins; i++) { possible to find metadata for user objects very quickly.</para> <para>User objects are broken into three categories according to size: - small, large, and huge. Small objects are smaller than one page. Large - objects are smaller than the chunk size. Huge objects are a multiple of - the chunk size. Small and large objects are managed by arenas; huge - objects are managed separately in a single data structure that is shared by - all threads. Huge objects are used by applications infrequently enough - that this single data structure is not a scalability issue.</para> + small, large, and huge. Small and large objects are managed entirely by + arenas; huge objects are additionally aggregated in a single data structure + that is shared by all threads. 
Huge objects are typically used by + applications infrequently enough that this single data structure is not a + scalability issue.</para> <para>Each chunk that is managed by an arena tracks its contents as runs of contiguous pages (unused, backing a set of small objects, or backing one @@ -646,18 +536,18 @@ for (i = 0; i < nbins; i++) { allocations in constant time.</para> <para>Small objects are managed in groups by page runs. Each run maintains - a frontier and free list to track which regions are in use. Allocation - requests that are no more than half the quantum (8 or 16, depending on - architecture) are rounded up to the nearest power of two that is at least - <code language="C">sizeof(<type>double</type>)</code>. All other small - object size classes are multiples of the quantum, spaced such that internal - fragmentation is limited to approximately 25% for all but the smallest size - classes. Allocation requests that are larger than the maximum small size - class, but small enough to fit in an arena-managed chunk (see the <link - linkend="opt.lg_chunk"><mallctl>opt.lg_chunk</mallctl></link> option), are - rounded up to the nearest run size. Allocation requests that are too large - to fit in an arena-managed chunk are rounded up to the nearest multiple of - the chunk size.</para> + a bitmap to track which regions are in use. Allocation requests that are no + more than half the quantum (8 or 16, depending on architecture) are rounded + up to the nearest power of two that is at least <code + language="C">sizeof(<type>double</type>)</code>. All other object size + classes are multiples of the quantum, spaced such that there are four size + classes for each doubling in size, which limits internal fragmentation to + approximately 20% for all but the smallest size classes. 
Small size classes + are smaller than four times the page size, large size classes are smaller + than the chunk size (see the <link + linkend="opt.lg_chunk"><mallctl>opt.lg_chunk</mallctl></link> option), and + huge size classes extend from the chunk size up to one size class less than + the full address space size.</para> <para>Allocations are packed tightly together, which can be an issue for multi-threaded applications. If you need to assure that allocations do not @@ -665,8 +555,29 @@ for (i = 0; i < nbins; i++) { nearest multiple of the cacheline size, or specify cacheline alignment when allocating.</para> - <para>Assuming 4 MiB chunks, 4 KiB pages, and a 16-byte quantum on a 64-bit - system, the size classes in each category are as shown in <xref + <para>The <function>realloc<parameter/></function>, + <function>rallocx<parameter/></function>, and + <function>xallocx<parameter/></function> functions may resize allocations + without moving them under limited circumstances. Unlike the + <function>*allocx<parameter/></function> API, the standard API does not + officially round up the usable size of an allocation to the nearest size + class, so technically it is necessary to call + <function>realloc<parameter/></function> to grow e.g. a 9-byte allocation to + 16 bytes, or shrink a 16-byte allocation to 9 bytes. Growth and shrinkage + trivially succeeds in place as long as the pre-size and post-size both round + up to the same size class. No other API guarantees are made regarding + in-place resizing, but the current implementation also tries to resize large + and huge allocations in place, as long as the pre-size and post-size are + both large or both huge. In such cases shrinkage always succeeds for large + size classes, but for huge size classes the chunk allocator must support + splitting (see <link + linkend="arena.i.chunk_hooks"><mallctl>arena.<i>.chunk_hooks</mallctl></link>). 
+ Growth only succeeds if the trailing memory is currently available, and + additionally for huge size classes the chunk allocator must support + merging.</para> + + <para>Assuming 2 MiB chunks, 4 KiB pages, and a 16-byte quantum on a + 64-bit system, the size classes in each category are as shown in <xref linkend="size_classes" xrefstyle="template:Table %n"/>.</para> <table xml:id="size_classes" frame="all"> @@ -684,13 +595,13 @@ for (i = 0; i < nbins; i++) { </thead> <tbody> <row> - <entry morerows="6">Small</entry> + <entry morerows="8">Small</entry> <entry>lg</entry> <entry>[8]</entry> </row> <row> <entry>16</entry> - <entry>[16, 32, 48, ..., 128]</entry> + <entry>[16, 32, 48, 64, 80, 96, 112, 128]</entry> </row> <row> <entry>32</entry> @@ -710,17 +621,77 @@ for (i = 0; i < nbins; i++) { </row> <row> <entry>512</entry> - <entry>[2560, 3072, 3584]</entry> + <entry>[2560, 3072, 3584, 4096]</entry> + </row> + <row> + <entry>1 KiB</entry> + <entry>[5 KiB, 6 KiB, 7 KiB, 8 KiB]</entry> + </row> + <row> + <entry>2 KiB</entry> + <entry>[10 KiB, 12 KiB, 14 KiB]</entry> + </row> + <row> + <entry morerows="7">Large</entry> + <entry>2 KiB</entry> + <entry>[16 KiB]</entry> </row> <row> - <entry>Large</entry> <entry>4 KiB</entry> - <entry>[4 KiB, 8 KiB, 12 KiB, ..., 4072 KiB]</entry> + <entry>[20 KiB, 24 KiB, 28 KiB, 32 KiB]</entry> + </row> + <row> + <entry>8 KiB</entry> + <entry>[40 KiB, 48 KiB, 56 KiB, 64 KiB]</entry> + </row> + <row> + <entry>16 KiB</entry> + <entry>[80 KiB, 96 KiB, 112 KiB, 128 KiB]</entry> + </row> + <row> + <entry>32 KiB</entry> + <entry>[160 KiB, 192 KiB, 224 KiB, 256 KiB]</entry> + </row> + <row> + <entry>64 KiB</entry> + <entry>[320 KiB, 384 KiB, 448 KiB, 512 KiB]</entry> + </row> + <row> + <entry>128 KiB</entry> + <entry>[640 KiB, 768 KiB, 896 KiB, 1 MiB]</entry> + </row> + <row> + <entry>256 KiB</entry> + <entry>[1280 KiB, 1536 KiB, 1792 KiB]</entry> + </row> + <row> + <entry morerows="6">Huge</entry> + <entry>256 KiB</entry> + <entry>[2 
MiB]</entry> + </row> + <row> + <entry>512 KiB</entry> + <entry>[2560 KiB, 3 MiB, 3584 KiB, 4 MiB]</entry> + </row> + <row> + <entry>1 MiB</entry> + <entry>[5 MiB, 6 MiB, 7 MiB, 8 MiB]</entry> + </row> + <row> + <entry>2 MiB</entry> + <entry>[10 MiB, 12 MiB, 14 MiB, 16 MiB]</entry> </row> <row> - <entry>Huge</entry> <entry>4 MiB</entry> - <entry>[4 MiB, 8 MiB, 12 MiB, ...]</entry> + <entry>[20 MiB, 24 MiB, 28 MiB, 32 MiB]</entry> + </row> + <row> + <entry>8 MiB</entry> + <entry>[40 MiB, 48 MiB, 56 MiB, 64 MiB]</entry> + </row> + <row> + <entry>...</entry> + <entry>...</entry> </row> </tbody> </tgroup> @@ -765,23 +736,23 @@ for (i = 0; i < nbins; i++) { detecting whether another thread caused a refresh.</para></listitem> </varlistentry> - <varlistentry id="config.debug"> + <varlistentry id="config.cache_oblivious"> <term> - <mallctl>config.debug</mallctl> + <mallctl>config.cache_oblivious</mallctl> (<type>bool</type>) <literal>r-</literal> </term> - <listitem><para><option>--enable-debug</option> was specified during - build configuration.</para></listitem> + <listitem><para><option>--enable-cache-oblivious</option> was specified + during build configuration.</para></listitem> </varlistentry> - <varlistentry id="config.dss"> + <varlistentry id="config.debug"> <term> - <mallctl>config.dss</mallctl> + <mallctl>config.debug</mallctl> (<type>bool</type>) <literal>r-</literal> </term> - <listitem><para><option>--enable-dss</option> was specified during + <listitem><para><option>--enable-debug</option> was specified during build configuration.</para></listitem> </varlistentry> @@ -805,16 +776,6 @@ for (i = 0; i < nbins; i++) { during build configuration.</para></listitem> </varlistentry> - <varlistentry id="config.mremap"> - <term> - <mallctl>config.mremap</mallctl> - (<type>bool</type>) - <literal>r-</literal> - </term> - <listitem><para><option>--enable-mremap</option> was specified during - build configuration.</para></listitem> - </varlistentry> - <varlistentry 
id="config.munmap"> <term> <mallctl>config.munmap</mallctl> @@ -940,10 +901,15 @@ for (i = 0; i < nbins; i++) { <manvolnum>2</manvolnum></citerefentry>) allocation precedence as related to <citerefentry><refentrytitle>mmap</refentrytitle> <manvolnum>2</manvolnum></citerefentry> allocation. The following - settings are supported: “disabled”, “primary”, - and “secondary”. The default is “secondary” if - <link linkend="config.dss"><mallctl>config.dss</mallctl></link> is - true, “disabled” otherwise. + settings are supported if + <citerefentry><refentrytitle>sbrk</refentrytitle> + <manvolnum>2</manvolnum></citerefentry> is supported by the operating + system: “disabled”, “primary”, and + “secondary”; otherwise only “disabled” is + supported. The default is “secondary” if + <citerefentry><refentrytitle>sbrk</refentrytitle> + <manvolnum>2</manvolnum></citerefentry> is supported by the operating + system; “disabled” otherwise. </para></listitem> </varlistentry> @@ -956,7 +922,7 @@ for (i = 0; i < nbins; i++) { <listitem><para>Virtual memory chunk size (log base 2). If a chunk size outside the supported size range is specified, the size is silently clipped to the minimum/maximum supported size. The default - chunk size is 4 MiB (2^22). + chunk size is 2 MiB (2^21). </para></listitem> </varlistentry> @@ -986,7 +952,11 @@ for (i = 0; i < nbins; i++) { provides the kernel with sufficient information to recycle dirty pages if physical memory becomes scarce and the pages remain unused. The default minimum ratio is 8:1 (2^3:1); an option value of -1 will - disable dirty page purging.</para></listitem> + disable dirty page purging. 
See <link + linkend="arenas.lg_dirty_mult"><mallctl>arenas.lg_dirty_mult</mallctl></link> + and <link + linkend="arena.i.lg_dirty_mult"><mallctl>arena.<i>.lg_dirty_mult</mallctl></link> + for related dynamic control options.</para></listitem> </varlistentry> <varlistentry id="opt.stats_print"> @@ -1003,26 +973,34 @@ for (i = 0; i < nbins; i++) { <option>--enable-stats</option> is specified during configuration, this has the potential to cause deadlock for a multi-threaded process that exits while one or more threads are executing in the memory allocation - functions. Therefore, this option should only be used with care; it is - primarily intended as a performance tuning aid during application + functions. Furthermore, <function>atexit<parameter/></function> may + allocate memory during application initialization and then deadlock + internally when jemalloc in turn calls + <function>atexit<parameter/></function>, so this option is not + universally usable (though the application can register its own + <function>atexit<parameter/></function> function with equivalent + functionality). Therefore, this option should only be used with care; + it is primarily intended as a performance tuning aid during application development. This option is disabled by default.</para></listitem> </varlistentry> <varlistentry id="opt.junk"> <term> <mallctl>opt.junk</mallctl> - (<type>bool</type>) + (<type>const char *</type>) <literal>r-</literal> [<option>--enable-fill</option>] </term> - <listitem><para>Junk filling enabled/disabled. If enabled, each byte - of uninitialized allocated memory will be initialized to - <literal>0xa5</literal>. All deallocated memory will be initialized to - <literal>0x5a</literal>. This is intended for debugging and will - impact performance negatively.
This option is disabled by default - unless <option>--enable-debug</option> is specified during - configuration, in which case it is enabled by default unless running - inside <ulink + <listitem><para>Junk filling. If set to "alloc", each byte of + uninitialized allocated memory will be initialized to + <literal>0xa5</literal>. If set to "free", all deallocated memory will + be initialized to <literal>0x5a</literal>. If set to "true", both + allocated and deallocated memory will be initialized, and if set to + "false", junk filling will be disabled entirely. This is intended for + debugging and will impact performance negatively. This option is + "false" by default unless <option>--enable-debug</option> is specified + during configuration, in which case it is "true" by default unless + running inside <ulink url="http://valgrind.org/">Valgrind</ulink>.</para></listitem> </varlistentry> @@ -1076,9 +1054,8 @@ for (i = 0; i < nbins; i++) { <listitem><para>Zero filling enabled/disabled. If enabled, each byte of uninitialized allocated memory will be initialized to 0. Note that this initialization only happens once for each byte, so - <function>realloc<parameter/></function>, - <function>rallocx<parameter/></function> and - <function>rallocm<parameter/></function> calls do not zero memory that + <function>realloc<parameter/></function> and + <function>rallocx<parameter/></function> calls do not zero memory that was previously allocated. This is intended for debugging and will impact performance negatively. This option is disabled by default. </para></listitem> </varlistentry> @@ -1097,19 +1074,6 @@ for (i = 0; i < nbins; i++) { is disabled by default.</para></listitem> </varlistentry> - <varlistentry id="opt.valgrind"> - <term> - <mallctl>opt.valgrind</mallctl> - (<type>bool</type>) - <literal>r-</literal> - [<option>--enable-valgrind</option>] - </term> - <listitem><para><ulink url="http://valgrind.org/">Valgrind</ulink> - support enabled/disabled.
This option is vestigal because jemalloc - auto-detects whether it is running inside Valgrind. This option is - disabled by default, unless running inside Valgrind.</para></listitem> - </varlistentry> - <varlistentry id="opt.xmalloc"> <term> <mallctl>opt.xmalloc</mallctl> @@ -1137,16 +1101,16 @@ malloc_conf = "xmalloc:true";]]></programlisting> <literal>r-</literal> [<option>--enable-tcache</option>] </term> - <listitem><para>Thread-specific caching enabled/disabled. When there - are multiple threads, each thread uses a thread-specific cache for - objects up to a certain size. Thread-specific caching allows many - allocations to be satisfied without performing any thread - synchronization, at the cost of increased memory use. See the - <link + <listitem><para>Thread-specific caching (tcache) enabled/disabled. When + there are multiple threads, each thread uses a tcache for objects up to + a certain size. Thread-specific caching allows many allocations to be + satisfied without performing any thread synchronization, at the cost of + increased memory use. See the <link linkend="opt.lg_tcache_max"><mallctl>opt.lg_tcache_max</mallctl></link> option for related tuning information. This option is enabled by default unless running inside <ulink - url="http://valgrind.org/">Valgrind</ulink>.</para></listitem> + url="http://valgrind.org/">Valgrind</ulink>, in which case it is + forcefully disabled.</para></listitem> </varlistentry> <varlistentry id="opt.lg_tcache_max"> @@ -1157,8 +1121,8 @@ malloc_conf = "xmalloc:true";]]></programlisting> [<option>--enable-tcache</option>] </term> <listitem><para>Maximum size class (log base 2) to cache in the - thread-specific cache. At a minimum, all small size classes are - cached, and at a maximum all large size classes are cached. The + thread-specific cache (tcache). At a minimum, all small size classes + are cached, and at a maximum all large size classes are cached. 
The default maximum is 32 KiB (2^15).</para></listitem> </varlistentry> @@ -1183,8 +1147,9 @@ malloc_conf = "xmalloc:true";]]></programlisting> option for information on high-water-triggered profile dumping, and the <link linkend="opt.prof_final"><mallctl>opt.prof_final</mallctl></link> option for final profile dumping. Profile output is compatible with - the included <command>pprof</command> Perl script, which originates - from the <ulink url="http://code.google.com/p/gperftools/">gperftools + the <command>jeprof</command> command, which is based on the + <command>pprof</command> that is developed as part of the <ulink + url="http://code.google.com/p/gperftools/">gperftools package</ulink>.</para></listitem> </varlistentry> @@ -1206,7 +1171,7 @@ malloc_conf = "xmalloc:true";]]></programlisting> <term> <mallctl>opt.prof_active</mallctl> (<type>bool</type>) - <literal>rw</literal> + <literal>r-</literal> [<option>--enable-prof</option>] </term> <listitem><para>Profiling activated/deactivated. This is a secondary @@ -1219,10 +1184,25 @@ malloc_conf = "xmalloc:true";]]></programlisting> This option is enabled by default.</para></listitem> </varlistentry> + <varlistentry id="opt.prof_thread_active_init"> + <term> + <mallctl>opt.prof_thread_active_init</mallctl> + (<type>bool</type>) + <literal>r-</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Initial setting for <link + linkend="thread.prof.active"><mallctl>thread.prof.active</mallctl></link> + in newly created threads. The initial setting for newly created threads + can also be changed during execution via the <link + linkend="prof.thread_active_init"><mallctl>prof.thread_active_init</mallctl></link> + mallctl. 
This option is enabled by default.</para></listitem> + </varlistentry> + <varlistentry id="opt.lg_prof_sample"> <term> <mallctl>opt.lg_prof_sample</mallctl> - (<type>ssize_t</type>) + (<type>size_t</type>) <literal>r-</literal> [<option>--enable-prof</option>] </term> @@ -1276,13 +1256,11 @@ malloc_conf = "xmalloc:true";]]></programlisting> <literal>r-</literal> [<option>--enable-prof</option>] </term> - <listitem><para>Trigger a memory profile dump every time the total - virtual memory exceeds the previous maximum. Profiles are dumped to - files named according to the pattern - <filename><prefix>.<pid>.<seq>.u<useq>.heap</filename>, - where <literal><prefix></literal> is controlled by the <link - linkend="opt.prof_prefix"><mallctl>opt.prof_prefix</mallctl></link> - option. This option is disabled by default.</para></listitem> + <listitem><para>Set the initial state of <link + linkend="prof.gdump"><mallctl>prof.gdump</mallctl></link>, which when + enabled triggers a memory profile dump every time the total virtual + memory exceeds the previous maximum. This option is disabled by + default.</para></listitem> </varlistentry> <varlistentry id="opt.prof_final"> @@ -1299,7 +1277,13 @@ malloc_conf = "xmalloc:true";]]></programlisting> <filename><prefix>.<pid>.<seq>.f.heap</filename>, where <literal><prefix></literal> is controlled by the <link linkend="opt.prof_prefix"><mallctl>opt.prof_prefix</mallctl></link> - option. This option is enabled by default.</para></listitem> + option. Note that <function>atexit<parameter/></function> may allocate + memory during application initialization and then deadlock internally + when jemalloc in turn calls <function>atexit<parameter/></function>, so + this option is not universally usable (though the application can + register its own <function>atexit<parameter/></function> function with + equivalent functionality). 
This option is disabled by + default.</para></listitem> </varlistentry> <varlistentry id="opt.prof_leak"> @@ -1396,7 +1380,7 @@ malloc_conf = "xmalloc:true";]]></programlisting> <listitem><para>Enable/disable calling thread's tcache. The tcache is implicitly flushed as a side effect of becoming disabled (see <link - lenkend="thread.tcache.flush"><mallctl>thread.tcache.flush</mallctl></link>). + linkend="thread.tcache.flush"><mallctl>thread.tcache.flush</mallctl></link>). </para></listitem> </varlistentry> @@ -1407,9 +1391,9 @@ malloc_conf = "xmalloc:true";]]></programlisting> <literal>--</literal> [<option>--enable-tcache</option>] </term> - <listitem><para>Flush calling thread's tcache. This interface releases - all cached objects and internal data structures associated with the - calling thread's thread-specific cache. Ordinarily, this interface + <listitem><para>Flush calling thread's thread-specific cache (tcache). + This interface releases all cached objects and internal data structures + associated with the calling thread's tcache. Ordinarily, this interface need not be called, since automatic periodic incremental garbage collection occurs, and the thread cache is automatically discarded when a thread exits. However, garbage collection is triggered by allocation @@ -1418,10 +1402,91 @@ malloc_conf = "xmalloc:true";]]></programlisting> the developer may find manual flushing useful.</para></listitem> </varlistentry> + <varlistentry id="thread.prof.name"> + <term> + <mallctl>thread.prof.name</mallctl> + (<type>const char *</type>) + <literal>r-</literal> or + <literal>-w</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Get/set the descriptive name associated with the calling + thread in memory profile dumps. An internal copy of the name string is + created, so the input string need not be maintained after this interface + completes execution. 
The output string of this interface should be + copied for non-ephemeral uses, because multiple implementation details + can cause asynchronous string deallocation. Furthermore, each + invocation of this interface can only read or write; simultaneous + read/write is not supported due to string lifetime limitations. The + name string must be nil-terminated and comprised only of characters in + the sets recognized + by <citerefentry><refentrytitle>isgraph</refentrytitle> + <manvolnum>3</manvolnum></citerefentry> and + <citerefentry><refentrytitle>isblank</refentrytitle> + <manvolnum>3</manvolnum></citerefentry>.</para></listitem> + </varlistentry> + + <varlistentry id="thread.prof.active"> + <term> + <mallctl>thread.prof.active</mallctl> + (<type>bool</type>) + <literal>rw</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Control whether sampling is currently active for the + calling thread. This is an activation mechanism in addition to <link + linkend="prof.active"><mallctl>prof.active</mallctl></link>; both must + be active for the calling thread to sample. This flag is enabled by + default.</para></listitem> + </varlistentry> + + <varlistentry id="tcache.create"> + <term> + <mallctl>tcache.create</mallctl> + (<type>unsigned</type>) + <literal>r-</literal> + [<option>--enable-tcache</option>] + </term> + <listitem><para>Create an explicit thread-specific cache (tcache) and + return an identifier that can be passed to the <link + linkend="MALLOCX_TCACHE"><constant>MALLOCX_TCACHE(<parameter>tc</parameter>)</constant></link> + macro to explicitly use the specified cache rather than the + automatically managed one that is used by default. Each explicit cache + can be used by only one thread at a time; the application must assure + that this constraint holds. 
+ </para></listitem> + </varlistentry> + + <varlistentry id="tcache.flush"> + <term> + <mallctl>tcache.flush</mallctl> + (<type>unsigned</type>) + <literal>-w</literal> + [<option>--enable-tcache</option>] + </term> + <listitem><para>Flush the specified thread-specific cache (tcache). The + same considerations apply to this interface as to <link + linkend="thread.tcache.flush"><mallctl>thread.tcache.flush</mallctl></link>, + except that the tcache will never be automatically discarded. + </para></listitem> + </varlistentry> + + <varlistentry id="tcache.destroy"> + <term> + <mallctl>tcache.destroy</mallctl> + (<type>unsigned</type>) + <literal>-w</literal> + [<option>--enable-tcache</option>] + </term> + <listitem><para>Flush the specified thread-specific cache (tcache) and + make the identifier available for use during a future tcache creation. + </para></listitem> + </varlistentry> + <varlistentry id="arena.i.purge"> <term> <mallctl>arena.<i>.purge</mallctl> - (<type>unsigned</type>) + (<type>void</type>) <literal>--</literal> </term> <listitem><para>Purge unused dirty pages for arena <i>, or for @@ -1439,14 +1504,222 @@ malloc_conf = "xmalloc:true";]]></programlisting> <listitem><para>Set the precedence of dss allocation as related to mmap allocation for arena <i>, or for all arenas if <i> equals <link - linkend="arenas.narenas"><mallctl>arenas.narenas</mallctl></link>. Note - that even during huge allocation this setting is read from the arena - that would be chosen for small or large allocation so that applications - can depend on consistent dss versus mmap allocation regardless of - allocation size. See <link - linkend="opt.dss"><mallctl>opt.dss</mallctl></link> for supported - settings. - </para></listitem> + linkend="arenas.narenas"><mallctl>arenas.narenas</mallctl></link>. 
See + <link linkend="opt.dss"><mallctl>opt.dss</mallctl></link> for supported + settings.</para></listitem> + </varlistentry> + + <varlistentry id="arena.i.lg_dirty_mult"> + <term> + <mallctl>arena.<i>.lg_dirty_mult</mallctl> + (<type>ssize_t</type>) + <literal>rw</literal> + </term> + <listitem><para>Current per-arena minimum ratio (log base 2) of active + to dirty pages for arena <i>. Each time this interface is set and + the ratio is increased, pages are synchronously purged as necessary to + impose the new ratio. See <link + linkend="opt.lg_dirty_mult"><mallctl>opt.lg_dirty_mult</mallctl></link> + for additional information.</para></listitem> + </varlistentry> + + <varlistentry id="arena.i.chunk_hooks"> + <term> + <mallctl>arena.<i>.chunk_hooks</mallctl> + (<type>chunk_hooks_t</type>) + <literal>rw</literal> + </term> + <listitem><para>Get or set the chunk management hook functions for arena + <i>. The functions must be capable of operating on all extant + chunks associated with arena <i>, usually by passing unknown + chunks to the replaced functions. In practice, it is feasible to + control allocation for arenas created via <link + linkend="arenas.extend"><mallctl>arenas.extend</mallctl></link> such + that all chunks originate from an application-supplied chunk allocator + (by setting custom chunk hook functions just after arena creation), but + the automatically created arenas may have already created chunks prior + to the application having an opportunity to take over chunk + allocation.</para> + + <programlisting language="C"><![CDATA[ +typedef struct { + chunk_alloc_t *alloc; + chunk_dalloc_t *dalloc; + chunk_commit_t *commit; + chunk_decommit_t *decommit; + chunk_purge_t *purge; + chunk_split_t *split; + chunk_merge_t *merge; +} chunk_hooks_t;]]></programlisting> + <para>The <type>chunk_hooks_t</type> structure comprises function + pointers which are described individually below. 
jemalloc uses these + functions to manage chunk lifetime, which starts off with allocation of + mapped committed memory, in the simplest case followed by deallocation. + However, there are performance and platform reasons to retain chunks for + later reuse. Cleanup attempts cascade from deallocation to decommit to + purging, which gives the chunk management functions opportunities to + reject the most permanent cleanup operations in favor of less permanent + (and often less costly) operations. The chunk splitting and merging + operations can also be opted out of, but this is mainly intended to + support platforms on which virtual memory mappings provided by the + operating system kernel do not automatically coalesce and split, e.g. + Windows.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef void *<function>(chunk_alloc_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>size_t <parameter>alignment</parameter></paramdef> + <paramdef>bool *<parameter>zero</parameter></paramdef> + <paramdef>bool *<parameter>commit</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk allocation function conforms to the + <type>chunk_alloc_t</type> type and upon success returns a pointer to + <parameter>size</parameter> bytes of mapped memory on behalf of arena + <parameter>arena_ind</parameter> such that the chunk's base address is a + multiple of <parameter>alignment</parameter>, as well as setting + <parameter>*zero</parameter> to indicate whether the chunk is zeroed and + <parameter>*commit</parameter> to indicate whether the chunk is + committed. Upon error the function returns <constant>NULL</constant> + and leaves <parameter>*zero</parameter> and + <parameter>*commit</parameter> unmodified. 
The + <parameter>size</parameter> parameter is always a multiple of the chunk + size. The <parameter>alignment</parameter> parameter is always a power + of two at least as large as the chunk size. Zeroing is mandatory if + <parameter>*zero</parameter> is true upon function entry. Committing is + mandatory if <parameter>*commit</parameter> is true upon function entry. + If <parameter>chunk</parameter> is not <constant>NULL</constant>, the + returned pointer must be <parameter>chunk</parameter> on success or + <constant>NULL</constant> on error. Committed memory may be committed + in absolute terms as on a system that does not overcommit, or in + implicit terms as on a system that overcommits and satisfies physical + memory needs on demand via soft page faults. Note that replacing the + default chunk allocation function makes the arena's <link + linkend="arena.i.dss"><mallctl>arena.<i>.dss</mallctl></link> + setting irrelevant.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_dalloc_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>bool <parameter>committed</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para> + A chunk deallocation function conforms to the + <type>chunk_dalloc_t</type> type and deallocates a + <parameter>chunk</parameter> of given <parameter>size</parameter> with + <parameter>committed</parameter>/decommitted memory as indicated, on + behalf of arena <parameter>arena_ind</parameter>, returning false upon + success. 
If the function returns true, this indicates opt-out from + deallocation; the virtual memory mapping associated with the chunk + remains mapped, in the same commit state, and available for future use, + in which case it will be automatically retained for later reuse.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_commit_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>size_t <parameter>offset</parameter></paramdef> + <paramdef>size_t <parameter>length</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk commit function conforms to the + <type>chunk_commit_t</type> type and commits zeroed physical memory to + back pages within a <parameter>chunk</parameter> of given + <parameter>size</parameter> at <parameter>offset</parameter> bytes, + extending for <parameter>length</parameter> on behalf of arena + <parameter>arena_ind</parameter>, returning false upon success. + Committed memory may be committed in absolute terms as on a system that + does not overcommit, or in implicit terms as on a system that + overcommits and satisfies physical memory needs on demand via soft page + faults. 
If the function returns true, this indicates insufficient + physical memory to satisfy the request.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_decommit_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>size_t <parameter>offset</parameter></paramdef> + <paramdef>size_t <parameter>length</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk decommit function conforms to the + <type>chunk_decommit_t</type> type and decommits any physical memory + that is backing pages within a <parameter>chunk</parameter> of given + <parameter>size</parameter> at <parameter>offset</parameter> bytes, + extending for <parameter>length</parameter> on behalf of arena + <parameter>arena_ind</parameter>, returning false upon success, in which + case the pages will be committed via the chunk commit function before + being reused. 
If the function returns true, this indicates opt-out from + decommit; the memory remains committed and available for future use, in + which case it will be automatically retained for later reuse.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_purge_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>size_t <parameter>offset</parameter></paramdef> + <paramdef>size_t <parameter>length</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk purge function conforms to the <type>chunk_purge_t</type> + type and optionally discards physical pages within the virtual memory + mapping associated with <parameter>chunk</parameter> of given + <parameter>size</parameter> at <parameter>offset</parameter> bytes, + extending for <parameter>length</parameter> on behalf of arena + <parameter>arena_ind</parameter>, returning false if pages within the + purged virtual memory range will be zero-filled the next time they are + accessed.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_split_t)</function></funcdef> + <paramdef>void *<parameter>chunk</parameter></paramdef> + <paramdef>size_t <parameter>size</parameter></paramdef> + <paramdef>size_t <parameter>size_a</parameter></paramdef> + <paramdef>size_t <parameter>size_b</parameter></paramdef> + <paramdef>bool <parameter>committed</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk split function conforms to the <type>chunk_split_t</type> + type and optionally splits <parameter>chunk</parameter> of given + <parameter>size</parameter> into two adjacent chunks, the first of + <parameter>size_a</parameter> bytes, and the second of + 
<parameter>size_b</parameter> bytes, operating on + <parameter>committed</parameter>/decommitted memory as indicated, on + behalf of arena <parameter>arena_ind</parameter>, returning false upon + success. If the function returns true, this indicates that the chunk + remains unsplit and therefore should continue to be operated on as a + whole.</para> + + <funcsynopsis><funcprototype> + <funcdef>typedef bool <function>(chunk_merge_t)</function></funcdef> + <paramdef>void *<parameter>chunk_a</parameter></paramdef> + <paramdef>size_t <parameter>size_a</parameter></paramdef> + <paramdef>void *<parameter>chunk_b</parameter></paramdef> + <paramdef>size_t <parameter>size_b</parameter></paramdef> + <paramdef>bool <parameter>committed</parameter></paramdef> + <paramdef>unsigned <parameter>arena_ind</parameter></paramdef> + </funcprototype></funcsynopsis> + <literallayout></literallayout> + <para>A chunk merge function conforms to the <type>chunk_merge_t</type> + type and optionally merges adjacent chunks, + <parameter>chunk_a</parameter> of given <parameter>size_a</parameter> + and <parameter>chunk_b</parameter> of given + <parameter>size_b</parameter> into one contiguous chunk, operating on + <parameter>committed</parameter>/decommitted memory as indicated, on + behalf of arena <parameter>arena_ind</parameter>, returning false upon + success. 
If the function returns true, this indicates that the chunks + remain distinct mappings and therefore should continue to be operated on + independently.</para> + </listitem> </varlistentry> <varlistentry id="arenas.narenas"> @@ -1470,6 +1743,20 @@ malloc_conf = "xmalloc:true";]]></programlisting> initialized.</para></listitem> </varlistentry> + <varlistentry id="arenas.lg_dirty_mult"> + <term> + <mallctl>arenas.lg_dirty_mult</mallctl> + (<type>ssize_t</type>) + <literal>rw</literal> + </term> + <listitem><para>Current default per-arena minimum ratio (log base 2) of + active to dirty pages, used to initialize <link + linkend="arena.i.lg_dirty_mult"><mallctl>arena.<i>.lg_dirty_mult</mallctl></link> + during arena creation. See <link + linkend="opt.lg_dirty_mult"><mallctl>opt.lg_dirty_mult</mallctl></link> + for additional information.</para></listitem> + </varlistentry> + <varlistentry id="arenas.quantum"> <term> <mallctl>arenas.quantum</mallctl> @@ -1548,7 +1835,7 @@ malloc_conf = "xmalloc:true";]]></programlisting> <varlistentry id="arenas.nlruns"> <term> <mallctl>arenas.nlruns</mallctl> - (<type>size_t</type>) + (<type>unsigned</type>) <literal>r-</literal> </term> <listitem><para>Total number of large size classes.</para></listitem> @@ -1564,14 +1851,23 @@ malloc_conf = "xmalloc:true";]]></programlisting> class.</para></listitem> </varlistentry> - <varlistentry id="arenas.purge"> + <varlistentry id="arenas.nhchunks"> <term> - <mallctl>arenas.purge</mallctl> + <mallctl>arenas.nhchunks</mallctl> (<type>unsigned</type>) - <literal>-w</literal> + <literal>r-</literal> </term> - <listitem><para>Purge unused dirty pages for the specified arena, or - for all arenas if none is specified.</para></listitem> + <listitem><para>Total number of huge size classes.</para></listitem> + </varlistentry> + + <varlistentry id="arenas.hchunk.i.size"> + <term> + <mallctl>arenas.hchunk.<i>.size</mallctl> + (<type>size_t</type>) + <literal>r-</literal> + </term> + <listitem><para>Maximum 
size supported by this huge size + class.</para></listitem> </varlistentry> <varlistentry id="arenas.extend"> @@ -1584,6 +1880,20 @@ malloc_conf = "xmalloc:true";]]></programlisting> and returning the new arena index.</para></listitem> </varlistentry> + <varlistentry id="prof.thread_active_init"> + <term> + <mallctl>prof.thread_active_init</mallctl> + (<type>bool</type>) + <literal>rw</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Control the initial setting for <link + linkend="thread.prof.active"><mallctl>thread.prof.active</mallctl></link> + in newly created threads. See the <link + linkend="opt.prof_thread_active_init"><mallctl>opt.prof_thread_active_init</mallctl></link> + option for additional information.</para></listitem> + </varlistentry> + <varlistentry id="prof.active"> <term> <mallctl>prof.active</mallctl> @@ -1594,8 +1904,9 @@ malloc_conf = "xmalloc:true";]]></programlisting> <listitem><para>Control whether sampling is currently active. See the <link linkend="opt.prof_active"><mallctl>opt.prof_active</mallctl></link> - option for additional information. - </para></listitem> + option for additional information, as well as the interrelated <link + linkend="thread.prof.active"><mallctl>thread.prof.active</mallctl></link> + mallctl.</para></listitem> </varlistentry> <varlistentry id="prof.dump"> @@ -1614,6 +1925,49 @@ malloc_conf = "xmalloc:true";]]></programlisting> option.</para></listitem> </varlistentry> + <varlistentry id="prof.gdump"> + <term> + <mallctl>prof.gdump</mallctl> + (<type>bool</type>) + <literal>rw</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>When enabled, trigger a memory profile dump every time + the total virtual memory exceeds the previous maximum. 
Profiles are + dumped to files named according to the pattern + <filename><prefix>.<pid>.<seq>.u<useq>.heap</filename>, + where <literal><prefix></literal> is controlled by the <link + linkend="opt.prof_prefix"><mallctl>opt.prof_prefix</mallctl></link> + option.</para></listitem> + </varlistentry> + + <varlistentry id="prof.reset"> + <term> + <mallctl>prof.reset</mallctl> + (<type>size_t</type>) + <literal>-w</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Reset all memory profile statistics, and optionally + update the sample rate (see <link + linkend="opt.lg_prof_sample"><mallctl>opt.lg_prof_sample</mallctl></link> + and <link + linkend="prof.lg_sample"><mallctl>prof.lg_sample</mallctl></link>). + </para></listitem> + </varlistentry> + + <varlistentry id="prof.lg_sample"> + <term> + <mallctl>prof.lg_sample</mallctl> + (<type>size_t</type>) + <literal>r-</literal> + [<option>--enable-prof</option>] + </term> + <listitem><para>Get the current sample rate (see <link + linkend="opt.lg_prof_sample"><mallctl>opt.lg_prof_sample</mallctl></link>). + </para></listitem> + </varlistentry> + <varlistentry id="prof.interval"> <term> <mallctl>prof.interval</mallctl> @@ -1637,9 +1991,8 @@ malloc_conf = "xmalloc:true";]]></programlisting> </term> <listitem><para>Pointer to a counter that contains an approximate count of the current number of bytes in active pages. The estimate may be - high, but never low, because each arena rounds up to the nearest - multiple of the chunk size when computing its contribution to the - counter. Note that the <link + high, but never low, because each arena rounds up when computing its + contribution to the counter. Note that the <link linkend="epoch"><mallctl>epoch</mallctl></link> mallctl has no bearing on this counter. 
Furthermore, counter consistency is maintained via atomic operations, so it is necessary to use an atomic operation in @@ -1670,88 +2023,56 @@ malloc_conf = "xmalloc:true";]]></programlisting> equal to <link linkend="stats.allocated"><mallctl>stats.allocated</mallctl></link>. This does not include <link linkend="stats.arenas.i.pdirty"> - <mallctl>stats.arenas.<i>.pdirty</mallctl></link> and pages + <mallctl>stats.arenas.<i>.pdirty</mallctl></link>, nor pages entirely devoted to allocator metadata.</para></listitem> </varlistentry> - <varlistentry id="stats.mapped"> + <varlistentry id="stats.metadata"> <term> - <mallctl>stats.mapped</mallctl> + <mallctl>stats.metadata</mallctl> (<type>size_t</type>) <literal>r-</literal> [<option>--enable-stats</option>] </term> - <listitem><para>Total number of bytes in chunks mapped on behalf of the - application. This is a multiple of the chunk size, and is at least as - large as <link - linkend="stats.active"><mallctl>stats.active</mallctl></link>. This - does not include inactive chunks.</para></listitem> - </varlistentry> - - <varlistentry id="stats.chunks.current"> - <term> - <mallctl>stats.chunks.current</mallctl> - (<type>size_t</type>) - <literal>r-</literal> - [<option>--enable-stats</option>] - </term> - <listitem><para>Total number of chunks actively mapped on behalf of the - application. This does not include inactive chunks. 
- </para></listitem> - </varlistentry> - - <varlistentry id="stats.chunks.total"> - <term> - <mallctl>stats.chunks.total</mallctl> - (<type>uint64_t</type>) - <literal>r-</literal> - [<option>--enable-stats</option>] - </term> - <listitem><para>Cumulative number of chunks allocated.</para></listitem> + <listitem><para>Total number of bytes dedicated to metadata, which + comprise base allocations used for bootstrap-sensitive internal + allocator data structures, arena chunk headers (see <link + linkend="stats.arenas.i.metadata.mapped"><mallctl>stats.arenas.<i>.metadata.mapped</mallctl></link>), + and internal allocations (see <link + linkend="stats.arenas.i.metadata.allocated"><mallctl>stats.arenas.<i>.metadata.allocated</mallctl></link>).</para></listitem> </varlistentry> - <varlistentry id="stats.chunks.high"> + <varlistentry id="stats.resident"> <term> - <mallctl>stats.chunks.high</mallctl> + <mallctl>stats.resident</mallctl> (<type>size_t</type>) <literal>r-</literal> [<option>--enable-stats</option>] </term> - <listitem><para>Maximum number of active chunks at any time thus far. - </para></listitem> + <listitem><para>Maximum number of bytes in physically resident data + pages mapped by the allocator, comprising all pages dedicated to + allocator metadata, pages backing active allocations, and unused dirty + pages. This is a maximum rather than precise because pages may not + actually be physically resident if they correspond to demand-zeroed + virtual memory that has not yet been touched. This is a multiple of the + page size, and is larger than <link + linkend="stats.active"><mallctl>stats.active</mallctl></link>.</para></listitem> </varlistentry> - <varlistentry id="stats.huge.allocated"> + <varlistentry id="stats.mapped"> <term> - <mallctl>stats.huge.allocated</mallctl> + <mallctl>stats.mapped</mallctl> (<type>size_t</type>) <literal>r-</literal> [<option>--enable-stats</option>] </term> - <listitem><para>Number of bytes currently allocated by huge objects. 
- </para></listitem> - </varlistentry> - - <varlistentry id="stats.huge.nmalloc"> - <term> - <mallctl>stats.huge.nmalloc</mallctl> - (<type>uint64_t</type>) - <literal>r-</literal> - [<option>--enable-stats</option>] - </term> - <listitem><para>Cumulative number of huge allocation requests. - </para></listitem> - </varlistentry> - - <varlistentry id="stats.huge.ndalloc"> - <term> - <mallctl>stats.huge.ndalloc</mallctl> - (<type>uint64_t</type>) - <literal>r-</literal> - [<option>--enable-stats</option>] - </term> - <listitem><para>Cumulative number of huge deallocation requests. - </para></listitem> + <listitem><para>Total number of bytes in active chunks mapped by the + allocator. This is a multiple of the chunk size, and is larger than + <link linkend="stats.active"><mallctl>stats.active</mallctl></link>. + This does not include inactive chunks, even those that contain unused + dirty pages, which means that there is no strict ordering between this + and <link + linkend="stats.resident"><mallctl>stats.resident</mallctl></link>.</para></listitem> </varlistentry> <varlistentry id="stats.arenas.i.dss"> @@ -1768,6 +2089,18 @@ malloc_conf = "xmalloc:true";]]></programlisting> </para></listitem> </varlistentry> + <varlistentry id="stats.arenas.i.lg_dirty_mult"> + <term> + <mallctl>stats.arenas.<i>.lg_dirty_mult</mallctl> + (<type>ssize_t</type>) + <literal>r-</literal> + </term> + <listitem><para>Minimum ratio (log base 2) of active to dirty pages. 
+        See <link
+        linkend="opt.lg_dirty_mult"><mallctl>opt.lg_dirty_mult</mallctl></link>
+        for details.</para></listitem>
+      </varlistentry>
+
       <varlistentry id="stats.arenas.i.nthreads">
         <term>
           <mallctl>stats.arenas.<i>.nthreads</mallctl>
@@ -1809,6 +2142,38 @@ malloc_conf = "xmalloc:true";]]></programlisting>
         <listitem><para>Number of mapped bytes.</para></listitem>
       </varlistentry>
 
+      <varlistentry id="stats.arenas.i.metadata.mapped">
+        <term>
+          <mallctl>stats.arenas.<i>.metadata.mapped</mallctl>
+          (<type>size_t</type>)
+          <literal>r-</literal>
+          [<option>--enable-stats</option>]
+        </term>
+        <listitem><para>Number of mapped bytes in arena chunk headers, which
+        track the states of the non-metadata pages.</para></listitem>
+      </varlistentry>
+
+      <varlistentry id="stats.arenas.i.metadata.allocated">
+        <term>
+          <mallctl>stats.arenas.<i>.metadata.allocated</mallctl>
+          (<type>size_t</type>)
+          <literal>r-</literal>
+          [<option>--enable-stats</option>]
+        </term>
+        <listitem><para>Number of bytes dedicated to internal allocations.
+        Internal allocations differ from application-originated allocations in
+        that they are for internal use, and that they are omitted from heap
+        profiles. This statistic is reported separately from <link
+        linkend="stats.metadata"><mallctl>stats.metadata</mallctl></link> and
+        <link
+        linkend="stats.arenas.i.metadata.mapped"><mallctl>stats.arenas.<i>.metadata.mapped</mallctl></link>
+        because it overlaps with e.g. the <link
+        linkend="stats.allocated"><mallctl>stats.allocated</mallctl></link> and
+        <link linkend="stats.active"><mallctl>stats.active</mallctl></link>
+        statistics, whereas the other metadata statistics do
+        not.</para></listitem>
+      </varlistentry>
+
       <varlistentry id="stats.arenas.i.npurge">
         <term>
           <mallctl>stats.arenas.<i>.npurge</mallctl>
@@ -1930,15 +2295,48 @@ malloc_conf = "xmalloc:true";]]></programlisting>
         </para></listitem>
       </varlistentry>
 
-      <varlistentry id="stats.arenas.i.bins.j.allocated">
+      <varlistentry id="stats.arenas.i.huge.allocated">
         <term>
-          <mallctl>stats.arenas.<i>.bins.<j>.allocated</mallctl>
+          <mallctl>stats.arenas.<i>.huge.allocated</mallctl>
           (<type>size_t</type>)
           <literal>r-</literal>
           [<option>--enable-stats</option>]
         </term>
-        <listitem><para>Current number of bytes allocated by
-        bin.</para></listitem>
+        <listitem><para>Number of bytes currently allocated by huge objects.
+        </para></listitem>
+      </varlistentry>
+
+      <varlistentry id="stats.arenas.i.huge.nmalloc">
+        <term>
+          <mallctl>stats.arenas.<i>.huge.nmalloc</mallctl>
+          (<type>uint64_t</type>)
+          <literal>r-</literal>
+          [<option>--enable-stats</option>]
+        </term>
+        <listitem><para>Cumulative number of huge allocation requests served
+        directly by the arena.</para></listitem>
+      </varlistentry>
+
+      <varlistentry id="stats.arenas.i.huge.ndalloc">
+        <term>
+          <mallctl>stats.arenas.<i>.huge.ndalloc</mallctl>
+          (<type>uint64_t</type>)
+          <literal>r-</literal>
+          [<option>--enable-stats</option>]
+        </term>
+        <listitem><para>Cumulative number of huge deallocation requests served
+        directly by the arena.</para></listitem>
+      </varlistentry>
+
+      <varlistentry id="stats.arenas.i.huge.nrequests">
+        <term>
+          <mallctl>stats.arenas.<i>.huge.nrequests</mallctl>
+          (<type>uint64_t</type>)
+          <literal>r-</literal>
+          [<option>--enable-stats</option>]
+        </term>
+        <listitem><para>Cumulative number of huge allocation requests.
+        </para></listitem>
       </varlistentry>
 
       <varlistentry id="stats.arenas.i.bins.j.nmalloc">
@@ -1974,6 +2372,17 @@ malloc_conf = "xmalloc:true";]]></programlisting>
         requests.</para></listitem>
       </varlistentry>
 
+      <varlistentry id="stats.arenas.i.bins.j.curregs">
+        <term>
+          <mallctl>stats.arenas.<i>.bins.<j>.curregs</mallctl>
+          (<type>size_t</type>)
+          <literal>r-</literal>
+          [<option>--enable-stats</option>]
+        </term>
+        <listitem><para>Current number of regions for this size
+        class.</para></listitem>
+      </varlistentry>
+
       <varlistentry id="stats.arenas.i.bins.j.nfills">
         <term>
           <mallctl>stats.arenas.<i>.bins.<j>.nfills</mallctl>
@@ -2068,6 +2477,50 @@ malloc_conf = "xmalloc:true";]]></programlisting>
         <listitem><para>Current number of runs for this size class.
         </para></listitem>
       </varlistentry>
+
+      <varlistentry id="stats.arenas.i.hchunks.j.nmalloc">
+        <term>
+          <mallctl>stats.arenas.<i>.hchunks.<j>.nmalloc</mallctl>
+          (<type>uint64_t</type>)
+          <literal>r-</literal>
+          [<option>--enable-stats</option>]
+        </term>
+        <listitem><para>Cumulative number of allocation requests for this size
+        class served directly by the arena.</para></listitem>
+      </varlistentry>
+
+      <varlistentry id="stats.arenas.i.hchunks.j.ndalloc">
+        <term>
+          <mallctl>stats.arenas.<i>.hchunks.<j>.ndalloc</mallctl>
+          (<type>uint64_t</type>)
+          <literal>r-</literal>
+          [<option>--enable-stats</option>]
+        </term>
+        <listitem><para>Cumulative number of deallocation requests for this
+        size class served directly by the arena.</para></listitem>
+      </varlistentry>
+
+      <varlistentry id="stats.arenas.i.hchunks.j.nrequests">
+        <term>
+          <mallctl>stats.arenas.<i>.hchunks.<j>.nrequests</mallctl>
+          (<type>uint64_t</type>)
+          <literal>r-</literal>
+          [<option>--enable-stats</option>]
+        </term>
+        <listitem><para>Cumulative number of allocation requests for this size
+        class.</para></listitem>
+      </varlistentry>
+
+      <varlistentry id="stats.arenas.i.hchunks.j.curhchunks">
+        <term>
+          <mallctl>stats.arenas.<i>.hchunks.<j>.curhchunks</mallctl>
+          (<type>size_t</type>)
+          <literal>r-</literal>
+          [<option>--enable-stats</option>]
+        </term>
+        <listitem><para>Current number of huge allocations for this size class.
+        </para></listitem>
+      </varlistentry>
     </variablelist>
   </refsect1>
 
   <refsect1 id="debugging_malloc_problems">
@@ -2253,42 +2706,6 @@ malloc_conf = "xmalloc:true";]]></programlisting>
     returns the usable size of the allocation pointed to by
     <parameter>ptr</parameter>.</para>
   </refsect2>
-  <refsect2>
-    <title>Experimental API</title>
-    <para>The <function>allocm<parameter/></function>,
-    <function>rallocm<parameter/></function>,
-    <function>sallocm<parameter/></function>,
-    <function>dallocm<parameter/></function>, and
-    <function>nallocm<parameter/></function> functions return
-    <constant>ALLOCM_SUCCESS</constant> on success; otherwise they return an
-    error value. The <function>allocm<parameter/></function>,
-    <function>rallocm<parameter/></function>, and
-    <function>nallocm<parameter/></function> functions will fail if:
-    <variablelist>
-      <varlistentry>
-        <term><errorname>ALLOCM_ERR_OOM</errorname></term>
-
-        <listitem><para>Out of memory. Insufficient contiguous memory was
-        available to service the allocation request. The
-        <function>allocm<parameter/></function> function additionally sets
-        <parameter>*ptr</parameter> to <constant>NULL</constant>, whereas
-        the <function>rallocm<parameter/></function> function leaves
-        <constant>*ptr</constant> unmodified.</para></listitem>
-      </varlistentry>
-    </variablelist>
-    The <function>rallocm<parameter/></function> function will also
-    fail if:
-    <variablelist>
-      <varlistentry>
-        <term><errorname>ALLOCM_ERR_NOT_MOVED</errorname></term>
-
-        <listitem><para><constant>ALLOCM_NO_MOVE</constant> was specified,
-        but the reallocation request could not be serviced without moving
-        the object.</para></listitem>
-      </varlistentry>
-    </variablelist>
-    </para>
-  </refsect2>
 </refsect1>
 <refsect1 id="environment">
   <title>ENVIRONMENT</title>
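The per-arena statistics this diff adds (`stats.arenas.<i>.metadata.*`, `stats.arenas.<i>.huge.*`, `stats.arenas.<i>.bins.<j>.curregs`) are read through the `mallctl` interface documented earlier in this man page. A minimal sketch in C, assuming a jemalloc build with `--enable-stats` and unprefixed public symbols (builds configured with `--with-jemalloc-prefix=je_`, as Redis's vendored copy is, expose `je_mallctl` instead); arena index `0` is an arbitrary choice for illustration:

```c
/* Sketch: reading the new per-arena statistics via mallctl().
 * Link against jemalloc, e.g.:  cc stats.c -ljemalloc            */
#include <stdio.h>
#include <stdint.h>
#include <jemalloc/jemalloc.h>

int
main(void)
{
	/* Statistics are cached per epoch; write any value to "epoch"
	 * to refresh them before reading. */
	uint64_t epoch = 1;
	size_t len = sizeof(epoch);
	mallctl("epoch", &epoch, &len, &epoch, len);

	size_t allocated, sz = sizeof(allocated);
	if (mallctl("stats.arenas.0.metadata.allocated", &allocated, &sz,
	    NULL, 0) == 0)
		printf("arena 0 metadata allocated: %zu bytes\n", allocated);

	uint64_t nmalloc;
	sz = sizeof(nmalloc);
	if (mallctl("stats.arenas.0.huge.nmalloc", &nmalloc, &sz,
	    NULL, 0) == 0) {
		printf("arena 0 huge allocations: %llu\n",
		    (unsigned long long)nmalloc);
	}
	return (0);
}
```

Note the value types must match the `(<type>…</type>)` annotations in the entries above (`size_t` vs. `uint64_t`); `mallctl` returns `EINVAL` if `*oldlenp` does not match the statistic's size.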