Commit message | Author | Age | Files | Lines
* Update ChangeLog for 4.3.0. (rel-4.3.0) | Jason Evans | 2016-11-04 | 1 | -0/+2
|
* Fix arena data structure size calculation. | Jason Evans | 2016-11-04 | 1 | -2/+2
| Fix paren placement so that QUANTUM_CEILING() applies to the correct portion of the expression that computes how much memory to base_alloc(). In practice this bug had no impact.
|
| This was caused by 5d8db15db91c85d47b343cfc07fc6ea736f0de48 (Simplify run quantization.), which in turn fixed an over-allocation regression caused by 3c4d92e82a31f652a7c77ca937a02d0185085b06 (Add per size class huge allocation statistics.).
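The effect of paren placement around a ceiling macro can be illustrated with a minimal sketch. Everything here is hypothetical: the 16-byte `QUANTUM`, this definition of `QUANTUM_CEILING()`, and both helper functions are assumptions for illustration, not jemalloc's code, and the sketch does not claim which form the commit adopted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte quantum; the real value is platform-dependent. */
#define QUANTUM ((size_t)16)
#define QUANTUM_CEILING(a) (((a) + QUANTUM - 1) & ~(QUANTUM - 1))

/* Ceiling applied to the first term only; the trailing array is added unrounded. */
static size_t ceil_first_term(size_t hdr, size_t nelms, size_t elm_size) {
    assert(elm_size == 0 || nelms <= SIZE_MAX / elm_size);
    return QUANTUM_CEILING(hdr) + nelms * elm_size;
}

/* Ceiling applied to the whole sum. */
static size_t ceil_whole_sum(size_t hdr, size_t nelms, size_t elm_size) {
    assert(elm_size == 0 || nelms <= SIZE_MAX / elm_size);
    return QUANTUM_CEILING(hdr + nelms * elm_size);
}
```

With a 40-byte header and three 8-byte elements, `ceil_first_term(40, 3, 8)` yields 72 while `ceil_whole_sum(40, 3, 8)` yields 64, so moving one parenthesis changes the number of bytes requested.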
* Fixes to Visual Studio Project files | Matthew Parkinson | 2016-11-04 | 2 | -3/+19
|
* Use -std=gnu11 if available. | Jason Evans | 2016-11-04 | 1 | -2/+8
| This supersedes -std=gnu99, and enables C11 atomics.
* Update ChangeLog for 4.3.0. | Jason Evans | 2016-11-04 | 1 | -3/+9
|
* Fix large allocation to search optimal size class heap. | Jason Evans | 2016-11-03 | 1 | -1/+1
| Fix arena_run_alloc_large_helper() to not convert size to usize when searching for the first best fit via arena_run_first_best_fit(). This allows the search to consider the optimal quantized size class, so that e.g. allocating and deallocating 40 KiB in a tight loop can reuse the same memory.
|
| This regression was nominally caused by 5707d6f952c71baa2f19102479859012982ac821 (Quantize szad trees by size class.), but it did not commonly cause problems until 8a03cf039cd06f9fa6972711195055d865673966 (Implement cache index randomization for large allocations.). These regressions were first released in 4.0.0.
|
| This resolves #487.
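The tight-loop workload mentioned above can be reproduced with a small probe. This is a sketch under assumptions: `count_address_changes()` is a hypothetical helper, it uses whatever `malloc()` the program is linked against, and the result depends entirely on the allocator, so no particular count is implied.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate and free `size` bytes `iters` times and count how often the
 * returned address differs from the previous iteration's address.
 */
static unsigned count_address_changes(size_t size, unsigned iters) {
    uintptr_t prev = 0;
    unsigned changes = 0;

    assert(size > 0);
    for (unsigned i = 0; i < iters; i++) {
        void *p = malloc(size);
        if (p == NULL)
            break;
        /* Touch the memory so the allocation is not optimized away. */
        ((volatile unsigned char *)p)[0] = 1;
        uintptr_t cur = (uintptr_t)p;
        if (i > 0 && cur != prev)
            changes++;
        prev = cur;
        free(p);
    }
    return changes;
}
```

Calling `count_address_changes(40 * 1024, 1000)` gives a rough indication of reuse: a low count means the allocator keeps handing back the same memory, which is the behavior this fix aims to restore.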
* Fix chunk_alloc_cache() to support decommitted allocation. | Jason Evans | 2016-11-03 | 3 | -12/+14
| Fix chunk_alloc_cache() to support decommitted allocation, and use this ability in arena_chunk_alloc_internal() and arena_stash_dirty(), so that chunks don't get permanently stuck in a hybrid state.
|
| This resolves #487.
* Update symbol mangling. | Jason Evans | 2016-11-03 | 1 | -0/+3
|
* Update ChangeLog for 4.3.0. | Jason Evans | 2016-11-02 | 1 | -0/+37
|
* Update project URL. | Jason Evans | 2016-11-02 | 3 | -3/+3
|
* Check for existence of CPU_COUNT macro before using it. | Dave Watson | 2016-11-02 | 1 | -1/+7
| This resolves #485.
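A guard of this kind might look like the following sketch. It assumes Linux with glibc; `usable_cpus()` is a hypothetical helper, and it uses `sched_getaffinity()` purely for illustration rather than whichever call the patched code makes.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes cpu_set_t, CPU_COUNT and sched_getaffinity() in glibc */
#endif
#include <assert.h>
#include <sched.h>
#include <unistd.h>

/* Return the number of CPUs this process may run on (at least 1). */
static long usable_cpus(void) {
    long n = -1;
#if defined(CPU_COUNT)
    /* Only reference CPU_COUNT when the C library actually defines it. */
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) == 0)
        n = (long)CPU_COUNT(&set);
#endif
    if (n < 1)
        n = sysconf(_SC_NPROCESSORS_ONLN); /* fallback when the macro is missing */
    if (n < 1)
        n = 1;
    assert(n >= 1);
    return n;
}
```

On a C library that lacks `CPU_COUNT`, the `#if` block compiles to nothing and the function falls back to `sysconf()`.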
* Fix syscall(2) configure test for Linux. | Jason Evans | 2016-11-02 | 1 | -2/+1
|
* Do not use syscall(2) on OS X 10.12 (deprecated). | Jason Evans | 2016-11-02 | 4 | -4/+24
|
* Add os_unfair_lock support. | Jason Evans | 2016-11-02 | 7 | -0/+42
| OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended replacement.
* Fix/refactor zone allocator integration code. | Jason Evans | 2016-11-02 | 2 | -86/+108
| Fix zone_force_unlock() to reinitialize, rather than unlocking mutexes, since OS X 10.12 cannot tolerate a child unlocking mutexes that were locked by its parent.
|
| Refactor; this was a side effect of experimenting with zone {de,re}registration during fork(2).
* Call _exit(2) rather than exit(3) in forked child. | Jason Evans | 2016-11-02 | 1 | -1/+1
| _exit(2) is async-signal-safe, whereas exit(3) is not.
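The pattern can be sketched as follows for a POSIX system. `run_child_and_wait()` is a hypothetical helper written for illustration; it is not the test code touched by the commit.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that terminates immediately with `code`, then return the
 * exit status observed by the parent, or -1 on error.
 */
static int run_child_and_wait(int code) {
    assert(code >= 0 && code <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /*
         * Child: _exit() terminates without running atexit() handlers or
         * flushing stdio buffers, which exit() would do.
         */
        _exit(code);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_child_and_wait(7)` returns 7 in the parent once the child has exited.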
* Force no lazy-lock on Windows. | Jason Evans | 2016-11-02 | 1 | -5/+11
| Monitoring thread creation is unimplemented for Windows, which means lazy-lock cannot function correctly.
|
| This resolves #310.
* Use <quote>...</quote> rather than &ldquo;...&rdquo; or "..." in XML. | Jason Evans | 2016-11-01 | 2 | -20/+21
|
* Add "J" (JSON) support to malloc_stats_print(). | Jason Evans | 2016-11-01 | 2 | -398/+876
| This resolves #474.
* Refactor witness_unlock() to fix undefined test behavior. | Jason Evans | 2016-10-31 | 2 | -11/+29
| This resolves #396.
* Use CLOCK_MONOTONIC_COARSE rather than CLOCK_MONOTONIC_RAW. | Jason Evans | 2016-10-29 | 3 | -10/+10
| The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC), whereas the coarse clock variant is faster than CLOCK_MONOTONIC, but still has resolution (~1ms) that is adequate for our purposes.
|
| This resolves #479.
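Preferring the coarse clock where it exists might be sketched like this. It assumes a Linux-style `<time.h>` in which `CLOCK_MONOTONIC_COARSE` is a macro; `ts_to_ns()` and `monotonic_now_ns()` are hypothetical names, not jemalloc's nstime API.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() under strict -std modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static uint64_t ts_to_ns(const struct timespec *ts) {
    assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
    return (uint64_t)ts->tv_sec * UINT64_C(1000000000) + (uint64_t)ts->tv_nsec;
}

/* Read a monotonic timestamp, preferring the cheaper coarse clock. */
static uint64_t monotonic_now_ns(void) {
    struct timespec ts;
#ifdef CLOCK_MONOTONIC_COARSE
    if (clock_gettime(CLOCK_MONOTONIC_COARSE, &ts) == 0)
        return ts_to_ns(&ts);
#endif
    if (clock_gettime(CLOCK_MONOTONIC, &ts) == 0)
        return ts_to_ns(&ts);
    return 0; /* no monotonic clock available */
}
```

Two successive calls should never go backwards, although with the coarse clock they will often return the same value because the timestamp only advances at the clock's tick resolution.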
* Use syscall(2) rather than {open,read,close}(2) during boot. | Jason Evans | 2016-10-29 | 1 | -0/+19
| Some applications wrap various system calls, and if they call the allocator in their wrappers, unexpected reentry can result. This is not a general solution (many other syscalls are spread throughout the code), but this resolves a bootstrapping issue that is apparently common.
|
| This resolves #443.
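Bypassing the libc wrappers could look roughly like this on Linux. The helper `read_file_raw()` is hypothetical, and the sketch uses `SYS_openat` because it is available on all Linux architectures; it is not necessarily the syscall number the commit uses.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() and AT_FDCWD in glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to `len` bytes from `path` using raw system calls, so that
 * interposed open()/read()/close() wrappers are never entered.
 * Returns the number of bytes read, or -1 on failure.
 */
static ssize_t read_file_raw(const char *path, char *buf, size_t len) {
    assert(path != NULL && buf != NULL);

    long fd = syscall(SYS_openat, AT_FDCWD, path, O_RDONLY);
    if (fd < 0)
        return -1;
    long nread = syscall(SYS_read, (int)fd, buf, len);
    /* Always release the descriptor, even if the read failed. */
    syscall(SYS_close, (int)fd);
    return (ssize_t)nread;
}
```

A caller could pass a path such as "/proc/sys/vm/overcommit_memory" with a small stack buffer; because no libc wrapper runs, an application that interposes `open()` cannot re-enter the allocator through this path.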
* Fix EXTRA_CFLAGS to not affect configuration. | Jason Evans | 2016-10-29 | 2 | -5/+4
|
* Do not mark malloc_conf as weak on Windows. | Jason Evans | 2016-10-29 | 1 | -1/+1
| This works around malloc_conf not being properly initialized by at least the cygwin toolchain. Prior build system changes to use -Wl,--[no-]whole-archive may be necessary for malloc_conf resolution to work properly as a non-weak symbol (not tested).
* Do not mark malloc_conf as weak for unit tests. | Jason Evans | 2016-10-28 | 1 | -1/+5
| This is generally correct (no need for weak symbols since no jemalloc library is involved in the link phase), and avoids linking problems (apparently uninitialized non-NULL malloc_conf) when using cygwin with gcc.
* Support static linking of jemalloc with glibc | Dave Watson | 2016-10-28 | 2 | -0/+34
| glibc defines its malloc implementation with several weak and strong symbols:
|
|   strong_alias (__libc_calloc, __calloc) weak_alias (__libc_calloc, calloc)
|   strong_alias (__libc_free, __cfree) weak_alias (__libc_free, cfree)
|   strong_alias (__libc_free, __free) strong_alias (__libc_free, free)
|   strong_alias (__libc_malloc, __malloc) strong_alias (__libc_malloc, malloc)
|
| The issue is not with the weak symbols, but that other parts of glibc depend on __libc_malloc explicitly. Defining them in terms of jemalloc APIs allows the linker to drop glibc's malloc.o completely from the link, and static linking no longer results in symbol collisions.
|
| Another wrinkle: jemalloc during initialization calls sysconf to get the number of CPUs. GLIBC allocates for the first time before setting up isspace (and other related) tables, which are used by sysconf. Instead, use the pthread API to get the number of CPUs with GLIBC, which seems to work.
|
| This resolves #442.
* Reduce memory requirements for regression tests. | Jason Evans | 2016-10-28 | 3 | -35/+55
| This is intended to drop memory usage to a level that AppVeyor test instances can handle.
|
| This resolves #393.
* Periodically purge in memory-intensive integration tests. | Jason Evans | 2016-10-28 | 1 | -0/+7
| This resolves #393.
* Periodically purge in memory-intensive integration tests. | Jason Evans | 2016-10-28 | 3 | -6/+27
| This resolves #393.
* Only link with libm (-lm) if necessary. | Jason Evans | 2016-10-28 | 2 | -6/+16
| This fixes warnings when building with MSVC.
* Only use --whole-archive with gcc. | Jason Evans | 2016-10-28 | 3 | -3/+7
| Conditionalize use of --whole-archive on the platform plus compiler, rather than on the ABI. This fixes a regression caused by 7b24c6e5570062495243f1e55131b395adb31e33 (Use --whole-archive when linking integration tests on MinGW.).
* Do not force lazy lock on Windows. | Jason Evans | 2016-10-28 | 1 | -1/+0
| This reverts 13473c7c66a81a4dc1cf11a97e9c8b1dbb785b64, which was intended to work around bootstrapping issues when linking statically. However, this actually causes problems in various other configurations, so this reversion may force a future fix for the underlying problem, if it still exists.
* Fix over-sized allocation of rtree leaf nodes. | Jason Evans | 2016-10-28 | 1 | -1/+1
| Use the correct level metadata when allocating child nodes so that leaf nodes don't end up over-sized (2^16 elements vs 2^4 elements).
* Use --whole-archive when linking integration tests on MinGW. | Jason Evans | 2016-10-25 | 1 | -1/+10
| Prior to this change, the malloc_conf weak symbol provided by the jemalloc dynamic library is always used, even if the application provides a malloc_conf symbol. Use the --whole-archive linker option to allow the weak symbol to be overridden.
* Do not (recursively) allocate within tsd_fetch(). | Jason Evans | 2016-10-21 | 14 | -130/+176
| Refactor tsd so that tsdn_fetch() does not trigger allocation, since allocation could cause infinite recursion.
|
| This resolves #458.
* Make dss operations lockless. | Jason Evans | 2016-10-13 | 11 | -146/+127
| Rather than protecting dss operations with a mutex, use atomic operations. This has negligible impact on synchronization overhead during typical dss allocation, but is a substantial improvement for chunk_in_dss() and the newly added chunk_dss_mergeable(), which can be called multiple times during chunk deallocations.
|
| This change also has the advantage of avoiding tsd in deallocation paths associated with purging, which resolves potential deadlocks during thread exit due to attempted tsd resurrection.
|
| This resolves #425.
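The general idea of replacing a mutex with atomic operations can be sketched with C11 atomics over a fixed buffer. This is only an analogy: `region`, `region_alloc()` and `region_contains()` are hypothetical names, and the real dss code manages memory obtained via sbrk(2), which this sketch does not attempt.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define REGION_SIZE ((size_t)4096)

static unsigned char region[REGION_SIZE];
static _Atomic size_t region_used = 0;

/* Reserve `size` bytes with a compare-and-swap loop; NULL when exhausted. */
static void *region_alloc(size_t size) {
    assert(size > 0);

    size_t old = atomic_load(&region_used);
    for (;;) {
        if (size > REGION_SIZE - old)
            return NULL;
        /* On failure, `old` is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&region_used, &old, old + size))
            return &region[old];
    }
}

/* Membership test that needs only one atomic load and no lock. */
static int region_contains(const void *p) {
    uintptr_t addr = (uintptr_t)p;
    uintptr_t base = (uintptr_t)region;
    return addr >= base && addr < base + atomic_load(&region_used);
}
```

The membership test is where the design pays off in this sketch: a query such as `region_contains(ptr)` never blocks behind an allocating thread, which mirrors the benefit the commit describes for chunk_in_dss().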
* Add/use adaptive spinning. | Jason Evans | 2016-10-13 | 6 | -2/+66
| Add spin_t and spin_{init,adaptive}(), which provide a simple abstraction for adaptive spinning.
|
| Adaptively spin during busy waits in bootstrapping and rtree node initialization.
* Disallow 0x5a junk filling when running in Valgrind. | Jason Evans | 2016-10-12 | 1 | -6/+28
| Explicitly disallow junk:true and junk:free runtime settings when running in Valgrind, since deallocation-time junk filling and redzone validation cause false positive Valgrind reports.
|
| This resolves #470.
* Fix and simplify decay-based purging. | Jason Evans | 2016-10-11 | 2 | -69/+69
| Simplify decay-based purging attempts to only be triggered when the epoch is advanced, rather than every time purgeable memory increases. In a correctly functioning system (not previously the case; see below), this only causes a behavior difference if during subsequent purge attempts the least recently used (LRU) purgeable memory extent is initially too large to be purged, but that memory is reused between attempts and one or more of the next LRU purgeable memory extents are small enough to be purged. In practice this is an arbitrary behavior change that is within the set of acceptable behaviors.
|
| As for the purging fix, ensure that arena->decay.ndirty is recorded *after* the epoch advance and associated purging occurs. Prior to this fix, it was possible for purging during epoch advance to cause a substantially underrepresentative (arena->ndirty - arena->decay.ndirty), i.e. the number of dirty pages attributed to the current epoch was too low, and a series of unintended purges could result. This fix is also relevant in the context of the simplification described above, but the bug's impact would be limited to over-purging at epoch advances.
* Fix decay tests to all adapt to nstime_monotonic(). | Jason Evans | 2016-10-11 | 1 | -6/+9
|
* Do not advance decay epoch when time goes backwards. | Jason Evans | 2016-10-10 | 6 | -6/+63
| Instead, move the epoch backward in time. Additionally, add nstime_monotonic() and use it in debug builds to assert that time only goes backward if nstime_update() is using a non-monotonic time source.
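The policy described above can be sketched with a small state machine. The `decay_sketch_t` type, its fields, and `decay_sketch_update()` are hypothetical names chosen for illustration, and the deadline handling is an assumption; jemalloc's arena_decay_t is more involved.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t epoch_ns;    /* start of the current epoch */
    uint64_t interval_ns; /* length of one epoch; must be nonzero */
    uint64_t deadline_ns; /* time at which the epoch should next advance */
} decay_sketch_t;

/*
 * Feed a new timestamp into the decay state. Returns the number of whole
 * epochs advanced; 0 if the deadline has not passed or time went backwards.
 */
static uint64_t decay_sketch_update(decay_sketch_t *d, uint64_t now_ns) {
    assert(d->interval_ns > 0);

    if (now_ns < d->epoch_ns) {
        /* Time went backwards: move the epoch back instead of advancing. */
        d->epoch_ns = now_ns;
        d->deadline_ns = now_ns + d->interval_ns;
        return 0;
    }
    if (now_ns < d->deadline_ns)
        return 0;

    uint64_t nadvance = (now_ns - d->epoch_ns) / d->interval_ns;
    d->epoch_ns += nadvance * d->interval_ns;
    d->deadline_ns = d->epoch_ns + d->interval_ns;
    return nadvance;
}
```

Starting from an epoch of 1000 ns with a 100 ns interval, a timestamp of 900 ns rewinds the epoch to 900 without advancing it, and a later timestamp of 1150 ns advances it by two epochs to 1100.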
* Refactor arena->decay_* into arena->decay.* (arena_decay_t). | Jason Evans | 2016-10-10 | 2 | -84/+91
|
* Refine nstime_update(). | Jason Evans | 2016-10-10 | 5 | -38/+109
| Add missing #include <time.h>. The critical time facilities appear to have been transitively included via unistd.h and sys/time.h, but in principle this omission could have caused clock_gettime(CLOCK_MONOTONIC, ...) to be overlooked in favor of gettimeofday(), which in turn could cause spurious non-monotonic time updates.
|
| Refactor nstime_get() out of nstime_update() and add configure tests for all variants.
|
| Add CLOCK_MONOTONIC_RAW support (Linux-specific) and mach_absolute_time() support (OS X-specific).
|
| Do not fall back to clock_gettime(CLOCK_REALTIME, ...). This was a fragile Linux-specific workaround, which we're unlikely to use at all now that clock_gettime(CLOCK_MONOTONIC_RAW, ...) is supported, and if we have no choice besides non-monotonic clocks, gettimeofday() is only incrementally worse.
* Simplify run quantization. | Jason Evans | 2016-10-06 | 3 | -154/+31
|
* Refactor runs_avail. | Jason Evans | 2016-10-04 | 6 | -49/+70
| Use pszind_t size classes rather than szind_t size classes, and always reserve space for NPSIZES elements. This removes unused heaps that are not multiples of the page size, and adds (currently) unused heaps for all huge size classes, with the immediate benefit that the size of arena_t allocations is constant (no longer dependent on chunk size).
* Implement psz2ind(), pind2sz(), and psz2u(). | Jason Evans | 2016-10-04 | 6 | -28/+203
| These compute size classes and indices similarly to size2index(), index2size() and s2u(), respectively, but using the subset of size classes that are multiples of the page size. Note that pszind_t and szind_t are not interchangeable.
* Use TSDN_NULL rather than NULL as appropriate. | Jason Evans | 2016-10-04 | 3 | -9/+9
|
* Fix a typo. | Jason Evans | 2016-10-04 | 1 | -1/+1
|
* Define 64-bit atomics unconditionally | Mike Hommey | 2016-10-04 | 1 | -10/+8
| They are used on all platforms in prng.h.
* Close file descriptor after reading "/proc/sys/vm/overcommit_memory". | Jason Evans | 2016-09-26 | 1 | -0/+1
| This bug was introduced by c2f970c32b527660a33fa513a76d913c812dcf7c (Modify pages_map() to support mapping uncommitted virtual memory.).
|
| This resolves #399.