Commit message | Author | Age | Files | Lines
* Refactor rabbit_queue_index:queue_index_walker [durable_startup_speed] (Philip Kuryloski, 2020-03-27, 2 files, -38/+32)
|     Attempt to improve readability by eliminating nested folding functions.
* Create SECURITY.md (Michael Klishin, 2020-03-25, 1 file, -0/+24)
|
* Merge pull request #2279 from rabbitmq/startup_memory_fix (Michael Klishin, 2020-03-25, 2 files, -232/+70)
|\
| |   Reduce memory usage during startup
| * Compile from scratch [startup_memory_fix] (Michael Klishin, 2020-03-23, 1 file, -2/+2)
| |
| * Move worker_pool_SUITE to rabbitmq-common (Philip Kuryloski, 2020-03-23, 1 file, -188/+0)
| |   https://github.com/rabbitmq/rabbitmq-common/pull/368/commits/36c9fbe59af6d6cce67fc430b333c44f30cc4c40
| * Fail vhost startup if index workers are queued unsuccessfully (Philip Kuryloski, 2020-03-23, 1 file, -5/+6)
| |   Using spawn_link in rabbit_msg_store:build_index alters the supervision tree such that there are unwanted side effects in rabbit_vhost_msg_store. We monitor the spawned process so that, if there is a failure to enqueue the scan for each file, the vhost fails to start and reports an error.
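A minimal sketch of the monitor-based approach this commit describes; the module and function names are illustrative stand-ins, not the actual rabbit_msg_store code:

```erlang
-module(index_build_sketch).
-export([build_index/1]).

%% Hypothetical stand-in for the real per-file scan job.
scan_file(_File) -> ok.

%% Spawn the index builder and monitor it rather than linking to it.
%% If enqueueing any per-file scan fails, the 'DOWN' reason surfaces as
%% an error the caller can use to fail vhost startup, without side
%% effects on the supervision tree.
build_index(Files) ->
    {Pid, MRef} = erlang:spawn_monitor(
                    fun() ->
                            lists:foreach(fun(F) -> ok = scan_file(F) end,
                                          Files)
                    end),
    receive
        {'DOWN', MRef, process, Pid, normal} -> ok;
        {'DOWN', MRef, process, Pid, Reason} -> {error, Reason}
    end.
```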
| * Improve worker_pool worker utilization (Philip Kuryloski, 2020-03-20, 1 file, -1/+1)
| |   Make use of the new dispatch_sync function in https://github.com/rabbitmq/rabbitmq-common/pull/368 to block only when all workers are busy.
| * Reduce memory usage during startup (Philip Kuryloski, 2020-03-18, 1 file, -44/+69)
| |   Previously, in the case of large backlogs of persistent messages (tens of millions), we queued a job for every file with worker_pool:submit_async. However, if there are 50 million messages, this corresponds to ~79,000 files and the same number of pending tasks in the worker pool. The mailbox for worker_pool explodes under these circumstances, using massive amounts of memory. The following was helpful in zeroing in on the problem: https://elixirforum.com/t/extremely-high-memory-usage-in-genservers/4035
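A sketch of the before/after, assuming the dispatch_sync/1 function added in the rabbitmq-common PR referenced above (its exact signature is an assumption here); the point is that the producer blocks when all workers are busy instead of flooding the pool's mailbox:

```erlang
-module(bounded_submit_sketch).
-export([submit_all_async/1, submit_all_bounded/1]).

%% Hypothetical stand-in for the real per-file scan job.
scan_file(_File) -> ok.

%% Before: one async message per file lands in worker_pool's mailbox
%% (~79,000 of them for a 50-million-message backlog).
submit_all_async(Files) ->
    [worker_pool:submit_async(fun() -> scan_file(F) end) || F <- Files].

%% After: dispatch_sync/1 blocks the caller until a worker is free, so
%% the pool's mailbox stays small.
submit_all_bounded(Files) ->
    [worker_pool:dispatch_sync(fun() -> scan_file(F) end) || F <- Files].
```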
* | More logging around peer discovery backend initialisation (Michael Klishin, 2020-03-24, 1 file, -3/+5)
| |
* | product_info_SUITE: Try to grep several times (Jean-Sébastien Pédron, 2020-03-24, 1 file, -3/+14)
| |   ... in case the log file was not fsync'd yet (and thus we don't see the content yet). This happens sometimes in Travis CI, for instance.
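A sketch of the retry pattern (attempt count, sleep interval and the example predicate are illustrative, not the SUITE's actual values):

```erlang
-module(retry_sketch).
-export([retry_until/2]).

%% Re-evaluate a predicate up to N times, sleeping between attempts, to
%% tolerate log content that has not been flushed to disk yet.
retry_until(_Pred, 0) ->
    false;
retry_until(Pred, AttemptsLeft) ->
    case Pred() of
        true  -> true;
        false -> timer:sleep(500),
                 retry_until(Pred, AttemptsLeft - 1)
    end.

%% Example: retry_until(fun() -> filelib:is_file("app.log") end, 10).
```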
* | Merge pull request #2285 from rabbitmq/test-wait-for-confirms (Michael Klishin, 2020-03-23, 1 file, -46/+25)
|\ \
| | |   Wait for confirms on test suite
| * | Wait for confirms on test suite [test-wait-for-confirms] (dcorbacho, 2020-03-23, 1 file, -46/+25)
|/ /
| |   Don't wait for consensus, as the publish could be delayed.
* | feature_flags_SUITE: Make sure `rabbit` is not recompiled (Jean-Sébastien Pédron, 2020-03-23, 1 file, -1/+13)
| |   ... while building `my_plugin`. We clear ALL_DEPS_DIRS to make sure they are not recompiled when the plugin is built. `rabbit` was previously compiled with -DTEST, and if it is recompiled because of this plugin, it will be recompiled without -DTEST: the testsuite depends on test code, so we can't allow that. Note that we do not clear the DEPS variable: we need it to be correct because it is used to generate `my_plugin.app` (and a RabbitMQ plugin must depend on `rabbit`).
* | feature_flags_SUITE: Adapt `registry` test (Jean-Sébastien Pédron, 2020-03-20, 1 file, -18/+16)
| |   ... to explicitly inject its own feature flags, instead of relying on actual module attributes.
* | Make the list of discovered nodes unique before using it (Michael Klishin, 2020-03-20, 1 file, -1/+2)
| |   Backends can return duplicates, sometimes for reasons outside of their control, e.g. implicit or explicit versioning of values by the data store they are backed by.
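In Erlang, deduplicating the discovered list comes down to lists:usort/1, as in this shell sketch:

```erlang
1> lists:usort(['rabbit@host-a', 'rabbit@host-b', 'rabbit@host-a']).
['rabbit@host-a','rabbit@host-b']
```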
* | Make sure peer discovery module is loaded before initialisation (Michael Klishin, 2020-03-20, 1 file, -0/+1)
| |
* | Merge pull request #2281 from rabbitmq/fix-ff-registry-loading+improve-ff-testing (Jean-Sébastien Pédron, 2020-03-20, 2 files, -36/+36)
|\ \
| |/
|/|   Fix feature flags registry loading + improve feature flags testing
| * rabbit_feature_flags: Wait for old registry to be purged (Jean-Sébastien Pédron, 2020-03-19, 1 file, -2/+14)
| |   ... before deleting it and loading the new code. In some rare cases, the soft purge didn't work because another process was running the old code, so the delete would fail. Now we wait for the soft purge to succeed before proceeding.
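A sketch of the wait loop, with illustrative names and polling interval: code:soft_purge/1 returns false while some process still runs the module's old code, so we retry until it succeeds before deleting.

```erlang
-module(purge_sketch).
-export([purge_then_delete/1]).

%% Keep soft-purging until no process runs the old code, then delete the
%% current version so a new one can be loaded in its place.
purge_then_delete(Module) ->
    case code:soft_purge(Module) of
        true  -> code:delete(Module);
        false -> timer:sleep(10),
                 purge_then_delete(Module)
    end.
```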
| * rabbit_feature_flags: Add an API for testsuites to add their own feature flags (Jean-Sébastien Pédron, 2020-03-19, 2 files, -34/+22)
|/
|     This should be more robust than relying on the caller (through a forced exception). Way more robust, considering that the latter seems to not work at all :)
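For context, a minimal sketch of how a feature flag is normally declared via a module attribute, which the registry discovers by scanning; the flag name and properties below are purely illustrative:

```erlang
-module(ff_attribute_sketch).

%% A feature flag declared the usual way, via a module attribute that
%% the feature flags registry discovers at boot. The testsuite API added
%% in this commit injects flags directly, bypassing this scan.
-rabbit_feature_flag({my_test_flag,
                      #{desc      => "Example flag used by a testsuite",
                        stability => stable}}).
```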
* rabbit_fifo_SUITE mock log better (kjnilsson, 2020-03-17, 1 file, -3/+6)
|
* Merge pull request #2272 from rabbitmq/rabbit-fifo-force-gc (Gerhard Lazu, 2020-03-16, 2 files, -7/+37)
|\
| |   rabbit_fifo: force gc when queue is empty
| * Log full GC sweep as debug, not info (Gerhard Lazu, 2020-03-13, 2 files, -3/+5)
| |   Use a friendlier log message that converts bytes to megabytes. cc @kjnilsson
| |   Signed-off-by: Gerhard Lazu <gerhard@lazu.co.uk>
| * rabbit_fifo: force gc when queue is empty (kjnilsson, 2020-03-12, 2 files, -7/+35)
| |   ... and it uses more than a fixed limit of ~2MB of total memory. An empty queue should only need around 100-200KB of memory, so this should avoid any unnecessary full sweeps. [#171644231]
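A sketch of the heuristic with the thresholds quoted above; the function shape and names are illustrative, not the rabbit_fifo code:

```erlang
-module(fifo_gc_sketch).
-export([maybe_gc/1]).

-define(EMPTY_GC_LIMIT, 2 * 1024 * 1024).  %% ~2MB

%% Force a full-sweep GC only when the queue is empty yet the process
%% still holds more memory than the fixed limit.
maybe_gc(0 = _MsgCount) ->
    {memory, Bytes} = erlang:process_info(self(), memory),
    case Bytes > ?EMPTY_GC_LIMIT of
        true  -> garbage_collect(), ok;
        false -> ok
    end;
maybe_gc(_MsgCount) ->
    ok.
```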
* | Merge pull request #2276 from rabbitmq/mk-peer-discovery-retries (Jean-Sébastien Pédron, 2020-03-16, 7 files, -25/+268)
|\ \
| | |   Introduce peer discovery retries
| * | Make default peer discovery retry settings consistent (Michael Klishin, 2020-03-16, 1 file, -1/+1)
| | |
| * | Retry on [some] peer discovery failures (Michael Klishin, 2020-03-16, 5 files, -25/+118)
| | |   When the backend returns an error, we retry. If we fail to join discovered peers, we also retry. Schema table sync retries are already in place, so nothing to change there. Closes #1627. Pair: @dumbbell.
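A sketch of the retry behaviour; the attempt count and delay below are illustrative, not the actual defaults:

```erlang
-module(discovery_retry_sketch).
-export([retry_discovery/1]).

%% Retry a peer discovery operation a bounded number of times on error.
retry_discovery(Fun) ->
    retry_discovery(Fun, 10, 500).

retry_discovery(Fun, 1, _DelayMs) ->
    Fun();
retry_discovery(Fun, AttemptsLeft, DelayMs) ->
    case Fun() of
        {ok, _} = Ok -> Ok;
        {error, _}   -> timer:sleep(DelayMs),
                        retry_discovery(Fun, AttemptsLeft - 1, DelayMs)
    end.
```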
| * | Classic config peer discovery: conditionally support registration (Michael Klishin, 2020-03-16, 1 file, -1/+19)
| | |   If registration is supported, a randomized delay will be injected. It makes sense to support it in this config; in fact, our own test suites are evidence of that: they start all nodes in parallel. However, this backend is used by default, so even a blank single node would otherwise wait on a delay that serves no purpose. So we make it conditional: if no peer nodes are configured, we don't induce any delay. Pair: @dumbbell.
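A sketch of the conditional delay, with illustrative names: no configured peers means no delay at all.

```erlang
-module(delay_sketch).
-export([maybe_registration_delay/2]).

%% A blank single node has no configured peers, so it must not pay the
%% startup delay; clustered nodes sleep a random amount to stagger
%% registration.
maybe_registration_delay([] = _ConfiguredPeers, _MaxDelayMs) ->
    ok;
maybe_registration_delay(_ConfiguredPeers, MaxDelayMs) ->
    timer:sleep(rand:uniform(MaxDelayMs)).
```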
| * | rabbit_nodes: introduce make/2 for convenience (Michael Klishin, 2020-03-16, 1 file, -3/+6)
| | |   Pair: @dumbbell.
| * | Integration tests for classic config peer discovery backend (Michael Klishin, 2020-03-16, 1 file, -0/+128)
| | |   Pair: @dumbbell.
| * | Log it when schema tables were successfully synced (Michael Klishin, 2020-03-16, 1 file, -0/+1)
|/ /
| |   Pair: @dumbbell.
* | Add a test case that imports definitions on boot (Michael Klishin, 2020-03-12, 1 file, -3/+34)
|/
|     This is a follow-up to c503d57e884dd440aa6b830bb9835be4dd2d8f69, which was a follow-up to 12571627a6089aa7d603e92b3e35745c8d398b3e.
* Make sure definition import work pool is started before actual import boot step (Michael Klishin, 2020-03-12, 1 file, -14/+15)
|
* Revert "Use default work pool for definition import"Michael Klishin2020-03-121-0/+1
| | | | This reverts commit 1a60b43aa99bb8d97fe29a3c5266ede7678fe26e.
* Add a comment to remind us to discuss IMPORT_WORK_POOL (Gerhard Lazu, 2020-03-12, 1 file, -0/+2)
|     cc @michaelklishin
|     Signed-off-by: Philip Kuryloski <pkuryloski@pivotal.io>
|     (cherry picked from commit abc7b0aaad43a52d4f0473d374386b9b5b6a3936)
* Log if definitions are imported sequentially vs concurrently (Gerhard Lazu, 2020-03-12, 1 file, -2/+2)
|     Signed-off-by: Philip Kuryloski <pkuryloski@pivotal.io>
|     (cherry picked from commit 955ad8d90744f352a2608ab3e305f3419df0cf0f)
* Use default work pool for definition import (Gerhard Lazu, 2020-03-12, 1 file, -1/+0)
|     This unblocks node boot; otherwise the node gets stuck waiting for gatherer:out/1 to return.
|     I am not sure why we need another work pool for boot steps and don't just use the default work pool. After all, there is nothing else running during node boot except the boot steps, so using the default work pool should be sufficient.
|     If we do decide to create a new work pool, we will need to stop it after boot steps complete. Obviously, we will first need to fix this work pool, which currently doesn't seem to be created: we can see all processes for the default work pool, but we cannot find any process for this new IMPORT_WORK_POOL.
|     cc @michaelklishin
|     Signed-off-by: Philip Kuryloski <pkuryloski@pivotal.io>
|     (cherry picked from commit 0f86277b06b1a27ef468dcd02f1700cc99f04bf0)
* Merge pull request #2273 from luos/fix_crashed_queues_on_startup (Michael Klishin, 2020-03-12, 1 file, -3/+13)
|     After a network partition, some classic queues are crashed and do not have mirrors.
|     (cherry picked from commit 3c1b317b08870d3ca4c04c4a59b4067bf768707e)
* Update copyright (year 2020) (Jean-Sébastien Pédron, 2020-03-10, 252 files, -255/+255)
|
* Remove unused old `check_xref` and `quickcheck` scripts (Jean-Sébastien Pédron, 2020-03-10, 2 files, -332/+0)
|
* dynamic_ha_SUITE: Increase the chance of concurrent rebalances (Jean-Sébastien Pédron, 2020-03-09, 1 file, -3/+14)
|     ... in `rebalance_multiple_blocked`. The testcase wants two rebalances to happen at the same time: the first one may succeed (but this is out of scope for this testcase) and the second one must be aborted with `{error, rebalance_in_progress}`.
|     It looks like things are faster with Erlang 23, because the first rebalance is already finished when the second one starts, probably due to internode communication being slower than the rebalance (the two rebalances are triggered from the common_test node).
|     This time, the testcase starts a function on the remote node which spawns two functions in parallel to request a rebalance. Everything being local, we increase the chance of concurrent rebalances. We don't know the order of execution of the two functions, so we simply verify that one of them fails. This is still not 100% bulletproof, but should be ok.
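A sketch of the testcase's core trick, with an illustrative module name; the rebalance fun is passed in, and we only assert that one of the two parallel calls is rejected:

```erlang
-module(rebalance_race_sketch).
-export([concurrent_rebalance_check/1]).

%% Run two rebalances in parallel on the same node and verify that at
%% least one of them fails with {error, rebalance_in_progress}.
concurrent_rebalance_check(RebalanceFun) ->
    Parent = self(),
    [spawn(fun() -> Parent ! {rebalance_result, RebalanceFun()} end)
     || _ <- [1, 2]],
    Results = [receive {rebalance_result, R} -> R end || _ <- [1, 2]],
    true = lists:member({error, rebalance_in_progress}, Results).
```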
* Merge pull request #2268 from rabbitmq/rabbitmq-management-782 (Michael Klishin, 2020-03-07, 2 files, -6/+55)
|\
| |   Allow only one rebalance operation to happen at a time
| * More logging (Michael Klishin, 2020-03-07, 1 file, -0/+1)
| |
| * Add test that should fail (Luke Bakken, 2020-03-06, 2 files, -6/+54)
|/
|     Add code to block multiple queue rebalance operations; fix test.
|     Allow acquiring the rebalance lock prior to calling rabbit_amqqueue:rebalance.
|     Simplify queue rebalance code to always acquire the lock using the current process.
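One way to sketch such mutual exclusion in Erlang is a cluster-wide lock with zero retries; this illustrates the idea and is not necessarily the mechanism the PR uses:

```erlang
-module(rebalance_lock_sketch).
-export([with_rebalance_lock/1]).

%% Serialize rebalances: take a cluster-wide lock from the current
%% process before calling into the rebalance logic, and reject a
%% concurrent attempt instead of queueing it. Zero retries makes
%% global:set_lock/3 return false immediately if the lock is held.
with_rebalance_lock(Fun) ->
    LockId = {rebalance_queues, self()},
    case global:set_lock(LockId, [node() | nodes()], 0) of
        true  -> try Fun() after global:del_lock(LockId) end;
        false -> {error, rebalance_in_progress}
    end.
```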
* Update rabbitmq-components.mk (Gerhard Lazu, 2020-03-06, 1 file, -1/+1)
|
* Update erlang.mk (Gerhard Lazu, 2020-03-06, 1 file, -29/+142)
|
* Merge pull request #2267 from rabbitmq/rabbit-fifo-release-cursor-fix (D Corbacho, 2020-03-05, 3 files, -8/+48)
|\
| |   rabbit_fifo: change release cursor calculation
| * rabbit_fifo: change release cursor calculation [rabbit-fifo-release-cursor-fix] (kjnilsson, 2020-03-05, 3 files, -8/+48)
| |   Release cursors are taken less frequently the more messages there are on the queue. This changes how the interval is calculated: it now simply uses the message count, rather than some multiple of the currently captured release cursors. This is more consistent and doesn't depend on non-snapshottable state.
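A sketch of the idea, with an illustrative base interval and scaling: the cursor interval is derived from the message count alone, which is snapshottable state.

```erlang
-module(cursor_sketch).
-export([should_take_cursor/2]).

%% The snapshot interval grows with queue depth, so release cursors are
%% taken less often the more messages sit on the queue.
cursor_interval(MsgCount) ->
    max(4096, MsgCount).

%% Decide whether to emit a release cursor at this raft index.
should_take_cursor(RaftIndex, MsgCount) ->
    RaftIndex rem cursor_interval(MsgCount) =:= 0.
```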
* | Travis CI: Update config from rabbitmq-common (Jean-Sébastien Pédron, 2020-03-04, 1 file, -1/+10)
| |
* | Travis CI: Refresh config patch (Jean-Sébastien Pédron, 2020-03-04, 1 file, -6/+6)
| |
* | Travis CI: Update config from rabbitmq-common (Jean-Sébastien Pédron, 2020-03-04, 1 file, -17/+13)
|/