Commit message (Author, Age, Files, Lines)
* Disable greenthreads for RabbitDriver "listen" connections [HEAD, 14.3.0, master] (Arnaud Morin, 2023-03-03, 2 files, -0/+5)
      When enabling heartbeat_in_pthread, we were restoring the "threading"
      python library from eventlet to the original one in RabbitDriver, but
      we forgot to do the same in AMQPDriverBase (RabbitDriver is a subclass
      of AMQPDriverBase).

      We also need to use the original "queue" so that queues are not going
      to use greenthreads as well.

      Related-bug: #1961402
      Related-bug: #1934937
      Closes-bug: #2009138
      Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
      Change-Id: I34ea0d1381e934297df2f793e0d2594ef8254f00
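The idea of "restoring the original library" can be sketched as below. This is only an illustration, not the driver's code: the helper name `original_module` is hypothetical, and it assumes eventlet's `patcher.original()` is the mechanism used to obtain unpatched modules. Without eventlet installed it simply falls back to a normal import.

```python
import importlib


def original_module(name):
    """Return the unpatched stdlib module, even under eventlet monkey patching."""
    try:
        from eventlet import patcher
    except ImportError:
        # No eventlet installed, so nothing could have been patched.
        return importlib.import_module(name)
    return patcher.original(name)


# Both modules must come from the same (unpatched) world, which is the
# point of the commit: a native thread paired with a green queue would mix
# the two.
stdlib_threading = original_module("threading")
stdlib_queue = original_module("queue")

# A real OS thread feeding a real (non-green) queue.
results = stdlib_queue.Queue()
worker = stdlib_threading.Thread(target=results.put, args=("listening",))
worker.start()
worker.join()
```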
* Update master for stable/2023.1 (OpenStack Release Bot, 2023-02-24, 2 files, -0/+7)
      Add file to the reno documentation build to show release notes for
      stable/2023.1.

      Use pbr instruction to increment the minor version number
      automatically so that master versions are higher than the versions on
      stable/2023.1.

      Sem-Ver: feature
      Change-Id: I80f227a59c36693c83bb94890536745610ba2393
* Fix typo in quorum-related variables for RabbitMQ (Dmitriy Rabotyagov, 2023-02-14, 3 files, -6/+14)
      In [1] there was a typo made in variable names. To prevent even
      further awkwardness regarding variable naming, we fix the typo and
      publish a release note for those who are already using the variables
      in their deployments.

      [1] https://review.opendev.org/c/openstack/oslo.messaging/+/831058

      Change-Id: Icc438397c11521f3e5e9721f85aba9095e0831c2
* Support overriding class for get_rpc_* helper functions [14.2.0] (Tobias Urdin, 2023-01-23, 4 files, -7/+20)
      We currently do not support overriding the class being instantiated
      in the RPC helper functions. This adds that support so that projects
      that define their own classes that inherit from oslo.messaging can
      use the helpers.

      For example, neutron utilizes code from neutron-lib that has its own
      RPCClient implementation that inherits from oslo.messaging. In order
      for them to use, for example, the get_rpc_client helper, they need
      support to override the class being returned. The alternative would
      be to modify the internal _manual_load variable, which seems
      counter-productive to extending the API provided to consumers.

      Change-Id: Ie22f2ee47a4ca3f28a71272ee1ffdb88aaeb7758
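The override described above can be pictured with a small, self-contained sketch. The classes here are stand-ins, not the oslo.messaging ones, and the keyword name `client_cls` is an assumption made for illustration; check the library's documentation for the real signature.

```python
class RPCClient:
    """Stand-in for the library's client class (not the real one)."""

    def __init__(self, transport, target, **kwargs):
        self.transport = transport
        self.target = target
        self.kwargs = kwargs


def get_rpc_client(transport, target, client_cls=RPCClient, **kwargs):
    """Helper that builds a client, letting callers swap in a subclass."""
    return client_cls(transport, target, **kwargs)


class BackingOffClient(RPCClient):
    """Hypothetical project-specific subclass, like the neutron-lib case."""
```

A consumer would then call `get_rpc_client(transport, target, client_cls=BackingOffClient)` and still go through the helper instead of instantiating the class directly.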
* tox cleanups (Stephen Finucane, 2023-01-18, 2 files, -43/+53)
      'skip_basepython_conflicts' has been the cause of a couple of bugs in
      tox 4 and there is talk of it going away. Remove it and fix up a few
      other issues in the tox.ini file.

      Change-Id: Ic19c896af2ab0cf3570c43e8ceb8cba64fb45cdd
      Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
* Merge "Remove logging from ProducerConnection._produce_message" [14.1.0] (Zuul, 2022-12-21, 3 files, -15/+67)
|\
| * Remove logging from ProducerConnection._produce_message (Guillaume Espanel, 2022-08-03, 3 files, -15/+67)
      In impl_kafka, _produce_message is run in a tpool.execute context but
      it was also calling logging functions. This could cause subsequent
      calls to logging functions to deadlock.

      This patch moves the logging calls out of the tpool.execute scope.

      Change-Id: I81167eea0a6b1a43a88baa3bc383af684f4b1345
      Closes-bug: #1981093
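The pattern of keeping logging out of the worker thread can be sketched as follows. This is not the impl_kafka code: a `ThreadPoolExecutor` stands in for eventlet's `tpool.execute`, and the function names and the `send` callable are hypothetical.

```python
import logging
from concurrent.futures import ThreadPoolExecutor

LOG = logging.getLogger(__name__)


def _produce_message(send, topic, payload):
    """Runs in the worker thread: no logging here, only report the outcome."""
    try:
        send(topic, payload)
    except Exception as exc:  # hand the error back instead of logging it
        return exc
    return None


def notify(executor, send, topic, payload):
    """Runs in the caller's context, where logging is safe."""
    error = executor.submit(_produce_message, send, topic, payload).result()
    if error is not None:
        LOG.warning("Produce to %s failed: %s", topic, error)
        return False
    LOG.debug("Message produced to %s", topic)
    return True
```

The worker returns the exception as a value, so every logging call happens after control is back in the calling thread.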
* | Merge "Warn when we force creating a non durable exchange" (Zuul, 2022-12-20, 1 file, -0/+1)
|\ \
| * | Warn when we force creating a non durable exchange (Hervé Beraud, 2022-10-18, 1 file, -0/+1)
      Adding warning logs so that users can detect the fallback with
      durable exchanges.

      Change-Id: Iabce0986fae6ed8838f1f94496b5994fc19cc5ef
* | | Merge "Implement get_rpc_client function" (Zuul, 2022-12-01, 7 files, -29/+74)
|\ \ \
| * | | Implement get_rpc_client function (Tobias Urdin, 2022-10-25, 7 files, -29/+74)
      We already expose functions to handle the instantiation of classes
      such as RPCServer and RPCTransport, but the same was never done for
      RPCClient, so the API is inconsistent in its enforcement.

      This adds a get_rpc_client function that should be used instead of
      instantiating the RPCClient class directly, to be more consistent.

      This also allows handling more logic inside the function in the
      future, such as if an async client is implemented, as the
      investigation in [1] has shown.

      [1] https://review.opendev.org/c/openstack/oslo.messaging/+/858936

      Change-Id: Ia4d1f0497b9e2728bde02f4ff05fdc175ddffe66
* | | | Merge "Force creating non durable control exchange when a precondition failed" (Zuul, 2022-11-16, 2 files, -3/+64)
|\ \ \ \
| |/ / /
|/| / /
| |/ /
| * | Force creating non durable control exchange when a precondition failed (Hervé Beraud, 2021-12-15, 2 files, -3/+64)
      A precondition failed exception related to the durable exchange
      config may be triggered when a control exchange is shared between
      services and when services try to create it with configs that differ
      from each other. RabbitMQ will reject the services that try to create
      it with a configuration that differs from the one used first.

      This kind of exception is not managed for now and services can fail
      without handling this kind of issue.

      These changes catch this kind of exception to determine whether it is
      related to the durable config. In this case we try to re-declare the
      failing exchange/queue as non durable.

      This problem can be easily reproduced by running a local RabbitMQ
      server.

      By setting the config below (sample.conf):

      ```
      [DEFAULT]
      transport_url = rabbit://localhost/
      [OSLO_MESSAGING_RABBIT]
      amqp_durable_queues = true
      ```

      And by running our simulator twice:

      ```
      $ tox -e venv -- python tools/simulator.py -d rpc-server -w 40
      $ tox -e venv -- python tools/simulator.py --config-file ./sample.conf -d rpc-server -w 40
      ```

      The first one will create a default non durable control exchange.
      The second one will create the same default control exchange but as
      durable.

      Closes-Bug: #1953351
      Change-Id: I27625b468c428cde6609730c8ab429c2c112d010
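The fallback described in this commit can be sketched with a toy broker. Everything here is hypothetical (`FakeBroker`, `PreconditionFailed`, `declare_with_fallback`); it only models the behaviour "reject a mismatching durable flag, then re-declare as non durable with a warning", not the driver's actual code.

```python
import logging

LOG = logging.getLogger(__name__)


class PreconditionFailed(Exception):
    """Stand-in for the broker's 'precondition failed' channel error."""


class FakeBroker:
    """Remembers the durable flag first used for each exchange."""

    def __init__(self):
        self.exchanges = {}

    def declare_exchange(self, name, durable):
        existing = self.exchanges.setdefault(name, durable)
        if existing != durable:
            raise PreconditionFailed(
                "inequivalent arg 'durable' for exchange '%s'" % name)


def declare_with_fallback(broker, name, durable):
    """Declare an exchange; fall back to non durable on a durable mismatch."""
    try:
        broker.declare_exchange(name, durable)
        return durable
    except PreconditionFailed as exc:
        if not (durable and "durable" in str(exc)):
            raise  # unrelated precondition failure, or nothing to fall back to
        LOG.warning("Forcing non durable exchange %s: %s", name, exc)
        broker.declare_exchange(name, durable=False)
        return False
```

This mirrors the two simulator runs: the first service creates the exchange as non durable, the second asks for durable and ends up sharing the non durable one.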
* | | Update master for stable/zed (OpenStack Release Bot, 2022-09-09, 2 files, -0/+7)
      Add file to the reno documentation build to show release notes for
      stable/zed.

      Use pbr instruction to increment the minor version number
      automatically so that master versions are higher than the versions on
      stable/zed.

      Sem-Ver: feature
      Change-Id: Ic1020b39172981abcc9fc3d66fc6ec58f440a456
* | | Merge "update hacking pin to support flake8 3.8.3" (Zuul, 2022-08-30, 3 files, -3/+3)
|\ \ \
| * | | update hacking pin to support flake8 3.8.3 (Sean Mooney, 2022-05-23, 3 files, -3/+3)
      This change updates the max version of hacking to 4.1.0 to allow
      pre-commit to work with the flake8 3.8.3 release and corrects one new
      error that was raised as a result.

      Change-Id: I3a0242208f411b430db0e7429e2c773f45b3d301
* | | | Change default value of "heartbeat_in_pthread" to False [14.0.0] (Slawek Kaplonski, 2022-08-16, 2 files, -2/+13)
| |_|/
|/| |
      As was reported in the related bug some time ago, setting that option
      to True for nova-compute can break it as it's a non-wsgi service.
      We also noticed the same problems with randomly stuck non-wsgi
      services, e.g. neutron agents, and probably the same issue can happen
      with any other non-wsgi service.

      To avoid that, this patch changes the default value of that config
      option to False.
      Together with [1] it effectively reverts the change done in [2] some
      time ago.

      [1] https://review.opendev.org/c/openstack/oslo.messaging/+/800621
      [2] https://review.opendev.org/c/openstack/oslo.messaging/+/747395

      Related-Bug: #1934937
      Closes-Bug: #1961402
      Change-Id: I85f5b9d1b5d15ad61a9fcd6e25925b7eeb8bf6e7
* | | Merge "Add quorum queue control configurations" [13.0.0] (Zuul, 2022-06-13, 3 files, -7/+80)
|\ \ \
| |/ /
|/| |
| * | Add quorum queue control configurations (hamza alqtaishat, 2022-04-06, 3 files, -7/+80)
      The quorum queue type adds features that did not exist before or were
      not handled in RabbitMQ. The following link shows some of them:
      https://blog.rabbitmq.com/posts/2020/04/rabbitmq-gets-an-ha-upgrade/

      The options below control the quorum queue and ensure the stability
      of the quorum system:

      x-max-in-memory-length
      x-max-in-memory-bytes
      x-delivery-limit

      They control memory usage and handle message poisoning.

      Closes-Bug: #1962348
      Change-Id: I570227d6102681f4f9d8813ed0d7693a1160c21d
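As a rough sketch of how the three arguments named above could be assembled at queue declaration time: the function name and its parameters are hypothetical, and treating 0 as "leave the limit unset" is an assumption made for the example, not something the commit states.

```python
def quorum_limits(delivery_limit=0, max_in_memory_length=0,
                  max_in_memory_bytes=0):
    """Build the optional quorum queue arguments, skipping unset limits."""
    candidates = {
        # How many redeliveries before a poison message is dropped.
        "x-delivery-limit": delivery_limit,
        # Caps on how much of the queue is kept in memory.
        "x-max-in-memory-length": max_in_memory_length,
        "x-max-in-memory-bytes": max_in_memory_bytes,
    }
    return {key: value for key, value in candidates.items() if value > 0}
```

The resulting dictionary would be merged into the queue arguments passed when the queue is declared.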
* | | Drop python3.6/3.7 support in testing runtime (Hervé Beraud, 2022-05-05, 1 file, -3/+1)
      In the Zed cycle testing runtime, we are targeting to drop python
      3.6/3.7 support. Projects have started adding python 3.8 as the
      minimum, for example nova:
      - https://github.com/openstack/nova/blob/56b5aed08c6a3ed81b78dc216f0165ebfe3c3350/setup.cfg#L13

      Change-Id: Id23d3845db716d26175d71280dbedf93736d19de
* | | Merge "Add EXTERNAL as rabbit login method" [12.14.0] (Zuul, 2022-04-27, 1 file, -1/+1)
|\ \ \
| * | | Add EXTERNAL as rabbit login method (hamza alqtaishat, 2022-04-25, 1 file, -1/+1)
      As explained in the link below, kombu has a login method called
      external:
      https://docs.celeryq.dev/projects/kombu/en/latest/_modules/kombu/connection.html

      The login method external is not listed as a choice in the Rabbit
      driver.

      As explained in the RabbitMQ documentation
      https://www.rabbitmq.com/access-control.html
      for authentication using client TLS (x.509) certificate data, clients
      must be configured to use the EXTERNAL mechanism.

      Closes-Bug: #1970276
      Change-Id: I5c38d3a3cafd49f8abc031e36bc595f32a8631d2
* | | | Merge "Add a new option to enforce the OpenSSL FIPS mode" (Zuul, 2022-04-26, 5 files, -0/+98)
|\ \ \ \
| * | | | Add a new option to enforce the OpenSSL FIPS mode (Hervé Beraud, 2021-11-08, 5 files, -0/+98)
      The ``ssl_enforce_fips_mode`` option allows us to enforce the FIPS
      mode if supported by the version of python in use.

      https://en.wikipedia.org/wiki/Federal_Information_Processing_Standards

      Change-Id: I50c7de71bfd38137eb83d23e910298946507ce9f
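The "if supported by the version of python in use" check can be sketched like this. It is an assumption that support is detected through `FIPS_mode` and `FIPS_mode_set` attributes on the `ssl` module, which only some patched Python builds provide; the function name and the `RuntimeError` are hypothetical choices for the example.

```python
import ssl


def enforce_fips_mode(ssl_module=ssl):
    """Turn on OpenSSL FIPS mode, or fail loudly when it is unavailable.

    Assumption: FIPS-capable builds expose FIPS_mode()/FIPS_mode_set();
    stock CPython builds usually do not.
    """
    if not hasattr(ssl_module, "FIPS_mode"):
        raise RuntimeError(
            "FIPS mode was requested but this Python build does not "
            "support it")
    ssl_module.FIPS_mode_set(1)
```

Passing the module in as a parameter keeps the sketch testable on builds without FIPS support.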
* | | | | Merge "Add Python3 zed unit tests" (Zuul, 2022-04-26, 1 file, -1/+1)
|\ \ \ \ \
| |_|/ / /
|/| | | |
| * | | | Add Python3 zed unit tests (OpenStack Release Bot, 2022-03-04, 1 file, -1/+1)
      This is an automatically generated patch to ensure unit testing is in
      place for all of the tested runtimes for zed.

      See also the PTI in governance [1].

      [1]: https://governance.openstack.org/tc/reference/project-testing-interface.html

      Change-Id: I73a0700baa1c9edfb7a4b82be94df8bacff3c226
* | | | | tests: Fix test failures with kombu >= 5.2.4 (Stephen Finucane, 2022-04-05, 1 file, -6/+21)
|/ / / /
      kombu 5.2.4 fixed an off-by-one issue that meant we were attempting
      retries more than once [1]. We need to handle this to unblock the
      gate.

      This was discovered by examining the call stack and comparing this
      with recent changes in openstack/requirements.

      [1] https://github.com/celery/kombu/commit/5bed2a8f983a3bf61c12443e7704ffd89991ef9a

      Change-Id: I476e3c573523d5991c56b31ad4df1172196aa7f1
      Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
* | | | Update master for stable/yoga (OpenStack Release Bot, 2022-03-04, 2 files, -0/+7)
| |/ /
|/| |
      Add file to the reno documentation build to show release notes for
      stable/yoga.

      Use pbr instruction to increment the minor version number
      automatically so that master versions are higher than the versions on
      stable/yoga.

      Sem-Ver: feature
      Change-Id: I3d2b041769c5c14a7d391c223dc499218a937e76
* | | Merge "Adding support for rabbitmq quorum queues" [12.13.0] (Zuul, 2022-02-08, 4 files, -9/+66)
|\ \ \
| * | | Adding support for rabbitmq quorum queues (Hervé Beraud, 2022-02-05, 4 files, -9/+66)
      https://www.rabbitmq.com/quorum-queues.html

      The quorum queue is a modern queue type for RabbitMQ implementing a
      durable, replicated FIFO queue based on the Raft consensus algorithm.
      It is available as of RabbitMQ 3.8.0.

      Quorum queues cannot be set by policy, so this should be done when
      declaring the queue.

      To declare a quorum queue, set the x-queue-type queue argument to
      quorum (the default is classic). This argument must be provided by a
      client at queue declaration time; it cannot be set or changed using a
      policy. This is because policy definition or applicable policy can be
      changed dynamically but queue type cannot. It must be specified at
      the time of declaration.

      It's good for oslo.messaging to add support for that type of queue,
      which has multiple advantages over mirroring.

      If quorum queues are set, mirrored queues will be ignored.

      Closes-Bug: #1942933
      Change-Id: Id573e04c287e034e50626daf6e18a34735d45251
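The declaration-time choice described above can be sketched as a small helper. The function and its parameters are hypothetical; `x-queue-type` comes from the commit message, while using `x-ha-policy` for mirrored queues is an assumption made for the example.

```python
def queue_arguments(quorum=False, mirrored=False):
    """Pick the arguments sent with queue.declare for a given queue flavour."""
    if quorum:
        # The type must be given at declaration time; a policy cannot set
        # it. Quorum wins: the mirroring request is ignored.
        return {"x-queue-type": "quorum"}
    if mirrored:
        # Assumed argument name for classic mirrored queues.
        return {"x-ha-policy": "all"}
    # No argument means the broker default, a classic queue.
    return {}
```

Because the queue type is fixed at declaration, switching an existing queue between flavours means deleting and re-declaring it rather than changing a policy.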
* | | | Merge "[rabbit] use retry parameters during notification sending" [12.12.0] (Zuul, 2022-01-12, 8 files, -26/+60)
|\ \ \ \
| * | | | [rabbit] use retry parameters during notification sending (Balazs Gibizer, 2022-01-12, 8 files, -26/+60)
      The rabbit backend now applies the
      [oslo_messaging_notifications]retry,
      [oslo_messaging_rabbit]rabbit_retry_interval, rabbit_retry_backoff
      and rabbit_interval_max configuration parameters when it tries to
      establish the connection to the message bus during notification
      sending.

      This patch also clarifies the differences between the behavior of the
      kafka and the rabbit drivers in this regard.

      Closes-Bug: #1917645
      Change-Id: Id4ccafc95314c86ae918336e42cca64a6acd4d94
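How the four parameters might interact can be illustrated with a delay schedule. This is a sketch under assumptions, not the driver's logic: it assumes a linear backoff (interval plus a fixed step per attempt, capped at the maximum) and that a negative retry count means "retry forever"; the function name is hypothetical.

```python
from itertools import count


def retry_delays(retry, interval, backoff, interval_max):
    """Yield the sleep before each reconnection attempt.

    retry < 0 is treated as "retry forever", 0 as "do not retry".
    """
    attempts = count() if retry < 0 else range(retry)
    for attempt in attempts:
        # Grow the delay by `backoff` each time, never beyond the cap.
        yield min(interval + attempt * backoff, interval_max)
```

With `retry=4`, `interval=1`, `backoff=2` and `interval_max=5` the schedule is 1, 3, 5, 5 seconds, after which the caller gives up instead of blocking forever.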
* | | | | Merge "Reproduce bug 1917645" (Zuul, 2021-12-21, 1 file, -0/+68)
|\ \ \ \ \
| |/ / / /
| * | | | Reproduce bug 1917645 (Balazs Gibizer, 2021-11-24, 1 file, -0/+68)
| | |_|/
| |/| |
      The [oslo_messaging_notifications]retry parameter is not applied
      during connecting to the message bus. But the documentation implies
      it should [1][2]. The two possible drivers, rabbit and kafka, behave
      differently.

      1) The rabbit driver will retry the connection forever, blocking the
         caller process.

      2) The kafka driver also ignores the retry configuration but the
         notifier call returns immediately even if the notification is not
         (or cannot be) delivered.

      This patch adds test cases to show the wrong behavior.

      [1] https://docs.openstack.org/oslo.messaging/latest/configuration/opts.html#oslo_messaging_notifications.retry
      [2] https://github.com/openstack/oslo.messaging/blob/feb72de7b81e3919dedc697f9fb5484a92f85ad8/oslo_messaging/notify/messaging.py#L31-L36

      Related-Bug: #1917645
      Change-Id: Id8557050157aecd3abd75c9114d3fcaecdfc5dc9
* | | | Update python testing classifier (dengzhaosen, 2021-12-21, 1 file, -0/+1)
      The Yoga testing runtime [1] has been updated to add py39 testing as
      voting.

      Unit test updates are handled by the job template change in
      openstack-zuul-jobs:
      - https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/820286

      This commit updates the classifier in the setup.cfg file.

      [1] https://governance.openstack.org/tc/reference/runtimes/yoga.html

      Change-Id: I26743858a0ca7a6e46bda821c8f29b6dff34ea15
* | | | amqp1: fix race when reconnecting [12.11.1] (John Eckersberg, 2021-11-09, 1 file, -1/+2)
|/ / /
      Currently this is how reconnect works:

      - pyngus detects failure and invokes callback
        Controller.connection_failed() which in turn calls
        Controller._handle_connection_loss()

      - The first thing that _handle_connection_loss does is to set
        self.addresser to None (important later)

      - Then it defers _do_reconnect after a delay (normally 1 second)

      - (1 second passes)

      - _do_reconnect calls _hard_reset which resets the controller state

      However, there is a race here. This can happen:

      - The above, up until it defers and waits for 1 second

      - Controller.send() is invoked on a task

      - A new Sender is created, and critically because self.reply_link
        still exists and is active, we call sender.attach and pass in
        self.addresser. Remember _handle_connection_loss sets
        self.addresser to None.

      - Eventually Sender.attach throws an AttributeError because it
        attempts to call addresser.resolve() but addresser is None

      The reason this happens is because although the connection is dead,
      the controller state is still half-alive because _hard_reset hasn't
      been called yet since it's deferred one second in _do_reconnect.

      The fix here is to move _hard_reset out of _do_reconnect and directly
      into _handle_connection_loss. The eventloop is woken up immediately
      to process _hard_reset but _do_reconnect is still deferred as before
      so as to retain the desired reconnect backoff behavior.

      Closes-Bug: #1941652
      Change-Id: Ife62a7d76022908f0dc6a77f1ad607cb2fbd3e8f
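The ordering fix can be modelled with a toy controller. This is a deliberately simplified stand-in, not the amqp1 driver: strings replace the real addresser and link objects, `deferred` replaces the event loop's timer, and queuing a message while no link exists is an assumption about what "reset state" implies for `send()`.

```python
class Controller:
    """Toy model of the reconnect ordering; not the real driver class."""

    def __init__(self):
        self.addresser = "addresser"
        self.reply_link = "reply-link"
        self.pending = []    # messages waiting for a usable link
        self.deferred = []   # callbacks the event loop would run later

    def _hard_reset(self):
        # Drop link state so nothing looks half-alive.
        self.reply_link = None

    def _do_reconnect(self):
        self.addresser = "addresser"
        self.reply_link = "reply-link"

    def _handle_connection_loss(self):
        self.addresser = None
        self._hard_reset()                        # the fix: reset right away
        self.deferred.append(self._do_reconnect)  # still delayed for backoff

    def send(self, message):
        if self.reply_link is None:
            self.pending.append(message)
            return "queued"
        # Mirrors addresser.resolve(): fails with AttributeError on None.
        return self.addresser.upper() + ":" + message
```

If `_hard_reset()` were only called from `_do_reconnect()`, a `send()` arriving during the delay would still see a live `reply_link` with `addresser` set to `None` and hit the `AttributeError` described in the commit message.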
* | | Merge "Remove deprecation of heartbeat_in_pthread" [12.11.0] (Zuul, 2021-10-21, 2 files, -1/+7)
|\ \ \
| * | | Remove deprecation of heartbeat_in_pthread (Hervé Beraud, 2021-10-14, 2 files, -1/+7)
      In some circumstances services can be executed outside of mod_wsgi
      and in a monkey patched environment. In this context we need to leave
      users the possibility of executing the heartbeat in a green thread.

      The heartbeat_in_pthread option was tagged as deprecated a few months
      ago and planned for future removal. These changes drop this
      deprecation so that green threads can still be enabled if needed.

      Closes-Bug: #1934937
      Change-Id: Iee2e5a6f7d71acba70bbc857f0bd7d83e32a7b8c
* | | | rabbit: move stdlib_threading bits into _utils [12.10.0] (John Eckersberg, 2021-09-22, 2 files, -14/+15)
      The amqp1 driver also needs this same logic, so split it out and
      share it.

      Change-Id: I2e9dbfa27887e26807f199c9d359bacd7c15c67a
* | | | Merge "use message id cache for RPC listener" (Zuul, 2021-09-13, 2 files, -1/+61)
|\ \ \ \
| * | | | use message id cache for RPC listener (Nikita Kalyanov, 2021-09-10, 2 files, -1/+61)
      Bring back the message id cache feature for the RPC listener; it was
      removed while refactoring in
      I708c3d6676b974d8daac6817c15f596cdf35817b. See attached bug for more
      info.

      We should not raise DuplicateMessageError, to avoid rejecting the
      previously ACK'ed message.

      Closes-Bug: #1935883
      Change-Id: Ie237e9e3fdc3fc27b3deb18b94751cdc3afd190e
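A message id cache of this kind can be sketched as a bounded set of recently seen ids. The class name, the default size and the boolean-returning method are all hypothetical; the only point taken from the commit is that a duplicate is reported to the caller rather than raised as an error.

```python
from collections import deque


class MessageIdCache:
    """Remember the most recent message ids to spot redeliveries."""

    def __init__(self, size=16):
        # A bounded deque silently drops the oldest id when full.
        self._ids = deque(maxlen=size)

    def seen_before(self, message_id):
        """Return True for a duplicate; otherwise record the id."""
        if message_id in self._ids:
            return True
        self._ids.append(message_id)
        return False
```

A listener would skip dispatching when `seen_before()` returns True, so a message that was already acknowledged is dropped quietly instead of being rejected.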
* | | | | Merge "limit maximum timeout in the poll loop" (Zuul, 2021-09-13, 2 files, -2/+29)
|\ \ \ \ \
| |/ / / /
| * | | | limit maximum timeout in the poll loop (Nikita Kalyanov, 2021-07-13, 2 files, -2/+29)
| |/ / /
      We should properly limit the maximum timeout with a 'min' to avoid
      long delays before message processing. Such delays may happen if the
      connection to a RabbitMQ server is re-established at the same time
      when the message arrives (see attached bug for more info).

      Moreover, this change is in line with the original intent to actually
      have an upper limit on maximum possible timeout (see comments in code
      and in the original review).

      Closes-Bug: #1935864
      Change-Id: Iebc8a96e868d938a5d250bf9d66d20746c63d3d5
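Capping a poll timeout with `min` can be sketched in a few lines. The function name and the one-second cap are assumptions for illustration; the actual limit used by the driver is not stated in the commit message.

```python
def next_poll_timeout(remaining, cap=1.0):
    """Return how long a single poll iteration may block.

    `remaining` is the time left for the whole wait (None = unbounded);
    the result never exceeds `cap`, so the loop wakes up regularly.
    """
    if remaining is None:
        return cap
    # Clamp negative leftovers to zero, then apply the upper bound.
    return min(max(remaining, 0.0), cap)
```

With the cap in place, a message that arrives just as the connection is re-established waits at most one iteration before the loop gets a chance to process it.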
* | | | Add Python3 yoga unit tests (OpenStack Release Bot, 2021-09-10, 1 file, -1/+1)
      This is an automatically generated patch to ensure unit testing is in
      place for all of the tested runtimes for yoga.

      See also the PTI in governance [1].

      [1]: https://governance.openstack.org/tc/reference/project-testing-interface.html

      Change-Id: I0088ff54c4807f240a1db2457aeefcdf8b91375d
* | | | Update master for stable/xena (OpenStack Release Bot, 2021-09-10, 2 files, -0/+7)
| |/ /
|/| |
      Add file to the reno documentation build to show release notes for
      stable/xena.

      Use pbr instruction to increment the minor version number
      automatically so that master versions are higher than the versions on
      stable/xena.

      Sem-Ver: feature
      Change-Id: Ia40ac2ccee4fe230605f3183b0b432b0e31bff04
* | | amqp1: Do not reuse _socket_connection on reconnect [12.9.1] (John Eckersberg, 2021-08-10, 2 files, -8/+37)
      Each _SocketConnection object is unique per-peer. For example, the
      properties attribute may contain keys such as 'x-ssl-peer-name'.
      Reusing the existing _socket_connection during failover will cause
      the TLS handshake to fail since the peer name will not match. There
      is potential for other similar-yet-unexplored bad things to happen as
      well.

      Instead, reconnect by waking up the eventloop via the _do_reconnect
      method, which reconstructs the connection properties to reflect the
      new (failed-over-to) host and ultimately creates a new
      _SocketConnection (or re-uses a *valid* old one) in
      eventloop.Thread.connect().

      Closes-Bug: #1938945
      Change-Id: I0c8dc447f4dc8d0d08c312a1f3e6fa1745fb69fd
* | | amqp1: re-organize TestFailover to be reused by TestSSL (John Eckersberg, 2021-08-10, 1 file, -7/+13)
      This breaks out the generation of brokers and transport_url into
      separate methods. These methods are used in the next patch in this
      series, where TestSSL is updated to inherit from TestFailover, and
      TestSSL overrides the _gen_brokers and _gen_transport_url methods to
      supply the necessary SSL-aware options.

      Change-Id: Ia2f977795abc2e81a996e299867e05d41057f33f
* | | Revert "Disable AMQP 1.0 SSL unit tests" (John Eckersberg, 2021-08-10, 1 file, -2/+1)
      This reverts commit 8f5cfda6642ea7f75206d3183c2507e2e83c5693.

      Reason for revert: This was supposed to be temporary to unblock the
      gate. Whatever broke SSL cert generation in the first place appears
      to be fixed because I can run SSL tests now.

      Change-Id: I4f286cf3af0d578f472b84fe355c812910c7a121
* | | Merge "Changed minversion in tox to 3.18.0" [12.9.0] (Zuul, 2021-08-10, 1 file, -3/+3)
|\ \ \
| |/ /
|/| |
| * | Changed minversion in tox to 3.18.0 (yangyawei, 2021-06-07, 1 file, -3/+3)
      The patch bumps min version of tox to 3.18.0 in order to replace
      tox's whitelist_externals by allowlist_externals option:
      https://github.com/tox-dev/tox/blob/master/docs/changelog.rst#v3180-2020-07-23

      Change-Id: If129b56be47952d018b9f6024d2a192950c1a974