author | Zuul <zuul@review.opendev.org> | 2022-09-21 16:38:35 +0000 |
---|---|---|
committer | Gerrit Code Review <review@openstack.org> | 2022-09-21 16:38:35 +0000 |
commit | eeeaa274cfc7ebee52beaed97571e2f87127f2dd (patch) | |
tree | 527206a62db1c20f93bde1d27ba634c101020452 /doc | |
parent | b767a92dd8212e254980d04b89b19ed01950f01d (diff) | |
parent | 9a8b1d149c738b7d06a93c39a29f43afeed7cc8a (diff) | |
download | ironic-eeeaa274cfc7ebee52beaed97571e2f87127f2dd.tar.gz |
Merge "Concurrent Distructive/Intensive ops limits"
Diffstat (limited to 'doc')
-rw-r--r-- | doc/source/admin/troubleshooting.rst | 37 |
1 files changed, 37 insertions, 0 deletions
diff --git a/doc/source/admin/troubleshooting.rst b/doc/source/admin/troubleshooting.rst
index fa04d3006..2791430fd 100644
--- a/doc/source/admin/troubleshooting.rst
+++ b/doc/source/admin/troubleshooting.rst
@@ -973,3 +973,40 @@ Unfortunately, due to the way the conductor is designed, it is not possible to
 gracefully break a stuck lock held in ``*-ing`` states. As the last resort, you
 may need to restart the affected conductor. See `Why are my nodes stuck in a
 "-ing" state?`_.
+
+What is ConcurrentActionLimit?
+==============================
+
+ConcurrentActionLimit is an exception which is raised to clients when an
+operation is requested, but cannot be serviced at that moment because the
+overall threshold of nodes in concurrent "Deployment" or "Cleaning"
+operations has been reached.
+
+These limits exist for two distinct reasons.
+
+The first is they allow an operator to tune a deployment such that too many
+concurrent deployments cannot be triggered at any given time, as a single
+conductor has an internal limit to the number of overall concurrent tasks,
+this restricts only the number of running concurrent actions. As such, this
+accounts for the number of nodes in ``deploy`` and ``deploy wait`` states.
+In the case of deployments, the default value is relatively high and should
+be suitable for *most* larger operators.
+
+The second is to help slow down the ability in which an entire population of
+baremetal nodes can be moved into and through cleaning, in order to help
+guard against authenticated malicious users, or accidental script driven
+operations. In this case, the total number of nodes in ``deleting``,
+``cleaning``, and ``clean wait`` are evaluated. The default maximum limit
+for cleaning operations is *50* and should be suitable for the majority of
+baremetal operators.
+
+These settings can be modified by using the
+``[conductor]max_concurrent_deploy`` and ``[conductor]max_concurrent_clean``
+settings from the ironic.conf file supporting the ``ironic-conductor``
+service. Neither setting can be explicity disabled, however there is also no
+upper limit to the setting.
+
+.. note::
+   This was an infrastructure operator requested feature from actual lessons
+   learned in the operation of Ironic in large scale production. The defaults
+   may not be suitable for the largest scale operators.
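
The check described in the added documentation can be illustrated with a minimal Python sketch. This is not Ironic's implementation: the `check_concurrency` function, its `action` labels, and the local `ConcurrentActionLimit` class are hypothetical stand-ins, and the state groupings are copied from the text of the patch. The cleaning default of 50 comes from the patch. The deployment limit is left as a required argument because the patch only calls its default "relatively high".

```python
from collections import Counter

# State groupings exactly as named in the documentation added by this patch.
DEPLOY_STATES = frozenset({"deploy", "deploy wait"})
CLEAN_STATES = frozenset({"deleting", "cleaning", "clean wait"})


class ConcurrentActionLimit(Exception):
    """Illustrative stand-in for the exception the patch documents."""


def check_concurrency(provision_states, action, max_deploy, max_clean=50):
    """Return the in-flight node count, or raise if the limit is reached.

    provision_states: iterable of state strings, one per node.
    action: "deploy" or "clean" (hypothetical labels for this sketch).
    max_deploy / max_clean: stand-ins for [conductor]max_concurrent_deploy
    and [conductor]max_concurrent_clean.
    """
    counts = Counter(provision_states)
    if action == "deploy":
        states, limit = DEPLOY_STATES, max_deploy
    elif action == "clean":
        states, limit = CLEAN_STATES, max_clean
    else:
        raise ValueError(f"unknown action: {action!r}")

    # Only nodes in the relevant states count toward the threshold.
    in_flight = sum(counts[state] for state in states)
    if in_flight >= limit:
        raise ConcurrentActionLimit(
            f"{in_flight} nodes already in {action} states (limit {limit})"
        )
    return in_flight
```

Under these assumptions, a request to clean one more node is rejected once 50 nodes are already in ``deleting``, ``cleaning``, or ``clean wait``. Deployment requests are counted separately against their own limit.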