summaryrefslogtreecommitdiff
path: root/doc
diff options
context:
space:
mode:
authorZuul <zuul@review.opendev.org>2022-09-21 16:38:35 +0000
committerGerrit Code Review <review@openstack.org>2022-09-21 16:38:35 +0000
commiteeeaa274cfc7ebee52beaed97571e2f87127f2dd (patch)
tree527206a62db1c20f93bde1d27ba634c101020452 /doc
parentb767a92dd8212e254980d04b89b19ed01950f01d (diff)
parent9a8b1d149c738b7d06a93c39a29f43afeed7cc8a (diff)
downloadironic-eeeaa274cfc7ebee52beaed97571e2f87127f2dd.tar.gz
Merge "Concurrent Distructive/Intensive ops limits"
Diffstat (limited to 'doc')
-rw-r--r--doc/source/admin/troubleshooting.rst37
1 files changed, 37 insertions, 0 deletions
diff --git a/doc/source/admin/troubleshooting.rst b/doc/source/admin/troubleshooting.rst
index fa04d3006..2791430fd 100644
--- a/doc/source/admin/troubleshooting.rst
+++ b/doc/source/admin/troubleshooting.rst
@@ -973,3 +973,40 @@ Unfortunately, due to the way the conductor is designed, it is not possible to
gracefully break a stuck lock held in ``*-ing`` states. As the last resort, you
may need to restart the affected conductor. See `Why are my nodes stuck in a
"-ing" state?`_.
+
+What is ConcurrentActionLimit?
+==============================
+
+ConcurrentActionLimit is an exception which is raised to clients when an
+operation is requested, but cannot be serviced at that moment because the
+overall threshold of nodes in concurrent "Deployment" or "Cleaning"
+operations has been reached.
+
+These limits exist for two distinct reasons.
+
+The first is they allow an operator to tune a deployment such that too many
+concurrent deployments cannot be triggered at any given time, as a single
+conductor has an internal limit to the number of overall concurrent tasks,
+this restricts only the number of running concurrent actions. As such, this
+accounts for the number of nodes in ``deploy`` and ``deploy wait`` states.
+In the case of deployments, the default value is relatively high and should
+be suitable for *most* larger operators.
+
+The second is to help slow down the ability in which an entire population of
+baremetal nodes can be moved into and through cleaning, in order to help
+guard against authenticated malicious users, or accidental script driven
+operations. In this case, the total number of nodes in ``deleting``,
+``cleaning``, and ``clean wait`` are evaluated. The default maximum limit
+for cleaning operations is *50* and should be suitable for the majority of
+baremetal operators.
+
+These settings can be modified by using the
+``[conductor]max_concurrent_deploy`` and ``[conductor]max_concurrent_clean``
+settings from the ironic.conf file supporting the ``ironic-conductor``
+service. Neither setting can be explicity disabled, however there is also no
+upper limit to the setting.
+
+.. note::
+ This was an infrastructure operator requested feature from actual lessons
+ learned in the operation of Ironic in large scale production. The defaults
+ may not be suitable for the largest scale operators.