Diffstat (limited to 'doc/source/admin')
-rw-r--r-- doc/source/admin/anaconda-deploy-interface.rst | 37
-rw-r--r-- doc/source/admin/drivers.rst                   |  1
-rw-r--r-- doc/source/admin/drivers/fake.rst              | 36
-rw-r--r-- doc/source/admin/drivers/ibmc.rst              |  2
-rw-r--r-- doc/source/admin/drivers/ilo.rst               | 83
-rw-r--r-- doc/source/admin/drivers/irmc.rst              | 70
-rw-r--r-- doc/source/admin/drivers/redfish.rst           | 83
-rw-r--r-- doc/source/admin/drivers/snmp.rst              | 74
-rw-r--r-- doc/source/admin/hardware-burn-in.rst          |  7
-rw-r--r-- doc/source/admin/metrics.rst                   | 34
-rw-r--r-- doc/source/admin/retirement.rst                | 21
-rw-r--r-- doc/source/admin/secure-rbac.rst               | 40
-rw-r--r-- doc/source/admin/troubleshooting.rst           | 171
13 files changed, 612 insertions, 47 deletions
diff --git a/doc/source/admin/anaconda-deploy-interface.rst b/doc/source/admin/anaconda-deploy-interface.rst
index 2c686506a..2b7195525 100644
--- a/doc/source/admin/anaconda-deploy-interface.rst
+++ b/doc/source/admin/anaconda-deploy-interface.rst
@@ -271,11 +271,44 @@ purposes.
 ``liveimg`` which is used as the base operating system image to start with.
 
+Configuration Considerations
+----------------------------
+
+When using the ``anaconda`` deployment interface, some configuration
+parameters may need to be adjusted in your environment. This is in large
+part because the general defaults are set to much lower values suited to
+image based deployments; given the way the anaconda deployment interface
+works, you may need to make some adjustments.
+
+* ``[conductor]deploy_callback_timeout`` likely needs to be adjusted
+  for most ``anaconda`` deployment interface users. By default this
+  is a timer which looks for "agents" which have not checked in with
+  Ironic, or agents which may have crashed or failed after they
+  started. If the value is reached, the current operation is failed.
+  This value should be set to a number of seconds which exceeds your
+  average anaconda deployment time.
+* ``[pxe]boot_retry_timeout`` can also be triggered and result in an
+  anaconda deployment in progress getting reset, as it is intended
+  to reboot nodes which might have failed their initial PXE operation.
+  Depending on the sizes of images, and the exact nature of what was
+  deployed, it may be necessary to set this to a much higher value.
+
 Limitations
 -----------
 
-This deploy interface has only been tested with Red Hat based operating systems
-that use anaconda. Other systems are not supported.
+* This deploy interface has only been tested with Red Hat based operating
+  systems that use anaconda. Other systems are not supported.
+
+* Runtime TLS certificate injection into ramdisks is not supported.
+  Assets such as the ``ramdisk`` or a ``stage2`` ramdisk image need to have
+  trusted Certificate Authority certificates present within the images *or*
+  the Ironic API endpoint utilized should use a known trusted Certificate
+  Authority.
+
+* The ``anaconda`` tooling deploying the instance/workload does not
+  heartbeat to Ironic like the ``ironic-python-agent`` driven ramdisks.
+  As such, you may need to adjust some timers. See
+  `Configuration Considerations`_ for some details on this.
 
 .. _`anaconda`: https://fedoraproject.org/wiki/Anaconda
 .. _`ks.cfg.template`: https://opendev.org/openstack/ironic/src/branch/master/ironic/drivers/modules/ks.cfg.template
diff --git a/doc/source/admin/drivers.rst b/doc/source/admin/drivers.rst
index c3d8eb377..f35cb2dfa 100644
--- a/doc/source/admin/drivers.rst
+++ b/doc/source/admin/drivers.rst
@@ -26,6 +26,7 @@ Hardware Types
    drivers/redfish
    drivers/snmp
    drivers/xclarity
+   drivers/fake
 
 Changing Hardware Types and Interfaces
 --------------------------------------
diff --git a/doc/source/admin/drivers/fake.rst b/doc/source/admin/drivers/fake.rst
new file mode 100644
index 000000000..ea7d7ef4c
--- /dev/null
+++ b/doc/source/admin/drivers/fake.rst
@@ -0,0 +1,36 @@
+===========
+Fake driver
+===========
+
+Overview
+========
+
+The ``fake-hardware`` hardware type is what it claims to be: fake. Use of this
+type or the ``fake`` interfaces should be temporary or limited to
+non-production environments, as the ``fake`` interfaces do not perform any of
+the actions typically expected.
+
+The ``fake`` interfaces can be combined with any of the "real" hardware
+interfaces, allowing you to effectively disable one or more hardware
+interfaces for testing by simply setting that interface to ``fake``.
+
+Use cases
+=========
+
+Development
+-----------
+Developers can use the ``fake-hardware`` hardware type to mock out nodes for
+testing without those nodes needing to exist with physical or virtual
+hardware.
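The interface combinations described in the overview can be sketched in ``ironic.conf``. This is an illustrative fragment only: the exact set of ``enabled_*_interfaces`` options to list depends on which interfaces your deployment enables, and the values shown here are assumptions for a test-only setup.

.. code-block:: ini

   [DEFAULT]
   # Enable the fake hardware type alongside any real types in use.
   enabled_hardware_types = fake-hardware
   # Setting an interface to "fake" effectively disables it for testing.
   enabled_power_interfaces = fake
   enabled_management_interfaces = fake
   enabled_deploy_interfaces = fake
   enabled_boot_interfaces = fake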
+ +Adoption +-------- +Some OpenStack deployers have used ``fake`` interfaces in Ironic to allow an +adoption-style workflow with Nova. By setting a node's hardware interfaces to +``fake``, it's possible to deploy to that node with Nova without causing any +actual changes to the hardware or an OS already deployed on it. + +This is generally an unsupported use case, but it is possible. For more +information, see the relevant `post from CERN TechBlog`_. + +.. _`post from CERN TechBlog`: https://techblog.web.cern.ch/techblog/post/ironic-nova-adoption/ diff --git a/doc/source/admin/drivers/ibmc.rst b/doc/source/admin/drivers/ibmc.rst index 1bf9a3ba2..0f7fe1d90 100644 --- a/doc/source/admin/drivers/ibmc.rst +++ b/doc/source/admin/drivers/ibmc.rst @@ -312,6 +312,6 @@ boot_up_seq GET Query boot up sequence get_raid_controller_list GET Query RAID controller summary info ======================== ============ ====================================== -.. _Huawei iBMC: https://e.huawei.com/en/products/cloud-computing-dc/servers/accessories/ibmc +.. _Huawei iBMC: https://e.huawei.com/en/products/computing/kunpeng/accessories/ibmc .. _TLS: https://en.wikipedia.org/wiki/Transport_Layer_Security .. 
_HUAWEI iBMC Client library: https://pypi.org/project/python-ibmcclient/ diff --git a/doc/source/admin/drivers/ilo.rst b/doc/source/admin/drivers/ilo.rst index f764a6d89..b6825fc40 100644 --- a/doc/source/admin/drivers/ilo.rst +++ b/doc/source/admin/drivers/ilo.rst @@ -55,6 +55,8 @@ The hardware type ``ilo`` supports following HPE server features: * `Updating security parameters as manual clean step`_ * `Update Minimum Password Length security parameter as manual clean step`_ * `Update Authentication Failure Logging security parameter as manual clean step`_ +* `Create Certificate Signing Request(CSR) as manual clean step`_ +* `Add HTTPS Certificate as manual clean step`_ * `Activating iLO Advanced license as manual clean step`_ * `Removing CA certificates from iLO as manual clean step`_ * `Firmware based UEFI iSCSI boot from volume support`_ @@ -65,6 +67,7 @@ The hardware type ``ilo`` supports following HPE server features: * `BIOS configuration support`_ * `IPv6 support`_ * `Layer 3 or DHCP-less ramdisk booting`_ +* `Events subscription`_ Apart from above features hardware type ``ilo5`` also supports following features: @@ -200,6 +203,18 @@ The ``ilo`` hardware type supports following hardware interfaces: enabled_hardware_types = ilo enabled_rescue_interfaces = agent,no-rescue +* vendor + Supports ``ilo``, ``ilo-redfish`` and ``no-vendor``. The default is + ``ilo``. They can be enabled by using the + ``[DEFAULT]enabled_vendor_interfaces`` option in ``ironic.conf`` as given + below: + + .. code-block:: ini + + [DEFAULT] + enabled_hardware_types = ilo + enabled_vendor_interfaces = ilo,ilo-redfish,no-vendor + The ``ilo5`` hardware type supports all the ``ilo`` interfaces described above, except for ``boot`` and ``raid`` interfaces. The details of ``boot`` and @@ -751,6 +766,12 @@ Supported **Manual** Cleaning Operations ``update_auth_failure_logging_threshold``: Updates the Authentication Failure Logging security parameter. 
See `Update Authentication Failure Logging security parameter as manual clean step`_ for user guidance on usage. + ``create_csr``: + Creates the certificate signing request. See `Create Certificate Signing Request(CSR) as manual clean step`_ + for user guidance on usage. + ``add_https_certificate``: + Adds the signed HTTPS certificate to the iLO. See `Add HTTPS Certificate as manual clean step`_ for user + guidance on usage. * iLO with firmware version 1.5 is minimally required to support all the operations. @@ -1648,6 +1669,54 @@ Both the arguments ``logging_threshold`` and ``ignore`` are optional. The accept value be False. If user passes the value of logging_threshold as 0, the Authentication Failure Logging security parameter will be disabled. +Create Certificate Signing Request(CSR) as manual clean step +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +iLO driver can invoke ``create_csr`` request as a manual clean step. This step is only supported for iLO5 based hardware. + +An example of a manual clean step with ``create_csr`` as the only clean step could be:: + + "clean_steps": [{ + "interface": "management", + "step": "create_csr", + "args": { + "csr_params": { + "City": "Bengaluru", + "CommonName": "1.1.1.1", + "Country": "India", + "OrgName": "HPE", + "State": "Karnataka" + } + } + }] + +The ``[ilo]cert_path`` option in ``ironic.conf`` is used as the directory path for +creating the CSR, which defaults to ``/var/lib/ironic/ilo``. The CSR is created in the directory location +given in ``[ilo]cert_path`` in ``node_uuid`` directory as <node_uuid>.csr. + + +Add HTTPS Certificate as manual clean step +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +iLO driver can invoke ``add_https_certificate`` request as a manual clean step. This step is only supported for +iLO5 based hardware. 
+ +An example of a manual clean step with ``add_https_certificate`` as the only clean step could be:: + + "clean_steps": [{ + "interface": "management", + "step": "add_https_certificate", + "args": { + "cert_file": "/test1/iLO.crt" + } + }] + +Argument ``cert_file`` is mandatory. The ``cert_file`` takes the path or url of the certificate file. +The url schemes supported are: ``file``, ``http`` and ``https``. +The CSR generated in step ``create_csr`` needs to be signed by a valid CA and the resultant HTTPS certificate should +be provided in ``cert_file``. It copies the ``cert_file`` to ``[ilo]cert_path`` under ``node.uuid`` as <node_uuid>.crt +before adding it to iLO. + RAID Support ^^^^^^^^^^^^ @@ -2136,6 +2205,20 @@ DHCP-less deploy is supported by ``ilo`` and ``ilo5`` hardware types. However it would work only with ilo-virtual-media boot interface. See :doc:`/admin/dhcp-less` for more information. +Events subscription +^^^^^^^^^^^^^^^^^^^ +Events subscription is supported by ``ilo`` and ``ilo5`` hardware types with +``ilo`` vendor interface for Gen10 and Gen10 Plus servers. See +:ref:`node-vendor-passthru-methods` for more information. + +Anaconda based deployment +^^^^^^^^^^^^^^^^^^^^^^^^^ +Deployment with ``anaconda`` deploy interface is supported by ``ilo`` and +``ilo5`` hardware type and works with ``ilo-pxe`` and ``ilo-ipxe`` +boot interfaces. See :doc:`/admin/anaconda-deploy-interface` for +more information. + + .. _`ssacli documentation`: https://support.hpe.com/hpsc/doc/public/display?docId=c03909334 .. _`proliant-tools`: https://docs.openstack.org/diskimage-builder/latest/elements/proliant-tools/README.html .. 
_`HPE iLO4 User Guide`: https://h20566.www2.hpe.com/hpsc/doc/public/display?docId=c03334051
diff --git a/doc/source/admin/drivers/irmc.rst b/doc/source/admin/drivers/irmc.rst
index 17b8d8644..9ddfa3b3d 100644
--- a/doc/source/admin/drivers/irmc.rst
+++ b/doc/source/admin/drivers/irmc.rst
@@ -123,11 +123,29 @@ Configuration via ``driver_info``
     the iRMC with administrator privileges.
   - ``driver_info/irmc_password`` property to be ``password`` for
     irmc_username.
-  - ``properties/capabilities`` property to be ``boot_mode:uefi`` if
-    UEFI boot is required.
-  - ``properties/capabilities`` property to be ``secure_boot:true`` if
-    UEFI Secure Boot is required. Please refer to `UEFI Secure Boot Support`_
-    for more information.
+
+  .. note::
+     Fujitsu servers equipped with iRMC S6 2.00 or a later firmware version
+     disable IPMI over LAN by default, though users may be able to enable
+     IPMI via the BMC settings.
+     To handle this change, the ``irmc`` hardware type first tries IPMI and,
+     if the IPMI operation fails, uses the Redfish API of the Fujitsu server
+     to provide Ironic functionality.
+     So if you deploy a Fujitsu server with iRMC S6 2.00 or later, you need
+     to set the Redfish related parameters in ``driver_info``.
+
+  - ``driver_info/redfish_address`` property to be the ``IP address`` or
+    ``hostname`` of the iRMC. You can prefix it with a protocol (e.g.
+    ``https://``). If you don't provide a protocol, Ironic assumes HTTPS
+    (i.e. adds the ``https://`` prefix).
+    iRMC with S6 2.00 or later only supports HTTPS connections to the
+    Redfish API.
+  - ``driver_info/redfish_username`` to be the user name of the iRMC with
+    administrator privileges
+  - ``driver_info/redfish_password`` to be the password of
+    ``redfish_username``
+  - ``driver_info/redfish_verify_ca`` accepts the same values as
+    ``driver_info/irmc_verify_ca``
+  - ``driver_info/redfish_auth_type`` to be one of ``basic``, ``session`` or
+    ``auto``
 
 * If ``port`` in ``[irmc]`` section of ``/etc/ironic/ironic.conf`` or
   ``driver_info/irmc_port`` is set to 443, ``driver_info/irmc_verify_ca``
@@ -191,6 +209,22 @@ Configuration via ``driver_info``
   - ``driver_info/irmc_snmp_priv_password`` property to be the privacy
     protocol pass phrase. The length of the pass phrase should be at least
     8 characters.
+
+Configuration via ``properties``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+* Each node is configured for the ``irmc`` hardware type by setting the
+  following ironic node object's properties:
+
+  - ``properties/capabilities`` property to be ``boot_mode:uefi`` if
+    UEFI boot is required, or ``boot_mode:bios`` if Legacy BIOS is required.
+    If this is not set, ``default_boot_mode`` at the ``[default]`` section in
+    ``ironic.conf`` will be used.
+  - ``properties/capabilities`` property to be ``secure_boot:true`` if
+    UEFI Secure Boot is required. Please refer to `UEFI Secure Boot Support`_
+    for more information.
+
+
 Configuration via ``ironic.conf``
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -199,6 +233,25 @@ Configuration via ``ironic.conf``
   - ``port``: Port to be used for iRMC operations; either 80 or 443. The
     default value is 443. Optional.
+
+    .. note::
+       Since iRMC S6 2.00, the iRMC firmware doesn't support HTTP connections
+       to the REST API. If you deploy a server with iRMC S6 2.00 or later,
+       please set ``port`` to 443.
+
+    The ``irmc`` hardware type provides a ``verify_step`` named
+    ``verify_http_https_connection_and_fw_version`` to check the HTTP(S)
+    connection to the iRMC REST API. If the HTTP(S) connection is
+    successfully established, it then fetches and caches the iRMC firmware
+    version.
+    If the HTTP(S) connection to the iRMC REST API fails, the Ironic node's
+    state moves to ``enroll``, with a suggestion put in the log message.
+    The default priority of this verify step is 10.
+
+    If an operator updates the iRMC firmware version of a node, the operator
+    should run the ``cache_irmc_firmware_version`` node vendor passthru
+    method to update the iRMC firmware version stored in
+    ``driver_internal_info/irmc_fw_version``.
+
   - ``auth_method``: Authentication method for iRMC operations; either
     ``basic`` or ``digest``. The default value is ``basic``. Optional.
   - ``client_timeout``: Timeout (in seconds) for iRMC
@@ -229,9 +282,10 @@ Configuration via ``ironic.conf``
     and ``v2c``. The default value is ``public``. Optional.
   - ``snmp_security``: SNMP security name required for version ``v3``.
     Optional.
-  - ``snmp_auth_proto``: The SNMPv3 auth protocol. The valid value and the
-    default value are both ``sha``. We will add more supported valid values
-    in the future. Optional.
+  - ``snmp_auth_proto``: The SNMPv3 auth protocol. If using iRMC S4 or S5, the
+    valid value of this option is only ``sha``. If using iRMC S6, the valid
+    values are ``sha256``, ``sha384`` and ``sha512``. The default value is
+    ``sha``. Optional.
   - ``snmp_priv_proto``: The SNMPv3 privacy protocol. The valid value and the
     default value are both ``aes``. We will add more supported valid values
     in the future. Optional.
diff --git a/doc/source/admin/drivers/redfish.rst b/doc/source/admin/drivers/redfish.rst
index dd19f8bde..063dd1fe5 100644
--- a/doc/source/admin/drivers/redfish.rst
+++ b/doc/source/admin/drivers/redfish.rst
@@ -87,8 +87,18 @@ field:
    The "auto" mode first tries "session" and falls back
    to "basic" if session authentication is not supported
    by the Redfish BMC. Default is set in ironic config
-   as ``[redfish]auth_type``.
+   as ``[redfish]auth_type``. Most operators should not
+   need to leverage this setting.
+   Session based authentication should generally be used
+   in most cases as it prevents re-authentication every
+   time a background task checks in with the BMC.
 
+.. note::
+   The ``redfish_address``, ``redfish_username``, ``redfish_password``,
+   and ``redfish_verify_ca`` fields, if changed, will trigger a new session
+   to be established and cached with the BMC. The ``redfish_auth_type`` field
+   will only be used for the creation of a new cached session, or should
+   one be rejected by the BMC.
 
 The ``baremetal node create`` command can be used to enroll a node
 with the ``redfish`` driver. For example:
@@ -533,6 +543,8 @@ settings. The following fields will be returned in the BIOS API
    "``unique``", "The setting is specific to this node"
    "``reset_required``", "After changing this setting a node reboot is required"
 
+.. _node-vendor-passthru-methods:
+
 Node Vendor Passthru Methods
 ============================
@@ -620,6 +632,75 @@ Eject Virtual Media
    "boot_device (optional)", "body", "string", "Type of the device to eject (all devices by default)"
 
+Internal Session Cache
+======================
+
+The ``redfish`` hardware type, and derived interfaces, utilizes a built-in
+session cache which prevents Ironic from re-authenticating every time
+Ironic attempts to connect to the BMC for any reason.
+
+This consists of cached connector objects which are used and tracked by
+the unique combination of ``redfish_username``, ``redfish_password``,
+``redfish_verify_ca``, and finally ``redfish_address``. Changing any one
+of those values will trigger a new session to be created.
+The ``redfish_system_id`` value is explicitly not considered, as Redfish
+has a model of use of one BMC to many systems, which is also a model
+Ironic supports.
+
+The session cache default size is ``1000`` sessions per conductor.
+If you are operating a deployment with a larger number of Redfish
+BMCs, it is advised that you tune that number appropriately.
+This can be tuned via the API service configuration file,
+``[redfish]connection_cache_size``.
+
+Session Cache Expiration
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+By default, sessions remain cached for as long as possible in
+memory, as long as they have not experienced an authentication,
+connection, or other unexplained error.
+
+Under normal circumstances, the sessions will only be rolled out
+of the cache, oldest first, when the cache becomes full. There is
+no time based expiration of entries in the session cache.
+
+Of course, the cache is only in memory, and restarting the
+``ironic-conductor`` will also cause the cache to be rebuilt
+from scratch. If this happens due to a persistent connectivity issue,
+it may be a sign of an unexpected condition; please consider
+contacting the Ironic developer community for assistance.
+
+Redfish Interoperability Profile
+================================
+
+The Ironic project provides a Redfish Interoperability Profile, located in
+the ``redfish-interop-profiles`` folder at the root of the source code. The
+Redfish Interoperability Profile is a JSON document written in a particular
+format that serves two purposes.
+
+* It enables the creation of a human-readable document that merges the
+  profile requirements with the Redfish schema into a single document
+  for developers or users.
+* It allows a conformance test utility to test a Redfish Service
+  implementation for conformance with the profile.
+
+The JSON document structure is intended to align easily with JSON payloads
+retrieved from Redfish Service implementations, to allow for easy comparisons
+and conformance testing. Many of the properties defined within this structure
+have assumed default values that correspond with the most common use case, so
+that those properties can be omitted from the document for brevity.
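For orientation, such a profile is shaped roughly like the following. This fragment is hypothetical and heavily abridged: the field names follow the general DMTF interoperability profile format, and it is not one of the profiles shipped by Ironic.

.. code-block:: json

   {
       "SchemaDefinition": "RedfishInteroperabilityProfile.v1_0_0",
       "ProfileName": "ExampleProfile",
       "ProfileVersion": "1.0.0",
       "Purpose": "Illustrative fragment only",
       "Protocol": {
           "MinVersion": "1.0"
       },
       "Resources": {
           "ComputerSystem": {
               "PropertyRequirements": {
                   "SerialNumber": {
                       "ReadRequirement": "Recommended"
                   }
               }
           }
       }
   }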
+ +Validation of Profiles using DMTF tool +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +An open source utility has been created by the Redfish Forum to verify that +a Redfish Service implementation conforms to the requirements included in a +Redfish Interoperability Profile. The Redfish Interop Validator is available +for download from the DMTF's organization on Github at +https://github.com/DMTF/Redfish-Interop-Validator. Refer to instructions in +README on how to configure and run validation. + + .. _Redfish: http://redfish.dmtf.org/ .. _Sushy: https://opendev.org/openstack/sushy .. _TLS: https://en.wikipedia.org/wiki/Transport_Layer_Security diff --git a/doc/source/admin/drivers/snmp.rst b/doc/source/admin/drivers/snmp.rst index 1c402ab9b..eed4ed794 100644 --- a/doc/source/admin/drivers/snmp.rst +++ b/doc/source/admin/drivers/snmp.rst @@ -22,39 +22,47 @@ this table could possibly work using a similar driver. Please report any device status. -============== ========== ========== ===================== -Manufacturer Model Supported? Driver name -============== ========== ========== ===================== -APC AP7920 Yes apc_masterswitch -APC AP9606 Yes apc_masterswitch -APC AP9225 Yes apc_masterswitchplus -APC AP7155 Yes apc_rackpdu -APC AP7900 Yes apc_rackpdu -APC AP7901 Yes apc_rackpdu -APC AP7902 Yes apc_rackpdu -APC AP7911a Yes apc_rackpdu -APC AP7921 Yes apc_rackpdu -APC AP7922 Yes apc_rackpdu -APC AP7930 Yes apc_rackpdu -APC AP7931 Yes apc_rackpdu -APC AP7932 Yes apc_rackpdu -APC AP7940 Yes apc_rackpdu -APC AP7941 Yes apc_rackpdu -APC AP7951 Yes apc_rackpdu -APC AP7960 Yes apc_rackpdu -APC AP7990 Yes apc_rackpdu -APC AP7998 Yes apc_rackpdu -APC AP8941 Yes apc_rackpdu -APC AP8953 Yes apc_rackpdu -APC AP8959 Yes apc_rackpdu -APC AP8961 Yes apc_rackpdu -APC AP8965 Yes apc_rackpdu -Aten all? Yes aten -CyberPower all? Untested cyberpower -EatonPower all? Untested eatonpower -Teltronix all? 
Yes teltronix -BayTech MRP27 Yes baytech_mrp27 -============== ========== ========== ===================== +============== ============== ========== ===================== +Manufacturer Model Supported? Driver name +============== ============== ========== ===================== +APC AP7920 Yes apc_masterswitch +APC AP9606 Yes apc_masterswitch +APC AP9225 Yes apc_masterswitchplus +APC AP7155 Yes apc_rackpdu +APC AP7900 Yes apc_rackpdu +APC AP7901 Yes apc_rackpdu +APC AP7902 Yes apc_rackpdu +APC AP7911a Yes apc_rackpdu +APC AP7921 Yes apc_rackpdu +APC AP7922 Yes apc_rackpdu +APC AP7930 Yes apc_rackpdu +APC AP7931 Yes apc_rackpdu +APC AP7932 Yes apc_rackpdu +APC AP7940 Yes apc_rackpdu +APC AP7941 Yes apc_rackpdu +APC AP7951 Yes apc_rackpdu +APC AP7960 Yes apc_rackpdu +APC AP7990 Yes apc_rackpdu +APC AP7998 Yes apc_rackpdu +APC AP8941 Yes apc_rackpdu +APC AP8953 Yes apc_rackpdu +APC AP8959 Yes apc_rackpdu +APC AP8961 Yes apc_rackpdu +APC AP8965 Yes apc_rackpdu +Aten all? Yes aten +CyberPower all? Untested cyberpower +EatonPower all? Untested eatonpower +Teltronix all? 
Yes        teltronix
+BayTech        MRP27          Yes        baytech_mrp27
+Raritan        PX3-5547V-V2   Yes        raritan_pdu2
+Raritan        PX3-5726V      Yes        raritan_pdu2
+Raritan        PX3-5776U-N2   Yes        raritan_pdu2
+Raritan        PX3-5969U-V2   Yes        raritan_pdu2
+Raritan        PX3-5961I2U-V2 Yes        raritan_pdu2
+Vertiv         NU30212        Yes        vertivgeist_pdu
+ServerTech     CW-16VE-P32M   Yes        servertech_sentry3
+ServerTech     C2WG24SN       Yes        servertech_sentry4
+============== ============== ========== =====================
 
 Software Requirements
diff --git a/doc/source/admin/hardware-burn-in.rst b/doc/source/admin/hardware-burn-in.rst
index 503664182..35f231d11 100644
--- a/doc/source/admin/hardware-burn-in.rst
+++ b/doc/source/admin/hardware-burn-in.rst
@@ -108,6 +108,13 @@ Then launch the test with:
    baremetal node clean --clean-steps '[{"step": "burnin_disk", \
      "interface": "deploy"}]' $NODE_NAME_OR_UUID
 
+In order to launch a parallel SMART self test on all devices after the
+disk burn-in (which will fail the step if any of the tests fail), set:
+
+.. code-block:: console
+
+   baremetal node set --driver-info agent_burnin_fio_disk_smart_test=True \
+     $NODE_NAME_OR_UUID
 
 Network burn-in
 ===============
diff --git a/doc/source/admin/metrics.rst b/doc/source/admin/metrics.rst
index f435a50c5..733c6569b 100644
--- a/doc/source/admin/metrics.rst
+++ b/doc/source/admin/metrics.rst
@@ -17,8 +17,11 @@
 These performance measurements, herein referred to as "metrics", can be
 emitted from the Bare Metal service, including ironic-api, ironic-conductor,
 and ironic-python-agent. By default, none of the services will emit metrics.
 
-Configuring the Bare Metal Service to Enable Metrics
-====================================================
+It is important to stress that statsd is not the only supported mechanism
+for metrics collection and transmission. This is covered later on in this
+document.
+
+Configuring the Bare Metal Service to Enable Metrics with Statsd
+================================================================
 
 Enabling metrics in ironic-api and ironic-conductor
 ---------------------------------------------------
@@ -62,6 +65,30 @@ in the ironic configuration file as well::
 
    agent_statsd_host = 198.51.100.2
    agent_statsd_port = 8125
 
+.. Note::
+   Use of a different metrics backend with the agent is not presently
+   supported.
+
+Transmission to the Message Bus Notifier
+========================================
+
+Regardless of whether you're using Ceilometer,
+`ironic-prometheus-exporter <https://docs.openstack.org/ironic-prometheus-exporter/latest/>`_,
+or some scripting you wrote to consume the message bus notifications,
+metrics data can be sent to the message bus notifier from the timer methods
+*and* additional gauge counters by setting the ``[metrics]backend``
+configuration option to ``collector``. When this is the case,
+information is cached locally and periodically sent along with the general
+sensor data update to the messaging notifier, which can be consumed off of
+the message bus, or via a notifier plugin (such as is done with
+ironic-prometheus-exporter).
+
+.. NOTE::
+   Transmission of timer data only works for the Conductor or
+   ``single-process`` Ironic service model. A separate webserver process
+   presently does not have the capability of triggering the call to retrieve
+   and transmit the data.
+
+.. NOTE::
+   This functionality requires ironic-lib version 5.4.0 to be installed.
 
 Types of Metrics Emitted
 ========================
@@ -79,6 +106,9 @@ additional load before enabling metrics.
 To see which metrics have changed names or have been removed between
 releases, refer to the `ironic release notes
 <https://docs.openstack.org/releasenotes/ironic/>`_.
 
+Additional conductor metrics in the form of counts will also be generated
+in limited locations where pertinent to the activity of the conductor.
+
 ..
note::
    With the default statsd configuration, each timing metric may create
    additional metrics due to how statsd handles timing metrics. For more
diff --git a/doc/source/admin/retirement.rst b/doc/source/admin/retirement.rst
index e4884e0f4..aab307bac 100644
--- a/doc/source/admin/retirement.rst
+++ b/doc/source/admin/retirement.rst
@@ -23,6 +23,27 @@ scheduling of instances, but will still allow for other operations, such as
 cleaning, to happen (this marks an important difference to nodes which have
 the ``maintenance`` flag set).
 
+Requirements
+============
+
+The use of the retirement feature requires that automated cleaning be
+enabled. The default ``[conductor]automated_clean`` setting must not be
+disabled, because the retirement feature is only engaged upon the
+completion of cleaning, which sets forth the expectation of removing
+sensitive data from a node.
+
+If you're uncomfortable with full cleaning, but want to make use of the
+retirement feature, a compromise may be to explore the use of metadata
+erasure; however, this will leave additional data on disk which you may
+wish to erase completely. Please consult the configuration for the
+``[deploy]erase_devices_metadata_priority`` and
+``[deploy]erase_devices_priority`` settings, and do note that
+clean steps can be manually invoked through manual cleaning should you
+wish to trigger the ``erase_devices`` clean step to completely wipe
+all data from storage devices. Alternatively, automated cleaning can
+also be enabled on an individual node level using the
+``baremetal node set --automated-clean <node_id>`` command.
+
 How to use
 ==========
diff --git a/doc/source/admin/secure-rbac.rst b/doc/source/admin/secure-rbac.rst
index 639cfcb23..1f1bb66d1 100644
--- a/doc/source/admin/secure-rbac.rst
+++ b/doc/source/admin/secure-rbac.rst
@@ -267,3 +267,43 @@ restrictive and an ``owner`` may revoke access to ``lessee``.
 Access to the underlying baremetal node is not exclusive between the
 ``owner`` and ``lessee``, and this use model expects that some level of
 communication takes place between the appropriate parties.
+
+Can I, a project admin, create a node?
+--------------------------------------
+
+Starting in API version ``1.80``, the capability was added to allow users
+with an ``admin`` role to create and delete their own nodes in Ironic.
+
+This functionality is enabled by default, and automatically
+imparts ``owner`` privileges to the created Bare Metal node.
+
+This functionality can be disabled by setting
+``[api]project_admin_can_manage_own_nodes`` to ``False``.
+
+Can I use a service role?
+-------------------------
+
+In later versions of Ironic, the ``service`` role has been added to enable
+delineation of accounts and access to Ironic's API. As Ironic's API was
+largely originally intended as an "admin" API service, the service role
+enables similar levels of access as a project-scoped user with the
+``admin`` or ``manager`` roles.
+
+In terms of access, this is likely best viewed as a user with the
+``manager`` role, but with a slight elevation in privilege to enable
+usage of the service via a service account.
+
+A project scoped user with the ``service`` role is able to create
+baremetal nodes, but is not able to delete them. To disable the
+ability to create nodes, set the
+``[api]project_admin_can_manage_own_nodes`` setting to ``False``.
+The nodes which can be accessed/managed in the project scope also align
+with the ``owner`` and ``lessee`` access model, and thus if nodes do not
+match the user's ``project_id``, then Ironic's API will appear not to
+have any enrolled baremetal nodes.
+
+With the system scope, a user with the ``service`` role is able to
+create baremetal nodes but, likewise, not able to delete them. The access
+rights are modeled such that an ``admin`` scope is needed to delete
+baremetal nodes from Ironic.
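The opt-out described above boils down to a single configuration option; an illustrative ``ironic.conf`` fragment:

.. code-block:: ini

   [api]
   # When False, project-scoped users with the admin or service role can
   # no longer create their own baremetal nodes.
   project_admin_can_manage_own_nodes = False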
diff --git a/doc/source/admin/troubleshooting.rst b/doc/source/admin/troubleshooting.rst
index fa04d3006..72e969b6e 100644
--- a/doc/source/admin/troubleshooting.rst
+++ b/doc/source/admin/troubleshooting.rst
@@ -973,3 +973,174 @@
 Unfortunately, due to the way the conductor is designed, it is not possible
 to gracefully break a stuck lock held in ``*-ing`` states. As the last
 resort, you may need to restart the affected conductor. See `Why are my
 nodes stuck in a "-ing" state?`_.
+
+What is ConcurrentActionLimit?
+==============================
+
+ConcurrentActionLimit is an exception which is raised to clients when an
+operation is requested, but cannot be serviced at that moment because the
+overall threshold of nodes in concurrent "Deployment" or "Cleaning"
+operations has been reached.
+
+These limits exist for two distinct reasons.
+
+The first is that they allow an operator to tune a deployment such that
+too many concurrent deployments cannot be triggered at any given time.
+While a single conductor has an internal limit to the number of overall
+concurrent tasks, this limit restricts only the number of running
+concurrent actions. As such, it accounts for the number of nodes in
+``deploy`` and ``deploy wait`` states.
+In the case of deployments, the default value is relatively high and should
+be suitable for *most* larger operators.
+
+The second is to help slow down the speed at which an entire population of
+baremetal nodes can be moved into and through cleaning, in order to help
+guard against authenticated malicious users, or accidental script driven
+operations. In this case, the total number of nodes in ``deleting``,
+``cleaning``, and ``clean wait`` is evaluated. The default maximum limit
+for cleaning operations is *50* and should be suitable for the majority of
+baremetal operators.
+
+These settings can be modified by using the
+``[conductor]max_concurrent_deploy`` and ``[conductor]max_concurrent_clean``
+settings in the ironic.conf file supporting the ``ironic-conductor``
+service. Neither setting can be explicitly disabled, however there is also
+no upper limit to the setting.
+
+.. note::
+   This was an infrastructure operator requested feature from actual
+   lessons learned in the operation of Ironic in large scale production.
+   The defaults may not be suitable for the largest scale operators.
+
+Why do I have an error that an NVMe Partition is not a block device?
+====================================================================
+
+In some cases, you can encounter an error suggesting that a partition
+which has been created on an NVMe block device is not a block device.
+
+Example::
+
+  lsblk: /dev/nvme0n1p2: not a block device
+
+What has happened is that the partition contains a partition table inside
+of it, which confuses the NVMe device interaction. While nested partition
+tables are basically valid in some cases, for example with software RAID,
+in the NVMe case the driver, and possibly the underlying device, gets
+quite confused. This is in part because partitions on NVMe devices are
+higher level abstractions.
+
+The way this occurs is that you likely had a ``whole-disk`` image, and it
+was configured as a partition image. If using Glance, your image
+properties may have an ``img_type`` field, which should be ``whole-disk``,
+or you have ``kernel_id`` and ``ramdisk_id`` values in the Glance image
+``properties`` field. Definition of a kernel and ramdisk value also
+indicates that the image is of a ``partition`` image type. This is because
+a ``whole-disk`` image is bootable from the contents within the image,
+whereas partition images are unable to be booted without a kernel and
+ramdisk.
+
+If you are using Ironic in standalone mode, it may be advisable to check
+the optional ``instance_info/image_type`` setting.
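The image-type inference described above can be sketched as follows. This is an illustrative reimplementation of the rules as documented, not Ironic's actual code, and ``infer_image_type`` is a hypothetical helper name:

```python
def infer_image_type(properties):
    """Guess whether image properties describe a whole-disk or a
    partition image, per the rules described above (sketch only)."""
    # An explicit img_type property wins outright.
    if "img_type" in properties:
        return properties["img_type"]
    # A kernel/ramdisk pair marks the image as a partition image,
    # since a whole-disk image boots from its own contents.
    if "kernel_id" in properties and "ramdisk_id" in properties:
        return "partition"
    return "whole-disk"


print(infer_image_type({"kernel_id": "abc", "ramdisk_id": "def"}))
```

If this prints ``partition`` for an image you intended to be whole-disk, the stray ``kernel_id``/``ramdisk_id`` properties are the likely cause of the misdeployment described above.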
+Very similar to the Glance usage above, if you have set Ironic's
+node-level ``instance_info/kernel`` and ``instance_info/ramdisk``
+parameters, Ironic will proceed with deploying the image as if it is a
+partition image, create a partition table on the new block device, and
+then write the contents of the image into the newly created partition.
+
+.. NOTE::
+   As a general reminder, the Ironic community recommends the use of
+   whole disk images over the use of partition images.
+
+Why can't I use Secure Erase/Wipe with RAID controllers?
+========================================================
+
+Situations have been reported where an infrastructure operator is
+expecting particular device types to be Secure Erased or Wiped when they
+are behind a RAID controller.
+
+For example, the server may have NVMe devices attached to a RAID
+controller which could be in pass-through or single disk volume mode. The
+same scenario exists basically regardless of the disk/storage medium/type.
+
+The basic reason is that RAID controllers essentially act as command
+translators with a buffer cache. They tend to offer a simplified protocol
+to the Operating System, and interact with the storage device in whatever
+protocol is native to the device. This is the root of the underlying
+problem.
+
+Protocols such as SCSI are rooted in quite a bit of computing history,
+but never evolved to include primitives like Secure Erase, which evolved in
+the `ATA protocol <https://en.wikipedia.org/wiki/Parallel_ATA#HDD_passwords_and_security>`_.
+
+The closest primitives in SCSI to ATA Secure Erase are the ``FORMAT UNIT``
+and ``UNMAP`` commands.
+
+``FORMAT UNIT`` might be a viable solution, and a tool named
+`sg_format <https://linux.die.net/man/8/sg_format>`_ exists,
+but there has not been a sufficient call upstream to implement this and
+test it sufficiently that the Ironic community would be comfortable
+shipping such a capability.
+The possibility also exists that a RAID
+controller might not translate this command through to an end device,
+just as some RAID controllers know how to handle and pass through
+ATA commands to disk devices which support them. It is entirely dependent
+upon the hardware configuration scenario.
+
+The ``UNMAP`` command is similar to the ATA ``TRIM`` command.
+Unfortunately the SCSI protocol requires this be performed at the block
+level, and similar to ``FORMAT UNIT``, it may not be supported or passed
+through.
+
+If you are interested in working on this area, or are willing to help
+test, please feel free to contact the
+:doc:`Ironic development community </contributor/community>`.
+An additional option is the creation of your own
+`custom Hardware Manager <https://opendev.org/openstack/ironic-python-agent/src/branch/master/examples/custom-disk-erase>`_
+which can contain your preferred logic, however this does require some
+Python development experience.
+
+One last item of note: depending on the RAID controller, the BMC, and a
+number of other variables, you may be able to leverage the `RAID <raid>`_
+configuration interface to delete volumes/disks, and recreate them. This
+may have the same effect as a clean disk, however that too is RAID
+controller dependent behavior.
+
+I'm in "clean failed" state, what do I do?
+==========================================
+
+There is only one way to exit the ``clean failed`` state. But before we
+visit the answer as to **how**, we need to stress the importance of
+attempting to understand **why** cleaning failed. On the simple side of
+things, this may be as simple as a DHCP failure, but on the complex side
+of things, it could be that a cleaning action failed against the
+underlying hardware, possibly due to a hardware failure.
+
+As such, we encourage everyone to attempt to understand **why** before
+exiting the ``clean failed`` state, because you could potentially make
+things worse for yourself.
+For example, if firmware updates were being performed, you may need to
+perform a rollback operation against the physical server, depending on
+what, and how, the firmware was being updated. Unfortunately this also
+borders the territory of "no simple answer".
+
+This can be counterbalanced by the fact that sometimes there is simply a
+transient networking failure and a DHCP address was not obtained. An
+example of this would be suggested by the ``last_error`` field indicating
+something about "Timeout reached while cleaning the node", however we
+recommend following several basic troubleshooting steps:
+
+* Consult the ``last_error`` field on the node, utilizing the
+  ``baremetal node show <uuid>`` command.
+* If the version of ironic supports the feature, consult the node history
+  log, ``baremetal node history list`` and
+  ``baremetal node history get <uuid>``.
+* Consult the actual console screen of the physical machine. *If* the
+  ramdisk booted, you will generally want to investigate the controller
+  logs and see if an uploaded agent log is being stored on the conductor
+  responsible for the baremetal node. Consult
+  `Retrieving logs from the deploy ramdisk`_.
+  If the node did not boot for some reason, you can typically just retry
+  at this point and move on.
+
+How to get out of the state, once you've understood **why** you reached
+it in the first place, is to utilize the ``baremetal node manage
+<node_id>`` command. This returns the node to the ``manageable`` state,
+from which you can retry cleaning through automated cleaning with the
+``provide`` command, manual cleaning with the ``clean`` command, or the
+next appropriate action in the workflow process you are attempting to
+follow, which may ultimately be decommissioning the node because it could
+have failed and is being removed or replaced.
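As a sketch of the recovery flow above, using the ``baremetal`` CLI (the node UUID and the clean-steps file name are placeholders, and the exact ``clean`` invocation may vary with your client version):

```shell
# Return the node from "clean failed" to "manageable".
baremetal node manage <node_uuid>

# Then either re-run automated cleaning on the way back to "available"...
baremetal node provide <node_uuid>

# ...or run manual cleaning with your own steps (hypothetical file name).
baremetal node clean <node_uuid> --clean-steps my-clean-steps.json
```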