author | Dmitry Tantsur <dtantsur@protonmail.com> | 2020-12-10 17:51:05 +0100 |
---|---|---|
committer | Dmitry Tantsur <dtantsur@protonmail.com> | 2020-12-10 17:58:18 +0100 |
commit | 8a2c715a0a6677b2a32f9319c1227591b14bdfa5 | |
tree | 11345e03f6b8f6627cf8d0c88d55916aedca5880 /doc/source/admin/troubleshooting.rst | |
parent | 42bf964c8cf747a0b6b82243cbf15a3630aa3a6e | |
Add TLS troubleshooting guide entry
Change-Id: Ied66562bb2475513ddb8c712dedc5f50fc6cad4f
Diffstat (limited to 'doc/source/admin/troubleshooting.rst')
-rw-r--r-- | doc/source/admin/troubleshooting.rst | 51 |
1 files changed, 51 insertions, 0 deletions
diff --git a/doc/source/admin/troubleshooting.rst b/doc/source/admin/troubleshooting.rst
index e774bbc3e..2ddd22cfc 100644
--- a/doc/source/admin/troubleshooting.rst
+++ b/doc/source/admin/troubleshooting.rst
@@ -718,3 +718,54 @@ or vendor supplied images. Centos, Ubuntu, Fedora, and Debian all publish
 operating system images which do generally include drivers and firmware for
 physical hardware. Many of these published "cloud" images, also support
 auto-configuration of networking AND population of user keys.
+
+Issues with autoconfigured TLS
+==============================
+
+These issues will manifest as an error in ``ironic-conductor`` logs looking
+similar to (lines are wrapped for readability)::
+
+    ERROR ironic.drivers.modules.agent_client [-]
+    Failed to connect to the agent running on node d7c322f0-0354-4008-92b4-f49fb2201001
+    for invoking command clean.get_clean_steps. Error:
+    HTTPSConnectionPool(host='192.168.123.126', port=9999): Max retries exceeded with url:
+    /v1/commands/?wait=true&agent_token=<token> (Caused by
+    SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),)):
+    requests.exceptions.SSLError: HTTPSConnectionPool(host='192.168.123.126', port=9999):
+    Max retries exceeded with url: /v1/commands/?wait=true&agent_token=<token>
+    (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:897)'),))
+
+The cause of the issue is that the Bare Metal service cannot access the ramdisk
+with the TLS certificate provided by the ramdisk on first heartbeat. You can
+inspect the stored certificate in ``/var/lib/ironic/certificates/<node>.crt``.
+
+You can try connecting to the ramdisk using the IP address in the log message::
+
+    curl -vL https://<IP address>:9999/v1/commands \
+        --cacert /var/lib/ironic/certificates/<node UUID>.crt
+
+You can get the detailed information about the certificate using openSSL::
+
+    openssl x509 -text -noout -in /var/lib/ironic/certificates/<node UUID>.crt
+
+Clock skew
+----------
+
+One possible source of the problem is a discrepancy between the hardware
+clock on the node and the time on the machine with the Bare Metal service.
+It can be detected by comparing the ``Not Before`` field in the ``openssl``
+output with the timestamp of a log message.
+
+The recommended solution is to enable the NTP support in ironic-python-agent by
+passing the ``ipa-ntp-server`` argument with an address of an NTP server
+reachable by the node.
+
+If it is not possible, you need to ensure the correct hardware time on the
+machine. Keep in mind a potential issue with timezones: an ability to store
+timezone in hardware is pretty recent and may not be available. Since
+ironic-python-agent is likely operating in UTC, the hardware clock should also
+be set in UTC.
+
+.. note::
+    Microsoft Windows uses local time by default, so a machine that has
+    previously run Windows will likely have wrong time.
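The clock skew check described in the patch (comparing the ``Not Before`` field of the ``openssl`` output with the timestamp of a log message) could be automated along the following lines. This is only a minimal sketch and not part of the commit or of ironic itself: the function names ``parse_validity`` and ``classify_timestamp`` are hypothetical, the certificate dates in the usage example are made up, and the parser assumes the usual ``openssl x509 -text`` layout with English month names and times in GMT.

```python
import re
from datetime import datetime, timezone

# Layout that `openssl x509 -text` normally uses for the validity fields,
# e.g. "Dec 10 16:51:05 2020 GMT". %b assumes English month abbreviations.
OPENSSL_TIME_FORMAT = "%b %d %H:%M:%S %Y GMT"


def parse_validity(openssl_text):
    """Return (not_before, not_after) as timezone-aware UTC datetimes.

    Hypothetical helper: expects the text printed by
    `openssl x509 -text -noout -in <certificate>`.
    """
    found = {}
    for name in ("Not Before", "Not After"):
        # openssl prints "Not After :" with a space before the colon.
        match = re.search(rf"{name}\s*:\s*(.+)", openssl_text)
        if match is None:
            raise ValueError(f"{name!r} not found in openssl output")
        parsed = datetime.strptime(match.group(1).strip(), OPENSSL_TIME_FORMAT)
        found[name] = parsed.replace(tzinfo=timezone.utc)
    return found["Not Before"], found["Not After"]


def classify_timestamp(openssl_text, log_time):
    """Compare a conductor log timestamp (aware datetime) with the validity.

    Returns "not_yet_valid", "expired" or "valid".
    """
    not_before, not_after = parse_validity(openssl_text)
    if log_time < not_before:
        # The certificate starts in the conductor's future, which is what
        # a node clock running ahead of the conductor would produce.
        return "not_yet_valid"
    if log_time > not_after:
        return "expired"
    return "valid"


if __name__ == "__main__":
    # Made-up excerpt in the shape of the openssl output.
    sample = """
        Validity
            Not Before: Dec 10 18:51:05 2020 GMT
            Not After : Mar 10 18:51:05 2021 GMT
    """
    log_time = datetime(2020, 12, 10, 16, 51, 5, tzinfo=timezone.utc)
    print(classify_timestamp(sample, log_time))
```

A ``not_yet_valid`` result for the time of the failed request is consistent with the clock skew scenario in the patch. A ``valid`` result suggests the verification failure has some other cause.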