.. _boot-from-volume:

================
Boot From Volume
================

Overview
========
The Bare Metal service supports booting from a Cinder iSCSI volume as of the
Pike release. This guide will primarily deal with this use case, but will be
updated as more paths for booting from a volume, such as FCoE, are introduced.

Booting from a volume is supported in both legacy BIOS and UEFI (via an
iPXE binary for EFI booting) boot modes. It requires suitable images,
which can be created with the diskimage-builder tool.

How this works - From Ironic's point of view
--------------------------------------------

In essence, ironic sets the stage for the process by providing the required
information to the boot interface to facilitate the configuration of
the node or the iPXE boot templates such that the node can be booted.

.. seqdiag::
   :scale: 80

   diagram {
      User; API; Conductor; Storage; Boot; Network; Deploy;
      activation = none;
      span_height = 1;
      edge_length = 250;
      default_note_color = white;
      default_fontsize = 14;

      User -> API [label = "User or intermediate service such as nova supplies volume target configuration."];
      User -> API [label = "Sends deployment request."];
      API -> Conductor [label = "API transmits the action to the conductor service"];
      Conductor -> Storage [label = "Conductor calls the storage_interface to perform attachment of volume to node"];
      Conductor -> Boot [label = "Conductor calls the boot interface signaling preparation of an instance"];
      Conductor -> Network [label = "Conductor attaches the machine to network requested by the user VIF"];
      Conductor -> Deploy [label = "Conductor starts deployment steps which just turn the power on."];
   }

In this example, the boot interface does the heavy lifting. The ``irmc``
and ``ilo`` hardware types, with their hardware-type-specific boot
interfaces, are able to signal via an out-of-band mechanism to the
bare metal node's BMC that the integrated iSCSI initiators are to connect
to the supplied volume target information.

In most hardware, these initiators reside in the network cards of the machine.

In the case of the ``ipxe`` boot interface, templates are created on disk
which point to the iSCSI target information that was either submitted
as part of the volume target or, when integrated with Nova, requested
as the bare metal instance's boot-from-volume disk.

In terms of network access, both interface methods require connectivity
to the iSCSI target. In the vendor-driver-specific path, additional network
configuration options may be available to allow separation of standard
network traffic and instance network traffic. In the iPXE case, this is
not possible, as the OS userspace re-configures the iSCSI connection
after detection inside the OS ramdisk boot.

An iPXE user *may* be able to leverage multiple VIFs, with one set
``pxe_enabled`` to handle the initial instance boot and back-end storage
traffic, while external-facing network traffic occurs on a different
interface. This is a common pattern in physical iSCSI-based deployments.

Prerequisites
=============
Currently booting from a volume requires:

- Bare Metal service version 9.0.0 or later
- Bare Metal API microversion 1.33 or later
- A driver that utilizes the :doc:`PXE boot mechanism </install/configure-pxe>`.
  Currently booting from a volume is supported by the reference drivers that
  utilize PXE boot mechanisms when iPXE is enabled.
- iPXE is an explicit requirement, as it provides the mechanism that attaches
  and initiates booting from an iSCSI volume.
- Metadata services need to be configured and available for the instance images
  to obtain configuration such as keys. Configuration drives are not supported
  due to minimum disk extension sizes.

Conductor Configuration
=======================
In ironic.conf, you can specify a list of enabled storage interfaces. Check
``[DEFAULT]enabled_storage_interfaces`` in your ironic.conf to ensure that
your desired interface is enabled. For example, to enable the ``cinder`` and
``noop`` storage interfaces::

  [DEFAULT]
  enabled_storage_interfaces = cinder,noop

If you want to specify a default storage interface rather than setting the
storage interface on a per node basis, set ``[DEFAULT]default_storage_interface``
in ironic.conf. The ``default_storage_interface`` will be used for any node that
doesn't have a storage interface defined.
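
A minimal pre-flight sketch (the sample configuration values and the helper
function are illustrative, not part of ironic) that checks the configured
default storage interface is actually one of the enabled interfaces:

```python
import configparser

# A sample ironic.conf fragment; the values here are illustrative.
SAMPLE_CONF = """
[DEFAULT]
enabled_storage_interfaces = cinder,noop
default_storage_interface = cinder
"""

def default_interface_is_enabled(conf_text):
    """Check that [DEFAULT]default_storage_interface, if set, appears in
    [DEFAULT]enabled_storage_interfaces."""
    cfg = configparser.ConfigParser()
    cfg.read_string(conf_text)
    enabled = [i.strip()
               for i in cfg.get("DEFAULT", "enabled_storage_interfaces").split(",")]
    default = cfg.get("DEFAULT", "default_storage_interface", fallback=None)
    return default is None or default in enabled

print(default_interface_is_enabled(SAMPLE_CONF))  # True
```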

Node Configuration
==================

Storage Interface
-----------------
You will need to specify what storage interface the node will use to handle
storage operations. For example, to set the storage interface to ``cinder``
on an existing node::

    baremetal node set --storage-interface cinder $NODE_UUID

A default storage interface can be specified in ironic.conf. See the
`Conductor Configuration`_ section for details.

iSCSI Configuration
-------------------
In order for a bare metal node to boot from an iSCSI volume, the ``iscsi_boot``
capability for the node must be set to ``True``. For example, if you want to
update an existing node to boot from volume::

    baremetal node set --property capabilities=iscsi_boot:True $NODE_UUID
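
Note that setting ``--property capabilities=...`` replaces the whole
capabilities string, so any existing capabilities should be merged in first.
A small sketch (the helper is hypothetical, not part of ironic) of merging
``iscsi_boot:True`` into an existing value:

```python
def add_capability(capabilities, key, value):
    """Merge key:value into a comma-separated capabilities string,
    preserving any capabilities that are already present."""
    caps = dict(item.split(":", 1) for item in capabilities.split(",") if item)
    caps[key] = value
    return ",".join(f"{k}:{v}" for k, v in caps.items())

print(add_capability("boot_mode:uefi", "iscsi_boot", "True"))
# boot_mode:uefi,iscsi_boot:True
```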

You will also need to create a volume connector for the node, so the storage
interface will know how to communicate with the node for storage operations. In
the case of iSCSI, you will need to provide an iSCSI Qualified Name (IQN)
that is unique to your SAN. For example, to create a volume connector for iSCSI::

    baremetal volume connector create \
             --node $NODE_UUID --type iqn --connector-id iqn.2017-08.org.openstack.$NODE_UUID
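
The connector ID above simply appends the node UUID to an IQN prefix so that
each node gets a SAN-unique name. A trivial sketch (the base prefix is just
the example value used above, and the function name is hypothetical):

```python
def connector_iqn(node_uuid, base="iqn.2017-08.org.openstack"):
    """Build a per-node connector ID matching the example above."""
    return f"{base}.{node_uuid}"

print(connector_iqn("1be26c0b-03f2-4d2e-ae87-c02d7f33c123"))
```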

Image Creation
==============
We use the ``disk-image-create`` command from the diskimage-builder tool to
create images for the boot-from-volume feature. The required elements for
the corresponding boot modes are as follows:

- Legacy BIOS boot mode: ``iscsi-boot`` element.
- UEFI boot mode: ``iscsi-boot`` and ``block-device-efi`` elements.

For example::

    export IMAGE_NAME=<image_name>
    export DIB_CLOUD_INIT_DATASOURCES="ConfigDrive, OpenStack"
    disk-image-create centos7 vm cloud-init-datasources dhcp-all-interfaces iscsi-boot dracut-regenerate block-device-efi -o $IMAGE_NAME

.. note::
    * For CentOS images, the dependent element ``dracut-regenerate`` must be
      added during image creation. Otherwise, the image creation will fail
      with an error.
    * For Ubuntu images, only the ``iscsi-boot`` element is supported, without
      the ``dracut-regenerate`` element.
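
The rules above can be sketched as a small helper that assembles the element
list per distribution and boot mode (the function and the grouping of base
elements are illustrative, not part of diskimage-builder):

```python
# Elements common to the boot-from-volume example invocation above.
BASE_ELEMENTS = ["vm", "cloud-init-datasources", "dhcp-all-interfaces", "iscsi-boot"]

def dib_elements(distro, uefi=False):
    """Return a disk-image-create element list for the given distro/boot mode."""
    elements = [distro] + list(BASE_ELEMENTS)
    if distro.startswith("centos"):
        # CentOS images require the dracut-regenerate element.
        elements.append("dracut-regenerate")
    if uefi:
        # UEFI boot mode additionally needs the block-device-efi element.
        elements.append("block-device-efi")
    return elements

print(" ".join(dib_elements("centos7", uefi=True)))
```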

Advanced Topics
===============

Use without the Compute Service
-------------------------------

As discussed in other sections, the Bare Metal service has a concept of a
`connector` that represents the interface through which the remote volume
is to be attached.

In addition to the connectors, we have a concept of a `target` that can be
defined via the API. While a user of this feature through the Compute
service would automatically have a new target record created for them, it
is not explicitly required and can be performed manually.

A target record can be created using a command similar to the example below::

    baremetal volume target create \
          --node $NODE_UUID --type iscsi --boot-index 0 --volume-id $VOLUME_UUID

.. Note:: A ``boot-index`` value of ``0`` represents the boot volume for a
          node. As the ``boot-index`` is per-node in sequential order,
          only one boot volume is permitted for each node.
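
The boot-index constraint described above can be sketched as follows (the
data shapes and function are illustrative, not the ironic data model):

```python
def validate_boot_index(targets, new_index):
    """Reject a second boot volume (boot_index 0) for the same node."""
    if new_index == 0 and any(t["boot_index"] == 0 for t in targets):
        raise ValueError("node already has a boot volume (boot_index 0)")

existing = [{"boot_index": 0, "volume": "vol-A"}]
validate_boot_index(existing, 1)   # OK: non-boot volume
try:
    validate_boot_index(existing, 0)
except ValueError as exc:
    print(exc)
```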

Use Without Cinder
------------------

In the Rocky release, an ``external`` storage interface is available that
can be utilized without a Block Storage Service installation.

Under normal circumstances the ``cinder`` storage interface
interacts with the Block Storage Service to orchestrate and manage
attachment and detachment of volumes from the underlying block service
system.

The ``external`` storage interface contains the logic to allow the Bare
Metal service to determine if the Bare Metal node has been requested with
a remote storage volume for booting. This is in contrast to the default
``noop`` storage interface which does not contain logic to determine if
the node should or could boot from a remote volume.

It must be noted that minimal configuration and value validation occurs
with the ``external`` storage interface. The ``cinder`` storage interface
contains more extensive validation that is likely unnecessary in an
``external`` scenario.

Setting the external storage interface::

    baremetal node set --storage-interface external $NODE_UUID

Setting a volume::

    baremetal volume target create --node $NODE_UUID \
        --type iscsi --boot-index 0 --volume-id $VOLUME_UUID \
        --property target_iqn="iqn.2010-10.com.example:vol-X" \
        --property target_lun="0" \
        --property target_portal="192.168.0.123:3260" \
        --property auth_method="CHAP" \
        --property auth_username="ABC" \
        --property auth_password="XYZ"
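
Because the ``external`` interface performs little validation, it can help to
sanity-check the target properties before creating the record. A minimal
sketch (the required-key list is an assumption based on the example above):

```python
REQUIRED = ("target_iqn", "target_lun", "target_portal")

def check_target_properties(props):
    """Return the required iSCSI target keys missing from props."""
    missing = [k for k in REQUIRED if k not in props]
    # CHAP credentials must come as a complete pair when auth is requested.
    if props.get("auth_method") == "CHAP":
        missing += [k for k in ("auth_username", "auth_password") if k not in props]
    return missing

props = {
    "target_iqn": "iqn.2010-10.com.example:vol-X",
    "target_lun": "0",
    "target_portal": "192.168.0.123:3260",
    "auth_method": "CHAP",
    "auth_username": "ABC",
    "auth_password": "XYZ",
}
print(check_target_properties(props))  # []
```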

Ensure that no ``image_source`` is defined::

    baremetal node unset \
        --instance-info image_source $NODE_UUID

Deploy the node::

    baremetal node deploy $NODE_UUID

Upon deploy, the boot interface for the baremetal node will attempt
to either create iPXE configuration OR set boot parameters out-of-band via
the management controller. Such action is boot interface specific and may not
support all forms of volume target configuration. As of the Rocky release,
the Bare Metal service does not support writing an operating system image
to a remote boot-from-volume target, so the user must ensure the volume
already contains a bootable image in advance.

Records of volume targets are removed upon the node being undeployed,
and as such are not persistent across deployments.

Cinder Multi-attach
-------------------

Volume multi-attach is a function that is commonly performed in computing
clusters where dedicated storage subsystems are utilized. For some time now,
the Block Storage service has supported the concept of multi-attach.
However, the Compute service, as of the Pike release, does not yet have
support to leverage multi-attach. Additionally, multi-attach requires the
backend volume driver running as part of the Block Storage service to
support multi-attach volumes.

When support for storage interfaces was added to the Bare Metal service,
specifically for the ``cinder`` storage interface, the concept of volume
multi-attach was accounted for; however, it has not been fully tested,
and is unlikely to be fully tested until there is Compute service integration
as well as volume driver support.

The data model for storage of volume targets in the Bare Metal service
places no constraints on the same target volume being utilized by multiple
nodes. When interacting with the Block Storage service, the Bare Metal
service will prevent the use of volumes that are reported as ``in-use``
if they do not explicitly support multi-attach.