.. _raid:

==================
RAID Configuration
==================

Overview
========
Ironic supports RAID configuration for bare metal nodes.  It allows operators
to specify the desired RAID configuration via the OpenStackClient CLI or REST
API.  The desired RAID configuration is applied to the bare metal node during
manual cleaning.

The examples described here use the OpenStackClient CLI; please see the
`REST API reference <https://docs.openstack.org/api-ref/baremetal/>`_
for their corresponding REST API requests.

Prerequisites
=============
The bare metal node needs to use a hardware type that supports RAID
configuration. RAID interfaces may implement RAID configuration either in-band
or out-of-band. Software RAID is supported on all hardware, although with some
caveats - see `Software RAID`_ for details.

In-band RAID configuration (including software RAID) is done using the
Ironic Python Agent ramdisk. For in-band hardware RAID configuration,
a hardware manager which supports RAID should be bundled with the ramdisk.

Whether a node supports RAID configuration can be checked with the CLI
command ``baremetal node validate <node>``. In-band RAID is
usually implemented by the ``agent`` RAID interface.
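
For example (the node name is illustrative)::

    baremetal node validate node-1

The ``raid`` row in the output indicates whether the node's current RAID
interface passes validation.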

Build agent ramdisk which supports RAID configuration
=====================================================

To perform in-band hardware RAID configuration, Ironic needs an agent ramdisk
bundled with a hardware manager that supports RAID configuration for your
hardware. For example, the :ref:`DIB_raid_support` should be used for HPE
ProLiant servers.
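
As an illustration, such a ramdisk may be built with diskimage-builder; the
elements shown here (``ironic-python-agent-ramdisk`` plus the HPE-specific
``proliant-tools`` and a base distribution element) are assumptions to adapt
to your hardware::

    disk-image-create -o proliant-agent-ramdisk \
        ironic-python-agent-ramdisk proliant-tools fedora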

.. note::
    For in-band software RAID, the agent ramdisk does not need to be bundled
    with a hardware manager as the generic hardware manager in the Ironic
    Python Agent already provides (basic) support for software RAID.

RAID configuration JSON format
==============================
The desired RAID configuration and current RAID configuration are represented
in JSON format.

Target RAID configuration
-------------------------
This is the desired RAID configuration on the bare metal node.  Using the
OpenStackClient CLI (or REST API), the operator sets the
``target_raid_config`` field of the node. The target RAID configuration will
be applied during manual cleaning.

The target RAID configuration is a dictionary with ``logical_disks``
as the key. The value of ``logical_disks`` is a list of JSON
dictionaries. It looks like::

  {
    "logical_disks": [
      {<desired properties of logical disk 1>},
      {<desired properties of logical disk 2>},
      ...
    ]
  }

If ``target_raid_config`` is set to an empty dictionary, it unsets the value
left over from any previous RAID configuration done on the node.
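
For example, to clear a previously set target RAID configuration by passing
an empty dictionary inline::

    baremetal node set <node-uuid-or-name> --target-raid-config '{}'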

Each logical disk dictionary contains the desired properties of that logical
disk, as supported by the hardware type. These properties can be discovered
with::

    baremetal driver raid property list <driver name>
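
For example, for a node using the ``idrac`` hardware type (illustrative;
substitute the driver of your node)::

    baremetal driver raid property list idrac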

Mandatory properties
^^^^^^^^^^^^^^^^^^^^

These properties must be specified for each logical
disk and have no default values:

- ``size_gb`` - Size (integer) of the logical disk to be created, in GiB.
  ``MAX`` may be specified if the logical disk should use all of the
  remaining space available. This can be used only when backing physical
  disks are specified (see below).

- ``raid_level`` - RAID level for the logical disk. Ironic supports the
  following RAID levels: 0, 1, 2, 5, 6, 1+0, 5+0, 6+0.

Optional properties
^^^^^^^^^^^^^^^^^^^

These properties have default values and they may be overridden in the
specification of any logical disk. None of these options are supported for
software RAID.

- ``volume_name`` - Name of the volume. Should be unique within the node.
  If not specified, the volume name will be auto-generated.

- ``is_root_volume`` - Set to ``true`` if this is the root volume. At
  most one logical disk can have this set to ``true``; the other
  logical disks must have this set to ``false``. The
  ``root device hint`` will be saved, if the RAID interface is capable of
  retrieving it. This is ``false`` by default.

Backing physical disk hints
^^^^^^^^^^^^^^^^^^^^^^^^^^^

These hints are specified for each logical disk to let Ironic find the desired
disks for RAID configuration. This is machine-independent information, which
serves the use case where the operator doesn't want to provide individual
details for each bare metal node. None of these options are supported for
software RAID.

- ``share_physical_disks`` - Set to ``true`` if this logical disk can
  share physical disks with other logical disks. The default value is
  ``false``, except for software RAID which always shares disks.

- ``disk_type`` - ``hdd`` or ``ssd``. If this is not specified, disk type
  will not be a criterion to find backing physical disks.

- ``interface_type`` - ``sata`` or ``scsi`` or ``sas``. If this is not
  specified, interface type will not be a criterion to
  find backing physical disks.

- ``number_of_physical_disks`` - Integer, number of disks to use for the
  logical disk. Defaults to the minimum number of disks required for the
  particular RAID level, except for software RAID which always spans all disks.

Backing physical disks
^^^^^^^^^^^^^^^^^^^^^^

This is the actual machine-dependent information, suitable for environments
where the operator wants to automate the selection of physical disks with a
3rd-party tool based on a wider range of attributes (e.g. S.M.A.R.T. status,
physical location).  The values for these properties are hardware dependent.

- ``controller`` - The name of the controller as read by the RAID interface.
  In order to trigger the setup of a Software RAID via the Ironic Python
  Agent, the value of this property needs to be set to ``software``.
- ``physical_disks`` - A list of physical disks to use as read by the
  RAID interface.

  For software RAID ``physical_disks`` is a list of device hints in the same
  format as used for :ref:`root-device-hints`. The number of provided hints
  must match the expected number of backing devices (repeat the same hint if
  necessary).

.. note::
    If properties from both "Backing physical disk hints" and
    "Backing physical disks" are specified, they should be consistent with
    each other.  If they are not consistent, then the RAID configuration
    will fail (because the appropriate backing physical disks could
    not be found).

.. _raid-config-examples:

Examples for ``target_raid_config``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*Example 1*. Single RAID disk of RAID level 5 with all of the space
available. Make this the root volume to which Ironic deploys the image:

.. code-block:: json

  {
    "logical_disks": [
      {
        "size_gb": "MAX",
        "raid_level": "5",
        "is_root_volume": true
      }
    ]
  }

*Example 2*. Two RAID disks: a 100 GiB RAID-5 root volume on SSDs and a
500 GiB RAID-1 volume on HDDs:

.. code-block:: json

  {
    "logical_disks": [
      {
        "size_gb": 100,
        "raid_level": "5",
        "is_root_volume": true,
        "disk_type": "ssd"
      },
      {
        "size_gb": 500,
        "raid_level": "1",
        "disk_type": "hdd"
      }
    ]
  }

*Example 3*. Single RAID disk, where the operator knows which physical disks
and controller to use:

.. code-block:: json

  {
    "logical_disks": [
      {
        "size_gb": 100,
        "raid_level": "5",
        "controller": "Smart Array P822 in Slot 3",
        "physical_disks": ["6I:1:5", "6I:1:6", "6I:1:7"],
        "is_root_volume": true
      }
    ]
  }

*Example 4*. Using backing physical disks:

.. code-block:: json

  {
    "logical_disks": [
      {
        "size_gb": 50,
        "raid_level": "1+0",
        "controller": "RAID.Integrated.1-1",
        "volume_name": "root_volume",
        "is_root_volume": true,
        "physical_disks": [
          "Disk.Bay.0:Encl.Int.0-1:RAID.Integrated.1-1",
          "Disk.Bay.1:Encl.Int.0-1:RAID.Integrated.1-1"
        ]
      },
      {
        "size_gb": 100,
        "raid_level": "5",
        "controller": "RAID.Integrated.1-1",
        "volume_name": "data_volume",
        "physical_disks": [
          "Disk.Bay.2:Encl.Int.0-1:RAID.Integrated.1-1",
          "Disk.Bay.3:Encl.Int.0-1:RAID.Integrated.1-1",
          "Disk.Bay.4:Encl.Int.0-1:RAID.Integrated.1-1"
        ]
      }
    ]
  }

*Example 5*. Software RAID with two RAID devices:

.. code-block:: json

  {
    "logical_disks": [
      {
        "size_gb": 100,
        "raid_level": "1",
        "controller": "software"
      },
      {
        "size_gb": "MAX",
        "raid_level": "0",
        "controller": "software"
      }
    ]
  }

*Example 6*. Software RAID, limiting the backing block devices to exactly two
devices, each with a size exceeding 100 GiB:

.. code-block:: json

  {
    "logical_disks": [
      {
        "size_gb": "MAX",
        "raid_level": "0",
        "controller": "software",
        "physical_disks": [
          {"size": "> 100"},
          {"size": "> 100"}
        ]
      }
    ]
  }

Current RAID configuration
--------------------------
After the target RAID configuration is applied on the bare metal node, Ironic
populates the current RAID configuration in the ``raid_config`` field of the
node. This field contains the details of every logical disk after it was
created on the bare metal node, such as the RAID controller used, the backing
physical disks, and the WWN of each logical disk. It also contains information
about each physical disk found on the bare metal node.

To get the current RAID configuration::

    baremetal node show <node-uuid-or-name>
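
The ``--fields`` option can be used to limit the output to the RAID-related
fields::

    baremetal node show <node-uuid-or-name> --fields raid_config target_raid_config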

Workflow
========

* Operator configures the bare metal node with a hardware type that has
  a ``RAIDInterface`` other than ``no-raid``. For instance, for Software RAID,
  this would be ``agent``.

* For in-band RAID configuration, operator builds an agent ramdisk which
  supports RAID configuration by bundling the hardware manager with the
  ramdisk. See `Build agent ramdisk which supports RAID configuration`_ for
  more information.

* Operator prepares the desired target RAID configuration as mentioned in
  `Target RAID configuration`_. The target RAID configuration is set on
  the Ironic node::

      baremetal node set <node-uuid-or-name> \
         --target-raid-config <JSON file containing target RAID configuration>

  The CLI command can also accept the input from standard input::

       baremetal node set <node-uuid-or-name> \
          --target-raid-config -
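
  For example, a short sketch passing a software RAID configuration via a
  heredoc (the node name and disk layout are illustrative)::

       baremetal node set node-1 --target-raid-config - <<EOF
       {"logical_disks": [{"size_gb": "MAX", "raid_level": "1",
                           "controller": "software"}]}
       EOF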

* Create a JSON file with the RAID clean steps for manual cleaning. Add other
  clean steps as desired::

    [{
      "interface": "raid",
      "step": "delete_configuration"
    },
    {
      "interface": "raid",
      "step": "create_configuration"
    }]

  .. note::
    ``create_configuration`` doesn't remove existing disks.  It is recommended
    to add ``delete_configuration`` before ``create_configuration`` to make
    sure that only the desired logical disks exist in the system after
    manual cleaning.

* Bring the node to the ``manageable`` state and perform a ``clean`` action to start
  cleaning on the node::

      baremetal node clean <node-uuid-or-name> \
         --clean-steps <JSON file containing clean steps created above>

* After manual cleaning is complete, the current RAID configuration is
  reported in the ``raid_config`` field when running::

      baremetal node show <node-uuid-or-name>

Software RAID
=============

Building Linux software RAID in-band (via the Ironic Python Agent ramdisk)
is supported starting with the Train release. It is requested by using the
``agent`` RAID interface and a RAID configuration with all controllers set
to ``software``. You can find a software RAID configuration example in
:ref:`raid-config-examples`.

There are certain limitations to be aware of:

* Only the mandatory properties (plus the required ``controller`` property)
  from `Target RAID configuration`_ are currently supported.

* The number of created Software RAID devices must be 1 or 2. If there is only
  one Software RAID device, it has to be a RAID-1. If there are two, the first
  one has to be a RAID-1, while the RAID level for the second one can be
  0, 1, 1+0, 5, or 6. As the first RAID device will be the deployment device,
  enforcing a RAID-1 reduces the risk of ending up with a non-booting node
  in case of a disk failure.

* Building RAID will fail if the target disks are already partitioned. Wipe the
  disks using e.g. the ``erase_devices_metadata`` clean step before building
  RAID::

    [{
      "interface": "raid",
      "step": "delete_configuration"
    },
    {
      "interface": "deploy",
      "step": "erase_devices_metadata"
    },
    {
      "interface": "raid",
      "step": "create_configuration"
    }]

* If local boot is going to be used, the final instance image must have the
  ``mdadm`` utility installed and needs to be able to detect software RAID
  devices at boot time (which is usually done by having the RAID drivers
  embedded in the image's initrd); see the example after this list.

* Regular cleaning will not remove RAID configuration (similarly to hardware
  RAID). To destroy RAID, run the ``delete_configuration`` manual clean step.

* There is no support for partition images, only whole-disk images are
  supported with Software RAID. See :doc:`/install/configure-glance-images`.

* In UEFI mode, the Ironic Python Agent creates EFI system partitions (ESPs)
  for the bootloader and the boot configuration (grub.cfg or grubenv) on all
  holder devices. The content of these partitions is populated upon deployment
  from the deployed user image. Depending on how the partitions are mounted,
  the content of the partitions may get out of sync, e.g. when new kernels
  are installed or the bootloader is updated, so measures to keep these
  partitions in sync need to be taken.
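
For instance, ``mdadm`` can be added to an existing whole-disk image with a
tool such as ``virt-customize`` (one possible approach)::

    virt-customize -a instance-image.qcow2 --install mdadm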

Image requirements
------------------

Since Ironic needs to perform additional steps when deploying nodes
with software RAID, there are some requirements the deployed images need
to fulfill. Up to and including the Train release, the image needs to
have its root file system on the first partition. Starting with Ussuri,
the image can also have additional metadata to point Ironic to the
partition with the root file system: for this, the image needs to set
the ``rootfs_uuid`` property with the file system UUID of the root file
system. One way to extract this UUID from an existing image is to
download the image, mount it as a loopback device, and use ``blkid``:

.. code-block:: bash

    # find the first unused loop device (assumed to be /dev/loop0 below)
    $ sudo losetup -f
    # attach the image to the loop device
    $ sudo losetup /dev/loop0 /tmp/myimage.raw
    # create device mappings for the partitions inside the image
    $ sudo kpartx -a /dev/loop0
    # list block devices with their file system UUIDs
    $ blkid
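
Once the UUID is known, it can be attached to the image metadata; for images
managed by Glance, for example::

    openstack image set --property rootfs_uuid=<uuid> <image>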

The pre-Ussuri approach, i.e. to have the root file system on
the first partition, is kept as a fallback and hence allows software
RAID deployments where Ironic does not have access to any image metadata
(e.g. Ironic stand-alone).

Using RAID in nova flavor for scheduling
========================================

The operator can specify the ``raid_level`` capability in a nova flavor so that
nodes with the matching RAID level are selected for scheduling::

  openstack flavor set my-baremetal-flavor --property capabilities:raid_level="1+0"
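
For the flavor to match, the node needs to expose the corresponding capability.
The RAID interface may record the root volume's RAID level automatically after
configuration, but it can also be set explicitly (a sketch)::

    baremetal node set <node-uuid-or-name> \
       --property capabilities='raid_level:1+0'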

Developer documentation
=======================
In-band RAID configuration is done using the IPA ramdisk. The IPA ramdisk
supports pluggable hardware managers, which can be used to extend its
functionality via stevedore plugins.  For more
information, see the Ironic Python Agent
:ironic-python-agent-doc:`Hardware Manager <install/index.html#hardware-managers>`
documentation.

The hardware manager that supports RAID configuration should do the following:

#. Implement a method named ``create_configuration``. This method creates
   the RAID configuration as given in ``target_raid_config``. After successful
   RAID configuration, it returns the current RAID configuration information
   which ironic uses to set ``node.raid_config``.

#. Implement a method named ``delete_configuration``. This method deletes
   all the RAID disks on the bare metal.

#. Return these two clean steps from the ``get_clean_steps`` method with
   priority 0. Example::

        return [{'step': 'create_configuration',
                 'interface': 'raid',
                 'priority': 0},
                {'step': 'delete_configuration',
                 'interface': 'raid',
                 'priority': 0}]
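
A minimal sketch of such a hardware manager (module, class and version names
are illustrative; the method signatures follow the hardware manager interface
of the Ironic Python Agent):

.. code-block:: python

    from ironic_python_agent import hardware


    class ExampleRAIDHardwareManager(hardware.HardwareManager):
        """Illustrative hardware manager providing RAID clean steps."""

        HARDWARE_MANAGER_NAME = 'ExampleRAIDHardwareManager'
        HARDWARE_MANAGER_VERSION = '1.0'

        def evaluate_hardware_support(self):
            # A real manager would detect its specific hardware here.
            return hardware.HardwareSupport.MAINLINE

        def create_configuration(self, node, ports):
            target = node.get('target_raid_config', {})
            # Create the logical disks described by 'target' with vendor
            # tooling here, then return the resulting current configuration,
            # which ironic uses to set node.raid_config.
            return {'logical_disks': []}

        def delete_configuration(self, node, ports):
            # Remove all logical disks from the machine with vendor tooling.
            return {}

        def get_clean_steps(self, node, ports):
            return [{'step': 'create_configuration',
                     'interface': 'raid',
                     'priority': 0},
                    {'step': 'delete_configuration',
                     'interface': 'raid',
                     'priority': 0}]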