summaryrefslogtreecommitdiff
path: root/doc
diff options
context:
space:
mode:
authorAlasdair G Kergon <agk@redhat.com>2012-06-21 23:48:40 +0100
committerAlasdair G Kergon <agk@redhat.com>2012-06-21 23:48:40 +0100
commitda42ee3a1f0280742e87a96f5d0694fb869a23e0 (patch)
tree1d895c40e47c308f774ef510fa94ac934445314d /doc
parent461eb1ac6adbf8e1938bd6e726dd4048c341cb49 (diff)
downloadlvm2-da42ee3a1f0280742e87a96f5d0694fb869a23e0.tar.gz
kernel docs: Refresh kernel target documentation
Update the packaged copy of the in-kernel target documentation files. Adds dm-verity, updates thin provisioning and makes minor corrections elsewhere.
Diffstat (limited to 'doc')
-rw-r--r--doc/kernel/persistent-data.txt2
-rw-r--r--doc/kernel/raid.txt2
-rw-r--r--doc/kernel/striped.txt7
-rw-r--r--doc/kernel/thin-provisioning.txt78
-rw-r--r--doc/kernel/verity.txt155
5 files changed, 222 insertions, 22 deletions
diff --git a/doc/kernel/persistent-data.txt b/doc/kernel/persistent-data.txt
index 0e5df9b04..a333bcb3a 100644
--- a/doc/kernel/persistent-data.txt
+++ b/doc/kernel/persistent-data.txt
@@ -3,7 +3,7 @@ Introduction
The more-sophisticated device-mapper targets require complex metadata
that is managed in kernel. In late 2010 we were seeing that various
-different targets were rolling their own data strutures, for example:
+different targets were rolling their own data structures, for example:
- Mikulas Patocka's multisnap implementation
- Heinz Mauelshagen's thin provisioning target
diff --git a/doc/kernel/raid.txt b/doc/kernel/raid.txt
index 2a8c11331..946c73342 100644
--- a/doc/kernel/raid.txt
+++ b/doc/kernel/raid.txt
@@ -28,7 +28,7 @@ The target is named "raid" and it accepts the following parameters:
raid6_nc RAID6 N continue
- rotating parity N (right-to-left) with data continuation
- Refererence: Chapter 4 of
+ Reference: Chapter 4 of
http://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf
<#raid_params>: The number of parameters that follow.
diff --git a/doc/kernel/striped.txt b/doc/kernel/striped.txt
index f34d3236b..45f3b91ea 100644
--- a/doc/kernel/striped.txt
+++ b/doc/kernel/striped.txt
@@ -9,15 +9,14 @@ devices in parallel.
Parameters: <num devs> <chunk size> [<dev path> <offset>]+
<num devs>: Number of underlying devices.
- <chunk size>: Size of each chunk of data. Must be a power-of-2 and at
- least as large as the system's PAGE_SIZE.
+ <chunk size>: Size of each chunk of data. Must be at least as
+ large as the system's PAGE_SIZE.
<dev path>: Full pathname to the underlying block-device, or a
"major:minor" device-number.
<offset>: Starting sector within the device.
One or more underlying devices can be specified. The striped device size must
-be a multiple of the chunk size and a multiple of the number of underlying
-devices.
+be a multiple of the chunk size multiplied by the number of underlying devices.
Example scripts
diff --git a/doc/kernel/thin-provisioning.txt b/doc/kernel/thin-provisioning.txt
index 801d9d1cf..f5cfc62b7 100644
--- a/doc/kernel/thin-provisioning.txt
+++ b/doc/kernel/thin-provisioning.txt
@@ -1,7 +1,7 @@
Introduction
============
-This document descibes a collection of device-mapper targets that
+This document describes a collection of device-mapper targets that
between them implement thin-provisioning and snapshots.
The main highlight of this implementation, compared to the previous
@@ -75,10 +75,12 @@ less sharing than average you'll need a larger-than-average metadata device.
As a guide, we suggest you calculate the number of bytes to use in the
metadata device as 48 * $data_dev_size / $data_block_size but round it up
-to 2MB if the answer is smaller. The largest size supported is 16GB.
+to 2MB if the answer is smaller. If you're creating large numbers of
+snapshots which are recording large amounts of change, you may find you
+need to increase this.
-If you're creating large numbers of snapshots which are recording large
-amounts of change, you may need find you need to increase this.
+The largest size supported is 16GB: If the device is larger,
+a warning will be issued and the excess space will not be used.
Reloading a pool table
----------------------
@@ -167,6 +169,38 @@ ii) Using an internal snapshot.
dmsetup create snap --table "0 2097152 thin /dev/mapper/pool 1"
+External snapshots
+------------------
+
+You can use an external _read only_ device as an origin for a
+thinly-provisioned volume. Any read to an unprovisioned area of the
+thin device will be passed through to the origin. Writes trigger
+the allocation of new blocks as usual.
+
+One use case for this is VM hosts that want to run guests on
+thinly-provisioned volumes but have the base image on another device
+(possibly shared between many VMs).
+
+You must not write to the origin device if you use this technique!
+Of course, you may write to the thin device and take internal snapshots
+of the thin volume.
+
+i) Creating a snapshot of an external device
+
+ This is the same as creating a thin device.
+ You don't mention the origin at this stage.
+
+ dmsetup message /dev/mapper/pool 0 "create_thin 0"
+
+ii) Using a snapshot of an external device.
+
+ Append an extra parameter to the thin target specifying the origin:
+
+ dmsetup create snap --table "0 2097152 thin /dev/mapper/pool 0 /dev/image"
+
+ N.B. All descendants (internal snapshots) of this snapshot require the
+ same extra origin parameter.
+
Deactivation
------------
@@ -189,7 +223,13 @@ i) Constructor
<low water mark (blocks)> [<number of feature args> [<arg>]*]
Optional feature arguments:
- - 'skip_block_zeroing': skips the zeroing of newly-provisioned blocks.
+
+ skip_block_zeroing: Skip the zeroing of newly-provisioned blocks.
+
+ ignore_discard: Disable discard support.
+
+ no_discard_passdown: Don't pass discards down to the underlying
+ data device, but just remove the mapping.
Data block size must be between 64KB (128 sectors) and 1GB
(2097152 sectors) inclusive.
@@ -237,16 +277,6 @@ iii) Messages
Deletes a thin device. Irreversible.
- trim <dev id> <new size in sectors>
-
- Delete mappings from the end of a thin device. Irreversible.
- You might want to use this if you're reducing the size of
- your thinly-provisioned device. In many cases, due to the
- sharing of blocks between devices, it is not possible to
- determine in advance how much space 'trim' will release. (In
- future a userspace tool might be able to perform this
- calculation.)
-
set_transaction_id <current id> <new id>
Userland volume managers, such as LVM, need a way to
@@ -257,12 +287,23 @@ iii) Messages
the current transaction id is when you change it with this
compare-and-swap message.
+ reserve_metadata_snap
+
+ Reserve a copy of the data mapping btree for use by userland.
+ This allows userland to inspect the mappings as they were when
+ this message was executed. Use the pool's status command to
+ get the root block associated with the metadata snapshot.
+
+ release_metadata_snap
+
+ Release a previously reserved copy of the data mapping btree.
+
'thin' target
-------------
i) Constructor
- thin <pool dev> <dev id>
+ thin <pool dev> <dev id> [<external origin dev>]
pool dev:
the thin-pool device, e.g. /dev/mapper/my_pool or 253:0
@@ -271,6 +312,11 @@ i) Constructor
the internal device identifier of the device to be
activated.
+ external origin dev:
+ an optional block device outside the pool to be treated as a
+ read-only snapshot origin: reads to unprovisioned areas of the
+ thin target will be mapped to this device.
+
The pool doesn't store any size against the thin devices. If you
load a thin target that is smaller than you've been using previously,
then you'll have no access to blocks mapped beyond the end. If you
diff --git a/doc/kernel/verity.txt b/doc/kernel/verity.txt
new file mode 100644
index 000000000..988468153
--- /dev/null
+++ b/doc/kernel/verity.txt
@@ -0,0 +1,155 @@
+dm-verity
+==========
+
+Device-Mapper's "verity" target provides transparent integrity checking of
+block devices using a cryptographic digest provided by the kernel crypto API.
+This target is read-only.
+
+Construction Parameters
+=======================
+ <version> <dev> <hash_dev>
+ <data_block_size> <hash_block_size>
+ <num_data_blocks> <hash_start_block>
+ <algorithm> <digest> <salt>
+
+<version>
+ This is the type of the on-disk hash format.
+
+ 0 is the original format used in the Chromium OS.
+ The salt is appended when hashing, digests are stored continuously and
+ the rest of the block is padded with zeros.
+
+ 1 is the current format that should be used for new devices.
+ The salt is prepended when hashing and each digest is
+ padded with zeros to the power of two.
+
+<dev>
+ This is the device containing data, the integrity of which needs to be
+ checked. It may be specified as a path, like /dev/sdaX, or a device number,
+ <major>:<minor>.
+
+<hash_dev>
+ This is the device that supplies the hash tree data. It may be
+ specified similarly to the device path and may be the same device. If the
+ same device is used, the hash_start should be outside the configured
+ dm-verity device.
+
+<data_block_size>
+ The block size on a data device in bytes.
+ Each block corresponds to one digest on the hash device.
+
+<hash_block_size>
+ The size of a hash block in bytes.
+
+<num_data_blocks>
+ The number of data blocks on the data device. Additional blocks are
+ inaccessible. You can place hashes to the same partition as data, in this
+ case hashes are placed after <num_data_blocks>.
+
+<hash_start_block>
+ This is the offset, in <hash_block_size>-blocks, from the start of hash_dev
+ to the root block of the hash tree.
+
+<algorithm>
+ The cryptographic hash algorithm used for this device. This should
+ be the name of the algorithm, like "sha1".
+
+<digest>
+ The hexadecimal encoding of the cryptographic hash of the root hash block
+ and the salt. This hash should be trusted as there is no other authenticity
+ beyond this point.
+
+<salt>
+ The hexadecimal encoding of the salt value.
+
+Theory of operation
+===================
+
+dm-verity is meant to be set up as part of a verified boot path. This
+may be anything ranging from a boot using tboot or trustedgrub to just
+booting from a known-good device (like a USB drive or CD).
+
+When a dm-verity device is configured, it is expected that the caller
+has been authenticated in some way (cryptographic signatures, etc).
+After instantiation, all hashes will be verified on-demand during
+disk access. If they cannot be verified up to the root node of the
+tree, the root hash, then the I/O will fail. This should detect
+tampering with any data on the device and the hash data.
+
+Cryptographic hashes are used to assert the integrity of the device on a
+per-block basis. This allows for a lightweight hash computation on first read
+into the page cache. Block hashes are stored linearly, aligned to the nearest
+block size.
+
+Hash Tree
+---------
+
+Each node in the tree is a cryptographic hash. If it is a leaf node, the hash
+of some data block on disk is calculated. If it is an intermediary node,
+the hash of a number of child nodes is calculated.
+
+Each entry in the tree is a collection of neighboring nodes that fit in one
+block. The number is determined based on block_size and the size of the
+selected cryptographic digest algorithm. The hashes are linearly-ordered in
+this entry and any unaligned trailing space is ignored but included when
+calculating the parent node.
+
+The tree looks something like:
+
+alg = sha256, num_blocks = 32768, block_size = 4096
+
+ [ root ]
+ / . . . \
+ [entry_0] [entry_1]
+ / . . . \ . . . \
+ [entry_0_0] . . . [entry_0_127] . . . . [entry_1_127]
+ / ... \ / . . . \ / \
+ blk_0 ... blk_127 blk_16256 blk_16383 blk_32640 . . . blk_32767
+
+
+On-disk format
+==============
+
+The verity kernel code does not read the verity metadata on-disk header.
+It only reads the hash blocks which directly follow the header.
+It is expected that a user-space tool will verify the integrity of the
+verity header.
+
+Alternatively, the header can be omitted and the dmsetup parameters can
+be passed via the kernel command-line in a rooted chain of trust where
+the command-line is verified.
+
+Directly following the header (and with sector number padded to the next hash
+block boundary) are the hash blocks which are stored a depth at a time
+(starting from the root), sorted in order of increasing index.
+
+The full specification of kernel parameters and on-disk metadata format
+is available at the cryptsetup project's wiki page
+ http://code.google.com/p/cryptsetup/wiki/DMVerity
+
+Status
+======
+V (for Valid) is returned if every check performed so far was valid.
+If any check failed, C (for Corruption) is returned.
+
+Example
+=======
+Set up a device:
+ # dmsetup create vroot --readonly --table \
+ "0 2097152 verity 1 /dev/sda1 /dev/sda2 4096 4096 262144 1 sha256 "\
+ "4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 "\
+ "1234000000000000000000000000000000000000000000000000000000000000"
+
+A command line tool veritysetup is available to compute or verify
+the hash tree or activate the kernel device. This is available from
+the cryptsetup upstream repository http://code.google.com/p/cryptsetup/
+(as a libcryptsetup extension).
+
+Create hash on the device:
+ # veritysetup format /dev/sda1 /dev/sda2
+ ...
+ Root hash: 4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076
+
+Activate the device:
+ # veritysetup create vroot /dev/sda1 /dev/sda2 \
+ 4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076