summaryrefslogtreecommitdiff
path: root/doc/kernel/thin-provisioning.txt
diff options
context:
space:
mode:
Diffstat (limited to 'doc/kernel/thin-provisioning.txt')
-rw-r--r--doc/kernel/thin-provisioning.txt62
1 files changed, 49 insertions, 13 deletions
diff --git a/doc/kernel/thin-provisioning.txt b/doc/kernel/thin-provisioning.txt
index 30b8b83bd..4f67578b2 100644
--- a/doc/kernel/thin-provisioning.txt
+++ b/doc/kernel/thin-provisioning.txt
@@ -99,13 +99,14 @@ Using an existing pool device
$data_block_size $low_water_mark"
$data_block_size gives the smallest unit of disk space that can be
-allocated at a time expressed in units of 512-byte sectors. People
-primarily interested in thin provisioning may want to use a value such
-as 1024 (512KB). People doing lots of snapshotting may want a smaller value
-such as 128 (64KB). If you are not zeroing newly-allocated data,
-a larger $data_block_size in the region of 256000 (128MB) is suggested.
-$data_block_size must be the same for the lifetime of the
-metadata device.
+allocated at a time expressed in units of 512-byte sectors.
+$data_block_size must be between 128 (64KB) and 2097152 (1GB) and a
+multiple of 128 (64KB). $data_block_size cannot be changed after the
+thin-pool is created. People primarily interested in thin provisioning
+may want to use a value such as 1024 (512KB). People doing lots of
+snapshotting may want a smaller value such as 128 (64KB). If you are
+not zeroing newly-allocated data, a larger $data_block_size in the
+region of 256000 (128MB) is suggested.
$low_water_mark is expressed in blocks of size $data_block_size. If
free space on the data device drops below this level then a dm event
@@ -115,6 +116,35 @@ Resuming a device with a new table itself triggers an event so the
userspace daemon can use this to detect a situation where a new table
already exceeds the threshold.
+A low water mark for the metadata device is maintained in the kernel and
+will trigger a dm event if free space on the metadata device drops below
+it.
+
+Updating on-disk metadata
+-------------------------
+
+On-disk metadata is committed every time a FLUSH or FUA bio is written.
+If no such requests are made then commits will occur every second. This
+means the thin-provisioning target behaves like a physical disk that has
+a volatile write cache. If power is lost you may lose some recent
+writes. The metadata should always be consistent in spite of any crash.
+
+If data space is exhausted the pool will either error or queue IO
+according to the configuration (see: error_if_no_space). If metadata
+space is exhausted or a metadata operation fails: the pool will error IO
+until the pool is taken offline and repair is performed to 1) fix any
+potential inconsistencies and 2) clear the flag that imposes repair.
+Once the pool's metadata device is repaired it may be resized, which
+will allow the pool to return to normal operation. Note that if a pool
+is flagged as needing repair, the pool's data and metadata devices
+cannot be resized until repair is performed. It should also be noted
+that when the pool's metadata space is exhausted the current metadata
+transaction is aborted. Given that the pool will cache IO whose
+completion may have already been acknowledged to upper IO layers
+(e.g. filesystem) it is strongly suggested that consistency checks
+(e.g. fsck) be performed on those layers when repair of the pool is
+required.
+
Thin provisioning
-----------------
@@ -234,6 +264,8 @@ i) Constructor
read_only: Don't allow any changes to be made to the pool
metadata.
+ error_if_no_space: Error IOs, instead of queueing, if no space.
+
Data block size must be between 64KB (128 sectors) and 1GB
(2097152 sectors) inclusive.
@@ -255,10 +287,9 @@ ii) Status
should register for the event and then check the target's status.
held metadata root:
- The location, in sectors, of the metadata root that has been
+ The location, in blocks, of the metadata root that has been
'held' for userspace read access. '-' indicates there is no
- held root. This feature is not yet implemented so '-' is
- always returned.
+ held root.
discard_passdown|no_discard_passdown
Whether or not discards are actually being passed down to the
@@ -275,6 +306,14 @@ ii) Status
contain the string 'Fail'. The userspace recovery tools
should then be used.
+ error_if_no_space|queue_if_no_space
+ If the pool runs out of data or metadata space, the pool will
+ either queue or error the IO destined to the data device. The
+ default is to queue the IO until more space is added or the
+ 'no_space_timeout' expires. The 'no_space_timeout' dm-thin-pool
+ module parameter can be used to change this timeout -- it
+ defaults to 60 seconds but may be disabled using a value of 0.
+
iii) Messages
create_thin <dev id>
@@ -341,9 +380,6 @@ then you'll have no access to blocks mapped beyond the end. If you
load a target that is bigger than before, then extra blocks will be
provisioned as and when needed.
-If you wish to reduce the size of your thin device and potentially
-regain some space then send the 'trim' message to the pool.
-
ii) Status
<nr mapped sectors> <highest mapped sector>