diff options
Diffstat (limited to 'doc/kernel/thin-provisioning.txt')
-rw-r--r-- | doc/kernel/thin-provisioning.txt | 62 |
1 files changed, 49 insertions, 13 deletions
diff --git a/doc/kernel/thin-provisioning.txt b/doc/kernel/thin-provisioning.txt index 30b8b83bd..4f67578b2 100644 --- a/doc/kernel/thin-provisioning.txt +++ b/doc/kernel/thin-provisioning.txt @@ -99,13 +99,14 @@ Using an existing pool device $data_block_size $low_water_mark" $data_block_size gives the smallest unit of disk space that can be -allocated at a time expressed in units of 512-byte sectors. People -primarily interested in thin provisioning may want to use a value such -as 1024 (512KB). People doing lots of snapshotting may want a smaller value -such as 128 (64KB). If you are not zeroing newly-allocated data, -a larger $data_block_size in the region of 256000 (128MB) is suggested. -$data_block_size must be the same for the lifetime of the -metadata device. +allocated at a time expressed in units of 512-byte sectors. +$data_block_size must be between 128 (64KB) and 2097152 (1GB) and a +multiple of 128 (64KB). $data_block_size cannot be changed after the +thin-pool is created. People primarily interested in thin provisioning +may want to use a value such as 1024 (512KB). People doing lots of +snapshotting may want a smaller value such as 128 (64KB). If you are +not zeroing newly-allocated data, a larger $data_block_size in the +region of 256000 (128MB) is suggested. $low_water_mark is expressed in blocks of size $data_block_size. If free space on the data device drops below this level then a dm event @@ -115,6 +116,35 @@ Resuming a device with a new table itself triggers an event so the userspace daemon can use this to detect a situation where a new table already exceeds the threshold. +A low water mark for the metadata device is maintained in the kernel and +will trigger a dm event if free space on the metadata device drops below +it. + +Updating on-disk metadata +------------------------- + +On-disk metadata is committed every time a FLUSH or FUA bio is written. +If no such requests are made then commits will occur every second. This +means the thin-provisioning target behaves like a physical disk that has +a volatile write cache. If power is lost you may lose some recent +writes. The metadata should always be consistent in spite of any crash. + +If data space is exhausted the pool will either error or queue IO +according to the configuration (see: error_if_no_space). If metadata +space is exhausted or a metadata operation fails: the pool will error IO +until the pool is taken offline and repair is performed to 1) fix any +potential inconsistencies and 2) clear the flag that imposes repair. +Once the pool's metadata device is repaired it may be resized, which +will allow the pool to return to normal operation. Note that if a pool +is flagged as needing repair, the pool's data and metadata devices +cannot be resized until repair is performed. It should also be noted +that when the pool's metadata space is exhausted the current metadata +transaction is aborted. Given that the pool will cache IO whose +completion may have already been acknowledged to upper IO layers +(e.g. filesystem) it is strongly suggested that consistency checks +(e.g. fsck) be performed on those layers when repair of the pool is +required. + Thin provisioning ----------------- @@ -234,6 +264,8 @@ i) Constructor read_only: Don't allow any changes to be made to the pool metadata. + error_if_no_space: Error IOs, instead of queueing, if no space. + Data block size must be between 64KB (128 sectors) and 1GB (2097152 sectors) inclusive. @@ -255,10 +287,9 @@ ii) Status should register for the event and then check the target's status. held metadata root: - The location, in sectors, of the metadata root that has been + The location, in blocks, of the metadata root that has been 'held' for userspace read access. '-' indicates there is no - held root. This feature is not yet implemented so '-' is - always returned. + held root. discard_passdown|no_discard_passdown Whether or not discards are actually being passed down to the @@ -275,6 +306,14 @@ ii) Status contain the string 'Fail'. The userspace recovery tools should then be used. + error_if_no_space|queue_if_no_space + If the pool runs out of data or metadata space, the pool will + either queue or error the IO destined to the data device. The + default is to queue the IO until more space is added or the + 'no_space_timeout' expires. The 'no_space_timeout' dm-thin-pool + module parameter can be used to change this timeout -- it + defaults to 60 seconds but may be disabled using a value of 0. + iii) Messages create_thin <dev id> @@ -341,9 +380,6 @@ then you'll have no access to blocks mapped beyond the end. If you load a target that is bigger than before, then extra blocks will be provisioned as and when needed. -If you wish to reduce the size of your thin device and potentially -regain some space then send the 'trim' message to the pool. - ii) Status <nr mapped sectors> <highest mapped sector> |