.TH "LVMCACHE" "7" "LVM TOOLS #VERSION#" "Red Hat, Inc" "\""
.SH NAME
lvmcache \(em LVM caching

.SH DESCRIPTION

\fBlvm\fP(8) includes two kinds of caching that can be used to improve the
performance of a Logical Volume (LV). Typically, a smaller, faster device
is used to improve I/O performance of a larger, slower LV. To do this, a
separate LV is created from the faster device, and then the original LV is
converted to start using the fast LV.

The two kinds of caching are:

.IP \[bu] 2
A read and write hot-spot cache, using the dm-cache kernel module.  This
cache is slow moving, and adjusts the cache content over time so that the
most used parts of the LV are kept on the faster device.  Both reads and
writes use the cache. LVM refers to this using the LV type \fBcache\fP.

.IP \[bu] 2
A streaming write cache, using the dm-writecache kernel module.  This
cache is intended to be used with SSD or PMEM devices to speed up all
writes to an LV.  Reads do not use this cache.  LVM refers to this using
the LV type \fBwritecache\fP.

.SH USAGE

Both kinds of caching use similar lvm commands:

.B 1. Identify main LV that needs caching

A main LV exists on slower devices.

.nf
  $ lvcreate -n main -L Size vg /dev/slow
.fi

.B 2. Identify fast LV to use as the cache

A fast LV exists on faster devices.  This LV will be used to hold the
cache.

.nf
  $ lvcreate -n fast -L Size vg /dev/fast

  $ lvs vg -o+devices
  LV   VG  Attr       LSize   Devices
  fast vg -wi-------  xx.00m /dev/fast(0)
  main vg -wi------- yyy.00m /dev/slow(0)
.fi

.B 3. Start caching the main LV

To start caching the main LV using the fast LV, convert the main LV to the
desired caching type, and specify the fast LV to use:

.nf
using dm-cache:

  $ lvconvert --type cache --cachevol fast vg/main

using dm-writecache:

  $ lvconvert --type writecache --cachevol fast vg/main

using dm-cache with a cache pool:

  $ lvconvert --type cache --cachepool fastpool vg/main
.fi

.B 4. Display LVs

Once the fast LV has been attached to the main LV, lvm reports the main LV
type as either \fBcache\fP or \fBwritecache\fP depending on the type used.
While attached, the fast LV is hidden, and only displayed when lvs is
given -a.  The _corig or _wcorig LV represents the original LV without the
cache.

.nf
using dm-cache:

  $ lvs -a -o name,vgname,lvattr,origin,segtype,devices vg
  LV           VG Attr       Origin       Type   Devices
  [fast]       vg Cwi-aoC---              linear /dev/fast(xx)
  main         vg Cwi-a-C--- [main_corig] cache  main_corig(0)
  [main_corig] vg owi-aoC---              linear /dev/slow(0)

using dm-writecache:

  $ lvs -a -o name,vgname,lvattr,origin,segtype,devices vg
  LV            VG Attr       Origin        Type       Devices       
  [fast]        vg -wi-ao----               linear     /dev/fast(xx)
  main          vg Cwi-a----- [main_wcorig] writecache main_wcorig(0)
  [main_wcorig] vg -wi-ao----               linear     /dev/slow(0)   
.fi

.B 5. Use the main LV

Use the LV until the cache is no longer wanted, or needs to be changed.

.B 6. Stop caching

To stop caching the main LV, separate the fast LV from the main LV.  This
changes the type of the main LV back to what it was before the cache was
attached.

.nf
   $ lvconvert --splitcache vg/main
.fi
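.P
The attach, display and detach commands from the steps above can be
previewed with a small dry-run helper.  This is only a sketch: the
function name print_cache_cmds is hypothetical, it prints the commands
rather than running them, and it assumes the --cachevol form shown in
step 3.

.nf
```shell
# Hypothetical dry-run helper: print, without running, the commands
# that attach, display and later detach a cache.
# Arguments: caching type (cache or writecache), VG, main LV, fast LV.
print_cache_cmds() {
    ctype=$1 vg=$2 main=$3 fast=$4
    case $ctype in
        cache|writecache) ;;
        *) echo "unknown caching type: $ctype" >&2; return 1 ;;
    esac
    echo "lvconvert --type $ctype --cachevol $fast $vg/$main"
    echo "lvs -a -o name,vgname,lvattr,origin,segtype,devices $vg"
    echo "lvconvert --splitcache $vg/$main"
}

print_cache_cmds writecache vg main fast
```
.fi

Printing the commands first makes it possible to review the LV names
before anything is converted.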


.SH OPTIONS

\&

.SS option args

\&

.B --cachevol
.I LV
.br

Pass this option a standard LV.  With a cache vol, cache data and metadata
are contained within the single LV.  This is used with dm-writecache or
dm-cache.

.B --cachepool
.IR CachePoolLV | LV
.br

Pass this option a cache pool object.  With a cache pool, lvm places cache
data and cache metadata on different LVs.  The two LVs together are called
a cache pool.  This permits specific placement of data and metadata.  A
cache pool is represented as a special type of LV that cannot be used
directly.  (If a standard LV is passed to this option, lvm will first
convert it to a cache pool by combining it with another LV to use for
metadata.)  This can be used with dm-cache.

\&

.SS dm-cache block size

\&

A cache pool will have a logical block size of 4096 bytes if it is created
on a device with a logical block size of 4096 bytes.

If a main LV has logical block size 512 (with an existing xfs file system
using that size), then it cannot use a cache pool with a 4096 logical
block size.  If the cache pool is attached, the main LV will likely fail
to mount.

To avoid this problem, use a mkfs option to specify a 4096 block size for
the file system, or attach the cache pool before running mkfs.

.SS dm-writecache block size

\&

The dm-writecache block size can be 4096 bytes (the default), or 512
bytes.  The default 4096 has better performance and should be used except
when 512 is necessary for compatibility.  The dm-writecache block size is
specified with --cachesettings block_size=4096|512 when caching is started.

When a file system like xfs already exists on the main LV prior to
caching, and the file system is using a block size of 512, then the
writecache block size should be set to 512.  (The file system will likely
fail to mount if writecache block size of 4096 is used in this case.)

Check the xfs sector size while the fs is mounted:

.nf
$ xfs_info /dev/vg/main
Look for sectsz=512 or sectsz=4096
.fi

The writecache block size should be chosen to match the xfs sectsz value.

It is also possible to specify a sector size of 4096 to mkfs.xfs when
creating the file system.  In this case the writecache block size of 4096
can be used.
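.P
The sectsz check described above can be scripted.  The following is a
minimal sketch, assuming xfs_info prints a sectsz=N field and that the
first occurrence describes the data section; the helper name xfs_sectsz
is hypothetical.

.nf
```shell
# Hypothetical helper: read xfs_info output on stdin and print the
# first sectsz value found (expected to be 512 or 4096).
xfs_sectsz() {
    grep -o 'sectsz=[0-9]*' | head -n 1 | cut -d= -f2
}

# Small demonstration on a fragment shaped like xfs_info output.
echo "         =        sectsz=512   attr=2, projid32bit=1" | xfs_sectsz

# Possible use on a real system (assumes vg/main is mounted xfs):
#   bs=$(xfs_info /dev/vg/main | xfs_sectsz)
#   lvconvert --type writecache --cachevol fast --cachesettings "block_size=$bs" vg/main
```
.fi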

.SS dm-writecache settings

\&

Tunable parameters can be passed to the dm-writecache kernel module using
the --cachesettings option when caching is started, e.g.

.nf
$ lvconvert --type writecache --cachevol fast \\
	--cachesettings 'high_watermark=N writeback_jobs=N' vg/main
.fi

Tunable options are:

.IP \[bu] 2
high_watermark = <count>

Start writeback when the number of used blocks reaches this watermark.

.IP \[bu] 2
low_watermark = <count>

Stop writeback when the number of used blocks drops below this watermark.

.IP \[bu] 2
writeback_jobs = <count>

Limit the number of blocks that are in flight during writeback.  Setting
this value reduces writeback throughput, but it may improve latency of
read requests.

.IP \[bu] 2
autocommit_blocks = <count>

When the application writes this number of blocks without issuing a
FLUSH request, the blocks are automatically committed.

.IP \[bu] 2
autocommit_time = <milliseconds>

The data is automatically committed if this time passes and no FLUSH
request is received.

.IP \[bu] 2
fua = 0|1

Use the FUA flag when writing data from persistent memory back to the
underlying device.
Applicable only to persistent memory.

.IP \[bu] 2
nofua = 0|1

Don't use the FUA flag when writing back data and send the FLUSH request
afterwards.  Some underlying devices perform better with fua, some with
nofua.  Testing is necessary to determine which.
Applicable only to persistent memory.
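.P
The high_watermark and low_watermark settings listed above describe a
hysteresis: writeback starts at the high mark and continues until usage
falls below the low mark.  The sketch below models only that rule as
worded here.  It is not the kernel implementation, the function name is
hypothetical, and the numbers are arbitrary values chosen for
illustration.

.nf
```shell
# Illustrative model only, not the dm-writecache implementation.
# Arguments: current state (idle or writeback), used block count,
# high watermark, low watermark.  Prints the next state.
next_writeback_state() {
    state=$1 used=$2 high=$3 low=$4
    if [ "$used" -ge "$high" ]; then
        echo writeback
    elif [ "$used" -lt "$low" ]; then
        echo idle
    else
        # Between the two watermarks the previous state is kept.
        echo "$state"
    fi
}

next_writeback_state idle 900 800 200
```
.fi

With these example values the call prints writeback, because 900 used
blocks is at or above the high watermark of 800.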


.SS dm-cache with separate data and metadata LVs

\&

When using dm-cache, the cache metadata and cache data can be stored on
separate LVs.  To do this, a "cache pool" is created, which is a special
LV that references two sub LVs, one for data and one for metadata.

To create a cache pool from two separate LVs:

.nf
$ lvcreate -n fastpool -L DataSize vg /dev/fast1
$ lvcreate -n fastpoolmeta -L MetadataSize vg /dev/fast2
$ lvconvert --type cache-pool --poolmetadata fastpoolmeta vg/fastpool
.fi

Then use the cache pool LV to start caching the main LV:

.nf
$ lvconvert --type cache --cachepool fastpool vg/main
.fi

A variation of the same procedure automatically creates a cache pool when
caching is started.  To do this, use a standard LV as the --cachepool
(this will hold cache data), and use another standard LV as the
--poolmetadata (this will hold cache metadata).  LVM will create a
cache pool LV from the two specified LVs, and use the cache pool to start
caching the main LV.

.nf
$ lvcreate -n fastpool -L DataSize vg /dev/fast1
$ lvcreate -n fastpoolmeta -L MetadataSize vg /dev/fast2
$ lvconvert --type cache --cachepool fastpool \\
	--poolmetadata fastpoolmeta vg/main
.fi

.SS dm-cache cache modes

\&

The default dm-cache cache mode is "writethrough".  Writethrough ensures
that any data written will be stored both in the cache and on the origin
LV.  The loss of a device associated with the cache in this case would not
mean the loss of any data.

A second cache mode is "writeback".  Writeback delays writing data blocks
from the cache back to the origin LV.  This mode will increase
performance, but the loss of a cache device can result in lost data.

With the --cachemode option, the cache mode can be set when caching is
started, or changed on an LV that is already cached.  The current cache
mode can be displayed with the cache_mode reporting option:

.B lvs -o+cache_mode VG/LV

.BR lvm.conf (5)
.B allocation/cache_mode
.br
defines the default cache mode.

.nf
$ lvconvert --type cache --cachevol fast \\
	--cachemode writethrough vg/main
.fi

.SS dm-cache chunk size

\&

The size of data blocks managed by dm-cache can be specified with the
--chunksize option when caching is started.  The default unit is KiB.  The
value must be a multiple of 32KiB between 32KiB and 1GiB.

Using a chunk size that is too large can result in wasteful use of the
cache, in which small reads and writes cause large sections of an LV to be
stored in the cache.  However, choosing a chunk size that is too small
can result in more overhead trying to manage the numerous chunks that
become mapped into the cache.  Overhead can include both excessive CPU
time searching for chunks, and excessive memory tracking chunks.
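.P
The size rule stated above (a multiple of 32KiB between 32KiB and 1GiB)
can be checked before passing a value to --chunksize.  This is a minimal
sketch; the helper name valid_chunksize_kib is hypothetical and the
value is taken in KiB, the default unit.

.nf
```shell
# Hypothetical helper: succeed if a chunk size given in KiB is a
# multiple of 32 KiB between 32 KiB and 1 GiB (1048576 KiB).
valid_chunksize_kib() {
    kib=$1
    [ "$kib" -ge 32 ] && [ "$kib" -le 1048576 ] && [ $((kib % 32)) -eq 0 ]
}

if valid_chunksize_kib 64; then
    echo "64 KiB is an acceptable chunk size"
fi
```
.fi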

Command to display the chunk size:
.br
.B lvs -o+chunksize VG/LV

.BR lvm.conf (5)
.B allocation/cache_pool_chunk_size
.br
controls the default chunk size.

The default value is shown by:
.br
.B lvmconfig --type default allocation/cache_pool_chunk_size


.SS dm-cache cache policy

\&

The dm-cache subsystem has additional per-LV parameters: the cache policy
to use, and possibly tunable parameters for the cache policy.  Three
policies are currently available: "smq" is the default policy, "mq" is an
older implementation, and "cleaner" is used to force the cache to write
back (flush) all cached writes to the origin LV.

The older "mq" policy has a number of tunable parameters. The defaults are
chosen to be suitable for the majority of systems, but in special
circumstances, changing the settings can improve performance.

With the --cachepolicy and --cachesettings options, the cache policy and
settings can be set when caching is started, or changed on an existing
cached LV (both options can be used together).  The current cache policy
and settings can be displayed with the cache_policy and cache_settings
reporting options:

.B lvs -o+cache_policy,cache_settings VG/LV

.nf
Change the cache policy and settings of an existing LV.

$ lvchange --cachepolicy mq --cachesettings \\
	\(aqmigration_threshold=2048 random_threshold=4\(aq vg/main
.fi

.BR lvm.conf (5)
.B allocation/cache_policy
.br
defines the default cache policy.

.BR lvm.conf (5)
.B allocation/cache_settings
.br
defines the default cache settings.

.SS dm-cache spare metadata LV

\&

See
.BR lvmthin (7)
for a description of the "pool metadata spare" LV.
The same concept is used for cache pools.

.SS dm-cache metadata formats

\&

There are two disk formats for dm-cache metadata.  The metadata format can
be specified with --cachemetadataformat when caching is started, and
cannot be changed.  Format \fB2\fP has better performance; it is more
compact, and stores dirty bits in a separate btree, which improves the
speed of shutting down the cache.  With \fBauto\fP, lvm selects the best
option provided by the current dm-cache kernel module.

.SS mirrored cache device

\&

The fast LV holding the cache can be created as a raid1 mirror so that it
can tolerate a device failure.  (When using dm-cache with separate data
and metadata LVs, each of the sub-LVs can use raid1.)

.nf
$ lvcreate -n main -L Size vg /dev/slow
$ lvcreate --type raid1 -m 1 -n fast -L Size vg /dev/fast1 /dev/fast2
$ lvconvert --type cache --cachevol fast vg/main
.fi

.SS dm-cache command shortcut

\&

A single command can be used to create a cache pool and attach that new
cache pool to a main LV:

.nf
$ lvcreate --type cache --name Name --size Size VG/LV [PV]
.fi

In this command, the specified LV already exists, and is the main LV to be
cached.  The command creates a new cache pool with the given name and
size, using the optionally specified PV (typically an ssd).  Then it
attaches the new cache pool to the existing main LV to begin caching.

(Note: ensure that the specified main LV is a standard LV.  If a cache
pool LV is mistakenly specified, then the command does something
different.)

(Note: the type option is interpreted differently by this command than by
normal lvcreate commands in which --type specifies the type of the newly
created LV.  In this case, an LV with type cache-pool is being created,
and the existing main LV is being converted to type cache.)

\&

.SS dm-cache cachevol repair

\&

If dm-cache metadata is damaged in a cachevol using writethrough, the
cachevol can be detached using lvconvert --splitcache; no repair is
necessary.  If a writeback cache is damaged, the following steps can be
used to attempt recovery.

.P
Ensure that the main LV and the attached cachevol are inactive.

.HP 4
.nf
$ lvs -a vg -o+segtype
  LV           VG Attr       LSize  Pool   Origin        Type
  [fast]       vg Cwi---C---  8.00g                      linear
  main         vg Cwi---C--- 32.00g [fast] [main_corig]  cache
  [main_corig] vg owi---C--- 32.00g                      linear
.fi

.P
Create a new LV that will hold a repaired copy of the cache.  It must be
the same size as the existing cachevol it will replace.

.HP 4
.nf
$ lvcreate -n fast2 -L 8g -an vg
.fi

.P
Run the following lvconvert command to create a repaired copy of the
cachevol on the replacement LV.  It will use the cache_repair utility to
write repaired metadata on the destination LV, and then copy all the cache
data from the damaged cachevol to the destination LV.  Copying all the cache
data can take a while.  The main LV, cachevol LV and replacement LV
must all be inactive before running this command.  If the cache repair
fails, the damage may be unrepairable, or may require manual inspection
and repair.

.HP 4
.nf
$ lvconvert --repaircachevol fast vg/fast2
Erase all existing data on vg/fast2? [y/n]: y
      cache_repair wrote repaired metadata to vg/fast2.
      copying 7 GiB of cache data to vg/fast2...
      copied 1 GiB of 7 GiB of cache data...
      copied 2 GiB of 7 GiB of cache data...
      copied 3 GiB of 7 GiB of cache data...
      copied 4 GiB of 7 GiB of cache data...
      copied 5 GiB of 7 GiB of cache data...
      copied 6 GiB of 7 GiB of cache data...
      copied 7 GiB of 7 GiB of cache data...
      copied 8522825728 bytes of cache data from vg/fast to vg/fast2.
.fi

.P
If the repair was successful, replace the current cachevol (fast) with the
repaired copy (fast2).

.HP 4
.nf
$ lvconvert --replacecachevol fast vg/fast2
Replace current cachevol fast with fast2 for caching vg/main? [y/n]:
  LV vg/main is now using cachevol vg/fast2 for caching.
  The previous cachevol vg/fast is now unused.
.fi

.P
Verify that the repaired copy is now attached to the main LV, and the
original damaged cachevol is detached.

.HP 4
.nf
$ lvs -a vg -o+segtype
  LV           VG Attr       LSize  Pool    Origin        Type
  fast         vg -wi-------  8.00g                       linear
  [fast2]      vg Cwi---C---  8.00g                       linear
  main         vg Cwi---C--- 32.00g [fast2] [main_corig]  cache
  [main_corig] vg owi---C--- 32.00g                       linear
.fi

.P
Try to activate the main LV with the repaired cache.

.HP 4
.nf
$ lvchange -ay vg/main
.fi

.P
Try using the main LV.  If bad data is seen, then the metadata was not
successfully repaired on the new cachevol.  In this case, the damage
may be unrepairable, or may require manual inspection and repair.


.SH SEE ALSO
.BR lvm.conf (5),
.BR lvchange (8),
.BR lvcreate (8),
.BR lvdisplay (8),
.BR lvextend (8),
.BR lvremove (8),
.BR lvrename (8),
.BR lvresize (8),
.BR lvs (8),
.BR vgchange (8),
.BR vgmerge (8),
.BR vgreduce (8),
.BR vgsplit (8)