path: root/test/shell/lvchange-raid.sh

* test: add checks for not 100% sync ratio after initiation of check/repair (Heinz Mauelshagen, 2019-10-02, 1 file, -0/+4)
    Related: rhbz1640630

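    A minimal sketch of such a check (hypothetical vg/lv names; the test
    drives this through the suite's own helpers):

        # Right after initiating a scrub, the sync ratio should no longer
        # read 100%, proving the kernel actually started the operation.
        lvchange --syncaction check vg/raid_lv
        sync=$(lvs --noheadings -o sync_percent vg/raid_lv | tr -d ' ')
        test "$sync" != "100.00"
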
* tests: enable lvmlockd for passing tests (David Teigland, 2018-05-30, 1 file, -1/+1)

* testsuite: Forgot to pull 'should's after fixing RAID4/5/6 mismatch test (Jonathan Brassow, 2017-11-02, 1 file, -10/+3)
    Test will now fail rather than warn if conditions are not met.

* testsuite: Fix problem when checking RAID4/5/6 for mismatches. (Jonathan Brassow, 2017-11-02, 1 file, -0/+11)
    The lvchange-raid[456].sh test checks that mismatches can be detected
    properly. It does this by writing garbage to the back half of one of
    the legs directly. When performing a "check" or "repair" of
    mismatches, MD does a good job going directly to disk and bypassing
    any buffers that may prevent it from seeing mismatches.

    However, in the case of RAID4/5/6 we have the stripe cache to contend
    with and this is not bypassed. Thus, mismatches which have /just/
    happened to an area that now populates the stripe cache may be
    overlooked. This isn't a serious issue, however, because the stripe
    cache is short-lived and reasonably small. So, while there may be a
    small window of time between the disk changing underneath the RAID
    array and when you run a "check"/"repair" - causing a mismatch to be
    missed - that would be no worse than if a user had simply run a
    "check" a few seconds before the disk changed. IOW, it simply isn't
    worth making a fuss over dropping the stripe cache before beginning a
    "check" or "repair" (which we actually did attempt to do a while
    back).

    So, to get the test running smoothly, we simply deactivate and
    reactivate the LV to force the stripe cache to be dropped and then
    proceed. We could just as easily wait a few seconds for the stripe
    cache to empty also.

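    The workaround amounts to something like this (a sketch; vg/lv names
    are hypothetical and the corruption step is simplified):

        # Corrupt one leg behind MD's back, then cycle the LV so the
        # stripe cache is dropped before scrubbing for mismatches.
        dd if=/dev/urandom of="$leg_dev" bs=1M count=1 seek=8 oflag=direct
        lvchange -an vg/raid5_lv
        lvchange -ay vg/raid5_lv
        lvchange --syncaction check vg/raid5_lv
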
* tests: use well defined test (Zdenek Kabelac, 2017-07-10, 1 file, -1/+1)
    Prefer [ p ] && [ q ] as [ p -a q ] is not well defined.
    Apparently && and || "short-circuit" while -a and -o do not.

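    For example (illustrative):

        # -a evaluates both operands; complex expressions combining it
        # are unspecified by POSIX:
        [ -b "$dev" -a -w "$dev" ] && echo usable
        # preferred: two separate tests joined by &&, which short-circuits:
        [ -b "$dev" ] && [ -w "$dev" ] && echo usable
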
* tests: math drop unnecessary $/${} (Zdenek Kabelac, 2017-07-10, 1 file, -3/+3)
    $/${} is unnecessary on arithmetic variables.
    Use $((..)) instead of the deprecated $[..].

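    For instance:

        size=4
        off=$[$size * 1024]     # deprecated $[..] form
        off=$((size * 1024))    # preferred; no $ needed inside $((..))
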
* tests: use bash (Zdenek Kabelac, 2017-07-10, 1 file, -1/+2)

* tests: double quote (Zdenek Kabelac, 2017-07-10, 1 file, -3/+3)

* tests: correcting usage of '==' in bash (Zdenek Kabelac, 2017-07-10, 1 file, -5/+5)

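    The distinction, roughly (an illustrative guess at the kind of change
    made here):

        # '==' inside [ ] is a bashism; POSIX test spells equality '=':
        [ "$status" = "active" ] && echo ok
        # inside bash's [[ ]], '==' is valid (and pattern-matches when
        # the right-hand side is an unquoted glob):
        [[ "$status" == active ]] && echo ok
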
* tests: wait for raid in sync (Zdenek Kabelac, 2017-05-29, 1 file, -0/+1)
    Lvchange needs synchronized raid.

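    A hedged sketch of such a wait (the suite has its own aux helper for
    this; the loop below is illustrative):

        # Poll until the RAID LV reports a 100% sync ratio.
        while [ "$(lvs --noheadings -o sync_percent vg/raid_lv | tr -d ' ')" != "100.00" ]; do
                sleep 1
        done
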
* lvconvert: prompt on raid1 image changes (Heinz Mauelshagen, 2017-04-06, 1 file, -3/+3)
    Don't change resilience of raid1 LVs without --yes.
    Adjust respective tests.

* test: raid1 down convert to linear (Heinz Mauelshagen, 2017-03-09, 1 file, -1/+2)
    Add/adjust more tests for commit 7fbe6ef16bfb.

* lvconvert: add new reporting fields for reshaping (Heinz Mauelshagen, 2017-03-01, 1 file, -1/+1)
    During an ongoing reshape, the MD kernel runtime reads stripes
    relative to data_offset and starts storing the reshaped stripes (with
    new raid layout and/or new stripesize and/or new number of stripes)
    relative to new_data_offset. This avoids writing over any data in
    place, which is non-atomic by nature, and thus keeps the transition
    recoverable without data loss. MD uses the term out-of-place
    reshaping for it.

    There are two other areas we don't have report capability for:
    - number of data stripes vs. total stripes (e.g. raid6 with 7 stripes
      total has 5 data stripes)
    - number of (rotating) parity/syndrome chunks (e.g. raid6 with 7
      stripes total has 2 parity chunks; one per stripe for the
      P-Syndrome and another one for the Q-Syndrome)

    Thus, add the following reportable keys:
    - reshape_len (in current units)
    - reshape_len_le (in logical extents)
    - data_offset (in sectors)
    - new_data_offset (in sectors)
    - data_stripes
    - parity_chunks

    Enhance lvchange-raid.sh, lvconvert-raid-reshape-linear_to_striped.sh,
    lvconvert-raid-reshape-striped_to_linear.sh, lvconvert-raid-reshape.sh
    and lvconvert-raid-takeover.sh to make use of the new keys.

    Related: rhbz834579
    Related: rhbz1191935
    Related: rhbz1191978

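    The new keys are consumed like any other lvs field, e.g.:

        # Report the reshape-related fields added by this commit:
        lvs -a -o name,reshape_len,reshape_len_le,data_offset,new_data_offset,data_stripes,parity_chunks vg
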
* tests: add SKIP_WITH_LVMLOCKD (David Teigland, 2016-02-23, 1 file, -0/+1)
    to all tests that don't already use vgcreate $SHARED

* doc: change fsf address (Zdenek Kabelac, 2016-01-21, 1 file, -1/+1)
    Hmm, rpmlint suggests the FSF is using a different address these
    days, so let's keep it up to date.

* tests: use more SKIP (Zdenek Kabelac, 2015-10-27, 1 file, -0/+2)
    Speed up check_lvmpolld.

* tests: minor simplifications (Zdenek Kabelac, 2015-05-01, 1 file, -1/+1)
    minor updates

* tests: syncaction needs kernel fix (Zdenek Kabelac, 2014-10-24, 1 file, -2/+6)
    Add 'should' as we currently cannot pass this test.
    FIXME: Add a proper wrapper so 'should' is not used once the kernel
    is fixed.

* tests: avoid hiding results in local (Zdenek Kabelac, 2014-07-02, 1 file, -3/+7)
    There is a difference between:

        local a=$(shell)

    and

        local a
        a=$(shell)

    The first returns the exit code of the shell's 'local' command,
    hiding the exit status of the command substitution.

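    A quick demonstration (runs in bash):

        f() { local a=$(false); echo "masked: $?"; }    # prints masked: 0
        g() { local a; a=$(false); echo "seen: $?"; }   # prints seen: 1
        f; g
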
* tests: rename test to inittest (Zdenek Kabelac, 2014-06-10, 1 file, -1/+1)
    We run into problems when we use 'test' alongside commands like
    should/not/... So avoid overloading the name 'test' and change it to
    inittest.

* tests: dd needs to hit disk (Zdenek Kabelac, 2014-05-28, 1 file, -16/+1)
    Unsure if this is a feature or a bug of syncaction, but the data
    needs to be physically present on the media; syncaction ignores the
    content of the buffer cache... (Maybe lvchange should implicitly
    fsync all disks that are members of the raid array before starting
    the test??)

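    Making dd bypass the cache typically looks like this (illustrative;
    oflag/conv support assumes GNU dd):

        # Ensure the garbage reaches the media, not just the page cache,
        # so a subsequent 'check' can actually see the mismatch.
        dd if=/dev/urandom of="$dev" bs=1M count=1 seek=16 \
           oflag=direct conv=fsync
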
* tests: add have_cache and have_raid (Zdenek Kabelac, 2014-05-20, 1 file, -3/+3)
    Need to be aware of build options when the system is configured
    without raid or cache support.

* tests: speedup (Zdenek Kabelac, 2014-05-15, 1 file, -31/+31)
    Avoid some expensive raid/mirror synchronization when testing just
    allocation sizes. Use lv_attr_bit.

* tests: add quotes around device paths (Zdenek Kabelac, 2014-03-21, 1 file, -2/+2)

* test: Use correct path to /dev in lvchange-raid.sh. (Petr Rockai, 2014-03-05, 1 file, -1/+1)

* tests: split raid test (Zdenek Kabelac, 2014-03-03, 1 file, -40/+21)
    Use separate files for raid1, raid456, raid10. They need different
    target versions to work, so this supports more precise test
    selection. Optimize duplicate tests of target availability and skip
    unsupported test cases sooner.

* test: Remove incorrect evaluation (Marian Csontos, 2014-03-03, 1 file, -1/+1)

* tests: updates (Zdenek Kabelac, 2014-02-27, 1 file, -0/+2)
    Add some vgremove calls.
    Remove unneeded tests for some unused commands.
    Add tests for missing commands.

* tests: utilize check and get (Zdenek Kabelac, 2014-02-11, 1 file, -81/+65)
    Replace some in-test use of lvs commands with their check and get
    equivalents. The advantage is that these 'checking' commands do not
    always need to be validated via extensive valgrind testing, and the
    output noise is significantly reduced since the output of check/get
    is suppressed.

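    In the suite this looks roughly like the following (hedged; check and
    get are the suite's own helper libraries):

        # instead of grepping raw lvs output:
        check lv_field vg/lv segtype "raid1"
        # instead of capturing lvs output by hand:
        size=$(get lv_field vg/lv size)
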
* TEST: Unaccounted possible output causing failure (Jonathan Brassow, 2013-09-12, 1 file, -0/+8)
    lvchange-raid.sh checks to ensure that the 'p'artial flag takes
    precedence over the 'w'ritemostly flag by disabling and reenabling a
    device in the array. Most of the time this works fine, but sometimes
    the kernel can notice the device failure before it is reenabled. In
    that case, the attr flag will not return to 'w', but to 'r'efresh.
    This is because 'r'efresh also takes precedence over the
    'w'ritemostly flag. So, we also do a quick check for 'r' and not
    just 'w'.

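    A sketch of that dual check (assuming the current 10-character
    lv_attr layout, where the 9th character is the volume health bit):

        attr=$(lvs --noheadings -o lv_attr vg/raid1_lv | tr -d ' ')
        case "${attr:8:1}" in
        w|r) ;;                                     # either is acceptable
        *) echo "unexpected attr: $attr"; exit 1 ;;
        esac
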
* Mirror/Thin: Disallow thinpools on mirror logical volumes (Jonathan Brassow, 2013-09-11, 1 file, -0/+7)
    The same corner cases that exist for snapshots on mirrors exist for
    any logical volume layered on top of mirror. (One example is when a
    mirror image fails and a non-repair LVM command is the first to
    detect it via label reading. In this case, the LVM command will hang
    and prevent the necessary LVM repair command from running.) When a
    better alternative exists, it makes no sense to allow a new target to
    stack on mirrors as a new feature. Since RAID is now capable of
    running EX in a cluster and thin is not active-active aware, it makes
    sense to pair these two rather than mirror+thinpool.

    As further background, here are some additional comments that I made
    when addressing a bug related to mirror+thinpool
    (https://bugzilla.redhat.com/show_bug.cgi?id=919604#c9):

    I am going to disallow thin* on top of mirror logical volumes. Users
    will have to use the "raid1" segment type if they want this. This bug
    has come down to a choice between:
    1) Disallowing thin-LVs from being used as PVs.
    2) Disallowing thinpools on top of mirrors.

    The problem is that the code in dev_manager.c:device_is_usable() is
    unable to tell whether there is a mirror device lower in the stack
    from the device being checked. Pretty much anything layered on top of
    a mirror will suffer from this problem. (Snapshots are a good example
    of this; and option #1 above has been chosen to deal with them. This
    can also be seen in dev_manager.c:device_is_usable().)

    When a mirror failure occurs, the kernel blocks all I/O to it. If
    there is an LVM command that comes along to do the repair (or a
    different operation that requires label reading), it would normally
    avoid the mirror when it sees that it is blocked. However, if there
    is a snapshot or a thin-LV that is on a mirror, the above code will
    not detect the mirror underneath and will issue label reading I/O.
    This causes the command to hang.

    Choosing #1 would mean that thin-LVs could never be used as PVs -
    even if they are stacked on something other than mirrors. Choosing #2
    means that thinpools can never be placed on mirrors. This is probably
    better than we think, since it is preferred that people use the
    "raid1" segment type in the first place. However, RAID* cannot
    currently be used in a cluster volume group - even in EX-only mode.
    Thus, a complete solution for option #2 must include the ability to
    activate RAID logical volumes (and perform RAID operations) in a
    cluster volume group. I've already begun working on this.

* RAID: Make RAID single-machine-exclusive capable in a cluster (Jonathan Brassow, 2013-09-10, 1 file, -3/+3)
    Creation, deletion, [de]activation, repair, conversion, scrubbing and
    changing operations are all now available for RAID LVs in a cluster -
    provided that they are activated exclusively.

    The code has been changed to ensure that no LV or sub-LV activation
    is attempted cluster-wide. This includes the often overlooked
    operations of activating metadata areas for the brief time it takes
    to clear them. Additionally, some 'resume_lv' operations were
    replaced with 'activate_lv_excl_local' when sub-LVs were promoted to
    top-level LVs for removal, clearing or extraction. This was necessary
    because it forces the appropriate renaming actions that occur via
    resume in the single-machine case, but which won't happen in a
    cluster due to the necessity of acquiring a lock first.

    The *raid* tests have been updated to allow testing in a cluster. For
    the most part, this meant creating devices with '-aey' if they were
    to be converted to RAID. (RAID requires the converting LV to be EX
    because it is a condition of activation for the RAID LV in a
    cluster.)

* TEST: Add tests for lvchange actions of RAID under thin (Jonathan Brassow, 2013-08-27, 1 file, -118/+197)
    Patch includes RAID1,4,5,6,10 tests for:
    - setting writemostly/writebehind
    * syncaction changes (i.e. scrubbing operations)
    - refresh (i.e. reviving devices after transient failures)
    - setting recovery rate (sync I/O throttling)
    while the RAID LVs are under a thin-pool (both data and metadata)

    * not fully tested because I haven't found a way to force bad blocks
      to be noticed in the testsuite yet. Works just fine when dealing
      with "real" devices.

* RAID: Fix bug making lvchange unable to change recovery rate for RAID (Jonathan Brassow, 2013-08-09, 1 file, -0/+17)
    1) Since the min|maxrecoveryrate args are size_kb_ARGs and they are
       recorded (and sent to the kernel) in terms of kB/sec/disk, we must
       back out the factor multiple done by size_kb_arg. This is already
       performed by 'lvcreate' for these arguments.
    2) Allow all RAID types, not just RAID1, to change these values.
    3) Add min|maxrecoveryrate_ARG to the list of 'update_partial_unsafe'
       commands so that lvchange will not complain about needing at least
       one of a certain set of arguments and failing.
    4) Add tests that check that these values can be set via lvchange and
       lvcreate and that 'lvs' reports back the proper results.

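    Setting and verifying the throttle looks roughly like this (lvs field
    names assumed from current lvm2 reporting; vg/lv names hypothetical):

        lvchange --minrecoveryrate 50 --maxrecoveryrate 100 vg/raid_lv
        lvs -o +raid_min_recovery_rate,raid_max_recovery_rate vg/raid_lv
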
* TEST: Support testing new RAID features in RHEL6 kernels (Jonathan Brassow, 2013-07-22, 1 file, -38/+17)
    We check the version number of dm-raid before testing certain
    features to make sure they are present. However, this has become
    somewhat complicated by the fact that the version #'s in the upstream
    kernel and the RHEL6 kernel have been diverging. This has been a
    necessity because the upstream kernel has undergone ABI changes that
    have necessitated a bump in the 'Y' component of the version #, while
    the RHEL6 kernel has not. Thus, we need to know that the ABI has not
    changed but the features have been added. So, the current version
    #'ing stands as follows:

        RHEL6 | Upstream | Comment
        ======|==========|========
          ** Same until version 1.3.1 **
        ------|----------|--------
        N/A   | 1.4.0    | Non-functional change.
              |          | Removes arg from mapping function.
        ------|----------|--------
        1.3.2 | 1.4.1    | RAID10 fix redundancy validation checks.
        ------|----------|--------
        1.3.5 | 1.4.2    | Add RAID10 "far" and "offset" algorithm
              |          | support. Note this feature came later in RHEL6
              |          | as part of a separate update/feature.
        ------|----------|--------
        1.3.3 | 1.5.0    | Add message interface to allow manipulation of
              |          | the sync_action.
              |          | New status (STATUSTYPE_INFO) fields:
              |          | sync_action and mismatch_cnt.
        ------|----------|--------
        1.3.4 | 1.5.1    | Add ability to restore transiently failed
              |          | devices on resume.
        ------|----------|--------
        1.3.5 | 1.5.2    | 'mismatch_cnt' is zero unless
              |          | [last_]sync_action is "check".
        ------|----------|--------

    To simplify, writemostly/writebehind, scrubbing, and transient device
    failure restoration are all tested based on the same version
    requirements: (1.3.5 < V < 1.4.0) || (V > 1.5.2). Since kernel
    support for writemostly/writebehind has been around for some time,
    this could mean a reduction in the scope of kernels tested for this
    feature. I don't view this as much of a problem, since support for
    this feature was only recently added to LVM. Thus, the user would
    have to be using a very recent LVM version with an older kernel.

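    A hedged sketch of such a version gate (the suite's aux library has
    real helpers for this; the parsing and bounds below are illustrative):

        # dm-raid reports its version via 'dmsetup targets', e.g. "v1.5.2".
        ver=$(dmsetup targets | awk '/^raid/ { sub(/^v/, "", $2); print $2; exit }')
        IFS=. read -r maj min rev <<<"$ver"
        # Gate roughly on (1.3.5 <= V < 1.4.0) || (V >= 1.5.2):
        if [ "$maj" -eq 1 ] &&
           { { [ "$min" -eq 3 ] && [ "$rev" -ge 5 ]; } ||
             { [ "$min" -eq 5 ] && [ "$rev" -ge 2 ]; } ||
             [ "$min" -gt 5 ]; }
        then
                echo "dm-raid $ver: writemostly/scrubbing testable"
        fi
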
* TEST: Update syncaction test to match latest kernel updates (Jonathan Brassow, 2013-07-19, 1 file, -13/+10)
    The mismatch count reported by a dm-raid kernel target used to be
    effectively random, unless it was checked after a "check" scrubbing
    action had been performed. Updates to the kernel now mean that the
    mismatch count will be 0 unless a check has been performed and
    discrepancies had been found. This has been the intended behaviour
    all along. This patch updates the test suite to handle the change.

* reporting: tidy recent new fields (Alasdair G Kergon, 2013-07-19, 1 file, -8/+8)
    Add underscores and prefixes to recently-added fields. (Might add
    more alias functionality in future.)

* tests: fix tests to cope with latest changes (Peter Rajnoha, 2013-07-12, 1 file, -31/+31)
    - lvs -o lv_attr now has 10 indicator bits
    - use '--ignoremonitoring' instead of the shortcut '--ig' used before
      (since it would be ambiguous with the new '--ignoreactivationskip')

* TEST: Test RAID syncaction, writemostly, & refresh under snapshots (Jonathan Brassow, 2013-06-20, 1 file, -14/+54)
    Test the different RAID lvchange scenarios under snapshot as well.
    This patch also updates calculations for where to write to an
    underlying PV when testing various syncactions.

* TEST: Fix 'dd' overrunning device size and causing test failure (Jonathan Brassow, 2013-06-17, 1 file, -1/+6)
    The assumed size of 4M was too large and the test was failing because
    'dd' was failing to perform its write. Calculate the size we need to
    write with 'dd' instead, so we don't overrun the device.

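    Deriving the write size from the device itself might look like this
    (illustrative; the test computes it from LVM extent counts instead):

        # blockdev --getsz reports 512-byte sectors; cap the write at
        # half the device so dd can never run off the end.
        dev_sectors=$(blockdev --getsz "$dev")
        write_kib=$(( dev_sectors / 2 / 2 ))
        dd if=/dev/urandom of="$dev" bs=1k count="$write_kib" oflag=direct
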
* tests: more tests run in cluster mode (Zdenek Kabelac, 2013-06-16, 1 file, -0/+3)
    aux updates: prepare_vg now creates a clustered VG for cluster tests.
    Since dm-raid doesn't work in a cluster, skip the cluster test when
    someone checks for the dm-raid target, until fixed.

* RAID: Add writemostly/writebehind support for RAID1 (Jonathan Brassow, 2013-04-15, 1 file, -2/+97)
    'lvchange' is used to alter a RAID 1 logical volume's write-mostly
    and write-behind characteristics. The '--writemostly' parameter takes
    a PV as an argument with an optional trailing character to specify
    whether to set ('y'), unset ('n'), or toggle ('t') the value. If no
    trailing character is given, it will set the flag.

    Synopsis:
        lvchange [--writemostly <PV>:{t|y|n}] [--writebehind <count>] vg/lv
    Example:
        lvchange --writemostly /dev/sdb1:y --writebehind 512 vg/raid1_lv

    The last character in the 'lv_attr' field is used to show whether a
    device has the WriteMostly flag set. It is signified with a 'w'. If
    the device has failed, the 'p'artial flag has priority.

    Example ("nosync" raid1 with mismatch_cnt and writemostly):
        [~]# lvs -a --segment vg
          LV                VG   Attr      #Str Type   SSize
          raid1             vg   Rwi---r-m    2 raid1  500.00m
          [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
          [raid1_rimage_1]  vg   Iwi---r-w    1 linear 500.00m
          [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
          [raid1_rmeta_1]   vg   ewi---r--    1 linear   4.00m

    Example (raid1 with mismatch_cnt, writemostly - but failed drive):
        [~]# lvs -a --segment vg
          LV                VG   Attr      #Str Type   SSize
          raid1             vg   rwi---r-p    2 raid1  500.00m
          [raid1_rimage_0]  vg   Iwi---r--    1 linear 500.00m
          [raid1_rimage_1]  vg   Iwi---r-p    1 linear 500.00m
          [raid1_rmeta_0]   vg   ewi---r--    1 linear   4.00m
          [raid1_rmeta_1]   vg   ewi---r-p    1 linear   4.00m

    A new reportable field has been added for writebehind as well. If
    write-behind has not been set or the LV is not RAID1, the field will
    be blank.

    Example (writebehind is set):
        [~]# lvs -a -o name,attr,writebehind vg
          LV            Attr      WBehind
          lv            rwi-a-r--     512
          [lv_rimage_0] iwi-aor-w
          [lv_rimage_1] iwi-aor--
          [lv_rmeta_0]  ewi-aor--
          [lv_rmeta_1]  ewi-aor--

    Example (writebehind is not set):
        [~]# lvs -a -o name,attr,writebehind vg
          LV            Attr      WBehind
          lv            rwi-a-r--
          [lv_rimage_0] iwi-aor-w
          [lv_rimage_1] iwi-aor--
          [lv_rmeta_0]  ewi-aor--
          [lv_rmeta_1]  ewi-aor--

* RAID: Add scrubbing support for RAID LVs (Jonathan Brassow, 2013-04-11, 1 file, -0/+145)
    New options to 'lvchange' allow users to scrub their RAID LVs.

    Synopsis:
        lvchange --syncaction {check|repair} vg/raid_lv

    RAID scrubbing is the process of reading all the data and parity
    blocks in an array and checking to see whether they are coherent.
    'lvchange' can now initiate the two scrubbing operations: "check" and
    "repair". "check" will go over the array and record the number of
    discrepancies but not repair them. "repair" will correct the
    discrepancies as it finds them.

    'lvchange --syncaction repair vg/raid_lv' is not to be confused with
    'lvconvert --repair vg/raid_lv'. The former initiates a background
    synchronization operation on the array, while the latter is designed
    to repair/replace failed devices in a mirror or RAID logical volume.

    Additional reporting has been added for 'lvs' to support the new
    operations. Two new printable fields (which are not printed by
    default) have been added: "syncaction" and "mismatches". These can be
    accessed using the '-o' option to 'lvs', like:

        lvs -o +syncaction,mismatches vg/lv

    "syncaction" will print the current synchronization operation that
    the RAID volume is performing. It can be one of the following:
    - idle:    All sync operations complete (doing nothing)
    - resync:  Initializing an array or recovering after a machine
               failure
    - recover: Replacing a device in the array
    - check:   Looking for array inconsistencies
    - repair:  Looking for and repairing inconsistencies

    The "mismatches" field will print the number of discrepancies found
    during a check or repair operation. The 'Cpy%Sync' field already
    available to 'lvs' will print the progress of any of the above
    syncactions, including check and repair.

    Finally, the lv_attr field has changed to accommodate the scrubbing
    operations as well. The role of the 'p'artial character in the
    lv_attr report field has expanded. "Partial" is really an indicator
    of the health of a logical volume, and it makes sense to extend this
    to include other health indicators as well, specifically:
    - 'm'ismatches: Indicates that there are discrepancies in a RAID LV.
      This character is shown after a scrubbing operation has detected
      that portions of the RAID are not coherent.
    - 'r'efresh:    Indicates that a device in a RAID array has suffered
      a failure and the kernel regards it as failed - even though LVM
      can read the device label and considers the device to be ok. The
      LV should be 'r'efreshed to notify the kernel that the device is
      now available, or the device should be 'r'eplaced if it is
      suspected of failing.

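    Putting the two pieces together (commands as given in the synopsis
    above):

        # Scrub the LV, then report what the kernel found.
        lvchange --syncaction check vg/raid_lv
        lvs -o +syncaction,mismatches vg/raid_lv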