author    Jonathan Brassow <jbrassow@redhat.com>  2017-11-02 09:49:35 -0500
committer Jonathan Brassow <jbrassow@redhat.com>  2017-11-02 09:49:35 -0500
commit    9e8dec2f387d8eaf48195ef38ab7699d4a8385ed (patch)
tree      0ed72f6eb941f817edd677c9c875b177e4184f11
parent    50130328450d1f624d30438ca835d40e0d4f942d (diff)
download  lvm2-9e8dec2f387d8eaf48195ef38ab7699d4a8385ed.tar.gz
testsuite: Fix problem when checking RAID4/5/6 for mismatches.
The lvchange-raid[456].sh test checks that mismatches can be detected properly. It does this by writing garbage to the back half of one of the legs directly.

When performing a "check" or "repair" of mismatches, MD does a good job going directly to disk and bypassing any buffers that may prevent it from seeing mismatches. However, in the case of RAID4/5/6 we have the stripe cache to contend with, and this is not bypassed. Thus, mismatches which have /just/ happened to an area that now populates the stripe cache may be overlooked.

This isn't a serious issue, however, because the stripe cache is short-lived and reasonably small. So, while there may be a small window of time between the disk changing underneath the RAID array and when you run a "check"/"repair" - causing a mismatch to be missed - that would be no worse than if a user had simply run a "check" a few seconds before the disk changed. IOW, it simply isn't worth making a fuss over dropping the stripe cache before beginning a "check" or "repair" (which we actually did attempt to do a while back).

So, to get the test running smoothly, we simply deactivate and reactivate the LV to force the stripe cache to be dropped and then proceed. We could just as easily wait a few seconds for the stripe cache to empty.
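The deactivate/reactivate workaround described above can be sketched as a short shell sequence. This is purely illustrative: the VG/LV names are assumptions, and the `run` helper only echoes each command so the flow can be shown without a real volume group (drop the helper to actually execute the commands against a live LV).

```shell
#!/bin/sh
# Hypothetical sketch of the stripe-cache workaround, assuming an LV
# named "$vg/$lv". Commands are echoed, not executed.
vg=testvg        # assumed volume group name
lv=raid5lv       # assumed RAID4/5/6 logical volume name
run() { echo "$@"; }   # replace body with "$@" to really run the commands

# Cycle the LV to force the kernel to drop the stripe cache...
run lvchange -an "$vg/$lv"
run lvchange -ay "$vg/$lv"

# ...then scrub: "check" finds discrepancies without changing them,
# and lvs can report the resulting mismatch count.
run lvchange --syncaction check "$vg/$lv"
run lvs -o +raid_mismatch_count "$vg/$lv"
```

Simply sleeping for a few seconds before the "check" would, per the commit message, work as well, since the stripe cache is short-lived.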
-rw-r--r--  test/shell/lvchange-raid.sh | 11
1 file changed, 11 insertions, 0 deletions
diff --git a/test/shell/lvchange-raid.sh b/test/shell/lvchange-raid.sh
index 8c22481d2..604b7f7d6 100644
--- a/test/shell/lvchange-raid.sh
+++ b/test/shell/lvchange-raid.sh
@@ -43,6 +43,9 @@ run_writemostly_check() {
printf "#\n#\n#\n# %s/%s (%s): run_writemostly_check\n#\n#\n#\n" \
$vg $lv $segtype
+
+ # I've seen this sync fail. When it does, it looks like the sync
+ # thread has not been started... haven't repro'ed it yet.
aux wait_for_sync $vg $lv
# No writemostly flag should be there yet.
@@ -169,6 +172,14 @@ run_syncaction_check() {
dd if=/dev/urandom of="$device" bs=1k count=$size seek=$seek
sync
+ # Cycle the LV so we don't grab stripe cache buffers instead
+ # of reading disk. This can happen with RAID 4/5/6. You may
+ # think this is bad because those buffers could prevent us
+ # from seeing bad disk blocks; however, the stripe cache is
+ # not long-lived. (RAID1/10 are checked immediately.)
+ lvchange -an $vg/$lv
+ lvchange -ay $vg/$lv
+
# "check" should find discrepancies but not change them
# 'lvs' should show results
lvchange --syncaction check $vg/$lv