Commit message | Author | Age | Files | Lines
* Release mdadm-3.3.4mdadm-3.3.4mdadm-3.3.xNeilBrown2015-08-037-6/+44
| | | | | | Important bugfix release. Signed-off-by: NeilBrown <neilb@suse.com>
* Assemble: really don't assemble IMSM array without OROM.NeilBrown2015-08-032-2/+4
| | | | | | | | | Previous patch missed one case. Also print more useful information when rejecting a device with IMSM metadata. Signed-off-by: NeilBrown <neilb@suse.com>
* mdassemble: include mapfile support.NeilBrown2015-08-033-23/+1
| | | | | | | | This does make mdassemble a bit bigger, but it also means it actually works properly with named arrays. Ref: https://bbs.archlinux.org/viewtopic.php?id=198196 Signed-off-by: NeilBrown <neilb@suse.com>
* Assemble: don't assemble IMSM array without OROM.NeilBrown2015-08-032-6/+5
| | | | | | | | | | | | | | If someone has an IMSM array, and disables RAID in the BIOS and uses the devices for some other purpose, then they really don't want mdadm to start syncing the array. So don't assemble if OROM doesn't confirm it is OK. There can still be problems for crash-dump not being able to find the OROM. Some explicit work-around might be needed for that rather than a more general workaround that can corrupt data. Signed-off-by: NeilBrown <neilb@suse.com>
* Release mdadm-3.3.3mdadm-3.3.3NeilBrown2015-07-247-7/+27
| | | | Signed-off-by: NeilBrown <neilb@suse.com>
* mdassemble: add "Name" definition.NeilBrown2015-07-241-0/+2
| | | | | | That allows it to compile again :-( Signed-off-by: NeilBrown <neilb@suse.com>
* Don't ignore return value from read and writeNeilBrown2015-07-242-9/+19
| | | | | | New gcc sometimes complains about this. Signed-off-by: NeilBrown <neilb@suse.com>
* bitmap: convert "inline" to "static inline"NeilBrown2015-07-241-3/+3
| | | | | | Otherwise new gcc ignores them with some compile options. Signed-off-by: NeilBrown <neilb@suse.com>
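One likely reading of the gcc complaint above: under C99 and later, a plain `inline` definition does not by itself provide an external definition, so a call that the compiler decides not to inline can be left unresolved at link time. Marking the helper `static inline` gives each translation unit its own private copy. A minimal sketch follows; `count_set_bits` and `bits_in_pair` are hypothetical names used for illustration and are not taken from bitmap.c.

```c
#include <stdint.h>

/* Hypothetical helper (not from bitmap.c): count the bits set in a word.
 * "static inline" keeps the definition private to this translation unit,
 * so the call still resolves even when the compiler does not inline it. */
static inline int count_set_bits(uint64_t word)
{
    int bits = 0;

    while (word) {
        word &= word - 1;   /* clear the lowest set bit */
        bits++;
    }
    return bits;
}

/* Hypothetical caller that uses the helper twice. */
static int bits_in_pair(uint64_t a, uint64_t b)
{
    return count_set_bits(a) + count_set_bits(b);
}
```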
* Assemble: extend --homehost='<ignore>' to allow --name= to ignore homehostNeilBrown2015-07-242-7/+11
| | | | | | Also make --homehost='<ignore>' work properly. Signed-off-by: NeilBrown <neilb@suse.com>
* test: assume recovery has completed if sync_completed says so.NeilBrown2015-07-231-0/+8
| | | | | | | The final completion of a recovery can be delayed, so use sync_completed to check whether it has finished but has not yet been reaped. Signed-off-by: NeilBrown <neilb@suse.com>
* tests: flushbufs after writing zerosNeilBrown2015-07-232-0/+2
| | | | | | | | | Sometimes the removed device is re-added before the writes get all the way to the md device, so the array doesn't need any recovery and the test fails. So flush first to be safe. Signed-off-by: NeilBrown <neilb@suse.com>
* test: add -F flag to mkfsNeilBrown2015-07-224-6/+6
| | | | | | | | | newer versions of mkfs.extX ask before creating a filesystem on a device which appears to already have a filesystem. We don't want that, so add the -F flag. Also be explicit about fs type as one shouldn't depend on defaults. Signed-off-by: NeilBrown <neilb@suse.com>
* mdadm: document --homehost=any functionality.NeilBrown2015-07-221-0/+7
| | | | Signed-off-by: NeilBrown <neilb@suse.com>
* Assemble: improve tests for matching --name= request.NeilBrown2015-07-221-7/+12
| | | | | | | | If the name in the array has a home-host, then require that it matches, or is "any", or requested homehost is "any". Signed-off-by: NeilBrown <neilb@suse.com>
* raid6check: use O_DIRECT instead of O_SYNC.NeilBrown2015-07-201-2/+3
| | | | | | | O_DIRECT is more direct and is faster. This requires aligned memory allocation, but that isn't hard. Signed-off-by: NeilBrown <neilb@suse.de>
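The aligned allocation mentioned above can be sketched as follows. This is an illustration only, not the raid6check code: the helper name `alloc_aligned_buf` is hypothetical, and the 4096-byte alignment used in the usage note is an assumption (the alignment O_DIRECT actually requires depends on the device and filesystem).

```c
#define _POSIX_C_SOURCE 200112L   /* for posix_memalign() */
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a buffer suitable for O_DIRECT I/O.
 * 'align' must be a power of two and a multiple of sizeof(void *).
 * Returns NULL on failure; release the buffer with free(). */
static void *alloc_aligned_buf(size_t size, size_t align)
{
    void *buf = NULL;

    if (posix_memalign(&buf, align, size) != 0)
        return NULL;
    return buf;
}
```

A caller would request, for example, `alloc_aligned_buf(chunk_size, 4096)` and pass the result to `read()` on a descriptor opened with `O_DIRECT`; an unaligned buffer would typically make such a read fail with `EINVAL`.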
* restripe: fix data block order in raid6_2_data_recovNeilBrown2015-07-202-10/+6
| | | | | | | | | | ... rather than relying on the caller getting them in the correct order. This is better engineering and fixes a bug, but because the failed_slotX numbers are used later with assumption that they weren't swapped Signed-off-by: NeilBrown <neilb@suse.de>
* raid6check: various cleanup/fixesNeilBrown2015-07-202-129/+168
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - document meaning of various arrays. In particular: stripes[] blocks[] blocks_page[] block_index_for_slot[] It needs to be clear if these are indexed by raid_disk number or syndrome number. - changed meaning of block_index_for_slot[]. It didn't seem to be used consistently. It also made use of the block numbers in array data ordering, which is not directly relevant for syndrome calculations. - reduced number of args to autorepair and manual_repair They don't need both stripes[] and blocks[]. And they don't need diskP or diskQ. blocks[-1] is the P chunk, blocks[-2] is the Q chunk. block_index_for_slot[] can be used to find the target device for a particular syndrome block. - remove stripe locking from within manual_repair, and instead use the global stripe locking used for check and autorepair. - this necessitated changes to raid6_datap_recov and raid6_2data_recov so the P and Q blocks could be before or after the data blocks. Signed-off-by: NeilBrown <neilb@suse.de>
* Assemble: really ensure stripe_cache is big enough to handle new chunk sizeNeilBrown2015-07-171-2/+2
| | | | | | | | | | Earlier patch: 56fcbcbb6f17df0e5dedf59744deee037c5d5fbd calculated the proper chunk size but didn't use it. Let's actually use it this time. Signed-off-by: NeilBrown <neilb@suse.com>
* raid6checkNeilBrown2015-07-162-21/+37
| | | | | | | | fix checking of DDF layouts. Stuff probably still broken. Signed-off-by: NeilBrown <neilb@suse.de>
* raid6check: get device ordering correct for syndrome calculation.NeilBrown2015-07-161-6/+15
| | | | | | | | | | | | | | | The order of devices used for the syndrome calculation is not the same as the order of data in the array. The D block immediately after Q is first, then they continue cyclically in raid-disk order, skipping over the P disk if it is seen. This gets the 'check' right for all layouts other than DDF, which is quite different. I haven't confirmed that this doesn't break repair. Signed-off-by: NeilBrown <neilb@suse.de>
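The ordering rule described above can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the raid6check implementation: the function name `syndrome_order` and its parameters are hypothetical, the caller is assumed to already know which raid disks hold P and Q for the stripe, and DDF layouts (which the commit says are quite different) are not covered.

```c
/* Hypothetical sketch: list data disks in syndrome order for one stripe.
 *
 * raid_disks: number of devices in the array
 * p_disk:     raid-disk number holding P for this stripe
 * q_disk:     raid-disk number holding Q for this stripe
 * order:      output array with room for raid_disks - 2 entries
 *
 * Starts with the disk immediately after Q, walks cyclically in
 * raid-disk order, and skips the P disk when it is encountered.
 * Returns the number of data disks written to order[]. */
static int syndrome_order(int raid_disks, int p_disk, int q_disk, int *order)
{
    int count = 0;
    int step;

    for (step = 1; step < raid_disks; step++) {
        int disk = (q_disk + step) % raid_disks;

        if (disk == p_disk)
            continue;       /* P is not a data block */
        order[count++] = disk;
    }
    return count;
}
```

For a 5-disk array with P on disk 1 and Q on disk 2, the walk visits disks 3, 4, 0 and then skips disk 1, so the syndrome order of the data blocks is 3, 4, 0.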
* tests: slow down --stop a bit to allow revert-inplace to work.NeilBrown2015-07-162-4/+4
| | | | | | | | | | revert-inplace would sometimes find that the original reshape had finished. So slow down the reshaping during --stop (which needs to be a little bit fast so that stop doesn't timeout waiting) and don't wait quite so long before stopping. Signed-off-by: NeilBrown <neilb@suse.de>
* tests: add 19raid6checkNeilBrown2015-07-161-0/+27
| | | | | | | | This checks that raid6check finds no errors in newly created array with all different layouts. (it doesn't...) Signed-off-by: NeilBrown <neilb@suse.de>
* test: clear out old metadata from loop devices.NeilBrown2015-07-161-0/+2
| | | | | | | Old metadata can tempt udev to assemble things, which just gets in the way. Signed-off-by: NeilBrown <neilb@suse.de>
* raid6check: report role of suspect device.NeilBrown2015-07-101-2/+3
| | | | | | i.e. -2 for Q, -1 for P, 0-N for data. Signed-off-by: NeilBrown <neilb@suse.de>
* tests: save failure logs to logdirNeilBrown2015-07-101-8/+4
| | | | | | | If --save-logs is given we already save all logs to --logdir If not, we should still save erroneous logs to --logdir. Signed-off-by: NeilBrown <neilb@suse.com>
* tests: do not try to 'flushbufs' after stopping an arrayNeilBrown2015-07-103-4/+1
| | | | | | | If the array is stopped, there is nothing to flush, and blockdev can signal an error. Signed-off-by: NeilBrown <neilb@suse.com>
* test: add dmesg output to logs on error.NeilBrown2015-07-061-0/+2
| | | | | | This can help isolate the problem. Signed-off-by: NeilBrown <neilb@suse.de>
* test: check sync_action as well when checking for an action.NeilBrown2015-07-061-5/+14
| | | | | | | | | Some actions only appear in /proc/mdstat after a little delay, so check in sync_action as well. This applies when checking for recovery etc, and when waiting for idle. Signed-off-by: NeilBrown <neilb@suse.de>
* test: speed up reshape when stopping arrays.NeilBrown2015-07-061-4/+7
| | | | | | | --stop needs to wait for reshape to get to a suitable spot, so having really slow resync isn't helpful. Signed-off-by: NeilBrown <neilb@suse.de>
* test: stop all arrays before starting test.NeilBrown2015-07-061-0/+1
| | | | | | | As well as cleaning up loop devices, stop all arrays. After all, we cannot do the one without the other. Signed-off-by: NeilBrown <neilb@suse.com>
* Grow: remove stray tracing message.NeilBrown2015-07-061-3/+1
| | | | Signed-off-by: NeilBrown <neilb@suse.com>
* Manage/stop: don't stop during initial critical section.NeilBrown2015-07-061-4/+19
| | | | | | | | | | | | If the array is reshaping to more devices, then stopping during that initial critical section is a bad idea. So check for it and wait a bit. Should probably handle final critical section of a reduction too. same-size reshape should be handled correctly already. Signed-off-by: NeilBrown <neilb@suse.de>
* Manage/stop: improve some comments.NeilBrown2015-07-061-4/+19
| | | | | | This code always confuses me - this might help a bit. Signed-off-by: NeilBrown <neilb@suse.com>
* Manage/stop: guard against 'completed' being too large.NeilBrown2015-07-061-1/+5
| | | | | | | | A race can allow 'completed' to read as 2^63-1, which takes a long time to count up to. So guard against that possibility. Signed-off-by: NeilBrown <neilb@suse.com>
* Monitor: don't Wait forever on a 'frozen' array.NeilBrown2015-07-061-2/+10
| | | | | | | If Wait() finds the array resync is 'frozen', then wait a little while to avoid races, but don't wait forever. Signed-off-by: NeilBrown <neilb@suse.com>
* sysfs: reject reads that use the whole buffer.NeilBrown2015-07-061-5/+5
| | | | | | | | | If a read fills the whole buffer, then we possibly missed something off the end, and we definitely shouldn't put a '\0' beyond the end, so just return an error. This should never happen anyway. Signed-off-by: NeilBrown <neilb@suse.com>
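The check described above might look roughly like this. It is a sketch only: `read_attr` is a hypothetical name and signature, not the function in mdadm's sysfs code, and it shows just the rule that a read which fills the buffer is treated as an error rather than being terminated one byte past the end.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical sketch: read an attribute into buf and NUL-terminate it.
 * Returns the number of bytes read, or -1 on error.  A read that fills
 * the whole buffer is rejected: the value may have been truncated, and
 * there is no room left for the terminating '\0'. */
static int read_attr(int fd, char *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n < 0 || (size_t)n >= len)
        return -1;
    buf[n] = '\0';
    return (int)n;
}
```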
* Remove some trailing white spaceNeilBrown2015-07-0238-139/+139
| | | | | | It looks ugly in my editor. Signed-off-by: NeilBrown <neilb@suse.de>
* Manage: fix no-op test in Manage_stop.NeilBrown2015-07-021-1/+1
| | | | | | | | A 'devnm' never starts with '/', so this test is pointless. The code should use the passed-in devname unless it is clearly not usable. So fix it to do that. Signed-off-by: NeilBrown <neilb@suse.de>
* mdstat: discard 'dev' field, just use 'devnm'NeilBrown2015-07-028-20/+17
| | | | | | | | These both have the same value, and have done since the 'devnm' concept was introduced. So discard the pointless duplicate. Signed-off-by: NeilBrown <neilb@suse.de>
* Grow: fix typo in commentNeilBrown2015-06-181-1/+1
| | | | Signed-off-by: NeilBrown <neilb@suse.de>
* Assemble: ensure stripe_cache is big enough to handle new chunk sizeNeilBrown2015-06-181-1/+5
| | | | | | | If you reshape to a larger chunk size, and need to restart, it can have problems. Signed-off-by: NeilBrown <neilb@suse.de>
* Grow: fix a couple of typos.NeilBrown2015-05-281-2/+2
| | | | Signed-off-by: NeilBrown <neilb@suse.de>
* test: make 'check wait' more reliable.NeilBrown2015-05-281-1/+1
| | | | | | | | | | | 'recover' etc doesn't appear in /proc/mdstat immediately. The "sync" thread must be started first. But 'sync_action' shows it as soon as MD_RECOVERY_NEEDED is set in the kernel. So look there too. Now maybe I can get rid of some of those silly 'sleep' calls. Signed-off-by: NeilBrown <neilb@suse.de>
* tests/imsm-grow-template change 'wait' to 'check wait'NeilBrown2015-05-283-8/+6
| | | | | | | 'wait' is a shell builtin that isn't doing anything useful. It should be calling 'check wait' I think. Signed-off-by: NeilBrown <neilb@suse.de>
* Grow: fix problem with --grow --continueNeilBrown2015-05-281-3/+4
| | | | | | | | | | | | | | If an array is being reshaped using backup space on a 'spare' device, then mdadm --grow --continue won't find it as by the time it runs, nothing looks like a spare any more. The spare has been added to the array, but has no data yet. So allow reshape_prepare_fdlist to find a newly-incorporated spare and report this so it can be used. Reported-by: Xiao Ni <xni@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
* tests: wait a bit longer for reshape to complete.NeilBrown2015-05-252-2/+4
| | | | | | | As the kernel now does less locking, 'check wait' doesn't always wait long enough. Add some pauses. Signed-off-by: NeilBrown <neilb@suse.de>
* Grow: another attempt to fix stop-during-reshape race.NeilBrown2015-05-251-16/+18
| | | | | | | | | | | When the array is stopped during a critical section, we sometimes erase the backup, which is bad. This happens when 'completed' is zero. This can happen easily when 'stop' freezes reshape. So try to be more careful and check 'reshape_position'. Signed-off-by: NeilBrown <neilb@suse.de>
* Fix minor typo in mdadm manpage.Andrew Burgess2015-05-231-1/+1
| | | | | | | | | | | Apologies if this is the wrong mailing list for this patch. This is a very small patch for the manual page for the mdadm utility. Thanks, Andrew Signed-off-by: NeilBrown <neilb@suse.de>
* mdadm: monitor: fix nullptr dereference when get_md_name() returns NULLSergey Vidishev2015-05-201-1/+9
| | | | | | | | | Function add_new_arrays() expects get_md_name() to return a pointer to devname, but get_md_name() may also return NULL. So check the pointer before using it in add_new_arrays(). Signed-off-by: Sergey Vidishev <sergeyv@yandex-team.ru> Signed-off-by: NeilBrown <neilb@suse.de>
* test: forcefully clean up old loop devices.NeilBrown2015-05-201-0/+8
| | | | | | | | Sometimes these can get left around, and udev can be looking at them at awkward times so they don't disappear. So be forceful. Signed-off-by: NeilBrown <neilb@suse.de>