Commit message | Author | Age | Files | Lines
* open_dev_excl: allow device to be read-only.devel-3.2NeilBrown2011-03-241-1/+6
| For many operations we don't need a writable device. So if opening O_RDWR fails in open_dev_excl, then try again with O_RDONLY. If we really needed write access, a subsequent operation will fail. But if we didn't, we succeed where otherwise we wouldn't have. Signed-off-by: NeilBrown <neilb@suse.de>
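The fallback described in this message can be sketched as follows. This is a minimal illustration, not mdadm's actual open_dev_excl(): the function name `open_excl_rw_then_ro` and the `writable` out-parameter are assumptions made for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not mdadm's open_dev_excl): open a device
 * exclusively, preferring read-write but falling back to read-only.
 * If writable is non-NULL it reports which mode succeeded. */
int open_excl_rw_then_ro(const char *path, int *writable)
{
    int fd = open(path, O_RDWR | O_EXCL);
    if (fd >= 0) {
        if (writable)
            *writable = 1;
        return fd;
    }

    /* Read-write failed (e.g. a read-only device): retry read-only. */
    fd = open(path, O_RDONLY | O_EXCL);
    if (fd >= 0 && writable)
        *writable = 0;
    return fd;  /* -1 with errno set if both attempts failed */
}
```

A caller that later needs to write would simply see that later operation fail, which matches the reasoning in the message.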
* tests: use /tmp/mdadm.conf rather than /etc/mdadm.conf.NeilBrown2011-03-241-4/+4
| | | | | | Modifying /etc/mdadm.conf for testing is just wrong. Signed-off-by: NeilBrown <neilb@suse.de>
* Merge branch 'master' into devel-3.2NeilBrown2011-03-243-1/+44
|\ | | | | | | | | | | | | | | | | | | | | | | Conflicts: Incremental.c Manage.c ReadMe.c inventory mdadm.8.in mdadm.spec mdassemble.8 mdmon.8
| * Release mdadm-3.1.5mdadm-3.1.5NeilBrown2011-03-237-5/+48
| | | | | | | | Signed-off-by: NeilBrown <neilb@suse.de>
| * Incr: don't exclude 'active' devices from auto inclusion in a container.NeilBrown2011-03-233-19/+8
| | | | | | | | | | | | | | | | For containers, it is always appropriate to include a device in the container. Whether it should then be included in an array is a separate question. Signed-off-by: NeilBrown <neilb@suse.de>
| * --stop: separate 'is busy' test for 'did it stop properly'.NeilBrown2011-03-231-2/+36
| | Stopping an md array requires that there is no other user of it. However with udev and udisks and such there can be transient other users of md devices which can interfere with stopping the array. If there are transient users, we really want "mdadm --stop" to wait a little while and retry. However if the array is genuinely in use (e.g. mounted), then we don't want to wait at all - we want to fail immediately. So before trying to stop, re-open the device with O_EXCL. If this fails then the device is probably in use, so give up. If it succeeds, but a subsequent STOP_ARRAY fails, then it is possibly a transient failure, so try again for a few seconds. Signed-off-by: NeilBrown <neilb@suse.de>
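The two-stage logic in this message (fail at once if genuinely busy, otherwise retry the stop for a while) might look roughly like the sketch below. All names here are hypothetical: the callbacks stand in for the O_EXCL re-open and the STOP_ARRAY ioctl, and `struct fake_array` with its two functions is only a test double so the example is self-contained.

```c
#include <time.h>

/* Hypothetical callback types: the probe stands in for the O_EXCL
 * re-open, the stop step for the STOP_ARRAY ioctl. */
typedef int (*probe_fn)(void *ctx);  /* 0 = free, nonzero = in use */
typedef int (*stop_fn)(void *ctx);   /* 0 = stopped, nonzero = failed */

enum stop_result { STOP_OK, STOP_BUSY, STOP_TIMEOUT };

enum stop_result stop_with_retry(probe_fn is_busy, stop_fn try_stop,
                                 void *ctx, int max_tries, long delay_ms)
{
    if (is_busy(ctx))
        return STOP_BUSY;  /* genuinely in use: fail immediately */

    for (int i = 0; i < max_tries; i++) {
        if (try_stop(ctx) == 0)
            return STOP_OK;
        /* Possibly a transient user (udev, udisks): wait, then retry. */
        if (i + 1 < max_tries) {
            struct timespec ts = { delay_ms / 1000,
                                   (delay_ms % 1000) * 1000000L };
            nanosleep(&ts, NULL);
        }
    }
    return STOP_TIMEOUT;
}

/* Test double: an "array" whose stop fails a fixed number of times. */
struct fake_array { int busy; int fails_left; };

int fake_is_busy(void *ctx)
{
    return ((struct fake_array *)ctx)->busy;
}

int fake_try_stop(void *ctx)
{
    struct fake_array *a = ctx;
    if (a->fails_left > 0) {
        a->fails_left--;
        return -1;
    }
    return 0;
}
```

Separating the probe from the stop attempt is what lets the caller distinguish "mounted, give up now" from "something briefly holds the device".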
| * Monitor: handle v.quick removal of devices better.NeilBrown2011-03-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | If a device fails and then is removed before Monitor sees the failure, GET_DISK_INFO returns nothing so Monitor relies on mdstat info where '_' is incorrectly interpreted as 'a spare'. We should treat '_' as 'removed' - that is safer. Without this, a v.quick fail+remove gets reported as 'Failed' then 'SpareActive'. Signed-off-by: NeilBrown <neilb@suse.de>
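As an illustration of the interpretation change, the sketch below classifies the characters of an mdstat status pattern such as "[U_U]", treating '_' as removed rather than as a spare. The enum and function names are assumptions for the example and are not taken from Monitor.c.

```c
enum slot_state { SLOT_ACTIVE, SLOT_REMOVED, SLOT_UNKNOWN };

/* Interpret one character of an mdstat status pattern like "[U_U]". */
enum slot_state classify_mdstat_slot(char c)
{
    switch (c) {
    case 'U':
        return SLOT_ACTIVE;
    case '_':
        return SLOT_REMOVED;  /* safer than assuming "a spare" */
    default:
        return SLOT_UNKNOWN;  /* brackets and anything else */
    }
}

/* Count the removed slots in a whole pattern; brackets are ignored. */
int count_removed(const char *pattern)
{
    int n = 0;
    for (; *pattern; pattern++)
        if (classify_mdstat_slot(*pattern) == SLOT_REMOVED)
            n++;
    return n;
}
```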
| * ddf: fix up detection of failed/missing devices.NeilBrown2011-03-231-5/+15
| | If a device hasn't been found yet we can still tell if it is expected to be working, and we must do so to make sure 'working_disks' is correct. Signed-off-by: NeilBrown <neilb@suse.de>
| * restripe: allow test code to have an offset on each device.Piergiorgio Sartor2011-03-231-0/+8
| | If the device name ends with :number, e.g. /dev/sda0:1234, then assume the RAID data starts that many sectors from the start of the device. Signed-off-by: NeilBrown <neilb@suse.de>
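Parsing the ":number" suffix could be done along these lines. This is a sketch under assumptions: the helper name `split_dev_offset` and its error convention are invented for the example, and the real test code in restripe.c may handle malformed input differently.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: split "name:offset" in place.
 * Returns 0 on success (*offset is 0 when there is no suffix) and -1,
 * leaving name untouched, when the suffix is not a plain decimal number. */
int split_dev_offset(char *name, unsigned long long *offset)
{
    char *colon = strrchr(name, ':');
    char *end;
    unsigned long long value;

    *offset = 0;
    if (!colon)
        return 0;
    /* Require a digit right after ':' so signs and spaces are rejected. */
    if (!isdigit((unsigned char)colon[1]))
        return -1;
    value = strtoull(colon + 1, &end, 10);
    if (*end != '\0')
        return -1;

    *colon = '\0';    /* name now holds only the device path */
    *offset = value;  /* sectors from the start of the device */
    return 0;
}
```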
| * Assemble: improve efficacy of -Af in assembling degraded dirty arrays.NeilBrown2011-03-231-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | If a degraded dirty array has some superblocks which are clean and others that are dirty, and the dirty ones are newer by precisely '1' in the event count, then the current code to force the array to be clean will not work. We need to make sure to find a superblock with most recent event count and force that one to be 'clean'. Reported-by: A J Wyborny <ajwyborny@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
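The selection step described here, picking the superblock with the most recent event count and forcing that one clean, can be sketched as below. The `struct sb_info` layout and the function name are simplifications assumed for illustration; real superblocks carry far more state and "forcing clean" involves more than setting a flag.

```c
#include <stddef.h>

/* Simplified stand-in for per-device superblock state. */
struct sb_info {
    unsigned long long events;  /* event counter from the superblock */
    int clean;                  /* nonzero if marked clean */
};

/* Mark the superblock with the highest event count as clean.
 * Returns its index, or -1 if there are no superblocks. */
int force_newest_clean(struct sb_info *sb, size_t n)
{
    size_t best = 0;

    if (n == 0)
        return -1;
    for (size_t i = 1; i < n; i++)
        if (sb[i].events > sb[best].events)
            best = i;

    sb[best].clean = 1;
    return (int)best;
}
```

With event counts 100 (clean), 101 (dirty), 100 (clean), the dirty superblock at index 1 is the one that gets forced clean, which is the case the message says the older code missed.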
* | FIX: imsm: Do not change serial if disk failedKrzysztof Wojcik2011-03-241-7/+0
| | This patch rolls back one change connected with mdadm-OROM compatibility: adding ':0' at the end of the disk serial number if the disk is detected as failed. The current mdadm implementation does not distinguish two cases in which a disk is marked as failed: 1. The disk has really failed (disconnected, broken). 2. The disk was just marked as failed by mdadm using the "-f" option. The second case is not yet fully handled and compatible with the IMSM standard. Changing the serial number of an existing, operational disk causes problems in the "thunderdome" and "load_super" functions, which use serial numbers to compare and search for disks. The change must be reverted until full support is developed. Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | FIX: Tests: raid0->raid10 without degradationKrzysztof Wojcik2011-03-243-13/+4
| | The raid0->raid10 transition needs at least 2 spare devices. After the level is changed to raid10, recovery is triggered on the failed (missing) disks. At the end of the recovery process we have a fully operational (not degraded) raid10 array. Initially it was possible to migrate raid0->raid10 without triggering recovery (which resulted in a degraded raid10). Now it is not possible. This patch adapts the tests to mdadm's new behavior. Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | FIX: imsm: Rebuild does not start on second failed diskKrzysztof Wojcik2011-03-241-0/+4
| | Problem: If we have an array with two failed disks and the array is in a degraded state (currently possible only for raid10 with 2 degraded mirrors) and we have two spare devices in the container, the recovery process should be triggered on both failed disks. It is not. Recovery is triggered only for the first failed disk. The second failed disk remains unchanged although the spare drive exists in the container and is ready for recovery. Root cause: mdmon does not check whether the array is degraded after recovery of the first drive is completed. Resolution: Check if the current number of disks in the array equals the target number of disks. If not, trigger the degradation check and then the recovery process. Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Assemble: improve efficacy of -Af in assembling degraded dirty arrays.NeilBrown2011-03-231-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | If a degraded dirty array has some superblocks which are clean and others that are dirty, and the dirty ones are newer by precisely '1' in the event count, then the current code to force the array to be clean will not work. We need to make sure to find a superblock with most recent event count and force that one to be 'clean'. Reported-by: A J Wyborny <ajwyborny@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | super-intel: enable loading metadata from non-IMSM compliant disksLabun, Marcin2011-03-231-5/+9
| | | | | | | | | | | | | | | | | | Honor ignore_hw_compat to load metadata from disk attached to non-IMSM controller or when there are no IMSM OROM/EFI capabilities. Used only for guessing and examining metadata format. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | examine: allows to examine a disk metadata on non-metadata compliant systemsLabun, Marcin2011-03-233-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | Allow for loading metadata from disk attached to non-metadata compliant system. Affects mdadm --examine and guess_super. Added ignore_hw_compat in supertype to pass information to load_super handler. If ignore_hw_compat is set the handler should load metadata also from disks that do not comply with metadata requirements (i.e. disk is not attached to native controller, etc). Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | man mdadm: Add note about auto-assembly during array reshapeAdam Kwolek2011-03-231-0/+7
| | | | | | | | | | | | | | | | | | Add note to man that auto-assembly cannot be used for reshaped arrays. Revisions: NeilBrown Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | man mdadm: add information for MDADM_EXPERIMENTAL flagAdam Kwolek2011-03-231-0/+17
| | | | | | | | | | | | | | | | | | Update man for MDADM_EXPERIMENTAL flag. Minor revisions by Mathias Burén <mathias.buren@gmail.com> and Neil Brown. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | mdmon: Stop keeping track of RAID0 (and LINEAR) arrays.NeilBrown2011-03-222-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | Tracking RAID0 arrays doesn't really work. There is no need, and there are some sysfs files which won't exist when the array appears and then won't be opened when the level is changed. So simply ignore RAID0 and LINEAR arrays - don't add them when they appear and if an array we are monitoring turns into one of these, discard it promptly. Signed-off-by: NeilBrown <neilb@suse.de>
* | mdmon: don't wait for O_EXCL when shutting down.NeilBrown2011-03-223-4/+22
| | If mdmon is shutting down because there are no devices left to look at, then don't wait 5 seconds for an O_EXCL open, as that can block progress of --grow. Only wait for O_EXCL if we received a signal. Signed-off-by: NeilBrown <neilb@suse.de>
* | mdmon: allow manage_member to cope with ->container becoming NULL.NeilBrown2011-03-221-4/+9
| | | | | | | | | | | | | | | | | | As monitor() can set ->container to NULL, we need to be careful about dereferencing it. So take a copy in manage_member, return if it is NULL, and only use the copy. Signed-off-by: NeilBrown <neilb@suse.de>
* | Grow: increase raid_disks before adding specific spares.NeilBrown2011-03-221-0/+9
| | When we add spares that have been targeted at a specific slot, we need raid_disks to be bigger than the slot number. But currently we don't increase raid_disks until after we add these spares. So introduce an early increase of raid_disks to allow the spares to be added. Signed-off-by: NeilBrown <neilb@suse.de>
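The constraint stated here (raid_disks must exceed every targeted slot number before the spares are added) can be expressed as a small calculation. This is only a sketch: the function name and the convention that -1 means "no specific slot" are assumptions for the example, not the code in Grow.c.

```c
#include <stddef.h>

/* Hypothetical helper: smallest raid_disks value that lets every spare
 * targeted at a specific slot be added. slots[i] is the target slot of
 * spare i, or -1 if that spare is not aimed at a particular slot. */
int raid_disks_needed(int raid_disks, const int *slots, size_t n)
{
    int needed = raid_disks;

    for (size_t i = 0; i < n; i++)
        if (slots[i] >= needed)
            needed = slots[i] + 1;  /* must be bigger than the slot number */
    return needed;
}
```

If the result is larger than the current raid_disks, that is the value the early increase would have to reach before the spares are added.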
* | Monitor: handle v.quick removal of devices better.NeilBrown2011-03-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | If a device fails and then is removed before Monitor sees the failure, GET_DISK_INFO returns nothing so Monitor relies on mdstat info where '_' is incorrectly interpreted as 'a spare'. We should treat '_' as 'removed' - that is safer. Without this, a v.quick fail+remove gets reported as 'Failed' then 'SpareActive'. Signed-off-by: NeilBrown <neilb@suse.de>
* | ddf: fix up detection of failed/missing devices.NeilBrown2011-03-221-5/+15
| | If a device hasn't been found yet we can still tell if it is expected to be working, and we must do so to make sure 'working_disks' is correct. Signed-off-by: NeilBrown <neilb@suse.de>
* | restripe: allow test code to have an offset on each device.Piergiorgio Sartor2011-03-221-0/+8
| | If the device name ends with :number, e.g. /dev/sda0:1234, then assume the RAID data starts that many sectors from the start of the device. Signed-off-by: NeilBrown <neilb@suse.de>
* | test: call "udevadm settle" after stopping array.NeilBrown2011-03-221-0/+3
| | | | | | | | | | | | | | | | If we don't do this, then the unlink from /dev might happen after the next step in the test creates something in /dev, and device names seem to go missing. Signed-off-by: NeilBrown <neilb@suse.de>
* | RAID-6 check standalonePiergiorgio Sartor2011-03-213-3/+270
| | Hi Neil, please find attached a patch, against the mdadm-3.2 base, including a standalone version of the raid-6 check. This is basically a re-working (and hopefully improvement) of the already implemented check in "restripe.c". I split the check function into "collect" and "stats", so that the second one could be easily replaced. The API is also simplified. The command line options are reduced, since the only level is raid-6, but the ":offset" option is included. The output reports the block/stripe rotation, P/Q errors and the possible HDD (or unknown). BTW, the patch also applies to the already patched "restripe.c", including the last ":offset" patch (which is not yet in git). Another item is that, due to "sysfs.c" linking (see below), the "Makefile" needed some changes; I hope this is not a problem. Next steps (a TODO list, if you like) would be: 1) Add the "sysfs.c" code in order to retrieve the HDDs' info from the MD device. It is already linked, together with the whole (mdadm) universe, since it seems it cannot live alone. I'll need some advice or hints on how to use it. I checked "sysfs.c", but before I dig deep into it, it may be better to have some advice (maybe just one function call will do it). 2) Add the suspend lo/hi control. Fellow John Robinson was suggesting to look into "Grow.c", which I did, but I guess the same story as 1) is valid: better to have some hint on where to look before wasting time. 3) Add a repair option (future). This should have different levels, like "all", "disk", "stripe". That is, fix everything (more or less like "repair"), fix only if a disk is clearly having problems, or fix each stripe which clearly has a problem (but different stripes may belong to different HDDs). So, for points 1) and 2) it would be nice to have some more detail on where to look for what. Point 3) we will discuss later. Thanks, please consider for inclusion, bye, pg Signed-off-by: NeilBrown <neilb@suse.de>
* | platform_intel: support EFI SCU OEM variableLabun, Marcin2011-03-201-2/+9
| | | | | | | | | | | | | | | | | | RstScuV and RstScuO variable names are supported. First try reading from RstScuV, when it fails try RstScuO. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Tested-by: Przemyslaw Czarnowski <przemyslaw.hawrylewicz.czarnowski@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | imsm: FIX: indicate that metadada has to be writtenAdam Kwolek2011-03-201-0/+1
| | When adding spare disks to raid0, the spare metadata is not written. This is due to an early exit from sync_metadata() when the updates_pending flag is empty. When mdmon is absent, indicate to sync_metadata() that changes must be flushed to the disks. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | FIX: Add spare throws exception (v2)Adam Kwolek2011-03-201-4/+4
| | sync_metadata() requires st->sb to be loaded; otherwise an exception is generated. This makes expansion fail, because spares cannot be added. The metadata update now uses the tst pointer instead of st, which is better than loading the anchor for st as I proposed previously. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Retry writing 'inactive' state during stopping arrayKrzysztof Wojcik2011-03-181-7/+18
| | Issue observed: Sporadically, stopping arrays using the "mdadm -Ss" command does not succeed. Cause: Writing "inactive" to the array state does not succeed because the array is busy (accessed by udev, blkid etc.). Resolution: If writing 'inactive' fails, wait and retry (because it is possibly a transient failure). Signed-off-by: Krzysztof Wojcik <krzysztof.wojcik@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | FIX: ping_monitor() usage causes memory leaksAdam Kwolek2011-03-188-11/+24
| | When devnum2devname() is used to produce the input for ping_monitor(), the returned string pointer should be passed to free() to release the memory. This is not done in several places. This use case should have its own function to avoid the memory leak. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | Manage: fix the mess I made in earlier patch.NeilBrown2011-03-181-4/+6
| | When I separated the 'native metadata' case more cleanly from the "external metadata" case for adding a drive, I left some 'external' code in the 'native' case, and didn't copy it to the 'external' case. When - in the external case - we add to super, we must check for mdmon first, so we know whether to do the metadata update ourselves or not, then afterwards call either flush_metadata_updates (to send to mdmon) or sync_metadata (to do it directly). Reported-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | --stop: separate 'is busy' test for 'did it stop properly'.NeilBrown2011-03-171-2/+36
| | Stopping an md array requires that there is no other user of it. However with udev and udisks and such there can be transient other users of md devices which can interfere with stopping the array. If there are transient users, we really want "mdadm --stop" to wait a little while and retry. However if the array is genuinely in use (e.g. mounted), then we don't want to wait at all - we want to fail immediately. So before trying to stop, re-open the device with O_EXCL. If this fails then the device is probably in use, so give up. If it succeeds, but a subsequent STOP_ARRAY fails, then it is possibly a transient failure, so try again for a few seconds. Signed-off-by: NeilBrown <neilb@suse.de>
* | Fix regression when using 'grow' to add a bitmap.NeilBrown2011-03-151-1/+1
| | When we allowed a devlist to accompany some --grow modes - but not --bitmap - we made --bitmap always fail, instead of failing only if a device was given to add. As 'devs_found' includes the md device, we need to compare against '1'. Signed-off-by: NeilBrown <neilb@suse.de>
* | Merge branch 'master' into devel-3.2NeilBrown2011-03-154-12/+91
|\ \ | |/ | | | | | | | | | | | | Conflicts: Manage.c managemon.c super-ddf.c super-intel.c
| * mdadm.man: added encouragement to shrink filesystem before array.NeilBrown2011-03-151-3/+8
| | Suggested by Rory Jaffe <rsjaffe@gmail.com> to make the danger of shrinking, and the recommended avoidance technique, more explicit. Signed-off-by: NeilBrown <neilb@suse.de>
| * ddf: implement remove_from_superNeilBrown2011-03-152-7/+60
| | This is needed to remove devices from mdmon's knowledge when the device is removed from the md container. Now that ddf has a remove_from_super we don't need the code that allows some personalities not to implement this. Signed-off-by: NeilBrown <neilb@suse.de>
| * IMSM: Fix problem in mdmon monitor of using removed disk in imsm container.Labun, Marcin2011-03-153-33/+186
| | The manager thread shall pass the information to the monitor thread (mdmon) that some devices have been removed from the container. Otherwise, the monitor (mdmon) might use such devices (spares) to rebuild an array that has gone degraded. This problem happens for imsm containers, since a list of the container disks is maintained in the intel_super structure. When an array goes degraded, the list is searched to find a spare disk to start the rebuild. Without this fix the rebuild could be started on a spare device that was a member of the container, but has been removed from it. A new super type function handler has been introduced to prepare metadata format specific information about removed devices: int (*remove_from_super)(struct supertype *st, mdu_disk_info_t *dinfo). The message prepared in remove_from_super is later processed by the process_update handler in the monitor thread. Signed-off-by: Marcin Labun <marcin.labun@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
| * DDF Allow a RAID1 to be 'partially optimal'.NeilBrown2011-03-151-0/+2
| | If a RAID1 is meant to have more than 2 devices and, while it doesn't have that many, it still has more than 1, then according to the DDF spec it is "partially optimal" rather than "degraded". So make that so. Signed-off-by: NeilBrown <neilb@suse.de>
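The rule in this message can be written as a small state function. This is a sketch of the rule as stated in the message only: the enum names are invented for the example and do not correspond to the numeric state values defined by the DDF specification or used in super-ddf.c.

```c
/* Illustrative state names; not the DDF spec's numeric encodings. */
enum vd_state { VD_OPTIMAL, VD_PARTIALLY_OPTIMAL, VD_DEGRADED, VD_FAILED };

/* State of a RAID1 meant to have raid_disks members, of which
 * `working` are currently working. */
enum vd_state raid1_state(int raid_disks, int working)
{
    if (working >= raid_disks)
        return VD_OPTIMAL;
    if (working == 0)
        return VD_FAILED;
    /* More than 2 devices intended, fewer present, but still more
     * than 1: redundancy remains, so not merely "degraded". */
    if (raid_disks > 2 && working > 1)
        return VD_PARTIALLY_OPTIMAL;
    return VD_DEGRADED;
}
```

For example, a three-way mirror with two working members is partially optimal, while the same mirror with one working member is degraded.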
| * ddf: remove failed devices that are no longer in use.NeilBrown2011-03-151-0/+42
| | | | | | | | | | | | | | | | | | | | The DDF spec requires we have a phys disk record for every physically attached device. But it isn't clear what that means in the case of soft raid in a general purpose Linux computer. So remove phys disk records for any failed device that is not active in any array. Signed-off-by: NeilBrown <neilb@suse.de>
| * ddf: set Rebuilding flag when adding devices to a degraded arrayNeilBrown2011-03-151-2/+15
| | This is a bit fragile, but DDF has weird rules that we aren't really set up to handle properly. When we add a device to a degraded array it must be a spare, so mark it as Rebuilding. Signed-off-by: NeilBrown <neilb@suse.de>
| * ddf: use correct loop variable in activate_spareNeilBrown2011-03-151-4/+5
| | | | | | | | | | | | | | Using 'i' when you mean 'j' just shows how silly it is to use variables named 'i' and 'j'. Signed-off-by: NeilBrown <neilb@suse.de>
| * ddf: Don't consider 'dl' entries with state_fd < 0NeilBrown2011-03-151-1/+2
| | | | | | | | | | | | | | These have been marked as invalid (recently failed) so don't trust the major/minor associated with them. Signed-off-by: NeilBrown <neilb@suse.de>
| * managemon: Don't do spare assignment while any updates are pending.NeilBrown2011-03-151-1/+6
| | | | | | | | | | | | | | | | Spare assignment requires full knowledge of array state. A pending update might modify that state (such as a pending spare assignment) so don't try while there are updates pending. Signed-off-by: NeilBrown <neilb@suse.de>
| * Manage/external: for external metadata, add_to_super needs lock on container.NeilBrown2011-03-151-5/+12
| | | | | | | | | | | | | | | | | | | | add_to_super could use information from the current superblock (ddf does), so add_to_super for external metadata should be called with the O_EXCL lock held on the container to ensure the update is complete before any other process tries to make any changes (like adding another device to array). Signed-off-by: NeilBrown <neilb@suse.de>
* | imsm: FIX: existing backup file fails unit testsAdam Kwolek2011-03-151-2/+6
| | During normal test execution, the backup file is deleted after the test finishes. If a test is interrupted or broken, the backup file can remain for the next run. When a backup file exists before a unit test run, suites 12 and 13 fail. To avoid this, remove the backup file before grow is executed. Signed-off-by: Adam Kwolek <adam.kwolek@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
* | ddf: implement remove_from_superNeilBrown2011-03-142-7/+60
| | This is needed to remove devices from mdmon's knowledge when the device is removed from the md container. Now that ddf has a remove_from_super we don't need the code that allows some personalities not to implement this. Signed-off-by: NeilBrown <neilb@suse.de>
* | ddf: zero space_list in ddf_activate_spare.NeilBrown2011-03-141-0/+1
| | | | | | | | | | | | Currently ->space_list is uninitialised here, which is obviously bad. Signed-off-by: NeilBrown <neilb@suse.de>
* | Merge branch 'master' into devel-3.2NeilBrown2011-03-142-10/+58
|\ \ | |/