| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
We can use the new added calc_bitmap_size func to remove some
redundant lines.
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extends nodes option for assemble mode, make the num of
cluster node could be change by user.
Before that, it is necessary to ensure there are enough space
for those nodes, calc_bitmap_size is introduced to calculate
the bitmap size of each node.
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
To support change the cluster name, the commit do the followings:
1. extend original write_bitmap function for new scenario.
2. add the scenarion to handle the modification of cluster's name
in write_bitmap1.
3. let the cluster name also show in examine_super1 and detail_super1
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
| |
We want the clustered devices to be started exclusively by a cluster
resource-agent. So, avoid starting using the incremental option.
This also skips a clustered md from starting during boot in inactive mode.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the ability to convert a regular md without bitmap
(--bitmap=none) to a clustered device (--bitmap=clustered).
To convert a device with --bitmap=internal or --bitmap=external,
you have to convert to --bitmap=none and then re-execute the
command with --bitmap=clustered.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A clustered disk is added by the traditional --add sequence.
However, other nodes need to acknowledge that they can "see"
the device. This is done by --cluster-confirm:
--cluster-confirm SLOTNUM:/dev/whatever (if disk is found)
or
--cluster-confirm SLOTNUM:missing (if disk is not found)
The node initiating the --add, has the disk state tagged with
MD_DISK_CLUSTER_ADD and the one confirming tag the disk with
MD_DISK_CANDIDATE.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
| |
This adds capability of exmining bitmaps corresponding to all
nodes/slots on the device.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The home-cluster is stored in the bitmap super block of the
array. The device can be assembled on a cluster with the
cluster name same as the one recorded in the bitmap.
If home-cluster is not specified, this is auto-detected using
dlopen corosync cmap library.
neilb: allow code to compile when corosync-devel is not installed.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
Specifies the maximum number of nodes in the cluster that may use
this device simultaneously. This is equivalent to the number of
bitmaps created in the internal superblock (patches to follow).
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a clustered MD, create bitmaps equal to number of nodes so
each node has an independent bitmap.
Only the first bitmap is has the bits set so that the first node
that assembles the device also performs the sync.
The bitmaps are aligned to 4k boundaries.
On-disk format:
0 4k 8k 12k
-------------------------------------------------------------------
| idle | md super | bm super [0] + bits |
| bm bits[0, contd] | bm super[1] + bits | bm bits[1, contd] |
| bm super[2] + bits | bm bits [2, contd] | bm super[3] + bits |
| bm bits [3, contd] | | |
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
| |
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
| |
'recover' etc doesn't appear in /proc/mdstat immediately.
The "sync" thread must be started first.
But 'sync_action' shows it as soon as MD_RECOVERY_NEEDED is set
in the kernel. So look there too.
Now maybe I can get rid of some of those silly 'sleep' calls.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
| |
'wait' is a shell builtin that isn't doing anything useful.
It should be calling 'check wait' I think.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an array is being reshaped using backup space on a 'spare' device,
then
mdadm --grow --continue
won't find it as by the time it runs, nothing looks like a spare are
more. The spare has been added to the array, but has no data yet.
So allow reshape_prepare_fdlist to find a newly-incorporated spare and
report this so it can be used.
Reported-by: Xiao Ni <xni@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
| |
As the kernel now does less locking, 'check wait' doesn't
always wait long enough. Add some pauses.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
| |
When the array is stopped during a critical section, we sometimes
erase the backup, which is bad.
This happens when 'completed' is zero.
This can happen easily when 'stop' freezes reshape.
So try to be more careful and check 'reshape_position'.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
| |
Appologies if this is the wrong mailing list for this patch.
This is a very small patch for the manual page for the mdadm utility.
Thanks,
Andrew
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
| |
Function add_new_arrays() expects that function get_md_name() should
return pointer to devname, but also get_md_name() may return NULL. So
check the pointer before use it in add_new_arrays().
Signed-off-by: Sergey Vidishev <sergeyv@yandex-team.ru>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
| |
sometimes these can get left around, and udev can be looking
at them at awkward times so they don't disappear.
So be forceful.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some old kernels set 'completed' to '0' too soon.
But modern kernels don't.
And when 'mdadm --stop' freezes and resume the grow,
'completed' goes back to zero briefly, which can confuse this
logic.
So only think '0' might be wrong from an old kernel when
the reshape has gone idle.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
| |
It can sometimes.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
| |
..that triggers an error.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
| |
'check wait' seems a bit racy now.
Wait for the array to be fully optimal before proceeding.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
EBUSY can be returned if something has recently happened
to cause md to want to check if recovery is needed, but hasn't
had a chance yet.
This can easily happen in testing.
So retry a few times in that case.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
| |
1/ use correct data-offset for cmp - that has changed.
2/ flushbufs on the block device before reading to avoid cache issues
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
| |
I don't really know why this is needed, but there is a delay
between the reshape finishing and the level/etc changing.
So add some sleeps.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
The current sleep/wait doesn't seem long enough,
particularly when two arrays are being reshaped in the one
container.
So wait a bit more...
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
| |
In that case, updating 'completed' to 'max_progress' is wrong.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
| |
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
| |
We might be trying to set_new_data_offset() for RAID10, when it is
a necessary requirement, or for RAID5 where it is optional.
In the latter case, a message about metadata versions is no helpful.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
avail_size1 requires ->sb, so we must only call it if ->sb
was loaded.
If ->sb wasn't loaded, then we are only proceding on the basis that
the kernel might be able to work something out - we don't need to
do any tests on size.
Reported-by: Christoffer Hammarström <christoffer.hammarstrom@linuxgods.com>
Signed-off-by: NeilBrown <neilb@suse.de>
URL: https://bugs.debian.org/784874
|
|
|
|
|
|
| |
This will trigger an error. And now that errors are fatal....
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
| |
This can report non-zero if there was nothing to do,
and that isn't really an error.
If the array doesn't get started, something else
will complain.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
| |
This is a very corner-case, but the self-tests tripped on it,
and it makes sense not to trust the uuid when it is being changed.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since commit 30bee0201, the anchor is updated from the active
DDF header. This requires fixing the header type before the
anchor is written.
The LSI Software RAID code will reject DDF meta data with wrong
anchor type and will erase all meta data when it encounters
such a broken anchor. Thus starting Linux md once on a system
with LSI RAID BIOS may cause the meta data to get destroyed.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
| |
"--wait" will return non-zero status if it didn't need to wait.
This is no a reason to fail a test.
So ignore the return status from those commands.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
| |
All programs now need to declare their "Name".
Signed-off-by: NeilBrown <neilb@suse.de>
Fixes: d56dd607ba43 ("Change way of printing name of a process")
|
|
|
|
|
|
|
|
|
|
| |
We 'active_disks' does not count spares, so if array is rebuilding,
this will not necessarily find all devices, so may report an array
as failed when it isn't.
Counting up to nr_disks is better.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
| |
Active arrays with IMSM metadata are counted per hba so far.
This is bad due to new functionality of orom shared between multiple
controllers i.e. more arrays can be created than is supported by orom.
This patch changes the way of counting arrays, so the result will be
sum of arrays under every hba supported by specific orom.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally we do not "force"-assemble devices which are in the
middle of recovery, as they are unlikely to have useful data.
However, when a reshape increases the number of devices,
the newly added devices appear to be recovering because they
do not have complete data on them yet, but then they aren't expected
to until the reshape completes.
So in this case, it can be appropriate to force-assemble them.
Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
| |
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the parity device of a RAID4 is missing, then there is no immediate
risk to data. So it doesn't matter if the array is dirty or not.
This can be important when reshaping a RAID0, and is a much better
solution that that in the resent-reverted.
b720636a5849397dbc6dc1b0f0b671d17034a28b
Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit b720636a5849397dbc6dc1b0f0b671d17034a28b.
As it said, this was a hack. It causes problems when trying to
--force assemble a RAID4. There is a better way.
Reported-by: "Jonathan Harker (Jesusaurus)" <jesusaurus@gentlydownthe.net>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
| |
Since we introduced replacement devices, the 'i' used in
start_array() is twice the slot number.
So we need to adjust when printing.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
| |
"Wrong-Level" is a reason, not a component device, so it should
start with a space to indiciate this to alert().
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
| |
"alert" treats the "disc" arg differently if it starts with a space.
At least it does for sending email. It doesn't for writing to syslog.
Make this consistent and obey the 'space protocol' when writing to
syslog.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Kernels between
c6563a8c38fde3c1c7fc925a v3.5-rc1~110^2~53
and
b5254dd5fdd9abcacadb5101 v3.5-rc1~110^2~51
allow new_offset to be set, but don't then allow a RAID5
to be reshaped to change that offset.
Due to selective backports, this includes the SLES11-SP3 kernel.
It is quite easy to handle this case in mdadm, so we do.
Specifically: if the reshape with data-offset fails with EINVAL,
abort the data-offset change and try the "old" way.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"mdadm -If" - triggered from udev rules when disk is removed from OS -
tries to set array in auto-read-only mode. This can interrupt rebuild
process which is started automatically, e.g. if array is mounted and
spare disk is available (I/O error is detected faster than removing
failed disk by mdadm).
This patch prevents "mdadm -If" from setting array into "auto-read-only",
by requiring exclusive open to succeed.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
| |
Devices list in PCI Data Structure is supported only in
3 and above revision. Make sure that this is checked.
Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
| |
Replaced oroms array with list, add_orom() now only appends to this list
and add_orom_device_id() only appends devid_list node to an orom_entry.
Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|