From 99a8bfcd7233a7f18ac191c3b3150068e8ac3a72 Mon Sep 17 00:00:00 2001
From: David Teigland <teigland@redhat.com>
Date: Fri, 26 Jan 2018 06:50:52 -0600
Subject: doc: lvm disk reading

---
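Reviewer note (not part of the commit): the copy-or-read rule described
in phase 1 of the new document can be sketched in C as below.  This is a
minimal illustration under stated assumptions, not the LVM implementation.
The names struct scan_dev, fetch_area() and find_label_sector() are
hypothetical, and the "LABELONE" signature and 512-byte sector size are
assumptions of the sketch.  The 4096 and 4608 byte offsets are the defaults
the document gives for the first mda_header and the metadata text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512u                /* assumed bytes per sector */
#define LABEL_SCAN_SECTORS 4u           /* label_header is looked for in the first four sectors */
#define LABEL_SIGNATURE "LABELONE"      /* assumed label_header signature */
#define DEFAULT_MDA_HEADER_OFFSET 4096u /* sector 8, per the document */
#define DEFAULT_METADATA_OFFSET 4608u   /* sector 9, per the document */

/* Hypothetical stand-in for one device during the scan.  The first
 * scan_size bytes of 'disk' play the role of the buffer from step a. */
struct scan_dev {
    const uint8_t *disk;   /* simulated device contents */
    size_t disk_size;      /* total device size in bytes */
    size_t scan_size;      /* the <N> KB from step a, in bytes */
    unsigned extra_reads;  /* reads issued after step a */
};

enum fetch_result { FETCH_COPIED, FETCH_READ, FETCH_ERROR };

/* Step b: return the sector (0-3) holding the label signature, or -1. */
static int find_label_sector(const uint8_t *buf, size_t buf_len)
{
    const size_t sig_len = sizeof(LABEL_SIGNATURE) - 1;
    unsigned s;

    assert(buf != NULL);
    for (s = 0; s < LABEL_SCAN_SECTORS; s++) {
        size_t off = (size_t)s * SECTOR_SIZE;

        if (off + sig_len > buf_len)
            break;
        if (!memcmp(buf + off, LABEL_SIGNATURE, sig_len))
            return (int)s;
    }
    return -1;
}

/* Steps c-e: get 'len' bytes at 'offset'.  If the range lies inside the
 * initial scan buffer it counts as a copy; otherwise it counts as a
 * separate read from the device. */
static enum fetch_result fetch_area(struct scan_dev *dev, size_t offset,
                                    size_t len, uint8_t *out)
{
    if (offset > dev->disk_size || len > dev->disk_size - offset)
        return FETCH_ERROR;

    memcpy(out, dev->disk + offset, len);

    if (offset + len <= dev->scan_size)
        return FETCH_COPIED;

    dev->extra_reads++;
    return FETCH_READ;
}
```

With a 4 KB scan buffer, fetch_area(&dev, 4096, 512, out) counts as a new
read because the range ends at byte 4608.  With an 8 KB buffer the same
call is a plain copy.
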
 doc/lvm-disk-reading.txt | 246 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 246 insertions(+)
 create mode 100644 doc/lvm-disk-reading.txt
diff --git a/doc/lvm-disk-reading.txt b/doc/lvm-disk-reading.txt
new file mode 100644
index 000000000..5d5e6b575
--- /dev/null
+++ b/doc/lvm-disk-reading.txt
@@ -0,0 +1,246 @@
+
+LVM disk reading
+
+Reading disks happens in two phases.  The first is a discovery phase,
+which determines what's on the disks.  The second is a working phase,
+which does a particular job for the command.
+
+
+Phase 1: Discovery
+------------------
+
+Read all the disks on the system to find out:
+- What are the LVM devices?
+- What VGs exist on those devices?
+
+This phase is called "label scan" (although it reads and scans everything,
+not just the label).  It stores the information it discovers (what LVM
+devices exist, and what VGs exist on them) in lvmcache.  The devs/VGs info
+in lvmcache is the starting point for phase two.
+
+
+Phase 1 in outline:
+
+For each device:
+
+a. Read the first <N> KB of the device. (N is configurable.)
+
+b. Look for the LVM label_header in the first four sectors.
+   If none exists, it's not an LVM device, so stop looking at it.
+   (By default, label_header is in the second sector, at byte 512.)
+
+c. Look at the pv_header, which follows the label_header.
+   This tells us the location of VG metadata on the device.
+   There can be 0, 1 or 2 copies of VG metadata.  The first
+   is always at the start of the device; the second (if used)
+   is at the end.
+
+d. Look at the first mda_header (location came from pv_header
+   in the previous step).  This is by default in sector 8,
+   4096 bytes from the start of the device.  This tells us the
+   location of the actual VG metadata text.
+
+e. Look at the first copy of the text VG metadata (location came
+   from mda_header in the previous step).  This is by default
+   in sector 9, 4608 bytes from the start of the device.
+   The VG metadata is only partially analyzed to create a basic
+   summary of the VG.
+
+f. Store an "info" entry in lvmcache for this device,
+   indicating that it is an LVM device, and store a "vginfo"
+   entry in lvmcache indicating the name of the VG seen
+   in the metadata in step e.
+
+g. If the pv_header in step c shows a second mda_header
+   location at the end of the device, then read that as
+   in step d, and repeat steps e-f for it.
+
+At the end of phase 1, lvmcache will have a list of devices
+that belong to LVM, and a list of VG names that exist on
+those devices.  Each device (info struct) is associated
+with the VG (vginfo struct) it is used in.
+
+If the number <N> of KB read in step a was large enough, then
+all the structs/metadata needed in steps b-e will be found
+in the data buffer returned by step a.  If a particular struct
+or piece of metadata needed in steps b-e is located outside the
+range of the initial read, then that step needs to issue its own
+read at the necessary location to get that bit of data.
+(The optional second mda_header and VG metadata in step g
+are located at the end of the device, and will always require
+an additional read.)
+
+
+Phase 1 in code:
+
+The most relevant functions are listed for each step in the outline.
+
+lvmcache_label_scan()
+label_scan()
+_label_scan_async()
+
+for each dev: dev = dev_iter_get(iter) ...
+
+a. _label_read_async_start()
+
+b. _label_read_data_process()
+   _find_label_header()
+
+c. _label_read_data_process()
+   ops->read()
+   _text_read()
+
+d. _read_mda_header_and_metadata()
+   raw_read_mda_header()
+
+e. _read_mda_header_and_metadata()
+   read_metadata_location()
+   text_read_metadata_summary()
+   config_file_read_fd()
+   ops->read_vgsummary()
+   _read_vgsummary()
+
+f. _text_read(): lvmcache_add()
+     [adds this device to the list of LVM devices]
+   _read_mda_header_and_metadata(): lvmcache_update_vgname_and_id()
+     [adds the VG name to the list of VGs]
+
+
+Phase 1 in log messages:
+
+For each device:
+   Scanning data from all devs async
+
+a. Reading sectors from device <dev>
+
+b. Parsing label and data from device <dev>
+
+d. Copying mda header sector from <dev> ...
+   or if the mda_header needs to be read from disk:
+   Reading mda header sector from <dev> ...
+
+e. Copying metadata summary for <dev> ...
+   or if the metadata needs to be read from disk:
+   Reading metadata summary for <dev> ...
+
+f. lvmcache <dev> ...
+
+
+Phase 2: Work
+-------------
+
+This phase carries out the operation requested by the command that was
+run.
+
+Whereas the first phase is based on iterating through each device on the
+system, this phase is based on iterating through each VG name.  The list
+of VG names comes from phase 1, which stored the list in lvmcache to be
+used by phase 2.
+
+Some commands may need to iterate through all VG names, while others may
+need to iterate through just one or two.
+
+This phase includes locking each VG as work is done on it, so that two
+commands do not interfere with each other.
+
+
+Phase 2 in outline:
+
+For each VG name:
+
+a. Lock the VG.
+
+b. Repeat the phase 1 scan steps for each device (PV) in this VG.
+   The phase 1 information in lvmcache may have changed because no VG lock
+   was held during phase 1.  So, repeat the phase 1 steps, but only for the
+   devices in this VG.
+
+c. Get the list of on-disk metadata locations for this VG.
+   Phase 1 created this list in lvmcache to be used here.  At this
+   point we copy it out of lvmcache.  In the simple/common case,
+   this is a list of devices in the VG.  But, some devices may
+   have 0 or 2 metadata locations instead of the default 1, so it
+   is not always equal to the list of devices.  We want to read
+   every copy of the metadata for this VG.
+
+d. For each metadata location on each device in the VG
+   (the list from the previous step):
+
+    1) Look at the mda_header.  The location of the mda_header was saved
+       in the lvmcache info struct by phase 1 (where it came from the
+       pv_header).  The mda_header tells us where the text VG metadata is
+       located.
+
+    2) Look at the text VG metadata.  The location came from mda_header
+       in the previous step.  The VG metadata is fully analyzed and used
+       to create an in-memory 'struct volume_group'.
+
+    Copying or reading the mda_header and VG metadata in steps d.1 and d.2
+    follows the same model as in phase 1:  if the data read by the rescan in
+    step b covered these areas, then the data is simply copied out of that
+    buffer; otherwise new reads are done.
+
+e. Compare the copies of VG metadata that were found in each location.
+   If some copies are older, choose the newest one to use, and update
+   any older copies.
+
+f. Update details about the devices/VG in lvmcache.
+
+g. Pass the 'vg' struct to the command-specific code to work with.
+
+
+Phase 2 in code:
+
+The most relevant functions are listed for each step in the outline.
+
+For each VG name:
+   process_each_vg()
+
+a. vg_read()
+   lock_vol()
+
+b. vg_read()
+   lvmcache_label_rescan_vg()
+   [insert phase 1 steps a-f]
+
+c. vg_read()
+   create_instance()
+   _text_create_text_instance()
+   _create_vg_text_instance()
+   lvmcache_fid_add_mdas_vg()
+   [Copies mda locations from info->mdas where it was saved
+    by phase 1, into fid->metadata_areas_in_use.  This is
+    the key connection between phase 1 and phase 2.]
+
+d. dm_list_iterate_items(mda, &fid->metadata_areas_in_use)
+
+d.1. ops->vg_read()
+     _vg_read_raw()
+     raw_read_mda_header()
+
+d.2. _vg_read_raw()
+     text_read_metadata()
+     config_file_read_fd()
+     ops->read_vg()
+     _read_vg()
+
+
+Phase 2 in log messages:
+
+For each VG name:
+   Processing VG <name>
+   Reading VG <name>
+
+b. Reading VG rereading labels for <name>
+   Scanning data from devs async
+   [insert log messages from phase 1 steps a-f]
+   Scanned data from <N> devs async
+
+For each mda on each <dev> in the VG:
+
+d. Reading VG <name> from <dev>
+
+d.1. Copying|Reading mda header sector from <dev> ...
+
+d.2. Copying|Reading metadata from <dev> ...
+
-- 
cgit v1.2.1