nspawn: make sure host root can write to the uidmapped mounts we prepare for the container payload

When using user namespaces in conjunction with uidmapped mounts, nspawn so far set up two uidmappings:

1. One that is used for the uidmapped mount and that maps the UID range 0…65535 on the backing fs to some high UID range X…X+65535 on the uidmapped fs. (Let's call this mapping the "mount mapping".)

2. One that is used for the user namespace the container payload processes run in, and that maps X…X+65535 back to 0…65535. (Let's call this one the "process mapping".)

These mappings hence are pretty much identical, one just moves things up and one back down. (Reminder: we do all this so that the processes can run under high UIDs while running off file systems that require no recursive chown()ing, i.e. we want processes with high UID range but files with low UID range.)

This creates one problem, i.e. issue #20989: if nspawn (which runs as host root, i.e. host UID 0) wants to add inodes to the uidmapped mount it can't do that, since host UID 0 is not defined in the mount mapping (only the X…X+65535 range is, after all, and X > 0), and processes whose UID is not mapped in a uidmapped fs cannot create inodes in it since those would be owned by an unmapped UID, which then triggers the famous EOVERFLOW error.

Let's fix this by explicitly including an entry for the host UID 0 in the mount mapping. Specifically, we'll extend the mount mapping to map UID 2147483646 (which is INT32_MAX-1, see code for an explanation why I picked this one) of the backing fs to UID 0 on the uidmapped fs. This way nspawn can create inodes on the uidmapped fs as it likes (which will then actually be owned by UID 2147483646 on the backing fs), as it always did.

Note that we do *not* create a similar entry in the process mapping. Thus any files created by nspawn that way (and not chown()ed to something better) will appear as unmapped (i.e. as overflowuid/"nobody") in the container payload. And that's good.

Of course, the latter is mostly theoretical, as nspawn should generally chown() the inodes it creates to UID ranges that actually make sense for the container (and we generally already do this correctly), but it's good to know that we are safe here, given we might accidentally forget to chown() some inodes we create.

Net effect: the two mappings will not be identical anymore. The mount mapping has one entry more, and the only reason it exists is so that nspawn can access the uidmapped fs reasonably independently from any process mapping.

Fixes: #20989
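
To make the two mappings described in the commit message concrete, here is a minimal toy model in C. It is only a sketch and is neither kernel nor systemd code: the MapEntry struct, the helper names, the constant names as used here, and the example shift X = 524288 are assumptions chosen for illustration, and the overflow UID of 65534 is the common default rather than something the commit states. The model only shows which UID translations succeed and which would hit the unmapped case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants (assumptions, see lead-in). */
#define EXAMPLE_UID_RANGE 65536u
#define EXAMPLE_UID_MAPPED_ROOT 2147483646u /* INT32_MAX-1, as named in the commit message */
#define EXAMPLE_OVERFLOW_UID 65534u         /* common default for "nobody" */

/* One extent of the toy mount mapping: `count` UIDs starting at `backing`
 * on the backing fs appear as UIDs starting at `mapped` on the uidmapped fs. */
typedef struct {
        uint32_t backing;
        uint32_t mapped;
        uint32_t count;
} MapEntry;

/* Builds the mount mapping into out[]. Without host_root this is the old
 * single-entry mapping; with host_root it gains the extra entry for host
 * UID 0. Returns the number of entries written (1 or 2). */
static size_t make_mount_mapping(uint32_t shift, bool host_root, MapEntry out[2]) {
        assert(out);

        out[0] = (MapEntry) { .backing = 0, .mapped = shift, .count = EXAMPLE_UID_RANGE };
        if (!host_root)
                return 1;

        out[1] = (MapEntry) { .backing = EXAMPLE_UID_MAPPED_ROOT, .mapped = 0, .count = 1 };
        return 2;
}

/* Translates a UID as seen on the uidmapped fs into the UID that would be
 * stored on the backing fs. Returns false if the UID is unmapped, which is
 * the case where creating an inode would fail with EOVERFLOW. */
static bool mount_to_backing(const MapEntry *map, size_t n, uint32_t uid, uint32_t *ret) {
        assert(map || n == 0);

        for (size_t i = 0; i < n; i++)
                if (uid >= map[i].mapped && uid - map[i].mapped < map[i].count) {
                        if (ret)
                                *ret = map[i].backing + (uid - map[i].mapped);
                        return true;
                }

        return false;
}

/* Process mapping: host UIDs shift…shift+range-1 appear as 0…range-1 in the
 * container; everything else shows up as the overflow UID. */
static uint32_t host_to_container(uint32_t shift, uint32_t range, uint32_t host_uid) {
        if (host_uid >= shift && host_uid - shift < range)
                return host_uid - shift;

        return EXAMPLE_OVERFLOW_UID;
}
```

In this model, looking up host UID 0 in the single-entry mapping fails, which corresponds to the EOVERFLOW case from issue #20989. With the extra entry, host UID 0 resolves to 2147483646 on the backing fs, while host_to_container() still reports host UID 0 as the overflow UID, because the process mapping deliberately has no matching entry.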
author: Lennart Poettering <lennart@poettering.net> 2022-03-17 13:46:12 +0100
committer: Lennart Poettering <lennart@poettering.net> 2022-03-17 19:08:12 +0100
commit: 50ae2966d20b0b4a19def060de3b966b7a70b54a (patch)
tree: d0c072dfc682f5d2e39439d8b664c76a359eba37 /src/nspawn/nspawn-mount.c
parent: 264caae299aa8f42f20460ad3280add657a3747f (diff)
download: systemd-50ae2966d20b0b4a19def060de3b966b7a70b54a.tar.gz
1 files changed, 1 insertions, 1 deletions
diff --git a/src/nspawn/nspawn-mount.c b/src/nspawn/nspawn-mount.c
index 40773d90c1..f2fad0f462 100644
--- a/src/nspawn/nspawn-mount.c
+++ b/src/nspawn/nspawn-mount.c
@@ -780,7 +780,7 @@ static int mount_bind(const char *dest, CustomMount *m, uid_t uid_shift, uid_t u
         }
 
         if (idmapped) {
-                r = remount_idmap(where, uid_shift, uid_range);
+                r = remount_idmap(where, uid_shift, uid_range, REMOUNT_IDMAP_HOST_ROOT);
                 if (r < 0)
                         return log_error_errno(r, "Failed to map ids for bind mount %s: %m", where);
         }