summaryrefslogtreecommitdiff
path: root/fs/overlayfs
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'work.xattr' of ↵Linus Torvalds2016-10-104-13/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs xattr updates from Al Viro: "xattr stuff from Andreas This completes the switch to xattr_handler ->get()/->set() from ->getxattr/->setxattr/->removexattr" * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Remove {get,set,remove}xattr inode operations xattr: Stop calling {get,set,remove}xattr inode operations vfs: Check for the IOP_XATTR flag in listxattr xattr: Add __vfs_{get,set,remove}xattr helpers libfs: Use IOP_XATTR flag for empty directory handling vfs: Use IOP_XATTR flag for bad-inode handling vfs: Add IOP_XATTR inode operations flag vfs: Move xattr_resolve_name to the front of fs/xattr.c ecryptfs: Switch to generic xattr handlers sockfs: Get rid of getxattr iop sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names kernfs: Switch to generic xattr handlers hfs: Switch to generic xattr handlers jffs2: Remove jffs2_{get,set,remove}xattr macros xattr: Remove unnecessary NULL attribute name check
| * vfs: Remove {get,set,remove}xattr inode operationsAndreas Gruenbacher2016-10-072-9/+0
| | | | | | | | | | | | | | These inode operations are no longer used; remove them. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * xattr: Add __vfs_{get,set,remove}xattr helpersAndreas Gruenbacher2016-10-072-4/+4
| | | | | | | | | | | | | | | | | | | | | | Right now, various places in the kernel check for the existence of getxattr, setxattr, and removexattr inode operations and directly call those operations. Switch to helper functions and test for the IOP_XATTR flag instead. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge branch 'work.misc' of ↵Linus Torvalds2016-10-102-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "Assorted misc bits and pieces. There are several single-topic branches left after this (rename2 series from Miklos, current_time series from Deepa Dinamani, xattr series from Andreas, uaccess stuff from from me) and I'd prefer to send those separately" * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits) proc: switch auxv to use of __mem_open() hpfs: support FIEMAP cifs: get rid of unused arguments of CIFSSMBWrite() posix_acl: uapi header split posix_acl: xattr representation cleanups fs/aio.c: eliminate redundant loads in put_aio_ring_file fs/internal.h: add const to ns_dentry_operations declaration compat: remove compat_printk() fs/buffer.c: make __getblk_slow() static proc: unsigned file descriptors fs/file: more unsigned file descriptors fs: compat: remove redundant check of nr_segs cachefiles: Fix attempt to read i_blocks after deleting file [ver #2] cifs: don't use memcpy() to copy struct iov_iter get rid of separate multipage fault-in primitives fs: Avoid premature clearing of capabilities fs: Give dentry to inode_change_ok() instead of inode fuse: Propagate dentry down to inode_change_ok() ceph: Propagate dentry down to inode_change_ok() xfs: Propagate dentry down to inode_change_ok() ...
| * \ Merge remote-tracking branch 'jk/vfs' into work.miscAl Viro2016-10-081-1/+1
| |\ \
| | * | fs: Give dentry to inode_change_ok() instead of inodeJan Kara2016-09-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | inode_change_ok() will be resposible for clearing capabilities and IMA extended attributes and as such will need dentry. Give it as an argument to inode_change_ok() instead of an inode. Also rename inode_change_ok() to setattr_prepare() to better relect that it does also some modifications in addition to checks. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz>
| * | | locks: fix file locking on overlayfsMiklos Szeredi2016-09-161-1/+1
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch allows flock, posix locks, ofd locks and leases to work correctly on overlayfs. Instead of using the underlying inode for storing lock context use the overlay inode. This allows locks to be persistent across copy-up. This is done by introducing locks_inode() helper and using it instead of file_inode() to get the inode in locking code. For non-overlayfs the two are equivalent, except for an extra pointer dereference in locks_inode(). Since lock operations are in "struct file_operations" we must also make sure not to call underlying filesystem's lock operations. Introcude a super block flag MS_NOREMOTELOCK to this effect. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: Jeff Layton <jlayton@poochiereds.net> Cc: "J. Bruce Fields" <bfields@fieldses.org>
* | | Merge branch 'next' of ↵Linus Torvalds2016-10-042-0/+32
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: SELinux/LSM: - overlayfs support, necessary for container filesystems LSM: - finally remove the kernel_module_from_file hook Smack: - treat signal delivery as an 'append' operation TPM: - lots of bugfixes & updates Audit: - new audit data type: LSM_AUDIT_DATA_FILE * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (47 commits) Revert "tpm/tpm_crb: implement tpm crb idle state" Revert "tmp/tpm_crb: fix Intel PTT hw bug during idle state" Revert "tpm/tpm_crb: open code the crb_init into acpi_add" Revert "tmp/tpm_crb: implement runtime pm for tpm_crb" lsm,audit,selinux: Introduce a new audit data type LSM_AUDIT_DATA_FILE tmp/tpm_crb: implement runtime pm for tpm_crb tpm/tpm_crb: open code the crb_init into acpi_add tmp/tpm_crb: fix Intel PTT hw bug during idle state tpm/tpm_crb: implement tpm crb idle state tpm: add check for minimum buffer size in tpm_transmit() tpm: constify TPM 1.x header structures tpm/tpm_crb: fix the over 80 characters checkpatch warring tpm/tpm_crb: drop useless cpu_to_le32 when writing to registers tpm/tpm_crb: cache cmd_size register value. tmp/tpm_crb: drop include to platform_device tpm/tpm_tis: remove unused itpm variable tpm_crb: fix incorrect values of cmdReady and goIdle bits tpm_crb: refine the naming of constants tpm_crb: remove wmb()'s tpm_crb: fix crb_req_canceled behavior ...
| * | security, overlayfs: Provide hook to correctly label newly created filesVivek Goyal2016-08-081-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During a new file creation we need to make sure new file is created with the right label. New file is created in upper/ so effectively file should get label as if task had created file in upper/. We switched to mounter's creds for actual file creation. Also if there is a whiteout present, then file will be created in work/ dir first and then renamed in upper. In none of the cases file will be labeled as we want it to be. This patch introduces a new hook dentry_create_files_as(), which determines the label/context dentry will get if it had been created by task in upper and modify passed set of creds appropriately. Caller makes use of these new creds for file creation. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: fix whitespace issues found with checkpatch.pl] [PM: changes to use stat->mode in ovl_create_or_link()] Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | security,overlayfs: Provide security hook for copy up of xattrs for overlay fileVivek Goyal2016-08-081-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide a security hook which is called when xattrs of a file are being copied up. This hook is called once for each xattr and LSM can return 0 if the security module wants the xattr to be copied up, 1 if the security module wants the xattr to be discarded on the copy, -EOPNOTSUPP if the security module does not handle/manage the xattr, or a -errno upon an error. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: whitespace cleanup for checkpatch.pl] Signed-off-by: Paul Moore <paul@paul-moore.com>
| * | security, overlayfs: provide copy up security hook for unioned filesVivek Goyal2016-08-081-0/+15
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide a security hook to label new file correctly when a file is copied up from lower layer to upper layer of a overlay/union mount. This hook can prepare a new set of creds which are suitable for new file creation during copy up. Caller will use new creds to create file and then revert back to old creds and release new creds. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> [PM: whitespace cleanup to appease checkpatch.pl] Signed-off-by: Paul Moore <paul@paul-moore.com>
* | ovl: fix workdir creationMiklos Szeredi2016-09-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Workdir creation fails in latest kernel. Fix by allowing EOPNOTSUPP as a valid return value from vfs_removexattr(XATTR_NAME_POSIX_ACL_*). Upper filesystem may not support ACL and still be perfectly able to support overlayfs. Reported-by: Martin Ziegler <ziegler@uni-freiburg.de> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: c11b9fdd6a61 ("ovl: remove posix_acl_default from workdir") Cc: <stable@vger.kernel.org>
* | ovl: listxattr: use strnlen()Miklos Szeredi2016-09-011-7/+10
| | | | | | | | | | | | | | | | | | Be defensive about what underlying fs provides us in the returned xattr list buffer. If it's not properly null terminated, bail out with a warning insead of BUG. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
* | ovl: Switch to generic_getxattrAndreas Gruenbacher2016-09-014-10/+33
| | | | | | | | | | | | | | | | Now that overlayfs has xattr handlers for iop->{set,remove}xattr, use those same handlers for iop->getxattr as well. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: copyattr after setting POSIX ACLMiklos Szeredi2016-09-011-1/+5
| | | | | | | | | | | | | | | | | | Setting POSIX acl may also modify the file mode, so need to copy that up to the overlay inode. Reported-by: Eryu Guan <eguan@redhat.com> Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: Switch to generic_removexattrAndreas Gruenbacher2016-09-014-58/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit d837a49bd57f ("ovl: fix POSIX ACL setting") switches from iop->setxattr from ovl_setxattr to generic_setxattr, so switch from ovl_removexattr to generic_removexattr as well. As far as permission checking goes, the same rules should apply in either case. While doing that, rename ovl_setxattr to ovl_xattr_set to indicate that this is not an iop->setxattr implementation and remove the unused inode argument. Move ovl_other_xattr_set above ovl_own_xattr_set so that they match the order of handlers in ovl_xattr_handlers. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: Get rid of ovl_xattr_noacl_handlers arrayAndreas Gruenbacher2016-09-011-16/+12
| | | | | | | | | | | | | | | | | | Use an ordinary #ifdef to conditionally include the POSIX ACL handlers in ovl_xattr_handlers, like the other filesystems do. Flag the code that is now only used conditionally with __maybe_unused. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: Fix OVL_XATTR_PREFIXAndreas Gruenbacher2016-09-012-5/+4
| | | | | | | | | | | | | | | | | | Make sure ovl_own_xattr_handler only matches attribute names starting with "overlay.", not "overlayXXX". Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Fixes: d837a49bd57f ("ovl: fix POSIX ACL setting") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: fix spelling mistake: "directries" -> "directories"Colin Ian King2016-09-011-1/+1
| | | | | | | | | | | | | | Trivial fix to spelling mistake in pr_err message. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: don't cache acl on overlay layerMiklos Szeredi2016-09-011-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some operations (setxattr/chmod) can make the cached acl stale. We either need to clear overlay's acl cache for the affected inode or prevent acl caching on the overlay altogether. Preventing caching has the following advantages: - no double caching, less memory used - overlay cache doesn't go stale when fs clears it's own cache Possible disadvantage is performance loss. If that becomes a problem get_acl() can be optimized for overlayfs. This patch disables caching by pre setting i_*acl to a value that - has bit 0 set, so is_uncached_acl() will return true - is not equal to ACL_NOT_CACHED, so get_acl() will not overwrite it The constant -3 was chosen for this purpose. Fixes: 39a25b2b3762 ("ovl: define ->get_acl() for overlay inodes") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: use cached acl on underlying layerMiklos Szeredi2016-09-011-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Instead of calling ->get_acl() directly, use get_acl() to get the cached value. We will have the acl cached on the underlying inode anyway, because we do permission checking on the both the overlay and the underlying fs. So, since we already have double caching, this improves performance without any cost. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: proper cleanup of workdirMiklos Szeredi2016-09-013-2/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When mounting overlayfs it needs a clean "work" directory under the supplied workdir. Previously the mount code removed this directory if it already existed and created a new one. If the removal failed (e.g. directory was not empty) then it fell back to a read-only mount not using the workdir. While this has never been reported, it is possible to get a non-empty "work" dir from a previous mount of overlayfs in case of crash in the middle of an operation using the work directory. In this case the left over state should be discarded and the overlay filesystem will be consistent, guaranteed by the atomicity of operations on moving to/from the workdir to the upper layer. This patch implements cleaning out any files left in workdir. It is implemented using real recursion for simplicity, but the depth is limited to 2, because the worst case is that of a directory containing whiteouts under "work". Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
* | ovl: remove posix_acl_default from workdirMiklos Szeredi2016-09-011-0/+19
| | | | | | | | | | | | | | | | Clear out posix acl xattrs on workdir and also reset the mode after creation so that an inherited sgid bit is cleared. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
* | ovl: handle umask and posix_acl_default correctly on creationMiklos Szeredi2016-09-011-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Setting MS_POSIXACL in sb->s_flags has the side effect of passing mode to create functions without masking against umask. Another problem when creating over a whiteout is that the default posix acl is not inherited from the parent dir (because the real parent dir at the time of creation is the work directory). Fix these problems by: a) If upper fs does not have MS_POSIXACL, then mask mode with umask. b) If creating over a whiteout, call posix_acl_create() to get the inherited acls. After creation (but before moving to the final destination) set these acls on the created file. posix_acl_create() also updates the file creation mode as appropriate. Fixes: 39a25b2b3762 ("ovl: define ->get_acl() for overlay inodes") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* | ovl: don't copy up opaquenessMiklos Szeredi2016-08-083-1/+4
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a copy up of a directory occurs which has the opaque xattr set, the xattr remains in the upper directory. The immediate behavior with overlayfs is that the upper directory is not treated as opaque, however after a remount the opaque flag is used and upper directory is treated as opaque. This causes files created in the lower layer to be hidden when using multiple lower directories. Fix by not copying up the opaque flag. To reproduce: ----8<---------8<---------8<---------8<---------8<---------8<---- mkdir -p l/d/s u v w mnt mount -t overlay overlay -olowerdir=l,upperdir=u,workdir=w mnt rm -rf mnt/d/ mkdir -p mnt/d/n umount mnt mount -t overlay overlay -olowerdir=u:l,upperdir=v,workdir=w mnt touch mnt/d/foo umount mnt mount -t overlay overlay -olowerdir=u:l,upperdir=v,workdir=w mnt ls mnt/d ----8<---------8<---------8<---------8<---------8<---------8<---- output should be: "foo n" Reported-by: Derek McGowan <dmcg@drizz.net> Link: https://bugzilla.kernel.org/show_bug.cgi?id=151291 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
* ovl: simplify empty checkingMiklos Szeredi2016-07-291-29/+21
| | | | | | | | | | | | | | | The empty checking logic is duplicated in ovl_check_empty_and_clear() and ovl_remove_and_whiteout(), except the condition for clearing whiteouts is different: ovl_check_empty_and_clear() checked for being upper ovl_remove_and_whiteout() checked for merge OR lower Move the intersection of those checks (upper AND merge) into ovl_check_empty_and_clear() and simplify ovl_remove_and_whiteout(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* qstr: constify instances in overlayfsAl Viro2016-07-291-1/+1
| | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: clear nlink on rmdirMiklos Szeredi2016-07-291-2/+6
| | | | | | To make delete notification work on fa/inotify. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: disallow overlayfs as upperdirMiklos Szeredi2016-07-291-1/+2
| | | | | | | | | This does not work and does not make sense. So instead of fixing it (probably not hard) just disallow. Reported-by: Andrei Vagin <avagin@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: <stable@vger.kernel.org>
* ovl: fix warningMiklos Szeredi2016-07-291-1/+1
| | | | | | There's a superfluous newline in the warning message in ovl_d_real(). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: remove duplicated include from super.cWei Yongjun2016-07-291-1/+0
| | | | | | | Remove duplicated include. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: append MAY_READ when diluting write checksVivek Goyal2016-07-291-1/+4
| | | | | | | | | | | | | | | | | | | Right now we remove MAY_WRITE/MAY_APPEND bits from mask if realfile is on lower/. This is done as files on lower will never be written and will be copied up. But to copy up a file, mounter should have MAY_READ permission otherwise copy up will fail. So set MAY_READ in mask when MAY_WRITE is reset. Dan Walsh noticed this when he did access(lowerfile, W_OK) and it returned True (context mounts) but when he tried to actually write to file, it failed as mounter did not have permission on lower file. [SzM] don't set MAY_READ if only MAY_APPEND is set without MAY_WRITE; this won't trigger a copy-up. Reported-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: dilute permission checks on lower only if not special fileVivek Goyal2016-07-291-1/+1
| | | | | | | | | | | Right now if file is on lower/, we remove MAY_WRITE/MAY_APPEND bits from mask as lower/ will never be written and file will be copied up. But this is not true for special files. These files are not copied up and are opened in place. So don't dilute the checks for these types of files. Reported-by: Dan Walsh <dwalsh@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: fix POSIX ACL settingMiklos Szeredi2016-07-294-11/+104
| | | | | | | | | | | | | | | | | | | | | | | | | Setting POSIX ACL needs special handling: 1) Some permission checks are done by ->setxattr() which now uses mounter's creds ("ovl: do operations on underlying file system in mounter's context"). These permission checks need to be done with current cred as well. 2) Setting ACL can fail for various reasons. We do not need to copy up in these cases. In the mean time switch to using generic_setxattr. [Arnd Bergmann] Fix link error without POSIX ACL. posix_acl_from_xattr() doesn't have a 'static inline' implementation when CONFIG_FS_POSIX_ACL is disabled, and I could not come up with an obvious way to do it. This instead avoids the link error by defining two sets of ACL operations and letting the compiler drop one of the two at compile time depending on CONFIG_FS_POSIX_ACL. This avoids all references to the ACL code, also leading to smaller code. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: share inode for hard linkMiklos Szeredi2016-07-294-48/+100
| | | | | | | | | | | | | | | | | | | Inode attributes are copied up to overlay inode (uid, gid, mode, atime, mtime, ctime) so generic code using these fields works correcty. If a hard link is created in overlayfs separate inodes are allocated for each link. If chmod/chown/etc. is performed on one of the links then the inode belonging to the other ones won't be updated. This patch attempts to fix this by sharing inodes for hard links. Use inode hash (with real inode pointer as a key) to make sure overlay inodes are shared for hard links on upper. Hard links on lower are still split (which is not user observable until the copy-up happens, see Documentation/filesystems/overlayfs.txt under "Non-standard behavior"). The inode is only inserted in the hash if it is non-directoy and upper. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: store real inode pointer in ->i_privateMiklos Szeredi2016-07-295-37/+40
| | | | | | | | | | | | | | | | To get from overlay inode to real inode we currently use 'struct ovl_entry', which has lifetime connected to overlay dentry. This is okay, since each overlay dentry had a new overlay inode allocated. Following patch will break that assumption, so need to leave out ovl_entry. This patch stores the real inode directly in i_private, with the lowest bit used to indicate whether the inode is upper or lower. Lifetime rules remain, using ovl_inode_real() must only be done while caller holds ref on overlay dentry (and hence on real dentry), or within RCU protected regions. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: permission: return ECHILD instead of ENOENTMiklos Szeredi2016-07-291-1/+1
| | | | | | The error is due to RCU and is temporary. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: update atime on upperMiklos Szeredi2016-07-294-5/+37
| | | | | | | | | | | | | | | | | Fix atime update logic in overlayfs. This patch adds an i_op->update_time() handler to overlayfs inodes. This forwards atime updates to the upper layer only. No atime updates are done on lower layers. Remove implicit atime updates to underlying files and directories with O_NOATIME. Remove explicit atime update in ovl_readlink(). Clear atime related mnt flags from cloned upper mount. This means atime updates are controlled purely by overlayfs mount options. Reported-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: fix sgid on directoryMiklos Szeredi2016-07-291-4/+27
| | | | | | | | | | | | | When creating directory in workdir, the group/sgid inheritance from the parent dir was omitted completely. Fix this by calling inode_init_owner() on overlay inode and using the resulting uid/gid/mode to create the file. Unfortunately the sgid bit can be stripped off due to umask, so need to reset the mode in this case in workdir before moving the directory in place. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: simplify permission checkingMiklos Szeredi2016-07-293-53/+1
| | | | | | | | | | | | | | | | | | The fact that we always do permission checking on the overlay inode and clear MAY_WRITE for checking access to the lower inode allows cruft to be removed from ovl_permission(). 1) "default_permissions" option effectively did generic_permission() on the overlay inode with i_mode, i_uid and i_gid updated from underlying filesystem. This is what we do by default now. It did the update using vfs_getattr() but that's only needed if the underlying filesystem can change (which is not allowed). We may later introduce a "paranoia_mode" that verifies that mode/uid/gid are not changed. 2) splitting out the IS_RDONLY() check from inode_permission() also becomes unnecessary once we remove the MAY_WRITE from the lower inode check. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: do not require mounter to have MAY_WRITE on lowerVivek Goyal2016-07-291-0/+2
| | | | | | | | | | | | | | | | | | | | | | Now we have two levels of checks in ovl_permission(). overlay inode is checked with the creds of task while underlying inode is checked with the creds of mounter. Looks like mounter does not have to have WRITE access to files on lower/. So remove the MAY_WRITE from access mask for checks on underlying lower inode. This means task should still have the MAY_WRITE permission on lower inode and mounter is not required to have MAY_WRITE. It also solves the problem of read only NFS mounts being used as lower. If __inode_permission(lower_inode, MAY_WRITE) is called on read only NFS, it fails. By resetting MAY_WRITE, check succeeds and case of read only NFS shold work with overlay without having to specify any special mount options (default permission). Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: do operations on underlying file system in mounter's contextVivek Goyal2016-07-292-37/+72
| | | | | | | | | | | | Given we are now doing checks both on overlay inode as well underlying inode, we should be able to do checks and operations on underlying file system using mounter's context. So modify all operations to do checks/operations on underlying dentry/inode in the context of mounter. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: modify ovl_permission() to do checks on two inodesVivek Goyal2016-07-291-4/+14
| | | | | | | | | | | | Right now ovl_permission() calls __inode_permission(realinode), to do permission checks on real inode and no checks are done on overlay inode. Modify it to do checks both on overlay inode as well as underlying inode. Checks on overlay inode will be done with the creds of calling task while checks on underlying inode will be done with the creds of mounter. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: define ->get_acl() for overlay inodesVivek Goyal2016-07-294-0/+28
| | | | | | | | | | Now we are planning to do DAC permission checks on overlay inode itself. And to make it work, we will need to make sure we can get acls from underlying inode. So define ->get_acl() for overlay inodes and this in turn calls into underlying filesystem to get acls, if any. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: move some common code in a functionVivek Goyal2016-07-291-8/+12
| | | | | | | | | ovl_create_upper() and ovl_create_over_whiteout() seem to be sharing some common code which can be moved into a separate function. No functionality change. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: store ovl_entry in inode->i_private for all inodesAndreas Gruenbacher2016-07-291-37/+11
| | | | | | | | | Previously this was only done for directory inodes. Doing so for all inodes makes for a nice cleanup in ovl_permission at zero cost. Inodes are not shared for hard links on the overlay, so this works fine. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: use generic_delete_inodeMiklos Szeredi2016-07-291-0/+1
| | | | | | No point in keeping overlay inodes around since they will never be reused. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
* ovl: check mounter creds on underlying lookupMiklos Szeredi2016-07-291-4/+9
| | | | | | | | | | | | | | | | The hash salting changes meant that we can no longer reuse the hash in the overlay dentry to look up the underlying dentry. Instead of lookup_hash(), use lookup_one_len_unlocked() and swith to mounter's creds (like we do for all other operations later in the series). Now the lookup_hash() export introduced in 4.6 by 3c9fe8cdff1b ("vfs: add lookup_hash() helper") is unused and can possibly be removed; its usefulness negated by the hash salting and the idea that mounter's creds should be used on operations on underlying filesystems. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 8387ff2577eb ("vfs: make the string hashes salt the hash")
* Merge branch 'd_real' into overlayfs-nextMiklos Szeredi2016-07-273-28/+25
|\
| * vfs: merge .d_select_inode() into .d_real()Miklos Szeredi2016-06-303-28/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The two methods essentially do the same: find the real dentry/inode belonging to an overlay dentry. The difference is in the usage: vfs_open() uses ->d_select_inode() and expects the function to perform copy-up if necessary based on the open flags argument. file_dentry() uses ->d_real() passing in the overlay dentry as well as the underlying inode. vfs_rename() uses ->d_select_inode() but passes zero flags. ->d_real() with a zero inode would have worked just as well here. This patch merges the functionality of ->d_select_inode() into ->d_real() by adding an 'open_flags' argument to the latter. [Al Viro] Make the signature of d_real() match that of ->d_real() again. And constify the inode argument, while we are at it. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>