git.kernel.dk Git - linux-block.git/commit

author	Miklos Szeredi <mszeredi@redhat.com>
	Thu, 14 May 2020 14:44:24 +0000 (16:44 +0200)
committer	Miklos Szeredi <mszeredi@redhat.com>
	Thu, 14 May 2020 14:44:24 +0000 (16:44 +0200)
commit	9f6c61f96f2d97cbb5f7fa85607bc398f843ff0f
tree	74ef0bbc114168317f36e81602351d1923a5c605
parent	530f32fc370fd1431ea9802dbc53ab5601dfccdb

proc/mounts: add cursor

If mounts are deleted after a read(2) call on /proc/self/mounts (or its
kin), the subsequent read(2) could miss a mount that comes after the
deleted one in the list.  This is because the file position is interpreted
as the number of mount entries from the start of the list.

E.g. first read gets entries #0 to #9; the seq file index will be 10.  Then
entry #5 is deleted, resulting in #10 becoming #9 and #11 becoming #10,
etc...  The next read will continue from entry #10, and #9 is missed.

Solve this by adding a cursor entry for each open instance.  Taking the
global namespace_sem for write seems excessive, since we are only dealing
with a per-namespace list.  Instead add a per-namespace spinlock and use
that together with namespace_sem taken for read to protect against
concurrent modification of the mount list.  This may reduce parallelism of
is_local_mountpoint(), but it's hardly a big contention point.  We could
also use RCU freeing of cursors to make traversal not need additional
locks, if that turns out to be necessary.

Only move the cursor once for each read (cursor is not added on open) to
minimize cacheline invalidation.  When EOF is reached, the cursor is taken
off the list, in order to prevent an excessive number of cursors due to
inactive open file descriptors.
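
The cursor idea can be sketched in user space as follows.  This is a
simplified model under stated assumptions, not the code from this patch: the
names node, reader, cursor_read and demo_no_missed_entry are hypothetical, the
list is a plain circular doubly linked list, and all locking (the
per-namespace spinlock and namespace_sem) is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One list element: either a real entry or a per-reader cursor. */
struct node {
	struct node *prev, *next;
	int id;			/* entry ID; unused for cursors */
	bool is_cursor;
};

/* Per-open-file state (hypothetical): the cursor plus an EOF flag. */
struct reader {
	struct node cursor;
	bool eof;
};

static void list_init(struct node *n) { n->prev = n->next = n; }

static void insert_before(struct node *n, struct node *pos)
{
	n->prev = pos->prev;
	n->next = pos;
	pos->prev->next = n;
	pos->prev = n;
}

static void unlink_node(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Read up to max entries, moving the cursor once per call. */
static size_t cursor_read(struct node *head, struct reader *r,
			  int *out, size_t max)
{
	struct node *p;
	size_t n = 0;

	if (r->eof)
		return 0;
	if (r->cursor.next != &r->cursor) {	/* cursor is on the list */
		p = r->cursor.next;
		unlink_node(&r->cursor);
	} else {				/* first read: start at head */
		p = head->next;
	}
	while (p != head && n < max) {
		if (!p->is_cursor)		/* skip other readers' cursors */
			out[n++] = p->id;
		p = p->next;
	}
	if (p != head)
		insert_before(&r->cursor, p);	/* park before next unread entry */
	else
		r->eof = true;			/* EOF: cursor stays off the list */
	return n;
}

/* Returns the ID of the first entry seen by the second read. */
static int demo_no_missed_entry(void)
{
	struct node head, entries[12];
	struct reader r = { .eof = false };
	int buf[16];

	list_init(&head);
	list_init(&r.cursor);
	r.cursor.is_cursor = true;
	for (int i = 0; i < 12; i++) {
		entries[i].id = i;
		entries[i].is_cursor = false;
		insert_before(&entries[i], &head);	/* append at tail */
	}
	cursor_read(&head, &r, buf, 10);	/* #0..#9; cursor before #10 */
	unlink_node(&entries[5]);		/* delete entry #5 */
	cursor_read(&head, &r, buf, 10);	/* resumes at #10 */
	return buf[0];
}
```

Because the cursor is a list element rather than a count, deleting entry #5
does not change where the second read resumes: it continues with #10.  The
cursor is linked in only after the first read and is left unlinked once the
end of the list is reached, mirroring the behaviour described above.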

Reported-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

fs/mount.h
fs/namespace.c
fs/proc_namespace.c
include/linux/mount.h