Reimplement RLIMIT_NPROC on top of ucounts
authorAlexey Gladkov <legion@kernel.org>
Thu, 22 Apr 2021 12:27:11 +0000 (14:27 +0200)
committerEric W. Biederman <ebiederm@xmission.com>
Fri, 30 Apr 2021 19:14:01 +0000 (14:14 -0500)
commit21d1c5e386bc751f1953b371d72cd5b7d9c9e270
tree4f7e6527540772494c0eb4a7d323d95a1bdade98
parentb6c336528926ef73b0f70260f2636de2c3b94c14
Reimplement RLIMIT_NPROC on top of ucounts

The rlimit counter is tied to uid in the user_namespace. This allows
rlimit values to be specified in userns even if they are already
globally exceeded by the user. However, the value of the previous
user_namespaces cannot be exceeded.

To illustrate the impact of rlimits, let's say there is a program that
does not fork. Some service-A wants to run this program as user X in
multiple containers. Since the program never fork the service wants to
set RLIMIT_NPROC=1.

service-A
 \- program (uid=1000, container1, rlimit_nproc=1)
 \- program (uid=1000, container2, rlimit_nproc=1)

The service-A sets RLIMIT_NPROC=1 and runs the program in container1.
When the service-A tries to run a program with RLIMIT_NPROC=1 in
container2 it fails since user X already has one running process.

We cannot use existing inc_ucounts / dec_ucounts because they do not
allow us to exceed the maximum for the counter. Some rlimits can be
overlimited by root or if the user has the appropriate capability.

Changelog

v11:
* Change inc_rlimit_ucounts() which now returns top value of ucounts.
* Drop inc_rlimit_ucounts_and_test() because the return code of
  inc_rlimit_ucounts() can be checked.

Signed-off-by: Alexey Gladkov <legion@kernel.org>
Link: https://lkml.kernel.org/r/c5286a8aa16d2d698c222f7532f3d735c82bc6bc.1619094428.git.legion@kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
fs/exec.c
include/linux/cred.h
include/linux/sched/user.h
include/linux/user_namespace.h
kernel/cred.c
kernel/exit.c
kernel/fork.c
kernel/sys.c
kernel/ucount.c
kernel/user.c
kernel/user_namespace.c