======================
No New Privileges Flag
======================

The execve system call can grant a newly-started program privileges that
its parent did not have. The most obvious examples are setuid/setgid
programs and file capabilities. To prevent the parent program from
gaining these privileges as well, the kernel and user code must be
careful to prevent the parent from doing anything that could subvert the
child. For example:

- The dynamic loader handles ``LD_*`` environment variables differently if
  a program is setuid.

- chroot is disallowed to unprivileged processes, since it would allow
  ``/etc/passwd`` to be replaced from the point of view of a process that
  inherited chroot.

- The exec code has special handling for ptrace.

These are all ad-hoc fixes. The ``no_new_privs`` bit (since Linux 3.5) is a
new, generic mechanism to make it safe for a process to modify its
execution environment in a manner that persists across execve. Any task
can set ``no_new_privs``. Once the bit is set, it is inherited across fork,
clone, and execve and cannot be unset. With ``no_new_privs`` set, ``execve()``
promises not to grant the privilege to do anything that could not have
been done without the execve call. For example, the setuid and setgid
bits will no longer change the uid or gid; file capabilities will not
add to the permitted set, and LSMs will not relax constraints after
execve.

To set ``no_new_privs``, use::

  prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);

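The call above, together with the "inherited across fork" and "cannot be
unset" properties, can be exercised with a minimal sketch like the following.
The helper names (``nnp_enable``, ``nnp_query``, ``nnp_try_clear``,
``nnp_query_in_child``) are hypothetical and exist only for illustration; the
``prctl`` options themselves are the real ones, with fallback definitions in
case the installed userspace headers predate them.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fallbacks for old userspace headers; values match <linux/prctl.h>. */
#ifndef PR_SET_NO_NEW_PRIVS
#define PR_SET_NO_NEW_PRIVS 38
#endif
#ifndef PR_GET_NO_NEW_PRIVS
#define PR_GET_NO_NEW_PRIVS 39
#endif

/* Hypothetical helper: set the bit on the calling thread.
 * Returns 0 on success, -1 with errno set on failure. */
static int nnp_enable(void)
{
	return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}

/* Hypothetical helper: returns 1 if the bit is set, 0 if it is clear,
 * -1 on error. */
static int nnp_query(void)
{
	return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}

/* Hypothetical helper: attempt to clear the bit. The kernel rejects any
 * second argument other than 1, so this is expected to fail with EINVAL.
 * Returns the errno of the failed call, or 0 if it unexpectedly succeeded. */
static int nnp_try_clear(void)
{
	if (prctl(PR_SET_NO_NEW_PRIVS, 0, 0, 0, 0) == 0)
		return 0;
	return errno;
}

/* Hypothetical helper: fork a child and report the bit as the child
 * sees it (1 or 0), or -1 on error. The child passes the value back
 * through its exit status. */
static int nnp_query_in_child(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		int v = nnp_query();

		_exit(v < 0 ? 2 : v);
	}
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status) <= 1 ? WEXITSTATUS(status) : -1;
}
```

A caller would typically invoke ``nnp_enable()`` once, early, and treat a
nonzero return as fatal. Because the bit cannot be cleared afterwards, it is
best set only in the process that is meant to be confined.
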
Be careful, though: LSMs might also not tighten constraints on exec
in ``no_new_privs`` mode. (This means that setting up a general-purpose
service launcher to set ``no_new_privs`` before execing daemons may
interfere with LSM-based sandboxing.)

Note that ``no_new_privs`` does not prevent privilege changes that do not
involve ``execve()``. An appropriately privileged task can still call
``setuid(2)`` and receive SCM_RIGHTS datagrams.

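As an illustration of the SCM_RIGHTS point, the sketch below sets
``no_new_privs`` and then passes the write end of a pipe across a Unix
socket pair inside one process. All function names are hypothetical, and
using a single process with ``socketpair(2)`` is an assumption made to keep
the example short; in practice the sender would be a separate, possibly more
privileged, task.

```c
#include <string.h>
#include <sys/prctl.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Ancillary-data buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Hypothetical helper: send fd over sock as an SCM_RIGHTS message.
 * Returns 0 on success, -1 on failure. */
static int send_fd(int sock, int fd)
{
	char byte = 0;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor from sock.
 * Returns the new descriptor, or -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Hypothetical driver: with no_new_privs set, receive a pipe's write end
 * via SCM_RIGHTS and check that it is usable. Returns 0 on success. */
static int fd_passing_with_no_new_privs(void)
{
	int sv[2], pfd[2], got, ret = -1;
	char c = 0;

	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0)
		return -1;
	if (pipe(pfd) != 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (send_fd(sv[0], pfd[1]) == 0 && (got = recv_fd(sv[1])) >= 0) {
		/* Write through the received descriptor, read from the pipe. */
		if (write(got, "x", 1) == 1 && read(pfd[0], &c, 1) == 1 &&
		    c == 'x')
			ret = 0;
		close(got);
	}
	close(pfd[0]);
	close(pfd[1]);
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

The design point is that the received descriptor carries whatever access the
sender had when it opened the file; ``no_new_privs`` only constrains what
``execve()`` may grant.
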
There are two main use cases for ``no_new_privs`` so far:

- Filters installed for the seccomp mode 2 sandbox persist across
  execve and can change the behavior of newly-executed programs.
  Unprivileged users are therefore only allowed to install such filters
  if ``no_new_privs`` is set.

- By itself, ``no_new_privs`` can be used to reduce the attack surface
  available to an unprivileged user. If everything running with a
  given uid has ``no_new_privs`` set, then that uid will be unable to
  escalate its privileges by directly attacking setuid, setgid, and
  fcap-using binaries; it will need to compromise something without the
  ``no_new_privs`` bit set first.

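The first use case can be sketched as follows: set ``no_new_privs``, then
install a seccomp mode 2 filter. The function name
``install_allow_all_filter`` is hypothetical, and the one-instruction filter
that allows every system call is a placeholder chosen so the example has no
side effects; a real sandbox would supply an actual policy.

```c
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>

/* Hypothetical helper: set no_new_privs, then install a trivial seccomp
 * filter that allows everything. Returns 0 on success, -1 with errno set
 * on failure. */
static int install_allow_all_filter(void)
{
	/* Placeholder policy: a single "return ALLOW" instruction. */
	struct sock_filter insns[] = {
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = (unsigned short)(sizeof(insns) / sizeof(insns[0])),
		.filter = insns,
	};

	/* Without this call, a task lacking CAP_SYS_ADMIN gets EACCES
	 * from the PR_SET_SECCOMP call below. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
		return -1;

	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

Both the filter and the ``no_new_privs`` bit persist across ``execve()``, so
a launcher can call a function like this immediately before executing the
program it wants to confine.
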
In the future, other potentially dangerous kernel features could become
available to unprivileged tasks if ``no_new_privs`` is set. In principle,
several options to ``unshare(2)`` and ``clone(2)`` would be safe when
``no_new_privs`` is set, and ``no_new_privs`` + ``chroot`` is considerably less
dangerous than chroot by itself.