Commit | Line | Data |
---|---|---|
784c4d8b SB |
1 | |
2 | To support containers, we now allow multiple instances of devpts filesystem, | |
3 | such that indices of ptys allocated in one instance are independent of indices | |
4 | allocated in other instances of devpts. | |
5 | ||
6 | To preserve backward compatibility, this support for multiple instances is | |
7 | enabled only if: | |
8 | ||
9 | - CONFIG_DEVPTS_MULTIPLE_INSTANCES=y, and | |
10 | - '-o newinstance' mount option is specified while mounting devpts | |
11 | ||
12 | IOW, devpts now supports both single-instance and multi-instance semantics. | |
13 | ||
14 | If CONFIG_DEVPTS_MULTIPLE_INSTANCES=n, there is no change in behavior and | |
15 | this is referred to as the "legacy" mode. In this mode, the new mount options | |
16 | (-o newinstance and -o ptmxmode) will be ignored with a 'bogus option' message | |
17 | on console. | |
18 | ||
19 | If CONFIG_DEVPTS_MULTIPLE_INSTANCES=y and devpts is mounted without the | |
20 | 'newinstance' option (as in current start-up scripts) the new mount binds | |
21 | to the initial kernel mount of devpts. This mode is referred to as the | |
22 | 'single-instance' mode and the current, single-instance semantics are | |
23 | preserved, i.e. PTYs are common across the system. | |
24 | ||
25 | The only difference between this single-instance mode and the legacy mode | |
26 | is the presence of a new '/dev/pts/ptmx' node with permissions 0000, which | |
27 | can safely be ignored. | |
28 | ||
29 | If CONFIG_DEVPTS_MULTIPLE_INSTANCES=y and 'newinstance' option is specified, | |
30 | the mount is considered to be in the multi-instance mode and a new instance | |
31 | of the devpts fs is created. Any ptys created in this instance are independent | |
32 | of ptys in other instances of devpts. Like in the single-instance mode, the | |
33 | /dev/pts/ptmx node is present. To effectively use the multi-instance mode, | |
34 | opens of /dev/ptmx must be redirected to '/dev/pts/ptmx' using a symlink or | |
35 | bind-mount. | |
36 | ||
37 | E.g., a container startup script could do the following: | |
38 | ||
39 | $ chmod 0666 /dev/pts/ptmx | |
40 | $ rm /dev/ptmx | |
41 | $ ln -s pts/ptmx /dev/ptmx | |
42 | $ ns_exec -cm /bin/bash | |
43 | ||
44 | # We are now in new container | |
45 | ||
46 | $ umount /dev/pts | |
47 | $ mount -t devpts -o newinstance lxcpts /dev/pts | |
48 | $ sshd -p 1234 | |
49 | ||
50 | where 'ns_exec -cm /bin/bash' calls clone() with CLONE_NEWNS flag and execs | |
51 | /bin/bash in the child process. A pty created by the sshd is not visible in | |
52 | the original mount of /dev/pts. | |
53 | ||
8b253b07 KK |
54 | Total count of pty pairs in all instances is limited by sysctls: |
55 | kernel.pty.max = 4096 - global limit | |
56 | kernel.pty.reserve = 1024 - reserve for initial instance | |
57 | kernel.pty.nr - current count of ptys | |
58 | ||
59 | A per-instance limit can be set by adding the mount option "max=<count>". | |
60 | This feature was added in kernel 3.4 together with sysctl kernel.pty.reserve. | |
61 | In kernels older than 3.4, sysctl kernel.pty.max works as a per-instance limit. | |
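
The remaining room under the global limit can be estimated from these sysctls. The following is a minimal sketch, assuming the sysctls are exposed as the files 'max' and 'nr' under /proc/sys/kernel/pty; the function name pty_headroom and its optional directory argument are hypothetical, introduced only for illustration:

```shell
#!/bin/sh
# pty_headroom [DIR]: print how many more pty pairs fit under the global limit.
# DIR defaults to /proc/sys/kernel/pty (assumed procfs location of the sysctls)
# and must contain the files 'max' and 'nr'.
pty_headroom() {
    dir=${1:-/proc/sys/kernel/pty}
    max=$(cat "$dir/max") || return 1   # kernel.pty.max: global limit
    nr=$(cat "$dir/nr") || return 1     # kernel.pty.nr: current count
    echo $((max - nr))
}
```

This sketch ignores kernel.pty.reserve and any per-instance "max=<count>" option, so an instance other than the initial one may run out of ptys before the printed number reaches zero.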
62 | ||
784c4d8b SB |
63 | User-space changes |
64 | ------------------ | |
65 | ||
66 | In multi-instance mode (i.e. the '-o newinstance' mount option is specified at | |
67 | least once), the following user-space issues should be noted. | |
68 | ||
69 | 1. If the -o newinstance mount option is never used, /dev/pts/ptmx can be ignored | |
70 | and no change is needed to system-startup scripts. | |
71 | ||
72 | 2. To effectively use multi-instance mode (i.e. -o newinstance is specified), | |
73 | administrators or startup scripts should "redirect" open of /dev/ptmx to | |
74 | /dev/pts/ptmx using either a bind mount or symlink. | |
75 | ||
76 | $ mount -t devpts -o newinstance devpts /dev/pts | |
77 | ||
78 | followed by either | |
79 | ||
80 | $ rm /dev/ptmx | |
81 | $ ln -s pts/ptmx /dev/ptmx | |
82 | $ chmod 666 /dev/pts/ptmx | |
83 | or | |
84 | $ mount -o bind /dev/pts/ptmx /dev/ptmx | |
85 | ||
86 | 3. The '/dev/ptmx -> pts/ptmx' symlink is the preferred method since it | |
87 | enables better error-reporting and treats both single-instance and | |
88 | multi-instance mounts similarly. | |
89 | ||
90 | But this method requires that system-startup scripts set the mode of | |
91 | /dev/pts/ptmx correctly (default mode is 0000). The scripts can set the | |
92 | mode by either | |
93 | ||
94 | - adding ptmxmode mount option to devpts entry in /etc/fstab, or | |
95 | - using 'chmod 0666 /dev/pts/ptmx' | |
96 | ||
97 | 4. If a multi-instance mount is needed for containers, but the system | |
98 | startup scripts have not yet been updated, container-startup scripts | |
99 | should bind mount /dev/pts/ptmx onto /dev/ptmx to avoid breaking single- | |
100 | instance mounts. | |
101 | ||
102 | Or, in general, container-startup scripts should use: | |
103 | ||
104 | mount -t devpts -o newinstance -o ptmxmode=0666 devpts /dev/pts | |
105 | if [ ! -L /dev/ptmx ]; then | |
106 | mount -o bind /dev/pts/ptmx /dev/ptmx | |
107 | fi | |
108 | ||
109 | When all devpts mounts are multi-instance, /dev/ptmx can permanently be | |
110 | a symlink to pts/ptmx and the bind mount can be ignored. | |
111 | ||
112 | 5. A multi-instance mount that is not accompanied by the /dev/ptmx to | |
113 | /dev/pts/ptmx redirection would result in an unusable/unreachable pty. | |
114 | ||
115 | mount -t devpts -o newinstance lxcpts /dev/pts | |
116 | ||
117 | immediately followed by: | |
118 | ||
119 | open("/dev/ptmx") | |
120 | ||
121 | would create a pty, say /dev/pts/7, in the initial kernel mount. | |
122 | But /dev/pts/7 would be invisible in the new mount. | |
123 | ||
124 | 6. The permissions for /dev/pts/ptmx node should be specified when mounting | |
125 | /dev/pts, using the '-o ptmxmode=%o' mount option (default is 0000). | |
126 | ||
127 | mount -t devpts -o newinstance -o ptmxmode=0644 devpts /dev/pts | |
128 | ||
129 | The permissions can later be changed as usual with 'chmod'. | |
130 | ||
131 | chmod 666 /dev/pts/ptmx | |
132 | ||
133 | 7. A mount of devpts without the 'newinstance' option results in binding to | |
134 | the initial kernel mount. This behavior, while preserving legacy semantics, | |
135 | does not provide strict isolation in a container environment, i.e. by | |
136 | mounting devpts without the 'newinstance' option, a container could | |
137 | get visibility into the 'host' or root container's devpts. | |
138 | ||
139 | To work around this and have strict isolation, all mounts of devpts, | |
140 | including the mount in the root container, should use the newinstance | |
141 | option. |
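
The redirection discussed in points 2 to 5 can be checked before any ptys are allocated. The following is a minimal sketch, assuming a shell whose 'test' builtin supports the '-ef' operator (bash and dash are assumed to); the function name ptmx_redirected and its optional directory argument are hypothetical:

```shell
#!/bin/sh
# ptmx_redirected [DEVDIR]: succeed if DEVDIR/ptmx resolves to DEVDIR/pts/ptmx.
# '-ef' is true when both paths name the same device and inode, which holds
# for a symlink (it is followed) as well as for a bind mount of the node.
ptmx_redirected() {
    dev=${1:-/dev}
    [ -e "$dev/pts/ptmx" ] && [ "$dev/ptmx" -ef "$dev/pts/ptmx" ]
}
```

A container-startup script could call 'ptmx_redirected || echo "ptmx is not redirected" >&2' right after mounting devpts with 'newinstance', so that the unreachable-pty case from point 5 is reported early.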