Commit | Line | Data |
---|---|---|
7e7cd458 MCC |
1 | .. SPDX-License-Identifier: GPL-2.0 |
2 | ||
3 | ===== | |
4 | Tmpfs | |
5 | ===== | |
6 | ||
f38d58b7 | 7 | Tmpfs is a file system which keeps all of its files in virtual memory. |
1da177e4 LT |
8 | |
9 | ||
10 | Everything in tmpfs is temporary in the sense that no files will be | |
11 | created on your hard drive. If you unmount a tmpfs instance, | |
12 | everything stored therein is lost. | |
13 | ||
14 | tmpfs puts everything into the kernel internal caches, growing and | |
15 | shrinking to accommodate the files it contains. It can also swap | |
2c6efe9c LC |
16 | unneeded pages out to swap space, if swap was enabled for the tmpfs |
17 | mount. tmpfs also supports THP. | |
d0f5a854 LC |
18 | |
19 | tmpfs extends ramfs with a few userspace configurable options listed and | |
20 | explained further below, some of which can be reconfigured dynamically on the | |
21 | fly using a remount ('mount -o remount ...') of the filesystem. A tmpfs | |
22 | filesystem can be resized but it cannot be resized to a size below its current | |
23 | usage. tmpfs also supports POSIX ACLs, and extended attributes for the | |
2daf18a7 HD |
24 | trusted.*, security.* and user.* namespaces. ramfs does not use swap and you |
25 | cannot modify any parameter for a ramfs filesystem. The size limit of a ramfs | |
d0f5a854 LC |
26 | filesystem is how much memory you have available, so take care not to run |
27 | out of memory when using it. | |
28 | ||
29 | An alternative to tmpfs and ramfs is to use brd to create RAM disks | |
30 | (/dev/ram*), which allows you to simulate a block device disk in physical RAM. | |
31 | To write data you would then just need to create a regular filesystem on top | |
32 | of this ramdisk. As with ramfs, brd ramdisks cannot swap. brd ramdisks are also | |
33 | configured in size at initialization and you cannot dynamically resize them. | |
34 | Unlike brd ramdisks, tmpfs has its own filesystem; it does not rely on the | |
35 | block layer at all. | |
1da177e4 | 36 | |
2c6efe9c LC |
37 | Since tmpfs lives completely in the page cache and optionally on swap, |
38 | all tmpfs pages will be shown as "Shmem" in /proc/meminfo and "Shared" in | |
0bc126d4 RF |
39 | free(1). Notice that these counters also include shared memory |
40 | (shmem, see ipcs(1)). The most reliable way to get the count is | |
41 | using df(1) and du(1). | |
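As a rough illustration of the "Shmem" counter mentioned above, the following minimal sketch extracts that field from /proc/meminfo-style text. The function name ``shmem_kib`` and the sample text are made up for this example; only the "Shmem" field name comes from the text above.

```python
def shmem_kib(meminfo_text):
    """Return the Shmem value (in kB) from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        # Match "Shmem:" exactly so that "ShmemHugePages:" etc. are skipped.
        if line.startswith("Shmem:"):
            # Lines look like "Shmem:   204800 kB"; the number is field 1.
            return int(line.split()[1])
    return None


# Hypothetical sample; on a live system you would read /proc/meminfo instead.
SAMPLE = """MemTotal:       16303428 kB
Shmem:            204800 kB
ShmemHugePages:        0 kB
"""

if __name__ == "__main__":
    print(shmem_kib(SAMPLE))
```

On a Linux system the same function can be fed ``open("/proc/meminfo").read()``. Keep in mind that this counter also covers other shared memory, as noted above, so df(1) and du(1) remain the more reliable way to measure a single tmpfs mount.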
1da177e4 LT |
42 | |
43 | tmpfs has the following uses: | |
44 | ||
45 | 1) There is always a kernel internal mount which you will not see at | |
46 | all. This is used for shared anonymous mappings and SYSV shared | |
7e7cd458 | 47 | memory. |
1da177e4 LT |
48 | |
49 | This mount does not depend on CONFIG_TMPFS. If CONFIG_TMPFS is not | |
f38d58b7 | 50 | set, the user visible part of tmpfs is not built. But the internal |
1da177e4 LT |
51 | mechanisms are always present. |
52 | ||
53 | 2) glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for | |
54 | POSIX shared memory (shm_open, shm_unlink). Adding the following | |
7e7cd458 | 55 | line to /etc/fstab should take care of this:: |
1da177e4 LT |
56 | |
57 | tmpfs /dev/shm tmpfs defaults 0 0 | |
58 | ||
59 | Remember to create the directory that you intend to mount tmpfs on | |
bf6ee0ae | 60 | if necessary. |
1da177e4 LT |
61 | |
62 | This mount is _not_ needed for SYSV shared memory. The internal | |
63 | mount is used for that. (In the 2.3 kernel versions it was | |
64 | necessary to mount the predecessor of tmpfs (shm fs) to use SYSV | |
f38d58b7 | 65 | shared memory.) |
1da177e4 LT |
66 | |
67 | 3) Some people (including me) find it very convenient to mount it | |
68 | e.g. on /tmp and /var/tmp and have a big swap partition. And now | |
69 | loop mounts of tmpfs files do work, so mkinitrd shipped by most | |
70 | distributions should succeed with a tmpfs /tmp. | |
71 | ||
72 | 4) And probably a lot more I do not know about :-) | |
73 | ||
74 | ||
75 | tmpfs has three mount options for sizing: | |
76 | ||
7e7cd458 MCC |
77 | ========= ============================================================ |
78 | size The limit of allocated bytes for this tmpfs instance. The | |
1da177e4 LT |
79 | default is half of your physical RAM without swap. If you |
80 | oversize your tmpfs instances the machine will deadlock | |
81 | since the OOM handler will not be able to free that memory. | |
7e7cd458 MCC |
82 | nr_blocks The same as size, but in blocks of PAGE_SIZE. |
83 | nr_inodes The maximum number of inodes for this instance. The default | |
1da177e4 | 84 | is half of the number of your physical RAM pages, or (on a |
670e9f34 | 85 | machine with highmem) the number of lowmem RAM pages, |
1da177e4 | 86 | whichever is the lower. |
7e7cd458 | 87 | ========= ============================================================ |
1da177e4 LT |
88 | |
89 | These parameters accept a suffix k, m or g for kilo, mega and giga and | |
90 | can be changed on remount. The size parameter also accepts a suffix % | |
91 | to limit this tmpfs instance to that percentage of your physical RAM: | |
92 | the default, when neither size nor nr_blocks is specified, is size=50% | |
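The suffix rules above can be sketched as follows. This is an illustration only, not the kernel's parser: the helper names are invented here, and the use of binary multiples (k = 1024) is inferred from the example at the end of this document, where nr_inodes=10k yields 10240 inodes.

```python
# Assumption: binary multiples, matching "nr_inodes=10k" -> 10240 inodes.
SUFFIXES = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_count(value):
    """Parse a value such as '10k' or '512' as used by nr_blocks/nr_inodes."""
    s = value.strip().lower()
    if s and s[-1] in SUFFIXES:
        return int(s[:-1]) * SUFFIXES[s[-1]]
    return int(s)


def size_to_bytes(value, ram_bytes):
    """Parse a size= value; a trailing % means a percentage of physical RAM.

    ram_bytes is supplied by the caller (hypothetical parameter).
    """
    s = value.strip()
    if s.endswith("%"):
        return ram_bytes * int(s[:-1]) // 100
    return parse_count(s)


if __name__ == "__main__":
    ram = 8 * 1024 ** 3  # pretend the machine has 8 GiB of RAM
    print(parse_count("10k"))
    print(size_to_bytes("10G", ram))
    print(size_to_bytes("50%", ram))
```

With 8 GiB of RAM, the default size=50% in this sketch corresponds to 4 GiB.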
93 | ||
0edd73b3 HD |
94 | If nr_blocks=0 (or size=0), blocks will not be limited in that instance; |
95 | if nr_inodes=0, inodes will not be limited. It is generally unwise to | |
1da177e4 LT |
96 | mount with such options, since it allows any user with write access to |
97 | use up all the memory on the machine; but enhances the scalability of | |
f38d58b7 | 98 | that instance in a system with many CPUs making intensive use of it. |
2daf18a7 HD |
99 | |
100 | If nr_inodes is not 0, that limited space for inodes is also used up by | |
101 | extended attributes: "df -i"'s IUsed and IUse% increase, IFree decreases. | |
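The inode counters that "df -i" reports can also be read programmatically. The sketch below uses Python's os.statvfs(); the function name ``inode_usage`` and the dictionary keys are chosen for this example, and the percentage is only an approximation of what df prints.

```python
import os


def inode_usage(path):
    """Return inode totals for the filesystem containing path."""
    st = os.statvfs(path)
    total, free = st.f_files, st.f_ffree
    used = total - free
    # Some filesystems report 0 total inodes; no percentage is
    # computed in that case.
    percent = None if total == 0 else round(100 * used / total)
    return {"total": total, "used": used, "free": free, "use_percent": percent}


if __name__ == "__main__":
    # Any mounted path works; point it at a tmpfs mount such as /dev/shm
    # to watch the counters change as extended attributes are added.
    print(inode_usage("."))
```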
1da177e4 | 102 | |
253e5df8 HD |
103 | tmpfs blocks may be swapped out, when there is a shortage of memory. |
104 | tmpfs has a mount option to disable its use of swap: | |
105 | ||
106 | ====== =========================================================== | |
107 | noswap Disables swap. Remounts must respect the original settings. | |
108 | By default swap is enabled. | |
109 | ====== =========================================================== | |
110 | ||
d0f5a854 LC |
111 | tmpfs also supports Transparent Huge Pages which requires a kernel |
112 | configured with CONFIG_TRANSPARENT_HUGEPAGE and with huge pages supported on | |
113 | your system (has_transparent_hugepage(), which is architecture specific). | |
114 | The mount options for this are: | |
115 | ||
253e5df8 HD |
116 | ================ ============================================================== |
117 | huge=never Do not allocate huge pages. This is the default. | |
118 | huge=always Attempt to allocate a huge page every time a new page is needed. | |
119 | huge=within_size Only allocate a huge page if it will be fully within i_size. | |
120 | Also respect madvise(2) hints. | |
121 | huge=advise Only allocate a huge page if requested with madvise(2). | |
122 | ================ ============================================================== | |
123 | ||
124 | See also Documentation/admin-guide/mm/transhuge.rst, which describes the | |
125 | sysfs file /sys/kernel/mm/transparent_hugepage/shmem_enabled, which can | |
126 | be used to deny huge pages on all tmpfs mounts in an emergency, or to | |
127 | force huge pages on all tmpfs mounts for testing. | |
1da177e4 | 128 | |
e09764cf CM |
129 | tmpfs also supports quota with the following mount options: |
130 | ||
de4c0e7c LC |
131 | ======================== ================================================= |
132 | quota User and group quota accounting and enforcement | |
133 | is enabled on the mount. Tmpfs uses hidden | |
134 | system quota files that are initialized on mount. | |
135 | usrquota User quota accounting and enforcement is enabled | |
136 | on the mount. | |
137 | grpquota Group quota accounting and enforcement is enabled | |
138 | on the mount. | |
139 | usrquota_block_hardlimit Set global user quota block hard limit. | |
140 | usrquota_inode_hardlimit Set global user quota inode hard limit. | |
141 | grpquota_block_hardlimit Set global group quota block hard limit. | |
142 | grpquota_inode_hardlimit Set global group quota inode hard limit. | |
143 | ======================== ================================================= | |
144 | ||
145 | None of the quota related mount options can be set or changed on remount. | |
146 | ||
147 | Quota limit parameters accept a suffix k, m or g for kilo, mega and giga | |
148 | and can't be changed on remount. Default global quota limits take | |
149 | effect for every user/group/project except root the first time the | |
150 | quota entry for that user/group/project id is accessed, typically the | |
151 | first time an inode with that id ownership is created after | |
152 | the mount. In other words, instead of the limits being initialized to zero, | |
153 | they are initialized with the particular value provided with these mount | |
154 | options. The limits can be changed for any user/group id at any time as they | |
155 | normally can be. | |
e09764cf CM |
156 | |
157 | Note that tmpfs quotas do not support user namespaces so no uid/gid | |
158 | translation is done if quotas are enabled inside user namespaces. | |
159 | ||
7339ff83 | 160 | tmpfs has a mount option to set the NUMA memory allocation policy for |
b00dc3ad HD |
161 | all files in that instance (if CONFIG_NUMA is enabled) - which can be |
162 | adjusted on the fly via 'mount -o remount ...' | |
7339ff83 | 163 | |
7e7cd458 | 164 | ======================== ============================================== |
55741696 KM |
165 | mpol=default use the process allocation policy |
166 | (see set_mempolicy(2)) | |
b00dc3ad HD |
167 | mpol=prefer:Node prefers to allocate memory from the given Node |
168 | mpol=bind:NodeList allocates memory only from nodes in NodeList | |
169 | mpol=interleave prefers to allocate from each node in turn | |
170 | mpol=interleave:NodeList allocates from each node of NodeList in turn | |
55741696 | 171 | mpol=local prefers to allocate memory from the local node |
7e7cd458 | 172 | ======================== ============================================== |
b00dc3ad HD |
173 | |
174 | NodeList format is a comma-separated list of decimal numbers and ranges, | |
175 | a range being two hyphen-separated decimal numbers, the smallest and | |
176 | largest node numbers in the range. For example, mpol=bind:0-3,5,7,9-15 | |
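The NodeList format described above can be expanded into individual node numbers with a few lines of code. This is a minimal sketch for illustration; the function name ``parse_nodelist`` is made up here and the code does not check whether the nodes are actually online.

```python
def parse_nodelist(text):
    """Expand a NodeList such as '0-3,5,7,9-15' into a sorted list of ints."""
    nodes = set()
    for part in text.split(","):
        if "-" in part:
            # A range is two hyphen-separated decimal numbers: smallest-largest.
            low, high = (int(n) for n in part.split("-"))
            if low > high:
                raise ValueError("range must be smallest-largest: %r" % part)
            nodes.update(range(low, high + 1))
        else:
            nodes.add(int(part))
    return sorted(nodes)


if __name__ == "__main__":
    print(parse_nodelist("0-3,5,7,9-15"))
```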
7339ff83 | 177 | |
971ada0f LS |
178 | A memory policy with a valid NodeList will be saved, as specified, for |
179 | use at file creation time. When a task allocates a file in the file | |
180 | system, the mount option memory policy will be applied with a NodeList, | |
181 | if any, modified by the calling task's cpuset constraints | |
7e7cd458 MCC |
182 | [See Documentation/admin-guide/cgroup-v1/cpusets.rst] and any optional flags, |
183 | listed below. If the resulting NodeList is the empty set, the effective | |
184 | memory policy for the file will revert to "default" policy. | |
971ada0f | 185 | |
65d66fc0 DR |
186 | NUMA memory allocation policies have optional flags that can be used in |
187 | conjunction with their modes. These optional flags can be specified | |
188 | when tmpfs is mounted by appending them to the mode before the NodeList. | |
3ecf53e4 MR |
189 | See Documentation/admin-guide/mm/numa_memory_policy.rst for a list of |
190 | all available memory allocation policy mode flags and their effect on | |
191 | memory policy. | |
65d66fc0 | 192 | |
7e7cd458 MCC |
193 | :: |
194 | ||
65d66fc0 DR |
195 | =static is equivalent to MPOL_F_STATIC_NODES |
196 | =relative is equivalent to MPOL_F_RELATIVE_NODES | |
197 | ||
198 | For example, mpol=bind=static:NodeList is the equivalent of an | |
199 | allocation policy of MPOL_BIND | MPOL_F_STATIC_NODES. | |
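To make the mode/flag/NodeList layout concrete, here is a small sketch that splits the value of an mpol= option into its three parts. The function name ``parse_mpol`` is hypothetical and the code only mirrors the syntax described above; it does not validate the mode or the nodes.

```python
# Flag names as listed above.
FLAGS = {
    "static": "MPOL_F_STATIC_NODES",
    "relative": "MPOL_F_RELATIVE_NODES",
}


def parse_mpol(value):
    """Split e.g. 'bind=static:0-3' into (mode, flag, nodelist)."""
    # The NodeList, if present, follows the first colon.
    mode_part, _, nodelist = value.partition(":")
    # An optional flag is appended to the mode with '='.
    mode, _, flag = mode_part.partition("=")
    if flag and flag not in FLAGS:
        raise ValueError("unknown mpol flag: %r" % flag)
    return mode, FLAGS.get(flag), nodelist or None


if __name__ == "__main__":
    print(parse_mpol("bind=static:0-3"))
    print(parse_mpol("interleave"))
```

The first call returns the mode "bind" together with MPOL_F_STATIC_NODES and the NodeList "0-3", matching the MPOL_BIND | MPOL_F_STATIC_NODES example above.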
200 | ||
ad329b15 HD |
201 | Note that trying to mount a tmpfs with an mpol option will fail if the |
202 | running kernel does not support NUMA; and will fail if its nodelist | |
a210906c HD |
203 | specifies a node which is not online. If your system relies on that |
204 | tmpfs being mounted, but from time to time runs a kernel built without | |
205 | NUMA capability (perhaps a safe recovery kernel), or with fewer nodes | |
206 | online, then it is advisable to omit the mpol option from automatic | |
ad329b15 HD |
207 | mount options. It can be added later, when the tmpfs is already mounted |
208 | on MountPoint, by 'mount -o remount,mpol=Policy:NodeList MountPoint'. | |
209 | ||
7339ff83 | 210 | |
1da177e4 LT |
211 | To specify the initial root directory you can use the following mount |
212 | options: | |
213 | ||
7e7cd458 MCC |
214 | ==== ================================== |
215 | mode The permissions as an octal number | |
216 | uid The user id | |
217 | gid The group id | |
218 | ==== ================================== | |
1da177e4 LT |
219 | |
220 | These options do not have any effect on remount. You can change these | |
221 | parameters with chmod(1), chown(1) and chgrp(1) on a mounted filesystem. | |
222 | ||
223 | ||
ea3271f7 CD |
224 | tmpfs has a mount option to select whether it will wrap at 32- or 64-bit inode |
225 | numbers: | |
226 | ||
227 | ======= ======================== | |
228 | inode64 Use 64-bit inode numbers | |
229 | inode32 Use 32-bit inode numbers | |
230 | ======= ======================== | |
231 | ||
232 | On a 32-bit kernel, inode32 is implicit, and inode64 is refused at mount time. | |
233 | On a 64-bit kernel, CONFIG_TMPFS_INODE64 sets the default. inode64 avoids the | |
234 | possibility of multiple files with the same inode number on a single device; | |
235 | but risks glibc failing with EOVERFLOW once 33-bit inode numbers are reached - | |
236 | if a long-lived tmpfs is accessed by 32-bit applications so ancient that | |
237 | opening a file larger than 2GiB fails with EINVAL. | |
238 | ||
239 | ||
1da177e4 LT |
240 | So 'mount -t tmpfs -o size=10G,nr_inodes=10k,mode=700 tmpfs /mytmpfs' |
241 | will give you a tmpfs instance on /mytmpfs which can allocate 10GB | |
242 | RAM/SWAP in 10240 inodes and it is only accessible by root. | |
243 | ||
244 | ||
7e7cd458 | 245 | :Author: |
1da177e4 | 246 | Christoph Rohland <cr@sap.com>, 1.12.01 |
7e7cd458 | 247 | :Updated: |
98f32602 | 248 | Hugh Dickins, 4 June 2007 |
7e7cd458 | 249 | :Updated: |
55741696 | 250 | KOSAKI Motohiro, 16 Mar 2010 |
ea3271f7 CD |
251 | :Updated: |
252 | Chris Down, 13 July 2020 |