Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
which is found on Intel's Skylake "Scalable Processor" Server CPUs.
It will be available in future non-server parts.

For anyone wishing to test or use this feature, it is available in
Amazon's EC2 C5 instances and is known to work there using an Ubuntu
17.04 image.

Memory Protection Keys provides a mechanism for enforcing page-based
protections, but without requiring modification of the page tables
when an application changes protection domains.  It works by
dedicating 4 previously ignored bits in each page table entry to a
"protection key", giving 16 possible keys.

There is also a new user-accessible register (PKRU) with two separate
bits (Access Disable and Write Disable) for each key.  Being a CPU
register, PKRU is inherently thread-local, potentially giving each
thread a different set of protections from every other thread.

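The two-bits-per-key layout described above can be sketched as a few
pure helpers.  This is only an illustration: the helper and macro
names are made up for this example, and the bit positions (Access
Disable in bit 2*key, Write Disable in bit 2*key+1) are taken from
the architectural definition of PKRU rather than from this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the bit layout is architectural. */
#define PKRU_BITS_PER_KEY	2
#define PKRU_NR_KEYS		16
#define PKRU_AD_BIT		0x1u	/* Access Disable: no reads or writes */
#define PKRU_WD_BIT		0x2u	/* Write Disable: no writes */

static_assert(PKRU_BITS_PER_KEY * PKRU_NR_KEYS == 32,
	      "16 keys with 2 bits each fill a 32-bit register");

/* Extract the two permission bits that belong to 'pkey'. */
static inline uint32_t pkru_key_bits(uint32_t pkru, unsigned int pkey)
{
	return (pkru >> (pkey * PKRU_BITS_PER_KEY)) &
	       (PKRU_AD_BIT | PKRU_WD_BIT);
}

/* Reads are allowed unless Access Disable is set. */
static inline bool pkru_allows_read(uint32_t pkru, unsigned int pkey)
{
	return !(pkru_key_bits(pkru, pkey) & PKRU_AD_BIT);
}

/* Writes are allowed only if neither disable bit is set. */
static inline bool pkru_allows_write(uint32_t pkru, unsigned int pkey)
{
	return !(pkru_key_bits(pkru, pkey) & (PKRU_AD_BIT | PKRU_WD_BIT));
}
```

For example, a PKRU value of 0x80 has only Write Disable set for
key 3, so data covered by key 3 is readable but not writable, while
data covered by any other key is unaffected.
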
There are two new instructions (RDPKRU/WRPKRU) for reading and writing
to the new register.  The feature is only available in 64-bit mode,
even though there is theoretically space in the PAE PTEs.  These
permissions are enforced on data access only and have no effect on
instruction fetches.

=========================== Syscalls ===========================

There are 3 system calls which directly interact with pkeys:

	int pkey_alloc(unsigned long flags, unsigned long init_access_rights);
	int pkey_free(int pkey);
	int pkey_mprotect(unsigned long start, size_t len,
			  unsigned long prot, int pkey);

Before a pkey can be used, it must first be allocated with
pkey_alloc().  An application calls the WRPKRU instruction
directly in order to change access permissions to memory covered
with a key.  In this example WRPKRU is wrapped by a C function
called pkey_set().

	int real_prot = PROT_READ|PROT_WRITE;
	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
	ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
	... application runs here

Now, if the application needs to update the data at 'ptr', it can
gain access, do the update, then remove its write access:

	pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
	*ptr = foo; // assign something
	pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again

Now when it frees the memory, it will also free the pkey since it
is no longer in use:

	munmap(ptr, PAGE_SIZE);
	pkey_free(pkey);

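The allocate / protect / update / free sequence can be put together
as one function with error handling.  This is a minimal sketch, not
the selftest code: it assumes a C library that declares pkey_alloc(),
pkey_mprotect(), pkey_set() and pkey_free() in <sys/mman.h> under
_GNU_SOURCE (glibc 2.27 or later does), and the function name
pkey_guarded_update() is made up for this example.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Store 'value' in a page that is write-disabled by a pkey except
 * during the store itself.
 *
 * Returns 0 on success, 1 if no pkey could be allocated (treated here
 * as "protection keys unavailable"), or -1 on any other failure.
 */
static int pkey_guarded_update(int value)
{
	size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
	int result = -1;
	int pkey;
	int *ptr;

	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
	if (pkey < 0)
		return 1;	/* no CPU/kernel support, or no keys left */

	ptr = mmap(NULL, page_size, PROT_NONE,
		   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (ptr == MAP_FAILED)
		goto out_free;

	/* Page tables say read/write; the pkey still denies writes. */
	if (pkey_mprotect(ptr, page_size, PROT_READ | PROT_WRITE, pkey) != 0)
		goto out_unmap;

	if (pkey_set(pkey, 0) != 0)			/* open the window */
		goto out_unmap;
	*ptr = value;
	if (pkey_set(pkey, PKEY_DISABLE_WRITE) != 0)	/* close it again */
		goto out_unmap;

	/* Reads remain allowed while only Write Disable is set. */
	result = (*ptr == value) ? 0 : -1;

out_unmap:
	munmap(ptr, page_size);
out_free:
	pkey_free(pkey);
	return result;
}
```

A caller would treat a return value of 1 as "fall back to plain
mprotect()", since pkey_alloc() fails on hardware or kernels without
protection key support.
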
(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
 An example implementation can be found in
 tools/testing/selftests/x86/protection_keys.c)

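For orientation, a wrapper of this kind might look roughly like the
following.  This is a sketch written for this document, not a copy of
the selftest: the names pkru_with_rights() and sketch_pkey_set() are
hypothetical, the instructions are emitted as raw opcode bytes so that
older assemblers accept them, and the code assumes the CPU and kernel
have protection keys enabled (RDPKRU/WRPKRU fault otherwise).

```c
#include <stdint.h>

#ifndef PKEY_DISABLE_ACCESS
#define PKEY_DISABLE_ACCESS	0x1
#endif
#ifndef PKEY_DISABLE_WRITE
#define PKEY_DISABLE_WRITE	0x2
#endif

/* Return 'pkru' with the two bits for 'pkey' replaced by 'rights'. */
static inline uint32_t pkru_with_rights(uint32_t pkru, int pkey,
					uint32_t rights)
{
	unsigned int shift = (unsigned int)pkey * 2;

	pkru &= ~(0x3u << shift);
	pkru |= (rights & 0x3u) << shift;
	return pkru;
}

#if defined(__x86_64__)
static inline uint32_t rdpkru(void)
{
	uint32_t eax, edx;

	/* RDPKRU needs ECX = 0; it returns PKRU in EAX and clears EDX. */
	__asm__ volatile(".byte 0x0f,0x01,0xee"
			 : "=a"(eax), "=d"(edx) : "c"(0));
	(void)edx;
	return eax;
}

static inline void wrpkru(uint32_t pkru)
{
	/* WRPKRU needs ECX = 0 and EDX = 0; EAX holds the new value. */
	__asm__ volatile(".byte 0x0f,0x01,0xef"
			 : : "a"(pkru), "c"(0), "d"(0) : "memory");
}

/* Read-modify-write of the calling thread's PKRU for one key. */
static inline int sketch_pkey_set(int pkey, uint32_t rights)
{
	if (pkey < 0 || pkey > 15)
		return -1;
	if (rights & ~(uint32_t)(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE))
		return -1;
	wrpkru(pkru_with_rights(rdpkru(), pkey, rights));
	return 0;
}
#endif /* __x86_64__ */
```

Keeping the bit manipulation in a separate pure function makes it
testable on machines without the feature; only the two small asm
wrappers touch the register itself.
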
=========================== Behavior ===========================

The kernel attempts to make protection keys consistent with the
behavior of a plain mprotect().  For instance if you do this:

	mprotect(ptr, size, PROT_NONE);
	something(ptr);

you can expect the same effects with protection keys when doing this:

	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE | PKEY_DISABLE_ACCESS);
	pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
	something(ptr);

That should be true whether something() is a direct access to 'ptr'
like:

	*ptr = foo;

or when the kernel does the access on the application's behalf like
with a read():

	read(fd, ptr, 1);

The kernel will send a SIGSEGV in both cases, but si_code will be set
to SEGV_PKUERR when violating protection keys versus SEGV_ACCERR when
the plain mprotect() permissions are violated.
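
A SIGSEGV handler can tell the two cases apart by looking at si_code.
The sketch below is an illustration only: the function names are made
up, it exercises just the plain mprotect()-style case (a direct write
to a PROT_NONE page) so that it runs on machines without protection
keys, and it uses SEGV_PKUERR, which is how the Linux uapi and glibc
headers spell the protection key si_code.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef SEGV_PKUERR
#define SEGV_PKUERR	4	/* value from the Linux uapi headers */
#endif

static sigjmp_buf fault_jmp;
static volatile sig_atomic_t fault_si_code;

/* Record why the fault happened, then jump back out of it. */
static void segv_handler(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	fault_si_code = info->si_code;
	siglongjmp(fault_jmp, 1);
}

static const char *segv_reason(int si_code)
{
	switch (si_code) {
	case SEGV_PKUERR:	return "protection key violation";
	case SEGV_ACCERR:	return "page permission violation";
	case SEGV_MAPERR:	return "address not mapped";
	default:		return "other";
	}
}

/* Write to a PROT_NONE page; return the si_code seen, or -1 on error. */
static int fault_code_for_prot_none_write(void)
{
	size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
	struct sigaction sa = { 0 }, old_sa;
	volatile char *ptr;
	int code = -1;

	ptr = mmap(NULL, page_size, PROT_NONE,
		   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	if (ptr == MAP_FAILED)
		return -1;

	sa.sa_sigaction = segv_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, &old_sa) != 0) {
		munmap((void *)ptr, page_size);
		return -1;
	}

	if (sigsetjmp(fault_jmp, 1) == 0)
		*ptr = 1;		/* faults; the handler jumps back */
	else
		code = fault_si_code;

	sigaction(SIGSEGV, &old_sa, NULL);
	munmap((void *)ptr, page_size);
	return code;
}
```

Here the returned code is SEGV_ACCERR.  If the page were instead
mapped PROT_READ|PROT_WRITE and covered by a pkey with the access
disabled, the same handler would be expected to record SEGV_PKUERR.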