Commit | Line | Data |
---|---|---|
d18edf52 | 1 | ===================== |
49076ec2 KS |
2 | Split page table lock |
3 | ===================== | |
4 | ||
5 | Originally, mm->page_table_lock spinlock protected all page tables of the | |
6 | mm_struct. But this approach leads to poor page fault scalability of | |
7 | multi-threaded applications due to high contention on the lock. To improve | |
8 | scalability, split page table lock was introduced. | |
9 | ||
10 | With split page table lock we have a separate per-table lock to serialize | |
11 | access to the table. At the moment we use split lock for PTE and PMD | |
12 | tables. Access to higher-level tables is protected by mm->page_table_lock. | |
13 | ||
14 | There are helpers to lock/unlock a table and other accessor functions: | |
d18edf52 | 15 | |
49076ec2 | 16 | - pte_offset_map_lock() |
0d940a9b HD |
17 | maps PTE and takes PTE table lock, returns pointer to PTE with |
18 | pointer to its PTE table lock, or returns NULL if no PTE table; | |
19 | - pte_offset_map_nolock() | |
20 | maps PTE, returns pointer to PTE with pointer to its PTE table | |
21 | lock (not taken), or returns NULL if no PTE table; | |
22 | - pte_offset_map() | |
23 | maps PTE, returns pointer to PTE, or returns NULL if no PTE table; | |
24 | - pte_unmap() | |
25 | unmaps PTE table; | |
49076ec2 KS |
26 | - pte_unmap_unlock() |
27 | unlocks and unmaps PTE table; | |
28 | - pte_alloc_map_lock() | |
0d940a9b HD |
29 | allocates PTE table if needed and takes its lock, returns pointer to |
30 | PTE with pointer to its lock, or returns NULL if allocation failed; | |
49076ec2 KS |
31 | - pmd_lock() |
32 | takes PMD table lock, returns pointer to taken lock; | |
33 | - pmd_lockptr() | |
34 | returns pointer to PMD table lock; | |
35 | ||
36 | Split page table lock for PTE tables is enabled at compile time if | |
37 | CONFIG_SPLIT_PTLOCK_CPUS (usually 4) is less than or equal to NR_CPUS. | |
96c0f7c0 | 38 | If split lock is disabled, all tables are guarded by mm->page_table_lock. |
49076ec2 KS |
39 | |
40 | Split page table lock for PMD tables is enabled if it's enabled for PTE | |
41 | tables and the architecture supports it (see below). | |
42 | ||
43 | Hugetlb and split page table lock | |
d18edf52 | 44 | ================================= |
49076ec2 KS |
45 | |
46 | Hugetlb can support several page sizes. We use split lock only for PMD | |
47 | level, but not for PUD. | |
48 | ||
49 | Hugetlb-specific helpers: | |
d18edf52 | 50 | |
49076ec2 KS |
51 | - huge_pte_lock() |
52 | takes pmd split lock for PMD_SIZE page, mm->page_table_lock | |
53 | otherwise; | |
54 | - huge_pte_lockptr() | |
55 | returns pointer to table lock; | |
56 | ||
57 | Support of split page table lock by an architecture | |
d18edf52 | 58 | =================================================== |
49076ec2 | 59 | |
b4ed71f5 | 60 | No special step is needed to enable PTE split page table lock: everything |
9a4bbd8d | 61 | required is done by pagetable_pte_ctor() and pagetable_pte_dtor(), which |
b4ed71f5 | 62 | must be called on PTE table allocation / freeing. |
49076ec2 KS |
63 | |
64 | Make sure the architecture doesn't use the slab allocator for page table | |
1d798ca3 KS |
65 | allocation: slab uses page->slab_cache for its pages. |
66 | This field shares storage with page->ptl. | |
49076ec2 KS |
67 | |
68 | PMD split lock only makes sense if you have more than two page table | |
69 | levels. | |
70 | ||
9a4bbd8d VMO |
71 | Enabling PMD split lock requires a pagetable_pmd_ctor() call on PMD table |
72 | allocation and pagetable_pmd_dtor() on freeing. | |
49076ec2 | 73 | |
c283610e KS |
74 | Allocation usually happens in pmd_alloc_one(), freeing in pmd_free() and |
75 | pmd_free_tlb(), but make sure you cover all PMD table allocation / freeing | |
76 | paths: e.g. X86_PAE preallocates a few PMDs in pgd_alloc(). | |
49076ec2 KS |
77 | |
78 | With everything in place you can set CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK. | |
79 | ||
9a4bbd8d | 80 | NOTE: pagetable_pte_ctor() and pagetable_pmd_ctor() can fail -- the failure |
49076ec2 KS |
81 | must be handled properly. |
82 | ||
83 | page->ptl | |
d18edf52 | 84 | ========= |
49076ec2 KS |
85 | |
86 | page->ptl is used to access the split page table lock, where 'page' is the | |
87 | struct page of the page containing the table. It shares storage with page->private | |
88 | (and a few other fields in the union). | |
89 | ||
90 | To avoid increasing the size of struct page and to get the best performance, we use a | |
91 | trick: | |
d18edf52 | 92 | |
49076ec2 KS |
93 | - if spinlock_t fits into long, we use page->ptl as spinlock, so we |
94 | can avoid indirect access and save a cache line. | |
95 | - if size of spinlock_t is bigger than size of long, we use page->ptl as | |
96 | pointer to spinlock_t and allocate it dynamically. This allows split | |
97 | lock to be used with DEBUG_SPINLOCK or DEBUG_LOCK_ALLOC enabled, but costs | |
98 | one more cache line for indirect access. | |
99 | ||
9a4bbd8d VMO |
100 | The spinlock_t is allocated in pagetable_pte_ctor() for a PTE table and in |
101 | pagetable_pmd_ctor() for a PMD table. | |
49076ec2 KS |
102 | |
103 | Please, never access page->ptl directly -- use the appropriate helper.
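
The storage trick described in the page->ptl section can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_page`, `small_lock_t`, `big_lock_t` and the `toy_ptlock_*()` functions are hypothetical stand-ins invented for this example and are not the kernel's types or helpers. The kernel makes the embedded-or-pointer decision at compile time, whereas the sketch uses a runtime flag so that both cases can be exercised in one program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical lock types: one fits into a long, one (think of a
 * debug-enabled spinlock) does not. Neither is a real kernel type. */
typedef struct { int locked; } small_lock_t;
typedef struct { int locked; long debug[4]; } big_lock_t;

static_assert(sizeof(small_lock_t) <= sizeof(long), "small lock must fit");

/* Toy stand-in for struct page: the lock slot shares storage with
 * other fields, so struct size does not grow with the lock size. */
struct toy_page {
    union {
        unsigned long private_data;  /* shares storage with the lock */
        small_lock_t  ptl_embedded;  /* case 1: lock stored in place */
        big_lock_t   *ptl_pointer;   /* case 2: lock allocated separately */
    };
};

/* Constructor analogue: may fail in the pointer case, and the caller
 * must handle that failure. */
static bool toy_ptlock_init(struct toy_page *page, bool embed)
{
    if (embed) {
        page->ptl_embedded.locked = 0;
        return true;
    }
    page->ptl_pointer = calloc(1, sizeof(big_lock_t));
    return page->ptl_pointer != NULL;
}

/* Accessor analogue: callers go through this helper and never touch
 * the union members directly. */
static int *toy_ptlock_ptr(struct toy_page *page, bool embed)
{
    return embed ? &page->ptl_embedded.locked
                 : &page->ptl_pointer->locked;
}

/* Destructor analogue: frees the separately allocated lock, if any. */
static void toy_ptlock_free(struct toy_page *page, bool embed)
{
    if (!embed) {
        free(page->ptl_pointer);
        page->ptl_pointer = NULL;
    }
}
```

In the embedded case the helper returns an address inside the page structure itself, so no extra memory access is needed. In the pointer case it returns an address in a separate allocation, which is the extra indirection the text above refers to. Hiding this difference behind an accessor is the reason the helpers should always be used.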