RDMA Controller
---------------

Contents
--------

1. Overview
  1-1. What is RDMA controller?
  1-2. Why RDMA controller needed?
  1-3. How is RDMA controller implemented?
2. Usage Examples

1. Overview

1-1. What is RDMA controller?
-----------------------------

The RDMA controller allows a user to limit the RDMA/IB-specific resources
that a given set of processes can use. These processes are grouped using the
RDMA controller.

The RDMA controller defines two resources, hca_handle and hca_object, which
can be limited for the processes of a cgroup.

1-2. Why RDMA controller needed?
--------------------------------

Currently, user space applications can easily take away all the RDMA
verb-specific resources such as AH, CQ, QP, MR etc. As a result, other
applications in other cgroups or kernel space ULPs may not even get a chance
to allocate any RDMA resources. This can lead to service unavailability.

Therefore an RDMA controller is needed through which the resource consumption
of processes can be limited. Through this controller, different RDMA
resources can be accounted.

1-3. How is RDMA controller implemented?
----------------------------------------

The RDMA cgroup allows limits to be configured for resources. It maintains
resource accounting per cgroup, per device using a resource pool structure.
Each resource pool can currently track up to 64 resources; this bound is
imposed by the RDMA cgroup and can be extended later if required.

This resource pool object is linked to the cgroup css. Typically there
are 0 to 4 resource pool instances per cgroup, per device in most use cases,
but nothing prevents having more. At present, hundreds of RDMA devices per
single cgroup may not be handled optimally; however, there is no
known use case or requirement for such a configuration either.

Since RDMA resources can be allocated from any process and can be freed by any
of the child processes which share the address space, RDMA resources are
always owned by the creator cgroup css. This allows process migration from one
cgroup to another without the major complexity of transferring resource
ownership, because such ownership is not really present due to the shared
nature of RDMA resources. Linking resources to the css also ensures that
cgroups can be deleted after their processes have migrated. This allows
process migration with active resources as well, even though that is not a
primary use case.

Whenever RDMA resource charging occurs, the owner RDMA cgroup is returned to
the caller. The same RDMA cgroup should be passed when uncharging the
resource. This allows a process that migrated with active RDMA resources to
charge new resources to its new owner cgroup. It also allows a resource of a
migrated process to be uncharged from the cgroup that was originally charged,
even though that is not a primary use case.

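The charge/uncharge flow described above can be illustrated with a toy model
in shell. This is only a sketch: the function names (charge, uncharge,
usage_of), the cgroup names cgA and cgB, and the use of files in a temporary
directory as counters are all assumptions made for illustration and do not
correspond to kernel code.

```shell
# Toy model only: per-cgroup usage counters kept as files in a temp directory.
# None of these names exist in the kernel; they only illustrate the flow.
state=$(mktemp -d)

# charge CGROUP: increments the counter of CGROUP and prints the owner,
# which the caller must remember for the later uncharge.
charge() {
    n=$(cat "$state/$1" 2>/dev/null || echo 0)
    echo $((n + 1)) > "$state/$1"
    echo "$1"
}

# uncharge OWNER: decrements the counter of the cgroup that was charged.
uncharge() {
    n=$(cat "$state/$1")
    echo $((n - 1)) > "$state/$1"
}

# usage_of CGROUP: prints the current counter (0 if never charged).
usage_of() {
    cat "$state/$1" 2>/dev/null || echo 0
}

owner=$(charge cgA)              # resource created while the process is in cgA
current=cgB                      # the process migrates to cgB
new_owner=$(charge "$current")   # a new resource is charged to cgB
uncharge "$owner"                # the old resource is uncharged from cgA
```

Because the owner returned at charge time is passed back on uncharge, the
counter of cgA returns to zero even though the process now lives in cgB.
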
A resource pool object is created in the following situations.
(a) The user sets a limit and no resource pool exists yet for the device
of interest in the cgroup.
(b) No resource limits were configured, but the IB/RDMA stack tries to
charge a resource. The pool is needed so that resources charged while
applications run without limits are uncharged correctly if limits are
enforced later; otherwise the usage count would drop below zero.

A resource pool is destroyed when all of its resource limits are set to max
and the last resource is deallocated.

The user should set all the limits to max in order to remove/unconfigure
the resource pool for a particular device.

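The destruction rule can be modelled from user space as a simple check on
one line of limits and one line of usage. The helper below is a sketch under
assumptions: pool_is_removable is a hypothetical name, and the two arguments
are assumed to be lines in the "device key=value ..." format used by rdma.max
and rdma.current.

```shell
# pool_is_removable MAX_LINE CURRENT_LINE
# Hypothetical check modelling the rule above: returns 0 (true) when every
# limit in MAX_LINE is "max" and every usage count in CURRENT_LINE is 0.
pool_is_removable() {
    max_line=$1
    cur_line=$2
    # Drop the device name (first field), then inspect each key=value pair.
    for kv in ${max_line#* }; do
        [ "${kv#*=}" = "max" ] || return 1
    done
    for kv in ${cur_line#* }; do
        [ "${kv#*=}" = "0" ] || return 1
    done
    return 0
}

if pool_is_removable "mlx4_0 hca_handle=max hca_object=max" \
                     "mlx4_0 hca_handle=0 hca_object=0"; then
    echo "pool would be freed"
fi
```

A pool with any finite limit, or with any resource still allocated, fails
the check and would be kept.
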
The IB stack honors the limits enforced by the RDMA controller. When an
application queries the maximum resource limits of an IB device, the stack
returns the minimum of what the user configured for the given cgroup and what
the IB device supports.

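The clamping rule above amounts to taking a minimum in which "max" means that
no cgroup limit is configured. A minimal sketch in shell, assuming a
hypothetical helper named effective_limit (not part of the kernel or of any
tool) and illustrative numbers:

```shell
# effective_limit CGROUP_LIMIT DEVICE_LIMIT
# Hypothetical helper: prints the value an application would see when it
# queries a device maximum, i.e. min(cgroup limit, device capability).
# CGROUP_LIMIT may be the literal string "max" (no limit configured).
effective_limit() {
    cgroup_limit=$1
    device_limit=$2
    if [ "$cgroup_limit" = "max" ] || [ "$cgroup_limit" -gt "$device_limit" ]; then
        echo "$device_limit"
    else
        echo "$cgroup_limit"
    fi
}

effective_limit 2000 65536   # cgroup limit is lower, so it wins
effective_limit max 65536    # no cgroup limit, device capability applies
```
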
The following resources can be accounted by the RDMA controller.
  hca_handle    Maximum number of HCA Handles
  hca_object    Maximum number of HCA Objects

2. Usage Examples
-----------------

(a) Configure resource limit:
echo mlx4_0 hca_handle=2 hca_object=2000 > /sys/fs/cgroup/rdma/1/rdma.max
echo ocrdma1 hca_handle=3 > /sys/fs/cgroup/rdma/2/rdma.max

(b) Query resource limit:
cat /sys/fs/cgroup/rdma/2/rdma.max
#Output:
mlx4_0 hca_handle=2 hca_object=2000
ocrdma1 hca_handle=3 hca_object=max

(c) Query current usage:
cat /sys/fs/cgroup/rdma/2/rdma.current
#Output:
mlx4_0 hca_handle=1 hca_object=20
ocrdma1 hca_handle=1 hca_object=23

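When scripting against these files, a single value often has to be picked out
of the per-device lines. The helper below is a sketch, not an existing tool:
rdma_value is a hypothetical name, and the sample file only reproduces the
output format shown in (c) so that the snippet can run without a real cgroup
mount.

```shell
# rdma_value FILE DEVICE RESOURCE
# Hypothetical helper: prints the value of RESOURCE for DEVICE from a file in
# the rdma.max / rdma.current format, or nothing if the entry is absent.
rdma_value() {
    awk -v dev="$2" -v res="$3" '
        $1 == dev {
            for (i = 2; i <= NF; i++) {
                split($i, kv, "=")
                if (kv[1] == res) print kv[2]
            }
        }' "$1"
}

# Sample data in the same format as the output of (c).
sample=$(mktemp)
printf '%s\n' 'mlx4_0 hca_handle=1 hca_object=20' \
              'ocrdma1 hca_handle=1 hca_object=23' > "$sample"

rdma_value "$sample" ocrdma1 hca_object
```

On a system with the controller mounted as in the examples above, the first
argument would instead be a path such as /sys/fs/cgroup/rdma/2/rdma.current.
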
(d) Delete resource limit:
echo mlx4_0 hca_handle=max hca_object=max > /sys/fs/cgroup/rdma/1/rdma.max