Commit | Line | Data |
---|---|---|
1 | ################################################################################ |
2 | # # | |
3 | # NFS/RDMA README # | |
4 | # # | |
5 | ################################################################################ | |
6 | ||
7 | Author: NetApp and Open Grid Computing | |
8 | Date: April 15, 2008 | |
9 | |
10 | Table of Contents | |
11 | ~~~~~~~~~~~~~~~~~ | |
12 | - Overview | |
13 | - Getting Help | |
14 | - Installation | |
15 | - Check RDMA and NFS Setup | |
16 | - NFS/RDMA Setup | |
17 | ||
18 | Overview | |
19 | ~~~~~~~~ | |
20 | ||
21 | This document describes how to install and set up the Linux NFS/RDMA client | |
22 | and server software. | |
23 | ||
24 | The NFS/RDMA client was first included in Linux 2.6.24. The NFS/RDMA server | |
25 | was first included in the following release, Linux 2.6.25. | |
26 | ||
27 | In our testing, we have obtained excellent performance results (full 10Gbit | |
28 | wire bandwidth at minimal client CPU) under many workloads. The code passes | |
29 | the full Connectathon test suite and operates over both InfiniBand and iWARP | |
30 | RDMA adapters. | |
31 | ||
32 | Getting Help | |
33 | ~~~~~~~~~~~~ | |
34 | ||
35 | If you get stuck, you can ask questions on the | |
36 | ||
37 | nfs-rdma-devel@lists.sourceforge.net | |
38 | ||
39 | mailing list. | |
40 | ||
41 | Installation | |
42 | ~~~~~~~~~~~~ | |
43 | ||
44 | These instructions are a step-by-step guide to building a machine for | |
45 | use with NFS/RDMA. | |
46 | ||
47 | - Install an RDMA device | |
48 | ||
49 | Any device supported by the drivers in drivers/infiniband/hw is acceptable. | |
50 | ||
51 | Testing has been performed using several Mellanox-based IB cards, the | |
52 | Ammasso AMS1100 iWARP adapter, and the Chelsio cxgb3 iWARP adapter. | |
53 | ||
54 | - Install a Linux distribution and tools | |
55 | ||
56 | The first kernel release to contain both the NFS/RDMA client and server was | |
57 | Linux 2.6.25. Therefore, a distribution compatible with this and subsequent | |
58 | Linux kernel releases should be installed. | |
59 | ||
60 | The procedures described in this document have been tested with | |
61 | distributions from Red Hat's Fedora Project (http://fedora.redhat.com/). | |
62 | ||
63 | - Install nfs-utils-1.1.1 or greater on the client | |
64 | ||
65 | An NFS/RDMA mount point can only be obtained by using the mount.nfs | |
66 | command in nfs-utils-1.1.1 or greater. To see which version of mount.nfs | |
67 | you are using, type: | |
68 | ||
69 | > /sbin/mount.nfs -V | |
70 | ||
71 | If the version is less than 1.1.1 or the command does not exist, | |
72 | then you will need to install the latest version of nfs-utils. | |
73 | ||
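The version check above can be scripted. The following is a minimal sketch, not part of nfs-utils: the helper names extract_nfs_utils_version and version_ge are hypothetical, the sed pattern assumes the version banner contains the text "nfs-utils <version>", and the comparison assumes a sort(1) that supports -V (GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helpers for checking the nfs-utils version on the client.

# extract_nfs_utils_version: read a version banner on stdin and print the
# dotted version number. ASSUMPTION: the banner contains "nfs-utils X.Y.Z".
extract_nfs_utils_version() {
    sed -n 's/.*nfs-utils \([0-9][0-9.]*\).*/\1/p'
}

# version_ge A B: succeed if dotted version A is greater than or equal to B.
# Relies on "sort -V" (version sort), so that 1.10 sorts after 1.9.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use on a client (left commented out so the sketch has no side effects):
#   ver=$(/sbin/mount.nfs -V 2>&1 | extract_nfs_utils_version)
#   version_ge "$ver" 1.1.1 || echo "install a newer nfs-utils"
```

If mount.nfs is missing, the extracted version is empty and the comparison fails, which matches the "command does not exist" case described above.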
74 | Download the latest package from: | |
75 | ||
76 | http://www.kernel.org/pub/linux/utils/nfs | |
77 | ||
78 | Uncompress the package and follow the installation instructions. | |
79 | ||
80 | If you will not be using GSS and NFSv4, the installation process | |
81 | can be simplified by disabling these features when running configure: | |
82 | ||
83 | > ./configure --disable-gss --disable-nfsv4 | |
84 | ||
85 | For more information on this see the package's README and INSTALL files. | |
86 | ||
87 | After building the nfs-utils package, there will be a mount.nfs binary in | |
88 | the utils/mount directory. This binary can be used to initiate NFS v2, v3, | |
89 | or v4 mounts. To initiate a v4 mount, the binary must be called mount.nfs4. | |
90 | The standard technique is to create a symlink called mount.nfs4 to mount.nfs. | |
91 | ||
92 | NOTE: mount.nfs and therefore nfs-utils-1.1.1 or greater is only needed | |
93 | on the NFS client machine. You do not need this specific version of | |
94 | nfs-utils on the server. Furthermore, only the mount.nfs command from | |
95 | nfs-utils-1.1.1 is needed on the client. | |
96 | ||
97 | - Install a Linux kernel with NFS/RDMA | |
98 | ||
99 | The NFS/RDMA client and server are both included in the mainline Linux | |
100 | kernel version 2.6.25 and later. This and other versions of the 2.6 Linux | |
101 | kernel can be found at: | |
102 | ||
103 | ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ | |
104 | ||
105 | Download the sources and place them in an appropriate location. | |
106 | ||
107 | - Configure the RDMA stack | |
108 | ||
109 | Make sure your kernel configuration has RDMA support enabled. Under | |
110 | Device Drivers -> InfiniBand support, update the kernel configuration | |
111 | to enable InfiniBand support [NOTE: the option name is misleading. Enabling | |
112 | InfiniBand support is required for all RDMA devices (IB, iWARP, etc.)]. | |
113 | ||
114 | Enable the appropriate IB HCA support (mlx4, mthca, ehca, ipath, etc.) or | |
115 | iWARP adapter support (amso, cxgb3, etc.). | |
116 | ||
117 | If you are using InfiniBand, be sure to enable IP-over-InfiniBand support. | |
118 | ||
119 | - Configure the NFS client and server | |
120 | ||
121 | Your kernel configuration must also have NFS file system support and/or | |
122 | NFS server support enabled. These and other NFS related configuration | |
123 | options can be found under File Systems -> Network File Systems. | |
124 | ||
125 | - Build, install, reboot | |
126 | ||
127 | The NFS/RDMA code will be enabled automatically if NFS and RDMA | |
128 | are turned on. The NFS/RDMA client and server are configured via the hidden | |
129 | SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The | |
130 | value of SUNRPC_XPRT_RDMA will be: | |
131 | ||
132 | - N if either SUNRPC or INFINIBAND is N; in this case the NFS/RDMA client | |
133 | and server will not be built | |
134 | - M if both SUNRPC and INFINIBAND are on (M or Y) and at least one is M; | |
135 | in this case the NFS/RDMA client and server will be built as modules | |
136 | - Y if both SUNRPC and INFINIBAND are Y; in this case the NFS/RDMA client | |
137 | and server will be built into the kernel | |
138 | ||
139 | Therefore, if you have followed the steps above and turned on NFS and RDMA, | |
140 | the NFS/RDMA client and server will be built. | |
141 | ||
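The three rules for SUNRPC_XPRT_RDMA can be restated as a small shell function. This is only an illustration of the rules listed above, not how Kconfig itself evaluates the option; the function name xprt_rdma_value is hypothetical, and it takes the values of SUNRPC and INFINIBAND as the letters N, M or Y.

```shell
#!/bin/sh
# xprt_rdma_value SUNRPC INFINIBAND
# Print the resulting value of SUNRPC_XPRT_RDMA (N, M or Y) following the
# three rules in the README. Hypothetical helper, for illustration only.
xprt_rdma_value() {
    sunrpc=$1
    infiniband=$2
    if [ "$sunrpc" = N ] || [ "$infiniband" = N ]; then
        echo N    # one dependency is off: nothing is built
    elif [ "$sunrpc" = M ] || [ "$infiniband" = M ]; then
        echo M    # both are on and at least one is a module
    else
        echo Y    # both are built in
    fi
}

xprt_rdma_value Y M    # prints M
```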
142 | Build a new kernel, install it, boot it. | |
143 | ||
144 | Check RDMA and NFS Setup | |
145 | ~~~~~~~~~~~~~~~~~~~~~~~~ | |
146 | ||
147 | Before configuring the NFS/RDMA software, it is a good idea to test | |
148 | your new kernel to ensure that the kernel is working correctly. | |
149 | In particular, it is a good idea to verify that the RDMA stack | |
150 | is functioning as expected and standard NFS over TCP/IP and/or UDP/IP | |
151 | is working properly. | |
152 | ||
153 | - Check RDMA Setup | |
154 | ||
155 | If you built the RDMA components as modules, load them at | |
156 | this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel | |
157 | card: | |
158 | ||
159 | > modprobe ib_mthca | |
160 | > modprobe ib_ipoib | |
161 | ||
162 | If you are using InfiniBand, make sure there is a Subnet Manager (SM) | |
163 | running on the network. If your IB switch has an embedded SM, you can | |
164 | use it. Otherwise, you will need to run an SM, such as OpenSM, on one | |
165 | of your end nodes. | |
166 | ||
167 | If an SM is running on your network, you should see the following: | |
168 | ||
169 | > cat /sys/class/infiniband/driverX/ports/1/state | |
170 | 4: ACTIVE | |
171 | ||
172 | where driverX is mthca0, ipath5, ehca3, etc. | |
173 | ||
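To check every port at once instead of one file at a time, a loop over the sysfs state files can be used. The sketch below assumes only the path layout and the "4: ACTIVE" string shown above; the function names port_is_active and report_ports are hypothetical.

```shell
#!/bin/sh
# port_is_active FILE: succeed if the port state file reads "4: ACTIVE".
port_is_active() {
    [ "$(cat "$1" 2>/dev/null)" = "4: ACTIVE" ]
}

# report_ports: print one line per port found under /sys/class/infiniband.
# If no RDMA device is present, the glob matches nothing and no line is printed.
report_ports() {
    for state_file in /sys/class/infiniband/*/ports/*/state; do
        [ -e "$state_file" ] || continue
        if port_is_active "$state_file"; then
            echo "ACTIVE:     $state_file"
        else
            echo "not active: $state_file"
        fi
    done
}

report_ports
```

A port that stays in a state other than ACTIVE on an InfiniBand fabric usually points back to the Subnet Manager requirement described above.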
174 | To further test the InfiniBand software stack, use IPoIB (this | |
175 | assumes you have two IB hosts named host1 and host2): | |
176 | ||
177 | host1> ifconfig ib0 a.b.c.x | |
178 | host2> ifconfig ib0 a.b.c.y | |
179 | host1> ping a.b.c.y | |
180 | host2> ping a.b.c.x | |
181 | ||
182 | For other device types, follow the appropriate procedures. | |
183 | ||
184 | - Check NFS Setup | |
185 | ||
186 | For the NFS components enabled above (client and/or server), | |
187 | test their functionality over standard Ethernet using TCP/IP or UDP/IP. | |
188 | ||
189 | NFS/RDMA Setup | |
190 | ~~~~~~~~~~~~~~ | |
191 | ||
192 | We recommend that you use two machines, one to act as the client and | |
193 | one to act as the server. | |
194 | ||
195 | One-time configuration: | |
196 | ||
197 | - On the server system, configure the /etc/exports file and | |
198 | start the NFS/RDMA server. | |
199 | ||
200 | Exports entries with the following formats have been tested: | |
201 | ||
202 | /vol0 192.168.0.47(fsid=0,rw,async,insecure,no_root_squash) |
203 | /vol0 192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash) | |
204 | ||
205 | Each IP address is the client's IPoIB address for an InfiniBand HCA or the | |
206 | client's iWARP address for an RNIC. | |
207 | ||
208 | NOTE: The "insecure" option must be used because the NFS/RDMA client does not | |
209 | use a reserved port. | |
210 | |
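Because a missing "insecure" option is an easy mistake to make, an exports entry can be checked before the server is started. The sketch below is a hypothetical helper, not part of nfs-utils, and it assumes the simple one-client-per-line format shown in the examples above.

```shell
#!/bin/sh
# export_has_option LINE OPTION
# Succeed if the parenthesised option list of an /etc/exports entry contains
# OPTION as a whole word. ASSUMPTION: one client specification per line.
export_has_option() {
    opts=$(printf '%s\n' "$1" | sed -n 's/.*(\(.*\)).*/\1/p')
    case ",$opts," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

entry='/vol0 192.168.0.47(fsid=0,rw,async,insecure,no_root_squash)'
if export_has_option "$entry" insecure; then
    echo "entry allows non-reserved client ports"
else
    echo "entry is missing the insecure option"
fi
```

The comma-delimited match keeps "insecure" from being mistaken for "secure" and the other way round.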
211 | Each time a machine boots: | |
212 | ||
213 | - Load and configure the RDMA drivers | |
214 | ||
215 | For InfiniBand using a Mellanox adapter: | |
216 | ||
217 | > modprobe ib_mthca | |
218 | > modprobe ib_ipoib | |
219 | > ifconfig ib0 a.b.c.d | |
220 | ||
221 | NOTE: use unique addresses for the client and server | |
222 | ||
223 | - Start the NFS server | |
224 | ||
225 | If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config), | |
226 | load the RDMA transport module: | |
227 | ||
228 | > modprobe svcrdma | |
229 | ||
230 | Regardless of how the server was built (module or built-in), start the server: | |
231 | ||
232 | > /etc/init.d/nfs start | |
233 | ||
234 | or | |
235 | ||
236 | > service nfs start | |
237 | ||
238 | Instruct the server to listen on the RDMA transport: | |
239 | ||
240 | > echo rdma 2050 > /proc/fs/nfsd/portlist | |
241 | ||
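After writing to the portlist file, it can be read back to confirm that the listener was created. The sketch below assumes that reading /proc/fs/nfsd/portlist returns one "<transport> <port>" pair per line; that read format is an assumption here and is not stated in this document. The function name nfsd_listens_on is hypothetical.

```shell
#!/bin/sh
# nfsd_listens_on FILE TRANSPORT PORT
# Succeed if FILE contains a line that is exactly "TRANSPORT PORT".
# ASSUMPTION: the portlist file lists one "<transport> <port>" pair per line.
nfsd_listens_on() {
    grep -qx "$2 $3" "$1" 2>/dev/null
}

# Example use on the server (left commented out; it needs a running nfsd):
#   nfsd_listens_on /proc/fs/nfsd/portlist rdma 2050 || echo "no RDMA listener"
```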
242 | - On the client system | |
243 | ||
244 | If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in kernel config), | |
245 | load the RDMA client module: | |
246 | ||
247 | > modprobe xprtrdma | |
248 | ||
249 | Regardless of how the client was built (module or built-in), issue the mount.nfs command: | |
250 | ||
251 | > /path/to/your/mount.nfs <IPoIB-server-name-or-address>:/<export> /mnt -i -o rdma,port=2050 | |
252 | ||
253 | To verify that the mount is using RDMA, run "cat /proc/mounts" and check the | |
254 | "proto" field for the given mount. | |
255 | ||
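The "proto" check can also be done without reading the whole file by eye. The sketch below assumes the usual /proc/mounts layout (device, mount point, file system type, comma-separated options, then two numeric fields); the function name mount_proto is hypothetical.

```shell
#!/bin/sh
# mount_proto MOUNTS_FILE MOUNT_POINT
# Print the value of the "proto=" option for the given mount point.
# Prints nothing if the mount point is absent or has no proto option.
mount_proto() {
    awk -v mp="$2" '$2 == mp {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^proto=/)
                print substr(opts[i], 7)   # text after "proto="
    }' "$1"
}

# Example use on the client (left commented out; it needs an active mount):
#   [ "$(mount_proto /proc/mounts /mnt)" = "rdma" ] && echo "using RDMA"
```

Options are matched one at a time so that a different option ending in "proto=", if present, is not confused with "proto=".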
256 | Congratulations! You're using NFS/RDMA! |