Commit | Line | Data |
---|---|---|
02c35fca FI |
1 | Reference counting in pnfs: |
2 | ========================== | |
3 | ||
4 | There are several inter-related caches. We have layouts which can | |
5 | reference multiple devices, each of which can reference multiple data servers. | |
6 | Each data server can be referenced by multiple devices. Each device | |
7 | can be referenced by multiple layouts. To keep all of this straight, | |
8 | we need to reference count. | |
9 | ||
10 | ||
11 | struct pnfs_layout_hdr | |
12 | ---------------------- | |
13 | The on-the-wire command LAYOUTGET corresponds to struct | |
14 | pnfs_layout_segment, usually referred to by the variable name lseg. | |
c9f3f2d8 | 15 | Each nfs_inode may hold a pointer to a cache of these layout |
02c35fca FI |
16 | segments in nfsi->layout, of type struct pnfs_layout_hdr. |
17 | ||
18 | We reference the header for the inode pointing to it, across each | |
19 | outstanding RPC call that references it (LAYOUTGET, LAYOUTRETURN, | |
20 | LAYOUTCOMMIT), and for each lseg held within. | |
21 | ||
22 | Each header is also (when non-empty) put on a list associated with | |
23 | struct nfs_client (cl_layouts). Being put on this list does not bump | |
24 | the reference count, as the layout is kept around by the lseg that | |
25 | keeps it in the list. | |
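The counting rule above (one reference for the inode, one per outstanding
RPC, one per lseg held) can be sketched as follows. This is a toy model,
not the kernel code: struct toy_layout_hdr and the toy_hdr_* helpers are
hypothetical names invented for illustration, and a plain int stands in
for whatever atomic counter the real struct pnfs_layout_hdr uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pnfs_layout_hdr; only the count is modeled. */
struct toy_layout_hdr {
	int refcount;	/* inode pointer + outstanding RPCs + lsegs held */
	bool freed;	/* set when the last reference is dropped */
};

static void toy_hdr_init(struct toy_layout_hdr *lo)
{
	lo->refcount = 1;	/* reference for the inode pointing to the header */
	lo->freed = false;
}

/* Taken for each outstanding RPC and for each lseg held within. */
static void toy_hdr_get(struct toy_layout_hdr *lo)
{
	assert(!lo->freed && lo->refcount > 0);
	lo->refcount++;
}

static void toy_hdr_put(struct toy_layout_hdr *lo)
{
	assert(lo->refcount > 0);
	if (--lo->refcount == 0)
		lo->freed = true;	/* real code would free the header here */
}
```

Note that nothing in this sketch models the cl_layouts list: as stated
above, list membership does not take a reference of its own.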
26 | ||
27 | deviceid_cache | |
28 | -------------- | |
29 | lsegs reference device ids, which are resolved per nfs_client and | |
30 | layout driver type. The device ids are held in an RCU cache (struct | |
31 | nfs4_deviceid_cache). The cache itself is referenced across each | |
32 | mount. The entries (struct nfs4_deviceid) themselves are held across | |
33 | the lifetime of each lseg referencing them. | |
34 | ||
35 | RCU is used because the deviceid is basically a write once, read many | |
36 | data structure. The hlist size of 32 buckets needs better | |
37 | justification, but seems reasonable given that we can have multiple | |
38 | deviceids per filesystem, and multiple filesystems per nfs_client. | |
39 | ||
40 | The hash code is copied from the nfsd code base. A discussion of | |
41 | hashing and variations of this algorithm can be found at: | |
42 | http://groups.google.com/group/comp.lang.c/browse_thread/thread/9522965e2b8d3809 | |
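A minimal sketch of how a device id could be mapped to one of the 32
buckets. The byte-wise multiply-and-add loop and the multiplier 37 are
assumptions made for illustration, not taken from this text; the
toy_* names are hypothetical. The 16-byte id size matches the NFSv4.1
deviceid4 type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_DEVICEID_SIZE 16	/* deviceid4 is 16 opaque bytes in NFSv4.1 */
#define TOY_HASH_BUCKETS  32	/* bucket count mentioned in the text */

/* Fold every byte of the id into x, then mask down to a bucket index. */
static uint32_t toy_deviceid_hash(const unsigned char *id)
{
	uint32_t x = 0;
	size_t i;

	assert(id != NULL);
	for (i = 0; i < TOY_DEVICEID_SIZE; i++) {
		x *= 37;	/* assumed multiplier */
		x += id[i];
	}
	/* Masking works because the bucket count is a power of two. */
	return x & (TOY_HASH_BUCKETS - 1);
}
```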
43 | ||
44 | data server cache | |
45 | ----------------- | |
46 | File driver devices refer to data servers, which are kept in a | |
47 | module-level cache. A data server's reference is held over the | |
48 | lifetime of the deviceid pointing to it. | |
80fe2b19 FI |
49 | |
50 | lseg | |
51 | ---- | |
52 | lseg maintains an extra reference corresponding to the NFS_LSEG_VALID | |
53 | bit which holds it in the pnfs_layout_hdr's list. When the final lseg | |
54 | is removed from the pnfs_layout_hdr's list, the NFS_LAYOUT_DESTROYED | |
55 | bit is set, preventing any new lsegs from being added. | |
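The interplay of the two bits can be sketched as below. This is a toy
model under stated assumptions: the structs and toy_* helpers are
hypothetical, booleans stand in for the NFS_LSEG_VALID and
NFS_LAYOUT_DESTROYED bits, a counter stands in for the header's list,
and no locking is modeled.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_hdr {
	int nr_segs;		/* lsegs currently on the header's list */
	bool destroyed;		/* models NFS_LAYOUT_DESTROYED */
};

struct toy_lseg {
	int refcount;
	bool valid;		/* models NFS_LSEG_VALID */
	struct toy_hdr *hdr;
};

/* Add an lseg to the header's list; refused once the header is destroyed. */
static bool toy_lseg_insert(struct toy_hdr *lo, struct toy_lseg *lseg)
{
	if (lo->destroyed)
		return false;
	lseg->hdr = lo;
	lseg->valid = true;
	lseg->refcount = 1;	/* the extra reference tied to the valid bit */
	lo->nr_segs++;
	return true;
}

static void toy_lseg_get(struct toy_lseg *lseg)
{
	assert(lseg->refcount > 0);
	lseg->refcount++;
}

/* Dropping the last reference removes the lseg from the header's list. */
static void toy_lseg_put(struct toy_lseg *lseg)
{
	assert(lseg->refcount > 0);
	if (--lseg->refcount > 0)
		return;
	if (--lseg->hdr->nr_segs == 0)
		lseg->hdr->destroyed = true;	/* final lseg removed */
}

/* Clearing the valid bit drops the extra reference exactly once. */
static void toy_lseg_invalidate(struct toy_lseg *lseg)
{
	if (lseg->valid) {
		lseg->valid = false;
		toy_lseg_put(lseg);
	}
}
```

In this sketch an lseg with other users stays on the list after it is
invalidated, and only leaves when its last user calls toy_lseg_put().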
18d98f6c SB |
56 | |
57 | layout drivers | |
58 | -------------- | |
59 | ||
8f9cdcb2 TH |
60 | pNFS uses what are called layout drivers. The standard defines four basic |
61 | layout types: "files", "objects", "blocks", and "flexfiles". For each | |
62 | of these types there is a layout driver with a common table of function | |
63 | vectors, which the NFS client's pNFS core calls to implement the | |
64 | different layout types. | |
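The function-vector idea can be sketched with a table of function
pointers. Everything here is hypothetical: struct toy_layout_ops, its
single read hook, and the per-driver behavior are invented for
illustration and do not reflect the real driver interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down operations table shared by all drivers. */
struct toy_layout_ops {
	const char *name;
	int (*read)(int nbytes);	/* toy hook; returns bytes handled */
};

/* Toy behavior only: the "files" driver handles the request as-is. */
static int toy_files_read(int nbytes)
{
	return nbytes;
}

/* Toy behavior only: the "blocks" driver rounds down to 512-byte units. */
static int toy_blocks_read(int nbytes)
{
	return nbytes - nbytes % 512;
}

static const struct toy_layout_ops toy_drivers[] = {
	{ "files",  toy_files_read },
	{ "blocks", toy_blocks_read },
};

static const struct toy_layout_ops *toy_find_driver(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(toy_drivers) / sizeof(toy_drivers[0]); i++)
		if (strcmp(toy_drivers[i].name, name) == 0)
			return &toy_drivers[i];
	return NULL;
}

/* The core never calls a driver directly; it goes through the table. */
static int toy_core_read(const char *type, int nbytes)
{
	const struct toy_layout_ops *ops = toy_find_driver(type);

	return ops ? ops->read(nbytes) : -1;
}
```

The point of the design is that the core stays the same for every layout
type; supporting another type means adding another table entry.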
18d98f6c | 65 | |
8f9cdcb2 | 66 | Files-layout-driver code is in: fs/nfs/filelayout/.. directory |
0d6f3ebf | 67 | Blocks-layout-driver code is in: fs/nfs/blocklayout/.. directory |
8f9cdcb2 | 68 | Flexfiles-layout-driver code is in: fs/nfs/flexfilelayout/.. directory |
18d98f6c | 69 | |
18d98f6c SB |
70 | blocks-layout setup |
71 | ------------------- | |
72 | ||
73 | TODO: Document the setup needs of the blocks layout driver |