Commit | Line | Data |
---|---|---|
97162a1e MCC |
1 | ================================================================= |
2 | Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC) | |
3 | ================================================================= | |
4 | ||
c73690ca VN |
5 | The Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC) feature |
6 | supports Ethernet functionality over Omni-Path fabric by encapsulating | |
7 | the Ethernet packets between HFI nodes. | |
8 | ||
9 | Architecture | |
10 | ============= | |
11 | The pattern of exchange of Omni-Path encapsulated Ethernet packets | |
12 | involves one or more virtual Ethernet switches overlaid on the Omni-Path | |
13 | fabric topology. A subset of HFI nodes on the Omni-Path fabric are | |
14 | permitted to exchange encapsulated Ethernet packets across a particular | |
15 | virtual Ethernet switch. The virtual Ethernet switches are logical | |
16 | abstractions achieved by configuring the HFI nodes on the fabric for | |
17 | header generation and processing. In the simplest configuration all HFI | |
18 | nodes across the fabric exchange encapsulated Ethernet packets over a | |
19 | single virtual Ethernet switch. A virtual Ethernet switch is effectively | |
20 | an independent Ethernet network. The configuration is performed by an | |
21 | Ethernet Manager (EM) which is part of the trusted Fabric Manager (FM) | |
22 | application. HFI nodes can have multiple VNICs each connected to a | |
23 | different virtual Ethernet switch. The diagram below presents a case | |
97162a1e MCC |
24 | of two virtual Ethernet switches with two HFI nodes:: |
25 | ||
26 | +-------------------+ | |
27 | | Subnet/ | | |
28 | | Ethernet | | |
29 | | Manager | | |
30 | +-------------------+ | |
31 | / / | |
32 | / / | |
33 | / / | |
34 | / / | |
35 | +-----------------------------+ +------------------------------+ | |
36 | | Virtual Ethernet Switch | | Virtual Ethernet Switch | | |
37 | | +---------+ +---------+ | | +---------+ +---------+ | | |
38 | | | VPORT | | VPORT | | | | VPORT | | VPORT | | | |
39 | +--+---------+----+---------+-+ +-+---------+----+---------+---+ | |
40 | | \ / | | |
41 | | \ / | | |
42 | | \/ | | |
43 | | / \ | | |
44 | | / \ | | |
45 | +-----------+------------+ +-----------+------------+ | |
46 | | VNIC | VNIC | | VNIC | VNIC | | |
47 | +-----------+------------+ +-----------+------------+ | |
48 | | HFI | | HFI | | |
49 | +------------------------+ +------------------------+ | |
c73690ca VN |
50 | |
51 | ||
52 | The Omni-Path encapsulated Ethernet packet format is described below. | |
53 | ||
97162a1e MCC |
54 | ==================== ================================ |
55 | Bits Field | |
56 | ==================== ================================ | |
c73690ca | 57 | Quad Word 0: |
97162a1e MCC |
58 | 0-19 SLID (lower 20 bits) |
59 | 20-30 Length (in Quad Words) | |
60 | 31 BECN bit | |
61 | 32-51 DLID (lower 20 bits) | |
62 | 52-56 SC (Service Class) | |
63 | 57-59 RC (Routing Control) | |
64 | 60 FECN bit | |
65 | 61-62 L2 (=10, 16B format) | |
66 | 63 LT (=1, Link Transfer Head Flit) | |
c73690ca VN |
67 | |
68 | Quad Word 1: | |
97162a1e MCC |
69 | 0-7 L4 type (=0x78 ETHERNET) |
70 | 8-11 SLID[23:20] | |
71 | 12-15 DLID[23:20] | |
72 | 16-31 PKEY | |
73 | 32-47 Entropy | |
74 | 48-63 Reserved | |
c73690ca VN |
75 | |
76 | Quad Word 2: | |
97162a1e MCC |
77 | 0-15 Reserved |
78 | 16-31 L4 header | |
79 | 32-63 Ethernet Packet | |
c73690ca VN |
80 | |
81 | Quad Words 3 to N-1: | |
97162a1e | 82 | 0-63 Ethernet packet (pad extended) |
c73690ca VN |
83 | |
84 | Quad Word N (last): | |
97162a1e MCC |
85 | 0-23 Ethernet packet (pad extended) |
86 | 24-55 ICRC | |
87 | 56-61 Tail | |
88 | 62-63 LT (=01, Link Transfer Tail Flit) | |
89 | ==================== ================================ | |
c73690ca VN |
90 | |
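The first two quad words of the table above can be sketched as a pair of
packing helpers. This is a minimal illustration, not the driver's code: the
struct, the function names and the macros are hypothetical, and the sketch
assumes that bit 0 in the table is the least significant bit of each quad
word. BECN and FECN are left clear.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field container; names are illustrative only. */
struct vnic_hdr_fields {
	uint32_t slid;    /* 24-bit source LID */
	uint32_t dlid;    /* 24-bit destination LID */
	uint16_t len_qw;  /* packet length in quad words (11 bits) */
	uint8_t  sc;      /* Service Class (5 bits) */
	uint8_t  rc;      /* Routing Control (3 bits) */
	uint16_t pkey;
	uint16_t entropy;
};

#define VNIC_L2_16B   0x2u   /* L2 = binary 10, 16B format */
#define VNIC_L4_ETHER 0x78u  /* L4 type for Ethernet */

/* Place the low `width` bits of `val` at bit offset `shift`. */
static uint64_t put_bits(uint64_t val, unsigned shift, unsigned width)
{
	assert(width > 0 && width < 64 && shift + width <= 64);
	return (val & ((1ULL << width) - 1)) << shift;
}

/* Quad Word 0: assumes bit 0 is the least significant bit. */
static uint64_t vnic_build_qw0(const struct vnic_hdr_fields *f)
{
	return put_bits(f->slid, 0, 20) |      /* SLID (lower 20 bits) */
	       put_bits(f->len_qw, 20, 11) |   /* Length in quad words */
	       put_bits(f->dlid, 32, 20) |     /* DLID (lower 20 bits) */
	       put_bits(f->sc, 52, 5) |
	       put_bits(f->rc, 57, 3) |
	       put_bits(VNIC_L2_16B, 61, 2) |
	       put_bits(1, 63, 1);             /* LT = 1, head flit */
}

/* Quad Word 1: bits 48-63 are reserved and stay zero. */
static uint64_t vnic_build_qw1(const struct vnic_hdr_fields *f)
{
	return put_bits(VNIC_L4_ETHER, 0, 8) |
	       put_bits(f->slid >> 20, 8, 4) |   /* SLID[23:20] */
	       put_bits(f->dlid >> 20, 12, 4) |  /* DLID[23:20] */
	       put_bits(f->pkey, 16, 16) |
	       put_bits(f->entropy, 32, 16);
}
```

Splitting each 24-bit LID into a 20-bit and a 4-bit piece mirrors the table:
the lower 20 bits travel in Quad Word 0 and the upper 4 bits in Quad Word 1.
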
91 | The Ethernet packet is padded on the transmit side to ensure that the VNIC OPA | |
92 | packet is quad word aligned. The 'Tail' field contains the number of bytes | |
93 | padded. On the receive side the 'Tail' field is read and the padding is | |
94 | removed (along with the ICRC, Tail and OPA header) before the packet is | |
95 | passed up the network stack. | |
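
The padding rule above can be sketched as simple arithmetic. The byte counts
are assumptions read off the packet format table, not constants taken from
the driver: 20 header bytes (Quad Words 0 and 1 plus the reserved and L4
header bits of Quad Word 2) and 5 trailer bytes (the 4-byte ICRC plus one
byte holding Tail and LT). All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define HDR_BYTES     20u  /* assumed: QW0 + QW1 + first 4 bytes of QW2 */
#define TRAILER_BYTES 5u   /* assumed: 4-byte ICRC + 1 byte of Tail and LT */
#define QW_BYTES      8u   /* one quad word */

/* Transmit side: bytes of padding needed after the Ethernet packet. */
static unsigned vnic_pad_bytes(size_t eth_len)
{
	size_t unpadded = HDR_BYTES + eth_len + TRAILER_BYTES;

	return (unsigned)((QW_BYTES - unpadded % QW_BYTES) % QW_BYTES);
}

/* Transmit side: total packet length in quad words, padding included. */
static size_t vnic_wire_qwords(size_t eth_len)
{
	return (HDR_BYTES + eth_len + vnic_pad_bytes(eth_len) +
		TRAILER_BYTES) / QW_BYTES;
}

/* Receive side: recover the Ethernet length from the 'Tail' value. */
static size_t vnic_rx_eth_len(size_t wire_bytes, unsigned tail)
{
	assert(tail < QW_BYTES);
	assert(wire_bytes >= HDR_BYTES + TRAILER_BYTES + tail);
	return wire_bytes - HDR_BYTES - TRAILER_BYTES - tail;
}
```

For example, a 60-byte Ethernet packet gives 85 unpadded bytes, so 3 bytes of
padding bring the packet to 88 bytes, or 11 quad words. The receiver reads
Tail = 3 and recovers the original 60 bytes.
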
96 | ||
97 | The L4 header field contains the ID of the virtual Ethernet switch that the |
98 | VNIC port belongs to. On the receive side, this field is used to | |
99 | de-multiplex the received VNIC packets to different VNIC ports. | |
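
The receive-side de-multiplexing described above might look like the lookup
below. This is a sketch under stated assumptions: the port table, the struct
and the function names are hypothetical, bit 0 is taken to be the least
significant bit of Quad Word 2, and the whole 16-bit L4 header is treated as
the switch ID (real hardware may use fewer bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-port record keyed by virtual Ethernet switch ID. */
struct vnic_port {
	uint16_t vesw_id;
	const char *name;
};

/* Extract the L4 header (bits 16-31 of Quad Word 2). */
static uint16_t vnic_l4_hdr(uint64_t qw2)
{
	return (uint16_t)((qw2 >> 16) & 0xFFFF);
}

/* Return the port whose switch ID matches the packet, or NULL if none. */
static const struct vnic_port *vnic_demux(const struct vnic_port *ports,
					  size_t n, uint64_t qw2)
{
	uint16_t id = vnic_l4_hdr(qw2);

	assert(ports != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ports[i].vesw_id == id)
			return &ports[i];
	}
	return NULL;  /* no VNIC port on this virtual switch */
}
```

A linear scan keeps the sketch short. A node with many VNIC ports would more
likely use an indexed or hashed lookup.
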
100 | ||
101 | Driver Design | |
102 | ============== | |
103 | The Intel OPA VNIC software design is presented in the diagram below. |
104 | OPA VNIC functionality has a HW dependent component and a HW | |
105 | independent component. | |
106 | ||
107 | Support has been added for the IB device to allocate and free RDMA | |
108 | netdev devices. The RDMA netdev supports interfacing with the network | |
109 | stack, thus creating standard network interfaces. OPA_VNIC is an RDMA | |
110 | netdev device type. | |
111 | ||
112 | The HW dependent VNIC functionality is part of the HFI1 driver. It | |
113 | implements the verbs to allocate and free the OPA_VNIC RDMA netdev. | |
114 | It involves HW resource allocation/management for VNIC functionality. | |
115 | It interfaces with the network stack and implements the required | |
116 | net_device_ops functions. It expects Omni-Path encapsulated Ethernet | |
117 | packets in the transmit path and provides HW access to them. It strips | |
118 | the Omni-Path header from the received packets before passing them up | |
119 | the network stack. It also implements the RDMA netdev control operations. | |
120 | ||
121 | The OPA VNIC module implements the HW independent VNIC functionality. | |
122 | It consists of two parts. The VNIC Ethernet Management Agent (VEMA) | |
123 | registers itself with IB core as an IB client and interfaces with the | |
124 | IB MAD stack. It exchanges the management information with the Ethernet | |
125 | Manager (EM) and the VNIC netdev. The VNIC netdev part allocates and frees | |
126 | the OPA_VNIC RDMA netdev devices. It overrides the net_device_ops functions | |
127 | set by the HW dependent VNIC driver where required to accommodate any control | |
128 | operation. It also handles the encapsulation of Ethernet packets with an | |
129 | Omni-Path header in the transmit path. For each VNIC interface, the | |
130 | information required for encapsulation is configured by the EM via the VEMA MAD | |
131 | interface. It also passes any control information to the HW dependent driver | |
97162a1e | 132 | by invoking the RDMA netdev control operations:: |
c73690ca VN |
133 | |
134 | +-------------------+ +----------------------+ | |
135 | | | | Linux | | |
136 | | IB MAD | | Network | | |
137 | | | | Stack | | |
138 | +-------------------+ +----------------------+ | |
139 | | | | | |
140 | | | | | |
141 | +----------------------------+ | | |
142 | | | | | |
143 | | OPA VNIC Module | | | |
144 | | (OPA VNIC RDMA Netdev | | | |
145 | | & EMA functions) | | | |
146 | | | | | |
147 | +----------------------------+ | | |
148 | | | | |
149 | | | | |
150 | +------------------+ | | |
151 | | IB core | | | |
152 | +------------------+ | | |
153 | | | | |
154 | | | | |
155 | +--------------------------------------------+ | |
156 | | | | |
157 | | HFI1 Driver with VNIC support | | |
158 | | | | |
159 | +--------------------------------------------+ |