Commit | Line | Data |
---|---|---|
5c0bb261 SN |
1 | |
2 | =============================================== | |
3 | XFRM device - offloading the IPsec computations | |
4 | =============================================== | |
5 | Shannon Nelson <shannon.nelson@oracle.com> | |
6 | ||
7 | ||
8 | Overview | |
9 | ======== | |
10 | ||
11 | IPsec is a useful feature for securing network traffic, but the | |
12 | computational cost is high: a 10Gbps link can easily be brought down | |
13 | to under 1Gbps, depending on the traffic and link configuration. | |
14 | Luckily, there are NICs that offer a hardware-based IPsec offload which | |
15 | can radically increase throughput and decrease CPU utilization. The XFRM | |
16 | Device interface allows NIC drivers to offer to the stack access to the | |
17 | hardware offload. | |
18 | ||
19 | Userland access to the offload is typically through a system such as | |
20 | libreswan or KAME/racoon, but the iproute2 'ip xfrm' command set can | |
21 | be handy when experimenting. An example command might look something | |
22 | like this: | |
23 | ||
24 | ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \ | |
25 | reqid 0x07 replay-window 32 \ | |
26 | aead 'rfc4106(gcm(aes))' 0x44434241343332312423222114131211f4f3f2f1 128 \ | |
27 | sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp \ | |
28 | offload dev eth4 dir in | |
29 | ||
30 | Yes, that's ugly, but that's what shell scripts and/or libreswan are for. | |
31 | ||
32 | ||
33 | ||
34 | Callbacks to implement | |
35 | ====================== | |
36 | ||
37 | /* from include/linux/netdevice.h */ | |
38 | struct xfrmdev_ops { | |
39 | int (*xdo_dev_state_add) (struct xfrm_state *x); | |
40 | void (*xdo_dev_state_delete) (struct xfrm_state *x); | |
41 | void (*xdo_dev_state_free) (struct xfrm_state *x); | |
42 | bool (*xdo_dev_offload_ok) (struct sk_buff *skb, | |
43 | struct xfrm_state *x); | |
50bd870a | 44 | void (*xdo_dev_state_advance_esn) (struct xfrm_state *x); |
5c0bb261 SN |
45 | }; |
46 | ||
47 | The NIC driver offering IPsec offload will need to implement these | |
48 | callbacks to make the offload available to the network stack's | |
49 | XFRM subsystem. Additionally, the feature bits NETIF_F_HW_ESP and | |
50 | NETIF_F_HW_ESP_TX_CSUM will signal the availability of the offload. | |
51 | ||
52 | ||
53 | ||
54 | Flow | |
55 | ==== | |
56 | ||
57 | At probe time and before the call to register_netdev(), the driver should | |
58 | set up local data structures and XFRM callbacks, and set the feature bits. | |
59 | The XFRM code's listener will finish the setup on NETDEV_REGISTER. | |
60 | ||
61 | adapter->netdev->xfrmdev_ops = &ixgbe_xfrmdev_ops; | |
62 | adapter->netdev->features |= NETIF_F_HW_ESP; | |
63 | adapter->netdev->hw_enc_features |= NETIF_F_HW_ESP; | |
64 | ||
65 | When new SAs are set up with a request for the "offload" feature, the | |
66 | driver's xdo_dev_state_add() will be given the new SA to be offloaded | |
67 | and an indication of whether it is for Rx or Tx. The driver should | |
68 | - verify the algorithm is supported for offloads | |
69 | - store the SA information (key, salt, target-ip, protocol, etc) | |
70 | - enable the HW offload of the SA | |
4a132095 SN |
71 | - return status value: |
72 | 0 success | |
73 | -EOPNOTSUPP offload not supported, try SW IPsec | |
74 | other fail the request | |
5c0bb261 SN |
75 | |
76 | The driver can also set an offload_handle in the SA, an opaque void pointer | |
77 | that can be used to convey context into the fast-path offload requests. | |
78 | ||
79 | xs->xso.offload_handle = context; | |
80 | ||
81 | ||
82 | When the network stack is preparing an IPsec packet for an SA that has | |
83 | been set up for offload, it first calls into xdo_dev_offload_ok() with | |
84 | the skb and the intended offload state to ask the driver if the offload | |
85 | will be serviceable. The driver can check the packet information to be sure the | |
86 | offload can be supported (e.g. IPv4 or IPv6, no IPv4 options, etc) and | |
87 | return true or false to signify its support. | |
88 | ||
89 | When ready to send, the driver needs to inspect the Tx packet for the | |
90 | offload information, including the opaque context, and set up the packet | |
91 | send accordingly. | |
92 | ||
93 | xs = xfrm_input_state(skb); | |
94 | context = xs->xso.offload_handle; | |
95 | set up HW for send | |
96 | ||
97 | The stack has already inserted the appropriate IPsec headers in the | |
98 | packet data; the offload just needs to do the encryption and fix up the | |
99 | header values. | |
100 | ||
101 | ||
102 | When a packet is received and the HW has indicated that it offloaded a | |
103 | decryption, the driver needs to add a reference to the decoded SA into | |
104 | the packet's skb. At this point the data should be decrypted but the | |
105 | IPsec headers are still in the packet data; they are removed later up | |
106 | the stack in xfrm_input(). | |
107 | ||
108 | find and hold the SA that was used for the Rx skb | |
109 | get spi, protocol, and destination IP from packet headers | |
110 | xs = find xs from (spi, protocol, dest_IP) | |
111 | xfrm_state_hold(xs); | |
112 | ||
113 | store the state information into the skb | |
114 | skb->sp = secpath_dup(skb->sp); | |
115 | skb->sp->xvec[skb->sp->len++] = xs; | |
116 | skb->sp->olen++; | |
117 | ||
118 | indicate the success and/or error status of the offload | |
119 | xo = xfrm_offload(skb); | |
120 | xo->flags = CRYPTO_DONE; | |
121 | xo->status = crypto_status; | |
122 | ||
123 | hand the packet to napi_gro_receive() as usual | |
124 | ||
50bd870a YE |
125 | In ESN mode, xdo_dev_state_advance_esn() is called from xfrm_replay_advance_esn(). |
126 | The driver checks the packet sequence number and updates the HW ESN state machine if needed. | |
5c0bb261 SN |
127 | |
128 | When the SA is removed by the user, the driver's xdo_dev_state_delete() | |
129 | is asked to disable the offload. Later, xdo_dev_state_free() is called | |
130 | from a garbage collection routine after all reference counts to the state | |
131 | have been removed and any remaining resources can be cleared for the | |
132 | offload state. How these are used by the driver will depend on specific | |
133 | hardware needs. | |
134 | ||
135 | When a netdev is set to DOWN, the XFRM stack's netdev listener will call | |
136 | xdo_dev_state_delete() and xdo_dev_state_free() on any remaining offloaded | |
137 | states. | |
138 | ||
139 |
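The status values expected from xdo_dev_state_add() and the packet check
described for xdo_dev_offload_ok() can be sketched in userspace as below.
This is only an illustration under stated assumptions, not driver code:
the mock_* types and functions are hypothetical stand-ins for the kernel
structures, the choice to accept only 'rfc4106(gcm(aes))' is an assumed
hardware limitation, and the IPv4 header-length test is one possible way
to reject packets that carry IPv4 options.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct xfrm_state; NOT the kernel definition. */
struct mock_xfrm_state {
	const char *aead_name;	/* algorithm name, e.g. "rfc4106(gcm(aes))" */
	void *offload_handle;	/* plays the role of xs->xso.offload_handle */
};

/* Hypothetical stand-in for the packet fields a driver might inspect. */
struct mock_pkt {
	uint8_t ip_version;	/* 4 or 6 */
	uint8_t ihl;		/* IPv4 header length in 32-bit words */
};

/* Placeholder for the per-SA context a real driver would allocate. */
static int mock_hw_ctx;

/*
 * Mirrors the documented return convention:
 *   0            success
 *   -EOPNOTSUPP  offload not supported, the stack may try SW IPsec
 *   other        fail the request
 */
int mock_state_add(struct mock_xfrm_state *xs)
{
	if (xs == NULL || xs->aead_name == NULL)
		return -EINVAL;

	/* Assumption: this imaginary hardware only handles AES-GCM. */
	if (strcmp(xs->aead_name, "rfc4106(gcm(aes))") != 0)
		return -EOPNOTSUPP;

	/* Store the opaque context for later fast-path use. */
	xs->offload_handle = &mock_hw_ctx;
	return 0;
}

/* Returns true only for packets this imaginary hardware can offload. */
bool mock_offload_ok(const struct mock_pkt *pkt,
		     const struct mock_xfrm_state *xs)
{
	if (xs->offload_handle == NULL)
		return false;		/* SA was never offloaded */

	if (pkt->ip_version == 6)
		return true;

	if (pkt->ip_version == 4)
		return pkt->ihl == 5;	/* ihl > 5 means IPv4 options */

	return false;
}
```

A real driver would perform the equivalent checks on struct xfrm_state and
struct sk_buff, and would program the hardware before returning 0 from its
xdo_dev_state_add() callback.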