Commit | Line | Data |
---|---|---|
c1e4535f | 1 | .. SPDX-License-Identifier: GPL-2.0 |
1da177e4 | 2 | |
c1e4535f MCC |
3 | ==================================== |
4 | HOWTO for the linux packet generator | |
5 | ==================================== | |
1da177e4 | 6 | |
4e081e0c BH |
7 | Enable CONFIG_NET_PKTGEN to compile and build pktgen either in-kernel |
8 | or as a module. A module is preferred; modprobe pktgen if needed. Once | |
ca5b542c BH |
9 | running, pktgen creates a thread for each CPU with affinity to that CPU. |
10 | Monitoring and controlling is done via /proc. It is easiest to select a | |
11 | suitable sample script and configure that. | |
1da177e4 | 12 | |
c1e4535f MCC |
13 | On a dual CPU:: |
14 | ||
15 | ps aux | grep pkt | |
16 | root 129 0.3 0.0 0 0 ? SW 2003 523:20 [kpktgend_0] | |
17 | root 130 0.3 0.0 0 0 ? SW 2003 509:50 [kpktgend_1] | |
1da177e4 | 18 | |
1da177e4 | 19 | |
c1e4535f | 20 | For monitoring and control pktgen creates:: |
1da177e4 | 21 | |
1da177e4 LT |
22 | /proc/net/pktgen/pgctrl |
23 | /proc/net/pktgen/kpktgend_X | |
c1e4535f | 24 | /proc/net/pktgen/ethX |
1da177e4 LT |
25 | |
26 | ||
9ceb87fc JDB |
27 | Tuning NIC for max performance |
28 | ============================== | |
29 | ||
ca5b542c | 30 | The default NIC settings are (likely) not tuned for pktgen's artificial |
9ceb87fc JDB |
31 | overload type of benchmarking, as this could hurt the normal use-case. |
32 | ||
c1e4535f MCC |
33 | In particular, consider increasing the TX ring buffer in the NIC::
34 | ||
9ceb87fc JDB |
35 | # ethtool -G ethX tx 1024 |
36 | ||
37 | A larger TX ring can improve pktgen's performance, while it can hurt | |
38 | in the general case, 1) because the TX ring buffer might get larger | |
ca5b542c | 39 | than the CPU's L1/L2 cache, 2) because it allows more queueing in the |
9ceb87fc JDB |
40 | NIC HW layer (which is bad for bufferbloat). |
41 | ||
ca5b542c | 42 | One should hesitate to conclude that packets/descriptors in the HW |
9ceb87fc | 43 | TX ring cause delay. Drivers usually delay cleaning up the |
ca5b542c BH |
44 | ring-buffers for various performance reasons, and packets stalling |
45 | the TX ring might just be waiting for cleanup. | |
9ceb87fc | 46 | |
ca5b542c BH |
47 | This cleanup issue is specifically the case for the driver ixgbe |
48 | (Intel 82599 chip). This driver (ixgbe) combines TX+RX ring cleanups, | |
9ceb87fc JDB |
49 | and the cleanup interval is affected by the ethtool --coalesce setting |
50 | of parameter "rx-usecs". | |
51 | ||
c1e4535f MCC |
52 | For ixgbe use e.g. "30" resulting in approx 33K interrupts/sec (1/30*10^6):: |
53 | ||
9ceb87fc JDB |
54 | # ethtool -C ethX rx-usecs 30 |
55 | ||
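The arithmetic behind the "approx 33K interrupts/sec" figure can be sketched as follows. This is only an illustration: ``usecs_to_irq_rate`` is a hypothetical helper name, and the result is an upper bound implied by the coalescing interval, not a measured rate.

```shell
#!/bin/bash
# Hypothetical helper (not part of pktgen or ethtool): convert an
# rx-usecs coalescing interval into the implied maximum interrupt rate.
usecs_to_irq_rate() {
    local usecs=$1
    # One interrupt at most every $usecs microseconds -> 10^6 / usecs per second.
    echo $(( 1000000 / usecs ))
}

usecs_to_irq_rate 30    # prints 33333
```

Integer division is used, so the value is rounded down.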
56 | ||
2a1ddf27 JDB |
57 | Kernel threads |
58 | ============== | |
59 | Pktgen creates a thread for each CPU with affinity to that CPU. | |
60 | Each thread is controlled through the procfile /proc/net/pktgen/kpktgend_X. |
61 | ||
c1e4535f | 62 | Example: /proc/net/pktgen/kpktgend_0:: |
2a1ddf27 JDB |
63 | |
64 | Running: | |
65 | Stopped: eth4@0 | |
66 | Result: OK: add_device=eth4@0 | |
67 | ||
68 | Most important are the devices assigned to the thread. | |
1da177e4 | 69 | |
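As a rough sketch, the devices assigned to a thread could be listed by picking the names off the "Running:" and "Stopped:" lines. ``thread_devices`` is a hypothetical helper, and it assumes the procfile layout matches the example shown above.

```shell
#!/bin/bash
# Hypothetical helper: print every device listed on the "Running:" and
# "Stopped:" lines of a pktgen thread file, one name per line.
# Assumes the file layout shown in the kpktgend_0 example.
thread_devices() {
    local thread_file=$1
    awk '$1 == "Running:" || $1 == "Stopped:" {
             for (i = 2; i <= NF; i++) print $i
         }' "$thread_file"
}

# Usage on a live system (requires pktgen to be loaded):
#   thread_devices /proc/net/pktgen/kpktgend_0
```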
2a1ddf27 | 70 | The two basic thread commands are: |
c1e4535f | 71 | |
2a1ddf27 JDB |
72 | * add_device DEVICE@NAME -- adds a single device |
73 | * rem_device_all -- remove all associated devices | |
1da177e4 | 74 | |
edb9a1b8 | 75 | When adding a device to a thread, a corresponding procfile is created |
2a1ddf27 JDB |
76 | which is used for configuring this device. Thus, device names need to |
77 | be unique. | |
78 | ||
79 | To support adding the same device to multiple threads, which is useful | |
edb9a1b8 | 80 | with multi queue NICs, the device naming scheme is extended with "@": |
c1e4535f | 81 | device@something |
2a1ddf27 JDB |
82 | |
83 | The part after "@" can be anything, but it is customary to use the thread |
84 | number. | |
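A minimal sketch of the naming scheme, attaching one NIC to several threads. ``add_dev_to_threads`` and the ``PKTGEN_DIR`` variable are assumptions made for this illustration; the sample scripts use their own helpers.

```shell
#!/bin/bash
# Hypothetical helper: add the same device to COUNT consecutive pktgen
# threads, naming each instance DEVICE@THREAD as described above.
# PKTGEN_DIR is an assumption of this sketch so that it can be pointed
# at a scratch directory; on a live system it is /proc/net/pktgen.
PKTGEN_DIR=${PKTGEN_DIR:-/proc/net/pktgen}

add_dev_to_threads() {
    local dev=$1 first=$2 count=$3
    local t
    for (( t = first; t < first + count; t++ )); do
        # Each thread gets its own uniquely named device, e.g. eth4@0, eth4@1.
        echo "add_device ${dev}@${t}" > "${PKTGEN_DIR}/kpktgend_${t}"
    done
}

# Usage on a live system (as root, with pktgen loaded):
#   add_dev_to_threads eth4 0 2
```

After such a call, /proc/net/pktgen/eth4@0 and /proc/net/pktgen/eth4@1 would be the per-device procfiles to configure.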
1da177e4 LT |
85 | |
86 | Viewing devices | |
87 | =============== | |
88 | ||
ca5b542c BH |
89 | The Params section holds configured information. The Current section |
90 | holds running statistics. The Result is printed after a run or after | |
c1e4535f MCC |
91 | interruption. Example:: |
92 | ||
93 | /proc/net/pktgen/eth4@0 | |
94 | ||
95 | Params: count 100000 min_pkt_size: 60 max_pkt_size: 60 | |
96 | frags: 0 delay: 0 clone_skb: 64 ifname: eth4@0 | |
97 | flows: 0 flowlen: 0 | |
98 | queue_map_min: 0 queue_map_max: 0 | |
99 | dst_min: 192.168.81.2 dst_max: | |
100 | src_min: src_max: | |
101 | src_mac: 90:e2:ba:0a:56:b4 dst_mac: 00:1b:21:3c:9d:f8 | |
102 | udp_src_min: 9 udp_src_max: 109 udp_dst_min: 9 udp_dst_max: 9 | |
103 | src_mac_count: 0 dst_mac_count: 0 | |
104 | Flags: UDPSRC_RND NO_TIMESTAMP QUEUE_MAP_CPU | |
105 | Current: | |
106 | pkts-sofar: 100000 errors: 0 | |
107 | started: 623913381008us stopped: 623913396439us idle: 25us | |
108 | seq_num: 100001 cur_dst_mac_offset: 0 cur_src_mac_offset: 0 | |
109 | cur_saddr: 192.168.8.3 cur_daddr: 192.168.81.2 | |
110 | cur_udp_dst: 9 cur_udp_src: 42 | |
111 | cur_queue_map: 0 | |
112 | flows: 0 | |
113 | Result: OK: 15430(c15405+d25) usec, 100000 (60byte,0frags) | |
114 | 6480562pps 3110Mb/sec (3110669760bps) errors: 0 | |
2a1ddf27 | 115 | |
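The figures on the Result line are consistent with each other: the bit rate equals the packet rate times the packet size in bits. The check below is only a sketch; ``pps_to_bps`` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: derive bits per second from packets per second
# and the packet size in bytes, as reported on the Result line.
pps_to_bps() {
    local pps=$1 pkt_size=$2
    echo $(( pps * pkt_size * 8 ))
}

pps_to_bps 6480562 60    # prints 3110669760
```

Dividing by 10^6 and rounding down gives the 3110 Mb/sec shown in the example.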
1da177e4 | 116 | |
2a1ddf27 JDB |
117 | Configuring devices |
118 | =================== | |
7c95a9d9 BH |
119 | This is done via the /proc interface, and most easily done via pgset |
120 | as defined in the sample scripts. | |
d2ee7973 | 121 | You need to specify PGDEV environment variable to use functions from sample |
c1e4535f MCC |
122 | scripts, i.e.:: |
123 | ||
124 | export PGDEV=/proc/net/pktgen/eth4@0 | |
125 | source samples/pktgen/functions.sh | |
1da177e4 | 126 | |
c1e4535f | 127 | Examples:: |
1da177e4 | 128 | |
d2ee7973 DS |
129 | pg_ctrl start starts injection. |
130 | pg_ctrl stop aborts injection. Also, ^C aborts generator. | |
131 | ||
1da177e4 LT |
132 | pgset "clone_skb 1" sets the number of copies of the same packet |
133 | pgset "clone_skb 0" use single SKB for all transmits | |
38b2cf29 | 134 | pgset "burst 8" uses xmit_more API to queue 8 copies of the same |
c1e4535f MCC |
135 | packet and update HW tx queue tail pointer once. |
136 | "burst 1" is the default | |
1da177e4 LT |
137 | pgset "pkt_size 9014" sets packet size to 9014 |
138 | pgset "frags 5" packet will consist of 5 fragments | |
139 | pgset "count 200000" sets number of packets to send, set to zero | |
c1e4535f | 140 | for continuous sends until explicitly stopped. |
1da177e4 LT |
141 | |
142 | pgset "delay 5000" adds a delay, in nanoseconds, to hard_start_xmit() |
143 | ||
144 | pgset "dst 10.0.0.1" sets IP destination address | |
c1e4535f | 145 | (BEWARE! This generator is very aggressive!) |
1da177e4 LT |
146 | |
147 | pgset "dst_min 10.0.0.1" Same as dst | |
148 | pgset "dst_max 10.0.0.254" Set the maximum destination IP. | |
149 | pgset "src_min 10.0.0.1" Set the minimum (or only) source IP. | |
150 | pgset "src_max 10.0.0.254" Set the maximum source IP. | |
151 | pgset "dst6 fec0::1" IPV6 destination address | |
152 | pgset "src6 fec0::2" IPV6 source address | |
153 | pgset "dstmac 00:00:00:00:00:00" sets MAC destination address | |
154 | pgset "srcmac 00:00:00:00:00:00" sets MAC source address | |
155 | ||
896a7cf8 ED |
156 | pgset "queue_map_min 0" Sets the min value of tx queue interval |
157 | pgset "queue_map_max 7" Sets the max value of tx queue interval, for multiqueue devices | |
c1e4535f MCC |
158 | To select queue 1 of a given device, |
159 | use queue_map_min=1 and queue_map_max=1 | |
896a7cf8 | 160 | |
d012827e | 161 | pgset "src_mac_count 1" Sets the number of MACs we'll range through. |
c1e4535f | 162 | The 'minimum' MAC is what you set with srcmac. |
1da177e4 LT |
163 | |
164 | pgset "dst_mac_count 1" Sets the number of MACs we'll range through. | |
c1e4535f | 165 | The 'minimum' MAC is what you set with dstmac. |
1da177e4 LT |
166 | |
167 | pgset "flag [name]" Set a flag to determine behaviour. Current flags | |
c1e4535f MCC |
168 | are: IPSRC_RND # IP source is random (between min/max) |
169 | IPDST_RND # IP destination is random | |
170 | UDPSRC_RND, UDPDST_RND, | |
171 | MACSRC_RND, MACDST_RND | |
172 | TXSIZE_RND, IPV6, | |
173 | MPLS_RND, VID_RND, SVID_RND | |
174 | FLOW_SEQ, | |
175 | QUEUE_MAP_RND # queue map random | |
176 | QUEUE_MAP_CPU # queue map mirrors smp_processor_id() | |
177 | UDPCSUM, | |
178 | IPSEC # IPsec encapsulation (needs CONFIG_XFRM) | |
179 | NODE_ALLOC # node specific memory allocation | |
180 | NO_TIMESTAMP # disable timestamping | |
7c7dd1d6 | 181 | SHARED # enable shared SKB |
d2ee7973 | 182 | pgset 'flag ![name]' Clear a flag to determine behaviour. |
c1e4535f MCC |
183 | Note that you might need to use single quote in |
184 | interactive mode, so that your shell wouldn't expand | |
185 | the specified flag as a history command. | |
896a7cf8 | 186 | |
d2ee7973 | 187 | pgset "spi [SPI_VALUE]" Set specific SA used to transform packet. |
1da177e4 LT |
188 | |
189 | pgset "udp_src_min 9" set UDP source port min. If < udp_src_max, then |
c1e4535f | 190 | cycle through the port range. |
1da177e4 LT |
191 | |
192 | pgset "udp_src_max 9" set UDP source port max. | |
193 | pgset "udp_dst_min 9" set UDP destination port min. If < udp_dst_max, then |
c1e4535f | 194 | cycle through the port range. |
1da177e4 LT |
195 | pgset "udp_dst_max 9" set UDP destination port max. |
196 | ||
ca6549af | 197 | pgset "mpls 0001000a,0002000a,0000000a" set MPLS labels (in this example |
c1e4535f | 198 | outer label=16,middle label=32, |
ca6549af SW |
199 | inner label=0 (IPv4 NULL)) Note that |
200 | there must be no spaces between the | |
201 | arguments. Leading zeros are required. | |
202 | Do not set the bottom of stack bit, | |
fa00e7e1 | 203 | that's done automatically. If you do |
ca6549af SW |
204 | set the bottom of stack bit, that |
205 | indicates that you want to randomly | |
206 | generate that address and the flag | |
207 | MPLS_RND will be turned on. You | |
208 | can have any mix of random and fixed | |
209 | labels in the label stack. | |
210 | ||
211 | pgset "mpls 0" turn off mpls (or any invalid argument works too!) | |
212 | ||
f0e82fd0 FF |
213 | pgset "vlan_id 77" set VLAN ID 0-4095 |
214 | pgset "vlan_p 3" set priority bit 0-7 (default 0) | |
215 | pgset "vlan_cfi 0" set canonical format identifier 0-1 (default 0) | |
216 | ||
217 | pgset "svlan_id 22" set SVLAN ID 0-4095 | |
218 | pgset "svlan_p 3" set priority bit 0-7 (default 0) | |
219 | pgset "svlan_cfi 0" set canonical format identifier 0-1 (default 0) | |
220 | ||
221 | pgset "vlan_id 9999" > 4095 remove vlan and svlan tags | |
222 | pgset "svlan_id 9999" > 4095 remove svlan tag |
223 | ||
224 | ||
225 | pgset "tos XX" set former IPv4 TOS field (e.g. "tos 28" for AF11 no ECN, default 00) | |
226 | pgset "traffic_class XX" set former IPv6 TRAFFIC CLASS (e.g. "traffic_class B8" for EF no ECN, default 00) | |
227 | ||
43d28b65 DT |
228 | pgset "rate 300M" set rate to 300 Mb/s |
229 | pgset "ratep 1000000" set rate to 1Mpps | |
1da177e4 | 230 | |
62f64aed AS |
231 | pgset "xmit_mode netif_receive" RX inject into stack netif_receive_skb() |
232 | Works with "burst" but not with "clone_skb". | |
233 | Default xmit_mode is "start_xmit". | |
234 | ||
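The hexadecimal arguments for "mpls", "tos" and "traffic_class" can be derived by hand. The sketch below assumes the standard MPLS label stack entry layout (20-bit label, 3-bit traffic class, bottom-of-stack bit, 8-bit TTL) and the usual placement of the DSCP in the upper six bits of the TOS byte. Both helper names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build one 8-hex-digit MPLS label stack entry
# with the traffic class and the bottom-of-stack bit left at zero.
mpls_entry() {
    local label=$1 ttl=$2
    printf '%08x\n' $(( (label << 12) | ttl ))
}

# Hypothetical helper: convert a DSCP value to the hex TOS /
# traffic-class byte with both ECN bits cleared.
dscp_to_tos() {
    local dscp=$1
    printf '%02X\n' $(( dscp << 2 ))
}

mpls_entry 16 10    # prints 0001000a (outer label in the example above)
mpls_entry 32 10    # prints 0002000a (middle label)
dscp_to_tos 10      # AF11, prints 28
dscp_to_tos 46      # EF, prints B8
```

The printed values match the "mpls 0001000a,0002000a,0000000a", "tos 28" and "traffic_class B8" examples in the list above.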
7c95a9d9 BH |
235 | Sample scripts |
236 | ============== | |
1da177e4 | 237 | |
6f094797 JDB |
238 | A collection of tutorial scripts and helpers for pktgen is in the |
239 | samples/pktgen directory. The helper parameters.sh file supports easy |
edb9a1b8 | 240 | and consistent parameter parsing across the sample scripts. |
6f094797 | 241 | |
c1e4535f MCC |
242 | Usage example and help:: |
243 | ||
6f094797 JDB |
244 | ./pktgen_sample01_simple.sh -i eth4 -m 00:1B:21:3C:9D:F8 -d 192.168.8.2 |
245 | ||
c1e4535f MCC |
246 | Usage::: |
247 | ||
248 | ./pktgen_sample01_simple.sh [-vx] -i ethX | |
249 | ||
6f094797 JDB |
250 | -i : ($DEV) output interface/device (required) |
251 | -s : ($PKT_SIZE) packet size | |
246b184f | 252 | -d : ($DEST_IP) destination IP. CIDR (e.g. 198.18.0.0/15) is also allowed |
6f094797 | 253 | -m : ($DST_MAC) destination MAC-addr |
246b184f | 254 | -p : ($DST_PORT) destination PORT range (e.g. 433-444) is also allowed |
6f094797 | 255 | -t : ($THREADS) threads to start |
246b184f | 256 | -f : ($F_THREAD) index of first thread (zero indexed CPU number) |
6f094797 | 257 | -c : ($SKB_CLONE) SKB clones send before alloc new SKB |
246b184f | 258 | -n : ($COUNT) num messages to send per thread, 0 means indefinitely |
6f094797 JDB |
259 | -b : ($BURST) HW level bursting of SKBs |
260 | -v : ($VERBOSE) verbose | |
261 | -x : ($DEBUG) debug | |
246b184f JK |
262 | -6 : ($IP6) IPv6 |
263 | -w : ($DELAY) Tx Delay value (ns) | |
264 | -a : ($APPEND) Script will not reset generator's state, but will append its config | |
6f094797 JDB |
265 | |
266 | The global variables being set are also listed. E.g. the required | |
267 | interface/device parameter "-i" sets variable $DEV. Copy the | |
268 | pktgen_sampleXX scripts and modify them to fit your own needs. | |
269 | ||
1da177e4 LT |
270 | |
271 | Interrupt affinity | |
272 | =================== | |
ca5b542c BH |
273 | Note that when adding devices to a specific CPU it is a good idea to |
274 | also assign /proc/irq/XX/smp_affinity so that the TX interrupts are bound | |
275 | to the same CPU. This reduces cache bouncing when freeing skbs. | |
1da177e4 | 276 | |
2a1ddf27 JDB |
277 | In addition, use the device flag QUEUE_MAP_CPU, which maps the SKB's TX queue
278 | to the running thread's CPU (directly from smp_processor_id()). |
279 | ||
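The value written to smp_affinity is a hexadecimal CPU bitmask. A minimal sketch of computing the mask for a single CPU follows; ``cpu_to_affinity_mask`` is a hypothetical helper, and this simple form is only intended for CPU numbers below 32.

```shell
#!/bin/bash
# Hypothetical helper: print the hex bitmask that selects one CPU.
# Only intended for CPU numbers below 32 (a single 32-bit mask word).
cpu_to_affinity_mask() {
    local cpu=$1
    printf '%x\n' $(( 1 << cpu ))
}

cpu_to_affinity_mask 0    # prints 1
cpu_to_affinity_mask 3    # prints 8

# Usage on a live system (as root; XX is the IRQ number of the TX queue):
#   cpu_to_affinity_mask 3 > /proc/irq/XX/smp_affinity
```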
e5f79d11 FD |
280 | Enable IPsec |
281 | ============ | |
ca5b542c | 282 | Default IPsec transformation with ESP encapsulation plus transport mode |
c1e4535f | 283 | can be enabled by simply setting:: |
e5f79d11 | 284 | |
c1e4535f MCC |
285 | pgset "flag IPSEC" |
286 | pgset "flows 1" | |
e5f79d11 FD |
287 | |
288 | To avoid breaking existing testbed scripts for using AH type and tunnel mode, | |
ca5b542c | 289 | you can use "pgset spi SPI_VALUE" to specify which transformation mode |
e5f79d11 FD |
290 | to employ. |
291 | ||
7c7dd1d6 LC |
292 | Disable shared SKB |
293 | ================== | |
294 | By default, SKBs sent by pktgen are shared (user count > 1). | |
295 | To test with non-shared SKBs, remove the "SHARED" flag by simply setting:: | |
296 | ||
297 | pgset "flag !SHARED" |
298 | ||
299 | However, if the "clone_skb" or "burst" parameters are configured, the skb | |
300 | still needs to be held by pktgen for further access. Hence the skb must be | |
301 | shared. | |
1da177e4 LT |
302 | |
303 | Current commands and configuration options | |
304 | ========================================== | |
305 | ||
c1e4535f | 306 | **Pgcontrol commands**:: |
1da177e4 | 307 | |
c1e4535f MCC |
308 | start |
309 | stop | |
310 | reset | |
1da177e4 | 311 | |
c1e4535f | 312 | **Thread commands**:: |
1da177e4 | 313 | |
c1e4535f MCC |
314 | add_device |
315 | rem_device_all | |
1da177e4 LT |
316 | |
317 | ||
c1e4535f | 318 | **Device commands**:: |
1da177e4 | 319 | |
c1e4535f MCC |
320 | count |
321 | clone_skb | |
322 | burst | |
323 | debug | |
1da177e4 | 324 | |
c1e4535f MCC |
325 | frags |
326 | delay | |
1da177e4 | 327 | |
c1e4535f MCC |
328 | src_mac_count |
329 | dst_mac_count | |
1da177e4 | 330 | |
c1e4535f MCC |
331 | pkt_size |
332 | min_pkt_size | |
333 | max_pkt_size | |
1da177e4 | 334 | |
c1e4535f MCC |
335 | queue_map_min |
336 | queue_map_max | |
337 | skb_priority | |
91db4b3c | 338 | |
c1e4535f MCC |
339 | tos (ipv4) |
340 | traffic_class (ipv6) | |
91db4b3c | 341 | |
c1e4535f | 342 | mpls |
ca6549af | 343 | |
c1e4535f MCC |
344 | udp_src_min |
345 | udp_src_max | |
1da177e4 | 346 | |
c1e4535f MCC |
347 | udp_dst_min |
348 | udp_dst_max | |
1da177e4 | 349 | |
c1e4535f | 350 | node |
91db4b3c | 351 | |
c1e4535f MCC |
352 | flag |
353 | IPSRC_RND | |
354 | IPDST_RND | |
355 | UDPSRC_RND | |
356 | UDPDST_RND | |
357 | MACSRC_RND | |
358 | MACDST_RND | |
359 | TXSIZE_RND | |
360 | IPV6 | |
361 | MPLS_RND | |
362 | VID_RND | |
363 | SVID_RND | |
364 | FLOW_SEQ | |
365 | QUEUE_MAP_RND | |
366 | QUEUE_MAP_CPU | |
367 | UDPCSUM | |
368 | IPSEC | |
369 | NODE_ALLOC | |
370 | NO_TIMESTAMP | |
7c7dd1d6 | 371 | SHARED |
1da177e4 | 372 | |
c1e4535f | 373 | spi (ipsec) |
91db4b3c | 374 | |
c1e4535f MCC |
375 | dst_min |
376 | dst_max | |
1da177e4 | 377 | |
c1e4535f MCC |
378 | src_min |
379 | src_max | |
1da177e4 | 380 | |
c1e4535f MCC |
381 | dst_mac |
382 | src_mac | |
1da177e4 | 383 | |
c1e4535f | 384 | clear_counters |
1da177e4 | 385 | |
c1e4535f MCC |
386 | src6 |
387 | dst6 | |
388 | dst6_max | |
389 | dst6_min | |
1da177e4 | 390 | |
c1e4535f MCC |
391 | flows |
392 | flowlen | |
1da177e4 | 393 | |
c1e4535f MCC |
394 | rate |
395 | ratep | |
43d28b65 | 396 | |
c1e4535f | 397 | xmit_mode <start_xmit|netif_receive> |
62f64aed | 398 | |
c1e4535f MCC |
399 | vlan_cfi |
400 | vlan_id | |
401 | vlan_p | |
91db4b3c | 402 | |
c1e4535f MCC |
403 | svlan_cfi |
404 | svlan_id | |
405 | svlan_p | |
91db4b3c | 406 | |
62f64aed | 407 | |
1da177e4 | 408 | References: |
c1e4535f MCC |
409 | |
410 | - ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/ | |
246b184f | 411 | - ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/ |
1da177e4 LT |
412 | |
413 | Paper from Linux-Kongress in Erlangen 2004. | |
c1e4535f | 414 | - ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/pktgen_paper.pdf |
1da177e4 LT |
415 | |
416 | Thanks to: | |
c1e4535f | 417 | |
1da177e4 LT |
418 | Grant Grundler for testing on IA-64 and parisc, Harald Welte, Lennert Buytenhek,
419 | Stephen Hemminger, Andi Kleen, Dave Miller and many others. | |
420 | ||
421 | ||
ca6549af | 422 | Good luck with the linux net-development. |