Commit | Line | Data |
---|---|---|
9fbc04f2 | 1 | perf-bench(1) |
4778e0e8 | 2 | ============= |
9fbc04f2 HM |
3 | |
4 | NAME | |
5 | ---- | |
6 | perf-bench - General framework for benchmark suites | |
7 | ||
8 | SYNOPSIS | |
9 | -------- | |
10 | [verse] | |
11 | 'perf bench' [<common options>] <subsystem> <suite> [<options>] | |
12 | ||
13 | DESCRIPTION | |
14 | ----------- | |
08942f6d | 15 | This 'perf bench' command is a general framework for benchmark suites. |
9fbc04f2 HM |
16 | |
17 | COMMON OPTIONS | |
18 | -------------- | |
b6f0629a DB |
19 | -r:: |
20 | --repeat=:: | |
fc5d836c | 21 | Specify number of times to repeat the run (default 10). |
b6f0629a | 22 | |
9fbc04f2 HM |
23 | -f:: |
24 | --format=:: | |
25 | Specify format style. | |
854c5548 | 26 | Currently available format styles are: |
9fbc04f2 HM |
27 | |
28 | 'default':: | |
29 | Default style. This is mainly for human reading. | |
30 | --------------------- | |
854c5548 | 31 | % perf bench sched pipe # with no style specified |
9fbc04f2 HM |
32 | (executing 1000000 pipe operations between two tasks) |
33 | Total time:5.855 sec | |
34 | 5.855061 usecs/op | |
35 | 170792 ops/sec | |
36 | --------------------- | |
37 | ||
38 | 'simple':: | |
39 | This simple style is friendly for automated | |
40 | processing by scripts. | |
41 | --------------------- | |
42 | % perf bench --format=simple sched pipe # specified simple | |
43 | 5.988 | |
44 | --------------------- | |
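
As a minimal sketch of how a script might consume the 'simple' style, the Python below assembles a command line following the SYNOPSIS and parses the single number printed. The helper names `build_argv` and `parse_simple` are hypothetical, and the assumption that the output is one floating-point value is taken only from the 'sched pipe' example above; other suites may print something different.

```python
def build_argv(subsystem, suite, fmt="simple", repeat=None, options=()):
    """Assemble: perf bench [<common options>] <subsystem> <suite> [<options>].

    Hypothetical helper; only --format= and --repeat= from COMMON OPTIONS are used.
    """
    argv = ["perf", "bench", f"--format={fmt}"]
    if repeat is not None:
        argv.append(f"--repeat={repeat}")
    argv += [subsystem, suite, *options]
    return argv


def parse_simple(output):
    """Return the first non-empty line that parses as a float.

    Assumption: the 'simple' style prints one number per run, as in the
    'sched pipe' example.
    """
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            return float(line)
        except ValueError:
            continue
    raise ValueError("no numeric line found in output")


if __name__ == "__main__":
    print(build_argv("sched", "pipe"))
    print(parse_simple("5.988\n"))
```

To actually run the benchmark, the list returned by `build_argv` could be passed to `subprocess.run(..., capture_output=True, text=True)` and its `stdout` handed to `parse_simple`.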
45 | ||
46 | SUBSYSTEM | |
47 | --------- | |
48 | ||
49 | 'sched':: | |
50 | Scheduler and IPC mechanisms. | |
51 | ||
c2a08203 DB |
52 | 'syscall':: |
53 | System call performance (throughput). | |
54 | ||
08942f6d NK |
55 | 'mem':: |
56 | Memory access performance. | |
57 | ||
95a2b3c0 RR |
58 | 'numa':: |
59 | NUMA scheduling and MM benchmarks. | |
60 | ||
61 | 'futex':: | |
62 | Futex stressing benchmarks. | |
63 | ||
121dd9ea DB |
64 | 'epoll':: |
65 | Eventpoll (epoll) stressing benchmarks. | |
66 | ||
2a4b5166 IR |
67 | 'internals':: |
68 | Benchmark internal perf functionality. | |
69 | ||
2df27071 ACM |
70 | 'uprobe':: |
71 | Benchmark overhead of uprobe + BPF. | |
72 | ||
08942f6d NK |
73 | 'all':: |
74 | All benchmark subsystems. | |
75 | ||
9fbc04f2 HM |
76 | SUITES FOR 'sched' |
77 | ~~~~~~~~~~~~~~~~~~ | |
78 | *messaging*:: | |
79 | Suite for evaluating performance of scheduler and IPC mechanisms. | |
80 | Based on hackbench by Rusty Russell. | |
81 | ||
08942f6d NK |
82 | Options of *messaging* |
83 | ^^^^^^^^^^^^^^^^^^^^^^ | |
9fbc04f2 HM |
84 | -p:: |
85 | --pipe:: | |
86 | Use pipe() instead of socketpair() | |
87 | ||
88 | -t:: | |
89 | --thread:: | |
90 | Use multiple threads instead of multiple processes | |
91 | ||
92 | -g:: | |
93 | --group=:: | |
94 | Specify number of groups | |
95 | ||
96 | -l:: | |
b0d22e52 | 97 | --nr_loops=:: |
9fbc04f2 HM |
98 | Specify number of loops |
99 | ||
100 | Example of *messaging* | |
101 | ^^^^^^^^^^^^^^^^^^^^^^ | |
102 | ||
103 | --------------------- | |
104 | % perf bench sched messaging # run with default | |
105 | options (20 sender and receiver processes per group) | |
106 | (10 groups == 400 processes run) | |
107 | ||
108 | Total time:0.308 sec | |
109 | ||
854c5548 | 110 | % perf bench sched messaging -t -g 20 # be multi-thread, with 20 groups |
9fbc04f2 HM |
111 | (20 sender and receiver threads per group) |
112 | (20 groups == 800 threads run) | |
113 | ||
114 | Total time:0.582 sec | |
115 | --------------------- | |
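
The task counts reported in the example follow from the group count: each group has 20 senders and 20 receivers. A small sketch of that arithmetic, where the function name `messaging_tasks` is hypothetical and the defaults are taken from the example output above:

```python
def messaging_tasks(groups=10, per_side=20):
    """Total tasks started: each group has per_side senders and per_side receivers."""
    return groups * per_side * 2


if __name__ == "__main__":
    print(messaging_tasks())           # default run in the example
    print(messaging_tasks(groups=20))  # the '-g 20' run in the example
```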
116 | ||
117 | *pipe*:: | |
118 | Suite for pipe() system call. | |
119 | Based on pipe-test-1m.c by Ingo Molnar. | |
120 | ||
121 | Options of *pipe* | |
122 | ^^^^^^^^^^^^^^^^^ | |
123 | -l:: | |
124 | --loop=:: | |
125 | Specify number of loops. | |
126 | ||
79a3371b NK |
127 | -G:: |
128 | --cgroups=:: | |
129 | Names of cgroups for sender and receiver, separated by a comma. | |
130 | This is useful to check cgroup context switching overhead. | |
131 | Note that perf neither creates nor deletes the cgroups, so users should | |
132 | make sure that the cgroups exist and are accessible before use. | |
133 | ||
134 | ||
9fbc04f2 HM |
135 | Example of *pipe* |
136 | ^^^^^^^^^^^^^^^^^ | |
137 | ||
138 | --------------------- | |
139 | % perf bench sched pipe | |
140 | (executing 1000000 pipe operations between two tasks) | |
141 | ||
142 | Total time:8.091 sec | |
143 | 8.091833 usecs/op | |
144 | 123581 ops/sec | |
145 | ||
146 | % perf bench sched pipe -l 1000 # loop 1000 | |
147 | (executing 1000 pipe operations between two tasks) | |
148 | ||
149 | Total time:0.016 sec | |
150 | 16.948000 usecs/op | |
151 | 59004 ops/sec | |
79a3371b NK |
152 | |
153 | % perf bench sched pipe -G AAA,BBB | |
154 | (executing 1000000 pipe operations between cgroups) | |
155 | # Running 'sched/pipe' benchmark: | |
156 | # Executed 1000000 pipe operations between two processes | |
157 | ||
158 | Total time: 6.886 [sec] | |
159 | ||
160 | 6.886208 usecs/op | |
161 | 145217 ops/sec | |
162 | ||
9fbc04f2 HM |
163 | --------------------- |
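
The three figures in each example are related: usecs/op is the total time divided by the number of operations, and ops/sec is its inverse. The sketch below reproduces the sample numbers. The name `pipe_metrics` is hypothetical, and truncating ops/sec to an integer is an assumption inferred from the sample output, not a statement about perf's implementation.

```python
def pipe_metrics(total_sec, ops):
    """Derive (usecs/op, ops/sec) from a total run time and an operation count."""
    usecs_per_op = total_sec * 1_000_000 / ops
    # Assumption: ops/sec is truncated, which matches the sample output.
    ops_per_sec = int(ops / total_sec)
    return usecs_per_op, ops_per_sec


if __name__ == "__main__":
    for total_sec, ops in [(8.091833, 1_000_000), (0.016948, 1000)]:
        usecs, rate = pipe_metrics(total_sec, ops)
        print(f"{usecs:.6f} usecs/op, {rate} ops/sec")
```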
164 | ||
c2a08203 DB |
165 | SUITES FOR 'syscall' |
166 | ~~~~~~~~~~~~~~~~~~~~ | |
167 | *basic*:: | |
168 | Suite for evaluating performance of core system call throughput (both usecs/op and ops/sec metrics). | |
169 | This uses a single thread simply doing getppid(2), which is a simple syscall where the result is not | |
170 | cached by glibc. | |
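
To illustrate the idea only, the following sketch times repeated getppid() calls from a single thread and reports the same two metrics. It is not how perf implements the suite: the function name `time_getppid` is hypothetical, and Python interpreter overhead dominates the result, so the numbers are not comparable with 'perf bench syscall basic'.

```python
import os
import time


def time_getppid(loops=100_000):
    """Call os.getppid() in a tight loop; return (usecs/op, ops/sec)."""
    start = time.perf_counter()
    for _ in range(loops):
        os.getppid()
    elapsed = time.perf_counter() - start
    return elapsed * 1_000_000 / loops, loops / elapsed


if __name__ == "__main__":
    usecs_per_op, ops_per_sec = time_getppid()
    print(f"{usecs_per_op:.6f} usecs/op")
    print(f"{int(ops_per_sec)} ops/sec")
```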
171 | ||
172 | ||
08942f6d NK |
173 | SUITES FOR 'mem' |
174 | ~~~~~~~~~~~~~~~~ | |
175 | *memcpy*:: | |
176 | Suite for evaluating performance of simple memory copy in various ways. | |
177 | ||
178 | Options of *memcpy* | |
179 | ^^^^^^^^^^^^^^^^^^^ | |
180 | -s:: | |
a69b4f74 IM |
181 | --size:: |
182 | Specify size of memory to copy (default: 1MB). | |
08942f6d NK |
183 | Available units are B, KB, MB, GB and TB (case insensitive). |
184 | ||
2f211c84 IM |
185 | -f:: |
186 | --function:: | |
187 | Specify function to copy (default: default). | |
188 | Available functions depend on the architecture. | |
08942f6d NK |
189 | On x86-64, x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported. |
190 | ||
b0d22e52 IM |
191 | -l:: |
192 | --nr_loops:: | |
08942f6d NK |
193 | Repeat memcpy invocation this number of times. |
194 | ||
195 | -c:: | |
b14f2d35 | 196 | --cycles:: |
08942f6d NK |
197 | Use perf's cpu-cycles event instead of the gettimeofday() syscall. |
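
For scripts that generate --size arguments, a size string with one of the listed units can be converted to bytes as sketched below. The name `parse_size` is hypothetical, and treating the units as binary multiples (1KB = 1024 bytes) is an assumption of this sketch; check perf's own behaviour before relying on it.

```python
import re

# Assumption: binary multiples; the text above does not state the base.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert strings such as '1MB' or '4kb' to bytes (units are case insensitive)."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"malformed size: {text!r}")
    unit = match.group(2).upper()
    if unit not in _UNITS:
        raise ValueError(f"unknown unit: {match.group(2)!r}")
    return int(match.group(1)) * _UNITS[unit]


if __name__ == "__main__":
    print(parse_size("1MB"))
    print(parse_size("4kb"))
```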
198 | ||
08942f6d NK |
199 | *memset*:: |
200 | Suite for evaluating performance of simple memory set in various ways. | |
201 | ||
202 | Options of *memset* | |
203 | ^^^^^^^^^^^^^^^^^^^ | |
204 | -s:: | |
a69b4f74 IM |
205 | --size:: |
206 | Specify size of memory to set (default: 1MB). | |
08942f6d NK |
207 | Available units are B, KB, MB, GB and TB (case insensitive). |
208 | ||
2f211c84 IM |
209 | -f:: |
210 | --function:: | |
211 | Specify function to set (default: default). | |
212 | Available functions depend on the architecture. | |
08942f6d NK |
213 | On x86-64, x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported. |
214 | ||
b0d22e52 IM |
215 | -l:: |
216 | --nr_loops:: | |
08942f6d NK |
217 | Repeat memset invocation this number of times. |
218 | ||
219 | -c:: | |
b14f2d35 | 220 | --cycles:: |
08942f6d NK |
221 | Use perf's cpu-cycles event instead of the gettimeofday() syscall. |
222 | ||
95a2b3c0 RR |
223 | SUITES FOR 'numa' |
224 | ~~~~~~~~~~~~~~~~~ | |
225 | *mem*:: | |
226 | Suite for evaluating NUMA workloads. | |
227 | ||
228 | SUITES FOR 'futex' | |
229 | ~~~~~~~~~~~~~~~~~~ | |
230 | *hash*:: | |
231 | Suite for evaluating hash tables. | |
232 | ||
233 | *wake*:: | |
234 | Suite for evaluating wake calls. | |
235 | ||
d65817b4 DB |
236 | *wake-parallel*:: |
237 | Suite for evaluating parallel wake calls. | |
238 | ||
95a2b3c0 RR |
239 | *requeue*:: |
240 | Suite for evaluating requeue calls. | |
241 | ||
d2f3f5d2 DB |
242 | *lock-pi*:: |
243 | Suite for evaluating futex lock_pi calls. | |
244 | ||
121dd9ea DB |
245 | SUITES FOR 'epoll' |
246 | ~~~~~~~~~~~~~~~~~~ | |
247 | *wait*:: | |
248 | Suite for evaluating concurrent epoll_wait calls. | |
d2f3f5d2 | 249 | |
231457ec DB |
250 | *ctl*:: |
251 | Suite for evaluating multiple epoll_ctl calls. | |
252 | ||
2a4b5166 IR |
253 | SUITES FOR 'internals' |
254 | ~~~~~~~~~~~~~~~~~~~~~~ | |
255 | *synthesize*:: | |
256 | Suite for evaluating perf's event synthesis performance. | |
257 | ||
9fbc04f2 HM |
258 | SEE ALSO |
259 | -------- | |
260 | linkperf:perf[1] |