=================================
Using ftrace to hook to functions
=================================

.. Copyright 2017 VMware Inc.
.. Author: Steven Rostedt <srostedt@goodmis.org>
.. License: The GNU Free Documentation License, Version 1.2
.. (dual licensed under the GPL v2)

Written for: 4.14

Introduction
============

The ftrace infrastructure was originally created to attach callbacks to the
beginning of functions in order to record and trace the flow of the kernel.
But callbacks at the start of a function have other use cases as well, such
as live kernel patching or security monitoring. This document describes
how to use ftrace to implement your own function callbacks.


The ftrace context
==================

WARNING: The ability to add a callback to almost any function within the
kernel comes with risks. A callback can be called from any context
(normal, softirq, irq, and NMI). Callbacks can also be called just before
going to idle, during CPU bring up and takedown, or going to user space.
This requires extra care about what can be done inside a callback. A callback
can be called outside the protective scope of RCU.

The ftrace infrastructure has some protections against recursion and RCU,
but one must still be very careful about how the callbacks are used.


The ftrace_ops structure
========================

To register a function callback, an ftrace_ops is required. This structure
is used to tell ftrace what function should be called as the callback
as well as which protections the callback performs itself and does not
require ftrace to handle.

Only one field needs to be set when registering an ftrace_ops with ftrace:

.. code-block:: c

    struct ftrace_ops ops = {
        .func    = my_callback_func,
        .flags   = MY_FTRACE_FLAGS,
        .private = any_private_data_structure,
    };

Both .flags and .private are optional. Only .func is required.

To enable tracing call:

.. code-block:: c

    register_ftrace_function(&ops);

To disable tracing call:

.. code-block:: c

    unregister_ftrace_function(&ops);

The above are declared in the header:

.. code-block:: c

    #include <linux/ftrace.h>

The registered callback will start being called some time after
register_ftrace_function() is called and before it returns. The exact time
that callbacks start being called depends on the architecture and the
scheduling of services. The callback itself will have to handle any
synchronization if it must begin at an exact moment.

unregister_ftrace_function() guarantees that the callback is
no longer being called by functions after unregister_ftrace_function()
returns. Note that to provide this guarantee, unregister_ftrace_function()
may take some time to finish.


The callback function
=====================

The prototype of the callback function is as follows (as of v4.14):

.. code-block:: c

    void callback_func(unsigned long ip, unsigned long parent_ip,
                       struct ftrace_ops *op, struct pt_regs *regs);

@ip
    This is the instruction pointer of the function that is being traced
    (where the fentry or mcount is within the function).

@parent_ip
    This is the instruction pointer of the function that called the
    function being traced (where the call of the function occurred).

@op
    This is a pointer to the ftrace_ops that was used to register the callback.
    This can be used to pass data to the callback via the private pointer.

@regs
    If the FTRACE_OPS_FL_SAVE_REGS or FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED
    flags are set in the ftrace_ops structure, then this will be pointing
    to the pt_regs structure like it would be if a breakpoint was placed
    at the start of the function where ftrace was tracing. Otherwise it
    either contains garbage or is NULL.

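The role of @op and its private pointer can be illustrated outside the
kernel. The following is a minimal userspace sketch, not kernel code:
struct mock_ftrace_ops is a hypothetical stand-in that mirrors only the
three fields shown earlier, and struct hit_stats, count_hits() and
mock_trace_call() are names invented for this illustration. It shows how one
callback can serve several ops by keeping its state behind op->private.

```c
#include <assert.h>
#include <stddef.h>

struct pt_regs;			/* opaque here; the sketch never dereferences it */
struct mock_ftrace_ops;

/* Same parameter list as the v4.14 callback prototype above. */
typedef void (*mock_func_t)(unsigned long ip, unsigned long parent_ip,
			    struct mock_ftrace_ops *op, struct pt_regs *regs);

/* Hypothetical stand-in for struct ftrace_ops (three fields only). */
struct mock_ftrace_ops {
	mock_func_t	func;
	unsigned long	flags;
	void		*private;
};

/* Hypothetical per-ops state reached through op->private. */
struct hit_stats {
	unsigned long	hits;
	unsigned long	last_ip;
};

/* The callback: no globals, all state comes from op->private. */
static void count_hits(unsigned long ip, unsigned long parent_ip,
		       struct mock_ftrace_ops *op, struct pt_regs *regs)
{
	struct hit_stats *stats = op->private;

	(void)parent_ip;
	(void)regs;		/* may be NULL or garbage without SAVE_REGS */
	stats->hits++;
	stats->last_ip = ip;
}

/* Models what happens when a traced function is entered. */
static void mock_trace_call(struct mock_ftrace_ops *op, unsigned long ip,
			    unsigned long parent_ip)
{
	assert(op && op->func);
	op->func(ip, parent_ip, op, NULL);
}
```

Two ops that share count_hits() but point .private at different hit_stats
instances keep separate counters, which is the main reason to prefer the
private pointer over global state in a callback.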

The ftrace FLAGS
================

The ftrace_ops flags are all defined and documented in include/linux/ftrace.h.
Some of the flags are used for internal infrastructure of ftrace, but the
ones that users should be aware of are the following:

FTRACE_OPS_FL_SAVE_REGS
    If the callback requires reading or modifying the pt_regs
    passed to the callback, then it must set this flag. Registering
    an ftrace_ops with this flag set on an architecture that does not
    support passing of pt_regs to the callback will fail.

FTRACE_OPS_FL_SAVE_REGS_IF_SUPPORTED
    Similar to SAVE_REGS, but registering an
    ftrace_ops on an architecture that does not support passing of regs
    will not fail with this flag set. The callback must then check whether
    regs is NULL to determine if the architecture supports it.

FTRACE_OPS_FL_RECURSION_SAFE
    By default, a wrapper is added around the callback to
    make sure that recursion of the function does not occur. That is,
    if a function that is called as a result of the callback's execution
    is also traced, ftrace will prevent the callback from being called
    again. But this wrapper adds some overhead, and if the callback is
    safe from recursion, it can set this flag to disable the ftrace
    protection.

    Note, if this flag is set, and recursion does occur, it could cause
    the system to crash, and possibly reboot via a triple fault.

    It is OK if another callback traces a function that is called by a
    callback that is marked recursion safe. Recursion safe callbacks
    must never trace any functions that are called by the callback
    itself or any nested functions that those functions call.

    If this flag is set, it is possible that the callback will also
    be called with preemption enabled (when CONFIG_PREEMPT is set),
    but this is not guaranteed.

FTRACE_OPS_FL_IPMODIFY
    Requires FTRACE_OPS_FL_SAVE_REGS to be set. If the callback is to
    "hijack" the traced function (have another function called instead of
    the traced function), it requires setting this flag. This is what live
    kernel patching uses. Without this flag the pt_regs->ip cannot be
    modified.

    Note, only one ftrace_ops with FTRACE_OPS_FL_IPMODIFY set may be
    registered to any given function at a time.

FTRACE_OPS_FL_RCU
    If this is set, then the callback will only be called by functions
    where RCU is "watching". This is required if the callback function
    performs any rcu_read_lock() operation.

    RCU stops watching when the system goes idle, during the time when a
    CPU is taken down and comes back online, and when entering from kernel
    to user space and back to kernel space. During these transitions,
    a callback may be executed and RCU synchronization will not protect
    it.

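The recursion protection described under FTRACE_OPS_FL_RECURSION_SAFE can be
pictured with a small userspace model. This is a sketch of the idea only and
is not how ftrace implements it (a single flag, for instance, does not
distinguish between the contexts a callback can run in). All names below
(in_callback, traced_helper(), my_callback(), guarded_call()) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static bool in_callback;		/* models the wrapper's recursion state */
static unsigned int callback_runs;	/* times the callback body ran */
static unsigned int blocked_reentries;	/* re-entries stopped by the wrapper */

static void guarded_call(void (*cb)(void));
static void my_callback(void);

/* A helper that is itself "traced": entering it fires the callback. */
static void traced_helper(void)
{
	guarded_call(my_callback);
}

/* The callback calls a traced function, which would recurse forever. */
static void my_callback(void)
{
	callback_runs++;
	traced_helper();
}

/* Wrapper: refuse to run the callback while it is already running. */
static void guarded_call(void (*cb)(void))
{
	assert(cb);
	if (in_callback) {
		blocked_reentries++;
		return;
	}
	in_callback = true;
	cb();
	in_callback = false;
}
```

In this model, removing the in_callback check makes my_callback() and
traced_helper() call each other without end. A callback that sets
FTRACE_OPS_FL_RECURSION_SAFE takes on the job of making sure that situation
cannot arise.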

Filtering which functions to trace
==================================

If a callback is only to be called from specific functions, a filter must be
set up. The filters are added by name, or ip if it is known.

.. code-block:: c

    int ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
                          int len, int reset);

@ops
    The ops to set the filter with.

@buf
    The string that holds the function filter text.

@len
    The length of the string.

@reset
    Non-zero to reset all filters before applying this filter.

Filters denote which functions should be enabled when tracing is enabled.
If @buf is NULL and @reset is set, all functions will be enabled for tracing.

The @buf can also be a glob expression to enable all functions that
match a specific pattern.

See Filter Commands in :file:`Documentation/trace/ftrace.txt`.

To just trace the schedule function:

.. code-block:: c

    ret = ftrace_set_filter(&ops, "schedule", strlen("schedule"), 0);

To add more functions, call ftrace_set_filter() more than once with the
@reset parameter set to zero. To remove the current filter set and replace it
with new functions defined by @buf, have @reset be non-zero.

To remove all the filtered functions and trace all functions:

.. code-block:: c

    ret = ftrace_set_filter(&ops, NULL, 0, 1);


Sometimes more than one function has the same name. To trace just a specific
function in this case, ftrace_set_filter_ip() can be used.

.. code-block:: c

    ret = ftrace_set_filter_ip(&ops, ip, 0, 0);

Note that the ip must be the address where the call to fentry or mcount is
located in the function. This function is used by perf and kprobes, which
get the ip address from the user (usually using debug info from the kernel).

If a glob is used to set the filter, functions can be added to a "notrace"
list that will prevent those functions from calling the callback.
The "notrace" list takes precedence over the "filter" list. If the
two lists are non-empty and contain the same functions, the callback will not
be called by any function.

An empty "notrace" list means to allow all functions defined by the filter
to be traced.

.. code-block:: c

    int ftrace_set_notrace(struct ftrace_ops *ops, unsigned char *buf,
                           int len, int reset);

This takes the same parameters as ftrace_set_filter() but will add the
functions it finds to the list of functions not to be traced. This is a
separate list from the filter list, and this function does not modify the
filter list.

A non-zero @reset will clear the "notrace" list before adding functions
that match @buf to it.

Clearing the "notrace" list works the same way as clearing the filter list:

.. code-block:: c

    ret = ftrace_set_notrace(&ops, NULL, 0, 1);

259functions should call the callback, it is best to set the filters before
260registering the callback. But the changes may also happen after the callback
261has been registered.
262
263If a filter is in place, and the @reset is non-zero, and @buf contains a
264matching glob to functions, the switch will happen during the time of
265the ftrace_set_filter() call. At no time will all functions call the callback.
266
2cd6ff4a 267.. code-block:: c
b4d94210 268
2cd6ff4a 269 ftrace_set_filter(&ops, "schedule", strlen("schedule"), 1);
b4d94210 270
2cd6ff4a 271 register_ftrace_function(&ops);
b4d94210 272
2cd6ff4a 273 msleep(10);
b4d94210 274
2cd6ff4a 275 ftrace_set_filter(&ops, "try_to_wake_up", strlen("try_to_wake_up"), 1);
b4d94210
SR
276
277is not the same as:
278
2cd6ff4a 279.. code-block:: c
b4d94210 280
2cd6ff4a 281 ftrace_set_filter(&ops, "schedule", strlen("schedule"), 1);
b4d94210 282
2cd6ff4a 283 register_ftrace_function(&ops);
b4d94210 284
2cd6ff4a 285 msleep(10);
b4d94210 286
2cd6ff4a 287 ftrace_set_filter(&ops, NULL, 0, 1);
b4d94210 288
2cd6ff4a 289 ftrace_set_filter(&ops, "try_to_wake_up", strlen("try_to_wake_up"), 0);
b4d94210
SR
290
291As the latter will have a short time where all functions will call
292the callback, between the time of the reset, and the time of the
293new setting of the filter.
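
The difference between the two sequences can be seen in a userspace model of
the filter list. This is only a sketch under simplifying assumptions: struct
model_filter, model_set_filter() and model_calls_callback() are hypothetical
names, only exact function names are handled (no globs), and the model shows
just the states visible between calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_FILTERS 8

struct model_filter {
	const char *names[MODEL_MAX_FILTERS];
	size_t count;		/* 0 means "no filter": every function matches */
};

/* Mirrors the @buf/@reset behaviour described above, exact names only. */
static void model_set_filter(struct model_filter *f, const char *buf, int reset)
{
	assert(f);
	if (reset)
		f->count = 0;
	if (buf) {
		assert(f->count < MODEL_MAX_FILTERS);
		f->names[f->count++] = buf;
	}
}

/* Would a function called func invoke the callback right now? */
static bool model_calls_callback(const struct model_filter *f, const char *func)
{
	size_t i;

	if (f->count == 0)
		return true;
	for (i = 0; i < f->count; i++)
		if (strcmp(f->names[i], func) == 0)
			return true;
	return false;
}
```

Replacing "schedule" with "try_to_wake_up" in one call with @reset set never
leaves the model in a state where an unrelated function matches. Clearing
with a NULL @buf first and then adding "try_to_wake_up" passes through a
state where count is zero and every function matches.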