.. SPDX-License-Identifier: GPL-2.0

pstore block oops/panic logger
==============================

Introduction
------------

pstore block (pstore/blk) is an oops/panic logger that writes its logs to a
block device or a non-block device before the system crashes. You can retrieve
these log files by mounting the pstore filesystem::

    mount -t pstore pstore /sys/fs/pstore


pstore block concepts
---------------------

pstore/blk splits its configuration into two parts: configurations for the
user and configurations for the driver.

Configurations for the user determine how pstore/blk works, such as pmsg_size,
kmsg_size and so on. All of them support both Kconfig and module parameters,
but module parameters take priority over Kconfig.

Configurations for the driver describe the block device or non-block device,
such as the total_size of the block device and its read/write operations.

Configurations for user
-----------------------

All of these configurations support both Kconfig and module parameters, but
module parameters take priority over Kconfig.

Here is an example using module parameters::

    pstore_blk.blkdev=/dev/mmcblk0p7 pstore_blk.kmsg_size=64 best_effort=y

The details of each configuration are described below.

blkdev
~~~~~~

The block device to use. Most of the time, it is a partition of a block device.
It is required for pstore/blk. It is also used to select an MTD device.

When pstore/blk is built as a module, "blkdev" accepts the following variants:

1. /dev/<disk_name> represents the device number of the disk
#. /dev/<disk_name><decimal> represents the device number of the partition:
   the device number of the disk plus the partition number
#. /dev/<disk_name>p<decimal> is the same as the above; this form is used when
   the name of the partitioned disk ends with a digit.

When pstore/blk is built into the kernel, "blkdev" accepts the following variants:

#. <hex_major><hex_minor> device number in hexadecimal representation,
   with no leading 0x, for example b302.
#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF represents the unique id of
   a partition if the partition table provides it. The UUID may be either an
   EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP,
   where SSSSSSSS is a zero-filled hex representation of the 32-bit
   "NT disk signature", and PP is a zero-filled hex representation of the
   1-based partition number.
#. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a
   partition with a known unique id.
#. <major>:<minor> major and minor number of the device separated by a colon.

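As a rough illustration of the ``<hex_major><hex_minor>`` form, the sketch
below packs a major and minor number into a hexadecimal string. It assumes the
kernel's usual 32-bit device-number layout (low 8 bits of the minor, then the
major, then the remaining minor bits). The helper name ``hex_devt`` is
hypothetical and is not part of pstore/blk.

.. code-block:: python

    def hex_devt(major: int, minor: int) -> str:
        """Return the hex device number string, without a leading 0x."""
        if not (0 <= major < (1 << 12)) or not (0 <= minor < (1 << 20)):
            raise ValueError("major must fit in 12 bits and minor in 20 bits")
        # Assumed layout: minor[7:0] | major << 8 | minor[19:8] << 20.
        encoded = (minor & 0xFF) | (major << 8) | ((minor & ~0xFF) << 12)
        return format(encoded, "x")


    # Major 179 (0xb3) and minor 2 give the example from the text.
    print(hex_devt(179, 2))  # b302

The same device could be written as ``179:2`` in the ``<major>:<minor>`` form.
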
It accepts the following variants for an MTD device:

1. <device name> MTD device name. "pstore" is recommended.
#. <device number> MTD device number.

kmsg_size
~~~~~~~~~

The chunk size in KB for oops/panic front-end. It **MUST** be a multiple of 4.
It's optional if you do not care about the oops/panic log.

There are multiple chunks for the oops/panic front-end; how many depends on
the space that remains after the other pstore front-ends have been allocated.

pstore/blk will log to oops/panic chunks one by one, and always overwrite the
oldest chunk if there is no free chunk left.

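To make the chunk arithmetic concrete, here is a simplified sketch of how the
number of oops/panic chunks could follow from the other sizes. It assumes that
the pmsg, console and ftrace front-ends are carved out first and that the
remainder is divided into ``kmsg_size`` chunks. It ignores any headers or
alignment the real implementation may apply, and ``kmsg_chunk_count`` is a
hypothetical helper, not a kernel function.

.. code-block:: python

    def kmsg_chunk_count(total_kb, kmsg_kb, pmsg_kb=0, console_kb=0, ftrace_kb=0):
        """Estimate how many oops/panic chunks fit on the device (sizes in KB)."""
        sizes = (("kmsg_size", kmsg_kb), ("pmsg_size", pmsg_kb),
                 ("console_size", console_kb), ("ftrace_size", ftrace_kb))
        for name, size in sizes:
            if size % 4 != 0:
                raise ValueError(f"{name} must be a multiple of 4, got {size}")
        if kmsg_kb == 0:
            return 0  # oops/panic front-end disabled
        # Assumption: the other front-ends are allocated before oops/panic.
        remaining = total_kb - pmsg_kb - console_kb - ftrace_kb
        return max(remaining // kmsg_kb, 0)


    # A 1 MB partition with 64 KB for each front-end.
    print(kmsg_chunk_count(1024, 64, pmsg_kb=64, console_kb=64, ftrace_kb=64))

Under these assumptions, a larger ``kmsg_size`` keeps more of each dump but
leaves fewer chunks before the oldest one is overwritten.
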
pmsg_size
~~~~~~~~~

The chunk size in KB for pmsg front-end. It **MUST** be a multiple of 4.
It's optional if you do not care about the pmsg log.

Unlike the oops/panic front-end, there is only one chunk for the pmsg
front-end.

Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are
appended to the chunk. On reboot the contents are available in
*/sys/fs/pstore/pmsg-pstore-blk-0*.

console_size
~~~~~~~~~~~~

The chunk size in KB for console front-end. It **MUST** be a multiple of 4.
It's optional if you do not care about the console log.

Similar to the pmsg front-end, there is only one chunk for the console
front-end.

All console logs are appended to the chunk. On reboot the contents are
available in */sys/fs/pstore/console-pstore-blk-0*.

ftrace_size
~~~~~~~~~~~

The chunk size in KB for ftrace front-end. It **MUST** be a multiple of 4.
It's optional if you do not care about the ftrace log.

Similar to the oops front-end, there are multiple chunks for the ftrace
front-end. The number of chunks depends on the number of CPUs. Each chunk size
is equal to ftrace_size / processors_count.

All ftrace logs are appended to the chunks. On reboot the contents are
combined and available in */sys/fs/pstore/ftrace-pstore-blk-0*.

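The per-CPU split can be worked through with a small sketch. It only applies
the ``ftrace_size / processors_count`` formula from the text; the helper name
``ftrace_chunk_bytes`` is hypothetical, and the kernel may count CPUs
differently or round the result.

.. code-block:: python

    import os


    def ftrace_chunk_bytes(ftrace_size_kb: int, cpu_count: int) -> int:
        """Return the size in bytes of each per-CPU ftrace chunk."""
        if ftrace_size_kb % 4 != 0:
            raise ValueError("ftrace_size must be a multiple of 4")
        if cpu_count < 1:
            raise ValueError("cpu_count must be at least 1")
        return ftrace_size_kb * 1024 // cpu_count


    # os.cpu_count() reports the CPUs visible to Python; this is an
    # assumption and may differ from the count the kernel uses.
    cpus = os.cpu_count() or 1
    print(ftrace_chunk_bytes(64, cpus))

With ``ftrace_size=64`` on a 4-CPU system, each chunk would hold 16384 bytes
under this formula.
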
Persistent function tracing might be useful for debugging software or hardware
related hangs. Here is an example of usage::

    # mount -t pstore pstore /sys/fs/pstore
    # mount -t debugfs debugfs /sys/kernel/debug/
    # echo 1 > /sys/kernel/debug/pstore/record_ftrace
    # reboot -f
    [...]
    # mount -t pstore pstore /sys/fs/pstore
    # tail /sys/fs/pstore/ftrace-pstore-blk-0
    CPU:0 ts:5914676 c0063828 c0063b94 call_cpuidle <- cpu_startup_entry+0x1b8/0x1e0
    CPU:0 ts:5914678 c039ecdc c006385c cpuidle_enter_state <- call_cpuidle+0x44/0x48
    CPU:0 ts:5914680 c039e9a0 c039ecf0 cpuidle_enter_freeze <- cpuidle_enter_state+0x304/0x314
    CPU:0 ts:5914681 c0063870 c039ea30 sched_idle_set_state <- cpuidle_enter_state+0x44/0x314
    CPU:1 ts:5916720 c0160f59 c015ee04 kernfs_unmap_bin_file <- __kernfs_remove+0x140/0x204
    CPU:1 ts:5916721 c05ca625 c015ee0c __mutex_lock_slowpath <- __kernfs_remove+0x148/0x204
    CPU:1 ts:5916723 c05c813d c05ca630 yield_to <- __mutex_lock_slowpath+0x314/0x358
    CPU:1 ts:5916724 c05ca2d1 c05ca638 __ww_mutex_lock <- __mutex_lock_slowpath+0x31c/0x358

max_reason
~~~~~~~~~~

Which kinds of kmsg dumps are stored can be limited via the ``max_reason``
value, as defined in include/linux/kmsg_dump.h's ``enum kmsg_dump_reason``.
For example, to store both Oopses and Panics, ``max_reason`` should be set to
2 (KMSG_DUMP_OOPS); to store only Panics, ``max_reason`` should be set to 1
(KMSG_DUMP_PANIC). Setting this to 0 (KMSG_DUMP_UNDEF) means the reason
filtering will be controlled by the ``printk.always_kmsg_dump`` boot param:
if unset, it'll be KMSG_DUMP_OOPS, otherwise KMSG_DUMP_MAX.

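The filtering rule can be modelled in a few lines. This is a sketch under the
assumption that a dump is stored when its reason is numerically less than or
equal to the effective ``max_reason``. Only the three enum values quoted above
are listed, and ``KMSG_DUMP_MAX`` is modelled as "no upper bound" because its
numeric value is not given here. The function names are hypothetical.

.. code-block:: python

    from enum import IntEnum


    class KmsgDumpReason(IntEnum):
        # Only the values quoted in the text; later enum entries are omitted.
        UNDEF = 0
        PANIC = 1
        OOPS = 2


    def effective_max_reason(max_reason: int, always_kmsg_dump: bool) -> float:
        """Resolve max_reason=0 using the printk.always_kmsg_dump setting."""
        if max_reason != KmsgDumpReason.UNDEF:
            return max_reason
        # KMSG_DUMP_MAX is modelled as infinity (an assumption of this sketch).
        return float("inf") if always_kmsg_dump else KmsgDumpReason.OOPS


    def is_stored(reason: int, max_reason: int, always_kmsg_dump: bool = False) -> bool:
        """Return True if a dump with this reason would be kept."""
        return reason <= effective_max_reason(max_reason, always_kmsg_dump)


    print(is_stored(KmsgDumpReason.OOPS, max_reason=1))   # False: only Panics kept
    print(is_stored(KmsgDumpReason.PANIC, max_reason=2))  # True
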
Configurations for driver
-------------------------

A device driver uses ``register_pstore_device`` with
``struct pstore_device_info`` to register to pstore/blk.

.. kernel-doc:: fs/pstore/blk.c
   :export:

Compression and header
----------------------

A block device is usually large enough to hold uncompressed oops data. Data
compression is not recommended, because pstore/blk inserts some information
into the first line of the oops/panic data. For example::

    Panic: Total 16 times

This means it is the 16th oops or panic since the first boot. The number of
oopses and panics since the first boot is sometimes important for judging
whether the system is stable.

The line that follows is inserted by the pstore filesystem. For example::

    Oops#2 Part1

This means it is the 2nd oops of the last boot.

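For scripts that post-process records, the two header lines can be parsed as
sketched below. The patterns are derived only from the two example lines
above. That an oops produces an analogous ``Oops: Total N times`` line, and
that a panic produces ``Panic#N PartM``, are assumptions of this sketch.
``parse_headers`` is a hypothetical helper.

.. code-block:: python

    import re

    # Patterns inferred from the examples in the text, not from the kernel source.
    BLK_HEADER = re.compile(r"^(?:Oops|Panic): Total (?P<count>\d+) times$")
    FS_HEADER = re.compile(r"^(?:Oops|Panic)#(?P<count>\d+) Part(?P<part>\d+)$")


    def parse_headers(record: str) -> dict:
        """Extract the counters from the first two lines of a dmesg record."""
        lines = record.splitlines()
        info = {}
        if lines and (m := BLK_HEADER.match(lines[0])):
            info["total_since_first_boot"] = int(m["count"])
        if len(lines) > 1 and (m := FS_HEADER.match(lines[1])):
            info["count_last_boot"] = int(m["count"])
            info["part"] = int(m["part"])
        return info


    print(parse_headers("Panic: Total 16 times\nPanic#2 Part1\n<log text>"))

Lines that do not match either pattern are ignored, so the helper returns an
empty dictionary for records without these headers.
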
Reading the data
----------------

The dump data can be read from the pstore filesystem. The files are named
``dmesg-pstore-blk-[N]`` for the oops/panic front-end,
``pmsg-pstore-blk-0`` for the pmsg front-end and so on. The timestamp of the
dump file records the trigger time. To delete a stored record from the block
device, simply unlink the respective pstore file.

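A user-space script might enumerate the oops/panic records as sketched below.
It relies only on the file naming and timestamp behaviour described above; the
default mount point is the one used in the examples in this document, and
``list_dmesg_records`` is a hypothetical helper.

.. code-block:: python

    import re
    from pathlib import Path

    DMESG_NAME = re.compile(r"^dmesg-pstore-blk-(\d+)$")


    def list_dmesg_records(mount_point="/sys/fs/pstore"):
        """Return (index, path, mtime) tuples for oops/panic records, oldest first."""
        records = []
        for path in Path(mount_point).iterdir():
            match = DMESG_NAME.match(path.name)
            if match:
                # The file timestamp records when the dump was triggered.
                records.append((int(match.group(1)), path, path.stat().st_mtime))
        return sorted(records, key=lambda rec: rec[2])

Calling ``path.unlink()`` on one of the returned paths would delete that
record from the block device, as described above.
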
Attentions in panic read/write APIs
-----------------------------------

On panic, the kernel is not going to run for much longer: tasks will not be
scheduled and most kernel resources will be out of service. The situation
resembles a single-threaded program running on a single-core computer.

The following points require special attention for panic read/write APIs:

1. Can **NOT** allocate any memory.
   If you need memory, allocate it while the block driver is initializing
   rather than waiting until the panic.
#. Must be polled, **NOT** interrupt driven.
   Tasks are no longer scheduled. The block driver should delay to ensure the
   write succeeds, but must **NOT** sleep.
#. Can **NOT** take any lock.
   There is no other task, nor any shared resource; you are safe to break all
   locks.
#. Just use CPU to transfer.
   Do not use DMA to transfer unless you are sure that DMA will not hold a lock.
#. Control registers directly.
   Please control registers directly rather than use Linux kernel resources.
   Do the I/O mapping while initializing rather than waiting until a panic
   occurs.
#. Reset your block device and controller if necessary.
   If you are not sure of the state of your block device and controller when
   a panic occurs, you are safe to stop and reset them.

pstore/blk provides psblk_blkdev_info(), which is defined in
*linux/pstore_blk.h*, to get information about the block device in use, such
as the device number, sector count and start sector of the whole disk.

pstore block internals
----------------------

For developer reference, here are all the important structures and APIs:

.. kernel-doc:: fs/pstore/zone.c
   :internal:

.. kernel-doc:: include/linux/pstore_zone.h
   :internal:

.. kernel-doc:: include/linux/pstore_blk.h
   :internal: