Commit | Line | Data |
---|---|---|
3e544d72 | 1 | .. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0) |
247097e2 | 2 | .. See the bottom of this file for additional redistribution information. |
3e544d72 TL |
3 | |
4 | Reporting issues | |
5 | ++++++++++++++++ | |
6 | ||
7 | ||
8 | The short guide (aka TL;DR) | |
9 | =========================== | |
10 | ||
4d2f46a8 TL |
11 | Are you facing a regression with vanilla kernels from the same stable or |
12 | longterm series? One still supported? Then search the `LKML | |
13 | <https://lore.kernel.org/lkml/>`_ and the `Linux stable mailing list | |
14 | <https://lore.kernel.org/stable/>`_ archives for matching reports to join. If | |
15 | you don't find any, install `the latest release from that series | |
16 | <https://kernel.org/>`_. If it still shows the issue, report it to the stable | |
6161a4b1 | 17 | mailing list (stable@vger.kernel.org) and CC the regressions list |
0043f0b2 TL |
18 | (regressions@lists.linux.dev); ideally also CC the maintainer and the mailing |
19 | list for the subsystem in question. | |
3e544d72 | 20 | |
4d2f46a8 TL |
21 | In all other cases try your best to guess which kernel part might be causing the |
22 | issue. Check the :ref:`MAINTAINERS <maintainers>` file for how its developers | |
23 | expect to be told about problems, which most of the time will be by email with a | |
24 | mailing list in CC. Check the destination's archives for matching reports; | |
25 | search the `LKML <https://lore.kernel.org/lkml/>`_ and the web, too. If you | |
26 | don't find any to join, install `the latest mainline kernel | |
27 | <https://kernel.org/>`_. If the issue is present there, send a report. | |
3e544d72 | 28 | |
4d2f46a8 TL |
29 | The issue was fixed there, but you would like to see it resolved in a still |
30 | supported stable or longterm series as well? Then install its latest release. | |
31 | If it shows the problem, search for the change that fixed it in mainline and | |
32 | check if backporting is in the works or was discarded; if it's neither, ask | |
33 | those who handled the change for it. | |
3e544d72 | 34 | |
4d2f46a8 TL |
35 | **General remarks**: When installing and testing a kernel as outlined above, |
36 | ensure it's vanilla (IOW: not patched and not using add-on modules). Also make | |
37 | sure it's built and running in a healthy environment and not already tainted | |
38 | before the issue occurs. | |
3e544d72 | 39 | |
6161a4b1 TL |
40 | If you are facing multiple issues with the Linux kernel at once, report each |
41 | separately. While writing your report, include all information relevant to the | |
42 | issue, like the kernel and the distro used. In case of a regression, CC the | |
0043f0b2 TL |
43 | regressions mailing list (regressions@lists.linux.dev) to your report. Also try |
44 | to pin-point the culprit with a bisection; if you succeed, include its | |
45 | commit-id and CC everyone in the sign-off-by chain. | |
3e544d72 | 46 | |
4d2f46a8 TL |
47 | Once the report is out, answer any questions that come up and help where you |
48 | can. That includes keeping the ball rolling by occasionally retesting with newer | |
49 | releases and sending a status update afterwards. | |
3e544d72 TL |
50 | |
51 | Step-by-step guide how to report issues to the kernel maintainers | |
52 | ================================================================= | |
53 | ||
54 | The above TL;DR outlines roughly how to report issues to the Linux kernel | |
55 | developers. It might be all that's needed for people already familiar with | |
56 | reporting issues to Free/Libre & Open Source Software (FLOSS) projects. For | |
57 | everyone else there is this section. It is more detailed and uses a | |
58 | step-by-step approach. It still tries to be brief for readability and leaves | |
59 | out a lot of details; those are described below the step-by-step guide in a | |
60 | reference section, which explains each of the steps in more detail. | |
61 | ||
62 | Note: this section covers a few more aspects than the TL;DR and does things in | |
63 | a slightly different order. That's in your interest, to make sure you notice | |
64 | early if an issue that looks like a Linux kernel problem is actually caused by | |
65 | something else. These steps thus help to ensure the time you invest in this | |
66 | process won't feel wasted in the end: | |
67 | ||
2dfa9eb0 TL |
68 | * Are you facing an issue with a Linux kernel a hardware or software vendor |
69 | provided? Then in almost all cases you are better off if you stop reading this | |
70 | document and report the issue to your vendor instead, unless you are | |
71 | willing to install the latest Linux version yourself. Be aware the latter | |
72 | will often be needed anyway to hunt down and fix issues. | |
3e544d72 | 73 | |
4b9d49d1 | 74 | * Perform a rough search for existing reports with your favorite internet |
58c53945 TL |
75 | search engine; additionally, check the archives of the `Linux Kernel Mailing |
76 | List (LKML) <https://lore.kernel.org/lkml/>`_. If you find matching reports, | |
77 | join the discussion instead of sending a new one. | |
4b9d49d1 | 78 | |
3e544d72 TL |
79 | * See if the issue you are dealing with qualifies as regression, security |
80 | issue, or a really severe problem: those are 'issues of high priority' that | |
81 | need special handling in some steps that are about to follow. | |
82 | ||
4f08d7ab TL |
83 | * Make sure it's not the kernel's surroundings that are causing the issue |
84 | you face. | |
3e544d72 TL |
85 | |
86 | * Create a fresh backup and put system repair and restore tools at hand. | |
87 | ||
88 | * Ensure your system does not enhance its kernels by building additional | |
89 | kernel modules on-the-fly, which solutions like DKMS might be doing locally | |
90 | without your knowledge. | |
91 | ||
4f08d7ab TL |
92 | * Check if your kernel was 'tainted' when the issue occurred, as the event |
93 | that made the kernel set this flag might be causing the issue you face. | |
3e544d72 TL |
94 | |
95 | * Write down coarsely how to reproduce the issue. If you deal with multiple | |
96 | issues at once, create separate notes for each of them and make sure they | |
97 | work independently on a freshly booted system. That's needed, as each issue | |
98 | needs to get reported to the kernel developers separately, unless they are | |
99 | strongly entangled. | |
100 | ||
4b9d49d1 TL |
101 | * If you are facing a regression within a stable or longterm version line |
102 | (say something broke when updating from 5.10.4 to 5.10.5), scroll down to | |
103 | 'Reporting regressions within a stable and longterm kernel line'. | |
104 | ||
4f08d7ab TL |
105 | * Locate the driver or kernel subsystem that seems to be causing the issue. |
106 | Find out how and where its developers expect reports. Note: most of the | |
107 | time this won't be bugzilla.kernel.org, as issues typically need to be sent | |
108 | by mail to a maintainer and a public mailing list. | |
109 | ||
110 | * Search the archives of the bug tracker or mailing list in question | |
4b9d49d1 TL |
111 | thoroughly for reports that might match your issue. If you find anything, |
112 | join the discussion instead of sending a new report. | |
4f08d7ab | 113 | |
3e544d72 TL |
114 | After these preparations you'll now enter the main part: |
115 | ||
2dfa9eb0 TL |
116 | * Unless you are already running the latest 'mainline' Linux kernel, better |
117 | go and install it for the reporting process. Testing and reporting with | |
118 | the latest 'stable' Linux can be an acceptable alternative in some | |
119 | situations; during the merge window that actually might be even the best | |
120 | approach, but in that development phase it can be an even better idea to | |
121 | suspend your efforts for a few days anyway. Whatever version you choose, | |
122 | ideally use a 'vanilla' build. Ignoring this advice will dramatically | |
123 | increase the risk your report will be rejected or ignored. | |
3e544d72 TL |
124 | |
125 | * Ensure the kernel you just installed does not 'taint' itself when | |
126 | running. | |
127 | ||
128 | * Reproduce the issue with the kernel you just installed. If it doesn't show | |
613f9691 | 129 | up there, scroll down to the instructions for issues only happening with |
3e544d72 TL |
130 | stable and longterm kernels. |
131 | ||
132 | * Optimize your notes: try to find and write the most straightforward way to | |
133 | reproduce your issue. Make sure the end result has all the important | |
134 | details, and at the same time is easy to read and understand for others | |
135 | that hear about it for the first time. And if you learned something in this | |
136 | process, consider searching again for existing reports about the issue. | |
137 | ||
315c4e45 TL |
138 | * If your failure involves a 'panic', 'Oops', 'warning', or 'BUG', consider |
139 | decoding the kernel log to find the line of code that triggered the error. | |
3e544d72 TL |
140 | |
141 | * If your problem is a regression, try to narrow down when the issue was | |
142 | introduced as much as possible. | |
143 | ||
144 | * Start to compile the report by writing a detailed description about the | |
145 | issue. Always mention a few things: the latest kernel version you installed | |
146 | for reproducing, the Linux Distribution used, and your notes on how to | |
147 | reproduce the issue. Ideally, make the kernel's build configuration | |
148 | (.config) and the output from ``dmesg`` available somewhere on the net and | |
149 | link to it. Include or upload all other information that might be relevant, | |
150 | like the output/screenshot of an Oops or the output from ``lspci``. Once | |
151 | you wrote this main part, insert a normal length paragraph on top of it | |
152 | outlining the issue and the impact quickly. On top of this add one sentence | |
153 | that briefly describes the problem and gets people to read on. Now give the | |
154 | thing a descriptive title or subject that yet again is shorter. Then you're | |
155 | ready to send or file the report like the MAINTAINERS file told you, unless | |
156 | you are dealing with one of those 'issues of high priority': they need | |
157 | special care which is explained in 'Special handling for high priority | |
158 | issues' below. | |
159 | ||
160 | * Wait for reactions and keep the thing rolling until you can accept the | |
161 | outcome in one way or the other. Thus react publicly and in a timely manner | |
162 | to any inquiries. Test proposed fixes. Do proactive testing: retest with at | |
163 | least every first release candidate (RC) of a new mainline version and | |
164 | report your results. Send friendly reminders if things stall. And try to | |
165 | help yourself, if you don't get any help or if it's unsatisfying. | |
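
The report layout described in the 'Start to compile the report' step above can be sketched in a few lines of Python. This is only an illustration of the ordering (subject, one-sentence teaser, overview paragraph, then the detailed description with kernel version, distribution, and reproduction notes); the function name ``compose_report`` and all its parameters are hypothetical and not part of any kernel tooling.

```python
def compose_report(subject, teaser, overview, kernel, distro, repro_steps, details):
    """Assemble a plain-text report in the order the guide suggests.

    All arguments are free-form strings supplied by the reporter, except
    repro_steps, which is a list of strings (one per step).
    """
    lines = [
        f"Subject: {subject}",  # shortest: descriptive title
        "",
        teaser,                 # one sentence that gets people to read on
        "",
        overview,               # normal-length paragraph: issue and impact
        "",
        f"Kernel version used for reproducing: {kernel}",
        f"Distribution: {distro}",
        "",
        "How to reproduce:",
    ]
    # Number the reproduction steps so others can refer to them in replies.
    lines += [f"  {number}. {step}" for number, step in enumerate(repro_steps, 1)]
    # The detailed description comes last; links to .config and dmesg go here.
    lines += ["", details]
    return "\n".join(lines)
```

Keeping the pieces separate like this makes it easy to check that each one gets shorter the closer it sits to the top of the mail.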
166 | ||
167 | ||
4b9d49d1 TL |
168 | Reporting regressions within a stable and longterm kernel line |
169 | -------------------------------------------------------------- | |
3e544d72 | 170 | |
4b9d49d1 TL |
171 | This subsection is for you, if you followed the above process and got sent here at |
172 | the point about regressions within a stable or longterm kernel version line. You | |
173 | face one of those if something breaks when updating from 5.10.4 to 5.10.5 (a | |
174 | switch from 5.9.15 to 5.10.5 does not qualify). The developers want to fix such | |
175 | regressions as quickly as possible, hence there is a streamlined process to | |
176 | report them: | |
3e544d72 TL |
177 | |
178 | * Check if the kernel developers still maintain the Linux kernel version | |
58c53945 TL |
179 | line you care about: go to the `front page of kernel.org |
180 | <https://kernel.org/>`_ and make sure it mentions | |
181 | the latest release of the particular version line without an '[EOL]' tag. | |
3e544d72 | 182 | |
58c53945 TL |
183 | * Check the archives of the `Linux stable mailing list |
184 | <https://lore.kernel.org/stable/>`_ for existing reports. | |
3e544d72 TL |
185 | |
186 | * Install the latest release from the particular version line as a vanilla | |
187 | kernel. Ensure this kernel is not tainted and still shows the problem, as | |
58c53945 TL |
188 | the issue might have already been fixed there. If you first noticed the |
189 | problem with a vendor kernel, check that a vanilla build of the last version | |
6161a4b1 | 190 | known to work performs fine as well. |
3e544d72 | 191 | |
58c53945 | 192 | * Send a short problem report to the Linux stable mailing list |
6161a4b1 | 193 | (stable@vger.kernel.org) and CC the Linux regressions mailing list |
0043f0b2 TL |
194 | (regressions@lists.linux.dev); if you suspect the cause in a particular |
195 | subsystem, CC its maintainer and its mailing list. Roughly describe the | |
196 | issue and ideally explain how to reproduce it. Mention the first version | |
197 | that shows the problem and the last version that's working fine. Then | |
198 | wait for further instructions. | |
4b9d49d1 TL |
199 | |
200 | The reference section below explains each of these steps in more detail. | |
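
The criterion from the top of this subsection (5.10.4 to 5.10.5 qualifies, 5.9.15 to 5.10.5 does not) boils down to comparing the first two components of the version numbers. A minimal Python sketch, assuming plain ``x.y.z`` version strings as used by current kernels; the function names are made up for this illustration:

```python
def version_line(version):
    """Return the (major, minor) pair naming the version line, e.g. (5, 10)."""
    major, minor = version.split(".")[:2]
    # Drop a suffix such as '-rc1' in case a two-component version carries one.
    minor = minor.split("-")[0]
    return int(major), int(minor)


def is_regression_within_line(last_good, first_bad):
    """True if both versions belong to the same stable or longterm line."""
    return version_line(last_good) == version_line(first_bad)


print(is_regression_within_line("5.10.4", "5.10.5"))  # same line
print(is_regression_within_line("5.9.15", "5.10.5"))  # different lines
```

If the check fails, the issue is still a regression, but it is one between version lines and needs to be handled through the regular process described earlier.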
201 | ||
202 | ||
203 | Reporting issues only occurring in older kernel version lines | |
204 | ------------------------------------------------------------- | |
205 | ||
206 | This subsection is for you, if you tried the latest mainline kernel as outlined | |
207 | above, but failed to reproduce your issue there; at the same time you want to | |
58c53945 TL |
208 | see the issue fixed in a still supported stable or longterm series or vendor |
209 | kernels regularly rebased on those. If that is the case, follow these steps: | |
4b9d49d1 TL |
210 | |
211 | * Prepare yourself for the possibility that going through the next few steps | |
212 | might not get the issue solved in older releases: the fix might be too big | |
213 | or risky to get backported there. | |
214 | ||
215 | * Perform the first three steps in the section "Reporting regressions | |
216 | within a stable and longterm kernel line" above. | |
217 | ||
3e544d72 TL |
218 | * Search the Linux kernel version control system for the change that fixed |
219 | the issue in mainline, as its commit message might tell you if the fix is | |
220 | scheduled for backporting already. If you don't find anything that way, | |
221 | search the appropriate mailing lists for posts that discuss such an issue | |
222 | or peer-review possible fixes; then check the discussions if the fix was | |
223 | deemed unsuitable for backporting. If backporting was not considered at | |
224 | all, join the newest discussion, asking if it's in the cards. | |
225 | ||
3e544d72 TL |
226 | * One of the previous steps should lead to a solution. If that doesn't work |
227 | out, ask the maintainers for the subsystem that seems to be causing the | |
228 | issue for advice; CC the mailing list for the particular subsystem as well | |
229 | as the stable mailing list. | |
230 | ||
4b9d49d1 TL |
231 | The reference section below explains each of these steps in more detail. |
232 | ||
3e544d72 TL |
233 | |
234 | Reference section: Reporting issues to the kernel maintainers | |
235 | ============================================================= | |
236 | ||
237 | The detailed guides above outline all the major steps in brief fashion, which | |
238 | should be enough for most people. But sometimes there are situations where even | |
239 | experienced users might wonder how to actually do one of those steps. That's | |
240 | what this section is for, as it will provide a lot more details on each of the | |
241 | above steps. Consider this as reference documentation: it's possible to read it | |
242 | from top to bottom, but it's mainly meant to be skimmed and used as a place to | |
243 | look up details on how to actually perform those steps. | |
244 | ||
245 | A few words of general advice before digging into the details: | |
246 | ||
247 | * The Linux kernel developers are well aware this process is complicated and | |
248 | demands more than other FLOSS projects. We'd love to make it simpler. But | |
249 | that would require work in various places as well as some infrastructure, | |
250 | which would need constant maintenance; nobody has stepped up to do that | |
251 | work, so that's just how things are for now. | |
252 | ||
253 | * A warranty or support contract with some vendor doesn't entitle you to | |
254 | request fixes from developers in the upstream Linux kernel community: such | |
255 | contracts are completely outside the scope of the Linux kernel, its | |
256 | development community, and this document. That's why you can't demand | |
257 | anything such a contract guarantees in this context, not even if the | |
258 | developer handling the issue works for the vendor in question. If you want | |
259 | to claim your rights, use the vendor's support channel instead. When doing | |
260 | so, you might want to mention you'd like to see the issue fixed in the | |
261 | upstream Linux kernel; motivate them by saying it's the only way to ensure | |
262 | the fix in the end will get incorporated in all Linux distributions. | |
263 | ||
264 | * If you have never reported an issue to a FLOSS project before, you should consider | |
265 | reading `How to Report Bugs Effectively | |
266 | <https://www.chiark.greenend.org.uk/~sgtatham/bugs.html>`_, `How To Ask | |
267 | Questions The Smart Way | |
268 | <http://www.catb.org/esr/faqs/smart-questions.html>`_, and `How to ask good | |
269 | questions <https://jvns.ca/blog/good-questions/>`_. | |
270 | ||
271 | With that off the table, find below the details on how to properly report | |
272 | issues to the Linux kernel developers. | |
273 | ||
274 | ||
275 | Make sure you're using the upstream Linux kernel | |
276 | ------------------------------------------------ | |
277 | ||
2dfa9eb0 TL |
278 | *Are you facing an issue with a Linux kernel a hardware or software vendor |
279 | provided? Then in almost all cases you are better off if you stop reading this | |
280 | document and report the issue to your vendor instead, unless you are | |
281 | willing to install the latest Linux version yourself. Be aware the latter | |
282 | will often be needed anyway to hunt down and fix issues.* | |
3e544d72 TL |
283 | |
284 | Like most programmers, Linux kernel developers don't like to spend time dealing | |
2dfa9eb0 TL |
285 | with reports for issues that don't even happen with their current code. It's |
286 | just a waste of everybody's time, especially yours. Unfortunately such situations | |
287 | easily happen when it comes to the kernel and often lead to frustration on both | |
288 | sides. That's because almost all Linux-based kernels pre-installed on devices | |
289 | (Computers, Laptops, Smartphones, Routers, …) and most shipped by Linux | |
290 | distributors are quite distant from the official Linux kernel as distributed by | |
291 | kernel.org: these vendor kernels are often ancient from the point of view of | |
292 | Linux development or heavily modified, often both. | |
293 | ||
58c53945 | 294 | Most of these vendor kernels are quite unsuitable for reporting issues to the |
2dfa9eb0 TL |
295 | Linux kernel developers: an issue you face with one of them might have been |
296 | fixed by the Linux kernel developers months or years ago already; additionally, | |
297 | the modifications and enhancements by the vendor might be causing the issue you | |
298 | face, even if they look small or totally unrelated. That's why you should report | |
299 | issues with these kernels to the vendor. Its developers should look into the | |
3e544d72 | 300 | report and, in case it turns out to be an upstream issue, fix it directly |
2dfa9eb0 TL |
301 | upstream or forward the report there. In practice that often does not work out |
302 | or might not be what you want. You thus might want to consider circumventing the | |
303 | vendor by installing the very latest Linux kernel core yourself. If that's an | |
304 | option for you, move ahead in this process, as a later step in this guide will | |
305 | explain how to do that once it rules out other potential causes for your issue. | |
306 | ||
307 | Note, the previous paragraph starts with the word 'most', as sometimes | |
308 | developers in fact are willing to handle reports about issues occurring with | |
309 | vendor kernels. Whether they do in the end highly depends on the developers and the | |
310 | issue in question. Your chances are quite good if the distributor applied only | |
311 | small modifications to a kernel based on a recent Linux version; that for | |
312 | example often holds true for the mainline kernels shipped by Debian GNU/Linux | |
313 | Sid or Fedora Rawhide. Some developers will also accept reports about issues | |
314 | with kernels from distributions shipping the latest stable kernel, as long as | |
315 | it's only slightly modified; that for example is often the case for Arch Linux, | |
316 | regular Fedora releases, and openSUSE Tumbleweed. But keep in mind: you are | |
317 | better off using a mainline Linux and avoiding a stable kernel for this | |
318 | process, as outlined in the section 'Install a fresh kernel for testing' in more | |
319 | detail. | |
320 | ||
321 | Obviously you are free to ignore all this advice and report problems with an old | |
322 | or heavily modified vendor kernel to the upstream Linux developers. But note, | |
323 | those often get rejected or ignored, so consider yourself warned. But it's still | |
324 | better than not reporting the issue at all: sometimes such reports directly or | |
325 | indirectly will help to get the issue fixed over time. | |
3e544d72 | 326 | |
9bc4430d | 327 | |
4b9d49d1 TL |
328 | Search for existing reports, first run |
329 | -------------------------------------- | |
9bc4430d | 330 | |
4b9d49d1 TL |
331 | *Perform a rough search for existing reports with your favorite internet |
332 | search engine; additionally, check the archives of the Linux Kernel Mailing | |
333 | List (LKML). If you find matching reports, join the discussion instead of | |
334 | sending a new one.* | |
9bc4430d | 335 | |
4b9d49d1 TL |
336 | Reporting an issue that someone else already brought forward is often a waste of |
337 | time for everyone involved, especially you as the reporter. So it's in your own | |
338 | interest to thoroughly check if somebody reported the issue already. At this | |
339 | step of the process it's okay to just perform a rough search: a later step will | |
340 | tell you to perform a more detailed search once you know where your issue needs | |
341 | to be reported to. Nevertheless, do not hurry with this step of the reporting | |
342 | process, it can save you time and trouble. | |
9bc4430d | 343 | |
4b9d49d1 TL |
344 | Simply search the internet with your favorite search engine first. Afterwards, |
345 | search the `Linux Kernel Mailing List (LKML) archives | |
346 | <https://lore.kernel.org/lkml/>`_. | |
9bc4430d TL |
347 | |
348 | If you get flooded with results, consider telling your search engine to limit the | |
349 | search timeframe to the past month or year. And wherever you search, make sure | |
350 | to use good search terms; vary them a few times, too. While doing so try to | |
351 | look at the issue from the perspective of someone else: that will help you to | |
352 | come up with other words to use as search terms. Also make sure not to use too | |
353 | many search terms at once. Remember to search with and without information like | |
354 | the name of the kernel driver or the name of the affected hardware component. | |
355 | But its exact brand name (say 'ASUS Red Devil Radeon RX 5700 XT Gaming OC') | |
356 | often is not very helpful, as it is too specific. Instead try search terms like | |
357 | the model line (Radeon 5700 or Radeon 5000) and the code name of the main chip | |
358 | ('Navi' or 'Navi10') with and without its manufacturer ('AMD'). | |
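
Varying the search terms as described above can be done systematically. The following Python sketch builds one LKML archive search URL per term, once with and once without the manufacturer name. It assumes the archive accepts a ``q`` query parameter for full-text search; the helper name ``search_urls`` and its arguments are hypothetical.

```python
from itertools import product
from urllib.parse import urlencode

LORE_LKML = "https://lore.kernel.org/lkml/"


def search_urls(model_lines, code_names, manufacturer):
    """Return search URLs for every term, with and without the manufacturer."""
    urls = []
    for term, with_vendor in product(model_lines + code_names, (False, True)):
        query = f"{manufacturer} {term}" if with_vendor else term
        # Assumption: the archive's search takes the terms in a 'q' parameter.
        urls.append(LORE_LKML + "?" + urlencode({"q": query}))
    return urls


for url in search_urls(["Radeon 5700", "Radeon 5000"], ["Navi", "Navi10"], "AMD"):
    print(url)
```

The same list of term variations works just as well when pasted into a regular internet search engine.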
359 | ||
360 | In case you find an existing report about your issue, join the discussion, as | |
361 | you might be able to provide valuable additional information. That can be | |
362 | important even when a fix is prepared or in its final stages already, as | |
363 | developers might look for people that can provide additional information or | |
364 | test a proposed fix. Jump to the section 'Duties after the report went out' for | |
365 | details on how to get properly involved. | |
366 | ||
4b9d49d1 TL |
367 | Note, searching `bugzilla.kernel.org <https://bugzilla.kernel.org/>`_ might also |
368 | be a good idea, as that might provide valuable insights or turn up matching | |
369 | reports. If you find the latter, just keep in mind: most subsystems expect | |
370 | reports in different places, as described below in the section "Check where you | |
371 | need to report your issue". The developers that should take care of the issue | |
372 | thus might not even be aware of the bugzilla ticket. Hence, check the ticket if | |
373 | the issue already got reported as outlined in this document and if not consider | |
374 | doing so. | |
375 | ||
9bc4430d | 376 | |
3e544d72 TL |
377 | Issue of high priority? |
378 | ----------------------- | |
379 | ||
380 | *See if the issue you are dealing with qualifies as regression, security | |
381 | issue, or a really severe problem: those are 'issues of high priority' that | |
382 | need special handling in some steps that are about to follow.* | |
383 | ||
384 | Linus Torvalds and the leading Linux kernel developers want to see some issues | |
385 | fixed as soon as possible, hence there are 'issues of high priority' that get | |
386 | handled slightly differently in the reporting process. Three types of cases | |
387 | qualify: regressions, security issues, and really severe problems. | |
388 | ||
247097e2 TL |
389 | You deal with a regression if some application or practical use case running |
390 | fine with one Linux kernel works worse or not at all with a newer version | |
391 | compiled using a similar configuration. The document | |
392 | Documentation/admin-guide/reporting-regressions.rst explains this in more | |
393 | detail. It also provides a good deal of other information about regressions you | |
394 | might want to be aware of; it for example explains how to add your issue to the | |
395 | list of tracked regressions, to ensure it won't fall through the cracks. | |
3e544d72 TL |
396 | |
397 | What qualifies as security issue is left to your judgment. Consider reading | |
44ac5aba | 398 | Documentation/process/security-bugs.rst before proceeding, as it |
3e544d72 TL |
399 | provides additional details how to best handle security issues. |
400 | ||
401 | An issue is a 'really severe problem' when something totally unacceptably bad | |
402 | happens. That's for example the case when a Linux kernel corrupts the data it's | |
403 | handling or damages hardware it's running on. You're also dealing with a severe | |
404 | issue when the kernel suddenly stops working with an error message ('kernel | |
405 | panic') or without any farewell note at all. Note: do not confuse a 'panic' (a | |
406 | fatal error where the kernel stops itself) with an 'Oops' (a recoverable error), | |
407 | as the kernel remains running after the latter. | |
408 | ||
409 | ||
4f08d7ab TL |
410 | Ensure a healthy environment |
411 | ---------------------------- | |
412 | ||
413 | *Make sure it's not the kernel's surroundings that are causing the issue | |
414 | you face.* | |
415 | ||
416 | Problems that look a lot like a kernel issue are sometimes caused by the build or | |
417 | runtime environment. It's hard to rule out that problem completely, but you | |
418 | should minimize it: | |
419 | ||
420 | * Use proven tools when building your kernel, as bugs in the compiler or the | |
421 | binutils can cause the resulting kernel to misbehave. | |
422 | ||
423 | * Ensure your computer components run within their design specifications; | |
424 | that's especially important for the main processor, the main memory, and the | |
425 | motherboard. Therefore, stop undervolting or overclocking when facing a | |
426 | potential kernel issue. | |
427 | ||
428 | * Try to make sure it's not faulty hardware that is causing your issue. Bad | |
429 | main memory for example can result in a multitude of issues that will | |
430 | manifest themselves as problems looking like kernel issues. | |
431 | ||
432 | * If you're dealing with a filesystem issue, you might want to check the file | |
433 | system in question with ``fsck``, as it might be damaged in a way that leads | |
434 | to unexpected kernel behavior. | |
435 | ||
436 | * When dealing with a regression, make sure it's not something else that | |
437 | changed in parallel to updating the kernel. The problem for example might be | |
438 | caused by other software that was updated at the same time. It can also | |
439 | happen that a hardware component coincidentally just broke when you rebooted | |
440 | into a new kernel for the first time. Updating the system's BIOS or changing | |
441 | something in the BIOS Setup can also lead to problems that look a lot | |
442 | like a kernel regression. | |
443 | ||
444 | ||
445 | Prepare for emergencies | |
446 | ----------------------- | |
447 | ||
448 | *Create a fresh backup and put system repair and restore tools at hand.* | |
449 | ||
450 | Reminder, you are dealing with computers, which sometimes do unexpected things, | |
451 | especially if you fiddle with crucial parts like the kernel of its operating | |
452 | system. That's what you are about to do in this process. Thus, make sure to | |
453 | create a fresh backup; also ensure you have all tools at hand to repair or | |
454 | reinstall the operating system as well as everything you need to restore the | |
455 | backup. | |
456 | ||
457 | ||
458 | Make sure your kernel doesn't get enhanced | |
459 | ------------------------------------------ | |
460 | ||
461 | *Ensure your system does not enhance its kernels by building additional | |
462 | kernel modules on-the-fly, which solutions like DKMS might be doing locally | |
463 | without your knowledge.* | |
464 | ||
465 | The risk your issue report gets ignored or rejected dramatically increases if | |
466 | your kernel gets enhanced in any way. That's why you should remove or disable | |
467 | mechanisms like akmods and DKMS: those build add-on kernel modules | |
468 | automatically, for example when you install a new Linux kernel or boot it for | |
469 | the first time. Also remove any modules they might have installed. Then reboot | |
470 | before proceeding. | |
471 | ||
472 | Note, you might not be aware that your system is using one of these solutions: | |
473 | they often get set up silently when you install Nvidia's proprietary graphics | |
474 | driver, VirtualBox, or other software that requires some support from a | |
475 | module not part of the Linux kernel. That's why you might need to uninstall the | |
476 | packages with such software to get rid of any 3rd party kernel module. | |
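
One way to spot add-on modules that slipped in unnoticed is to look at the list of loaded modules. The Python sketch below parses text in the format of ``/proc/modules`` and returns the modules whose trailing flag field contains 'O' (externally built) or 'P' (proprietary). It assumes that such modules carry a parenthesized flag field like ``(OE)`` at the end of their line; treat that as an assumption to verify on your system. The function name is made up for this example.

```python
def addon_modules(proc_modules_text):
    """Return names of loaded modules flagged as out-of-tree or proprietary."""
    found = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        last = fields[-1]
        # Assumption: tainting modules end with a flag field such as '(POE)'.
        if last.startswith("(") and last.endswith(")"):
            flags = last[1:-1]
            if "O" in flags or "P" in flags:
                found.append(fields[0])
    return found


if __name__ == "__main__":
    with open("/proc/modules") as handle:
        print(addon_modules(handle.read()))
```

An empty result does not prove the kernel is vanilla; it only shows that no such module is loaded right now.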
477 | ||
478 | ||
3e544d72 TL |
479 | Check 'taint' flag |
480 | ------------------ | |
481 | ||
482 | *Check if your kernel was 'tainted' when the issue occurred, as the event | |
483 | that made the kernel set this flag might be causing the issue you face.* | |
484 | ||
485 | The kernel marks itself with a 'taint' flag when something happens that might | |
486 | lead to follow-up errors that look totally unrelated. The issue you face might | |
487 | be such an error if your kernel is tainted. That's why it's in your interest to | |
488 | rule this out early before investing more time into this process. This is the | |
489 | only reason why this step is here, as this process later will tell you to | |
490 | install the latest mainline kernel; you will need to check the taint flag again | |
491 | then, as that's when it matters because it's the kernel the report will focus | |
492 | on. | |
493 | ||
494 | On a running system it is easy to check if the kernel tainted itself: if ``cat | |
495 | /proc/sys/kernel/tainted`` returns '0' then the kernel is not tainted and | |
496 | everything is fine. Checking that file is impossible in some situations; that's | |
497 | why the kernel also mentions the taint status when it reports an internal | |
498 | problem (a 'kernel bug'), a recoverable error (a 'kernel Oops') or a | |
499 | non-recoverable error before halting operation (a 'kernel panic'). Look near | |
500 | the top of the error messages printed when one of these occurs and search for a | |
501 | line starting with 'CPU:'. It should end with 'Not tainted' if the kernel was | |
502 | not tainted when it noticed the problem; it was tainted if you see 'Tainted:' | |
503 | followed by a few spaces and some letters. | |
504 | ||
247097e2 | 505 | If your kernel is tainted, study Documentation/admin-guide/tainted-kernels.rst |
3e544d72 TL |
506 | to find out why. Try to eliminate the reason. Often it's caused by one of these | |
507 | three things: | |
508 | ||
509 | 1. A recoverable error (a 'kernel Oops') occurred and the kernel tainted | |
510 | itself, as the kernel knows it might misbehave in strange ways after that | |
511 | point. In that case check your kernel or system log and look for a section | |
512 | that starts with this:: | |
513 | ||
514 | Oops: 0000 [#1] SMP | |
515 | ||
516 | That's the first Oops since boot-up, as the '#1' between the brackets shows. | |
517 | Every Oops and any other problem that happens after that point might be a | |
518 | follow-up problem to that first Oops, even if both look totally unrelated. | |
519 | Rule this out by getting rid of the cause for the first Oops and reproducing | |
520 | the issue afterwards. Sometimes simply restarting will be enough, sometimes | |
521 | a change to the configuration followed by a reboot can eliminate the Oops. | |
522 | But don't invest too much time into this at this point of the process, as | |
523 | the cause for the Oops might already be fixed in the newer Linux kernel | |
524 | version you are going to install later in this process. | |
525 | ||
526 | 2. Your system uses software that installs its own kernel modules, for | |
527 | example Nvidia's proprietary graphics driver or VirtualBox. The kernel | |
528 | taints itself when it loads such a module from external sources (even if | |
529 | they are Open Source): they sometimes cause errors in unrelated kernel | |
530 | areas and thus might be causing the issue you face. You therefore have to | |
531 | prevent those modules from loading when you want to report an issue to the | |
532 | Linux kernel developers. Most of the time the easiest way to do that is: | |
533 | temporarily uninstall such software including any modules they might have | |
534 | installed. Afterwards reboot. | |
535 | ||
536 | 3. The kernel also taints itself when it's loading a module that resides in | |
537 | the staging tree of the Linux kernel source. That's a special area for | |
538 | code (mostly drivers) that does not yet fulfill the normal Linux kernel | |
539 | quality standards. When you report an issue with such a module it's | |
540 | obviously okay if the kernel is tainted; just make sure the module in | |
541 | question is the only reason for the taint. If the issue happens in an | |
542 | unrelated area, reboot and temporarily block the module from being loaded | |
543 | by specifying ``foo.blacklist=1`` as kernel parameter (replace 'foo' with | |
544 | the name of the module in question). | |
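
The three causes above each map to individual bits in the number found in ``/proc/sys/kernel/tainted``. The following sketch decodes only those bits; the bit numbers and letters are taken from Documentation/admin-guide/tainted-kernels.rst, which lists many more. The function name ``explain_taint`` is made up for this illustration.

```shell
# Explain a taint value, covering only the causes discussed in this section.
# Usage: explain_taint VALUE (as printed by /proc/sys/kernel/tainted)
explain_taint() {
    value=$1
    if [ "$value" -eq 0 ]; then
        echo "not tainted"
        return 0
    fi
    echo "tainted (value $value)"
    if [ $(( value & 1 )) -ne 0 ]; then
        echo "  bit 0 (P): a proprietary module was loaded"
    fi
    if [ $(( value & 128 )) -ne 0 ]; then
        echo "  bit 7 (D): an Oops or BUG occurred"
    fi
    if [ $(( value & 1024 )) -ne 0 ]; then
        echo "  bit 10 (C): a staging driver was loaded"
    fi
    if [ $(( value & 4096 )) -ne 0 ]; then
        echo "  bit 12 (O): an externally built module was loaded"
    fi
}

# Decode the value of the running kernel, if the file is available.
if [ -r /proc/sys/kernel/tainted ]; then
    explain_taint "$(cat /proc/sys/kernel/tainted)"
fi
```

If the output only says 'tainted' without naming a bit, the flag was set for a reason not covered here; check the table in tainted-kernels.rst in that case.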
545 | ||
546 | ||
4f08d7ab TL |
547 | Document how to reproduce issue |
548 | ------------------------------- | |
549 | ||
550 | *Write down coarsely how to reproduce the issue. If you deal with multiple | |
551 | issues at once, create separate notes for each of them and make sure they | |
552 | work independently on a freshly booted system. That's needed, as each issue | |
553 | needs to get reported to the kernel developers separately, unless they are | |
554 | strongly entangled.* | |
555 | ||
556 | If you deal with multiple issues at once, you'll have to report each of them | |
557 | separately, as they might be handled by different developers. Describing | |
558 | various issues in one report also makes it quite difficult for others to tear | |
559 | it apart. Hence, only combine issues in one report if they are very strongly | |
560 | entangled. | |
561 | ||
562 | Additionally, during the reporting process you will have to test if the issue | |
563 | happens with other kernel versions. Therefore, it will make your work easier if | |
564 | you know exactly how to reproduce an issue quickly on a freshly booted system. | |
565 | ||
566 | Note: it's often fruitless to report issues that only happened once, as they | |
567 | might be caused by a bit flip due to cosmic radiation. That's why you should | |
568 | try to rule that out by reproducing the issue before going further. Feel free | |
569 | to ignore this advice if you are experienced enough to tell a one-time error | |
570 | due to faulty hardware apart from a kernel issue that rarely happens and thus | |
571 | is hard to reproduce. | |
572 | ||
573 | ||
4b9d49d1 | 574 | Regression in stable or longterm kernel? |
3e544d72 TL |
575 | ---------------------------------------- |
576 | ||
4b9d49d1 TL |
577 | *If you are facing a regression within a stable or longterm version line |
578 | (say something broke when updating from 5.10.4 to 5.10.5), scroll down to | |
579 | 'Dealing with regressions within a stable and longterm kernel line'.* | |
580 | ||
581 | Regressions within a stable or longterm kernel version line are something the | |
582 | Linux developers badly want to fix, as such issues are even more unwanted than | |
583 | regressions in the main development branch, as they can quickly affect a lot of | |
584 | people. The developers thus want to learn about such issues as quickly as | |
585 | possible, hence there is a streamlined process to report them. Note, | |
586 | regressions with a newer kernel version line (say something broke when switching | |
587 | from 5.9.15 to 5.10.5) do not qualify. | |
588 | ||
589 | ||
590 | Check where you need to report your issue | |
591 | ----------------------------------------- | |
592 | ||
3e544d72 TL |
593 | *Locate the driver or kernel subsystem that seems to be causing the issue. |
594 | Find out how and where its developers expect reports. Note: most of the | |
595 | time this won't be bugzilla.kernel.org, as issues typically need to be sent | |
596 | by mail to a maintainer and a public mailing list.* | |
597 | ||
598 | It's crucial to send your report to the right people, as the Linux kernel is a | |
599 | big project and most of its developers are only familiar with a small subset of | |
600 | it. Quite a few programmers only care for just one driver, for | |
601 | example one for a WiFi chip; its developer likely will have little or no | |
602 | knowledge about the internals of remote or unrelated "subsystems", like the TCP | |
603 | stack, the PCIe/PCI subsystem, memory management or file systems. | |
604 | ||
605 | Problem is: the Linux kernel lacks a central bug tracker where you can simply | |
606 | file your issue and make it reach the developers that need to know about it. | |
607 | That's why you have to find the right place and way to report issues yourself. | |
608 | You can do that with the help of a script (see below), but it mainly targets | |
609 | kernel developers and experts. For everybody else the MAINTAINERS file is the | |
610 | better place. | |
611 | ||
612 | How to read the MAINTAINERS file | |
613 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
614 | To illustrate how to use the :ref:`MAINTAINERS <maintainers>` file, let's assume | |
615 | the WiFi in your laptop suddenly misbehaves after updating the kernel. In that | |
616 | case it's likely an issue in the WiFi driver. Obviously it could also be some | |
617 | code it builds upon, but unless you suspect something like that stick to the | |
618 | driver. If it's really something else, the driver's developers will get the | |
619 | right people involved. | |
620 | ||
621 | Sadly, there is no way to check which code is driving a particular hardware | |
622 | component that is both universal and easy. | |
623 | ||
624 | In case of a problem with the WiFi driver you for example might want to look at | |
625 | the output of ``lspci -k``, as it lists devices on the PCI/PCIe bus and the | |
626 | kernel module driving each of them:: | |
627 | ||
628 | [user@something ~]$ lspci -k | |
629 | [...] | |
630 | 3a:00.0 Network controller: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter (rev 32) | |
631 | Subsystem: Bigfoot Networks, Inc. Device 1535 | |
632 | Kernel driver in use: ath10k_pci | |
633 | Kernel modules: ath10k_pci | |
634 | [...] | |
635 | ||
636 | But this approach won't work if your WiFi chip is connected over USB or some | |
637 | other internal bus. In those cases you might want to check your WiFi manager or | |
638 | the output of ``ip link``. Look for the name of the problematic network | |
639 | interface, which might be something like 'wlp58s0'. This name can be used like | |
640 | this to find the module driving it:: | |
641 | ||
642 | [user@something ~]$ realpath --relative-to=/sys/module/ /sys/class/net/wlp58s0/device/driver/module | |
643 | ath10k_pci | |
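
If you are unsure which interface name to look for, the same lookup can be repeated for every network interface. The following is a minimal sketch of that idea; the function name ``list_net_drivers`` and its optional argument are made up for this illustration, and it assumes ``realpath`` is available as in the command above.

```shell
# List network interfaces together with the kernel module driving them.
# Usage: list_net_drivers [SYSFS_ROOT]; SYSFS_ROOT defaults to /sys
list_net_drivers() {
    sysfs=${1:-/sys}
    for dev in "$sysfs"/class/net/*; do
        module_link=$dev/device/driver/module
        # Virtual interfaces such as 'lo' have no backing device; skip them.
        if [ -e "$module_link" ]; then
            echo "$(basename "$dev"): $(basename "$(realpath "$module_link")")"
        fi
    done
}

list_net_drivers
```

Interfaces without a module link are silently skipped, so an empty result does not necessarily mean something is wrong.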
644 | ||
645 | In case tricks like these don't bring you any further, try to search the | |
646 | internet on how to narrow down the driver or subsystem in question. And if you | |
647 | are unsure which it is: just try your best guess, somebody will help you if you | |
648 | guessed poorly. | |
649 | ||
650 | Once you know the driver or subsystem, you want to search for it in the | |
651 | MAINTAINERS file. In the case of 'ath10k_pci' you won't find anything, as the | |
652 | name is too specific. Sometimes you will need to search on the net for help; | |
653 | but before doing so, try a somewhat shortened or modified name when searching the | |
654 | MAINTAINERS file, as then you might find something like this:: | |
655 | ||
656 | QUALCOMM ATHEROS ATH10K WIRELESS DRIVER | |
657 | Mail: A. Some Human <shuman@example.com> | |
658 | Mailing list: ath10k@lists.infradead.org | |
659 | Status: Supported | |
660 | Web-page: https://wireless.wiki.kernel.org/en/users/Drivers/ath10k | |
661 | SCM: git git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git | |
662 | Files: drivers/net/wireless/ath/ath10k/ | |
663 | ||
664 | Note: the line descriptions will be abbreviated if you read the plain | |
665 | MAINTAINERS file found in the root of the Linux source tree. 'Mail:' for | |
666 | example will be 'M:', 'Mailing list:' will be 'L:', and 'Status:' will be 'S:'. | |
667 | A section near the top of the file explains these and other abbreviations. | |
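
When you have the plain MAINTAINERS file at hand, a case-insensitive search that also prints a few lines of context is often enough to find the right section. The following sketch shows one way to do that; the function name ``search_maintainers`` and the context size of eight lines are arbitrary choices made for this illustration.

```shell
# Print MAINTAINERS lines that mention a search term, plus some context.
# Usage: search_maintainers TERM [FILE]; FILE defaults to ./MAINTAINERS
search_maintainers() {
    term=$1
    file=${2:-MAINTAINERS}
    # -i: ignore case; -A 8: also print the eight lines after each match
    grep -i -A 8 -- "$term" "$file"
}

# Example, when called from the root of the Linux source tree:
#   search_maintainers ath10k
```

Sections differ in length, so you might need to adjust the amount of context or open the file in an editor to see a section in full.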
668 | ||
669 | First look at the line 'Status'. Ideally it should be 'Supported' or | |
670 | 'Maintained'. If it states 'Obsolete' then you are using some outdated approach | |
671 | that was replaced by a newer solution you need to switch to. Sometimes the code | |
672 | only has someone who provides 'Odd Fixes' when feeling motivated. And with | |
673 | 'Orphan' you are totally out of luck, as nobody takes care of the code anymore. | |
674 | That only leaves these options: learn to live with the issue, fix it | |
675 | yourself, or find a programmer somewhere willing to fix it. | |
676 | ||
677 | After checking the status, look for a line starting with 'bugs:': it will tell | |
678 | you where to find a subsystem specific bug tracker to file your issue. The | |
679 | example above does not have such a line. That is the case for most sections, as | |
680 | Linux kernel development is completely driven by mail. Very few subsystems use | |
681 | a bug tracker, and only some of those rely on bugzilla.kernel.org. | |
682 | ||
3e544d72 TL |
683 | In this and many other cases you thus have to look for lines starting with |
684 | 'Mail:' instead. Those mention the name and the email addresses for the | |
685 | maintainers of the particular code. Also look for a line starting with 'Mailing | |
686 | list:', which tells you the public mailing list where the code is developed. | |
687 | Your report later needs to go by mail to those addresses. Additionally, for all | |
688 | issue reports sent by email, make sure to add the Linux Kernel Mailing List | |
689 | (LKML) <linux-kernel@vger.kernel.org> to CC. Don't omit either of the mailing | |
690 | lists when sending your issue report by mail later! Maintainers are busy people | |
691 | and might leave some work for other developers on the subsystem specific list; | |
692 | and LKML is important to have one place where all issue reports can be found. | |
693 | ||
694 | ||
3e544d72 TL |
695 | Finding the maintainers with the help of a script |
696 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
697 | ||
698 | For people that have the Linux sources at hand there is a second option to find | |
699 | the proper place to report: the script 'scripts/get_maintainer.pl' which tries | |
700 | to find all people to contact. It queries the MAINTAINERS file and needs to be | |
701 | called with a path to the source code in question. For drivers compiled as a | |
702 | module it often can be found with a command like this:: | |
703 | ||
704 | $ modinfo ath10k_pci | grep filename | sed 's!/lib/modules/.*/kernel/!!; s!filename:!!; s!\.ko\(\|\.xz\)!!' | |
705 | drivers/net/wireless/ath/ath10k/ath10k_pci | |
706 | ||
707 | Pass parts of this to the script:: | |
708 | ||
709 | $ ./scripts/get_maintainer.pl -f drivers/net/wireless/ath/ath10k* | |
710 | Some Human <shuman@example.com> (supporter:QUALCOMM ATHEROS ATH10K WIRELESS DRIVER) | |
711 | Another S. Human <asomehuman@example.com> (maintainer:NETWORKING DRIVERS) | |
712 | ath10k@lists.infradead.org (open list:QUALCOMM ATHEROS ATH10K WIRELESS DRIVER) | |
713 | linux-wireless@vger.kernel.org (open list:NETWORKING DRIVERS (WIRELESS)) | |
714 | netdev@vger.kernel.org (open list:NETWORKING DRIVERS) | |
715 | linux-kernel@vger.kernel.org (open list) | |
716 | ||
717 | Don't send your report to all of them. Send it to the maintainers, which the | |
718 | script calls "supporter:"; additionally CC the most specific mailing list for | |
719 | the code as well as the Linux Kernel Mailing List (LKML). In this case you thus | |
720 | would need to send the report to 'Some Human <shuman@example.com>' with | |
721 | 'ath10k@lists.infradead.org' and 'linux-kernel@vger.kernel.org' in CC. | |
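
To make the script's output easier to scan, you can label each line as a person or a mailing list. The following is a rough sketch that relies only on the output format shown above, where lists carry an 'open list' role in parentheses; the function name ``label_recipients`` is made up for this illustration, and other role names the script might print are not handled specially.

```shell
# Read get_maintainer.pl output on stdin and label each line as a person
# or a mailing list, based on the role shown in the trailing parentheses.
label_recipients() {
    while IFS= read -r line; do
        case $line in
            *"(open list"*) echo "list: $line" ;;
            *"("*":"*")"*)  echo "person: $line" ;;
        esac
    done
}

# Example, when called from the root of the Linux source tree:
#   ./scripts/get_maintainer.pl -f drivers/net/wireless/ath/ath10k* | label_recipients
```

The labels only sort the output; deciding who goes into 'To' and who into 'CC' still follows the rules described above.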
722 | ||
723 | Note: in case you cloned the Linux sources with git you might want to call | |
724 | ``get_maintainer.pl`` a second time with ``--git``. The script then will look | |
725 | at the commit history to find which people recently worked on the code in | |
726 | question, as they might be able to help. But use these results with care, as they | |
727 | can easily send you in the wrong direction. That for example happens quickly in | |
728 | areas rarely changed (like old or unmaintained drivers): sometimes such code is | |
729 | modified during tree-wide cleanups by developers that do not care about the | |
730 | particular driver at all. | |
731 | ||
732 | ||
4b9d49d1 TL |
733 | Search for existing reports, second run |
734 | --------------------------------------- | |
3e544d72 TL |
735 | |
736 | *Search the archives of the bug tracker or mailing list in question | |
4b9d49d1 TL |
737 | thoroughly for reports that might match your issue. If you find anything, |
738 | join the discussion instead of sending a new report.* | |
739 | ||
740 | As mentioned earlier already: reporting an issue that someone else already | |
741 | brought forward is often a waste of time for everyone involved, especially you | |
742 | as the reporter. That's why you should search for existing reports again, now | |
743 | that you know where they need to be reported to. If it's a mailing list, you will | |
744 | often find its archives on `lore.kernel.org <https://lore.kernel.org/>`_. | |
745 | ||
746 | But some lists are hosted in different places. That for example is the case for | |
747 | the ath10k WiFi driver used as example in the previous step. But you'll often | |
748 | find the archives for these lists easily on the net. Searching for 'archive | |
749 | ath10k@lists.infradead.org' for example will lead you to the `Info page for the | |
750 | ath10k mailing list <https://lists.infradead.org/mailman/listinfo/ath10k>`_, | |
751 | which at the top links to its | |
752 | `list archives <https://lists.infradead.org/pipermail/ath10k/>`_. Sadly this and | |
753 | quite a few other lists lack a way to search the archives. In those cases use a | |
754 | regular internet search engine and add something like | |
3e544d72 TL |
755 | 'site:lists.infradead.org/pipermail/ath10k/' to your search terms, which limits |
756 | the results to the archives at that URL. | |
757 | ||
4b9d49d1 | 758 | It's also wise to check the internet, LKML and maybe bugzilla.kernel.org again |
0043f0b2 TL |
759 | at this point. If your report needs to be filed in a bug tracker, you may want |
760 | to check the mailing list archives for the subsystem as well, as someone might | |
761 | have reported it only there. | |
3e544d72 | 762 | |
4b9d49d1 TL |
763 | For details on how to search and what to do if you find matching reports see | |
764 | "Search for existing reports, first run" above. | |
3e544d72 | 765 | |
4b9d49d1 TL |
766 | Do not hurry with this step of the reporting process: spending 30 to 60 minutes |
767 | or even more time can save you and others quite a lot of time and trouble. | |
3e544d72 TL |
768 | |
769 | ||
3e544d72 TL |
770 | Install a fresh kernel for testing |
771 | ---------------------------------- | |
772 | ||
2dfa9eb0 TL |
773 | *Unless you are already running the latest 'mainline' Linux kernel, better |
774 | go and install it for the reporting process. Testing and reporting with | |
775 | the latest 'stable' Linux can be an acceptable alternative in some | |
776 | situations; during the merge window that actually might be even the best | |
777 | approach, but in that development phase it can be an even better idea to | |
778 | suspend your efforts for a few days anyway. Whatever version you choose, | |
779 | ideally use a 'vanilla' build. Ignoring this advice will dramatically | |
780 | increase the risk your report will be rejected or ignored.* | |
781 | ||
782 | As mentioned in the detailed explanation for the first step already: Like most | |
783 | programmers, Linux kernel developers don't like to spend time dealing with | |
784 | reports for issues that don't even happen with the current code. It's just a | |
785 | waste of everybody's time, especially yours. That's why it's in everybody's | |
786 | interest that you confirm the issue still exists with the latest upstream code | |
787 | before reporting it. You are free to ignore this advice, but as outlined | |
788 | earlier: doing so dramatically increases the risk that your issue report might | |
789 | get rejected or simply ignored. | |
790 | ||
791 | In the scope of the kernel "latest upstream" normally means: | |
792 | ||
793 | * Install a mainline kernel; the latest stable kernel can be an option, but | |
794 | most of the time is better avoided. Longterm kernels (sometimes called 'LTS | |
795 | kernels') are unsuitable at this point of the process. The next subsection | |
796 | explains all of this in more detail. | |
797 | ||
798 | * The subsection after that describes ways to obtain and install such a kernel. | |
799 | It also outlines that using a pre-compiled kernel is fine, but a vanilla one is | |
800 | better, which means: it was built using Linux sources taken straight `from | |
801 | kernel.org <https://kernel.org/>`_ and not modified or enhanced in any way. | |
802 | ||
803 | Choosing the right version for testing | |
804 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
805 | ||
806 | Head over to `kernel.org <https://kernel.org/>`_ to find out which version you | |
807 | want to use for testing. Ignore the big yellow button that says 'Latest release' | |
808 | and look a little lower at the table. At its top you'll see a line starting with | |
809 | mainline, which most of the time will point to a pre-release with a version | |
810 | number like '5.8-rc2'. If that's the case, you'll want to use this mainline | |
811 | kernel for testing, as that's where all fixes have to be applied first. Do not let | |
812 | that 'rc' scare you, these 'development kernels' are pretty reliable — and you | |
813 | made a backup, as you were instructed above, didn't you? | |
3e544d72 | 814 | |
58c53945 | 815 | In about two out of every nine to ten weeks, mainline might point you to a |
3e544d72 TL |
816 | proper release with a version number like '5.7'. If that happens, consider |
817 | suspending the reporting process until the first pre-release of the next | |
818 | version (5.8-rc1) shows up on kernel.org. That's because the Linux development | |
819 | cycle then is in its two-week long 'merge window'. The bulk of the changes and | |
820 | all intrusive ones get merged for the next release during this time. It's a bit | |
821 | more risky to use mainline during this period. Kernel developers are also often | |
822 | quite busy then and might have no spare time to deal with issue reports. It's | |
823 | also quite possible that one of the many changes applied during the merge | |
824 | window fixes the issue you face; that's why you soon would have to retest with | |
825 | a newer kernel version anyway, as outlined below in the section 'Duties after | |
826 | the report went out'. | |
827 | ||
828 | That's why it might make sense to wait till the merge window is over. But don't | |
829 | do that if you're dealing with something that shouldn't wait. In that case | |
830 | consider obtaining the latest mainline kernel via git (see below) or use the | |
831 | latest stable version offered on kernel.org. Using that is also acceptable in | |
832 | case mainline for some reason does currently not work for you. And in general: | |
833 | using it for reproducing the issue is also better than not reporting the issue | |
834 | at all. | |
835 | ||
2dfa9eb0 TL |
836 | Better avoid using the latest stable kernel outside merge windows, as all fixes |
837 | must be applied to mainline first. That's why checking the latest mainline | |
838 | kernel is so important: any issue you want to see fixed in older version lines | |
839 | needs to be fixed in mainline first before it can get backported, which can | |
840 | take a few days or weeks. Another reason: the fix you hope for might be too | |
841 | hard or risky for backporting; reporting the issue again hence is unlikely to | |
842 | change anything. | |
843 | ||
844 | These aspects are also why longterm kernels (sometimes called "LTS kernels") | |
845 | are unsuitable for this part of the reporting process: they are too distant from | |
846 | the current code. Hence go and test mainline first and follow the process | |
847 | further: if the issue doesn't occur with mainline it will guide you how to get | |
848 | it fixed in older version lines, if that's in the cards for the fix in question. | |
849 | ||
3e544d72 TL |
850 | How to obtain a fresh Linux kernel |
851 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
852 | ||
2dfa9eb0 TL |
853 | **Using a pre-compiled kernel**: This is often the quickest, easiest, and safest |
854 | way for testing, especially if you are unfamiliar with the Linux kernel. The | |
855 | problem: most of those shipped by distributors or add-on repositories are built | |
856 | from modified Linux sources. They are thus not vanilla and therefore often | |
857 | unsuitable for testing and issue reporting: the changes might cause the issue | |
858 | you face or influence it somehow. | |
859 | ||
860 | But you are in luck if you are using a popular Linux distribution: for quite a | |
861 | few of them you'll find repositories on the net that contain packages with the | |
862 | latest mainline or stable Linux built as vanilla kernel. It's totally okay to | |
863 | use these, just make sure from the repository's description they are vanilla or | |
864 | at least close to it. Additionally ensure the packages contain the latest | |
865 | versions as offered on kernel.org. The packages are likely unsuitable if they | |
866 | are older than a week, as new mainline and stable kernels typically get released | |
867 | at least once a week. | |
868 | ||
869 | Please note that you might need to build your own kernel manually later: that's | |
870 | sometimes needed for debugging or testing fixes, as described later in this | |
871 | document. Also be aware that pre-compiled kernels might lack debug symbols that | |
872 | are needed to decode messages the kernel prints when a panic, Oops, warning, or | |
873 | BUG occurs; if you plan to decode those, you might be better off compiling a | |
874 | kernel yourself (see the end of this subsection and the section titled 'Decode | |
875 | failure messages' for details). | |
876 | ||
877 | **Using git**: Developers and experienced Linux users familiar with git are | |
878 | often best served by obtaining the latest Linux kernel sources straight from the | |
879 | `official development repository on kernel.org | |
3e544d72 TL |
880 | <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/>`_. |
881 | Those are likely a bit ahead of the latest mainline pre-release. Don't worry | |
882 | about it: they are as reliable as a proper pre-release, unless the kernel's | |
883 | development cycle is currently in the middle of a merge window. But even then | |
884 | they are quite reliable. | |
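
For illustration, obtaining the sources via git might look like the following sketch. The URL is the clone address of the repository linked above; the function name ``clone_mainline`` and the use of ``--depth 1`` (which skips the history to shorten the download) are choices made for this example only.

```shell
# Clone address of the official mainline development repository.
MAINLINE_URL=https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

# Fetch the latest mainline sources without history.
# Usage: clone_mainline [DIRECTORY]; DIRECTORY defaults to ./linux
clone_mainline() {
    dest=${1:-linux}
    git clone --depth 1 "$MAINLINE_URL" "$dest"
}

# Example:
#   clone_mainline ~/linux
```

A clone without history is not suitable for tasks that need older commits, such as a bisection; drop ``--depth 1`` if you expect to need them.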
885 | ||
2dfa9eb0 TL |
886 | **Conventional**: People unfamiliar with git are often best served by |
887 | downloading the sources as tarball from `kernel.org <https://kernel.org/>`_. | |
3e544d72 | 888 | |
2dfa9eb0 | 889 | How to actually build a kernel is not described here, as many websites explain |
3e544d72 TL |
890 | the necessary steps already. If you are new to it, consider following one of |
891 | those how-to's that suggest using ``make localmodconfig``, as that tries to | |
892 | pick up the configuration of your current kernel and then tries to adjust it | |
893 | somewhat for your system. That does not make the resulting kernel any better, | |
894 | but quicker to compile. | |
895 | ||
315c4e45 TL |
896 | Note: If you are dealing with a panic, Oops, warning, or BUG from the kernel, |
897 | please try to enable CONFIG_KALLSYMS when configuring your kernel. | |
898 | Additionally, enable CONFIG_DEBUG_KERNEL and CONFIG_DEBUG_INFO, too; the | |
899 | latter is the relevant one of those two, but can only be reached if you enable | |
900 | the former. Be aware CONFIG_DEBUG_INFO increases the storage space required to | |
901 | build a kernel by quite a bit. But that's worth it, as these options will allow | |
902 | you later to pinpoint the exact line of code that triggers your issue. The | |
903 | section 'Decode failure messages' below explains this in more detail. | |
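
Before starting a build, you can check whether the options mentioned above are enabled in your configuration. The following is a minimal sketch; the function name ``check_debug_options`` is made up for this illustration, and it assumes the configuration is stored in the usual ``.config`` format with one ``CONFIG_FOO=y`` line per enabled option.

```shell
# Report which of the options named in this section are enabled in a config.
# Usage: check_debug_options [CONFIG_FILE]; CONFIG_FILE defaults to ./.config
check_debug_options() {
    config=${1:-.config}
    for opt in CONFIG_KALLSYMS CONFIG_DEBUG_KERNEL CONFIG_DEBUG_INFO; do
        if grep -q "^${opt}=y\$" "$config"; then
            echo "$opt: enabled"
        else
            echo "$opt: not enabled"
        fi
    done
}

# Example, when called from the root of a configured source tree:
#   check_debug_options
```

One way to switch on a missing option is ``scripts/config --enable DEBUG_INFO`` followed by ``make olddefconfig``; depending on the kernel version, some options might be controlled by other symbols, so rerun the check afterwards.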
904 | ||
905 | But keep in mind: Always keep a record of the issue encountered in case it is | |
906 | hard to reproduce. Sending an undecoded report is better than not reporting | |
907 | the issue at all. | |
908 | ||
3e544d72 TL |
909 | |
910 | Check 'taint' flag | |
911 | ------------------ | |
912 | ||
913 | *Ensure the kernel you just installed does not 'taint' itself when | |
914 | running.* | |
915 | ||
916 | As outlined above in more detail already: the kernel sets a 'taint' flag when | |
917 | something happens that can lead to follow-up errors that look totally | |
918 | unrelated. That's why you need to check if the kernel you just installed does | |
919 | not set this flag. And if it does, you in almost all cases need to | |
920 | eliminate the reason for it before reporting issues that occur with it. See | |
921 | the section above for details how to do that. | |
922 | ||
923 | ||
924 | Reproduce issue with the fresh kernel | |
925 | ------------------------------------- | |
926 | ||
927 | *Reproduce the issue with the kernel you just installed. If it doesn't show | |
613f9691 | 928 | up there, scroll down to the instructions for issues only happening with |
3e544d72 TL |
929 | stable and longterm kernels.* |
930 | ||
931 | Check if the issue occurs with the fresh Linux kernel version you just | |
932 | installed. If it was fixed there already, consider sticking with this version | |
933 | line and abandoning your plan to report the issue. But keep in mind that other | |
934 | users might still be plagued by it, as long as it's not fixed in the stable | |
935 | and longterm versions from kernel.org (and thus vendor kernels derived from | |
936 | those). If you prefer to use one of those or just want to help their users, | |
937 | head over to the section "Details about reporting issues only occurring in | |
938 | older kernel version lines" below. | |
939 | ||
940 | ||
941 | Optimize description to reproduce issue | |
942 | --------------------------------------- | |
943 | ||
944 | *Optimize your notes: try to find and write the most straightforward way to | |
945 | reproduce your issue. Make sure the end result has all the important | |
946 | details, and at the same time is easy to read and understand for others | |
947 | that hear about it for the first time. And if you learned something in this | |
948 | process, consider searching again for existing reports about the issue.* | |
949 | ||
950 | An unnecessarily complex report will make it hard for others to understand your | |
951 | report. Thus try to find a reproducer that's straightforward to describe and | |
952 | thus easy to understand in written form. Include all important details, but at | |
953 | the same time try to keep it as short as possible. | |
954 | ||
955 | In this and the previous steps you likely have learned a thing or two about the | |
956 | issue you face. Use this knowledge and search again for existing reports | |
957 | you can join instead. | |
958 | ||
959 | ||
960 | Decode failure messages | |
961 | ----------------------- | |
962 | ||
315c4e45 TL |
963 | *If your failure involves a 'panic', 'Oops', 'warning', or 'BUG', consider |
964 | decoding the kernel log to find the line of code that triggered the error.* | |
3e544d72 | 965 | |
315c4e45 TL |
966 | When the kernel detects an internal problem, it will log some information about |
967 | the executed code. This makes it possible to pinpoint the exact line in the | |
968 | source code that triggered the issue and shows how it was called. But that only | |
969 | works if you enabled CONFIG_DEBUG_INFO and CONFIG_KALLSYMS when configuring | |
970 | your kernel. If you did so, consider decoding the information from the | |
971 | kernel's log. That will make it a lot easier to understand what led to the | |
972 | 'panic', 'Oops', 'warning', or 'BUG', which increases the chances that someone | |
973 | can provide a fix. | |
3e544d72 | 974 | |
315c4e45 TL |
975 | Decoding can be done with a script you find in the Linux source tree. If you |
976 | are running a kernel you compiled yourself earlier, call it like this:: | |
3e544d72 | 977 | |
315c4e45 TL |
978 | [user@something ~]$ sudo dmesg | ./linux-5.10.5/scripts/decode_stacktrace.sh ./linux-5.10.5/vmlinux |
979 | ||
980 | If you are running a packaged vanilla kernel, you will likely have to install | |
981 | the corresponding packages with debug symbols. Then call the script (which you | |
982 | might need to get from the Linux sources if your distro does not package it) | |
983 | like this:: | |
984 | ||
985 | [user@something ~]$ sudo dmesg | ./linux-5.10.5/scripts/decode_stacktrace.sh \ | |
986 | /usr/lib/debug/lib/modules/5.10.10-4.1.x86_64/vmlinux /usr/src/kernels/5.10.10-4.1.x86_64/ | |
987 | ||
988 | The script will work on log lines like the following, which show the address of | |
989 | the code the kernel was executing when the error occurred:: | |
990 | ||
991 | [ 68.387301] RIP: 0010:test_module_init+0x5/0xffa [test_module] | |
992 | ||
993 | Once decoded, these lines will look like this:: | |
994 | ||
995 | [ 68.387301] RIP: 0010:test_module_init (/home/username/linux-5.10.5/test-module/test-module.c:16) test_module | |
996 | ||
997 | In this case the executed code was built from the file | |
998 | '~/linux-5.10.5/test-module/test-module.c' and the error was triggered by the | |
999 | instructions found in line '16'. | |
3e544d72 | 1000 | |
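If you want to understand what the undecoded line contains before running the script, the following Python sketch splits the example 'RIP:' line from above into its parts: the function name, the offset into that function, the function's size, and the module (if any). The regular expression and the name ``parse_rip`` are assumptions made for this illustration; the sketch is not part of ``decode_stacktrace.sh`` and only covers lines shaped like the example.

```python
import re

# Matches an undecoded x86 'RIP:' line shaped like the example above:
#   RIP: 0010:test_module_init+0x5/0xffa [test_module]
# The pattern is an assumption for illustration, not taken from the kernel.
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:(?P<symbol>[\w.]+)"
    r"\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_rip(line):
    """Return the parts of a raw 'RIP:' log line, or None if it does not match."""
    match = RIP_RE.search(line)
    if match is None:
        return None
    return {
        "symbol": match.group("symbol"),          # function being executed
        "offset": int(match.group("offset"), 16),  # bytes into that function
        "size": int(match.group("size"), 16),      # total size of the function
        "module": match.group("module"),           # None for built-in code
    }


if __name__ == "__main__":
    print(parse_rip("[ 68.387301] RIP: 0010:test_module_init+0x5/0xffa [test_module]"))
```

Mapping the offset to a source file and line is what requires the debug symbols; that part is left to the script.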
315c4e45 TL |
1001 | The script will similarly decode the addresses mentioned in the section |
1002 | starting with 'Call trace', which show the path to the function where the | |
1003 | problem occurred. Additionally, the script will show the assembler output for | |
1004 | the code section the kernel was executing. | |
e223a707 | 1005 | |
315c4e45 TL |
1006 | Note, if you can't get this to work, simply skip this step and mention the |
1007 | reason for it in the report. If you're lucky, it might not be needed. And if it | |
1008 | is, someone might help you to get things going. Also be aware this is just one | |
1009 | of several ways to decode kernel stack traces. Sometimes different steps will | |
1010 | be required to retrieve the relevant details. Don't worry about that, if that's | |
1011 | needed in your case, developers will tell you what to do. | |
e223a707 TL |
1012 | |
1013 | ||
1014 | Special care for regressions | |
1015 | ---------------------------- | |
1016 | ||
3e544d72 TL |
1017 | *If your problem is a regression, try to narrow down when the issue was |
1018 | introduced as much as possible.* | |
1019 | ||
1020 | Linux lead developer Linus Torvalds insists that the Linux kernel never | |
1021 | worsens; that's why he deems regressions unacceptable and wants to see them | |
1022 | fixed quickly. That's why changes that introduced a regression are often | |
1023 | promptly reverted if the issue they cause can't get solved quickly any other | |
1024 | way. Reporting a regression is thus a bit like playing a kind of trump card to | |
1025 | get something quickly fixed. But for that to happen the change that's causing | |
1026 | the regression needs to be known. Normally it's up to the reporter to track | |
1027 | down the culprit, as maintainers often won't have the time or setup at hand to | |
1028 | reproduce it themselves. | |
1029 | ||
1030 | To find the change there is a process called 'bisection' which the document | |
247097e2 | 1031 | Documentation/admin-guide/bug-bisect.rst describes in detail. That process |
3e544d72 TL |
1032 | will often require you to build about ten to twenty kernel images, trying to |
1033 | reproduce the issue with each of them before building the next. Yes, that takes | |
1034 | some time, but don't worry, it works a lot quicker than most people assume. | |
1035 | Thanks to a 'binary search' this will lead you to the one commit in the source | |
1036 | code management system that's causing the regression. Once you find it, search | |
1037 | the net for the subject of the change, its commit id and the shortened commit id | |
1038 | (the first 12 characters of the commit id). This will lead you to existing | |
1039 | reports about it, if there are any. | |
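To get a feeling for why a bisection needs far fewer builds than most people assume, here is a small Python sketch. It estimates the number of test rounds of a binary search as roughly the base-2 logarithm of the number of commits in the suspect range, and it derives the shortened commit id as described above. The function names, the commit counts and the commit id in the usage example are made up for illustration; a real bisection can need a few steps more or fewer.

```python
import math


def short_commit_id(commit_id):
    """Return the shortened commit id: the first 12 characters."""
    return commit_id[:12]


def estimated_bisect_steps(commit_count):
    """Roughly estimate how many kernels a binary search has to test.

    Each round halves the range of suspect commits, so the estimate is
    the base-2 logarithm of the range, rounded up.
    """
    if commit_count < 1:
        raise ValueError("the range must contain at least one commit")
    return math.ceil(math.log2(commit_count))


if __name__ == "__main__":
    # Hypothetical numbers: a range of 10000 commits between 'good' and 'bad'.
    print(estimated_bisect_steps(10000))
    # Hypothetical commit id, used only to show the shortening.
    print(short_commit_id("0123456789abcdef0123456789abcdef01234567"))
```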
1040 | ||
1041 | Note, a bisection needs a bit of know-how, which not everyone has, and quite a | |
1042 | bit of effort, which not everyone is willing to invest. Nevertheless, it's | |
1043 | highly recommended to perform a bisection yourself. If you really can't or | |
1044 | don't want to go down that route at least find out which mainline kernel | |
1045 | introduced the regression. If something for example breaks when switching from | |
1046 | 5.5.15 to 5.8.4, then try at least all the mainline releases in that area (5.6, | |
1047 | 5.7 and 5.8) to check when it first showed up. Unless you're trying to find a | |
1048 | regression in a stable or longterm kernel, avoid testing versions whose number | |
1049 | has three sections (5.6.12, 5.7.8), as that makes the outcome hard to | |
1050 | interpret, which might render your testing useless. Once you found the major | |
1051 | version which introduced the regression, feel free to move on in the reporting | |
1052 | process. But keep in mind: it depends on the issue at hand if the developers | |
1053 | will be able to help without knowing the culprit. Sometimes they might | |
1054 | recognize from the report what went wrong and can fix it; other times they will | |
1055 | be unable to help unless you perform a bisection. | |
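The rule of thumb from the paragraph above can be sketched in a few lines of Python: given the last version that worked and the first one that failed, list the mainline releases worth trying. The function name is hypothetical, and the sketch assumes both versions share the same first number (as in the 5.5.15 to 5.8.4 example); it does not know which releases actually exist.

```python
def mainline_candidates(last_good, first_bad):
    """List mainline releases between a working and a broken version.

    Only the first two sections of each version number are used, so
    '5.5.15' is treated as belonging to the 5.5 series.
    """
    good_major, good_minor = (int(part) for part in last_good.split(".")[:2])
    bad_major, bad_minor = (int(part) for part in first_bad.split(".")[:2])
    if good_major != bad_major:
        # Assumption of this sketch: no jump across a major version bump.
        raise ValueError("both versions must share the same first number")
    return [f"{good_major}.{minor}" for minor in range(good_minor + 1, bad_minor + 1)]


if __name__ == "__main__":
    print(mainline_candidates("5.5.15", "5.8.4"))
```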
1056 | ||
1057 | When dealing with regressions make sure the issue you face is really caused by | |
1058 | the kernel and not by something else, as outlined above already. | |
1059 | ||
1060 | In the whole process keep in mind: an issue only qualifies as regression if the | |
247097e2 TL |
1061 | older and the newer kernel got built with a similar configuration. This can be |
1062 | achieved by using ``make olddefconfig``, as explained in more detail by | |
1063 | Documentation/admin-guide/reporting-regressions.rst; that document also | |
1064 | provides a good deal of other information about regressions you might want to be | |
1065 | aware of. | |
3e544d72 TL |
1066 | |
1067 | ||
1068 | Write and send the report | |
1069 | ------------------------- | |
1070 | ||
1071 | *Start to compile the report by writing a detailed description about the | |
1072 | issue. Always mention a few things: the latest kernel version you installed | |
1073 | for reproducing, the Linux Distribution used, and your notes on how to | |
1074 | reproduce the issue. Ideally, make the kernel's build configuration | |
1075 | (.config) and the output from ``dmesg`` available somewhere on the net and | |
1076 | link to it. Include or upload all other information that might be relevant, | |
1077 | like the output/screenshot of an Oops or the output from ``lspci``. Once | |
1078 | you wrote this main part, insert a normal length paragraph on top of it | |
1079 | outlining the issue and the impact quickly. On top of this add one sentence | |
1080 | that briefly describes the problem and gets people to read on. Now give the | |
1081 | thing a descriptive title or subject that yet again is shorter. Then you're | |
1082 | ready to send or file the report like the MAINTAINERS file told you, unless | |
1083 | you are dealing with one of those 'issues of high priority': they need | |
1084 | special care which is explained in 'Special handling for high priority | |
1085 | issues' below.* | |
1086 | ||
1087 | Now that you have prepared everything it's time to write your report. How to do | |
1088 | that is partly explained by the three documents linked to in the preface above. | |
1089 | That's why this text will only mention a few of the essentials as well as | |
1090 | things specific to the Linux kernel. | |
1091 | ||
1092 | There is one thing that fits both categories: the most crucial parts of your | |
1093 | report are the title/subject, the first sentence, and the first paragraph. | |
1094 | Developers often get quite a lot of mail. They thus often just take a few | |
1095 | seconds to skim a mail before deciding to move on or look closer. Thus: the | |
1096 | better the top section of your report, the higher are the chances that someone | |
1097 | will look into it and help you. And that is why you should ignore them for now | |
1098 | and write the detailed report first. ;-) | |
1099 | ||
1100 | Things each report should mention | |
1101 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1102 | ||
1103 | Describe in detail how your issue happens with the fresh vanilla kernel you | |
1104 | installed. Try to include the step-by-step instructions you wrote and optimized | |
1105 | earlier that outline how you and ideally others can reproduce the issue; in | |
1106 | those rare cases where that's impossible try to describe what you did to | |
1107 | trigger it. | |
1108 | ||
1109 | Also include all the relevant information others might need to understand the | |
1110 | issue and its environment. What's actually needed depends a lot on the issue, | |
1111 | but there are some things you should include always: | |
1112 | ||
1113 | * the output from ``cat /proc/version``, which contains the Linux kernel | |
1114 | version number and the compiler it was built with. | |
1115 | ||
1116 | * the Linux distribution the machine is running (``hostnamectl | grep | |
1117 | "Operating System"``) | |
1118 | ||
1119 | * the architecture of the CPU and the operating system (``uname -mi``) | |
1120 | ||
1121 | * if you are dealing with a regression and performed a bisection, mention the | |
1122 | subject and the commit-id of the change that is causing it. | |
1123 | ||
1124 | In a lot of cases it's also wise to make two more things available to those | |
1125 | that read your report: | |
1126 | ||
1127 | * the configuration used for building your Linux kernel (the '.config' file) | |
1128 | ||
1129 | * the kernel's messages that you get from ``dmesg`` written to a file. Make | |
1130 | sure that it starts with a line like 'Linux version 5.8-1 | |
1131 | (foobar@example.com) (gcc (GCC) 10.2.1, GNU ld version 2.34) #1 SMP Mon Aug | |
1132 | 3 14:54:37 UTC 2020' If it's missing, then important messages from the first | |
1133 | boot phase already got discarded. In this case instead consider using | |
1134 | ``journalctl -b 0 -k``; alternatively you can also reboot, reproduce the | |
1135 | issue and call ``dmesg`` right afterwards. | |
1136 | ||
1137 | These two files are big, that's why it's a bad idea to put them directly into | |
1138 | your report. If you are filing the issue in a bug tracker then attach them to | |
1139 | the ticket. If you report the issue by mail do not attach them, as that makes | |
1140 | the mail too large; instead do one of these things: | |
1141 | ||
1142 | * Upload the files somewhere public (your website, a public file paste | |
1143 | service, a ticket created just for this purpose on `bugzilla.kernel.org | |
1144 | <https://bugzilla.kernel.org/>`_, ...) and include a link to them in your | |
1145 | report. Ideally use something where the files stay available for years, as | |
1146 | they could be useful to someone many years from now; this for example can | |
1147 | happen if five or ten years from now a developer works on some code that was | |
1148 | changed just to fix your issue. | |
1149 | ||
1150 | * Put the files aside and mention you will send them later in individual | |
1151 | replies to your own mail. Just remember to actually do that once the report | |
1152 | went out. ;-) | |
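Before uploading the saved kernel messages, you can verify that the start of the log is still there. The Python sketch below checks whether the first non-empty line of a saved log is the 'Linux version' line mentioned earlier. The helper name and the timestamp pattern are assumptions for this illustration; on some systems other lines may precede the version line, in which case the check would need to look a little further.

```python
import re

# Optional timestamp prefix such as '[    0.000000] ' (assumed format).
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def log_starts_at_boot(log_text):
    """Return True if the first non-empty line is the 'Linux version' line."""
    for line in log_text.splitlines():
        if line.strip():
            return TIMESTAMP_RE.sub("", line).startswith("Linux version ")
    return False  # empty log


if __name__ == "__main__":
    sample = (
        "[    0.000000] Linux version 5.8-1 (foobar@example.com) "
        "(gcc (GCC) 10.2.1, GNU ld version 2.34) #1 SMP Mon Aug 3 14:54:37 UTC 2020\n"
    )
    print(log_starts_at_boot(sample))
```

If the check fails, fall back to ``journalctl -b 0 -k`` or reboot and capture the log again, as described above.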
1153 | ||
1154 | Things that might be wise to provide | |
1155 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1156 | ||
1157 | Depending on the issue you might need to add more background data. Here are a | |
1158 | few suggestions what often is good to provide: | |
1159 | ||
1160 | * If you are dealing with a 'warning', an 'OOPS' or a 'panic' from the kernel, | |
1161 | include it. If you can't copy'n'paste it, try to capture a netconsole trace | |
1162 | or at least take a picture of the screen. | |
1163 | ||
1164 | * If the issue might be related to your computer hardware, mention what kind | |
1165 | of system you use. If you for example have problems with your graphics card, | |
1166 | mention its manufacturer, the card's model, and what chip it uses. If it's a | |
1167 | laptop mention its name, but try to make sure it's meaningful. 'Dell XPS 13' | |
1168 | for example is not, because it might be the one from 2012; that one looks | |
1169 | not that different from the one sold today, but apart from that the two have | |
1170 | nothing in common. Hence, in such cases add the exact model number, which | |
1171 | for example is '9380' or '7390' for XPS 13 models introduced during 2019. | |
1172 | Names like 'Lenovo Thinkpad T590' are also somewhat ambiguous: there are | |
1173 | variants of this laptop with and without a dedicated graphics chip, so try | |
1174 | to find the exact model name or specify the main components. | |
1175 | ||
1176 | * Mention the relevant software in use. If you have problems with loading | |
1177 | modules, you want to mention the versions of kmod, systemd, and udev in use. | |
1178 | If one of the DRM drivers misbehaves, you want to state the versions of | |
1179 | libdrm and Mesa; also specify your Wayland compositor or the X-Server and | |
1180 | its driver. If you have a filesystem issue, mention the version of | |
1181 | corresponding filesystem utilities (e2fsprogs, btrfs-progs, xfsprogs, ...). | |
1182 | ||
1183 | * Gather additional information from the kernel that might be of interest. The | |
1184 | output from ``lspci -nn`` will for example help others to identify what | |
1185 | hardware you use. If you have a problem with hardware you even might want to | |
1186 | make the output from ``sudo lspci -vvv`` available, as that provides | |
1187 | insights how the components were configured. For some issues it might be | |
1188 | good to include the contents of files like ``/proc/cpuinfo``, | |
1189 | ``/proc/ioports``, ``/proc/iomem``, ``/proc/modules``, or | |
1190 | ``/proc/scsi/scsi``. Some subsystems also offer tools to collect relevant | |
1191 | information. One such tool is ``alsa-info.sh`` `which the audio/sound | |
1192 | subsystem developers provide <https://www.alsa-project.org/wiki/AlsaInfo>`_. | |
1193 | ||
1194 | Those examples should give you some ideas of what data might be wise to | |
1195 | attach, but you have to think yourself what will be helpful for others to know. | |
1196 | Don't worry too much about forgetting something, as developers will ask for | |
1197 | additional details they need. But making everything important available from | |
1198 | the start increases the chance someone will take a closer look. | |
1199 | ||
1200 | ||
1201 | The important part: the head of your report | |
1202 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1203 | ||
1204 | Now that you have the detailed part of the report prepared let's get to the | |
1205 | most important section: the first few sentences. Thus go to the top, add | |
1206 | something like 'The detailed description:' before the part you just wrote and | |
1207 | insert two newlines at the top. Now write one normal length paragraph that | |
1208 | describes the issue roughly. Leave out all boring details and focus on the | |
1209 | crucial parts readers need to know to understand what this is all about; if you | |
1210 | think this bug affects a lot of users, mention this to get people interested. | |
1211 | ||
1212 | Once you did that insert two more lines at the top and write a one sentence | |
1213 | summary that explains quickly what the report is about. After that you have to | |
1214 | get even more abstract and write an even shorter subject/title for the report. | |
1215 | ||
1216 | Now that you have written this part take some time to optimize it, as it is the | |
1217 | most important part of your report: a lot of people will only read this before | |
1218 | they decide if reading the rest is time well spent. | |
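The layout described in this subsection can be sketched as a small Python helper that puts the parts into the order a reader will see them: the one-sentence summary first, then the overview paragraph, then the detailed part under its marker line. The function name and the sample texts are made up for illustration; the subject or title is handled separately by your mail client or the bug tracker.

```python
def assemble_report(summary, overview, details):
    """Join the report parts top-down, separated by blank lines."""
    return "\n\n".join([
        summary.strip(),                                 # one-sentence summary
        overview.strip(),                                # normal-length paragraph
        "The detailed description:\n" + details.strip(),  # part written first
    ]) + "\n"


if __name__ == "__main__":
    # Hypothetical content, only meant to show the resulting order.
    print(assemble_report(
        "Suspend to RAM fails on an example laptop since 5.8-rc1.",
        "Since updating to 5.8-rc1 the machine no longer resumes; 5.7 works fine.",
        "1. Boot the kernel.\n2. Close the lid.\n3. Open the lid again.",
    ))
```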
1219 | ||
1220 | Now send or file the report like the :ref:`MAINTAINERS <maintainers>` file told | |
1221 | you, unless it's one of those 'issues of high priority' outlined earlier: in | |
1222 | that case please read the next subsection first before sending the report on | |
1223 | its way. | |
1224 | ||
1225 | Special handling for high priority issues | |
1226 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1227 | ||
1228 | Reports for high priority issues need special handling. | |
1229 | ||
58c53945 | 1230 | **Severe issues**: make sure the subject or ticket title as well as the first |
3e544d72 TL |
1231 | paragraph makes the severity obvious. |
1232 | ||
6161a4b1 TL |
1233 | **Regressions**: make the report's subject start with '[REGRESSION]'. |
1234 | ||
1235 | In case you performed a successful bisection, use the title of the change that | |
1236 | introduced the regression as the second part of your subject. Make the report | |
349660e9 | 1237 | also mention the commit id of the culprit. In case of an unsuccessful bisection, |
6161a4b1 TL |
1238 | make your report mention the latest tested version that's working fine (say 5.7) |
1239 | and the oldest where the issue occurs (say 5.8-rc1). | |
1240 | ||
1241 | When sending the report by mail, CC the Linux regressions mailing list | |
1242 | (regressions@lists.linux.dev). In case the report needs to be filed to some web | |
0043f0b2 TL |
1243 | tracker, proceed to do so. Once filed, forward the report by mail to the |
1244 | regressions list; CC the maintainer and the mailing list for the subsystem in | |
1245 | question. Make sure to inline the forwarded report, hence do not attach it. | |
1246 | Also add a short note at the top where you mention the URL to the ticket. | |
6161a4b1 TL |
1247 | |
1248 | When mailing or forwarding the report, in case of a successful bisection add the | |
1249 | author of the culprit to the recipients; also CC everyone in the signed-off-by | |
1250 | chain, which you find at the end of its commit message. | |
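As a rough illustration of the two mechanical parts above, the following Python sketch builds a subject that starts with '[REGRESSION]' and collects the people in the signed-off-by chain of a commit message. The function names, the commit message and the addresses in the usage example are made up; the sketch assumes the usual ``Signed-off-by: Name <address>`` form at the start of a line.

```python
import re

# One 'Signed-off-by:' tag per line (assumed form: 'Signed-off-by: Name <address>').
SOB_RE = re.compile(r"^Signed-off-by:\s*(.+?)\s*$", re.MULTILINE)


def regression_subject(culprit_title):
    """Prefix the title of the culprit change with the regression tag."""
    return f"[REGRESSION] {culprit_title}"


def signed_off_by_chain(commit_message):
    """Return everyone in the signed-off-by chain, in order, without duplicates."""
    chain = []
    for person in SOB_RE.findall(commit_message):
        if person not in chain:
            chain.append(person)
    return chain


if __name__ == "__main__":
    # Hypothetical commit message, used only to show the parsing.
    message = (
        "example: frob the widget\n\n"
        "Explain the change here.\n\n"
        "Signed-off-by: Jane Doe <jane@example.com>\n"
        "Signed-off-by: John Roe <john@example.org>\n"
    )
    print(regression_subject("example: frob the widget"))
    print(signed_off_by_chain(message))
```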
3e544d72 TL |
1251 | |
1252 | **Security issues**: for these issues you will have to evaluate if a | |
1253 | short-term risk to other users would arise if details were publicly disclosed. | |
1254 | If that's not the case simply proceed with reporting the issue as described. | |
1255 | For issues that bear such a risk you will need to adjust the reporting process | |
1256 | slightly: | |
1257 | ||
1258 | * If the MAINTAINERS file instructed you to report the issue by mail, do not | |
1259 | CC any public mailing lists. | |
1260 | ||
1261 | * If you were supposed to file the issue in a bug tracker make sure to mark | |
1262 | the ticket as 'private' or 'security issue'. If the bug tracker does not | |
1263 | offer a way to keep reports private, forget about it and send your report as | |
1264 | a private mail to the maintainers instead. | |
1265 | ||
1266 | In both cases make sure to also mail your report to the addresses the | |
1267 | MAINTAINERS file lists in the section 'security contact'. Ideally directly CC | |
1268 | them when sending the report by mail. If you filed it in a bug tracker, forward | |
1269 | the report's text to these addresses; but on top of it put a small note where | |
1270 | you mention that you filed it with a link to the ticket. | |
1271 | ||
44ac5aba | 1272 | See Documentation/process/security-bugs.rst for more information. |
3e544d72 TL |
1273 | |
1274 | ||
1275 | Duties after the report went out | |
1276 | -------------------------------- | |
1277 | ||
1278 | *Wait for reactions and keep the thing rolling until you can accept the | |
1279 | outcome in one way or the other. Thus react publicly and in a timely manner | |
1280 | to any inquiries. Test proposed fixes. Do proactive testing: retest with at | |
1281 | least every first release candidate (RC) of a new mainline version and | |
1282 | report your results. Send friendly reminders if things stall. And try to | |
1283 | help yourself, if you don't get any help or if it's unsatisfying.* | |
1284 | ||
1285 | If your report was good and you are really lucky then one of the developers | |
1286 | might immediately spot what's causing the issue; they then might write a patch | |
1287 | to fix it, test it, and send it straight for integration in mainline while | |
1288 | tagging it for later backport to stable and longterm kernels that need it. Then | |
1289 | all you need to do is reply with a 'Thank you very much' and switch to a version | |
1290 | with the fix once it gets released. | |
1291 | ||
1292 | But this ideal scenario rarely happens. That's why the job is only starting | |
1293 | once you got the report out. What you'll have to do depends on the situation, | |
1294 | but often it will be the things listed below. But before digging into the | |
1295 | details, here are a few important things you need to keep in mind for this part | |
1296 | of the process. | |
1297 | ||
1298 | ||
1299 | General advice for further interactions | |
1300 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1301 | ||
1302 | **Always reply in public**: When you filed the issue in a bug tracker, always | |
1303 | reply there and do not contact any of the developers privately about it. For | |
1304 | mailed reports always use the 'Reply-all' function when replying to any mails | |
1305 | you receive. That includes mails with any additional data you might want to add | |
1306 | to your report: go to your mail application's 'Sent' folder and use 'reply-all' | |
1307 | on your mail with the report. This approach will make sure the public mailing | |
1308 | list(s) and everyone else that gets involved over time stays in the loop; it | |
1309 | also keeps the mail thread intact, which among others is really important for | |
1310 | mailing lists to group all related mails together. | |
1311 | ||
1312 | There are just two situations where a comment in a bug tracker or a 'Reply-all' | |
1313 | is unsuitable: | |
1314 | ||
1315 | * Someone tells you to send something privately. | |
1316 | ||
1317 | * You were told to send something, but noticed it contains sensitive | |
1318 | information that needs to be kept private. In that case it's okay to send it | |
1319 | in private to the developer that asked for it. But note in the ticket or a | |
1320 | mail that you did that, so everyone else knows you honored the request. | |
1321 | ||
1322 | **Do research before asking for clarifications or help**: In this part of the | |
1323 | process someone might tell you to do something that requires a skill you might | |
1324 | not have mastered yet. For example, you might be asked to use some test tools | |
1325 | you never have heard of yet; or you might be asked to apply a patch to the | |
1326 | Linux kernel sources to test if it helps. In some cases it will be fine sending | |
1327 | a reply asking for instructions how to do that. But before going that route try | |
1328 | to find the answer on your own by searching the internet; alternatively | |
613f9691 | 1329 | consider asking in other places for advice. For example ask a friend or post |
3e544d72 TL |
1330 | about it to a chatroom or forum you normally hang out in. |
1331 | ||
1332 | **Be patient**: If you are really lucky you might get a reply to your report | |
1333 | within a few hours. But most of the time it will take longer, as maintainers | |
1334 | are scattered around the globe and thus might be in a different time zone – one | |
1335 | where they already enjoy their night away from keyboard. | |
1336 | ||
1337 | In general, kernel developers will take one to five business days to respond to | |
1338 | reports. Sometimes it will take longer, as they might be busy with the merge | |
1339 | windows, other work, visiting developer conferences, or simply enjoying a long | |
1340 | summer holiday. | |
1341 | ||
1342 | The 'issues of high priority' (see above for an explanation) are an exception | |
1343 | here: maintainers should address them as soon as possible; that's why you | |
1344 | should wait a week at maximum (or just two days if it's something urgent) | |
1345 | before sending a friendly reminder. | |
1346 | ||
1347 | Sometimes the maintainer might not be responding in a timely manner; other | |
1348 | times there might be disagreements, for example if an issue qualifies as | |
1349 | regression or not. In such cases raise your concerns on the mailing list and | |
1350 | ask others for public or private replies how to move on. If that fails, it | |
1351 | might be appropriate to get a higher authority involved. In case of a WiFi | |
1352 | driver that would be the wireless maintainers; if there are no higher level | |
1353 | maintainers or all else fails, it might be one of those rare situations where | |
1354 | it's okay to get Linus Torvalds involved. | |
1355 | ||
1356 | **Proactive testing**: Every time the first pre-release (the 'rc1') of a new | |
1357 | mainline kernel version gets released, go and check if the issue is fixed there | |
1358 | or if anything of importance changed. Mention the outcome in the ticket or in a | |
1359 | mail you sent as reply to your report (make sure it has all those in the CC | |
1360 | that up to that point participated in the discussion). This will show your | |
1361 | commitment and that you are willing to help. It also tells developers if the | |
1362 | issue persists and makes sure they do not forget about it. A few other | |
1363 | occasional retests (for example with rc3, rc5 and the final) are also a good | |
1364 | idea, but only report your results if something relevant changed or if you are | |
1365 | writing something anyway. | |
1366 | ||
1367 | With all these general things off the table let's get into the details of how | |
1368 | to help to get issues resolved once they were reported. | |
1369 | ||
1370 | Inquiries and testing requests | |
1371 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1372 | ||
1373 | Here are your duties in case you got replies to your report: | |
1374 | ||
1375 | **Check who you deal with**: Most of the time it will be the maintainer or a | |
1376 | developer of the particular code area that will respond to your report. But as | |
1377 | issues are normally reported in public it could be anyone that's replying — | |
1378 | including people that want to help, but in the end might guide you totally off | |
1379 | track with their questions or requests. That rarely happens, but it's one of | |
1380 | many reasons why it's wise to quickly run an internet search to see who you're | |
1381 | interacting with. By doing this you also become aware if your report was heard by | |
1382 | the right people, as a reminder to the maintainer (see below) might be in order | |
1383 | later if discussion fades out without leading to a satisfying solution for the | |
1384 | issue. | |
1385 | ||
1386 | **Inquiries for data**: Often you will be asked to test something or provide | |
1387 | additional details. Try to provide the requested information soon, as you have | |
1388 | the attention of someone that might help and risk losing it the longer you | |
1389 | wait; that outcome is even likely if you do not provide the information within | |
1390 | a few business days. | |
1391 | ||
1392 | **Requests for testing**: When you are asked to test a diagnostic patch or a | |
1393 | possible fix, try to test it in a timely manner, too. But do it properly and make | |
1394 | sure to not rush it: mixing things up can happen easily and can lead to a lot | |
1395 | of confusion for everyone involved. A common mistake for example is thinking a | |
1396 | proposed patch with a fix was applied when in fact it wasn't. Things like that | |
1397 | happen even to experienced testers occasionally, but they most of the time will | |
1398 | notice when the kernel with the fix behaves just as one without it. | |
1399 | ||
1400 | What to do when nothing of substance happens | |
1401 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1402 | ||
1403 | Some reports will not get any reaction from the responsible Linux kernel | |
1404 | developers; or a discussion around the issue evolved, but faded out with | |
1405 | nothing of substance coming out of it. | |
1406 | ||
1407 | In these cases wait two (better: three) weeks before sending a friendly | |
1408 | reminder: maybe the maintainer was just away from keyboard for a while when | |
1409 | your report arrived or had something more important to take care of. When | |
1410 | writing the reminder, kindly ask if anything else from your side is needed to | |
1411 | get the ball running somehow. If the report got out by mail, do that in the | |
1412 | first lines of a mail that is a reply to your initial mail (see above) which | |
1413 | includes a full quote of the original report below: that's one of those few | |
1414 | situations where such a 'TOFU' (Text Over, Fullquote Under) is the right | |
1415 | approach, as then all the recipients will have the details at hand immediately | |
1416 | in the proper order. | |
1417 | ||
1418 | After the reminder wait three more weeks for replies. If you still don't get a | |
1419 | proper reaction, you first should reconsider your approach. Did you maybe try | |
1420 | to reach out to the wrong people? Was the report maybe offensive or so | |
1421 | confusing that people decided to completely stay away from it? The best way to | |
1422 | rule out such factors: show the report to one or two people familiar with FLOSS | |
1423 | issue reporting and ask for their opinion. Also ask them for their advice how | |
1424 | to move forward. That might mean: prepare a better report and make those people | |
1425 | review it before you send it out. Such an approach is totally fine; just | |
1426 | mention that this is the second and improved report on the issue and include a | |
1427 | link to the first report. | |
1428 | ||
1429 | If the report was proper you can send a second reminder; in it ask for advice | |
1430 | why the report did not get any replies. A good moment for this second reminder | |
1431 | mail is shortly after the first pre-release (the 'rc1') of a new Linux kernel | |
1432 | version got published, as you should retest and provide a status update at that | |
1433 | point anyway (see above). | |
1434 | ||
1435 | If the second reminder again results in no reaction within a week, try to | |
1436 | contact a higher-level maintainer asking for advice: even busy maintainers by | |
1437 | then should at least have sent some kind of acknowledgment. | |
1438 | ||
1439 | Remember to prepare yourself for a disappointment: maintainers ideally should | |
1440 | react somehow to every issue report, but they are only obliged to fix those | |
1441 | 'issues of high priority' outlined earlier. So don't be too devastated if you | |
1442 | get a reply along the lines of 'thanks for the report, I have more important | |
1443 | issues to deal with currently and won't have time to look into this for the | |
1444 | foreseeable future'. | |
1445 | ||
1446 | It's also possible that after some discussion in the bug tracker or on a list | |
1447 | nothing happens anymore and reminders don't help to motivate anyone to work out | |
1448 | a fix. Such situations can be devastating, but are within the cards when it | |
1449 | comes to Linux kernel development. This and several other reasons for not | |
1450 | getting help are explained in 'Why some issues won't get any reaction or remain | |
1451 | unfixed after being reported' near the end of this document. | |
1452 | ||
1453 | Don't get devastated if you don't find any help or if the issue in the end does | |
1454 | not get solved: the Linux kernel is FLOSS and thus you can still help yourself. | |
1455 | You for example could try to find others that are affected and team up with | |
1456 | them to get the issue resolved. Such a team could prepare a fresh report | |
1457 | together that mentions how many you are and why this is something that in your | |
1458 | opinion should get fixed. Maybe together you can also narrow down the root cause | |
1459 | or the change that introduced a regression, which often makes developing a fix | |
1460 | easier. And with a bit of luck there might be someone in the team that knows a | |
1461 | bit about programming and might be able to write a fix. | |
1462 | ||
1463 | ||
58c53945 TL |
1464 | Reference for "Reporting regressions within a stable and longterm kernel line" |
1465 | ------------------------------------------------------------------------------ | |
9bc4430d | 1466 | |
58c53945 TL |
1467 | This subsection provides details for the steps you need to perform if you face |
1468 | a regression within a stable and longterm kernel line. | |
9bc4430d TL |
1469 | |
1470 | Make sure the particular version line still gets support | |
1471 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1472 | ||
1473 | *Check if the kernel developers still maintain the Linux kernel version | |
1474 | line you care about: go to the front page of kernel.org and make sure it | |
1475 | mentions the latest release of the particular version line without an | |
1476 | '[EOL]' tag.* | |
1477 | ||
1478 | Most kernel version lines only get supported for about three months, as | |
1479 | maintaining them longer is quite a lot of work. Hence, only one per year is | |
1480 | chosen and gets supported for at least two years (often six). That's why you | |
1481 | need to check if the kernel developers still support the version line you care | |
1482 | for. | |
1483 | ||
58c53945 | 1484 | Note, if kernel.org lists two stable version lines on the front page, you |
9bc4430d TL |
1485 | should consider switching to the newer one and forget about the older one: |
1486 | support for it is likely to be abandoned soon. Then it will get an "end-of-life" | |
1487 | (EOL) stamp. Version lines that reached that point still get mentioned on the | |
1488 | kernel.org front page for a week or two, but are unsuitable for testing and | |
1489 | reporting. | |
1490 | ||
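The support check described above can also be scripted. What follows is only a
minimal sketch: it assumes that kernel.org publishes the front page data in
machine-readable form as ``https://www.kernel.org/releases.json`` and that each
entry in its ``releases`` list carries the fields ``version``, ``moniker`` and
``iseol``. Verify both assumptions before relying on it. The function name
``series_status`` and the sample data are made up for illustration.

```python
from typing import Any, Dict, Optional


def series_status(releases_doc: Dict[str, Any], series: str) -> Optional[Dict[str, Any]]:
    """Return version and EOL flag of the first listed release in `series`.

    `series` is a version line such as "5.10". Returns None if the front
    page data does not mention that line at all (assumed field names:
    "releases", "version", "iseol").
    """
    prefix = series + "."
    for rel in releases_doc.get("releases", []):
        version = rel.get("version", "")
        # "5.1" must not match "5.10.4", hence the trailing dot in the prefix.
        if version == series or version.startswith(prefix):
            return {"version": version, "eol": bool(rel.get("iseol"))}
    return None


# Hand-written sample in the assumed format; not real kernel.org data.
sample = {
    "releases": [
        {"moniker": "stable", "version": "5.11.2", "iseol": False},
        {"moniker": "longterm", "version": "5.10.19", "iseol": False},
        {"moniker": "stable", "version": "5.9.16", "iseol": True},
    ]
}

print(series_status(sample, "5.10"))
print(series_status(sample, "5.9"))
```

A line that is reported with ``eol`` set, or that is not listed at all, is
unsuitable for testing and reporting. To work on live data you could load the
JSON file with ``json.load(urllib.request.urlopen(...))`` and pass the result
to the function.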
1491 | Search stable mailing list | |
1492 | ~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1493 | ||
1494 | *Check the archives of the Linux stable mailing list for existing reports.* | |
1495 | ||
1496 | Maybe the issue you face is already known and was fixed or is about to be. Hence, | |
1497 | `search the archives of the Linux stable mailing list | |
1498 | <https://lore.kernel.org/stable/>`_ for reports about an issue like yours. If | |
1499 | you find any matches, consider joining the discussion, unless the fix is | |
1500 | already finished and scheduled to get applied soon. | |
1501 | ||
1502 | Reproduce issue with the newest release | |
1503 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1504 | ||
1505 | *Install the latest release from the particular version line as a vanilla | |
1506 | kernel. Ensure this kernel is not tainted and still shows the problem, as | |
58c53945 TL |
1507 | the issue might have already been fixed there. If you first noticed the |
1508 | problem with a vendor kernel, check that a vanilla build of the last version | |
1509 | known to work performs fine as well.* | |
9bc4430d TL |
1510 | |
1511 | Before investing any more time in this process you want to check if the issue | |
1512 | was already fixed in the latest release of the version line you're interested in. | |
1513 | This kernel needs to be vanilla and shouldn't be tainted before the issue | |
4b9d49d1 TL |
1514 | happens, as already outlined in detail above in the section "Install a fresh |
1515 | kernel for testing". | |
1516 | ||
58c53945 TL |
1517 | Did you first notice the regression with a vendor kernel? Then changes the |
1518 | vendor applied might be interfering. You need to rule that out by performing | |
1519 | a recheck. Say something broke when you updated from 5.10.4-vendor.42 to | |
1520 | 5.10.5-vendor.43. Then after testing the latest 5.10 release as outlined in | |
1521 | the previous paragraph check if a vanilla build of Linux 5.10.4 works fine as | |
1522 | well. If things are broken there, the issue does not qualify as an upstream | |
1523 | regression and you need to switch back to the main step-by-step guide to report | |
1524 | the issue. | |
1525 | ||
4b9d49d1 TL |
1526 | Report the regression |
1527 | ~~~~~~~~~~~~~~~~~~~~~ | |
1528 | ||
58c53945 | 1529 | *Send a short problem report to the Linux stable mailing list |
6161a4b1 | 1530 | (stable@vger.kernel.org) and CC the Linux regressions mailing list |
0043f0b2 TL |
1531 | (regressions@lists.linux.dev); if you suspect the cause in a particular |
1532 | subsystem, CC its maintainer and its mailing list. Roughly describe the | |
1533 | issue and ideally explain how to reproduce it. Mention the first version | |
1534 | that shows the problem and the last version that's working fine. Then | |
1535 | wait for further instructions.* | |
4b9d49d1 TL |
1536 | |
1537 | When reporting a regression that happens within a stable or longterm kernel | |
1538 | line (say when updating from 5.10.4 to 5.10.5) a brief report is enough for | |
0043f0b2 TL |
1539 | the start to get the issue reported quickly. Hence a rough description to the |
1540 | stable and regressions mailing lists is all it takes; but in case you suspect | |
1541 | the cause in a particular subsystem, CC its maintainers and its mailing list | |
1542 | as well, because that will speed things up. | |
4b9d49d1 | 1543 | |
0043f0b2 | 1544 | And note, it helps developers a great deal if you can specify the exact version |
4b9d49d1 TL |
1545 | that introduced the problem. Hence if possible within a reasonable time frame, |
1546 | try to find that version using vanilla kernels. Let's assume something broke when | |
1547 | your distributor released an update from Linux kernel 5.10.5 to 5.10.8. Then as | |
1548 | instructed above go and check the latest kernel from that version line, say | |
1549 | 5.10.9. If it shows the problem, try a vanilla 5.10.5 to ensure that no patches | |
1550 | the distributor applied interfere. If the issue doesn't manifest itself there, | |
1551 | try 5.10.7 and then (depending on the outcome) 5.10.8 or 5.10.6 to find the | |
1552 | first version where things broke. Mention it in the report and state that 5.10.9 | |
1553 | is still broken. | |
1554 | ||
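The manual search over point releases described above is a binary search. Here
is a minimal sketch of the order in which you would test the releases. It
assumes that the oldest release in the list is known to work, that the newest
one is known to be broken, and that the problem stays present once it has
appeared. The names ``first_bad_release`` and ``simulated_test`` are made up
for illustration; in practice "testing" means building and booting a vanilla
kernel of that version and checking by hand whether the issue shows up.

```python
from typing import Callable, List, Sequence


def first_bad_release(releases: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first broken release.

    `releases` must be sorted from oldest to newest; releases[0] is assumed
    to be good and releases[-1] is assumed to be bad.
    """
    good, bad = 0, len(releases) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(releases[mid]):
            bad = mid   # the problem was introduced at or before this release
        else:
            good = mid  # the problem was introduced after this release
    return releases[bad]


# Example from the text: vanilla 5.10.5 works, vanilla 5.10.9 is broken.
releases = ["5.10.5", "5.10.6", "5.10.7", "5.10.8", "5.10.9"]
tested: List[str] = []


def simulated_test(version: str) -> bool:
    """Stand-in for a manual test; pretends that 5.10.7 introduced the issue."""
    tested.append(version)
    return releases.index(version) >= releases.index("5.10.7")


culprit = first_bad_release(releases, simulated_test)
print(culprit, tested)
```

With these five releases the search starts with 5.10.7 and then moves on to
5.10.6 or 5.10.8 depending on the outcome, exactly as outlined in the text.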
1555 | What the previous paragraph outlines is basically a rough manual 'bisection'. | |
1556 | Once your report is out you might get asked to do a proper one, as it allows one to | |
1557 | pinpoint the exact change that causes the issue (which then can easily get | |
1558 | reverted to fix the issue quickly). Hence consider doing a proper bisection | |
1559 | right away if time permits. See the section 'Special care for regressions' and | |
247097e2 | 1560 | the document Documentation/admin-guide/bug-bisect.rst for details on how to |
0043f0b2 TL |
1561 | perform one. In case of a successful bisection add the author of the culprit to |
1562 | the recipients; also CC everyone in the signed-off-by chain, which you find at | |
1563 | the end of its commit message. | |
4b9d49d1 TL |
1564 | |
1565 | ||
58c53945 TL |
1566 | Reference for "Reporting issues only occurring in older kernel version lines" |
1567 | ----------------------------------------------------------------------------- | |
4b9d49d1 | 1568 | |
58c53945 | 1569 | This section provides details for the steps you need to take if you could not |
3e544d72 TL |
1570 | reproduce your issue with a mainline kernel, but want to see it fixed in older |
1571 | version lines (aka stable and longterm kernels). | |
1572 | ||
1573 | Some fixes are too complex | |
1574 | ~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1575 | ||
1576 | *Prepare yourself for the possibility that going through the next few steps | |
1577 | might not get the issue solved in older releases: the fix might be too big | |
1578 | or risky to get backported there.* | |
1579 | ||
1580 | Even small and seemingly obvious code-changes sometimes introduce new and | |
1581 | totally unexpected problems. The maintainers of the stable and longterm kernels | |
1582 | are very aware of that and thus only apply changes to these kernels that are | |
247097e2 | 1583 | within rules outlined in Documentation/process/stable-kernel-rules.rst. |
3e544d72 TL |
1584 | |
1585 | Complex or risky changes for example do not qualify and thus only get applied | |
1586 | to mainline. Other fixes are easy to get backported to the newest stable and | |
1587 | longterm kernels, but too risky to integrate into older ones. So be aware the | |
1588 | fix you are hoping for might be one of those that won't be backported to the | |
1589 | version line you care about. In that case you'll have no other choice than to | |
1590 | live with the issue or switch to a newer Linux version, unless you want to | |
1591 | patch the fix into your kernels yourself. | |
1592 | ||
4b9d49d1 TL |
1593 | Common preparations |
1594 | ~~~~~~~~~~~~~~~~~~~ | |
3e544d72 | 1595 | |
4b9d49d1 TL |
1596 | *Perform the first three steps in the section "Reporting regressions within |
1597 | a stable and longterm kernel line" above.* | |
3e544d72 | 1598 | |
4b9d49d1 TL |
1599 | You need to carry out a few steps already described in another section of this |
1600 | guide. Those steps will let you: | |
3e544d72 | 1601 | |
4b9d49d1 TL |
1602 | * Check if the kernel developers still maintain the Linux kernel version line |
1603 | you care about. | |
3e544d72 | 1604 | |
4b9d49d1 | 1605 | * Search the Linux stable mailing list for existing reports. |
3e544d72 | 1606 | |
4b9d49d1 | 1607 | * Check with the latest release. |
3e544d72 | 1608 | |
3e544d72 TL |
1609 | |
1610 | Check code history and search for existing discussions | |
1611 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
1612 | ||
1613 | *Search the Linux kernel version control system for the change that fixed | |
1614 | the issue in mainline, as its commit message might tell you if the fix is | |
1615 | scheduled for backporting already. If you don't find anything that way, | |
1616 | search the appropriate mailing lists for posts that discuss such an issue | |
1617 | or peer-review possible fixes; then check the discussions if the fix was | |
1618 | deemed unsuitable for backporting. If backporting was not considered at | |
1619 | all, join the newest discussion, asking if it's in the cards.* | |
1620 | ||
1621 | In a lot of cases the issue you deal with will have happened with mainline, but | |
1622 | got fixed there. The commit that fixed it would need to get backported as well | |
1623 | to get the issue solved. That's why you want to search for it or any | |
1624 | discussions about it. | |
1625 | ||
1626 | * First try to find the fix in the Git repository that holds the Linux kernel | |
1627 | sources. You can do this with the web interfaces `on kernel.org | |
1628 | <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/>`_ | |
1629 | or its mirror `on GitHub <https://github.com/torvalds/linux>`_; if you have | |
1630 | a local clone you alternatively can search on the command line with ``git | |
1631 | log --grep=<pattern>``. | |
1632 | ||
1633 | If you find the fix, look if the commit message near the end contains a | |
1634 | 'stable tag' that looks like this: | |
1635 | ||
1636 | Cc: <stable@vger.kernel.org> # 5.4+ | |
1637 | ||
1638 | If that's the case the developer marked the fix safe for backporting to version | |
1639 | line 5.4 and later. Most of the time it's getting applied there within two | |
1640 | weeks, but sometimes it takes a bit longer. | |
1641 | ||
1642 | * If the commit doesn't tell you anything or if you can't find the fix, look | |
1643 | again for discussions about the issue. Search the net with your favorite | |
1644 | internet search engine as well as the archives for the `Linux kernel | |
1645 | developers mailing list <https://lore.kernel.org/lkml/>`_. Also read the | |
1646 | section `Locate kernel area that causes the issue` above and follow the | |
1647 | instructions to find the subsystem in question: its bug tracker or mailing | |
1648 | list archive might have the answer you are looking for. | |
1649 | ||
1650 | * If you see a proposed fix, search for it in the version control system as | |
1651 | outlined above, as the commit might tell you if a backport can be expected. | |
1652 | ||
1653 | * Check the discussions for any indicators the fix might be too risky to get | |
1654 | backported to the version line you care about. If that's the case you have | |
1655 | to live with the issue or switch to the kernel version line where the fix | |
1656 | got applied. | |
1657 | ||
1658 | * If the fix doesn't contain a stable tag and backporting was not discussed, | |
1659 | join the discussion: mention the version where you face the issue and that | |
1660 | you would like to see it fixed, if suitable. | |
1661 | ||
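The check for a 'stable tag' from the first item in the list above can be
sketched as follows. This is only an illustration under a few assumptions: the
commit message is available as a plain string (for example from ``git log -1
--format=%B <commit>``), the tag sits on a line of its own, and the optional
note after the ``#`` is free-form text. The function name
``find_stable_tags`` and the sample commit message are made up.

```python
import re
from typing import List

# Matches lines such as "Cc: <stable@vger.kernel.org> # 5.4+"; the angle
# brackets and the note after '#' are both treated as optional.
STABLE_TAG = re.compile(
    r"^Cc:[ \t]*<?stable@(?:vger\.)?kernel\.org>?[ \t]*(?:#[ \t]*(?P<note>.*))?$",
    re.IGNORECASE | re.MULTILINE,
)


def find_stable_tags(commit_message: str) -> List[str]:
    """Return the note of every stable tag found ('' if a tag has no note)."""
    return [
        (match.group("note") or "").strip()
        for match in STABLE_TAG.finditer(commit_message)
    ]


# Made-up commit message for illustration.
message = """subsys: fix handling of foo

Explain what went wrong and how this change fixes it.

Fixes: 123456789abc ("subsys: add foo")
Cc: <stable@vger.kernel.org> # 5.4+
Signed-off-by: Jane Developer <jane@example.org>
"""

print(find_stable_tags(message))
```

An empty list means the commit message carries no stable tag; in that case
continue with the remaining items in the list to find out if backporting was
discussed elsewhere.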
3e544d72 TL |
1662 | |
1663 | Ask for advice | |
1664 | ~~~~~~~~~~~~~~ | |
1665 | ||
1666 | *One of the former steps should lead to a solution. If that doesn't work | |
1667 | out, ask the maintainers for the subsystem that seems to be causing the | |
1668 | issue for advice; CC the mailing list for the particular subsystem as well | |
1669 | as the stable mailing list.* | |
1670 | ||
1671 | If the previous three steps didn't get you closer to a solution there is only | |
1672 | one option left: ask for advice. Do that in a mail you send to the maintainers | |
1673 | for the subsystem where the issue seems to have its roots; CC the mailing list | |
58c53945 | 1674 | for the subsystem as well as the stable mailing list (stable@vger.kernel.org). |
3e544d72 TL |
1675 | |
1676 | ||
1677 | Why some issues won't get any reaction or remain unfixed after being reported | |
1678 | ============================================================================= | |
1679 | ||
1680 | When reporting a problem to the Linux developers, be aware only 'issues of high | |
1681 | priority' (regressions, security issues, severe problems) are definitely going | |
1682 | to get resolved. The maintainers or if all else fails Linus Torvalds himself | |
1683 | will make sure of that. They and the other kernel developers will fix a lot of | |
1684 | other issues as well. But be aware that sometimes they can't or won't help; and | |
1685 | sometimes there isn't even anyone to send a report to. | |
1686 | ||
1687 | This is best explained with kernel developers that contribute to the Linux | |
1688 | kernel in their spare time. Quite a few of the drivers in the kernel were | |
1689 | written by such programmers, often because they simply wanted to make their | |
1690 | hardware usable on their favorite operating system. | |
1691 | ||
1692 | These programmers most of the time will happily fix problems other people | |
1693 | report. But nobody can force them to do so, as they are contributing voluntarily. | |
1694 | ||
1695 | Then there are situations where such developers really want to fix an issue, | |
1696 | but can't: sometimes they lack hardware programming documentation to do so. | |
1697 | This often happens when the publicly available docs are superficial or the | |
1698 | driver was written with the help of reverse engineering. | |
1699 | ||
1700 | Sooner or later spare time developers will also stop caring for the driver. | |
1701 | Maybe their test hardware broke, got replaced by something more fancy, or is so | |
1702 | old that it's something you don't find much outside of computer museums | |
1703 | anymore. Sometimes developers stop caring for their code and Linux altogether, as | |
1704 | something different in their life became way more important. In some cases | |
1705 | nobody is willing to take over the job as maintainer – and nobody can be forced | |
1706 | to, as contributing to the Linux kernel is done on a voluntary basis. Abandoned | |
1707 | drivers nevertheless remain in the kernel: they are still useful for people and | |
1708 | removing them would be a regression. | |
1709 | ||
1710 | The situation is not that different with developers that are paid for their | |
1711 | work on the Linux kernel. Those contribute most changes these days. But their | |
1712 | employers sooner or later also stop caring for their code or make its | |
1713 | programmer focus on other things. Hardware vendors for example earn their money | |
1714 | mainly by selling new hardware; quite a few of them hence are not investing | |
1715 | much time and energy in maintaining a Linux kernel driver for something they | |
1716 | stopped selling years ago. Enterprise Linux distributors often care for a | |
1717 | longer time period, but in new versions often leave support for old and rare | |
1718 | hardware aside to limit the scope. Often spare time contributors take over once | |
1719 | a company orphans some code, but as mentioned above: sooner or later they will | |
1720 | leave the code behind, too. | |
1721 | ||
1722 | Priorities are another reason why some issues are not fixed, as maintainers | |
1723 | quite often are forced to set those, as time to work on Linux is limited. | |
1724 | That's true for spare time or the time employers grant their developers to | |
1725 | spend on maintenance work on the upstream kernel. Sometimes maintainers also | |
1726 | get overwhelmed with reports, even if a driver is working nearly perfectly. To | |
1727 | not get completely stuck, the programmer thus might have no other choice than | |
1728 | to prioritize issue reports and reject some of them. | |
1729 | ||
1730 | But don't worry too much about all of this: a lot of drivers have active | |
1731 | maintainers who are quite interested in fixing as many issues as possible. | |
1732 | ||
1733 | ||
1734 | Closing words | |
1735 | ============= | |
1736 | ||
1737 | Compared with other Free/Libre & Open Source Software it's hard to report | |
1738 | issues to the Linux kernel developers: the length and complexity of this | |
1739 | document and the implications between the lines illustrate that. But that's how | |
1740 | it is for now. The main author of this text hopes documenting the state of the | |
1741 | art will lay some groundwork to improve the situation over time. | |
d2ce2853 TL |
1742 | |
1743 | ||
1744 | .. | |
247097e2 TL |
1745 | end-of-content |
1746 | .. | |
1747 | This document is maintained by Thorsten Leemhuis <linux@leemhuis.info>. If | |
1748 | you spot a typo or small mistake, feel free to let him know directly and | |
1749 | he'll fix it. You are free to do the same in a mostly informal way if you | |
1750 | want to contribute changes to the text, but for copyright reasons please CC | |
d2ce2853 TL |
1751 | linux-doc@vger.kernel.org and "sign-off" your contribution as |
1752 | Documentation/process/submitting-patches.rst outlines in the section "Sign | |
1753 | your work - the Developer's Certificate of Origin". | |
247097e2 TL |
1754 | .. |
1755 | This text is available under GPL-2.0+ or CC-BY-4.0, as stated at the top | |
1756 | of the file. If you want to distribute this text under CC-BY-4.0 only, | |
1757 | please use "The Linux kernel developers" for author attribution and link | |
1758 | this as source: | |
1759 | https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/Documentation/admin-guide/reporting-issues.rst | |
1760 | .. | |
1761 | Note: Only the content of this RST file as found in the Linux kernel sources | |
1762 | is available under CC-BY-4.0, as versions of this text that were processed | |
1763 | (for example by the kernel's build system) might contain content taken from | |
1764 | files which use a more restrictive license. |