Commit | Line | Data |
---|---|---|
d95ea1a4 JC |
1 | .. SPDX-License-Identifier: GPL-2.0 |
2 | ||
3 | ==================== | |
4 | Rebasing and merging | |
5 | ==================== | |
6 | ||
7 | Maintaining a subsystem, as a general rule, requires a familiarity with the | |
8 | Git source-code management system. Git is a powerful tool with a lot of | |
9 | features; as is often the case with such tools, there are right and wrong | |
10 | ways to use those features. This document looks in particular at the use | |
11 | of rebasing and merging. Maintainers often get in trouble when they use | |
12 | those tools incorrectly, but avoiding problems is not actually all that | |
13 | hard. | |
14 | ||
15 | One thing to be aware of in general is that, unlike many other projects, | |
16 | the kernel community is not scared by seeing merge commits in its | |
17 | development history. Indeed, given the scale of the project, avoiding | |
18 | merges would be nearly impossible. Some problems encountered by | |
19 | maintainers result from a desire to avoid merges, while others come from | |
20 | merging a little too often. | |
21 | ||
22 | Rebasing | |
23 | ======== | |
24 | ||
25 | "Rebasing" is the process of changing the history of a series of commits | |
26 | within a repository. There are two different types of operations that are | |
27 | referred to as rebasing since both are done with the ``git rebase`` | |
28 | command, but there are significant differences between them: | |
29 | ||
30 | - Changing the parent (starting) commit upon which a series of patches is | |
31 | built. For example, a rebase operation could take a patch set built on | |
32 | the previous kernel release and base it, instead, on the current | |
33 | release. We'll call this operation "reparenting" in the discussion | |
34 | below. | |
35 | ||
36 | - Changing the history of a set of patches by fixing (or deleting) broken | |
37 | commits, adding patches, adding tags to commit changelogs, or changing | |
38 | the order in which commits are applied. In the following text, this | |
39 | type of operation will be referred to as "history modification". | |
40 | ||
41 | The term "rebasing" will be used to refer to both of the above operations. | |
42 | Used properly, rebasing can yield a cleaner and clearer development | |
43 | history; used improperly, it can obscure that history and introduce bugs. | |
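As a rough illustration of reparenting, the sketch below builds a throwaway repository and moves a private topic branch from one release tag to another with ``git rebase --onto``. The repository, the ``mainline`` and ``topic`` branch names, the ``v1`` and ``v2`` tags, and the file names are all assumptions made up for the example; they do not come from any real tree.

```shell
#!/bin/sh
# Sketch only: every branch, tag, and file name here is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example Maintainer"
git config user.email "maintainer@example.org"

# Two stand-in release points on the mainline.
echo one > base.txt
git add base.txt
git commit -q -m "release one"
git tag v1
echo two >> base.txt
git commit -q -a -m "release two"
git tag v2

# A private topic branch built on the older release.
git checkout -q -b topic v1
echo feature > feature.txt
git add feature.txt
git commit -q -m "add feature"

# Reparent: replay the commits in v1..topic on top of v2.
git rebase -q --onto v2 v1 topic

new_base=$(git merge-base topic v2)
v2_commit=$(git rev-parse v2)
echo "topic is now based on: $(git describe --tags "$new_base")"
```

The replayed commit gets a new ID, which is why this is only safe on history that nobody else has built on, and why the result should be retested as if it were new code.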
44 | ||
45 | There are a few rules of thumb that can help developers to avoid the worst | |
46 | perils of rebasing: | |
47 | ||
48 | - History that has been exposed to the world beyond your private system | |
49 | should usually not be changed. Others may have pulled a copy of your | |
50 | tree and built on it; modifying your tree will create pain for them. If | |
51 | work is in need of rebasing, that is usually a sign that it is not yet | |
52 | ready to be committed to a public repository. | |
53 | ||
54 | That said, there are always exceptions. Some trees (linux-next being | |
55 | a significant example) are frequently rebased by their nature, and | |
56 | developers know not to base work on them. Developers will sometimes | |
57 | expose an unstable branch for others to test with or for automated | |
58 | testing services. If you do expose a branch that may be unstable in | |
59 | this way, be sure that prospective users know not to base work on it. | |
60 | ||
61 | - Do not rebase a branch that contains history created by others. If you | |
62 | have pulled changes from another developer's repository, you are now a | |
63 | custodian of their history. You should not change it. With few | |
64 | exceptions, for example, a broken commit in a tree like this should be | |
65 | explicitly reverted rather than removed via history modification. | |
66 | ||
67 | - Do not reparent a tree without a good reason to do so. Just being on a | |
68 | newer base or avoiding a merge with an upstream repository is not | |
69 | generally a good reason. | |
70 | ||
71 | - If you must reparent a repository, do not pick some random kernel commit | |
72 | as the new base. The kernel is often in a relatively unstable state | |
73 | between release points; basing development on one of those points | |
74 | increases the chances of running into surprising bugs. When a patch | |
75 | series must move to a new base, pick a stable point (such as one of | |
76 | the -rc releases) to move to. | |
77 | ||
78 | - Realize that reparenting a patch series (or making significant history | |
79 | modifications) changes the environment in which it was developed and, | |
80 | likely, invalidates much of the testing that was done. A reparented | |
81 | patch series should, as a general rule, be treated like new code and | |
82 | retested from the beginning. | |
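The advice above to revert a broken commit, rather than make it vanish through history modification, might look like the following in practice. This is a minimal sketch in a throwaway repository; the commits stand in for history that has already been published or pulled from somebody else, and all names are hypothetical.

```shell
#!/bin/sh
# Sketch only: the commits below stand in for already-shared history.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example Maintainer"
git config user.email "maintainer@example.org"

echo good > file.txt
git add file.txt
git commit -q -m "good change"
echo broken >> file.txt
git commit -q -a -m "broken change"
broken=$(git rev-parse HEAD)
echo later > other.txt
git add other.txt
git commit -q -m "later work"

# Add a new commit that undoes the broken one. Nothing that others may
# have pulled is rewritten; the broken commit stays in the history.
git revert --no-edit "$broken" >/dev/null

git log --oneline
```

The history grows by one commit and existing commit IDs are untouched, so anybody who has based work on the tree is unaffected.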
83 | ||
84 | A frequent cause of merge-window trouble is when Linus is presented with a | |
85 | patch series that has clearly been reparented, often to a random commit, | |
86 | shortly before the pull request was sent. The chances of such a series | |
87 | having been adequately tested are relatively low - as are the chances of | |
88 | the pull request being acted upon. | |
89 | ||
90 | If, instead, rebasing is limited to private trees, commits are based on a | |
91 | well-known starting point, and they are well tested, the potential for | |
92 | trouble is low. | |
93 | ||
94 | Merging | |
95 | ======= | |
96 | ||
97 | Merging is a common operation in the kernel development process; the 5.1 | |
98 | development cycle included 1,126 merge commits - nearly 9% of the total. | |
99 | Kernel work is accumulated in over 100 different subsystem trees, each of | |
100 | which may contain multiple topic branches; each branch is usually developed | |
101 | independently of the others. So naturally, at least one merge will be | |
102 | required before any given branch finds its way into an upstream repository. | |
103 | ||
104 | Many projects require that branches in pull requests be based on the | |
105 | current trunk so that no merge commits appear in the history. The kernel | |
106 | is not such a project; any rebasing of branches to avoid merges will, most | |
107 | likely, lead to trouble. | |
108 | ||
109 | Subsystem maintainers find themselves having to do two types of merges: | |
110 | from lower-level subsystem trees and from others, either sibling trees or | |
111 | the mainline. The best practices to follow differ in those two situations. | |
112 | ||
113 | Merging from lower-level trees | |
114 | ------------------------------ | |
115 | ||
116 | Larger subsystems tend to have multiple levels of maintainers, with the | |
117 | lower-level maintainers sending pull requests to the higher levels. Acting | |
118 | on such a pull request will almost certainly generate a merge commit; that | |
119 | is as it should be. In fact, subsystem maintainers may want to use | |
120 | the ``--no-ff`` flag to force the addition of a merge commit in the rare cases | |
121 | where one would not normally be created so that the reasons for the merge | |
122 | can be recorded. The changelog for the merge should, for any kind of | |
123 | merge, say *why* the merge is being done. For a lower-level tree, "why" is | |
124 | usually a summary of the changes that will come with that pull. | |
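A small sketch of the ``--no-ff`` case described above: the lower-level branch is a direct descendant of the subsystem branch, so a plain merge would fast-forward and leave no place to record the reason for the pull. The ``subsystem`` and ``lower-level`` branch names, the file names, and the changelog text are invented for illustration.

```shell
#!/bin/sh
# Sketch only: branch names, files, and the changelog are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/subsystem
git config user.name "Example Maintainer"
git config user.email "maintainer@example.org"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# Work done by a lower-level maintainer, directly on top of our branch.
git checkout -q -b lower-level
echo driver > driver.txt
git add driver.txt
git commit -q -m "add example driver"

# A plain merge would fast-forward here; --no-ff forces a merge commit
# whose changelog says why the merge is being done.
git checkout -q subsystem
git merge -q --no-ff \
    -m "Merge branch 'lower-level': add the example driver" lower-level

# Count the parents of the new tip; a merge commit has two.
parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents of the tip commit: $parents"
```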
125 | ||
126 | Maintainers at all levels should be using signed tags on their pull | |
127 | requests, and upstream maintainers should verify the tags when pulling | |
128 | branches. Failure to do so threatens the security of the development | |
129 | process as a whole. | |
130 | ||
131 | As per the rules outlined above, once you have merged somebody else's | |
132 | history into your tree, you cannot rebase that branch, even if you | |
133 | otherwise would be able to. | |
134 | ||
135 | Merging from sibling or upstream trees | |
136 | -------------------------------------- | |
137 | ||
138 | While merges from downstream are common and unremarkable, merges from other | |
139 | trees tend to be a red flag when it comes time to push a branch upstream. | |
140 | Such merges need to be carefully thought about and well justified, or | |
141 | there's a good chance that a subsequent pull request will be rejected. | |
142 | ||
143 | It is natural to want to merge the master branch into a repository; this | |
144 | type of merge is often called a "back merge". Back merges can help to make | |
145 | sure that there are no conflicts with parallel development and generally | |
146 | give a warm, fuzzy feeling of being up to date. But this temptation | |
147 | should be avoided almost all of the time. | |
148 | ||
149 | Why is that? Back merges will muddy the development history of your own | |
150 | branch. They will significantly increase your chances of encountering bugs | |
151 | from elsewhere in the community and make it hard to ensure that the work | |
152 | you are managing is stable and ready for upstream. Frequent merges can | |
153 | also obscure problems with the development process in your tree; they can | |
154 | hide interactions with other trees that should not be happening (often) in | |
155 | a well-managed branch. | |
156 | ||
157 | That said, back merges are occasionally required; when that happens, be | |
158 | sure to document *why* it was required in the commit message. As always, | |
159 | merge to a well-known stable point, rather than to some random commit. | |
160 | Even then, you should not back merge a tree above your immediate upstream | |
161 | tree; if a higher-level back merge is really required, the upstream tree | |
162 | should do it first. | |
163 | ||
164 | One of the most frequent causes of merge-related trouble is when a | |
165 | maintainer merges with the upstream in order to resolve merge conflicts | |
166 | before sending a pull request. Again, this temptation is easy enough to | |
167 | understand, but it should absolutely be avoided. This is especially true | |
168 | for the final pull request: Linus is adamant that he would much rather see | |
169 | merge conflicts than unnecessary back merges. Seeing the conflicts lets | |
170 | him know where potential problem areas are. He does a lot of merges (382 | |
171 | in the 5.1 development cycle) and has gotten quite good at conflict | |
172 | resolution - often better than the developers involved. | |
173 | ||
174 | So what should a maintainer do when there is a conflict between their | |
175 | subsystem branch and the mainline? The most important step is to warn | |
176 | Linus in the pull request that the conflict will happen; if nothing else, | |
177 | that demonstrates an awareness of how your branch fits into the whole. For | |
178 | especially difficult conflicts, create and push a *separate* branch to show | |
179 | how you would resolve things. Mention that branch in your pull request, | |
180 | but the pull request itself should be for the unmerged branch. | |
181 | ||
182 | Even in the absence of known conflicts, doing a test merge before sending a | |
183 | pull request is a good idea. It may alert you to problems that you somehow | |
184 | didn't see from linux-next and helps you understand exactly what you are | |
185 | asking upstream to do. | |
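One possible way to do such a test merge without touching the branch you intend to send is to perform it on a detached HEAD and throw the result away afterward. The sketch below assumes a throwaway repository with a ``mainline`` branch standing in for upstream and a ``for-next`` branch standing in for the work to be pulled; both names are hypothetical.

```shell
#!/bin/sh
# Sketch only: "mainline" and "for-next" are hypothetical branch names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example Maintainer"
git config user.email "maintainer@example.org"

echo base > base.txt
git add base.txt
git commit -q -m "base"

# The branch that will be offered upstream.
git checkout -q -b for-next
echo feature > feature.txt
git add feature.txt
git commit -q -m "add feature"

# Meanwhile, upstream has moved on.
git checkout -q mainline
echo upstream > upstream.txt
git add upstream.txt
git commit -q -m "upstream work"

before=$(git rev-parse for-next)

# Do the test merge on a detached HEAD so that no branch is moved.
git checkout -q --detach mainline
git merge -q --no-ff -m "test merge, to be discarded" for-next
test_merge=$(git rev-parse HEAD)
# ... build and test the merged result here ...

# Go back; the test merge is left unreferenced and for-next is unchanged.
git checkout -q for-next
after=$(git rev-parse for-next)
```

The pull request is then still made for the unmerged branch, in line with the advice above.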
186 | ||
187 | Another reason for doing merges of upstream or another subsystem tree is to | |
188 | resolve dependencies. These dependency issues do happen at times, and | |
189 | sometimes a cross-merge with another tree is the best way to resolve them; | |
190 | as always, in such situations, the merge commit should explain why the | |
191 | merge has been done. Take a moment to do it right; people will read those | |
192 | changelogs. | |
193 | ||
194 | Often, though, dependency issues indicate that a change of approach is | |
195 | needed. Merging another subsystem tree to resolve a dependency risks | |
196 | bringing in other bugs and should almost never be done. If that subsystem | |
197 | tree fails to be pulled upstream, whatever problems it had will block the | |
198 | merging of your tree as well. Preferable alternatives include agreeing | |
199 | with the maintainer to carry both sets of changes in one of the trees or | |
200 | creating a topic branch dedicated to the prerequisite commits that can be | |
201 | merged into both trees. If the dependency is related to major | |
202 | infrastructural changes, the right solution might be to hold the dependent | |
203 | commits for one development cycle so that those changes have time to | |
204 | stabilize in the mainline. | |
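The topic-branch alternative mentioned above can be sketched as follows: the prerequisite commits live on their own branch, started from a stable point, and each tree merges only that branch rather than the whole of the other tree. The ``prereq``, ``tree-a``, and ``tree-b`` branch names, the ``v1`` tag, and the files are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: all branch, tag, and file names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Example Maintainer"
git config user.email "maintainer@example.org"

echo base > base.txt
git add base.txt
git commit -q -m "stable release point"
git tag v1

# The shared prerequisite, on a dedicated branch based on the release.
git checkout -q -b prereq v1
echo helper > helper.txt
git add helper.txt
git commit -q -m "add shared helper"

# The first tree merges the topic branch and says why.
git checkout -q -b tree-a v1
echo a > a.txt
git add a.txt
git commit -q -m "tree-a work"
git merge -q --no-ff \
    -m "Merge branch 'prereq': tree-a needs the shared helper" prereq

# The second tree merges the same branch, without pulling in tree-a.
git checkout -q -b tree-b v1
echo b > b.txt
git add b.txt
git commit -q -m "tree-b work"
git merge -q --no-ff \
    -m "Merge branch 'prereq': tree-b needs the shared helper" prereq
```

Because both trees carry the same prerequisite commits, they merge cleanly upstream in either order, and neither tree is blocked by unrelated problems in the other.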
205 | ||
206 | Finally | |
207 | ======= | |
208 | ||
209 | It is relatively common to merge with the mainline toward the beginning of | |
210 | the development cycle in order to pick up changes and fixes done elsewhere | |
211 | in the tree. As always, such a merge should pick a well-known release | |
212 | point rather than some random spot. If your upstream-bound branch has | |
213 | emptied entirely into the mainline during the merge window, you can pull it | |
214 | forward with a command like:: | |
215 | ||
216 | git merge v5.2-rc1^0 | |
217 | ||
218 | The "^0" will cause Git to do a fast-forward merge (which should be | |
219 | possible in this situation), thus avoiding the addition of a spurious merge | |
220 | commit. | |
221 | ||
222 | The guidelines laid out above are just that: guidelines. There will always | |
223 | be situations that call for a different solution, and these guidelines | |
224 | should not prevent developers from doing the right thing when the need | |
225 | arises. But one should always think about whether the need has truly | |
226 | arisen and be prepared to explain why something abnormal needs to be done. |