On Linus' Return to Kernel Development

on December 6, 2018

On October 23, 2018, Linus Torvalds came out of his self-imposed isolation, pulling a lot of patches from the git trees of various developers. It was his first appearance on the Linux Kernel Mailing List since September 16, 2018, when he announced he would take a break from kernel development to address his sometimes harsh behavior toward developers. On the 23rd, he announced his return, which I cover here after summarizing some of his pull activities.

For most of his pulls, he just replied with an email that said, "pulled". But in one of them, he noticed that Ingo Molnar had some issues with his email, in particular that Ingo's mail client used the iso-8859-1 character set instead of the more usual UTF-8. Linus said, "using iso-8859-1 instead of utf-8 in this day and age is just all kinds of odd. It looks like it was all fine, but if Mutt has an option to just send as utf-8, I encourage everybody to just use that and try to just have utf-8 everywhere. We've had too many silly issues when people mix locales etc and some point in the chain gets it wrong."

On the 24th, Linus continued pulling from developer trees. One of these was a batch of networking updates from David Miller, and it included contributions from a lot of different people. Linus noticed that the Kconfig rules were running into unmet dependency warnings because the code expected to run on the Qualcomm architecture, which Linus didn't use. He suggested it was a simple matter of updating the dependency list in the code. He also asked why the developers didn't notice that problem when testing their patches. Kalle Valo explained, "Mostly bad timing due to my vacation. I did do allmodconfig build but not sure why I missed the warning, also the kbuild bot didn't report anything. Jeff did report it last week, but I was on vacation at the time and just came back yesterday and didn't have time to react to it yet."

That seemed fine to Linus, who said he'd pull the fix when it became available. He remarked, "I just don't want my tree to have warnings that I see, and that may hide new warnings coming in when I do my next pull request."

On the 25th, Linus continued pulling from developer trees. In one instance, the issue of minimal tool versions came up. Linus prefers to support as many regular users as possible, which means supporting tool versions from the Linux distributions.

In response to a hard-to-read patch, Andi Kleen suggested changing the minimum supported binutils version from 2.20 to 2.21, which would support some useful assembler opcodes that would make the patch easier to review. Andy Lutomirski, another of the patch reviewers, said this would be fine. And Linus said:

I always vote for "require modern tools" as long as it doesn't cause problems.

binutils-2.21 is something like seven years old by now, but the real issue would be what versions distros are actually shipping. I don't want people to have to build their own binutils just to build a kernel.

It's usually some ancient enterprise distro that is stuck on old versions. Anybody have any idea?

Andy replied, "CentOS 6 is binutils 2.23. CentOS 5 is EOL. RHEL 5 has 'extended life', which means that it's officially zombified and paying customers can still download (unsupported) packages. SLES 11 is binutils 2.19, which is already unsupported. SLES 12 is 2.24. So I would guess we're okay and we can bump the requirement to 2.21." And the conversation ended there.

Getting back to October 23rd, Linus announced his return to the mailing list and to kernel development:

So I've obviously started pulling stuff for the merge window, and one of the things I noticed with Greg doing it for the last few weeks was that he has this habit (or automation) to send Ack emails when he pulls.

In fact, I reacted to them not being there when he sent himself his fake pull messages. Because he didn't then send himself an ack for having pulled it ;(

And I actually went into this saying "I'll try to do the same".

But after having actually started doing the pulls, I notice how it doesn't work well with my traditional workflow, and so I haven't been doing it after all.

In particular, the issue is that after each pull, I do a build test before the pull is really "final", and while that build test is ongoing (which takes anything from a few minutes to over an hour when I'm on the road and using my laptop), I go on and look at the *next* pull (or one of the other pending ones).

So by the time the build test has finished, the original pull request is already long gone - archived and done - and I have moved on.

End result: answering the pull request is somewhat inconvenient to my flow, which is why I haven't done it.

In contrast, this email is written "after the fact", just scripting "who did I pull for and then push out" by just looking at the git tree. Which sucks, because it means that I don't actually answer the original email at all, and thus lose any cc's for other people or mailing lists. That would literally be done better by simple automation.

So I've got a few options:

- just don't do it

- acking the pull request before it's validated and finalized.

- starting the reply when doing the pull, leaving the email open in a separate window, going on to the next pull request, and then when build tests are done and I'll start the next one, finish off the old pending email.

and obviously that first option is the easiest one. I'm not sure what Greg did, and during the later rc's it probably doesn't matter, because there likely simply aren't any overlapping operations.

Because yes, the second option likely works fine in most cases, but my pull might not actually be final *if* something goes bad (where bad might be just "oops, my tests showed a semantic conflict, I'll need to fix up my merge" to "I'm going to have to look more closely at that warning" to "uhhuh, I'm going to just undo the pull entirely because it ended up being broken").

The third option would work reliably, and not have the "oh, my pull is only tentatively done" issue. It just adds an annoying back-and-forth switch to my workflow.

So I'm mainly pinging people I've already pulled to see how much people actually _care_. Yes, the ack is nice, but do people care enough that I should try to make that workflow change? Traditionally, you can see that I've pulled from just seeing the end result when it actually hits the public tree (which is yet another step removed from the steps above - I do build tests between every pull, but I generally tend to push out the end result in batches, usually a couple of times a day).

Comments?

Linus Walleij appreciated the description of Linus T's workflow, and said he didn't need the acknowledgement emails. But he asked, "Can't you just tool something that mails automatically after-the-fact? Greg's 'notices' that patch so-or-so was applied are clearly auto-generated by a script after he applied and tested a whole bunch of them, the same should be possible for pull requests methinks? Just something you run after a workday sealing the deal."

Linus T replied:

A certain amount of simple/stupid automation would be possible. That's how the participants list in this email was generated, but the script I used was actually a pretty much garbage one-liner that just happens to work for most cases.

It just did my usual "mergelog" (which is a bit like "git shortlog", it's a script to just get the summary of my merges instead of the general git logs), and then it used the result of that lookup to look up the email address by just matching committers.

But it's broken to the point of almost being useless for a couple of reasons:

- my mergelog names don't necessarily match any name in the git history.

For example, Greg goes by "Greg KH" when I merge from him, because I'm lazy and feel like I don't want to mis-type his name, which I've done too many times. But in the actual git history, he goes by the full "Greg Kroah-Hartman", so my stupid script would have messed him up.

At the other end of the spectrum, people with complex characters have their names copied-and-pasted from their email or the signature from their tag, and sometimes those then don't match either.

- some people use one email for "official" purposes (ie company email etc) in the git history, but actually tend to *use* another email (because sometimes the company email is slow and/or broken).

- it wouldn't get the usual mailing list cc's etc, and those might be the most important ones. It is how I saw Greg's replies, after all.

So I feel that the automation model is just not good. The reply should go to the actual pull request, not to the git history. People who want just _that_ could already automate the git history thing without me even doing anything at all, either scripting it themselves or by using some filtering on the kernel commit mailing list.

So I happened to use the automation model for this email thread, but I think it's actually the worst of all worlds.

Willy Tarreau also replied to Linus T's original email, saying he felt that an acknowledgement email was not necessary, and that most developers just wanted to make sure the pull request was actually received by Linus T. But Willy suggested sending out an email if, after actually pulling from a tree, Linus T changed his mind and reverted the change. In that situation, an email to that developer would be useful.

Linus T replied to Willy, saying this was actually his normal workflow—to send an email only in the case where he tried to pull, but then decided to revert the change later. He said, "And that email wouldn't go away, so if I first send a 'Pulled' ack message, and then something bad happens and I unpull it, I would send a second email anyway saying 'oh, oops, not pulled after all'."

Ingo Molnar also replied to Linus T's original email, saying he used another option aside from the ones Linus T proposed. He explained:

I use "zero inbox" mail reading, last-to-first, and with that I can "delay" a reply to a pull request or patch simply by marking the mail unread. Then when I push out tested trees and patches, I go and process the tail of the mbox, a couple of entries typically. (For patches I don't even have to do anything because the notification is automatic, and I mark the patch read when I see the tip-bot notification myself.)

It's still a separate workflow step but easier to manage than postponed emails or separate email windows, which are inevitably going to get lost in browser mishaps every couple of weeks, and which are not high-profile enough in the primary workflow either.

Might not be a practical method with the amount of mail you are getting though.

Ingo also mentioned that he didn't personally feel like he preferred to receive an acknowledgement email. He agreed that it would be more convenient to get the emails, but he said, "it's not a big factor; I'd say the efficiency of your workflow (which is a single thread) should be the primary concern here."

Boris Brezillon also replied to Linus T's initial post, saying he did prefer to get the acknowledgement emails. He added, "I don't care if this notification is sent in-reply to the original email or in a separate email." He also said he assumed such a thing would be automated, and he concluded, "it's just a nice thing to have, and I can do without it if it's too complicated to automate."

Mark Brown also replied to Linus T's initial post, saying he didn't need the acknowledgement emails. In fact, he said, "I was a bit alarmed the first time Greg sent me an ack - your usual workflow is that if there's any mail it means that there's a problem."

Takashi Iwai also replied to Linus T's initial post, saying he did appreciate the acknowledgement emails—not because they indicated the patch had been accepted, but just that the pull request had been received at all.

Greg Kroah-Hartman also replied to Linus T's initial post. While Linus was away, Greg had been the one accepting pull requests from developers, and he had always given an acknowledgement email, which is why Linus T was considering doing the same. Greg said to Linus T:

I had this same issue, as I had full builds run and had to wait for the results. But I had a much smaller number of pull requests, so I just dumped them all into one folder and then did the responses when the tests came back.

So I had the same issue as you, but you have much more requests to deal with, sorry.

Kirill A. Shutemov also replied to Linus T's initial post, joining the chorus recommending an automated solution. He suggested including the email Message-ID field in the text of the merge's commit message itself. Then a tool could easily extract the Message-ID, extract the CC list from the email archive, and send an acknowledgement email to everyone on that list. And Mark Brown suggested simply including the CC list in the commit message as well, so the tools wouldn't even need to query the mailing list archive.

But Linus T said he didn't want to go too far toward automating the emails. He replied to Kirill and Mark, saying:

I think I'll just try the "ack when starting the pull" model and see how that works. Maybe I was overthinking it.

And if it turns out that it would be better to ack after everything has passed, I could easily just do an email filter for "messages that are to me, but I have archived and not replied to, and that have 'git pull' in them".

I use email filters for pinpointing the pulls to begin with, I could just use email filters to pinpoint the pull requests that I have already handled.

So it seemed as though Linus had decided to go one way, but then in another email, he said, "I'm starting to think that mailing list automation really would be a good idea", and he went on, "I think it might be good to have some generic model for 'give me a trigger when XYZ hits git tree ABC' that people could just do in general, *but* I think the 'scan mailing lists for regular pull requests' would actually be nicer."

He continued:

It would be much nicer if the "notification" really did the right thing, and created an actual email follow-up, with the correct To/Cc and subject lines, but also the proper "References" line so that it actually gets threaded properly too. That implies that it really should be integrated into the mailing list itself. But I don't know how flexible the whole lkml archive bot is for things like this. But I assume you have _some_ hook into new messages coming in for lore.kernel.org?

The discussion ended there.

So, nothing was said about the code of conduct, and nothing about how he used his time away from kernel development. He just focused on catching up on merges and discussing possible changes to his workflow. The more interesting cases will come when a real conflict does emerge, as it inevitably must. There are all sorts of security and other implementation topics that typically cause conflict, not to mention cases where developers disagree on the behavior of existing code and, thus, on the right way to fix an issue.

Note: if you're mentioned above and want to post a response above the comment section, send a message with your response text to ljeditor@linuxjournal.com.

Zack Brown is a tech journalist at Linux Journal and Linux Magazine, and is a former author of the "Kernel Traffic" weekly newsletter and the "Learn Plover" stenographic typing tutorials. He first installed Slackware Linux in 1993 on his 386 with 8 megs of RAM and had his mind permanently blown by the Open Source community. He is the inventor of the Crumble pure strategy board game, which you can make yourself with a few pieces of cardboard. He also enjoys writing fiction, attempting animation, reforming Labanotation, designing and sewing his own clothes, learning French and spending time with friends'n'family.

Load Disqus comments

Linus Torvalds

kernel

On Linus' Return to Kernel Development

Recent Articles

Related Articles