Linus & the Lunatics, Part III
Part I of this series included a transcript of the talk given by Linus, based around a slide presentation. In Part II, he participated in a Q&A session after his presentation. Here in Part III, Linus and friends hold a Q&A with the Victoria Linux Users Group in Victoria, BC.
We were met at the dock by our VLUG hosts and shown the town before we gathered in a lecture hall at the University of Victoria. There were 30 Lunatics and 50 VLUGers. A bunch of the visitors were assembled on a line of chairs on stage. Carl Constantine, VLUG President & Captain, ran the meeting. Barbara Irwin presented Linus with a lifetime VLUG membership card and invited him back to Victoria to take advantage of LUG membership discounts around town. Barbara's own write-up on the event is here.
My recording unfortunately began well after the beginning of the event, and it only got a clear recording of Linus and Ted, which should be plenty.
Audience member: Can you introduce yourself?
Ted Ts'o: I'm Ted Ts'o. I've worn a lot of different hats over the years. TSX-11 was the first FTP server on this side of the Atlantic. It was an Ultrix machine, a MIPS DECstation run on my desk. That was a long long time ago when I was at MIT. These days I'm doing stuff with the standards group. I maintain the e2fsprogs, filesystem utilities for ext2, ext3, and I fool around with the filesystem. I have fun.
These days I work for the IBM Linux Technology Center.
Audience member: I have a filesystem question. If there were one feature from a filesystem other than ext3 that you'd want to implement, what would it be?
Ted Ts'o: There is some interesting work that a person named Alex Tomas is doing. He has some proof-of-concept code that he just sort of floated around over the last couple of weeks, which is extent maps for ext3. They look very, very interesting. They give a very, very nice performance increase. The basic idea is, currently, if you are using the standard UNIX indirect block system, you have to have a table for where to find each physical block in the file. And the problem is that the filesystem is going to try very, very hard to make blocks be contiguous on this. And so therefore, if you look at the coding of the indirect block, it's extremely inefficient. You'll see something like 1000, 1001, 1002, 1003, 1004, and there is a much better way you can represent that information. An extent map is a way of doing that. You just simply have "logical block 100 starts at physical block 1000 and that goes on for 200 blocks" or something like that. So he has some proof-of-concept code that does that. What makes extent maps hard is pathological cases. Very often in filesystems, 90% of the work is for the pathological application that decides it wants to do a sparse file with random allocations all over the place, so you have this completely fragmented extent map, and you actually have to deal with that sanely -- what we call "The denial of service attack by a stupid application".
So that's actually what makes it hard. The basic idea of an extent map is really trivial. It's dealing with stuff like this that makes life a little difficult, which means you have to use a B-tree on disk for the extent map, yada yada yada. Life gets interesting. So that's one thing.
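[Editor's sketch: the idea Ted describes can be illustrated in a few lines of Python. This is not the ext3 proof-of-concept code. The names Extent, build_extents and lookup are hypothetical, and a sorted in-memory list with binary search stands in for the on-disk B-tree he mentions. It collapses a per-block table of the indirect-block kind into (logical start, physical start, length) runs, and shows that a fully fragmented file gets no benefit.]

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Extent(NamedTuple):
    """Hypothetical record: one contiguous run of blocks."""
    logical: int   # first logical block covered by the run
    physical: int  # physical block where the run starts
    length: int    # number of contiguous blocks in the run


def build_extents(block_map: List[Optional[int]]) -> List[Extent]:
    """Collapse a per-block table into extents.

    block_map[i] is the physical block of logical block i,
    or None for a hole in a sparse file.
    """
    extents: List[Extent] = []
    for logical, physical in enumerate(block_map):
        if physical is None:
            continue  # holes take no space in the extent map
        last = extents[-1] if extents else None
        if (last is not None
                and last.logical + last.length == logical
                and last.physical + last.length == physical):
            # Contiguous both logically and physically: grow the run.
            extents[-1] = last._replace(length=last.length + 1)
        else:
            extents.append(Extent(logical, physical, 1))
    return extents


def lookup(extents: List[Extent], logical: int) -> Optional[int]:
    """Return the physical block for a logical block, or None for a hole."""
    starts = [e.logical for e in extents]
    i = bisect_right(starts, logical) - 1  # last extent starting at or before
    if i < 0:
        return None
    e = extents[i]
    if logical < e.logical + e.length:
        return e.physical + (logical - e.logical)
    return None


# Ted's example: logical block 100 starts at physical block 1000
# and goes on for 200 blocks. Two hundred table entries become one.
well_behaved = build_extents([None] * 100 + list(range(1000, 1200)))

# The pathological case: nothing is contiguous, so every block
# needs its own extent and the map is as large as the old table.
fragmented = build_extents([1000, 5000, 3000, 9000])
```

[In this sketch the well-behaved file needs a single extent while the fragmented one needs one extent per block, which is why a real implementation needs a structure that still searches quickly when the map grows large.]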
There is some stuff I want to look at about making the encoding a bit more efficient and when we get the kernel side piece we also need to make user space tools understand it. That's definitely 2.7 stuff. Linus here would kill me if I suggested I put it in 2.6 test 5.
One of the other things, which is clearly sort of research-based, is better heuristics for deciding when a new directory should be placed in the same block group as the parent directory and when should we spread it out. Basically there are things that we are trying to do to avoid fragmentation. But it means that some of the very simple benchmarks, such as untarring your kernel tree and then deleting it, end up going slow because we're spreading things out too far in an attempt to avoid fragmentation. And clearly some of the heuristics that we set up many many years ago need to be looked at again and revisited. And about the only really realistic way of doing that is to gather real-life work loads. Because very often benchmarks are fairly deceptive because they don't really measure what's really happening. And then try some different heuristics and see which ones actually work out best. One of the problems is that we don't have good filesystem benchmarks. Typically what filesystem people do is find some sort of synthetic benchmark, try it out, and if it makes their particular filesystem or pet extension work really well they trumpet it out loud and otherwise they try to find another benchmark. And that's not very scientific. It's the equivalent of shooting an arrow at a white fence and drawing a bulls-eye around it.
There's an awful lot of that going on. It's human nature. But getting real benchmarks is hard, because you want something that accurately reflects real life, but yet is still repeatable. And that's not trivial.
Audience member: Since we mentioned the kernel, Linus, how close are we to 2.6?
Linus: It does seem to be in pretty good shape. There seem to be a lot of old drivers that lately have been marked as broken, on the assumption that they are so old that people probably aren't using them, and that's why they're still broken. That's getting rid of a lot of the compile time errors. Which is nice. And so far nobody has complained, which is not surprising, because a lot of these are old CD-ROM drivers for drives that literally came with sound cards and are unlikely to work ten years after they were made anyway, so there probably aren't a lot of them around.
So that's one of the things that I've been looking at. Trying to figure out which drivers are so broken, and marking them as broken so nobody sees them... and just wait for the screams to come.
To some degree a lot of the screaming will come only after Red Hat and SuSE and everybody else has made a distribution. And the people who didn't even care to test before that. Some will say, "Hey, my old CD-ROM doesn't work anymore!" And that's inevitable. There's nothing we can do about that. And that's okay. Making a new kernel version is always painful, and it's not going to be perfect.
But we are getting fairly close. I'm saying... two months, maybe. But that's just...
Audience member: Christmas!
Linus: (Shrugs and smiles)
Audience member: I have a genuine SCO license...
[Much laughter and joking around.]
Linus: I'm not sure I want to be tainted.
Audience member: Thanks to Larry Ewing, the Linux community has a beautiful mascot. Now that we have the head penguin in our midst, unfortunately we don't know what he sounds like, and I was wondering if tonight you could give voice to our mute mascot. What would he say?
Linus: Nobody has ever asked me that before.
You're obviously a unique mind.
I don't want to leave you with a bad impression of Tux. So let's leave him mute. (Pause.) After a few beers I might try.
One of the good things about Tux is that you can make of him whatever you want, right? So a lot of users groups have their own version of Tux. I know the Ottawa--I think they call themselves the Canada Users Group--I don't know, whatever. They had him playing ice hockey. The point is, a lot of people make different things about Tux. But nobody has ever made Tux sing or anything like that. But you could be the first. It's okay.
If you make a 3-D rendition of Tux playing cover tunes of old disco albums, it's okay.
Thomas McVeigh, VLUG "treasurer and geek jeweler", coming up to Linus on stage: Speaking as a person who isn't a programmer, I just wanted to give you something of what I do...
Audience member: Thomas is our jewelry expert...
Thomas: Hand-made sterling silver and gold plated lapel pin and tie pin for you.
Linus: People do a lot of things with Tux, but usually they're not very classy. Most of them tend to be the drunken frat boy kind of thing.
Audience member: When 2.6 comes up, are you going to keep developments going with 2.2?
Linus: Actually, we already have a manager for 2.2.
Audience member: Does that fracture the program [garbled]?
Linus: Maintaining old kernels tends to be one of these work-of-love things. Not a lot of people use 2.2 anymore. Actually there is a maintainer for 2.0 as well. The kind of people who use 2.2 don't tend to care about maintenance. They use 2.2 because it works and it's stable for them, and they don't want to touch the damn thing, right? So the less maintenance the better. The old kernels tend to get only really critical security and bug fixes, and that's it. Not even, like, "This device driver is broken, please fix it." It's like, "Why are you using 2.2 on a new machine?" The problem with 2.2 is literally that Alan Cox decided to go and get an MBA degree. Nobody knows why, but... whatever. And he used to be the maintainer of the old kernels. So, there are new people coming back and doing that. It doesn't fracture at all, because literally the target audiences are so different. I mean, people interested in 2.6--it's just not the same person running 2.2. So there is no issue of conflict of interest anywhere.
Audience member: What needs to be done long term to deal with corporate strategies [garbled].
Linus: I'm optimistic, frankly. I hate the amount of, just, negativity that we see whenever this subject comes up. I just want more people hopefully to be slightly more positive and say, "Hey, we shouldn't be that worried about corporate closed-mindedness." I think it's corporate closed-mindedness that should be more worried about open source. And that, yeah, there are potential real dangers, especially of the legal system being manipulated by big corporations, that kind of thing. But on the whole, let's not get too bogged down in being afraid and just do the best we can. I mean, maybe not trust the system, I'm not saying that, just.... If you waste too much energy just worrying, you don't get anything done. Maybe I'm not answering your question, but at least I'm trying to funnel it into something else.
Audience member: As a follow-on to that, I say we should all trust the culture en masse. I know that 10 years ago we were running Linux on a 386, and at that time I felt hanging out on [garbled] or USENET, there were no rules, and it was a new world, right? I appreciate the writings of Doc Searls about the cultural changes that have happened over time. But certainly the corporate [garbled] carries that fear, and certainly we're seeing some garbage in the mainstream media about ownership and that kind of thing. But I think we can all work together and keep the dream alive collectively.
Linus: It's true that there is a lot of garbage about ownership and corporate spin from people who have a very real interest in the matter. On the other hand, there is a lot of talk in the media about 12-year-old girls getting sued, right? So, it does go both ways. It's not like the RIAA and the big corporations get everything they want. And it's never been that way. Open media is our most powerful weapon, as a matter of fact. So I wouldn't worry too much about the media.
Audience member: With 2.6 almost here, what's in store for 2.7 and 2.8?
Linus: We had a kernel summit in Ottawa two months ago, trying to kind of flesh out what 2.7 will be. The fact is, I don't want to get people thinking about 2.7 too much before 2.6 not only is out but has had, like, half a year to get stable. I won't open a 2.7 tree immediately. I never did that before. Every time we do a stable kernel, we like to keep it stable for a while. There's not a lot of stuff floating around on 2.7. There's the detail kinds of stuff that Ted was talking about: making a filesystem use a new kind of layout, that's a detail when it comes to the general kernel layout. There's some bigger stuff, like clustered filesystems. Real clustered filesystems will have more of an impact on the rest of the kernel. They're not just filesystems specifically.
Most of the development these days tends actually to be hardware related and gets prompted by new devices that need new drivers. Or new hardware that just needs different ways of handling it efficiently. So obviously the biggest changes that have happened between 2.4 and 2.6 have largely been as a result of a lot of people using more high-end hardware. And the filesystem locking just got a lot cleaner, for example. In the process it also scales better. And that's not something you really plan for. Or, I think, it's a mistake to plan for. You do a lot of small details, day by day, and the end result is a big difference--between 2.6 and 2.8 and maybe 3.0. But it's really a series of fairly small evolutionary steps that you don't really plan ahead very much.
Charles Roth: I just wanted to follow up to an earlier question to Linus about commercialization and big corporations and so forth, with a question for you all. How many of you have actually written your legislator about anything?
[Many hands go up.]
Good for you! Everybody else, take five minutes, put your thoughts down or pick somebody here to write something for you. Those letters really do count. They'll listen.
Let me add one thing to that, for all the people who worry about what might happen. If you go back and look at where Linux and open-source software was, seven, eight years ago. Back then when I gave Linux presentations, I did so by booting in Windows and running PowerPoint. Everyone razzed me. And I said, "Yeah, but that's because there isn't any good open-source presentation management system", and quite frankly I wasn't sure at the time whether or not we'd ever see one. Because writing an office suite requires dedication to usability and fine-tuning. And the people who worked on open-source software wrote tools for programmers as opposed to tools for everyday use. And I wasn't sure we'd ever see an open-source office suite. Obviously I was proven wrong.
If you take a look and see how far we've gotten in what is a relatively short period of time, there is cause for great hope.
Doc Searls is Senior Editor of Linux Journal, covering the business beat. His monthly column in the magazine is Linux For Suits, and his bi-weekly newsletter is SuitWatch.