Contributing to the Linux Kernel
Everyone who knows about Linux also knows about the ways Linux is “different” from other, more commercial operating systems. Because the Linux kernel is open source, it is possible for each and every user to become a contributor. Certainly, nearly everyone reading this knows so; it's sort of like preaching to the choir. However, the fact is that most Linux users, even those skilled in the programming arts, have never contributed to the code of the Linux kernel, even though most of us have had days where we thought to ourselves, “gee, I wish Linux could do this...” Through this article and others, I hope to convince some of you to take a look at the Linux kernel in a new, more proactive light.
What are some valid reasons for not contributing to the kernel efforts? First, maybe you can't legally. Many programmers sign contracts that limit their ability to code outside of work, even on non-commercial projects. This is the main reason I chose a profession that has relatively little to do with programming, other than the occasional Perl script. Second, it is possible you don't know how. Many Linux users are relatively new programmers trained in traditional computer science. I know from my own CS education that many schools tend to teach “modern” programming skills—I was one of the few in my particular school who chose (or knew how) to program without an IDE (integrated development environment). Sad, but true. Third, many professional programmers now tend to work with revision control systems in the workplace, and may be hesitant to contribute to projects (such as the kernel development effort) which still use the “bare metal” approach. Last and most likely, many programmers with the skills to hack Linux don't have the time to do so. These are all valid reasons why perfectly qualified programmers with good ideas, a fresh outlook and a desire to contribute have chosen not to. Nothing I can say can help them get past some of those issues, but I hope I can make kernel programming more accessible to at least a small percentage of people.
This is the first article in a series, and I will attempt to dispel some of the mystery behind revision control. Many open-source projects, including the Linux kernel, still use the diff and patch method of content control for a variety of reasons. Most open-source projects still accept patches in this format, even if they distribute code via CVS or some other revision-control system. First, diff and patch provide a project maintainer with an immense amount of control. Patches can be submitted and distributed via e-mail and in plain text; maintainers can read and judge the patch before it ever gets near a tree. Second, there's never a worry about access control or the CVS server going down. Third, it's readily available, generally doesn't require any special tools that aren't distributed as part of every GNU system, and has been used for years. However, bare-bones revision control makes it difficult to track changes, maintain multiple branches or do any other “advanced” things provided by Perforce, CVS or other revision control systems.
diff and patch are a set of command-line programs designed to generate and integrate changes into a source tree. There are multiple “diff” formats supported by the GNU utilities. One major advantage of diff and patch over newer revision-control systems is that diff, especially the unified diff format, allows kernel maintainers to look at changes easily without blindly integrating them.
For the uninitiated, diff and patch are just two of the commands in a complete set of GNU utilities. While they are the most commonly used in practice, other tools are often employed in specific situations. For the purposes of this document, I won't concentrate on these utilities, but will treat them only briefly. For a more complete look, check out your local set of man and info pages.
diff is the first command in the set. It has a simple purpose: to create a file (often confusingly called a patch or a diff) which contains the differences between two text files or two groups of text files. These files are constructed to make it easy to merge the differences into the older file. diff can write in multiple formats, although the unified difference format is generally preferred. The patches this command generates are much easier to distribute than whole files, and they allow maintainers to see quickly and easily what changed and to make a judgment.
patch is diff's logical complement, although oddly, it didn't come along until well after diff was in relatively common use. patch takes a patch file generated by diff and applies it against a file or group of files. patch will notify the user if there is a conflict, although it is often smart enough to resolve simple conflicts. Additionally, patch can act in the reverse; with an updated file and the original patch, this command can revert a file back to its pristine form.
cmp is diff's counterpart for binary files. As applications for binary files in source control are limited, this command is often not used in that environment. Usually, projects that include binary files (for example, a logo) have some other mechanism for updating these components. (Keep in mind that the XPM image format common in Linux applications can actually be text-based and can be controlled using the above commands.)
diff3 is a variant on diff that allows for computing and merging the differences among three files. Personally, I tend to use diff for these purposes, but there are likely reasons why this command is useful in specific situations with which I have not yet dealt.
And finally, sdiff is an interactive version of patch that allows for smarter patching using your very own brain.
These tools have many uses other than content control. I do not want to slight them by implying they are not useful. But, like many tools, they shine only in certain circumstances. (Like that annoying fine screwdriver you get with the set for which you've never seen a screw small enough to use it on, tools are only as good as the situations they are applied against.)
|Designing Electronics with Linux||May 22, 2013|
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
- New Products
- Linux Systems Administrator
- Senior Perl Developer
- Technical Support Rep
- UX Designer
- Designing Electronics with Linux
- Dynamic DNS—an Object Lesson in Problem Solving
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- Nice article, thanks for the
7 hours 14 min ago
- I once had a better way I
13 hours 50 sec ago
- Not only you I too assumed
13 hours 18 min ago
- another very interesting
15 hours 11 min ago
- Reply to comment | Linux Journal
17 hours 4 min ago
- Reply to comment | Linux Journal
23 hours 58 min ago
- Reply to comment | Linux Journal
1 day 14 min ago
- Favorite (and easily brute-forced) pw's
1 day 2 hours ago
- Have you tried Boxen? It's a
1 day 7 hours ago
- seo services in india
1 day 12 hours ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?