Proper Linux Kernel Coding Style

Some simple rules, written and unwritten, to ensure that other kernel users understand your work.

As more people and companies start to write Linux kernel code, understanding acceptable kernel programming coding style and conventions is very important. This article starts out by explaining the written kernel coding style, as documented in the kernel's Documentation/CodingStyle file, and then moves on to introduce some of the “unwritten” kernel coding style rules.

Why Rules?

Why are there kernel programming style rules in the first place? Why not just let authors code in whatever style they want, and let everyone live with it? After all, code formatting does not affect memory use, execution speed or anything else a normal user of the kernel would see. The reason for rules can be summed up with this passage from Elliot Soloway and Kate Ehrlich's article (see Resources):

Our studies support the claim that knowledge of programming plans and rules of programming discourse can have a significant impact on program comprehension. In their book called Elements of Style, Kernighan and Plauger also identify what we would call discourse rules. Our empirical results put teeth into these rules: it is not merely a matter of aesthetics that programs should be written in a particular style. Rather there is a psychological basis for writing programs in a conventional manner: programmers have strong expectations that other programmers will follow these discourse rules. If the rules are violated, then the utility afforded by the expectations that programmers have built up over time is effectively nullified.

There have been a number of other studies about how a large body of code written in a common style directly affects how easy it is to understand the code, review it and revise it quickly.

Because the number of developers that look at the Linux kernel code is very large, it is in the best interest for the project to have a consistent style guideline. This allows the code to be easier to understand by both someone reading it for the first time and for someone revisiting his or her old code a number of months later. It also allows someone else to read easily, understand and potentially fix and enhance your code, which is one of the greatest strengths of open-source code.

What Are the Rules?

So now that we have an understanding of why there should be rules, what are they? Linus Torvalds and other kernel programmers have written a short document that details some of the kernel programming rules. This document is located in the Documentation/CodingStyle file in the kernel source tree. It is required reading for anyone who wants to contribute to the Linux kernel. The following is a summary of these rules.

Indentation

All tabs are eight characters and will be the <TAB> character. This makes locating where different blocks of code start and end easier. If you find that your code is indented too deeply, with more than three levels of indentation that cause the code to shift to the right of the screen too far, then you should fix this.

Placing Braces

The original authors of UNIX placed their braces with the opening brace last on the line, and the closing brace first on the line:

if (x is true) {
        we do y
}

Therefore, the Linux kernel uses this style. The exceptions to this rule are functions, which have the opening brace at the beginning of the line:

int function(int x)
{
        body of function
}
Again, this is how Kernighan and Ritchie wrote their code.

For good examples of proper indentation and brace style, look at any of the fs/*.c files or anything in the kernel/*.c files. Generally, most of the kernel is in the proper indentation and brace style, but there are some notable exceptions. The code in fs/devfs/*.c and drivers/scsi/qla1280.* are good examples of how not to do indentation and braces.

A script can be used to run the indent program if you have to convert a large amount of code to the correct format. This file is located at scripts/Lindent in the kernel source tree.

Naming

Your variables and functions should be descriptive and concise. Don't use long flowery names like, CommandAllocationGroupSize or DAC960_V1_EnableMemoryMailboxInterface() but rather, cmd_group_size or enable_mem_mailbox(). Mixed-case names are frowned upon, and encoding the type of the variable or function in the name (like “Hungarian notation”) is forbidden.

Global variables should be used only if they are absolutely necessary. Local variables should be short and to the point. Valid local loop variable names include “i” and “j”, while “loop_counter” would be too verbose; “tmp” is allowed for any short-lived temporary variable.

Again, good examples of proper names can be found in fs/*.c. A lot of driver code has bad variable names, as it has been ported from other operating systems. Examples of how not to name functions and variables include drivers/block/DAC960.* and drivers/scsi/cpqfs*.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

any body know send me

Anonymous's picture

i want a unix programming code for read 10 inputs from file and stored it in other file.

I need one information: What

latha's picture

I need one information:
What are the "coding standards" used for developing a linux device driver.
If there are standards where can i get all the standards details?
If those are available in any website .Please let me know.

ifdef badness

Anonymous's picture

I don't see ifdef badness as being a binary black/white thing. Just like gotos or almost any other programming construct there are ways to use ifdef that are good or bad.

One or two here and there, used for the purpose of clarity, is potentially a good thing.

If you find you're writing a lot of ifdefs, then it is likely that there is something else you should be doing instead: macros, function calls to a CPU specific layer, or something else.

Same deal with gotos. Use goto here and there to make clearer exit paths is a Good Thing, using them to create complex branching paths is a bad thing.

And btw: there is a good way to make a macro for NULL code - something that is used all over the kernel:

#if ENABLE_BLAAH
#define blaah(x) if(x) printf("blaah %d\n",x)
#else
#define blaah(x) do { } while(0)
#endif

That creates a nice macro that always gets the semicolon logic right.

Is an 'ifdef' in C code really that bad?

David Schwartz's picture

I personally think the 'ifdef' in the C code is vastly superior. For one thing, it makes it obvious to anyone looking at the code that the function is only called if CONFIG_USB_HIDDEV is defined. For another thing, the proposed 'improvement' requires someone to look at the header file to understand the C code, rather than having the C code be self-documenting.

The claimed benefit that the code is more readable is the opposite of the truth. The original code makes it clear that nothing is done if CONFIG_USB_HIDDEV is not defined. The 'improved' code obscures that.

As for it being easier to maintain, I presume this is because the selection of the appropriate hid function is now part of the hid code rather than scattered in all the code that might call the hid code. I do agree that there is some value to this, but weighed against the fact that it obscured under what cases the function does something and under what cases it does not, I think it's just not worth it.

Especially if you try to generalize this. For example, what if 'HID_CLAIMED_HIDDEV' is not defined unless CONFIG_USB_HIDDEV is defined. Would you suggest a 'hid_is_claimed' macro that always returns 'true' or 'false' if 'CONFIG_USB_HIDDEV' is not declared? But then which should it return if it's not even supposed to exist as a test?

And what if instead of 'hid->claimed' it was something like 'is_claimed(hid)'? There might be costs associated with the empty function, and when you put that empty function in the header file, you do not necessarily know about every case that might call it.

This can lead to the ultimate ugliness -- code that has no side-effects around it will call the null function. But code that has side-effects or doesn't want the cost associated with deciding whether or not to call the null function might still need an 'ifdef'. What an unsightly mess that would be.

lots of thinking

Andrew Thompson's picture

I see you've done a lot of thinking on this problem, and come up with some logical ideas, but I suspect you've never worked on code with lot's of #ifdefs, especially that needs to be portable to a number of different systems. If there are one or two, it is fine, the problem is when you get a bunch of them they are really hard to read. They don't follow normal indentation rules, so it is hard to see where they end, or what '#else' goes with which '#if'. Scary stuff.

#ifdef section question

Anonymous's picture

Just curious...in the scenario described above for moving the #ifdef out of the c and into the h files....it comments about making a null function.

My question is...Does this when executed still make a function call during execution? Wouldn't this cause a few extra clock ticks that could be done elsewhere? Or does the compiler optimize this out of the equation?

Is the hope to avoid excessive #defines as well?

I suppose defining something else like a

#ifdef CONFIG_USB_HIDDEV

#define HIDDEV_FUNCT extern void hiddev_hid_event (struct hid_device *, unsigned int usage,int value);

#else

#define HIDDEV_FUNCT ;

#endif

and then use HIDDEV_FUNCT in the if previously would be avoided to avoid hiding the actual function.. Is this correct?

Re: #ifdef section question

gregkh's picture

Yes, the compiler optimizes any "null" inline function calls out of the object code, and will not cause any extra clock ticks.

Also, that #define will not work :)

Hope this helps

Re: #ifdef section question

Olle Bergkvist's picture

What about this one?

#ifdef CONFIG_USB_HIDDEV
   extern void hiddev_hid_event (struct hid_device *,
                                 unsigned int usage,
                                 int value);
#else
#define hiddev_hid_event(hid_device,usage,value) ;
#endif
/* the last semicolon is there to not break the logic if the call is the only statement in a block:
   if(foo)
      hiddev_hid_event (thisdev, somethg, 0);
   do_somethg_else();
would become:
   if(foo)
      do_somethg_else();
*/

Besides, what would you do if you need a return value from such a function?
The method described in the guidelines only works for 'void func(args)' (if i understand correctly).

What would such an empty inline function return?

And what if you have to choose between several different function calls upon a DEFINE.

   if (hid->claimed & HID_CLAIMED_HIDDEV)
#ifdef CONFIG_USB_HIDDEV
         hiddev_hid_event(hid, usage->hid, value);
#else
         hiddev_hid_other_event_stuff(hid, usage->hid, value);
#endif

This would look like:

   if (hid->claimed & HID_CLAIMED_HIDDEV) {
         hiddev_hid_event(hid, usage->hid, value);
         hiddev_hid_other_event_stuff(hid, usage->hid, value);
   }

which is just weird since the code is NEVER going to behave that way.

And what if you have, say 3 lines of code, ifdefed out. Not only function calls (those can be ifdefed in a header file) but also assignments? Should those 3 lines be moved to a new function, that would be ifdefed away in its header?

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix