Proper Linux Kernel Coding Style

Some simple rules, written and unwritten, to ensure that other kernel users understand your work.
Functions

Functions should do only one thing and do it well. They should be short and hopefully contain only one or two screens of text. If you have a function that does a lot of small things for different cases, it is acceptable to have a longer function. But if you have a long, complex function, you have a problem.

In addition, a large number of local variables within a function is a measure of how complex the function is. If the number of local variables exceeds ten, then there is something wrong.

Again, there are a lot of good examples of nice-sized functions in the fs/*.c and other kernel core code. Some bad examples of functions can be found in drivers/hotplug/ibmphp_res.c (where one function is 370 lines long) or drivers/usb/usb-uhci.c (where one function has 18 local variables).

Comments

Comments are good to have, but they have to be useful. Bad comments explain how the code works, who wrote a specific function on a specific date or other such useless things. Good comments explain what the file or function does and why it does it. They should also be at the beginning of the function and not necessarily embedded within the function (you are writing small functions, right?).

Now there is also a standard format for function comments. It is a variant of the documentation method used by the GNOME Project for its code. If you write your function comments in this style, the information in them can be extracted by a tool and made into standalone documentation. This can be seen by running make psdocs or make htmldocs on the kernel tree to generate a kernel-api.ps or kernel-api.html file containing all of the public interfaces to the different kernel subsystems.

This style is documented in the file Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc. The basic format is as follows:

/**
 * function_name(:)? (-
(* @parameterx: (
(* a blank line)?
 * (Description:)? (
 * (section header: (
(*)?*/

The short function description cannot run multiple lines, but the other descriptions can (and they can contain blank lines). All further descriptive text can contain the following markups:

  • funcname(): function

  • $ENVVAR: environment variable

  • &struct_name: name of a structure (up to two words, including struct)

  • @parameter: name of a parameter

  • %CONST: name of a constant

So a simple example of a function comment with a single argument would be:

/**
 * my_function - does my stuff
 * @my_arg: my argument
 *
 * Does my stuff explained.
 **/

Comments can and should be written for structures, unions and enums. The format for them is much like the function format:

/**
 * struct my_struct - short description
 * @a: first member
 * @b: second member
 *
 * Longer description
 */
struct my_struct {
    int a;
    int b;
};
Some good examples of well-commented functions can be found in the drivers/usb/usb.c file, where all global functions are documented. The file arch/i386/kernel/mtrr.c is a good example of a file with a reasonable amount of comments, but they are in the incorrect format, so they cannot be extracted by the documentation tools. A good example of how not to create the comment blocks for your functions is drivers/scsi/pci220i.c.

Data Structure Requirements

A chapter on data structures showed up in the 2.4.10-pre7 kernel. It goes into how every data structure that can exist outside of a single-threaded environment needs to implement reference counting in order to handle the memory management issues of the structure properly. If you add reference counting to your structure, you can avoid a lot of nasty locking issues and race conditions. Multiple threads can access the same structure without having to worry that a different thread will free the data from under it.

The last sentence in this chapter is required reading by any kernel developer: “Remember: if another thread can find your data structure, and you don't have a reference count on it, you almost certainly have a bug.”

A good example of why reference counting is necessary can be found in the USB data structure, “struct urb”. This structure is created by a USB device driver, filled with data, sent to a USB host controller where it will be processed and eventually sent out over the wire. When the host controller is finished with the urb, the original device driver will be notified. While a host controller driver is processing the urb, the original driver can try to cancel the urb or even free it. This led to numerous bugs in the core USB subsystem and different drivers. It also created long, detailed arguments on the linux-usb-devel mailing list about when in the life span of a urb it was allowed to be touched by either driver. In the 2.5 kernel series, struct urb had a reference count added to it, and the USB core and USB host controller drivers had a small amount of code added to handle the reference count properly. Now, whenever a driver wants to use the urb, a reference count is incremented. When it is finished with it, the reference count is decremented. If this was the last user, the memory is then freed and the urb disappears. This allows the USB device drivers to simplify their urb-handling logic vastly, fixes a lot of different race condition bugs and quiets all arguments about the topic.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

any body know send me

Anonymous's picture

i want a unix programming code for read 10 inputs from file and stored it in other file.

I need one information: What

latha's picture

I need one information:
What are the "coding standards" used for developing a linux device driver.
If there are standards where can i get all the standards details?
If those are available in any website .Please let me know.

ifdef badness

Anonymous's picture

I don't see ifdef badness as being a binary black/white thing. Just like gotos or almost any other programming construct there are ways to use ifdef that are good or bad.

One or two here and there, used for the purpose of clarity, is potentially a good thing.

If you find you're writing a lot of ifdefs, then it is likely that there is something else you should be doing instead: macros, function calls to a CPU specific layer, or something else.

Same deal with gotos. Use goto here and there to make clearer exit paths is a Good Thing, using them to create complex branching paths is a bad thing.

And btw: there is a good way to make a macro for NULL code - something that is used all over the kernel:

#if ENABLE_BLAAH
#define blaah(x) if(x) printf("blaah %d\n",x)
#else
#define blaah(x) do { } while(0)
#endif

That creates a nice macro that always gets the semicolon logic right.

Is an 'ifdef' in C code really that bad?

David Schwartz's picture

I personally think the 'ifdef' in the C code is vastly superior. For one thing, it makes it obvious to anyone looking at the code that the function is only called if CONFIG_USB_HIDDEV is defined. For another thing, the proposed 'improvement' requires someone to look at the header file to understand the C code, rather than having the C code be self-documenting.

The claimed benefit that the code is more readable is the opposite of the truth. The original code makes it clear that nothing is done if CONFIG_USB_HIDDEV is not defined. The 'improved' code obscures that.

As for it being easier to maintain, I presume this is because the selection of the appropriate hid function is now part of the hid code rather than scattered in all the code that might call the hid code. I do agree that there is some value to this, but weighed against the fact that it obscured under what cases the function does something and under what cases it does not, I think it's just not worth it.

Especially if you try to generalize this. For example, what if 'HID_CLAIMED_HIDDEV' is not defined unless CONFIG_USB_HIDDEV is defined. Would you suggest a 'hid_is_claimed' macro that always returns 'true' or 'false' if 'CONFIG_USB_HIDDEV' is not declared? But then which should it return if it's not even supposed to exist as a test?

And what if instead of 'hid->claimed' it was something like 'is_claimed(hid)'? There might be costs associated with the empty function, and when you put that empty function in the header file, you do not necessarily know about every case that might call it.

This can lead to the ultimate ugliness -- code that has no side-effects around it will call the null function. But code that has side-effects or doesn't want the cost associated with deciding whether or not to call the null function might still need an 'ifdef'. What an unsightly mess that would be.

lots of thinking

Andrew Thompson's picture

I see you've done a lot of thinking on this problem, and come up with some logical ideas, but I suspect you've never worked on code with lot's of #ifdefs, especially that needs to be portable to a number of different systems. If there are one or two, it is fine, the problem is when you get a bunch of them they are really hard to read. They don't follow normal indentation rules, so it is hard to see where they end, or what '#else' goes with which '#if'. Scary stuff.

#ifdef section question

Anonymous's picture

Just curious...in the scenario described above for moving the #ifdef out of the c and into the h files....it comments about making a null function.

My question is...Does this when executed still make a function call during execution? Wouldn't this cause a few extra clock ticks that could be done elsewhere? Or does the compiler optimize this out of the equation?

Is the hope to avoid excessive #defines as well?

I suppose defining something else like a

#ifdef CONFIG_USB_HIDDEV

#define HIDDEV_FUNCT extern void hiddev_hid_event (struct hid_device *, unsigned int usage,int value);

#else

#define HIDDEV_FUNCT ;

#endif

and then use HIDDEV_FUNCT in the if previously would be avoided to avoid hiding the actual function.. Is this correct?

Re: #ifdef section question

gregkh's picture

Yes, the compiler optimizes any "null" inline function calls out of the object code, and will not cause any extra clock ticks.

Also, that #define will not work :)

Hope this helps

Re: #ifdef section question

Olle Bergkvist's picture

What about this one?

#ifdef CONFIG_USB_HIDDEV
   extern void hiddev_hid_event (struct hid_device *,
                                 unsigned int usage,
                                 int value);
#else
#define hiddev_hid_event(hid_device,usage,value) ;
#endif
/* the last semicolon is there to not break the logic if the call is the only statement in a block:
   if(foo)
      hiddev_hid_event (thisdev, somethg, 0);
   do_somethg_else();
would become:
   if(foo)
      do_somethg_else();
*/

Besides, what would you do if you need a return value from such a function?
The method described in the guidelines only works for 'void func(args)' (if i understand correctly).

What would such an empty inline function return?

And what if you have to choose between several different function calls upon a DEFINE.

   if (hid->claimed & HID_CLAIMED_HIDDEV)
#ifdef CONFIG_USB_HIDDEV
         hiddev_hid_event(hid, usage->hid, value);
#else
         hiddev_hid_other_event_stuff(hid, usage->hid, value);
#endif

This would look like:

   if (hid->claimed & HID_CLAIMED_HIDDEV) {
         hiddev_hid_event(hid, usage->hid, value);
         hiddev_hid_other_event_stuff(hid, usage->hid, value);
   }

which is just weird since the code is NEVER going to behave that way.

And what if you have, say 3 lines of code, ifdefed out. Not only function calls (those can be ifdefed in a header file) but also assignments? Should those 3 lines be moved to a new function, that would be ifdefed away in its header?

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState