2.6.13 char driver not working on 2.6.22

Hello kernel gurus.

I have a char driver module that has been working fine under 2.6.13. It drives a PCI card for a Heidenhain rotational encoder. I am now moving to 2.6.22 and it no longer works. In order to get it to build I changed exactly one line:

I changed this:

rc = pci_module_init(&ik220_driver);

to this:

rc = pci_register_driver(&ik220_driver);

Under 2.6.22 it loads fine but as soon as my app opens /dev/ik220 the kernel crashes without a trace. I added lots of printk statements to see what is going on. The printk statements for pci_driver .probe and .remove show up as expected. But as soon as my application does an open the kernel crashes immediately -- I do not even see the file_operations .open() entry getting called.

Any clues would be appreciated.

Thanks in advance,

Elwood, Tucson AZ

The Code

Mitch Frazier's picture

Since this is not a standard driver a reference to where it can be downloaded would be useful.

Mitch Frazier is an Associate Editor for Linux Journal.

2.6.13 char driver not working on 2.6.22

ecdowney's picture

Hi Mitch, thanks for taking an interest in my little conundrum. I put a copy of the driver at http://www.clearskyinstitute.com/ik220 . The device in question is a PCI card from Heidenhain that interfaces to up to 8 of their absolute encoders. I got the driver originally from them for 2.4 kernels. They have nothing more recent so I hacked on it and got it to work on 2.6.13. Now at 2.6.22 it's not working again.

The probe function works because after insmod the module ik220 is listed by lsmod and reported correctly by lspci -v as the driver for the given device. It also creates /dev/ik220 correctly. But as soon as I try to open("/dev/ik220", O_RDWR) the kernel immediately crashes. I put a printk in the ioctl file_operations but it never prints, so it looks like the kernel is panicking before it reaches the driver. I don't need an open file_ops but I added one anyway just to test with a printk and found it never prints either; the kernel crashes before it is called.

Char Driver

Mitch Frazier's picture

I looked at your code before I finished reading your post... my first suggestion was going to be to add an open() function, but now I see you already tried that. It shouldn't be necessary anyway; the char device code does not require an open() function.

What precise version of the kernel are you using, 2.6.22.what? There were about 15 different revisions of 2.6.22.

Another thought, and I have no real reason to think this might fix it, but it's consistent with other kernel drivers, is to add static const to the file_operations declaration:

  static const struct file_operations ik220_fops = {
  ...
  };

Another thing you might want to do, again for consistency with newer code, is to change the syntax of the structure initializers:

  static struct pci_driver ik220_driver = {
          .name          = DRV_NAME,
          .id_table      = ik220_tbl,
          .probe         = ik220_init_one,
          .remove        = ik220_remove_one,
  };

  static const struct file_operations ik220_fops = {
          .ioctl         = ik220_driver_ioctl,
  };

The "name: value" syntax was long ago deprecated by gcc.

Since the code dies before it gets to your open() function (when you have one), the next move is probably to add some debugging code to the kernel proper. A good starting point is the function chrdev_open() in the file fs/char_dev.c. That's where character devices are "opened"; it's the function that calls your open() function.
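To illustrate why a missing open() should be harmless, here is a rough userspace model of the check chrdev_open() makes before calling into a driver. It is only a sketch: model_fops, model_chrdev_open, demo_open, demo_ioctl and ioctl_only are names made up for this example, and the real struct file_operations callbacks take kernel types rather than void pointers.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct file_operations. */
struct model_fops {
    int (*open)(void *inode, void *filp);
    int (*ioctl)(void *inode, void *filp, unsigned int cmd, unsigned long arg);
};

/* Counts how often the demo open() callback was reached. */
int open_calls = 0;

/* Demo callbacks; they do no real work. */
int demo_open(void *inode, void *filp)
{
    (void)inode;
    (void)filp;
    open_calls++;
    return 0;
}

int demo_ioctl(void *inode, void *filp, unsigned int cmd, unsigned long arg)
{
    (void)inode;
    (void)filp;
    (void)cmd;
    (void)arg;
    return 0;
}

/* Models the optional-callback check: with no open() set, the open succeeds. */
int model_chrdev_open(const struct model_fops *fops, void *inode, void *filp)
{
    if (fops->open)
        return fops->open(inode, filp);
    return 0;
}

/* C99 designated initializer: members not named here (open) are NULL. */
const struct model_fops ioctl_only = {
    .ioctl = demo_ioctl,
};
```

Calling model_chrdev_open(&ioctl_only, NULL, NULL) returns 0 without touching any callback, which is why an ioctl-only driver is normally fine and why a crash at this point suggests the trouble lies elsewhere.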


2.6.13 char driver not working on 2.6.22

ecdowney's picture

Thanks for the thoughts. Alas, the extra statics and .name changes did not help. This is 2.6.22-rtai. We're not using any RTAI stuff, it's just there for possible future work.

I hesitate to open the door to hacking the kernel itself so I'm still fiddling with the driver.

Since it loads but fails on first use, I'm thinking something in ik220_init_board is setting up for later trouble. I gutted the code and added it back slowly. I found that the open would not crash until I added back the call to ioremap_nocache(). I wonder what that could mean??

Char Driver

Mitch Frazier's picture

Like I said, I didn't really expect those changes to fix it. I might have hoped, but I didn't believe. :)

I don't blame you for not wanting to hack the kernel itself.

Is the call to pci_request_regions() working? The driver does not check its return value, which should be zero on success.
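For what it's worth, the usual shape of a checked probe sequence looks something like the sketch below. This is a userspace model only, so it can be compiled and run outside the kernel: fake_dev, fake_enable_device, fake_disable_device, fake_request_regions and fake_probe are hypothetical stand-ins, not the real PCI functions, and the request_rc field exists purely to simulate a failure.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PCI device; request_rc simulates the result
 * that the region request would return (0 on success, negative on error). */
struct fake_dev {
    int enabled;
    int regions_held;
    int request_rc;
};

/* Stand-in for enabling the device; always succeeds in this model. */
int fake_enable_device(struct fake_dev *d)
{
    d->enabled = 1;
    return 0;
}

/* Stand-in for disabling the device again. */
void fake_disable_device(struct fake_dev *d)
{
    d->enabled = 0;
}

/* Stand-in for claiming the regions; fails if request_rc is nonzero. */
int fake_request_regions(struct fake_dev *d)
{
    if (d->request_rc)
        return d->request_rc;
    d->regions_held = 1;
    return 0;
}

/* Check every return value and undo earlier steps when a later one fails. */
int fake_probe(struct fake_dev *d)
{
    int rc;

    rc = fake_enable_device(d);
    if (rc)
        return rc;

    rc = fake_request_regions(d);
    if (rc)
        goto err_disable;

    return 0;

err_disable:
    fake_disable_device(d);
    return rc;
}
```

The point is only that a nonzero return stops the probe and unwinds what was already done, instead of carrying on to map regions the driver does not own.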

You might try changing ioremap_nocache to ioremap. Nocache should be the right one to call, since you don't want device registers to be cached, but very few existing drivers seem to use it. Although, again, I don't really expect that to fix it.

Also does it make sense that pci resource #1 is skipped?

   ik220_card[slot].conf_iomem_start = pci_resource_start(pdev, 0);
   ...
   ik220_card[slot].iomem_1_start = pci_resource_start(pdev, 2);
   ...
   ik220_card[slot].iomem_2_start = pci_resource_start(pdev, 3);

   // 0, 2, 3  no 1??


2.6.13 char driver not working on 2.6.22

ecdowney's picture

pci_request_regions() is confirmed to be returning 0.

using ioremap: no change (still later crashes).

changing pci_resource_start to 0, 1, 2 gives this in syslog:

Jul 10 18:40:12 montsec-ocs kernel: [37134.167618] IK220: 2nd IO-Resource is no IOMemory! Wrong Card?

<Groan>

Stock Kernel

Mitch Frazier's picture

Another thing that might be useful to test if possible is to see how the driver acts using a stock kernel, rather than an RTAI patched kernel.


IO Memory and Virtual Memory Values

Mitch Frazier's picture

What are the values printed out by the statements:

printk(KERN_INFO "%s: Config-Region start: 0x%lX end: 0x%lX flags: 0x%lX\n", ...);
printk(KERN_INFO "%s: 1st IO-Region start: 0x%lX end: 0x%lX flags: 0x%lX\n", ...);
printk(KERN_INFO "%s: 2nd IO-Region start: 0x%lX end: 0x%lX flags: 0x%lX\n", ...);
printk(KERN_INFO "%s: Config-Region remaped to virtual address 0x%lX\n", ...);
printk(KERN_INFO "%s: 1st IO-Region remaped to virtual address 0x%lX\n", driver_name, ...);
printk(KERN_INFO "%s: 2nd IO-Region remaped to virtual address 0x%lX\n", driver_name, ...);

What's the output from "lspci -vvv" for the card?


2.6.13 char driver not working on 2.6.22

ecdowney's picture

[oh I see, the > characters looked like html tags]

01:04.0 Bridge: PLX Technology, Inc. PCI - IOBus Bridge (rev 02)
	Subsystem: PLX Technology, Inc. IK220 (Heidenhain)
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 5
	Region 0: Memory at e80a2000 (32-bit, non-prefetchable) [size=128]
	Region 1: I/O ports at d000 [size=128]
	Region 2: Memory at e80a4000 (32-bit, non-prefetchable) [size=32]
	Region 3: Memory at e80a0000 (32-bit, non-prefetchable) [size=32]
	Kernel driver in use: ik220

2.6.13 char driver not working on 2.6.22

ecdowney's picture
Jul 11 06:50:22 montsec-ocs kernel: [80922.691198] IK220: Config-Region start: 0xE80A2000 end: 0xE80A207F flags: 0x200
Jul 11 06:50:22 montsec-ocs kernel: [80922.691239] IK220: 1st IO-Region start: 0xE80A4000 end: 0xE80A401F flags: 0x200
Jul 11 06:50:22 montsec-ocs kernel: [80922.691281] IK220: 2nd IO-Region start: 0xE80A0000 end: 0xE80A001F flags: 0x200
Jul 11 06:50:22 montsec-ocs kernel: [80922.691346] IK220: Config-Region remaped to virtual address 0xF8950000
Jul 11 06:50:22 montsec-ocs kernel: [80922.691375] IK220: 1st IO-Region remaped to virtual address 0xF8952000
Jul 11 06:50:22 montsec-ocs kernel: [80922.691403] IK220: 2nd IO-Region remaped to virtual address 0xF896C000

[lspci in next comment -- forum chopped it off]
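As a cross-check of those numbers, here is a small userspace sketch that decodes the start, end and flags values. The two flag constants are the ones defined in the kernel's include/linux/ioport.h; region_len and region_kind are helper names made up for this example.

```c
#include <stddef.h>

/* Resource type bits, as defined in the kernel's include/linux/ioport.h. */
#define IORESOURCE_IO  0x00000100UL
#define IORESOURCE_MEM 0x00000200UL

/* Length of a region given its inclusive start and end addresses. */
unsigned long region_len(unsigned long start, unsigned long end)
{
    return end - start + 1;
}

/* Classify a region by its resource flags. */
const char *region_kind(unsigned long flags)
{
    if (flags & IORESOURCE_MEM)
        return "memory";
    if (flags & IORESOURCE_IO)
        return "I/O ports";
    return "unknown";
}
```

With the logged values, region_len(0xE80A2000, 0xE80A207F) gives 128 and the two IO regions give 32 each, matching the sizes lspci reports for regions 0, 2 and 3, and flags 0x200 decodes as memory. Region 1 is listed by lspci as I/O ports, so it would not carry the memory flag, which is presumably why the driver uses resources 0, 2 and 3 and complains when resource 1 is substituted.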

Char Driver

Mitch Frazier's picture

Nothing strange in those values that I can see.

However, I just noticed that the second and third regions are mapped with ioremap and not with ioremap_nocache. Have you tried changing those to _nocache? It really doesn't seem like those should be cacheable.


2.6.13 char driver not working on 2.6.22

ecdowney's picture

That did not help either.

But we'll never know because we're giving up on 2.6.22. We've hatched a method to go back to 2.6.13, which we know works. It's an embedded system which will just sit and do its job, so the age of the kernel is not a big deal as long as we get it working.

Thank you very much for your efforts Mitch.

Elwood

Next Time

Mitch Frazier's picture

It would have been nice to find/know the solution... but it doesn't always work out that way.

