Embedded PCI and Linux

How can you get the cost advantages of PCI and the ruggedness of an industrial platform in one form factor? Use inexpensive DIMM connectors and everyone's favorite OS.
Linux Support for PCI

In the 2.0.38 kernel we used as a starting point, Linux support for PCI peripherals is extensive but is completely dependent on BIOS calls for PCI configuration. This works in the world of PCs but is a gaping hole in implementing PCI for an embedded system. We were unaware of any Linux implementations of PCI that did not use the BIOS, and so our starting point was to give our system PCI BIOS functionality in order to make use of the base Linux PCI driver.

PCI for Intel Designs vs. dimmPCI Designs

Most developers are familiar with PCI in the Intel x86 environment. It is important to note that such an implementation is a special case. In the special case, the PCI memory address space and the host processor (386, 486, Pentium, Athlon), physical address space are one and the same. The bridge arrangement required by the VZ328 architecture described above is the more general case.

There are other significant development issues as well. These stem from the reality that most PCI development is for products intended for the desktop. In addition to the above problem, other problems involved in porting PC drivers to our embedded platform include: 1) byte order is generally ignored as x86s and PCI are the same; 2) word alignment is generally poor on x86s as these processors automatically generate multiple bus cycles; 3) new hardware still reflects ISAs (industry-standard architectures, i.e., AT) by using the same I/O register maps, rather than introducing improved memory space register maps; and 4) drivers, especially video drivers, sometimes rely on expansion ROMs that contain x86 binary code, which assumes the existence of PC peripheral devices and specific x86 PC BIOS software interrupt vectors.

The consequence of our decision to use the Motorola VZ328 microprocessor is that placing the memory onto the PCI bus would transform the architecture into the special case of the PC, at increased cost and great technical complexity.

PCI BIOS Development

The PCI BIOS must do two things. It must allocate memory space, I/O space and interrupts. It also must allow a device driver access to PCI configuration cycles that allow the device driver to find the card, read the card resources and be able to configure the card as necessary.

The PCI BIOS calls have been extended to provide a hardware abstraction layer to compensate for having a split memory address space. These functions are summarized as follows:

  • pcibios_read_memory_byte()

  • pcibios_write_memory_byte()

  • pcibios_read_memory_word()

  • pcibios_write_memory_word()

  • pcibios_read_memory_dword()

  • pcibios_write_memory_dword()

The VZ328, unlike x86 processors, does not have both a memory address space and an I/O address space. The VZ328 itself cannot generate an I/O cycle, nor does it need to. The VZ328 must communicate the PCI I/O address to the bridge, as required for PCI memory addresses, and must communicate a request to the bridge for a PCI I/O cycle. The PCI BIOS calls have been extended with the addition of the following functions to support I/O cycles:

  • pcibios_read_io_byte()

  • pcibios_write_io_byte()

  • pcibios_read_io_word()

  • pcibios_write_io_word()

  • pcibios_read_io_dword()

  • pcibios_write_io_dword()

The VZ328, unlike x86 processors, is a large-endian device. PCI has been defined to work well with x86 devices, which are small-endian. Since most PCI data transfers are word or dword (i.e., few byte transfers), it is advantageous to avoid byte swapping on every word or dword access, so the data bus between the bridge and the VZ328 has been swapped in hardware. A side-effect result is that byte accesses now must be corrected by inverting the least significant address bit (A0). Fortunately, this is handled in the supplied PCI BIOS calls (and in the above extensions), so the details largely are hidden by the PCI BIOS:
  • pcibios_read_config_byte()

  • pcibios_write_config_byte()

  • pcibios_read_config_word()

  • pcibios_write_config_word()

  • pcibios_read_config_dword()

  • pcibios_write_config_dword()

Most transfers on the PCI bus are accomplished using direct memory access (DMA) bus cycles. These transfers can be initiated by any PCI device to any memory address reachable on the PCI bus. Since the VZ328 address space is separate from the PCI address space, PCI DMA transfers cannot specify a source or destination that is within the VZ328 memory space.

The Cypress Cy7C09449 contains a 32KB dual-ported memory that is used to bridge the VZ328 memory space to PCI memory space. Since there may be more than one PCI device in a system, there can be more than one PCI device driver installed at any given time. This condition compels us to treat the dual-ported memory as a shared resource and to balance efficiency with performance.

The kernel function kmalloc(), which calls get_free_pages(), was designed to allocate such special memory using GFP_ flags. GFP_DMA is defined for PC implementations and denotes a page that is bounded by a 16M memory limit and does not cross a 64K boundary (this was due to the limitations of the 8237 DMA controller at the time).

Since PCs share processor and PCI memory address spaces, there is no special flag defined to select memory for use with DMA and PCI. It appears that the PC standard here is designed around the special case. To compound the problem, there is no central resource to initiate a DMA transfer, as device drivers initiate transfers by writing to nonstandard registers on the device hardware. Generally speaking, this code needs to be altered to become bridge-aware. Therefore, to facilitate sharing the dual-ported RAM on the bridge, the PCI BIOS has been extended to include kmalloc memory tables and a GFP_PCI flag. A device driver may allocate and deallocate DPram on a transfer-by-transfer basis or request a block of memory that it will own permanently, until the device driver is closed.

Finally, interrupts are handled using the register_interrupt() system call. PCI specifies an 8-bit field for interrupt number, so the VZ328 PCI BIOS can assign a mach interrupt number that can be handled by a device driver using the register_interrupt() system call.


Geek Guide
The DevOps Toolbox

Tools and Technologies for Scale and Reliability
by Linux Journal Editor Bill Childers

Get your free copy today

Sponsored by IBM

Upcoming Webinar
8 Signs You're Beyond Cron

Scheduling Crontabs With an Enterprise Scheduler
11am CDT, April 29th
Moderated by Linux Journal Contributor Mike Diehl

Sign up now

Sponsored by Skybot