Porting LinuxBIOS to the AMD SC520: A Follow-up Report
As of July 15, 2005, we have moved the arch repository to Subversion. Arch checkouts will continue to work, but any new changes will be available only in Subversion.
Well, it was too easy. Things were going well with our project to port LinuxBIOS, until we tried to Flash the Flash part. Then we started to hit some problems with the board, the board design and the AMD SC520.
What went wrong? Put simply, when we tried to use the flash_rom program to Flash the part, it failed even to discover the type of part we had on the board. From there, it got worse. We wrote a small program to dump the Flash part, shown here:
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

int main(int argc, char *argv[])
{
    int fd_mem;
    volatile char *bios;
    unsigned long size = 512 * 1024;

    if ((fd_mem = open("/dev/mem", O_SYNC|O_RDWR)) < 0) {
        perror("Can not open /dev/mem");
        exit(1);
    }
    /* map the top `size` bytes of the 4GB physical address space */
    bios = mmap(0, size, PROT_READ, MAP_SHARED, fd_mem,
                (off_t) (0xffffffff - size + 1));
    if (bios == MAP_FAILED) {
        perror("Error MMAP /dev/mem");
        exit(1);
    }
    /* dump the mapped region to standard output */
    write(1, (const void *) bios, size);
    return 0;
}
When we ran this program, we couldn't get sensible results. This program runs and runs well on everything else we own--several thousand K8 nodes, our laptop, 1,500 Xeon nodes--so it is not the program. What's going on?
As mentioned, we found a problem with the design of the MSM586SEG and the other SC520-based boards from Advanced Digital Logic. The problem, put simply, is that the full Flash part cannot be accessed from the CPU; only the top 128KB of the part can be accessed. This limitation requires us to modify all of the tools that we support for Flash access, so they are aware that although the nominal size of the Flash is 256 or 512KB, only 128KB of that space is available.
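To make the arithmetic concrete, here is a minimal sketch of how a Flash tool might work out which slice of the part it can reach. The names `flash_window` and `visible_window` are ours, for illustration only, and are not taken from flash_rom. The sketch assumes the visible bytes are the top of the part and that they are mapped at the very top of the 4GB address space.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the slice of a Flash part the CPU can see. */
struct flash_window {
    uint32_t phys_base;   /* physical address where the visible slice starts */
    uint32_t part_offset; /* offset of that slice inside the Flash part */
    uint32_t length;      /* number of visible bytes */
};

/*
 * Assumption: the visible bytes are the top `visible` bytes of the part,
 * mapped so that they end at the top of the 4GB physical address space.
 */
struct flash_window visible_window(uint32_t nominal, uint32_t visible)
{
    struct flash_window w;

    assert(visible != 0 && visible <= nominal);
    w.length = visible;
    w.part_offset = nominal - visible;
    w.phys_base = (uint32_t)(0x100000000ULL - visible);
    return w;
}
```

For a 512KB part with a 128KB window, `visible_window(512 * 1024, 128 * 1024)` gives a base of 0xfffe0000 and a part offset of 0x60000, so the lower 384KB of the part is out of reach.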
Making that change, however, still did not help. When we dumped the Flash part, we got not garbage but nonsense. We saw strings that read CCCCoooo and so on. This nonsense led us to think that the Flash space was being cached somehow. In addition, we believed the hardware design had a problem such that burst reads from the Flash part--which would happen if the cache were enabled in the range of memory--were returning the same byte four times, not four consecutive bytes.
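A dump that shows this symptom can be spotted mechanically. The following is a rough sketch, not part of flash_rom, of a check for data in which every aligned group of four bytes is a single byte repeated; the function name `looks_like_stuck_burst` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 if every aligned 4-byte group in buf holds one byte value
 * repeated four times (for example "CCCCoooo"), 0 otherwise.
 * Trailing bytes that do not fill a group of four are ignored.
 */
int looks_like_stuck_burst(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    if (len < 4)
        return 0;               /* too short to judge */
    for (i = 0; i + 4 <= len; i += 4) {
        if (buf[i] != buf[i + 1] ||
            buf[i] != buf[i + 2] ||
            buf[i] != buf[i + 3])
            return 0;
    }
    return 1;
}
```

An erased region, which reads as all 0xff, also passes this check, so a positive result is a hint to look closer rather than proof of a burst-read fault.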
Then, we hit some other problems. We had two MSM586SEG boards, and the IDE interface on both of them stopped working. It turns out that the MSM586SEG has an FPGA controlling many functions, and we suspect that this FPGA has some teething problems. We decided to try out the older design, the MSM586SEV, which has no FPGA.
The MSM586SEV resolved all our problems save one: we still got nonsense when we tried to read the Flash. It now was time for some deep-diving into the SC520 architecture. We learned that a set of 16 registers, the Programmable Address Region (PAR) registers, must be set up correctly before the part can be Flashed.
What are the PAR registers? They are used to steer memory and I/O access issued by the CPU. Almost all processors today have a special set of registers in the memory and I/O address generation path to modify the manner in which such addresses are handled.
Why is this type of register needed? With multiple busses capable of supporting memory and I/O access, the processor has no idea where to send the access unless it is told. That is the function of the PAR registers. Consider the block diagram of the SC520 shown below.
A given I/O access can go to the PCI bus or to the GP devices shown at right. A memory access can go to SDRAM, the Flash part or the PCI bus. The PAR registers allow the BIOS to specify, for a given range of I/O or memory, which bus it goes to, whether it is writable or read-only and whether it is cached.
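The idea of steering by address range can be modeled in a few lines of C. This is a toy model only: the `region` structure, the `target` enum and the `route` function are invented for illustration and do not reflect the SC520's real register format. Letting the first matching entry win is a choice made for the sketch, and what the hardware does with an address that matches no region is outside its scope.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum target { TARGET_NONE, TARGET_SDRAM, TARGET_FLASH, TARGET_PCI, TARGET_GP };

/* Illustrative only: one steering rule for a range of addresses. */
struct region {
    uint32_t base;      /* first address covered */
    uint32_t size;      /* number of bytes covered */
    enum target target; /* bus or device the access is sent to */
    int writable;       /* nonzero if writes are allowed */
    int cacheable;      /* nonzero if the range may be cached */
};

/* Return the first region containing addr, or NULL if none matches. */
const struct region *route(const struct region *tab, size_t n, uint32_t addr)
{
    size_t i;

    assert(tab != NULL || n == 0);
    for (i = 0; i < n; i++) {
        /* subtract first so base + size cannot overflow 32 bits */
        if (addr >= tab[i].base && addr - tab[i].base < tab[i].size)
            return &tab[i];
    }
    return NULL;
}
```

With one entry covering the first 32MB as SDRAM and another covering the top 128KB as read-only, uncached Flash, a lookup of 0xfffffff0 returns the Flash entry and a lookup of 0x00100000 returns the SDRAM entry.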
We found that for the BIOS range of memory, 0xe0000-0xfffff, the PAR register was set to SDRAM. This setting is not surprising: for performance, the BIOS typically copies the BIOS image to SDRAM and then makes sure all BIOS code fetches go to the SDRAM holding the BIOS. This operation is commonly called "shadowing the BIOS".
Because Linux doesn't use the BIOS at all, we can ignore this setting. What we do is point the PAR register for the BIOS region, PAR register 15, back at the Flash part that holds the original BIOS. This is a simple matter of mapping in the registers and then setting the register. Here is a code fragment to do so:
if ((fd_mem = open("/dev/mem", O_SYNC|O_RDWR)) < 0) {
    perror("Can not open /dev/mem");
    exit(1);
}
mmcr = mmap(0, 4096, PROT_WRITE|PROT_READ, MAP_SHARED,
            fd_mem, (off_t) 0xfffef000);
if (mmcr == MAP_FAILED) {
    perror("Error MMAP /dev/mem");
    exit(1);
}
p = mmcr + 15;
l = *p;
printf("l is 0x%lx\n", l);
/* turn off caching for this region */
l |= (1UL<<27);
/* enable writeable bit */
l &= ~(1UL<<26);
/* set type to flash, not sdram */
l &= ~(7UL<<29);
l |= (4UL<<29);
/* 64k pages */
l |= (1UL<<25);
/* blow away base and size stuff. */
l &= ~(0x1fffUL | (0x7ffUL<<14));
printf("l is now 0x%lx\n", l);
/* size field of 8, base at 0x2000000 in 64k units */
l |= (8UL << 14) | (0x2000000UL>>16);
printf("l is now 0x%lx\n", l);
*p = l;
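When debugging, it helps to go the other way and pull a PAR value apart. The sketch below decodes only the fields the fragment above touches. The bit positions and masks are read off that fragment, the field names are ours, and the SC520 documentation remains the authority on the exact layout.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions copied from the fragment; names are hypothetical. */
struct par_fields {
    unsigned type;       /* bits 31-29: the fragment writes 4 for Flash */
    unsigned cache_bit;  /* bit 27: set by the fragment to stop caching */
    unsigned wprot_bit;  /* bit 26: cleared by the fragment to allow writes */
    unsigned pages_64k;  /* bit 25: set for 64KB pages */
    unsigned size_field; /* 11 bits starting at bit 14 */
    unsigned base_field; /* low 13 bits: start address in 64KB units */
};

struct par_fields par_decode(uint32_t v)
{
    struct par_fields f;

    f.type       = (v >> 29) & 0x7;
    f.cache_bit  = (v >> 27) & 0x1;
    f.wprot_bit  = (v >> 26) & 0x1;
    f.pages_64k  = (v >> 25) & 0x1;
    f.size_field = (v >> 14) & 0x7ff;
    f.base_field = v & 0x1fff;
    return f;
}

/* Start address implied by base_field; only meaningful with 64KB pages. */
uint32_t par_base_address(uint32_t v)
{
    struct par_fields f = par_decode(v);

    assert(f.pages_64k);
    return (uint32_t)f.base_field << 16;
}
```

Starting from zero and applying the same operations as the fragment yields 0x8a020200, which decodes to type 4, a size field of 8 and a base address of 0x2000000.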
Once we had this done, we still had troubles. The further problem was the design of the PAR registers. They live in memory at 0xfffef000; in other words, they are placed right in the middle of the top 2MB of the 4GB of memory space. This space is, by convention, reserved for BIOS Flash, but the SC520 breaks that convention. So, although we had worked around the board problems, we now were faced with an architectural problem.
A light bulb went on at this point, though, thanks to comments we had seen in sample code from AMD. The AMD code always was careful to program the PAR registers to place the Flash part above the top of DRAM, that is, at 32MB or 0x2000000. We modified our parbios program slightly, and voilà--all 512KB of Flash now was available, starting at 0x2000000.
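With the Flash relocated, the dump program needs only a different base address. Here is a sketch of the dump step written as a function that takes the device path and base as parameters, so the same code serves either location. The name `dump_region` and its signature are ours and do not come from flash_rom.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `size` bytes of `path` starting at `base` and write them to out_fd.
 * Returns 0 on success, -1 on failure. `base` must be page-aligned.
 */
int dump_region(const char *path, off_t base, size_t size, int out_fd)
{
    int fd, rc = 0;
    size_t done = 0;
    unsigned char *p;

    assert(path != NULL && size > 0);
    fd = open(path, O_RDONLY | O_SYNC);
    if (fd < 0) {
        perror(path);
        return -1;
    }
    p = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, base);
    if (p == MAP_FAILED) {
        perror("mmap");
        close(fd);
        return -1;
    }
    /* write() may be short, so loop until everything is out */
    while (done < size) {
        ssize_t n = write(out_fd, p + done, size - done);
        if (n <= 0) {
            perror("write");
            rc = -1;
            break;
        }
        done += (size_t)n;
    }
    munmap(p, size);
    close(fd);
    return rc;
}
```

On the board, run as root after the PAR register has been reprogrammed, the call would be `dump_region("/dev/mem", 0x2000000, 512 * 1024, 1)`.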
The effects of this change are far-reaching. We had to modify our flash_rom program to enable the Flash on the SC520 and place the Flash at an odd location in memory. Nevertheless, at least we can program it now. This change also affects LinuxBIOS itself. If we want to use all of the Flash part, or simply more than 64KB, we're going to have to make a lot of changes to how LinuxBIOS addresses Flash. We've never seen a machine to date that could not address Flash directly at the top of memory.
Comments
MMCR can be moved...
Quote:
"Probably the single biggest problem is the location of the configuration registers, placed right in the middle of the top 2MB of memory"
You can map the MMCR to any 4k offset in the lower 1G by writing to the CBAR register. 0xFFFEF000 is just the default location, so I would not see that as a flaw.
I ended up using rolo instead of Linuxbios on a SC520 based product I designed a few years ago. U-boot is another very powerful option, but they are both more targeted to embedded products.
MMCR can be moved ... but not removed.
The problem in this case is that even if you "move" the MMCR, it is still sitting at its old position 0xFFFEF000, in addition to the new one.
Code sample
The code snippet is broken - presumably due to mangling by a CMS. Given the extreme difficulty I'm having in getting this godawful commenting system to not mangle these code snippets, that's my working theory. LJ staff, please fix your CMS so it doesn't strip spaces in <code></code> blocks! I had to use &nbsp; in a <code></code> block, which is (a) gross and (b) should probably not be interpreted as an entity, but rather as a literal.
In addition to the total loss of indenting, I found two other problems. First:
bios = mmap(0, size, PROT_READ, MAP_SHARED, fd_mem,
off_t) (0xffffffff - size + 1));
looks like it should be:
bios = mmap(0, size, PROT_READ, MAP_SHARED, fd_mem,
(off_t)(0xffffffff - size + 1) );
and:
#include <string.h> #
include <stdlib.h>
should evidently be:
#include <string.h>
#include <stdlib.h>
With those changes it builds OK here. The dump doesn't look too interesting on this system - probably not a flash image, anyway - but it's an AMD64 box so it's quite likely the flash is mapped to somewhere different. Any idea what address it might be at?