Porting LinuxBIOS to the AMD SC520: A Follow-up Report

Getting the board Flashed led to some interesting detective work for the LANL team.

First Contact

The Flash is burned. The serial port works. Let's plug it in.

We also had to modify the SC520 startup code to mimic the setup of the PAR registers. With this set of changes made, we got our first serial output:


LinuxBIOS-1.1.8.0Fallback Tue Jun 14 13:36:22 MDT 2005 
starting... 

Copying LinuxBIOS to ram. 

Jumping to LinuxBIOS

Well, it's a start. For the record, this is version


LinuxBIOS@LinuxBIOS.org--devel/freebios--devel--2.0--patch-45. 

What's going on now? What does jumping to LinuxBIOS mean?

What this all means is that the ROMCC-compiled code is working, but the SDRAM is not. Because the SDRAM is not working, the GCC-compiled code doesn't work either. It's time to put in some print statements. It's also time to scan the src/cpu/amd/sc520/raminit.c code carefully for errors. As of this version, this code is still pretty ugly, as it came from assembly code. A quick perusal does show a few errors, but prints are the best bet at this point, because at times it is hard to tell what is really going on.

Here is the output from this version:


LinuxBIOS-1.1.8.0Fallback Tue Jun 14 16:29:46 MDT 2005 
starting... 

HI THERE! 

sizemem 

NOP 

And then it resets. For reference, I have committed this version as patch-46. See the raminit code to see where this is blowing up.

At this point, we had to do a bit more digging. We noticed in the AMD assembly code that although a lot of byte registers are used to control various things, some of the assembly seems to use word writes. Even for a byte-wide register that has another register right after it, the code uses word writes.

We moved to word writes and things got much better. Once it is all working, for the sake of cleanliness, we're going to try to turn these back into byte writes. Word writes make no sense, unless there's a hardware problem.
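To make the distinction concrete, here is a minimal sketch in C of the two access styles. It runs against an ordinary array standing in for the register block, so it can be tried on any host. The names fake_regs, write_reg8, write_reg16 and read_reg8 are ours for illustration and are not taken from raminit.c or from the AMD assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a block of memory-mapped registers. On the real
 * board this would be a fixed hardware address, not an array in RAM. */
static uint16_t fake_regs[16];

/* Byte-wide store to the register at the given byte offset. */
static void write_reg8(size_t offset, uint8_t value)
{
    assert(offset < sizeof fake_regs);
    volatile uint8_t *reg = (volatile uint8_t *)fake_regs + offset;
    *reg = value;
}

/* Word-wide store covering the register at offset and its neighbor
 * at offset + 1 in a single access. */
static void write_reg16(size_t offset, uint16_t value)
{
    assert(offset % 2 == 0 && offset < sizeof fake_regs); /* aligned, in range */
    volatile uint16_t *reg = (volatile uint16_t *)&fake_regs[offset / 2];
    *reg = value;
}

/* Byte-wide load, used here only to inspect the result. */
static uint8_t read_reg8(size_t offset)
{
    assert(offset < sizeof fake_regs);
    return *((volatile uint8_t *)fake_regs + offset);
}
```

On a little-endian x86, write_reg16(0x10, 0x3412) leaves 0x12 in the byte at offset 0x10 and 0x34 in the byte at 0x11, the same final contents as two write_reg8 calls. In plain memory the two styles are indistinguishable; the difference we ran into only shows up in how the hardware reacts to the width of the access.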

It's Important to Specify the Correct CPU

We consistently had resets on a certain code sequence, almost as though we were compiling for the wrong processor. Well, as it happened, we were. Although we had set this line in the Config.lb for the mainboard:


arch i386 end

we had a mistake in one of the extra compilation rules. We were telling ROMCC that the CPU was a P3:


makerule ./auto.inc depends "$(MAINBOARD)/auto.c 
option_table.h ./romcc" action "./romcc -mcpu=p3 -O 
-I$(TOP)/src -I. $(CPPFLAGS) $(MAINBOARD)/auto.c -o $@" 
end 

What's the problem with doing this? In short, when we specify P3 as the CPU for ROMCC, ROMCC generates MMX instructions to use those extra registers. This usage causes trouble, as there are no MMX registers on a 486.

We modified the line as follows:


makerule ./auto.inc depends "$(MAINBOARD)/auto.c 
option_table.h ./romcc" action "./romcc -mcpu=i386 -O 
-I$(TOP)/src -I. $(CPPFLAGS) $(MAINBOARD)/auto.c -o $@" 
end 

Things suddenly got much, much better.

Finally, Getting into LinuxBIOS

But how do we tell? The code that copies the LinuxBIOS RAM part is assembly. It says jumping to LinuxBIOS, but all we see is POST EE. We're going to give you an overview of how you might debug for a new platform. What we're going to do is bypass most of the LinuxBIOS code that runs outside of the ROMCC code. Mostly, what this code does is uncompress the GCC code and copy it to SDRAM. This code can be hard to follow, however, so we're going to skip it completely.

We're going to make the code that gets copied to RAM be uncompressed rather than compressed, which will take more space. So we need to use as much of the Flash as we can. We're going to need to make Flash map in at 0x2000000. In the auto.c code, we're going to copy that Flash to RAM. Finally, in the code, we're going to insert a few loops like this one:


1: jmp 1b

so that if the machine hangs, we know it got to the infinite loop.

Let's take it one part at a time. In our src/cpu/amd/sc520/raminit.c file, we add the following:


*par++ = 0x8a020200; 

/*PAR15: BOOTCS:code:nocache:write:Base 0x2000000, size 
    0x80000:*/ 
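As a sanity check on that constant, the following sketch pulls it apart into fields. The bit layout used here is an assumption based on our reading of the SC520 register documentation (target in bits 31-29, attributes in bits 28-26, page-size select in bit 25, and for 64KB pages a size field in bits 24-14 and a start page in bits 13-0); confirm it against the manual before relying on it. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a PAR register value, under the ASSUMED layout described above. */
struct par_fields {
    unsigned target;     /* bits 31-29; 4 is assumed to select BOOTCS */
    unsigned attr;       /* bits 28-26; bit 27 is assumed to mean "no cache" */
    unsigned page_64k;   /* bit 25; 1 is assumed to mean 64KB pages */
    uint32_t size_field; /* bits 24-14, raw value, in 64KB pages */
    uint32_t base;       /* bits 13-0 scaled by 64KB to give a byte address */
};

/* Decode a PAR value that uses 64KB pages. Values using the 4KB page
 * layout are not handled by this sketch. */
static struct par_fields decode_par_64k(uint32_t par)
{
    struct par_fields f;

    assert((par >> 25) & 0x1);          /* only the 64KB layout is decoded */
    f.target = (par >> 29) & 0x7;
    f.attr = (par >> 26) & 0x7;
    f.page_64k = (par >> 25) & 0x1;
    f.size_field = (par >> 14) & 0x7ff;
    f.base = (par & 0x3fff) << 16;
    return f;
}
```

For 0x8a020200 this gives target 4, attribute bits 010, 64KB pages, a size field of 8 and a base of 0x2000000. That agrees with the comment if a size field of 8 is read as eight 64KB pages (0x80000); whether the hardware counts that field from zero or from one is something to confirm in the manual.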

That PAR15 write is still in raminit.c today. It maps the Flash in at the 32MB location.

Next, we set up LinuxBIOS so that the GCC payload is uncompressed. How do we do this? First, we need to explain memory layout. A number of variables control Flash layout in LinuxBIOS, as shown in Figure 2. Notice that each set of variables can be changed for each payload. In our example, at this point, we are using only one payload, so we show the variables for that case.

Figure 2. ROM Sizing Controls

In src/mainboard/digitallogic/msm586/Options.lb, we set CONFIG_COMPRESS to zero. We set the ROM_SIZE to 128K, and we set ROM_IMAGE_SIZE large enough to hold the uncompressed payload. If you look at various patch levels of LinuxBIOS in the repository, you can trace our progress in debugging; space does not allow it all here. We've left the appropriate code in auto.c between ifdefs so you can see how it looks. A word of warning: care must be taken with volatile. ROMCC is a good compiler. If the source and destination pointers are not declared volatile, it will gladly optimize out copy-assignment loops.
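The volatile warning is easier to see in code. The following is a minimal sketch of a copy loop of the kind described, written as ordinary hosted C rather than in ROMCC's dialect; it is not the code from auto.c, and the name copy_words is ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy `words` 32-bit words from src to dst.
 *
 * Both pointers are volatile-qualified, so the compiler has to perform
 * every load and every store. Without the qualifier, an optimizer that
 * cannot see the destination being read later is free to drop the loop. */
static void copy_words(volatile uint32_t *dst,
                       const volatile uint32_t *src,
                       size_t words)
{
    size_t i;

    assert(words == 0 || (dst != NULL && src != NULL));
    for (i = 0; i < words; i++)
        dst[i] = src[i];
}
```

In the setup described here, src would point at the Flash window at 0x2000000 and dst at the RAM address the GCC-compiled code is linked to run from; that destination address depends on the build and is not shown in this sketch.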

Also, in crt0.s, we did some playing around. Here's a useful assembly sequence for telling you where you are and making sure you get to see it:


_start: movb $0x12, %al ; outb %al, $0x80; jmp _start

We verified that the assembly gets as far as hardwaremain(). So in hardwaremain(), we put in a call to post(), followed by a while(1), and we do see the system hang at that point.

The next step is to test a back-to-back post(). In other words, we call post() twice. Why is this important? It verifies that the stack is working too. To this point, all we've done is call functions; we haven't really found out about returns. Calls can work always, but returns rely on a stack that works. If memory is not correctly set up, a return will fail. We have, in the past, had a sequence of function calls that worked fine until the first function exit, at which point the system failed. Memory really can be this tricky. Back-to-back calls to post(), though, verify that we have a working stack.
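The back-to-back test can be sketched as follows. So that it can be run on a host, the port write is replaced by a stub that records the codes instead of sending them to port 0x80. The stub, the log array, the function name stack_check and the second code value 0x13 are hypothetical; only the idea of calling post() twice comes from our actual debugging.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware: records POST codes instead of
 * writing them to I/O port 0x80. */
static uint8_t post_log[8];
static unsigned post_count;

static void outb_stub(uint8_t value, uint16_t port)
{
    assert(port == 0x80);               /* POST codes go to port 0x80 */
    if (post_count < sizeof post_log)
        post_log[post_count++] = value;
}

/* Emit one POST code. */
static void post(uint8_t code)
{
    outb_stub(code, 0x80);
}

/* Two calls in a row. Seeing the first code shows that the call worked;
 * seeing the second shows that the first call also returned, which
 * requires a working stack. */
static void stack_check(void)
{
    post(0x12);
    post(0x13);
    /* On the board, an infinite loop would follow here so that the last
     * code stays visible on the POST display. */
}
```

If only the first code ever appears on the POST display, the call succeeded but the return did not, and memory setup is the first place to look.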

The key idea here is that the careful placement of so-called "halt-and-catch-fire" instructions, with a little bit of output, allows you to pinpoint how far you are getting in the code.

To make a long story short, we got caught by our own error in the config file. We forgot to tell LinuxBIOS what kind of console we have. This is fixed easily. In src/mainboard/digitallogic/msm586seg/Options.lb, we add:


uses CONFIG_CONSOLE_SERIAL8250

default CONFIG_CONSOLE_SERIAL8250=1

Now, do we have a console? Let's see.

______________________

Comments

MMCR can be moved...

AndrewD

Quote:
"Probably the single biggest problem is the location of the configuration registers, placed right in the middle of the top 2MB of memory"

You can map the MMCR to any 4k offset in the lower 1G by writing to the CBAR register. 0xFFFEF000 is just the default location, so I would not see that as a flaw.

I ended up using rolo instead of Linuxbios on a SC520 based product I designed a few years ago. U-boot is another very powerful option, but they are both more targeted to embedded products.

MMCR can be moved ... but not removed.

Stefan Reinauer

The problem in this case is that even if you "move" the MMCR, it is still sitting at its old position 0xFFFEF000, in addition to the new one.

Code sample

Craig Ringer

The code snippet is broken - presumably due to mangling by a CMS. Given the extreme difficulty I'm having in getting this godawful commenting system to not mangle these code snippets, that's my working theory. LJ staff, please fix your CMS so it doesn't strip spaces in <code></code> blocks! I had to use &nbsp; in a <code></code> block, which is (a) gross and (b) should probably not be interpreted as an entity, but rather as a literal.

In addition to the total loss of indenting, I found two other problems. First:


bios = mmap(0, size, PROT_READ, MAP_SHARED, fd_mem,
off_t) (0xffffffff - size + 1));

looks like it should be:


bios = mmap(0, size, PROT_READ, MAP_SHARED, fd_mem,
           (off_t)(0xffffffff - size + 1) );

and:


#include #
include

should evidently be:


#include
#include

With those changes it builds OK here. The dump doesn't look too interesting on this system - probably not a flash image, anyway - but it's an AMD64 box so it's quite likely the flash is mapped to somewhere different. Any idea what address it might be at?
