First Look at an Apple G4 with the Altivec Processor
Now that Linux is installed the fun with AltiVec begins. As I already mentioned, the AltiVec unit is an additional processing unit, like the floating-point unit or the integer-unit, that processes data stored in 32 128-bit vector registers. The vector execution unit processes this vector data using the single instruction multiple data (SIMD) model. The processor, with one instruction, can operate on four, eight or 16 data units at once. Shortly I give an example to clarify this.
Motorola added 162 new assembler instructions to allow programmers to use the new functionality of the AltiVec-enabled processor. These instructions are detailed in the AltiVec Technology Programming Environments Manual (altivec_pem). The higher-level C instructions that use these new assembler instructions can be found in the AltiVec Technology Programming Interface Manual (altivec_pim). Both of these documents are available for download, in PDF format, from either Motorola's web site or from http://www.altivec.org/.
My next step was to download and install the AltiVec RPMs from http://www.altivec.org/. These RPMs provide a version of gcc (2.95.2) that has been modified to use these new directives. Installation is achieved by the following:
rpm -U binutils-126.96.36.199.22-6.vec.ppc.rpm rpm -i gcc-altivec-2.95.2-1i.ppc.rpm rpm -i gcc-altivec-c++-2.95.2-1i.ppc.rpm
After installation, I was able to use this new gcc as follows:
gcc-vec program.c -o programgcc installs into /opt/bin so that it doesn't affect the default gcc. The RPM creates a link in /usr/bin, named gcc-vec, that points to the vectorized gcc in /opt.
To use the new vectorized commands, you have to write applications that use them and use a version of gcc that is aware of them. You cannot use this version of gcc on your standard C source code and expect to achieve a performance increase from the AltiVec unit. The AltiVec-enabled gcc is aware of new keywords and new functions. altivec_pim is the first step in learning the new commands provided for in gcc-vec. The new vector data types are seen in Table 2.
Notice the new keyword vector. This indicates that the following declaration is a 16-byte (128-bit) vector. Additionally, these types must be aligned on 16-byte boundaries for the vector execution unit to process the values suitably. A programmer must use caution when de-referencing data that is not aligned on a 16-byte boundary and typically will massage the data to be so aligned.
According to altivec_pim, compilers aware of the AltiVec-enabled processor should provide the following macro:
To build code that is capable of compiling on multiple architectures but is still capable of using the AltiVec instructions, you can do something like the following:
#ifdef __VEC__ /* Put your vector code here */ /* ... */ #else /* do it the old-fashioned way, here */ /* ... */ #endifTo illustrate how to begin using the AltiVec-enabled gcc, I'll provide an example in Listing 3 [see FTP site at ftp.linuxjournal.com/pub/lj/listings/issue86].
First, notice the typedef union definitions. As previously discussed, the AltiVec registers are 128-bits. These definitions guarantee that the compiler will align the data declared by these types on 128-bit boundaries. Secondly, they provide a convenient method of accessing the individual elements of the vector data types. A final benefit of using the union data type is that now you are given a mechanism to look inside your register—by using printf( ). The altivec_pim provides for formatted input/output using scanf( )/printf( ). In theory, you should be able to print a vector float register using the following in your C code:
vector float f32 = (vector float)(1.1, 2.2, 3.3, 4.4); printf( "%,vf\n", f32 );
To achieve this your C library (glibc*) must be aware of the vector format directives. The current implementation of the GNU C library (2.2) does not and probably never will. For this reason, I hope to modify a version of the GNU C library to serve this purpose. If you have any advice or interest, please feel free to contact me.
Next, notice the two different mechanisms for defining the vector types. The first declaration is for the vector constants stored in cVals, sVals, iVals and fVals where the vector data type is declared and defined in the same statement. This illustrates how to store constants (values that do not change at runtime) in vectors.
The next method declares a union type and assigns the vector values at runtime in an element-by-element fashion. This method would allow you to read in data from a buffer, copy it to a vector variable and pass it to your vector-aware functions.
Finally, notice the form of the vec_add( ) function. In all cases, I have used the same function, vec_add( ), and it provides the correct result, regardless of whether the arguments were vector shorts, vector ints or vector floats (the arguments must be of the same type). In this case, the compiler interpreted the data types I passed as arguments to vec_add( ) and generated the correct form of the assembler instruction vadd* for me. For example, in the following C code the compiler is able to generate the mapping below:
vector float a,b,c; /* Assign a,b */ /* ... */ c = vec_add( a, b);
This translates to the following assembler instruction:
vaddfp c,a,bThis just keeps getting easier.
To compile this program, use the following command:
gcc-vec -fvec vecdemo1.c -o vecdemo1
The -fvec switch to the compiler tells it to interpret the vector commands. If you don't use the -fvec switch, the compiler will not recognize the vector data types or commands and will print error messages that will remind you to use the switch the next time.
The program produces the output shown in Listing 4.
I've tried to provide an introduction to Linux on the PowerMac and to the AltiVec resources available to Linux programmers. I would like to do more. Other possible avenues would be to demonstrate how the AltiVec can be used as a platform for signal processing, how these processors can be used in place of special-purpose DSPs or to look at a common use for DSPs in signal processing, finite impulse response (FIR) filters.
I would like to thank the members of the AltiVec forum. The mail list has been an invaluable resource to get up and running. Also, thank you to all of the AltiVec developers that have provided such a rich set of tools to begin development on a platform as powerful as the G4.