AMD64 Opteron: First Look
In advance of the release of the AMD64 Opteron processor, we had the opportunity to test a dual-processor SMP system with a 64-bit Linux distribution. As far as most Linux users are concerned, Opteron is the most significant new hardware introduction so far this decade. This early look covers the following key areas:
An overview of the AMD64 Opteron architecture.
A reference two-way SMP system from Newisys, Inc.
Some results from running several GPL applications.
A survey of early system performance estimates.
The Opteron is a new device family based on a new 64-bit architecture that is compatible with the existing 32-bit x86 architecture. AMD's choice to preserve compatibility has positive implications for migrating 32-bit workloads.
The Opteron architecture supports four application programming models. The first is the General-Purpose model, which performs basic operations like memory access, control flow, exception handling and I/O. The General-Purpose model also sets up memory optimizations that are used by the other application programming models. The next model is 128-bit Media Programming, which uses 128-bit XMM registers. Operations supported under this model include integer and floating-point operations on vector and scalar data. Similar capabilities are supported in the 64-bit Media Programming model. The last programming model is called x87 Floating-Point Programming, which uses the x87 registers for 80-bit scalar floating-point operations.
The code name for the processor core is Sledgehammer, and the device ships in a 940-pin ceramic micro PGA package. The current Opterons use nearly 106 million transistors in a 130nm Silicon on Insulator (SOI) process. The devices are fabricated at AMD's Fab30 in Dresden, Germany. The L1 cache has a 128KB capacity, split into a 64KB instruction cache and a 64KB data cache. An on-chip L2 cache has a 1MB capacity. The processor runs at 1.55V and has a die size of 193mm². The Resources section has more information on AMD64 architecture and open-source software support for this processor.
The Opteron processor is a highly integrated processor with features designed to attain balanced system performance. As such, it contains an integrated high-performance coupling link called HyperTransport, which offers 6.4GB/sec full-duplex data exchanges between processors or other HyperTransport nodes. Support is provided for up to three HyperTransport links, for a total of up to 19.2GB/sec peak bandwidth per processor. In addition, each Opteron contains an integrated memory controller, which offers very high bandwidth and error control capabilities. ECC (error correcting code) protection is provided for L1 cache data, L2 cache data and tags, and in external DRAM with hardware scrubbing of all ECC-protected arrays.
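The bandwidth figures above are simple multiples of one another, which the following minimal Python sketch makes explicit. The constants come from the text; the function names are illustrative only, and the per-direction figure assumes that "full-duplex" means the 6.4GB/sec is split evenly between the two directions of a link.

```python
# Figures quoted in the article text.
LINK_BANDWIDTH_GB_S = 6.4  # one HyperTransport link, both directions combined
MAX_LINKS = 3              # links supported per Opteron


def peak_hypertransport_bandwidth(links=MAX_LINKS, per_link=LINK_BANDWIDTH_GB_S):
    """Return the peak aggregate HyperTransport bandwidth in GB/sec."""
    if not 0 <= links <= MAX_LINKS:
        raise ValueError("an Opteron supports at most three HyperTransport links")
    return links * per_link


def per_direction_bandwidth(per_link=LINK_BANDWIDTH_GB_S):
    """Return the bandwidth of one direction of a link in GB/sec.

    Assumption: the full-duplex figure is divided evenly between directions.
    """
    return per_link / 2


if __name__ == "__main__":
    print(f"Peak per processor: {peak_hypertransport_bandwidth():.1f} GB/sec")
    print(f"Per link, per direction: {per_direction_bandwidth():.1f} GB/sec")
```

These are peak numbers; sustained throughput on a real workload will be lower.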
AMD uses a three-digit part numbering scheme for the Opterons. The first digit indicates the intended SMP scalability, which is two-way for the 1.8GHz Opteron Model 244 and the 1.6GHz Opteron Model 242 covered in this article. The remaining two digits indicate relative performance within the scalability family. As chip manufacturing costs decline and process technology improves, other model numbers of a given scalability class will emerge at higher frequencies and lower costs. AMD also will be manufacturing a one-way Opteron, which is intended for high-performance lower-cost systems.
Models 240, 242 and 244 are available at the time of this writing. The eight-way capable models 840, 842 and 844 are scheduled to be available in May 2003, and the one-way model 144 in Q3 of 2003.
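As a rough illustration of the numbering scheme, the Python sketch below splits a model number into its parts. The function and field names are hypothetical. It assumes that the trailing two digits together encode relative performance, which is what the 240/242/244 sequence suggests; it is a reading of the scheme, not an AMD-published decoder.

```python
def decode_opteron_model(model):
    """Split a three-digit Opteron model number into scalability and performance.

    Assumption: the first digit is the maximum SMP width and the last two
    digits form a relative performance index within that family.
    """
    text = str(model)
    if len(text) != 3 or not text.isdigit():
        raise ValueError(f"expected a three-digit model number, got {model!r}")
    return {
        "max_smp_ways": int(text[0]),
        "performance_index": int(text[1:]),
    }


if __name__ == "__main__":
    for model in (144, 240, 242, 244, 840, 842, 844):
        info = decode_opteron_model(model)
        print(model, info["max_smp_ways"], info["performance_index"])
```

Within one family, a larger performance index corresponds to a faster part, as with the 1.6GHz Model 242 and the 1.8GHz Model 244.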
The Opteron extends the x86 architecture, allowing customers to run existing 32-bit applications on a 64-bit OS. Customers who run a 64-bit OS will be ready to support future 64-bit applications and migrate at their own pace, while maintaining the usefulness of their 32-bit applications.
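Because 32-bit and 64-bit programs can coexist on the same 64-bit OS, it is sometimes useful for a program to report which kind of process it is. The following is a minimal sketch in Python, chosen here only as an illustration language; it uses the standard `struct` and `platform` modules, and the function names are hypothetical.

```python
import platform
import struct


def process_word_size_bits():
    """Return the pointer width, in bits, of the running Python process."""
    # struct.calcsize("P") is the size in bytes of a native C pointer.
    return struct.calcsize("P") * 8


def describe_environment():
    """Return the machine type reported by the OS and this process's width."""
    return {
        "machine": platform.machine(),
        "process_bits": process_word_size_bits(),
    }


if __name__ == "__main__":
    print(describe_environment())
```

A 32-bit interpreter running on a 64-bit kernel would typically report a 64-bit machine type alongside a 32-bit process width.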
The reference platform is called the 2100 and was realized by Newisys, Inc., a technology provider that now has two years of experience with Opteron (see Resources).
The 2100, a 1U, two-processor, rackmountable system, is superbly engineered. The mechanical and electrical design offers reliability in a dense package (Figure 1). For example, the power supply is designed for a 500,000-hour MTBF. If you work a typical 2,080 hours a year, this level of reliability would be like working for about 240 years without making a mistake.
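The working-years comparison is straightforward arithmetic, sketched below in Python. The MTBF and the 2,080-hour working year come from the text; the calendar-year figure is added as an assumption that the server runs around the clock, 365 days a year.

```python
MTBF_HOURS = 500_000            # power supply design target quoted in the text
WORK_HOURS_PER_YEAR = 40 * 52   # 2,080 hours: 40-hour weeks, 52 weeks
CALENDAR_HOURS_PER_YEAR = 24 * 365  # assumption: continuous operation


def mtbf_in_years(mtbf_hours, hours_per_year):
    """Convert an MTBF in hours to years at a given usage rate."""
    return mtbf_hours / hours_per_year


if __name__ == "__main__":
    working = mtbf_in_years(MTBF_HOURS, WORK_HOURS_PER_YEAR)
    continuous = mtbf_in_years(MTBF_HOURS, CALENDAR_HOURS_PER_YEAR)
    print(f"Working years: {working:.0f}")
    print(f"Years of continuous operation: {continuous:.0f}")
```

MTBF is a statistical average over a population of units, so neither figure is a guaranteed lifetime for any single power supply.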
With Opteron-balanced chipsets and excellent board-level integration features, this system has improved memory performance and capacity significantly. The result is a high-performance balanced server design with robust I/O. The evaluation system came with 6GB of PC2700 memory, but the server supports up to 16GB.
Figure 2 is a top view of the system. The two copper heatsinks are the processors. Two speed grades are supported: the 1.6GHz Opteron model 242 and the 1.8GHz Opteron model 244. The Opterons link to each other and to the chipset using HyperTransport. The CPU-to-CPU bandwidth is 3.2GB/sec in each direction (full-duplex). The Opterons each have an internal memory controller that supports ECC DDR SDRAM at a bandwidth of 5.33GB/sec (each). There are two banks of memory, one next to each processor.
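The 5.33GB/sec memory figure can be reproduced from the memory type. The Python sketch below assumes PC2700 (DDR333) modules at a nominal 333.33 million transfers per second, a 64-bit data path per channel and two channels per processor; the dual-channel width is an assumption not stated in the text, and ECC bits are not counted as payload.

```python
TRANSFERS_PER_SEC = 1e9 / 3   # DDR333 nominal rate, about 333.33 million/sec
BYTES_PER_TRANSFER = 8        # 64-bit data path per channel (ECC bits excluded)
CHANNELS_PER_CPU = 2          # assumption: dual-channel controller
CPUS = 2                      # the two-way system described in the text


def memory_bandwidth_gb_s(channels=CHANNELS_PER_CPU):
    """Return peak memory bandwidth for one processor in GB/sec."""
    return TRANSFERS_PER_SEC * BYTES_PER_TRANSFER * channels / 1e9


if __name__ == "__main__":
    per_cpu = memory_bandwidth_gb_s()
    print(f"Per processor: {per_cpu:.2f} GB/sec")
    print(f"Both processors combined: {per_cpu * CPUS:.2f} GB/sec")
```

Because each processor has its own controller and memory bank, the combined peak grows with the number of processors, although accesses to the other processor's memory must cross the HyperTransport link.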
The AMD-8000 HyperTransport chipset includes an AMD-8131 HyperTransport PCI-X chip, as well as an AMD-8111 I/O hub chip. The AMD-8131 is configured to drive a full-slot PCI-X 64-bit/133MHz at a 1GB/sec data rate, as well as a half-slot PCI-X 64-bit/66MHz at a 0.51GB/sec data rate. The AMD-8131 also provides a pair of triple-mode NICs (10/100/1000Mbps), plus a dual Ultra-SCSI RAID controller. Our test system had dual hot-swappable Ultra-SCSI drives in a RAID configuration in the front of the case. The AMD-8111 chip provides a VGA port, IDE CD-ROM and a USB port. Separately, a SuperI/O chip provides a floppy, keyboard, mouse and conventional serial port.
The system also contains a separate embedded server management processor, which itself runs Linux. This subsystem is based on a Motorola XPC855T PowerPC processor, running kernel 2.4.18. In addition to a small front-panel control console, the server management processor provides a pair of isolated 10/100 Ethernet interfaces to connect to an independent management subnet. Thus, system management can be done without a keyboard and monitor, or even a serial console access server, and using it does not take up one of the PCI slots.
This system's management facility is quite advanced. The management processor supports SNMP, CIM and IPMI protocols. NIS, Microsoft Active Directory and LDAP authentication support all are provided. Cloning of the service processor configuration can be done peer-to-peer. In addition, one service processor can be designated as the controller for an entire farm of servers. The management processor also provides zero-footprint diagnostics. Machine check analysis, such as access to memory and processor scans, can be done independently.
Newisys has announced it will supply the core technology integration and packaging engineering expertise, but leave the manufacturing and distribution to external OEMs and licensees. It has contract manufacturing arranged through Sanmina SCI and distribution through Avnet.
Newisys partners include some well-known names in IT: Angstrom Microsystems, APPRO, RackSaver, M&A Technology, Microway, New Technology Solutions, Inc., and ProMicro. At the time of this writing, some 600 systems have been fielded for development and evaluation among OEMs, Fortune 500 companies and, of course, Linux Journal.