Scaling Linux to New Heights: the SGI Altix 3000 System

With 64 processors and 512GB of memory, SGI claims the title of world's most powerful Linux system.
Other Enhancements to Linux for HPC

SGI ProPack also includes several tools and libraries that improve performance on large NUMA systems, whether a single application needs large numbers of CPUs and a large amount of memory to solve a complex problem or multiple applications run simultaneously within the same large system. On Linux, SGI provides the commands cpuset and dplace, which give predictable and improved CPU and memory placement control for HPC applications. These tools help unrelated jobs carve out and use the resources they each need without getting in each other's way, and they help prevent a smaller job from inadvertently thrashing across a larger pool of resources than it can effectively use. As a result, system resources are used efficiently and results are delivered in a consistent time period, two characteristics critical to HPC environments.
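The placement idea can be illustrated with a minimal sketch. This is not the cpuset or dplace interface, whose syntax is not shown in this article; it assumes the standard Linux affinity call that Python exposes as os.sched_setaffinity. The helper names partition_cpus and pin_current_process are hypothetical. The sketch splits a pool of CPU ids into disjoint groups so that unrelated jobs do not overlap.

```python
import os


def partition_cpus(cpus, sizes):
    """Split a list of CPU ids into disjoint, contiguous groups, one per job."""
    if sum(sizes) > len(cpus):
        raise ValueError("jobs request more CPUs than are available")
    groups, start = [], 0
    for n in sizes:
        groups.append(set(cpus[start:start + n]))
        start += n
    return groups


def pin_current_process(cpu_set):
    """Restrict the calling process to cpu_set (Linux only; returns None elsewhere)."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cpu_set)      # 0 means "the calling process"
        return os.sched_getaffinity(0)
    return None


# Hypothetical example: a 64-CPU pool shared by a 48-CPU job and a 16-CPU job.
big_job, small_job = partition_cpus(list(range(64)), [48, 16])
```

Each job would then call pin_current_process with its own group before starting work. Only CPUs that actually exist on the machine can be passed to the affinity call, so on a smaller system the pool should be built from os.sched_getaffinity(0) rather than range(64).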

Also, the SGI Message Passing Toolkit (MPT) in SGI ProPack provides industry-standard message passing libraries optimized for SGI computers. MPT contains MPI and SHMEM APIs, which transparently exploit the low-level capabilities of the SGI hardware, such as its block transfer engine (BTE) for fast memory-to-memory transfers and the hardware memory controller's fetch operation (fetchop) support. Fetchop support enables direct communication and synchronization between multiple MPI processes while eliminating the overhead associated with system calls to the operating system.

The SGI ProPack NUMA tools, HPC libraries and additional software support layered on top of a standard Linux distribution provide a powerful HPC software environment for big compute and data-intensive workloads. Much like a custom ASIC on hardware providing the “glue logic” to leverage and use commodity processors, memory and I/O parts, SGI ProPack software provides the “glue logic” to leverage the Linux operating system as a commodity building block for large HPC environments.


No one believed Linux could scale so well, so soon. By combining Linux with SGI NUMAflex system architecture and Itanium 2 processors, SGI has built the world's most powerful Linux system. Bringing the SGI Altix 3000 system to market involved a tremendous amount of work, and we consider it to be only the beginning. The aggressive standards-based strategy that SGI has for using Linux on Itanium 2-based systems is raising the bar on what Linux can do while providing customers an exciting, no-compromises alternative for large HPC servers and supercomputers. SGI engineers—and the entire company for that matter—are fully committed to building on Linux capabilities and pushing the envelope even further to bring more exciting breakthroughs and opportunities for the Linux and HPC communities.

Steve Neuner has been working in UNIX kernel development for the past 19 years at major computer manufacturers including MAI Basic Four, Sequent Computer Systems, Digital Equipment Corporation and SGI. Now with SGI, Steve is the Linux engineering director and has been working on Linux and Itanium-based systems since joining SGI four years ago.




Couple more questions for Steve on CPUs and the technology.

Anonymous

What was the process of choosing the Itanium 2 CPU? What are the benefits of this platform in your view? Would other CPUs have worked as well (PowerPC, MIPS, AMD/Hammer), or could there be processor module versions based on these models in the future? Can the traditional workstation market benefit from some of this technology (e.g., a small case with support for only 4 modules)?



Re: Scaling Linux to New Heights: the SGI Altix 3000 System

Anonymous

how many bogomips does this monster achieve 8-) ?

Re: BogoMIPS and Pricing

SteveNeuner

> how many bogomips does this monster achieve 8-) ?


First, please keep in mind that bogomips only times the cpu delay loop. Thus, a bogomips value says nothing about memory bandwidth, parallel instructions, cache sizes, etc. So while it may be interesting to know how fast a cpu spins while doing nothing, it doesn't say much about what it can do with real work. :^)

Second, the method for calculating bogomips varies depending on the processor, so this number should not be used to compare different processor types. For example, on the Intel Itanium processor, it times the following loop (a single bundle using an instruction and loop register that have been optimized for looping):

    2e0: nop.m 0x0
    2e6: nop.i 0x0
    2ec: br.cloop.sptk.few 2e0 ;;

On i386 with the rdtsc instruction (Pentium or better), by contrast, bogomips times the following loop (5 instructions, 3 registers and a read of the hardware clock):

    10: f3 90 repz nop
    12: 0f 31 rdtsc
    14: 29 c8 sub %ecx,%eax
    16: 39 d8 cmp %ebx,%eax
    18: 72 f6 jb 10

With the caveat that comparing this number between different processor types is an apples-to-oranges comparison, and FWIW, here's the info taken from a 64-processor (Intel Itanium 2) SGI Altix(tm) system:

    [root@parrot root]# grep processors /var/log/dmesg
    Total of 64 processors activated (76359.40 BogoMIPS).
    All processors have done init_idle
    [root@parrot root]#
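To put that total in perspective, here is a small sketch that parses the summary line above and works out the per-processor average. It also mirrors, as an assumption about how the mainline kernel prints the figure, the integer arithmetic loops_per_jiffy/(500000/HZ). The function names are hypothetical, and the loops_per_jiffy and HZ values in the usage note are illustrative, not measured on the Altix.

```python
import re


def parse_bogomips_total(line):
    """Extract (processor_count, total_bogomips) from a boot-log summary line."""
    m = re.search(r"Total of (\d+) processors activated \(([\d.]+) BogoMIPS\)", line)
    if m is None:
        raise ValueError("not a BogoMIPS summary line")
    return int(m.group(1)), float(m.group(2))


def format_bogomips(loops_per_jiffy, hz):
    """Format a delay-loop calibration the way the kernel is assumed to print it."""
    whole = loops_per_jiffy // (500000 // hz)        # integer part
    frac = (loops_per_jiffy // (5000 // hz)) % 100   # two decimal digits
    return f"{whole}.{frac:02d}"


line = "Total of 64 processors activated (76359.40 BogoMIPS)."
count, total = parse_bogomips_total(line)
per_cpu = total / count
print(f"{count} CPUs, {per_cpu:.2f} BogoMIPS each on average")
```

The average works out to about 1193 BogoMIPS per processor. As an illustration of the formatting step only, format_bogomips(596557, 1000) gives "1193.11" for a made-up calibration value and tick rate.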

> And the price tag?

Please refer to the pricing and availability information in the press release at:

Hope that helps.

Steve Neuner


Re: Scaling Linux to New Heights: the SGI Altix 3000 System

Anonymous

And the price tag?