An Overview of Intel's MMX Technology
Commercially introduced in January 1997, the MMX technology is an extension of the Intel architecture that uses a single-instruction, multiple-data execution model that allows several data elements to be processed simultaneously. Applications that benefit from the MMX technology are those that do many parallelizable computations using small integer numbers. Examples of these kinds of applications are 2-D/3-D graphics, image processing, virtual reality, audio synthesis and data compression.
If your Linux system has a Pentium II or a Pentium with MMX technology, you can build programs that take advantage of the MMX instruction set using gcc and a bit of assembly language. In this article, I will briefly introduce the main features of the MMX technology, explain how to detect whether an x86 microprocessor has built-in MMX capabilities and show how to program a simple image processing application.
The assembly language code presented here uses NASM, the Netwide Assembler. NASM employs the standard Intel syntax instead of the AT&T syntax used on many popular UNIX assemblers, such as GAS.
The MMX technology extends the Intel architecture by adding eight 64-bit registers and 57 instructions. The new registers are named MM0 to MM7 (see Figure 1). Depending on which instructions we use, each register may be interpreted as one 64-bit quadword, two packed 32-bit double words, four packed 16-bit words, or eight packed 8-bit bytes (see Figure 2).
The MMX instruction set comprises several categories of instructions, including those for arithmetic, logical, comparison, conversion and data transfer operations.
The syntax for MMX instructions is similar to other x86 instructions:
OP Destination, Source
This line is interpreted as:
Destination = Destination OP SourceExcept for the data transfer instructions, the destination operand must always be any MMX register. The source operand can be a datum stored in a memory location or in an MMX register. A few specific MMX instructions will be discussed further on.
Before running a program that uses MMX instructions, it is important to make sure your microprocessor actually has MMX support. Your Linux system should be an Intel x86 or compatible microprocessor (386, 486, Pentium, Pentium Pro, Pentium II, or any of the Cyrix or AMD clones). This is easily checked by executing the uname -m command. This command should return i386, i486, i586 or i686. If it does not, your Linux system runs on a non-x86 architecture.
In order to determine if your CPU supports MMX technology, use the assembly language CPUID instruction. This instruction reveals important processor information, such as its vendor, family, model and cache information. Unfortunately, the CPUID instruction is present only on some late 80486 processors and above. So, how do you know if CPUID is available on your system? Intel documents the following trick: if your program can modify bit 21 of the EFLAGS register, then the CPUID instruction is available; otherwise, you are working with an aged CPU. See Listing 1 (lines 12-29) to learn how this can be done.
Next, request CPU feature information by putting a value of 1 in the EAX register and executing the instruction. The resulting value is returned in bit 23 of the EDX register. If this bit is on, the processor supports the MMX instruction set; otherwise, it does not. Listing 1 (lines 43-50) shows how to do this.
Programs should contain two versions of the same routine: one using MMX technology and one using regular scalar code. At runtime, the program can decide which routine it should actually call.
If MMX instructions are executed in a system that does not support them, the CPU will raise an “invalid opcode exception” (interrupt vector number 6) which is trapped by the Linux kernel. The Linux kernel in turn sends an “illegal instruction signal” (code number 4) to the offending process. By default, this action terminates the program and generates a core file.
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
|Non-Linux FOSS: Seashore||May 10, 2013|
- Dynamic DNS—an Object Lesson in Problem Solving
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- New Products
- A Topic for Discussion - Open Source Feature-Richness?
- Drupal Is a Framework: Why Everyone Needs to Understand This
- Validate an E-Mail Address with PHP, the Right Way
- RSS Feeds
- Readers' Choice Awards
- Tech Tip: Really Simple HTTP Server with Python
- BASH script to log IPs on public web server
1 hour 15 min ago
4 hours 51 min ago
- Reply to comment | Linux Journal
5 hours 23 min ago
- All the articles you talked
7 hours 47 min ago
- All the articles you talked
7 hours 50 min ago
- All the articles you talked
7 hours 52 min ago
12 hours 16 min ago
- Keeping track of IP address
14 hours 7 min ago
- Roll your own dynamic dns
19 hours 21 min ago
- Please correct the URL for Salt Stack's web site
22 hours 32 min ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?