Moving to SMP
To demonstrate the performance difference a second CPU makes, I ran benchmarks with Linux kernel compilation, the distributed.net rc5des encryption breaker and POV-Ray's ray tracer (see Table 1). All three take direct advantage of multiple CPUs. POV-Ray can also directly use CPUs spread across a network. All figures represent averages of three runs.
Recompilation of the uniprocessor 2.2.7 kernel took 376.91 seconds when running under the same kernel. Recompilation of the SMP 2.2.7 kernel, running under the same SMP kernel, took 395.04 seconds when run on only one CPU, 5 percent longer than the uniprocessor compilation time. When run on two CPUs (make -j 2 bzImage), the compilation took 302.77 seconds, 80 percent of the uniprocessor compilation time.
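The percentages quoted above follow directly from the three timings. As a minimal sketch of the arithmetic (the variable and function names are my own, chosen for illustration; only the three timings come from the measurements above):

```python
# Kernel compilation timings in seconds, copied from the text above
# (each is an average of three runs).
UNI_KERNEL = 376.91    # uniprocessor 2.2.7 kernel
SMP_ONE_CPU = 395.04   # SMP 2.2.7 kernel, build run on one CPU
SMP_TWO_CPUS = 302.77  # SMP 2.2.7 kernel, make -j 2 bzImage


def relative_time(measured, baseline):
    """Return the measured time as a fraction of the baseline time."""
    return measured / baseline


def speedup(measured, baseline):
    """Return how many times faster the measured run is than the baseline."""
    return baseline / measured


# Extra time the SMP kernel costs when only one CPU does the work.
smp_overhead = relative_time(SMP_ONE_CPU, UNI_KERNEL) - 1.0
# Two-CPU build time as a fraction of the uniprocessor build time.
two_cpu_fraction = relative_time(SMP_TWO_CPUS, UNI_KERNEL)
# The same comparison expressed as a speedup factor.
two_cpu_speedup = speedup(SMP_TWO_CPUS, UNI_KERNEL)

print(f"SMP kernel overhead on one CPU: {smp_overhead:.1%}")
print(f"Two-CPU time vs. uniprocessor:  {two_cpu_fraction:.1%}")
print(f"Two-CPU speedup:                {two_cpu_speedup:.2f}x")
```

Expressing the result as a speedup factor rather than a percentage of the original time makes it easier to compare against the ideal of 2x for two CPUs.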
For POV-Ray, I used the benchmark source file, skyvase.pov, available from POV-Ray's web site. I ran it at xpvmpov's default resolution of 320x240. The SMP run took 72 percent of the time of the uniprocessor run.
The rc5des code cracker ran its benchmark at nearly the same rate under both the uniprocessor and SMP kernels. In actual operation, it runs on as many CPUs as requested or detects the number of CPUs automatically. I believe the difference between the two kernels was much smaller here because the client is heavily optimized for maximum performance; it most likely runs within the level 1 (L1) cache as much as possible.
SMP may improve performance in other ways. GUI operations may benefit from having the X server run on one CPU while an application runs on another. Anything that runs well on one CPU but can take advantage of another will benefit from using SMP. I now run the SETI@home client on all CPUs I have that run Linux.
Both the size and the speed of the L1 and L2 caches matter, as does RAM speed. The Intel P5-233MMX contains a 32KB L1 cache, split into a 16KB code cache and a 16KB data cache. My wife's AMD K6-200MMX contains a 64KB L1 cache, split into a 32KB code cache and a 32KB data cache. For some tasks, it performs faster than one Intel P5-233MMX. Intel Pentium Pro CPUs have both L1 and L2 cache on board, with up to 1MB of L2. Pentium II Xeon CPUs have up to 2MB of L2 cache on board. Newer CPUs also run their caches faster. More cache on the CPU means less contention for external cache and main RAM, which means higher performance. The CPUs, through the support chip set, cooperate among themselves to maintain cache coherency, so that each always has an accurate view of RAM.
Locking a process to one CPU, particularly when that process's code and data fit in the L1 cache, may also improve performance. Linux does not support this as fully as more mature UNIX variants, but it probably will soon.
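For readers on later Linux systems, here is a minimal sketch of what locking a process to one CPU can look like. It assumes a kernel and Python version that expose the `os.sched_setaffinity` and `os.sched_getaffinity` calls, which is not the 2.2.7 kernel benchmarked here; the helper name `pin_to_cpu` is hypothetical and exists only for this illustration.

```python
import os


def pin_to_cpu(cpu_index, pid=0):
    """Restrict process `pid` (0 means the calling process) to one CPU.

    Returns the resulting affinity set, or None on platforms that do not
    provide the scheduler affinity calls.
    """
    if not hasattr(os, "sched_setaffinity"):
        # Assumption: absence of the call means affinity is unsupported here.
        return None
    allowed = os.sched_getaffinity(pid)
    if cpu_index not in allowed:
        raise ValueError(f"CPU {cpu_index} is not available to this process")
    # Ask the scheduler to run the process only on the chosen CPU.
    os.sched_setaffinity(pid, {cpu_index})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        # Pick the lowest-numbered CPU this process may currently use.
        target = min(os.sched_getaffinity(0))
        print("Now restricted to CPUs:", pin_to_cpu(target))
    else:
        print("CPU affinity calls are not available on this platform")
```

Whether pinning actually helps depends on the workload; it is most likely to pay off in the case described above, where the working set fits in the cache of the chosen CPU.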
Do I need SMP for what I do? No. A single 200MHz P5-class processor can adequately perform the tasks I want to perform. As with most tasks, adequate memory, both RAM and cache, contributes more to performance than the number of processors. Do I have fun with it? Oh, yes.