Moving to SMP
To demonstrate the performance difference a second CPU makes, I ran benchmarks with Linux kernel compilation, the distributed.net rc5des encryption breaker and POV-Ray's ray tracer (see Table 1). All three take direct advantage of multiple CPUs. POV-Ray can also directly use CPUs spread across a network. All figures are averages of three runs.
Recompilation of the uniprocessor 2.2.7 kernel took 376.91 seconds when running under the same kernel. Recompilation of the SMP 2.2.7 kernel, running under the same SMP kernel, took 395.04 seconds when run on only one CPU, 5 percent longer than the uniprocessor compilation time. When run on two CPUs (make -j 2 bzImage), the compilation took 302.77 seconds, 80 percent of the uniprocessor compilation time.
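The percentages above follow directly from the three timings. As a quick check, here is a minimal Python sketch of the arithmetic; the variable names are my own, and only the three times in seconds come from the measurements above.

```python
# Kernel compilation times in seconds, as reported above.
uni = 376.91       # uniprocessor kernel, one CPU
smp_one = 395.04   # SMP kernel, restricted to one CPU
smp_two = 302.77   # SMP kernel, two CPUs (make -j 2 bzImage)

# Extra time the SMP kernel costs when it can use only one CPU.
overhead_pct = (smp_one / uni - 1) * 100

# Two-CPU compile time as a share of the uniprocessor time.
two_cpu_share_pct = smp_two / uni * 100

# Speedup from the second CPU, against each single-CPU baseline.
speedup_vs_uni = uni / smp_two
speedup_within_smp = smp_one / smp_two

if __name__ == "__main__":
    print(f"SMP kernel overhead on one CPU: {overhead_pct:.1f}%")
    print(f"Two-CPU time vs. uniprocessor:  {two_cpu_share_pct:.1f}%")
    print(f"Speedup vs. uniprocessor:       {speedup_vs_uni:.2f}x")
    print(f"Speedup within the SMP kernel:  {speedup_within_smp:.2f}x")
```

Comparing the two SMP runs with each other isolates the gain from the second CPU, while comparing against the uniprocessor kernel also counts the SMP kernel's own overhead.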
For POV-Ray, I used the benchmark source file, skyvase.pov, available from POV-Ray's web site. I ran it at xpvmpov's default resolution of 320x240. The SMP run took 72 percent of the time of the uniprocessor run.
The rc5des code cracker performed its benchmark at nearly the same rate under both uniprocessor and SMP kernels. In actual operation, it will run on as many CPUs as requested, or it will detect the number of CPUs automatically. I believe the difference between the two kernels was so much smaller here because the client is heavily optimized for maximum performance. It most likely runs within the level 1 (L1) cache as much as possible.
SMP may improve performance in other ways. GUI operations may benefit from having the X server run on one CPU while an application runs on another. Anything that runs well on one CPU but can take advantage of another will benefit from using SMP. I now run the SETI@home client on all CPUs I have that run Linux.
Both L1 and L2 cache quantity and speed matter. RAM speed matters. The Intel P5-233MMX contains a 32KB L1 cache, split into a 16KB code cache and a 16KB data cache. My wife's AMD K6-200MMX contains a 64KB L1 cache, split into a 32KB code cache and a 32KB data cache. For some tasks, it performs faster than one Intel P5-233MMX. Intel Pentium Pro CPUs have both L1 and L2 cache on board, with up to 1MB of L2. Pentium II Xeon CPUs have up to 2MB of L2 cache on board. New CPUs also run their caches faster. More cache on the CPU means less contention for external cache and main RAM, which means higher performance. The CPUs, through the support chip set, cooperate among themselves to maintain cache coherency, so that they always maintain accurate views of RAM.
Locking a process to one CPU, particularly when that process's code and data fit in the L1 cache, may also improve performance. Linux does not support this as fully as more mature UNIX variants, but it probably will soon.
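To illustrate what locking a process to one CPU looks like in practice, here is a minimal Python sketch. It assumes a much later Linux kernel than the 2.2 series discussed here, plus Python 3.3 or newer, where os.sched_getaffinity and os.sched_setaffinity are available. The helper name pin_to_one_cpu is my own, and nothing like it existed on the systems benchmarked above.

```python
import os


def pin_to_one_cpu():
    """Restrict the current process to a single CPU, if the platform allows it.

    Returns the CPU number chosen, or None when affinity control is
    unavailable (for example, on a non-Linux system).
    """
    if not hasattr(os, "sched_getaffinity"):
        return None

    # PID 0 means "the calling process". Choose only from CPUs this
    # process is already permitted to run on.
    allowed = sorted(os.sched_getaffinity(0))
    target = allowed[0]

    # From here on, the scheduler keeps this process on `target`, so its
    # working set is not repeatedly reloaded into another CPU's cache.
    os.sched_setaffinity(0, {target})
    return target


if __name__ == "__main__":
    cpu = pin_to_one_cpu()
    if cpu is None:
        print("CPU affinity is not supported on this platform")
    else:
        print(f"Process {os.getpid()} is now pinned to CPU {cpu}")
```

Pinning pays off only when the process's working set actually stays in that CPU's cache; for a process that is mostly waiting on I/O, it is unlikely to make a measurable difference.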
Do I need SMP for what I do? No. A single 200MHz P5-class processor can adequately perform the tasks I want to perform. As with most tasks, adequate memory, both RAM and cache, contributes more to performance than the number of processors. Do I have fun with it? Oh, yes.