Application Defined Processors
Application defined processors are based on the concept of reconfigurable computing (RC). RC is a computing technology that blurs the line between software and hardware and provides the basis for the next big steps forward in delivering high performance with reduced power and space requirements. RC is implemented using hardware devices that can be reconfigured. Processors in an RC system are created as hardware that is optimized for the application that executes on it.
This article explains RC, examines SRC systems that implement RC and shows the performance advantage RC provides over traditional microprocessors. We also explore the programming model for RC and discuss the potential RC provides for supporting Open Hardware.
RC is a form of computing based on hardware that can be created dynamically for each application that will run in it. RC hardware is composed of chips whose logic is defined dynamically rather than at the time the chips are fabricated. RC has been around for many years and has been implemented in a number of different hardware components, such as field programmable gate arrays (FPGAs), field programmable object arrays (FPOAs) and complex programmable logic devices (CPLDs). What is important to application developers is that today's reconfigurable chips have a clock rate and capacity that make it practical to do large-scale computing with RC hardware.
The most familiar chip type used to implement RC is the FPGA. An FPGA is a chip whose configuration is defined by SRAM memory cells. FPGAs contain logic gates, flip-flops, RAMs, arithmetic cores, clocks and configurable wires to provide interconnection. FPGAs can be configured to implement any arbitrary logic function and, therefore, can be used to create custom processors that can be optimized for an application.
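The claim that an FPGA cell can implement any arbitrary logic function can be made concrete with a small software model. The sketch below is my own illustration, not SRC hardware: it models a 4-input lookup table (LUT), the basic building block of most FPGAs, where the 16-entry truth table plays the role of the SRAM configuration cells.

```python
# Minimal model of a 4-input FPGA lookup table (LUT).
# The 16-entry truth table stands in for the SRAM configuration
# cells: loading a different table "reconfigures" the cell to
# compute a different Boolean function of its four inputs.

def make_lut(truth_table):
    """Return a 4-input logic function defined by a 16-entry truth table."""
    assert len(truth_table) == 16
    def lut(a, b, c, d):
        index = (a << 3) | (b << 2) | (c << 1) | d
        return truth_table[index]
    return lut

# Configure one cell as a 4-input AND, another as a 4-input XOR.
and4 = make_lut([0] * 15 + [1])
xor4 = make_lut([bin(i).count("1") & 1 for i in range(16)])

print(and4(1, 1, 1, 1))  # -> 1
print(xor4(1, 1, 0, 0))  # -> 0
```

Reloading the truth table is the software analogue of reconfiguring the chip: the same physical cell becomes a different piece of logic.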
So, a collection of FPGAs could be configured to be a MIPS, SPARC, PowerPC or Xeon processor, or a processor of your own design. In fact, the processor need not even be an instruction processor. It could be a direct execution logic (DEL) processor that contains only computational logic requiring no instructions to define the algorithm.
DEL processors hold great potential for high performance. A DEL processor can be created with exactly the resources required to perform a specific algorithm. Traditional instruction processors have fixed resources (adders, multipliers, registers and cache memory) and require significant chip real estate and processing power to implement overhead operations, such as instruction decode and sequencing, and cache management.
DEL processors are reconfigurable computers created for each application in contrast to a fixed architecture microprocessor where one size fits all. A DEL processor delivers the most efficient circuitry for any particular application in terms of the precision of the functional units and parallelism that can be found in the algorithm. Being reconfigurable, a unique DEL processor can be created for each application in a fraction of a second.
But why do you care that a DEL processor can be created dynamically for an application, and that it uses its chips more effectively than a microprocessor? The answer is simple: performance and power efficiency. A DEL RC processor can be created with all of the parallelism that exists within an algorithm without the overhead present in a microprocessor. To keep the discussion concrete, RC processors are assumed to be implemented using FPGAs for the remainder of this article.
Performance in RC processors comes from parallel execution of logic. RC processors are completely parallel. In fact, the task of constructing the logic for a given algorithm is to coordinate the parallel execution such that intermediate results are created, communicated and retained at the proper instants in time.
A DEL processor is constructed as a network of functional units connected with data paths and control signals. Each computational element in the network becomes active with each clock pulse. Figure 1 shows a fragment of logic for computing an expression and contrasts the utilization of the chip versus a von Neumann instruction processor, like the Intel Pentium 4 microprocessor.
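As a rough software analogue of such a network (my own illustration, not the circuit in Figure 1), the sketch below steps a tiny pipelined datapath for the expression (a + b) * (c - d) one clock at a time. On every tick the adder and subtractor fire in parallel on the current operands while the multiplier works on the previous pair, so once the pipeline fills, a result completes on every clock.

```python
# Clock-stepped model of a two-stage pipelined datapath for
# (a + b) * (c - d). Stage 1: adder and subtractor in parallel.
# Stage 2: multiplier. After the pipeline fills, one result
# completes on every clock tick.

def run_pipeline(inputs):
    sum_reg = diff_reg = None   # registers between stage 1 and stage 2
    results = []
    # Feed one operand tuple per clock, then one extra tick to drain.
    for tick in range(len(inputs) + 1):
        # Stage 2: multiplier consumes last tick's stage-1 outputs.
        if sum_reg is not None:
            results.append(sum_reg * diff_reg)
        # Stage 1: adder and subtractor fire concurrently.
        if tick < len(inputs):
            a, b, c, d = inputs[tick]
            sum_reg, diff_reg = a + b, c - d
        else:
            sum_reg = diff_reg = None
    return results

print(run_pipeline([(1, 2, 5, 3), (4, 4, 9, 1)]))  # -> [6, 64]
```

The point of the model is the control structure: nothing fetches or decodes instructions; the "program" is the wiring between the functional units.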
Even though a microprocessor can operate at a clock frequency of 3GHz and the FPGA chips operate in the 100–300MHz frequency range, the parallelism and internal bandwidth of a DEL processor can outperform the microprocessor by orders of magnitude in delivered performance. Figure 2 presents some benchmark comparisons between SRC's DEL processor, MAP, and a typical von Neumann instruction processor, the Intel Xeon 2.8GHz microprocessor. Parallel execution of exactly the required number of functional units, high internal bandwidth, elimination of instruction processing overhead and load/store elimination all contribute to overcoming the 30× difference in clock frequency between the MAP and the Intel microprocessor.
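A back-of-the-envelope throughput model shows why the clock gap alone is not decisive. The figures below are illustrative assumptions, not SRC benchmark data: if a DEL pipeline completes, say, 200 operations on every 100MHz clock, its raw throughput exceeds that of a 3GHz processor retiring a couple of operations per cycle.

```python
# Illustrative throughput comparison (assumed figures, not benchmarks).
# Throughput = operations completed per cycle * clock frequency.

def throughput_gops(ops_per_cycle, clock_hz):
    return ops_per_cycle * clock_hz / 1e9

micro = throughput_gops(ops_per_cycle=2, clock_hz=3.0e9)       # 3GHz microprocessor
del_proc = throughput_gops(ops_per_cycle=200, clock_hz=100e6)  # 100MHz FPGA pipeline

print(f"microprocessor: {micro:.1f} Gops/s")     # -> 6.0 Gops/s
print(f"DEL processor:  {del_proc:.1f} Gops/s")  # -> 20.0 Gops/s
print(f"speedup: {del_proc / micro:.1f}x")       # -> 3.3x
```

With deeper parallelism per clock (the assumed 200 here is modest for a large FPGA), the same arithmetic yields the orders-of-magnitude gaps the benchmarks report.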
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
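That find-plus-grep combination is typically a shell one-liner; the Python sketch below (my own example, with `/home` and the string "ERROR" as placeholder values) spells out the same walk-and-search by hand to show what the composed tools actually do.

```python
# Walk a directory tree for .log files and report those containing
# a search string -- roughly what stringing find and grep together
# accomplishes. The root path and pattern are example values.
import os

def find_in_logs(root, needle):
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".log"):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, errors="replace") as f:
                        if any(needle in line for line in f):
                            matches.append(path)
                except OSError:
                    pass  # skip unreadable files, as grep would
    return matches

print(find_in_logs("/home", "ERROR"))
```

The shell version wins on brevity, which is exactly the erector-set point: two small tools compose into what takes a page of general-purpose code.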
Cron traditionally has been considered another such tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers. Register Now!
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn't consider the total cost of ownership, and it doesn't consider the advantage of real processing power, high availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here, just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future. Get the Guide