The Role of Linux in Grid Computing
Today, applications typically are developed for a specific platform or hosting environment: Linux, Windows 2000, various UNIX flavors, mainframes, J2EE, Microsoft .NET and so on. Such computing tends to operate within a monolithic framework in which applications contend for whatever resources that single platform makes available. On a platform with limited resources, availability shrinks as demand for service grows. If, at such a time, resources from other systems could be borrowed, or requests could be serviced by other systems, the strain on the native system would be reduced considerably and the quality of service would improve.
It is this objective that grid computing wants to meet. The objective of grid-based computing is to virtualize, manage and allocate distributed physical resources (processing power, memory, storage, networking) to applications and users on an as-needed (on-demand) basis--regardless of the resources' location. Grid networks transcend physical components, organizational units, enterprise infrastructure and geographic boundaries. Naturally, software plays a vital role in determining the success of grid computing. In this article, we focus on the role of Linux in grid computing.
Four arguments can be made for Linux becoming the basis of grid computing:
1. Open Grid Services Architecture (OGSA) is a service architecture built on the open-source paradigm of community participation and sharing code. According to the father of the grid, Ian Foster, a chief scientist at Argonne National Laboratory, the long-term success of grid computing depends on four issues: open standards, open software, open infrastructure and commercializing grid services to speed enterprise adoption. The development of Linux has progressed along similar lines.
The Globus Toolkit, which formed the basis of OGSA, is a community-based, open-architecture, open-source set of services and software libraries. Globus addresses issues of security, information discovery, resource management, data management, communication, fault detection and portability. Thus, it mirrors the community processes used for the development and evolution of the Linux kernel. Any grid network must accommodate a heterogeneous mix of existing resources. However, future generations of grid networks likely will center around operating system and development environments that support an open and collaborative community process whose infrastructure evolves through an open-source process. Because Linux has evolved from the same open-source process, there is a high degree of affinity between Linux and grid-computing projects. Open standards and protocols lead to the building of services, and services are at the heart of the grid.
2. The grid concept is based on the management and allocation of distributed resources rather than on a vertically integrated, monolithic resource tightly coupled to the underlying operating-system architecture of the platform.
The move from single-platform architectures to grid computing will not happen all at once. Organizations will deploy a few computational units in small, inexpensive increments, measure their performance against expected results and proceed to the next round of deployments only if the gain is significant. This contrasts with the major up-front investment required for large-scale monolithic systems, which typically are obsolete within four or five years and thus drain capital and operating budgets.
Linux has gained a reputation as a highly efficient operating system for simpler application environments running on smaller hardware configurations, exactly the type the grid architecture will enable. In such experimentation-driven deployments, the free nature of Linux plays a crucial role because it keeps the initial investment low.
3. Computing grids are virtual, extensible and horizontally scalable, and they use open network protocols. Many of the early grid networks were developed in the scientific and technical computing environments of universities, technical laboratories and health and bio-informatics consortia. Most of them have relied on native operating-system processes for the hosting environment, typically UNIX and Linux. Their experience suggests that Linux is the best platform available for grid computing. There has been hardly any evidence of grid-computing projects deployed on other operating systems, such as Windows 98 or Windows XP.
For example, the TeraGrid is designed to help the National Science Foundation address complex scientific research, including molecular modeling for disease detection, cures and drug discovery, automobile crash simulation and investigations of alternative energy sources. TeraGrid will use more than 3,000 Intel processors running Linux. The Grid Forum, a research consortium, is aiming its research at applications in the oil industry, physical disaster prediction and simulation, biological and ocean modeling, industrial simulations, agriculture applications, health service applications and e-utilities. Many of these applications currently run on UNIX or Linux.
4. Vendor-specific initiatives are promoting Linux. Although IBM's grid architectural block diagrams show the OGSA framework supporting operating system heterogeneity, they also clearly point to the centrality of Linux in IBM's grid strategy. Sun Microsystems has written an edition of its Grid Engine 5.3 software for the version of Linux available through SuSE Linux AG. Other vendors are investing in grid computing as well. Hewlett-Packard has incorporated software specs for massive grids into the Utility Data Center, a computing power-on-demand product that supports Linux. An Oracle spokesman recently said Linux is "the smart option for grid computing". In addition, Oracle recently announced that the Oracle 10g package is grid-enabled and runs smoothly on Linux.
On the whole, Linux is the presumptive platform for grid computing. But arguments against its pervasive adoption do exist. A few of them are listed below:
1. Grid computing is based on the principle of heterogeneity, where virtual organizations are formed with no discrimination between resources and systems, as long as the standard toolkit services are implemented.
The OGSA model does not specify an operating system. On the contrary, it has been developed so as to invite all computing architectures into the grid-computing family. Given grid computing's emphasis on resource virtualization and usage and the heterogeneous nature of most enterprises' IT infrastructures, enterprises are under no pressure to change their hardware and software to participate in grids or establish internal grids.
2. The grid philosophy does not specify implementations: its fundamental principle is to adapt to the operating system environment of specific hosts and exploit their native capabilities.
The grid architecture neither prescribes nor prohibits any particular implementation approach. Similarly, it specifies nothing about the platform to be used.
3. Grid computing addresses only a small part of the IT infrastructure. Grid computing is exactly what the name implies--a mesh of distributed resources whose members share one another's resources through a specifically enforced set of protocols. The heterogeneity that OGSA attempts to address is in itself recognition that IT infrastructures will consist of a mix of computing architectures. The role Linux will play in such heterogeneous environments will depend largely on its performance, reliability and economics when running on such hosts.
Both grid computing and Linux are too immature for us to forecast that Linux will dominate commercial grid applications. Only the future will tell whether the realm of grid computing is ruled by Linux. One thing is for sure, however: Linux will form a large chunk of the grid-computing platform market.
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality means UNIX system administrators always seem to have the right tool for the job.
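That kind of composition can be a single pipeline. A minimal sketch, assuming GNU find, xargs and grep, and using "ERROR" as a stand-in for the entry being searched for:

```shell
# Find every .log file under /home and search each one for "ERROR";
# -print0/-0 keep filenames containing spaces or newlines intact.
find /home -name '*.log' -type f -print0 | xargs -0 grep -l 'ERROR'
```

Here, grep -l prints only the names of the files that contain a match, which is usually the right shape of output to feed to the next tool in the chain.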
Cron traditionally has been considered another such tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to upgrade your job-scheduling infrastructure. The second part presents an actual planning and implementation framework.
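For reference, this is the kind of entry cron handles well: a fixed-time, fire-and-forget job. The script path below is hypothetical; the point is what the crontab line does not express (dependencies, retries, cross-host coordination), which is where dedicated schedulers come in.

```
# min hour dom mon dow  command
# Run a (hypothetical) log-scanning script at 2:30am every day,
# appending all output to a log file.
30 2 * * * /usr/local/bin/scan-logs.sh >> /var/log/scan-logs.out 2>&1
```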
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers. Register Now!
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn't consider the total cost of ownership, and it doesn't consider the advantage of real processing power, high availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future. Get the Guide