The Role of Linux in Grid Computing
Today, applications are developed for a specific platform or hosting environment, for example Linux, Windows 2000, various UNIX flavors, mainframes, J2EE, Microsoft .NET and so on. Such computing tends to operate within a monolithic framework in which applications contend for whatever resources that single platform makes available. On a platform with limited resources, availability decreases as the demand for service grows. If, at such a time, requests could be serviced by resources from other systems, the strain on the native system would be reduced considerably and the quality of service would improve.
This is the objective grid computing aims to meet: to virtualize, manage and allocate distributed physical resources (processing power, memory, storage, networking) to applications and users on an as-needed (on-demand) basis, regardless of the resources' location. Grid networks transcend physical components, organizational units, enterprise infrastructure and geographic boundaries. Naturally, software plays a vital role in determining the success of grid computing. In this article, we focus on the role of Linux in grid computing.
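As a rough illustration of allocating resources on demand regardless of location, the following Python sketch places a job on whichever host currently has the most free CPUs. It is a toy model, not part of OGSA or the Globus Toolkit; the names Host, allocate, free_cpus and the site labels are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A hypothetical grid member; 'site' stands in for its physical location."""
    name: str
    site: str
    free_cpus: int
    jobs: list = field(default_factory=list)


def allocate(job_name, cpus_needed, hosts):
    """Place a job on the host with the most free CPUs, wherever it is located.

    Returns the chosen Host, or None if no host can satisfy the request.
    """
    candidates = [h for h in hosts if h.free_cpus >= cpus_needed]
    if not candidates:
        return None
    best = max(candidates, key=lambda h: h.free_cpus)
    best.free_cpus -= cpus_needed  # reserve the capacity for this job
    best.jobs.append(job_name)
    return best


if __name__ == "__main__":
    hosts = [Host("local", "site-a", 2), Host("remote", "site-b", 8)]
    chosen = allocate("render", 4, hosts)
    print(chosen.name if chosen else "no capacity")
```

In this sketch the job needs more CPUs than the local host has free, so it lands on the remote host. A real grid would also have to handle security, discovery and data movement, which the sketch leaves out.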
Four arguments can be made for Linux becoming the basis of grid computing:
1. Open Grid Services Architecture (OGSA) is a service architecture built on the open-source paradigm of community participation and sharing code. According to the father of the grid, Ian Foster, a chief scientist at Argonne National Laboratory, the long-term success of grid computing depends on four issues: open standards, open software, open infrastructure and commercializing grid services to speed enterprise adoption. The development of Linux has progressed along similar lines.
The Globus Toolkit, which formed the basis of OGSA, is a community-based, open-architecture, open-source set of services and software libraries. Globus addresses issues of security, information discovery, resource management, data management, communication, fault detection and portability. Thus, it mirrors the community processes used for the development and evolution of the Linux kernel. Any grid network must accommodate a heterogeneous mix of existing resources. However, future generations of grid networks likely will center around operating system and development environments that support an open and collaborative community process whose infrastructure evolves through an open-source process. Because Linux has evolved from the same open-source process, there is a high degree of affinity between Linux and grid-computing projects. Open standards and protocols lead to the building of services, and services are at the heart of the grid.
2. The grid concept is based on the management and allocation of distributed resources rather than on a vertically integrated, monolithic resource tightly coupled to the underlying operating-system architecture of the platform.
The move from single-platform architectures to grid computing will not happen suddenly. A few computational units will have to be deployed in small, inexpensive increments. The performance of these units will be measured and compared to the expected results. Only if the gain is significant will there be a next round of deployments. This is in contrast to the major investments needed for large-scale monolithic systems, which typically are obsolete within four or five years and thus are a drain on capital and operating budgets.
Linux has gained a reputation for being a highly efficient operating system in simpler application environments running on smaller hardware configurations, the type that will be enabled by the grid architecture. In such experimentation-based systems, the fact that Linux is free will play a crucial role, because it lowers the investment required.
3. Computing grids are virtual, extensible and horizontally scalable and use open network protocols. Many of the early instances of grid networks were developed in the scientific and technical computing environments of universities, technical laboratories, health and bio-informatics consortia. Most of them have relied on native operating system processes for the hosting environment, typically UNIX and Linux. Their experience suggests that Linux is the best platform available for grid computing. There is hardly any evidence of grid-computing projects being deployed on other operating systems, such as Windows 98 or Windows XP.
For example, the TeraGrid is designed to help the National Science Foundation address complex scientific research, including molecular modeling for disease detection, cures and drug discovery, automobile crash simulation and investigations of alternative energy sources. TeraGrid will use more than 3,000 Intel processors running Linux. The Grid Forum, a research consortium, is aiming its research at applications in the oil industry, physical disaster prediction and simulation, biological and ocean modeling, industrial simulations, agriculture applications, health service applications and e-utilities. Many of these applications currently run on UNIX or Linux.
4. Vendor-specific initiatives are promoting Linux. Although IBM's grid architectural block diagrams show the OGSA framework supporting operating system heterogeneity, they also clearly point to the centrality of Linux in IBM's grid strategy. Sun Microsystems has written an edition of its Grid Engine 5.3 software for the version of Linux available through SuSE Linux AG. Other vendors are investing in grid computing as well. Hewlett-Packard has incorporated software specs for massive grids into the Utility Data Center, a computing power-on-demand product that supports Linux. An Oracle spokesman recently said Linux is "the smart option for grid computing". In addition, Oracle recently announced that the Oracle 10g package is grid-enabled and runs smoothly on Linux.
On the whole, Linux is the buzzword as far as the platform for grid computing is concerned. But arguments against pervasive adoption of Linux exist. A few of them are listed below:
1. Grid computing is based on the principle of heterogeneity, where virtual organizations are formed with no discrimination between resources and systems, as long as the standard toolkit services are implemented.
The OGSA model does not specify an operating system. On the contrary, it has been developed so as to invite all computing architectures into the grid-computing family. Given grid computing's emphasis on resource virtualization and usage and the heterogeneous nature of most enterprises' IT infrastructures, enterprises are under no pressure to change their hardware and software to participate in grids or establish internal grids.
2. The grid philosophy does not specify implementations: its fundamental principle is to adapt to the operating system environment of specific hosts and exploit their native capabilities.
The grid architecture neither prescribes nor rules out particular ways of implementing it. Similarly, it does not specify anything about the platform to be used.
3. Grid computing addresses only a small part of the IT infrastructure. Grid computing is exactly what it implies--a mesh of distributed resources whose members share each other's resources through a specifically enforced set of protocols. The heterogeneity that OGSA attempts to address is in itself recognition that IT infrastructures will consist of a mix of computing architectures. The role Linux will play in such heterogeneous environments will depend largely on its performance, reliability and economics when running on such hosts.
Both grid computing and Linux are too immature for us to forecast that Linux will dominate in commercial grid applications. Only the future will tell whether the realm of grid computing is ruled by Linux. One thing is for sure, however: Linux definitely will account for a large share of the grid-computing platform market.