/var/opinion - Parallel Is Coming into Its Own
I started writing about computing back in the 1980s. I don't want to say which year, or do the math for how long I've been doing this. It makes me feel old.
I've made a plethora of predictions since then. Some of them left me red-faced and embarrassed. Some of them were spot-on. Some of them have not yet been fulfilled, but I still think my predictions are on target.
One of my earliest predictions was rather easy, but it was considered controversial back in the 1980s. I said that it was only a matter of time before we bumped up against the limits of Moore's Law, and the only viable answer would be parallel processing. Lo and behold, dual-core processors are now common, quad-core processors won't be far behind, and the multicore Cell processor in the PlayStation 3 is around the corner.
Naturally, the next logical step is clustering or other means of distributed processing. Here's where I begin to get nervous. When “grid computing” became a buzzword, my knee-jerk reaction was, “no, thanks”. I don't work in a company office anymore, but if I did, I wouldn't want the company off-loading processing to my desktop workstation unless I was certain that everything ran in a completely isolated sandbox. Put the grid processes in a chroot environment on Linux, for example. Even then, I'm not sure I'd be happy about the idea. What if I want to do something compute-intensive, and the grid process decides it wants my CPU cycles more than I do? This isn't supposed to happen, but since it's all in the hands of some administrator with his or her own agenda, why should I trust that it won't happen?
It's the lack of control and fear of security breaches that make me nervous. I've got four computers in my home that nobody ever turns off, and two more for special purposes that I turn on as needed. The two hand-me-down computers my kids use sit idle much of the time, unless my daughter is browsing the Web, or my son is playing World of Warcraft. I use a server as a centralized provider of resources such as printers, files and e-mail. It's a very old machine, but it never breaks a sweat given its purpose. All this represents a tremendous amount of wasted processing power. I'd love to tap into that unused power at home. This is a safe environment, because I'm not talking about exposing my processing power to everyone on the Internet. I'm talking about distributing workloads across local machines.
In principle, however, Sun was right all along when it said, “the network is the computer”. Other companies, such as IBM, worked along the same lines before Sun did, but I don't know of any company that said it better than Sun. “The network is the computer” is a powerful phrase. As long as there is adequate security built into every aspect of distributed processing, it makes perfect sense to provide common services as remote procedure calls and distribute every conceivable workload across as many computers as you want to make available to the system. If someone could make me feel comfortable about security and control, I'd buy into distributed processing in a big way.
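To make the remote-procedure-call idea concrete, here is a minimal sketch using the XML-RPC modules in Python's standard library. The `word_count` service, the loopback address and the OS-chosen port are illustrative assumptions only. The sketch also leaves out the authentication and encryption that any real deployment would need, which is exactly the security gap discussed above.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def word_count(text):
    """Hypothetical 'common service': count whitespace-separated words."""
    return len(text.split())


def call_remote_word_count(text):
    """Start a throwaway RPC server, call it as a client, then shut it down."""
    # Bind to loopback only; port 0 asks the OS for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(word_count, "word_count")
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        host, port = server.server_address
        # The client calls the function as if it were local;
        # the request actually travels over HTTP.
        client = ServerProxy(f"http://{host}:{port}/")
        return client.word_count(text)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(call_remote_word_count("the network is the computer"))
```

In a home network, the server half would run on the idle machine and the client half on the workstation; here both run in one process so the sketch stays self-contained.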
Here are the challenges as I see them. First, there's the problem of heterogeneous platforms. How do you distribute a workload across machines with different processors and different operating systems? ProActive is one of several good platform-agnostic distributed computing platforms (see www-sop.inria.fr/oasis/ProActive). It is 100% pure Java, so it runs on any platform that supports Java. It has a great graphical interface that lets you manage the way you distribute the load of a job. You can literally drag a process from one computer and drop it onto another.
The problem is that a tool like ProActive doesn't lend itself to the way I want to distribute computing. I want it to be as transparent as plugging a dual-core processor into my machine. Unfortunately, you can't get this kind of transparency even if you run Linux on all your boxes. The closest thing to it that I can think of is distcc, which lets you distribute the workload when you compile programs. Even this requires you to have the same version of the compiler (and perhaps some other tools) on all your boxes. If you want this to be a no-brainer, you pretty much have to install the same distro of Linux on all your machines.
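For readers who haven't used distcc, the following sketch shows roughly what pointing a build at it involves. It assumes a make-based project, and it assumes each listed machine already runs the distcc daemon with a compatible compiler. The host names and the helper function `distcc_build_plan` are hypothetical. `DISTCC_HOSTS` is the environment variable distcc reads for its host list, and `distcc gcc` is its explicit wrapper form.

```python
import os


def distcc_build_plan(hosts, jobs, compiler="gcc"):
    """Return (argv, env) for a make run that routes compiles through distcc.

    Nothing is executed here; the caller decides whether to run the plan.
    """
    if not hosts:
        raise ValueError("need at least one host")
    env = dict(os.environ)
    # distcc reads its list of build machines from DISTCC_HOSTS.
    env["DISTCC_HOSTS"] = " ".join(hosts)
    # -jN lets make start N compile jobs at once so there is work to farm out.
    argv = ["make", f"-j{jobs}", f"CC=distcc {compiler}"]
    return argv, env


if __name__ == "__main__":
    # Hypothetical host names for a home network.
    argv, env = distcc_build_plan(["localhost", "kidsbox1", "kidsbox2"], jobs=6)
    print(env["DISTCC_HOSTS"])
    print(" ".join(argv))
```

To actually start the build, pass the result to `subprocess.run(argv, env=env, check=True)` from inside the source tree. Note how much setup sits outside this snippet, which is the transparency problem in a nutshell.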
The bottom line here is that I smell an opportunity for Linux. I would love to see a project that makes distributed computing on Linux brainlessly transparent and distribution-agnostic. I'm talking about the ability to start up any computation-intensive application and have it automatically distribute the work across other machines on the network configured to accept the role as yet another “processor core”. You can make this transparent to the application by building it into the core user-space APIs. You manage it like you would any other network service. Is this too pie in the sky? I'd love to hear your opinions.
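As a rough illustration of what such a layer would do under the hood, here is a scatter/gather sketch: split a job into chunks, hand each chunk to a worker, and combine the partial results. Everything in it is an assumption made for illustration. In particular, threads stand in for the remote machines, and `sum_of_squares` stands in for a real compute-intensive task. A real system would ship the chunks over the network and would have to cope with machines that fail or vanish mid-job.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Divide items into at most `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    # Stand-in for the compute-intensive work done on one "core".
    return sum(n * n for n in chunk)


def distribute(items, worker_count):
    """Scatter chunks to workers, then gather and combine the partial results."""
    chunks = split(items, worker_count)
    # Threads simulate remote machines in this sketch.
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(distribute(list(range(1, 101)), worker_count=4))  # prints 338350
```

The application only calls `distribute`; how many workers exist and where they live is hidden behind that one function, which is the kind of transparency the paragraph above asks for.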
Nicholas Petreley is Editor in Chief of Linux Journal and a former programmer, teacher, analyst and consultant who has been working with and writing about Linux for more than ten years.
Comments
some thoughts on parallelism
I don't know if your comment system accepts trackbacks.
http://bitratchet.prweblogs.com/2006/08/09/linux-thots-on-parallelism/
openMosix?
Have you taken a look at openMosix? There are several openMosix-enabled live CDs around: KlusTriX, ClusterKnoppix and Clusterix, to name a few. It seems 2.6 support is around the corner as well.