First Beowulf Cluster in Space
All satellites are subject to cosmic ray irradiation. Besides aging effects, the most frequent consequence is random bit-flip errors in SDRAM and the CPU. Left unchecked, these ultimately lead to large-scale data corruption. From a software perspective, the result of every calculation as well as every word in memory is suspect. It goes without saying that a mechanism to detect and correct such errors must be implemented in any space-based system.
Typical solutions for error detection and correction (EDAC) protection involve custom hardware checksum generators. But for our 20-processor PPU, a checksum solution is overly complex, so we utilise a less efficient but simpler multilayer software approach. An EDAC process is scheduled periodically in kernel space to provide error protection. A second EDAC process allows the two to be cross-checked for redundancy.
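The article does not say which correction scheme the EDAC processes apply. Purely as an illustration of what a software-only scrubbing pass can look like, the sketch below uses triple redundancy with a bitwise majority vote, a common technique that is swapped in here and is not taken from the X-Sat code. The names tmr_vote and tmr_scrub are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise majority vote over three copies of one word: each output bit
 * takes the value held by at least two of the three inputs.
 * (Illustrative technique, not the scheme used on the PPU.) */
uint32_t tmr_vote(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & b) | (a & c) | (b & c);
}

/* Scrub three redundant buffers of n words in place.
 * Wherever the copies disagree, all three are overwritten with the
 * majority value. Returns the number of word positions repaired. */
size_t tmr_scrub(uint32_t *a, uint32_t *b, uint32_t *c, size_t n)
{
    size_t repaired = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t good = tmr_vote(a[i], b[i], c[i]);

        if (a[i] != good || b[i] != good || c[i] != good) {
            a[i] = b[i] = c[i] = good;
            repaired++;
        }
    }
    return repaired;
}
```

A periodic task would call tmr_scrub over each protected region. The vote corrects any bit that is flipped in only one of the three copies; if the same bit is flipped in two copies, the wrong value wins, which is one reason a second, independent check is useful.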
Process integrity verification in our system is performed for crucial code between scheduled runs of the EDAC processes. In addition, input and output values of protected software procedures are monitored. If unexpected values are detected, the system employs a clean-up approach, retries the calculation, outputs a previously calculated value or uses the most significant bit-flip correction scheme. Which response to use is configured per function in a parametric verification table, which is itself EDAC-protected.
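A per-function verification table of this kind might be organised roughly as follows. This is a minimal sketch under stated assumptions: the layout, the value ranges, the entry read_sensor and every identifier (edac_action, edac_policy, edac_lookup, edac_check_output) are hypothetical and not taken from the X-Sat software. Only the four recovery actions come from the text.

```c
#include <stddef.h>
#include <string.h>

/* The four recovery actions named in the text (enum is hypothetical). */
enum edac_action {
    EDAC_CLEANUP,          /* clean-up approach */
    EDAC_RETRY,            /* retry the calculation */
    EDAC_USE_PREVIOUS,     /* output a previously calculated value */
    EDAC_BITFLIP_CORRECT   /* most significant bit-flip correction */
};

/* One table row: accepted output range and the action to take
 * when a value falls outside it. Ranges here are made up. */
struct edac_policy {
    const char *func_name;
    int min_out;
    int max_out;
    enum edac_action action;
};

static const struct edac_policy policy_table[] = {
    { "calc",        0,   1000, EDAC_RETRY },
    { "read_sensor", -40, 125,  EDAC_USE_PREVIOUS },
};

/* Find the row for a function name, or NULL if it is not protected. */
const struct edac_policy *edac_lookup(const char *func_name)
{
    size_t n = sizeof policy_table / sizeof policy_table[0];

    for (size_t i = 0; i < n; i++)
        if (strcmp(policy_table[i].func_name, func_name) == 0)
            return &policy_table[i];
    return NULL;
}

/* Returns 1 if the value is accepted (or the function has no entry),
 * 0 if it is out of range; in that case *action names the response. */
int edac_check_output(const char *func_name, int value,
                      enum edac_action *action)
{
    const struct edac_policy *p = edac_lookup(func_name);

    if (p == NULL || (value >= p->min_out && value <= p->max_out))
        return 1;
    *action = p->action;
    return 0;
}
```

Keeping the policy in a constant table rather than in code means the table itself can be covered by the same memory protection as any other data.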
C code is protected through a single header file and linkable library code. The function entry definition is inserted manually:
#define EDAC_CHECK \
    entry_check_edac(__func__);
GCC provides __func__ as a string holding the name of the enclosing function, fixed at compile time. The on-demand EDAC process is therefore invoked before the function body executes. The return redefinition is similar:
#define return(z) return_check_edac(__func__, \
    __builtin_return_address(0), z); return(z);
The developer inserts this into the code, as in the sample program given below:
int calc(int x, int y) {
    EDAC_CHECK
    .....
    return(z);
}
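To see how the two macros fit together, here is a compilable sketch for GCC or Clang. The article does not give the prototypes of entry_check_edac and return_check_edac, so the signatures and the counting stubs below are assumptions that stand in for the real library; the body of calc is likewise made up.

```c
/* Stand-ins for the library routines. The real ones would run the
 * on-demand EDAC checks; these only count calls. Signatures assumed. */
unsigned long edac_entries;
unsigned long edac_returns;
int edac_last_value;

void entry_check_edac(const char *func)
{
    (void)func;                  /* real code would look the name up */
    edac_entries++;
}

void return_check_edac(const char *func, void *ret_addr, int value)
{
    (void)func;
    (void)ret_addr;              /* caller address, unused in this stub */
    edac_returns++;
    edac_last_value = value;
}

/* Entry hook, inserted by hand at the top of a protected function. */
#define EDAC_CHECK entry_check_edac(__func__);

/* Return hook. The inner return(z) is not expanded again, because a
 * macro is never re-expanded inside its own replacement text. */
#define return(z) \
    return_check_edac(__func__, __builtin_return_address(0), (z)); return(z);

int calc(int x, int y)
{
    EDAC_CHECK
    int z = x * y + 1;           /* placeholder calculation */
    return(z);
}

/* Restore the plain keyword for code that should not be hooked. */
#undef return
```

Two properties of this macro style are worth keeping in mind: the argument z is evaluated twice, so it should be free of side effects, and because the macro expands to two statements, a hooked return(z) inside an if or loop needs enclosing braces.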
With these checks in place, a malfunctioning program can do only limited damage. Even if the kernel itself is affected, the resulting loss of heartbeat triggers a reboot. To minimise the impact on other tasks, it's preferable that only one user application operates on each node at a time, but of course, this is at the user's discretion.
So, what is the PPU supposed to do after launch? Even though the hardware costs are almost insignificant with respect to the overall satellite budget, with a launch price of approximately US$10,000/kg, every gramme has to be strongly justified. Right now, the most essential PPU task is image compression using a content-driven JPEG2000 scheme. But the major advantage of the PPU is its “standing watch” capability, in which the camera continuously monitors the Earth and image data is evaluated immediately and discarded if it's not valuable. If valuable information is detected, a decision made under software control, the scene is kept for subsequent transmission. Even more important, X-Sat can transmit its findings instantaneously to mobile terminals on the ground, each the size and price of a conventional transistor radio. The implications of such a concept are easy to grasp if one imagines such a system having been in place when the earthquake northwest of Sumatra, Indonesia, created a tsunami that killed more than 285,000 people on Boxing Day 2004.
Currently, two specific applications are supported: the detection of oil spills and the observation of haze originating from man-made and natural fires. Both make use of the additional processing power available through the FPGAs to pre-process image data streamed into the individual processors. The images in Figure 3 are examples from a simulated acquisition campaign over a complete daylight period of one day's orbits. The raw data from a 10% duty cycle covers an area of approximately 3 million km². If only 0.001% of this data showed oil spills, this would be equivalent to 62 catastrophic Prestige oil spills. With a fully functional PPU, the processing time for simultaneous execution of both disaster-detection tasks is 25% of the total daily orbit time. In return, it allows the entire data set to be evaluated, instead of only a small subset on the ground.