SIDUS—the Solution for Extreme Deduplication of an Operating System

Install Wrap-up

We unmount all system folders necessary for installation:


umount ${SIDUS}/run/shm
umount ${SIDUS}/dev/pts
sidus umount /proc/sys/fs/binfmt_misc
sidus umount /proc
sidus umount /sys

We activate the startup dæmons:


rm -f ${SIDUS}/usr/bin/policy-rc.d
cp /usr/sbin/start-stop-daemon
    ${SIDUS}/usr/sbin/start-stop-daemon

We remove all process references launched by the install process:


rm -r ${SIDUS}/run/* ${SIDUS}/tmp/*

This purges all processes related to SIDUS.

Adapting to Heterogeneity

The park of computers may consist of cluster nodes (with fast network equipment), workstations (with embedded GPUs) or virtual machines (which require data sharing and GPU acceleration). For a large park, you do not want persistence. You should use boot scripts, a separate SIDUS tree or install third-party components.

Administering the system is not as easy as installing it. The gain you experience in installing the system more than makes up for the pain you experience in administering the system though. With SIDUS, every administration phase abides by the installation mechanisms: protection against booting and mounting of system folders.

Administration techniques are similar to those for the initial install. We use a script so that commands executed in SIDUS are surrounded by pre/post operations. We use this script either automatically (typically for updates) or manually. At the end of the day, these additional commands represent a negligible burden with respect to the benefits we get. We now have several (many) stations that are bit-for-bit identical to a given base system. Other benefits:

  • SIDUS works on user stations. The individual workstations can be considered shared. We started with a dozen Neoware light clients that were memory-enhanced and overclocked. We now have about 20 of those.

  • SIDUS works on cluster nodes. In March 2010, we had a proof of concept with 24 nodes. Nowadays, SIDUS serves 86 permanent nodes over four different hardware architectures.

  • SIDUS works on virtual stations. Every year since 2011, Université Joseph Fourier organizes a summer school on scientific computing. For ten busy days, students train hands-on. Thus, it's necessary to offer them a homogeneous environment in no time. Two virtual images are offered: a persistent one they can use after the summer school and another one via SIDUS. This way, teachers can adapt materials and activities day by day. Since summer 2012, this solution has been used at the Laboratory of Chemistry of ENS de Lyon as well.

  • SIDUS works on suspicious stations. Booting via the network enables the investigation of the shutdown system mass storage. There is no need for a live CD—always short of your ideal forensic tool.

  • SIDUS works on loan stations. Hardware manufacturers usually offer assessment equipment. The install phase can be tedious on recent equipment. Using SIDUS, the system boots just like on other (in use) equipment—for example, it takes a few minutes for 20 nodes.

Conclusion

Who should use SIDUS and why?

  • Users: you choose the resources on which you want to boot your station. Therefore, workstations can be segmented at will. The VirtualBox version of SIDUS has been tested successfully on Linux, MS Windows and Mac OS. GPU acceleration and sharing with the host are available. Users find themselves in the same environment as the nodes'. This makes code integration tremendously easier. Performance-wise, losses due to virtualization vary between 10% and 20% for VirtualBox and about 5% for KVM.

  • Administrators: a given operation propagates to the entire infrastructure, as if simply syncing over the SIDUS tree. The install takes a few tens of minutes for a full-featured system. To work out minor differences between systems, simple scripts or puppets do the job. In the case of more important differences, just build another SIDUS tree. The SIDUS tree might even just be cloned instantaneously using snapshot tools (LVM or, better, ZFSonLinux).

  • For the sake of experiments: the SIDUS environment offers scientists and system engineers a framework for conducting reproducible experiments. Two nodes booting on the same SIDUS base do run the exact same system. This way, even if the stations are not actually identical, relevant tests still can be carried out.

How much does it cost in terms of resources? To get an idea, the clusters' server (also gateway) at CBP hosts DHCP, DNS, TFTP and NFS services, as well as a batch server OAR. When booting the entire infrastructure (88 nodes), the NFS server takes in 900Mb/s.

To conclude, you will want to use SIDUS on a variety of environments, be they HPC nodes, workstations or virtual machines. SIDUS gives unprecedented flexibility to both users and administrators. It is so energy-efficient and it propagates so rapidly, you won't want to live without it!

______________________

Emmanuel Quemener defines his job as an "IT test pilot". His work at the HPC "Centre Blaise Pascal" (Lyon, France) involves software integration, storage, scientific computing with GPUs and technology transfer in science.

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Interesting system. Certainly

Ann Honni Mousse's picture

Interesting system.
Certainly lighter to install and maintain than MIT's Athena (but less powerful too).

Interesting

Anonymous's picture

Respect to the sysadmins.

Reply to comment | Linux Journal

Internet Marketing's picture

Great post. I was checking constantly this blog and I am impressed!
Very useful information specifically the final part :) I
deal with such info a lot. I used to be looking for this
particular info for a very lengthy time.

Thank you and good luck.

Webinar
One Click, Universal Protection: Implementing Centralized Security Policies on Linux Systems

As Linux continues to play an ever increasing role in corporate data centers and institutions, ensuring the integrity and protection of these systems must be a priority. With 60% of the world's websites and an increasing share of organization's mission-critical workloads running on Linux, failing to stop malware and other advanced threats on Linux can increasingly impact an organization's reputation and bottom line.

Learn More

Sponsored by Bit9

Webinar
Linux Backup and Recovery Webinar

Most companies incorporate backup procedures for critical data, which can be restored quickly if a loss occurs. However, fewer companies are prepared for catastrophic system failures, in which they lose all data, the entire operating system, applications, settings, patches and more, reducing their system(s) to “bare metal.” After all, before data can be restored to a system, there must be a system to restore it to.

In this one hour webinar, learn how to enhance your existing backup strategies for better disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible bare-metal recovery solution for UNIX and Linux systems.

Learn More

Sponsored by Storix