SIDUS—the Solution for Extreme Deduplication of an Operating System

Install the Debian Base with Debootstrap

With Debootstrap, you can install a system in a separate root location. Debootstrap needs you to specify parameters, such as the install root, the hardware architecture, the distribution and the Debian archive (FTP or HTTP) to download from, chosen among Debian's worldwide mirror sites.

Warning: this is where we get Debian-specific. Debootstrap is a familiar tool on all Debian-like distributions (it is available on Ubuntu, for example). Adapting the procedure to Red Hat-like distributions should not be too difficult: Fedora has a clone called febootstrap, although we have not tested it.

Debootstrap also takes as input a list of archives—as we all know, Debian is very particular about distinguishing the main archive area from the contrib and non-free ones—a list of packages to include and a list of packages to dismiss. We wish we could specify the latter two lists, but you cannot handle everything with Debootstrap. We install from the very beginning a set of tools we deem necessary (such as the kernel, some firmware and auditing tools).

We define environment variables corresponding to the root of our SIDUS system. We define a command that enables the execution of commands via chroot, with a specific option for package install. The variable $MyInclude corresponds to the (comma-separated) list of packages you want, and $MyExclude corresponds to the list of packages you do not want:


export SIDUS=/srv/nfsroot/sidus
time debootstrap --arch amd64 \
    --components='main,contrib,non-free' \
    --include=$MyInclude --exclude=$MyExclude \
    wheezy $SIDUS http://ftp.debian.org/debian
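
The variables $MyInclude and $MyExclude have to be set before debootstrap runs. Here is a minimal sketch of one way to build them. The helper join_csv and the package names are illustrative choices, not the list used for SIDUS:

```shell
#!/bin/sh
# Join the arguments into the comma-separated form that
# debootstrap expects for --include and --exclude.
join_csv() {
    old_ifs=$IFS
    IFS=,
    printf '%s\n' "$*"
    IFS=$old_ifs
}

# Hypothetical package lists; adapt them to your site.
MyInclude=$(join_csv linux-image-amd64 firmware-linux-free auditd)
MyExclude=$(join_csv nano)
export MyInclude MyExclude
echo "include: $MyInclude"
echo "exclude: $MyExclude"
```

Keeping the lists as separate words makes them easier to review and reorder than one long comma-separated string.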

Precautions before Moving on with the Install

After running the previous command, you should be a little cautious. A Debian package that provides a service normally starts that service right after installation. You need to define a hook that prevents services from starting inside the chroot. After completion of the install, you can remove this hook:


printf '#!/bin/sh\nexit 101\n' > \
    ${SIDUS}/usr/sbin/policy-rc.d
chmod +x ${SIDUS}/usr/sbin/policy-rc.d
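
The hook has to be removed once the install is finished, or services will never start on the booted clients. The following sketch wraps both operations in two functions. The function names are hypothetical, and the demonstration runs on a scratch directory rather than on the real ${SIDUS} tree:

```shell
#!/bin/sh
# Install the policy-rc.d hook under the given root. Exit status 101
# tells invoke-rc.d that the action is forbidden, so services stay down.
install_policy_hook() {
    mkdir -p "$1/usr/sbin"
    printf '#!/bin/sh\nexit 101\n' > "$1/usr/sbin/policy-rc.d"
    chmod +x "$1/usr/sbin/policy-rc.d"
}

# Remove the hook once every package is installed.
remove_policy_hook() {
    rm -f "$1/usr/sbin/policy-rc.d"
}

# Demonstration on a scratch directory standing in for ${SIDUS}.
demo_root=$(mktemp -d)
install_policy_hook "$demo_root"
status=0
sh "$demo_root/usr/sbin/policy-rc.d" || status=$?
echo "hook exit status: $status"
remove_policy_hook "$demo_root"
```

On the real tree, you would call install_policy_hook "${SIDUS}" before the package installs and remove_policy_hook "${SIDUS}" after them.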

Some packages require access to the process list (/proc), the system and device information (/sys), the pseudo-terminals (/dev/pts) and shared memory (/run/shm). Hence, you should mount these host system directories inside SIDUS:


# A shell function passes its arguments on through "$@"; an alias cannot.
sidus() {
    DEBIAN_FRONTEND=noninteractive chroot ${SIDUS} "$@"
}
sidus mount -t proc none /proc
sidus mount -t sysfs sys /sys
mount --bind /run/shm ${SIDUS}/run/shm
mount --bind /dev/pts ${SIDUS}/dev/pts
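
These mounts should be released again once the install is complete, before the tree is exported. A possible teardown, in the reverse order of the mounts above, is sketched here. The RUN switch is a hypothetical safety device for this example: by default the commands are only printed.

```shell
#!/bin/sh
# Print (or, with RUN=1, execute) the unmount commands in the reverse
# order of the mounts. RUN is a hypothetical switch for this sketch.
SIDUS=${SIDUS:-/srv/nfsroot/sidus}

sidus_umount_all() {
    for mnt in dev/pts run/shm sys proc; do
        if [ "${RUN:-0}" = 1 ]; then
            umount "${SIDUS}/${mnt}"
        else
            echo "umount ${SIDUS}/${mnt}"
        fi
    done
}

sidus_umount_all
```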

Install Additional Packages (Scientific Libraries)

To make it simpler when installing packages of the same family, Debian ships with meta-packages. In our case, we are interested in the scientific ones: their names are prefixed by "science". For example, we have "science-chemistry", including all chemistry packages. You install all scientific packages with only one command:


time sidus apt-get install --install-suggests -f \
    -m -y --force-yes 'science-*'

Because we are talking about a full-featured OS, we also install the suggested packages: the option --install-suggests is available from Wheezy onward (released May 4, 2013).

When installing, the costliest phase is downloading packages and configuring certain components (Perl and LaTeX). In the best-case scenario, it takes 45 minutes for a 32GB full tree. There is a price to pay for this install craze. Some packages do not install well, and you will want to purge some, such as a M*tlab installer:


time sidus apt-get purge -y -f --force-yes 'matlab-*'

Local Environment

Usually, you will want to adapt the system to a local environment (authentication and user sharing). The default is US, so you may want to configure:

  • ${SIDUS}/etc/locale.gen.

  • ${SIDUS}/etc/timezone.

  • ${SIDUS}/etc/default/keyboard.
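
As an illustration, the first two files in this list could be written as follows. This is only a sketch: fr_FR.UTF-8 and Europe/Paris are example values, the function name is hypothetical, and the demonstration uses a scratch directory in place of ${SIDUS}:

```shell
#!/bin/sh
# Write locale and time zone settings into a SIDUS tree.
# fr_FR.UTF-8 and Europe/Paris are example values only.
set_locale_files() {
    root=$1
    mkdir -p "$root/etc"
    # One "locale charset" pair per line, as locale-gen expects.
    printf '%s\n' 'en_US.UTF-8 UTF-8' 'fr_FR.UTF-8 UTF-8' > "$root/etc/locale.gen"
    printf '%s\n' 'Europe/Paris' > "$root/etc/timezone"
}

# Demonstration on a scratch directory standing in for ${SIDUS}.
demo_root=$(mktemp -d)
set_locale_files "$demo_root"
cat "$demo_root/etc/timezone"
```

After editing the real files, running locale-gen and reconfiguring tzdata inside the chroot should make the new values effective.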

For LDAP authentication, you may want to configure: ${SIDUS}/etc/nsswitch.conf, ${SIDUS}/etc/libpam_ldap.conf, ${SIDUS}/etc/libnss-ldap.conf and ${SIDUS}/etc/ldap/ldap.conf.

As for the mounting of NFS user folders:

  • ${SIDUS}/etc/default/nfs-common, ${SIDUS}/etc/default/idmapd.conf and ${SIDUS}/etc/fstab (for NFSv4).

  • ${SIDUS}/etc/fstab (for NFSv3).

Set Up the Boot Sequence

How do you share SIDUS without duplicating it? We found the best solution to be via live CD. The boot sequence includes the two layers you need—that is, a read-only layer (on media for live CD and on NFS in our case) and a read-write layer (on TMPFS). The two layers are linked via AUFS (the successor of UnionFS). Everything is taken care of by a single hook upon boot (the script called rootaufs). It operates in five steps:

  1. Creates the temporary directories /ro, /rw and /aufs.

  2. Moves the root of NFSroot from the original mountpoint to /ro.

  3. Mounts the local or remote partition.

  4. Superimposes /ro and /rw into /aufs.

  5. Moves /aufs into the original mountpoint.

The rootaufs script goes into ${SIDUS}/etc/initramfs-tools/scripts/init-bottom.
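
The five steps can be sketched as shell commands for the TMPFS case. This is not the SIDUS script itself, only an outline of the technique. It assumes that ${rootmnt} is the mountpoint where initramfs-tools has mounted the future root, and the RUN switch is a hypothetical guard: by default the commands are only printed.

```shell
#!/bin/sh
# Outline of the five steps with a tmpfs read-write layer.
# Commands are printed unless RUN=1 (hypothetical switch).
# rootmnt is the mountpoint initramfs-tools uses for the future root.
rootmnt=${rootmnt:-/root}

step() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

rootaufs_sketch() {
    step mkdir -p /ro /rw /aufs                           # 1. mountpoints
    step mount --move "$rootmnt" /ro                      # 2. NFS root becomes /ro
    step mount -t tmpfs tmpfs /rw                         # 3. read-write layer
    step mount -t aufs -o dirs=/rw=rw:/ro=ro aufs /aufs   # 4. union of both
    step mount --move /aufs "$rootmnt"                    # 5. union becomes root
}

rootaufs_sketch
```

Listing /rw first in the aufs branch list makes it the writable top layer, so every change lands in memory and the NFS tree stays untouched.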

The original script is inspired by the rootaufs project by Nicholas A. Schembri. We adapted it to a large extent to match our infrastructure. A version is available at http://www.cbp.ens-lyon.fr/sidus/rootaufs:


wget -O \
    ${SIDUS}/etc/initramfs-tools/scripts/init-bottom/rootaufs \
    http://www.cbp.ens-lyon.fr/sidus/rootaufs
chmod +x ${SIDUS}/etc/initramfs-tools/scripts/init-bottom/rootaufs

The system is not functional yet. You need to create an initrd specific to your NFS boot. Add aufs in ${SIDUS}/etc/initramfs-tools/modules and force eth0 as DEVICE in ${SIDUS}/etc/initramfs-tools/initramfs.conf:


sidus update-initramfs -k all -u
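
The two configuration changes have to be in place before update-initramfs runs. They could be scripted along these lines. The function name is hypothetical, and the demonstration works on a scratch tree with a minimal initramfs.conf rather than on ${SIDUS}:

```shell
#!/bin/sh
# Apply the two initramfs-tools changes under the given root:
# load the aufs module and use eth0 as the boot network device.
configure_initramfs() {
    conf_dir=$1/etc/initramfs-tools
    # Append aufs only if it is not already listed.
    grep -qx 'aufs' "$conf_dir/modules" 2>/dev/null || echo 'aufs' >> "$conf_dir/modules"
    # Replace an existing DEVICE= line, or add one if it is missing.
    if grep -q '^DEVICE=' "$conf_dir/initramfs.conf" 2>/dev/null; then
        sed -i 's/^DEVICE=.*/DEVICE=eth0/' "$conf_dir/initramfs.conf"
    else
        echo 'DEVICE=eth0' >> "$conf_dir/initramfs.conf"
    fi
}

# Demonstration on a scratch tree standing in for ${SIDUS}.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/etc/initramfs-tools"
printf 'MODULES=most\nDEVICE=\n' > "$demo_root/etc/initramfs-tools/initramfs.conf"
configure_initramfs "$demo_root"
configure_initramfs "$demo_root"   # running twice changes nothing more
cat "$demo_root/etc/initramfs-tools/modules"
```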

Then, you just copy the kernel and the initrd to the TFTP directory:


cp ${SIDUS}/vmlinuz /srv/tftp/vmlinux-Sidus
cp ${SIDUS}/srv/nfsroot/boot/initrd \
    /srv/tftp/initrd-Sidus
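
A PXELINUX entry that boots these two files with the root on NFS might look like the output of the sketch below. The label name and the server address are placeholders, and only standard kernel NFS-root parameters are shown. Any extra parameters that the adapted rootaufs script expects are not covered here.

```shell
#!/bin/sh
# Print a PXELINUX entry that boots the kernel and initrd copied to
# the TFTP directory, with the root file system on NFS.
# The label and the server address are placeholders.
pxe_entry() {
    nfs_server=$1
    nfs_path=$2
    cat <<EOF
LABEL sidus
    KERNEL vmlinux-Sidus
    APPEND initrd=initrd-Sidus root=/dev/nfs nfsroot=${nfs_server}:${nfs_path} ip=dhcp ro
EOF
}

pxe_entry 192.0.2.1 /srv/nfsroot/sidus
```

The output would be appended to the PXELINUX configuration file served from the TFTP directory.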

How can you take advantage of SIDUS while keeping a given configuration from one boot to the next? Mounting NFS on each node separately is very costly. It is preferable to mount iSCSI on each node.

Originally, we investigated how to offer a second NFS share in read-write mode to ensure persistence of client-related changes from one boot to the next. This version, although functional, required an atomized NFS—one for each client. This was not sustainable for the server.

Therefore, we decided on another solution to ensure persistence. We create an iSCSI share for each client. The settings for mounting the iSCSI disk are defined on the kernel command line.

So we use a network drive provided over iSCSI. In the config file /srv/tftp/pxelinux.cfg/default, we have the definition LABEL=iscsi. Each SIDUS client needs its own iSCSI storage space to ensure persistence. For reasons of simplicity, in the initrd booting sequence, the SIDUS clients fetch the volumes that bear their respective IPs. The rootaufs file contains a default login/password.

A few tricks:

  • Erase /etc/hostname to set the hostname through DHCP.

  • Set /etc/resolv.conf with a hard-coded definition.

  • Define a loopback in /etc/network/interfaces.

  • Change the booting of GDM3 so it starts only after NSCD is launched.

  • Set /etc/security/limits.conf (essential in an HPC environment).

  • Set /etc/fstab with input from the NFS server of user accounts.

  • For VirtualBox-based virtual systems, install VBoxLinuxAdditions.run in the SIDUS system.

  • For systems with an InfiniBand card, force loading of modules in /etc/modules and regenerate initrd. In /etc/rc.local, execute a script that gets the Ethernet IP address and builds an IP address for the InfiniBand card.

  • For systems with an NVIDIA card: with most NVIDIA cards, packages offered with Debian Wheezy let you install the necessary proprietary drivers and the OpenGL, Cuda and OpenCL libraries. Be careful if you want to use the OpenCL ICD (Installable Client Driver) for AMD to operate your processors and your graphics board simultaneously. To be able to do so, we had to install the entire environment (drivers, Cuda and OpenCL) from scratch.

  • For systems with an AMD ATI card: with most ATI cards, packages offered with Debian Wheezy let you install the necessary proprietary drivers and the OpenGL and OpenCL libraries.
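
For the InfiniBand trick in the list above, the address computation could be as simple as the sketch below. The mapping (keep the last two octets, replace the first two with a dedicated prefix) and the 10.20 prefix are assumptions made for illustration. The actual convention used at CBP is not described here.

```shell
#!/bin/sh
# Derive an InfiniBand address from the Ethernet one by replacing the
# first two octets with a dedicated prefix. The 10.20 prefix and this
# mapping are assumptions for illustration, not the SIDUS convention.
ib_addr_from_eth() {
    eth_addr=$1
    ib_prefix=${2:-10.20}
    # Keep the last two octets so each node maps to a unique address.
    host_part=$(echo "$eth_addr" | cut -d. -f3,4)
    echo "${ib_prefix}.${host_part}"
}

ib_addr_from_eth 192.168.3.17
```

A script called from /etc/rc.local would read the Ethernet address of the node, pass it to such a function and assign the result to the InfiniBand interface.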

At the present time at CBP, we use the technique "NFSroot + iSCSI = AUFS" on SIDUS stations that require persistence, such as DiStoNet nodes. Otherwise, we use "NFSroot + TMPFS = AUFS".

______________________

Emmanuel Quemener defines his job as an "IT test pilot". His work at the HPC "Centre Blaise Pascal" (Lyon, France) involves software integration, storage, scientific computing with GPUs and technology transfer in science.
