A Secure Bioinformatics Linux Lab in an Educational Research Environment

How one university department set up secure bioinformatics labs across two campuses.

In delivering a new bioinformatics curriculum in the Graduate School at the University of Medicine and Dentistry of New Jersey, we took on the challenge of layering new computational resources over an existing research support infrastructure, adding new services and platforms while shouldering an increasingly burdensome responsibility to protect ourselves from network threats. Our new environment spans two cities and links Linux workstations, Linux servers, Silicon Graphics workstations, a Sun 6800 Enterprise Server and the Internet. Open-source solutions, combined with selective use of commercial products, integrate into a cost-effective, service-friendly bioinformatics research environment. In this report, we describe solutions to a set of challenges in our core, Linux-driven server/client environment.

As at many universities, our public computer labs are Microsoft boxes running the Office suite plus a set of clients--Web, secure telnet, secure FTP, IMAP2 mail and X. The bioinformatics software the university hosted sat behind these workstations, on Sun/Solaris and SGI/IRIX servers. We needed an environment in which we could do several things: (1) manage workstations efficiently, (2) quickly add or remove applications, (3) rebuild workstations, (4) ensure availability and storage and (5) address network and data security issues.

We recognized that software and configuration information should be stored on a centralized server and made available to authenticated clients. Our general-purpose Web and e-mail servers were already overburdened with their own services, and each of them faced its own distinctive security threats and required its own solutions. A better approach would be to establish a separate server dedicated to serving the scientific community--a scientific server. We needed to bring this project in on a modest budget.

Almost any server/workstation environment dedicated to scientific research would have offered multiple benefits, including parallel processing, centralized administration and secure storage systems. However, many fall short in one important respect in our two-city setting. We have a high demand for visualization, so users need X server software on their desktops, such as ReflectionX, Exceed and Cygwin. An X server displays the graphical interface of a program (the X client) running on a remote machine. Most molecular modeling software also requires OpenGL for visualization. On a local area network, this kind of architecture should suffice; our intercampus network, however, was not always up to the task.

Our solution to this set of challenges was to build a bioinformatics computer lab environment dedicated to teaching and research. The lab is designed to be secure, resilient to attack and failure, and adaptable to an array of software and modes of access by authenticated users.

We began with the choice of operating system. We elected Linux for many of the usual reasons: its open-source nature, security, easy manageability and free availability made it attractive in an educational environment with limited funds. Even in that frugal frame of mind, we chose to go one step up and select Red Hat Enterprise Linux for the support a commercial distribution provides, including workstation monitoring, patches and upgrades through the Red Hat Network. We went with Intel x86 computers because we had a number on hand and they made good economic sense. Plus, if the project failed, we still would have boxes that could be deployed elsewhere.
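Red Hat's update client at the time was up2date; once a workstation is attached to a Red Hat Network account, keeping it patched is roughly a two-command affair (shown here as a sketch rather than our exact procedure):

up2date --register    # attach the system to a Red Hat Network account
up2date -u            # download and apply all available updates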

As now deployed, our Piscataway lab has 14 Red Hat Enterprise Linux workstations and two Enterprise Linux servers. In the smaller Newark lab, we have four Red Hat Enterprise Linux workstations and one Enterprise Linux server. All Piscataway workstations are identical in terms of hardware, as are all Newark workstations; there are minor hardware differences between the two sets, however.

Servers

We outlined a set of initial tasks: build a server to run DHCP, host the Red Hat CDs for Kickstart installations, authenticate users, host users' home directories and provide a Web and database server. To that end, we installed Red Hat Enterprise Linux AS on two separate PCs in Piscataway and one in Newark to act as our servers--one primary, one backup and one Web/database server.
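For the workstation builds, a Kickstart file along the following lines does the job. This is only a sketch--the NFS server address, password hash placeholder and package groups are illustrative, not our production ks.cfg:

install
nfs --server=192.168.1.2 --dir=/kickstart/rhel   # server hosting the Red Hat CDs
lang en_US.UTF-8
keyboard us
network --bootproto dhcp
rootpw --iscrypted <md5-crypt-hash-here>
firewall --enabled --port=22:tcp                 # allow SSH only
authconfig --enableshadow --enablemd5
timezone America/New_York
bootloader --location=mbr
clearpart --all --initlabel
autopart
reboot

%packages
@ X Window System
@ GNOME Desktop Environment
@ Development Tools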

We also needed DHCP service to give authorized users network access with their personal laptops; our workstations themselves do not use DHCP. To accommodate the laptop users, we determine the MAC address of each user's laptop and add an entry similar to the following for it, identifying each host by the user's username.


subnet 192.168.1.0 netmask 255.255.255.0 {
     deny unknown-clients;

     # DHCP range		
     range 192.168.1.240 192.168.1.245;

     # known clients
     host golharam { hardware ethernet 00:12:34:56:78:90; }
     ...
}
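A change to dhcpd.conf takes effect once the daemon is restarted, which on Red Hat systems is simply:

service dhcpd restart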

Comments

We recognized that software

Tom87

"We recognized that software and configuration information should be stored on a centralized server and made available to authenticated clients. Our general-purpose Web and e-mail servers were already overburdened with their own services, and each of them faced its own distinctive security threats and required its own solutions. A better approach would be to establish a separate server dedicated to serving the scientific community--a scientific server. We needed to bring this project in on a modest budget." I think this is a clever idea. Budget is what it is all about.

Tom

The server firewall allows

Torres

The server firewall allows incoming SSH traffic from anywhere. It then performs IP address filtering to allow only certain IP addresses access to more open resources, such as NFS, LDAP, CUPS and the FlexLM license server. The Web server uses a slightly different setup to allow only incoming SSH and HTTP traffic.
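In iptables terms, that policy looks roughly like the following; the subnet, port numbers and rule order here are illustrative, not the lab's actual rule set:

iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT                        # SSH from anywhere
iptables -A INPUT -s 192.168.1.0/24 -p tcp --dport 2049 -j ACCEPT    # NFS (plus portmap/mountd)
iptables -A INPUT -s 192.168.1.0/24 -p tcp --dport 389 -j ACCEPT     # LDAP
iptables -A INPUT -s 192.168.1.0/24 -p tcp --dport 631 -j ACCEPT     # CUPS
iptables -A INPUT -s 192.168.1.0/24 -p tcp --dport 27000 -j ACCEPT   # FlexLM license manager
iptables -A INPUT -j DROP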

Re: A Secure Bioinformatics Linux Lab in an Educational Research Environment

Anonymous

I am curious why you would add new users with the password set to the username, and why no password expiration was given (possibly this was just for the sake of the article; if not, then publishing the lab/machine names along with the fact that usernames are reused as default passwords makes it twice as bad).

Below is a simple suggestion for a Perl subroutine, which can be modified to your liking, to generate semi-random passwords.

sub make_pass {
    use String::Random;
    my $gen  = String::Random->new;
    my $pass = $gen->randpattern("CCnnccC");   # change the pattern to taste

    print "New password is $pass\n";

    # use the first two characters of the current user's passwd field as the salt
    my $pwd     = (getpwuid($<))[1];
    my $salt    = substr($pwd, 0, 2);
    my $newpass = crypt($pass, $salt);
    print "New crypt is $newpass\n";

    return ($pass, $newpass);
}

Maybe I just overlooked where the users are forced to change their password on the initial login.

Good to see some more ink dealing with research institutions.

Phil M.
San Diego

passwd command on Linux works just fine with LDAP

Anonymous

We are also a bioinformatics lab, although a smaller one with somewhat fewer teaching responsibilities. We moved from NIS to LDAP authentication about six months ago. The passwd command on modern Linux distributions knows how to deal with LDAP and can change passwords in an LDAP directory just fine. Users must have write access to their own passwords in the LDAP directory for this to work, but that is trivial to configure.
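In OpenLDAP's slapd.conf, the access rule that grants this typically looks something like the following sketch (exact ordering and any additional rules depend on the directory setup):

access to attrs=userPassword
        by self write
        by anonymous auth
        by * none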

Scripting languages like Perl and Python can generate passwords encrypted in various ways, so I do not quite see why the change_password_perlscript invokes the 'passwd' command and uses a local /etc/shadow file from which the encrypted password is then extracted. It seems like climbing the tree backwards when the necessary LDIF could be generated in the script directly.
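For example, the script could emit LDIF such as the following, where the DN, suffix and hash are purely illustrative:

dn: uid=jdoe,ou=People,dc=example,dc=edu
changetype: modify
replace: userPassword
userPassword: {SSHA}generated-hash-goes-here

and then apply it with something like ldapmodify -x -D "cn=Manager,dc=example,dc=edu" -W -f newpass.ldif.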

A nice graphical LDAP browser/editor named LUMA (a project on SourceForge) can do mass-creation of users and passwords. Unfortunately, the working versions of LUMA depend on new versions of other packages, so getting it to run on anything but the latest distros can be an exercise.

Hostname "hydrogen"

Anonymous

I used to work at a place where the hostnames were element names, assigned by atomic number--if you knew your periodic table, you didn't need the DNS server, which I think was lithium. Hydrogen was the gateway.
