The Best Multiplatform Development Environment that Ever Lived on One Box
I decided my consulting company needed a professional development environment where I could potentially host works-in-progress for my clients. I wanted high security. I wanted high availability. I wanted to support not only multiple servers (Oracle and Sybase; Tomcat, JRun and WebLogic; different JDK revs), but multiple platforms (NT and UNIX). I wanted to be able to grow disk partitions without having to repartition or re-install. I needed to enable developers from all over the country to have speedy remote access. And, oh yeah, I wanted to spend less than $4,000.00 on all of the hardware and software needed to accomplish this.
This wish list was made possible thanks to Linux, the latest kernel build, a host of freely available and very inexpensive software, and a (relatively) inexpensive server machine.
The bulk of my costs went to the purchase of a refurbished Dell PowerEdge 2300, which I got for a steal at $2,700.00. For those of you who don't know, a refurbished Dell carries the exact same warranty as a new Dell.
Cable internet access provides a reasonably inexpensive high-speed connection to the Internet. Since I had only a single "real" IP address from the cable connection, I added a second network adapter alongside the one that came with the server, so that the server could act as a gateway for my LAN.
The table below outlines the hardware and associated costs:
Hardware                                     Cost        Obtained From
-------------------------------------------  ----------  ---------------------------
Refurbished Dell PowerEdge 2300              $2,700.00   www.dell.com
  (Dual Pentium III 450MHz, 256MB RAM,
  54GB Hard Disk, 100Mb Ethernet,
  external SCSI controller)
Second PCI Ethernet Card                     $   30.00   Had it laying around
20/40GB DAT Tape Drive                       $  740.00   www.dell.com
256MB Additional RAM                         $  420.00   egghead.com, from mysimon.com
                                                         (comparison shopping portal)
-------------------------------------------  ----------  ---------------------------
Total                                        $3,890.00
The table below outlines all the software in use, with an overview of its purpose:
Software            Purpose                                   Location
------------------  ----------------------------------------  -----------------------
Red Hat 6.2         Base operating system                     redhat.com
Kernel v2.4.3       Kernel that includes support for LVM      kernel.org
                    and more robust firewalling
LVM (Logical        Allows physical partitions to be          linux.msede.com/lvm
Volume Management)  grouped arbitrarily into logical
                    volumes; allows logical volumes to be
                    grown dynamically
ipchains            Firewall kernel module                    IPCHAINS-HOWTO
redir               A port redirector to allow access from    sammy.net
                    the Internet to the (virtual) NT
                    machine
SSH (Secure Shell)  Allows highly secure authentication       openssh.com
                    and encryption of remote sessions
VMware              Allows us to run Windows NT in a          vmware.com
                    virtual machine under Linux
VNC (Virtual        Very thin remote control software         www.uk.research.att.com
Network Computing)
Of the software listed here, only VMware costs anything, and its price is nominal. We also had software expenses related to the use of Windows NT 4 server, described below.
The first step was to install the base Red Hat 6.2 distribution. I did this by downloading the required boot disks and the DOS utility called rawrite to transfer the disk images to floppies. I then booted off these floppies and installed the entire Red Hat distribution over the Internet. Next, it was time to customize.
Using the Linux kernel v2.4.3 allowed me to take advantage of a key enabling technology: LVM (Logical Volume Management; linux.msede.com/lvm/).
The machine came with six 9GB hard drives. Ordinarily, I would need a minimum of six partitions along the boundaries of the physical drives. Not only did I not want to partition along that arbitrary boundary, but I also wanted flexibility in my partition plan. If a partition was approaching capacity, I wanted to be able to grow it dynamically, which might mean reducing the size of another, less-used partition. LVM allows a high degree of configurability through a host of tools.
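As a taste of that configurability, here is roughly what growing a volume looks like. The volume name matches my setup, but the amount is just an example, and I am assuming the e2fsadm helper that ships with the LVM user-space tools:

```shell
# Grow /home by 2GB (a hypothetical amount). ext2 cannot be resized
# while mounted, so the filesystem has to come offline briefly.
umount /home
e2fsadm -L +2G /dev/vgHome/lvHome   # extends the logical volume and the
                                    # ext2 filesystem in one step
mount /home
```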
I also experienced a performance gain by composing logical volumes from physical partitions spread across the six drives. This reduces the overall seek time when accessing files, since you don't have a single read/write head moving around one disk; all six work at once.
During the process of setting up LVM, virtual devices are created for logical volumes and for volume groups. Here is the output of the mount command after setting up LVM:
/dev/vgRoot/lvRoot on / type ext2 (rw)
none on /proc type proc (rw)
/dev/sda5 on /boot type ext2 (rw)
/dev/vgHome/lvHome on /home type ext2 (rw)
/dev/vgOpt/lvOpt on /opt type ext2 (rw)
/dev/vgTmp/lvTmp on /tmp type ext2 (rw)
/dev/vgUsr/lvUsr on /usr type ext2 (rw)
/dev/vgUsrLocal/lvUsrLocal on /usr/local type ext2 (rw)
/dev/vgVar/lvVar on /var type ext2 (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
Here is the output from the df command showing the space on each partition:
Filesystem                  1k-blocks     Used  Available  Use%  Mounted on
/dev/vgRoot/lvRoot           10885380   106804   10225616    1%  /
/dev/sda5                      248895    15348     220697    7%  /boot
/dev/vgHome/lvHome            4354120   966752    3166184   23%  /home
/dev/vgOpt/lvOpt             10885380  3997884    6334536   39%  /opt
/dev/vgTmp/lvTmp              2189128    98048    1979876    5%  /tmp
/dev/vgUsr/lvUsr              6507036  1530480    4646012   25%  /usr
/dev/vgUsrLocal/lvUsrLocal   10865240  4687304    5626000   45%  /usr/local
/dev/vgVar/lvVar              2173016    40584    2022048    2%  /var
Here is the output from one of the LVM utilities, called lvdisplay, that shows what physical partitions make up a logical volume:
--- Logical volume ---
LV Name                  /dev/vgRoot/lvRoot
VG Name                  vgRoot
(...snip...)
LV Size                  10.55 GB
Current LE               2700
Allocated LE             2700
Stripes                  5
Stripe size (KB)         16
Allocation               next free
Read ahead sectors       120
Block device             58:0

--- Distribution of logical volume on 5 physical volumes ---
   PV Name        PE on PV     reads      writes
   /dev/sda6           540      7407         394
   /dev/sdb1           540     13843        6893
   /dev/sdc1           540     14395       94168
   /dev/sdd1           540      7793         775
   /dev/sde1           540     11093       64148
The last few lines of this output show that this logical volume is made up of five physical partitions. Notice that each of these partitions is on different disks. The output also shows the number of stripes on the logical volume, in this case five. This means that writes are spread across all five physical partitions, instead of waiting until one physical partition is filled, and then moving on to the next partition. It is the striping that can improve performance, since rather than a single read/write head having to seek across a disk, you are using five at once.
During the process of setting up LVM, the underlying physical partitions are initialized as physical volumes, which LVM divides into PEs (physical extents), its minimum unit of allocation. I broke each 9GB drive into four partitions of about 2.2GB each; these partitions became my building blocks for logical volumes. The next step is to create volume groups from the physical volumes and then carve logical volumes out of each group. For more detail, refer to the LVM home page referenced above.
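The sequence can be sketched with the partitions from the lvdisplay output above (the exact commands and the mke2fs step are implied rather than stated here, so treat this as illustrative):

```shell
# Label each ~2.2GB partition as an LVM physical volume.
pvcreate /dev/sda6 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

# Group the physical volumes into a volume group.
vgcreate vgRoot /dev/sda6 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1

# Carve out a logical volume striped across all five physical volumes
# with a 16KB stripe size, then put a filesystem on it.
lvcreate -i 5 -I 16 -L 10.5G -n lvRoot vgRoot
mke2fs /dev/vgRoot/lvRoot
```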
Firewalling support for Linux has gone through many changes. Unlike other aspects of the kernel's evolution, which have been methodical, each change to the firewalling modules has been radical compared to its predecessor. The downside is that each change to the way Linux handles firewalling brings a new learning curve with it. Like other areas of Linux development, however, each release has been very stable and has provided backward compatibility. Linux is evolving into a mature, enterprise-strength firewalling solution, as evidenced by companies like WatchGuard building products on it.
In previous releases of the kernel, the firewall configuration tool was called ipchains. Now, with the 2.4 release, this has once again changed to a tool called iptables. Both of these tools run on the command line but hook into the kernel to configure firewall settings at a very low level. The iptables arrangement offers significant advantages over the ipchains way of kernel organization, but ipchains will be supported for some time as a kernel loadable module. For more information, see kernel.org and IPCHAINS-HOWTO.
For my purposes, ipchains offered more than adequate support. My major concerns regarding firewalling were:
Lock down all extraneous/unused ports and services to have a safe box. Since I would potentially be housing sensitive information while developing software for my clients, it was important that the box be adequately protected.
Provide NAT (network address translation) services to my internal network, often called IP masquerading. Since I had only one valid IP address and a number of workstations needing access to the Internet, I needed a way to manage multiple connections through my single (real) internet connection.
Provide port tunneling services, where appropriate, to other internal boxes. In some cases, I had other machines running internet protocols internally that I wanted to allow access to from the Internet.
The ASCII diagram below shows our network/firewall setup:
                    192.168.1.0
 |------|                |
 |  W1  |----------------|
 |  11  |                |
 |------|                |
                         |       |----------|
 |------|                |   eth0| 200    D |eth1       Internet
 |  W2  |----------------|-------|  Linux   |---------------------
 |  12  |                |       |  |----|  |
 |------|                |       |  | VM |  |
                         |       |  | 201|  |
 |------|                |       |  |----|  |
 |  W3  |----------------|       |----------|
 |  13  |
 |------|
The internal network is 192.168.1.0. Each machine on the diagram above shows the last number in its IP address. The Linux box has two physical adapters. The internal one (eth0) has an address on the internal network (200), while the external adapter (eth1) obtains its address dynamically via DHCP. This is why it is marked with a D on the diagram.
It is ipchains that allows NAT between the internal network and the Internet. All outgoing connections are allowed. All incoming connections are disallowed, with the exception of port 80 (HTTP), port 22 (SSH) and port 8080 (for an HTTP server running on the VM machine). These services will be explained in more detail below.
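That policy can be sketched roughly as the following ipchains commands. The interface names follow the diagram above; I did not list my exact rules here, so take this as an illustrative minimum rather than a complete ruleset:

```shell
# Enable forwarding and masquerade (NAT) the internal network out eth1.
echo 1 > /proc/sys/net/ipv4/ip_forward
ipchains -A forward -i eth1 -s 192.168.1.0/24 -j MASQ

# Let the three public services in, then refuse any other new inbound
# TCP connection on the external interface (-y matches SYN packets).
ipchains -A input -i eth1 -p tcp -d 0.0.0.0/0 22 -j ACCEPT
ipchains -A input -i eth1 -p tcp -d 0.0.0.0/0 80 -j ACCEPT
ipchains -A input -i eth1 -p tcp -d 0.0.0.0/0 8080 -j ACCEPT
ipchains -A input -i eth1 -p tcp -y -j DENY
```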
Notice the VM machine is shown "behind" the Linux box. This will be explained in more detail in the VMware section below.
A simple program named redir, found at sammy.net, allows services running on an internal machine to be accessed from the outside (Internet) network. This program accepts connections on a certain port and proxies those connections to another port on a different machine.
First of all, external connections must be allowed to the port on the Linux box. In my case, this port is 8080. The redir command we execute looks like this:
redir --lport=8080 --caddr=192.168.1.101 --cport=8080 --syslog &
The lport is the listen port of the local machine. The caddr and cport settings are the connection address and port, respectively. The syslog parameter instructs redir to log all connections to the standard system log.
As a result of this command, any connections to the Linux box on port 8080 will be served by the web server running on the VM box (also from port 8080).
There are two versions of the SSH protocol: 1 and 2. I am using version 2 of the protocol (sometimes simply called SSH2). SSH (secure shell) gives me three advantages from a security standpoint:
High degree of authentication security using public key authentication
High degree of transmission security by encrypting all traffic after authentication
Encrypted TCP tunneling for connection to services not available to the public (often referred to as VPN or virtual private networking)
Public key authentication works through the use of a public key and a private key. A key in this context is a string of characters used to identify an individual. Keys can be a variety of lengths. The longer the key, the harder it is to break, but performance in the encryption and decryption process can suffer with longer keys. Usually a tool, such as ssh-keygen, is used to generate key pairs. Information encrypted with the private key can only be decrypted with the public key. Likewise, information encrypted with the public key can only be decrypted with the private key.
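Generating and installing a key pair with OpenSSH looks roughly like this. The key type, file paths and host name are typical defaults of the day, not details from my setup:

```shell
# Generate an SSH2 (DSA) key pair; you will be prompted for a passphrase.
ssh-keygen -t dsa -f ~/.ssh/id_dsa

# Install the public half on the server you want to reach.
cat ~/.ssh/id_dsa.pub | ssh user@server 'cat >> ~/.ssh/authorized_keys2'
```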
Once keys have been generated, they can be used in secure connections with SSH. The private key (as its name implies) should be kept secure at all times. Often it is kept on a floppy disk or other removable media when other people may have access to a machine you also use (I keep mine on a Java-powered iButton, but that is another story). The public key, however, may be distributed, well, publicly. A copy of the public key must be present on the remote server that you wish to connect to using SSH.
When you attempt to make a connection with SSH, the server encrypts a small piece of information, called the session key, using the copy of your public key stored there. The SSH server on the remote machine sends this encrypted session key back to the client, which decrypts it using your private key. Once successfully decrypted, the session key is used to encrypt the rest of the session. This solution is simple and elegant: the information used to encrypt the rest of the session never crosses the insecure Internet in cleartext.
One of the most powerful features of SSH is the ability to use TCP tunnels. There are two types of tunnels: local and remote. With a local tunnel, you specify a port for your local machine to listen on, as well as a remote machine and port on the other side of your SSH connection to connect to. Once a local tunnel is established, you can connect to the specified port on your local machine, and it will be as if you had connected to the remote machine on the port you specified for the tunnel. With remote tunnels, you specify a port for the remote machine to listen on and a local address and port to connect to. When a connection is made to the specified port on the remote machine, it will be as if the connection had been made to the local machine and port you specified for the tunnel. The way these connections work is not always understood, so let's take two examples from my own configuration file:
LocalForward "5521:localhost:1521"
RemoteForward "6010:localhost:6000"
On the Linux box I run a database server. I want to be able to use the client tools remotely, but in a secure way. The listener service is bound to port 1521. In the first example above, the first parameter (5521) is the port that my local machine will listen on. The second and third parameters refer to the remote machine. So, localhost in this case is the remote machine's localhost. After establishing my secure shell session, the tunnels are automatically created from the configuration file. I can then point the client tools to my local machine at port 5521. All traffic generated by the client tools will transparently be forwarded over the secure tunnel to the remote machine.
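The same local tunnel can also be requested on the command line rather than from the configuration file, using the -L flag (the host name here is hypothetical):

```shell
# Listen on local port 5521 and forward to port 1521 as seen from the server.
ssh -L 5521:localhost:1521 user@devbox.example.com
# Now point the database client tools at localhost:5521.
```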
The second example uses the X Window System over a secure tunnel. Without getting into too much detail, X is somewhat counter-intuitive; you run an X server on your local machine, and X clients are run on the remote machine. These clients connect to the X server. X has a notion of display numbers bound to ports. Display 0 uses port 6000, display 1 uses 6001 and so on. In the second example above, the first parameter (6010) refers to the port on the remote machine. In this case, I am using display 10, bound to port 6010, to run my X clients. The second and third parameters refer to my local machine. So, localhost in this case is my machine's localhost. On the remote machine, I might run a command like:
xterm -display localhost:10 &
This will cause xterm to connect to port 6010, which will transparently connect through the secure tunnel to my machine on port 6000 (the default port for the X server).
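For comparison, the remote tunnel expressed as command-line flags, using -R (again, the host name is hypothetical):

```shell
# Ask the remote side to listen on port 6010 and forward to local port 6000.
ssh -R 6010:localhost:6000 user@devbox.example.com
# Then, on the remote machine:
xterm -display localhost:10 &
```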
In actuality, SSH can tunnel X clients automatically through its built-in X11 forwarding support; the manual tunnel above simply makes the mechanics visible.
Since the tunnel is established through an already secure connection, all information passing along the tunnel is encrypted. The upside is a high degree of security over a public and insecure network. The downside is overhead in encrypting everything on the fly.
VMware, from VMware, Inc., is one of the most powerful tools I use. It runs under both Linux and a number of flavors of Windows, and it provides a complete virtual Intel machine. When you power it on, it counts up memory and then boots, just as a physical machine would. It can use virtual hard drives or actual physical drive partitions. You must allocate the physical space you intend to use for a virtual drive up front, but you can add other virtual drives later if you run out of space.
Most of the programming I do is in Java, so I don't ordinarily have platform concerns. But when I am delivering software that I know is going to be run on any of the Windows platforms, I like to make sure that it will work properly; so I run Windows NT 4 server under VMware. This eliminates the need to buy a separate box solely for NT. VMware uses the CD-ROM drive as if it belonged to the virtual machine, so installing NT is no different than it would be on real physical hardware.
VMware includes a number of choices for networking. The bridged networking option is what makes this really usable. Under Linux, kernel loadable modules are installed to support virtual networking. As far as NT is concerned, a valid hardware Ethernet adapter is present in the machine. This adapter has its own IP address on the internal network just like any other machine would.
Once it is set up and running, it appears to every other machine on the network (and to itself) as a normal machine. I run a web server, database server and other services on it, and it is accessible to every other machine on the network.
In the case of the web server, I allow port 8080 through the firewall and have a proxy server (redir described above) that then connects to the NT box. To the outside world, content is received as it would be from any other web server.
VNC or Virtual Network Computing is available at: www.uk.research.att.com/vnc/. It is a freely available, very thin remote control package that includes a client and a server. It is available for Windows and a variety of UNIX flavors, including Linux. It even has a Java applet version of the client, as well as a client for Windows CE devices. The entire binary download (which includes the client and the server) is under 1MB. It can be installed as a service on Windows NT, and it can also be installed to start automatically at boot on other versions of Windows.
VNC is similar to programs such as pcAnywhere and Carbon Copy. Among its pluses: it works over secure tunnels, it can tune down the color depth for faster refreshes and, of course, it's free. The major downside is that it is slower than commercially available remote-control software.
VNC has become a key enabling technology for me to overcome one of the few drawbacks of VMware. VMware runs as just another X client under Linux. This means that it must run against an X server under someone's regular user session. I wanted to ensure that the (virtual) NT server would be up all the time, like a real NT server machine. VNC allowed me to do this because the UNIX version also doubles as an X server. Here is how I set things up:
1. Run the VNC server under Linux.
2. Run the VNC client and connect to the VNC server started in step 1.
3. Run VMware from the VNC session.
4. Start the virtual machine.
After step 4, the VNC client can be killed, leaving VMware running in the background (because it runs within the context of the VNC server's own X server) without being bound to any user session.
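The steps above can be sketched as commands. The display number is just an example, and starting VMware with DISPLAY set is equivalent to launching it from inside the VNC session:

```shell
vncserver :1             # the VNC server doubles as an X server on display :1
vncviewer localhost:1 &  # connect to it to watch what happens
DISPLAY=:1 vmware &      # run VMware against the VNC display
# Power on the virtual machine from the VMware console; the viewer can
# then be closed and VMware keeps running against the VNC X server.
```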
I installed the NT version of VNC on the virtual machine. Once NT is running, I then remote control it using the VNC client.
VNC is not inherently secure, but because of our firewall arrangement and the use of SSH tunnels, I can securely control the virtual NT box remotely. I set up a local tunnel that connects to the (virtual) NT box on the port that the VNC server listens on (5900 by default).
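That tunnel can be sketched like this, with the NT box's address taken from the redir example earlier and a hypothetical host name:

```shell
# Forward local port 5900 to the VNC server on the virtual NT machine.
ssh -L 5900:192.168.1.101:5900 user@devbox.example.com
# Then connect with: vncviewer localhost:0   (display 0 = port 5900)
```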
VNC, like X, ties display numbers to ports: display 0 is 5900, display 1 is 5901 and so on. Once an SSH session is established, the VNC client (vncviewer) can be launched and pointed at localhost on port 5900. As with the other tunnels, this transparently connects to the NT machine on port 5900.
To better understand all these interactions, refer to the ASCII diagram below:
|-----------------------------------|
| Linux                             |
|  |-----------------------------|  |
|  | VNC Server (Linux)          |  |
|  |  |-----------------------|  |  |
|  |  | VMware (NT Server)    |  |  |
|  |  |  |-----------------|  |  |  |
|  |  |  | VNC Server (NT) |  |  |  |
|  |  |  |-----------------|  |  |  |
|  |  |-----------------------|  |  |
|  |-----------------------------|  |
|-----------------------------------|
For all that glowing testimony to the power of Linux as a multiplatform development environment, I did encounter a number of pitfalls.
Getting LVM properly configured and setting up Linux to be able to boot into the LVM environment is not for the faint of heart.
VMware has a lot of support for graphics acceleration when used in the normal way. Because of my particular needs, I lost that graphics acceleration.
While thin, VNC is slow and can have unexplained glitches, like sudden loss of connection.
The layers of security (including on-the-fly encryption) combined with the layers of virtuality in front of the NT server (VMware and VNC) make for a fairly sluggish remote session, even over high-speed lines.
Linux provides an extremely powerful, enterprise-strength platform at a very low cost. Systems with comparable features to the one I have set up could easily run into the mid-to-high five figures, compared to the well-under-$5,000 I have spent.
Having said that, it still requires a fairly large investment: time spent on configuration, a learning curve in the form of reading up on new and as-yet-unreleased technologies, and the risk that comes with both.
The past 10+ years of Linux have shown that it has the ability to gain acceptance and support from big players and will continue to do so. Who would have guessed five years ago that the major database server vendors (Oracle, Sybase) would support Linux? Or that some of the largest corporations in the world (IBM, Sun) would have major software initiatives specifically targeted at Linux?