Linux System Administration: First Tasks
Linux system administration has a place of its own in the hierarchy of information technology specializations. Some people excel in special areas of free software technology but haven't needed to learn system administration. For example, you may specialize in configuring e-mail or writing applications using Apache and MySQL. You may focus only on Domain Names Services and know esoteric ways of setting up servers on provider lines that frequently change IP addresses. But if I asked you to babysit a busy server or servers, you might not have the temperament or have learned the plethora of skills required to do so.
The above does not mean that good system administrators do not excel in areas such as configuring Apache, maintaining DNS zone files or writing Perl Scripts. It simply means that if you want to work as a system administrator in the Linux world, you need to know how to do everything from installing a server to securing the filesystem from mischievous crackers on the Internet. In between, you need to prepare your system to recover from the myriad ways a server can fail.
Consider, for example, a case in which you find that one of the Web sites you manage has gone down; the server has locked up and nothing works. How do you recover in the fastest possible way? Such an event happened to me two weeks ago. One of my articles wound up on Slashdot.org, Digg.com, NewsForge and other sites at the same time. None of my colleagues had seen that much traffic on a Linux site before. Aside from the several million hits on our server, we had a quarter of a million unique visitors concentrated in a five-hour period.
When you see that kind of traffic, you don't want the server to go down or you'll miss new readers. In our situation, a reboot allowed the system to return to service for a few minutes, but then it locked up again. Normally, we used less than ten percent of our system resources, so we thought we had prepared for the hottest day of the year.
Knowing the server and all the running processes, we could shut some down and focus on allowing a massive increase in simultaneous connections to our database. Although we have several thousand subscribers, we turned off processes such as those that restricted comments to registered readers. In the end, we made it through the day with only a short period of down time. But the surge of traffic rocked our boats.
Service outages such as the one described above can happen in the confines of a private network. Many services experience peak usage at specific times. For example, administrators know that one of the heaviest loads they'll have during the day occurs first thing in the morning, when people check their e-mail. People arrive at work about the same time, crank up their e-mail clients and read mail while drinking coffee.
The mail server might experience 75% of its use between 8 and 10 AM. Gateway traffic also increases and bandwidth on the network bogs down. Should you provide separate dedicated servers for mail, routing, proxy and gateway services? The majority of IT shops do that.
What if those computers averaged only 10% of CPU and memory capacity during the course of the day, but required 75% of resources for only a couple of hours a day, five days a week? Rather than buying individual computers, vendors have started recommending higher capacity machines and creating virtual severs.
You might want to configure a little larger metal to provide virtual machines for e-mail and related applications. Then, using Xen for example, you could let each application run in its own space. In that case, you might find server capacity utilization running around 50%, which helps maximize your resources and reduces server sprawl.
A system administrator should know how to climb a learning curve quickly. If a new technology arrives, such as virtualization, you need to master it before it masters you. You also need to know how to apply it in your environment.
What kinds of tasks occupy a system administrators day? That depends on the environment in which he or she works. You may find yourself managing dozens or even hundreds of Web servers. In contrast, you might find yourself running a local area network that supports knowledge workers and/or developers.
Regardless of your environment, you will find that some tasks are common to all system administration functions. For example, monitoring system services and starting and stopping them takes on a role of its own. Your Linux box might appear to be running smoothly while one or more processes have stopped. A Linux server might seem happy on the outside, for example, while the database serving Web pages has failed.
When services to users become critical needs, you need to be prepared and stay ahead of problems. Imagine a failed printing job is locking up a queue, keeping users from getting their documents printed. Do you wait to do something until you hear from irate users, or do you have a way to stay ahead of the problem?
Most system administrators have to face the fact that something will happen at some point that causes down time. Such events usually occur outside of our control. Perhaps your system incurs a power outage or spike. Sometimes a system bug pops up due to a combination of factors that exist only on your server; it's something that never occurred during project testing. In reality, sysadmins never know when a problem will occur; they only know that eventually one will arise.
Administrators need to monitor their systems in an efficient and effective manner. To this end, many administrators have discovered a plethora of monitoring and alert tools within the Free Software community. Some require you to log into a remote system by SSH and run command-line tools such as pstree, lsof, dstat and chkconfig.
Another useful monitoring tool is Checkservice, which provides the status of services on (remote) hosts. It provides results by way of logs, a PHP status page or output to other tools. Some administrators like tiger, which performs a thorough check of a system and reports the results to a log file. You can find a list and explanation of tools for Debian here.
When you have to monitor a larger server farm and do not want to spend all your time logging into remote servers and running command-line tests, look for free software tools you can use with a browser. I like a tool called monit. This monitoring and alert system works on a number of Linux-type systems. Monit provides a system administrator with the ability to define, manage and monitor processes, the filesystem and even devices. You also can configure monit to restart processes if they fail.
Stanford University keeps an updated list of network monitoring tools and sponsors a working group called the Internet End-to-End Performance Monitoring Group. Be sure to check out the latest tools at the top of the Stanford list. Cacti, for example, has become one of the more popular tools among system administrators.
Professional Linux system administration requires you to know a broad number of tasks associated with networking and providing services to users. It takes a special breed of person to work in this capacity. Obviously, many people have both the character and the interest to do the job. Over the next few months, we will explore the tasks that make up Linux system administration. I hope you'll join me for the ride.
|Dynamic DNS—an Object Lesson in Problem Solving||May 21, 2013|
|Using Salt Stack and Vagrant for Drupal Development||May 20, 2013|
|Making Linux and Android Get Along (It's Not as Hard as It Sounds)||May 16, 2013|
|Drupal Is a Framework: Why Everyone Needs to Understand This||May 15, 2013|
|Home, My Backup Data Center||May 13, 2013|
|Non-Linux FOSS: Seashore||May 10, 2013|
- Dynamic DNS—an Object Lesson in Problem Solving
- Making Linux and Android Get Along (It's Not as Hard as It Sounds)
- Using Salt Stack and Vagrant for Drupal Development
- New Products
- A Topic for Discussion - Open Source Feature-Richness?
- Drupal Is a Framework: Why Everyone Needs to Understand This
- Validate an E-Mail Address with PHP, the Right Way
- RSS Feeds
- Readers' Choice Awards
- Tech Tip: Really Simple HTTP Server with Python
3 hours 35 min ago
- Reply to comment | Linux Journal
4 hours 7 min ago
- All the articles you talked
6 hours 31 min ago
- All the articles you talked
6 hours 34 min ago
- All the articles you talked
6 hours 35 min ago
11 hours 36 sec ago
- Keeping track of IP address
12 hours 51 min ago
- Roll your own dynamic dns
18 hours 4 min ago
- Please correct the URL for Salt Stack's web site
21 hours 16 min ago
- Android is Linux -- why no better inter-operation
23 hours 31 min ago
Enter to Win an Adafruit Pi Cobbler Breakout Kit for Raspberry Pi
It's Raspberry Pi month at Linux Journal. Each week in May, Adafruit will be giving away a Pi-related prize to a lucky, randomly drawn LJ reader. Winners will be announced weekly.
Fill out the fields below to enter to win this week's prize-- a Pi Cobbler Breakout Kit for Raspberry Pi.
Congratulations to our winners so far:
- 5-8-13, Pi Starter Pack: Jack Davis
- 5-15-13, Pi Model B 512MB RAM: Patrick Dunn
- 5-21-13, Prototyping Pi Plate Kit: Philip Kirby
- Next winner announced on 5-27-13!
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?