Sysadmin 101: Automation

This is the second in a series of articles on systems administrator fundamentals. These days, DevOps has made even the job title "systems administrator" seem a bit archaic, much like the "systems analyst" title it replaced. These DevOps positions are rather different from sysadmin jobs in the past. They place a much larger emphasis on software development, far beyond basic shell scripting, and as a result, they often are filled by people with software development backgrounds and little prior sysadmin experience. In the past, a sysadmin would enter the role at a junior level and be mentored by a senior sysadmin on the team, but these days, many companies go quite a while with cloud outsourcing before making their first DevOps hire. As a result, the DevOps engineer might be thrust into the role at a junior level with no mentor around apart from search engines and Stack Overflow posts.

In this series, I'm going to expound on some of the lessons I've learned through the years that might be obvious to longtime sysadmins but may be news to someone just coming into this position.

In the first article in this series, I talked about how to approach alerting and on-call rotations as a sysadmin. In this article, I discuss how to automate yourself out of your job. There is a quote that you see from time to time in sysadmin circles that goes something along the lines of "Be careful or I will replace you with a tiny shell script." Good system administrators hate performing mundane tasks and constantly seek to apply that saying to themselves. That said, there are many different approaches to automation, and not all of them result in time savings. Here, I discuss my experience with automation and describe what, when, why and how you should (and shouldn't) automate.

Why You Should Automate

There are a number of different reasons why you should take steps to automate your work as a sysadmin:

1) It frees up time spent doing mundane tasks to focus on more important work.

With all of the automation that's already built in to servers these days, it's easy to take for granted just how many mundane tasks sysadmins have had to perform in the past. Logs weren't always rotated automatically; backups usually were home-grown affairs that often were triggered manually. Even now, there still are system administrators who install every single server by hand, log in to each machine manually to install or update software, and edit configuration files on the host by hand.

Let's take server OS installation as an example—a modern interactive server OS installation may take anywhere from 15 minutes to an hour of sysadmin time to walk through and answer questions. These are the kinds of actions that don't really require a sysadmin's expertise once you've made the initial decisions about how you want a server to be set up. By automating these mundane tasks, you can get back to the more difficult work that does require your expertise.

2) Automation reduces mistakes in routine tasks.

The thing about performing the same task over and over by hand is that it is easy to make mistakes, and if it's something you do every day, eventually you may even stop paying attention to whether the task succeeded. Also, the way you perform a certain task might be a little bit different from how another administrator on the team does it. By automating a task, the team can agree on the ideal way to perform it and know that when you run your automation script, it is performed the same way every single time with no skipped steps or commands run in the wrong order.
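
As a minimal sketch of that idea in Python, the following runs a task's steps in one agreed-upon order and stops at the first failure, so no step can be skipped or run out of order. The task and its step names (rotate_logs, back_up_database, prune_old_backups) are hypothetical stand-ins, stubbed out for illustration; they are not taken from any real tool.

```python
from typing import Callable, List, Tuple

# A step is a (name, function) pair; the function returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_steps(steps: List[Step]) -> List[str]:
    """Run steps in order and return the names of the steps that completed.

    Raises RuntimeError at the first failing step, so later steps never
    run against a half-finished task.
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            raise RuntimeError(f"step {name!r} failed after {completed}")
        completed.append(name)
    return completed


# Hypothetical steps, stubbed out for illustration only.
def rotate_logs() -> bool:
    return True


def back_up_database() -> bool:
    return True


def prune_old_backups() -> bool:
    return True


# The order agreed on by the team lives in exactly one place.
NIGHTLY_TASK: List[Step] = [
    ("rotate_logs", rotate_logs),
    ("back_up_database", back_up_database),
    ("prune_old_backups", prune_old_backups),
]

if __name__ == "__main__":
    print(run_steps(NIGHTLY_TASK))
```

Because the list of steps is data, changing the agreed procedure means editing one list rather than retraining everyone on the team.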

3) Automation allows everyone on the team to be productive.

With automation, you can take even a complex process and reduce it down to a command. That command then becomes something that anyone on the team can run, whereas the complex process may have required more senior members of the team. Take production software deployment as an example: often there is a complex arrangement of putting the load balancer and monitoring into maintenance mode, software versions to check, mirrors to sync up, and services to restart and test. Even though these individual steps may be mundane, combined, they become pretty complicated and could overwhelm a junior member of the team, especially when production uptime hangs in the balance. By automating that process, senior administrators can put all of their expertise into creating the right process that performs the right checks, and they can go on vacation knowing that anyone else on the team now can perform the task the right way.
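
Here is a sketch, in Python, of what such a single deployment command might wrap. Every step name (check_version, enable_maintenance, sync_mirrors and so on) is a hypothetical placeholder for whatever your load balancer, monitoring system and mirrors actually require. The sketch is only meant to show the fixed ordering and the guarantee that maintenance mode is lifted even when a step fails.

```python
from typing import Callable, Dict, List

# Hypothetical step names, listed in the order they are expected to run.
STEP_ORDER: List[str] = [
    "check_version",
    "enable_maintenance",
    "sync_mirrors",
    "restart_services",
    "smoke_test",
    "disable_maintenance",
]


def deploy(version: str, actions: Dict[str, Callable[[str], None]]) -> List[str]:
    """Run a deployment in a fixed order and return the steps that ran.

    `actions` maps each name in STEP_ORDER to a callable that takes the
    version and raises an exception if the step fails.
    """
    log: List[str] = []

    def run(step: str) -> None:
        actions[step](version)
        log.append(step)

    run("check_version")       # expected to raise if the version is wrong
    run("enable_maintenance")  # load balancer and monitoring
    try:
        run("sync_mirrors")
        run("restart_services")
        run("smoke_test")
    finally:
        # Always leave maintenance mode, even when a step above raised.
        run("disable_maintenance")
    return log


if __name__ == "__main__":
    def noop(version: str) -> None:
        """Stand-in for a real step; does nothing."""

    print(deploy("1.2.3", {name: noop for name in STEP_ORDER}))
```

The senior administrators' expertise goes into the callables behind each step name; whoever runs the command only has to supply a version.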

4) Automation reduces documentation workload.

Often instead of automating a task, a sysadmin team will spend time documenting a process. There is still an important place for documentation, and in the next section, I discuss when that makes sense and when it doesn't. The fact is, though, that if you take an entire process and put it into a single automated task, you no longer need a full wiki page of documentation (that inevitably will become out of date), because you've reduced it down to "run this command". Because the process is now automated, you also know the process is kept up to date; otherwise, the script wouldn't work.
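
One way to let the command carry its own documentation is to describe its arguments where they are parsed, so the --help output stands in for the wiki page. The Python sketch below assumes a hypothetical add-user command; the command name, its options and the placeholder body are illustrative only.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Describe the task once, here, instead of on a wiki page."""
    parser = argparse.ArgumentParser(
        prog="add-user",  # hypothetical command name
        description="Create an account and add it to the team's groups.",
    )
    parser.add_argument("username", help="login name for the new account")
    parser.add_argument(
        "--group",
        action="append",
        default=[],
        help="group to add the user to (may be repeated)",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="show what would be done without changing anything",
    )
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    mode = "would create" if args.dry_run else "creating"
    # The real account-creation steps would go here; this sketch only reports.
    print(f"{mode} {args.username} in groups {args.group}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Running the script with --help prints the description and every option, and that text changes whenever the parser does.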


Kyle Rankin is VP of engineering operations at Final, Inc., the author of many books including Linux Hardening in Hostile Networks, DevOps Troubleshooting and The Official Ubuntu Server Book, and a columnist for Linux Journal. Follow him @kylerankin