Python Scripts as a Replacement for Bash Utility Scripts
Python scripts that are used on the command line often take arguments to give users options when they run a command. For example, the head command takes a -n argument followed by a number and prints only that number of lines. Each argument that is provided to a Python script is exposed through the sys.argv list, which can be accessed by first importing sys. The code below shows how to take single words as arguments. This program is a simple adder, which takes two number arguments, adds them and prints the result to the user. However, this way of taking in command-line arguments is rather basic. It is easy to make mistakes: for instance, pass two strings, such as hello and world, to this command, and you will start to get errors:
    #!/usr/bin/env python
    import sys

    if __name__ == "__main__":
        # The first element of sys.argv is always the filename,
        # meaning that the length of sys.argv will be more than
        # two when both command-line arguments exist.
        if len(sys.argv) > 2:
            num1 = int(sys.argv[1])
            num2 = int(sys.argv[2])
        else:
            print("This command takes two arguments and adds them")
            print("Less than two arguments given.")
            sys.exit(1)
        print(num1 + num2)
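One way to see how fragile the raw sys.argv approach is, and what guarding it by hand involves, is the minimal sketch below. The function name parse_and_add and its error messages are made up for illustration and are not part of the original script.

```python
def parse_and_add(argv):
    """Return the sum of the two integer arguments that follow the script name.

    Hypothetical helper: raises ValueError with a readable message
    instead of letting a bare traceback reach the user.
    """
    if len(argv) < 3:
        raise ValueError("expected two numbers, got %d" % (len(argv) - 1))
    try:
        return int(argv[1]) + int(argv[2])
    except ValueError:
        # int() rejects anything that is not a whole number, e.g. "hello".
        raise ValueError("both arguments must be integers: %r %r"
                         % (argv[1], argv[2]))
```

A script could call parse_and_add(sys.argv), print the result, and on ValueError print the message and exit with a non-zero status. Every script written this way has to repeat that checking, which is the tedium the modules described next take away.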
Thankfully, Python has a number of modules to deal with command-line arguments. My personal favorite is OptionParser. OptionParser is part of the optparse module that is provided by the standard library. OptionParser allows you to do a range of very useful things with command-line arguments:
It lets you specify a default if a certain argument is not provided.
It supports both argument flags (either present or not) and arguments with values (-n 10000).
It supports different formats of passing arguments: for example, -n 100000 and -n100000 for short options, or --name value and --name=value for long ones.
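The three features above can be sketched in a few lines. The option names (-n/--lines and -v/--verbose) are invented for this illustration; only the OptionParser calls themselves come from the standard library.

```python
from optparse import OptionParser

def build_parser():
    """Build a small parser with one valued option and one flag."""
    parser = OptionParser(usage="usage: %prog [options] [files]")
    # An option that takes a value, is converted to int and has a default.
    parser.add_option("-n", "--lines", dest="lines", type="int", default=10,
                      help="number of lines to print")
    # A flag: True when present, False when absent.
    parser.add_option("-v", "--verbose", dest="verbose", action="store_true",
                      default=False, help="print extra detail")
    return parser
```

Calling build_parser().parse_args() with no arguments parses sys.argv[1:]. Passing a list instead, such as parse_args(["-n", "100000", "-v"]), is handy for trying the parser out interactively.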
Let's use the OptionParser to enhance the mail-sending script. The original script had a lot of variables hard-coded into place, such as the SMTP details and the user's login credentials. In the code provided below, command-line arguments are used to pass in these variables:
    #!/usr/bin/env python
    import smtplib
    import sys
    from optparse import OptionParser

    def initialize_smtp_server(smtpserver, smtpport, email, pwd):
        '''
        This function initializes and greets the SMTP server.
        It logs in using the provided credentials and returns
        the SMTP server object as a result.
        '''
        smtpserver = smtplib.SMTP(smtpserver, smtpport)
        smtpserver.ehlo()
        smtpserver.starttls()
        smtpserver.ehlo()
        smtpserver.login(email, pwd)
        return smtpserver

    def send_thank_you_mail(email, from_email, smtpserver):
        to_email = email
        subj = "Thanks for being an active commenter"
        # The header consists of the To and From and Subject lines
        # separated using a newline character.
        header = "To:%s\nFrom:%s\nSubject:%s \n" % (to_email, from_email, subj)
        # Hard-coded templates are not best practice.
        msg_body = """
    Hi %s,

    Thank you very much for your repeated comments on our service.
    The interaction is much appreciated.

    Thank You.""" % email
        content = header + "\n" + msg_body
        smtpserver.sendmail(from_email, to_email, content)

    if __name__ == "__main__":
        usage = "usage: %prog [options]"
        parser = OptionParser(usage=usage)
        parser.add_option("--email", dest="email",
                          help="email to login to smtp server")
        parser.add_option("--pwd", dest="pwd",
                          help="password to login to smtp server")
        parser.add_option("--smtp-server", dest="smtpserver",
                          help="smtp server url", default="smtp.gmail.com")
        parser.add_option("--smtp-port", dest="smtpserverport", type="int",
                          help="smtp server port", default=587)
        options, args = parser.parse_args()
        # Both values are required, so fail if either one is missing.
        if not (options.email and options.pwd):
            parser.error("Must provide both an email and a password")
        smtpserver = initialize_smtp_server(options.smtpserver,
                                            options.smtpserverport,
                                            options.email, options.pwd)
        # Send one mail for every line of input. GMAIL_EMAIL is not
        # defined in this version, so the login address is used as sender.
        for line in sys.stdin:
            email = line.strip()  # drop the trailing newline
            if email:
                send_thank_you_mail(email, options.email, smtpserver)
        smtpserver.close()
This script shows the usefulness of OptionParser. It provides a simple, easy-to-use interface for command-line arguments, allowing you to define certain properties for each command-line option. It also allows you to specify default values. If certain arguments are not provided, it allows you to throw specific errors.
So what have you learned? Instead of replacing a series of bash commands with one Python script, it often is better to have Python do only the heavy lifting in the middle. This allows for more modular and reusable scripts, while also tapping into the power of all that Python offers. Using stdin as a file object allows Python to read input, which is piped to it from other commands, and writing to stdout allows it to continue passing the information through the piping system. Combining information like this can make for some very powerful programs. The examples I have given here are all for a fictional service that logs to a file.
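As a reminder of how small such a middle stage can be, here is a minimal sketch of a filter that reads from one file object and writes to another. The function name filter_lines and the idea of matching on a substring are assumptions made for this illustration.

```python
def filter_lines(instream, outstream, needle):
    """Copy only the lines containing needle from instream to outstream.

    Iterating over a file object yields one line at a time, so the
    whole input never has to sit in memory. Returns the number of
    lines written.
    """
    written = 0
    for line in instream:
        if needle in line:
            outstream.write(line)
            written += 1
    return written
```

In a script, calling filter_lines(sys.stdin, sys.stdout, "ERROR") after importing sys turns this into a pipeline stage that can sit between, say, cat and sort. Because the function takes any file objects, it also can be exercised without a shell at all.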
As a real-world example, I recently have been working with gigabytes of CSV files, using a Python script to convert them into a file of SQL commands that insert the data. To give a sense of the scale involved: for a single table, the script took 23 hours to execute and generated an SQL file that was 20GB in size. The advantage of using a Python script in the fashion described in this article is that the whole file does not need to be read into memory, so an entire 20GB+ file can be processed one line at a time. It also is easier to think about a problem when each step (reading, sorting, manipulation and writing) is kept logically separate. And because the commands around the script are part of the core utilities of a UNIX-like environment, and so are efficient and stable, the entire pipeline is more stable and secure.
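The script used for that conversion is not shown in this article, so the following is only a rough sketch of the general idea. It assumes that the first CSV row holds the column names, that every value can be inserted as a quoted string and that doubling single quotes is sufficient escaping for the target database. The function name csv_to_inserts is hypothetical.

```python
import csv

def csv_to_inserts(instream, outstream, table):
    """Stream CSV rows from instream and write one INSERT per row.

    Assumptions (not from the original article): the first row is a
    header, all values are written as quoted strings, and single
    quotes are escaped by doubling them.
    """
    reader = csv.reader(instream)
    columns = next(reader, None)  # header row, or None for empty input
    if columns is None:
        return 0
    written = 0
    for row in reader:  # one row at a time, never the whole file
        values = ", ".join("'%s'" % value.replace("'", "''") for value in row)
        outstream.write("INSERT INTO %s (%s) VALUES (%s);\n"
                        % (table, ", ".join(columns), values))
        written += 1
    return written
```

Called with sys.stdin and sys.stdout, a function like this keeps memory use flat no matter how large the input is, because each row is written out before the next one is read.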
The other benefit is that no hard-coded file is read in. The flexibility of passing the script a stream of lines, rather than tying it to the concept of a file, is very powerful. For instance, if the script breaks 20,000 lines into a certain file, then instead of re-running the script from the start, tail can be used to read only from the line on which the script failed (for example, tail -n +20000 starts output at line 20,000).
There are a lot of aspects to Python in the shell that go beyond the scope of this article, such as the os module and the subprocess module. The os module is a standard library module that holds a lot of key operating system-level operations, such as listing directories and stat-ing files, along with an excellent submodule, os.path, that deals with normalizing directory paths. The subprocess module allows Python programs to run system commands and perform other advanced operations, such as handling piping between spawned processes, as described above, from within Python code. Both of these libraries are worth checking out if you intend to do any Python shell scripting.
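To give a taste of both modules, here is a brief sketch. It assumes a UNIX-like system where the sort and uniq commands are available. The helper names list_logs and count_duplicates are invented for this example.

```python
import os
import subprocess

def list_logs(directory):
    """Return the normalized paths of the .log files in directory."""
    names = sorted(n for n in os.listdir(directory) if n.endswith(".log"))
    return [os.path.normpath(os.path.join(directory, n)) for n in names]

def count_duplicates(lines):
    """Run the equivalent of `sort | uniq -c` on lines, return its output."""
    sort = subprocess.Popen(["sort"], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    # Connect sort's stdout directly to uniq's stdin, as a shell pipe would.
    uniq = subprocess.Popen(["uniq", "-c"], stdin=sort.stdout,
                            stdout=subprocess.PIPE)
    sort.stdout.close()  # only uniq should hold the read end of the pipe
    sort.stdin.write(("\n".join(lines) + "\n").encode("utf-8"))
    sort.stdin.close()   # signal end of input so sort can emit its output
    output = uniq.communicate()[0]
    sort.wait()
    return output.decode("utf-8")
```

Writing all of the input before reading any output is fine here because sort emits nothing until its input ends. A stage that produces output while still reading would need its input fed more carefully to avoid a deadlock.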
Richard Delaney is a software engineer with Demonware Ireland. Richard works on back-end Web services using Python and the Django Web framework. He has been an avid Linux user and evangelist for the past five years.