Python Scripts as a Replacement for Bash Utility Scripts

For Linux users, the command line is a celebrated part of our entire experience. Unlike other popular operating systems, where the command line is a scary proposition for all but the most experienced veterans, in the Linux community, command-line use is encouraged. Often the command line can provide a more elegant and efficient solution when compared to doing a similar task with a graphical user interface.

As the Linux community has grown up with a dependence on the command line, UNIX shells, such as bash and zsh, have grown into extremely formidable tools that are central to the command-line experience. With bash and other similar shells, a number of powerful features are available, such as piping, filename wildcarding and the ability to read commands from a file called a script.

Let's look at a real-world example to demonstrate the power of the command line. Every time users log in to a service, their user names are logged to a text file. For this example, let's find out how many unique users use the service.

The series of commands in the following example shows how more complex utilities can be built by chaining together smaller building blocks:


$ cat names.log | sort | uniq | wc -l

The pipe symbol (|) is used to pass the standard output of one command into the standard input of the next command. In the example here, the output of cat names.log is passed into the sort command. The output of the sort command is each line of the file rearranged in alphabetical order. This subsequently is piped into the uniq command, which removes any duplicate names (uniq collapses only adjacent identical lines, which is why the input is sorted first). Finally, the output of uniq is passed to the wc command. wc is a counting command, and with the -l flag set, it returns the number of lines. In this way, a number of simple commands can be chained together to answer the question.
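For comparison, here is a minimal Python sketch of what that pipeline computes. The function name count_unique and the sample data are assumptions made for illustration; they are not part of the original example.

```python
def count_unique(lines):
    """Return the number of distinct lines, ignoring the trailing newline."""
    return len({line.rstrip("\n") for line in lines})


# Hypothetical sample standing in for the contents of names.log.
sample = ["alice\n", "bob\n", "alice\n", "carol\n"]
print(count_unique(sample))  # prints 3
```

A set keeps only one copy of each value, so no sorting step is needed here, whereas uniq relies on sort to bring duplicates next to each other.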

However, sometimes what is needed can become quite complex, and chaining commands together can become unwieldy. In that case, shell scripts are the answer. A shell script is a list of commands that are read by the shell and executed in order. Shell scripts also support some programming language fundamentals, such as variables, flow control and data structures. Shell scripts can be very useful for batch jobs that will be run often and repeatedly. Unfortunately, shell scripts come with some disadvantages:

  • Shell scripts easily can become overly complicated and unreadable to a developer wanting to improve or maintain them.

  • Often the syntax and interpreter for these shell scripts can be awkward and unintuitive. The more awkward the syntax, the less readable it is for the developer who must work with these scripts.

  • The code is generally unusable in other scripts. Code reuse among scripts tends to be difficult, and scripts tend to be very specific to a certain problem.

  • Libraries for advanced features, such as HTML parsing or HTTP requests, are not as easily available as they are with modern programming and scripting languages.

These problems can make shell scripting an awkward undertaking and often can lead to a lot of wasted developer time. Instead, the Python programming language can be used as a very able replacement. There are many benefits to using Python as a replacement for shell scripts:

  • Python is installed by default on most major Linux distributions. Opening a command line and typing python (or python3 on newer systems) immediately will drop you into a Python interpreter. This ubiquity makes it a sensible choice for most scripting tasks.

  • Python has a syntax that is very easy to read and understand. Its style emphasizes minimalism and clean code while allowing the developer to write in a bare-bones style that suits shell scripting.

  • Python is an interpreted language, meaning there is no separate compile stage to run by hand. This makes Python an ideal language for scripting. Python also comes with a Read-Eval-Print Loop (REPL), which allows you to try out new code quickly and interactively. This lets the developer tinker with ideas without having to write the full program out into a file.

  • Python is a fully featured programming language. Code reuse is simple, because Python modules easily can be imported and used in any Python script. Scripts easily can be extended or built upon.

  • Python has access to an excellent standard library and thousands of third-party libraries for all sorts of advanced utilities, such as parsers and request libraries. For instance, Python's standard library includes datetime libraries that allow you to parse dates, print them in any format that you specify and compare them to other dates easily.

  • Python can be a simple link in the chain. Python should not replace all the bash commands. It is as powerful to write Python programs that behave in a UNIX fashion (that is, read in standard input and write to standard output) as it is to write Python replacements for existing shell commands, such as cat and sort.
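As a small illustration of the standard-library point in the list above, the following sketch parses two timestamps with the datetime module and compares them. The timestamp layout in LOG_FORMAT and the helper name parse_timestamp are assumptions chosen for the example, not something taken from a real log.

```python
from datetime import datetime

# Assumed layout of a timestamp; adjust to match the real data.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_timestamp(text):
    """Turn a timestamp string into a datetime object."""
    return datetime.strptime(text, LOG_FORMAT)


# Hypothetical login times.
first = parse_timestamp("2013-01-15 09:30:00")
second = parse_timestamp("2013-02-01 17:45:10")

print(first < second)                # datetime objects compare chronologically
print(first.strftime("%d/%m/%Y"))    # re-format into a different layout
print((second - first).days)         # whole days between the two logins
```

Doing the same comparison in a shell script usually means shelling out to the date command and comparing epoch seconds by hand.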

Let's build on the problem that was solved earlier in this article. Besides the work already done, let's find out how many times a certain user has logged in to the system. The uniq command simply removes duplicates but gives no information on how many duplicates there are. (Its -c option does add a count to each line, but the goal here is to show how Python fits into a pipeline.) Instead of uniq, a Python script can be used as another command in the chain. Here's a Python program to do this (in my examples, I refer to this file as namescount.py):


#!/usr/bin/env python
import sys

if __name__ == "__main__":
    # Initialize a names dictionary as empty to start with.
    # Each key in this dictionary will be a name and the value
    # will be the number of times that name appears.
    names = {}
    # sys.stdin is a file object. All the same functions that
    # can be applied to a file object can be applied to sys.stdin.
    # Iterating over it yields one line at a time.
    for name in sys.stdin:
        # Each line will have a newline on the end
        # that should be removed.
        name = name.strip()
        if name in names:
            names[name] += 1
        else:
            names[name] = 1

    # Iterating over the dictionary,
    # print the number of times each name appeared,
    # followed by a tab, followed by the name.
    # items() works in both Python 2 and Python 3.
    for name, count in names.items():
        sys.stdout.write("%d\t%s\n" % (count, name))

Let's look at how this Python script fits into the chain of commands. First, it reads in input from standard input exposed through the sys.stdin object. Any output is written to the sys.stdout object, which is how standard output is implemented in Python. A Python dictionary (often called a hash map in other languages) is used to get a mapping from the user name to the duplicate count. To get a count of all the users, execute the following:
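The dictionary bookkeeping described above also can be sketched with collections.Counter from the standard library, which is a dictionary subclass built for counting. This is an alternative illustration, not the script from the article; the helper names count_names and format_counts and the sample list are assumptions.

```python
from collections import Counter


def count_names(lines):
    """Map each stripped line to the number of times it occurs."""
    return Counter(line.strip() for line in lines)


def format_counts(counts):
    """Render each entry as 'count<TAB>name', matching the script's output."""
    return ["%d\t%s" % (count, name) for name, count in counts.items()]


# Hypothetical sample standing in for lines read from standard input.
sample = ["alice\n", "bob\n", "alice\n"]
for row in format_counts(count_names(sample)):
    print(row)
```

Because sys.stdin is an iterable of lines, a real script could pass it to count_names in place of the sample list.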


$ cat names.log | python namescount.py

This displays a count of how many times a user appears along with the user's name, using a tab as a separator. The next thing to do is display, in order, the users who used the system most often. This can be done at the Python level, but let's implement it using the utilities that are already provided by the core UNIX utilities. Previously, I used the sort command to sort alphabetically. If the command is provided with the -rn flags, it sorts the lines numerically, in descending order. As the Python script prints to standard output, you simply can pipe the command into sort and retrieve the output you want:


$ cat names.log | python namescount.py | sort -rn
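For readers curious about what doing the ordering "at the Python level" might look like, here is a hedged sketch using Counter.most_common from the standard library. The function name most_frequent and the sample data are assumptions for illustration only.

```python
from collections import Counter


def most_frequent(lines):
    """Return (name, count) pairs ordered from most to least frequent."""
    return Counter(line.strip() for line in lines).most_common()


# Hypothetical sample standing in for the contents of names.log.
sample = ["bob\n", "alice\n", "bob\n", "carol\n", "bob\n", "alice\n"]
for name, count in most_frequent(sample):
    print("%d\t%s" % (count, name))
```

Names with equal counts may come out in a different order than with sort -rn, so the two approaches agree on the ranking by count but not necessarily on how ties are arranged.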

This is an example of the power of using Python as part of a chain of commands. The advantages of using Python in this scenario are as follows:

  • The ability to chain with tools like cat and sort. Simple utilities (reading a file line by line and sorting a file numerically) are handled by tried-and-trusted UNIX commands. These commands are designed around line-oriented text, which means they can scale to large files, and they are very quick.

  • When some heavy lifting is needed in the chain, a very clear, concise Python script can be written, which does what it needs to do and then offloads the responsibility to the next link in the chain.

  • It is a reusable module. Although this example is specifically about names, if you feed the script any input that contains duplicate lines, it will print out each line and the number of times it occurs. Making the Python code modular allows you to apply it in a range of scenarios.

______________________

Richard Delaney is a software engineer with Demonware Ireland. Richard works on back-end Web services using Python and the Django Web framework. He has been an avid Linux user and evangelist for the past five years.

Comments

Unconvincing Objective

Anonymous's picture

Wow! The article is about how difficult it is to interpret shell scripts, and then it offers yet another syntactically convoluted language. The problem with most scripting languages is that they lack strict programming semantics. Granted, most newbie programmers are educated using C++, which is the main hype around Python, but the real problem of simplicity isn't being resolved. If a language offers a multitude of ways to code, it forces the programmer to open a book again. How many programmers can honestly say they are fluent in all high- and low-level languages? I suggest to you that "C" is the basic language we all know and that TCL is the most compatible syntactically, is shell universal, not to mention strict and simple. It has all the good qualities of a scripting language: it has an interpreter (type tclsh), can be used on a shell command line to pipe with other shell commands, has good GUI packages, has a great debugger and adheres to strict syntactical rules. The problem with programmers these days is that they forget the KISS philosophy.

AWK as a half-way solution

Magic Banana's picture

As other readers mention, the proper solution is:
$ sort names.log | uniq -c

Anyway, one often needs some more complicated processing and AWK really is a fantastic language to effectively process text files. It is a mix between the Shell (with $i as the ith field, pipes, redirection, etc.), C (variables, arithmetic, hash maps, loops, etc.) and sed (line matching w.r.t. regexps, functions implementing the sed's 's' command, etc.). The command above becomes, in AWK:
$ awk '{ ++names[$0] }
END { for (name in names) print name, names[name] }' names.log

Pretty straightforward, isn't it?

tiny bug

perfectionatic's picture

Great article. I noticed a tiny bug in the last python script. GMAIL_EMAIL shouldn't be there.

choices

Felipe's picture

UNIX/Linux is all about choices. I pick between Python and Bash all the time (with a sprinkling of awk). The more serious the task or the more likely the script will have a long life, the more likely it is that I will use Python.

The one thing I will say to support the shell and pipes approach is how easy it is to incrementally debug your work. At any step, replacing the remainder of the pipeline with more gives you a quick way to see how you are doing. While tossing in print statements in Python is not hard, it just is more typing.

Depending on your programming knowledge, the right approach for you will probably be different. But, that said, if you are building something that is not write-only (that is, write the code, run it and throw it away) picking a scripting language such as Python or Ruby will generally pay off in the long run.

Perl is used that way for

hmepas's picture

Perl has been used that way for DECADES; it is pretty handy with its perl -ne 'print' way of calling.

This is one of the few cases where Perl is much handier than Python.

Excellent article. I will be

Anonymous's picture

Excellent article. I will be using more Python in bash soon.

why not +x it

d.py's picture

No need to pipe to "python namescount.py". Just make sure you have the hashbang in your .py file and chmod it executable; then you can pipe your file directly to the script (as ./namescount.py, or plain namescount.py if its directory is on your PATH):

cat names.log | ./namescount.py

Would seem a bit more elegant to me.

Guys, it's really strange to

Anonymous's picture

Guys, it's really strange to post code in 2013 without _any_ syntax highlighting, especially in "_linux_journal".

Thanks for the tips

anonymous's picture

I recently made the leap into Python programming and I have to say that it is an easy language to use. Forcing code structure in a language was pure genius if you ask me.

I like the fact that core "modules" are included out of the box, which helps when you need to write a program that must function across hundreds of Linux servers. You don't have this luxury with Perl. Using yum or apt-get on servers with different OS patch levels, or lacking internet connections, firewall issues, etc. is too much of a headache.
Shell scripting can be a pain too depending on what shell you are using (Bash, Korn, etc.) and the personal preferences of the admin responsible for that box. The minute differences in syntax can cause hours of troubleshooting due to spaces, braces, brackets, character case...

Python is definitely a great tool to use if you need to write scripts/programs that must be used widely and interpreted by many.

I see the point in showing

Anonymous's picture

I see the point in showing the use of piping, but "cat names.log | sort | uniq" is dumb: it's the same as "sort -u names.log".

Hey of course you are

Richy Delaney's picture

Hey of course you are right,

this was written only to illustrate piping.

Often I find it easier to chain a ton of commands together rather than remember each flag for each binary. Even still, your idea is cleaner and better.

python -c

Roberto S.'s picture

I often use Perl with "-e", so I can hack quick snippets of code that do the work easier than some shell scripting. The same is possible in Python, using "python -c". Like:

cat blabla.log | python -c 'import sys; a= ... ;' | less

Which is handier, IMHO, than writing a file for things that you'll use only once or twice.

Overkill

Anonymous's picture

Use Ruby. Trust me, it's much better than Python! And since Ruby 1.9, it's faster, too, as the default interpreter has changed from the MRI to YARV.

Though honestly, for most things, shell scripts are easier to write than scripts in full-blown programming languages (and the line between programming and scripting is getting increasingly blurred)...

could be done in one line with perl

Anonymous's picture

without even writing a script, using "perl -ne"

sort -u

Anonymous's picture

$ cat names.log | sort | uniq | wc -l

really?

Are you trying to make things look complicated?

sort -u names.log | wc -l

is all you need.

Hey, Thanks for reading the

Richy Delaney's picture

Hey,

Thanks for reading the article.

sort -u names.log | wc -l

is certainly a nicer way to write that; however, it is the way it is in the article for a number of reasons. First, I wanted to illustrate piping.

Also, I would mention that "sort -u" seems a little less clear to me than "sort | uniq". As a result, I would be inclined to go for the latter. You will find in a lot of the bsd operating systems, the unix commands have less command line arguments.

Cheers
Richy

Command line argument options

Anonymous's picture

Not to be snarky, but regarding "You will find in a lot of the bsd operating systems, the unix commands have less command line arguments.":

This site is LinuxJournal.com, is it not?

One would think the articles are geared towards the Linux, and not any *nix, community. It doesn't hurt to revise an article after it's been published, especially when it'll remain published indefinitely.

If using shortened switches clouds readability, one may always resort to the longer switches, e.g., --unique rather than -u for the sort command.

uniq -c

Hermann Schwärzler's picture

Thank you for this article. It was really interesting. I am planning to consider python the next time I am going to write a shell-script.

But for the record: You write on Page 1

The uniq command simply removes duplicates but gives no information on how many duplicates there are.

If you use uniq -c the output will be the same as with your namescount.py: every unique value is preceded by the number of its occurrences.

Greetings
Hermann

You can try install IPython

Anonymous's picture

You can try installing IPython as a shell; it is very easy.

Another cool feature i used

kle_py's picture

Another cool feature i used some time ago is
i could build and test my script in windows, then copy it over to a Linux/Unix box and it worked the same way ;)
