Image Manipulation with ImageMagick

I've spent a lot of time in my column talking about text processing and analysis, with the basic assumption that if you're using the command line, you're focused on text. That's not always true, and if you work with images at all (whether JPEG, PNG, GIF or another format), there's a free-to-download suite of image-related utilities that offers rather amazing capabilities directly from the command line and, therefore, also from within shell scripts.

I'm talking about ImageMagick, a set of programs that has grown and expanded through the years and now includes powerful Perl and Ruby interfaces too. But, pshaw! We don't need no stinkin' Perl or Ruby. We'll stick with our hard-core shell commands, thank you very much.

You'll find both downloadable binaries and source at http://www.imagemagick.org, and as always, I recommend you download the source and compile it on your system if you can. It's far more reliable than hoping someone else's compiled version is optimized for your own hardware configuration.

A variety of commands are included with the ImageMagick distribution, and I divide them into "analysis" and "editing" tools. For this article, let's stick with the analysis tools. Let me start by showing you how much more information ImageMagick offers about a typical image file than the standard Linux command-line tools do.

Analyzing Images for Non-Optimized Resolutions

If you've been using Linux for even a short time, you've probably learned about the file command. It can be helpful with some file types:


$ file wp-content.tar.gz
wp-content.tar.gz: gzip compressed data, from Unix

But, the command is generally useless with images:


$ file pvp.jpg
pvp.jpg: JPEG image data, EXIF standard

Um, what about image size? How about any useful info at all? Jeez.

Enter the ImageMagick identify command:


$ identify pvp.jpg
pvp.jpg JPEG 970x311 DirectClass 114kb 0.010u 0:01

Ahh...so this particular image has the dimensions (the suite refers to dimensions as the "geometry" of the image) of 970x311. That's useful.

Do you want even more information though? The -verbose option spits out a somewhat overwhelming amount of data:


$ identify -verbose pvp.jpg
Image: pvp.jpg
  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Geometry: 970x311
  Class: DirectClass
  Colorspace: RGB
  Type: TrueColor
  Depth: 8 bits
  Endianess: Undefined
  Channel depth:
    Red: 8-bits
    Green: 8-bits
    Blue: 8-bits
  Channel statistics:
    Red:
      Min: 0
      Max: 255
      Mean: 180.72
      Standard deviation: 74.2122
    Green:
      Min: 0
      Max: 255
      Mean: 168.593
      Standard deviation: 76.0343
    Blue:
      Min: 0
      Max: 255
      Mean: 169.459
      Standard deviation: 77.244
  Colors: 21864
  Rendering-intent: Undefined
  Resolution: 72x72
  Units: Undefined
  Filesize: 114kb
  Interlace: None
  Background Color: white
  Border Color: #DFDFDF
  Matte Color: grey74
  Dispose: Undefined
  Iterations: 0
  Compression: JPEG
  Orientation: Undefined
  JPEG-Quality: 94
  JPEG-Colorspace: 2
  JPEG-Sampling-factors: 1x1,1x1,1x1
  signature: bc8a6a698ca35fd8feab71452423386ff98b1fb7b5ec ...
  Profile-xmp: 811 bytes
  Profile-exif: 22 bytes
    unknown
  Profile-app12: 15 bytes
  Tainted: False
  User Time: 0.020u
  Elapsed Time: 0:01

Truth be told, dimensions and resolution are the most useful pieces of information from this crazy-long output.

With a tiny bit of effort, you can extract just those items of information:


$ identify -verbose pvp.jpg | grep -E '(Resolution:|Geometry:)'
  Geometry: 970x311
  Resolution: 72x72
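
If you'd rather have both values on a single line, say for a log file, awk can do the filtering and the formatting in one pass. This is only a minimal sketch: summarize_image is a name I made up for the example, and it assumes the verbose output uses the "Geometry:" and "Resolution:" labels shown above (some ImageMagick versions append an offset such as +0+0 to the geometry, which would be passed through as is).

```shell
# summarize_image: hypothetical helper for illustration.
# Reads "identify -verbose" output on stdin and prints the geometry
# and resolution values on one line, separated by a space.
summarize_image() {
  awk '$1 == "Geometry:"   { geometry = $2 }
       $1 == "Resolution:" { resolution = $2 }
       END { print geometry, resolution }'
}
```

You'd use it as the last stage of a pipeline, for example: identify -verbose pvp.jpg | summarize_image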

Now imagine you are working on a Web site and want to ensure that no images on the site are greater than 72dpi, a standard screen resolution. Higher print-ready resolutions are rather pointless, because a 300dpi image will render the same on a screen as its lower-resolution brethren—it'll just load slower.

Here's one way you can identify images in a directory with incorrect resolutions:


#!/bin/sh
identify=/usr/bin/identify
# check images to ensure that they're all 72x72 resolution.
for filename
do
  # quote the filename so names with spaces survive; sed trims the
  # leading indentation that identify puts before "Resolution:"
  resolution=$($identify -verbose "$filename" | \
     grep -i "Resolution:" | grep -v 72x72 | sed 's/^ *//')
  if [ -n "$resolution" ] ; then
    echo "Warning: Image $filename has $resolution"
  fi
done
exit 0

When I run this on a directory of images on my own system, a set of JPEG format files on my http://www.AskDaveTaylor.com site, here's what I get:


$ checkres.sh *.jpg
Warning: Image auction-seller-img1.jpg has Resolution: 75x75
Warning: Image auction-seller-img2.jpg has Resolution: 75x75
Warning: Image browsing-the-photo-folder.jpg has Resolution: 96x96
Warning: Image brushed-metal.jpg has Resolution: 300x300
...

That's a surprise! I didn't realize that I had 300x300 and these other weird resolutions. An easy way to speed up my site, therefore, is to lower the resolution on these images to the standard 72dpi. This is something that also can be done with a call to a different ImageMagick utility, but let's tackle that in another article.
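
Before fixing anything, it can help to see how widespread each off-target resolution is. Here's a rough sketch that tallies the warnings by resolution. It assumes the "Warning: Image ... has Resolution: NxN" line format that checkres.sh prints, and tally_resolutions is a hypothetical helper name, not part of ImageMagick.

```shell
# tally_resolutions: hypothetical helper for illustration.
# Reads checkres.sh warnings on stdin and prints a count for each
# distinct resolution, most common first.
tally_resolutions() {
  # The resolution is always the last field of a warning line, so
  # filenames containing spaces don't confuse the count.
  awk '/^Warning: Image / { count[$NF]++ }
       END { for (r in count) print count[r], r }' | sort -rn
}
```

Run it as the tail end of the check: checkres.sh *.jpg | tally_resolutions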

Working with Image Size

Since I write a lot of scripts that harvest images or other content from sites and repurpose them for my own (generally private, not public-facing) use, I also find it is darn helpful in a shell script to be able to ascertain the size of an image I've just grabbed.

If you've guessed that identify is the key, you're right. In fact, given an image, this is an easy way to grab its height and width:


width=$(identify "$image" | cut -d' ' -f3 | cut -dx -f1)
height=$(identify "$image" | cut -d' ' -f3 | cut -dx -f2)

There's no need for verbose output, because the geometry of the image is included in the default output.
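
Calling identify twice per image is a bit wasteful. One alternative is to grab the geometry once and split it with the shell's own parameter expansion. This is a sketch under a couple of assumptions: split_geometry is a hypothetical helper name, and the geometry is expected in WIDTHxHEIGHT form, optionally followed by a +X+Y offset, which gets dropped.

```shell
# split_geometry: hypothetical helper for illustration.
# Takes a geometry string such as 970x311 and prints "WIDTH HEIGHT".
split_geometry() {
  geom=${1%%+*}                   # drop any trailing +X+Y offset
  echo "${geom%x*} ${geom#*x}"    # text before the x, then text after it
}

# Possible use with a single identify call:
#   geometry=$(identify "$image" | cut -d' ' -f3)
#   set -- $(split_geometry "$geometry")
#   width=$1
#   height=$2
```

Geometry is always reported width first, so the first value printed is the width and the second is the height.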

Now it's easy to produce higher-quality HTML, for example, by including images with their proper dimensions:


echo "<img src=$image height=$height width=$width>"

What's better is that Web browsers scale images automatically, so if you specify a height and width that differ from the actual dimensions (oops, sorry, "geometry") of the image, the browser resizes it to fit.

This means if I want to include the pvp.jpg image on an automatically generated page, but decide 970 pixels is just too wide, I can simply include it as:


<img src=pvp.jpg height=207 width=646>

and the browser—be it Chrome, Safari or even MS IE—will scale it appropriately.

Calculating the smaller size is straightforward with bc, another underappreciated Linux command. The entire sequence might look like this to scale the image to 66% of its original dimensions:


#!/bin/sh
identify=/usr/bin/identify
scale=0.666
image=$1   # add input validation code

# identify reports geometry as WIDTHxHEIGHT, so width is the first field
width=$($identify "$image" | cut -d' ' -f3 | cut -dx -f1)
height=$($identify "$image" | cut -d' ' -f3 | cut -dx -f2)
newwidth="$(echo "$width * $scale" | bc | cut -d. -f1)"
newheight="$(echo "$height * $scale" | bc | cut -d. -f1)"
echo "<img src=$image height=$newheight width=$newwidth>"
exit 0

In practical use:


$ scaledown.sh pvp.jpg
<img src=pvp.jpg height=207 width=646>

That's easy enough!
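
If bc isn't installed, the same arithmetic can be done with the shell's built-in integer math by expressing the scale in thousandths. Consider this a sketch rather than a drop-in replacement: scale_dim is a hypothetical helper name, and 666 here stands in for the 0.666 scale factor.

```shell
# scale_dim PIXELS PER_MILLE: hypothetical helper for illustration.
# Scales a pixel dimension using integer math only. The division
# truncates any fraction, much as cut -d. -f1 does with bc's output.
scale_dim() {
  echo $(( $1 * $2 / 1000 ))
}
```

For the 970x311 pvp.jpg image, scale_dim 970 666 gives 646 and scale_dim 311 666 gives 207, the same values the bc version produces.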

With some creativity, you can see how even just the identify command that's included with ImageMagick opens up a world of image file scripting possibilities, whether you're working with Web sites directly or simply seek to analyze directories of images for unusual values or settings.

I'll dig into some of the really slick editing and modification capabilities, including an easy way to add a so-called watermark to your graphics, along with ways you can automate fixing 300dpi resolution images or even scaling images, in an upcoming article.

As a final note, although I explain how you can take a large image and have it show up smaller on a Web page by using different values for height and width, it would be remiss of me not to mention that if you're going to use only the smaller size, it's smarter to resize the original image. It makes your page faster to load, less unneeded data is transferred and everything just generally is happier (including the search engines). Now you know.

______________________

Dave Taylor has been hacking shell scripts for over thirty years. Really. He's the author of the popular "Wicked Cool Shell Scripts" and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.
