python subprocess over shell

One of my common gripes is when people struggle with complicated shell scripts that would be much simpler in a scripting language like Python, Ruby or Perl. I used to abuse PHP for this, but saw the light.

If you’re talking about replacing shell scripts, all of these are pretty much equivalent, but I don’t really do Perl for no particular reason, I’m a big fan of Python for general purpose work just because of the rich module system, and at least “dozens of engineers” use it at work..

The reason we generally write shell scripts is because we want to execute a bunch of external processes in an automated way.

In this area you can whip something basic up fastest in shell, yes, but at some point you’re going to have to repay that technical debt if you need to get past a certain point.

Besides, it’s not like shell scripting always stays simple and easy… and the overhead of moving to a more powerful language isn’t that huge.

If I’m sure the scope of a script will be small, or I don’t have the option of moving to structured data format for input and output with external commands, or I don’t have to futz around with arrays, I’ll stick with bash.

But, if I want to work with dictionary objects or talk in a protocol like LDAP or model things as objects, or need complex handling and passing around of stdin/stdout/exit statuses, or know some module handles lots of edge cases for me, I’ll move to Python or Ruby. I quite like both, but feel that Python is more utilitarian, and simply due to whitespace enforcement and extensive linters is a good fit for code that may need to be picked up and understood quickly by a co-worker.

When it comes to getting started with Python, I still suggest Dive into Python for people. Just flipping through Chapters 1-3 equips you with an awful lot.

Anyway, some people think Python is hard, but it’s not really. I think of Python as being extremely utilitarian, which makes it a great fit for sysadmin work.

Someone posted on the MacEnterprise list a question about working out what PPD was in use by each printer on the system in OS X, and someone gave a good shell example using the usual suspects of for, grep and awk.

There’s nothing really wrong with doing this, but I’ve come to distrust parsing non-structured output if I need to keep this solution working across many multiple versions of operating systems, or even between OS X major versions given Apple’s history. One of the big advantages of moving to Python is being able to parse and manipulate property lists, which can get really painful in shell. You end up writing out lots of temp files or giving up on dealing with error conditions.

Anyway, so lets see what it’s like executing a command in Python with the subprocess module. We’re asking System Profiler for printer info, and telling it to return the output in XML plist format.


import subprocess

command = [‘system_profiler’, ‘-xml’, ‘SPPrintersDataType’]
task = subprocess.Popen(command,

(stdout, stderr) = task.communicate()
print stdout

To skim through those lines, we’re

  • setting the python shebang
  • importing the subprocess module
  • defining a list called ‘command’ to store the command we want to run
  • creating a subprocess Popen object to run our command called ‘task’
  • setting standard out and standard error to go to our own pipes
  • getting standard out and standard error from the task.
  • printing standard out

This really isn’t much work, and if you really only wanted this functionality, there are other convenience functions that you can use to make this even shorter, or write your own convenience function.

But we have all sorts of options now.

We can send standard input to the task:

(stdout, stderr) = task.communicate(stdin)

We can ask what the exit status is easily:

status = task.returncode

and if we get None we know the process hasn’t terminated yet.

And if we want to do something a bit more complicated we can search through the structured data plist output easily like:


import plistlib
import subprocess

command = [‘system_profiler’, ‘-xml’, ‘SPPrintersDataType’]
task = subprocess.Popen(command,

(stdout, stderr) = task.communicate()
printers = plistlib.readPlistFromString(stdout)
printers = printers[0][‘_items’]

for printer in printers:
  print ‘Name: ‘ + printer[‘_name’]
  print ‘PPD: ‘ + printer[‘ppd’]

This is easier to extend than the standard for/grep/awk/sed equivalents tend to be, and much easier for a co-worker to pick up and understand. It comes close to documenting itself, just needs some comments about the structure of Apple’s output, and some try/except blocks in case the output is malformed or does change.

3 thoughts on “python subprocess over shell

  1. for blind calls to commands (no communication needed), I actually have this snippet built into my TextMate boilerplate for Python sysadmin scripts:

    from subprocess import Popen, call, STDOUT, PIPE

    def sh(cmd):
    return Popen(cmd,shell=True,stdout=PIPE,stderr=PIPE).communicate()[0]

    so calling a shell command is as easy as:

    printer_xml = sh(‘system_profiler -xml SPPrintersDataType’)

  2. Yep. A custom function I often use is one that returns a tuple of (stdout, stderr, returncode) so you can use it however you want.

    I don’t use it often, mainly because I still do python 2.4 work, but the convenience function check_call that appeared in 2.5 is quite useful too:

    subprocess.check_call(*popenargs, **kwargs)
    Run command with arguments. Wait for command to complete. If the exit code was zero then return, otherwise raise CalledProcessError. The CalledProcessError object will have the return code in the returncode attribute.

    The arguments are the same as for the Popen constructor. Example:

    >>> subprocess.check_call(["ls", "-l"])
    New in version 2.5.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>