python subprocess over shell

One of my common gripes is when people struggle with complicated shell scripts that would be much simpler in a scripting language like Python, Ruby or Perl. I used to abuse PHP for this, but saw the light.

If you’re talking about replacing shell scripts, all of these are pretty much equivalent. I don’t really do Perl, for no particular reason. I’m a big fan of Python for general purpose work because of its rich module system, and at least “dozens of engineers” use it at work.

The reason we generally write shell scripts is because we want to execute a bunch of external processes in an automated way.

In this area you can whip something basic up fastest in shell, yes, but at some point you’re going to have to repay that technical debt if you need to get past a certain point.

Besides, it’s not like shell scripting always stays simple and easy… and the overhead of moving to a more powerful language isn’t that huge.

If I’m sure the scope of a script will be small, or I don’t have the option of moving to a structured data format for input and output with external commands, or I don’t have to futz around with arrays, I’ll stick with bash.

But, if I want to work with dictionary objects or talk in a protocol like LDAP or model things as objects, or need complex handling and passing around of stdin/stdout/exit statuses, or know some module handles lots of edge cases for me, I’ll move to Python or Ruby. I quite like both, but feel that Python is more utilitarian, and simply due to whitespace enforcement and extensive linters is a good fit for code that may need to be picked up and understood quickly by a co-worker.

When it comes to getting started with Python, I still suggest Dive into Python for people. Just flipping through Chapters 1-3 equips you with an awful lot.

Anyway, some people think Python is hard, but it’s not really. I think of Python as being extremely utilitarian, which makes it a great fit for sysadmin work.

Someone posted on the MacEnterprise list a question about working out what PPD was in use by each printer on the system in OS X, and someone gave a good shell example using the usual suspects of for, grep and awk.

There’s nothing really wrong with doing this, but I’ve come to distrust parsing non-structured output if I need to keep the solution working across multiple operating system versions, or even between OS X major versions given Apple’s history. One of the big advantages of moving to Python is being able to parse and manipulate property lists, which can get really painful in shell. You end up writing out lots of temp files or giving up on dealing with error conditions.

Anyway, so let’s see what it’s like executing a command in Python with the subprocess module. We’re asking System Profiler for printer info, and telling it to return the output in XML plist format.

#!/usr/bin/python

import subprocess

command = ['system_profiler', '-xml', 'SPPrintersDataType']
task = subprocess.Popen(command,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

(stdout, stderr) = task.communicate()
print stdout

To skim through those lines, we’re

  • setting the python shebang
  • importing the subprocess module
  • defining a list called ‘command’ to store the command we want to run
  • creating a subprocess Popen object to run our command called ‘task’
  • setting standard out and standard error to go to our own pipes
  • getting standard out and standard error from the task.
  • printing standard out

This really isn’t much work, and if you really only wanted this functionality, there are other convenience functions that you can use to make this even shorter, or write your own convenience function.
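As a rough sketch of the "convenience function" route, subprocess.check_output (available since Python 2.7) runs a command and hands back its standard output in one call. The wrapper name command_output and the stand-in command are my own choices for illustration, and the sketch is written for Python 3 rather than the Python 2 used in the script above.

```python
import subprocess
import sys


def command_output(command):
    """Run command (a list of arguments) and return its standard output.

    Raises subprocess.CalledProcessError if the exit status is non-zero.
    """
    return subprocess.check_output(command)


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. On a Mac you might pass
    # ['system_profiler', '-xml', 'SPPrintersDataType'] instead.
    print(command_output([sys.executable, "-c", "print('hello')"]))
```

The trade-off is that you lose separate access to standard error and the exit status, which is why the longer Popen form is still worth knowing.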

But we have all sorts of options now.

We can send standard input to the task (as long as we also pass stdin=subprocess.PIPE when creating the Popen object):

(stdout, stderr) = task.communicate(stdin)

We can ask what the exit status is easily:

status = task.returncode

and if we get None we know the process hasn’t terminated yet.
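Putting those two pieces together, a minimal sketch might look like the following. It is written for Python 3, the helper name run_with_input is hypothetical, and the child process is just a Python one-liner that upper-cases its input so the example runs on any machine.

```python
import subprocess
import sys


def run_with_input(command, data):
    """Feed data (bytes) to command's stdin; return (status, stdout, stderr)."""
    task = subprocess.Popen(command,
                            stdin=subprocess.PIPE,   # needed to send input
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    # task.returncode stays None until poll(), wait() or communicate()
    # has collected the exit status.
    stdout, stderr = task.communicate(data)
    return task.returncode, stdout, stderr


if __name__ == "__main__":
    # Stand-in child process: reads stdin and writes it back upper-cased.
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    status, out, err = run_with_input(child, b"ppd")
    print(status, out, err)
```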

And if we want to do something a bit more complicated we can search through the structured data plist output easily like:

#!/usr/bin/python

import plistlib
import subprocess

command = ['system_profiler', '-xml', 'SPPrintersDataType']
task = subprocess.Popen(command,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

(stdout, stderr) = task.communicate()
printers = plistlib.readPlistFromString(stdout)
printers = printers[0]['_items']

for printer in printers:
  print 'Name: ' + printer['_name']
  print 'PPD: ' + printer['ppd']
 

This is easier to extend than the standard for/grep/awk/sed equivalents tend to be, and much easier for a co-worker to pick up and understand. It comes close to documenting itself, just needs some comments about the structure of Apple’s output, and some try/except blocks in case the output is malformed or does change.
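Here is one way the try/except handling might look. Treat it as a sketch under assumptions: it is written for Python 3 (where plistlib.loads replaces readPlistFromString), the function name printer_ppds is made up, and the sample data is invented for the example rather than captured from a real system_profiler run. Only the _items, _name and ppd keys come from the script above.

```python
import plistlib
from xml.parsers.expat import ExpatError


def printer_ppds(plist_bytes):
    """Return a list of (name, ppd) tuples from system_profiler-style data.

    Returns an empty list if the data is not a plist or does not have the
    structure this sketch assumes: a list whose first item has '_items'.
    """
    try:
        report = plistlib.loads(plist_bytes)
        items = report[0]["_items"]
    except (ValueError, ExpatError, KeyError, IndexError, TypeError):
        # ValueError covers plistlib.InvalidFileException (not a plist),
        # ExpatError covers broken XML, the rest cover a changed layout.
        return []

    results = []
    for printer in items:
        # .get() keeps one printer with a missing key from aborting the loop.
        results.append((printer.get("_name", "unknown"),
                        printer.get("ppd", "unknown")))
    return results


if __name__ == "__main__":
    # Invented sample data, shaped like the structure the script above reads.
    sample = plistlib.dumps([{"_items": [
        {"_name": "Office", "ppd": "Generic PostScript Printer"},
        {"_name": "Lab"},
    ]}])
    for name, ppd in printer_ppds(sample):
        print("Name: " + name)
        print("PPD: " + ppd)
```

Whether to return an empty list or re-raise is a judgment call. For a reporting script, carrying on quietly is usually fine; for anything that changes state, failing loudly is safer.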

Seeking input for possible Mac IT Conference

The MacEnterprise steering committee has been talking about doing this
for way too long, but the recent lack of significant IT tracks at WWDC
has spurred us into action.

MacEnterprise is planning to partner with various other groups to get
a Mac IT focused conference started.

This is all very much up in the air, and at this stage we’re seeking
input as to how the community would like this conference to be
organized.

At this stage, we would like you to provide input on Google Moderator
as to ideas for the conference. You can submit ideas as well as vote
on other ideas here:

http://goo.gl/mod/4COQ

Additionally, there is some discussion going on on Twitter, under the
#MacITConference hashtag.

http://twitter.com/#search?q=%23MacITConference

Once we have a slightly better idea of the structure, we’ll be
calling for speakers and looking for sponsorship partners.

Profiling puppetmasterd with ruby-prof

So I’m not hugely happy with the CPU consumption of puppetmasterd under heavy load, and so I’ve been trying to work out where the bottlenecks lie.

Unfortunately I’ve yet to find a smoking gun, but here’s a reasonably simple way to produce profiles of puppetmasterd.

  • Install ruby-prof from gems.
  • Stop any apache/mongrel/nginx instances of puppetmasterd you may have running
  • Edit /usr/sbin/puppetmasterd and replace the last few lines with:
    require 'rubygems'
    require 'ruby-prof'

    result = RubyProf.profile do
      require 'puppet/application/puppetmasterd'
      Puppet::Application[:puppetmasterd].run
    end

    printer = RubyProf::GraphHtmlPrinter.new(result)

    File.open('/tmp/ruby-profile.html', 'w') do |file|
      printer.print(file, {:min_percent => 10, :print_file => true})
    end

  • Start a webrick puppetmasterd with --no-daemonize
  • Do a client run against it
  • Hit Ctrl-C to interrupt your puppetmasterd
  • wait for the html output to be generated

It’s worth setting the min_percent filter. Without it, I ended up with 300+MB HTML files with no images that took my dev server a long time to write to disk. With it, I end up with a couple of megs.

You can see an example output at:
http://www.explanatorygap.net/crap/ruby-profile.html

with the interesting thread being at:
http://www.explanatorygap.net/crap/ruby-profile.html#70121801903160

Interpretation suggestions welcome :)

Edit:

Brice had a great suggestion of using CallTreePrinter instead of GraphHtmlPrinter and analysing the output with kcachegrind (which is utterly amazing). Obviously your output file shouldn’t be html then…

I’ve put a CallTree output up here.

Finally… a sanctioned way of activating the screen saver.

Ever since I started managing Macs in a corporate environment, I’ve been annoyed that Apple has failed to offer a sanctioned way of locking the screen via a keyboard command. This is a reasonably common requirement in a lot of corporate deployments. Sure we can use hot corners etc, or we can use one of the sanctioned methods to activate the loginwindow via Fast User Switching, but the former isn’t for everyone, and the latter sucks because it will tear down userspace VPN/802.1x connections.

We have reasonable MCX controls to require a password for the screensaver, but nothing to actually activate it. There are a bunch of private API calls you can use to achieve this, but using private APIs makes me feel dirty.

When I first started poking at Automator again in 10.6, I was pleased to notice that we have a “Start Screen Saver” action. This means that we can save such an Automator workflow as a Service, and then assign a keyboard command to it such that we can activate it from any application.

Unfortunately this action is buggy. If you activate the workflow and wiggle the mouse around, you’ll get an error dialog after unlocking the screen.

Luckily we have another way of achieving the same goal. The System Events AppleScript dictionary contains the same functionality.


tell application "System Events"
start current screen saver
end tell

So you can simply create an Automator “Service” workflow and add a “Run AppleScript” action with the following code snippet.


on run {input, parameters}

tell app "System Events"
start current screen saver
end tell

return input
end run

Save it, and assign a hot key, and you can finally activate the screensaver from the keyboard.

Apple have documented the binary plist format

Thanks to Dave Dribin for pointing this out.

In http://opensource.apple.com/source/CF/CF-550/CFBinaryPList.c

So really there’s no reason why we can’t have plistlib etc for Ruby/Python/whatever deal with binary plists on non-Mac platforms.

/*
HEADER
    magic number ("bplist")
    file format version

OBJECT TABLE
    variable-sized objects

    Object Formats (marker byte followed by additional info in some cases)
    null    0000 0000
    bool    0000 1000                           // false
    bool    0000 1001                           // true
    fill    0000 1111                           // fill byte
    int     0001 nnnn   ...                     // # of bytes is 2^nnnn, big-endian bytes
    real    0010 nnnn   ...                     // # of bytes is 2^nnnn, big-endian bytes
    date    0011 0011   ...                     // 8 byte float follows, big-endian bytes
    data    0100 nnnn   [int] ...               // nnnn is number of bytes unless 1111 then int count follows, followed by bytes
    string  0101 nnnn   [int] ...               // ASCII string, nnnn is # of chars, else 1111 then int count, then bytes
    string  0110 nnnn   [int] ...               // Unicode string, nnnn is # of chars, else 1111 then int count, then big-endian 2-byte uint16_t
            0111 xxxx                           // unused
    uid     1000 nnnn   ...                     // nnnn+1 is # of bytes
            1001 xxxx                           // unused
    array   1010 nnnn   [int] objref*           // nnnn is count, unless '1111', then int count follows
            1011 xxxx                           // unused
    set     1100 nnnn   [int] objref*           // nnnn is count, unless '1111', then int count follows
    dict    1101 nnnn   [int] keyref* objref*   // nnnn is count, unless '1111', then int count follows
            1110 xxxx                           // unused
            1111 xxxx                           // unused

OFFSET TABLE
    list of ints, byte size of which is given in trailer
    -- these are the byte offsets into the file
    -- number of these is in the trailer

TRAILER
    byte size of offset ints in offset table
    byte size of object refs in arrays and dicts
    number of offsets in offset table (also is number of objects)
    element # in offset table which is top level object
    offset table offset

*/
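To give a feel for how little code a reader needs, here is a rough Python 3 sketch that finds the top level object in a binary plist using only the layout above. One caveat: the comment doesn’t give the trailer’s field widths, so the 32-byte layout used here (6 bytes of padding, two 1-byte sizes, three big-endian 64-bit integers) is an assumption taken from the trailer struct elsewhere in the same source file. The function names are mine, and plistlib is only used to produce a test file.

```python
import plistlib
import struct

# High nibble of the marker byte -> object type, per the table above.
MARKER_TYPES = {
    0x1: "int", 0x2: "real", 0x3: "date", 0x4: "data",
    0x5: "ascii string", 0x6: "unicode string", 0x8: "uid",
    0xA: "array", 0xC: "set", 0xD: "dict",
}
# Marker bytes that are complete objects on their own.
SINGLETONS = {0x00: "null", 0x08: "false", 0x09: "true", 0x0F: "fill"}


def describe_marker(byte):
    """Return (type name, low nibble) for a marker byte."""
    if byte in SINGLETONS:
        return SINGLETONS[byte], byte & 0x0F
    return MARKER_TYPES.get(byte >> 4, "unused"), byte & 0x0F


def read_trailer(blob):
    """Parse the trailer from the last 32 bytes of a binary plist."""
    if blob[:6] != b"bplist":
        raise ValueError("not a binary plist")
    # Assumed layout: 6 padding bytes, offset int size, object ref size,
    # then object count, top object index and offset table offset.
    fields = struct.unpack(">6xBBQQQ", blob[-32:])
    names = ("offset_size", "ref_size", "num_objects",
             "top_object", "table_offset")
    return dict(zip(names, fields))


def read_offsets(blob, trailer):
    """Return the byte offset of every object, read from the offset table."""
    size = trailer["offset_size"]
    start = trailer["table_offset"]
    return [int.from_bytes(blob[start + i * size:start + (i + 1) * size], "big")
            for i in range(trailer["num_objects"])]


if __name__ == "__main__":
    blob = plistlib.dumps({"ppd": "Generic"}, fmt=plistlib.FMT_BINARY)
    trailer = read_trailer(blob)
    offsets = read_offsets(blob, trailer)
    top = blob[offsets[trailer["top_object"]]]
    print(trailer)
    print(describe_marker(top))
```

A real parser would go on to decode each object type and follow the objrefs, but the header, offset table and trailer are the only parts needed to navigate the file.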
 

Puppet 0.25.1 debs done… but delayed.

We’ve uploaded the 0.25.1 debs, but due to this work, it might take a little while before they appear.
http://blog.ganneff.de/blog/2009/10/27/debian-ftpmaster-meeting.html

It will appear here when done.
http://packages.debian.org/sid/puppet

Instructions for building it yourself:

$ git clone git://git.debian.org/pkg-puppet/puppet.git
$ cd puppet
$ git-buildpackage --git-upstream-branch=origin/upstream

Greg Neagle on Adobe Enterprise Toolkit/Munki/Puppet

If you’re a Mac IT person, and you don’t know about Greg Neagle’s Managing OS X blog, you need to fix that situation now.

One of the reasons Greg is so awesome in our field is that he’s eminently pragmatic, with enough hacker mentality to make sure he simply gets the job done with a minimum of fuss. His recent post on the trials and tribulations of working with the Adobe Enterprise Deployment Kit is a great example.

Not only is he trying to come up with something flexible enough to actually use efficiently, he’s dug into the innards and explained exactly what’s going on.

I talked to a few people at Puppet Camp last week about large scale Mac management, and everyone seemed really excited about the Munki Project, which is all Greg’s work so far. Basically the idea is to provide OS X with an actual repository for package management, using native Mac packages, and attempting to reuse vendor packages as much as is feasible.

If no-one else does it, I’ll end up putting together a munki type and provider for Puppet. I’m really looking forward to being able to simply do stuff like:

package { "iWork":
  ensure => latest,
}

just like other operating systems, letting the repository handle dependencies. The way it should be….

This really could be one of the most important community contributions to large scale Mac management in the history of OS X in my opinion.

Facter 1.5.7 MacPorts update submitted

I’ve submitted a diff to update facter in MacPorts to 1.5.7, so it should be available soon.

Note that I’ve set the maintainer for both Puppet and Facter in MacPorts to ‘openmaintainer’. This means that I accept patches from anyone, and it’s really quite trivial to update either of them, as is the case with the vast majority of Portfiles.

The process goes something like:

$ sudo port selfupdate (to get the newest versions)
$ mkdir /tmp/facter
$ cd /tmp/facter
$ cp /opt/local/var/macports/sources/rsync.macports.org/release/ports/sysutils/facter/Portfile .
$ cp Portfile Portfile.orig
(edit the port file to change version from 1.5.6 to 1.5.7)
$ port -v checksum (this will print out the expected and obtained checksums. Use this info to update the ‘checksums’ component of the Portfile)
$ port -v checksum (this should return happily now)
$ sudo port -v install (verify that the port is installed correctly)
$ diff -u Portfile.orig Portfile > Portfile-facter.diff (submit an update ticket on the MacPorts Trac site with the diff attached)

The complexity debt

This has been flowing all over the #puppetcamp twitter tag, but it’s worth repeating.

“Think of the complexity in your environment as a form of technical debt that you’re going to have to pay down” – Paul Nasrat

This is so awesomely pithy you just know he’s a bloody Pom.

(England 3/83 in the Champions Trophy semi-final as of right now…)

At Puppet Camp

Puppet Camp is on today and tomorrow.

It’s already exciting being in a room full of involved sysadmins who are concerned with making our jobs better and thinking about where our field will be in the next few years…

It’s always good to put faces to IRC handles too :)

Already had a great talk from Ohad Levy on The Foreman and his infrastructure. I’m excited about The Foreman, even if we don’t end up using it at Google.