If you aren’t using automation to get you to a place where you can efficiently apply intelligence, you’re almost certainly wasting time and effort. Michael Bolton put it this way on Twitter last week:

 I like … using automation as a taxi to drop me off at the site of the real work.

When we moved offices a couple of months ago, Marketing updated the contact details in their collateral and asked us to review the changed PDFs. Simple approaches to this task would include reading all of the documents from top to bottom, skimming them for addresses to check, or using a PDF reader’s search function on each document to look for likely mistakes. All of these would have been slow, error-prone and hard to make more efficient.

What we actually did was automate, using command line tools to do the heavy lifting. First, we transformed the PDF files into a text format in which they could easily be searched as a set:

 $ find . -name '*.pdf' -exec pdftotext {} \;

Then we identified search terms that would signal the presence of outdated contact details. Some examples include the old postcode, building name and phone number:

 $ grep "0WS" *.txt
 $ grep "St.*John" *.txt
 $ grep 421360 *.txt

It’s important to understand the limitations of your tools and techniques. In this case we didn’t search for long strings like “St. John’s Innovation Centre”: a string that long could be broken across a line in the extracted text, and an exact match would also miss variants such as “Johns” without the apostrophe. We could have added more complexity to account for that, but for this task it didn’t seem worth the effort.
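Had it been worth the effort, one way to add that complexity would be to flatten newlines before searching and make the apostrophe optional in the pattern. A minimal sketch, using a hypothetical doc.txt standing in for one extracted file:

```shell
# Hypothetical extracted text where the address breaks over a line
# and the apostrophe has been dropped
printf "St.\nJohns Innovation Centre\n" > doc.txt

# Flatten newlines to spaces, then match with an optional apostrophe
tr '\n' ' ' < doc.txt | grep -E -o "St\.? ?John'?s"
```

The trade-off is plain to see: the pattern is harder to read and to trust than the short, obvious terms we actually used.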

This approach took minutes to implement and turned up a bunch of errors, including this one, a combination of the new and old postcodes:

 Linguamatics 324 Cambridge Innovation Centre Cowley Cambridge CB4 0WG 0WS

Arguably this is only semi-automation, because we drove each step by hand. Even if so, that just shows another strength of automation – you can use it for investigation and exploration. If we think of a new search term we can try it cheaply and get immediate data back on its effectiveness. Because each attempt is cheap we can try many search terms and, once we’ve got a good set, put them all into a short script and re-use it in the next round of testing. The shell’s history command is a handy way to review what you’ve tried.
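Such a script can be very small indeed. A minimal sketch, with hypothetical sample files standing in for the pdftotext output (the terms are the ones from this round; substitute your own):

```shell
# Hypothetical sample files standing in for the extracted PDFs
printf 'Call 421360 for details\n' > old-flyer.txt
printf 'All details up to date\n' > new-flyer.txt

# Bundle the good search terms into one reusable loop;
# grep -l lists just the files that match each term
for term in '0WS' 'St.*John' '421360'; do
  echo "== $term =="
  grep -l "$term" *.txt
done
```

Each round of testing is then a single command, and adding a newly discovered term is a one-line change.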

In fact, we’d previously set up a simple script to chain together tests like this as a gate on generating our user documentation. Our source is HTML and, before it’s built into the product, the doc team run a script that checks for well-formedness, broken links, spelling, a bunch of common typos and so on.
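A sketch of how such a gate might look, assuming xmllint for the well-formedness check and a hand-picked typo list (our real script differs in detail):

```shell
# docs_gate: run pre-build checks over the HTML source; a non-zero
# return means the build should stop. Tool choice and typo list here
# are illustrative assumptions, not our exact script.
docs_gate() {
  status=0
  for f in *.html; do
    [ -e "$f" ] || continue
    # Well-formedness, if xmllint is available (assumes the source
    # is strict enough to parse as XML)
    if command -v xmllint >/dev/null 2>&1; then
      xmllint --noout "$f" || status=1
    fi
    # Flag a few common typos; -Hn shows file and line for fixing
    grep -Hn -E 'teh |recieve|seperate' "$f" && status=1
  done
  return $status
}

# Example run against a file with a deliberate typo
printf '<p>We recieve mail at the new address.</p>\n' > sample.html
docs_gate || echo "gate failed: fix the docs before building"
```

Wiring the function into the build means the checks run every time, with no one having to remember them.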

This is cheap and cheerful automation that can become part of your regular tool arsenal, freeing you up to think about how to apply your testing brain, not how to keep track of the donkey work you’ve done in case you need to redo or change it.
Image: http://flic.kr/p/7FDvu2