One particular client wanted to automate as much of their testing as possible. I thought of a metaphor to help describe how trying to automate almost all “regression testing” is a difficult path to success. There are benefits and weaknesses to automation that need to be considered in addition to the costs.

Consider your software to be a country and the people in your country are your users. The inhabitants drive around the country and experience the different aspects of each geographical region (use the software in different ways). As they drive they will encounter potholes in the roads. Some potholes will be so large that driving down that particular street might cause damage to the vehicle or might make the street impassible. Other potholes are small and just inconvenient because you have to drive slower or pay more attention to avoid them. There are also some potholes that don’t look too big or nasty when you hit them but they can cause damage to your car that you are not aware of at the time.

How would you find and fix potholes in the roads? Everybody wants streets with no potholes (bug free software) – but road crews realize that is an impossible goal given their budget and time constraints. They need to focus on finding and fixing the most dangerous and disruptive potholes.

One solution might be for the road crew to send droids to drive along roads and report back any potholes that they detect (automated scripts).

Wow, that sounds great! What could go wrong?


Problem 1 – There are more kinds of problems than automation can be programmed to recognize

Droids may miss reporting a pothole because that pothole looks different to what it was specifically programmed to search for. Plus people are reporting other road problems that the droids miss because they cannot be programmed to detect every possible problem. Problems like missing barricades (security checking, input constraints), problems with road signs (help screens), and narrow lanes causing slow and unclear navigation (usability).

Problem 2 – Investigating reported failures takes a long time

Droids will sometimes report potholes that aren’t really potholes. False positives might be caused because there is a problem with the droid, the road layout has changed, or there might be a puddle or some garbage in the street. All of the pothole reports from the droids need to be investigated to validate that they are reporting a legitimate pothole which takes quite a lot of time. The road with the reported pothole needs to be revisited by a tester to determine if the report is legitimate or not. If there are a lot of false positives it may take longer to investigate the failures than it would have taken to have a person perform testing instead.

Problem 3 – There are more checks to automate than can possibly be written

The road crews have decided to create a fleet of automated drones that drive along as many streets as possible. The road crew started with droids checking all the major highways. Next they want to start working on other major roads. Every time a significant pothole is reported in a road that is not covered by a droid a new droid is programmed to check that road in all future road scans. A full scan may start out taking a day, but soon it takes weeks. And the goal of full road coverage always seems to recede over the horizon: the road crew thought that if they could cover the major roads then they have covered the entire country, but there are many times more miles of secondary roads that are equally important to some drivers, and even more prone to potholes.

Problem 4 – Automation is expensive to build and maintain

The crew finds that they are so busy maintaining the droids checking the highways and investigating pothole reports that they don’t have time to create new droids. To continue to grow the droid network they need more people and more money. The cost of this approach is so large that the road crews have spent far more than their allocated budget already. They have asked for more funding and are waiting for approval. They hope to get the extra funding as they have invested so much already it would be a shame to stop now, activating the sunk-cost fallacy.

Problem 5 – Some things are too difficult to automate effectively

There are things that humans are really good at such as investigating, observing, and detecting something is wrong. It is impossible to automate investigation. After the investigation has been performed, it is possible to create a check that will redo the steps that cover one implementation of the original intent of the investigation. Attempts to automate difficult algorithms will often cost more to write and maintain than they would otherwise take to execute manually over the life of the product.


Attempting complete automated coverage costs huge amounts of money and time without finding all the problems


A Strategic Mixed Approach

Perhaps a better solution would be to send drones to drive along the most important roads and report any potholes that they see, and also deploy actual people to drive around and look for potholes. These people could also use other tools to discover potholes like aerial photos, previous pothole reports and reports of areas where recent roadwork has occurred. Many factors could influence the selection of which roads to inspect and by what means The people who pay the highest taxes may get their roads checked, along with the most heavily used roads, roads that are frequently used by emergency vehicles, and so on. In other words, the people and droids are deployed strategically and not uniformly across all roads.



So, I don’t recommend automating all the checks that you can. Focus your automation to key areas and spend time every release actually testing your product. By running only the same automated checks over and over again you will never find different problems that already exist in your software; such as problems that exist in code paths not checked by the scripts as well as problems in checked paths that are missed by the automation. When testers test the software they will explore different paths and observe far more than a script ever can.

When planning your automation coverage, when should you consider automating a check?

  • When the cost is low (including on-going maintenance costs)
  • When the item being checked in important enough
  • To act as a benchmark for future tests
  • To reproduce a failure (especially intermittent problems)
  • To see if a failure has not returned
  • When the check is difficult for a person to perform (high volume, critical timing, etc.)
  • When there is a clear set of established rules that the software must follow (e.g.: communication protocols)
  • If the check will need to be executed often in the future (regression)
  • If you want to compare different platforms (although this can present new issues with automation)