Those of you who read this blog know I’ve probably spent tens, if not hundreds, of thousands of words discussing the applicability of hard metrics to the management of software development.

You likely know that I’m not keen on it.

Yet I struggle to make the point sharply and quickly. Cem Kaner wrote something on metrics today that summed it all up in a hundred words or so:

Capers Jones sometimes talks disparagingly about the (claimed) fact that 95% of American software companies have no metrics program. On the surface, this sounds terrible. But what I saw as a consultant was that most software companies have tried a measurement program, or have executives with lots of experience with metrics programs in other companies. The problem is that their experiences were bad. The measurement programs failed. Robert Austin wrote a terrific book, Measuring & Managing Performance in Organizations. When you start measuring something that people do, people will change their behavior to make their scores better. People will change what they do to get better scores on the measurements—that’s what they’re supposed to do. But they don’t necessarily change in ways that improve what you want to improve. Often, the changes make things worse instead of better (a problem commonly called “measurement dysfunction.”) This problem happens more often, and worse, if you use weak, unvalidated metrics. I keep meeting software consultants, especially software process consultants, who say that it’s better to use bad measurements than no measurement at all. I think that’s a prescription for disaster, and that it’s no wonder that so many software executives refuse to harm their businesses in this way.

I thought it was brilliant.

If you want more, you can read the source that quote comes from – part I of a series of interviews with Cem on

Or come to the Conference of the Association for Software Testing – CAST 2010 – next week in Grand Rapids, Michigan, where Cem is giving a keynote-level speech.

UPDATE: I’ve been thinking about it, and it certainly depends on how you define your metrics. For example, one kind of measurement that I am in favor of is the “slip chart.” In other words, you look at every deadline the team has committed to and how late they were, and you figure the rough percentage by which they are /always/ late. With that, you can predict when you will really be done. The folks in the extreme programming community codify this into story points and burnup charts, and that’s fine. I don’t have a problem with these used as approximations, as first-order measurements. The problem comes when they are reduced to 100-word silver bullets without the context of how they can be used well.
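
For the curious, the slip-chart arithmetic can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: the function names (`average_slip_ratio`, `predict_finish`), the choice of a plain average, and the dates are all made up for the example, not taken from any real project or tool.

```python
from datetime import date, timedelta


def average_slip_ratio(history):
    """Average of (actual duration / planned duration) over past deadlines.

    `history` is a list of (start, committed, actual) dates. A ratio of
    1.5 means the work took 50% longer than promised. Assumes every
    planned duration is at least one day.
    """
    ratios = []
    for start, committed, actual in history:
        planned_days = (committed - start).days
        actual_days = (actual - start).days
        ratios.append(actual_days / planned_days)
    return sum(ratios) / len(ratios)


def predict_finish(start, committed, slip_ratio):
    """Stretch the planned duration by the historical slip ratio."""
    planned_days = (committed - start).days
    return start + timedelta(days=round(planned_days * slip_ratio))


# Hypothetical history: three past deadlines and when they really landed.
history = [
    (date(2010, 1, 4), date(2010, 2, 1), date(2010, 2, 15)),   # 28 days planned, 42 actual
    (date(2010, 3, 1), date(2010, 3, 21), date(2010, 3, 26)),  # 20 days planned, 25 actual
    (date(2010, 4, 5), date(2010, 5, 15), date(2010, 6, 4)),   # 40 days planned, 60 actual
]

ratio = average_slip_ratio(history)
# The team now promises August 10 for work starting July 5.
likely = predict_finish(date(2010, 7, 5), date(2010, 8, 10), ratio)
print(f"Historical slip ratio: {ratio:.2f}")
print(f"Committed: 2010-08-10, more likely: {likely}")
```

Treat the result as exactly the kind of first-order approximation described above: a plain average hides how widely the slips vary, and three data points are not much to go on.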

So I wouldn’t say I’m totally opposed to metrics; they can be part of a balanced breakfast. I’m just leery of the common ideology of measurement by numbers alone, without context.