This is a prettied-up version of the notes I based my CEWT #3 talk on.
Explore It! by Elisabeth Hendrickson is a classic book on exploratory testing that we read – and enjoyed – in the Test Team book club at Linguamatics a few months ago. Intriguingly, to me, although the core focus of the book is exploration, I found myself over and again drawn back to a definition given early on (p.6):
Tested = Checked + Explored
where, to elaborate (p.5):
Checking [is testing] that you design in advance to check that the implementation behaves as intended under supported configurations and conditions.
Exploratory Testing [is] simultaneously designing and executing tests to learn about the system, using your insights from the last experiment to inform the next.
And both of these aspects are necessary for testing to have been performed (p.4-5):
… you need a test strategy that answers two core questions:
1. Does the software behave as intended under the conditions it’s supposed to be able to handle?
2. Are there any other risks?
Which doesn’t sound particularly controversial, does it, at first blush: that testing should involve checking against what we know and exploring for other risks? So why did I find myself worrying away at it? Here are a few thoughts:
 The definition of testing is cast in terms of a mathematical formula. So I could wonder what ranges of values those variables could take, and their types, and what units (if any) they are measured in, and what kind of algebra is being used.
Perhaps it’s something like numerical addition, and so Checked and Explored represent some value associated with their corresponding activity – bug counts or coverage or something; or maybe Checked and Explored are more akin to sets or multisets and I should interpret “+” as a kind of union operator; or, alternatively, perhaps Checked and Explored are simply Boolean values and “+” is something like an AND operation, which makes Tested only true if both Checked and Explored are true.
 I wonder whether Checked and Explored can overlap. Can one kind of action qualify as both checking and exploration? Can the same instance of an action qualify as both?
 Choosing the past tense to express a definition of Tested appears to wrap up two things: testing and the completion of testing.
 My intuition about what I’m prepared to regard as testing is itself tested by a statement like (p.5):
Neither checking nor exploring is sufficient on its own.
For instance, I can imagine circumstances in which I’d accept that the testing for some project would consist only of “checking” via unit tests. And I might do that on the basis of a high-level risk analysis of this project vs other projects in some product, timescale, the kind of application, the expected usage, resource availability and so on.
 Further, there’s also the suggestion that exploratory testing alone is sufficient for testing. In this exchange, Elisabeth is helping a tester to see what their exploration of requirements is (p.121):
“Huh. Sounds like testing,” I said. I waited to hear his response … His face brightened. “Oh!” he exclaimed. “I get it. I hadn’t thought of it that way before. I am testing the requirements …”
No checking is mentioned although, of course, it’s possible to imagine pro-forma constraints that could be checked, for example that any document is in a language understandable by all parties that need to understand it, or that it has a particular nominated format, or that it’s not written in invisible ink.
 There’s another (implicit, non-formal) definition of testing on page 3:
interact with the software or system, observe its actual behavior, and compare that to your expectations
which appears to tie testing to software and software systems (such as firmware, infrastructure, collections of software components, say) unless, perhaps, the “system” here can be interpreted as something much more general such as processes. And if exploring requirements can be testing then “system” probably is more general. But, if so, what might be meant by observing the “actual behaviour” of a functional specification or a bunch of Given, When, Thens?
 There are other well-known binary divisions of testing, which overlap in terminology with Explore It! and each other: Michael Bolton and James Bach have testing and checking while Paul Gerrard prefers exploring and testing. There’s some discussion of the commonality between Bolton/Bach and Gerrard in this Twitter thread and Paul’s blog, and I wondered how Elisabeth’s variant fitted into that kind of space.
“Picky, picky, picky,” you may be thinking, “picky, pick, pick, pick …”
Perhaps. But please don’t get the idea that this essay is about pulling apart, or kicking, or rejecting Explore It! on this basis, because it’s not. I’ve got a lot of time for the book and the definitional stuff I’ve referred to all occurs at the beginning, in the set up, with a light touch, to motivate a description of what the book is and is not about. The depth in the book is – intentionally – in the description and discussion and examples of exploration and how it can be utilised in software development generally and testing specifically.
Which is perhaps a little ironic, because thinking about the definition turned into this essay, which is an exploration of what testing means for me. The starting point was the observation that I kept on returning to my model of the Explore It! view of testing even though the core focus of the book is exploration.
Instinctively, I am happy to regard the thoughts that lead to the kinds of questions I’ve listed above as testing: I am testing the definition of testing in the book using my testing skills to find and interpret evidence, skills such as: reading, re-reading, skim reading, searching, cataloguing, note-taking, cross-referencing, grouping related terms, looking for consistency and inconsistency within and outside the book, comparing to my own intuition, reflecting on the reaction of my colleagues in the book club when I said that I’d been distracted by the definition, filtering, factoring, staying alert, experimenting, evaluating, scepticism, critical thinking, lateral thinking, …
I feel like I was performing actions which are at least consistent with testing activities:
- I criticised the definition,
- I challenged my model of the definition,
- I analysed Elisabeth’s answers,
- I reflected on the way I asked questions,
- I wondered at why I cared about the definition,
- I sought justification for all the above.
But I felt that the definition I was thinking about wouldn’t classify what I was doing as testing.
Amongst the perennial testing problems (and Explore It!, as you would expect, talks about many of them) are these: which direction to follow and when to stop. In my case I decided, after my initial analysis, that I wanted to continue and that I’d do so by contacting Elisabeth herself and asking her questions about the definitions, her use of them in the book and her views on them today. And she graciously agreed to that.
With additional evidence from our conversation I was able to filter out some of the uncertainties I had and to refine my model of the system under test, if you will. I could then ask further questions and refine my model still further. I could then switch and ask questions of my model: test the model. If I thought that some section of the model was underdeveloped because it didn’t stand up to questioning, then I could, and did, go back to the book and re-read sections of it, again adding data to my model. And I could correlate data from any of these sources, I could choose what approach to take next based on what I’d just discovered, and so on.
These things are all completely analogous (for me) to the exploration of some application when software testing. And I can take exactly this kind of approach when I’m reading requirements, when talking to the product owner, when reviewing proposals for new internal systems, when looking at my own ideas about how we might organise our team, when I’m thinking about talks or blog posts that I’m writing, … More or less anything can have this kind of exploratory analysis applied to it, I think.
It can even, to take it to another – meta – level, be applied to the analysis itself: testing the testing. For example, when I was talking to Elisabeth I wanted to review what I said, how Elisabeth responded, how I interpreted her responses and so on, to understand whether:
- I felt like I was getting my point across clearly; and, if not, then whether I could find another way such as giving examples or reframing or using different words.
- I could see that the answers helped to resolve some point of uncertainty for me; and, if not, wondering why not: perhaps it was my setup (the context in which I placed the question) or some misapprehension in my model.
- I was refining my model with any given experiment; and, if not, then asking whether the line of investigation was valid or worthwhile.
Further, when I was testing the definition of testing in Explore It! I was curious about why I cared, and so tried to understand that. How? By testing it! A couple of questions recurred and required strong answers from me:
- What value do I see in this exercise, and who for?
- Am I reading too much into the definitions, building ideas on something that isn’t intended to bear deep analysis?
And the techniques that I choose to use for these kinds of analyses are themselves testing techniques such as questioning, review, idea generation, comparison, critical thinking, and so on. Not everything that I am applying my testing to necessarily exhibits “behaviour” (as in Elisabeth’s second definition) but it can yield information in response to experiments against it.
At some point, I cast around for other views of testing. It’s not uncommon to view testing as a recursive activity. In his keynote at EuroSTAR 2015 Rikard Edgren said this beautiful thing:
Testing is simple: you understand what is important and then you test it.
Adam Knight has a couple of really nice blogs on fractal exploratory testing, and presented a talk on it at Linguamatics recently too. He argues that, in exploratory testing, each exploration uncovers points which can themselves be explored, and so on, down and down and down, with the same techniques applicable in the same kinds of ways at each step:
as each flaw .. is discovered … [a] mini exploration will result in a more targeted testing exploration around this feature area
I enjoy this insight. And I have a lot of sympathy for that view of a possible traversal of a testing space. I feel like I follow that pattern while I’m testing. But I also feel that that kind of self-similar structure applies in other ways than simply increasing resolution at each step. For me, testing can be done across, and around, and inside and outside, and above and below, and at meta levels of a system.
Which you might be happy enough to accept. But I also think that these different dimensions of testing can be taking place in different planes, looking in different directions, seeking different goals, and often many of these at the same time. Imagine this scenario, where I’ve been asked to test some product feature or other, and I have access to the product owner (PO):
- While I am putting my questions to the PO, and getting her answers, I am interpreting her response and feeding that data to my model of the system we’re testing, and so asking questions of the model.
- But I’m also wondering whether I could have got more helpful answers to my questions if I’d phrased them a different way and evaluating whether or not I should risk upsetting her now by re-asking, or wait for another opportunity to practise a different question format.
- I take a high-level view to find out what the stakeholders want from the testing I’m doing, which makes me question whether what I’ve done returns that value, which in turn makes me question what I’m doing.
- I want to find out which stakeholders have useful information about which aspects of the feature. By talking to them I begin to understand where they are reliable, their degrees of uncertainty, the clarity of their vision. While I’m doing this I’m testing their ability to express their opinion, and I’m feeding that into my model by adding uncertainties.
- At the same time, I’m running an ad hoc experiment against the system based on the PO’s data and noticing out of the corner of my eye that some of the text on the dialog we need to use is misaligned, and I recall that there have been similar examples in the past on other dialogs and so I shift my model of that problem into focus.
- As I start thinking about it, I check myself and realise that I’ve missed what the PO is saying.
- I review that decision and curse myself.
- And then I observe something that seems at odds with what the PO is saying. It could be that the software is wrong, or it could be that my model of the PO’s view of the world is wrong, or the PO’s view of the world or the PO’s expression of their view of the world, or something else. I frame another experiment – perhaps more questions – to try to isolate whether and where there’s an issue.
And so on and so on and so on. Sometimes multiple activities feed into another. Sometimes one activity feeds into multiple others. Activities can run in parallel, overlap, be serial. A single activity can have multiple intended or accidental outcomes, … By the definition in Explore It! as I interpret it, only some parts of this are testing. But, for me, it’s just testing: all the way down, and the other directions.
I started off by being curious about a definition and then about my own curiosity and then about the value of either of those things. That led to some interesting thoughts and a very enjoyable exchange and some introspection, and indeed to this essay. But, when analysing something, a natural question to ask can be: well, what are the alternatives?
It might not always be regarded as within a tester’s remit to come up with alternatives but, as a testing tool, finding or generating alternatives is very useful. Perhaps unsurprisingly there are numerous alternatives available and Arborosa’s blog post What is Testing? lists many. Taking inspiration from Michael Bolton’s training session at Linguamatics I tried to create one that reflected testing for me, and this is what I came up with:
Testing is the pursuit of actual or potential incongruity.
There is no specific technique; it is not limited to the software; it doesn’t have to be linear; there don’t need to be requirements or expectations; the same actions can contribute to multiple paths of investigation at the same time; it can apply at many levels and those levels can be distinct or overlapping in space and time.
That’s my idea so far. Feel free to test it.
Particular thanks to Elisabeth Hendrickson for being open to my questions.