To borrow a favorite line from my daughters…
“What does snow turn into when it melts? It becomes spring!”
It’s a line from one of their all-time favorite anime series (Fruits Basket), and it’s a cute ring of optimism that points to warmer days, longer daylight and, at least for me, the start of the presenting and writing season.
Over the past few years, I’ve enjoyed a three times per year experience of a conference in spring, CAST in the summer, and a conference a little farther afield in the fall. I’ve been invited to both Sweden and Ireland in the past year, and I’m keeping my fingers crossed that I might have an opportunity to present in Germany in the fall of 2015. Right now, though, that’s up in the air, but I can hope and dream.
Today, though, brings me to slightly cloudy but very pleasant San Diego for the regular sessions of STPCON Spring 2015. Monday and Tuesday were special sessions and tutorials, and due to work commitments, I couldn’t get here until last night. A group of us were gathered around the fire pit outside in the later evening, though, talking about our respective realities, and that familiar “conferring before conferring” experience that we all know and love.
A great surprise this morning was to see my friend Mike Ngo, a tester who worked with me a decade ago at Konami Digital Entertainment. Many of us have stayed in touch with each other over the years; we jokingly refer to ourselves as being part of the “ExKonamicated”. He told me he came with a number of co-workers, and that several of them would be attending my talk this morning (oh gee, no pressure there 😉 ).
Mike Lyles started off the program today by welcoming testers and attendees coming from far and wide (apparently Sweden sent the most people, with 12 attendees), and many companies sending contingents (Mike’s team from Yammer being a good example). The food was good this morning, the company has been great, and as always, it’s such great fun to see people that I really only get to see online or at these events. By the way, a quick plug for Fall 2015 STPCON. It will be on a cruise ship (“Hey Matt, it’s happened. They’re having a testing conference… on a boat ;)… HA HA, April Fools!!!” Ah well, therein lies the danger of presenting on April 1st 😉 ).
Bob Galen starts off the day with a keynote address defining and describing the three pillars of Agile Quality and Testing. As many of us can attest (and did so, loudly), there are challenges with the way Agile Development and Agile Testing take place in the real world. Interestingly enough, one of the summing points is that we have a lot of emphasis on “testing”, but that we have lost the focus on actual “quality”. In some ways, the fetish with automated testing and test driven development (which admittedly, isn’t really testing in the first place, but the wording and the language tend to make us think it is) has made for “lots of testing, but really no significant differences in the perceived quality of the applications as they make their way out in the world”. We have Selenium and Robot Framework and Gherkin, oh my, but could these same teams write anything that resembled a genuine Acceptance Test that really informed the quality of the product? So many of these tests provide coverage, but what are they actually covering?
The biggest issue with all of our testing is that we are missing the real “value perspective”. Where are our conversations leading? What does our customer want? What are we actually addressing with our changes? Story writing is connected to the automation, and that’s not essentially a bad thing, but it needs to be a means to an end, not the end in and of itself.
The three pillars that Bob considers to be the core of successfully balanced Agile practices are Development & Test Automation, Software Testing, and Cross Functional Team Practices. Agile teams do well on the first pillar, are lacking in the second, and in many ways are really missing the third. The fetish of minimizing needless and wasteful documentation has caused some organizations to jettison documentation entirely. I frequently have to ask “how has that worked for you?”. I’m all for having conversations rather than voluminous documentation, but the problem arises when we remove the documentation and still don’t have the conversations. Every organization deals with this in some way when they make an Agile transition.
Beyond the three pillars, there’s also the sense that there is a foundation that all Agile teams need to focus on. Is there a whole team ownership of quality? Are we building the right things? Are we actually building it right? If we are using metrics, are they actually useful, and do they make sense? How are we steering the quality initiatives? Are we all following the same set of headlights? Are we actually learning from our mistakes, or are we still repeating the same problems over and over again, but doing it much faster than ever before?
Agile is a great theory (used in the scientific manner and not the colloquial, I genuinely mean that), and it offers a lot of promise for those that embrace it and put it into practice (and practice is the key word; it’s not a RonCo product, you don’t just “set it and forget it”). It’s not all automation, it’s not all exploratory testing, and it’s not all tools. It’s a rhythm and a balance, and improvising is critical. Embrace it, learn from it, use what works, discard what doesn’t. Rinse and repeat :).
The last time I spoke at STP-CON, I was the closing speaker for the tracks. This time around, I was one of the first out the gate. If I have to choose, I think I prefer the latter, though I’ll speak whenever and wherever asked ;). I’ve spoken about Accessibility in a number of places, and on this blog. Albert Gareev and I have made it a point of focus for this year, and we have collaborated on talks that we are both delivering. When you see me talking about Accessibility, and when you see Albert talk about it, we are usually covering the same ground but from different perspectives. I delivered a dress rehearsal of this talk at the BAST meetup in January in San Francisco, and I received valuable feedback that made its way into this version of the talk.
The first change that I wanted to make sure to discuss right off the bat is that we discuss Accessibility from a flawed premise. We associate it with a negative. It’s something we have to do, because a government contract mandates it or we are being threatened with a lawsuit. Needless to say, when you approach features and changes from that kind of negative pressure, it’s easy to be less than enthusiastic, and to take care of it after everything else. Through discussions I had with a number of people at that dress rehearsal talk, we realized that a golden opportunity to change the narrative was already happening. An attendee held up their smart phone and said “this, right here, will drive more changes to accessible design than any number of government regulations”, and they were absolutely right! We see it happening today with micro services being developed to intake data from wearables like the Apple Watch, the FitBit, or other peripheral devices, where general text and screens just aren’t the way the product needs to interact.
While I was developing this talk, and sent my slides to be included in the program, a brilliant tweet was made by @willkei on Twitter that encapsulated this interesting turning point:
The same steps and processes being used to make content accessible on our smart phones and our smart watches and the Internet of Things are really no different than designing for accessibility from the outset. The form factor necessitates the need for simplicity and direct communication, in a way that a regular interface cannot and will not do. What’s great about these developments is the fact that new methods of interaction are having to be developed and implemented. By this same process, we can use this opportunity to incorporate those design ideas for everyone.
A heuristic that Albert and I (well, mostly Albert) have developed that can help testers think about testing for accessibility is called HUMBLE.
Humanize: be empathetic, understand the emotional components.
Unlearn: step away from your default [device-specific] habits. Be able to switch into different habit modes.
Model: use personas that help you see, hear and feel the issues. Consider behaviors, pace, mental state and system state.
Build: knowledge, testing heuristics, core testing skills, testing infrastructure, credibility.
Learn: what are the barriers? How do users Perceive, Understand and Operate?
Experiment: put yourself into literal situations. Collaborate with designers and programmers, provide feedback.
In short, Accessible design isn’t something special, or an add on for “those other people” out there. It’s designing in a way that allows the most people the ability to interact with a product effectively, efficiently and in a way that, we should hope, includes rather than excludes. Designing for this up front is much better, and easier to create and test, than having to retrofit products to be accessible. Let’s stop looking at the negative associations of accessibility, because we have to do it, and instead focus on making a web that is usable and welcoming for all.
Maaret Pyhäjärvi covered a topic that intrigued me greatly. While there are many talks about test automation and continuous delivery, Maaret is approaching the idea of “Continuous Delivery WITHOUT Automation”. Wait, what?
Some background for Maaret’s product would probably be helpful. She’s working on a web based .NET product on a team that has 1 tester for each developer, and a Scrum-like monthly release for two years. As might be guessed, releases were often late and not tested well enough. This created a vicious cycle of failing with estimates and damaging team morale which, you guessed it, caused them more failing with estimates.
What if, instead of trying to keep working on these failed estimates and artificial timelines, we were to move to continuous deployment instead? What was interesting to me was the way that Maaret described this. They don’t want to go to continuous delivery, but instead, continuous deployment. The build, unit test, functional test and integration test components are all automated, but the last step, the actual deployment, is manually handled. Instead of waiting for a bunch of features to be ready and push to release, each feature would be individually released to production as it is tested, deemed worthy and ready to be released.
This model made me smile because in our environment, I realized that we did this, at least partially. Our staging environment is our internal production server. It’s what we as a company actively use to make sure that we are delivering each finished feature (or sometimes plural features) and deploying the application on a regular basis.
Maaret discussed the value of setting up a Kanban board to track features as a one piece flow. We likewise do this at Socialtext. Each story makes its way through the process, and as we work on each of the stories, as we find issues, we address the issues until all of the acceptance criteria pass. From there, we merge to staging, create a build, and then we physically update the staging server, sometimes daily, occasionally multiple times a day. This helps us to evaluate the issues that we discover in a broader environment and in real time vs. inside of our more sterile and generalized test environments. It seems that Maaret and her team do something very similar.
What I like about Maaret’s model, and by extension, ours, is the fact that there remains a human touch to the deployments so that we can interact with and see how these changes are affecting our environment in real time. We remain aware of the deployment issues if they develop, and additionally, we share that role each week. We have a pumpking that handles the deployment of releases each week, and everyone on the engineering team takes their turn as the pumpking. I take my turn as well, so I am familiar with all of the steps to deploy the application, as well as places that things could go wrong. Having this knowledge allows me to approach not just my testing but the actual deployment process from an exploratory mindset. It allows me to address potential issues and see where we might have problems with services we depend on, and how we might have issues if we have to update elements that interact with those services independently.
Paul Grizzaffi has focused on creating and deploying automated test strategies and frameworks, and in his talk he emphasizes changing the conversation that we are having about test automation and how we approach it. In “Not Your Parents’ Automation – Practical Application of Non-Traditional Test Automation”, we have to face the fact that traditional automation does matter. We use it for quantitative items and both high and low level checks. However, there are other places that automation can be applied. How can we think differently when it comes to software test automation?
Often, we think of traditional automation as a way to take the human out of the equation entirely. That’s not necessarily a bad thing. There are a lot of tedious actions that we do that we really wish we didn’t have to do. When we build a test environment, there are a handful of commands I now run, and those commands I put into a script so I can do it all in one step. That’s a good use of traditional automation, and it’s genuinely valuable. We often say we want to take out the drudgery so the human can focus on the interesting. Let’s extend this idea: can we use automation to help us with the interesting as well?
Paul asks a fundamental question: “how can automation help me do my job?” When we think of it that way, we open new vistas and consider different angles where our automation can help us look for egregious errors, or look at running a script on several machines in parallel. How about if we were to take old machines that we could run headless and create a variety of test environments, and run the same scripts on a matrix of machines? Think that might prove to be valuable? Paul said that yes, it certainly was for their organization.
Paul uses a term I like and have used as a substitute for automation. I say Computer Assisted Testing, Paul calls it Automation Assist, but overall it’s the same thing. We’re not looking for atomic pass/fail check validations. Instead, we are looking for ways to observe and use our gut instincts to help us consider if we are actually seeing what we hope to see, or determine that we are not seeing what we want to.
At Socialtext, I have been working on a script grouping with this in mind. We call it Slideshow, and it’s a very visible automation tool. For those who recall my talk about balancing Automation and Exploratory Testing, I used the metaphor of a taxi cab in place of a railroad train. The value of a taxi cab is that it can go many unique places, and then let you stop and look around. Paul’s philosophy fits very well with that approach and use. Granted, it’s fairly labor intensive to make and maintain these tests because we want to keep looking for unique places to go and find new pages.
A key idea that Paul asks us to consider, and one I feel makes a lot of sense, is that automation is important, but it gets way too much focus and attention for what it actually provides. If we don’t want to scale back our traditional automation efforts, then we should look for ways that we can use it in unique and different ways. One of my favorite statements is to say “forget the tool for now, and start with defining the problem”. By doing that, we can then determine what tools actually make sense, or if a tool is actually needed (there’s a lot you can do with bash scripts, loops and curl commands… just sayin’ 😉 ). Also, any tool can be either an exceptional benefit or a huge stumbling block. It depends entirely on the person using the tool, and their expertise or lack thereof. If there is a greater level of knowledge, then more creative uses are likely to result, but don’t sell short your non toolsmith testers. They can likely still help you find interesting and creative uses for your automation assistance.
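To make the “loops and curl” aside a little more concrete, here is a minimal sketch of the same idea, written in Python rather than bash. It is only an illustration: the function names (`fetch`, `survey`) and any URLs you feed it are my own placeholders, not anything from Paul’s talk. The point is that it records what it saw for a human to look over, instead of boiling each page down to an atomic pass or fail.

```python
import time
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Default fetcher: return (HTTP status, body length) for a URL."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status, len(resp.read())


def survey(urls, fetcher=fetch):
    """Visit each URL and record observations; no pass/fail verdicts."""
    rows = []
    for url in urls:
        start = time.perf_counter()
        try:
            status, size = fetcher(url)
            note = ""
        except HTTPError as exc:
            # urlopen raises on 4xx/5xx; keep the status code as an observation.
            status, size, note = exc.code, 0, str(exc.reason)
        except OSError as exc:
            # URLError and socket timeouts are OSError subclasses.
            status, size, note = None, 0, str(exc)
        rows.append({
            "url": url,
            "status": status,
            "bytes": size,
            "seconds": round(time.perf_counter() - start, 3),
            "note": note,
        })
    return rows
```

Calling `survey([...])` with a handful of your own pages and printing the rows gives a tester something to scan for oddities (a page that is suddenly half its usual size, or three times slower), which is closer to “automation assist” than to a check.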
I think Smita Mishra deserves an award for overachieving. Not content to deliver an all day workshop, she’s also throwing down with a track talk as well. For those who don’t know Smita, she is the CEO and Chief Test Consultant with QAzone Infosystems. She’s also an active advocate, writer and engaged member of our community, as well as being a rock solid software tester. In her track talk, she discussed “Exploratory Testing: Learning the Security Way”.
Smita asks us to consider what it would take to perform a reasonably good security test. Can we manually test security to a reasonable level? Can exploratory testing give us an advantage?
First off, Exploratory Testing needs to be explained as to what it is not. It’s not random, it’s not truly ad hoc, at least not if it is being done correctly. A favorite way I like to look at Exploratory Testing is to consider interviewing someone, asking them questions, and based on their answers, asking follow up questions that might lead into interesting and informative areas. Like a personal interview, if we ask safe boring questions with predictable answers, we don’t learn very much. However, when we ask a question that makes someone wince, or if we see they are guarded with their answers, or even evasive, we know we have hit on something juicy, and our common desire is to keep probing and find out more. Exploratory testing is much the same, and security testing is a wonderful way to apply these ideas.
One of the most interesting ways to attack a system is to determine the threat model and the prevalence of the attack. SQL injections are ways that we can either confirm the security of form fields or see if we can compromise the fields and gain access.
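For anyone who hasn’t seen a SQL injection probe up close, here is a small, self-contained sketch. It is my own illustration, not something from Smita’s slides: the `users` table, the two login functions and the probe string are all made up for the example, and it runs against an in-memory SQLite database so nothing real is touched. One login builds its query by gluing strings together; the other uses a parameterized query.

```python
import sqlite3

# A classic probe: closes the quoted value and adds a condition that is always true.
PROBE = "' OR '1'='1"


def make_db():
    """Create a throwaway in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


def login_vulnerable(conn, name, password):
    """Builds the SQL by string concatenation, so input can rewrite the query."""
    query = (
        "SELECT COUNT(*) FROM users WHERE name = '" + name
        + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_parameterized(conn, name, password):
    """Passes input as bound parameters, so it is only ever treated as data."""
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0
```

With a database from `make_db()`, `login_vulnerable(conn, "alice", PROBE)` lets the probe in without the real password, while `login_parameterized(conn, "alice", PROBE)` turns it away. In exploratory terms, the first result is the “wince” that tells you to keep asking questions of that field.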
What is interesting about security testing as a metaphor for exploratory testing is the fact that attacks are rarely tried one off and done. If an issue doesn’t manifest with one attack, a hacker will likely try several more. As testers, we need to think and try to see if we can approach the system in a similar manner. It’s back to the interviewing metaphor, with a little paparazzi determination for good measure.
Ask yourself what aspects of the application would make it vulnerable. What services reside in one place, and what services reside in another? Does your site use iFrames? Are you querying different systems? Are you exposing the query payload? Again, each answer will help inform the next question that we want to ask. Each avenue we open will allow us to find a potential embarrassment of riches.
Smita encourages use of the Heuristic Test Strategy Model, which breaks the product into multiple sections (Quality Criteria, Project Environment, Product Elements and Test Techniques, as well as the Perceived Quality that the product should exhibit). Each section has a further breakdown of criteria to examine, and each breakdown thread allows us to ask a question, or group of questions. This model can be used for something as small as a button element or as large as an entire application. Charters can be developed around each of the questions we develop, and our observations can encourage us to ask further clarifying questions. It’s like a puzzle when we start putting the pieces together. Each successful link-up informs the next piece to fall in place, and each test can help us determine which further tests to perform, or interlocking areas to examine.
Our closing keynote talk today is being delivered courtesy of Mark Tomlinson, titled “It’s Non-Functional, not Dysfunctional”. There’s functional testing and the general direct interaction we are all used to considering, and then there’s all that “other” stuff, the “ilities” and the para-functional items. Performance, Load, Security, Usability, Accessibility, and other items all play into the testing puzzle, and people need to have these skills. However, we don’t have the skills readily available in enough places to actually give these areas the justice they deserve.
Programmers are good at focusing on the what and the where, and perhaps can offer a fighting chance at the why, but the when and the how sometimes fall by the wayside. Have we considered what happens when we scale up the app? Does the database allow enough records? Can we create conditions that bring the system to its knees (and yes, I know the answer is “yes” 😉 )?
There are a variety of dysfunctions we need to address, and Mark broke down a few here:
Dysfunction 1: Unless the code is stable, I can’t test performance.
What determines stable? Who determines it? How long will it take to actually make it stable? Can you answer any of those? Functional testers may not be able to, but performance testers may be able to help make unstable areas more stable (mocks, stubs, virtual services, etc.). We can’t do a big run of everything, but we can allocate some effort for some areas.
Dysfunction 2: As a product manager, I’ve allocated 2 weeks at the end of the project for load testing.
Why two weeks? Is that anywhere near enough? Is it arbitrary? Do we have a plan to actually do this? What happens if we discover problems? Do we have any pivoting room?
Dysfunction 3: The functional testers don’t know how to use the tools for performance or security.
Considering how easy it is to download stuff like JMeter and Metasploit or Kali Linux, there are plenty of opportunities to do “ility” testing for very close to free if not outright free. If you happen to get some traction, share what you know with others, and suddenly, surprise, you have a tool stack that others can get involved with.
Dysfunction 4: We don’t have the same security configuration in TEST that we do in PROD.
Just because they don’t match exactly doesn’t mean that the vulnerabilities are not there. There are a number of ways to look at penetration testing and utilize the approaches these tools use. What if the bizarre strings we use for security testing were actually used for our automated tests? Insane? Hardly. It may not be a 1:1 comparison, but you can likely use it.
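As a rough sketch of what “bizarre strings in our automated tests” could look like, here is a tiny Python harness. Everything in it is an assumption for illustration: the list of hostile strings is a small sample I picked, and `render_naive` and `render_escaped` are stand-ins for whatever function in your product turns user input into output. The harness feeds every string through a renderer and collects anything suspicious.

```python
import html

# A small, illustrative sample of hostile inputs; real lists are much longer.
HOSTILE_STRINGS = [
    "' OR '1'='1",
    "<script>alert(1)</script>",
    "../../etc/passwd",
    "Robert'); DROP TABLE students;--",
    "A" * 10000,
    "",
]


def render_naive(text):
    """Hypothetical renderer that drops user text straight into markup."""
    return "<p>" + text + "</p>"


def render_escaped(text):
    """Hypothetical renderer that escapes user text before output."""
    return "<p>" + html.escape(text) + "</p>"


def probe(render, strings=HOSTILE_STRINGS):
    """Run every hostile string through render and list what looks wrong."""
    findings = []
    for s in strings:
        try:
            out = render(s)
        except Exception as exc:
            findings.append((s[:40], "raised " + type(exc).__name__))
            continue
        if "<script" in out.lower():
            findings.append((s[:40], "unescaped script tag in output"))
    return findings
```

Here `probe(render_escaped)` comes back empty, while `probe(render_naive)` reports the script tag that sailed through untouched. It is nowhere near a penetration test, but it is cheap to run in TEST on every build, whatever the security configuration in PROD looks like.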
Dysfunction 5: Our test environment is shared with other environments and hosted on over provisioned virtual server hosts.
OK, yeah, that’s real, and that’s going to give us false positives (even latency in the cloud can muck some of this up). This is also a chance to communicate times and slices to access the machines without competing for resources.
Dysfunction 6: We do not have defined SLAs or non-functional requirements specifications.
Fair enough, but do we even have scoping for one user? Ten users? One hundred users? What’s the goal we are after? Can we quantify something and compare with expectations? Can we use heuristics that make sense to determine if we are even in the ball park?
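Even without an SLA, the “one user, ten users, one hundred users” question can be roughed out in a few lines. The sketch below is only an assumption-laden illustration: `action` stands in for whatever single user operation you care about (a page load, an API call), threads stand in for users, and the budget passed to `over_budget` is a number you would have to pick yourself. It is not a replacement for a real load tool like JMeter.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def timed_call(action):
    """Run one user action and return how long it took, in seconds."""
    start = time.perf_counter()
    action()
    return time.perf_counter() - start


def measure(action, users):
    """Run the action once per simulated user, all at the same time."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        durations = list(pool.map(lambda _: timed_call(action), range(users)))
    return {"users": users, "median_s": median(durations), "worst_s": max(durations)}


def scope(action, levels=(1, 10, 100)):
    """Measure the same action at each user level, smallest first."""
    return [measure(action, n) for n in levels]


def over_budget(results, budget_s):
    """Return the user levels whose slowest call exceeded the time budget."""
    return [r["users"] for r in results if r["worst_s"] > budget_s]
```

Running `scope(my_action)` and then `over_budget(results, 2.0)` gives a first, crude answer to “are we even in the ball park?”, and a concrete set of numbers to bring to the people who can help define the real requirement.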
Would we approach these questions the exact same way? Ha! Trick question! Of course we wouldn’t, but at least by structuring the questions this way (Question B) we can make decisions that at least allow us to make sure we are asking the right questions in the first place. More to the point, we can reach out to people that have a clue as to what we should be doing, but we can’t do that if we don’t know what to ask.
This concludes the Wednesday session of STP-CON. Join me tomorrow for “Delivering the Goods: A Live Blog from #STPCON”.