Wow, what a week this has been. We’re now on day three, the last day, and I’m up in an hour! I’m excited, a little frazzled, but I think we’re going to do well. I’m also excited that the four speakers in the breakout today are all good friends; Curtis Stuehrenberg, Seth Eliot and Mark Tomlinson are gonna help me close out this conference, and we look forward to chatting with as many people as possible who want to look at ways to change the face and state of software testing. If you are here at ALM Forum, come join us. If you are not able to be, please read on here and take in as much as you can from my notes and observations.
Transforming Software Development in a World of Services with Sam Guckenheimer is the first session, and we are starting out with a thought experiment around Airbnb (the online service to rent rooms, houses, etc. in different cities). A boat on Puget Sound is available, so a company can host all of their team members on the boat. What will the experience be? Will it be a fun stay? Will it be too cramped? We don’t know, but one thing’s for sure, it will be open, it will be public, and good or bad, if people want to talk about it, they will.
This makes for an interesting comparison to Agile development, and the way that Agile has shaken out. What had been intended to be a relatively private internal housekeeping mode has become a more public viewing. We are social, we are open, we use systems that are often out of our control in the 100% sense of the word. A lot of our practices and actions are not quiet and hidden, they are visible to all who would care to see them. It’s a little daunting, but it’s also tremendously liberating.
This talk is looking at a Microsoft ideal of “cloud cadence”. Customers want regular improvements, we want to maximize the value we provide to our customers, and we know that their feedback is not just for developers, it’s seen by everyone. Get it right, and we have app store five star reviews. Get it wrong, and we can have considerably lower reviews (and don’t for a second think those reviews don’t matter; it can be the difference between adoption or being totally forsaken).
The DevOps life cycle comes together with three aspects. We have development, we have production, and in between we have the collaboration piece. What’s the most important element there? Well, without good development, we have a product that is sub par. With bad deployment, we might have a great product but it won’t really work the way we intend it to. The middle piece is the critical aspect, and that collaboration element is really difficult to pin down. It’s not a simple prescription or a set checklist. Each organization and project will be different, and many times, the underpinnings will change (from our servers to the cloud, from a dedicated and closed application to a socially aware application). Sometimes the changes are made deliberately, sometimes the changes are made a little more forcefully. Either way, without a sense of shared purpose or collaboration between the development and production groups, including the tooling necessary to accomplish the goals, neither of the other two pieces can really deliver.
The ability to do all of these things in the Visual Studio team is the core of Sam’s talk, and the interactions with their clients and the variety of changes that occur drive many of their decisions. They learn from their customers and change direction. They focus on a human to human feedback model (which may sound a little unusual for a giant company like Microsoft, but Sam makes a convincing case 🙂 ).
reporters” sharing a clear story of your product. A thought experiment from Elisabeth Hendrickson that I personally love is “what is the most terrifying headline about your company you could imagine seeing in the paper? Wouldn’t you want your testers to not only find out that terrifying headline, but inform you so that you could prevent it?”
- less scripting, more active thinking
- less checking, more real testing
- less blind faith, more scientific skepticism
- creative, inventive, intuitive, mindful
- SummerQAmp: hire an intern
- Per Scholas: have a chat with recent STeP graduates and their mentors
- Weekend Testing: Come join us for a session or two and see the magic happen
- Miagi-do: Do a web search for the term “Miagi-do School of Software Testing”. Or better yet, just ask me ;).
Curtis Stuehrenberg is talking about how to “ACCellerate Your Agile Test Planning”. He decided to chuck the PowerPoint entirely and give a crash course in Agile testing on a live product… specifically, his product (well, Climate Corp’s mobile app, to be specific). His point was to say “what if we have to test a product in two weeks? How about one week? How about three days? What are you going to do?”
Rather than talk it, we all participated in an active testing session, downloading the app to our mobile devices (iPhone and Android only, sorry Windows Phone users 🙁 ). By walking through the steps and the test areas, and using an idea from James Whittaker and Google called the ACC model, we all in real time put together sections of risk and areas we would want to make sure that we tested. In many ways, ACC is a variation on a theme of Session Based Test Management (SBTM). It informs our tests, we act on the guidance, and we pivot and adapt based on what we learn, and we do it quickly.
Much of the interaction was just things we did in real time, and for my money, this was a brilliant way to emphasize this approach. Instead of just talking about it, we all did it. Even if the idea of a formal test plan is not something you have to deal with, give this approach a try. I know I’m going to play with this when I get back home :).
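For anyone who wants to try this at home, here’s a minimal sketch of what an ACC grid might look like in Python. ACC, as James Whittaker describes it, stands for Attributes, Components, Capabilities. Everything specific below (the attribute and component names, the capability descriptions, the 1 to 5 risk scores) is a made-up placeholder, not what came out of the live session.

```python
from collections import defaultdict

# Hypothetical capabilities: (attribute, component, description, risk 1-5).
# None of these come from the actual session; they are placeholders.
CAPABILITIES = [
    ("Accurate", "Weather feed", "Shows rainfall for the selected field", 5),
    ("Accurate", "Field map", "Draws field boundaries in the right place", 4),
    ("Fast", "Weather feed", "Refreshes data within a few seconds", 3),
    ("Fast", "Login", "Signs the user in without a long wait", 2),
]


def build_grid(capabilities):
    """Group capabilities into an attribute x component grid."""
    grid = defaultdict(list)
    for attribute, component, description, risk in capabilities:
        grid[(attribute, component)].append((description, risk))
    return dict(grid)


def riskiest_cells(grid, top=3):
    """Rank grid cells by their total risk score, highest first."""
    totals = {cell: sum(risk for _, risk in caps) for cell, caps in grid.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]


if __name__ == "__main__":
    for (attribute, component), score in riskiest_cells(build_grid(CAPABILITIES)):
        print(f"{attribute} / {component}: risk {score}")
```

The point of ranking the cells is simply to decide where the first test sessions should go when time is short; the scoring scheme itself is an assumption and can be as rough as the team likes.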
Now it’s time for Seth Eliot and “Your Path to Data Driven Quality”, a roadmap for how to use the data that you are gathering to help guide you to your ultimate destination. Seth wants to make the point that testing is measurement, and you can’t measure if you don’t have data (well, you can, but it won’t really be worth much). Seth asks if we are HiPPO driven (meaning, is our strategy defined by the “Highest Paid Person’s Opinion”) or if we are making decisions based on hard data. Engineering data can help a little bit (test results, bug counts, pass/fail rates). It can give us a picture, but maybe not a complete one (in fact, not even close to a complete one). There’s a lot of stuff we are leaving on the table. Seth says that leveraging production data (or “near production data”) gives us a richer and more dynamic data set. Testers try to be creative, but we can’t come close to the wacko randomness of the real world users that interact with our product.
First step: Determine your questions. Use Goal Question Metrics. Start at the beginning and see what you ultimately want to do. Don’t just get data and look for answers. Your data will taint the questions you ask if you don’t ask the questions first. You may develop a confirmation bias if you look at data that may seem to point to a question you haven’t asked. Instead, the data may give you a correlation to something, but it may not actually tell you anything important. Starting with the question helps to de-bias your expectations, and then it gives you guidance as to what the data actually tells you.
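One way to keep the “questions first” discipline honest is to write the goal, its questions, and the metrics each question needs down as data before touching any logs. The sketch below is only an illustration of that ordering; the class names, the example goal, and the metric names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    metrics: list = field(default_factory=list)


@dataclass
class Goal:
    text: str
    questions: list = field(default_factory=list)

    def metrics_to_collect(self):
        """Return the de-duplicated metrics, in the order first asked for."""
        seen = []
        for question in self.questions:
            for metric in question.metrics:
                if metric not in seen:
                    seen.append(metric)
        return seen


# Hypothetical example; the goal, questions and metric names are made up.
goal = Goal(
    "Keep sign-up reliable during peak traffic",
    [
        Question("When is peak traffic?", ["requests_per_hour"]),
        Question("How often does sign-up fail?", ["signup_errors", "requests_per_hour"]),
    ],
)

if __name__ == "__main__":
    print(goal.metrics_to_collect())
```

Only metrics that some question actually asked for end up on the collection list, which is the whole idea: the data serves the question, not the other way around.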
Then: Design for production-data quality. There are two types of data we can access: active and passive. Active data could be test cases or synthetic data of a simulated user. Passive data is real world data from real users’ interactions. Synthetic data is safer, but it’s by definition incomplete. Passive data is more complete, but there’s a danger to using it (compromising identification data, etc.). Staging the data acquisition lets us go from starting with synthetic data (reminds me of my “Attack on Titan” account group that I have lovingly put together when I test Socialtext… yes, I have one. Don’t judge me 😉 ), to copying my actual account and sharing on our production site (much richer data, but it needs to be scrubbed of anything that could compromise individuals’ privacy… which in turn gets us back to synthetic data of sorts, but a richer set). Bulk up and repeat. Over time, we can go from having a small set of sample data to a much larger and beefier data set, with lots more interesting data points.
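The “scrub it before you use it” step can be sketched in a few lines. This is a rough illustration only: the field names are assumptions, the salt is a placeholder, and a salted hash like this is pseudonymization at best, not a privacy guarantee.

```python
import hashlib

# Assumed field names; a real export will have its own schema.
SENSITIVE_FIELDS = {"name", "email"}


def pseudonym(value, salt="demo-salt"):
    """Replace a value with a short, stable token derived from a hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "user_" + digest[:10]


def scrub(record):
    """Return a copy of the record with sensitive fields replaced by tokens."""
    return {
        key: pseudonym(str(value)) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }


if __name__ == "__main__":
    # Made-up record, not real user data.
    print(scrub({"name": "Eren", "email": "eren@example.com", "posts": 3}))
```

Because the same input always maps to the same token, relationships between records survive the scrub, which is what keeps the copied data “richer” than hand-built synthetic accounts.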
Then: Select data sources. There are a number of ways to gather and accumulate data. We can export from user accounts, or we can actively aggregate user data and collect those details (reminds me of the days of NetFlow FlowCollection at Cisco). We need to be clear as to what we are gathering and the data handling privacy that goes with it. Anonymous data is typically safe; sensitive personally identifiable info requires protocols to gather, most likely scrub, or not touch with a ten foot pole. Will we be using infrastructure data, app data, usage, account details, etc.? Each area has its unique challenges. Plan accordingly.
Then: Use the right data tools. What are you going to use to store this data? Databases are of course common, but for big data apps, we need something a little more robust (Hadoop is hip in this area). Where do you store a Hadoop instance? Split it up into smaller chunks (note, splitting it makes it vulnerable, so we need to replicate it. Wow, big data gets bigger 🙂 ). Using MapReduce tools, we can crunch down to a smaller data set for analysis purposes (I’m going to take Seth’s word for it, as Hadoop is not one of my strong suits, but I appreciated the 60 second guided tour 🙂 ). Regardless of the data collection and storage, ultimately that data needs to be viewed, monitored, aggregated and analyzed. The tools that do that are wide and varied, but the goal is to drill down to the data that matters to you, and to have the ability to interpret what you are seeing.
Then: Get answers to your questions. Ultimately, we hope that we are able to get answers based on the real data we have gathered that will help us either support or dispute our hypothesis (back to the scientific method; testing is asking questions and then, based on the answers we receive, considering and proposing more interesting questions). Does our data show us interesting points to focus our attention? Do we know a bit more about user sentiment? Have we figured out where our peak traffic times are? If we have asked these questions, and gathered data that is appropriate for those questions, if we have been focused on aggregating the appropriate data and analyzing it, we should be able to say “yes, we have support for our hypothesis” or “no, this data refutes our hypothesis”. Of course, that leads to even more questions, which means we go to…
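To make the split, map, and reduce idea a little more concrete, here is a toy version in plain Python. It runs in a single process and uses no Hadoop API at all; it only mimics the shape of the technique. The log lines and their “hour page” format are invented for the example.

```python
from collections import Counter
from functools import reduce

# Made-up log lines in the form "<hour> <page>"; not real data.
LOG_LINES = [
    "09 /home", "09 /signup", "10 /home",
    "10 /home", "10 /signup", "11 /home",
]


def split(lines, size):
    """Break the data set into chunks, as a cluster would across nodes."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def map_chunk(chunk):
    """Map step: count hits per hour within one chunk."""
    return Counter(line.split()[0] for line in chunk)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts."""
    return left + right


def hits_per_hour(lines, chunk_size=2):
    """Run the map step on every chunk, then fold the partial results together."""
    partials = [map_chunk(chunk) for chunk in split(lines, chunk_size)]
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    print(hits_per_hour(LOG_LINES).most_common(1))
```

Each chunk is counted independently, so on a real cluster the map step could run on different machines, and only the small partial counts would need to travel to wherever the reduce step runs.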
Lather. Rinse. Repeat.
Hmmm, Mark Tomlinson just passed me a note with a statement that says “Computer Aided Exploratory Testing”? Hadn’t considered it quite that way, but yes, this certainly fits the description. An intriguing prospect, and one I need to play with a bit more :).
Lightning talks! Woo!!! We have four presenters looking to rifle through some quick talks.
Mark Prichard is discussing “Complete Continuous Integration and Testing for Mobile and Web Applications”. Mark is with CloudBees, and he’s explaining how they do exactly what the title describes. Some interesting ideas surrounding how to use Jenkins and other tools to make it possible to build multiple releases and leverage a variety of common tools so as to not have to replicate everything for each environment. Leverage the cloud and Platform-as-a-Service for Continuous Delivery. Key takeaway… “ALM in the cloud will become the rule, not the exception”. Quote attributed to Kurt Bittner.
Mike Ostenberg from SOASTA is next and he’s talking about “Performance Testing in Production, and what you’ll find there”. Begs the question… *WHY* do we want to do performance testing in production? (Isn’t that what we call a “customer freak out”? Well, yeah, but that’s an after effect, and we really want to not go there 😉 ). Real systems, real load, real profiling. There are ways we can simulate load on a test environment, but it’s not really going to match what happens in the real space. Additionally, we want to do our load testing earlier than we traditionally do it. At the end of the cycle, we’re a little too far gone to actually pivot based on what we learn.
Load testing in Production, Mike points out, can be done in stages and can be done on different levels. Just as we use Unit tests for components, integration tests for bigger systems, and feature/acceptance tests to tie it all together, we can deconstruct load tests to match a similar paradigm. Earlier load tests are dealing with errors, page loads, garbage collection, data management, etc. Regardless of the stage, there are some critical things to look at.
- Bandwidth is #1: can everyone reach what they need?
- Load balancing, or making sure everyone pulls their weight, is also high priority.
- Application issues: there’s no such thing as perfect code. Earlier tests can shake out the system to help show inefficient code, sync issues, etc.
- Database performance fits in application issues, but it’s a special set of test cases. The database, as Mike points out, is the core of performance. Locking and contention, index issues, memory management, connection management, etc. all come into play.
- Architecture is imperative. Think of matching the right engine to the appropriate car.
- Connectivity comes into play as well: latency, lack of redundancy, firewall capacity, DNS, etc.
- Configuration means we need to get custom and actually see if we mean it.
- Shared environments… watch out for those noisy neighbors :). Random stuff comes into play when things are shared in the real world. Pay attention to what they can do for you (or to you 😉 ).
I like this staggered approach, it makes the idea of “testing in production” not seem so overwhelming.
Now on deck is Dori Exterman, and he’s talking about “Reducing the Build-Test-Deploy Cycle from Hours to Minutes at Cellebrite”. Hmmm, color me mildly skeptical, but OK, tell me more :). I’m very familiar with the idea of serial build-test-deploy, and I know that that does not bode well. Multi-core systems can certainly help with this, and leveraging multi-core environments can allow us to do a much tighter build-test-deploy pipeline. Parallel processing speeds things up, but there’s a system limit, and those system limits are also very costly at their higher end.
So what’s the option when we max out the cores on a single system? It seems that going parallel to more servers would make sense. Rather than one machine with 32 cores, how about 8 machines with four cores? Same number of cores, maybe similar throughput gains (and potentially better since system resources are shared over multiple machines). This approach is referred to as a CI Cluster Farm. Cool, but we’re still in a similar ball-park. Can we do better? Dori says yes, and his answer is to use distributed computing within your own network of machines. If I’m hearing this correctly, it’s kind of like the idea of letting your machine be used for “protein folding” experiments while your machine is in more idle states (anyone else remember signing up to do stuff like that :)? ). I’m not sure that’s what Dori means, but it seems this could be really viable, and we already have an example of that happening (i.e., “signing up for protein folding”).
How wild would it be to be able to wire up your entire network, everyone’s machines, so that they can help speed up the build process? It’s a fascinating model. I’d be curious to see if this really comes to fruition.
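A back-of-the-envelope way to think about how much a farm of machines can help is to estimate the wall-clock time for a set of independent build tasks on a given number of workers. The sketch below uses a simple greedy, longest-task-first assignment. It is not how any particular build tool schedules work; it ignores dependencies and network overhead, and the task durations are invented.

```python
import heapq


def makespan(task_minutes, workers):
    """Estimate wall-clock time when each task goes to the worker that frees up first."""
    if workers < 1:
        raise ValueError("need at least one worker")
    finish_times = [0] * workers  # min-heap of each worker's finish time
    for minutes in sorted(task_minutes, reverse=True):
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + minutes)
    return max(finish_times)


# Hypothetical build tasks, in minutes.
TASKS = [30, 20, 20, 10, 10, 10, 10, 10]

if __name__ == "__main__":
    for count in (1, 4, 8):
        print(count, "workers:", makespan(TASKS, count), "minutes")
```

With these made-up numbers, going from one worker to four cuts the estimate sharply, but doubling again to eight changes nothing, because the longest single task sets the floor. More machines only help as long as the work can be cut into smaller pieces.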
We had another Lightning talk added that came from a Birds of a Feather session about CI/CD, so this is a bit of a surprise. The idea was to see how we could leverage pipelines (mini-builds that run in sequence and individually). Mini-builds also help us to build individual components, with a goal to integrate the elements later on. Often, all we want is a Yes/No to see if the change is good, or not (gated check-ins).
This blends into Dori’s talk just given on distributed computing and utilizing down times for making an almost unlimitedly parallel build engine. So this is interesting, but what’s management going to say about all of this? Well, what is it costing us not to do this? Are we losing time and in effect losing money in the process? Will this help us fix some of our technical debt? If so, it may well be worth considering. If it adds more technical debt, less likely to sell that option.
Another point is that good CI infrastructure will bubble up issues in design and architecture of both the process and the application. Innovation and motivation will potentially increase when changes can be made more frequently, and subsequently, more atomically.
By using information radiators, we can get a clearer sense as to who did what to cause the build to fail. Gadgets (lights, sounds, sensory input) can help make it more apparent and in real time. Not sure if this would be a major plus, but I’m not necessarily the best judge of what developers consider to be fun ;).
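The gated check-in idea from this lightning talk (a sequence of mini-builds that ends in a plain Yes/No) can be sketched very simply. The stage names and the checks below are hypothetical stand-ins; a real pipeline would call out to actual build and test tools.

```python
def run_pipeline(stages, change):
    """Run stages in order and stop at the first failure.

    Returns (accepted, name_of_failed_stage_or_None).
    """
    for name, check in stages:
        if not check(change):
            return False, name
    return True, None


# Hypothetical stages; each check is a stand-in for a real mini-build.
STAGES = [
    ("compile", lambda change: change.get("compiles", False)),
    ("unit tests", lambda change: change.get("failed_tests", 1) == 0),
    ("package", lambda change: True),
]

if __name__ == "__main__":
    print(run_pipeline(STAGES, {"compiles": True, "failed_tests": 0}))
    print(run_pipeline(STAGES, {"compiles": True, "failed_tests": 2}))
```

Returning the name of the failing stage alongside the Yes/No is what gives an information radiator something useful to show.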
The final test track talk, the anchor session, goes to Mark Tomlinson, as he discusses “Roles and Revelations: Embracing and Evolving our Conceptions of Testing”. With a title like that, let’s just say “you had me at ‘hello’” ;).
Mark is a fun guy to listen to (check out his podcast “PerfBytes” to get a feel), and thus, it’s fun to hear him do a more narrative talk as opposed to a techy talk. We start out with the idea of what testing is, at least how we look at it historically. We find bugs, we see that we can validate to a spec, we try to reduce costs, and we aim to mitigate risks. Overall, I think if you gave that list to any lay person and said “that’s what testing is”, they’d probably have little difficulty understanding that. Those definitions are valid, but they’re also somewhat limiting. We’ve seen some interesting milestones over the past 50 years. Debugging, Demonstration, Destruction, Evaluation and Prevention can all be seen as “eras of testing”. Mark points out that there are 10 different schools of testing (Domain, Stress, Specification, Risk, Random/Statistical, Function, Regression, Scenario, User, and Exploratory).
That’s all cool… but what if one day everything changed? Well, one would say that the past 14 years, or since the Agile Manifesto, the Universe did Change… to steal a little from James Burke. We are less likely today to have isolated test groups. We have a lot more alphabet soup when it comes to our titles. I’ve had lots of titles, lots of combinations, but ultimately all of them could be distilled to a “tester” of some flavor. Some teams have no dedicated testers, or just one dedicated tester. Test Driven Development is an unfortunate term choice, in the fact that what is a design process often gets mistaken for “testing” (nope, it’s not. It’s checking for correctness, but it is not testing). Our time to be interactive and effective is happening earlier, and I love this fact.
Continuous Integration, Continuous Deployment, Continuous Delivery and even Continuous Testing have entered the vernacular. What does this mean? It’s all about trying to automate as many of the steps as humanly possible. Build-Check-Deploy-Monitor-Repeat. Conceive of a time and place where we go from end to end without a person involved, just machines. Sounds great, huh? In some ways, it’s awesome, but there’s an unfortunate side effect, in that many processes are billed as testing that are not. Checking is what automation does. It’s great for a lot of things, but it can’t really think. Testing, real testing, requires thinking and judgment. There’s been a devaluing of testing in some organizations, or just doing testing is considered a liability. Unless we are all coding toolsmiths, we are of a lesser order… and that’s bunk!!!
Ultimately, testing is a cost… seriously. Testing does not make money. Testing is a cost center. It’s an important cost center, but it is a cost. Think of Health Insurance. It is not an investment. It’s a cost you have to pay… but when you crash a car or break a leg, then the insurance kicks in, and I’ll bet you’re happy when you have it (and really frustrated if you don’t). That’s what testing is. It’s insurance. It’s a hedge. It’s a cost to prevent calamity. With all of the changing going on, we need to be clear what we are and what we provide.
What we generate, and what real value we provide, is feedback and information. We are not critics. We are not nay-sayers, we are honest (we hope) reporters of the state of reality, or at least as honest as we can be. The really valuable things that we can provide are not automate-able. Yes, I dared to say that :). Computers can evaluate variable values and they can confirm or deny state changes, but they cannot really think, and they cannot make an informed judgment call. They can only do what we as people tell them to.
Change is constant, and we will see more change as we continue. Testers need to be open to change, and realize that, while there is always value that we provide, the way we provide that value, and the mechanisms and institutions that surround them will evolve. If we do not evolve with them, we will be left behind.
Mark emphasizes that software testers are “Facilitators of Quality”. Testing is not just limited to dedicated testers, it’s dispersing. Therefore, we need to emphasize where we can be effective, and that may mean going in totally different directions. Testing provides diversity, if we are willing to have it be a diversifying role. Think of new techniques, expand the way that we can ask questions, learn more about the infrastructure, and figure out ways that we can keep asking questions. The day we stop asking questions is the day testing dies, for real.
Testing can actually accelerate development. I believe this, and have seen it happen in my own experiences. This is where paired developer-tester arrangements can be great. Think of the programmer being the pilot, and the tester being the navigator. Yes, if all we ask is “are we there yet?”, we don’t offer much, but if we watch the terrain, and ask if some ways we’ve mapped may be better or worse for the time we want to arrive, now we’re adding value, and in some ways, we can help them fix issues before they’ve even been committed. Testers provoke reactions. Not to be jerks, but to get people to think and consider what they really should be doing. Do you think you can’t do that? If so, why? Give it a try. You may surprise yourself (and maybe a few programmers) with how much you deliver. In short, be the Devil’s Advocate as often as possible, and be prepared to embrace the Devils you don’t know ;).
Consider that every tester is an Analyst. It may be formal or informal, but we all are, deep down. We can research quality efforts, we can drill down into data and see patterns and trends, we can also see trends and efficiencies we can add to our repertoire, and adapt, adapt, adapt!
Sorry for the delay on this last bit. It started with a rather meta post-presentation call with Mark Tomlinson (we did a conference call about how to do podcasts, and in the process, recorded the session… so yeah, we made a podcast about how to do podcasts as an artifact of a meeting about how to do podcasts). Main takeaway: it’s fun, but there’s more to doing them than many people consider. We just hope we didn’t scare everyone off after we were done (LOL!). After that, all of the speakers descended upon Tango Restaurant and had a fabulous dinner courtesy of the ALM Forum organizing staff. Great conversation with Scott Ambler, Curtis Stuehrenberg, Peter Varhol, and Seth Eliot, as well as several others. The nerd brain power in that small room was probably off the charts, and I was honored to have been included in this event. Seattle, thank you for a very busy and truly enjoyable week. For those who have been keeping track of this rather long missive, my thanks to you, too. To everyone who came to my talk and tweeted or retweeted my comments, and who commented back to me about my talk and gave me your impressions, feedback is a gift, and I’ve received many gifts today. Truly, thank you so much.
With this, I must return to reality and to San Francisco early this morning. I’ve enjoyed our time together, and I hope that, in some small way, this meandering three days of live blogging has given you a flavor of the event and what I’ve learned these past few days. Let’s do it all again some time :)!!!