Marit van Dijk, 19th September 2018, at @FFSTechConf

FFS Great Test Automation Does Not Absolve You from Manual Exploratory Testing

MARIT: Hi, I think the second F is for "fox", which is why there are foxes all over the presentation, just so you know!

Great test automation does not absolve you from manual or exploratory testing. We've seen a shift towards more automated testing and less manual testing. I've seen people, companies, teams move away from having dedicated testers at all, and I've even heard people say, you know, you should never do any manual testing at all. I have an opinion on that.

Now, I don't like manual testing, especially manual regression testing; I think it's very boring! So, when I got into test automation about five years ago, I was like: automate all the things! Because automation is fun, you know? I get to write code, listen to Spotify, and get paid to do that; it is awesome! However, I met lots and lots of awesome testers, and I learned that maybe we cannot automate everything. Some things are just too hard, too brittle; it's more work to automate them than it would be to test them manually, especially if you have to maintain those tests (which I have done).

Automation can't tell you what it's like to actually use the thing. To quote my friend Lanette Creamer: "If you don't test it, your users will." And what does that say about how you value your users?

We have ongoing discussions in my team that, if we build a new system and unit-test all the units [referring to the slides] - which are the small blocks inside those components, of course - and we integration-test between the individual components, then we've tested the whole. And, if we click it all together, it should just magically work, right? Because we have tested everything.

But no, because that's where the bugs live. Whenever we connect all these things, it doesn't quite work.
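To put that in code - a minimal, hypothetical sketch, not from the talk; the PriceLookup and CheapestPriceFinder classes and the JUnit 5 tests are made up for illustration - each unit passes its own test, but the assumption one unit makes about the other only breaks once they are wired together:

    // Hypothetical sketch: each unit is "correct" in isolation,
    // but the bug lives in the connection between them.
    import static org.junit.jupiter.api.Assertions.assertEquals;
    import static org.junit.jupiter.api.Assertions.assertThrows;
    import java.util.List;
    import org.junit.jupiter.api.Test;

    class PriceLookup {
        // Documented behaviour: returns an empty list for an unknown product.
        List<Integer> pricesFor(String product) {
            return List.of();
        }
    }

    class CheapestPriceFinder {
        private final PriceLookup lookup = new PriceLookup();

        // Quietly assumes the list is never empty.
        int cheapest(String product) {
            return lookup.pricesFor(product).stream().min(Integer::compare).orElseThrow();
        }
    }

    class ConnectingTheUnitsTest {
        @Test
        void lookupBehavesAsSpecifiedOnItsOwn() {
            assertEquals(List.of(), new PriceLookup().pricesFor("unknown"));
        }

        // Both units pass their own tests, yet the combination blows up.
        @Test
        void wiredTogetherTheHiddenAssumptionBreaks() {
            assertThrows(java.util.NoSuchElementException.class,
                    () -> new CheapestPriceFinder().cheapest("unknown"));
        }
    }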

I wanted to illustrate this with a system that we built in my team over the past few months - or, in fact, a system we were rebuilding. It was a workflow tool. We had selected an open-source tool for it, but we weren't happy with how hard it was to maintain and extend, so we wanted to rebuild it.

So we were already familiar with the functionality we needed, which solves half the problem right there. We were working really closely with the business analyst, and we covered everything that we were building in unit tests and added integration tests on top of that.

Just to give you an idea: we were building the workflow system, which was in our scope, which is why it's blue [referring to slides]. It was talking to our other system, also blue, but then talking to a bunch of other systems that were not ours. Different teams, sometimes in different departments, are in charge of those. So that's why we can't spin up a complete end-to-end test: we're not even in control of all of the components.

End-to-end tests are already notoriously difficult. You have to have a bunch of systems all magically in the same state, especially if you're not in control of all of those - that's always going to go wrong - and then we have to collaborate with different teams and different people to get them all in the same state. And, if we find something, especially if it's different teams and different departments, or even different companies, it's always going to be the Shaggy defence: "It wasn't me." And that is even harder to automate and maintain.

If I'm running my end-to-end test manually, I can sort of herd my test case along the different systems and shove it, or make it go the right way, but if I'm automating it, it can't do that. Automating that and then maintaining it is something I don’t wish on my worst enemy.

So what we did instead was pair-testing. We had one of the developers who had built a lot of the things involved, and myself, because I had previously manually regression-tested the old system, so I knew exactly how the flows were supposed to work. Between us we had the knowledge of: how does it work now, technically? Where do I add the data? What is it supposed to do? And because we were doing it together, it was a lot less error-prone - you know, two sets of eyes. By myself I might forget something, because I get bored doing manual regression testing. She was right there, so, if we found a bug, she knew exactly what had happened. I didn't have to redo it to check whether maybe I'd done something wrong during the test. There was no need to go over "here's what I did, it is actually a bug", [having to] convince someone it's actually a bug. She knew exactly what was wrong. She could quickly fix it and we could move on.

The moral of the story is test your code, and that's a little duck on the top of the shirt [referring to the slides], because it's not safe for work. 

And finally, just this morning, on my way here, I was trying to register for a mailing list. Through the flow, you enter your email address, press "go", and I get a form where not all of the fields have been filled out yet, right, because I've only just got to the page - and there's already an error on the page, because not all the mandatory fields have been filled in. Someone forgot to fire up that application and test it. And that's my rant. You should fire up your application and bloody well test it. Thank you. [Applause]. Any takers?

FLOOR:  I agree. The number of times I see things come through without having hit a browser end to end yet is ridiculous. But at the same time, one thing I've noticed about all conversations on manual and exploratory testing is that the focus is always on what you do before you deploy your application. It is based 100 per cent on the people in the room and what they know and understand about what should happen, and they're basically trying to prove themselves correct, no matter how good the exploratory testing and the hunt for edge cases may be. So what I'd say is: let's stop talking only about the things we do before we deploy the application, and let's try to understand what happens when the thousands, millions of people start touching your application, and figure out what is going on. So, yes to all of this, and: let's move past just automation and exploratory testing pre-deployment, and look at what is happening after.

MARIT: Excellent, agreed.

FLOOR:  I used to work in a team where we were doing Java development, and one thing I hate about JUnit is that you can't mark bits of code as not really testable, so, if you're going for 100 per cent code coverage, you can't get it.

MARIT: Code coverage is a whole other area.

FLOOR:  We had one person in the team who aimed to get 100 per cent code coverage for everything, including cases like the continental United States failing to exist, or your Java install being broken. Simple changes would take three weeks because of all the tests that needed updating, but it was 100 per cent unit-test covered.

MARIT: Hmm-mm.

FLOOR:  And it didn't run.

MARIT: I think you're proving my point. Exactly!

FLOOR:  I 100 percent agree.
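As an aside in code - a hypothetical sketch, not from that team's codebase - a test can exercise every line and report 100 per cent coverage while asserting nothing about behaviour, so the number on its own proves very little:

    // Hypothetical example: this test executes every line of applyDiscount(),
    // so a coverage tool reports the method as fully covered, yet the obvious
    // bug (adding instead of subtracting) still passes.
    import org.junit.jupiter.api.Test;

    class PriceCalculator {
        int applyDiscount(int price, int discount) {
            return price + discount; // bug: should be price - discount
        }
    }

    class PriceCalculatorTest {
        @Test
        void coversEveryLineButChecksNothing() {
            new PriceCalculator().applyDiscount(100, 10); // no assertion at all
        }
    }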

MARIT: Do you actually fire up your application and then manually test it?

FLOOR:  I do now, yes.

MARIT: Excellent, [we’re] learning.

FLOOR:  I agree with that. Yes, there's a model that was explained to me for how to think of automated versus manual testing, and I found it useful. The model is around how they both go together. I'm not saying it is a perfect model, but the idea is that you think of it like a map you're exploring, and you need manual testing to go and explore the bits of the map you haven't been to before. The automated tests live in the bit of the map you know about, the bit that is not covered by the fog of war. Your QA guys -

MARIT: Or girls -

FLOOR:  When they find something that is interesting, they can write an automated test, and that lives there in your test suite, flagged on that map.

MARIT: There are lots of useful articles by Angie Jones, who is very experienced in the test automation sphere, on how test automation and exploratory testing are completely different skills. I don't do exploratory testing well, but I still realise that I should build my application, fire it up, and see how it goes - which sometimes we have to do anyway, because, speaking of front-ends, we also have an application in Vaadin. That's fun, because it has no unit tests. When I build stuff on a screen, I have to fire it up, build the little thingy, bounce the server, and check that I did the thingy right - so that's lots and lots of fun. So I prefer having unit tests to not having any unit tests, but you still have to fire it up and see how it goes.

FLOOR:  So I'm working in an environment at the moment where there are separate testers who write separate regression and acceptance tests, in different environments from the ones the developers are working in, which is fine -

MARIT: And I'm so sad.

FLOOR:  That gives you a different level of confidence, but I see very, very often that that level of confidence translates into a handing-off of responsibility, and this is the "it works on my machine and it's passing the integration tests, or the unit tests", but it doesn't implement the feature. My own practice, when I'm building and designing stuff, and when I'm developing stuff on other systems, is to run the software, you know? As a developer, it is my responsibility. Even if I wrote integration tests and unit tests, I will find a way of running what I'm doing and seeing: does it work in the way that I thought it would?

MARIT: Yes.

FLOOR:  This kind of talks to something that concerns me about how we've come to understand development, and Agile in particular, in terms of the sort of backlog fixation and story fixation. As developers, we are sitting at a sort of trough feed of stories. You take a story, you implement it, and then you look for the next one. There's a lot of context missing there, and, from my point of view, one of the ways you get that context is to run the stuff you are writing as a developer - so this isn't about exploratory testing; it's about being a developer.

MARIT: Running what you're building - because how else are you going to build it well if you don't know how it works?

FLOOR:  Exactly.

MARIT: The testers and the developers [in different departments] - that makes me sad.

FLOOR:  All very close!

MARIT: Sorry, good!

FLOOR:  I'm more on the obs-y side of things. We have great people who work on and help write automated tests and stuff, and one of the things I find kind of interesting is that your code can go through the automated tests, and unit tests, and integration tests, and get to production, and things work - and then something will break. Your application is fine, but the fact that it's running in production causes it to break, whereas in an acceptance environment it would work. It could be because the database is misconfigured, or because of the load from the users, and every time you try to account for these things in your pre-production environments, it just seems to take huge amounts more effort to test and check. The more you look at wanting to run these tests in production, the more there's this huge overlap between the automated checks that verify your application is working before you deploy it, and the checks that make sure your site is currently working - because that's the thing that you really care about at the end of the day. It feels like sometimes people still look at these as very separate spheres, when there seems to be more overlap between everything.

MARIT: I think that's a point that [someone else] was making earlier.

FLOOR:  Yes.

FLOOR: I'm back again but not going to really rant this time.

MARIT: Ahh!

FLOOR:  This is my second favourite, or equally favourite, subject today: testing. I have immersed myself in the community of testers for a couple of years now, because there is sometimes a problem with teams where, you know, they're just throwing the tests over the wall to the testers, and we're creating mini waterfall-y things. I'm an imposter in this community, but it interested me so much that I got into the idea of whole-team testing, and I was really inspired by Elisabeth Hendrickson, who said - I'm going to paraphrase - that if the whole team takes responsibility for the testing of a product, wonderful things can happen. This is one of the things I've been doing, going around to conferences starting conversations, because it is not me that is the expert, it is our testers, and we need to involve them in everything, because they're the experts. And exploratory testing - I read the book. Boy, it's not easy. And that's it. You know about whole-team testing?

MARIT: Yes, I read a book by Lisa Crispin and Janet Gregory about whole-team quality. I met Lisa at conferences and did workshops with her. I was made to read Agile Testing when the company I was working for before did an Agile transition - I should say "Agile transition"! That's where I started doing test automation, and I learned a lot... also about how not to do it! Because we were automating really, really long Excel scenarios - flaky as hell.

FLOOR:  First, to rant a little bit: I hate the words "testing", "testers", "developers". We have teams of engineers with separate skill sets, and I would love everybody to try adopting that. We need to go back and understand what testing really means. I worked closely with Dan North, who said that testing is about gathering evidence and giving confidence to our stakeholders, and that's what we need to think about. I've been in the industry for 12 years, and all of that time I've done quality assurance. I only choose to work on products which I used myself before I joined the company. That's where your interest comes in. You actually use it, and you think: if I'm using it, would I like to see this on the interface, or in the app, and so on? That's how you need to think about exploratory testing more.

FLOOR:  Quick question: how do you feel about continuous delivery, and how does exploratory testing fit into a continuous delivery environment?

MARIT: So I work in a company that's actually quite agile. We have autonomous teams and we can deploy to production whenever we like. We do usually deploy when a story or an epic is done, for whatever definition of "done". I was hired on the team as a tester originally. I'm now moving to software engineering, because I'm not an exploratory tester, so we don't do enough of that, really. I think that we should do it more, but we have only software developers on the team, and they value technical things over other things. Sorry!

FLOOR: That question about continuous delivery is an interesting one, because I think in that scenario you end up doing that sort of testing in the live environment - whether it's you that is doing the testing or your customers. The touchstone of quality at that point becomes how quickly you can respond to something being found that is wrong, and that can be environmental, or to do with the integration, or the interactions between separate systems, or it could be a code thing in a particular component - but it's the speed and the ability to roll back or roll forward when these things are discovered that is key.

MARIT: It's more about the mean time to recovery than the mean time to failure.

FLOOR:  Hello. So, in the team that I work in, we - the developers - do lots of test automation, and QA do manual and exploratory testing. What we find is that QA is now about understanding quality from the perspective of the user - not "have we just got an inconsistent piece of software that no longer works?" but "does it meet the needs of the user?" That is more closely aligned to the UX side of the team -

MARIT: Or usability.

FLOOR:  But they don't really sort of work that way. Do you have any tips on getting UX and the things that they do, and QA, and the things that they do joined up?

MARIT: No, but that's a great question, because I think we have the same problem, where we have a UX department that's separate from the teams actually doing the things - and I cannot design to save my life, so having UX translate a bunch of data into a clear picture is amazing to me, it is magic - but at the same time, because they're so far away, they don't necessarily understand our domain, and don't necessarily understand the data that we are trying to create a pretty picture of.

So I know that there are teams that are pulling in the UX knowledge, and I think that that is important. I also have friends who are testers on other teams who say we should look at the UX design way earlier, just to understand: does it do the thing that it's supposed to do, rather than is it technically correct? And I do try to do that when we are discussing our stories in our refinement sessions: to think about how it will be used. Because a story is usually written quite abstractly, so that it captures everything - but by doing that, it doesn't really capture anything. So you go through it and think: okay, if this happens, then it should do this. Okay, what if that doesn't happen? It's often not there. Thinking about how I would even test this, before a line of code is written, really, really helps to clarify whether you understand what it's supposed to do, and then that should help you build it.

FLOOR:  As an American, "for fuck's sake" doesn't come out as naturally as I would like! [Laughter]. But if I could ever have a "for fuck's sake": if you have a separate team writing any version of automated tests, for fuck's sake, don't do it. It's not fine.

MARIT: Yes! Yes, I agree. Thank you. So lunch now, right? [Applause].