We are living in ‘survival of the fastest’ era. We don’t have time for anything. We prefer reading blogs instead of books and we look for tweets rather than lengthy press releases. So when it comes to testing a release that has only a few changes, we don’t have time to run all the tests.
The question but is: which subset of tests we should be running?
I have touched this subject in Test small vs. all, but looking at build change logs and picking up tests to run is a task that requires decision making. What if we can know the changes automatically and run tests based upon that?
That is possible through TIAMaps. No this term is not mine but part of it is. It originates from Microsoft’s concept of ‘Test Impact Analysis’ which I got to know from Martin Fowler’s this blog post. I’d recommend to read it first.
If you are lazier than me and couldn’t finish the whole blog, below is a summary along with a picture copied from there:
First you determine which pieces of your source code are touched upon by your tests and you store this information is some sort of maps. Then when your source code changes, you get the tests to run from the map and then just run those tests.
Below is a summary of TIAMap implementation in our project.
Why we needed it:
We didn’t do it for fun or due to “let’s do something shiny and new”. We are running out of time. Our unit tests suite has around six thousand tests and a complete run (yes, they run in parallel) takes about 20 minutes. Hmmm… a little change that needs to go has to go through 20 minutes of Unit test execution, that’s bad. Let’s see what others are doing. Oh yeah, Test Impact Analysis is the solution.
Generating TIA Maps
Code coverage comes to the rescue. If we already have a tool that finds out which lines of code are touched by all tests, can’t we have a list of source files that are touched by a single test?
So we configured a job that would run for tests and saves this simple map: test name -> source file names. There were two lessons that we learned:
- Initially, we had a job that would run for all 6,000 thousands and it was taking days. We became smarter and after generating first TIA Map for all tests, we only update maps for the tests that changed. We don’t have a way to find the test names that changed, but our job is based upon timestamp of files that have test code.
- We were storing the Map in a SQLite Db. As the Db had to pushed to our repository again and again, it was difficult to find deltas of change. We switched to simple text file to store the Map. Changes can be seen in our source control tools and anyone can look at those text files for any inspections.
As you can imagine that the hard part is to get those TIAMaps. Once we have them, we now do the following:
- When there is a need to run tests, we determine which source files have changed since the last run.
- We have a Python script that does the magic of consulting the maps and getting a list of tests to be executed.
- We feed that list of tests to our existing test execution program.
How is it going?
It is early to say that as we have rolled this as pilot and I may have more insights into the results in few months. But the initial feedback is indicative of us being on the right path. Time is being saved big time and we are looking for any issues that may arise due to faulty maps or execution logic.
Have you ever tried anything similar? Or would you like to try it out?
The only way to improve yourself and your craft is to reflect on the sate of affairs. The #StateOfTesting Survey gives us exactly that opportunity where testers from across the globe give their valuable feedback and then gain value from this collective wisdom.
I have been taking part in the survey for sometime and I see that not much testers from Pakistan are doing that. Given that our community is on the rise and we had our first ever testing conference, it’s time to get in touch with testing community of the world!
The link to the survey is: http://qablog.practitest.com/state-of-testing/
Thanks for the help and let’s make (testing) world better than today!
Code Coverage is a good indicator of how well your tests are covering your code base. It is done through tools through which you can run your existing set of tests and then get reports on code coverage with file level, statement or line level, class and function level and branch level details.
Since the code coverage report gives you a number, the game of numbers kicks in just like any other number game. If you set hard targets, people would like to get it, and at times a number means nothing. Here are my opinions based upon experience on how to best use Code Coverage in your project:
Do run code coverage every now and then to guide new unit test development
It’s worth running coverage tools every so often and looking at these bits of untested code. Do they worry you that they aren’t being tested?
The question is not to get the list of untested code base; the question is whether we should write tests for that untested code base?
In our project, we measure code coverage for functions and spit out list of functions that are not tested. The testing team then do not order the developers to write tests against them. The testing team simply suggests to write tests and the owner of that code prioritize that task based upon how critical is that piece and how often that is requested by the Users.
Dorothy Graham suggests in this excellent talk that coverage can be either like “butter” or like “strawberry jam” on your bread. You decide if you want “butter’ like coverage i.e. cover all areas or you want “strawberry jam” coverage i.e. cover some areas more in depth.
Do not set a target of 100% code coverage
Setting up a coverage goal is in itself disputed and is often misused as Brian Marick notes in this paper which has been foundation of any Code Coverage work thereafter. Also anything that claims 100% is suspicious e.g. consider following statements:
- We can’t ship unless we have 100% code coverage
- We want 100% reported defects to be addressed in this release
- We want 100% tests to be executed in each build.
You can easily see that a 100% code coverage gives you “Test all Fallacy” to imply that we can test it all. Brian suggests in the same paper that 90 or 95% coverage is good enough.
We have set a target of 90% function coverage but it is not mandatory for release. We provide this information on the table along with other testing data like results of tests, occurrence of bugs per area etc. and then leave the decision to ship on the person who is responsible. Remember, the job of testing is to provide information not make release decisions.
Yes, there is no simple answer to how much code coverage we need. Read this for amusement and know why we get different answers to this question.
Do some analysis on the code coverage numbers
As numbers can mean different things to different people, so we need to ask stakeholders why they need code coverage numbers and what they mean when they want to be covered.
We asked this question, got the answer which is to do a test heat analysis on our code coverage numbers. It gives us following information:
- Which pieces are hard to be automated? Or easy to be automated
- Which pieces are to be tested next? (as stated in first Do)
- Which pieces need more manual testing?
- How much effort is needed for Unit testing?
Do use tools
There are language and technology specific tools. For our C++ API, we have successfully used Coverage Validator (licensed but very rightly priced) and OpenCppCoverage (free tool) that extract info by executing GoogleTest tests.
Do not assume coverage as tested well
You can easily write a test to cover each function or each statement, without testing it well or even without testing it at all.
Along with our function wise code coverage that I mentioned above, we have a strong code review policy which includes reviewing the test code. Also we write many scenario level tests that do not add to the coverage but cover the workflows (or the orders in which functions will be called) which are more important to our users.
Brian summarizes it nicely in the before mentioned paper:
I wouldn’t have written four coverage tools if I didn’t think they’re helpful. But they’re only helpful if they’re used to enhance thought, not replace it.
How you have used Code Coverage in your projects? What Dos and Don’ts you’d like to share?
It happens to all of us. We are used to doing a process in a particular way for years and never think of other ways of doing it. Then someday someone says something. That serves as an eye opener and we start seeing other ways of doing the same thing.
This happened to us for our rule: “Break a build if a single unit test fails”
Sounds very simple and rational. We have this rule for may be 10 years and I have repeated this over and over in all new projects that we took up in these years. Our structure follows the Plan A as shown below i.e. to run tests as part of the build process and if a single test fails, build fails.
What changed in year 2017 was a quest to find ways to release faster, you know the DevOps like stuff. So we started to look at the reasons for the build to be broken. We build our source 4-6 times a day, so we had enough data to look into. One of the reasons was always a failing test.
Now we thought, as you must be thinking by now, that it is a good thing. We should not ship a build for which a given test breaks. But our data (and wisdom) suggested that failing tests were in following three categories:
- The underlying code changed but test code was not changed to reflect this. Thus test fails but code doesn’t fail.
- The test is flaky (expect a full blog post on what flaky tests are and what we are doing about them). For now, flaky test is that passes and fails with same code on different occasions.
- The test genuinely fails i.e. the feature is indeed broken.
Now 1 and 2 are important and Developer who wrote the test need to pay attention. But does this stop the build to be used? Of course not.
3 is a serious issue but with the notions of ‘Testing in Production’ combined with the fact that fix is just few hours away, we figured out a new rule as shown in Plan B.
Yes, when a build fails due to a failing test, we actually report bugs for failing tests and declare the build as Pass. Thus the wheel keeps rolling and if it needs to be stopped or rolled back, it can be.
A few weeks into this strategy while all looked good, it happened what was always feared to be happening. A build in which 20% of our tests were failing was declared pass and our bug tracker saw around 100 new bugs that night. Let’s call that spamming.
That raised our understanding to move from binary (fail or pass) to a bit fuzzy implementation. We came with a new rule that if 10 or more tests (where 10 is an arbitrary number) fail, we’ll declare the build as failed. Otherwise we’ll follow the same path. So this is now our current plan called Plan C.
I know what you are thinking that a single important test could be much important than 10 relatively unimportant tests. But we don’t have a way to weigh tests for now. When we have that, we can even improve our plans.
Does this sound familiar to your situation? What is your criteria of a passed build in context of test results?
The modern notion of software development requires to work in team. We work as individual contributors, but it’s the team that delivers the final outcome. From development through production, it is a team effort that enables quality at speed.
I look around and most teams are not as productive as they might have been. To individuals that I talk always tell that they are putting their best, but somehow the net result is not what they want. Those who have such feelings include Project Mangers, Scrum Masters, Development Managers, Testing Managers, Developers, Testers and the list goes on.
I think I have a fix to suggest. In fact, a very simple one and that is “Show Respect”.
You might be thinking, oh now we’ll get some sermons on the old philosophies of respecting people and we are in 21st century. But humans are humans, they only work at their best when they are respected.
Google did a famous study just couple of years ago which was summarized in this beautiful (though lengthy) article in New York Times. It suggests:
In the best teams, members listen to each other and show sensitivity to feelings and needs
More details of the study are here where “Psychological Safety” is defined.
Now you might think or claim that you already do that as a leader or team member. Your organization may have “Respect at the workplace” as one of business values. But how do you know if you are practicing what you preach? I’m suggesting following 3 tests for you to get answers to questions like these: “Am I respectful to my team members?”, “Is my manager respectful to all?”, “Who is not showing respect in the team?”
Respect the presence
I learned this from my grandfather when I was about 8 or 10 years old that whenever anyone in the family visited him, he greeted them by standing from his seat. He was in his eighties at that time but he’d stand up for his 2 year grandson or grand-daughter.
So here is the test: when someone approaches you at work, how do you respect them? Do you stand up to greet them? Do you offer that person the time one is looking for? Do you respond as if you want them to go away from you?
(the picture is taken from: https://www.practicaletiquette.com/how-to-show-respect.html )
Respect the opinion
When two people have the same opinion, one of them is redundant
So here is the test: when someone offers you opinion at work, how do you respect them? Do you always want people to offer opinions which fit fine in your frame of things? Do you listen to any opinion coming from any member of the team? Do you follow the advice given in the opinion?
Respect the feelings
Respecting someone’s presence of opinion gets you at a position where you start respecting people’s feelings. Just like we have different skin tones, our reactions to same incident can be very different. Respect that difference and try to understand that not everyone thinks the same way about any thing in this world.
So here is the test: when someone feels differently than your thinking, how do you respect them? Do you empathize with them to understand more? Do you give space to be able to share their feelings? (what we call Psychological Safety in Google study above)
Before I go, let me tell you that I’ve personally seen this respect trick to be working. In teams where everyone was respectful, team members were more influential and they cooperated with their best efforts to do wonders.
How has been your experience? Do you also believe that Respect is the root of team productivity? You can have a different opinion and I respect that.
If there were a time to test the passion of Software Testing community in Pakistan, it was last Saturday when resilient members attended the 6th Islamabad Testers Meetup in big number despite all the traffic challenges. The event hosted by MTBC at their campus in collaboration with PSTB saw professionals of various backgrounds who discussed and learned lot of new ideas on the day.
The event was a much awaited one as after the successful PSQC’17, activities restarted to gear towards PSQC’18. Umer Saleem welcomed everyone in the morning as Host of the day, introduced the event agenda and handed over the stage for the first talk.
Kalim Ahmed Riaz, Senior SQA Engineer from Global Share presented his thoughts on “Ways to Improve Software Quality”. Kalim who maintains his own blog on testing, quoted several everyday examples to explain his views including buying vegetables and the knowledge needed to define requirements. He then suggested multiple ways to cure common Quality diseases including “treating Testers as Clients”. By which he meant that not only Testers need to step up to think and act like real Clients but also the management need to acknowledge issues reported by Testing team to be real one and not postponing them sighting they are not real users. His slides are here: IsbTestersMeetup_Ways_improve_Quality_Kalim
The next presentation was well handled by a Story-teller Farrukh Sheikh who is Lead QA Engineer at Ciklum. Farrukh shared his thoughts on “Adaptation” skill which he believes is a must for all testers. He emphasized everyone to get out of their “deceptive illusion” where they believe themselves as strong and well-built testers and rather learn the new skills of the trade. He suggested to read a lot, practice new methodologies, learn coding as few ways to build muscles for tackling the challenges of tomorrow. Full slide deck: IsbTestersMeetup_Adaptation_Farrukh
(more photos are here)
After the talks, we did an experiment of “Open House Discussion” which was very well received. I was handed over the task to moderate it around the selected topic (based upon registered participant’s feedback) “Challenges in Implementing Test automation”. The format was that anyone from audience would share a challenge he or she is facing, and then everyone in the audience was free to give input on how to address that challenge. The discussion was around following challenges shared by members present there:
- Not finding time to do automation
- How to stabilize automation when everything is changing?
- Automation falling at configurations or keeping them up-to-date
- What’s next if first round automation is done
There were many suggestions offered from experienced and uniquely qualified testers in the hall which ranged from “talking to management to buy time”, “involving Programmers in the team to be part of the automation project”, “including automation tasks in Sprint planning”, “keeping an open eye on the changes within and outside your organization”, “being selective in what to automate” and so on. The discussion was of much worth and I plan to write a full blog post on these topics. It was heart-warming to see that we as a community are now moving forward to discuss such challenges and hopefully next time we will have solved these problems and ready for next set of problems.
Dr. Zohaib Iqbal, President PSTB was then invited to give updates from PSTB. He shared the lessons learned so far in holding such events and highlighted various programs including Certifications around CTFL, ISTBQ partner program and Accredited training program. He invited all to be part of upcoming PSQC’18 to be happening at Lahore in Spring next year.
The stage was then handed over to Adeel Sarwar, CTO MTBC for a closing note. Adeel briefly talked about MTBC vision including their wonderful initiatives to incorporate fresh college graduates into their workforce and how this model has worked really well for them. He raised his concern of people in IT not reading enough which reminded me of my campaign of “Certified Book Reading Tester”. He made it clear to everyone that we need to raise the bar to improve quality of Software Quality professionals. He thanked the audience and shared some ideas on future collaborations in which MTBC can be a part.
Shields were presented to speakers and certificates were given to the wonderful MTBC organization team who were well applauded by round of applauses. We then moved to outside for an open air tea with snacks setting. The food was as delicious as were the discussions and lot of new ideas were shared by participants. New friends were made and old friendships were revived. The conversations were never ending but with every beautiful meeting, it ended too soon for all of us.
Thanks again to the hosts, PSTB support and above all the “cheetah” testers who always respond to such events. Together, let’s make Pakistan testing community even stronger!
The very idea that quality can be added later is flawed and quality must be baked in.
Speaking of baking, it is not possible for the inspector who tastes a finished cake to make it any better. That person would just tell you if the cake is good or bad. And if it’s bad, there is nothing you can do about it. But if you want the cake to be always good, you will take good consideration on the ingredients that will go in, you’ll consider hiring a professional baker to do the job, you’d like to inspect it in intervals like how the dough is or opening the oven to see if cake is in good shape.
Similarly, you cannot have Testers jump late into the development cycle and improve the quality. A Tester can only tell you if the Software is good or bad. And if it’s bad, you’ll have to go many cycles of correction before it goes reasonably good. So, if you want your Software to be in good shape, you’ll take care of the technology and tools you use, you’ll consider training or hiring professionals to do the job right, you’d like to review the code time to time to see if it has quality baked into it.
Catching logical errors:
All of us know that humans make errors but all of us also believe that those humans don’t include us. It might be true that some people make more mistakes than others but no single person is error free. So as a matter of fact, when you show your code to a peer to review it, or you explain it yourself to a peer through a walkthrough, you start catching errors. Some of these are obvious logical errors which got slipped as you wrote that code. Don’t feel ashamed as this happens and we should be happy that this error was caught much early in the cycle.
Catching Performance issues:
Writing algorithms that solve complex problems is not an easy job and at times in an attempt to do that, you can take a path that is lengthy. Well, lengthy in terms of how many CPU cycles you’ll need to complete them which can result in slow performance. There are also cases where unnecessary data structures are involved which cause more memory to be consumed than the minimum required one. Your peers can help you find such errors in the code. You can also have some expert on this subject in your team to do this type of review for all the code that is being written.
In one of the teams in which I worked, we had a person (let’s call him Ahmed) who was our performance expert. All team members when confronted with the task to write best performant code would consult Ahmed after writing the code. And if Ahmed was happy about it, it was always a quality code.
Memory leaks and Security considerations:
Most developers are capable of writing code which is free of logical errors and is usually well performing, but there are other things hidden like the bugs who only come out when there is dark and no one is around. Yes, I’m talking about writing secure code.
These aspects of code can again be reviewed by Developers who are experts in this domain. The good thing is that some of this expertise is now being passed on to tools. For example, you can use built-in tools in your IDE or can get many free tools that can help you find memory leaks in your code.
Similarly, if your code is good in terms of all requirements but is not safe and is vulnerable to various type of attacks, we cannot call it Quality code. Before some hacker can detect those vulnerabilities, how wonderful it would be knowing these so early in your development lifecycle.
Knowledge of the tribe:
The collective knowledge of your development team is what matters more than the individual member of your team. In fact, the sum is always greater than the actual sum. Let me explain this through an example.
Let’s assume that you three members in your development team named Ali, Sara and Jamshed. K represents the knowledge possessed and the brackets are meant so as who possess that knowledge. Given the above, the below is always true:
K (all team) > K (Ali) + K (Sara) + K(Jamshed)
So, the real job of management is to create an environment in which knowledge of one person is passed on in a way that it becomes a knowledge of the whole tribe. Code Reviews play a big role in achieving this.
I hope that by now you are convinced that you need to do Code Reviews. So maybe I can help you further on the how part of it where you can ask these questions:
How often should Code Review happen?
There are many school of thoughts on this. Some believe that all commits should be reviewed such as if it is not reviewed, it will not be in the production code. Others think that achieving this “all code must be reviewed” policy takes lot of effort and you should be selective in the process.
If you pick the first path, there are tools that can help you such that no code goes unreviewed. And if you like the second approach more, you’ll have to decide what is the selection criteria and who makes the decision if a particular piece of code ought to be reviewed.
Who should do the Code Review?
Just like the above, there are all and selective approach. Some teams like each member of the team to take the Reviewer role once in a while. In my opinion it helps to give this power to all rather than few. If only some team members review the code of other team members e.g. senior folks reviewing junior folks code, it will create friction in the team.
The exception is obviously special cases where you have Security and/or Performance experts in your team who’ll do more reviews than other members.
How long each review should take?
Though this largely depends on how big is the piece of code being reviewed but leaving this as an open time slot can both slow down the process along with the code author feeling being questioned for too long. You can set sort of “30 minutes or one hour rule” that each small code review should take no more than 30 minutes and each big review should take no more than an hour. These numbers are arbitrary and you have all the powers to set a limit of your own.
What the Reviewer should check?
Typically creating a checklist will help you which can have questions like the below:
- Are there any errors?
- Is code human readable?
- Is there good commenting?
- Are there tests written against this code?
- The coding guidelines have been followed
These questions vary from organization to organization and team to team. So, you should feel free to start with your own list and then grow this list as you see other cases.
All the best on your journey of making this world a better place by producing Quality Software.
Note: The above article was printed in Flash newsletter which is being reproduced for blog’s audience.