Fixing Defects in Bulk
A practitioner’s defect grouping strategy for effort reduction
Problem
In my career, I have repeatedly inherited products with a huge defect backlog. The problem is that for every 3 to 4 defects fixed, the fixes themselves introduce one new defect, which adds to the defects the development process is already introducing.
To reduce the defect backlog, it is important to reduce the effort needed to fix these defects, so that the rate at which they are fixed is greater than the rate at which new defects are introduced.
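To make the arithmetic concrete, here is a minimal sketch in Python. It uses the ratio mentioned above (one new defect for every 3 to 4 fixes) and treats the defects introduced by other development work as a separate input. The function name, its parameters, and the sample numbers are my own illustrative assumptions, not measurements.

```python
def net_backlog_change(fixed: int, fixes_per_regression: float,
                       introduced_elsewhere: int = 0) -> float:
    """Change in backlog size over a period (negative means the backlog shrank).

    fixed: defects fixed in the period.
    fixes_per_regression: how many fixes it takes, on average, to introduce
        one new defect (3 to 4 in the situation described above).
    introduced_elsewhere: hypothetical count of defects introduced by other
        development work in the same period.
    """
    if fixes_per_regression <= 0:
        raise ValueError("fixes_per_regression must be positive")
    regressions = fixed / fixes_per_regression
    return introduced_elsewhere + regressions - fixed


# Hypothetical period: 12 fixes, 1 regression per 4 fixes, 10 other new defects.
print(net_backlog_change(12, 4, 10))  # 1.0  -> the backlog still grows
# Same period, but 16 fixes: the backlog starts to shrink.
print(net_backlog_change(16, 4, 10))  # -2.0
```

In this made-up scenario, going from 12 to 16 fixes per period is what turns a growing backlog into a shrinking one, which is why reducing the effort per fix matters.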
Objective
Increase productivity by fixing similar defects sequentially. This works because similarity lets us reuse the effort spent to:
- Learn about the related code, or;
- Design the solution (by reusing it directly or the concept behind it), or;
- Test similar behavior.
Understanding the approach
The reasoning is simple. Aside from applying the fix (actually changing the software), where does the effort go?
- Understanding the code so that the fault can be localized.
- Actually understanding what the fault is (how it causes the failure).
- Designing a solution.
- Testing (including regression tests) and code reviews.
It’s only logical that if we can reuse some of this effort across different defects, we can save time that can be turned into fixing a greater number of defects without increasing the effort proportionally.
Grouping criteria
Many different criteria can be adopted to define what “similar” means when comparing defects. Here is a non-exhaustive list:
- High overlap in testing. Saves time on testing.
- High probability of the same root cause. Saves time on fault localization.
- High probability that the same fix or approach can be applied. Saves time on designing the solution (which is done only once, not once per defect) and on implementing it. It also improves consistency in the software.
- Close fault locality, like being related to the same package or same few classes. Saves time on understanding the code logic so that the faults can be localized.
- Failure similarity. Saves time on testing, possibly on designing, or solution implementation, or even on fault localization.
Method
There are basically two ways of implementing this approach: grouping at ticket creation or grouping at backlog refinement.
Grouping at ticket creation
When a defect ticket is created, it is immediately assigned to a group of similar defects. If you already have a large backlog, this approach will require the team (or some individuals) to go over the defect backlog and create the groups first. Grouping at backlog refinement may be a better option in this case.
Grouping at backlog refinement
When there is an ungrouped backlog, during backlog refinement sessions, the team can perform the grouping searches and create the groups following the priority of the tickets in the backlog.
How to group
Searches for existing open tickets can be done using keywords to find similar tickets by failure similarity, the possibility of the same fix working for the new ticket, or possibly the same root cause. Searches by component or feature can help identify similarity by testing overlap or fault locality.
The searches should focus on open tickets, but they can include closed tickets to help identify recurring issues or solutions that can be adapted or reused to fix an existing defect. When looking for closed issues, the team may even get insights into areas that are more prone to defects and that may require extra attention.
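As a rough illustration of these searches, here is a small Python sketch. It is not tied to any real ticketing system: the `Ticket` fields, the stop-word list, the Jaccard keyword overlap, and the 0.3 threshold are all assumptions I picked for the example. In practice you would likely use your tracker's own search.

```python
import re
from dataclasses import dataclass

# Hypothetical stop words; tune for your own ticket vocabulary.
STOP_WORDS = {"the", "a", "an", "in", "on", "of", "to", "when", "is", "and"}


@dataclass(frozen=True)
class Ticket:  # hypothetical ticket shape
    key: str
    summary: str
    component: str
    is_open: bool = True


def keywords(text: str) -> set[str]:
    """Lowercase words in the text, minus stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    """Keyword overlap: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_by_keywords(new: Ticket, tickets: list[Ticket],
                     threshold: float = 0.3, include_closed: bool = False) -> list[Ticket]:
    """Candidates for failure, fix, or root-cause similarity, best match first."""
    new_kw = keywords(new.summary)
    scored = []
    for t in tickets:
        if t.key == new.key or (not t.is_open and not include_closed):
            continue
        score = jaccard(new_kw, keywords(t.summary))
        if score >= threshold:
            scored.append((score, t))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for _, t in scored]


def find_by_component(new: Ticket, tickets: list[Ticket]) -> list[Ticket]:
    """Candidates for testing overlap or fault locality: open tickets, same component."""
    return [t for t in tickets
            if t.is_open and t.key != new.key and t.component == new.component]


backlog = [
    Ticket("D-1", "Crash when saving report without title", "reports"),
    Ticket("D-2", "Login button misaligned on mobile", "auth"),
    Ticket("D-3", "Report export is slow", "reports"),
    Ticket("D-4", "Crash saving report with empty title field", "reports", is_open=False),
]
new = Ticket("D-10", "Crash when saving report with empty title", "reports")

print([t.key for t in find_by_keywords(new, backlog)])                       # ['D-1']
print([t.key for t in find_by_keywords(new, backlog, include_closed=True)])  # ['D-4', 'D-1']
print([t.key for t in find_by_component(new, backlog)])                      # ['D-1', 'D-3']
```

The results are only candidates. Someone still has to decide whether a match really belongs in the same group and under which criterion.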
Tackling a group
After the groups are created, each group should receive the priority of its highest priority item.
- When someone is assigned to address the highest priority defect that belongs to a group, it is expected that the person will examine all the other defects in the group as well.
- The sequence of items in a group should follow the items’ individual priorities.
- The criteria for each group should be explicit (fault locality, testing overlap, etc.).
- During the analysis, the assignee may find that an item is not really similar to the highest priority item in the group (the criteria are heuristics and won’t always be correct). The similarity comparison is always made against the highest priority item. If an item turns out not to be similar:
- The item can be excluded from the group becoming an individual item or being assigned to a different group.
- The results of the analysis of an item that was excluded from a group should be added to the item so that when it is picked up again, the information is not lost.
- The item goes back to the backlog.
- The analysis continues with the next highest priority item in the group.
These steps ensure that the highest priority item is addressed and only the ones that would benefit from that effort will be tackled. Lower priority items that are unrelated and grouped by mistake will go back to the backlog to be addressed later.
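The steps above can be sketched in Python as follows. This is only an illustration under my own assumptions: the `Defect` and `DefectGroup` types are hypothetical, a lower number means a higher priority, and the similarity check is passed in as a function because in reality it is the assignee's judgment. For simplicity, excluded items always go back to the backlog rather than to another group.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Defect:  # hypothetical defect record
    key: str
    priority: int  # assumption: lower number = higher priority
    notes: list[str] = field(default_factory=list)


@dataclass
class DefectGroup:  # hypothetical group with an explicit criterion
    criterion: str  # e.g. "fault locality" or "testing overlap"
    items: list[Defect]

    @property
    def priority(self) -> int:
        """A group takes the priority of its highest priority item."""
        return min(d.priority for d in self.items)

    def ordered(self) -> list[Defect]:
        """Items in order of their individual priority."""
        return sorted(self.items, key=lambda d: d.priority)


def tackle_group(group: DefectGroup,
                 is_similar: Callable[[Defect, Defect], bool],
                 backlog: list[Defect]) -> list[Defect]:
    """Return the defects to fix together; send the rest back to the backlog."""
    if not group.items:
        raise ValueError("cannot tackle an empty group")
    head, *rest = group.ordered()
    to_fix = [head]  # the highest priority item is always addressed
    for item in rest:
        # The comparison is always against the highest priority item.
        if is_similar(head, item):
            to_fix.append(item)
        else:
            # Keep the analysis with the item so it is not lost.
            item.notes.append(
                f"Analyzed with {head.key} ({group.criterion}): not similar")
            backlog.append(item)
    group.items = to_fix
    return to_fix


group = DefectGroup("fault locality",
                    [Defect("D-3", 3), Defect("D-1", 1), Defect("D-2", 2)])
backlog: list[Defect] = []
# Stand-in for the assignee's judgment: pretend D-2 turned out to be unrelated.
to_fix = tackle_group(group, lambda head, item: item.key != "D-2", backlog)
print([d.key for d in to_fix])   # ['D-1', 'D-3']
print([d.key for d in backlog])  # ['D-2']
```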
Faulty software = faulty process
If you arrive home to find water on the floor because the faucet was left open, you don’t reach for the towel first; you close the faucet.
If you have a development process that is introducing many defects, that’s what you need to prioritize. You can work on the defects in parallel, but you must address the process.
Take it with a grain of salt!
These tips were compiled based on my experience managing teams that were struggling with an extensive backlog of defects. This approach hasn’t been extensively tested, although I have already employed it successfully in a few scenarios with different teams. If you have a different approach or insights into this problem, sound off in the comments!
If you like this post, please share it (you can use the buttons at the end of this post). It will help me a lot and keep me motivated to write more. Also, subscribe to get notified of new posts when they come out.