Removing the bubbles: solving bottlenecks in software product development
A challenge with software product development is visualising the work so that you can spot where there are delays in the process of converting ideas from “concept to cash”. This post shows how a cumulative flow diagram helped identify a pattern of queues over time. Removing these queues had many benefits such as fewer errors, increased team communication and improved team capacity.
Make the work visible
The first task is making the work visible. In knowledge work, such as software development, it is difficult to see the work being done, which is why a visualisation approach such as kanban can be so useful. Here’s a view of a kanban board from an earlier client team:
The kanban board is useful for a “moment in time” view, but it’s not possible to easily see patterns that might develop over time. Looking at the kanban board on a particular day doesn’t make it easy to answer questions like these:
- How long have these work items been waiting in this column (stage)?
- How long does it usually take for work items in this stage of the process to complete?”
- How often do we see queues in this step? How long do they last for?
- Are these queues a special event or do they happen regularly (touching on the difference between common and special cause I’ve mention in an earlier blog)
To find these answers and look more clearly for patterns over time we built a cumulative flow diagram (CFD, also called a ‘finger chart’) by counting the number of post-it notes in each stage (column) in the team’s process after each daily stand-up. Unlike my earlier post on using three forks and a hand-drawn chart to help a team improve in this case we used an Excel spread sheet.
Visualise the work over time to better understand queues (‘bubbles’)
The cumulative flow diagram for this team helped make visible that there were consistent queues of work in the functional testing and acceptance testing processes over time. These queues are visible as “bubbles” that develop in the cumulative flow diagram. See the highlighted in orange and red stages below (click the image for a larger version).
Do the detective work necessary to understand what causes the queues (‘bubbles’)
Around two-thirds of the way through the above chart (which covered about 36 weeks) we decided to focus on studying what was causing the queues to develop in functional and acceptance testing.
The functional testing involved someone other than the person who developed the functionality (user story) validating that it worked functionally (there were no obvious errors). Once functional testing was complete then the acceptance testing stage was performed by a business analyst or the product manager.
The team were releasing to production every second Wednesday. On the middle Wednesday the person who did the functional testing switched to doing the integration testing (ensuring the features which were created as a package to go to production worked individually and combined, as well as running a set of manual regression test scripts to make sure that the new functionality hadn’t had any impact on the rest of the website). During the week spent on Integration testing, no functional testing was done, which we believed was the cause of the queues or orange bubbles on the chart.
Creating a new policy to reduce the queues (‘bubbles’)
We sat down with the person who performed the Functional and Integration Testing and mapped out the schedule of their work across the fortnight between releases (see the hand-drawn diagram we came up with below).
We also mapped out a new “policy” that described what the person doing testing did for for the week spent integration testing:
While performing the Integration Testing in the week before the release, if there are any work items in the Functional Testing column, spend up to an hour each day doing them.
We experimented with the new policy for the last third of the cumulative flow chart. The cumulative flow diagram showed that the queue (bubble) in the Functional Testing (orange) step virtually disappeared, as did the queue in the Acceptance Testing (red) stage. The CFD not only highlighted the initial problem, but it also validated the experimental change we made in policy resulted in an improvement (it allowed us to answer the critical question – “did the change we made to our process result in an improvement?”)
It’s the system!
This example demonstrates how changing the way the work is structured can produce improvements without having to change the work that team members were doing. This example shows that the queues caused by the way the work was structured (e.g. the system we had designed) and not the work of the team members. It speaks to Deming’s ‘provocation’ that “95% of the variation [in how long the work takes] is due to the system and not the individuals”.
There were many benefits to the changes that we made above:
- Removing the queue in functional testing meant that if a problem was found then the developer got faster feedback. Getting feedback faster reduced the time it took a developer to “get their head back into the issue” and fix the problems. It also improved the communication between members of the team – the developers were more likely to speak to the person who did test at stand-up about the work that was coming because they knew it would be tested quickly, rather than potentially sitting in a queue waiting for a week.
- By reducing the bottleneck in Functional Testing also reduced the same bottleneck in Acceptance Testing.
- The reduced “thrashing” from having issues discovered close to the release date meant the team’s capacity to do work increased.
- As there were fewer queues it reduced the pressure on team members, helping them feel less rushed which improved the quality of life for the team, reduced “rushing” leading to better quality and team morale.
Hi, I’m Benjamin. I hope that you enjoyed the post. I’m a consultant and coach who helps IT teams and their managers create more effective business results. You can find out more about me and my services. Contact me for a conversation about your situation and how I could help.
How three forks on a hand-drawn chart helped a team improve
After visualising the workflow of a recent client’s software development process, and showing where the work was, the team realised there was a queue of tasks that had been developed and were waiting to be validated (‘tested’). The team decided to move from a “I just do my task” approach to a “whole team” approach where they focussed on making a single task flow across all of the development process with minimum delay. They decided that rather than continuing to do more development and hoping that the validation work would eventually complete, the people who did development would go and help those who were validating the work.
The problem was that the Development Manager (who believed his role was to maximise the number of development tasks completed) was concerned that if the development work stopped to reduce the queue in validation, then eventually the validation step might run out of work to do.
In order to address the Development Manager’s concern we went to the hand-drawn cumulative flow diagram (sometimes called a finger chart). The cumulative flow diagram was drawn each day, by the team member who ran the daily stand up, by simply counting the number of tasks in each stage of the workflow. The x-axis shows time in days, and the y-axis shows the number of tasks in each step of the workflow.
As we had a couple of weeks data it was possible to extrapolate the rate at which the validation work was being completed. We were also able to extrapolate the rate at which development work was finishing. In order to do this extrapolation we grabbed the nearest straight-edge items we could find, which turned out to be forks!
As the photo shows, the forks indicated that it would take about two weeks for the validation step to run out of work to do if most of the people doing development spent their time helping do the validation. This data allowed the Development Manager to feel more comfortable about allowing the strategy a shot for a week, after which it could be reviewed.
This example demonstrated the benefits of using quick, low-fidelity data collection methods. The discipline of hand-drawing the chart each day meant that it was better understood by the team members than if it had been automatically produced in an electronic tool. The team were also able to see how the data could be valuable, as it allowed them to discuss the Development Manager’s concerns with data rather than just opinion.
In the end, the strategy to work as a whole team and focus on the validation step turned out very well. The team were able to reduce the time it took to validate tasks, providing quicker feedback to development (which reduced defects), increase the amount of work they got done at the same time as improving the team’s morale.
Do you have stories of using quick, low-fidelity data collection techniques to improve team performance?