The importance of understanding variation or how to avoid treating all contractors as thieves
Here’s a story of how managers detected a problem but, by not understanding its cause or the type of variation it represented, applied the wrong type of solution, which made things worse for everyone:
Once upon a time in a large financial institution that had many thousands of people in their headquarters, a handful of hourly-paid contractors got their manager to sign their time sheets for times they did not work.
This was clearly a type of fraud and the police were called and the contractors went to jail.
The senior managers looked for a way of making sure it would Never Happen Again.
They came up with a cunning plan! Connect the time clocks in the security gates with the electronic time tracking system for all contractors (yes, even those on day rates).
A little while later, some of the contractors began to change their behaviour. They started to watch the clocks themselves and work only the weekly minimum of 40 hours. When they went out for a big lunch, they stayed out longer if they’d “done their time already this week”.
One clever team of contractors even worked out the rounding rules of the gate system: if they arrived by 9:14 in the morning it would round their time back to 9:00, effectively saving them from having to work 70 minutes each week. Some of them even set timers to go off around the end of the day so they didn’t stay a minute longer than they were being paid for!
This story highlights the importance of understanding the cause of a problem and the type of variation the problem represents before trying to solve the problem.
Common cause vs special cause
In this case, the small handful of hourly-paid contractors were not representative of the thousands of other full-time employees and contractors in the building. So the fraud they committed was not a signal that something was wrong with everyone in the building, only with a tiny minority. Rather than seeing this problem as a signal of a special event with an identifiable cause (referred to as special cause variation), the management acted as if it were a problem with all contractors in the building (something that could have happened in any team at any time – referred to as common cause variation).
In a special cause situation it’s worth asking “is there a specific root cause that explains what happened here?” because it’s likely there are a small number of identifiable causes. In the absence of good data (such as a longitudinal plot of data), a useful rule of thumb is to ask “If we replaced this bunch of people with another bunch of people would the problem occur?”. In this story, there were hundreds of other hourly-paid contractors in other teams who did not fabricate their timesheets, so the answer is probably ‘no’, indicating that this was likely a special cause situation. In a common cause situation there’s no point asking “what was the cause of this?” because there are multiple sources of variation (causes) all contributing to the problem.
The fix for a special cause situation is to go to the root cause and see if it can be prevented. Indeed in this story it would have been useful to question the manager involved and understand what led him to sign timesheets for times that his team did not work. The fix for common cause variation (and most variation is common cause) is to go and study the situation, experiment, and look for patterns or trends in the data before making a change to the system.
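One way to get the “longitudinal plot of data” mentioned above is a process behaviour (XmR) chart. The sketch below is a minimal illustration and not something taken from the story: the weekly figures are invented, the function names are my own, and 2.66 is the usual scaling constant for an individuals chart based on the average moving range. Points outside the limits are candidates for special cause; everything inside is best treated as common cause variation.

```python
from statistics import mean


def xmr_limits(values):
    """Return (lower, centre, upper) natural process limits for an XmR chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    centre = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 scales the average moving range to roughly three-sigma limits.
    spread = 2.66 * mean(moving_ranges)
    return centre - spread, centre, centre + spread


def special_cause_points(values):
    """Return the indexes of observations that fall outside the limits."""
    lower, _, upper = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical data: hours billed per week by one team (made up for illustration).
weekly_hours = [40, 41, 39, 40, 42, 40, 39, 41, 40, 60, 40, 41]

print(xmr_limits(weekly_hours))
print(special_cause_points(weekly_hours))  # the 60-hour week is at index 9
```

With only a dozen points the limits are rough, so treat a flagged point as a prompt to go and ask what happened that week rather than as proof of anything.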
Implementing the wrong type of fix is tampering and mostly makes things worse
As this story illustrates, applying a common cause solution to a special cause problem – “tampering” as Deming called it – can lead to bad results. Making all contractors (even those on day rates) use the electronic time keeping system sent the message that all contractors were thieves! And as Deming says, if you muck people around they will use their ingenuity to find ways around the system instead of working towards the purpose of the system. Applying a common cause solution to a special cause problem will reduce people’s intrinsic motivation because it can seem unreasonable and unjust.
The story above is actually a reverse of the more common scenario where managers often treat what is a systemic problem as a special cause and blame the individual. There are many examples of this such as setting targets for sales in call centres (tip: most of the sales are the result of customers who want to buy phoning in, rather than the technique of the person who receives the call).
Have you seen examples of tampering where the wrong type of fix was applied to a problem (such as yesterday’s blog, where a manager tried to change the team’s process to cater for the behaviour of specific individuals)? Do you have stories of fixes that were applied to the whole system when there was a clear special cause that could have been prevented at its source (e.g. sign-offs in a deployment process)? Please share your story in the comments.
Hi, I’m Benjamin. I hope that you enjoyed the post. I’m a consultant and coach who helps IT teams and their managers create more effective business results. You can find out more about me and my services. Contact me for a conversation about your situation and how I could help.
14 responses to “The importance of understanding variation or how to avoid treating all contractors as thieves”
In some situations, I suspect that people are embarrassed to talk to the specific offenders and find it easier to abstract the situation by treating the problem as a general case.
Yes, I agree with you that some people think that handling individual differences by talking about general cases will be easier (and save embarrassment). It’s frequently easier to do the wrong thing – for example, understanding and fixing common cause variation is usually much harder than reactively ‘finding’ a reason for the variation/problem.
I addressed the issue around ways to make it easier to discuss individuals’ behaviour. I’d value your thoughts on that.
Once upon a time, when I was working as a contractor, I used to get my manager to sign timesheets for hours that I wasn’t working. I was on a day rate and I filled in my timesheet as 8.30 to 17.30, with 1 hour for lunch every day.
My manager noticed that some days I arrived at 10; he didn’t notice that most days I stayed until 19.00, or that I frequently took only 30 minutes for lunch.
He told me that my timesheets were not accurate and I explained that this was true, but making them accurate would require me to spend more time keeping track of time and less time doing actual work. This satisfied him for a week or so.
Then the subject came up again, so I said “Fine, I’ll keep accurate records and we’ll go with the consequences that that brings”
For the next five or six weeks, I kept accurate records, and after having received two invoices for roughly 15% more than my normal monthly rate, he queried the amounts.
“You asked for accurate time keeping,” I said, “and accurate time keeping is the input to the billing system. If you’d rather go for the simpler solution, where I write 8.30 to 17.30 every day, with 1 hour for lunch, that’s fine with me.”
And so, we returned to the original arrangement…
Nice story! It’s a shame that people don’t go back to the purpose of these systems – “why are we collecting time information?” and “what’s the purpose of getting a signature against the time?”. As your story highlights, the person signing wasn’t really thinking about either the purpose or what the time sheets were measuring or used for.
I was once an employee for a company that frequently sent me to customer sites, often for months at a time.
Of course the company I worked for rebilled my time to the client. If I worked 12 hours a day they would rebill for 12 hours a day.
I, meanwhile, received no overtime for such efforts. In fact I worked with a whole team of people in the same situation.
Eventually we all reached the same conclusion: when on a customer site we would work only the minimum hours (or if we did work more we would not report it).
So by their behaviour the company lost out – both on client rebilling rates, and goodwill.
I agree that it’s silly and wasteful to make changes that apply to ALL instead of dealing with those who are violating rules or procedures.
I’ve seen many cases where managers do not have the courage to address individuals, so they put out blanket pronouncements about new policies or training refresher courses, etc.
Ineffective and, often, lazy management.
I agree that lacking the courage to deal with individuals in those cases where they are an issue often leads to ineffective management.
When you say this represented “lazy management” I think you’re saying that the manager is deliberately choosing to avoid doing something because they don’t want to invest the effort necessary. Is that what you meant?
I’m currently puzzling through how to handle a situation where we hold this view of a manager but want to be effective in improving the situation. Here are some thoughts I have on the problem.
I could imagine a manager feeling unjustly accused or offended if you told them “you’re lazy” unless they felt you were striving to be helpful to them and were able to show them what you’ve seen or heard that led you to this view.
If we hold this high level negative evaluation of the manager but don’t say it (for fear we’d offend them) then I think there are several problems:
I’m curious to find out what your view is on my reasoning in this situation?
Often when an agile transformation fails it’s because the agile best practices were introduced for a special or common cause (special, one bad developer, or common, systematic communication problems).
But that is a boring example… On North Hull, when I was a kid, they cut car crime by 99% by introducing a helicopter and new squadron of police cars. There was only a handful of people stealing cars and so, in hindsight, maybe the reaction was heavy handed. (Although, normal civilians were not punished by this broad approach to a special case, unless you consider the tax bill). But what’s interesting, how does that helicopter work as a deterrent? Was car crime about to become even more popular? What would have happened if they spent all that money on education instead?
A difficult discussion to have as our actions change the output. (BTW, the special case of the helicopter was supposed to be dealt with by a local gangster with a surface to air missile he bought from a Russian gangster. Luckily that never happened.)
When you say “an agile transformation fails because the agile best practices were introduced for a special or common cause”, do you mean “introduced inappropriately” (e.g. a common cause fix was introduced for a special cause problem or vice versa)? If not then I’m a bit confused because according to the ‘theory of variation’ frame there is only special or common cause variation. Can you say a bit more to help me understand better?
I’d say the introduction of the helicopter also had the negative impact of everyone having to listen to the helicopters fly overhead (maybe it had a positive of increasing perceptions of safety?). My own experience of having helicopters fly overhead years ago (I was told it was to use heat measuring devices to see which houses were growing hydroponic marijuana – but I never checked that out) was an increase in stress and the thought “what kind of neighbourhood am I living in that needs helicopters to police it?”
My view is the special cause of a handful of people stealing cars might have been more effectively dealt with by understanding who these people were and how that could be dealt with directly (I’d predict it may have been cheaper for everyone as well).
In terms of understanding the impact of the helicopter intervention it would be useful to “plot the dots” and see the crime trends over time. A statement like “cut crime by 99%” lacks the context to understand what two (or more?) periods are being compared.
I like the ingenuity of the gangsters thinking about a surface to air missile!
I have no stats, but I know people who do.
And how wrong I was. I guess, at the time, crime felt like it was falling, or was reported to be falling, because of the helicopter. What a foolish thing I did, to quote statistics from memory.
I think I meant to say that some special causes, such as a team missing a release, are “solved” by ‘going agile’. But other common causes, such as a culture of mistrust, are also “solved” by ‘going agile’. In both these cases, it’s not clear what the problem may be and so the solution is bound to fail. In both these cases, the decision makers would benefit from this blog’s advice in relation to tampering and looking for root causes.
I know of a similar story. A manager who had it in for two contractors. They looked for a way to dismiss the two individuals. They discovered that both contractors, as managers, spent time off site to do training or have sensitive conversations. This was in agreement with their management. The manager then sought a mechanism to entrap the contractors and they found the electronic gates to be perfect for the situation. Unfortunately the managers of the contractors failed to stand up for their staff who were working according to their agreement.
The company wide policy was introduced for two reasons.
1. To hide the fact that it was a focused attack.
2. To gain credibility for the manager who discovered the infringement. Enforcing the rule company wide drew attention to their diligence in detecting the heinous theft.
The manager did not do this for the good of the company. It was a personal attack. Sometimes it is more important to discover the motivation for looking for the ‘problem’. After all, Listerine made money from inventing Halitosis.
I’ve seen many situations where IT people (both “perm” [snicker] and contract) were expected or pressured to always report exactly xx hours per week.
I wonder if your “mythical” financial firm used its data-gathering also to investigate discrepancies of *under*-reporting.
P.S. — Many times I’ve been forced — with a manager’s connivance — to over-report hours for one week as the only means of being compensated for hours unreported in a prior week.
Interesting as always. I don’t agree that root-cause analysis is only fruitful for special-cause variation; at youDevise we regularly find it helpful for looking into production bugs that appear to be down to common-cause variation, because we use the analysis to locate the common causes – for example, insufficient training, a missing quality check, or a cultural indifference to code quality.
Of course we have to be careful about overreacting to lessons learnt from a single analysis, or we will fall into the trap of applying a broad change when a narrow one would suffice, as you illustrate in the post. So we generally take small steps to improve – for example, if analysis indicates that insufficient training in a particular area may have led to a bug, we might first invite an expert in the area to address the team, and only if we have more bugs in the same area would we do more, such as introducing formal training. (The practice of ensuring we can complete all post-analysis actions within a week helps keep the response proportional.)
Perhaps our definitions of “root-cause analysis” differ? The analysis described by Eric Ries (http://www.startuplessonslearned.com/2009/07/how-to-conduct-five-whys-root-cause.html) is a decent description of what we do.