It’s not unusual, and indeed in some environments common, for these sorts of claims to be accepted with no analysis at all.
The data-driven professional is brave enough to call this out and to request that some sort of case is made in the data to warrant further investigation of the claim.
2. The coin-flip test

Let’s suppose that data is now forthcoming that shows some reason to believe that the move to a new headquarters has impacted sales.
For example, someone has shown that, for a sample population of salespeople, sales per person in the three months prior to the move were 10% higher than in the three months after the move.
Seeing a pattern or difference in some sample data is not proof that this pattern or difference actually exists in general.
We all know this intuitively, and to illustrate it I often bring a double-headed coin to my workshops.
I flip the coin repeatedly, without telling my audience it is double-headed, and at each flip I ask the audience if they believe the coin is fake.
What I inevitably see is that the number of people who believe the coin is fake increases with each flip, which suggests that people have an intuitive sense of statistical uncertainty, and that they have different thresholds at which they consider something certain.
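The audience's growing suspicion can be put into numbers. If the coin were fair, the chance of seeing heads on every one of n flips would be 0.5 raised to the power n. The short sketch below prints that probability for the first ten flips; the function name `prob_all_heads` is just an illustrative choice, not something from a library.

```python
def prob_all_heads(n_flips: int, p_heads: float = 0.5) -> float:
    """Probability that every one of n_flips independent flips lands heads."""
    if n_flips < 0:
        raise ValueError("n_flips must be non-negative")
    return p_heads ** n_flips


# How surprising is a run of heads if the coin is actually fair?
for n in range(1, 11):
    print(f"{n:2d} heads in a row: {prob_all_heads(n):.4f}")
```

After five heads in a row the probability under a fair coin is about 3%, which is already below the 5% cut-off that is commonly used in statistical testing. People who call the coin fake earlier or later than that are simply applying a different personal threshold.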
Relating this to the problem at hand, the uncertainty lies in whether the sample population was large enough, or whether the 10% difference is large enough, to confidently claim that the pattern exists in general.
It is fine to say ‘In a sample of salespeople we found that sales per person were higher before we moved headquarters’.
But you cannot say ‘Sales per person were higher before we moved headquarters’ without appropriate statistical testing by someone qualified to do it.
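To give a flavor of what such testing can look like, here is a minimal sketch of one possible approach, a paired permutation (sign-flip) test, using only the Python standard library. It assumes we have sales figures for the same salespeople before and after the move. The numbers are invented purely for illustration, and the function name is hypothetical. It is not a substitute for a proper analysis by a qualified person.

```python
import random
from statistics import mean


def paired_permutation_p_value(before, after, n_resamples=10_000, seed=0):
    """Two-sided p-value for the mean paired difference, via random sign flips."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [b - a for b, a in zip(before, after)]
    observed = abs(mean(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # If the move made no difference, each person's change is equally
        # likely to be positive or negative, so we flip signs at random.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical sales per person (same eight people), invented for illustration.
before = [110, 95, 120, 105, 98, 130, 115, 102]
after = [100, 90, 108, 99, 95, 112, 104, 97]

p = paired_permutation_p_value(before, after)
print(f"Estimated p-value: {p:.4f}")
```

A small p-value says that a difference this large would rarely arise by chance alone in a sample like this one. It says nothing yet about why the difference exists.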
3. The mozzarella test

Now let’s assume that we get some (fairly straightforward) statistical tests done on the data, and that they establish a meaningful difference, so that we can in fact make the general claim about the sales difference before and after the headquarters move.
Can we now say that moving headquarters has impacted sales?

Take a look at this chart:

[Chart: mozzarella cheese consumption plotted against civil engineering doctorates awarded in the US]

Yes, there is almost a perfect correlation between eating mozzarella cheese and the award of civil engineering doctorates in the US.
Suggestive patterns and relationships exist everywhere in data.
For more of them check out this website.
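One reason such patterns turn up so easily is that any two quantities that happen to drift in the same direction over time will correlate strongly, even when they have nothing to do with each other. The sketch below illustrates this with two made-up series that are generated independently but both trend upward. The variable names and all the numbers are assumptions for illustration and are not real consumption or doctorate data.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


rng = random.Random(42)

# Two invented yearly series: each drifts upward with its own independent noise.
cheese = [9.0 + 0.2 * year + rng.gauss(0, 0.05) for year in range(10)]
doctorates = [480 + 30 * year + rng.gauss(0, 10) for year in range(10)]

r = pearson(cheese, doctorates)
print(f"Correlation between two unrelated trending series: {r:.3f}")
```

Neither series is computed from the other, yet the correlation comes out very close to 1, simply because both rise over the same period.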
The statement being made in our example is very specific — it suggests that the move to a new headquarters has caused a drop in sales.
Just proving that there was a drop in sales is not enough to prove that it was caused by the move.
To prove a causal relationship, more work needs to be done.
For example, it’s important to rule out the possibility that the difference was caused by other factors.
Perhaps we see seasonal sales drops at this time every year? Perhaps a new competitor has entered the market? It’s also important to see if there is a mechanism of causality: for example, are salespeople on average seeing fewer clients because the move has taken them further away from those clients?

So becoming more data-driven doesn’t mean you have to be a math genius.
As a first step, try to change your behavior around the questions you ask when you hear hypotheses at work:

1. Is there supporting data? (DTA)
2. Does it satisfactorily prove the relationship being claimed? (Coin-flip test)
3. Is there a clear causality? (Mozzarella test)

Becoming more data-driven is fundamentally a question of behavior change.
Having the math skills helps, and it’s a good idea to look to develop these over time.
But try starting with these three simple steps.
Originally I was a Pure Mathematician, then I became a Psychometrician and a Data Scientist.
I am passionate about applying the rigor of all those disciplines to complex people questions.
I’m also a coding geek and a massive fan of Japanese RPGs.
Find me on LinkedIn or on Twitter.