It is better to know nothing than to know what ain’t so. – Josh Billings
A Texas cowboy fires several rounds at a barn. Looking at the holes riddled across the barn wall, he observes that some of them happen to be clustered together. He excitedly paints a bulls-eye centered over the biggest cluster of holes and proudly shows it off as proof he is a sharpshooter.
This old joke is the namesake for a type of logical fallacy known as the Texas Sharpshooter: cherry-picking clusters of information, be it bullet holes or data, to support one’s own agenda.
Examples of the Texas Sharpshooter fallacy can be seen in public health investigations, politics (a politician might accuse an opponent of a “pattern” of poor actions), and pseudosciences like “intelligent design.”
But a Texas Sharpshooter fallacy can also result from mistakenly assigning meaning to what is in fact a small subset of all related data. It then occurred to me: virtually every business intelligence tool has a feature to highlight a portion of a graph to focus only on a particular data cluster, enabling every BI end-user to “draw a bulls-eye” of their own! The risk of jumping to false conclusions based on such selective highlighting can be very high.
So how can business intelligence users avoid unwittingly creating their own Texas Sharpshooter fallacies, and help debunk those put forth by others in the organization?
The problem occurs when a BI end-user draws a conclusion that is based on that isolated data cluster. Suppose a sales manager using a BI tool highlights a cluster of data points, all with a noticeably low gross margin compared to the other monthly sales. The sales manager then sees that those ten data points – all sales with low gross margins – are all deals closed by Joe Bloggs.
Is Joe Bloggs giving away pricing concessions to customers? Well, if the sales manager has poor analytical, managerial and/or interpersonal skills, or even just doesn’t like Joe, this data cluster may be all the “proof” (s)he needs to “have a word” with Joe, subvert Joe’s efforts by grousing to others in the hall over his lousy dealmaking skills… you get the picture. To quote David McRaney, “You commit the Texas Sharpshooter Fallacy when you need a pattern to provide meaning, to console you, to lay blame.”
Of course, the sales manager does not have “proof” Joe Bloggs is “giving away margin” any more than that cluster of bullet holes is “proof” the cowboy is a sharpshooter. What the sales manager does have is a hypothesis – defined succinctly by Wikipedia as “a proposed (repeat, proposed) explanation for an observable phenomenon.” That data cluster of low margin sales should be the beginning of the sales manager’s line of inquiry, not the conclusion.
A wise user of business intelligence tools will proceed like a scientific researcher, who will begin with a hypothesis and then try to disprove it. Here are two suggestions to do so effectively:
- Use the word ALL a lot. Seriously, asking new questions of the data with liberal use of the word “all” is wise. For example, how do Joe Bloggs’ ten low margin sales compare with “all” of his deals for the month (in terms of both currency and quantity); with “all” sales of the same products by “all” sales reps; with “all” sales to those customer(s), etc.? Thinking and asking questions of the data in terms of “all” will effectively erase the cowboy’s bulls-eye target, so to speak, by focusing on relevant data in its entirety and not the original data cluster in isolation, which may well prove statistically insignificant in light of “all” related data.
- The longer the time series of data you can use in your line of inquiry, the better. Here’s a real world example: A college finance and budget department recognized a common assumption among faculty that administrative spending was out of control, leaving faculty to fight for whatever was left. The budget and reporting director produced a trending analysis that convincingly showed that growth rates in spending were the same between faculty and administration… over the last 15 years! This undercut the conventional wisdom and led to fact-focused planning and prioritization discussions and enhanced trust and communication between departments. (Of course, a cherry-picked subset of the data would no doubt have yielded “proof” of disproportionate growth in administrative spending.)
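
For readers who like to see the “all” questions spelled out, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the deal records, the field names (`rep`, `product`, `margin_pct`), the second rep, the 20% threshold and the helper `compare_cluster_to_all` are all hypothetical, and a real BI tool would of course do this through its own filters. The point is only to show the highlighted cluster sitting next to “all” of the rep’s deals and “all” sales of the same products by “all” reps.

```python
from statistics import mean

# Hypothetical deal records; names, fields and figures are made up for illustration.
deals = [
    {"rep": "Joe Bloggs", "product": "Gadget", "margin_pct": 10.0},
    {"rep": "Joe Bloggs", "product": "Gadget", "margin_pct": 12.0},
    {"rep": "Joe Bloggs", "product": "Widget", "margin_pct": 40.0},
    {"rep": "Joe Bloggs", "product": "Widget", "margin_pct": 42.0},
    {"rep": "Ann Other", "product": "Gadget", "margin_pct": 11.0},
    {"rep": "Ann Other", "product": "Gadget", "margin_pct": 13.0},
    {"rep": "Ann Other", "product": "Widget", "margin_pct": 41.0},
]


def compare_cluster_to_all(deals, rep, low_margin_threshold):
    """Put a rep's low-margin cluster next to 'all' related data."""
    # "All" of the rep's deals, not just the highlighted ones.
    rep_deals = [d for d in deals if d["rep"] == rep]
    # The highlighted cluster: the rep's deals below the margin threshold.
    cluster = [d for d in rep_deals if d["margin_pct"] < low_margin_threshold]
    if not cluster:
        raise ValueError("no low-margin deals found for this rep")
    # "All" sales of the same products by "all" sales reps.
    products = {d["product"] for d in cluster}
    same_products = [d for d in deals if d["product"] in products]
    return {
        "cluster_size": len(cluster),
        "cluster_share_of_rep_deals": len(cluster) / len(rep_deals),
        "cluster_avg_margin": mean(d["margin_pct"] for d in cluster),
        "rep_avg_margin_all_deals": mean(d["margin_pct"] for d in rep_deals),
        "all_reps_avg_margin_same_products": mean(
            d["margin_pct"] for d in same_products
        ),
    }


summary = compare_cluster_to_all(deals, "Joe Bloggs", low_margin_threshold=20.0)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up data, the low-margin cluster sits on a product that sells at a similar margin for every rep, so the cluster says more about the product than about Joe. Real data may tell a different story, which is exactly why the comparison is worth running before drawing the bulls-eye.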
If a co-worker makes an assertion based on some isolated data set, ask whether their assertion is supported by the entire spectrum of related data.
One final thought: A highly effective business intelligence implementation will make it very easy for the end-user to compare a data cluster with “all” related data and readily observe the significance, if any, of a selected data cluster. Now there’s some real data sharpshooting!