In a recent thought-provoking TDWI article, David Champagne informed readers of The Rise of Data Science: a discipline of emulating the scientific method when analyzing data, in a conscious and laudable effort to ensure objectivity and avoid poor analytical practices.
I had recently blogged about the Texas Sharpshooter Fallacy, a type of flawed analytical logic that business intelligence users might fall into, so David Champagne’s article caught my attention.
From David Champagne’s article:
Back in the “good old days,” data was the stuff generated by scientific experiments. Remember the scientific method? First you ask a question, then you construct a hypothesis, and you design an experiment. You run your experiment, collect and analyze the data, and draw conclusions. Finally, you communicate your results and let other people throw rocks at them.
Nowadays, thanks largely to all of the newer tools and techniques available for handling ever-larger sets of data, we often start with the data, build models around the data, run the models, and see what happens. This is less like science and more like panning for gold…Perhaps the term “data scientist” reflects a desire to see data analysis return to its scientific roots…
Barry Devlin, in his business-focused commentary on David Champagne’s article, noted that the worlds of science and business have rather different goals and visions. I took this to mean that data science might offer limited benefit to business managers. But perhaps the best practices of data scientists have a lot more in common with those of business managers after all, in light of some commentary I came across on effective business decision-making. That commentary gave high praise to the manager who uses the scientific method in the decision-making process. The author was not a technologist, but rather Peter Drucker, the father of modern business management.
Revisiting Peter Drucker’s writings on the effective decision-making process reveals surprising similarities to the best practices of data science, and yields useful insights for business managers seeking to make more effective, data-informed decisions.
Barry Devlin wrote that the worlds of science and business work differently with data. For example, business is concerned with improving the bottom line while science seeks “real and eternal truths.” This is true, but scientists never stop evaluating and improving on those “truths.” The scientific process of seeking “real truth” rather than accepting conventional wisdom is in fact very similar to the business decision-making process. From Peter Drucker:
(E)xecutives who make effective decisions know that one does not start with facts. One starts with opinions. These are, of course, nothing but untested hypotheses and, as such, worthless unless tested against reality…
People inevitably start out with an opinion; to ask them to search for the facts first is even undesirable. They will simply do what everyone is far too prone to do anyhow: look for the facts that fit the conclusion they have already reached. And no one has ever failed to find the facts he is looking for…
The only rigorous method…is based on the clear recognition that opinions come first…Then no one can fail to see that we start out with untested hypotheses – in decision-making as in science the only starting point. We know what to do with hypotheses – one does not argue them; one tests them. One finds out which hypotheses are tenable, and therefore worthy of serious consideration… (Source: The Essential Drucker, Peter Drucker, p. 252).
The process of testing opinions, aka hypotheses, will of course involve analyzing data, a process for which Peter Drucker again calls for a scientist-like inquisitiveness: “The effective decision-maker assumes the traditional measurement is not the right measurement…The best way to find the appropriate measurement is to…look for ‘feedback’” (Drucker, p. 253). Drucker describes feedback as “organized information” that is “built around direct exposure to reality,” which is better known today as a performance metric. Finding the right performance metric(s) is “a risk-taking judgement,” Drucker says, and the best way to mitigate that risk is to use metrics that have already proven their value in practice.
Finally, the evaluation stage of the scientific method, “letting people throw rocks” at your conclusions, as David Champagne wrote, is again very similar to Drucker’s advice to business leaders, whom he urges to “create dissension and disagreement rather than consensus” (p. 254):
The effective decision-maker…organizes disagreement…It gives him alternatives so that he can choose and make a decision, but also so that he is not lost in the fog when his decision proves deficient…And it forces the imagination – his own and that of his associates…[The effective decision-maker] starts out with the commitment to find out why people disagree. (p. 256)
The best practices of data scientists and business decision-makers seem to overlap heavily. Thankfully, though, the business manager does not need to actually become a data scientist. Doing so would mean adding two substantial skill sets to the business manager’s functional expertise: extensive mathematical and statistical capabilities, and “hacking skills,” the ability to find and retrieve data from disparate sources on one’s own. David Champagne writes, “Finding and retrieving data sometimes requires the skills of a burglar” (emphasis added). This is cringe-worthy terminology to CIOs already working hard to avoid shadow systems and the multiple versions of the truth they might bring.
That said, Barry Devlin makes a succinct and strong case for the data warehouse as a leading source, if not the only source, of trusted data, not to mention trusted performance metrics, that business decision-makers can use with no need for “hacking skills”:
This is where a data warehouse comes in. Of course, only a small proportion of the data can (or should) go through the warehouse. But the value of the warehouse is in the fact that the data it contains has already been reconciled and integrated to an accepted level of consistency and historical accuracy for the organization.
An effective data warehouse will serve business managers as a laboratory of sorts, enabling decision-makers to test opinions/hypotheses by asking good questions and getting good answers: data that is “true” (accurate) along with proven performance metrics.
Business executives and managers will do well to adopt Peter Drucker’s best practices of decision-making, which closely follow the best practices of data scientists analyzing and interpreting data. Doing so will lead toward better, data-informed decisions, and away from managing based on irrelevant measurements or “looking for the facts that fit the conclusion.” In short, it means acting like a data scientist and not a “mad scientist.”