In a recent thought-provoking TDWI article, David Champagne informed readers of The Rise of Data Science: a discipline of emulating the scientific method when analyzing data, in a conscious and laudable effort to ensure objectivity and avoid poor analytical practices.
As I had just recently blogged on the Texas Sharpshooter Fallacy, a type of flawed analytical logic business intelligence users might fall into, David Champagne’s article caught my attention.
From David Champagne’s article:
Back in the “good old days,” data was the stuff generated by scientific experiments. Remember the scientific method? First you ask a question, then you construct a hypothesis, and you design an experiment. You run your experiment, collect and analyze the data, and draw conclusions. Finally, you communicate your results and let other people throw rocks at them.
Nowadays, thanks largely to all of the newer tools and techniques available for handling ever-larger sets of data, we often start with the data, build models around the data, run the models, and see what happens. This is less like science and more like panning for gold…Perhaps the term “data scientist” reflects a desire to see data analysis return to its scientific roots…
Barry Devlin, in his business-focused commentary on David Champagne’s article, noted the worlds of science and business have rather different goals and visions, which I took to mean that data science might offer limited benefit to business managers. But perhaps the best practices of data scientists have a lot more in common with those of business managers after all, in light of some commentary I came across on effective business decision-making. That commentary gave high praise to the manager who uses the scientific method in the decision-making process. The author was not a technologist, but rather Peter Drucker, the father of modern business management.
Revisiting Peter Drucker’s writings on the effective decision-making process reveals surprising similarities to the best practices of data science, and yields beneficial insights for business managers seeking to make more effective, data-informed decisions.
Barry Devlin wrote that the worlds of science and business work differently with data. For example, business is concerned with improving the bottom line while science seeks “real and eternal truths.” This is true, but scientists also never stop evaluating and improving on “truths.” The scientific process of seeking “real truth” and not accepting conventional wisdom is very similar indeed to the business decision-making process. From Peter Drucker:
(E)xecutives who make effective decisions know that one does not start with facts. One starts with opinions. These are, of course, nothing but untested hypotheses and, as such, worthless unless tested against reality…
People inevitably start out with an opinion; to ask them to search for the facts first is even undesirable. They will simply do what everyone is far too prone to do anyhow: look for the facts that fit the conclusion they have already reached. And no one has ever failed to find the facts he is looking for…
The only rigorous method…is based on the clear recognition that opinions come first…Then no one can fail to see that we start out with untested hypotheses – in decision-making as in science the only starting point. We know what to do with hypotheses – one does not argue them; one tests them. One finds out which hypotheses are tenable, and therefore worthy of serious consideration… (Source: The Essential Drucker, Peter Drucker, p. 252).
The process of testing opinions, aka hypotheses, will of course involve analyzing data – a process for which Peter Drucker again calls for a scientist-like inquisitiveness: “The effective decision-maker assumes the traditional measurement is not the right measurement…The best way to find the appropriate measurement is to…look for ‘feedback’” (Drucker, p. 253), which Drucker describes as “organized information” that is “built around direct exposure to reality” – better known today as a performance metric. Finding the right performance metric(s) is “a risk-taking judgement,” Drucker says. And the best way to mitigate that risk is to use metrics that have already proven their value in practice.
Finally, the evaluation stage of the scientific method – “letting people throw rocks” at your conclusions, as David Champagne wrote – is again very similar to Drucker’s advice to business leaders, urging them to “create dissension and disagreement rather than consensus” (p. 254):
The effective decision-maker…organizes disagreement…It gives him alternatives so that he can choose and make a decision, but also so that he is not lost in the fog when his decision proves deficient…And it forces the imagination – his own and that of his associates…[The effective decision-maker] starts out with the commitment to find out why people disagree. (p. 256)
The best practices of data scientists and business decision-makers seem to overlap heavily. Thankfully, though, the business manager does not need to actually become a data scientist. Doing so would require adding to the business manager’s functional expertise two extensive skill sets of the data scientist: deep mathematical and statistical capabilities, and “hacking skills” – the ability to find and retrieve data from disparate sources on one’s own. David Champagne writes, “Finding and retrieving data sometimes requires the skills of a burglar” (emphasis added). This is cringe-worthy terminology to CIOs already working hard to avoid shadow systems and the multiple versions of the truth they might bring.
That said, Barry Devlin makes a succinct and strong case for the data warehouse as a leading source, if not the only source, of trusted data, not to mention trusted performance metrics, that business decision-makers can use with no need for “hacking skills”:
This is where a data warehouse comes in. Of course, only a small proportion of the data can (or should) go through the warehouse. But the value of the warehouse is in the fact that the data it contains has already been reconciled and integrated to an accepted level of consistency and historical accuracy for the organization.
An effective data warehouse will serve business managers as a laboratory of sorts, enabling decision-makers to test opinions/hypotheses by asking good questions and getting good answers: data that is “true” (accurate) along with proven performance metrics.
Business executives and managers will do well to adopt Peter Drucker’s best practices of decision-making, which closely follow the best practices of data scientists analyzing and interpreting data. Doing so will help lead towards better, data-informed decisions, and away from managing based on irrelevant measurements or “looking for the facts that fit the conclusion” – in short, acting like a data scientist and not a ‘mad scientist.’
2 thoughts on “Business Managers Can Learn a Lot from Data Scientists”
Mike, very interesting post. I especially liked the dive into Peter Drucker’s writings and the recognition that decision-making starts from opinions which equate to untested hypotheses in the scientific method. I agree. In this sense, business and science can use and benefit from the same scientific method.
My concern about data science in business is two-fold. First, the personal, organizational and economic pressures in business all try to push data graphs in one direction only. Try to have a business discussion about the possibility that overall revenue figures should decrease annually! While scientists do vigorously defend the status quo in their field, they generally do not have to prove it on a monthly basis. Such pressure in business tends to turn untested hypotheses into gut certainties.
Second, I have an underlying concern about the inherent “quality” of data that is repurposed for new analyses. By quality here I mean the applicability of the previously collected data to the purpose of the analysis. For example, taking sensor data from vehicles that was gathered for tracking mechanical properties and using it as a basis for judging safe driving practices, and thus insurance premium decisions, opens up a Pandora’s Box of questions about the sampling techniques, the statistical characteristics of the data, not to mention ethical and philosophical concerns.
My bottom line is that we need to be vigilant to eliminate unintended and even deliberate bias in the application of data science principles to business.
Barry, thanks very much for your comments. Your points are well taken, particularly your concern about repurposing previously collected data for new uses that may well be inappropriate.
I like your example of insurance companies exploring using vehicle sensor data to assess safe driving and therefore premiums. To your point, suppose those sensors yielded a data point noting a very sudden hard stop from 40 to 0 mph. How could we know whether the driver braked to alertly avoid hitting a child who darted into the street after a soccer ball, or because he spilled his coffee while texting a note to his spouse about a song he just heard on the radio? 🙂
Even if we could allay this concern – let’s even assume the sensors have next-generation high-res video components (!) to indisputably validate ‘why’ hard stops, sudden accelerations, etc. took place – that mountain of “big data” would no doubt point to the same subset of very poor drivers already known all too well by police, traffic courts and auto insurers!
This leads me to wonder: will all this “big data” end up confirming Pareto’s “80/20 Rule”? Perhaps simple business pragmatism will go a long way to help discourage repurposing of data inappropriately or in ways that will yield little if any improved insight.
Such data usage, OTOH, could well be useful if, for example, drivers with a proven record of poor driving (speeding tickets, fender benders, and worse) agreed to such scrutiny in order to prove they have reformed, rebuild their insurability, and avoid losing their driver’s license permanently. Built-in car breathalyzers are already a welcome reality for the public, while still affording drunk-driving offenders an opportunity to reform their behavior before losing their driver’s license for good and perhaps finding themselves in a jail cell.
Thank you again for your insights, Barry. I appreciate you sharing them here. Data science can easily become “mad science” if not very carefully managed and monitored.