How They Fit Together: Bell Curves, Bayesian Inference and Black Swans

May 1, 2012

Probability is the chance that a certain event or occurrence will take place, now or in the future. In a world where business managers like to “know the odds”, how does probabilistic thinking (Frequentist and Bayesian) mesh with extreme events (i.e. Black Swans) that simply cannot be predicted?

Statisticians lament how few business managers think probabilistically. In a world awash with data, they claim there is little reason not to have a decent amount of objective data for decision making. However, there are some events for which there are no data (they haven’t occurred yet), and there are other events that could happen outside the scope of what we think is possible.

The best quote to sum up this framework for decision making comes from former US Secretary of Defense Donald Rumsfeld, speaking in February 2002:

“There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – there are things we do not know we don’t know.”

Breaking this statement down, it appears Mr. Rumsfeld is speaking about Frequentism, subjective probability (Bayesian Inference) and those rare but extreme events that Nassim Taleb calls “Black Swans”.

Author Sharon Bertsch McGrayne elucidates the first two types of probabilistic reasoning in her book “The Theory That Would Not Die”. Frequentism (conventional statistics), she says, relies on measuring the relative frequency of an event that can be repeated time and again under the same conditions. This is the world of p-values, bell curves, coin flips, casinos and actuaries, where data-driven decision making is objective because it rests on sampling or on computations over large data sets.
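
To make the frequentist idea concrete, here is a minimal Python sketch, not taken from McGrayne’s book, that simulates a repeatable event (a fair coin flip) and measures its relative frequency over more and more trials. The function name `relative_frequency`, the fixed seed and the trial counts are assumptions chosen purely for illustration.

```python
import random


def relative_frequency(n_trials, p_event=0.5, seed=0):
    """Share of n_trials in which a simulated event occurred.

    p_event and seed are illustrative assumptions: a fair coin and a
    fixed seed so that the run is repeatable.
    """
    rng = random.Random(seed)
    hits = sum(rng.random() < p_event for _ in range(n_trials))
    return hits / n_trials


# The same event, repeated under the same conditions, at three sample sizes.
freqs = {n: relative_frequency(n) for n in (10, 1_000, 100_000)}
for n, freq in freqs.items():
    print(f"{n:>7} flips: relative frequency of heads = {freq:.4f}")
```

With only a handful of flips the measured frequency can sit well away from one half; with many flips it settles close to it, which is why this style of reasoning needs events that can be repeated.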

The greater part of McGrayne’s tome concentrates on defining Bayesian Inference, or subjective probability, also known as a “measure of belief”. Bayes, she says, allows predictions to be made with no prior information at all (no frequency of events). With Bayes, one makes an educated guess and then keeps refining that guess as new information arrives, updating and revising the probabilities and so getting “closer to certitude.”
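
One common way to illustrate this refine-as-you-go process is a Beta-Binomial update. It is a standard textbook device rather than anything McGrayne prescribes. In the sketch below the starting guess is a uniform Beta(1, 1) prior, and the function names and the batches of observations are hypothetical, made up for illustration only.

```python
def update_beta(alpha, beta, occurred, did_not_occur):
    """Conjugate update: add the newly observed counts to the prior counts."""
    return alpha + occurred, beta + did_not_occur


def beta_mean(alpha, beta):
    """Current best estimate of the event's probability."""
    return alpha / (alpha + beta)


# Educated guess before any data: Beta(1, 1), i.e. "anywhere from 0 to 1".
alpha, beta = 1, 1
estimates = [beta_mean(alpha, beta)]

# Hypothetical batches of new information: (times occurred, times it did not).
batches = [(3, 7), (2, 8), (4, 16)]
for occurred, did_not_occur in batches:
    alpha, beta = update_beta(alpha, beta, occurred, did_not_occur)
    estimates.append(beta_mean(alpha, beta))

for step, estimate in enumerate(estimates):
    print(f"after {step} batch(es): estimated probability = {estimate:.3f}")
```

In this example the estimate starts at the initial guess of one half and moves less with each batch as the evidence accumulates.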

Getting back to Rumsfeld’s quote, Rumsfeld seems to be saying we can estimate the probability of the “known knowns” because they’ve happened before and we have ample data to support objective reasoning. These “known knowns” are Nassim Taleb’s White Swans. There are also “known unknowns”, or things that have never happened before but have entered our imaginations as possible events (Taleb’s Grey Swans). We still need probability to discern “the odds” of such an event (e.g. a dirty nuclear bomb in Los Angeles), so Bayes is helpful because we can infer subjective probabilities or “the possible value of unknowns” from similar situations tangential to our own predicament.

Lastly, there are “unknown unknowns”, or things we haven’t even dreamed about (Taleb’s Black Swan). Dr. Nassim Nicholas Taleb labels this “the fourth quadrant”, where probability theory has no answers. What’s an illustration of an “unknown unknown”? Dr. Taleb gives the example of the invention of the wheel, because no one had even thought or dreamed of a wheel until it was actually invented. The “unknown unknown” is unpredictable because, like the wheel, had it been conceived by someone, it would already have been invented.

Rumsfeld’s quote gives business managers a framework for thinking probabilistically. There are “known knowns” for which Frequentism works best, “known unknowns” for which Bayesian Inference is the best fit, and there is a realm of “unknown unknowns” where statistics falls short and where there can be no predictions. This area outside the boundary of statistics is the most dangerous area, says Dr. Taleb, because extreme events in this sector usually carry large impacts.

This column has attempted to provide a decision-making framework for how Frequentism, Bayes and Black Swans fit together, using Donald Rumsfeld’s quote.

What say you? Can you improve upon this framework?