The Lessons We Can Learn from Bad Data Mistakes Made Throughout History

There are many examples of bad data mistakes throughout history that have helped shape the world we live in today.

May 26, 2017

Bad data is costly. With data driving so many decisions in our lives, the cost of bad data affects us all, whether or not we realize it. IBM estimates that bad data costs the U.S. economy around $3.1 trillion each year. Most people who work with data know that bad data can be extremely costly, but this number is truly stunning. Most of the data that businesses analyze is about their customers, and a business that relies on bad data about them will struggle to succeed.

Additional research from Experian Data found that bad data has a direct impact on the bottom line of 88% of American companies, with the average company losing around 12% of its total revenue. These numbers paint a very real picture of the negative impact of bad data on our economy.

Beyond its financial cost, bad data also spreads misinformation. There are many examples of bad data mistakes throughout history that have helped shape the world we live in today.

A group of data analysts from Utopia, Inc. has curated a comprehensive list of examples in an infographic that shows how bad data mistakes have led to disastrous decisions that changed the course of history and the society we live in today. Let’s explore some of the more interesting examples from their list.

The 2016 United States Presidential Election

The most recent U.S. presidential election was mired in bad data. From the myriad polls and poll aggregators to the exalted political oracles at FiveThirtyEight and the New York Times, most pollsters and forecasters got this election wrong and predicted a Hillary Clinton victory, in many cases a landslide. That forecast obviously did not materialize. Many Democrats argued that this error caused a historic number of voters to stay home on Election Day.

This spread of bad data could have been reduced by using advanced statistics to analyze previous elections and by using machine learning to build “kitchen-sink” models based on voter rolls. This may sound complicated, but it is an established way to improve the underlying assumptions of the polls. These methods, however, are costly and time-intensive for most pollsters, who instead rely on online surveys and publicly available online Census data.

The Enron Scandal of 2001

Enron was once one of the largest and most powerful companies in the world. In the years leading up to its 2001 collapse, it was known for jaw-dropping executive compensation and a soaring stock price. However, the downfall of the company can be directly attributed to a host of fraudulent financial data.

From internal whistleblowers to the shredding of documents by Enron’s external auditors, there is little question that the data provided to shareholders was largely fictionalized. The data that Enron’s executives and their auditing firm delivered to shareholders and the Board of Directors in annual reports and financial statements proved to be false.

An ethical external auditing firm at Enron could have prevented this financial fraud from occurring. The Sarbanes-Oxley Act of 2002 was enacted following the Enron scandal to strengthen auditor independence, corporate responsibility, and financial disclosure, and to address conflicts of interest and overall public company oversight. Had this act been in place earlier, it might have prevented the Enron disaster.

Tetraethyllead in Gasoline in the 1920s

Added to gasoline in the 1920s to control knocking in engines, tetraethyllead contributed to over 5,000 fatalities in the United States alone. This was made possible in part by intentionally inconclusive tests led by the leaded gas industry and the willful deceit of the American government.

For decades, the lead paint and the leaded gas industries blamed each other for lead poisoning, both suggesting their products were safe for humans. Industry scientists even suggested the human body naturally harbors lead, so high levels shouldn’t be a health concern.

After the initial discovery of the potential threat from leaded gasoline, an independent study of its harmful effects should have been conducted. The U.S. Government and the gas industry both turned a blind eye and instead relied on bad data that cost many people their lives.

Christopher Columbus and the Discovery of the Americas

Even the discovery of the Americas was a result of bad data. Christopher Columbus made a few significant miscalculations when estimating the distance between Europe and Asia. First, he favored values given by the Persian geographer Alfraganus over the more accurate calculations of the Greek geographer Eratosthenes. Second, Columbus assumed Alfraganus was referring to Roman miles in his calculations when, in reality, he was referring to the longer Arabic mile. Reading the same figures in the shorter unit made the Earth, and the voyage to Asia, seem smaller than it is.

Columbus himself is to blame for the bad data. He could have relied on the more accurate of the two geographers’ calculations and verified that the units of measurement he was using were actually correct.
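
To see how much a unit mix-up like this can matter, here is a rough back-of-the-envelope sketch in Python. It is not taken from the infographic. The figure of 56⅔ miles per degree is the value usually attributed to Alfraganus, and the mile lengths (about 1,830 m for an Arabic mile and about 1,480 m for a Roman mile) are approximate, commonly cited values that historians still debate, so treat every constant and name below as an assumption chosen for illustration.

```python
# Illustrative only: all constants are approximate, commonly cited values
# and are assumptions made for this example.
ARABIC_MILE_M = 1830.0            # approximate length of an Arabic mile, in metres
ROMAN_MILE_M = 1480.0             # approximate length of a Roman mile, in metres
MILES_PER_DEGREE = 56 + 2 / 3     # figure usually attributed to Alfraganus


def circumference_km(miles_per_degree, metres_per_mile):
    """Circumference of the Earth implied by a length-per-degree estimate."""
    return miles_per_degree * 360 * metres_per_mile / 1000


# Same number of miles per degree, read in two different units.
intended_km = circumference_km(MILES_PER_DEGREE, ARABIC_MILE_M)
misread_km = circumference_km(MILES_PER_DEGREE, ROMAN_MILE_M)

# Fraction by which the Earth "shrinks" when the wrong unit is assumed.
shrinkage = 1 - misread_km / intended_km

print(f"Read as Arabic miles: {intended_km:,.0f} km")
print(f"Read as Roman miles:  {misread_km:,.0f} km")
print(f"Underestimate:        {shrinkage:.0%}")
```

Under these assumed values, reading the figure in Roman miles makes the Earth look roughly 19% smaller than Alfraganus intended. That is only the error from the unit mix-up, before any of Columbus's other assumptions are counted.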

The Lessons We Can Learn from Bad Data Mistakes in the Past

There are countless examples of bad data mistakes throughout the history of the world. Better data leads to better and more accurate decisions, while relying on bad data has negative effects on businesses and on society as a whole. Can you think of examples where bad data has affected your business or your personal life?
