When Ideology Reigns Over Data

Increasingly, the mantra of “let the data speak for themselves” is falling by the wayside while the promotion of ideology zooms down the fast lane. Reputations, companies and global economies are at risk when researchers and statisticians either see what they want to see despite the data or, worse, gently massage the data to get “the right results.”

Economist Thomas Piketty is in the news. After publishing his treatise “Capital in the Twenty-First Century,” Mr. Piketty was lauded by world leaders, fellow economists, and political commentators for bringing data and analysis to the perceived problem of growing income inequality.

In his book, Mr. Piketty posits that while wealth and income were grossly unequally distributed throughout the Industrial Revolution era, World Wars I and II changed the wealth dynamic as tax increases helped pay for war recovery and social safety nets. Piketty then claims his data show that, after the early 1970s, the top 1-10% of earners once again began taking more than their fair share. In Capital, Piketty’s prescriptions to remedy wealth inequality include an annual tax on capital and harsh taxation of up to 80% for the highest earners.

In this age of sharing and transparency, Mr. Piketty received acclaim for publishing his data sets and Excel spreadsheets for the entire world to see. However, this bold move could also prove to be his downfall.

The Financial Times, in a series of recent articles, claims that Piketty’s data and Excel spreadsheets don’t exactly line up with his conclusions. “The FT found mistakes and unexplained entries in his spreadsheet,” the paper reports. The articles also mention that a host of “transcription errors”, “incorrect formulas” and “cherry-picked” data mar an otherwise serious body of work.

Once all the above errors are corrected, the FT concludes: “There is little evidence in Professor Piketty’s original sources to bear out the thesis that an increasing share of total wealth is held by the richest few.” In other words, ouch!

Here’s part of the problem: while income data are somewhat hard to piece together, wealth data for the past 100 years are even harder to find because of data quality and collection issues. As such, the data are bound to be of dubious quality, incomplete, or both. In addition, it appears that Piketty could have used some friends to check and double-check his spreadsheet calculations and save him from the Ken Rogoff/Carmen Reinhart treatment.

In working with data, errors come with the territory, and hopefully they are minimal. There is a more serious issue for any data worker, however: seeing what you want to see, even if the evidence says otherwise.

For example, Nicolas Baverez, a French economist, raised issues with Piketty’s data collection approach and “biased interpretation” of those data long before the FT report. Furthermore, Baverez thinks that Piketty had a conclusion in mind before he analyzed the data. In the magazine Le Point, Baverez writes: “Thomas Piketty has chosen to place himself under the shadow of (Karl Marx), placing unlimited accumulation of capital in the center of his thinking”.

The point of this article is not to knock down Mr. Piketty or his lengthy, researched tome. Indeed, we should not be dismissive of Mr. Piketty’s larger message that there appears to be an increasing gap between the haves and have-nots, especially in terms of exorbitant CEO pay, stagnant middle-class wages, and a reduced safety net for the poorest Western citizens.

But Piketty appeared to have a solution in mind before he found a problem. He will readily admit: “I am in favor of wealth taxation.” When ideology drives any data-driven approach, it becomes just a little easier to discard data, observations and evidence that don’t exactly line up with what you’re trying to prove.

In 1977, statistician John W. Tukey said: “The greatest value of a picture is when it forces us to notice what we never expected to see.” Good science is the search for causes and explanations, free of dogma, and it requires a willingness to accept outcomes contrary to our initial hypothesis. If we want true knowledge discovery, there can be no other way.