The Big Data Counterfeiters

September 11, 2015

The arts and acts of counterfeiting have many manifestations, each with its peculiarities, motivations and modes.

There is the well-known counterfeiting of money, precious metals and art, and the less appreciated counterfeiting of government bonds, documents, certification, medication and consumer goods. There is also a hugely underappreciated sideline in the production, distribution and certification of counterfeit data, bogus information and knowledge-free knowledge.

The 2007 film The Counterfeiters tells the story of the Russian forger and Holocaust survivor, Salomon Sorowitsch, who was imprisoned in 1942 for forging baptismal certificates to save Jews from deportation. My favourite quote from this film is “Ich bin ich. Die anderen sind die anderen” (I am me. The others are the others). A counterfeiter? Yes… But for entirely justifiable reasons.

Writer Ben Best talks about magicians as being professionals in the art of deception. However, he makes it clear that “they make no secret of this fact — and because their objective is entertainment — there is no blame or moral condemnation of magicians.”

The Art Science Research Laboratory’s non-partisan journalism ethics program and news site, iMediaEthics, carried an interesting piece on the exhibition “Intent to Deceive: Fakes and Forgeries in the Art World”. In the piece, the writer compares the possibility of hoaxes in journalism with actual art hoaxes, pointing to the prevalence of checks and balances in traditional journalism and emphasizing the sharp contrast between the world of journalism and the world of art. However, what the piece failed to mention was the equally prevalent undercover hoax journalism and misleading advertising that gets palmed off as professional or peer opinion: for example, the sort of thing that appears from time to time in august online forums hosted by the likes of Forbes, ZDNet, LinkedIn Pulse, TDWI and Information Management, to name just a few.

If we accept prevailing views on deceit[1], we would view it as a misrepresentation of the truth, one that can be used for a number of ends, regardless of the mode of execution, whether absent-mindedly blasé, tendentiously innocent or materially destructive.

Lying, or explicit misrepresentation, is, according to RationalWiki, a subdivision of deceit. It’s comforting to note that they add a humanizing touch to their remarks by saying that the “manifestations of deceit may be unintentional or the result of logical fallacy and consequently it doesn’t necessarily mean that the deceiver intends to deceive”, although this may also be a possibility.

There are just so many ways of forging, faking, hoaxing, counterfeiting and robbing. It’s difficult even to know where to start.

Suppose I convinced my feeble, confused and weak-minded neighbours to part with their life-long savings to bet on a sure thing running in the Grand National or in the Dubai World Cup, when what I actually wanted to do was to take their money and bet it all on Arsicle Athletic beating Real Madrid in the Charity Shield. Should I then be surprised if concerned fellow citizens started to take note of my illicit activities?

If I were to claim that I had found the universal cure for cancer, all types of cancer, in all types of living organisms, and then went on to produce this life-giving miracle remedy in industrial quantities, in my kitchen, at home, I suspect that I would be the subject of public scrutiny.

If I were to arrange a mega-roadblock on a major transport artery and divert traffic along a road that led nowhere, nowhere other than to a deep ravine, an exhilarating downward journey and an ultimately mortal fate, then the police might want to ask me a few questions.

Now I am only guessing here, because I have never claimed to have had a cure for cancer, nor have I encouraged lemming-like drivers to commit the ultimate sacrifice, and I certainly wouldn’t encourage people to bet, never mind whether it’s on the horses or even on a hedge fund. That’s just not my style. But I suspect that any in-your-face deceptions along these lines might just get me, you or the vast majority of us into hot water.

Which leads me to Big Data.

Why do people strive to deceive us about Big Data? They aren’t saving lives, they aren’t entertaining and amusing children and grandparents with illusion, guile and dexterity, and they aren’t even betting their own house on the Big Data jackpot in the sky. So what’s up?

Face the facts: there is a lot written about Big Data on forums such as Forbes and LinkedIn, and at least 80% of it is bullshit, babble, shameless hype or vacuous marketing. But that’s just the surface detritus, the crap that is easy to spot. However, some nonsense is slightly more subtle.

Let’s take the example of fact-based experimentation. Running hypothesise-and-test cycles over sets of unstructured data is now being characterised as Data Science, which in turn is being conflated into a branch of Big Data, which is then being used to flog artefacts and amenities emanating from the munificent and amorphous Hadoop ecosphere: technologies and services that didn’t have anything to do with the original hypothesise-and-test exercise in the first place. This is simply ‘bait and switch’, and no matter how one spins it, it will always look like a species of fraud.

Indeed, the ‘profession’ of Data Scientist is being heralded as the dawn of the renaissance data person and the concomitant demise of the archaic and anachronistic hobby known as statistics, a diversion practised by those in the contented classes who take the quaint title of ‘statistician’.

Now, call me old-fashioned, and a cynic, but the idea that an exuberant, precocious and agile hacker, however motivated they may be, is a superior replacement for a seasoned and knowledgeable statistician is, to me at least, pure nonsense. Okay, it is pure bullshit. Indeed, it is worse than nonsense, because it’s a deliberately cynical way of undermining professionalism and integrity as something to aspire to, of corrupting even social and cultural advances, and of expanding alienation into the realm of the liberal professions, which up until fairly recently had experienced relatively little of the kind of profit-rate-centred violence and alienation that plagued working-class work.

So, the notion that whenever an expert is needed, an instant expert can be created, almost for free, is clearly a toxic, debilitating and corrupting trend, engendered and promoted by scoundrels.

In the Oscar-winning film The Wrong Trousers, the accidental hero, Wallace, thinking that they will make his life easier, becomes the innocent victim of a pair of automated ‘Techno Trousers’ that take control and carry him off against his will in directions and to places that he does not wish to go.

Wallace uses the ‘Techno Trousers’ thinking that they will increase his power and stride when he is out walking the dog (that’s Gromit, if you remember), but he is gravely mistaken.

Things get worse when the felonious penguin Feathers McGraw attempts to make Wallace, now locked into Techno Trouser mode, an unwilling accomplice to the theft of a diamond. Thanks largely to Gromit, the plan does not succeed, and Feathers ends up in jail. After his traumatic experiences, Wallace realises that the Trousers are not the valuable addition to his lifestyle that he once thought they were.

Which pretty much sums up what can be said about some of the vast quantities and qualities of ‘valuable’ Big Data flowing through the data lakes and digital drains of the corporate underworld. Some of it is as potentially harmful and risk-fraught as the ‘Techno Trousers’, and so it may not be the valuable addition to your corporate lifestyle that you initially took it to be.

On the other hand, your idol may be Feathers McGraw. Why not? Sometimes things just turn out that way. In which case the only sensible recommendation from me to you would be to be very careful with anything and everything related to Big Data, Data Science, Big Data technology and Big Data analytics. And, and this is a big ‘and’, never ever put on trousers that you don’t recognise.

I think I have said enough for now, and just to conclude, I will leave you with the words of Marcus Tullius Cicero: “True glory takes root, and even spreads; all false pretences, like flowers, fall to the ground; nor can any counterfeit last long.” Make of that what you will.


[1] RationalWiki