The digital universe is expanding at a staggering rate. Right now, there are about 4.4 zettabytes of data on the internet. For reference, a zettabyte is equivalent to 1000⁷ bytes (10²¹ bytes, or a trillion gigabytes).

According to the IDC, by 2020 there will be approximately 44 zettabytes of data. The increase is not only staggering but is expected to take place in a relatively short amount of time. Traditional means of handling digital information (i.e., human experts and specialized software) will no longer be enough.
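To put those figures side by side, here is a small arithmetic sketch. It uses only the two numbers quoted above and the decimal definition of a zettabyte; the variable names are chosen for illustration.

```python
# Decimal (SI) units: 1 zettabyte = 1000**7 bytes = 10**21 bytes.
ZETTABYTE = 1000 ** 7

current_zb = 4.4     # current estimate quoted in the text
projected_zb = 44    # IDC projection for 2020 quoted in the text

# How many times larger the projected figure is than the current one.
growth_factor = projected_zb / current_zb
current_bytes = current_zb * ZETTABYTE

print(f"1 ZB = {ZETTABYTE:,} bytes")
print(f"Current estimate: {current_bytes:.1e} bytes")
print(f"Projected growth: about {growth_factor:.0f}x")
```

In other words, the projection amounts to roughly a tenfold increase over the current figure.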

The next step in handling big data is artificial intelligence.

The Challenges of Big Data

Strictly speaking, big data refers to the vast amounts of information a business receives on a daily basis. The data is collected from a variety of sources, ranging from online financial transactions to social media pages. As such, the data tends to be unstructured.

For this data to be of any use, it has to be analyzed and structured in such a way as to reveal consumer trends, patterns, and preferences. With this information, companies can optimize their business strategies to better suit the needs and wants of their clients.
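As a rough illustration of what "structuring" can mean at the simplest level, the sketch below turns a handful of free-text records into a table of counts. The sample records and the list of tracked terms are hypothetical, and a hand-written rule like this is a stand-in for real analysis rather than an example of AI.

```python
import re
from collections import Counter

# Hypothetical unstructured inputs: social media posts and transaction notes.
raw_records = [
    "Loved the new running shoes, fast delivery!",
    "Order #1182: running shoes, size 42, paid by card",
    "Delivery was slow but the shoes are great",
    "Refund requested for headphones, delivery damaged",
]

# Hypothetical terms a business might want to track.
TRACKED_TERMS = {"shoes", "headphones", "delivery", "refund"}

def term_counts(records):
    """Count how many records mention each tracked term (case-insensitive)."""
    counts = Counter()
    for text in records:
        # Reduce each record to its set of lowercase words.
        words = set(re.findall(r"[a-z]+", text.lower()))
        counts.update(words & TRACKED_TERMS)
    return counts

counts = term_counts(raw_records)
for term, n in counts.most_common():
    print(f"{term}: mentioned in {n} of {len(raw_records)} records")
```

Even this toy version shows the limitation: someone has to decide in advance which terms matter, which does not scale once the volume and variety of sources grow.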

Right now, most of this data is analyzed by human beings, but with the predicted increase, it is unlikely that IT experts will be able to handle the workload. The challenges they face stem not only from the increase in volume but also from the growing number of possible sources.

Human operators will still be able to handle some of the information gathered online, but big data is only relevant if there is a lot of it and it comes from varied sources. If information is gathered from just a few sources, the pattern that emerges might only be useful insofar as it concerns the users of those particular sources.

Add to this the fact that data is moving at lightning speed, and will continue to do so as technology evolves, and it becomes apparent that humans will no longer be able to keep up the pace. Media consumption is growing at an alarming rate. Sense may dictate that we rely more on smart data, but certain behavioral patterns and crucial insights can only be drawn from heaps of big data.

Don’t worry: big data is not going to ‘disappear’. On the contrary, it will continue to expand. But if most of it goes unanalyzed and is wasted, it is essentially as if it were never there. This is where artificial intelligence comes into play.

How AI Can Handle Big Data

The dream of creating an AI that truly mimics human intelligence is no longer the main pursuit. Instead, scientists are working towards breaking down human behavior and finding ways to create AIs that can recreate those behaviors.

For AI to handle big data efficiently, it has to extract meaning from seemingly random bits of data. The difficulty is that the AI has to learn on the go, since it cannot be programmed to look for specific patterns. If it were, the whole process would be pointless.

Thus, AI systems have to be capable of interpreting vast amounts of data by themselves. To do that, they also have to be capable of contextualizing information.
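To make the idea of finding patterns without being told what to look for more concrete, here is a minimal sketch using k-means clustering. This is a classic, simple technique chosen purely for illustration, not the method used by any system mentioned in this article, and the purchase amounts are made-up sample data.

```python
def kmeans_1d(values, k, iterations=20):
    """Group numbers into k clusters using plain k-means (Lloyd's algorithm)."""
    # Start from k evenly spaced points of the sorted data (simple and deterministic).
    ordered = sorted(values)
    step = (len(ordered) - 1) / (k - 1) if k > 1 else 0
    centers = [ordered[round(i * step)] for i in range(k)]

    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # Centers stopped moving, so the grouping is stable.
        centers = new_centers
    return centers, clusters

# Hypothetical purchase amounts with no labels attached.
purchases = [12, 15, 14, 210, 198, 205, 13, 220]
centers, clusters = kmeans_1d(purchases, k=2)
print("Cluster centers:", centers)
print("Clusters:", clusters)
```

Nobody told the program that there are "small" and "large" purchases; the two groups emerge from the data itself. Real systems work with far more dimensions and far messier inputs, but the principle of letting structure emerge from the data is the same.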

Though this may seem like a distant pipe dream, AI has already made tremendous steps forward. There are already small-scale implementations, and the IoT is full of AI applications.

Paradoxically, since the quest for human-like intelligence was set aside, AI has become increasingly smart. Take AlphaGo, for instance. It was an AI designed to play Go, and it managed to defeat a world champion about ten years earlier than had been predicted. What is remarkable about this program is that it learned to play Go at such a level through training rather than explicit programming, and it did so much faster than expected.

The process which enabled AlphaGo to beat a human, a Go world champion no less, at his own game is called deep learning. Deep learning also helps make the Google search engine so efficient, and it enables facial and audio recognition programs as well. Deep learning means that an AI is fed information in a structured, hierarchical way, from concrete to abstract.

First Steps Towards Implementation

Google has already implemented deep learning into the AI governing search queries. The beauty of it is that not only will big data depend on AI to exist, but AI will also depend on big data to learn.

For deep learning to work, the AI has to be fed a lot of information. In 2012, Google managed to teach a network of some 1,000 computers to identify cats. It took about 10 million images taken from YouTube videos to do that.

There are already plenty of fields in which AI is being implemented, from surveillance systems to healthcare and online banking. The AI behind a program like Siri, which can adapt to the user’s voice and preferences, would have looked like science fiction a few decades ago, but it has become commonplace now. Or take Watson, a program developed by IBM, which won the top prize on Jeopardy! in 2011 with access to about 200 million pages of information, including everything on Wikipedia.

Predictions

Artificial intelligence has certainly proven in recent years that it is fully capable of handling large amounts of information. Moreover, without these smart programs, all of the data contained on the internet would be virtually useless.

When the World Wide Web was still taking shape, many thought it would be just a fad that would never catch on. An article published in Newsweek in 1995 claimed that “no online database will replace your daily newspaper.” The major complaint of the author, Clifford Stoll, was that the internet was just a jumble of unstructured information, and that it took a lot of time to find a simple answer.

While Stoll’s critiques might seem endearingly misguided to us now, his complaints were fair. Without the power of a search engine at hand to help sort and sift through all of the data, there was nothing of use there, just random content.

So far, big data has mostly been concerned with what happens online, but with the emergence of the Internet of Things, AI can now tap into information that doesn’t appear online. Apart from geographical location, it can use sensors and other devices to look into things happening on location that users might not be aware of.

This information will expand our concept of big data to such a degree that what we understand as big data right now will likely feel like a drop in the ocean.

And there is still room for improvement in the years to come. As the amount of data on the web increases, we have to be prepared to use it. Information is only useful if you can learn something from it.
