The Dark Matter of Data

June 5, 2012

“I think there’s a sense that many of us have that the great age of exploration on earth is over. That for the next generation they’re going to have to go to outer space or the deepest oceans to find something significant to explore. But is that really the case?” — Dr. Nathan Wolfe


I recently watched a TED video by Dr. Nathan Wolfe, founder and CEO of the Global Viral Forecasting Initiative. I was attracted by the title, “What’s Left to Explore?” In this video, Wolfe explains that about 20% of the genetic information in the human nose (which can be obtained through a nasal swab) doesn’t match anything that we’ve ever seen before: no plant, animal, fungus, virus, or bacterium. The same is true of 40% to 50% of the genetic information in the human gut.

Wolfe refers to these unknowns as “biological dark matter.” According to Wikipedia today, dark matter is “an unknown type of matter hypothesized to account for a large part of the total mass in the universe.” Biological dark matter is genetic matter that can’t be typed or matched with anything we’ve seen before. 

In biological dark matter, Wolfe sees an exciting possibility: identifying an entirely new class of life (much as Dutch scientist Martinus Beijerinck identified the concept of a virus at the end of the 19th century). Such a discovery may enable us to identify the cause of a cancer or the source of an outbreak, or to create a new tool in molecular biology.

The same holds true for business data. One might be tempted to think, “We’ve got data. Lots of it. We’ve got Big Data. What’s left to explore?” Sure, you’ve got data. But it’s in myriad different systems and in an endless number of formats. What might you discover if you brought seemingly unrelated data from these multiple source systems into one QlikView app, where the associations (and lack thereof) in the data would become visible, perhaps for the very first time?

A QlikView app automatically maintains all the associations in the data and calculates aggregations on the fly. The result? Through an associative experience, users can now explore the dark matter of data. They can click or tap away, or lasso sections of charts or regions of maps, and at all times they can see which data is associated with their selections and which is not.

Users make a selection in one chart and all other charts and graphs in the entire app update to reflect that selection — with no hard coding. Users think of a second question, and a third one, and make more selections, and all the charts and graphs in the app update again instantly. It’s through this associative experience that users can easily see, in a visual way, any outliers in the data or any unexpected relationships. (For more info see the QlikView White Paper, What Makes QlikView Unique.)
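For readers who think in code, the associative idea described above can be sketched in a few lines of Python. This is only an illustration of the concept, not how QlikView is implemented: the sample records, the field names, and the `associate` function are all hypothetical. Given a set of selections, the sketch splits every field's values into those still associated with the selection and those excluded by it.

```python
from collections import defaultdict

# Hypothetical records, as if combined from several source systems.
ROWS = [
    {"region": "North", "product": "Widget", "rep": "Ana"},
    {"region": "North", "product": "Gadget", "rep": "Ben"},
    {"region": "South", "product": "Widget", "rep": "Cy"},
    {"region": "West", "product": "Gizmo", "rep": "Ana"},
]


def associate(rows, selections):
    """Split each field's values into associated vs. excluded.

    selections maps a field name to the set of values selected in it.
    """
    # Rows that satisfy every current selection.
    matching = [
        row for row in rows
        if all(row[field] in values for field, values in selections.items())
    ]

    all_values = defaultdict(set)
    associated = defaultdict(set)
    for row in rows:
        for field, value in row.items():
            all_values[field].add(value)
    for row in matching:
        for field, value in row.items():
            associated[field].add(value)

    return {
        field: {
            "associated": sorted(associated[field]),
            "excluded": sorted(all_values[field] - associated[field]),
        }
        for field in all_values
    }


# First question: what goes with the North region?
result = associate(ROWS, {"region": {"North"}})
print(result["product"])
print(result["rep"])

# Second question: narrow further to one sales rep.
narrowed = associate(ROWS, {"region": {"North"}, "rep": {"Ana"}})
print(narrowed["product"])
```

The "excluded" lists are the interesting part: they are the values that have no relationship to the current selection, which is exactly the kind of gap a fixed report would never show.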

Dr. Nathan Wolfe offers a great lesson for would-be explorers and it applies to explorers of data as well as biology and genetics: “Don’t assume that what we currently think is out there is the full story. Go after the dark matter in whatever field you choose to explore. There are unknowns all around us and they are just waiting to be discovered.”