Harvesting Data: What Is the Mood in the World?

August 27, 2011

Every day, we create 2.5 quintillion bytes of data.* This data comes from everywhere: posts to social media sites, digital pictures and videos posted online, and cell phone GPS signals, to name a few. The amount of data in our world has been exploding, and analyzing large data sets, so-called “big data,” is becoming a key basis of competition and innovation. The question is: How are we going to harvest all this data? Traditional BI is too clumsy to get the job done. Why? Big data is time-sensitive; business users and developers do not have months to spend documenting and coding the analysis requirements. Big data also has a lot of variety; it comes from both structured and unstructured sources.

QlikView is the perfect fit for analyzing big data. To illustrate my point, I created a QlikView application that analyzes human feelings all over the world. Every day, millions of blog posts are written. People blog about technology, politics, health, and more, and they talk about their feelings. I wondered whether I could scan all of these blogs and analyze human feelings around the world.

I found an API (We Feel Fine) that has been harvesting information about human feelings from a large number of blogs since 2005. Every few minutes, the system searches the world’s newly posted blog entries for occurrences of the phrases “I feel” and “I am feeling” and stores 15,000 to 20,000 new emotions per day. I used the API to extract the data (in QlikView, developers can define web files as a data source). Then I started asking questions and exploring this unstructured data.
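To make the phrase search described above concrete, here is a minimal sketch in Python. It is not the We Feel Fine implementation, which I have not seen; the function name, the regular expressions, and the naive sentence splitting are all my own assumptions for illustration.

```python
import re

# The two phrases come from the description above; the regex is an assumption.
FEELING_PATTERN = re.compile(r"\bI (?:feel|am feeling)\b", re.IGNORECASE)

# Naive sentence splitter (assumption): break after ., ! or ? followed by whitespace.
SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def extract_feeling_sentences(text):
    """Return the sentences in `text` that contain "I feel" or "I am feeling"."""
    sentences = SENTENCE_SPLIT.split(text.strip())
    return [s for s in sentences if FEELING_PATTERN.search(s)]


if __name__ == "__main__":
    # Hypothetical blog text, written only for this example.
    post = (
        "The weather turned today. I feel happy about the rain! "
        "We went out anyway. I am feeling tired now."
    )
    for sentence in extract_feeling_sentences(post):
        print(sentence)
```

A real harvester would also need to fetch new posts, handle abbreviations and other sentence boundaries, and map each sentence to a named feeling; none of that is attempted here.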

Do women feel fat more often than men? Does rainy weather affect how we feel? What are the happiest cities in the world? Do Europeans feel sad more often than Americans? You can download my application from QlikCommunity to ask your own questions and form your own insights about the human condition.
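Outside QlikView, a question like "does one group report a feeling more often than another?" comes down to a grouped ratio. The Python sketch below shows one way that comparison could be computed. The record layout (dictionaries with "feeling" and "gender" keys) and the function name are hypothetical, and the sample records are made-up placeholders, not real We Feel Fine data.

```python
from collections import Counter


def feeling_share_by_group(records, feeling, group_key):
    """For each value of `group_key`, return the fraction of records
    whose "feeling" field equals `feeling`."""
    totals = Counter()
    matches = Counter()
    for record in records:
        group = record.get(group_key)
        if group is None:
            continue  # skip records that lack the grouping field
        totals[group] += 1
        if record.get("feeling") == feeling:
            matches[group] += 1
    return {group: matches[group] / totals[group] for group in totals}


if __name__ == "__main__":
    # Placeholder records for illustration only (not real data).
    records = [
        {"feeling": "sad", "gender": "female"},
        {"feeling": "happy", "gender": "female"},
        {"feeling": "sad", "gender": "male"},
        {"feeling": "sad"},  # no gender recorded, so it is ignored
    ]
    print(feeling_share_by_group(records, "sad", "gender"))
```

Comparing shares rather than raw counts matters here, because one group may simply blog more than another.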

 

[Figure: World Mood.png]

 

QlikView provides developers with a complete set of tools for managing data extraction and transformation, all in one product. It can extract data from both structured and unstructured sources, and it automatically creates associations in the data. Because QlikView operates entirely in memory, it does not require data to be stored in specific, aggregated formats. Once the data is loaded, users can start exploring it right away, creating charts and answering questions without waiting. These are some of the features that make QlikView the perfect fit for exploring big data. By the way, how am I feeling? I am feeling like Qliking!

 

* McKinsey Global Institute – Big data: The next frontier for innovation, competition, and productivity