Data Variety: What It’s All About

Ling Zhang
Last updated: 2013/05/14 at 10:57 PM

Data variety stands out among the three Vs of big data (volume, velocity and variety) in the report of the big data survey conducted by NewVantage Partners in 2012. One of the survey results shows that companies are focusing more on data variety than on data volume, both now and over the next three years. The report does not explain why data variety turns out to be such a salient attribute while big data platforms like Hadoop focus more on solving the data volume problem. Of course, data variety also contributes to data volume.

Data variety resembles the diversity of species in the natural world: it demonstrates the richness of information. When exploring data variety, it is wise to listen patiently to the different voices describing customers' needs. We must believe that customers have put their voices and problems into the data we collect. As end users, some of them may not know what they want, but they express it in the form of problems hidden within the data's variety and volume. Once you adopt this mindset, you will be amazed by what you find.

Data Variety Is Not Only About Data Sources, Types and Structures

When people talk about data variety, they most often mean multiple or diverse data sources and varying data types, structures and formats: structured, semi-structured or unstructured data such as text, images and videos. Beyond these common kinds of variety, the contextual information around data, the methods used to create and gather it, and its high dimensionality should also be considered data variety. These can be counted as the objective, or physical, elements of data variety.

Besides its objective nature, data variety also has a subjective nature that people usually miss or ignore. By subjective variety I mean the interpretation of data, or the insight drawn from it, from different perspectives and by different entities such as people, groups and businesses, together with their corresponding usages or applications. These factors drive how data is analyzed, mined, integrated and used, and how results are explained. Subjective variety matters as much as objective variety, and I believe it will in turn drive more objective data variety.

The Curse and Challenge of Data Variety

First, data variety brings challenges to data processing, analysis, mining and modeling because data is not in a uniform or standard form. For example, a person's name may appear in several variant forms. Mining any insight from user data requires intensive preprocessing, including cleansing, normalization and standardization, handling missing values and correcting errors. Otherwise, the resulting model will lack accuracy or the business will make wrong decisions.
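As a rough illustration of the preprocessing described above, the following sketch normalizes variant spellings of a person's name and flags missing values. The function name `normalize_name`, the sample names and the specific rules (accent stripping, reordering "Last, First", lowercasing) are assumptions chosen for the example, not a prescribed method.

```python
import re
import unicodedata

def normalize_name(raw):
    """Return a canonical 'first last' form, or None for a missing value."""
    if raw is None or not raw.strip():
        return None
    # Strip accents so that accented and unaccented spellings compare equal.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Reorder "Last, First" into "First Last".
    if "," in text:
        last, first = text.split(",", 1)
        text = f"{first} {last}"
    # Replace punctuation with spaces, collapse whitespace, lowercase.
    text = re.sub(r"[^\w\s]", " ", text)
    return " ".join(text.lower().split())

# Hypothetical variants of one name, plus two kinds of missing value.
variants = ["Lopez, Maria", "  maria   LOPEZ ", "María López", None, ""]
cleaned = [normalize_name(v) for v in variants]
```

Real name data usually needs more than this (titles, middle names, transliteration), so the rules here should be read as a starting point only.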

Second, data variety challenges relational databases in design, storage and maintenance. NoSQL databases have become a main trend for big data storage because they make it easy to add or remove data elements. However, the convenience at the storage layer brings new challenges at the query layer. With a structured database, an analyst can easily run all kinds of queries or reports by slicing and dicing data dynamically and quickly; the same is not true of a NoSQL database.
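The trade-off between easy storage and harder querying can be sketched with plain Python dictionaries standing in for schema-free documents. The records, field names and the `total_by` helper are hypothetical; no particular NoSQL product or API is implied.

```python
from collections import defaultdict

# Hypothetical schema-free records: each document carries its own fields.
documents = [
    {"user": "u1", "country": "US", "spend": 20.0},
    {"user": "u2", "country": "DE"},                   # no "spend" field
    {"user": "u3", "spend": 5.0, "device": "mobile"},  # no "country" field
    {"user": "u4", "country": "US", "spend": 12.5},
]

def total_by(docs, key, measure):
    """Sum `measure` grouped by `key`, tolerating absent fields."""
    totals = defaultdict(float)
    for doc in docs:
        group = doc.get(key, "unknown")          # missing dimension gets its own bucket
        totals[group] += doc.get(measure, 0.0)   # missing measure counts as zero
    return dict(totals)

spend_by_country = total_by(documents, "country", "spend")
```

Adding the `device` field required no schema change, but the grouping code has to decide what a missing `country` or `spend` means. With a fixed relational schema, that decision would have been made once at design time.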

Third, data variety breaks the links between, and the wholeness of, entities, records and content. A user's Facebook account may look totally different from his or her account on LinkedIn, G+ or YouTube. The same person may convey similar messages in different forms: text, audio or video. From a raw collection, it is hard to know whether the messages come from a single person, and hard to tell whether different friends and hobbies belong to the same person. Facebook gives you one perspective of a person, LinkedIn gives you another, and on YouTube you hear something different, yet they all come from the same person. The same applies to products and services: they can be discussed by the same group of users at different places and times and from different perspectives. We need to find these hidden links for the right reasons, but we cannot follow them directly the way we follow linked web pages. That is the challenge!

The Blessing and the Value Proposition of Data Variety

If data volume lets us discover common patterns and trends that reflect overall popularity or publicity, then data variety is where deeper relationships and 360-degree views of entities are most likely to be found.

Bear in mind that we live in a world of paradox; paradox is the foundation of all beings. Where there is a curse, there is a blessing; where there is a problem or challenge, there is an opportunity. The bigger the curse, the bigger the blessing, and the bigger the challenge, the bigger the opportunity. Every opportunity comes wrapped in the form of a problem. Without a problem, there is no opportunity at all!

If you are a forward-thinking leader, you must have already realized that some of the curses or challenges above have already turned into blessings, while other blessings are still on their way to fullness. Many data preprocessing technologies are on the market to improve data quality. Record linkage technology is used to integrate content, resolve entity identity problems and remove duplicates. As data variety becomes richer, more advanced technologies are needed to provide innovative solutions to all of these challenges. Data variety is calling for innovative solutions now more than ever.
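To make the idea of linking records concrete, here is a minimal sketch that pairs profiles from different sites when their names and cities look alike. It uses fuzzy string matching from Python's standard `difflib` module as a stand-in for a real linkage system; the profiles, the `link` function and the 0.85 threshold are all assumptions for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical profiles that may or may not belong to the same person.
profiles = [
    {"site": "facebook", "name": "Maria Lopez", "city": "Austin"},
    {"site": "linkedin", "name": "Maria  Lopes", "city": "austin"},
    {"site": "youtube", "name": "Bob Tanaka", "city": "Osaka"},
]

def similarity(a, b):
    """String similarity in [0, 1] after lowercasing and whitespace cleanup."""
    def norm(s):
        return " ".join(s.lower().split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

def link(records, threshold=0.85):
    """Return index pairs whose name and city are both similar enough."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            a, b = records[i], records[j]
            if (similarity(a["name"], b["name"]) >= threshold
                    and similarity(a["city"], b["city"]) >= threshold):
                pairs.append((i, j))
    return pairs

matches = link(profiles)
```

Comparing every pair is quadratic in the number of records, so this only suits small collections. It also ignores the stronger evidence, such as shared contacts or email addresses, that a production approach would likely use.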

Below is a list of possible ways to rely on data variety to create new opportunities:

– Create an entity portfolio: combine different facets across space and time to build a horizontal or vertical view of an entity. Think about any possible entity, not only people or organizations.

– Build relationships between entities, or between dimensions of the same entity, such as the relationships among an entity and its contextual information (where, what, who, when), interests and friends, brands and products, etc.

– Deliver multiple messages from the different perspectives of crowd entities and dimensions.

– Reinforce the same messages repeatedly across different channels, resources or time periods.

– Reveal root causes of a specific problem and explore users' deeper intent.

– Support real-time applications like advertising: based on where a user appears and clicks a link, combined with related specific information, contextual ads can be delivered in real time according to the user's intent and interest as inferred from the various data elements.
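The first item in the list, the entity portfolio, might look like the following sketch once records have already been resolved to a single entity. The observations, field names and the `build_portfolio` helper are hypothetical, and the merge rule (collect every value in time order) is only one of several reasonable choices.

```python
from collections import defaultdict

# Hypothetical observations about one resolved entity from several sources.
observations = [
    {"entity": "e1", "source": "facebook", "when": "2013-05-01", "interest": "sailing"},
    {"entity": "e1", "source": "linkedin", "when": "2013-05-03", "employer": "Acme"},
    {"entity": "e1", "source": "youtube", "when": "2013-05-02", "interest": "cooking"},
]

def build_portfolio(rows):
    """Merge per-source observations into one multi-valued view per entity."""
    portfolios = defaultdict(lambda: defaultdict(list))
    # ISO dates sort correctly as strings, so this keeps values in time order.
    for row in sorted(rows, key=lambda r: r["when"]):
        for field, value in row.items():
            if field not in ("entity", "when"):
                portfolios[row["entity"]][field].append(value)
    return {eid: dict(fields) for eid, fields in portfolios.items()}

portfolio = build_portfolio(observations)
```

Keeping every value instead of overwriting preserves the different perspectives each source offers, which is the point of exploiting variety.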

That said, the curse and the blessing go hand in hand, and a problem is usually the shadow cast by the light of an opportunity: follow the shadow and you will find the opportunity. Because of my own limitations, I believe the above list is incomplete. Please add your own items to the list, or share your problems and challenges in handling data variety. Explore ways to enrich data variety and let it enrich your business by discovering the golden opportunities behind it. I believe that in the future it will be possible to know a user much better than he knows himself, simply by taking advantage of data variety. Data variety is the next winning star of big data, as it holds the promise of a blooming business.
