
7 Signs You’re Dealing with Complex Data

Eran Levy
Last updated: 2015/10/20 at 6:49 AM

We talk a lot about complex data and the challenges and opportunities it poses for your business intelligence. But what makes data complex? And how can you tell if your organization’s current data can be considered “complex”, or will be so in the near future? This post will address these questions.

Contents
  • Why does this matter?
  • The simple test: big or disparate data
  • 7 factors to determine your data’s complexity
    1. Structure
    2. Size
    3. Detail
    4. Query language
    5. Data type
    6. Dispersed data
    7. Growth rate
  • How to handle complex data?

Why does this matter?

The complexity of your data indicates how difficult it will be to translate into business value: complex data is typically harder to prepare and analyze than simple data, often requires a different set of BI tools, and needs additional modeling and preparation work before it is “ripe” for analysis and visualization. It is therefore important to understand how complex your data is now, and how complex it might become, to assess whether your business intelligence project will be up to the task.

The simple test: big or disparate data

In high-level terms, there are two basic indications that your data might be considered complex:

  • Your data is “big”: We’ve put the word “big” in quotation marks because of the seemingly infinite meanings of the term “big data”. The fact remains, however, that larger amounts of data pose a challenge both in the computational resources needed to process massive datasets and in the difficulty of separating the wheat from the chaff, i.e. distinguishing signal from noise amid a huge deposit of raw information.
  • Your data comes from many disparate sources: Multiple data sources often mean messy data, or simply multiple datasets that each follow their own internal logic or structure. The data must therefore be transformed or consolidated into a central repository to ensure all your sources speak the same language.

These could be considered the two (alternative) initial warning signs: if you’re dealing with big or disparate data, you should begin to think of your data as complex. To delve a bit deeper, here are seven more specific indicators of your organization’s data complexity, which are in effect a more detailed version of the two signs above.

(Note that these factors overlap and certainly do not exclude one another; on the contrary, dispersed data, for example, often means a variety of data structures and types.)

7 factors to determine your data’s complexity

[Figure: What makes data complex? Source: Demystifying Data Modeling (webinar)]

1. Structure

Data from different sources, or even from different tables within the same source, can refer to the same information but be structured entirely differently. Imagine, for example, that your HR department keeps three different spreadsheets: one for employees’ personal details, another for their roles and salaries, and a third for their qualifications, while your finance department records the same information in a single table, along with insurance, benefits and other costs. To complicate matters further, some of these tables might refer to employees by their full name, others by initials, or some combination of the two.

Using data from all these different tables efficiently, without losing or duplicating information, requires data modeling and preparation work. And this is the simplest case: working with unstructured data sources (such as NoSQL databases) can further complicate matters, as these initially have no schema in place.
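To make this concrete, here is a minimal sketch of that preparation work in Python with pandas (one possible toolset); the file names and column layouts are invented for illustration and are not taken from any particular system:

```python
import pandas as pd

# Hypothetical HR spreadsheets, each structured differently (illustrative only).
personal = pd.read_excel("hr_personal.xlsx")      # employee_id, full_name, dob
roles = pd.read_excel("hr_roles.xlsx")            # emp_id, role, salary
quals = pd.read_excel("hr_qualifications.xlsx")   # id, qualification (one row per qualification)

# Normalize the join key so the tables "speak the same language".
personal = personal.rename(columns={"employee_id": "emp_id"})
quals = quals.rename(columns={"id": "emp_id"})

# Collapse the one-to-many qualifications table so the final merge
# neither loses nor duplicates employees.
quals_per_emp = (
    quals.groupby("emp_id")["qualification"]
         .apply("; ".join)
         .reset_index()
)

employees = (
    personal.merge(roles, on="emp_id", how="left")
            .merge(quals_per_emp, on="emp_id", how="left")
)
```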

2. Size

Returning again to the murky concept of “big data”, the amount of data you collect affects the software and hardware you need to analyze it. This can be measured in raw size (gigabytes, terabytes or petabytes): the larger the data grows, the more likely it is to “choke” popular in-memory databases that rely on shifting compressed data into your server’s RAM. Additional considerations include tall data, i.e. tables with many rows (Excel, arguably the most commonly used data analysis tool, is limited to 1,048,576 rows), and wide data, i.e. tables with many columns. You’ll find that the tools and methods needed to analyze 100,000 rows are significantly different from those needed to analyze 1 billion.
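As a rough illustration of the size problem, the sketch below aggregates a file in chunks rather than loading it into memory at once; the file name and the "region" and "amount" columns are assumptions made for illustration:

```python
import pandas as pd

# Stream a large CSV in chunks and keep only a small running aggregate in memory.
totals = {}

for chunk in pd.read_csv("orders.csv", chunksize=1_000_000):
    partial = chunk.groupby("region")["amount"].sum()
    for region, amount in partial.items():
        totals[region] = totals.get(region, 0.0) + amount

print(totals)
```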

3. Detail

This is the level of granularity at which you wish to explore the data. When creating a dashboard or report, presenting summarized or aggregated data is often easier than giving end users the ability to drill into every last detail, but that tradeoff limits the possible depth of analysis and data discovery. Building a BI system that enables granular drill-downs means processing larger amounts of data on an ad-hoc basis, without relying on predefined queries, aggregations or summary tables.
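The difference between the two levels of detail can be sketched as follows, again in pandas with an invented sales table; the aggregated view is prepared in advance, while the drill-down is computed ad hoc from the raw rows:

```python
import pandas as pd

# Illustrative sales data; the file and its columns are assumptions.
sales = pd.read_csv("sales.csv", parse_dates=["order_date"])

# Aggregated view: cheap to serve in a dashboard, but fixed in advance.
monthly_by_region = (
    sales.groupby([sales["order_date"].dt.to_period("M"), "region"])["amount"]
         .sum()
)

# Granular drill-down: answered ad hoc from the raw rows a user clicked into.
drill_down = sales[
    (sales["region"] == "EMEA")
    & (sales["order_date"].dt.to_period("M") == pd.Period("2015-09", freq="M"))
]
```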

4. Query language

Different data sources speak different languages. SQL is the primary means of extracting data from common sources and relational databases, but when you use a third-party platform you will often need to connect to it via its own API and syntax, and to understand the internal data model and protocols used to access its data. Your BI tools need to be flexible enough to allow this type of native connectivity, either via built-in connectors or API access; otherwise you will find yourself repeating a cumbersome process of exporting the data to a spreadsheet, SQL database or data warehouse, and only then pulling it into your business intelligence software.
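Here is a hedged sketch of the same analysis touching two “languages”: one source answers SQL directly, while a hypothetical third-party platform is reached through its own REST API. The database file, endpoint, token and response shape below are made up for illustration:

```python
import sqlite3
import pandas as pd
import requests

# A relational source speaks SQL.
conn = sqlite3.connect("warehouse.db")
orders = pd.read_sql_query(
    "SELECT customer_id, SUM(amount) AS revenue FROM orders GROUP BY customer_id",
    conn,
)

# A third-party platform speaks its own API and data model.
resp = requests.get(
    "https://api.example-crm.com/v2/customers",        # hypothetical endpoint
    headers={"Authorization": "Bearer <token>"},
    params={"fields": "id,name,segment"},
)
customers = pd.json_normalize(resp.json()["results"])  # assumed response shape

# Only once both are in a common tabular shape can they be analyzed together.
report = orders.merge(customers, left_on="customer_id", right_on="id", how="left")
```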

5. Data type

Working with mostly numeric, operational data stored in tabular form is one thing; massive and unstructured machine data is another thing entirely, as is a text-heavy dataset stored in MongoDB, not to mention video and audio recordings. Different types of data play by different rules, and finding a way to forge a single source of truth across all of them is essential if you want to base your business decisions on an integrated view of all your organization’s data.
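For example, a text-heavy, document-shaped dataset has to be flattened before it can sit next to tabular, numeric data. The sketch below uses an invented support-ticket document shape, roughly as it might come out of a document store such as MongoDB:

```python
import pandas as pd

# Illustrative documents; the shape is invented for this example.
documents = [
    {"ticket_id": 1, "customer": {"id": 42, "tier": "gold"},
     "text": "Dashboard fails to load after the latest update.", "tags": ["bug", "ui"]},
    {"ticket_id": 2, "customer": {"id": 57, "tier": "silver"},
     "text": "How do I export a report to PDF?", "tags": ["question"]},
]

tickets = pd.json_normalize(documents)               # nested fields become columns
tickets["tags"] = tickets["tags"].str.join(", ")     # lists don't fit a flat table
tickets["text_length"] = tickets["text"].str.len()   # a numeric feature derived from text
```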

6. Dispersed data

This is data stored in multiple locations: different departments within the organization, on-premises systems and the cloud (either in purchased storage or via cloud applications), external data originating from clients or suppliers, and so on. Such data is more difficult to gather, simply because of the number of stakeholders who need to be involved to receive it in a timely and effective manner. Once gathered, it will typically require some cleaning or standardization before the various datasets can be cross-referenced and analyzed, since each local dataset is collected according to the practices and focus of the relevant organization or application.
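A small sketch of that standardization step, with two invented departmental extracts that disagree on column names, casing and date formats:

```python
import pandas as pd

# Illustrative extracts from two departments; names and values are assumptions.
sales_eu = pd.DataFrame({
    "Customer Name": ["ACME GMBH", "Contoso Ltd"],
    "Signed": ["01/03/2015", "15/07/2015"],          # day-first dates
})
sales_us = pd.DataFrame({
    "customer": ["Acme GmbH", "Globex Corp"],
    "signed_date": ["2015-03-01", "2015-08-20"],     # ISO dates
})

def standardize(df, name_col, date_col, dayfirst=False):
    # Map each department's layout onto one shared schema.
    out = pd.DataFrame()
    out["customer"] = df[name_col].str.strip().str.title()
    out["signed_date"] = pd.to_datetime(df[date_col], dayfirst=dayfirst)
    return out

customers = pd.concat(
    [standardize(sales_eu, "Customer Name", "Signed", dayfirst=True),
     standardize(sales_us, "customer", "signed_date")],
    ignore_index=True,
).drop_duplicates(subset="customer")
```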

7. Growth rate

Finally, you need to consider not only your current data, but also the speed at which it is growing or changing. If data sources are updated frequently, or new data sources are added frequently, this can tax your hardware and software resources (less advanced systems need to re-ingest the entire dataset from scratch whenever significant changes are made to the source data), as well as multiply the issues around structure, type and size mentioned above.
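One common way to keep up with fast-growing sources is incremental loading: ingest only what changed since the last run instead of re-reading everything. The sketch below assumes a source CSV with an "updated_at" column and a local parquet file as the target (a parquet engine such as pyarrow is needed); all names are illustrative:

```python
import pandas as pd

def incremental_load(source_csv: str, target_parquet: str) -> int:
    # Find the high-water mark of what has already been loaded.
    try:
        existing = pd.read_parquet(target_parquet)
        watermark = existing["updated_at"].max()
    except FileNotFoundError:                 # first run: nothing loaded yet
        existing, watermark = pd.DataFrame(), pd.Timestamp.min

    # Pull only rows newer than the watermark and append them to the target.
    source = pd.read_csv(source_csv, parse_dates=["updated_at"])
    new_rows = source[source["updated_at"] > watermark]

    pd.concat([existing, new_rows], ignore_index=True).to_parquet(target_parquet, index=False)
    return len(new_rows)
```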

How to handle complex data?

If you identify with one or more of the above and think your data might just be complex, don’t despair: understanding is the first step towards finding an appropriate solution, and analyzing complex data doesn’t have to be overly complicated in itself. We’ll be covering ways to tackle complex data in future posts, but the first thing you might want to ask yourself is — how many BI systems will you actually need to get a grip on your complex data?
