SmartData Collective
© 2008-25 SmartData Collective. All Rights Reserved.

How to Balance the Five Analytic Dimensions

Damian Mingle

So many data scientists select an analytic technique in hopes of achieving a magical solution, but in the end the solution may simply not be possible because of other limiting factors. It is important for organizations working with analytic capabilities to understand the implementation constraints that most real-world applications will encounter. When developing a solution, one has to consider five dimensions: data complexity, speed, analytic complexity, accuracy & precision, and data size. Neither Data Scientists nor the organizations they work for will be able to be the best in every category simultaneously, so it is necessary to understand the trade-offs among them.

Data Complexity

It is important to know as much as possible about the data. In practice, this means understanding the data types, formal complexity measures (such as measures of overlap and linear separability), the number of dimensions/columns, and the linkages between data sets. For example, one must be able to link healthcare remittances to paid claims that come in all flavors (fully paid, partially paid, and denied) over long periods of time. These linkages can be extremely complex.
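As a rough illustration of the linkage problem described above, the sketch below sums remittances per claim and labels each claim. The field names (`claim_id`, `billed`, `paid`) and all amounts are hypothetical, and treating a claim with no payment as denied is a simplifying assumption; real remittance data would also need dates, adjustments, and reversals.

```python
from collections import defaultdict

# Hypothetical records; field names and amounts are illustrative assumptions.
claims = [
    {"claim_id": "C1", "billed": 200.0},
    {"claim_id": "C2", "billed": 150.0},
    {"claim_id": "C3", "billed": 90.0},
]
remittances = [
    {"claim_id": "C1", "paid": 120.0},
    {"claim_id": "C1", "paid": 80.0},  # a second payment arriving later
    {"claim_id": "C2", "paid": 50.0},
]


def link_claims(claims, remittances):
    """Sum all remittances per claim and label each claim's payment status."""
    paid_by_claim = defaultdict(float)
    for r in remittances:
        paid_by_claim[r["claim_id"]] += r["paid"]

    linked = {}
    for c in claims:
        paid = paid_by_claim.get(c["claim_id"], 0.0)
        if paid <= 0:
            status = "denied"  # simplification: no payment is treated as a denial
        elif paid < c["billed"]:
            status = "partially paid"
        else:
            status = "fully paid"
        linked[c["claim_id"]] = {"billed": c["billed"], "paid": paid, "status": status}
    return linked


statuses = link_claims(claims, remittances)
```

Even in this toy version, one claim can map to several remittances, which is part of what makes the real linkages hard.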

Speed

The speed at which an analytic outcome must be produced (e.g. near real-time, hourly, daily), as well as the time it takes to develop and implement the analytic solution, is another key consideration. This dimension causes a lot of angst for most Data Scientists, primarily because they generally want to come up with an optimal solution regardless of time. However, we can all agree that if an enterprise needs to deploy new predictions every 15 minutes but it takes 1.5 hours to retrain the algorithm, the solution will not be successful.
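The 15-minute example above comes down to simple arithmetic, sketched here. The function names and the separate `scoring_minutes` parameter are assumptions for illustration, and the check assumes retraining must finish inside every refresh interval.

```python
def fits_schedule(retrain_minutes, scoring_minutes, interval_minutes):
    """Return True if one retrain-and-score cycle finishes within the refresh interval."""
    return retrain_minutes + scoring_minutes <= interval_minutes


def intervals_per_retrain(retrain_minutes, interval_minutes):
    """How many refresh intervals elapse during a single retrain."""
    return retrain_minutes / interval_minutes


# The article's example: predictions every 15 minutes, 1.5 hours (90 minutes) to retrain.
ok = fits_schedule(retrain_minutes=90, scoring_minutes=0, interval_minutes=15)
missed = intervals_per_retrain(retrain_minutes=90, interval_minutes=15)
```

Here `ok` is `False` and `missed` is `6.0`: six prediction deadlines pass before a single retrain completes.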

Analytic Complexity

Analytic complexity is measured by an algorithm's complexity class and the execution resources it requires. This dimension can be limiting if the complexity needs to be low in order for the business to grasp what is going on, which clearly restricts a Data Scientist’s ability to create an optimal outcome. Some industries prefer a lower-quality prediction if they gain more understanding of the factors contributing to it; this is often true in healthcare. Engineering cost matters as well, and a great example is the Netflix Prize, a $1 million challenge. The team that won the first Progress Prize reported more than 2,000 hours of work to arrive at a combination of 107 algorithms, and the eventual grand-prize winner bested Netflix’s own algorithm by 10%. However, Netflix never put the full grand-prize solution into production, because the additional accuracy did not justify the engineering effort needed to bring it into a production environment.

Accuracy & Precision

Most businesses do not appreciate the nuance involved in predictive accuracy, so it is essential for a Data Scientist to help the organization move beyond the simple notion of accuracy. Obviously we all want to hit the proverbial target. As a Data Scientist, you will want to steer the conversation toward something more useful, such as whether an algorithm produces “high accuracy/low precision” or “high accuracy/high precision”. Because the two terms appear close in meaning, it usually helps a business audience to distinguish them. “Accuracy” refers to the closeness of a predicted value to the actual value. For example, if a data science model predicts the weight of a package to be 19 lbs but the actual weight is 28 lbs, the model demonstrates “low accuracy”. “Precision”, on the other hand, refers to the closeness of two or more measurements to each other. For example, if a model predicts the weight of a package to be 19 lbs in each of 5 separate iterations, it is said to be “precise”. From a business perspective, it is critical to note that a data science model can be extremely precise yet inaccurate in its prediction.
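The package example can be made concrete with a small sketch. It measures accuracy as the distance between the average prediction and the actual value, and precision as the spread (population standard deviation) of repeated predictions. These two summary measures are one reasonable choice, not the only one, and the second set of predictions is invented for contrast.

```python
from statistics import mean, pstdev


def accuracy_error(predictions, actual):
    """Distance between the average prediction and the actual value (lower is more accurate)."""
    return abs(mean(predictions) - actual)


def precision_spread(predictions):
    """Spread of repeated predictions around their own mean (lower is more precise)."""
    return pstdev(predictions)


actual_weight = 28  # lbs, from the article's example

# Precise but inaccurate: the same wrong answer five times.
repeated = [19, 19, 19, 19, 19]
repeated_error = accuracy_error(repeated, actual_weight)   # 9 lbs off target
repeated_spread = precision_spread(repeated)               # no spread at all

# Accurate on average but imprecise (hypothetical values for contrast).
scattered = [22, 34, 25, 31, 28]
scattered_error = accuracy_error(scattered, actual_weight)
scattered_spread = precision_spread(scattered)
```

The first model is off by 9 lbs with zero spread; the second averages exactly 28 lbs, but its individual predictions vary by several pounds.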

Data Size 

The size of the data set is measured as the number of rows and the number of fields. Many organizations may not realize that, when it comes to prediction, more data generally leads to better output. However, there may be a point at which the size of the data goes beyond the typical tools and skill set of the average Data Scientist. In fact, many of the classic algorithms one might use on smaller datasets may simply vanish as an option once one begins navigating bigger data waters. As a Data Scientist, it is worth investigating the limits of your skills and tools before you get in front of an executive audience; they are counting on you to be the expert, as well they should.
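One quick way to investigate the limits of an in-memory tool is a back-of-the-envelope size estimate from rows and fields. The sketch below assumes a dense table of 8-byte numeric values and a hypothetical rule of thumb that the table should use no more than half of available memory; real footprints depend on data types and the tool in use.

```python
GIB = 1024 ** 3


def estimated_bytes(rows, fields, bytes_per_value=8):
    """Rough size of a dense numeric table (assumes 8-byte values by default)."""
    return rows * fields * bytes_per_value


def fits_in_memory(rows, fields, available_bytes, headroom=0.5):
    """True if the table uses no more than `headroom` of the available memory."""
    return estimated_bytes(rows, fields) <= available_bytes * headroom


# Hypothetical workstation with 16 GiB of RAM.
small = fits_in_memory(1_000_000, 50, 16 * GIB)     # about 0.4 GB of data
large = fits_in_memory(500_000_000, 50, 16 * GIB)   # about 200 GB of data
```

In this sketch the one-million-row table fits comfortably, while the 500-million-row table does not and would call for different tooling or sampling.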

Implementation Constraints

In almost every case, the business user will dictate one or more constraints on the problem the Data Scientist faces. Once a single dimension is fixed, the hard work begins: working out what can still be done with the other dimensions. Take, for example, a hospital that needs near real-time analytics to help physicians make clinical decisions: the speed dimension is already fixed, and trade-offs among the other four dimensions must be made. For many Data Scientists, a sense of the right balance develops over the course of a career. It is more of an art than a science, but that does not mean you should not devote significant resources to expediting your learning.
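Fixing one dimension and trading off the rest can be sketched as a simple filter-then-choose step. The candidate models, latencies, and error rates below are entirely hypothetical placeholders, not measurements; the point is only the shape of the decision once speed is fixed.

```python
# Hypothetical candidates; names and numbers are illustrative, not benchmarks.
candidates = [
    {"name": "linear model", "latency_ms": 5, "error": 0.20},
    {"name": "gradient boosting", "latency_ms": 40, "error": 0.12},
    {"name": "large ensemble", "latency_ms": 900, "error": 0.10},
]


def pick_model(candidates, max_latency_ms):
    """Keep only models that meet the fixed speed constraint, then pick the lowest error."""
    feasible = [c for c in candidates if c["latency_ms"] <= max_latency_ms]
    if not feasible:
        return None  # no candidate satisfies the constraint
    return min(feasible, key=lambda c: c["error"])


# Speed is fixed by the business (here, a 100 ms budget), so accuracy is traded off.
choice = pick_model(candidates, max_latency_ms=100)
```

With a 100 ms budget the most accurate candidate is ruled out before accuracy is even compared, so `choice` is the gradient boosting entry.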

Data Science in the real world always has to consider the five analytic dimensions, and Data Scientists should aim to balance them for the business they serve. Whether it is data complexity, analytic complexity, accuracy & precision, speed, or data size, each is important in its own right. As a Data Scientist, it is vital to understand how to guide the business analytically by keeping the five analytic dimensions in balance.
