SmartData Collective
Top 10 analytic mistakes

JamesTaylor
Last updated: 2010/10/30 at 9:29 PM

One of my favorite presenters, John Elder, presented his top 10 analytic mistakes at Teradata Partners.

Lack of data is problem zero: obviously you need data to do data mining and analytics. Without data that is relevant to the problem you cannot use analytics to solve it. It is particularly hard when there are too few cases to train a model on (in fraud, especially, the number of known fraudulent cases can be low). Investing in creating relevant data (for instance, by tracking how some high-risk customers actually behave when given credit the models did not support) can be very effective and worthwhile.

  1. Focus on training
    Training a model is important, but overfitting is a big risk. In the end, only the effectiveness of the model against data not in the training sample matters. Sometimes training a model more can make it perform worse, as it is made to fit the training data better and better without necessarily matching other data. Keep some data out of your training set so you can check the model against it later.
  2. Rely on one technique
    Any technique can be flawed. Always compare the results of any novel technique to some conventional technique like linear regression as a sanity check. And don’t blame the algorithm for bad results as the modeling technique is rarely the issue – setting up the problem and managing complexity are much more likely to be an issue. So use a handful of good tools as, once the data is ready, more techniques don’t add much to the cost of the solution.
    Interestingly, though there are many tools, they share common techniques like decision trees, neural networks, nearest neighbor techniques etc. And while all of them have strengths and weaknesses, none of them outperforms an ensemble that simply averages several models built with different techniques.
  3. Ask the wrong question
    You must aim at the right target, and that means the right target in business terms. In addition, don’t get lulled by the most accurate model; find the one that matches reality best. The best business outcome is the only thing that should determine the best model. For instance, if you were predicting stock prices, a model might minimize the overall error yet be content to always make estimates that are too high, whereas a business might be happier with larger errors as long as the estimates were low (because it profits when the price is higher than predicted and loses when it is lower).
  4. Listen (only) to the data
    The data does speak, and can surprise you, but it is not the only thing to consider. For instance, some data seemed to show that spending less money would improve SAT scores (comparing SAT scores to investment per student in 50 states). But many states have more kids taking the ACT, and so those taking the SAT in those states are self-selecting. That skewed the results, and finding the problem required thinking about the real world, not more data analysis.
  5. Accept leaks from the future
    Data that is not known at the time of prediction can easily be fed into a model. For instance, models predicting interest rates or stock prices can appear very accurate if they somehow include data about future trends, such as a moving average of yesterday, today and tomorrow.
  6. Discount pesky cases
    These can mess you up but can be what actually matters. Outliers can be mistakes, caused by bad decimal points for instance, but sometimes the outliers show you what matters (fraud for instance).
  7. Extrapolate
    People can fall in love with their models and extrapolate too far. This is particularly a problem for folks working in machine learning who tend to extrapolate from “machines can win at chess” to “machines can think”!
  8. Answer every inquiry
    No model can ever answer every question – keep your focus on what the model is for and retest carefully before using it for something else.
  9. Sample casually
    If you are not going to use all the data (because there is too much), and when selecting your hold-out group (see #1 above), be careful how you select the samples. It is easy to pick biased samples that drive the model in a particular direction.
  10. Believe the best model
    Instead, build several good models, combine them in every conceivable combination and see which ones work best together. And if you don’t have time, just use all the models, as more models almost always perform best. A lot of people want to build models that reveal the deepest truth of the universe. But having multiple pretty good models is often more effective.
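
Mistakes 1 and 9 both come down to how the hold-out data is set aside. Here is a minimal sketch in Python, assuming the cases fit in a list. The function name `split_holdout`, the 20% hold-out fraction and the fixed seed are illustrative choices of mine, not anything from John's talk:

```python
import random


def split_holdout(cases, holdout_fraction=0.2, seed=0):
    """Randomly split cases into (training, holdout) lists.

    Shuffling before splitting avoids the casual-sampling trap of
    taking, say, the first or last rows of a file that happens to be
    sorted by date or customer.
    """
    if not 0 < holdout_fraction < 1:
        raise ValueError("holdout_fraction must be between 0 and 1")
    shuffled = list(cases)                 # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a repeatable split
    n_holdout = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


cases = list(range(100))  # hypothetical case IDs
training, holdout = split_holdout(cases)
print(len(training), len(holdout))
```

The model would be fitted on `training` only and judged on `holdout`. A purely random split is itself an assumption: with rare outcomes such as fraud, a stratified split may be needed so the hold-out group contains enough of the rare cases.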
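
The stock price example in mistake 3 can be made concrete with a toy comparison. This is only a sketch: the prices are made up, and `business_cost` with its 5-to-1 penalty is a hypothetical stand-in for whatever the real business outcome would be.

```python
def mean_squared_error(actual, predicted):
    """Symmetric accuracy measure: over- and under-estimates count the same."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def business_cost(actual, predicted, over_penalty=5.0, under_penalty=1.0):
    """Hypothetical cost: an estimate above the real price hurts more
    than an estimate below it."""
    total = 0.0
    for a, p in zip(actual, predicted):
        error = p - a
        total += over_penalty * error if error > 0 else under_penalty * -error
    return total / len(actual)


actual = [100.0, 102.0, 101.0, 105.0]   # made-up prices
model_high = [a + 1.0 for a in actual]  # small errors, always too high
model_low = [a - 2.0 for a in actual]   # larger errors, always too low

for name, predicted in [("high", model_high), ("low", model_low)]:
    print(name, mean_squared_error(actual, predicted), business_cost(actual, predicted))
```

Here the always-high model wins on squared error (1.0 against 4.0) but loses on the business cost (5.0 against 2.0), so which model is "best" depends entirely on which question you ask.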
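
The moving average in mistake 5 is easy to get wrong in code. The sketch below contrasts a trailing average with one centered on today, then runs a simple check for leaks: change a future value and see whether the feature for today moves. The function names and the price series are hypothetical.

```python
def trailing_average(series, t, window=3):
    """Mean of the `window` values ending at index t (no future data)."""
    start = max(0, t - window + 1)
    values = series[start:t + 1]
    return sum(values) / len(values)


def centered_average(series, t):
    """Mean of yesterday, today and tomorrow: reads series[t + 1],
    which is not known at time t, so it leaks the future."""
    values = series[max(0, t - 1):t + 2]
    return sum(values) / len(values)


prices = [10.0, 11.0, 12.0, 20.0, 13.0]  # made-up daily prices
t = 2                                    # index of "today"

# Change a value that lies in the future relative to t.
tampered = prices[:t + 1] + [99.0] + prices[t + 2:]

# A leak-free feature must not react to the tampering.
print(trailing_average(prices, t) == trailing_average(tampered, t))
print(centered_average(prices, t) == centered_average(tampered, t))
```

The first comparison holds and the second does not, which is the signal that the centered average depends on data unavailable at prediction time.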

To succeed:

Mistakes lead to experience, which leads to learning and success, so be prepared to make mistakes. John tells his students to adopt PATH:

  • Persistent – Be persistent and attack a problem in different ways
  • Attitude – be optimistic and have a can-do attitude
  • Teamwork – bring others to help you
  • Humility – so you can learn from others and not expect too much of your technology.
