Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    data mining to find the right poly bag makers
    Using Data Analytics to Choose the Best Poly Mailer Bags
    12 Min Read
    data analytics for pharmacy trends
    How Data Analytics Is Tracking Trends in the Pharmacy Industry
    5 Min Read
    car expense data analytics
    Data Analytics for Smarter Vehicle Expense Management
    10 Min Read
    image fx (60)
    Data Analytics Driving the Modern E-commerce Warehouse
    13 Min Read
    big data analytics in transporation
    Turning Data Into Decisions: How Analytics Improves Transportation Strategy
    3 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: How to Calculate R-squared for a Decision Tree Model
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Data Management > Best Practices > How to Calculate R-squared for a Decision Tree Model
Best PracticesDecision Management

How to Calculate R-squared for a Decision Tree Model

MichaelBerry
MichaelBerry
4 Min Read
SHARE

A client recently wrote to us saying that she liked decision tree models, but for a model to be used at her bank, the risk compliance group required an R-squared value for the model and her decision tree software doesn’t supply one. How should she fill in the blank? There is more than one possible answer.

A client recently wrote to us saying that she liked decision tree models, but for a model to be used at her bank, the risk compliance group required an R-squared value for the model and her decision tree software doesn’t supply one. How should she fill in the blank? There is more than one possible answer.

Start with the definition of R-squared for regular (ordinary least squares) regression. There are three common ways of describing it. For OLS they all describe the same calculation, but they suggest different ways of extending the definition to other models. The calculation is 1 minus the ratio of the sum of the squared residuals to the sum of the squared differences of the actual values from their average value.

The denominator of this ratio is the variance and the numerator is the variance of the residuals. So one way of describing R-squared is as the proportion of variance explained by the model.

More Read

What are Advanced Segments in Google Analytics and Why You Should Use Them
LinkedIn and Hiring: Dream. Fit. Passion.
Beyond BigData, the Shift To Decision Management w/ James Taylor
Visualizations as Vocabulary…or Know the Big Words, Use the Small Ones
Adding Social Elements to Intranet? Don’t Forget Change Management

A second way of describing the same ratio is that it shows how much better the model is than the null model which consists of not using any information from the explanatory variables and just predicting the average. (If you are always going to guess the same value, the average is the value that minimizes the squared error.)

Yet a third way of thinking about R-squared is that it is the square of the correlation r between the predicted and actual values. (That, of course, is why it is called R-squared.)

Back to the question about decision trees: When the target variable is continuous (a regression tree), there is no need to change the definition of R-squared. The predicted values are discrete, but everything still works.

When the target is a binary outcome, you have a choice. You can stick with the original formula. In that case, the predicted values are discrete with values between 0 and 1 (as many distinct estimates as the tree has leaves) and the actuals are either 0 or 1. The average of the actuals is the proportion of ones (i.e. the overall probability of being in class 1).  This method is called Efron’s pseudo R-squared.

Alternatively, you can say that the job of the model is to classify things.  The null model would be to always predict the most common class. A good pseudo R-squared is how much better does your model do? In other words, the ratio of the proportion correctly classified by your model to the proportion of the most common class.

There are many other pseudo R-squares described on a page put up by the statistical consulting services group at UCLA.

Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

data mining to find the right poly bag makers
Using Data Analytics to Choose the Best Poly Mailer Bags
Analytics Big Data Exclusive
data science importance of flexibility
Why Flexibility Defines the Future of Data Science
Big Data Exclusive
payment methods
How Data Analytics Is Transforming eCommerce Payments
Business Intelligence
cybersecurity essentials
Cybersecurity Essentials For Customer-Facing Platforms
Exclusive Infographic IT Security

Stay Connected

1.2kFollowersLike
33.7kFollowersFollow
222FollowersPin

You Might also Like

big data service-oriented hotel
AnalyticsBest PracticesBig DataBusiness IntelligenceCulture/LeadershipData ManagementDecision ManagementInside CompaniesKnowledge ManagementMarket ResearchMarketingPolicy and GovernancePredictive AnalyticsUnstructured Data

Big Data Empowers the InterContinental Hotel Group

7 Min Read

Back to the Future of Business Intelligence

6 Min Read

Creating a Business Intelligence Culture

9 Min Read

Digital data explosion highlights need for new-age Database and Business Intelligence technologies

6 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

AI and chatbots
Chatbots and SEO: How Can Chatbots Improve Your SEO Ranking?
Artificial Intelligence Chatbots Exclusive
ai chatbot
The Art of Conversation: Enhancing Chatbots with Advanced AI Prompts
Chatbots

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?