SmartData Collective
© 2008-25 SmartData Collective. All Rights Reserved.

Learning R

DavidMSmith

When R is brought up among non-R users as a possibility for doing statistics, data mining, or any sort of predictive analytics, someone will invariably point out that R has a “steep learning curve”, and the response among those gathered usually includes a significant amount of head nodding. Even those who have put in heroic efforts to help people learn R sometimes say scary things. For example, in his introduction to Rattle, a data mining GUI for R, Graham Williams writes:

 R offers a breadth and depth in statistical computing beyond what is available in commercial closed source products. Yet R remains, primarily, a programming language for the highly skilled statistician, and out of the reach of many. (The R Journal Vol. 1/2, December 2009)

Are things really that bad? Is R that difficult to learn? I think not, unless you take the statement to refer to the absolute worst-case scenario that you can think of: the effort that would be involved in learning R by a person who doesn’t have any background in statistics or data analysis and who has absolutely no interest in learning R. In this context the statement that R has a steep learning curve conveys the same truth as the assertion that Chinese or Italian or Japanese or English or any other language is difficult to learn if you don’t know anything about the people who speak these languages, don’t want to know, and wouldn’t have an opportunity to practice the language anyway. There is some truth to this, but so what? Context is everything.

If you have some background in statistics and a desire to learn some more, learning R is not going to be an insurmountable problem. Once you have some version of R installed on your favorite computing platform, any book that provides carefully worked examples of the kinds of statistical analyses that interest you should be all that is required to make you productive. These days, there are dozens or maybe even hundreds of statistics books that use R. Two of my personal favorites are John Fox’s classic An R and S-Plus Companion to Applied Regression (Sage Publications, 2002) and Data Analysis and Graphics Using R: An Example-Based Approach (Cambridge Series in Statistical and Probabilistic Mathematics) by John Maindonald and W. John Braun (2010). Books, of course, are only the tip of the iceberg of the resources available for the statistical cognoscenti to learn R. Have a look at the links on Inside-R as places to start on the web.

This discussion does raise the issue of what one means by learning a language. Knowing how to run the R scripts required to get some simple analysis done may not entitle you to say that you know R (significantly more is required to become an R developer), any more than understanding enough Italian to get around Rome while eating well entitles you to say that you know Italian. But again, so what? It is possible to do some pretty impressive statistics using R without any deeper understanding of the R language than what is required to run the applicable models. On the other hand, if you don’t know any statistics and you don’t want to know more than you have to, then that might be a big problem. But it’s the statistics part that has the steep learning curve, not R. Maybe this is the point of Williams’ comment: R remains the language of highly skilled statisticians because it is the language most capable of expressing what highly trained statisticians think about, and R is out of the reach of the many who just don’t care much about statistics. But if you are reading this post, this “many” probably doesn’t include you.

So suppose you are not a statistician but belong to some other data analysis culture, data mining for example, and you want to learn R. Well, like any of the world’s great natural languages, R is approachable from many different starting points. The entry point may be different, but the path to knowledge is going to be similar. Find something you care about and start formulating simple sentences. There are fewer R language guides available for people who have some data mining skills, but that is changing. Seni and Elder have recently published a very nice little book on ensemble methods that I highly recommend: Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions (Synthesis Lectures on Data Mining and Knowledge Discovery). And, of course, the masters Hastie, Tibshirani and Friedman, authors of the classic text The Elements of Statistical Learning, have made both the text and much of their code available. Moreover, if you are a data miner with a background in computer science, you can probably code rings around the highly trained statisticians. If so, you may find the books by Chambers (Software for Data Analysis) and Gentleman (R Programming for Bioinformatics) of great value.

Finally, to be perfectly fair, I should acknowledge the context in which Graham Williams wrote the words quoted above. Like any other language, R is much easier to learn when you have access to the best learning tools: CDs, dictionaries, grammars, etc. Rattle is one such high-value learning tool, and so is Revolution’s R Productivity Environment. But more about these on another day.
