Cookies help us display personalized product recommendations and ensure you have great shopping experience.

By using this site, you agree to the Privacy Policy and Terms of Use.
Accept
SmartData CollectiveSmartData Collective
  • Analytics
    AnalyticsShow More
    image fx (67)
    Improving LinkedIn Ad Strategies with Data Analytics
    9 Min Read
    big data and remote work
    Data Helps Speech-Language Pathologists Deliver Better Results
    6 Min Read
    data driven insights
    How Data-Driven Insights Are Addressing Gaps in Patient Communication and Equity
    8 Min Read
    pexels pavel danilyuk 8112119
    Data Analytics Is Revolutionizing Medical Credentialing
    8 Min Read
    data and seo
    Maximize SEO Success with Powerful Data Analytics Insights
    8 Min Read
  • Big Data
  • BI
  • Exclusive
  • IT
  • Marketing
  • Software
Search
© 2008-25 SmartData Collective. All Rights Reserved.
Reading: Learning R
Share
Notification
Font ResizerAa
SmartData CollectiveSmartData Collective
Font ResizerAa
Search
  • About
  • Help
  • Privacy
Follow US
© 2008-23 SmartData Collective. All Rights Reserved.
SmartData Collective > Big Data > Data Mining > Learning R
Data MiningData Visualization

Learning R

DavidMSmith
DavidMSmith
8 Min Read
SHARE

When R is brought up as a possibility for doing statistics or data mining or any sort of predictive analytics among non R users, someone will invariably point out that R has a “steep learning curve”, and the response among those gathered usually includes a significant amount of head nodding. Even those who have put in heroic efforts to help people learn R sometimes say scary things: e.g. in his introduction to Rattle, a Data Mining GUI for R, Graham Williams writes:

 R offers a breadth and depth in statistical computing beyond what is available in commercial closed source products. Yet R remains, primarily, a programming language for the highly skilled statistician, and out of the reach of many. (The R Journal Vol. ½, December 2009)

Are things really that bad? Is R that difficult to learn? I think not – unless you take the statement to refer to the absolutely worst case scenario that you can think of: the effort that would be involved in learning R by a person who doesn’t have any background in statistics or data analysis and who has absolutely no interest in learning R. In this context the statement that R has a steep learning curve conveys the same …

When R is brought up as a possibility for doing statistics or data mining or any sort of predictive analytics among non R users, someone will invariably point out that R has a “steep learning curve”, and the response among those gathered usually includes a significant amount of head nodding. Even those who have put in heroic efforts to help people learn R sometimes say scary things: e.g. in his introduction to Rattle, a Data Mining GUI for R, Graham Williams writes:

More Read

Data Miners: Participate in 3rd Annual Survey
How Nate Silver Won the Election with Data Science
Reality Mining – Too Much Personalization?
Big Data Analytics – Volume, Variety, Velocity
World Series Analytics

 R offers a breadth and depth in statistical computing beyond what is available in commercial closed source products. Yet R remains, primarily, a programming language for the highly skilled statistician, and out of the reach of many. (The R Journal Vol. ½, December 2009)

Are things really that bad? Is R that difficult to learn? I think not – unless you take the statement to refer to the absolutely worst case scenario that you can think of: the effort that would be involved in learning R by a person who doesn’t have any background in statistics or data analysis and who has absolutely no interest in learning R. In this context the statement that R has a steep learning curve conveys the same truth as the assertion that Chinese or Italian or Japanese or English or any other language is difficult to learn if you don’t know anything about the people who speak these languages, don’t want to know, and wouldn’t have an opportunity to practice the language anyway. There is some truth to this, but so what? Context is everything. If you have some background in statistics and a desire to learn some more, learning R is not going to be an insurmountable problem. Once you have some version of R installed on your favorite computing platform, any book that provides carefully worked examples of the kinds of statistical analyses that interest you should be all that is required to make you productive. These days, there are dozens or maybe even hundreds of statistics books that use R. Two of my personal favorites are John Fox’s classic An R and S Plus Companion to Applied Regression  (Sage Publications, 2002) and Data Analysis and Graphics Using R: An Example-Based Approach (Cambridge Series in Statistical and Probabilistic Mathematics) by John Maindonald and W. John Braun (2010). Books, of course, are only the tip of the iceberg of the resources available for the statistical cognoscenti to learn R. Have a look at the links on Inside-R as places to start on the web.

This discussion does raise the issue of what one means by learning a language. Knowing how to run the R scripts required to get some simple analysis done may not entitle you to say that you know R (significantly more is required to become an R developer) anymore than understanding enough Italian to get around Rome while eating well entitles you to say that you know Italian – but again, so what? It is possible to do some pretty impressive statistics using R without any deeper understanding of the R language than what is required to run the applicable models. On the other hand, if you don’t know any statistics and you don’t want to know more than you have to, then that might be a big problem. But, it’s the statistics part that has the steep learning curve, not R. Maybe this is the point of William’s comment: R remains the language of highly skilled statisticians because it is the language most capable of expressing what highly trained statisticians think about, and R is out of the reach of the many who just don’t care much about statistics. But, if you are reading this post this many probably doesn’t include you.

So suppose you are not a statistician but belong to some other data analysis culture, data mining, for example, and you want to learn R. Well, like any of the world’s great natural languages R approachable from many different starting points. The entry point may be different but the path to knowledge is going to be similar. Find something you care about and start formulating simple sentences. There are fewer R language guides available for people who have some data mining skills, but that is changing. Seni and Elder have recently published a very nice little book on ensemble methods: Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions (Synthesis Lectures on Data Mining and Knowledge Discovery) that I highly recommend.  And, of course the masters Hastie, Tibshirani and Freidman authors of the classic text, Elements of Statistical Learning, have made both the text and much of the their code available. Moreover, if you are a data miner with a background in computer science you can probably code rings around the highly trained statisticians. If so, you may find the books by Chambers, Software for Data Analysis and Gentlemen, R Programming for Bioinformatics of great value.

Finally, to be perfectly fair, I should acknowledge Graham Williams’ quote in the context in which he wrote it. Like any other language, R is much easier to learn when you have access to the best learning tools, CDs, dictionaries, grammars etc. Rattle is one such high value learning tool and so is Revolution’s R Productivity Environment. But, more about these on another day.

Link to original post

TAGGED:data miningr
Share This Article
Facebook Pinterest LinkedIn
Share

Follow us on Facebook

Latest News

image fx (2)
Monitoring Data Without Turning into Big Brother
Big Data Exclusive
image fx (71)
The Power of AI for Personalization in Email
Artificial Intelligence Exclusive Marketing
image fx (67)
Improving LinkedIn Ad Strategies with Data Analytics
Analytics Big Data Exclusive Software
big data and remote work
Data Helps Speech-Language Pathologists Deliver Better Results
Analytics Big Data Exclusive

Stay Connected

1.2kFollowersLike
33.7kFollowersFollow
222FollowersPin

You Might also Like

The three legged stool – business, analytics, IT

6 Min Read

Package Update Roundup: Mar 2009

2 Min Read

Interview: Jon Peck SPSS

12 Min Read

What is a good classification accuracy in data mining?

6 Min Read

SmartData Collective is one of the largest & trusted community covering technical content about Big Data, BI, Cloud, Analytics, Artificial Intelligence, IoT & more.

AI chatbots
AI Chatbots Can Help Retailers Convert Live Broadcast Viewers into Sales!
Chatbots
AI and chatbots
Chatbots and SEO: How Can Chatbots Improve Your SEO Ranking?
Artificial Intelligence Chatbots Exclusive

Quick Link

  • About
  • Contact
  • Privacy
Follow US
© 2008-25 SmartData Collective. All Rights Reserved.
Go to mobile version
Welcome Back!

Sign in to your account

Username or Email Address
Password

Lost your password?