The Many Faces of R

February 26, 2009

For anyone who happened to miss it, over the past week there has been a war of words raging between proponents of the R statistical programming language and advocates of SAS.  At the heart of the debate is a quote in a NY Times article which some have interpreted as an attempt by SAS to spread misinformation about R in order to protect revenue. Let’s be clear: SAS welcomes competing products in the analytics market.  New products encourage customers and user communities to push SAS faster and further.  SAS will win business based on the value we bring to the marketplace.

We welcome criticism from customers with the goal of continually improving, so thanks to those who’ve offered insight into your experiences and challenges.  Criticism can be constructive – offering both tangible examples of problems and solutions – or it can be inflammatory and divisive. My hope would be that a community with so many talented people and shared interests can interact positively towards a better environment for both R and SAS.  And I would hope this dialogue continues on these blogs and other forums. There are a lot of things to discuss in raising the value of analytics in health and life sciences, and I’m anxious to participate.

Linux, Postgres, JBoss, Apache, MySQL, Unix, Oracle, Windows, Java…these are all software products in my daily life at SAS.  They all have a place.  Am I the only one who uses both Microsoft Office and OpenOffice, and sees no reason one has to “win”?  SAS has a place; R has a place.  They need not be the same place.  The really interesting discussion that is being ignored in all of this back-and-forth is how the worlds of R and SAS should come together.  What would be useful?  Do they need to come together?  It’s funny: once the conversation gets real, a lot of the noisemakers disappear.

From a life sciences perspective, let’s set the record straight:

  • R used in regulated research.  There is no reason you cannot use R to analyze clinical trial or any other research data.  The only issue is how you validate your use of the software.  I personally have not seen many institutions pushing to move their clinical trial analytics to R, primarily because a) they have thousands of SAS programs and assets already developed; b) the data volumes are pretty big, and they know how to scale SAS; c) they usually have permanent and contracted staff that have a lot of SAS expertise; d) they have a long-standing, well-documented validation strategy for their SAS environment; and e) they know SAS is able to provide support to them.  None of those issues is a showstopper to using R; it really just comes down to what makes the most sense for your organization.
  • R used by the FDA and other agencies.  Neither the FDA nor any other pharmaceutical regulatory authority I am aware of prescribes the use of any particular software package for the analysis and submission of clinical research information.  It is true that SAS has been a long-time de facto standard, and as such the agency has some capabilities with respect to SAS.  But going forward, we expect to see an increased openness around all aspects of clinical research, and anyone watching the development of standards such as CDISC over the past 9 years can tell you that SAS has been one of the largest contributors to the development and proliferation of these non-proprietary standards.
  • R as a stable platform.  It is a common debate when discussing the merits of open source vs. commercial software: what produces higher quality software, a large group of loosely-connected developers and testers, or a formalized corporate structure and process around software engineering?  I’ve seen evidence supporting both sides of the argument, and I personally believe that it depends on the software, the people/organizations involved, the maturity of the product, the existing market uptake,…the list goes on.  There is no single right answer.  My personal belief is that, as the complexity of a software system increases, it becomes exponentially more important to have solid software engineering practices in place. Is that the case for R?  That’s for the market to decide.

