SmartData Collective

REvolution R Enterprise 2.0 released

DavidMSmith
Last updated: 2009/04/14 at 10:55 PM

As I previewed yesterday, REvolution R Enterprise 2.0 is now available to subscribers. In yesterday’s post, I focused mainly on the process of creating the release; today, I’d like to talk about some of its new features.

64-bit Windows support

REvolution R Enterprise 2.0 is the only version of R available for 64-bit Windows systems, which makes it possible to analyze much larger data sets on Windows than ever before. The reason is that, with a few exceptions, all of the computational routines in R work in memory: the entire data set, plus any temporary copies and working variables required by the routine, must fit into the operating system’s memory at once. As a rough rule of thumb, most statistical routines (like regression or tree models) will require at least three temporary copies of the data. So on a 32-bit Windows system, where the maximum memory available is around 3 gigabytes, the data and its three copies each get about a quarter of that space, which means you can analyze a data set of 750 megabytes, tops.

On a 64-bit system, as long as you have enough disk space available for the operating system to page memory out to, you can analyze much larger data sets. In fact, your limitation will likely be the amount of time you have to wait rather than the storage you have available. Even better, you can install and use more than 4 gigabytes of RAM on 64-bit Windows systems, and for large data set analysis the more RAM you have installed, the faster it will run. You can expect the best performance when the installed RAM is at least 3-4 times the size of the data. (You don’t need that much, but the analysis will run slower if you have less.)
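To make the rule of thumb above concrete, here is a minimal sketch in R of the arithmetic. The helper `max_dataset_mb` is a hypothetical name used for illustration only (it is not part of R or REvolution R), and it assumes decimal units and the "data plus three temporary copies" estimate from the text.

```r
# Rule-of-thumb estimate: the data plus `copies` temporary copies must fit in memory.
# `max_dataset_mb` is a hypothetical helper, not a function from R or REvolution R.
max_dataset_mb <- function(memory_gb, copies = 3) {
  memory_mb <- memory_gb * 1000   # decimal units, matching the round figures in the text
  memory_mb / (copies + 1)        # one original plus `copies` working copies
}

max_dataset_mb(3)    # 32-bit Windows with about 3 GB available: 750
max_dataset_mb(32)   # a 64-bit machine with 32 GB of memory

# For comparison, measure an object you already have in memory
x <- matrix(0, nrow = 1e5, ncol = 10)   # one million doubles at 8 bytes each
format(object.size(x), units = "MB")
```

Real routines vary in how many copies they make, so treat the result as a planning figure rather than a hard limit.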

This opens up a whole new world of possibilities for analyzing data with R on 64-bit Windows systems. For example, you can now:
  • Estimate correlation matrices (and calculate Value at Risk) for much larger financial portfolios
  • Use the Bioconductor suite to analyze pharmaceutical and biochemical data from much larger microarrays   
  • Build predictive models about purchasing behavior on larger databases of customer data, without the need for sampling
Brian Ripley, Professor of Applied Statistics at the University of Oxford and member of the R Core Development Team, reported using REvolution R Enterprise for genetic analysis during the beta test. He said:

“REvolution are to be congratulated on a technical tour de force…This will bring to Windows users the freedom to use R on large problems that users of Unix-like platforms have enjoyed for several years.  We did some testing on a 32GB Windows box on behalf of a computational genetics project, and the beta was 100% reliable and comparable in performance to the Rcore 32-bit distribution but able to tackle much larger problems.”

Basically, if you’ve tried to use R to analyze a large dataset on Windows before and gotten an error like “cannot allocate vector of size 858213 Kb”, switching to a 64-bit version of Windows with REvolution R Enterprise 2.0 is likely to help.
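If you are not sure which kind of R build you are running, or how big the failed allocation in a message like that really was, a quick check along these lines can help. This is only a sketch: it assumes that the "Kb" in R's message means units of 1024 bytes and that the vector holds double-precision numbers (8 bytes each).

```r
# TRUE on a 64-bit build of R (pointers are 8 bytes), FALSE on a 32-bit build
is_64bit <- .Machine$sizeof.pointer == 8
is_64bit

# The allocation from the error message, converted to more familiar units
failed_kb <- 858213
failed_mb <- failed_kb / 1024        # assumes Kb means 1024 bytes
n_doubles <- failed_kb * 1024 / 8    # assumes 8-byte double-precision values

round(failed_mb)   # about 838 megabytes in a single vector
n_doubles          # roughly 110 million numeric values
```

A single vector of that size, plus the working copies a model fit needs, is already beyond what a 32-bit process can comfortably hold.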

ParallelR upgraded

REvolution R Enterprise 2.0 comes with ParallelR, a suite of packages from REvolution Computing that simplify parallel programming in R. If you have a multiprocessor workstation (and most higher-end laptops and desktops sold today have at least 2 processors or cores), then parallel programming is a way of instructing R to use all processors simultaneously to reduce computation time. REvolution R automatically uses multiple processors for some key mathematical routines like matrix multiplication and decomposition, but for general R code only one processor will be used at a time unless you use the features of ParallelR.

There are a few other systems available for parallel programming in R, but after talking to users who had attempted to use them, we found that most attempts by casual users had been abandoned in frustration. This is because these systems were designed primarily for use on clusters (collections of workstations) for distributed computing. This in turn requires complex procedures for setting up the environment: designating the server and clients, nominating processors on each, bypassing security measures so that the instances of R can talk to one another, and so on. We also heard that writing parallel programs in these systems was complicated: the programmer had to deal with a lot of unfamiliar concepts like clients, servers, shared variables and message-passing. When correctly configured these systems can offer excellent performance, but unless you have the computer-science background and training to rewrite your R programs using these new paradigms the performance gain is, well, zero.

ParallelR is designed so that the casual R programmer can easily convert “embarrassingly parallel” R programs to run faster on multiprocessor workstations. Embarrassingly parallel problems are those with sequences of steps that can be arbitrarily reordered because no step depends on the results of any other step. Common examples in the Statistics world are simulations, bagging and boosting procedures (fitting random forest models, for example), predictions, and fitting the same model to a sequence of dependent variables (a series of regions or segments, for example). 

The key innovation is a new function called foreach, which you can use to replace the traditional for loop in R. If you had enough processors available, each iteration of the loop would run at the same time, in parallel. More realistically, a few iterations will run in parallel at any one time: one per available processor. ParallelR handles all the complexity of scheduling each iteration when a processor becomes available and collecting the results, and automatically ensures that the local variables of the loop are replicated so the values from one iteration do not trample those of another. You can see some examples of foreach in action on the REvolution website, or in this recent webcast on backtesting financial models, where running in parallel on a quad-core system reduced the computation time by almost 75%.
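As a rough illustration of what a foreach loop looks like, here is a small bootstrap written as a sketch. It assumes the open-source `foreach` package from CRAN together with the `doParallel` backend, which is a stand-in chosen for illustration and not necessarily the backend that ships with ParallelR. The model, the built-in `mtcars` data set and the choice of two workers are likewise just examples.

```r
library(foreach)
library(doParallel)   # assumption: a stand-in backend, not ParallelR itself

registerDoParallel(cores = 2)   # ask for two worker processes

# Bootstrap the slope of mpg on weight. No iteration depends on another,
# so the iterations can run in any order and on any available processor.
slopes <- foreach(i = 1:100, .combine = c) %dopar% {
  set.seed(i)                                    # reproducible resample per iteration
  rows <- sample(nrow(mtcars), replace = TRUE)   # resample rows with replacement
  coef(lm(mpg ~ wt, data = mtcars[rows, ]))[["wt"]]
}

stopImplicitCluster()   # release the workers started by registerDoParallel

length(slopes)                       # one slope per iteration
quantile(slopes, c(0.025, 0.975))    # rough bootstrap interval for the slope
```

Swapping `%dopar%` for `%do%` runs the same loop sequentially, which is a handy way to check that the results agree before going parallel.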

For really meaty jobs you can speed up performance even more by adding more processors with a cluster. You don’t need to have a dedicated laboratory full of high-powered workstations available: you can always harness those PCs and Macs sitting idle around the office overnight for your heaviest number-crunching problems. ParallelR makes it easy to take the code you’ve already run in parallel on your desktop and extend it to a cluster of machines running R using a feature called sleighs. And if the overnight cleaner accidentally unplugs one of those machines you won’t have wasted a night’s computing time: ParallelR has fault tolerance so your job will still complete even if some of the nodes in the cluster become unavailable.

Enterprise Support and Service

As our subscription-level version of R, REvolution R Enterprise is backed by REvolution Computing and comes with full technical support services from our teams. It also comes ready for use in validated environments, such as for the analysis of FDA-controlled clinical trials.

Looking for more?

In the coming weeks we’ll have more examples and stories about REvolution R Enterprise here, but in the meantime if you’d like more information or want to enquire about subscriptions and editions, please just contact REvolution Computing and we’ll be happy to help. 

REvolution Computing (press release): REvolution R Enterprise with Parallel Processing Now Available for 64-bit Windows
