Vector Computing: Which Is More Powerful, R or esProc?

November 27, 2012

Do you find vector computing tiresome in your statistical computing tools? Here is a comparison of vector computing in R and esProc. To me, one of the most attractive features of both is that their code is agile: a few lines are enough to implement a great deal of functionality. For example, both let you compose vector computing expressions, simplify conditional statements, extend basic functions into advanced ones, and work with generic types. In vector computing specifically, both process data in bulk through functions and operators, so explicit loop statements can be avoided. Users gain two advantages: first, the code is easy for business experts to grasp, which keeps the learning cost low; second, it is easier to parallelize the computation and improve performance.
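
As a small illustration of what avoiding the loop statement means, the sketch below doubles every member of a vector in R twice: once with an explicit for loop and once with a single vectorized expression. The vector and the variable names are made up for this illustration.

```r
# Hypothetical sample data, made up for this illustration
prices <- c(10, 20, 30, 40)

# Loop version: visit each member explicitly
doubled_loop <- numeric(length(prices))
for (i in seq_along(prices)) {
  doubled_loop[i] <- prices[i] * 2
}

# Vectorized version: the operator is applied to every member at once
doubled_vec <- prices * 2

# Both approaches give the same vector
identical(doubled_loop, doubled_vec)
```

The vectorized line states what is computed rather than how to iterate, which is the property the rest of the comparison relies on.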

To show the subtle differences between R and esProc in vector computing, let us walk through several examples below.

Firstly, let’s check the most basic functions: getting and assigning vector values. For example, get the 5 values whose subscripts run from 6 to 10, and then replace them with another 5 values.

R solution:
A1<-c(51,52,53,54,55,56,57,58,59,60)
A2<-A1[6:10]
A1[6:10]<-seq(1,5)

esProc solution:
A1    =[51,52,53,54,55,56,57,58,59,60]
A2    =A1(to(6,10))
A3    >A1(to(6,10))=to(1,5)

Comments: Both let users get and assign values easily, with almost the same usage. Subjectively, however, I prefer R's “:” for representing interval ranges. It looks more intuitive and agile.

Then, let’s compare them on arithmetic operations on vectors.

R solution:
A4<-c(1,2,3)
A5<-c(2,4,6)
A4*A5    # multiply the vectors member by member; the result is: [1] 2 8 18
A4+2    # add a constant to every member; the result is: [1] 3 4 5
ifelse(A4>1,A4+2,A4-2)    # conditional evaluation; the result is: [1] -1 4 5
sum(A4)    # aggregate by summing the members; the result is: [1] 6
sort(A4,decreasing = TRUE)    # sort in descending order; the result is: [1] 3 2 1

esProc solution:
A4    =[1,2,3]
A5    =[2,4,6]
A6    =A4**A5    ‘multiply the vectors member by member; the result is: 2 8 18
A7    =A4.(~+2)    ‘add a constant to every member; the result is: 3 4 5
A8    =A4.(if(~>1,~+2,~-2))    ‘conditional evaluation; the result is: -1 4 5
A9    =A4.sum()    ‘aggregate by summing the members; the result is: 6
A10    =A4.sort(~:-1)    ‘sort in descending order; the result is: 3 2 1

Comments: As the examples show, both R and esProc handle the four arithmetic operations, aggregation, and sorting on vectors well, and their syntaxes are very close. One thing worth noting is that esProc code looks more “object-oriented” on the surface, while R is truly object-oriented at its core. The former suits business experts who want to use the tool directly and is popular in the general business sector; the latter suits programmers who write extension packages themselves and is more accepted in scientific fields.

Next, let us look at vector computing on structured data, using the Orders table from the Northwind database:
Query the orders whose freight charge is between 200 and 300.
Query the orders dated 1997.
Compute the intersection of the two sets above, i.e. the orders whose freight charge is between 200 and 300 and which were also placed in 1997.
Group the result of the previous step by EmployeeID, and average the freight charge for each employee.

R solution:
# result is assumed to already hold the Orders table as a data.frame
A2<-result[result$Freight>=200 & result$Freight<=300,]
A3<-result[format(result$OrderDate,'%Y')=="1997",]
A4<-result[result$Freight>=200 & result$Freight<=300 & format(result$OrderDate,'%Y')=="1997",]
A5<-tapply(A4$Freight,INDEX=A4$EmployeeID,FUN=mean)

esProc solution:
A2    =A1.select(Freight>=200 && Freight<=300)
A3    =A1.select(year(OrderDate)==1997)
A4    =A2^A3
A5    =A4.group(EmployeeID;~.avg(Freight))

Comments: R is good at querying and at grouped statistics. For set operations, however, R is weaker than esProc. In the R example above, the result is obtained indirectly, by a combined query, rather than by a set operation.

R can only perform set operations on simple vectors, for example intersect(A2$OrderID,A3$OrderID); it cannot directly apply a set operation to structured data such as a data.frame.
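
The indirect route described above can be sketched in R as follows: intersect the key vectors first, then use the shared keys to pull the matching rows. This is only a sketch. The small data.frame is a made-up stand-in for the Orders table, and the column names (OrderID, EmployeeID, Freight, OrderDate) are assumed to match Northwind.

```r
# Hypothetical stand-in for the Orders table (made-up rows)
result <- data.frame(
  OrderID    = 1:6,
  EmployeeID = c(1, 1, 2, 2, 3, 3),
  Freight    = c(250, 100, 220, 280, 300, 210),
  OrderDate  = as.Date(c("1997-01-15", "1997-03-02", "1996-07-09",
                         "1997-11-20", "1997-05-05", "1998-02-14"))
)

# Steps 1 and 2: the two separate queries
A2 <- result[result$Freight >= 200 & result$Freight <= 300, ]
A3 <- result[format(result$OrderDate, "%Y") == "1997", ]

# Step 3: intersect() is applied to the plain key vectors ...
common_ids <- intersect(A2$OrderID, A3$OrderID)
# ... and the shared keys are then used to select the matching rows
A4 <- result[result$OrderID %in% common_ids, ]

# Step 4: average freight per employee
A5 <- tapply(A4$Freight, INDEX = A4$EmployeeID, FUN = mean)
```

The extra step through a key column is what makes this approach indirect when compared with a single set operator applied to the two record sets.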

Of course, this is not to say that R is weak in vector computing. In fact, R is easier to use than esProc for matrix-related computation. For example, to find the eigenvalues of a matrix A, R users can simply call eigen(A), while esProc offers no function that expresses this directly. From this angle, esProc is better suited to business computing, while R is better at scientific computing.
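
For readers who have not used eigen(), here is a minimal sketch. The 2x2 matrix is a made-up example, chosen to be diagonal so that the eigenvalues can be checked by hand.

```r
# Hypothetical 2x2 symmetric matrix; its eigenvalues are the diagonal entries
A <- matrix(c(2, 0,
              0, 3), nrow = 2, byrow = TRUE)

e <- eigen(A)
e$values   # eigenvalues, returned in decreasing order
e$vectors  # matching eigenvectors, one per column
```

The call returns a list, so the eigenvalues and eigenvectors are available from one function call without any explicit loop.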

In conclusion, for vector computing, both R and esProc perform well in basic computations. More specifically, R is second to none in matrix computation, and esProc beats R in handling structured data.
