January 31, 2013

Who Unchained the Django of the Business Computing Field?

Computing in business activities involves enterprise reporting (Reporting), business data integration and cleaning (Data Integration and ETL), OLAP (Online Analytical Processing), ERP (Enterprise Resource Planning), CRM (Customer Relationship Management), SCM (Supply Chain Management), and DSS (Decision Support System). Unlike scientific computing, business computing has its own characteristics:

It focuses on the structured data found in the business environment

It is carried out from a business perspective, by business specialists

It must solve complex and diverse business problems

It must respond rapidly to changing business demands

Both R and esProc can meet the business computing demands listed above. They have much in common: both are optimized for structured data, and both provide comprehensive interactive computation for handling business problems. However, each has its own character. R is open source and offers a massive collection of library functions; esProc has an intuitive style and a syntax that is easy to learn and use. Judging from these features, R is better suited to technical experts, while esProc is oriented toward business specialists.

Below, we discuss who unchained the Django of the business computing field.

Environment for use
Both R and esProc are download-and-install desktop software. Neither requires a dedicated server deployment, and both support all mainstream business databases as well as direct retrieval from Excel and TXT files.

The esProc IDE supports only Windows, but its JDBC interface also supports Linux, Mac OS X, BSD, Unix, and other operating systems, for example when esProc acts as a data source for a reporting tool or a Java application. As for R, both the IDE and the backend interface support all of the above platforms.

esProc provides a graphical configuration interface for connections, while R implements everything in code. From this point of view, esProc is the better choice for business personnel with a relatively weak technical background. However, with the help of third-party software, R can also offer a graphical configuration interface, so the difference is not great.

Structured data computing
Structured data is the main format of business data. Most of it is stored and managed centrally in databases, and some is scattered across Excel and text files. Both R and esProc provide good support for basic structured data computing, for example data retrieval, sorting, conditional queries, getting unique values, and fuzzy search. Let's look at the two examples below:

Retrieve data
R: result<-sqlQuery(conn,'select * from Objects')
esProc: $select * from Objects

Sort
R: A1[order(A1$CustomerID,-A1$salesValue),]
esProc: =A1.sort(CustomerID, salesValue:-1)
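
To make the R side concrete, here is a minimal, self-contained sketch in base R. The sample rows are made up for illustration; only the names A1, CustomerID and salesValue come from the example above.

```r
# Hypothetical sample data; only the column names follow the article.
A1 <- data.frame(
  CustomerID = c("C2", "C1", "C2", "C1"),
  salesValue = c(300, 150, 500, 700),
  stringsAsFactors = FALSE
)

# Ascending by CustomerID, then descending by salesValue.
# The unary minus works only because salesValue is numeric.
sorted <- A1[order(A1$CustomerID, -A1$salesValue), ]
print(sorted)
```

Note that order() returns row positions, and the trailing comma in the brackets tells R to keep every column.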

Where the two differ in structured data computing is the syntax. The esProc syntax is easier to understand, so business personnel can grasp it quickly. For example, to group and summarize:

=A1.group(DepName; DepName, ~.count(), ~.sum(Salary))

By comparison, R has a more academic flavor, as the equivalent computation shows:

A6<-aggregate(A1$Salary,list(A1$DepName),sum)
A6$count<-tapply(A1$Salary,A1$DepName,length)
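
For readers who want to run the R version, the following is a small sketch under assumed data. The employee rows and the column name totalSalary are hypothetical; A1, A6, DepName and Salary follow the snippet above.

```r
# Hypothetical employee data; only DepName and Salary come from the article.
A1 <- data.frame(
  DepName = c("Sales", "HR", "Sales", "IT", "HR", "Sales"),
  Salary  = c(5000, 4000, 6000, 7000, 4500, 5500),
  stringsAsFactors = FALSE
)

# Sum of salaries per department. aggregate() names its output
# columns Group.1 and x by default, so we rename them.
A6 <- aggregate(A1$Salary, list(A1$DepName), sum)
names(A6) <- c("DepName", "totalSalary")

# Head count per department. Both aggregate() and tapply() order
# the groups by sorted department name, so the rows line up.
A6$count <- as.vector(tapply(A1$Salary, A1$DepName, length))
print(A6)
```

The two-step shape is the point of the comparison: the sum and the count come from two different functions, and the result has to be stitched together by hand.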


The contrast is especially clear in join queries:
esProc: =join@1(A1:CustomerID:Orders, B1:CustomerID:Customers)
R: merge(A1,B1,by.x="CustomerID",by.y="CustomerID",all.x=TRUE)
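
As an illustration of what the R call does, here is a minimal sketch with made-up orders and customers. The rows and the columns OrderID and CustomerName are assumptions; A1, B1 and CustomerID come from the example above.

```r
# Hypothetical orders (A1) and customers (B1).
A1 <- data.frame(
  OrderID    = 1:4,
  CustomerID = c("C1", "C2", "C1", "C9"),
  stringsAsFactors = FALSE
)
B1 <- data.frame(
  CustomerID   = c("C1", "C2", "C3"),
  CustomerName = c("Acme", "Globex", "Initech"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every order, even when no customer matches
# (a left join). Unmatched orders get NA in the customer columns.
joined <- merge(A1, B1, by.x = "CustomerID", by.y = "CustomerID", all.x = TRUE)
print(joined)
```

When the key has the same name on both sides, as it does here, by = "CustomerID" is a shorter equivalent of the by.x/by.y pair.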

Interactive computing
Interactive computing is a procedure in which analysts inspect the result of the previous step and decide the next computing action on the basis of that result. In this procedure, the results of previous steps can be reused conveniently. Interactive computing is a core capability for solving complex problems: it decomposes one complex and obscure computational goal into several simple and clear steps, and by solving these simple problems one by one, the complex goal is ultimately achieved.

This advantage is particularly valuable in business computing, where obscure and complex questions often arise, such as "Why have customer complaints been rising in recent days?" or "What are the characteristics of product sales this year?", rather than simple and clear ones such as "How many complaints did customers make this month?" or "Which 10 states have the highest sales volume?". The latter kind lets users design a clear algorithm that produces a concise answer in a single pass, so we call it "All-at-Once". The former kind requires users to make an assumption first and then verify it in order to set the direction of the next computation, so we call it "Step-by-Step".

Both R and esProc have outstanding interactive computing capability. To illustrate with an example, suppose that a large volume of order data is to be retrieved, filtered on conditions, and summarized by group.

R in Interactive Computing
Retrieve data and filter by time interval.

In R, a computed result can be viewed in the panel on the right only after it has been given a name. Among the results listed there, fTimeData is the result of this step. To view the contents of a variable, click its name, as shown in the figure below:

Alternatively, enter the variable name in the console to inspect fTimeData, as shown in the figure below:

Next, group fTimeData by name and month, and name the grouped data gNameMonth. The default field names in gNameMonth do not make much sense in business terms, so in this step the fields are also renamed, as shown in the figure below:

The boxes and lines in the figure above show the reference relations between the steps. Lastly, click the variable name gNameMonth on the right to view the computed result:
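
Since the steps above are described rather than listed, here is one way they might look in base R. This is only a sketch: the order rows, the column names name, orderDate and amount, the date range, and the renamed field total are all assumptions. Only the variable names fTimeData and gNameMonth come from the walkthrough.

```r
# Hypothetical order data; in practice it would come from a database query.
orders <- data.frame(
  name      = c("Ann", "Bob", "Ann", "Bob", "Ann"),
  orderDate = as.Date(c("2012-01-05", "2012-01-20", "2012-02-11",
                        "2012-03-02", "2011-12-30")),
  amount    = c(100, 250, 300, 50, 999),
  stringsAsFactors = FALSE
)

# Step 1: filter by time interval, then inspect the intermediate result.
fTimeData <- subset(orders,
                    orderDate >= as.Date("2012-01-01") &
                    orderDate <= as.Date("2012-02-29"))
print(fTimeData)

# Step 2: group the filtered data by name and month, summing the amounts.
gNameMonth <- aggregate(fTimeData$amount,
                        list(fTimeData$name, format(fTimeData$orderDate, "%Y-%m")),
                        sum)

# Step 3: aggregate() calls its columns Group.1, Group.2 and x by default,
# so rename them to something meaningful in business terms.
names(gNameMonth) <- c("name", "month", "total")
print(gNameMonth)
```

Each step reuses the named result of the step before it, which is what makes the step-by-step style possible in R.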

esProc in Interactive Computing
Retrieve data and filter it by time interval.


In esProc, the computation is written and displayed in a grid-like cellset, so the result can be viewed directly by simply clicking cell A3:
Proceed to group and summarize the result in A3. This computation is written in A4, as shown in the figure below:


The boxes and lines in the figure above show how cell A4 references the computed result of A3. esProc allows the result of a previous computation to be referenced directly by its cell name. For long, complex computations or frequently used intermediate results, esProc users can also define a name that makes sense in business terms.

For interactive computation, esProc is easier to use and better suited to business specialists. Its computation has a two-dimensional, grid-style flavor: the code layout is clear and aligns naturally, references between cells are more intuitive, and computed results are easier to view. In addition, esProc provides an Instant Computation mode in which each line of code runs automatically as soon as it is entered, and the result is displayed automatically as well. Whereas debugging in R relies on manually written code, esProc has real debugging functions, for example breakpoints that can be set directly on cells. For the same task, esProc scripts are more concise and readable, and easier for business specialists to learn and grasp.

Syntax features
The leading role in business computing belongs to the business specialist. The syntax of esProc is tailored for business specialists, while R is tailored for technical experts such as mathematicians and statisticians.

For structured data, esProc takes the record as its unit: one record can be a piece of product information or employee information, which is a very common format for business data. R takes the column as its unit (known as a vector). This works much like holding all the data of one field in a database, which makes column-wise computation much faster than record-by-record processing. In esProc, multiple records form a table (known as a TSeq), while in R, multiple columns form a table (known as a data frame). Both kinds of table can be sorted, grouped, and summarized. However, the esProc form is more intuitive and easier to understand than the R form. For example, from a business table, select the records whose preceding record has a value below 5000.

esProc: business.select(value[-1]<5000)
R: subset(business,c(0,value[-length(value)])<5000)
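
The R expression is easier to follow when the shifted column is built as a separate step. The sketch below uses made-up rows, and the names id, prev and result are hypothetical helpers introduced for illustration.

```r
# Hypothetical business table.
business <- data.frame(id = 1:5, value = c(6000, 4000, 7000, 3000, 8000))

# Shift the column down by one row so that each position holds the
# previous record's value. The first record has no predecessor, so the
# article's expression pads it with 0.
prev <- c(0, business$value[-length(business$value)])
# prev is now 0, 6000, 4000, 7000, 3000

# Keep the rows whose previous value is below 5000.
result <- business[prev < 5000, ]
print(result)

# The one-line form from the article selects the same rows.
same <- subset(business, c(0, value[-length(value)]) < 5000)
```

One consequence of padding with 0 is that the first record always passes the filter, which may or may not be what the business question intends.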

esProc syntax is characterized by its agility. Because the syntax is flexible, esProc users need to master only a few functions, which they can combine to cover a great many tasks. R is characterized by its abundance of functions, with a separate function for nearly every task. By memorizing many function names and parameters, R users can write code quickly. For example, R requires users to choose among functions such as apply, tapply, sapply, lapply, and by to iterate or group in different scenarios. By comparison, esProc users can use the single group function to express the various grouping conditions in a uniform way.
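
To show what "different functions for different scenarios" means in practice, here is a short base R sketch. The staff table and all variable names are hypothetical; the four functions are standard base R.

```r
# Hypothetical staff table.
staff <- data.frame(
  dep    = c("HR", "IT", "HR", "IT"),
  salary = c(4000, 7000, 4500, 6500),
  bonus  = c(400, 700, 300, 900),
  stringsAsFactors = FALSE
)

# tapply: apply a function to one vector, split by a grouping vector.
by_dep <- tapply(staff$salary, staff$dep, mean)

# sapply: apply a function to each column and simplify to a vector.
col_means <- sapply(staff[c("salary", "bonus")], mean)

# apply: apply a function across the rows (margin 1) of a table.
row_totals <- apply(staff[c("salary", "bonus")], 1, sum)

# lapply: apply a function to each element of a list and return a list.
max_by_dep <- lapply(split(staff$salary, staff$dep), max)
```

Each call is short, but the user has to know which of the four fits the shape of the input and the desired shape of the output.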

As another example, to compute a 3-day moving average, esProc users can simply use the avg function, while R users have to use filter. From a business perspective the R form is hard to understand, and only a very skilled technical expert can use it correctly, as shown below:

R: filter(data$value/3, rep(1,3), sides=1)
esProc: data.(~{-1,1}.(value).avg())
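
The R call is compact but not obvious, so here is a small worked sketch. The five sample values and the column names day and ma3 are assumptions made for illustration.

```r
# Hypothetical daily values.
data <- data.frame(day = 1:5, value = c(3, 6, 9, 12, 15))

# stats::filter applies a linear filter. Dividing the values by 3 and
# using weights of 1 sums three thirds, which is the 3-point average.
# sides = 1 makes the window trail: the current value and the two before it.
ma3 <- stats::filter(data$value / 3, rep(1, 3), sides = 1)

# The result is a time-series object; as.numeric() turns it into a plain
# vector. The first two positions are NA because the window is incomplete.
data$ma3 <- as.numeric(ma3)
print(data)
```

Writing stats::filter explicitly avoids picking up a different filter function from another loaded package. Passing sides = 2 would center the window on each value instead of trailing it.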

esProc has been specially optimized for typical business problems such as link relative ratio comparison, year-over-year comparison, ranking, growth rate, and cumulative value. R, a language designed for technical experts, gives less consideration to the characteristics of business computing. For these problems, therefore, esProc is more intuitive and understandable than R. For example, to compute the year-over-year comparison of monthly sales:

R: c(0,(result$value[-1]-result$value[-length(result$value)])/result$value[-length(result$value)])

esProc: result.((value-value[-1])/value[-1])
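
A runnable sketch of the R computation follows, using made-up figures. The column name period and the helper names n, growth and growth2 are hypothetical; result and value follow the snippets above.

```r
# Hypothetical sales figures, one row per period.
result <- data.frame(
  period = c("2010", "2011", "2012", "2013"),
  value  = c(100, 120, 90, 135),
  stringsAsFactors = FALSE
)

n <- length(result$value)

# value[-1] drops the first element (current periods 2..n);
# value[-n] drops the last element (previous periods 1..n-1).
# The first row has nothing to compare with, so it is padded with 0.
growth <- c(0, (result$value[-1] - result$value[-n]) / result$value[-n])

# diff() computes the same consecutive differences more directly.
growth2 <- c(0, diff(result$value) / result$value[-n])
print(growth)
```

The padding has to sit outside the division so that the numerator and the denominator have the same length.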

To sum up, both R and esProc are excellent business computing software. R can handle a more diverse range of deployment environments, and its abundant library functions and extensive third-party support make it ideal for technical experts. esProc is a better fit for business specialists, considering its business data support, interactive computing capability, and syntax style.
