May 23, 2013

Why Plug-and-Use Desktop BI Software Is So Powerful?

The basic function of a calculator is computing, which can be as simple as the four arithmetic operations and also as complex as the calculation for the next move in chess with Deep Blue. Among these, esCalc is a desktop data calculation tool for the business users to handle the occasional, complex, and business-related data computation. Such tools are named as the desktop BI software.

For example, a stock analyst is required to recommend some stocks to the clients urgently. Among all calculations involved, the analyst needs to find shares from more than 20 daily trading stocks which had risen on 5 consecutive days in previous month. He opens this desktop BI software and imports the daily data of the more than 20 stocks. He continuously monitors and analyzes the data, works out the outline of algorithms, and then groups, summarizes, sorts, filters and takes other possible operations to calculate the results with some simple formulas. At last, he gets the result and makes the recommendation successfully.   

The similar calculations also include:

  • For all clients of the insurance company, what’s the average insurance price of those who bought the basic insurance first and then bought the additional insurance?
  • In the 3 months with the most client complaints, find out the top 3 products with the highest defective rate.
  • For the top N sales persons who achieved the 50% of the total sales for the company, what are their respective sales proportions to that of their respective sales team?

esCalc is the plug-and-use desktop BI software with the powerful computational capability for business users to grasp easily, owing to the following advantages:

Typical Desktop Application

The installer of esCalc is only dozens of MB. The installation procedure only requires a few clicks and can be run immediately after installation. As a JAVA application running on the Windows desktop, esCalc can run on most office computers independently without having to deploy the extra server additionally. 

esCalc resembles Excel in UIs that is easy to learn and grasp. The overall interface is shown as the below figure:


The Data Computation section is shown below:


esCalc is especially designed for the business users without technical background. It can be installed in a common working environment easily, and be used once installed.

Various Data sources Support

esCalc supports various databases, including MSSQL, Oracle, Access, MySQL, DB2, Sybase, and other mainstream databases. In facts, esCalc supports any databases with JDBC and ODBC drivers.

Besides the access to the database in the LAN, esCalc also supports the access to the local data file, such as txt, log, tab, other text files, and the Excel 97~2010 spreadsheets.

esCalc also supports the interactive calculations between various data sources, for example, to store the basic information like the company name, the contact information, and the company industry in the database, and to store the follow-up visit to some clients by sales persons in another Excel spread sheet.  With esCalc, you can merge the two pieces of data easily to form a follow-up visit log classified by the company industry. Even if their Number of clients and the Sort Criteria differ to each other, esCalc can handle it with easy.

The office environment of business personnel is sometimes rather complex, such as the CRM, MIS, ERP, DSS or performance environments, supply chain management, and other application systems. esCalc supports various data sources and interactive calculations and is capable to handle the complex office environment.

Step-by-step Calculation

The step-by-step calculation can decompose the complex computational goal into several simple steps and complete a seemingly complicated great goal by solving each simple small problem.

Still with the above example, to calculate the stock rising for consecutive 5 days, you can group by stock first, and then calculate the daily increment of the stock. An increment greater than 0 indicates the stock is rising. Based on this judgment, the consecutive rising days of each stock can be calculated out. Lastly, the longest consecutive rising days of each stock can be calculated through filtering and sorting. The details are shown below:

Step 1: Import the stock data from txt file, as shown in the below figure: 


Step 2: Filter out the data in the previous month by date. Suppose it is June in 2011, as shown in the below figure:


Step 3: Set the level by the stock code:


In the above figure, the newly-built level is in the red block on the left, and in the red circle on the right is a stock. This row is the summery row.

Step 4: Sort the transaction data of each stock by Date in ascending order.


Step 5: Calculate the daily growth rate of each stock. In this step, a calculation column D needs to be added, and a formula =(C4[A2]-C3[A2])/C3[A2] is entered to the D4. The formula will be pasted to the related cells automatically, as shown in the below figure:


Step 6: Compute the consecutive rising days. In this step, a computation column E needs to be added, and then the formula in E4: =if(D4>0,E3+1) to be entered. The result is shown in the below figure:


Step 7: Calculate the longest days of a certain stock rising consecutively by inputting the formula like max({E4}) in E2. The result is shown below:


Step 8: Fold the summary row by clicking on the level number 1 on the left, as shown in below figure:


Step 9: Sort by the longest consecutive rising days, as shown in below figure:


In the above figure, 5 stocks keep rising for 5 consecutive days in June 2011, which are American Express Co., The Boeing Co., Citigroup, Inc., General Motors Corp, and Coca-Cola Co. It is certain that the data can also be filtered by the longest consecutive rising days, as shown in the below figure:


To solve the relatively complex computational goal in this case, we decompose the computational goal into 9 steps of simple operation or formula computing.

The step-by-step computation allows users to decompose, simplify, and ultimately solve problems in a rather visual train of thoughts. Owing to this, business users can also solve some complex data computing problems by themselves.

Adequate data computational capability

esCalc is powerful enough to handle the various computational task in the daily office work.

Function as SQL in every aspect

With the same computational capability as SQL, esCalc can be used to filter, group, sort, and perform the distinct, union, join, and other equivalent actions of SQL.  Please refer to the menu shown in the below figure:


No technical background required

SQL can only be grasped by the professional technician, while esCalc represents SQL functions with graphics and decomposes it into several steps so that even the business users without technical experiences can handle it easily. For example, in the step 3 of “building levels by stock code”, esCalc does not require any complex coding, and users only need to set it up in the menu easily.


Alternatively, the default shortcut menu can also be used, as shown in below figure:


Because SQL does not support the step-by-step calculation, SQL solution to this case will be lengthy, error-prone, and hard-to-understand statements. Obviously, it is hard for the business users to grasp.

Computational capability beyond Excel

The Excel and other tools alike do not provide the auto copy and intelligent adjustment functions. The similar functions can only be implemented with a great amount of manual operations. For example, the longest consecutive rising days of each stock, esCalc users can enter the formula =max({E4}) in E2 directly, as shown in the below figure:


After entering the formula in E2, all homocells of E2 will be populated with the formula, for example, E94. By comparison, Excel does not provide the auto-copy function, and you will have to conduct it manually. In addition to the auto-copy, esCalc also supports the intelligent adjustment function, for example, the formula in E94 will be adjusted to =max({E96}) to meet the business logics, as shown in the below figure:


 In addition, the homo-cell model maintains the business relations between esCalc cells, so that the true grouping is implemented, and the grouped data can be further processed. This is hard to be implemented with the Excel and other tools alike, for example, in the step 4 of the above case, “sort the dealing data of each stock by Date ascendingly”. For Excel, ungrouping is required to sort the data by stock & date and then group. However, with esCalc, you can sort directly to implement it.

Multiple computational functions

esCalc can be used to perform the complex computations related to sequence numbers. For example, still in the above example, calculate the rankings of closing price of the end of previous month on the basis of the result in step 8.

Firstly, calculate the closing price in the end of month, then append a new column F, and enter the new formula in F2: ={C3}.m(-1), as shown in the below figure:


Then calculate the rankings. Append the new column G, and input the formula ={F2}.ranki(F2) in G2, as shown in the below figure:


As can be seen from above, 9N12584 (i.e. IBM) has the highest closing prices among these shares of more than 20.

esCalc can also perform the intersection, union, compliment, and other set operations, for example, compute the stocks that are among the last 10th (cheaper) by closing price and rising for consecutive 4 days. esCalc can also be used to perform the inter-row computations such as monthly year-on-year comparison and the link relative ratio comparison, for example, compute the stock price moving averages in the 5 days.

In conclusion, esCalc is a typical desktop application which is able to support multiple data sources and step-by-step computation with sufficient data computational capability. It is the data computation tool for business users to handle the various computational problems in the daily office work easily. 

May 22, 2013

What role desktop BI plays?


All modern Information Technologies which are capable of improving the enterprise competitiveness fall in the scope of BI, such as ERP, CRM, Reporting tools, Data Computing, Statistical Analysis, Data Mining, OLAP, and ETL, etc. They can be divided into 2 categories: Desktop BI and Solution BI.

 Desktop BI runs on the desktop environment, and almost does not need the support of the servers. Usually, such tools only provide the core BI function with a relatively low requirement on the technical environments, for example, Microsoft Excel, StataCorp Stata, Raqsoft ES series, IBM SPSS, and RStudio.

On the contrary, solution BI cannot work without the support of dedicated server. It’s usually the integrate solution or multi-module platform. Besides the core BI functions, there are also some external functions like the authority management, resources sharing, and collaboration among various jobs. Such software includes the SAS, Spotfire, Tableau, Qlikview, and SAP BI.

Although there are obvious differences, these two types have many advocates. In the following sections, we will discuss their features respectively from 4 aspects, including software structure, function feature, technical requirements, and target users.

Software Structure

Basic Structure: The Desktop BI is a typical desktop application that can be installed on an average PC. With it, you can access the external database via network and analyze the local files like RStudio if network is unavailable. The Solution BI is usually the B/S or C/S architecture with its own dedicated server. Owing to this, Solution BI requires a high performance server, a complete network environment, and a disaster recovery system like Spotfire. Therefore, everyone can use Desktop BI easily because it does not require much on its environment for use. On the contrary, the solution BI has high environment requirements and can only be used in the enterprise environment.

Installation: Solution BI software can be roughly divided into 2 types: server software and client software. The IT department usually will be responsible for the server end. Although it is very complex, the end users need not to care for it. Most clients are browsers capable to run with zero installation like QlikView. Even for the fat clients, the Java WebStart and the relevant technologies can be leveraged to reduce the complexity of installation and future upgrade. By comparison, Desktop BI requires users to download and install the software and sometimes various plug-ins like RStudio, which adds difficulty to the upgrade and update. It is far less convenient than Solution BI.

Performance: The differences of basic structure result in the great differences of their performance. Desktop BI is operated on the common PC, which means that the computation involving great data volume, high I/O and performance cannot be performed on the desktop. On the contrary, Solution BI usually completes the computation on the high performance server end, so that Solution BI has its advantages in the massive data processing and the operation efficiency. Sometimes, Desktop BI adopts cloud computing, server cluster, multi-core-based parallel computing, etc. as the supplementary means to enhance the performance, such as SPSS. This is certainly not the typical application of SPSS.

Maintenance: Their maintenance overheads are different. Desktop BI is zero maintenance cost and ready to use once installed nearly without any assistance of IT department. By comparison, Solution BI requires a data center, a network administer, and a server administer with a relatively high maintenance overhead. In addition, Solution BI generally requires a set of corresponding supporting mechanism. The administrators and users are also required to go through the professional training at high expense, for example, SAS.

Functions and Features

The functions of Desktop BI usually are specific and focused on a single-purpose, while Solution BI functions are diverse and complete.

BI-specific vs. miscellaneous functions: DeskTop BI only provides the core BI function, excluding the functions like access control, resource assignment, collaboration, and other non-BI functions – even if these functions are available in some software, for example Excel, users almost do not use them. However, the Spotfire, Tableau, and other Solution BI software provides a complete set of non-BI functions such as ACL, logon, logging, performance monitoring, mobile and share, and other non-BI functions. In addition, according to module, the user types can also be divided into system administer, database administer, reporting template designer, common business user, business manager, and other roles.

Single purpose vs. multiple purposes: This means the single usage of Desktop BI. For example, Excel and esCalc mainly focus on the spreadsheet function, while their reporting and ETL functions are relatively weak; RStudio and esProc are mainly for data analysis without any spreadsheet functions. Solution BI integrates a great number of BI modules. For instance, the SAP BI not only provides the ETL and Data Warehouse functions, but also the Reporting and OLAP functions.

In this case, we can find it is a typical example of 80-20 rule/Pareto Law that Desktop BI only offers 20% functions of Solution BI but is capable to achieve 80% BI goal. It is certainly not to deny the value of Solution BI. In fact, a secure, robust, and highly expandable system is a worthy investment.

Study curve:  Due to single-purpose and highly specific functions, Desktop BI is easy and fast for users to learn. Its study curve is smooth and gentle. The learning cost is very low, and users can fully grasp it after a few days of learning. Excel is even assigned as the learning contents for the primary school students. On the contrary, the study curve of Solution BI is quite high because Solution BI software consists of many modules, and provides many complex functions. The robustness and collaboration factors of Solution BI also add the cost and difficulty to learn.

Applicable scenarios: The complex access control and cooperation based on division of jobs determines that Solution BI is only fit for the job involving thousands of documents, rigid work, and routines. For example, for the jobs of reporting, Dashboard, ETL, OLAP, software of Spotfire, Tableau, Qlikview, and SAP BI all belong to this type. For this reason, it always takes quite a long time to complete a Solution BI job. For example, the OLAP analysis with SAP BI requires the multi-step and various jobs collaboration on model design, data acquisition, and task scheduling. Even the apparently simple job of report preparing requires the database administer to compose the stored procedures, the business expert to decide the business logics, the art designer to design the style, and the report designer to write the report scripts.

Instead, Desktop BI is a tool for users to work creatively. Desktop BI requires users to deliver the reliable results before the commercial opportunities slip away. For example, use R to analyze the reason for the sharp increase of website visits. Considering this, its main concern is not the access control, robustness, and cooperation but the agility and convenience.

Technical Requirements

The technical requirements of Desktop BI are lower than that of Solution BI.

Self-independence vs. Teamwork: Desktop BI usually allows users to complete the data computing and statistical analysis independently. The computational goal is highly related to the business, and thus the technical assistance becomes unnecessary. By comparison, Solution BI requires the strong technical assistance to complete the server deployment, report scripts and SQL statements, secondary development, and other IT-related work.

Work Cycle: Because of the lower technical requirements, the work cycle of Desktop software is relatively short. Solution BI software instead requires the cooperation of experts and coordination between various departments, and thus the duration of the cycle is relatively long.

The deviation of computational goal: The relatively low technical requirements can ensure that the business experts can fully grasp the Desktop BI all by themselves. As we know, the “B” of “BI” stands for “Business”. The whole computational goal of BI is to ultimately serve the purpose of business. Therefore, facing the business requirements, to get a result of the most practical and actual to business conditions, and the closest to the computational goal, it is compulsory to enable the business experts to use the tools by themselves, analyze from the business perspective, and turn their business expertise into algorithm.

By comparison, Solution BI instead requires the technical assistance from IT experts. A lot of core algorithms are even proposed by the technical experts, such as the modeling procedure of Qlikview. In the process of implementation, the IT experts usually play the role of constructing the underlying infrastructure on which the business experts will carry out their work. As we know, a solid foundation determines the lifecycle of the superstructure of a building. Therefore, the infrastructure laid by the IT experts will have great impact on the computation procedure of business experts. The ultimate computation results may easily deviate from the computational goal. Every business expert may ever hear these words: No way. You cannot drill like this because we did not consider this in designing; No. The desired data is hard to obtain through SQL and please consider to choose other data; No. we need stored procedure to achieve this, and we will have to discuss it with the DBA and IT manager; No. The cost is too high for IT department spending a few months on it. We need to change the computational goal.

Of course, the high technical requirement is not always a bad thing. For the BI work of relatively huge data volume and clear standard like reporting, the high technical requirement and cooperation based on division of work usually leads to the high quality of output. That is to say, the report is more refined, the algorithm is more efficient, and the model is more complete, just to name a few.

Target Users

According to the above analysis, we can conclude the most suitable user groups for these two types of BI software.

Data computing oriented vs. Data management and presentation oriented: Users of Desktop BI software all focus on the data computing, such as the Microsoft Excel, Raqsoft ES series, RStudio. These types of users usually have to perform the complex procession on data to get a certain valuable computational results. For example, find the product whose sales value has ever been rising for 3 consecutive months. Users of Solution BI are usually data management and presentation oriented. For example, use Dashboard to present the performances of major departments dynamically.

Business-oriented vs. Technology-oriented: Users of Desktop BI are mostly business-oriented. For example, the accountants, account manager for banking, business analyst, and stock analyst. Users of Solution BI are mostly technology-oriented, for example, the report designer, enterprise portal designer, and data warehouse administer. Of course, such division is not always clear, for example, the stock analyst may very likely the expert in both mathematics and finances.The report designers not only understand the reporting scripts, but also have in-depth knowledge in business.

Temporary and sudden demand vs. Daily routine requirement: Users of Desktop BI usually have to confront to the temporary and sudden requirements. Such requirements are often in short of the existing and ready model to use. Instead, it depends on the concrete analysis according to the specific factors at that point. For example: find the main reason for the sharp increase of complaints in last week. Or, of the clients accounting for the top 50% of the sales values, find the ones whose ranks of this year rise again. By comparison, the results of Solution BI are mostly typical and daily common ones. For example, the monthly balance sheet and cash flow report. Or, the 3 dimensions of client, network of dealers, and product, and the 2 metrics of sales value and sales volume for OLAP analysis.

No technical support vs. Available technical support: Desktop BI is fit for those users who can hardly get any support and assistance from IT teams, for example, the sales assistant who travels a lot, business analysts who work at home, and the stock analyst as freelancer. Solution BI is fit for those users who can easily get the support and assistance from IT teams,  for example, the telephone sales assistant, and the banking report designer.

After all, we also must notice that the same users in different occasions will use different tools. For example, an accountant will login on to the enterprise portal to view the routine report, and this falls into the scope of Solution BI. Sometimes, they will use Excel to solve the temporary data computing task arranged by the finance administer, and this falls into the scope of Desktop BI. For another example, the business analyst can use esProc to handle the complex data computing independently if no support is available from the IT teams, and use SAS to perform the massive data computing if the support is available from IT teams.


May 16, 2013

Could Business Computing Be Done by Users without Technical Experience?


The business computing always occurs in enterprises, which plays an important role in enterprises operations, and mainly handled by the business users. The key business computing is usually characterized with timeliness and weak predication. There is no ready answer in the ERP, CRM, and other business systems. For example, find out the materials whose inventory level is below the warning limit for 3 consecutive days in the recent 2 weeks. Among those materials, which are used to produce the top 5 Halloween best-sellers? In another example, find out the agents contributing to the 60% of the total sales. Assign those agents in proper order to the salespersons whose performances rank ahead of others in the year-end sales promotion

Most business users have no IT experience. They can only rely on the business spreadsheet software to complete such computing. However, the traditional business spreadsheet software is weak in calculation, which will bring about errors in the calculation process.

Considering the above examples, you must filter or sort the summary values and detail values. But, the traditional business spreadsheet software is only fit for the 2D sheet computing, and lacks the further processing capability on the grouped data. The manual operations are usually required to implement computing with such spreadsheet and the process is error-prone.

In order to solve the above problems, the set operation capability to some degree is also required. But the business spreadsheet software has no concept of sets, and no business association between cells is available. The set operation can only be handled via the complicated search & find formula, which adds to the computing difficulty virtually, and thus a higher technical competence is required.

In this case, the sheets from multiple data sources need to be merged. But in the traditional business spreadsheet software, users can only copy the data, and is unable to merge the data at various levels or in the uncertain orders according to a certain relation. This is usually achieved through manual operation or complex formula.

esCalc, the desktop BI software, introduces the concept of “homocell”. It is the ideal spreadsheet to solve the above-mentioned problems. The core concept of esCalc is to treat the cells of various business statuses differently, and treat the cell of the same business status uniformly. In esCalc, filtering the summary value will not affect the detail data. The formula entered in the detail data section will be copied and pasted to other homocells automatically, while not the summary section. The homocell relationship between cells ensure both of them being in a same set, so users can seek the intersection set and union set conveniently; Multiple sheets can be merged through the join or union operation according to the homocell relation, even for the data at various levels and in the uncertain orders.

May 8, 2013

What Agile Desktop BI Tool Really Means?

BI (Business Intelligence) refers to the intelligence and ability to enhance the enterprises competitiveness, involving report presentation, reporting result calculation, OLAP analysis, business data calculation, data mining and predication. Among these, there are both the technician-oriented high-level systems, and the business user-oriented agile desktop BI tools. In this post, we only talk about agile desktop BI tools. But what is agile desktop BI tool and what are their standards?

Agile desktop BI software should have the following features:

Common BI features support:

The agile desktop BI software should fit for those business personnel to prepare the static report by themselves, even if they are inexperienced in IT. The friendly reporting design interface is necessary. It also supports the rapid reporting, the business personnel-oriented data processing, and high fidelity report preparation. In other words, it is capable to keep reports consistent in the stages of design, preview, pagination & printing, and export.

For example, make a product sales situation report for specific products to present the monthly sales of 3 products, and their link relative ratio, and monthly year-on-year growth.

The agile desktop BI software supports the calculation on the result of common reporting tools. Excel and the plain txt are the export formats supported by most reporting tools. It can import and calculate them directly; supports the data pasting from the report result directly on the clipboard; provides the calculator-style operation for business personnel; capable to conduct any process on data and the calculation between steps can be transited smoothly. 

For example, the reporting tools generate the below report result:


Find the big clients who account for 60% of the total sales of the company based on the above data. The result is as follows:


The agile desktop BI tools support the true OLAP analysis. It is capable to perform the interactive analysis arbitrarily and intuitively, decompose and simplify the obscure analysis goal. It provides the basic analysis methods that are both simple and easy to use. Then, lots of advanced analyses can be implemented through the free combination. esCalc is such kind of desktop BI software. With an Excel-style, esCalc becomes relatively easier to understand and learn. Moreover, esCalc also provides a range of powerful advanced functions to solve the complex problems regarding OLAP.

Agile installation deployment:

The size of the agile BI tool should be very small and easy to install and uninstall. For example, esCalc installer is only dozens of MB and only requires a few clicks to install and run, which is the typical example of agile desktop BI software. Agile desktop BI tools are capable to run on most desktop computers independently, not having to deploy the additional server:

Agile desktop BI software supports various mainstream databases, like esCalc which can manipulate data from different databases, including MSSQL, Oracle, Access, MySQL, DB2, and Sybase. This desktop BI software also supports the local data files, for example, Txt, Log, tab, and other text files; Excel 97, Excel 2010, and the Excel of other versions. It also supports the interactive calculation between various data sources, such as the calculation between Oracle and Excel.


Agile formula functions:

Agile BI software also provides the agile formulas and functions, so that the business personnel can easily represent the relatively complex calculations, such as comparison on year-on-year basis, link relative ratio, set operations, ranking and row number calculations.

For instance, calculate the players whose rankings are among the top 5 in every game. We take esCalc, the agile desktop BI software as an example. The data available is as follows:


Simply
input "={A3}" in E2, and the top 5 players of each game will be calculated out in E2, E8, and E14. Input "={E2}.isect ()" in E1, then the players whose rankings are among the top 5 in every game will be calculated automatically. In which, the function "isect" is to calculate the intersection of sets, and "{E2}" is a set to indicate "cells shares the same meaning with E2 regarding business” (homocell by name), that is, E2, E8, and E14.

It also has the similar function like:

diff(): Calculate the difference set of a group of data; for example, calculate the employee who made a full attendance in this quarter. You can calculate through the formulas like [set of employees, employee who ever absent in the 1st month, employee who ever absent in the 2nd month, employee who ever absent in the 3rd month].diff() to calculate.

There are also other advanced functions available, such as the sum(~*~) to calculate the sum of squares of a certain group of data; cumulate() to calculate the cumulative value of a group of data, ord() to calculate the relative row number in the calculation hierarchy, and ranki() to calculate the ranking of a certain number in a group of numbers.

All in all, agile desktop BI software supports the common BI functions, and is business personnel-oriented in the respect of installation deployment, and formula function. 

May 6, 2013

Broadening the Reach of Self-Service BI

By Gadi Yedwab, founder of Explore Analytics.

(Note: I'm  glad that Explore Analytics and Raqsoft are making efforts to self-service applications.I show high respect and agreement with the opinion in this article.)


The necessity of self-service is obvious once you realize that traditional BI has limited reach within user communities. For example, BI dashboards are typically tailored to the needs of decision makers and leave out a broader group of analytically-minded users who could leverage data to innovate and make improvements. For small companies and for teams with limited budgets, self-service is often the only viable option because current BI approaches require people’s time and expertise to setup.
The proliferation of spreadsheets as tools for data analysis is a proof that that existing needs are unmet.
The current approaches to self-service often suffer from the same problems that limit the reach of BI to broader user communities. This article focuses on these problems and discusses a new approach that can significantly broaden the reach of BI.

The Problems with Current Approaches to Self-Service BI
  • IT organizations often concentrate their efforts on the most strategic data, while leaving a lot of useful data outside the scope.
  • For performance reasons, IT often opts for data warehousing. This approach is expensive and therefore has limited reach. Small companies lack resources and find this approach to be cost prohibitive.
  • Providing self-service by periodically delivering data sets for analysis in spreadsheets or desktop tools does not satisfy the need for real-time data. Latency of information is often cited by users as the major drawback of their BI solution.
  • Desktop BI tools and spreadsheet downloads can be a security risk when users keep data on laptops, or send it via email. This approach also makes it hard to share and collaborate in the analysis.
The Spreadsheet as a Self-Service BI Tool
Let’s admit it: the number one tool for self-service BI is the spreadsheet. It’s been that way since the invention of the spreadsheet, and it still is. The most typical scenario is exporting data from an application and then analyzing it in Excel. The main drawback of this approach is that it’s outside the skill-set of most users.
Sure, having the data in a spreadsheet is better than having nothing, but using Excel for BI has serious limitations. Most users do not have the necessary skills to analyze data in Excel, especially if the data resides in more than one table. Even for users who are skilled in Excel, the data quickly becomes stale and there is no good way of collaborating with other users in the analysis.

A New Approach to Self-Service
The new approach minimizes the need for data warehousing thereby reducing costs and providing real-time data. It uses cloud-based solutions to facilitating collaboration and sharing. Moreover, cloud-based tools can bring the required expertise and cost down to within the means of small companies and teams inside large companies.
The premise is simple: if a solution can be useful to small companies with limited resources, then it can be very useful for all the under-served constituencies inside large companies. The spreadsheet already proved that, but we can do much better than that.

Reducing the Need for Data Warehousing
For more than two decades the common wisdom has been to keep ad-hoc query away from production systems. This is generally still a good idea. However, there are good reasons to reconsider that widely accepted notion.
A good self-service BI tool can control and prevent runaway queries.
Explore Analytics, for example:
  • Only joins tables on the primary key
  • Puts a limit on every query to prevent it from returning too many rows
  • Pushes all the filtering and aggregation to the data source thus eliminating the need to pull large query results
  • Controls the number of queries that concurrently execute against a data source
Modern database servers eliminate three reasons why a bad query would previously bring a database down to its knees.
  • Having multiple CPU cores, the database performs well even if several cores are momentarily tied up.
  • Large portions of the database reside in memory and a full-table scan can be done without noticeable impact to other transaction.
  • Liberally creating indexes doesn’t come with the performance penalty that it had a decade ago.
While “Big Data” is an important category, a lot of useful data reside in tables with less than a few million rows. Running a query to summarize data across a million rows can complete in a few seconds. That wasn’t the case a decade ago.

Using the Cloud
Having a centralized web-based self-service BI solution allows users to share and publish their analysis. It allows teams to leverage the diverse strengths of individuals and review the analysis to increase its accuracy. Analytically-minded people can create data analysis and share it with the rest of the team.
By keeping data sets and reports securely in the cloud, companies can avoid distributing data to laptops, desktops, and passing it around in email attachments.
If you’re thinking that the same can be accomplished using an in-house web-based solution, you may be right, but you should consider the cost and expertise that’s required to build and support this solution. A cloud solution can greatly reduce the expertise that’s needed as well as the direct costs of the service. It then becomes feasible even for small companies or teams.

IT Call to Action
IT organizations should identify data sources for real-time access. For other data sources, consider publishing data sets to the Cloud. Then provide a cloud-base tool such as Explore Analytics to deliver self-service analysis to users and unleash their creativity.

Application Vendor Call to Action
Application vendors should enable real-time data access by providing web-services APIs that allow ad-hoc query including joining data, filtering and aggregation. Remember that if you allow tools to push the filtering and aggregation to your application, then they’d have no need to pull large results in real time.


April 23, 2013

40+ Free Data Analysis and Visualization Tools 2013

Original posted on computerworld by Sharon Machlis.
This article list 30+ tools for data visualization and analysis, including desktop BI tools, tools for data analysis, visualization and presentation, JavaScript libraries for maps, charts and other data visualizations.

Skill levels are represented as numbers from easiest to most difficult to learn and use:

1. Users who are comfortable with basic spreadsheet tasks

2. Users who are technically proficient enough not to be frightened off by spending a couple of hours learning a new application

3. Power users

4. Users with coding experience or specialized knowledge in a field like GIS or network analysis.

Data visualization and analysis tools


Tool
Category Multi
purpose
visualization

Mapping  

Platform
Skill
level   
Data stored
or processed
Designed for
Web publishing?
R Project Statistical analysis Yes With plugin Linux, Mac OS X, Unix, Windows XP or later 4 Local No
Data Wrangler Data cleaningNo No Browser 2 External server No
OpenRefine (formerly Google Refine) Data cleaning No No Browser 2 Local No
Google Fusion Tables Visualization app/service Yes Yes Browser 1 External server Yes
Impure Visualization app/service Yes No Browser 3 Varies Yes
Many Eyes Visualization app/service Yes Limited Browser 1 Public external server Yes
Tableau Public Visualization app/service Yes Yes Windows 3 Public external server Yes
VIDI Visualization app/service Yes Yes Browser 1 External server Yes
Zoho Reports Visualization app/service Yes No Browser 2 External server Yes
PowerPivot** Analysis and charting Yes No Excel 2010 and some 2013 versions on Windows 3 Local No
Choosel Framework Yes Yes Chrome, Firefox, Safari 4 Local or external server Not yet
Exhibit Library Yes Yes Code editor and browser 4 Local or external server Yes
Google Chart Tools Library and Visualization app/service Yes Yes Code editor and browser 2 Local or external server Yes
JavaScript InfoVis Toolkit Library Yes No Code editor and browser 4 Local or external server Yes
D3 Library Yes Yes Code editor and browser 4 Local or external server Yes
Quantum GIS (QGIS) GIS/mapping: Desktop No Yes Linux, Unix, Mac OS X, Windows 4 Local With plugin
OpenHeatMap GIS/mapping: Web No Yes Browser 1 External server Yes
OpenLayers GIS/mapping: Web, Library No Yes Code editor and browser 4 local or external server Yes
OpenStreetMap GIS/mapping: Web No Yes Browser or desktops running Java 3 Local or external server Yes
TimeFlow Temporal data analysis No No Desktops running Java 1 Local No
IBM Word-Cloud Generator Word clouds No No Desktops running Java 2 Local As image
Gephi Network analysis No No Desktops running Java 4 Local As image
NodeXL Network analysis No No Excel 2007 and 2010 on Windows 4 Local As image
CSVKit CSV file analysis No No Linux, Mac OS X or Linux with Python installed 3 Local No
DataTables Create sortable, searchable tables No No Code editor and browser 3 Local or external server Yes
FreeDive Create sortable, searchable tables No No Browser 2 External server Yes
Highcharts* Library Yes No Code editor and browser 3 Local or external server Yes
Mr. Data Converter Data reformatting No No Browser 1 Local or external server No
Panda Project Create searchable tables No No Browser with Amazon EC2 or Ubuntu Linux 2 Local or external server No
Weave Visualization app/service Yes Yes Flash-enabled browsers; Linux server on backend 4 Local or external server Yes
Statwing Visualization app/service Yes No Browser 1 External server Not yet
Infogr.am Visualization app/service Yes Limited Browser 1 External server Yes
Datawrapper Visualization app/service Yes No Browser 1 Local or external server Yes
Cascading Tree Sheets Library Yes Yes Browser 1 Local or external server Yes
Dataset Library No No Browser 4 Local or external server Yes
Leaflet Library No Yes Browser 4 Local or external server Yes
Searchable Fusion Table Map Template Library No Yes Browser 3 Local or external server Yes
Tabletop Library No No Browser 3 Local or external server Yes
Data Explorer** Data acquisition, data reformatting No No Excel 2010 and 2013 on Windows 2 Local No


But i want to add 2 desktop bi tools, esProc for statistical computing and esCalc for business analysis and reporting. Both are under skill level 2.