In 2006, the mathematician Clive Humby coined the phrase “Data is the new oil” to correspond with the famous quote by Bill Gates “Content is king.” Equating data and oil triggered an outburst of discussion about data analytics—the science of refining data into valuable resources.
Global IT research firm Gartner defines data analytics as the management and analysis of data to solve business problems and drive business decisions. For example, unprocessed sales data may provide hindsight, but not insight. Data analytics users add value by raising business problems, solving them, and then making corresponding business decisions.
In this blog, we’ll go over the basic concepts of data analytics to introduce you to the topics we will cover in the rest of the series. We’ll go over the types of data analytics, how to implement data analytics, and more!
There are four main types of data analytics, each detailed below. Each type answers a frequently asked question that data can answer.
1. Descriptive Analytics – What happened?
Descriptive analytics reactively reports what happened. Statistical and analytical methods are used to interpret information, depict phenomena, and visualize relationships.
For example, raw financial data may show an increase in a company’s net income. This appears to be good news on the surface. However, by comparing net income with the budgeted target, a relationship depicted by descriptive analytics, management can see that the increase actually fell short of its target.
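To make the net income example concrete, here is a minimal Python sketch of that comparison. All figures and variable names are made up for illustration; they do not come from any real company.

```python
# Illustrative figures only; they are not taken from any real company.
actual_net_income = {"prior_year": 1_000_000, "current_year": 1_150_000}
budgeted_net_income = 1_300_000  # assumed target for the current year

# Year-over-year growth: the "good news" visible in the raw data.
growth = actual_net_income["current_year"] - actual_net_income["prior_year"]
growth_pct = growth / actual_net_income["prior_year"] * 100

# Budget variance: the relationship that descriptive analytics surfaces.
variance = actual_net_income["current_year"] - budgeted_net_income
variance_pct = variance / budgeted_net_income * 100

print(f"Growth vs. prior year: {growth:+,} ({growth_pct:+.1f}%)")
print(f"Variance vs. budget:   {variance:+,} ({variance_pct:+.1f}%)")
```

A positive growth figure next to a negative budget variance is exactly the situation described above: income rose, yet it still missed the target.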
2. Diagnostic Analytics – Why did it happen?
Diagnostic analytics answers why an event happened. It is often referred to as root-cause analytics. A company identifies the different drivers of a gap between expected and actual results and narrows them down to the root cause or causes.
For example, a variance between budgeted and actual net income may be due to a decrease in the company’s gross profit. The decrease in gross profit can then be drilled down to a decrease in sales of high-margin products, which in turn results from higher churn among high-margin customers.
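The drill-down can be sketched in a few lines of Python. The product lines and amounts below are hypothetical; the point is only to show how splitting a total variance by driver highlights where to look first.

```python
# Hypothetical gross profit by product line (budget vs. actual).
gross_profit = {
    "high-margin products": {"budget": 800_000, "actual": 620_000},
    "mid-margin products": {"budget": 500_000, "actual": 510_000},
    "low-margin products": {"budget": 300_000, "actual": 295_000},
}

# Variance per driver: negative values are shortfalls against the budget.
variances = {
    line: figures["actual"] - figures["budget"]
    for line, figures in gross_profit.items()
}
total_variance = sum(variances.values())

# Rank drivers from the largest shortfall to the smallest.
ranked = sorted(variances.items(), key=lambda item: item[1])

print(f"Total gross profit variance: {total_variance:+,}")
for line, variance in ranked:
    print(f"  {line:<22} {variance:>+10,}")

# The first entry is only a candidate: it tells us where to drill down next.
root_cause_candidate = ranked[0][0]
```

From here, the same breakdown could be repeated one level deeper (for instance, by customer) to reach the churn explanation.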
3. Predictive Analytics – What will happen?
Predictive analytics proactively forecasts what will happen. This is usually aided by statistical models, computer simulations, and machine learning techniques that derive foresight into the future based on existing data.
For example, using machine learning, the company may predict a further loss of high-margin customers, which translates to a larger drop in gross margin.
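As a rough illustration of forecasting from existing data, the sketch below fits a straight-line trend with ordinary least squares. This is a deliberately simple stand-in for the machine learning models mentioned above, and the customer counts are invented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical count of high-margin customers at the end of each quarter.
quarters = [1, 2, 3, 4, 5, 6]
customers = [120, 115, 111, 104, 100, 95]

intercept, slope = fit_line(quarters, customers)

# Extrapolate the fitted trend one quarter ahead.
forecast_q7 = intercept + slope * 7

print(f"Trend: {slope:.1f} customers per quarter")
print(f"Forecast for quarter 7: {forecast_q7:.0f} customers")
```

A negative slope signals continued customer loss, which is the kind of foresight the gross margin prediction would build on.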
4. Prescriptive Analytics – What should be done?
Prescriptive analytics goes one step further by addressing what should be done given the predictions. It involves developing actionable plans to realize or avoid the predicted results.
For example, actionable plans, such as those to acquire new customers or improve customer services to retain existing customers, may be developed to prevent the expected loss of high-margin customers.
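One simple way to compare candidate plans is to estimate the net benefit of each. The sketch below assumes we already have a cost and an expected customer impact for every plan; all names and figures are hypothetical, and a real analysis would also weigh risk and timing.

```python
# Hypothetical candidate plans; every figure below is an assumption.
plans = [
    {"name": "Improve customer service", "cost": 40_000,
     "customers_kept_or_won": 12, "margin_per_customer": 6_000},
    {"name": "Acquire new customers", "cost": 90_000,
     "customers_kept_or_won": 18, "margin_per_customer": 6_000},
    {"name": "Do nothing", "cost": 0,
     "customers_kept_or_won": 0, "margin_per_customer": 6_000},
]


def net_benefit(plan):
    """Expected gross margin preserved or gained, minus the plan's cost."""
    margin = plan["customers_kept_or_won"] * plan["margin_per_customer"]
    return margin - plan["cost"]


best_plan = max(plans, key=net_benefit)

for plan in sorted(plans, key=net_benefit, reverse=True):
    print(f"{plan['name']:<26} net benefit: {net_benefit(plan):>8,}")
print(f"Suggested plan: {best_plan['name']}")
```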
Below is a list of reasons why the use of data analytics has skyrocketed in recent years. While this is by no means an exhaustive list, we can still get a glimpse of the full picture:
McKinsey (2016) predicts the following trends led by data analytics in the following decade:
Statista (2020) footnoted the McKinsey prediction and forecasted the following:
Data analytics is not only the prevailing game-changer but also the driver of the future business world. The question remains, given the importance of data analytics, what needs to be done to capture its value? That is, how do we implement data analytics?
At the center of implementing data analytics is a data-driven strategy. A data-driven strategy starts with an analytics team that executes an iterative process to derive value from data.
The analytics team is cross-functional and guides the whole iterative process. Members of the team must adapt to the changing data-driven strategy. The team should generally consist of:
1. Business translators/analysts
Business translators possess both fundamental data-related knowledge and business expertise, acting as the bridge between data analytics and business applications. They determine the outcome of the data-driven strategy because analytics must be converted into business insights and actionable business plans.
2. Data engineers
Data engineers prepare the data for use by the data scientists. They build the fundamental architectures and platforms on which data scientists can work.
3. Data scientists
Data scientists perform data operations and derive insights from the data for decision-making. They possess both fundamental business knowledge and data analytics expertise. Tasks of data scientists usually include developing and using machine learning algorithms to forecast the future.
4. Visualization specialists
Visualization is telling the story of analyzed data. Insights drawn from the data are communicated to the intended users using visualization tools such as charts, maps, or other intuitive and understandable approaches.
Step 1: Define the business question.
The goal of defining the business question is to focus on what is important. Thus, this step involves prioritizing needs and asking business questions with respect to those needs.
Need prioritization starts with need identification. At this step, broader business needs are identified. These needs can be mapped to the balanced scorecard, which defines four perspectives of performance measures: financial, customer, internal business processes, and learning and growth.
Once the broader needs are identified, they must be prioritized. Prioritization should be based on the organization’s strategy and goals, especially when the needs overlap. For example, if investment in learning and growth is fulfilled at the expense of financial performance, the business translator must pick the priority that provides a better strategic fit.
After need prioritization, the business question must be asked. Value is created by asking the right questions rather than as many questions as possible, because too many questions divert the focus of the process. The right questions, when answered, should make a difference to the business. Thus, the business translators analyze the business context and implications of the identified need in order to ask the right questions.
Step 2: Define the measurement metrics.
This step is similar to a root-cause analysis in that drivers of the issues are identified. The business translators convert the question at issue into various measurable metrics (similar to key performance indicators) and decide what to measure. For example, if the question asked is about customer satisfaction, the business translator identifies metrics such as customer retention rate, customer survey scores, and even social media posts. These metrics are then conveyed to the data engineers for data modeling.
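To show what a measurable metric looks like in practice, here is a small sketch of two of the metrics named above. The retention formula used is one common definition (customers at period end, excluding new ones, divided by customers at period start); the function names and sample numbers are our own assumptions.

```python
def retention_rate(customers_at_start, customers_at_end, new_customers):
    """Share of the starting customers who are still customers at period end."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    retained = customers_at_end - new_customers
    return retained / customers_at_start


def average_survey_score(scores):
    """Mean of the customer survey scores collected in the period."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(scores) / len(scores)


# Hypothetical quarter: 200 customers at start, 190 at end, 20 of them new.
rate = retention_rate(customers_at_start=200, customers_at_end=190, new_customers=20)
score = average_survey_score([4, 5, 3, 4, 4])

print(f"Customer retention rate: {rate:.0%}")
print(f"Average survey score:    {score:.1f} / 5")
```

Defining each metric as a function makes it explicit which data the data engineers need to capture in the next step.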
Step 3: Model the data.
Once the measurable metrics are defined, data engineers capture the necessary data and design how the data are stored and related to each other in the databases from which data scientists can retrieve the data for analytics.
This is the step where the ETL model comes in. ETL stands for extraction, transformation, and loading, the process by which data are moved from the data sources to the data warehouse.
Extraction – Data from a sea of sources and formats are mined and extracted. This includes downloading Excel spreadsheets or writing programs to parse social media posts, comments, or videos.
Transformation – The extracted data are converted to the desired formats based on the business needs. For example, after parsing social media comments, the text comments are converted to numerical scales by analyzing words indicating emotions (e.g., satisfactory, sad, or happy).
Loading – All the transformed data are moved to the data warehouse where the data to be analyzed are centrally stored.
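The three ETL stages can be sketched end to end with Python's standard library. In this illustration, a small CSV string stands in for the raw source, a three-word lexicon stands in for real sentiment analysis, and an in-memory SQLite database stands in for the data warehouse. The data, the lexicon, and the table name are all hypothetical.

```python
import csv
import io
import sqlite3

# Extraction source: a CSV export standing in for raw comments (made-up data).
RAW_EXPORT = """customer_id,comment
101,Very happy with the service
102,Sad to say the delivery was late
103,Satisfactory overall
"""

# A tiny, hypothetical word-to-score lexicon used in the transformation.
SENTIMENT_SCORES = {"happy": 1, "satisfactory": 0, "sad": -1}


def extract(raw_text):
    """Parse the CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Convert each text comment into a numerical score."""
    records = []
    for row in rows:
        words = row["comment"].lower().split()
        score = sum(SENTIMENT_SCORES.get(word, 0) for word in words)
        records.append((int(row["customer_id"]), score))
    return records


def load(records, connection):
    """Store the transformed records in a central table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS comment_scores "
        "(customer_id INTEGER, score INTEGER)"
    )
    connection.executemany("INSERT INTO comment_scores VALUES (?, ?)", records)
    connection.commit()


warehouse = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse
load(transform(extract(RAW_EXPORT)), warehouse)

stored = warehouse.execute(
    "SELECT customer_id, score FROM comment_scores ORDER BY customer_id"
).fetchall()
print(stored)
```

Keeping each stage in its own function mirrors the separation described above, so a source or a scoring rule can change without touching the other stages.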
Step 4: Perform analytics.
Analytics can be performed by either data scientists or self-service users.
Data scientists gather data from different data warehouses and further cleanse them based on their needs. They then use statistical, mathematical, or computer models to identify relationships, trends, and patterns in the data. Predictive analytics and machine learning algorithms are also commonly used to draw insights. Results generated by data scientists normally contain largely statistical or mathematical values. These results will be converted into business contexts.
Self-service analytics is becoming a trend due to its speed, autonomy, and adaptability. In contrast to analyses performed by data scientists, self-service analytics lets users of the data (e.g., business analysts) directly access and analyze the data themselves. Self-service analytics results, while containing less statistical or mathematical content, are generally more creative and adaptive to risks.
Step 5: Interpret the analytics results.
With the help of data scientists, business translators interpret the statistical or mathematical results and convert them to business contexts. For example, regression results may suggest correlations between various variables. Such correlations, when put into business contexts, may help explain business phenomena, identify drivers of such phenomena, and make forecasts.
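As a small illustration of the kind of statistical result that needs interpreting, the sketch below computes a Pearson correlation coefficient between two hypothetical metrics. The data are invented, and a strong correlation by itself does not prove that one variable drives the other.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical quarterly figures: survey score vs. customer retention rate.
survey_scores = [6.1, 7.0, 7.4, 8.2, 8.8]
retention_rates = [0.71, 0.76, 0.80, 0.84, 0.90]

r = pearson_correlation(survey_scores, retention_rates)
print(f"Correlation between survey score and retention: {r:.2f}")
```

A business translator might read a value close to 1 as "quarters with more satisfied customers also kept more customers", then decide whether that relationship is worth acting on.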
These translated results, when combined with the results generated from self-service analytics, form the basis for answering the business question and making business decisions.
Step 6: Deliver the results.
Interpreted results, while informative and accurate, may not be understandable to the decision-maker. For example, presenting the monthly sales results for the last 5 years in hundreds of rows in a spreadsheet is informative and accurate but would be time-consuming to interpret, especially when the decision-maker needs the gist of the results quickly.
Data visualization is designed to strike a balance between information and comprehension. Using visualization tools, visualization specialists communicate the results to the audience in an intuitive yet informative manner.
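The trade-off between information and comprehension can be illustrated with a short sketch that condenses 60 monthly figures into five annual totals and prints them as a plain-text bar chart. The sales numbers are generated from a made-up formula, and the text bars are only a stand-in for the charts a visualization tool would produce.

```python
from collections import defaultdict

# Hypothetical monthly sales (in thousands) for five years: 60 data points.
monthly_sales = {
    (year, month): 100 + 5 * (year - 2019) + month
    for year in range(2019, 2024)
    for month in range(1, 13)
}

# Condense the 60 rows into one total per year.
annual_totals = defaultdict(int)
for (year, _month), amount in monthly_sales.items():
    annual_totals[year] += amount

# Plain-text bar chart: one "#" per 50 (thousand) of annual sales.
for year in sorted(annual_totals):
    total = annual_totals[year]
    bar = "#" * (total // 50)
    print(f"{year} {bar} {total:,}")
```

Five bars convey the overall trend at a glance, while the underlying monthly detail remains available for anyone who needs it.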
1. Microsoft Excel
Excel is one of the most popular analytical tools. Its popularity in all industries makes it indispensable for accountants and auditors. Basic analytical tasks such as data computation, sorting, and filtering, as well as information summarization and visualization, can be performed using embedded functions, pivot tables, and pivot charts. More advanced data analyses can be handled by Excel add-ins such as the Analysis ToolPak and Solver. Functionality to help manage workflows can be extended through macros, which use VBA, Excel’s native programming language. For example, routine operations can be coded so tasks are automated for greater accuracy and efficiency.
However, Excel is not suitable for big data analyses, especially for data exceeding its capacity limit of around 1.05 million rows. Data in excess of Excel’s capacity must be analyzed by the business intelligence tools or programming languages listed below. Moreover, because Excel is outcome-based, users need to know the exact desired outcome and format. For example, when plotting graphs, users must envision what the final graph should look like and then try to replicate that vision. Thus, to work around these obstacles, other supplementary or standalone analytical tools are required.
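To illustrate how a programming language sidesteps the row limit, the sketch below streams a CSV file one row at a time instead of loading it all at once, so the same code works whether the file has three rows or thirty million. The column names are hypothetical, and a small in-memory sample stands in for a large export.

```python
import csv
import io

EXCEL_MAX_ROWS = 1_048_576  # worksheet row limit in current Excel versions


def total_sales(csv_file):
    """Stream a CSV row by row and return (row count, sum of 'amount')."""
    reader = csv.DictReader(csv_file)
    row_count = 0
    total = 0.0
    for row in reader:
        row_count += 1
        total += float(row["amount"])
    return row_count, total


# Small in-memory stand-in for a large export; column names are assumptions.
sample = io.StringIO("order_id,amount\n1,19.50\n2,5.25\n3,100.00\n")
rows, total = total_sales(sample)

print(f"{rows} rows, total sales {total:,.2f}")
# The header line also occupies a worksheet row, hence the strict comparison.
print("Fits in one Excel sheet:", rows < EXCEL_MAX_ROWS)
```

With a real file, `open("sales.csv", newline="")` (a hypothetical path) would replace the in-memory sample.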
2. Microsoft Power BI
Power BI used to be an add-in of Excel but was later developed into a standalone business intelligence tool. Power BI uses powerful compression algorithms that support the analyses of much larger amounts of data than Excel. Further, Power BI performs analytical tasks such as data cleaning, data merging, and data visualization. Other distinctive features of Power BI that set it apart from Excel include the ability to present multiple results using dashboards and a complementary cloud service for sharing data. However, similar to Excel, users need to know the outcomes and their formats before creating graphs.
3. Tableau
Tableau is a data visualization tool specializing in the presentation and sharing of results. It has gained popularity due to its simple user interface. It utilizes only one set of row-column combinations for data analysis. Users can use “trial-and-error” (dragging and dropping the data fields to the row and column) and immediately visualize the graphs without any prior knowledge of the outcomes or their formats. In the latest versions, Tableau has also added functions to clean and shape data, such as data joins, data conversion, and data replacement.
4. Alteryx
Alteryx is characterized by self-service analytics. It mimics the workflow of real-life operations by dividing workflows into a sequence of tasks. Users can design the workflow by dragging and dropping the required information and tasks into Alteryx. For example, to combine different data files, users can first drag in the data files and then drag in the task (i.e., data blending) and connect it to the data files. Lastly, a data file is connected to the task to store the outputs. By specifying this extraction, transformation, and loading (ETL) process as a workflow, Alteryx performs the same process every time the workflow is run.
5. Python
The aforementioned analytical tools do not require knowledge of coding and programming languages. However, their functions are limited to embedded features. Knowledge of programming languages such as Python, on the other hand, offers the flexibility to code and perform any task. Among the many programming languages, Python is a relatively easy-to-learn yet powerful analytical tool. With over 200,000 packages available to the public, Python is able to perform comprehensive analyses spanning data analysis, visualization, and even machine learning.
6. R
While Python serves general data science purposes, R is more specialized in statistical and numerical analysis and visualization. Similar to Python, R has a large collection of community-maintained packages that can be used for data analysis, visualization, and machine learning. However, because its syntax (language structure) differs from that of other languages, R is a more difficult tool for beginners.
Data analytics works best with concept introduction and practice. In the following weeks, we will walk through the basics of data analytics using Excel and Python. Each series will feature more detailed information for you to have a deeper understanding of this topic.
If you’re studying for a professional certification exam featuring data analytics, it’s important to know how the topics will be tested. For a preview of the types of questions you will find on professional certification exams, be sure to check out our Free Exam Questions for CPA, CMA, and CIA. Gleim is the leader in accounting exam review, and we’re here to help you pass on your first attempt!
We’ll continue our weekly blog series. Check back regularly for all exam news and study tips!