Saturday, June 27, 2020
Get Equipped: Introduction to Financial Management for MIT
GETTING TO KNOW YOUR SUBJECT
This section provides links for you to follow so you can understand our subject better and be better equipped for future discussions.
Introduction to Financial Management
Introduction to Financial Terms and Concepts
https://www.youtube.com/watch?v=wvXDB9dMdEo
Basic tools
https://www.youtube.com/watch?v=0FV8NXtcVBY
10 Axioms
https://www.youtube.com/watch?v=4VZCzk7zIAE
Accounting is the Language of Finance
Understanding Accounting
https://www.youtube.com/watch?v=YjkRSlTxsZM
The Accounting Process
(Tagalog) https://www.youtube.com/watch?v=TmufsiUkjfQ&t=941s
(English) https://www.youtube.com/watch?v=L5izAWXPB4c
Cost concepts
https://www.youtube.com/watch?v=--C8nUpRgA8&list=PLxP0KZzCGFYOeZIwY-2IoEoAin-uDgnHl&index=1
https://www.youtube.com/watch?v=fs-67qg_OFk&list=PLxP0KZzCGFYOeZIwY-2IoEoAin-uDgnHl&index=2
https://www.youtube.com/watch?v=ovoxkpdiKZY&list=PLxP0KZzCGFYOeZIwY-2IoEoAin-uDgnHl&index=3
https://www.youtube.com/watch?v=M8AbnX0gQHY&list=PLxP0KZzCGFYOeZIwY-2IoEoAin-uDgnHl&index=4
https://www.youtube.com/watch?v=6jWNuHtlIi4&list=PLxP0KZzCGFYOeZIwY-2IoEoAin-uDgnHl&index=5
https://www.youtube.com/watch?v=d0hu9aEoi78&list=PLxP0KZzCGFYOeZIwY-2IoEoAin-uDgnHl&index=6
https://www.youtube.com/watch?v=U7H47l2u8Ak&list=PLxP0KZzCGFYOeZIwY-2IoEoAin-uDgnHl&index=7
Financial Management MIT Course Outline
I. Course Description:
This course prepares the students to analyze financial statements and examine financial decision-making. Topics include understanding financial statements, funding decisions, the use of financial tools, cost analyses, and risk management tools.
II. Bases of Evaluation:
1. Course requirements that are evaluated based on the rubrics.
2. Research Output
3. Midterm and final major examinations.
4. The standard grading system of the University.
III. Grading System:
Midterm Score = 33% Class Standing + 33% Research + 34% Midterm Exams
Tentative Final Score = 33% Class Standing + 33% Research + 34% Final Exams
Final Score = 33% Midterm Score + 67% Tentative Final Score
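To make the weighting concrete, here is a minimal sketch in Python of how the final grade would be computed under these formulas. The scores used are hypothetical and are not taken from the syllabus.

# Hypothetical component scores, used only to illustrate the weighting above.
class_standing_mid, research_mid, midterm_exam = 88, 90, 85
class_standing_fin, research_fin, final_exam = 91, 92, 87

midterm_score = 0.33 * class_standing_mid + 0.33 * research_mid + 0.34 * midterm_exam
tentative_final = 0.33 * class_standing_fin + 0.33 * research_fin + 0.34 * final_exam
final_score = 0.33 * midterm_score + 0.67 * tentative_final

print(round(midterm_score, 2), round(tentative_final, 2), round(final_score, 2))
# approximately 87.64, 89.97, and 89.2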
Note: Class Standing is composed of activities such as group dynamics, laboratory activities, seatwork, quizzes, assignments, and participation.
IV. Course Content:
TOPICS
Unit 1. Introduction to Financial Management and the Accounting Language
1. Basic concepts of Financial Management and Accounting
2. The Financial Management Framework (10 Axioms of FM)
3. The Fundamentals of the Accounting Process
4. The Accounting Cycle
5. Understanding the Financial Statements
6. Cost concepts and behaviors for financial reporting and cost analysis used in decision making
Unit 2. Financial Information in Action
1. Analyzing Financial Statements
a. Background analysis
b. Areas in Financial Analysis
b.1. Liquidity
b.2. Profitability
b.3. Asset Management
b.4. Solvency
b.5. Marketability/value analysis
c. Methods in Financial Analysis
c.1. Horizontal
c.2. Vertical
c.3. Ratio Analysis
c.4. Du Pont Expansions
2. Financial Planning and Pro Forma Financial Statements
a. Operating Budgets
b. Financial budgets
c. Master budgets
d. Projected Financial Statements
Unit 3. Analyzing Financial Information for Short Term Decision Making
Unit 4. Analyzing Financial Information for Long Term Decision Making
1. Financial Risk and Return Analysis
2. The Role of Time Value of Money and Cost of Capital
3. Investment decisions
4. Capital budgeting
5. Financing decisions
6. Debt and Equity Financing Analysis
V. References:
Books
Titman, S., Keown, A. J., & Martin, J. D. (2011). Financial management: Principles and applications (11th ed.). Boston: Prentice Hall.
Jordan, B. D. (2009). Fundamentals of investments: Valuation and management (5th ed.). Boston: McGraw-Hill.
Gitman, L. J. (2014). Fundamentals of investing (12th ed.). Boston: Pearson.
Rosenbaum, J. (2013). Investment banking: Valuation, leveraged buyouts, and mergers & acquisitions (2nd ed., University ed.). Hoboken, NJ: John Wiley & Sons.
Gitman, L. J. (2011). Fundamentals of investing (11th ed.). Singapore: Pearson.
Websites
http://www.slideshare.net/abhishek9066389/financial-management-16056415
http://www.businessdictionary.com/definition/financial-management.html
http://www.managementstudyguide.com/financial-management.htm
Journals
FinEx Journals
Accounting and Finance Journals
EBSCOHost Journals
Monday, June 22, 2020
Investment Management and Portfolio Analysis
Downloadables
Investment Analysis and Portfolio Management easy handouts
https://drive.google.com/file/d/1KKPiX9LOFCiFv6cGR8duoDMwYh92FUmh/view?usp=sharing
Investment Analysis and Portfolio Management Book
https://drive.google.com/file/d/1g7DE1Bu1C11cqf_gGVFAprRQ9W1xUKr9/view?usp=sharing
Investing for Dummies
https://drive.google.com/file/d/1dpN_69ytYxk4l0JWpDnbkYRJdWWZwAOt/view?usp=sharing
Powerpoint from Gitman
https://drive.google.com/file/d/136NSKNjdvJRh7CXovwqyQDGF9vdTxkmm/view?usp=sharing
INVESTMENT ANALYSIS AND PORTFOLIO MANAGEMENT
GETTING TO KNOW YOUR SUBJECT
Tentative Course Content
This subject is a continuation of Advanced Financial Management. Parts One and Two review basic concepts from Advanced Financial Management. Please take time to check the links below for the discussion.
Part One: Preparing to Invest
1. The Investment Environment
2. Securities Markets and Transactions
3. Investment Information and Securities Transactions
Part Two: Important Conceptual Tools
4. Return and Risk and Time Value of Money
5. Modern Portfolio Concepts
Part Three: Investing in Common Stocks
6. Common Stocks
7. Analyzing Common Stocks
8. Stock Valuation
9. Market Efficiency, Behavioral Finance and Technical Analysis
Part Four: Investing in Fixed-Income Securities
10. Fixed Income Securities
11. Bond Valuation
12. Investing in Preferred Stocks
Part Five: Portfolio Management
13. Mutual Funds: Professionally Managed Portfolio
14. Managing Your Own Portfolio
Part Six: Derivative Securities
15. Options: Puts and Calls
16. Commodities and Financial Futures
Special Section: Web Chapters (www.pearsonhighered.com/gitman)
Time Value of Money
Tax-Advantaged Investments
Real Estate and Other Tangible Investments
GETTING TO KNOW YOUR SUBJECT
The Investment Environment
Securities Market transactions
Investment Information
Introduction to Investment Analysis and Portfolio Management
Return and Risk
Time Value of Money
Friday, June 19, 2020
Statistics with Computer Application Downloadables
https://drive.google.com/file/d/196bqUF8sdfbp7KlThuToT9-1imkS5OJy/view?usp=sharing
a. Statistics Essentials for Dummies (PDF File)
b. SPSS for Dummies (PDF File)
https://drive.google.com/file/d/1kBmQaqYHDonO4mLFWmqdDZJUaLcqXAox/view?usp=sha
c. Excel All in One for Dummies (PDF File)
d. SPSS 17 Installer
e. Sample Data (compressed file with sample data you can use in your SPSS practice)
Enjoy!
Financial Analytics
Course overview and midterm module
(Source: IMA Data Analytics & Visualization Fundamentals Certificate™ – Course Script)
Download this content at
Download original document at
Podcast video
lesson 1 section 1
Dan Smith
Course Overview
This course will introduce the impact of technology and
analytics on the management accounting profession and delve into understanding
and applying data analytics and visualization through a case-based scenario.
The four modules are:
Module 1 - Becoming Data-Driven will set the stage by
introducing some of the changes that are happening in finance and accounting,
how business processes will be impacted, and what skills you will need to
succeed. You’ll also be introduced to concepts of data science, data analytics,
and data governance as an overall foundation of the course.
Module 2 - Visualizing the Present & Predicting the
Future will discuss how to harness the power of data through analytics and
effectively communicate the data with visualizations. All of this will be
depicted using a fictitious case scenario that describes a dilemma in which
data analytics and effective data visualization is necessary to resolve it.
Throughout the module, as the case progresses, a video series will elaborate on
key visualization concepts and techniques.
In Module 3 - Applying Data Analytics and Visualization,
you’ll learn how to apply some data analytics and visualization concepts
covered in prior modules to solve issues presented in the case-study scenario.
After that, you’ll be in the driver’s seat as you take your newly acquired
knowledge on a test drive.
In Module 4 - Conclusion and Final Assessment, we’ll wrap up
the course with a summary and provide you with some valuable resources that
will enable you to expand your learnings and dig deeper into these topics. You
will then be presented with a final assessment that will test your knowledge of
all topics discussed throughout the course.
Data analytics can be performed using a multitude of software
available in the market. In this course, you’ll be introduced to many of these
software tools. However, in order to perform the requisite analytics, you are
required to have access to Microsoft Excel. It will be necessary for you to
have working knowledge of basic functions, including pivot tables, in Microsoft
Excel.
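Since the course assumes a working knowledge of pivot tables, here is a small, hedged illustration of what a basic pivot-table summary does, sketched in Python with pandas rather than in Excel itself and using made-up sample data.

import pandas as pd

# Made-up sales records; in Excel these would simply be rows on a worksheet.
sales = pd.DataFrame({
    "region":  ["North", "North", "South", "South", "South"],
    "quarter": ["Q1", "Q2", "Q1", "Q1", "Q2"],
    "amount":  [1200, 1500, 900, 1100, 1300],
})

# Equivalent of an Excel pivot table: regions as rows, quarters as columns,
# summing the amount in each cell.
pivot = sales.pivot_table(index="region", columns="quarter",
                          values="amount", aggfunc="sum")
print(pivot)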
Course Learning Objectives
At the conclusion of the course, you should be able to:
Recognize the impact of technology and analytics on the
accounting profession.
Demonstrate how data analytics can influence organizational strategy.
Identify ways data visualizations effectively enable
appropriate business decisions
These are the overall learning objectives for this course.
Each module will begin with a set of unique goals that align with these overall
course learning objectives.
Course Roadmap
As we previously outlined, this course consists of four
modules. The beginning of each module starts with the module goals. These goals
outline the topics that will be discussed in that module.
Each module is divided into lessons. You will see a menu screen
as you begin each lesson. The menu screen will list the sections that are
included in that lesson.
We highly recommend that you go through this course in
sequential order for an optimal learning experience.
Module 1: Becoming Data-Driven
Introduction
Welcome to Module 1, Becoming Data-Driven!
This module will introduce changes to the accounting and finance profession that are on the rise because of advancements in technology, and discuss the potential lasting impact of these emerging technologies.
Module 1 Goals
Upon completing Module 1, learners will be able to:
Recognize the impact of technological advancements on the
finance and accounting profession.
Describe the typical life cycle of data.
Explain the impact data governance has on business and its
stakeholders.
Define types of data analytics and how each progressively
creates value within an organization.
Module Menu
Module 1 consists of two lessons, Lesson 1: Changes to the
Profession and Lesson 2: Making Sense of Data.
Lesson 1: Changes to the Profession
Section 1: Technology is Changing
As technology transforms the management accounting profession
at an ever-increasing rate, are you concerned about having the necessary skills
to further your career? Do you want to enhance your skill set but don’t know
where to start? Through this lesson, we will define the essential knowledge and
skills needed to adapt to the modern workplace and set the stage for how you
will be able to develop these skills.
Video: The Future of the Profession
Now, let’s hear from IMA’s President and CEO, Jeff Thomson as
he discusses his thoughts on the future of the finance and accounting
profession.
Video Transcript (Est. time: 13:28 min):
Hello, I’m Jeff Thomson, president and CEO of IMA, the Institute
of Management Accountants. And I’m proud and honored to be speaking with you
about the future of the accountant in business, especially in a digital age.
When you think about the challenge of digitalization of the value chain, it
really is a challenge, but it’s also an opportunity. So think robotics process
automation, think automation
in general. Think about machine learning, cognitive computing. The challenge, of course, is how to harness all of these technologies to add value. But a
challenge also relates specifically to our profession. Will this automation
result in fewer jobs, displaced jobs? I would argue that it will result in
different types of jobs. Jobs that will add more value of foresight and insight
to the organization, but only if we commit to upskilling in areas such as data
analytics and data visualization.
So let’s talk briefly about the CFO team today and how it’s
evolved over time. And as many of you know, I was a CFO in industry during this
evolution of the role expansion of the CFO and the CFO team. So in the past,
the CFO team had a very, very critical fiduciary responsibility and that was
safeguarding of assets, what we would call value stewardship, protecting value,
internal controls, accurately and fairly reporting on the financial condition
of the corporation. But over time, that
role has expanded to include those table stakes plus
additional responsibilities in the area of value creation, merger and
acquisition activity, financial planning and analysis, and much more. So the
evolution has resulted in greater expectations for the CFO and his or her team.
In fact, the CFO is often referred to as the chief futures officer or the chief
value officer. In many cases, the CFOs of large and small companies have responsibility for operations, for strategy, for IT, and more. So with this expectation, with this challenge, comes the opportunity to add greater influence and greater relevance as individuals and as professionals in accounting.
So technology and analytics is sweeping us, and moving at
such an incredible pace that you almost can’t keep up with all of the
technology, but really we must all embrace technology. Millennials are digital
natives. It comes very, very naturally. But even in the middle of our careers,
as we see some jobs being lost or displaced, it’s important that we learn new
skills and new capabilities, especially in data analytics, data visualization,
data governance, and even storytelling. When we’re trying to influence the
organization on a business case or a new merger or a new product or service, we
have to tell a story. We have to tell a story beyond the numbers that’s
futuristic, inspiring, and
leads to great outcomes.
But the other aspect of learning and embracing technology is
not just the Millennials and those mid-career in terms of upskilling and
staying current, and harnessing technology, but it’s also tone at the top. It’s
critically important that those at the top embrace analytics as the new science
of winning in the market, of being a competitive differentiator, to know your
consumers even better than they know themselves in terms of needs and wants and
desires of future purchases. So it’s very important that senior leaders embrace
technology, not necessarily to become experts in programming and coding, but to
understand the value that analytics and technology can bring to
a differentiated value proposition, to support training, courseware, education, and more. So many would say that it's crunch time for finance and accounting. You know, finance talent models are evolving quickly with a premium placed on data scientists, business analysts, and storytellers. Everyone's talking about it: the big four, Downmarket, McKinsey, Accenture, etcetera, etcetera.
And so this is the trend we're seeing: to embrace technology, but to also step up and upskill in the area of analytics so that we can add more foresight and insight. And the softer skills of us as humans (empathy, active listening, professional judgment, professional skepticism) really are not replaceable by a robot.
So some key messages, automation’s been around for a long
time. What’s different now? Look, we’ve been through multiple industrial
revolutions. So why is automation the talk of the town, so to speak, whether you're a consumer or a professional? Well, simple: the reason it's different this time is because we've programmed robots to be much smarter, to think, to synthesize, to react, to adapt. However, keep in mind that it's human beings
doing the programming and harnessing the computing and cognitive power of the
machine and not vice versa. Second key message, look, we have a choice here.
We’re at a turning point in our profession. We could choose to be more
relevant and influential or less relevant and influential.
Think about it this way. As automation over time replaces or displaces more
routine, repetitive tasks, it is a reality that some of our jobs, some of our
tasks will go away, especially in auditing and transaction processing. That’s
at one end of
the spectrum. At the other end of the spectrum, if we say,
you know what? This advanced analytics and providing insight and foresight is
not what I was trained to do as an accountant. It’s really not for me. I’m
going to outsource the analytics and the deeper thinking, the deeper judgmental
capabilities to a consultant, to a machine, to whatever. Well, my question to you is: what's left? What's left of our great profession? So this is a call to
action for us to upskill in strategy management, data science, data analytics,
and more. It’s a business imperative critical to the
future of our profession. So nothing less than our relevance
and influence is at stake.
And when you think about the rate of change of technology, I
mean just two years ago, who knew what blockchain is or was? Today we're talking about more and more use cases of blockchain in financial services and insurance. And what's more, five years ago we weren't even aware of what robotic process automation is all about. And now there are hundreds of use cases in
organizations large and
small to create greater operational efficiencies. And then
the final point, closer to home as your CEO, CEO of IMA, is we are absolutely
committed to being leaders in preparing you for a great future. By the way,
that future is now.
So I’ve referred to data science several times, and it could
be a little bit intimidating. Change is hard. Upskilling is necessary, but data
science is really the overlap, the intersection of three broad domains that I’d
like to describe to you in turn. The first is business context. You know,
whether you are an FP&A professional (financial planning and analysis), a statistician, or an econometrician, regardless of your level of analytical horsepower, you have to understand the business question being asked. What is
the problem we’re trying to solve? What new discoveries are we trying to seek?
As we mine through data and understand data, what are the opportunities to
learn something else to put us at a great competitive position? So business
context, understanding how customers behave, understand how value and cash
flows in the organization within your industry vertical is very, very
important. To ask the right questions and to seek answers in the data as a data
explorer, business context, critically important. The second element of data
science is the technical modeling, some level of statistics, some level of
knowledge of forecasting tools and techniques. And this can be a little bit
intimidating and overwhelming, but it also is very manageable. If you focus on
the problem at hand and the opportunities to be seized, I’m not suggesting that
every finance and accounting professional be a Ph.D. statistician. But I am
suggesting that the more data science, the more statistics you know and can
apply, the stronger you will be in this brave new world of digitalization.
There was actually a book called Exploratory Data Analysis, written by a famous statistician. And that very much is what statistics is all about. It's not just reporting averages and standard deviations; it's to kind of seek new possibilities by mining through large sets of data and doing forward-looking analysis. In fact, according to Glassdoor, data scientist is the No. 1-ranked job in terms of satisfaction and entry-level pay. So we need to infuse more
and more of that at a reasonable pace, at a reasonable level into our
profession. And then the third component of data science is the actual
applications. It could be Excel, using advanced analytics and data mining in Excel. It could be open-source software like R or Python, or visualization tools like Tableau and Power BI. So learning these tools and application capabilities is also
critically important.
I made reference earlier to a book that, while it's over 12 years old, is timeless: Competing on Analytics: The New Science of Winning. It's one of my favorite books because the title speaks volumes for what we must be doing in a competitive age. When you think about winning in the market, you think about apps and things that face consumers, not things that are kind of in the background. But make no mistake about it: in an incredibly treacherous competitive environment, with geopolitics at play and nontraditional competition, analytics is and can be the new science of winning. But you know,
analytics should not be equated with information technology. It is the human
and organizational aspects of analytical competition that truly differentiates.
You know, there is a language barrier though when we think about analytics,
data scientists, IT professionals, finance and accounting. We’ve got to get over
it. We’ve got to solve it. We’ve got to become data translators. Finance and
accounting professionals need to learn and understand more about IT. IT and
information systems professionals need to learn and understand more about the
language of finance and accounting. Because guess what? At the end of the
supply chain, consumers and shareholders are expecting that we’re doing that
translation. So there’s plenty of opportunities to upskill in these areas and
to speak the language of the machine and speak the language of business and who
better than management accountants, CMAs, CFO teams to do that translation to
create great, great outcomes. How do we prepare for the future of work? In summary, look, there are no simple or prescriptive solutions. The environment is moving too quickly and too rapidly. Technology is evolving very, very quickly and rapidly, and we must not only keep pace, we must stay ahead. So as we embrace this new future, which is here today, we need a combination of technological know-how, problem solving, and critical thinking, as well as the human skills, the soft skills such as perseverance, collaboration, and empathy. We are absolutely committed to preparing you today and in the near and distant future, whether it's five years out or 10 years out; that's our obligation to you. Thank you.
Innovation and Change
These emerging technologies continue to drive innovation and
change throughout the business landscape. While these specific trends can seem
abstract, accounting and finance professionals do need to have an understanding
of what these technologies are.
Arguably more important than any specific technical
knowledge, however, is the ability to leverage these innovations to become
strategic partners.
Section 3: The Changing Role of the Management Accountant
As management accountants adapt to new technologies, they are
well-suited to increasingly assume the role of facilitator, using the breadth
and depth of their knowledge to find opportunities for improvement and craft a
shared vision for product stakeholders.
Management Accountants' Skills
Management accountants already possess most of the skills
needed to implement advanced analytics: They have a holistic view of business,
they intuitively understand the interrelation between financial and strategic
business decisions, and they have strong written and verbal communication
skills. When management accountants can combine these skills with technological
knowledge and increased aptitude, they have the power to add further value to
their organizations.
In this course, we'll explore how to build on these
skills by understanding some of the current technology trends. These trends
lead to the creation of data that can be analyzed for value-added, strategic
decision-making.
Section 2: Keeping Up With the Trends
As a management accountant in the digital age, it’s important
to have a fundamental working knowledge of technological trends. Those include
robotic process automation, artificial intelligence, and blockchain.
Important Technology Trends Defined
Let’s take a moment to define each of these.
Robotic Process Automation (RPA) is an advanced and
intelligent form of process automation leveraging software tools (or software
robots) to effectively and efficiently perform tasks traditionally performed by
humans.
Artificial Intelligence (AI) is a field of computer science
concerned with empowering machines to think, behave, and act like human beings
to draw conclusions and make judgements. Examples include speech recognition,
machine learning, and natural language processing.
Blockchain is a decentralized and distributed ledger of
encrypted information. Uses for blockchain include cryptocurrencies, smart
contracts, and transparent proof of work.
Management accountants can leverage emerging technologies
such as AI and RPA to streamline processes and decrease the amount of time
spent on repetitive tasks. They can instead focus their time on strategy and
decision making.
Leveraging Technology For Leadership
Once accounting and finance professionals are armed with the
knowledge of which technology trends are currently generating attention and
attracting investment, there are several approaches and tactics they should
embrace as they leverage technology to elevate themselves as strategic thinkers
and leaders. Click each icon to learn more.
Communicate Implications
Be able to communicate the implications of technology for the
organization and its strategic priorities. Not every new technology platform or
tool is going to be equally helpful for every organization. Regardless of
whether it’s RPA, AI, or some other emerging technology, every tool has its own
set of pros and cons. Taking an objective perspective on technology and
articulating the pros and cons of these tools are critical steps for effective
leaders to take in an increasingly digital world.
Link Stakeholders
Link different stakeholders across the company into the
conversation. One of the most common pitfalls that can trip up even the most
sophisticated and well-prepared organizations is when important dialogue occurs
in silos. Discussing technology tools and options with only the accounting and
finance team isn’t going to drive change and might very well lead to project
failure. True leaders and strategic management teams understand that involving
as many participants from different departments as possible creates a more
diverse team, better solutions, and more robust use cases.
Identify and Present Easy Wins
Identify and present easy wins. Having a conceptual
understanding of, and a vision for, how technologies will eventually transform
a profession, subset, or field is a great start but doesn’t usually generate
actionable ideas for professionals to implement. After communicating the
potential of a technology or process and linking in a variety of stakeholder
groups, presenting
goals that are achievable in the short term and documenting
first steps toward achieving them will get the ball rolling within the
organization, increase the likelihood of stakeholder buy-in, and help to create
enthusiasm for future projects that are likely to be more complicated.
Section 4: The Big Picture of Data
The best way to begin is to learn the terminology surrounding
data. The first term is “architecture,” and just like when designing a
building, the architecture is the overall structure of whatever it is we’re
trying to create.
The next term is “solutions” because the only 100% consistent
aspect of any analytics project is that we are trying to solve some problem or
answer a question. “Solution architecture” may take many forms from a simple
report performing calculations once a week to an automated artificial
intelligence (AI) ultra-high-frequency trading application.
“Patterns” are templates or guidelines that solution
architects will reuse frequently.
Architect a Solution
In our new data-focused terminology, we hope to architect a
solution using an established pattern. There are different types of solutions
that may use different patterns and/or expertise. At the highest level (a.k.a.
“least detailed” or “10,000-foot view”), we typically group solutions into
domains.
Just as a building architecture may require domain solutions
in landscaping, interior design, plumbing, and more, an analytics solution
architecture may include domains such as data storage, data creation, and
different business specializations. To architect effective analytics solutions,
we require knowledge and skills in multiple domains of business data.
High-Level Domain Design Pattern
Next, we use the following high-level domain design pattern:
data creation, data storage, and business domain(s). The business domain may include a single business domain, such as marketing, or many, such as marketing, supply chain, and finance.
Data Governance, a set of processes and policies that help an
organization manage and secure its data, which we will discuss in later
lessons, is an integral part of every domain.
Section 6: Data-Driven Culture
There are four essential elements in establishing a
data-driven organization. These include data-savvy people, quality data,
appropriate tools, and processes and incentives that support analytical
decision making. Organizations attempting to adopt leading-edge analytics often
face challenges in each of these dimensions. The result is an inability to
effectively support managerial decision making through the use of analytical
technologies.
Much of the focus on implementation of advanced analytics has
been on the tangible elements of a successful data-driven organization (people,
data, and tools). Less attention has been paid to the fourth factor --
organizational intent. An organization committed to the goal of being data-driven
will work to develop the people, data, and tools needed to accomplish that
objective.
Resolve to Be Data-Driven
Becoming a data-driven organization requires creating
structures, processes, and incentives to support analytical decision making. It
requires the organization to resolve to be data-driven and define what it hopes
to accomplish through the use of Big Data and analytics. The top leadership of
the organization needs to describe how analytics will shape the business’s
performance.
Six Factors For a Data-Driven Culture
While many organizations are striving to implement a
data-driven culture, success isn’t assured. Achieving this goal requires that
certain elements be present. There are six key factors for successfully
establishing a data-driven organizational culture. Click each icon to learn
more.
Having the right tone at the top. Setting
the right tone at the top is critical for most
organizational initiatives, and this includes developing a
data-driven culture. In most organizations, executives are championing the use
of leading-edge analytics, although in some companies the initiative is being
led from the bottom up, with various departments being first to embrace it.
In addition to championing the use of leading-edge analytics,
executives in a data-driven
organization must consider
ethics in the context of technology and analytics. While ethics is not covered
in this course, it is important for the executive team to incorporate ethics
policies for managing technologies.
Having strategies for the effective use of technology. The
ability to use leading-edge analytic techniques effectively is important for a
variety of reasons. Companies whose decision making is reactive to the
competition are less likely to have developed strategies for the effective use
of techniques and technologies. Being reactive instead of proactive implies
that these organizations lack the ability to predict trends or to turn customer
data into useful insights that can be used to enhance the organization’s
business.
Having a commitment to collecting and using data from both
internal and external sources to support analytics efforts. To
harness the potential of leading-edge analytics, organizations need to utilize
a wide variety of data sources. This is especially true when it comes to
strategy development and execution. In this regard, about half of organizations
use data from both internal and external sources. Of concern is that the other
half of organizations are only using
internal data, only using data to validate strategy
post-execution, or (in a few cases) not using data at all! Using a wide variety
of data sources yields better insights. Organizations that truly want to derive
value from their data must be comfortable with complexity and remain flexible
enough to respond to what the data tells them.
Using both monetary and nonmonetary rewards to promote
analytical decision making. Slightly more than half of
organizations use incentives to promote analytical decision making. These can
be monetary, nonmonetary, or both. Yet nearly half of organizations aren’t
doing so. This may be a mistake: The use of incentives is key to conveying the
importance of developing
enhanced analytics capabilities throughout an organization.
Those that do believe in the importance of developing such capabilities are
more likely to create the appropriate culture by providing incentives to their
employees.
Having a willingness to adequately provide resources to the analytics
efforts. Organizations often are facing resource challenges concerning
the development of enhanced analytics capabilities. By far, the most frequently
cited challenge is the ability to find staff with the necessary skill set. The
next most common resource challenge is budget. A third challenge, related to
the previous two, is a lack of staffing resources and competing priorities.
Clearly, these four essential elements needed for companies to develop advanced
analytical capabilities are interrelated: data-savvy people, quality data,
state-of-the-art tools, and organizational intent.
Alignment of analytics efforts throughout the organization. Responsibility
for analytics can reside in various parts of an organization. It has been
argued that CFOs should “own” analytics as they are regarded as impartial
“guardians of the truth.” Most companies seem to agree, with finance being an
owner (although often not the sole owner) of analytics. Other popular options
include analytics being owned by IT, a dedicated analytics group, or
operations, or having each department independently maintaining its own
analytics capabilities. Of course, these options aren’t mutually exclusive,
with a variety of possible combinations, the most popular being analytics
jointly owned by finance and IT.
The Benefits are Clear
The benefits of implementing a data-driven culture are clear.
Organizations possessing such cultures more effectively perform key business
processes such as strategy formulation and performance evaluation. In
implementing such a culture, establishing processes and incentives that support
analytical decision making (i.e., organizational intent) is critical.
When deciding to venture along the path of implementing
leading-edge analytics, evaluate the extent to which the six factors discussed
are present in your organization. By ensuring that they are, you can improve
the chances of successful implementation and achieving the competitive benefits
that come with being data driven.
Section 5: Business Domain Pattern
What does the interrelation of data creation, storage, and
business domains look like? Consider a traditional business selling widgets
directly to customers.
Data is created by customer activity such as purchasing
widgets or browsing websites. Data integration moves data from the generating application into data storage. Raw data transactions are cleaned and shaped with other data into an aggregate data store. Aggregate data is used for
descriptive, diagnostic, predictive, and prescriptive analytics by various
business domains. The business domains use the analysis to make business
decisions on how to sell more widgets. Business activity resulting from
business decisions influences customer activity.
The entire point of this cycle is for the business to better
influence the customer. This could be through product design, marketing,
support, warranty, shipping, inventory, etc. The business exists to make
decisions that result in value to its customers translated into profit.
Therefore, the goal of an analytics solution architect is to accelerate the
cycle of customer activity to business activity.
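As a rough illustration of the "raw transactions to aggregate data store" step in this cycle, here is a minimal sketch in Python with pandas. The column names and figures are entirely hypothetical; the course script does not prescribe any particular tool for this step.

import pandas as pd

# Hypothetical raw transaction data created by customer activity.
raw = pd.DataFrame({
    "customer_id": [101, 101, 102, 103, 103, 103],
    "widget":      ["A", "B", "A", "A", "C", "C"],
    "amount":      [25.0, 40.0, 25.0, 25.0, 60.0, 60.0],
})

# Clean and shape the raw records into an aggregate store that the
# business domains (marketing, finance, and so on) can analyze.
aggregate = (raw
             .groupby("customer_id")
             .agg(orders=("widget", "count"),
                  total_spend=("amount", "sum")))
print(aggregate)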
Data Science
This is data science: the intersection of data, analytics,
and business decision making. The current state of data science is to
accelerate the data-to-decision process.
The Management Accountants’ understanding of data science
better facilitates the data-to-decision process for their organizations. In
this course, we will focus on the Statistics & Analytics component of data
science.
Enhancing Analytical Capabilities Is Critical to Success
The Digital Age is upon us, and it brings with it challenges
and opportunities for businesses. Management accountants have the opportunity,
and need, to develop their data and technological skills so they can use
advanced analytics and glean new insights from their data for their
organizations.
The collection, assessment, interpretation, and use of data
are enabling companies to create new business models and make existing ones
more efficient. Most organizations now believe that enhancing their digital and
analytical capabilities is critical to their continued success and survival and
are leaning on management accountants to take the lead in facilitating this
strategic initiative.
As management accountants develop their skills and complete
their repertoire of strategic business competencies, they can share their holistic
view of business and the interrelation between financial and strategic business
decisions to communicate the importance of creating and establishing a
data-driven organization.
Lesson 2: Making Sense of Data
Section 1: Introduction to Data
Big Data is top of mind for many finance leaders. The
question is: How can we leverage all this data to drive business success? How
can we turn data into meaningful action?
The progression includes identifying key information from the
data, turning that information into knowledge, creating valuable insight from
that knowledge, and taking action.
In this lesson, we will further define the progression and
life cycle of data and introduce how and where data analytics fits in.
Phases of the Data Life Cycle
Data is an essential part of any enterprise. An enterprise
that’s agile and innovative requires an understanding of data as it flows
through the organization, interacts within various departments, and transforms
itself. Though an international standard for the data life cycle doesn’t exist,
the following phases, in order, are identified as typical during data life
cycle management. Click each image in the order shown to learn more.
Capture. Business is surrounded by data, but an
enterprise needs to capture it in order to make use of it. Data capture occurs
in three major distinct ways:
· Data Entry. Manual or automated entry of data into the data
warehouse to create new data values.
· Data Acquisition. Acquiring or transferring data from an
already existing data source or data warehouse.
· Connected Devices. Internet of Things (or IoT), or the
interconnection of computing devices enabling them to send and receive data,
has and will continue to transform the way data is captured by making it real time
and continuous as devices listen to and interact with the environment and each
other. These devices capture and transmit the data so it can be stored.
Qualify. Have you ever wondered why
month-/year-end close processes are prone to errors or why reconciliations take
such a long time? Inaccurate or incomplete data may lead to major problems
later in the data life cycle. These problems may include critical business
processes being held up, bad decision making, or final reports running afoul of
compliance because of erroneous data values. In this phase, data is assessed
for its quality and completeness using a set of predefined rules.
The Capture and Qualify phases are traditionally seen as
under the purview of the IT team, which sets up the system architecture. But
management accountants, with their knowledge of accounting processes and the
way data will be utilized in later phases, have the ability and responsibility
to envision the framework of the system in partnership with IT.
Transform. The advent of Big Data has led to enterprises being able to capture a seemingly
infinite amount of data. Couple this situation with IoT, and soon you can be
drowning in data. Thus, enterprises have to transform, synthesize, and simplify
the data so it can be utilized by functional departments. This phase is
commonly called “analytical modeling” in the financial world. A certain
level of functional expertise is required at this phase as
data from different sources is linked together to find the intrinsic value
hidden beneath.
Utilize. The final aim of data is to help
enterprises make good business decisions, and, in some cases, data itself is
the final product or service of the enterprise. Either way, the true value of
data is unlocked in this phase, and the previous efforts made in data capture,
qualification, and transformation finally bear fruit. Management accountants
act as business partners during the Transform and Utilize phases. As business
partners, we need to translate the data values into business stories that help
enterprise leadership understand the magnitude of their decisions and their
long-term impact.
Report. This phase relates to external
reporting. Internal management reporting for decision making is realized during
the Utilize phase in the data life cycle. External reporting could involve
quarterly/yearly financial reports, financial data sent to other vendors for
bids, and other compliance reports. Reporting of data is a key phase of the
data life cycle that is ripe for automation. Since rules, definitions, and
requirements for these reports either rarely change or have slight changes year
after year, automated processes designed by management accountants help to
create and publish these reports in a more efficient manner.
Archive. This is the beginning of the end for the
data that the enterprise has spent a considerable amount of time and resources
on to unlock its value. Data archiving is the transfer of data from an active
stage to a passive stage so that it can be retrieved and reutilized as needed.
Purge. The final phase of the data life cycle
is the removal of the data (and any copies) from the enterprise. It occurs in
the data archive and is sometimes accompanied by a communication both inside
and outside the enterprise.
Financial compliance rules within an enterprise or those
imposed by regulatory bodies normally drive the Archive and Purge phases. And
management accountants act as custodians to ensure that these compliance rules
are followed within the enterprise for financial data.
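As a small, hedged illustration of the Archive and Purge phases described above, the Python sketch below checks archived records against a retention period. The retention length and record dates are invented for illustration only and are not taken from any regulation or from the course script.

from datetime import date, timedelta

RETENTION_YEARS = 7  # hypothetical retention period set by compliance policy

# Hypothetical archived records with the date they entered the passive stage.
records = [
    {"id": "INV-001", "archived_on": date(2012, 3, 31)},
    {"id": "INV-002", "archived_on": date(2018, 6, 30)},
]

# Records older than the retention cutoff are candidates for the Purge phase.
cutoff = date.today() - timedelta(days=365 * RETENTION_YEARS)
for record in records:
    action = "purge" if record["archived_on"] < cutoff else "retain"
    print(record["id"], action)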
Section 2: Introduction to Data Governance
As users of systems, technology, and data, finance
professionals are invested in the integrity of our information and its sources.
IT and data governance can provide the structure and rules to ensure data
accuracy and availability while managing the associated risks.
Every organization with shared data is concerned with data
integrity. Data governance, a specific sub-element of IT governance, parallels
the capabilities of corporate governance and IT governance at the data level.
This can be as simple as a set of rules specifying what data (for example, system fields) is to be entered by whom, when, and from what source, or as complex as you want (for example, multiple levels of data entry, audit, and control structures).
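To make the "simple end" of this concrete, here is a minimal, hypothetical sketch in Python: a rule per system field stating who enters it, when, and from what source, plus a check that a record satisfies the rules. None of these field names come from the course; they are placeholders.

# Hypothetical governance rules: field -> who enters it, when, and from what source.
GOVERNANCE_RULES = {
    "invoice_number": {"entered_by": "billing clerk", "when": "at invoicing", "source": "ERP"},
    "customer_id":    {"entered_by": "sales rep", "when": "at order entry", "source": "CRM"},
}

def check_record(record: dict) -> list[str]:
    """Return the governed fields that are missing or empty in the record."""
    return [field for field in GOVERNANCE_RULES
            if field not in record or record[field] in ("", None)]

missing = check_record({"invoice_number": "INV-001"})
print(missing)  # ['customer_id'] -- flags the field whose governance rule is not met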
In this lesson, we will identify how organizations can build
trust in data by implementing a data-driven culture with effective and
efficient data governance.
Data Governance and the Data Life Cycle
Data governance helps an enterprise administer the data as it
flows through the various phases of the data life cycle discussed in section 1
of this lesson.
During the Capture phase, enterprises need to identify the
capture points for the data and define the data that will be captured.
As data enters the Qualify phase, the rules of data governance
act as a check to ensure that inaccurate data is identified, assessed for
completeness, and secured.
At the Transform and Utilize phase, focus shifts toward
adherence to transformation rules and the legal utilization of the data
according to regulatory standards for decision-making purposes. As
the Reporting phase is all about showcasing data to external
parties, data governance lists the steps to take when inaccurate data is
reported outside the enterprise.
Archiving data relies on a set of rules that define what
occurs, as well as when and how. And in the Purge phase, it’s critical to set a
purge schedule for the data as per the retention period requirements.
Why Data Governance is Needed
Effective and efficient data governance can facilitate powerful
analytics and decision making
across an organization. Demand for advanced analytics is
increasing, and, as previously discussed in the course, so, too, is the
expectation that management accountants will perform the analytics.
But how often have we heard of an analysis resulting in a
business decision that later proved problematic because of faulty data? As
technology makes it easier for employees to access data, write reports, and
conduct their own analysis, data governance becomes an even more important
safeguard to ensure the integrity of underlying data.
As businesses become more data-driven, data governance
provides the foundation for growth into predictive modeling and automation.
Data-Driven Culture
When asked, 99 percent of leaders of large organizations say
they want a data-driven culture to maximize the value of data through
analytics. As we also previously identified, they aim to make business
decisions faster and more accurately through automation.
Why emphasize culture? The limitation for achieving analytics
maturity isn’t usually related to data or technology but, rather, people’s
reluctance to use data and technology to answer business questions; in other
words, using data analytics rather than intuition as a driver for business decisions.
A shift is needed toward a culture that trusts that these data-driven decisions
will be effective.
Governance Attributes
For this analytics process to function effectively, the data
inputs (“raw materials”) must be consistent and reliable for the information
outputs (“finished goods”) to be relevant and comparable. Relevant, reliable,
comparable, and consistent are the four desired attributes of accounting
information. Effective governance means that data used in decision making is of
consistent quality and from reliable sources. Efficient governance leverages
connectivity and technology to enable the comparison of data from many
different sources and to deliver relevant analysis.
Governance Problems
Unfortunately, only about one-third of data-driven culture
initiatives succeed in larger firms. Often the reasons for failure stem from
insufficient data governance. If an organization suggests its data quality is
insufficient to use for decision making, that signals ineffective data
governance. If the data can’t be accessed, that’s a symptom of inefficient data
governance. Click each icon to learn more.
Ineffectiveness. “I think you forgot to
adjust your dates for time zone…”; “I’ve heard ‘churn’ defined three different
ways today…”; “We can’t do that analysis; we don’t have good data….” All common
phrases for emerging data-driven cultures that struggle with effective data
governance. Issues with data and analyses, from little mistakes in calculations
to instability in data sets, all build to create a culture of mistrust in data
and, by extension, mistrust in any decision based on that data. The root cause
of these issues is often ineffective data governance.
Inefficiency. Certain phrases signal
inefficient data governance: “Will you email me that database extract?”; “Why
can’t I access that reporting database? There’s no way for me to get
approval?”; “It’s going to take more than a week to access?”; “I’ll just get
someone to build out a new data platform for my department.” Inefficient data
governance is usually more difficult to resolve than ineffectiveness. Sometimes
it’s necessary due to regulations, e.g., open-source language restrictions or
the EU General Data Protection Regulation (GDPR); in other cases, it’s a
symptom of an organization failing to commit sufficient resources to data
governance.
Blocked Access. Often the easiest way to
govern a data set is to block access. But blocking access can create
inefficiencies. While no access means no one can compromise the data, it also
means no one can use it to improve the business.
Nonconnected Data Sharing is also inefficient. If
users are constantly getting data from File Transfer Protocol (FTP) or email,
then it will be difficult for them to create a report that updates
automatically and impossible to automate decisions. Commitment to analytics
governance means giving people access to data in a way that facilitates
analytics maturity to create further value, even though that takes time.
Principles for Implementation
Efficient and effective data
governance provides clear ownership and standards for data and data processes
to ensure data quality. Although many approaches exist for data governance
implementation, most share the following principles. Click each icon to learn
more.
Accountability. There must be clearly
defined ownership of, and accountability for, different types of data.
Interconnected data managed throughout the organization requires consistent
practices in order to maintain its effectiveness and value. In most
organizations, data oversight doesn’t reside within one department. Human
resources is the keeper of employee-related data, for example, while accounting
maintains financial data. Shared governance brings consistency by establishing
organization-wide policies and procedures.
Standardization. Data is an asset and must be
protected like one. Clear policies on access, definitions, privacy, and
security standards are needed. The committee must define the policies, and each
department head must ensure adherence. This approach will ensure that the
organization is in compliance with regulations, such as GDPR (General Data
Protection Regulation).
Quality. Analysis is a critical tool for decision
making and is only as good as the data upon which it relies. The quality of
data should be managed from the time it’s captured. Good data governance
includes defining one set of data-quality standards for the organization and
establishing consistency in how that data quality is measured and recorded.
We live in a fast-paced world where management accountants
are asked to provide insight and foresight through analytics, often on short
notice. Data governance helps ensure that our data is readily accessible and
accurate. That’s especially true in situations where data is spread throughout
disparate systems and departments. Often the analysis we’re asked to perform
relies on data that we don’t oversee. By coordinating with other data owners in
the organization, we can
protect the integrity of data and spend more time on
value-added analysis than on scrubbing data.
Committee of Sponsoring Organizations
Standards for data and data processes to ensure data quality
are outlined in various frameworks. The Committee of Sponsoring Organizations’
(or COSO’s) mission is to provide thought leadership through the development of
comprehensive frameworks and guidance on enterprise risk management, internal
control and fraud deterrence designed to improve organizational performance and governance and
to reduce the extent of fraud in organizations.
The COSO Internal Control – Integrated Framework is a globally recognized framework for developing and assessing internal controls.
The purpose of the COSO Internal Control – Integrated
Framework is to help management better control the organization and to provide
a board of directors with an added ability to oversee internal control.
Internal control enables an organization to deal more effectively with changing
economic and competitive environments, leadership, priorities, and evolving
business models. The framework identifies five components that are comprised of 17 principles. For more information, you can download the framework from the
Resources link above.
The COSO Internal Control – Integrated Framework
As a high-level overview of the COSO Internal Control -
Integrated Framework, we’ll refer to the COSO cube and two of its principles
that can be applied to data governance. These principles reside within the
Control Activities and Information & Communication components of the cube.
Click each icon to learn more about these principles and how
they apply to data governance.
Principle 11 states “The organization
selects and develops general control activities over technology to support the
achievement of objectives.”
Ensuring an acceptable system of internal control over
systems and processes will protect stakeholders and the data integrity itself.
Management must carefully select appropriate control activities over the
technology infrastructure that help ensure data completeness, accuracy, and
availability of technology processing.
Principle 13 states “The organization
obtains or generates and uses relevant, quality information to support the
functioning of internal control.”
To achieve this, information systems should capture internal
and external sources of data, and process and transform relevant data into
information that is timely, current, accurate, complete, accessible, protected,
verifiable and retained. Information must be reviewed to assess its relevance
in supporting the internal control components.
Although we pointed out two principles from two different
components within this COSO Internal Control – Integrated Framework, factors
relating to technology must be considered throughout all five components and seventeen
principles to provide reasonable assurance of appropriate controls. An
effective system of internal control will demonstrate that all components are
present and functioning to the achievement of objectives throughout the entire
entity structure.
Section 3: Introduction to Data Analytics
Data analytics is the science of examining data with the
purpose of creating actionable insight. It gives you the ability to react
quickly to an increasingly complex, volatile, and competitive environment. Most
organizations know that enhancing their analytical capabilities is critical to
their success and survival, helping them gain a competitive advantage or helping
them maintain their current market position by helping managers and
organizational leaders make better business decisions.
Data Analytics and Strategic Management
Data analytics must be anchored in the entire strategic
management process: strategic analysis, strategy formulation, strategy
execution, and strategy evaluation.
It is all about quantifying business issues and making
decisions with more accurate and fact-based data.
Business analytics and business intelligence use data mining
-- examining large databases to generate information and extract patterns --
along with statistics and modeling software to support data-driven business
decision-making.
Data
Analysis Models
There are many data analytics models available including
regression analysis, classification analysis, customer segmentation, market
basket analysis, and others. Let’s take a moment to define some of these
models.
Regression analysis is a statistical model used for obtaining
an equation that best fits a set of data and is used to show relationships
between variables.
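To make the idea concrete, here is a minimal sketch in Python; the use of scikit-learn and the numbers themselves are illustrative assumptions, not part of the course material:

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: advertising spend (in $000s) and monthly unit sales.
ad_spend = np.array([[10], [15], [20], [25], [30]])   # explanatory variable
unit_sales = np.array([120, 150, 185, 210, 240])      # response variable

# Fit the line that best fits the data: sales = intercept + slope * spend.
model = LinearRegression().fit(ad_spend, unit_sales)

print(f"Intercept: {model.intercept_:.1f}")
print(f"Slope: {model.coef_[0]:.2f} units per $1,000 of ad spend")
print(f"Predicted sales at $35,000 of spend: {model.predict([[35]])[0]:.0f}")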
Classification analysis attempts to find variables that are
related to a categorical (often binary)
variable.
Customer segmentation, also known as clustering, groups
customers into similar clusters, based on the values of their variables. This
method is similar to classification except that there are no fixed groups. The
purpose of clustering is to discover the number of groups and their
characteristics, based entirely on data.
Market basket analysis tries to find products that customers purchase together in the
same “market basket.” In a supermarket setting this knowledge can help a
manager position or price various products in the store. In banking or other
retail settings, it can help managers to cross-sell (sell a product to a
customer already purchasing a related product) or up-sell (sell a more
expensive product than a customer originally intended to purchase).
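As a purely illustrative sketch of the market basket idea, the product pairs most often bought together can be counted directly from transaction data; the baskets below are invented and no particular mining library is assumed:

from itertools import combinations
from collections import Counter

# Hypothetical transactions: each list is one customer's "market basket".
baskets = [
    ["television", "soundbar", "hdmi cable"],
    ["television", "soundbar"],
    ["refrigerator", "water filter"],
    ["television", "hdmi cable"],
    ["television", "soundbar", "wall mount"],
]

# Count how often each pair of products appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(set(basket)), 2):
        pair_counts[pair] += 1

# Pairs bought together most often are candidates for cross-selling or bundling.
for pair, count in pair_counts.most_common(3):
    print(f"{pair[0]} + {pair[1]}: together in {count} of {len(baskets)} baskets")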
It is important to note that Artificial Intelligence and
machine learning also play an important role in performing data analytics.
Now you will be presented with an overview of how data
analytics can be used to create value in an organization.
Video:
Value Creation Through Data Analytics
Video Transcript (Est. time: 3:44 min):
Technological advances in gathering and processing data are
increasing at an unprecedented speed. Data analytics includes the
extraction and analysis of data using quantitative and qualitative techniques
to gain insights, improve predictions, and support decision making.
For management accountants, the ability to exploit this data
through meaningful data analytics is a critical component of the accounting and
finance profession. In their evolving roles, management accountants will be
required to acquire enhanced skills in data mining, analysis, and effective
communication through data visualization. As business partners, management
accountants become storytellers, providing relevant hindsight, insight, and
foresight to their organization, effectively turning information into
intelligence through data analytics.
Let’s take a few moments to review the key aspects of data
analytics and their contribution to the value creation within an organization.
Value is added to an organization as the sophistication level of the analytics
being performed becomes more robust.
We begin by leveraging historical data of an organization to
provide hindsight into what happened. This type of analytics is known as descriptive
analytics. Many accountants leverage tools in Excel, such as pivot tables
and graphing to perform descriptive analytics. There are also many other tools
in the market being used to perform this type of analytics. Although understanding what
happened is very important, further hindsight explaining why things happened is
even more valuable. The type of analytics that help us understand why things
happened is referred to as diagnostic analytics. Tools often used to
perform diagnostic analytics may include Excel’s customer segmentation, what-if
analysis, and multi-variable regression. Here again, other software products in
the market are also being used.
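As an aside to the transcript: a rough Python equivalent of the pivot-table approach it mentions (pandas and the sample numbers are assumptions for illustration only) looks like this:

import pandas as pd

# Hypothetical sales records; in practice this would come from the ERP or data warehouse.
sales = pd.DataFrame({
    "month": ["Jun", "Jun", "Jul", "Jul", "Aug", "Aug"],
    "brand": ["A", "C", "A", "C", "A", "C"],
    "units": [500, 320, 540, 300, 560, 290],
})

# Descriptive analytics: a pivot-table style summary of units sold by brand and month.
summary = sales.pivot_table(index="brand", columns="month", values="units", aggfunc="sum")
summary = summary.reindex(columns=["Jun", "Jul", "Aug"])   # keep months in calendar order
print(summary)

# A first diagnostic step: month-over-month percentage change for each brand.
print(summary.pct_change(axis="columns").round(3))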
Up to now, we’ve defined analytics that provide hindsight.
What if we could gain insight into what is likely to happen going forward? Predictive
analytics adds value to an organization by attaining this type of insight
into what is likely to happen. Accountants can access Excel tools like
exponential trend smoothing and Solver to perform predictive analytics. As
software sophistication continuously improves, the market will continue to
expand upon product offerings that provide this type of insight.
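For readers who prefer to work outside Excel, a rough counterpart to trend smoothing can be sketched with statsmodels; the series below is invented and the model choice is only an assumption for illustration:

import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Hypothetical monthly unit sales for one product line.
history = pd.Series([320, 310, 305, 300, 298, 290])

# Fit an additive-trend exponential smoothing model and project three periods ahead.
model = ExponentialSmoothing(history, trend="add", seasonal=None).fit()
print(model.forecast(3).round(1))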
Taking the value added through analytics even further, Prescriptive
analytics help organizations attain foresight needed to decide what an
organization should do or what actions should be taken to create added value.
Applying some of the predictive analytic tools from Excel may also help in
performing prescriptive analytics.
Finally, Adaptive analytics help further gain value by
providing additional foresight into how machine learning can help. As
organizations continue to innovate new technologies with artificial
intelligence, more adaptive analytics will be applied.
Advances in technology and analytics are rapidly changing the
role of the management accountant, how their work is done, and which types of
accounting functions are becoming obsolete.
Embracing the changes brought on by artificial intelligence
and further technological achievements will enable management accountants to
play a more proactive role in providing insight, transparency, and foresight as
valued business partners within their organization.
Understanding what it takes to perform relevant data mining,
value-added data analytics, and effective communication through data
visualization is a critical competency associated with the future of the
accounting and finance profession.
ABC
Electronics Data Analytics
You’ve just heard about how management accountants use data
analytics to make informed strategic decisions and increase company value. Data
analytics provide hindsight, insight, and foresight for helping companies
achieve their goals.
Let’s learn about how the five types of data analytics can be
used to tackle a declining sales problem using a fictional scenario.
Let’s find out:
How the sales decline was identified and why it was
problematic.
What they learned about reasons for the sales decline.
What options they had to choose from for addressing the
problem.
Which options they chose to meet their goals.
ABC
Electronics – Sales Decline Data Mining
ABC Electronics is a retailer of household electronics
products including televisions, refrigerators, washers and dryers, and more.
Through data mining, they discovered a decline in Brand C
television sales.
ABC Electronics has an interactive dashboard that provides
insightful visualization for analysis and decision making. Take a moment to
review ABC’s sales dashboard for their line of televisions.
Data
Analytics Phases
The analytics value chain shows
how ABC Electronics made the most of data mining and modeling to address the
declining sales.
Working from hindsight to foresight, each analytics
type below reveals ABC Electronics’ data revelations and decisions.
Descriptive Analytics
Descriptive analytics define a business problem. Raw data was
cleaned, transformed, and summarized. ABC then used this data to create their
television sales dashboard. The dashboard’s drill-down
capabilities for descriptive information are how the Brand C sales decline was
identified.
The dashboard revealed overall television sales were up 5%, but
Brand C sales decreased 2%.
Diagnostic Analytics
Diagnostic analytics provides possible reasons for the
business problem. It tries to find correlations of the product’s activity using
data.
In the case of ABC Electronics, Inc., diagnostic analytics
tries to find possible reasons for the sudden decline in the sales of Brand C
televisions by searching for relationships between data attributes and product
activity. For ABC Electronics, attributes are factors that could have an impact
on sales, including:
Time of year (seasonality)
New comparable TV from competitor
Product reviews
Product price
Social media mentions
Statistical modeling looks at the relationship between the
attributes and sales. Visually, the more linear the relationship, the better
the results.
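The case study does not show the underlying model; as a minimal, hypothetical sketch (invented weekly numbers, pandas assumed), a first pass at this kind of relationship-hunting is a correlation check of each attribute against sales:

import pandas as pd

# Hypothetical weekly data for Brand C televisions; the columns mirror the
# attributes listed above, but every number is invented.
weekly = pd.DataFrame({
    "brand_c_units":     [410, 395, 380, 360, 345, 330],
    "competitor_price":  [499, 489, 479, 469, 459, 449],
    "avg_review_rating": [3.9, 3.8, 3.8, 3.7, 3.6, 3.6],
    "social_mentions":   [120, 118, 130, 115, 110, 108],
})

# Correlation of each attribute with Brand C unit sales; values close to +1 or -1
# suggest a strong, roughly linear relationship worth modeling further.
print(weekly.corr()["brand_c_units"].drop("brand_c_units").round(2))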
What did ABC’s analysis find?
Reasons related to the attributes are:
1. The decline happened in July and August, a typical slow
sales period.
2. The competitor’s product sells for less.
3. Customer reviews gave average ratings.
Predictive Analytics
Predictive data analytics can provide insight about what
might happen given the current circumstances. This is done through building
analytical models, such as regression or market basket analysis, to compare
with actual results.
Based on their analysis, ABC Electronics’ prediction is that
Brand C sales will continue to decline due to
the attributes of new competitor product, pricing, and customer reviews. This
predicted decline will result in leaving ABC with twice as much inventory as
desired.
Is that what happened during the next sales cycle?
The results from actuals in the next reporting period matched
the prediction. Brand C sales declined 2%.
With the prediction proven correct, the model will be
adjusted by assigning higher weighted scores to the attributes associated with
the decline in sales.
Prescriptive Analytics
As we just learned, predictive analytics are a company’s
attempt to foresee how changes they make might address a business need. Taking
it a step further, prescriptive analytics uses complex modelling to suggest
actions to take in order to achieve possible outcomes.
The accuracy of possible outcomes depends on two important
factors: data quality and model quality.
With the prediction of twice as much Brand C inventory as
needed, what do you think ABC’s best options are?
Two possible actions were identified:
1. Offer a 30% discount, which is predicted to improve sales by
10%.
2. Offer a package discount for purchasing a soundbar with
the television because analytics showed customers frequently buy a soundbar
when buying the television.
Adaptive Analytics
Adaptive analytics is another resource to provide foresight
about what might happen given what has happened so far. Machine learning models
incorporate actual results and continuously adjust based on new data received.
Adaptive analytics collect Big Data into one central
repository. Included in Big Data is information related to sales, marketing,
email, websites, and content management systems. Machine Learning models use
this data to make more accurate predictions.
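The course does not prescribe any particular machine learning library; purely as an illustration of a model that continuously adjusts, the sketch below uses scikit-learn's SGDRegressor, whose partial_fit method updates the model in place as new actuals arrive (all numbers are invented and pre-scaled for simplicity):

import numpy as np
from sklearn.linear_model import SGDRegressor

# Hypothetical, already-scaled weekly features for Brand C televisions:
# column 0 = our price relative to the competitor, column 1 = review score / 5.
X_history = np.array([[1.05, 0.78], [1.04, 0.76], [1.03, 0.76], [1.02, 0.74]])
y_history = np.array([4.10, 3.95, 3.80, 3.60])   # unit sales, in hundreds

# Each call to partial_fit nudges the coefficients using only the data passed in,
# so the model can keep learning as new observations stream in.
model = SGDRegressor(random_state=0)
for _ in range(200):                 # several passes over the small history
    model.partial_fit(X_history, y_history)

# A new week of actuals arrives: update the existing model instead of refitting from scratch.
model.partial_fit(np.array([[1.01, 0.72]]), np.array([3.45]))

print(model.predict(np.array([[1.00, 0.72]])).round(2))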
Analytics at this stage are useful in validating the accuracy
of predictions. For example, if a prediction recommends a change in marketing
strategy, adaptive data analytics tells whether the change is working and how
to adjust the model if it isn’t. Adaptive
analytics can also help find unrelated trends, such as the finding that customers who recently
purchased homes and bought a television are also likely to buy five additional
electronic products.
What did ABC Electronics do?
ABC Electronics used adaptive modeling to build customer
profiles based on customer behavior data collected from web ads, emails, text
messages, etc. Using the information from these specific customer profiles, ABC
Electronics will plan to offer personalized promotions such as pricing
discounts and package deals to increase sales and effectively reduce inventory.
ABC will continue to adjust their analytical models as new
actuals data is received. With continuous monitoring and adjusting, ABC will be
equipped to make the best use of the data available for making effective
strategic decisions.
Analytics
Value at Your Organization
Time and more data will tell if ABC Electronics is able to
reduce the excess Brand C inventory. While thinking about your organization and
data analytics, what information is already gathered
that can tell a story about a business need? Or what
information is not gathered that could be helpful? How can the types of data
analytics help drive decisions in your organization?
Data
Analytics Competencies
According to IMA’s Management Accounting Competency
Framework, within the “Technology & Analytics” domain, data analytics
competencies encompass extracting, transforming, and analyzing data to gain
insights, improve predictions, and support decision making. While the required
level of competency may vary, at a minimum it must include knowing what types
of analytical models are available and to what business problems they can be
applied.
Beyond that, important, perhaps essential, skills include the
ability to transform raw, unstructured data into a form more appropriate for
analysis (data wrangling), the ability to mine large data sets to reveal
patterns and provide insights, and the ability to interpret results, draw
insights, and
make recommendations based on analysis. At the conclusion of
this module, we will begin to work with data sets and apply analytical models
to begin making business decisions.
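As a small, hypothetical illustration of the data wrangling skill mentioned above (pandas and the messy records below are assumptions, not course material), raw data is rarely analysis-ready and usually needs this kind of cleanup first:

import pandas as pd

# Invented raw records with inconsistent formatting, stray whitespace, and a missing value.
raw = pd.DataFrame({
    "Region ":   [" north", "South", "north ", None],
    "Sales ($)": ["1,200", "950", "1,050", "800"],
})

# Wrangle into a tidy table: clean column names, standardize labels, make sales numeric.
clean = (
    raw.rename(columns=lambda c: c.strip().lower().replace(" ($)", "_usd"))
       .assign(
           region=lambda d: d["region"].fillna("unknown").str.strip().str.title(),
           sales_usd=lambda d: d["sales_usd"].str.replace(",", "").astype(float),
       )
)
print(clean)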
Concluding
Thoughts
No
Longer a Human Endeavor
Accounting is no longer solely a human endeavor. Similar to
what has already occurred in the manufacturing, transportation, and medical
industries, robots are now performing previously people-driven accounting and
finance tasks, including transaction matching, variance analysis, and
reconciliations.
Exponential
Growth
Technology is transforming the management accounting
profession at an accelerated rate, but management accountants should not be
concerned about job security or about possessing the skills to adapt to their new
roles in the digital age. RPA, AI, blockchain, and any other technological
advancement that contributes to the exponential growth of available data can
and should be properly leveraged to make strategic business decisions.
Strategic
Business Partner
When financial data is available within minutes, management
accountants have the time to serve as true strategic business partners and
analyze and visualize the data so organizations can respond more quickly to the
marketplace, capitalize on innovation opportunities, ensure continuous
integrity, and, most importantly, uphold stakeholder and consumer confidence.
In the next module, we will focus on the development and
practical application of data analytics and data visualization for accounting
and finance professionals.
Module 1
Wrap‐up
You have completed Module 1 of this course and should now be able
to:
Recognize the impact of technological advancements on the finance
and accounting profession.
Describe the typical life cycle of data.
Explain the impact data governance has on business and its
stakeholders.
Define the various types of data analytics and how each
progressively creates value within an organization.
Module 2:
Visualizing the Present & Predicting the Future
Introduction
Welcome to Module 2, Visualizing the Present & Predicting
the Future.
The goal of this module is to gain a deeper understanding of
data analytics and data visualization through the fictitious scenario depicted
in the case study, Huskie Motors Corporation: Visualizing the Present &
Predicting the Future.
The case study is available in the Resources link of this
course. You are not required to read the case study ahead of time; however, it
is available to download if you prefer to print it and follow along.
Module 2
Goals
Upon completing these lessons, learners will be able to:
Define data visualization.
Describe how data visualization can impact the way data is
communicated.
Identify various data visualization tools and their different
uses.
Recognize the importance of choosing the right visualizations
based on your audience.
In Module 2, Knowledge Check questions are dispersed
throughout the lessons. There are a total of six knowledge check
questions among the three lessons. Be sure to answer all six questions
in this module before moving ahead to the next module.
Module
Menu
Module 2 consists of three lessons:
Lesson 1: Case Study Introduction
Lesson 2: Huskie Motors Operations, and
Lesson 3: The Data Dilemma.
Lesson
1: Case Study Introduction
Section
1: Introduction to the Case Study
In order to work effectively with, analyze, and make business
decisions from the abundance of data made available through automation, we have
made it clear that solid data governance is vital to the success of an
organization’s strategy.
To establish a practical understanding of data analytics and
data visualization, we will now examine the opportunities and challenges
created by Big Data and demonstrate how management accountants can apply new
competencies to the various data processes of an organization.
We’ll now begin the case study Huskie Motor Corporation:
Visualizing the Present and Predicting the Future. Before we begin, it’s
important to note that Huskie Motor Corporation and all of the characters that
are introduced are fictional and have no relationship to an actual
organization.
Big Data
Organizations create and collect massive amounts of data as a
result of their day-to-day operations. Frequently referred to as Big Data, it
represents an important asset for the organization.
Big Data presents both opportunities and challenges for
accounting professionals, who are expected to know how this data is created,
collected, stored, and accessed. As the custodians of the organization’s
assets, accountants are expected to understand and implement controls over the
storage and use of the organization’s data.
Big Data
and the Management Accountant
As business professionals, accountants are expected to know
how to use this vast source of data to make better business decisions and
identify potential risks. Understanding how to use Big Data to formulate and
solve business problems provides an opportunity for the accounting professional
to become a forward-thinking strategic partner in the organization. The
challenge for accountants is to develop the skill set needed to extract value
from Big Data through advanced analytics.
Data
Visualization
One skill that is becoming increasingly important for
analysis of large data sets is data visualization.
Data visualization is the process of displaying data to
provide insights that will support better decisions. Gartner’s 2017 Magic
Quadrant for Business Intelligence and Analytics Platforms states that “the
visual-based exploration paradigm has become mainstream.”
Gartner identifies three platforms as leaders in
visualization software: Tableau, Microsoft, and Qlik. All three products
provide relatively easy-to-use data visualization tools. In this lesson, you’ll
also learn about some of the other platforms that are available in the market.
As new software is introduced to the market, there may be other data visualization
tools that are leading in the industry.
Throughout this module, as you progress through the Huskie
Motors case study, you’ll be taken through a series of videos that discuss data
visualization topics such as knowing your audience, selecting the best
visualization in context of the business question, various visualization
software, and much more. Dan Smith, accounting professional turned data
scientist and leader in data analytics, will take you through these important
concepts on data visualization.
Video 1:
Data Visualization ‐ Why Visualization
Now let’s hear from Dan Smith where he further introduces the
concept of data visualization, discusses why visualizations are important, and
emphasizes the role and function of data visualization in analyzing Big Data.
Video Transcript (Est. time: 9:34 min):
In this series of videos, we’ll review the tools often
associated with creating data visualizations and provide a little context of
when you would use what tool or what visualization in what scenario. As the
case study progresses, you’ll see situations where you need to understand for
whom you are creating the visualization; that is, understanding your
audience. Understanding your audience means asking: does this audience need to explore the
data and find their own answers to their own business
questions, does your audience simply need to be informed
of the answer, or do they want a report only, providing a repeatable answer to
a well-defined question? We’ll use these concepts of exploratory, informative,
and reporting audience needs as a framework to talk about all the
visualization concepts associated with those visualizations. We’ll also dive
into the source of the data and where your data visualization tools live: the
data environment concepts. Again, that’s both for the data visualization tools
and for how the audience receives the information provided by those visualization
tools, the visualizations that you create.
Finally, at the end of the case study, we’ll discuss the
future of data analytics and the role of the management accountant in data
analysis. With that out of the way, let’s talk about why visualizations are
important. Visual analytics is formally defined as the creation
and study of the visual representation of information, or the process of
displaying data to support decision making, but what does that mean? What the
creation and study of a visual representation of information, quote unquote, is
really saying is that we’re simply taking data, the raw data,
to create information using a visualization. Now there’s an important
distinction between data and information. Data are the raw materials for the
creation of information. Data are numbers, rows in a table, transactions
before anyone sees them. Information is processed data: data which is
processed in such a way as to make some type of decision or gain knowledge
about the data or what that data represents.
And what about the need for visualizations? Why are they so
important? In modern analysis, visualizations are closely associated with how
humans consume information, how humans take data and process it as information.
You see, humans naturally use
visual indicators to understand their environment. Way back
at the beginning of humanity we would use visual cues to know if a plant was
ripe or a saber-tooth tiger was about to attack us. While understanding a bar
chart or a line chart is not exactly the same as understanding the threat of a
wild animal, visibility into the state of our environment, specifically the
business environment, is nonetheless very important for the modern human, and
because humans are so well adapted to processing massive amounts of information
visually, we can use visualizations to communicate complex ideas and
interactions between groups of data using visual representations of those
ideas and interactions.
Using visualizations is much more efficient than if we had to
communicate that information through text or
speech alone. For example, imagine explaining double-entry accounting using
only your words, no pictures, T-accounts, ledgers, or whiteboards. I remember my
first accounting professor talked about debits and credits being on the right-hand
and the left-hand side of a T-account in the general ledger. He
even stood on the table and had debit and credit written on his shoes to
reinforce that visual concept. You see, he was being a great professor because
he understood that we had to have a shared visual context of what was being
explained. Without visualization, without something tangible, you cannot know
what the other person is visualizing in their mind. It’s very difficult to be
on the same page to share a vision if you don’t have something to see in the
first place.
So visualizations, in other words, create shared context
between the audience receiving the information and you the individual that’s
communicating this context to them. Speaking of shared context, let’s do a
quick overview of the visualizations that we’re going to be covering. Now, you
may be unfamiliar with some of the terms that I use in this quick summary,
factors, groups, aggregates, distribution trends, etcetera, or unfamiliar with
the visualizations themselves. Don’t worry too much about that. Just make a
note. We’ll define all of these later, either in the videos, in the course
content, or through reference material which I’ll provide to you.
So let’s get started on this quick summary. Bar charts are for
comparing single values or aggregate values between a small number of groups, a
small number of groups that all share a common numeric baseline. Bar charts are
very effective when comparing the composition of groups or factors within a
larger group.
A pie chart for a single larger group, or a stacked bar chart
if you have many larger groups, are the visualizations typically used. When we
display trends over time, we use lines or
line graphs. When we have two separate fields, two separate
sets of values, and we want to
explore the relationship between them, that’s facilitated by
scatterplots or principal coordinates plots. When we look for the relationship
between many groups within those two values, so multiple factors within a set
of two values, we use a colored, sized, or shaped scatterplot, or we may use
something like a heat map to indicate high vs. low values within a table. This
is known as a heat map just because it looks like there are hot and cold areas
on your table. You’d say it’s a way of getting a quick visual
indicator of hot spots while at the same time providing raw numerical context.
Going on to more advanced visualizations. When we want to look
at traditional statistical
measures like average, mean, or median values (these are known as measures of
central tendency), we need to understand how the data is distributed around that
central tendency, so we use visualizations like histograms to see the
distribution of data around an average, mean, or median so that we can see how
spread out the data may be. If we want to compare multiple groups and their
central tendencies and distributions, we would use a box-and-whisker or a
candlestick plot. When we have a lot of information that we want to communicate
and perhaps we need to add an interactive component to it, allowing people to
explore that information, we would use a dashboard. A dashboard is a
combination of visualizations, oftentimes with filters which allow us to
modify the underlying data that populates the visualizations, that is,
filter the data.
Besides filters, we can include a drill-down component where
you have one visualization interactively modify another visualization. I would
be able to click on one so that I could see more details about that
visualization.
That was a very fast summary. We defined visualization as the
creation and study of the visual representation of information. We learned
visualization is important because it assists with informing an audience and
the exploration of data. Finally, we introduced many visualization techniques.
Again, don’t worry if you’re unfamiliar with some of the visuals or the
language I mentioned. We’ll go into more detail later and also there may be
some visualizations that you really like or that you’re really interested in
that I didn’t mention, like a Sankey diagram, waterfall chart, or ribbon chart.
We’ll dive into the core visualization concepts used across all of these
visualizations, so the underlying shapes and lines, the mechanisms of how
they’re communicating information. This will allow us to understand the
fundamentals of all visualizations, visual analytics, everything from
simple bar charts to complex infographics that you find
online on websites. I look forward to this journey in visual analytics with
you.
Next up, we’ll be talking about the tools used to create
visualizations.
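The transcript above surveys chart types without tying them to a particular tool; as a purely illustrative sketch (matplotlib and the numbers are assumptions, not something the video prescribes), the three basic forms it returns to most often can be drawn in a few lines:

import matplotlib.pyplot as plt

# Illustrative only: a bar chart (compare groups), a line chart (trend over time),
# and a scatterplot (relationship between two sets of values), using invented numbers.
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 3))

ax1.bar(["Brand A", "Brand B", "Brand C"], [540, 410, 300])
ax1.set_title("Bar: compare groups")

ax2.plot(["Jun", "Jul", "Aug", "Sep"], [320, 310, 300, 305])
ax2.set_title("Line: trend over time")

ax3.scatter([499, 489, 479, 469, 459], [410, 395, 380, 360, 345])
ax3.set_title("Scatter: relationship")

plt.tight_layout()
plt.show()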
Video 2:
Data Visualization ‐ Tools of the Trade
While the case study identifies three platforms as leaders in
visualization software, there are many others to be aware of. Dan will now
discuss the various visualization tools, and the strengths and weaknesses of
each.
Video Transcript (Est. time: 13:37 min):
Let’s begin our discussion on visual analytics by discussing
visual analytics tools. In this video we’re going to talk about what, exactly,
a visualization tool is and why they are becoming more and more prevalent in
business. We’ll cover the general concepts applied by visualization tools so
you can begin to feel more comfortable using a wide range of tools as opposed
to feeling like you have to specialize in just one tool over another. And
finally we’ll outline the general categories of types of visualization tools,
the spreadsheet-focused ones, the ones that concentrate on visualizations
first, the workflow-centric tools, and the code-based tools. In the case study,
they talk about how the visual-based exploration paradigm has become
mainstream. Now that’s a very consultant-y way of saying that people now expect
to be able to connect and explore their data in a visual platform. Well, what
do I mean by visual platform?
Well, literally anything that can connect to a data source
and modify the data in a way that displays a visualization can be called a data
platform. Okay, fine. So what do I mean by data source, and what do I mean by
modify data? Well, we’ll get to all of that towards the end of the case study
when we’re talking about the data and data environment concepts. But for now,
understand that particularly when it comes to Big Data, you’re not going to
be able to pull all the data into a single spreadsheet
because the data is simply too large. But more on that later. Let’s focus for
now on the tools themselves. So what, what do I mean when I say tool? Because
any of the tools, Tableau, Power BI, Qlik, Excel, Jupyter Notebooks, RStudio,
RapidMiner, KNIME, the list goes on and on.
Any of these tools are able to create pretty much the same
visual analytics. They just have their own way of doing it. And that’s because
all these tools apply the same fundamental concepts. And I’m not talking about
just visualization concepts, I’m talking about the highest-level analytics
concepts. You see we tend to use words like tools, concepts, and technology
interchangeably, but technology is the application of concepts using a tool.
Visual analytics tools apply analytics concepts to create useful information.
Competency in a tool is knowing how to build a bar chart in Tableau or in Power
BI. Competency in the concept is knowing when to use a bar chart or should you
use a line chart or should you use something else instead? How does the data
need to be shaped so that you can do that? How do I connect to the data?
There’s a bunch of underlying concepts that make these tools easy. And if you
think about it, when you hire a contractor to oversee the construction of your
house, for example, would you care what kind of hammer they used?
Probably not. The tool isn’t as important as the person’s
capabilities. Just like when you first learned to be a management accountant,
did you have to learn Excel first? I doubt it. I don’t remember there being
questions on Excel when I passed the CMA. Building an analytics solution,
visualization or otherwise, can be viewed through the same lens. Our concern is
how we are applying the visualization concepts so that they can be understood.
Now, this may seem like an unimportant distinction. Why am I going on and on
about
this? Because it’s so important that you understand the
difference between a tool and the concept the tool is applying. That’s how you
can see me jumping from Power BI to Tableau to RapidMiner to all these
different tools. It’s not because I’m a genius. No, I’m no smarter than anyone
else. It’s just I know the underlying concepts so I can easily jump from one to
another vs. having to learn specifically how to do something in one tool.
We’ve talked in depth about why concepts are important. What
are the concepts? Same analytics concepts used in data science. You connect to
the data, you transform that data and information, and you represent the data
in an understandable way. The specific concepts will be covered in the
visualization concepts video, and the data concepts and connecting to data
concepts will be covered in the data and data environments concepts video. But
at a high level we connect, we transform, we represent. So what are the
categories of these tools? Well, and this is purely my personal experience, I
feel there are roughly four general categories into which visualization tools
fall. The categories that I found are: spreadsheet-focused, things like
Excel and Domo; visualization-first, the ones where the focus is on you being
able to create a visualization. Those are the famous ones like Tableau and
Power BI.
Then you have ones that are centered more on the workflow,
workflow process management utilities like RapidMiner and KNIME. Finally, there are
code-centric tools like Jupyter Notebooks, Zeppelin, and even RStudio notebooks.
Let’s dive into each one of these. So first are our spreadsheet-focused
tools, and Excel is the bread and butter of accounting. Many visualization
features have been added to Excel. Excel itself is not a terrible visualization
platform, particularly if you’re using pivot tables and pivot charts instead of
making new tabs and new data sets for each visualization. Pivot charts actually
work a lot like the other visualization tools, so if you’re using those, you’re
probably comfortable with the visualization focus tools already. Because of
Excel’s market dominance, many apps have tried to mimic or extend Excel’s
functionality. After all, if you can’t beat them, join them. Apps like Domo
pitch that you can do advanced analytics within Excel, but there are a lot of
operations happening on the back end.
I’ll put up a little information in the course content with
links to some of the platforms. But I’d
wait until you’re finished with all of these videos before you start exploring
them so that you can better understand the data and data environment concepts.
I hear people say “I need to see my data” as a common complaint about the other
tool categories, particularly from people that started with Excel. I mentioned
before, though, that
sometimes you have terabytes of data. You have so much data,
you can’t bring it into your computer. It’s just not practical to see all of
the data. It won’t fit onto your computer. And even if it could, there’s so much
of it that seeing each individual row isn’t going to provide that much value to
you. There are plenty of ways where you can click on a visualization, a
component of visualization, and see what the details are about that point.
That’s called drilling down, which is a component in
dashboards. And we’ll talk about that in depth in a future video. For now,
sticking to the tools themselves, one of the primary ways of creating a
dashboard, particularly interactive dashboards, is the visualization-focused
tools. These visualization-focused tools might more accurately be called
visualization-focused platforms, since they tend to encompass both the transformation of data, data
workflows, and the visualizations themselves; however, they’re visualization-first
and they tend to abstract a lot of those data transformations. If I say
abstract, it means that the tool is determining how to perform an operation,
not the user, which may not always be ideal, particularly if you need to
communicate how that tool is doing the transformation, since you can’t see the
back-end operation with visualization tools. Though, as their name implies,
they can look gorgeous. Those gorgeous visuals abstract a lot of complex
operations down to relatively simple steps, so the good-looking visuals can be
created quickly. Because of all that abstraction, you can create interactive,
data-connected dashboards.
Those dashboards can be made into recurring, data-connected
reports. They can extract large data source queries from applications like
Hive, Spark, BigQuery, Redshift, etcetera, for those exploratory operations.
The major players in visualization-focused platforms are Tableau, Power BI, and
Qlik, with lesser-known players like Birst, Pentaho, and Spotfire, and if you don’t
have access to one through your workplace and you want to learn, both Tableau
and Power BI have free versions, Tableau Public and Power BI Free License,
respectively. In terms of the capabilities of these platforms, there’s pros and
cons to all of them. If you want to use AR or create some of the more advanced
visualizations, Power BI is considered by some to be a bit better, however,
they use an underlying language called Dax, D-A-X, in order to create
functions. So if you need to create custom functions or custom operations
within the platform, there’s a pretty steep learning curve. Whereas building custom
calculations in Tableau, it’s generally easier because they have a more
abstracted, a SQL language that makes it easier to create custom calculations.
But that level of abstraction can make it hard to extend the platform beyond
making a custom calculation. So due to the difficulty in exploring the workflow
and the different operations used to create a visualization or an analysis,
tools that are focused more on the workflow, the transformations that are done
to your data set. So if you look at things like
RapidMiner 9, they’re great for exploratory data mining and
for creating visuals of the data exploration. All you really need to know is
search the name of the concept you want to apply and you’ll see a module for
it. They can be a little confusing for beginners and they often require a lot
of local processing, but these are a good middle ground between the abstraction
present in visualization tools and the more detailed operations found in code
focus tools. So speaking of the code focus tools, things like Jupyter
Notebooks, our notebooks is that plan. These are visualization tools second,
code management tools first. They are fantastic for exploratory data analysis
because you have a clear record of every transformation that happened to that
data. You have a clear record of everything that’s happened because it’s right
there in code. In the notebooks, you can write your research notes and your
findings directly in the notebook and have the code recompile and be completely
reproducible on demand. However, because you need to know the fundamentals of
coding and that’s beyond the scope of this course, we won’t be covering
code-focused tools in these videos. That said, I know you are a data scientist
and I know you already know the basics of coding even if you don’t think so.
I highly recommend you watch some introductory videos on
basic Python or R to learn the concepts. So in conclusion, in this video we
covered what a visualization tool is and how such tools apply the same analytics
concepts but focus on different aspects of the analytics workflow
and/or user experience. The analytics concepts they apply are connecting to data,
transforming the data into information, and representing the information in a
usable format. Visualization tools may focus on spreadsheet data, the
visualizations themselves, the workflow to connect, transform, and represent
data and information or the underlying code behind all those data operations. The
different tools all have advantages and disadvantages, and you’ll be able to
accomplish much the same in all of them with varying degrees of difficulty and
effectiveness. What’s more important is how you use the analytics concepts to
provide information on solved business questions as well as to explore the data
to find new business questions to solve. We’ll introduce concepts around
communicating results to your various audiences for visualizations in our next
video.
Section
2: Huskie Motor Corporation: Background
Huskie Motor Corporation (HMC) is an automobile manufacturing
company with production and sales throughout the world. Automobile
manufacturing and sales is a complex and highly competitive business. Let’s
explore the Huskie Motors background in detail to learn key business
information about the organization.
Huskie
Motor Corporation
Although the automotive industry has a broad global reach,
only 15 companies produce 88% of the world’s vehicles. HMC is a new and
smaller player in the automotive manufacturing market.
If it is to survive, the company must fully understand its
markets, customer base, and costs to
keep profit margins positive. It has some very popular brands
and high customer satisfaction; both are critical assets at this stage of the
game.
HMC:
Miranda Albany
Miranda Albany was hired at HMC as a senior cost analyst three
years ago, when the company
first began operations after a spin-off from Blue Diamond
Automotive, a large auto manufacturing company. Recently promoted to assistant
controller, Miranda is anxious to make a good impression on her boss.
Communication
with HMC Executives
Although Miranda is sure that the data she has collected can
help her management team make better decisions, she does not have the time, or
expertise, to figure out how to organize or use the data effectively.
Miranda communicates with HMC’s executive team on a weekly
basis to convey vital information regarding marketing strategies, sales
targets, and production needs. Yet she feels that her information is often
“lost in translation,” as the executive team struggles to digest the numbers.
Miranda believes that data visualization may be a crucial
component in helping her effectively connect with HMC executives.
Video 3: Data Visualization – Know Your Audience
Miranda’s struggle to effectively communicate to her
executive team highlights the importance of understanding the audience’s needs,
and how visualizations answer business questions. Dan will discuss how to
select visualizations based on your audience.
Video Transcript (Est. time: 13:34 min):
In this video we introduce the concept of understanding your
audience’s interest, that is, their business question, and how it determines the types
of visualizations you should use. We may create visualizations for an audience
interested in exploring data, or perhaps informing the audience, or simply reporting
a value. In our previous video, we introduced high-level analytics concepts. In
this video we overlay those concepts of connecting to data, transforming data,
and representing the information created by transforming the data
with the data analytics process
introduced in the case study, as well as with the audience needs. Furthermore, the
audience needs, the audience interest for visualizations also overlaps with
analytical maturity concepts. The analytical maturity concept stages are
descriptive, diagnostic, and predictive. Finally, we spend a little time on
what exactly do we mean when we say business question and what does that have
to do with the audience interest. So first let’s talk about the audience. It’s
easy to assume that visualizations and the concepts associated with
visualizations are one size fits all.
We will soon learn in an upcoming video that there are rules
for what visualizations work best in a given scenario. So it’s only natural to
assume that those scenarios are based on the underlying data or information
only. Who’s receiving that visualization should be irrelevant, right? And
sometimes there are boilerplate best practices. Sometimes there are universal
best practices for visualizations. However in most cases, context matters. The
audience matters.
For example, if the visualization explains information we
discovered in data, that is an informative visualization. The visualization
would be very different than the numerous visualizations we would have used to
explore the data in order to discover that new information that we are now
informing someone else about. An informative visualization is simple. It's to
the point. An exploratory visualization is usually quite complex. Understanding
which visualization to use and for whom is an important concept when it comes
to visual storytelling. So visual storytelling is what you’re trying to get
across. The idea of visual storytelling means that your audience is receiving
the information that they need in order to best use that information. So
somebody who would need a more exploratory type of visualization or even an
interactive dashboard would be someone like an analyst, a manager, a
consultant, somebody who really understands that data and needs to understand
the entire process of how some insight came about, to really have buy-in and to
really understand what’s happening. Alternatively, someone like an executive, a
VP, a director wouldn’t really care that much about how you reached some
conclusion. They would just want to know what that conclusion was. You would
still need the underlying evidence of how you reached the conclusion, of
course, but it’s just the facts. Tell me what matters. Those are simple
informative visualizations, and when the exploration has informed people so
much that they now are repeatedly asking the same question, now that becomes
something more along the lines of a report, very similar to informative but now
we’re talking about something that’s reoccurring, something that’s about
stability. So these would be, even for an external audience, a
reporting-focused audience,
but often it would still be the same people as exploratory
and informative, just a very, very, very, simple type of visualization.
And if you notice that I talked about that progression, that
general process of taking data and creating information for an audience, well
that’s similar to the data analytics process that we outlined in the case
study. You’ll see all these steps: identifying the business question, connecting
to data, cleaning it, etcetera. All of those involve exploratory
visualizations. Once we’ve gotten the data and explored it, cleaned it,
explored it again, and believe we have an indication of the answer to the
business question, well, then we use more informative visuals. Informative
visuals that clearly explain the answer that we’ve determined through our
exploratory effort. Many times the information that we’ve obtained isn’t a
one-off answer. The information can provide an ongoing solution. For
example, take many of the financial metrics which we’ve all
learned in earning our CMAs or in our basic finance courses when we first started
our jobs as financial analysts. Those metrics, those measures, like the
price-to-earnings ratio, like CAPM, like earnings before interest and
taxes, even the debt-to-equity ratio.
These are not metrics or KPIs that have always
existed. Somebody originally discovered that they are an indicator of business
performance. So when informative visualizations answer a question that is
asked repeatedly, often that informative visualization will become a report. A
report, again, is a visualization, a reporting visualization. Even a table is a
visualization. Reporting needs to be extremely precise and communicate the answer to a
business question over and over and over again. Like financial
reporting, visual reporting is very rules-driven and very data-driven,
and it requires more capability around consistency and the
infrastructure of the data environment rather than necessarily the competency of how
effective the visualization may be at communicating the report. The
effectiveness of the visualization has already been demonstrated in the
exploratory and informative steps. And
conceptually, the idea of reporting, informative, exploratory,
etcetera, also aligns with the concepts of predictive capability,
descriptive, informative, the steps of analytics maturity. Analytics maturity
is a business theory indicating that as an organization becomes more capable of
using their data to make informed business decisions, i.e., analytics, they
will gain more value from their data.
The steps in analytics maturity are descriptive, diagnostic,
and predictive. Many of you will have also seen this before with prescriptive
at the end, beyond the scope of this video but it’s out there when you look it
up. Going back to descriptive analytics: descriptive analytics merely communicates
some historical event and leaves interpretation wholly up to the end user. Just
like a financial report. Consistency and speed are typically the most important
aspects of descriptive analytics. Diagnostic analytics, this goes a step
further than descriptive. It’s not just describing some attribute of data, it
informs the audience with additional information either to help the audience
reach a more informed conclusion or to inform the audience about the analytics
conclusion. As you can see, this
is similar to informative visualizations. You will come
across this in financial analytics, when they’re taking a metric found in a
diagnostic measure and they’re adding additional information to provide context
as to why that metric may have gone up or down. Predictive analytics uses
statistical models to forecast new information. Given historical information
such as changes in a KPI and the additional data around it, it’s using that to
provide a forecast of what could be a future value.
Obviously, financial forecasting is a good proxy or is a good
example of predictive analytics. Now, let’s talk about business questions. A
few times I’ve said understanding the business question or defining the
business question, what does that mean exactly? Pretty much what it sounds
like. It’s the question we need to answer to help achieve our objectives, the objective
usually being solving some problem or fixing a situation. But as we know, a simple
definition, a simple question, can hide a great deal of complexity. A business
question can be interpreted as what problem are we trying to solve? And in many
cases, though, if we can define the problem, somebody has already figured out
how to solve it. Identifying the business question is by far the hardest part
of this entire process. Case in point: I once had a client who, in the course of six
months, rejected two different visual analytics contractors I had
placed at their firm.
I personally went to that client at my company's expense,
shadowed the third contractor for a week, and during that time I found that
while the client had asked for a highly technical resource, somebody that was
proficient in the visualization platform, they weren’t really defining their
business questions. The client wasn’t defining what they needed out of the
visualization platform. Their process of defining their business questions was
just to kind of have a roundtable discussion with the contractor about what
their problems were in general. The meeting was more of an airing of grievances
rather than a proper requirements-gathering session. This was my
mistake. I was young and I thought if somebody was asking for specific skills,
for example, a Tableau developer, then they may only need those skills. I
looked only at the problem as it was defined, not the root problem itself. The
lesson here is that just because somebody communicates their problem does not
mean they are communicating the problem.
The real skill of
an analyst is identifying what question needs answering, not just the problem
as it’s stated. Trying to solve a problem vs. determining the actual problem
which needs solving is the difference between a junior developer and a senior
analyst. It’s the difference between a purely technical skill and a broader
awareness of the business. It’s the difference between being reactive and
proactive. That may seem obvious to you but that’s because you’re probably one
of those senior resources, you’ve already accomplished the hard part. In this
course it’s assumed
the root business question is already defined. It’s like how
introductory physics assumes everything is in a vacuum. Assuming the root
business question is defined is a necessary assumption for you to learn the
technical competencies to answer said business question. And many of you are
taking this course because you feel the need to increase your technical
analytic skills.
Those skills are important. They are the price for entry.
Take it from me, management accountants are probably the best people I’ve ever
met at understanding the real problem, at defining the actual question. So, drawing on
my hard-earned experience: if you’ve already mastered the hard part of this
process, and that’s understanding the bigger picture, the bigger needs of the
business, you’re already well on your way to being a data scientist, a data analyst, whatever you
want to call yourself. In conclusion, in this video we’re beginning to
categorize our visualizations based on the intended audience for said
visualizations. We may create visualizations for exploring data, informing our
audience, or simply reporting a value. The audience for visualizations overlaps
with analytical maturity concepts. Exploratory visuals are often used in
predictive and diagnostic analytics, while informative visualizations are
primarily in descriptive and diagnostic analytics. Finally, because the goal
of analytics is ultimately to either answer a business question for your
audience or identify new business questions to answer, in which case you are
often the audience, we discussed what a business question actually means.
Miranda
Hires a Consulting Firm
To help her utilize the massive amounts of data at her
disposal, Miranda has interviewed consulting firms that specialize in
information technology (IT) and data engineering.
Ultimately, Miranda chose D & A Consulting because of its
automotive industry expertise and its focus on data analytics and
visualization.
Video 4:
Data Visualization – Visualization Concepts
D&A Consulting’s expertise in the automotive industry
will enable them to choose the best visualizations to effectively communicate
their findings. Now we’ll hear from Dan as he discusses some basic
visualization concepts and the various ways data can be visualized.
Video Transcript (Est. time: 6:32 min):
Now that we have some awareness around how our visualizations
will be received by different audiences, let’s move our discussion towards the
visualizations themselves. This video concentrates on visualization theory,
namely how shapes, lines, and points are used to display information. Shapes
represent a single value or groups of values. Lines represent change or
connection. Points represent the intersection of two or more values. Just as even the most advanced
statistics are still based on addition, multiplication, and division, even the
most complex visualization is formed through some combination of shapes, lines,
and points, and in this video I’ll be whiteboarding, because whiteboarding is
how we often start defining questions or creating visualizations. In practice
it’s good for us to see some of those real-world principles in action, and yes,
that means that [we’ll] basically be drawing shapes in this video, and that may
make visual theory seem a little childish at first, but I assure you we are not
in elementary school again because we will quickly move from basic shapes to
much more advanced concepts.
Let’s get started. We’ll begin our discussion on
visualizations with a little theory, followed by more practical examples in the
upcoming videos. Generally we have shapes like bars, blocks, or circles,
sometimes separated into segments. These shapes tend to represent a single value or a single
aggregate value, aggregate value as in a sum or total. With shapes, the pattern
of data within the category, or the relationship between the categories, is not
important. We’ll talk more about what that means when we cover distributions
and relationships respectively. Next, we have lines. Lines tend to represent a
change or difference between values most often associated with a change over
time. Lines are visually effective at showing that one point is lower or higher
than another when on the same axis.
They’re also useful to reinforce size differences or position
differences when shapes alone may be ambiguous due to their ability to indicate
direction or trend. Lines are useful to simplify complex patterns as seen in
forecasting and regression. Due to their ability to show difference and changes
between shapes, they’re useful for things like data distributions as we see in
histograms. For points, the position of a point in a visualization represents
the intersection of two values, which is why points are most often used in
scatterplots. That level of granularity is useful for recognizing larger
patterns within the category or group,
as well as between the categories or values. As I mentioned,
you’ll see lines used to simplify trends or relationship within a collection of
points like with a scatterplot and a regression line. In fact, you may see
shapes or colors used to distinguish patterns between groups of points.
You can see that we mix the concepts of different shapes and
lines and colors within visualizations, so as a gentle introduction to some of
the more complex visualizations, let’s look at how points, lines, and shapes
are used together to visualize more complex
patterns within and between groups of data. Visualizing patterns in the data itself is called looking at the distribution of your data. Distribution visualizations such as histograms and box whisker plots combine both
lines and shapes. There are even points showing outliers at times, and when I
say distribution of your data or frequency of data, it means how often a given
value appears within a given data element, or a group, or category, or column.
It can be the sale price of cars, age of customers, time between purchases,
etcetera. The frequency distribution or shape of data gives us more insight
into the underlying data which would not be apparent if we were only given the
average or median data value.
The boxes in a box whisker plot represent the middle 50 percent of your data, with a line that represents the median, or the middle
value of your data. Now you’ll also notice that there’s a line above and below
that isn’t a line so much as it is a T shape or whisker. Those whiskers
represent typically the top and bottom 25% of your data, but you may also see
lines between box whisker plots to show how different one box whisker is to
another, so you get a sense of how much overlap there is between the two. And
keep in mind, histograms and box whisker plots represent the same thing, the
shape or distribution of your data. You can look at a box whisker plot as a histogram simply turned on its side.
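To make that concrete, here is a minimal sketch in Python (using matplotlib and made-up values, not the case study data) that draws a histogram and a box whisker plot of the same numbers, so you can see they describe the same distribution:

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical values: simulated contribution margins (illustrative only)
rng = np.random.default_rng(42)
margins = rng.normal(loc=5000, scale=1200, size=500)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# The shape of the data as a histogram: frequency per bin
ax1.hist(margins, bins=30)
ax1.set_title("Histogram of contribution margin")

# The same distribution summarized as a box whisker plot
ax2.boxplot(margins)
ax2.set_title("Box whisker plot of the same values")

plt.tight_layout()
plt.show()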
Alright, in this video we covered how simple geometric objects
can be combined to represent complex information. Shapes indicate some aggregation of values; lines show change over time and/or trends; points show the intersection of values; and shapes, lines, and points combined show the distribution of values within a data set. Moving forward, we'll see how data visualization tools apply these concepts within the context of your case study and look in detail at specific visualization types, both informative and exploratory.
HMC
Staff
Miranda asked Megan Martinez, a senior staff accountant at
HMC, and Adam Green, a staff accountant at HMC, to work with D & A on the
project.
HMC:
Megan Martinez
Megan has been with HMC for two years and has recently
relocated to the corporate headquarters in Dearborn, Mich. Megan’s corporate
transition is almost complete, and she is anxious to move forward in her
current position.
HMC:
Adam Green
Adam is a young, aggressive employee who joined the company eight months ago, right after graduating from college. He has a good sense of
judgment and is eager to make a good impression on upper management. Miranda
believes the two employees will provide a good mix of experience, dedication,
and teamwork.
Section
3: D&A Consulting Group: Background
D & A Consulting was started by Doug Chan and Arlo Paxton
five years ago. Doug and Arlo have been friends since college, having graduated
with accounting degrees from the same university 15 years ago.
Although they initially went to work at different accounting
firms, they both followed similar career paths. After becoming managers at
their respective firms, Doug and Arlo decided that their real passion was in
teaching clients how to use data to make better business decisions.
They started their own consulting firm with one primary
focus: helping clients better understand their businesses via the use of data
analytics and data visualization.
Video 5:
Data Visualization – Visualization Selection
D&A Consulting’s primary focus is to help clients
understand their business better by using data analytics and visualization. Dan
Smith will now discuss how to determine the best visualizations in the context
of the business question.
Video Transcript (Est. time: 5:20 min):
In this video, we’re going to look at the different types of
visualizations and how to select a given visualization for a given business
problem, and provide a simple flowchart to aid
you in visualization selection. All visualizations have a
function. You may simply be comparing two different collections of objects such
as total revenue vs. total expenses. You may be looking at the pattern of a
single value over time, such as profit over time, or
you may be exploring a complex relationship between many different
elements such as total revenue vs. expenses by country by month. Regardless,
all visualization functions are ultimately used to answer some form of business
question, and you can use a simple decision tree once you know what types of
questions you need to solve.
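As a rough illustration of that kind of decision tree (the rules and thresholds below are simplified assumptions for teaching purposes, not a standard taken from the videos), the selection logic might be encoded like this in Python:

def suggest_chart(purpose, over_time=False, n_groups=3):
    """Suggest a chart type from a simple visualization decision tree.

    purpose: 'comparison', 'composition', 'relationship', or 'distribution'
    over_time: True if the question involves change over time
    n_groups: number of categories shown (an illustrative threshold)
    """
    if purpose == "comparison":
        # Many categories over time read better as lines than bars
        return "line chart" if over_time else "bar chart"
    if purpose == "composition":
        # Pie charts only work for a handful of groups
        return "pie chart" if n_groups <= 5 else "stacked bar chart"
    if purpose == "relationship":
        return "scatterplot"
    if purpose == "distribution":
        return "histogram or box whisker plot"
    return "start by clarifying the business question"

# Example: comparing profit across 15 countries over many months
print(suggest_chart("comparison", over_time=True, n_groups=15))  # line chart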
So let’s dive deeper into defining these patterns in
visualization functions. The pattern that we see in most, if not all, visualization decision tree diagrams starts with a question: are we trying to visualize a comparison, composition, relationship, or distribution? And are we looking at a single point in time or a change over time? The more informative visualizations are the comparative and composition visuals. Comparison is when we want to compare
differences between our groups as a whole, so the aggregates of profit between
countries or even between sales channels, something along those lines. The
standard bar charts are typically associated with comparisons and are probably
the most common visualization you’ll see. So many of you may already have
experience using them when comparing categories, factors, or groups. Again,
those terms tend to be used interchangeably. When comparing those
categories over time, particularly when you have many categories, it’s almost
always a good idea to use a line chart. We do often see bar charts used when
it’s a small number of time periods like in financial statements from year to
year, but if we have a lot like weeks or days, a line chart is far more
effective at representing those more granular changes.
Composition visuals, on the other hand, are typically shapes or combinations of shapes as well. They show the categories within a larger group; think population per city within a state, or sales per region within a company. You'll also see composition visuals associated with pie
within a company. You’ll also see composition visuals associated with pie
charts. We can talk about those. They have their use, but we’ll dive into that
more in the details of our informative visualizations.
Let’s move on to the relationship and distribution visuals.
Relationship visualizations tend to focus on how the individual values of one
group change given the values of another. The last visualization family is
distributions. These are scatterplots, histograms, and box whisker plots we
already mentioned. We’ll dive a lot deeper into the relationship and
distribution visualizations when we talk about exploratory data, but let’s talk
about theory again. Why do we use these visualizations when we do? Who decided
these rules?
Well, we did. It’s simply the way that humans process information.
Decades ago, a famous study by Cleveland et al. tested many, many people with a
series of visualizations including
pie charts, bar charts, line charts, etcetera, all the things
we just talked about. The study found a few general trends in the way people
interpret information visually. Visualizations that share a baseline, that start from the same point and only require judging a single distance or direction (for example, a bar chart, which ultimately represents the distance of a bar from a baseline), tend to be the least confusing visuals. People struggle to interpret differences in area, that is, whether one circle is bigger than another, and they struggle even more with interpreting differences in color, shades of color in particular. So Cleveland has a hierarchy of which visualization to use when. We will dive deeper into that in
the upcoming videos.
In conclusion, in this video we saw how the basic concepts of visualization can be used to find the best visualization for most business questions using a simple flowchart, and we reviewed how research established that using these visualizations results in better interpretability and less human error. In our next two videos, we'll take an in-depth look at the informative visualizations (bar, line, and pie charts) as well as the exploratory visualizations (scatterplots, tree maps, bubble charts, and histograms). Stay tuned.
D&A's
Business Purpose
Though data analytics is not a new concept in the business
world, the amount of data available and the number of sources from which it can be captured have skyrocketed. Driven by lower storage costs and more “user-friendly” analysis software, businesses have vastly increased the amount of data they collect and store. Yet finding the talent needed to transform that
data into useful insights is what businesses find challenging. D&A
Consulting helps companies fill that void.
D&A
Consulting
Doug and Arlo are excited about the opportunity to work with
HMC. They have assigned their automotive industry expert, Kevin Lydon, as the
project lead, along with a D & A new hire, Jan Morrison.
Kevin has been with D & A nearly as long as the company
has been in existence. He and Doug had met each other on a consulting project
where Kevin was working as an IT engineer.
As project lead, Kevin is enthusiastic about mentoring Jan on
her first assignment. He is equally enthusiastic about the potential for improvement
at HMC. He reassures Jan that this client will be a great opportunity for her
to test her data analytics and data visualization skills.
HMC:
Miranda Albany, Continued
She has advocated a “data-driven” strategy for decision
making at the company by capturing a vast number of product-specific details
relevant for both production and marketing.
The problem is that the company has grown so quickly that
Miranda is having a difficult time keeping up with the massive amounts of data
that continue to accumulate.
To further complicate matters, there is a growing need for
reporting detail and in-depth analysis of product lines given the availability
of additional data.
Video 6:
Data Visualization – Informative Visualizations
Next, to further understand how to communicate ideas and
discover business opportunities through data visualization within analytics,
Dan will now go through informative visualizations and how they can ultimately
tell a new story for the business.
Video Transcript (Est. time: 9:46 min):
This video covers visualizations more focused on informative
and reporting functions, namely bar, pie, and line graphs. In our visualization
selection flowchart, bar, pie, and line focused visuals are associated with
comparative, composition, and time series visualizations, respectively. Before we
begin exploring these informative and reporting visualizations, recall that
different audiences require different types of information communication, so
our first thought with visualizations should be to consider if our task is
exploratory, informative, or repeated reporting. In your case study, they
mentioned the data analytics process. Determining the correct visualization
begins the same way. We need to first understand the business question before
we can determine the correct visualization. However, you may find there is no
clear business question or you are the only one who is supposed to determine
the business question, but for now let's start with the assumption that we have
discovered a clear objective and the question we need answered is well defined.
A clear, well-defined question with quantifiable answers means
that we should approach our visualization from an informative or a reporting
perspective. And most well-defined business questions involve comparing the
performance of one group to another. So let’s look very quickly at the theory
of why we use the visualizations we do for comparative visualization. We talked
about Cleveland before. In a further analysis using the concepts that Cleveland
uncovered, researchers found clear differences in people’s ability to interpret
comparative visualizations. The diagram on your screen shows distributions of
errors based on the perceived differences in comparative visuals. Judging distance from a common baseline is the first grouping of perceived errors, with the lowest amount. Next we saw differences in length or radius, such as we would see with a pie chart. And the final grouping, with the most error, were visualizations that had a difference in shape without a common baseline.
So these are things like bubble plots or tree graphs. This reinforces the concept that we should use bar charts for comparative visuals whenever possible, to reduce errors. Bubble and tree map visuals may look cool, but they're not ideal for business-critical reports. This helps us define some rules of thumb for our visualizations. So now we can finally move into some examples of visualizations, such as if our business question is what percent each region contributed to total sales. Because there are only three regions, we could use a pie or a bar chart; we could pick either one. But if the question was how much did each country contribute, because there are more than four countries, only a bar chart is really appropriate here unless we do some
type of additional grouping. Bar charts and pie charts are used when comparing
single or cumulative values between groups. As a general rule, use a bar chart
over a pie chart, particularly when you’re talking about actual values. We
probably want to know the absolute value or actual value of a given country in
these examples rather than the percentage of total. If we were
comparing one country to another, then potentially the pie
chart would be useful, but we want to know the actual values so a bar chart
makes more sense. So, some rules of thumb: use a bar chart if you have more than five groups, and if you have more than 10 groups, consider combining some of the groups or creating a detailed visualization that we'll talk about in dashboarding. So now let's look at when we are focused not just on comparison but on change from one group to another.
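As a minimal sketch of the bar-versus-pie comparison described above (the region names and totals below are invented for illustration, not HMC figures), the same three-region composition can be drawn both ways like this:

import matplotlib.pyplot as plt

# Hypothetical sales by region (illustrative values only)
regions = ["North America", "South America", "Europe"]
sales = [420, 180, 300]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: shows actual values and shares a common baseline
ax1.bar(regions, sales)
ax1.set_title("Sales by region (bar)")

# Pie chart: acceptable here only because there are just three groups
ax2.pie(sales, labels=regions, autopct="%1.0f%%")
ax2.set_title("Share of total sales (pie)")

plt.tight_layout()
plt.show()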
If the business question is “How much did each region's or country's revenue or contribution margin change from year to year?”, it is possible to use a stacked bar chart, or a hundred percent stacked bar if it's percent contribution. However, given the time series nature of the business question and the sheer number of groups, that becomes very difficult to read. It's been shown in all the research that we've listed that lines connecting points over time are more effective at communicating change. Therefore, a visualization with connecting lines is preferred. You could also suggest an area chart, and I've heard some people say this as well, but area charts don't work very well with negative values. So you'd see a much better representation of a change over time with the line. If the
business question is comparing group contribution to a whole over time, that is a compound question: what is the total amount contributed by each group, and how much did that amount change over time? My preference here is to use two separate visuals, a bar chart to show the total amount and a line chart showing percent of total. In this situation it's okay to use a stacked bar, but more as a visual cue to our audience: we need to consider the groups making up our total, and this gives them a better frame of reference for the associated percent-of-total visualization. Also notice I placed the first visual above the second. In this case, the second visual is difficult to interpret without context from the first. When one visual's interpretation depends on another, remember that while some cultures may read left to right and others read right to left, most if not all read top to bottom. So put your simplest, highest-level visual on top and the details on the bottom. Moving on to more
reporting-scoped business questions. What if you need to know how much of this
year’s variable cost budget has been used year to date? More generally, how is
a given KPI performing against its target value? Now, for the visualization alone, some would want to report only the value, using something complicated like a gauge; as soon as it gets into a KPI, people want to stick a gauge on there. However, in my opinion, the next question when we're talking about any KPI with a performance target is almost always: well, what was it doing last period? What about last year? So why not just start with a time series graph and insert a line for the target budget? This way you know both how
it’s performing now and if it’s above or below that target.
Whatever data manipulation you have to do to insert a line will be exactly the
same as with a gauge or other visualization, plus you’re already answering
questions about historical performance.
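Here is a minimal sketch of that KPI-with-target idea; the monthly figures and the budget value are invented for illustration:

import matplotlib.pyplot as plt

# Hypothetical year-to-date variable cost by month (illustrative values only)
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
cumulative_cost = [110, 230, 360, 470, 590, 720]
annual_budget = 1500  # assumed full-year target

fig, ax = plt.subplots(figsize=(8, 4))

# The line shows how the KPI is trending over time
ax.plot(months, cumulative_cost, marker="o", label="Cumulative variable cost")

# The horizontal line shows the target the KPI is measured against
ax.axhline(annual_budget, linestyle="--", color="gray", label="Annual budget")

ax.set_title("Variable cost vs. budget, year to date")
ax.legend()
plt.tight_layout()
plt.show()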
Now as a final piece of advice when talking about informative
vs. reporting visualizations, when the business question is established and
requires repeated refreshing (a refresh just means updating the data behind the visualization), this is almost certainly a report rather than an informative analysis. Look for keywords like KPI, SLA, metric, and defined business rules when the needs of the visualization are discussed. Your organization may have a dedicated reporting team, and this reporting team may specialize in maintaining such reports. If you are not part of that group, consider reaching
out to them to maintain the informative analysis that you now want to turn into
a report. Personally, I would leave that up to the professionals.
All right. We covered a lot in a brief period of time. Some
key takeaways from this video.
Bar graphs work well when comparing values between a few
groups on the same scale. Pie charts may be okay when showing percent of total
for a few groups; however, a bar chart can show the same information, serving both as a comparative and a composition visual. Use line charts for change over
time. Don’t be afraid to combine visualizations if you need to tell a more
complicated story. And finally, your organization may have a data and reporting
specialist to help standardize your information into a more consistent report.
In our next video, we'll move into more complicated exploratory analysis and the associated relationship and distribution visualizations.
Video 7:
Data Visualization – Exploratory Visuals
As a next step, we’ll now hear from Dan as he discusses best
practices and tips for using exploratory visuals, relating back to the business
question, so one can learn about the data and find new stories to tell.
Video Transcript (Est. time: 13:04 min):
Now let’s move on to exploratory visualizations. In this
section, we will look at more advanced visualizations such as scatterplots,
tree maps, bubble charts, principal components plots, histograms, and box whisker plots. Let's start with some financial and sales metrics. What if
we want to know what is the relationship of material costs to after-tax
profits?
Well, as a starting point, even the most advanced analyst
will use scatterplots to start visualizing patterns. Here you can see the
material cost vs. the after-tax profits with a point for each VIN, so that's what it cost to make the car in terms of raw materials and how much we were able to sell that car for. This demonstrates how scatterplots show us
the intersection of two related series of values. Think about two values in the
same row, but different columns. In Excel, you have a row of data, you have
your column, and then you have another column. That’s the same way you create a
scatterplot. In a pivot chart, we may be able to observe a general upward or
downward trend in data points between the two factors, or we may be able to
observe groups in the data set. In the example on your screen, you can see the relationship between material costs and after-tax profits, as you would expect. Generally, we see profits decrease as material costs increase, that big downward slope in the middle. However, there are two major
exceptions. There is a group in the upper right with really high material cost,
but also high after-tax profits. And then you see a group in the lower left
with negative profits but also unusually, extremely low material costs. So how
do we examine this? Well, within the scatterplot we can start adding colors or
other shapes to start trying to pick out patterns. We can look at a model and
see that the CHARE model, C-H-A-R-E, constitutes the entirety of that
upper-right-hand cluster. So we can imagine that this is the luxury brand with
high margins and high material costs. Unfortunately, the lower left is a little
more ambiguous. That looks like there's a bunch of other models
within that grouping. So you see a scatterplot may not always
tell enough of a story. In fact, it rarely does.
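A minimal sketch of that kind of exploratory scatterplot, with one color per model so clusters stand out (the per-vehicle values are invented, and only the CHARE model name comes from the transcript; the other model names are placeholders):

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical per-vehicle records (illustrative values only)
df = pd.DataFrame({
    "model": ["Chare", "Chare", "Chare", "Alpha", "Alpha", "Beta", "Beta", "Beta"],
    "material_cost": [42000, 45000, 47000, 18000, 21000, 9000, 9500, 10000],
    "after_tax_profit": [15000, 16000, 14500, 6000, 4000, -500, -1200, -800],
})

fig, ax = plt.subplots(figsize=(7, 5))

# One scatter series per model so each group gets its own color in the legend
for model, group in df.groupby("model"):
    ax.scatter(group["material_cost"], group["after_tax_profit"], label=model)

ax.set_xlabel("Material cost")
ax.set_ylabel("After-tax profit")
ax.set_title("Material cost vs. after-tax profit by model")
ax.legend()
plt.tight_layout()
plt.show()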
So let's look at some other factors that might be influencing these unusual patterns and groupings of cost vs. profit. We saw there were business questions related to channel, so let's explore viewing multiple channels against models against their profits. Let's explore viewing groups in tree maps. These are similar to pie charts but with squares instead of slices; I guess it's more like a brownie or a lasagna chart, I don't know. If you look at after-tax earnings by channel (one, two, three) and model, that's a lot of information. There are some groups here and there that may stand out, but it's not that descriptive. It's similar to a bubble chart, in which a bubble is sized by the value and colors delineate the different channels or models, so for each slice you get a bubble, and the value is the size of it. I'll be honest with you. These charts aren't very useful by
themselves. They’re good for exploring things in a dashboard
because you can mark them. However, by themselves they don’t typically tell
much of a story. It’s often more effective
to use a bar chart for this type of stand-alone
visualization. But when exploring in a dashboard, they can be rather effective.
More on that later.
So, as strange as it sounds coming from me, even for an exploratory visualization, but also one that can be very effectively moved from exploratory to informative, I feel a table with a heat map is a better visualization for this scenario, and often in many scenarios. In the example on your screen, you just see a simple, well, basically, a pivot table. It's your sales channel by model, by the average after-tax profits. This represents the same thing you saw on all those other diagrams; it's just more condensed. You can still see the same information based on the color, but you also see a value, a raw number. This is tangible. In fact, speaking of exploratory visualizations and an example of how you would combine these so that you can explore but also inform, you can create a simple dashboard with a drill down, and we'll describe what a drill down means in the dashboarding section, where you can look at your scatterplot and see how the different clusters or values within that scatterplot relate to another visualization, like, in this case, the table diagram that we created. But enough about dashboarding and groupings in scatterplots. Let's move on to a business question: does an increase in sales for one sales channel take away sales in another? This is a
common question in the real world, and it’s an extremely complex question.
This question is probably better served by machine learning
and some complex modeling, but for the sake of visualization, we only want to
compare the percentage of sales in one channel vs. the percentage of sales in another and how that changes over the course of months. We want to look at it month to month in this example because we are exploring the relationship between two groups with multiple occurrences; in this case, the differential of percentage sales in one channel vs. another for a given month. We can use a special variation of a line chart, a principal components plot. The specific criteria: the two groups have to be on the same scale and have some relation, and this tells us how one moves vs. the other. In fact, there's some additional research showing that a principal components plot is better interpreted in highly ambiguous scenarios, where you're trying to show there is not much of a relationship, vs. places where there's a strong relationship. A scatterplot is well interpreted on the fringes. A principal components plot helps people not make false assumptions of relationships. With scatterplots it is very easy to do that: if you draw a line through the points, you kind of draw a picture in your head. It's very hard to make a false assumption of relationship when there isn't one with a principal components plot.
Off my soapbox for principal components plots and now onto my
soapbox for measures
of central tendency. Now, in many cases I've seen people use bar charts or even pie charts to communicate an average value between groups. But an average isn't just a single value; it isn't a value that means a single thing. Take total sales: I can sum up my sales, and there's no misinterpretation about what that value means, it's a bunch of numbers put together as a total. An average, or arithmetic mean, is a measure of central tendency. It describes a group of values. An average only suggests the central value, but it doesn't tell the whole story. An average can be skewed, meaning the same average can represent vastly different underlying values. Look at a normal distribution vs. a skewed distribution.
I’m going to show you my process of looking at a distribution
for a few different factors as they relate to contribution margin. And when we
look at the frequency of our contribution margin, we see a lot of hills. It
looks like a mountain range. When we see a multi-peaked distribution like this, we refer to it as multimodal. Multimodal means that there are several values that occur very frequently. Multimodal suggests there are multiple categories within this distribution, that there are groups, so we may see effects similar to what we saw in that scatterplot. We may see clusters or groupings of things that vary by country, region, sales channel, etcetera. Our distribution may not be composed of a single group of data or information. There may be multiple blobs of information, of groups, that we need to tease out. You can see that some
of these patterns emerge when we overlay different categories in contribution
margin distribution, but first I need to take a step back and talk about what
is a distribution.
What do I mean by distribution? You can see a histogram, but
what does that really talk about? Let's simplify the scenario and say we have two car models, A and B. Let's say for model A we have contribution margins of 100, 95, 90, 87, 85, or whatever is on your screen, and for model B we have similar contribution margins. In this vastly simplified example, we can see the average for model A is around 83 and the average for model B is around 73. However, if we add some additional values to model A, say values that are well below the existing average, while for model B we add just two values that are similar to the average, our new averages are completely different. If we just looked at the average for model A vs. model B in that scenario, we would think that model B had a far higher contribution margin than model A, whereas it's only those outlying values that are driving things down for model A. This is one of the primary reasons why histograms and box whisker plots are so important.
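Here is a minimal numeric sketch of that effect (the margin values are invented to mirror the spirit of the example, not the figures from the slide): a couple of low outliers drag model A's mean well below model B's, even though the median tells a different story.

import statistics

# Hypothetical contribution margins (illustrative values only)
model_a = [100, 95, 90, 87, 85]
model_b = [80, 78, 75, 72, 70]

# Add two low outliers to model A and two near-average values to model B
model_a += [10, 5]
model_b += [74, 73]

for name, values in [("Model A", model_a), ("Model B", model_b)]:
    print(
        f"{name}: mean = {statistics.mean(values):.1f}, "
        f"median = {statistics.median(values):.1f}"
    )

# The outliers pull Model A's mean below Model B's, even though most
# Model A margins are higher; the median makes that visible.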
What we’re looking at when we see a histogram are not the
numbers themselves. It's a frequency distribution, how often those numbers occur within given groups. We call those groups bins. In this example, let's say we have bins of 100, 90, 80, and so on. When we look at the frequency count for model A and model B, we can see that there is an outlier, two outliers in this case, way down at the bottom for those values of 10 and 5. Histograms are so important for an exploratory analysis because we see these patterns. We can see when there are outliers, and we can see when there are different patterns in the data. And in the example I gave, those outlying values could have been from, I don't know, a different country, they could have been from some new sales initiative, they could have been from who knows where, but there's definitely something going on that wasn't represented with just the data that we saw. And finally, when you see yourself clicking through a bunch of different histograms or a bunch of different scatterplots or a bunch of different stuff and just exploring and exploring, there are other visualizations that you can use, visualizations specifically designed to tease out these patterns. One of the more powerful visualizations for this, in fact an analytics technique, is a decision or regression tree. However, that is a little beyond the scope of this course.
Exploratory visuals can be a lot of fun. You know, I love them personally. They
help you see your data in a brand-new way.
We learned a lot. We talked about histograms and scatterplots, understanding how these exploratory visualizations are about looking at the data itself to find new patterns, and we even did a little dashboarding. In our next session, we'll cover the underlying dashboarding concepts to help you become the best you can at data exploration and informative visualizations.
Lesson
2: Huskie Motors Operations
Section
1: Huskie Motor Corporation: Operations
HMC is currently selling in 15 countries in three regions:
North America, South America, and
Europe. Table 1 provides a breakdown of the countries within
each region.
Huskie
Motor Corporation: Operations, Continued
Automobile manufacturing and sales is a complex and highly
competitive business. Though the automotive industry has a broad global reach,
only 15 countries produce 88% of the world’s vehicles.
HMC
Models
HMC is a fairly new and small player in the automotive
manufacturing market. HMC currently offers three brands: Apechete, Jackson, and
Tatra. Each brand has several models as detailed in Table 2. The models
available fall within seven segments of vehicle types: compact, sub-compact,
full-size, mid-size, luxury, minivan, and sports utility.
HMC's
Manufacturing Options
HMC offers several series for each model for a total of 34
different series. A breakdown of the available series offered by model is
provided. See Table 3. Each model is available in various body styles, engines,
drive configurations, transmissions, trim, color, and seat types.
Since various engine and transmission builds, see Table 4,
come from one division and finishing is done in another division, these options
are described in different tables.
Click each icon to learn more.
Packages
and Options
Like many automotive manufacturers, HMC offers a variety of
packages and options that buyers can add to their vehicles. Packages include a
specific set of bundled options, or buyers can choose individual options
separately. Table 6 provides a list of the six packages along with the detail
of products and services contained in each package. Table 7 provides
information regarding options that may be purchased à la carte. At least one
option, however, is contained in each available package.
Click each icon to learn more.
HMC Data
Movement
Recall the data life cycle from the previous module and
consider how Huskie Motor Corporation’s data ages.
To again explain how HMC gets and uses all their data, let’s
take a look at a short video to further illustrate how data is created and
stored in business.
Video 7:
Movement of Data
To conclude the topic of data movement, we will now move to a
presentation from Dan Smith where he further illustrates how data is created
and stored in business.
Video Transcript (Est. time: 3:00 min):
First, we’re going to look at data creation and storage. And
I’ll start way back in 1967 when Dr. Melvin Conway identified that any system
designed by an organization will match the communication structures of that
organization. What that means in plain English is that a business will create
processes and solutions based around how they are structured: marketing, supply chain, accounting. They'll all produce their own applications and typically their own data as well. And who were the first people to start storing and recording data, all the way back in Sumeria? The accountants. In fact, I would argue that our current data architecture is largely founded on accounting practices. Accountants use transactional data created by customers and business operations to produce reports that are then used to inform stakeholder decisions.
So what does this look like? Well, there are three primary domains: business activity, the creation and storage of data, and business decisions. Now, where does this data live? Well, customer activity generates transactional or raw data; that data is stored in a transactional database, which is then shaped into a reporting data set. The reporting data is used to make business decisions, and the resulting activity from the business decisions creates more transactional data. Now that we've seen
how that data is created, let’s look at how it gets from one place to another.
You can see the data movement cycle over here on the right of your slide. When
that data is first created, the information messages are typically transferred
through a bunch of layers in applications, so load balancers, temporary caches,
firewalls, etcetera. The process of getting activity
data into your data system is typically called integration
and it moves data into a transactional data store. Now, once it’s in its raw
form or sometimes as part of integration, the data is shaped or cleaned into a
form usable by people in analytic applications. This is generally called ETL,
extract, transform, load, and the data output is stored in the reporting
database, data warehouse, or data lake. The transformation of data into
business-usable information is called analytics. We will talk a lot about this in the next module. As I said, simply put, analytics is what facilitates data-informed decision making. And finally, business decisions attempt to influence customer activity, making this one little arrow right here, in the upper-left-hand corner of the business data creation and movement cycle, the entire reason that the business exists. The entire reason that the data analytics cycle exists is to help the business make more informed business decisions. So let's look at this process in action.
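To make that flow concrete, here is a minimal ETL-style sketch in Python; the record fields and the aggregation are invented for illustration and are not taken from the case study:

from collections import defaultdict

# Extract: raw transactional records, as they might land in a transactional store
transactions = [
    {"vin": "VIN001", "country": "USA", "sale_price": 31000.0},
    {"vin": "VIN002", "country": "USA", "sale_price": 28000.0},
    {"vin": "VIN003", "country": "Brazil", "sale_price": 26500.0},
]

# Transform: shape the raw rows into an aggregate suited for reporting
sales_by_country = defaultdict(lambda: {"units": 0, "revenue": 0.0})
for tx in transactions:
    bucket = sales_by_country[tx["country"]]
    bucket["units"] += 1
    bucket["revenue"] += tx["sale_price"]

# Load: write the shaped result into a "reporting" structure (a dict here,
# standing in for a reporting database or data warehouse table)
reporting_table = dict(sales_by_country)
print(reporting_table)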
Lesson
3: The Data Dilemma
Section
1: Data Dilemma
Prior to calling Miranda at HMC, Kevin meets with Jan to go
over some basics regarding the project. Kevin uses a five-step model as a data
analytics framework. As a first step, he explains this model
to Jan, because he believes it will help her understand his
process.
Data
Analytics Process
Kevin: “So, Jan, this should be an interesting
client and project. I know you are a bit nervous about your lack of experience
with data analytics design, but I have been studying this stuff for a long
time, both from an applied perspective as well as a theoretical perspective.
There is a great book on this subject, Data Analytics, by Warren Stippich and
Bradley Preber. In this book, the authors describe a five-step approach to the
data analytics process:
Step 1: Define the question
Step 2: Obtain the data
Step 3: Clean and normalize the data
Step 4: Analyze the data and understand the results, and
Step 5: Communicate the results.
I think that understanding and following these steps will
help you tremendously as you find your footing on this engagement. Our client,
HMC, would like to have the project finished within a four- to six-week time
frame. How do you feel about that?”
The Data
Dilemma
Jan: “That seems like a pretty short window, Kevin. Do you think
we can finish within that time frame?”
Kevin: “Well, I do not think we can set a realistic
time frame without knowing exactly what our challenges are and what the data
looks like. Remember, we have to be able to trust our numbers. We do not know
yet whether the data is clean. We will have to do some validity testing first.
Remember, follow the steps, think it through, and keep a calm head.”
The Data
Dilemma, Continued
Jan: “What exactly is our goal for HMC?”
Kevin: “Basically, HMC captures very detailed
transaction-specific data. For example, HMC has extensive cost data, marketing
information, and plan data for each car sold. The amount of data it has is
overwhelming for both the HMC management team and the executive board. Our job
is to help Miranda and her team figure out how to use the data to better
understand costs and profitability by vehicle model. They also need detailed,
relevant feedback regarding sales volume and sales location for planning
purposes. Miranda would like to be able to predict sales at least three
quarters out so that the management team can better plan production schedules.”
The Data
Dilemma, Continued
Kevin: “An equally important goal is to help
them understand the benefits of data visualization and give them some ideas
about how to present the data to their executive board. As you know, Jan, data
visualization software, such as Tableau and Qlik, allows us to turn large
volumes of raw data from various sources into easily comprehensible graphical
representations of information. Data can be accessed live or extracted from
some other source. Data can also be presented in summary fashion while
preserving the underlying detail, which can be instantaneously viewed as
desired. This type of technology could be instrumental in helping HMC improve
and maintain its competitive advantage.”
The Data
Dilemma, Continued
Jan: “How does the management team store the
data, and how will we be able to integrate it with the visualization software?”
Kevin: “I believe that the plan and forecast data
are on Excel spreadsheets. The actual sales data will likely come from its
enterprise resource planning (ERP) system. We will need to confirm that at our
initial client meeting. We should be able to get transaction-specific
information at the vehicle identification number (VIN) level, since the VIN is
a unique identifier. I am not sure how much information is being collected at
the individual vehicle level, but that could be a rich data source
for us if we can get it.”
The Data
Dilemma, Continued
Jan: “Is the data they have ready to be used, or will we need to
verify and organize it?”
Kevin: “We will see just how clean it is once
we start pulling it in. We will initially build the analysis using a
representative sample of HMC’s data. The sample represents about 25% of the
total company data. The four- to six-week time frame should be feasible if we
receive clean data. Our contact at HMC, Miranda, is assigning two accountants
from the controller’s department to pull the sample and help clean the data.
Once the client approves of the proposed analytics and the dashboard, we can
roll it out using all the data.”
Data
Sampling and Analysis
Jan: “Will using a sample be enough to convince the client of
the value of the analysis?”
Kevin: “Absolutely. The sample allows us to
determine which type of analyses we can provide, by looking not only at the
data, but at the format of the data as well. We can use the sample data to
create demonstration dashboards for the client. The analyses we perform with
the sample can then be recreated for the full data set.”
The Data
Dilemma, Continued
Jan: “I am not sure I understand what you mean by a
‘dashboard.’”
Kevin: “Basically, a dashboard is a screen that
consolidates visualizations, graphs, charts, and so on to concisely display the
metrics and key performance indicators for a business. Summarized data can come
from a variety of sources and can even be presented in real time.”
Jan: “I cannot wait to get started! This is going to be a
tremendous learning opportunity for me.”
Video 8:
Data Visualization – Dashboarding
As Kevin helped Jan understand the purpose of a dashboard, we
will again turn to Dan Smith as he further defines dashboarding and introduces
concepts and best practices when using dashboards.
Video Transcript (Est. time: 18:22 min):
With the popularity of data-connected visualization tools
like Tableau and Power BI, data visualization is becoming almost synonymous
with data dashboarding. While some, myself included, would dispute that association, it's still very important that we include dashboarding when learning about data visualization. In this video, we will define dashboarding,
explore dashboarding concepts, and recommend some dashboarding best practices.
But first, what is dashboarding? When someone thinks of a
dashboard, they usually think of a car or other device full of gauges, numbers,
meters, things that tell us the immediate state and relevant information required to understand how something is doing. Related to data and information, dashboarding can be defined as an information management tool used to organize
and display information typically with an interactive element and connected to
an underlying data source for rapid updating. Let's think about why
in your case study a dashboard seemed like such an appealing prospect for Huskie
Motors and consider also if it’s the right choice for the
overarching problem statement and the requirements of our stakeholders. In our
case study we’re asked to determine how data will be represented for several
different stakeholders. Let’s explore how we could use a dashboard to display
this information, the dashboarding concepts in each, and if we should use a
dashboard to display this information.
Our first task is related to the dashboard concept of
displaying KPIs and metrics. We're tasked with displaying the overall
performance analytics, answering questions such as how Huskie Motor Corp. is performing globally, how the various brands are performing in various channels, and which are the least profitable models. Here we can apply a core concept of dashboarding: displaying KPIs and metrics. I mentioned before that lines are very effective
for displaying KPIs. Presumably there are a set of KPIs, key performance
indicators, that they can use to define performance. I would assume these were
defined by somebody, and we can include these in a dashboard that was connected
to a data source that had these KPIs defined. And let's think of our audience for these high-level performance metrics. Anyone who is interested in metrics at this level is likely going to be an executive, a senior vice president, maybe a senior director, depending on the organizational structure and title responsibility. In other words, metrics at this level are for an audience unlikely to have the time or inclination to go digging around in their own data. They have trusted advisors to give them guidance on the pertinent pieces of information. In other words, these are informative visualizations, probably more reporting than anything, so a dashboard may be appealing, a dashboard in the purest definition of data-connected and immediate results, but it's not going to be an interactive dashboard, or if it is interactive, it'll be for someone reporting results with the dashboard, not for exploration. By reporting results with the dashboard, I mean sitting in a presentation and clicking on things in a predefined script, so the dashboard will probably be for a presentation, not something that somebody is going to be regularly accessing online.
Our next concept is filtering and drilling down, which applies to the next series of questions in your case study, the financial analytics. Here we're looking at prices per model and how they changed over time, or which model has the most variability in variable cost, etcetera. While these are well-defined questions, there is a
degree of slicing and dicing exploration available. If someone wants to look at
different packages or regions or countries or models, sometimes there will be
complicated questions which may be hard to represent in a single visualization.
Therefore, the financial analytics questions have more opportunity for
exploration. So a more interactive dashboard is probably useful in this
scenario. And by the way, by slicing and dicing, I mean a dashboard where a
user may wish to view the data in a different way than represented by a static
dashboard or table. For example, they may wish to highlight an area of one
visualization to learn more about the data point
contained in that visual. That would be a drilling-down
operation that’s often used in scatterplots and other complicated visuals like
a map and tree charts to give more information about a given data point or
points. That would be like a dicing of data.
And filters are often used to change the dashboard. In
contrast to the drill down, filters typically exclude all the data for portions
of the dashboard. They exclude the underlying data. A filter is exactly the
same as a filter in a pivot chart or pivot table. And a pivot table is also an
excellent example of slicing and dicing data. You just pull in different elements and you see data in a different aggregation or context than you would in its original form. And fundamentally, if there is a demand for data exploration, just a bit of caution: you will never be able to satisfy all of the data exploration demand with simple visualizations; you have to include some interactivity. So a dashboard makes a lot of sense for the questions from the finance team. As for the operational analytics questions, those are much more complex. They're asking what the top values are overall and in multiple groups. Now, asking what the top sellers are overall is easy; you can just sort. But when you start comparing the top sellers from one group to the top sellers in another group, your analysis becomes much more difficult, because you're not just ordering a column in a table. You have to select the top individuals or regions or models or whatever it is from each group. Adding to that the complexity of the calculation for the number of model options, or how many days the various models are on the lot before sale, you have the potential for a very complicated dashboard. The questions asked by the operational analytics folks will likely require custom calculations and/or data transformations; we'll talk about those concepts shortly. But when people start asking those types of questions, this is something for which you would probably want to call in a data expert to help you create that type of dashboard and those visualizations. Forecasting is our final set of questions, where we look at the forecast.
I know I'm giving a lot of warnings here, primarily because this is an introductory visualization course, so I point out cautions when people are in what they think is an engagement where they're building a visualization, but it's actually a mathematics or data problem. When we start talking about forecasting, this is another one of those yellow flags. So, as a caveat and a warning, I would not suggest that a business, particularly one the size of Huskie Motor Corporation, make a significant business decision based solely on the forecast utility that is built into a visualization tool. Not to get into the details, but there's a lot of additional information that can be included in that type of mathematical model. That being said, the built-in forecasting utilities can still be informative for identifying patterns over time in your data and creating a degree of confidence in those types of pattern predictions. In a forecast, we are able to show confidence intervals above and below the predicted values. Because this is a visualization, we can visualize our confidence intervals, and that's similar to a box whisker plot, where you have the range the predicted values fall within 80% of the time vs. the actual values; the high/low markers are equivalent to the whiskers in a box whisker plot, if you will.
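As a toy sketch of that idea (a straight-line trend with an approximate 80% band; the monthly sales figures are invented, and a real engagement would use a proper forecasting model rather than this simplification):

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical monthly unit sales (illustrative values only)
sales = np.array([120, 131, 138, 150, 149, 162, 171, 180, 188, 195, 204, 214], dtype=float)
t = np.arange(len(sales))

# Fit a straight-line trend and measure how far actuals scatter around it
slope, intercept = np.polyfit(t, sales, 1)
residual_std = np.std(sales - (slope * t + intercept))

# Extend the trend six months ahead with a rough 80% band (about ±1.28 std,
# assuming roughly normal residuals; a real model would do this more carefully)
future_t = np.arange(len(sales) + 6)
forecast = slope * future_t + intercept
band = 1.28 * residual_std

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(t, sales, marker="o", label="Actual sales")
ax.plot(future_t, forecast, linestyle="--", label="Trend forecast")
ax.fill_between(future_t, forecast - band, forecast + band, alpha=0.2, label="~80% band")
ax.set_title("Toy sales forecast with confidence band")
ax.legend()
plt.tight_layout()
plt.show()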
Moving on from forecast, let’s talk about connecting to data.
Well, you always hear the term data-connected dashboard. Let's expand the dashboard conversation to introduce what a dashboard is doing under the hood. The first concept is connecting to data. A dashboard works best when the dashboard can update as its underlying data set also updates. We'll cover this more in our data platform video, but for now understand that the visualization tool does not, and in most cases should not, change the underlying data. So here we can connect to an online Google sheet; here we can connect to an OData online data source or a JSON data source. These are web data sources, and they're separate from the tool.
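As a small sketch of what "connecting to data" looks like in code (the records below are made up, and in practice the source would be a URL such as a JSON or OData endpoint rather than an in-memory string):

import io
import pandas as pd

# Stand-in for a web JSON data source; in practice this would be a URL,
# e.g. pd.read_json("https://.../sales.json") (placeholder, not a real endpoint)
raw_json = io.StringIO('[{"region": "Europe", "units": 120}, {"region": "USA", "units": 210}]')
df = pd.read_json(raw_json)

# The visualization layer only reads and aggregates; it never edits the source
print(df.groupby("region")["units"].sum())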
Contrast that to something like Excel, where Excel houses the data. In many cases, especially the cases we're accustomed to working with, Excel is the data: you can type in and modify the data, and in many cases you have to modify the data to create a visualization, so you end up with a lot of disconnected data tables. Pivot tables and pivot charts are more like other visualization tools, as long as you don't touch and edit the underlying data. In fact, I would say if you can use a pivot table effectively, you can probably use most of the other visualization tools with minimal training.
Aggregates and values. This concept is probably the most important concept when building a dashboard, at least when using a visualization-focused dashboarding tool, and that is the concept of categorical values and numerical values. Tableau calls them dimensions and measures; Power BI calls them columns and measures or aggregates; in a pivot table they are called rows or columns vs. values. Whatever you want to call them, your categories are what determine the number of slices in a pie chart, whereas measures determine the size of each slice; categories determine the number of bars, and measures the height of the bars. Again, categories or factors are the same concept as when you use an Excel table column to make separate rows in a pivot table, and measures are the same concept as values in a pivot table, where some mathematical operation, typically sum, is applied to the values corresponding to the pivot table rows. In the data video, I'll talk more about this concept, even including the SQL (structured query language) aggregation code that's being created by the visualization platform. That sounds a little scary, but I'll walk you through it; it's not going to be as bad as it sounds. Calculated values:
all dashboarding tools let you create custom calculations for aggregating your values. These can get pretty complicated. In this example, I'm keeping it relatively simple: I'm creating a new column for an address by merging a bunch of other columns, concatenating them, as that's called, to create an address value for geocoding. Yeah, that's a simple example. Hand-calculated values are one of the hardest aspects of learning a dashboard tool, because calculated values, in particular calculated values like rolling averages or cumulative totals, tend to either be obscure or be very specific to that platform. The exact calculation I used to create the value you see on the screen, and honestly calculated values in general, especially the nuances of them, aren't something you need to understand for this class. But as you learn dashboarding, and because it's an important element of visualization, keep in mind that the measures you see visualized in the dashboarding tool are calculated not only using the underlying data, but they will also change based on the visualization used and the categories used in the visualization. Yeah, I know, I'm sure that didn't make any sense; it takes a whole class, so you don't need to worry too much about this concept. I'm telling it to you so you're aware of it. This is a potential stumbling block, and it may be something that you struggle with, because I struggled with it when I started. So as you're learning, and honestly in general, it's a good idea to have your data team validate your calculations or have predefined calculations, or even have a dedicated data set. If you have a particularly complex calculation you want to put into a visualization tool, have some data people create a new data set for the analysis instead of relying on the built-in dashboard capabilities.
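As a minimal sketch of that advice (the column names and values are invented), a calculated value such as a rolling average can be prepared in the data layer, for example with pandas, so the dashboard only displays a verified column:

import pandas as pd

# Hypothetical monthly sales data (illustrative values only)
df = pd.DataFrame({
    "month": pd.date_range("2020-01-01", periods=6, freq="MS"),
    "units_sold": [120, 135, 128, 150, 160, 158],
})

# Pre-compute the calculated value here, where it can be reviewed and reused,
# rather than re-implementing it inside each dashboard tool
df["units_3mo_avg"] = df["units_sold"].rolling(window=3).mean()

print(df)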
Alright. Moving on. Dashboarding best practices. The first
best practice is to know you do not always need a dashboard. Yeah. Think about
it. How many of you have had a boss that would not look at a presentation unless it was shown to them at a meeting? You could write an email saying next week's winning lottery numbers were in the attachment, and they still wouldn't open it. It's reasonable to assume that if they will not click an attachment to open a presentation, they are not going to click a link in your email, log into your reporting server, assuming they remember the username, and then navigate to the dashboard themselves. In a dashboard, you want to keep it informative: keep your informative visuals to two or three major points at most. You'll be tempted to cram everything onto one dashboard. "Look at that information density," you'll say, and no one will know what you mean. Or maybe somebody will be confused, and you'll just go up to them and say, "Oh, you just need to click this, then this, then this, so highlight that, pick this filter, and voilà, here's your answer." No. An informative dashboard allows exploration, but it should not require it. The line chart dashboards are great examples. They may seem simple, and they answer a question very simply and succinctly, but there's little room for misinterpretation and you can still drill down and filter. Compare that to, say, the geographic dashboard that we saw earlier. It looks cool, don't get me wrong, but it doesn't say much without digging around in it, and it doesn't even follow some best practices about high-level visualizations; that is basically form over function, and we should be doing function over form. My next best practice:
Make your own darn dashboard. I’ve seen people spend weeks
working with consultants to get a dashboard, and
they’re still not happy. I’ve seen that time and time again, and it’s because
there’s always another question. That’s the whole point of exploration. You
want to find new questions. So if you find yourself working hours and hours
with a consultant
or contractor or someone to build the dashboard, consider instead spending that time watching free training videos (all the platforms provide them); that way you can build the dashboard you want. And if there are still complex data operations, like joining tables or rolling time series aggregations, work with a data expert or reporting expert to create a new data set or fix that one calculation, rather than having someone try to read your mind or predict the future about what questions you're going to ask, because you don't even know what they are yet. It's really not that hard, I promise. Dashboards are easy to make; that's what these tools are designed for.
Next best practice. Keep data operations to a minimum. Just
because you can do complex calculations doesn't mean you should. Dashboards and tools like Tableau, Power BI, even Excel, can be unclear about how a value is calculated; I was talking about that before with the calculated values. Then when you throw in different terminology, what they call partition, window, or table calculations, which are calculations that are aggregated over several categories or dimensions within the visualization, it gets confusing and isn't transparent. So again, leave data to the experts. A data team-approved data set with verified calculations will ensure proper governance, ensure procedures are followed, and give you the peace of mind that you're not making a small and opaque calculation error that could lead to a $1 million incorrect decision.
In conclusion, in this video we defined dashboarding as an information management tool used to organize and display information.
Typically, a dashboard has an interactive element, and it’s connected to an
underlying data source for rapid updating. In our next video, we’ll give you a
little more visibility into the underlying data environment so you’re able to
communicate more effectively with your data stakeholders and you are better
able to understand what’s happening under the hood of popular visualization
tools.
Section 2: The Data Dilemma, Continued
After reviewing the project proposal and meeting with Jan,
Kevin calls Miranda to introduce himself and confirm their initial meeting the
following week.
Kevin: “Hi, Miranda. This is Kevin from D &
A Consulting. I just wanted to touch base with you to set up our initial
meeting.”
Miranda: “Hi, Kevin. I am really looking forward
to your team’s visit. We need help as soon as possible. I have so much data
coming in that I barely have time to look at it, much less analyze it. Also,
our data is located in various databases, spreadsheets, and our ERP system,
making it difficult to integrate and fully utilize. We have data visualization
software available on our server,
but we have not really tapped into its capabilities just yet.
I am told that we can easily pull together large amounts of data from various
sources with user-friendly, adaptable output using
dashboards. Is that something you can help us with?”
The Data
Dilemma, Continued
Kevin: “Absolutely. What kind of data
visualization software do you have?”
Miranda: “Tableau. It is supposed to be very
user-friendly. I am hoping you can get a dashboard up and running in a couple
weeks.”
HMC
Dashboard Needs
Kevin: “Do you know what you want on the
dashboard?”
Miranda: “I know that we want to be able to see
profitability by brand and model, since solid profit margins are crucial if we
want to stay in business. We also want to keep a pulse on sales by country and
region. Ultimately, we want to do a better job of planning our production
schedule, but as you know, this requires up-to-date information on many moving
parts.”
The Data
Dilemma, Continued
Kevin: “I cannot say whether a couple of weeks
will be enough time until we take a look at the data and map out the specific
decisions you hope the dashboard will enable you to make. I will have a better
idea after our initial meeting next week.”
Miranda: “Okay,
fair enough. We’ll see you on Monday at 1 p.m.”
HMC and
D&A Meeting
The following week, Kevin and Jan meet with Miranda and her
team for their initial meeting.
Kevin: “So, let’s get down to business. Let’s
talk about the dashboard. Help me understand what questions you are trying to
answer? Or what story are you trying to tell? Who are the users going to be?
Will different users need different dashboards?”
The Data
Dilemma, Continued
Miranda: “I have a list of questions that I would
like to be able to answer using our data. There are several areas where I
believe we can gain greater insight from looking at our data. Specifically, we
need overall performance analytics to tell us how we are performing globally.
We need to know which of our models is profitable, where we are selling well,
and, perhaps, how sales channels are driving sales volume. It is also important
that we are able to see information from a top-level perspective with the
ability to drill down into the detail. The profitability information is crucial
for our executive team: chief executive officer (CEO), chief financial officer
(CFO), chief operations officer (COO), and chief marketing officer (CMO). This
team is ultimately responsible for the direction of HMC and profitability for shareholders.
“I would like for our management accounting group and
financial reporting group, including the CFO, to have a financial analytics
dashboard that would give them information about contribution margins, total
costs, and sales volume. Both groups will also need to monitor changes in costs
and contribution margins for all of the models we offer. These are the most
important metrics and, therefore, the ones we need to work on first. We will
expand our analysis for the management accounting group to include other
efficiency measures as soon as we can get these initial dashboards up and
running.
“Our CMO and the sales team will need operations analytics to
help them understand turnover and demand. They will need to know which models
are selling and how long it takes to sell them. They will also need to
understand how packages and options impact sales. They will need to have a
handle on which of our package/option offerings are popular and which ones are
profitable.
“Finally, our budgeting and production teams will need to
utilize forecast analytics to predict sales and margins at least three quarters
in advance.
“Do you think it is possible to make the dashboards ‘real
time’?”
HMC Data
Kevin: “I think it is possible, but first we
need to understand the data that you are currently collecting. What can you
tell me about the extent and magnitude of your data?”
Miranda: “Well, we track details down to the VIN
level in our ERP system. So, we have a lot of data. For each vehicle sold, we
know the sales amount, marketing expense, and all variable and fixed costs. We
also track a lot of nonfinancial data.”
The Data Dilemma, Continued
Kevin: “What kind of nonfinancial data?”
Miranda: “Well, there is vehicle data such as
brand, model, model year, series, segment, body style, drive configuration,
engine type, and transmission type. We also have the detail for each vehicle
sold as far as color, trim, and so on, as well as any package or options
purchased.”
Jan: “Wow! I can see why the data set is so large!”
Miranda: “That is just the vehicle information.
We also capture the region and country of sale, the number of days that any
given car was on the lot prior to sale, and the type of marketing campaign in
place at the time of sale. In addition to that, we track sales channel
information.”
The Data Dilemma, Continued
Jan: “Can you give me a little more detail about what you mean
by sales channel information?”
Miranda: “Sure. We identify sales using three
sales channel dimensions. Sales channel 1 identifies whether the sale was made
through our dealers, fleet, or retail operations. Sales channel 2 identifies
the type of customer account. We have commercial accounts, employee/partner
programs, government accounts, nonemployee accounts, and rental accounts. The
third sales channel identifies whether the sale was cash, financing, or lease.”
Kevin: “Okay, so ideally we will want to use
both the financial and nonfinancial data in our analyses.”
Miranda: “Exactly! Other potentially informative
data sources that we have are social media platforms like Facebook, Twitter,
and Instagram. We have just started collecting data from these sources about
our vehicles, but we are still trying to figure out the best way to analyze
it.”
The Data
Dilemma, Continued
Kevin: “Eventually we can bring that data into
the analysis and dashboard as well, but for now let’s stick to the data we
already have.”
“So, the actual sales data is in your ERP system. Where is
the budget data?”
Miranda: “We keep track of the budget data in an
Excel spreadsheet we call ‘plan data.’”
Video 9:
Data Visualization – Data Ecosystems
In the case, it is evident that HMC data comes from and is
stored in a variety of places, making it difficult to organize and analyze it.
In Dan’s next video, he will go through the concept of data ecosystems and walk
us through the process of aggregating data to see how data is connected and transformed so it can
ultimately be visually presented.
Video Transcript (Est. time: 10:22 min):
As you become more familiar with creating data
visualizations, you will quickly understand if you haven’t already, that data
is just as important as the visualizations themselves. In fact, data’s probably
more important than the visualizations. Furthermore, having taught hundreds of
people how to use visualization tools and even more advanced analytics, I know
that understanding how the data operations and data connections are enabled by
those tools provides a more conceptual understanding. Understanding the data
concepts allows students to use their skills across multiple tools rather than
just the one they're currently learning. So I'm going to let you in on a big
secret: those complicated calculated values and dimensions that I described in
the previous video are just SQL, Structured Query Language (or "sequel"),
aggregations. The tools seem so complicated because they have to figure out how
to abstract relatively simple code, but you're still using a visualization tool
to create
code. This is a slide that I use when I teach people data concepts. For many of
you, it may look too technical because it shows a little code. That’s a common
reaction. Please just bear with me. I hope to show you that much of the SQL,
much of the structured query language generated by dashboarding and
visualization tools, that code is similar in function to an ordinary pivot
table. And by extension a pivot chart is just like a dashboarding tool.
So the example that you see on your screen is just a table from a database. It
is a different data set on purpose, because it shows you that these concepts
extend across any data.
To be consistent with the language in the case study, the two tables here show
the difference in what we mean when we say normalized. That is, there is a row
for each individual record, so we see the same types of coverage and education
repeated multiple times in our normalized table. In this example, it doesn't
really matter what the data pieces are when you're talking about code; for
context, we're just looking at the coverage and education for different
insurance premiums. If our task is to show the sum of claims for each type of
coverage, and we only want rows where education does not equal graduate, if
that's the business question, we would create an aggregate table with one row
for each type of coverage and its associated total claim amount. If this was in
Excel, the simplest solution would be a pivot table. We put coverage in the
rows and move our claim amount to the aggregate value; it even defaults to the
sum of that value. For the education filter, we put education into our filters
and select all except graduate. Our SQL statement is exactly the same as the
pivot table. It reads like a sentence: select coverage and the sum of claim
amount from the normalized table where education does not equal graduate, group
by coverage.
Looking at the corresponding colored squares between our pivot table and the
SQL statement, we can see that selecting a field or column and placing it in
the GROUP BY statement is the same as putting it in the rows, meaning that we
have a value, or group, or row for each of the associated values within our
coverage column: one row for basic, one for extended, and one for premium, each
with its sum of claim amount. The green box in our SQL statement makes the same
calculation as our aggregate value, the sum of claim amount, in the pivot
table. And finally, "where education does not equal graduate" in the purple box
is the same as the filter in the pivot table. Seeing the side-by-side
comparison like this, I would argue that the SQL statement is actually clearer.
It's more transparent and more understandable in what the data transformations
and filtering are actually doing, compared with the uncertain filtering and
source of the pivot table: we don't know what the source for that pivot table
is by looking at it, we have to click and sort through all sorts of check boxes
for the filter, and we may not even know which exact calculation is being
performed on the value.
So remember when I said, multiple times, that abstraction was both a strength
and a weakness of visualization tools? This is a common example. A hidden
filter, in this case the education filter in the pivot table, may lead to
mistakes. If someone was unaware of that filtering logic, it's easy to assume
that your aggregate table includes all the education levels, whereas with the
SQL statement you see it right away. And to reinforce the point, the pivot
table that you see there, if it was connected to a database, if we were using
Power Query, this is exactly what it's doing. If we had an ODBC (Open Database
Connectivity) connection to a database and connected a pivot table to it, it
would create this SQL statement. That's all these things are doing: they're
abstracting the creation of SQL code. When it's very simple, there's not a lot
of harm; it's much faster to use the visualization tools. But when it gets
complicated, it's typically faster and better governed to have the SQL
statement. Either way, understanding these basic SQL concepts makes it very
easy to understand how to use the different tools.
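To make the equivalence concrete, here is a minimal sketch in Python (my own
addition; the video only describes the pivot table and the SQL sentence) that
builds a small insurance table like the one on the slide, runs the SQL
statement the video spells out against an in-memory SQLite database, and then
produces the same aggregate with a pandas filter-and-group. Column names are
hypothetical stand-ins for the ones on the slide.

import sqlite3
import pandas as pd

# Hypothetical normalized table: one row per policyholder record.
normalized = pd.DataFrame({
    "coverage": ["Basic", "Basic", "Extended", "Premium", "Premium"],
    "education": ["College", "Graduate", "College", "High School", "Graduate"],
    "claim_amount": [450.0, 900.0, 700.0, 1200.0, 300.0],
})

# The SQL statement described in the video, run against an in-memory database.
conn = sqlite3.connect(":memory:")
normalized.to_sql("normalized_table", conn, index=False)
sql = """
    SELECT coverage, SUM(claim_amount) AS total_claims
    FROM normalized_table
    WHERE education <> 'Graduate'
    GROUP BY coverage
"""
print(pd.read_sql_query(sql, conn))

# The pivot-table equivalent: filter out graduate, then group by coverage and sum.
pivot_equivalent = (
    normalized[normalized["education"] != "Graduate"]
    .groupby("coverage", as_index=False)["claim_amount"].sum()
)
print(pivot_equivalent)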
So that’s on the transformation side. Let’s talk about the
data connection side, the separation of concerns. Why? Why is it so important
that we pass this code to a data source anyway? We’ve used Excel for decades
where we just transform the data in the spreadsheet.
Why all of a sudden is that not okay? Well, there’s a few reasons. I already
mentioned the data is often too big to fit in a single spreadsheet, but you
know, memory is pretty big now and most data sets that we use still aren’t more
than a few gigs. So now why is everyone saying we shouldn’t pull our data into
Excel? First it boils down to governance. Accountants are at their core data
managers. We respect financial data and as you know in any general ledger, you
don’t erase information. You don’t erase a transaction, you reverse a
transaction. You want to keep that data unchanged. If you’re creating a
financial statement in a spreadsheet and you just backspace that number,
you've just changed it. You've deleted information, you've deleted data. We
need the ability to easily audit all the transformations to the data as we
create information from it, and the ability to share that information rapidly
and securely with a wide audience. That's why you keep the data separate from
the visualization tool and the transformations: we need to be able to audit it,
we need a consistent mechanism for transforming it, and we need to leave the
underlying data unchanged. With Excel, it's extremely hard to ensure that data
pulled into a spreadsheet remains unchanged. Even with all its cells locked
down, it's still possible to copy that information to a new sheet, make a new
report, and then distribute all that information. If the data also remains in
the original sheet, there's no assurance that somebody is going to update the
original report as well as the other report. You have two versions of the
truth. When all the data stays in one database, instead of getting shipped
around through email attachments, any changes or updates to the database will
also update every analysis, and if there's an error, we can fix it either in
the data or in the single analysis sheet that everyone else is connecting to.
So see, that wasn't that hard, right? Okay, yeah, that's a little difficult, I
know. It's a lot to take in. Unfortunately, it's just impossible to talk about
data visualization without talking about data. However, it's my strong belief
that a conceptual understanding, just a simple conceptual understanding, is
enough. If you watch this video a few more times and really get your head
around those basic SQL statements and the concept of keeping data separate from
the transformations to that data, you are going to have such an easy time
picking up all the different data visualization tools. It's kind of the light
switch that gives you the skill set to understand what these things are trying
to accomplish, as opposed to feeling like you're clicking something and
memorizing a process rather than learning a solution. If you can connect to a
data source in Tableau, you can do it in Power BI. If you understand that the
dimensions in Tableau are the same as a GROUP BY statement in SQL, and that the
measures in Tableau are the same as a SUM or AVERAGE in SQL, then you're going
to understand that in Power BI as well; you can understand it in any platform.
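As a quick illustration of that dimensions-and-measures mapping, here is a
minimal sketch in Python with pandas (my own analogy, not from the video; the
data and column names are hypothetical): the grouping column plays the role of
a Tableau dimension or SQL GROUP BY, and the aggregations play the role of
measures.

import pandas as pd

vehicles = pd.DataFrame({
    "brand": ["Apechete", "Apechete", "Jackson", "Tatra"],   # dimension / GROUP BY
    "gross_sales": [31000.0, 28000.0, 24000.0, 19000.0],     # measure
})

# Dimension becomes the groupby key; measures become SUM and AVERAGE aggregations.
summary = vehicles.groupby("brand")["gross_sales"].agg(["sum", "mean"])
print(summary)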
The Data
Dilemma, Continued
Kevin: “Okay, if you give Jan and me a sample,
meaning at least 25% of the most recent data, we will make sure that the data
is ‘clean’ by running some data validity tests, and then we can start putting
together some analytics. Once we have a better feel for the data, we can give
you a better estimate of how long it will take to develop the dashboards.”
Miranda: “Great! My team will make sure you get
all the data you need, and they will be available to help as well.”
Video
10: Data Visualization – Data Wrap‐up
We’ll now move to the last video in Dan Smith’s Data
Visualization video series where he summarizes key takeaways pertaining to
data, data analytics, and data visualizations.
Video Transcript (Est. time: 3:12 min):
Congratulations! If you're watching this, it means you've completed all the
videos on data analytics and visualizations. We covered a lot of material over
the course of these videos. We started with why visualizations are important
for analytics. Following that, we looked at some of the tools of the trade, the
leading software and platforms for creating data visualizations.
Next, we looked at the different types of audiences and how visual analytics
answers the different types of business questions that members of those
audiences may have: exploratory visuals for audiences looking to get a better
grip, a better understanding of the data, and to find new ways in which they
can use that data to help the business; informative visualizations for the
higher-level questions, or for demonstrable proof of how those exploratory
visualizations can help the business; and finally, reporting visualizations to
keep track of ongoing activities and answer the repeated business questions
which we've already demonstrated can help the business.
We looked at the low-level concepts of visualizations, such as how shapes can
be used to represent an aggregate value such as a sum, how a line represents
changes over time, and how a point represents the intersection of two or more
values, such as in a Cartesian plot or scatterplot. We saw how combining those
low-level concepts of shapes, lines, and points can create much more complex
visualizations like histograms, tree charts, bubble charts, ribbon charts, and
so on. We walked through a flowchart which allowed us to select the correct
visualization based on our business question: are we looking at a comparison of
groups or a composition of the whole? Are we looking at a time series or a
single point in time? Do we have multiple factors which we want to include in
our visualization?
From there, we took a closer look at the types of visualizations that we use
for exploratory, informative, and reporting business questions and audiences.
Then we looked at some of the best practices on how a dashboard can provide our
end user, our audience, with data-connected visualizations and add an
exploratory component to those visualizations. And finally, we explored the
data component of data analytics visualizations, looking at some of the data
operations performed by data visualization tools and the data environment in
which those tools live.
I hope you all enjoyed learning these data visualization
concepts as much as I have enjoyed teaching them. Again, thank you so much for
watching. I'm Daniel Smith. Goodbye.
HMC
Conclusion
You have now completed the case study, Huskie Motor
Corporation: Visualizing the Present and
Predicting the Future and finished the series of 10
videos with Dan Smith.
Module 2
Wrap‐up
You have completed Module 2 of this course and should now be able
to:
Define data visualization.
Describe how data visualization can impact the way data is
communicated.
Identify various data visualization tools and their different uses.
Recognize the importance of choosing the right visualizations based
on your audience.
Module 3:
Applying Data Analytics and Visualization
Course
Roadmap
Click Next to proceed to Module 3, Data Analytics and
Visualization Practicum.
Introduction
Welcome to Module 3, Applying Data Analytics and
Visualization. The goals of this module are:
To explore the data analytics and visualizations discussed in
the Huskie Motors case study, and
To apply acquired knowledge of data analytics and data
visualization to complete scenario- based exercises.
In the previous module, you were taken through a scenario of
Huskie Motors Corporation’s dilemma of the abundance of data available to them
and the need to use it to make business decisions. In this module, you’ll begin
in Lesson 1 with a tutorial of how some of the Huskie Motors analytics can be
performed. Through a series of five videos, you will be walked through how the
data can be loaded, transformed, and analyzed, and ultimately visualized for
effective communication and decision-making. Following each video, there will
be a set of knowledge check questions that you’ll be required to answer before
moving on.
After that, in Lesson 2, you will have the opportunity to
apply the knowledge gained by performing in-depth, scenario-based exercises on
data analytics and data visualization. Before you begin Lesson 2, you’ll need
to access the data files that correspond to Module 3 from the Resources link.
You can download them directly onto your computer. Read the question
presented in each exercise, use the data files to figure out
the best answer, and then come back to the question and select your answer. You
are allowed 2 attempts per question. If you still do not get it correct, you
will be able to click the ‘Show Answer’ button to view the correct answer. All
exercises in this lesson are based on the Huskie Motors case study reviewed in
Module 2.
Module
Menu
Module 3 consists of two lessons,
Lesson 1: The Data Analytics of Huskie Motors Corporation,
and
Lesson 2: Data Analytics & Visualization Practicum
Click Lesson 1 to proceed.
Lesson
1: The Data Analytics of Huskie Motors Corporation
Huskie
Motors Data Analytics
Dr. Ann Dzuranin, co-author of the Huskie Motor Corporation
case study, will present a series of videos showing how to perform some of the
data analytics required by Miranda and her team in the case study.
Dr. Dzuranin will take you step by step, using Microsoft
Excel, through a demonstration of how to work with the large amounts of
financial and non-financial data that the Huskie Motor Corporation is tasked to
analyze.
In the video series, you will see how to:
Identify, extract, load, and verify the data;
Analyze the data to identify the most profitable brands;
Calculate profit margins, contribution margins, cost trends
over time and more.
Huskie
Motors Data Analytics: Video 1
Let’s now take a look at a video on identifying the issue and
the data, extracting and loading the data, and then verifying the
data.
Video Transcript (Est. time: 16:58 min):
Hello, welcome to the Huskie Motors data analytics Video 1,
”Extract, Load, and Verify Data.” Evaluating source data. I am Ann Dzuranin,
and I will be discussing the material in this video.
In Video 1, the issue that we have identified is that we need to load Huskie
Motor Corporation's 2015 sales transaction data and confirm gross sales by
region. The data that we’ve identified we need is the 2015 sales transaction
data. To extract that data, we contacted the client and asked them to provide
us with that data. The client has provided us with a file, and the file is in a
.TXT format, so it is in text file format. This is not an unusual way that data
will be downloaded and I should say extracted from databases, for example. We
need to take that data and import it into our evaluation or
analytic software, in this case is going to be importing that text file into
Microsoft Excel. Once we have it in Excel, we’re going to have to verify that
all the data that the client had told us was in that file is actually there and
has actually loaded correctly into our Excel file. We’ll confirm that by
comparing what we’ve loaded into our Excel file to check figures that were
provided by the client.
So let’s get started. First thing that we need to do is take
a look at the text file that the client has sent to us. So if you notice here
we have what’s called the HMC client data file. When you try to open that, it
will open up a notepad and in this format it’s not really very useful for us to
work with. So instead we’re going to close that. We’re going to go ahead and
open up Excel and open a new workbook. Once I’m in Excel, now I go to the data
tab and I go over here where it says, all the way to the left, where it says
get data. You click on that get data, and notice you have a lot of choices from
which you can bring data into Excel. We know that we have a text file and so we
are going to click on ”From Text\CSV” file format. This will take us to our
computer where we need to then go to our files and find where we had saved that
client data. Go ahead and import that.
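For readers following along outside Excel, here is a minimal sketch of the same
extract-and-load step in Python with pandas (my own analog; the file name and
delimiter are assumptions, since the case only says the client provided a .TXT
export):

import pandas as pd

# Hypothetical path and delimiter for the client's text export.
sales = pd.read_csv("HMC_client_data.txt", sep="\t")

# A quick look at what came over, similar to previewing the first rows in Excel.
print(sales.shape)        # number of rows and columns loaded
print(sales.head())       # first few records
print(sales.dtypes)       # the data type inferred for each column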
Now if you have Excel 2016 or a newer version, you will have the option to
actually transform the data before you load it into your Excel spreadsheet.
This can be a really useful tool, and I'll show you how it works. If I look at
my data right now, I can page through it; it only shows me the first 200 rows,
just so I get a feel for what's come over. If I click on transform data, what
that will do is, essentially, Excel will look at each of the data points and
identify what type of data it is. So for each column, for example, it
identifies the VIN number as text; the ABC icon stands for a text field, and
that makes sense, right? Because it's not a continuous number. The same thing
for brand and model. Notice model year is just a number, and that makes sense
also because it's really a category; it's not the year when the car was sold.
The same thing for series, segment, body style, etcetera: those are categories.
So as we go through these, look and see if everything makes sense, and it does
make sense to me. Month, day, and year don't need to be date fields because we
have a sold date field here. Again, as we go through, it makes sense that sales
volume is a number, and it makes sense that the options and things like that
are categorical or text fields. We also want to be sure that all of our
financial information is accurate.
If you haven't been through this file before, it is essentially sales for all
of 2015 for Huskie Motor Car Company, and it starts with VIN numbers, where
each row is a specific transaction. So it represents a sale of a unique vehicle
to an individual, and all of the information in here essentially describes
everything about the vehicle that was sold. If we scroll all the way across, we
will eventually get to our financial data, and I wanted to point out a couple
of things that happen when we click on transform while bringing this text file
into Excel. We see that we have currency for gross sales and marketing, net
sales, labor, tooling, materials, and so on, but look at option cost. We can
see that Excel has identified it as a text field, which initially doesn't make
a lot of sense. The same thing for package cost. But notice what's happening:
when the data set was downloaded from the client's database, instead of putting
zeros, it put dashes. And when it brings that over as a currency field with a
dash in it instead of a zero, Excel doesn't know what to do with it. So what
Excel does is it says, no, this is not currency, because clearly we have text
values in here. In this transform mode we can change the data type to currency,
and we have the choice to either add another column, essentially duplicating
this one with the new column type, or to replace it.
So I’m going to go ahead and click replace so you can see
what happens.
So notice now that wherever there had been blanks or dollar signs with dashes,
the word "Error" comes up. If I do the same thing for package cost and change
that to currency, I will get the same issue, and we'll talk about that in just
a minute. Quickly cruising through the rest of these, it looks like tariffs was
also not identified as currency, so we're going to go ahead and change that
data type to currency and say OK, and everything else looks all right. Now that
I've done this, I can go up here and say close and load. What will happen is,
for all of those places where it said error, I'm going to get a notification
over here, a query that says, hey look, you've got 72 errors. What you would do
at this point is click on that and look to see what the errors are. It just
brings over the rows where there are errors, and if you scroll across, you will
eventually find where the error is. Remember, it was option cost and package
cost, as well as tariffs. If they didn't add an option, it should be zero. If
they didn't add a special package, it would also be zero.
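A minimal sketch of the same clean-up outside Excel, assuming the dash
placeholders described above and hypothetical column names; pandas turns
anything it cannot parse into a missing value, which we then fill with zero
(something we would still confirm with the client, as discussed below):

import pandas as pd

# Hypothetical extract where missing costs came over as dashes instead of zeros.
raw = pd.DataFrame({
    "vin": ["1A1", "1A2", "1A3"],
    "option_cost": ["1250.00", "-", "300.00"],
    "package_cost": ["-", "890.00", "-"],
})

for col in ["option_cost", "package_cost"]:
    # Non-numeric entries (the dashes) become NaN, then are filled with 0.
    raw[col] = pd.to_numeric(raw[col], errors="coerce").fillna(0.0)

print(raw)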
So what you would do at this point is you would actually go
talk to the client. You would confirm whether or not these should be zeros or
whether they should actually have dollar amounts in them. For our purposes
today, we are just going to go ahead and load this anyway, and now what will
happen is, instead of an error, when it loads into my Excel spreadsheet it will
just show up as a blank. All right. So we have our data in Excel now, and the
next thing we wanted to do, and I'll take you quickly back to the PowerPoint,
is check to see if our file came over correctly. What we want to do is confirm
that the number of transactions is 1,308, that gross sales total $38,675,192,
that the total number of blue cars sold was 340, that our first sale happened
on January 17, 2015, and that our last sale happened on December 20, 2015. We
can use pivot tables as a quick and convenient way to do a lot of these
verifications. For example, if I click anywhere in my worksheet and click on
pivot table, Excel will highlight my entire table range. I'm going to go ahead
and put this in a new worksheet since this sheet has a lot in it already.
Now when I get to this worksheet, the first thing I wanted to do, remember, was
to confirm the total number of transactions. A really quick way to do that is
to count, in the values area, the number of VINs in that column, because each
transaction is a unique VIN number. When I do that, I see that I get a count of
1,308, which, if we glance quickly back, is exactly what we want. So it looks
like the number of transactions is correct.
The next thing I would want to do is confirm sales. I could continue to work in
this same pivot table if I wanted to, but for an audit trail I'm going to do
each confirmation on a different page. I'm going to call this one count of
transactions, then go back to my first sheet and rename it so that we know
exactly what it is: 2015 sales. Now I'm going to insert another pivot table on
a new worksheet, and for this one I want to look at total gross sales. I find
my gross sales field and I get my sum of gross sales. I can change the format
of that by going into the values, clicking on the down arrow, picking value
field settings, and then number format. I'm going to change that to currency
with zero decimal places so it's easier to read, and click OK. So,
$38,675,192, and if we look back at our check figures, that is accurate.
The next thing we can confirm using a pivot table is how many blue cars were
sold. So we're going to go back into our sales transactions, click anywhere in
there, and insert a pivot table. I am going to find color; there we go, color.
These are the various colors that they offer for their cars, and remember, I
want to count how many I had of each color. That accidentally went into rows,
so I'm going to drag it over to values, and notice that I get a count of the
VIN numbers for all of the cars. I could, if I wanted to, also bring color in
as a filter and then only choose blue, and you can see that my count of blue
cars sold is 340. I can go back to my PowerPoint and confirm that that is
accurate.
Now to confirm the first day of sale and the last day of sale: probably the
easiest way to do that in the sales table is to make sure that your sold date
column is sorted from oldest to newest, meaning that the first one should be
the first sale of the year and the last one should be the last sale of the
year. So 12/20/2015 was our last sale and 1/17/2015 was our first, and that
checks out with what we have over here. So what we've done is we have gone
through and we have confirmed the total number of transactions by counting the
number of VINs that we sold. We confirmed the total gross sales of $38,675,192
by creating a pivot table that summed gross sales. We confirmed that the number
of blue cars sold was 340 by creating a pivot table where we had color as the
rows and then filtered it for only blue. And then finally, we went and looked
at our data, made sure that our sold date column was sorted from oldest to
newest, and confirmed that the first sale date was the 17th of January and that
the last sale was made on the 20th of December.
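For comparison, here is a minimal sketch of the same verification checks in
Python with pandas, assuming the hypothetical file and column names from the
earlier sketch; the check figures are the ones provided by the client above:

import pandas as pd

sales = pd.read_csv("HMC_client_data.txt", sep="\t", parse_dates=["sold_date"])

# Check figures provided by the client.
assert len(sales) == 1308                                       # transactions
assert round(sales["gross_sales"].sum()) == 38675192            # total gross sales
assert (sales["color"] == "Blue").sum() == 340                  # blue cars sold
assert sales["sold_date"].min() == pd.Timestamp("2015-01-17")   # first sale
assert sales["sold_date"].max() == pd.Timestamp("2015-12-20")   # last sale

# Gross sales by region, the summary the engagement asked for.
print(sales.groupby("region")["gross_sales"].sum())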
Remember, when we first talked about our issue for today, one of the things we
wanted to identify was the total gross sales by region, so after we verified
the data we needed to provide those results. We verified our data, we feel
comfortable that the data the client gave us is complete and accurate, and we
went ahead and totaled gross sales. I did this by using, again, another pivot
table. If we pop quickly back into the Excel spreadsheet that we just created
and insert a new pivot table here, I can pick region, and another way that you
can quickly find what you're looking for is to start typing its name. I know
I'm looking for gross sales, so I just typed in G-R and it took me right to
gross sales. Again I click that down arrow and change my field settings so that
my number format is currency with zero decimal places, and click OK. Then we
can see Europe, North America, and South America and our total, which agrees
with what we have in our visual here.
So, the key takeaways. First, we had to identify our issue, which we did: we
were looking to verify our data and then show the results for sales by region.
We identified that we were using 2015 sales data. We extracted it from the
client and loaded it into Excel; we were given a text file by the client and
imported it into Excel. We looked at the transformation options that are
available in Excel 2016 and higher, and we talked about the issues when you
have blank or missing data. We then verified that we had the data we were
supposed to have, confirmed our results, and
provided the summary of sales by region.
So thank you for watching Video 1, “Extract, Load, and Verify
Data.”
Huskie
Motors Data Analytics: Video 2
In the next video, we’ll see how to transform and analyze the
data in order to evaluate Huskie
Motor Corporation’s financial performance by brand.
Video Transcript (Est. time: 12:31 min):
Hello, welcome to the Huskie Motors data analytics Video 2:
“Transform and Analyze Evaluating Brand Performance.” In this video, we are
going to identify a business issue facing Huskie Motors. We will identify the
data we need to evaluate that issue. We’re going to transform the data by
creating a measure that’s going to help us address the issue. We will then
analyze the data and then communicate our results.
In this video we’re going to be evaluating the HMC financial
performance by brand, and
the data that we’re going to use for that are the 2015
through 2016 sales transactions. The transformation that we’re going to do is
to create a measure that’s going to help us evaluate financial brand
performance, and we’re going to use the profit margin ratio calculation as that
measure. Once we create that ratio, we’re going to go ahead and analyze the
data by preparing a pivot table to summarize profit margin by brand, and that
will allow us to communicate the results as to which brand has the highest
profit margin ratio for 2015 to 2016.
As a review, I’d like to cover what the profit margin ratio
calculation is. For the profit margin ratio, we take operating income and we
divide it by sales. There are probably other measures that you might
be thinking would be also good measures of financial brand performance, but the
reason we’re choosing profit margin ratio is because that helps us eliminate
any type of volume differences between the brands. So the profit margin ratio
allows us to break down by brand what their profit margin ratio is regardless
of how many cars they sell of each brand.
So let’s go ahead and open up the Excel spreadsheet that
we’re going to be working with in this video. This is the Huskie Motors sales
transactions from 2015 to 2016, and if you’re not already familiar with the
data, I’ll give you a brief overview of it. Each row in this data set
represents a single sale of a vehicle. And so you can see that the VIN number
should be unique in each row of this data. There are approximately 2,674
transactions that we’re
going to be working with. The other information we have includes the typical
things that you would think about if you were buying a car: the brand name of
the car, the model of that car, the model year, perhaps a series number for
that vehicle, whether it is a full-size car or a compact, for example, and the
body style. Then we have sold date information. We also have sales volume,
which will be one for each transaction since you can only sell a unique vehicle
once, drive configuration, engine, etcetera.
If we continue to page through or scroll over, I should say,
we will also see that we have a lot of information about options that can be
added to it. We have the region in which the car was sold, the country in which
it was sold, how many days it was on the lot, etcetera. So if we’ve continued
to scroll over, we will eventually get the financial information, and that
starts with the column of gross sales. I do want to point out that there is
also a column called net sales. And what that column represents is the
difference between the gross sale amount and any variable marketing expenses
that were incurred. So these variable marketing expenses are based on specials
and deals that may be happening at the time that car was sold. So cash rebates,
low-interest financing, special lease terms, those are types of variable
marketing campaigns being used. And then this number represents what that cost
was.
We're going to keep scrolling to the end of our financial
data, and you can see I already created a column for profit margin ratio and
we’re going to recreate that so that I can walk you through how I did that.
It’s fairly straightforward so I’m just going to type in profit margin ratio.
So we said that the calculation is operating income divided by sales. So the
equivalent in this data set is going to be our net revenue, and we’re going to
take net revenue instead of after-tax revenue. The reason we’re going to do
that is after-tax revenue is driven by the tax rate that is in the specific
countries in which the cars are
being sold. And so we don’t want to muddy the waters by also
having to account for differences in tax rates. So we’re going to go ahead with
net revenue, and we’re going to
divide that by gross sales.
Now the reason we’re using gross sales instead of net sales is because gross
sales is a better measure of sales. Since net sales is really just net of a
marketing expense. Now if this was a retail company and net
sales represented the difference between gross sales and sales returns, well
then you would want the net sales, but that’s not the case here. So we go ahead
and we take our net revenue divided by our gross sales, and we get our profit
margin ratio. Now I would recommend changing this so that it is a little bit
easier to read, so you can change it to a percentage; I probably would also add
another decimal place to that. Now we can use the autofill feature by just
double-clicking in the corner here, and that will autofill our profit margin
ratio all the way down to the end of our data. I do want to point out a word of
caution on autofill: autofill will only work if there is a corresponding data
point in the column immediately to the left of where you're filling. If there
was a blank, for example, somewhere in that adjacent column, the autofill would
stop there. So what's a good way you can tell if you've got everything? Well,
if you notice down here, after I did autofill it gave me a count of how many
rows it autofilled that formula to. And we know from looking at the number of
rows we had at the beginning that we have 2,674, so it looks like it autofilled
correctly.
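A minimal sketch of the same transformation in Python with pandas (my own
analog to the Excel formula and autofill; the file name and column names are
hypothetical stand-ins for the ones in the case data):

import pandas as pd

sales = pd.read_csv("HMC_sales_2015_2016.txt", sep="\t")

# Profit margin ratio = operating income / sales; here, net revenue / gross sales,
# mirroring the choice explained above (net revenue avoids country tax-rate noise,
# gross sales avoids netting out variable marketing expense).
sales["profit_margin_ratio"] = sales["net_revenue"] / sales["gross_sales"]

# Quick sanity check that the ratio was computed for every transaction.
print(sales["profit_margin_ratio"].count(), "rows populated")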
All right, so now we have our profit margin ratio, and what we want to do now
is create a pivot table so that we can evaluate which brand has the highest
profit margin ratio for sales from '15 to '16. If you click anywhere in your
worksheet and go up to insert pivot table, notice that Excel will automatically
find the whole range of your data; you just have to be clicked within a cell
for that to work. We're going to go ahead and put this in a new worksheet
because we already have a lot in this worksheet.
So click OK. Now notice that you have your fields here, and you've pretty much
got your canvas here where you can start to drop in what you want in your pivot
table. So we're interested
in profit margin ratio by brand. So if I click on
brand over here, you can see that it drops in the Apechete,
the Jackson, and the Tatra. We also want profit margin. So if we look down to
the bottom, you can see, and these go in the order in which you have your
columns in your worksheet. Notice I have two profit
margin ratios because remember I had already created one and
then we created a second one as a demonstration. It doesn’t matter which one we
use. I’ll go ahead and take the second one we just created. So now notice what
we get. It defaulted to give me the sum of profit margin. And that doesn’t
really make a lot of sense, right? We don’t want to add up all the profit
margin ratios. That really doesn’t give us any kind of number that makes any
sense for us to evaluate. What we really want to do is look at the average. So
if I click on the down arrow and I go to value field settings, you can see here
where you get choices of how you want to summarize the fields. In this case we
are going to look at the average
of the profit margin, and I’m just going to click OK.
Now if I want to change this so that it's in a nicer format and it has
percentages instead of the way it looks right now, I can go back to
value field settings and click number format. And here if I pick percentage,
I’m just going to use one decimal place and then I’m going to click OK, click
OK again. And now I have a much neater-looking table here. Grand total doesn’t
necessarily make a lot of sense, either. So if you want to take out grand
total, you can just click on it. And then go to remove grand total. We can also
change what we want to call this if we wanted to call it something different as
well. So now you can see
that we have created a pivot table of the average profit
margin, and as we can see, the
Apechete is performing the best followed by the Jackson and
then by the Tatra.
Now before we call it a day, let's think about what we talked about at the very
beginning: that there might be other measures you may have been tempted to use
to evaluate performance. Let's say, for the sake of argument, that we also want
to look at net revenue. If I click on net revenue, I'm now looking at the sum
of net revenue, and I'm going to change the format on that so it's a little
easier to read. Putting in net revenue, we can see that the Apechete also has
the highest total net revenue, followed by the Jackson and then the Tatra. But
notice that the Tatra has revenue very close to the Jackson's, and yet if we
look at which one is more profitable, we can see that it's the Jackson of those
two, and that the Tatra actually has a negative profit margin ratio. So if we
go back to our slides, this slide shows the results that we just calculated in
the pivot table, where the Apechete has the highest profit margin followed by
the Jackson and then the Tatra.
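Here is a minimal sketch of the same pivot-table analysis in Python with
pandas, continuing the hypothetical file and column names from the earlier
sketch; it shows why average margin and total revenue can rank the brands
differently:

import pandas as pd

sales = pd.read_csv("HMC_sales_2015_2016.txt", sep="\t")
sales["profit_margin_ratio"] = sales["net_revenue"] / sales["gross_sales"]

# Pivot-table equivalent: one row per brand, average margin and total net revenue.
by_brand = sales.groupby("brand").agg(
    avg_profit_margin_ratio=("profit_margin_ratio", "mean"),
    total_net_revenue=("net_revenue", "sum"),
)
print(by_brand.sort_values("avg_profit_margin_ratio", ascending=False))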
So what are our key takeaways? Well, what we did in this video is we identified
the issue: what is the financial performance of the brands, and can we identify
which brand is performing best in the years '15 to '16? We took the data, which
was the sales data from 2015 and '16, and we transformed that data by creating
a measure, the profit margin ratio. We then took that profit margin ratio and
conducted a pivot table analysis, which helped us identify that the Apechete
was in fact the best-performing brand.
So thank you for watching Video 2: "Transform and Analyze."
Huskie
Motors Data Analytics: Video 3
Now that you saw how some data analytics are performed using
Microsoft Excel, we’ll move on to some exploratory data visualization using
Tableau to understand the Huskie Motor Corporation’s variable costs.
Video Transcript (Est. time: 22:20 min):
Hello, welcome to Huskie Motors Data Analytics Video 3,
“Exploratory Data Visualization,” understanding variable costs. I am Ann
Dzuranin, and I will be discussing the material in this video.
What we're going to cover in Video 3 is, first, identifying the issue that
we're going to examine, which for this video is going to be evaluating Huskie
Motor Corporation's variable costs. We have to identify what our key questions
are: what are we going to explore in regard to Huskie Motor Corporation's
variable costs? Specifically, we're going to be examining which models have the
highest variable costs, which costs comprise the largest percentage of total
variable costs, and how variable costs have changed over time.
The data that we’re going to be using is the data we’ve been
working with in the other two videos, the 2015 to 2016 sales transaction data
for Huskie Motors. We’re going to explore the data in two ways. We’re going to
use Excel, and we’re going to use pivot tables and charts to help us
investigate the identified questions. And then we’re also going to use a
visualization software called Tableau to investigate the identified questions.
So let’s go ahead and get started. The first thing we’re
going to do is get our Excel file opened up. So this is the sales data for
Huskie Motors. It’s the same data we worked with in the last couple of videos.
We’re going to be using this data to create three visualizations. We’re going to
explore it using both pivot tables and some pivot charts. What we want to do is
identify variable costs by model. So this is our goal: to create this pivot
table and this visualization. That's our first step, so let's go ahead and
start a new worksheet and do that. We want to know, by model, what total
variable costs are. To do that, we can go back to our sales data, click
anywhere in the data sheet, and go to insert pivot table. Excel will
automatically highlight the whole area and create the table that you're going
to be using. We're going to go ahead and put it in a new worksheet. So we get
to our new worksheet, and now we have our canvas here to work with to develop
our pivot table, and from that pivot table we will develop a pivot chart.
So we know that we want to look at variable costs by model.
So let’s go ahead and bring model down to rows and then let’s bring years sold
into columns. We’ve pretty much set up how we want our rows and columns to
look. We want each model listed as an individual row. And then we want to look
at the cost in 2015 and 2016. So now what I need to do is grab that cost
information and bring it over to values. So I’m going to go
ahead and scroll down until I come to total variable cost.
And I'm going to bring that down to values. It automatically creates a sum, and
that's okay; we're interested in the sum this time. But we do want to make the
number format look a little bit better so we can read these numbers more
easily. So I'm going to choose currency, and we don't need decimal places for
this analysis, and I'll hit OK. Once I hit OK again, it'll format my numbers,
and it's much easier to understand. So pivot
tables are really useful. We’ve summarized all of our
total costs by model for each year, but you know, it does get
to be a little confusing to now try to look through here and make sense of,
okay, which one is highest, which one is lowest, which ones changed the most?
So often visualizing the data is going to enable you to get to those answers a
lot more quickly.
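Here is a minimal sketch of that same by-model, by-year summary in Python with
pandas (my own analog to the pivot table; the file and column names are
hypothetical stand-ins for the case data):

import pandas as pd

sales = pd.read_csv("HMC_sales_2015_2016.txt", sep="\t")

# Rows = model, columns = year sold, values = sum of total variable cost.
variable_cost_by_model = pd.pivot_table(
    sales,
    index="model",
    columns="year_sold",
    values="total_variable_cost",
    aggfunc="sum",
)
print(variable_cost_by_model)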
So we can go ahead and work with what we have and we can
actually create a pivot chart if we would like to. So notice I’m in pivot table
analyze and I have this choice over here for pivot chart. If I click on that,
it’s going to open up a box for me here to decide what type of chart that I
want. I’m okay with the column chart for now. So we’re going to go ahead and
click OK. And now we have our column chart. As you can see, I'm going to close
our field list over here. We can now much more quickly identify which model in
2015 had the highest total variable costs just by looking. We can see that it's
the Jespie, and if we hover on that point, we can see how much that is in total
for that year. Then if we wanted to see which one had the highest for 2016, we
can again visually pick that out pretty quickly, and it is the model called
Pebble. We can also see visually which ones are going up or down, right? And
which ones have stayed the same. So very quickly we can get to the information
that we would like to understand better, just by taking that pivot table and
changing it into a pivot chart.
That’s the power of using data visualization for exploratory analysis. The other question that we
wanted to answer was which costs comprise the largest percentage of total
variable cost. So for that we want to look at the model variable cost, but we
want to look at it broken down instead of by model. We want it broken down by
the type of cost. So we know from the data set that our costs, that go into our
total variable costs include labor, materials, overhead, freight, warranty,
package cost, and
option cost. So let’s go ahead and create a pivot table with
those and then we can use that to create this pivot chart. So again, I’m going
to go back to the sales data set and I’m going to click anywhere within my
worksheet and go to insert pivot table. Now you might be thinking, why don't I
just start with a pivot chart? And you could do that; you could go right to
pivot chart. But it's often good to have the numbers in one place so you can
manipulate the chart from those numbers rather than continually having to
change the chart. I'm going to go ahead and click OK to put this on a new
worksheet. All right, so remember, now we want to look at the breakdown, by
year, of the components of variable costs. What I would like to do is find each
of my components and bring those over. Let's see, we said that labor,
materials, option costs, package cost, freight, overhead, and warranty are all
in my variable costs. Notice that for now it's just putting a sum of each one
here. I also wanted it to be broken down by year as well, right? So what I want
to do is make these rows instead: I just drag the values from columns to rows,
and that values item represents all of the value fields that I've put over
here. Then I'm going to take years sold, which I believe is up here somewhere,
and bring that over to my columns. So now I have all of my variable cost
components, how much they are for each of the two years, and how much they
comprise of the grand total. Now, we could go back into each one of our value
fields here and change the format of those, or we could change the format
directly on the worksheet. The benefit of doing it within the value field
settings is that the formatting stays with the field if you change your pivot
chart around or add or move things. If we just change it on the sheet, the way
we would in any spreadsheet, on the Home tab in the Number section, that will
work, but if we change or add a column, that column would not be formatted the
same way. Okay.
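Here is a minimal sketch of the same breakdown in Python with pandas (my own
analog; the cost column names are hypothetical stand-ins for the components
listed above):

import pandas as pd

sales = pd.read_csv("HMC_sales_2015_2016.txt", sep="\t")

cost_components = ["labor", "materials", "overhead", "freight",
                   "warranty", "package_cost", "option_cost"]

# One row per cost component, one column per year sold, summed across vehicles.
breakdown = sales.groupby("year_sold")[cost_components].sum().T
breakdown["grand_total"] = breakdown.sum(axis=1)
print(breakdown)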
So we’ve got all of our components. And again, now how do we
want to visualize this to be able to see what comprises most of the variable
costs. I think even just from the grand total, we can see that that’s going to
be materials which makes sense in the production of cars. But let’s go ahead
and click in your table, anywhere in your pivot table, on your top ribbon pick
pivot table analyze, which should be highlighted once you’re in that pivot
table. Go ahead and pick pivot chart. Again, the clustered column is the
default that Excel is suggesting for us. We could choose another one if we
would like; it doesn't always give us the best choices, so typically what I
would do is stick with the clustered column. So then you can see what we've
done: I'll bring this over here and close our fields pane. Again, we can
quickly confirm that materials makes up the majority of our variable cost,
followed by labor, option costs, and warranty costs. We can also see a little
bit of a trend in which ones are going up and which ones are going down.
Okay. So now that we've taken a look at this in Excel, why don't we think about
how we might look at it differently? Well, actually, before we even move from
Excel, let's go ahead and do that year-to-year comparison. Our third question
was how have variable costs changed over time? There's a couple of ways you
could do that, right? We can look at it in total. So let's just create a new
pivot table that looks at how variable costs have changed from '15 to '16.
Again, click anywhere in our data, go to insert pivot table, and we can choose
a new worksheet. So what do we want to know here? We want to know, by year sold
(we don't want to sum the years sold), what is our total variable cost?
Okay. So, if I wanted to go ahead and change this so that it
was formatted in currency, I can go ahead and do that. I’ll do zero decimal
places. So now we have basically our total variable costs and we can see what
it is each year. Go ahead and click OK so it changes the format. What if I
wanted to create a pivot chart of this so that I could compare year to year?
Again, click anywhere in your pivot table and then click on pivot chart, and
you will get your option here for your pivot chart. We can see very quickly
that total variable costs have gone up from 2015 to 2016. I want to point out a
couple of things about the default visualizations in Excel. Notice that we're
not starting at zero; that can make for a very misleading visualization, so
this would not be considered best practice. But again, we're just using this
for exploratory analysis. We're not using it to explain or communicate our
results yet; we're just using it as a way for us to better understand what's
happening, so it's okay for that reason. So we can see that we've got an
increase in total variable costs, but what drives variable costs? Sales volume,
right? So we really need to compare this to sales volume, so we can go ahead
and create another visualization that gives us sales volumes.
So let’s go ahead, back to our data again, click anywhere in
the data, click on insert pivot table.
Now in this case, I don’t want to put it in a new worksheet because I want it
in the worksheet I was just working in so I can compare the two. So I’m going
to click on existing worksheet location and then click in that location box.
And what you need to do there is go back to the sheet that we were on, that had
that new visualization and it was this one and click on a cell in there for it
to go to and say, okay. All right, so now we’ve got another option here for
another pivot table. What I would like to do here again is look at sales volume
now by year. I’m going to go ahead and pick my year sold. I want a sum of my
years sold. And let’s go ahead and look at sales volume.
Now we had a sales volume number, but there’s another way
that we can do that. We can actually take the VIN number into values and it’ll
automatically do a count for us. Okay, so it’s counting the VIN numbers for us.
So now we have the sales volume. I’m going to go ahead and click within that
pivot table and create a pivot chart and click OK. So just briefly looking at
this, and again, our scales are different though, right? But we can see that at
least we know that sales volume went up, in addition to our total
variable costs. But wouldn't it be nice if we could see all of this in one
visualization. I’m going to show you how to use a visualization software to
create an image like that and also to get a better understanding of what your
total variable costs might be.
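Continuing the pandas sketch from above, the count-of-VINs idea translates directly; the column name "VIN" is again an assumption.

# Sales volume per year: count one row (one VIN) per vehicle sold.
sales_volume_by_year = sales.groupby("Year Sold")["VIN"].count()
print(sales_volume_by_year)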
So I’m going to go ahead and use a program called Tableau.
Now I’m going to open up the Tableau file that I’ve already started for this.
So in this Tableau worksheet this looks very similar as you can tell to a pivot
table, right? We’ve got areas where we can drag our rows, areas where we can
drag our columns to create our visualization. The power of a visualization
software is that generally it has many more visualizations to choose from. And
in addition, you can manipulate things, a lot more easily. So let me show you
as an
example. So one of the things we want to see was total
variable costs; we want to see a breakdown of total variable costs, right?
So if I were to take my individual cost, I could do that by again just clicking
on each of those items. Now in this particular program, it will alphabetize the
data vs. having it in the order it was in the worksheet. There’s pros and cons
to that I suppose.
So we know that we want, freight, and if you double-click,
it’ll bring it over. We want labor. We want materials, we want option cost,
overhead, packaged costs, and warranty. Now notice we get something, and we're
not really sure this is exactly the visual we would like to look at. We also
want to be able to look at it, by year as well, right? So if I drag my sold
date up to my filter and pick years, then I can choose which years I want. In
this case I want both. So I'm going to hit
OK. I can also drag that over to be in columns, let's say. So then I've got all
of my information in here that I would like, and now I can just go to the
Show Me button and pick a visualization that I would like to see. So what I've
done is I’ve picked this column chart and I can see in 2015 which items made up
the majority of my costs. And in 2016 which items made up the majority of my
costs. We knew from our previous analysis that it was going to be materials,
but again, very quickly, visually we can see that. Now, back to our question:
what if I wanted to see 2015 total variable costs compared to
volume? So let's go ahead and start a new sheet and think about how we
would want to do that. Well, we know that we would want to look at total
variable cost. We also know we would like to look at it by year. And we also know,
that we would be interested in seeing what our total sales volume was as well.
So we can see I have total variable costs and I have sales volume, and it’s not
really all that much more meaningful to me except I can see they’re both going
up. But if I wanted to, I could take my sales volume number and use that as a
dual axis. So if I click within this second visualization, I can pick dual
axis. What this does is allow me to look at sales volume and variable costs;
these are two different measurement scales, but it puts them on the same plane
so that I can see if my variable costs are rising at the same rate as my volume.
And so after doing that, I can feel pretty comfortable, right? They are moving
in sync, which is what we would expect variable costs to do. So again, just to
summarize: this was our total variable cost by model and year. This was the one
that we prepared in Excel, confirming that the Jespie was the highest in 2015
and the Pebble was the highest in 2016. We then looked at a breakdown of
variable cost in Excel so that we could see which of the variable costs were
contributing the most to the total, and that clearly came out as the material
costs. We also split it by year so that we could see which ones are increasing
or decreasing. And then we wanted to look at the change over time. We did it
initially in Excel, but then decided that to really evaluate whether or not
your variable costs are rising at the same rate as your volume, you need to see
both together. And so we used the visualization software Tableau to take both
sales volume and variable costs for both years, and, using a dual axis, we
were able to plot them both together in the same graph.
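For readers following along in code, a rough matplotlib equivalent of that dual-axis view, built on the two yearly series from the earlier sketches, might look like this; it is only a sketch, not the dashboard the video builds.

import matplotlib.pyplot as plt

# Plot total variable cost and sales volume on two y-axes that share one x-axis,
# so we can see whether the two series move in sync from year to year.
fig, ax_cost = plt.subplots()
ax_cost.plot(total_vc_by_year.index, total_vc_by_year.values,
             marker="o", color="tab:blue", label="Total variable cost")
ax_cost.set_xlabel("Year sold")
ax_cost.set_ylabel("Total variable cost ($)")

ax_volume = ax_cost.twinx()  # second y-axis on the same plane
ax_volume.plot(sales_volume_by_year.index, sales_volume_by_year.values,
               marker="s", color="tab:orange", label="Sales volume")
ax_volume.set_ylabel("Sales volume (units)")

plt.title("Total variable cost vs. sales volume")
plt.show()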
For our key takeaways today, we identified our issue, which was to evaluate
Huskie Motors' variable costs. We identified the specific questions. Which
models had the highest variable cost? It was the Jespie in '15 and the Pebble
in '16. Which costs comprise the largest percentage of total variable costs?
That was materials for both years. And how have variable costs changed over
time? Although they've increased from '15 to '16, they've done so in sync with
sales volume. And then we explored the data in both Excel and Tableau.
So thank you for watching Video 3, “Exploratory Data
Visualization.”
Huskie Motors Data Analytics: Video 4
In the next video, we’ll perform some statistical analysis to
evaluate total variable costs and prepare a histogram.
Video Transcript (Est. time: 23:02 min):
Hello, welcome to Huskie Motors data analytics video for
statistical analysis, “Understanding Variable Costs.” I am Ann Dzuranin, and I
will be discussing the material in this video. So let’s go over what we’re
going to cover in the video. The first thing we’ll do is identify the issue.
We’ll then identify the data we need, we’ll analyze the data and then evaluate
our results.
So the issue that we’re tackling today is the use of
descriptive statistics to evaluate our total variable cost. The data that we’re
going to use is the 2015-2016 sales transactions. We’re going to prepare
descriptive statistics and prepare a histogram of variable cost as well. And
then we’ll summarize our key findings. So let’s go ahead and get started. We’ll
look at our Excel file that we’re going to be working with today and that’s the
sales data from 2015 to 2016. And what we’re trying to understand better are
our total variable cost. And we’re going to do that by preparing some
descriptive statistics and a histogram of the distribution as well. So this is
what our goal is going to be to create these descriptives and to create a
histogram that gives us an idea of what our distribution looks like.
So what we're gonna do first is create these descriptive statistics. Each of
these statistics can tell us something really important about our data. So
let's just go ahead and get started and
recreate this. If we go to the 2015-’16 sales data, and I’m going to scroll
over until I get to my total variable cost, that’s what I’m analyzing in this
video, total variable cost. So I’m going to go ahead and copy this column of
data and put it in a separate worksheet so that I don’t have it, all in one
large spreadsheet. Go ahead and copy
that and put it into a new sheet. So this is the data that we’re going to be
working with in this video: total variable cost. Each row represents a
specific car that was sold and the variable costs associated with that vehicle.
Now descriptive statistics can be done using the data analysis tool in Excel.
So if I click on data in the toolbar, I can see in my ribbon up here that in my
data ribbon, that there is a section called analysis and I have a tool called
data analysis. It’s possible that you don’t have that in your Excel. It’s
possible you need to add it. So if you don’t have it and you need to add it,
what you’re going to do is click on file and then go all the way down to
options. Once you’re in the options screen, you want to go down to add-ins.
Once you’re in add-ins you want to go to the bottom where it says manage Excel
add-ins and click go. Now you should see a pop-up box here that gives you the
ability to add some analysis tools to your data ribbon. So make sure you have
Analysis ToolPak checked; that's the first box. And then click OK. It should
pop up right in your data tab. If it doesn’t, then just refresh. Go out of your
data tab and go back to your data tab and it should be there. So once you have
the data analysis tool, go ahead and I’m going to click on that and show you
what’s in here.
So this data analysis tool essentially summarizes all of the statistical
functions that Excel has available for us to use. We're going to be working
with two of these today: the descriptive statistics tool and the histogram
tool. So go ahead and click on
descriptive statistics and click okay. And now you get, a box here for you to
fill in the information to calculate the descriptive statistics. So your input
range is going to be the range of data that you would like to, get the
descriptives on. So I always like to take
the column name because then it returns that in my results.
I'm going to click on the first cell and I am going to highlight all the data in
that column. Because I picked the label, I'm going to make sure I click labels in
first row. If you don’t do that, then Excel is going to tell you there’s
non-numeric data in the range. Because there are words at the top. We’re going
to go ahead and keep that in this worksheet. So I’m going to click on output
range and then click in that box and I’m just going to go ahead and scroll back
up to the top here where I want to put my results. Then I’m gonna click summary
statistics. Otherwise I won’t get any printout at all. Once I’ve done all that,
I can click okay. And now what you see is a report of variable cost. One thing
that I like to do, because sometimes it can get confusing with many decimal
places and no commas, is go to my number formatting here and click comma style;
it just makes it easier to see the statistics and read the numbers. Okay.
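If you are replicating this report outside Excel, pandas can produce the same descriptives from the earlier sketch; the column name is again an assumption, and pandas' skew() and kurt() use sample-based, bias-corrected definitions, so the values should be close to what Excel's Data Analysis output shows.

# Descriptive statistics for total variable cost (column name assumed).
vc = sales["Total Variable Cost"]

report = {
    "Mean": vc.mean(),
    "Median": vc.median(),
    "Mode": vc.mode().iloc[0],        # first mode if there are ties
    "Standard Deviation": vc.std(),   # sample standard deviation
    "Sample Variance": vc.var(),
    "Kurtosis": vc.kurt(),            # excess kurtosis
    "Skewness": vc.skew(),
    "Range": vc.max() - vc.min(),
    "Minimum": vc.min(),
    "Maximum": vc.max(),
    "Sum": vc.sum(),
    "Count": vc.count(),
}
for name, value in report.items():
    print(f"{name}: {value:,.2f}")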
So, what I'd like to do is point out which of these variable cost results are
our most important descriptives. If we look at these, I'm going to go ahead and
highlight them as I talk about them. So the mean is the first one I want to
talk about. We all understand what a mean is, right? It's the sum of all the
observations divided by the number of observations that we have. But one thing
to keep in mind with a mean is that it can be influenced by outliers. So if you
have one or a couple of very high numbers, it's going to pull the mean toward
those high numbers. Conversely, if you have a few extremely low numbers
compared to the rest, it's going to pull the mean down. So you always want to
keep that in mind. The other measure of location that we're going to talk about
is the median. The median is determined by arranging all of the values from
smallest to largest; the median is that middle value. So the median is not
influenced by outliers, and that's an important thing to keep in mind. If there
are outliers in your data, then you probably want to go with the median instead
of the mean. The mode is also in here; I wouldn't exactly highlight that as
something that's important for this data set. The mode is essentially the
observation that occurs most frequently. It's really best used with small data
sets that have a small number of unique values, and it's not really useful if
you have a data set without many repeating values. So those are the measures of
location that we're going to be using or interpreting in our data.
Measures of dispersion are also important. Measures of dispersion include the
standard deviation, the variance, and the range. Of these three, the one that's
going to be most frequently used is the standard deviation. The standard
deviation is an indication of how the observations in our data set are spread
out from the mean. So if we have a low standard deviation, that would imply
that the observations are all close to the mean. If we have a large or high
standard deviation, that would indicate that the observations are more spread
out. We get to the standard deviation by calculating the variance; the variance
is the average of the squared deviations from the mean. But the reason we
generally use the standard deviation is that it's more practical: it's in the
same units as the data. So, in other words, 5,573.89 is $5,573 and 89 cents of
variable cost, whereas the sample variance is not really that interpretable.
The range is also an indication of dispersion. Obviously, the larger the range,
the more possibility you have for your data being dispersed. However, it's not
as good a measure as the standard deviation.
The other things that we can glean from our descriptive statistics are the
coefficients for skewness and kurtosis, and I'm going to highlight them in
different colors. What these measures tell us is the shape of our data.
Skewness is an indication of how symmetrical our data is, the symmetry or lack
of symmetry of the data. So if we have data that is symmetrical, that means it
is evenly spread around the mean. If we have data that is not symmetrical, it's
either going to be positively skewed or negatively skewed, and what that
suggests is that there's more data at one end or the other of the distribution.
Kurtosis is a measure of how peaked our data is. What we mean by that is, when
we look at the histogram, we'll see whether we have one area of highly peaked
data with the rest spread out from there. So it gives us an indication of
whether there's a somewhat flat, wide degree of dispersion compared to a peaked
one. Okay.
So, the other interesting thing to look at here is if we look
at the minimum, the maximum, these are going to be useful for when we’re trying
to determine what our bins are going to be or what our groupings are going to
be for our histogram. The minimum is the smallest observation in the data, and
the maximum is the largest observation. And then finally, count is an important
thing to look at because it tells us very quickly whether or not we have all of
the data observations that we are supposed to have. So we know, that we are
supposed to have 2,674 observations, and we do have that. So it gives us some
confidence that we’ve got all of our data. All right, so now
we have these descriptives. What we can tell is that the average or
the mean variable cost for a vehicle is
$17,841, the median is $15,516. We can also see that there's a standard
deviation from that mean of about $5,573. Whether a standard deviation is large
or small is relative to the size of your mean. So I would say that if we've got
a mean of
$17,841 and our standard deviation is $5,573, there’s a
pretty good amount of deviation from the mean. So what we want to do next is we
want to go ahead and do a histogram of our total variable cost to get a feel
for where we have buckets of data, of observations.
The way you want to do a histogram, generally you don’t want
to have more than eight bins in a histogram. We’re going to go ahead and create
our bins over here, which is essentially the columns of data that we’re going
to have on the column bar chart. So to create bands, you generally want to start
close to where the minimum is and then increase it by equal amounts until you
get to the maximum. So that’s where the minimum/maximum come in handy.
I’m going to go ahead and start my bins at 9,000 and then I’m
going to increase those by 5,000 for each bin. So if I put in 9,000 and then
14,000, which shows the increase, I can just drag that down until I get up to 39,000.
These are the bins that I’m going to use to create my histogram and the
groupings of variable cost observations. So let’s go ahead and create that
histogram. If we go back to our data tab, and we click on data analysis, and
then we click on histogram and click OK. What Excel is asking us to do is tell
us, okay, what is it you want me to put in your histogram? So that’s our
variable costs. Again, I’m going to pick the label and I'm going to pick all
the data in there. The Bin range is going to be those bins
that I just established. So I want to see my groupings. I’m
starting at 9,000 and going up by
5,000. And I’m going to put the output range here on this
worksheet so that we can see it right next to what we’ve created. Now down here
is where you’re gonna want to chart your output so you actually do get the
histogram. Go ahead and click OK. Now you can see what I get. I get my bins
repeated and then the frequency, how often I see an observation that is in that
range. So the 9,000 bin essentially captures that minimum of 8,912. Notice that
I've got the highest count in the bin between 14,000 and 19,000, which makes
sense because our mean is about 17,000 and our median is about 15,000. So it
makes sense that this would be the bin with the highest number of observations.
It's basically saying that variable costs between 14,000 and 19,000 occurred 889
times. So it gives us a really good indication of where most of our
observations are distributed. It also gives us a hint that perhaps this
is an outlier. We might want to go back and check and be sure that data is
accurate. And then when we’ve got only three that are above the 39,000, we
might want to make sure that data is accurate as well. So I’m going to go ahead
and call this sheet descriptives.
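The same binning can be sketched in pandas with pd.cut, continuing the earlier sketches; the edges below mirror the 9,000-to-39,000-by-5,000 bins used here, with catch-all buckets on either end, and the column name is still an assumption.

# Bin edges: 9,000, 14,000, ..., 39,000, plus open-ended buckets at each end.
edges = [float("-inf")] + list(range(9_000, 40_000, 5_000)) + [float("inf")]

frequency = (
    pd.cut(sales["Total Variable Cost"], bins=edges)
      .value_counts()
      .sort_index()
)
print(frequency)  # how many vehicles fall into each variable-cost bucket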
So our goal, though, is to understand what total variable costs
are per model. And so we know now that we've got a pretty significant standard
deviation, we know what our overall mean is for all of our vehicles that we
sold, but we really want to get a little bit more detailed. So we really want
to understand better what’s happening by each model. So to do that, we can
create a pivot table. And what we’re going to do is go back to our data and
again, as we did and in other videos, we’re going to click anywhere in that
data, go up to
insert pivot table, and we’re going to go ahead and put that
on a new worksheet. And then click OK. So what I’m interested in is looking at
by model what my total variable costs are. But I want to think of it in the
same way that I just did my, other variable cost analysis. I want to look at
the average or the mean and the standard deviation. So if I click on total
variable cost, it brings that down into my values as a sum. And I’m not
interested in sum right now, I really want to understand better what’s
happening by looking at the average. I’m going to click on average, and I’m
going to change my number format so it’s easier to read to currency with no
decimal places and click OK. So now I have my average. I also want to look at
the standard deviation. So I’m gonna drag that variable costs down again.
Again, it’s going to give me a sum, and I don’t want it to be a sum, I want to
change that to be the standard deviation. So I have that option there. Again,
change the number format to currency with no decimal places. And click OK and
click OK again. So now I can see by model what the average variable cost is and
what the standard deviation is. This is helpful information because it can help
me to determine if I have a specific or a group of models that maybe have a
much higher standard deviation than others indicating that there’s a very large
variance in what’s happening with my total variable cost. That would help me
take that to that next step, where I want to really start diving into that
model and looking at the individual variable costs that are going into those
models to see what’s happening there. Perhaps there’s a correlation between the
types of options, that the buyer puts in and the variable cost. So there’s
still a lot of research to be done, but this will help me at least narrow it
down to where I should start my investigation.
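In pandas, that by-model pivot maps onto a single groupby with two aggregations; "Model" is an assumed column name carried over from the earlier sketches.

# Average and standard deviation of total variable cost, by model.
by_model = (
    sales.groupby("Model")["Total Variable Cost"]
         .agg(Average="mean", StdDev="std")
         .sort_values("StdDev", ascending=False)   # most variable models first
)
print(by_model.round(0))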
So although it’s all great in a table, it really would be
useful to see this in a visualization. So again, if I click anywhere in my
pivot table and go to pivot table, analyze and click on pivot chart, I will get
the option to create a chart out of that data. So notice that Excel is going to
give me the option to do a clustered column, which doesn’t really help me all
that much. Obviously I can see by looking which models have a higher standard
deviation, but they might also have a higher average total variable cost.
Another useful way to look at this would be to use a combo chart. The default
that it's giving me is bars for average total variable cost and a line for
standard deviation. Let's look at this as a stacked column instead and then
click OK. So I'm going to make
this bigger so we can see it a little bit better. So now, with this visual, we
can right away see the total average variable costs and then the standard
deviation of that. So again, we could see this in the one where they were side
by side as well, but we can clearly see that the Chare has the highest variable
costs as well as the highest variance, the
highest standard deviation. So we may want to
go back and look at the Chare model. Look at the Island model
and perhaps look a little more deeply into, either Pebble or the Jespie. This
is just one step in our analysis journey, but it definitely helps us to narrow
it down.
To summarize what we've done: we identified our issue and our data, and we
analyzed it. This was the analysis of our descriptive statistics, and this was
our histogram. In our descriptives, we see that we have a mean of about 17,000
in variable costs per vehicle sold. The median was about 15,000, but we have a
pretty large standard deviation as well; it's about a third of the mean. So we
then took that information because we wanted to see, well, what does our
distribution look like, right?
Let’s see what our dispersion looks like. And we can see the majority of the
observations are between the 14 and the 24, which coincides with what our
descriptive statistics told us. And then we took it one step further and we
wanted to see by model what the
descriptives looked like. And so we really wanted to get an understanding of
which models seem to have the highest variable cost combined with the highest
standard deviation. And we can see that the models that stand out to us are the
Chare, the Island, and the Pebble. The one with the highest standard deviation
is definitely the Chare. So our next step will likely be to go back and look at
the models that seem to have the highest average variable cost and the highest
standard deviation, and see if we can dig down a little bit deeper to see what
costs within those variable costs are perhaps driving that, or maybe find some
correlations with what options the buyer bought.
So our key takeaways, we identified that we were looking at
trying to better understand total variable cost using the 2015 to ’16 sales
data. We did descriptive statistics on total variable costs, and then we broke
that down by model, so that we could evaluate which specific model had the
highest average variable costs and the largest standard deviation.
So thank you for watching this video for statistical
analysis.
Huskie Motors Data Analytics: Video 5
In the final video of this data analytics video series, Dr.
Dzuranin will show you how to use Tableau to bring it all together to create a
dashboard with key performance indicators (KPIs) of the Huskie Motor
Corporation.
Video Transcript (Est. time: 24:41 min):
Hello, welcome to Huskie Motors data analytics Video 5,
“Explanatory Data Visualization,” creating decision useful dashboards with key
performance indicators. I’m Ann Dzuranin, and I will be discussing the material
in this video. What we will cover in this video is we’ll first identify what
the issue is. And the issue is that we want to create decision-useful
dashboards with key performance indicators. So what we need to do is identify
the dashboards that are going to be needed and then the relevant key performance
indicators that we would like
to have on those dashboards. So when reading through the Huskie Motors case,
we’ve identified that the Huskie Motors management is interested in
understanding and being able to quickly evaluate region performance, brand
performance, model performance, and overall performance.
As we discussed in Video 2, there can be a variety of performance measurements
to choose from, but we've determined that the KPIs that we're most interested
in are going to be gross sales, sales volume, net revenue, and profit margin.
The data we're
going to use to create the dashboards is going to be the 2015 to
2016 sales transactions for Huskie Motors. We’ll use data
visualization software. In this video I’ll be demonstrating the software
Tableau to analyze the data and evaluate that with the visualizations that we
create. We’ll then communicate our results by preparing interactive dashboards,
allowing management to be able to filter the dashboards to see down to the most
granular level of information that they’re interested in.
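Before looking at the dashboards, it may help to see one way these KPIs could be computed from the transaction data. The sketch below continues the earlier pandas examples; every column name, and the assumption that profit margin is profit as a share of net revenue, is an illustration rather than the case's own definition.

# KPIs by region and year (all column names assumed for illustration).
kpis = sales.groupby(["Region", "Year Sold"]).agg(
    GrossSales=("Gross Sale Price", "sum"),
    SalesVolume=("VIN", "count"),
    NetRevenue=("Net Revenue", "sum"),
    TotalCost=("Total Cost", "sum"),
)

# One possible profit-margin definition: profit as a share of net revenue.
kpis["ProfitMargin"] = (kpis["NetRevenue"] - kpis["TotalCost"]) / kpis["NetRevenue"]
print(kpis)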
So let’s go ahead and get started. As I mentioned, the
software I’m going to use to demonstrate the visualizations in this video is
called Tableau. Any visualization software can be used to do something like
this. But we're going to work with Tableau today. I've already loaded the data
into the Tableau file, and that's the 2015-'16 actual sales data. So we can go
right to the first worksheet and
start to create visualizations that we then
use in our dashboards. So we identified that one of the
things we're interested in is gross sales. So let's just look at gross sales
right now by region. So I can go over and I can quickly pull region into my
columns and then I’ll find gross sales and drag that into rows. So now you can
see I’ve got Europe, North America, and South America, my three regions, and I
have gross sales, and this represents total gross sales, but I actually want to
look at gross sales separately for 2015 and 2016, so the other thing that I
need to bring over is going to be the sold date. If I bring the sold date over
into columns, I'm then going to get the 2015 and 2016 data.
Tableau, like a lot of visualization software packages, will
give you a default that they think is best. That’s not always the case. I want
to look at this as column bars so that I can look and see visually
from one year to the next what’s happening with my data. I can quickly change
the name of my sheets so that I remember what I’m working with, and we’re going
to go ahead and call this gross sales by region. The other thing I want to
think about is how is it that I want to be able to view this data? Am I
interested in region and year, just years, when I have a color scheme that’s
identified years as two different colors. So ’15
is blue and ’16 is this orange color. That may be good if I’m just concerned
about year-to-year comparisons. I’m actually concerned about region to region
and then within that year. So I actually would prefer to have region as my
color. So what I’m going to do is take region and drag that into my colormark.
Now I have colored my regions. So now any of the visualizations that I use if I
choose to use region is my color scheme then I can quickly identify, you know,
orange is North America, red is South America, blue is Europe. I could also, if
I wanted to, see the exact number of sales instead of having to look across at
the axis, I could also just drag gross sales into the label and that’ll put it
on top of all of my bars. Now that’s great, except it’s a little bit large. So
I can format that number first of all,
so that it’s currency. And then I can also change the display
unit. So then I just have it say in millions. So now we can see very quickly
what our sales are. We could actually even remove this gross sales axis over
here if we wanted to, to take away some of the additional ink on the page that
we may not need. I also might be interested in seeing this with the highest
gross sales region first and then sorted from there. So what
I can do is go to region and sort it, and I can sort it based on the field,
which would be the sum of gross sales. And I'll do it in descending order so
that it's from highest to lowest. So now I can quickly see North America first;
even though I could tell that from the other version, at least this way I can
see the regions in order of which is contributing the most to gross sales by
region.
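In pandas terms, that sorted, year-split view of gross sales by region might be sketched like this, again with assumed column names carried over from the earlier snippets.

import matplotlib.pyplot as plt

# Gross sales by region and year, with regions ordered from highest to lowest total.
gross = sales.pivot_table(index="Region", columns="Year Sold",
                          values="Gross Sale Price", aggfunc="sum")
gross = gross.loc[gross.sum(axis=1).sort_values(ascending=False).index]

gross.plot(kind="bar", title="Gross sales by region")
plt.ylabel("Gross sales ($)")
plt.show()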
So the power of doing visualizations is not only to be able
to change the type of chart that you’re using, but also to be able to make it
interactive. And the best way to make something interactive is to be sure that
you have the information in here down to the most granular level. So, for
example, right now I have gross sales by region. I know eventually I’m going to
want to look at gross sales by brand and by model. So what I can do is be sure
that I have those in here as filters. So if I drag brand to filters right now,
I want all of them in there and I’m going to click OK. If I drag model to
filters again, I'll choose all right now and click OK. So what this allows me
to do is if I only wanted to see how the brand Tatra was doing in all the
regions, I could just pick Tatra and click OK. And now I have my visualization
has changed and I can just see how the Tatra brand is doing. So that can help
to make something really powerful. And so this way, if I want to just quickly
use my visualization in a dashboard, we'll be able to filter down all the way
to the most granular level, which for what we're looking at right now is model. So this
is a demonstration of how we actually work with visualizations.
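The interactive brand and model filters correspond to a simple boolean filter on the underlying data. A rough pandas equivalent, reusing the assumed columns from the earlier sketches (the brand name Tatra comes from the case itself), might be:

# Filter to a single brand and recompute gross sales by region and year,
# which is what the Tableau brand filter does interactively.
tatra_only = sales[sales["Brand"] == "Tatra"]
print(tatra_only.pivot_table(index="Region", columns="Year Sold",
                             values="Gross Sale Price", aggfunc="sum"))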
Let's go ahead and look at a completed Tableau file where we've put together
the dashboards for the areas that we identified we would like to have, ready to
help us make decisions. So I'm going to switch over to the
completed Tableau file and we’ll start again with region performance. So what
I'm going to do is put this into display mode. So this is a visualization that
was created for region performance. Remember we said we're interested in net
revenue, we're interested in sales volume, we're interested in profit margin.
So the visualizations that were created for this particular
dashboard show us net revenue by region, the increase and decrease in sales
volume by region, and then our profit margin by region. So very quickly in this
dashboard I can see that profit margin in North America is decreasing. That
could be driven, and probably is largely driven, by my decrease in sales
volume. I have a slight decrease in South America from '15 to '16. For that one
I can see that, although I have an increase in net revenue, my profit margin is
going down a bit. Now my sales volume lines up. So this could be somewhat due
to perhaps increased expenses that are driving my net revenue down. And then
when I look at my Europe, I can see that Europe actually is improving. So my
profit margin went from 10% to 13.6% from ’15 to ’16. And I could see my net
revenue reflects that, with a large increase as well. And I can see that my
sales volume is up there as well. Keep in mind you can also use this to filter
based on one aspect of it. So I just clicked on Europe, and then my other
visualizations changed to only show me Europe. So if I don't want
to look at what else is happening, if I’m just interested in looking at Europe
today, I just go in, I click on Europe, and that’s the only ones that pop up
for me. If I click on that again, it’ll bring them all back. And the same is
true if I just clicked on
North America or I just clicked on South America.
So this is our dashboard that we’re going to use for region
performance. Now we also said another dashboard we’re interested in is brand
performance. So I’m going to show you a brand performance dashboard. So this
dashboard, again, we’re looking at profit margin, net revenue, and sales
volume. I could also have brought gross sales over into the visualization, but
you probably want to keep your dashboard to three and maybe four visuals. That
makes it easier for the decision maker to zero in on what they're interested
in. So now I'm looking at it by brand and I
can see my brand Apechete is decreasing on profit margin. Jackson has increased
and my Tatra, now this is a problem, ’cause I can see that that brand actually
has a negative profit margin. It’s increased a little bit from ’15 to ’16, but
something is going on here that I really need to look further into. If I look
at brand performance, net revenue, again, I can see that the Apechete, even
though net revenue has gone up a bit, our profit margin is going down. When I
look at my Jackson, my net revenue has gone up, as has my profit margin. And
when I look at the Tatra, my net revenue is going up a bit, but you know, it's
still in the negative area here. So I really need to start thinking about
what's going on with the Tatra. And another thing that you can do in
visualizations is create
what’s called a dual axis. So I’m interested in looking at how sales volume has
changed from one year to the next within my brands. And I can see, I can look
at sales volume and I could have another chart that just looks at the sales
volume change, or I can put the two together. So that gives me kind of an
indication of the magnitude of the increase or decrease in my sales volume. So
if we look at the Apechete, for example, we had a sales volume in 2015
of 371 and that increased to 395. That represents a 6.47%
increase. If I look at what happened in Jackson, Jackson went up just a bit at
2.54%, and then I can see that sales volume for Tatra is pretty high overall.
But it did have an increase, and that increase was about 4.2%. That may have
helped a little bit with what's happening over here, but I still have some
things I'll really need to think about with that Tatra. The other
decision-making tool we said we're interested in using is a dashboard that can
help us understand what's happening with the models themselves. And so we
created a dashboard here that combines both brand and
model. And what's interesting about this is that there are three types of
visualizations that we've used here, again looking at profit margin only. I've
looked at profit margin ratio by brand and model, and what I've done here is
created a chart where the darkest blue is the highest median profit margin
ratio and the darkest red is going to be the lowest. So if I look at the
Apechete right away, I see that I'm all in the
blue, right? So I’ve got all positive profit margins. When I look at the
Jackson, I can see I've got a couple of models that I'm a little bit concerned
about because they
are showing a negative profit margin, right? So they’re in losses. And when I
look at the Tatra, I can see that I’ve got some really strong models here, but
I’ve got two models that are really dragging down what’s happening overall at
the Tatra. I can also pick just one model and see how that profit margin plays
out across the regions. So let's say I really want to see what's happening with
the Jespie. If I click on 2015 for the Jespie, I can see really quickly how
that's reflected in my regions, right? My profit margin is negative in 2015;
you know what, let's just look at 2016 instead. Even though my profit margin is
negative, I'm still doing okay in South America with that one. But I'm barely
keeping a positive margin in Europe, and I've got a huge negative profit margin
ratio in North America. If I wanted to see which model was doing best, that
would be the Advantage. The Advantage, at a 36.3% profit margin, is clearly
driving a lot of North America. It's doing well in Europe, although not nearly
as well, and it's doing well in South America as well, although again it's
definitely performing best in North America. So what's interesting about being
able to do this is that, again,
you can quickly filter, and see how everything changes. Right now, I’m looking
at both years. I do have a filter up here where I can change that and I can
only look at '16. So that allows me to very quickly just look at what's
happening in one year. And if I look at just that one year, then I can see that
the Mortimer is actually doing worse than the Jespie in that year. And I can
also quickly see that the one that's doing best is still the Advantage.
So this is a nice way to bring together both brand and model performance, to be
able to understand better what’s happening within our regions, our brands, and
our models, and from year to year.
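The brand-and-model view is essentially a pivot of median profit margin. A rough pandas equivalent, assuming a per-vehicle "Profit Margin" column exists (it may not, so treat this purely as a sketch), might be:

# Median profit margin ratio by brand and model, the measure behind the
# blue-to-red heat map in the dashboard.
margin_matrix = sales.pivot_table(index="Brand", columns="Model",
                                  values="Profit Margin", aggfunc="median")
print(margin_matrix.round(3))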
Now, the last dashboard that I’m going to illustrate for you
is that overall performance dashboard. So this dashboard, what we want to do is
really be able to make this very interactive for the user. So they’re able to
look at overall performance and then drill that down by model or by brand. So
what we’ve brought into this visualization is net revenue by region, profit
margin by region, and then gross sales by country so we can get a better
understanding of what’s happening within those regions. So the regions are
still color-coded as they were in my example where we’ve got North America as
this orange color, Europe is blue, and the red is South America. So if we look
at this visualization on the bottom, this is what's called a tree map, and what
it represents is that the larger the area, the higher the amount of sales. The
color represents the region. And then I have added the detail here where you
can see what that number is in millions. So we can see just by looking at this
that North America comprises the largest portion of all of our sales, about
half of our sales by the look of it. And within
North America, the U.S. is by far the largest. If we look in South America,
we're actually fairly close between Colombia, Venezuela, and Brazil. Then
followed by Chile, Argentina, and Bolivia. And when we look in Europe, we can
see that we’ve got the U.K. and France are equal, Germany, followed by Spain
and then Poland and Sweden, which are the same. So this is all well and good. I
can see how this is all happening by country. If I just wanted to click on the
U.S. and see what the U.S. net revenue is and what the U.S. profit margins are,
I just click in U.S. and then I can just look at what’s happening there. If I’m
interested, for example, in just how France is performing, I click on France. I
can see what my gross sales are. I can see what my net revenue is year to year,
and I can see what my profit
margin is as well. If I wanted to drill down to an even more granular level,
remember we were concerned about what was
happening with the Tatra. So if I picked
just Tatra, now my visualizations all change and it just
shows me the results for just that brand. So it's that brand, but all the
models within that brand, and I can see that that brand does not do very well
in Europe and in South America. Although the revenue is high and the sales are
high in the U.S., we are running at a loss as far as profit margin. So we
really need to think about kind of what’s happening there as well.
So, then, if I just click on the U.S., that will again show
me what’s happening just within the U.S. for the Tatra. Click back on there and
bring it all back. We know that the Tatra has some really high-performing
models and some really poorly performing models. So if I go and look at, for
example, the Jespie, we knew that the Jespie was not doing so well. We
can see that this Jespie is what’s driving a lot of the negative profit margin,
right? Because we can see we’ve got a loss year to year for that model and we
can see also what’s happening over here in North America, right, with the loss,
although we can see that it is performing better in Europe in ’15, but
declining in ’16 and improving in South America. So if I look down and wanted
to see, you know, well, hey, where is it doing well in South America, I go down
here and I can see that really Argentina is where I have sales. And again, this
is all relative, this profit margin is relative to what’s happening in
Argentina in South America. If I just clicked on Argentina, I can see that
Argentina has had an increase in sales of the Tatra, in net revenue I should say,
from year to year. And I can see that it’s had a very large increase in profit
margin as well. So that's the value of being able to make this dashboard
interactive. I'll go ahead and change this back now to all brands and all
models. When I have everything in here, it gives me that overall performance
evaluation. And then
if I want to really dig into what’s driving some of my more concerning issues,
for example, my declining profit margin, then I can dig down deeper by looking
at what’s happening with each of the brands. And that automatically changes all
my visualizations, making it a very powerful way to explain what’s happening
with your operations.
So we're going to go back and summarize what we've done here. We went through
and created dashboards for each of the areas in which we want to evaluate
performance, identified the KPIs, used the 2015-16 data, and used those
visualizations to prepare the interactive dashboards. We first put together
region performance, where we summarized what
was happening in our region by looking at net revenue, increase or decrease in
sales volume, and then median profit margin by region. We
then delved a little bit deeper by looking at what was happening with our
brands. We have three brands in our business. We’ve got the Apechete, the
Jackson, and the Tatra. And we looked again at what was happening with profit
margin, revenue, and then sales volume. And for this one, instead of just
looking at the increase or decrease, we looked at volume as well as what the
percentage change was from year to year in each of the brands. We took that one
step further and looked at brand and model performance.
And there we created a visualization where we could look at profit margin by
brand and model. We made this interactive so that we could look at it by both
years, which is what this static view shows, or we could have looked at one
year at a time. And we could see very quickly which profit margins are somewhat
concerning: we've got the Jespie, the Mortimer, and the Crux, and these were
for both years. And then we can look at it by region over here. We saw that we
could look at just '16, or we could look at just one particular brand or model.
This was color-coded so that if the profit margin was negative, it was in the
red; if it was positive, it was in the blue; and the darker the color, the more
extreme either way. And then we wrapped it up by looking at overall
performance, where we made this interactive so that we could, if we wanted to,
look at it just overall: all brands and all models for all areas. We looked a
little bit deeper to see within the regions
how the countries are performing. And then we showed how we could change all of
the visualizations at the same time by simply going through and picking
different models and brand combinations so that we could see how those were
doing.
So the takeaways, again, we identified our issue, which was
to create useful dashboards for decision making. We identified the KPIs we were
interested in for that. We used the data from 2015 and '16, analyzed that data,
and communicated our results in a variety of interactive dashboards that
would allow the decision maker to change those dashboards to help identify
whatever areas they were interested in examining further.
So thank you for watching Video 5, “Explanatory Data
Visualization.”
Huskie Motors Conclusion
This concludes the Huskie Motors data analytics video series.
Up next, you will have the opportunity to perform some data analytics yourself,
as you attempt the exercises based on the Huskie Motors case study. You’ll now
need to access the data files from the Resources tab and save them to your
computer. You’ll use these files to perform the analytics required for the
exercises in the following lesson.
Please note: To perform the exercises, you will be asked to
load raw data into Microsoft Excel. Based on the version of Excel you are
using, please refer to the following links for guidance on loading raw data:
For Office 365: https://support.office.com/en-us/article/data-import-and-analysis-options-3ea52160-08bc-45ac-acd9-bc4a11bcc2a2
For Excel 2010 – 2016:
https://support.office.com/en-us/article/text-import-wizard-c5b02af6-fda1-4440-899f-f78bafe41857#ID0EAAEAAA=Office_2010_-_Office_2016
Module 3 Wrap‐up
You have now completed Module 3, Applying Data Analytics and
Visualization. In this module, you were able to
Explore the data analytics and visualizations described in the
Huskie Motors case study, and
Apply this knowledge to complete the exercises.
You will now advance to the final module of this course, Module 4, Conclusion and Final Assessment.
Module 4: Conclusion and Final Assessment
Course Roadmap
Click Next to proceed to Module 4, Conclusion and Final
Assessment.
Summary
Welcome to Module 4, Conclusion and Final Assessment! In this
module, we will conclude the Data Analytics & Visualization Fundamentals
Certificate course by reviewing the overall course learning objectives, and
then moving to the Final Assessment.
Course Objectives
Review
Let’s revisit the learning objectives for this course. In the
preceding modules, you learned to:
Recognize the impact of technology and analytics on the
accounting profession.
Demonstrate how data analytics can influence organizational
strategy.
Identify ways data visualizations effectively enable
appropriate business decisions.
Download this content at
https://drive.google.com/file/d/1dqD_ATF2k4GaSpFhgw5rqJp572qmzGdi/view?usp=sharing
Download original document at